Process Quality Control

Troubleshooting and
Interpretation of Data

Fourth Edition

Ellis R. Ott
Edward G. Schilling
Dean V. Neubauer

ASQ Quality Press


Milwaukee, Wisconsin
American Society for Quality, Quality Press, Milwaukee 53203
© 2005 by ASQ
All rights reserved. Published 2005
Printed in the United States of America
12 11 10 09 08 07 06 05 5 4 3 2 1

Library of Congress Cataloging-in-Publication Data

Ott, Ellis R. (Ellis Raymond), 1906–


Process quality control : troubleshooting and interpretation of data / Ellis R. Ott,
Edward G. Schilling, Dean V. Neubauer.—4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-87389-655-6 (hard cover, case binding : alk. paper)
1. Process control—Statistical methods. 2. Quality control—Statistical methods.
I. Schilling, Edward G., 1931– II. Neubauer, Dean V. III. Title.
TS156.O86 2005
658.5'62—dc22 2005010988

ISBN 0-87389-655-6

No part of this book may be reproduced in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written permission of the publisher.

Publisher: William A. Tony


Acquisitions Editor: Annemieke Hytinen
Project Editor: Paul O’Mara
Production Administrator: Randall Benson

ASQ Mission: The American Society for Quality advances individual, organizational, and
community excellence worldwide through learning, quality improvement, and knowledge exchange.

Attention Bookstores, Wholesalers, Schools, and Corporations: ASQ Quality Press books,
videotapes, audiotapes, and software are available at quantity discounts with bulk purchases for
business, educational, or instructional use. For information, please contact ASQ Quality Press at
800-248-1946, or write to ASQ Quality Press, P.O. Box 3005, Milwaukee, WI 53201-3005.

To place orders or to request a free copy of the ASQ Quality Press Publications Catalog, including
ASQ membership information, call 800-248-1946. Visit our Web site at www.asq.org or
http://qualitypress.asq.org.

Printed on acid-free paper


To Virginia, Jean, and Kimberly with love and appreciation.
About the Authors

The late Ellis R. Ott was professor emeritus of experimental statistics at Rutgers,
The State University of New Jersey, and the founding director of the Rutgers
Statistics Center. He received his PhD from the University of Illinois. He
consulted extensively, including work with the U.S. State Department and the United
Nations. Dr. Ott was the recipient of numerous quality control awards, including hon-
orary member of the American Society for Quality and its Brumbaugh Award, the
Eugene L. Grant Award, and the Shewhart Medal. He was honored by an award estab-
lished in his name by the Metropolitan Section of ASQ.
Dr. Edward G. Schilling is professor emeritus of statistics and former director of the
Center for Quality and Applied Statistics in the College of Engineering at Rochester
Institute of Technology. He was previously manager of the Lighting Quality Operation
of the General Electric Company. He received his MS and PhD degrees in statistics
from Rutgers University, where he studied under Ellis Ott. Dr. Schilling is a fellow of
the American Statistical Association, ASTM International, and the American Society for
Quality, and is the first person to win the American Society for Quality’s Brumbaugh
Award four times. He is also a recipient of the Shewhart Medal, the Ellis R. Ott Award,
Eugene L. Grant Award, Distinguished Service Medal from ASQ, and the Award of
Merit from ASTM International. He is the author of Acceptance Sampling in Quality
Control and was associate editor of the fifth edition of Juran’s Quality Handbook.
Dean V. Neubauer is employed at Corning Incorporated where he holds the appointed
position of senior engineering associate—statistical engineering, and holds multiple U.S.
patents and trade secrets. He is also an adjunct professor at the Center for Quality and
Applied Statistics in the College of Engineering at Rochester Institute of Technology.
Mr. Neubauer received a BS degree in statistics from Iowa State University and an MS
degree in applied and mathematical statistics from Rochester Institute of Technology. He
has actively participated on ISO and ASTM standards committees. He is a fellow and a


charter statistician of the Royal Statistical Society, and a member of the American Statis-
tical Association. He is a fellow and certified quality engineer of the American Society
for Quality, as well as the past chair of the Chemical and Process Industries Division.
He is also a book reviewer for Technometrics.
List of Figures and Tables

Table 1.1 Mica thickness, thousandths of an inch. . . . . . . . . . . . . . . . . 4


Figure 1.1 Thickness of mica pieces shown as a histogram. . . . . . . . . . . . 4
Figure 1.2 A normal distribution (with μ = 0). . . . . . . . . . . . . . . . . . . 6
Figure 1.3 A lognormal distribution. . . . . . . . . . . . . . . . . . . . . . . . 7
Figure 1.4 A bimodal distribution composed of two normal distributions. . . . . 7
Table 1.2 Data: mica thickness as a tally sheet. . . . . . . . . . . . . . . . . . 9
Table 1.3 Mica thickness. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Figure 1.5 Two normal distributions with μ1 = μ2 but σ2 > σ1. . . . . . . . . . . 13
Figure 1.6 Mica thickness; accumulated percents plotted on normal
probability paper. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Table 1.4 Data: depth of cut. . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Figure 1.7 Depth of cut on normal probability paper. . . . . . . . . . . . . . . . 22
Figure 1.8 Distributions sampled by Shewhart: (a) rectangular parent
population; (b) right-triangular parent population; (c) normal
parent population. . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Figure 1.9 Estimating percent of a normal curve outside given specifications. . . 26
Figure 1.10 Estimating confidence intervals of unknown process average. . . . . 28
Table 1.5 Mica thickness data in subgroups of ng = 5 with their averages
and ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Figure 1.11 Mica thickness, X̄ and R charts; data in order as in Table 1.5. . . . . 33
Table 1.6 Values of the constant d2. . . . . . . . . . . . . . . . . . . . . . . . 34
Figure 1.12 Schematic of a tobogganing production process. . . . . . . . . . . . 35
Figure 1.13 Stem-and-leaf of mica data means. . . . . . . . . . . . . . . . . . . 38
Figure 1.14 Ordered stem-and-leaf diagram of mica data. . . . . . . . . . . . . . 38
Figure 1.15 Form of box plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39


Figure 1.16 Ordered stem-and-leaf diagram of 40 individual mica
measurements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Figure 1.17 Box plot of mica individuals and means. . . . . . . . . . . . . . . . 39
Figure 1.18 Box plot of 200 mica measurements. . . . . . . . . . . . . . . . . . 40
Figure 1.19 Dot plot of 200 mica thickness measurements and 40
subgroup means. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Table 1.7 Electrical characteristics (in decibels) of final assemblies from
11 strips of ceramic: Case History 15.1. . . . . . . . . . . . . . . . . 48
Table 1.8 Air-receiver magnetic assembly: Case History 2.1. . . . . . . . . . . 49
Figure 2.1 Measured time for sand to run through a 3-minute egg timer
(recorded in order of observation). . . . . . . . . . . . . . . . . . . . 52
Figure 2.2 An egg timer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Figure 2.3 Twelve averages showing six runs above and below the median. . . . 57
Figure 2.4 Gross average weights of ice cream fill at 10-minute intervals. . . . . 57
Table 2.1 A comparison of the expected number of runs and the
observed number. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Table 2.2 Critical extreme length of a run-up or a run-down in a random
set of k observations (one-tail). . . . . . . . . . . . . . . . . . . . . 61
Figure 2.5 Control chart of mica thickness data with limits. . . . . . . . . . . . 64

Table 2.3 Factors to use with X̄ and R control charts for variables. . . . . . . . 65
Figure 2.6 Matching a hole in a brass piece with diaphragm assembly. . . . . . 68
Table 2.4 Data: air-receiver magnetic assembly (depth of cut in mils). . . . . . 69

Figure 2.7 Control chart (historical) of X̄ and R on depth of cut. . . . . . . . . 70

Figure 2.8 Comparing sensitivities of two X̄ charts, ng = 4 and ng = 9
with operating-characteristic curves. . . . . . . . . . . . . . . . . . . 73
Table 2.5 Computation of OC curve and average run length for Shewhart
X̄ control chart with sample size ng = 4. . . . . . . . . . . . . . . . 75

Figure 2.9 Average run length curve for X̄ chart with ng = 4. . . . . . . . . . . 75
Figure 2.10 Distributions with their associated distributions of averages
(ng = 4). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Figure 2.11 Operating-characteristic curves of three decision plans associated
with an X̄ control chart, ng = 4. . . . . . . . . . . . . . . . . . . . 77
Figure 2.12 Accumulated analyses from hourly samples over two weeks’
production. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Figure 2.13 A control chart (historical) of chemical concentration of data
taken about once an hour over a two-week period (sample averages
and ranges of four consecutive analyses). . . . . . . . . . . . . . . . 79
Table 2.6 Data: gross weights of ice cream fill in 2.5-gallon containers. . . . . 80
Figure 2.14 A control chart (historical) of filling weights of ice cream
containers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table 2.7 Gross weights of ice cream fill in 2.5-gallon containers. . . . . . . . 83

Table 2.8 Computations basic to a control chart test set calibration. . . . . . . 86


Figure 2.15 A control chart guide to test-set adjustments. . . . . . . . . . . . . . 86
Table 2.9 A performance comparison of six test sets over three
time periods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Table 2.10 Part diameter for nine hours of production. . . . . . . . . . . . . . . 89
Table 2.11 Diameter of initial sample of fifty successive parts. . . . . . . . . . . 89
Table 2.12 Record of filling isotonic solution with averages and
ranges computed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD
Table 2.13 Subgroups of four across needles with averages and
ranges computed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD
Figure 2.16 Plot of hourly diameter readings shown in Table 2.10. . . . . . . . . 90
Figure 2.17 A diameter trend chart developed by the least squares method for
subsequent runs using a forced intercept. . . . . . . . . . . . . . . . 94
Figure 2.18 Control charts for data of Table 2.12. . . . . . . . . . . . . . . . . . CD
Figure 2.19 Control charts for data of Table 2.13. . . . . . . . . . . . . . . . . . CD
Figure 2.20 Digidot plot for subgroup mean data of Table 1.5. . . . . . . . . . . 96
Table 3.1 Record of chemical analyses (column 1) made on consecutive
batches of a chemical compound. . . . . . . . . . . . . . . . . . . . 100

Figure 3.1 An X̄ and R control chart analysis of data. . . . . . . . . . . . . . . 101
Figure 3.2 Individual batch analyses showing two outages. . . . . . . . . . . . . 102
Table 3.2 Estimating σ from a moving range. . . . . . . . . . . . . . . . . . . 103
Figure 3.3 A chart check for an outlier. . . . . . . . . . . . . . . . . . . . . . . 104
Figure 3.4 Data with two suggested outliers on the same end. . . . . . . . . . . 106
Table 4.1 Factors c4 to give an unbiased estimate. . . . . . . . . . . . . . . . . 110

Table 4.2 Statistical efficiency of σ̂ = R̄/d2 in estimating the
population parameter from k small samples. . . . . . . . . . . . . . . 111
Table 4.3 Data: breaking strength of single-fiber yarn spun on two machines. . 115
Figure 4.1 Breaking strength of single-fiber yarn from two machines. . . . . . . 115
Table 4.4 Data: measurements of transconductance of two groups of
electronic devices made from two batches (melts) of nickel. . . . . . 117
Figure 4.2 Transconductance readings on electronic devices from two batches
of nickel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Figure 4.3 Evidence of increased process variability. . . . . . . . . . . . . . . . 120
Table 4.5 Variability (as measured by ranges, r = 3) of two methods of
chemical analysis using four analysts. . . . . . . . . . . . . . . . . . 121
Table 4.6 Summary: estimating variability. . . . . . . . . . . . . . . . . . . . 124
Table 5.1 Probabilities Pr (x) of exactly x heads in 10 tosses of an
ordinary coin. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Figure 5.1 Probabilities of exactly x heads in 10 tosses of an ordinary coin
(n = 10, p = 0.50). . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Table 5.2 Probabilities of exactly x occurrences in n = 75 trials and p = 0.03. . . 131


Figure 5.2 Probabilities of exactly x defectives in a sample of 75 from an
infinite population with p = 0.03. . . . . . . . . . . . . . . . . . . . 131
Figure 5.3 A control chart record of defective glass bottles found in
samples of 120 per shift over a seven-day period. . . . . . . . . . . . 135
Figure 5.4 c chart on stoppages of spinning frame. . . . . . . . . . . . . . . . . 142
Figure 5.5 u chart on stoppages of spinning frame. . . . . . . . . . . . . . . . . 143
Figure 5.6 Form to record inspection by attributes. . . . . . . . . . . . . . . . . 144
Figure 5.7 A control chart using attributes data; visual inspection of a
TV component. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD
Figure 5.8 Plot of average gram weight of n = 3 tubes/sample taken at
15 minute intervals. . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Table 6.1 Probabilities PA of finding x ≤ 2 in a sample of n = 45 for
different values of p. . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Figure 6.1 Operating-characteristic curve of a single sampling plan for
attributes (n = 45, c = 2). . . . . . . . . . . . . . . . . . . . . . . . . 157
Figure 6.2 Average outgoing quality (AOQ) compared to incoming percent
defective P for the plan n = 45, c = 2. . . . . . . . . . . . . . . . . . 159
Table 6.2 Average outgoing quality (AOQ) of lots proceeding past an
acceptance sampling station using the plan n = 45, c = 2. . . . . . . 161
Figure 6.3 Lot-by-lot record for acceptance sampling (single sampling). . . . . 165
Table 6.3 OPQR: outgoing product quality rating—weekly summary. . . . . . CD
Table 6.4 Computation of standard quality demerit level and control limits
(QD̄n and σn) per n units. . . . . . . . . . . . . . . . . . . . . . . . CD
Figure 6.4 An OPQR control chart showing control limits. . . . . . . . . . . . . CD
Table 6.5 Steps in producing an enameled basin. . . . . . . . . . . . . . . . . 168
Figure 6.5 Representation of steps in metal fabrication to form an
enameled basin. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Figure 6.6 An enameled basin. . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Table 6.6 Daily inspection sheet (sampling). . . . . . . . . . . . . . . . . . . . 170
Table 6.7 Enamel basins—defect analysis after four days. . . . . . . . . . . . . 171
Figure 6.7 Tripod supporting 16-cm enameled basin during firing. . . . . . . . 172
Table 6.8 Summary showing percentage of major defects over four
time periods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Figure 6.8 Summary of percent classification of 16-cm enameled basins
over four sampling periods. . . . . . . . . . . . . . . . . . . . . . . 173
Table 6.9 Changes in quality classification over four time periods. . . . . . . . 174
Table 6.10 Record of weaving defects—major and minor—found in cloth
pieces over two days (from five looms on two shifts). . . . . . . . . 177

Figure 6.9 Record of percent major damaged cloth in March and April
following start of quality control program. . . . . . . . . . . . . . . 180
Figure 6.10 Definition of tσ for NL gauge. . . . . . . . . . . . . . . . . . . . . 181
Figure 6.11 Some operating characteristics of NL-gauging plans and a

variables control chart on X̄ with ng = 4. . . . . . . . . . . . . . . 183
Figure 6.12 Molded plastic bottle components. . . . . . . . . . . . . . . . . . . . 184
Figure 6.13 Adjustment chart on a screw machine operation using
NL-gauging principles. . . . . . . . . . . . . . . . . . . . . . . . . . 187
Figure 6.14 Deriving an OC curve for an NL-gauging plan (general procedure). . 189
Table 6.11 Derivation of operating-characteristic curves for some
NL-gauging plans with gauge compressed by 1.0σ (t = 1.0). . . . . . 189
Figure 6.15 OC curves of NL-gauging plan. . . . . . . . . . . . . . . . . . . . . 189
Table 6.12 Percent of normally distributed product outside 3σ specification
from nominal mean of control chart for comparison of NLG to
other control chart procedures. . . . . . . . . . . . . . . . . . . . . . 190
Table 6.13 Deriving an OC curve for the NLG plan n = 4, t = 1.2, c = 1. . . . . 190
Figure 7.1 Statistical process quality control. . . . . . . . . . . . . . . . . . . . 196
Table 7.1 Factors for Shewhart charts, n = ng. . . . . . . . . . . . . . . . . . . 198

Table 7.2a Factors for conversion of X̄ chart into median chart. . . . . . . . . . 201

Table 7.2b Factors for conversion of X̄ chart into midrange chart. . . . . . . . . 201
Table 7.3 Mean, median, range, and standard deviation of mica thickness. . . . 202
Figure 7.2 Median chart for mica thickness. . . . . . . . . . . . . . . . . . . . 203
Figure 7.3 s chart for mica thickness. . . . . . . . . . . . . . . . . . . . . . . . 204
Figure 7.4 Acceptance control chart for mica thickness. . . . . . . . . . . . . . 208
Figure 7.5 Modified control limits. . . . . . . . . . . . . . . . . . . . . . . . . 211
Figure 7.6 Moving average chart for mica thickness. . . . . . . . . . . . . . . . 216
Figure 7.7 Geometric moving average chart for mica thickness. . . . . . . . . . 216
Figure 7.8 CUSUM chart for mica thickness, d = 1.58, θ = 45°. . . . . . . . . . 217
Figure 7.9 V-mask. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Figure 7.10 Kemp cumulative sum chart. . . . . . . . . . . . . . . . . . . . . . . 224
Figure 7.11 One-sided cumulative sum chart. . . . . . . . . . . . . . . . . . . . 225
Figure 7.12 CUSUM chart equivalent to Shewhart chart. . . . . . . . . . . . . . 226
Figure 7.13 Snub-nosed CUSUM mask. . . . . . . . . . . . . . . . . . . . . . . 227
Table 7.4 Average run length for special CUSUM charts. . . . . . . . . . . . . 228
Figure 7.14 Precontrol justification. . . . . . . . . . . . . . . . . . . . . . . . . 230
Table 7.5 Precontrol rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Figure 7.15 Precontrol schematic. . . . . . . . . . . . . . . . . . . . . . . . . . . 231

Figure 7.16 Disturbances of metallic film thickness from a target value of
T = 80 for an uncontrolled process. . . . . . . . . . . . . . . . . . . 234
Figure 7.17 A bounded Box–Jenkins manual adjustment chart, which allows
the process operator to plot the thickness and then read off the
appropriate change in the deposition rate needed to bring the
process to the target of T = 80. . . . . . . . . . . . . . . . . . . . . . 235
Table 7.6 Summary of short-run control chart plotting measures and limits. . . 241
Table 7.7 Use of control charts. . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Table 7.8 Selection of chart. . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Figure 7.18 Progression of control charts. . . . . . . . . . . . . . . . . . . . . . 243
Figure 7.19 Time line for control. . . . . . . . . . . . . . . . . . . . . . . . . . 243
Figure 7.20 Lifecycle of control chart application. . . . . . . . . . . . . . . . . . 244
Figure 7.21 Check sequence for control chart implementation. . . . . . . . . . . 245
Table 7.9 Data: air-receiver magnetic assembly (depth of cut). . . . . . . . . . 246
Table 8.1 Assessment of capabilities under narrow limit plan (n = 11,
t = 3.75, c = 5). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Figure 8.1 Cause-and-effect diagram for burned toast. . . . . . . . . . . . . . . 263
Table 8.2 Pressed-glass defects. . . . . . . . . . . . . . . . . . . . . . . . . . 264
Figure 8.2 Pareto diagram of pressed-glass defects. . . . . . . . . . . . . . . . . 264
Figure 9.1 Plant layout of molds and furnace. . . . . . . . . . . . . . . . . . . . 274
Figure 9.2 (a) Form to record black patch areas on molds; (b) representation
of a mold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Figure 9.3 Bicking’s checklist for planning test programs. . . . . . . . . . . . . 277
Figure 9.4 A Six Sigma process that produces a 3.4 ppm level of defects. . . . . 280
Figure 9.5 DMAIC process used in Six Sigma methodology. . . . . . . . . . . 283
Figure 9.6 The SIPOC model used for understanding the process from
an overview standpoint. . . . . . . . . . . . . . . . . . . . . . . . . 285
Table 10.1 Experimental plan. . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Table 10.2 Experimental results. . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Table 10.3 The 2² configuration. . . . . . . . . . . . . . . . . . . . . . . . . . 290
Table 10.4 Signs of interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . 291
Table 10.5 Analysis of variance. . . . . . . . . . . . . . . . . . . . . . . . . . . 292
Table 10.6 Yates method for 2² experiment. . . . . . . . . . . . . . . . . . . . . 294
Table 10.7 Yates analysis of production data. . . . . . . . . . . . . . . . . . . . 294
Table 10.8 Yates method with r replicates per treatment combination. . . . . . . 294
Table 10.9 2³ configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Table 10.10 Signs for effect calculation. . . . . . . . . . . . . . . . . . . . . . . 295
Table 10.11 Yates method for 2³ experiment. . . . . . . . . . . . . . . . . . . . . 296
Table 10.12 Illustrative example of 2³. . . . . . . . . . . . . . . . . . . . . . . 297

Table 10.13 Yates analysis of illustrative example. . . . . . . . . . . . . . . . . . 297


Table 10.14 ANOVA of illustrative example. . . . . . . . . . . . . . . . . . . . . 297
Table 10.15 Fraction of a 2³. . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Table 10.16 Yates analysis of 1⁄2 fraction of illustrative example. . . . . . . . . . 302
Figure 10.1 Main-effect and interaction plots for the 2² design in Table 10.2. . . . 303
Figure 10.2 The relationship between the normal frequency distribution,
cumulative distribution, and the nature of the normal
probability plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Figure 10.3 Drawing the line on a normal probability plot of effects. . . . . . . . 305
Figure 10.4 DesignExpert normal probability plot for the effects shown in
Table 10.13. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
Figure 10.5 DesignExpert half-normal probability plot for the effects shown
in Table 10.13. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
Table 10.17 Average temperature after 10 minutes (minus 200°C). . . . . . . . . 308
Figure 10.6 DesignExpert normal probability plot for the effects shown in
Yates analysis for Case History 10.1. . . . . . . . . . . . . . . . . . 309
Figure 10.7 DesignExpert BC interaction plot. . . . . . . . . . . . . . . . . . . . 310
Figure 11.1 Winners at different post positions. . . . . . . . . . . . . . . . . . . 317
Table 11.1 Nonrandom variability. . . . . . . . . . . . . . . . . . . . . . . . . . 318
Table 11.2 Winners at different post positions. . . . . . . . . . . . . . . . . . . 321
Table 11.3 Analysis of means, attributes data, one independent variable. . . . . 322
Table 11.4 Analysis of means; no standard given; df = ∞. . . . . . . . . . . . . 323
Figure 11.2 Analysis of means plot; proportion defective. . . . . . . . . . . . . . 326
Figure 11.3 Analysis of means plot; accidents by shift. . . . . . . . . . . . . . . 330
Table 11.5 Welding rejects by operator–machine. . . . . . . . . . . . . . . . . . 333
Figure 11.4 Welding rejects by operator–machine. . . . . . . . . . . . . . . . . . 333
Table 11.6 Effect of copper on corrosion. . . . . . . . . . . . . . . . . . . . . . 335
Figure 11.5 Effect of copper on corrosion. . . . . . . . . . . . . . . . . . . . . . 335
Table 11.7 End breaks during spinning cotton yarn. . . . . . . . . . . . . . . . 337
Figure 11.6 End breaks on spinning frames. . . . . . . . . . . . . . . . . . . . . 337
Table 11.8 Plastic caps breaking at the capper. . . . . . . . . . . . . . . . . . . 339
Figure 11.7 Cap breakage at different heads. . . . . . . . . . . . . . . . . . . . . 340
Figure 11.8 Alignment defects found in samples during an interchange of
two operators on two machines. . . . . . . . . . . . . . . . . . . . . 342
Figure 11.9 Alignment comparison shows difference in effect of machines,
but not in operators or before-and-after effect (ANOM). . . . . . . . 343
Figure 11.10 Spacing defects found in samples during an interchange of
two operators on two machines. . . . . . . . . . . . . . . . . . . . . 345
Figure 11.11 Spacing defects comparison showing differences in effect of
machines, but not in operators or before-and-after interchange. . . . . 345

Figure 11.12 Routing card used to obtain data on an audio component
assembly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Figure 11.13 Record of the defects of each type found in the first study,
arranged according to the combination of operators from whom
they originated, for defect types a, b, and c. . . . . . . . . . . . . . . 348
Figure 11.14 Defects of type c only. . . . . . . . . . . . . . . . . . . . . . . . . . 349
Figure 11.15 Comparing significant effects of operator/machine combinations
A and C (ANOM) (type c defects). . . . . . . . . . . . . . . . . . . 349
Figure 11.16 Number of defects found in second study of audio component
assemblies (type c defects). . . . . . . . . . . . . . . . . . . . . . . 350
Table 11.9 Talon’s press-shift performance record. . . . . . . . . . . . . . . . . 351
Table 11.10 Table of shutdowns. . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Figure 11.17 Comparing number of press shutdowns by press and by shift. . . . . 353
Figure 11.18 Figure 11.17 redrawn and decision limits recomputed using
actual ni, instead of average n for two borderline points. . . . . . . . 354
Table 11.11 A study of stem cracking: A 2³ production design. . . . . . . . . . . 357
Table 11.12 Computations for analysis of means. . . . . . . . . . . . . . . . . . 358
Figure 11.19 Comparing effects of three factors on glass stem cracking: three
main effects and their interactions. . . . . . . . . . . . . . . . . . . . 358
Figure 11.20 A graphical comparison of effect on stem cracks. . . . . . . . . . . . 360
Figure 11.21 Components in a toiletry assembly. . . . . . . . . . . . . . . . . . . 362
Table 11.13 Data from a 2³ factorial production study of reasons for
cracked caps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Figure 11.22 Comparing effects of bottles, rings, and caps from different vendors
on cracked caps (main effects and two-factor interactions). . . . . . . 365
Table 11.14 Computations for two-factor interactions—cracked cap. . . . . . . . 365
Table 11.15 Effects of rings and bottles using only caps C1. . . . . . . . . . . . . 366
Figure 11.23 Comparing effects of bottles and rings from different vendors
when using caps from the better vendor. . . . . . . . . . . . . . . . . 367
Table 11.16 Two special halves of a 2³ factorial design. . . . . . . . . . . . . . . 368
Figure 11.24 Effects of pullers, formers, and tension on four defect types. . . . . . 369
Table 11.17 Computations of decision lines (ANOM). . . . . . . . . . . . . . . . 373
Table 11.18 Defective glass bottles from three machines—three shifts and
seven days. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
Table 11.19 Wire samples from spools tested after firing under different
conditions of temperature, diameter, and pH. . . . . . . . . . . . . . 376
Figure 12.1 Plastic bottle and plug insert. . . . . . . . . . . . . . . . . . . . . . 380
Table 12.1 Number of defective plastic plugs (short-shot) from each of
32 cavities in a mold. . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Table 12.2a Numbering on cavities in the mold. . . . . . . . . . . . . . . . . . . 381
Table 12.2b Pattern of short-shot plugs. . . . . . . . . . . . . . . . . . . . . . . . 381

Figure 12.2 Plastic bottle and crooked label. . . . . . . . . . . . . . . . . . . . . 383


Table 12.3 Data on reassemblies of mixers. . . . . . . . . . . . . . . . . . . . . 385
Table 12.4 Computations for main effects and interactions (ANOM). . . . . . . 386
Figure 12.3 A formal comparison of mixer performance (analysis assumes
independence) in reassemblies using subassemblies from six noisy
and six good mixers. . . . . . . . . . . . . . . . . . . . . . . . . . . 386
Table 12.5 A screening design for 2³ – 1 = 7 factors. . . . . . . . . . . . . . . . 389
Table 12.6 A screening design for 2⁴ – 1 = 15 factors. . . . . . . . . . . . . . . 391
Table 12.7 A screening design for five factors. . . . . . . . . . . . . . . . . . . 392
Table 12.8 Variables data in a screening design for 15 factors (trace elements). . 393
Table 12.9 12 tubes at 900° and 12 tubes at 800°. . . . . . . . . . . . . . . . . . 395
Figure 12.4 A scatter diagram showing relationship of capacitance on same
individual tubes before and after stage A. . . . . . . . . . . . . . . . 396
Figure 12.5 A scatter diagram of n plotted points with an estimated line
of best fit; differences (Yi – Yc) have been indicated by vertical
dotted lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
Figure 12.6 Some frequently occurring patterns of data that lead to seriously
misleading values of r and are not recognized as a consequence. . . . 401
Table 12.10 Hald cement data. . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Figure 12.7 Scatter plot matrix of Hald cement data. . . . . . . . . . . . . . . . 403
Table 12.11 Common power transformations for various data types. . . . . . . . 404
Figure 12.8 Box–Cox transformation plot for n = 200 mica thickness values. . . . 406
Table 12.12 Portion of quality data collected on glass sheets over a
two-month period. . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
Figure 12.9 Histogram of the untransformed s/cm2 defect data. . . . . . . . . . . 408
Figure 12.10 Histogram of the square root of the s/cm2 defect data. . . . . . . . . 409
Figure 12.11 Box–Cox transformation plot for the original s/cm2 defect data. . . . 409
Figure 12.12 Histogram of the natural log transformation of the s/cm2
defect data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Figure 12.13 ANOM plot of the transformed ln(s/cm2) data using the Excel
add-in program. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Figure 12.14 ANOM plot of the original s/cm2 data using the Excel
add-in program. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Table 13.1 Critical values of the Tukey-Duckworth sum. . . . . . . . . . . . . . 414
Table 13.2 Data: capacitance of nickel-cadmium batteries measured at
two stations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Figure 13.1 Comparing levels of a battery characteristic manufactured at
two different stations. . . . . . . . . . . . . . . . . . . . . . . . . . 415
Figure 13.2 Comparing two process averages by analysis of means (variables). . . 416
Table 13.3 Summary: mechanics of analysis of means, ANOM, for two small
samples (A and B) with r1 = r2 = r. . . . . . . . . . . . . . . . . . . . 416
xxii List of Figures and Tables

Figure 13.3 Heights of lilies under two different storage conditions. . . . . . . . 422
Figure 13.4 Comparing average heights of lilies under two different
conditions (ANOM). . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Table 13.4 Data: vials from two manufacturing firms. . . . . . . . . . . . . . . 424
Figure 13.5 Weights of individual vials from two manufacturing firms. . . . . . . 424
Table 13.5 Data: measurements on electronic devices made from two batches
of nickel cathode sleeves. . . . . . . . . . . . . . . . . . . . . . . . 426
Figure 14.1 Analysis of means plot; Parr calorimeter determination. . . . . . . . 436
Figure 14.2 A general display of data in a 2² factorial design, r replicates in
each average, X̄ij. . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
Figure 14.3 A graphical interpretation of a two-factor interaction. . . . . . . . . . 439
Table 14.1 Analysis of means in a 2² factorial design, r replicates. . . . . . . . . 440
Table 14.2 Height of Easter lilies (inches). . . . . . . . . . . . . . . . . . . . . 441
Figure 14.4 ANOM data from Table 14.2. (a) Height of Easter lilies; (b) ranges
of heights. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Figure 14.5 An auxiliary chart to understand the interaction of S with time. . . . 443
Table 14.3 General analysis of a 2³ factorial design, r ≥ 1. . . . . . . . . . . . . 444
Table 14.4 Capacitance of individual nickel-cadmium batteries in a
2³ factorial design (data coded). . . . . . . . . . . . . . . . . . . . . 445
Table 14.5 Averages of battery capacitances (r = 6) in a 2³ factorial design;
displayed as two 2 × 2 tables. . . . . . . . . . . . . . . . . . . . . . 446
Figure 14.6 (a) Electrical capacitance of nickel-cadmium batteries: the ANOM
comparisons; (b) range chart, nickel-cadmium batteries. . . . . . . . 447
Table 14.6 Averages to test for main effects and two-factor interactions. . . . . . 447
Table 14.7 Diagram to display a combination selection procedure to
compute L̄ and Ū in testing AB interaction. . . . . . . . . . . . . . 449
Table 14.8 Battery capacitances: A special half of a 2³ design. . . . . . . . . . . 451
Figure 14.7 Analysis of means (ANOM) for a half replicate of a 2³ design
(1/2 × 2³). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
Table 14.9 Coded contact potential readings in a half replicate of a 2³. . . . . . . 454
Figure 14.8 Analysis of three factors and their effects on contact potential. . . . . 455
Figure 14.9 X̄ and R control charts from production before and after
changes made as a consequence of the study discussed in
Case History 14.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Table 14.10 General analysis of a 2ᵖ or 2ᵖ⁻¹ factorial design, r ≥ 1. . . . . . . . . 457
Figure 14.10 ANOM of Case History 10.1 data. . . . . . . . . . . . . . . . . . . . 458
Table 14.11 Contact potential in a half replicate of a 2³ design, r = 12;
P = plate temperature; F = filament lighting; A = aging. . . . . . . . 459
Table 15.1 Limits for standards given. . . . . . . . . . . . . . . . . . . . . . . . 462
Figure 15.1 Analysis of means chart for eight casino tables. . . . . . . . . . . . . 463
Table 15.2 Measurements on an electronic assembly. . . . . . . . . . . . . . . . 465
Figure 15.2 Analysis of means charts (averages and ranges). . . . . . . . . . . . 466
Figure 15.3 Comparing a group average with a given specification or a desired
average (average of first five ceramic sheets compared to
desired average). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Table 15.3 Grid diameters under tensions. . . . . . . . . . . . . . . . . . . . . . 468
Figure 15.4 Comparing k = 5 subgroups with their own grand mean. . . . . . . . 469
Figure 15.5 Basic form of a two-factor crossed factorial experiment. . . . . . . . 470
Figure 15.6 Analysis of means chart for two-factor experiment. . . . . . . . . . . 474
Figure 15.7 Density of photographic film plate. . . . . . . . . . . . . . . . . . . 475
Figure 15.8 Analysis of means of density. . . . . . . . . . . . . . . . . . . . . . 476
Figure 15.9 Analysis of variance of density. . . . . . . . . . . . . . . . . . . . . 477
Figure 15.10 ANOVA table format using treatment effects. . . . . . . . . . . . . . 478
Figure 15.11 Copper content of castings (X – 84). . . . . . . . . . . . . . . . . . . 481
Figure 15.12 Nested analysis of means of copper content of castings. . . . . . . . 483
Figure 15.13 Analysis of variance of copper content of castings. . . . . . . . . . . 484
Table 15.4 A 2 × 3 × 4 factorial experiment (data coded). . . . . . . . . . . . . 487
Table 15.5 Summary of averages (main effects). . . . . . . . . . . . . . . . . . 487
Figure 15.14 Range chart of lengths of steel bars. . . . . . . . . . . . . . . . . . . 488
Figure 15.15 Decision limits for main effects for length of steel bars. . . . . . . . 489
Figure 15.16 Analysis of means of length of steel bars—main effects. . . . . . . . 489
Figure 15.17 Analysis of means for treatment effects—length of steel bars. . . . . 495
Figure 15.18 Interaction comparison of patterns W̄ and L̄. . . . . . . . . . . . . 496
Figure 15.19 Interaction analysis, W̄ × L̄: ANOM. . . . . . . . . . . . . . . . . 496
Table 15.6 Proportion defective on bonders (ng = 1800). . . . . . . . . . . . . . 499
Figure 15.20 ANOM of bonder data. . . . . . . . . . . . . . . . . . . . . . . . . . 500
Table 15.7 Particle count on wafers. . . . . . . . . . . . . . . . . . . . . . . . . 503
Figure 15.21 ANOM of particulates. . . . . . . . . . . . . . . . . . . . . . . . . . 504
Figure 15.22 Interaction of particulates. . . . . . . . . . . . . . . . . . . . . . . . 505
Figure 15.23 Subgroup ranges (r = 4) arranged by machines. . . . . . . . . . . . . 508
Table 15.8 Subgroup ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
Figure 15.24 Comparing average machine variabilities. . . . . . . . . . . . . . . . 508
Table 15.9 Values of dR where ŝR = dR R̄ and dR = (D4 − 1)/3 = d3/d2. . . . . . . . 509
Figure 15.25 Subgroup ranges (r = 4) arranged by time periods. . . . . . . . . . . 510
Figure 15.26 Comparing average time variabilities. . . . . . . . . . . . . . . . . . 511
Table 15.10 A two-way table (machine by time) ignoring heat treatment. . . . . . 511
Figure 15.27 Graph of machine × time interaction. . . . . . . . . . . . . . . . . . 512
Table 15.11 Factors to judge presence of nonrandom uniformity,
standard given. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
Figure 15.28 Nonrandom uniformity chart for eight casino tables. . . . . . . . . . 514
Figure 16.1 Measurement data are a result of a process involving several
inputs, most of them controllable. . . . . . . . . . . . . . . . . . . . 526
Figure 16.2 Gauge accuracy is the difference between the measured average
of the gauge and the true value, which is defined with the most
accurate measurement equipment available. . . . . . . . . . . . . . . 527
Figure 16.3 Measurement data can be represented by one of four
possible scenarios. . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
Figure 16.4 Gauge reproducibility can be represented as the variation in the
average of measurements made by multiple operators using
the same gauge and measuring the same parts. . . . . . . . . . . . . 531
Figure 16.5 Gauge repeatability can be represented as the variation in the
measurements made by a single operator using the same gauge
and measuring the same parts. . . . . . . . . . . . . . . . . . . . . . 531
Table 16.1 Gauge repeatability and reproducibility data collection sheet
(long method). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Table 16.2 Gauge repeatability and reproducibility calculations sheet
(long method). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Figure 16.6 Gauge R&R can be represented as the total variation due to
measurements made by multiple operators using the same gauge
and measuring the same parts. . . . . . . . . . . . . . . . . . . . . . 536
Figure 16.7 Variance components of overall variation can be represented as
the breakdown of the total variation into part-to-part variation and
measurement (gauge R&R) variation. . . . . . . . . . . . . . . . . . 539
Table 16.3 Gasket thicknesses for a gauge R&R study. . . . . . . . . . . . . . . 540
Table 16.4 Gauge repeatability and reproducibility data collection sheet
(long method) for Case History 16.1. . . . . . . . . . . . . . . . . . 541
Table 16.5 Gauge repeatability and reproducibility calculations sheet
(long method) for Case History 16.1. . . . . . . . . . . . . . . . . . 542
Figure 16.8 Gauge R&R run plot for Case History 16.1. . . . . . . . . . . . . . . 546
Figure 16.9 Gauge R&R appraiser variation plot for Case History 16.1. . . . . . 547
Figure 16.10 Gauge R&R plot for Case History 16.1. . . . . . . . . . . . . . . . . 547
Figure 16.11 Gauge R&R variance component chart for Case History 16.1. . . . . 548
Figure 16.12 Gauge R&R variance component pie chart for Case History 16.1. . . 548
Figure 16.13 Gauge R&R X̄ and R chart for Case History 16.1. . . . . . . . . . . 549
Table 16.6 Gasket thicknesses for a gauge R&R study. . . . . . . . . . . . . . . 550
Figure 16.14 ANOME chart for Case History 16.1. . . . . . . . . . . . . . . . . . 552
Figure 17.1 Excel histogram of the mica thickness data comparable
to Figure 1.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
Figure 17.2 Exponentially-weighted moving average and range charts of the
mica thickness data which is comparable to Figure 7.7. . . . . . . . . 558
Figure 17.3 ANOM.xla add-in version of the ANOME plot for copper content
of two samples from each of 11 castings (data from Figure 15.11)
shown in Figure 15.12. . . . . . . . . . . . . . . . . . . . . . . . . . 561
Figure 17.4 ANOM add-in version of the ANOM plot for a 2² design in
Case History 14.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . 562
Figure 17.5 ANOME add-in plot for interactions based on data in
Figure 15.11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
Figure 17.6 ANOME plot produced for a balanced data set based on
Figure 15.7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
Figure 17.7 ANOME plot produced for an unbalanced data set based on
Figure 15.7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
Figure 18.1 The relationship between statistical thinking and statistical
methods (ASQ Statistics Division). . . . . . . . . . . . . . . . . . . 569
Figure 18.2 A Six Sigma process that produces 3.4 ppm level of defects. . . . . . 570
Table A.1 Areas under the normal curve. . . . . . . . . . . . . . . . . . . . . . 576
Table A.2 Critical values of the number of runs NR above and below the
median in k = 2m observations (one-tail probabilities). . . . . . . . . 578
Table A.3 Runs above and below the median of length s in k = 2m
observations with k as large as 16 or 20. . . . . . . . . . . . . . . . 579
Table A.4 Control chart limits for samples of ng. . . . . . . . . . . . . . . . . . 580
Table A.5 Binomial probability tables. . . . . . . . . . . . . . . . . . . . . . . 582
Table A.6 Poisson probability curves. . . . . . . . . . . . . . . . . . . . . . . . 591
Table A.7 Nonrandom variability—standard given: df = ∞ (two-sided). . . . . . 592
Table A.8 Exact factors for one-way analysis of means, Ha (two-sided). . . . . 593
Table A.9 Dixon criteria for testing extreme mean or individual. . . . . . . . . 597
Table A.10 Grubbs criteria for simultaneously testing the two largest or
two smallest observations. . . . . . . . . . . . . . . . . . . . . . . . 598
Table A.11 Expanded table of the adjusted d2 factor (d2*) for estimating the
standard deviation from the average range. . . . . . . . . . . . . . . 599
Table A.12a F distribution, upper five percent points (F0.95) (one-sided). . . . . . 602
Table A.12b F distribution, upper 2.5 percent points (F0.975) (one-sided). . . . . . 603
Table A.12c F distribution, upper one percent points (F0.99) (one-sided). . . . . . . 604
Table A.13 Critical values of the Tukey-Duckworth sum. . . . . . . . . . . . . . 605
Table A.14 Values of Ha, k = 2, ANOM (two-tailed test). . . . . . . . . . . . . . 605
Table A.15 Distribution of Student’s t (two-tail). . . . . . . . . . . . . . . . . . 606
Table A.16 Nonrandom uniformity, Na (no standard given). . . . . . . . . . . . . 607
Table A.17 Some blocked full factorials. . . . . . . . . . . . . . . . . . . . . . . 607
Table A.18 Some fractional factorials. . . . . . . . . . . . . . . . . . . . . . . . 608
Table A.19 Sidak factors for analysis of means for treatment effects,
ha* (two-sided). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
Table A.20 Criteria for the ratio F* = ŝLT²/ŝST² for the X̄ chart with ng = 5. . . . 613
Table A.21a Tolerance factors, K, using the standard deviation s to obtain
intervals containing P percent of the population with γ = 95 percent
confidence, for samples of size n, assuming a normal distribution. . . 614
Table A.21b Tolerance factors, K*, using the average range, R̄, of samples
of ng = 5 to obtain intervals containing P percent of the population
with γ = 95 percent confidence assuming a normal distribution. . . . 615
Preface to the Fourth Edition

The endless cycle of idea and action,
Endless invention, endless experiment,
Brings knowledge of motion, but not of stillness;
Knowledge of speech, but not of silence;
Knowledge of words, and ignorance of the Word.
All our knowledge brings us nearer to our ignorance.
T. S. Eliot

Ellis R. Ott taught generations of quality practitioners to be explorers of the truth
through the collection and graphical portrayal of data. From a simple plea to
“plot the data” to devising a graphical analytical tool called the analysis of means
(ANOM), Ott demonstrated that process knowledge is to be gained by seeking the infor-
mation contained within the data. Ellis believed that process knowledge is not just to
be gained by a single analysis, but rather that the process continually speaks to us in the
form of data, and that we must understand its language. The more we learn from a
process, the more we realize how much we didn’t know to begin with. This process of
learning what we don’t know is the essence of T. S. Eliot’s endless cycle of learning.
In this newest version of Ellis’s classic text, we have strived to continue on the path
that he has laid down for others to follow. Additional material has been added to sup-
plement the techniques covered in many of the chapters, and the CD-ROM has been
enhanced since the last edition.
Specifically, in Chapter 1, new material has been added on the use of dot plots as an
alternative to histograms, stem-and-leaf diagrams, and box plots for showing the shape
of a distribution. In Chapter 2, the idea of looking at data over time is combined with
the dot plot in the form of a digidot plot. Chapters 3 and 4 are relatively unchanged, but
in Chapter 5, material has been added to address the subject of adding events to charts.
Though Chapter 5 is devoted to the analysis of attributes data, adding events to a chart

is applicable to the charting of any type of data and so it is fitting that it is discussed
after the material on control charting of both attributes and variables data. A case his-
tory is used to illustrate how a manufacturing problem was solved simply through the
addition of events to a simple trend chart.
Chapters 6 and 7 in the third edition have been combined into a single chapter for
this edition. The subject of narrow-limit gauging is a natural extension of the ideas of
acceptance sampling and the material on control charting in the earlier chapters.
Material in this new chapter has been clarified to show the intrinsic beauty of this tech-
nique in the hope that it may stir others to renew their acquaintance with narrow-limit
gauging for processes where go/no-go gauges are a necessary inspection tool.
Chapter 8 in the third edition has been split into two new chapters—Chapters 7 and
8. The new Chapter 7 is devoted to the principles and applications of control charts.
New material has been added to emphasize the role that acceptance control charts play
in controlling both α and β risks, and the computation of average run length (ARL). A
section on acceptance control charts for attributes has been added as well to comple-
ment the material on acceptance control charts for variables data. Also, some additional
material has been added to the discussion of EWMA charts showing their relation to an
integrated moving average (IMA) time series model.
The new Chapter 8 is devoted to the topics of process capability, process perfor-
mance, and process improvement. New material on the use of confidence intervals for
process capability metrics is introduced so users will see these metrics as estimates,
with error of their own, rather than as absolutes. Narrow-limit gauging is discussed as
another means of assessing the capability of a process.
In Chapter 9, ideas for troubleshooting processes are supplemented with the Six
Sigma methodology that has been popular in recent years. Specifically, the DMAIC
and DMADV processes are introduced as a means of developing understanding of
existing and newly developed processes, respectively. Also, the problem-solving strat-
egy developed by Kepner and Tregoe is discussed as a means of addressing many types
of problems.
Chapter 10 has been developed further to introduce the idea of design resolution. In
particular, designs of resolution III, IV, V, and higher are discussed, along with the use
of Tables A.17 and A.18 for choosing the proper fractional factorial design. The case
history in this chapter has also been expanded to illustrate the use of normal probabil-
ity plotting of effects and the combination of nonsignificant interaction effects into the
error so that non-replicated designs can be effectively analyzed.
The material on ANOM for proportions data in Chapter 11 has been expanded to
cover the problem of unequal sample sizes. This idea is further discussed in Chapter 15
in the form of analysis of means for treatment effects (ANOME).
Chapter 12 has been expanded to cover scatter plot matrices, which are introduced
to show how the idea of a scatter plot can be applied to datasets of higher dimensions.
In addition, Chapter 12 covers the important areas of correlation and regression with
material that originally appeared in the first edition, but which has been revised and
updated for this edition.
Chapter 16 has been added to discuss the topic of measurement studies. Common
approaches to measurement studies, including R&R studies, are addressed, and many of
the techniques covered in earlier chapters are implemented to analyze the data. In par-
ticular, ANOME is presented as one graphical alternative to the analysis of an R&R
study. A discussion of measurement as a process is provided, as well as how such stud-
ies should be set up and the data analyzed in a form that is easy for others to understand
with a minimal background in statistics. Examples of common problems associated with
measurements, and how they can be resolved, will provide the user of these studies
with meaningful and practical advice.
Chapter 17 provides a more detailed discussion of what has been included on the
latest version of the CD-ROM. We hope that the readers of this text, as well as instruc-
tors planning to use this book for a course, will find a plethora of information and soft-
ware that will make this text a helpful tool for gaining process knowledge. In particular,
the Excel add-in for ANOM and ANOME analyses has been greatly expanded since the
last edition. It now includes the analysis of up to three factors for attributes and variables
data, as well as nested designs in two factors. Examples of output from this add-in can be
found in later chapters of this text. The CD-ROM also includes a subdirectory containing
many of the papers on ANOM and ANOME published in Journal of Quality Technology.
Readers wishing to learn more about these methods can research these papers which are
given as PDF files for easy online viewing. Freeware versions of some useful graphing
and statistical utilities can also be found on the CD-ROM.
Of course, no book would be possible without the people who have supported it and
assisted in its development. We wish to thank the many students at Rochester Institute
of Technology who have provided many valuable comments and examples over the
years as part of the Interpretation of Data course that is based on this text. Their insight
has been valuable in helping to provide ideas for improving this text, as many of them
work in industry while completing their graduate degree.
We would also like to recognize the contributions over the years from Dr. Peter R.
Nelson, who passed away in 2004. He made many contributions to the development of
the analysis of means, reference to which has been made in this edition.
Our thanks also go to Annemieke Hytinen of ASQ Quality Press for her guidance
and support for this latest edition and her efforts to bring this work to press in a
timely manner. We also wish to thank Paul O’Mara for his efforts as the project editor
of this book, and Paul and Leayn Tabili of New Paradigm for their work in reformat-
ting of the text and renewing all of the graphs and tables.
Last, and certainly not least, we must thank our wives, Jean and Kimberly, for their
continued devotion to us as we toiled on yet another edition of this text. Their under-
standing, contributions, and support in many ways have enabled us to find the time we
needed to make this an even better book. The spirit of continuous improvement lives on
and we feel that Professor Ott’s guiding hand is still evident.
Ellis Ott was ahead of his time in many ways. His influence not only lives on in
those who learned from him directly, but his work and philosophy can be seen in any-
one who is willing to endure the seemingly endless cycle of idea and action, endless
invention, endless experiment. Ott knew that if we continued to pursue the information
buried in the data that we would in fact find the knowledge we need to improve and
control processes. In the end, Ott also knew that knowledge brings us nearer to our
ignorance.

Edward G. Schilling
Rochester, New York
Dean V. Neubauer
Horseheads, New York
Table of Contents

List of Figures and Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii


Case Histories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxvii
Preface to the Fourth Edition . . . . . . . . . . . . . . . . . . . . . . . . . . xxix
Preface to the Third Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxiii
Preface to the Second Edition . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvii
Preface to the First Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxix

Part 1 Basics of Interpretation of Data


Chapter 1 Variables Data: An Introduction . . . . . . . . . . . . . . . . . 3
1.1 Introduction: An Experience with Data . . . . . . . . . . . . . . . 3
1.2 Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Organizing Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Grouping Data When n Is Large . . . . . . . . . . . . . . . . . . . 9
1.5 The Arithmetic Average or Mean—Central Value . . . . . . . . . 12
1.6 Measures of Variation . . . . . . . . . . . . . . . . . . . . . . . . 13
1.7 Normal Probability Plots . . . . . . . . . . . . . . . . . . . . . . 18
1.8 Predictions Regarding Sampling Variation: The Normal Curve . . 20
1.9 Series of Small Samples from a Production Process . . . . . . . . 28
1.10 Change in Sample Size: Predictions about X̄ and ŝ . . . . . . . . . 29
1.11 How Large a Sample Is Needed to Estimate a Process Average? . . 31
1.12 Sampling and a Second Method of Computing ŝ . . . . . . . . . . 32
1.13 Some Important Remarks about the Two Estimates . . . . . . . . 35
1.14 Stem-and-Leaf . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.15 Box Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.16 Dot Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

1.17 Tolerance Intervals for Populations . . . . . . . . . . . . . . . . . 42


1.18 A Note on Notation . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.19 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.20 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Chapter 2 Ideas from Time Sequences of Observations . . . . . . . . . . . 51
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.2 Data from a Scientific or Production Process . . . . . . . . . . . . 54
2.3 Signals and Risks . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.4 Run Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.5 Shewhart Control Charts for Variables . . . . . . . . . . . . . . . 62
2.6 Probabilities Associated with an X̄ Control Chart:
Operating-Characteristic Curves . . . . . . . . . . . . . . . . . . . 71
2.7 Control Charts for Trends . . . . . . . . . . . . . . . . . . . . . . 88
2.8 Digidot Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
2.9 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Chapter 3 Ideas from Outliers—Variables Data . . . . . . . . . . . . . . . 99
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.2 Other Objective Tests for Outliers . . . . . . . . . . . . . . . . . . 103
3.3 Two Suspected Outliers on the Same End of a Sample
of n (Optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
3.4 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Chapter 4 Variability—Estimating and Comparing . . . . . . . . . . . . . 109
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2 Statistical Efficiency and Bias in Variability Estimates . . . . . . . 109
4.3 Estimating σ and σ² from Data: One Sample of Size n . . . . . . 111
4.4 Data from n Observations Consisting of k Subsets of ng = r:
Two Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.5 Comparing Variabilities of Two Populations . . . . . . . . . . . . 114
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.7 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Chapter 5 Attributes or Go/No-Go Data . . . . . . . . . . . . . . . . . . . . 127
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.2 Three Important Problems . . . . . . . . . . . . . . . . . . . . . . 127
5.3 On How to Sample . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.4 Attributes Data That Approximate a Poisson Distribution . . . . . 139
5.5 Notes on Control Charts . . . . . . . . . . . . . . . . . . . . . . . 146
5.6 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 149

Part 2 Statistical Process Control


Chapter 6 Sampling and Narrow-Limit Gauging . . . . . . . . . . . . . . 153
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Scientific Sampling Plans . . . . . . . . . . . . . . . . . . . . . . 154


6.3 A Simple Probability . . . . . . . . . . . . . . . . . . . . . . . . 155
6.4 Operating-Characteristic Curves of a Single Sampling Plan . . . . 156
6.5 But Is It a Good Plan? . . . . . . . . . . . . . . . . . . . . . . . . 157
6.6 Average Outgoing Quality (AOQ) and Its Maximum
Limit (AOQL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.7 Computing the Average Outgoing Quality (AOQ) of Lots from
a Process Producing P Percent Defective . . . . . . . . . . . . . . 160
6.8 Other Important Concepts Associated with Sampling Plans . . . . 161
6.9 Risks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.10 Tabulated Sampling Plans . . . . . . . . . . . . . . . . . . . . . . 162
6.11 Feedback of Information . . . . . . . . . . . . . . . . . . . . . . . 163
6.12 Where Should Feedback Begin? . . . . . . . . . . . . . . . . . . . 166
6.13 Narrow-Limit Gauging . . . . . . . . . . . . . . . . . . . . . . . 180
6.14 Outline of an NL-Gauging Plan . . . . . . . . . . . . . . . . . . . 181
6.15 Selection of a Simple NL-Gauging Sampling Plan . . . . . . . . . 182
6.16 OC Curves of NL-Gauge Plans . . . . . . . . . . . . . . . . . . . 187
6.17 Hazards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
6.18 Selection of an NL-Gauging Plan . . . . . . . . . . . . . . . . . . 192
6.19 Optimal Narrow-Limit Plans . . . . . . . . . . . . . . . . . . . . 192
6.20 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Chapter 7 Principles and Applications of Control Charts . . . . . . . . . . 195
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
7.2 Key Aspects of Process Quality Control . . . . . . . . . . . . . . 196
7.3 Process Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
7.4 Uses of Control Charts . . . . . . . . . . . . . . . . . . . . . . . 199
7.5 Rational Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . 199
7.6 Special Control Charts . . . . . . . . . . . . . . . . . . . . . . . . 200
7.7 Median Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.8 Standard Deviation Chart . . . . . . . . . . . . . . . . . . . . . . 203
7.9 Acceptance Control Chart . . . . . . . . . . . . . . . . . . . . . . 204
7.10 Modified Control Limits . . . . . . . . . . . . . . . . . . . . . . . 211
7.11 Arithmetic and Exponentially Weighted Moving Average Charts . . 212
7.12 Cumulative Sum Charts . . . . . . . . . . . . . . . . . . . . . . . 216
7.13 Precontrol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.14 Narrow-Limit Control Charts . . . . . . . . . . . . . . . . . . . . 232
7.15 Other Control Charts . . . . . . . . . . . . . . . . . . . . . . . . . 232
7.16 How to Apply Control Charts . . . . . . . . . . . . . . . . . . . . 241
7.17 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Chapter 8 Process Capability, Performance, and Improvement . . . . . . 249
8.1 Process Capability . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.2 Process Optimization Studies . . . . . . . . . . . . . . . . . . . . 250
8.3 Capability and Specifications . . . . . . . . . . . . . . . . . . . . 251
8.4 Narrow-Limit Gauging for Process Capability . . . . . . . . . . . 257


8.5 Process Performance . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.6 Process Improvement . . . . . . . . . . . . . . . . . . . . . . . . 261
8.7 Process Change . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
8.8 Problem Identification . . . . . . . . . . . . . . . . . . . . . . . . 263
8.9 Prioritization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
8.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
8.11 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 265

Part 3 Troubleshooting and Process Improvement


Chapter 9 Some Basic Ideas and Methods of Troubleshooting
and Problem Solving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
9.2 Some Types of Independent and Dependent Variables . . . . . . . 270
9.3 Some Strategies in Problem Finding, Problem Solving, and
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
9.4 Bicking’s Checklist . . . . . . . . . . . . . . . . . . . . . . . . . 276
9.5 Problem Solving Skills . . . . . . . . . . . . . . . . . . . . . . . 278
9.6 Six Sigma Methodology . . . . . . . . . . . . . . . . . . . . . . . 280
9.7 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Chapter 10 Some Concepts of Statistical Design of Experiments . . . . . . 287
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
10.2 Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
10.3 Sums of Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
10.4 Yates Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
10.5 Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
10.6 Fractional Factorials . . . . . . . . . . . . . . . . . . . . . . . . . 299
10.7 Graphical Analysis of 2^p Designs . . . . . . . . . . . . . . . . . . 302
10.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
10.9 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Chapter 11 Troubleshooting with Attributes Data . . . . . . . . . . . . . . 315
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
11.2 Ideas from Sequences of Observations over Time . . . . . . . . . 316
11.3 Decision Lines Applicable to k Points Simultaneously . . . . . . . 317
11.4 Analysis of Means for Proportions When n is Constant . . . . . . 324
11.5 Analysis of Means for Proportions When n Varies . . . . . . . . . 325
11.6 Analysis of Means for Count Data . . . . . . . . . . . . . . . . . 329
11.7 Introduction to Case Histories . . . . . . . . . . . . . . . . . . . . 330
11.8 One Independent Variable with k Levels . . . . . . . . . . . . . . 331
11.9 Two Independent Variables . . . . . . . . . . . . . . . . . . . . . 341
11.10 Three Independent Variables . . . . . . . . . . . . . . . . . . . . . 355

11.11 A Very Important Experimental Design: 1/2 × 2^3 . . . . . . . . . . 367


11.12 Case History Problems . . . . . . . . . . . . . . . . . . . . . . . . 372
11.13 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Chapter 12 Special Strategies in Troubleshooting . . . . . . . . . . . . . . 379
12.1 Ideas from Patterns of Data . . . . . . . . . . . . . . . . . . . . . 379
12.2 Disassembly and Reassembly . . . . . . . . . . . . . . . . . . . . 383
12.3 A Special Screening Program for Many Treatments . . . . . . . . 387
12.4 Other Screening Strategies . . . . . . . . . . . . . . . . . . . . . . 393
12.5 Relationship of One Variable to Another . . . . . . . . . . . . . . 394
12.6 Mechanics of Measuring the Degree of a Relationship . . . . . . . 397
12.7 Scatter Plot Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 402
12.8 Use of Transformations and ANOM . . . . . . . . . . . . . . . . . 403
12.9 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Chapter 13 Comparing Two Process Averages . . . . . . . . . . . . . . . . 413
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
13.2 Tukey’s Two-Sample Test to Duckworth’s Specifications . . . . . 413
13.3 Analysis of Means, k = 2, ng = r1 = r2 = r . . . . . . . . . . . . . . 415
13.4 Student’s t and F test Comparison of Two Stable Processes . . . . 417
13.5 Magnitude of the Difference between Two Means . . . . . . . . . 420
13.6 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Chapter 14 Troubleshooting with Variables Data . . . . . . . . . . . . . . 431
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
14.2 Suggestions in Planning Investigations—Primarily Reminders . . . 432
14.3 A Statistical Tool for Process Change . . . . . . . . . . . . . . . . 433
14.4 Analysis of Means for Measurement Data . . . . . . . . . . . . . 434
14.5 Example—Measurement Data . . . . . . . . . . . . . . . . . . . . 435
14.6 Analysis of Means: A 2^2 Factorial Design . . . . . . . . . . . . . 436
14.7 Three Independent Variables: A 2^3 Factorial Design . . . . . . . . 444
14.8 Computational Details for Two-Factor Interactions in a
2^3 Factorial Design . . . . . . . . . . . . . . . . . . . . . . . . . . 449
14.9 A Very Important Experimental Design: 1/2 × 2^3 . . . . . . . . . . 450
14.10 General ANOM Analysis of 2^p and 2^(p–1) Designs . . . . . . . . . 456
14.11 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Chapter 15 More Than Two Levels of an Independent Variable . . . . . . 461
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
15.2 An Analysis of k Independent Samples—Standard Given,
One Independent Variable . . . . . . . . . . . . . . . . . . . . . . 462
15.3 An Analysis of k Independent Samples—No Standard Given,
One Independent Variable . . . . . . . . . . . . . . . . . . . . . . 463
15.4 Analysis of Means—No Standard Given, More Than One
Independent Variable . . . . . . . . . . . . . . . . . . . . . . . . . 469

15.5 Analysis of Two-Factor Crossed Designs . . . . . . . . . . . . . . 470


15.6 The Relation of Analysis of Means to Analysis of
Variance (Optional) . . . . . . . . . . . . . . . . . . . . . . . . . 477
15.7 Analysis of Fully Nested Designs (Optional) . . . . . . . . . . . . 479
15.8 Analysis of Means for Crossed Experiments—Multiple Factors . . 484
15.9 Nested Factorial Experiments (Optional) . . . . . . . . . . . . . . 497
15.10 Multifactor Experiments with Attributes Data . . . . . . . . . . . 498
15.11 Analysis of Means When the Sample Sizes are Unequal . . . . . . 505
15.12 Comparing Variabilities . . . . . . . . . . . . . . . . . . . . . . . 506
15.13 Nonrandom Uniformity . . . . . . . . . . . . . . . . . . . . . . . 512
15.14 Calculation of ANOM Limits for 2^p Experiments . . . . . . . . . . 515
15.15 Development of Analysis of Means . . . . . . . . . . . . . . . . . 516
15.16 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Chapter 16 Assessing Measurements as a Process . . . . . . . . . . . . . . 525
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
16.2 Measurement as a Process . . . . . . . . . . . . . . . . . . . . . . 525
16.3 What Can Affect the Measurement Process? . . . . . . . . . . . . 527
16.4 Crossed vs. Nested Designs . . . . . . . . . . . . . . . . . . . . . 530
16.5 Gauge Repeatability and Reproducibility Studies . . . . . . . . . . 530
16.6 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 553
Chapter 17 What’s on the CD-ROM . . . . . . . . . . . . . . . . . . . . . 555
Chapter 18 Epilogue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
18.1 Practice Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 573
Appendix Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Part I
Basics of Interpretation of Data

1
Variables Data: An Introduction

1.1 INTRODUCTION: AN EXPERIENCE WITH DATA


Look around you and you will see variation. Often we ignore it, sometimes we curse it,
always we live with it in an imperfect world. Ellis Ott’s first industrial experience with
variation pertained to the thickness of mica pieces being supplied by a vendor. The
pieces had a design pattern of punched holes. They were carefully arranged to hold var-
ious grids, plates, and other radio components in their proper places. But the failure of
mica pieces to conform to thickness specifications presented a problem. In particular,
there were too many thin pieces that were found in many different types of mica. The
vendor was aware of the problem, but was quite sure that there was nothing he could do
to resolve it. “The world supply of first-grade mica was cut off by World War II, and
only an inferior quality is obtainable. My workers split the mica blocks quite properly
to the specified dimensions.1 Because of the poor mica quality,” he insisted, “these
pieces subsequently split in handling, producing two thin pieces.” He was sorry, but there
was nothing he could do about it!
Now, there are some general principles to be recognized by troubleshooters:
Rule 1. Don’t expect many people to advance the idea that the problem is their
own fault. Rather it is the fault of raw materials and components, a worn-out
machine, or something else beyond their own control. “It’s not my fault!”
Rule 2. Get some data on the problem; but do not spend too much time in
initial planning. (An exception is when data collection requires a long time
or is very expensive; very careful planning is then important.)

1. Purchase specifications were 8.5 to 15 thousandths (0.0085 to 0.015 inches) with an industry-accepted allowance
of five percent over and five percent under these dimensions. Very thin pieces did not give enough support to the
assemblage. Very thick pieces were difficult to assemble. (It should be noted that specifications that designate such
an allowance outside the stated “specifications” are quite unusual.)


Rule 3. Always graph your data in some simple way—always. In this case,
the engineer took mica samples from a recent shipment and measured the
thickness of 200 pieces.2 The resulting data, shown in Table 1.1, are presented
in Figure 1.1 as a histogram.

Table 1.1 Mica thickness, thousandths of an inch.


8.0 12.5 12.5 14.0 13.5 12.0 14.0 12.0 10.0 14.5
10.0 10.5 8.0 15.0 9.0 13.0 11.0 10.0 14.0 11.0
12.0 10.5 13.5 11.5 12.0 15.5 14.0 7.5 11.5 11.0
12.0 12.5 15.5 13.5 12.5 17.0 8.0 11.0 11.5 17.0
11.5 9.0 9.5 11.5 12.5 14.0 11.5 13.0 13.0 15.0
8.0 13.0 15.0 9.5 12.5 15.0 13.5 12.0 11.0 11.0
11.5 11.5 10.0 12.5 9.0 13.0 11.5 16.0 10.5 9.0
9.5 14.5 10.0 5.0 13.5 7.5 11.0 9.0 10.5 14.0
9.5 13.5 9.0 8.0 12.5 12.0 9.5 10.0 7.5 10.5
10.5 12.5 14.5 13.0 12.5 12.0 13.0 8.5 10.5 10.5
13.0 10.0 11.0 8.5 10.5 7.0 10.0 12.0 12.0 10.5
13.5 10.5 10.5 7.5 8.0 12.5 10.5 14.5 12.0 8.0
11.0 8.0 11.5 10.0 8.5 10.5 12.0 10.5 11.0 10.5
14.5 13.0 8.5 11.0 13.5 8.5 11.0 11.0 10.0 12.5
12.0 7.0 8.0 13.5 13.0 6.0 10.0 10.0 12.0 14.5
13.0 8.0 10.0 9.0 13.0 15.0 10.0 13.5 11.5 7.5
11.0 7.0 7.5 15.5 13.0 15.5 11.5 10.5 9.5 9.5
10.5 7.0 10.0 12.5 9.5 10.0 10.0 12.0 8.5 10.0
9.5 9.5 12.5 7.0 9.5 12.0 10.0 10.0 8.5 12.0
11.5 11.5 8.0 10.5 14.5 8.5 10.0 12.5 12.5 11.0
Source: Lewis M. Reagan, Ellis R. Ott, and Daniel T. Sigley, College Algebra, revised edition, chapter 18 (New
York: Holt, Rinehart and Company, 1940). Reprinted by permission of Holt, Rinehart & Winston, Inc.

[Figure: histogram of the n = 200 mica thickness measurements; vertical axis Frequency, 0 to 40; horizontal axis Cell boundaries, 4.95 to 16.95 thousandths.]

Figure 1.1 Thickness of mica pieces shown as a histogram. (Data from Table 1.1.)

2. Experience with problem solving suggests that it is best to ask for only about 50 measurements, but not more than
100. An exception would arise when an overwhelming mass of data is required, usually for psychological reasons.

Discussion: Figure 1.1 shows some important things:


1. A substantial number3 of mica pieces are too thin and some are too thick when
compared to the lower and upper specification limits of 8.5 and 15 thousandths.
2. The center of the two specifications is 0.5(8.5 + 15) = 11.75 thousandths,
and the peak of the thickness distribution is to the left of 11.75 at 10.25
thousandths. If the splitting blades were adjusted to increase the thickness
by about 0.5 thousandth, the peak would be moved near the center of the
specifications and the number of thin pieces would be reduced slightly
more than the number of thick pieces would be increased. The adjusted
process would produce fewer nonconforming pieces, but would still produce
more outside the specifications than a five percent allowable deviation on
each side.
3. It is conceivable that a few of the mica pieces had split during handling, as
the vendor believed. However, if more than an occasional one was splitting,
a bimodal4 pattern with two humps would be formed. Consequently, it is
neither logical nor productive to attribute the problem of thin micas to the
splitting process.
What might the vendor investigate to reduce the variability in the splitting process?
Answer: There was more than one operator hand-splitting this particular type of mica
piece. Differences between the operators were almost certainly contributing to variation
in the process.
In addition, differences in thickness from an individual operator would be expected
to develop over a period of a few hours because operator fatigue and changes in knife
sharpness could produce important variations.
At the vendor’s suggestion, quality control charts were instituted for individual
operators. These charts helped reduce variability in the process and produce micas con-
forming to specifications.
In the larger study of the mica thickness problem, samples were examined from sev-
eral different mica types; many pieces were found to be too thin and relatively few pieces
too thick within each type. What then?
Economic factors often exert an influence on manufacturing processes, either con-
sciously or unconsciously. In this splitting operation, the vendor bought the mica by the
pound but sold it by the piece. One can imagine a possible reluctance to direct the mica-
splitting process to produce any greater thickness than absolutely necessary.
A more formal discussion of data display will be presented in following sections.
The mechanics of grouping the data in Table 1.1 and of constructing Figure 1.1 will also
be explained in Sections 1.3 and 1.4.

3. A count shows that 24 of the 200 pieces are under 8.5 thousandths of an inch and 7 are over 15 thousandths.
4. See Figure 1.4.
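Rule 3 is easy to follow with a few lines of code. The sketch below (an illustration, not part of the original text) tallies the Table 1.1 measurements against the 8.5 and 15 thousandth specification limits and reproduces the counts given in footnote 3:

```python
# Mica thickness data from Table 1.1, in thousandths of an inch (n = 200).
mica = [
    8.0, 12.5, 12.5, 14.0, 13.5, 12.0, 14.0, 12.0, 10.0, 14.5,
    10.0, 10.5, 8.0, 15.0, 9.0, 13.0, 11.0, 10.0, 14.0, 11.0,
    12.0, 10.5, 13.5, 11.5, 12.0, 15.5, 14.0, 7.5, 11.5, 11.0,
    12.0, 12.5, 15.5, 13.5, 12.5, 17.0, 8.0, 11.0, 11.5, 17.0,
    11.5, 9.0, 9.5, 11.5, 12.5, 14.0, 11.5, 13.0, 13.0, 15.0,
    8.0, 13.0, 15.0, 9.5, 12.5, 15.0, 13.5, 12.0, 11.0, 11.0,
    11.5, 11.5, 10.0, 12.5, 9.0, 13.0, 11.5, 16.0, 10.5, 9.0,
    9.5, 14.5, 10.0, 5.0, 13.5, 7.5, 11.0, 9.0, 10.5, 14.0,
    9.5, 13.5, 9.0, 8.0, 12.5, 12.0, 9.5, 10.0, 7.5, 10.5,
    10.5, 12.5, 14.5, 13.0, 12.5, 12.0, 13.0, 8.5, 10.5, 10.5,
    13.0, 10.0, 11.0, 8.5, 10.5, 7.0, 10.0, 12.0, 12.0, 10.5,
    13.5, 10.5, 10.5, 7.5, 8.0, 12.5, 10.5, 14.5, 12.0, 8.0,
    11.0, 8.0, 11.5, 10.0, 8.5, 10.5, 12.0, 10.5, 11.0, 10.5,
    14.5, 13.0, 8.5, 11.0, 13.5, 8.5, 11.0, 11.0, 10.0, 12.5,
    12.0, 7.0, 8.0, 13.5, 13.0, 6.0, 10.0, 10.0, 12.0, 14.5,
    13.0, 8.0, 10.0, 9.0, 13.0, 15.0, 10.0, 13.5, 11.5, 7.5,
    11.0, 7.0, 7.5, 15.5, 13.0, 15.5, 11.5, 10.5, 9.5, 9.5,
    10.5, 7.0, 10.0, 12.5, 9.5, 10.0, 10.0, 12.0, 8.5, 10.0,
    9.5, 9.5, 12.5, 7.0, 9.5, 12.0, 10.0, 10.0, 8.5, 12.0,
    11.5, 11.5, 8.0, 10.5, 14.5, 8.5, 10.0, 12.5, 12.5, 11.0,
]

too_thin = sum(1 for x in mica if x < 8.5)    # below the lower limit
too_thick = sum(1 for x in mica if x > 15.0)  # above the upper limit
mean = sum(mica) / len(mica)
print(too_thin, too_thick, round(mean, 4))
```

Twenty-four pieces fall under the lower limit and seven over the upper, matching the counts in footnote 3.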

1.2 VARIABILITY
In every manufacturing operation, there is variability. The variability becomes evident
whenever a quality characteristic of the product is measured. There are basically two
different reasons for variability, and it is very important to distinguish between them.

Variability Inherent in the Process


It is important to learn how much of the product variability is actually inherent in the
process. Is the variation a result of random effects of components and raw materials? Is
it from small mechanical linkage variations in a machine producing random variation
in the product? Is it from slight variations in an operator’s performance? Many factors
influence a process and each contributes to the inherent variation affecting the resulting
product. These are sometimes called common causes. There is also variation in test
equipment and test procedures—whether used to measure a physical dimension, an elec-
tronic or a chemical characteristic, or any other characteristic. This inherent variation in
testing is a factor contributing to variations in the observed measurements of product
characteristics—sometimes an important factor.5
There is variation in a process even when all adjustable factors known to affect the
process have been set and held constant during its operations.
Also, there is a pattern to the inherent variation of a specific stable process, and there
are different basic characteristic patterns of data from different processes. However, the
most frequent and useful one is called the normal distribution; its idealized mathemat-
ical form is shown in Figure 1.2; it is discussed further in Section 1.8.
The mica thickness data in Figure 1.1 have a general resemblance to the normal dis-
tribution of Figure 1.2. A majority of observations are clustered around a central value,

[Figure: the normal density curve

Y = (1/(σ√(2π))) e^(–X²/(2σ²))

with the horizontal axis marked –3σ, –2σ, –σ, 0, +σ, +2σ, +3σ.]

Figure 1.2 A normal distribution (with m = 0).

5. See Case History 2.4.



there are tails on each end, and it is relatively symmetrical around a vertical line drawn
at about 11.2 thousandths.
There are other basic patterns of variability; they are referred to as nonnormal dis-
tributions. The lognormal is common when making acoustical measurements and certain
measurements of electronic products. If the logarithms of the measurements are plotted,
the resulting pattern is a normal distribution—hence its name. A lognormal distribution
is shown in Figure 1.3; it has a longer tail to the right.
Based on industrial experience, the lognormal distribution does not exist as frequently
as some analysts believe. Many apparent lognormal distributions of data are not the con-
sequence of a stable lognormal process but of two basically normal distributions with a
large percentage produced at one level. The net result can produce a bimodal distribution
as in Figure 1.4, which presents a false appearance of being inherently lognormal.
The production troubleshooter needs help in identifying the nature of the causes pro-
ducing variation. If different operators or machines are performing the same operation,
it is important to learn whether some are performing better or worse than others. Specific
differences, when discovered, will often lead to ideas for improvement when those per-
forming differently—either better or worse—are compared. Some causes may be com-
mon to all machines and operators. In reducing process variability caused by them, the

[Figure: a right-skewed density curve with a long tail to the right; horizontal axis 0 to 20, vertical axis 0 to 100.]

Figure 1.3 A lognormal distribution.

[Figure: a density curve with two overlapping humps centered at levels A and B.]

Figure 1.4 A bimodal distribution composed of two normal distributions.
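The claim that logging lognormal data yields a normal pattern is easy to demonstrate by simulation. The sketch below (illustrative, with arbitrary parameters, not from the original text) draws lognormal values with Python's standard library and compares the skewness of the raw values with that of their logarithms:

```python
import math
import random

random.seed(12345)
# Draw 10,000 values from a lognormal process (mean 0, sigma 0.5 on the log scale).
data = [random.lognormvariate(0.0, 0.5) for _ in range(10_000)]
logs = [math.log(x) for x in data]

def skewness(xs):
    """Sample skewness: near zero for a symmetric (e.g., normal) pattern."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / s) ** 3 for x in xs) / n

# The raw data show a long right tail; their logarithms are roughly symmetric.
print(round(skewness(data), 2), round(skewness(logs), 2))
```

The raw data are strongly right-skewed while the logged data are nearly symmetric, which is exactly the diagnostic the text describes.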



troubleshooter may need to identify a variety of small improvements that can be extended
over all machines and operators.

Variability from Assignable Causes


There are other important causes of variability, which Dr. Walter Shewhart called
assignable causes.6 These are sometimes called special causes.
This second type of variation often contributes in large part to the overall variability
of the process. Evidence of this type of variation offers important opportunities for
improving the uniformity of product. The process average may change gradually because
of gradual changes in temperature, tool wear, or operator fatigue. Alternatively, the
process may be unnecessarily variable because two operators or machines are perform-
ing at different averages. Variability resulting from two or more processes operating at
different levels, or of a single source operating with an unstable average, is typical of
production processes. This is the rule, not the exception to the rule.
This second type of variability must be studied using various techniques of data
analysis, which will be presented in this book, with the aim of separating it from the
first. Then, after responsible factors have been identified and corrected, continuing con-
trol7 of the process will be needed.

1.3 ORGANIZING DATA


Certain concepts and methods of analysis and interpretation of data appear to be simple.
Yet they are not easily acquired and assimilated. The discussion of methods extends
over the first five chapters. Some patience will be needed. After some concepts and
methodologies have been considered, the weaving-together of them with actual experi-
ences (case histories) will begin to make sense.
The data presented in Table 1.1 are measurements of small pieces of mica delivered
as one shipment. These readings were made with a dial indicator gauge on a sample of
n = 200 pieces of one mica type.
The observations in Table 1.1 could also be displayed individually along a hori-
zontal scale. Such a display would show extremes and any indication of clustering.
When n is large, we often decide to present the data in the condensed form of a grouped
frequency distribution (Table 1.2) or a histogram. A histogram is a picture of the distri-
bution in the form of a bar chart with the bars juxtaposed.
Table 1.2 was prepared8 by selecting cell boundaries to form equal intervals of
width m = 1 called cells. A tally mark was entered in the proper cell corresponding to
each measurement in the table. The number of measurements that fall in a particular cell
is called the frequency (fi) for that ith cell; also, fi/n is called the relative frequency, and

6. Walter A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand, 1931).
7. See Chapter 2.
8. A discussion of the grouping procedure is given in Section 1.4.

Table 1.2 Data: mica thickness as a tally sheet.


Data from Table 1.1 grouped into cells whose boundaries and midpoints are shown in columns
at the left.
Cell          Cell                                                        Observed    Percent
boundaries    midpoint   Tally                                            frequency   frequency

 4.75– 5.75    5.25      /                                                     1         0.5
 5.75– 6.75    6.25      /                                                     1         0.5
 6.75– 7.75    7.25      ///// ///// /                                        11         5.5
 7.75– 8.75    8.25      ///// ///// ///// ////                               19         9.5
 8.75– 9.75    9.25      ///// ///// ///// ///                                18         9.0
 9.75–10.75   10.25      ///// ///// ///// ///// ///// ///// ///// /////      40        20.0
10.75–11.75   11.25      ///// ///// ///// ///// ///// ////                   29        14.5
11.75–12.75   12.25      ///// ///// ///// ///// ///// ///// ///              33        16.5
12.75–13.75   13.25      ///// ///// ///// ///// ///                          23        11.5
13.75–14.75   14.25      ///// ///// ///                                      13         6.5
14.75–15.75   15.25      ///// ////                                            9         4.5
15.75–16.75   16.25      /                                                     1         0.5
16.75–17.75   17.25      //                                                    2         1.0

                                                                         n = 200       100%

100 fi /n is the percent frequency. An immediate observation from Table 1.2 is that mica
pieces vary from a thickness of about five thousandths at one extreme to 17 at the other.
Also, the frequency of measurements is greatest near the center, 10 or 11 thousandths,
and tails off to low frequencies on each end.
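The tallying described here is mechanical enough to script. A minimal sketch (illustrative, not from the original text) groups the first 20 observations of Table 1.1 into the cells used in Table 1.2 and reports frequency, relative frequency, and percent frequency:

```python
from collections import Counter

# First 20 mica observations (rows 1-2 of Table 1.1), thousandths of an inch.
sample = [8.0, 12.5, 12.5, 14.0, 13.5, 12.0, 14.0, 12.0, 10.0, 14.5,
          10.0, 10.5, 8.0, 15.0, 9.0, 13.0, 11.0, 10.0, 14.0, 11.0]

width = 1.0   # cell width m
lower = 4.75  # first cell boundary, halfway between two possible readings

def midpoint(x):
    """Return the midpoint of the cell containing observation x."""
    k = int((x - lower) // width)           # index of the cell
    return lower + width * k + width / 2    # lower boundary plus half a cell

freq = Counter(midpoint(x) for x in sample)
n = len(sample)
for mid in sorted(freq):
    f = freq[mid]
    print(f"{mid:5.2f}  f={f:2d}  rel={f / n:.3f}  pct={100 * f / n:.1f}%")
```

For this small subsample, the cell with midpoint 10.25 collects four observations (10.0, 10.0, 10.5, 10.0), just as a hand tally would show.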

1.4 GROUPING DATA WHEN n IS LARGE

Cells—How Many and How Wide?


Table 1.1 has been presented in Table 1.2 as a tally sheet. Many times there are advan-
tages in recording the data initially on a blank tally sheet instead of listing the numbers
as in Table 1.1. In preparing a frequency distribution, it is usually best to:
1. Make the cell intervals equal, of width m.
2. Choose cell boundaries halfway between two possible observations. This
simplifies classification. For example, in Table 1.1, observations were
recorded to the nearest half (0.5); cell boundaries were chosen beginning
with 4.75, that is, halfway between 4.5 and 5.0.
3. Keep the number of cells in grouping data from a very large sample between
13 and 20. Sturges’ rule of thumb for cell size gives the following relationship
between the number of cells c and sample size n:

c = 1 + 3.3 log10 n (1.1)

and since 2^3.3 ≅ 10, the relationship simplifies9 to a useful rule of thumb

2n = 2^c
n = 2^(c–1)

This leads to the following rough starting points for the number of cells in a
frequency distribution using Sturges’ rule.

Sample size n Number of cells c Sample size n Number of cells c


6–11 4 377–756 10
12–23 5 757–1,519 11
24–46 6 1,520–3,053 12
47–93 7 3,054–6,135 13
94–187 8 6,136–12,328 14
188–376 9 12,329–24,770 15

Note that a frequency distribution is a form of data display. The values
given are rough starting points only and thereafter the number of cells
should be adjusted to make the distribution reveal as much about the data
as possible. The number of cells c is directly related to the cell width:
c = ∆/m. In Table 1.1, a large10 observation is 17.0; a small one is 6.0.
Their difference ∆ (read "delta") is

∆ = 17.0 – 6.0 = 11.0

Now if the cell width m is chosen to be m = 1, we expect at least ∆/m = 11
cells; if chosen to be m = 0.5, we expect the data to extend over at least
11/0.5 = 22 cells. The tally (Table 1.2) was prepared with m = 1, resulting
in 13 cells. (See Exercise 1.a for m = 0.5.)
4. Choose cell boundaries that will simplify the tallying. The choice of 4.75,
5.75, 6.75, and so forth, as cell boundaries when grouping the data from
Table 1.1 results in classifying all numbers beginning with a 5 into the same
cell; its midpoint is 5.25, halfway between 5.0 and 5.5.

9. Converting Sturges' rule to base two, we obtain

c = 1 + 3.3 log10 n
c = 1 + 3.3 (log10 2) log2 n
c = 1 + 3.3 (0.301) log2 n
c = 1 + 0.993 log2 n ≅ 1 + log2 n

so that c – 1 ≅ log2 n, n ≅ 2^(c–1), and 2n ≅ 2^c.
10. Whether these are actually the very largest or smallest is not critical. Using m = 2, about six cells
would be required.

Similarly, all readings beginning with a 6 are grouped into the cell whose
midpoint is 6.25, and so forth. This makes tallying quite simple.
5. The midpoint of a cell is the average of its two boundaries. Midpoints
begin with
0.5(4.75 + 5.75) = 5.25

then increase successively by the cell width m.


6. A frequency distribution representing a set of data makes it possible to
compute two different numbers (see Table 1.3) that give objective and
useful information about the location and spread of the distribution. These
computed numbers are especially important in data analysis. The number11
X̄ is an estimate of the central location of the process, m. It is the arithmetic
average, or mean, of the observations. Estimates of the process mean are
symbolized by m̂. Thus, when the sample mean is used as an estimate, m̂ = X̄,
and the number ŝ is an estimate of the process variation s. Some
interpretations of ŝ will be discussed in Section 1.8. Both X̄ and ŝ can be
computed directly from the values themselves or after they have been
organized as a frequency distribution; the computations are shown in Table 1.3.
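Sturges' rule and the ∆/m check from step 3 can be computed directly; a minimal sketch (illustrative, not from the original text) for the mica example:

```python
import math

n = 200                       # sample size
c = 1 + 3.3 * math.log10(n)   # Sturges' rule, Equation (1.1)

delta = 17.0 - 6.0            # range of the mica observations
m = 1.0                       # chosen cell width
min_cells = delta / m         # at least this many cells will be needed

print(round(c, 2), min_cells)
```

For n = 200, Sturges' rule suggests roughly nine cells (matching the 188–376 row of the table above), while the chosen width m = 1 spreads the data over at least eleven cells.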

Table 1.3 Mica thickness.


Computation of X̄ and ŝ; data from Table 1.1.

Cell          Cell                                           Accumulated tally
boundaries    midpoints    fi       fimi       fimi²           Σfi       Σ%

 4.75– 5.75    5.25         1        5.25       27.56            1        0.5
 5.75– 6.75    6.25         1        6.25       39.06            2        1.0
 6.75– 7.75    7.25        11       79.75      578.19           13        6.5
 7.75– 8.75    8.25        19      156.75    1,293.19           32       16.0
 8.75– 9.75    9.25        18      166.50    1,540.13           50       25.0
 9.75–10.75   10.25        40      410.00    4,202.50           90       45.0
10.75–11.75   11.25        29      326.25    3,670.31          119       59.5
11.75–12.75   12.25        33      404.25    4,952.06          152       76.0
12.75–13.75   13.25        23      304.75    4,037.94          175       87.5
13.75–14.75   14.25        13      185.25    2,639.81          188       94.0
14.75–15.75   15.25         9      137.25    2,093.06          197       98.5
15.75–16.75   16.25         1       16.25      264.06          198       99.0
16.75–17.75   17.25         2       34.50      595.13          200      100.0

              n =         200    2,233      25,933

(Tally marks as in Table 1.2.)


11. Read "X-bar" for the symbol X̄ and read "sigma hat" for ŝ.
12 Part I: Basics of Interpretation of Data

1.5 THE ARITHMETIC AVERAGE OR MEAN—CENTRAL VALUE

There are n = 200 measurements of mica thickness recorded in Table 1.1. The arithmetic
average of this sample could be found by adding the 200 numbers and then dividing by
200. We do this when n is small and sometimes when using a calculator or computer.

The average X̄ obtained in this way is 11.1525.
More generally, let the n measurements be

X1, X2, X3, . . . , Xn (1.2)

A shorthand notation is commonly used to represent sums of numbers. The capital
Greek letter sigma, written Σ, indicates summation. Then the average X̄ of the n numbers
in Equation (1.2) is written symbolically12 as

X̄ = (ΣXi)/n, the sum taken for i = 1 to n    (1.3a)

Median and Midrange


A second measure of the center of a distribution is the median, X̃. When there is an
odd number of ordered observations, the middle one is called the median. When there
is an even number, the median is defined to be halfway between the two central values,
that is, their arithmetic average. In brief, half of the observations are greater than the
median and half are smaller. For the mica data, the median is 11.0. A third measure is
the midrange, which is halfway between the extreme observations, that is, their mean.

12. The expression ΣXi, i = 1 to n, is read "the summation of Xi from i = 1 to n." The letter i (or j or
whatever letter is used) is called the index of summation. The numbers written above and below, or
following, Σ indicate that the index i is to be given successively each integral value from 1 to n,
inclusively.
The expression ΣXi², i = 1 to 10, represents the sum

X1² + X2² + X3² + . . . + X10²

and sometimes the index of summation is omitted when all the observations are used. Thus,

Σ(X – X̄)² = (X1 – X̄)² + (X2 – X̄)² + (X3 – X̄)² + . . . + (Xn – X̄)²

Both of these expressions are used frequently in data analysis.



[Figure: two normal density curves, (a) with standard deviation s1 and (b) with a larger standard deviation s2, both centered at the same mean.]

Figure 1.5 Two normal distributions with m1 = m2 but s2 > s1.

For the mica data, the midrange is (17 + 5)/2 = 11. Meteorologists, for example, use the
midrange to represent average daily temperature.
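All three measures of center are one-liners; a quick sketch (illustrative, not from the original text) using the first five observations from column one of Table 1.1, the same small sample used in Section 1.6:

```python
from statistics import mean, median

sample = [8.0, 10.0, 12.0, 12.0, 11.5]  # first five observations, Table 1.1

xbar = mean(sample)                       # arithmetic average
med = median(sample)                      # middle ordered value
midrange = (min(sample) + max(sample)) / 2

print(xbar, med, midrange)
```

Note how the three measures can disagree: here the mean is 10.7, the median 11.5, and the midrange 10.0.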

1.6 MEASURES OF VARIATION

Computing a Standard Deviation


Figure 1.5 shows two distributions having the same average. It is very clear that the
average alone does not represent a distribution adequately. The distribution in (b) spreads
out more than the one in (a). Thus some measure is needed to describe the spread or
variability of a frequency distribution. The variability of the distribution in (b) appears
to be about twice that in (a).
A useful measure of variability is called the standard deviation ŝ . We shall present
the calculation and then discuss some uses and interpretations of ŝ . This is the small
Greek letter sigma with a “hat” to indicate that it is an estimate of the unknown measure
of population variability. The symbol ŝ is used as an omnibus symbol to represent any
estimate of the unknown process parameter s, just as m̂ represents any estimate of m.
Their specific meaning is taken in context.
The value of ŝ for smaller values of n is often obtained from the formula

σ̂ = s = √[Σ(X – X̄)²/(n – 1)] = √[(nΣX² – (ΣX)²)/(n(n – 1))]    (1.4a)

The first formula is the definition of s, while the algebraically equivalent second
formula is often easier to compute. To illustrate the calculation required, consider the
first five observations of mica thickness in the first column of Table 1.1, that is, 8.0,

10.0, 12.0, 12.0, 11.5. Their mean is X̄ = 53.5/5 = 10.7. We have

   X        X²       (X – X̄)    (X – X̄)²
   8.0      64.00     –2.7        7.29
  10.0     100.00     –0.7        0.49
  12.0     144.00      1.3        1.69
  12.0     144.00      1.3        1.69
  11.5     132.25      0.8        0.64
Total 53.5 584.25      0.0       11.80

σ̂ = s = √(11.8/4) = √[(5(584.25) – (53.5)²)/(5(4))] = 1.7176
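Both forms of Equation (1.4a) can be checked numerically on the same five observations; a minimal sketch (illustrative, not from the original text):

```python
import math

x = [8.0, 10.0, 12.0, 12.0, 11.5]   # first five mica observations
n = len(x)
xbar = sum(x) / n

# Definitional form: sum of squared deviations from the mean.
s_def = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))

# Computational form: crude sums, no deviations needed.
s_comp = math.sqrt((n * sum(xi ** 2 for xi in x) - sum(x) ** 2) / (n * (n - 1)))

print(round(s_def, 4), round(s_comp, 4))
```

The two forms agree to machine precision, confirming they are algebraically equivalent.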

For the complete sample of 200 measurements of mica thickness we find

ΣXi = 2,230.5        ΣXi² = 25,881.75

X̄ = (ΣXi)/n = 2,230.5/200 = 11.1525

and

s = √[(nΣX² – (ΣX)²)/(n(n – 1))] = √[(200(25,881.75) – (2,230.5)²)/(200(199))]
  = √5.0557726 = 2.248505 ≈ 2.249

Occasionally, data are available only in grouped form, or it is necessary to group


for other purposes. When this happens, a simple computational procedure can be used
to obtain the arithmetic average and the standard deviation of the data. See Table 1.3 for
the mica thickness observations.
1. Characterize all the values in a cell by the cell midpoint.
2. Multiply the midpoint mi of each cell by the cell frequency fi to obtain an
fimi column.
3. Add the numbers in the fimi column to approximate the grand total of all
the observations as Σfimi (the actual sum is 2,230.5).
4. Divide Σfimi by the sample size n to obtain
X̄ = (Σfimi)/n = 2,233/200 = 11.165    (1.3b)

The value 11.165 has been obtained by assigning all measurements within a
cell to have the value of the midpoint of the cell. The result compares closely
with the arithmetic average of 11.1525 computed by adding all individual
measurements and dividing by 200.
5. The computation of ŝ = s in Table 1.3 requires one more column than for
X̄, which is labeled fimi². It is obtained by multiplying each number in the mi
column by the corresponding number in the fimi column. Then compute

Σfimi² = 25,933

which approximates the crude sum of squares of the individual observations
(which was ΣX² = 25,881.75).
6. Obtain σ̂ from Equation (1.4b):

      σ̂ = s = √[(nΣfᵢmᵢ² – (Σfᵢmᵢ)²)/(n(n – 1))]
            = √[(200(25,933) – (2,233)²)/(200(199))] = 2.243     (1.4b)

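Steps 1 through 6 can be verified numerically. Table 1.3 itself is not reproduced here, so the sketch below starts from its column totals (n = 200, Σfᵢmᵢ = 2,233, Σfᵢmᵢ² = 25,933) rather than the individual cells:

```python
import math

# Column totals from Table 1.3 (grouped mica thickness data)
n = 200
sum_fm = 2233.0     # sum of fi * mi
sum_fm2 = 25933.0   # sum of fi * mi**2

xbar = sum_fm / n
s = math.sqrt((n * sum_fm2 - sum_fm ** 2) / (n * (n - 1)))

print(round(xbar, 3))   # 11.165
print(round(s, 3))      # 2.243
```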

These computation procedures are a great simplification over other procedures


when n is large. This result also compares closely with the sample standard deviation
s = 2.249 obtained from all the individual measurements. When n is small, see Chapter 4.
Some interpretations and uses of ŝ will be given in subsequent sections.
The estimate s has certain desirable properties. For instance, the sample variance s²
is an unbiased estimate of the true but unknown process variance σ². There are, of
course, other possible estimates of the standard deviation, each with its own properties.
Differences in the behavior of these estimates give important clues in troubleshooting.
For example,13 there are the unbiased sample standard deviation estimate, the range

13. For other examples of simple estimates and relative efficiencies, see W. J. Dixon and F. J. Massey, Introduction to
Statistical Analysis (New York: McGraw-Hill, 1969).

estimate, the mean deviation from the median, the best linear estimate, and others. For
sample size 5, these are:
Unbiased sample standard deviation estimate:

   σ̂ = s/c₄ = 1.0638s   (100% efficiency)

Best linear estimate:

ŝ = 0.3724(X(5) – X(1)) + 0.1352(X(4) – X(2)) (98.8% efficiency)

Range estimate:

ŝ = 0.43(X(5) – X(1)) (95.5% efficiency)

Mean deviation from median:

ŝ = 0.3016(X(5) + X(4) – X(2) – X(1)) (94% efficiency)

where X(i) indicates the ith ordered observation. For the first sample of 5 from the mica
data, ordered as 8, 10, 11.5, 12, 12, these are:
Unbiased sample standard deviation estimate:

   σ̂ = 1.7176/0.94 = 1.8272

Best linear estimate:

ŝ = 0.3724(12 – 8) + 0.1352(12 – 10) = 1.760

Range estimate:

ŝ = 0.43(12 – 8) = 1.72

Mean deviation from median:

ŝ = 0.3016(12 + 12 – 10 – 8) = 1.8096
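The four estimates can be reproduced directly from the coefficients above; the value s = 1.7176 and the constant c₄ ≅ 0.94 for n = 5 are taken from the text:

```python
# First subgroup of five mica measurements, in ascending order
ordered = [8.0, 10.0, 11.5, 12.0, 12.0]
x1, x2, x3, x4, x5 = ordered   # x3 is the sample median

s = 1.7176   # sample standard deviation of this subgroup, computed earlier
c4 = 0.94    # bias-correction constant for n = 5

unbiased = s / c4
best_linear = 0.3724 * (x5 - x1) + 0.1352 * (x4 - x2)
range_est = 0.43 * (x5 - x1)
mean_dev_median = 0.3016 * (x5 + x4 - x2 - x1)

print(round(unbiased, 4))          # 1.8272
print(round(best_linear, 3))       # 1.76
print(round(range_est, 2))         # 1.72
print(round(mean_dev_median, 4))   # 1.8096
```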

It should not be surprising that we come out with different values. We are esti-
mating and hence there is no “correct” answer. We don’t know what the true population
s is; if we did, there would be no need to calculate an estimate. Since each of these
methods has different properties, any (or all) of them may be appropriate depending
on the circumstances.

The desirable properties of s, or more properly its square s2, have led to the popu-
larity of that estimate in characterizing sample variation. It can also be calculated rather
simply from a frequency distribution.

Note: The procedures of Table 1.3 not only provide values of X and ŝ but also pro-
vide a display of the data. If the histogram approximated by the tally shows a definite

bimodal shape, for example, any interpretations of either X or ŝ must be made carefully.
One advantage of plotting a histogram is to check whether the data appear to come
from a single source or from two or perhaps more sources having different averages. As
stated previously, the mica manufacturer believed that many of the mica pieces were
splitting during handling. If so, a definite bimodal shape should be expected in Table
1.3. The data do not lend support to this belief.
The mica-splitting data (Tables 1.1, 1.2, 1.3) were obtained by measuring the thick-
ness of mica pieces at the incoming inspection department. The data almost surely came
from a process representing production over some time period of different workers on
different knife splitters; it just is not reasonable to expect all conditions to be the same.
The data represent what was actually shipped to us—not necessarily what the produc-
tion process was capable of producing.

Some Coding of Data (Optional)14



The computations of X and ŝ can sometimes be made easier by using some important
properties and methods of coding (transforming) data. These are used in change of
scale. Consider again the n measurements in Equation (1.2).

X1, X2, X3, . . . , Xn (1.2)



• What happens to their average X and standard deviation ŝ if we translate the
origin by adding a constant c to each?

X1 + c, X2 + c, X3 + c, . . . , Xn + c (1.5)

1. The average of this new set of numbers will be the original average
   increased by c:

      New average = Σ(Xᵢ + c)/n = (ΣXᵢ + nc)/n = X̄ + c

2. The standard deviation of this new set of numbers in Equation (1.5) is not
changed by a translation of origin; their standard deviation is still ŝ .

14. This procedure may be omitted. Simple algebra is sufficient to prove the following relations pertaining to standard
deviations; simple but tedious. The proofs are omitted.


• What happens to the average X̄ and standard deviation σ̂ if we multiply each
  number in Equation (1.2) by a constant c?

cX1, cX2, cX3, . . . , cXn (1.6)

1. The average of these numbers will be the original average multiplied by c:

      New average = ΣcXᵢ/n = cΣXᵢ/n = cX̄

2. The standard deviation of these numbers will be the original multiplied
   by c; that is,

      New σ̂ = cσ̂

We see then that data coded in the form y = a + bx will have ȳ = a + bx̄
and sy = b·sx.
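The coding rules can be illustrated with a small Python check; the constants a = 100 and b = 2 are arbitrary choices for the example:

```python
import statistics

x = [8.0, 10.0, 12.0, 12.0, 11.5]   # first mica subgroup
a, b = 100.0, 2.0                   # arbitrary coding constants: y = a + b*x
y = [a + b * xi for xi in x]

mean_y = statistics.fmean(y)
sd_y = statistics.stdev(y)

# Mean shifts and scales; standard deviation only scales.
print(round(mean_y, 4))   # 121.4   (= a + b * 10.7)
print(round(sd_y, 4))     # 3.4351  (= b * 1.7176)
```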

1.7 NORMAL PROBABILITY PLOTS


A graphical method of presenting data is often helpful in checking on the stability of the
source producing the data. The accumulated percents of mica thickness data are shown
in Table 1.3, right-hand column. These have been plotted on a normal probability scale in
Figure 1.6. There are 13 cells in Table 1.3; a convenient scale has been chosen on the
baseline (Figure 1.6) to accommodate the 13 cells. The upper cell boundaries have been
printed on the base scale; the chart shows the accumulated percent frequencies up to the
upper cell boundaries.
Normal probability paper is scaled in such a way that a truly normal curve will be
represented by a straight line. A line can be drawn using a clear plastic ruler to approx-
imate the points; it is not unusual for one or two points on each end to deviate slightly,
as in Figure 1.6, even if the source of the data is essentially a normal curve. The data
line up surprisingly well.
The median and the standard deviation of the data can be estimated from the
straight-line graph on the normal probability plot.
The median is simply the 50 percent point. A perpendicular line has been dropped
from the intersection of the plotted line and the 50 percent horizontal line. This cuts the
base line at the median. Its estimate is

10.75 + 0.4 = 11.15



This is in close agreement with the computed X = 11.1525 from all the measurements.

Figure 1.6 Mica thickness; accumulated percents plotted on normal probability paper.
(Data from Table 1.3.) [Figure: the fitted line crosses the 50 percent level at the
median, 11.15; the 16 percent and 84 percent levels cut the baseline at 8.95 and
13.55, giving 2σ̂ = 4.60 and σ̂ = 2.30.]

Estimating the standard deviation involves more arithmetic. One method is to determine
where horizontal lines corresponding to the accumulated 16 percent and 84 percent
values cut the line. These numbers correspond to areas under the normal curve to
the left of ordinates drawn at X̄ – σ̂ and X̄ + σ̂; that is, they differ by an estimated 2σ̂.

In Figure 1.6, corresponding vertical lines appear to cut the base line at

84% point: 13.75 – 0.2 ≅ 13.55


16% point: 8.75 + 0.2 ≅ 8.95
2ŝ ≅ 4.60
ŝ ≅ 2.30

This agrees reasonably well with the previously computed value

ŝ = s = 2.249

from all 200 measurements.15
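A short Python sketch of the same graphical estimate; the 16, 50, and 84 percent values are those read from the fitted line in Figure 1.6:

```python
from statistics import NormalDist

# For a normal curve, about 84.13% of the area lies below mu + sigma,
# which is why the 16% and 84% plotting points differ by roughly 2 sigma.
assert abs(NormalDist().cdf(1.0) - 0.8413) < 1e-3

# Values read from the fitted line in Figure 1.6 (thousandths of an inch)
p16, p50, p84 = 8.95, 11.15, 13.55

sigma_hat = (p84 - p16) / 2
print(p50)                   # median estimate, 11.15
print(round(sigma_hat, 2))   # 2.3
```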


Interpretation: The only possible evidence of mica pieces splitting in two during
handling is the pair of points at the lower left end of the line. But splitting is certainly
not a major factor—rather, the process average should be increased by about 0.6 thou-
sandths (11.75 – 11.15 = 0.6) since the center of specifications is at

½(8.5 + 15.0) = 11.75 thousandths

Example 1.1 Two Processes


Depth-of-cut data are shown as a frequency distribution in Table 1.4 with accumulated
percents on the right (data from Table 1.8).
These accumulated percents have been plotted on normal probability paper in
Figure 1.7. The points give evidence of fitting two line segments; a single line does not
fit them well. There is a run of length five below the initial line. Although it is possible

mechanically to compute an X and a ŝ , we should be hesitant to do so. These data rep-
resent two different processes.
This set of data is discussed again in Chapter 2, Case History 2.1.

1.8 PREDICTIONS REGARDING SAMPLING


VARIATION: THE NORMAL CURVE
This topic is of primary importance in process maintenance and improvement.
Consider pieces of mica being split by one operator. The operator produces many
thousands a day. We can imagine that this process will continue for many months or
years. The number produced is large—so large that we can consider it to be an infinite

15. The symbol ≅ means “approximately equal to.”



Table 1.4 Data: depth of cut.
Data from Table 1.8 displayed on a tally sheet.

Cell interval     f     Σf      Σ%
1610–11           2    125    100.0
1608–09           0    123     98.4
1606–07           4    123     98.4
1604–05           2    119     95.2
1602–03           7    117     93.6
1600–01          11    110     88.0
1598–99          16     99     79.2
1596–97          29     83     66.4
1594–95          28     54     43.2
1592–93          16     26     20.8
1590–91           5     10      8.0
1588–89           3      5      4.0
1586–87           0      2      1.6
1584–85           1      2      1.6
1582–83           1      1      0.8

(Cell boundaries fall at 1581.5, 1583.5, . . . , 1609.5; the tally column of the
original table is summarized here by the frequency column f.)
universe. In production operations, we are concerned not only with the mica pieces that
are actually produced and examined but with those that were produced and not exam-
ined. We are also concerned with those that are yet to be produced. We want to make
inferences about them. This is possible provided the process is stable.
We can think of the process as operating at some fixed stable level and with some
fixed stable standard deviation. We refer to these two concepts by the Greek letters μ
and σ,16 respectively. The actual values of μ and σ can never be learned in practice;
they are abstract concepts. Yet, they can be estimated as closely as we please by
computing X̄ and σ̂ from large enough samples. How large the sample must be is answered
in the following discussions (Section 1.11).
If we have two operators splitting micas, it is not unusual to find differences in their
output either in average thickness or in variability of product. If they are found to have
equal averages and standard deviations, then they can be considered to be a single pop-
ulation source.
It is important to know how much variation can be predicted in a succession of sam-
ples from a stable17 process. What can be predicted about a second sample of 200 mica

16. μ is pronounced "mew." It designates an assumed true, but usually unknown, process average. When n items
of a random sample from a process are measured/tested, their average X̄ designates an estimate of μ: μ̂ = X̄.
Another important concept is that of a desired or specified average. It is commonly designated by the
symbol X̄′ (read "X-bar prime").
The symbol σ′ (read "sigma prime") is sometimes used to designate a desired or specified measure of
process variability.
17. Unstable processes are unpredictable. Few processes are stable for very long periods of time whether in a
laboratory or in production.

Figure 1.7 Depth of cut on normal probability paper. (Data from Table 1.4.)
[Figure: the plotted points follow two line segments rather than a single line; the
departure is a consequence of samples 16 to 25 (see Table 1.8 and Figure 2.7).]

pieces, which we might have obtained from the shipment that provided the data in Table

1.1? It would be surprising if the newly computed X were exactly 11.1525 thousandths
as it was for the first sample; it would be equally surprising if the computed ŝ were

exactly 2.249 again. The following two theorems relate to the amount of variation

expected among k sample averages Xi and standard deviations ŝ i of samples of n each
drawn from a stable process (or which might be so drawn).
The k averages
– – – –
X1, X2, X3, . . . , Xk

of the k samples will vary and will themselves form a frequency distribution. The sam-
ple averages will vary considerably less than individuals vary.
Theorem 1: The standard deviation σ̂X̄ of averages of samples of size n drawn
from a process will be related to the standard deviation of individual observa-
tions by the relation

   σ̂X̄ = σ̂/√n     (1.7)

This theorem says that averages of n = 4, for example, are predicted to vary half as
much as individuals and that averages of n = 100 are predicted to vary one-tenth as
much as individuals.
From each of the k samples, we can also compute a standard deviation

ŝ 1, ŝ 2, ŝ 3, . . . , ŝ k

These also form a distribution. What can be predicted about the variation among
these standard deviations computed from samples of size n?
Theorem 2: The standard deviation of sample standard deviations will be related
to the standard deviation of individual measurements (σ̂) by the relation:

   σ̂σ̂ ≅ σ̂/√(2n)     (1.8)

These two theorems are important in the study of industrial processes. The basic
theorems about the predicted variation in sample averages and sample standard
deviations relate to idealized mathematical distributions. In applying them to real
data, we must obtain the estimates σ̂X̄ and σ̂σ̂; these are given by Equations (1.7)
and (1.8).
Distributions of sample averages from parent universes (populations) of different
shapes are similar to each other.
Consider averages of samples of size n drawn from a parent population or process.
It had been known that sample averages were essentially normally distributed
1. When n was large, certainly when n approached infinity.
2. Usually regardless of the shape of the parent population, normally distributed
or not.

In the late 1920s, Dr. Walter A. Shewhart conducted some basic and industrially
important chip drawings. Numbers were written on small metal-rimmed tags, placed in
a brown kitchen bowl, and experimental drawings (with replacement of chips) made
from it. Among other things, he wanted to see if there were predictable patterns (shapes)
to distributions of averages of size n drawn from some simple populations. He recog-
nized the portent of using small samples for industrial applications provided more was
known about small samplings from a stable universe, such as drawing numbered chips
from a bowl.
Three different sets of chips were used: one represented a rectangular universe;
another, a right-triangular distribution universe; and the third, a normal distribution. In
each experiment, many sample averages were obtained using n = 3, n = 4, and n = 5.
One important consequence is given in Theorem 3.
Theorem 3: Even with samples as small as n = 4, the distribution of averages
of random samples drawn from almost any shaped parent population18 will be
essentially normal.
Figure 1.8 portrays the relationship of the distribution of sample averages to their
parent universes even for samples as small as n = 4. For sample sizes larger than n = 4,
the shape of the curve of averages also tends to normality. Averages (with n as small as
4) from these different parent populations tend to be normally distributed (Theorem 3).
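Shewhart's chip drawings can be imitated with a small simulation. The sketch below assumes a rectangular (uniform) parent on (0, 1) in place of numbered chips; the seed and sample counts are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(12345)   # arbitrary seed so the run is repeatable

# Draw many samples of n = 4 from a rectangular (uniform) parent population
k, n = 50_000, 4
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(k)]

sigma = (1 / 12) ** 0.5          # true sd of a uniform(0, 1) parent, about 0.289
sd_means = statistics.stdev(means)
# Near-normal behavior: roughly 95% of averages within 2 sigma/sqrt(n) of 0.5
within = sum(abs(m - 0.5) <= 2 * sigma / n ** 0.5 for m in means) / k

print(round(statistics.fmean(means), 2))   # close to 0.5
print(round(sd_means, 2))                  # close to sigma / sqrt(4), about 0.14
print(round(within, 2))                    # close to 0.95
```

The averages cluster tightly about the parent mean with spread σ/√n (Theorem 1), and their coverage matches the normal-curve prediction (Theorem 3) even though the parent is far from normal.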
The normal curve is symmetrical and bell-shaped; it has an equation whose
idealized form is

   Y = [1/(σ√(2π))] e^(–(X – μ)²/(2σ²))     (1.9)

The term normal is a technical term; it is neither synonymous with usual nor the
opposite of abnormal. Often the terms Gaussian or bell-shaped distribution are used.
Areas under the normal curve (Figure 1.2) can be calculated, but the calculations are
tedious. Values are given in Appendix Table A.1. However, there are a few important
area relationships that are used so frequently that they should be memorized. The fol-
lowing are obtained from Table A.1:

Between                     Percent of area under normal curve
μ – 3σ and μ + 3σ           99.73 ≅ 99.7, that is, "almost all"
μ – 2σ and μ + 2σ           95.44 ≅ 95
μ – σ and μ + σ             68.26 ≅ 68
                                                            (1.10)
In practice, of course, we do not know either μ or σ; they are replaced by their esti-
mates X̄ and σ̂, computed from a representative sample of the population.

18. With a finite variance.



Figure 1.8 Distributions sampled by Shewhart: (a) rectangular parent population; (b) right-
triangular parent population; (c) normal parent population. [Figure: each panel shows the
parent population together with the narrower, near-normal distribution of averages of
samples of n = 4.]

In other words, about 95 percent of all production from a well-controlled (stable)
process can be expected to lie within a range of ±2σ around the process average, and
almost all, 99.7 percent, within a range of ±3σ around the average.

Example 1.2 Two Applications


1. Within what region can we predict that mica thickness will vary in the
shipment from which the sample of Table 1.1 came?
To obtain the answer, we assume a stable process producing normally
distributed thicknesses.

Figure 1.9 Estimating percent of a normal curve outside given specifications. (Related to data
in Table 1.1.) [Figure: a normal curve centered at μ̂ = X̄ = 11.152 with LSL = 8.5 and
USL = 15.0 marked.]


Answer A. From the 200 observations, we computed X̄ = 11.152 and
σ̂ = 2.249. From the relation in Equation (1.10), we expect almost all (about
99.7 percent) to be between

   X̄ + 3σ̂ = 11.152 + 3(2.249) = 17.90 thousandths
and
   X̄ – 3σ̂ = 11.152 – 3(2.249) = 4.40 thousandths

Also from the relation in Equation (1.10), we expect about 95 percent to
be between

   X̄ + 2σ̂ = 15.65 and X̄ – 2σ̂ = 6.65

Answer B. In Table 1.1, we find the one thinnest piece to be 5.0; also, the
two thickest ones to be 17.0. This is in agreement with the ±3σ̂ prediction
of Answer A.
2. What percent of nonconforming mica pieces do we expect to find in the entire
shipment of which the data in Table 1.1 comprise a sample?
Answer. The specifications on the mica thickness were 8.5 to 15.0 thousandths
of an inch as shown in Figure 1.9. We can compute the distance from
X̄ = 11.152 to each of the specifications expressed in standard deviations:

   Z₁ = (X̄ – LSL)/σ̂ = (11.152 – 8.5)/2.249 = 1.18     (1.11a)

From Appendix Table A.1, we find the corresponding percent below 8.5
(that is, below X̄ – 1.18σ̂) to be 11.9 percent. (The actual count from
Table 1.1 is 24, that is, 12 percent.) Also,

   Z₂ = (USL – X̄)/σ̂ = (15.0 – 11.152)/2.249 = 1.71     (1.11b)

Again from Table A.1, we find the expected percent above 15 (that is, above
X̄ + 1.71σ̂) to be about 4.4 percent. (The actual count is 7, that is, 3.5 percent.)
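These two tail areas can also be computed directly from the normal distribution function, in place of Appendix Table A.1:

```python
from statistics import NormalDist

xbar, sigma_hat = 11.152, 2.249
lsl, usl = 8.5, 15.0

z1 = (xbar - lsl) / sigma_hat
z2 = (usl - xbar) / sigma_hat
below = NormalDist().cdf(-z1)      # fraction expected below the LSL
above = 1 - NormalDist().cdf(z2)   # fraction expected above the USL

print(round(z1, 2), round(z2, 2))                    # 1.18 1.71
print(round(100 * below, 1), round(100 * above, 1))  # 11.9 4.4
```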
Discussion: There are different possible explanations for the excessive variability of
the mica-splitting operation: variation in the mica hardness, variation in each of the
operators who split the blocks of mica using a small bench knife, and any variations
between operators.
An important method of future process surveillance was recommended—a
Shewhart control chart, which is discussed in Chapter 2.

Example 1.3. Using the Mica Thickness Data


The average and standard deviation were computed from the sample of 200 measure-
ments to be

X = 11.152 thousandths and ŝ = 2.249 thousandths

Then, from Theorem 1, for a series of averages of samples of n = 200 from this
same process, assumed stable,

   σ̂X̄ = σ̂/√n = 2.249/√200 = 2.249/14.14 = 0.159

An estimate of the variation of averages to be expected in random, representative
samples of n = 200 from a process with σ̂ = 2.249 and average X̄ = 11.152 is then19

   X̄ ± 2σ̂X̄ = 11.152 ± 0.318 thousandths (with about 95% probability)
   X̄ ± 3σ̂X̄ = 11.152 ± 0.477 thousandths (with about 99.7% probability)

Also from Theorem 1, we can estimate the location of the assumed true but
unknown average μ of the mica-splitting process. This converse use of Theorem 1 is
applicable when n is as large as 30 or more. A modification, not discussed in this text,
is required for smaller sample sizes. For n = 200,

   X̄ – 2σ̂X̄ = 11.152 – 0.318 ≅ 10.83

19. See Equation (1.10) and Theorem 1.



Figure 1.10 Estimating confidence intervals of unknown process average. [Figure: about
X̄ = 11.152, the interval X̄ ± 2σ̂X̄ (95%) spans 10.83 to 11.47 and the interval X̄ ± 3σ̂X̄
(99.7%) spans 10.67 to 11.63.]

and

   X̄ + 2σ̂X̄ = 11.152 + 0.318 ≅ 11.47

that is, 10.83 < μ < 11.47 thousandths (with about 95 percent confidence). Also, we can
estimate the location of the unknown average μ to be between

   X̄ – 3σ̂X̄ = 11.152 – 0.477 ≅ 10.67
and
   X̄ + 3σ̂X̄ = 11.152 + 0.477 ≅ 11.63

that is, 10.67 < μ < 11.63 with 99.7 percent confidence.


In Figure 1.10, we see the increase in interval required to change the confidence in
our estimate from 95.5 to 99.7 percent.
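The interval arithmetic of Example 1.3 can be verified in a few lines of Python:

```python
import math

xbar, s, n = 11.152, 2.249, 200
se = s / math.sqrt(n)   # estimated standard error of the mean (Theorem 1)

print(round(se, 3))                                       # 0.159
print(round(xbar - 2 * se, 2), round(xbar + 2 * se, 2))   # 10.83 11.47
print(round(xbar - 3 * se, 2), round(xbar + 3 * se, 2))   # 10.67 11.63
```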

1.9 SERIES OF SMALL SAMPLES FROM A


PRODUCTION PROCESS

The amount of variation expected in X and ŝ in a succession of samples from an indus-
trial process can be predicted only when the process average and variability are stable.20
In the following discussion, we assume that the process is stable and make predictions
about the expected variation in samples obtained randomly from it.

20. Actually, of course, the lack of basic stability in a process average is the usual situation; it is the major reason for
troubleshooting and process improvement studies. Methods for using a succession of small samples in studying
lack of stability in a process will be considered in Chapter 2 and subsequent chapters.

1.10 CHANGE IN SAMPLE SIZE: PREDICTIONS
ABOUT X̄ AND σ̂
We might have taken a smaller sample from the mica shipment. For example, the mea-
surements in the top five rows of Table 1.1 constitute a sample of n = 50 from the mica-
splitting process that produced the shipment. We expect large random samples to provide
more accurate estimates of the true process average and standard deviation than smaller
samples. Smaller samples, however, often provide answers that are entirely adequate.
The two theorems in this chapter give useful information pertaining to sample size.

The computed values X̄ and σ̂ from all 200 values were

   X̄ = 11.152 and σ̂ = 2.249 with n = 200

The values X̄ and σ̂, from the top 50 measurements in Table 1.1, are computed
below from the original observations.
   ΣXᵢ = 604.5     ΣXᵢ² = 7552.25     n = 50

   X̄ = ΣXᵢ/n = 604.5/50 = 12.09

   σ̂ = s = √[(nΣXᵢ² – (ΣXᵢ)²)/(n(n – 1))] = √[(50(7552.25) – (604.5)²)/(50(49))] = 2.231

First, what can we predict from X̄ and σ̂ of samples of size n = 50 drawn from the
mica-splitting process?

Variation of X̄
From Theorem 1, the average of a sample of size n = 50, assuming stability, is in
the region

   μ ± 2σ/√n = μ ± 2(2.231)/√50 ≅ μ ± 0.636

with 95 percent probability. Clearly, this implies that the sample mean X̄ will be within
a distance of 0.636 from the true mean 95 percent of the time when n = 50. So, the
interval

   X̄ ± 2σ̂/√n = 12.09 ± 0.636

will contain the true mean approximately 95 percent of the time in the long run. On a
single trial, however, the interval may or may not contain the true mean. We say we
have 95 percent confidence that the interval contains the true mean on a single trial in
the sense that the statement "it contains the true mean" would be correct 95 percent of the
time in the long run. The term confidence is used since the term probability applies to
repeated trials, not to a single result.
The sample of 50, then, gives a 95 percent confidence interval of 12.09 ± 0.636,
or 11.454 < μ < 12.726. Note that the sample of 200 gives a 95 percent confidence
interval of

   X̄ ± 2σ̂/√n = 11.152 ± 2(2.249)/√200 = 11.152 ± 0.318
   or 10.834 < μ < 11.470

The reduction in the size of the interval from n = 50 to n = 200 reflects the greater
certainty involved in taking the larger sample.

Variation of σ̂
The expected value of σ̂, assuming stability, is in the region σ̂ ± 2σ̂σ̂, which is esti-
mated by

   σ̂ ± 2σ̂/√(2n)

with 95 percent probability. By applying reasoning similar to that used for the mean, it
can then be shown that for the sample of size n = 50, a 95 percent confidence interval
for σ is

   σ̂ ± 2σ̂/√(2n) = 2.231 ± 2(2.231/√100) = 2.231 ± 0.446
   or 1.785 < σ < 2.677
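The confidence interval for σ can be verified similarly (a sketch added for illustration):

```python
import math

s, n = 2.231, 50
se_sigma = s / math.sqrt(2 * n)   # approximate standard error of s (Theorem 2)
lo, hi = s - 2 * se_sigma, s + 2 * se_sigma

print(round(se_sigma, 4))           # 0.2231
print(round(lo, 3), round(hi, 3))   # 1.785 2.677
```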

1.11 HOW LARGE A SAMPLE IS NEEDED TO


ESTIMATE A PROCESS AVERAGE?
There are many things to consider when answering this question. In fact, the question
itself requires modification before answering. Is the test destructive, nondestructive, or
semidestructive? How expensive is it to obtain and test a sample of n units? How close
an answer is needed? How much variation among measurements is expected? What
level of confidence is adequate? All these questions must be considered.
Discussion: We begin by returning to the discussion of variation expected in averages
of random samples of n items around the true but unknown process average μ. The
expected variation of sample averages21 about μ is

   ±2σ/√n (confidence about 95%)
and
   ±3σ/√n (confidence 99.7%)

Now let the allowable deviation (error) in estimating μ be ±Δ (read "delta"); also
let an estimate or guess of σ be σ̂; then

   Δ ≅ 2σ̂/√n and n ≅ (2σ̂/Δ)² (about 95% confidence)     (1.12a)
also
   Δ ≅ 3σ̂/√n and n ≅ (3σ̂/Δ)² (99.7% confidence)     (1.12b)

Confidence levels other than the two shown in Equation (1.12) can be used by refer-
ring to Table A.1. When our estimate or guess of a required sample size n is even as
small as n = 4, then Theorem 3 applies.

Example 1.4 Sample Size


The mica manufacturer wants to estimate the true process average of one of his opera-
tors (data from Table 1.1). How large a random sample will he need?
• In this simple nondestructive testing situation, cost associated with sample size
selection and tests is of little concern.

21. See Section 1.8, Theorems 1 and 3; also Equation (1.10).



• What is a reasonable choice for Δ? Since specifications are 8.5 to 15
  thousandths, an allowance of ±Δ = ±1 seems reasonable to use in estimating a
  sample size.
• What is an estimate of σ? No information is available here for any one
  operator; we do have an estimate of overall variation, σ̂ = 2.243 from Table 1.3.
  This estimate probably includes variation resulting from several operators and
  thus is larger than for any one. However, the best available estimate is from
  Equation (1.4a): σ̂ = 2.249. Then from Equation (1.12a), n = (4.498/1)² ≅ 20
  (about 95 percent confidence).
Decision: A sample size of n = 20 to 25 should be adequate to approximate the
process average m. However, a somewhat larger sample might be selected since it would
cost but little more and might be accepted more readily by other persons associated with
the project.
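Equations (1.12a) and (1.12b) applied to these numbers; the 99.7 percent sample size is an added computation not worked in the text:

```python
sigma_guess = 2.249   # best available estimate of sigma, Equation (1.4a)
delta = 1.0           # allowable error in the estimate of mu

n95 = (2 * sigma_guess / delta) ** 2    # Equation (1.12a)
n997 = (3 * sigma_guess / delta) ** 2   # Equation (1.12b)

print(round(n95))    # 20
print(round(n997))   # 46
```

Tightening the confidence level from about 95 to 99.7 percent more than doubles the required sample size.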

1.12 SAMPLING AND A SECOND METHOD
OF COMPUTING σ̂
The method presented in this section is basic to many procedures for studying produc-
tion processes.
The data in Table 1.1 represent a sample of 200 thickness measurements from
pieces of mica delivered in one shipment. We have also considered a smaller sample
from the shipment and used it to make inferences about the average and variability of
the entire shipment.
There are definite advantages in subdividing sample data already in hand into
smaller samples, such as breaking the mica sample of n = 200 into k = 40 subsamples
or groups of size ng = 5. Table 1.5 shows the data from Table 1.1 displayed in 40 sets of
five each. The decision to choose five vertically aligned samples is an arbitrary one; there
is no known physical significance to the order of manufacture in this set. Where there is
a known order, either of manufacture or measurement, such an order should be pre-
served in representing the data, as in Figure 1.11.

We have computed two numbers from each of these 40 subsamples: the average X̄
and range R are shown directly below each sample. The range of a sample is simply:

   R = the largest observation minus the smallest

The range is a measure of the variation within each small sample; the average of
the ranges is designated by a bar over the R, that is, R̄, and one reads "R-bar." There is
an amazingly simple and useful relationship (theorem)22 between the average range R̄

22. Acheson J. Duncan, “The Use of Ranges in Comparing Variabilities,” Industrial Quality Control 11, no. 5
(February 1955): 18, 19, 22; E. S. Pearson, “A Further Note on the Distribution of Range in Samples from a
Normal Population,” Biometrika 24 (1932): 404.

Table 1.5 Mica thickness data in subgroups of ng = 5 with their averages and ranges.
Data from Table 1.1
8.0 12.5 12.5 14.0 13.5 12.0 14.0 12.0 10.0 14.5
10.0 10.5 8.0 15.0 9.0 13.0 11.0 10.0 14.0 11.0
12.0 10.5 13.5 11.5 12.0 15.5 14.0 7.5 11.5 11.0
12.0 12.5 15.5 13.5 12.5 17.0 8.0 11.0 11.5 17.0
11.5 9.0 9.5 11.5 12.5 14.0 11.5 13.0 13.0 15.0

X: 10.7 11.0 11.8 13.1 11.9 14.3 11.7 10.7 12.0 13.7
R: 4.0 3.5 7.5 3.5 4.5 5.0 6.0 5.5 4.0 6.0
8.0 13.0 15.0 9.5 12.5 15.0 13.5 12.0 11.0 11.0
11.5 11.5 10.0 12.5 9.0 13.0 11.5 16.0 10.5 9.0
9.5 14.5 10.0 5.0 13.5 7.5 11.0 9.0 10.5 14.0
9.5 13.5 9.0 8.0 12.5 12.0 9.5 10.0 7.5 10.5
10.5 12.5 14.5 13.0 12.5 12.0 13.0 8.5 10.5 10.5

X: 9.8 13.0 11.7 9.6 12.0 11.9 11.7 11.1 10.0 11.0
R: 3.5 3.0 6.0 8.0 4.5 7.5 4.0 7.5 3.5 5.0
13.0 10.0 11.0 8.5 10.5 7.0 10.0 12.0 12.0 10.5
13.5 10.5 10.5 7.5 8.0 12.5 10.5 14.5 12.0 8.0
11.0 8.0 11.5 10.0 8.5 10.5 12.0 10.5 11.0 10.5
14.5 13.0 8.5 11.0 13.5 8.5 11.0 11.0 10.0 12.5
12.0 7.0 8.0 13.5 13.0 6.0 10.0 10.0 12.0 14.5

X: 12.8 9.7 9.9 10.1 10.7 8.9 10.7 11.6 11.4 11.2
R: 3.5 6.0 3.5 6.0 5.5 6.5 2.0 4.5 2.0 6.5
13.0 8.0 10.0 9.0 13.0 15.0 10.0 13.5 11.5 7.5
11.0 7.0 7.5 15.5 13.0 15.5 11.5 10.5 9.5 9.5
10.5 7.0 10.0 12.5 9.5 10.0 10.0 12.0 8.5 10.0
9.5 9.5 12.5 7.0 9.5 12.0 10.0 10.0 8.5 12.0
11.5 11.5 8.0 10.5 14.5 8.5 10.0 12.5 12.5 11.0

X: 11.1 8.6 9.6 10.9 11.9 12.2 10.3 11.7 10.1 10.0
R: 3.5 4.5 5.0 8.5 5.0 7.0 1.5 3.5 4.0 4.5

Figure 1.11 Mica thickness, X̄ and R charts; data in order as in Table 1.5. (Control limits on this
data are shown in Figure 2.5.) [Figure: the 40 subgroup averages (ng = 5) plot about
X̄ = 11.15, and the 40 subgroup ranges plot about R̄ = 4.875.]

Table 1.6 Values of the constant d2.


See also Table A.4.
ng d2
2 1.13
3 1.69
4 2.06
5 2.33
6 2.53
7 2.70

and the standard deviation σ of the process of which these k = 40 groups of ng = 5 are
subsamples. This theorem is very important in industrial applications.
Theorem 4: Consider k small random samples (k > 20, usually) of size ng drawn
from a normally distributed stable process. Compute the ranges for the k sam-
ples and their average R̄. Then the standard deviation (σ) of the stable process
is estimated by

   σ̂ = R̄/d₂     (1.13)

where d₂ is a constant depending upon the subsample size ng. Some frequently
used values of d₂ are given in Table 1.6.
In other words, an estimate of the standard deviation of the process can be obtained
either from the direct calculation of Equation (1.4a), the grouped method of Table 1.3,
or from R̄ in Equation (1.13). For additional discussion, see Sections 2.5 and 4.4.

Example 1.5 Data from Table 1.5


The ranges (ng = 5) have been plotted in Figure 1.11b; the average of the 40 ranges is
R̄ = 4.875. Then, from Equation (1.13) and Table 1.6,

   σ̂ = 4.875/2.33 = 2.092

This estimate of s is somewhat smaller than the value 2.249 obtained by direct
computation. There are several possible reasons why the two estimates of s are not
exactly equal:
1. Theorem 4 is based on the concept of a process whose average is stable; this is
a condition seldom justified in real life. Almost every process, even those that
are stable for most practical purposes, shows gradual trends and abrupt shifts
in average when analyzed carefully by control chart methods.23

23. Chapter 2 considers practical methods of examining data from a process for excessive variation in its average
and variability.
Chapter 1: Variables Data: An Introduction 35

2. The difference is simply due to sampling error in the way the estimates are
computed. Actually, the difference between the first and second estimates is
small when compared to repeat sampling variation from the same stable process.
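The arithmetic of Equation (1.13) is a single division; a minimal sketch in Python (the function and dictionary names are ours), reproducing the computation of Example 1.5:

```python
# d2 constants from Table 1.6, indexed by subgroup size ng
D2 = {2: 1.13, 3: 1.69, 4: 2.06, 5: 2.33, 6: 2.53, 7: 2.70}

def sigma_hat_from_ranges(ranges, ng):
    """Equation (1.13): sigma-hat = R-bar / d2 for subgroups of size ng."""
    r_bar = sum(ranges) / len(ranges)
    return r_bar / D2[ng]

# Mica data: the 40 subgroup ranges in Table 1.5 average R-bar = 4.875.
sigma_hat = 4.875 / D2[5]
print(round(sigma_hat, 3))   # 2.092, versus s = 2.249 by direct computation
```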

1.13 SOME IMPORTANT REMARKS ABOUT THE TWO ESTIMATES
Variation can be measured in terms of overall long-term variation in all the numbers
taken together, ŝ LT, or in terms of an estimate of the short-term variation within sub-
groups of the data, ŝ ST. When the process producing the data is stable ŝ LT ≅ ŝ ST.
However, when the process is unstable, we may find evidence of lack of stability in
observing that ŝ LT > ŝ ST. Thus, the variability ŝ LT of a process over a long time interval
is measured24 by

σ̂ LT = s = √( Σ(X − X̄)² / (n − 1) )

Figure 1.12 portrays a situation typified by machining a hole in the end of a shaft. The
shifting average of the process is represented in the figure by a succession of small curves
at 8 AM, 9 AM, and so forth. The short-term variation of the process is considered to be

[Figure not reproduced: successive sample distributions (n = 5) at 8:00 AM, 9:00 AM,
10:00 AM, . . . , 4:00 PM slide steadily downward; each hourly sample has short-term
spread ŝ ST = R̄/d2, while the accumulated output has long-term spread ŝ LT = s.]

Figure 1.12 Schematic of a tobogganing production process.

24. Davis R. Bothe, Measuring Process Capability (New York: McGraw-Hill, 1997): 431–513.

unchanging. The shift in average may be steady or irregular. Consider successive small
samples, say of n = 5, taken from the process at 60-minute intervals beginning at 8 AM.
The variability ŝ ST of the machining process over a short time interval is measured by

σ̂ ST = R̄/d2 = R̄/2.33

This is a measure of the inherent capability of the process provided it is operating at
a stable average; this stability is possible only if ways can be found to remove those
factors causing evident changes in the process average. (These include such possible fac-
tors as tool wear or slippage of chuck fastenings.)
The shaded area on the right of Figure 1.12 represents the accumulated measure-
ments of individuals from samples of five obtained successively beginning at 8 AM. The

variability of these accumulated sample measurements is not measured by ŝ ST = R̄/d2,
but by the overall standard deviation, ŝ LT = s:

σ̂ LT = s = √( Σ(X − X̄)² / (n − 1) )

This latter is an estimate of the variation in the accumulated total production from
8 AM to 4:30 PM.
In Figure 1.12, the value of ŝ LT appears to be about twice ŝ ST, since the spread of
the accumulated shaded area is about twice that of each of the smaller distributions.
A gradually diminishing average diameter having a normal distribution (with spread
6ŝ ST = 6R̄/d2) at any given time is shown in Figure 1.12. Product accumulated over a
period of time will be much more widely spread, often appearing to be almost normal,
and with spread

6σ̂ LT = 6s = 6√( Σ(X − X̄)² / (n − 1) )

Evidently, 6ŝ LT will be substantially larger than 6ŝ ST.



A comparison of the two estimates from the same set of data, ŝ LT = s and ŝ ST = R/d2,
is frequently helpful in troubleshooting. If they differ substantially, the process average is
suspected of instability.25 The first, ŝ LT, estimates the variability of individuals produced
over the period of sampling. The second, ŝ ST, estimates the within-subgroup variability of
individual observations. If the process is stable, we expect the two estimates to be in
fairly close agreement.
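This troubleshooting comparison is easy to automate. The following sketch is our own illustration, not from the text: it simulates one stable and one drifting ("tobogganing") process, computes ŝ LT and ŝ ST for each by the formulas above, and shows that the two estimates separate only when the average drifts:

```python
import random
import statistics

D2_5 = 2.33  # d2 for subgroups of size ng = 5, from Table 1.6

def lt_st_estimates(subgroups):
    """Return (s_LT, s_ST): the overall sample standard deviation and
    the within-subgroup estimate R-bar/d2 of Equation (1.13)."""
    pooled = [x for grp in subgroups for x in grp]
    s_lt = statistics.stdev(pooled)
    r_bar = statistics.mean(max(g) - min(g) for g in subgroups)
    return s_lt, r_bar / D2_5

random.seed(1)
stable = [[random.gauss(10, 1) for _ in range(5)] for _ in range(40)]
# A tobogganing process as in Figure 1.12: the average drifts downward.
drifting = [[random.gauss(10 - 0.1 * k, 1) for _ in range(5)] for k in range(40)]

for name, data in [("stable", stable), ("drifting", drifting)]:
    s_lt, s_st = lt_st_estimates(data)
    print(f"{name:9s} s_LT = {s_lt:.2f}   s_ST = {s_st:.2f}")
```

For the stable process the two estimates agree closely; for the drifting one, ŝ LT is inflated by the shifting average while ŝ ST is not.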
The following pertinent discussion is from the ASTM Manual on Presentation
of Data:26

25. More discussion of testing for stability is given in Section 2.5.


26. American Society for Testing and Materials, ASTM Manual on Presentation of Data (Philadelphia: ASTM,
1995): 52.

Breaking up data into rational subgroups. One of the essential features of the
control chart method . . . is classifying the observations under consideration
into subgroups or samples, within which the variations may be considered on
engineering grounds to be due to nonassignable chance causes only, but
between which differences may be due to assignable causes whose presence is
suspected or considered possible.
This part of the problem depends on technical knowledge and familiarity
with the conditions under which the material sampled was produced and the
conditions under which the data were taken.
The production person has a problem in deciding what constitutes a reasonable pro-
cedure for obtaining rational subgroups from a process. Experience and knowledge of
the process will suggest certain possible sources that should be kept separate: product
from different machines, operators, or shifts; from different heads or positions on the
same machine; from different molds or cavities in the same mold; from different time
periods. Such problems will be considered throughout this book.

1.14 STEM-AND-LEAF
When a relatively small amount of data is collected, it is sometimes desirable to order it
in a stem-and-leaf pattern to observe the shape of the distribution and to facilitate further
analysis. This technique was developed by John Tukey27 and is particularly useful in
troubleshooting with few observations. A stem-and-leaf diagram is constructed as follows:
1. Find the extremes of the data, drop the rightmost digit, and form a vertical
column of the consecutive values between these extremes. This column is
called the stem.
2. Go through the data and record the rightmost digit of each across from the
appropriate value on the stem to fill out the number. These are called the leaves.
3. If an ordered stem-and-leaf diagram is desired, place the leaves in
numerical order.
Consider the means of the subgroups in Table 1.5 as follows:

10.7 11.0 11.8 13.1 11.9 14.3 11.7 10.7 12.0 13.7
9.8 13.0 11.7 9.6 12.0 11.9 11.7 11.1 10.0 11.0
12.8 9.7 9.9 10.1 10.7 8.9 10.7 11.6 11.4 11.2
11.1 8.6 9.6 10.9 11.9 12.2 10.3 11.7 10.1 10.0

The resulting stem-and-leaf diagram is shown in Figure 1.13. Notice that it has the
“normal” shape and is much tighter in spread than the frequency distribution of the indi-
vidual observations shown in Table 1.3, as predicted by Theorem 1 and Theorem 3.

27. John W. Tukey, Exploratory Data Analysis (Reading, MA: Addison-Wesley, 1977).

8. 96
9. 86796
10. 7701779310
11. 089779710642197
12. 0082
13. 170
14. 3

Figure 1.13 Stem-and-leaf diagram of mica data means.
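The construction is simple enough to automate. A sketch in Python (our illustration; the quartile slashes that appear in Figure 1.14 are omitted):

```python
from collections import defaultdict

def stem_and_leaf(values, ordered=True):
    """Stem-and-leaf table for data with one decimal place: the stem is
    the integer part, the leaf the tenths digit (steps 1-3 in the text)."""
    stems = defaultdict(list)
    for v in values:
        stem, leaf = divmod(round(v * 10), 10)   # 10.7 -> stem 10, leaf 7
        stems[stem].append(leaf)
    return [f"{s:2d}. " + "".join(str(d) for d in
            (sorted(stems[s]) if ordered else stems[s]))
            for s in range(min(stems), max(stems) + 1)]

means = [10.7, 11.0, 11.8, 13.1, 11.9, 14.3, 11.7, 10.7, 12.0, 13.7,
         9.8, 13.0, 11.7, 9.6, 12.0, 11.9, 11.7, 11.1, 10.0, 11.0,
         12.8, 9.7, 9.9, 10.1, 10.7, 8.9, 10.7, 11.6, 11.4, 11.2,
         11.1, 8.6, 9.6, 10.9, 11.9, 12.2, 10.3, 11.7, 10.1, 10.0]

for line in stem_and_leaf(means):
    print(line)
```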

8. 69
9. 66789
10. 001/1377779
11. 001/1246777789/99
12. 0028
13. 017
14. 3

Figure 1.14 Ordered stem-and-leaf diagram of mica data.

An ordered stem-and-leaf diagram of the means from the mica data is shown in
Figure 1.14. The median, quartiles, and extremes are easily obtained from this type of
plot. Note that there are n = 40 observations. The slash (/) shows the position of the
quartiles. Thus we must count through n/2 = 20 observations to get to the median.
The median is the (n + 1)/2 = 20.5th observation and is taken halfway between 11.1
and 11.1, which is, of course, 11.1. Similarly, the lower quartile (1/4 through the
ordered data) is the (n + 1)/4 = 10.25th observation. Thus, it is one quarter of the way
between 10.1 and 10.1, and hence is 10.1. The third quartile (3/4 through the data) is
the (3/4)(n + 1) = 30.75th observation and hence is 3/4 of the way between 11.9 and
11.9, and since 3/4(0) = 0, it is 11.9. (Note that the first decile (1/10 through the ordered
data) is the (n + 1)/10 = 4.1th observation and is 1/10 of the distance between the fourth
and fifth observation, so it is 9.61.)
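The (n + 1)-rule interpolation used in these calculations can be sketched as follows (the function name is ours):

```python
def quantile(data, p):
    """The (n + 1) interpolation rule from the text: the p-th quantile is
    the p*(n + 1)-th ordered observation, interpolating between neighbors."""
    x = sorted(data)
    pos = p * (len(x) + 1)          # e.g. 0.25 * 41 = 10.25
    i = int(pos)                    # 1-based index of the lower neighbor
    return x[i - 1] + (pos - i) * (x[i] - x[i - 1])

means = [10.7, 11.0, 11.8, 13.1, 11.9, 14.3, 11.7, 10.7, 12.0, 13.7,
         9.8, 13.0, 11.7, 9.6, 12.0, 11.9, 11.7, 11.1, 10.0, 11.0,
         12.8, 9.7, 9.9, 10.1, 10.7, 8.9, 10.7, 11.6, 11.4, 11.2,
         11.1, 8.6, 9.6, 10.9, 11.9, 12.2, 10.3, 11.7, 10.1, 10.0]

print(round(quantile(means, 0.50), 2))   # 11.1  (median)
print(round(quantile(means, 0.25), 2))   # 10.1  (lower quartile)
print(round(quantile(means, 0.75), 2))   # 11.9  (third quartile)
print(round(quantile(means, 0.10), 2))   # 9.61  (first decile)
```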

1.15 BOX PLOTS


Another of Tukey’s28 innovations is particularly useful in comparing distributions. It is
known as the box plot. To set up a box plot:
1. Order the data
2. Find the lowest and highest values, X(1) and X(n)
3. Find the median, X̃
28. Tukey, Exploratory Data.



4. Obtain the first and third quartiles, Q1 and Q3
5. Obtain the mean (optional), X̄
The form of the box plot is then as shown in Figure 1.15. It depicts some essential
measures of the distribution in a way that allows comparison of various distributions.
For example, we wish to compare the distribution of the 40 means from Table 1.5 with
the distribution of the first 40 individual values. The sample size is chosen in order to
keep the number of observations equal since the appearance of the box plot is sample-
size dependent. An ordered stem-and-leaf diagram for the 40 individual measurements
is shown in Figure 1.16. A comparison of these distributions is shown in Figure 1.17.


[Figure not reproduced: a box from Q1 to Q3 with the median X̃ marked inside and the
mean X̄ marked above; whiskers extend to X(1) and X(n).]

Figure 1.15 Form of box plot.

 7. 5            X(1) = 7.5
 8. 000          X(40) = 17.0
 9. 005          X̃ = 12.0
10. 005/5        Q1 = 10.5
11. 005555       Q3 = 13.5
12. 000/0055555  X̄ = 11.9
13. 005/55
14. 0000
15. 055
16.
17. 0

Figure 1.16 Ordered stem-and-leaf diagram of 40 individual mica measurements.
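The annotations on Figure 1.16 can be checked directly. The sketch below (our own) reconstructs the 40 individual values from the stems and leaves and recomputes the box-plot statistics with the same (n + 1) quartile rule:

```python
# Reconstruct the 40 individual values from the stems and leaves of Figure 1.16.
values = ([7.5] + [8.0] * 3 + [9.0, 9.0, 9.5] + [10.0, 10.0, 10.5, 10.5]
          + [11.0, 11.0] + [11.5] * 4 + [12.0] * 5 + [12.5] * 5
          + [13.0, 13.0] + [13.5] * 3 + [14.0] * 4
          + [15.0, 15.5, 15.5] + [17.0])

def quantile(data, p):
    """(n + 1)-rule interpolated quantile, as used in Section 1.14."""
    x = sorted(data)
    pos = p * (len(x) + 1)
    i = int(pos)
    return x[i - 1] + (pos - i) * (x[i] - x[i - 1])

x_min, x_max = min(values), max(values)
q1, med, q3 = (quantile(values, p) for p in (0.25, 0.50, 0.75))
mean = sum(values) / len(values)
print(x_min, q1, med, q3, x_max, mean)   # 7.5 10.5 12.0 13.5 17.0 11.9
```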

[Plots not reproduced. Means: X(1) = 8.6, Q1 = 10.1, X̃ = 11.1, Q3 = 11.9, X(40) = 14.3,
with mean 11.152. Individuals: X(1) = 7.5, Q1 = 10.5, X̃ = 12.0, Q3 = 13.5, X(40) = 17.0,
with mean 11.9. Common scale: 7 to 17 thousandths of an inch.]

Figure 1.17 Box plot of mica individuals and means.



As predicted by Theorem 1, the distribution of the means is tighter than that of the
individuals and we also see reasonable symmetry in the distribution of means as pre-
dicted by Theorem 3.
It may also be informative to form a box plot of the entire distribution of mica thick-
nesses as shown in Table 1.3. Here it is desirable to estimate the median and quartiles
from the frequency distribution itself. This is done by locating the class in which the
quartile is to be found and applying the following formulas:

Q1 = L + ((n/4 − c)/f)m = 8.75 + ((200/4 − 32)/18)(1) = 9.75

X̃ = Q2 = L + ((n/2 − c)/f)m = 10.75 + ((200/2 − 90)/29)(1) = 11.09

Q3 = L + ((3n/4 − c)/f)m = 11.75 + ((3(200)/4 − 119)/33)(1) = 12.69

where
L = lower class boundary of class containing quartile
n = total frequencies
c = cumulative frequencies up to, but not including, the quartile class
f = frequency of class containing quartile
m = class width
We then have X(1) = 5.0, X(200) = 17.0, X̃ = 11.09, Q1 = 9.75, Q3 = 12.69, X̄ = 11.16
and our box plot appears as in Figure 1.18. A comparison of Figure 1.18 with Figure
1.17 will show how increasing the sample size of the frequency distribution of individ-
uals changes the box plot by widening the extremes.
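The grouped-quantile computation above can be sketched directly, with L, c, f, and m read from the frequency distribution as in the worked formulas:

```python
def grouped_quantile(L, target, c, f, m):
    """Quantile from a frequency distribution: L + ((target - c)/f) * m,
    where target is n/4, n/2, or 3n/4 and L, c, f, m come from the
    class containing the quantile."""
    return L + (target - c) / f * m

n = 200
q1 = grouped_quantile(L=8.75, target=n / 4, c=32, f=18, m=1)
q2 = grouped_quantile(L=10.75, target=n / 2, c=90, f=29, m=1)
q3 = grouped_quantile(L=11.75, target=3 * n / 4, c=119, f=33, m=1)
print(round(q1, 2), round(q2, 2), round(q3, 2))   # 9.75 11.09 12.69
```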

[Plot not reproduced: X(1) = 5.0, Q1 = 9.75, X̃ = 11.09, Q3 = 12.69, X(200) = 17.0,
X̄ = 11.16, on a scale of 4 to 18 thousandths of an inch.]

Figure 1.18 Box plot of 200 mica measurements.



Finally, it should be noted that, for a normal distribution, the semi-interquartile
range, that is, (Q3 − Q1)/2, is equal to 2/3s. It is possible, then, to get an approximate feel
for the size of the standard deviation by visualizing 1.5 times the average distance from
the median to the quartiles when the box plot appears reasonably symmetric.
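The 2/3 factor is the upper quartile of the standard normal, z0.75 ≈ 0.6745; a quick check with Python's statistics module:

```python
from statistics import NormalDist

z75 = NormalDist().inv_cdf(0.75)     # upper quartile of the standard normal
semi_iqr = (z75 - (-z75)) / 2        # (Q3 - Q1)/2 when sigma = 1
print(round(semi_iqr, 4))            # 0.6745, roughly 2/3 of sigma
print(round(1.5 * semi_iqr, 3))      # 1.012, so 1.5 * semi-IQR ~ sigma
```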
The stem-and-leaf diagram and the box plot are primarily tools of communication.
They help visualize the distribution and are used as vehicles of comparison. Used in
conjunction with other methods, they can be important vehicles for visualization and
understanding. The approach taken here differs slightly from the formal approach to box
plot development as proposed by Tukey.29 Rather, it is based on a more elementary
approach attributed to Chatfield and discussed at some length by Heyes.30

1.16 DOT PLOT


Another popular graphical choice for describing the shape of the distribution of a set of
data is the dot plot, which is also attributed to John Tukey.31 The dot plot is similar to the
stem-and-leaf diagram in that the data are represented according to their value and not
to a cell of values, as done with a histogram. Also, the dot plot is simple to construct
like the stem-and-leaf diagram and tally sheets—as the data are collected or scanned, a
dot is placed over the corresponding number on the horizontal line. A dot plot of the
original 200 mica thickness values and the 40 subgroup means was made in Minitab and
is shown in Figure 1.19.

[Plot not reproduced: dot plots of the mica thicknesses and of the mean mica thicknesses
on a common scale of thickness (measured in 0.001"), roughly 5 to 15.]

Figure 1.19 Dot plot of 200 mica thickness measurements and 40 subgroup means.

29. John W. Tukey, Exploratory Data Analysis (Reading, MA: Addison-Wesley, 1977).
30. Gerald B. Heyes, “The Box Plot,” Quality Progress (December 1985): 13–17.
31. John W. Tukey, Exploratory Data Analysis (Reading, MA: Addison-Wesley, 1977).

1.17 TOLERANCE INTERVALS FOR POPULATIONS

Thus far we have discussed confidence intervals, which place limits on the possible
magnitude of population parameters such as µ and σ. We have seen that two such inter-
vals, with 95 percent confidence, are:

Mean: X̄ − 2σ̂/√n ≤ µ ≤ X̄ + 2σ̂/√n

Standard deviation: σ̂ − 2σ̂/√(2n) ≤ σ ≤ σ̂ + 2σ̂/√(2n)

The parameters µ and σ describe the population sampled in terms of its location
and variation. But what of the spread of the population itself; that is, what can we expect
in terms of the smallest and the largest value in the population sampled? Tolerance
intervals are designed to answer exactly this question.
For a normal distribution, if µ and σ were known, a tolerance interval containing
P = 95 percent of the population with γ = 100 percent confidence would be

µ ± 2σ

where
P = percent of population contained in the interval
γ = 100(1 − α) confidence coefficient (in percent)
Now, if the population parameters are not known, it is possible to construct a simi-
lar interval

X̄ ± Ks

for specified values of P and γ. The confidence coefficient in this case will obviously
be less than 100 percent because we will be using only estimates of the population para-
meters. Values of K are given in Table A.21a for 95 percent confidence.

For the mica data, we have X̄ = 12.09 and s = 2.249 with n = 200. A tolerance inter-
val for 99 percent of the population with 95 percent confidence would then be

X̄ ± Ks
12.09 ± 2.816(2.249)
12.09 ± 6.33
5.76 to 18.42
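A sketch of this computation, with K = 2.816 as read from Table A.21a (per the text; the function name is ours):

```python
def tolerance_interval(x_bar, s, K):
    """Normal-theory tolerance interval x_bar +/- K*s (Section 1.17)."""
    return x_bar - K * s, x_bar + K * s

# Mica data: x-bar = 12.09, s = 2.249, n = 200; K = 2.816 for
# gamma = 95 percent confidence, P = 99 percent of the population.
low, high = tolerance_interval(12.09, 2.249, K=2.816)
print(round(low, 2), "to", round(high, 2))   # 5.76 to 18.42
```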

The population extremes can be evaluated in another way also. Having taken a
sample of 200, what can we say about the population extremes from the extremes of
the sample? The mica sample of 200 shows the smallest observation to be 5.0 and the

largest to be 17.0. But remember, the tolerance interval relates to the population from
which the sample was drawn—not to the sample itself.
The interval32 between the smallest and the largest value in a sample of size n can
be evaluated in terms of P and γ by the relationship

γ/100 = 1 − n(P/100)^(n−1) + (n − 1)(P/100)^n

So we can assert that the interval 5.0 to 17.0 contains 99 percent of the population
with a confidence of

γ/100 = 1 − 200(99/100)^199 + (200 − 1)(99/100)^200
      = 1 − 200(.99)^199 + (199)(.99)^200
      = 1 − 27.067 + 26.662
      = 0.595

as a proportion or about 60 percent. Note that this method of obtaining a confidence


interval is nonparametric and will work for any distribution shape (not just for the nor-
mal distribution). However, this is also why the confidence coefficient is so much lower
than the previous estimate.
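The formula is a one-liner; a sketch (function name ours):

```python
def extremes_confidence(n, P):
    """Confidence (as a proportion) that the sample minimum and maximum of
    n observations bracket at least P percent of a continuous population."""
    p = P / 100
    return 1 - n * p ** (n - 1) + (n - 1) * p ** n

gamma = extremes_confidence(200, 99)
print(round(gamma, 3))   # 0.595, about 60 percent
```

Because the result holds for any continuous distribution, not just the normal, the confidence is necessarily weaker than the normal-theory interval's.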
In discussing process capability, the interval
X̄ ± 3R̄/d2

is often used to characterize the distribution spread. R. S. Bingham33 has provided tables
for the construction of tolerance intervals using the range of the form

X̄ ± K*R̄

The tolerance factors, K*, for 95 percent confidence and subgroups of size five are
given in Table A.21b. We have seen that for the mica data, subgroups of size five, as in
Table 1.5, yield k = 40 ranges with R̄ = 4.875. This gives a γ = 95 percent tolerance inter-
val for P = 99 percent of the population of

12.09 ± 1.226(4.875)

32. G. S. Hahn and W. Q. Meeker, Statistical Intervals (New York: John Wiley and Sons, 1991): 91.
33. R. S. Bingham, Jr., “Tolerance Limits and Process Capability Studies,” Industrial Quality Control 19, no. 1
(July 1962): 36–40.

12.09 ± 5.98
6.11 to 18.07

This interval is tighter than that obtained using s above. This is because R charac-
terizes the short-term (within) variability in the process. Thus, this interval better
describes the population that could be produced if the process were in control, that is,
process capability, while the interval using s described the population represented by
the 200 measurements, including any drift in the process mean.
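The range-based interval follows the same pattern, with K* = 1.226 as quoted from Table A.21b (function name ours):

```python
def range_tolerance_interval(x_bar, r_bar, k_star):
    """Bingham's range-based tolerance interval x_bar +/- K* * R-bar."""
    return x_bar - k_star * r_bar, x_bar + k_star * r_bar

# Mica data: K* = 1.226 (Table A.21b; ng = 5, gamma = 95%, P = 99%).
low, high = range_tolerance_interval(12.09, 4.875, k_star=1.226)
print(round(low, 2), "to", round(high, 2))   # 6.11 to 18.07
```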
It is always important to examine the properties of the measures we use in estimat-
ing process parameters and their effect on our conclusions. The difference between
these tolerance intervals points up this fact.

1.18 A NOTE ON NOTATION


In this chapter and in the remainder of this book, we have used the following notation:
n = sample size
k = number of subgroups or number of points plotted
ng = subgroup size
so
n = kng
and when
k =1
n = ng
Also, in later chapters, and particularly with regard to design of experiments, we
will use
p = number of factors in an experiment
r = number of replicate observations per cell
Note that when the number of observations per subgroup is constant, the treatment
total or the means of c cells is calculated from

ng = cr
observations.
The reader is cautioned that the literature of industrial statistics incorporates a vari-
ety of notations, so that some sources use the symbol n to represent both sample size
and subsample sizes, and tables are indexed accordingly.

Case History 1.1

Solder Joints
A solder joint is a simple thing. In a hearing aid, there are some 85. Many of our every-
day items, a small radio, a telephone switchboard, a kitchen toaster, all are dependent
on solder joints.
When we asked Fritz, the head of a department that assembles hearing-aid chassis,
how many defective solder joints he had, the reply was, “Well, we don’t really know,
but not too many.”
Having no basis for even a wild guess, we suggested one in a hundred. “Well,
maybe,” said Fritz. So we talked to the quality control supervisor.
How does one proceed to improve soldering? There are many answers: “better”
soldering irons or “better” solder, a quality motivation program, or improved instruc-
tions to the foreman and operators. We began a small study by recording the number of
defects found on a sample of just ten hearing aids per day. We recorded the location
of defects by making tally marks on an enlarged diagram of the circuitry. After a few
days, it was evident that there were six or seven positions, of the possible 87, responsi-
ble for the great majority of defects.
Initial data showed about one defect per 100 solder joints—such as cold solder
joints, open joints, shorting contacts. Spacings were very close (not like a big telephone
switchboard or a guided missile, critics argued), and some thought that 1:100 was as
good as could be expected. Besides, solder joints were inspected 100 percent, so why
the concern?
Data were taken and analyzed. Reductions in defects came quickly. One wire at a
soldering position was given a pretinning. Another was given a sleeve to eliminate
possible shorting. Specific instructions to individual operators on their soldering
techniques also helped. A control chart was posted and was a surprisingly good moti-
vational factor.
In three months, the continuing samples of 10 units per day showed a reduction of
soldering defects to about 1:10,000. More important (and surprising to many) was the
marked improvement in quality of the completed hearing aids.
Once again, quality can only be manufactured into the product—not inspected
into it.
Improvements in the hearing-aid assembly required a detailed analysis of individ-
ual operator performance (the individual is indeed important, and each may require spe-
cific help). Group motivation can be helpful, too, in some situations.
Are you investigating the few positions in your operations that account for most of
the defects? Are you then establishing continuing control charts, graphical reports, and
other aspects of a quality feedback system that will help maintain improvements and point
to the beginning of later problems that will surely develop? Getting a quality system
organized is not easy.

What is the quality problem? What is a good way to attack the problem? Production
will have one answer, design may have another, purchasing and testing another, and so
on. But you can be sure of one thing: everyone will tell you in some indirect way, “It
isn’t my fault!” To anticipate this is not cynicism, it is merely a recognition of human
nature, shared by all of us.
Almost everyone would like a magic wand, an overall panacea, applying with
equal effectiveness to all machines, operators, and situations, thereby eliminating the
need for us to give attention to piece-by-piece operation. There are different types of
magic wands:
“Give us better soldering irons, or better solder, or better components, or better
raw materials and equipment. This will solve the problem!” Of course, one or more
such changes may be helpful, but they will not excuse us from the responsibility of
working with specific details of processes within our own control to obtain optimum
performance.
“Give us operators who care,” also known as “if we could only find a way to inter-
est operators on the line in their assignments and get them to pay attention to instruc-
tions!” Of course. But in the hearing aid experience, operators primarily needed to be
well instructed (and re-instructed) in the details of their operations. It was the system of
taking samples of 10 units per day that provided clues as to which operators (or machines)
needed specific types of instructions.
Each of these improvements can be helpful. Indeed, they were helpful in one way
or another in the improvement of hearing aid defects from a rate of 1:100 units to
1:10,000 units. The critical decision, however, was the one to keep records on individ-
ual solder positions in such a way that individual trouble points could be pinpointed and
kept under surveillance.

1.19 SUMMARY

An orderly collection and plotting of a moderate data sample will frequently suggest trou-
ble, and if not solutions, then sources of trouble that warrant further investigation. This
chapter considered a case history of 200 samples taken from a repetitive mass process.
From this sample, methods of calculating various statistical estimates are set forth:
the standard deviation, average (mean), and range. In a discussion of basic probability
distributions, a foundation is laid for comparing what is expected to happen with what
is actually happening.
This chapter introduced concepts about samples from a process or other population
source. It presented methods of estimating the central tendency of the process, µ, and its
inherent process capability, σ. The importance of an estimate computed from a large
sample was compared with that computed from a set of k small samples from the same
process. These concepts are basic in troubleshooting. They are also basic to the methods
of Chapter 2.

1.20 PRACTICE EXERCISES


The following exercises are suggestions that may be used in association with the indi-
cated sets of data. Working with sets of data is helpful in understanding the basic concepts
that have been discussed. If you have real sets of data from your own experience, how-
ever, you are urged to use these same methods with them.
1. Tally sheets can reveal the basic patterns of the data.
a. Make a tally sheet for the data in Table 1.1 using cell width m = 0.5
thousandth (inch).
b. Make a tally sheet for the data in Table 1.1 using cell width m = 2.0
thousandths (inch).

c. Compare the results of X̄ and ŝ in exercise 1.a and exercise 1.b with
the results for m = 1 shown in Table 1.3. Using estimates of X̄ and ŝ
based on Equations (1.3a) and (1.4a), which value of m produces the
most exact answer?
2. The top half of the data in Table 1.1 is also a sample, n = 100, of the mica
splitting process.

a. Compute X and ŝ from Equations (1.3b) and (1.4b) for this top half.

b. Also, compute ŝ = R/d2 from Equation (1. 13) using k = 20 vertical sets
of size ng = 5.
c. Compare ŝ obtained in (b) with that obtained in exercise 1.b and Table 1.3.
d. Also, compute ŝ from Equation (1.13) using horizontal subgroups, ng = 5.
3. The first four columns of Table 1.1 comprise a sample, n = 80, of the mica
splitting process.

a. Compute X and ŝ from Equations (1.3a) and (1.4a).

b. Compute ŝ = R/d2 from Equation (1.13) using k = 20 sets of ng = 4,
grouped horizontally. Are the results of (a) and (b) similar?
4. Prepare a frequency distribution for the “depth of cut” data in Table 1.8.
(We suggest that you make your tally marks in one color for the first 16 rows
of samples and in a contrasting color for the last nine rows. Then note the
contrast in location of the two sets.)

a. Compute X and ŝ from Equations (1.3b) and (1.4b).
b. Compute ŝ ST from Equation (1.13) using ranges from the rows of samples,
ng = 5.
c. Compute ŝ LT from the histogram of the first 16 rows. Note: See Case
History 2.1 for some discussion.

5. Prepare suitable frequency distributions of the sets of data referred to in the
exercises below. Find X̄ and ŝ for each set. (Also, compare ŝ obtained from
the frequency distribution with ŝ = R̄/d2 from a suitable range chart, using any
grouping you choose or may be assigned.)
a. The 77 measurements in Table 1.7 on an electrical characteristic.
b. Consider again the process that produced the data in Table 1.1. If we
assume that the average of the process could be increased to be at the
center of the specifications, what percent would be expected to be under
the LSL and what percent over the USL? Assume no change in s.
6. If the specifications are as listed below, find the expected percentages
nonconforming produced by the process that produced the corresponding
samples:
a. In Table 1.7, below LSL = 14.5 dB and above USL = 17 dB.
b. In Table 1.8, below LSL = 0.159 inches and above USL = 0.160 inches.
7. Prepare a graph on normal-probability paper for all the data in Table 1.7.
a. Is there seeming evidence of more than one principal parent universe?

b. Estimate s from the normal-probability graph. Compare it with ŝ = R/d2.
Do they disagree “substantially,” or are they “in the same ball park?”
8. A design engineer made measurements of a particular electrical characteristic

on an initial sample of hearing aids. Find X and s for these eight
measurements: 1.71, 2.20, 1.58, 1.69, 2.00, 1.59, 1.52, 2.52.
a. Estimate the range within which 99.7 percent of the product will fall.
b. Give a 95 percent confidence interval for the true mean and standard
deviation of the process from which this sample was taken.
c. Are these data normal? Check with a probability plot.

Table 1.7 Electrical characteristics (in decibels) of final assemblies from 11 strips of ceramic:
Case History 15.1.
1 2 3 4 5 6 7 8 9 10 11
16.5 15.7 17.3 16.9 15.5 13.5 16.5 16.5 14.5 16.9 16.5
17.2 17.6 15.8 15.8 16.6 13.5 14.3 16.9 14.9 16.5 16.7
16.6 16.3 16.8 16.9 15.9 16.0 16.9 16.8 15.6 17.1 16.3
15.0 14.6 17.2 16.8 16.5 15.9 14.6 16.1 16.8 15.8 14.0
14.4 14.9 16.2 16.6 16.1 13.7 17.5 16.9 12.9 15.7 14.9
16.5 15.2 16.9 16.0 16.2 15.2 15.5 15.0 16.6 13.0 15.6
15.5 16.1 14.9 16.6 15.7 15.9 16.1 16.1 10.9 15.0 16.8
X̄ = 16.0 15.8 16.4 16.5 16.1 14.8 15.9 16.3 14.6 15.7 15.8
R = 2.8 3.0 2.4 1.1 1.1 2.5 3.2 1.9 5.9 4.1 2.8
Source: Ellis R. Ott, “Variables Control Charts in Production Research,” Industrial Quality Control 6, no. 3
(1949): 30. Reprinted by permission of the editor.

Table 1.8 Air-receiver magnetic assembly: Case History 2.1.


Measurements (depth of cut) in inches on each of five items in a sample
taken at 15-minute intervals during production.
Sample no. (ng = 5)
1 .1600 .1595 .1596 .1597 .1597
2 .1597 .1595 .1595 .1595 .1600
3 .1592 .1597 .1597 .1595 .1602
4 .1595 .1597 .1592 .1592 .1591
5 .1596 .1593 .1596 .1595 .1594
6 .1598 .1605 .1602 .1593 .1595
7 .1597 .1602 .1595 .1590 .1597
8 .1592 .1596 .1596 .1600 .1599
9 .1594 .1597 .1593 .1599 .1595
10 .1595 .1602 .1595 .1589 .1595
11 .1594 .1583 .1596 .1598 .1598
12 .1595 .1597 .1600 .1593 .1594
13 .1597 .1595 .1593 .1594 .1592
14 .1593 .1597 .1599 .1585 .1595
15 .1597 .1591 .1588 .1606 .1591
16 .1591 .1594 .1589 .1596 .1597
17 .1592 .1600 .1598 .1598 .1597
18 .1600 .1605 .1599 .1603 .1593
19 .1599 .1601 .1597 .1596 .1593
20 .1595 .1595 .1606 .1606 .1598
21 .1599 .1597 .1599 .1595 .1610
22 .1596 .1611 .1595 .1597 .1595
23 .1598 .1602 .1594 .1600 .1597
24 .1593 .1606 .1603 .1599 .1600
25 .1593 .1598 .1597 .1601 .1601

9. The ordered stem-and-leaf diagram makes it easy to estimate the quartiles
Q1, Q2, and Q3, whereas the frequency distribution also produces estimates
of the quartiles as shown in this chapter.
a. Make a frequency distribution of the data shown in Figure 1.16 and
compare Q1, Q2, and Q3 (based on the formulas given in Section 1.15)
with those determined by the ordered stem-and-leaf diagram.
b. Make an ordered stem-and-leaf diagram from the data in Table 1.1 and
compare Q1, Q2, and Q3 (based on the ordered stem-and-leaf diagram)
with the estimates shown in the text.
2
Ideas from Time Sequences of Observations

2.1 INTRODUCTION
A gradual change in a critical adjustment or condition in a process is expected to pro-
duce a gradual change in the data pattern. An abrupt change in the process is expected to
produce an abrupt change in the data pattern. We need ways of identifying the presence
and nature of these patterns. The fairly standard practice of examining any regular data
reports simply by looking at them is grossly inadequate. Such reports are far more valu-
able when analyzed by methods discussed in the following sections.
There is no single way for a medical doctor to diagnose the ailment of a patient.
Consideration is given to information from a thermometer, a stethoscope, pulse rates,
chemical and biological analysis, X-rays, MRIs, and many other tests.
Neither is there just one way to obtain or diagnose data from the operation of a
process. Simple processes are often adjusted without reference to any data. But data
from even the simplest process will provide unsuspected information on its behavior.
In order to benefit from data coming either regularly or in a special study from a tem-
peramental process, it is important to follow one important and basic rule:
Plot the data in a time sequence.1
Different general methods are employed to diagnose the behavior of time sequence
data after plotting. Two important ones will be discussed in this chapter.
1. Use of certain run criteria (Section 2.4)
2. Control charts with control limits and various other criteria (Section 2.5)
signaling the presence of assignable causes

1. It is standard practice to scan a data report, then file it, and forget it. We suggest instead that you plot important
data and usually dispose of the report. Or perhaps record the data initially in graphical form.


Example 2.1 A Look At Some Data


In a graduate course, composed primarily of students in statistics but including gradu-
ate students from the natural and social sciences, Dr. Ott’s first assignment for each stu-
dent had been to “obtain a time sequence of k subgroups from some process,” asking
them, if possible, to “choose a process considered to be stable.” A young woman elected
to complete her assignment by measuring times for sand to run through a 3-minute egg
timer in successive tests.2 The time was measured with a stopwatch. The data are shown
in Figure 2.1. Does this set of data appear to represent a stable (random) process?
Also, is it a “3-minute egg timer”?
Some Casual Observations
• The median of the 50 observations is about 2½ seconds less than 3 minutes.
This is a slight bias (inaccuracy) but should not affect the taste quality of
boiled eggs.
• Almost no point is “close” to the median! Half of the points lie about 10
seconds above the median and the other half 5 to 10 seconds below.
• The points are alternately high and low—a perfect “sawtooth” pattern
indicating two causes operating alternately to produce the pattern. This agrees
with the preceding observation.
• There appears to be a steady increase from the eighth to the twentieth point
on each side of the egg timer.

[Figure: run chart of the 50 timings, n = 1; vertical axis shows time in minutes and seconds from 2:45 to 3:10, with the median line drawn.]

Figure 2.1 Measured time for sand to run through a 3-minute egg timer (recorded in order of observation). Courtesy of Mrs. Elaine Amoresano Rose.

2. Mrs. Elaine Amoresano Rose, former graduate student in Applied and Mathematical Statistics, Rutgers
Statistics Center.

• Beginning with the twenty-third observation, there is an abrupt drop, both on the “slow half” and the “fast half.” After the drop, the process operates near the initial level.
Discussion—Egg Timer Data
An egg timer is surely a simple machine; one would hardly predict nonrandomness in
successive trials with it. However, once the peculiar patterns are seen, what are possi-
ble explanations?
The sawtooth. The egg timer had two halves. Elaine recognized this as an obvious
“probable” explanation for the sawtooth pattern; she then made a few more measure-
ments to identify the “fast” and “slow” sides of the timer.
The abrupt shift downward (twenty-third point). There are three possibilities:
1. The egg timer; the sand may be affected by humidity and temperature. Elaine
said she took a break after the twenty-third experiment. Perhaps she laid the
timer in the sun or on a warm stove. A change in heat or humidity of the sand
may be the explanation for the drop at the twenty-fourth experiment.
2. The stop watch used in timing. There is no obvious reason to think that its
performance might have produced the sawtooth pattern, but thought should
be given to the possibility. It does seem possible that a change in the
temperature of the watch might have occurred during the break. Or, was
the watch possibly inaccurate?
3. The operator (observer). Was there an unconscious systematic parallax effect introduced by the operator herself? Or some other operator effect?
Thus when studying any process to determine the cause for peculiarities in the data,
one must consider in general: (1) the manufacturing process, (2) the measuring process,
and (3) the way the data are taken and recorded.
Summary Regarding Egg Timer Data in Figure 2.1
Figure 2.1 shows the presence of two types of nonrandomness, neither of which was
foreseen by the experimenter. Nonrandomness will almost always occur; such occur-
rence is the rule and not the exception.
An egg timer is a very simple system in comparison with the real-life scientific sys-
tems we must learn to diagnose and operate. The unsuspected behaviors of large-scale
scientific systems are much more complicated; yet they can be investigated in the same
manner as the egg timer. Data from the process showed differences: between the two
sides of the egg timer, and a change over time. Sometimes the causes of unusual data
patterns can be identified easily. It is logical here to surmise that a slight difference
exists in the shape of the two sides, Figure 2.2 (a) and (b); often the identification is
more elusive. Knowing that nonrandomness exists is the most important information to
the experimenter. However, this knowledge must be supported by follow-up investiga-
tions to identify the causes of any nonrandomness, which are of practical interest.

[Figure: sketch of an egg timer, panels (a) and (b) showing its two halves.]

Figure 2.2 An egg timer.

Note: This set of data, Figure 2.1, will be discussed further in Section 2.4. The very
powerful graphical display and the “look test” provide the important information. But
more formal methods are usually needed, and two important direct methods of analysis
are discussed later in the chapter. The methods are run analysis and control charts. We
shall first consider two general approaches to process troubleshooting.

2.2 DATA FROM A SCIENTIFIC


OR PRODUCTION PROCESS
It is standard practice to study a scientific process by changing different variables sus-
pected of contributing to the variation in the process. The resulting data are then ana-
lyzed in some fashion to determine whether the changes made in these variables have
had an effect that appears significant either scientifically or economically. But to use
this method you must know what to change beforehand.
A less utilized but important method is to hold constant all variables that are sus-
pected of contributing to variations in the process, and then decide whether the result-
ing pattern of observations actually represents a stable, uniform process—that is,
whether the process is “well-behaved” or whether there is evidence of previously
unknown nonstability (statistical nonrandomness). Different patterns indicate different
causes of nonrandomness and often suggest the type of factors that have influenced the
behavior of the process—even though neither their existence nor identity may have
been suspected. This unsuspected nonrandom behavior occurs frequently, and recogni-
tion of its existence may prompt studies to identify the unknown factors, that is, lead to
scientific discovery.
Whenever a sequence of observations in order of time can be obtained from a
process, an analysis of its data pattern can provide important clues regarding variables
or factors that may be affecting the behavior of the process.

2.3 SIGNALS AND RISKS


Dr. Paul Olmstead3 remarks:
To the extent that we as engineers have been able to associate physical data with
assignable causes, these causes may be classified by the types of physical data
that they produce, namely:
1. Gross error or blunder (shift in an individual)
2. Shift in average or level
3. Shift in spread or variability
4. Gradual change in average or level (trend)
5. A regular pattern of change in level (cycle)
When combinations of two or more assignable causes occur frequently in a process,
they will then produce combinations of data patterns. Learning that something in the
system is affecting it, either to its advantage or disadvantage, is important information.
The province and ability of specialists—engineers, scientists, production experts—are
to find compensating corrections and adjustments. Their know-how results both from
formal training and practical experience. However, experience has shown that they wel-
come suggestions. Certain patterns of data have origins that can be associated with causes;
origins are suggested at times by data patterns.
When analyzing process data, we need criteria that will signal the presence of
important process changes of behavior but that will not signal the presence of minor
process changes. Or, when we are studying the effects of different conditions in a research
and development study, we want criteria that will identify those different conditions
(factors) that may contribute substantially either to potential improvements or to diffi-
culties. If we tried to establish signals that never were in error when indicating the pres-
ence of important changes, then those signals would sometimes fail to signal the presence
of important conditions. It is not possible to have perfection.
The fact is that we must take risks in any scientific study just as in all other aspects
of life. There are two kinds of risks: (1) that we shall institute investigations that are
unwarranted either economically or scientifically; this is called the alpha risk (a risk), the
sin of commission, and (2) that we shall miss important opportunities to investigate; this
is called the beta risk (b risk), the sin of omission.
We aspire to sets of decision criteria that will provide a reasonable compromise
between the a and b risks. A reduction in the a risk will increase the b risk unless com-
pensations of some kind are provided. The risk situation is directly analogous to the per-
son contemplating the acceptance of a new position, of beginning a new business, of
hiring a new employee, or of buying a stock on the stock exchange.

3. Paul S. Olmstead, “How to Detect the Type of an Assignable Cause,” Industrial Quality Control 9, no. 3
(November 1952): 32 and vol. 9, no. 4 (January 1953): 22. Reprinted by permission of the author and editor.

What risks are proper? There is no single answer, of course. When a process is stable,
we want our system of signals to indicate stability; when there is enough change to be of
possible scientific interest, the signaling system should usually indicate the presence
of assignable causes. Statisticians usually establish unduly low risks for a, often a = 0.05
or a = 0.01. They are reluctant to advise engineering, production, or other scientists to
investigate conditions unless they are almost certain of identifying an important factor
or condition. However, the scientist is the one who has to decide the approximate level of
compromise between looking unnecessarily for the presence of assignable causes and
missing opportunities for important improvements. A scientist in research will often
want to pursue possibilities corresponding to appreciably larger values of a and lower
values of b, especially in exploratory studies; and may later want to specify smaller val-
ues of a when publishing the results of important research. Values of a = 0.10 and even
larger are often sensible to accept when making a decision whether to investigate pos-
sible process improvement. In diagnosis for troubleshooting, we expect to make some
unnecessary investigations. It is prudent to investigate many times knowing that we may
fail to identify a cause in the process. Perhaps a = 0.10 or even a = 0.25 is economi-
cally practical. Not even the best professional baseball player bats as high as 0.500 or
0.600. The relationship of these risks to some procedures of data analysis will be dis-
cussed in the following sections. Some methods of runs are considered first. The methods
of control charts are considered in the following chapters.

Some Signals to Observe


When it has been decided to study a process by obtaining data from its performance,
then the data should be plotted in some appropriate form and in the order it is being
gathered. Every set of k subgroups offers two opportunities:
1. To test the hypothesis that the data represent random variation from stable
sources. Was the source of data apparently stable or is there evidence of
nonrandomness?
2. To infer the nature of the source(s) responsible for any nonrandomness
(from the data pattern), that is, to infer previously unsuspected hypotheses.
This is the essence of the scientific method. Two major types of criteria are discussed
in Sections 2.4 and 2.5.

2.4 RUN CRITERIA


Introduction
When someone repeatedly tosses a coin and produces a run of six heads in succession, we realize that something is unusual. It could happen; the probability is (0.5)⁶ = 0.015. We would then usually ask to see both sides of the coin because such a run would be very unlikely from an ordinary coin having a head and a tail. When we then question the coin’s integrity, our risk of being unreasonably suspicious is a = 0.015.

A median line is one with half of the points above and half below. (The probability that a single observation falls above it is Pa = 0.5 and that it falls below is Pb = 0.5, when k is even.) The use of runs is formalized in the following sections; it is a most useful procedure to suggest clues from an analysis of ordered data from a process.
When exactly three consecutive points are above the median, this is a run above the
median of length 3. In Figure 2.3, consecutive runs above and below the median are of
length 3, 1, 1, 1, 2, and 4. The total number of runs is NR = 6. We usually count the num-
ber directly from the figure once the median line has been drawn. Locating the median
when k is larger can be expedited by using a clear, plastic ruler or the edge of a card.
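The run count just described can be sketched in a few lines of Python. The 12-point sequence below is hypothetical, constructed only to reproduce the run pattern described for Figure 2.3 (runs of length 3, 1, 1, 1, 2, and 4):

```python
# Count consecutive runs above and below the median of a time-ordered
# sequence.  Points equal to the median are skipped, a common convention.
from statistics import median

def runs_about_median(data):
    """Return the lengths of successive runs above/below the median."""
    m = median(data)
    signs = [x > m for x in data if x != m]
    lengths = []
    prev = None
    for s in signs:
        if s == prev:
            lengths[-1] += 1          # same side: extend the current run
        else:
            lengths.append(1)         # side changed: start a new run
        prev = s
    return lengths

data = [11, 12, 13, 9, 11, 8, 12, 11, 9, 8, 9, 7]   # illustrative values
lengths = runs_about_median(data)     # [3, 1, 1, 1, 2, 4]
NR = len(lengths)                     # 6 runs in all
```

With k = 12 points, the expected number of runs is (12 + 2)/2 = 7, so a count of 6 raises no suspicion.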

Example 2.2 Ice Cream Fill Weights


The data plotted in Figure 2.4 represent k = 24 averages (gross weights of ice cream) in

order of production. Data represent X̄ values, each computed from ng = 4 observations in Table 2.6.

[Figure: twelve plotted averages with the median line drawn.]

Figure 2.3 Twelve averages showing six runs above and below the median.

[Figure: run chart of the 24 averages, n = 4, k = 24; vertical axis in ounces from 200 to 210, with the median line at 204.12.]

Figure 2.4 Gross average weights of ice cream fill at 10-minute intervals. (Data from Table 2.6.) Courtesy of David Lipman.

The median is halfway between 204.00 and 204.25. The total number of runs above and
below the median is NR = 8. The runs are of length 7, 3, 3, 1, 1, 2, 1, and 6, respectively.
Before presenting a formal direct run analysis, let us consider the first run; it is of
length 7. It is just as improbable for a stable process to produce the first seven obser-
vations on the same side of the median as for a coin to flip seven heads (tails) at the start
of a demonstration. We do not believe that the process represented in Figure 2.4 was
stable over the first four-hour period of manufacture. This is not surprising; it is unlikely
that any industrial process will function at an entirely stable level over a four-hour
period. Whether the magnitude of the changes here is economically important is a dif-
ferent matter. Now let us consider a direct analysis using runs.

Some Interpretations of Runs


Too many runs above and below the median indicate the following possible engineering
reasons:
1. Samples being drawn alternately from two different populations (sources),
resulting in a “sawtooth” effect. These occur fairly frequently in portions
of a set of data. Their explanation is usually found to be two different
sources—analysts, machines, raw materials—that enter the process alternately
or nearly alternately.
2. Three or four different sources that enter the process in a cyclical manner.
Too few runs are quite common. Their explanations include:
1. A general shift in the process average
2. An abrupt shift in the process average
3. A slow cyclical change in averages
Once aware that they exist, sources of variation in a process can usually be deter-
mined (identified) by the engineer, chemist, or production supervisor.

Formal Criteria: Total Number of Runs Around the Median


The total number of runs in the data in Figure 2.4 is NR = 8. How many such runs are
expected in a set of k random points around the median? In answering this question, it
is sufficient to consider even numbers only, that is, numbers of the form k = 2m. The
average expected number is

N̄R = (k + 2)/2 = m + 1    (2.1)

and the standard deviation of the sampling distribution of NR is

σ = √[m(m − 1)/(2m − 1)] = √[(k/2)(k/2 − 1)/(k − 1)] = √[k(k − 2)/4(k − 1)] = (1/2)√[k(k − 2)/(k − 1)] ≅ √k/2    (2.2)

Equation (2.1) is used so frequently that it helps to remember it. An aid to memory
is the following:
1. The minimum possible number of runs is 2.
2. The maximum possible number is k.
3. The expected number is the average of these two. This is an aid to memory,
not a proof.4
Of course, the number of runs actually observed will often be more or less than N̄R = m + 1, but by how much? The answer is obtained easily from Table A.2; it lists one-tailed significantly small critical values and significantly large critical values corresponding to risks a = 0.05 and 0.01.

Example 2.3. Use of Table A.2


In Figure 2.4 with k = 24, the expected number of runs is 0.5(24 + 2) = 13. The number
we observe is only 8. From Table A.2, small critical values of NR corresponding to k =
24 are seen to be 8 and 7 for a = 0.05 and 0.01, respectively.
Thus, the count of 8 runs is less than expected with a 5 percent risk, but not signif-
icantly less with a 1 percent risk. This evidence of a nonstable (nonrandom) process
behavior agrees with the presence of a run of length 7 below the median. We would ordi-
narily investigate the process expecting to identify sources of assignable causes.
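The arithmetic of Example 2.3 can be checked directly from equations (2.1) and (2.2). The normal approximation below is only a sketch; Table A.2 rests on exact tail probabilities, so the two can differ slightly near the critical values:

```python
import math

def expected_runs(k):
    """Average total number of runs about the median, Eq. (2.1)."""
    return (k + 2) / 2

def sigma_runs(k):
    """Standard deviation of the total number of runs, Eq. (2.2)."""
    return 0.5 * math.sqrt(k * (k - 2) / (k - 1))

k, NR = 24, 8                                  # ice cream fill data, Figure 2.4
z = (NR - expected_runs(k)) / sigma_runs(k)    # about -2.1
# Beyond -1.645 (one-tailed a = 0.05) but not beyond -2.33 (a = 0.01),
# in agreement with the Table A.2 critical values of 8 and 7.
```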

Expected Number of Runs of Exactly Length s (Optional)


A set of data may display the expected total number of runs but may have an unusual distribution of long and short runs. Table A.3 has two columns. The first lists the expected number of runs, N̄R,s, of exactly length s; the second lists the expected number of runs, N̄R,≥s, of length greater than or equal to s.
From the second column, for example, it can be seen that when k = 2⁶ = 64, only one run of length 6 or longer is expected. When k = 2⁵ = 32, only one-half a run of length 6 or longer is expected (that is, a run of length 6 or longer is expected about half the time).
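The halving pattern behind these expectations can be sketched as a rough rule of thumb. These approximate formulas are inferred from the examples in the text (k/4 runs of exactly length 1; about one run of length 6 or longer at k = 64); they are not the exact formulae of Table A.3:

```python
def approx_runs_at_least(k, s):
    """Rough expected number of runs of length >= s about the median."""
    return k / 2 ** s

def approx_runs_exactly(k, s):
    """Rough expected number of runs of exactly length s."""
    return k / 2 ** (s + 1)

approx_runs_at_least(64, 6)   # 1.0: one run of length 6+ expected at k = 64
approx_runs_at_least(32, 6)   # 0.5: expected about half the time at k = 32
approx_runs_exactly(24, 1)    # 6.0, versus the exact value 6.8 in Table 2.1
```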

4. Churchill Eisenhart and Freda S. Swed, “Tables for Testing Randomness of Grouping in a Sequence of
Alternatives,” Annals of Mathematical Statistics 14 (1943): 66–87.

Example 2.4
Consider again the ice cream fill data in Figure 2.4, k = 24. The number of runs of
exactly length s = 1, for example, is 6.8 as shown in Table 2.1. The number expected
from the approximation, Table A.3, is 24/4 = 6; the number actually in the data is 3.
Other comparisons of the number expected with the number observed are given in this
table. It indicates two long runs: the first is a run of 7 below average; then a run of 6
high weights at the end. This pattern suggests an increase in filling weight during the
study; it is not clear whether the increase was gradual or abrupt. A χ² test might be performed to check the significance of the lengths of runs exhibited in Table 2.1. The formula for χ² is

χ² = Σ (observed − expected)²/expected

and critical values of χ² are tabulated in most statistics texts. For runs of length 1 and of length ≥ 2, we obtain

χ² = (3 − 6.8)²/6.8 + (5 − 6.2)²/6.2 = 2.12 + 0.23 = 2.35

The critical value at the a = 0.05 level with ν0 = 2 − 1 = 1 df is χ²(0.05) = 3.84, which is greater than χ² = 2.35, so we are unable to assert that nonrandomness exists from this test. Here, the degrees of freedom, ν0, are one less than the number of cells used in the comparison. As a rule, the cell size should be at least five observations. So, in this case, we collapse the data to form two cells to meet this requirement. In the absence of a χ² table, values of χ² with ν0 degrees of freedom may be read directly from Table A.12 by using the entry for F with ν1 = ν0 and ν2 = ∞.

Table 2.1 A comparison of the expected number of runs* and the observed number. Data from Figure 2.4, where k = 24.

        Expected number of runs    Number      Expected number of runs    Number
s       of exactly length s        observed    of length ≥ s              observed
1       6.8                        3           13.0                       8
2       3.4                        1           6.2                        5
3       1.6                        2           2.8                        4
4       0.7                        0           1.2                        2
5       0.3                        0           0.5                        2
6       0.1                        1           0.2                        2
7       0.1                        1           0.1                        1

* Note: Values for expected number of runs in this table have been computed from the exact formulae in Table A.3. This is not usually advisable since the computation is laborious and the approximate values are sufficiently close for most practical purposes.
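The χ² arithmetic of Example 2.4 can be reproduced in a few lines (3.84 is the standard upper 5 percent point of χ² with 1 df):

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic over matched cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Two collapsed cells from Table 2.1: runs of length 1, and length >= 2.
observed = [3, 5]
expected = [6.8, 6.2]
stat = chi_square(observed, expected)   # about 2.35
significant = stat > 3.84               # False: cannot assert nonrandomness
```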

Example 2.5
Consider the sawtooth pattern of the egg timer, Figure 2.1. From Table A.2, the crit-
ical values for the total expected numbers of runs NR above and below the median for
k = 50 are

17 ≤ NR ≤ 34
The observed number is 50. This is much larger than the critical value of 34 corre-
sponding to a = 0.01. We conclude (again) that the data are not random. The pattern pro-
duced by the egg timer is a perfect sawtooth, indicating two alternating sources. The
sources are evidently the two sides of the egg timer.
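The “too many runs” signal of a perfect sawtooth is easy to demonstrate. The two levels below are illustrative stand-ins for the fast and slow sides of the timer, not the recorded data:

```python
# A perfect sawtooth alternates above and below the median, so every
# point starts a new run and NR equals k (the maximum possible).
k = 50
sawtooth = [2.85 if i % 2 == 0 else 2.92 for i in range(k)]  # minutes, decimal
center = (2.85 + 2.92) / 2
signs = [x > center for x in sawtooth]
NR = 1 + sum(a != b for a, b in zip(signs, signs[1:]))   # 50
beyond_critical = NR > 34   # True: exceeds the a = 0.01 critical value
```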

The Longest Run-up or Run-down



In Figure 2.4, there are four successive increases in X̄ beginning with sample 18 and concluding with sample 22. This four-stage increase is preceded and followed by a decrease; it is said to be a run-up of length exactly 4. In counting runs up and down we count the intervals between the points, and not the points themselves.
It is easy to recognize a long run-up or run-down once the data have been plotted.
A long run-up or run-down is typical of a substantial shift in the process average.
Expected values of both extremely long and extremely short lengths in a random dis-
play of k observations are sometimes of value when analyzing a set of data. Here k is
the number of points in the sequence being analyzed. A few one-tailed critical values
have been given in Table 2.2.
It is easy to see and remember that a run-up of length 6 or 7 is quite unusual for sets
of data even as large as k = 200. Even a run of 5 may warrant investigating.

Table 2.2 Critical extreme length of a run-up or a run-down in a random set of k observations
(one-tail).
a = 0.01                a = 0.05
Small Large Small Large
critical critical critical critical
k value value value value
10 0 6 1 5
20 1 6 1 5
40 1 7 1 6
60 1 7 2 6
100 2 7 2 6
200 2 7 2 7
Source: These tabular values were sent by Paul S. Olmstead based on his article: “Distribution of Sample
Arrangements for Runs-up and Runs Down,” Annals of Mathematical Statistics 17 (March 1946): 24–33.
They are reproduced by permission of the author.
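As noted above, run-ups and run-downs count the intervals between successive points. A small counter can be sketched as follows (the short sequence at the end is illustrative):

```python
def longest_run_up_down(data):
    """Length of the longest run-up and run-down, counting intervals
    between successive points (not the points themselves)."""
    best_up = best_down = up = down = 0
    for a, b in zip(data, data[1:]):
        if b > a:
            up, down = up + 1, 0      # interval rises: extend the run-up
        elif b < a:
            up, down = 0, down + 1    # interval falls: extend the run-down
        else:
            up = down = 0             # a tie breaks both runs
        best_up = max(best_up, up)
        best_down = max(best_down, down)
    return best_up, best_down

# Five successive rising points give a run-up of length 4,
# as in samples 18 through 22 of Figure 2.4.
longest_run_up_down([5, 3, 4, 5, 6, 7, 2])   # (4, 1)
```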

Summary—Run Analysis
Important criteria that indicate the presence of assignable causes by unusual runs in a set
of k subgroups of ng each have been described. They are applicable either to sets of data
representing k individual observations (ng = 1) or to k averages (ng > 1); or, to k percents
defective found by inspecting k lots of an item.
The ultimate importance of any criterion in analyzing data is its usefulness in iden-
tifying factors that are important to the behavior of the process. The application of runs
ranks high in this respect.

Run Criteria
1. Total number of runs NR around the median:5 Count the runs.
a. Expected number is N̄R = (k + 2)/2.
b. The fewest and largest numbers of expected runs are given in Table A.2 for certain risks, a.
2. A run above or below the median of length greater than six is evidence of
an assignable cause warranting investigation, even when k is as large as 200.
Count the points.
3. The distribution of runs of length s around the median (See Table A.3).
A set of data may display about the expected total number of runs yet have
too many or too few short runs (or long ones). Count the points.
4. A long run-up or run-down usually indicates a gradual shift in the process
average. A run-up or run-down of length five or six is usually longer than
expected (see Table 2.2). Count the lines between the points. Note that the
run length and the number of runs are quite different.

2.5 SHEWHART CONTROL CHARTS FOR VARIABLES

Introduction
The Shewhart control chart is a well-known, powerful method of checking on the sta-
bility of a process. It was conceived as a device to help production in its routine hour-
by-hour adjustments; its value in this regard is unequaled. It is applicable to quality
characteristics, either of the variable or attribute type. The control chart provides a
graphical time sequence of data from the process itself. This, then, permits the applica-
tion of run analyses to study the historical behavior patterns of a process. Further, the

5. Sometimes we apply the criteria of runs to the average line instead of the median; we do this as tentative criteria.

control chart provides additional signals about the current behavior of the process; the upper and lower control limits (UCL and LCL) are limits to the maximum expected
variation of the process. The mechanics of preparing variables and attributes control
charts will be presented. Their application to troubleshooting is a second reason for
their importance.

Mechanics of Preparing Control Charts (Variables)


The control chart is a method of studying a process from a sequence of small random
samples from the process. The basic idea of the procedure is to collect small samples
of size ng (usually at regular time intervals) from the process being studied. Samples of
size ng = 4 or 5 are usually best. It will sometimes be expedient to use ng = 1, 2, or 3;
sample sizes larger than 6 or 7 are not recommended. A quality characteristic of each
unit of the sample is then measured, and the measurements are usually recorded, but are
always charted.
The importance of rational subgroups must be emphasized when specifying the
source of the ng = 4 or 5 items in a sample. Since our aim is to actually locate the trou-
ble, as well as to determine whether or not it exists, we must break down the data in a
logical fashion. “The man who is successful in dividing his data initially into rational
subgroups based upon rational hypotheses is therefore inherently better off in the long
run than the one who is not thus successful.”6
In starting a control chart, it is necessary to collect some data to provide preliminary information for determining central lines on averages X̄ and ranges R. It is usually
recommended that k = 20 to k = 25 subgroups of ng each be obtained, but k < 20 may
be used initially to avoid delay. (Modifications may be made to adjust for unequal sub-
group sizes.) The formal routine of preparing the control chart once the k data subgroups
have been obtained are as follows:

Step 1: Compute the average X̄ and the range R of each sample. Plot the k points on the X̄ chart and R chart, being sure to preserve the order in which they were produced. (It is very important to write the sample size on every chart and in a regular place, usually in the upper left-hand side as in Figure 2.5.)
Step 2: Compute the two averages, X̿ and R̄; draw them as lines.
Step 3: Compute the following upper (UCL) and lower (LCL) 3-sigma control limits
for the R chart and draw them in as lines

UCL(R) = D4R̄
LCL(R) = D3R̄

6. Walter A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand,
1931): 299.

[Figure (a): X̄ chart, ng = 5, with points grouped as Rows 1 through 4; X̿ = 11.15, UCL = X̿ + A2R̄ = 13.98, LCL = X̿ − A2R̄ = 8.32.]
[Figure (b): R chart, ng = 5; R̄ = 4.875, UCL = D4R̄ = 10.29, LCL = 0.]

Figure 2.5 Control chart of mica thickness data with limits. (Data from Table 1.5.)

Observe whether any ranges fall above D4R̄ or below D3R̄. If not, tentatively accept the concept that the variation of the process is homogeneous, and proceed to Step 4.7 Factors D3 and D4 can be found in Table 2.3 or Table A.4.
Note: The distribution of R is not symmetrical, but rather it is “skewed” with a tail for larger values. This is particularly true for ng < 4. Although we want values of (R̄ + 3σ̂R), values of σ̂R are not obtained simply. The easiest calculation is to use the D4 factors given in Table 2.3, where D4R̄ = R̄ + 3σ̂R.

Step 4: Compute A2R̄, and obtain 3-sigma control limits on X̄:

UCL(X̄) = X̿ + A2R̄ = X̿ + 3σ̂X̄
LCL(X̄) = X̿ – A2R̄ = X̿ – 3σ̂X̄

where σ̂X̄ = σ̂/√ng and σ̂ = R̄/d2. A2 factors are found in Table 2.3 or Table A.4.

7. When the R chart has a single outage, we sometimes do two things: (a) check the sample for a maverick, and (b) exclude the outage subgroup and recompute R̄. Usually this recomputing is not worth the effort.
When the R chart has several outages, the variability of the process is unstable, and it will not be reasonable to compute a σ̂. The process needs attention.
Other examples of treating an R chart with outages are discussed in other sections of this book.

Table 2.3 Factors to use with X̄ and R control charts for variables. Choose ng to be less than seven when feasible; these factors assume sampling from a normal universe; see also Table A.4.
ng D3 D4 A2 d2
2 0 3.27 1.88 1.13
3 0 2.57 1.02 1.69
4 0 2.28 0.73 2.06
5 0 2.11 0.58 2.33
6 0 2.00 0.48 2.53
7 0.08 1.92 0.42 2.70
8 0.14 1.86 0.37 2.85
9 0.18 1.82 0.34 2.97
10 0.22 1.78 0.31 3.08

Step 5: Draw dotted lines corresponding to UCL(X̄) and LCL(X̄).
Step 6: Consider whether there is evidence of assignable causes (see following
discussion). If any point falls outside UCL and LCL, we call this an
“outage,” which indicates the existence of an assignable cause.

Some Discussion
The recommendation to use 3-sigma control limits was made by Dr. Shewhart after
extensive study of data from production processes. It was found that almost every set of
production data having as many as 25 or 30 subsets would show outages. Further, the
nature of the assignable causes signaled by the outages using 3-sigma limits was usu-
ally important and identifiable by process personnel.

Upper and lower 3-sigma control limits on X̄ are lines to judge “excessive” variation of averages of samples of size ng:

X̿ ± 3σ̂X̄ = X̿ ± 3σ̂/√ng = X̿ ± 3R̄/(d2√ng)

However, computation is simplified by using the A2 factor from Table 2.3:

UCL(X̄) = X̿ + A2R̄
LCL(X̄) = X̿ – A2R̄

where

A2 = 3/(d2√ng)
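The relation A2 = 3/(d2 √ng) can be checked against the entries of Table 2.3 (the function name is a stand-in; d2 values are taken from that table):

```python
import math

def a2_factor(ng, d2):
    """A2 = 3 / (d2 * sqrt(ng)); d2 comes from Table 2.3 or Table A.4."""
    return 3 / (d2 * math.sqrt(ng))

a2_factor(5, 2.33)   # about 0.58, matching the Table 2.3 entry for ng = 5
a2_factor(4, 2.06)   # about 0.73
a2_factor(2, 1.13)   # about 1.88
```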

It was also found from experience that it was practical to investigate production
sources signaled by certain run criteria in data. These run criteria are recommended as
adjuncts to outages.

The choice of a reasonable or rational subgroup is important but not always easy to
make in practice. Items produced on the same machine, at about the same time, and with
the same operator will often be a sensible choice—but not always. A machine may have
only one head or several; a mold may have one cavity or several. A decision will have to
be made whether to limit the sample to just one head or cavity or allow all heads or cav-
ities to be included. Initially, the decision may be to include several heads or cavities
and then change to individual heads if large differences are found.
Sample sizes of 4 or 5 are usually best. They are large enough to signal important
changes in a process; they usually are not large enough to signal smaller, less impor-
tant changes. Some discussion of the sensitivity of sample size is given in connection
with operating-characteristic curves.

Example of Control Chart Limits, Mica Thickness Data



In Figure 1.11, charts of X̄ and R points, ng = 5, were made for the mica thickness data in Table 1.5. We assumed there that the range chart represented a stable process; under that assumption, we computed

σ̂ = R̄/d2 = 2.09

We may now use the procedure outlined above to compute 3-sigma control limits
for each chart and to consider different criteria to check on stability of the process.

Control Chart Limits



Step 1: See Table 1.5: an X̄ and R have been computed for each subgroup.
Step 2: X̿ = 11.15 and R̄ = 4.875
Step 3: UCL(R) = D4R̄ = (2.11)(4.875) = 10.29
LCL(R) = D3R̄ = (0)(4.875) = 0
All points fall below UCL(R); see Figure 2.5b.
Step 4: X̿ + A2R̄ = 11.15 + (0.58)(4.875) = 11.15 + 2.83 = 13.98
X̿ – A2R̄ = 11.15 – 2.83 = 8.32
Step 5: See Figure 2.5a; control limits are plotted.
Step 6: Based on criteria below.
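Steps 2 through 4 can be reproduced numerically; only X̿ = 11.15, R̄ = 4.875, and the ng = 5 factors from Table 2.3 are taken from the text, and the variable names are illustrative:

```python
# 3-sigma control limits for subgroups of ng = 5, factors from Table 2.3.
D3, D4, A2 = 0, 2.11, 0.58

xbarbar, rbar = 11.15, 4.875     # grand average and average range, Table 1.5

ucl_r = D4 * rbar                # about 10.29
lcl_r = D3 * rbar                # 0
ucl_x = xbarbar + A2 * rbar      # 11.15 + 2.83 = 13.98 (rounded)
lcl_x = xbarbar - A2 * rbar      # about 8.32
```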

Discussion: R chart (Figure 2.5b). There is no point above (or close to) R̄ + 3σ̂_R. Neither
is there a long run on either side of R̄; there is one run of length 4 below and one of
length 4 above. Runs of length 4 are expected.
Conclusion: The chart suggests no unreasonable process variability; all points are
below the upper control limit.

Discussion: X̄ chart (Figure 2.5a). Point 6 (X̄ = 14.3) is above UCL(X̄); the outage
indicates a difference in the process average. Also, of the eight points 3 to 10, inclusive,
there are seven points above X̿. This run criterion suggests that the average of the first
group of about 10 points is somewhat higher than the average of the entire set of data.
The difference is not large; but it does indicate that something in the manufacturing or
measuring process was not quite stable.
This set of data was chosen initially in order to discuss a process that was much
more stable than ordinary. It is almost impossible to find k = 20 or more data subsets
from an industrial process without an indication of instability. In process improvement
and troubleshooting, these bits of evidence can be important.

Summary: Some Criteria for Statistical Control (Stability)


Routine Production: Criteria for Action. On the production floor, definite and uncom-
plicated signals and procedures to be used by production personnel work best.
Recommended control chart criteria to use as evidence of assignable causes requiring
process adjustment or possible investigations are:
1. One point outside lines at

   X̿ ± A2R̄ = X̿ ± 3ŝ_X̄ (α ≅ 3/1000)

   A process shift of as much as 1σ is not immediately detected by a point falling
   outside 3-sigma limits; the probability is about 1/6 for ng = 4 (see Figure 2.8).
2. Two consecutive points (on the same side) outside either

   X̿ + 2ŝ_X̄ or X̿ – 2ŝ_X̄ (α ≅ 1/800)

3. A run of seven consecutive points above (or below) the process average or
   median (α ≅ 1/64)
The first criterion is the one in ordinary usage; the last two should be used when it
is important not to miss shifts in the average and there is someone to supervise the
process adjustments.
Process Improvement and Troubleshooting. Since we are now anxious to investigate
opportunities to learn more about the process or to adjust it, it is sensible to accept a
greater risk of making investigations or adjustments that may be futile perhaps as often
as 10 percent of the time (allow a risk of a = 0.10). Besides the three criteria just listed,
some or all of the following may be practical.8

1. One point outside X̿ ± 2ŝ_X̄ (α ≅ 0.05)

8. Probabilities associated with runs here and below are based on runs around the median of the data, but are only
slightly different when applied to runs around their mean (arithmetic average).


2. A run of the last five points (consecutive) on the same side of X̿ (α ≅ 0.06)
   Note: The risk associated with any run of the last k points, k > 5, is less than
   for k = 5.

3. Six of the last seven points (6/7) on the same side (α ≅ 0.10).
   Note: Risk for k out of the last (k + 1), k > 7, is less than 0.10.

4. Eight of the last ten points (8/10) on the same side (α ≅ 0.10).

5. The last three points outside X̿ ± ŝ_X̄ (on the same side) (α ≅ 0.01).

6. The last two points outside X̿ ± 1.5ŝ_X̄ (on the same side) (α ≅ 0.01).
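Run criteria like these are mechanical enough to automate. A Python sketch (the function names are mine; points falling exactly on the center line are skipped, a common convention):

```python
def sides(points, center):
    """Classify each point as above (+1) or below (-1) the center line."""
    return [1 if p > center else -1 for p in points if p != center]

def longest_run(points, center):
    """Length of the longest run on one side of the center line."""
    s = sides(points, center)
    if not s:
        return 0
    best = run = 1
    for prev, cur in zip(s, s[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def m_of_last_n(points, center, m, n):
    """True if at least m of the last n points fall on the same side
    (e.g., m=6, n=7 or m=8, n=10 in the criteria above)."""
    s = sides(points[-n:], center)
    return s.count(1) >= m or s.count(-1) >= m

# Illustrative (made-up) sequence of X-bar points about a center line of 10:
pts = [9, 11, 12, 11, 13, 11, 12, 11, 9, 8, 9, 8, 9, 8]
print(longest_run(pts, 10))        # a run of 7 would trigger the run-of-seven criterion
print(m_of_last_n(pts, 10, 6, 7))  # six of the last seven on one side
```

For this made-up sequence the longest run is 7 (points 2 through 8 above the line) and six of the last seven points sit below it, so both criteria would signal.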
There are different types of assignable causes, and some will be signaled by one of
these criteria sooner than by another. Any signal that results in process improvement is
helpful. It provides signals to those with the process know-how to investigate, and grad-
ually allows some of the art of manufacturing to be replaced by science.

Case History 2.1


Depth of Cut
Ellis Ott recounts a typical troubleshooting adventure on the manufacturing floor.
While walking through a department in a hearing aid plant rather early one
morning, I stopped to watch a small assembly operation. A worker was per-
forming a series of operations; a diaphragm would be picked from a pile, placed
as a cover on a small brass piece (see Figure 2.6), and then the assembly placed in
an electronic meter where a reading was observed. If the reading were within

Figure 2.6 Matching a hole in a brass piece with diaphragm assembly. (Labeled parts:
diaphragm, machining operation, cutting blade, chuck, brass piece with hole.)



certain limits, the assembly would be sent on to the next stage in production. If
not, the diaphragm was removed, another tried, and the testing repeated. After
five or six such trials,9 satisfactory mates were usually found.
I had a discussion with the engineer. Why was this selective assembly nec-
essary? The explanation was that the lathe being used to cut the hole was too
old to provide the necessary precision—there was too much variation in the
depth of cut. “However,” the engineer said, “management has now become con-
vinced that a new lathe is a necessity, and one is on order.” This was not an
entirely satisfying explanation. “Could we get 20 or 25 sets of measurements at
15-minute intervals as a special project today?” I asked. “Well, yes.” Plans were
made to have one inspector work this into the day’s assignments. I returned to
look at the data about 4 PM.
The individual measurements, which had been collected, were given in Table 1.8;
they are repeated, with the additional X̄ and R columns, in Table 2.4. (A histogram
displays them in Table 1.4; also see the plot on normal probability paper in Figure 1.7.)

Table 2.4 Data: air-receiver magnetic assembly (depth of cut in mils).
Taken at 15-minute intervals in order of production.

Sample   Measurements                                 X̄       Range R
1 160.0 159.5 159.6 159.7 159.7 159.7 0.5
2 159.7 159.5 159.5 159.5 160.0 159.6 0.5
3 159.2 159.7 159.7 159.5 160.2 159.7 1.0
4 159.5 159.7 159.2 159.2 159.1 159.3 0.6
5 159.6 159.3 159.6 159.5 159.4 159.5 0.3
6 159.8 160.5 160.2 159.3 159.5 159.9 1.2
7 159.7 160.2 159.5 159.0 159.7 159.6 1.2
8 159.2 159.6 159.6 160.0 159.9 159.7 0.8
9 159.4 159.7 159.3 159.9 159.5 159.6 0.6
10 159.5 160.2 159.5 158.9 159.5 159.5 1.3
11 159.4 158.3 159.6 159.8 159.8 159.4 1.5
12 159.5 159.7 160.0 159.3 159.4 159.6 0.7
13 159.7 159.5 159.3 159.4 159.2 159.4 0.5
14 159.3 159.7 159.9 158.5 159.5 159.4 1.4
15 159.7 159.1 158.8 160.6 159.1 159.5 1.8
16 159.1 159.4 158.9 159.6 159.7 159.5 0.8
17 159.2 160.0 159.8 159.8 159.7 159.7 0.8
18 160.0 160.5 159.9 160.3 159.3 160.0 1.2
19 159.9 160.1 159.7 159.6 159.3 159.7 0.8
20 159.5 159.5 160.6 160.6 159.8 159.9 1.1
21 159.9 159.7 159.9 159.5 161.0 160.0 1.5
22 159.6 161.1 159.5 159.7 159.5 159.9 1.6
23 159.8 160.2 159.4 160.0 159.7 159.8 0.8
24 159.3 160.6 160.3 159.9 160.0 160.0 1.3
25 159.3 159.8 159.7 160.1 160.1 159.8 0.8
X̿ = 159.67   R̄ = 0.98

9. This is a fairly typical selective-assembly operation. Such operations are characteristically expensive, although they
may be a necessary temporary evil: (1) they are expensive in operator-assembly and test time, (2) there always comes
a day when acceptable mating parts are impossible to find, but assembly “must be continued,” and (3) they serve as an
excuse for delaying corrective action on the process producing the components.

Figure 2.7 Control chart (historical) of X̄ and R on depth of cut; ng = 5, k = 25.
X̄ chart: X̿ = 159.67, UCL = 160.24, LCL = 159.10. R chart: R̄ = 0.98,
UCL = D4R̄ = 2.07, LCL = 0. (Case History 2.1; data from Table 2.4.)

The steps below relate to the previous numbering in preparing a control chart:
1. The averages and ranges have been plotted in Figure 2.7.
2. The average X̿ = 159.67 mils and the average range R̄ = 0.98 mils have been
computed and lines drawn in Figure 2.7.

3. UCL(R) = D4R̄ = (2.11)(0.98) = 2.07.
   Since all range points fall below 2.07, we proceed to (4).

4. UCL(X̄) = 159.67 + (0.58)(0.98) = 160.24.
   LCL(X̄) = 159.67 – 0.57 = 159.10.
5. See Figure 2.7 for UCL and LCL.
6. Possible evidence of assignable causes:

   a. Runs: Either the run of eight points below X̿ or the following run of nine
      points above X̿ is evidence of a process operating at a level other than X̿.
      See discussion in (7) below.

   b. Are there any points outside X̿ ± 3ŝ_X̄ = X̿ ± A2R̄? No. Any pair of
      consecutive points outside X̿ ± 2ŝ_X̄? No, but three of the last eight points
      are “close.”

   c. Observations have been plotted on cumulative normal probability paper.
      See Figure 1.7 and Section 1.7.

7. The run evidence is conclusive that some fairly abrupt drop in X̄ occurred at
the ninth or tenth point and X̄ increased at the seventeenth or eighteenth point
(from production at about 2 PM). After looking at the data, the supervisor was
asked what had happened at about 2 PM. “Nothing, no change.” “Did you
change the cutting tool?” “No.” “Change inspectors?” “No.”
The supervisor finally looked at the lathe and thought “perhaps” that the chuck gov-
erning the depth of cut might have slipped. Such a slip might well explain the increase
in depth of cut (at the seventeenth or eighteenth sample), but not the smaller values at
the ninth and tenth samples. The supervisor got quite interested in the control chart pro-
cedure and decided to continue the charting while waiting for the new lathe to arrive.
He learned to recognize patterns resulting from: a chip broken out of the cutting tool—
an abrupt drop, the gradual downward effect of tool wear, and effects from changing
stock rod. It was an interesting experience.
Eventually the new lathe arrived, but it was removed a few weeks later and the old
lathe put back in use. “What happened?” The supervisor explained that with the control
chart as a guide, the old lathe was shown to be producing more uniform depth of cut
than they could get from the new one.

2.6 PROBABILITIES ASSOCIATED WITH AN X̄ CONTROL
CHART: OPERATING-CHARACTERISTIC CURVES

Identifying the Presence of Assignable Causes


Troubleshooting is successful when it gives us ideas of when trouble began and what
may be causing it. It is important to distinguish different sources of variability, which
suggest sensible ideas about when and what to investigate. Specialists in data analysis can
learn to cooperate with the engineer or scientist in suggesting the general type of trou-
ble to consider. Data presented in the form of control chart criteria, patterns of runs, or
the presence of outliers in the data will often suggest areas of investigation (hypothe-
ses). The suggested hypotheses will ordinarily evolve from joint discussions between
the scientist and the specialist in data analysis. The objective is to identify the physi-
cal sources producing the unusual data effects and to decide whether the cost of the
cure is economically justified. The role of identifying causes rests principally with
the process specialists.
The role of the Shewhart control chart in signaling production to make standard
adjustments to a process is an important one. It also has the role of signaling opportune
times to investigate the system. The risks of signals occurring just by chance (without

the presence of an assignable cause) are quite small. When a process is stable, the prob-
ability (α) that the following criteria will erroneously signal a shift will also be small.
They are:
1. That a single point will fall outside 3-sigma limits just by chance: about three
   chances in a thousand, that is, α ≅ 0.003.

2. That a single point will fall outside 2-sigma limits just by chance: about one
   time in 20, that is, α ≅ 0.05.

3. That the last two points will both fall outside 2-sigma limits on the same side
   just by chance: about one time in 800,¹⁰ that is, α ≅ 0.001.
Thus, the two-consecutive-points criterion is evidence of a change in the process at
essentially the same probability level as a single point outside 3-sigma limits.
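These chance probabilities follow directly from normal tail areas, and a quick Python check reproduces them (the exact product for two consecutive points is nearer 1/966; 1/800 is the book's rule-of-thumb rounding):

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p3 = 2 * (1 - phi(3))        # one point beyond 3-sigma limits, either side
p2 = 2 * (1 - phi(2))        # one point beyond 2-sigma limits, either side
tail2 = 1 - phi(2)           # one side only (the book's rough 1/40)
p_pair = 2 * tail2 ** 2      # two consecutive points beyond, on the same side

print(round(p3, 4), round(p2, 3), round(1 / p_pair))
```

This prints approximately 0.0027 and 0.046 for the first two criteria, and 966 as the exact "one time in about 800" odds for the third.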


Operating-Characteristic Curves of X̄ Charts
When working to improve a process, our concern is not so much that we shall investi-
gate a process without justification; rather it is that we shall miss a worthwhile oppor-
tunity to discover something important. As a rule of thumb, we may assume that a shift
in process average of one standard deviation (one sigma) is of practical interest in
troubleshooting. Just how sensitive is an X̄ chart in detecting a shift of 1.0σ? Or in
detecting a shift of 1.5σ? Or in detecting a shift of zσ? The operating-characteristic
curves (OC curves) of Figures 2.8 and 2.11 provide some answers.11 They show the
probability of accepting the process as in control, PA, plotted against the size of shift
to be detected. Clearly, the probability of detecting the shift is PD = (100 – PA).
The two OC curves in Figure 2.8 have been computed for averages of ng = 4 and ng = 9;
the criterion for detecting a shift in average of zσ is one point beyond X̿ ± A2R̄ = X̿ ± 3ŝ_X̄.
The abscissa represents the amount of shift in the mean; the probabilities PA of missing
such shifts are shown on the left vertical scale, while the probability of detecting such
shifts, PD, is shown on the right.
Consider first the OC curve for ng = 4:
• A shift of 1.0σ has a small probability of being detected: PD ≅ 16
  or 17 percent.
• A shift of 1.5σ has a 50 percent chance of being detected.
• A shift of 2σ has PD ≅ 85 percent.
• Shifts of more than 3σ are almost certain to be detected.

10. The probability that the first point will be outside is approximately 1/20; the probability that the next point will
then be outside on the same side is essentially 1/40; thus the probability that two consecutive points will be out-
side, on the basis of chance alone, is about (1/20)(1/40) = 1/800.
11. The method of deriving OC curves is outlined below.

Figure 2.8 Comparing sensitivities of two X̄ charts, ng = 4 and ng = 9, with operating-
characteristic curves. Criterion: one point above X̿ + 3ŝ_X̄. PA% is read on the
left scale and PD% on the right, against the distance zσ that the mean shifts.

Consider now the OC curve for ng = 9:

• Except for very small and very large shifts in X̄, samples of ng = 9 are much
  more sensitive than when ng = 4.
• A shift of 1.0σ has a 50 percent chance of being detected.
• A shift of 1.5σ has a 92 or 93 percent chance of detection.
Discussion: Samples of ng = 9 are appreciably more sensitive than samples of ng = 4 in
detecting shifts in average. Every scientist knows this, almost by instinct.
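The OC values quoted above follow from the normality of X̄: after the mean shifts by zσ, the original upper limit lies (3 – z√ng) standard errors from the new mean. A Python sketch (function names are mine):

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pd_percent(z, ng):
    """Percent chance a single X-bar point exceeds the original 3-sigma limit
    after the process mean shifts upward by z process standard deviations."""
    return 100 * (1 - phi(3 - z * math.sqrt(ng)))

for ng in (4, 9):
    print(ng, [round(pd_percent(z, ng)) for z in (0.5, 1.0, 1.5, 2.0)])
```

The printed values match the curves: for ng = 4, a 1.0σ shift is caught about 16 percent of the time and a 1.5σ shift half the time; for ng = 9 those chances rise to 50 and about 93 percent.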
This may suggest the idea that we should use samples of nine rather than the rec-
ommended practice of ng = 4 or 5. Sometimes in nonroutine process improvement pro-
jects, one may elect to do this. Even then, however, we tend to hold to the smaller
samples and take them more frequently. During ordinary production, experience has
shown that assignable causes that produce a point out of 3-sigma limits with samples of
ng = 4 or 5 can usually be identified by an engineer or production supervisor, provided
investigation is begun promptly. If they are not detected on the first sample, then usu-
ally they are on the second or third, or by one of the earlier run criteria of this chapter.
Samples as large as ng = 9 or 10 frequently indicate causes that do not warrant the
time and effort required to investigate them during regular production.

Average Run Length


Another very important measure of the performance of a control chart is the average run
length (ARL). This may be regarded as the average (mean) number of samples before

the control chart produces a signal. The ARL is simply the reciprocal of the
probability of detection, that is

    ARL = 1/PD = 1/(1 – PA)

Note that PA can be read directly from the OC curve. In fact, one can construct an
ARL curve that will display the changes in ARL for various values of the shift in the
process average. This is very useful in determining the frequency of sampling and sample
size along with the control chart. For example, from Figure 2.8 we see that a control
chart with samples of 4 will detect a 1.0σ shift in the mean about 16 percent of the time
(PA = 0.84), whereas samples of 9 will detect a 1.0σ shift 50 percent of the time. For a
1.0σ shift in the mean, the two ARLs are

    Sample of size 4: ARL = 1/0.16 = 6.25

    Sample of size 9: ARL = 1/0.50 = 2.00

If a 1.0σ shift is to be detected, is it worth the increase in sample size from 4 to 9
to cut the time to detection by roughly 2/3? Perhaps it would be better to take samples of
4 at twice the frequency, thus keeping the sampling cost roughly the same. Suppose
samples are taken once per shift (8 hours); the average time to signal (ATS) for a sample
of size 9 is

    ATS = ARL × (time between samples) = 2(8) = 16 hours

while for the ng = 4 chart at twice the frequency

    ATS = 6.25(4) = 25 hours

If a 1.0σ shift is to be detected, it seems, at least on this basis, that using ng = 9 has
some advantage in this case. Recall, however, that for troubleshooting purposes, a sample
of 4 or 5 is preferred, since it will be of such sensitivity that it will pick up assignable
causes that can readily be isolated. A chart to detect a 1.0σ difference is unusual and
may be overly sensitive for troubleshooting purposes.
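The ARL and ATS arithmetic above is easy to reproduce (the text's 6.25 uses the rounded PD = 0.16; the exact normal value gives 6.30):

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def arl(z, ng):
    """Average run length for one point beyond the original 3-sigma limit
    after an upward mean shift of z process standard deviations."""
    return 1.0 / (1 - phi(3 - z * math.sqrt(ng)))

# Average time to signal = ARL x hours between samples, for a 1.0-sigma shift:
ats_9 = arl(1.0, 9) * 8   # ng = 9 sampled once per 8-hour shift
ats_4 = arl(1.0, 4) * 4   # ng = 4 sampled twice as often
print(round(ats_9), round(ats_4))   # 16 and about 25 hours
```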
The calculation of ARL for an ng = 4 control chart is included in Table 2.5. A graph
of the resulting ARL curve is shown in Figure 2.9.


Table 2.5 Computation of OC curve and average run length for Shewhart X̄ control chart with
sample size ng = 4.

Shift in mean   Shift in mean     Distance from mean     Probability of       Probability of    Average
in units of σ   in units of σ_X̄  to control limit       detection            acceptance        run length
z               z√ng              z_X̄ = 3 – z√ng        PD = Pr(z ≥ z_X̄)    PA = 100 – PD     ARL = 100/PD

0.0             0.0                3                     0.135                99.865            740.74
0.5             1.0                2                     2.28                 97.72             43.86
1.0             2.0                1                     15.87                84.13             6.30
1.5             3.0                0                     50.00                50.00             2.00
2.0             4.0               –1                     84.13                15.87             1.18
2.5             5.0               –2                     97.72                2.28              1.02
3.0             6.0               –3                     99.865               0.135             1.00

Figure 2.9 Average run length curve for X̄ chart with ng = 4 (ARL, on a logarithmic
scale from 1 to 1000, plotted against the distance zσ that the mean shifts).

Some Computations Associated with OC Curves12


The following discussion will consider sample means of ng = 4. Figure 2.10 represents
four locations of a production process.
The three figures on the right of Figure 2.10 represent a process with averages
increased by 1.5σ, zσ, and 3σ, respectively. The probabilities PD that a single X̄ point
12. This section may be omitted without seriously affecting the understanding of subsequent sections.

Figure 2.10 Distributions with their associated distributions of averages (ng = 4; σ_X̄ = 0.5σ).
Four positions are shown: (1) the process centered at its average X̿; (2) the average
shifted by 1.5σ = 3σ_X̄, giving PD = 50%; (3) the general case, shifted by zσ = 2zσ_X̄;
(4) the average shifted by 3σ = 6σ_X̄, giving PD = 99.7%. The control limits
X̿ ± 3σ_X̄ = X̿ ± A2R̄ are drawn about the original process average.

will fall above the upper control limit, after the shift in process, are indicated by the
shaded areas. The less shaded areas represent the probability of acceptance of control, PA.
In position 1, the “outer” curve (the wider one) represents the process individual X
values centered at X̿; the “inside” curve portrays the distribution of sample averages (X̄)
of ng = 4, which has just one-half the spread of the process itself (σ_X̄ = σ/√n).

In position 2, the process has shifted 1.5σ = 3σ_X̄, and 50 percent of the shaded
distribution of averages is now above X̿ + A2R̄ (PD = 50 percent).
In position 4, the process has shifted 3σ = 6σ_X̄; “all” the distribution of averages is
above the original control limit. That is, PD ≅ 100 percent.
In the general position 3, the process has shifted zσ = 2zσ_X̄. The distance of the new
process average below the original control limit is (3 – 2z)σ_X̄. The area of the shaded
tail above X̿ + A2R̄ may be obtained from Table A.1. The distribution of averages of
samples even as small as ng = 4 is essentially normal, even when the process distribution
is nonnormal, as discussed in Section 1.8, Theorems 1 and 3.
Using the principles of Figure 2.10, computation of the OC curve and average run
length for control charts with samples of size ng = 4 is summarized in Table 2.5.

Figure 2.11 Operating-characteristic curves of three decision plans associated with an X̄
control chart, ng = 4: (1) one point above the 3σ_X̄ limit; (2) two consecutive
points above (or below) the 2σ_X̄ limit; (3) one point above (or below) the
2σ_X̄ limit. PA% is plotted against the shift up (or down) of the process average.

Some OC Curves Associated with Other Criteria


Figure 2.11 shows the increased sensitivity obtained by using 2-sigma decision limits
rather than 3-sigma limits in troubleshooting projects. Both the one-point and two-point
criteria of plans (2) and (3) are more sensitive to change than plan (1) with 3-sigma limits.
These plans are often useful when looking for ways to improve a process.

Case History 2.2


Excessive Variation in Chemical Concentration
Figure 2.12 shows some measurements (coded) of chemical concentration obtained by
sampling from a continuous production line in a large chemical company; samples were
obtained at hourly intervals over a period of two weeks. During this period, every effort
was made to hold the manufacturing conditions at the same levels; whatever variation
there was in the process was unintentional and was considered inherent in the process.
Although the process average was excellent, the variation being experienced was greater
than could be tolerated in subsequent vital steps of the chemical process. This distribu-
tion of concentration was presented (by the scientists) as conclusive evidence that the
process could not be held to closer variation than 263 to 273.
However, such a picture (histogram) of the process does not necessarily represent the
potential capability of the process. Perhaps unsuspected changes occurred during the two
weeks, caused by factors that could be controlled once their presence was recognized.

Figure 2.12 Accumulated analyses from hourly samples over two weeks’ production: a
frequency histogram of n = 152 individual observations (ng = 1) of chemical
concentration (coded), with target = 268. (Data from Case History 2.2.)

In Figure 2.13 we show the data used to prepare Figure 2.12 as averages of four
consecutive readings, that is, covering a four-hour production period. Each point on the X̄
chart represents the average of four consecutive readings, and each point on the R chart
represents the range of these four readings. This is a record over time. We see conclusive
evidence of certain important changes having taken place during these two weeks—
changes whose existence was not recognized by the very competent chemists who
were guiding the process.

The R chart increases abruptly about May 14. Its earlier average is R̄ ≅ 2.5, and the
later average about 6. Something happened quite abruptly to affect the four-hour
variability of the process.

On the X̄ chart, the median overall is just about 267. The total number of runs above
and below the median is 11, which is convincing evidence of a nonstable process average
(risk less than 0.01; the critical value is 12). Although no more evidence is needed,
one can draw tentative control limits on the X̄ chart and see some further evidence of a
nonstable average. Control limits drawn over the first half of the data (which averages
about 267) are

    X̿ ± A2R̄ ≅ 267 ± (0.73)(2.5)
    UCL ≅ 268.8 and LCL ≅ 265.2

These lines have been drawn in Figure 2.13. Even during this first week, there are
outages above the upper control limit on May 8, 9, and 10 and below the LCL during
the next three or four days. There was an abrupt drop in average on May 11.

Figure 2.13 A control chart (historical) of chemical concentration data taken about once an
hour over a two-week period (sample averages and ranges of four consecutive
analyses); ng = 4, k = 38. X̄ chart: median ≅ 267, UCL ≅ 268.8, LCL ≅ 265.2.
R chart: R̄1 ≅ 2.5 with D4R̄ = 5.7 over the first week; R̄2 ≅ 6 thereafter.


We could compute control chart limits on both the R chart and X chart over the

entire set of data; this would not mean much because of the obvious shifting in R and X.
The process has been affected by several assignable causes. It was agreed by produc-
tion that points should be plotted on the control chart when available (every four hours
or so) and evidence of assignable causes investigated to improve the process.
This particular process history is typical of those in every industry. Samples of size
n = 1 are relatively common in industries where measurements are on finished batches,
samples from continuous processes, complicated items (missiles, electronic test equip-
ment), or monthly sales records, as examples. Grouping the data into subgroups of ng =
3, 4, or 5 for analysis will usually be beneficial. It is usually best if there is a rationale
for the grouping. However, even an arbitrary grouping will often be of supplemental
value to a histogram.
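Forming subgroups from a stream of individual (ng = 1) readings is straightforward to automate; a Python sketch with made-up concentration values:

```python
def subgroup(readings, ng=4):
    """Group consecutive individual readings into subgroups of ng and
    return (average, range) pairs; a trailing partial subgroup is dropped."""
    stats = []
    for i in range(0, len(readings) - ng + 1, ng):
        g = readings[i:i + ng]
        stats.append((sum(g) / ng, max(g) - min(g)))
    return stats

hourly = [267, 268, 266, 267, 270, 265, 268, 269, 264, 266]  # made-up values
print(subgroup(hourly))   # [(267.0, 2), (268.0, 5)]; last two readings dropped
```

The (average, range) pairs can then be plotted as the X̄ and R points of a chart like Figure 2.13.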
It is important to note here that recognition of the existence of important changes in
any process is a necessary prerequisite to a serious study of causes affecting that process.

Case History 2.3


Filling Vanilla Ice Cream Containers
A plant was manufacturing French-style vanilla ice cream. The ice cream was marketed
in 2.5-gallon containers. Specified gross weight tolerances were 200 ± 4 oz. Four con-
tainers were taken from production at 10-minute intervals in a special production study.
The gross weights for k = 24 subgroups of ng = 4 containers are shown in Table 2.6.
What are appropriate ways13 of presenting these observations for analysis?
Computation of Control Limits
The 24 sample averages and ranges are shown in Table 2.6; they have been plotted in
Figure 2.14.
X̿ = 203.95, R̄ = 5.917; also the median, X̃ = 204.12

For ng = 4, UCL(R) = D4R̄ = (2.28)(5.917) = 13.49

Table 2.6 Data: gross weights of ice cream fill in 2.5-gallon containers.
Samples of ng = 4 in order of production in 10-minute intervals
Subgroup
number     Measurements             R       X̄
1 202 201 198 199 4 200.00
2 200 202 212 202 12 204.00
3 202 201 208 201 7 203.00
4 201 200 200 202 2 200.75
5 210 196 200 198 14 201.00
6 202 206 205 203 4 204.00
7 198 196 202 199 6 198.75
8 206 204 204 206 2 205.00
9 206 204 203 204 3 204.25
10 208 214 213 207 7 210.50
11 198 201 199 198 3 199.00
12 204 204 202 206 4 204.00
13 203 204 204 203 1 203.50
14 214 212 206 208 8 210.00
15 192 198 204 198 12 198.00
16 207 208 206 204 4 206.25
17 205 214 215 212 10 211.50
18 204 208 196 196 12 201.00
19 205 204 205 204 1 204.50
20 202 202 208 208 6 205.00
21 204 206 209 202 7 205.25
22 206 206 206 210 4 207.00
23 204 202 204 207 5 204.25
24 206 205 204 202 4 204.25
X̿ = 203.95
Source: Data courtesy of David Lipman, then a graduate student at the Rutgers University Statistics Center.


13. Runs on either side of the median for averages X̄ from this set of data were considered in Section 2.4.

Figure 2.14 A control chart (historical) of filling weights of ice cream containers; ng = 4,
k = 24. X̄ chart: X̿ = 203.95, median X̃ = 204.12, UCL = X̿ + A2R̄ = 208.02,
LCL = X̿ – A2R̄ = 199.88. R chart: R̄′ = 5.57 and UCL = D4R̄′ = 12.70, both
recomputed with point 5 excluded. (Data from Table 2.6.)

The fifth range point, 14, exceeds the UCL. We recommend the exclusion of this
range point; it represents more variation than expected from a stable, controlled process.
Then we recompute the average range and get R̄ = 5.57.

Then UCL(R) = D4R̄ = (2.28)(5.57) = 12.70

When we compare the short-term inherent process variability of 6ŝ_ST = 6(5.57)/2.06
= 16.2 oz with the specification tolerance of ±4 oz = 8 oz, we find that the short-term
process is twice as variable as specifications allow. In other words, the inherent process
variability is not economically adequate even if the process average were stable at
X̿ = 200 oz.
Since the 23 remaining range points fall below this revised UCL(R), we also proceed
to calculate control limits on X̄. For ng = 4

    UCL(X̄) = X̿ + A2R̄ = 203.95 + (0.73)(5.57) = 203.95 + 4.07 = 208.02
    LCL(X̄) = X̿ – A2R̄ = 203.95 – 4.07 = 199.88

See Figure 2.14.
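The exclude-and-recompute arithmetic can be verified from the 24 ranges of Table 2.6 and the stated grand average (small differences from the printed limits reflect the book's intermediate rounding of R̄ to 5.57):

```python
D4, A2 = 2.28, 0.73   # tabled factors for ng = 4
ranges = [4, 12, 7, 2, 14, 4, 6, 2, 3, 7, 3, 4, 1, 8, 12, 4,
          10, 12, 1, 6, 7, 4, 5, 4]      # Table 2.6, in production order
x_dbar = 203.95                           # grand average from Table 2.6

r_bar = sum(ranges) / len(ranges)         # 5.917
ucl_r = D4 * r_bar                        # 13.49; point 5 (R = 14) exceeds it
kept = [r for r in ranges if r <= ucl_r]  # drop the out-of-control range point
r_bar2 = sum(kept) / len(kept)            # about 5.57
ucl_r2 = D4 * r_bar2                      # about 12.7; all 23 remaining fall below
ucl_x = x_dbar + A2 * r_bar2              # about 208.0
lcl_x = x_dbar - A2 * r_bar2              # about 199.9
```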




Evidence from the X̄ Control Chart
Before these data were obtained, it was known that the variability of the filled 2.5-gallon
ice cream containers was more than desired. In every such process where there is exces-
sive variability, there are two possibilities:

1. Excessive average short-term variation represented by 6ŝ_ST = 6R̄/d2 as
   discussed above. Even with an R chart in control and an X̄ chart in control,
   the process just is not constructed to meet the desired specifications.
2. A process average not controlled. The two general methods of this chapter
(runs and control charts) are applicable in considering this question of average
process stability.

a. The control chart for X̄ in Figure 2.14 provides evidence that the process
   average was affected by assignable causes on several occasions.
b. Three separate points above UCL: points 10, 14, 17.
c. Three separate points below LCL: points 7, 11, 15.
d. Consider: (1) the group of four points 7, 8, 9, 10 (over a 30-minute
period); then (2) the group of four points 11, 12, 13, 14 (over 30 minutes);
then (3) the group of three points 15, 16, 17 (over 20 minutes) and; then
(4) the group of five points 18, 19, 20, 21, 22 (over 40 minutes).
The pattern of these sequences suggests that the process average would creep
upward over a 25- to 30-minute period, then was probably adjusted downward; then the
adjustment cycle was repeated for a total of four such cycles. This is a supposition
(hypothesis) worth investigation.
The total number of runs around the median is 8; since the small critical value is 8
for one-sided α = 0.05, this is statistically significant at the 0.10 level of risk.
The long run of 7 below the median at the beginning and the run of 6 at the end are
additional evidence of a nonstable process.
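These run counts are easy to reproduce from the 24 subgroup averages of Table 2.6:

```python
from statistics import median

xbars = [200.00, 204.00, 203.00, 200.75, 201.00, 204.00, 198.75, 205.00,
         204.25, 210.50, 199.00, 204.00, 203.50, 210.00, 198.00, 206.25,
         211.50, 201.00, 204.50, 205.00, 205.25, 207.00, 204.25, 204.25]

med = median(xbars)    # 204.125, the 204.12 quoted in the text
side = [1 if x > med else -1 for x in xbars if x != med]

runs = 1 + sum(1 for a, b in zip(side, side[1:]) if a != b)
opening = next(i for i, s in enumerate(side) if s != side[0])
closing = next(i for i, s in enumerate(reversed(side)) if s != side[-1])
print(runs, opening, closing)   # 8 runs; opening run of 7 below, closing run of 6 above
```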
Summary of Evidence
The process average was quite variable; the 24 samples averaged 2 percent higher than
specified. Even so, there is some danger of underfilled containers.
The inherent process variability is not economically acceptable even if the process
average is stabilized at X̿ = 200 oz.
Evidence from Frequency Distribution Analysis
The individual fill weights are shown in Table 2.7. Several containers were overfilled;
there is some underfill. A frequency distribution does not provide information about
process stability or about process capability.

We can go through the routine of computing X̿ and ŝ_LT as in Table 1.3, although this
cannot be expected to add to the analysis:

Table 2.7 Gross weights of ice cream fill in 2.5-gallon containers.
Individual fill weights grouped in a frequency distribution

Cell interval    f       m         fm          fm²
215–216 1 215.5 215.5 46,440.25
213–214 4 213.5 854.0 182,329.00
211–212 3 211.5 634.5 134,196.75
209–210 3 209.5 628.5 131,670.75
207–208 10 207.5 2,075.0 430,562.50
205–206 17 205.5 3,493.5 717,914.25 USL = 204
203–204 21 203.5 4,273.5 869,657.25
201–202 18 201.5 3,627.0 730,840.50
199–200 7 199.5 1,396.5 278,601.75
197–198 7 197.5 1,382.5 273,043.75
195–196 4 195.5 782.0 152,881.00
193–194 0 193.5 0.0 0.00
191–192 1 191.5 191.5 36,672.50
n= 96 19,554.0 3,984,810.30

X̄ = 19,554/96 = 203.69

σ̂_LT = √[(96(3,984,810.3) − (19,554.0)²)/(96(95))] = √(182,872.8/9,120) = √20.051842 = 4.4779


This is much larger than σ̂_ST = R̄/d₂ = 2.70. This will almost always be the case
when the control chart shows lack of control, as discussed in Section 1.13.
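The grouped-frequency arithmetic above can be sketched in Python. The cell midpoints and frequencies are taken from Table 2.7; the variable names are our own:

```python
# Cell midpoints (m) and frequencies (f) from Table 2.7.
cells = {215.5: 1, 213.5: 4, 211.5: 3, 209.5: 3, 207.5: 10, 205.5: 17,
         203.5: 21, 201.5: 18, 199.5: 7, 197.5: 7, 195.5: 4, 193.5: 0,
         191.5: 1}

n = sum(cells.values())                          # 96 containers
fm = sum(f * m for m, f in cells.items())        # sum of f*m = 19,554.0
fm2 = sum(f * m * m for m, f in cells.items())   # sum of f*m^2

xbar = fm / n                                    # mean fill, 203.69 oz
sigma_lt = ((n * fm2 - fm ** 2) / (n * (n - 1))) ** 0.5   # long-term sd, about 4.48

sigma_st = 5.57 / 2.06   # R-bar/d2 from the control chart, about 2.70
```

The gap between `sigma_lt` and `sigma_st` is the numerical signature of the shifting process average seen on the chart.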
Summary
• The control chart shows that the process was not operating at any fixed level;
a series of process changes occurred during the four-hour study.
• The average overfill of the 96 filled containers was about 2 percent; also,
38/96 = 39.6 percent of the containers were filled in excess of the upper
specification limit (USL) of 204 oz (Table 2.7).
• The inherent process capability is estimated as follows:

  σ̂_ST = R̄/d₂ = 5.57/2.06 = 2.70

and

  6σ̂_ST = 16.20 oz

This means that the inherent variability of the process is about twice what
is specified.
84 Part I: Basics of Interpretation of Data

• Major investigation into the following two questions will be needed to reduce
the process variability to the stated specifications:
1. What are the important assignable causes producing the shifting process
average shown by the X̄ control chart? Once identified, how can
improvements be effected? The control chart can be continued and
watched by production personnel to learn how to control this average;
investigations by such personnel made right at the time an assignable
cause is signaled can usually identify the cause. Identifications are
necessary to develop remedies.

2. What are possible ways of reducing the inherent variability (σ̂ = R̄/d₂)
of the process? Sometimes relationships between recorded adjustments
made in the process and changes in the R chart can be helpful. For
example, there is a suggestion that the process variation increased for
about 40 minutes (points 14 to 18) on the R chart. This suggestion
would usually be disregarded in routine production; however, when
trying to improve the process, it would warrant investigation. A serious
process improvement project will almost surely require a more elaborately
designed study. Such studies are discussed in later chapters.

Case History 2.4


An Adjustment Procedure for Test Equipment

Summary
The procedure of using items of production as standards to compare test-set perfor-
mance indirectly with a primary standard (such as a bridge resistance in electrical test-
ing) is discussed. The procedure is not limited to the electronic example described
below; the control chart of differences is recommended also in analytical chemistry lab-
oratories and in other analytical laboratories.
Introduction
In measuring certain electrical characteristics of a new type of street lamp, it was nec-
essary to have a simple, efficient method of adjusting (calibrating) the test sets that are
in continuous use by production inspectors on the factory floor. No fundamental stan-
dard could be carried from one test set to another. The usual procedure in the industry
had been to attempt comparisons between individual test sets and a standard bridge by
the intermediate use of “standard” lamps. The bridge itself was calibrated by a labori-
ous method; the same technique was not considered practical for the different test sets
located in the factory.

A serious lack of confidence had developed in the reliability of the factory test sets
that were being used. Large quantities were involved, and the situation was serious.
Inaccurate or unstable test sets could approve some nonconforming lamps from the daily
production; also, they could reject other lamps at one inspection that conformed to
specifications on a retest. The manufacturing engineers chided the test-set engineers
for poor engineering practice. Conversely, of course, those responsible for the test
sets said: "It's not our fault." They argued that the lamps were unstable and to blame
for the excessive variations in measurements. It became a matter of honor; neither
group made a serious effort to substantiate its position or to improve the performance
of either the lamps or the test equipment. Large quantities of lamps were being tested
daily, and it became urgent to devise a more effective criterion for making adjustments.
The comparison procedure used previously to adjust the test sets was to select five
“standard” lamps, which had been aged to ensure reasonable stability, and whose read-
ings were within or near specification limits. The wattage of the five standard lamps
was read on the bridge and the readings recorded on form sheets, one for each test set.
The lamps were then taken to each of the floor test sets where the lamps were read again
and the readings recorded on the form sheets. From a series of these form sheets, the
responsible persons attempted to discern signals or patterns to guide the adjustment
and maintenance of the test equipment. Whether the difficulty lay in an ineffective
adjustment procedure for the test equipment or in the instability of the lamps could
not be established.
A Modified Approach
It was generally agreed that there might be appreciable variation in a test set over a
period of a few days and that operating conditions could gradually produce increasing
errors in a set. Also, changes in temperature and humidity were expected to produce fluc-
tuations. It was also known that internal variations within a lamp would produce appre-
ciable variations in wattage at unpredictable times. Consequently, it was decided to use
a control chart with averages and ranges in some way.
After some discussion, the original comparison technique between the bridge and
the test sets with five standard lamps was modified to permit a control chart procedure.
The bridge was accepted as a working plant standard since there were data to indi-
cate that it varied by only a fraction of a percent from day to day.
In the modified procedure, each of five standard lamps was read on the bridge and
then on each floor test set, as before; the readings were recorded on a modification of
the original form sheet (see Table 2.8). The difference

Δᵢ = Sᵢ − Bᵢ

between the set reading (Sᵢ) and the bridge reading (Bᵢ) for each lamp was then
determined, and the average Δ̄ and the range R of these five differences were computed.
Plus and minus signs were used to indicate whether the set read higher or lower than the bridge.
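The computation for one such subgroup can be sketched as follows, using the readings of Table 2.8; the variable names are illustrative:

```python
# Bridge and test-set readings for the five standard lamps (Table 2.8).
bridge = [1820, 2590, 2370, 2030, 1760]
test_set = [1960, 2660, 2360, 1930, 1840]

deltas = [s - b for s, b in zip(test_set, bridge)]  # signed differences S - B
delta_bar = sum(deltas) / len(deltas)               # average difference, +36
delta_range = max(deltas) - min(deltas)             # range of differences, 240
```

With R̄ estimated from many such subgroups, the Δ̄ chart limits become 0 ± A₂R̄ (A₂ = 0.577 for n = 5), as used in Figure 2.15.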
In the initial program, a control chart of these differences was recorded at one test
set; readings of the standard lamps were taken on it every two hours. Within a few days,

Table 2.8 Computations basic to a control chart test-set calibration.

Standard    Reading on    Reading on      Difference
no.         bridge (B)    test set (S)    Δ = S − B
1           1820          1960            +140
2           2590          2660            + 70
3           2370          2360            − 10
4           2030          1930            −100
5           1760          1840            + 80

                                          Δ̄ = + 36
                                          R  =  240
Source: Ellis R. Ott, "An Indirect Calibration of an Electronic Test Set," Industrial
Quality Control (January 1947). Reproduced by consent of the editor.

Figure 2.15 A control chart guide to test-set adjustments (August 23–September 4; ng = 5).
The central line has been set at a desired value of Δ̄ = 0. For the first period (k = 23
subgroups) the Δ̄ chart limits are 0 ± A₂R̄ ≅ ±98.6 with R̄ ≅ 170 and D₄R̄ ≅ 360; after the
voltage adjustment on the test set at the end of August (k = 20), the limits are
0 ± A₂R̄ ≅ ±81.2 with R̄ ≅ 140. Besides the one outage on August 29, there is other
evidence of nonrandomness around Δ̄ = 0 on the chart of averages and on the R chart:
too few runs on each part of the R chart and on the August record on the Δ̄ chart.

control charts of the same type were posted at other test sets. An example of one of
the control charts is shown in Figure 2.15. After a short time, several things became
apparent. Certain test sets were quite variable; it was not possible to keep them in satis-
factory adjustment. However, much improved performance of some was obtained easily
by making minor systematic adjustments with the control charts as guides.

One of the first inquiries was to compare the performance of the test sets under the
control charts' guidance with their previous performance. Data from previous months
were available for making control charts, and a two-week period in June was selected
for a before-and-after comparison. Table 2.9 compares the variabilities of six test
sets in June, in a period in August (a few days after the start of the control charts),
and in a later two-week period in November.

Note: The average Δ̄ indicates the amount of bias (inaccuracy); the desired average
is Δ̄ = 0. Small values of R̄ indicate less variability.
The immediate improvements effected in August are apparent: the average differences
(bias) are reduced on every test set, and the variability R̄ is reduced on several sets. It
was soon discovered that set 7 was in need of a major overhauling.
No allowable limits of variation of the test sets with respect to the bridge had been
established, but it was agreed that it would be a decided improvement to maintain the
average of five lamps within five percent of the bridge readings. Table 2.9 showed that
the variation had been in excess of 10 percent during the first two weeks in June. It was
found during the first weeks of the experiment that the average of five lamps could be
held within four percent of the bridge for most sets, and even closer agreements were
obtained subsequently. Three-sigma control chart limits were projected in advance on
both the Δ̄ and R charts and used as criteria for adjustment. Figure 2.15 shows a test
set averaging about 40 units high during the last week in August; the voltage adjustment
at the end of August centered it nicely.
It was surprising to find such large values of the range in Table 2.9. On the basis of
logic alone, this variability could be attributed to any one of the following
explanations, which revert to the original "lamp versus test set" controversy:
1. The five standard lamps were stable. The variations in readings arose from
the inability of a test set to duplicate its own readings. This assumption now
had some support; R̄ for set 7 in November was significantly larger than for
the other sets.
Causes that produced a shift in the Δ̄ chart or in the R chart of one test set but
not in others were assignable to the test set.

Table 2.9 A performance comparison of six test sets over three time periods.

Set      June 1–15        Aug. 14–30       Nov. 1–15
no.      Δ̄       R̄       Δ̄       R̄       Δ̄       R̄
1       −63      181      −6      72       17      74
3       −86      164      −1      68       −1      68
5       −62      216     −12      61      −13      61
6       −47      202       9      74        7      75
7      −136      138      16      86       17     140
8       −92      186       2      81        3      92
Source: Ellis R. Ott, "An Indirect Calibration of an Electronic Test Set," Industrial
Quality Control (January 1947). Reproduced with the consent of the editor.

2. The test sets were reliable. The variations in readings resulted from
internal variations within the standard lamps. A variation appearing on
all test sets at once would call for the replacement of one of the five
standard lamps by a new one; a reserve pool of standard lamps was kept
for this purpose. A need for replacement was evidenced by an upward
trend on several different R charts. Such a trend could not necessarily
be attributed to the standard lamps, but analysis of recent bridge
readings from the form sheet on individual lamps would show whether a
particular lamp was the assignable cause.
3. A combination of assumptions 1 and 2. It was most convenient to have data
from at least three test sets in order to compare their behavior. The test-set
engineers started control charts from data used in previous calibrations. They
posted control charts at each test set, learned to predict and prevent serious
difficulties as trends developed on the charts, and were able to make
substantial improvements in test-set performance.
The advantages of control charts using differences in any similar indirect
calibration program are essentially the same as the advantages of any control
chart over the scanning of a series of figures.
The control chart of differences is applicable to a wide variety of calibration
techniques in any industry where a sample of the manufacturer’s product can
be used as an intermediary. Reliable data are an important commodity.

2.7 CONTROL CHARTS FOR TRENDS


Some processes may exhibit an expected trend in average performance due to tool wear
or other inherent physical causes, such as loss of pressure as a tank empties or the degra-
dation of a catalyst in a chemical process. In such circumstances, the centerline of a con-
trol chart for averages cannot be projected as a horizontal line, but must be sloping, or
even curved.14
We will illustrate such a chart with respect to tool wear, a common application.
Consider the classic data given by Manuele,15 as read and adapted by Cowden,16 shown
in Table 2.10. The data show part diameter over a 9-hour period. The specifications
are 0.2470 to 0.2500 inches, an allowable spread in tolerances of 30 ten-thousandths.
The data are coded as ten-thousandths above 0.2400 inches.

14. This also applies to the range chart when the process variance is expected to change over time.
15. Joseph Manuele, “Control Chart for Determining Tool Wear,” Industrial Quality Control 1 (May 1945): 7–10.
16. Dudley J. Cowden, Statistical Methods in Quality Control (Englewood Cliffs, NJ: Prentice Hall, 1957): 441.

Table 2.10 Part diameter for nine hours of production.*

Hour, x   Time         Diameter                 ȳ      R
0         5/10—5 AM    79, 78, 78, 77, 75      77.4    4
1         6            79, 78, 78, 77, 76      77.6    3
2         7            82, 81, 80, 80, 79      80.4    3
3         8            83, 82, 81, 81, 80      81.4    3
4         9            85, 85, 84, 84, 83      84.2    2
5         10           86, 86, 85, 85, 84      85.2    2
6         11           88, 87, 87, 86, 85      86.6    3
7         5/11—5 AM    89, 89, 89, 88, 88      88.6    1
8         6            91, 91, 90, 90, 89      90.2    2
9         7            94, 93, 92, 91, 90      92.0    4
                                   X̄ = 4.5    Ȳ = 84.36
* Data, y, coded as ten-thousandths above 0.2400; that is, y = (x − 0.2400)10⁴.

Table 2.11 Diameter of initial sample of fifty successive parts.*

Diameter   Frequency, fᵢ   Midpoint, mᵢ   fᵢmᵢ       fᵢmᵢ²
94–93       3               93.5            280.5     26,226.75
92–91       8               91.5            732.0     66,978.00
90–89      19               89.5          1,700.5    152,194.75
88–87      15               87.5          1,312.5    114,843.75
86–85       5               85.5            427.5     36,551.25
n = Σfᵢ = 50                              4,453.0    396,794.50

ȳ = Σfᵢmᵢ/n = 4,453/50 = 89.06

s = √[(nΣfᵢmᵢ² − (Σfᵢmᵢ)²)/(n(n − 1))] = √[(50(396,794.5) − (4,453)²)/(50(49))]
  = √(10,516/2,450) = √4.2922 = 2.07

* Data, y, coded as ten-thousandths above 0.2400; that is, y = (x − 0.2400)10⁴.

Action Limit
An initial sample of 50 consecutive pieces, shown in Table 2.11, was taken prior to the
data shown in Table 2.10 and yielded a standard deviation s = 2.07 ten-thousandths.
Using a six-sigma spread, we would estimate the natural tolerance of the process to be
(6)(2.1) = 12.6 ten-thousandths, well within the specification spread of 30, assuming a
normal distribution of diameters.
Now, if an action limit is set three standard deviations within the specification
limit, parts outside the specification limit will be less than 1.35 in 1000 when the
process mean is exactly at the action limit, and much less otherwise. Clearly, we can
restrict the exposure even more when operating at the action limit by increasing the

Figure 2.16 Plot of hourly diameter readings shown in Table 2.10, with the
specification limits (LSL = 0.2470, USL = 0.2500 inches) and action limits (LAL, UAL)
drawn; the left axis shows part diameter in inches and the right axis the coded values
(65–105) over hours 0–10.

distance of the action limit from the specification limit. For example, a 4.5 standard
deviation distance would assure less than 3.4 parts per million exposure. Figure 2.16
shows the action limit with the individual observations of diameter and their means
over a 9-hour period.
The action limit used in Figure 2.16 utilizes a 3 standard deviation distance. Thus,
using the initial sample measure of variation, the action limits were calculated as:

Upper action limit: UAL = USL − 3σ̂ = 100 − 3(2.07) = 93.79 coded, or 0.24938 inches

Lower action limit: LAL = LSL + 3σ̂ = 70 + 3(2.07) = 76.21 coded, or 0.24762 inches

where USL is the upper specification limit and LSL the lower specification limit. The
action limits are useful even when process performance cannot be predicted.
As long as the sample averages stay within the action limits, and individual diame-
ters are within the specification limits, the process continues to run. When these condi-
tions are violated, the tool is ground and repositioned.
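The run/stop rule just described can be sketched as a small decision function. The coded limits are those computed above; the function name and return strings are our own:

```python
def tool_action(sample, lsl=70, usl=100, lal=76.21, ual=93.79):
    """Run/stop decision for one hourly subgroup of coded diameters.

    Defaults are the coded specification and action limits from the text.
    """
    mean = sum(sample) / len(sample)
    if not lal <= mean <= ual:
        return "reset tool"          # average drifted past an action limit
    if any(not lsl <= x <= usl for x in sample):
        return "reset tool"          # an individual part is outside specification
    return "continue"

tool_action([79, 78, 78, 77, 75])    # hour 0 of Table 2.10 -> "continue"
```

In practice the check is applied hour by hour; the tool is ground and repositioned as soon as either condition is violated.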

Trend Line
The chart can be further developed in terms of utility and sophistication by applying
sloping control limits around a trend line, the equation of which is

ŷ = a + bx

Periodic averages are plotted on the chart and any assignable causes are indicated
by points outside the sloping control limits. The control limits for sample averages are
set at ±3σ/√n around the trend line to produce limits in the form of control lines

Upper control line: ŷ = (a + 3σ/√n) + bx

Lower control line: ŷ = (a − 3σ/√n) + bx

The trend line may be fit by eye or by other techniques; however it is best estimated
using the least squares method. This method of fitting a line to a set of data is gener-
ally regarded as the most accurate. It can be used even when the observations are not
equally spaced and can provide an estimate of process variation even when single
observations are taken, rather than subgroups.
A least squares estimate is easily calculated. We need estimates of a and b for the
equation, which are obtained as

b = [nΣxᵢyᵢ − (Σxᵢ)(Σyᵢ)] / [nΣxᵢ² − (Σxᵢ)²]

a = [Σyᵢ − bΣxᵢ] / n

For the diameter data, it will be found that

Σxᵢ = 225    Σyᵢ = 4,218    Σxᵢyᵢ = 19,674
Σxᵢ² = 1,425    Σyᵢ² = 357,054    n = 50

giving

b = [50(19,674) − (225)(4,218)] / [50(1,425) − (225)²] = 34,650/20,625 = 1.68

a = [4,218 − 1.68(225)] / 50 = 3,840/50 = 76.8

ŷ = a + bx = 76.8 + 1.68x
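The least squares sums above can be reproduced with a short script. The data are from Table 2.10, with each hourly subgroup contributing five (x, y) pairs; the variable names are our own:

```python
# Hourly subgroups from Table 2.10; each hour x contributes five y values.
subgroups = {
    0: [79, 78, 78, 77, 75], 1: [79, 78, 78, 77, 76],
    2: [82, 81, 80, 80, 79], 3: [83, 82, 81, 81, 80],
    4: [85, 85, 84, 84, 83], 5: [86, 86, 85, 85, 84],
    6: [88, 87, 87, 86, 85], 7: [89, 89, 89, 88, 88],
    8: [91, 91, 90, 90, 89], 9: [94, 93, 92, 91, 90],
}
pairs = [(x, y) for x, ys in subgroups.items() for y in ys]

n = len(pairs)                           # 50
sx = sum(x for x, _ in pairs)            # 225
sy = sum(y for _, y in pairs)            # 4,218
sxy = sum(x * y for x, y in pairs)       # 19,674
sxx = sum(x * x for x, _ in pairs)       # 1,425

b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope, 1.68
a = (sy - b * sx) / n                           # intercept, 76.8
```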

Control Limits
Control limits could be set around this line using the range estimate

σ̂ = R̄/d₂ = 2.7/2.326 = 1.161

Note that this is substantially smaller than that obtained from the initial sample of
n = 50. This is very likely because the sample of 50 included variation introduced by
tool wear over such a long run. Rational subgroups of ng = 5, however, are much less
affected by tool wear. For this reason, it is best to restrict the subgroup size as much
as possible.
The range estimate gives control limits

ŷ ± 3σ̂/√n = ŷ ± A₂R̄ = ŷ ± 0.577(2.7) = ŷ ± 1.56

Such limits can easily be placed around a trend line fit by eye, by the method of
semi-averages, or by any other technique. However, limits can also be set by using the
standard error of estimate, s_y·x, associated with the least squares technique. This
will allow estimation of the short-term (within) variation even when the observations
are taken without subgrouping, that is, ng = 1. It is calculated as

s_y·x = √[(Σyᵢ² − aΣyᵢ − bΣxᵢyᵢ) / (n − 2)]

and for the diameter data

s_y·x = √[(357,054 − 76.8(4,218) − 1.68(19,674)) / (50 − 2)] = √(59.28/48) = √1.235 = 1.11
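The standard error of estimate can be sketched from the worked sums in the text (a = 76.8 and b = 1.68 from the least squares fit above); the variable names are our own:

```python
# Worked sums from the text for the diameter data.
n, a, b = 50, 76.8, 1.68
sum_y2, sum_y, sum_xy = 357_054, 4_218, 19_674

s_yx = ((sum_y2 - a * sum_y - b * sum_xy) / (n - 2)) ** 0.5   # about 1.11

half_width = 3 * s_yx / 5 ** 0.5   # control-limit half width for n_g = 5, about 1.49
```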

Note how close this comes to the range estimate σ̂ = 1.161. Again, this shows the
inflation of the estimate from the initial 50 observations of s = 2.07.
Using s_y·x, the action limits are found to be

Upper action limit: UAL = USL − 3s_y·x = 100 − 3(1.11) = 96.67 coded, or 0.24967 inches

Lower action limit: LAL = LSL + 3s_y·x = 70 + 3(1.11) = 73.33 coded, or 0.24733 inches

The control limits around the line are set at

ŷ ± 3s_y·x/√n = ŷ ± 3(1.11)/√5 = ŷ ± 1.49

with a sloping centerline

ŷ = 76.8 + 1.68x

Forced Intercept
This line and control limits are useful in analyzing data after the fact. They describe the
process that actually went on, and so the centerline is directed through the points plot-
ted. However, in many applications, where the slope is expected to be constant, a stan-
dard chart is utilized to target and control subsequent runs. These charts force the
intercept, a, to be at the desired starting point for the process at time x = 0. The eco-
nomic starting point would normally be at the lower action line to maximize tool life
while assuring that the process is initially targeted far enough from the specification
limit to avoid defective material. The intercept, a, is then taken to be

a = LAL = LSL + 3s_y·x

or at any higher value that may be deemed safe for the startup of the process.
For the diameter data, we could force the intercept to be

a0 = 70 + 3(1.11) = 73.33

The resulting trend line is

yˆ = 73.33 + 1.68 x

and control lines set at ±3s_y·x/√n around it:

Upper control line: ŷ = (73.33 + 1.49) + 1.68x = 74.82 + 1.68x

Lower control line: ŷ = (73.33 − 1.49) + 1.68x = 71.84 + 1.68x

Figure 2.17 A diameter trend chart developed by the least squares method for subsequent
runs using a forced intercept. (Axes as in Figure 2.16: part diameter in inches and
coded values 65–105 over hours 0–10, with USL, UAL, LSL, and LAL shown.)

These lines are plotted in Figure 2.17, and would be used to target and control the
process in subsequent runs.

Estimation of Tool Life


Tool life can be estimated as the time from the start of the process until it is stopped.
Using the adjusted least squares equation, the process is started at time x = 0 and is
expected to stop when the trend line reaches the upper action line, UAL = 96.3.
Substituting 96.3 for y and solving for x,

y = a + bx
96.3 = 73.33 + 1.68x
x = (96.3 − 73.33)/1.68 = 13.67

We would estimate tool life under this method to be approximately 13 hours
40 minutes.

Summary of the Method


The basic procedure for establishing a trend control chart is as follows:
1. Obtain an estimate, σ̂, of the short-run (within) variation.
2. Set action limits at a distance 3σ̂ inside the upper and lower
specification limits.

3. Draw the trend line and control limits around the trend line.
4. Start the process at time x = 0 at the lower action limit, that is, where the trend
line intercepts the LAL.
5. Stop the process and reset when the average exceeds the UAL or an individual
item exceeds the USL.
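The five steps above can be sketched as one routine, using the worked values from this section. The function and key names are our own; note that the text's tool-life figure of 13.67 hours comes from a slightly different rounded UAL of 96.3:

```python
def trend_chart(lsl, usl, s_within, slope, ng):
    """Steps 1-5 of the trend-chart setup in one sketch (coded units)."""
    lal = lsl + 3 * s_within                 # step 2: action limits inside specs
    ual = usl - 3 * s_within
    intercept = lal                          # step 4: start the process at the LAL
    half_width = 3 * s_within / ng ** 0.5    # step 3: limits about the trend line
    # Step 5 in reverse: hours until the sloping centerline reaches the UAL.
    tool_life = (ual - intercept) / slope
    return {"LAL": lal, "UAL": ual, "intercept": intercept,
            "half_width": half_width, "tool_life_hours": tool_life}

chart = trend_chart(lsl=70, usl=100, s_within=1.11, slope=1.68, ng=5)
```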
Trend control charts are a versatile device that expands the usefulness of the control
chart to situations involving predictable movement in the process average. However,
trend control charts are not strictly limited to use in manufacturing. Other applications
include the analysis of business process performance, commerce, economics, and civic,
governmental, and societal areas in general. In fact, anywhere it is clear that the
process, by its very nature, is set up to intentionally generate change, trend control
charts can offer a statistical measure of progress.

Case History 2.5


Rational Subgroups in Filling Vials with Isotonic Solution17

An X̄ and R control chart was constructed to monitor the filling of vials with an isotonic
solution in a pharmaceutical plant. The objective was to establish and maintain control
over the volume filled during production. Every hour, three successive vials were
selected from each of four discharge needles and the fill volume recorded. The case his-
tory is on the CD-ROM included with this book in the file \Selected Case Histories\
Chapter 2\CH 2_5.pdf. The case history includes Tables 2.12 and 2.13, and Figures 2.18
and 2.19.

2.8 DIGIDOT PLOT


Hunter developed a simple enhancement of the stem-and-leaf diagram, which was dis-
cussed in Chapter 1.18 He recognized that the stem-and-leaf diagram alone cannot take
the place of an original data record. In order to reinstate the sequence in which the data
were observed, a dot is placed on a time sequence plot and simultaneously recorded
with its final digit(s) on a stem-and-leaf diagram. In this manner, a complete visual
record of the data is created: a display of the data distribution, a display of the data time
history, and a complete record of the data for later detailed statistical analysis.

17. Roland H. Noel and Martin A. Brumbaugh, “Applications of Statistics to Drug Manufacture,” Industrial Quality
Control 7, no. 2 (September 1950): 7–14.
18. J. Stuart Hunter, “The Digidot Plot,” The American Statistician 42, no. 1, (February 1988): 54.

3 14.
170 13.
0082 12.
08 97 79 71 0642197 11.
77 01 779310 10.
86796 9.
96 8.

1 10 20 30 40

Figure 2.20 Digidot plot for subgroup mean data from Table 1.5.

Using the mica thickness data from Table 1.1, and the subgroup means according to
Table 1.5, we can construct a digidot plot, which replicates the data distribution shape
and the time sequence of the data. On the left side of Figure 2.20 is the unordered, but
upside-down and backwards, stem-and-leaf diagram (compare to Figure 1.13), and on
the right side is the connected-dot time plot (compare to Figure 1.11).
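A minimal sketch of the bookkeeping behind a digidot plot follows; the data values are illustrative, not the mica data:

```python
def digidot(values):
    """Record each value on the time-sequence side and, at the same
    moment, as a leaf on its stem (tens digit). Leaves stay in
    arrival order, as in Hunter's digidot plot."""
    sequence, stems = [], {}
    for v in values:
        sequence.append(v)                        # time-history side of the plot
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(leaf)   # unordered stem-and-leaf side
    return sequence, stems

seq, stems = digidot([108, 97, 112, 110, 96, 103])
# stems -> {10: [8, 3], 9: [7, 6], 11: [2, 0]}
```

Printing `stems` row by row beside the connected-dot plot of `seq` reproduces the two halves of Figure 2.20.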

2.9 PRACTICE EXERCISES


1. Paul Olmstead is quoted as classifying the nature of assignable causes into
five categories. In addition to these five, one might recognize a category
known as “erratic jumps from one level to another for very short periods
of time.” For each of the six categories, identify or suggest a scenario that
could lead to such results in actual practice. For instance, for category 1,
gross error or blunder, we might consider that an inspector misplaces a
decimal point when recording the result of an inspection.
2. The authors mention that there are two types of risks in scientific study, just
as in all other aspects of life. They mention specifically the acceptance of a
new position, beginning a new business, hiring a new employee, or buying
stock on the stock exchange. In addition, we can recognize situations such as
the courtroom trial of an individual who pleads guilty to a crime; the release
of a person from a facility for the criminally insane; or the evaluation of
information provided by an espionage agent. For several such situations,
or others of your own choosing, identify the “null hypothesis” involved, and
what constitutes the alpha risk and beta risk. Explain what actions might be
taken to reduce either or both of these risks in the given situation.

3. Apply the runs criteria of Section 2.4 to the X̄ and R chart in Figure 2.5.
State your conclusions.


4. Use the data in Table 2.4 to plot an X̄ and R chart for samples of size ng = 4.
Do this by excluding the fifth reading in each sample (159.7, 160, and so on).
Compare your results with the authors' results using samples of size 5. Note
that Table 2.4 is coded data. Interpret your results in terms of the original data
(see Table 1.8). In this problem, apply all nine criteria listed in Section 2.5.
5. Extend Figure 2.8 for ng = 2 and ng = 5. Draw sketches comparable to those of
Figure 2.10 to illustrate the probabilities involved. (This must be based on the
applicable formulas.)
6. Redraw Figure 2.8 for ng = 1, ng = 4, ng = 16. Draw the OC curves for these
sample sizes on the same chart.
7. Draw the OC curve for a control chart with ng = 3.
8. For the diameter data of Table 2.10:
a. Compute three standard deviation action limits using the range method
of estimating variability. How do these limits compare to those in the
book? Why?
b. Plot the trend chart using the unadjusted least squares line.
3
Ideas from Outliers—Variables Data

3.1 INTRODUCTION
Since data of questionable pedigree are commonplace in every science, it is important to
have objective signals or clues to identify them. Strong feelings exist among scientists
on the proper handling of suspected observations: one school of thought is to leave them
in; other schools have different ideas. If the outlier1 represents very good or very bad
quality, perhaps it represents evidence that some important, but unrecognized, effect in
process or measurement was operative. Is this a signal that warrants a planned investi-
gation? Is it a typical blunder warranting corrective action? Some excellent articles have
been written on the implications of outliers and methods of testing for them.2
There are important reasons why the troubleshooter may want to uncover the rea-
sons for an outlier:
1. It may be an important signal of unsuspected important factor(s) affecting
the stability of the process or testing procedure.
2. A maverick occurring in a small sample of four or five may have enough
effect to cause X̄ to fall outside one of the control limits on the mean.3
The range will also be affected.
We would not want to make an adjustment to the process average that has
been signaled by a maverick.
3. A maverick (outlier) left in a relatively small collection of data may have a
major effect when making comparisons with other samples.

1. Other terms besides outlier in common usage include maverick and wild-shot.
2. Frank E. Grubbs, “Procedures for Detecting Outlying Observations in Samples,” Technometrics 11, no. 1
(February 1969): 1–21. Also, Frank Proschan, “Testing Suspected Observations,” Industrial Quality Control
(January 1957): 14–19.
3. Any point falling outside control limits will be called an “outage.”


Case History 3.1


A Chemical Analysis—An R Chart As Evidence of Outliers
The percent by chemical analysis of a specific component A in successive batches of a
plastic monomer became important evidence in a patent-infringement dispute. The
numbers in column 1 of Table 3.1 represent the chemical analysis of individual, con-
secutively produced, several-ton batches of monomer. A crucial question was whether

Table 3.1 Record of chemical analyses (column 1) made on consecutive batches of a
chemical compound.
Column 2 shows a "material balance" content calculated for the same batches as column 1.

                 (1)                             (2)
Subset   Chemical        X̄         Range     Material
no.      analysis, %    (ng = 5)   (ng = 5)  balance, %
11       2.76                                4.12
         3.66                                3.69
         3.47                                3.92
         3.02                                4.14
         3.55           3.29       0.90      3.70
12       3.55                                3.74
         3.10                                3.74
         3.28                                3.65
         3.13                                3.63
         3.21           3.25       0.45      3.92
13       3.66                                3.95
         3.40                                3.96
         3.25                                3.95
         3.36                                3.95
         3.59           3.45       0.41      3.76
14       1.32                                4.10
         3.32                                3.71
         2.91                                4.12
         3.81                                3.77
         3.47           2.97       2.49      4.15
15       3.70                                3.79
         3.59                                3.75
         3.85                                3.72
         3.51                                3.83
         4.12           3.75       0.61      3.77
16       4.08                                3.65
         3.66                                3.76
         3.66                                3.65
         3.47                                5.39
         4.49           3.87       1.02      3.96
17       3.85                                3.78
         3.47                                3.58
         3.32                                5.35
         2.94                                4.52
         1.43           3.00       2.42      5.46
18       3.51                                4.15
         3.74                                3.89
         3.51                                3.96
         3.63                                3.63
         3.36           3.55       0.38      5.27

the chemical analyses on certain batches were reliable. Many more analyses than those
shown in Table 3.1 were in evidence.
The recommended sequence of steps taken in any statistical analysis of data begins
with plotting—in this set by plotting X̄ and R charts where subsets were formed from
five consecutive batches.4 Table 3.1 lists only subsets 11 through 18. This analysis will
assume there is no other evidence. In Figure 3.1, we see that the control limits on the X̄
chart include all the X̄ points; the R chart has an outage in subset 14 and another in 17.
What are some possible reasons for these two outages? Is it a general increase in
variability, or is it merely the presence of mavericks?
We decide to retain the two subgroups with outages when computing R̄, although
we then obtain an estimate of σ that is probably too large; this procedure is
conservative and will be followed at least for now.

σ̂ = R̄/d₂ = 1.085/2.33 = 0.47
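The R-chart screening can be sketched in Python, using the subset ranges from Table 3.1 (D₄ = 2.114 and d₂ = 2.326 for ng = 5; the variable names are our own):

```python
# Subset ranges from Table 3.1 (subsets 11-18), n_g = 5.
ranges = {11: 0.90, 12: 0.45, 13: 0.41, 14: 2.49,
          15: 0.61, 16: 1.02, 17: 2.42, 18: 0.38}

r_bar = sum(ranges.values()) / len(ranges)              # 1.085
ucl_r = 2.114 * r_bar                                   # D4 * R-bar, about 2.29
outages = [k for k, r in ranges.items() if r > ucl_r]   # subsets 14 and 17
sigma_hat = r_bar / 2.326                               # R-bar / d2, about 0.47
```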

Figure 3.1 An X̄ and R control chart analysis of data (Table 3.1, column 1). Subsets of ng = 5, σ̂ = R̄/d2 = 0.47. [Chart values: X̄ chart with grand mean X̄ = 3.39 and limits X̄ ± A2R̄ = 2.76 and 4.02; R chart with R̄ = 1.085 and UCL = D4R̄ = 2.30; subsets 11–18.]

4. It is important to use a control chart analysis, but it is not important here whether to choose subsets of five or of four. The maintenance of X̄ and R charts as a part of the production system would probably have prevented the difficulties that arose, provided the evidence from the charts had been utilized.
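The chart computations above can be reproduced directly from the subgroup means and ranges listed in Table 3.1. A minimal sketch in Python (the factors A2 = 0.577, D4 = 2.114, and d2 = 2.326 for ng = 5 are the standard control-chart constants of Table A.4; small rounding differences from the printed limits are expected):

```python
# Subgroup means and ranges for subsets 11-18 (Table 3.1, column 1)
means = [3.29, 3.25, 3.45, 2.97, 3.75, 3.87, 3.00, 3.55]
ranges = [0.90, 0.45, 0.41, 2.49, 0.61, 1.02, 2.42, 0.38]

A2, D4, d2 = 0.577, 2.114, 2.326  # control-chart factors for ng = 5 (Table A.4)

x_bar_bar = sum(means) / len(means)   # grand mean, about 3.39
r_bar = sum(ranges) / len(ranges)     # average range, 1.085

ucl_x = x_bar_bar + A2 * r_bar        # about 4.02
lcl_x = x_bar_bar - A2 * r_bar        # about 2.76
ucl_r = D4 * r_bar                    # about 2.30
sigma_hat = r_bar / d2                # about 0.47

# Subsets whose range exceeds the R-chart UCL (the two "outages")
out_of_control = [i + 11 for i, r in enumerate(ranges) if r > ucl_r]
print(out_of_control)  # [14, 17]
```

All X̄ points fall inside the X̄ limits, while subsets 14 and 17 are flagged on the R chart, matching Figure 3.1.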
102 Part I: Basics of Interpretation of Data

Figure 3.2 Individual batch analyses showing two outages. [Chart: individual chemical analyses (ng = 1) plotted by subgroup number against X̄ = 3.39, with limits X̄ ± 3σ̂ = 2.02 and 4.84; σ̂ = 0.47 from Figure 3.1.]

Discussion
The 3-sigma control limits on individual batches have been drawn using this conserva-
tively large estimate (see Figure 3.2). We see that one batch analysis in subset 14 is
below the lower 3-sigma limit; also, one low batch analysis is seen in subset 17. We
must conclude that the analyses on these two batches are significantly different from
those of their neighbors. There are two possible explanations for the two outages (based
on logic alone):
1. The chemical content of these two batches is indeed significantly lower
than the others (and dangerously close to a critical specification).
2. There was an error (blunder) either in the chemical analysis or in the
recording of it.
It is not possible to determine which of the two possibilities is completely respon-
sible. The preceding analysis does present evidence of important errors.
• At the time that these batches were produced and analyzed, chemical
procedures were available that might have established conclusively which
was the actual source of error. Such methods include reruns of the chemical
analysis, visual and physical tests on other batch properties, checks on the
manufacturing log sheets in the material balance calculation, and discussions
with plant personnel.
• The column 2 figures in Table 3.1 were obtained on the basis of batch entries
in the process logbook. The material balance, column 2, is computed for each
batch on the assumption that the amounts of ingredients shown in the logbook
are actually correct. This assumption is critical. Whenever there is a substantial
discrepancy between columns 1 and 2 on a batch, it is advisable to make
immediate and careful checks on the reasons for the discrepancy.

Table 3.2 Estimating σ̂ from a moving range.

   i     Xi       MR
   1    0.110
   2    0.070    0.040
   3    0.110    0.040
   4    0.105    0.005
   5    0.100    0.005
   6    0.115    0.015
   7    0.100    0.015
   8    0.105    0.005
   9    0.105    0.000
  10    0.098    0.007
  11    0.110    0.012

  MR̄ = 0.0144
  D4(MR̄) = (3.27)(0.0144) = 0.047
  σ̂ = MR̄/d2 = 0.0144/1.13 = 0.0127

3.2 OTHER OBJECTIVE TESTS FOR OUTLIERS


In Case History 3.1, eight different subgroups, ng = 5, were analyzed. The R chart in
Figure 3.1 showed two outages; individual batches responsible for the outages were
easily identified. This R chart identification of outliers is applicable when the amount
of data is large.
When we have only a few observations, other criteria to test for possible outliers
can be helpful. Two such criteria are the following:
1. An MR chart of moving ranges as a test for outliers. In studying a process,
observations usually should be recorded in the order of production. Data
from Table 3.2 have been plotted in Figure 3.3a—these are individual
observations. There seems to be no obvious grouping at different levels,
no obvious shift in average, nothing unusual when the number of runs is
counted, no cycle; but the second observation is somewhat apart from the
others. Possibly an outlier?
A moving range (MR), ng = 2, is the positive difference between two consecutive observations. Moving ranges, |Xi+1 – Xi|, ng = 2, behave like ordinary ranges, n = 2. In Table 3.2, the moving ranges have been written in the column adjacent to the original Xi observations. Then

MR̄ = Σ(MRi)/10 = 0.0144


Figure 3.3 A chart check for an outlier. (Data from Table 3.2.) [Chart values: (a) X chart of individuals (ng = 1), X̄ = 0.102, with limits X̄ ± 2σ̂ = 0.077 and 0.127 and X̄ ± 3σ̂ = 0.064 and 0.140; (b) MR chart (ng = 2), MR̄ = 0.0144, UCL = D4(MR̄) = 0.047.]

The upper control limit on the moving range chart, Figure 3.3b, is

UCL = D4(MR̄) = (3.27)(0.0144) = 0.047

where D4 is the ordinary factor for ranges. See Table 2.3, or Table A.4.
The second observation in Table 3.2 is responsible for the first two moving range points being near the UCL in Figure 3.3b; thus X2 = 0.070 is suspected of being an outlier.
Also, control limits have been drawn in Figure 3.3a for n = 1 at

X̄ ± 3σ̂ and at X̄ ± 2σ̂

where σ̂ = 0.013. The point on the X chart corresponding to X2 = 0.070 is between the 2σ̂ and 3σ̂ lines. This is additional evidence, consistent with that of Figure 3.3b, indicating this one observation to be an outlier with risk between α = 0.05 and 0.01. Consequently, an investigation for the reason is recommended.
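The moving-range check just described is easy to script. A minimal sketch in Python using the Table 3.2 data (the constants D4 = 3.27 and d2 = 1.13 for n = 2 are those used in the text):

```python
x = [0.110, 0.070, 0.110, 0.105, 0.100, 0.115,
     0.100, 0.105, 0.105, 0.098, 0.110]  # Table 3.2

# Moving ranges, ng = 2: positive differences of consecutive observations
mr = [abs(b - a) for a, b in zip(x, x[1:])]

mr_bar = sum(mr) / len(mr)    # MR-bar = 0.0144
ucl = 3.27 * mr_bar           # D4 * MR-bar = 0.047
sigma_hat = mr_bar / 1.13     # MR-bar / d2 = 0.0127

# 1-based positions whose adjacent moving range approaches the UCL;
# both flagged ranges involve X2 = 0.070
suspects = [i + 2 for i, m in enumerate(mr) if m > 0.8 * ucl]
print(round(mr_bar, 4), round(ucl, 3), suspects)
```

The two large moving ranges are the ones formed with the second observation, pointing at X2 = 0.070 as the suspect, consistent with Figure 3.3b.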

2. Dixon’s test for a single outlier in a sample of n. Consider the ordered set of n random observations from a population presumed to be normal:

X(1), X(2), X(3), . . . , X(n–1), X(n)

We refer to such observations as order statistics, where X(1) is the smallest value and X(n) is the largest in a group of size n. Either end point, X(1) or X(n), may be an outlier. Dixon studied various ratios5 and recommends

r10 = (X(2) – X(1)) / (X(n) – X(1))

to test whether the smallest value X(1) is an outlier in a sample of n = 3, 4, 5, 6, or 7. If the largest X(n) is suspect, simply reverse the order of the data, or use the complementary ratio for r10, in this case, which is presented in Appendix Table A.9. When n > 7, similar but different ratios are recommended in Table A.9.

Example 3.1
Consider the previous data from Table 3.2, n = 11. From Table A.9 we are to compute:

r21 = (X(3) – X(1)) / (X(n–1) – X(1)) = (0.100 – 0.070) / (0.110 – 0.070) = 0.75

This computed value of r21 = 0.75 exceeds the tabular entry 0.679 in Table A.9, corresponding to a very small risk of 0.01. We decide that the 0.070 observation does not belong to the same universe as the other ten observations. This conclusion is consistent with the evidence from Figure 3.3.
Whether the process warrants a study to identify the reason for this outlier and
the expected frequency of future similar low mavericks is a matter for discussion
with the engineer.
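Example 3.1 can be verified in a few lines of Python; the r21 ratio compares the gap above the lowest value (through the third order statistic) with the spread up to the second-largest value. The critical value 0.679 is the Table A.9 entry quoted above:

```python
x = sorted([0.110, 0.070, 0.110, 0.105, 0.100, 0.115,
            0.100, 0.105, 0.105, 0.098, 0.110])  # Table 3.2, n = 11

# Dixon r21 for a suspected low outlier with n = 11:
# (X(3) - X(1)) / (X(n-1) - X(1)) in order-statistic notation
r21 = (x[2] - x[0]) / (x[-2] - x[0])
print(round(r21, 2))  # 0.75, exceeding the 0.679 critical value at alpha = 0.01
```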

5. W. J. Dixon, “Processing Data for Outliers,” Biometrics 9 (1953): 74–89.



3.3 TWO SUSPECTED OUTLIERS ON THE SAME END OF A SAMPLE OF n (OPTIONAL)
Besides the control chart, the following two procedures are suggested as tests for a pair
of outliers:
1. Dixon’s test after excluding the more extreme of two observations: Proceed in
the usual way by testing the one suspect in the (n – 1) remaining observations.
If there are three extreme observations, exclude the two most extreme and
proceed by testing the one suspect in the (n – 2) remaining observations using
Table A.9.
Consider the 10 measurements in Example 3.2, where the two smallest
suggest the possibility of having a different source than the other eight; see
Figure 3.4.
Analysis with Dixon’s Criterion. Exclude the lowest observation. Then
in the remaining nine: X(1) = 2.22, X(2) = 3.04, X(8) = 4.11, and X(9) = 4.13.
Form the ratio

r11 = (X(2) – X(1)) / (X(8) – X(1)) = 0.82/1.89 = 0.43

Analysis: This ratio is between the critical value of 0.352 for α = 0.20 and 0.441 for α = 0.10.
Decision: We consider both of the suspected observations to be from a
different source if we are willing to accept a risk between 0.10 and 0.20.
2. A test for two outliers on the same end provided by Grubbs6 is based on the
ratio of the sample sum of squares when the doubtful values are excluded
compared to the sum when they are included. This is illustrated by the
following example.

Figure 3.4 Data with two suggested outliers on the same end. (See Example 3.2.) [Chart: individuals (ng = 1) plotted on a scale of about 2.0 to 4.0, with the two lowest points marked “?”.]

6. Frank E. Grubbs, “Procedures for Detecting Outlying Observations in Samples,” Technometrics 11, no. 1 (February 1969): 1–21.

Example 3.2
Following are ten measurements of percent elongation at break test7 on a certain material: 3.73, 3.59, 3.94, 4.13, 3.04, 2.22, 3.23, 4.05, 4.11, and 2.02. Arranged in ascending order of magnitude these measurements are: 2.02, 2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05, 4.11, 4.13. We can test the two lowest readings simultaneously by using the criterion S1,2²/S² from Table A.10. For the above measurements

S² = Σ (Xi – X̄)² = [n ΣXi² – (ΣXi)²] / n = [10(121.3594) – (34.06)²] / 10 = 5.351

with S² = (n – 1)s², and

S1,2² = Σi=3..n (Xi – X̄1,2)² = [(n – 2) Σi=3..n Xi² – (Σi=3..n Xi)²] / (n – 2)
      = [8(112.3506) – (29.82)²] / 8 = 1.197

with S1,2² = (n – 3)s1,2², where

X̄1,2 = Σi=3..n Xi / (n – 2) = 29.82/8 = 3.728

Then,

S1,2²/S² = 1.197/5.351 = 0.224

From Table A.10, the critical value of S1,2²/S² at the 5 percent level is 0.2305. Since the calculated value is less than this, we conclude that both 2.02 and 2.22 are outliers, with a risk of 5 percent. This compares with a risk between 10 percent and 20 percent by the previous analysis. Note that this test rejects the hypothesis of randomness when the test statistic is less than the critical value.

7. Also see Figure 3.4.
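The Grubbs computation of Example 3.2 can be verified with a short Python sketch; 0.2305 is the 5 percent critical value quoted from Table A.10:

```python
x = sorted([3.73, 3.59, 3.94, 4.13, 3.04, 2.22,
            3.23, 4.05, 4.11, 2.02])  # Example 3.2, n = 10

def css(v):
    """Corrected sum of squares about the mean of v."""
    m = sum(v) / len(v)
    return sum((xi - m) ** 2 for xi in v)

S2 = css(x)          # about 5.351, using all n observations
S2_12 = css(x[2:])   # about 1.197, the two smallest excluded

ratio = S2_12 / S2
print(round(ratio, 3))  # 0.224, below the 5 percent critical value 0.2305
```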



3.4 PRACTICE EXERCISES

1. In Table 3.1, there is an apparent outlier in subset 14 (X = 1.32). Use Dixon’s


test to objectively determine if this is an outlier. Do this with subset 14 only
(ng = 5), and subsets 13 and 14 (ng = 10). By interpolation in Table A.9,
determine the approximate p value (the value of the probability for which
the result would just barely be significant for each case). Explain the different
p values.
2. Use the data subsets 11–14 from Table 3.1 column 1 to plot an X chart and a
moving range chart. Compute the upper control limit on the MR chart based
on subsets 11 and 12. Then continue to plot, noting whether the apparent
outlier is detected by the chart.
3. In using an MR chart to check for an outlier in a set of data (where all the data is in hand before the check is made), do you think the suspect should be included or excluded in the computation of MR̄? Explain your reasoning.
4. For the case of two suspected outliers, the authors provide two methods of
analysis: Dixon’s and Grubbs’s. Notice that the p value for Dixon’s method
is roughly 0.14, while for Grubbs’s method it is about 0.04. Examine the
two methods and suggest a theoretical reason why the latter p value is so
much lower.
5. A third method has been proposed for the case of two suspected outliers—a
t test. Let the two suspected units form one sample and the remaining units
form the other sample. Compute t. What is your opinion of the validity of this
method? Justify your answer. (Note: the t test is shown in Chapter 13).
6. Test the data in the sample below for outliers using both the Dixon test and the
Grubbs test using a = 0.05 as the criterion.

56, 46, 43, 57, 70, 50, 43, 40, 41, 40, 51, 55, 72, 53, 42, 44

7. Suppose the data in Table 3.1 is split into two parts with subset 11–14 in one
part and 15–18 in the other. Test whether the means of these subsets are the
same using a = 0.05. Exclude any outliers and retest. Are the results the same?
(Note: the t test is shown in Chapter 13).

8. Combine the X and MR charts in Figure 3.3 into one chart.
4 Variability—Estimating and Comparing

4.1 INTRODUCTION
The variability of a stable process may differ from one machine to another, or from one
set of conditions to another. Such variabilities in a process can introduce difficulties in
the comparison of averages or other measures of central tendency.
Comparisons of variability and of averages will depend upon estimates of variation. Statistical tables will be used in making some of these comparisons. The computation of some of these tables depends upon the number of degrees of freedom (df) associated with estimates of the variance σ² and standard deviation σ. Degrees of freedom may be thought of, crudely, as “effective sample size,” and depend on how much of the data is “used up” in making estimates necessary for the calculation of the statistic involved. The number of degrees of freedom may be different when computing estimates in different ways. The number of df associated with different methods of computation will be indicated.
Section 4.2 has been included for those who enjoy looking at extensions and rami-
fications of statistical procedures. It may be omitted without seriously affecting the
understanding of subsequent sections.

4.2 STATISTICAL EFFICIENCY AND BIAS IN VARIABILITY ESTIMATES
Two terms used by statisticians have technical meanings suggested by the terms them-
selves. They are unbiased and statistically efficient. If the expected value of the statis-
tic is equal to the population parameter it is supposed to estimate, then the estimate is
called unbiased. A statistic that is the least variable for a given sample size is said to be
the “most reliable,” or is referred to as being statistically efficient.


In the definition of the sample variance, the denominator (n – 1) is used; this provides an unbiased estimate of the unknown population parameter σ². We might also expect the square root of s² to be an unbiased estimate of σ; actually this is not quite the case. However, it is unusual for anyone to make an adjustment for the slight bias. An unbiased estimate is

σ̂ = s/c4 = (1/c4) √[ Σ (Xi – X̄)² / (n – 1) ]     (4.1)

where some values of c4 are given in Tables 4.1 and A.4, but they are seldom used in practice.
Note that s̄ can be used in place of s in the above relationship.
The concept of statistical efficiency is discussed in texts on theoretical statistics. It permits some comparisons of statistical procedures, especially under the assumptions of normality and stability. For example, the definition of the variance s² in Equation (4.2) would be the most efficient of all possible estimates if the assumptions were satisfied. However, the statistical efficiency of σ̂ based on the range, σ̂ = R̄/d2, is only slightly less than that obtained in relation to Equation (4.1) even when the assumptions are satisfied. (See Table 4.2.) When they are not satisfied, the advantages often favor σ̂ = R̄/d2.
Statistical methods can be important in analyzing production and scientific data and
it is advisable that such methods be as statistically efficient as possible. But cooperation
between engineering and statistical personnel is essential; both groups should be ready
to compromise on methods so that the highest overall simplicity, feasibility, and effi-
ciency are obtained in each phase of the study.

Table 4.1 Factors c4 to give an unbiased estimate σ̂ = s/c4 = (1/c4)√[Σ(Xi – X̄)²/(n – 1)].

   n     c4
   2    0.80
   3    0.88
   4    0.92
   6    0.95
  10    0.97
  25    0.99
   ∞    1.00

Note: For n > 25, c4 ≈ (4n – 4)/(4n – 3).
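The c4 entries of Table 4.1 can be reproduced from the closed-form expression c4 = √(2/(n – 1)) Γ(n/2)/Γ((n – 1)/2). This formula is not stated in the text, but it is the standard definition behind such tables; note that its exact value for n = 3 is 0.886, which the table shows truncated as 0.88. A sketch in Python:

```python
import math

def c4(n):
    """Bias-correction factor so that E(s) = c4 * sigma for a normal sample of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

for n in (2, 3, 4, 6, 10, 25):
    print(n, round(c4(n), 3))
```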


Table 4.2 Statistical efficiency of σ̂ = R̄/d2 in estimating the population parameter from k small samples.

   ng    Statistical efficiency
    2    1.00
    3    0.992
    4    0.975
    5    0.955
    6    0.93
   10    0.85

4.3 ESTIMATING σ AND σ² FROM DATA: ONE SAMPLE OF SIZE n

In Chapter 1, Table 1.3, the mechanics of computing X̄ and σ̂ from grouped data with large n were presented as follows:

σ̂ = √[ (n Σfi mi² – (Σfi mi)²) / (n(n – 1)) ]

X̄ = Σfi mi / n

The frequency distribution procedure as given serves a dual purpose: (1) it presents
the data in a graphical form that is almost invariably useful, and (2) it provides a sim-
ple form of numerical computation that can be carried through without a calculator or
computer.
In some process improvement studies, we shall have only one set of data that is too small to warrant grouping. Let the n independent, random observations from a stable process be X1, X2, X3, . . . , Xn. Then the variance σ² of the process can be estimated by

s² = σ̂² = Σ (Xi – X̄)² / (n – 1),   df = n – 1     (4.2)

so the standard deviation is

s = σ̂ = √[ (n ΣXi² – (ΣXi)²) / (n(n – 1)) ]     (4.3)

The following sections indicate that the comparison of two process variabilities is based on variances rather than standard deviations.
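Equations (4.2) and (4.3) are algebraically equivalent, as a quick Python check confirms (the sample 3, 5, 7 is an illustrative toy, not from the text):

```python
import math

x = [3, 5, 7]  # illustrative toy sample, n = 3
n = len(x)
x_bar = sum(x) / n

# Equation (4.2): definitional form, df = n - 1
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

# Equation (4.3): computational form using raw sums
s = math.sqrt((n * sum(xi * xi for xi in x) - sum(x) ** 2) / (n * (n - 1)))

print(s2, s)  # 4.0 2.0
```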

4.4 DATA FROM n OBSERVATIONS CONSISTING OF k SUBSETS OF ng = r: TWO PROCEDURES
Introduction
Important procedures are presented in this section that will be applied continually
throughout the remaining chapters of this book. They are summarized in Section 4.6 and
Table 4.6. It is recommended that you refer to them as you read through this chapter.
In our industrial experiences, we often obtain repeated (replicated) observations
from the same or similar sets of conditions. The small letter r will be used to represent
the number of observations in each combination of conditions. There will usually be k
sets of conditions each with ng = r replicates. The letter n will usually be reserved to use
when two or more samples of size r are pooled. Thus,
r = ng = Number of observations in each combination of conditions
k = Number of subgroups
n = Sample size when two or more subgroups are pooled
so
n = rk

This distinction is important in this chapter, which compares estimates based on k subgroups of size r with those using an overall sample size of n = rk.
The mechanics of computing σ̂ from a series of small rational subgroups of size ng was discussed in Section 1.12. We usually considered k to be as large as 25 or 30. Then

σ̂ = R̄/d2     (4.4)

where d2 is a constant (see Table A.4) with values depending only on ng (or r). An important advantage of this control chart method is its use in checking the stability of process variation from the R chart.
This estimate σ̂ in Equation (4.4) is unbiased. However, squaring to obtain σ̂² = (R̄/d2)² has the seemingly peculiar effect of producing a bias in σ̂². The bias can be removed by the device of replacing d2 by a slightly modified factor d2* (read “d-two star”) depending on both the number of samples k and the number of replicates, r. See Table A.11 for values of d2*. Note that the value of d2* converges to the value of d2 as k approaches infinity.

σ̂² = (R̄/d2*)²     (4.5)

is unbiased. Also

σ̂ = R̄/d2*     (4.6)

Using d2* in place of d2, the estimate of σ in Equation (4.6) is slightly biased, much as s in Equation (4.3) is biased. We shall sometimes use this biased estimate R̄/d2* later, especially when ng is less than say 4 or 5, in connection with certain statistical tables based on the bias of s in estimating σ. (This is somewhat confusing; the differentiation is not critical, as can be seen by comparing values of d2 and d2*.) The degrees of freedom df associated with the estimates in Equations (4.5) or (4.6) are also given in Table A.11 for each value of k and r = ng based on an approximation due to Patnaik.1 However, a simple comparison indicates that there is a loss of efficiency of essentially 10 percent when using this range estimate, which is reflected in the associated degrees of freedom, that is,

df ≅ (0.9)k(r – 1),   k > 2     (4.7)

There is an alternate method of computing σ̂² from a series of rational subgroups of varying sizes r1, r2, r3, . . . , rk. Begin by computing a variance si² for each sample from Equation (4.2) to obtain s1², s2², s3², . . . , sk². Then each sample contributes (ri – 1) degrees of freedom to the total shown in the denominator of Equation (4.8). This estimate σ̂² in Equation (4.8) is unbiased:

σ̂² = sp² = [(r1 – 1)s1² + (r2 – 1)s2² + … + (rk – 1)sk²] / (r1 + r2 + … + rk – k)     (4.8)

When r1 = r2 = r3 = . . . = rk = r, the degrees of freedom become simply

df = k(r – 1)

Equation (4.8) is applicable for either large or small sample sizes ri.
Note: When k = 2 and r1 = r2 = r, Equation (4.8) becomes simply the average

σ̂² = sp² = (s1² + s2²)/2   with df = 2(r – 1)

1. P. B. Patnaik, “The Use of Mean Range As an Estimator of Variance in Statistical Tests,” Biometrika 37 (1950):
78–87.
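Equation (4.8) is a degrees-of-freedom-weighted average of the subgroup variances. A minimal Python sketch (the variances 4.0 and 6.0 and the subgroup sizes used below are illustrative numbers, not from the text):

```python
def pooled_variance(s2_list, r_list):
    """Equation (4.8): df-weighted average of subgroup variances s_i^2.

    Returns (pooled variance, degrees of freedom = sum(r_i) - k).
    """
    num = sum((r - 1) * s2 for s2, r in zip(s2_list, r_list))
    df = sum(r - 1 for r in r_list)
    return num / df, df

# k = 2 subgroups of r = 5 each: reduces to the simple average of the note
sp2, df = pooled_variance([4.0, 6.0], [5, 5])
print(sp2, df)  # 5.0 8
```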

4.5 COMPARING VARIABILITIES OF TWO POPULATIONS

Consider two machines, for example, producing items to the same specifications. The product may differ with respect to some measured quality characteristic because of differences either in variability or because of unstable average performance.
Two random samples from the same machine (or population) will also vary. We
would not expect the variability of the two samples to be exactly equal. Now if many
random samples are drawn from the same machine (population, process), how much
variation is expected in the variability of these samples? When is there enough of a dif-
ference between computed variances of two samples to indicate that they are not from
the same machine or not from machines performing with the same basic variability?
This question can be answered by two statistical methods: the variance ratio test
(F test) and the range-square-ratio test (FR test).

Variance Ratio Test (F Test)

This method originated with Professor George W. Snedecor, who designated it the “F test” in honor of the pioneer agricultural researcher and statistician, Sir Ronald A. Fisher. The method is simple in mechanical application.
Method: Given two samples of sizes n1 and n2, respectively, considered to be from the same population, compute s1² and s2² and designate the larger value by s1². What is the expected “largest ratio,” with risk α, of the F ratio

F = s1²/s2²     (4.9)

To answer, we will need degrees of freedom (df) for

F(df1, df2) = F(n1 – 1, n2 – 1)

The two degrees of freedom will be written as

Numerator (s1²): df1 = n1 – 1, and
Denominator (s2²): df2 = n2 – 1

Critical values, Fα, are given in Table A.12, corresponding to selected values of α and F(n1 – 1, n2 – 1). The tables are constructed so that the df across the top of the tables applies to df1 of the numerator (s1²); the df along the left side applies to df2 of the denominator (s2²). Note that the tables are one-tailed. When performing a two-sided test, such as the above, the risk will be twice that shown in the table.

Example 4.1
The 25 tests on single-fiber yarn strength from two different machines are shown in Table 4.3; they are plotted in Figure 4.1. The graph suggests the possibility that machine 56 is different than machine 49. A formal comparison can be made by the F test, Equation (4.9), using

σ̂² = s² = [n ΣX² – (ΣX)²] / (n(n – 1))

Table 4.3 Data: breaking strength of single-fiber yarn spun on two machines.
Machine 49 Machine 56
3.99 5.34
4.44 4.27
3.91 4.10
3.98 4.29
4.20 5.27
4.42 4.24
5.08 5.12
4.20 3.79
4.55 3.84
3.85 5.34
4.34 4.94
4.49 4.56
4.44 4.28
4.06 4.96
4.05 4.85
4.34 4.17
4.00 4.60
4.72 4.30
4.00 4.21
4.25 4.16
4.10 3.70
4.35 3.81
4.56 4.22
4.23 4.25
4.01 4.10

Figure 4.1 Breaking strength of single-fiber yarn from two machines. (Data from Table 4.3.) [Chart: individual breaking strengths (ng = r = 1) for machine 49 and machine 56, plotted on a scale of about 4.0 to 5.0.]

Computations

Machine 56:
n1(ΣX²) = 25(496.1481) = 12,403.7025
(ΣX)² = (110.71)² = 12,256.7041
difference = 146.9984, n1(n1 – 1) = 600
s1² = 146.9984/600 = 0.245, df = 24

Machine 49:
n2(ΣX²) = 25(456.1810) = 11,404.525
(ΣX)² = (106.56)² = 11,355.034
difference = 49.491, n2(n2 – 1) = 600
s2² = 49.491/600 = 0.082, df = 24

We now compare the two variabilities by the F ratio, Equation (4.9).

F = s1²/s2² = 0.245/0.082 = 2.99   with df1 = n1 – 1 = 24, df2 = n2 – 1 = 24

Critical Value of F
Table A.12 gives a one-sided α = 0.01 critical value of F0.01 = 2.66. Since our test ratio 2.99 is larger than 2.66, we declare that the variability of machine 56 is greater than that of machine 49, with very small risk, α < 2(0.01) = 0.02.

Example 4.2
The variability of an electronic device was of concern to the engineer who was assigned
that product. Using cathode sleeves made from one batch of nickel (melt A), a group of
10 electronic devices was processed and an electrical characteristic (transconductance,
Gm) was measured as shown in Table 4.4. Using nickel cathode sleeves from a new
batch (melt B), a second test group of 10 devices was processed and Gm was read. Was
there evidence that the population variability from melt B was significantly different
from melt A?
The first step was to plot the data (Figure 4.2).
The data are typical; neither set has an entirely convincing appearance of randomness. One obvious possibility is that the 1420 reading in melt B is an outlier. The Dixon test for an outlier, n = 10, is

r11 = (X(2) – X(1)) / (X(n–1) – X(1)) = (3770 – 1420) / (6050 – 1420) = 0.508

This computed value, 0.508, exceeds the critical value of 0.477, for α = 0.05 and n = 10, in Table A.9. This was reasonably convincing evidence that something peculiar

Table 4.4 Data: measurements of transconductance of two groups of electronic devices made from two batches (melts) of nickel.

   Melt A    Melt B
   4760      6050
   5330      4950
   2640      3770
   5380      5290
   5380      6050
   2760      5120
   4140      1420
   3120      5630
   3210      5370
   5120      4960

   Ā = 4184.0 (n1 = 10)    B̄ = 4861.0 (n2 = 10)
                           B̄´ = 5243.3 (n2´ = 9, with 1420 deleted)

Figure 4.2 Transconductance readings on electronic devices from two batches of nickel. (Data from Table 4.4.) [Chart: individual Gm readings (r = 1) for melt A and melt B on a scale of 2000 to 6000, with Ā = 4184.0, B̄ = 4861.0, and B̄´ = 5243.3 marked.]

occurred with the seventh unit in melt B either during production or testing to obtain the 1420 reading.
Much of the variability of melt B is contributed by the suspected outlier observation. What would happen if the 1420 observation were removed and F computed? (Note: B´ is used to indicate the nine observations with 1420 deleted.)

s1² = sA² = 1,319,604   n1 = 10
s2² = sB´² = 477,675    n2 = 9

Then

F = s1²/s2² = sA²/sB´² = 1,319,604/477,675 = 2.77
df1 = n1 – 1 = 10 – 1 = 9   df2 = n2 – 1 = 9 – 1 = 8

Now this F exceeds F0.10 = 2.56, but not F0.05 = 3.39. This suggests that product from melt A may be more variable than product from melt B´ (one maverick removed), but is not entirely convincing since the respective two-sided risks are 0.20 and 0.10.
There are important engineering considerations that must be the basis for deciding whether to consider melt B to be basically less variable than melt A, and whether to investigate recurrences of possible mavericks in either group.
On the other hand, what are the consequences, statistically, if the 1420 observation is not considered to be a maverick? When the variance of melt B is recomputed with all 10 observations,

sB² = 1,886,388

The F ratio becomes

F = s1²/s2² = sB²/sA² = 1,886,388/1,319,604 = 1.43
df1 = df2 = 10 – 1 = 9

Even if the engineer were willing to assume a risk of α = 0.20 there would still not be justification in considering the variability of the two samples to be different when the 1420 (suspected maverick) is left in the analysis, since F = 1.43 is less than either F0.10 = 2.44 or even F0.25 = 1.59, with two-sided risks 0.20 and 0.50.
Conclusion
The reversal that results from removing the 1420 observation is not entirely unexpected
when we observe the patterns of melt A and melt B; melt A gives an appearance of
being more variable than melt B. The above statistical computations tend to support two
suppositions:
1. That the observation 1420 is an outlier.
2. That melt A is not a single population but a bimodal pattern having two
sources, one averaging about 5000 and the other about 3000. This suspicion
may justify an investigation into either the processing of the electronic devices
or the uniformity of melt A in an effort to identify two sources “someplace in
the system” and make appropriate adjustments.
(See Case History 13.4 for more discussion of this set of data.)

Range–Square–Ratio Test, FR

In this chapter, we have considered two methods of computing unbiased estimates of the variance σ̂²: the mean-square estimate, σ̂² = s², and the range-square estimate, σ̂² = (R̄/d2*)². In the preceding section, two process variabilities were compared by forming an F ratio and comparing the computed F with tabulated critical values Fα. When data sets are available from two processes, or from one process at two different times, in the form of k1 sets of r1 from one and k2 sets of r2 from a second, then we may use the range-square-ratio2 to compare variabilities.

FR = (R̄1/d2*)² / (R̄2/d2*)²     (4.10)

If the subgroup sizes and the numbers of subgroups are the same for both data sets, so that the d2* factors cancel, the ratio simply becomes

FR = R̄1²/R̄2²

with

df1 ≅ (0.9)k1(r1 – 1)   df2 ≅ (0.9)k2(r2 – 1)     (4.11)

The test statistic FR is then compared to critical values in Table A.12 using the degrees of freedom given in Equation (4.11) to determine statistical significance. Details of the procedure are given below in Examples 4.3 and 4.4.

Example 4.3
This example concerns Case History 2.2 on chemical concentration. A visual inspection
of Figure 4.3 shows 11 range points, representing the period of May 14 through May
18, to be above the median. This long run is suggestive of a shift in the variability of
the process during this period. Does the range-square-ratio test offer any evidence of a shift in variability?

We see an estimated average of R̄1 = 2.5 for the k1 = 20 points during the period from May 8 through May 14, then a jump to an estimated R̄2 = 6 for the next 11 points,

2. Acheson J. Duncan, “The Use of Ranges in Comparing Variabilities,” Industrial Quality Control 11, no. 5
(February 1955): 18, 19, and 22. Values of r1 and r2 usually should be no larger than 6 or 7.

Figure 4.3 Evidence of increased process variability. (Example 4.3.) [Chart: R chart, ng = r = 4, plotted against dates May 8–20 with the median R marked; R̄1 ≅ 2.5 for the first k1 = 20 points, R̄2 ≅ 6 for the next k2 = 11 points.]

and possibly a drop back to the initial average during the period of the last seven points. This is a conjecture, that is, a tentative hypothesis.

Range-Square-Ratio Test
(Data from Figure 4.3)

R̄1 = 2.5, k1 = 20, r1 = 4, with df1 ≅ (0.9)(20)(3) = 54
R̄2 = 6.0, k2 = 11, r2 = 4, with df2 ≅ (0.9)(11)(3) = 30
Then

FR = (R̄2/d2*)² / (R̄1/d2*)² ≅ R̄2²/R̄1² = 36/6.25 = 5.76,   df ≅ (30, 54)

Note that the FR value is only approximately equal to 5.76, since the d2* values for R̄2 and R̄1 are 2.07 and 2.08, respectively. In Table A.12, no critical value of F is shown for 30 and 54 df, but one is shown for 30 and 60 df; it is F0.01 = 2.03. Since our test ratio, 5.76, is much larger than any value in the vicinity of F0.01 = 2.03, there is no need or advantage in interpolating. We declare there is a difference with risk α < 0.02.
The range-square-ratio test supports the evidence from the long run that a shift in variability did occur about May 14. The process variability was not stable; an investigation at the time would have been expected to lead to physical explanations.
X̄ and R control charts are helpful in process improvement studies: these often indicate shifts in R̄ (as well as in X̄). Then the range-square-ratio test can be applied easily as a check, and investigations made into possible causes.

Example 4.4
Another use of the range-square-ratio test is in comparing variabilities of two analyti-
cal chemical procedures. The data in Table 4.5 pertain to the ranges of three determi-
nations of a chemical characteristic by each of four analysts on (blind) samples from
each of four barrels. In the first set, determinations are made by method A and in the
second by method B. The question is whether the experimental variability of the sec-
ond method is an improvement (reduction) over the first. It is assumed that the experi-
mental variability is independent of analyst and sample so that the 16 ranges can be
averaged for each set. Here we have
R̄A = 1.37     R̄B = 0.75
r1 = 3        r2 = 3
k1 = 16       k2 = 16

and from Table A.11, df1 = 29.3 and df2 = 29.3, or df ≅ (0.9)k(r – 1) = 29 from Equation
(4.7). Then

       FR = (1.37)² / (0.75)² = 3.3     with df ≅ (29, 29)

From Table A.12, we find F0.05 = 1.87. Since our computed FR = 3.3 exceeds the one-sided critical (α = 0.05) value, we conclude that the variability of method B is an improvement over the variability of method A, with risk α = 0.05, since this is a one-sided test.

Table 4.5 Variability (as measured by ranges, r = 3) of two methods of chemical
analysis using four analysts.

                              Analysts
Method    Barrel       I       II      III      IV
A           1         2.0     3.1     0        1.5
            2         0.5     2.0     2.5      0.3
            3         1.5     1.9     1.5      0.8
            4         2.5     0.5     1.0      0.3     R̄A = 1.37
B           1         0.5     1.3     1.0      1.0
            2         0.5     1.4     0        0.9
            3         1.0     0.8     0.5      0.3
            4         1.0     0.7     1.0      0.3     R̄B = 0.75
122 Part I: Basics of Interpretation of Data

Comparison of Two Special Variances


Both the F test and the FR test assume the variances in the numerator and the denominator to be independent. However, a comparison of σ̂LT² and σ̂ST² from the same set of data is often enlightening and will serve to give an indication of possible lack of statistical control. Obviously, they are not independent. When the process is in control, we expect σ̂LT² to be close to σ̂ST². However, when the process is out of control, we would expect σ̂LT
> σ̂ST. To determine the statistical significance of such a comparison, it is possible to
use a test developed by Cruthis and Rigdon.3
Consider the mica data shown in Table 1.5. Is the mica splitting process in control?
To test this hypothesis, proceed as follows using the subgroups of size ng = 5 shown in
the table:
1. Compute

       σ̂ST = R̄/d2 = 4.875/2.33 = 2.092

2. From the subgroup means, compute

       sX̄ = √[ Σ(X̄i − X̿)² / (k − 1) ]

   where the sum runs over the k subgroup means X̄i and X̿ is the grand average.

3. Estimate σ̂LT as

       σ̂LT = sX̄√n = 1.249√5 = 2.793

4. Form the ratio

       F* = σ̂LT²/σ̂ST² = (2.793)²/(2.092)² = 1.78

5. Critical values for F* will be found in Table A.20. For a subgroup size of
   ng = 5 and a total sample size of n = 200, with risk α = 0.05, the critical
   value is less than F*0.05 = 1.231 (the value for n = 160). Therefore, the
   hypothesis of a chance difference should be rejected at the α = 0.05 level.
   This is an indication that the mica data were not in control. Note that this
   is a one-sided test.

3. E. N. Cruthis and S. E. Rigdon, “Comparing Two Estimates of the Variance to Determine the Stability of the
Process,” Quality Engineering 5, no. 1 (1992–1993): 67–74.

Additional tables for other subgroup sizes will be found in the Cruthis and Rigdon
reference.
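Steps 1–4 above reduce to a short numeric sketch (the summary values are the ones quoted from the mica example; the variable names are ours):

```python
import math

# Step 1: short-term estimate from the average range (mica data, ng = 5)
sigma_ST = 4.875 / 2.33            # R-bar / d2, about 2.092

# Steps 2-3: long-term estimate from the subgroup means
s_xbar = 1.249                     # standard deviation of the subgroup means
sigma_LT = s_xbar * math.sqrt(5)   # s_xbar * sqrt(ng), about 2.793

# Step 4: the Cruthis-Rigdon ratio; compare with the Table A.20 critical value
F_star = (sigma_LT / sigma_ST) ** 2   # about 1.78, above F*0.05 = 1.231
```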
Now consider the outlier data in Table 3.2 showing 11 successive observations with
M̄R = 0.01144. This estimates short-term variability as

       σ̂ST = M̄R/d2 = 0.01144/1.128 = 0.01014

The standard deviation of all eleven observations is σ̂LT = s = 0.01197. If the process
is in control, the estimates of σ̂ST and σ̂LT should be close to each other. If the process is
out of control, we would expect σ̂LT > σ̂ST. The F* test proceeds as follows:

       F* = σ̂LT²/σ̂ST² = (0.01197)²/(0.01014)² = 1.39

Table A.20 shows that for 10 observations, the critical value is F*0.05 = 1.878. For 11
observations, the critical value would be slightly less. This test does not indicate an
out-of-control condition among the 11 points, even though the control chart in Figure
3.3 does show the process to be out of control. This is because the effect of the second
observation is mitigated by the uniformity of the other observations.

4.6 SUMMARY
This chapter has presented different methods of computing estimates of s and s 2
from data. These estimates are used in comparing process variabilities under operat-
ing conditions. Frequent references will be made to the following estimates:

• σ̂ = R̄/d2: used when the number of subgroups k is as large as 25 or 30; it is
  conservative for smaller numbers of subgroups (see note below). See
  Equation (1.13). Gives an unbiased estimate for all k.

• σ̂ = R̄/d2*: used when the number of subgroups k is quite small (less than 30)
  and in computations with other tables designed for use with the slightly
  biased estimate s.

We frequently use this estimate in association with factors from Table A.15, which
was designed for use with the similarly biased estimate s.

Note: In Table A.11, it may be seen that d2 < d2* in each column; thus, the two
estimates above will differ slightly, and

       R̄/d2 > R̄/d2*.

Table 4.6 Summary: estimating variability.

Different procedures for computing σ̂ are quite confusing unless used frequently. Forms
that will be used most often in the following chapters are marked with an asterisk (*).

Computing measures of variation from k sets of r each:

*1. σ̂ = R̄/d2                    df = ∞; unbiased                Equation (1.13)

*2. σ̂ = R̄/d2*                   df = 0.9k(r − 1);               Equation (4.6);
                                 slightly biased                also Table A.11

 3. σ̂² = (R̄/d2*)²               df = 0.9k(r − 1);               Equation (4.5);
                                 unbiased                       also Table A.11

 4. σ̂² = [(r1 − 1)s1² + (r2 − 1)s2² + … + (rk − 1)sk²] / (r1 + r2 + … + rk − k)
                                 df = r1 + r2 + … + rk − k;     Equation (4.8)
                                 unbiased

*5. σ̂X̄ = σ̂/√r                   df same for σ̂ and σ̂X̄           Equation (1.7)

Computing measures of variation from one set of n = rk:

*6. σ̂ = s = √[ (nΣfm² − (Σfm)²) / (n(n − 1)) ]
                                 where m = cell midpoint;       Equation (1.4b)
                                 df = n − 1; slightly biased

 7. σ̂² = s² = Σ(Xi − X̄)² / (n − 1)
                                 df = n − 1; unbiased           Equation (4.2)

 8. σ̂² = s² = (nΣXi² − (ΣXi)²) / (n(n − 1))
                                 df = n − 1; unbiased           Equation (4.3)

 9. σ̂ = s = √[ Σ(Xi − X̄)² / (n − 1) ]
                                 df = n − 1; slightly biased    Equation (4.1)

Consequently, the use of d2 with a small number of subgroups will simply produce
a slightly more conservative estimate of σ in making comparisons of k means. We shall
usually use d2* in the case histories in the following chapters.
Two methods of comparing process variability were discussed in Section 4.5: the F
test and the range-square-ratio test (FR).
An outline of some computational forms is given in Table 4.6. The different procedures for computing σ̂ are quite confusing unless used frequently. Those forms that
will be most useful in the following chapters are marked.
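The two main routes in Table 4.6 are easy to mix up; a small sketch may help fix them. The data below are made up purely for illustration, and d2 = 1.693 is the tabled constant for subgroups of size 3:

```python
import statistics

subgroups = [[5.1, 4.9, 5.3], [5.0, 5.2, 4.8], [5.4, 5.0, 4.9]]  # k = 3 sets of r = 3
k, r = 3, 3
d2 = 1.693  # control-chart constant for subgroup size 3

# Form 1: sigma-hat = R-bar / d2, from the average subgroup range
R_bar = sum(max(g) - min(g) for g in subgroups) / k
sigma_hat = R_bar / d2

# Form 4: pooled variance from the k sample variances; df = k(r - 1) here
sigma2_pooled = sum((r - 1) * statistics.variance(g) for g in subgroups) / (k * (r - 1))
```

Note that `statistics.variance` uses the n − 1 denominator, matching forms 7 and 8, so the pooled result is the unbiased estimate of Equation (4.8).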

Case History 4.1


Vial Variability
In an effort to evaluate two different vendors, data is taken on the weights (in grams)
of 15 vials from firm A and 12 vials from firm B. (See Table 13.4.) Standard devia-
tions were calculated to determine whether there was a difference in variation between
the vendors. The results were sA = 2.52 and sB = 1.24. An F test was used to determine
whether the observed difference could be attributed to chance, or was real, with an
α = 0.05 level of risk.

Here

       F = (2.52)² / (1.24)² = 4.13

And, since this is a two-sided test with 14 and 11 degrees of freedom, the ratio of
the larger variance over the smaller is compared to a critical value

       F0.025 = 3.21

to achieve the α = 0.05 risk. Since 4.13 exceeds this value, we conclude that firm A has
the greater variation, provided the assumptions of the F test are met. This is discussed
further in Case History 13.3.
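The case history's comparison can be sketched as follows (an illustrative sketch; the critical value 3.21 is the one quoted above from the F table, and the function name is ours):

```python
def f_test_two_sided(s1, s2, f_crit_upper):
    """Two-sided F test on two sample standard deviations: the larger
    variance goes in the numerator and is compared against the upper
    alpha/2 critical value."""
    F = max(s1, s2) ** 2 / min(s1, s2) ** 2
    return F, F > f_crit_upper

# Vial weights: sA = 2.52 (15 vials), sB = 1.24 (12 vials)
F, significant = f_test_two_sided(2.52, 1.24, f_crit_upper=3.21)  # F ~ 4.13
```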

4.7 PRACTICE EXERCISES


1. Apply the range-square-ratio test to the first and second halves of the mica
data shown in Figure 2.5. Is there a difference in variation undetected by the
control chart?

2. Given the data in Example 4.1, compute R̄ for machine 49, using subgroups
   of r = 5. Also, compute R̄ for machine 56 by the same method. Compute σ̂
   for each machine using Equation 4.6, and compare the results with those
   obtained by the authors. Explain the reason for the differences. Which is
   more efficient? Why?
3. Assume that the following additional data are collected for the comparison
of Example 4.1:

Machine 49: 4.01 5.08 4.40 4.25 4.20


Machine 56: 5.30 4.10 3.78 4.25 4.33
Recompute s2 for each machine.
Perform and analyze an F test. State your conclusion, including the
approximate p value. Define and state the value of n, k, and r. Use
α = 0.05.
4. If you have a calculator or computer with a built-in function, use it to
compute s for each subgroup for each machine using Equation 4.2 and then
pool, using Equation 4.8. (Check to be sure that your method uses n – 1
rather than n in the denominator of Equation 4.2.) Compare your pooled s
with s obtained by the authors and explain the likely cause of the difference.

5. Consider Example 4.2. Analyze the authors’ statement that melt A is actually
from two sources. How can this assertion be proven statistically? Perform the
appropriate test(s) and draw conclusions. (Note: You may need to refer to
Chapter 3, Chapter 13, or an appropriate statistics text.)
6. Demand for part number XQZ280 is such that you have had to run them on
two process centers. Because of an engineering interest in the 2.8000 cm
dimension you have taken the following five samples of five dimensions each
from the two process centers (in order of their production occurrence).
a. Eliminate any outliers (α = 0.05) based on all 25 unit values from each
process center.
b. Formulate hypothesis H0 of no difference and use the F test to attempt
to reject H0.
c. Plot the data.

Sample no.
Process 1 2 3 4 5
Center A: 2.8000 2.8001 2.7995 2.8014 2.8006
2.8001 2.8012 2.8006 2.8000 2.8009
2.8006 2.8015 2.8002 2.8005 2.7996
2.8005 2.8002 2.8003 2.8003 2.8000
2.8005 2.8010 2.8009 2.7992 2.7997
Center B: 2.7988 2.7985 2.7995 2.8004 2.8001
2.7980 2.7991 2.7993 2.8001 2.8004
2.7989 2.7986 2.7995 2.8002 2.8007
2.7987 2.7990 2.7995 2.8004 2.8003
2.7985 2.7994 2.7995 2.7997 2.8012
5
Attributes or Go/No-Go Data

5.1 INTRODUCTION
In every industry, there are important quality characteristics that cannot be measured,
or which are difficult or costly to measure. In these many cases, evidence from mechan-
ical gauges, electrical meters used as gauges, or visual inspection may show that some
units of production conform to specifications or desired standards and that some do not
conform. Units that have cracks, missing components, appearance defects or other
visual imperfections, or which are gauged for dimensional characteristics and fail to
conform to specifications may be recorded as rejects, defectives, or nonconforming
items.1 These defects may be of a mechanical, electronic, or chemical nature. The number
or percentage of defective units is referred to as attributes data; each unit is recorded
simply as having or not having the attribute.
Process improvement and troubleshooting with attributes data have received rela-
tively little attention in the literature. In this book, methods of analyzing such data
receive major consideration; they are of major economic importance in the great major-
ity of manufacturing operations. (See especially Chapter 11.)

5.2 THREE IMPORTANT PROBLEMS


The ordinary manufacturing process will produce some defective2 units. When random
samples of the same size are drawn from a stable process, we expect variation in the num-
ber of defectives in the samples. Three important problems (questions) need consideration:

1. Some nonconforming items may be sold as “seconds”; others reworked and retested; others scrapped
and destroyed.
2. If a unit of production has at least one nonconformity, defect, or flaw, then the unit is called nonconforming. If
it will not perform its intended use, it is called defective. This is in conformity with the National Standard ANSI/
ISO/ASQC A3534-2-2004. In this book, the two terms are often used interchangeably.


1. What variation is expected when samples of size n are drawn from a
   stable process?
2. Is the process stable in producing defectives? This question of stability is
important in process improvement projects.
3. How large a sample is needed to estimate the percent defective in a
warehouse or in some other type of population?

Discussion of the Three Questions


What can be predicted about the sampling variation in the number of defectives found
in random samples of size n from a stable process? There are two possibilities to con-
sider: first, the percent defective3 is assumed known. This condition rarely occurs in
real-life situations. Second, the process percent defective is not known but is esti-
mated from k samples, k ≥ 1, where each sample usually consists of more than one
unit. The samples may or may not all be of the same size. This is the usual problem
we face in practice.

Binomial Theorem
Assume that the process is stable and that the probability of each manufactured unit
being defective is known to be p. Then the probability of exactly x defectives in a ran-
dom sample of n units is known from the binomial theorem. It is

       Pr(x) = [n! / (x!(n − x)!)] p^x q^(n−x)     (5.1)

where p is the probability of a unit being defective and q = 1 – p is the probability of


it being nondefective. It can be proved that the expected number of defectives in the
sample is np; there will be variation in the number that actually occurs.
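Equation (5.1) translates directly into code (a sketch; `math.comb` supplies the binomial coefficient n!/(x!(n − x)!)):

```python
from math import comb

def binom_pmf(x, n, p):
    """Pr(exactly x defectives in a random sample of n), Equation (5.1)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# The expected number of defectives is np: here 10 * 0.5 = 5
expected = sum(x * binom_pmf(x, 10, 0.5) for x in range(11))
```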

Example 5.1
When n = 10 and p = q = 0.5, for example, this corresponds to tossing 10 ordinary
coins4 and counting the number of heads or tails on any single toss of the 10.
Probabilities have been computed and are shown in Table 5.1 and plotted in
Figure 5.1. The expected or most probable number of defectives in a sample of 10
with p = q = 0.5 is np = 5. Also, it is almost as likely to have 4 or 6 heads as to have
5 and about half as likely to have 3 or 7 heads as 5.

3. The percent defective P equals 100p where p is the fraction nonconforming in the process.
4. Or of tossing a single coin 10 times.
Chapter 5: Attributes or Go/No-Go Data 129

Table 5.1 Probabilities Pr(x) of exactly x heads in
10 tosses of an ordinary coin.

Pr(0) = Pr(10) = (0.5)^10 = 0.001
Pr(1) = Pr(9) = 10(0.5)^10 = 0.010
Pr(2) = Pr(8) = 45(0.5)^10 = 0.044
Pr(3) = Pr(7) = 120(0.5)^10 = 0.117
Pr(4) = Pr(6) = 210(0.5)^10 = 0.205
Pr(5) = 252(0.5)^10 = 0.246

[Figure: bar chart of Pr(x) against number of heads x = 0 to 10, for n = 10 and p = 0.5, peaking at 0.246 for x = 5.]

Figure 5.1 Probabilities of exactly x heads in 10 tosses of an ordinary coin (n = 10, p = 0.50).

The sum of all probabilities from Pr (0) to Pr (10) is one, that is, certainty. The com-
bined probability of 0, 1, or 2 heads can be represented by the symbol: Pr (x ≤ 2). Also,
Pr (x ≥ 8) represents the probability of eight or more. Neither of these probabilities is large:

Pr (x ≤ 2) = Pr (x ≥ 8) = 0.055

These represent two tails or extremes in Figure 5.1. For example, we can state that
when we make a single toss of 10 coins, we predict that we shall observe between three
and seven heads inclusive, and our risk of being wrong is about

0.055 + 0.055 = 0.11 or 11 percent

We expect to be right 89 percent of the time.



Binomial Probability Tables for Selected Values of n


Decimal values of probabilities are tedious to compute from Equation (5.1) even for
small values of n. A computer or calculator for direct computation is not always avail-
able. Consequently, a table of binomial probabilities is included (Table A.5) for
selected values of n and p. Values in the table are probabilities of c or fewer defective
units in a sample of n when the probability of occurrence of a defective is p for each
item. The sum of the values I and J shown as the row and column headings for a spe-
cific probability give the value of c. This is only of importance when J is not zero.
When J is zero, the row headings I give c directly and the table corresponds to the usual
binomial listing. When J is shown to be non-zero, the value of J shown must be added
to I to produce c. This allows the table to be dramatically condensed compared to other
binomial tables. A heavy vertical line shows J to be non-zero for the values to the right
of the line.
Values in Table A.5 corresponding to c are accumulated values; they represent
Pr (x ≤ c). For example, the probability of three or fewer heads (tails) when n = 10 and
p = 0.5 is

Pr (x ≤ 3) = Pr (x = 0) + Pr (x = 1) + Pr (x = 2) + Pr (x = 3) = 0.172

Here c = 3 and, since J = 0 for p = 0.5, the probability is read from the table using
row I = 3 and p = 0.5 to obtain Pr (x ≤ 3) = 0.172.
Now suppose it is desired to find the probability of three or fewer heads when n = 50
and p = 0.20. Using Table A.5, we see that J = 2 for p = 0.20. Therefore, we would find
the desired probability by using a row heading of I = 1, since I + J = 1+ 2 = 3. This gives
Pr (x ≤ 3) = 0.005.
The steps to find the probability of c or fewer occurrences in a sample of n for a
given value of p are then as follows:
1. Find the listing for the sample size n (for the previous example, n = 50).
2. Find the value of p desired at the top of the table (here p = 0.20).
3. Observe the value of J shown under the column for p (here J = 2).
4. Subtract J from c to obtain I (here I = 3 – 2 = 1).
5. Look up the probability using the values of I and J obtained (here
Pr (x ≤ 3) = 0.005).
Notice that these steps collapse to using the row values I for c whenever J is found
to be zero, that is, for all p values to the left of the vertical line.
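Where Table A.5 is not at hand, the same cumulative probabilities can be computed directly. This is a sketch; small differences from the condensed table's last printed digit are to be expected, in line with the rounding discrepancies the text notes:

```python
from math import comb

def binom_cdf(c, n, p):
    """Pr(x <= c): the cumulative binomial probability tabled in Table A.5."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

prob_10 = binom_cdf(3, 10, 0.5)   # ~ 0.172, matching the first example
prob_50 = binom_cdf(3, 50, 0.20)  # ~ 0.0057 (Table A.5 lists 0.005 here)
```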

Example 5.2
Assume a stable process has been producing at a rate of 3 percent defective; when we
inspect a sample of n = 75 units, we find six defectives.
Question: Is finding as many as six defectives consistent with an assumption that
the process is still at the 3 percent level?
Answer: Values of Pr (x ≤ c) calculated from the binomial are shown in Table 5.2
and again in Figure 5.2. The probability of finding as many as six is seen to be small; it
is represented by

Pr (x ≥ 6) = 1 – Pr (x ≤ 5) = 1 – 0.975 = 0.025

The probability for individual values may be found using the value Pr (c) = Pr (x =
c) = Pr (x ≤ c) – Pr (x ≤ c – 1) from the values given in Table A.5.

Table 5.2 Probabilities of exactly x occurrences


in n = 75 trials and p = 0.03.
Pr (0) = .101
Pr (1) = .236
Pr (2) = .270
Pr (3) = .203
Pr (4) = .113
Pr (5) = .049
Pr (6) = .018
Pr (7) = .005
Pr (8) = .001

[Figure: bar chart of Pr(x) against x = 0 to 7 for n = 75 and p = 0.03, peaking at 0.270 for x = 2.]

Figure 5.2 Probabilities of exactly x defectives in a sample of 75 from an infinite population with p = 0.03.

Each individual probability above may have a rounding discrepancy of ±0.0005.


Also, cumulative probabilities may have a rounding discrepancy; such discrepancies
will be common, but of little importance. Thus, it is very unlikely that the process is still
at its former level of 3 percent. There is only a 2.5 percent risk that an investigation
would be unwarranted. If a process average greater than 3 percent is economically
important, then an investigation of the process should be made.
When we obtain k samples of size ng from a process, we count the number of defec-
tives in each sample found by inspection or test. Let the numbers be

d1, d2, d3, . . . , dk

Then the percent defective in the process, which is assumed to be stable, or in a pop-
ulation assumed to be homogeneous, is estimated by dividing the total number of defec-
tives found by the total number inspected, which gives the proportion defective p.
Multiplying p by 100 gives the percent defective P.

       p̂ = p = Σdi / Σng     and     P̂ = 100p̂     (5.2)

We do not expect these estimates5 of the process average to be exactly equal to it,
nor shall we ever know exactly the “true” value of P.

A Measure of Variability for the Binomial Distribution


When n is large and p and q are each larger than, say, 5 percent and

np ≥ 5 or 6 and nq ≥ 5 or 6 (5.3)

then the binomial distribution closely approximates the continuous normal curve
(Figure 1.2). The values in Equation (5.3) are guidelines and not exact requirements.
The computation of σ̂ for the binomial (attribute data) can be a much simpler operation than when computing one for variables (Table 1.3). After the value of p̂ is obtained
as in Equation (5.2), the computation is made from the following formulas.6

       σ̂p = √[ p̂(1 − p̂)/n ]       for proportion (fraction) defective     (5.4)

       σ̂P = √[ P̂(100 − P̂)/n ]     for percent defective                   (5.5)

5. We sometimes will not show the “hat” over either p or P where its use is clear from the context.
6. This is proved in texts on mathematical statistics.

       σ̂np = √[ np̂(1 − p̂) ]       for number defective                    (5.6)

Thus, knowing only the process level p̂ and the sample size n, we compute stan-
dard deviations directly from the appropriate formula above. When the population
parameters n and p are known or specified, the formulas remain the same with p
replacing p̂.

Example 5.3
A stable process is estimated to be producing 10 percent nonconforming items.
Samplings of size n = 50 are taken at random from it. What is the standard deviation of
the sampling results?
Answer: Any one of the following, depending upon the interest.

       σ̂p = √[(0.10)(0.90)/50] = 0.0424     σ̂P = 4.24%     σ̂np = 2.12
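Equations (5.4) and (5.6) are one-liners in code; the sketch below reproduces Example 5.3 (the function names are ours):

```python
import math

def sigma_p(p, n):   # Equation (5.4): proportion defective
    return math.sqrt(p * (1 - p) / n)

def sigma_np(p, n):  # Equation (5.6): number defective
    return math.sqrt(n * p * (1 - p))

# Example 5.3: stable process at p = 0.10, samples of n = 50
sp = sigma_p(0.10, 50)               # ~ 0.0424, i.e. 4.24 percent
snp = sigma_np(0.10, 50)             # ~ 2.12
three_sigma = (50 * 0.10 - 3 * snp,  # Equation (5.7) band: about 0 to 11
               50 * 0.10 + 3 * snp)
```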

Question 1. What Variation Is Expected?


Assuming a stable process with p̄ = 0.10, the expected variation in the number of defectives in samples of n = 50 can be represented in terms of μ̂ = np and σ̂np. Just as in
Equation (1.10) for the normal curve, when the conditions of Equation (5.3) are applicable, the amount of variation is predicted to be between

       np − 3σnp and np + 3σnp     (99.7% probability)
       np − 2σnp and np + 2σnp     (95.4%)               (5.7)
       np − σnp and np + σnp       (68.3%)

Thus, in Example 5.3 with n = 50 and p = 0.10, the average number expected is
np = 5 and the standard deviation is σ̂np = 2.12. From Equation (5.7), we can predict the
variation in samples to be from

       5 − 6.36 to 5 + 6.36, that is, 0 to 11 inclusive (99.7% probability)

or

       5 − 4.24 to 5 + 4.24, that is, 1 to 9 inclusive (95.4% probability)

Just as easily, the expected sampling variation in p or P can be obtained from
Equations (5.4) and (5.5).

Question 2. Is the Process Stable?


A simple and effective answer is available when we have k samples of ng each by mak-
ing a control chart for fraction or percent defective. This chart is entirely analogous to
a control chart for variables.
In routine production, control limits are usually drawn using 3-sigma limits. Points
outside these limits (outages) are considered evidence of assignable causes. Also, evi-
dence from runs (Chapter 2) is used jointly with that from outages, especially in process
improvement studies, where 2-sigma limits are usually the basis for investigations.
We expect almost all points that represent samplings from a stable process to fall
inside 3-sigma limits. If they do not, we say “The process is not in control” and there is
only a small risk of the conclusion being incorrect. If they do all fall inside, we say “The
process appears to be in statistical control” or “The process appears stable”; this is not
the same as saying “It is stable.” There is an analogy. A person is accused of a crime:
The evidence may (1) convict the defendant of guilt, and we realize there is some small
chance that justice miscarried, or (2) fail to convict, but this is not the same as believ-
ing in the person’s innocence. The “null hypothesis” (1) is that of no difference.

Example 5.4
(Data from Table 11.18, machine 1 only.)
Final inspection of small glass bottles was showing an unsatisfactory reject rate. A
sampling study of bottles was made over seven days to obtain evidence regarding stabil-
ity and to identify possible major sources of rejects. A partial record of the number of
rejects, found in samples of ng = 120 bottles taken three times per day from one machine,
has been plotted in Figure 5.3. The total number of rejects in the 21 samples was 147; the
total number inspected was n = 7(3)(120) = 2520. Then P = 0.0583(100) = 5.83 percent.
When considering daily production by 8-hour shifts, the sample of 15 inspected per
hour totals ng = 8(15) = 120. Then

       σ̂P = √[(5.83)(94.17)/120] = √4.575 = 2.14%

The upper 3-sigma limit is: 5.83 + 3(2.14) = 12.25 percent. There is no lower limit
since the computation gives a negative value.
Discussion
Was the process stable during the investigation?
Outages: There is one on August 17.
Runs: The entire set of data is suggestive of a process with a gradually increasing P.
This apparent increase is not entirely supported by a long run. The run of five above at
the end (and its two preceding points exactly on the median and average) is “suggestive

[Figure: p chart of percent defective in samples of ng = 120, three per day over August 12–18, with P̄ = 5.83% and UCL = 12.25% (no lower limit); a right-hand scale gives np = number defective (0 to 21). The August 17 point falls above the UCL.]

Figure 5.3 A control chart record of defective glass bottles found in samples of 120 per shift
over a seven-day period. (See Example 5.4.)

support” (Table A.3); and the six below the median out of the first seven strengthens the
notion of an increasingly defective process.
Both the outage and the general pattern indicate an uptrend.
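The limit calculation in Example 5.4 is mechanical enough to script (an illustrative sketch; the max(0, …) step encodes the "no lower limit when the computation goes negative" rule):

```python
import math

def p_chart_limits(total_defective, total_inspected, ng):
    """Center line and 3-sigma control limits, in percent, for a p chart."""
    P = 100.0 * total_defective / total_inspected
    sigma_P = math.sqrt(P * (100.0 - P) / ng)
    lcl = max(0.0, P - 3.0 * sigma_P)  # negative computed limit -> no LCL
    ucl = P + 3.0 * sigma_P
    return P, lcl, ucl

# 147 rejects in 21 samples of 120 bottles: P-bar = 5.83%, UCL = 12.25%
P, lcl, ucl = p_chart_limits(147, 2520, ng=120)
```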

Question 3. How Large a Sample Is Needed?


How many units are needed to estimate the percent defective in a warehouse or in some
other population or universe? A newspaper article presented forecasts of a scientific sur-
vey of a national presidential election based on 1400 individual voter interviews. The
survey was conducted to determine the expected voting pattern of 60 to 80 million voters.
The chosen sample size is not determined on a percentage basis.
The question of sample size was discussed in Section 1.11 for variables data. The
procedure with attributes is much the same but differs in detail.
From a statistical viewpoint, we must first adopt estimates of three quantities:
1. The magnitude of allowable error ∆ in the estimate we will obtain for P.
Do we want the answer to be accurate within one percent? Or within three
percent? Some tentative estimate must be made.
2. What is a rough guess as to the value of P? We shall designate it here by P̂.
3. With what assurance do we want to determine the region within which P is
to be established? Usually about 95 or 90 percent assurance is reasonable.
These two assurances correspond to ±2σ̂ and ±1.64σ̂.

Answer: The basic equations to determine sample size in estimating P in a population, allowing for possible variation of ±∆, are:

       ±∆ = ±3σ̂P       if we insist on 99.7 percent confidence
       ±∆ = ±2σ̂P       if about 95 percent confidence is acceptable
       ±∆ = ±1.64σ̂P    if we accept about a 90 percent confidence level

The second of these equations may be rewritten as

       ∆ = 2√[ P̂(100 − P̂)/n ]

which simplifies to

       n = 4P̂(100 − P̂)/∆²           (95% confidence)     (5.8a)

Similarly, from the third and first equations above, we have

       n = (1.64)²P̂(100 − P̂)/∆²     (90% confidence)     (5.8b)

       n = 9P̂(100 − P̂)/∆²           (99.7% confidence)   (5.8c)

In general,

       n = (Zα/2)² P̂(100 − P̂)/∆²

Note that α is the complement of the confidence level. When P̂ is unknown, use
P̂ = 50 percent for a conservatively large estimate of sample size.

A little algebra will show that for 95 percent confidence and P̂ = 50 percent,
Equation (5.8a) resolves into

       n = (100/∆)²     for P     (5.8d)

or, when dealing with proportions,

       n = (1/∆)²       for p     (5.8e)

These relations will give a handy upper bound on the sample size that can be easily
remembered. If we have an estimate of P or p, the sample size can be reduced accord-
ingly by using Equations (5.8a), (5.8b), or (5.8c).
Using Equation (5.8e), we have:

∆ n
.01 10,000
.02 2,500
.03 1,111
.04 625
.05 400
.10 100
.15 44
.20 25

which gives a very conservative estimate of how large sample sizes could be.
Other factors are very important, too, in planning a sampling to estimate a percent
defective within a population. We must plan the sampling procedure so that whatever
sample is chosen, it is as nearly representative as possible. How expensive is it to obtain
items from the process or other population for the sample? What is the cost of provid-
ing test equipment and operators? We would be reluctant to choose as large a sample
when the testing is destructive as when it is nondestructive. These factors may be as
important as the statistical ones involved in Equation (5.8). However, values obtained
from Equation (5.8) will provide a basis for comparing the reasonableness of whatever
sample size is eventually chosen.

Example 5.5
How large a sample is needed to estimate the percent P of defective glass bottles in a
second warehouse similar to the one discussed in Example 5.4? Preliminary data suggest
that P̂ ≅ 5 percent or 6 percent. Now it would hardly be reasonable to ask for an esti-
mate correct to 0.5 percent, but possibly to 1 percent or 2 percent. If we choose ∆ = 1
percent, a confidence of 95 percent, and P̂ = 5 percent, then

       n = 4(5)(95)/(1)² = 1900

If we were to increase ∆ to 2 percent, then

       n = 4(5)(95)/(2)² = 475

Decision
A representative sample of 1900 is probably unnecessary, too expensive, and too
impractical. A reasonable compromise might be to inspect about 475 initially. Then
whatever percent defective is found, compute P ± 2ŝ P. Be sure to keep a record of all
different types of defects found; a sample of 475 probably will provide more than
enough information to decide what further samplings or other steps need be taken.
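Equations (5.8a)–(5.8c) collapse into one small function (a sketch; the dictionary of Z-multipliers simply mirrors the three confidence levels discussed above):

```python
def sample_size(delta, P=50.0, confidence="95"):
    """Sample size to estimate percent defective P within +/- delta percent.
    With P unknown, the default P = 50 gives the conservative upper bound
    of Equation (5.8d)."""
    z = {"90": 1.64, "95": 2.0, "99.7": 3.0}[confidence]
    return z * z * P * (100.0 - P) / delta**2

n1 = sample_size(1.0, P=5.0)   # Example 5.5: 4(5)(95)/1 = 1900
n2 = sample_size(2.0, P=5.0)   # widening delta to 2 percent gives 475
n3 = sample_size(1.0)          # worst case P = 50: the tabled 10,000
```

With the default P = 50 and 95 percent confidence, the function reproduces the ∆-versus-n table given earlier.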

5.3 ON HOW TO SAMPLE


There Are Different Reasons for Sampling
Estimating the percent P in a warehouse or from a process are examples just discussed.
Now we shall discuss the frequency and size of samples necessary to monitor a pro-
duction process. This is a major issue in process control.
• When daily production from any one shift is less than, say, 500, and 100
percent inspection is already in progress, it may be sensible to initiate a
control chart with the entire production as the sample.7 The control limits
UCL and LCL can be computed and drawn on the chart for an average sample
size ng as a basis for monitoring daily production. The control limits can be
adjusted (recomputed) for any point corresponding to a substantially larger
or smaller sample; this would be warranted only for points near the computed
limits for average ng. A factor of two in sample size is a reasonable basis
for recomputing.
• When daily shift production is more than, say, 500 or when evidence suggests
process trouble, smaller random samples—checked by a special inspector—
can provide valuable information. Samples of ng = 50 or 100 can provide much
useful information; sometimes even smaller ones will be adequate as a starter.
The special inspector will not only record whether each item is defective but
will inspect for all defect categories and will record the major types of defects
and the number of each. See Case Histories 6.1, 6.2, and 6.3.
• How large a sample is necessary when starting a control chart with attributes?
In Case History 5.1, a chart was plotted daily from records of 100 percent
inspection. In Figure 5.6, we see that variation of daily production defectives
during January was as much as four or five percent or more above and below
average. If a variation of ∆ = 2 percent about P = 5.6 percent were now
accepted as a reasonable variation to signal a daily shift, and we choose
3-sigma limits to use in production, then from Equation (5.8c) it follows that

7. The entire production will be considered the population at times; but it is usually more profitable to consider it as
a sample of what the process may produce in the future, if it is a stable process.

       n = 9(5.6)(94.4)/(2)² = 1189 ≅ 1200

So a sample of about 1200 would be adequate for a total day’s production.


When samples are to be inspected hourly, then ng = 1200/8 = 150 becomes
a reasonable choice. If samples were to be inspected bihourly, then ng = 1200/4
= 300 would be indicated from a statistical viewpoint. However, a decision to
choose a smaller sample would be better than no sampling.
The question of whether a sampling system should be instituted, and the details
of both size and frequency of samples, should consider the potential savings
compared to the cost of sampling.
Conclusion: If the potential economic advantages favor the start of a sampling plan,
then initial samples of no larger than 150 are indicated. This would provide a feedback
of information to aid production. The sample size can later be changed—either smaller or
larger—as suggested by experience with this specific process.
There is another reason for sampling in production. Consider a shipment of items—either outgoing or incoming. How large a random sample shall we inspect, and how many defectives shall we allow in the sample and still approve the lot? This is the problem of acceptance sampling (see Chapter 6) for lot inspection, as opposed to the sampling used in monitoring production discussed above.

5.4 ATTRIBUTES DATA THAT APPROXIMATE A POISSON DISTRIBUTION
In Table A.5, binomial tables are provided for selected values up to n = 100. In this sec-
tion, we consider count data in the form of number of nonconformities counted in a
physical unit (or units) of a given size.

Example 5.6
A spinning frame spins monofilament rayon yarn;8 it has over a hundred spinarets. There
are occasional spinning stoppages at an individual spinaret because of yarn breakage. A
worker then corrects the breakage and restarts the spinaret. An observer records the num-
ber of stoppages on an entire spinning frame during a series of 15-minute periods. The
record over 20 time periods shows the following number of stoppages:

6, 2, 1, 6, 5, 2, 3, 5, 6, 1, 5, 6, 4, 3, 1, 3, 5, 7, 4, 5

with an average of four per period.

8. See Case History 11.3.


140 Part I: Basics of Interpretation of Data

This type of data is attributes or countable data. However, it differs from our pre-
ceding attributes data in the following ways:
1. There is no way of knowing the “sample size n”; the number of possible
breaks is “very large” since each spinaret may have several stoppages during
the same time period. Although we do not know n, we know it is potentially
“large,” at least conceptually.
2. There is no way of knowing the “probability p” of a single breakage; we do
know that it is “small.”
3. We do not have estimates of either n or p, but we have a good estimate
of their product. In fact, the average number of stoppages per period
is m̂ = np = 4.
There are many processes from which we obtain attributes data that satisfy these
three criteria: n is “large” and p is “small” (both usually unknown) but their average
product m̂ = np is known. Data of this sort are called Poisson type. A classical example of
such data is the number of deaths caused by the kick of a horse among different Prussian
army units. In this case, the number of opportunities for a kick and the probability of a
kick are both unknown, but the average number of fatalities turns out to be μ = 0.61 per
unit per year.

How Much Variation Is Predicted in Samples from a Poisson Distribution?
The question can be answered in two ways, both useful to know:
1. By computing σ̂c. From Equation (5.6), with

   σ̂c = √[np(1 − p)]

we have, for the Poisson with p small, and therefore (1 – p) ≅ 1,

   σ̂c ≅ √(np) = √μ̂     (5.9)

where μ̂ is the average number of defects per unit.


Discussion: In the artificial data of Example 5.6, we are given μ̂ = np = 4.
Then from Equation (5.9),

   σ̂ = √4 = 2

Consequently, variation expected from this Poisson type of process, assumed stable, can be expected to extend from

   4 − 2σ̂ to 4 + 2σ̂ (about 95 percent probability)

that is, from 0 to 8.
2. From Poisson curves (Table A.6). These very useful curves give probabilities of
c or fewer defects for different values of μ = np. The value of Pr (≤ 8), for the
previous example, is estimated by first locating the point μ = np = 4 on the base
line; then follow the vertical line up to the curve for c = 8. Then, using a plastic
ruler, locate Pr (≤ 8) on the left-hand vertical scale; it is about 0.98.
The Poisson formula for the probability of x outcomes when the mean
is μ becomes

   Pr(x) = μ^x e^(−μ) / x!

where e = 2.71828, and its cumulative values are what have been plotted.
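The value read from the curves can also be checked by summing the Poisson terms directly; a short Python sketch (the function name is ours, not from the text):

```python
import math

def poisson_cdf(c: int, mu: float) -> float:
    """Pr(x <= c) for a Poisson distribution with mean mu."""
    return sum(mu ** x * math.exp(-mu) / math.factorial(x) for x in range(c + 1))

# Pr(x <= 8) when the mean number of stoppages is mu = 4
print(round(poisson_cdf(8, 4.0), 4))  # 0.9786, in agreement with the curve reading
```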

Example 5.7
The question of whether the process that generated the defects data is stable is answered
by the c control chart. To construct such a chart, it is, of course, necessary to estimate
the average number of defects c̄ found as

   μ̂ = c̄

Limits then become

   UCLc = c̄ + 3√c̄
   CL = c̄
   LCLc = c̄ − 3√c̄

so for the data on stoppages of the spinning frame we have

   μ̂ = c̄ = 4
   UCLc = 4 + 3√4 = 4 + 6 = 10
   CL = 4
   LCLc = 4 − 3√4 = 4 − 6 = −2 ∼ 0
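The c-chart arithmetic is easily mechanized; a minimal sketch in Python (the helper name is ours, not from the book):

```python
import math

def c_chart_limits(c_bar: float):
    """3-sigma limits for a c chart; a negative LCL is suppressed (None)."""
    half_width = 3 * math.sqrt(c_bar)
    lcl = c_bar - half_width
    return (None if lcl < 0 else lcl), c_bar, c_bar + half_width

lcl, cl, ucl = c_chart_limits(4)
print(lcl, cl, ucl)  # None 4 10.0
```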

[Figure: c chart for ng = 100, with CL = 4 and UCLc = 10; defect counts plotted for samples 1 through 20.]

Figure 5.4 c chart on stoppages of spinning frame.

Note that no sensible practitioner would construct a chart showing −2 as a lower
limit in the count of defects. In such a case as this, we simply do not show a lower limit,
since a value of 0 is obviously within the 3σ̂ spread. The chart is shown in Figure 5.4.
Now, the stoppage data showed the number of stoppages of the spinning frame
itself, which consisted of 100 spinarets. Sometimes it is desirable to plot a chart for
defects per unit, particularly when the number of units ng involved changes from
sample to sample. In such a case, we plot a u chart where

   u = c/ng = defects per unit

and the limits are

   UCLu = ū + 3√(ū/ng)
   CL = ū
   LCLu = ū − 3√(ū/ng)

Obviously, when ng = 1, the limits reduce to those of the c chart. If we adjust the
data for the ng = 100 spinarets involved in each count, we obtain the following data in
terms of stoppages per spinaret u.

0.06, 0.02, 0.01, 0.06, 0.05, 0.02, 0.03, 0.05, 0.06, 0.01, 0.05, 0.06,
0.04, 0.03, 0.01, 0.03, 0.05, 0.07, 0.04, 0.05

and the limits are



[Figure: u chart for ng = 100, with CL = 0.04 and UCL = 0.10; defects per unit plotted for samples 1 through 20.]

Figure 5.5 u chart on stoppages of spinning frame.

   UCLu = 0.04 + 3√(0.04/100) = 0.04 + 0.06 = 0.10
   CL = 0.04
   LCLu = 0.04 − 3√(0.04/100) = 0.04 − 0.06 = −0.02 ∼ 0

which produces a chart similar to the c chart since n is constant. (See Figure 5.5.)
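The same u-chart arithmetic can be sketched in Python (the helper name is ours, not from the book):

```python
import math

def u_chart_limits(u_bar: float, ng: int):
    """3-sigma limits for a u chart with ng units per sample; LCL floored at 0."""
    half_width = 3 * math.sqrt(u_bar / ng)
    return max(u_bar - half_width, 0.0), u_bar, u_bar + half_width

lcl, cl, ucl = u_chart_limits(0.04, 100)
print(lcl, cl, round(ucl, 2))  # 0.0 0.04 0.1
```

Note that with ng = 1 the half-width reduces to 3√ū, giving the c-chart limits, just as the text observes.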
Discussion: Processes and events representable by Poisson distributions are quite
common.

Case History 5.1


Defective Glass Stems in a Picture Tube for a Color TV Set9
A 100 percent inspection was made following the molding process on machine A. A
record of lot number, the number defective, and the percent defective for each lot is
shown at the bottom of Figure 5.6. A chart of the daily P values is plotted directly
above the computed values. The inspector (only one) was instructed to note operating
conditions and any problems observed. It proved possible to explain why things were
bad more often than why things were good.
Two types of visual defects predominated:
1. Cracked throat, the major problem, caused by stems sticking in a mold,
probably from “cold” fires.
2. Bubbles, caused by “hot” fires.

9. Courtesy of Carl Mentch, General Electric Company.



All reject types have been combined for this present analysis and discussion.
The control limits in Figure 5.6 have been computed for n̄g = 66,080/26 ≅ 2,540 and
P̄ = 100(3,703/66,080) = 5.60 percent.

Product: 60 60 90 glass stem assembly    Date: January
Dept.:    Operator:    Machine: A
Inspected for: Visual (mainly cracked throat)    Inspector: JAS    Limits by: CM
Note: Numbers in parentheses refer to notes originally recorded on back of this record sheet. Now see Case History 3.1, discussion 8.

[Figure: P control chart of the daily percent defective for January, with UCL = 6.97%, P̄ = 5.60%, and LCL = 4.23%, computed for n̄g = 2540. Parenthesized note numbers (1) through (6) flag individual points on the chart.]

The daily record on the form, reconstructed:

Lot (Jan)   Number in lot   Number defective   Percent defective, P
2           2145            202                 9.42
3           2876            104                 3.62
5           2607            178                 6.83
6           2424            102                 4.21
7           2786            170                 6.10
8           2239             83                 3.71
9           2449            488                19.93
10          2745             97                 3.53
12          3268              8                 0.24
13          2885             56                 1.94
14          2681             46                 1.72
15          1414            158                11.17
16          2829            239                 8.45
17          3052            226                 7.40
19          2874            122                 4.24
20          2850            219                 7.68
21          2051            205                10.00
22          2379             94                 3.95
23          1955             98                 5.01
24          1971            125                 6.34
26          2120             80                 3.77
27          2476            173                 6.99
28          2630             59                 2.24
29          2948            165                 5.60
30          2966             89                 3.00
31          2460            117                 4.76
Total       66,080          3703                5.60
(Average n̄g = 2540)

Figure 5.6 Form to record inspection by attributes. (See Case History 5.1.)

Discussion
The day-to-day variation is large; the percent of rejects was substantially better than the
month’s average on at least 11 different days. When possible reasons for these better
performances were explored, not much specific evidence could be produced:
• It is not surprising to have excessive rejects on the day after New Year’s or
at the startup of the process.
• It was believed that the fires were out of adjustment on January 9, producing
almost 20 percent rejects. Fires were adjusted on the tenth, and improved
performance was evident on the next four days (January 10 through 14).
• Notes and their number as recorded originally by inspector on back of
Figure 5.6:
1. Day after New Year’s shutdown
2. Reason that these points were out of control is unknown
3. Four fires known to be out of adjustment
4. Fires readjusted to bogie settings on 10th
5. Stems sticking in mold 4
6. Fire position 6 too hot
• On January 15, rejects jumped again (to 11 percent). An investigation showed
stems sticking in mold 4. This condition was improved somewhat over the
next three days.
• On January 21, fire position 6 on the machine was found to be too hot.
An adjustment was made early on January 22, and the process showed an
improvement in the overall average and with less erratic performance for the
remainder of January.
• When this study was begun, the stem machine was considered to be in
need of a major overhaul. This chart shows that either the machine easily
lost its adjustment or possibly it was overadjusted. At any rate, the machine
was shut down and completely rebuilt during February, and then started
up again.
• Since the methods used to adjust the machine during January were not very
effective, perhaps it would now be helpful to inspect a sample of stems hourly
or bihourly and post the findings. This systematic feedback of information
to production would be expected to do several things: prevent a machine
operating unnoticed all day at a high reject rate; signal both better and worse
performance, allowing production and engineering to establish reasons for the
difference in performance.

• Sometimes, large variations as in Figure 5.6 are found to be a consequence
of differences in inspectors. Since only one inspector was involved here, it is
doubtful that this type of variation from day to day was a major contributor
to the variation.
• Calculation of control limits in Figure 5.6 for average n̄g ≅ 2540, attributes data.
From Equation (5.5),

   σ̂P = √[(5.60)(94.40)/2540] = 0.456%

So, with P̄ = 5.60% and 3σ̂P = 1.37%,

   UCLP = P̄ + 3σ̂P = 6.97%
   CL = P̄ = 5.60%
   LCLP = P̄ − 3σ̂P = 4.23%
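The P-chart computation can be replicated with a few lines of Python (a sketch; the function name is ours):

```python
import math

def p_chart_limits(p_bar_pct: float, n: int):
    """3-sigma limits, in percent, for a P chart with average sample size n."""
    sigma = math.sqrt(p_bar_pct * (100 - p_bar_pct) / n)
    return p_bar_pct - 3 * sigma, p_bar_pct, p_bar_pct + 3 * sigma

lcl, cl, ucl = p_chart_limits(5.60, 2540)
print(round(lcl, 2), round(ucl, 2))  # 4.23 6.97
```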

Case History 5.2


Incoming Inspection of a TV Component10
A relatively inexpensive glass component (a mount) was molded at one plant of a com-
pany. After a 100 percent inspection, it was transported to a second plant of the same
company. After a 100 percent inspection at the receiving plant, it was sealed into a TV
picture tube. This case history can be found on the CD-ROM in the file \Selected Case
Histories\Chapter 5\CH 5_2.pdf. This file includes Figure 5.7.

5.5 NOTES ON CONTROL CHARTS


Figures 5.6 and 5.7 are examples of good practice in the construction and use of control
charts (see also Figure 6.13). They both contain notes about operating conditions and
other causes that could be associated with out of control signals. On Figure 5.6, these
notes were originally written on the back of the chart, while Figure 5.7 provides space
for such remarks.

10. Courtesy of Carl Mentch, General Electric Company.




Notes such as these are just as valuable when applied to X̄ and R or other charts.
They facilitate discovery of the relation between physical circumstances and the behav-
ior of the chart. As such they are an aid in determining assignable causes.
Whether kept on the chart or in a logbook, it is a real help to have a chronological
description of the settings and other conditions that describe process operation. All too
often, however, in the heat of production, such a record may be omitted, ignored, or fal-
sified. But when appropriately used, notes on charts can be an important tool for process
improvement.

Case History 5.3


Notes on Gram Weight of a Tubing Process
An operator logbook can be a vital source of information on what is ailing a process—
as long as it is kept up to date. In this situation, a plant making glass tubing was expe-
riencing severe process instability. Plant process engineers were scratching their heads
trying to figure out what aspect of the process had changed. In this study, one of the
authors asked the operators to write down everything they did regardless of how
insignificant they thought it was.
A glass tube draw process involves the melting of glass in a large furnace and the
flow of glass through a forehearth, into a “bowl,” around a “bell,” and over a ring that
forms an envelope of glass in the shape of a tube as it is drawn into long lengths. The
newly formed tubing is drawn at great length as it cools, and then it is cut into “sticks”
at the end of the line. It is at this point that the inspector can collect and measure these
“sticks” to assess the control of the process.
This particular study started out as a simple process performance study11 intended
to evaluate the control of gram weight. Gram weight was considered to be a general
metric for how well the process was performing. If a relatively constant flow of glass
was going into the formation of the tubing, the gram weight (and consequently, dimen-
sions) of the glass should be in statistical control. Samples of size n = 3 “sticks” were
taken from the process and the sample average was plotted as in Figure 5.8.
The process shown in Figure 5.8 is clearly not in control. The logbook entries were
written on the chart by the author in an effort to help explain the source(s) of process
instability. Between 8:45 AM and noon on the first day, the operator makes several
process moves in an attempt to increase gram weight. At 12:45 PM the operator is satis-
fied that gram weight is back to normal and then turns on the automatic control of the
bell electric.
At 6:15 PM, the operator now observes that the bell amps gauge is moving and
allowing the temperature to go too far before the automatic control reacts. Increasing
the glass temperature will actually cause a reduction in wall thickness and gram weight

11. Process performance studies are discussed in Chapter 8.


[Figure: run chart titled “Gram Weight of Glass Tubing,” plotting average gram weight (roughly 160 to 200 g) against time of day from about 7:15 AM through the following morning. Logbook annotations written on the chart include: “Cut bowl temp”; “Put 4 pts on forehearth control”; “Increased RB (17.5 to 18.5 amps)”; “Increased RB (18.5 to 19 amps)”; “Shut bell electric off (changed clamp)”; “Bell on automatic control”; “Increased RB (19.5 to 21 amps)”; “Lit bottom muffle fires”; “Increase bottom muffle fires”; “Bell on manual control”; “Bell on automatic control”; “Shift change (RB @ 21 amps, bell @ 2.8 amps, on automatic control)”; “Bell amps on auto moving, letting temp go too far before it moves”; “Bell electric set point changed, gram weight dropping off, auto control not putting on bell electric @ 2.4 amps”; “Shift change”; “No bell electric, cut RB 0.5 pt to get some control on bell electric”; “Increased set point on bell electric by 6 pts”; “2.4 amps on bell electric, cut bowl 0.5 pt to get more bell electric”; “Cut bell electric set point by 4 pts”; “Cut bell electric set point by 1 pt” (twice); “Raised bell electric set point by 1 pt”; and a final “Shift change.” Marginal notes summarize the story: the operator notices that the automatic control is not working as it should, struggles to make process moves in an attempt to resuscitate the automatic control, and finally assumes the role of automatic control; the new approach stabilizes gram weight, but engineering still believes the process is under automatic control.]

Figure 5.8 Plot of average gram weight of n = 3 tubes/sample taken at 15-minute intervals. Logbook entries shown describe process moves made by operator.



(among other problems). Cutting the temperature quickly increases wall thickness and
gram weight. These adjustments can be seen between 5:30 PM and 6:00 PM, and 11:45 PM
and midnight.
At 9:30 PM, the operator sees gram weight gradually declining and decides to
change the bell electric set point and observes that “gram wt. dropping off, not putting
on bell electric at 2.4 amps.” Though a shift change occurs at 11:00 PM, the new operator
is now aware of the situation and observes at 11:15 PM that “No bell electric, cut RB
½ point to get some control on bell electric.”
Beginning at midnight, the operator now runs the process on manual control
through a succession of bell electric moves. The result is a well-controlled process as
seen in Figure 5.8 beginning at 3:15 AM. When the results of this study, coupled with
the logbook comments, were shown to the plant process engineering department, they
were astounded.
The lead process engineer believed that the process was always run under automatic
control and that process instability was due to other assignable causes that they were
looking for. Once it was determined that the automatic control did not work, the process
engineering department simply fixed the automated controller and the process instabil-
ity went away. Adjusting the level of gram weight to an appropriate target was then only
a simple manipulation of control parameters.

5.6 PRACTICE EXERCISES


1. Table 5.2 shows the probabilities of x occurrences in n = 75 trials, given
p = 0.03.
a. Find the probability that x will exceed np + 3 sigma and compare this
result with the statement of Equation 5.7. Explain the discrepancy.
b. Repeat this comparison for n = 50 and n = 25.
2. Adapt Equation 5.8 to find a sample size such that, with 50 percent confidence,
p = 20 percent defective material is estimated with a possible error of ±5
percent. Rework for p = 10 and 50 percent.
3. Consider Example 5.6. Use the Poisson probability curves to find the
following probabilities.

P(0 ≤ x ≤ 8 given m = 4) (m ± 2s)


P(2 ≤ x ≤ 6 given m = 4) (m ± s)

Compare these probabilities with the normal approximation with the same
mean and variance. Explain the discrepancy, if any.

4. Show numerically that the Poisson is the limit of the binomial (Equation 5.1)
as np remains constant, n approaches infinity, and p approaches 0. For X = 1,
start with n = 10, p = 0.1, then evaluate n = 100, p = 0.01, and finally use the
Poisson limit formula. Hint: Table A.5 gives sufficient accuracy in evaluating
Equation 5.1.
5. In Case History 5.1, the average sample size, n̄, is 2540. Recompute the limits
assuming the percent defective shown in column 4 of Figure 5.6 remains the
same, but the average size is reduced from 2540 to 100. Which points are now
out of control? Why has this changed? What does this say about sample size
requirements for p charts?
6. In Case History 5.2, there are two points, for 1-14 and 1-21, outside the UCL
of 1.23. However, the point for 1-21 has a special little control limit of 1.32 just
for it. Explain why this is and show how it was calculated.
7. In analyzing experiments, an alternative to simply accepting or rejecting the
null hypothesis is to compute the “p value” (also called “probability value” or
“significance level”) of an experimental outcome. This is the probability,
given that the null hypothesis is true, that the observed outcome or one more
extreme would occur. The computation of a p value can often aid in deciding
what action to take next. For example, see Figure 5.7; on January 21, the
percent defective was 1.2934 percent based on n = 1933.

   σ̂ = √[(0.734)(99.266)/1933] = 0.1941, with P̄ = 0.734%

   Z = (1.2934 − 0.734)/0.1941 = 2.88

   Pr(Z ≤ 2.88) = 0.9980, so p value = 1 − 0.9980 = 0.002

Using this same logic, compute p values for the following situations:
a. Binomial process, n = 30, p = 0.05, x = 5 (use Table A.5)
b. Poisson process, m = np = 2, x = 1, x = 4 (use Table A.6)
c. Normal process, m = 50, s = 10, x = 72 (use Table A.1)

d. Student’s t, m = 50, s = 40, n = 16, X̄ = 72 (interpolate in Table A.15)
8. Compare the actual probability of exceeding 3σ limits with that obtained from
the normal approximation to the binomial for small n and p.
9. Compare limits obtained by the Poisson and binomial distributions when the
approximation m = np is poor.
Part II
Statistical
Process Control
6
Sampling and
Narrow-Limit Gauging

6.1 INTRODUCTION1
Incoming inspection traditionally decides whether to accept an entire lot of a prod-
uct submitted by a vendor. Factors such as the reputation of the vendor, the urgency of
the need for the purchased material, and the availability of test equipment influence this
vital decision—sometimes to the extent of eliminating all inspection of a purchase.
In contrast, some companies even perform 100 percent inspection for nondestruc-
tive characteristics and accept only those individual units that conform to specifications.
However, 100 percent screening inspection of large lots does not ensure 100 percent
accuracy. The inspection may fail to reject some nonconforming units and/or reject
some conforming units. Fatigue, boredom, distraction, inadequate lighting, test equip-
ment variation, and many other factors introduce substantial errors into the screening
inspection.
Sampling offers a compromise in time and expense between the extremes of 100
percent inspection and no inspection. It can be carried out in several ways. First, a few
items, often called a “grab sample,” may be taken from the lot indiscriminately and
examined visually or measured for a quality characteristic or group of characteristics.
The entire lot may then be accepted or rejected on the findings from the sample.
Another procedure is to take some fixed percentage of the lot as the sample. This was
once a fairly standard procedure. However, this practice results in large differences in
protection for different sized lots. Also, the vendor can “play games” by the choice of

1. This chapter on sampling has been expanded to include narrow-limit gauging. The discussion of sampling,
however, remains largely unchanged from the first edition and still reflects the philosophy on acceptance
sampling of Ellis R. Ott. The coauthors are in complete agreement with Ott’s approach. See E. G. Schilling,
“An Overview of Acceptance Control,” Quality Progress (April 1984): 22–24 and E. G. Schilling, “The Role
of Acceptance Sampling in Modern Quality Control,” Communications in Statistics-Theory and Methods 14,
no. 11 (1985): 2769–2783.

154 Part II: Statistical Process Control

lot size, submitting small lots when the percent defective is large, and thus increasing
the probability of their acceptance.
This chapter presents and discusses the advantages and applications of scientific
acceptance sampling plans.2 These plans designate sample sizes for different lot sizes.
If the number of defective units found in a sample exceeds the number specified in the
plan for that lot and sample size, the entire lot is rejected. A rejected lot may be returned
to the vendor for reworking and improvement before being returned for resampling, it
may be inspected 100 percent by the vendor or vendee as agreed, or it may be scrapped.
Otherwise, the entire lot is accepted except for any defectives found in the sample.
Historically, the primary function of acceptance sampling plans was, naturally
enough, acceptance–rejection of lots. Application of acceptance sampling plans was a
police function; they were designed as protection against accepting lots of unsatisfac-
tory quality. However, the vendor and customer nomenclature can also be applied to a
shipping and receiving department within a single manufacturing plant or to any point
within the plant where material or product is received for further processing. Variations
in the use of scientific sampling plans are, for instance:
1. At incoming inspection in a production organization
2. As a check on a product moving from one department or process of a plant
to another
3. As a basis for approving the start-up of a machine
4. As a basis for adjusting an operating process or machine before approving
its continued operation
5. As a check on the outgoing quality of product ready for shipment to
a customer

6.2 SCIENTIFIC SAMPLING PLANS


The control chart for attributes discussed in Chapter 5 is one possible system of sur-
veillance in any of these situations. Another method is the use of tabulated sampling
plans as discussed in Section 6.10. Such plans are particularly useful while initiating
process control procedures, and as a means of disposition of product when the process
is out of control. Sampling plans must be used when tests are destructive, since 100 per-
cent inspection is obviously impossible under such conditions. Use of sampling plans
and process control procedures to supplement each other can provide an optimum of
protection and control at minimum cost.3

2. Various sampling plans in usage are referenced. The emphasis of this discussion is on some basic ideas related to
acceptance sampling plans.
3. Edward G. Schilling, “Acceptance Control in a Modern Quality Program,” Quality Engineering 3 (1990): 181–91.
Chapter 6: Sampling and Narrow-Limit Gauging 155

Consider the following single sampling plan applied to a process.

Plan: n = 45, c = 2

This notation indicates that:


• A random sample of n = 45 units is obtained—perhaps taken during the last
half-hour or some other chosen period, possibly from the last 1000 items
produced, or even the last 45 units if interest in the present status of the process
is primary. The quantity from which the sample is taken is called a lot.
• The 45 units are inspected for quality characteristic A, which may be a single
quality characteristic or a group of them. If it is a group of them, the character-
istics should usually be of about equal importance and be determined at the
same inspection station.
• If not more than two (c = 2) defective units are found in the sample, the entire
lot (or process) except for defectives is accepted for characteristic A; that is, the
entire lot is accepted if zero, one, or two defectives are found. When more than
two defectives are found in the sample, the lot is not acceptable.
In addition to this decision to accept or reject, a second important use of a plan is
as a feedback system, providing information to help production itself (or the vendor)
improve the quality of subsequent lots as produced. In any case, such plans often exer-
cise a healthy influence on the control of a process. This will be discussed in Sections
6.11 and 6.12.

6.3 A SIMPLE PROBABILITY


What may we expect to happen (on the average) if many successive samples from the
process are examined under the above plan, n = 45, c = 2? What fraction of samplings
will approve the process for continuance? To provide an answer, additional information
is required. Let us assume first, for example, that the actual process is stable and pro-
ducing five percent defective, that is, the probability that any single item is defective is
p = 0.05 or P = 5 percent.
Discussion: From Table A.5 of binomial probabilities, the probability4 PA of no more
than c = 2 defectives in a sample of 45 with p = 0.05 is

P(x ≤ 2) = 0.608 or 60.8%

4. The symbol PA (read “P sub A”) represents the “probability of acceptance.” When the acceptance plan relates to
an entire lot of a product, the PA is the probability that any particular lot will be accepted by the plan when it is
indeed the stated percent defective. We shall talk about accepting or rejecting a process as well as accepting or
rejecting a lot.
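The probability looked up in Table A.5 can also be computed directly from the binomial distribution; a Python sketch (the function name is ours, not from the text):

```python
from math import comb

def accept_prob(n: int, c: int, p: float) -> float:
    """Probability of acceptance: Pr(x <= c defectives in a random sample of n)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(c + 1))

# Plan n = 45, c = 2 applied to a stable process producing 5 percent defective
print(round(accept_prob(45, 2, 0.05), 3))  # 0.608
```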

6.4 OPERATING-CHARACTERISTIC CURVES OF A SINGLE SAMPLING PLAN
There are two areas of special interest in practice:
1. What happens when lots with a very small percentage of defective units are
submitted for acceptance? A reasonable plan would usually accept such lots
on the basis of the sample.
2. What happens when lots with a “large” percentage of defective units are
submitted? A reasonable plan ought to, and usually will, reject such lots on
the basis of the sample.
As in Section 6.3, values of PA have been obtained for selected values of P and
have been tabulated in Table 6.1 and graphed in Figure 6.1. The resulting curve is called
the operating-characteristic curve (OC curve) of the sampling plan, n = 45, c = 2.

Discussion: Regarding Figure 6.1


• When lots with less than 1 percent defective are submitted to the plan, only
occasionally will they be rejected; PA = 0.99.
• When lots with more than 10 percent defective are submitted, the probability
of acceptance is small, PA ≅ 0.10.
• Lots with P between 2 percent and 10 percent defective have a probability
of acceptance which drops sharply.
• Whether the plan n = 45, c = 2, is a reasonable plan to use in a particular
application has to be given consideration.

Table 6.1 Probabilities PA of finding x ≤ 2 in a sample of n = 45 for different values of p.
(Values from Table A.5, Binomial Probability Tables.)

P, in percent    PA = P (x ≤ 2)
0 1.00
1 0.99
2 0.94
3 0.85
4 0.73
5 0.61
6 0.49
8 0.29
10 0.16
15 0.03
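Table 6.1 can be regenerated by evaluating the binomial acceptance probability at each value of P; a sketch in Python (the helper name is ours):

```python
from math import comb

def accept_prob(n: int, c: int, p: float) -> float:
    """Pr(x <= c defectives in a sample of n) for lot fraction defective p."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(c + 1))

# Reproduce Table 6.1 for the plan n = 45, c = 2
for pct in (0, 1, 2, 3, 4, 5, 6, 8, 10, 15):
    print(f"P = {pct:2d}%   PA = {accept_prob(45, 2, pct / 100):.2f}")
```

Plotting PA against P traces out the OC curve shown in Figure 6.1.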

[Figure: OC curve for the single-sampling plan n = 45, c = 2; probability of acceptance PA (0 to 100 percent) on the vertical axis versus percent defective in submitted lots (0 to 16 percent) on the horizontal axis.]

Figure 6.1 Operating-characteristic curve of a single sampling plan for attributes (n = 45, c = 2).
The probability of a submitted lot being accepted, PA, is shown on the vertical axis
while different percent defectives in submitted lots are shown on the horizontal axis.
(Data from Table 6.1.)

6.5 BUT IS IT A GOOD PLAN?


Whether the plan n = 45, c = 2 is a sensible, economical plan for a specific application
involves the following points:
1. Sampling plans for process control and improvement should provide signals
of economically important changes in production quality, a deterioration
or improvement in the process, or other evidence of fluctuations. Are
signals important in this application during ordinary production or in a
process improvement project? Should a set of plans be devised to detect
important differences between operators, shifts, machines, vendors?
Improvements to industrial processes are often relatively inexpensive.
2. What are the costs of using this plan versus not using any?
a. What is the cost to the company if a defective item is allowed to proceed
to the next department or assembly? If it is simple and inexpensive to
eliminate defectives in subsequent assembly, perhaps no sampling plan is
necessary. If it is virtually impossible to prevent a defective from being
included in the next assembly, and if this assembly is expensive and is

ruined thereby, failure to detect and eliminate an inexpensive component
cannot be tolerated.
b. What is the cost of removing a defective unit by inspection? What is the
cost of improving it by reworking it? What is the possibility of improving
the process by reducing or eliminating defective components? When the
total cost of permitting P percent defective items to proceed to the next
assembly is equal to the cost of removing the defectives or improving the
process, there is an economic standoff. When the costs of sampling and the
possible consequent 100 percent screening are less than the alternative costs
of forwarding P percent defectives, some sampling plan should be instituted.
3. Does this sampling plan minimize the total amount of inspection? Acceptance
sampling plans are intended to reduce the amount of inspection in a rational
way, while considering quality levels and associated risks. Unnecessary
inspection is wasteful and counterproductive.
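The "economic standoff" of point 2b can be put into one line of arithmetic. The sketch below is a deliberately simplified illustration with hypothetical costs, not a formula from the text: if inspecting one unit costs k1, and a defective that escapes to the next assembly costs k2, the two costs balance at a fraction defective of p = k1/k2.

```python
def breakeven_fraction(k1, k2):
    """Fraction defective at which the per-unit screening cost equals the
    expected per-unit cost of forwarding defectives: k1 = p * k2."""
    return k1 / k2

# Hypothetical costs: $0.10 to inspect a unit; $4.00 if a defective escapes.
p_star = breakeven_fraction(0.10, 4.00)
print(f"Break-even at p = {p_star:.1%}")  # screening pays above 2.5% defective
```

Below this break-even point, forwarding P percent defectives is the cheaper course; above it, some sampling or screening plan should be instituted.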
In process control, we often use smaller sample sizes than required by a plan when
our dependence on quality relates only to the acceptance–rejection aspect of the plan.
Convenience sometimes stipulates the use of smaller sample sizes, referred to as con-
venience samples. Any sampling plan that detects a deterioration of quality and sends
rejected lots or records of them back to the producing department can have a most salu-
tary influence on production practices.
Many companies use one type of acceptance procedure on purchases and another
on work in process, or outgoing product. In contracting the purchase of materials, it is
common practice to specify the sampling procedure for determining acceptability of
lots. Agreement to the acceptance procedures may be as critical to the contract as price
or date of delivery.

6.6 AVERAGE OUTGOING QUALITY (AOQ) AND ITS MAXIMUM LIMIT (AOQL)
Two concepts will be discussed here with reference to a single-sampling rectification
inspection plan for attributes (nondestructive test) that includes the following steps:
1. A lot rejected by the plan is given a 100 percent screening inspection. All
defectives are removed and replaced by nondefectives; that is, the lot is
rectified. It is then resubmitted for sampling before acceptance.
2. Defectives are always removed when found and replaced by nondefectives,
even in the samples.
As a consequence of (1) and (2), the average outgoing quality (AOQ) of lots pass-
ing through the sampling station will be improved. Since very good lots will usually
be accepted by the sampling plan, their AOQ will be improved only slightly (see

Figure 6.2 Average outgoing quality (AOQ) compared to incoming percent defective P for
the plan n = 45, c = 2; the curve peaks at AOQL ≅ 3% when P ≅ 5% and lies below the
no-inspection line AOQ = P. (AOQ is after any lots that fail the inspection plan have been
100 percent inspected; see Table 6.2, columns 1 and 3.)

Figure 6.2). Lots with a larger percent of defectives will be rejected more often and
their defectives removed in screening. Their AOQ will be improved substantially
(Figure 6.2).
The worst possible average situation is represented by the height of the peak of the
AOQ curve. This maximum value is called the average outgoing quality limit (AOQL).
In Figure 6.2, AOQL ≅ 3 percent. This very useful AOQL concept forms the basis for a
system of sampling plans.
Any sampling plan will provide information on the quality of a lot as it is submitted—
certainly more information5 than if no inspection is done. But some plans require much
too large a sample for the need; some specify ridiculously small samples. The AOQL
concept is helpful in assessing the adequacy of the plan under consideration and/or indi-
cating how it could be improved. When a 3 percent AOQL system is used, the worst pos-
sible long-term average accepted will be 3 percent. However, this could occur only if
the producer always fed 5 percent defective to the sampling station. Except in this
unlikely event, the average outgoing quality6 will be less than 3 percent.

5. Other valuable information on a lot’s quality could be provided by a production control chart(s). These charts
are usually readily available to us only when the vendor is a department of our own organization; but this
is not necessarily so. A basis for control chart information on incoming material is often established with
outside vendors.
6. There is a hint of a spurious suggestion from Figure 6.2, namely, that one way to get excellent product quality
is to find a supplier who provides a large percent of defectives! This is not really a paradox. The only way to
ensure good quality product is to provide good manufacturing practices; dependence upon 100 percent inspection
is sometimes a short-term “necessary evil.”
The story is told of three friends who, dining at a company’s country club, ordered clams on the half shell. After an
exceptionally long wait, three plates of clams were brought with apologies for the delay: “I am very sorry, but
we had to open and throw out an awful lot of clams to find these good ones!” Would you eat the clams served?

6.7 COMPUTING THE AVERAGE OUTGOING QUALITY (AOQ) OF LOTS
FROM A PROCESS PRODUCING P PERCENT DEFECTIVE
When a sequence of lots is submitted to an acceptance plan n = 45, c = 2, what is the
AOQ for different values of p?
Consider first the case for p = 0.05; the number of defectives in an average lot of
size N = 2000, for example, is Np = 100. The average number of defectives removed
from the lot as a consequence of the sampling plan comes from two sources:
1. From the sample. Average number removed is

np = (45)(0.05) = 2.25 (6.1)

2. From lots that are rejected by the plan. We designate the probability of a lot
being rejected by

PR = 1 – PA = 1 – 0.61 = 0.39 (6.2)

Then the additional average number of defectives removed from these rejected lots is

p(N – n)PR = (0.05)(1955)(0.39) = 38.12 (6.3)

so the total average number of defectives removed from a lot is the sum of these two:

np + p(N – n)PR = 2.25 + 38.12 = 40.37 (6.4)

The average number of defectives remaining in a lot, 100 – 40.37 = 59.63, divided
by N is the AOQ:

AOQ = 59.63/2000 = 0.0298 or 2.98%
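The accounting in Equations (6.1) through (6.4) is easy to mechanize. The sketch below (the function name is ours) uses the rounded table value PA = 0.61, as the text does:

```python
def aoq_rectifying(p, pa, n, lot_size):
    """AOQ for a rectifying plan: defectives are removed from every
    sample, and from the remainder of every rejected (screened) lot."""
    removed_sample = n * p                        # Equation (6.1)
    pr = 1 - pa                                   # Equation (6.2)
    removed_screen = p * (lot_size - n) * pr      # Equation (6.3)
    removed = removed_sample + removed_screen     # Equation (6.4)
    return (lot_size * p - removed) / lot_size

aoq = aoq_rectifying(p=0.05, pa=0.61, n=45, lot_size=2000)
print(f"AOQ = {aoq:.2%}")  # prints "AOQ = 2.98%", as in the text
```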

Consider now the general case. The average number of defectives in a lot is Np.
From Equation (6.4), the number removed is

np + p(N – n)PR

Then

AOQ = [Np − (np + p(N − n)PR)] / N = pPA(1 − n/N)

Table 6.2 Average outgoing quality (AOQ) of lots proceeding past an acceptance sampling
station using the plan n = 45, c = 2.
Lots that fail to pass sampling are submitted to 100 percent screening with replacement of
defectives. See Equation (6.5) for method of approximating AOQ.
P                 PA                AOQ = P × PA
0 1.00 0.00
1 0.99 0.99
2 0.94 1.88
3 0.85 2.55
4 0.73 2.92
5 0.61 3.05
6 0.49 2.94
8 0.29 2.32
10 0.16 1.60
15 0.03 0.45

Since the value of n/N is usually very small,

AOQ ≅ pPA (6.5)

Equation (6.5) was used7 to compute values of AOQ in Table 6.2.
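Equation (6.5) also makes it easy to locate the AOQL numerically: compute AOQ ≅ pPA over a grid of incoming quality levels and take the maximum. A standard-library sketch (function and variable names are ours):

```python
from math import comb

def pa(p, n=45, c=2):
    """Probability of acceptance for the single-sampling plan (n, c)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# AOQ ~ p * PA, Equation (6.5); scan incoming fraction defective for the peak.
grid = [i / 1000 for i in range(1, 151)]          # p = 0.1% .. 15.0%
aoq = {p: p * pa(p) for p in grid}
p_peak = max(aoq, key=aoq.get)
print(f"AOQL ~ {aoq[p_peak]:.2%} at incoming P ~ {p_peak:.1%}")
```

The peak is close to 3 percent and occurs near P ≅ 5 percent, in agreement with Figure 6.2.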

6.8 OTHER IMPORTANT CONCEPTS ASSOCIATED WITH SAMPLING PLANS
Minimum Average Total Inspection. Includes inspection of the samples and the 100
percent screening of those lots that are rejected by the plan. Both Dodge-Romig sys-
tems, AOQL and LTPD, include this principle.
Acceptable-quality level (AQL). Represents the largest average percent defective that
will be accepted with reasonably high probability. Sometimes an attempt is made to for-
malize the concept as “the quality in percent defective that the consumer is willing to
accept about 95 percent of the time such lots are submitted.” This definition has been
the basis for some heated arguments.
Point of Indifference.8 Represents a percent defective in lots that will be accepted half
the time when submitted (PA = 50 percent).

7. For ordinary practical purposes, it is adequate to regard the characteristics of sampling plans as sampling from a
production process with fixed fraction defective p.
8. Hugo C. Hamaker, “Some Basic Principles of Sampling Inspection by Attributes,” Applied Statistics 7 (1958): 149–59.

Lot Tolerance Percent Defective Plans. To many consumers, it seems that the quality
of each lot is so critical that the average outgoing quality concept does not offer ade-
quate protection. The customer often feels a need for a lot-by-lot system of protection.
Such systems have been devised and are called lot tolerance percent defective (LTPD)
plans. The LTPD is the quality level that will have 10 percent probability of acceptance.
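Because the OC curve falls steadily as p increases, quality levels such as the point of indifference (PA = 50 percent) and the LTPD (PA = 10 percent) can be found by simple bisection. A sketch for the plan n = 45, c = 2 (function names are ours):

```python
from math import comb

def pa(p, n=45, c=2):
    """Probability of acceptance for the single-sampling plan (n, c)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def quality_at_pa(target, tol=1e-6):
    """Fraction defective at which the OC curve equals `target`.
    PA decreases monotonically in p, so plain bisection suffices."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pa(mid) > target:
            lo = mid          # still accepting too often; move right
        else:
            hi = mid
    return (lo + hi) / 2

print(f"Point of indifference (PA = 50%): p = {quality_at_pa(0.50):.1%}")
print(f"LTPD (PA = 10%):                  p = {quality_at_pa(0.10):.1%}")
```

For this plan the point of indifference falls near 6 percent defective and the LTPD near 11 percent.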
Good Quality Is a Consequence of Good Manufacturing. Good quality is not the result
of inspection; inspection is often considered a “necessary complement.” Most important
is the role that acceptance sampling plans can play in providing useful information to
help manufacturing improve its processes (see Section 6.11).

6.9 RISKS
The vendor/producer wants reasonable assurance (a small producer’s risk) of only a
small risk of rejection when lots are submitted having a small percent defective. In
Figure 6.1, the producer’s risk is about six percent for lots with two percent defective
(PA = 0.94 in Table 6.1) and smaller for better lots. This may be reasonable and acceptable to the
producer on some product items and not on others; negotiation is normally required.
The vendee/consumer wants reasonable assurance (a small consumer’s risk) that
lots with a large percent defective will usually be rejected. In Figure 6.1, the consumer’s
risk is seen to be about 16 percent for lots submitted with 10 percent defective; less than
16 percent for larger percents defective.
Compromises between the consumer and producer are necessary. Two systems of
tabulated plans in wide use provide a range of possibilities between these two risks; the
Dodge-Romig plans and the ANSI/ASQ Z1.4 system.
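Both risks are simply points on the OC curve of Figure 6.1 and can be read off numerically (a standard-library sketch; the function name is ours):

```python
from math import comb

def pa(p, n=45, c=2):
    """Probability of acceptance for the plan n = 45, c = 2."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

producer_risk = 1 - pa(0.02)   # chance that a 2%-defective lot is rejected
consumer_risk = pa(0.10)       # chance that a 10%-defective lot is accepted
print(f"Producer's risk at  2% defective: {producer_risk:.1%}")
print(f"Consumer's risk at 10% defective: {consumer_risk:.1%}")
```

These values agree with Table 6.1 (PA = 0.94 and 0.16, respectively).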

6.10 TABULATED SAMPLING PLANS


Fortunately for the quality practitioner, sampling plans have been tabulated for sim-
plicity and accuracy of use. Two of the best known9 sampling plans are the Dodge-
Romig AOQL and LTPD plans and the AQL sampling system of ANSI/ASQ Z1.4.
The Dodge-Romig tables provide AOQL and LTPD plans for different lot sizes.
The AOQL plans must be used with nondestructive tests, since 100 percent inspection
of rejected lots is required. These plans were designed to minimize the total inspection
resulting from the inspection of the sample(s), and whatever 100 percent screening
inspection is required on lots that fail sampling. The tables provide AOQL plans for
specified AOQL values of 0.1 to 10 percent, and LTPD plans for values of P from 0.5
to 10 percent.

9. Harold F. Dodge and Harry G. Romig, Sampling Inspection Tables, Single and Double Sampling (New York:
John Wiley and Sons, 1998). A selected set of these plans will be found in the standard entitled ASTM E1994-98.
See also ANSI/ASQ Z1.4, Sampling Procedures and Tables for Inspection by Attributes (Milwaukee: American
Society for Quality, 2003).

The plans of ANSI/ASQ Z1.4 are used extensively in acceptance sampling. These
plans are based on an acceptable quality level (AQL) concept put forward in Mil-Std-
105. The producer’s risk is emphasized when an AQL is chosen. Plans are intended to
protect the producer when producing at or better than the AQL level, unless there is a
previous history or other basis for questioning the quality of the product. When product
is submitted from a process at the AQL level, it will be accepted most of the time. Since
OC curves drop only gradually for percent defectives slightly larger than the AQL
value, such product will have a fairly high probability of acceptance. The producer’s
interest is protected under criteria designated as normal inspection.
How then is the consumer protected? Whenever there is reason to doubt the quality
level of the producer, the ANSI/ASQ Z1.4 system provides stricter criteria for accep-
tance. These plans are called tightened inspection plans. Criteria are provided in the
plans to govern switching from normal to tightened inspection. Proper use of ANSI/ASQ
Z1.4 demands that the rules for shifting from normal to tightened inspection be observed.
When the producer has an excellent record of quality on a particular item, the
ANSI/ASQ Z1.4 system plans permit a reduction in sample sizes by switching to
reduced inspection. This shift to reduced inspection is not designed to maintain the
AQL protection, but to allow a saving in inspection effort by the consumer. For use
with individual lots, specific plans can be selected by referring to OC curves printed
in the standard.
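The normal/tightened switching discipline can be sketched as a small state machine. The rules below, switch to tightened when 2 of the last 5 lots under normal inspection have been rejected, and return to normal after 5 consecutive acceptances under tightened inspection, follow the commonly cited Mil-Std-105/Z1.4 criteria; consult the standard itself for the full conditions, including those governing reduced inspection.

```python
from collections import deque

class SwitchingRules:
    """Sketch of normal/tightened switching in an AQL scheme.

    Assumed rules (see ANSI/ASQ Z1.4 for the authoritative criteria):
      normal -> tightened: 2 of the last 5 lots rejected;
      tightened -> normal: 5 consecutive lots accepted.
    """

    def __init__(self):
        self.state = "normal"
        self.recent = deque(maxlen=5)   # results of the last 5 lots (normal)
        self.accept_run = 0             # consecutive acceptances (tightened)

    def record(self, accepted):
        if self.state == "normal":
            self.recent.append(accepted)
            if list(self.recent).count(False) >= 2:
                self.state = "tightened"
                self.accept_run = 0
        else:
            self.accept_run = self.accept_run + 1 if accepted else 0
            if self.accept_run >= 5:
                self.state = "normal"
                self.recent.clear()
        return self.state

rules = SwitchingRules()
for accepted in (True, False, True, False):   # second rejection within 5 lots
    state = rules.record(accepted)
print(state)                                  # prints "tightened"
```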

6.11 FEEDBACK OF INFORMATION

Problems, Problems, Everywhere


Problems always abound when manufacturing any product; they may be found both
during processing and in the finished product. Problems may result from product
design, vendor quality, testing inadequacies, and on and on. It is tempting to blame the
problems on factors outside our own immediate sphere of responsibility. In fact, there
are occasions when a vendor is known to be supplying low-quality items; there are also
occasions when we have examined our process very carefully without finding how to
improve it. There are two standard procedures that, though often good in themselves,
can serve to postpone careful analysis of the production process:
1. Online inspection stations (100 percent screening). These can become a
way of life.
2. Online acceptance sampling plans that prevent excessively defective lots
from proceeding on down the production line, but have no feedback
procedure included.
These procedures become bad when they allow or encourage carelessness in pro-
duction. It gets easy for production to shrug off responsibility for quality and criticize
inspection for letting bad quality proceed.

More Than a Police Function


No screening inspection should simply separate the good from the bad, the conforming
from the nonconforming, the sheep from the goats. No online acceptance sampling sys-
tem should serve merely a police function by just keeping unsatisfactory lots from con-
tinuing down the production line. Incorporated in any sampling system should be
procedures for the recording of important detailed information on the number and types
of production defects. It is a great loss when these data are not sent back to help pro-
duction improve itself. A form for use in reporting such information is vital, although
preparing an effective one is not always a simple task.
Any systematic reporting of defects, which can trigger corrective action, is a step
forward. Contentions that the start of a system should be postponed—“we aren’t ready
yet”—should be disregarded. Get started. Any new information will be useful in itself
and will suggest adjustments and improvements.

Defect Classification
Any inspection station has some concepts of “good” and “bad.” This may be enough to
get started. However, corrective action on the process cannot begin until it is known
what needs correction. At a station for the visual inspection of enamel bowls, items for
a sampling sheet (Table 6.6) were discussed with the regular inspector, a foreman, and
the chief inspector. Table 6.10 was similarly devised for weaving defects in cloth. Some
general principles can be inferred from them. Figure 6.3 and the discussion below offer
some ideas for record sheets associated with single-sampling acceptance plans. These
principles apply to information kept with control charts as well. Any process events
written in the Comments column can be used to explain rejection situations.
• Give some consideration to the seriousness of defects. Table 6.7 uses two
categories, serious and very serious. Categories can be defined more carefully
after some experience with the plan. (More sophisticated plans may use three
or four categories.)
• Characterize defects into groups with some regard for their manufacturing
source. This requires advice from those familiar with the production process.
In Table 6.6, for example, black spots are listed as both A.4 and A.5, and B.8
and B.9. They were the result of different production causes. Corrective action
is better indicated when they are reported separately. Also, note metal exposed,
A.6, A.7, and A.8, and B.5, B.6, and B.7.
• Do not list too many different defect types; limit the list to those that occur
most often; then list “others.” When some other defect appears important, it
can be added to the list.
• Eventually, information relating to the natural sources of defects may be
appropriate; individual machines, operators, shifts. Even heads on a machine
or cavities in a mold may perform differently.

Department: Mounting        Tube type: 6AK5        Item: Grid
Test: Visual                Sampling plan: n = 45, c = 2

Date     Time    n    Inspector   Spacy   Taper   Slant   Damage/Other   Total   Action   Comments
2/5/73   10:00   45   MB            2       1       0          1           4       R
         12:00   45   MB            3       0       1          2           6       R
          2:00   45   AR            1       0       0          1           2       A
          4:00   45   AR            2       0       0          1           3       R
Daily total     180                 8       1       1          5          15

Circulation: JMA, WCF, FFR, RMA

Figure 6.3 Lot-by-lot record for acceptance sampling (single sampling).

Sampling versus 100 Percent Inspection


Information from samples rather than 100 percent inspection is usually more helpful
because:
1. In 100 percent screening, inspection is often terminated as soon as any
defect is found in the unit. This can result in undercounting important defects.
In a sample, however, inspection of a unit can usually be continued until all
quality characteristics have been checked and counted.
2. Much 100 percent inspection is routine and uninspiring by its very nature.
Records from such inspection are often full of inaccuracies and offer little
useful information for improvement. Small samples give some release from
the boredom and allow more careful attention to listed defect items. They
also permit recognition and attention to peculiarities that occur.
3. With the use of small, convenient, fixed-size samples, information can
be fed back as illustrated in the two case histories below. The resulting
improvements in those situations had been thought impossible. Also, see
Case History 5.2.

Teeth and Incentives


Firmness and tact are important when persuading people that they are at fault and that
they can correct it. There are various possibilities:
1. Some major companies physically return defectives to the erring
department. Others have them repaired by a repair department, but charge
the repair back to the erring department. A department can often improve
itself if suitable information is fed back.
2. The physical holdup of product proceeding down the line has a most salutary
effect. When no complaints or records on bad quality are made, but instead
bad product continues down the line, it is almost certain to induce carelessness.
It says, loud and clear, “Who cares?”
3. Even a control chart on percent defective (a p chart) posted in the manufacturing
department can provide encouragement. Used carefully, this can have as much
interest and value as a golf score to an individual10 or a department.
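A p chart of the kind suggested in point 3 requires only the long-run average fraction defective and the sample size. A minimal sketch (the names and cost-free figures here are illustrative, not from the text):

```python
from math import sqrt

def p_chart_limits(p_bar, n):
    """Three-sigma control limits for a p chart with samples of size n.
    The lower limit is truncated at zero; a fraction cannot be negative."""
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), p_bar + 3 * sigma

# Samples of n = 45 with a long-run average of 5 percent defective:
lcl, ucl = p_chart_limits(0.05, 45)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")  # prints "LCL = 0.000, UCL = 0.147"
```

Points plotted above the upper limit signal a deterioration worth investigating; a run of points well below the average may signal a genuine improvement.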

6.12 WHERE SHOULD FEEDBACK BEGIN?


There is no one answer, but there are some guidelines:
1. An acceptance plan may already be operating but serving only as a police
function. Attach a feedback aspect, organized so as to suggest important
manufacturing problems.
2. Sore thumb. Sometimes a large amount of scrap or a failure to assemble
will indicate an obvious problem. Often no objective information is available
to indicate its severity, the apparent sources, or whether it is regular or
intermittent. Start small-scale sampling with a feedback. This may be a
formal acceptance sampling plan or a convenience sample large enough to
provide some useful information. (A sample of n = 5 will not usually be
large enough11 when using attributes data, but frequently, a sample of 25 or
50 taken at reasonable time intervals will be very useful.)
3. Begin at the beginning? It is often proposed that any improvement project
should start at the beginning of the process, making any necessary adjustments
at each successive step. Then, at the end of the process, it is argued, there will
be no problems. This approach appeals especially to those in charge of the
manufacturing processes. Sadly, it is often not good practice.

10. Ernest W. Karlin and Ernie Hanewinckel, “A Personal Quality Improvement Program for Golfers,” Quality
Progress (July 1998): 71–78. Golfers will find this approach to be an interesting, as well as effective, quality
improvement–guided method for improving your golf score.
11. For an exception, see Case History 11.1 (spot welding).

First, there is rarely an opportunity to complete such a well-intentioned project. A
“bigger fire” develops elsewhere, and this one is postponed, often indefinitely.
Second, most of the steps in a process are usually right. In the process of following
operations step by step, and in checking each successive operation, much time is lost
unnecessarily. Usually it proves better to start at the back end; find the major problems
occurring in the final product. Some will have arisen in one department, some in
another. The method of the following Case History 6.1 was designed to pinpoint areas in
manufacturing that warrant attention, whether from raw materials or components, process
adjustment, engineering design, inspection, or others. Pareto analysis allows prioritiza-
tion of these aspects of any process study.12

Case History 6.1


Outgoing Product Quality Rating (OPQR)

Introduction
This program gets to the source of difficulties in a hurry. Further, it enlists the coop-
eration of various departments. The method starts by rating small samples of the out-
going product. This outgoing product quality rating program13 was suggested by a
plan14 developed in connection with complicated electronic and electrical equipment.
A well-known pharmaceutical house utilized this system for a major drive on package
quality. It is equally applicable in many other industries. [See the file \Selected Case
Histories\Chapter 6\CH6_1.pdf on the CD-ROM—including Tables 6.3 and 6.4, and
Figure 6.4.]

Case History 6.2


Metal Stamping and Enameling
Many different enameled items, such as basins, trays, and cups, were made in a plant in
India. The manufacture of each product began by punching blanks from large sheets of
steel and cold-forming them to shape. The enamel was then applied by dipping the item
into a vat of enamel slurry and firing in an oven. (This enameling process consisted of

12. Joseph M. Juran, “Pareto, Lorenz, Cournot, Bernoulli, Juran and others,” Industrial Quality Control 17, no. 4
(October 1960): 25.
13. William C. Frey, “A Plan for Outgoing Quality,” Modern Packaging (October 1962). Besides special details in
Table 6.3, Figure 6.4, and the classification of defects, other ideas and phrases from this article are included here.
Permission for these inclusions from the author and publisher are gratefully acknowledged.
14. Harold F. Dodge and Mary N. Torrey, “A Check Inspection and Demerit Rating Plan,” Industrial Quality Control
13, no. 1 (July 1956).

two or three coating applications.) See Table 6.5 and Figure 6.5 for steps in producing
an enameled basin.
As we made our initial tour of the plant, we saw two main visual inspection (sorting)
stations: (1) after metal forming (before enameling) and (2) after final enameling (before
shipment to the customer). Either station would be a logical place to collect data.

Table 6.5 Steps in producing an enameled basin.


Step 1. Metal fabrication:
a. Metal punching (one machine with one punching head); produced circular blanks from a
large sheet of steel.
b. Stampings (three-stage forming): one machine with dies at each stage to produce
rough-edged form.
c. Trimming; a hand operation using large metal shears.
d. Cold spinning (a hand operation on a lathe to roll the edge into a band).
e. Sorting inspection (100%) (no records).
Step 2. Acid bath
Step 3. Enameling (Blue and white coats):
a. Mixing enamel.
b. Apply blue enamel coating (by dipping).
c. Fire coating (in ovens).
d. Apply white enamel coating (by dipping).
e. Paint border (hand operation).
f. Final firing.
Step 4. Final inspection:
Product classified but no record kept of defects found.

Figure 6.5 Representation of steps in metal fabrication to form an enameled basin.



It is never enough just to collect data. The sensitivities of various production and
inspection groups must be recognized and their participation and support enlisted.
Many projects produce important, meaningful data that are useless until their interpre-
tations are implemented. The sequence of steps shown below was important to success
in enlisting support for the production study.
To Begin
We arrived at the factory about 8 AM and our meeting with the plant manager ended
about 12:30 PM. When asked, “When can you begin?” the answer was “Now.”
Since we had agreed to begin with visual defects on basins (Figure 6.6), the chief
inspector and our young quality control people sketched out an inspection sheet that
allowed the start of sampling at final inspection. Regular final inspection continued to
classify items as:
• First quality—approved for export.
• Second quality—with minor defects; these sold at a slightly reduced price.
• Third quality—with some serious defects; these sold at a substantial reduction
in price.
The daily inspection sheet (Table 6.6) was used.

Figure 6.6 An enameled basin.



Table 6.6 Daily inspection sheet (sampling).

Product: Basin        Size: 16 cm or 40 cm        Date:
Stage: Final inspection (after firing)            Sample size:

                    Number     Classification           Summary
Defects             inspected  First  Second  Third     Number inspected  First  Second  Third    Notes
A. Serious
1. Jig mark
2. Lump
3. Nonuniform border
4. Black spot inside
5. Black spot outside
6. Metal exposed (rim)
7. Metal exposed (border)
8. Metal exposed (body)
9. Bad coating
10. Others
B. Very serious
1. Very nonuniform border
2. Chip
3. Sheet blister
4. Dented
5. Metal exposed (border)
6. Metal exposed (rim)
7. Metal exposed (body)
8. Black spot inside
9. Black spot outside
10. Lumps
11. Very bad coating
12. Dust particles
13. Others

Inspector signature

Random samples of 30 units from each day’s production were inspected for the 16-
and 40-cm basins and the number of defects of each type recorded. A single basin might
have more than one type of defect; an inspection record was kept of all defects found
on it. The same inspector was used throughout the workshop to reduce differences in
standards of inspection.
Progress
Daily meetings of the workshop team were held to look at the data on the daily sam-
pling inspection sheets. They led to discussions on ways to reduce high-defect items.

Table 6.7 Enamel basins—defect analysis after four days.


Defects observed
Number inspected = 4 × 30 = 120

                          40-cm basins                        16-cm basins
Classification      Serious    Very serious   Total     Serious    Very serious   Total
of defects          no.   %    no.   %        no.  %    no.   %    no.   %        no.  %
Nonuniform border 19 16 — — 19 16 34 28 12 10 46 38
Blue, black spot 51 42 24 20 75 62 45 37 11 9 56 46
Metal exposed 5 4 3 2 8 6 10 8 32 27 42 35
Sheet blister — — 28 23 28 23 — — 7 6 7 6
Jig mark 1 1 — — 1 1 9 8 — — 9 8
Lump 14 12 5 4 19 16 2 2 — — 2 2
Bad coating 15 13 5 4 20 17 — — — — — —
Chips — — 3 2 3 2 — — — — — —
Dented — — 2 2 2 2 — — 1 1 1 1
Dust particles — — 12 10 12 10 — — — — — —
Others — — — — — — — — — — — —
Total defects 105 + 82 = 187 100 + 63 = 163

Beginning with the first day, information was given to production via the production
supervisor and was a major factor in several early corrections. Then, after four days, a
summary of the causes of defects was prepared and discussed at a meeting of the work-
shop team (Table 6.7).
Sequence of Steps in the Workshop Investigation
1. A team was formed; it included the chief inspector, the production supervisor,
two young experienced quality control people, and one of the authors (Ellis Ott).
2. A two-hour tour of the plant was made with the team to identify potential
stations for gathering information.
3. A meeting with members of the team and the works manager was held
following the tour. The discussion included:
a. Types of problems being experienced in the plant.
b. Important cooperative aspects of the project.
c. Various projects that might be undertaken. Management suggested that we
emphasize the reduction of visual defects, especially in their large-volume
16-cm enameled basin (see Figure 6.6). The suggestion was accepted.
4. Two locations were approved to begin in the workshop:
a. At final visual inspection: step 4 in Table 6.5.
b. At the point of 100 percent sorting after metal fabrication and forming
(step 1e) and just before an acid bath that preceded enameling. No records
were being kept of the number of actual defects found.

5. A final oral summary was presented to the works manager, outlining the
findings and the ideas for improvements suggested by the data.
Some Findings
A quick check of Table 6.7 shows that four types of defects accounted for about 80 per-
cent of all defects found during the first four days (a typical experience):

Defect 40 cm 16 cm
1. Blue and black spots 62% 46%
2. Nonuniform border 16% 38%
3. Metal exposed 6% 35%
4. Sheet blister 23% 6%
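The Pareto ranking behind this observation can be reproduced mechanically. The counts below are the 40-cm totals from Table 6.7 (187 defects in all); sorting them and accumulating shows how quickly a few defect types come to dominate:

```python
# Total defects per type for 40-cm basins, taken from Table 6.7.
counts = {
    "Blue, black spot": 75, "Sheet blister": 28, "Bad coating": 20,
    "Nonuniform border": 19, "Lump": 19, "Dust particles": 12,
    "Metal exposed": 8, "Chips": 3, "Dented": 2, "Jig mark": 1,
}

total = sum(counts.values())                     # 187, as in Table 6.7
running = 0
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    running += n
    print(f"{name:18s} {n:3d}   cumulative {running / total:5.1%}")
```

The first four rows already account for roughly three-quarters of all 40-cm defects; the 16-cm counts behave similarly.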

Many good ideas came from a discussion of this four-day summary sheet. It was
noticed, for example, that the smaller 16-cm basin had a record of 35 percent defects for
“metal exposed” while the 40-cm basin—over six times as much area—had only six
percent! “What would explain this peculiar result?” Suddenly realizing what might be
happening, the supervisor said, “Wait a minute,” left us abruptly, and returned with a
metal tripod that was used to support both the 16- and 40-cm basins during the firing
of the enamel coating. On the small basins, the exposed metal was on the basin rim. The
small basins nestled down inside the tripod, letting the edges touch the supporting tri-
pod during firing, and the glaze (Figure 6.7) often adhered to the tripod as well as to the
basin; when the basin was removed from the tripod, the enamel pulled off the edge and
left metal exposed (a serious defect). The large basin sat on top of the tripod, and any
exposed metal was on the bottom in an area where it was classified as minor.
In this case, the solution was simple to recognize and effect, once the comparison
between the two basins was noted, because the supervisor was an active member of
the team.

Figure 6.7 Tripod supporting 16-cm enameled basin during firing.



Some Subsequent Summaries


Four summaries at three- to five-day intervals were prepared during the two weeks. The
one in Table 6.8 compares the progress on the four major defects that had been found.
Figure 6.8 shows the levels of the percentage of major defects over the four time
periods for 16-cm basins. The defects in periods 3 and 4 have decreased considerably
except for blue and black spots (40-cm) and sheet blister. The summary of inspection
results was discussed each period with the production people who took various actions,
including the following, to reduce defects:
1. Pickling process—degreasing time was increased.
2. Change in enamel solution.
3. Better supervision on firing temperature.

Table 6.8 Summary showing percentage of major defects over four time periods.
                      40-cm basins           16-cm basins
                      Period*                Period
Major defects         1      2      3        1      2      3      4
Blue and black spots 62.5% 52.3% 58.8% 56.7% 37.2% 27.3% 30.0%
Nonuniform border 15.8% 13.3% 3.3% 38.3% 38.3% 14.6% 23.0%
Metal exposed 6.7% 8.4% 3.3% 35.0% 11.6% 7.3% 2.0%
Sheet blister 23.3% 12.4% 18.9% 5.8% 8.4% 6.7% 2.0%
* No production of 40-cm basin in period 4.

Figure 6.8 Summary of percent classification of 16-cm enameled basins over four sampling
periods: three panels (first, second, and third quality) showing percent against
periods 1–4. (Data from Table 6.9.)
174 Part II: Statistical Process Control

Table 6.9 Changes in quality classification over four time periods.

                          40-cm basins               16-cm basins
                             Period*                    Period
Quality
classification         1       2       3       1       2       3       4
First                18.3%   14.2%   23.3%   20.0%   27.5%   48.0%   44.0%
Second               37.5%   58.3%   58.9%   45.0%   49.2%   46.0%   53.3%
Third                44.2%   27.5%   17.8%   35.0%   23.3%    6.0%    2.7%
Total               100.0%  100.0%  100.0%  100.0%  100.0%  100.0%  100.0%
* No production of 40-cm basin in period 4.

These and other changes were effective in reducing defects.


The record (for 16-cm basins, from Table 6.9) is shown graphically in Figure 6.8. A
sharp decrease is evident in the critical third-grade quality, along with an increase in
the percentage of first-quality items; the decrease in third quality was probably the
most important gain.
At the end of the two-week study, the team met again with top management. Types
of accomplishments and problems were discussed (in nonstatistical terms). Everyone
was pleased with the progress attained and in favor of extending the methods to other
product items.
A summary of some proposed plans for extensions was prepared by the team and
presented orally and in writing to management. It is outlined below.
A Plan for the Development and Extension of a Quality Control System
at an Enamel Works
1. Official quality control committee. Although quality is the concern of
everybody in the organization, it is usually found to be the responsibility of
none. It is always important to have a small committee to plan and review
progress; and this committee should have as secretary the person who will be
charged with the responsibility of implementing the program at the factory
level. The committee should meet at least once a week to study the results
achieved and plan future action.
The committee should be composed of a management representative,
production personnel in charge of the manufacturing and enameling sections,
and the chief inspector (chairperson), assisted by the quality control person.
2. Control of visual defects in production. Initially, systematic sampling on a
routine basis should be done on all items produced every day and at least once
a week after conditions have been stabilized. Inspect for visual defects after
machining and after enameling. These data should be kept in a suitable file in
an easily distinguishable way, product by product.
All data collected should be maintained on a control chart to be kept in the
departments concerned, and quality control should bring to the notice of

the appropriate personnel any abnormalities that need to be investigated
or corrected. Weekly summaries would be discussed by the quality
control committee.
3. Control of outgoing quality (quality assurance). Starting with exported
products, a regular check of about 20 items per day should be made on firsts,
seconds, and thirds (quality). Based on appropriate demerit scores for the
type and intensity of defects, a demerit chart (see Case History 6.1 on
the CD-ROM) can be kept in each case for groups of similar products.
4. Control of inspector differences. Establish “just acceptable” and “just not
acceptable” standards for each defect and make them available to all inspectors.
Keep the data on quality assurance for each inspector (sorter). Summarize
the information on a monthly basis to study the extent of misclassification per
inspector and take corrective measures to improve poor inspectors (sorters).
5. Control of nonvisual defects. In addition to the visual defects, some important
quality characteristics that require control are:
a. Weight of the product
b. Weight of the enamel on the product
c. Uniformity of enameling
d. Chipping strength of enameling
The processes have to be studied with regard to performance on these
characteristics and, where required, simple experimentation should be
planned for effecting improvements and routine control thereafter with the
help of control charts.
6. Specification. In due course, the committee should concern itself with laying
down realistic specifications. The data from paragraphs 2 to 5 would be of
immense help.
7. Incoming material: acceptance sampling. The quality of the materials
accepted has a vital bearing on the quality of manufacture. Acceptance
sampling plans may be started on a few vital items and, based on experience,
can be gradually extended. In each case, it is necessary to devise suitable
forms so that it will be possible to analyze each product and vendor without
too much trouble.
8. Training. A one-hour talk should be given every week to a group of workers
and supervisors, using the data and charts collected as above for the different
product types; the group should change from week to week. The talk should
pertain to data in which the group itself is interested, so as to ensure
responsive cooperation.

9. Regular reports. Daily reports on all quality control data should be available
to the concerned person in charge of production as well as the works manager
and technical adviser.
In each case, reports should be short and should pinpoint achievements as well as
points needing attention.

Case History 6.3


An Investigation of Cloth Defects in a Cotton Mill (Loom Shed)

Some Background Information


The advanced age and condition of the looms and loom shed had led the management
of a factory to schedule a complete replacement of looms.
The factory had had some previous helpful experience using quality control methods
to increase machine utilization in the spinning department. Consequently, there was a
climate of cooperation and hopefulness in this new venture.
There were two shifts, A and B; the same looms were studied on the two shifts.
The 500 looms were arranged in 10 lines of 50 looms each. There were 10 super-
visors per shift, one supervising each line of 50 looms. There were 25 loom operators
per line (per shift), each operator servicing two looms (in the same line).
The regular inspection practice was to 100 percent inspect each piece of cloth
(which was 18 feet long and 45 inches wide), rating it as good, second, or poor. No
record had been kept of the number, type, or origin of defects. Thus, inspection provided
no feedback of information to guide improvement of the production process.
Unfortunately, this is quite typical of 100 percent inspection procedures.
The principles of exploratory investigation used in this in-plant workshop study are
general ones, applicable in studying defects in many types of technical operations.
Summary of Workshop
1. Planning. The essential sequence of steps was:
a. A team was formed representing supervision, technology, and
quality control.
b. A tour of the plant was made for obtaining background information to
formulate methods of sampling and to prepare sampling inspection records.
c. An initial meeting was held by members of the team with plant
management to select the projects for study and outline a general plan
of procedure.
d. Frequent meetings of the team with appropriate supervisory and technical
personnel were held as the study progressed.

Table 6.10 Record of weaving defects—major and minor—found in cloth pieces over two days
(from five looms on two shifts). All five looms in line 2. The original form
tallied each defect by type (wrong weft, float warp, float weft, crack, thick,
smash, selvedge, no head, imperfect head, flange cut, long end, border end,
rusty, oily, fluft, thin, other spots, and others); the loom totals were:

                        Pieces        Total major     Total minor
Loom number             inspected     defects         defects
Shift A
  210                      13              6              18
  211                      15              5              11
  223                      13              5              16
  251                      15              4              18
  260                       7              4              16
  Total                    63             24              79
Shift B
  210                      14              5              14
  211                      14              7               7
  223                      12              0               8
  251                      14              1               4
  260                       8              0               9
  Total                    62             13              42
A and B shifts combined
  Total                   125             37             121

2. Data form. A standard form was prepared in advance; it listed the types of
major and minor defects that might be expected (Table 6.10). This form was
used by the inspector to record the defects found in the initial investigation
and in subsequent special studies.
The factory representatives on the team suggested that certain types of
defects were operator-induced (and could be reduced by good, well-trained
supervision) and that other types were machine-induced (some of which
should be reduced by machine adjustments, when recognized).
3. Sampling versus 100 percent inspection in study. It was not feasible to
examine cloth from each of the 500 loom–operator combinations; but even
had it been technically feasible, we would prefer to use a sampling of looms
for an initial study. Information from a sample of looms can indicate the
general types of existing differences, acquaint supervision with actual records
of defect occurrences, and permit attacks on major problems more effectively

and quickly than by waiting for the collection and analysis of data from all
500 looms.
What specific method of sampling should be used? Any plan that samples
from groupings that might be operating differently is preferable to random
sampling from the entire loom shed. Differences among the 10 line
supervisors are an obvious possibility for differences. Within each line, there
may be reasons why the sampling should be stratified:
a. Proximity to humidifiers, or sunny and shady walls
b. Different types of looms, if any
c. Different types of cloth being woven, if any
4. The proposed scheme of sampling looms. Management was able to provide
one inspector for this study. This made it possible to inspect and record
defects on the production of about 15 looms on each shift. (One inspector
could inspect about 200 pieces of cloth daily.) Each loom produced six or
seven pieces of cloth per shift or 12 to 14 pieces in the two shifts. Then an
estimate of the number n of looms to include in the study is: n ≅ 200/12 ≅ 16.
After discussion with the team, it was decided to select five looms from each
of lines 1, 2, and 3 on the first day of the study and repeat it on the second
day; then five looms from lines 4, 5, and 6 on the third day and repeat on the
fourth day; then lines 7, 8, and 9 on two successive days.
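The loom-count arithmetic in this step can be sketched in a couple of lines (the variable names are mine):

```python
# One inspector could examine about 200 pieces of cloth daily; each loom
# produced 12 to 14 pieces over the two shifts combined.
pieces_per_day = 200
pieces_per_loom = 12        # conservative two-shift figure used in the text

n_looms = pieces_per_day // pieces_per_loom    # about 16 looms, as in the text

# The team settled on 15 looms per day: five from each of three lines.
looms_per_line, lines_per_day = 5, 3
assert looms_per_line * lines_per_day <= n_looms
```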
This sampling scheme would permit the following comparisons to be made:
a. Between the 10 line supervisors on each shift, by comparing differences
in numbers of defects between lines.
b. Between the two shifts, by comparing differences in numbers of defects
from the same looms on the two shifts. (Shift differences would probably
be attributed to either supervision or operator differences; temperature and
humidity were other possibilities.)
c. Between the looms within lines included in the study.
Each piece of cloth was inspected completely, and every defect observed was
recorded on the inspection form (see Table 6.10). The technical purpose of this
sampling-study workshop was to determine major sources and types of defects
in order to indicate corrective action and reduce defects in subsequent
production. After an initial determination of major types of differences, it was
expected that a sampling system would be extended by management to other
looms and operators on a routine basis.
5. A final oral summary/outline presentation of findings and indicated
differences was held between the team and management. Findings were
presented graphically to indicate major effects; specific methods of improving

the manufacturing and supervisory processes were discussed. In addition,
suggestions were made on how to extend the methods to other looms,
operators, supervision, and possible sources of differences.
6. Some findings. Many different types of useful information were obtained from
the two-week study, but improvement began almost immediately. Table 6.10,
for example, includes the first two days’ record of defects of five rather bad
looms (line 2) on shifts A and B.
a. The vital few. Five types of defects accounted for 70 percent of all
major defects found during the first two days; it is typical that a few
types of defects account for the great majority of all defects.
b. Difference between shifts. Management was surprised when shown that
almost twice as many defects, major and minor, came from shift A as
from shift B. (See Table 6.10.) This observed difference was the reason
for a serious study to find the reasons; important ones were found by
plant personnel.
c. Differences between loom–operator combinations in line 2. All five of
these bad looms are in line 2; they have the same line supervisor but
different operators. No statistically significant difference was determined
between loom–operator combinations. It was found that more improve-
ments could be expected by improving line supervision than from operator
or machine performance.
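The “vital few” idea in finding (a) is a simple Pareto calculation. A sketch with made-up tallies (the defect counts below are illustrative, not the actual Table 6.10 figures):

```python
from collections import Counter

# Illustrative tallies of major defects by type (made-up numbers).
tallies = Counter({"wrong weft": 10, "float warp": 6, "thick": 6,
                   "crack": 4, "smash": 3, "oily": 2, "others": 6})

total = sum(tallies.values())
running, vital_few = 0, []
for defect, count in tallies.most_common():
    running += count
    vital_few.append(defect)
    if running / total >= 0.70:   # stop once 70 percent is accounted for
        break
# vital_few now holds the handful of defect types worth attacking first
```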

Discussion

Management maintained a mild skepticism, initially, toward this “ivory tower”
sampling study but soon became involved. It interested them, for example, that the num-
ber of defects on the second day of sampling was substantially lower than on the first
day! Supervisors had obtained evidence from the loom records of loom–operator dif-
ferences and could give directions on corrective methods. Improvements came from
better operator attention and loom adjustments; operators readily cooperated in making
improvements.
The average number of major and minor defects (causing downgrading of cloth)
had been 24 percent prior to the workshop study (that is, about 0.24 major defects per
piece since major defects were the principal cause of downgrading). During this work-
shop in December, the system of recording and charting the percent of pieces with
major defects was begun and continued.
The data from five loom operators per line immediately showed important differ-
ences between lines (supervisors) and loom operators; the word was passed from super-
visor to supervisor of differences being found and suggestions for making substantial
improvements.
No record is available showing the improvements made by the end of the second
week when a presentation to management outlined the major findings; but management

(Chart annotations: n = 120; level prior to SQC = 23.8%; subsequent averages of
approximately 17%, 15%, and 11.5%; management discontinued quality control
sampling and later reinstated it; the periods shown are March 1–March 19,
March 20–April 1, April 2–April 21, and April 22–April 30.)

Figure 6.9 Record of percent major damaged cloth in March and April following start of quality
control program. Average prior to the program was about 24 percent.

arranged for the sampling procedure to be continued and posted charts of the sam-
pling defects.
Figure 6.9 shows a daily record of the major defects per piece during March and
April; this study began the previous December. It shows several apparent levels of per-
formance; an explanation of two of them is given in a letter written the following July:
Damages, which came down from 24 percent to 16 percent a little while after
December 19, have now been further reduced to 11 percent as a result of addi-
tional sampling checks for quality at the looms. You will see some periods when
management lifted the controls in the hope that they would not be necessary.
But as soon as things started worsening, they reinstated the procedures laid
down earlier. Progress in this plant is quite satisfactory, and it provides a
method immediately applicable to other cotton mills.
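The daily record in Figure 6.9 behaves like a p chart for fraction nonconforming, so its control limits follow from the binomial model. A minimal sketch for the figure's daily sample of n = 120 (the function name is mine, and treating the chart as a p chart is my reading, not stated in the text):

```python
import math

def p_chart_limits(p_bar: float, n: int):
    """Three-sigma control limits for a fraction-nonconforming (p) chart."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - 3 * sigma)
    ucl = p_bar + 3 * sigma
    return lcl, ucl

# At the pre-study level of about 24 percent with n = 120 pieces per day:
lcl, ucl = p_chart_limits(0.24, 120)
# limits of about 12.3% and 35.7%; a sustained run near 15-17%, as in the
# figure, signals a real improvement rather than chance variation
```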

6.13 NARROW-LIMIT GAUGING


The plans presented so far involve attributes (go/no-go) data. Go/no-go gauging has
advantages over measurements made on a variables scale: less skill and time are
usually required. Moreover, the tradition of gauging has been established in many
shops even when it may be entirely feasible to make measurements.
However, there are real disadvantages associated with gauging. Large sample sizes
are required to detect important changes in the process. This is expensive when the test
is nondestructive; it becomes exorbitant when the test is destructive. The function of
inspection is a dual one: (1) it separates the sheep from the goats, of course; but (2) it

should provide a warning feedback of developing trends. Go/no-go gauges made to
specifications provide little or no warning of approaching trouble. When the process
produces only a small percent of units out of specifications, a sample selected for gaug-
ing will seldom contain any out-of-spec units. By the time out-of-spec units are found,
a large percentage of the product may already be out of specifications, and we are in
real trouble.
Fortunately, there is a procedure that retains the important advantages of gauging but
improves its efficiency. Narrow-limit gauging (NLG) is such a method. It is a gauging
procedure; it is versatile; it is applicable to chemical as well as to mechanical and elec-
trical applications. Required sample sizes are only nominally larger than equivalent
ones when using measurements.
This discussion concerns narrowed or compressed gauges used to guide a process.
They function to prevent trouble rather than to wait until sometime after manufacture to
learn that the process has not been operating satisfactorily. A process can be guided only
if information from gauging is made available to production in advance of a substantial
increase in defectives.
The variation of some processes over short periods is often less than permitted by
the specifications (tolerances). It is then economical to allow some shift in the process,
provided a system is operating that detects the approach of rejects.

6.14 OUTLINE OF AN NL-GAUGING PLAN


At the start, narrowed or compressed gauges (NL gauges) must be specified and pre-
pared. They may be mechanical ones made in the machine shop; they may simply be
limits computed and marked on a dial gauge or computed and used with any variables
measurement procedure.
NL gauges are narrowed by an amount indicated by ts; see Figure 6.10, where the
mean is (arbitrarily) taken a distance 3s from the lower specification limit.


Figure 6.10 Definition of ts for NL gauge.



Small samples of size n are usually gauged at regular time intervals; n may be as
small as 4 or 5 and is not usually greater than 10. (The sample is of the most recent pro-
duction when being used for process control.)
A specified number c of items in a sample of n will be allowed to “fail”15 the NL
gauges. If there are c or fewer items nonconforming to the NL gauges in a sample of n,
the process is continued without adjustment. If more than c nonconforming items are
found, then the process is to be adjusted. Separate records are kept of the number of
units that fail the smaller NL gauge and of the number failing the larger NL gauge. In
the applications discussed here, there are upper and lower specifications, and the dis-
tance between them is larger than the process spread. That is, (USL – LSL) > 6s.
It is assumed that the process produces a product whose quality characteristic has a
reasonably normal distribution (over a short time interval). See “Hazards,” Section 6.17.

Basic Assumptions for NL Gauging


1. An estimate of the basic process variability is available or can be made.
Perhaps an estimate can be made from a control chart of ranges or perhaps
from experience with a similar application.
2. The difference between the upper and lower specification limits (USL, LSL)
is greater than 6s; that is, (USL – LSL) > 6s.
This assumption means that some shifting in the process average is acceptable.
It also means that only one tail of the distribution at a time need be considered
when computing operating-characteristic curves (OC curves).
3. The distribution of the quality characteristic should be reasonably normal over
a short time.
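The sampling rule of Section 6.14 can be written out directly. A minimal sketch, with function names and sample numbers of my own choosing:

```python
def nl_gauge_limits(lsl, usl, sigma, t):
    """Narrow limits compressed inward from each specification by t*sigma."""
    return lsl + t * sigma, usl - t * sigma

def check_sample(sample, lsl, usl, sigma, t, c):
    """Return (continue_without_adjustment, n_low, n_high), keeping the
    separate counts of failures against the lower and upper NL gauges."""
    lo, hi = nl_gauge_limits(lsl, usl, sigma, t)
    n_low = sum(1 for x in sample if x < lo)
    n_high = sum(1 for x in sample if x > hi)
    return (n_low + n_high) <= c, n_low, n_high

# Plan A (ng = 5, c = 1, t = 1.0) on a process with sigma = 1 and specs 0-10:
ok, n_low, n_high = check_sample([4.8, 5.1, 5.3, 4.6, 9.2], 0, 10, 1.0, t=1.0, c=1)
# one item (9.2) fails the upper NL gauge at 9.0; the count is within c = 1,
# so the process continues without adjustment
```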

6.15 SELECTION OF A SIMPLE NL-GAUGING SAMPLING PLAN

Those who have used X̄ and R charts are familiar with the usefulness of small samples
of ng = 4 or 5. Samples as large as 9 or 10 are not often used; they are too sensitive and
indicate many shifts and peculiarities in the process that are not important problems.
Then those who are familiar with X̄ and R charts will approve of an NL-gauge plan
that gives process guidance comparable to that of X̄ and R charts for ng = 4 or 5. Four
such plans are the following:

A. ng = 5, c = 1, t = 1.0
C. ng = 10, c = 2, t = 1.0

15. Any unit that fails an NL gauge can be re-gauged at the actual specifications to determine salability.

D. ng = 4, c = 1, t = 1.2
F. ng = 10, c = 2, t = 1.2

Two OC curves are shown in Figure 6.11, along with that of an X̄ control chart
plan, ng = 4. OC curves are shown for plans A and C (data from Table 6.11). The OC
curves of plans A and D are very close to that of an X̄ chart, ng = 4. For that reason,
the curve for plan D has not been drawn. A discussion of the construction of operating-
characteristic curves for some plans is given in Section 6.16.
When guiding a process, we may believe that there is need for more information
than provided by X̄ and R charts using ng = 5. If so, we usually take samples of five more
frequently in preference to increasing the size of ng. The same procedure is recom-
mended for an NL-gauge system.
The effectiveness of NL gauging in helping prevent trouble is improved when we
chart the information from successive samples. These charts indicate the approach of
trouble in time to take preventive action. Plans that have zero acceptance numbers are
all-or-nothing in signaling trouble.
Since an indication of approaching trouble is not possible in a plan with c = 0, the
smallest recommended value is c = 1. A frequent preference is a plan with ng = 4 or 5,
c = 1, and t = 1.

(Plot: probability of acceptance PA against P, the percent outside one specification limit.)

Figure 6.11 Some operating characteristics of NL-gauging plans and a variables control chart
on X̄ with ng = 4. OC curves are shown for plans A (ng = 5, c = 1, t = 1.0) and
C (ng = 10, c = 2, t = 1.0). (Data from Table 6.11.) The OC curves of plans A
and D are very close to that of an X̄ chart with ng = 4. For that reason, the
curve for plan D has not been drawn.

Case History 6.4


Extruding Plastic Caps and Bottles
The mating fit of a semiflexible cap on a bottle depends on the inside diameter (ID) of
the cap and the outside diameter (OD) of the bottle, as shown in Figure 6.12.
Some Pertinent Information
1. The molded plastic caps and bottles shrink during cooling after leaving the
mold. We found it reasonable to immerse them in cold water before NL gauges
were used. Sometimes they are held in a plastic bag during immersion.
Gauging could then be done shortly after production.
2. The ID and OD dimensions can be adjusted by certain temperature ranges in
the machine molds and by other machine adjustments. It requires specific
knowledge of the process to effect these machine changes in diameter without
introducing visual defects into the product.
3. The usual production control check on these two diameters is by gauging; a
plug gauge for ID and a ring gauge for OD. It is traditional to use plug gauges
made to the maximum and minimum specifications. Then by the time they
find a reject, large numbers of out-of-spec caps or bottles are already in the
bin. Since it is rarely economical to make a 100 percent inspection of caps
or bottles, the partial bin of product must be scrapped.
4. It is possible to purchase dimensional equipment to measure the OD and
the ID. However, such instruments are slow to use, not as accurate as one
would expect, and require more skill than gauging; the plotting of data is
also more elaborate.


Figure 6.12 Molded plastic bottle components.



5. Each cap or bottle has the cavity of origin molded into it because differences
between cavities are common. However, it requires excessive time to
establish the cavity numbers represented in a sample and then to measure
and record diameters.
Determining the Variability of Cap ID
On one type of plastic cap, the specifications were

USL = 1.015 inches and LSL = 1.00 inches

Samples of three caps from each of the 20 cavities were gauged with a series of six
gauges having diameters of 1.000 inches, 1.003 inches, 1.006 inches, 1.009 inches,
1.012 inches, and 1.015 inches. The department participated in measuring diameters
with these plug gauges. It was first determined that the ID of a cap could be determined
to within 0.001 inches or 0.002 inches with this set of gauges. Then the ID of samples
of three caps were “measured” with them and ranges of ng = 3 computed. From these
measurements we computed

ŝ = R̄/d2 ≅ 0.003 inches

We already had plug gauges to measure 1.003 inches and 1.012 inches; these cor-
responded to NL gauges compressed by tŝ ≅ 0.003 (t = 1.0). We decided to use the plan:
ng = 10, t = 1.0, and c = 1; random samples of 10 would allow more representative sam-
pling of the 20 cavities.16 Other plans using different values of ng and c might have been
equally effective, of course.
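The range-based estimate ŝ = R̄/d2 is easy to reproduce; d2 = 1.693 for subgroups of three is the standard control-chart constant. The subgroup readings below are illustrative only, not the original gauge data:

```python
# Standard control-chart constants d2 for subgroup sizes 2-5.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_hat(subgroups):
    """Estimate process sigma as (average range) / d2."""
    n = len(subgroups[0])
    r_bar = sum(max(g) - min(g) for g in subgroups) / len(subgroups)
    return r_bar / D2[n]

# Illustrative cap-ID readings (inches), three caps per sample:
groups = [(1.006, 1.009, 1.012), (1.003, 1.006, 1.006), (1.006, 1.009, 1.012)]
est = sigma_hat(groups)   # close to the 0.003-inch estimate in the text
```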
Consequences
Much of the art of adjusting very expensive and complicated equipment was replaced
by cause-and-effect relationships. A production process that had been a source of daily
rejects, aggravation, and trouble began gradually to improve.
Then the department was expanded by purchasing equipment for the manufacture
of plastic bottles; the critical dimension was now the OD of the neck. The same general
approach that had been used for caps was still applicable; ring gauges when narrowed
by 0.003 inches from the LSL and USL proved satisfactory NL-gauges.
Soft plastic caps and bottles are produced in very large quantities and the cost of a
single unit is not great. There are serious reasons why the traditional manufacturing
controls discussed in (3) and (4) above are not satisfactory.
Industrial experiences with NL gauges have shown them practical and effective.

16. See item 5.



Case History 6.5


Chemical Titration

Introduction
Several multihead presses were producing very large quantities of a pharmaceutical
product in the form of tablets. It was intended to double the number of presses, which
would require an increase in the amount of titration and the number of chemists; this
was at a time when analytical chemists were in short supply. It was decided to explore
the possibility of applying gauging methods to the titration procedure. The following
NL-gauging plan was a joint effort of persons knowledgeable in titration procedures and
applied statistics.
Some Pertinent Information
1. Data establishing the variability of tablet-making presses were already in
the files; they were used to provide the estimate ŝ ≅ 0.125 grains.
2. Specifications on individual tablets were LSL = 4.5 grains, USL = 5.5 grains.
3. Then 6ŝ = 0.75 is less than the difference USL–LSL of 1.0 grains. A choice
of t = 1.2 with ŝ = 0.125 gives tŝ ≅ 0.15. The chemists agreed that this was
feasible. The lower NL-gauge value was then 4.65 grains, and the upper
NL-gauge value was 5.35 grains.
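The arithmetic behind these NL-gauge values can be checked in a few lines (the variable names are mine):

```python
sigma_hat = 0.125          # grains, estimated from the press records on file
t = 1.2
lsl, usl = 4.5, 5.5        # grains, individual-tablet specifications

assert (usl - lsl) > 6 * sigma_hat   # 1.0 > 0.75, so NL gauging is applicable

compression = t * sigma_hat          # about 0.15 grains
nl_lower = lsl + compression         # 4.65 grains
nl_upper = usl - compression         # 5.35 grains
```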
Sampling
It was proposed that we continue the custom of taking several samples per day from
each machine; then an OC curve of our NL-gauging plan was desired that would be
comparable to using X̄ and R charts with ng = 4 or 5. Some calculations led to the plan:
ng = 4, t = 1.2, and c = 1. (See Table 6.13 for the computation of the OC curve.)
A semiautomatic-charging machine was adjusted to deliver the titrant required to
detect by color change:
1. Less than 4.65 grains on the first charge
2. More than 5.35 grains on the second charge
Consequences
With minor adjustments, the procedure was successful from the beginning. The accu-
racy of the gauging method was checked regularly by titration to endpoint.
Shifts on individual presses were detected by the presence of tablets outside the NL
gauges, but usually within specifications. The presses could be adjusted to make tablets
to specifications. The required number of chemists was not doubled when the
increased number of tablet presses began operation—in fact the number was reduced
by almost half.

(Chart: hourly check marks for readings from 0.2377 to 0.2383 inches on November 2
and November 4, plotted against the two NL-gauge lines.)

Figure 6.13 Adjustment chart on a screw machine operation using NL-gauging principles.
Plan: n1 = 5, c = 1 with t = 1.5. Notes: (1) Sample failed n1 = 5, c = 1 criterion;
tool was reset.

Case History 6.6


Machine-Shop Dimensions
Some aircraft instruments use parts machined with great precision. It was customary in
this shop to measure critical dimensions to the nearest ten-thousandth of an inch. The
specifications on one part were 0.2377 to 0.2383 inches (a spread of 0.0006 inches). Data
on hand indicated a machine capability of ±0.0002, that is, an estimated ŝ = (0.0004)/6.
The realities of measurement would not permit using limits compressed less than
0.0001 inch; this corresponded to t ≅ 1.5. The plan was: n1 = 5, c = 1 and t = 1.5.
The toolmaker would measure a machined piece with a toolmaker’s micrometer and
indicate the reading by a check, as in Figure 6.13. This chart is a combination of a vari-
ables chart and an NL-gauge chart. The form was accepted and maintained by “old-line”
machine operators.
The machine operators were willing to make check marks above and below the NL-
gauge lines. Previously, their practice had been to flinch and make a sequence of three or
four consecutive check marks at USL = 0.2383. Then later rechecks of production at these
times would show oversized parts. The psychology of making checks outside NL gauges,
but within specifications, resulted in resetting the tool before rejects were machined.
Although no physical NL gauges were made for this particular machine-shop appli-
cation, the entire concept of adjusting the process was exactly that of NL gauging.

6.16 OC CURVES OF NL-GAUGE PLANS17


It is not easy at first to accept the apparent effectiveness of NL gauging with small sam-
ples. True, the OC curves in Figure 6.11 do indicate that some NL-gauging plans are
17. May be omitted by reader.


very comparable to ordinary X̄ and R control charts. However, actual experience is also
helpful in developing a confidence in them.
We used samples of ng = 4, 5, and 10 in preceding examples. It will be seen that OC
curves comparable to X̄ and R charts, ng = 4, can be obtained from NL-gauging plans using

ng = 4 or 5, t = 1.0 to 1.2, and c = 1

We show the method of deriving OC curves for ng = 5 and 10 in Table 6.11. Curves
for many other plans can be derived similarly.
Two different types of percents will be used in this discussion:
1. P will represent the percent outside the actual specification. It appears as
the abscissa in Figure 6.15. It is important in assessing the suitability of a
particular sampling plan. It appears as column 1 in Table 6.11.
2. P´ will represent the percent outside the NL gauge corresponding to each
value of P. It appears as column 4 in Table 6.11. It is an auxiliary percent
used with Table A.5 to determine probabilities PA.

Derivation of OC Curves
On the vertical scale in Figure 6.15 we show the probability of acceptance, PA; it repre-
sents the probabilities that the process will be approved or accepted without adjustment
under the NL-gauging plan for different values of P. To obtain PA, we use the binomial
distribution (Table A.5) in conjunction with the normal distribution as in Figure 6.14.
The steps in filling out Table 6.11 are as follows:
1. Select appropriate percents P out of specification and list them in column 1.
These are directly related to the position of the process mean μ relative to the
specification. For P to change, μ must shift.
2. From the normal table (Table A.1) obtain the standard normal deviate Z
corresponding to the percent out of specification P. List these in column 2.
3. Determine the corresponding standard normal deviate for the narrow-limit
gauge. This will always be Z´ = Z – t since the narrow-limit gauge will be a
distance tσ from the specification limit. List the values of Z´ in column 3.
4. Find the percent outside the narrow-limit gauge P´ from the value of Z´ shown
using the standard normal distribution (Table A.1). List these in column 4.
5. Find the probability of acceptance for the NLG plan at the specified percent
P out of specification by using the binomial distribution (Table A.5) with
percent nonconforming P´, sample size n, and acceptance number c given.
List these values in column 5. Note that, given column 4, PA can be
calculated for any other values of n and c. Additional plans are shown in
columns 5a and 5b.
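The five steps above can be carried out directly in a few lines of Python; as a sketch, `statistics.NormalDist` takes the place of Table A.1 and an explicit binomial sum the place of Table A.5 (the function name and layout are ours, for illustration only):

```python
from math import comb
from statistics import NormalDist

def nlg_acceptance_prob(p_spec, t, n, c):
    """Probability of acceptance PA for an NL-gauging plan.

    p_spec -- fraction of product outside the specification (P/100)
    t      -- gauge compression in sigma units
    n, c   -- sample size and acceptance number
    """
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - p_spec)       # step 2: Z for the percent outside spec
    z_prime = z - t                    # step 3: Z' = Z - t at the narrow-limit gauge
    p_prime = 1.0 - nd.cdf(z_prime)    # step 4: fraction outside the NL gauge
    # step 5: probability that c or fewer of the n pieces fall outside the gauge
    return sum(comb(n, k) * p_prime**k * (1.0 - p_prime)**(n - k)
               for k in range(c + 1))

# Plan A of Table 6.11 (n = 5, c = 1, t = 1.0) at P = 0.135 percent:
print(round(nlg_acceptance_prob(0.00135, 1.0, 5, 1), 3))   # 0.995
```

Small discrepancies with Table 6.11 in the fourth decimal place reflect the rounding of Z to two places in the table.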
Chapter 6: Sampling and Narrow-Limit Gauging 189

[Figure: normal curve on an X scale with the narrow-limit gauge NLG a distance tσ inside the upper specification limit USL; P´ and P are the tail areas beyond NLG and USL, with Z´ and Z the corresponding deviates on the Z scale.]

Figure 6.14 Deriving an OC curve for an NL-gauging plan (general procedure). (Detail given
in Table 6.11).

Table 6.11 Derivation of operating-characteristic curves for some NL-gauging plans with
gauge compressed by 1.0σ (t = 1.0)
Columns: (1) P = percent outside spec.; (2) Zp = Z value of P; (3) Z´ = Zp – t = Z value of P´;
(4) P´ = percent outside NLG; (5) PA for Plan A (n = 5, c = 1); (5a) PA for Plan B (n = 5, c = 2);
(5b) PA for Plan C (n = 10, c = 2)
0.1 3.09 2.09 1.83 0.9968 0.9999 0.9993
0.135 3.00 2.00 2.28 0.9950 0.9999 0.9987
1.0 2.33 1.33 9.18 0.9302 0.9933 0.9432
2.0 2.05 1.05 14.69 0.8409 0.9749 0.8282
5.0 1.64 0.64 26.11 0.6094 0.8844 0.4925
10.0 1.28 0.28 38.97 0.3550 0.7002 0.1845
15.0 1.04 0.04 48.40 0.2081 0.5300 0.0669

[Figure: OC curve of NL-gauging plan A (n = 5, c = 1, t = 1.0): PA on the vertical axis against P, percent outside specification limit, on the horizontal axis.]

Figure 6.15 OC curves of NL-gauging plan. (Data from Table 6.11.)



6. Plot points corresponding to column 1 on the horizontal axis and column 5 on


the vertical axis to show the OC curve (Figure 6.15).
Narrow-limit plans have been used in acceptance sampling (see Schilling and
Sommers18) and as a process control device. To compare the OC curve of the narrow-
limit procedure to that of a standard control chart, it is usually assumed that the process
is in control with a mean value positioned 3σ from the specification limits with P =
0.135 percent. Shifts in the process mean from that position will result in changes in
P. The relationship between the shift in the mean and the percent outside the 3σ
specification is shown in Table 6.12.
Thus, a shift in the mean of (3 – Z)σ will result in the percents defective shown in
column 1 of Table 6.13 for either type of chart. The probability of acceptance can then
be calculated and the charts compared (see Figure 6.11).
Note that the binomial distribution (rather than the Poisson approximation) is used
here because, for NLG applications, very often nP´ is greater than 5.
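The entries of Table 6.12 follow from a one-line normal-tail computation; a minimal sketch, assuming the Table 6.12 setup (mean initially 3σ inside the limit):

```python
from statistics import NormalDist

def pct_nonconforming(shift):
    """Percent outside a specification that starts 3 sigma from the mean,
    after the mean shifts toward it by `shift` sigma (Table 6.12 setup)."""
    return 100.0 * (1.0 - NormalDist().cdf(3.0 - shift))

# Reproduce the rows of Table 6.12:
for shift in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(shift, round(pct_nonconforming(shift), 2))
```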

Table 6.12 Percent of normally distributed product outside a 3σ specification from the nominal
mean of the control chart, for comparison of NLG to other control chart procedures.
Shift in mean (Z )   Distance to spec. in σ units (3 – Z )   Percent nonconforming
Nominal 0.0 3.0 0.135
0.5 2.5 0.62
1.0 2.0 2.28
1.5 1.5 6.68
2.0 1.0 15.87
2.5 0.5 30.85
3.0 0.0 50.00

Table 6.13 Deriving an OC curve for the NLG plan n = 4, t = 1.2, c = 1.
Columns: (1) P = percent outside spec.; (2) Zp = Z value of P; (3) Z´ = Zp – t = Z value of P´;
(4) P´ = percent outside NLG (t = 1.2); (5) PA for Plan D (n = 4, c = 1); PA for Shewhart X̄
chart (see Table 2.5)
0.135 3.0 1.8 3.59 99.26 99.87
0.62 2.5 1.3 9.68 95.08 97.72
2.28 2.0 0.8 21.19 80.07 84.13
6.68 1.5 0.3 38.21 50.63 50.00
15.87 1.0 –0.2 57.93 20.39 15.87
23.27 0.73 –0.47 68.00 9.96 6.68
30.85 0.5 –0.7 75.80 4.64 2.28
50.00 0.0 –1.2 88.49 0.56 0.135

18. E. G. Schilling and D. J. Sommers, “Two-Point Optimal Narrow-Limit Plans with Applications to MIL-STD-
105D,” Journal of Quality Technology 13, no. 2 (April 1981): 83–92.
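The comparison in Table 6.13 can be reproduced numerically. Under the setup just described (control limit at μ + 3σ/√n with μ originally 3σ below the specification), the X̄-chart acceptance probability reduces to Φ(3 − (3 − Z)√n); the NLG side uses the binomial sum as before. A sketch:

```python
from math import comb, sqrt
from statistics import NormalDist

nd = NormalDist()

def pa_nlg(z, t, n, c):
    """PA of an NL-gauging plan when the mean lies z sigma below the spec."""
    p_gauge = 1.0 - nd.cdf(z - t)      # fraction outside the narrow-limit gauge
    return sum(comb(n, k) * p_gauge**k * (1.0 - p_gauge)**(n - k)
               for k in range(c + 1))

def pa_shewhart_xbar(z, n):
    """PA of an X-bar chart with UCL = mu + 3*sigma/sqrt(n), the mean having
    started 3 sigma below the spec (the Table 6.13 assumption)."""
    return nd.cdf(3.0 - (3.0 - z) * sqrt(n))

# The Z = 2.0 row of Table 6.13 (P = 2.28 percent outside spec):
print(round(100 * pa_nlg(2.0, 1.2, 4, 1), 2))     # near 80.07
print(round(100 * pa_shewhart_xbar(2.0, 4), 2))   # near 84.13
```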

6.17 HAZARDS
There is a potential error in values of PA if our estimate of σ is in substantial error. In
practice, our estimate of σ might be in error by 25 percent, for example. Then instead
of operating on a curve corresponding to t = 1.0, we either operate on the curve corre-
sponding to t = 1.25 when our estimate is too large, or we operate on the curve cor-
responding to t = 0.75 when our estimate is too small. In either case, a difference of this
small magnitude does not appear to be an important factor.
Suppose that the portion of the distribution nearest the specification is not normally
distributed. In most instances, this is more of a statistical question than a practical one,
although there are notable exceptions in certain electronic characteristics, for example.
In machining operations, we have never found enough departure from a normal distri-
bution to be important except when units produced from different sources (heads, spin-
dles) are being combined. Even then, that portion of the basic curve nearest the
specification limit (and from which we draw our sample) is typically normally-shaped.
If the distribution is violently nonnormal, or if an error is made in estimating σ,
the NL-gauging system still provides control of the process, but not necessarily at the
predicted level.
In discussing a similar situation, Tippett19 remarks that there need not be too much
concern about whether there was an “accurate and precise statistical result, because in
the complete problem there were so many other elements which could not be accu-
rately measured.”
It should be noted that narrow-limit plans need not be limited to normal distribu-
tions of measurements. Since the OC curve is determined by the relationship of the pro-
portion of product beyond the specification limit P to the proportion of product outside
the narrow-limit gauge, P´, any distribution can be used to establish the relationship.
This can be done from probability paper or from existing tables of nonnormal distribu-
tions. For example, when a Pearson type III distribution is involved, the tables of
Salvosa20 can easily be used to establish the relationship, given the coefficient of skew-
ness a3 involved. The procedure would be similar to that described here (Table 6.13),
using the Salvosa table in place of the normal table for various values of Z.
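To illustrate establishing the P-to-P´ relationship for a nonnormal distribution: the Python standard library has no Pearson Type III CDF, so as a stand-in this sketch uses a hypothetical right-skewed lognormal process; the specification limit and lognormal shape parameter are invented for illustration:

```python
from math import exp, log, sqrt
from statistics import NormalDist

nd = NormalDist()
S_LN = 0.25   # hypothetical lognormal shape parameter (sigma of ln X)
sd = sqrt((exp(S_LN**2) - 1.0) * exp(S_LN**2))   # std dev of X itself

def tail(x):
    """Fraction of the lognormal process above x."""
    return 1.0 - nd.cdf(log(x) / S_LN)

usl, t = 2.0, 1.0               # hypothetical spec limit; gauge compressed 1 sigma
p = tail(usl)                   # P : fraction outside the specification
p_prime = tail(usl - t * sd)    # P': fraction outside the narrow-limit gauge

# The normal table would pair the same P with a different P':
z = nd.inv_cdf(1.0 - p)
p_prime_normal = 1.0 - nd.cdf(z - t)
print(round(p, 5), round(p_prime, 4), round(p_prime_normal, 4))
```

Here the skewed tail puts fewer pieces outside the gauge than the normal table predicts for the same P, so an OC curve computed from the normal relationship would overstate the plan's sensitivity; the lognormal P-to-P´ pairing would be used instead.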
Discussion: We have used NL-gauges in a variety of process control applications over
the years. From both experience and the underlying theory, we find that NL-gauges
offer a major contribution to the study of industrial processes even when it is possible
to use X̄ and R charts. There are different reasons that recommend NL-gauges:
1. Only gauging is required—no measurements.
2. Even for samples as small as five, the sensitivity is comparable to control
charts with samples of four and five.

19. L. H. C. Tippett, Technological Applications of Statistics (New York: John Wiley & Sons, 1950).
20. Luis R. Salvosa, “Tables of Pearson’s Type III Function,” Annals of Mathematical Statistics 1 (May 1930): 191ff.
See also Albert E. Waugh, Elements of Statistical Method (New York: McGraw-Hill, 1952): 212–15.

3. Record keeping is simple and effective. Charting the results at the machine
is simpler and faster than with X̄ and R charts. The number of pieces that
fail to pass the NL gauges can be charted by the operator quickly. Trends
in the process and shifts in level are often detected by the operator before
trouble is serious. Operator comprehension is often better than with X̄ and
R charts.

4. NL-gauge plans may be applied in many operations where X̄ and R charts
are not feasible, thereby bringing the sensitivity of X̄ and R charts to many
difficult problems.

6.18 SELECTION OF AN NL-GAUGING PLAN

The selection of an NL-gauging plan is similar to the selection of an X̄ and R plan. The
same principles are applicable. A sample size of five in either is usually adequate. A
sample of 10 will sometimes be preferred with NL-gauging; it provides more assurance
that the sample is representative of the process. We recommend the following plans, or
slight modifications of them, since their sensitivity corresponds closely to an X̄ and R
chart with ng = 5:

Plan A: ng = 5, t = 1.0, c = 1
Plan F: ng = 10, t = 1.2, c = 2
Plan D: ng = 4, t = 1.2, c = 1

Various charts and ideas presented in this chapter are reproduced from an article by
Ellis R. Ott and August B. Mundel, “Narrow-limit Gauging,” Industrial Quality Control
10, no. 5 (March 1954). They are used with the permission of the editor.
The use of these plans in process control has been discussed by Ott and Marash.21
Optimal narrow-limit plans are presented and tabulated by Edward G. Schilling and Dan
J. Sommers in their paper,22 and discussed in the next section.

6.19 OPTIMAL NARROW-LIMIT PLANS


An optimal narrow-limit plan, in the sense of minimum sample size to achieve a given
AQL and LTPD with associated producer’s risk α and consumer’s risk β, can be approx-
imated from the following relations. The method uses upper-tail standard normal deviates

21. E. R. Ott and S. A. Marash, “Process Control with Narrow-Limit Gauging,” Transactions of the 33rd ASQC
Annual Technical Conference (Houston, TX: 1979): 371.
22. E. G. Schilling and D. J. Sommers, “Two-Point Optimal Narrow-Limit Plans with Applications to MIL-STD-
105D,” Journal of Quality Technology 13, no. 2 (April 1981): 83–92.

zAQL, zLTPD, zα, and zβ from Table A.1 and follows from a procedure relating narrow-limit
and variables plans developed by Dan J. Sommers.23

n = 1.5[(zα + zβ)/(zAQL − zLTPD)]²

c = 0.75[(zα + zβ)/(zAQL − zLTPD)]² − 0.67 = 0.5n − 0.67

t = (zLTPD zα + zAQL zβ)/(zα + zβ)

When the conventional values of α = 0.05 and β = 0.10 are used, these equations
simplify to

n = [3.585/(zAQL − zLTPD)]²

c = 0.5n − 0.67

t = [zLTPD(1.645) + zAQL(1.282)]/2.927

For example, the plan n = 4, c = 1, t = 1.2 in Table 6.13 has an AQL of 0.62 percent
and an LTPD of 23.27 percent, and can be approximated as follows:

 3.585 
2

n=  = 4.10
 2.50 − 0.73 
c = 0.5 ( 4.10 ) − 0.67 = 1.38
0.73(1.645) + 2.50 (1.282 )
t= = 1.5
2.927

giving n = 4, c = 1, and t = 1.5, which is close enough for practical purposes.
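The worked example can be checked in a few lines; as a sketch, with `statistics.NormalDist` supplying the deviates of Table A.1 (the function name and rounding are ours):

```python
from statistics import NormalDist

nd = NormalDist()

def optimal_nl_plan(aql, ltpd, alpha=0.05, beta=0.10):
    """Approximate Schilling-Sommers optimal narrow-limit plan.
    aql and ltpd are given as fractions nonconforming."""
    z_aql, z_ltpd = nd.inv_cdf(1 - aql), nd.inv_cdf(1 - ltpd)
    z_a, z_b = nd.inv_cdf(1 - alpha), nd.inv_cdf(1 - beta)
    n = 1.5 * ((z_a + z_b) / (z_aql - z_ltpd)) ** 2
    c = 0.5 * n - 0.67
    t = (z_ltpd * z_a + z_aql * z_b) / (z_a + z_b)
    return round(n), round(c), round(t, 1)

# AQL = 0.62 percent, LTPD = 23.27 percent, as in the example:
print(optimal_nl_plan(0.0062, 0.2327))   # (4, 1, 1.5)
```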

6.20 PRACTICE EXERCISES


1. Copy Figure 6.1 onto a sheet of graph paper, using Table 6.1 for assistance.
Extend the chart by plotting similar curves for c = 1 and c = 0, holding
n = 45 constant.

23. Ibid.

2. Copy Figure 6.1 onto a second sheet of paper and extend the chart by plotting
similar curves for n = 25 and n = 100, holding c = 2 constant.
3. Write a short essay on the effect of varying n and c, generalizing from the
results of the above exercises. Prepare for oral presentation.
4. Using PA = 0.95 (producer risk of five percent) for AQL, PA = 0.50 for IQL,
and PA = 0.10 (consumer risk of 10 percent) for LTPD, set up a table of
the AQL, IQL, and LTPD for the five plans in exercises 1 and 2. (Note:
Current ANSI standards use LQ, meaning, “limiting quality” instead of
LTPD; IQL means indifference quality level.)
5. State the two important requirements for narrow-limit gauging to work.
6. What should be done with individual units that fail the NL gauge?
7. Derive OC curves for n = 10, c = 1, t = 1, and n = 10, c = 1, t = 1.5. Plot them
on a graph together with the three plans presented in Table 6.11. Use different
colors to clearly distinguish between the five curves. From study of these five
curves, write a general discussion of the effect of n, c, and t on an NLG plan,
and how they interact.
8. Given NLG plan n = 5, c = 1, t = 1, find probability of acceptance PA for a
fraction defective p of 0.03 (3 percent).
9. A process has been monitored with a control chart and shows good evidence
of control, with X̄ = 180 and R̄ = 5.4. The upper specification limit on this
process is 188. Given a sample size of 10, find (a) an upper limit on X̄ that
will provide a 90 percent chance of rejecting a lot with 10 percent defective,
and (b) a comparable NL-gauging plan using n = 10, c = 1. Illustrate the
relationships involved with sketches.
10. Sketch the OC curve of a conventional Shewhart X̄ chart with n = 5. For X̄
charts, assume that the control limit is set for μ = USL – 3 sigma. Then derive
the OC curve of an NLG plan with n = 4, t = 1.2, c = 1. Compare these two
OC curves and discuss trade-offs involved in using the Shewhart X̄ chart and
the NLG technique for ongoing process control. Consider the need for
maintaining an R chart concurrent with an NLG process control technique,
and consider as an alternative a double NLG plan that monitors both tails of
the process. (Note that such a procedure gives ongoing information about
process variability as well as fraction defective, and can be used in
conjunction with a two-sided specification limit.)
7
Principles and Applications
of Control Charts

7.1 INTRODUCTION
Process quality control is not new. It had its genesis in the conviction of Walter Shewhart
that “constant systems of chance causes do exist in nature” and that “assignable causes
of variation may be found and eliminated.”1 That is to say that controlled processes exist
and the causes for shifts in such processes can be found. Of course, the key technique in
doing so is the control chart, whose contribution is often as much in terms of a physical
representation of the philosophy it represents as it is a vital technique for implementa-
tion of that philosophy.
In its most prosaic form, we think of process quality control in terms of a continu-
ing effort to keep processes centered at their target value while maintaining the spread
at prescribed values. This is what Taguchi has called online quality control.2 But process
quality control is, and must be, more than that, for to attain the qualities desired, all ele-
ments of an organization must participate and all phases of development from concep-
tion to completion must be addressed. Problems must be unearthed before they blossom
and spread their seeds of difficulty. Causes must be found for unforeseen results. Action
must be taken to rectify the problems. And finally, controls must be implemented so the
problems do not reoccur. This is the most exciting aspect of process control: not the con-
trol, but the conquest. For nothing is more stimulating than the new. New ideas, new
concepts, new solutions, new methods—all of which can come out of a well-organized
and directed program of process quality control.

1. W. A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand, 1931).
2. G. Taguchi, “On-Line Quality Control during Production,” Japanese Standards Association (Tokyo, 1981).


7.2 KEY ASPECTS OF PROCESS QUALITY CONTROL


Process quality control may be addressed in terms of three key aspects:
1. Process control. Maintaining the process on target with respect to centering
and spread.
2. Process capability. Determining the inherent spread of a controlled process for
establishing realistic specifications, for comparative purposes, and so forth.
3. Process change. Implementing process modifications as part of process
improvement and troubleshooting.
These aspects of process control are depicted in Figure 7.1.
Naturally, these aspects work together in a coordinated program of process quality
control in that achievement of statistical control is necessary for a meaningful assessment
of capability and the analysis of capability of a process against requirements is often an
instrument of change. But changes may necessitate new efforts for control and the cycle
starts over again.
There is much more to process quality control than statistics. Yet statistics plays a
part all along the way. Interpretation of data is necessary for capability studies and for
achieving control. The statistical methodology involved in troubleshooting and design
of experiments is essential in affecting change. After all, processes are deaf, dumb,
blind, and usually not ambulatory. In other words, they are not very communicative. Yet
they speak to us over time through their performance. Add bad materials and the process
will exhibit indigestion. Tweak the controls and the process will say “ouch!” Yet in the
presence of the variation common to industrial enterprise, it is difficult to interpret these
replies without amplification and filtering by statistical methods designed to eliminate
the “noise” and focus on the real shifts in level or spread of performance.
One approach to the analysis of a process is continued observation so that, when
a variable changes, affecting the process, the resulting change in the performance of

[Figure: three interlocking circles labeled process change, process control, and process capability.]

Figure 7.1 Statistical process quality control.



the process will identify the cause. This is the essence of interpretation of data as a
means for determining and achieving process capability and control. Alternatively,
deliberate changes can be made in variables thought to affect the process. The result-
ing changes in performance identify and quantify the real effect of these changes. This
is a basic approach in troubleshooting and the use of design of experiments in process
improvement.
In any event, statistical analysis is necessary because it provides a communication
link between the process and the investigator that is unbiased and that transcends the
variation that bedevils interpretation of what the process is saying.
It is possible, then, to distinguish two types of investigation in the application of
statistics to process control:
• Interpretation. Listen to the process; detect signals from the process as to when
variables change.
• Experimentation. Talk to the process; perturb variables by experimental design.
The key is communication with the process through statistics.

7.3 PROCESS CONTROL


There are many ways to control a process. One way is through experience, but that takes
too long. Another is through intuition, but that is too risky. A third approach (all too
common) is to assume the process is well-behaved and not to bother with it; but that
may lead to a rude awakening. All these have their place but should be used judiciously
in support of a scientific approach to achieving and maintaining statistical control
through control charts.
It is the philosophy of use of control charts that is so important. The search for an
assignable or “special” cause and the measurement of inherent variation brought about
by “common” causes are at the heart of that philosophy. The purpose of control is to
identify and correct for assignable causes as they occur and thereby keep variation in
the process within its “natural” limits. In so doing, the control chart is used to test
whether the data represents random variation from stable sources and, if not, to help
infer the nature of the source(s) responsible for any nonrandomness.
There are many types and uses of control charts. The chart may be used for “stan-
dards given,” that is, to maintain future control when previous standards for the mean
and standard deviation have been established. Alternatively, they may be used to inves-
tigate and establish control using past and current data with “no standards given.”
Control limits and appropriate factors for variables and attributes charts for use in these
situations are shown in Table 7.1. Values of the factors are given in Table A.4.
In establishing control of a process, the effort is usually initiated with a “no stan-
dards given” control chart to investigate the process. This, then, allows development
of data on the process, which may be used to eliminate assignable causes, establish con-
trol, and to estimate process parameters. After 30 to 50 successive points have remained

Table 7.1 Factors for Shewhart charts, n = ng.

Standards given (μ, σ, p, c, or u given):

  Mean X̄:               UCL = μ + 3σ/√n = μ + Aσ;  centerline μ;  LCL = μ − 3σ/√n = μ − Aσ
  Standard deviation s:  UCL = B6σ;  centerline c4σ;  LCL = B5σ
  Range R:               UCL = D2σ;  centerline d2σ;  LCL = D1σ
  Proportion p̂:          UCL = p + 3√(p(1 − p)/n);  centerline p;  LCL = p − 3√(p(1 − p)/n)
  Number np̂:             UCL = np + 3√(np(1 − p));  centerline np;  LCL = np − 3√(np(1 − p))
  Defects ĉ:             UCL = c + 3√c;  centerline c;  LCL = c − 3√c
  Defects per unit û:    UCL = u + 3√(u/n);  centerline u;  LCL = u − 3√(u/n)

No standards given (limits computed against past data):

  Mean X̄ (using s):      UCL = X̿ + A3s̄;  centerline X̿;  LCL = X̿ − A3s̄
  Mean X̄ (using R):      UCL = X̿ + A2R̄;  centerline X̿;  LCL = X̿ − A2R̄
  Standard deviation s:  UCL = B4s̄;  centerline s̄;  LCL = B3s̄
  Range R:               UCL = D4R̄;  centerline R̄;  LCL = D3R̄
  Proportion p̂:          UCL = p̄ + 3√(p̄(1 − p̄)/n);  centerline p̄;  LCL = p̄ − 3√(p̄(1 − p̄)/n)
  Number np̂:             UCL = np̄ + 3√(np̄(1 − p̄));  centerline np̄;  LCL = np̄ − 3√(np̄(1 − p̄))
  Defects ĉ:             UCL = c̄ + 3√c̄;  centerline c̄;  LCL = c̄ − 3√c̄
  Defects per unit û:    UCL = ū + 3√(ū/n);  centerline ū;  LCL = ū − 3√(ū/n)

in control, it is possible to establish (essentially) known, stable values of the process


parameters. Given these “known” constants, charts can be set up for continuing control
using “standards given” limits, which incorporate these process parameters or targets
developed from them. These values would be known going into construction of the
chart. An easy way to determine whether “standards given” or “no standards given”
limits should be employed is to ask the question, “can the chart be constructed without
taking any (further) data?” If the answer is “yes,” a “standards given” chart should be
used. If the answer is “no,” a “no standards given” chart is in order.
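As one instance of the Table 7.1 formulas, the attributes limits are simple enough to compute directly; a sketch with an invented standard p = 0.05 and subgroup size n = 100:

```python
from math import sqrt

def p_chart_limits(p, n):
    """Three-sigma p-chart limits per Table 7.1; p is the given standard
    (or p-bar from past data in the no-standards-given case)."""
    half_width = 3.0 * sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), p, p + half_width   # LCL, CL, UCL

lcl, cl, ucl = p_chart_limits(0.05, 100)
print(round(lcl, 4), cl, round(ucl, 4))   # 0.0 0.05 0.1154
```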

7.4 USES OF CONTROL CHARTS


Control charts may be used in making judgments about the process, such as establish-
ing whether the process was in a state of control at a given time. This is useful in deter-
mining the capability of the process. Again, they may be used in an ongoing effort to
maintain the centering and spread of the process, that is, in maintaining control. Control
charts may also be used to detect clues for process change. This is at the heart of process
improvement and troubleshooting.
According to W. E. Deming,3 in all aspects of process control, it is desirable to dis-
tinguish between two types of study:
• Enumerative study. The aim is to gain better knowledge about material in
a population.
• Analytic study. The aim is to obtain information by which to take action on
a cause system that has provided material in the past and will produce material
in the future.
Process control studies are by nature analytic. The objective is to characterize the
process at a given point in time and not necessarily the product that is being produced.
Therefore, the sampling procedures are not necessarily those of random sampling from
the population or lot of product produced over a given period. Rather, the samples are
structured to give sure and definitive signals about the process. This is the essence of
rational subgrouping as opposed to randomization. It explains why it is reasonable to
take regular samples at specific intervals regardless of the quantity of product produced.
It also indicates why successive units produced are sometimes taken as a sample for
process control, rather than a random selection over time.

7.5 RATIONAL SUBGROUPS


Use of the control chart to detect shifts in process centering or spread requires that the
data be taken in so-called rational subgroups. These data sets should be set up and taken
in such a way that variation within a subgroup reflects nonassignable random variation
only, while any significant variation between subgroups reflects assignable causes.
Experience has shown that the reasons for an assignable cause can be found and will
give insight into the shifts in process performance that are observed.
Rational subgrouping must be done beforehand. Control charts are no better than
the effort expended in setting them up. This requires technical knowledge about the
process itself. It requires answers to such questions as:
• What do we want the chart to show?

3. W. E. Deming, Some Theory of Sampling (New York: Dover Publications, 1966). See especially Chapter 7,
“Distinction between Enumerative and Analytic Studies.”

• What are the possible sources of variation in the process?


• How shall nonassignable or random error be measured?
• What should be the time period between which samples are taken?
• What sources can be combined in one chart and which sources should be split
among several charts?
The answers to these and similar questions will determine the nature of the sam-
pling and the charting procedure.

7.6 SPECIAL CONTROL CHARTS


Table 7.1 shows how to compute control limits for the standard charts with which most
people engaged in industrial use of statistics are familiar. There are some charts, how-
ever, that are well suited to specific situations in which process control is to be applied.

7.7 MEDIAN CHART


Of particular importance for in-plant control is the median chart, in which the median
is plotted in lieu of X̄ on a chart for process location. While special methods have been
developed for the construction of such charts, using the median range, for example,4,5 it
is very simple to convert standard X̄ chart limits to limits for the median. In so doing,
familiar methods are used for constructing the X̄ limits, which are then converted to
median limits, and the person responsible for upkeep of the chart simply plots the
median X̃ along with the range R sample by sample. Calculation of the limits is straight-
forward since it is by the standard methods, which may be available on a calculator or
computer. The calculation of the limits is transparent to the operator, who plots statistics
that are commonplace yet rich in intuitive meaning. Even when displayed by a com-
puter terminal at a workstation, the median and range charts are meaningful in that they
display quantities that are well known to the operator, allowing concentration on the
philosophy rather than the mechanics of process control.

The conversion of an X̄ chart to a median chart is simplified by Table 7.2a, which
presents three forms of conversion:

1. Widen the X̄ limits by a multiple W. That is, if the limits are

   X̄ ± 3σ̂/√ng,

4. E. B. Farrell, “Control Charts Using Midranges and Medians,” Industrial Quality Control 9, no. 5 (March 1953):
30–34.
5. P. C. Clifford, “Control Charts without Calculations,” Industrial Quality Control 15, no. 11 (May 1959): 40-44.

widen them to

   X̄ ± W(3σ̂/√ng).

Keep the sample size the same.

2. Use the factor ZM in the place of 3 in the limits for X̄. That is, if the limits are

   X̄ ± 3σ̂/√ng,

use

   X̄ ± ZMσ̂/√ng

and plot medians on the new chart. Keep the sample size the same.


Table 7.2a Factors for conversion of X̄ chart into median chart.
n = ng   Widen by, W   Factor, ZM   Alternate sample size, nM   Efficiency, E
2 1.00 3.00 2 1.000
3 1.16 3.48 5 0.743
4 1.09 3.28 5 0.828
5 1.20 3.59 8 0.697
6 1.14 3.41 8 0.776
7 1.21 3.64 11 0.679
8 1.16 3.48 11 0.743
9 1.22 3.67 14 0.669
10 1.18 3.53 14 0.723
11 1.23 3.68 17 0.663
12 1.19 3.56 17 0.709
13 1.23 3.70 20 0.659
14 1.20 3.59 21 0.699
15 1.23 3.70 23 0.656
16 1.20 3.61 24 0.692
17 1.24 3.71 27 0.653
18 1.21 3.62 27 0.686
19 1.24 3.72 30 0.651
20 1.21 3.64 30 0.681
∞ 1.25 3.76 1.57n 0.637
Source: Computed from efficiencies E given by W. J. Dixon and F. J. Massey, Jr., Introduction to Statistical
Analysis, 2nd ed. (New York: McGraw-Hill, 1957): Table A.8b4.


Table 7.2b Factors for conversion of X̄ chart into midrange chart.
n = ng   Widen by, W   Factor, ZMR   Alternate sample size, nMR   Efficiency, E
2 1.00 3.00 2 1.000
3 1.04 3.13 4 0.920
4 1.09 3.28 5 0.838
5 1.14 3.42 7 0.767
Source: Computed from efficiencies E given by W. J. Dixon and F. J. Massey, Jr., Introduction to Statistical
Analysis, 2nd ed. (New York: McGraw-Hill, 1957): Table A.8b4.


3. Keep the limits the same as for the X̄ chart but increase the sample size
from ng to nM. This is useful when the location of the limits has special
significance, such as in converting modified control limits or acceptance
control limits.
It should be emphasized that median charts assume the individual measurements
upon which the chart is based to have a normal distribution.
Sometimes it is desirable to plot the midrange, that is, the average of the extreme
observations in a subgroup, rather than the median. This procedure actually has greater
efficiency than the median for sample sizes of five or less.
Standard X̄ charts may be converted to use of the midrange by using the factors
shown in Table 7.2b in a manner similar to the conversion of the median.

Conversion of an X̄ chart to a median chart may be illustrated using the statistics of
mica thickness data compiled in Table 7.3 for k = 40 samples of size ng = 5. For each
sample, the mean X̄, median X̃, range R, and standard deviation s are shown. Using the
means and ranges, the control limits for an X̄ and R chart are found to be

UCLX̄ = X̿ + A2R̄ = 11.15 + 0.58(4.875) = 13.98
CLX̄ = X̿ = 11.15
LCLX̄ = X̿ − A2R̄ = 11.15 − 0.58(4.875) = 8.32

Table 7.3 Mean, median, range, and standard deviation of mica thickness.
Sample 1 2 3 4 5 6 7 8 9 10

X̄ 10.7 11.0 11.9 13.1 11.9 14.3 11.7 10.7 12.0 13.7
X̃ 11.5 10.5 12.5 13.5 12.5 14.0 11.5 11.0 11.5 14.5
R 4.0 3.5 7.5 3.5 4.5 5.0 6.0 5.5 4.0 6.0
s 1.72 1.50 3.03 1.56 1.71 1.99 2.49 2.11 1.54 2.64
Sample 11 12 13 14 15 16 17 18 19 20

X̄ 9.8 13.0 11.7 9.6 12.0 11.9 11.7 11.1 10.0 11.0
X̃ 9.5 13.0 10.0 9.5 12.5 12.0 11.5 10.0 10.5 10.5
R 3.5 3.0 6.0 8.0 4.5 7.5 4.0 7.5 3.5 5.0
s 1.30 1.12 2.82 3.31 1.73 2.75 1.60 3.05 1.41 1.84
Sample 21 22 23 24 25 26 27 28 29 30

X̄ 12.8 9.7 9.9 10.1 10.7 8.9 10.7 11.6 11.4 11.2
X̃ 13.0 10.0 10.5 10.0 10.5 8.5 10.5 11.0 12.0 10.5
R 3.5 6.0 3.5 6.0 5.5 6.5 2.0 4.5 2.0 6.5
s 1.35 2.33 1.56 2.33 2.51 2.63 0.84 1.78 0.89 2.44
Sample 31 32 33 34 35 36 37 38 39 40

X̄ 11.1 8.6 9.6 10.9 11.9 12.2 10.3 11.7 10.1 10.0
X̃ 11.0 8.0 10.0 10.5 13.0 12.0 10.0 12.0 9.5 10.0
R 3.5 4.5 5.0 8.5 5.0 7.0 1.5 3.5 4.0 4.5
s 1.29 1.92 1.98 3.27 2.27 3.05 0.67 1.44 1.82 1.70

and
UCLR = D4R̄ = 2.11(4.875) = 10.29
CLR = R̄ = 4.875
LCLR = D3R̄ = 0(4.875) = 0


Point 6 is out of control on the X̄ chart while the R chart appears in control against
its limits (see Figure 2.5). The X̄ chart may be converted to a median chart using

UCLM = X̿ + WA2R̄ = 11.15 + 1.20(0.58)(4.875) = 14.54
CLM = X̿ = 11.15
LCLM = X̿ − WA2R̄ = 11.15 − 1.20(0.58)(4.875) = 7.76

The median chart is plotted in Figure 7.2. The tenth point is just in control on the

median chart as it is on the X chart. The sixth point is also barely in control as is the thirty-

second point. The fluctuations are roughly the same as in the X chart, but the median chart

has an efficiency of 70 percent compared to the X chart (see Table 7.2), which accounts
for the lack of indication of an out-of-control condition on the sixth point.
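The limit arithmetic above is easy to script. The following sketch (using the n = 5 constants quoted in the text: A2 = 0.58, D3 = 0, D4 = 2.11, and the median factor W = 1.20) reproduces the X̄, R, and median chart limits for the mica data:

```python
# A sketch, assuming the n = 5 control chart constants quoted in the text.
xbarbar = 11.15   # grand mean of the 40 sample means
rbar = 4.875      # average range
A2, D3, D4, W = 0.58, 0.0, 2.11, 1.20

# X-bar chart limits
ucl_xbar = xbarbar + A2 * rbar          # ~ 13.98
lcl_xbar = xbarbar - A2 * rbar          # ~ 8.32

# R chart limits
ucl_r, lcl_r = D4 * rbar, D3 * rbar     # ~ 10.29 and 0

# Median chart limits: the factor W widens the X-bar limits
ucl_med = xbarbar + W * A2 * rbar       # ~ 14.54
lcl_med = xbarbar - W * A2 * rbar       # ~ 7.76
```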

7.8 STANDARD DEVIATION CHART


As an alternative to the range chart, a standard deviation chart (s chart) is sometimes
calculated. Such charts are particularly well adapted to computation by the computer.

Figure 7.2  Median chart for mica thickness (UCLM = 14.54, CL = 11.15, LCLM = 7.76).


204 Part II: Statistical Process Control

Figure 7.3  s chart for mica thickness (UCLs = 4.14, CLs = 1.98).

Factors for the construction of an s chart are given in Table 7.1. Its construction may be
illustrated using the mica data from Table 7.3 as follows:

    s̄ = 1.98

    UCLs = B4s̄ = 2.089(1.98) = 4.14

    CLs = s̄ = 1.98

    LCLs = B3s̄ = 0

The resulting s chart is shown in Figure 7.3. The s chart appears in control as does
the R chart.
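As a quick check of the s chart computation (a sketch; B3, B4, and c4 are the n = 5 constants from Table 7.1, with c4 also used later in the chapter to unbias s̄):

```python
# A sketch, assuming the n = 5 factors B3 = 0, B4 = 2.089, c4 = 0.9400.
sbar = 1.98
B3, B4, c4 = 0.0, 2.089, 0.9400

ucl_s = B4 * sbar        # ~ 4.14
lcl_s = B3 * sbar        # 0
sigma_hat = sbar / c4    # ~ 2.11, the unbiased estimate of sigma
```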

7.9 ACCEPTANCE CONTROL CHART


Shewhart control charts for monitoring a process are usually set up using sample sizes
of 4 or 5 and 3σ control limits. Little attention is directed toward the β risk of missing
a shift of a given size, and the α risk of an incorrect signal when the process has not
changed is simply set at about 0.003. These standard values were arrived at through
empirical studies conducted by Shewhart and others at the Western Electric Hawthorne
plant and elsewhere. Operators were able to determine real-life assignable causes from
control signals when these values were used. The sensitivity of the chart was just about
right. It worked!
There are conditions, however, where it is desirable to incorporate the α and β risks
into the limits of a control chart. This is particularly true in troubleshooting a process
where smaller assignable causes are to be detected. Sometimes a chart is needed that
will accept the process when it is operating at specified levels and reject the process
otherwise. These are examples of processes for which specific levels must be adhered
to with known, fixed risks. The sample size and the acceptance constant are then derived
from statistical calculations based on the OC curve.
Acceptance control charts are particularly well adapted to troubleshooting in
process control. Unlike the conventional Shewhart chart, the acceptance control chart
fixes the risk β of missing a signal when the process mean goes beyond a specified
rejectable process level (RPL). It also incorporates a specified risk α of a signal occurring
when the process mean is within a specified acceptable process level (APL). Since
in troubleshooting and improvement studies it is essential that aberrant levels be
detected, the acceptance control chart is a versatile tool in that β can be set at appropriate
levels. Acceptance control charts can be used for continuing control of a process
when it is desirable to fix the risks of a signal. They also serve as an acceptance-control
device when interest is centered on acceptance or rejection of the process that
produced the product, that is, Type B sampling for analytic purposes. These charts are
ordinarily used when the standard deviation is known and stable. A control chart for R
or s is ordinarily run with the acceptance control chart to assure the constancy of
process variation.
An acceptance control chart is set up as follows:
1. Determine the sample size as

       ng = [(Zα + Zβ)σ / (RPL − APL)]²

   where Zγ is the upper-tail normal deviate for probability γ. A few values
   of γ are

       Risk γ     Zγ
       0.10      1.282
       0.05      1.645
       0.025     1.960
       0.01      2.326
       0.005     2.576

   Note that a risk of α/2 should be used if the chart is to be two-sided. The β risk
   is not divided by 2 since it applies to only one side of the process at any time.
2. Determine the acceptance control limit (ACL) as follows:
   a. APL given

          Upper ACL = APL + Zασ/√ng

          Lower ACL = APL − Zασ/√ng


   b. RPL given

          Upper ACL = RPL − Zβσ/√ng

          Lower ACL = RPL + Zβσ/√ng

c. A nominal centerline (CL) is often shown halfway between the upper and
lower ACL for a two-sided chart. It is sometimes necessary to work with
either the APL or the RPL, whichever is more important. The other value
will then be a function of sample size and may be back-calculated from the
relationships shown. That is why two sets of formulas are given. In normal
practice, when both the APL and RPL with appropriate risks are set, the
sample size is determined and either of the sets of formulas may be used to
determine the ACL.
3. When α = 0.05 and β = 0.10, these formulas become:
   a. Single-sided limit

          Zα = 1.645   Zβ = 1.282

          ng = 8.567σ² / (RPL − APL)²

   b. Double-sided limit

          Zα/2 = 1.960   Zβ = 1.282

          ng = 10.511σ² / (RPL − APL)²

These risks are quite reasonable for many applications of acceptance control charts.
The development and application of acceptance control charts are detailed in the semi-
nal paper by R. A. Freund.6 Application of the procedure to attributes data is discussed
by Mhatre, Scheaffer, and Leavenworth,7 and is covered later in this section.
Consider the mica data given in Table 7.3. For these data, purchase specifications
were 8.5 to 15 mils with an industry allowance of five percent over and five percent
under these dimensions. Thus

Upper RPL = 1.05(15) = 15.75

6. R. A. Freund, “Acceptance Control Charts,” Industrial Quality Control 14, no. 4 (October 1957): 13–23.
7. S. Mhatre, R. L. Scheaffer, and R. S. Leavenworth, “Acceptance Control Charts Based on Normal Approximations
to the Poisson Distribution,” Journal of Quality Technology 13, no. 4 (October 1981): 221–27.

Upper APL = 15
Lower APL = 8.5
Lower RPL = 0.95(8.5) = 8.075

The previously calculated s chart indicates that the process is stable with s̄ = 1.98.
But s̄ is a biased estimate of σ, as revealed by the factor for the centerline of a "standards
given" s chart based on known σ. We see that

    s̄ = c4σ

so an unbiased estimate of σ is

    σ̂ = s̄/c4 = 1.98/0.9400 = 2.11

For risks α = 0.10 and β = 0.10 the sample size required is

    Upper limit:  ng = [(1.282 + 1.282)(2.11) / (15.75 − 15.00)]² = 52

    Lower limit:  ng = [(1.282 + 1.282)(2.11) / (8.5 − 8.075)]² = 162

Suppose for convenience we take ng = 50. Then the acceptance control limits, using
the RPL formulas, are

    Upper ACL = 15.75 − 1.282(2.11)/√50 = 15.37

    Lower ACL = 8.075 + 1.282(2.11)/√50 = 8.46

The risks on the upper limit will be essentially held constant by this procedure,
because we selected the sample size appropriate to that limit. Only the RPL will be held
constant for the lower limit since the ACL was calculated from the RPL formula. The
risk at the APL will have changed, but may be back-calculated as follows:

    ACL = APL − Zασ/√ng

so with some algebra

    Zα = (APL − ACL)√ng / σ = (8.50 − 8.46)√50 / 2.11 = 0.13

and from the normal table

    α = 0.4483

The new lower APL having α = 0.10 risk is

    New lower APL = ACL + Zασ/√ng = 8.46 + 1.282(2.11)/√50 = 8.84

rather than 8.46. Clearly some trade-offs are in order. If we proceed with the chart, plot-
ting the mean of successive samples of 50 represented by each row of Table 7.3, the
chart would appear as in Figure 7.4.

Figure 7.4  Acceptance control chart for mica thickness (upper ACL = 15.37, lower ACL = 8.46, CL = 11.91).
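The sample-size and ACL computations of this example can be sketched as follows (the function name is illustrative; the numeric values follow the text):

```python
# A sketch of the acceptance control chart computations for the mica example.
import math

sigma = 2.11                      # unbiased estimate of sigma from the s chart
z_alpha = z_beta = 1.282          # alpha = beta = 0.10

def sample_size(apl, rpl):
    """ng = [(Z_alpha + Z_beta) * sigma / (RPL - APL)]^2"""
    return ((z_alpha + z_beta) * sigma / (rpl - apl)) ** 2

n_upper = sample_size(15.00, 15.75)   # ~ 52
n_lower = sample_size(8.5, 8.075)     # ~ 162 (sign of the difference cancels)

# Taking ng = 50 for convenience, the ACLs from the RPL formulas:
ng = 50
upper_acl = 15.75 - z_beta * sigma / math.sqrt(ng)   # ~ 15.37
lower_acl = 8.075 + z_beta * sigma / math.sqrt(ng)   # ~ 8.46
```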



Acceptance Control Charts for Attributes


Mhatre, Scheaffer, and Leavenworth8 have adapted the acceptance control chart to
attributes data using the Poisson distribution and a square-root transformation. Given
APL, RPL, a, and b, the acceptance control limit (ACL) and sample size (ng) can be
determined as follows where APL and RPL are expressed in terms of defects per unit.

    ng = 0.25(Zα + Zβ)² / (√RPL − √APL)²

    ACL = [√(ng(APL)) + Zα/2]²

As an example, suppose APL = 1.0, RPL = 4.0, α = 0.05, and β = 0.025. The acceptance
control chart would be constructed as follows.

    ng = 0.25(1.64 + 1.96)² / (√4.0 − √1.0)² = 3.24 ≈ 3

    ACL = [√(3.24(1.0)) + 1.64/2]² = 6.86 ≈ 7

The authors recommend use of the square-root approach over a simple normal
approximation. It provides better protection against a Type I error. Formulas for the
simple normal approximation, which tends to give better protection against a Type II
error, are as follows.

    ng = [(Zα√APL + Zβ√RPL) / (RPL − APL)]²

    ACL = ng(APL) + Zα√(ng(APL))

In the above example, this would result in a comparable sample size and acceptance
control limit of

8. Ibid.

 1.64 1.0 + 1.96 4.0 


2

ng =   = 3.43 ≈ 3
 4.0 − 1.0 

ACL = 3.43(1.0 ) + 1.64 3.43(1.0 ) = 6.47 ≈ 6

This approach is based on the Poisson distribution, which is appropriate when deal-
ing with defect count data.
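Both versions of the attributes computation can be sketched directly from the formulas above, using the APL = 1.0, RPL = 4.0 example:

```python
# A sketch of the Mhatre-Scheaffer-Leavenworth computations: the square-root
# (Poisson) version and the simple normal approximation.
import math

apl, rpl = 1.0, 4.0        # defects per unit
z_a, z_b = 1.64, 1.96      # alpha = 0.05, beta = 0.025

# Square-root transformation approach
ng_sqrt = 0.25 * (z_a + z_b) ** 2 / (math.sqrt(rpl) - math.sqrt(apl)) ** 2
acl_sqrt = (math.sqrt(ng_sqrt * apl) + z_a / 2) ** 2   # ~ 6.86

# Simple normal approximation
ng_norm = ((z_a * math.sqrt(apl) + z_b * math.sqrt(rpl)) / (rpl - apl)) ** 2
acl_norm = ng_norm * apl + z_a * math.sqrt(ng_norm * apl)   # ~ 6.47
```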

ARL and the Acceptance Control Chart


In setting the parameters for an acceptance control chart, it is sometimes desirable to
consider the average run length (ARL) associated with the APL and RPL. This is easily
done by converting the average run length to the associated risk by means of the probability
of detection, PD, using the relationship

    ARL = 1/PD   and   PD = 1/ARL

giving

    α = 1/ARL   and   β = 1 − 1/ARL

and so we have

    ARL     PD      α risk    β risk
    100     0.01    0.01      (0.99)
    50      0.02    0.02      (0.98)
    20      0.05    0.05      (0.95)
    10      0.10    0.10      (0.90)
    5       0.20    0.20      (0.80)
    2       0.50    0.50      0.50
    1.25    0.80    (0.80)    0.20
    1.11    0.90    (0.90)    0.10
    1.05    0.95    (0.95)    0.05
    1.02    0.98    (0.98)    0.02
    1.01    0.99    (0.99)    0.01

For example, when the mica process is running at the upper APL of 15, sampling
with a risk of α = 0.10, it would take an average of 10 points before a signal occurs.
Using the relationship between risk and ARL it is possible to derive an acceptance control
chart based solely on ARLs by specifying them and converting the ARLs into risks.
Thus, suppose in the mica example it was desired to have an upper RPL of 15.75 with
an ARL of 1.11 and an upper APL of 15 with an ARL of 10. These would convert to a
risk of α = 1/10 = 0.10 at the APL and β = 1 − 1/1.11 = 0.10 at the RPL. These values
of risk would be put in the standard formulas to obtain the parameters of the appropriate
acceptance control chart.
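The ARL-to-risk conversion is a one-liner in each direction; a minimal sketch:

```python
# A sketch of the ARL-to-risk conversion: alpha = 1/ARL at the APL,
# beta = 1 - 1/ARL at the RPL.
def alpha_from_arl(arl):
    return 1.0 / arl

def beta_from_arl(arl):
    return 1.0 - 1.0 / arl

# Mica example: ARL of 10 at the APL, ARL of 1.11 at the RPL
alpha = alpha_from_arl(10)    # 0.10
beta = beta_from_arl(1.11)    # ~ 0.10
```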

Figure 7.5  Modified control limits (centerlines set 3σ inside USL and LSL, limits set back a distance 3σX̄).

7.10 MODIFIED CONTROL LIMITS


It should be emphasized that acceptance control charts are not modified limit control
charts, in that they incorporate stated risks and allowable process levels. Acceptance
control charts are oriented toward the process, whereas modified limit charts are designed
to detect when nonconforming product is being produced. Modified limits are set directly
from the specifications (USL and LSL) as in Figure 7.5.
The process is assumed to be normally distributed. A nominal centerline of a modified
limit chart is then set 3σ in (on the good side) from the specification limit. The
control limit is set back toward the specification limit a distance 3σX̄. In this way,
signals will not be given unless the process is centered so close to the specification that
nonconforming product may be produced. It will be seen that the modified control
limits are simply

    Upper modified limit:  USL − 3σ + 3σ/√ng = USL − 3(1 − 1/√ng)σ

    Lower modified limit:  LSL + 3σ − 3σ/√ng = LSL + 3(1 − 1/√ng)σ

Clearly there is no consideration as to the selection of sample size or the process lev-
els that might be regarded as acceptable or rejectable with certain risks as in acceptance
control charts. If modified limits are used with the mica data of Table 7.3, for samples of
five the limits are:

 1 
Upper modified limit: 15.75 − 3 1 −  2.11 = 12.25
 5
 1 
Lower modified limit: 8.075 + 3 1 −  2.11 = 11.57
 5

Inspection of the means shown for the 40 samples indicates that only 11 samples
would produce averages inside the modified limits. This is because the process is incapable
of meeting the specification limits, since

    (USL − LSL)/σ = (15.75 − 8.075)/2.11 = 3.64

instead of 6, which would indicate marginal capability.
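The modified-limit computation for the mica data can be sketched as follows:

```python
# A sketch of the modified-limit computation for the mica data,
# using sigma-hat = 2.11 and samples of ng = 5.
import math

usl, lsl = 15.75, 8.075
sigma, ng = 2.11, 5

offset = 3 * (1 - 1 / math.sqrt(ng)) * sigma
upper_ml = usl - offset          # ~ 12.25
lower_ml = lsl + offset          # ~ 11.57

# Capability check: (USL - LSL)/sigma should be about 6 for marginal capability
ratio = (usl - lsl) / sigma      # ~ 3.64, so the process is incapable
```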

7.11 ARITHMETIC AND EXPONENTIALLY WEIGHTED MOVING AVERAGE CHARTS
Often, data are not obtained in successive subsamples but come naturally in a sequence
of single observations. Sometimes the subsample data are lost or not recorded so that
all that is available is a series of means or ranges. The daily temperatures listed in the
newspaper hour by hour are of this form. Such data are often analyzed by arithmetic
moving average and range charts in which successive subgroups (often of size ng = 2)
are formed by deleting the earliest observation from a subgroup and appending the next
available observation to obtain successive arithmetic moving averages. The resulting
subgroups can be analyzed (approximately) by the standard methods.
A moving range chart was illustrated in Section 3.2 as a check for outliers. A mov-
ing average chart can be constructed in similar fashion. Moving averages are obtained
by deleting the earliest observation from a subgroup, appending the next consecutive
observation, and averaging the observations in the new subgroup. Using the data from
Table 3.2, this would produce moving averages of subgroup size 2 or 3 as follows:
    i     Xi      MR (ng = 2)   MX̄ (ng = 2)   MX̄ (ng = 3)
    1     0.110   —             —             —
    2     0.070   0.040         0.0900        —
    3     0.110   0.040         0.0900        0.0967
    4     0.105   0.005         0.1075        0.0950
    5     0.100   0.005         0.1025        0.1050
    6     0.115   0.015         0.1075        0.1067
    ...   ...     ...           ...           ...


The moving averages and ranges would then be plotted in the form of X̄ and R
charts using the standard formulas for limits with a sample size ng equal to the subgroup
sample size chosen. Subgroup sample sizes of ng = 2 are quite common, with other subgroup
sizes (such as ng = 3) chosen to match the physical circumstances surrounding
data collection. For example, a subgroup size of ng = 5 would produce data averaged
over the span of a work week.
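The moving-range and moving-average construction above can be sketched for the six observations quoted from Table 3.2:

```python
# A sketch: moving ranges and moving averages of span 2 and 3
# for the first six observations from Table 3.2.
x = [0.110, 0.070, 0.110, 0.105, 0.100, 0.115]

# Each new subgroup drops the earliest observation and appends the next one.
mr2 = [abs(b - a) for a, b in zip(x, x[1:])]          # moving ranges, ng = 2
ma2 = [(a + b) / 2 for a, b in zip(x, x[1:])]         # moving averages, ng = 2
ma3 = [sum(x[i:i + 3]) / 3 for i in range(len(x) - 2)]  # moving averages, ng = 3
```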
An alternative to the arithmetic moving average is the exponentially weighted moving
average (EWMA) chart developed by Roberts,9 originally called the geometric
moving average chart, in which a cumulative score is developed that weights earlier
observations successively less than subsequent observations, in such a way as to
automatically phase out distant observations almost entirely. It is particularly useful for a
continuing series of individual observations, and also when the observations to be
plotted on the chart are not independent, as in the case of the hourly temperatures.
To set up an exponentially weighted moving average chart, proceed as follows:
1. It is convenient to set Z0 equal to μ0, where μ0 is the ordinate of the central line of
   the control chart. Thus, Z0 = μ0 and for each point Zt, plotted at time t, calculate

       Zt = rxt + (1 − r)Zt−1

   where Zt−1 is the weighted value at the immediately preceding time and r,
   0 < r < 1, is the weight factor between the immediate observation xt and the
   preceding weighted value. It can be shown that the EWMA is a special class
   of time-series model referred to as an ARIMA(0,1,1), or IMA(1,1), model.10,11
   Typically, r = 0.25 or 0.40.

9. S. W. Roberts, “Control Chart Tests Based on Geometric Moving Averages,” Technometrics 1, no. 3 (August
1959): 239–250.
10. The EWMA can be used when the data are autocorrelated, that is, the current observation is correlated with those
    just prior to it in sequence. The process can then be modeled by an autoregressive integrated moving average
    (ARIMA) model with parameters p = 0, d = 1, and q = 1, that is, ARIMA(0,1,1). It can also be represented as an
    integrated moving average (IMA) model with parameters d = 1 and q = 1, that is, IMA(1,1), which has the form

        ∇xt = (1 − θB)at

    where B is the backward-shift operator, defined by Bat = at−1, and at is a random shock at time t. At time
    t, the model can be written as

        ∇xt = xt − xt−1 = at − θat−1
        ∴ xt = xt−1 + at − θat−1

    The EWMA with r = 1 − θ or θ = 1 − r, and xt = Zt−1 + at or at = xt − Zt−1, is the optimal one-step-ahead forecast
    for this process. Thus, if Zt−1 is the forecast for the observation in period t made at the end of period t − 1, then

        xt = xt−1 + (xt − Zt−1) − (1 − r)(xt−1 − Zt−2)
           = xt−1 + (xt − Zt−1) + (r − 1)(xt−1 − Zt−2)
           = xt−1 + xt − Zt−1 + rxt−1 − xt−1 − rZt−2 + Zt−2
           = xt − Zt−1 + rxt−1 − rZt−2 + Zt−2

    which yields the EWMA form

        Zt−1 = rxt−1 + (1 − r)Zt−2
        Zt = rxt + (1 − r)Zt−1

11. For more information on IMA models, see G.E.P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and
Control (San Francisco: Holden-Day, 1976).

2. Set limits at

       μ0 ± 3σ √[(r/(2 − r))(1 − (1 − r)^2t)]

   where μ0 is the target value for the chart. After the first few points, the last
   factor in the formula effectively drops out and the limits become

       μ0 ± 3σ √(r/(2 − r))

   Notice that if r = 1 we have a conventional control chart.


A study of Zt will show that the influence of any observation decreases as time
passes. For example

    Z3 = rx3 + (1 − r)Z2
       = rx3 + (1 − r)(rx2 + (1 − r)Z1)
       = rx3 + (1 − r)rx2 + (1 − r)²rx1

This is a geometric progression which, in general, amounts to

    Zt = Σ(j=1 to t) r(1 − r)^(j−1) xt−j+1 = r Σ(j=1 to t) (1 − r)^(j−1) xt−j+1

Suppose the arithmetic moving average and EWMA methods are to be applied to
the first 10 points of the X̄ data in Table 7.3. We will take μ0 = 11.15 and

    σX̄ = 2.11/√5 = 0.94

so the limits for X̄ are as follows:

Moving Average
The control limits for the moving average, based on a span of ng = 2 points, are

    μ0 ± 3σX̄/√ng

    11.15 ± 3(0.94)/√2
    11.15 ± 1.99
    9.16 to 13.14

Exponentially Weighted Moving Average
The control limits for the exponentially weighted moving average, based on r = 1/4, are

    r = 1/4

    μ0 ± 3σ √[(r/(2 − r))(1 − (1 − r)^2t)]

    11.15 ± 3(0.94) √[(0.25/1.75)(1 − (0.75)^2t)]

    11.15 ± 1.07 √[1 − (0.75)^2t]

giving

    t     Limits          LCL      UCL
    1     11.15 ± 0.71    10.44    11.86
    2     11.15 ± 0.88    10.27    12.03
    3     11.15 ± 0.97    10.18    12.12
    4     11.15 ± 1.01    10.14    12.16
    ≥5    11.15 ± 1.07    10.08    12.22

Then the moving averages are as follows:

    Sample   X̄      MX̄ (ng = 2)   Zt
    1        10.7    —             —
    2        11.0    10.85         10.78
    3        11.9    11.45         11.06
    4        13.1    12.50         11.57
    5        11.9    12.50         11.65
    6        14.3    13.10         12.31
    7        11.7    13.00         12.16
    8        10.7    11.20         11.80
    9        12.0    11.35         11.85
    10       13.7    12.85         12.31

The resulting arithmetic moving average chart shown in Figure 7.6 does not detect
an outage against its limits although sample 6 is just at the limit.
The EWMA does, however, detect outages at the sixth and tenth points because of
its superior power relative to the simple moving average chart used. The geometric
(exponentially weighted) moving average (or EWMA) chart can be seen in Figure 7.7.
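The EWMA recursion is a short loop. Note that the tabulated Zt values start the recursion from the first observation (Z1 = x̄1) rather than from Z0 = μ0; the sketch below follows the table:

```python
# A sketch of the EWMA recursion Z_t = r*x_t + (1-r)*Z_{t-1} with r = 1/4,
# started from the first observation as in the tabulated values.
import math

xbar = [10.7, 11.0, 11.9, 13.1, 11.9, 14.3, 11.7, 10.7, 12.0, 13.7]
r, mu0, sigma_xbar = 0.25, 11.15, 0.94

z = [xbar[0]]
for xt in xbar[1:]:
    z.append(r * xt + (1 - r) * z[-1])

# Steady-state limits: mu0 +/- 3*sigma_xbar*sqrt(r/(2-r))
half_width = 3 * sigma_xbar * math.sqrt(r / (2 - r))   # ~ 1.07
ucl, lcl = mu0 + half_width, mu0 - half_width          # ~ 12.22 and 10.08

out = [i + 1 for i, zt in enumerate(z) if zt > ucl or zt < lcl]
# samples 6 and 10 signal, as noted in the text
```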

Figure 7.6  Moving average chart for mica thickness (UCL = 13.14, CL = 11.15, LCL = 9.16).

Figure 7.7  Geometric moving average chart for mica thickness (UCL = 12.22, CL = 11.15, LCL = 10.08).

7.12 CUMULATIVE SUM CHARTS


The cumulative sum (CUSUM) chart provides a very sensitive and flexible vehicle for
the analysis of a sequence of individual observations or statistics. Such charts involve
plotting the sum of the observations (in terms of deviations from target T) up to a given
point against the number of samples taken. The resulting sums when plotted do not often
have direct meaning, as X̄ does in the Shewhart chart, but are used as an index of the
behavior of some parameter of the process. Cumulative sum charts have been shown
to be more sensitive than the Shewhart chart for detecting shifts of less than about 3σX̄
in the mean. The Shewhart chart is more sensitive than the CUSUM in detecting departures
greater than 3σX̄, as pointed out by Lucas.12 Of course, the 3σX̄ level of sensitivity for

12. J. M. Lucas, “A Modified ‘V’ Mask Control Scheme,” Technometrics 15, no. 4 (November 1973): 833–47.

sample sizes 4 or 5 was selected by Shewhart because he found that, at that level of
process shift, assignable causes could reasonably be expected to be found by the user.
Nevertheless, for purposes of troubleshooting and process improvement, greater sensitivity
is often welcome, and the cumulative sum chart has proven to be a parsimonious
and reliable tool.
Let us first consider a two-sided approach, suggested by Barnard,13 which incorporates
use of a V-mask superimposed on the plot to assess the significance of any apparent
change. The cumulative sum chart for testing the mean simply plots the sum of all
the differences collected up to a point against time. A V-mask is constructed and positioned
against the last point at the positioning point indicated on the mask, with the
bottom of the mask parallel to the x axis. As long as all the previous points remain visible
within the cut-out portion (or notch) of the mask, the process is regarded as in control.
When a previous point is covered by the solid portion of the mask, or its
extension, the process is regarded as out of control. Thus, Figure 7.8 indicates an out-of-control
condition.
To construct and plot a V-mask for the process it is necessary to determine the
following14:
1. μ0 = APL = acceptable process level
2. α = risk of false signal that process has shifted from APL (use α/2 for a
   two-sided test)
Figure 7.8  CUSUM chart for mica thickness, d = 1.58, θ = 45°.

13. G. E. A. Barnard, “Control Charts and Stochastic Processes,” Journal of the Royal Statistical Society, series B,
vol. 21 (1959): 239–71.
14. The notation used here is in accord with the ISO technical report ISO/TR 7811 Cumulative Sum Charts—
Guidance on Quality Control and Data Analysis Using CUSUM Techniques (Geneva: International Organization
for Standardization, 1997).

3. μ1 = RPL = rejectable process level
4. Δ = |μ1 − μ0| = change in process level to be detected
5. β = risk of not detecting a change of magnitude Δ
6. σX̄ = standard error of the points plotted (σX̄ = σ when n = 1)
7. δ = Δ/σX̄ = standardized change to be detected
8. A = aσ = scaling factor showing the ratio of the y to the x axis (a distance
   of A units on the ordinate corresponds to one unit of length on the abscissa);
   A = 2σ is recommended, so a = 2
9. T = target for the process
10. Ci = Σ(j=1 to i)(xj − T) = cumulative sum

Then the mask is determined by specifying its lead distance d and half angle θ as
shown in Figure 7.8.
These quantities are calculated using the following relationships

    tan θ = Δ/2A = δσX̄/2A

    d = (2/δ²) ln[(1 − β)/α]

Johnson and Leone15 have noted that for β small (negligible):

    d = −(2/δ²) ln α

Note that θ is scale dependent, while d is not.


It should be pointed out that the mask may be reparameterized in terms of two other
quantities. Some authors use H and F as parameters of the CUSUM procedure, where

    H = hσ = decision interval
    F = fσ = slope of the mask
This is shown in Figure 7.9.

15. N. L. Johnson and F. C. Leone, “Cumulative Sum Control Charts—Mathematical Principles Applied to their
Construction and Use,” Industrial Quality Control, part 1, vol. 18, no. 12 (June 1962): 15–20; part 2, vol. 19,
no. 1 (July 1962): 29–36; part 3, vol. 19, no. 2 (August 1962): 22–28.

Figure 7.9  V-mask (lead distance d and half angle θ, positioned against the last point; decision interval H and slope F shown).

The abscissa or zero baseline corresponds to the centerline of the Shewhart chart in
the sense that the cumulative sum will plot horizontally if the process is at the target. A
shift F from a target μ0 will produce a process level μ0 + F having probability of acceptance
β = 0.5 when α = β. The interval H + F corresponds to a control limit in the sense
that for the first observation after a change to give a signal, the previous point must be a
vertical distance H + F from the positioning point of the mask. Similarly, the rth previous
point must be a vertical distance H + rF above the positioning point.
These two systems of specifying cumulative sum charts are obviously related. The
relationship is as follows

    H = Ad tan θ
    F = A tan θ

so that, as indicated by Ewan16

    tan θ = F/A
    d = H/F

Note that H and F correspond directly to the slope and intercept of sequential sampling
plans with

    F = slope = s
    H = intercept = h2

16. W. D. Ewan, “When and How to Use CU-SUM Charts,” Technometrics 5, no. 1 (February 1963): 4–22.

and as pointed out by Schilling,17 it is possible to utilize this relationship with tables or
computer programs for sequential plans by taking

    tan θ = s/A
and
    d = h2/s

Deviations of the observations from a target value T for the chart are plotted, rather
than the observations themselves. T is often taken as the acceptable process level μ0 for
CUSUM charts utilizing the Barnard procedure. For a two-sided procedure, Ewan and
Kemp18 have suggested a target halfway between the two acceptable, or the two
rejectable, process levels.
Of course, scaling of the chart is of great importance. If equal physical distances
on the y and x axes are in the ratio y:x = A:1, it is necessary to adjust the half
angle so that its tangent is 1/A times the uncorrected value. This is shown in the formulas
given above.
The plot of the cumulative sum can be used to estimate the process average from
the slope of the points plotted. The estimate is simply μ̂ = T + (slope), where T is the
target value for the chart. The slope can be determined by eye or, alternatively, from
the average slope for the last r points of the cumulative sum. If, for a range of r plotted
points, C1 is the first cumulative sum to be used in the estimate and Cr the last, the
process mean may be estimated as

    μ̂ = T + (Cr − C1)/(r − 1)

where

    Ci = Σ(j=1 to i)(X̄j − T)

The time of a process change may be estimated by passing a trend line through the
points of trend and observing the sample number at which it intersects the previous
stable process line.
To illustrate the use of the cumulative sum chart, consider the first ten means in
Table 7.3. Take the target value as T = μ0 = 11.15 with a standard error of the means of

    σX̄ = 2.11/√5 = 0.94

17. E. G. Schilling, Acceptance Sampling in Quality Control (New York: Marcel Dekker, 1982): 187.
18. W. D. Ewan and K. W. Kemp, “Sampling Inspection of Continuous Processes with No Autocorrelation between
Successive Results,” Biometrics 47 (1960): 363–80.

and use a scaling factor A = 1. The chart will be set up to detect a two-sided difference
of Δ = 2 in the mean with α = 0.05 and β = 0.10. Then

    Δ = 2,  δ = 2/0.94 = 2.13,  A = 1

    α/2 = 0.025,  β = 0.10,  T = μ0 = 11.15

    θ = tan⁻¹(δσX̄/2A) = tan⁻¹[0.94(2.13)/2(1)] = tan⁻¹(1) = 45°

    d = (2/δ²) ln[(1 − β)/(α/2)] = [2/(2.13)²] ln[(1 − 0.10)/0.025] = 1.58

The data are cumulated as follows:

    Sample, i   X̄      (X̄ − T)   Ci = Σ(X̄ − T)
    1           10.7    –0.45      –0.45
    2           11.0    –0.15      –0.60
    3           11.9    0.75       0.15
    4           13.1    1.95       2.10
    5           11.9    0.75       2.85
    6           14.3    3.15       6.00
    7           11.7    0.55       6.55
    8           10.7    –0.45      6.10
    9           12.0    0.85       6.95
    10          13.7    2.55       9.50

The cumulative sum chart appears in Figure 7.8. Clearly a shift is detected at the
sixth point. A line through the fifth and sixth points when compared to a line through
the remainder of the points indicates the shift occurred after the fifth point plotted. The
new mean is estimated as

    μ̂ = T + (C6 − C5)/1
       = 11.15 + (6.00 − 2.85)/1
       = 11.15 + 3.15
       = 14.30
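The V-mask design and the cumulative sums can be sketched as follows (values as in the two-sided mica example above):

```python
# A sketch: V-mask parameters and cumulative sums for the first ten means,
# two-sided design with alpha/2 = 0.025, beta = 0.10, A = 1.
import math

T, sigma_xbar, A = 11.15, 0.94, 1.0
delta = 2 / sigma_xbar                 # standardized change, ~ 2.13
alpha2, beta = 0.025, 0.10

# Half angle and lead distance of the mask
theta = math.degrees(math.atan(delta * sigma_xbar / (2 * A)))   # ~ 45 degrees
d = (2 / delta ** 2) * math.log((1 - beta) / alpha2)            # ~ 1.58

# Cumulative sums of deviations from the target
xbar = [10.7, 11.0, 11.9, 13.1, 11.9, 14.3, 11.7, 10.7, 12.0, 13.7]
C, total = [], 0.0
for x in xbar:
    total += x - T
    C.append(total)
# The sum turns sharply upward at sample 6 (C6 = 6.00)
```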

It is possible to present the cumulative sum chart in computational form using the
reparameterization as suggested by Kemp.19 This approach is directly suitable for

19. K. W. Kemp, “The Use of Cumulative Sums for Sampling Inspection Schemes,” Applied Statistics 11, no. 1
(March 1962): 16–31.

computerization; see Lucas.20 The computational method requires calculation of two
cumulative sums, one for detecting an increase in the mean and one to detect a
decrease. They are

    S1 = Σ(X̄i − k1) = Σ(X̄i − T − F)   (increase)
    S2 = Σ(X̄i − k2) = Σ(X̄i − T + F)   (decrease)

where

    k1 = (T + F)
    k2 = (T − F)

They are computed in such a way that if

• S1 < 0, set S1 = 0 until a positive value of (X̄i − k1) is obtained. Then begin
  cumulating S1 again.

And if

• S2 > 0, set S2 = 0 until a negative value of (X̄i − k2) is obtained. Then begin
  cumulating S2 again.
These quantities indicate a lack of control if

S1 > H, or S2 < –H

We see from Figure 7.8 that, for a cumulative sum to be significant, there must be
a previously plotted point outside the V-mask or its extension. Suppose that point is P,
preceding the current point, Σ(X̄ − T), by r samples. Then, for a process increase to be
detected, using the lower arm of the V-mask, the vertical height

    P < Σ(X̄ − T) − H − rF
    P < Σ(X̄ − T − F) − H
    H < Σ(X̄ − T − F) − P

Similarly, to detect a decrease, using the upper arm of the V-mask

    P > Σ(X̄ − T) + H + rF
    P > Σ(X̄ − T + F) + H
    −H > Σ(X̄ − T + F) − P

20. J. M. Lucas, “A Modified ‘V’ Mask Control Scheme,” Technometrics 15, no. 4 (November 1973): 833–47.

Note that in the computational procedure, P is set, and reset, to zero, so significance
is indicated if

    Σ(X̄ − k1) = Σ(X̄ − T − F) > H    (increase)
    Σ(X̄ − k2) = Σ(X̄ − T + F) < −H   (decrease)

Lucas21 has suggested that the computational procedure also keep track of the number
of successive readings r showing the cumulative sum to be greater than zero, or less
than zero, respectively. When an out-of-control condition is detected, an estimate of the
new process average can then be obtained using the relation

Out-of-control high (S1 > H):

    μ̂ = T + (S1 + rF)/r

Out-of-control low (S2 < −H):

    μ̂ = T + (S2 − rF)/r

The computational procedure may be illustrated with a computational approach to
the cumulative sum chart for the means of mica thickness given in Table 7.3.
Recall

    d = 1.58
    θ = 45°

So

    H = Ad tan θ = 1(1.58) tan 45° = 1(1.58)(1) = 1.58

    F = A tan θ = 1 tan 45° = 1(1) = 1

21. J. M. Lucas, “The Design and Use of V-Mask Control Schemes,” Journal of Quality Technology 8, no. 1 (January
1976): 1–12.

The cumulation, then, is as follows:

                                 Increase in mean       Decrease in mean
    Sample   X̄      X̄ − T    X̄ − T − F   S1        X̄ − T + F   S2       Number   Number
                                                                             high     low
    1        10.7    –0.45    –1.45        0         0.55        0        0        0
    2        11.0    –0.15    –1.15        0         0.85        0        0        0
    3        11.9    0.75     –0.25        0         1.75        0        0        0
    4        13.1    1.95     0.95         0.95      2.95        0        1        0
    5        11.9    0.75     –0.25        0.70      1.75        0        2        0
    6        14.3    3.15     2.15         2.85*     4.15        0        3        0
    7        11.7    0.55     –0.45        2.40*     1.55        0        4        0
    8        10.7    –0.45    –1.45        0.95      0.55        0        5        0
    9        12.0    0.85     –0.15        0.80      1.85        0        6        0
    10       13.7    2.55     1.55         2.35*     3.55        0        7        0
    11       9.8     –1.35    –2.35        0         –0.35       –0.35    8        1
    12       13.0    1.85     0.85         0.85      2.85        0        9        0
    13       11.7    0.55     –0.45        0.40      1.55        0        10       0
    14       9.6     –1.55    –2.55        0         –0.55       –0.55    0        1
    15       12.0    0.85     –0.15        0         1.85        0        1        0

The procedure detects an out-of-control condition at points 6, 7, and 10, as shown by asterisks. An estimate of the process mean at point 6 would be

µ̂ = 11.15 + (2.85 + 3(1))/3 = 11.15 + 1.95 = 13.10

Kemp22 has suggested plotting the above results in the form of a control chart. Such
a chart could be one-sided (plotting S1 or S2 only) or two-sided as shown in Figure 7.10.

[Figure: two-sided plot of S1 (increase) above zero and S2 (decrease) below zero for samples 1–15, with UCL = H = 1.58 and LCL = –H = –1.58]
Figure 7.10 Kemp cumulative sum chart.

22. K. W. Kemp, “The Use of Cumulative Sums for Sampling Inspection Schemes,” Applied Statistics 11, no. 1
(March 1962): 16–31.

A one-sided version of the Barnard chart can be developed using the top or bottom half of the V-mask with α rather than α/2 as the risk. Usually the arm of the mask is extended all the way down to the abscissa. A one-sided cumulative sum chart testing for an increase in mean of the mica data would be developed as follows, with α = 0.05, β = 0.10:

θ = tan⁻¹(δσX̄/2A) = 45°

d = (2/δ²) ln((1 – β)/α)
  = (2/(2.13)²) ln((1 – 0.10)/0.05)
  = 1.27
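The design arithmetic can be verified in a few lines. This is an illustrative helper of our own (not from the text), with δ the standardized shift to be detected:

```python
import math

def one_sided_design(delta, alpha, beta, sigma_xbar=1.0, A=1.0):
    """Lead distance d and half angle (in degrees) for a one-sided mask,
    using alpha rather than alpha/2 as the risk."""
    d = (2.0 / delta ** 2) * math.log((1.0 - beta) / alpha)
    theta = math.degrees(math.atan(delta * sigma_xbar / (2.0 * A)))
    return d, theta

d, theta = one_sided_design(delta=2.13, alpha=0.05, beta=0.10)
```

With δ = 2.13 this returns d ≈ 1.27; the half angle equals 45° only when δσX̄ = 2A, as it does in the mica example.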

Figure 7.11 shows that if such a one-sided test were conducted, the process would be found to be out of control at the sixth point.
Lucas23 has shown that a cumulative sum chart equivalent to the Shewhart chart has

H = 0
F = 3σ

which implies

tan θ = 3σ/A
d = 0

so that, when A = 3σ, d = 0, and θ = 45°.

[Figure: one-sided cumulative sum chart with mask lead distance d = 1.27, signaling at the sixth sample]
Figure 7.11 One-sided cumulative sum chart.

23. J. M. Lucas, “A Modified ‘V’ Mask Control Scheme,” Technometrics 15, no. 4 (November 1973): 833–47.

[Figure: cumulative sum Σ(X̄ – T) plotted against samples, with a single mask arm of slope 3σ]
Figure 7.12 CUSUM chart equivalent to Shewhart chart.

A CUSUM chart of this form would appear as Figure 7.12. Such a chart seems to have little advantage over a Shewhart chart. In fact, Lucas states

A V-mask . . . designed to detect large deviations quickly is very similar to a Shewhart chart and if only very large deviations are to be detected, a Shewhart chart is best.24

Nevertheless, it has been proposed by Ewan25 that the plan h = 5 and f = 0.5, giving H = 5σ and F = 0.5σ, has properties similar to the Shewhart chart, but possesses more sensitivity in the region of displacements less than 3σ and a higher average run length (ARL) at µ0. ISO/TR 781126 also recommends h = 5 and f = 0.5 as a general-purpose plan. This amounts to a Barnard chart with

d = 10
θ = 26.57° = 26°34′

using equal scales, that is, A = 1 (otherwise use θ = tan⁻¹(1/(2A)), d = 10 when A ≠ 1). This plan may provide a useful substitute for the Shewhart chart when a cumulative sum chart is in order. It is a good match to the corresponding arithmetic and EWMA charts as well.
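The conversion between a computational plan (h, f) and the equivalent Barnard chart geometry can be sketched as follows (the helper name is ours):

```python
import math

def barnard_equivalent(h, f, A=1.0):
    """V-mask lead distance and half angle (in degrees) for a plan (h, f),
    where H = h*sigma and F = f*sigma."""
    d = h / f                                  # lead distance in samples
    theta = math.degrees(math.atan(f / A))     # half angle of the mask
    return d, theta

d, theta = barnard_equivalent(5.0, 0.5)        # the ISO general-purpose plan
```

For h = 5, f = 0.5, and equal scales (A = 1), this gives d = 10 and θ = 26.57°, as in the text.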

24. Ibid.
25. W. D. Ewan and K. W. Kemp, “Sampling Inspection of Continuous Processes with No Autocorrelation between
Successive Results,” Biometrics 47 (1960): 363–80.
26. ISO/TR 7811, Cumulative Sum Charts—Guidance on Quality Control and Data Analysis Using CUSUM
Techniques (Geneva: International Organization for Standardization, 1997).

Special Cumulative Sum Charts


Cumulative sum charts are especially good at disclosing small, longer-term process changes. However, intermittent very short perturbations, and shifts in excess of 3σX̄, are often more readily discovered with the conventional Shewhart chart. In recognition of this, special CUSUM charts have been developed to increase sensitivity to abrupt or short-term changes. These include the construction of a snub-nosed mask and a procedure for combining CUSUM and Shewhart charts.

The snub-nosed mask is essentially a combination of two separate masks superimposed one over the other. A basic CUSUM mask is developed in terms of conventional considerations. A second, short-term mask is then constructed with a larger half angle θ and smaller lead distance d.

For example, the ISO technical report27 on CUSUM gives the following plan: (a) h = 5 and f = 0.5, combined with (b) h = 1.55 and f = 1.55. The snub-nosed mask for this combination is shown in Figure 7.13, which assumes σ = 1 and A = 1.
This combination produces average run lengths (ARLs) much shorter than the Shewhart 3σ chart for almost all displacements. For a one-sided chart with zero displacement, the snub-nosed chart ARL is 472 against 740 for the Shewhart chart. Of

[Figure: snub-nosed mask formed by superimposing mask (a) with H = 5.0, F = 0.5 and mask (b) with H = 1.55, F = 1.55, plotted for σ = 1, A = 1, 1:1 scale]
Figure 7.13 Snub-nosed CUSUM mask.

27. Ibid., 11.



Table 7.4 Average run length for special CUSUM charts.28

Process          Shewhart     Standard           Snub-nosed            Combined      FIR
displacement,    chart with   CUSUM              CUSUM                 Shewhart/     h = 5.0, f = 0.5,
δ                3σ limits    h = 5.0, f = 0.5   h = 5.0, f = 0.5;     standard      s = 2.5
                                                 h = 1.55, f = 1.55    CUSUM
0.00               740          930                472                   448           896
0.25               336          140                114                   112           125
0.50               161           38                 36                    35            29
0.75                82           17                 16                    16            11
1.00                43.9         10.5                9.9                   9.8           6.3
1.50                15.0          5.8                5.2                   5.3           3.4
2.00                 6.3          4.1                3.2                   3.5           2.4
2.50                 3.2          3.2                2.3                   2.4           1.9
3.00                 2.0          2.6                1.7                   1.8           1.5
4.00                 1.2          1.9                1.2                   1.2           1.2

course, a large ARL is desirable for zero displacement, that is, we would not want to act when nothing has changed in the process. ARL values for the various procedures are compared in Table 7.4.

Another alternative for increasing sensitivity to large abrupt changes is the combined Shewhart–CUSUM approach. Here, a Shewhart chart and a CUSUM chart are run in parallel on the same data. A sample is checked for control against the Shewhart chart. If it does not exceed the limits, it is incorporated into the CUSUM total. For samples of size one, this can also be regarded as a test for outliers before an observation is entered into the CUSUM. For this procedure, it is sometimes recommended that the limits for the Shewhart chart be taken to be ±4σ. The average run lengths for a conventional 3σ Shewhart chart combined with the general-purpose CUSUM (h = 5, f = 0.5) are listed in Table 7.4.
Still another approach for shortening the time to a signal when the process is started up, or is thought possibly to be out of control, is the fast initial response (FIR) procedure. It can be used only with the computational method. The cumulative sum is initiated at a value other than zero, often at S = H/2, where S = sσ is the value at which the CUSUM is started, and s is a parameter of the CUSUM and not a sample standard deviation. When the process is on target, the CUSUM will settle down quickly. However, if it is off target it will have a shorter path to the control limit. The ARL values for a CUSUM with the FIR procedure and s = 2.5, using the computational approach with the ISO general-purpose plan (h = 5.0, f = 0.5), are given in Table 7.4 for comparison to the other control chart procedures.
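The effect of the headstart can be illustrated deterministically. The sketch below is our own helper, not from the text, and it counts samples until a one-sided cumulation crosses h under a constant shift, rather than computing a true ARL:

```python
def steps_to_signal(shift, h, f, headstart=0.0):
    """Samples until the one-sided cumulation exceeds h under a constant
    shift (all quantities in sigma units); None if no signal in 1000 samples."""
    s, steps = headstart, 0
    while s <= h:
        s = max(0.0, s + (shift - f))
        steps += 1
        if steps > 1000:           # guard against a shift too small to signal
            return None
    return steps

no_fir = steps_to_signal(1.0, h=5.0, f=0.5)                    # 11 samples
with_fir = steps_to_signal(1.0, h=5.0, f=0.5, headstart=2.5)   # 6 samples
```

For a sustained 1σ shift, the headstart S = H/2 = 2.5σ roughly halves the time to a signal in this deterministic sketch, in line with the shorter FIR run lengths shown in Table 7.4.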
Finally, when equal scaling is desired, such as in the comparison of two or more characteristics, a standardized chart may be used. For this chart, the differences are standardized by dividing them by the standard deviation, giving

Ci = Σⱼ₌₁ⁱ [(X̄ⱼ – T)/σ]

which results in a standard error of the plotted points equal to one. This produces the following parameters for the V-mask,29 incorporating the recommendation that A = 2σ:

A = 2    D = µ1 – µ0    T = µ0
θ = tan⁻¹(D/4)    H = 2d tan θ
d = (2/D²) ln((1 – β)/α)    F = 2 tan θ

Such charts are an aid to uniformity, but standardization may result in a plot that is
somewhat obscure to the operating personnel.
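As a numerical check on the parameters above, the standardized V-mask can be computed directly. The helper name is ours; A = 2 follows the recommendation for the standardized chart (σ = 1):

```python
import math

def standardized_vmask(D, alpha, beta, A=2.0):
    """V-mask parameters for a standardized CUSUM (sigma = 1)."""
    theta = math.atan(D / (2.0 * A))                       # half angle
    d = (2.0 / D ** 2) * math.log((1.0 - beta) / alpha)    # lead distance
    H = A * d * math.tan(theta)                            # decision interval
    F = A * math.tan(theta)                                # slope per sample
    return theta, d, H, F

theta, d, H, F = standardized_vmask(D=2.0, alpha=0.05, beta=0.10)
```

With A = 2 these reduce to θ = tan⁻¹(D/4), H = 2d tan θ, and F = 2 tan θ, exactly the forms given in the text.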
The computational method is especially good for use with a computer when many characteristics are to be controlled. The V-mask and CUSUM graph approach is quite useful in quality control for detecting the time at which a process change occurred. It will emulate the Shewhart chart by using the parameters h = 5 and f = 0.5, or tan θ = 1/(2A) and d = 10. Thus, cumulative sum charts can be used to supplement Shewhart charts as a troubleshooting device.

7.13 PRECONTROL
A process of any kind will perform only as well as it is set up before it is allowed to run. Many processes, once set up, will run well and so need be subject only to occasional check inspections. For these processes, control charts would be overkill. Precontrol is a natural procedure to use on such processes.30 It is based on the worst-case scenario of a normally distributed process centered between the specifications and just capable of meeting specifications, that is, the difference between the specifications is assumed to be just equal to a 6σ spread. If precontrol lines are set in from the specifications a distance one-quarter of the difference between the specifications, there would then be a seven percent chance of an observation falling outside the precontrol (PC) lines on one side by normal theory. The chance of two successive points falling outside the precontrol lines on the same side would be

P(2 outside) = 0.14 × 0.07 ≅ 0.01
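The seven percent figure follows from the normal tail beyond 1.5σ: a PC line sits one-quarter of a 6σ tolerance inside a specification placed at 3σ, so 1.5σ from the mean. A quick check with the standard library:

```python
import math

def normal_tail(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

p = normal_tail(1.5)      # chance outside one PC line, about 0.07
p_reset = (2 * p) * p     # first piece outside either line, next outside the same line
```

This reproduces both numbers in the displayed product: p ≈ 0.067 and p_reset ≈ 0.01.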

29. When A = 1: q = tan-1(D/2), H = d tan q, and F = tan q with d unaffected.


30. D. Shainin, “Techniques for Maintaining a Zero Defects Program,” AMA Bulletin 71 (1965).

[Figure: normal curve centered between the specifications, with a PC line set in one-quarter of the tolerance from each specification limit; 7% of the distribution falls outside each PC line]
Figure 7.14 Precontrol justification.

Table 7.5 Precontrol rules.


1. Set precontrol lines in 1/4 from the specifications.
2. Begin process.
3. If first piece outside specifications, reset.
4. If first piece outside PC line, check next piece.
5. If second piece outside same PC line, reset.
6. If second piece inside PC lines, continue process and reset only when two successive pieces are
outside PC lines.
7. If two pieces are outside opposite PC lines, reduce variation immediately.
8. When five successive pieces are inside PC lines, go to frequency gauging and continue as long
as average checks to reset is 25.
Frequency guidelines
Process Frequency Process characterization
Erratic 1/50 Intermittently good and bad
Stable 1/100 May have drift
Controlled 1/200 In statistical control
9. During frequency gauging, do not reset until piece is outside PC lines. Then check next piece and
go to step 5.

This can be seen from Figure 7.14. This principle is basic to the precontrol approach. A typical set of rules for the application of precontrol is given in Table 7.5; the rules are intended for maintaining an AQL (acceptable quality level) of one to three percent when the specifications are about 8σ wide. Application of these rules will lead to the diagrammatic representation shown in Figure 7.15.
As an example, consider the following sequence of initial mica thickness measurements in starting up the process
8.0, 10.0, 12.0, 12.0, 11.5, 12.5, 10.5, 11.5, 10.5, 7.0, 7.0

[Figure: flowchart of the precontrol rules — gauge the first piece; reset if it is outside the specification limits; if it is outside a PC line, gauge a second piece and reset when that piece is outside the PC lines; run when pieces fall inside the PC lines; after five successive pieces inside the PC lines, go to frequency gauging, reducing the frequency when the average number of pieces to reset exceeds 29, leaving it alone for 21–29, and increasing it when the average is less than 21]
Figure 7.15 Precontrol schematic.

where the specifications are 8.5 and 15. The precontrol procedure would operate as follows:

1. Set precontrol lines at

   8.5 + (15 – 8.5)/4 = 10.1

   15 – (15 – 8.5)/4 = 13.4

2. Begin the process
3. First piece is 8.0, outside specifications
3a. Reset process and begin again
4. Next piece is 10.0, outside lower PC line
5. Second piece 12.0 is within PC lines, so let process run
...
8. Next pieces are 12.0, 11.5, 12.5, 10.5, all within PC lines, so start frequency gauging, roughly one in 50 pieces

8a. Next two sample pieces are 11.5, 10.5, within PC lines, so continue sampling
9. Next sample piece is 7.0, outside lower PC line, so check next piece
10. Next successive piece is 7.0, reset and start over
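The walk-through above can be traced with a small simulator. This is a simplified sketch of the startup rules only (the function and event labels are ours): the specification check is applied to the first piece after a reset, and during gauging a reset waits for a confirming second piece outside the same PC line, as in rules 5 and 9.

```python
def startup_precontrol(pieces, lsl, usl):
    """Trace the precontrol startup rules over a sequence of measurements.
    Returns the PC lines and a list of (index, event) pairs."""
    quarter = (usl - lsl) / 4.0
    low_pc, high_pc = lsl + quarter, usl - quarter
    events, inside_run, prev_side, first_piece = [], 0, None, True
    for i, x in enumerate(pieces):
        if first_piece:
            if x < lsl or x > usl:                       # rule 3
                events.append((i, "reset: first piece outside specification"))
                continue                                 # next piece is a new first piece
            first_piece = False
        if x < low_pc or x > high_pc:                    # rules 4-5, 9
            side = "low" if x < low_pc else "high"
            if side == prev_side:
                events.append((i, "reset: two successive pieces outside same PC line"))
                inside_run, prev_side, first_piece = 0, None, True
            else:
                prev_side, inside_run = side, 0
        else:
            prev_side = None
            inside_run += 1
            if inside_run == 5:                          # rule 8
                events.append((i, "qualified: begin frequency gauging"))
    return (low_pc, high_pc), events

mica = [8.0, 10.0, 12.0, 12.0, 11.5, 12.5, 10.5, 11.5, 10.5, 7.0, 7.0]
lines, events = startup_precontrol(mica, 8.5, 15.0)
```

Applied to the mica sequence, this reports the PC lines 10.125 and 13.375 (10.1 and 13.4 rounded), a reset on the first piece, qualification for frequency gauging at the fifth successive inside piece, and a reset on the second 7.0.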
This procedure takes advantage of the principle of collapsing the specifications to obtain greater sensitivity in a manner similar to narrow-limit gauging. While it is sensitive to the assumption of a normally distributed process, it is not necessary to know σ as in narrow-limit gauging. Precontrol provides an excellent approach to check inspection that can be used after the control charts are removed.

7.14 NARROW-LIMIT CONTROL CHARTS


Narrow-limit gauging (NLG) incorporates a compressed or narrow limit much like the PC line in precontrol. The narrow limit is set a distance tσ inside the specification limit when the procedure is to be used for acceptance against the specification. The narrow limit may be set a distance (3 – t)σ from the mean when it is to be used as a process control device, ignoring any specification. The number of observations outside the narrow limit may be used to characterize the process. A control chart may be set up by plotting the resulting count (high or low) against an allowable number c for the sample size n used. The plan ng = 5, t = 1, c = 1 corresponds to a Shewhart chart with sample size 4, while ng = 10, t = 1.2, c = 2 corresponds to a Shewhart chart with sample size 5. When used with the specification limit, such a chart corresponds to a modified-limits control chart, while when used with a crude estimate of process spread (3σ) such a chart may be used in tracking the process.

7.15 OTHER CONTROL CHARTS

Manual Adjustment Charts


Varieties of other control chart procedures are available for specific applications.
Notable among them are adaptive control charts, which provide a system of feedback
from the data to achieve appropriate adjustment of the process. The interested reader
should consult the seminal paper by Box and Jenkins31 and subsequent literature.
Box and Jenkins32 further elaborate in their text on the subject of feedback control
using manual adjustment charts. These charts are intended to provide the statistical

31. G. E. P. Box and G. M. Jenkins, “Some Statistical Aspects of Adaptive Optimization and Control,” Journal of the
Royal Statistical Society 24 (1962): 297–343.
32. G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control (San Francisco: Holden-Day,
1976): 433–46.

quality control practitioner with a means of making ongoing process adjustments. Box33 offers the manual adjustment chart as a compromise solution between the control engineer interested in the use of feedback control for process regulation, and the quality control practitioner who prefers control charts for process monitoring.
The act of process monitoring using control charts assumes that it is possible to bring
the process into a state of statistical control without continual process adjustment. If the
process is in a state of statistical control, we say that the process is governed by common
cause (inherent) variation. By applying a target value for the process, and limits that define
the amount of common cause variation that is expected to naturally occur, points on the
chart are evaluated to determine if they violate conditions of nonrandomness. For example,
a simple rule is to identify points that lie beyond any of the (typical) three-sigma limits. If
so, we can say with confidence that an assignable, or special, cause has affected the process.
The quality control practitioner must then do some investigative work to find the underlying reason (root cause) for the change, and remove it from the process as a means of reducing process variation. See Section 2.5 for a more thorough discussion of control criteria that can be used in routine production or process troubleshooting situations.
The objectives of process monitoring, according to Box, are to (a) continually confirm that the established common cause system remains in operation, and (b) look for deviations that are unlikely to be due to chance and that can lead someone to track assignable causes down and eliminate them. While these objectives will lead to process improvement, they are not always attainable in a permanent sense. Often, raw material lots differ coming into the process; process operators rotate among shifts or leave their jobs; and so on. The result is that the process mean rarely stabilizes for very long. This creates a need for process regulation.
The use of manual adjustment charts for process regulation by feedback control assumes that an additional variable Yt exists that can be used to adjust for the deviation dt of a quality characteristic xt from its target T, or more simply, dt = xt – T. The intent is to continually adjust the variable so that the quality characteristic is as close as possible to its target. In effect, this process is analogous to driving a car and making occasional adjustments with the steering wheel to maintain your car's path down the center of the lane you are driving in. If you take your hands off the wheel, the car will eventually end up in the opposing lane, or on the side of the road. The car drift may be due to a front-end alignment problem, or other malady, but the "process mean" will not remain stable, thus forcing the driver to control the car via continual manual adjustments.
As Box explains,34 process monitoring uses a statistical background similar to hypothesis testing, whereas process regulation parallels statistical estimation by estimating the level of the disturbance dt (the difference from target) and making an adjustment to cancel it out. In fact, if process regulation is intended, then waiting for a statistically significant disturbance (instead of making continual process adjustments) will increase the mean square deviation from the target (versus the level of variability that results from process monitoring).

33. G. E. P. Box, “George’s Column,” Quality Engineering 4, no. 1 (1991–92): 143–51.


34. Ibid.

Case History 7.1


Metallic Film Thickness
As a means of illustrating the use of a manual adjustment chart, Box and Luceno present 100 observations of a thickness measurement xt of a very thin metallic film.35 The readings were taken at equally spaced intervals of time during the manufacture of a computer chip when no process adjustment was being applied. The process target was set at T = 80, and the objective was to try to maintain thickness as close as possible to this value. Thus, the process disturbance would be the deviation dt = xt – 80. The data are given as follows
80 92 100 61 93 85 92 86 77 82 85 102 93 90 94
75 75 72 76 75 93 94 83 82 82 71 82 78 71 81
88 80 88 85 76 75 88 86 89 85 89 100 100 106 92
117 100 100 106 109 91 112 127 96 127 96 90 107 103 104
97 108 127 110 90 121 109 120 109 134 108 117 137 123 108
128 110 114 101 100 115 124 120 122 123 130 109 111 98 116
109 113 97 127 114 111 130 92 115 120

In Figure 7.16, the measured thickness and corresponding process disturbance dt are
plotted over the time the data were collected. Three-sigma control limits were computed

[Figure: individuals chart of metallic film thickness xt and disturbance dt = xt – 80 over time, with target 80 (disturbance 0), UCL = 110.52, LCL = 49.48, and two-sigma limits at 100.35 and 59.65]

Figure 7.16 Disturbances of metallic film thickness from a target value of T = 80 for an
uncontrolled process. Points out of statistical control (“X”) are based on control
chart criteria for action during routine production (Section 2.5).

35. G. E. P. Box and A. Luceno, Statistical Control By Monitoring and Feedback Adjustment (New York: John Wiley
& Sons, 1997): 128–53.

based on the average moving range. Using the control chart criteria presented in Section 2.5 for assessing the stability of the process in routine production, it is clear that the level of metallic film thickness is out of statistical control.

The short-term standard deviation σST of the process is estimated by σ̂ST = M̄R/d2 = 11.47/1.128 = 10.17, and the long-term standard deviation σLT of the process is estimated by σ̂LT = s = 17.18. Since other efforts to stabilize the process have failed, the process operator decided to control the process by manually adjusting the deposition rate Y, whose level at time t will be denoted Yt. The control chart in Figure 7.16 is now replaced with a manual adjustment chart, which is shown in Figure 7.17.
To use the manual adjustment chart, the process operator records the latest value of the metallic film thickness and then reads off the adjustment scale the appropriate amount by which the deposition rate should be increased or decreased to alter the thickness level. For the first reading, the thickness value is 80, which is on the target of 80, so no action is needed. However, the next value is a thickness of 92, which corresponds to a change of –2 in the deposition rate. Thus, the process operator is now required to reduce the deposition rate by two units from its current value.
Box notes that the successive recorded adjusted thickness values shown in Figure 7.17 are the readings that would actually occur after the manual adjustment in deposition rate has been made.36 The process variation seen in Figure 7.16 would not be seen

[Figure: adjustment chart pairing the adjusted thickness scale (20 to 140) with the deposition-rate adjustment scale (–10 to +10) over time]

Figure 7.17 A bounded Box–Jenkins manual adjustment chart, which allows the process operator
to plot the thickness and then read off the appropriate change in the deposition rate
needed to bring the process to the target of T = 80.

36. G. E. P. Box, “George’s Column,” Quality Engineering 4, no. 1 (1991–92): 143–51.



by the process operator. The manual adjustment chart can be a very effective tool in bringing the adjusted process to an improved state of statistical control, and would produce a dramatic reduction in the standard deviation of the process. The adjusted process standard deviation for these data was reduced to σ̂LT = s = 11.18, a 35 percent reduction in variability!

Construction of the manual adjustment chart is based on the following assumptions:


1. That a change in the deposition rate Y will produce all of its effect on
thickness within one time interval (no lingering effects over time).
2. That an increase of one unit in the deposition rate Y will increase the thickness
y by 1.2 units. Note that the constant 1.2 is called the gain, g, and can be
interpreted like a regression coefficient, that is, ∆Y = gx = 1.2x.
3. That an uncontrolled disturbance, as seen in Figure 7.16, can be effectively
forecast a step ahead using an EWMA approach.
See Hunter for an excellent discussion of the use of the EWMA in developing a manual adjustment chart with a spreadsheet program.37 (The reader can find files called Hunter.xls and Box and Luceno.xls on the CD-ROM included with this text.) Recall from Section 7.11 that the weight factor for the EWMA, r, controls how much influence the immediate observation dt has on the current estimate d̂t, which we can write as
d̂t = rdt + (1 – r)d̂t–1

or

d̂t = d̂t–1 + r(dt – d̂t–1)
d̂t – d̂t–1 = r(dt – d̂t–1)
d̂t+1 – d̂t = r(dt+1 – d̂t) = ret

where dt+1 – d̂t = et is the forecast error. So, r can be thought of as the proportion of the
forecast error that is believed to provide an accurate forecast. The adjusted thickness at
time t + 1 is

xt+1 – T = dt+1 + gYt ,

37. J. S. Hunter, “The Box-Jenkins Bounded Manual Adjustment Chart: A Graphical Tool Designed for Use on the
Production Floor,” Quality Progress (August 1998): 129–37.

which implies that the deviation xt+1 – T of the thickness from its target depends on the current level of the disturbance dt+1 and the current level of the deposition rate Yt. Unfortunately, the value of dt+1 is unknown at time t, so the EWMA can be used to produce the estimate d̂t+1. Box38 shows that the adjusted thickness becomes

xt+1 – T = et+1

which indicates that the deviation from target seen in Figure 7.17 is simply the forecast error. The adjustment in the deposition rate from its previous value at time t – 1 can be shown to be

Yt – Yt–1 = –(r/g)(xt – T) = –(0.2/1.2)(xt – T) = –(1/6)(xt – T) = –(1/6)dt

when we set r = 0.2 and g = 1.2, which is exactly the control that the manual adjustment chart in Figure 7.17 achieves.
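The adjustment rule and the EWMA forecast behind it are one-liners (the helper names are ours; r = 0.2 and g = 1.2 as in the case history):

```python
def ewma_forecast(d_hat_prev, d_new, r=0.2):
    """One-step-ahead EWMA forecast of the disturbance."""
    return r * d_new + (1.0 - r) * d_hat_prev

def adjustment(x, T=80.0, r=0.2, g=1.2):
    """Change in the deposition rate called for by the latest reading x."""
    return -(r / g) * (x - T)
```

For the second reading of 92, adjustment(92) returns –2, reproducing the operator's move described earlier in the case history.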

Short-Run Control Charts


Many processes are carried on in short runs or batches that are much shorter in length than the 20 to 30 samples recommended for constructing a control chart.39 These short runs can frequently be expected to initiate a change in level, or variation, or both. These changes are, in fact, previously known potential assignable causes that enter the process at predetermined times. There are several ways to construct control charts for such situations, as discussed in a comparative study by Haugh and Pond40:

1. Ignore the systematic variability and plot on a single chart.
2. Stratify the data and plot it on separate charts.
3. Use regression analysis to model the data and plot the residuals on a single chart.
4. Standardize the data and plot the standardized values on a single chart.

This last option has received considerable attention and involves the use of the linear transformation

Z = (X – µ)/σ

38. G. E. P. Box, “George’s Column,” Quality Engineering 4, no. 1 (1991–92): 143–51.


39. E. G. Schilling, “Short Run Control Charts,” in 54th Annual Quality Conference Transactions (Rochester, NY:
ASQ Rochester Section, March 31, 1998): 50–53.
40. L. D. Haugh and A. D. Pond, “Adjustable Individual Control Charts for Short Runs,” ASQ 40th Annual Quality
Congress Proceedings (Milwaukee: American Society for Quality, 1995): 1117–25.

to remove systematic changes in level and variability. The standardization of Shewhart charts has been examined in depth by Nelson.41 Wheeler has also provided an excellent discussion of such procedures.42 Griffith also gives an insightful basic introduction to these methods.43 Charts of this form usually have limits of 0 ± 3.
There are a number of variations of these so-called Z-charts, some of which are as follows:

• Difference charts. A constant is subtracted from each of the observations and the resulting differences are used as data for a control chart constructed using standard methods. The standard deviation of the resulting values is unaffected by this procedure. The constant can be a known value for the mean (µ), a mean from past data (X̄), a process target (T), or the specification limit itself. Here, the control limits are 0 ± 3σ̂.

• Standardized charts. A constant (see above) is subtracted from each of the data values and the result is divided by a second constant. When the standard deviation is used as the divisor, the resulting Z-values have a standard deviation of 1 and the control limits are ±3 for α = 0.003. The divisor can be such as: a known standard deviation (σ), an estimate from past data (s), or a range estimator (R̄). Sometimes the divisor is not a standard deviation, but some other value. For example, Burr discusses a measure in which the nominal was subtracted from the observations and the divisor was half the tolerance.44 This has the advantage of showing the fraction of tolerance used up by the process as well as closeness to the nominal. Ordinarily, the moving range with a span of two is the preferred method of estimating the standard deviation for short-run charts. Alternatively, when using subgroups, the usual pooled estimates are in order.

• Short-run X̄ and R charts. Bothe has suggested plotting

(X̄ – X̿)/R̄

against ±A2 control limits for the mean and

R/R̄

41. L. S. Nelson, “Standardization of Shewhart Control Charts,” Journal of Quality Technology 21, no. 4 (October
1989): 287–89.
42. D. J. Wheeler, Short Run SPC (Knoxville, TN: SPC Press, 1991).
43. G. K. Griffith, Statistical Process Control Methods for Long and Short Runs, 2nd ed. (Milwaukee: ASQ Quality
Press, 1996).
44. J. T. Burr, “SPC in the Short Run,” ASQ Quality Congress Transactions (Milwaukee: American Society for
Quality Control, 1989): 778–80.

against D3 and D4 control limits for the range.45 This produces dimensionless
charts suitable for plotting different parts or runs on the same chart.
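Bothe's dimensionless plotting points reduce to a pair of ratios (the helper name is ours, and the sample values below are made up for illustration):

```python
def bothe_points(xbar, r, grand_mean, rbar):
    """Location point, plotted against +/- A2 limits, and spread point,
    plotted against D3 and D4 limits."""
    return (xbar - grand_mean) / rbar, r / rbar

loc, spread = bothe_points(xbar=11.0, r=3.0, grand_mean=10.0, rbar=2.0)
```

For a subgroup mean of 11.0 and range of 3.0 on a part with X̿ = 10.0 and R̄ = 2.0, the plotted points are 0.5 and 1.5, comparable across parts or runs on the same chart.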
The Q chart has raised short-run methodology to the next level of sophistication.
The technique allows the chart to be constructed from the initial data points without the
need for previous estimates of the mean or variance. This allows charting to begin at
the very start of a production run. Furthermore, the probability integral transformation is
utilized to achieve normality for Q from otherwise nonnormal data, such as the range or
the standard deviation. The method is explained in Quesenberry, which covers Q charts
for the mean and variance when these parameters are known and unknown.46
For example, consider the case when individual normally distributed measurements are to be plotted for unknown mean and known variance. The chart for process location is constructed as follows:

1. Collect individual measurements: x1, x2, . . . , xr, . . .

2. For the rth point to be plotted, compute

   Qr(Xr) = √((r – 1)/r) · (xr – x̄r–1)/σ, r = 2, 3, . . .

   where x̄r–1 is the mean of the previous r – 1 points.

3. Plot Qr(Xr) for each of the data points against limits of 0 ± 3.
The chart for process variation is constructed accordingly:

1. Utilize the measurements: x1, x2, . . . , xr, . . .

2. For the rth point to be plotted (r even only):

   a. Compute

      (xr – xr–1)²/(2σ²), r = 2, 4, 6, . . .

   b. Find the percentile of the χ² distribution with one degree of freedom for the value computed.

   c. Find the normal Z value at this percentile and set it equal to Q(Rr).

3. Plot Q(Rr) for the even data points against limits of 0 ± 3.

45. D. R. Bothe, Measuring Process Capability (New York: McGraw-Hill, 1997): 792–98.
46. C. P. Quesenberry, “SPC Q Charts for Start-up Processes and Short or Long Runs,” Journal of Quality Technology
23, no. 3 (July 1991): 213–24.

Thus, if σ = 2 and the first two data points were 7 and 11, in plotting the second point we would have

Qr(Xr) = √((2 – 1)/2) · (11 – 7)/2 = 1.414

for the location chart and

a. (xr – xr–1)²/(2σ²) = (11 – 7)²/(2(4)) = 2

b. For χ² with one degree of freedom, the percentile for χ² = 2 is 84.3 percent.

c. A normal 84.3 percentile corresponds to Z = 1.006,

hence

Q(Rr) = 1.006
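The worked example can be reproduced with the standard library alone; the inverse-normal step is done by bisection. The helper names are ours, following Quesenberry's formulas for the known-σ case:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def q_location(xs, sigma):
    """Q statistic for the last reading in xs, mean unknown, sigma known."""
    r = len(xs)
    xbar_prev = sum(xs[:-1]) / (r - 1)
    return math.sqrt((r - 1) / r) * (xs[-1] - xbar_prev) / sigma

def q_spread(x_r, x_prev, sigma):
    """Q statistic for spread from a pair of readings (r even)."""
    w = (x_r - x_prev) ** 2 / (2.0 * sigma ** 2)
    p = math.erf(math.sqrt(w / 2.0))      # chi-square(1 df) percentile of w
    lo, hi = -10.0, 10.0                  # invert phi by bisection
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For the pair 7, 11 with σ = 2, q_location returns 1.414 and q_spread about 1.006, matching the hand calculation above.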

Note that both Qr(Xr) and Q(Rr) have limits of 0 ± 3, and hence may be plotted on
the same chart. Quesenberry has extended the Q chart to attributes data for both the
binomial and Poisson distributions.47,48
Emphasis on inventory control and just-in-time techniques has heightened the importance of statistical control in short-run situations. A summary of short-run charts is given in Table 7.6.

Other Specialized Control Charts


Multivariate control charts are an important part of the methodology of process control. They provide a vehicle for simultaneous control of several correlated variables. With, say, five process variables of concern, it is possible to run a single T² control chart that will indicate when any of them have gone out of control. This can be used to replace the five individual charts used if these variables are treated in a conventional manner. The T² chart also incorporates any correlation that may exist between the variables and thus overcomes difficulty in interpretation that may exist if the variables are treated separately. Multivariate methods are discussed in detail by Jackson.49

47. C. P. Quesenberry, “On Properties of Binomial Q Charts for Attributes,” Journal of Quality Technology 27, no. 3
(July 1995): 204–13.
48. C. P. Quesenberry, “On Properties of Poisson Q Charts for Attributes,” Journal of Quality Technology 27, no. 4
(October 1995): 293–303.
49. J. E. Jackson, “Principal Components and Factor Analysis,” Journal of Quality Technology, part 1, vol. 12
(October 1980): 201–13; part 2, vol. 13 (January 1981): 46–58; part 3, vol. 13 (April 1981): 125–30.
Chapter 7: Principles and Applications of Control Charts 241

Table 7.6 Summary of short-run control chart plotting measures and limits.

Chart                     Plot                                           Limits
Difference                D = X̄ − T                                      D̄ ± A2R̄
Standard                  Z = (x̄ − μ)/σ                                  0 ± 3
Bothe:
  Location                BL = (x̄ − x̿)/R̄                                0 ± A2
  Spread                  Bs = R/R̄                                      D3, D4
Quesenberry:
  Location (σ known)      Qr(xr) = [(r − 1)/r]^(1/2) [(xr − x̄r−1)/σ]     0 ± 3
  Spread*                 Q(Rr)                                         0 ± 3

* Q(Rr) equals the Z value corresponding to the normal percentile equal to the χ²
percentile of χ²(ν = 1) = (xr − xr−1)²/(2σ²), for even values only.

Finally, quality scores are often plotted on a demerit per unit or OPQR (outgoing
product quality rating) chart. Such charts provide weights for various nonconformities
to give an overall picture of quality. See Dodge and Torrey50 and Frey.51 See also the
discussion in Case History 6.1 on the CD-ROM.
An excellent comparison of control chart procedures has been given by Freund.52

7.16 HOW TO APPLY CONTROL CHARTS


There are a variety of control chart forms and procedures. Some of these are summa-
rized in Table 7.7, which shows the type of chart, the type of data to which it applies,
its use, and the level of sophistication required for effective application.
Effective application of control charts requires that the chart selected be appropri-
ate to the application intended. Some reasonable selections for various alternatives are
shown in Table 7.8.
Proper use of control charts requires that they be matched to the degree of control
the process has exhibited, together with the extent of knowledge and understanding

50. H. F. Dodge and M. N. Torrey, “A Check Inspection and Demerit Rating Plan,” Industrial Quality Control 13,
no. 1 (July 1956): 5–12.
51. W. C. Frey, “A Plan for Outgoing Quality,” Modern Packaging (October 1962).
52. R. A. Freund, “Graphical Process Control,” Industrial Quality Control (January 1962): 15–22.

Table 7.7 Use of control charts.

Type                       Data                               Use                                                 Level
X                          Measurement                        Rough plot of sequence                              B*
X̄ and R                    Measurement                        In-plant by operator                                B
X̄ and s                    Measurement                        In-plant by computer                                A*
Median, R, s               Measurement                        Excellent introductory tool                         B
NLG                        Measurement                        In-plant ease of gauging with greater sensitivity   B
p                          Proportion                         Attributes comparison                               B
np                         Number defective                   In-plant attributes                                 B
c                          Defects                            In-plant defects                                    B
u                          Defects/unit                       Defects comparison                                  B
CUSUM                      Measurement, proportion, defects   One observation at a time; engineering analysis;
                                                              natural for computer                                A
Moving X̄ and R             Measurement                        In-plant, one observation at a time                 B
Geometric moving average   Measurement                        Continuing sequences; no definite period            A
Demerits/unit              Attributes characteristics         Audit                                               B
Adaptive                   Real-time measurements             System feedback and control                         A
Acceptance control         Measurements                       Fixes risks; combines acceptance sampling
                                                              and process control                                 A
T2                         Multiple correlated measurements   Combined figure of merit for many characteristics   A

* B—basic; A—advanced.

Table 7.8 Selection of chart.

                                                        Data
Purpose                                        Individuals                  Subgroups
Overall indication of quality                  CUSUM p, c                   Shewhart p, c
Attain/maintain control of attributes          CUSUM p, c                   Shewhart p, c
Attain/maintain control of measurement         Moving X̄, R; geometric       Shewhart X̄, R; NLG
                                               moving average
Attain/maintain control of correlated          Multivariate control chart   Multivariate control chart
characteristics
Feedback control                               Adaptive                     Adaptive
Investigate assignable causes                  CUSUM                        Analysis of means
Overall audit of quality                       Demerits/unit                Demerits/unit
Acceptance with control                        CUSUM                        Acceptance control

which has been achieved at a given time. In this way the sophistication and frequency
of charting may be changed over time to stay in keeping with the physical circumstances of
the process. This progression is shown in Figure 7.18.
As a process or product is introduced, little is known about potential assignable
causes or, in fact, the particular characteristics of the process that require control. At that
time, it is appropriate to do 100 percent inspection or screening while data is collected

                        Process understanding
Control        Little         Some         Extensive
Excellent      X̄, R           NLG          Spot check
Average        p, c           X̄, R         NLG
Poor           X chart        p, c         X̄, R

Figure 7.18 Progression of control charts.

Figure 7.19 Time line for control: as a product matures from new to old, inspection
progresses from screening, to a sampling plan, to process control, to check inspection,
and finally to no inspection.

to allow for implementation of more economic procedures. After the pilot plant and
start-up phase of production process development, acceptance sampling plans may be
instituted to provide a degree of protection against an out-of-control process while at the
same time collecting data for eventual implementation of process control. Whenever
dealing with a process, acceptance sampling should be viewed as an adjunct and pre-
cursor of process control, rather than as a substitute for it.
Sometimes acceptance sampling plans can be used to play the role of a process con-
trol device. When this is done, emphasis is on feedback of information rather than sim-
ple acceptance or rejection of lots. Eventually enough information has been gathered to
allow implementation of control charts and other process control devices along with
existing acceptance sampling plans. It is at this point that acceptance sampling of lots
should be phased out in favor of expanded process control. In its turn, when a high
degree of confidence in the process exists, control charts should be phased out in favor
of check inspections, such as precontrol and eventually process checking or no inspec-
tion at all. These ideas are illustrated in Figure 7.19.
It will be seen that there is a lifecycle in the application of control charts. Preparation
requires investigation of the process to determine the critical variables and potential

Stage         Step                                            Method
Preparatory   State purpose of investigation                  Relate to quality system
              Determine state of control                      Attributes chart
              Determine critical variables                    Fishbone diagram
              Determine candidates for control                Pareto chart
              Choose appropriate type of chart                Depends on data and purpose
              Decide how to sample                            Rational subgroups
              Choose subgroup size and frequency              Sensitivity desired
Initiation    Insure cooperation                              Team approach
              Train user                                      Team approach
              Analyze results                                 Look for patterns
Operational   Assess effectiveness                            Periodically check usage and relevance
              Keep up interest                                Change chart, involve users
              Modify chart                                    Keep frequency and nature of chart
                                                              current with results
Phaseout      Eliminate chart after purpose is accomplished   Go to spot checks, periodic sample
                                                              inspection, overall p, c charts

Figure 7.20 Lifecycle of control chart application.

rational subgrouping. Motivational aspects should be considered in implementation. This
is often accomplished by using a team approach while attempting to involve the operators
and supervisors as much as possible. Charts must be changed over the life of the
application to sustain interest. A given application might utilize p charts, X̄ and R charts,
median charts, narrow-limit charts, and so forth, successively in an effort to draw
attention to the application. Eventually, of course, with the assurance of continued control,
the charts should be withdrawn in favor of spot checks as appropriate. This is seen in
Figure 7.20.
Certain considerations are paramount in initiation of a control chart, including
rational subgrouping, type of chart, frequency, and the type of study being conducted.
A check sequence for implementation of control charts is shown in Figure 7.21.
Control charts are not a cure-all. It takes a great deal of time and effort to use them
properly. They are not appropriate in every situation to which statistical quality control is
to be applied.
A retailer with a large number of small job-shop vendors is hard put to insist on
process control at the source for acceptance of products, since only a few pieces are
made and purchased at any given time. Here, acceptance sampling is the method of
choice. On the other hand, a large firm dealing with a large amount of product from a
few vendors is well advised to work with the vendors to institute process control at the
source, thus relieving the necessity for extensive incoming inspection.
These ideas were well-summarized by Shewhart as follows in his description of the
use of control charts in process control.53

53. W. A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand, 1931).

Determine purpose of chart → Consider rational subgroups → Determine spacing and
type of chart → Determine subgroup size and frequency → Implement chart.
If in control: hands off; eventually modify or phase out; spot check.
If out of control: process study; determine assignable causes; correct process.

Figure 7.21 Check sequence for control chart implementation.

. . . control of this kind cannot be reached in one day. It cannot be reached in
the production of a product in which only a few pieces are manufactured. It can,
however, be approached scientifically in a continuing mass production.

7.17 PRACTICE EXERCISES


For the exercises below, consider the following data taken from Table 7.9.
1. Prepare a median chart for the data in Table 7.9.

2. What sample size would allow medians to be plotted on an X̄ chart for
samples of six with no change in the positioning of the limits?
3. Prepare a chart for the midrange using the data in Table 7.9 with samples
of five.
4. Prepare an s chart for the data in Table 7.9.

Table 7.9 Data: air-receiver magnetic assembly (depth of cut).
Taken at 15-minute intervals in order of production.

Sample  Measurements                       X̄      Range R
1 160.0 159.5 159.6 159.7 159.7 159.7 0.5
2 159.7 159.5 159.5 159.5 160.0 159.6 0.5
3 159.2 159.7 159.7 159.5 160.2 159.7 1.0
4 159.5 159.7 159.2 159.2 159.1 159.3 0.6
5 159.6 159.3 159.6 159.5 159.4 159.5 0.3
6 159.8 160.5 160.2 159.3 159.5 159.9 1.2
7 159.7 160.2 159.5 159.0 159.7 159.6 1.2
8 159.2 159.6 159.6 160.0 159.9 159.7 0.8
9 159.4 159.7 159.3 159.9 159.5 159.6 0.6
10 159.5 160.2 159.5 158.9 159.5 159.5 1.3
11 159.4 158.3 159.6 159.8 159.8 159.4 1.5
12 159.5 159.7 160.0 159.3 159.4 159.6 0.7
13 159.7 159.5 159.3 159.4 159.2 159.4 0.5
14 159.3 159.7 159.9 159.5 159.5 159.4 1.4
15 159.7 159.1 158.8 160.6 159.1 159.5 1.8
16 159.1 159.4 158.9 159.6 159.7 159.3 0.8
17 159.2 160.0 159.8 159.8 159.7 159.7 0.8
18 160.0 160.5 159.9 160.3 159.3 160.0 1.2
19 159.9 160.1 159.7 159.6 159.3 159.7 0.8
20 159.5 159.5 160.6 160.6 159.8 160.0 1.1
21 159.9 159.7 159.9 159.5 161.0 160.0 1.5
22 159.6 161.1 159.5 159.7 159.5 159.9 1.6
23 159.8 160.2 159.4 160.0 159.7 159.8 0.8
24 159.3 160.6 160.3 159.9 160.0 160.0 1.3
25 159.3 159.8 159.7 160.1 160.1 159.8 0.8
X̿ = 159.67    R̄ = 0.98

5. If the specifications for the data in Table 7.9 are LSL = 159.0 in. and
USL = 160.0 in., and it is known that σ = 0.4, find limits for a modified-
limit control chart for samples of ng = 5. Why won’t the chart work?
6. Using the specification limits from Exercise 5, set up an acceptance control
chart with an upper and lower APL of 159.5 using σ = 0.4 and α = 0.05
with samples of ng = 5. Back-calculate to determine the RPL having β = 0.10.
Note that the APL and RPL are process levels and not specifications on
individuals.
7. Plot an exponentially weighted moving average chart of the means from
Table 7.9. Use σ = 0.4. Note

σX̄ = σ/√ng

8. Plot a CUSUM chart for the means of the samples from Table 7.9. Use
σ = 0.4. Remember to use

σX̄ = σ/√ng

9. Convert the limits from Exercise 8 into values of H and F.


10. Plot a Kemp chart for the sample averages of Table 7.9. Use σ = 0.4. Note that

σX̄ = σ/√ng

11. Set up precontrol limits against the specifications in Exercise 5. Using the
data in Table 7.9 in sequence, sample by sample, how many samples would
be taken before a problem is detected?
8
Process Capability, Performance, and Improvement

8.1 PROCESS CAPABILITY


What do we mean by the capability of a process? The ATT Statistical Quality Control
Handbook states, “The natural behavior of the process after unnatural disturbances are
eliminated is called the process capability.”1 The handbook emphasizes that a process
capability study is a systematic investigation of the process using control charts to deter-
mine its state of control, checking any lack of control for its cause, and taking action to
eliminate any nonrandom behavior when justified in terms of economics or quality.
Process capability can never be divorced from control charts or from the concepts of
control that Shewhart envisaged. It is basic to the process and may be thought of as
inherent process capability. This may be estimated from a range or standard deviation
chart of past data, but it can be measured only when the process itself is in control (on
the X̄ chart also, because of possible synergistic effects on the spread that come about
by bringing the average under control). It is not necessary for the process to be normally
distributed to use control charts to effect control. Hence, measures of location and
spread will not always be found independent of each other, and so complete control is
required to establish the capability of a process.
Again,
There is no capability without control!

It is important to observe that the inherent capability of a process has nothing to do


with specifications. It is a property of the process, not of the print. It is a natural pheno-
menon, which can be estimated with some effort and measured with even more. Process

capability studies may be performed on measurements with X̄ and R charts but also can
be accomplished with p or c charts to demonstrate a condition of control. A p chart in

1. B. B. Small (ed.), ATT Statistical Quality Control Handbook (New York: ATT, 1956).


control at three percent nonconforming says that the inherent capability of that process is
three percent unless something is done to change the process itself from what it is now.

8.2 PROCESS OPTIMIZATION STUDIES


Process capability studies are one of four types of studies that may be performed in what
Mentch calls a process optimization program.2 These are:
• Process performance check. A quick check of the product produced by a
process. Based on a small amount of data at a given time, it gives a snapshot
of the process performance within a limited time frame. Output is short-run
capability. Example: Calculate gas mileage from one tankful.
• Process performance evaluation. A comprehensive evaluation of product
produced by the process based on whatever historical data is available, it gives
a moving picture of how the process has performed in the past. Output is the
estimated process capability that could be achieved. Example: Study past
records to estimate how good mileage could be.
• Process capability study. An investigation undertaken to actually achieve a
state of statistical control on a process based on current data taken in real time,
including efforts to achieve control. It gives a live image of control. Output is
the inherent process capability of the controlled process. Example: Study gas
mileage and make adjustments until control is achieved.
• Process improvement study. A comprehensive study to improve a process that is
not capable of meeting specifications even though it is in statistical control. It
gives a vision of what the process could be and sets out to attain it. Output (if
successful) is target capability. Example: Modify car after study by changing
exhaust system to increase gas mileage.
It is clear that these studies are sometimes performed individually although they can
comprise the elements of a complete process improvement program.
The process performance check is conducted in a short time frame. From one day
to one week is typical. Based on existing data, the primary tools are frequency distrib-
utions, sample statistics, or Pareto analysis of attributes data. It is simple to perform,
usually by one person, and may lead to quick corrective action.
The process performance evaluation is a longer-term study, typically of a few
weeks. It is based on existing historical data, usually enough for a control chart. The
primary tools are X̄ and R charts for variables data; p, np, or c charts for attributes data;
and sometimes Pareto analysis. Usually done by one person, it can lead to relatively
quick corrective action. Process performance evaluations are, however, based on “what

2. C. C. Mentch, “Manufacturing Process Quality Optimization Studies,” Journal of Quality Technology 12, no. 3
(July 1980): 119–29.
Chapter 8: Process Capability, Performance, and Improvement 251

has been done” not on “what can be done.” This can lead to underestimates of process
capability when it is not practical to eliminate assignable causes, or to overestimates
when synergistic effects exist between level and spread.
The process capability study is a much longer-term study, usually of a month or
more. It is conducted using data from current production in an effort to demonstrate

inherent process capability, not to estimate or predict it. Tools used are X̄ and R charts
for variables and p, np, or c charts for attributes. Process capability is the best in-control
performance that an existing process can achieve without major expenditures. Such
studies are relatively inexpensive and while it is possible for a single person to perform
them, they are normally conducted by a team.
A process improvement study is usually recommended only after a process capa-
bility study has shown the present process (equipment) to be inadequate. It requires par-
ticipation by all interested parties from the very beginning. A cost analysis should be
performed at the outset since such studies can be expensive. A working agenda should
be drawn up and control charts kept throughout, verifying improvements when they
occur. Tools here include design of experiments, regression, correlation, evolutionary
operations (EVOP), and other advanced statistical techniques. This is almost always a
team project with management leadership.
Proper use of these studies will identify the true capability of the process. They can
be used progressively as necessary in an improvement program, with each study lead-
ing to further process optimization. These studies are intended to pinpoint precise areas
for corrective action and bring about cost savings through yield improvement. Such
studies often result in cost avoidance by preventing unnecessary expenditures on new
processes or equipment. Don’t throw out the old process until you have a realistic esti-
mate of what it can do!

8.3 CAPABILITY AND SPECIFICATIONS


The capability of a process is independent of any specifications that may be applied to it.
Capability represents the natural behavior of the process after unnatural disturbances are
eliminated. It is an inherent phenomenon and is crudely measured by the 6s spread
obtained by using the estimate of standard deviation from an in-control chart for variation.
The thrust of modern quality control is toward reduction of variation. This follows
the Japanese emphasis on quality as product uniformity around a target rather than simple
conformance to specifications. Thus, process capability becomes a key measure of quality
and must be appropriately and correctly estimated.
While it is true that a product with less variation around nominal is, in a sense, better
quality, specifications will probably never be eliminated; for specifications tell us how
much variation can be tolerated. They provide an upper limit on variation that is impor-
tant in the use of the product, but which should be only incidental to its manufacture. The
objective of manufacture should be to achieve nominal, for the same product can be sub-
jected to different specifications from various customers. Also, specifications are not
stable over time. They have a tendency to shrink. The only protection the manufacturer

has against this phenomenon is to strive for product as close to nominal as possible in
a constant effort toward improvement through reduction in variation. Otherwise, even
the best marketing plan can be defeated by a competitor who has discovered the secret
of decreased variation.
When a duplicate key is made, it is expected to fit. If it is too thick, it will not fit.
If it is too thin, it may snap off. If it is at nominal, the customer will be pleased by its
smooth operation. If quality is measured by the tendency to repurchase, the user will
avoid purchase when product has been found out of spec, but the customer will be
encouraged to repurchase where product is made at nominal. Thus, specifications should
be regarded as an upper bound or flag beyond which the manufacturer should not tres-
pass. But nominal product is the hallmark of a quality producer.
The idea of relating specifications to capability is incorporated in the capability
index, Cp, where

Cp = Spread of specifications / Process spread = (USL − LSL)/(6σ)

A process just meeting specifications has Cp = 1. Sullivan has pointed out that the
Japanese regard Cp = 1.33 as a minimum, which implies an 8σ spread in the specifications,
with Cp = 1.66 preferred.3 The value of σ should represent the best estimate of
the process variability, which will be σ̂ST.4 In the case of an in-control process, the
estimate of σ̂LT will be very close to σ̂ST. According to Bothe, if the process is in a “perfect”
state of control, σ̂LT will be identical to σ̂ST.5 However, some long-term changes in the
process will be so small that they may not be detected on the control chart. Detection
of these small changes will be difficult unless the subgroup size is substantially
increased. These undetected changes between the subgroups result in the estimate of σ̂LT
being slightly greater than the estimate of σ̂ST, even when the chart appears to be in a
state of statistical control.
Cp values of 3, 5, and 8 can be found in practice. There is, in fact, a relation between
the Cp index, acceptable quality level (AQL), and parts per million (ppm) as follows:

Cp AQL(%) AQL(ppm)
0.5 13.36 130,000
0.75 2.44 24,400
1.00 0.26 2,500
1.25 0.02 200
1.33 0.003 30
1.50 0.001 10
1.63 0.0001 1.0
1.75 0.00001 0.1
2.00 0.0000002 0.002

3. L. P. Sullivan, “Reducing Variability: A New Approach to Quality,” Quality Progress 17, no. 7 (July 1984): 15–21.
4. Davis R. Bothe, Measuring Process Capability (New York: McGraw-Hill, 1997): 39.
5. Ibid.

This correctly implies that the way to achieve quality levels in the range of parts per
million is to work on the process to achieve Cp in excess of 1.33. This can be done and
is being done through the methods of statistical process control.
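Under the table's implicit assumption of a normal process centered between two-sided limits, the ppm column follows from the normal tail area at Z = 3Cp. A quick sketch (the function name is ours); it tracks the table's entries, which are themselves rounded in the original:

```python
from statistics import NormalDist

def ppm_out(cp):
    """Expected ppm outside two-sided specifications for a centered normal
    process: fraction nonconforming = 2 * Phi(-3 * Cp), scaled to ppm."""
    return 2 * NormalDist().cdf(-3 * cp) * 1e6

for cp in (0.5, 0.75, 1.0, 1.25):
    print(cp, round(ppm_out(cp)))
```

For example, Cp = 0.75 gives about 24,400 ppm and Cp = 1.00 about 2,700 ppm; the printed table rounds some entries more heavily.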
A rough guess at the Cp index will indicate the type of process optimization study
that may be appropriate. We have

Cp Study
<1 Process improvement
1–1.3 Process capability
1.3–1.6 Performance evaluation
> 1.6 Performance check

Consider the mica thickness data. Since the s chart is in control,6 we estimate
process capability using σ̂LT = s̄/c4 = 2.09/0.94 = 2.11. The spread in the specifications
is (15 − 8.5) = 6.5, so

Cp = 6.5/(6(2.11)) = 0.47
Clearly, the process is inferior. A process improvement study is definitely called for.
Note that use of Cp implies the ability to hold the mean at nominal, that is, a process
in control. When the process is centered away from nominal, the standard deviation
used in Cp is sometimes calculated using nominal μ0 in place of the average of the data
so that

σ̂LT = [Σ(x − μ0)²/(n − 1)]^(1/2)

This will give an inflated measure of variability and decrease Cp.


When a single specification limit is involved, or when the process is deliberately
run off-center for physical or economic reasons, the Cpk index is used. Cpk is a truer
measure of the process capability relative to how far the process is off target and how
variable it is, which results in the potential for making pieces out of specification. In the
case of an out-of-control process, the use of σ̂ST means that the estimate of Cpk will
represent how well you could run the process with respect to the specification.

6. We can estimate a potential Cp in the case of an out-of-control process by using the short-term estimate of σ. For
example, an estimate based on the range would be σ̂ST = R̄/d2. In the case of the mica thickness data, an estimate of
Cp would be

Cp = (USL − LSL)/(6(R̄/d2)) = (15 − 8.5)/(6(4.875/2.33)) = 6.5/(6(2.09)) = 0.52

Of course, since the process is out of control, this estimate is a prediction of the potential capability of the process
(once assignable causes are removed).

Here

Cpk = (USL − x̄)/(3σ̂ST)

and/or

Cpk = (x̄ − LSL)/(3σ̂ST)

When the process is offset and two-sided specification limits are involved, the capability
index is taken to be the minimum of these two values. Note that the estimate of
the standard deviation must be consistent with the specification being considered. If the
specification is based on individual values, then σ̂ = σ̂ST is an appropriate estimate
of the process standard deviation. However, if the specification is based on average
values, then the value of Cpk would use

σ̂X̄ = (σST²/n)^(1/2)

In the case of the mica thickness data, the specification limits are based on individual
values, so the estimate of Cpk is

Cpk = min{USL − X̄, X̄ − LSL}/(3σ̂ST)
    = min{15 − 11.152, 11.152 − 8.5}/(3(2.09))
    = min{3.848, 2.652}/6.27
    = 0.42
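The mica thickness arithmetic condenses into a short sketch (the helper names are ours; σ̂ST = 2.09 is the range-based estimate of footnote 6, which also gives the potential Cp of 0.52):

```python
def c_p(usl, lsl, sigma):
    """Capability index: specification spread over 6-sigma process spread."""
    return (usl - lsl) / (6 * sigma)

def c_pk(usl, lsl, xbar, sigma):
    """Capability relative to the nearer specification limit."""
    return min(usl - xbar, xbar - lsl) / (3 * sigma)

USL, LSL, XBAR, SIGMA_ST = 15.0, 8.5, 11.152, 2.09
print(round(c_p(USL, LSL, SIGMA_ST), 2))          # 0.52, potential Cp of footnote 6
print(round(c_pk(USL, LSL, XBAR, SIGMA_ST), 2))   # 0.42
```

Because the process mean (11.152) sits below the midpoint of the specifications, Cpk comes out below Cp, exactly as the bulleted relationship below describes.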

The relationship between Cp and Cpk can be described as:


• Cpk can be equal to but never larger than Cp
• Cp and Cpk are equal only when the process is centered on target
• If Cp is larger than Cpk , then the process is not centered on target
• If both Cp and Cpk are > 1, the process is capable and performing within
the specifications
• If both Cp and Cpk are < 1, the process is not capable and not performing
within the specifications

• If Cp is > 1 and Cpk is < 1, the process is capable, but is not centered and not
performing within the specifications
Neither Cp nor Cpk should be considered as absolute metrics of process capability.
Both of these metrics are based on sample estimates, so they are subject to error.
100(1 − α)% confidence intervals for Cp and Cpk that utilize Table A.1 are shown,
respectively, by Kotz and Johnson7 to be

Ĉp [1 − 2/(9(n − 1)) − Z1−α/2 (2/(9(n − 1)))^(1/2)],

Ĉp [1 − 2/(9(n − 1)) + Z1−α/2 (2/(9(n − 1)))^(1/2)]

and

Ĉpk − Z1−α/2 [(n − 1)/(9n(n − 3)) + (Ĉpk²/(2(n − 3)))(1 + 6/(n − 1))]^(1/2),

Ĉpk + Z1−α/2 [(n − 1)/(9n(n − 3)) + (Ĉpk²/(2(n − 3)))(1 + 6/(n − 1))]^(1/2).

The 95 percent confidence interval for Cp for the mica thickness data would be

0.47 [1 − 2/(9(200 − 1)) ± (1.96)(2/(9(200 − 1)))^(1/2)]
0.47 (0.99888 ± (1.96)(0.033417))
0.47 (0.99888 ± 0.06550)
0.47 (0.99888 − 0.06550), 0.47 (0.99888 + 0.06550)
0.47 (0.93338), 0.47 (1.06438)
0.44, 0.50

7. S. Kotz and N. L. Johnson, Process Capability Indices (London: Chapman & Hall, 1993).

and for Cpk, the 95 percent confidence interval is

0.42 ± 1.96 [(200 − 1)/(9(200)(200 − 3)) + ((0.42)²/(2(200 − 3)))(1 + 6/(200 − 1))]^(1/2)
0.42 ± 1.96 {0.000561 + (0.1764)(0.002538)(1.030151)}^(1/2)
0.42 ± 1.96 (0.001022)^(1/2)
0.42 ± 0.06
0.36, 0.48
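These intervals translate directly into code. A sketch (function names ours) that reproduces the mica results:

```python
from math import sqrt

def ci_cp(cp_hat, n, z):
    """100(1 - alpha)% interval for Cp; z is Z at the 1 - alpha/2 point."""
    a = 2 / (9 * (n - 1))
    return cp_hat * (1 - a - z * sqrt(a)), cp_hat * (1 - a + z * sqrt(a))

def ci_cpk(cpk_hat, n, z):
    """100(1 - alpha)% interval for Cpk (Kotz-Johnson approximation)."""
    var = (n - 1) / (9 * n * (n - 3)) \
        + cpk_hat ** 2 / (2 * (n - 3)) * (1 + 6 / (n - 1))
    half = z * sqrt(var)
    return cpk_hat - half, cpk_hat + half

print([round(v, 2) for v in ci_cp(0.47, 200, 1.96)])    # [0.44, 0.5]
print([round(v, 2) for v in ci_cpk(0.42, 200, 1.96)])   # [0.36, 0.48]
```

Note how quickly the intervals widen for small n; quoting a capability index from a short run without its interval can be badly misleading.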

Case History 8.1


The Case of the Schizophrenic Chopper

Introduction
A plant was experiencing too much variation in the length of a part that was later fab-
ricated into a dimensionally critical component of an assembly. The part was simply cut
from wire that was purchased on spools. Two spools were fed through ports into a chop-
per, one on the left and one on the right, so that two parts could be cut at one blow. The
parts then fell into a barrel, which was periodically sampled.

A histogram of a 50-piece sample from the barrel showed X̄ = 49.56 mils with a
standard deviation of 0.93 mils. The specifications were 44.40 ± 0.20 mils. Clearly, this
check showed process performance to be inadequate. A process performance study of
past samples showed several points out of control for the range and wide swings in the
mean well outside the control limits.
A new supervisor was assigned to the area and took special interest in this process.
It was decided to study its capability. A control chart for samples of size 5 was set up
on the process and confirmed the previous results.
One day the control chart exhibited excellent control. The mean was well-behaved
and the ranges fell well below the established centerline of R̄ = 2.2. What had happened?
The best place to find out was at the chopper. But this was a period of low productivity
because wire was being fed in from one side only. The other side was jammed. Perhaps
that had an effect. Perhaps it was something else.
It was then that the supervisor realized what had happened. Each side of the chop-
per was ordinarily set up separately. That would mean that any drift on either side would
increase the spread of the product and, of course, shift the mean. What he had learned
about rational subgrouping came back to him. It would be sensible to run control charts
on each side separately, and then they could be adjusted as needed and be kept closer to
nominal. Closer to nominal on the two sides meant less overall variation in the product
and better control of the mean. This could well be the answer.

Control charts were set up on the two sides separately. They stayed in reasonable
control and were followed closely so that adjustments would be made when they were
needed (and not when they were not needed). Assignable causes were now easier to find
because the charts showed which side of the chopper to look at. The control charts for
the range eventually showed R̄ = 0.067 for the right side and R̄ = 0.076 for the left. The
mixed product had σ̂ = 0.045. This gave a capability index of Cp = 0.40/0.27 = 1.48.
The sorting operation was discontinued as the product had attained uniformity beyond
the hopes of anyone in the operation. All this at a cost of an additional five samples plot-
ted for the second chart. This is an example of what can be done with statistical process
control when it is properly applied.
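The arithmetic behind that index is worth making explicit. A small sketch (the subgroup size of five, and hence d2 = 2.326, is our assumption; the case history does not state it):

```python
# Per-side sigma estimates from the separate range charts (R-bar / d2),
# assuming subgroups of five so that d2 = 2.326 -- an illustrative assumption.
d2 = 2.326
sigma_right = 0.067 / d2
sigma_left = 0.076 / d2

# Capability of the mixed product: spec spread 2(0.20) = 0.40 mils
# over six times the reported sigma-hat of 0.045.
cp = 0.40 / (6 * 0.045)
print(round(sigma_right, 3), round(sigma_left, 3), round(cp, 2))   # 0.029 0.033 1.48
```

The payoff of charting the two sides separately shows up here: each side's own spread is far tighter than the mixed barrel ever looked before the change.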

8.4 NARROW-LIMIT GAUGING FOR PROCESS CAPABILITY
Narrow-limit gauging provides a natural tool for evaluating process capability and asso-
ciated parts per million (ppm) by taking advantage of the increased sensitivity afforded
by the compressed gauge. Approximate narrow-limit plans can be devised using the
Sommers approximation described in Chapter 6.8
Consider the formula for Cpk against an upper specification limit USL; then

Cpk = (USL − μ)/(3σ)

3Cpk = (USL − μ)/σ

3Cpk = Zp

where Zp is the Z value from Table A.1 for the fraction of product out of the specification.
By symmetry, this is applicable to lower specification limits as well.
Suppose it is necessary to distinguish between two process capabilities Cpk1 and Cpk2
with risks α and β, respectively. The Sommers approximation for an optimal narrow-limit
plan may be expressed in terms of the capability indices as follows:

    n = 1.5[(Zα + Zβ)/(Zp1 − Zp2)]² = (1.5/9)[(Zα + Zβ)/(Cpk1 − Cpk2)]²
      = (1/6)[(Zα + Zβ)/(Cpk1 − Cpk2)]²

    t = (Zp2 Zα + Zp1 Zβ)/(Zα + Zβ) = 3(Cpk2 Zα + Cpk1 Zβ)/(Zα + Zβ)

    c = 0.5n − 0.67

8. E. G. Schilling and D. J. Sommers, “Two Point Optimal Narrow Limit Plans with Applications to MIL-STD-105D,”
Journal of Quality Technology 13, no. 2 (April 1981): 83–92.
258 Part II: Statistical Process Control

When α = β, these become

    n = (2/3)[Zα/(Cpk1 − Cpk2)]²

    t = (3/2)(Cpk1 + Cpk2)

    c = 0.5n − 0.67

For example, if it is desired to determine whether a process is running at Cpk1 = 1.5 but
not lower than Cpk2 = 1.0 with equal risks α and β of 0.0227, respectively, we have
Zα = Zβ = 2.0 and

    n = (2/3)[2.0/(1.5 − 1.0)]² = 10.67 ≈ 11

    t = (3/2)(1.5 + 1.0) = 3.75

    c = 0.5(10.67) − 0.67 = 4.66 ≈ 5

The sampling plan is n = 11, t = 3.75, and c = 5.


The properties of this procedure can be assessed from the OC curve, two points
of which are given in Table 8.1; the risks α = β = 0.0227 are very closely
approximated.
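The plan computations above can be sketched in a few lines of code (a sketch only; the function name and the use of Python's statistics.NormalDist to obtain the normal deviates are our own choices, not part of the original plan tables):

```python
from statistics import NormalDist

def narrow_limit_plan(cpk1, cpk2, alpha, beta):
    """Approximate narrow-limit plan (n, t, c) for discriminating between
    capability cpk1 (accept) and cpk2 (reject), via the Sommers
    approximation with Zp = 3 * Cpk."""
    za = NormalDist().inv_cdf(1 - alpha)  # normal deviate for the alpha risk
    zb = NormalDist().inv_cdf(1 - beta)   # normal deviate for the beta risk
    n = (1 / 6) * ((za + zb) / (cpk1 - cpk2)) ** 2
    t = 3 * (cpk2 * za + cpk1 * zb) / (za + zb)
    c = 0.5 * n - 0.67
    return n, t, c

# Reproduces the worked example: Cpk1 = 1.5, Cpk2 = 1.0, equal risks 0.0227
n, t, c = narrow_limit_plan(1.5, 1.0, 0.0227, 0.0227)
# n is about 10.67 (use 11), t about 3.75, c about 4.66 (use 5)
```

In practice n is rounded up and c rounded to the nearest integer, as in the example above.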
It should be noted that this approach simplifies when dealing with the capability
index Cp. In this case, we have

    Cp = (USL − LSL)/6σ

so

    6Cp = (USL − LSL)/σ = 2(USL − µ)/σ

Table 8.1 Assessment of capabilities under narrow-limit plan (n = 11, t = 3.75, c = 5).

         Cpk    p            ppm     Zp    Z′ = Zp − t    p′        Pa
Cpk1     1.5    0.0000034      3.4   4.5      0.75        0.2266    0.9785
Cpk2     1.0    0.00135     1350     3.0     −0.75        0.7734    0.0215
Chapter 8: Process Capability, Performance, and Improvement 259

and

    3Cp = (USL − µ)/σ

assuming centering of the process. The approach for Cpk given above may then be used
with either specification limit, remembering to use half the total risk for α in doing the
computation.
For example, if it is desired to evaluate the capability of a process thought to be
running at Cp = 1.5 against a possible alternative Cp = 1.0 with risks of α = β = 0.05,
it would be necessary to halve the α risk to 0.025, proceeding as above using the basic
Cpk computations.

8.5 PROCESS PERFORMANCE


It may be of interest to estimate the performance level of the process rather than its
capability with respect to the specification. We can think of process performance as
what the process does make with respect to the specifications. On the other hand,
process capability tells us what the process can make when it is in control. Pp is a simple
measure of process performance relative to the specification tolerance USL – LSL. The
idea of relating specifications to performance is incorporated in the index Pp where

    Pp = (Spread of specifications)/(Process spread) = (USL − LSL)/6σ̂LT

A process just meeting specifications has Pp = 1, but this is no guarantee that the
process is in a state of statistical control. This index is often misrepresented as Cp when
the short-term standard deviation σ̂ST is replaced by the long-term estimate σ̂LT, and the
process is incorrectly assumed to be in control.
When a single specification limit is involved, or when the process is deliberately run
off-center for physical or economic reasons, the Ppk index is used. Ppk is a truer measure of
the process performance relative to how far the process is off target and how variable it
is, which results in the potential for making pieces out of specification. In the case of an
out-of-control process, the use of ŝ LT means that Ppk will represent how well you are run-
ning the process with respect to the specification over a specified period of time.
Here

    Ppk = (USL − x̄)/3σ̂LT

and/or

    Ppk = (x̄ − LSL)/3σ̂LT

When the process is offset and two-sided specification limits are involved, the capa-
bility index is taken to be the minimum of these two values.
Note that the estimate of the standard deviation must be consistent with the specifi-
cation being considered. If the specification is based on individual values, then σ̂ = σ̂LT
is an appropriate estimate of the process standard deviation. However, if the specifica-
tion were based on average values, then the value of Ppk would use

    σ̂X̄ = √(σ̂²LT / n)
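A minimal sketch of Pp and Ppk for individual values follows (the function name and the illustrative data are ours; the overall sample standard deviation stands in for the long-term estimate σ̂LT):

```python
from statistics import mean, stdev

def pp_ppk(data, lsl, usl):
    """Process performance indices using the long-term (overall)
    standard deviation estimated from all individual values."""
    xbar = mean(data)
    s_lt = stdev(data)  # overall estimate of sigma-LT
    pp = (usl - lsl) / (6 * s_lt)
    ppk = min((usl - xbar) / (3 * s_lt), (xbar - lsl) / (3 * s_lt))
    return pp, ppk

# Illustrative data centered at 10.0, with specifications 9.0 to 11.0
data = [9.8, 10.0, 10.2, 10.1, 9.9]
pp, ppk = pp_ppk(data, lsl=9.0, usl=11.0)
# The process is centered here, so Pp and Ppk agree
```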

Bothe,9 Somerville and Montgomery,10 and Clements11 discuss several process


capability and performance measures for nonnormal distributions. If a quality charac-
teristic has a nonnormal distribution, the measures shown in this text are not appropri-
ate, though they are robust to slight departures from normality. Bothe presents this list
of commonly nonnormal characteristics:
• Taper
• Flatness
• Surface finish
• Concentricity
• Eccentricity
• Perpendicularity
• Angularity
• Roundness
• Warpage
• Straightness
• Squareness
• Weld or bond strength

9. D. R. Bothe, Measuring Process Capability (New York: McGraw-Hill, 1997): 431–513.


10. S. E. Somerville and D. C. Montgomery, “Process Capability Indices and Non-Normal Distributions,” Quality
Engineering 9, no. 2 (1996–97): 305–16.
11. J. A. Clements, “Process Capability Calculations for Non-Normal Distributions,” Quality Progress 22, no. 9
(September 1989): 95–100.

• Tensile strength
• Casting hardness
• Particle contamination
• Hole location
• Shrinkage
• Dynamic imbalance
• Insertion depth
• Parallelism
These characteristics all share the fact that they are bounded by some physical limit,
for example, particle contamination cannot be less than zero. As the process improves,
the average of the characteristic moves toward the bound. This produces an even more
skewed (nonnormal) distribution. There are two typical methods for estimating process
capability for nonnormal distributions:
1. Use the 0.135 and 99.865 percentiles of the distribution. In the case of
an underlying normal distribution, X99.865 − X0.135 would be the 6σ spread of
the process.
2. Use the percent nonconforming to estimate performance capability measures
for ppm and Ppk indices, as described by Bothe.12
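Method 1 can be sketched with a simple empirical-percentile computation (the interpolation scheme and function names are ours; in practice a fitted distribution, as in Clements's method, is usually preferred to raw sample percentiles):

```python
def percentile(sorted_xs, p):
    """Linear-interpolated empirical percentile; p is in percent."""
    k = (len(sorted_xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (sorted_xs[hi] - sorted_xs[lo]) * (k - lo)

def percentile_pp(data, lsl, usl):
    """Performance index using the 0.135 to 99.865 percentile spread
    in place of the 6-sigma spread of a normal distribution."""
    xs = sorted(data)
    spread = percentile(xs, 99.865) - percentile(xs, 0.135)
    return (usl - lsl) / spread

# For uniform data on [0, 1] the percentile spread is about 0.9973,
# so with specifications at 0 and 1 the index is close to 1
data = [i / 1000 for i in range(1001)]
result = percentile_pp(data, lsl=0.0, usl=1.0)
```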

8.6 PROCESS IMPROVEMENT


Control charts are the method of choice in conducting process optimization programs.
They separate the assignable or special causes of variation in the process that can be
corrected on the floor from those random or common causes which only redefinition of
the process by management can correct.
Dr. Deming has indicated that most people think the job of statistical process con-
trol has ended when the process is in control.13 That is only the beginning, for he empha-
sizes that this is the time to concentrate on elimination of the common causes.
This is not as difficult as it may appear and, in some sense, comes naturally in the
lifecycle of a product or process. During the development phase of a process, there
are often violent swings in average and spread. Since process variation is usually
more stable than the average, the spread can be expected to be controlled at some
stable level sometime during the introduction of the product. Thereafter, as the

12. D. R. Bothe, Measuring Process Capability (New York: McGraw-Hill, 1997).


13. W. E. Deming, Quality, Productivity, and Competitive Position (Cambridge, MA: MIT Center for Advanced
Engineering Study, 1982).

process matures using process control techniques, the average comes under control
and the process is reasonably stable. This provides an excellent opportunity for inno-
vation, for now the erratic state of lack of control has been eliminated, variation is stable,
and meaningful process improvement studies can be performed. It is at this point that
changes in the process can be implemented that will lead to further reduced variation
with still better process control.

8.7 PROCESS CHANGE


Statistical control of a process is not, in itself, the goal of process control. The objective
is, as pointed out by Shewhart, to obtain satisfactory, adequate, dependable, economic
quality. It is sometimes the case that a process produces inadequate quality even though
it is in a state of control. Natural variation may exceed the span of the specifications.
The variation may be acceptable, but it may not be possible to center the mean because
of trade-offs with other variables. It is in precisely this situation that a process improve-
ment program is appropriate to bring about change in the process—that is, a new
process—such that the desirable attributes of quality are achieved. This may require
new equipment, improved process flow, better raw materials, personnel changes, and so
forth, and may be quite expensive. No process improvement program should be under-
taken without a process capability study to show that the effort is justified. Such a pro-
gram is usually a team effort with representatives from manufacturing, quality, and
engineering, headed by a member of management. Mentch has outlined the steps in
a process improvement study as follows14:
1. Develop a formal work agenda for the team selected to perform the study,
including components to be worked on, priorities, responsibilities, and a
completion schedule. Compile cost analysis at every stage.
2. Determine critical problem areas through cause-and-effect analysis and
perform a Pareto analysis to show where effort should be directed.
3. Utilize statistically designed experiments, EVOP, and so forth, to show what
actions will be required to correct problem areas.
4. Continue control charts from the previous process capability study to show
the effect of changes made.
5. Conclude the program when the process is in control, running at an acceptable
rate, and is producing product that meets specifications so that further
expenditures are not justified.
6. Institute continuing controls to ensure that the problems do not reappear.

14. C. C. Mentch, “Manufacturing Process Quality Optimization Studies,” Journal of Quality Technology 12, no. 3
(July 1980): 119–29.

Figure 8.1 Cause-and-effect diagram for burned toast. (Man: forgot it was loaded,
wrong setting. Machine: poor thermostat, pop-up stuck. Material: bread too thin;
bread too thick and stuck. Method: poor instructions, bad design.)

8.8 PROBLEM IDENTIFICATION


Initial brainstorming sessions are aided by listing possible causes of a problem on a
cause-and-effect or fishbone diagram, developed by Professor Kaoru Ishikawa in 1950.
Its Japanese name is tokusei yōin zu, or characteristics diagram. It displays the causes
of a problem that lead to the effect of interest.
Often in problem solving, the skeletal framework is laid out in terms of the generic
categories of man (operator), machine, material, and method. Included in method is
management, although this is sometimes split out.
As a somewhat prosaic example of the use of this technique, suppose you are con-
fronted with burned toast for breakfast. Consider the possible causes. They can be laid
out using the cause-and-effect diagram as shown in Figure 8.1. Listing the causes in this
way facilitates their identification and prepares for Pareto analysis to assess their rela-
tive importance.

8.9 PRIORITIZATION
Pareto analysis addresses the frequency distribution associated with various causes.
Since the causes are normally nominal variables, the frequency distribution is ordered
from highest frequency to lowest. The resulting histogram and cumulative frequency
distribution are plotted to give a visual representation of the distribution of causes. This
will help separate out the most important causes to be worked on.
Vilfredo Pareto (1848–1923) studied the distribution of wealth in Italy and found
that roughly 20 percent of the population had 80 percent of the wealth. In addition, in
marketing it was later found that 20 percent of the customers account for roughly 80
percent of the sales. In cost analysis, 20 percent of the parts contain roughly 80 percent
of the cost, and so forth.
Juran was the first to identify this as a universal principle that could be applied to
quality and distinguish between what he called the “vital few” and the “trivial many.”15

15. J. M. Juran, “Pareto, Lorenz, Cournot, Bernoulli, Juran, and Others,” Industrial Quality Control 17, no. 4 (October
1960): 25.

A typical example might be the number of defects found in pieces of pressed glass over
the period of a month (see Table 8.2).
The resulting cumulative distribution of the causes is plotted in Figure 8.2. Note that
the bars shown correspond to the histogram of the causes.
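The ordering and cumulative percentages behind Figure 8.2 can be reproduced from the counts in Table 8.2 (the five largest causes plus a pooled remainder; the code and labels are illustrative only):

```python
def pareto_table(counts):
    """Sort causes by frequency and attach percent and cumulative percent."""
    total = sum(counts.values())
    rows, cum = [], 0
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cum += count
        rows.append((name, count, 100 * count / total, 100 * cum / total))
    return rows

# Counts from Table 8.2; 921 is the 13,528 total less the five largest causes
counts = {
    "Method A (design 1)": 3799,
    "Man D (Dan)": 3317,
    "Material B (supplier 2)": 2822,
    "Machine A (left)": 1543,
    "Material A (supplier 1)": 1126,
    "All others (23 items)": 921,
}
rows = pareto_table(counts)
# The leading cause, Method A, accounts for about 28 percent of all defects
```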

Table 8.2 Pressed-glass defects.*

                              Defects    Percent of all defects found
Man (operator)
  A—Jack                           36     0.27
  B—Lucille                        69     0.51
  C—Carl                           66     0.50
  D—Dan                         3,317    24.52
Machine
  A—left                        1,543    11.41
  B—middle                         95     0.70
  C—right                         120     0.89
Material
  A—supplier 1                  1,126     8.32
  B—supplier 2                  2,822    20.86
  C—supplier 3                    225     1.66
Method
  A—design 1                    3,799    28.08
  B—design 2                      159     1.18
  C—design 3                       35     0.26
Miscellaneous (12 items)          116     0.86
Total                          13,528

* Courtesy of C. C. Mentch.

Figure 8.2 Pareto diagram of pressed-glass defects. (Bars in descending order of
frequency: Method A, Man D, Material B, Machine A, Material A, and all others
(23 items), with the cumulative percent of total rising to 100.)



8.10 SUMMARY
We have looked at process quality control in the broad sense. By definition, it encom-
passes all aspects of an operation. Its objective is what Shewhart called SADE-Q, that
is, satisfactory, adequate, dependable, economic, quality. Three important aspects of
process quality control are process control, process capability, and process change.
These are tied together with the control chart. In process control, it is used to track the
process, separating chance causes from assignable causes. An in-control chart is neces-
sary for any reasonable assessment of process capability. In addition, it is an instrument
of process change. Thus, the control chart is the method of choice when dealing with
the statistical side of process quality control.

8.11 PRACTICE EXERCISES


1. Under what conditions should the process performance check be as effective
as a process performance evaluation?
2. Using the data in Table 7.9 and the specifications from Exercise 5 of
Chapter 7, compute the Cp index. What does it tell you?
3. Do a Pareto analysis of the demerits shown in Table 6.3 (on the CD-ROM
in Case History 6.1).
4. Your doorbell doesn’t work and you speculate on a cause. Draw up a
cause-and-effect diagram.
5. Using the data from Table 13.4, assume vials from Firm A were delivered
before the vials from Firm B and a short-run control chart is being kept on
the weight of the vials. Set up the following charts:
a. Difference chart from a target of 8.64. Use the moving range to estimate
the process standard deviation.
b. Standardized chart with the same target, but where process standard
deviation is known to be 2.34.
6. Consider the observations of transconductance given in Table 13.5. Assume
Melt A preceded Melt B. Set up the following charts:
a. Difference chart from a target of 4184. Use the moving range to estimate
the process standard deviation.
b. Standardized chart with the same target, but where the process is such that
the standard deviation is known to be 1266.
c. Bothe charts for average and range.
d. Q chart for average and variance, assuming a known standard deviation
of 1266.
Part III
Troubleshooting and
Process Improvement
9
Some Basic Ideas and
Methods of Troubleshooting
and Problem Solving

9.1 INTRODUCTION
In Chapters 2 and 5, a scientific process was studied by attempting to hold constant all
variables that were thought to affect the process. Then data obtained from the process
in a time sequence were examined for the presence of unknown causes (nonrandom-
ness) by the number and length of runs and by control charts. Experience in every
industry has shown that processes offer opportunities for economic improvement that
can be discovered by this approach.
When evidence of nonrandomness is observed, the assignable causes can some-
times be explained through standard engineering or production methods of investigation.
Sometimes the method of investigation is to vary one factor or several factors suspected
of affecting the quality of the process or the product. This should be done in a pre-
planned experimental pattern. This experimentation was formerly the responsibility of
persons involved in research and development. More recently, process improvement and
troubleshooting responsibilities have become the province of those engineers and super-
visors who are intimately associated with the day-to-day operation of the plant
processes. Effective methods of planning investigations have been developed and are
being applied. Their adoption began in the electrical, mechanical, and chemical indus-
tries. However, the principles and methods are universal; applications in other industries
may differ only in detail.
The following sections will outline some procedures for troubleshooting and ana-
lyzing data from investigations (experiments). Examples from different sciences and
industries will be presented to illustrate useful methods. We emphasize attributes data
in Chapter 11 and variables in Chapters 13, 14, and 15.

270 Part III: Troubleshooting and Process Improvement

9.2 SOME TYPES OF INDEPENDENT AND DEPENDENT VARIABLES
Introductory courses in science introduce us to methods of experimentation. Time, tem-
perature, rate of flow, pressure, and concentration are examples of variables often
expected to have important effects in chemical reactions. Voltage, power output, resis-
tance, and mechanical spacing are important in electronics and many laws involving
them have been determined empirically. These laws have been obtained from many lab-
oratory studies over long periods of time by many different experimenters. Such laws
are often known by the names of the scientists who first proposed and studied them. We
have special confidence in a law when some background of theory has been developed
to support it, but we often find it very useful even when its only support is empirical.
In order to teach methods of experimentation in science courses, students are often
assigned the study of possible effects of different factors. Different levels of tempera-
ture may be selected and the resultant responses determined. Hopefully, the response
will behave like a dependent variable. After performing the experimental study, a pre-
viously determined relationship (law) may be shown to the student to compare with the
experimental data. As specialized studies in a science are continued, we may be
assigned the project to determine which factors have major influence on a specific char-
acteristic. Two general approaches are possible:

1. Recognized Causative Variables (Factors)


We study the effects of many variables known to have been important in similar stud-
ies (temperature, light intensity, voltage, power output, as examples). This procedure is
often successful, especially in well-equipped research laboratories and pilot plants. This
is often considered basic to the “scientific method.”
Frequently, however, those scientific factors that are expected to permit predictions
regarding the new process are found to be grossly inadequate. This inadequacy is espe-
cially common when a process is transferred from the laboratory or pilot plant to pro-
duction. The predicted results may be obtained at some times but not at others, although
no known changes have been introduced. In these cases, the methods of Chapters 2 and
5 are especially relevant for checking on stability.

2. Omnibus-Type Factors1
Sometimes the results vary from machine to machine and from operator to operator. The
following fundamental “laws” have resulted from empirical studies in many types of
industry; they are presented with only slight “tongue in cheek”:

1. There is no term in common usage to designate what we mean by “omnibus-type” factors. Other terms that
might be used are bunch-type or chunky-type. The idea is that of a classification-type factor that will usually
require subsequent investigation to establish methods of adjustment or other corrective action. An omnibus-type
factor deliberately confounds several factors; some may be known and others unknown.
Chapter 9: Some Basic Ideas and Methods of Troubleshooting and Problem Solving 271

Consider k different machines assigned to the same basic operation:


• When there are three or four machines, one will be substantially better or
worse than the others.
• When there are as many as five or six machines, at least one will be
substantially better and one substantially worse than the others.
There are other important omnibus-type factors. We might study possible effects
in production from components purchased from different vendors;2 or differences
between k machines intended to produce the same items or materials; or differences be-
tween operators or shifts of operators. This type of experimentation is often called
“troubleshooting” or problem solving; its purpose is to improve either the product or
the process, or both.
A troubleshooting project often begins by studying possible differences in the
quality output of different machines, or machine heads, or operators, or other types of
variables discussed below. Then when important differences have been established,
experience has shown that careful study of the sources of better and worse perfor-
mance by the scientist and supervisor will usually provide important reasons for those
differences.
A key to making adjustments and improvements is in knowing that actual differ-
ences do exist and in being able to pinpoint the sources of the differences.
It is sometimes argued that any important change or difference will be evident to an
experienced engineer or supervisor; this is not the case. Certainly many important
changes and improvements are recognized without resort to analytical studies, but the
presence and identity of many economically important factors are not recognized with-
out them. Several case histories are presented throughout the following chapters that
illustrate this very important principle.

Variables Summary
Types of Independent Variables (Factors) in a Study
1. Continuous variables with a known or suspected association or effect on the
process: temperature, humidity, time of reaction, voltage. Sometimes these
variables can be set and held to different prescribed levels during a study—
sometimes they cannot be so controlled.
2. Discrete omnibus-type factors. Several examples will be given relating to
this type: different heads or cavities on a machine, different operators,
different times of day, and different vendors. Once it has been determined
that important differences do exist, it almost always leads to identification
of specific assignable causes and to subsequent process improvement.

2. See Case History 11.9.



Types of Quality Characteristics (Response Variables, Dependent Variables, Factors)


1. Measurable, variable factors: the brightness of a TV picture, the yield of a
chemical process, the breaking strength of synthetic fibers, the thickness of
a sheet of plastic, the life (in hours) of a battery.
2. Attributes or classification data (go/no-go): the light bulb will or will not
operate, or the content of a bottle is, or is not, underfilled. There are occasions
where the use of attributes data is recommended even though variables data
are possible. In Chapter 6, the important, practical methods of narrow-limit
gauging (NL gauging) were discussed.
Experimentation with variables response data is common in scientific investiga-
tions. Our discussion of experimentation in Chapters 13, 14, and 15 will consider vari-
ables data. In practice, however, important investigations frequently begin with quality
characteristics that cause rejects of a go/no-go nature. See Chapter 11 for discussions
involving their use.

9.3 SOME STRATEGIES IN PROBLEM FINDING, PROBLEM SOLVING, AND TROUBLESHOOTING
There are different strategies in approaching real-life experiences. The procedures pre-
sented here have been tested by many persons and in many types of engineering and
production problems. Their effective use will sometimes be straightforward, but will
always benefit from ingenuity in combining the art and science of troubleshooting.
It is traditional to study cause-and-effect relationships. However, there are fre-
quently big advantages to studies that only identify areas, regions, or classification as
the source of difference or difficulty. The pinpointing of specific cause and effect is thus
postponed. The omnibus-type factor may be different areas of the manufacturing plant,
or different subassemblies of the manufactured product. Several examples are discussed
in the following chapters and in case histories.
Two important principles need to be emphasized:

Basic Principle 1: Plan to learn something initially—but not everything
• This is important, especially in those many industrial situations where more
data are rather easily attainable.
• It is not possible to specify all the important rules to observe in carrying out
a scientific investigation, but a second very important rule to observe, if at all
possible, is Basic Principle 2:

Basic Principle 2: Be present at the time and place the data are being obtained,
at least for the beginning of the investigation
• This often provides opportunities to observe possible sources of error in the
data acquisition. Corrections may be possible by improving the data recording
forms, or by changing the type of measuring instrumentation.
• Observance of the data being obtained may suggest causative relationships that
can be proposed, questioned, or evaluated only at the time of the study.
• Observing the possibility of different effects due to operators, machines, shifts,
vendors, and other omnibus-type variables can be very rewarding.

Case History 9.1


Black Patches on Aluminum Ingots3

Introductory Investigations
While conducting some model investigations in different types of factories, an occasion
to investigate a problem of excessive black-oxidized patches on aluminum ingots came
into our jurisdiction. A general meeting was first held in the office of the plant manager.
At this meeting, the general purpose of two projects, including this specific one, was
described. This problem had existed for several months and competent metallurgists
had considered such possible causes as contaminants in aluminum pigs and differences
in furnace conditions.
Our study group had just two weeks to work on the problem; clearly, we could not
expect to become better metallurgists than those competent ones already available.
Rather than investigate the possible effects of such traditional independent variables as
furnace conditions (temperature and time, and so on), we considered what omnibus-type
variables might produce differences in the final ingots.
Planning the Study
The ingots were cast in 10 different molds. A traveling crane carried a ladle of molten
aluminum to a mold; aluminum was poured into the mold, where it was allowed to
solidify before removal by a hoist. The plant layout in Figure 9.1 shows the general
location of the 10 molds (M), the electric furnace and track, two doors, two walls, and
one window. It was considered that the location of these doors, windows, and walls

3. E. R. Ott, United Nations Technical Assistance Programme, report no. TAA/IND/18 (March 25, 1958).

Figure 9.1 Plant layout of molds and furnace. (Molds M1–M5 in a row along wall 1
and molds M6–M10 along wall 2, each wall with a door; the electric furnace and its
track lie between the two rows of molds, with a window at one end.)

might possibly affect the oxidation of the black patches. The location of the patches was
vaguely considered to be predominantly on the bottom of ingots.
It was decided to record the occurrence and location of the black patches on ingots
from a selected sample of molds, one from each in the order M1, M10, M3, M6, M5, M8.
Then the procedure was repeated once with these same six molds.
This selection of ingots would indicate whether the location of black patches would
occur and reoccur on some molds and not on others, whether it occurred in about the
same location on all molds, and whether it would reoccur in about the same location on
the same mold. If the reoccurrence of black patches was predictable, then the problem
was not contamination of the molten aluminum or furnace conditions, but would relate
to some condition of the molds. If the black patches did not reoccur in the same areas,
but their locations appeared random, then the problem might be of a metallurgical
nature. The problem might be contamination of the molten aluminum or in changing
conditions of the molds.
A comparison of “inside locations” (M3 and M8) with “outside locations” (M1, M5,
M6, M10) might also indicate possible effects related to distances from furnace, doors,
and windows.
How were the location and intensity of the black oxidation to be measured? There
is no standard procedure: (1) Often a diagram can be prepared and the location of
defects sketched or marked on it; Figure 9.2 shows the blank diagrams that were pre-
pared in advance on sheets of paper. (2) Included on each form was the order of casting
to be followed.
Note: While getting organized, it was learned that the 10 molds had electrical
heating in their walls and bottoms (the tops were open). The metallurgists agreed that
differences in heating might have an effect, and it would be possible to measure the
temperatures at different locations in a mold with a contact thermometer. Prior to
pouring, the temperatures at designated locations of the mold were measured and
recorded on the form (Figure 9.2).

Figure 9.2 (a) Form to record black patch areas on molds; (b) representation of a
mold, 36 in. × 24 in. × 8 in. Each form recorded the mold number and the order of
casting, with the regions S1 = side next to furnace, S2 = side away from furnace,
E1 = end next to wall 1, E2 = end next to wall 2, and B = bottom of ingot.

It was about six hours after the initial planning meeting that the data forms had been
drawn, the plan formalized, and the first ingot poured. Then, after solidifying, the ingot
was withdrawn, its identity marked, and the procedure was continued.
Obtaining Data
Now, if possible, be present when the study begins—long enough, at least to observe
some data. We examined the ingot from M1; yes, there was a smallish three-inch irreg-
ular circle of oxide—not on the bottom, but on the side S1. The locations and sizes of
the oxide were recorded as planned.
No clues were immediately available; the wall temperature in the area of the black
patch was no different from the temperatures at locations lacking the oxide. Was there
anything special about the condition of the mold wall at the origin of the black oxide?
An immediate investigation “suggested” the possibility that the white oxide dressing
with which the molds were treated weekly “looked a bit different.” It was of unlikely
importance, but its existence was noted.
The casting of the first round of ingots was continued as planned; some of the ingots
had black patches, some did not. Their location was indicated on the prepared forms. It
was time to repeat molds beginning with M1. When the second M1 ingot was examined,
it showed a black patch in the same general location as the first M1 ingot! Moreover, this
was the general repeat pattern for the six molds. A careful examination in the area pro-
ducing a black patch usually suggested a slightly differing appearance: nothing obvious
or very convincing.
It was the practice to dress the molds with white oxide every few days. When it was
time to make a third casting on M1, a redressing was applied (by brush) to the specific
area of origin of the black patch.
276 Part III: Troubleshooting and Process Improvement

Analysis
Then the next casting was made. Consequence? No black patch. It was found that this
same procedure would repeatedly identify areas in other molds that needed redressing
to prevent black oxidized patches.
Summary
The basic logic and method of this study are important. Repeat observations on selected
single units of your process will demonstrate one of two things: either the performance
of the unit will repeat, or it will not.
It was established in this case history that the black patches came repeatedly from
specific geometric areas within molds. The reason for the problem was thus unrelated
to contaminants or other metallurgical properties of aluminum pigs, or to distances or
relationships to the furnace or windows and walls. Temperature differences within a
mold could have been a possible explanation. Being present and able to inspect the first
mold illustrates the importance of the previously stated basic principle 2.
The correction of a process cannot always be identified so readily; but here the
opportunity was provided for simple data to suggest ideas. In this case history, the
retreatment of molds provided a complete solution to the problem.
• In some studies, the purpose of data collection is to provide organized
information on relationships between variables. In many other instances,
such as this one, the purpose is simply to find ways to eliminate a serious
problem; the data themselves or formal analyses of them are of little or no
interest. It was the logic and the informal analysis that were effective.
• In troubleshooting and process improvement studies, we can plan programs of
data acquisition that offer opportunities for detecting types of important differ-
ences and repeat performances. The opportunity to notice possible differences
or relations, such as the location of black patches and their origin within molds,
comes much more surely to one who watches data in the process of acquisition
than to one who sits comfortably in an office chair.
These ideas will be extended in subsequent case histories.

9.4 BICKING’S CHECKLIST


It is important to consider all aspects of a test program before committing time and
resources to the project. Probably the best statement of key considerations in this regard
is a checklist proposed by Charles Bicking and reproduced as Figure 9.3.4 The reader is

4. C. A. Bicking, “Some Uses of Statistics in the Planning of Experiments,” Industrial Quality Control (January
1954): 20–24.

Checklist for Planning Test Programs

A. Obtain a clear statement of the problem


1. Identify the new and important problem area
2. Outline the specific problem within current limitations
3. Define the exact scope of the test program
4. Determine the relationship of the particular problem to the whole research or
development program
B. Collect available background information
1. Investigate all available sources of information
2. Tabulate data pertinent to planning the new program
C. Design the test program
1. Hold a conference of all parties concerned
a. State the propositions to be proved
b. Agree on the magnitude of differences considered worthwhile
c. Outline the possible alternative outcomes
d. Choose the factors to be studied
e. Determine the practical range of these factors and the specific levels at
which tests will be made
f. Choose the end measurements that are to be made
g. Consider the effect of sampling variability and of precision of test
methods involved
h. Consider possible interrelationships (or “interactions”) of the factors
i. Determine limitations of time, cost, materials, manpower, instrumentation,
and other facilities, and of extraneous conditions, such as the weather
j. Consider human relations angles of the program
2. Design the program in preliminary form
a. Prepare a systematic and inclusive schedule
b. Provide for stepwise performance or adaption of schedule if necessary
c. Eliminate effect of variables not under study by controlling, balancing, or ran-
domizing them
d. Minimize the number of experimental runs
e. Choose the method of statistical analysis
f. Arrange for orderly accumulation of data
3. Review the design with all concerned
a. Adjust the program in line with comments
b. Spell out the steps to be followed in unmistakable terms

Figure 9.3 Bicking’s checklist for planning test programs.



D. Plan and carry out the experimental work


1. Develop methods, materials, and equipment
2. Apply the methods or techniques
3. Attend to and check details; modify methods if necessary
4. Record any modifications of the program design
5. Take precautions in the collection of the data
6. Record the progress of the program
E. Analyze the data
1. Reduce recorded data, if necessary, to numerical form
2. Apply proper mathematical statistical techniques
F. Interpret the results
1. Consider all the observed data
2. Confine conclusions to strict deductions from the evidence at hand
3. Test equations suggested by the data by independent experiments
4. Arrive at conclusions as to the technical meaning of results as well as their sta-
tistical significance
5. Point out implications of the findings for application and for further work
6. Account for any limitations imposed by the methods used
7. State results in terms of verifiable probabilities
G. Prepare the report
1. Describe work clearly, giving background, pertinence of the problems, and
meaning of results
2. Use tabular and graphic methods of presenting data in good form for future use
3. Supply sufficient information to permit the reader to verify results and draw his
or her own conclusions
4. Limit conclusions to an objective summary of evidence so that the work recom-
mends itself for prompt consideration and decisive action

well advised to study the checklist and to use it faithfully in designing a test program or
experiment. The points made are simple but their relevance to the success of such pro-
grams is profound.

9.5 PROBLEM SOLVING SKILLS


While Bicking’s checklist is an effective approach for planning a test program, there are
more-structured approaches for solving process problems. In this section, a process is
laid out that begins with the identification of the problem and ends with the successful
implementation of the solution. This process consists of four steps:
• Identifying the problem
• Finding the root cause
• Deciding on a solution
• Implementing the solution
We want to apply a permanent fix, if possible. However, the problem can only be
fixed if we have a clear understanding of what the real problem “is,” that is, we must
identify it. The problem must then be analyzed to determine its cause. Once the cause
has been identified, a decision is needed to determine how to eliminate the cause. With
the decision in hand, a plan is developed to take action to implement the solution.
Kepner and Tregoe advocate an approach to problem analysis that is more rational
than creative in identifying the cause.5 This approach can be described in the following
seven steps and is used as the basis for the four-step process previously listed.
1. You should know what ought to be happening and what is happening.
2. A problem is the difference between what is happening and what
should happen. This can then be expressed as a deviation. Compare the
deviation with the expectation and recognize a difference that seems
important to you.
3. Investigate and identify the problem deviation, that is, what (identity), where
(location), when (timing), and to what extent (size).
4. Identify features that distinguish what the problem is from what it is not.
Describe what the problem is in detail. Describe what the problem is not by
asking questions for each corresponding is, that is, [the problem] could be,
but it is not [something else]. This helps eliminate causes that do not make
any sense to consider.
5. List the potential cause(s) or contributory factors of the problem. These
should be clear-cut events or changes that lead to the problem and are clearly
associated with the occurrence of the problem. You should make statements
that you can test with the facts. Attempt to infer any likely causes of the
problem by developing hypotheses that would explain how the potential
cause(s) could have caused the observed problem.

5. C. H. Kepner and B. B. Tregoe, The Rational Manager (New York: McGraw-Hill, 1965). This problem solving
approach is widely considered to be the best in the business community, and to a large degree in the manufacturing
community as well.

6. Now test the potential cause(s) of the problem, checking that each is not only
a potential cause, but also that it is the only cause, that is, that occurrence of
this problem is always associated with the occurrence of this cause or
combination of causes. Get rid of the causes that do not hold up.
7. Identify the most probable cause to verify. The most likely cause should be
the one that rests on the most reasonable and fewest assumptions, that is, the
one most closely associated with the problem.
Confirm the cause as the solution. Check the facts to verify the assumptions.
Observe the solution to see if it works. Use experiments to confirm the solution, that is,
can you “turn the problem on and off?”

9.6 SIX SIGMA METHODOLOGY


Hahn, Doganaksoy, and Hoerl define Six Sigma as “a disciplined and highly quantitative
approach to improving product or process quality.”6 The term “Six Sigma” refers to
the goal of achieving a process that produces defects in no more than 3.4 parts per
million opportunities (assuming a 1.5σ process shift).
Six Sigma was originally developed at Motorola with the objective of reducing
defects in the manufacture of electronics. It has since been adopted and modified by
other companies, such as Allied Signal, General Electric, and Corning. Hahn, et al.7
describe the features of a Six Sigma program as:
• A top-down, rather than a bottom-up, approach that is led by the CEO (chief
executive officer) of the company.

[Figure: a normal curve between the lower spec limit (LSL) at –6σ and the upper spec limit (USL) at +6σ, with the process mean shifting ±1.5σ]

Figure 9.4 A Six Sigma process that produces a 3.4 ppm level of defects.

6. G. J. Hahn, N. Doganaksoy, and R. Hoerl, “The Evolution of Six Sigma,” Quality Engineering 12, no. 3 (2000):
317–26.
7. Ibid.

• Six Sigma Champions are selected from the ranks of the leaders of each
business within the company. These people are responsible for successful
implementation of Six Sigma in their respective businesses. They are also
responsible for the success of the project, providing the necessary resources
to do the work, and removing any organizational barriers to success.8
• At the business and project level, Six Sigma is driven by Master Black Belts
(MBBs) and Black Belts (BBs). These people work on Six Sigma projects
full-time and are responsible for setting quality objectives for the business,
selecting Six Sigma projects, monitoring those projects toward the objectives,
and mentoring and training the project team members. The
project leader is a trained BB who has a history of accomplishment, that is,
only the best employees are chosen. A BB project assignment usually lasts
for two years during which time the BB leads eight to 12 projects lasting
approximately three months apiece. MBBs are used as mentors to the BBs
and resources for the project team. They are typically BBs who have worked
on many projects, but have deeper knowledge of statistical methods, business
and leadership experience, and the ability to teach others.
• Implementation of the Six Sigma project is the responsibility of the project
team members who receive Green Belt (GB) training from the MBBs and BBs.
GBs do not spend all their time on projects.
• How many MBBs, BBs, and GBs are needed? A rule of thumb that has been
used is to have one BB for every 100 employees, and one MBB per 10 BBs.
The actual number of BBs needed will be based on the number of projects
selected. Project selection is a key part of Six Sigma training.
Training in a Six Sigma program involves everyone in the company. The depth of
training will vary depending on the role of the individual. BBs are often trained for four
weeks spread out over a three-month period that involves the implementation of Six
Sigma techniques on a particular project. The project is intended to be both hands-on
and bottom-line oriented. More importantly, the project must document the resulting
dollar savings since this is the metric that management understands. Certification of
BBs and GBs is dependent on successful completion of the project. MBBs are typically
certified after mentoring 20 successful projects, BBs are certified after completing two
projects (one mentored by a MBB and the other done independently), and GBs are cer-
tified after completing one project.
The training program typically comprises:
• Three weeks of statistical methods:
– A week of data analysis, including methods discussed in Chapters 1 through 5
of this text

8. J. M. Lucas, “The Essential Six Sigma,” Quality Progress (January 2002): 27–31.

– A week of design of experiments, including the material discussed in
Chapter 10 of this text
– A week of quality control, including the material discussed in Chapter 8
of this text
• A week of training on project selection, project management and evaluation,
team selection, and team building
Voelkel has proposed a more in-depth curriculum for BB training.9 Training meth-
ods not currently covered in BB training that may be useful for future projects include:
• More experimental design, for example, mixed-level factorial designs, and
crossed and nested factor designs (see Chapter 15).
• More modeling, that is, more knowledge of linear regression (see Chapter 12
in this text).
• Time series, including statistical process control (SPC) (see Chapter 8) and
engineering process control (EPC).
• Multivariate methods, for example, principal component analysis, cluster
analysis, and factor analysis.
• Reliability, for example, fitting data to Weibull and other distributions.
• Graphical methods and simulation, for example, analysis of means (see Chapters
11 through 15) and normal probability plotting (see Chapters 1 and 10).
• Broader use of one or more statistical software packages, for example, Minitab.
Obviously, this more-sophisticated training is beyond what most Six Sigma programs
are willing to accommodate. Fortunately, MBBs can find such training at selected uni-
versities sponsoring either distance learning or on-campus courses in statistics and qual-
ity engineering. Improving the education of BBs and MBBs in statistical methods helps
to create a critical mass of knowledgeable workers who can perpetuate Six Sigma suc-
cess in their company.
Six Sigma is typically based on an approach referred to as the DMAIC process,
which is shown in Figure 9.5. The components of this process are described as:
• Define (D). Define the problem to be solved, including the customer impact
and potential benefits. This information is captured on the project charter form,
which also includes the voice of the customer (VOC).
• Measure (M). Identify the critical-to-quality characteristics (CTQs) of the
product or service that correspond to the VOC. At this stage, you should verify
the measurement process capability. Also, it is important to establish a baseline
for the current defect rate and to set goals for improvement.

9. Joseph G. Voelkel, “Something’s Missing,” Quality Progress (May 2002): 98–101.



[Figure: the five DMAIC stages in sequence.
Define: charter; understand the voice of the customer (VOC); focused problem statement.
Measure: collect data; process map; understand current process.
Analyze: collect data; plot data; statistical methods; explore and organize potential causes; identified root cause.
Improve: select solutions; develop and implement plans; measure results and evaluate benefits.
Control: document; training; monitor new process and recommend future plans.]
Figure 9.5 DMAIC process used in Six Sigma methodology.

• Analyze (A). Understand the root causes of why defects occur, and identify the
key process variables that cause these defects.
• Improve (I). Quantify the influences of key process variables on the CTQs,
identify acceptable limits of these variables, and modify the process to stay
within these limits. This will reduce defect levels in the CTQs.
• Control (C). Establish controls so that the modified process now keeps the key
process variables within acceptable limits. In this stage, we wish to maintain
the gains over the long term.
Since Six Sigma is a very data-oriented approach, it is imperative that statistical
methodology be used in the implementation of the DMAIC stages. The variety of statis-
tical tools range from basic (histograms, scatter diagrams, Pareto charts, control charts,
and so on) to more advanced (design of experiments, regression analysis, and so on).10
Six Sigma companies will set a cost savings goal for the year and decide which pro-
jects contribute toward this metric. Hahn et al. discuss successes at Motorola (almost a

10. This text covers most of the common Six Sigma basic and advanced tool set. For more information on Six Sigma,
the reader is referred to these sources:
F. W. Breyfogle, Implementing Six Sigma: Smarter Solutions Using Statistical Methods (New York: John Wiley &
Sons, 1999).
G. J. Hahn, N. Doganaksoy, and R. W. Hoerl, “The Evolution of Six Sigma,” Quality Engineering 12, no. 3
(2000): 317–26.
G. J. Hahn, W. J. Hill, R. W. Hoerl, and S. A. Zinkgraf, “The Impact of Six Sigma Improvement: A Glimpse into
the Future of Statistics,” American Statistician 53, no. 3 (1999): 208–15.
M. Harry, The Vision of Six Sigma: Roadmap for a Breakthrough (Phoenix, AZ: Sigma Publishing Co., 1994).
R. W. Hoerl, “Six Sigma and the Future of the Quality Profession,” Quality Progress (June 1998): 35–42.
J. G. Voelkel, “Something’s Missing,” Quality Progress (May 2002): 98–101.

billion dollars in three years and a Malcolm Baldrige Award), Allied Signal (over two
billion dollars since it began using Six Sigma), and GE (over a billion dollars savings
in one year!).11
While the DMAIC process is focused on reducing defects in existing products, ser-
vices, and processes, Design for Six Sigma (DFSS) was created by GE and applied to
design projects in research and development. The objective of DFSS is to design prod-
ucts, services, and processes that are Six Sigma–capable. This approach is easily adapt-
able to research and development efforts. Hahn et al. describe the basic principles of
DFSS as follows12:
• Customer requirements. The customer requirements for a new product, service,
or process define the CTQs. This involves the use of customer research tools
such as Quality Function Deployment (QFD).
• Requirements flow-down. The CTQs are “flowed down” to requirements for
functional design, detailed design, and process control variables. The intent is
to prevent the design from being finalized too quickly.
• Capability flow-up. As the CTQs are flowed down, the capability to meet these
requirements is constantly assessed using relevant existing or new data. The
intent in this stage is to permit early consideration of any trade-offs and the
avoidance of any future surprises.
• Modeling. Both the flow-down of requirements and the flow-up of capability
are determined from the knowledge of the relationship between the CTQs
(“Y’s”) and the design elements (“X’s”). The models are based on physical
fundamentals, simulation, empirical methods, or a mix of these.
The approach for implementing DFSS was developed by GE and is called DMADV,
which involves five steps:
• Define (D). Identify the product, service, or process to be designed (or
redesigned). Develop and define the team charter, which includes the scope,
business case, key milestones, needed resources, and project plan. The
activities are based on common sense and constitute a major portion of any
training program on project management. This stage is to be taken very
seriously as a poorly defined project could stall the entire effort.
• Measure (M). Plan and conduct the necessary research to understand the
customer needs and requirements, which are in turn translated into
measurable CTQs. Designed experiments are an effective means of
understanding relationships in this phase.

11. G. J. Hahn, N. Doganaksoy, and R. Hoerl, “The Evolution of Six Sigma,” Quality Engineering 12, no. 3 (2000):
317–26.
12. Ibid.

• Analyze (A). Develop alternative concepts. The best-fit concept is selected for
development into a high-level design, whose capability to meet the
requirements is then predicted. In this phase, the different design options are
considered and evaluated systematically. This will typically involve the use of
designed experiments coupled with knowledge of physical phenomena.
• Design (D). Develop the detailed design. The capability of the proposed design
is evaluated and plans are developed to perform a pilot test of the new or
redesigned product or service. In addition, a supplier DMAIC program is
initiated in areas needing improvement.
• Verify (V). Build and pilot a fully functional, yet limited-scale version of the
new or redesigned product or service.
Six Sigma is too often thought of as strictly applicable to manufacturing processes,
but actually it can be implemented in any business process. Special emphasis has been
given to commercial transactions and product service. The reason why Six Sigma can
apply to any business process lies in the fact that all work occurs in interconnected
processes. These processes have outputs that are a function of process inputs and pro-
cessing steps. In the Six Sigma methodology, the process is diagrammed in the form of
the SIPOC model shown in Figure 9.6.
As an application of Six Sigma to transactional processes, Hahn et al.13 discussed
successes at GE Capital, which accounted for nearly 45 percent of GE’s profitability. This
paper explains that applications range from selling or insuring mortgages to bidding on
municipal bonds to writing insurance to providing consumer credit. Such processes

[Figure: the SIPOC model: Suppliers → Inputs → Process → Outputs → Customers]
Figure 9.6 The SIPOC model used for understanding the process from an overview standpoint.

13. Ibid.

have outputs, which are the primary concern, such as cycle time, profitability, accuracy,
and so on. But these outputs are a function of the process inputs and what happens at
key processing steps. As the authors state, “If one is able to view both a manufacturing
line and credit card collections as processes with key inputs and processing steps to be
improved, in order to improve outputs, the leap from manufacturing to business appli-
cations becomes second nature.”14

9.7 PRACTICE EXERCISES


1. Identify a process that you have had experience with or can study closely.
a. Describe the physical nature of the process in detail.
b. Identify all the continuous variable factors associated with this process.
Be specific regarding the operating ranges of these variables and the
degree of control that the operator has over them.
c. Identify several omnibus-type factors.
2. Based on the author’s discussion of Case History 9.1, write out a set of short,
specific guidelines for conducting a process troubleshooting project. Include
from five to eight guidelines.
3. Identify a process that has a problem in need of a solution.
a. Sketch out the problem solving process using one of the approaches
discussed in this chapter.
b. Discuss how a Six Sigma approach could be applied.
4. Explain where “3.4 ppm level of defects” comes from in Figure 9.4.

14. Ibid.
10
Some Concepts of Statistical
Design of Experiments

10.1 INTRODUCTION
Success in troubleshooting and process improvement often rests on the appropriateness
and efficiency of the experimental setup and its match to the environmental situation.
Design suggests structure, and it is the structure of the statistically designed experiment
that gives it its meaning.1 Consider the simple 2² factorial experiment laid out in Table
10.1 with measurements X11, X12, X21, X22. The subscripts i and j on Xij simply show the
machine (i) and operator (j) associated with a given measurement.

Table 10.1 Experimental plan.

                            Operator
                     O1 = Dianne     O2 = Tom
Machine   M1 = old     (1) X11         (2) X12
          M2 = new     (3) X21         (4) X22

1. Sections 10.3, 10.4, 10.5, and 10.6 are not vital to subsequent understanding and may be omitted by the reader.
They are intended as a supplement for those already somewhat familiar with the topic.


Here there are two factors, or characteristics to be tested: operator and machine.
There are two levels of each, so that operator takes on the levels Dianne and Tom and
the machine used is either old or new. The designation 2^p means two levels of each of
p factors. If there were three factors in the experiment, say by the addition of material
(from two vendors), we would have a 2³ experiment.
In a properly conducted experiment, the treatment combinations corresponding to
the cells of the table must be run at random to avoid biasing the results. Tables of ran-
dom numbers or slips of paper drawn from a hat can be used to set up the order of exper-
imentation. Thus, if we numbered the cells as shown in the diagram and drew the
numbers 3, 2, 1, 4 from a hat, we would run Dianne–new first, followed by Tom–old,
Dianne–old, and Tom–new in that order. This is done to insure that any external effect
that might creep into the experiment while it is being run would affect the treatments in
random fashion. Its effect would then appear as experimental error rather than biasing
the experiment.
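The random run order drawn from a hat can just as well be generated programmatically. A small illustrative Python sketch (the treatment names come from Table 10.1; the code itself is not from the text):

```python
# Randomizing the run order of the four treatment combinations in the
# 2x2 experiment, in place of slips of paper drawn from a hat.
import random

treatments = ["Dianne-old", "Dianne-new", "Tom-old", "Tom-new"]

run_order = treatments[:]      # copy the list, then shuffle it in place
random.shuffle(run_order)

for run, combo in enumerate(run_order, start=1):
    print(f"Run {run}: {combo}")
```

Each of the 24 possible orderings is equally likely, so any external effect drifting into the experiment is spread over the treatments at random rather than biasing a particular comparison.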

10.2 EFFECTS
Of course, we must measure the results of the experiment. That measurement is called
the response. Suppose the response is ‘units produced in a given time,’ and that the
results are as shown in Table 10.2.
The effect of a factor is the average change in response (units produced) brought about
by moving from one level of a factor to the other. To obtain the machine effect we would
simply subtract the average result for the old machine from that of the new. We obtain

Machine effect = X̄2• − X̄1• = (5 + 15)/2 − (20 + 10)/2 = –5

Table 10.2 Experimental results.

                            Operator
                     O1 = Dianne     O2 = Tom
Machine   M1 = old     (1)  20         (a)  10
          M2 = new     (b)   5         (ab) 15

which says the old machine is better than the new. Notice that when we made this cal-
culation, each machine was operated equally by both Dianne and Tom for each average.
Now calculate the operator effect. We obtain

Operator effect = X̄•2 − X̄•1 = (15 + 10)/2 − (5 + 20)/2 = 0

The dots (•) in the subscripts simply indicate which factor was averaged out in
computing X̄. It appears that operators have no effect on the operation. Notice that each
average represents an equal time on each machine for each operator and so is a fair
comparison.
However, suppose there is a unique operator–machine combination that produces
a result beyond the effects we have already calculated. This is called an interaction.
Remember that we averaged machines out of the operator effect and operators out of
the machine effect. To see if there is an interaction between operators and machines, we
calculate the machine effect individually for each operator. If there is a peculiar rela-
tionship between operators and machine, it will show up as the average difference
between these calculations. We obtain

Machine effect for Dianne = X21 – X11 = 5 – 20 = –15

Machine effect for Tom = X22 – X12 = 15 – 10 = 5

The average difference between these calculations is

Interaction = (5 − (−15))/2 = 20/2 = 10

The same result would be obtained if we averaged the operator effect for each
machine. It indicates that there is, on the average, a 10-unit reversal in effect due to the
specific operator–machine combination involved.
Specifically, in going from Dianne to Tom on the new machine we get, on the aver-
age, a 10-unit increase; whereas in going from Dianne to Tom on the old machine we get,
on the average, a 10-unit decrease. For computational purposes in a 22 design, the inter-
action effect is measured as the difference of the averages down the diagonals of the
table. Algebraically, this gives the same result as the above calculation since:

Interaction = [(X22 − X12) − (X21 − X11)] / 2
            = [(X22 + X11) − (X21 + X12)] / 2
            = (Southeast diagonal − Southwest diagonal) / 2
            = (15 + 20 − 5 − 10) / 2
            = 10
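The three effect calculations above can be collected in a few lines of code. A hypothetical Python sketch (not from the text), using the cell responses of Table 10.2:

```python
# Effects for the 2x2 experiment, computed from the cell responses
# in Table 10.2 (units produced in a given time).
x11, x12 = 20, 10   # old machine: Dianne, Tom
x21, x22 = 5, 15    # new machine: Dianne, Tom

machine_effect = (x21 + x22) / 2 - (x11 + x12) / 2    # new minus old
operator_effect = (x12 + x22) / 2 - (x11 + x21) / 2   # Tom minus Dianne
interaction = ((x22 - x12) - (x21 - x11)) / 2         # average difference
                                                      # of machine effects

print(machine_effect, operator_effect, interaction)   # -5.0 0.0 10.0
```

Note that each main effect averages over both levels of the other factor, exactly as in the hand calculations, while the interaction compares the machine effect for Tom against the machine effect for Dianne.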

There is a straightforward method to calculate the effects in more complicated
experiments. It requires that the treatment combinations be properly identified. If we
designate operator as factor A and machine as factor B, each with two levels (– and +),
the treatment combinations (cells) can be identified by simply showing the letter of a
factor if it is at the + level and not showing the letter if the factor is at the – level. We
show (1) if all factors are at the – level. This is illustrated in Table 10.3.
The signs themselves indicate how to calculate an effect. Thus, to obtain the A
(operator) effect we subtract all those observations under the – level for A from those
under the + level and divide by the number of observations that go into either the + or
– total to obtain an average. The signs that identify what to add and subtract in calcu-
lating an interaction can be found by multiplying the signs of its component factors
together as in Table 10.4.
And we have

A (operators) effect = [(a + ab) − (b + (1))] / 2 = [−(1) + a − b + ab] / 2 = 0

B (machines) effect = [(b + ab) − (a + (1))] / 2 = [−(1) − a + b + ab] / 2 = –5

Table 10.3 The 2² configuration.

                   A
              –           +
B    –    (1) = X11     a = X12
     +     b = X21     ab = X22

Table 10.4 Signs of interaction.

                    A
               –             +
B    –    (–)(–) = +    (+)(–) = –
     +    (–)(+) = –    (+)(+) = +

AB interaction = [((1) + ab) − (a + b)] / 2 = [(1) − a − b + ab] / 2 = 10

Note that the sequence of plus (+) and minus (–) signs in the numerators on the right
matches those in the following table for the corresponding effects:

Treatment combination    A    B    AB    Response
(1)                      –    –    +     X11 = 20
a                        +    –    –     X12 = 10
b                        –    +    –     X21 = 5
ab                       +    +    +     X22 = 15

Tables of this form are frequently used in the collection and analysis of data from
designed experiments. The sequence of the treatment combinations in the first column
of the table is referred to as the Yates order.
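The Yates-order sign table lends itself directly to computation: multiply each response by the signs in an effect's column, sum, and divide by 2^(p−1). A short Python sketch of this contrast calculation (the code is illustrative, not from the text):

```python
# Effects from the Yates-order sign table: for each effect, apply its
# column of signs to the responses, sum, and divide by 2**(p - 1).
# Responses are in Yates order: (1), a, b, ab.
signs = {
    "A":  [-1, +1, -1, +1],
    "B":  [-1, -1, +1, +1],
    "AB": [+1, -1, -1, +1],   # product of the A and B signs, row by row
}
responses = [20, 10, 5, 15]   # X11, X12, X21, X22

p = 2  # number of factors
effects = {
    name: sum(s * x for s, x in zip(col, responses)) / 2 ** (p - 1)
    for name, col in signs.items()
}
print(effects)   # {'A': 0.0, 'B': -5.0, 'AB': 10.0}
```

The divisor 2^(p−1) is simply the number of observations in either the plus total or the minus total, so each effect is again a difference of two averages.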

10.3 SUMS OF SQUARES


Process control and troubleshooting attempt to reduce variability. It is possible to cal-
culate how much each factor contributes to the total variation in the data by determin-
ing the sums of squares (SS) for that factor.2 For an effect Eff associated with the factor,
the calculation is
SS(Eff) = r · 2^(p–2) · (Eff)²

2. Sums of squares, SS, are simply the numerator in the calculation of the variance, the denominator being degrees of
freedom, df. Thus, s² = SS/df, and s² is called a mean square, MS, in analysis of variance.
292 Part III: Troubleshooting and Process Improvement

where r is the number of observations per cell and p is the number of factors. Here, r =
1 and p = 2, so
SS(A) = (0)2 = 0
SS(B) = (–5)2 = 25
SS(A × B) = (10)2 = 100

To measure the variance ŝ 2 associated with an effect, we must divide the sums of
squares by the appropriate degrees of freedom to obtain mean squares (MS). Each effect
(Eff ), or contrast, will have one degree of freedom so that
ŝ²(Eff) = Mean square (Eff) = SS(Eff)/1 = SS(Eff)

The total variation in the data is measured by the sample variance from all the data
taken together, regardless of where it came from. This is

σ̂²(T) = s²(T) = SS(T)/df(T) = Σ(Xi – X̄)²/(r·2^p – 1)

      = [(15 – 12.5)² + (5 – 12.5)² + (10 – 12.5)² + (20 – 12.5)²]/[1(4) – 1]

      = 125/3 = 41.67

We can then make an analysis of variance table (Table 10.5) showing how the vari-
ation in the data is split up. We have no estimate of error since our estimation of the
sums of squares for the three effects uses up all the information (degrees of freedom) in
the experiment. If two observations per cell were taken, we would have been able to
estimate the error variance in the experiment as well.
The F test3 can be used to assess statistical significance of the mean squares when
a measure of error is available. Since the sums of squares and degrees of freedom add

Table 10.5 Analysis of variance.


Effect SS df MS
Operator (A) 0 1 0
Machine (B) 25 1 25
Interaction (A × B) 100 1 100
Error No estimate
Total 125 3

3. See Section 4.5.



to the total, the error sum of squares and degrees of freedom for error may be deter-
mined by difference. Alternatively, sometimes an error estimate is available from pre-
vious experimentation. Suppose, for example, an outside measure of error for this
experiment was obtained and turned out to be ŝ²(e) = 10 with 20 degrees of freedom.
Then, for machines F = 25/10 = 2.5 and the F table for α = 0.05 and 1 and 20 degrees
of freedom shows F* = 4.35 would be exceeded five percent of the time. Therefore, we
are unable to declare that machines show a significant difference from chance variation.
On the other hand, interaction produces F = 100/10 = 10 which clearly exceeds the crit-
ical value of 4.35, so we declare interaction significant at the α = 0.05 level of risk. Note
that this is a one-tailed test.
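The sums-of-squares formula and the F comparisons above can be checked with a
short Python sketch; the outside error estimate of 10 and its critical value
F* = 4.35 are taken from the text:

```python
# SS(Eff) = r * 2**(p-2) * Eff**2, for the 2^2 example (r = 1, p = 2).
r, p = 1, 2
effects = {"A": 0.0, "B": -5.0, "AB": 10.0}
ss = {k: r * 2 ** (p - 2) * e ** 2 for k, e in effects.items()}
print(ss)  # {'A': 0.0, 'B': 25.0, 'AB': 100.0}

# With an outside error estimate s_e^2 = 10 (20 df), each 1-df mean square
# is just its SS, so F = MS / 10.
s2_e = 10.0
F = {k: v / s2_e for k, v in ss.items()}
print(F["B"], F["AB"])  # 2.5 10.0 -- compare with F*(0.05; 1, 20) = 4.35
```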

10.4 YATES METHOD


The calculation of the effects and the sums of squares in a 2^p experiment may be
accomplished by an algorithm called the Yates method, which is easily incorporated into a
computer spreadsheet. To use this method, the observations have to be put in Yates stan-
dard order. This is obtained by starting at 1 and multiplying the previous results by the
next letter available. We obtain: 1, a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd,
abcd, e, ae, be, abe, ce, ace, bce, abce, de, ade, bde, abde, cde, acde, bcde, abcde, and
so on. We select that portion of the sequence that matches the size of the experiment.
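The doubling rule for generating Yates standard order is easy to automate; a
minimal Python sketch:

```python
def yates_order(p):
    """Yates standard order for a 2^p design: start with (1) and append the
    product of each existing term with the next new letter."""
    order = [""]  # "" stands for the treatment combination (1)
    for letter in "abcdefghij"[:p]:
        order += [term + letter for term in list(order)]
    return ["(1)" if t == "" else t for t in order]

print(yates_order(3))
# ['(1)', 'a', 'b', 'ab', 'c', 'ac', 'bc', 'abc']
```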
The Yates method consists of the following steps:
1. Write the treatment combinations in Yates order.
2. Write the response corresponding to each treatment combination. When there
are r observations per cell, write the cell total.
3. Form column 1 and successive columns by adding the observations in pairs
and then subtracting them in pairs, subtracting the top observation in the
pair from the one below it.
4. Stop after forming p columns (from 2^p).
5. Estimate effects and sums of squares as

Effect = (Last column)/(r·2^(p–1))

SS = (Last column)²/(r·2^p)

For a 2² experiment, this requires two columns and is shown in Table 10.6. Note
that T is the total of all the observations. So for the production data from Table 10.2,
the Yates analysis is shown in Table 10.7. Table 10.8 summarizes the Yates method
of analysis.
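The steps above can be sketched as a small Python function; applied to the
production data of Table 10.2, it reproduces the entries of Table 10.7:

```python
def yates(totals, p, r=1):
    """Yates algorithm for a 2^p design with r observations per cell.
    `totals` are the cell totals in Yates standard order."""
    col = list(totals)
    for _ in range(p):  # steps 3-4: p passes of add-pairs, subtract-pairs
        col = ([col[i] + col[i + 1] for i in range(0, len(col), 2)] +
               [col[i + 1] - col[i] for i in range(0, len(col), 2)])
    effects = [c / (r * 2 ** (p - 1)) for c in col]  # step 5
    ss = [c ** 2 / (r * 2 ** p) for c in col]
    return col, effects, ss

col, effects, ss = yates([20, 10, 5, 15], p=2)
print(col)          # [50, 0, -10, 20]  (T, Qa, Qb, Qab)
print(effects[1:])  # [0.0, -5.0, 10.0]
print(ss[1:])       # [0.0, 25.0, 100.0]
```

The first row of the last column is the grand total T, for which no effect or
sum of squares is tabulated.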

Table 10.6 Yates method for 2² experiment.

            A–      A+
B–          y1      y2
B+          y3      y4

Yates                                                              Sums of
order   Observation   Column 1   Column 2                Yates effect   squares
(1)     y1            y1 + y2    y1 + y2 + y3 + y4 = T
a       y2            y3 + y4    y2 – y1 + y4 – y3 = Qa      Qa/2       Qa²/4
b       y3            y2 – y1    y3 + y4 – y1 – y2 = Qb      Qb/2       Qb²/4
ab      y4            y4 – y3    y4 – y3 – y2 + y1 = Qab     Qab/2      Qab²/4

Table 10.7 Yates analysis of production data.


Yates Sums of
order Observation Column 1 Column 2 Yates effect squares
(1) 20 30 50 = T
a 10 20 0 = Qa 0 = Qa /2 0 = Qa2/4
b 5 –10 –10 = Qb –5 = Qb /2 25 = Qb2/4
ab 15 10 20 = Qab 10 = Qab /2 100 = Qab2/4

Table 10.8 Yates method with r replicates per treatment combination.

Treatment       Observation total
(1)             t1
a               ta
b               tb
ab              tab
...             ...
(abc...)        t(abc...)

Col. 1 is formed from the observation totals by first adding them in pairs (ta + t1,
and so on) and then subtracting them in pairs (ta – t1, and so on, upper from lower).
Col. 2 repeats the operation on Col. 1, and so on through Col. p (from 2^p). Then, for
each row,*

Effect = (Col. p)/(r·2^(p–1))        SS = (Col. p)²/(r·2^p)

* Not calculated for treatment combination (1), whose Col. p entry is the grand total.

When the experiment is replicated with r > 1, residual error MS and a check on the
calculations can be determined from Table 10.8 as follows:
Compute:

A = Σ(Obsn. total)
B = Σ(Obsn. total)²
C = Σ(Obsn.)²
D = Σ(SS)

Then:

Check: (1/r)[B – A²/2^p] = D

Residual error SS = C – B/r

Residual error MS = (C – B/r)/(residual df)

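A short Python sketch illustrates the check and the residual computation. The
replicated 2² data here (r = 2) are hypothetical values invented for
illustration, and the residual degrees of freedom are taken as 2^p(r – 1):

```python
# Hypothetical r = 2 replication of a 2^2 design (values invented for
# illustration).  Cell observations in Yates order (1), a, b, ab:
cells = [(19, 21), (9, 11), (4, 6), (14, 16)]
r, p = 2, 2

A = sum(sum(c) for c in cells)             # sum of observation totals
B = sum(sum(c) ** 2 for c in cells)        # sum of squared totals
C = sum(x ** 2 for c in cells for x in c)  # sum of squared observations

# Yates on the cell totals gives the effect sums of squares, D:
col = [sum(c) for c in cells]
for _ in range(p):
    col = ([col[i] + col[i + 1] for i in range(0, len(col), 2)] +
           [col[i + 1] - col[i] for i in range(0, len(col), 2)])
D = sum(c ** 2 / (r * 2 ** p) for c in col[1:])

print((B - A ** 2 / 2 ** p) / r, D)   # check: both 250.0
ss_resid = C - B / r                  # residual (within-cell) error SS
df_resid = 2 ** p * (r - 1)
print(ss_resid, ss_resid / df_resid)  # 8.0 2.0
```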
These concepts are easily extended to higher 2^p factorials. For a 2³ experiment, we
have a configuration as indicated in Table 10.9.
Effects may be calculated from first principles. For example, if the appropriate signs
are appended to the observations, the effects may readily be calculated from the signs of
Table 10.10.

Table 10.9 The 2³ configuration.

B– B+

A– A+ A– A+

(1) a b ab
C– y1 y2 y3 y4

c ac bc abc
C+ y5 y6 y7 y8

Table 10.10 Signs for effect calculation.


Effect
Observation A B C AB AC BC ABC
y1 – – – + + + –
y2 + – – – – + +
y3 – + – – + – +
y4 + + – + – – –
y5 – – + + – – +
y6 + – + – + – –
y7 – + + – – + –
y8 + + + + + + +

So we have

Main effects:

A = (1/4)(y2 + y4 + y6 + y8) – (1/4)(y1 + y3 + y5 + y7)

B = (1/4)(y3 + y4 + y7 + y8) – (1/4)(y1 + y2 + y5 + y6)

C = (1/4)(y5 + y6 + y7 + y8) – (1/4)(y1 + y2 + y3 + y4)

Interactions:

AB = (1/4)(y1 + y4 + y5 + y8) – (1/4)(y2 + y3 + y6 + y7)

AC = (1/4)(y1 + y3 + y6 + y8) – (1/4)(y2 + y4 + y5 + y7)

BC = (1/4)(y1 + y2 + y7 + y8) – (1/4)(y3 + y4 + y5 + y6)

ABC = (1/4)(y2 + y3 + y5 + y8) – (1/4)(y1 + y4 + y6 + y7)

In addition, the Yates method becomes as shown in Table 10.11.


For example, consider the following data from an experiment in which there were
no replicates and the response was y = [yield (lb) – 80].4 Note from the data display in
Table 10.12 there is clearly a B effect.
The Yates analysis is shown in Table 10.13.

Table 10.11 Yates method for 2³ experiment.


Yates Yates Sums of
order Observation Col. 1 Col. 2 Col. 3 effect squares
(1) y1 y1 + y2 y1 + y2 + y3 + y4 T*
a y2 y3 + y4 y5 + y6 + y7 + y8 Qa** Qa /4 Qa2/8
b y3 y5 + y6 y2 – y1 + y4 – y3 and so on and so on and so on
ab y4 y7 + y8 y6 – y5 + y8 – y7
c y5 y2 – y1 y3 + y4 – y1 – y2
ac y6 y4 – y3 y7 + y8 – y5 – y6
bc y7 y6 – y5 y4 – y3 – y2 + y1
abc y8 y8 – y7 y8 – y7 – y6 + y5
* T = Σy = 2³ȳ
** Qa = (y2 – y1 + y4 – y3) + (y6 – y5 + y8 – y7)

4. O. L. Davies (ed.), The Design and Analysis of Industrial Experiments (London: Oliver & Boyd, 1954): 264–68.

Table 10.12 Illustrative example of 2³.

B– B+

A– A+ A– A+

(1) a b ab
C– 7.2 8.4 2.0 3.0

c ac bc abc
C+ 6.7 9.2 3.4 3.7

Table 10.13 Yates analysis of illustrative example.


Yates Yates Sums of
order Observation Col. 1 Col. 2 Col. 3 effect squares
(1) 7.2 15.6 20.6 43.6 = T 10.9 = 2ȳ 237.62
a 8.4 5.0 23.0 5.0 = 4A 1.25 = A 3.12
b 2.0 15.9 2.2 –19.4 = 4B –4.85 = B 47.04
ab 3.0 7.1 2.8 –2.4 = 4AB –0.6 = AB 0.72
c 6.7 1.2 –10.6 2.4 = 4C 0.6 = C 0.72
ac 9.2 1.0 –8.8 0.6 = 4AC 0.15 = AC 0.04
bc 3.4 2.5 –0.2 1.8 = 4BC 0.45 = BC 0.40
abc 3.7 0.3 –2.2 –2.0 = 4ABC –0.5 = ABC 0.50
290.16
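The effects in Table 10.13 can be reproduced directly from the sign rules of
Table 10.10: a factor's sign is + when its letter appears in the treatment
combination, and an interaction's sign is the product of its factors' signs.
An illustrative Python sketch:

```python
# Effects for a 2^3 design from the +/- sign table (Table 10.10).
runs = ["1", "a", "b", "ab", "c", "ac", "bc", "abc"]  # Yates order
y = [7.2, 8.4, 2.0, 3.0, 6.7, 9.2, 3.4, 3.7]          # Table 10.12

def effect(name, runs, y):
    total = 0.0
    for run, resp in zip(runs, y):
        sign = 1
        for letter in name:            # product of the component signs
            sign *= 1 if letter in run else -1
        total += sign * resp
    return total / (len(y) / 2)        # divide by 2^(p-1) = 4

for name in ["a", "b", "c", "ab", "abc"]:
    print(name.upper(), round(effect(name, runs, y), 2))
# A 1.25, B -4.85, C 0.6, AB -0.6, ABC -0.5
```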

Table 10.14 ANOVA of illustrative example.


Source SS df MS F F0.05
A 3.12 1 3.12 7.52 7.71
B 47.04 1 47.04 113.35 7.71
C 0.72 1 0.72 1.73 7.71
Error 0.72 + 0.04 + 0.40 + 0.50 = 1.66 4 0.415
Total 290.16 – 237.62 = 52.54 7

Suppose it is possible to assume that no interactions exist and so to “pool” all but
main effects as an estimate of error. Under this assumption, the magnitude of the mean
square error calculated for the interactions simply represents experimental error. The
resulting analysis of variance table is displayed in Table 10.14. The estimate of error for
this example is the square root of 0.415, or 0.644.
We see that when we change B from the low to the high level, the response varies by
an amount that exceeds chance at the α = 0.05 level. Therefore, factor B is statistically

significant and the effect on yield in going from the low to the high levels of B is esti-
mated to be –4.85 lb.
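The pooling that produced Table 10.14 can be verified numerically; a Python
sketch using the sums of squares from Table 10.13:

```python
# Pooling the interaction SS from Table 10.13 into error (Table 10.14).
ss = {"A": 3.12, "B": 47.04, "C": 0.72,
      "AB": 0.72, "AC": 0.04, "BC": 0.40, "ABC": 0.50}

error_ss = ss["AB"] + ss["AC"] + ss["BC"] + ss["ABC"]   # 1.66, with 4 df
error_ms = error_ss / 4
print(round(error_ms, 3))                  # 0.415
for k in ("A", "B", "C"):
    print(k, round(ss[k] / error_ms, 2))   # F: A 7.52, B 113.35, C 1.73
print(round(error_ms ** 0.5, 3))           # 0.644 -- estimate of error
```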

10.5 BLOCKING
Occasionally it is impossible to run all of the experiment under the same conditions. For
example, the experiment must be run with two batches of raw material or on four dif-
ferent days or by two different operators. Under such circumstances it is possible to
“block” out such changes in conditions by confounding, or irrevocably combining,
them with selected effects. For example, if in the previous experiment units y1, y2, y3,
and y4 were run by one operator and y5, y6, y7, and y8 were run by another operator, it
would be impossible to distinguish the C effect from any difference that might exist
between operators. This would be unfortunate, but it is possible to run the experiment
in such a way that an unwanted change in conditions, such as operators, will be con-
founded with a preselected higher-order interaction in which there is no interest or
which is not believed to exist. For example, looking at the signs associated with the
ABC interaction, if y2, y3, y5, and y8 were run by the first operator and y1, y4, y6, and y7
by the second, the operator effect would be irrevocably combined, or confounded, with
the ABC interaction in Table 10.10. No other effect in the experiment would be changed
by performing the experiment in this way. That is why the structure and randomization
are so important.
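The assignment of runs to blocks follows mechanically from the signs of the
confounded interaction; a Python sketch for the 2³ example:

```python
# Splitting a 2^3 design into two blocks by confounding the block
# difference with the ABC interaction: each run goes to the block given by
# the sign of ABC for that treatment combination (Table 10.10).
runs = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]

def sign(interaction, run):
    s = 1
    for letter in interaction:
        s *= 1 if letter in run else -1
    return s

block_plus = [t for t in runs if sign("abc", t) == +1]
block_minus = [t for t in runs if sign("abc", t) == -1]
print(block_plus)   # ['a', 'b', 'c', 'abc']     -- first operator
print(block_minus)  # ['(1)', 'ab', 'ac', 'bc']  -- second operator
```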
Appendix Table A.17 gives blocking arrangements for various 2^p factorial designs.
It will be seen that the pattern suggested for the above example may be found under
design 1 for blocking a 2³ experiment.
For selected full-factorial designs, the table shows the number of factors and the
number of runs involved. Each design is then partitioned into a specified number of
blocks: B1, B2, . . . The experimental units to be run in each block are shown together
with interactions that will be confounded with the variation resulting from differences
between blocks. Thus, if the previous example of a 2³ design were blocked as shown
under design 1, only the ABC interaction would be confounded with blocks. No other
effect would be affected.
It has been pointed out that, particularly in the screening stage of experimenta-
tion, the Pareto principle often applies to factors incorporated in a designed experi-
ment. A few of the effects will tend to account for much of the variation observed and
some of the factors may not show significance. This phenomenon has been called
“factor sparsity” or sparsity of effects,5 and provides some of the rationale for delib-
erately confounding the block effect with a preselected interaction that is deemed
likely to not exist.

5. G.E.P. Box and R. D. Meyer, “An Analysis for Unreplicated Fractional Factorials,” Technometrics 28, no. 1
(February 1986): 11–18.

10.6 FRACTIONAL FACTORIALS


Sometimes it is possible to reduce the number of experimental runs by utilizing only a
portion of the structure of the full-factorial experiment. Consider running only four of
the eight cells of the previous 2³ experiment: (1), ab, ac, and bc. The result is shown in
Table 10.15, where the treatment combination notation is given in each cell. Patterns such as this
have been discovered that will allow running an experiment with some fraction of the
units required for the full factorial. In this case, only half of the units would be used so
this is called a one-half replication of a 2^p factorial, which is sometimes expressed as a
2^(p–1) design since (1/2)·2^p = 2^(p–1). The price of course is aliasing, which is irrevocably
combining effects in the analysis. The aliased effects may be determined from the de-
fining relation associated with the fractional-factorial design used. In this case, the
defining relation is

I = –ABC

When the defining relation is multiplied through on both sides by an effect of interest,
where I is considered to be a one, the result is an equality showing what is aliased
with that effect. In this procedure, any terms obtained with even number exponents, such
as squared terms, are eliminated. Thus, to determine what is aliased with the effect A:

A(I) = A(–ABC)
A = –A2BC
A = –BC

Note that –BC simply indicates the complement of the BC interaction.
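This multiplication of effects amounts to taking the symmetric difference of
their letter sets, with squared letters dropping out; the minus sign from a
defining relation such as I = –ABC is carried along separately. A Python
sketch, which also computes resolution as the length of the shortest word in
a defining relation (using I = –BCE = –ADE = ABCD, discussed below):

```python
# Aliases from a defining relation: multiplying effects corresponds to the
# symmetric difference of their letter sets (squared letters drop out).
def multiply(e1, e2):
    return "".join(sorted(set(e1) ^ set(e2)))

print(multiply("A", "ABC"))  # 'BC' -- so A is aliased with -BC here
print(multiply("B", "ABC"))  # 'AC'
print(multiply("C", "ABC"))  # 'AB'

# Resolution = length of the shortest word in the defining relation,
# e.g. I = -BCE = -ADE = ABCD for a quarter replicate of a 2^5:
words = ["BCE", "ADE", "ABCD"]
print(min(len(w) for w in words))  # 3 -> resolution III
```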

Table 10.15 Fraction of a 2³.

        B–              B+
     A–     A+       A–     A+
C–   (1)     —        —     ab
C+    —     ac       bc      —

Clearly, the term or word with the least number of letters in the defining relation is
of particular importance. This is because, when multiplied by a main effect, it represents
the lowest-level interaction that will be aliased with it. The length of this shortest word
in the defining relation is called the resolution of the design and is typically represented
by a Roman numeral. For example, as shown in Table A.18, a quarter
replication of a 2⁵, or 2^(5–2), design having a defining relation

I = –BCE = –ADE = ABCD

is of resolution III since the length of the shortest word is three, and will have main
effects aliased with two-factor interactions. The design is denoted as 2^(5–2)_III.
The aliasing pattern can be seen for various design resolutions in Table A.18 as
follows:

Resolution of design Aliasing pattern means this*:


II Main effects aliased with each other
III Main effects aliased with two-factor interactions
IV Main effects not aliased, two-factor interactions aliased with each other
V and higher Main effects not aliased, two-factor interactions not aliased

* Assumes three-factor and higher interactions are negligible and can be ignored.

If only the runs indicated in Table 10.15 were performed and analyzed, the follow-
ing effects would be aliased together:

A and –BC
B and –AC
C and –AB
since I = –ABC.
If we were in a situation in which we did not expect any two-factor or higher inter-
actions to exist, we would be able to estimate the A, B, and C main effects by regarding
the BC, AC, and AB effects to be negligible, or zero. This is what is done in the analy-
sis of fractional factorials.
Assume the responses obtained were as before, namely

Treatment Response
(1) 7.2
ac 9.2
bc 3.4
ab 3.0

We could estimate the effects by simply subtracting the average response when the
units are at the low level of a factor from those that are made at the high level. Thus

A=
( ac + ab ) − ((1) + bc ) = (9.2 + 3.0 ) − ( 7.2 + 3.4 ) = 6.1 − 5.3 = 0.8
2 2 2 2

B=
( bc + ab ) − ((1) + ac ) = (3.4 + 3.0 ) − ( 7.2 + 9.2) = 3.2 − 8.2 = –5.0
2 2 2 2

C=
( ac + bc ) − ((1) + ab ) = (9.2 + 3.4 ) − ( 7.2 + 3.0 ) = 6.3 − 5.1 = 1.2
2 2 2 2

Note that these are reasonably close to the estimates obtained from the full factorial.
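These half-fraction estimates follow the same high-average-minus-low-average
rule; an illustrative Python sketch:

```python
# Effect estimates from the half fraction (1), ac, bc, ab of the 2^3
# example: high-level average minus low-level average for each factor.
y = {"(1)": 7.2, "ac": 9.2, "bc": 3.4, "ab": 3.0}

def est(factor):
    hi = [v for k, v in y.items() if factor in k]
    lo = [v for k, v in y.items() if factor not in k]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

print(round(est("a"), 1), round(est("b"), 1), round(est("c"), 1))
# 0.8 -5.0 1.2
```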
In larger experiments, the Yates method may be used for the analysis. The proce-
dure is as follows:
1. Write down Yates standard order for the size of the fractional-factorial
experiment run. That is, write as many terms as there are treatment
combinations run.
2. Match the treatment combinations run with the Yates order by writing the
unused letter(s) in parentheses after a treatment combination shown to match
the units run.
3. Perform Yates analysis.
4. Identify the aliased effects represented in the effects and sums of squares
column. Multiply through the defining contrast given for the fraction used
by the effect of interest to obtain the aliased effects. Any squared terms are
eliminated in this process.
For the fraction used above, if we treat I as if it were a 1 (one) and multiply both
sides through by A, B, and C we get

I = –ABC
A = –A2BC = –BC
B = –AB2C = –AC
C = –ABC2 = –AB

Carrying out the Yates analysis on the yield data, we obtain the results shown in
Table 10.16. Note that the effects calculated are not in parentheses. The aliased effects
are obtained from the defining relation.
Fractional factorials have been tabulated and show the treatment combinations to be
run and the defining contrast. Appendix Table A.18 is such a tabulation and shows the
treatment combinations to be run (TRT), already in Yates order, with the corresponding
aliased effects (EFF) that will be estimated by the row in the Yates analysis indicated by

Table 10.16 Yates analysis of 1⁄2 fraction of illustrative example.


Yates Sums of Aliased
order Observation Col. 1 Col. 2 Effect squares effects
(1) 7.2 16.4 22.8
a(c) 9.2 6.4 1.6 0.8 0.64 A – BC
b(c) 3.4 2.0 –10.0 -5.0 25.00 B – AC
ab 3.0 –0.4 –2.4 –1.2 1.44 AB – C

the treatment combination shown. Note that only two-factor interactions are shown in
the EFF column, any higher interactions being ignored. Note that when a 1⁄4 fraction is
run, the defining contrast contains three interactions.
As an example, if it were desired to run a 1⁄2 fraction of a three-factor two-level
experiment, design 1 in Table A.18 would be appropriate. This resolution III design,
2^(3–1)_III, would have four runs using treatment combinations (1), a(c), b(c), and ab, resulting
in the aliasing shown under EFF (which does not show anything greater than two-factor
interactions). The defining relation is I = –ABC.
If at some later time it is possible to run the rest of the fractional factorial, the data
from the fractions may be combined and analyzed. In that case, only the interactions
shown in the defining relation will be lost. The other effects will be clear of confound-
ing. Recombining the fractions acts like a blocked experiment, with the interactions
shown in the defining contrast confounded with any block effect that might come about
from running at different times. This can be seen from appendix Table A.17 where the
fraction run in the example is part of two blocks in the 2³ blocking arrangement shown.
The confounded interaction is ABC.

10.7 GRAPHICAL ANALYSIS OF 2^p DESIGNS


Main Effect and Interaction Plots
The main effect plot simply plots the averages for the low (–) and high (+) levels of the
factor of interest. The slope of the line reflects whether the effect is positive or nega-
tive on the response when the level changes from low to high. The value of the main
effect is simply seen as the difference in the value of the response for the endpoints of
the line. Figure 10.1 demonstrates what the main effect plots would look like for the
data in Table 10.2.
The interaction plot in Figure 10.1 shows the four cell means in the case of a 2²
design. One of the factors is shown on the horizontal axis. The other factor is repre-
sented by a line for each level, or two lines in the case of the 2² design. Intersecting (or
otherwise nonparallel) lines indicate the presence of a possibly statistically significant
interaction. The value of the interaction can be seen as the average of the difference
between the lines for each level of the factor on the horizontal axis. Note that in Figure
10.1, the main effect for the factor on the horizontal axis can be seen in the interaction
plot as the dotted bisecting line.
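The plotted points are simply cell and marginal averages; for the Table 10.2
data they can be computed directly (an illustrative sketch):

```python
# Points behind the plots in Figure 10.1, for the Table 10.2 data.
# Keys are (A level, B level); values are the cell responses.
y = {("-", "-"): 20, ("+", "-"): 10, ("-", "+"): 5, ("+", "+"): 15}

avg = lambda vals: sum(vals) / len(vals)
a_lo = avg([y[("-", "-")], y[("-", "+")]])   # 12.5
a_hi = avg([y[("+", "-")], y[("+", "+")]])   # 12.5
b_lo = avg([y[("-", "-")], y[("+", "-")]])   # 15.0
b_hi = avg([y[("-", "+")], y[("+", "+")]])   # 10.0
print(a_hi - a_lo, b_hi - b_lo)              # main effects: 0.0 -5.0

# Interaction: half the change in the A effect between the B levels.
ab = ((y[("+", "+")] - y[("-", "+")]) - (y[("+", "-")] - y[("-", "-")])) / 2
print(ab)  # 10.0 -- nonzero, so the two lines are not parallel
```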

[Figure: four panels plotting average response. Main-effect plots: operators (A– to A+), effect = 0; machines (B– to B+), effect = –5. Interaction plots: B– and B+ lines versus operators, and A– and A+ lines versus machines, each with effect = 10.]

Figure 10.1 Main-effect and interaction plots for the 2² design in Table 10.2.

Normal Probability Plot of Effects


The Yates method is helpful in providing quick estimates of factor effects. However, it
may not always contain a reasonable estimate of the experimental error to be able to
declare any of these effects to be statistically significant in the analysis of variance
(ANOVA). Often this occurs when a small design is run that is not replicated, resulting
in few, if any, degrees of freedom for error.
The use of a normal probability plot, which was introduced in Section 1.7, will
allow for a good check for significance when the number of runs is small or the design
is not replicated.
If the effects are indeed normally distributed, their frequency distribution will be
bell-shaped and the cumulative distribution will be S-shaped. To translate this distribu-
tion to normal probability paper, we in effect pull the ends of the cumulative distribution

[Figure: a bell-shaped frequency distribution, its S-shaped cumulative distribution, and the resulting normal probability plot, whose vertical scale stretches near 0.01% and 99.99% and changes little near 50%.]

Figure 10.2 The relationship between the normal frequency distribution, cumulative distribution,
and the nature of the normal probability plot.

curve until it straightens out. The result of this stretching on the scale of the normal prob-
ability plot is that the center is relatively unchanged and the end values are spaced fur-
ther apart. Figure 10.2 demonstrates the relationship between the frequency distribution,
cumulative distribution, and the normal probability plot.
Nonsignificant effects should vary around zero and demonstrate a normal distribu-
tion. On normal probability paper, these points should follow a straight line drawn
through them. Often this fit is done by “eyeball,” though computer software typically
uses an initial linear fit through the points that can be moved to better fit the error terms.
Significant effects will lie in one or both tails of the distribution, that is, they will
appear as “outliers” and fall away from the straight-line fit. This idea is illustrated in
Figure 10.3.
The procedure for plotting the factor effects on normal probability paper is a sim-
ple one:
• Perform the Yates analysis of the experiment.
• Order the Yates factor effects from the largest negative to the largest positive.
• Assign a rank i of 1 to the largest negative up to a rank of 2^p – 1 to the largest
positive effect.
• Calculate an estimated cumulative probability for each effect Pi using

Pi = (i – 0.5)/(2^p – 1)

• Plot each effect versus its corresponding estimated probability.
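The ranking and probability calculation can be sketched in Python; applied to
the seven effects of Table 10.13, it reproduces the tabulation that follows:

```python
# Estimated cumulative probabilities for the 2^3 - 1 = 7 effects of
# Table 10.13, ranked from most negative to most positive.
effects = {"B": -4.85, "AB": -0.60, "AC": 0.15, "BC": 0.45,
           "ABC": 0.50, "C": 0.60, "A": 1.25}

ranked = sorted(effects.items(), key=lambda kv: kv[1])
n = len(ranked)  # 2^p - 1
for i, (name, eff) in enumerate(ranked, start=1):
    print(name, eff, round((i - 0.5) / n, 2))
# B -4.85 0.07 ... up to A 1.25 0.93
```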



[Figure: normal probability plot sketch. Nonsignificant effects follow a straight line; significant effects fall off the line, and a very significant effect falls farther off it.]

Figure 10.3 Drawing the line on a normal probability plot of effects.

Using the Yates analysis presented in Table 10.13, we can calculate the estimated
probabilities for the normal probability plot according to the given procedure:

Factor effect Rank Yates effect Probability Pi


B 1 –4.85 (1 – 0.5)/7 = 0.07
AB 2 –0.60 1.5/7 = 0.21
AC 3 0.15 2.5/7 = 0.36
BC 4 0.45 3.5/7 = 0.50
ABC 5 0.50 4.5/7 = 0.64
C 6 0.60 5.5/7 = 0.79
A 7 1.25 6.5/7 = 0.93

Now, plot the last two columns on normal probability paper at this point or, if we are
using computer software to do this for us (such as DesignExpert), simply have the pro-
gram generate the graph. Figure 10.4 shows the normal probability plot that DesignExpert
creates for this example.
Here we see that the B effect clearly falls off the straight line fit to the effects clus-
tered around the zero point. This result is consistent with the relative magnitude of the
B effect and its sum of squares versus those of the other effects. Also, and more impor-
tantly, the normal probability plot is in agreement with the conclusions drawn from the
analysis of variance (ANOVA) shown in Table 10.14.
Actual analyses are not always this obvious, but significant effects will still be
noticeably off the line. Nonsignificant effects will collect around zero and plot along a
straight line drawn on the probability plot. Significant positive effects will fall off this
line in the upper right-hand portion of the plot. Significant negative effects will fall off
the line in the lower left-hand portion of the plot, as seen in Figure 10.4.

[Figure: DesignExpert normal probability plot of the seven effects; B (–4.85) falls far off the line fit through the remaining effects, which cluster near zero.]

Figure 10.4 DesignExpert normal probability plot for the effects shown in Table 10.13.

In the case of a nonreplicated design, such as the one discussed here, the analyst has
some options in determining the level of experimental error by which to judge the sta-
tistical significance of the factor effects:
• Combine the sums of squares for the factors that fall on or near the line into
the error term. Since they represent effects that estimate “noise,” we can get
a degree of freedom for each effect used. Note that if an interaction is
deemed to be off the line, none of the main effects associated with it may
be combined into the error term. Many sources refer to this model-building
principle as preserving the hierarchy of the model.
• Use a prior estimate of the error based on reliable historical data, and with a
sufficient number of degrees of freedom, say, greater than five. More is needed
if the analyst is interested in detecting more subtle effects.
• Rely on the nonsignificant effects that determined the line to isolate the
significant effects. When the analyst is in a situation where replication is not
possible due to cost, time, people, or other resources, this approach may be
the only recourse.
If the analysis contains large positive and large negative effects, the analyst is often
tempted to draw a line through all of the points. This makes their significance less appar-
ent. The solution is to use a half-normal plot as discussed by Daniel.6 The half-normal
plot is a very effective means of analyzing experiments with large positive and negative

6. C. Daniel, “Use of Half-Normal Plots in Interpreting Factorial Two-Level Experiments,” Technometrics 1 (1959):
311–42.

[Figure: DesignExpert half-normal probability plot of the absolute effects; B (4.85) stands far off the line through the others.]

Figure 10.5 DesignExpert half-normal probability plot for the effects shown in Table 10.13.

effects as it removes the sign prior to plotting. Figure 10.5 shows a half-normal plot for
the Table 10.13 data. Comparing it to Figure 10.4, where there was only a single large
effect, or where the significant effects are all positive or negative, both types of plots are
essentially equivalent to each other.

10.8 CONCLUSION
This is only a cursory discussion of design of experiments. It touches only on the most
rudimentary aspects. It will serve, however, as an introduction to the concepts and content.
Experimental design is a very powerful tool in the understanding of any process—
in a manufacturing facility, pilot line facility, or laboratory. The ideas presented here
will be revisited in the upcoming chapters and more graphical techniques, such as the
analysis of means (ANOM), will be presented as another method of analyzing designed
experiments.

Case History 10.1


2³ Experiment on Fuses
J. H. Sheesley has reported on an experiment in which the safe operation of a spe-
cialty lamp system depended on the safe and sure operation of a thermal fuse.7 Since

7. J. H. Sheesley, “Use of Factorial Designs in the Development of Lighting Products,” ASQC Electronic Division
Newsletter—Technical Supplement, issue 4 (Fall 1985): 23–27.

this system was to be used in a new application, the behavior of the fuse was exam-
ined under various conditions. The data is shown here as a 2³ experiment selected
from the overall data to illustrate the procedure. The three factors were line voltage
(A), ambient temperature (B), and type of start (C). The response was temperature of
the fuse after 10 minutes of operation as measured by a thermocouple on the fuse. The
levels used are as follows:
Line voltage     A+ = 120 V     A– = 110 V
Temperature      B+ = 1100      B– = 750
Start            C+ = Hot       C– = Cold
The resulting data in terms of average temperature (n = 10) after 10 minutes is
shown in Table 10.17.
The Yates analysis is as follows:

Yates Sum of
order Observation Col. 1 Col. 2 Col. 3 Yates effect squares
(1) 0.5 11.4 89.4 310.6 = T 77.7 = 2ȳ 12,059.0
a 10.9 78.0 221.2 68.0 = 4A 17.0 = A 578.0
b 29.8 108.1 28.8 71.1 = 4B 17.8 = B 631.9
ab 48.2 113.1 39.2 5.8 = 4AB 1.4 = AB 4.2
c 43.7 10.4 66.6 131.8 = 4C 33.0 = C 2,171.4
ac 64.4 18.4 5.0 10.4 = 4AC 2.6 = AC 13.5
bc 47.3 20.7 8.0 –61.1 = 4BC –15.3 = BC 466.7
abc 65.8 18.5 –2.2 –10.2 = 4ABC –2.6 = ABC 13.0

Table 10.17 Average temperature after 10 minutes (minus 200°C).

        B–               B+
     A–      A+       A–      A+

     (1)      a        b       ab
C–   0.5    10.9     29.8    48.2

      c      ac       bc      abc
C+  43.7    64.4     47.3    65.8

[Figure: DesignExpert normal probability plot of the effects; C, B, and A lie above the line, BC lies below it, and AC, AB, and ABC fall along the line near zero.]

Figure 10.6 DesignExpert normal probability plot for the effects shown in the Yates analysis for
Case History 10.1.

If the ABC interaction is assumed not to exist, its sum of squares can be used as a
measure of error and we have:

Source SS df MS F F0.05
Line voltage A 578.0 1 578.0 44.5 161.4
Temperature B 631.9 1 631.9 48.6 161.4
Interaction AB 4.2 1 4.2 0.3 161.4
Start C 2171.4 1 2171.4 167.0 161.4
Interaction AC 13.5 1 13.5 1.0 161.4
Interaction BC 466.7 1 466.7 35.9 161.4
Error (ABC) 13.0 1 13.0
Total 3878.7 7

We see that, even with this limited analysis of the fuse data, we are able to show
that start (C) has a significant effect with a risk of α = 0.05. The effect of start from cold
to hot is 33.0°. But is this the best we can do with this analysis? No, it is evident from
the Yates analysis that the AB and AC interactions have small sums of squares. Before
we combine these presumably insignificant effects into error, we can use a normal plot
to see if these two terms do indeed reside along a straight line with other insignificant
effects. Figure 10.6 is a normal plot of the effects that shows that both the AB and AC
interactions estimate error along with the ABC interaction.
The normal plot now shows that the A and B main effects along with the BC inter-
action are apparently significant. This can be shown statistically by the ANOVA table
using the combined interactions for error:

Source SS df MS F F0.05
Line voltage A 578.0 1 578.0 56.5 10.1
Temperature B 631.9 1 631.9 61.7 10.1
Start C 2171.4 1 2171.4 212.2 10.1
Interaction BC 466.7 1 466.7 45.6 10.1
Error 4.2 + 13.5 +13.0
(AB + AC + ABC) = 30.7 3 10.2
Total 3878.7 7

Note the dramatic reduction in the critical value for the F statistic when the error
degrees of freedom are increased from 1 to 3. We can now see that all main effects
and the BC interaction are statistically significant, as reflected in the normal plot in
Figure 10.6. Since the BC interaction is now deemed important, the C main effect cannot
be interpreted directly. By definition, the effect of C on the response is dependent on the
level of B, as shown by the interaction plot in Figure 10.7.
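The effect of pooling on the F test is easy to verify directly. The minimal sketch below takes the sums of squares from the tables above (each effect has 1 df), pools AB, AC, and ABC into a 3-df error term, and recomputes the F ratios against the tabled critical value F0.05(1, 3) = 10.1:

```python
# Recompute the pooled-error ANOVA for Case History 10.1.
# Sums of squares taken from the ANOVA tables above; each effect has 1 df.
ss = {"A": 578.0, "B": 631.9, "C": 2171.4, "BC": 466.7}
ss_error = 4.2 + 13.5 + 13.0      # pooled AB + AC + ABC
df_error = 3
ms_error = ss_error / df_error    # 30.7 / 3 = 10.23

# F ratio for each remaining effect: MS_effect / MS_error (MS_effect = SS / 1)
f_ratios = {name: s / ms_error for name, s in ss.items()}

for name, f in f_ratios.items():
    # F0.05(1, 3) = 10.1, versus F0.05(1, 1) = 161.4 before pooling
    print(f"{name}: F = {f:.1f}, significant: {f > 10.1}")
```

The printed ratios reproduce the 56.5, 61.7, 212.2, and 45.6 in the table, and all four exceed the much smaller 3-df critical value.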
If the objective of the experiment is to minimize the average thermal fuse temper-
ature, then the BC interaction plot indicates that a B temperature of 750° and a cold start
will minimize the response (predicted estimate is 5.7°). To complete the optimization,
we consider the A main effect. The sign of the A main effect is positive (+17.0) as shown
in the Yates analysis. Thus, the optimal setting for A that minimizes the response is the
low level (110V). So, the recommendation for the settings of A, B, and C that minimize
the average thermal fuse temperature is A @ 110V, B @ 750°, and C @ cold start.

[DESIGN-EXPERT interaction graph: x-axis B: Temperature, 750.00 to 1100.00; y-axis Avg Temp
(after 10 min), 0.5 to 65.8; one curve for each level of C: Start (cold, hot); actual factor
A: Line voltage = 115.00; predicted Avg Temp (after 10 min) = 5.7 at B = 750.00, C = cold;
LSD = 10.1855]

Figure 10.7 DesignExpert BC interaction plot.


Chapter 10: Some Concepts of Statistical Design of Experiments 311

10.9 PRACTICE EXERCISES


1. Consider the following data on height of Easter lilies similar to that shown in
Table 14.2.

                         Storage period
                        Short      Long

                Long      28         48
                          26         37
Conditioning              30         38
time
                Short     31         37
                          35         37
                          31         29

a. Estimate the main effects and interaction effects from the basic formula.
b. Perform Yates analysis.
c. Estimate error.
d. Set up an analysis of variance table.
e. Test for significance of the effects at the α = 0.05 level.
2. Given the following data on capacitances of batteries from Table 14.4

Nitrate
concentration
(C) Low High
Shim (B) In Out In Out
Hydroxide (A) New Old New Old New Old New Old
–0.1 1.1 0.6 0.7 0.6 1.9 1.8 2.1
1.0 0.5 1.0 –0.1 0.8 0.7 2.1 2.3 Day 1
0.6 0.1 0.8 1.7 0.7 2.3 2.2 1.9
–0.1 0.7 1.5 1.2 2.0 1.9 1.9 2.2
–1.4 1.3 1.3 1.1 0.7 1.0 2.6 1.8 Day 2
0.5 1.0 1.1 –0.7 0.7 2.1 2.8 2.5
Treatment 1 2 3 4 5 6 7 8
combination

a. Suppose the first three observations in each set of six had been run
on one day and the last three observations on the next day. Estimate
the block effect.
b. Perform Yates analysis.
c. Set up an analysis of variance table ignoring the fact that the experiment
was run on different days and test at the α = 0.05 level.

d. Set up an analysis of variance table as if the experiment were blocked
as in (a) above and test at the α = 0.05 level.
e. What are the advantages and disadvantages of blocking on days?
3. Suppose in Exercise 2 that treatment combinations 1, 4, 6, and 7 were tested
on one piece of equipment and combinations 2, 3, 5, and 8 were tested on
another. What would that do to the analysis and interpretation of the results?
(Hint: Write out the treatment combinations and use Table 10.15.)
4. Consider the following data from Table 14.9 (recoded by adding 0.30) on
contact potential of an electronic device with varying plate temperature (A),
filament lighting schedule (B), and aging schedule (C ).

Factor

A B C Response
– – – 0.16, 0.13, 0.15, 0.19, 0.11, 0.10
+ + – 0.45, 0.48, 0.37, 0.38, 0.38, 0.41
– + + 0.26, 0.34, 0.41, 0.24, 0.25, 0.25
+ – + 0.12, 0.18, 0.08, 0.09, 0.12, 0.09

a. What type of factorial experiment do these data represent?


b. What treatment combinations were run?
c. Place the treatment combinations in Yates order.
d. Perform Yates analysis.
e. Set up an analysis of variance table and test for significance at the
α = 0.05 level.
5. What is the defining contrast in Exercise 4? (Hint: Write out the treatment
combinations and use Table A.18.)
6. Show what effects are aliased together in the analysis of variance for
Exercise 4.
7. Write out the treatment combinations and defining contrast for a fractional
factorial investigating five different factors, each at two levels, when it is
possible to make only eight runs.
8. The defining contrast for a 2^(5–2) is I = –BCE – ADE + ABCD. What are all
the effects aliased with C?
9. Line width, the width of the developed photoresist in critical areas, is of vital
importance in photolithographic processes for semiconductors. In an attempt
to optimize this response variable, Shewhart charts were run on the process,
but even after identifying a number of assignable causes, the process remained

out of control. In an attempt to improve the process and isolate other potential
assignable causes, several statistically designed experiments were run. Among
them was a 2^3 factorial experiment on the following factors, each, of course,
at two levels as follows:

Factor Levels
A: Print GAP spacing Proximity print, soft contact print
B: Bake temperature 60°C, 70°C
C: Bake time 5 min, 6 min

The results of the experiment as given by Stuart Kukunaris,8 a student of Dr. Ott,
are as follows:

A:        Proximity print                    Soft contact print
B:      60°C          70°C               60°C          70°C
C: 5 min 6 min 5 min 6 min 5 min 6 min 5 min 6 min
373 368 356 356 416 397 391 407
372 358 351 342 405 393 391 404
361 361 350 349 401 404 396 403
381 356 355 342 403 409 395 407
370 372 355 339 397 402 403 406
Total 1857 1815 1767 1728 2022 2005 1976 2027

Once discovered, interactions play an important part in identifying assignable
causes apart from naturally occurring process fluctuations. Often the process
is so tightly controlled that naturally occurring slight changes in important
factors do not indicate their potential impact. This designed experiment was
useful in gaining further insight into the process:
a. Perform a Yates analysis.
b. Confirm that A (spacing), B (temperature), AB (spacing–temperature
interaction) and AC (spacing–time interaction) are significant. The
physical importance of these effects is indicated by the effects column
of the Yates analysis.

8. S. Kukunaris, “Operating Manufacturing Processes Using Experimental Design,” ASQC Electronics Division
Newsletter—Technical Supplement, issue 3 (Summer 1985): 1–19.
11
Troubleshooting
with Attributes Data

11.1 INTRODUCTION
Perhaps the presence of an assignable cause has been signaled by a control chart. Or per-
haps it is known that there are too many rejects, too much rework, or too many stoppages.
These are important attributes problems. Perhaps organized studies are needed to deter-
mine which of several factors—materials, operators, machines, vendors, processings—
have important effects upon quality characteristics. In this chapter, methods of analysis
are discussed with respect to quality characteristics of an attributes nature.
Not much has been written about process improvement and troubleshooting of qual-
ity characteristics of an attributes nature. Yet in almost every industrial process, there
are important problems where the economically important characteristics of the prod-
uct are attributes: an electric light bulb will give light or it will not; an alarm clock will
or will not ring; the life of a battery is or is not below standard. There are times when
it is expedient to gauge a quality characteristic (go/no–go) even though it is possible to
measure its characteristic as a variable. This chapter discusses some effective designed
studies using enumerative or attributes data and methods of analysis and interpretation
of resulting data.
Explanations of why a process is in trouble are often based on subjective judgment.
How can we proceed to get objective evidence in the face of all the plausible stories as
to why this is not the time or place to get it? Data of the attributes type often imply the
possibility of personal carelessness. Not everyone understands that perfection is unat-
tainable; a certain onus usually attaches to imperfection. Thus, it is important to find
ways of enlisting the active support and participation of the department supervisors, the
mechanics, and possibly some of the operators. This will require initiative and ingenuity.
In many plants, little is known about differences in machine performance. Just as
two autos of the same design may perform differently, so do two or three machines of
the same make. Or, a slight difference in a hand operation that is not noticed (or is


considered to be inconsequential) may have an important effect on the final performance
of a kitchen mixer or a nickel cadmium battery. Experience indicates that there
will be important differences in as few as two or three machines, or in a like number of
operators, shifts, or days. Several case histories are presented in this chapter to illustrate
important principles of investigation. In each, it is the intent to find areas of differences.
Independent variables or factors are often chosen to be omnibus-type variables.1 Once
the presence and localized nature of important differences are identified, ways can usu-
ally be found by engineers or production personnel to improve operations.
Data from the case histories have been presented in graphical form for a variety of
reasons. One compelling reason is that the experiment or study is valuable only when
persons in a position to make use of the results are convinced that the conclusions are
sensible. These persons have had long familiarity and understanding of graphical presen-
tations; they respond favorably to them. Another reason is that the graphical form shows
relationships and suggests possibilities of importance not otherwise recognized.

11.2 IDEAS FROM SEQUENCES OF OBSERVATIONS OVER TIME
The methods presented in Chapter 2 are applicable to sequences of attributes data as
well as to variables data. Control charts with control limits, runs above and below the
median—these procedures suggest ideas about the presence and nature of unusual per-
formance. As each successive point is obtained and plotted, the chart is watched for evi-
dence of economically important assignable causes in the process even while it is
operating under conditions considered to be stable.2
If the process is stable (in statistical control), each new point is expected to fall within
the control limits. Suppose the new point falls outside the established 3-sigma control
limits. Since this is a very improbable event when the process is actually stable, such an
occurrence is recognized as a signal that some change has occurred in the process. We
investigate the process to establish the nature of the assignable cause. The risk of an
unwarranted investigation from such a signal is very small—about three in a thousand.
In troubleshooting, it is often important to make an investigation of the process with
a somewhat greater chance (risk) of an unwarranted investigation than three in a
thousand; lines drawn at p̄ ± 2σ̂p will be more sensitive to the presence of assignable causes.
A somewhat larger risk of making an unwarranted investigation of the process is
associated with a point outside 2-sigma limits; it is about one chance in 20 (about a
five percent risk). However, there is now a smaller risk (β) of missing an important
opportunity to investigate, especially important in a process improvement study.
In process control, we set the control limits at ±3σ̂ arbitrarily and compute the
resulting α. In troubleshooting, the decision limits use just the opposite approach.

1. See Chapter 9.
2. See Chapter 5.
Chapter 11: Troubleshooting with Attributes Data 317

11.3 DECISION LINES APPLICABLE TO k POINTS SIMULTANEOUSLY
Introduction
When each individual point on a Shewhart control chart is not appraised for a possible
shift in process average at the time it is plotted, there is a conceptual difference in
probabilities to consider. For example, consider decision lines3 drawn at p̄ ± 2σ̂p. The risk
associated with them is indeed about five percent, if we apply them as criteria to a single
point just observed. However, if applied to an accumulated set of 20 points as a group,
about one out of twenty is expected to be outside of the decision limits even when there
has been no change in the process. Evidently then, decision lines to study k = 20 points
simultaneously, with a five percent risk of unnecessary investigation, must be at some
distance beyond p̄ ± 2σ̂p.
Troubleshooting is usually concerned with whether one or more sources—perhaps
machines, operators, shifts, or days—can be identified as performing significantly dif-
ferently from the average of the group of k sources. The analysis will be over the k
sources simultaneously, with risk α.
In the examples and case histories considered here, the data to be analyzed will not
usually relate to a previously established standard. For example, the data of Case History
11.1 represent the percent of rejects from 11 different spot-welding machine–operator com-
binations. In this typical troubleshooting case history, there is no given standard to use as
a basis for comparison of the 11 machine–operator combinations. They will be compared
to their own group average. Data in Figure 11.1, pertaining to the percent winners in horse

[ANOM chart of percent winners p at post positions 1–8; ng = 144; central line p′ = 0.125;
UDL = 0.199 (.05) and 0.213 (.01); LDL = 0.051 (.05) and 0.037 (.01)]

Figure 11.1 Winners at different post positions. (Data from Table 11.2.)

3. Even one point outside decision lines will be evidence of nonrandomness among a set of k points being considered
simultaneously. Some persons prefer to use the term “control limits.” Many practitioners feel strongly that only
those lines drawn at ± 3 sigma around the average should be called control limits. At any rate, we shall use the
term decision lines in the sense defined above.

racing, are a different type; there is a given standard. If track position is not important, then
it is expected that one-eighth of all races will be won in each of the eight positions.
When dealing with groups of points, the decision limits must be adjusted for the
group size, k. They must be widened. This graphical analysis of k sources of size ng
simultaneously is called the analysis of means4 and is abbreviated ANOM. It uses the
normal approximation to the binomial and therefore requires a fairly large sample size.
It is recommended that ng p̄ > 5.

Probabilities Associated with k Comparisons, Standard Given
Values of a factor Zα to provide proper limits are given in Table 11.1 (or Table A.7) for
values of α = 10, 5, and 1 percent. Upper and lower decision lines to judge the extent
of maximum expected random variation of points around a given group standard proportion
p′ or percent defective P′ of k samples are:

UDL(α) = p′ + Zα σp        UDL(α) = P′ + Zα σP        (11.1)
LDL(α) = p′ – Zα σp        LDL(α) = P′ – Zα σP

Table 11.1 Nonrandom variability. Standard given, df = ∞. See also Table A.7.
k Z0.10 Z0.05 Z0.01
1 1.64 1.96 2.58
2 1.96 2.24 2.81
3 2.11 2.39 2.93
4 2.23 2.49 3.02
5 2.31 2.57 3.09
6 2.38 2.63 3.14
7 2.43 2.68 3.19
8 2.48 2.73 3.22
9 2.52 2.77 3.26
10 2.56 2.80 3.29
15 2.70 2.93 3.40
20 2.79 3.02 3.48
24 2.85 3.07 3.53
30 2.92 3.14 3.59
50 3.08 3.28 3.72
120 3.33 3.52 3.93

4. E. R. Ott and S. S. Lewis, “Analysis of Means Applied to Per-Cent Defective Data,” Rutgers University Statistics
Center Technical Report no. 2, Prepared for Army, Navy, and Air Force under contract NONR 404(1 1), (Task NP
042-2 1) with the Office of Naval Research, February 10, 1960. E. R. Ott, “Analysis of Means—A Graphical
Procedure,” Industrial Quality Control 24, no. 2 (August 1967): 101–9. Also, see Chapters 13, 14, and 15.

If even one of the k points falls outside these decision lines, it indicates (statisti-
cally) different behavior from the overall group average.5
The following derivation of entries from Table A.7 and Table 11.1 may help the
reader understand the problem involved in analyzing sets of data. The analysis assumes
that samples of size ng are drawn from a process whose known average is p´, and ng and
p´ are such that the distribution of pi in samples of size n is essentially normal.
(Approximately, ng p´ > 5 or 6; see Equation (5.3) Chapter 5.) We now propose to select
k independent random samples of ng from the process and consider all k values pi simul-
taneously. Within what interval

p′ – Zα σp and p′ + Zα σp

will all k sample fractions pi lie, with risk α or confidence (1 – α)?


Appropriate values of Zα , corresponding to selected levels α and the above assumptions,
can be derived as follows. Let Pr represent the unknown probability that any one
sample pi from the process will lie between the lines to be drawn from Equation (11.1).
Then the probability that all k of the sample pi will lie within the interval in (11.1) is
Pr^k. If at least one point lies outside these decision lines, this is evidence of nonrandom
variability of the k samples; that is, some of the sample pi are different, with risk α. The
value of Zα can be computed as follows:

Pr^k = 1 – α (11.2)

Then, corresponding to the value of Pr found from this equation, Zα is determined
from Table A.1. Values of Zα found via Equation (11.2) are shown in Table 11.1 for α =
0.10, 0.05, 0.01, and selected values of k.

Numerical Example
Compute Z0.05 in Table 11.1 for k = 3:

(Pr)^3 = 0.95
log Pr = (1/3)log(0.95) = 9.99257 – 10 = –0.00743
Pr = 0.98304
1 – Pr = 0.01696

for a probability of 0.00848 in each tail of a two-tailed test. From Appendix
Table A.1, we find that corresponding decision lines drawn at p′ ± Z0.05 σp require
that Z0.05 = 2.39.
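The same computation can be carried out for any k and α with the standard normal quantile function; a minimal sketch in Python, using the standard library's `statistics.NormalDist` in place of Table A.1:

```python
from statistics import NormalDist

def z_alpha(k, alpha):
    """Z factor so that all k points fall within p' +/- Z*sigma_p
    with overall risk alpha (Equation 11.2): Pr**k = 1 - alpha."""
    pr = (1.0 - alpha) ** (1.0 / k)   # per-point coverage probability
    tail = (1.0 - pr) / 2.0           # split equally between the two tails
    return NormalDist().inv_cdf(1.0 - tail)

print(round(z_alpha(3, 0.05), 2))   # 2.39, matching Table 11.1
```

Running this for other rows of Table 11.1 (k = 7 gives 2.68, k = 20 gives 3.02) reproduces the tabled factors.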

5. The material in this section is very important; however, it can be omitted without seriously affecting the under-
standing of subsequent sections.

Note 1. When lines are drawn at p′ ± 2σp about the central line, it is commonly
believed that a point outside these limits is an indication of an assignable cause with risk
about five percent. The risk on an established control chart of a stable process is
indeed about five percent if we apply the criterion to a single point just observed; but
if applied, for example, to 10 points simultaneously, the probability of at least one point
of the 10 falling outside 2-sigma limits by chance is

1 – (0.954)^10 = 1 – 0.624 = 0.376

That is, if many sets of 10 points from a stable process are plotted with the usual
2-sigma limits, over one-third of the sets (37.6 percent) are expected to have one or
more points just outside those limits. This is seldom recognized.
Also, just for interest, what about 3-sigma limits? If many sets of 30 points from a
stable process are plotted with the usual 3-sigma limits, then

1 – (0.9973)^30 = 1 – 0.922 = 0.078, or 7.8%

of the sets are expected to have one or more points outside those limits.
Conversely, in order to provide a five percent risk for a set of 10 points considered
as a group, limits must be drawn at

±Z0.05 σp = ±2.80σp

as shown in Table 11.1.


Note 2. Or consider an accumulated set of 20 means (k = 20). About one out of
twenty is expected to be outside the lines drawn at p′ ± 2σp. Consequently, decision
lines to study 20 means simultaneously must be at some distance beyond p′ ± 2σp. Table
11.1 shows that the lines should be drawn at
p′ ± 3.02σp for α = 0.05
and at
p′ ± 3.48σp for α = 0.01
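The figures in these two notes are easy to reproduce. Using the two-sided normal coverages the text assumes, 0.954 for 2-sigma limits and 0.9973 for 3-sigma limits, the chance that at least one of m in-control points plots outside the limits is 1 – (coverage)^m:

```python
def false_alarm_risk(coverage, m):
    """Probability that at least one of m independent in-control
    points falls outside limits with the given per-point coverage."""
    return 1.0 - coverage ** m

print(round(false_alarm_risk(0.954, 10), 3))    # 0.376 (Note 1)
print(round(false_alarm_risk(0.9973, 30), 3))   # 0.078, i.e. 7.8%
```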

Example 11.1
Consider the following intriguing problem offered by Siegel: “Does the post position on
a circular track have any influence on the winner of a horse race?”6
Data on post positions of the winners in 144 eight-horse fields were collected from
the daily newspapers and are shown in Table 11.2. Position 1 is that nearest the inside rail.

6. S. Siegel, Nonparametric Statistics for the Behavioral Sciences (New York: McGraw-Hill, 1956): 45–46.

Table 11.2 Winners at different post positions.


Post position: 1 2 3 4 5 6 7 8 Total
No. of winners: 29 19 18 25 17 10 15 11 144

Percent: 20.1 13.2 12.5 17.3 11.8 7.0 10.4 7.7 P = 12.5%

The calculations for an analysis of means (ANOM) with standard given and p′ = 1/8
= 0.125 follow:

σp = √[ (0.125)(0.875)/144 ] = 0.0275

For k = 7, we have Z0.05 = 2.68 and Z0.01 = 3.19.7 The decision limits are:

Risk LDL UDL


0.05 0.051 0.199
0.01 0.037 0.213

These have been drawn in Figure 11.1 following certain conventions:


1. The sample size, n = 144, is written in the upper-left corner of the chart.
2. The risks, 0.05 and 0.01, are shown at the end of the decision lines.
3. The points corresponding to the eight post positions are connected by a dotted
line in order to recognize comparisons better.
4. The values of the decision lines are written adjacent to them.
Discussion
The point corresponding to post position 1 is between the (0.05) and (0.01) upper lines;8
this indicates that position 1 has a better than average chance of producing a winner (α
< 0.05). Figure 11.1 supports what might have been predicted: if positions have any
effect, the best position would surely be that one nearest the rail and the worst would be
near the outside. Not only does the graph show position 1 in a favored light, it also indi-
cates a general downward trend in the winners starting from the inside post positions.
(There is not enough evidence to support, conclusively, the possibility that position 4 is
superior to positions 2 and 3. There seems little choice among positions 6, 7, and 8.)
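The decision lines plotted in Figure 11.1 follow directly from these values; a quick check in Python, with the Z factors for k = 7 taken from Table 11.1:

```python
import math

p_std = 1.0 / 8.0                                 # standard given: p' = 0.125
n = 144
sigma_p = math.sqrt(p_std * (1.0 - p_std) / n)    # about 0.0276

for z, risk in [(2.68, 0.05), (3.19, 0.01)]:      # Table 11.1, k = 7
    udl = p_std + z * sigma_p
    ldl = p_std - z * sigma_p
    print(f"risk {risk}: LDL = {ldl:.3f}, UDL = {udl:.3f}")

# Post position 1 won 29 of 144 races:
print(29 / 144 > p_std + 2.68 * sigma_p)   # True: outside the 0.05 line
```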

7. Although there are eight positions in this case, there are only seven independent positions. (When any seven of the
pi are known, the eighth is also known.) We enter Table 11.1 with k = 7. It is evident that the decision is not
affected whether k = 7 or k = 8 is used. This situation seldom if ever arises in a production application.
8. The authors’ chi-square analysis of this data also indicates that there is a significant difference between positions
with a risk between 0.05 and 0.01.

Factors to Use in Making k Comparisons, No Standard Given


Factors for standard given were obtained easily in the preceding section. However, sit-
uations where they can be used in solving production problems seldom occur. In the
great majority of troubleshooting situations, there is no standard given; but it is very
useful to compare individual performances with the overall average group perfor-
mance. The comparison procedure used here is called analysis of means, no standard
given. It is similar to a p chart. The procedure is outlined in Table 11.3 and illustrated
in several case histories. Factors, designated by Hα , provide decision lines for the
important case of no standard given

p̄ – Hα σ̂p and p̄ + Hα σ̂p

when analyzing attribute data, or

X̄ ± Hα σ̂X̄

when analyzing variables data.

Table 11.3 Analysis of means, attributes data, one independent variable.


Step 1: Obtain a sample of ni items from each of k sources and inspect each sample. (It is preferable
to have all ni equal.) Let the number of defective or nonconforming units in the k samples be
d1, d2, . . . , dk, respectively.
Step 2: Compute the fraction or percent defective of each sample.

pi = di /ni Pi = 100di /ni

Step 3: Plot the points corresponding to the k values, pi or Pi.



Step 4: Compute the grand average p̄ or P̄ and plot it as a line:

    p̄ = Σdi /Σni        P̄ = 100Σdi /Σni

Step 5: Compute a standard deviation, using the average n̄ initially if there is variation in the
    sample size.

    If n̄p̄ > 5 and n̄(1 – p̄) > 5:    σ̂p = √[ p̄(1 – p̄)/n̄ ]    σ̂P = √[ P̄(100 – P̄)/n̄ ]

Step 6: From Table 11.4 or Appendix Table A.8, obtain the value of Hα corresponding to k and α.
    Draw decision lines:

    UDL: p̄ + Hα σ̂p        P̄ + Hα σ̂P
    LDL: p̄ – Hα σ̂p        P̄ – Hα σ̂P

Step 7: Accept the presence of statistically significant differences (assignable causes) indicated by
    points above the UDL and/or below the LDL with risk α. Otherwise, accept the hypothesis of
    randomness of the k means, that is, no statistically significant differences.
Step 8: Process improvement: consider ways of identifying the nature of and reasons for significant
differences.

Table 11.4 Analysis of means; no standard given; df = ∞. Comparing k groups with their group
average (especially for use with attributes data): p̄ ± Hα σp , X̄ ± Hα σX̄ .
See also Table A.8 for df = ∞.
k = no. of groups H0.10 H0.05 H0.01
2 1.16 1.39 1.82
3 1.67 1.91 2.38
4 1.90 2.14 2.61
5 2.05 2.29 2.75
6 2.15 2.39 2.87
7 2.24 2.48 2.94
8 2.31 2.54 3.01
9 2.38 2.60 3.07
10 2.42 2.66 3.12
15 2.60 2.83 3.28
20 2.72 2.94 3.39

The computation of Hα is much more difficult than the earlier computation of Zα .
We use the normal Z test if σ is known and the t test if σ is unknown. Likewise, we use
Zα if μ is known and Hα if μ is not known. References are given to their derivation.
Some factors, Hα , to use when analyzing attributes data that are reasonably normal are
given in Table 11.4.9 When n and p do not permit the assumption of normality, a
method of computing the exact probability of a percent defective to exceed a specified
value is possible.10
The following notes pertain to the use of ANOM or to an understanding of the
procedures.
Note 1. The procedures for pi and Pi can be combined easily as in Figure 11.4.
Simply compute UDL and LDL using percents, for example, and indicate the percent
scale P on one of the vertical scales (the left one in Figure 11.4). Then mark the frac-
tion scale p on the other vertical scale.
Note 2. Values of Hα are given in Tables A.8 and 11.4 for three risks. We shall frequently
draw both sets of decision lines expecting to bracket some of the points. (These
three levels are to be considered as convenient reference values and not strict bases for
decisions.) A risk of somewhat more than five percent is often a sensible procedure.
Note 3. When k = 2, and the assumption of normality is reasonable, the comparison
of the two values of p is like a Student's t test.11 Values of Hα corresponding to k = 2 are

Hα = tα /√2

9. The binomial distribution is reasonably normal for those values of ng p and ngq greater than 5 or 6 and p and q are
greater than say 0.05. See Equation (5.3).
10. S. S. Lewis, “Analysis of Means Applied to Percent Defective Data,” Proceeding of the Rutgers All-Day
Conference on Quality Control, 1958.
11. See Section 13.4.

where tα is from a two-tailed t table corresponding to df = ∞. Thus the ANOM, for k = 2,
is simply a graphical t test with df = ∞, which amounts to a normal Z test.
In Case History 11.2, as in the previous Example 11.1, an alternative approach used
a chi-square analysis. The conclusion regarding the question of statistical significance
of the data agrees with that of ANOM. It may be helpful to some readers to explain that
ANOM is one alternative to a chi-square analysis. A chi-square analysis is not as sensi-
tive as ANOM to the deviation of one or two sources from average, or to trends and
other order characteristics. A chi-square analysis is more sensitive to overall variation of
k responses. The ANOM is very helpful to the scientist and engineer in identifying spe-
cific sources of differences, and the magnitude of differences, and is a graphical presen-
tation with all its benefits.

11.4 ANALYSIS OF MEANS FOR PROPORTIONS WHEN n IS CONSTANT
Lewis and Ott12 applied analysis of means to binomially distributed data when the normal
approximation to the binomial distribution applies. The procedure for k sample propor-
tions from samples of equal size n when no standards are given, is as follows:
1. Compute pi , the sample proportion, for i = 1, 2, . . . , k.
2. Compute p̄, the overall mean proportion.
3. Estimate the standard error of the proportions by

σ̂e = σ̂p = √[ p̄(1 – p̄)/ng ]

Regard the estimate as having infinite degrees of freedom.


4. Plot the proportions pi in control chart format against decision limits
computed as:

p̄ ± Hα √[ p̄(1 – p̄)/ng ]

The Hα factor is obtained from Appendix Table A.8 using df = ∞.


5. Conclude that the proportions are significantly different if any point plots
outside the decision limits.

12. S. S. Lewis and E. R. Ott, “Analysis of Means Applied to Percent Defective Data,” Rutgers University Statistics
Center Technical Report no. 2 (February 10, 1960).

When standards are given, that is, when testing against a known or specified value
of p, use the above procedure with p̄ replaced by the known value of p and Hα taken
from the row labeled SG in Appendix Table A.8.

Example 11.2
Hoel13 poses the following problem:
Five boxes of different brands of canned salmon containing 24 cans each were examined
for high-quality specifications. The number of cans below specification were respectively
4, 10, 6, 2, 8. Can one conclude that the five brands are of comparable quality?
To answer the question by analysis of means, the procedure would be as follows,
using the α = 0.05 level of risk:
• The sample proportions are 0.17, 0.42, 0.25, 0.08, 0.33.
• The average proportion is p̄ = 30/120 = 0.25.
• The decision lines are

    p̄ ± Hα √[ p̄(1 – p̄)/ng ]

    0.25 ± 2.29 √[ (0.25)(1 – 0.25)/24 ]

    0.25 ± 0.20

    (0.05, 0.45)
The analysis of means plot is as shown in Figure 11.2 and leads to the conclusion
that there is no evidence of a significant difference in quality at the α = 0.05 level of risk.
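Example 11.2 can be verified in a few lines. The sketch below applies the steps of the procedure with H0.05 = 2.29 for k = 5 taken from Table 11.4:

```python
import math

defects = [4, 10, 6, 2, 8]          # cans below specification per box
n = 24                              # cans per box
h = 2.29                            # H(0.05) for k = 5, Table 11.4

p = [d / n for d in defects]                        # 0.17, 0.42, 0.25, 0.08, 0.33
p_bar = sum(defects) / (n * len(defects))           # 30/120 = 0.25
margin = h * math.sqrt(p_bar * (1 - p_bar) / n)     # 2.29 * 0.0884 = 0.20
ldl, udl = p_bar - margin, p_bar + margin           # (0.05, 0.45)

outside = [i + 1 for i, pi in enumerate(p) if pi < ldl or pi > udl]
print(f"decision lines: ({ldl:.2f}, {udl:.2f}); boxes outside: {outside}")
```

No box plots outside the lines, agreeing with the conclusion drawn from Figure 11.2.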

11.5 ANALYSIS OF MEANS FOR PROPORTIONS WHEN n VARIES
Often it is not practical to assume that the sample size will remain constant when work-
ing with proportions data. In this situation, Duncan recommended three options14:

13. P. G. Hoel, Introduction to Mathematical Statistics, 3rd ed. (New York: John Wiley & Sons, 1962).
14. A. J. Duncan, Quality Control and Industrial Statistics (Homewood, IL: Irwin, 1974): 403–5.

[ANOM chart of proportion defective pi for boxes 1–5; central line p̄ = 0.25; decision
lines at 0.05 and 0.45]

Figure 11.2 Analysis of means plot; proportion defective.

1. Base the control limits on n̄. This approach is demonstrated in Case Histories
    11.7 and 11.8. The ANOM limits are calculated as

    p̄ ± Hα √[ p̄(1 – p̄)/n̄ ]

2. Express the variable in standard deviation units; that is, instead of plotting
    the sample fraction defective, plot the statistic (pi – p̄)/σ̂pi , where

    σ̂pi = √[ p̄(1 – p̄)/ni ]

    When the data are plotted in the form of a control chart, Duncan refers to this
    type of chart as a stabilized p-chart with control limits at –3 and +3.15
3. The most straightforward option is to compute separate control limits for each
    sample. Thus, for 3-sigma limits, the value of σp will vary inversely with √n.
    This means that differing n will require differing limits; that is, when n is
    larger, σp will be smaller, and the limits will be narrower, and vice versa.
    In the one-way comparison of binomial means, the estimate of σp for each
    sample is described in option 2 above. Sheesley applied this approach to the
    ANOM,16 assuming that the data satisfy the requirements for using the
    normal approximation to the binomial (see Chapter 5), using these limits
15. A. J. Duncan, “Detection of Non-random Variation When Size of Sample Varies,” Industrial Quality Control
(January 1948): 9–12.
16. J. H. Sheesley, “Comparison of K Samples Involving Variables or Attributes Data Using the Analysis of Means,”
Journal of Quality Technology 12, no. 1 (January 1980): 47–52.

p̄ ± Hα √[ p̄(1 – p̄)/ni ]
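A sketch of these per-sample limits in Python. The counts and sample sizes here are hypothetical, chosen purely for illustration, and H = 2.14 is the no-standard-given factor for k = 4 groups at α = 0.05 from Table 11.4:

```python
import math

# Hypothetical data: defectives d_i out of n_i items at k = 4 sources
d = [8, 12, 6, 10]
n = [100, 150, 80, 120]
h = 2.14                            # H(0.05), k = 4, Table 11.4

p_bar = sum(d) / sum(n)             # 36/450 = 0.08
limits = []
for ni in n:
    half_width = h * math.sqrt(p_bar * (1 - p_bar) / ni)
    limits.append((p_bar - half_width, p_bar + half_width))

# Wider limits for smaller samples, narrower for larger ones
flagged = [i + 1 for i, (di, ni) in enumerate(zip(d, n))
           if not limits[i][0] <= di / ni <= limits[i][1]]
print(p_bar, flagged)
```

With these illustrative counts no source is flagged; a source whose pi fell outside its own pair of limits would be singled out for investigation.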

The adaptation of option 3 to the two-way case is also straightforward. Consider the
following problem in factor A with ka = 3 levels and factor B with kb = 2 levels expressed
in a two-way layout of kab = 6 cells:
        B1      B2
A1     p11     p12     p̄1•
A2     p21     p22     p̄2•
A3     p31     p32     p̄3•
       p̄•1     p̄•2     p̄

The sample proportion in the cell representing the ith level of factor A and jth level
of factor B, pij, is calculated as dij/nij. So, the levels of factors A and B that are plotted on
the ANOM chart along with the centerline p– are
   p̄i• = Σj dij / Σj nij ,   p̄•j = Σi dij / Σi nij ,   p̄ = ΣiΣj dij / ΣiΣj nij

where the sums run over j = 1, . . . , kb and i = 1, . . . , ka.

The dot notation implies that the statistic is averaged over the levels of the factor
with the missing subscript; for example, p–2• is the average proportion for the second
level of factor A averaged over the levels of factor B.
The estimates of the standard deviation for each sample are based on p̄, but each is dependent on its sample size. Therefore, the standard error estimates for the plotted statistics above and the individual cells in the two-way layout are shown to be

   σ̂pi• = √( p̄(1 − p̄) / Σj nij ),   σ̂p•j = √( p̄(1 − p̄) / Σi nij ),   σ̂pij = √( p̄(1 − p̄) / nij )

The ANOM limits for the case of proportions data when n varies for main effects are

   p̄ ± Hα σ̂pi•   where Hα is based on α and ka with df = ∞

   p̄ ± Hα σ̂p•j   where Hα is based on α and kb with df = ∞

This approach is used to compute the adjusted decision lines for main effects in
Case History 11.7.
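These two-way marginal proportions and standard errors can be sketched in plain Python; the helper below is an illustration, not code from the text:

```python
from math import sqrt

def anom_two_way(d, n):
    """Marginal statistics for a two-way ANOM with proportions.

    d[i][j] and n[i][j] are the defective count and sample size at
    level i of factor A and level j of factor B.  Returns p-bar, the
    row and column proportions, and their standard errors.
    """
    ka, kb = len(d), len(d[0])
    p_bar = sum(map(sum, d)) / sum(map(sum, n))
    p_row = [sum(d[i]) / sum(n[i]) for i in range(ka)]
    p_col = [sum(d[i][j] for i in range(ka)) / sum(n[i][j] for i in range(ka))
             for j in range(kb)]
    se_row = [sqrt(p_bar * (1 - p_bar) / sum(n[i])) for i in range(ka)]
    se_col = [sqrt(p_bar * (1 - p_bar) / sum(n[i][j] for i in range(ka)))
              for j in range(kb)]
    return p_bar, p_row, p_col, se_row, se_col
```

The decision lines for factor A are then p̄ ± Hα·se_row[i], with Hα based on α and ka; each row or column carries its own standard error when the nij differ.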

The two-way situation can be expanded to the three-way case as well. Consider the
following problem in factor A with ka = 2 levels, factor B with kb = 2 levels, and factor
C with kc = 3 levels expressed in a three-way layout of kabc = 12 cells:
                C1                  C2                  C3
            B1      B2          B1      B2          B1      B2
A1        p111    p121        p112    p122        p113    p123      p̄1••
A2        p211    p221        p212    p222        p213    p223      p̄2••
          p̄•11    p̄•21       p̄•12    p̄•22       p̄•13    p̄•23      p̄
          p̄••1                p̄••2                p̄••3

(The B margins, p̄•1• and p̄•2•, are obtained by averaging the B1 columns and the B2 columns, respectively, across all levels of C.)

The sample proportion in the cell representing the ith level of factor A, jth level of
factor B, and kth level of factor C, pijk , is calculated as dijk/nijk . So, the levels of factors
A, B, and C that are plotted on the ANOM chart along with the centerline p– are
   p̄i•• = ΣjΣk dijk / ΣjΣk nijk ,   p̄•j• = ΣiΣk dijk / ΣiΣk nijk ,   p̄••k = ΣiΣj dijk / ΣiΣj nijk   for main effects

   p̄ij• = Σk dijk / Σk nijk ,   p̄i•k = Σj dijk / Σj nijk ,   p̄•jk = Σi dijk / Σi nijk   for two-factor interactions

The dot notation implies that the statistic is averaged over the levels of the factor
with the missing subscript, for example, p–21• is the average proportion for the second
level of factor A and the first level of B averaged over the levels of factor C.
The estimates of the standard deviation for each sample are still based on p̄, but are
again dependent on sample size. Therefore, the standard error estimates for the plotted
statistics above and the individual cells in the three-way layout are shown to be

   σ̂pi•• = √( p̄(1 − p̄) / ΣjΣk nijk ),   σ̂p•j• = √( p̄(1 − p̄) / ΣiΣk nijk ),   σ̂p••k = √( p̄(1 − p̄) / ΣiΣj nijk )   for main effects

   σ̂pij• = √( p̄(1 − p̄) / Σk nijk ),   σ̂pi•k = √( p̄(1 − p̄) / Σj nijk ),   σ̂p•jk = √( p̄(1 − p̄) / Σi nijk )   for two-factor interactions

   σ̂pijk = √( p̄(1 − p̄) / nijk )   for the three-factor interaction

The ANOM limits for the case of proportions data when n varies are presented in
Chapter 15. This approach can be used to compute adjusted decision lines for main
effects and interactions in Case History 11.8.

11.6 ANALYSIS OF MEANS FOR COUNT DATA


Vaswani and Ott17 used an analysis of means-type procedure on data that was distrib-
uted according to the Poisson distribution, when the normal approximation to the
Poisson could be employed. For k units each with a count of ci successes, when no stan-
dards are given, analysis of means is performed in the following manner:
1. Count the number of successes per unit ci, i = 1, 2, . . . , k.
– the mean number of successes overall.
2. Compute c,
3. Estimate the standard error of the counts as

   σ̂e = σ̂c = √c̄

   Regard the estimate as having infinite degrees of freedom.


4. Plot the counts, ci , in control chart format against decision limits computed as

   c̄ ± Hα √c̄

5. Conclude that the counts are significantly different if any point falls outside
the decision limits.
When standards are given, that is, when the mean of the Poisson distribution is
known or specified to be m, the value of m is used to replace c̄ in the above procedure
and values of Hα are taken from Appendix Table A.8 using the row labeled SG.

Example 11.3
Brownlee presents some data on the number of accidents over a period of time for three
shifts.18 These were 2, 14, and 14 respectively. Analysis of means can be used to test if

17. S. Vaswani and E. R. Ott, “Statistical Aids in Locating Machine Differences,” Industrial Quality Control 11, no. 1
(July 1954).
18. K. A. Brownlee, Industrial Experimentation (New York: Chemical Publishing Company, 1953).

Figure 11.3 Analysis of means plot; accidents by shift (ci plotted for shifts A, B, C against decision limits of 4.0 and 16.0).

these data indicate an underlying difference in accident rate between the shifts. The
analysis is performed as follows:
• The overall mean is c̄ = 30/3 = 10
• Limits for analysis of means at the α = 0.05 level are

   c̄ ± Hα √c̄
   10 ± 1.91 √10
   10 ± 6.0
   (4.0, 16.0)
The analysis of means plot is shown in Figure 11.3 and indicates the shifts to be sig-
nificantly different at the 0.05 level of risk.
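The example can be checked with a short Python sketch; the helper name is an assumption, and H0.05 = 1.91 is the tabled value used above:

```python
from math import sqrt

def anom_counts(counts, h_alpha):
    """ANOM for Poisson counts, no standard given.

    Limits are c-bar +/- h_alpha * sqrt(c-bar); h_alpha is read from
    Table A.8 for the chosen risk and k = len(counts).
    """
    c_bar = sum(counts) / len(counts)
    half_width = h_alpha * sqrt(c_bar)
    ldl, udl = c_bar - half_width, c_bar + half_width
    flags = [c < ldl or c > udl for c in counts]   # outside a limit?
    return c_bar, ldl, udl, flags

# Brownlee's accident counts for shifts A, B, C with H(0.05) for k = 3
c_bar, ldl, udl, flags = anom_counts([2, 14, 14], 1.91)
```

Shift A, with 2 accidents, falls below the lower decision limit of about 4.0; shifts B and C lie inside the limits.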

11.7 INTRODUCTION TO CASE HISTORIES


The mechanics of ANOM with attributes data are used in the following case histories.
Whatever analysis is used when analyzing data in a troubleshooting or process improve-
ment project is important only as it helps in finding avenues to improvement. Any
analysis is incidental to the overall procedure of approaching a problem in production.
The case histories have been chosen to represent applications to different types of pro-
cesses and products. They have been classified below according to the number of inde-
pendent variables employed, where independent variables are to be interpreted as
discussed in Section 9.2.

The planned procedures for obtaining data in Sections 11.5, 11.6, 11.7, and 11.8 are
especially useful throughout industry, yet there is little organized information published
on the subject.19
Many of the ideas presented here are applied also to variables data in Chapters 13,
14, and 15.

Outline of Case Histories (CH) that Follow in This Chapter


Section 11.8 One independent variable at k levels
CH 11.1 Spot-welding electronic assemblies
CH 11.2 A corrosion problem with metal containers
CH 11.3 End breaks in spinning yarn
CH 11.4 Bottle-capping
Section 11.9 Two independent variables
CH 11.5 Alignment and spacing in a cathode-ray gun
CH 11.6 Phonograph pickup cartridges—a question of redesign
CH 11.7 Machine shutdowns (unequal ri)
Section 11.10 Three independent variables
CH 11.8 Glass-bottle defectives; machines, shifts, days
CH 11.9 Broken and cracked plastic caps (three vendors)
Section 11.11 A very important experimental design: ½ × 2³
CH 11.10 Grid-winding lathes

11.8 ONE INDEPENDENT VARIABLE WITH k LEVELS

Case History 11.1


Spot-Welding Electronic Assemblies20
Excessive rejections were occurring in the mount assembly of a certain type of electronic
device. Several hundreds of these mounts were being produced daily by operators, using

19. For an excellent illustration of use of this procedure, see L. H. Tomlinson and R. J. Lavigna, “Silicon Crystal
Termination: An Application of ANOM for Percent Defective Data,” Journal of Quality Techology 15, no. 1
(January 1983): 26–32.
20. E. R. Ott, “Trouble-Shooting,” Industrial Quality Control 11, no. 9 (June 1955).

their own spot-welding machine. The mount assemblies were inspected in a different
area of the plant, and it was difficult to identify the source of welding trouble.21
An oxide coating was extending too far down on one component; the department
supervisor believed that the trouble was caused by substandard components delivered
to the department. In fact, there was evidence to support this view. The supervisor of
the preceding operation agreed that the components were below standard. Even so,
whenever any operation with as many as three or four operator–machine combinations
is in trouble, a special short investigation is worthwhile. This is true even when the
source of trouble is accepted as elsewhere.
This example discusses a straightforward approach to the type of problem described
above. It is characterized by (1) the production of a product that can be classified only
as “satisfactory” or “unsatisfactory,” with (2) several different operators, machines, heads
on a machine, or jigs and fixtures all doing the same operation. The procedure is to
select small samples of the product in a carefully planned program for a special study,
inspecting each one carefully, and recording these sample inspection data for careful
analysis. Experience has shown that these small samples, obtained in a well-planned
manner and examined carefully, usually provide more useful information for corrective
action than information obtained from 100 percent inspection. It allows more effort to
be allocated to fewer units.

Collecting Data

An inspector was assigned by the supervisor to obtain five mounts (randomly) from
each operator–welder combination at approximately hourly intervals for two days; then
ng = (8)(2)(5) = 80. Each weld was inspected immediately, and a record of each type of
weld defect was recorded by operator–welder on a special record form. Over the two-
day period of the study, records were obtained on 11 different operator–welders as
shown in Table 11.5: the percent defective from these eleven combinations, labeled A,
B, C, . . . , K, have been plotted in Figure 11.4. The average percent of weld rejects for

the entire group for the two-day study was p– = 66/880 = 0.075 or P = 7.5 percent; this
was just about the rate during recent production.

Discussion

Several different factors could have introduced trouble into this spot-welding operation.
One factor was substandard components, as some believed. But were there also differ-
ences among spot welders, operators, or such factors as the time of day (fatigue), or the
day of the week? Did some operators need training? Did some machines need mainte-
nance? We chose to study a combination of operators with their own regular machines;
the supervisor decided to get data from 11 of them.

21. In regular production, each operator was assigned to one specific welding machine. No attempt was made in this
first study to separate the effects of the operators from those of the machines.

Table 11.5 Welding rejects by operator–machine. Samples of ng = 80.

Operator    Number    Percent
A              3        3.75
B              6        7.5
C              8       10.0
D             14       17.5
E              6        7.5
F              1        1.25
G              8       10.0
H              1        1.25
I              8       10.0
J             10       12.5
K              1        1.25
            Σ = 66

p̄ = 66/880 = 0.075,  P = 7.5%

σ̂P = √( (7.5)(92.5) / 80 ) = 2.94%

Decision line: α = 0.05, k = 11
UDL = 7.5% + (2.70)(2.94) = 15.4%

Figure 11.4 Welding rejects by operator–machine; ng = 80, P = 7.5%, UDL = 15.4% (.05) and 16.8% (.01). (Data from Table 11.5.)
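The decision-line arithmetic in Table 11.5 is easy to verify with a short script (an illustrative check, not part of the original study; H0.05 = 2.70 for k = 11 comes from Table A.8):

```python
from math import sqrt

# Welding rejects (Table 11.5): defectives out of ng = 80 for
# operator-machine combinations A through K
rejects = [3, 6, 8, 14, 6, 1, 8, 1, 8, 10, 1]
ng = 80

p_bar = sum(rejects) / (len(rejects) * ng)   # 66/880 = 0.075
sigma_p = sqrt(p_bar * (1 - p_bar) / ng)     # about 0.0294, i.e., 2.94%
udl_05 = p_bar + 2.70 * sigma_p              # upper decision limit, ~15.4%
high = [chr(ord('A') + i) for i, d in enumerate(rejects) if d / ng > udl_05]
```

Only combination D, at 17.5 percent, exceeds the upper decision limit.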

When we had the data, we plotted them and computed decision lines (Figure 11.4).
Combination D exceeded the upper limit (α = 0.01); three combinations, F, H, and K,
were "low." In discussing these four operators, the supervisor assured us without any
hesitation that:

• Operator D was both “slow and careless.”


• Operator F was very fast and very careful, and it was the operator’s frequent
practice to repeat a weld.
• Operator H was slow, but careful.
• Operator K was one about whom little was known because the operator was
not a regular.
Conclusion
Pooling the attributes information from small samples (of five per hour over a two-day
period) indicated the existence of important differences in operator–welder combina-
tions. These differences were independent of the quality of components being delivered
to the department. Efforts to improve the troublesome spraying oxide coating in the pre-
ceding department would be continued, of course.
These observed differences in welding suggest also:
1. Combinations F, H, and K should be watched for clues to their successful
techniques in the hope that they can then be taught to others.
2. Combination D should be watched to check the supervisor’s unfavorable
impression.
3. In addition, the desirability of studying the effect of repeat welding at
subsequent stages in the manufacturing process should be studied. This
may be an improvement at the welding stage; but its effect on through
assembly needs assessment.

Case History 11.2


A Corrosion Problem with Metal Containers22
The effect of copper on the corrosion of metal containers was studied by adding copper
in three concentrations. After being stored for a time, the containers were examined
for failures of a certain type. The data are summarized in Table 11.6 and plotted in
Figure 11.5. The large increase in defectives strongly suggests that an increase in parts
per million (ppm) of copper produces a large increase in failures. The increase is
significant both economically and statistically.

22. H. C. Batson, “Applications of Factorial Chi-Square Analysis to Experiments in Chemistry,” Transactions of the
American Society for Quality Control (1956): 9–23.

Table 11.6 Effect of copper on corrosion.

Level of copper,   Containers       Failures,   Fraction      Percent
ppm                examined, ng     di          failing       failing
5                  80               14          p1 = .175     17.5
10                 80               36          p2 = .450     45.0
15                 80               47          p3 = .588     58.8
Totals             240              97          p̄ = 97/240 = .404   P = 40.4%

Figure 11.5 Effect of copper on corrosion; ng = 80, P = 40.4%, UDL = 53.5% (.01), LDL = 27.3% (.01). (Data from Table 11.6.)

Formal Analysis: ANOM

   σ̂P = √( P(100 − P) / ng ) = √( 40.4(100 − 40.4) / 80 ) = 5.5%

For k = 3 and α = 0.01, Table A.8 gives H0.01 = 2.38. Then

   P ± H0.01 σ̂P = 40.4 ± (2.38)(5.5)
   UDL(0.01) = 53.5%
   LDL(0.01) = 27.3%

One point is below the LDL; one point is above the UDL. There is no advantage in
computing decision lines for α = 0.05.

Whether the suggested trend is actually linear will not be discussed here, but if
we assume that it is linear, the increase in rejections per ppm of copper from the
5 ppm to the 15 ppm level is

   Average increase = (58.8% − 17.5%) / 10 = 4.13% per ppm

This was considered a very important change.

Case History 11.3


End Breaks in Spinning Cotton Yarn23

The Problem
An excessive number of breaks in spinning cotton yarn was being experienced in a tex-
tile mill. It was decided to make an initial study on a sample of eight frames to deter-
mine whether there were any essential differences in their behavior. Rather than take all
the observations at one time, it was decided to use random time intervals of 15 minutes
until data were on hand for ten such intervals on each of the eight frames.
Each frame contained 176 spindles. As soon as a break occurred on a spindle, the
broken ends were connected or “pieced” together and spinning resumed on that spindle.
(The remaining 175 spindles continued to spin during the repair of the spindle.) Thus
the number of end breaks during any 15-minute interval is theoretically unlimited, but
we know from experience that it is “small” during ordinary production.
The selection of a 15-minute interval was an arbitrary decision for the initial study.
It was similarly decided to include eight frames in the initial study.
The number of end breaks observed in 15 minutes per frame is shown in Table 11.7
and in Figure 11.6.
Conclusions
It is apparent that there is an appreciable difference between frames. Those with aver-
ages outside of the (0.01) decision lines are:

Excessive breaks: frames 5 and 8


Few breaks: frames 2, 3, and 7

The analysis using circles and triangles in Table 11.7 given in Analysis 1 below pro-
vides some insight into the performance of the frames.

23. S. Vaswani and E. R. Ott, “Statistical Aids in Locating Machine Differences,” Industrial Quality Control 11, no. 1
(July, 1954).

Table 11.7 End breaks during spinning cotton yarn.

Sample                       Frame no.
no.         1    2    3    4    5    6    7    8    Total
1          13    7   22   15   20   23   15   14     129
2          18   10    7   12   19   17   18   22     123
3           8    8   21   14   15   16    8    8      98
4          13   12    8   10   23    3   12   20     101
5          12    6    9   27   32    4    9   18     117
6           6    6    6   17   34   12    1   24     106
7          16   20    5    9    8   17    7   21     103
8          21    9    2   13   10   14    7   17      93
9          17   14    9   24   21    8    6   33     132
10         16    7    7   10   14   10    6   11      81
Frame
avg.     14.0  9.9  9.6 15.1 19.6 12.4  8.9 18.8   Grand avg. = 13.54

Figure 11.6 End breaks on spinning frames; frame averages (ng = 10) plotted against c̄ = 13.54 with UDL = 16.50 (.05), 17.03 (.01) and LDL = 10.59 (.05), 10.05 (.01).



Note. The circles represent data values that exceed c– + 2sc, and the triangles are
data values that are below c– – 2sc.
Analysis 1: A “Quick Analysis”
The average number of breaks per time interval is c̄ = 13.54. Then for a Poisson distribution (Section 5.4)

   σ̂c = √13.54 = 3.68

for individual entries in Table 11.7. Let us consider behavior at the c̄ ± 2σ̂c level since
we are interested in detecting possible sources of trouble:

   13.54 + 2(3.68) = 20.90 (figures in circles)²⁴
   13.54 − 2(3.68) = 6.18 (figures in triangles)
Conclusions
A visual inspection of the individuals so marked suggests:
1. Frames 4, 5, and 8 are suspiciously bad since there are circles in each column
and no triangles.
2. Frames 2 and 7 look good; there are at least two triangles in each and no
circles. (Frame 3 shows excellent performance except for the two circled
readings early in the study.)
Analysis 2: ANOM (The Mechanics to Obtain Decision Lines in Figure 11.6)
Each frame average is of ng = 10 individual observations. In order to compare them to
their own group average, we compute

   σ̂c̄ = σ̂c / √ng = 3.68 / √10 = 1.16

From Table A.8, values of Hα for k = 8 are: H0.05 = 2.54 and H0.01 = 3.01.²⁵ Then for
α = 0.05

   UDL(0.05) = 13.54 + 2.54(1.16) = 16.48
   LDL(0.05) = 13.54 − 2.54(1.16) = 10.59

24. In practice, we use two colored pencils. This analysis is a form of NL gauging; see Chapter 6.
25. The individual observations are considered to be of Poisson type; this means somewhat skewed with a longer tail
to the right. However, averages of as few as four such terms are essentially normally distributed. Consequently, it
is proper to use Table A.8; see Theorem 3, Chapter 1.

and for α = 0.01

   UDL(0.01) = 13.54 + 3.01(1.16) = 17.03
   LDL(0.01) = 13.54 − 3.01(1.16) = 10.05
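The frame averages and the α = 0.01 decision lines can be reproduced from Table 11.7 with a short sketch (illustrative code; H0.01 = 3.01 from Table A.8):

```python
from math import sqrt

# End breaks per 15-minute interval (Table 11.7), one row per frame
breaks = [
    [13, 18, 8, 13, 12, 6, 16, 21, 17, 16],   # frame 1
    [7, 10, 8, 12, 6, 6, 20, 9, 14, 7],       # frame 2
    [22, 7, 21, 8, 9, 6, 5, 2, 9, 7],         # frame 3
    [15, 12, 14, 10, 27, 17, 9, 13, 24, 10],  # frame 4
    [20, 19, 15, 23, 32, 34, 8, 10, 21, 14],  # frame 5
    [23, 17, 16, 3, 4, 12, 17, 14, 8, 10],    # frame 6
    [15, 18, 8, 12, 9, 1, 7, 7, 6, 6],        # frame 7
    [14, 22, 8, 20, 18, 24, 21, 17, 33, 11],  # frame 8
]
c_bar = sum(map(sum, breaks)) / 80            # grand average, about 13.54
se = sqrt(c_bar) / sqrt(10)                   # sigma-hat-c / sqrt(ng)
udl = c_bar + 3.01 * se                       # 0.01 decision lines
ldl = c_bar - 3.01 * se
high = [i + 1 for i, row in enumerate(breaks) if sum(row) / 10 > udl]
low = [i + 1 for i, row in enumerate(breaks) if sum(row) / 10 < ldl]
```

As in the text, frames 5 and 8 fall above the upper line and frames 2, 3, and 7 fall below the lower line.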

A Further Comment
Other proper methods of analyzing the data in Table 11.7 include chi-square and analysis of variance. Each of them indicates nonrandomness of frame performance; however,
unlike ANOM, they need to be supplemented to indicate the specific frames exhibiting
different behavior and the magnitude of that behavior difference.
Process Action Resulting from Study
Because of this initial study, an investigation was conducted on frames 5 and 8 that
revealed, among other things, defective roller coverings and settings. Corrective action
resulted in a reduction in their average breaks to 11.8 and 8, respectively; a reduction of
about 50 percent. A study of reasons for the better performance of frames 2, 3, and 7 was
continued to find ways to make similar improvements in other frames in the factory.

Case History 11.4


An Experience with a Bottle Capper
This capper has eight rotating heads. Each head has an automatic adjustable chuck
designed to apply a designated torque. Too low a torque may produce a leaker; too high
a torque may break the plastic cap or even the bottle.
It is always wise to talk to line operators and supervisors: “Any problem with broken
caps?” “Yes, quality control has specified a high torque, and this is causing quite a lot
of breakage.” After watching the capper a few minutes, a simple tally of the number of
broken caps from each head was made. (See Table 11.8 and Figure 11.7.)

Table 11.8 Plastic caps breaking at the capper.


Head no. f = Number broken
1 1
2 1
3 2
4 2
5 1
6 2
7 2
8 9

Figure 11.7 Cap breakage at different heads; c̄ = 2.5, UDL = 6.53 (.05) and 7.27 (.01). Actual value of n is unknown, but 50 is a guess.

Head 8 is evidently breaking almost as many caps as all others combined (see
Formal Analysis below). Too high a torque specification? Or inadequate adjustment on
head 8? The answer is obviously the latter. In theory, broken caps may be a consequence
of the capper, the caps, or the bottles. But it is human nature to attribute the cause to
“things beyond my responsibility.”
Discussion
If there had been no significant differences between heads, what then?
1. How many cavities are in the bottle mold? Probably four or eight. Let us hold
out (collect) 20 or 30 broken-capped bottles and check the mold numbers of
the bottles that are (or should be) printed during molding. Often, the great
majority of defectives will be from one or two bottle molds.
2. How many cavities are producing caps? Probably 8 or 16. Let us take the same
20 or 30 broken-capped bottles and check the cavity numbers of the caps (also
printed at molding). It is not unusual to find a few bottle-cap cavities
responsible for a preponderance of broken caps.
Formal Analysis
The number of breaks on each head is known for the period of observation. Opportunity
for breaks was “large,” but the incidence was “small” (see Section 5.4); a Poisson dis-
tribution is a reasonable assumption.

   c̄ = 20/8 = 2.5        σ̂c = √2.5 = 1.58

   H0.05 = 2.54 for k = 8

Then

   UDL(0.05) = c̄ + H0.05 σ̂c = 2.5 + (2.54)(1.58) = 6.51

Also,

   UDL(0.01) = 2.5 + (3.01)(1.58) = 7.26

Conclusion
The point corresponding to head 8 is above UDL(0.01); this simply supports the intu-
itive visual analysis that head 8 is out of adjustment. Note that the analysis is approxi-
mate since c– < 5, but the conclusions are obvious.

11.9 TWO INDEPENDENT VARIABLES

Introduction
Our emphasis here and elsewhere will be upon planning and analyzing data from stud-
ies to identify sources of trouble; engineering and production personnel then use the
resulting information to reduce that trouble. The ideas presented in Chapter 9 will be
illustrated in these discussions. In particular, omnibus-type independent variables will
be used frequently. Troubleshooting can usually be improved by data collection plans
that employ more than one independent variable. Such plans speed up the process of
finding the sources of trouble with little or no extra effort. This section will consider the
important and versatile case of two independent variables; Section 11.10 will discuss
the case of three independent variables.
When using temperatures of 100°, 120°, 140°, for example, it is said that the inde-
pendent variable (temperature) has been used at three levels. Similarly, if a study con-
siders three machines and two shifts, it is said that the study considers machines at three
levels and shifts at two levels.
Consider a study planned to obtain data at a levels of variable A and b levels of variable B. When data from every one of the (a × b) possible combinations are obtained, the
plan is called a factorial design. When A and B are each at two levels, there are 2² = 4
possible combinations; the design is called a 2² factorial (two-squared factorial). Such
designs are very effective and used frequently.

Two Independent Variables: A 2² Factorial Design


This procedure will be illustrated by Case History 11.5; then the analysis will be discussed.

Case History 11.5


Comparing Effects of Operators and Jigs in a Glass-Beading Jig
Assembly (Cathode-Ray Guns)
Regular daily inspection records were being kept on 12 different hand-operated glass-
beading jigs, sometimes called machines. The records indicated that there were appre-
ciable differences in the number of rejects from different jigs although the parts in use
came from a common source. It was not possible to determine whether the differences
were attributable to jigs or operators without a special study. There was conflicting evi-
dence, as usual. For example, one jig had just been overhauled and adjusted; yet it was
producing more rejects than the departmental average. Consequently, its operator
(Harry) was considered to be the problem.
The production supervisor, the production engineer, and the quality control engineer
arranged an interchange of the operator of the recently overhauled jig with an operator
from another jig to get some initial information. From recent production, prior to inter-
change, 50 units from each operator were examined for two quality characteristics: (1)
alignment of parts, and (2) spacing of parts. The results for alignment defects in the
morning’s sample are shown in Figure 11.8 in combinations 1 and 4.
Then the two operators interchanged jigs. Again, a sample of 50 units of each oper-
ator’s assembly was inspected; the results from before and after interchange are shown
in Figure 11.8. The same inspector examined all samples.

                         Operator
                     Art          Harry
Machine 9        (1)  5       (2)  6         11/100
Machine 10       (3) 11       (4) 13         24/100
                  16/100       19/100

(each cell: r = 50 units inspected)
Original machines, (1) + (4): 18/100;  interchanged machines, (2) + (3): 17/100

Figure 11.8 Alignment defects found in samples during an interchange of two operators on two
machines. (The number of defects is shown for each operator–machine combination,
each based on a sample of 50 units. See Case History 11.5.)

Alignment Defects
Totals for the 100 cathode-ray guns assembled by each operator are shown at the bot-
tom of Figure 11.8; totals for the 100 guns assembled on each jig are shown at the right;
and the numbers of rejects produced on the original and interchanged machines are
shown at the two bottom corners.

Operators:        Art: 16/100 = 16%;   Harry: 19/100 = 19%
Machines (jigs):  9: 11/100 = 11%;   10: 24/100 = 24%
Interchange:      original machines: 18/100 = 18%;   interchanged machines: 17/100 = 17%

Discussion
The difference in jig performance suggests a problem with jig 10. This difference is
called a jig main effect. (Any significant difference between operators would be called
an operator main effect.) This surprised everyone, but was accepted without any fur-
ther analysis.
The performance of jig 10 is significantly worse than that of jig 9 (see Figure 11.9)
even though it had just been overhauled. The magnitude of the difference is large:

∆ = 24% – 11% = 13%

Neither the small difference observed between the performance of the two opera-
tors nor between the performance of the operators before and after the interchange is
statistically significant.
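The machine comparison behind these conclusions can be reproduced with a small sketch (names and layout are illustrative; H0.05 = 1.39 for k = 2 comes from Table A.8):

```python
from math import sqrt

# Alignment defects from Figure 11.8; 50 units per operator-machine cell
cells = {('jig 9', 'Art'): 5, ('jig 9', 'Harry'): 6,
         ('jig 10', 'Art'): 11, ('jig 10', 'Harry'): 13}

def k2_limits(total_a, total_b, n, h_alpha):
    """ANOM decision limits for comparing two proportions (k = 2),
    each based on n units; h_alpha is taken from Table A.8."""
    p_bar = (total_a + total_b) / (2 * n)
    se = sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - h_alpha * se, p_bar + h_alpha * se

# Machine main effect: jig 9 produced 11/100 rejects, jig 10 produced 24/100
jig9 = cells[('jig 9', 'Art')] + cells[('jig 9', 'Harry')]
jig10 = cells[('jig 10', 'Art')] + cells[('jig 10', 'Harry')]
ldl, udl = k2_limits(jig9, jig10, 100, 1.39)
```

Both machine points fall outside the 5 percent lines (12.2 percent and 22.8 percent), while the same calculation applied to the operator totals and to the before/after totals leaves those points inside their limits.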

Figure 11.9 Alignment comparison shows difference in effect of machines, but not in operators
or before-and-after effect (ANOM); ng = 100, P = 17.5%, UDL = 22.8% (.05) and 24.4% (.01),
LDL = 12.2% (.05) and 10.6% (.01). (Data from Figure 11.8.)

Based on the above information, all operators were called together and a program
of extending the study in the department was discussed with them. The result of the
interchange of the operators was explained; they were much interested and entirely
agreeable to having the study extended.

An Extension of the Interpretation from the Study

One interpretation of such an interchange would be that the two operators performed dif-
ferently when using their own machine than when using a strange machine. Such prefer-
ential behavior is called an interaction; more specifically here, it is an operator–machine
interaction.
A variation of this procedure and interpretation has useful implications. If the oper-
ators were not told of the proposed interchange until after the sample of production
from their own machines had been assembled, then:
1. The number of defectives in combinations 2 and 3 made after the
interchange (Figure 11.8), could well be a consequence of more careful
attention to the operation than was given before the interchange. Since
the number of defects after the interchange is essentially the same as
before, there is no suggestion from the data that “attention to the job”
was a factor of any consequence in this study.
2. It would be possible in other 2² production studies that a proper
interpretation of an observed difference between the two diagonals of
the square (such as in Figure 11.8) might be a mixture (combination)
of "performance on their own machines" and "attention to detail."
Such a mixture is said to represent a confounding of the two possible
explanations.

Spacing Defects

The same cathode-ray guns were also inspected for spacing defects. The data are shown
in Figure 11.10.

Operators:   Art: 6/100 = 6%;   Harry: 8/100 = 8%
Jigs:        9: 3/100 = 3%;   10: 11/100 = 11%

In Figure 11.11, the difference between jigs is seen to be statistically significant for
spacing defects, risk less than 5 percent. Since it seemed possible that the before/after
interchange difference might be statistically significant for α = 0.10, a third pair of decision lines has been included; the pair of points for B, A lie inside them, however. There
is the possibility that the interaction might prove to be statistically significant if a larger
sample were inspected.

                         Operator
                     Art          Harry
Machine 9        (1)  2       (2)  1          3/100
Machine 10       (3)  4       (4)  7         11/100
                   6/100        8/100

(each cell: r = 50 units inspected)
Before interchange, (1) + (4): 9/100 = 9%;  after interchange, (2) + (3): 5/100 = 5%
(labeled B and A, respectively, in Figure 11.11)

Figure 11.10 Spacing defects found in samples during an interchange of two operators on two
machines. (See Figure 11.11.)

Figure 11.11 Spacing defects comparison showing differences in effect of machines, but not in
operators or before-and-after interchange; ng = 100, P = 7.0%, UDL = 9.96% (.10), 10.54% (.05),
11.64% (.01); LDL = 4.04% (.10), 3.46% (.05), 2.36% (.01). (Data from Figure 11.10.)

Formal Analysis

P = 7.0%; k = 2 in each of the three comparisons.

   σ̂P = √( 7.0(93.0) / 100 ) = 2.55%

Decision Lines (in Figure 11.11)

For α = 0.05, H0.05 = 1.39

   UDL(0.05) = P + H0.05 σ̂P = 7.0 + (1.39)(2.55) = 10.54%

   LDL(0.05) = P − H0.05 σ̂P = 7.0 − (1.39)(2.55) = 3.46%

For α = 0.10, H0.10 = 1.16

   UDL(0.10) = P + H0.10 σ̂P = 7.0 + (1.16)(2.55) = 9.96%

   LDL(0.10) = P − H0.10 σ̂P = 7.0 − (1.16)(2.55) = 4.04%

Two Independent Variables: A Typical a × b Factorial Design


The design of experiments often has been described with examples from agriculture
and the chemical industries. It is at least equally important to use many of the same concepts
in the electrical, electronic, and mechanical fields, but rather less complicated versions
of them are recommended for troubleshooting studies in industry. The following exam-
ple represents a small factorial experiment that was carried out three times in three days.

Case History 11.6


A Multistage Assembly26
In many complicated assembly operations, we do not find the problems until the final
inspection report is made at the end of the assembly. Sometimes it is feasible to carry
an identification system through assembly and establish major sources of trouble on a
regular, continuing basis. Many times, however, it is not feasible to maintain an identi-
fication system in routine production. In the study considered here, no one could deter-
mine who was responsible for a poor-quality or inoperative unit found at final inspection.
This operator says, “They are good when they leave me”; another says, “It’s not my
fault.” No one accepts the possibility of being responsible.
We discuss an experience just like this in which a routing procedure through assem-
bly was established on a sampling basis. The procedure was conceived in frustration; it
is remarkably general and effective in application.
A particular new audio component was well-designed in the sense that those car-
tridges that passed the final electrical test performed satisfactorily. However, too many
acoustical rejects were being found at final testing, and the need of an engineering
redesign had been considered and was being recommended.

26. E. R. Ott, “Achieving Quality Control,” Quality Progress (May, 1969). (Figures 11.12, 11.14, 11.15, and 11.16
reproduced by permission of the editors.)
Chapter 11: Troubleshooting with Attributes Data 347

There are, of course, many engineering methods for improving the design of almost
any product; a redesign is often an appropriate approach to the solution of manufactur-
ing quality problems. Perhaps a change in certain components or materials or other
redesign is considered essential. But there is an alternative method, too frequently over-
looked, and this is to determine whether the components, the assembly operators, and
the jigs or fixtures—any aspect of the entire production process—are capable of a major
improvement.
The following study was planned to compare assembly operators at two of the many
stages of a complicated assembly process.
The two stages were chosen during a meeting called to discuss ways and means.
Present were the production supervisor, the design engineer, and the specialist on plan-
ning investigations. During the meeting, different production stages thought to be pos-
sible contributors to the acoustical defects were suggested and discussed critically. Two
omnibus-type variables were chosen for inclusion in this exploratory study; then at
every other stage, each tray in the study was processed at the same machine, by the same
operator, in the same way throughout as consistently as possible. No one was at fault, of
course not. The purpose of the study was to determine whether substantial improve-
ments might be possible within the engineering design by finding ways to adjust assem-
bly procedures.
• Four operators at stage A, with their own particular machines, were included;
the four operator–machine combinations27 are designated as A1, A2, A3, and A4.
• Also included in the study were three operators performing a hand operation
using only tweezers; these operators are designated as C1, C2, and C3.
• A three-by-four factorial design with every one of the 3 × 4 = 12 combinations
of A and C was planned. Twelve trays—or perhaps 16—can usually be
organized and carried around a production floor without mishap. More than
15 or 16 is asking for trouble. Each standard production tray had spaces for
40 components.
• A routing ticket as in Figure 11.12 was put in each of the 12 trays to direct
passage through the assembly line. The 12 trays were numbered 1, 2, 3, . . . ,
12. All components were selected at random from a common source.
• Each unit was inspected for all three listed defects, (a), (b), and (c), and the
entries in Figure 11.13 indicate the number of each type of defect found at
inspection. Since defect type (c) was found most frequently, Figure 11.14
is shown for type (c) only. A mental statistical analysis, or “look test” at
the data in Figure 11.14, indicates clearly that C2 is an operator producing
many rejects. In addition, A1 is substantially the best of the four operator–
machine combinations.

27. The operators and machines were confounded deliberately. This was an exploratory study, and problems associ-
ated with interchanging operators and machines were considered excessive in comparison to possible advantages.

Tray 1 routing card

  Operation   Position
  A           Op. A 1
  B           Mch. B 1*
  C           Op. C 1
  D           Op. D 3*
  E           Insp. S*

  No. units inspected: 40
  No. of rejects found:
    Type a: 0
    Type b: 0
    Type c: 0

  Date: 4/9    Inspector: S

*Same for all 12 trays
Figure 11.12 Routing card used to obtain data on an audio component assembly.

r = 40 per cell                 Operator (stage A)
             A1         A2         A3         A4
          a: 0       a: 1       a: 0       a: 0
   C1     b: 0       b: 1       b: 0       b: 0
          c: 0       c: 1       c: 7       c: 3
Operator
          a: 0       a: 0       a: 0       a: 2
   C2     b: 0       b: 0       b: 0       b: 2
          c: 2       c: 14      c: 8       c: 5

          a: 0       a: 0       a: 0       a: 0
   C3     b: 0       b: 1       b: 2       b: 1
          c: 1       c: 5       c: 2       c: 6

Figure 11.13 Record of the defects of each type found in the first study, arranged according
to the combination of operators from whom they originated, for defect types a,
b, and c.

• This was surprising and important information. Since not all results are as
self-evident, an analysis is given in Figure 11.15, first for columns A and
then for rows C.
• This analysis shows that the differences were large and of practical significance
as well as being statistically significant. In addition, based on differences
indicated by this study, certain adjustments were recommended by engineering
and manufacturing following a careful study and comparison of the operators
at their benches. The study was repeated two days later. The very few type c

r = 40 per cell
          A1       A2       A3       A4        Σ
   C1      0        1        7        3       11/160 = 6.9%
   C2      2       14        8        5       29/160 = 18.1%
   C3      1        5        2        6       14/160 = 8.75%

   Σ     3/120   20/120   17/120   14/120    54/480
         2.5%    16.6%    14.2%    11.6%     11.2%

Figure 11.14 Defects of type c only. (Data from Figure 11.13.) Total defects of this type shown at
right and at bottom by operator.

[Figure 11.15 is an ANOM chart of the percent defective for A1–A4 (ng = 120) and C1–C3 (ng = 160), plotted about P̄ = 11.2%, with decision lines for A at UDL = 17.4% (0.05) and 18.8% (0.01), LDL = 5.0% (0.05) and 3.6% (0.01); and for C at UDL = 16.0% (0.05) and 17.2% (0.01), LDL = 6.4% (0.05) and 5.2% (0.01).]

Figure 11.15 Comparing significant effects of operator/machine combinations A and C (ANOM)
(type c defects). (Data from Figure 11.14.)

defects found are shown in Figure 11.16. The greatly improved process average
of P̄ = (6)(100)/480 = 1.2 percent compares strikingly with the earlier P̄ = 11.2
percent that had led to considering an expensive redesign of components to
reduce rejects. The improvement was directed to the same people who had
decided that a redesign was the only solution to the problem.
• Formal analysis is shown for Figures 11.13 and 11.14. The overall percent
defective is

r = 40 per cell
          A1     A2     A3     A4      Σ
   C1      0      0      0      1      1
   C2      0      0      3      1      4
   C3      0      0      0      1      1

   Σ       0      0      3      3     6/480 = 1.2%

Figure 11.16 Number of defects found in second study of audio component assemblies
(type c defects).


P̄ = 100(54)/(12)(40) = 5400/480 = 11.2%
Then

    σ̂_P = √[(11.2)(88.8)/ng]

where ng = (3)(40) = 120 when comparing columns A, giving σ̂_P = 2.9%; and
ng = (4)(40) = 160 when comparing rows C, giving σ̂_P = 2.5%.
Decision lines for the analysis of means are as follows:

    P̄ ± Hα σ̂_P

Columns A: k = 4, σ̂_P = 2.9%, H0.05 = 2.14, H0.01 = 2.61

      α = 0.05        α = 0.01
      UDL = 17.4      UDL = 18.8
      LDL = 5.0       LDL = 3.6

Rows C: k = 3, σ̂_P = 2.5%, H0.05 = 1.91, H0.01 = 2.38

      α = 0.05        α = 0.01
      UDL = 16.0      UDL = 17.2
      LDL = 6.4       LDL = 5.2
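As a numerical check, the sketch below recomputes the column and row decision lines in Python from the Figure 11.14 counts (not from the text; it carries full precision rather than the rounded σ̂_P values, so results may differ in the last decimal):

```python
import math

# Type-c defect counts from Figure 11.14 (rows C1..C3, columns A1..A4), r = 40 per cell
counts = [[0, 1, 7, 3],
          [2, 14, 8, 5],
          [1, 5, 2, 6]]
r = 40
p_bar = 100.0 * sum(map(sum, counts)) / (12 * r)  # 5400/480 = 11.25%

def limits(n_g, h):
    s = math.sqrt(p_bar * (100.0 - p_bar) / n_g)
    return p_bar + h * s, p_bar - h * s

udl_a, ldl_a = limits(3 * r, 2.14)  # columns A: k = 4, n_g = 120, H_0.05 = 2.14
udl_c, ldl_c = limits(4 * r, 1.91)  # rows C:    k = 3, n_g = 160, H_0.05 = 1.91

col_pct = [100.0 * sum(row[j] for row in counts) / (3 * r) for j in range(4)]
row_pct = [100.0 * sum(row) / (4 * r) for row in counts]
# A1 (2.5%) falls below ldl_a, and row C2 (18.1%) exceeds udl_c, as in Figure 11.15
```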

Case History 11.7


Machine Shutdowns (Unequal ri)
This case history presents a variation in the type of data employed. It is different also
in that the sample sizes are not all equal.
It is routine procedure in some plants to designate the source of data by the
machine, shift, and/or operator. The data recorded in Table 11.9 serve to illustrate a pro-
cedure of obtaining data for a process improvement project and of providing an analysis
when sample sizes are not all equal.28 The data relate to the performance of five differ-
ent presses (machines) over three shifts. Table 11.9 indicates both the number of times
each press was checked and the number of times its performance was so unsatisfactory
that the press was shut down for repairs. The quality of performance is thus indicated
by frequency of shutdowns. We quote from the Zahniser-Lehman article:
As the product comes from these presses, there are 57 separate characteristics
that require an inspector’s audit. It takes two or three minutes to complete the
examination of a single piece. Yet the combined production from these presses
reaches more than a million a day. To cover this job with a series of charts for
variables would require a minimum of 15 inspectors. . . . (We) use a . . . “shut-
down” chart. The shutdown chart is a p chart on which each point gives a per-
centage of checks resulting in shutdowns for a given press on a given shift
during a two-week period. On this chart, r represents the number of times the
press has been checked on this shift during the particular period. . . .

Table 11.9 Talon's press-shift performance record.

  Press              Number of        Number of
  number    Shift    times checked    shutdowns
  1         A        50               2
            B        55               7
            C        40               4
  2         A        45               3
            B        55               3
            C        55               14
  3         A        40               0
            B        35               3
            C        45               0
  4         A        50               6
            B        55               9
            C        60               11
  5         A        60               4
            B        45               3
            C        60               6

28. J. S. Zahniser and D. Lehman, “Quality Control at Talon, Incorporated,” Industrial Quality Control (March 1951):
32–36. (Reproduced by kind permission of the editor.)

The percentage of checks resulting in shutdowns in the department is 10


percent. The average number of checks per press per shift is 750/15 = 50.
The speed of the presses has been increased by 32 percent since control
charts were first applied. Nevertheless, we find that the percentage of audits
that result in shutdowns has been cut squarely in half. Obviously the quality of
product coming from them also is vastly better than it was three years ago, even
though our charts measure quality only indirectly.
The enhancement of records by charting a continuing history is demonstrated clearly.
The data from Table 11.9 have been rearranged in Table 11.10 in the form of an
obvious three-by-five (3 × 5) factorial design. The numbers in the lower right-hand cor-
ner refer to the number of check inspections made. The percent of all shutdowns for
each shift (across the five presses) is indicated in the column at the right. The percent
of shutdowns for each press across the three shifts is shown at the bottom.
Press Performance
The percent of shutdowns on presses 1 through 5 is

9.0%, 12.9%, 2.5%, 15.7%, and 7.9%

The excellent performance (2.5 percent) of press 3 should be of most value for
improving the process.
We compute tentative decision lines as shown in Figure 11.17a for risks (0.05) and
(0.01). These will be adjusted subsequently.
Press Performance (k = 5, n̄ = 750/5 = 150)
Press 3. This is below the tentative (0.01) decision line. A look now at Table 11.10
shows that press 3 was checked only 120 times (compared to the average of n̄ = 150).
A recomputation of decision limits for ng = 120 shows a slight change in limits, which
does not affect materially the conclusion that press 3 performs significantly better than
the overall average.

Table 11.10 Table of shutdowns.

                               Press
  Shift      1        2        3        4        5       Total, percent
  A        d = 2      3        0        6        4       15/245 = 6.1
           r = 50    45       40       50       60
  B          7        3        3        9        3       25/245 = 10.2
            55       55       35       55       45
  C          4       14        0       11        6       35/260 = 13.4
            40       55       45       60       60

  Total   13/145   20/155   3/120   26/165   13/165     P̄ = 75/750 = 10%
  Percent   9.0     12.9     2.5     15.7      7.9

[Figure 11.17 shows two ANOM charts plotted about P̄ = 10.0%: (a) presses 1–5 with n̄ = 150 and decision limits UDL = 15.6% (0.05), 16.8% (0.01), LDL = 4.4% (0.05), 3.2% (0.01); (b) shifts A–C with n̄ = 250 and decision limits UDL = 13.67% (0.05), 14.54% (0.01), LDL = 6.33% (0.05), 5.46% (0.01).]

Figure 11.17 Comparing number of press shutdowns by press and by shift. (Data from Table 11.10.)

Press 4. This is just on the tentative (0.05) decision line. A check shows that this press
was checked ng = 165 times—more than average. A recomputation of decision lines
using ng = 165 instead of 150 will shrink the limits for press 4 slightly; see Figure 11.18.
There is reason to expect that some way can be found to improve its performance.
Adjusted Decision Lines
Decision lines are now obtained for the actual values ni, with α = 0.05 for press 4 and
α = 0.01 for press 3. The lines in Figure 11.17a were computed for the average n̄ = 150.
Only presses 3 and 4 are near the decision lines and are the only ones for which the deci-
sion might be affected when the actual values of n are used to recompute.
Press 3 (LDL). ng = 120 from Table 11.10.

    σ̂_P = √[(10)(90)/120] = 2.74%

    LDL(0.01) = 10.0 − (2.75)(2.74) = 2.46%

Press 4 (UDL). ng = 165 from Table 11.10.

    σ̂_P = √[(10)(90)/165] = 2.33%

    UDL(0.05) = 10.0 + (2.29)(2.33) = 15.34%

[Figure 11.18 repeats the press chart of Figure 11.17a about P̄ = 10.0%, with the press 4 limit recomputed for ng = 165 (UDL = 15.34% at 0.05) and the press 3 limits recomputed for ng = 120 (LDL = 3.73% at 0.05 and 2.46% at 0.01).]

Figure 11.18 Figure 11.17 redrawn and decision limits recomputed using actual ni instead of
average n̄ for two borderline points. (Data from Table 11.10.)

These two changes in decision lines are shown in Figure 11.18 by dotted lines. The
slight changes would hardly affect the decision on whether to investigate. Unless the
variation of an individual ng from the average n̄ is as much as 40 percent, it will probably
not warrant recomputation.29
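The recomputation for an individual ng can be sketched as follows (Python, using the H factors quoted above; only σ̂_P changes with ng, exactly as in the text):

```python
import math

P_BAR = 10.0  # overall percent of checks resulting in shutdowns (75/750)

def decision_line(h, n_g, upper=True):
    """ANOM decision line recomputed for one group's actual sample size n_g."""
    s = math.sqrt(P_BAR * (100.0 - P_BAR) / n_g)
    return P_BAR + h * s if upper else P_BAR - h * s

ldl_press3 = decision_line(2.75, 120, upper=False)  # H_0.01, n_g = 120 -> about 2.47%
udl_press4 = decision_line(2.29, 165, upper=True)   # H_0.05, n_g = 165 -> about 15.35%
```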
Shift Performance (k = 3, n = 750/3 = 250)
Shift A. This is beyond (0.05) decision limit—and the shutdown rate is just about one-
third that of shift C. Possible explanations include:
1. More experienced personnel on shift A.
2. Better supervision on shift A.
3. Less efficient checking on press performance.
4. Better conditions for press performance (temperature, lighting, humidity,
as examples).
A comparison of shifts A and C to determine differences would seem in order.

29. Since the sample sizes were not very different, this analysis did not use the Sidak hα* factors available for use
with unequal sample sizes (see Table A.19). This kept the analysis as simple and straightforward as possible. See
L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1
(January 1983): 43–44.

11.10 THREE INDEPENDENT VARIABLES

Introduction
Some may have had actual experience using multifactor experimental designs. Some
have not, and it is fairly common for new converts to propose initial experiments with
four, five, and even more independent variables.
Anyone beginning the use of more than one independent variable (factor) is well
advised to limit the number of variables to two. There are several reasons for this rec-
ommendation. Unless someone experienced in experimental design methods is guiding
the project, something is almost certain to go wrong in the implementation of the plan,
or perhaps the resulting data will present difficulties of interpretation. These typical dif-
ficulties can easily lead to plantwide rejection of the entire concept of multivariable
studies. The advantage of just two independent variables over one is usually substan-
tial; it is enough to warrant a limitation to two until confidence in them has been estab-
lished. Reluctance to use more than one independent variable is widespread—it is the
rule, not the exception. An initial failure of the experiment can be serious, and the risk
is seldom warranted.
After some successful experience with two independent variables (including
acceptance of the idea by key personnel), three variables often offer opportunities for
getting even more advantages. The possibilities differ from case to case, but they include
economy of time, materials, and testing, and a better understanding of the process being
studied. Some or all of the three variables should ordinarily be of the omnibus type, espe-
cially in troubleshooting expeditions.
There are two designs (plans) using three independent factors that are especially
useful in exploratory projects in a plant. The 2³ factorial design (read “two cubed”)—
that is, three independent variables each at two levels—is least likely to lead to mix-ups
in carrying out the program and will result in minimal confusion when interpreting
results. Methods of analysis and interpretation are discussed in Case Histories 11.8
and 11.9.
The 2³ factorial design provides for a comparison of the following effects:
1. Main effects of each independent variable (factor).
2. Two-factor interactions of the three independent variables. These different
comparisons are obtained with the same number of experimental units as
would be required for a comparison of the main effects and at the same level
of significance. An added advantage is that the effect of each factor can be
observed under differing conditions of the other two factors.
3. Three-factor interaction of the three variables. Technically this is a possibility.
Experience indicates, however, that this is of lesser importance in trouble-
shooting projects with attributes data. The mechanics of three-factor interaction
analysis will be indicated in some of the case histories.

Case History 11.8


Strains in Small Glass Components
The production of many industrial items involves combinations of hand operations with
machine operations. This discussion will pertain to cathode-ray tubes. Excessive fail-
ures had arisen on a reliable type of tube at the “boiling water” test, in which samples
of the completed tubes were submerged in boiling water for 15 seconds and then plunged
immediately into ice water for five seconds. Too many stem cracks were occurring
where the stem was sealed to the wall tubing. A team was organized to study possible
ways of reducing the percent of cracks. They decided that the following three questions
related to the most probable factors affecting stem strength:
1. Should air be blown on the glass stem of the tube after sealing? If so, at what
pounds per square inch (psi)?
2. Should the mount be molded to the stem by hand operation or by using a
jig (fixture)?
3. The glass stems can be made to have different strain patterns. Can the trouble
be remedied by specifying a particular pattern?
Factors Selected
It was decided to include two levels of each of these three factors.

  Air:            A1: 2.5 psi air blown on stem after sealing
                  A2: No air blown on stem after sealing
  Jig:            B1: Stem assembly using a jig
                  B2: Stem assembly using only hand operations (no jig)
  Stem tension:   C1: Normal stem (neutral to slight compression)
                  C2: Tension stem (very heavy)

Approximately 45 stems were prepared under each of the eight combinations of fac-
tors, and the number of stem failures on the boiling water test is indicated in the squares
of Table 11.11a and b. These are two common ways of displaying the data. Manufacturing
conditions in combination 1, for example, were at levels A1, B1, C1, for r = 42 stems; there
were two stem cracks. Stems in combination 2 were manufactured under conditions A2,
B1, C2, and so forth.
Main Effects
Half the combinations were manufactured under air conditions A1; those in the four
groups (1), (3), (5), and (7). The other half (2), (4), (6), and (8) were manufactured

Table 11.11 A study of stem cracking: a 2³ production design. Each cell shows the combination
number, the number of stem cracks, and the number of stems r.

(a)
         A1              A2                       A1              A2
  B1     (1) C1: 2       (2) C2: 0         B1    (5) C2: 6       (6) C1: 0
         r = 42          r = 44                  r = 44          r = 44
  B2     (3) C2: 7       (4) C1: 4         B2    (7) C1: 9       (8) C2: 1
         r = 45          r = 44                  r = 42          r = 47

  p̄ = 29/352 = 0.082

(b)
               A1                          A2
         B1          B2              B1          B2
  C1     (1) 2/42    (7) 9/42        (6) 0/44    (4) 4/44
  C2     (5) 6/44    (3) 7/45        (2) 0/44    (8) 1/47

under air conditions A2. Then, in comparing the effect of A1 versus A2, data from these
two groupings are pooled:

A1: (1,3,5,7) had 24 stem cracks out of 173; 24/173 = 0.139.

A2: (2,4,6,8) had 5 stem cracks out of 179; 5/179 = 0.028.

Similarly, combinations for the Bs and Cs are displayed in Table 11.12.


Points corresponding to values shown in Table 11.12 have been plotted in Figure
11.19. The logic and mechanics of computing decision lines are given below, following
a discussion on decisions.
Decisions from Figure 11.19
The most important single difference is the advantage of A2 over A1; 2.8 percent versus
13.9 percent. The magnitude of this difference is of tremendous importance and is sta-
tistically significant. Fortunately, it was a choice that could easily be made in production.

Table 11.12 Computations for analysis of means. (Data from Table 11.11.)

Main effects:
  Air:            A1: (1,3,5,7): 24/173 = 0.139
                  A2: (2,4,6,8):  5/179 = 0.028
  Stem assembly:  B1: (1,2,5,6):  8/174 = 0.046
                  B2: (3,4,7,8): 21/178 = 0.118
  Stem tension:   C1: (1,4,6,7): 15/172 = 0.087
                  C2: (2,3,5,8): 14/180 = 0.078

Two-factor interactions:
  AC: (Like)   (1,2,7,8): 12/175 = 0.069
      (Unlike) (3,4,5,6): 17/177 = 0.096
  AB: (Like)   (1,4,5,8): 13/177 = 0.073
      (Unlike) (2,3,6,7): 16/175 = 0.091
  BC: (Like)   (1,3,6,8): 10/178 = 0.056
      (Unlike) (2,4,5,7): 19/174 = 0.109

Three-factor interaction:
  ABC: (Odd*)  (1,2,3,4): 13/175 = 0.074
       (Even*) (5,6,7,8): 16/177 = 0.090

*Note: Treatment combinations for three-factor interactions (and higher) are grouped according
to subscript totals being either odd or even.

[Figure 11.19 is an ANOM chart (n̄ = 176) plotting the fractions from Table 11.12: the main effects A1, A2, B1, B2, C1, C2 and the interactions AC, AB, BC (Like/Unlike) and ABC (Odd/Even), about p̄ = 0.082, with decision lines UDL = 0.111 (0.05) and 0.120 (0.01), LDL = 0.053 (0.05) and 0.044 (0.01).]

Figure 11.19 Comparing effects of three factors on glass stem cracking: three main effects and
their interactions. (Data from Table 11.12.)

A second result was quite surprising; the preference of B1 over B2. The converse had
been expected. The approximate level of risk in choosing B1 over B2 is about a = 0.01;
and the magnitude of the advantage—about 7 percent—was enough to be of additional
economic interest.
No two-factor interaction is statistically significant at the 0.05 level; the BC
interaction is significant “just barely” at the 0.10 level. This latter decision can be checked by
using the factor H0.10 = 1.16 from Table A.14, using df = ∞, since this is for attributes. Then the
recommended operating conditions are A2B1, and two possibilities follow:

A2B1 and C1 or A2B1 and C2

that is, combinations 2 and 6 with zero and zero cracked stems, respectively.

Some slight support for a choice may be seen from the BC-interaction diagram
(Figure 11.19), which gives “the nod” to those B and C having like subscripts, that is,
to combination 6.
Mechanics of Analysis—ANOM (Figure 11.19)
For the initial analysis, we consider all eight samples to be of size n̄ = 44. Then each
comparison for main effects and interactions will be between k = 2 groups of four sam-
ples each, with ng = 4(44) = 176 stems.
The total number of defectives in the eight combinations is 29.

    p̄ = 29/352 = 0.082 or 8.2%

and

    σ̂_P = √[(8.2)(91.8)/176] = 2.1%, that is, σ̂_p = 0.021

Decision Lines (for Figure 11.19)


For α = 0.05, H0.05 = 1.39

    UDL(0.05) = p̄ + H0.05σ̂_p = 0.082 + (1.39)(0.021) = 0.111

    LDL(0.05) = p̄ − H0.05σ̂_p = 0.082 − (1.39)(0.021) = 0.053

For α = 0.01, H0.01 = 1.82

    UDL(0.01) = p̄ + H0.01σ̂_p = 0.082 + (1.82)(0.021) = 0.120

    LDL(0.01) = p̄ − H0.01σ̂_p = 0.082 − (1.82)(0.021) = 0.044

These decision lines30 are applicable not only when comparing main effects but also
to the three two-factor interactions and to the three-factor interaction. This is because
when dealing with two-level designs we have an exact test. This is not true for other
configurations (see Chapter 15).
Two-Factor Interactions—Discussion
In the study, it is possible that the main effect of variable A is significantly different
under condition B1 than it is under condition B2; then there is said to be an “A × B (‘A
times B’) two-factor interaction.”
If the main effect of A is essentially (statistically) the same for B1 as for B2,
then there is no A × B interaction. The mechanics of analysis for any one of the three

30. Since the sample sizes were not very different, this analysis did not use special procedures available for the
analysis of samples of different size. Use of n– simplifies the analysis here with very little effect on statistical
validity since the sample sizes are almost the same.

possible two-factor interactions have been given in Table 11.12 without any discussion
of their meaning.
In half of the eight combinations, A and B have the same or like (L) subscripts;
namely (1,4,5,8). In the other half, they have different or unlike (U) subscripts; namely
(2,3,6,7). The role of the third variable is always disregarded when considering a two-
factor interaction:

AB: Like (1,4,5,8) has 13/177 = 0.073 or 7.3%


AB: Unlike (2,3,6,7) has 16/175 = 0.091 or 9.1%

The difference between 9.1 percent and 7.3 percent does not seem large; and it is
seen in Figure 11.19 that the two points corresponding to AB(L) and AB(U) are well
within the decision lines. We have said this means there is no significant A × B interac-
tion. This procedure deserves some explanation.
In Figure 11.20, the decreases in stem failures in changing from A1 to A2 are as follows:

Under condition B1: drops from 8/86 = 9.3% to 0/88 = 0%: a drop of 9.3%.
Under condition B2: drops from 16/87 = 18.4% to 5/91 = 5.5%: a drop of 12.9%.

Are these changes significantly different (statistically)? If the answer is “yes,” then
there is an A × B interaction.
Consider the differences under conditions B1 and B2 and assume that they are not
statistically equal.

[Figure 11.20 plots the pooled percent of stem cracks at A1 and A2 under each B level: the B2 line falls from A1B2 (3,7): 16/87 = 18.4% to A2B2 (4,8): 5/91 = 5.5%, and the B1 line falls from A1B1 (1,5): 8/86 = 9.3% to A2B1 (2,6): 0/88 = 0%.]
Figure 11.20 A graphical comparison of effect on stem cracks. The upper line shows the
decrease in changing from A1 to A2 under condition B2; the lower line shows
the decrease in changing from A1 to A2 under condition B1. Intuitively the effects
seem quite comparable. (Data from Table 11.11.)

A1B2 – A1B1 ≠ A2B2 – A2B1 (11.3)


that is, is 9.1% ≠ 5.5% statistically?

This can be rewritten as

A1B2 + A2B1 ≠ A1B1 + A2B2 (11.4)

From Table 11.12, corresponding combinations are

Unlike (2,3,6,7) ≠ Like (1,4,5,8)


that is, is 9.1% ≠ 7.3%?

That is, the two lines are not parallel if the points corresponding to AB—Like and
Unlike—are statistically different, that is, fall outside the decision lines of Figure 11.19.
The relation in Equation (11.4) is a simple mechanical comparison to compute when
testing for AB interaction. The decision lines to be used in Figure 11.19 are exactly those
used in comparing main effects because it is a 2^p design.
Similar combinations to be used when testing for A × C and B × C interactions are
shown in Table 11.12.
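The equivalence of Equations (11.3) and (11.4) is algebraic: moving the two subtracted terms across the inequality turns the difference of differences into the Like/Unlike contrast. A short numerical check (using the pooled cell fractions of Figure 11.20; with unequal cell sizes this differs slightly from the pooled Like/Unlike fractions of Table 11.12):

```python
# Pooled cell fractions from Figure 11.20 (third factor C disregarded)
a1b1, a1b2 = 8 / 86, 16 / 87
a2b1, a2b2 = 0 / 88, 5 / 91

diff_of_diffs = (a1b2 - a1b1) - (a2b2 - a2b1)   # Equation (11.3) form
like_unlike   = (a1b2 + a2b1) - (a1b1 + a2b2)   # Equation (11.4) form
# The two expressions are algebraically identical, which is why comparing the
# Like and Unlike groupings tests the A x B interaction.
```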
Note: Also, the combinations used to test for a three-factor, A × B × C, interaction
are given in Table 11.12. If an apparent three-factor interaction is observed, it should be
recomputed for possible error in arithmetic, choice of combinations, or unauthorized
changes in the experimental plan, such as failure to “hold constant” or randomize—all
factors not included in the program. With very few exceptions, main effects will provide
most opportunities for improvement in a process; but two-factor interactions will some-
times have enough effect to be of interest.

Case History 11.9


A Problem in a High-Speed Assembly Operation (Broken Caps)

Cracked and Broken Caps (See Figure 11.21)


Cracked and broken caps are a constant headache in pharmaceutical firms. One aspect
of the problem was discussed in Case History 11.4. It was recognized that too many
caps were being cracked or broken as the machine screwed them onto the bottles.
Besides the obvious costs in production, there is the knowledge that some defective caps
will reach the ultimate consumer; no large-scale 100 percent inspection can be perfect.
When the department head was asked about the problem we were told that it had been
investigated; “We need stronger caps.” Being curious, we watched the operation awhile.
It seemed that cracked caps were not coming equally from the four lines.

Figure 11.21 Components in a toiletry assembly. The complete labeled assembly: a soft plastic
ring; a hard plastic marble; a hard plastic cap; a glass bottle. (Case History 11.9.)

Production was several hundred per minute. The filling–capping machine had four
lines adjacent to four capper positions. At each, beginning with an empty bottle, a soft
plastic ring was squeezed on, a filler nozzle delivered the liquid content, a hard plas-
tic marble was pressed into the ring, and a cap was applied and tightened to a desig-
nated torque.
Each of the four lines had a separate mechanical system, including a capper and
torque mechanism. An operator–inspector stood at the end of the filling machine,
removing bottles with cracked caps whenever they were spotted. We talked to the
department head about the possibilities of a quick study to record the number of
defects from the four capper positions. There were two possibilities: a bucket for each
position where defective caps could be thrown and then counted, or an ordinary sheet
of paper where the operator–inspector could make tally marks for the four positions.
The department head preferred the second method and gave necessary instructions.
After some 10 or 15 minutes, the tally showed:

Capper head Cracked caps


1 2
2 0
3 17
4 1

We copied the record and showed it to the department head who said, “I told you.
We need stronger caps!” Frustrated a bit, we waited another hour or so for more data.

Capper head Cracked caps


1 7
2 4
3 88
4 5

We now discussed the matter for almost 30 minutes, and the department head was
sure the solution was stronger caps. Sometimes we were almost convinced! Then it was
agreed that capper head 3 could hardly be so unlucky as to get the great majority of
weak caps. The chief equipment engineer was called and we had a three-way discus-
sion. Yes, they had made recent adjustments on the individual torque heads. Perhaps the
rubber cone in head 3 needed replacement.
Following our discussion and their adjustments to the machine, head 3 was brought
into line with the other three. The improved process meant appreciably fewer cracked
and broken caps in production and fewer shipped to customers. It is rarely possible to
attain perfection. We dismissed the problem for other projects.
Stronger Caps?
Then it occurred to us that we ought to give some thought to the department head’s ear-
lier suggestion: stronger caps. With a lot of experience in this business, the department
head might now possibly be a little chagrined; besides there might be a chance to make
further improvement. So we reopened the matter. Exclusive of the capper heads, why
should a cap break? Perhaps because of any of the following:
1. Caps (C). Wall thickness, angle of threads, distances between threads,
different cavities at molding; irregularities of many dimensions—the
possibilities seemed endless.
2. Bottles (B). The angle of sharpness of threads on some or all molds,
diameters, distances. This seemed more likely than caps.
3. Rings (R). Perhaps they exert excessive pressures some way.
4. Marbles (M). These were ruled out as quite unlikely.
Selection of Independent Variables to Study
The traditional approach would be to measure angles, thickness, distances, and so forth;
this would require much manpower and time.
The question was asked: “How many vendors of caps are there? Of bottles? Of
rings?” The answer was two or three in each case. Then how about choosing vendors as
omnibus variables? Since a 2³ study is a very useful design, we recommended the
following vendors: two for caps, two for bottles, and two for rings. This was accepted as a
reasonable procedure.

A 2³ Assembly Experiment
Consequently, over the next two or three weeks, some 8000 glass bottles were acquired
from each of two vendors B1, B2; also, 8000 caps from each of two vendors, C1, C2; and
8000 rings from each of two vendors, R1, R2.
It took some careful planning by production and quality control to organize the
flow of components through the filling and capping operation and to identify cracked
caps with the eight combinations. The assembly was completed in one morning. The
data are shown in Tables 11.13a and b. From the table alone, it seemed pretty clear that

Table 11.13 Data from a 2³ factorial production study of reasons for cracked caps. (Data from
Case History 11.9, displayed on two useful forms.)

(a)
              B1           B2                          B1           B2
    R1    C1 (1)  9    C2 (2) 33             R1    C2 (5) 51    C1 (6) 27
    R2    C2 (3) 20    C1 (4) 21             R2    C1 (7) 13    C2 (8) 28

(b)
                   B1                     B2
              R1         R2          R1         R2
    C1    (1)  9     (7) 13      (6) 27     (4) 21
    C2    (5) 51     (3) 20      (2) 33     (8) 28

The number* of assemblies in each combination was 2000; the number of cracked and broken caps is
shown in the center.

    c̄ = 202/8 = 25.25
    σ̂_c = √c̄ = √25.25 = 5.02 for individuals
    σ̂_c̄ = σ̂_c/√4 = 5.02/2 = 2.51 when comparing averages of four cells

    ΣB1(1,3,5,7) = 93;  B̄1 = 23.25        ΣC1(1,4,6,7) = 70;  C̄1 = 17.5
    ΣB2(2,4,6,8) = 109; B̄2 = 27.25        ΣC2(2,3,5,8) = 132; C̄2 = 33.0
    ΣR1(1,2,5,6) = 120; R̄1 = 30.0
    ΣR2(3,4,7,8) = 82;  R̄2 = 20.5

* We have chosen to analyze the data as being Poisson type.
Chapter 11: Troubleshooting with Attributes Data 365

the difference in cracked caps between vendors C2 and C1 must be more than just
chance; 132 compared to 70. (Also, see Figure 11.22 and Table 11.14.)
It was a surprise to find that combination 5, B1R1C2, which produced 51 rejects, was
assembled from components all from the same vendor!
Formal Analysis (See Figure 11.22)
The percent of cracked caps is small and ng is large; so we shall simplify computations
by assuming a Poisson distribution as a close approximation to the more detailed analy-
sis using the binomial.
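The arithmetic of this Poisson-based analysis is easy to verify. The following Python sketch (an illustration added for this discussion, not part of the original case history) reproduces c̄, the standard errors, and the six vendor averages from the cell counts of Table 11.13; the Hα factors 1.39 (α = .05) and 1.82 (α = .01) are the values used in this chapter for comparing two averages by ANOM.

```python
import math

# Cracked-cap counts for the eight vendor combinations of Table 11.13
# (2000 assemblies per cell); keys are the combination numbers (1)-(8).
cells = {1: 9, 2: 33, 3: 20, 4: 21, 5: 51, 6: 27, 7: 13, 8: 28}

c_bar = sum(cells.values()) / len(cells)      # 202/8 = 25.25
sigma_c = math.sqrt(c_bar)                    # Poisson: sigma = sqrt(mean), about 5.02
sigma_cbar = sigma_c / math.sqrt(4)           # each vendor average pools four cells

H = {0.05: 1.39, 0.01: 1.82}                  # ANOM factors for k = 2 averages
lines = {a: (c_bar - h * sigma_cbar, c_bar + h * sigma_cbar) for a, h in H.items()}

# Average cracked caps for each vendor level, over its four cells
groups = {"B1": (1, 3, 5, 7), "B2": (2, 4, 6, 8),
          "C1": (1, 4, 6, 7), "C2": (2, 3, 5, 8),
          "R1": (1, 2, 5, 6), "R2": (3, 4, 7, 8)}
avg = {k: sum(cells[i] for i in v) / 4 for k, v in groups.items()}
```

Both cap averages fall outside the α = .01 decision lines, in line with the discussion that follows; small differences in the last digit relative to the text come from rounding σ̂_c̄.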

[Figure 11.22: ANOM chart of the average number of cracked caps for the main effects (B1, B2;
R1, R2; C1, C2) and the two-factor interactions (BC, BR, CR), plotted about c̄ = 25.25 with
decision lines UDL = 28.78, LDL = 21.72 (α = .05) and UDL = 29.80, LDL = 20.70 (α = .01);
ng = 8000.]

Figure 11.22 Comparing effects of bottles, rings, and caps from different vendors on cracked caps
(main effects and two-factor interactions). (Data from Tables 11.13b and 11.14.)

Table 11.14 Computations for two-factor interactions—cracked caps. (Data from Table 11.13.)

    Decision lines:
      α = .05:
        UDL = c̄ + H.05 σ̂_c̄ = 25.25 + (1.39)(2.5) = 25.25 + 3.53 = 28.78
        LDL = 25.25 – 3.53 = 21.72
      α = .01:
        UDL = 25.25 + (1.82)(2.5) = 25.25 + 4.55 = 29.80
        LDL = 25.25 – 4.55 = 20.70

    Computations for two-factor interactions:
      BC: Like (1,2,7,8)   = 22 + 61 =  83    Average = 20.75
          Unlike (3,4,5,6) = 48 + 71 = 119              29.75
                               Total   202
      BR: Like (1,5,4,8)   = 60 + 49 = 109              27.25
          Unlike (2,6,3,7) = 33 + 60 =  93              23.25
                               Total   202
      CR: Like (1,3,6,8)   = 36 + 48 =  84              21.0
          Unlike (2,4,5,7) = 34 + 84 = 118              29.5
                               Total   202

Summary/Discussion (See Figure 11.22)


Caps. The largest difference is between vendors of caps. The difference is of practical
interest as well as being statistically significant (α < 0.01); the overall breakage from
C2 is almost twice that of C1. In addition, C2 caps crack more often than C1 caps for each
of the four bottle–ring combinations. This is a likely reason that the department head
had believed they needed “stronger caps,” since any change from vendor C1 to C2 would
increase cracking. Thus the reason for excessive cracking was from a joint origin; the
adjustment of the capper had effected an improvement, and the selection of cap vendor
offers some opportunities. There were good reasons why it was not feasible for pur-
chasing to provide manufacturing with only the best combination B1R1C1. What should
be done? The vendor of C2 can study possible reasons for excessive weakness of the
caps: Is the trouble in specific molding cavities? Is it in general design? Is it in plastic
or molding temperatures? In certain critical dimensions?
In the meantime, can purchasing favor vendor C1? Production itself should avoid
using the very objectionable B1R1C2 combination. (A rerun of the study a few days later
showed the same advantage of C1 over C2, and that B1R1C2 gave excessive rejects.)
Other Main Effects
Rings. The effect of rings is seen to be statistically significant (α about five percent);
the magnitude of the effect is less than for caps. It can also be seen that the effect of
rings when using C2 is quite large (combining B1 and B2); the effect is negligible when
using C1. This interdependence is also indicated by the CR interaction.
Bottles. The effect of bottles as a main effect is the least of all three. (Most surprising to
us!) However, let us consider the half of the data shown in Table 11.15; these data are from
Table 11.13b for only the better cap C1. The data for this 2² design, using C1 only, show

Table 11.15 Effects of rings and bottles using only caps C1.
(Data from Table 11.13b.)

              B1           B2
    R1    (1)  9       (6) 27       36
    R2    (7) 13       (4) 21       34
              22           48       c̄ = 17.5

    R̄1 = 18      B̄1 = 11
    R̄2 = 17      B̄2 = 24

[Figure 11.23: ANOM chart of R̄1, R̄2, B̄1, B̄2 for caps C1 only, plotted about c̄ = 17.5
with decision lines UDL = 22.89 and LDL = 12.11 (α = .01); ng = 4000.]

Figure 11.23 Comparing effects of bottles and rings from different vendors when using caps
from the better vendor. (Data from Table 11.15.)

a definite advantage of using bottle B1 in any combination with R. Also in Figure 11.23, the
advantage is seen to be statistically significant, α = 0.01. (Note: this is a BC interaction.)
Formal Analysis, Caps C1 Only (See Figure 11.23)

    c̄ = 70/4 = 17.5;  σ̂_c = √17.5 = 4.18;  σ̂_c̄ = 4.18/√2 = 2.96 for averages of two cells.

Then

    UDL(0.01) = 17.50 + (1.82)(2.96)
              = 17.50 + 5.39
              = 22.89
    LDL(0.01) = 17.50 – 5.39 = 12.11

11.11 A VERY IMPORTANT EXPERIMENTAL DESIGN: 1/2 × 2³

The 2² and 2³ designs are useful strategies31 to use initially when investigating problems
in many industrial processes. A third equally important strategy, discussed here, is more
or less a combination of the 2² and 2³ designs. It enables the experimenter to study
effects of three different factors with only four combinations of them instead of the eight

31. See also Section 14.9.



Table 11.16 Two special halves of a 2³ factorial design.

    (a)                              (b)
              P1         P2                    P1         P2
    F1    T1 (1)     T2 (2)          F1    T2 (5)     T1 (6)
    F2    T2 (3)     T1 (4)          F2    T1 (7)     T2 (8)

in a 2³ design. This “half-rep of a two cubed” design is especially useful in exploratory
studies in which the quality characteristics are attributes.
In Table 11.16a and b, two particular halves of a complete 2³ factorial design are
shown. Some data based on this design are shown in Figure 11.24.
The reasons for choosing only a special half of a 23 factorial design are reductions
in time, effort, and confusion. Especially when it is expected that the effects of the three
factors are probably independent, one should not hesitate to use such a design. In this
design, the main effect of any variable may possibly be confounded with an interaction
of the other two variables. This possible complication is often a fair price to pay for the
advantage of doing only half the experimental combinations.
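That confounding can be made concrete with a small sketch (an illustration, not part of the original text). It codes the two levels of each factor as −1/+1 and keeps the half fraction corresponding to Table 11.16b, whose runs satisfy the defining relation T·P·F = +1; within that half, each main-effect contrast column is identical to the column for the interaction of the other two factors.

```python
from itertools import product

# All eight runs of a 2^3 design in -1/+1 coding, filtered to the half
# fraction with T*P*F = +1 (the half shown in Table 11.16b).
half = [(t, p, f) for t, p, f in product((-1, 1), repeat=3) if t * p * f == 1]

# Contrast columns within the half fraction
T = [t for t, p, f in half]
P = [p for t, p, f in half]
F = [f for t, p, f in half]
PF = [p * f for t, p, f in half]
TF = [t * f for t, p, f in half]
TP = [t * p for t, p, f in half]
# T's column equals PF's, P's equals TF's, and F's equals TP's, so each
# main effect is completely confounded with the opposite two-factor
# interaction in this half-rep.
```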

Case History 11.10


Winding Grids
This case history presents two practical procedures:
1. A half-rep of a 2³ design using attributes data.
2. A graphical presentation of data allowing simultaneous comparisons of four
types of defects instead of the usual single one.
Introduction
Grids are important components of electronic tubes.32 They go through the following
manufacturing steps:
1. They are wound on a grid lathe. There are several grid lathes in use and the
tension T on the lateral wire can be varied on each. After winding, they are
transported to an operator called a hand-puller.

32. F. Ennerson, R. Fleischmann, and D. Rosenberg, “A Production Experiment Using Attribute Data,” Industrial
Quality Control 8, no. 5 (March 1952): 41–44.

(a) Spaciness (ng = 200; 100 grids per cell)               P̄ = 21.5%
              P1           P2                              UDL = 25.5% (.05), 26.8% (.01)
    F1    T1 (1) 19    T2 (2) 13    32/200 = 16%           LDL = 17.5% (.05), 16.2% (.01)
    F2    T2 (3) 26    T1 (4) 28    54/200 = 27%
          P1: 45 = 22.5%    P2: 41 = 20.5%
          T2: 39 = 19.5%    T1: 47 = 23.5%

(b) Taper                                                  P̄ = 9.25%
              P1           P2                              UDL = 13.0% (.01)
    F1    T1  0        T2 17        17/200 = 8.5%          LDL = 5.5% (.01)
    F2    T2 16        T1  4        20/200 = 10%
          P1: 16 = 8%       P2: 21 = 10.5%
          T2: 33 = 16.5%    T1: 4 = 2%

(c) Damaged                                                P̄ = 4.75%
              P1           P2                              UDL = 7.5% (.01)
    F1    T1  0        T2 13        13/200 = 6.5%          LDL = 2.0% (.01)
    F2    T2  3        T1  3        6/200 = 3%
          P1: 3 = 1.5%      P2: 16 = 8%
          T2: 16 = 8%       T1: 3 = 1.5%

(d) Slant                                                  P̄ = 11.25%
              P1           P2                              UDL = 15.3% (.01)
    F1    T1 26        T2  7        33/200 = 16.5%         LDL = 7.2% (.01)
    F2    T2 12        T1  0        12/200 = 6%
          P1: 38 = 19%      P2: 7 = 3.5%
          T2: 19 = 9.5%     T1: 26 = 13%

Figure 11.24 Effects of pullers, formers, and tension on four defect types. (Computations from
Table 11.17.)

2. The hand-puller P uses a pair of tweezers to remove loose wire from the ends.
The loose wire is inherent in the manufacturing design. The grids are then
transported to a forming machine.
3. Each forming machine has its forming operator F. After forming, the grids go
to inspection.
Following these operations, inspectors examine the grids for the four characteris-
tics of a go/no-go attributes nature: spaciness, taper, damage, and slant.
Design of the Experiment
In a multistage operation of this kind, it is difficult to estimate just how much effect the
different steps may be contributing to the total rejections being found at the end of
the line. Since the percent of rejections was economically serious, it was decided to
set up a production experiment to be developed jointly by representatives of production,
production engineering, and quality control. They proposed and discussed the follow-
ing subjects:
• Grid lathes. It was decided to remove any effect of different grid lathes by
using grids from only one lathe. The results of this experiment cannot then
be transferred automatically to other lathes.
However, it was thought that the tension of the lateral wire during winding
on the grid lathe might be important. Consequently, it was decided to include
loose tension T1 and tighter tension T2 in the experiment. Levels of T1 and T2
were established by production engineering.
• Pullers. The probable effect of these operators was expected to be important.
Two of them were included in the experiment, both considered somewhat
near average. They are designated by P1 and P2. Others could be studied later
if initial data indicated an important difference between these two.
• Forming operators. It was judged that the machine operators had more effect
than the machines. Two operators F1 and F2 were included, but both operated
the same machine.
• Inspection. All inspection was done by the same inspector to remove any
possible difference in standards.
• Design of the experiment. Actually, the design of the experiment was being
established during the discussion that led to the selection of three factors and
two levels of each factor.
It would now be possible to proceed in either of two ways:
1. Perform all 2³ = 8 possible combinations, or
2. Perform a special half of the combinations chosen according to an
arrangement such as in Table 11.16a.

Since it was thought that the effects of the different factors were probably
independent, the half-rep design was chosen instead of the full 2³ factorial.
• Number of grids to include in the experiment. The number was selected after
the design of the experiment had been chosen. Since grids were moved by
production in trays of 50, it was desirable to select multiples of 50 in each
of the four squares. It was decided to include 100 grids in each for a total of
400 grids in the experiment. This had two advantages: it allowed the entire
experiment to be completed in less than a day and had minimum interference
with production, and it was expected that this many grids would detect
economically important differences.
To obtain the data for this experimental design, 200 grids were wound with loose
tension T1 on the chosen grid lathe and 200 with tighter tension T2. These 400 grids were
then hand-pulled and formed according to the chosen schedule. The numbers of rejects
for each characteristic, found at inspection, are shown in the center of squares in Figure
11.24. The number of grids in each combination is shown in the lower-right corner.
Analysis of the Data (ANOM)
Main Effects. By combining the data within the indicated pairs of squares we have the
percents of rejects for spaciness as follows:

P1(1,3): 45/200 = 0.225 or 22.5%


P2(2,4): 41/200 = 0.205 or 20.5%
F1(1,2): 32/200 = 0.160 or 16.0%
F2(3,4): 54/200 = 0.270 or 27.0%
T1(1,4): 47/200 = 0.235 or 23.5%
T2(2,3): 39/200 = 0.195 or 19.5%

These are shown in Figure 11.24a.
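These pooled percents can be reproduced directly from the four cell counts (a small illustrative sketch; cell numbers follow Table 11.16a):

```python
# Spaciness rejects in the four half-rep cells (100 grids each);
# keys are the cell numbers (1)-(4) of Table 11.16a.
cells = {1: 19, 2: 13, 3: 26, 4: 28}

def pct(idx):
    # Pooled percent defective over the listed cells
    return 100.0 * sum(cells[i] for i in idx) / (100 * len(idx))

effects = {"P1": pct((1, 3)), "P2": pct((2, 4)),
           "F1": pct((1, 2)), "F2": pct((3, 4)),
           "T1": pct((1, 4)), "T2": pct((2, 3))}
```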


Conclusions. By examining the four charts in Figure 11.24, we see where to place our
emphasis for improving the operation. It should be somewhat as follows:
Tension. A loose tension T1 on the grid lathe is preferable to a tight tension T2
with respect both to taper and damage (Figure 11.24b, c).
Forming operators. These have an effect on spaciness and slant. Operator F2
needs instruction with respect to spaciness; and operator F1 with respect to
slant (Figure 11.24a, d). This is very interesting.
Hand pullers. Again, one operator is better on one characteristic and worse on
another. Operator P1 is a little more careful with respect to damage, but is not
careful with respect to slant, ruining too many for slant (Figure 11.24c, d).

It may be that there are other significant effects from these three factors; but if so,
larger samples in each cell would be needed to detect such significance. This experi-
ment has provided more than enough information to suggest substantial improvements
and extensions in the processing of the grids.
It is interesting and helpful that each operator is better on one quality characteristic
and worse on another. Each operator can be watched for what is done right (or wrong), and
whatever is learned can be taught to others. Sometimes it is argued that people are only
human, so that while they are congratulated for good performance, they also can be
taught how to improve their work.
Computation of Decision Lines (ANOM): Half-Rep of a 2³
The computation in this half-rep design proceeds exactly as in a 2² factorial design
(Table 11.17). First, the overall percent defective P̄ is determined for each defect type.
Then decision lines are drawn at P̄ ± Hα σ̂_P̄. Each comparison is between two totals of
ng = 200.
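The computations laid out in Table 11.17 can be scripted for any of the four defect types. A sketch (illustrative only; the Hα values 1.39 and 1.82 are those used in the table for k = 2 groups):

```python
import math

def anom_lines(total_rejects, n_total=400, n_group=200, h=1.82):
    """Two-group ANOM decision lines (in percent) for attributes data.

    total_rejects: rejects over all n_total units; each of the k = 2
    groups being compared contains n_group units.  h is the H-alpha
    factor (1.39 for alpha = .05, 1.82 for alpha = .01 in this chapter).
    Returns (LDL, UDL).
    """
    p_bar = 100.0 * total_rejects / n_total
    sigma = math.sqrt(p_bar * (100.0 - p_bar) / n_group)
    return p_bar - h * sigma, p_bar + h * sigma

# Rejects out of 400 grids for each defect type (Figure 11.24)
totals = {"spaciness": 86, "taper": 37, "damaged": 19, "slant": 45}
limits = {d: anom_lines(r) for d, r in totals.items()}
```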

11.12 CASE HISTORY PROBLEMS

Problem 11.1—Based on a Case History


Defective Glass Bottles. The use of quite small attributes samples taken at regular time
intervals during production can provide evidence of important differences in the pro-
duction system and indicate sources to be investigated for improvements.
Background. A meeting was arranged by telephone with a quality control representa-
tive from a company whose only product was glass bottles. This was one of the few times
Ellis Ott ever attempted to give specific advice to anyone in a meeting without visiting
the plant or having had previous experience in the industry. The plant was a hundred
miles away and they had a sensible discussion when they met.
The following points were established during the discussion:
1. There were too many rejects: the process was producing about 10 percent
rejects of different kinds.
2. Knowledge of rejects was obtained from an acceptance sampling operation
or a 100 percent inspection of the bottles. The inspection station was in a
warehouse separate from the production areas; the usual purpose of inspection
was to cull out the rejects before shipping the bottles to their customers. The
information was not of much value for any process improvement effort, since it was often
obtained a week after bottles were made and too late to be considered representative of
current production problems.

Table 11.17 Computations of decision lines (ANOM). (Data from Figure 11.24.)

• Spaciness: Comparing k = 2 groups of 200 each, P̄ = 86/400 = 21.5%

    σ̂_P̄ = √[(21.5)(78.5)/200] = 2.90%

    For α = .05:                          For α = .01:
    UDL = P̄ + Hα σ̂_P̄                     UDL = 21.5 + (1.82)(2.90) = 26.8%
        = 21.5 + (1.39)(2.90) = 25.5%     LDL = 16.2%
    LDL = P̄ – Hα σ̂_P̄ = 17.5%

• Taper: k = 2 groups of 200 each, P̄ = 37/400 = 9.25%

    σ̂_P̄ = √[(9.25)(90.75)/200] = 2.05%

    For α = .05: (Not needed)             For α = .01:
                                          UDL = 9.25 + (1.82)(2.05) = 13.0%
                                          LDL = 5.5%

• Damaged: k = 2 groups of 200 each, P̄ = 19/400 = 4.75%

    σ̂_P̄ = √[(4.75)(95.25)/200] = 1.50%

    For α = .05: (Not needed)             For α = .01:
                                          UDL = 4.75 + (1.82)(1.50) = 7.5%
                                          LDL = 2.0%

• Slant: k = 2, P̄ = 45/400 = 11.25%

    σ̂_P̄ = √[(11.25)(88.75)/200] = 2.23%

    For α = .05:                          For α = .01:
    UDL = 11.25 + (1.39)(2.23) = 14.35%   UDL = 11.25 + (1.82)(2.23) = 15.3%
    LDL = 8.15%                           LDL = 7.2%

3. Large quantities of glass bottles were being produced. Several machines were
operating continuously on three shifts, and for seven days per week. Each
machine had many cavities producing bottles.
The representative returned to the plant and made the following arrangements:
• A plant committee was organized representing production, inspection, and
industrial engineering, to study causes and solutions to the problem of defects.

• An initial sampling procedure was planned for some quick information.


From the most recent production, samples of 15 per hour were to be chosen
at random from: (1) each of three machines (on the hot end), and (2) each of
three shifts, and (3) over seven days.
• The sample bottles were to be placed in slots in an egg-carton-type box marked
to indicate the time of sampling as well as machine number, shift, and date.
• After the bottles were collected and inspected, the number and type of various
defects were recorded. The data in Table 11.18 show only the total of all rejects.
(A breakdown by type of defect was provided but is not now available.)

Major Conclusions That You May Reach


1. There were fundamental differences between the three machines, and
differences between shifts.
2. There was a general deterioration of the machines, or possibly in raw
materials, over the seven days. A comparison of this performance pattern
with the timing of scheduled maintenance may suggest changes in the
maintenance schedule.

Table 11.18 Defective glass bottles from three machines—three shifts and seven days.
Machine
Date Shift 1 2 3
8/12 A 1 4 4
B 4 0 4
C 12 6 9
8/13 A 3 6 30
B 2 8 46
C 2 7 27
8/14 A 2 1 1
B 8 11 15
C 8 7 17
8/15 A 4 11 10
B 5 7 11
C 4 6 11
8/16 A 10 8 9
B 6 12 10
C 7 15 19
8/17 A 7 11 15
B 12 9 19
C 24 8 18
8/18 A 8 6 16
B 10 12 17
C 8 19 15
Number of days: 7
Number of machines: 3
Number of shifts: 3
Number of hrs/shift: 8
Number of items/hr: 15
Total n = (63)(120) = 7560
r = (8)(15) = 120/shift/day/machine = cell size

3. Each machine showed a general uptrend in rejects; one machine is best and
another is consistently the worst.
4. There was an unusual increase in rejects on all shifts on August 13 on one
machine only. Manufacturing records should indicate whether there was any
change in the raw material going to that one machine. If not, then something
was temporarily wrong with the machine. The records should show what
adjustment was made.

Suggested Exercises. Discuss possible effects and types of reasons that might explain
the differences suggested below. Prepare tables and charts to support your discussion.
1. Effect of days. All machines and shifts combined. Is there a significant
difference observed over the seven days?
2. Effect of machines. Each machine with three shifts combined. What is the
behavior pattern of each machine over the seven days?
3. Effect of shift. Each shift with three machines combined. What is the behavior
pattern of each shift over the seven days?

Problem 11.2—Wire Treatment


During processing of wire, it was decided to investigate the effect of three factors on an
electrical property of the wire.
• Three factors were chosen to be investigated:
T: Temperature of firing
D: Diameter of heater wire
P: the pH of a coating
• It was agreed to study these three factors at two levels of each. The
experimental design was a half-rep of a 2³.
• The quality characteristic was first measured as a variable; the shape of the
measurement distribution was highly skewed with a long tail to the right.
Very low readings (measurements) were desired; values up to 25 units were
acceptable but not desirable. (Upper specification was 25.) It was agreed,
arbitrarily for this study, to call very low readings (< 5) very good; and high
readings (> 14) bad. Then the analysis was to be made on each of these two
attributes characteristics.
• After this initial planning, it was suggested and accepted that the fractional
factorial would be carried out under each of three firing conditions:
A: Fired in air

Table 11.19 Wire samples from spools tested after firing under different conditions of
temperature, diameter, and pH.

A. Very good quality B. Bad quality

T1 T2 T1 T2

P1 P2 P1 P2
A: 8/12 10/12 A: 0/12 1/12
D1 D1
S: 4/12 10/12 S: 7/12 0/12
H: 9/12 8/12 H: 1/12 4/12

P2 P1 P2 P1
A: 8/12 7/12 A: 3/12 5/12
D2 D2
S: 9/12 7/12 S: 1/12 1/12
H: 7/12 5/12 H: 3/12 5/12

S: Fired by the standard method already in use


H: Fired in hydrogen
The data are shown in Table 11.19. Whether the very good or the bad qualities are more
important in production is a matter for others to decide.

Suggested Exercises
1. Compare the effectiveness of the three firing conditions, A, S, and H. Then
decide whether or not to pool the information from all three firing conditions
when studying the effects of T, D, and P.
2. Assume now that it is sensible to pool the information of A, S, and H;
make an analysis of T, D, and P effects.
3. Prepare a one- or two-page report of recommendations.

11.13 PRACTICE EXERCISES


1. Given three samples of 100 units each, the fraction defective in each sample
is 0.05, 0.06, and 0.10. Compute P̄ and ANOM decision limits for α = 0.01
and α = 0.05. Do any of the three samples have a significantly different
fraction defective?
2. Recompute the ANOM decision chart, excluding workers D, F, H, and K in
Table 11.5.

3. Based on the chart shown in Figure 11.5, what action should be taken?
4. a. Explain why the author used the Poisson rather than the binomial model in
Case History 11.3.
b. Explain the meaning of the triangles and the circles in Table 11.7.
5. Assume in Case History 11.6 that the data from operator A4 had to be excluded
from the analysis. Reanalyze the remaining data and state conclusions.
6. Explain the difference between the upper five percent decision limit of 15.6
percent shown in Figure 11.17 and of 15.34 percent shown in Figure 11.18.
7. Create an industrial example to illustrate the meaning of the terms
“interaction” and “confounding.” Prepare for oral presentation.
8. Work Problem 11.1 in the Case History Problems section.
9. Work Problem 11.2 in the Case History Problems section.
10. Make a histogram of the data from Table 11.7 to see if it appears to conform
to the Poisson distribution.
11. Create interaction plots for the AB and BC interactions in Case History 11.8
using the data from Table 11.11.
12
Special Strategies in Troubleshooting

The special methods presented in this chapter can be very effective in many situations:
• Disassembly and reassembly
• A special screening program for many treatments
• The relationship between two variables
• Use of transformations to supplement ANOM analyses
They involve obtaining insight from the patterns in which data fall.

12.1 IDEAS FROM PATTERNS OF DATA

Introduction
A set of numbers may be representative of one type of causal system when they arise
in one pattern, or a different causal system when the same data appear in a different pat-
tern. A control chart of data that represents a record of a process over a time period
almost invariably carries much more meaning than the same data accumulated in a his-
togram, which obscures any time effects.
There are times, however, when variations in the data appear to be irretrievably lost;
sometimes, as discussed below, some semblance of order can be salvaged to advantage.


Case History 12.1


Extruding Plastic Components
Hundreds of different plastic products are extruded from plastic pellets. Each product
requires a mold that may have one cavity or as many as 16, 20, or 32 cavities producing
items purported to be “exactly” alike. The cavities have been machined from two mating
stainless steel blocks; plastic is supplied to all cavities from a common stream of semi-
fluid plastic.
It is sometimes recognized that the cavities do not perform alike, and it is prudent
foresight to require that a cavity number be cut into each cavity; this is then a means of
identifying the cavity that has produced an item in a large bin of molded parts. The
importance of these numbers in a feedback system is potentially tremendous.
In assemblies, there are two types of defects: those that occur in a random fashion
and those that occur in patterns but are seldom recognized as such. A pattern, when
recognized, can lead to corrective action; an ability to identify these patterns is of real
value in troubleshooting.
A bottle for a well-known men’s hair toiletry has a soft plastic plug (as in Figure
12.1). It is inserted by machine into the neck of a plastic bottle. It controls excess appli-
cation of the toiletry during use.
Incoming inspection was finding epidemics of “short-shot plugs” in shipments from
an out-of-state supplier. Several discussions were held, by telephone, with the vendor.
The short-shot plugs were incompletely formed, as the term implies. When they got into
production they often jammed the equipment and were also a source of dissatisfaction
when they reached the consumer.
A knowledgeable supervisor got involved in the problem and decided to obtain
some data at incoming materials inspection. The importance of determining whether
these defects were occurring randomly from the many mold cavities or in some pattern
was known. Several boxes of plugs were inspected, and the defective short-shot plugs
kept separate. Just over 100 of them were obtained; by examining them, the cavities that

Figure 12.1 Plastic bottle and plug insert.



produced them were identified (see Table 12.1). It is very evident1 that the defective
plugs do not come randomly from the 32 cavities as identified from their mold numbers.
The supervisor then reasoned as follows:
• Since rejects were found from cavities 1 and 32, it must be a 32-cavity mold,
although some cavities produced no defective plugs.
• A 32-cavity mold must be constructed as a 4 × 8 mold rather than a 16 × 2
mold; and the obvious numbering of cavities must be somewhat as in Table
12.2a. From Table 12.1, the supervisor filled in the number of short-shot
plugs corresponding to their cavity of origin (see Table 12.2b).

Table 12.1 Number of defective plastic plugs (short-shot) from each of 32 cavities in a mold.
Cavity No. No. of Defectives Cavity No. No. of Defectives
1 13 17 17
2 1 18 2
3 0 19 0
4 1 20 1
5 0 21 0
6 1 22 0
7 4 23 1
8 10 24 9
9 3 25 8
10 0 26 0
11 0 27 0
12 0 28 0
13 0 29 0
14 0 30 0
15 5 31 1
16 9 32 15
N = 101

Table 12.2a Numbering on cavities in the mold.


1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32

Table 12.2b Pattern of short-shot plugs.


13 1 0 1 0 1 4 10
3 0 0 0 0 0 5 9
17 2 0 1 0 0 1 9
8 0 0 0 0 0 1 15

1. For those to whom it is not “evident,” a formal analysis can be provided. It is possible and important for a trou-
bleshooter to develop a sense of nonrandomness and resort to formal analysis when the evidence is borderline.
Of course, when the use of data results in a major improvement, any question of “significance” is academic.
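One such formal analysis is a chi-square comparison of the observed cavity counts against the uniform expectation of 101/32 ≈ 3.16 defectives per cavity. The sketch below is illustrative; with expected counts this small the chi-square approximation is rough, but the verdict is overwhelming.

```python
# Defectives per cavity from Table 12.1, cavities 1-32 in order
obs = [13, 1, 0, 1, 0, 1, 4, 10, 3, 0, 0, 0, 0, 0, 5, 9,
       17, 2, 0, 1, 0, 0, 1, 9, 8, 0, 0, 0, 0, 0, 1, 15]

expected = sum(obs) / len(obs)        # 101/32, if all cavities behaved alike
chi_sq = sum((o - expected) ** 2 / expected for o in obs)
# chi_sq comes to about 238 on 31 degrees of freedom, far beyond the
# alpha = .01 critical value (roughly 52), confirming the visual verdict
# that the short shots cluster in particular cavities.
```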

It is evident that almost all defective plugs were produced at the two ends of the
mold, and essentially none were produced near the center. Certainly, this is not random.
What could produce such a pattern?
• In any molding process, plastic is introduced into the mold at a source and
forced out to individual cavities through small channels in the mating blocks.
Then, the supervisor reasoned, the source of plastic must be at the center
and not enough was reaching the end cavities. This was an educated surmise,
so the supervisor telephoned the vendor and asked,
“Does your mold have 32 cavities in a 4 × 8 pattern?” “Yes,” the vendor
answered. “Does it have a center source of plastic?” Again, “Yes, what makes
you think so?” Then the supervisor explained the data that had been obtained
and the reasoning formulated from the data. During the telephone conversation,
different ways were suggested as possible improvements for the physical
extrusion problem:
1. Clean out or enlarge portions of the channels to the end cavities,
2. Increase the extrusion pressure, and/or
3. Reduce the viscosity of the plastic by increasing certain feed
temperatures.
After some production trials, these suggestions resulted in a virtual elimination of
short-shot plugs. Case dismissed.

Case History 12.2


Automatic Labelers
Labels are applied to glass and plastic bottles of many kinds: beverage, food, and phar-
maceuticals. They may be applied at 400 or 500 a minute or at slower rates. It is fasci-
nating to watch the intricate mechanism pick up the label, heat the adhesive backing
of the label, and affix it to a stream of whirling bottles. But it can cause headaches from
the many defect types: crooked labels (see Figure 12.2), missing labels, greasy labels,
wrinkled labels, and others.
There are many theories advanced by production supervisors to explain crooked
label defects, for example. Crooked bottles are a common explanation. Production has
even been known to keep one particular crooked bottle in a drawer; whenever a ques-
tion was raised about crooked labels, the crooked bottle in the drawer was produced.
Now crooked bottles (with nonvertical walls) will surely produce a crooked label. It
takes a bit of courage and tenacity to insist that there may be other more important fac-
tors producing crooked labels than just crooked bottles. Such insistence will mean an

Figure 12.2 Plastic bottle and crooked label.

effort to collect data in some form. Is it really justified in the face of that crooked bottle
from the drawer? We thought so.2
There were two simple methods of getting data on this problem:
1. Collect some bottles with crooked labels, and measure the (minimum) angle
of a wall. This procedure was not helpful in this case. In some other problems,
defects have been found to come from only certain mold cavities.
2. Collect some 25 (or 50) bottles from each of the six labeling positions. This
requires help from an experienced supervisor. Then inspect and record the
extent of crooked labels from each head. One of our first studies showed
crooked labels all coming from one particular position—not from the six
positions randomly.
It is not easy to identify differences in the performance of six heads on a labeler
(operating at several hundred per minute) or on any other high-speed multiple-head
machine. And it is easy to attribute the entire source of trouble to crooked bottles, defec-
tive labels, or to the responsibility of other departments or vendors. No one wants to bear
the onus: “It’s not my fault”; but someone in your organization must develop methods
of getting sample data from high-speed processes—data that will permit meaningful
comparisons of these heads as one phase of a problem-solving project or program.
Then, controls must be established to prevent a relapse following the cure.
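One workable form for such a head-to-head comparison, sketched below with hypothetical counts (the numbers, and the simple 3-sigma screen, are illustrative assumptions, not data from the plant), is to tally crooked labels in samples of 25 bottles from each of the six labeling positions and flag any head whose fraction defective falls far from the pooled average:

```python
import math

# Hypothetical counts of crooked labels in samples of n = 25 bottles
# drawn from each of six labeling heads (illustrative numbers only).
counts = {1: 1, 2: 0, 3: 2, 4: 14, 5: 1, 6: 2}
n = 25

p_bar = sum(counts.values()) / (n * len(counts))   # pooled fraction defective
sigma = math.sqrt(p_bar * (1 - p_bar) / n)         # binomial sigma per head

suspect = [head for head, c in counts.items()
           if abs(c / n - p_bar) > 3 * sigma]      # simple 3-sigma screen
print(suspect)   # in this hypothetical tally, only head 4 stands out
```

A head flagged this way points the investigation at the labeling machine itself rather than at crooked bottles or vendors.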

12.2 DISASSEMBLY AND REASSEMBLY


Any assembled product that can be readily disassembled and then reassembled can be
studied by the general procedure outlined in this chapter. The method with two compo-
nents as in the first example below is standard procedure. However, it is not standard
procedure when three or four components are reassembled; the method has had many
applications since the one in Case History 12.3.

2. One must frequently operate on the principle that “My spouse is independently wealthy,” whether or not it is true.
384 Part III: Troubleshooting and Process Improvement

Example 12.1
While walking through a pharmaceutical plant, we saw some bottles with crooked caps;
the caps were crooked on some bottles but straight on others. The supervisor said, “Yes,
some of the caps are defective.” Out of curiosity, we picked up a bottle with a crooked
cap and another with a straight cap and interchanged them. But the “crooked” cap was
now straight and the “straight” cap was now crooked. Obviously, it was not this cap but
the bottle that was defective. No additional analysis was needed, but a similar check on
a few more crooked assemblies would be prudent.

Case History 12.3


Noisy Kitchen Mixers3
During the production of an electric kitchen mixer, a company was finding rejects for
noise at final inspection. Different production experiments had been run to determine
causes of the problem, but little progress had been effected over several months of
effort. It was agreed that new methods of experimentation should be tried.
Choice of Factors to Include in the Experiment
There were different theories advanced to explain the trouble; each theory had some
evidence to support it and some to contradict it. A small committee representing pro-
duction, engineering, and quality met to discuss possible reasons for the trouble. The
usual causative variables had been tested without much success.
One gear component (henceforth called “gears”) was suspected; it connected the top
half and bottom half of the mixer. Nevertheless, there was no assurance that gears were
a source of noise or the only source. A production program to inspect gears for "out of
round" was being considered, with the intention of using only the best gears in assembly.
Then the question was asked: “If it isn’t the gears, is the trouble in the top half or
the bottom half of the mixer?” There was no answer to the question. “Is it feasible to
disassemble several units and reassemble them in different ways?” The answer was,
“Yes.” Since each half is a complicated assembly in itself, it was agreed to isolate the
trouble in one-half of the mixer rather than look for the specific reason for trouble.
There were now three logical factors to include in the study:
Tops (T)
Bottoms (B)
Gears (G)

3. E. R. Ott, “A Production Experiment with Mechanical Assemblies,” Industrial Quality Control 9, no. 6 (1953).

The trouble must certainly be caused by one or more of these separate factors (main
effect) or perhaps by the interrelation of two factors (interaction). Further experiments
might be required after seeing the results of this preliminary study.
Arrangement of Factors in the Experiment
The object of this troubleshooting project was to determine why some of the mixers
were noisy. There was no test equipment available to measure the degree of noise of a
mixer, only to determine whether it was good (G) or noisy (N).
It was agreed to select six mixers that were definitely noisy and an equal number of
good mixers. Then a program of interchange (reassemblies of tops, bottoms, and gears)
was scheduled. During the interchanges, those mixer reassemblies that included a gear
from a noisy mixer should test noisy if the gear was the cause; or if the top or bottom
was the cause, then any reassembly which included a top or bottom from a noisy mixer
should test noisy. Thus, it might be expected that half of the reassemblies (containing a
gear from a noisy mixer) would test noisy, and the other half test good.
The reassembly of components from the three different groups was scheduled
according to the design of Table 12.3, which shows also the number of noisy mixers,
out of a possible six, in each of the reassembled combinations. After reassembly (with
parts selected at random), each mixer reassembly was rated by the original inspector as
either noisy or good. In case of doubt, a rating of one-half was assigned. For example,
group 7 indicates all six mixers from the reassembly of G gears, N tops, and G bottoms
were noisy. No one type of component resulted in all noisy mixers nor all good mixers.
Nevertheless, tops from noisy mixers were a major source of trouble. This was a most
unexpected development and produced various explanations from those who had been
intimately associated with the problem.
Two things are immediately evident from Table 12.3:
1. Noisy gears do not reassemble consistently into noisy mixers, and
2. Noisy tops do reassemble almost without exception into noisy mixers.
Formal Analysis
There were 6 × 8 = 48 reassemblies; the fractions of noisy ones are shown in Table 12.4
and graphically in Figure 12.3.

Table 12.3 Data on reassemblies of mixers.

                         N Gears                  G Gears
                   N tops      G tops       N tops      G tops
    N bottoms      (1) 4½      (2) 2        (3) 6       (4) 2
    G bottoms      (5) 4½      (6) 3        (7) 6       (8) 1½

    (ng = 6 reassemblies per group; each cell shows the number of noisy mixers out of six.)

Table 12.4 Computations for main effects and interactions (ANOM).

Number of noisy assemblies out of total assemblies:

Main effects:
    Gears:   N (1,2,5,6): 14/24   = 0.583      G (3,4,7,8): 15.5/24 = 0.646
    Tops:    N (1,3,5,7): 21/24   = 0.875      G (2,4,6,8): 8.5/24  = 0.354
    Bottoms: N (1,2,3,4): 14.5/24 = 0.604      G (5,6,7,8): 15/24   = 0.625

Two-factor interactions:
    TB: (Like)(1,3,6,8): 15/24   = 0.625      (Unlike)(2,4,5,7): 14.5/24 = 0.604
    TG: (Like)(1,4,5,8): 12.5/24 = 0.521      (Unlike)(2,3,6,7): 17/24   = 0.708
    BG: (Like)(1,2,7,8): 14/24   = 0.583      (Unlike)(3,4,5,6): 15.5/24 = 0.646

Three-factor interaction:
    TBG: (Odd*)(1,4,6,7): 15.5/24 = 0.646     (Even*)(2,3,5,8): 14/24 = 0.583

p̄ = 0.6145

* Note: Treatment combinations for three-factor interactions (and higher) are grouped according to subscript
totals being either odd or even (see Chapter 11).
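The fractions in Table 12.4 are simply sums of the Table 12.3 cell counts over groups of four cells, each divided by the 24 reassemblies pooled. A minimal sketch of that bookkeeping (group numbers and counts taken directly from Table 12.3; the function name is ours):

```python
# Noisy counts (out of 6) in the eight reassembly groups of Table 12.3.
noisy = {1: 4.5, 2: 2, 3: 6, 4: 2, 5: 4.5, 6: 3, 7: 6, 8: 1.5}

def fraction(groups):
    # Each comparison pools 4 groups x 6 mixers = 24 reassemblies.
    return sum(noisy[g] for g in groups) / 24

tops_N = fraction([1, 3, 5, 7])    # tops from noisy mixers: 21/24 = 0.875
tops_G = fraction([2, 4, 6, 8])    # tops from good mixers: 8.5/24, about 0.354
gears_N = fraction([1, 2, 5, 6])   # 14/24, about 0.583
gears_G = fraction([3, 4, 7, 8])   # 15.5/24, about 0.646
p_bar = sum(noisy.values()) / 48   # overall fraction noisy, about 0.6145
```

The wide gap between tops_N and tops_G, against the near-equality of the gear and bottom pairs, is the whole story of Table 12.4 in four numbers.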

[Figure 12.3 (ANOM chart): the proportion of noisy assemblies (ng = 24) is plotted for each main effect (Gears N, G; Tops N, G; Bottoms N, G) and each interaction (BG, GT, BT, BTG; Like/Unlike or Odd/Even groupings), against a centerline at p̄ = .615 with decision lines UDL = 0.75 and LDL = 0.48 (α = .05), UDL = 0.79 and LDL = 0.43 (α = .01).]

Figure 12.3 A formal comparison of mixer performance (analysis assumes independence) in reassemblies using subassemblies from six noisy and six good mixers. (Data from Table 12.4.)

Formal Analysis

σ̂p = √[p̄(1 – p̄)/ng] = √[(0.615)(0.385)/24] = 0.099

Decision Lines (Figure 12.3)

α = 0.05:

p̄ ± H0.05σ̂p = 0.615 ± (1.39)(0.099) = 0.615 ± 0.138

UDL = 0.75
LDL = 0.48

α = 0.01:

p̄ ± H0.01σ̂p = 0.615 ± (1.82)(0.099) = 0.615 ± 0.180

UDL = 0.79
LDL = 0.43
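These decision lines can be reproduced in a few lines; the H factors 1.39 and 1.82 are the values quoted above, and the α = 0.01 upper limit computes to about 0.795, shown rounded as 0.79:

```python
import math

p_bar = 29.5 / 48                  # overall fraction noisy (Table 12.3)
n_g = 24                           # reassemblies in each compared group
sigma_p = math.sqrt(p_bar * (1 - p_bar) / n_g)   # about 0.099

H = {0.05: 1.39, 0.01: 1.82}       # ANOM factors used in the text
limits = {a: (p_bar - h * sigma_p, p_bar + h * sigma_p)
          for a, h in H.items()}
# limits[0.05] is about (0.48, 0.75); limits[0.01] about (0.43, 0.795).
# The Tops N fraction, 0.875, falls far above even the 0.01 line.
```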

Discussion
Before this study, gears had been the suspected source of trouble. Can we now say that
gears have no effect? The interaction analysis in Figure 12.3 indicates that the only
effect that might be suspect is the top-gear relationship. Although no top-gear decision
lines have been drawn for α = 0.10, top-gear combinations may warrant a continuing
suspicion that more extreme out-of-round gears result in noisy mixers with some tops.
But the most important result is that of the tops, and engineering studies can be
designed to localize the source of trouble within them. This was done subsequently by
dividing tops into three areas and making reassemblies in a manner similar to that used
in this study.
The decision to choose six noisy mixers and six good mixers was dictated primar-
ily by expediency. With this number of mixers, it was estimated that the reassembling
and testing could be completed in the remaining two hours of the workday. This was an
important factor because some of the committee members were leaving town the next
morning and it was hoped to have some information on what would happen by then. In
addition, it was thought that the cause of the trouble would be immediately apparent
with even fewer reassembled mixers than six of each kind.
This type of 2³ study with mechanical or electrical-mechanical assemblies is an
effective and quick means of localizing the source of trouble. It is quite general, and we
have used it many times with many variations since this first experience.
Note: See Case Histories 12.1 and 12.2 in Section 12.1 for a general strategy in
troubleshooting.

12.3 A SPECIAL SCREENING PROGRAM FOR MANY TREATMENTS
Introduction
The periodic table contains many trace elements. Some will increase the light output
from the phosphor coating on the face of a radar tube. This type of light output is called
cathode luminescence. The coating produces luminescence when struck by an electron
beam. How do we determine which trace element to add to a new type of coating?
There are many other examples of problems involving the screening of many
possible treatments. Many chemicals have been suggested as catalysts in certain
usages. Many substances are the possible irritants causing a particular allergy. Many

substances are tested as possible clues for a cancer. How can the testing of many possi-
bilities be expedited?
This chapter considers one general strategy to provide answers to problems of this
type where there are many items to be investigated. How should they all be examined
using only a few tests? The amount of testing can be reduced drastically under certain
assumptions. Testing sequences involving many factors have been developed that can
do the job. These sequences have their origins in the two-level factorial experiment.
Case History 11.10 discussed how the 2³ factorial design can be split into two
halves of four runs each; thus, with a change in the assumptions, we can test the three
factors with only four of the eight combinations and not perform the other four. As the
number of factors increases, there are many other ways of dividing the 2ⁿ runs into frac-
tions; each strategy involves the extent to which the experimenter decides to relax the
assumptions.
On the other hand, the basic 2ⁿ factorial array can be augmented with additional data
points, and the assumptions can then support quadratic estimates. Thus it is apparent
from the developmental history of applied statistics that the number of tests or data points
bears a direct relationship to the assumptions made at the planning stage for gathering
data and to the testing of significance of effects seen when the data are finally at hand.
A screening program. In the case of screening many possible factors or treatments,
it is possible to reduce drastically the amount of testing under certain assumptions.4
Theorem. With n separate tests, it is possible to screen 2ⁿ – 1 treatments at the
chosen levels as having a positive or negative effect, and to identify the effec-
tive treatment under assumptions given below. We then have:

    Number of tests, n:                 2    3    4
    Number of factors, 2ⁿ – 1:          3    7    15
    Number of treatments/test, 2ⁿ⁻¹:    2    4    8

Assumptions
1. It is feasible to combine more than one treatment in a single test. For example,
several trace elements can be included in the same mix of coating.
2. No treatment is an inhibitor (depressant) for any other treatment. It is not
unusual to observe a synergistic effect from two combined substances; such
an effect would favor their joint inclusion. (The combining of two treatments
that adversely affect each other would not be desirable.)
3. The effect of any treatment is either positive or negative (effective or not
effective). For example, it is possible to determine whether the light output
of one mix of phosphor is brighter than that of another.

4. E. R. Ott and F. W. Wehrfritz, “A Special Screening Program for Many Treatments,” Statistica Neerlandica, special
issue in honor of Prof. H. C. Hamaker (July 1973): 165–70.

4. The effect of a treatment is consistent (not intermittent).


5. No more than one effective treatment is among those being studied. (There
may be none.)
These assumptions are the basis of this initial discussion. However, in application
each assumption can be modified or disregarded as warranted by the conditions of a spe-
cific experiment. Modifications of the assumptions would require a reconsideration of
the logic. For example, a numerical value of the yield of a process might be available on
each run; this measure can then be used instead of considering only whether the run is a
success or a failure. Replicates of each run would then provide a measure of the process
variability. The following discussion is based on the listed assumptions, however.

Examples of Screening Programs


Screen 7 different treatments with three tests (n = 3).
Screen 15 different treatments with four tests (n = 4).
Screen 31 different treatments with five tests (n = 5).
Each of the n tests will include 2ⁿ⁻¹ treatments.

Example 12.2
As an example, when the experimenter is planning three tests (n = 3), seven treatments
or factors (k = 2ⁿ – 1 = 2³ – 1 = 7) can be allocated to the three tests. Each test will
include four treatments (2ⁿ⁻¹ = 2³⁻¹ = 4). The treatments or factors are designated as X1,
X2, X3, X4, X5, X6, X7 and assigned to these numbers as desired. Table 12.5 shows which
of the treatments to combine in each test. The array of plus and minus signs is the design
matrix, indicating the presence (+) or absence (–) of a treatment (or high and low con-
centrations of a treatment). The experimental responses are designated as Ei: each E is
simply a (+) or a (–) when assumption 3 is being accepted.
The identification of the effective treatment is made by matching the pattern of
experimental responses, El, E2, E3, with one of the columns of treatment identification.
When the (+) and (–) results of the three tests match a column of (+) and (–) in Table

Table 12.5 A screening design for 2³ – 1 = 7 factors.

                                  Treatment identification        Experimental
    Treatments* included       X1   X2   X3   X4   X5   X6   X7     response
    Test 1: X1,X2,X3,X5         +    +    +    –    +    –    –        E1
    Test 2: X1,X2,X4,X6         +    +    –    +    –    +    –        E2
    Test 3: X1,X3,X4,X7         +    –    +    +    –    –    +        E3

    * Note: Each test combines four of the treatments, which are represented by +. A full factorial would require
    not three but 2⁷ = 128 tests.

12.5, this identifies the single causative treatment. If, for example, the three tests
yield the vertical sequence (+ – +), then X3 is identified as the cause.
Discussion
A discussion of possible answer patterns for the three tests involving the seven treat-
ments follows:
No positive results. If none of the three tests yields a positive result, then none
of the seven treatments is effective (assumptions 2 and 4).
One positive result. If, for example, T1 alone yields a positive effect, then
treatment X5 is effective. The reasoning is as follows: the only possibilities
are the four treatments in T1. It cannot be X1 or X2, because both are in T2,
which gave a negative result; it cannot be X3, which is also in T3, which
gave a negative result. It must therefore be X5.
Two positive results. If T1 and T2 both yield positive results and T3 yields
negative (El and E2 are +; E3 is –), then X2 is the effective treatment. The
reasoning is similar to that for one positive experimental result; the decision
that it is X2 can be made easily from Table 12.5. Simply look for the order
(+ + –) in a column; it appears uniquely under X2.
Three positive results. The (+ + +) arrangement is found under treatment X1.
It is easily reasoned that it cannot be X2 since X2 does not appear in test 3,
which gave a positive response; and similar arguments can be made for each
of the other variables other than X1.
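The decoding logic just described mechanizes easily. In the sketch below (an illustration in our own notation, not the book's program), each treatment is assigned a distinct nonzero column of n signs, test i combines every treatment whose column has a + in position i, and the observed response pattern is matched against the columns. The column ordering is a convention chosen so that it reproduces the X1–X7 layout of Table 12.5 for n = 3:

```python
from itertools import product

def screening_design(n):
    """Assign each of the 2**n - 1 treatments a distinct nonzero
    column of n signs; test i combines the treatments marked '+' there."""
    def weight(p):
        return int("".join("1" if s == "+" else "0" for s in p), 2)
    pats = [p for p in product("+-", repeat=n) if "+" in p]
    # Convention: most '+' signs first, then descending binary weight;
    # for n = 3 this reproduces the column order of Table 12.5.
    pats.sort(key=lambda p: (-p.count("+"), -weight(p)))
    return {f"X{j + 1}": p for j, p in enumerate(pats)}

def treatments_in_test(design, i):
    # Test i (1-based) includes every treatment marked '+' in row i.
    return sorted(x for x, p in design.items() if p[i - 1] == "+")

def decode(design, responses):
    # Match the vertical sequence of test results against the columns.
    hits = [x for x, p in design.items() if p == tuple(responses)]
    return hits[0] if hits else None   # None: no treatment is effective

design = screening_design(3)
# treatments_in_test(design, 1) gives ['X1', 'X2', 'X3', 'X5'], test 1 of Table 12.5
# decode(design, "+-+") gives 'X3', the example worked in the text
```

Under assumption 5, a response pattern matches at most one column, which is why the lookup in decode is unambiguous.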

Example 12.3
When n = 4, let the 2⁴ – 1 = 15 treatments be

X1, X2, X3, . . . , X15

Then a representation of the screening tests and analysis is given in Table 12.6. Note
that each test combines 2ⁿ⁻¹ = 8 treatments.
There is a unique combination of the plus and minus signs corresponding to each
of the 15 variables; this uniqueness identifies the single cause (assumption 5). No vari-
able is effective if there is no positive response.

Fewer Than 2ⁿ – 1 Treatments to Be Screened. Frequently, the number of treatments k
to be screened is not a number of the form 2ⁿ – 1. We then designate the k treatments as

X1, X2, X3, . . . , Xk
Table 12.6 A screening design for 2⁴ – 1 = 15 factors.

                                             Treatment identification*                Experimental
                                    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15        response
    T1: X1 X2 X3 X4 X6 X8 X10 X12   +  +  +  +  –  +  –  +  –  +  –  +  –  –  –           E1
    T2: X1 X2 X3 X5 X6 X9 X11 X13   +  +  +  –  +  +  –  –  +  –  +  –  +  –  –           E2
    T3: X1 X2 X4 X5 X7 X8 X11 X14   +  +  –  +  +  –  +  +  –  –  +  –  –  +  –           E3
    T4: X1 X3 X4 X5 X7 X9 X10 X15   +  –  +  +  +  –  +  –  +  +  –  –  –  –  +           E4

    * The treatments may also be designated by the letters A, B, C, . . . , O (Case History 12.4):
    Test 1: A B C D F H J L
    Test 2: A B C E F I K M
    Test 3: A B D E G H K N
    Test 4: A C D E G I J O

Table 12.7 A screening design for five factors.

                             Treatment identification      Experimental
                           X1    X2    X3    X4    X5        response
    Test 1: X1,X2,X3,X5     +     +     +     –     +           E1
    Test 2: X1,X2,X4        +     +     –     +     –           E2
    Test 3: X1,X3,X4        +     –     +     +     –           E3

When n = 3 and k = 5, for example, carry out the three tests T1, T2, T3, disregarding
entirely treatments X6 and X7, as in Table 12.7.
Note: We would not expect either T2 alone or T3 alone to give a positive result; that
would be a puzzler under the assumptions. (Possible explanations: a gross experimental
blunder, an intermittent effect, or an interaction; for example, X1 and X4 together
might be required to give a positive result.)
Again, if there are fewer than 15 treatments to be screened (but more than seven,
which employs only three tests), assign numbers to as many treatments as there are and
ignore the balance of the numbers. The analysis in Table 12.6 is still applicable.
Ambiguity If Assumption 5 Is Not Applicable. One can seldom be certain, in advance,
that there are not two or more effective treatments in the group to be screened. If, during
the experiment, some combination of the treatments gives a vastly superior performance,
this is an occasion for rejoicing. But economics may still require the specific
identification of causes. Let us consider the case of seven treatments being
screened (Table 12.5):
• One positive test response. No ambiguity possible; the indicated treatment is
the only possibility.
• Two positive test responses. If both T1 and T2 give positive responses, then it
may be X2 alone or any two or three of the three treatments X2, X5, and X6.
The possible ambiguity can be resolved in a few tests; an obvious procedure would
be to run tests with X2, X5, and X6 individually. However, when there seems to be little
possibility of having found more than a single positive treatment, run a single test with
X5 and X6 combined. If the result is negative, then the single treatment is X2.

Case History 12.4


Screening Some Trace Elements
Some of the earth’s trace elements have important effects upon some quality character-
istics of cathode-ray tubes. Something is known about these effects, but not enough.
Hoping to identify any outstanding effect upon the light-output quality characteristic, a
screening study was planned. Fifteen trace elements (designated A, B, C, . . . , O) were
mixed in four phosphor slurries in the combinations shown in Table 12.8 (eight different

Table 12.8 Variables data in a screening design for 15 factors (trace elements).

              C44 C34 C24 C14                                 Experimental
              A  B  C  D  E  F  G  H  I  J  K  L  M  N  O       response
    Test 1    +  +  +  +  –  +  –  +  –  +  –  +  –  –  –          46
    Test 2    +  +  +  –  +  +  –  –  +  –  +  –  +  –  –          84
    Test 3    +  +  –  +  +  –  +  +  –  –  +  –  –  +  –          44
    Test 4    +  –  +  +  +  –  +  –  +  +  –  –  –  –  +          51

trace elements were included in each slurry run). A (+) indicates those elements that were
included in a test and a (–) indicates those that were not included. The measured output
responses of the four slurries are shown in the column at the right.
• Previous experience with this characteristic had shown that the process
variability was small; consequently, the large response from test 2 is
“statistically significant.”
• Also, the response from test 2 was a definite improvement over ordinary slurries,
which had been averaging about 45 to 50. The other three output responses, 46,
44, and 51, were typical of usual production. The four test responses can be
considered to be (– + – –). Under the original assumptions of this study plan,
the identity of the responsible trace element must be M.
• One cannot be sure, of course, that the uniqueness assumption is applicable.
Regardless, the eight trace elements of test 2 combine to produce an excellent
output response. At this point in experimentation, the original 15 trace elements
have been reduced to the eight of test 2 with strong evidence that the improve-
ment is actually a consequence of M alone. Whether it is actually a consequence
of M alone or some combination of two or more of the eight elements of test 2
can now be determined in a sequence of runs. Such a sequence might be as
follows: The first run might be with M as the only one, the second with the
seven other than M. If there is still uncertainty, experiments could continue
eliminating one trace element at a time.
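The identification of M can also be checked mechanically by copying the sign rows of Table 12.8 and matching the response pattern (– + – –) against the 15 columns; this lookup uses only the table as printed:

```python
# Sign rows of Table 12.8 (tests 1-4), elements A..O left to right.
rows = [
    "++++-+-+-+-+---",   # test 1, response 46
    "+++-++--+-+-+--",   # test 2, response 84
    "++-++-++--+--+-",   # test 3, response 44
    "+-+++-+-++----+",   # test 4, response 51
]
pattern = "-+--"   # only test 2 gave an unusually high response

elements = "ABCDEFGHIJKLMNO"
hits = [elements[j] for j in range(15)
        if "".join(row[j] for row in rows) == pattern]
print(hits)   # -> ['M']: exactly one column matches the response pattern
```

Because the 15 columns are all distinct, the match is unique whenever assumption 5 holds.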

12.4 OTHER SCREENING STRATEGIES


One can find other screening strategies discussed in the technical literature. Anyone
who has reason to do many screening tests should consider the preceding strategy and
consult the published literature for others.
The foregoing allocation of experimental factors to the tests is based on combina-
torial arrays. As such, this strategy serves to reduce the initial large number of factors to
more manageable size. This is indicated in the readings of Table 12.8.

12.5 RELATIONSHIP OF ONE VARIABLE TO ANOTHER


Engineers often use scatter diagrams to study possible relationships of one variable to
another. They are equally useful in studying the relationship of data from two sources—
two sources that are presumed to produce sets of data where either set should predict
the other. Scatter diagrams are helpful in studying an expected relationship between two
sets of data by displaying the actual relationship that does exist under these specific
conditions. They often show surprising behavior patterns that give an insight into the
process that produced them. In other words, certain relationships, which are expected by
scientific knowledge, may be found to exist for the majority of the data but not for all.
Every type of nonrandomness (lack of control) of a single variable mentioned earlier—
outliers, gradual and abrupt shifts, bimodal patterns—is a possibility in a scatter dia-
gram that displays the relationship of one set of data to another. These evidences of
nonrandomness may lead an engineer to investigate these unexplained behavior patterns
and discover important facts about the process.
For example, one step in many physical or chemical processes is the physical treat-
ment of a product for the purpose of producing a certain effect. How well does the
treatment actually accomplish its expected function on the items? Does it perform ade-
quately on the bulk of them but differently enough on some items to be economically
important? Any study of a quality characteristic relationship between before and after a
treatment of importance can begin with a display in the form of a scatter diagram. This is
especially important when problems of unknown nature and sources exist. The method
is illustrated in the following case history.

Case History 12.5


Geometry of an Electronic Tube
Changes in the internal geometry in a certain type of electronic tube during manufac-
ture can be estimated indirectly by measuring changes in the electrical capacitance.
This capacitance can be measured on a tube while it is still a “mount,” that is, before the
mount structure has been sealed into a glass bulb (at high temperature).
During successive stages of a tube assembly, the internal geometry of many (or all)
was being deformed at some unknown stage of the process. It was decided to investigate
the behavior of a mount before being sealed into a bulb and then the tube after sealing—
a before-and-after study. In addition, it was decided to seal some tubes into bulbs at 800°
and some at 900° and to observe the effect of sealing upon these few tubes.
Data are shown in Table 12.9 for 12 tubes sealed at 900° and another 12 sealed at
800°; two of the latter were damaged and readings after sealing were not obtainable. In
preparing the scatter diagram, we use principles of grouping for histograms, Section 1.4,
Chapter 1. The range of the R-1 readings is from a low of 0.56 to a high of 0.88, a dif-
ference of 0.32. Consequently, a cell width of 0.02 gives us 16 cells, which is reason-
able for such a scattergram. Similarly, a cell width of 0.02 has been used for the R-2

Table 12.9 Twelve tubes at 900° and twelve tubes at 800°.

                At 900°                                 At 800°
    Tube   R-1 before   R-2 after stage A     Tube   R-1 before   R-2 after stage A
    no.    stage A      and before stage B    no.    stage A      and before stage B
     1     0.66         0.60                   13    0.75         0.68
     2     0.72         0.57                   14    0.56         0.49
     3     0.68         0.55                   15    0.72         0.59
     4     0.70         0.60                   16    0.66         0.56
     5     0.64         0.64                   17    0.62         0.52
     6     0.70         0.58                   18    0.56         –
     7     0.72         0.56                   19    0.65         –
     8     0.73         0.62                   20    0.88         0.87
     9     0.82         0.62                   21    0.56         0.46
    10     0.66         0.60                   22    0.76         0.77
    11     0.72         0.62                   23    0.72         0.69
    12     0.84         0.87                   24    0.74         0.70

data. In practice, tally marks are made with two different colored pencils; in Figure 12.4,
an x has been used for tubes processed at 900° and an o for tubes processed at 800°.
Tube no. 1, at 900°, for example, is represented by an x at the intersection of column
66–67 and row 60–61.
Analysis and Interpretation: Stage A
There is an obvious difference in the patterns corresponding to the 12 tubes sealed at
900° and the 10 sealed at 800°; all 22 tubes had had the same manufacturing assembly
and treatment in all other respects. The 900° sealing temperature is deforming the inter-
nal geometry in an unpredictable manner; tubes processed at 800° in stage A show a
definite tendency to line up. If the quality characteristics of the tubes can be attained at
800°, well and good. If not, then perhaps the mount structure can be strengthened to
withstand the 900°, or perhaps an 850° sealing temperature could be tried.
In any case, the stage in the process that was deforming the geometry has been iden-
tified, and the problem can be tackled.
Discussion
A small number of tubes was sufficient to show a difference in behavior between the
two groups. No analytical measure5 of the relationship would give additional insight
into the problem.

5. Note: There will be occasions when an explicit measure of the before-and-after relationship is useful to the
troubleshooter, but not in this case. However, for the benefit of those who commonly use such measures, the
correlation coefficient is r = 0.84 and the least-squares line of best fit is Y = –0.166 + 1.119X. Note that a simple
nonparametric test for the existence of a relationship is presented in P. S. Olmstead and J. W. Tukey, "A Corner
Test for Association," Annals of Mathematical Statistics 18 (December 1947): 495–513.

[Figure 12.4 (scatter diagram, n = 22): capacitance before stage A on the horizontal axis (cells of width 0.02 from 0.56 to 0.89) versus capacitance after stage A on the vertical axis. x's mark tubes sealed under manufacturing condition A1 (900°); o's mark tubes sealed under condition A2 (800°) and show a definite tendency to line up. A line of no change (R-1 = R-2) and a line of best fit are drawn.]

Figure 12.4 A scatter diagram showing relationship of capacitance on same individual tubes before and after stage A. (Data from Table 12.9.)

The line of no change has been drawn in Figure 12.4 by joining the point at the max-
imum corner of the (86–87) cell on both R-1 and R-2 with the similar corner point of
the (56–57) cell. R-1 = R-2 at each point of the line. It is evident that the electrical
capacitance was unchanged on only three or four tubes. For all others, it was apprecia-
bly reduced during stage A, especially for tubes with lower capacitances. This repre-
sents an unsolved problem.
Summary: Scatter Diagram, Before and After
In Chapter 2, we discussed certain patterns of data involving one independent variable
that indicate assignable causes of interest to an engineer. We also discussed certain tests
available to detect different data patterns. When we study possible relationships between
two variables, we have an interest in the pattern of points with respect to a diagonal line
(such as a line of no change). The data may show variations of the same or different types
as in Chapter 2: gross errors, mavericks, trends, abrupt shifts, short cycles, and long

cycles. The suggested tests for the existence of these patterns with respect to a horizon-
tal line can be extended or adjusted to diagonal lines. A simple scattergram gives valu-
able clues that are often entirely adequate6 for engineering improvements.

12.6 MECHANICS OF MEASURING THE DEGREE OF A RELATIONSHIP

Introduction
The following section presents a well-known computational procedure, one which may
at times give some insight into the sources or natures of process misbehavior. The two
general divisions in this presentation are:
1. Computing the equation of a line of best fit7
2. Computing a correlation coefficient
Through the use of computers and computer programs, the practice of computing a
value of r from Equation (12.4) below has become an almost irresistible temptation.
Serious decision errors8 are a frequent consequence of this practice unless accompanied
by a printout of a scatter diagram. It is important to start every analysis of a set of
relationship data with a scatter diagram, even a rough one. Sometimes it is even satisfactory
to use ordinary ruled paper.

Line of Best Fit


The line of best fit can be computed from the data and applied without much danger of
serious misuse. A linear relationship is assumed between an independent variable X and
a dependent variable Y. How can we establish, objectively, a line to predict Y from X
when we have n pairs of experimental data, (Xi,Yi)?
Mathematical Procedure. Consider that a scatter diagram has been drawn, as in Figure
12.5. We propose to compute the unknown parameters a and b in the equation

Yc = a + b(Xi – X̄)        (12.1)

6. They often provide more information of value than the more traditional computation of the correlation coefficient
r. The computation of r from data containing assignable causes may give a deceptive measure of the underlying
relationship between the two variables—sometimes too large and sometimes too small. Any computation of r
should be postponed at least until the scattergram has been examined.
7. In the sense of least squares. This line is usually called a regression line.
8. See Figure 12.6.

Y
Yi

Yc

Xi X

Figure 12.5 A scatter diagram of n plotted points with an estimated line of best fit; differences
(Yi – Yc) have been indicated by vertical dotted lines.

where Yc,i denotes the predicted value corresponding to a given Xi. We usually write Yc instead of Yc,i.
The equations to determine a and b are

a = ΣYi/n = Ȳ    (12.2)

b = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² = [nΣXiYi − (ΣXi)(ΣYi)] / [nΣXi² − (ΣXi)²]    (12.3)

These two parameters are easily determined using a computer or a hand calculator.
The equations are derived9 on the basis of finding a and b such as to minimize the sum of
squares of differences between computed Yc’s and corresponding observed Yi’s. Hence,
the term least squares line of best fit to the data.
This gives us the equation10:

Yc = a + b(Xi − X̄)    (12.1)

It can be noted that the line passes through the average point (X̄, Ȳ).

9. This derivation involves some differential calculus. For a complete discussion of regression, see N. R. Draper and
H. Smith, Applied Regression Analysis, 3rd ed. (New York: John Wiley & Sons, 1998).
10. Enoch Ferrell suggests the following ingenious procedure. After the scatter diagram has been drawn, stretch a
black thread across the figure in what seems to fit the pattern of data dots “best by eye.” Then count the number
of points on each side of the line and adjust it slightly to have n/2 points on each side of the resulting line. Turn
the scatter diagram so that the line is horizontal; count the number of runs above and below the line (the median
line). Then adjust the line to different positions, always with some n/2 points on each side, until a line with
maximum number of runs is obtained. With a little practice, this is easy if the number of points is not too large.
This procedure gives a line with computation limited to counting; it can be found quickly; and it dilutes the effect
of wild observations (mavericks).
Chapter 12: Special Strategies in Troubleshooting 399

A proper interpretation of the line of best fit does not require that X be a random vari-
able. With Equation (12.1), the values of X can be set deliberately at different values to
obtain data pairs (Xi, Yi). This is entirely different from the important requirements for
the correlation coefficient, whose mechanical computation is given later in this section.
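The computation of Equations (12.2) and (12.3) is easily scripted. The following is a minimal sketch (ours, not the text's; the function name is our own) that fits Yc = a + b(Xi − X̄) by least squares and is checked on points lying exactly on a known line:

```python
def fit_line(x, y):
    """Least squares fit of Yc = a + b*(X - xbar), per Equations (12.2) and (12.3)."""
    n = len(x)
    xbar = sum(x) / n
    a = sum(y) / n                      # Equation (12.2): a = Ybar
    num = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
    den = n * sum(xi * xi for xi in x) - sum(x) ** 2
    b = num / den                       # Equation (12.3)
    return a, b, xbar

# Quick check on points lying exactly on Y = 2 + 3(X - 2):
x = [1.0, 2.0, 3.0, 4.0]
y = [2 + 3 * (xi - 2.0) for xi in x]   # xbar = 2.5, so a = Ybar = 3.5 and b = 3
a, b, xbar = fit_line(x, y)
```

Because the fitted line passes through (X̄, Ȳ), recovering a = Ȳ = 3.5 and b = 3 exactly confirms the arithmetic.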
Computation Example. What is the regression equation corresponding to the 22 pairs
of points in Table 12.9 and Figure 12.4?
Answer: The following preliminary computations are made, with n = 22.
ΣXi = 15.56    X̄ = 0.707    ΣXi² = 11.1354

ΣYi = 13.76    Ȳ = 0.625    ΣXiYi = 9.8778

Then from Equations (12.2) and (12.3),

a = 13.76/22 = 0.625

b = [22(9.8778) − (15.56)(13.76)] / [22(11.1354) − (15.56)²] = 3.2060/2.8652 = 1.119

and

Yc = 0.625 + 1.119(X − 0.707)

Any two points will determine the line:

When X = 0.595, Yc = 0.500
When X = 0.855, Yc = 0.791

The regression line can be drawn through these two points.
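Even when only the summary statistics above are at hand (rather than the 22 raw pairs of Table 12.9), a and b follow directly from Equations (12.2) and (12.3). A sketch reproducing the arithmetic of this example:

```python
n = 22
sum_x, sum_y = 15.56, 13.76          # sums of Xi and Yi
sum_xx, sum_xy = 11.1354, 9.8778     # sums of Xi**2 and Xi*Yi

a = sum_y / n                                                   # Equation (12.2)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)    # Equation (12.3)
xbar = sum_x / n

# Two points through which to draw the line:
yc_1 = a + b * (0.595 - xbar)
yc_2 = a + b * (0.855 - xbar)
```

Run with full precision, this reproduces a = 0.625, b = 1.119, and the two plotted points Yc = 0.500 and Yc = 0.791.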


It seems doubtful that the equation of the line contributes appreciably in the study of
the effect of heat on the capacitance of electronic tubes in this troubleshooting study.
But there are occasions where it is a help to have it.

Linear Correlation Coefficient—a Measure of the Degree of Relationship between X and Y
Application of the following discussion is critically dependent upon certain assump-
tions, and these assumptions are seldom satisfied adequately in troubleshooting and
process improvement projects.

Procedure: Consider X to be an independent, random, continuous variable that is normally distributed, and assume also that the dependent variable Y is normally distributed. Let the assumed linear relationship be

Y = a + b(X − X̄)

In a physical problem, we obtain pairs of experimental values (Xi, Yi), as in Table 12.9,
for example. They will not all lie along any straight line. For example, see Figure 12.4.
Then a measure r of the linear relationship, under the assumptions below, can be
computed from n pairs of values (Xi, Yi) as follows

r = Σ(Xi − X̄)(Yi − Ȳ) / √[Σ(Xi − X̄)² Σ(Yi − Ȳ)²]    (12.4a)

= [Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²] √[Σ(Xi − X̄)² / Σ(Yi − Ȳ)²] = bσ̂x/σ̂y    (12.4b)

where each sum runs from i = 1 to n.

It makes little sense to compute an r as in Equation (12.4) unless the following pre-
requisites have been established (usually by a scatter diagram):
1. X and Y are linearly related.
2. X is a random variable, continuous, and normally distributed. (It must not be
a set of discrete, fixed values; it must not be a bimodal set of values; it must
be essentially from a normal distribution.)
3. The dependent variable Y must also be random and normally distributed.
(It must not be an obviously nonnormal set of values.)
A scatter diagram is helpful in detecting evident nonlinearity, bimodal patterns,
mavericks, and other nonrandom patterns. The pattern of experimental data should be
that of an “error ellipse”: one extreme, representing no correlation, would be circular; the
other extreme, representing perfect correlation, would be a perfectly straight line.
Whether the data are not normally distributed is not entirely answered by a his-
togram alone; some further evidence on the question can be had by plotting accumu-
lated frequencies Σfi on normal-probability paper (see Section 1.7), and checking for
mavericks (Chapter 3).
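Once the scatter diagram has been checked, the mechanical computation of Equation (12.4a) is short. A sketch (ours) that also verifies the identity r = bσ̂x/σ̂y of Equation (12.4b), using a small made-up data set that lies nearly along a straight line:

```python
import math

def corr(x, y):
    """Linear correlation coefficient r, Equation (12.4a)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def slope(x, y):
    """Least squares slope b, Equation (12.3)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
            / sum((xi - xbar) ** 2 for xi in x))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.2]      # nearly a straight line, so r is near +1
r = corr(x, y)

# Equation (12.4b): r = b * sigma_x / sigma_y (any consistent divisor cancels).
n = len(x)
sx = math.sqrt(sum((v - sum(x) / n) ** 2 for v in x) / n)
sy = math.sqrt(sum((v - sum(y) / n) ** 2 for v in y) / n)
r_alt = slope(x, y) * sx / sy
```

For data exactly on a straight line with positive slope, `corr` returns exactly +1, matching the interpretation given below.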
Some Interpretations of r, under the Assumptions. It can be proved that the maximum attainable value of r from Equation (12.4) is +1; this corresponds to a straight line with positive slope. (Its minimum attainable value of −1 corresponds to a straight line with negative slope.) Its minimum attainable absolute value is zero; this corresponds to a circular pattern of random points.
Very good predictability of Y from X in the region of the data is indicated by a value
of r near +1 or –1.
Very poor predictability of Y from X is indicated by values of r close to zero.
The percentage of the variation in Y explained by the linear relationship of Equation (12.1) is 100r². Thus r = 0 and r = ±1 represent the extremes of no predictability and perfect predictability. Values of r greater than 0.8 or 0.9 (r² values of 0.64 and 0.81, respectively) computed from production data are uncommon.
Misleading Values of r. The advantage of having a scatter diagram is important in the
technical sense of finding causes of trouble. There are different patterns of data that will
indicate immediately the inadvisability of computing an r until the data provide reason-
able agreement with the required assumptions.
Some patterns producing seriously misleading values of r when a scatter diagram is
not considered are as follows:
• Figure 12.6 (a); a curvilinear relationship will give a low value of r.
• Figure 12.6 (b); two populations each with a relatively high value of r. When
a single r is computed from Equation (12.4b), a low value of r results. It is
important to recognize the existence of two sources in such a set of data.

Figure 12.6 Some frequently occurring patterns of data, (a) through (e), that lead to seriously misleading values of r when the patterns are not recognized.

• Figure 12.6 (c); a few mavericks, such as shown, give a spuriously high
value of r.
• Figure 12.6 (d); a few mavericks, such as shown, give a spuriously low
value of r, and a potentially opposite sign of r.
• Figure 12.6 (e); two populations distinctly separated can give a spuriously
high value of r, and a potentially opposite sign of r.
All the patterns in Figure 12.6 are typical of sets of production and experimental data.
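The maverick effect of Figure 12.6(c) is easy to demonstrate numerically. In the sketch below (our illustration, with made-up points), four points arranged in a symmetric cross have r exactly 0; appending a single wild observation drives r to about 0.98:

```python
import math

def corr(x, y):
    # r per Equation (12.4a)
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Four points in a symmetric cross: no linear relationship at all.
x = [0.0, 0.0, 1.0, -1.0]
y = [1.0, -1.0, 0.0, 0.0]
r_clean = corr(x, y)                          # exactly 0

# One maverick far from the cluster inflates r dramatically, as in Figure 12.6(c).
r_maverick = corr(x + [10.0], y + [10.0])     # about 0.98
```

A scatter diagram would expose the maverick immediately; the computed r alone would not.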

12.7 SCATTER PLOT MATRIX


When the number of variables is greater than two, the number of two-variable relation-
ships that can be observed gets larger as well. For example, if a data set contains five
variables of interest, the number of two-variable relationships becomes

C(5,2) = 5!/[(5 − 2)! 2!] = 5!/(3! 2!) = (5·4·3·2·1)/[(3·2·1)(2·1)] = 10

With the use of a simple extension of the scatter diagram technique, this chore can
be reduced to a single graphic, namely, the scatter plot matrix. Similar to a matrix of
numbers, the scatter plot matrix is a matrix of scatter diagrams.
Hald discusses an experiment involving the heat produced during the hardening of
Portland cement.11 The investigation included four components, x1, x2, x3, and x4, con-
tained in the clinkers from which the cement was produced. The heat evolved, y, after
180 days of curing is presented in calories per gram of cement, and the components are
given in weight %. Table 12.10 gives the data collected during the investigation.
The use of a scatter plot matrix shows us the nature of the relationships between
these five variables with only a single graphic, as shown in Figure 12.7 (produced in
MINITAB12). Note that there are 10 scatter plots in the matrix as expected.
A close look at the scatter plot matrix in Figure 12.7 shows that there are both lin-
ear and nonlinear relationships between these variables. All of the four components
exhibit linear relationships with the response variable y, as seen along the bottom row
of the matrix. The four components appear to be fairly independent with the exception of
a possible nonlinear relationship between x1 and x3, and a linear relationship between
x2 and x4.
Scatter plot matrices are a critical supplement to a correlation matrix. Spurious val-
ues, as seen in Figure 12.6, can produce misleading values of r. Using the scatter plot
matrix in conjunction with a correlation matrix prevents misrepresentation of relation-
ships between variables.
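As a numeric companion to the scatter plot matrix, the C(5,2) = 10 pairwise values of r can be computed directly. A pure-Python sketch (ours) using the x2 and x4 columns of Table 12.10, whose strong linear relationship appears in Figure 12.7 as a large negative r:

```python
import math

x2 = [26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68]
x4 = [60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12]

def corr(x, y):
    # r per Equation (12.4a)
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

n_pairs = math.comb(5, 2)     # the 10 two-variable relationships
r_24 = corr(x2, x4)           # strongly negative, consistent with Figure 12.7
```

Scanning all 10 pairs this way, alongside the matrix of plots, guards against the misleading-r patterns of Figure 12.6.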

11. A. Hald, Statistical Theory with Engineering Applications (New York: John Wiley & Sons, 1952): 647.
12. MINITAB is a popular statistical analysis program from MINITAB Inc., State College, PA.

Table 12.10 Hald cement data.


x1 x2 x3 x4 y
7 26 6 60 78.5
1 29 15 52 74.3
11 56 8 20 104.3
11 31 8 47 87.6
7 52 6 33 95.9
11 55 9 22 109.2
3 71 17 6 102.7
1 31 22 44 72.5
2 54 18 22 93.1
21 47 4 26 115.9
1 40 23 34 83.8
11 66 9 12 113.3
10 68 8 12 109.4

Figure 12.7 Scatter plot matrix of Hald cement data.

12.8 USE OF TRANSFORMATIONS AND ANOM


Often the analyst will encounter a situation where the mean of the data is correlated with its variance. The resulting distribution is typically skewed in nature.
Fortunately, if we can determine the relationship between the mean and the variance, a

transformation can be selected that will result in a more symmetrical, reasonably nor-
mal, distribution for analysis.
An important point here is that the results of any transformation analysis pertain
only to the transformed response. However, we can usually back-transform the analysis
to make inferences about the original response. For example, suppose that the mean μ and the standard deviation σ are related by the following relationship:

σ ∝ μ^α

The exponent α of the relationship can lead us to the form of the transformation needed to stabilize the variance relative to its mean. Let's say that a transformed response YT is related to its original form Y as

YT = Y^λ

The standard deviation of the transformed response will now be related to the original variable's mean μ by the relationship

σ_YT ∝ μ^(λ+α−1)

In this situation, for the variance to be constant, or stabilized, the exponent must
equal zero. This implies that

λ + α − 1 = 0  ⇒  λ = 1 − α

Such transformations are referred to as power, or variance-stabilizing, transforma-


tions. Table 12.11 shows some common power transformations based on α and λ.
Note that we could empirically determine the value of α by fitting a linear least squares line to the relationship

σi = θμi^α

which can be made linear by taking the logs of both sides of the equation yielding

log σi = log θ + α log μi

Table 12.11 Common power transformations for various data types.


α       λ = 1 − α     Transformation            Type(s) of Data
0 1 None Normal
0.5 0.5 Square root Poisson
1 0 Logarithm Lognormal
1.5 –0.5 Reciprocal square root
2 –1 Reciprocal


The data take the form of the sample standard deviation si and the sample mean Xi at

time i. The relationship between log si and log Xi can be fit with a least squares regres-
sion line. The least squares slope of the regression line is our estimate of the value of a.
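This empirical fit can be sketched in a few lines. The subgroup summaries below are made up for illustration: σ = 0.1μ exactly, so the fitted slope should recover α = 1 and λ = 1 − α = 0, pointing to a log transformation per Table 12.11:

```python
import math

means = [1.0, 2.0, 4.0, 8.0]     # hypothetical subgroup means
sds   = [0.1, 0.2, 0.4, 0.8]     # subgroup standard deviations: sigma = 0.1 * mu

# Least squares slope of log(s_i) on log(xbar_i) estimates alpha.
lx = [math.log(m) for m in means]
ly = [math.log(s) for s in sds]
n = len(lx)
xbar, ybar = sum(lx) / n, sum(ly) / n
alpha = (sum((a - xbar) * (b - ybar) for a, b in zip(lx, ly))
         / sum((a - xbar) ** 2 for a in lx))

lam = 1 - alpha                  # lambda = 1 - alpha; here ~0 => log transform
```

With real subgroup data, as in Example 12.4 below, the slope will only be approximately one of the tabled values, and the nearest common transformation is chosen.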

Example 12.4
In finding a transformation for the mica thickness data, we can use the mean and stan-
dard deviations for the 40 subgroups presented in Table 7.3. A linear regression line that

relates the subgroup log means, log X̄i, and log standard deviations, log si, was determined to be

log si = 0.68 − 0.023 log X̄i

Here the estimate of α is −0.023, which is very nearly equal to zero, so λ ≈ 1. According to Table 12.11, the recommendation is that a transformation is unnecessary.

Box–Cox Transformations
Another approach to determining a proper transformation is attributed to Box and
Cox.13 Suppose that we consider our hypothetical transformation of the form

YT = Y^λ

Unfortunately, this particular transformation breaks down as λ approaches zero, since Y^λ goes to one. Transforming the data with a λ = 0 power transformation of this form would make no sense whatsoever (all the transformed data would be equal!), so the Box–Cox procedure defines the transformation separately at λ = 0. The transformation takes on the following forms, depending on the value of λ:

YT = (Y^λ − 1)/(λẎ^(λ−1)), for λ ≠ 0
YT = Ẏ ln Y, for λ = 0

where

Ẏ = geometric mean of the Yi = (Y1Y2 … Yn)^(1/n)

The Box–Cox procedure evaluates the change in the error sum of squares for a model with a specific value of λ. As the value of λ changes, typically between −5 and +5, an optimal value for the transformation occurs where the error sum of squares is minimized. This is easily seen with a plot of SS(Error) against the value of λ.
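A minimal grid-search sketch of the procedure (ours, stdlib only, not the Minitab implementation): apply the geometric-mean-scaled transform above, compute the error sum of squares about the mean at each λ, and keep the minimizer. The data here are constructed as lognormal quantiles, so the minimizing λ should land near 0, indicating a log transformation:

```python
import math
from statistics import NormalDist

# Deterministic lognormal-shaped sample: exp of standard normal quantiles.
n = 200
y = [math.exp(NormalDist().inv_cdf((i - 0.5) / n)) for i in range(1, n + 1)]

gdot = math.exp(sum(math.log(v) for v in y) / n)     # geometric mean, Y-dot

def boxcox(v, lam):
    # Scaled Box-Cox transform; the lambda = 0 branch is the log form.
    if lam == 0:
        return gdot * math.log(v)
    return (v ** lam - 1) / (lam * gdot ** (lam - 1))

def sse(lam):
    # Error sum of squares of the transformed values about their mean.
    t = [boxcox(v, lam) for v in y]
    m = sum(t) / n
    return sum((ti - m) ** 2 for ti in t)

grid = [round(-2 + 0.05 * k, 2) for k in range(81)]  # lambda from -2 to +2
best = min(grid, key=sse)                            # should be near 0 here
```

The geometric-mean scaling makes SS(Error) comparable across λ, which is what allows a single minimization over the grid.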

13. G. E. P. Box and D. R. Cox, “An Analysis of Transformations,” Journal of the Royal Statistical Society, B, vol. 26
(1964): 211–43.

[Figure 12.8: Box–Cox plot of mica thickness (0.001 in.), StDev versus lambda, with 95.0% confidence limits. Estimate 0.87; lower CL 0.21; upper CL 1.58; rounded value 1.00.]

Figure 12.8 Box–Cox transformation plot for n = 200 mica thickness values.

Box–Cox plots are available in commercial statistical programs, such as Minitab. The Box–Cox plot for the mica thickness data from Table 1.1 was produced with Minitab and is shown in Figure 12.8.
Note that this analysis was performed on the raw n = 200 data values. Yet the
estimate of lambda is 0.9, which again is close to one (indicating that no transforma-
tion is necessary).
Furthermore, Minitab produces a confidence interval for lambda based on the data.
For this example, a 95 percent confidence interval was generated (it is the default). Data
sets will rarely produce the exact estimates of l that are shown in Table 12.11. The use
of a confidence interval allows the analyst to “bracket” one of the table values, so a
more common transformation can be justified.

Some Comments About the Use of Transformations


Transformations of the data to produce a more normal distribution are sometimes use-
ful, but their practical use is limited. Often the transformed data do not produce results
that differ much from the analysis of the original data.
Transformations must be meaningful and, hopefully, relate to the first principles of
the problem being studied. Furthermore, according to Draper and Smith:
When several sets of data arise from similar experimental situations, it may not
be necessary to carry out complete analyses on all the sets to determine appro-
priate transformations. Quite often, the same transformation will work for all.

The fact that a general analysis for finding transformations exists does not
mean that it should always be used. Often, informal plots of the data will clearly
reveal the need for a transformation of an obvious kind (such as ln Y or 1/Y). In
such a case, the more formal analysis may be viewed as a useful check proce-
dure to hold in reserve.
With respect to the use of a Box–Cox transformation, Draper and Smith offer this
comment on the regression model based on a chosen l:
The model with the “best l” does not guarantee a more useful model in practice.
As with any regression model, it must undergo the usual checks for validity.14

Case History 12.6


Defects/Unit2 on Glass Sheets

Background
Glass sheets are 100 percent inspected for visual defects on each of six production lines.
One of these defects is called s/cm2. Table 12.12 contains a portion of the quality data

Table 12.12 Portion of quality data collected on glass sheets over a two-month period.

Date        Time    Line Number    s/cm2
1/5/1999 7:00 L2 0.050866
1/5/1999 7:02 L2 0.028139
1/5/1999 7:04 L2 0.029221
1/5/1999 7:06 L2 0.031385
1/5/1999 16:52 L9 0.013880
1/5/1999 16:55 L9 0.013219
1/5/1999 16:58 L9 0.017184
1/6/1999 6:41 L9 0.020489
1/6/1999 8:08 L9 0.015202
1/6/1999 8:11 L9 0.013880
1/6/1999 8:14 L9 0.016523
1/7/1999 6:48 L2 0.099232
1/7/1999 6:54 L2 0.089912
1/7/1999 10:18 L2 0.059759
...         ...     ...            ...
2/26/1999 7:28 L2 0.008459
2/26/1999 19:23 L2 0.027256
2/26/1999 19:26 L2 0.023496
2/26/1999 19:28 L2 0.018797
2/27/1999 7:25 L3 0.010575
2/27/1999 7:28 L3 0.011236
2/27/1999 7:31 L3 0.016523
2/27/1999 7:33 L3 0.011897
2/27/1999 20:42 L2 0.020677
2/27/1999 20:44 L2 0.021617
2/27/1999 20:46 L2 0.018797

14. N. R. Draper and H. Smith, Applied Regression Analysis, 3rd ed. (New York: John Wiley & Sons, 1998): 279.

Figure 12.9 Histogram of the untransformed s/cm2 defect data.

collected on glass sheets that were produced. Sheets were collected from each produc-
tion line and inspected by a quality technician and the number of small defects (“s”)
found was reported to production. A histogram of the data over a two-month period is
given in Figure 12.9, which shows a typical Poisson skewed distribution.
Analysis
In an effort to determine whether this defect differed statistically with respect to the dif-
ferent lines, the analysis of means (ANOM) procedure was chosen since it shows sta-
tistical significance and graphically portrays which lines are different from the others.
However, before ANOM can be applied on this data, an attempt should be made to
normalize (make more bell-shaped) the distribution. A common first approach to normal-
izing defect/cm2 (Poisson) data is to use a square root transformation. After applying this
transformation, the distribution becomes more symmetrical, as seen in Figure 12.10.
This transformation seems to have made the distribution somewhat more symmet-
rical, but not enough to claim it is reasonably normal (bell-shaped). Statisticians often
use a Box–Cox approach to determine the proper transformation to normalize a vari-
able. The results of such an approach (done in Minitab) yields the Box–Cox plot in
Figure 12.11.
The optimum value is lambda ~ 0, which indicates that a log transformation is
needed. The histogram in Figure 12.12 shows the result of transforming the original
s/cm2 data using natural logarithms.
Much better! Now that the new variable is reasonably normally distributed, the
ANOM approach is applied to the transformed data. The resulting ANOM plot (using
an Excel add-in program discussed in Chapter 17) is shown in Figure 12.13.

Figure 12.10 Histogram of the square root of the s/cm2 defect data.

[Figure 12.11: Box–Cox plot of s/cm2, StDev versus lambda, with 95.0% confidence limits. Estimate −0.06; lower CL −0.28; upper CL 0.14; rounded value 0.00.]

Figure 12.11 Box–Cox transformation plot for the original s/cm2 defect data.

The ANOM plot indicates that line 2 has significantly higher levels of s/cm2 than
the other four lines. Even though line 1 had a response higher than that for line 2, its
decision limits are wider, indicating that there are fewer values in the data set for this

Figure 12.12 Histogram of the natural log transformation of the s/cm2 defect data.

[Figure 12.13: one-way ANOM of ln(s/cm2) by line number, no standard given. Line averages L1 through L9 plotted with UDL(0.05) = −3.336, CL = −3.625, and LDL(0.05) = −3.915.]
Figure 12.13 ANOM plot of the transformed ln(s/cm2) data using the Excel add-in program.

line. The fact that the point falls within these limits is an indication that there is insuf-
ficient data to declare it as statistically significant compared to the other lines.
Furthermore, lines 3 and 9 have significantly lower levels of s/cm2. Lines 1, 4,
and 5 plotted within the decision limits, so the lines do not appear to be significantly

different from each other. (Note that the ANOM limits were generated with a = .05, so
we are 95 percent confident in our conclusions.)
Investigation would begin as to why line 2 defect levels are so much higher, and
why lines 3 and 9 produce significantly lower levels. In this case, it was determined,
after some further questioning of the customer, that new washers had been installed on
these two lines. The other lines were still using older washers.
Why Not Just Do ANOM on the Original Data?
Programs, such as the Excel ANOM add-in, which is on the CD-ROM included with
this text, will analyze Poisson data directly. However, such programs require that the
data be in the form of counts (integers), not defects/unit (in this case, s/cm2). If we
know the corresponding unit information to drive the original data back to defect counts,
and the unit values are the same (or very nearly the same), then these programs will work
as intended.
Unfortunately, the data often come in the form of defects/unit (blisters/ lb., knots/in3,
s/cm2) and the corresponding unit information is either not collected, unavailable, or
both. In this case, we must resort to the use of ANOM assuming normality and follow
the approach presented in this section.
What would happen if we used ANOM and assumed normality without a transformation?
The ANOM plot on the original, untransformed data is given in Figure 12.14. How
does it compare to the one in Figure 12.13?

[Figure 12.14: one-way ANOM of the original s/cm2 data by line number, no standard given. Line averages plotted with UDL(0.05) = 0.043, CL = 0.032, and LDL(0.05) = 0.021.]
Figure 12.14 ANOM plot of the original s/cm2 data using the Excel add-in program.

Actually, the conclusions do not change at all! This may not always be the case, but
it illustrates the point that transformations sometimes unnecessarily complicate the inter-
pretation of results. The only anomaly in these results is the level of s/cm2 for line 5,
which now appears higher than it does when transformed. However, since the point still
falls within the decision limits, this result has no consequence on our conclusions.

12.9 PRACTICE EXERCISES


1. State the three factors of Case History 12.3.
2. Since there were 6 × 8 = 48 reassemblies in Case History 12.3, why is n = 24
used in computing sigma-hat?
3. What was the advantage of conducting the reassembly experiment in Case
History 12.3 instead of inspecting for out-of-round gears?
4. What was the main conclusion of Case History 12.3?
5. What are the general conditions that lead to screening programs of the type
described in Section 12.3?
6. How many different treatments can be screened with six tests? What can be
learned about interaction between factors in such a screening program?
7. Make up a set of data for Table 12.5 to illustrate a significant effect for
treatment 5 but not for any others.
8. a. Would you expect a scatter diagram to be of help in presenting
the comparison of the chemical analysis and the materials balance
computations of the data in Table 3.1?
b. Prepare a scatter diagram of the data
9. a. Would you expect a scatter diagram of, for example, machine 49
versus 56 in Table 4.3 to be helpful?
b. Prepare a scatter diagram.
13
Comparing Two Process Averages

13.1 INTRODUCTION
This section provides a transition to analysis of means for measurement data in the
special case where two treatments are being compared and sample sizes are the same.
The discussion begins with a comparison at two levels of just one independent vari-
able. When data from two experimental conditions are compared, how can we judge
objectively whether they justify our initial expectations of a difference between the two
conditions? Three statistical methods are presented to judge whether there is objective
evidence of a difference greater than expected only from chance. This is a typical deci-
sion to be made, with no standard given.

13.2 TUKEY’S TWO-SAMPLE TEST TO DUCKWORTH’S SPECIFICATIONS
There are important reasons for becoming familiar with the Tukey1 procedure: no hand
calculator is needed; also such a procedure may well be used more often and “compen-
sate for (any) loss of mathematical power. Its use is to indicate the weight of the evi-
dence roughly. If a delicate and critical decision is to be made, we may expect to replace
it or augment it with some other procedure.”

Tukey Procedure
Two groups of r1 and r2 measurements taken under two conditions are the criteria for
the Tukey–Duckworth procedure. The requirement for comparing the experimental

1. J. W. Tukey, “A Quick, Compact, Two-Sample Test to Duckworth’s Specifications,” Technometrics 1, no. 1
(February 1959): 31–48.


Table 13.1 Critical values of the Tukey–Duckworth sum. Also see Table A.13.

Approximate    Two-sided critical value    One-sided* critical value
risk           of the sum a + b            of the sum a + b
0.09           6                           5
0.05           7                           6
0.01           10                          9
0.001          13                          12

* Kindly provided by Dr. Larry Rabinowitz, who also studied under Dr. Ott. Note that, for the two-sided test, α ≅ (a + b)/2^(a+b) and, for the one-sided test, α ≅ (a + b)/2^(a+b+1).

conditions by these criteria is that the largest observation of the two be in one sample
(A2) and the smallest in the other (A1). Let the number of observations in A2 that are
larger than the largest in A1 be a, and let the number in A1 smaller than the smallest in
A2 be b where neither a nor b is zero. (Count a tie between A1 and A2 as 0.5.) Critical
values of the sum of the two counts, a + b, for the Tukey–Duckworth test are given in
Table 13.1. The test is essentially independent of sample sizes if they are not too
unequal, that is, the ratio of the larger to the smaller is less than 4:3.
Note, the Tukey–Duckworth test may be one- or two-sided.

Case History 13.1


Nickel-Cadmium Batteries
In the development of a nickel-cadmium battery, a project was organized to find some
important factors affecting capacitance.2
The data in Table 13.2 were obtained at stations C1 and C2 (other known independent
variables believed to have been held constant). Is the difference in process averages from
the two stations statistically significant?
Some form of graphical representation is always recommended and the credibility
of the data considered. The individual observations have been plotted in Figure 13.1.
There is no obvious indication of an outlier or other lack of stability in either set. Also,
the criteria for the Tukey–Duckworth procedure are satisfied, and the sum of the two
counts is
a + b = 6 + 6 = 12

This exceeds the critical sum of 10 required for risk α ≅ 0.01 (Table 13.1) and allows
us to reject the null hypothesis that the stations are not different.
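The end counts of the procedure are easily mechanized. A sketch (ours) applied to the Table 13.2 data, counting a tie as 0.5 as described above:

```python
def tukey_duckworth(a1, a2):
    """End-count sum a + b. Requires the overall largest value in a2 and
    the overall smallest in a1; ties across samples count as 0.5."""
    hi = max(a1)   # largest value in A1
    lo = min(a2)   # smallest value in A2
    a = sum(1.0 if v > hi else 0.5 if v == hi else 0.0 for v in a2)
    b = sum(1.0 if v < lo else 0.5 if v == lo else 0.0 for v in a1)
    return a + b

c1 = [0.6, 1.0, 0.8, 1.5, 1.3, 1.1]   # station C1
c2 = [1.8, 2.1, 2.2, 1.9, 2.6, 2.8]   # station C2

count = tukey_duckworth(c1, c2)       # 6 + 6 = 12, beyond the 0.01 critical sum of 10
```

Here all six C2 values exceed the largest C1 value, and all six C1 values fall below the smallest C2 value, so the sum is 12.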

2. See Case History 14.2, p. 444, for additional discussion.


Chapter 13: Comparing Two Process Averages 415

Table 13.2 Data: capacitance of nickel-cadmium batteries measured at two stations.

C1        C2
0.6       1.8
1.0       2.1
0.8       2.2
1.5       1.9
1.3       2.6
1.1       2.8

C̄1 = 1.05    C̄2 = 2.23
R1 = 0.9     R2 = 1.0

Figure 13.1 Comparing levels of a battery characteristic manufactured at two different stations.
(Data from Table 13.2.)

13.3 ANALYSIS OF MEANS, k = 2, ng = r1 = r2 = r


There is hardly need of any additional evidence than the Tukey two-sample analysis to
decide that changing from C1 to C2 (Table 13.2) will increase capacitance. However,
analysis of means (ANOM) for variables data, discussed previously in Chapter 11 for
attributes data, will be presented here and used later with many sets of variables data.
ANOM will be used first to compare two processes represented by samples, then
applied in Chapter 14 to 22 and 23 experimental designs. The importance of 22 and 23
designs in troubleshooting, pilot-plant studies, and initial studies warrants discussion
separate from the more general approach in Chapter 15 where the number of variables
and levels of each is not restricted to two.
Just as with attributes data, it is often good strategy to identify possible problem
sources quickly and leave a definitive study until later. The choice of some independent
variables to be omnibus-type variables is usually an important shortcut in that direction.

Formal Analysis
• From Table 13.2 (k = 2, r = 6), values of the two averages and ranges are
given. They are shown in Figure 13.2. The two range points are inside the
decision limits. The mechanics of analysis of means are shown in Table 13.3.

[Figure 13.2: (a) averages C̄1 = 1.05 and C̄2 = 2.23 plotted with C̄ = 1.64 and decision limits UDL(0.01) = 1.98 and LDL(0.01) = 1.30; (b) ranges plotted with R̄ = 0.95, UCL = D4R̄ = 1.90, and LCL = 0; ng = r = 6.]
Figure 13.2 Comparing two process averages by analysis of means (variables). (Data from
Table 13.2.)

Table 13.3 Summary: mechanics of analysis of means (ANOM) for two small samples (A and B) with r1 = r2 = r.

Step 1. Obtain and plot the two sample ranges. Find R̄ and D4R̄. If both points fall below D4R̄, compute†

σ̂ = R̄/d2*  and  σ̂X̄ = σ̂/√r

Also df ≅ (0.9)k(r − 1) = 1.8(r − 1) for k = 2 (or see Table A.11).

Step 2. Plot points corresponding to Ā, B̄, and their average X̄.

Step 3. Compute Hα σ̂X̄ and draw decision lines

UDL = X̄ + Hα σ̂X̄
LDL = X̄ − Hα σ̂X̄

usually choosing values of α = 0.10, 0.05, and 0.01 to bracket the two sample averages. Hα is from Table A.14.

Step 4. When the pair of points falls outside a pair of decision lines, their difference is statistically significant, risk α.

Note: Points Ā and B̄ will be symmetrical about X̄ when r1 = r2.
† Find d2* in Table A.11 for k = 2 and r.

• Figure 13.2b verifies there is homogeneity of variance.

• From Table A.11,

σ̂ = R̄/d2* = 0.95/2.60 = 0.365

and

σ̂X̄ = σ̂/√r = 0.365/√6 = 0.149

df ≅ (0.9)(2)(6 − 1) = 9

• From Table A.14 for k = 2 and df = 9: H0.05 = 1.60, H0.01 = 2.30.

• Decision lines. For α = 0.01:

X̄ ± H0.01σ̂X̄ = 1.64 ± (2.30)(0.149)

UDL = 1.64 + 0.34 = 1.98
LDL = 1.64 − 0.34 = 1.30

These two decision lines are shown in Figure 13.2a; the two C̄ points fall outside them. We conclude that there is a statistically significant difference in capacitance resulting from the change from station C1 to C2. This is in agreement with the Tukey procedure above.
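The mechanics of Table 13.3 reduce to a few lines. A sketch (ours) reproducing the α = 0.01 decision limits above, with R̄, d2*, and H0.01 taken from the text and tables:

```python
import math

r = 6                     # observations per sample; k = 2 samples
rbar = 0.95               # average range, (0.9 + 1.0)/2
d2_star = 2.60            # from Table A.11 for k = 2, r = 6
h_01 = 2.30               # H(0.01) from Table A.14 for k = 2, df = 9
grand_mean = 1.64         # (1.05 + 2.23)/2

sigma_hat = rbar / d2_star              # about 0.365
sigma_xbar = sigma_hat / math.sqrt(r)   # about 0.149

udl = grand_mean + h_01 * sigma_xbar    # about 1.98
ldl = grand_mean - h_01 * sigma_xbar    # about 1.30

# C1bar = 1.05 falls below LDL and C2bar = 2.23 above UDL: significant at 0.01.
```

Both averages fall outside the (LDL, UDL) pair, confirming the conclusion drawn from Figure 13.2a.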

13.4 STUDENT’S T AND F TEST COMPARISON OF TWO STABLE PROCESSES
Note: This section may be omitted without affecting understanding of subsequent sections.

Example 13.1
Again use the data from Table 13.2 (k = 2, r1 = r2 = r = 6).
The t statistic to compute is:

    t = (C̄2 − C̄1) / [sp √(1/r1 + 1/r2)]        (13.1)

Step 1: Compute

    s1² = σ̂1² = [r Σx² − (Σx)²] / [r(r − 1)] = [6(7.15) − (6.3)²] / 30 = 0.107

    s2² = σ̂2² = [6(30.70) − (13.4)²] / 30 = 0.155

Step 2: Check for evidence of possible inequality of variances with the F test,
Equation (4.9).

    F = s2²/s1² = 0.155/0.107 = 1.45        with df = (5, 5)

where s2² is the larger of the two variances.


In Table A.12, we find critical values F0.05 = 5.05 and F0.10 = 3.45, corresponding
to risks α = 0.10 and α = 0.20, since this is a two-tailed test.
Step 3: Since F = 1.45 is less than even the critical value F0.10, we accept equality
of variances of the two processes and proceed to estimate their common
pooled variance. From Equation (4.8)

    sp² = (s1² + s2²)/2 = (0.107 + 0.155)/2 = 0.131

and

    sp = √0.131 = 0.362

with

    df = (r1 − 1) + (r2 − 1) = 5 + 5 = 10

Since r1 = r2 in Equation (13.1), the denominator of t becomes

    σ̂Δ = sp √(2/r) = sX̄ √2 = 0.209        (13.2)

Note that when r1 ≠ r2, then

    sp² = [(r1 − 1)s1² + (r2 − 1)s2²] / (r1 + r2 − 2)        (13.3)

See Equation (4.8).


Step 4: Then finally compute from Equation (13.1)

    t = (C̄2 − C̄1)/0.209 = 1.18/0.209 = 5.64        df = 10

The critical value found in Table A.15 for df = 10 is: t0.01 = 3.17.
Step 5: Decision. Since our t = 5.64 is larger than t0.01 = 3.17, we decide that the
process represented by the sample C̄2 is operating at a higher average than
the process represented by the sample C̄1, with risk less than α = 0.01.
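Steps 1 through 5 can be verified with a few lines of code. This sketch (not part of the original text) works from the quoted sums Σx and Σx²; the slight difference from t = 5.64 comes from the rounded intermediates used in the hand computation:

```python
import math

# Pooled two-sample t from the summary statistics quoted in Step 1
# (r = 6 observations per station).
r = 6
sum1, sumsq1 = 6.3, 7.15      # station C1: sum of x, sum of x^2
sum2, sumsq2 = 13.4, 30.70    # station C2

def var_from_sums(s, ss, n):
    # s^2 = [n * sum(x^2) - (sum x)^2] / [n(n - 1)]
    return (n * ss - s * s) / (n * (n - 1))

s1_sq = var_from_sums(sum1, sumsq1, r)   # about 0.107
s2_sq = var_from_sums(sum2, sumsq2, r)   # about 0.155
sp_sq = (s1_sq + s2_sq) / 2              # pooled, since r1 = r2
mean1, mean2 = sum1 / r, sum2 / r
t = (mean2 - mean1) / math.sqrt(sp_sq * (1 / r + 1 / r))
print(round(t, 2))  # about 5.7; the text's 5.64 uses rounded intermediates
```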
Step 6: Inequality of variances. It may happen that the F test of step 2 rejects the
hypothesis of equality of variances. When this happens, it is inappropriate
to calculate the pooled variance sp2 because there is no one variance that
describes the variability in the data. Under such circumstances, we may

appeal to the Welch–Aspin test,3 which can be regarded as a modification


of the t test in which

    t = (X̄1 − X̄2) / √(s1²/r1 + s2²/r2)

with

    df ≅ (r1 − 1)(r2 − 1)(s1²/r1 + s2²/r2)² / [(r2 − 1)(s1²/r1)² + (r1 − 1)(s2²/r2)²]

The test proceeds in the manner described above for the standard t test.
Note, a little algebra will show that when r1 = r2 = r,

    df ≅ (r − 1)(1 + F)² / (1 + F²)

where F is the ratio of the two variances

    F = s2²/s1²
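The equal-sample-size shortcut can be checked numerically against the general formula. The following sketch (ours, with the text's variances as input) confirms the reduction:

```python
# Numerical check that the Welch-Aspin df formula with r1 = r2 = r
# reduces to df = (r - 1)(1 + F)^2 / (1 + F^2), where F = s2^2 / s1^2.
def welch_df(s1_sq, s2_sq, r1, r2):
    num = (r1 - 1) * (r2 - 1) * (s1_sq / r1 + s2_sq / r2) ** 2
    den = (r2 - 1) * (s1_sq / r1) ** 2 + (r1 - 1) * (s2_sq / r2) ** 2
    return num / den

r, s1_sq, s2_sq = 6, 0.107, 0.155   # variances from the example above
F = s2_sq / s1_sq
shortcut = (r - 1) * (1 + F) ** 2 / (1 + F ** 2)
assert abs(welch_df(s1_sq, s2_sq, r, r) - shortcut) < 1e-9
print(round(shortcut, 1))  # about 9.7, close to the pooled df of 10
```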

Some Comparisons of t Test and ANOM in Analyzing Data from Table 13.2
In Figure 13.1, both range points fall below UCL(R), and we accept homogeneity of
variability in the two processes. This agrees with the results of the F test above.

Then σ̂ = R̄/d2* = 0.95/2.60 = 0.365. This estimate σ̂ agrees closely with the pooled
estimate sp = 0.362 in Step 3.

The decision lines in Figure 13.2a are drawn about C̄ at a distance ±Hα σ̂X̄. It can be
shown that for k = 2

    ±Hα σ̂X̄ = ±(1/2) tα sp √(2/r)        that is,        Hα = tα/√2

Thus, the decision between the two process averages may be made by looking at
Figure 13.2a instead of looking in the t table. The ANOM is just a graphical t test when
k = 2. It becomes an extension of the t test when k > 2.
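The relation Hα = tα/√2 is easy to check numerically. In the sketch below (ours), the two-sided t0.01 critical value for 9 df, about 3.25, is an assumed value taken from standard t tables rather than from this section:

```python
import math

# Check H_alpha = t_alpha / sqrt(2) for k = 2, using df = 9 as in the
# capacitance example; t_01 is the assumed two-sided t_0.01 critical value.
t_01 = 3.250
h_01 = t_01 / math.sqrt(2)
print(round(h_01, 2))  # about 2.30, matching H_0.01 in Table A.14
```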
When r1 ≠ r2 or when r is not small, we use Equation (4.8) in estimating ŝ for
ANOM. When r1 = r2 = r is small—say less than 6 or 7, the efficiency of the range in
estimating ŝ is very high (see Table 4.2); the loss in degrees of freedom (df) is only

3. B. L. Welch, “The Generalization of Student’s Problem when Several Different Population Variances are Involved,”
Biometrika 34 (1947): 28–35.

about 10 percent as we have seen. It is, of course, possible to increase the sample size
to compensate for this loss of efficiency.

13.5 MAGNITUDE OF THE DIFFERENCE BETWEEN TWO MEANS
At least as important as the question of statistical significance is the question of practi-
cal or economic significance. The observed sample difference in capacitance in Table
13.2 is

    Δ = C̄2 − C̄1 = 2.23 − 1.05 = 1.18

This was found statistically significant. It is now the scientist or engineer who must
decide whether the observed difference is large enough to be of practical interest. If the
data were not coded, it would be possible to represent the change as a percent of the
average, C̄ = 1.640. In many applications, a difference of one or two percent is not of
practical significance; but a difference of 10 percent or so would often be of great interest. The
decision must be made for each case, usually by design, process, or quality engineers.
If the study in Table 13.2 were repeated with another pair of samples for stations C1

and C2, we would not expect to observe exactly the same average difference Δ as
observed this first time. (However, we would expect the average difference for r = 6 to
be statistically significant, risk α ≅ 0.01.) The confidence limits on the difference are
given (for any risk α and k = 2) by the two extremes

    Δ1 = (C̄2 − C̄1) + 2Hα σ̂X̄        risk α        (13.4)

    Δ2 = (C̄2 − C̄1) − 2Hα σ̂X̄

We have, with risk α = 0.01, for example:

    Δ1 = 1.18 + 2(2.30)(0.365/√6) = 1.87

and

    Δ2 = 1.18 − 2(2.30)(0.365/√6) = 0.49

Alternatively, we found the effects of C1 and C2 to differ by 1.18 units; the two
processes that C1 and C2 represent may actually differ by as much as 1.87 units or as
little as 0.49 units. Thus, in Equation (13.4), the experimenter has a measure of the

extreme differences that can actually be found in a process because of shifting between
levels C1 and C2, with risk α.
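Equation (13.4) can be sketched in a few lines; the inputs below are the example's quoted values:

```python
import math

# Confidence limits on the difference between two process averages,
# Equation (13.4), for the capacitance example.
delta = 1.18        # observed C2-bar minus C1-bar
h_alpha = 2.30      # H_0.01 for k = 2, df = 9
sigma_hat = 0.365   # R-bar / d2*
r = 6

half_width = 2 * h_alpha * sigma_hat / math.sqrt(r)
delta_1 = delta + half_width
delta_2 = delta - half_width
print(f"{delta_1:.2f} {delta_2:.2f}")  # 1.87 0.49
```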
Sometimes, the observed difference may not be of practical interest in itself but may
suggest the possibility that a larger change in the independent variable might produce a
larger effect, which would then be of interest. These are matters to discuss with the engi-
neer or scientist.

Case History 13.2


Height of Easter Lilies on Date of First Bloom4
Botanists have learned that many characteristics of plants can be modified by man. For
example, “Easter lilies” grown normally in the garden in many states bloom in July or
August—not at Easter time. You would probably not give a second thought to such char-
acteristics as the range of heights you would favor when buying an Easter lily or the
number of buds and blooms you would prefer, but they are important factors to the hor-
ticulturist. The referenced study employed a more complex design than either the one
presented here or the 2² design in Table 14.2. Botanists and agriculturists usually have
to wait through one or more growing seasons to acquire data. Their experimental
designs often need to be quite complicated to get useful information in a reasonable
period of time. Industrial troubleshooting and process improvement can often move
much faster; additional information can often be obtained within a few hours or days.
Several less-complicated experiments are usually the best strategy here. This is one
reason for our emphasis on three designs: the 2², the 2³, and the fractional factorial. A
study was made of the height of Easter lilies on the date of first bloom. Under two dif-
ferent conditioning times after storage, T1 and T2, of Easter lily bulbs (all other factors
believed to have been held constant), the measured heights of plants were

    Condition:   T1      T2
                 28      31
                 26      35
                 30      31
          T̄1 = 28.0     32.3 = T̄2
          R1 = 4          4 = R2

These heights are plotted in Figure 13.3.


Analysis
• The Tukey–Duckworth count is a + b = 6, which is significant at α ≅ 0.10.
• A second analysis (ANOM). In Figure 13.4, points corresponding to T̄1 and T̄2
fall outside the α ≅ 0.10 lines and inside the 0.05 lines.

4. R. H. Merritt, “Vegetative and Floral Development of Plants Resulting from Differential Precooling of Planted
Croft Lily Bulbs,” Proceedings of the American Society of Horticultural Science 82 (1963): 517–25.

(Individual plant heights for conditions T1 and T2 plotted about X̿ = 30.15; T̄1 = 28.0, T̄2 = 32.3.)
Figure 13.3 Heights of lilies under two different storage conditions.

(ANOM chart, ng = r = 3: T̄1 = 28.0 and T̄2 = 32.3 plotted about X̿ = 30.15; decision lines UDL(.05) = 32.66, UDL(.10) = 32.08, LDL(.10) = 28.22, LDL(.05) = 27.64.)
Figure 13.4 Comparing average heights of lilies under two different conditions (ANOM).

Conclusion
From this analysis, there is some evidence (risk less than 0.10 and greater than 0.05)
that a change from condition T1 to T2 may produce an increase in the height. The amount
of increase is discussed in the following section. The choice of conditions to use in rais-
ing Easter lilies and/or whether to study greater differences in levels of T must be made
by the scientist. (Also see Case History 14.1.)
Mechanics of Computing Decision Lines (Figure 13.4)

    σ̂ = R̄/d2* = 4.0/1.81 = 2.21

    σ̂X̄ = σ̂/√r = 2.21/√3 = 1.28

    df ≅ (0.9)k(r − 1) = (0.9)(2)(2) = 3.6



Or from Table A.11, df = 3.8 ≅ 4.


Decision Lines

    X̿ ± Hα σ̂X̄

α = 0.05:
    UDL = 30.15 + 2.51 = 32.66
    LDL = 30.15 − 2.51 = 27.64

α = 0.10:
    UDL = 30.15 + 1.93 = 32.08
    LDL = 30.15 − 1.93 = 28.22

Magnitude of Difference
For α = 0.10:

    Δ1 = (T̄2 − T̄1) + 2H0.10 σ̂X̄ = 4.30 + 2(1.93) = 4.30 + 3.86 = 8.16 in.

    Δ2 = (T̄2 − T̄1) − 2H0.10 σ̂X̄ = 4.30 − 2(1.93) = 4.30 − 3.86 = 0.44 in.

Thus the expected average difference may actually be as small as 0.44 in. or as large
as 8.16 in., with risk α = 0.10.
For α = 0.05:

    Δ1 = (T̄2 − T̄1) + 2H0.05 σ̂X̄ = 4.30 + 2(2.51) = 4.30 + 5.02 = 9.32 in.

    Δ2 = (T̄2 − T̄1) − 2H0.05 σ̂X̄ = 4.30 − 2(2.51) = 4.30 − 5.02 = −0.72 in.


A negative sign on Δ2 means that there is actually a small chance that condition T1
might produce taller plants than T2; it is a small chance but a possibility when consid-
ering confidence limits at α = 0.05.

Case History 13.3


Vials from Two Manufacturing Firms
The weights in grams of a sample of 15 vials manufactured by firm A and 12 vials by
firm B are given in Table 13.4. Are vials manufactured by firm A expected to weigh sig-
nificantly more than those manufactured by firm B?
We shall discuss the problem from different aspects.

Table 13.4 Data: vials from two manufacturing firms.


Firm Weight, grams
A: 7.6, 8.3, 13.6, 14.9, 12.7, 15.6, 9.1, 9.3,
11.7, 9.6, 10.7, 8.0, 9.4, 11.2, 12.8 (r1 = 15)
B: 7.1, 7.6, 10.1, 10.1, 8.7, 7.2, 9.5, 10.2,
9.5, 9.0, 7.3, 7.4 (r2 = 12)

(Individual vial weights for firm A, r1 = 15, and firm B, r2 = 12, plotted about Ā = 10.97 and B̄ = 8.64.)
Figure 13.5 Weights of individual vials from two manufacturing firms.

Informal Analysis 1
We begin by plotting the data in a single graph (Figure 13.5). We note that all
observations from firm B lie below the average Ā of firm A. Little additional formal analysis is
necessary to establish that the process average of firm A exceeds the process average of
firm B.
Analysis 2
The required conditions for the Tukey–Duckworth test are satisfied, and the counts are:
a + b = 8 + 4.5 = 12.5. This count exceeds the critical count of nine for risk α ≅ 0.01
for a one-sided test. This is in agreement with analysis 1.
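The end counts can be reproduced directly from Table 13.4. This sketch (ours) follows the usual Tukey–Duckworth convention of counting a value that ties the other sample's extreme as 1/2:

```python
# Tukey-Duckworth end count for the vial weights of Table 13.4.
firm_a = [7.6, 8.3, 13.6, 14.9, 12.7, 15.6, 9.1, 9.3,
          11.7, 9.6, 10.7, 8.0, 9.4, 11.2, 12.8]
firm_b = [7.1, 7.6, 10.1, 10.1, 8.7, 7.2, 9.5, 10.2,
          9.5, 9.0, 7.3, 7.4]

def end_count(high, low):
    # a = count of `high` values above all of `low`;
    # b = count of `low` values below all of `high`;
    # a tie with the other sample's extreme contributes 1/2.
    a = sum(1.0 if x > max(low) else 0.5 if x == max(low) else 0.0
            for x in high)
    b = sum(1.0 if x < min(high) else 0.5 if x == min(high) else 0.0
            for x in low)
    return a, b

a, b = end_count(firm_a, firm_b)
print(a, b, a + b)  # 8.0 4.5 12.5
```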
Analysis 3
Clearly, the Student t test is inappropriate here because the variances of the vials from
firm A and firm B are unequal. This is indicated by an F test as shown in Case History
4.1. In this case, the Welch–Aspin test is in order. We calculate

    t = (Ā − B̄) / √(sA²/r1 + sB²/r2)

    Ā = 10.97        B̄ = 8.64
    sA² = 6.34       sB² = 1.55
    r1 = 15          r2 = 12

    t = (10.97 − 8.64) / √(6.34/15 + 1.55/12) = 2.33/0.74 = 3.15

with degrees of freedom

    df ≅ (r1 − 1)(r2 − 1)(sA²/r1 + sB²/r2)² / [(r2 − 1)(sA²/r1)² + (r1 − 1)(sB²/r2)²]

       = (14)(11)(6.34/15 + 1.55/12)² / [(11)(6.34/15)² + (14)(1.55/12)²]

       = 46.90/2.20 ≅ 21.3 ≈ 21

The t table shows t0.01 = 2.83 with 21 df and so the critical value is clearly exceeded
with a one-sided risk less than 0.005.
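A computational check of the Welch–Aspin statistic and its approximate degrees of freedom, using the raw weights of Table 13.4 (a sketch; small differences from t = 3.15 and df = 21.3 come from hand rounding):

```python
import math
from statistics import mean, variance

# Welch-Aspin t for the Table 13.4 vial weights (unequal variances).
firm_a = [7.6, 8.3, 13.6, 14.9, 12.7, 15.6, 9.1, 9.3,
          11.7, 9.6, 10.7, 8.0, 9.4, 11.2, 12.8]
firm_b = [7.1, 7.6, 10.1, 10.1, 8.7, 7.2, 9.5, 10.2,
          9.5, 9.0, 7.3, 7.4]

r1, r2 = len(firm_a), len(firm_b)
va, vb = variance(firm_a), variance(firm_b)   # sample variances
t = (mean(firm_a) - mean(firm_b)) / math.sqrt(va / r1 + vb / r2)
df = ((r1 - 1) * (r2 - 1) * (va / r1 + vb / r2) ** 2
      / ((r2 - 1) * (va / r1) ** 2 + (r1 - 1) * (vb / r2) ** 2))
print(round(t, 2), round(df, 1))  # close to the text's 3.15 and 21.3
```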

Discussion
Thus, the three analyses are in agreement that the average product expected from firm
A should exceed that of firm B, provided the basic assumptions are satisfied.
However, consider the patterns of the data in Figure 13.5. There are four consecutive
vials from firm A that are appreciably higher than the others. There is enough evidence
to raise certain important questions:
1. Is the product from firm A of only one kind, or is it of two or more kinds?
2. Do the four high values from firm A represent higher weights, or is the test
equipment unstable (in view of the consecutive peculiarities)?
3. Summary: Is it possible to predict future weights?

Comments
Two basic assumptions in comparing samples from two populations are that two stable
and normally distributed populations are being sampled. In this example, these
assumptions are surely not justified. Of course, if a decision must now be made to
choose the firm producing “heavier” vials, then firm A would be the choice. However,
it will then be prudent to sample succeeding lots of product to ensure that the noncon-
trolled process of firm A does not begin producing vials of a low weight.
We seldom find two sets of data where it is adequate to be satisfied with a single
routine test (such as a t test or a Tukey test). Additional worthwhile information comes
from “looking at” the data in two or more different ways. Many patterns of nonran-
domness occur. Some statistical tests are robust in detecting one pattern, some in detect-
ing others.

Case History 13.4


Average of Electronic Devices
It is the usual experience to find one set of data (or both) originating from a nonstable
source instead of just one stable source as often assumed. Let us consider the data in
Table 13.5 pertaining to two batches of nickel cathode sleeves (see Example 4.2). Using
cathode sleeves made from one batch of nickel (melt A), a group of 10 electronic
devices was processed; then an electrical characteristic (transconductance Gm) was
read on a bridge. Using nickel cathode sleeves from a new batch of nickel (melt B), a
second group of 10 devices was processed and Gm read. Is there evidence that devices

Table 13.5 Data: measurements on electronic


devices made from two batches of
nickel cathode sleeves.
Melt A Melt B
4760 6050
5330 4950
2640 3770
5380 5290
5380 6050
2760 5120
4140 1420
3120 5630
3210 5370
5120 4960
    Ā = 4184.0        B̄ = 4861.0
    (n1 = 10)         (n2 = 10)

                      B̄′ = 5243.3
                      (n2′ = 9)

processed from melt B will average a higher Gm than devices from melt A, as is hoped?
(See Figure 4.2.)
Analysis: A Routine Procedure (Not Recommended)
If we were to proceed directly with a formal t test, we would first average each group
and compute each variance

    Ā = 4184.0             B̄ = 4861.0
    sA² = 1,319,604.4      sB² = 1,886,387.8

Since r1 = r2, we use the simplified form of the t test from Section 13.4, Step 3,

    sp² = (sA² + sB²)/2 = 1,602,996.1

Then

    sp = 1266.1   and   σ̂Δ = sp √(2/r) = 1266.1 √(2/10) = 566.2

and

    t = 677/566.2 = 1.20        df = n1 + n2 − 2 = 18

Critical values of t corresponding to 18 df are

t0.20 = 1.330
and
t0.10 = 1.734

Thus, we do not have evidence of statistical significance even with risk α ≅ 0.20.
This routine application of the t test is not recommended. The suspected outlier of
1420 in melt B has major effects on the result of a t test, as well as on the average of the
process using melt B.
Further Analysis
Consider again the data in Table 13.5 (Figure 4.2). After excluding 1420 (the seventh
point) in melt B as a suspected maverick, the Tukey counts are easily seen from Figure
4.2 to be

a = 3 (number in melt B larger than any in melt A)


b = 4 (number in melt A smaller than any in melt B)

and
(a + b) = 7
This test on the modified data indicates that the Gm of devices made from nickel
cathode sleeves from melt A will average lower than those made from melt B (with risk
of about α = 0.05).
Analysis: Student’s t Test Applied to the Modified Data
We may now recompute the t test after excluding the 1420 observation.

    Ā = 4184.0 (r1 = 10)        B̄′ = 5243.3 (r2 = 9)

    sA² = 1,319,604             sB′² = 477,675

Then from Equation (13.3)

    sp = √{[9(1,319,604) + 8(477,675)]/17} = √923,402 = 960.9

From Equation (13.1)

    sp √(1/10 + 1/9) = (960.9) √(19/90) = 441.5

Then

    t = (B̄′ − Ā)/441.5 = 1059.3/441.5 = 2.40        df = 17

From Appendix Table A.15, critical values of t for 17 df are

t0.02 = 2.567
t0.05 = 2.110

Consequently, this t test (on the modified data) indicates a significant difference
between population means with risk about α ≅ 0.03 or 0.04. This result is consistent
with analysis 2 but not with analysis 1.
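The modified-data computation can be replayed in a few lines (a sketch, with the suspect 1420 dropped as in the text):

```python
import math
from statistics import mean, variance

# Pooled t on the modified melt data (1420 excluded from melt B).
melt_a = [4760, 5330, 2640, 5380, 5380, 2760, 4140, 3120, 3210, 5120]
melt_b = [6050, 4950, 3770, 5290, 6050, 5120, 5630, 5370, 4960]  # 1420 removed

r1, r2 = len(melt_a), len(melt_b)
# Equation (13.3): pooled variance for unequal sample sizes
sp_sq = ((r1 - 1) * variance(melt_a) + (r2 - 1) * variance(melt_b)) / (r1 + r2 - 2)
se = math.sqrt(sp_sq) * math.sqrt(1 / r1 + 1 / r2)
t = (mean(melt_b) - mean(melt_a)) / se
print(round(t, 2))  # 2.4, with df = r1 + r2 - 2 = 17
```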
Conclusion
Clearly, the one suspect observation in melt B has a critical influence on conclusions
about the two process averages. Now it is time to discuss the situation with the engineer.
The following points are pertinent:

• The melt B average is an increase of about 25 percent over the melt A average.
Is the increase enough to be of practical interest? If not, whether 1420 is left in
or deleted is of no concern.
• There is a strong suspicion that the data on melt A comes from two sources
(see Figure 4.2 and the discussion of Example 4.2). Are there two possible
sources (in the process) that may be contributing two-level values to Gm from
the nickel of melt A?
• It appears that a serious study should be made of the stability of the manu-
facturing process when using any single melt. The question of “statistical
significance” between the two melt-sample averages may be less important
than the question of process stability.

13.6 PRACTICE EXERCISES


1. Use moving ranges (ng = 2) on the following:
a. The data on melt A, n = 10, to obtain ŝ A and compare this with
sA = 1149
b. The data on melt B´, n = 9, to obtain ŝ B´ and compare with sB´ = 691.1
2. Plot moving average and range charts, ng = 2, for the data on firm A in
Case History 13.3. What evidence does this present, if any, regarding the
randomness of the data on firm A?
3. Consider all 15 + 12 = 27 points from the two firms in Case History 13.3,
and repeat the procedure of Exercise 2. Does the number and/or length of
runs provide evidence of interest?
4. What do we conclude by applying the Tukey test to the ranges of Figure 4.3?
Are the conclusions using the range-square ratio test FR and Tukey’s test in
reasonable agreement?
5. Compare the process averages represented by the samples from two machines,
Table 4.3. Possibilities include:
a. Dividing each group into subgroups (ng = 5, for example)
b. Using a t test on all 25 observations in each group; or perhaps the first
(or last, or both) five or 10 of each.
6. Construct a data set and pick a significance level such that Tukey’s test and
the t test give opposite results. Which is more likely to be correct? Why would
you prefer one of these tests to the other?

7. Use the data set from Exercise 6 to show that, for k = 2, the t test and
ANOM are roughly equivalent. (The only difference is in the use of s
or R̄ to estimate sigma.)
8. The following data were acquired from two parallel sets of processing
equipment. Examine them with the Tukey–Duckworth, the t test, and the
F test to determine whether the null hypothesis, H0, of no difference between
them can be rejected or accepted.

Equipment A Equipment B
40 53
22 61
25 30
37 45
20 39
26 32
27 32
28 42
47 38
52 45
59
50

9. Consider the following:


a. In Exercise 6 of Chapter 4, use both the Tukey–Duckworth and the t test
to attempt to reject H0, the null hypothesis of no difference.
b. Graph the X̄ and R values for each of the five samples on X̄ and R control
charts. Do both process centers appear to be operating in good control?
Why or why not?
c. Conduct an ANOM on the two process centers. Explain the difference in
philosophy between the X̄ control chart and the ANOM.
10. Set up a Shewhart chart with n = 3 using the data on firm A from Table 13.4.
Plot firm B on this chart. What does this tell you?
11. Perform analysis of means on the data from Table 13.4. Use α = 0.05 and
0.01 limits.
14
Troubleshooting
with Variables Data

14.1 INTRODUCTION
This chapter covers 2², 2³, and fractional factorial designs for measurement data, which
are the fundamental structures for troubleshooting experiments in industry.
The ideas on troubleshooting with attributes data discussed in Chapter 11 are
equally applicable when using variables data. Identifying economically important
problems, enlisting the cooperation of plant and technical personnel, deciding what
independent variables and factors to include—these are usually more important than the
analysis of resulting data. These remarks are repeated here to emphasize the importance
of reasonable planning prior to the collection of data. The reader may want to review
the ideas of Chapters 9 and 11 before continuing.
This chapter will discuss three very important designs: two-level factorial designs,1
2² and 2³, and the “half-replicate” design, 1/2 × 2³. These are very effective designs, espe-
cially when making exploratory studies and in troubleshooting. In Chapter 15, some case
histories employing more than two levels of some independent variables are discussed.
Results from a study involving the simultaneous adjustment of two or more inde-
pendent variables are sometimes not readily accepted by those outside the planning
group. For many years, engineers and production supervisors were taught to make stud-
ies with just one independent variable at a time. Many are skeptical of the propriety of
experiments that vary even two independent variables simultaneously. Yet they are the
ones who must accept the analysis and conclusions if the findings in a study are to be
implemented. If they are not implemented, the experiment is usually wasted effort. It is
critical that an analysis of the data be presented in a form that is familiar to engineers
and which suggests answers to important questions. For these reasons, the graphical
analysis of means is recommended and emphasized here.

1. Chapter 11, Secs. 11.10, 11.11, and 11.12.


14.2 SUGGESTIONS IN PLANNING INVESTIGATIONS—PRIMARILY REMINDERS
The two “levels” of the independent variable may be of a continuous variable that is also
recognized as a possible causative variable. Then we may use a common notation: A– to
represent the lower level of the variable and A+ to represent the higher level. Frequently,
however, the two “levels” should be omnibus-type variables as discussed in Chapter 9:
two machine/operator combinations, two shifts, two test sets, two vendors. Then we may
use a more representative notation such as A1 and A2 to represent the two levels.
Some amount of replication is recommended, that is, r > 1 and perhaps as large as
5 or 6. In many investigations, there is little difficulty in getting replicates at each com-
bination of the independent variables. A single replicate may possibly represent the
process adequately if the process is actually stable during the investigation, which is an
assumption seldom satisfied. It should certainly be checked beforehand. Outliers and
other evidences of an unstable process are common even when all known variables are
held constant (Chapter 3). It is even more likely that a single observation would be inad-
equate when two or three variables are being studied at different levels in a designed
experiment.
There may be exceptional occasions where it is practical and feasible to use a
design with four or five variables, each at two levels (a 2⁴ or a 2⁵ design or a fraction
thereof). However, leave such complexities to those experienced in such matters.
Any study that requires more than 15 or 20 trays of components to be carried about
manually in the plant will require extreme caution in planning and handling to prevent
errors and confusion in identification. It is very difficult to maintain reasonable surveil-
lance when only eight or 10 trays must be routed and identified through production stages.
Scientists often use appreciably more data than a statistician might recommend for
the following reasons:
1. It has been traditional in many sciences to use very large sample sizes and to
have little confidence in results from smaller samples.
2. Any analysis assumes that the sample chosen for the study is representative of
a larger population; a larger sample may be required to satisfy this important
condition. Usually, however, replicates of three, four, or five are adequate.

Evolutionary Operation
The 2² design with its center point is the basis of the well-known evolutionary operation
(EVOP).2 It has appeal especially for the chemical industry in its efforts to increase
process yields. During planning of a typical EVOP program, two independent process
variables, A and B, are selected for study. They may be temperature, throughput rate, per-
cent catalyst, and so on. Two levels of each variable are chosen within manufacturing

2. G. E. P. Box, “Evolutionary Operation: A Method for Increasing Industrial Productivity,” Applied Statistics 6
(1957): 81–101.

specifications, one toward the upper specification limit and one toward the lower specifi-
cation limit. Since the difference in levels is usually not large, many replicates may be
needed to establish significant differences. The process is allowed to run until enough data
have been collected to declare significance. In this way, the process may be said to evolve
to establish the extent and direction of useful changes in the variables involved.

14.3 A STATISTICAL TOOL FOR PROCESS CHANGE


The secret of process change is not only in analysis, but also in action. It is not enough
to find the cause of a problem; the solution must be implemented. Controls must be set
up to ensure that the problem does not occur again. There is no better way to get action
than to have relevant data presented in a way that anyone can understand. Moreover,
there is no better way to understand and present data than with a graph. In other words,
“plot the data.”
A process improvement program requires that sources of variability normally con-
cealed in random error be identified. The search is for common causes. This would
imply changes in level less than 3σX̄ for the sample size used. This search can be accom-
plished through observation and interpretation of data using cumulative sum charts or
Shewhart charts made more sensitive through runs analysis, warning limits, increased
sample size, and the like. However, this is very difficult since it is not easy to associate
a small change in level detected by such a chart with its cause. This is precisely why
Shewhart recommended 3σX̄ with small sample sizes of 4 or 5. This means that to
achieve process improvement after control is attained, one must resort to statistically
designed experiments rather than interpretation of data.
There are varieties of methods used in the analysis and presentation of the results
of designed experiments, but none is more appropriate to the analysis and presentation
of data to nonstatisticians than analysis of means. The reason for this is that it is graph-
ical. It uses the Shewhart chart as a vehicle for the analysis and presentation of data. As
such, it has all the advantages of Shewhart procedure. The limit lines are slightly more
difficult to calculate, but this difficulty is transparent to the observer of the chart. The
simple statement that “if there is no real difference between the experimental treat-
ments, the odds are (1 – a)/a to 1 that the points will plot within the limits” will suffice
for statistical interpretation. From that point, the chart and its use are as familiar from
plant to boardroom as a Shewhart control chart.
The control chart has often been used in the exposition and analysis of designed
experiments. Excellent early examples may be seen in the work of Dr. Grant Wernimont.3,4
The analysis of means as developed by Ott5 is such a procedure, but differs from the

3. G. Wernimont, “Quality Control in the Chemical Industry II: Statistical Quality Control in the Chemical
Laboratory,” Industrial Quality Control 3, no. 6 (May 1947): 5.
4. G. Wernimont, “Design and Interpretation of Interlaboratory Studies of Test Methods,” Analytical Chemistry 23
(November 1951): 1572.
5. E. R. Ott, “Analysis of Means,” Rutgers University Statistics Center Technical Report, no. 1 (August 10, 1958).

traditional control chart approach in that probabilistic limits are employed that adjust for
the compound probabilities encountered in comparing many experimental points to one set
of limits. Thus, this method gives an exact estimate of the a risk over all the points plotted.

14.4 ANALYSIS OF MEANS FOR MEASUREMENT DATA
The basic procedure is useful in the analysis of groups of points on a control chart, in
determining the significance and direction of response of the means resulting from main
effects in an analysis of variance, as a test for outliers, and in many other applications.
Steps in the application of analysis of means to a set of k subgroups of equal size ng are
as follows:

1. Compute X̄i, the mean of each subgroup, for i = 1, 2, . . . , k

2. Compute X̿, the overall grand mean
3. Compute σ̂e, the estimate of experimental error, as

    σ̂e = R̄/d2*

where

    R̄ = the mean of the subgroup ranges
    d2* = Duncan’s adjusted d2 factor given in Appendix Table A.11

with degrees of freedom, df = ν, as shown in Table A.11. For values of d2*
not shown, the experimental error can be estimated as

    σ̂e = R̄ / { d2 [k(ng − 1) + 0.2778] / [k(ng − 1)] }

where d2 = factor for calculating the centerline of an R control chart when σ is
known. The degrees of freedom for the estimate may then be approximated by

    ν ≈ 0.9k(ng − 1)

Clearly, other estimates of σ may be used if available.
Clearly, other estimates of s may be used if available.

4. Plot the means X̄i in control chart format against decision limits computed as

    X̿ ± Hα σ̂e/√ng

where Hα = Ott’s factor for analysis of means given in Appendix Table A.8
5. Conclude that the means are significantly different if any point plots outside
the decision limits.
The above procedure is for use when no standards are given, that is, when the mean
and standard deviation giving rise to the data are not known or specified. When stan-
dards are given, that is, in testing against known values of μ and σ, the Hα factor shown
in Table A.8 in the row marked SG should be used. The limits then become:

    μ ± Hα σ/√ng

The values for Hα shown in Appendix Table A.8 are exact values for the studentized
maximum absolute deviate in normal samples as computed by L. S. Nelson6 and adapted
by D. C. Smialek7 for use here to correspond to the original values developed by Ott.
Values for k = 2 and SG are as derived by Ott.8

14.5 EXAMPLE: MEASUREMENT DATA


Wernimont presented the results of a series of Parr calorimeter calibrations on eight dif-
ferent days.9 The results of samples of four in BTU/ lb./°F are:

Day X R
1 2435.6 13.5
2 2433.6 11.3
3 2428.8 10.0
4 2428.6 5.6
5 2435.9 18.7
6 2441.7 13.6
7 2433.7 9.9
8 2437.8 14.8
Average 2434.5 12.2

Analysis of means limits to detect differences between days, with a level of risk of
α = 0.05, may be determined as follows:
• An estimate of the experimental error is

6. L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1
(January 1983): 40–44.
7. E. G. Schilling and D. Smialek, “Simplified Analysis of Means for Crossed and Nested Experiments,’’ 43d Annals
of the Quality Control Conference, Rochester Section, ASQC (March 10, 1987).
8. E. R. Ott, “Analysis of Means,” Rutgers University Statistics Center Technical Report, no. 1 (August 10,
1958).
9. G. Wernimont, “Quality Control in the Chemical Industry II: Statistical Quality Control in the Chemical
Laboratory,” Industrial Quality Control 3, no. 6 (May 1947): 5.
436 Part III: Troubleshooting and Process Improvement

[Figure 14.1 Analysis of means plot; Parr calorimeter determination. The daily averages X̄i are plotted against decision limits at 2426.3 and 2442.7 about the centerline X̄ = 2434.5.]

σ̂e = R̄ / d2* = 12.2 / 2.08 = 5.87
df = 22.1 ≈ 22

• The analysis of means limits are

X̄ ± Hα σ̂e / √ng

2434.5 ± 2.80 (5.87 / √4)
2434.5 ± 8.22

The analysis of means plot is shown in Figure 14.1 and does not indicate significant
differences in calibration over the eight days at the 0.05 level of risk.
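The arithmetic above can be verified with a short script. A sketch, using the tabled constants quoted in the text (d2* = 2.08 and Hα = 2.80); variable names are ours:

```python
# Reproducing the Parr calorimeter analysis (k = 8 days, n_g = 4).
x_bars = [2435.6, 2433.6, 2428.8, 2428.6, 2435.9, 2441.7, 2433.7, 2437.8]
ranges = [13.5, 11.3, 10.0, 5.6, 18.7, 13.6, 9.9, 14.8]

d2_star = 2.08        # Table A.11, k = 8 groups of size 4
H_alpha = 2.80        # Table A.8, k = 8, df ~ 22, alpha = 0.05

grand_mean = sum(x_bars) / len(x_bars)        # ~2434.46
R_bar = sum(ranges) / len(ranges)             # 12.175
sigma_hat = R_bar / d2_star                   # ~5.85
half_width = H_alpha * sigma_hat / 4 ** 0.5   # ~8.2

udl = grand_mean + half_width
ldl = grand_mean - half_width
outside = [x for x in x_bars if x < ldl or x > udl]
```

All eight daily means fall inside the limits (`outside` is empty), agreeing with the conclusion that the calibrations do not differ significantly at the 0.05 level.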

14.6 ANALYSIS OF MEANS: A 2² FACTORIAL DESIGN


The method of Chapter 13 is now extended to this very important case of two indepen-
dent variables (two levels of each).

Main Effects and Two-Factor Interaction



In Figure 14.2, the X̄ij represent the average quality characteristic or response under the
indicated conditions. If the average response Ā1 under conditions B1 and B2 is statistically

            A1                  A2
        (1)                 (2)
B1      X̄11                 X̄21                 B̄1 = ½(X̄11 + X̄21)
        r                   r
        (3)                 (4)
B2      X̄12                 X̄22                 B̄2 = ½(X̄12 + X̄22)
        r                   r
        Ā1                  Ā2
        Ū = ½(X̄21 + X̄12)    L̄ = ½(X̄11 + X̄22)

Figure 14.2 A general display of data in a 2² factorial design, r replicates in each average X̄ij.


different from the average response Ā2 under the same two conditions, then factor A is
said to be a significant main effect.
    Consider the X̄ij in Figure 14.2 to be averages of r replicates. For simplicity, we
shall abbreviate the notation by writing, for example,

(1, 3) for (1) + ( 3)  = ( X11 + X12 )


1 1 1
A1 =
2 2 2

then

A1 =
1
2
(1, 3) B1 =
1
2
(1, 2) L=
1
2
(1, 4 )
A2 =
1
2
( 2, 4 ) B2 =
1
2
(3, 4 ) U=
1
2
( 2, 3)

where L again represents the average of the two combinations (1) and (4) of A and B

having like subscripts; U is the average of the two unlike combinations.
    As before, the mechanics of testing for a main effect with the analysis of means is
to compare Ā1 with Ā2 and B̄1 with B̄2 using decision lines

    X̄ ± Hα σ̂X̄    (14.1)

Actually, three comparisons can be tested against the decision lines in Equation
(14.1). The mechanics of testing for main effects with ANOM can be extended as fol-
lows to test for the interaction of A and B:

Ā1 = ½[(1) + (3)] versus Ā2 = ½[(2) + (4)] to test for an A main effect
B̄1 = ½[(1) + (2)] versus B̄2 = ½[(3) + (4)] to test for a B main effect
L̄ = ½[(1) + (4)] versus Ū = ½[(2) + (3)] to test for an AB interaction
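These three comparisons reduce to simple averaging of the four cell means. A sketch, not from the text (the function name is ours; cell numbering follows Figure 14.2):

```python
def two_by_two_comparisons(c1, c2, c3, c4):
    """Comparison averages for a 2x2 factorial, cells numbered as in
    Figure 14.2: (1)=A1B1, (2)=A2B1, (3)=A1B2, (4)=A2B2."""
    return {
        "A1": (c1 + c3) / 2,   # column average at level A1
        "A2": (c2 + c4) / 2,   # column average at level A2
        "B1": (c1 + c2) / 2,   # row average at level B1
        "B2": (c3 + c4) / 2,   # row average at level B2
        "L":  (c1 + c4) / 2,   # like-subscript diagonal
        "U":  (c2 + c3) / 2,   # unlike-subscript diagonal
    }
```

Given the four cell averages of any 2² table, the six returned values are exactly the points plotted on the ANOM chart.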

Discussion
A effect I. Change in response to A under condition B1 (A1 to A2) = (2) − (1) = X̄21 − X̄11
A effect II. Change in response to A under condition B2 (A1 to A2) = (4) − (3) = X̄22 − X̄12

Total change in A = [(2) + (4)] − [(1) + (3)]
Average change in A = ½([(2) + (4)] − [(1) + (3)])
This average change is called the main effect of A.
Definition: If the A effect I is different under B1 than A effect II under B2, there is
said to be a two-factor or AB interaction.
When the changes I and II are unequal,

I ≠ II
(2) – (1) ≠ (4) – (3)
then
(2) + (3) ≠ (4) + (1)

    But [(2) + (3)] is the sum of the two cross-combinations of A and B with unlike (U)
subscripts, while [(1) + (4)] is the sum of those with like (L) subscripts in Figure 14.2. Briefly,
when there is an AB interaction, the sum (and average) of the unlike combinations are
not equal to the sum (and average) of the like combinations.
Conversely, if the sum (and average) of the two terms with like subscripts equals
statistically10 the sum (and average) of the two with unlike subscripts

[(2) + (3)] = [(4) + (1)]


then
[(2) – (1)] = [(4) – (3)]
that is,
I = II

Theorem 14.1. To test for a two-factor interaction AB, obtain the cross-sums,
like [(1) + (4)] and unlike [(2) + (3)]. There is an AB interaction if, and only
if, the like sum is not equal to the unlike sum, that is, their averages are not
equal statistically.

10. Being equal statistically means that their difference is not statistically significant, risk α.

[Figure 14.3 A graphical interpretation of a two-factor interaction: the four combination averages (1), (2), (3), (4), each of ng = r observations, are plotted as X̄ against A1 and A2, with one line connecting the points at level B1 and another at level B2.]

It is both interesting and instructive to plot the four combination averages as shown
in Figure 14.3, and discussed in Chapter 10. It always helps in understanding and inter-
preting the meaning of the interaction. When [(2) + (3)] = [(1) + (4)], the two lines are
essentially parallel, and there is no interaction. Also, when they are not essentially par-
allel, there is an A × B interaction.
This procedure is summarized in Table 14.1 and will be illustrated in Case
History 14.1.

Case History 14.1


Height of Easter Lilies11

Introduction
Consider data from two independent variables in a study of Easter lilies raised in the
Rutgers University greenhouse. The two independent factors considered in this
analysis are:
1. Storage period (S). The length of time bulbs were stored in a controlled
dormant state, (S1 and S2).
2. Time (T). The length of time the bulbs were conditioned after the storage
period, (T1 and T2).
The researchers specified levels of S and T from their background of experience.
The quality characteristic (dependent variable) considered here is a continuous variable, the
height H in inches of a plant on the date of first bloom.

11. Other important independent variables and quality characteristics were reported in the research publication. See R.
H. Merritt, “Vegetative and Floral Development of Plants Resulting from Differential Precooling of Planted Croft
Lily Bulbs,” Proceedings of the American Society of Horticultural Science 82 (1963): 517–25.

Table 14.1 Analysis of means in a 2² factorial design, r replicates.

Step 1: Obtain and plot the four ranges. Find R̄ and D4R̄ and use the range chart as a check on
        possible outliers. Obtain d2* from Table A.11; compute
            σ̂ = R̄ / d2*
            σ̂X̄ = σ̂ / √ng = σ̂ / √(2r)
            df ≅ (0.9)k(r − 1) = 3.6(r − 1) for k = 4 ranges (or see Table A.11)
Step 2: Plot points corresponding to the two main effects and interaction (Figure 14.2).
            Ā1 = ½[(1) + (3)]    B̄1 = ½[(1) + (2)]    L̄ = ½[(1) + (4)]
            Ā2 = ½[(2) + (4)]    B̄2 = ½[(3) + (4)]    Ū = ½[(2) + (3)]
            X̄ = ¼[(1) + (2) + (3) + (4)]
Step 3: Obtain Hα from Table A.8 for k = 2 and df as calculated. Compute and draw lines at
            UDL = X̄ + Hα σ̂X̄
            LDL = X̄ − Hα σ̂X̄
        usually choosing α = 0.05 and then 0.10 or 0.01 to bracket the extreme sample averages.
Step 4: When a pair of points falls outside the decision lines, their difference indicates a statistically
        significant difference, risk about α. If points corresponding to L̄ and Ū fall outside (or near) the
        decision lines, graph the interaction as in Figure 14.3.

Note: In step 1, the value σ̂ = R̄/d2* is a measure of the within-group variation, an estimate of inherent
variability even when all factors are thought to be held constant. This within-group variation is used
as a yardstick to compare between-factor variation with the decision lines of step 3.
The well-known analysis of variance (ANOVA) measures within-group variation by a residual sum of
squares, σ̂e², whose square root σ̂e will be found to approximate σ̂ = R̄/d2* quite closely in most sets of
data. ANOVA compares between-group variation by a series of variance ratio tests (F tests) instead of
decision lines.
It is important to stress that a replicate is the result of a repeat of the experiment as a whole, and is not
just another unit run at a single setting, which is referred to as a determination.

Table 14.2 represents data from the four combinations of T and S, with r = 3 repli-
cate plants in each.
Formal Analysis (Figure 14.4)
Main Effects

S̄1 = 30.15    T̄1 = 34.65
S̄2 = 37.80    T̄2 = 33.30

Interaction

L̄ = 31.15    Ū = 36.80
X̄ = 33.98

Table 14.2 Height of Easter lilies (inches).

              S1                S2
        28  (1)           49  (2)
        26                37
T1      30                38
        X̄1 = 28.0         X̄2 = 41.3        T̄1 = 34.65
        R1 = 4            R2 = 12

        31  (3)           37  (4)
        35                37
T2      31                29
        X̄3 = 32.3         X̄4 = 34.3        T̄2 = 33.3
        R3 = 4            R4 = 8

        S̄1 = 30.15        S̄2 = 37.8
        Ū = 36.8          L̄ = 31.15

[Figure 14.4 ANOM of data from Table 14.2. (a) Heights of Easter lilies: the six points S̄1, S̄2 (storage period), T̄1, T̄2 (time), and L̄, Ū (interaction), each with ng = 2r = 6, plotted about X̄ = 33.98 with decision lines at 31.26 and 36.70 (α = 0.05) and at 29.95 and 38.01 (α = 0.01). (b) Ranges of heights (r = 3) about R̄ = 7.0 with UCL = D4R̄ = 17.99.]


All four R points fall below D4 R = 17.99 in Figure 14.4b. Then

σˆ = R / d 2* = 7.0 / 1.75 = 4.0


σˆ X = 4.0 / 6 = 1.63
df ≅ ( 0.9 ) k ( r − 1) = ( 0.9 )( 4 ) ( 3 − 1) = 7.2 ∼ 7

Decision Lines: Figure 14.4a


The risks have been drawn in parentheses at the right end of the decision lines.
• We decide that the major effect is storage period, with a risk between 0.05 and
0.01. There is also a two-factor interaction, risk slightly less than 0.05.
• If customers prefer heights averaging about 28 inches, then the combination
T1S1 is indicated. If the preference is for heights of about 38 inches, additional
evidence is needed since the 49-inch plant in (2) and the 29-inch plant in (4)
represent possible outliers.

X̄ ± Hα σ̂X̄:
For α = 0.05:
    UDL = 33.98 + (1.67)(1.63) = 36.70
    LDL = 33.98 − (1.67)(1.63) = 31.26
For α = 0.01:
    UDL = 33.98 + (2.47)(1.63) = 38.01
    LDL = 33.98 − (2.47)(1.63) = 29.95

But the effect of storage period is certainly substantial and the effect of time prob-
ably negligible.
Magnitude of the Difference
From Section 13.5, Equation (13.4), the magnitude of a difference may vary by ±2Hα σ̂X̄
from the observed difference. Then the expected magnitude of the S main effect in a
large production of Easter lily bulbs is

(S̄2 − S̄1) ± 2Hα σ̂X̄ = 7.65 ± 2(2.72)
                    = 7.65 ± 5.44    (α ≅ 0.05)

that is, by as much as 13.09 inches and as little as 2.21 inches.
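The ±2Hα σ̂X̄ band of Equation (13.4) translates directly into code. A small sketch, not from the text, using the lily numbers above (Hα = 1.67 and σ̂X̄ = 1.63 at α ≅ 0.05) to confirm the 2.21-to-13.09-inch interval:

```python
def effect_magnitude_interval(mean_2, mean_1, H_alpha, sigma_xbar):
    """Interval for the true magnitude of an observed effect:
    (mean_2 - mean_1) +/- 2 * H_alpha * sigma_xbar  (Equation 13.4)."""
    diff = mean_2 - mean_1
    half = 2 * H_alpha * sigma_xbar
    return diff - half, diff + half

# Storage-period main effect, alpha ~ 0.05:
low, high = effect_magnitude_interval(37.80, 30.15, 1.67, 1.63)
# low ~ 2.21 inches, high ~ 13.09 inches
```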
Discussion of the Time-By-Storage Interaction
Since the T × S interaction was shown to be statistically significant, risk about five percent,
there is a preferential pairing of the two variables. The change in height corresponding

[Figure 14.5 An auxiliary chart to understand the interaction of S with time. (Data from Table 14.2.) (a) Average height X̄ of the four subgroups plotted against T1 and T2, with one line per storage level S1, S2 (r = 3); (b) the same averages plotted against S1 and S2, with one line per time level T1, T2.]

to a change from T1 to T2 is different when using S1 than when using S2. To interpret the
interaction, we plot the averages of the four subgroups from Table 14.2 as shown in
Figure 14.5a and b. Height increases when changing from condition T1 to T2 at level S1
but decreases when changing at level S2. The lines in Figure 14.5 are not essentially par-
allel; there is a T × S interaction. It was shown in Figure 14.4a that the interaction effect
is statistically significant, α ≅ 0.05.
Pedigree of Some Data
The three heights in combination 2 warrant checking for presence of an outlier. From
Table A.9,
r10 = (X(n) − X(n−1)) / (X(n) − X(1)) = (49 − 38) / (49 − 37) = 11/12 = 0.917

Critical values of r10 are

0.941 for α = 0.05
0.886 for α = 0.10

    The observed ratio is clearly more than the critical value for α = 0.10. The record
books of the scientist should be checked to see whether an error has been made in tran-
scribing the 49-inch observation or whether there are (were) possible experimental clues
to explain such a tall plant. In the formal analysis above, we have included the suspect
observation but should now keep a suspicious mind about the analysis and interpretation.
The 29-inch height in combination 4 is also suspect.
If no blunder in recording is found, some other important and possibly unknown
factor in the experiment may not be maintained at a constant level.
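The r10 ratio check is easily scripted; the critical values still come from Table A.9 and are not computed here. A sketch (the function name is ours, not the text's):

```python
def dixon_r10(sample):
    """Dixon's r10 ratio for the most extreme observation in a small
    sample: the gap between the suspect value and its nearest neighbor,
    divided by the overall spread."""
    x = sorted(sample)
    gap_low = x[1] - x[0]       # if the low value is suspect
    gap_high = x[-1] - x[-2]    # if the high value is suspect
    spread = x[-1] - x[0]
    if spread == 0:
        return 0.0
    return max(gap_low, gap_high) / spread

# Combination (2) of Table 14.2: heights 49, 37, 38 inches.
ratio = dixon_r10([49, 37, 38])
# ratio = 11/12 ~ 0.917, below the 0.941 critical value at alpha = 0.05
```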

Table 14.3 General analysis of a 2³ factorial design, r ≥ 1.

Step 1: Plot an R chart (Figure 14.6b). Check on possible outliers; when all range points fall below
        D4R̄, compute
            σ̂ = R̄ / d2*
            σ̂X̄ = σ̂ / √ng = σ̂ / √(4r)
            df ≅ (0.9)k(r − 1)
Step 2: Obtain averages as shown in Table 14.6 (ng = 4r).
        a. Main effects as in Table 14.6.
        b. Two-factor interactions as in Tables 14.6 and 14.7.
        c. Plot averages as in Figure 14.6a.
Step 3: Compute decision lines for k = 2, ng = 4r, and α = 0.01, 0.05, or 0.10 as appropriate
            UDL = X̄ + Hα σ̂X̄
            LDL = X̄ − Hα σ̂X̄
        Draw the decision lines as in Figure 14.6a.
Step 4: Any pair of points outside the decision lines indicates a statistically significant difference,
        risk about α. Differences that are of practical significance indicate areas to investigate or
        action to be taken.
Step 5: It is sometimes helpful to compute confidence limits* on the magnitude of the observed
        differences, whether a main effect or a two-factor interaction:
            (X̄1 − X̄2) ± 2Hα σ̂X̄

* Recall Hα = t/√2, so 2Hα σ̂X̄ = t√2 σ̂X̄, which is the usual construction for a confidence interval on the
difference of two means.

14.7 THREE INDEPENDENT VARIABLES:
A 2³ FACTORIAL DESIGN
This discussion is about quality characteristics measured on a continuous scale; it par-
allels that of Section 11.11, which considers quality characteristics of an attribute
nature. The mechanics of analysis as summarized in Table 14.3 are slight variations of
those in Section 14.6 and will be presented here in connection with actual data in the
following Case History 14.2.

Case History 14.2


Assembly of Nickel-Cadmium Batteries12
A great deal of difficulty had developed during production of a nickel-cadmium battery.
Consequently, a unified team project was organized to find methods of improving cer-
tain quality problems.

12. From a term paper prepared for a graduate course at Rutgers University by Alexander Sternberg. The data have been
coded by subtracting a constant from each of the observations; this does not affect the relative comparison of effects.

In an exploratory experiment, three omnibus-type variables were included:


A1: Processing on production line 1—using one concentration of nitrate
A2: Processing on production line 2—a different nitrate concentration
(A difference between A1 and A2 might be a consequence of lines or
concentrations.)
B1: Assembly line B-1—using a shim in the battery cells
B2: Assembly line B-2—not using a shim
(A difference between B1 and B2 might be a consequence of lines or shims.)
C1: Processing on station C-1—using fresh hydroxide
C2: Processing on station C-2—using reused hydroxide
(A difference between C1 and C2 might be a consequence of stations
or hydroxide.)
All batteries (r = 6) were assembled from a common supply of components in each of
the eight combinations. The 48 batteries were processed according to a randomized plan,
and capacitance was measured at a single test station. The measurements are shown in
Table 14.4 and the combination averages in Table 14.5. (Large capacitances are desired.)
The variation within any subgroup of six batteries can be attributed to three possi-
ble sources in some initially unknown way:
1. Variation attributable to components and materials
2. Variation attributable to manufacturing assembly and processing
3. Variation of testings
A measure of within-subgroup variability is

σ̂ = R̄ / d2* = 1.45 / 2.55 = 0.57

Table 14.4 Capacitance of individual nickel-cadmium batteries in a 2³ factorial design
(data coded).
The numbering of the eight columns is consistent with that in Table 14.5.

               A1                             A2
        B1            B2               B1            B2
     C1     C2     C1     C2        C1     C2     C1     C2
    (1)    (5)    (7)    (3)       (6)    (2)    (4)    (8)
   –0.1    1.1    0.6    0.7       0.6    1.9    1.8    2.1
    1.0    0.5    1.0   –0.1       0.8    0.7    2.1    2.3
    0.6    0.1    0.8    1.7       0.7    2.3    2.2    1.9
   –0.1    0.7    1.5    1.2       2.0    1.9    1.9    2.2
   –1.4    1.3    1.3    1.1       0.7    1.0    2.6    1.8
    0.5    1.0    1.1   –0.7       0.7    2.1    2.8    2.5

X̄  0.08   0.78   1.05   0.65      0.91   1.65   2.23   2.13
R   2.4    1.2    0.9    2.4       1.4    1.6    1.0    0.7

Table 14.5 Averages of battery capacitances (r = 6) in a 2³ factorial design, displayed as
two 2 × 2 tables.
(Data from Table 14.4.)

         A1            A2                    A1            A2
     C1 (1)        C2 (2)               C2 (5)        C1 (6)
B1   X̄ = 0.08      1.65            B1   0.78          0.91
     r = 6                              r = 6
     C2 (3)        C1 (4)               C1 (7)        C2 (8)
B2   0.65          2.23            B2   1.05          2.13

If none of the independent variables is found statistically significant or scientifi-


cally important, the variation due to components and materials is more important than
the effect of changes of processing and assembly that have been included in the study.
    The following analysis is a direct extension of the analysis of means for a 2² inves-
tigation, Table 14.2 and Section 11.11. Of the eight combinations in Table 14.4, half
were produced at level A1—those in columns 1, 3, 5, and 7. Half were produced at level
A2—those in columns 2, 4, 6, and 8. We shall designate the average of these four col-
umn averages as

Ā1 = 0.64    Ā2 = 1.73    with ng = 4r = 24

An outline of the mechanics of computation for main effects and two-factor inter-
actions is given in Table 14.3; further details follow:
– – –
• R = 1.45 and D4 R = (2.00)(1.45) = 2.90. All range points fall below D4 R
(Figure 14.6b), and we accept homogeneity of ranges, and compute

σˆ = R / d 2* = 0.57
σˆ 0.57
σˆ X = = = 0.116 for ng = 4r = 24
ng 24

df ≅ 36

• Averages of the three main effects and three two-factor interaction effects
  are shown in Table 14.6. Each average is of ng = 4r = 24 observations. The
  decision lines for each comparison are (k = 2, df ≅ 36):

    X̄ ± Hα σ̂X̄

[Figure 14.6 (a) Electrical capacitance of nickel-cadmium batteries: the ANOM comparisons. Main effects (Ā1, Ā2; B̄1, B̄2; C̄1, C̄2), two-factor interactions (L̄ and Ū for AB, AC, and BC), and the three-factor interaction (Ē, Ō), each an average of ng = 4r = 24 observations, are plotted about X̄ = 1.185 with decision lines at 1.02 and 1.35 (α = 0.05) and at 0.96 and 1.41 (α = 0.01). (b) Range chart (r = 6): R̄ = 1.45, UCL = D4R̄ = 2.90. (Data from Table 14.4.)]

Table 14.6 Averages to test for main effects and two-factor interactions.
(Data from Table 14.4); ng = 4r = 24.

Main Effects          Two-Factor Interactions
Ā1 = 0.64             AB: L̄ = 1.305
Ā2 = 1.73                 Ū = 1.065
B̄1 = 0.855            AC: L̄ = 1.23
B̄2 = 1.515                Ū = 1.14
C̄1 = 1.07             BC: L̄ = 0.94
C̄2 = 1.30                 Ū = 1.43

For α = 0.05: Hα = 1.43

    UDL(0.05) = 1.185 + (1.43)(0.12) = 1.185 + 0.17 = 1.35
    LDL(0.05) = 1.185 − 0.17 = 1.02

• Since two sets of points are outside these decision lines, we compute
  another pair.

For α = 0.01: Hα = 1.93

    UDL(0.01) = 1.185 + (1.93)(0.12) = 1.185 + 0.232 = 1.42
    LDL(0.01) = 1.185 − 0.232 = 0.95

• Figure 14.6a indicates large A and B main effects, and a BC interaction all
with risk a < 0.01. The large A main effect was produced by using one
specific concentration of nitrate that resulted in a much higher quality battery.
Combinations (4) and (8) with A2 and B2 evidently are the best. Besides the
demonstrated advantages of A2 over A1 and of B2 over B1, a bonus benefit
resulted from no significant difference due to C. This result indicated that
a certain expensive compound could be reused in manufacturing and
permitted a very substantial reduction in cost.
The presence of a BC interaction caused the group to investigate. It was puzzling
at first to realize that the four groups of batteries assembled on line B1 and processed at
station C2 and those assembled on line B2 and processed at station C1 average signifi-
cantly better than combinations B1C1 and B2C2. This resulted in the detection of specific
differences in the processing and assembling of the batteries. The evaluation led to
improved and standardized process procedures to be followed and a resulting improve-
ment in battery quality.
When all parties concerned were brought together to review the results of the pro-
duction study, they decided to manufacture a pilot run to check the results. This retest
agreed with the first production study; changes were subsequently made in the manu-
facturing process that were instrumental in improving the performance of the battery
and in reducing manufacturing costs. From Equation (13.4), limits on the magnitude of
the main effects (for α = 0.01) are:

ΔA = (Ā2 − Ā1) ± 2Hα σ̂X̄
   = (1.73 − 0.64) ± 2(0.232)
   = 1.09 ± 0.46

ΔB = (B̄2 − B̄1) ± 2Hα σ̂X̄
   = (1.515 − 0.855) ± 2(0.232)
   = 0.66 ± 0.46

14.8 COMPUTATIONAL DETAILS FOR TWO-FACTOR
INTERACTIONS IN A 2³ FACTORIAL DESIGN
There are three possible two-factor interactions: AB, AC, and BC. As discussed previ-
ously in Theorem 14.1, a test for a two-factor interaction compares those combinations
having like subscripts with those having unlike subscripts, ignoring the third variable
(see Table 14.7).
The four AB combinations in Table 14.5 are:

Like: A1B1: combinations (1), (5)


A2B2: combinations (4), (8)
Unlike: A1B2: combinations (7), (3)
A2B1: combinations (2), (6)

Then the two-factor AB interaction can be tested by ignoring the third variable C
and comparing the averages

L̄AB = (0.08 + 2.23 + 0.78 + 2.13) / 4 = 1.305
ŪAB = (1.65 + 0.65 + 0.91 + 1.05) / 4 = 1.065

– –
Table 14.7 Diagram to display a combination selection procedure to compute L̄ and Ū in testing
the AB interaction.

          A1        A2
B1      (1,5)     (2,6)
B2      (7,3)     (4,8)

Unlike (2,3,6,7)    Like (1,4,5,8)



The comparison above is between the two diagonals in Table 14.7. Similar com-
parisons provide tests for the two other two-factor interactions.
    These averages are plotted in Figure 14.6a. Each of these averages is (again) of ng =
4r = 24 observations, just as in testing main effects; in 2ᵏ experiments, the two-factor
interactions can be compared to the same decision lines as the main effects. Since the
pair of points for BC is outside the decision lines for α = 0.01, there is a BC interaction.

Three-Factor Interaction in a 2³ Factorial Design


Consider the subscripts, each of value 1 or 2. When all three subscripts are added
together for the treatment combinations, half the resulting sums are even E and half are
odd O. A comparison of those whose sums are even with those that are odd provides a
test for what is called a three-factor interaction.

ABC:  Ē = (0.78 + 1.05 + 0.91 + 2.13) / 4 = 1.22    (columns 5, 6, 7, 8)
      Ō = (0.08 + 0.65 + 1.65 + 2.23) / 4 = 1.15    (columns 1, 2, 3, 4)
The difference between the three-factor (ABC) averages is quite small (Fig 14.6a),
and the effect is not statistically significant; it is quite unusual for it ever to appear
significant.
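Both of these tests are parity checks on the subscripts: a two-factor interaction compares cells whose two subscripts are alike against those whose subscripts differ, and the three-factor interaction compares even against odd subscript sums. A sketch, checked against the cell averages of Table 14.5 (the function name and dictionary layout are ours, not the text's):

```python
def parity_splits(cell_means):
    """Like/unlike and even/odd comparison averages for a 2^3 design.

    cell_means maps subscripts (a, b, c), each 1 or 2, to cell averages.
    Each two-factor interaction compares cells whose two subscripts are
    alike (L) with those whose subscripts differ (U); the three-factor
    interaction compares even (E) with odd (O) subscript sums."""
    out = {}
    for name, (i, j) in {"AB": (0, 1), "AC": (0, 2), "BC": (1, 2)}.items():
        like = [m for key, m in cell_means.items() if key[i] == key[j]]
        unlike = [m for key, m in cell_means.items() if key[i] != key[j]]
        out[name] = (sum(like) / len(like), sum(unlike) / len(unlike))
    even = [m for key, m in cell_means.items() if sum(key) % 2 == 0]
    odd = [m for key, m in cell_means.items() if sum(key) % 2 == 1]
    out["ABC"] = (sum(even) / len(even), sum(odd) / len(odd))
    return out

# Cell averages from Table 14.5, keyed by (a, b, c) subscripts:
cells = {(1, 1, 1): 0.08, (2, 1, 2): 1.65, (1, 2, 2): 0.65, (2, 2, 1): 2.23,
         (1, 1, 2): 0.78, (2, 1, 1): 0.91, (1, 2, 1): 1.05, (2, 2, 2): 2.13}
splits = parity_splits(cells)   # e.g. splits["BC"] ~ (0.9425, 1.4275)
```

The returned pairs reproduce the L̄, Ū values of Table 14.6 and the Ē, Ō averages above (before rounding to two decimals).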
Many remarks could be made about the practical scientific uses obtained from
three-factor interactions that “appear to be significant.” The following suggestions are
offered to those of you who find an apparently significant three-factor interaction:
1. Recheck each set of data for outliers.
2. Recheck for errors in computation and grouping.
3. Recheck the method by which the experiment was planned. Is it possible that
the execution of the plan was not followed?
4. Is the average of one subgroup “large” in comparison to all others? Then
discuss possible explanations with a scientist.
5. Discuss the situation with a professional applied statistician.

14.9 A VERY IMPORTANT EXPERIMENTAL
DESIGN: 1/2 × 2³

In this chapter, we have just discussed 2² and 2³ factorial designs. They were also dis-
cussed in Chapter 11 for data of an attributes nature. We shall conclude this chapter, as
we did Chapter 11, with an example of a half replicate of a 2³. Some reasons why the
design is a very important one were listed in Section 11.11; the reasons are just as
applicable when studying response characteristics that are continuous variables.

Example 14.1
Consider only that portion of the data for a nickel-cadmium battery in Table 14.5 cor-
responding to the four combinations shown here in Table 14.8. The computations for the
analysis are the same as for an ordinary 2² factorial design. The averages and ranges are
shown in Table 14.8 and plotted in Figure 14.7.
Mechanics of Analysis
From Figure 14.7b the four ranges fall below

D4R̄ = (2.00)(1.85) = 3.70 for k = 4 and r = 6

σ̂ = R̄ / d2* = 1.85 / 2.57 = 0.72
σ̂X̄ = σ̂ / √ng = 0.72 / √12 = 0.207 for ng = 2r = 12
df ≅ (0.9)k(r − 1) = (0.9)(4)(6 − 1) = 18

For α = 0.05:

UDL = 1.152 + (1.49)(0.207) = 1.46
LDL = 0.84

Table 14.8 Battery capacitances: a special half of a 2³ design.
(Data from Table 14.5.)

         A1                  A2
     C1 (1)              C2 (2)
B1   X̄ = 0.08            X̄ = 1.65         B̄1 = 0.865
     R = 2.4             R = 1.6
     r = 6
     C2 (3)              C1 (4)
B2   X̄ = 0.65            X̄ = 2.23         B̄2 = 1.44
     R = 2.4             R = 1.0

     Ā1 = 0.365          Ā2 = 1.94
     C̄2 = 1.15           C̄1 = 1.155

     X̄ = 1.152

[Figure 14.7 Analysis of means (ANOM) for a half replicate of a 2³ design (1/2 × 2³). (a) Main effects Ā1, Ā2, B̄1, B̄2, C̄1, C̄2 (ng = 2r = 12) plotted about X̄ = 1.152 with decision lines at 0.84 and 1.46 (α = 0.05) and at 0.73 and 1.57 (α = 0.01). (b) Ranges of the four combinations about R̄ = 1.85 with UCL = 3.70. (Data from Table 14.8.)]

For α = 0.01:

UDL = 1.152 + (2.04)(0.207) = 1.57
LDL = 0.73

For α = 0.10: Although not computed, the (0.10) and (0.05) lines would clearly
bracket the B points.
Some Comments About the Half-Replicate and Full-Factorial Designs
The two points corresponding to A fall outside the 0.01 decision lines; the two points
corresponding to B are just inside the 0.05 decision lines. We note that the magnitude

B̄2 − B̄1 = 0.575

in this half replicate is almost the same as

B̄2 − B̄1 = 0.660

in the complete 2³ design (data from Table 14.6).


    We see that the principal reason the B effect shows significance more strongly in
the 2³ study than in the half replicate is the wider decision lines in Figure 14.7a. These
decision lines are based on only half as many batteries (and df) as those in Figure 14.6a,
namely ng = 2r = 12 compared to ng = 4r = 24. This reduction in sample size results in
a larger σ̂X̄ and requires a slightly larger Hα. There is possible ambiguity as to whether
the diagonal averages represent a comparison of a C main effect, or an AB interaction.
Similarly, each apparent main-effect factor may be confounded with an interaction of
the other two factors.
When the magnitude of the difference is of technical interest, there are two possible
alternatives to consider: (1) Decide on the basis of scientific knowledge—from previous
experience or an extra test comparing C1 with C2—whether a main effect is more or less
plausible than an A × B interaction. (2) It is very unlikely that there is a genuine A × B
interaction unless either one or both of A and B is a significant main effect. Since A and
B both have large main effects in this case history, an interaction of these two is not
precluded. The ambiguity can also be resolved by completing the other half of the 23
design; this effort will sometimes be justified. The recommended strategy is to proceed
on the basis that main effects are dominant and effect all possible improvements. The
advantages of this design are impressive, especially in troubleshooting projects.
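The confounding just described can be made concrete by writing out the ±1 contrast columns for the four runs of the half replicate. A sketch, not from the text (sign convention: −1 for level 1, +1 for level 2; names are ours): with this convention the C column is the AB column with its signs reversed, so the two effects cannot be separated in this fraction.

```python
# The four runs of the half replicate in Table 14.8, as (a, b, c)
# subscripts, each 1 or 2:
runs = [(1, 1, 1), (2, 1, 2), (1, 2, 2), (2, 2, 1)]

def contrast_sign(run, effect):
    """The +1/-1 sign a run contributes to an effect column.
    effect is a set of factor positions: {2} for C, {0, 1} for AB."""
    s = 1
    for pos in effect:
        s *= -1 if run[pos] == 1 else 1
    return s

c_column = [contrast_sign(r, {2}) for r in runs]
ab_column = [contrast_sign(r, {0, 1}) for r in runs]
# c_column is ab_column with every sign flipped: C and AB are
# confounded (aliased) in this half replicate.
```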

Case History 14.3


An Electronic Characteristic
Important manufacturing conditions are frequently difficult to identify in the manufac-
ture of electronic products. The factors that control different quality characteristics of a
particular product type often seem difficult to adjust to meet specifications. Materials
in the device are not always critical provided that compensating steps can be specified
for subsequent processing stages. Designed production studies with two, three, and
sometimes more factors are now indispensable in this and other competitive industries.
It was decided to attempt improvements in contact potential quality by varying three
manufacturing conditions considered to affect it in manufacture.
The three production conditions (factors) recommended by the production engineer
for this experiment were the following:
• Plate temperature, designated as P
• Filament lighting schedule F
• Electrical aging schedule A

On the basis of experience, the production engineer specified two levels of each of
the three factors; levels that were thought to produce substantial differences yet which
were expected to produce usable product. At the time of this production study, these three
factors were being held at levels designated by P1, F1, and A1. It was agreed that these
levels would be continued in the production study; second levels were designated by P2,
F2, and A2. A half-replicate design was chosen for this production study in preference to
a full 2³ factorial.
Twelve devices were sent through production in each of the four combinations of P,
F, and A shown in Table 14.9. All units in the study were made from the same lot of
components, assembled by the same production line, and processed randomly through
the same exhaust machines at approximately the same time. After they were sealed and
exhausted, each group was processed according to the plan shown in Table 14.9. Then,
electronic readings on contact potential were recorded on a sample of six of each com-
bination. (All 12 readings are shown in Table 14.11.)
Conclusions
1. From Figure 14.8a, the change in aging to A2 produced a very large
improvement. Also, the change from F1 to F2 had the undesirable significant
effect of lowering contact potential, and the change from P1 to P2 had a
statistically significant improvement (at the 0.05 level) but of lesser
magnitude than the A effect.
The production engineer considered combination 2 to be a welcome
improvement and immediately instituted a change to it in production
(A2, P2, and F1). The reduction in rejected items was immediately evident.

Table 14.9 Coded contact potential readings in a half replicate of a 2³.

           P1                       P2
     A1   –0.14  (1)          A2   +0.15  (2)
          –0.17                    +0.18
          –0.15                    +0.07
F1        –0.11                    +0.08        F̄1 = –0.024
          –0.19                    +0.08
          –0.20                    +0.11
     X̄ =  –0.160              X̄ =  +0.112
     R =   0.09               R =   0.11

     A2   –0.04  (3)          A1   –0.18  (4)
          +0.04                    –0.12
          +0.11                    –0.22
F2        –0.06                    –0.21        F̄2 = –0.098
          –0.05                    –0.18
          –0.05                    –0.21
     X̄ =  –0.008              X̄ =  –0.187
     R =   0.17               R =   0.10

     P̄1 = –0.084              P̄2 = –0.038
     Ā2 = +0.052              Ā1 = –0.174

Source: Doris Rosenberg and Fred Ennerson, “Production Research in the Manufacture of Hearing Aid Tubes,”
Industrial Quality Control 8, no. 6 (May 1952): 94–97. (Data reproduced by permission.)

[Figure 14.8 Analysis of three factors and their effects on contact potential. (a) ANOM of the main effects Ā1, Ā2, F̄1, F̄2, P̄1, P̄2 (ng = 2r = 12) in a 1/2 × 2³ experiment, plotted about X̄ = –0.061 with decision lines at –0.087 and –0.035 (α = 0.01); (b) ranges of the four combinations (r = 6) about R̄ = 0.118 with UCL = D4R̄ = 0.236. (Data from Table 14.9.)]

2. A control chart on different characteristics, including contact potential, had


been kept before the change was made from combination 1 to 2. Figure 14.9

shows the sustained improvement in X̄ after the change.
3. Further studies were carried out to determine whether there were two-
factor interaction effects and how additional changes in A and P (in the
same direction) and in F (in the opposite direction) could increase contact
potential still further. Some of these designs were full factorial; some were
half replicates.
Formal Analysis (Figure 14.8)

σ̂ = R̄ / d2* = 0.118 / 2.57 = 0.046 for k = 4, r = 6
σ̂X̄ = σ̂ / √ng = 0.046 / √12 = 0.0133 for ng = 2r = 12

456 Part III: Troubleshooting and Process Improvement

Figure 14.9 X̄ and R control charts from production before and after changes made as a consequence of the study discussed in Case History 14.3 (ng = 5; the process average improved from X̿ = −0.25 to X̿ = +0.06).

Decision Lines: k = 2, df = 18, H0.05 = 1.49 and H0.01 = 2.04


For α = 0.05:

UDL = −0.061 + (1.49)(0.0133) = −0.061 + 0.020 ≅ −0.04
LDL = −0.061 − (1.49)(0.0133) = −0.061 − 0.020 ≅ −0.08

For α = 0.01:

UDL = −0.061 + (2.04)(0.0133) = −0.061 + 0.027 ≅ −0.03
LDL = −0.061 − (2.04)(0.0133) = −0.061 − 0.027 ≅ −0.09
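The arithmetic of this formal analysis is easy to verify with a short script. This is only a sketch: the constant d₂* = 2.57 and the H values 1.49 and 2.04 are the table entries quoted in the text, not computed from first principles.

```python
from math import sqrt

# Range-based estimate of sigma for k = 4 cells of r = 6 readings each
# (d2* = 2.57 is the table value for k = 4, r = 6 quoted in the text)
R_bar, d2_star = 0.118, 2.57
sigma_hat = R_bar / d2_star            # sigma-hat = R-bar / d2*
sigma_xbar = sigma_hat / sqrt(12)      # each plotted mean averages ng = 2r = 12 readings

X_bar = -0.061                         # grand mean of the coded readings

def decision_lines(H):
    """ANOM decision lines: grand mean +/- H * sigma-hat of the mean."""
    return X_bar - H * sigma_xbar, X_bar + H * sigma_xbar

LDL_05, UDL_05 = decision_lines(1.49)  # H(0.05) for k = 2, df = 18
LDL_01, UDL_01 = decision_lines(2.04)  # H(0.01) for k = 2, df = 18
```

The rounded limits agree with the values computed above.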

14.10 GENERAL ANOM ANALYSIS OF 2ᵖ AND 2ᵖ⁻¹ DESIGNS
A procedure for analyzing 2ᵖ factorial designs and 2ᵖ⁻¹ fractional factorial designs is given in Table 14.10. In this section, we define 2ᵖ factorial designs as having p factors at two levels each, and 2ᵖ⁻¹ fractional factorial designs as having p factors at two levels that are run in a ½ (half) fraction of the full-factorial number of runs. Note that the analysis is done on a base design equal to the number of runs in the 2ᵖ⁻¹ design. For example, the base design for a 2³⁻¹ fractional factorial is a 2² factorial design.

Table 14.10 General analysis of a 2ᵖ or 2ᵖ⁻¹ factorial design, r ≥ 1.

Step 1: Check on homogeneity of variance by plotting an R chart (r > 1) on the replicates of each cell, or a half-normal plot (r = 1) for designs with single replication.
Step 2: Perform the Yates analysis and calculate the effects.
Step 3: For each treatment effect E to be plotted, compute
        X̄₊ = X̿ + (E/2)  and  X̄₋ = X̿ − (E/2)
Step 4: For each treatment, plot X̄₋ and X̄₊ around the centerline X̿.
Step 5: Compute decision lines for k = 2 and α = 0.01, 0.05, or 0.10 as appropriate:
        UDL = X̿ + Hα·σ̂X̄
        LDL = X̿ − Hα·σ̂X̄
        where σ̂ = R̄/d₂*, or the square root of the residual mean square from the Yates analysis with appropriate degrees of freedom.
Step 6: Any pair of points outside the decision lines indicates a statistically significant difference, risk about α. Differences that are of practical significance indicate areas to investigate or action to be taken.
Step 7: It is sometimes helpful to compute confidence limits on the magnitude of the observed effect as E ± 2Hα·σ̂X̄.
To illustrate the procedure, we consider the data given in Case History 10.1. The Yates
analysis was as follows:

Yates order   Observation   Col. 1   Col. 2   Col. 3   Yates effect    Sum of squares
(1)               0.5         11.4     89.4    310.6    77.7 = 2ȳ          12,059.0
a                10.9         78.0    221.2     68.0    17.0 = A              578.0
b                29.8        108.1     28.8     71.1    17.8 = B              631.9
ab               48.2        113.1     39.2      5.8     1.4 = AB               4.2
c                43.7         10.4     66.6    131.8    33.0 = C            2,171.4
ac               64.4         18.4      5.0     10.4     2.6 = AC              13.5
bc               47.3         20.7      8.0    −61.1   −15.3 = BC             466.7
abc              65.8         18.5     −2.2    −10.2    −2.6 = ABC             13.0
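The column arithmetic of a Yates analysis is mechanical, and a small script makes tables like the one above easy to check. The sketch below is generic plain Python (not a routine from the text); the effects it computes agree with the printed table to within 0.1.

```python
def yates(obs, p):
    """Yates algorithm for a 2^p design: p rounds of pairwise sums
    followed by pairwise differences over the observation column."""
    col = list(obs)
    for _ in range(p):
        sums  = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    return col

# Observations of the 2^3 example in standard order (1), a, b, ab, c, ac, bc, abc
obs = [0.5, 10.9, 29.8, 48.2, 43.7, 64.4, 47.3, 65.8]
col3 = yates(obs, 3)
labels = ["2ybar", "A", "B", "AB", "C", "AC", "BC", "ABC"]
effects = {lab: x / 2 ** 2 for lab, x in zip(labels, col3)}   # divisor 2^(p-1) = 4
ss = {lab: x * x / 2 ** 3 for lab, x in zip(labels, col3)}    # divisor 2^p = 8
```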

Assume that the AB, AC, and ABC interactions do not exist, and that we can combine their sums of squares into an estimate of experimental error:

SS(Error) = 4.2 + 13.5 + 13.0 = 30.7

s = √(30.7/3) = 3.20, with 3 df.

Let us assume there are no significant interaction effects except for BC. Then, an
analysis of means would progress as follows using the Yates effects to reconstruct the
effect means:

Step 1: No replication (r = 1).
Step 2: Yates analysis (see table above).
Step 3: X̿ = 38.85; A = 17.0, B = 17.8, C = 33.0, BC = −15.3

A effect means:   X̄₊ = 38.85 + (17.0/2) = 47.35 = Ā₊
                  X̄₋ = 38.85 − (17.0/2) = 30.35 = Ā₋
B effect means:   X̄₊ = 38.85 + (17.8/2) = 47.75 = B̄₊
                  X̄₋ = 38.85 − (17.8/2) = 29.95 = B̄₋
C effect means:   X̄₊ = 38.85 + (33.0/2) = 55.35 = C̄₊
                  X̄₋ = 38.85 − (33.0/2) = 22.35 = C̄₋
BC effect means:  X̄₊ = 38.85 + (−15.3/2) = 31.20 = (BC)₊
                  X̄₋ = 38.85 − (−15.3/2) = 46.50 = (BC)₋
Step 4: These means are plotted in Figure 14.10.
Step 5: α = 0.05 decision limits:

UDL = 38.85 + (2.25)(3.20)/√4 = 42.45
LDL = 38.85 − (2.25)(3.20)/√4 = 35.25
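Steps 3 through 6 can be scripted directly from the Yates effects. A sketch, taking H(0.05) = 2.25 (k = 2, 3 df) and ng = 4 as used in the text:

```python
from math import sqrt

X_bar, s, ng = 38.85, 3.20, 4
effects = {"A": 17.0, "B": 17.8, "C": 33.0, "BC": -15.3}

# Step 3: each effect E produces a pair of means (X-minus, X-plus)
means = {name: (X_bar - E / 2, X_bar + E / 2) for name, E in effects.items()}

# Step 5: decision lines at the grand mean +/- H * s / sqrt(ng)
H = 2.25
LDL = X_bar - H * s / sqrt(ng)
UDL = X_bar + H * s / sqrt(ng)

# Step 6: a pair of points outside the lines signals a significant effect
significant = {name: min(pair) < LDL or max(pair) > UDL
               for name, pair in means.items()}
```

Here all four plotted effects fall outside the 0.05 limits.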

This general procedure can be used for any full or fractional factorial.

Figure 14.10 ANOM of Case History 10.1 data: the effect-mean pairs Ā₋, Ā₊, B̄₋, B̄₊, C̄₋, C̄₊, (BC)₋, (BC)₊ plotted around X̿ = 38.85 with 0.05 decision limits LDL = 35.25 and UDL = 42.45.



14.11 PRACTICE EXERCISES


Possible (useful) things to do with the data in Table 14.11 (use subgroups of size 4):
1. Check each group for outliers and other types of nonhomogeneity by whatever
methods you choose. Does the manufacturing process within each group
appear reasonably stable (excepting the effects of P, F, A)?
2. Form subgroups, vertically in each column, of four each.
a. Obtain ranges (r = 4) and make an R chart for all groups combined.
b. Compute σ̂ = R̄/d₂ and σ̂ = R̄/d₂* and compare. Are these estimates meaningful?
3. Complete an ANOM.
4. Select at random six from each group; do a formal analysis.
5. How well do the conclusions from the data in Table 14.11 agree with those from the data (r = 6) in Table 14.9?
6. In this chapter, the authors present step-by-step procedures for analyzing the 2² factorial (Table 14.1) and the 2³ factorial (Table 14.3) designs. Following this format, write a procedure for analyzing a ½ × 2³ design. Pay close attention to the definition and values of r, k, and ng. Note the difference in meaning of r and k depending on whether we are preparing the R chart or the ANOM chart. (This is good to note for experimental designs in general.)

Table 14.11 Contact potential in a half replicate of a 2³ design, r = 12;
P = plate temperature; F = filament lighting; A = aging.
See Case History 14.3.

        (1)        (2)        (3)        (4)
      P1F1A1     P2F1A2     P1F2A2     P2F2A1
      −0.20      +0.28      −0.08      −0.22
      −0.17      +0.07      +0.11      −0.15
      −0.18      +0.17      −0.05      −0.10
      −0.20      +0.15      −0.01      −0.20
      −0.17      +0.16      −0.07      −0.17
      −0.25      +0.05      −0.15      −0.14
      −0.14      +0.15      +0.04      −0.12
      −0.17      +0.18      +0.11      −0.22
      −0.15      +0.07      −0.06      −0.21
      −0.11      +0.08      −0.04      −0.18
      −0.19      +0.11      −0.05      −0.21
      −0.20      +0.08      −0.05      −0.18

X̄ =  −0.1775    +0.129     −0.025     −0.175

7. In Section 12.1, the authors offer practical advice concerning the planning
of experiments for industrial process improvements. Similar suggestions are
scattered throughout the text. Go through the entire text and prepare an index
to these suggestions.
Note: This is a good exercise to conduct in small groups. Later, in your
professional practice, when you are thinking about running a process
improvement study, you may find such an index very helpful.
8. Suppose an experiment is run on 56 units with eight subgroups of seven units each. The average range for these eight subgroups is 5.5 and the overall data average is 85.2. Furthermore, these eight subgroups are aggregated into a 2³ factorial design.
   a. Compute the control limits on the R chart.
   b. Compute the decision limits on the ANOM chart, α = 0.05 and α = 0.01.
9. Consider Table 14.4.
a. What columns would you use to perform a test on factor A with B held
constant at level 1 and C held constant at level 2?
b. Calculate s̄ to estimate sigma and compare with the text estimate of sigma. If this approach had been used, what would have been the degrees of freedom?
10. Analyze the data in Table 14.9 as a 2² factorial on aging schedule and filament lighting schedule. Compute the interaction and check for significance. Use α = 0.05.
11. Regard the data of Table 14.2 as a half-rep of a 2³ with factors S, T, and (L, U). Perform an analysis of means. Use α = 0.05.

12. Make a table of A5 factors that, when multiplied by R̄ and added to and subtracted from the grand mean, will yield analysis of means limits.
15 More Than Two Levels of an Independent Variable

15.1 INTRODUCTION
A vital point was introduced in Chapter 11 that is seldom appreciated even by the most experienced plant personnel: within a group of no more than four or five units (molds, machines, operators, inspectors, shifts, production lines, and so on), there will be at least one that performs in a significantly different manner. The performance may be better or worse, but the difference will be there. Because even experienced production or management personnel do not expect such differences, demonstrating their existence is an important hurdle to overcome in starting a process improvement study. However, it does
not require much investigation to convince the skeptics that there actually are many
sources of potential improvement. The case histories in this book were not rare events;
they are typical of those in our experience.
Some simple design strategies aid immeasurably in finding such differences. This
too is contrary to the notion that an experienced factory hand can perceive differences
of any consequence. Once production personnel learn to expect differences, they can
begin to use effectively the various strategies that help identify the specific units that
show exceptional behavior. Some very effective strategies for this purpose were dis-
cussed in Chapter 11.
The importance of designs using two and three variables at two levels of each (2² and 2³ designs) was discussed in Chapter 14. They are especially effective when looking for clues to the sources of problems. Some industrial problems warrant the use of
more than two levels of an independent variable (factor), even in troubleshooting. This
chapter extends the graphical methods of analysis to three and more levels. Just as in
Chapter 11, there are two basic procedures to consider: (1) a standard given and (2) no
standard given.


15.2 AN ANALYSIS OF k INDEPENDENT SAMPLES—STANDARD GIVEN, ONE INDEPENDENT VARIABLE
Given a stable process (that is, one in statistical control) with known average μ and standard deviation σ, we obtain k independent random samples of r each from the given process and consider all k means simultaneously. Within what interval

μ ± Zα·σX̄

will all k means lie with probability (1 − α)?


Under the generally applicable assumption that averages from the process or population are normally distributed, values of Zα corresponding to α = 0.10, 0.05, and 0.01 were derived in Section 11.3 for attributes data and are given in Table A.7. They are equally applicable to variables data.¹
The limits for various cases of "standards given" are shown in Table 15.1, where the symbol Hα,ν indicates the risk α used and the degrees of freedom ν to be employed.
    Suppose an individual enters a casino where dice are being thrown at eight tables. Data are taken on 25 throws at each of the tables to check on the honesty of the house. It can be shown that for the sum of two fair dice on one toss, μ = 7 and σ = 2.42. Therefore, if the resulting sample averages were 7.23, 6.46, 7.01, 6.38, 6.68, 7.35, 8.12, and 7.99, we do not have evidence of dishonesty at the α = 0.05 level of risk since the decision limits are as follows:

Table 15.1 Limits for standards given.

Standard(s) given        Limits
None                     X̿ ± Hα,ν (σ̂/√ng)
σ                        X̿ ± Hα,ν (σ/√ng)
μ                        μ ± Hα,ν √(k/(k−1)) (σ̂/√ng)
μ, σ                     μ ± Zα (σ/√ng)
1. When r is as large as 4, this assumption of normality of means is adequate even if the population of individuals is
rectangular, right-triangular, or “almost” any other shape. (See Theorem 3, Section 1.8.)

Figure 15.1 Analysis of means chart for eight casino tables (n = 25; centerline μ = 7, with UDL(0.05) = 8.32, LDL(0.05) = 5.68, UDL(0.01) = 8.56, LDL(0.01) = 5.44).

For α = 0.05:

7 ± 2.73(2.42/√25)
7 ± 1.32

LDL(0.05) = 5.68     UDL(0.05) = 8.32

For α = 0.01:

7 ± 3.22(2.42/√25)
7 ± 1.56

LDL(0.01) = 5.44     UDL(0.01) = 8.56
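With both μ and σ given, the decision limits use Zα directly. A quick sketch, taking the Z values 2.73 and 3.22 (k = 8 means) from the text's Table A.7:

```python
from math import sqrt

mu, sigma, n = 7, 2.42, 25
averages = [7.23, 6.46, 7.01, 6.38, 6.68, 7.35, 8.12, 7.99]

def limits(Z):
    """Standard-given ANOM limits: mu +/- Z * sigma / sqrt(n)."""
    half = Z * sigma / sqrt(n)
    return mu - half, mu + half

LDL_05, UDL_05 = limits(2.73)   # Z(0.05) for k = 8 means
LDL_01, UDL_01 = limits(3.22)   # Z(0.01) for k = 8 means

# No table mean falls outside even the 0.05 lines
all_inside = all(LDL_05 <= x <= UDL_05 for x in averages)
```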

And the plot shows no evidence of a departure from the hypothesized distribution
as shown in Figure 15.1. One other example, standard given, was discussed in Section
11.3. The very important area of no standard given with variables data follows.

15.3 AN ANALYSIS OF k INDEPENDENT SAMPLES—NO STANDARD GIVEN, ONE INDEPENDENT VARIABLE
This analysis is a generalization of the analysis of means in Section 14.4. It compares k ≥ 2 means (averages) with respect to their own grand mean X̿ instead of being restricted to k = 2. More formally, the procedure is as follows.

Given k sets of r observations each, but no known process average or standard deviation, the k means will be analyzed simultaneously for evidence of nonrandomness (significant differences). Decision lines, UDL and LDL, will be drawn at

X̿ ± Hα·σ̂X̄

Thus, the k means are compared to their own group mean X̿. If any mean lies outside the decision lines, this is evidence of nonrandomness, risk α.
The factors Hα are functions of both k and the degrees of freedom,² df, used in estimating σ̂. The computation of Hα, no standard given, is much more complicated than the computation of Zα for the case of standard given, Section 11.3. Dr. L. S. Nelson³ has succeeded in deriving the exact values of

hα = √(k/(k−1)) Hα

These were published in 1983. We give values of Hα in Table A.8 for α = 0.10, 0.05, and 0.01, without indicating the method of computation.
    Table A.8 gives percentage points of the Studentized maximum absolute deviate⁴ for selected values of k from k = 2 to k = 60 and selected degrees of freedom.

Case History 15.1


Possible Advantage of Using a Selection Procedure for Ceramic Sheets
During the assembly of electronic units, a certain electrical characteristic was too vari-
able. In an effort to improve uniformity, attention was directed toward an important
ceramic component of the assembly. Ceramic sheets were purchased from an outside
vendor. In production, these ceramic sheets were cut into many individual component
strips. How does the overall variability of assemblies using strips cut from many dif-
ferent sheets compare with variability corresponding to strips within single sheets?
Could we decrease the overall variability by rejecting some sheets on the basis of dif-
ferent averages of small samples from them?
There was no record of the order of manufacture of the sheets, but it was decided to
cut seven strips from each of six different ceramic sheets. The six sets were assembled

2. See Table A.11 for the degrees of freedom (df) corresponding to the number of samples k of r each when using
ranges. Otherwise, use df of estimate employed.
df ≅ (0.9)k(r – 1)
3. L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15,
no. 1 (January 1983): 40–44.
4. M. Halperin, S. W. Greenhouse, J. Cornfield, and J. Zalokar, “Tables of Percentage Points for the Studentized
Maximum Absolute Deviate in Normal Samples,” Journal of the American Statistics Association 50 (1955):
185–95.

into electronic units through the regular production process. The electrical characteristics of the final 42 electronic units are shown in Table 15.2 (also see Table 1.7).

Table 15.2 Measurements on an electronic assembly.

Ceramic sheet:    1      2      3      4      5      6
                16.5   15.7   17.3   16.9   15.5   13.5
                17.2   17.6   15.8   15.8   16.6   14.5
                16.6   16.3   16.8   16.9   15.9   16.0
                15.0   14.6   17.2   16.8   16.5   15.9
                14.4   14.9   16.2   16.6   16.1   13.7
                16.5   15.2   16.9   16.0   16.2   15.2
                15.5   16.1   14.9   16.6   15.7   15.9

X̄              16.0   15.8   16.4   16.5   16.1   15.0
R                2.8    3.0    2.4    1.1    1.1    2.5
The troubleshooter should ask the question: “Is there evidence from the sample data
that some of the ceramic sheets are significantly different from their own group aver-
age?” If the answer is “no,” the data simply represent random or chance variation
around their own average, and there is no reason to expect improvement by using
selected ceramic sheets.
Analysis of Means
Analysis of means (ANOM) applied to the data from Table 15.2 (one independent vari-
able at k levels) comprises the following procedure.
Step 1: Plot a range chart (Figure 15.2b). All points are between D₃R̄ and D₄R̄. Then we compute

σ̂ = R̄/d₂* = 2.15/2.73 = 0.788

and

σ̂X̄ = 0.788/√7 = 0.30  with df ≅ (0.9)k(r − 1) ≅ 32


Step 2: Obtain the six averages from Table 15.2 and the grand average, X̿ = 15.97. Plot the averages as in Figure 15.2a.

Step 3: Compute the decision lines X̿ ± Hα·σ̂X̄ for k = 6, df = 32.

For α = 0.05:

UDL(0.05) = 15.97 + (2.54)(0.30) = 16.73
LDL(0.05) = 15.97 − (2.54)(0.30) = 15.21
Figure 15.2 Analysis of means charts (averages and ranges). (a) Averages (r = 7) with X̿ = 15.97, UDL(0.05) = 16.73, LDL(0.05) = 15.21, and LDL(0.01) = 15.03; (b) ranges with R̄ = 2.15, UCL = D₄R̄ = 4.13, LCL = D₃R̄ = 0.17. (Ceramic sheet data from Table 15.2.)

For α = 0.01:

UDL(0.01) = 15.97 + (3.13)(0.30) = 16.91
LDL(0.01) = 15.97 − (3.13)(0.30) = 15.03

The decision lines are drawn in Figure 15.2a; the risk α is indicated in parentheses at the end of each decision line.

Step 4: The point corresponding to sample 6, X̄ = 15.0, is below LDL(0.05) and very near LDL(0.01) = 15.03. Whether such a point is just outside or just inside
the decision lines will not impress many troubleshooters as representing
different bases for action. If they would reject or accept one, they would
similarly reject or accept the other.
Step 5: Interpretation. Sample 6 represents a ceramic sheet whose average is significantly different (statistically) from the grand average (risk α ≅ 0.01). This evidence supports the proposal to reject ceramic sheets such as those in sample 6.
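The whole ANOM for Table 15.2 can be reproduced from the raw readings. The constants d₂* = 2.73, H(0.05) = 2.54, and H(0.01) = 3.13 are the table values used above; the limits come out a few hundredths away from the text's because the text rounds the cell means before averaging.

```python
from math import sqrt

# Columns of Table 15.2: seven readings from each of six ceramic sheets
sheets = [
    [16.5, 17.2, 16.6, 15.0, 14.4, 16.5, 15.5],
    [15.7, 17.6, 16.3, 14.6, 14.9, 15.2, 16.1],
    [17.3, 15.8, 16.8, 17.2, 16.2, 16.9, 14.9],
    [16.9, 15.8, 16.9, 16.8, 16.6, 16.0, 16.6],
    [15.5, 16.6, 15.9, 16.5, 16.1, 16.2, 15.7],
    [13.5, 14.5, 16.0, 15.9, 13.7, 15.2, 15.9],
]
means = [sum(s) / len(s) for s in sheets]
grand = sum(means) / len(means)
R_bar = sum(max(s) - min(s) for s in sheets) / len(sheets)

sigma_hat = R_bar / 2.73            # d2* for k = 6 ranges of r = 7
sigma_xbar = sigma_hat / sqrt(7)

UDL_05 = grand + 2.54 * sigma_xbar  # H(0.05), k = 6, df ~ 32
LDL_05 = grand - 2.54 * sigma_xbar
LDL_01 = grand - 3.13 * sigma_xbar  # H(0.01)

# Sheets whose mean falls outside the 0.05 decision lines
flagged = [i + 1 for i, m in enumerate(means) if m < LDL_05 or m > UDL_05]
```

Only sheet 6 is flagged, matching the interpretation above.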
Figure 15.3 Comparing a group average with a given specification or a desired average: the average of the first five ceramic sheets (16.16, a single point with ng = 5r = 35, k = 1) compared to the desired average of 16.5, with decision limits 16.23 and 16.77 at α = 0.05 and 16.13 and 16.87 at α = 0.01.

Discussion

After removing sample 6, consider the grand average of the remaining five samples. Has the removal improved the average of the remaining ceramic sheets enough that they now represent a process average at the given specification of 16.5? To answer, we shall compare their average to decision lines (standard given) drawn around μ = 16.5. The average of the combined 35 observations from the remaining five samples is X̿ = 16.16; this is shown as a circled point in Figure 15.3. In this example, k = 1, since we wish to compare a single sample average, X̿ = 16.16, to a desired average value μ = 16.5; that is, a standard is given. The t statistic is used since we must estimate σ with σ̂ = 0.788 having df ≅ (0.9)(6)(7 − 1) ≅ 32.
    Decision lines, using our previous σ̂ = 0.788, are simply μ ± t·σ̂X̄. Using Table A.15: k = 1, ng = 5r = 35, df ≅ 32.
For α = 0.05:

LDL(0.05) = 16.50 − (2.04)(0.788/√35) = 16.23

For α = 0.01:

LDL(0.01) = 16.50 − (2.75)(0.788/√35) = 16.13

Decision

The grand average of the 35 electronic units made from the 35 pieces of ceramic is below the LDL for α = 0.05 and is therefore significantly lower (statistically) than μ = 16.5, risk α ≅ 0.05. Thus, no plan of rejecting individual ceramic sheets by sampling can be expected to raise the grand average of the remaining sheets to 16.5, risk < 0.05 and about 0.01.
Technical personnel need to consider three matters based on the previous analyses:
1. What can be done in processing ceramic sheets by the vendor to increase the
average electrical characteristic to about 16.5? It may take much technical
time and effort to get an answer.
2. Will it be temporarily satisfactory to assemble samples (r = 7) from each
ceramic sheet, and either reject or rework any ceramic sheet averaging below
15.21 or 15.03? (See Figure 15.2.) This would be expected to improve the
average somewhat.
3. Perhaps there are important factors other than ceramic sheets that offer
opportunities. What can be done in the assembly or processing of the
electronic assemblies to increase the electrical characteristic?

Case History 15.2


Adjustments on a Lathe
A certain grid (for electronic tubes) was wound under five different grid–lathe tensions
to study the possible effect on diameter.
Do the dimensions in Table 15.3 give evidence that tension (of the magnitude in-
cluded in this experiment) affects the diameter? It was the opinion in the department
that increased tension would reduce the diameter.
Interpretation
All of the points plotted in Figure 15.4 lie within the decision lines; also, there is no sug-
gestion of a downward trend in the five averages, as had been predicted. We do not have
evidence that the changes in grid–lathe tension affect the grid diameter.

Table 15.3 Grid diameters under tensions. See Figure 15.4.

       T20    T40    T60    T80    T120
        42     48     46     48      50
        46     48     42     46      45
        46     46     42     42      49
        44     47     46     45      46
        45     48     48     46      48

X̄ᵢ:   44.6   47.4   44.8   45.4    47.6
Rᵢ:      4      2      6      6       5
Figure 15.4 Comparing k = 5 subgroups with their own grand mean (ng = r = 5; X̿ = 45.96, UDL(0.05) = 48.16, LDL(0.05) = 43.76; R̄ = 4.60, UCL = D₄R̄ = 9.7). (Grid–lathe data from Table 15.3.)
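The limits of Figure 15.4 can be reproduced with the same no-standard-given recipe. In this sketch the constants d₂* = 2.34 (k = 5 ranges of r = 5) and H(0.05) = 2.50 (k = 5, df ≅ 18) are assumptions read back from the figure's limits rather than values stated in this excerpt.

```python
from math import sqrt

# Grid diameters (Table 15.3), one column per lathe tension
tensions = {
    "T20":  [42, 46, 46, 44, 45],
    "T40":  [48, 48, 46, 47, 48],
    "T60":  [46, 42, 42, 46, 48],
    "T80":  [48, 46, 42, 45, 46],
    "T120": [50, 45, 49, 46, 48],
}
means = {t: sum(v) / len(v) for t, v in tensions.items()}
grand = sum(means.values()) / len(means)
R_bar = sum(max(v) - min(v) for v in tensions.values()) / len(tensions)

sigma_xbar = (R_bar / 2.34) / sqrt(5)   # assumed d2* = 2.34 for k = 5, r = 5
UDL = grand + 2.50 * sigma_xbar         # assumed H(0.05) = 2.50 for k = 5, df ~ 18
LDL = grand - 2.50 * sigma_xbar

# Tensions whose mean falls outside the decision lines (none here)
outside = [t for t, m in means.items() if m < LDL or m > UDL]
```

No subgroup mean falls outside, consistent with the interpretation above.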

15.4 ANALYSIS OF MEANS—NO STANDARD GIVEN, MORE THAN ONE INDEPENDENT VARIABLE
Analysis of means of experiments involving multiple factors becomes more compli-
cated. An extension of the methods of Ott5 to the analysis of main effects and interac-
tions in a designed experiment was developed by Schilling.6,7,8
This approach is based on the experiment model and utilizes the departures, or dif-
ferentials, from the grand mean that are associated with the levels of the treatments run in

5. E. R. Ott, “Analysis of Means—A Graphical Procedure,” Industrial Quality Control 24, no. 2 (August 1967).
6. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 1, Analysis of Treatment Effects,” Journal
of Quality Technology 5, no. 3 (July 1973): 93–108.
7. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 2, Analysis of Contrasts,” Journal of
Quality Technology 5, no. 4 (October 1973): 147–55.
8. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 3, Analysis of Non-Normal Data,” Journal
of Quality Technology 5, no. 4 (October 1973): 156–59.

the experiment. These differentials, or treatment effects, are adjusted to remove any lower-
order effects and plotted against decision limits using the analysis of means procedure.
This approach is sometimes referred to as ANOME (ANalysis Of Means for Effects)
to distinguish it from ANOM (ANalysis Of Means), which plots the means against the
limits. The former allows application of analysis of means to sophisticated experiments
by using the experiment model, while the latter affords straightforward understanding by
comparing the means themselves directly to the limits. The following is a simplification
of this procedure that may be used for crossed and nested experiments.

15.5 ANALYSIS OF TWO-FACTOR CROSSED DESIGNS


For a two-factor factorial experiment having a levels of factor A and b levels of factor
B, with r observations per cell, as illustrated in Figure 15.5, the procedure is outlined
as follows:
Step 1: Calculate means and ranges as shown in Figure 15.5.
Step 2: Calculate treatment effects for the main effects as the difference between the level mean and the grand mean.

Main effects for factor A:

A1 = (X̄1. − X̿)
A2 = (X̄2. − X̿)
.......................
Aa = (X̄a. − X̿)

In general,

Ai = (X̄i. − X̿)

Main effects for factor B:

B1 = (X̄.1 − X̿)
B2 = (X̄.2 − X̿)
.......................
Bb = (X̄.b − X̿)

In general,

Bj = (X̄.j − X̿)

The cell layout referred to in Step 1 is:

                                B
                1             2          ...        b
       1   X̄11, R11     X̄12, R12     ...   X̄1b, R1b     X̄1.
  A    2   X̄21, R21     X̄22, R22     ...   X̄2b, R2b     X̄2.
      ...
       a   X̄a1, Ra1     X̄a2, Ra2     ...   X̄ab, Rab     X̄a.
           X̄.1           X̄.2          ...   X̄.b           X̿

Figure 15.5 Basic form of a two-factor crossed factorial experiment.
Step 3: Calculate treatment effects for interaction as the difference between the cell means and the grand mean, less all previously estimated lower-order effects that would be contained in the cell means. This gives the following interaction effects (AB):

AB11 = (X̄11 − X̿) − A1 − B1
AB12 = (X̄12 − X̿) − A1 − B2
...........................................
AB1b = (X̄1b − X̿) − A1 − Bb
AB21 = (X̄21 − X̿) − A2 − B1
AB22 = (X̄22 − X̿) − A2 − B2
...........................................
AB2b = (X̄2b − X̿) − A2 − Bb
ABa1 = (X̄a1 − X̿) − Aa − B1
ABa2 = (X̄a2 − X̿) − Aa − B2
...........................................
ABab = (X̄ab − X̿) − Aa − Bb

In general,

ABij = (X̄ij − X̿) − Ai − Bj

Refer to Figure 15.5 to see that any given cell mean X̄ij is affected not only by the interaction effect ABij but also by the main effects Ai and Bj. Hence, Ai and Bj must be subtracted out to give a legitimate estimate of the interaction effect.
Step 4: Estimate experimental error:

σ̂e = R̄/d₂*  with  df = 0.9ab(r − 1)

Alternatively, experimental error can be estimated more precisely using the standard deviation as

σ̂e = √{ [ Σ Xi² − Σi Σj nij Tij² − (Σ Xi)²/n ] / [ n − (Σ qi) − 1 ] }        (15.1)

with

df = n − (Σ qi) − 1

where
X = individual observation
t = number of effects tested (main effects, interactions, blocks, and so on)
ki = number of individual treatment effects (means) for an effect tested
n = total number of observations in the experiment
nij = number of observations in an individual treatment effect (mean)
Tij² = treatment effect squared
qi = degrees of freedom for an effect tested

(The sum Σ Xi² runs over all n observations; the double sum Σi Σj runs over the t effects tested and the ki treatment effects within each; Σ qi runs over the t effects tested.)
Step 5: Compute limits for the treatment effect differentials as

0 ± hα σ̂e √(q/n)

where
n = total number of observations in the experiment
q = degrees of freedom for the effect tested
k = number of points plotted

A main effect:     q = a − 1             k = a
B main effect:     q = b − 1             k = b
AB interaction:    q = ab − a − b + 1    k = ab

and hα is obtained as follows:

Main effects:    hα = Hα √(k/(k−1))    from Table A.8
Interactions:    hα = hα*              from Table A.19

Two different factors are necessary since Hα is exact for main effects only.⁹ For interactions and nested factors, hα* is used because of the nature of the correlation among the points plotted.
Step 6: Plot the chart as in Figure 15.6.

9. The factors ha* in Table A.19 follow from the approach to ANOM limits suggested in P. F. Ramig, “Applications of
Analysis of Means,” Journal of Quality Technology 15, no. 1 (January 1983): 19–25 and are incorporated in the
computer program for ANOM by P. R. Nelson, “The Analysis of Means for Balanced Experimental Designs,”
Journal of Quality Technology 15, no. 1 (January 1983): 45–56. Note that in the special case in which one or more
factors in an interaction have two levels, the above interaction limits are somewhat conservative. A complete dis-
cussion with appropriate critical values is given in P. R. Nelson, “Testing for Interactions Using the Analysis of
Means,” Technometrics 30, no. 1 (February 1988): 53–61. It is pointed out that when one factor has two levels, k
may be reduced by one-half. This fact is used in the above computer program. The approach used in the text is for
consistency and ease of application and will be found to be adequate in most cases.

Figure 15.6 Analysis of means chart for a two-factor experiment: the differentials A1 ... Aa, B1 ... Bb, and AB11 ... ABab plotted against UDL and LDL drawn around zero.

The following example studies the effects of developer strength A and development
time B on the density of a photographic film plate and illustrates the method. Figure
15.7 presents the data.
Step 1: See Figure 15.7 for means and ranges.
Step 2: Main effects are as follows:

A1 = (3.17 – 6.28) = – 3.11


A2 = (6.83 – 6.28) = 0.55
A3 = (8.83 – 6.28) = 2.55
B1 = (5.42 – 6.28) = –0.86
B2 = (6.08 – 6.28) = –0.20
B3 = (7.33 – 6.28) = 1.05

Step 3: Interaction effects are as follows:

AB11 = (2.75 – 6.28) – (–3.11) – (–0.86) = 0.44


AB12 = (2.50 – 6.28) – (–3.11) – (–0.20) = –0.47
AB13 = (4.25 – 6.28) – (–3.11) – ( 1.05) = 0.03
AB21 = (5.50 – 6.28) – ( 0.55) – (–0.86) = –0.47
AB22 = (7.00 – 6.28) – ( 0.55) – (–0.20) = 0.37
AB23 = (8.00 – 6.28) – ( 0.55) – ( 1.05) = 0.12
AB31 = (8.00 − 6.28) − (2.55) − (−0.86) = 0.03
AB32 = (8.75 − 6.28) − (2.55) − (−0.20) = 0.12
AB33 = (9.75 − 6.28) − (2.55) − (1.05) = −0.13

                             Development time B
                      10                15                18
             1    0, 5, 2, 4        1, 4, 3, 2        2, 4, 5, 6
                  X̄11 = 2.75        X̄12 = 2.50        X̄13 = 4.25       X̄1. = 3.17
Developer         R11 = 5           R12 = 3           R13 = 4
strength A   2    4, 7, 6, 5        6, 7, 8, 7        9, 8, 10, 5
                  X̄21 = 5.50        X̄22 = 7.00        X̄23 = 8.00       X̄2. = 6.83
                  R21 = 3           R22 = 2           R23 = 5
             3    7, 8, 10, 7       10, 8, 10, 7      12, 9, 10, 8
                  X̄31 = 8.00        X̄32 = 8.75        X̄33 = 9.75       X̄3. = 8.83
                  R31 = 3           R32 = 3           R33 = 4
                  X̄.1 = 5.42        X̄.2 = 6.08        X̄.3 = 7.33       X̿ = 6.28

Figure 15.7 Density of photographic film plate (r = 4 observations per cell).

Step 4: Experimental error is estimated here using the range. (See Sections 15.6 and 15.7 for examples of the standard deviation method.)

R̄ = 32/9 = 3.56

σ̂e = R̄/d₂* = 3.56/2.08 = 1.71

with df = 0.9(3)(3)(4 − 1) = 24.3 ≅ 25



Step 5: Limits are

• Main effects: df = 25, n = 36, k = 3, q = 2, α = 0.05

  hα = Hα √(k/(k−1)) = 2.04 √(3/2) = 2.50

  0 ± (1.71)(2.50)√(2/36)
  0 ± 1.01

• Interaction: df = 25, n = 36, k = 9, q = 4, α = 0.05

  hα = hα* = 3.03

  0 ± (1.71)(3.03)√(4/36)
  0 ± 1.73

Step 6: Plot as in Figure 15.8.


We see from Figure 15.8 that developer strength A and development time B are both
significant while interaction AB is not significant. Note that analysis of means indicates
which levels are contributing to the significant results.
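Steps 1 through 5 can be run end to end on the film-plate data. A sketch, using the constants quoted above (d₂* = 2.08, h(0.05) = 2.50 for main effects, h*(0.05) = 3.03 for the interaction):

```python
from math import sqrt

# Density data (Figure 15.7): cells[i][j] holds the r = 4 observations
# at developer strength i and development time j
cells = [
    [[0, 5, 2, 4], [1, 4, 3, 2], [2, 4, 5, 6]],
    [[4, 7, 6, 5], [6, 7, 8, 7], [9, 8, 10, 5]],
    [[7, 8, 10, 7], [10, 8, 10, 7], [12, 9, 10, 8]],
]
a = b = 3
r = 4
n = a * b * r

cell_mean = [[sum(c) / r for c in row] for row in cells]
grand = sum(sum(row) for row in cell_mean) / (a * b)

# Steps 2-3: main-effect and interaction differentials
A = [sum(cell_mean[i]) / b - grand for i in range(a)]
B = [sum(cell_mean[i][j] for i in range(a)) / a - grand for j in range(b)]
AB = [[cell_mean[i][j] - grand - A[i] - B[j] for j in range(b)] for i in range(a)]

# Step 4: range-based error estimate
R_bar = sum(max(c) - min(c) for row in cells for c in row) / (a * b)
sigma_e = R_bar / 2.08

# Step 5: limits 0 +/- h * sigma_e * sqrt(q/n)
main_limit = 2.50 * sigma_e * sqrt((a - 1) / n)
inter_limit = 3.03 * sigma_e * sqrt((a * b - a - b + 1) / n)

sig_A = [abs(t) > main_limit for t in A]
sig_B = [abs(t) > main_limit for t in B]
sig_AB = [abs(t) > inter_limit for row in AB for t in row]
```

The flags reproduce Figure 15.8: A1, A3, and B3 fall outside the main-effect limits, and no interaction differential is significant.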
The analysis of variance for these results is given in Figure 15.9. Since the results
of analysis of variance and analysis of means are usually consistent with each other, it

Figure 15.8 Analysis of means of density: the differentials A1, A2, A3, B1, B2, B3, and AB11 through AB33 plotted against main-effect decision limits ±1.01 and interaction limits ±1.73 (α = 0.05).



Analysis of Variance
Source df SS MS F F0.05
Strength 2 198.22 99.11 37.684 3.362
Time 2 22.72 11.36 4.319 3.362
Interaction 4 3.28 0.82 0.312 2.732
Error 27 71.00 2.63
Total 35 295.22

Figure 15.9 Analysis of variance of density.

is not surprising that ANOVA shows only main effects to be significant. In fact, the
ANOVA table can be constructed directly from the treatment effects.

15.6 THE RELATION OF ANALYSIS OF MEANS TO ANALYSIS OF VARIANCE (OPTIONAL)
It should be noted that the experiment model (see Case History 15.3) for both the analysis of means (ANOM) and the analysis of variance (ANOVA) is the same. Thus, for the density data given in Figure 15.7:

Xijk = μ + Ai + Bj + ABij + ek(ij)

where the capital letters represent the treatment effects (or differentials) calculated in
Step 2. For a given factor, ANOM looks at each of the associated treatment effects indi-
vidually to see if any of them depart significantly from an expected value of zero.
ANOVA, on the other hand, looks at the treatment effects for a factor as a group. It is
therefore not surprising that the sums of squares (SS) of an analysis of variance are
related to the treatment effects Ti of the corresponding analysis of means.
We have

SSj = nj Σ Ti²  (sum over i = 1, ..., k)

where
Ti = treatment effect for level i of factor j
nj = number of observations in an individual treatment effect mean

The ANOVA table can be constructed by conventional methods, or from the treatment effects themselves, using the above relation to obtain the sums of squares.
For the density data we have

Strength: SS(A) = 12 ∑ Ai² = 12[(−3.11)² + (0.55)² + (2.55)²] = 12(16.4771) = 197.7
478 Part III: Troubleshooting and Process Improvement

Time: SS(B) = 12 ∑ Bj² = 12[(−0.86)² + (−0.20)² + (1.05)²] = 12(1.8821) = 22.6

( 0.44 )2 + ( −0.47 )2 + ( 0.03)2 


3 3  
Interaction: SS ( AB ) = 4∑ ∑ ABij2 = 4  + ( −0.47 ) + ( 0.37 ) + ( 0.12 ) 
2 2 2

 
i =1 j =1
 + ( 0.03)2 + ( 0.12 )2 + ( −0.13)2 
 
= 4  0.8198  = 3.3

These sums of squares correspond to those shown in Figure 15.9, which were obtained
by the conventional methods. Using the treatment effects and associated degrees of free-
dom from ANOM, the ANOVA table is as shown in Figure 15.10.
Here

Total = SST = ∑ Xi² − (∑ Xi)²/n = 1714 − (226)²/36 = 295.22

with degrees of freedom qT = n − 1 = 36 − 1 = 35

Model = SS(Model) = SS(A) + SS(B) + SS(AB) = 197.7 + 22.6 + 3.3 = 223.6


with degrees of freedom, qA + qB + qAB = 2 + 2 + 4 = 8

Error = SSE = SST – SS(Model) = 295.22 – 223.6 = 71.62


with degrees of freedom qE = n − (∑ qi) − 1 = 36 − 8 − 1 = 27

with the error sums of squares and degrees of freedom relations taken from Equation
(15.1). Discrepancies are due to numerical differences in calculation between the
two methods.

Source SS df MS
A SS(A ) = 197.7 qA = 2 98.800
B SS(B ) = 22.6 qB = 2 11.300
AB SS(AB ) = 3.3 qAB = 4 0.825
Error SSE = 71.6 qE = 27 2.652
Total SST = 295.2 qT = 35

Figure 15.10 ANOVA table format using treatment effects.



Using Equation (15.1), we have

σ̂e = √[ (SS(Total) − SS(Model)) / degrees of freedom ] = √[ (295.22 − 223.6)/27 ] = √(71.62/27) = 1.63

which could have been used in place of the range estimate in calculating the ANOM limits. Note that it is the same error estimate obtained by taking the square root of the mean sum of squares for error in the ANOVA table in Figure 15.10.
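The identity SSE = SST − SS(Model) is quickly checked numerically; a sketch, using the rounded sums of squares for the density data:

```python
import math

# Error estimate from the ANOVA identity SSE = SST - SS(Model).
sst = 295.22
ss_model = 197.7 + 22.6 + 3.3            # SS(A) + SS(B) + SS(AB) = 223.6
df_error = 36 - (2 + 2 + 4) - 1          # n - (qA + qB + qAB) - 1 = 27
sigma_e = math.sqrt((sst - ss_model) / df_error)
print(round(sigma_e, 2))                  # 1.63
```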
There is an old saying, “You can’t see the forest for the trees.” For each treatment,
the analysis of variance studies the forest, while the analysis of means studies the trees.
However, both are part of the same landscape.

15.7 ANALYSIS OF FULLY NESTED DESIGNS (OPTIONAL)
It is sometimes the case that levels of one factor are nested within another higher-order
factor, rather than applicable across all levels of the other. For example, within a plant,
machines and operators may be interchanged, but if the machines were in different
plants, or even countries, it is unlikely that all operators would be allowed to run all
machines. The operators would only run the machine in their plant. They would then be
nested within their own machine. Note that this is different from the crossed experiments
discussed previously in that the average for an operator would apply only to one machine.
Nested experiments can be analyzed using the same steps as are shown for crossed
experiments in Section 15.5 with the exception of Step 3 and Step 5. These should be
modified as follows:
Step 3: Calculate the treatment effects as the difference between the level mean
and the grand mean minus the treatment effects for the factors within which
the factor is nested.
Treatment effects for factor A (from Step 2):

A1 = X̄1 − X̿
A2 = X̄2 − X̿
.......................
Aa = X̄a − X̿

In general,

Ai = X̄i − X̿

Treatment effects for factor B nested within A:

B1(1) = (X̄1(1) − X̿) − A1
B2(1) = (X̄2(1) − X̿) − A1
...................................
B1(2) = (X̄1(2) − X̿) − A2
..................................
Bb(a) = (X̄b(a) − X̿) − Aa

In general,

Bj(i) = (X̄j(i) − X̿) − Ai
Treatment effects for factor C nested within A and B:

C1(1,1) = (X̄1(1,1) − X̿) − A1 − B1(1)
.................................................
C1(2,2) = (X̄1(2,2) − X̿) − A2 − B2(2)
.................................................
Cc(a,b) = (X̄c(a,b) − X̿) − Aa − Bb(a)

In general,

Ck(i,j) = (X̄k(i,j) − X̿) − Ai − Bj(i)

The pattern is continued for all subsequent nested factors.


Note: No interaction effects can be obtained from a fully nested design.
Step 5: Calculate limits for the highest-order factor using the main-effect limit
formula. Calculate limits for the nested factors using the formula given for
interactions as follows:
Highest-order factor:

hα = Hα √(k/(k − 1))   where Hα is from Table A.8

Nested factors:

hα = hα* from Table A.19

Consider the following fully nested experiment10 in Figure 15.11 showing the copper
content (coded by subtracting 84) of two samples from each of 11 castings:

10. C. A. Bennett and N. L. Franklin, Statistical Analysis in Chemistry and the Chemical Industry (New York: John
Wiley & Sons, 1954): 364.

Casting 1 2 3 4 5

Sample 1 2 1 2 1 2 1 2 1 2
Observation #1 1.54 1.51 1.54 1.25 1.72 0.94 1.48 0.98 1.54 1.84
Observation #2 1.56 1.54 1.60 1.25 1.77 0.95 1.50 1.02 1.57 1.84

Sample X 1.55 1.52 1.57 1.25 1.74 0.94 1.49 1.00 1.56 1.84

Casting X 1.54 1.41 1.34 1.24 1.70

Casting 6 7 8 9 10 11

Sample 1 2 1 2 1 2 1 2 1 2 1 2
Observation #1 1.72 1.81 1.72 1.81 2.12 2.12 1.47 1.755 0.98 1.90 1.12 1.18
Observation #2 1.86 1.91 1.76 1.84 2.12 2.20 1.49 1.77 1.10 1.90 1.17 1.24

Sample X 1.79 1.86 1.74 1.82 2.12 2.16 1.48 1.76 1.04 1.90 1.14 1.21

Casting X 1.82 1.78 2.14 1.62 1.47 1.18


X = 1.57

Figure 15.11 Copper content of castings (X – 84).

Step 1: Means are shown in Figure 15.11.

Step 2: Differentials (X̄ − X̿) are shown as part of Step 3.
Step 3: Treatment effects for the fully nested experiment are now computed.

Casting treatment effect Sample treatment effect


C1 = 1.54 – 1.57 = –0.03 S1(1) = (1.55 – 1.57) – (–0.03) = 0.01
C2 = 1.41 – 1.57 = –0.16 S2(1) = (1.52 – 1.57) – (–0.03) = –0.02
C3 = 1.34 – 1.57 = –0.23 S1(2) = (1.57 – 1.57) – (–0.16) = 0.16
C4 = 1.24 – 1.57 = –0.33 S2(2) = (1.25 – 1.57) – (–0.16) = –0.16
C5 = 1.70 – 1.57 = 0.13 S1(3) = (1.74 – 1.57) – (–0.23) = 0.40
C6 = 1.82 – 1.57 = 0.25 S2(3) = (0.94 – 1.57) – (–0.23) = –0.40
C7 = 1.78 – 1.57 = 0.21 S1(4) = (1.49 – 1.57) – (–0.33) = 0.25
C8 = 2.14 – 1.57 = 0.57 S2(4) = (1.00 – 1.57) – (–0.33) = –0.25
C9 = 1.62 – 1.57 = 0.05 S1(5) = (1.56 – 1.57) – ( 0.13) = –0.14
C10 = 1.47 – 1.57 = –0.10 S2(5) = (1.84 – 1.57) – ( 0.13) = 0.14
C11 = 1.18 – 1.57 = –0.39 S1(6) = (1.79 – 1.57) – ( 0.25) = –0.03
S2(6) = (1.86 – 1.57) – ( 0.25) = 0.04
S1(7) = (1.74 – 1.57) – ( 0.21) = –0.04
S2(7) = (1.82 – 1.57) – ( 0.21) = 0.04
S1(8) = (2.12 – 1.57) – ( 0.57) = –0.02
S2(8) = (2.16 – 1.57) – ( 0.57) = 0.02
S1(9) = (1.48 – 1.57) – ( 0.05) = –0.14
S2(9) = (1.76 – 1.57) – ( 0.05) = 0.14
S1(10) = (1.04 – 1.57) – (–0.10) = –0.43
S2(10) = (1.90 – 1.57) – (–0.10) = 0.43
S1(11) = (1.14 – 1.57) – (–0.39) = –0.04
S2(11) = (1.21 – 1.57) – (–0.39) = 0.03
∑Ci² = 0.8013          ∑(Sj(i))² = 0.949
4∑Ci² = 3.2052         2∑(Sj(i))² = 1.898

Step 4: Error may be estimated using Equation (15.1).


σ̂e = √{ [ ∑Xi² − ∑∑ nij Tij² − (∑Xi)²/n ] / [ n − (∑qi) − 1 ] }

   = √{ [ 113.340 − 4(0.8013) − 2(0.949) − (69)²/44 ] / (44 − 10 − 11 − 1) }

   = √(0.0353/22)

   = 0.04
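The arithmetic of Equation (15.1) can be sketched directly; note that carrying the rounded sums of squared effects above reproduces the 0.04 result, with small rounding differences from the book's intermediate 0.0353:

```python
import math

# Equation (15.1) applied to the copper-content data (coded values).
sum_x2 = 113.340                       # sum of the 44 squared observations
sum_x = 69                             # sum of the 44 observations
num = sum_x2 - 4 * 0.8013 - 2 * 0.949 - sum_x**2 / 44
sigma_e = math.sqrt(num / (44 - 10 - 11 - 1))
print(round(sigma_e, 2))               # 0.04
```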

Step 5: Limits using α = 0.05 are as follows:

Castings C (k = 11, df = 22, q = 10):

hα = Hα √(k/(k − 1)) = 2.98 √(11/10) = 3.13

0 ± σ̂e hα √(q/n)
0 ± 0.04(3.13) √(10/44)
0 ± 0.0597

Samples S (k = 22, df = 22, q = 11):

hα = hα* = 3.42

0 ± σ̂e hα √(q/n)
0 ± 0.04(3.42) √(11/44)
0 ± 0.0684
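The two sets of limits can be sketched as follows; keeping h unrounded gives 0.0596 for the castings, against the book's 0.0597 obtained with h rounded to 3.13:

```python
import math

# Decision limits for the nested ANOM (alpha = 0.05): sigma_e = 0.04, n = 44.
sigma_e, n = 0.04, 44
h_castings = 2.98 * math.sqrt(11 / 10)             # H(0.05; k = 11, df = 22) = 2.98, Table A.8
lim_castings = sigma_e * h_castings * math.sqrt(10 / n)
lim_samples = sigma_e * 3.42 * math.sqrt(11 / n)   # h* (df = 22) = 3.42, Table A.19
print(round(lim_castings, 4), round(lim_samples, 4))   # 0.0596 0.0684
```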

Step 6: The chart for this nested experiment (Figure 15.12) shows a scale for the mean
as well as for treatment effects since such a scale is meaningful in this case.
Thus, the result is the same as the analysis of variance, shown in Figure 15.13,
but in this case, the plot reveals the nature of the considerable variation.

[Two ANOM charts: the casting effects Ci for castings 1–11, and the sample effects Sj(i) for samples 1 and 2 within each casting, each with the treatment-effect scale on the left and the corresponding mean scale, centered at X̿ = 1.57, on the right.]
Figure 15.12 Nested analysis of means of copper content of castings.



Source SS df MS F F0.05
Castings 3.2031 10 0.3202 200.1 2.30
Samples (within
castings) 1.9003 11 0.1728 108.0 2.26
Residual 0.0351 22 0.0016
Total 5.1385 43

Figure 15.13 Analysis of variance of copper content of castings.

15.8 ANALYSIS OF MEANS FOR CROSSED EXPERIMENTS—MULTIPLE FACTORS
The two-factor crossed analysis of means is easily extended to any number of factors or
levels. The procedure remains essentially the same as that for a two-factor experiment
with an extension to higher-order interactions.
To determine the differentials (or treatment effects) for a higher-order interaction,
calculate the appropriate cell means by summing over all factors not included in the
interaction. Then obtain the difference of these cell means from the grand mean. Subtract
all main effects and lower-order interactions contained in the cell means used. This
implies that it is best to work from main effects to successive higher-order interactions.
For a three-factor experiment, this gives:

Ai = X̄i − X̿
Bj = X̄j − X̿
Ck = X̄k − X̿
ABij = (X̄ij − X̿) − Ai − Bj
ACik = (X̄ik − X̿) − Ai − Ck
BCjk = (X̄jk − X̿) − Bj − Ck
ABCijk = (X̄ijk − X̿) − Ai − Bj − Ck − ABij − ACik − BCjk

Suppose there are a levels of A, b levels of B, and c levels of C. Then the degrees of freedom q for each effect tested are:

qA = a − 1
qB = b − 1
qC = c − 1
qAB = (ab − 1) − qA − qB
qAC = (ac − 1) − qA − qC
qBC = (bc − 1) − qB − qC
qABC = (abc − 1) − qA − qB − qC − qAB − qAC − qBC

In other words, the initial differences are replaced by the number of cell means
minus one, and from that is subtracted the degrees of freedom of all the lower-order
effects contained therein.
We see also that k, the number of cell means, or points to be plotted on the chart, is
equal to the product of the number of levels of the factors included in the treatment
effect. So
kA = a      kB = b      kC = c
kAB = ab    kAC = ac    kBC = bc
kABC = abc
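The degrees-of-freedom bookkeeping above follows a simple recursion, which a few lines of Python make explicit (the helper name dof is illustrative, not from the text):

```python
from itertools import combinations
from math import prod

def dof(levels):
    """Degrees of freedom for every crossed effect:
    q(effect) = (product of its levels - 1) minus the q of all contained effects."""
    q = {}
    names = sorted(levels)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            contained = sum(q["".join(sub)]
                            for rr in range(1, r)
                            for sub in combinations(combo, rr))
            q["".join(combo)] = prod(levels[f] for f in combo) - 1 - contained
    return q

q = dof({"A": 2, "B": 3, "C": 4})
print(q["A"], q["AB"], q["ABC"])  # 1 2 6
```

Note that the effect degrees of freedom sum to abc − 1, as they must.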

If the experiment were expanded to include another factor D with d levels, we would also have:

Dl = X̄l − X̿
ADil = (X̄il − X̿) − Ai − Dl
BDjl = (X̄jl − X̿) − Bj − Dl
CDkl = (X̄kl − X̿) − Ck − Dl
ABDijl = (X̄ijl − X̿) − Ai − Bj − Dl − ABij − ADil − BDjl
ACDikl = (X̄ikl − X̿) − Ai − Ck − Dl − ACik − ADil − CDkl
BCDjkl = (X̄jkl − X̿) − Bj − Ck − Dl − BCjk − BDjl − CDkl
ABCDijkl = (X̄ijkl − X̿) − Ai − Bj − Ck − Dl − ABij − ACik − ADil − BCjk − BDjl − CDkl − ABCijk − ABDijl − ACDikl − BCDjkl

And clearly:

qD = d − 1
qAD = (ad − 1) − qA − qD
qBD = (bd − 1) − qB − qD
qCD = (cd − 1) − qC − qD
qABD = (abd − 1) − qA − qB − qD − qAB − qAD − qBD
qACD = (acd − 1) − qA − qC − qD − qAC − qAD − qCD
qBCD = (bcd − 1) − qB − qC − qD − qBC − qBD − qCD
qABCD = (abcd − 1) − qA − qB − qC − qD − qAB − qAC − qAD − qBC − qBD − qCD − qABC − qABD − qACD − qBCD

with

kD = d        kAD = ad      kBD = bd      kCD = cd
kABD = abd    kACD = acd    kBCD = bcd    kABCD = abcd

It should be noted that, regardless of the number of factors or levels, or the number
of replicates r per cell, n is the total number of observations in the experiment. So for a
four-factor experiment with r observations per cell:

n = abcdr

Case History 15.3


2 × 3 × 4 Factorial Experiment—Lengths of Steel Bars
The example in Table 15.4 is given by Ott11 and will illustrate the approach given here.
Steel bars were made from two heat treatments (W and L) and cut on four screw
machines (A,B,C,D) at three times (1, 2, 3—at 8:00 AM, 11:00 AM, and 3:00 PM, all on
the same day), with four replicates. The time element suggested the possibility of fatigue
on the part of the operator, which may have induced improper machine adjustment. The
results are shown in Table 15.4 with averages summarized in Table 15.5.
Suppose the main effects of time, machine, and heat were each analyzed separately
as an analysis of k independent samples. We would proceed as follows:

11. E. R. Ott, “Analysis of Means—A Graphical Procedure,” Industrial Quality Control 24, no. 2 (August 1967):
101–9.

Table 15.4 A 2 × 3 × 4 factorial experiment (data coded).


Data: Lengths of steel bars*
Heat treatment W Heat treatment L
Machine Machine
A B C D A B C D
Time 1 6 7 1 6 4 6 –1 4
9 9 2 6 6 5 0 5
1 5 0 7 0 3 0 5
3 5 4 3 1 4 1 4

X 4.75 6.50 1.75 5.50 2.75 4.50 0.00 4.50

X 4.63 2.94

R 8 4 4 4 T1 = 3.78 6 3 2 1
Time 2 6 8 3 7 3 6 2 9
3 7 2 9 1 4 0 4
1 4 1 11 1 1 –1 6
–1 8 0 6 –2 3 1 3

X 2.25 6.75 1.50 8.25 0.75 3.50 0.50 5.50

X 4.69 2.56

R 7 4 3 5 T2 = 3.63 5 5 3 6
Time 3 5 10 –1 10 6 8 0 4
4 11 2 5 0 7 –2 3
9 6 6 4 3 10 4 7
6 4 1 8 7 0 –4 0

X 6.00 7.75 2.00 6.75 4.00 6.25 –0.50 3.50

X 5.63 3.31

R 5 7 7 6 T3 = 4.47 7 10 8 7
Column X̄:  W: 4.33  7.00  1.75  6.83    L: 2.50  4.75  0.00  4.50
W̄ = 4.98    L̄ = 2.94
Ā = 3.42    B̄ = 5.88    C̄ = 0.88    D̄ = 5.67
X̿ = 3.96    R̄ = 5.29
* W. D. Baten, “An Analysis of Variance Applied to Screw Machines,” Industrial Quality Control 7, no. 10
(April 1956).

Table 15.5 Summary of averages (main effects).

Time           Machine        Heat
T̄1 = 3.78     Ā = 3.42      W̄ = 4.98
T̄2 = 3.63     B̄ = 5.88     L̄ = 2.94
T̄3 = 4.47     C̄ = 0.88
               D̄ = 5.67
ng = 32        ng = 24        ng = 48

[Range chart of the 24 subgroup ranges, arranged by heat treatment, time, and machine, with R̄ = 5.29, UCL = 12.1, and LCL = 0.]
Figure 15.14 Range chart of lengths of steel bars.


The first step is to prepare a range chart, as in Figure 15.14, with R̄ = 5.29 and

D4 R̄ = (2.28)(5.29) = 12.1

σ̂ = R̄/d2* = 5.29/2.07 = 2.56

df = 0.9(24)(4 − 1) = 64.8 ≅ 65

All the points lie below the control limit, and this is accepted as evidence of homogeneity of ranges. However, it may be noted that seven of the eight points for time 3 are above the average range R̄, which suggests increased variability at time 3.
The second step is to compute the averages—these are shown in Table 15.5. It is
immediately evident that the largest differences are between machines, and the least
between times.
Next, decision limits are determined as in Figure 15.15.
Then, the computed decision lines are drawn and the main effects are plotted on the
analysis of means (ANOM) chart (Figure 15.16).
The differences in machine settings contribute most to the variability in the length of
the steel bars; this can probably be reduced substantially by the appropriate factory per-
sonnel. Just which machines should be adjusted, and to what levels, can be determined
by reference to the specifications.
The effect of heat treatments is also significant (at the 0.01 level). Perhaps the machines can be adjusted to compensate for differences in the effect of heat treatment; perhaps the variability of heat treatment can be reduced in that area of processing. The magnitude of the machine differences is greater than the magnitude of the heat treatment differences.

Time                      Machine                    Heat
σ̂T = σ̂/√32 = 0.454     σ̂M = σ̂/√24 = 0.524      σ̂H = σ̂/√48 = 0.371
kT = 3                    kM = 4                     kH = 2
H0.05 = 1.96              H0.01 = 2.72               H0.01 = 1.88
UDL = 4.84                UDL = 5.38                 UDL = 4.69
LDL = 3.06                LDL = 2.52                 LDL = 3.21

Figure 15.15 Decision limits for main effects for length of steel bars.

[ANOM chart of the main effects T1–T3, A–D, and W/L, plotted about X̿ = 3.96 against the decision limits of Figure 15.15; the right-hand scale shows the treatment effects Ti.]

Figure 15.16 Analysis of means of length of steel bars—main effects.
Time did not show a statistically significant effect at either the 0.01 or 0.05 level.
However, it may be worthwhile to consider the behavior of the individual machines with
respect to time.
Whether the magnitudes of the various effects found in this study are enough to
explain differences that were responsible for the study must be discussed with the
responsible factory personnel. Statistical significance has been found. If it is not of
practical significance, then additional possible causative factors need to be considered.

Certain combinations of these three factors (heat, machines, and time) may produce
an effect not explained by the factors considered separately; such effects are called
interactions. An answer to the general question of whether a two-factor interaction
exists—and whether it is of such a magnitude to be of actual importance—can be pre-
sented using the analysis of means approach.12
When there are more factors than those included in the interaction, averages are found by ignoring all factors except those being considered.
In troubleshooting projects, main effects will usually provide larger opportunities
for improvement than interactions—but not always.
Notice the scale on the right side of the ANOM chart. It shows values of the means
plotted minus the constant 3.96, which is the grand mean. In plotting the means for each
treatment, we have constructed the chart for the main effect differentials or treatment
effects, for time, machines, and heat treatment. Thus, by this simple transformation, the
mean effect chart may be thought of in terms of the means themselves or in terms of
the treatment effects or differentials from the grand mean brought about by the levels at
which the experiment was run. Whether these differences are substantial enough to be
beyond chance is indicated by their position relative to the decision lines.
For interactions, the differentials are interpreted as departures from the grand mean
caused by the treatment effect plotted, that is, how much of a difference a particular set
of conditions made. Its significance is again determined by the decision lines.
The steel bar data will now be analyzed in an analysis of means for treatment effects
or ANOME. Underlying this analysis is the assumption of a mathematical model for the
experiment, whereby the magnitude of an individual observation would be composed of
the true mean of the data m plus treatment effect differentials, up or down, depending
on the particular set of treatments applied. That is,

Xijkl = m + Mi + Tj + Hk + MTij + MHik + THjk + MTHijk + el(ijk)

The following is the complete analysis of means for treatment effects (ANOME) on
the steel-bar data.
Step 1: Means and ranges are calculated and summarized as shown in Table 15.4.
Step 2: Compute the differentials to estimate the main effects. The sum of the squared
treatment effects is also shown (to be used in estimating experimental error).
Machines (M): Mi = X̄i − X̿

M1 = 3.42 − 3.96 = −0.54
M2 = 5.88 − 3.96 = 1.92
M3 = 0.88 − 3.96 = −3.08
M4 = 5.67 − 3.96 = 1.71          ∑(Mi)² = 16.3885

Times (T): Tj = X̄j − X̿

T1 = 3.78 − 3.96 = −0.18
T2 = 3.63 − 3.96 = −0.33
T3 = 4.47 − 3.96 = 0.51          ∑(Tj)² = 0.4014

Heats (H): Hk = X̄k − X̿

H1 = 4.98 − 3.96 = 1.02
H2 = 2.94 − 3.96 = −1.02         ∑(Hk)² = 2.0808
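The differentials for any one factor are a one-line computation; a sketch for the machine effects:

```python
# Main-effect differentials for the steel-bar data (grand mean 3.96).
grand = 3.96
machines = {"A": 3.42, "B": 5.88, "C": 0.88, "D": 5.67}
M = {m: round(x - grand, 2) for m, x in machines.items()}
ss_M = round(sum(d * d for d in M.values()), 4)
print(M)       # {'A': -0.54, 'B': 1.92, 'C': -3.08, 'D': 1.71}
print(ss_M)    # 16.3885
```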

Step 3: Treatment effects for interactions are now computed.


Machine × time (MT)

Averages, Xij Treatment effects, MTij
Time Time
Machine T1 T2 T3 T1 T2 T3
A 3.75 1.50 5.00 M1 0.51 –1.59 1.07
B 5.50 5.13 7.00 M2 –0.20 –0.42 0.61
C 0.88 1.00 0.75 M3 0.18 0.45 –0.64
D 5.00 6.88 5.13 M4 –0.49 1.54 –1.05

MTij = (X̄ij − X̿) − Mi − Tj
MT11 = (3.75 − 3.96) − (−0.54) − (−0.18) = 0.51
∑(MTij)² = 8.8803

Machine × heat (MH)



Averages, Xik Treatment effects, MHik
Heat Heat
Machine W L H1 H2
A 4.33 2.50 M1 –0.11 0.10
B 7.00 4.75 M2 0.10 –0.11
C 1.75 0.00 M3 –0.15 0.14
D 6.83 4.50 M4 0.14 –0.15

MHik = (X̄ik − X̿) − Mi − Hk
MH11 = (4.33 − 3.96) − (−0.54) − (+1.02) = −0.11
∑(MHik)² = 0.1284

Time × heat (TH)



Averages, X̄jk                    Treatment effects, THjk
Time Time
Heat T1 T2 T3 T1 T2 T3
W 4.63 4.69 5.63 H1 –0.17 0.04 0.14
L 2.94 2.56 3.31 H2 0.18 –0.05 –0.14

THjk = (X̄jk − X̿) − Tj − Hk
TH11 = (4.63 − 3.96) − (−0.18) − (+1.02) = −0.17
∑(THjk)² = 0.1046

Machine × time × heat (MTH)



Averages, Xijk
Heat, W Heat, L
Machine Machine
Time A B C D A B C D
T1 4.75 6.50 1.75 5.50 2.75 4.50 0.00 4.50
T2 2.25 6.75 1.50 8.25 0.75 3.50 0.50 5.50
T3 6.00 7.75 2.00 6.75 4.00 6.25 –0.50 3.50

Treatment Effects, MTHijk


Heat, H1 Heat, H2
Machine Machine
Time M1 M2 M3 M4 M1 M2 M3 M4
T1 0.26 0.05 0.17 -0.49 –0.26 –0.05 –0.18 0.49
T2 –0.20 0.46 –0.41 0.17 0.22 –0.45 0.43 –0.16
T3 –0.05 –0.51 0.24 0.32 0.06 0.52 –0.23 –0.32

MTHijk = (X̄ijk − X̿) − Mi − Tj − Hk − MTij − MHik − THjk
MTH111 = (4.75 − 3.96) − (−0.54) − (−0.18) − (+1.02) − (+0.51) − (−0.11) − (−0.17) = 0.26
∑(MTHijk)² = 2.4436
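The three-factor differential is just the cell mean with every lower-order effect subtracted out; a sketch for the first cell, using the rounded values above:

```python
# One three-factor differential, MTH(1,1,1), built from the cell mean and the
# lower-order effects already computed (rounded book values).
cell_mean = 4.75                      # machine A, time 1, heat W
grand = 3.96
M1, T1, H1 = -0.54, -0.18, 1.02
MT11, MH11, TH11 = 0.51, -0.11, -0.17
mth111 = (cell_mean - grand) - M1 - T1 - H1 - MT11 - MH11 - TH11
print(round(mth111, 2))               # 0.26
```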

Step 4: Experimental error may be estimated using the range as above. We obtain

σ̂e = R̄/d2* = 5.29/2.07 = 2.56

Alternatively, the treatment effects themselves may be used to estimate error


based on the standard deviation. This will give us more degrees of freedom
for error. The formula is that of Equation (15.1).
σ̂e = √{ [ ∑Xi² − ∑∑ nij Tij² − (∑Xi)²/n ] / [ n − (∑qi) − 1 ] }

where
X = individual observation
t = number of effects tested (main effects, interactions, blocks, and so on)
ki = number of individual treatment effects (means) for an effect tested
n = total number of observations in the experiment
nij = number of observations in an individual treatment effect
Tij² = treatment effect squared
qi = degrees of freedom for an effect tested
For this experiment, we obtain

∑∑ nij Tij² = 24(16.3885) + 32(0.4014) + 48(2.0808) + 8(8.8803) + 12(0.1284) + 16(0.1046) + 4(2.4436) = 590.0784

n − (∑qi) − 1 = 96 − (3 + 2 + 1 + 6 + 3 + 2 + 6) − 1 = 72

σ̂e = √{ [ 2542 − 590.0784 − (380)²/96 ] / 72 } = 2.49
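This computation can be sketched in a few lines, weighting each sum of squared effects by the number of observations behind its means:

```python
import math

# Equation (15.1) for the steel-bar data: 96 observations in all.
weighted = (24 * 16.3885 + 32 * 0.4014 + 48 * 2.0808 + 8 * 8.8803
            + 12 * 0.1284 + 16 * 0.1046 + 4 * 2.4436)
df = 96 - (3 + 2 + 1 + 6 + 3 + 2 + 6) - 1            # 72
sigma_e = math.sqrt((2542 - weighted - 380**2 / 96) / df)
print(round(weighted, 4), df, round(sigma_e, 2))     # 590.0784 72 2.49
```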

Here we have 72 degrees of freedom versus 65 (using the range as an estimate


of error). This estimate of error is the same as we would have obtained if we
used the square root of the error mean square from an analysis of variance.
Step 5: The decision limits for the treatment effect differentials (using the standard deviation as the estimate of experimental error) are as follows for α = 0.05:

0 ± σ̂e hα √(q/n)
Machines M (k = 4, df = 72, q = 3):   hα = 2.19 √(4/3) = 2.53
0 ± 2.49(2.53) √(3/96) = 0 ± 1.11

Times T (k = 3, df = 72, q = 2):   hα = 1.96 √(3/2) = 2.40
0 ± 2.49(2.40) √(2/96) = 0 ± 0.86

Heats H (k = 2, df = 72, q = 1):   hα = 1.41 √(2/1) = 2.00
0 ± 2.49(2.00) √(1/96) = 0 ± 0.51

MT (k = 12, df = 72, q = 6):   hα = hα* = 2.96
0 ± 2.49(2.96) √(6/96) = 0 ± 1.84

MH (k = 8, df = 72, q = 3):   hα = hα* = 2.82
0 ± 2.49(2.82) √(3/96) = 0 ± 1.24

TH (k = 6, df = 72, q = 2):   hα = hα* = 2.71
0 ± 2.49(2.71) √(2/96) = 0 ± 0.97

MTH (k = 24, df = 72, q = 6):   hα = hα* = 3.20
0 ± 2.49(3.20) √(6/96) = 0 ± 1.99
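All seven limits follow the same formula, so they can be sketched in one pass:

```python
import math

# Decision limits 0 +/- sigma_e * h_alpha * sqrt(q/n) for each effect (alpha = 0.05).
sigma_e, n = 2.49, 96
effects = {           # effect: (h_alpha, q)
    "M": (2.53, 3), "T": (2.40, 2), "H": (2.00, 1),
    "MT": (2.96, 6), "MH": (2.82, 3), "TH": (2.71, 2), "MTH": (3.20, 6),
}
limits = {e: round(sigma_e * h * math.sqrt(q / n), 2) for e, (h, q) in effects.items()}
print(limits)
# {'M': 1.11, 'T': 0.86, 'H': 0.51, 'MT': 1.84, 'MH': 1.24, 'TH': 0.97, 'MTH': 1.99}
```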

Step 6: Plot the decision limits and the treatment effect differentials on the analysis of
means chart. The chart appears as Figure 15.17.
Machine and heat main effects are the only effects to show significance. The inter-
actions are not significant; however, we will examine the MH interaction to illustrate a
procedure developed by Ott13 to analyze a 2 × k interaction. Consider the interaction
diagram shown in Figure 15.18.

13. E. R. Ott, “Analysis of Means—A Graphical Procedure,” Industrial Quality Control 24, no. 2 (August 1967):
101–9.
[Figure 15.17 comprises five ANOM plots of the treatment effects: Plot 1, the main effects of time (decision limits ±0.86), machine (±1.11), and heat treatment (±0.51); Plot 2, the MTij interaction (±1.84); Plot 3, the MHik interaction (±1.24); Plot 4, the THjk interaction (±0.98); Plot 5, the MTHijk interaction (±1.99).]
Figure 15.17 Analysis of means for treatment effects—length of steel bars.



[Plot of the machine averages W̄i and L̄i (ng = 12) for machines A–D, with ∆i = W̄i − L̄i the heat-treatment difference at each machine.]

Figure 15.18 Interaction comparison of patterns W̄ and L̄.

[ANOM chart of the four differences ∆ (ng = 12) for machines A–D, plotted against 0.05-level decision limits.]

Figure 15.19 Interaction analysis, W̄ × L̄: ANOM.


If the average difference ∆ between the heats is plotted for the four machines, we obtain a plot as given in Figure 15.19.
Here the differences plotted are 1.83, 2.25, 1.75, and 2.33, respectively, as can be seen from the table used to calculate treatment effects for the MH interaction. The differences are treated as if they were main effects with a standard deviation

σ̂∆i = σ̂e √(1/ng + 1/ng) = √2 (σ̂e /√ng)

and the limits become

∆̄ ± Hα √2 (σ̂e /√ng)

where ng is the number of observations used to calculate each of the means constituting the difference. In this case ng = 12 and ∆̄ = 2.04, so we have

2.04 ± 2.20 √2 (2.49/√12)
2.04 ± 2.24
giving UDL = 4.28 and LDL = –0.20 as shown in Figure 15.19. Peter R. Nelson has
extended this procedure by providing tables to facilitate the calculation of exact limits
for this case, and also limits for the case in which both factors are at more than two and
up to five levels.14 The reader should refer to his paper for this extension.
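Ott's 2 × k interaction check above is easily sketched from the MH averages table:

```python
import math

# Heat difference at each machine, with limits dbar +/- H * sqrt(2) * sigma_e / sqrt(ng).
W = {"A": 4.33, "B": 7.00, "C": 1.75, "D": 6.83}
L = {"A": 2.50, "B": 4.75, "C": 0.00, "D": 4.50}
diffs = {m: round(W[m] - L[m], 2) for m in W}   # {'A': 1.83, 'B': 2.25, 'C': 1.75, 'D': 2.33}
dbar = sum(diffs.values()) / len(diffs)          # 2.04
half_width = 2.20 * math.sqrt(2) * 2.49 / math.sqrt(12)   # H(0.05) = 2.20, ng = 12
print(round(dbar + half_width, 2), round(dbar - half_width, 2))  # 4.28 -0.2
```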
It is interesting to compare Figure 15.19 with plot 3 for the MH interaction. The
former shows that the difference between the heats does not vary from machine to
machine. The latter shows estimates of the magnitude of the differences of the cell
means from the grand mean in the MH interaction table if there were no other effects,
by making use of the experiment model. Each provides insight into the data patterns
resulting from the experiment.

15.9 NESTED FACTORIAL EXPERIMENTS (OPTIONAL)


When an experiment includes both crossed and nested factors, it can be dealt with using
the same approach as with fully nested or fully crossed experiments, respectively. The
analysis is essentially as if the experiment were crossed; however, any interactions
between nested factors and those factors within which they are nested are eliminated
from the computations. Thus, if factor C is nested within the levels of factor B, while B
is crossed with factor A, the treatment effect calculations for A, B, and AB would be as
crossed, while those for factor C would be:

Ck(j) = (X̄jk − X̿) − Bj

ACik(j) = (X̄ijk − X̿) − Ai − Bj − Ck(j) − ABij

14. P. R. Nelson, “Testing for Interactions Using the Analysis of Means,” Technometrics 30, no. 1 (February 1988):
53–61.

Note that for this experiment there could be no BC or ABC interactions. Degrees of freedom for the effect may be calculated by substituting degrees of freedom for each of the terms in the treatment effect computation, with the term (X̄ − X̿) having degrees of freedom one less than the number of treatment effects for the effect being plotted.
Analysis of means for fully crossed or nested experiments is considerably simpli-
fied using the method presented. To apply analysis of means to more-complicated fac-
torial models, split-plots, or to incomplete block designs, see Schilling.15

15.10 MULTIFACTOR EXPERIMENTS WITH ATTRIBUTES DATA
ANOME for Proportion Data
The methods presented for multiple factors are applicable also to attributes data: percent,
proportion, or count. As discussed earlier, analysis of means for attributes data is usually
done through limits set using the normal approximation to the binomial or the Poisson
distribution. This implies that the sample size must be large enough for the approxima-
tion to apply. Sometimes transformations are useful, but experience has shown the
results to be much the same in most cases, with or without the use of such devices.
Treatment effects may be calculated using the estimated proportions in place of the treatment means, and the overall proportion or count provides an estimate of error. Thus for proportions, the standard deviation of a single observation is

σ̂e = √[ p̄(1 − p̄) ]

with analogous results for percent or count data. Naturally, the factors for the decision limits are found using df = ∞, as in a one-way or 2^p experiment.
Consider, for example, some data supplied by Richard D. Zwickl showing the pro-
portion defective on three semiconductor wire-bonding machines over three shifts for a
one-month period, given in Table 15.6.16

15. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 1, Analysis of Treatment Effects,” Journal
of Quality Technology 5, no. 3 (July 1973): 93–108.
16. R. D. Zwickl, “Applications of Analysis of Means,” Ellis R. Ott Conference on Quality Management and Applied
Statistics in Industry, New Brunswick, NJ (April 7, 1987).

Table 15.6 Proportion defective on bonders (ng = 1800).


Shift Bonder
Bonder 1 2 3 average
Number 600 0.028 0.042 0.017 0.029
Number 611 0.037 0.052 0.029 0.039
Number 613 0.023 0.045 0.015 0.028
Shift average 0.029 0.046 0.020 p– = 0.032

The treatment effects are calculated as


Bonder
B1: 0.029 – 0.032 = –0.003
B2: 0.039 – 0.032 = 0.007
B3: 0.028 – 0.032 = –0.004
Shift
S1: 0.029 – 0.032 = –0.003
S2: 0.046 – 0.032 = 0.014
S3: 0.020 – 0.032 = –0.012
Interaction
BS11: (0.028 – 0.032) – (–0.003) – (–0.003) = 0.002
BS12: (0.042 – 0.032) – (–0.003) – ( 0.014) = –0.001
BS13: (0.017 – 0.032) – (–0.003) – (–0.012) = 0.000
BS21: (0.037 – 0.032) – ( 0.007) – (–0.003) = 0.001
BS22: (0.052 – 0.032) – ( 0.007) – ( 0.014) = –0.001
BS23: (0.029 – 0.032) – ( 0.007) – (–0.012) = 0.002
BS31: (0.023 – 0.032) – (–0.004) – (–0.003) = –0.002
BS32: (0.045 – 0.032) – (–0.004) – ( 0.014) = 0.003
BS33: (0.015 – 0.032) – (–0.004) – (–0.012) = –0.001

Note the disparity in the interaction effects due to rounding.


The limits for α = 0.05, using n = abng = (3)(3)(1800) = 16,200, are

σ̂e = √[ 0.032(1 − 0.032) ] = 0.176

Main effects:

0 ± σ̂e [Hα √(k/(k − 1))] √(q/n)
0 ± 0.176(2.34) √(2/16,200)
0 ± 0.0046

Interaction:

0 ± σ̂e hα* √(q/n)
0 ± 0.176(2.766) √(4/16,200)
0 ± 0.0076
and the plot is as shown in Figure 15.20.

[ANOM chart of the bonder and shift main effects and the bonder × shift interaction effects, plotted against decision limits of ±0.0046 for main effects and ±0.0076 for the interaction.]
Figure 15.20 ANOM of bonder data.
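The limit computation above can be sketched as follows:

```python
import math

# ANOME limits for the bonder proportion data: p_bar = 0.032, n = 3*3*1800.
p_bar, n = 0.032, 3 * 3 * 1800
sigma_e = math.sqrt(p_bar * (1 - p_bar))          # 0.176
main = sigma_e * 2.34 * math.sqrt(2 / n)          # 2.34 = H * sqrt(k/(k-1)) for k = 3, df = inf
inter = sigma_e * 2.766 * math.sqrt(4 / n)        # h* = 2.766 for the interaction, df = inf
print(round(main, 4), round(inter, 4))            # 0.0046 0.0076
```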



Clearly, the significance of main effects is indicated. It should be noted that this
approach to the analysis of proportions is approximate but, since so much industrial data
is of this type, it provides an extension of the control chart approach to the analysis of
such data as a vehicle of communication and understanding.
The calculation of the interaction limits for the analysis of means for unequal sample
sizes when working with proportions data was not given in Chapter 11. This is due to
the fact that the ANOME approach is more appropriate for the analysis of multifactor
experiments. The treatment effects are calculated in the same manner as shown for the
above example. However, in this example, each proportion was based on a common
sample size, ng = 1800. When the proportions are based on unequal sample sizes, the
standard error for each level of each factor will differ based on the sample size. In the case
of a two-way layout, the ANOME limits for main effects are

0 ± hα σ̂p(i·) √(q/n)  for the ith level of the first factor (q = a – 1)

0 ± hα σ̂p(·j) √(q/n)  for the jth level of the second factor (q = b – 1)

and for the two-factor interaction

0 ± hα* σ̂p(ij) √(q/n)

where
n = total number of observations
q = (a – 1)(b – 1) = number of degrees of freedom for the interaction effect
σ̂p(ij) = estimate of standard error for interaction effect with nij observations
(see formula in Chapter 11)
In the case of a three-way layout, the ANOME limits for main effects are

0 ± hα σ̂p(i··) √(q/n)  for the ith level of the first factor, and q = a – 1

0 ± hα σ̂p(·j·) √(q/n)  for the jth level of the second factor, and q = b – 1

0 ± hα σ̂p(··k) √(q/n)  for the kth level of the third factor, and q = c – 1

and for two-way interactions

0 ± hα* σ̂p(ij·) √(q/n)  with q = (a – 1)(b – 1)

0 ± hα* σ̂p(i·k) √(q/n)  with q = (a – 1)(c – 1)

0 ± hα* σ̂p(·jk) √(q/n)  with q = (b – 1)(c – 1)

and for the three-way interaction

0 ± hα* σ̂p(ijk) √(q/n)  with q = (a – 1)(b – 1)(c – 1)

where
n = total number of observations
σ̂p(ij·) = estimate of standard error for the ij·th interaction effect with nij·
observations (see formula in Chapter 11), where

nij· = Σ(k=1 to c) nijk

σ̂p(i·k) = estimate of standard error for the i·kth interaction effect with ni·k
observations (see formula in Chapter 11), where

ni·k = Σ(j=1 to b) nijk

σ̂p(·jk) = estimate of standard error for the ·jkth interaction effect with n·jk
observations (see formula in Chapter 11), where

n·jk = Σ(i=1 to a) nijk

σ̂p(ijk) = estimate of standard error for interaction effect with nijk observations
(see formula in Chapter 11)
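The marginal sample sizes above are simple sums over the omitted subscript. A short sketch, using a purely illustrative 2 × 2 × 2 array of cell sample sizes:

```python
# Marginal sample sizes for the three-way ANOME limits.
# n_ijk[i][j][k] holds the cell sample sizes (values invented for illustration).
n_ijk = [[[30, 28], [32, 30]],
         [[29, 31], [30, 30]]]
a, b, c = 2, 2, 2

# n_ij. : sum over the third factor; n_i.k : over the second; n_.jk : over the first.
n_ij = [[sum(n_ijk[i][j][k] for k in range(c)) for j in range(b)] for i in range(a)]
n_ik = [[sum(n_ijk[i][j][k] for j in range(b)) for k in range(c)] for i in range(a)]
n_jk = [[sum(n_ijk[i][j][k] for i in range(a)) for k in range(c)] for j in range(b)]
n_total = sum(sum(sum(col) for col in plane) for plane in n_ijk)
```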

ANOME for Count Data


An example of the use of analysis of means on count data in a multifactor experiment
is also provided by Richard Zwickl.17 An experiment was performed to find the best
rinse conditions to minimize particulates on semiconductor wafers. The number of
particles greater than 0.5 μm was counted using a unit size of 10 wafers for various rinse
times and temperatures. The results are shown in Table 15.7 using the Poisson distribu-
tion for this count data.
The treatment effects become
Time (M)
M1 = 171.5 – 103.7 = 67.8
M2 = 104.0 – 103.7 = 0.3
M3 = 35.5 – 103.7 = –68.2
Temperature (D)
D1 = 121.3 – 103.7 = 17.6
D2 = 86.0 – 103.7 = –17.7
Interaction (MD)
MD11 = (205 – 103.7) – (67.8) – (17.6) = 15.9
MD12 = (138 – 103.7) – (67.8) – (–17.7) = –15.8
MD21 = (111 – 103.7) – (0.3) – (17.6) = –10.6
MD22 = (97 – 103.7) – (0.3) – (–17.7) = 10.7
MD31 = (48 – 103.7) – (–68.2) – (17.6) = –5.1
MD32 = (23 – 103.7) – (–68.2) – (–17.7) = 5.2
The limits for α = 0.05 are as follows:

σ̂e = √103.7 = 10.18

Table 15.7 Particle count on wafers.

                         Elapsed Rinse Time in Minutes, M
Temperature              2 min.          5 min.         8 min.       Temperature average
25°C                       205             111             48        364/3 = 121.3
85°C                       138              97             23        258/3 = 86.0
Time average       343/2 = 171.5   208/2 = 104.0    71/2 = 35.5      μ̂ = 622/6 = 103.7

17. Ibid.

Time

0 ± 10.18(2.34)√(2/6) = 0 ± 13.75

Temperature

0 ± 10.18(1.96)√(1/6) = 0 ± 8.15

Interaction

0 ± 10.18(2.631)√(2/6) = 0 ± 15.46

and the plot is shown in Figure 15.21.


The main effects of time and temperature are clearly significant. Note the down-
ward trend with increasing levels of both. Interaction is also barely significant at the
five percent level. An interaction diagram is shown in Figure 15.22.
Again, the analysis of means of count data such as this involves the approach to nor-
mality of the Poisson distribution (the mean of each cell should be greater than five) and

[Figure 15.21 plots the time, temperature, and interaction effects against decision limits of ±13.75, ±8.15, and ±15.46, respectively.]

Figure 15.21 ANOM of particulates.
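A parallel Python sketch for the count data, assuming the Table 15.7 counts and the critical factors 2.34, 1.96, and 2.631 quoted above (variable names are illustrative):

```python
# ANOM for Poisson counts: time (3 levels) by temperature (2 levels).
import math

counts = [[205, 111, 48],   # 25 deg C at 2, 5, 8 minutes
          [138,  97, 23]]   # 85 deg C
d, m = 2, 3                 # temperature and time levels
cells = d * m

grand = sum(sum(row) for row in counts) / cells       # 622/6

time_eff = [sum(counts[i][j] for i in range(d)) / d - grand
            for j in range(m)]
temp_eff = [sum(row) / m - grand for row in counts]

sigma_e = math.sqrt(grand)                 # Poisson: variance equals the mean
lim_time = sigma_e * 2.34 * math.sqrt(2 / cells)   # q = m - 1 = 2
lim_temp = sigma_e * 1.96 * math.sqrt(1 / cells)   # q = d - 1 = 1
lim_int = sigma_e * 2.631 * math.sqrt(2 / cells)   # q = (m-1)(d-1) = 2
```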



[Figure 15.22 plots particle count against rinse time (2, 5, and 8 minutes) as separate curves for the 25° and 85° temperatures.]

Figure 15.22 Interaction of particulates.

is, of course, approximate. Experience has shown, however, that like the c chart, it is
indeed adequate for most industrial applications. More detail will be found in the papers
by Lewis and Ott18 and by Schilling.19

15.11 ANALYSIS OF MEANS WHEN THE SAMPLE SIZES ARE UNEQUAL

Introduction
Ideally, when studies are planned, the experimenter hopes to obtain and measure all of
the samples that have been requested. Unfortunately, getting all of the samples often
tends to be “more the exception than the rule.” A glass sample may break or be lost
before it is measured; a part may be misplaced due to poor markings on it, and so on.
Consequently, for the analysis of means to be used when the samples for each level of
a factor vary, we must modify our approach.

Calculation of the Limits for Single or Main Effects


The discussion of performing the analysis of means in the situation of unequal sample
sizes for main effects has been presented by L. S. Nelson.20 The ANOM limits are based
on the Sidak factors given in Table A.19:

18. S. Lewis and E. R. Ott, “Analysis of Means Applied to Percent Defective Data,” Techology Report No. 2, Rutgers
University Statistics Center (February 10, 1960).
19. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 3, Analysis of Non-Normal Data,”
Journal of Quality Technology 5, no. 4 (October 1973): 156–59.
20. L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1
(January 1983): 40–44.

X̄ ± s hα*;k,ν √[(n − ni)/(n ni)]

where
n = total number of observations
ni = number of observations in the ith mean
s = √MSERROR from an ANOVA, Equation (15.1), or R̄/d2*

It should be noted that, in the case of unequal sample sizes, the critical factors used
in the computation of the ANOM limits are no longer considered exact. From a practi-
cal viewpoint, these limits still produce useful results. If the experimenter prefers a more
precise definition of the ANOM limits, P. R. Nelson21 recommends the use of the stu-
dentized maximum modulus (SMM) critical values to produce a more exact set of limits
since those shown above are uniformly conservative (wider) in the unequal sample size
situation. Fortunately, the differences between the SMM values and those in Table A.19
are relatively small except when the degrees of freedom for the effect(s) involved are
low (< 5).
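As a sketch of how these per-level limits behave, the fragment below applies the X̄ ± s·h*·√((n − ni)/(n·ni)) formula to a small hypothetical data set. The measurements, the s value, and the h_star factor are all invented for illustration; a real analysis would take h* from Table A.19 and s from the ANOVA of the data.

```python
# Unequal-sample-size ANOM limits, one pair of decision lines per level.
from statistics import mean
import math

groups = {"A": [10.2, 9.8, 10.5, 10.1],          # hypothetical measurements
          "B": [9.6, 9.9, 10.0],                 # a lost sample leaves only 3
          "C": [10.4, 10.6, 10.3, 10.8, 10.5]}

n = sum(len(v) for v in groups.values())
grand = sum(sum(v) for v in groups.values()) / n
s = 0.25        # assumed: sqrt(MS_error) from an ANOVA of these data
h_star = 2.5    # assumed: Sidak factor for alpha, k = 3 means, nu = n - k

# Smaller n_i gives wider limits for that level.
limits = {g: (grand - s * h_star * math.sqrt((n - len(v)) / (n * len(v))),
              grand + s * h_star * math.sqrt((n - len(v)) / (n * len(v))))
          for g, v in groups.items()}
flagged = [g for g, v in groups.items()
           if not limits[g][0] <= mean(v) <= limits[g][1]]
```

Each group mean is compared with its own pair of decision lines; here groups B and C fall outside theirs.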
The ANOM Excel add-in program (ANOM48.xla) that is included on the CD-ROM
developed for this text can be used to analyze data involving unequal sample sizes
among the factors. The add-in program is demonstrated in Chapter 17 for equal and
unequal sample size scenarios.

Calculation of the Limits for Interactions


The ANOM Excel add-in program will use the Sidak factors for the case of interactions,
as well as for the situation when the sample sizes between factor levels are unequal. As
stated above, these decision limits will be conservative and only bear out interaction
components that are truly statistically significant. For a less conservative approach, the
reader is referred to the paper by Nelson22 which can be found in the “\Analysis of
Means Library” directory on the CD-ROM that comes with this text.

15.12 COMPARING VARIABILITIES


Introduction
The steel-rod lengths from the four machines, three times, and two heat treatments were
being studied because of excessive variability in the finished rods. The comparison of

21. P. R. Nelson, “Multiple Comparisons of Means Using Simultaneous Confidence Intervals,” Journal of Quality
Technology 21, no. 4 (October 1989): 232–41.
22. Ibid.

average lengths (Figure 15.16) shows two major special causes for variability: differences
between machines and between heat treatments.
Now let us look at the inherent variability, or common causes. Some machine(s) may
be innately more variable than others, independent of their average settings. We can
compare variabilities from exactly two processes by using either a range–square–ratio
test (FR) or an F test.23
We can apply the method here to compare variabilities from the two heats W and L.
There are 12 subgroup ranges in W and another 12 in L; in each subgroup, r = 4. Their
averages are

R̄W = 64/12 = 5.33        R̄L = 63/12 = 5.25

These two values are surprisingly close; no further statistical comparison is
necessary. A procedure, if needed, would be to compute (R̄W/d2*) and (R̄L/d2*) and form
their ratio

FR = (5.33/5.25)² with df ≅ F(32,32)

and compare with values in Table A.12.
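The range–square–ratio arithmetic is trivial to script. This sketch assumes the subgroup counts from the text (12 ranges of r = 4 per heat); since both heats use the same subgroup size, the d2* factors cancel out of the ratio.

```python
# Range-square-ratio test for the two heats, W and L.
R_W, R_L = 64 / 12, 63 / 12                  # average ranges
F_R = (max(R_W, R_L) / min(R_W, R_L)) ** 2   # larger estimate on top
k, r = 12, 4                                 # ranges per heat, subgroup size
df = round(0.9 * k * (r - 1))                # approximate df for each estimate
# F_R is then compared with the critical value of F(df, df) in Table A.12.
```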


The range–square–ratio test in this form is applicable only to two levels of a factor.
The following procedure is applicable to the four machines and the three times.

Analysis of Means to Analyze Variability

1. Internal or Within-Machine Variability. Figure 15.23 is a rearrangement of the R
chart, Figure 15.14; it allows a ready, visual comparison of machine variabilities. A casual
study of Figure 15.23 suggests the possibility that machine A may be most variable and
machines C or D the least; but the evidence is not very persuasive. An objective com-
parison of their variabilities follows. Table 15.8 shows average machine ranges associ-
ated with the design. The average machine ranges have been plotted in Figure 15.24.
The computation of decision lines requires a measure σ̂R of the expected variation of the
range R. Although ranges of individual subgroups are not normally distributed, average
ranges of four (or more) subgroups are essentially normal (Chapter 1, Theorem 3). The
standard deviation of ranges can be estimated as follows:

From Table A.4, the upper 3-sigma limit on R is D4R̄, where D4 has been computed
to give an upper control limit at R̄ + 3σ̂R:

D4R̄ = R̄ + 3σ̂R

23. See Chapter 4. Also, for a more extensive discussion of the use of analysis of means in comparing variabilities,
see N. R. Ullman, “The Analysis of Means (ANOM) for Signal and Noise,” Journal of Quality Technology 21, no.
2 (April 1989): 111–27.

[Figure 15.23 plots the subgroup ranges (r = 4) by machine, with machine averages R̄A = 6.33, R̄B = 5.50, R̄C = 4.50, R̄D = 4.83, overall R̄ = 5.29, and upper limit D4R̄ = 12.1.]

Figure 15.23 Subgroup ranges (r = 4) arranged by machines. (Data from Table 15.8.)

Table 15.8 Subgroup ranges.

Data from Table 15.4; ng = r = 4.

                          Heat Treatment
                  W                       L
           Machines                Machines
Time    A    B    C    D        A    B    C    D
1       8    4    4    4        6    3    2    1     R̄1 = 32/8 = 4.00
2       7    4    3    5        5    5    3    6     R̄2 = 38/8 = 4.75
3       5    7    7    6        7   10    8    7     R̄3 = 57/8 = 7.12

R̄A = 38/6 = 6.33
R̄B = 33/6 = 5.50      R̄W = 5.33      R̄ = 5.29
R̄C = 27/6 = 4.50      R̄L = 5.25
R̄D = 29/6 = 4.83

[Figure 15.24 plots the four machine average ranges against R̄ = 5.29 with UDL(.05) = 7.31 and LDL(.05) = 3.27.]

Figure 15.24 Comparing average machine variabilities. (Data from Table 15.8; each point is an
average of r = 6 ranges.)

Then

σ̂R = R̄(D4 − 1)/3 = (d3/d2)R̄ = dR R̄

Values of the factor dR are given in Table 15.9 to simplify computation. When com-
paring any averages of these ranges,

σ̂R = dR R̄ = (0.43)(5.29) = 2.27

with degrees of freedom

df ≅ (0.9) k (n – 1) = (0.9)24(3) = 65.

When comparing machine average ranges (nR̄ = 6) of Figure 15.24

σ̂R̄ = 2.27/√6 = 0.92

Decision lines to compare averages of machine ranges are determined with df = 65,
k = 4, H0.05 = 2.20:

UDL(0.05) = R̄ + H0.05 σ̂R̄ = 5.29 + (2.20)(0.92) = 7.31
LDL(0.05) = R̄ − H0.05 σ̂R̄ = 5.29 − (2.20)(0.92) = 3.27


Table 15.9 Values of dR where σ̂R = dR R̄ and
dR = (D4 – 1)/3 = d3/d2.
r    dR      D4
2    0.76    3.27
3    0.52    2.57
4    0.43    2.28
5    0.37    2.11
6    0.33    2.00
7    0.31    1.92
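The decision-line computation for the machine averages can be scripted directly. This sketch carries over the values from the text (dR = 0.43 for r = 4, H0.05 = 2.20 for k = 4 at df ≅ 65):

```python
# ANOM decision lines for the four machine average ranges (Table 15.8).
import math

R_bar = 5.29                          # overall average range
sigma_R = 0.43 * R_bar                # d_R * R-bar, std. dev. of single ranges
sigma_Rbar = sigma_R / math.sqrt(6)   # each machine average is of 6 ranges
H = 2.20                              # H_0.05 for k = 4 means, df ~ 65
udl = R_bar + H * sigma_Rbar
ldl = R_bar - H * sigma_Rbar

machine_avgs = {"A": 38 / 6, "B": 33 / 6, "C": 27 / 6, "D": 29 / 6}
outside = [mach for mach, avg in machine_avgs.items()
           if not ldl <= avg <= udl]   # empty list: no machine stands out
```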

[Figure 15.25 plots the subgroup ranges (r = 4) by time period, with R̄1 = 4.0, R̄2 = 4.75, R̄3 = 7.125, overall R̄ = 5.29, and upper limit D4R̄ = 12.1.]

Figure 15.25 Subgroup ranges (r = 4) arranged by time periods. (Data from Table 15.8.)

All four points fall within the decision lines (α = 0.05), and there does not appear
to be a difference in the variabilities of the four machines.
2. Variability at Different Times. The range data, Table 15.8, has been rearranged by
time in Figure 15.25.
Analysis 1. Data for the third period, T3, appears to be significantly large. A compari-
son24 of T3 with a pooling of groups T1 and T2 shows an (a + b) count of (1 + 9) = 10,
which shows significance, α ≅ 0.01, by the Tukey–Duckworth test.
Analysis 2. Analysis of means (Figure 15.26):

R̄ = 5.29; k = 3, df ≅ 65; ng = 8

σ̂R = dR R̄ = 2.27 (each R is of r = 4)
σ̂R̄ = σ̂R/√ng = 2.27/√8 = 0.80

α = 0.05:

UDL0.05 = 5.29 + (1.96)(0.80) = 6.86
LDL0.05 = 5.29 − (1.96)(0.80) = 3.72

24. See Section 13.2.



[Figure 15.26 plots the three time averages against R̄ = 5.29 with decision limits UDL = 6.86, LDL = 3.72 (α = .05) and UDL = 7.27, LDL = 3.31 (α = .01).]

Figure 15.26 Comparing average time variabilities. (Data from Table 15.8; each point is an
average of r = 8 ranges.)

Table 15.10 A two-way table (machine by time) ignoring heat treatment.

Data from Table 15.8; each entry below is the average of two ranges.

        T1     T2     T3
A      7.0    6.0    6.0
B      3.5    4.5    8.5
C      3.0    3.0    7.5
D      2.5    5.5    6.5

α = 0.01:

UDL0.01 = 5.29 + (2.47)(0.80) = 7.27
LDL0.01 = 5.29 − (2.47)(0.80) = 3.31

Interpretation of Results. There is supporting evidence of a time effect on variability,
risk α ≅ 0.05, with a definite suggestion that the process became progressively more variable. The
average at time T1 is close to the lower (0.05) limit and at T3 is outside the (0.05) and
close to the (0.01) limit. Then we can consider the behavior of the different individual
machines with respect to time (Table 15.10). A plot of the data is shown in Figure 15.27.
Surprisingly, this indicates that machine A appears to be affected altogether differently

[Figure 15.27 plots the Table 15.10 entries (n = r = 2) by time for each machine; machines B, C, and D rise steadily from T1 to T3, while machine A stays roughly level.]

Figure 15.27 Graph of machine × time interaction. (Data from Table 15.10.)

than the other three machines. This may be a consequence of operator fatigue or of
machine maintenance, but it requires technical attention.
The biggest factor in variability is machine average—proper adjustments on indi-
vidual machines should quickly minimize this. Secondly, the difference in heat treat-
ment averages may possibly warrant adjustments for each new batch of rods, at least
until the heat treatment process is stabilized. Probably third in importance is to estab-
lish reasons for the effect of time on within-machine variation; in fact, this may be of
more importance than heat treatment.

15.13 NONRANDOM UNIFORMITY


Suppose we were to measure n consecutive steel bars all made on machine A from the
same treatment. We would expect variation, but not too much and not too little. If the n
measurements were made and recorded in the order of manufacture, we could count the
number of runs above and below the median and compare them with the expected num-
ber (Table A.2). Note that there is a minimum number of runs expected (risk a) just as
there is a maximum. Either too few or too many runs is evidence of an assignable cause
in the process.

A variables control chart (X̄ and R) can signal an assignable cause by too little variation;
we call this nonrandom uniformity.

Many articles have been written about evidence indicating the presence of assign-
able causes of nonrandomness and some about the identification of the indicated
assignable causes.25,26 These discussions have usually been concerned with the concept
of nonrandom “excessive” variability. The literature has not emphasized that it is some-
times of scientific importance to discuss statistical evidence of nonrandom uniformity
and to identify types of process behavior that may produce such patterns. Sources of data
displaying nonrandom uniformity include differences in precision between analytical
laboratories and sampling from a bimodal population or other sources of nonrational
subgroups that produce exaggerated estimates of σ.

Nonrandom Uniformity—Standard Given


As in Section 15.2, consider k samples of ng each from a process in statistical control
with average μ and standard deviation σ. If all k sample means lie within narrow
decision lines drawn at

μ ± zα σ̂X̄

this shall be considered evidence (with risk α) of nonrandom uniformity. Let Pr be
the probability that a single point falls by chance between these lines. What must be the
value of zα in order that the probability of all k points falling within such a narrow band
shall be only Prᵏ = α?
Values of zα are obtained from Prᵏ = α in the same manner as Zα was obtained in
Section 11.3. When k = 3, this becomes

Pr³ = 0.05
and
Pr = 0.368

Then the corresponding z0.05 = 0.48 is found from a table of areas under the normal
curve (Table A.1). Other selected values of zα have been computed and are shown in
Table 15.11.
For example, if in the casino example for standards given in Section 15.2, it was
desired to check for nonrandom uniformity, the limits based on α = 0.05 would be

k = 8, μ = 7, σ = 2.42, ng = 25

25. P. S. Olmstead, “How to Detect the Type of an Assignable Cause: Part 1, Clues for Particular Types of Trouble,”
Industrial Quality Control 9, no. 3 (November 1952): 32.
26. P. S. Olmstead, “How to Detect the Type of an Assignable Cause: Part 2, Procedure When Probable Cause is
Unknown,” Industrial Quality Control 9, no. 4 (January 1953): 22.

μ ± zα σ/√ng

7 ± 1.01(2.42/√25)
7 ± 0.49
LDL0.05 = 6.51        UDL0.05 = 7.49

and we have the plot in Figure 15.28.


Since all the points are not contained within the limits (in fact half the points are
outside), there is no evidence to impugn the honesty of the casino on this basis.

Table 15.11 Factors to judge presence of nonrandom uniformity, standard given.
k z.05(k) z.01(k)
2 .28 .13
3 .48 .27
4 .63 .41
5 .75 .52
6 .85 .62
7 .94 .70
8 1.01 .78
9 1.07 .84
10 1.13 .90
15 1.34 1.12
20 1.48 1.27
24 1.57 1.36
30 1.67 1.47
50 1.89 1.71
120 2.25 2.08
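The entries of Table 15.11 follow directly from Prᵏ = α: a single point falls inside the band with probability Pr = α^(1/k), and zα is the corresponding two-sided normal deviate. A short sketch using only the Python standard library:

```python
# Reproducing the z_alpha(k) factors of Table 15.11.
from statistics import NormalDist

def z_alpha(k: int, alpha: float) -> float:
    """Factor z such that all k independent points fall inside
    mu +/- z * sigma_xbar with probability exactly alpha."""
    pr = alpha ** (1 / k)                      # per-point inside probability
    return NormalDist().inv_cdf((1 + pr) / 2)  # half-width of two-sided band
```

For example, z_alpha(3, 0.05) reproduces the z0.05 = 0.48 derived above, and z_alpha(8, 0.05) the 1.01 used in the casino example.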

[Figure 15.28 plots the eight casino-table means (7.23, 6.46, 7.01, 6.38, 6.68, 7.35, 8.12, 7.99) against μ = 7 with UDL = 7.49 and LDL = 6.51.]

Figure 15.28 Nonrandom uniformity chart for eight casino tables.



Nonrandom Uniformity—No Standard Given


Some very interesting techniques of analysis are possible in this category. The critical
values of Nα have been computed for the case of no standard given,27 and selected val-
ues are given in Table A.16. It happens rather frequently that points on a control chart
all lie very near the process average, and the erroneous conclusion is frequently made
that the process is “in excellent control.” The technique of this section provides an
objective test of nonrandom uniformity. The computation of these entries in Table A.16
is much more complicated than for those in Table 15.11; the method is not given here.
Decision lines to use in deciding whether our data indicate nonrandom uniformity are
drawn at

X̿ ± Nα σ̂X̄

and the application proceeds as with zα above for the standard-given method.

15.14 CALCULATION OF ANOM LIMITS FOR 2ᵖ EXPERIMENTS
It should be noted that the method of Chapter 14 should be used when analyzing 2ᵖ
experiments or fractions thereof. That is because it provides an exact test for main
effects and interactions. In effect it gives a simple graphical representation of the series
of exact t-tests performed in the analysis of 2ᵖ experiments and their fractions.
When analyzing a 2ᵖ design, the main effects in the ANOME analysis presented
in this chapter will be subjected to an exact test, but the test of interactions will be
conservative in providing wider limits than the method of Chapter 14. This is because the
Sidak factor provides a conservative test and anticipates a wide variety of correlation
patterns among the treatment effects plotted, whereas the correlation is known when only
two levels are involved. ANOME is suitable for all other cases.
In presenting a simple one-way ANOM for main effects it is recommended that the
centerline be in terms of X̿ and the limits in terms of X̄. Interaction treatment effects,
however, along with main-effect treatment effects, require a zero mean and should be
shown as such. The Excel add-in on the CD-ROM allows for both centerline options in
the one-way case, but ANOME is the preferred (and only) method allowed for the
analysis of two or more factors involving interactions.

27. K. R. Nair, “The Distribution of the Extreme Deviate from the Sample Mean and Its Studentized Form,”
Biometrics 35 (1948): 118–44.

15.15 DEVELOPMENT OF ANALYSIS OF MEANS


The analysis of means was originally developed by Dr. E. R. Ott and first reported in
1958.28 Subsequently, Sidney S. Lewis and Ellis R. Ott extended the analysis of means
procedure to binomially distributed data when the normal approximation to the binomial
distribution applies. Their results were reported in 1960.29 In 1967, Dr. Ott published his
Brumbaugh award-winning paper, “Analysis of Means—A Graphical Procedure” in
Industrial Quality Control.30 Significantly, it was the Shewhart Memorial issue.
The basic Ott procedure is intended for use with the means resulting from main
effects in analysis of variance and in similar applications. Schilling extended the analy-
sis of means to the analysis of interactions and to a variety of experiment designs,31 such
as crossed, nested, split-plot, and incomplete block, by providing a systematic method
for the analysis of means derived from the experiment model. This procedure used a
modified factor ha for computation of the limits where

hα = Hα k / ( k − 1)

from the Bonferroni inequality. Based on Ott’s original analysis of 2p experiments,


Schilling extended the procedure to the analysis of contrasts in various forms.32 He also
provided a procedure for use with nonnormal distributions such as attributes data or the
Weibull distribution.33
L. S. Nelson computed an extensive table of hα factors using the Bonferroni
inequality34 and later produced tables of exact hα factors based on the theoretical development
of P. R. Nelson.35 P. R. Nelson has also provided tables of sample size for analysis of
means36 as well as power curves for the procedure.37 The exact values of L. S. Nelson
were modified by D. C. Smialek to provide exact Ha factors equivalent to those used by

28. E. R. Ott, “Analysis of Means,” Technology Report No. 1, Rutgers University Statistics Center (August 10, 1958).
29. S. S. Lewis and E. R. Ott, “Analysis of Means Applied to Percent Defective Data,” Technology Report No. 2,
Rutgers University Statistics Center (February 10, 1960).
30. E. R. Ott, “Analysis of Means—A Graphical Procedure,” Industrial Quality Control 24, no. 2 (August 1967):
101–9.
31. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 1, Analysis of Treatment Effects,” Journal
of Quality Technology 5, no. 3 (July 1973): 93–108.
32. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 2, Analysis of Contrasts,” Journal of
Quality Technology 5, no. 4 (October 1973): 147–55.
33. E. G. Schilling, “A Systematic Approach to the Analysis of Means: Part 3, Analysis of Non-Normal Data,”
Journal of Quality Technology 5, no. 4 (October 1973): 156–59.
34. L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1
(January 1983): 40–44.
35. P. R. Nelson, “Multivariate Normal and t Distributions with ρjk = αjαk,” Communications in Statistics, Part B—
Simulation and Computation, II (1982): 239–48.
36. P. R. Nelson, “A Comparison of Sample Size for the Analysis of Means and the Analysis of Variance,” Journal of
Quality Technology 15, no. 1 (January 1983): 33–39.
37. P. R. Nelson, “Power Curves for the Analysis of Means,” Technometrics 27, no. 1 (February 1985): 65–73.

Ott in the original procedure.38 Professor Smialek produced the table of h* factors using
the Sidak approximation that appears here. Schilling showed how analysis of means can
be used to analyze Youden squares.39 P. R. Nelson examined analysis of means for inter-
actions when at least one factor is at two levels, and provided critical values for other
special cases.40
A computer program for analysis of factorial experiments simultaneously by analy-
sis of means and analysis of variance was developed by Schilling, Schlotzer, Schultz,
and Sheesley41 and subsequently modified by P. R. Nelson to include exact values.42
Sheesley has provided a computer program to do control charts and single-factor exper-
iments for measurements or attributes data using the Bonferroni values,43 which allow
for lack of independence among the points plotted.
Sheesley has also provided tables of simplified factors for analysis of means,44 sim-
ilar to control chart factors for use with the range. One of Ott’s last papers on the topic
was an insightful analysis of multiple-head machines coauthored with Dr. R. D. Snee.45
Neil Ullman has expanded the area of application by providing factors for analysis of
means on ranges suitable for use in analysis of the Taguchi signal-to-noise ratio.46
Nelson has explored the state of the art of multiple comparisons using simultane-
ous confidence intervals and recommended the use of the studentized maximum mod-
ulus (SMM) instead of the Sidak values in Table A.19 for unequal sample sizes.47 Many
other approaches were discussed in this paper as well. In a later paper, Nelson discusses
the application of ANOM to balanced incomplete block (BIB) designs, Youden squares,
and axial mixture designs.48

38. E. G. Schilling and D. C. Smialek, “Simplified Analysis of Means for Crossed and Nested Experiments,” 43d
Annals of the Quality Control Conference, Rochester Section ASQC, Rochester, NY (March 10, 1987).
39. E. G. Schilling, “Youden Address—1986: Communication with Statistics,” Chemical and Process Industries
Newsletter 4, no. 2 (December 1986): 1–5.
40. P. R. Nelson, “Testing for Interactions Using the Analysis of Means,” Technometrics 30, no. 1 (February 1988):
53–61.
41. E. G. Schilling, G. Schlotzer, H. E. Schultz, and J. H. Sheesley, “A FORTRAN Computer Program for Analysis of
Variance and Analysis of Means,” Journal of Quality Technology 12, no. 2 (April 1980): 106–13.
42. P. R. Nelson, “The Analysis of Means for Balanced Experimental Designs,” Journal of Quality Technology 15,
no. 1 (January 1983): 45–56; Corrigenda 15, no. 4 (October 1983): 208.
43. J. H. Sheesley, “Comparison of k Samples Involving Variables or Attributes Data Using the Analysis of Means,”
Journal of Quality Technology 12, no. 1 (January 1980): 47–52.
44. J. H. Sheesley, “Simplified Factors for Analysis of Means When the Standard Deviation is Estimated from the
Range,” Journal of Quality Technology 13, no. 3 (July 1981): 184–85.
45. Ellis R. Ott and R. D. Snee, “Identifying Useful Differences in a Multiple-Head Machine,” Journal of Quality
Technology 5, no. 2 (April 1973): 47–57.
46. Neil R. Ullman, “Analysis of Means (ANOM) for Signal to Noise,” Journal of Quality Technology 21, no. 2
(April 1989): 111–27.
47. P. R. Nelson, “Multiple Comparisons of Means Using Simultaneous Confidence Intervals,” Journal of Quality
Technology 21, no. 4 (October 1989): 232–41.
48. P. R. Nelson, “Additional Uses for the Analysis of Means and Extended Tables of Critical Values,” Technometrics
35, no. 1 (February 1993): 61–71.

Nelson and Wludyka developed an ANOM-type test for normal variances that pro-
vides a graphical display showing which are statistically (and practically) different from
the others.49 Tables of critical values for their ANOMV test are presented. The performance
of ANOMV was found to be better than more-established tests for one-way layouts,
as well as for more complex designs.
Analysis of means provides a vehicle for the simultaneous display of both statisti-
cal and engineering significance. The procedure brings to bear the intuitive appeal and
serendipity of the control chart to the analysis of designed experiments. It is appro-
priate that the American Society for Quality’s Shewhart Medal should bear an inscrip-
tion of a control chart, and equally fitting that the Shewhart Medal should have been
awarded to Ellis R. Ott in 1960, the year “Analysis of Means for Percent Defective
Data” was published.

15.16 PRACTICE EXERCISES


1. Recompute the decision lines of Exercise 8 in Chapter 14, assuming that
the eight subgroups are from eight levels of a single factor.
2. Analyze the following data on an experiment comparable in nature to
that presented in Case Histories 15.1 and 15.2. Note: this problem has a
hidden “catch.”

Machine Setting 1 2 3
Response Values 876 1050 850
933 895 748
664 777 862
938 929 675
938 1005 837
676 912 921
614 542 681
712 963 797
721 937 752
812 896 646

3. Calculate z0.05(8) = 1.01 and z0.01(8) = 0.78 as in Table 15.11.


4. Is there evidence of nonrandom uniformity in Case History 15.3 at α = 0.05
for any of the machines, days, or heat treatments?
5. Delete machine D and rework the problem of Case History 15.3.
6. Assume that a possible assignable cause exists for the four vials appearing to
be high in the data of Table 13.4 as analyzed in the plot of Figure 13.5.
Reanalyze using analysis of means for three levels.

49. P. R. Nelson and P. S. Wludyka, “An Analysis of Means-Type Test for Variances from Normal Populations,”
Technometrics 39, no. 3 (August 1997): 274–85.

7. The following data came out of an experiment to determine the effects


of package plating on bond strength. Two levels of nickel thickness and
two levels of current density were used. Data is supplied by Richard D.
Zwickl.50 Analyze this experiment using analysis of means with α = 0.05.
Note that X̄ = 9.8865 and R̄ = 1.2525. Perform the analysis using
(a) treatment effects and (b) Yates method as a 2² factorial.

Breaking strength of wire bonds

                                 Nickel thickness
Current density      Thin                Thick               Average
2 amp/ft²            8.68                10.82
                     9.25  X̄ = 9.272     10.50  X̄ = 10.316
                     9.41                10.36                9.794
                     9.77                 9.26
                     9.25                10.64
4 amp/ft²            9.07                 9.68
                     9.41  X̄ = 9.910     10.42  X̄ = 10.048
                    10.38                10.21                9.979
                    10.69                10.00
                    10.00                 9.93
Average              9.591               10.182

8. The proportion of wire bonds with evidence of ceramic pullout (CPO) is


given below for various combinations of metal-film thickness, ceramic
surface, prebond clean and annealing time. Note that the sample size is not
maintained over all the cells. This is typical of real industrial data.51 Analyze
the experiment using analysis of means with α = 0.05.

Proportion of wire bonds with CPO

                                       Metal-film thickness
                              Normal                   1.5 × normal
Prebond      Annealing        Ceramic surface          Ceramic surface
clean        time             Unglazed    Glazed       Unglazed    Glazed
No clean     Normal           9/96        70/96        8/96        42/96
             4 × normal       13/64       55/96        7/96        19/96
Clean        Normal           3/96        6/96         1/64        7/96
             4 × normal       5/96        28/96        3/96        6/96

9. A 33 experiment was run on wire bonding to determine the effect of capillary,


temperature, and force on wire bonding in semiconductor manufacture. Data
adapted from that supplied by Richard D. Zwickl52 are as follows:

50. R. D. Zwickl, “Applications of Analysis of Means,” The Ellis R. Ott Conference on Quality Management and
Applied Statistics in Industry, New Brunswick, NJ (April 7, 1987).
51. R. D. Zwickl, “An Example of Analysis of Means for Attribute Data Applied to a 24 Factorial Design,” ASQC
Electronics Division Technical Supplement, issue 4 (Fall 1985): 1–22.
52. R. D. Zwickl, “Applications of Analysis of Means,” The Ellis R. Ott Conference on Quality Management and
Applied Statistics in Industry, New Brunswick, NJ (April 7, 1987).
520 Part III: Troubleshooting and Process Improvement

The values shown are averages of 18 wire bonds. Analyze the experiment
using analysis of means with α = 0.05. What can be said about the nature of
the temperature effect from the analysis of means plot?

Average pull strength


Capillary
New Worn Squashed
Force, Temperature, °C Temperature, °C Temperature, °C
psi 25 100 150 25 100 150 25 100 150
25 4.08 4.67 6.22 3.27 5.91 9.35 2.70 5.04 7.43
4.77 2.96 7.67 4.18 5.60 8.49 3.28 4.66 8.97
40 2.50 4.83 8.62 3.32 5.81 8.53 4.01 5.82 8.57
2.30 6.13 8.12 5.06 6.62 6.78 4.61 5.73 9.13
55 4.34 4.85 8.31 4.18 6.61 9.38 3.97 6.03 7.62
3.32 4.15 8.38 5.71 7.32 9.21 4.02 6.24 10.44

10. An experiment was conducted to determine the effect of wafer location and
line width measuring equipment (aligner) on line width of a photolithographic
process. The following data were obtained.53

Line width
Wafer location
1 2 3 4
202 211 211 186
208 217 200 174
Aligner 1 215 220 206 198
231 226 211 208
208 212 205 189

X̄ 212.8 217.2 206.6 191.0 206.9
R 29 15 11 34
219 207 211 199
231 222 225 206
Aligner 2 225 216 210 211
222 216 218 213
224 215 216 207

X̄ 224.2 215.2 216.0 207.2 215.6
R 12 15 15 14
255 250 254 246
253 250 252 246
Aligner 3 254 250 254 246
253 249 253 245
253 250 254 249

X̄ 253.6 249.8 253.4 246.4 250.8
R 2 1 2 4
53. S. Kukunaris, “Optimizing Manufacturing Processes Using Experimental Design,” ASQC Electronics Division
Technical Supplement, issue 3 (Summer 1985): 1–19.
Chapter 15: More Than Two Levels of an Independent Variable 521

223 226 223 222
220 221 216 211
Aligner 4 228 235 231 212
221 221 222 215
229 231 219 219

X̄ 224.2 226.8 222.2 215.8 222.2
R 9 14 15 11
Average 228.7 227.2 224.6 215.1 223.9

Analyze by analysis of means using α = 0.05. Draw the interaction diagram.


11. The following results were obtained in an experiment to determine the effect
of developers and line width measuring equipment on line width.54

Line width
Aligner 1 Aligner 2 Aligner 3 Average
215 223 226
225 225 240
211 230 237
Developer 1 212 236 234
212 232 235
211 238 225
206 234 236

X̄ 213.1 231.1 233.3 225.8
R 19 15 15
213 220 231
206 223 228
206 221 231
Developer 2 211 220 223
215 222 228
227 228 245
217 228 229

X̄ 213.6 223.1 230.7 222.5
R 21 8 22
213 218 238
219 225 228
211 225 231
Developer 3 212 230 244
207 232 234
207 228 237
216 231 236

X̄ 212.1 227.0 235.4 224.8
R 12 14 16

Average 212.9 227.1 233.1 X̄ = 224.4

Analyze the experiment by analysis of means using α = 0.05. Draw the
interaction diagram.

54. Ibid.

12. Certain questions have arisen regarding the error-making propensity of
four randomly selected workstations from two departments (C and D) on
shifts A or B, Monday through Friday. Records from one week are shown
below. Determine which department, shift, and days of the week are to be
preferred using a 95 percent confidence level.

C D
A B A B
Monday 10 15 16 19
11 22 12 24
16 12 13 15
15 20 11 16
R 6 10 5 9
Tuesday 9 11 15 21
6 15 10 16
10 19 9 17
7 10 10 21
R 4 9 6 5
Wednesday 14 18 15 14
13 16 12 18
10 14 14 13
14 13 15 21
R 4 5 3 8
Thursday 7 8 10 14
13 15 12 12
6 15 13 11
11 17 9 13
R 7 9 4 3
Friday 10 18 14 15
13 16 17 22
9 15 10 15
7 21 17 15
R 6 6 7 7

13. A designed experiment involving two treatments, A and B, in four
departments, w, x, y, and z, and five time periods, I, II, III, IV, and V, gave the
results indicated below. Complete an analysis of means to determine which
departments, treatments, or time periods are to be preferred at a 95 percent
confidence level. High results are desired.

          w             x             y             z
      A      B      A      B      A      B      A      B
I     7.1    6.0    7.6    6.5    7.5    6.9    6.2    7.9
      7.5    7.3    7.4    7.2    8.5    7.2    7.1    7.6
      8.0    7.8    6.4    7.7    6.6    6.6    6.1    6.0    Ī = 7.11
R     0.9    1.8    1.2    1.2    1.9    0.6    1.0    1.9
II    6.6    7.6    7.5    8.0    7.4    6.7    6.7    7.7
      7.3    6.2    7.3    6.5    7.2    6.8    6.3    6.7
      6.3    6.3    6.5    7.5    7.1    7.0    6.9    7.7    IĪ = 6.99
R     1.0    1.4    1.0    1.5    0.3    0.3    0.6    1.0
III   6.4    6.5    7.0    6.8    7.8    7.3    5.5    6.4
      7.3    7.1    6.9    6.8    7.2    6.4    6.9    5.9
      5.7    6.2    5.9    6.5    7.5    7.5    7.5    5.7    IIĪ = 6.70
R     1.6    0.9    1.1    0.3    0.6    1.1    2.0    0.7
IV    6.7    6.2    6.1    6.3    8.4    7.1    7.7    7.9
      7.4    6.0    6.7    6.2    7.2    7.3    6.3    6.4
      7.3    7.1    7.0    6.9    8.1    7.0    6.3    6.5    IV̄ = 6.92
R     0.7    1.1    0.9    0.7    1.2    0.3    1.4    1.5
V     7.4    6.3    8.0    7.3    7.2    7.5    7.6    6.2
      6.1    7.9    6.5    6.1    8.1    8.1    7.4    7.9
      7.2    6.4    6.9    7.1    6.7    7.1    6.1    6.2    V̄ = 7.05
R     1.3    1.6    1.5    1.2    1.4    1.0    1.5    1.7

w̄ = 6.84    x̄ = 6.90    ȳ = 7.30    z̄ = 6.78
Ā = 7.02    B̄ = 6.89    R̄ = 1.1225
Mean of all sample data points = 6.955
16
Assessing Measurements As a Process

16.1 INTRODUCTION
Throughout this text, data have been presented for a wide variety of applications. In
each of these case histories and examples, we have sought to understand what variable
or variables affected the manufacturing process. All of this was done without the
slightest mention of the reliability of the actual measurements taken. This is not to say that
measurement error is insignificant, and that opportunities do not exist in evaluating
the measurement process—quite the contrary! It is important for anyone who works on
a manufacturing process to realize that the act of measurement is a process as well.
Consider the mica thickness data. The engineer involved went to the production line
and measured the mica thickness of n = 200 samples. The readings for the initial n = 50
measurements were higher on average than the remaining n = 150. Was this a true
change in the process average—or simply a measurement problem?
In this chapter, methods for studying the measurement process are presented along
with their most common metrics. In addition, it will be shown that ANOME can be a
useful tool for the graphical evaluation of a measurement study. Its use will be com-
pared to more standard graphical and empirical techniques. Of course, other analytical
methods discussed in this book will be employed as well. The fact that such studies
often utilize a wide variety of methods, including ANOME, is the reason that this topic
is being addressed at the end of this text.

16.2 MEASUREMENT AS A PROCESS
Data are too often simply taken at face value and nothing of their origin is considered. In
other words, data taken from a manufacturing process are more often than not used to
control it with no regard given to whether or not the measurement process that generated
the data was in control. Shewhart once said that “In any program of control we must start
with observed data; yet data may be either good, bad, or indifferent. Of what value is the
theory of control if the observed data going into that theory are bad? This is the question
raised again and again by the practical man.”1

[Figure 16.1: controllable inputs — operator or technician, test procedure, sample
preparation, selection of sample, lab environment, and so on — feed the measurement
instrument, whose observed output is the sample result.]

Figure 16.1 Measurement data are a result of a process involving several inputs, most of
them controllable.
Consider the simple model of a process as seen in Figure 16.1. Inputs to the
measurement process are varied, but assumed controllable.
Each sample result is the direct output of the manner in which it was created by the
combination of the person making the measurement, the procedure used, the quality
of the sample preparation, where the sample came from, the temperature and humidity
in the lab at that time, and so on. Of course, if the measurement instrument is designed
to be robust to operator and environmental effects then the sample result will be less
affected. However, it is unlikely that the instrument can avoid use of an improper
procedure, poor sample preparation, or a sample that is deficient in some respect.
If the sample measured were a “standard sample,” or control sample, with a specified
reference target value such as a NIST-traceable standard, then we can evaluate each result
against the target to determine whether the measurement was correct or not (see Case
History 2.4). A standard sample that reads on average the same as the target value, or true
value, means that the measurement process is accurate, and the average is considered the
true average if the measurements were made with a precision-calibrated instrument. If
the standard sample fails to agree with the target value on average, the measurement
process is considered to be inaccurate, or not accurate, and calibration is necessary. Accuracy
is often called the bias in the measurement and is illustrated in Figure 16.2.
The variability of the sample measurements is also considered. When the variability
of the sample data is small, the measurement is said to have precision. If the sample
variation is large, that is, scattered, then the measurement process is considered to be
imprecise, or not precise. Figure 16.3 shows the four scenarios that relate the
combinations of data that are either accurate, precise, neither, or both.

1. W. A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand, 1931): 376.

Figure 16.2 Gauge accuracy is the difference between the measured average of the gauge
and the true value, which is defined with the most accurate measurement
equipment available.

[Figure 16.3: four panels — accurate and precise; precise but not accurate; accurate but
not precise; neither accurate nor precise.]

Figure 16.3 Measurement data can be represented by one of four possible scenarios.

16.3 WHAT CAN AFFECT THE MEASUREMENT PROCESS?
A measurement process contains any or all of the following:
• Machine(s) or device(s)
• Operator(s) or appraiser(s)

• Sample preparation
• Environmental factors
Multiple machines or devices are often used since it is unrealistic to expect that all
process measurements can be done by a single machine or device. For this reason, it is
important to assess whether the use of multiple measurement machines or devices is
contributing to the error of the measurement system. Likewise, it is typical that more
than one operator or appraiser will be needed to make the measurements. Since not
everyone has the same attention to detail, it is not uncommon for there to be a potential
contribution to measurement error due to differences in results among operators or
appraisers, even measuring the same sample.
A frequent omission in measurement studies is the consideration of any sample
preparation that could affect the measurement result. Samples that are prepared in a lab
can see their measurements affected due to improper polishing, insufficient material,
poor environmental conditions, incorrect chemical solutions, and many other reasons.
Often these problems can be resolved through adequate training of laboratory personnel.
Bishop, Hill, and Lindsay2 offer some useful questions to ask when investigating a
measurement system:
• Does the technician know that the test is very subjective?
• Are technicians influenced by knowledge of the specification and/or control
limits of the process attribute?
• If more than one technician and/or instrument is used to collect the
measurements, are there any statistically significant differences among them? We
don’t want to make changes to the process when the data are really representing
measurement differences!
• Is the measurement instrument the source of the problem? Perhaps it is in need
of calibration, or its settings are not correct so an adjustment is needed.
These authors present three examples of how problems associated with the
measurement system can mislead the engineer who is investigating the production process.
In each example, a measurement problem would result in an unnecessary, or a lack of a
needed, process adjustment.
1. When things are worse than they appear. This situation can occur when the test
method is very subjective, and there are multiple technicians doing the measurements.
Statistical differences will no doubt be present among the technicians based on
measurements of the same samples. If the technician knows the target value for the product,

2. L. Bishop, W. J. Hill, and W. S. Lindsay, “Don’t Be Fooled by the Measurement System,” Quality Progress
(December 1987): 35–38.

as well as the specification limits for the response, the data may become tainted with
readings closer to the target than they really are. Thus, the measurement data will make
the process look better controlled than it actually is. The solution is a combination of
a new, less subjective test method, new control limits based on the test method, and
further training of the technicians.
2. When things are better than they appear. This situation can occur when the
analytical method being used is out of statistical control from time to time. In response to
this, engineers may feel compelled to “do something” to bring the process back into
control. Unfortunately, the engineers will often fail to discover any assignable cause for
the apparent process change. Likewise, it may not be apparent what the assignable cause
is for the test to be out of control. The solution is to institute the use of a “standard
sample,” or control sample, which is submitted along with the production samples for
measurement. If the “standard sample” continues to read within its control limits, the
test method is deemed to be correct and any out-of-control production measurements
should be a cause for action. On the other hand, in the situation described here, the
“standard sample” will often indicate that the test is out of control, and that process
changes should not be made based on a faulty measurement. As a rule, no process
control change should be made when the results of the production samples are correlated
with the results of the control sample.
3. When the specimen tested is not representative of the product. This situation can
occur if a test specimen is taken from the wrong production line for which the data are
being used for control purposes. It can also occur if the test method is not consistent
with a proper recommended technique, such as that prescribed by an ASTM standard.
In addition, this situation can occur if the test specimen was taken from a production lot
other than the one intended. These scenarios are only examples of how a test specimen
can fail to be representative of the product being evaluated. The reader can probably cite
other examples based on their knowledge of other processes.
In each of the above examples, the authors used a type of nested design discussed
in the next section. Typically, these designs are used for investigating measurement
systems involving multiple technicians making multiple sample preparations and multiple
measurements on each sample preparation. The technicians, preparations, and repeated
measurements are sources of variation that need to be quantified.
Such designs should be performed in conjunction with a process investigation. In this
manner, you can judge how much of the variation seen in the data is attributed to the
production process and how much to the measurement process. If the measurement process
accounts for the larger portion of the total variation, then efforts should be directed
towards this area as an opportunity for making the overall process more consistent.
Samples should be submitted in a “blind” fashion to the technician so that person is
not aware of what its reading should be, that is, knowledge of its target value. These
samples should be part of the typical workload and they should be tested in a random
sequence (not the order in which they come from the production process).

16.4 CROSSED VERSUS NESTED DESIGNS


A natural extension of the nested design (see Section 15.7) occurs when the
experimenter wishes to partition sources of variability due to differences in parts, operators,
periods of time, and so on, so as to provide some direction for identifying opportunities
to reduce measurement variation. Nested designs of this nature are typically referred to
as variance component designs. Variance component designs treat operators and parts as
random effects such that their contribution to total variation is additive in nature.
Oftentimes, a crossed design (see Section 15.5) may be more appropriate than a
nested design. In the case of a crossed design, each sample, or part, is measured
repeatedly by each operator on each day, and so on, in such a way that the factors are crossed
with each other. The operator-part interaction is often of the most interest in these
designs. A significant interaction will indicate that the operators were not able to
reproduce their results for all of the measured samples, or parts. In other words, one or more
operators may have had difficulty measuring a particular part whereas the others did
not. Crossed designs treat operators and parts as fixed effects such that we are looking
for statistical differences among the levels of each factor and their interaction.
In the case of a nested design, each sample, or part, cannot be measured by another
operator or on another day, and so on, such that the factors become nested within the
other factors.
Nested designs are necessary if the testing is destructive in some way, or if some or
all of one or more factors are isolated from the others in a manner that makes a crossed
design impractical to conduct, or it is not cost-efficient to run.
For example, if plant location is a factor, it may not be cost-effective to send each
operator to the other plants just to collect data using a particular type of gauge, but it
may be possible to carry a set of samples between locations (as long as they do not
become broken in transit). Of course, we would have to assume that the gauges at each
location are in agreement. If the gauges are regularly calibrated to NIST-traceable (or
other organization-traceable) standards, then it may be safe to assume that the gauges
do not contribute much to the reproducibility of the measurement between locations.
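The practical difference between the two layouts is simply which operator-part combinations get measured. A minimal sketch, assuming hypothetical operator and part labels (not from the text):

```python
from itertools import product

operators = ["A", "B", "C"]
parts = [1, 2, 3, 4, 5]
trials = [1, 2]

# Crossed design: every operator measures every part on every trial,
# so the operator-part interaction can be estimated.
crossed_runs = [(op, p, t) for op, p, t in product(operators, parts, trials)]
assert len(crossed_runs) == 3 * 5 * 2  # 30 runs

# Nested design: each operator gets a distinct set of parts (for example,
# because testing is destructive), so parts are nested within operators and
# the operator-part interaction cannot be separated from part variation.
nested_parts = {"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10], "C": [11, 12, 13, 14, 15]}
nested_runs = [(op, p, t) for op in operators for p in nested_parts[op] for t in trials]
assert len(nested_runs) == 30

# In the crossed layout, each part appears under every operator;
# in the nested layout, each part appears under exactly one operator.
parts_per_op_crossed = {op: {p for o, p, t in crossed_runs if o == op} for op in operators}
assert parts_per_op_crossed["A"] == parts_per_op_crossed["B"]
```

Both layouts contain the same number of measurements here; what differs is the information each can recover.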
Both of these designs can be used in assessing gauge measurement capability. The
next section discusses the approach to gauge measurement studies.

16.5 GAUGE REPEATABILITY AND


REPRODUCIBILITY STUDIES
Gauges, as measurement equipment, are subject to variation. They must be accurate and
precise. If a gauge is not properly calibrated for accuracy, a bias may be present. We
could experience the same result if different people use the calibrated gauge and get
different results. This is referred to as the reproducibility of the gauge and is illustrated in
Figure 16.4.

On the other hand, if a single person uses the gauges and takes repeat readings of a
single sample, there will be variation in the results. This is referred to as the
repeatability of the gauge and is illustrated in Figure 16.5.


Figure 16.4 Gauge reproducibility can be represented as the variation in the average of
measurements made by multiple operators using the same gauge and measuring
the same parts.


Figure 16.5 Gauge repeatability can be represented as the variation in the measurements made
by a single operator using the same gauge and measuring the same parts.

Gauge R&R Study (Long Method)


The ASQ Automotive Division SPC Manual defines the long method as determining
repeatability and reproducibility of a gauge separately.3 Comparing these estimates can
give insight into causes of gauge error. If reproducibility is large compared to
repeatability, then possible causes could be:
• Operator is not properly trained to use and read the gauge
• Calibration markings on the gauge are not clear to the operator
If repeatability is large compared to reproducibility, then possible causes could be:
• Gauge is in need of maintenance
• Gauge should be redesigned to be more rigid
• Clamping of and location for gauging needs improvement
In preparation for running the study, establish the purpose of the study and
determine the kind of information needed to satisfy its purpose. Answer these questions:
• How many operators will be involved?
• How many sample parts will be needed?
• What number of repeat readings will be needed for each part?
Next, collect the parts needed for the study. These parts should represent the range
of possible values the gauge is expected to see in practice. Finally, choose the operators
needed to conduct the study. Again, you will want to choose people who represent the
range of skill within the pool of inspectors available. Measurements should be taken in
random order so as to reduce the possibility of any bias.
The study is typically conducted using the following steps, involving multiple
operators (use 2–3, preferably 3), multiple parts (use 5–10, preferably 10), and repeat
number of trials (use 2–5; will depend on cost and time constraints).
1. Refer to operators as A, B, and so on, and to parts as 1, 2, and so on (number
parts so the markings are not visible to the operators).
2. Calibrate the gauge to be evaluated.
3. Operator A measures the parts in random order and enters the data into the
first column of the form shown in Table 16.1.
4. Repeat step 3 for the other operator(s) and appropriate columns.
5. Repeat steps 1 to 4, with the parts measured in another random order, as many
times as the number of trials specified. After each trial, enter the data on the
form for each part and operator.

3. ASQC Automotive Division SPC Manual (Milwaukee: American Society for Quality Control, 1986).
Table 16.1 Gauge repeatability and reproducibility data collection sheet (long method).

The form provides one row for each of parts 1–10. For each operator (A, B, and C)
there are columns for up to five trials and for the range of that operator's readings on
the part; a final column holds the part average. Beneath the data rows, the form collects:

  X̄_A, X̄_B, X̄_C = average of each operator's readings
  R̄_A, R̄_B, R̄_C = average range for each operator; R̄ = average of R̄_A, R̄_B, R̄_C
  UCL_R* = R̄ × D₄, where D₄ depends on the number of trials:
      # Trials    2      3      4      5
      D₄          3.27   2.58   2.28   2.11
  X̄_diff = (max operator X̄) − (min operator X̄)
  R_p** = (max part X̄) − (min part X̄)

* Limit on the individual ranges (Rs). Ranges beyond the limit are circled, and the cause should
be identified and corrected. Repeat the readings with the same operator and same sample, or
discard the values and reaverage and recompute R̄ and UCL_R from the rest of the data.
** Range of all the sample averages.

6. Steps 3 to 5 can be modified for large-size parts, when parts are unavailable,
or when operators are on different shifts.
7. Using the data collection form in Table 16.1 and the calculations form in
Table 16.2, compute the gauge R&R statistics.
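Steps 3 and 5 call for a fresh random measurement order on every trial. One way such a run order might be generated is sketched below; the function name and seed are illustrative, not from the text:

```python
import random

def study_run_order(operators, n_parts, n_trials, seed=0):
    """Build a measurement plan for the long-method gauge R&R study:
    each operator measures all parts in a fresh random order on each
    trial (steps 3 and 5), which helps reduce bias from always
    measuring in a fixed sequence."""
    rng = random.Random(seed)
    plan = []
    for trial in range(1, n_trials + 1):
        for op in operators:
            order = list(range(1, n_parts + 1))
            rng.shuffle(order)  # a new random order for every operator/trial
            for part in order:
                plan.append((trial, op, part))
    return plan

plan = study_run_order(["A", "B", "C"], n_parts=10, n_trials=2)
print(len(plan))  # 2 trials x 3 operators x 10 parts = 60 measurements
```

The fixed seed simply makes the plan reproducible on paper; in practice any randomization mechanism serves.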
Repeatability, also referred to as equipment variation (EV), estimates the spread
that encompasses 99 percent of the measurement variation due to the same operator
measuring the same part with the same gauge, and is calculated as

EV = 5.15σ̂_EV = 5.15σ̂_e = 5.15(R̄/d₂*) = R̄ × K₁

σ̂_EV = σ̂_e = EV/5.15
where R̄ is the average range of the operator ranges R̄_A, R̄_B, and so on, and K₁ is a
tabulated constant4 which is given in Table 16.2. The factor 5.15 represents the overall
number of standard deviations (±2.575) around the mean within which 99 percent of the
observations are expected to lie under the assumption of a normal distribution.
Reproducibility, also referred to as appraiser variation (AV), estimates the spread
that encompasses 99 percent of the measurement variation due to different operators
measuring the same part with the same gauge, and is calculated as

AV = 5.15σ̂_AV = 5.15σ̂_o = √[(X̄_diff × K₂)² − (EV)²/(n × r)]

σ̂_AV = σ̂_o = AV/5.15
where X̄_diff is the range of the operator averages X̄_A, X̄_B, and so on; K₂ is a tabulated
constant,5 which is given in Table 16.2 and based on a d₂* factor for k = 1 found in Table A.11;
n is the number of parts measured, and r is the number of trials.

5.15
4. K1 is , where d2* is tabulated in Table A.11 and is based on k = (# operators) × (# parts) and n = # trials.
d 2*

For example, if three operators are used with four parts for three trials, then k = (4)(3) = 12 and n = 3, which yields
a value of d2* = 1.71 from Table A.11. Thus, K1 will be
5.15 5.15
K1 = = = 3.01
d 2* 1.71

5.15
5. K2 is , where d2* is tabulated in Table A.11 and is based on k = 1 and n = # operators. For example, if three
d 2*

operators are used, then k = 1 and n = 3, which yields a value of d2* = 1.91 from Table A.11. Thus, K2 will be
5.15 5.15
K2 = = = 2.70
d 2* 1.91
Table 16.2 Gauge repeatability and reproducibility calculations sheet (long method).

Gauge Repeatability and Reproducibility Report
Gauge name: ________  Date: ________  Characteristic: ________
Gauge no.: ________  Study done by: ________  Part number and name: ________
Gauge type: ________  Specification: ________  Spec. tolerance (if two-sided): ________ (TOL)
From data collection sheet: R̄ = ____  X̄_diff = ____  R_p = ____  n = # parts  r = # trials

Measurement analysis:
  Repeatability (a.k.a. equipment variation): EV = R̄ × K₁
  Reproducibility (a.k.a. appraiser variation*): AV = √[(X̄_diff × K₂)² − (EV)²/(n × r)]
  Repeatability and reproducibility: R&R = √[(EV)² + (AV)²]
  Part variation: PV = R_p × K₃
  Total variation: TV = √[(R&R)² + (PV)²]

Constants:
  # Trials:     K₁ = 4.56 (2 trials), 3.05 (3), 2.50 (4), 2.21 (5)
  # Operators:  K₂ = 3.65 (2 operators), 2.70 (3), 2.30 (4), 2.08 (5)
  # Parts:      K₃ = 3.65 (2 parts), 2.70 (3), 2.30 (4), 2.08 (5), 1.93 (6),
                     1.82 (7), 1.74 (8), 1.67 (9), 1.62 (10)

% Tolerance analysis:
  %EV = 100[(EV)/(TOL)]
  %AV = 100[(AV)/(TOL)]
  %R&R = 100[(R&R)/(TOL)]

% Total variation analysis:
  %EV = 100[(EV)/(TV)]
  %AV = 100[(AV)/(TV)]
  %R&R = 100[(R&R)/(TV)]
  %PV = 100[(PV)/(TV)]

* If AV is a negative value within the square root sign, the appraiser variation will default
to zero (0).
536 Part III: Troubleshooting and Process Improvement


Figure 16.6 Gauge R&R can be represented as the total variation due to measurements made by
multiple operators using the same gauge and measuring the same parts.

Repeatability and reproducibility, also referred to as gauge R&R, estimates the
spread that encompasses 99 percent of the variation due to both sources and is
calculated as follows and illustrated in Figure 16.6

R&R = 5.15σ̂_m = 5.15√[(σ̂_EV)² + (σ̂_AV)²] = 5.15√(σ̂_e² + σ̂_o²)

σ̂_m = R&R/5.15

Part-to-part variation, also referred to as PV, estimates the spread that encompasses
99 percent of the measurements from a normal distribution and is calculated as

PV = 5.15σ̂_p = 5.15(R_p/d₂*) = R_p × K₃

σ̂_p = PV/5.15
Chapter 16: Assessing Measurements As a Process 537

where R_p is the range of the part averages, and K₃ is a tabulated constant,6 which is given
in Table 16.2 and based on a d₂* factor for k = 1 found in Table A.11.
The total process variation, also referred to as TV, is calculated from the measurement
study as

TV = 5.15σ̂_t = 5.15√(σ̂_p² + σ̂_m²)

σ̂_t = TV/5.15
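The chain of calculations from EV through TV can be collected into a few lines. The summary statistics fed in below are illustrative only, not taken from the text:

```python
import math

def gauge_rr_long(r_bar, x_diff, r_p, n_parts, n_trials, K1, K2, K3):
    """Long-method gauge R&R from the data collection sheet summaries.
    Follows the formulas in this section; the 5.15 multiplier (99 percent
    spread) is already folded into the K factors."""
    EV = r_bar * K1                                   # repeatability
    av_sq = (x_diff * K2) ** 2 - EV ** 2 / (n_parts * n_trials)
    AV = math.sqrt(av_sq) if av_sq > 0 else 0.0       # reproducibility, floored at 0
    RR = math.sqrt(EV ** 2 + AV ** 2)                 # combined gauge R&R
    PV = r_p * K3                                     # part-to-part variation
    TV = math.sqrt(RR ** 2 + PV ** 2)                 # total variation
    return EV, AV, RR, PV, TV

# Illustrative numbers: 3 operators, 10 parts, 2 trials.
EV, AV, RR, PV, TV = gauge_rr_long(
    r_bar=1.2, x_diff=0.8, r_p=6.4, n_parts=10, n_trials=2,
    K1=4.56, K2=2.70, K3=1.62)
print(f"EV={EV:.2f} AV={AV:.2f} R&R={RR:.2f} PV={PV:.2f} TV={TV:.2f}")
```

Dividing any of these spreads by 5.15 recovers the corresponding standard-deviation estimate.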

The number of distinct categories,7 also referred to as NDC, that can be obtained
from the data is calculated as

NDC = 1.41σ̂_p/σ̂_m

Some guidelines in the interpretation of NDC are:


• If NDC = 1, the measurement system cannot be used to control the process
since the gauge cannot tell one part from another, that is, the data are
100% noise
• If NDC = 2, the data fall into two groups, like attributes data
• If NDC = 3, the variable data are considered to be of a low-grade quality
that will produce insensitive control charts
• If NDC = 4, the variable data are improved
• If NDC = 5, the variable data are even better (minimum acceptability)
• The NDC should be > 5, and the larger the better, in order for the measurement
system to be deemed truly acceptable
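A sketch of the NDC calculation together with the guidelines above; the sigma values and function names are illustrative:

```python
def ndc(sigma_p, sigma_m):
    """Number of distinct categories: 1.41 * sigma_p / sigma_m,
    conventionally truncated to an integer."""
    return int(1.41 * sigma_p / sigma_m)

def interpret_ndc(n):
    # Paraphrase of the guidelines in the text.
    if n <= 1:
        return "data are all noise; gauge cannot tell one part from another"
    if n == 2:
        return "data fall into two groups, like attributes data"
    if n < 5:
        return "low-grade variables data; insensitive control charts"
    return "acceptable (5 is the minimum; larger is better)"

n = ndc(sigma_p=2.0, sigma_m=0.5)
print(n, "-", interpret_ndc(n))  # 5 - acceptable (5 is the minimum; larger is better)
```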
The discrimination ratio, also referred to as DR, estimates the degree to which the
observed variation is beyond that characterized by the control limits of an X̄ chart of
the data (discussed in the next section). Recall from Chapter 2 that the control limits are

6. K₃ = 5.15/d₂*, where d₂* is tabulated in Table A.11 and is based on k = 1 and n = # parts. For example, if four
parts are used, then k = 1 and n = 4, which yields a value of d₂* = 2.24 from Table A.11. Thus, K₃ = 5.15/2.24 = 2.30.

7. For more information on this metric, the reader is referred to the following texts: D. J. Wheeler and R. W. Lyday,
Evaluating the Measurement Process (Knoxville, TN: SPC Press, 1989). ASQC Automotive Industry Action Group
(AIAG), Measurement Systems Analysis Reference Manual (Detroit, MI: 1990).

based on short-term variation, that is, repeatability in a measurement sense, and the
observed variation contains this variation as well as the product variation. Thus, the
discrimination ratio shows the relative usefulness of the measurement system for the
product being measured. The ratio estimate yields the number of non-overlapping categories
within the control limits, or natural process limits, that the product could be sorted into
if operator bias can be eliminated. The discrimination ratio is calculated as

DR = √(2σ̂_p²/σ̂_e² − 1)

Since operator bias is often present, it is useful to recalculate the discrimination
ratio, incorporating this bias, and then compare the two ratios. While the formula for the
ratio remains the same, the estimates for σ_p² and σ_e² become

σ̂′_e² = σ̂_m² = σ̂_e² + σ̂_o²  and  σ̂′_p² = σ̂_p² + σ̂_o²

so that

DR = √(2σ̂′_p²/σ̂′_e² − 1)

A percent tolerance analysis is sometimes preferred as a means of evaluating a
measurement system. Values of %EV, %AV, and %R&R are calculated using the value
of the specification tolerance (TOL) in the denominator as follows:

% EV = 100 [(EV)/(TOL)]
% AV = 100 [(AV)/(TOL)]
% R&R = 100 [(R&R)/(TOL)]

Common guidelines for the interpretation of the % R&R are:


• % R&R < 10%, the measurement system is OK for use
• 10% < % R&R < 30%, the measurement system may be acceptable contingent
upon its importance in application, cost of its replacement, cost of its repair,
and so on
• % R&R > 30%, the measurement system is not to be used, and effort is needed
to identify sources of excess variation and correct them
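The percent tolerance analysis and the guidelines above can be sketched as follows; the EV, AV, R&R, and tolerance values are illustrative, not from the text:

```python
def percent_tolerance(ev, av, rr, tol):
    """Percent tolerance analysis: each spread as a percentage of the
    specification tolerance TOL."""
    return 100 * ev / tol, 100 * av / tol, 100 * rr / tol

def judge_rr(pct_rr):
    # Paraphrase of the common %R&R guidelines in the text.
    if pct_rr < 10:
        return "OK for use"
    if pct_rr <= 30:
        return "may be acceptable, depending on importance and cost of replacement/repair"
    return "not to be used; identify and correct sources of excess variation"

# Illustrative spreads in the same units as the tolerance.
pct_ev, pct_av, pct_rr = percent_tolerance(ev=5.5, av=1.8, rr=5.79, tol=40.0)
print(f"%EV={pct_ev:.1f} %AV={pct_av:.1f} %R&R={pct_rr:.1f} -> {judge_rr(pct_rr)}")
```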
Another common evaluation is a percent total variation analysis. The computations
are similar to the percent tolerance analysis with the exception that the denominator of
the ratios is the total variation (TV).
Chapter 16: Assessing Measurements As a Process 539

% EV = 100 [(EV)/(TV)]
% AV = 100 [(AV)/(TV)]
% R&R = 100 [(R&R)/(TV)]

Unfortunately, these are poor statistical metrics as they represent ratios of standard
deviations. A more appropriate method is to express them as ratios of variances. In this
manner, the ratios become variance components that sum to 100 percent when the % PV
ratio is factored in as follows

%EV = 100[(σ̂_EV)²/(σ̂_t)²]
%AV = 100[(σ̂_AV)²/(σ̂_t)²]
%R&R = 100[(σ̂_m)²/(σ̂_t)²]
%PV = 100[(σ̂_p)²/(σ̂_t)²]

Thus, the variance components can be graphically portrayed with a simple pie
chart, or in a breakdown diagram as shown in Figure 16.7. Pure error, which is a com-
ponent of repeatability, is the variability of repeated measurements without removing
and re-fixturing the part. It is the smallest possible measurement error.
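The additive property of these ratios is easy to verify numerically. This sketch is our own, with the σ values as placeholders:

```python
def variance_components(s_ev, s_av, s_p):
    # Ratios of variances, not standard deviations, so the pieces are additive.
    v_ev, v_av, v_p = s_ev**2, s_av**2, s_p**2
    v_rr = v_ev + v_av          # R&R variance = EV variance + AV variance
    v_t = v_rr + v_p            # total variance
    return {"%EV": 100 * v_ev / v_t, "%AV": 100 * v_av / v_t,
            "%R&R": 100 * v_rr / v_t, "%PV": 100 * v_p / v_t}

comp = variance_components(3.778, 4.287, 23.454)   # placeholder sigmas
```

By construction, %EV + %AV = %R&R and %R&R + %PV = 100 percent.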
Gauge accuracy is defined as the difference between the observed average of sam-
ple measurements and the true (master) average of the same parts using precision instru-
ments. Gauge linearity is defined as the difference in the accuracy values of the gauge
over its expected operating range. Gauge stability is defined as the total variation seen
in the measurements obtained with the gauge using the same master or master parts
when measuring a given characteristic over an extended time frame. Gauge system
error is defined as the combination of gauge accuracy, repeatability, reproducibility,
stability, and linearity.
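As an illustration of the accuracy (bias) definition only, a minimal sketch (the function name and the data are hypothetical):

```python
def gauge_accuracy(observed, reference):
    # Accuracy (bias): observed average minus the true (master) average of
    # the same parts measured with precision instruments.
    return sum(observed) / len(observed) - sum(reference) / len(reference)

bias = gauge_accuracy([10.2, 10.1, 10.3], [10.0, 10.0, 10.0])  # gauge reads high
```

Evaluating this bias at several points across the gauge's operating range would, by the definition above, characterize its linearity.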

Overall variation
    Part-to-part variation
    Measurement system variation
        Variation due to gauge (repeatability)
            Pure error
            Fixturing
        Variation due to operators (reproducibility)
            Operator
            Operator by part

Figure 16.7 Variance components of overall variation can be represented as the breakdown of the
total variation into part-to-part variation and measurement (gauge R&R) variation.
540 Part III: Troubleshooting and Process Improvement

Case History 16.1


Gasket Thickness
A plant that manufactures the sheet stock used in the production of gaskets was
concerned about the measurement of thickness. The engineer, Alan, designed a gauge
R&R study to evaluate the measurement system. Three operators were chosen for the
study, and five different parts (gaskets) were chosen to represent the expected range of
variation seen in production. Each operator measured each gasket a total of two times.
The specification for thickness is 76 ± 20 mm. The data are shown in Table 16.3.
The data were entered in the data collection form (Table 16.4) and the summary sta-
tistics were computed for use in the calculations form (Table 16.5). The results of the
gauge R&R analysis are as follows:

EV = 5.15σEV = (4.267)(4.56) = 19.456

σEV = 19.456/5.15 = 3.778

AV = 5.15σAV = √{[(8.5)(2.70)]² − [(19.456)²/(10)]} = 22.078

σAV = 22.078/5.15 = 4.287

R&R = 5.15σm = 5.15√[(3.778)² + (4.287)²] = 29.427

σm = 29.427/5.15 = 5.714

PV = 5.15σp = (58.167)(2.08) = 120.790

σp = 120.790/5.15 = 23.454

Table 16.3 Gasket thicknesses for a gauge R&R study.


Operator
A B C
Part 1st Trial 2nd Trial 1st Trial 2nd Trial 1st Trial 2nd Trial
1 67 62 55 57 52 55
2 110 113 106 99 106 103
3 87 83 82 79 80 81
4 89 96 84 78 80 82
5 56 47 43 42 46 54
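The long-method arithmetic that follows can be reproduced directly from Table 16.3. The sketch below is our own; the K constants assume 2 trials, 3 operators, and 5 parts, and the results agree with the text's values to within intermediate rounding:

```python
data = {  # part -> {operator: (1st trial, 2nd trial)}, from Table 16.3
    1: {"A": (67, 62),   "B": (55, 57),  "C": (52, 55)},
    2: {"A": (110, 113), "B": (106, 99), "C": (106, 103)},
    3: {"A": (87, 83),   "B": (82, 79),  "C": (80, 81)},
    4: {"A": (89, 96),   "B": (84, 78),  "C": (80, 82)},
    5: {"A": (56, 47),   "B": (43, 42),  "C": (46, 54)},
}
K1, K2, K3 = 4.56, 2.70, 2.08     # long-method constants (2 trials, 3 operators, 5 parts)
n_parts, n_trials = 5, 2
ops = ("A", "B", "C")

ranges = [abs(t[0] - t[1]) for p in data for t in (data[p][op] for op in ops)]
r_bar = sum(ranges) / len(ranges)                       # average range, ~4.267

op_avg = {op: sum(sum(data[p][op]) for p in data) / (n_parts * n_trials) for op in ops}
x_diff = max(op_avg.values()) - min(op_avg.values())    # 81.0 - 72.5 = 8.5

part_avg = {p: sum(sum(data[p][op]) for op in ops) / (len(ops) * n_trials) for p in data}
r_p = max(part_avg.values()) - min(part_avg.values())   # 106.167 - 48.000 = 58.167

ev = r_bar * K1                                                     # ~19.46
av = ((x_diff * K2) ** 2 - ev ** 2 / (n_parts * n_trials)) ** 0.5   # ~22.1
rr = (ev ** 2 + av ** 2) ** 0.5                                     # ~29.4
pv = r_p * K3                                                       # ~121
tv = (rr ** 2 + pv ** 2) ** 0.5                                     # ~124.5
```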
Table 16.4 Gauge repeatability and reproducibility data collection sheet (long method) for Case History 16.1.

               Operator A                Operator B                Operator C
Part #   1st Trial  2nd Trial  Range   1st Trial  2nd Trial  Range   1st Trial  2nd Trial  Range   Average
1           67         62        5        55         57        2        52         55        3      58.000
2          110        113        3       106         99        7       106        103        3     106.167
3           87         83        4        82         79        3        80         81        1      82.000
4           89         96        7        84         78        6        80         82        2      84.833
5           56         47        9        43         42        1        46         54        8      48.000
Average    81.8       80.2      5.6      74.0       71.0      3.8      72.8       75.0      3.4
Sum = 162.0, X̄A = 81.0            Sum = 145.0, X̄B = 72.5            Sum = 147.8, X̄C = 73.9

R̄A = 5.6   R̄B = 3.8   R̄C = 3.4   Sum = 12.8   R̄ = 12.8/3 = 4.267

# Trials   D4      UCLR* = (R̄) × (D4) = (4.267) × (3.27) = 13.939
2          3.27
3          2.58    Max oper X̄ = 81.0      Min oper X̄ = 72.5      X̄diff = 8.5
4          2.28    Max part X̄ = 106.167   Min part X̄ = 48.000    Rp** = 58.167
5          2.11

* Limit of the individual ranges (Rs). Ranges beyond the limit are circled, and the cause should
be identified and corrected. Repeat the readings with the same operator and same sample, or
discard values and reaverage and recompute R̄ and UCLR from rest of data.
** Range of all the sample averages.
Table 16.5 Gauge repeatability and reproducibility calculations sheet (long method) for Case History 16.1.

Gauge Repeatability and Reproducibility Report
Gauge name: Gasket inspection     Date: 10/08/04     Characteristic: Gasket thickness
Gauge no.: C123     Study done by: Alan     Part number and name: GW123
Gauge type: Calipers     Specification: 76 ± 20     Spec. tolerance (if two-sided): 40 (TOL)
From data collection sheet: R̄ = 4.267     X̄diff = 8.5     Rp = 58.167     n = # parts = 5     r = # trials = 2

Measurement Analysis
Repeatability (a.k.a. equipment variation, EV):
  EV = (R̄) × (K1) = (4.267) × (4.56) = 19.46
Reproducibility (a.k.a. appraiser variation, AV*):
  AV = √{[(X̄diff) × (K2)]² − [(EV)²/(n × r)]} = √{[(8.5) × (2.70)]² − [(19.46)²/(10)]} = 22.08
Repeatability and Reproducibility (R&R):
  R&R = √[(EV)² + (AV)²] = √[(19.46)² + (22.08)²] = 29.43
Part Variation (PV):
  PV = (Rp) × (K3) = (58.167) × (2.08) = 120.79
Total Variation (TV):
  TV = √[(R&R)² + (PV)²] = √[(29.43)² + (120.79)²] = 124.32

Constants:
# Trials   K1       # Operators   K2       # Parts   K3
2          4.56     2             3.65     2         3.65
3          3.05     3             2.70     3         2.70
4          2.50     4             2.30     4         2.30
5          2.21     5             2.08     5         2.08
                                           6         1.93
                                           7         1.82
                                           8         1.74
                                           9         1.67
                                           10        1.62

% Tolerance Analysis
%EV = 100[(EV)/(TOL)] = 100[(19.46)/(40)] = 48.64%
%AV = 100[(AV)/(TOL)] = 100[(22.08)/(40)] = 55.19%
%R&R = 100[(R&R)/(TOL)] = 100[(29.43)/(40)] = 73.57%

% Total Variation Analysis
%EV = 100[(EV)/(TV)] = 100[(19.46)/(124.32)] = 15.65%
%AV = 100[(AV)/(TV)] = 100[(22.08)/(124.32)] = 17.76%
%R&R = 100[(R&R)/(TV)] = 100[(29.43)/(124.32)] = 23.67%
%PV = 100[(PV)/(TV)] = 100[(120.79)/(124.32)] = 97.16%

* If AV is a negative value within the square root sign, the appraiser variation will default to zero (0).

TV = 5.15σt = 5.15√[(23.454)² + (5.714)²] = 124.323

σt = 124.323/5.15 = 24.140

% EV = 100 [(19.456)/(40)] = 48.64%
% AV = 100 [(22.078)/(40)] = 55.19%
% R&R = 100 [(29.427)/(40)] = 73.57%

The % R&R value of 23.67% of the total variation indicated that the measurement
system was acceptable contingent upon its importance in application, cost of its
replacement, cost of its repair, and so on. The engineer recommended to management
that the measurement system be investigated further to identify sources of variation
that could be eliminated. It was possible to compute a % PV value as part of the
tolerance analysis, but the result was not very meaningful. For this study, the % PV
is calculated to be

% PV = 100 [(120.790)/(40)] = 301.97%

The number of distinct categories was also computed from the study results as

NDC = 1.41σp/σm = (1.41)(23.454)/(5.714) = 5.8 → 6

Since the number of categories was 6, the measurement system was considered to
be acceptable. The discrimination ratio for this study, incorporating operator bias, was
computed as

2 ( 23.454 ) + ( 4.287 ) 


2 2

DR =   − 1 = 5.8 → 6
(5.714 ) 2

which agreed with the number of distinct categories estimate. The engineer recomputed
the discrimination ratio under the assumption that the operator bias could be eliminated and
found that

DR = √[2(23.454)²/(3.778)² − 1] = 8.7 → 9

Thus, the engineer discovered that the measurement system could be improved
from distinguishing six quality levels to nine quality levels by eliminating the operator
bias, which was possible through certification and training.
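The reader can check these last few calculations in a few lines; this sketch simply reuses the σ estimates quoted above:

```python
sigma_p, sigma_m, sigma_e, sigma_o = 23.454, 5.714, 3.778, 4.287  # from the study

ndc = 1.41 * sigma_p / sigma_m                                        # ~5.8 -> 6 categories
dr_biased = (2 * (sigma_p**2 + sigma_o**2) / sigma_m**2 - 1) ** 0.5   # ~5.8
dr_clean = (2 * sigma_p**2 / sigma_e**2 - 1) ** 0.5                   # ~8.7
```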

The variance components were also calculated from the study results. These
components gave the investigator some direction on where to focus efforts to reduce
variation.

% EV = 100 [(3.778)2/(24.140)2] = 2.45%


% AV = 100 [(4.287)2/(24.140)2] = 3.15%
% R&R = 100 [(5.714)2/(24.140)2] = 5.60%
% PV = 100 [(23.454)2/(24.140)2] = 94.40%

As expected, the values of % EV (2.45%) and % AV (3.15%) sum to the contribu-


tion of the gauge % R&R value of 5.60%. Most of the variation seen in the data (94.40%)
is due to part-to-part differences. The fact that this component accounts for such a large
portion of the total variation is consistent with the larger value of distinct categories the
gauge can distinguish.
The analysis of gauge R&R studies is available in a wide array of software programs.
If Case History 16.1 is treated as a crossed design with the operators and parts
considered as random effects, the following analysis from Minitab is typical. The
estimates shown below are consistent with those shown above (minor differences are due
to rounding error). Note that the "VarComp" column contains the squares of the standard
deviation estimates σm, σe, σo, σp, and σt, respectively, which appear in the "StdDev
(SD)" column of the fourth table. Note also that the number of distinct categories
reflects a conservative estimate: 5.8 is rounded down to 5 instead of up to 6.

Two-Way ANOVA Table With Interaction

Source DF SS MS F P
Part 4 12791.1 3197.78 247.730 0.000
Appraiser 2 415.4 207.70 16.090 0.002
Part * Appraiser 8 103.3 12.91 1.058 0.439 (not significant)
Repeatability 15 183.0 12.20
Total 29 13492.8

Two-Way ANOVA Table Without Interaction

Source DF SS MS F P
Part 4 12791.1 3197.78 256.925 0.000
Appraiser 2 415.4 207.70 16.688 0.000
Repeatability 23 286.3 12.45
Total 29 13492.8

Gage R&R
%Contribution
Source VarComp (of VarComp)
Total Gage R&R 31.972 5.68
Repeatability 12.446 2.21
Reproducibility 19.525 3.47
Appraiser 19.525 3.47
Part-To-Part 530.889 94.32
Total Variation 562.861 100.00

Source            StdDev (SD)   Study Var (5.15 * SD)   %Study Var (%SV)   %Tolerance (SV/Toler)
Total Gage R&R 5.6544 29.120 23.83 72.80
Repeatability 3.5279 18.169 14.87 45.42
Reproducibility 4.4188 22.757 18.63 56.89
Appraiser 4.4188 22.757 18.63 56.89
Part-To-Part 23.0410 118.661 97.12 296.65
Total Variation 23.7247 122.182 100.00 305.46

Number of Distinct Categories = 5

Alternatively, this gauge study could be treated as a nested design, with operators
nested within parts. Note that the Minitab analysis for this model produces variance
components similar to those of the crossed design model. It is also seen in this
analysis that statistical differences still exist among the operators. This is seen in the
graphical analyses in the next section.

Nested ANOVA: Gasket Thickness versus Part, Appraiser

Analysis of Variance for Gasket Thickness

Source DF SS MS F P
Part 4 12791.1333 3197.7833 61.654 0.000
Appraiser 10 518.6667 51.8667 4.251 0.006
Error 15 183.0000 12.2000
Total 29 13492.8000

Variance Components
% of
Source Var Comp. Total StDev
Part 524.319 94.24 22.898 (vs. 23.041 in crossed design)
Appraiser 19.833 3.56 4.453 (vs. 4.419)
Error 12.200 2.19 3.493 (vs. 3.528)
Total 556.353 23.587
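The variance components in these tables follow from the expected mean squares. As a sketch of the crossed-design calculation, with the interaction pooled into repeatability (as in the two-way table without interaction) and the mean squares taken from the output above:

```python
# Mean squares from the two-way ANOVA table without interaction
ms_part, ms_oper, ms_err = 3197.78, 207.70, 12.45
n_parts, n_opers, n_trials = 5, 3, 2

var_repeat = ms_err                                              # repeatability
var_oper = max((ms_oper - ms_err) / (n_parts * n_trials), 0.0)   # reproducibility
var_part = max((ms_part - ms_err) / (n_opers * n_trials), 0.0)   # part-to-part
var_total = var_repeat + var_oper + var_part
# var_oper ~ 19.5, var_part ~ 530.9, var_total ~ 562.9,
# matching Minitab's "VarComp" column above
```

The max(..., 0) guard mirrors the footnote on the calculations sheet: a negative estimated component defaults to zero.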

Graphical Analysis of R&R Studies


Using Case History 16.1 as an example, there are many ways to display the results of
a gauge R&R study in graphical form. Most software programs provide the user with a
variety of these graphs. Minitab will be used here to demonstrate some of the graphical
analyses available.
A gauge run chart is a good first step in visualizing the results of the study. Figure
16.8 compares operators and parts and conveys useful information about measurement
capability. In this graph, there is good variability of the parts over the
process operating range, and the measurement variation (successive pairs of points) is
small in comparison to the part-to-part variation. The operator-to-operator variation
is larger than the measurement variation. These observations are consistent with the
variance components seen in the previous section.
A good way to take a closer look at a comparison of operators is to use an appraiser
variation plot, which is also known as a multi-vari plot of part averages by operators.
In the case of a crossed design, this plot would represent the operator-by-part interac-
tion. Figure 16.9 shows such a plot for Case History 16.1.
In this graph, it is evident that Operator A’s results are higher than the others and that
the results of the other operators are in close agreement, with the exception of Part 5. The
fact that the lines are nearly parallel supports the contention that the operator-by-part
interaction is negligible.
Another variation of the multi-vari plot is the R&R plot. This plot uses the re-
sponse data where the part average is subtracted from the original data. Thus, the new
response data shows only variation due to equipment variation (repeatability) and

Figure 16.8 Gauge R&R run plot for Case History 16.1.

appraiser variation (reproducibility). By regrouping the data by operator rather than by
part, it is possible to compare each source of operator variation and see where this
variation can be improved. In Figure 16.10, it is clear that Operator A is the primary

Figure 16.9 Gauge R&R appraiser variation plot for Case History 16.1.

Figure 16.10 Gauge R&R plot for Case History 16.1.



source of reproducibility variation. This is consistent with the observation of operator-


to-operator variation seen in Figures 16.8 and 16.9.
The variance components are shown as a bar chart in Figure 16.11 and as a pie chart
in Figure 16.12. The bar chart shows a nice comparison of the contributions of each
source of variation as a percentage of the total variance (st2), the study variance (TV),

Figure 16.11 Gauge R&R variance component chart for Case History 16.1.

Figure 16.12 Gauge R&R variance component pie chart for Case History 16.1 (% R&R = 6%, % PV = 94%).

and the tolerance (TOL). In any of these scenarios, it is clear that part-to-part variation
is the largest component. The pie chart shows that the gauge contributes six percent
of the overall variation in the data.
Figure 16.13 shows the data in the form of an X̄ and R chart. The X̄ chart is out of
control, which is a good thing: it means that the measurement system is capable of
discriminating between parts. However, the results for Operator A look higher than
those of the other operators. The R chart is in control, which is also a good thing.
This chart assesses
measurement system stability and uncertainty, as well as test and retest errors due to
fixturing. Note that the analysis of ranges (ANOR) could be used to check for statisti-
cal differences in variability between operators (see Section 15.12).
The analysis of means for effects (ANOME) can be used to determine the level of
statistical significance between operators, parts, and their interaction. This feature is not
evident in the graphics discussed in this section. The first step is to check for out-of-
control ranges in Table 16.4.

UCLR = D4R̄ = (3.27)(4.267) = 13.939

Since all of the ranges lie below this limit, this is accepted as evidence of homo-
geneity of ranges. The next step is to compute the averages to be plotted, as shown in
Table 16.6.
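This homogeneity check is easy to automate. The sketch below uses the 15 cell ranges from Table 16.4 (the unrounded R̄ gives UCLR ≈ 13.95 here, versus 13.939 in the text from a slightly rounded R̄):

```python
d4 = {2: 3.27, 3: 2.58, 4: 2.28, 5: 2.11}    # D4 factors by number of trials
ranges = [5, 3, 4, 7, 9,   2, 7, 3, 6, 1,   3, 3, 1, 2, 8]   # operators A, B, C
r_bar = sum(ranges) / len(ranges)            # ~4.267
ucl_r = d4[2] * r_bar                        # two trials per cell
out_of_control = [r for r in ranges if r > ucl_r]   # empty -> ranges homogeneous
```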

Figure 16.13 Gauge R&R X̄ and R chart for Case History 16.1 (X̄ chart: UCL = 83.82, centerline = 75.8, LCL = 67.78; R chart: UCL = 13.94, centerline = 4.27, LCL = 0).

Table 16.6 Gasket thicknesses for a gauge R&R study.

                Operator A               Operator B               Operator C
Part   Readings     X̄i1       Readings     X̄i2       Readings     X̄i3            X̄i•
1      67, 62       64.5      55, 57       56.0      52, 55       53.5      X̄1• = 58.00
2      110, 113     111.5     106, 99      102.5     106, 103     104.5     X̄2• = 106.17
3      87, 83       85.0      82, 79       80.5      80, 81       80.5      X̄3• = 82.00
4      89, 96       92.5      84, 78       81.0      80, 82       81.0      X̄4• = 84.83
5      56, 47       51.5      43, 42       42.5      46, 54       50.0      X̄5• = 48.00
       X̄•1 = 81.00            X̄•2 = 72.50            X̄•3 = 73.90            X̄•• = 75.80
                                                                           n = 30

Main effects are as follows:

P1 = (58.00 – 75.80) = –17.80


P2 = (106.17 – 75.80) = 30.37 O1 = (81.00 – 75.80) = 5.20
P3 = (82.00 – 75.80) = 6.20 O2 = (72.50 – 75.80) = –3.30
P4 = (84.83 – 75.80) = 9.03 O3 = (73.90 – 75.80) = –1.90
P5 = (48.00 – 75.80) = –27.80

Interaction effects are as follows:

PO11 = (64.5 – 75.80) – (–17.80) – (5.20) = 1.30


PO21 = (111.5 – 75.80) – (30.37) – (5.20) = 0.13
PO31 = (85.0 – 75.80) – (6.20) – (5.20) = –2.20
PO41 = (92.5 – 75.80) – (9.03) – (5.20) = 2.47
PO51 = (51.5 – 75.80) – (–27.80) – (5.20) = –1.70
PO12 = (56.0 – 75.80) – (–17.80) – (–3.30) = 1.30
PO22 = (102.5 – 75.80) – (30.37) – (–3.30) = –0.37
PO32 = (80.5 – 75.80) – (6.20) – (–3.30) = 1.80
PO42 = (81.0 – 75.80) – (9.03) – (–3.30) = –0.53
PO52 = (42.5 – 75.80) – (–27.80) – (–3.30) = –2.20

PO13 = (53.5 – 75.80) – (–17.80) – (–1.90) = –2.60


PO23 = (104.5 – 75.80) – (30.37) – (–1.90) = 0.23
PO33 = (80.5 – 75.80) – (6.20) – (–1.90) = 0.40
PO43 = (81.0 – 75.80) – (9.03) – (–1.90) = –1.93
PO53 = (50.0 – 75.80) – (–27.80) – (–1.90) = 3.90

The estimate of error based on effects is calculated as

σ̂e = √({185862 − [(6)(2131.998) + (10)(41.54) + (2)(51.6334)] − (2274)²/30} / [30 − (4 + 2 + 8) − 1])
   = √(182.1452/15) = 3.493

df = k(r − 1) = (15)(2 − 1) = 15

and the decision limits for main effects and the interaction are shown in the following table:

            Part                           Operator                       Operator×Part
            kP = 5                         kO = 3                         kOP = 15
α = 0.05    H0.05 = 2.573                  H0.05 = 2.125
            h0.05 = 2.877                  h0.05 = 2.603                  h*0.05 = 3.472
            0 ± (2.877)(3.493)√(4/30)      0 ± (2.603)(3.493)√(2/30)      0 ± (3.472)(3.493)√(8/30)
            0 ± 3.668                      0 ± 2.347                      0 ± 6.263
α = 0.01    H0.01 = 3.306                  H0.01 = 2.796
            h0.01 = 3.696                  h0.01 = 3.424                  h*0.01 = 4.271
            0 ± (3.696)(3.493)√(4/30)      0 ± (3.424)(3.493)√(2/30)      0 ± (4.271)(3.493)√(8/30)
            0 ± 4.714                      0 ± 3.089                      0 ± 7.704
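The effects and decision limits above can be reproduced from the cell averages in Table 16.6. This is our own sketch; the h constants and the σ̂e estimate are taken from the text:

```python
import math

# Cell averages (5 parts x 3 operators) from Table 16.6
cell = [[64.5, 56.0, 53.5],
        [111.5, 102.5, 104.5],
        [85.0, 80.5, 80.5],
        [92.5, 81.0, 81.0],
        [51.5, 42.5, 50.0]]
k_p, k_o, r = 5, 3, 2
n = k_p * k_o * r                                   # 30 observations in all

grand = sum(map(sum, cell)) / (k_p * k_o)           # 75.80
part_eff = [sum(row) / k_o - grand for row in cell]
oper_eff = [sum(cell[i][j] for i in range(k_p)) / k_p - grand for j in range(k_o)]
inter_eff = [[cell[i][j] - grand - part_eff[i] - oper_eff[j]
              for j in range(k_o)] for i in range(k_p)]

sigma_e = 3.493                                     # pooled error estimate
h_p, h_o, h_po = 2.877, 2.603, 3.472                # h(0.05) values from the table
lim_part = h_p * sigma_e * math.sqrt((k_p - 1) / n)                 # ~3.67
lim_oper = h_o * sigma_e * math.sqrt((k_o - 1) / n)                 # ~2.35
lim_inter = h_po * sigma_e * math.sqrt((k_p - 1) * (k_o - 1) / n)   # ~6.26
```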

The ANOME plot is presented in Figure 16.14. The large part-to-part differences
are seen to be highly significant at the α = 0.01 level. The operator-to-operator differ-
ences that were seen in the gauge R&R diagnostic plots are not only evident, but they
are deemed to be statistically significant at the α = 0.01 level as well. The lack of a sig-
nificant operator-by-part interaction is consistent with the nearly parallel lines seen in
Figure 16.9.
This plot provides much of the same information as the gauge R&R diagnostic
plots, but goes further by adding a measure of statistical significance—a feature that is a
useful addition to the gauge R&R analysis.
Figure 16.14 ANOME chart for Case History 16.1.



16.6 PRACTICE EXERCISES


1. What are three examples of problems associated with a measurement system
that can mislead someone investigating it?
2. When is a nested design preferred over a crossed design?
3. Explain how repeatability and reproducibility are represented in a gauge
R&R study.
4. The width of a particular component supplied by a vendor is a critical quality
characteristic. The width specification is 69 ± 0.4mm. Two inspectors were
chosen from the Goods Inward inspection department and seven parts were
taken at random for the study. Both inspectors measured the width of all parts
twice, using a dial vernier caliper accurate to within 0.02mm. The data taken
during the gauge R&R study are given in the table below:

Inspector A Inspector B
Part 1st Trial 2nd Trial 1st Trial 2nd Trial
1 69.38 69.60 69.62 69.52
2 69.72 69.80 69.78 69.90
3 69.58 69.70 69.70 69.62
4 69.50 69.50 69.46 69.50
5 69.48 69.40 69.50 69.42
6 69.56 69.40 69.68 69.64
7 69.90 70.02 69.94 69.88

a. Compute the gauge repeatability and reproducibility using a 5.15σ
spread that encompasses 99 percent of the variation expected.
b. Determine the R&R variation of the gauge and the part-to-part variation
using a 5.15σ spread that encompasses 99 percent of the variation
expected. Is the measurement system acceptable based on your
% R&R result?
c. Express the estimates in items (a) and (b) as a:
(1) percent of the tolerance, and
(2) percent of the total variation
d. Estimate the variance components for this study.
e. Determine the number of distinct categories that the measurement system
is capable of distinguishing. Is this system acceptable?

f. Assuming that any inspector bias can be eliminated, what is the discrimi-
nation ratio for the resulting measurement system? Compare this result to
that of item (e).
g. Create the ANOME plot for this data and compare your conclusions
from this chart to those from the gauge R&R analysis. Use α = 0.01. How
do they agree?
5. The worst-case uncertainty is defined as ±2.575σ̂e. Under what condition
would such a metric make sense? If appropriate, what is the worst-case
uncertainty for the study in Exercise 4?
6. The median uncertainty is defined as ±(2/3)σ̂e. Under what condition would
such a metric make sense? What percent of actual measurements should fall
within this interval?
7. The effective measurement resolution is the maximum of the smallest
resolution of a measurement and the median uncertainty. What is the effective
measurement resolution in Exercise 4?
17
What’s on the CD-ROM

In an effort to encourage the use of computers for statistical analysis of data, a com-
pact disk (CD-ROM) has been included since the third edition of this text. The CD
has an auto-play menu that provides access to all of its contents. This chapter will
discuss the contents of the disk, and how it can be used by students who may be using
the text for individual study or as part of a formal program of coursework. The head-
ings within this chapter, with the exception of the last one, represent the subdirectory
names on the CD. The authors hope that the reader will find the access to the data files
on the disk a time-saver when doing the practice exercises. In addition, programs have
been included to facilitate use of the analysis of means procedure. All of the files on the
CD are meant to be used on a PC. Macintosh users who have the ability to convert Word
or Excel files for the PC should be able to use them as well.

\Datasets & Solutions to Practice Exercises


The data sets used in the practice exercises throughout this book have been stored in a
subdirectory entitled “\Datasets & Solutions to Practice Exercises.” Under this direc-
tory, there are folders for each of the chapters in the text. Within each of these folders,
that is, Chapter01, the reader will find several types of files:
• Word document (.doc) file containing solutions (not detailed) for each of the
practice exercises (will be discussed later in this section).
• Adobe Acrobat (.pdf) file identical to the Word file for those who do not
have access to Microsoft Word. A freeware copy of the installation file for
Adobe Acrobat Reader 5.1 can be found in the root directory of the CD
(AcroReader51_ENU.exe). Double-click on the file in the Explorer program
and follow the onscreen instructions. The latest version of Acrobat Reader
can be found at www.adobe.com.


• Excel data set (.xls) files that the reader can use for analysis directly within
Excel or for reading into another program for analysis, that is, MINITAB.
• MINITAB worksheet (.mtw) files that the reader can bring directly
into MINITAB for immediate analysis.
Note that the file name format for the latter files is of the form:

Q(chapter #)-(exercise #).extension

and can be accessed directly from the CD menu.


For example, if the reader wanted to do Exercise 6 in Chapter 8, then the files of
interest would be Q8-6.xls or Q8-6.mtw (depending on where the reader wanted to do
the analysis of the exercises). For some Chapter 5 exercises, the Excel file contains
functions set up to solve the exercise, and the reader can follow this approach to solve
these and similar exercises on their own. In these situations, no MINITAB file is
provided, though the solution could be obtained via MINITAB commands.
Chapter 9 does not have any data sets associated with it.
The practice exercise solution files (.doc and .pdf) have file names according to
this format:

\Chapter (#)\Chapter (#) Solutions.extension

For example, if a student wishes to do one or more practice exercises in Chapter 11,
the solutions file can be found in \Chapter 11\Chapter 11 Solutions.doc or \Chapter
11\Chapter 11 Solutions.pdf. Within these files, the authors have provided cursory solu-
tions to the reader for each of the practice exercises. The solutions are intended to be
used by the reader to check the solution, but not necessarily the detailed method by
which the solution was obtained. It would be expected that a student would be able to
get further guidance from this text and/or an instructor.

\Excel 97 & 2000 Viewer


\Word 97 & 2000 Viewer
Recognizing that not all readers will have ready access to the latest spreadsheet or word
processor software, freeware viewer installation programs have also been provided on
the CD for opening and viewing the files in the previous section. Note that these may
not be the latest versions of these Microsoft viewers. Readers are encouraged, if they
have Internet access, to download a more current version from the www.microsoft.com
Web site.
To install the viewer, the reader can select it from the CD menu or copy the
viewer.exe file from the desired CD directory “\Excel 97 & 2000 Viewer” or “\Word 97
& 2000 Viewer” to a temporary file on their computer’s hard disk. Double-clicking on
the file in the Explorer program will launch the installation program. The reader only
needs to follow the given instructions to install the viewer.

\Mica Thickness Data


The mica thickness example has been used throughout this book to emphasize the tech-
niques presented for the analysis and presentation of data. Since many readers use
spreadsheet programs to work with data files, the mica thickness data have been entered
into a spreadsheet (IOD.xls) for readers to see how different analyses discussed in the
text can be performed within Excel. It can be found in the “\Mica Thickness” directory,
or accessed directly from the CD menu. Figure 17.1 is an example of how Excel can be
used to duplicate the histogram of the mica thickness data that is shown in Figure 1.1.
In fact, any of the control charts discussed in this text can also be created within a
spreadsheet, such as Excel. One advantage of using a spreadsheet to create control
charts is that the process can be automated oftentimes with a macro, which can be either
a series of spreadsheet commands or be based on a computer language such as Visual
Basic for Applications (VBA). Another advantage is that the reader becomes more inti-
mate with the mechanics of creating the control chart in the process.
Figure 17.2 shows the EWMA chart for the mica thickness data. As discussed in
Chapter 7, the exponentially-weighted moving average chart can be an effective means
of working with successive observations that are not necessarily independent of one
another. Unfortunately, the calculations for the plotted points and control limits are more
complex than for conventional control charts. Fortunately, these calculations are itera-
tive and lend themselves nicely to adaptation in a spreadsheet program.
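As one illustration of that iterative structure, a minimal EWMA sketch (the function, the weight λ, and the example data are our own, not taken from the spreadsheet):

```python
def ewma(points, lam=0.25, start=None):
    # z_t = lam * x_t + (1 - lam) * z_(t-1); the start value defaults to
    # the first observation.
    z = points[0] if start is None else start
    out = []
    for x in points:
        z = lam * x + (1 - lam) * z
        out.append(z)
    return out

smoothed = ewma([10, 12, 11, 13], lam=0.5, start=10)
```

Each plotted point depends only on the new observation and the previous smoothed value, which is exactly why the calculation adapts so readily to a spreadsheet column.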


Figure 17.1 Excel histogram of the mica thickness data, comparable to Figure 1.1.

Figure 17.2 Exponentially-weighted moving average and range charts of the mica thickness data,
comparable to Figure 7.7.

The process capability and performance indices discussed in Chapter 8 can also be
calculated easily within a spreadsheet. The IOD.xls spreadsheet also contains a work-
sheet tab showing these calculations for the mica thickness data that has been evaluated
in four time periods and over the entire data set.1

\Software
Inasmuch as it is not possible to package a more comprehensive software program with
this book without dramatically driving up its cost, the authors have included some use-
ful programs for doing a number of the analyses presented. Most notably, an Excel add-
in program is located in the “\Software” directory (ANOM48.xla).2 This add-in DOES
work with Excel 97 and higher, but will not work in Excel 95. It handles both attributes

1. Readers will also find an additional mica thickness data set that was created by George Firmstone, formerly of
Corning Incorporated, to illustrate what the control charts, histogram, and process capability/performance calculations
would look like if the mica thickness process experienced good and bad time periods. This file is named IOD New
Mica Data.xls, and it can be found on the CD. It demonstrates the danger of making a short-term evaluation of any
manufacturing process and assessing its control and capability without considering long-term variation.
2. It was originally developed by Casey Volino of Corning Incorporated, and has been further extended by one of the
authors (Neubauer) to handle treatment effects (ANOME) and three factors for both variables and attribute data. It will
do many of the analysis of means plots that are shown in the text, including nested designs with two factors.

(proportions or count) data and variables (measurement) data for up to three variables,
excluding the response. The add-in can also handle the case of unequal sample sizes
among factor levels.
The program gives the user the option to select a single set of decision limit lines
or a dual set (as is seen in many of the text examples). Dual limits combine either α =
0.05 with α = 0.10, for situations where you are looking for effects to be present and
you are willing to relax the Type I error a bit, or α = 0.05 with α = 0.01, for situations
when you want to be certain an effect is significant with lower risk. Single limits can be
chosen from a variety of alpha values (0.10, 0.05, 0.01, and 0.001).
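As a rough sketch of how such decision limits arise (not the add-in's internal algorithm), a one-way variables ANOM limit can be computed with a Bonferroni-adjusted t value — the approach behind Ott's original limits, as noted later in this chapter. The function below, and its use of scipy, are our own illustration:

```python
import math
import numpy as np
from scipy import stats

def anom_limits(groups, alpha=0.05):
    """Decision limits for a one-way variables ANOM, using a
    Bonferroni-adjusted t value in place of exact critical factors.
    Assumes k groups of equal size n."""
    k = len(groups)
    n = len(groups[0])
    grand_mean = float(np.mean([x for g in groups for x in g]))
    # pooled within-group standard deviation, with k*(n - 1) df
    s = math.sqrt(sum(np.var(g, ddof=1) for g in groups) / k)
    df = k * (n - 1)
    h = stats.t.ppf(1 - alpha / (2 * k), df)   # Bonferroni adjustment
    half_width = h * s * math.sqrt((k - 1) / (k * n))
    return grand_mean - half_width, grand_mean + half_width
```

Running it twice, once with α = 0.05 and once with α = 0.01, reproduces the idea behind the dual limit sets: the α = 0.01 limits are simply wider.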
You will need to copy this file into your “\Office\Library” subdirectory under the
“\Microsoft Office” subdirectory on your PC’s hard drive.3 (Note that the “\Microsoft
Office” subdirectory is often in the “C:\Program Files” directory.) Once copied to this
folder, just go into Excel and choose Tools, then Add-ins and select the box next to the
Analysis of Means “Plus” option and click OK. Once back into Excel, you will see
the ANOM48 option on the main menu at the top of the screen.
To run the add-in, click on the ANOM48 option and select the Analysis of Means
‘Plus’ item from the dropdown menu. A dialog box will come up asking you to select
the type of data you wish to analyze. The next dialog box will ask you to select the type
of model that fits your experiment. For variables data, the user is allowed to select
either a factorial model or a nested model in two factors. For attributes data, the only
possible selection is the factorial model.
The following dialog box prompts you to enter the data for the response and variables
of your experiment. Just drag the mouse over the column containing the response data
(including the first row as the label). Next click on the second option box and drag the
mouse over the data for the first factor (including the first row as a label). If desired, and
if analyzing variables data, click on the third option box and drag the mouse over the data
for the second factor (including the first row as a label). Note that if either two or three
factors are entered, a cautionary message box appears to tell the user that the ANOME
method is most appropriate for the analysis of multifactor experiments, particularly for
the analysis of interactions. The user will not have the option to select the overall average
as the centerline value for multifactor data sets.4 Hit Enter and the box will go away.
Once the data has been identified, select what limit(s) you want to use to evaluate
the data (you should choose alpha before the analysis!) and click OK. A dialog box will
appear and give you the opportunity to change the format of the value of the decision
limits. In the case of proportion data, the user can choose either a fraction defective
(default) or percentage format.

3. Newer versions of Windows will typically store the Excel add-in under the “Documents and Settings” folder. If the
account the user is logged into is called “Owner,” the subdirectory would be “\Application Data\Microsoft\AddIns”.
If you are unsure, try to save any Excel file as a Microsoft Excel Add-In (.xla) file and see where the directory is
located that Windows defaults to. A shortcut to running the add-in is to double-click on the file on the CD. This will
open up Excel with the add-in appearing on the top menu. You will need to open a new worksheet and enter data or
open an existing worksheet with data to use the program.
4. For the case of multifactor data sets of two or more factors, main effects can be independently evaluated with a cen-
terline equal to the overall average by analyzing them one at a time.
560 Part III: Troubleshooting and Process Improvement

The program will produce two types of tabs for each analysis—one type of tab
contains the ANOM plot (with all labels and titles) for main effects and interactions, and
the other is the Calculations tab that contains the data that is plotted (don’t delete this
tab, as the plot will lose its link to the data!). Note that for the analysis of three factors, there
will be separate ANOME tabs for main effects and interactions.
Figure 17.3 shows the ANOM48.xla output for the fully nested design in Section
15.7 (copper content of castings). The data is presented in Figure 15.11 and the output
can be compared to the ANOME chart presented in Figure 15.12.
The flexibility of this add-in to analyze small designs allows for two-level as well as
multilevel designs, that is, 2², 2³, 2 × 3, 4 × 2, 2 × 3 × 4, and so on. For example, we
can reanalyze the Case History 15.3 (Lengths of Steel Bars) data from Table 15.4. The
ANOME plots are shown in Figure 15.17 for main effects and interactions. It should be
noted for 2^p experiments that the text uses a convention of like-subscript and unlike-subscript
averages for plotting effects, which maintains equal ANOM decision limits for all
effects (see Chapter 11). However, the add-in works with the individual cell averages in
the case of interactions, as discussed for ANOME for treatment effects in Chapter 15.
This approach requires that the decision limits be based on the Sidak factors
in Table A.19. Since these factors produce more conservative limits, the decision limits
will typically be wider for the interaction effects.
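The effect of the number of simultaneous comparisons on the width of the limits can be sketched numerically. The helpers below are our own illustration, using Šidák- and Bonferroni-adjusted t values rather than the tabulated Sidak factors of Table A.19:

```python
from scipy import stats

def sidak_h(alpha, k, df):
    """t critical value with the Sidak adjustment for k simultaneous
    comparisons: alpha* = 1 - (1 - alpha)**(1/k)."""
    alpha_star = 1 - (1 - alpha) ** (1 / k)
    return stats.t.ppf(1 - alpha_star / 2, df)

def bonferroni_h(alpha, k, df):
    """t critical value with the Bonferroni adjustment alpha/k."""
    return stats.t.ppf(1 - alpha / (2 * k), df)
```

As k grows — and interaction cells outnumber main-effect means — either adjusted value grows, which is why interaction limits come out wider; for a given k, the Šidák value is never larger than the Bonferroni one.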
In Figures 17.4 and 17.5, the add-in ANOME plot duplicates the results for the main
effects and interactions seen in Figure 15.17. The ANOME decision limits computed by
the Excel add-in agree with those given in the text as both depend on a calculation of
the standard deviation using the treatment effects, not one based on the average range.
Unbalanced data has been discussed in some detail in this edition of the text. The add-
in is capable of determining proper ANOM decision limits in situations where data may
be missing in a design.5 As an example, we can use the data of Figure 15.7 (two-factor
crossed factorial experiment—density of a photographic film plate) that is presented in
Figure 15.8. As discussed earlier, developer strength (A) and development time (B) were
both significant effects while the AB interaction was not significant. Figure 17.6 shows
the ANOME chart for the original balanced dataset.
We can compare this plot to the ANOME plot shown in Figure 15.8. Both plots
result in the same conclusions for the developer strength and development time differ-
ences, as well as the interaction treatment effects.
A closer examination of the ANOME decision limit values shows slight differences
from those given in the text. The reason for this is simply that the limits calculated in
Chapter 15 are based on an error term derived from the range. The add-in uses a pooled
estimate of the standard deviation that is equivalent to the value of ŝe defined in
Equation (15.1), and carries a higher number of degrees of freedom for error.
Now suppose that in the process of collecting the data we were unable to collect
densities on all of the photographic film plates. Perhaps some of the plates were taken

5. L. S. Nelson, “Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1
(January 1983): 40–44. This seminal paper presented in the Ott Memorial issue discusses how decision limits can be
determined when sample sizes are unequal.
[Figure 17.3 — nested ANOM chart, no standard given: main effects (Casting A) and
nested effects (Sample B within A) plotted against Cu content; UDL(0.050) = 0.0592 and
LDL(0.050) = −0.0592 for castings, UDL(0.050) = 0.0678 and LDL(0.050) = −0.0678 for
samples within castings, CL = 0.0000.]

Figure 17.3 ANOM.xla add-in version of the ANOME plot for copper content of two samples
from each of 11 castings (data from Figure 15.11) shown in Figure 15.12.

[Figure 17.4 — three-way ANOM on length of steel bars by Time (A), Machine (B), and
Heat Treatment (C), no standard given: main effects plotted with dual decision limits at
α = 0.05 and α = 0.01 (±0.51/±0.67, ±0.86/±1.08, and ±1.11/±1.37), CL = 0.00.]

Figure 17.4 ANOM add-in version of the ANOM plot for a 2² design in Case History 14.1.

to be used elsewhere, or were damaged in the course of the experiment. You can re-run
the ANOM add-in with the resulting unbalanced data and focus again on the main
effects and interaction. Suppose the data omitted from the original data in Figure 15.7
were: A = 1, B = 10, Density = 2; A = 3, B = 18, Density = 10; and A = 3, B = 18, Density
= 8. Figure 17.7 shows the new ANOME plot of the experiment.
In this unbalanced example, we see that the basic conclusions drawn from Figure
17.6 have now changed. The degree of significance has been reduced due to the smaller
sample sizes within each of the plotted means, which causes the limits to become wider.
The significance of the developer strength (A) effect in the ANOME plot is still
apparent, and the pattern of the plotted points for the levels of this factor remains the
same. The development time (B) effect is no longer significant at the α = 0.05 level as
before, even though the pattern of the factor level means is virtually unchanged.
Another advantage of the ANOME plot is that it may point to a real, physical
relationship between the levels of a factor and the response even when the effect is deemed
insignificant at a prescribed level of risk. For example, the apparent linear relationship in
developer time (B) in Figure 17.7 is not statistically significant. The physical relationship
may exist, but there may be insufficient data to reach a statistically significant result.
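Why smaller groups get wider limits can be sketched numerically. The helper below is our own illustration: it uses the standard unequal-n variance term with a Bonferroni-adjusted t standing in for the exact critical values of footnote 5:

```python
import math
import numpy as np
from scipy import stats

def anom_limits_unbalanced(groups, alpha=0.05):
    """Per-group ANOM decision limits when sample sizes differ.
    Uses the variance term sqrt((N - n_i)/(N * n_i)) with a
    Bonferroni-adjusted t standing in for exact critical values."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    N = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / N
    df = N - k
    # pooled within-group variance
    s2 = sum((len(g) - 1) * np.var(g, ddof=1) for g in groups) / df
    h = stats.t.ppf(1 - alpha / (2 * k), df)
    return [(grand_mean - h * math.sqrt(s2 * (N - n) / (N * n)),
             grand_mean + h * math.sqrt(s2 * (N - n) / (N * n)))
            for n in sizes]
```

The smaller a group’s n_i, the larger (N − n_i)/(N·n_i), so its decision limits are wider — the behavior seen when observations are dropped from the film-plate data.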
While many other analyses can be done with the ANOM48.xla add-in, it has some
limitations. For instance, it only addresses the no standard given situation. Since this is
typical of nearly all practical data sets, this is not considered a major drawback. Furthermore,
the user should confirm that the ranges are in control prior to implementing the
[Figure 17.5 — three-way ANOM on length of steel bars by Time (A), Machine (B), and
Heat Treatment (C), no standard given: two-factor and three-factor interaction cells
plotted with UDL/LDL(0.050) values ranging from ±0.97 to ±2.31, CL = 0.00.]

Figure 17.5 ANOME add-in plot for interactions based on data in Figure 15.11.

[Figure 17.6 — two-way ANOM for density of photographic film plate by developer
strength (A) and developer time (B), no standard given: main effects and interaction
panels with dual decision limits at α = 0.05 and α = 0.01 (±0.923/±1.181 and
±1.578/±1.915), CL = 0.000.]

Figure 17.6 ANOME plot produced for a balanced data set based on Figure 15.7.

[Figure 17.7 — the same two-way ANOM after three observations were omitted: the dual
decision limits widen to ±1.089/±1.382 and ±1.699/±2.071, CL = 0.000.]

Figure 17.7 ANOME plot produced for an unbalanced data set based on Figure 15.7.

ANOM or ANOME analyses. Fortunately, the program does test internally to be sure that
the normal approximation is appropriate for attributes data.
Readers who need to perform ANOM analyses on data sets based on factorial
designs for up to seven factors can use a program, ANOMBED.exe, that is on the CD
and which was initially written by Schilling, Schlotzer, Schultz, and Sheesley6 using
Ott’s original limits based on the Bonferroni inequality. It has been updated to exact
limits by Peter Nelson.7 This program produces analysis of means plots of the fixed
effects for either fixed or mixed (fixed and random) effects models, as well as an
ANOVA table with expected mean squares, F ratios, and levels of significance. It has
been compiled to run in the MS Windows environment (Windows 95/98/NT/2000/XP).
Users should consult the README.1st file for information relative to the data sets used
by Nelson in his paper.8 Data files containing these data sets have been provided on the
CD as well. The results of the ANOM analysis can be viewed on the display screen of
the computer or written to an output file that the user specifies.
For those who have access to a FORTRAN compiler, and want to make some changes
to the interface, the source code has also been added to the CD (ANOMBED.for).
Finally, a one-way ANOM program (ANOMONE.exe) written by Sheesley has
been compiled and included on the CD because it offers the choice between an ANOM
chart or a control chart for attributes or variables data.9
This directory also contains two other subdirectories:
• Graph Paper Printer (freeware). One of the best freeware graph paper–making
programs found on the Internet. An updated version can be found on the
author’s Web site: http://perso.easynet.fr/~philimar/. The types of graph
papers that can be made are too many to mention here. Just click on the
self-extracting file gpaper.exe to install.
• SOLO Probability Calculator (freeware). This program is no longer available
but was distributed as freeware several years ago. This program is a great
tool to use on the desktop of your PC and provides a plethora of probability
calculations that eliminates the need for tables for many distributions. Open
the readme.wri file and follow the instructions to install.

Selected Case Histories


Some selected case histories from the third edition were moved to the CD to help make
room for the additional material added to this edition. They can be found in the

6. E. G. Schilling, G. Schlotzer, H. E. Schultz, and J. H. Sheesley, “A FORTRAN Computer Program for Analysis of
Variance and Analysis of Means,” Journal of Quality Technology 12, no. 2 (April 1980): 106–13.
7. P. R. Nelson, “The Analysis of Means for Balanced Experimental Designs,” Journal of Quality Technology 15,
no. 1 (January 1983): 45–54.
8. Ibid.
9. J. H. Sheesley, “Comparison of k Samples Involving Variables or Attributes Data Using the Analysis of Means,”
Journal of Quality Technology 12, no. 1 (January 1980): 47–52.

“\Selected Case Histories” directory under separate chapter subdirectories, or accessed
directly from the CD menu. The format of the file names for these case histories is:

CH (chapter #)-(problem#).pdf

Analysis of Means Library


Readers who are interested in learning more about the analysis of means technique will
find the “\Analysis of Means Library” directory to hold a treasure trove of papers published
in the Journal of Quality Technology. Most of these papers have been referenced
in this text and can be accessed directly from the CD menu. While this collection does
not include any of the Industrial Quality Control and Technometrics papers referenced
here, it does represent much of the body of knowledge on ANOM and ANOME. All of
these papers are in PDF form so they can be easily opened with the Adobe Acrobat
Reader program supplied on the CD.

Other Files of Interest


The CD also includes some other programs in the root directory, which can be
accessed directly from the CD menu, that the reader may find useful:
• GenerateAnomFactors.xls. An Excel spreadsheet that can be used to generate a
number of ANOM critical values (exact or otherwise) that have been published
by various authors over the years.
• Producing Statistical Tables Using Excel.htm. An HTML file that can be read
with a browser program that describes how standard statistical tables (F, t, Z,
and so on) can be generated easily within Excel. For readers who can’t always
find the tables they need, this is a good way to create their own. However, it is
important to note that published tables are generally considered to be more
exact than those produced in the manner discussed in this file. Fortunately, this
issue is addressed, and the degree of error is small enough to be no more than a
minor issue from a practical viewpoint.
• Where Do the Control Chart Factors Come From.pdf. A paper written by
Edward Schilling discussing the development of the control chart factors
found in Table A.4.
• Binomial Nomograph.pdf. Blank nomograph paper for determining binomial
probabilities graphically. This nomograph provides reasonable approximations
to probabilities found in Table A.5.
• Thorndike Chart.pdf. Blank nomograph paper for determining Poisson
probabilities graphically. This nomograph is the same as Table A.6.
• Normpaper.pdf. Freeware copy of blank normal probability paper for
manual plotting.
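As an analogous sketch outside Excel (the layout is our own; scipy stands in for worksheet functions such as TINV), a small two-sided Student’s t table can be generated in a few lines:

```python
# Generate a small two-sided Student's t table, analogous to the
# Excel approach described above (scipy replaces worksheet functions).
from scipy import stats

alphas = [0.10, 0.05, 0.01]
print("df  " + "  ".join(f"t({a})" for a in alphas))
for df in (5, 10, 30):
    row = [stats.t.ppf(1 - a / 2, df) for a in alphas]
    print(f"{df:>2}  " + "  ".join(f"{t:6.3f}" for t in row))
```

The same two lines of logic, with `stats.f` or `stats.norm` in place of `stats.t`, produce F and Z tables.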
18
Epilogue

Every process and every product is maintained and improved by those who combine
some underlying theory with some practical experience. More than that,
they call upon an amazing backlog of ingenuity and know-how to amplify and
support that theory. New-product ramrods are real “pioneers”; they also recognize the
importance of their initiative and intuition and enjoy the dependence resting on their
know-how. An expert can determine just by listening that an automobile engine is in
need of repair. Similarly, an experienced production man can often recognize a recurring
malfunction by characteristic physical manifestations. However, as scientific theory
and background knowledge increase, dependence on native skill and initiative often
decreases. Problems become more complicated. Although familiarity with scientific
advances will sometimes be all that is needed to solve even complicated problems—
whether for maintenance or for improvement, many important changes and problems
cannot be recognized by simple observation and initiative no matter how competent the
scientist. It should be understood that no process is so simple that data from it will not
give added insight into its behavior. But the typical standard production process has
unrecognized complex behaviors that can be thoroughly understood only by studying
data from the product it produces. The “pioneer” who accepts and learns methods of scientific
investigation to support technical advances in knowledge can be an exceptionally
able citizen in an area of expertise. Methods in this book can be a boon to such
pioneers.
This book has presented different direct procedures for acquiring data to suggest the
character of a malfunction or to give evidence of improvement opportunities. Different
types of data and different methods of analysis have been illustrated, which is no more
unusual than a medical doctor’s use of various skills and techniques in diagnosing the
ailments of a patient. It cannot be stressed too much that the value and importance of
the procedure or method are only in its applicability and usefulness to the particular
problem at hand. The situation and the desired end frequently indicate the means.


Discussing the situation with appropriate personnel, both technical and supervisory,
at a very early date, before any procedures are planned, will often prevent a waste of
time and even avoid possible embarrassment. It will also often ensure their subsequent
support in implementing the results of the study; but you may expect them to assure you
that any difficulty “isn’t my fault.” Often a study should be planned, expecting that it
will support a diagnosis made by one or more of them. Sometimes it does; sometimes
it does not. Nevertheless, the results should pinpoint the area of difficulty, suggest the
way toward the solution of a problem, or even sometimes give evidence of unsuspected
problems of economic importance. Properly executed, the study will always provide
some insight into the process. A simple remedy for a difficulty may be suggested where
the consensus, after careful engineering consideration, had been that only a complete
redesign or major change in specifications would effect the desired improvements.
Many of the case studies, examples, and discussion in this book relate to “statistical
thinking.”1 This has been defined as
. . . a philosophy of learning and actions based on the following fundamental
principles:
• All work occurs in a system of interconnected processes,
• Variation exists in all processes, and
• Understanding and reducing variation are keys to success.2
The emphasis is on reducing variation. To do so demands recognition that work
occurs within a system of processes that are often ill-defined, causing differences in
the way the processes and the system are understood and implemented. Within these
processes, variation exists and generally must be reduced or eliminated before the
work can be successfully and consistently performed. Therefore, the objective of statistical
thinking is to eliminate variation not only within individual processes but also
within the management of the system in which the processes are directed. Statistical
thinking does not just involve a collection of tools but rather is directed toward understanding
the process and the sources of variation within which data are collected and
the tools employed.
These relationships are aptly demonstrated in Figure 18.1, developed by the ASQ
Statistics Division.3 In their special publication on Statistical Thinking, they suggest the
following “tips” for successful implementation:
• Get upper management buy-in to the philosophy
• Start small

1. R. N. Snee, “Statistical Thinking and Its Contribution to Total Quality,” The American Statistician 44 (1990):
25–31.
2. American Society for Quality, Glossary and Tables for Statistical Quality Control (Milwaukee: ASQ Quality
Press, 2004).
3. G. Britz, D. Emerling, L. Hare, R. Hoerl, and J. Shade, “Statistical Thinking,” ASQ Statistics Division Newsletter,
Special Edition (Spring 1996).

[Figure 18.1 — diagram relating statistical thinking (a philosophy focused on process and
variation) to statistical methods (analysis of data leading to improvement and action).]

Figure 18.1 The relationship between statistical thinking and statistical methods (ASQ
Statistics Division).

• Designate a core team, drawing from all responsible groups


• Include frontline workers
• Go after “low hanging fruit” first
• Use the “Magnificent 7” tools to gather data:
– Flowchart
– Check sheet
– Run chart
– Histogram
– Pareto chart
– Scatter plot
– Cause-and-effect diagram
• Use the plan–do–check–act (PDCA) cycle to ensure the process is dynamic
See also Balestracci for further insights.4
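One of the “Magnificent 7” tools above, the Pareto chart, boils down to a simple ranking computation. The defect categories and counts below are invented for illustration:

```python
# Illustrative Pareto-style ranking of defect categories.
# The category names and counts are made up for the example.
defects = {"scratches": 48, "dents": 12, "misalignment": 27,
           "discoloration": 8, "cracks": 5}

ranked = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)
total = sum(defects.values())
cum = 0
for name, count in ranked:
    cum += count
    print(f"{name:<14}{count:>4}{100 * cum / total:>8.1f}%")
```

Plotting the ranked bars with the cumulative line gives the familiar chart; here the top two categories alone account for 75 percent of the defects — the “low hanging fruit” the tips recommend going after first.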
Statistical thinking is the driving force behind the Six Sigma methodology discussed
in Chapter 9. In that chapter, Six Sigma was defined as “a disciplined and
highly quantitative approach to improving product or process quality.” The term “Six
Sigma” refers to the goal of achieving a process that produces defects in no more than
3.4 parts per million opportunities (assuming a 1.5-sigma process shift), as seen in
Figure 18.2.
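The 3.4 ppm figure can be checked directly: with a 1.5-sigma shift, the nearer specification limit sits 6 − 1.5 = 4.5 standard deviations from the process mean, and the normal tail beyond that point gives the long-term defect rate. A quick sketch (scipy assumed):

```python
from scipy.stats import norm

# With the process mean shifted 1.5 sigma toward one spec limit,
# the nearer limit lies 6 - 1.5 = 4.5 sigma away; the tail area
# beyond it is the long-term defect rate.
ppm = norm.sf(6.0 - 1.5) * 1e6
print(f"{ppm:.1f} ppm")   # about 3.4 ppm
```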
The implementation of Six Sigma through problem-solving approaches, such as
DMAIC, involves the use of the same ideas used in the implementation of statistical

4. D. Balestracci, “Data ‘Sanity’: Statistical Thinking Applied to Everyday Data,” ASQ Statistics Division Special
Publication (Summer 1998).

[Figure 18.2 — normal curve with lower specification limit (LSL) at −6σ, upper
specification limit (USL) at +6σ, and the process mean shifted 1.5σ.]

Figure 18.2 A Six Sigma process that produces a 3.4 ppm level of defects.

thinking. The Six Sigma methodology is much more formalized, however. The establishment
of a series of “belts” and corresponding training programs makes this approach
much more rigorous than informal methods. The successful track record of companies
that have pursued Six Sigma is certainly a convincing factor, but it should be noted that
there are no guarantees, and that there are just as many companies that have not been
successful with Six Sigma (oftentimes for good reasons).
The Kepner and Tregoe approach for problem analysis is an example of another
proven methodology, one based more on which cause is most rational than on which is
most creative.5 This approach was described in seven steps in Chapter 9.
Ott’s approach to statistical thinking is well illustrated by Case History 6.2 on metal
stamping and enameling. First, a team was formed and the process was outlined (Table
6.5 and Figure 6.6). Then, sources of variation were identified (Table 6.6), data were
collected (Table 6.7) and analyzed (Figure 6.8). Note that Ott always emphasized a further
element, namely, establishment of controls to prevent recurrence of the variation
once identified and eliminated. The reader would do well to follow through the same
steps in Case History 9.1 on black patches on aluminum ingots. In neither case were the
statistical methods elegant or profound, but rather it was the statistical thinking process
that uncovered the source of the variation and solved the problem. Note also that the “tips”
mentioned above were aptly employed in both studies and that all of the “Magnificent 7”
tools are covered in various parts of this book.
An industrial consultant often has the right and authority to study any process or project.
However, this is not exactly a divine right. It is usually no more than a “hunting or
fishing” license; you may hunt, but no game is guaranteed. So find some sympathetic,

5. C. H. Kepner and B. B. Tregoe, The Rational Manager (New York: McGraw-Hill, 1965). This problem-solving
approach is widely considered to be the best in the business community, and to a large degree in the manufacturing
community as well.

cooperative souls to talk to. They may be able to clear the path to the best hunting
ground. Some of the most likely areas are:
1. A spot on the line where rejects are piling up.
2. Online or final inspection stations.
3. A process using selective assembly. It is fairly common practice in production
to separate components A and B each into three categories: low, medium,
and high, and then assemble low A with high B, and so on. This process
is sometimes a short-term necessary evil, but there are inevitable problems
that result.
Many things need to be said about the use of data to assist in troubleshooting. We
may as well begin with the following, which differs from what we often hear.
Industry is a mass production operation; it differs radically from most agricultural
and biological phenomena, which require a generation or more to develop data. If you
do not get enough data from a production study today, more data can be had tomorrow
with little or no added expense. Simple studies are usually preferred to elaborate non-
replicated designs that are so common in agriculture, biology, and some industry
research and/or development problems.
Throughout this book, much use has been made in a great variety of situations of a
simple yet effective method of studying and presenting data, the graphical analysis of
means. This method makes use of developments in applying control charts to data, and
a similar development in designing and analyzing experiments. Let us look at some of
its special advantages:
1. Computations are simple and easy. Often no calculator is necessary, but it is
possible to program the ANOM for graphical printout on a computer.
2. Errors in calculation, often obvious in a graphical presentation, may be
identified by even the untrained.
3. The graphical comparison of effects presents the results in a way that will
be accepted by many as the basis for decision and action, encouraging the
translation of conclusions into scientific action.
4. Dealing directly with means (averages), the method provides an immediate
study of possible effects of the factors involved.
5. Not only is nonrandomness of data indicated, but (in contrast to the results
from other analyses) the sources of such nonrandomness are immediately
pinpointed.
6. This analysis frequently, as a bonus, suggests the unexpected presence of
certain types of nonrandomness, which can be included in subsequent
studies for checking.

7. The graphical presentation of data is almost a necessity when interpreting
the meaning of any interaction.
Troubleshooters and others involved in process improvement studies who are familiar
with analysis of variance will find the graphical analysis of means a logical
interpretative follow-up procedure. Others, faced with studying multiple independent
variables, will find that the graphical procedure provides a simple, immediate, and
effective analysis and interpretation of data. It is difficult to repeat too often the importance
to the business of troubleshooting of a well-planned but simple design.
Frequently in setting up or extending a quality control program, some sort of organized
teaching program is necessary. Whenever possible, an outside consultant should
be the instructor. It is important that the instructor play a key role in troubleshooting
projects in the plant. The use of current in-plant data suggested by class members for
study will not only provide pertinent and stimulating material as a basis for discussion
of the basic techniques of analysis but may actually lead to a discussion of ways of
improving some major production problem. However, not many internal consultants can
keep the sensitive issues often raised by such discussions in check without serious scars.
Quality control requires consciousness from top management to operator and
throughout all departments. Therefore, representatives from purchasing, design, manufacturing,
quality, sales, and related departments should be included in the class for at
least selected pertinent aspects of the program.
And what should the course include? Well, that is what this book is all about. But
to start, keep it simple and basic, encouraging the application of the students’ ingenuity
and know-how to the use of whatever analytical techniques they learn, in the study of
data already available.
Friends and associates of many years and untold experiences sometimes come to
the rescue. Not too long ago, one responded when asked, “Bill, what shall I tell them?”
Slightly paraphrased, here is what he scribbled on a note:
• Come right out and tell them to plot the data.
• The important thing is to get moving on the problem quickly; hence, use quick,
graphical methods of analysis. Try to learn something quickly—not everything.
Production is rolling. Quick, partial help now is preferable to somewhat better
advice postponed. Get moving. Your prompt response will trigger ideas from
them too.
• Emphasize techniques of drawing out the choice of variables to be considered,
asking “dumb, leading questions.” (How does one play dumb?)
• Develop the technique of making the operators think it was all their idea.
• Make them realize the importance of designing little production experiments
and the usefulness of a control chart in pointing up areas where experimenta-
tion is needed. The chart does not solve the problem, but it tells you where and
when to look for a solution.
Chapter 18: Epilogue 573
• Say something like “you don’t need an X̄ and R control chart on every machine
at first; a p chart may show you the areas where X̄ and R charts will be helpful.”
• Introduce the outgoing product quality rating philosophy of looking at the fin-
ished product and noting where the big problems are.
• After the data are analyzed, you have to tell someone about the solution—like
the boss—to get action. You cannot demand that the supervisor follow
directions to improve the process, but the boss can find a way. For one thing,
the boss’s remarks about how the supervisor worked out a problem with you
can have a salutary effect—on the supervisor himself and on other supervisors.
Now Bill would not consider this little outline a panacea for all ailments, but these
were the ideas that popped into his head and they warrant some introspection.
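Where computing help is at hand, Bill's advice about starting with a p chart takes only a few lines to act on. The following Python sketch computes 3-sigma p chart limits from past data; the defective counts and subgroup size in it are hypothetical, supplied only for illustration.

```python
import math

def p_chart_limits(total_defective, total_inspected, subgroup_size):
    """3-sigma limits for a p chart, no standards given (p-bar from past data)."""
    p_bar = total_defective / total_inspected
    sigma_p = math.sqrt(p_bar * (1 - p_bar) / subgroup_size)
    lcl = max(0.0, p_bar - 3 * sigma_p)  # a proportion cannot fall below zero
    ucl = p_bar + 3 * sigma_p
    return lcl, p_bar, ucl

# Hypothetical past data: 120 defectives in 6000 units, subgroups of 300.
lcl, center, ucl = p_chart_limits(120, 6000, 300)
```

Points beyond the upper limit flag the machines or periods worth a closer look with X̄ and R charts.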
If you have read this far, there are two remaining suggestions:
1. Skim through the case histories in the book. If they do not trigger some ideas
about your own plant problems, then at least one of us has failed.
2. If you did get an idea, then get out on the line and get some data! (Not too
much, now.)

18.1 PRACTICE EXERCISES


1 to ∞. Practice does not end here. Find some data. Get involved. Use the
methods you have learned. The answers are not in an answer book, but
you will know when you are correct and the rewards will be great.

Remember— plot the data!
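For readers who would rather start at a keyboard than with graph paper, even a crude text-mode run chart honors this advice. A minimal Python sketch follows; the measurements are hypothetical.

```python
def text_run_chart(values, width=40):
    """Print-style run chart: one row per observation, with '*'
    placed in proportion to the value's position between min and max."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    rows = []
    for x in values:
        pos = int((x - lo) / span * (width - 1))
        rows.append(" " * pos + "*")
    return rows

# Ten hypothetical consecutive measurements from a process:
data = [5.1, 5.3, 5.0, 5.4, 5.2, 5.6, 5.5, 5.8, 5.7, 5.9]
for row in text_run_chart(data):
    print(row)
```

Even a rough picture like this often shows a trend or a shift long before any formal analysis is finished.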


Case Histories

1.1 Solder Joints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45


2.1 Depth of Cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.2 Excessive Variation in Chemical Concentration . . . . . . . . . . . . . . . . . 77
2.3 Filling Vanilla Ice Cream Containers . . . . . . . . . . . . . . . . . . . . . . 80
2.4 An Adjustment Procedure for Test Equipment . . . . . . . . . . . . . . . . . 84
* 2.5 Rational Subgroups in Filling Vials with Isotonic Solution . . . . . . . . . . . 95
3.1 A Chemical Analysis—An R Chart As Evidence of Outliers . . . . . . . . . . 100
4.1 Vial Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.1 Defective Glass Stems in a Picture Tube for a Color TV Set . . . . . . . . . . 143
* 5.2 Incoming Inspection of a TV Component . . . . . . . . . . . . . . . . . . . . 146
5.3 Notes on Gram Weight of a Tubing Process . . . . . . . . . . . . . . . . . . . 147
* 6.1 Outgoing Product Quality Rating (OPQR) . . . . . . . . . . . . . . . . . . . 167
6.2 Metal Stamping and Enameling . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.3 An Investigation of Cloth Defects in a Cotton Mill (Loom Shed) . . . . . . . 176
6.4 Extruding Plastic Caps and Bottles . . . . . . . . . . . . . . . . . . . . . . . 184
6.5 Chemical Titration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.6 Machine-Shop Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.1 Metallic Film Thickness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
8.1 The Case of the Schizophrenic Chopper . . . . . . . . . . . . . . . . . . . . 256
9.1 Black Patches on Aluminum Ingots . . . . . . . . . . . . . . . . . . . . . . . 273
10.1 2³ Experiment on Fuses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
11.1 Spot-Welding Electronic Assemblies . . . . . . . . . . . . . . . . . . . . . . 331

* Detailed case history is located on the CD in the “\Selected Case Histories” subdirectory (see Chapter 17).


11.2 A Corrosion Problem with Metal Containers . . . . . . . . . . . . . . . . . . 334


11.3 End Breaks in Spinning Cotton Yarn . . . . . . . . . . . . . . . . . . . . . . 336
11.4 An Experience with a Bottle Capper . . . . . . . . . . . . . . . . . . . . . . 339
11.5 Comparing Effects of Operators and Jigs in a Glass-Beading Jig Assembly
(Cathode-Ray Guns) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
11.6 A Multistage Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
11.7 Machine Shutdowns (Unequal ri) . . . . . . . . . . . . . . . . . . . . . . . . 351
11.8 Strains in Small Glass Components . . . . . . . . . . . . . . . . . . . . . . . 356
11.9 A Problem in a High-Speed Assembly Operation (Broken Caps) . . . . . . . 361
11.10 Winding Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
12.1 Extruding Plastic Components . . . . . . . . . . . . . . . . . . . . . . . . . . 380
12.2 Automatic Labelers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
12.3 Noisy Kitchen Mixers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
12.4 Screening Some Trace Elements . . . . . . . . . . . . . . . . . . . . . . . . 392
12.5 Geometry of an Electronic Tube . . . . . . . . . . . . . . . . . . . . . . . . . 394
12.6 Defects/unit² on Glass Sheets . . . . . . . . . . . . . . . . . . . . . . . . . 407
13.1 Nickel-Cadmium Batteries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
13.2 Height of Easter Lilies on Date of First Bloom . . . . . . . . . . . . . . . . . 421
13.3 Vials from Two Manufacturing Firms . . . . . . . . . . . . . . . . . . . . . . 423
13.4 Average of Electronic Devices . . . . . . . . . . . . . . . . . . . . . . . . . . 426
14.1 Height of Easter Lilies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
14.2 Assembly of Nickel-Cadmium Batteries . . . . . . . . . . . . . . . . . . . . 444
14.3 An Electronic Characteristic . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
15.1 Possible Advantage of Using a Selection Procedure for Ceramic Sheets . . . . 464
15.2 Adjustments on a Lathe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
15.3 2 × 3 × 4 Factorial Experiment—Lengths of Steel Bars . . . . . . . . . . . . 486
16.1 Gasket Thickness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
An adequate science of control for management
should take into account the fact that measurements
of phenomena in both social and natural science for
the most part obey neither deterministic nor statisti-
cal laws, until assignable causes of variability have
been found and removed.
W. A. Shewhart
“Statistical Quality Control” Trans. ASME,
Ten Year Management Report (May 1942)
Appendix
Tables

Table A.1 Areas under the normal curve.*

Proportion of total area under the curve to the left of a vertical line drawn at μ + Zσ, where Z represents any desired value from
Z = 0 to Z = ±3.09.
Z –0.00 –0.01 –0.02 –0.03 –0.04 –0.05 –0.06 –0.07 –0.08 –0.09
–3.0 0.00135 0.00131 0.00126 0.00122 0.00118 0.00114 0.00111 0.00107 0.00104 0.00100
–2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
–2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
–2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
–2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
–2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
–2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
–2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
–2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
–2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
–2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
–1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
–1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
–1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
–1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
–1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
–1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
–1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
–1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1057 0.1038 0.1020 0.1003 0.0985
–1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
–1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
–0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
–0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
–0.7 0.2420 0.2389 0.2358 0.2327 0.2297 0.2266 0.2236 0.2207 0.2177 0.2148
–0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
–0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
–0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
–0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
–0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
–0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
–0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
Continued
Z +0.00 +0.01 +0.02 +0.03 +0.04 +0.05 +0.06 +0.07 +0.08 +0.09
+0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
+0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
+0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
+0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
+0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
+0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
+0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
+0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
+0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8079 0.8106 0.8133
+0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
+1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
+1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
+1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
+1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
+1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
+1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
+1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
+1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
+1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
+1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
+2.0 0.9773 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
+2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
+2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
+2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
+2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
+2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
+2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
+2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
+2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
+2.9 0.9981 0.9982 0.9983 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
+3.0 0.99865 0.99869 0.99874 0.99878 0.99882 0.99886 0.99889 0.99893 0.99896 0.99900

Source: This table is a modification of one that appears in Grant and Leavenworth, Statistical Quality Control, 4th ed. (New York: McGraw-Hill, 1972): 642–43.

* Following are specific areas under the normal curve.
Cumulative probability   Tail probability   Z        Cumulative probability   Z
0.5 0.5 0 0.99903 3.1
0.75 0.25 0.675 0.99931 3.2
0.80 0.20 0.842 0.99952 3.3
0.90 0.10 1.282 0.99966 3.4
0.95 0.05 1.645 0.99977 3.5
0.975 0.025 1.96 0.99984 3.6
0.98 0.02 2.055 0.99989 3.7
0.99 0.01 2.33 0.99993 3.8
0.995 0.005 2.575 0.99995 3.9
0.998 0.002 2.88 0.99997 4.0
0.999 0.001 3.09 0.99999 4.8
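A reader with a computer can reproduce the body of Table A.1 from the error function in the Python standard library. The following brief sketch is offered as a convenience, not as part of the original table.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z (Table A.1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Spot checks against the table:
#   phi(1.00)  is about 0.8413
#   phi(-2.00) is about 0.0228
#   phi(1.645) is about 0.95
```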

Table A.2 Critical values of the number of runs NR above and below the median in
k = 2m observations (one-tail probabilities).
Significantly Small Significantly Large
Critical Values of NR Critical Values of NR
k m α = 0.01 α = 0.05 α = 0.05 α = 0.01
10 5 2 3 8 9
12 6 2 3 10 11
14 7 3 4 11 12
16 8 4 5 12 13
18 9 4 6 13 15
20 10 5 6 15 16
22 11 6 7 16 17
24 12 7 8 17 18
26 13 7 9 18 20
28 14 8 10 19 21
30 15 9 11 20 22
32 16 10 11 22 23
34 17 10 12 23 25
36 18 11 13 24 26
38 19 12 14 25 27
40 20 13 15 26 28
42 21 14 16 27 29
44 22 14 17 28 31
46 23 15 17 30 32
48 24 16 18 31 33
50 25 17 19 32 34
60 30 21 24 37 40
70 35 25 28 43 46
80 40 30 33 48 51
90 45 34 37 54 57
100 50 38 42 59 63
110 55 43 46 65 68
120 60 47 51 70 74
Source: S. Swed and C. Eisenhart, “Tables for Testing Randomness of Sampling in a Sequence of Alternatives,”
Annals of Mathematical Statistics 14 (1943): 66–87. (Reproduced by permission of the editor.)
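Counting NR for comparison with the table is mechanical. A small Python sketch follows; the sixteen observations are hypothetical.

```python
def runs_about_median(values):
    """Number of runs NR above and below the median of a sequence.
    Observations equal to the median are dropped before counting."""
    s = sorted(values)
    k = len(s)
    median = s[k // 2] if k % 2 else (s[k // 2 - 1] + s[k // 2]) / 2
    above = [x > median for x in values if x != median]
    return 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)

# k = 16 hypothetical observations; Table A.2 gives 4 and 13 as the
# critical values of NR at alpha = 0.01 for k = 16.
data = [3, 5, 2, 8, 9, 7, 1, 4, 10, 12, 11, 14, 13, 16, 15, 6]
nr = runs_about_median(data)  # 5 runs about the median of 8.5
```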

Table A.3 Runs above and below the median of length s in k = 2m observations with k as large
as 16 or 20.

s = 1:  length exactly s:  k(k + 2)/[2²(k − 1)] ≅ k/2² = k/4
        length ≥ s:        (k + 2)/2  (= total number of runs)

s = 2:  length exactly s:  k(k + 2)/[2³(k − 1)] ≅ k/2³ = k/8
        length ≥ s:        (k + 2)(k − 2)/[2²(k − 1)] ≅ k/2² = k/4

s = 3:  length exactly s:  k(k + 2)(k − 4)/[2⁴(k − 1)(k − 3)] ≅ k/2⁴ = k/16
        length ≥ s:        (k + 2)(k − 4)/[2³(k − 1)] ≅ k/2³ = k/8

s = 4:  length exactly s:  k(k + 2)(k − 6)/[2⁵(k − 1)(k − 3)] ≅ k/2⁵ = k/32
        length ≥ s:        (k + 2)(k − 4)(k − 6)/[2⁴(k − 1)(k − 3)] ≅ k/2⁴ = k/16

s = 5:  length exactly s:  k(k + 2)(k − 6)(k − 8)/[2⁶(k − 1)(k − 3)(k − 5)] ≅ k/2⁶ = k/64
        length ≥ s:        (k + 2)(k − 6)(k − 8)/[2⁵(k − 1)(k − 3)] ≅ k/2⁵ = k/32

s = 6:  length exactly s:  k(k + 2)(k − 8)(k − 10)/[2⁷(k − 1)(k − 3)(k − 5)] ≅ k/2⁷ = k/128
        length ≥ s:        (k + 2)(k − 6)(k − 8)(k − 10)/[2⁶(k − 1)(k − 3)(k − 5)] ≅ k/2⁶ = k/64
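The tabulated expressions follow one product pattern: for k = 2m observations the expected number of runs of length s or more is (k + 2) times the product of (m − i)/(k − i) for i = 0, …, s − 1, and the "exactly s" quantity is the difference of successive values. A short Python sketch of that pattern, offered as a convenience rather than part of the original table:

```python
from math import prod

def expected_runs_at_least(k, s):
    """Expected number of runs above/below the median of length >= s
    among k = 2m observations (the right-hand column of Table A.3)."""
    m = k // 2
    return (k + 2) * prod((m - i) / (k - i) for i in range(s))

def expected_runs_exactly(k, s):
    """Left-hand column: expected runs of length exactly s."""
    return expected_runs_at_least(k, s) - expected_runs_at_least(k, s + 1)

# For k = 20: expected total number of runs is (k + 2)/2 = 11, and expected
# runs of length exactly 1 is k(k + 2)/[4(k - 1)] = 440/76.
```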
Table A.4 Control chart limits for samples of ng.

Standards given

Plot: mean X̄ (with μ, σ given); standard deviation s or range R (with σ given); proportion p̂ or
number of defectives np̂ (with p given); defects ĉ or defects per unit û (with c or u given).

Upper control limit:
    X̄:  μ + 3σ/√n = μ + Aσ
    s:  B₆σ = (c₄ + 3√(1 − c₄²))σ        R:  D₂σ = (d₂ + 3d₃)σ
    p̂:  p + 3√(p(1 − p)/n)               np̂:  np + 3√(np(1 − p))
    ĉ:  c + 3√c                           û:  u + 3√(u/n)

Centerline:
    X̄:  μ      s:  c₄σ      R:  d₂σ      p̂:  p      np̂:  np      ĉ:  c      û:  u

Lower control limit:
    X̄:  μ − 3σ/√n = μ − Aσ
    s:  B₅σ = (c₄ − 3√(1 − c₄²))σ        R:  D₁σ = (d₂ − 3d₃)σ
    p̂:  p − 3√(p(1 − p)/n)               np̂:  np − 3√(np(1 − p))
    ĉ:  c − 3√c                           û:  u − 3√(u/n)

No standards given

Plot: mean X̄ of past data (using s̄ or R̄); standard deviation s or range R against past data;
proportion p̂ or number of defectives np̂ against past data; defects ĉ or defects per unit û
against past data.

Upper control limit:
    X̄:  X̿ + A₃s̄  (using s̄)  or  X̿ + A₂R̄  (using R̄)
    s:  B₄s̄ = (1 + (3/c₄)√(1 − c₄²))s̄    R:  D₄R̄ = (1 + 3d₃/d₂)R̄
    p̂:  p̄ + 3√(p̄(1 − p̄)/n)               np̂:  np̄ + 3√(np̄(1 − p̄))
    ĉ:  c̄ + 3√c̄                           û:  ū + 3√(ū/n)

Centerline:
    X̄:  X̿      s:  s̄      R:  R̄      p̂:  p̄      np̂:  np̄      ĉ:  c̄      û:  ū

Lower control limit:
    X̄:  X̿ − A₃s̄  or  X̿ − A₂R̄
    s:  B₃s̄ = (1 − (3/c₄)√(1 − c₄²))s̄    R:  D₃R̄ = (1 − 3d₃/d₂)R̄
    p̂:  p̄ − 3√(p̄(1 − p̄)/n)               np̂:  np̄ − 3√(np̄(1 − p̄))
    ĉ:  c̄ − 3√c̄                           û:  ū − 3√(ū/n)

Continued
ng A A2 A3 B3 B4 B5 B6 c4 d2 d3 D1 D2 D3 D4
2 2.121 1.880 2.659 0.000 3.267 0.000 2.606 0.7979 1.128 0.853 0.000 3.686 0.000 3.267
3 1.732 1.023 1.954 0.000 2.568 0.000 2.276 0.8862 1.693 0.888 0.000 4.358 0.000 2.575
4 1.500 0.729 1.628 0.000 2.266 0.000 2.088 0.9213 2.059 0.880 0.000 4.698 0.000 2.282
5 1.342 0.577 1.427 0.000 2.089 0.000 1.964 0.9400 2.326 0.864 0.000 4.918 0.000 2.114
6 1.225 0.483 1.287 0.030 1.970 0.029 1.874 0.9515 2.534 0.848 0.000 5.079 0.000 2.004
7 1.134 0.419 1.182 0.118 1.882 0.113 1.806 0.9594 2.704 0.833 0.205 5.204 0.076 1.924
8 1.061 0.373 1.099 0.185 1.815 0.179 1.751 0.9650 2.847 0.820 0.388 5.307 0.136 1.864
9 1.000 0.337 1.032 0.239 1.761 0.232 1.707 0.9693 2.970 0.808 0.547 5.393 0.184 1.816
10 0.949 0.308 0.975 0.284 1.716 0.276 1.669 0.9727 3.078 0.797 0.686 5.469 0.223 1.777
11 0.905 0.285 0.927 0.321 1.679 0.313 1.637 0.9754 3.173 0.787 0.811 5.535 0.256 1.744
12 0.866 0.266 0.886 0.354 1.646 0.346 1.610 0.9776 3.258 0.778 0.923 5.594 0.283 1.717
13 0.832 0.249 0.850 0.382 1.618 0.374 1.585 0.9794 3.336 0.770 1.025 5.647 0.307 1.693
14 0.802 0.235 0.817 0.406 1.594 0.399 1.563 0.9810 3.407 0.763 1.118 5.696 0.328 1.672
15 0.775 0.223 0.789 0.428 1.572 0.421 1.544 0.9823 3.472 0.756 1.203 5.740 0.347 1.653
16 0.750 0.212 0.763 0.448 1.552 0.440 1.526 0.9835 3.532 0.750 1.282 5.782 0.363 1.637
17 0.728 0.203 0.739 0.466 1.534 0.458 1.511 0.9845 3.588 0.744 1.356 5.820 0.378 1.622
18 0.707 0.194 0.718 0.482 1.518 0.475 1.496 0.9854 3.640 0.739 1.424 5.856 0.391 1.609
19 0.688 0.187 0.698 0.497 1.503 0.490 1.483 0.9862 3.689 0.733 1.489 5.889 0.404 1.596
20 0.671 0.180 0.680 0.510 1.490 0.504 1.470 0.9869 3.735 0.729 1.549 5.921 0.415 1.585
21 0.655 0.173 0.663 0.523 1.477 0.516 1.459 0.9876 3.778 0.724 1.606 5.951 0.425 1.575
22 0.640 0.167 0.647 0.534 1.466 0.528 1.448 0.9882 3.819 0.720 1.660 5.979 0.435 1.565
23 0.626 0.162 0.633 0.545 1.455 0.539 1.438 0.9887 3.858 0.716 1.711 6.006 0.443 1.557
24 0.612 0.157 0.619 0.555 1.445 0.549 1.429 0.9892 3.895 0.712 1.759 6.032 0.452 1.548

25 0.600 0.153 0.606 0.565 1.435 0.559 1.420 0.9896 3.931 0.708 1.805 6.056 0.459 1.541
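With these factors, chart limits from past data reduce to a few arithmetic steps. The following Python sketch applies A2, D3, and D4 for subgroups of n = 5; the subgroup data are hypothetical.

```python
def xbar_r_limits(subgroups, A2, D3, D4):
    """X-bar and R chart limits, no standards given (Table A.4)."""
    xbars = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(xbars) / len(xbars)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "R": (D3 * r_bar, r_bar, D4 * r_bar),
    }

# Three hypothetical subgroups of n = 5; factors from the table above.
groups = [[10, 12, 11, 9, 13], [11, 10, 12, 12, 10], [9, 11, 10, 12, 11]]
limits = xbar_r_limits(groups, A2=0.577, D3=0.000, D4=2.114)
```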
Table A.5 Binomial probability tables.
The cumulative probabilities of x ≤ c are given in the column headed by p for any sample size. Note that c is the sum of the row heading I and
the column heading J, so c = I + J. Each value shown is P(x ≤ c). To find the probability of exactly x in a sample of n, take P(X = x) = P(X ≤ x) –
P(X ≤ x – 1). To find P(X ≤ x) when p > 0.5, use c = (n – x – 1) under (1 – p) and take the complement of the answer, that is, P(X ≤ x | n, p) =
1 – P(X ≤ n – x – 1 | n, 1 – p).
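These rules are easy to verify by direct computation. A short Python sketch using only the standard library (Python 3.8 or later for math.comb):

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p), the quantity tabulated here."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

# Table spot check: n = 10, p = 0.10 gives P(X <= 2) of about 0.930.
# Complement rule for p > 0.5:
#   P(X <= x | n, p) = 1 - P(X <= n - x - 1 | n, 1 - p)
```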
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
I
n = 1; c = I + J
0 0.990 0.980 0.970 0.960 0.950 0.940 0.930 0.920 0.910 0.900 0.850 0.800 0.750 0.700 0.600 0.500
1 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
n = 2; c = I + J
0 0.980 0.960 0.941 0.922 0.902 0.884 0.865 0.846 0.828 0.810 0.722 0.640 0.562 0.490 0.360 0.250
1 1.000 1.000 0.999 0.998 0.998 0.996 0.995 0.994 0.992 0.990 0.978 0.960 0.938 0.910 0.840 0.750
2 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
n = 3; c = I+J
0 0.970 0.940 0.913 0.885 0.857 0.831 0.804 0.779 0.754 0.729 0.614 0.512 0.422 0.343 0.246 0.125
1 1.000 0.999 0.997 0.995 0.993 0.990 0.986 0.982 0.977 0.972 0.939 0.896 0.844 0.784 0.648 0.500
2 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.997 0.992 0.984 0.973 0.936 0.875
3 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
n = 4; c = I+J
0 0.961 0.922 0.885 0.849 0.815 0.781 0.748 0.716 0.686 0.656 0.522 0.410 0.316 0.240 0.130 0.063
1 0.999 0.998 0.995 0.991 0.986 0.980 0.973 0.966 0.957 0.948 0.890 0.819 0.738 0.652 0.475 0.313
2 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.996 0.988 0.973 0.949 0.916 0.821 0.688
3 1.000 1.000 1.000 1.000 1.000 0.999 0.998 0.996 0.992 0.974 0.938
4 1.000 1.000 1.000 1.000 1.000 1.000
Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
I
n = 5; c = I + J
0 0.951 0.904 0.859 0.815 0.774 0.734 0.696 0.659 0.624 0.590 0.444 0.328 0.237 0.168 0.078 0.031
1 0.999 0.996 0.992 0.985 0.977 0.968 0.958 0.946 0.933 0.919 0.835 0.737 0.633 0.528 0.337 0.188
2 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.995 0.994 0.991 0.973 0.942 0.896 0.837 0.683 0.500
3 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.998 0.993 0.984 0.969 0.913 0.813
4 1.000 1.000 0.999 0.998 0.990 0.969
5 1.000 1.000 1.000 1.000
n = 10; c = I + J
0 0.904 0.817 0.737 0.665 0.599 0.539 0.484 0.434 0.389 0.349 0.197 0.107 0.056 0.028 0.006 0.001
1 0.996 0.984 0.965 0.942 0.914 0.882 0.848 0.812 0.775 0.736 0.544 0.376 0.244 0.149 0.046 0.011
2 1.000 0.999 0.997 0.994 0.988 0.981 0.972 0.960 0.946 0.930 0.820 0.678 0.526 0.383 0.167 0.055
3 1.000 1.000 1.000 0.999 0.998 0.996 0.994 0.991 0.987 0.950 0.879 0.776 0.650 0.382 0.172
4 1.000 1.000 1.000 0.999 0.999 0.998 0.990 0.967 0.922 0.850 0.633 0.377
5 1.000 1.000 1.000 0.999 0.994 0.980 0.953 0.834 0.623
6 1.000 0.999 0.996 0.989 0.945 0.828
7 1.000 1.000 0.998 0.988 0.945
8 1.000 0.998 0.989
9 1.000 0.999
10 1.000
J 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2
I
n = 15; c = I + J
0 0.860 0.739 0.633 0.542 0.463 0.395 0.337 0.286 0.243 0.206 0.087 0.035 0.013 0.005 0.005 0.004
1 0.990 0.965 0.927 0.881 0.829 0.774 0.717 0.660 0.603 0.549 0.319 0.167 0.080 0.035 0.027 0.018
2 1.000 0.997 0.991 0.980 0.964 0.943 0.917 0.887 0.853 0.816 0.604 0.398 0.236 0.127 0.091 0.069
3 1.000 1.000 0.998 0.995 0.990 0.982 0.973 0.960 0.944 0.823 0.648 0.461 0.297 0.217 0.151
4 1.000 0.999 0.999 0.997 0.995 0.992 0.987 0.938 0.836 0.686 0.515 0.403 0.304
5 1.000 1.000 1.000 0.999 0.999 0.998 0.983 0.939 0.851 0.722 0.610 0.500

6 1.000 1.000 1.000 0.996 0.982 0.943 0.869 0.787 0.696
7 0.999 0.996 0.983 0.950 0.905 0.849
8 1.000 0.999 0.996 0.985 0.966 0.941
9 1.000 0.999 0.996 0.991 0.982
10 1.000 0.999 0.998 0.996

11 1.000 1.000 1.000
Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 3

I
n = 20; c = I + J
0 0.818 0.668 0.544 0.442 0.358 0.290 0.234 0.189 0.152 0.122 0.039 0.012 0.003 0.001 0.001 0.001
1 0.983 0.940 0.880 0.810 0.736 0.660 0.587 0.517 0.452 0.392 0.176 0.069 0.024 0.008 0.004 0.006
2 0.999 0.993 0.979 0.956 0.925 0.885 0.839 0.788 0.733 0.677 0.405 0.206 0.091 0.035 0.016 0.021
3 1.000 0.999 0.997 0.993 0.984 0.971 0.953 0.929 0.901 0.867 0.648 0.411 0.225 0.107 0.051 0.058
4 1.000 1.000 0.999 0.997 0.994 0.989 0.982 0.971 0.957 0.830 0.630 0.415 0.238 0.126 0.131
5 1.000 1.000 0.999 0.998 0.996 0.993 0.989 0.933 0.804 0.617 0.416 0.250 0.252
6 1.000 1.000 0.999 0.999 0.998 0.978 0.913 0.786 0.608 0.416 0.412
7 1.000 1.000 1.000 0.994 0.968 0.898 0.772 0.596 0.588
8 0.999 0.990 0.959 0.887 0.755 0.748
9 1.000 0.997 0.986 0.952 0.872 0.868
10 0.999 0.996 0.983 0.943 0.942
11 1.000 0.999 0.995 0.979 0.979
12 1.000 0.999 0.994 0.994
13 1.000 0.998 0.999
14 1.000 1.000
J 0 0 0 0 0 0 0 0 0 0 0 0 1 1 3 5
I
n = 25; c = I + J
0 0.778 0.603 0.467 0.360 0.277 0.213 0.163 0.124 0.095 0.072 0.017 0.004 0.007 0.002 0.002 0.002
1 0.974 0.911 0.828 0.736 0.642 0.553 0.470 0.395 0.329 0.271 0.093 0.027 0.032 0.009 0.009 0.007
2 0.998 0.987 0.962 0.924 0.873 0.813 0.747 0.677 0.606 0.537 0.254 0.098 0.096 0.033 0.029 0.022
3 1.000 0.999 0.994 0.983 0.966 0.940 0.906 0.865 0.817 0.764 0.471 0.234 0.214 0.090 0.074 0.054
4 1.000 0.999 0.997 0.993 0.985 0.973 0.955 0.931 0.902 0.682 0.421 0.378 0.193 0.154 0.115
5 1.000 1.000 0.999 0.997 0.993 0.988 0.979 0.967 0.838 0.617 0.561 0.341 0.274 0.212
6 1.000 0.999 0.999 0.997 0.995 0.991 0.930 0.780 0.727 0.512 0.425 0.345
7 1.000 1.000 0.999 0.999 0.998 0.975 0.891 0.851 0.677 0.586 0.500
8 1.000 1.000 1.000 0.992 0.953 0.929 0.811 0.732 0.655
9 0.998 0.983 0.970 0.902 0.846 0.788
10 1.000 0.994 0.989 0.956 0.922 0.885
11 0.998 0.997 0.983 0.966 0.946
12 1.000 0.999 0.994 0.987 0.978
13 1.000 0.998 0.996 0.993
14 1.000 0.999 0.998
15 1.000 1.000
Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 0 1 2 4 6
I
n = 30; c = I + J
0 0.740 0.545 0.401 0.294 0.215 0.156 0.113 0.082 0.059 0.042 0.008 0.001 0.002 0.002 0.002 0.001
1 0.964 0.879 0.773 0.661 0.554 0.455 0.369 0.296 0.234 0.184 0.084 0.011 0.011 0.009 0.006 0.003
2 0.997 0.978 0.940 0.883 0.812 0.732 0.649 0.565 0.486 0.411 0.151 0.044 0.037 0.030 0.017 0.008
3 1.000 0.997 0.988 0.969 0.939 0.897 0.845 0.784 0.717 0.647 0.322 0.123 0.098 0.077 0.044 0.021
4 1.000 0.998 0.994 0.984 0.968 0.945 0.913 0.872 0.825 0.524 0.255 0.203 0.160 0.094 0.049
5 1.000 0.999 0.997 0.992 0.984 0.971 0.952 0.927 0.711 0.428 0.348 0.281 0.176 0.100
6 1.000 0.999 0.998 0.996 0.992 0.985 0.974 0.847 0.607 0.514 0.432 0.291 0.181
7 1.000 1.000 0.999 0.998 0.996 0.992 0.930 0.761 0.674 0.589 0.431 0.292
8 1.000 1.000 0.999 0.998 0.972 0.871 0.803 0.730 0.578 0.428
9 1.000 1.000 0.990 0.939 0.894 0.841 0.714 0.572
10 0.997 0.974 0.949 0.916 0.825 0.708
11 0.999 0.991 0.978 0.960 0.903 0.819
12 1.000 0.997 0.992 0.983 0.952 0.900
13 0.999 0.997 0.994 0.979 0.951
14 1.000 0.999 0.998 0.992 0.979
15 1.000 0.999 0.997 0.992
16 1.000 0.999 0.997
17 1.000 0.999
18 1.000
J 0 0 0 0 0 0 0 0 0 0 0 1 2 3 5 8
I
n = 35; c = I + J
0 0.703 0.493 0.344 0.240 0.166 0.115 0.079 0.054 0.039 0.025 0.003 0.004 0.001 0.002 0.001 0.001
1 0.952 0.845 0.717 0.589 0.472 0.371 0.287 0.218 0.164 0.122 0.024 0.019 0.003 0.009 0.003 0.003
2 0.995 0.967 0.913 0.837 0.746 0.649 0.552 0.461 0.379 0.306 0.087 0.061 0.014 0.027 0.010 0.008
3 1.000 0.995 0.980 0.950 0.904 0.844 0.773 0.694 0.612 0.531 0.209 0.143 0.041 0.065 0.026 0.020
4 0.999 0.986 0.988 0.971 0.944 0.905 0.856 0.797 0.731 0.381 0.272 0.098 0.133 0.058 0.045
5 1.000 0.999 0.998 0.993 0.983 0.967 0.943 0.910 0.868 0.569 0.433 0.192 0.234 0.112 0.088

6 1.000 1.000 0.998 0.996 0.990 0.981 0.966 0.945 0.735 0.599 0.322 0.365 0.195 0.155
7 1.000 0.999 0.998 0.994 0.989 0.980 0.856 0.745 0.474 0.510 0.306 0.250
8 1.000 0.999 0.999 0.997 0.994 0.931 0.854 0.626 0.652 0.436 0.368
9 1.000 1.000 0.999 0.998 0.971 0.925 0.758 0.773 0.573 0.500
10 1.000 1.000 0.989 0.966 0.858 0.865 0.700 0.632

Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 1 2 3 5 8

I
n = 35; c = I + J
11 0.996 0.986 0.924 0.927 0.807 0.750
12 0.999 0.995 0.964 0.964 0.886 0.845
13 1.000 0.998 0.984 0.984 0.938 0.912
14 0.999 0.994 0.994 0.970 0.955
15 1.000 0.998 0.998 0.987 0.980
16 0.999 0.999 0.995 0.992
17 1.000 1.000 0.998 0.997
18 0.999 0.999
19 1.000 1.000
J 0 0 0 0 0 0 0 0 0 0 0 1 2 3 6 10
I
n = 40; c = I + J
0 0.669 0.446 0.296 0.195 0.129 0.084 0.055 0.036 0.023 0.015 0.002 0.001 0.001 0.001 0.001 0.001
1 0.939 0.810 0.662 0.521 0.399 0.299 0.220 0.159 0.114 0.080 0.012 0.008 0.005 0.003 0.002 0.003
2 0.993 0.954 0.882 0.786 0.677 0.567 0.463 0.369 0.289 0.223 0.049 0.028 0.016 0.009 0.006 0.008
3 0.999 0.992 0.969 0.925 0.862 0.783 0.684 0.601 0.509 0.423 0.130 0.076 0.043 0.024 0.016 0.019
4 1.000 0.999 0.993 0.979 0.952 0.910 0.855 0.787 0.710 0.629 0.263 0.161 0.096 0.055 0.035 0.040
5 1.000 0.999 0.995 0.986 0.969 0.942 0.903 0.853 0.794 0.433 0.286 0.182 0.111 0.071 0.077
6 1.000 0.999 0.997 0.991 0.980 0.962 0.936 0.900 0.607 0.437 0.330 0.196 0.129 0.134
7 1.000 0.999 0.998 0.994 0.987 0.976 0.958 0.756 0.593 0.440 0.309 0.211 0.215
8 1.000 0.999 0.998 0.996 0.992 0.985 0.865 0.732 0.584 0.441 0.317 0.318
9 1.000 1.000 0.999 0.998 0.995 0.933 0.839 0.715 0.577 0.440 0.437
10 1.000 0.999 0.999 0.970 0.912 0.821 0.703 0.568 0.563
11 1.000 1.000 0.988 0.957 0.897 0.807 0.689 0.682
12 0.996 0.981 0.946 0.885 0.791 0.785
13 0.999 0.992 0.974 0.937 0.870 0.866
14 1.000 0.997 0.988 0.968 0.926 0.923
15 0.999 0.995 0.985 0.961 0.960
16 1.000 0.998 0.994 0.981 0.981
17 0.999 0.998 0.992 0.992
18 1.000 0.999 0.997 0.997
19 1.000 0.999 0.999
20 1.000 1.000
Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 0 1 3 4 8 12
I
n = 45; c = I + J
0 0.636 0.403 0.254 0.159 0.099 0.062 0.038 0.023 0.014 0.009 0.001 0.001 0.002 0.001 0.001 0.001
1 0.925 0.773 0.607 0.458 0.335 0.239 0.167 0.115 0.078 0.052 0.006 0.003 0.006 0.003 0.004 0.003
2 0.990 0.939 0.848 0.732 0.608 0.488 0.382 0.291 0.217 0.159 0.027 0.013 0.018 0.008 0.009 0.008
3 0.999 0.988 0.954 0.895 0.813 0.716 0.613 0.510 0.414 0.329 0.078 0.038 0.045 0.021 0.022 0.018
4 1.000 0.998 0.989 0.967 0.927 0.869 0.795 0.710 0.619 0.527 0.175 0.090 0.094 0.047 0.045 0.036
5 1.000 0.998 0.991 0.976 0.949 0.908 0.852 0.785 0.708 0.314 0.177 0.173 0.093 0.084 0.068
6 1.000 0.998 0.993 0.983 0.964 0.935 0.894 0.841 0.478 0.297 0.280 0.165 0.143 0.116
7 1.000 0.998 0.995 0.988 0.975 0.954 0.924 0.639 0.441 0.409 0.262 0.225 0.186
8 1.000 0.999 0.996 0.992 0.983 0.968 0.775 0.588 0.546 0.380 0.327 0.276
9 1.000 0.999 0.997 0.994 0.988 0.873 0.720 0.675 0.509 0.444 0.383
10 1.000 0.999 0.998 0.996 0.935 0.826 0.784 0.635 0.564 0.500
11 1.000 1.000 0.999 0.970 0.901 0.867 0.746 0.679 0.617
12 1.000 0.987 0.948 0.925 0.836 0.778 0.724
13 0.995 0.975 0.961 0.901 0.856 0.814
14 0.998 0.989 0.981 0.945 0.914 0.884
15 0.999 0.996 0.992 0.972 0.952 0.932
16 1.000 0.998 0.997 0.986 0.975 0.964
17 0.999 0.999 0.994 0.988 0.982
18 1.000 1.000 0.998 0.995 0.992
19 0.999 0.998 0.997
20 1.000 0.999 0.999
21 1.000 1.000
J 0 0 0 0 0 0 0 0 0 0 1 2 4 5 9 14
I
n = 50; c = I + J
0 0.605 0.364 0.218 0.129 0.077 0.045 0.027 0.015 0.009 0.006 0.003 0.001 0.002 0.001 0.001 0.001
1 0.911 0.736 0.555 0.400 0.279 0.190 0.126 0.083 0.053 0.034 0.014 0.005 0.007 0.002 0.002 0.003
2 0.986 0.922 0.811 0.677 0.541 0.416 0.311 0.226 0.161 0.112 0.046 0.018 0.019 0.007 0.006 0.008

3 0.998 0.982 0.937 0.861 0.760 0.647 0.533 0.425 0.330 0.250 0.112 0.048 0.045 0.018 0.013 0.016
4 0.999 0.997 0.983 0.951 0.896 0.821 0.729 0.629 0.527 0.431 0.219 0.103 0.092 0.040 0.028 0.032
5 0.999 0.996 0.986 0.962 0.922 0.865 0.792 0.707 0.616 0.361 0.190 0.164 0.079 0.054 0.059
6 0.999 0.996 0.988 0.971 0.942 0.898 0.840 0.770 0.518 0.307 0.262 0.139 0.096 0.101
7 0.999 0.997 0.990 0.978 0.956 0.923 0.878 0.668 0.443 0.382 0.223 0.156 0.161

Continued
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 0 1 2 4 5 9 14

I
n = 50; c = I + J
8 0.999 0.997 0.993 0.983 0.967 0.942 0.791 0.584 0.511 0.329 0.237 0.239
9 0.999 0.998 0.994 0.987 0.975 0.880 0.711 0.637 0.447 0.335 0.336
10 0.999 0.998 0.996 0.991 0.937 0.814 0.748 0.569 0.446 0.444
11 0.999 0.999 0.997 0.969 0.889 0.837 0.684 0.561 0.556
12 0.999 0.999 0.987 0.939 0.902 0.782 0.670 0.604
13 0.994 0.969 0.945 0.859 0.766 0.760
14 0.998 0.986 0.971 0.915 0.844 0.839
15 0.999 0.993 0.986 0.952 0.902 0.899
16 0.997 0.994 0.975 0.943 0.941
17 0.999 0.997 0.988 0.969 0.968
18 0.999 0.999 0.994 0.984 0.984
19 0.999 0.997 0.992 0.992
20 0.999 0.997 0.997
21 0.998 0.999
22 0.999 0.999
J 0 0 0 0 0 0 0 0 0 1 3 5 7 10 17 25
I
n = 75; c = I + J
0 0.471 0.219 0.101 0.047 0.021 0.009 0.004 0.002 0.001 0.003 0.002 0.001 0.001 0.000 0.001 0.001
1 0.827 0.556 0.338 0.193 0.105 0.056 0.029 0.014 0.007 0.016 0.008 0.004 0.002 0.002 0.003 0.005
2 0.960 0.810 0.608 0.419 0.269 0.165 0.096 0.055 0.030 0.050 0.023 0.010 0.004 0.004 0.006 0.010
3 0.993 0.936 0.812 0.647 0.479 0.334 0.211 0.140 0.085 0.119 0.054 0.024 0.010 0.009 0.011 0.018
4 0.999 0.982 0.925 0.819 0.679 0.529 0.390 0.274 0.184 0.227 0.108 0.050 0.022 0.019 0.021 0.032
5 0.999 0.996 0.975 0.920 0.828 0.706 0.571 0.439 0.322 0.367 0.189 0.093 0.043 0.035 0.037 0.053
6 0.999 0.992 0.969 0.919 0.837 0.729 0.606 0.482 0.521 0.295 0.156 0.077 0.062 0.061 0.083
7 0.998 0.989 0.966 0.919 0.847 0.749 0.638 0.666 0.418 0.239 0.127 0.102 0.096 0.124
8 0.999 0.997 0.988 0.965 0.922 0.856 0.769 0.786 0.547 0.341 0.195 0.157 0.144 0.178
9 0.999 0.996 0.986 0.964 0.925 0.865 0.874 0.668 0.454 0.279 0.227 0.205 0.244
10 0.999 0.999 0.995 0.985 0.964 0.928 0.931 0.772 0.569 0.377 0.312 0.279 0.322
11 0.999 0.998 0.994 0.984 0.965 0.966 0.853 0.676 0.482 0.407 0.365 0.409
12 0.999 0.998 0.994 0.984 0.984 0.911 0.769 0.588 0.507 0.456 0.500
13 0.999 0.998 0.993 0.993 0.949 0.844 0.686 0.605 0.549 0.591
14 0.999 0.997 0.997 0.973 0.900 0.771 0.697 0.641 0.678
Continued
n = 75; c = I + J
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 0 0 1 3 5 7 10 17 25
I
15 0.999 0.999 0.987 0.939 0.842 0.777 0.724 0.756
16 0.999 0.999 0.993 0.965 0.895 0.843 0.796 0.822
17 0.997 0.981 0.934 0.895 0.855 0.876
18 0.999 0.990 0.961 0.932 0.902 0.917
19 0.999 0.995 0.978 0.959 0.936 0.947
20 0.998 0.988 0.976 0.960 0.968
21 0.999 0.994 0.987 0.977 0.982
22 0.997 0.993 0.987 0.990
23 0.999 0.996 0.993 0.995
24 0.998 0.996 0.997
25 0.999 0.998 0.999
26 0.999 0.999
n = 100; c = I + J
J 0 0 0 0 0 0 0 1 1 2 6 8 12 16 24 34
I
0 0.366 0.133 0.048 0.017 0.006 0.002 0.001 0.002 0.001 0.002 0.002 0.001 0.001 0.001 0.001 0.001
1 0.736 0.403 0.195 0.087 0.037 0.016 0.006 0.011 0.005 0.008 0.005 0.002 0.002 0.002 0.001 0.002
2 0.921 0.677 0.420 0.232 0.118 0.057 0.026 0.037 0.017 0.024 0.012 0.006 0.005 0.005 0.002 0.003
3 0.982 0.859 0.647 0.429 0.258 0.143 0.074 0.090 0.047 0.058 0.027 0.013 0.011 0.009 0.005 0.006
4 0.997 0.949 0.818 0.629 0.436 0.277 0.163 0.180 0.105 0.117 0.055 0.025 0.021 0.016 0.008 0.010
5 0.999 0.985 0.919 0.788 0.616 0.441 0.291 0.303 0.194 0.206 0.099 0.047 0.038 0.029 0.015 0.018
6 1.000 0.996 0.969 0.894 0.766 0.607 0.444 0.447 0.313 0.321 0.163 0.080 0.063 0.048 0.025 0.028
7 0.999 0.989 0.952 0.872 0.748 0.699 0.593 0.449 0.451 0.247 0.129 0.100 0.076 0.040 0.044
8 1.000 0.997 0.981 0.937 0.854 0.734 0.722 0.688 0.683 0.347 0.192 0.149 0.114 0.062 0.067
9 0.999 0.993 0.972 0.922 0.838 0.824 0.712 0.703 0.457 0.271 0.211 0.163 0.091 0.097
10 1.000 0.998 0.989 0.962 0.909 0.897 0.812 0.802 0.568 0.362 0.286 0.224 0.130 0.136
11 0.999 0.996 0.983 0.953 0.944 0.886 0.876 0.672 0.460 0.371 0.296 0.179 0.184
12 1.000 0.999 0.993 0.978 0.972 0.936 0.927 0.763 0.559 0.462 0.377 0.239 0.242

13 1.000 0.997 0.990 0.987 0.966 0.960 0.837 0.654 0.653 0.462 0.307 0.309
14 0.999 0.996 0.994 0.983 0.979 0.893 0.739 0.642 0.549 0.382 0.382
15 1.000 0.998 0.998 0.992 0.990 0.934 0.811 0.722 0.633 0.462 0.460
16 0.999 0.999 0.996 0.995 0.961 0.869 0.792 0.711 0.543 0.640
17 1.000 1.000 0.999 0.998 0.978 0.913 0.850 0.779 0.622 0.618

Continued
n = 100; c = I + J
p 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.15 0.20 0.25 0.30 0.40 0.50
J 0 0 0 0 0 0 0 1 1 2 6 8 12 16 24 34
I
18 0.999 0.999 0.988 0.944 0.896 0.837 0.697 0.691
19 1.000 1.000 0.994 0.966 0.931 0.884 0.763 0.758
20 0.997 0.980 0.956 0.920 0.821 0.816
21 0.999 0.989 0.972 0.947 0.869 0.864
22 0.999 0.994 0.984 0.966 0.907 0.903
23 1.000 0.997 0.991 0.979 0.936 0.933
24 0.998 0.995 0.987 0.958 0.956
25 0.999 0.997 0.993 0.973 0.972
26 1.000 0.999 0.996 0.983 0.982
27 0.999 0.998 0.990 0.990
28 0.999 0.994 0.994
29 0.997 0.997
30 0.998 0.998
31 0.999 0.999
32 1.000 1.000
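These c = I + J tables appear to tabulate the cumulative binomial probability of c or fewer defectives in a sample of n at fraction defective p. A minimal stdlib sketch (the function name is my own) reproduces an entry:

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p): probability of c or fewer
    defectives in a sample of n when the fraction defective is p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

# Entry for n = 100, p = 0.01, c = I + J = 0 in the table above:
print(round(binom_cdf(0, 100, 0.01), 3))  # → 0.366
```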
Table A.6 Poisson probability curves.*
Probability of occurrence of c or less defects in a sample of n

[Thorndike chart: curves for C = 0 through C = 50 giving the probability of occurrence of c or less defects (vertical log scale, 0.00001 to 0.99999) against the value of λ = np (horizontal log scale, 0.1 to 30).]

* This table, copyright 1926 American Telephone and Telegraph Company, is a modification of Figure 5, following p. 612, in Frances Thorndike’s article, “Application of Poisson’s Probability Summation,” The Bell System Technical Journal 5 (October 1926), and is reproduced by permission of the editor of BSTJ. It appears also as Figure 2.6 on p. 35 of H. F. Dodge and H. G. Romig, Sampling Inspection Tables, 2nd ed. (New York: John Wiley & Sons, 1959), copyright 1959, and has the permission of John Wiley & Sons Inc. to be reproduced here.
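The ordinates of the chart are cumulative Poisson probabilities, which can be reproduced directly; a minimal stdlib sketch (the function name is my own):

```python
import math

def poisson_cdf(c, lam):
    """P(X <= c) for X ~ Poisson(lam): probability of c or fewer
    defects when the expected count is lam = n*p."""
    return sum(math.exp(-lam) * lam**x / math.factorial(x)
               for x in range(c + 1))

# e.g. at lam = 1.0 the C = 0 curve reads e**(-1) ≈ 0.368
```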

Table A.7 Nonrandom variability—standard given: df = ∞ (two-sided).


k Z.10 Z.05 Z.01
1 1.64 1.96 2.58
2 1.96 2.24 2.81
3 2.11 2.39 2.93
4 2.23 2.49 3.02
5 2.31 2.57 3.09
6 2.38 2.63 3.14
7 2.43 2.68 3.19
8 2.48 2.73 3.22
9 2.52 2.77 3.26
10 2.56 2.80 3.29
15 2.70 2.93 3.40
20 2.79 3.02 3.48
24 2.85 3.07 3.53
30 2.92 3.14 3.59
50 3.08 3.28 3.72
120 3.33 3.52 3.93
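The standard-given entries of Table A.7 agree with the Bonferroni-style normal quantile z at 1 − α/(2k). The sketch below reproduces them with the standard library only; the bisection-based norm_ppf is an assumed helper of my own, not from the text:

```python
import math

def norm_ppf(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Standard normal inverse CDF by bisection on erf (stdlib only)."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def z_alpha(k, alpha):
    """Two-sided, standard-given factor for k means: z at 1 - alpha/(2k)."""
    return norm_ppf(1.0 - alpha / (2.0 * k))

# z_alpha(1, 0.05) ≈ 1.96 and z_alpha(10, 0.01) ≈ 3.29, matching the table.
```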
Table A.8 Exact factors* for one-way analysis of means, Ha (two-sided).
Significance level = 0.10
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 2.065
3 1.663 2.585
4 1.507 2.293 2.689
5 1.425 2.143 2.494 2.731
6 1.374 2.052 2.376 2.597 2.764
7 1.340 1.991 2.297 2.507 2.666 2.792
8 1.315 1.947 2.240 2.442 2.595 2.717 2.817
9 1.296 1.914 2.197 2.393 2.541 2.659 2.757 2.839
10 1.282 1.888 2.164 2.355 2.499 2.614 2.709 2.790 2.859
11 1.270 1.868 2.138 2.324 2.465 2.578 2.671 2.749 2.818 2.877
12 1.260 1.851 2.116 2.299 2.438 2.548 2.639 2.716 2.783 2.842 2.894
13 1.252 1.836 2.097 2.278 2.414 2.523 2.613 2.689 2.754 2.812 2.864 2.910
14 1.245 1.824 2.082 2.260 2.395 2.502 2.590 2.665 2.730 2.787 2.837 2.883 2.924
15 1.240 1.814 2.069 2.245 2.378 2.483 2.571 2.645 2.708 2.765 2.815 2.860 2.901 2.938
16 1.235 1.805 2.057 2.232 2.363 2.468 2.554 2.627 2.690 2.746 2.795 2.839 2.880 2.917 2.951
17 1.230 1.797 2.047 2.220 2.350 2.454 2.539 2.611 2.674 2.729 2.778 2.822 2.861 2.898 2.932 2.963
18 1.226 1.790 2.038 2.210 2.339 2.441 2.526 2.597 2.659 2.714 2.762 2.806 2.845 2.881 2.915 2.945 2.974
19 1.223 1.784 2.030 2.201 2.329 2.430 2.514 2.585 2.647 2.701 2.749 2.792 2.831 2.867 2.900 2.930 2.958 2.985
20 1.220 1.779 2.023 2.192 2.319 2.420 2.504 2.574 2.635 2.689 2.736 2.779 2.818 2.853 2.886 2.916 2.944 2.971 2.995
24 1.210 1.762 2.001 2.167 2.291 2.390 2.471 2.540 2.599 2.651 2.697 2.739 2.777 2.811 2.843 2.873 2.900 2.926 2.949
30 1.200 1.745 1.979 2.141 2.263 2.359 2.438 2.505 2.563 2.614 2.659 2.699 2.736 2.770 2.801 2.829 2.856 2.881 2.904
40 1.191 1.728 1.958 2.116 2.235 2.329 2.406 2.471 2.527 2.577 2.621 2.660 2.696 2.728 2.758 2.786 2.812 2.836 2.858
60 1.181 1.711 1.937 2.092 2.207 2.299 2.374 2.438 2.492 2.540 2.583 2.621 2.656 2.687 2.716 2.743 2.768 2.791 2.813
120 1.172 1.695 1.916 2.067 2.180 2.269 2.343 2.404 2.457 2.504 2.545 2.582 2.616 2.646 2.674 2.700 2.724 2.747 2.768
∞ 1.163 1.679 1.896 2.043 2.154 2.240 2.311 2.371 2.423 2.468 2.508 2.544 2.576 2.606 2.633 2.658 2.681 2.703 2.723

SG 1.949 2.11 2.23 2.31 2.38 2.43 2.48 2.52 2.56 2.59 2.62 2.65 2.67 2.70 2.72 2.74 2.76 2.77 2.79
Continued

Significance level = 0.05
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 3.042
3 2.248 3.416
4 1.963 2.906 3.373
5 1.817 2.655 3.053 3.330
6 1.730 2.505 2.864 3.114 3.304
7 1.672 2.406 2.740 2.972 3.148 3.289
8 1.631 2.336 2.652 2.871 3.038 3.171 3.281
9 1.600 2.283 2.586 2.796 2.955 3.082 3.187 3.277
10 1.576 2.242 2.535 2.738 2.891 3.013 3.115 3.201 3.276
11 1.556 2.209 2.494 2.691 2.840 2.959 3.057 3.140 3.212 3.276
12 1.541 2.182 2.461 2.653 2.798 2.914 3.010 3.091 3.161 3.223 3.278
13 1.528 2.160 2.433 2.622 2.764 2.877 2.970 3.050 3.118 3.179 3.232 3.281
14 1.517 2.141 2.410 2.595 2.735 2.846 2.937 3.015 3.082 3.141 3.194 3.241 3.284
15 1.507 2.125 2.390 2.573 2.710 2.819 2.909 2.985 3.051 3.109 3.161 3.207 3.250 3.288
16 1.499 2.111 2.373 2.553 2.688 2.795 2.884 2.959 3.024 3.081 3.132 3.178 3.220 3.258 3.293
17 1.492 2.099 2.358 2.536 2.669 2.775 2.863 2.937 3.001 3.057 3.107 3.152 3.193 3.231 3.265 3.297
18 1.486 2.088 2.345 2.520 2.653 2.757 2.844 2.917 2.980 3.036 3.085 3.130 3.170 3.207 3.241 3.273 3.302
19 1.480 2.079 2.333 2.507 2.638 2.741 2.827 2.899 2.962 3.017 3.065 3.109 3.149 3.186 3.220 3.251 3.280 3.307
20 1.475 2.070 2.322 2.495 2.624 2.727 2.812 2.883 2.945 3.000 3.048 3.091 3.131 3.167 3.200 3.231 3.260 3.287 3.312
24 1.459 2.043 2.289 2.457 2.583 2.683 2.765 2.834 2.894 2.946 2.993 3.035 3.073 3.108 3.140 3.170 3.197 3.223 3.248
30 1.444 2.017 2.257 2.420 2.543 2.639 2.719 2.786 2.843 2.894 2.939 2.980 3.017 3.050 3.081 3.110 3.136 3.161 3.184
40 1.429 1.991 2.225 2.384 2.503 2.596 2.673 2.738 2.794 2.843 2.886 2.926 2.961 2.993 3.023 3.051 3.076 3.100 3.122
60 1.414 1.966 2.194 2.349 2.464 2.555 2.629 2.692 2.745 2.793 2.835 2.872 2.906 2.937 2.966 2.992 3.017 3.040 3.061
120 1.400 1.941 2.164 2.314 2.426 2.513 2.585 2.646 2.698 2.743 2.784 2.820 2.853 2.883 2.910 2.935 2.959 2.981 3.002
∞ 1.386 1.917 2.134 2.280 2.388 2.473 2.543 2.601 2.651 2.695 2.734 2.769 2.800 2.829 2.855 2.879 2.902 2.923 2.943
SG 2.236 2.39 2.49 2.57 2.63 2.68 2.73 2.77 2.80 2.83 2.86 2.88 2.91 2.93 2.95 2.97 2.98 3.00 3.02
Continued
Significance level = 0.01
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 7.018
3 4.098 6.143
4 3.248 4.676 5.381
5 2.849 4.021 4.577 4.966
6 2.621 3.654 4.130 4.460 4.711
7 2.474 3.420 3.846 4.141 4.364 4.542
8 2.372 3.257 3.651 3.921 4.125 4.287 4.422
9 2.298 3.138 3.508 3.761 3.951 4.103 4.227 4.333
10 2.241 3.048 3.400 3.640 3.820 3.962 4.080 4.180 4.266
11 2.196 2.976 3.314 3.544 3.716 3.852 3.964 4.059 4.141 4.213
12 2.160 2.918 3.245 3.467 3.633 3.763 3.871 3.962 4.040 4.109 4.171
13 2.130 2.870 3.188 3.404 3.564 3.691 3.794 3.882 3.958 4.024 4.084 4.137
14 2.105 2.830 3.141 3.351 3.507 3.630 3.730 3.816 3.889 3.953 4.011 4.062 4.109
15 2.084 2.796 3.100 3.306 3.458 3.578 3.676 3.759 3.830 3.893 3.949 3.999 4.044 4.085
16 2.065 2.767 3.066 3.267 3.416 3.533 3.630 3.710 3.780 3.841 3.896 3.944 3.988 4.029 4.066
17 2.049 2.741 3.035 3.233 3.380 3.495 3.589 3.668 3.737 3.796 3.849 3.897 3.940 3.980 4.016 4.049
18 2.035 2.719 3.009 3.204 3.348 3.461 3.554 3.631 3.698 3.757 3.809 3.856 3.898 3.937 3.972 4.005 4.036
19 2.023 2.699 2.985 3.178 3.320 3.431 3.522 3.599 3.665 3.722 3.773 3.819 3.861 3.899 3.933 3.966 3.996 4.024
20 2.012 2.681 2.965 3.154 3.295 3.405 3.494 3.570 3.635 3.691 3.742 3.787 3.828 3.865 3.899 3.931 3.960 3.988 4.013
24 1.978 2.626 2.900 3.082 3.217 3.322 3.408 3.480 3.542 3.596 3.643 3.686 3.725 3.760 3.793 3.823 3.851 3.877 3.901
30 1.945 2.573 2.837 3.013 3.142 3.242 3.324 3.393 3.452 3.503 3.548 3.589 3.626 3.659 3.690 3.718 3.745 3.769 3.792
40 1.912 2.521 2.776 2.945 3.069 3.165 3.243 3.309 3.365 3.414 3.457 3.495 3.530 3.562 3.591 3.618 3.643 3.666 3.688
60 1.881 2.471 2.717 2.880 2.998 3.091 3.165 3.228 3.281 3.327 3.368 3.405 3.438 3.468 3.495 3.521 3.544 3.566 3.586
120 1.851 2.421 2.660 2.816 2.930 3.018 3.090 3.149 3.200 3.244 3.283 3.317 3.349 3.377 3.403 3.427 3.449 3.470 3.489
∞ 1.821 2.374 2.604 2.755 2.864 2.949 3.017 3.073 3.121 3.163 3.200 3.233 3.262 3.289 3.314 3.336 3.357 3.377 3.395

SG 2.806 2.93 3.02 3.09 3.14 3.19 3.23 3.26 3.29 3.32 3.34 3.36 3.38 3.40 3.42 3.44 3.45 3.47 3.48
Continued

Continued

Significance level = 0.001
Number of means, k *
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 22.344
3 8.759 13.383
4 6.008 8.642 9.867
5 4.832 6.743 7.624 8.227
6 4.204 5.754 6.560 6.942 7.303
7 3.820 5.157 5.761 6.170 6.476 6.719
8 3.563 4.762 5.298 5.660 5.930 6.144 6.320
9 3.379 4.483 4.972 5.301 5.546 5.739 5.898 6.032
10 3.243 4.275 4.730 5.035 5.261 5.439 5.585 5.709 5.815
11 3.137 4.115 4.544 4.831 5.042 5.209 5.345 5.461 5.560 5.647
12 3.053 3.989 4.397 4.669 4.870 5.027 5.156 5.264 5.358 5.440 5.513
13 2.984 3.886 4.278 4.538 4.730 4.880 5.003 5.106 5.195 5.272 5.341 5.403
14 2.928 3.802 4.180 4.430 4.615 4.759 4.876 4.975 5.060 5.134 5.200 5.259 5.313
15 2.880 3.730 4.097 4.340 4.518 4.657 4.770 4.865 4.947 5.019 5.082 5.139 5.190 5.237
16 2.839 3.670 4.027 4.263 4.436 4.571 4.680 4.772 4.851 4.920 4.981 5.036 5.086 5.131 5.173
17 2.804 3.618 3.967 4.197 4.365 4.496 4.603 4.692 4.769 4.836 4.895 4.948 4.996 5.040 5.080 5.117
18 2.773 3.572 3.914 4.139 4.304 4.432 4.536 4.623 4.697 4.762 4.820 4.872 4.918 4.961 5.000 5.036 5.070
19 2.746 3.532 3.868 4.089 4.250 4.375 4.477 4.562 4.635 4.698 4.754 4.805 4.850 4.891 4.930 4.965 4.997 5.028
20 2.722 3.497 3.827 4.044 4.202 4.325 4.425 4.508 4.579 4.641 4.696 4.745 4.790 4.830 4.867 4.902 4.934 4.963 4.991
24 2.648 3.390 3.703 3.908 4.057 4.172 4.266 4.343 4.410 4.468 4.519 4.565 4.606 4.643 4.678 4.709 4.739 4.766 4.792
30 2.578 3.287 3.585 3.779 3.920 4.028 4.115 4.188 4.250 4.304 4.352 4.394 4.432 4.467 4.499 4.528 4.555 4.581 4.604
40 2.511 3.190 3.474 3.658 3.790 3.892 3.974 4.042 4.099 4.150 4.194 4.233 4.268 4.300 4.330 4.357 4.382 4.405 4.427
60 2.447 3.099 3.369 3.543 3.667 3.763 3.840 3.903 3.957 4.004 4.045 4.081 4.114 4.144 4.171 4.196 4.219 4.240 4.260
120 2.385 3.012 3.269 3.434 3.552 3.642 3.713 3.773 3.823 3.866 3.904 3.938 3.968 3.996 4.021 4.044 4.065 4.085 4.103
∞ 2.327 2.930 3.175 3.332 3.443 3.527 3.595 3.650 3.697 3.737 3.772 3.804 3.832 3.857 3.880 3.901 3.920 3.938 3.955
SG 3.481 3.59 3.66 3.72 3.76 3.80 3.84 3.86 3.89 3.91 3.93 3.95 3.97 3.99 4.00 4.02 4.03 4.04 4.06
* The values Ha for k ≥ 3 in this table are exact values for the studentized maximum absolute deviate from the sample mean in normal samples and represent modifications by E. G. Schilling and D. Smialek [“Simplified Analysis of Means for Crossed and Nested Experiments,” Proceedings of the 43rd Annual Quality Control Conference, Rochester Section, ASQC (March 10, 1987)] of the exact values ha calculated by L. S. Nelson [“Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1 (January 1983): 40–44] using the relationship Ha = ha([k − 1]/k)^1/2. The values for k = 2 are from the Student’s t distribution as calculated by Ott [E. R. Ott, “Analysis of Means,” Rutgers University Statistics Center Technical Report No. 1 (August 10, 1958)]. These table values have been recalculated by D. V. Neubauer.
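The conversion quoted in the footnote can be sketched directly (the function name is my own):

```python
import math

def Ha_from_ha(ha, k):
    """Convert Nelson's exact critical value ha to the ANOM factor Ha
    using the footnote's relationship Ha = ha * sqrt((k - 1) / k)."""
    return ha * math.sqrt((k - 1) / k)
```

Since sqrt((k − 1)/k) < 1, the tabled Ha values are always slightly smaller than Nelson's ha for the same k.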

Table A.9 Dixon criteria for testing extreme mean or individual.*

Dixon statistic
X(1) is outlier                          X(n) is outlier                          Applies for obs., n
r10 = (X(2) − X(1)) / (X(n) − X(1))      r10 = (X(n) − X(n−1)) / (X(n) − X(1))    n = 3 to 7
r11 = (X(2) − X(1)) / (X(n−1) − X(1))    r11 = (X(n) − X(n−1)) / (X(n) − X(2))    n = 8 to 10
r21 = (X(3) − X(1)) / (X(n−1) − X(1))    r21 = (X(n) − X(n−2)) / (X(n) − X(2))    n = 11 to 13
r22 = (X(3) − X(1)) / (X(n−2) − X(1))    r22 = (X(n) − X(n−2)) / (X(n) − X(3))    n = 14 to 25

n    α = 0.10 (P90)   α = 0.05 (P95)   α = 0.02 (P98)   α = 0.01 (P99)
3    0.886   0.941   0.976   0.988
4    0.679   0.765   0.846   0.889
5    0.557   0.642   0.729   0.780
6    0.482   0.560   0.644   0.698
7    0.434   0.507   0.586   0.637
8    0.479   0.554   0.631   0.683
9    0.441   0.512   0.587   0.635
10   0.409   0.477   0.551   0.597
11   0.517   0.576   0.638   0.679
12   0.490   0.546   0.605   0.642
13   0.467   0.521   0.578   0.615
14   0.492   0.546   0.602   0.641
15   0.472   0.525   0.579   0.616
16   0.454   0.507   0.559   0.595
17   0.438   0.490   0.542   0.577
18   0.424   0.475   0.527   0.561
19   0.412   0.462   0.514   0.547
20   0.401   0.450   0.502   0.535
21   0.391   0.440   0.491   0.524
22   0.382   0.430   0.481   0.514
23   0.374   0.421   0.472   0.505
24   0.367   0.413   0.464   0.497
25   0.360   0.406   0.457   0.489
* Note that:
X(1) = Smallest value (first-order statistic)
X(2) = Next smallest value (second-order statistic)
X(n) = Largest value (nth order statistic)
Source: W. J. Dixon, “Processing Data for Outliers,” Biometrics 9, no. 1 (1953): 74–89. (Reprinted by
permission of the editor of Biometrics.)
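For the n = 3 to 7 range, the r10 ratios can be computed directly; a sketch (the helper name is my own):

```python
def dixon_r10(data):
    """Dixon r10 ratios (for n = 3 to 7), testing whether the smallest
    or the largest observation is an outlier.  Returns (low, high)."""
    x = sorted(data)
    span = x[-1] - x[0]
    low = (x[1] - x[0]) / span     # (X(2)-X(1)) / (X(n)-X(1)): smallest suspect
    high = (x[-1] - x[-2]) / span  # (X(n)-X(n-1)) / (X(n)-X(1)): largest suspect
    return low, high

# e.g. for [7.2, 7.3, 7.3, 7.4, 9.0]: high = (9.0-7.4)/(9.0-7.2) ≈ 0.889,
# which exceeds the n = 5, alpha = 0.05 critical value 0.642, flagging 9.0.
```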

Table A.10 Grubbs criteria for simultaneously testing the two largest or two smallest observations.
Compare computed values of S²n−1,n/S² or S²1,2/S² with the appropriate critical ratio in this table; smaller observed sample ratios call for rejection. X(1) ≤ X(2) ≤ . . . ≤ X(n).
Number of 10% 5% 1%
observations level level level
4 .0031 .0008 .0000
5 .0376 .0183 .0035
6 .0921 .0565 .0186
7 .1479 .1020 .0440
8 .1994 .1478 .0750
9 .2454 .1909 .1082
10 .2853 .2305 .1415
11 .3226 .2666 .1736
12 .3552 .2996 .2044
13 .3843 .3295 .2333
14 .4106 .3568 .2605
15 .4345 .3818 .2859
16 .4562 .4048 .3098
17 .4761 .4259 .3321
18 .4944 .4455 .3530
19 .5113 .4636 .3725
20 .5269 .4804 .3909

S² = Σ(i=1 to n) (Xi − X̄)²,  where X̄ = Σ(i=1 to n) Xi / n

S²1,2 = Σ(i=3 to n) (Xi − X̄1,2)²,  where X̄1,2 = Σ(i=3 to n) Xi / (n − 2)

S²n−1,n = Σ(i=1 to n−2) (Xi − X̄n−1,n)²,  where X̄n−1,n = Σ(i=1 to n−2) Xi / (n − 2)

Source: F. E. Grubbs, “Procedures for Detecting Outlying Observations in Samples,” Technometrics 11, no. 1
(February 1969): 1–21. (Reproduced by permission of the editor.)
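A sketch of the two-largest ratio S²n−1,n/S² (the helper name is my own): drop the two largest order statistics, recompute the sum of squares about the reduced mean, and divide by the full sum of squares about the overall mean.

```python
def grubbs_two_largest_ratio(data):
    """S^2_{n-1,n} / S^2 from Table A.10 (Grubbs, 1969): small observed
    ratios call for rejection of the two largest observations."""
    x = sorted(data)
    n = len(x)

    def ss(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    return ss(x[:n - 2]) / ss(x)

# e.g. [1, 2, 3, 4, 10, 11] gives 6/109 ≈ 0.0550, below the n = 6,
# 5% critical ratio 0.0565, so the pair (10, 11) would be rejected.
```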
Table A.11 Expanded table of the adjusted d2 factor (d2*) for estimating the standard deviation from the average range.
To be used with estimates of σ based on k independent sample ranges of ng each. (Unbiased estimate of σ² is (R̄/d2*)²; unbiased estimate of σ is R̄/d2, where d2 is from Table A.4.)
Subgroup size, ng
k, number of  ng = 2    ng = 3    ng = 4    ng = 5    ng = 6    ng = 7    ng = 8    ng = 9
samples       d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df
1 1.400 1.0 1.910 2.0 2.239 2.9 2.481 3.8 2.672 4.7 2.830 5.5 2.963 6.3 3.078 7.0
2 1.278 1.8 1.805 3.6 2.151 5.5 2.405 7.2 2.604 8.9 2.768 10.5 2.906 12.1 3.024 13.5
3 1.231 2.6 1.769 5.4 2.120 8.2 2.379 10.9 2.581 13.4 2.747 15.8 2.886 18.1 3.006 20.3
4 1.206 3.5 1.750 7.3 2.105 11.0 2.366 14.5 2.570 17.9 2.736 21.1 2.877 24.1 2.997 27.0
5 1.191 4.4 1.739 9.1 2.096 13.7 2.358 18.1 2.563 22.3 2.730 26.3 2.871 30.2 2.992 33.8
6 1.180 5.3 1.732 10.9 2.091 16.4 2.353 21.7 2.557 26.8 2.725 31.6 2.866 36.2 2.987 40.6
7 1.173 6.1 1.727 12.7 2.086 19.2 2.349 25.4 2.554 31.3 2.722 36.9 2.863 42.2 2.985 47.3
8 1.167 7.0 1.722 14.5 2.083 21.9 2.346 29.0 2.552 35.7 2.720 42.1 2.861 48.2 2.983 54.1
9 1.163 7.9 1.719 16.3 2.080 24.6 2.344 32.6 2.550 40.2 2.718 47.4 2.860 54.3 2.981 60.8
10 1.159 8.8 1.717 18.2 2.078 27.4 2.342 36.2 2.548 44.7 2.717 52.7 2.858 60.3 2.980 67.6
11 1.156 9.6 1.714 20.0 2.076 30.1 2.341 39.9 2.547 49.1 2.715 57.9 2.857 66.3 2.979 74.3
12 1.154 10.5 1.713 21.8 2.075 32.9 2.339 43.5 2.546 53.6 2.714 63.2 2.856 72.4 2.979 81.1
13 1.152 11.4 1.711 23.6 2.074 35.6 2.338 47.1 2.545 58.1 2.714 68.5 2.856 78.4 2.978 87.9
14 1.150 12.3 1.710 25.4 2.073 38.3 2.338 50.7 2.544 62.5 2.713 73.7 2.855 84.4 2.977 94.6
15 1.149 13.1 1.709 27.2 2.072 41.1 2.337 54.3 2.543 67.0 2.712 79.0 2.855 90.5 2.977 101.4
20 1.144 17.5 1.705 36.3 2.069 54.8 2.334 72.5 2.541 89.3 2.710 105.3 2.853 120.6 2.975 135.2
25 1.141 21.9 1.702 45.4 2.066 68.4 2.332 90.6 2.540 111.6 2.709 131.7 2.852 150.8 2.974 169.0
30 1.138 26.3 1.701 54.5 2.065 82.1 2.331 108.7 2.539 134.0 2.708 158.0 2.851 180.9 2.973 202.8
40 1.136 35.0 1.699 72.6 2.064 109.5 2.330 144.9 2.538 178.6 2.707 210.7 2.850 241.2 2.973 270.3
60 1.133 52.6 1.697 108.9 2.062 164.3 2.329 217.4 2.536 267.9 2.706 316.0 2.849 361.8 2.972 405.5
∞ 1.128 ∞ 1.693 ∞ 2.059 ∞ 2.326 ∞ 2.534 ∞ 2.704 ∞ 2.847 ∞ 2.970 ∞

Continued
Subgroup size, ng
k, number of  ng = 10   ng = 11   ng = 12   ng = 13   ng = 14   ng = 15   ng = 16   ng = 17
samples       d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df
1 3.179 7.7 3.269 8.3 3.350 9.0 3.424 9.6 3.491 10.2 3.553 10.8 3.611 11.3 3.664 11.9
2 3.129 14.9 3.221 16.2 3.305 17.5 3.380 18.7 3.449 19.9 3.513 21.1 3.572 22.2 3.626 23.3
3 3.112 22.4 3.205 24.4 3.289 26.3 3.366 28.1 3.435 29.9 3.499 31.6 3.558 33.3 3.614 34.9
4 3.103 29.8 3.197 32.5 3.282 35.0 3.358 37.5 3.428 39.9 3.492 42.2 3.552 44.4 3.607 46.5
5 3.098 37.3 3.192 40.6 3.277 43.8 3.354 46.9 3.424 49.8 3.488 52.7 3.548 55.5 3.603 58.1
6 3.095 44.7 3.189 48.7 3.274 52.6 3.351 56.2 3.421 59.8 3.486 63.2 3.545 66.5 3.601 69.8
7 3.092 52.2 3.187 56.8 3.272 61.3 3.349 65.6 3.419 69.8 3.484 73.8 3.543 77.6 3.599 81.4
8 3.090 59.6 3.185 65.0 3.270 70.1 3.347 75.0 3.417 79.7 3.482 84.3 3.542 88.7 3.598 93.0
9 3.089 67.1 3.184 73.1 3.269 78.8 3.346 84.4 3.416 89.7 3.481 94.9 3.541 99.8 3.596 104.6
10 3.088 74.5 3.183 81.2 3.268 87.6 3.345 93.7 3.415 99.7 3.480 105.4 3.540 110.9 3.596 116.3
11 3.087 82.0 3.182 89.3 3.267 96.4 3.344 103.1 3.415 109.6 3.479 115.9 3.539 122.0 3.595 127.9
12 3.086 89.4 3.181 97.4 3.266 105.1 3.343 112.5 3.414 119.6 3.479 126.5 3.539 133.1 3.594 139.5
13 3.085 96.9 3.180 105.6 3.266 113.9 3.343 121.9 3.413 129.6 3.478 137.0 3.538 144.2 3.594 151.1
14 3.085 104.4 3.180 113.7 3.265 122.6 3.342 131.2 3.413 139.5 3.478 147.5 3.538 155.3 3.593 162.8
15 3.084 111.8 3.179 121.8 3.265 131.4 3.342 140.6 3.412 149.5 3.477 158.1 3.537 166.4 3.593 174.4
20 3.083 149.1 3.178 162.4 3.263 175.2 3.340 187.5 3.411 199.3 3.476 210.8 3.536 221.8 3.592 232.5
25 3.082 186.4 3.177 203.0 3.262 219.0 3.340 234.4 3.410 249.2 3.475 263.5 3.535 277.3 3.591 290.7
30 3.081 223.6 3.176 243.6 3.262 262.8 3.339 281.2 3.410 299.0 3.475 316.2 3.535 332.7 3.590 348.8
40 3.080 298.2 3.175 324.8 3.261 350.4 3.338 375.0 3.409 398.7 3.474 421.6 3.534 443.7 3.590 465.1
60 3.079 447.2 3.175 487.2 3.260 525.6 3.337 562.5 3.408 598.0 3.473 632.3 3.533 665.5 3.589 697.6
∞ 3.078 ∞ 3.173 ∞ 3.258 ∞ 3.336 ∞ 3.407 ∞ 3.472 ∞ 3.532 ∞ 3.588 ∞
Continued
Subgroup size, ng
k, number of  ng = 18   ng = 19   ng = 20   ng = 21   ng = 22   ng = 23   ng = 24   ng = 25
samples       d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df   d2*  df
1 3.714 12.4 3.761 12.9 3.805 13.4 3.847 13.8 3.887 14.3 3.924 14.8 3.960 15.2 3.994 15.6
2 3.677 24.3 3.725 25.3 3.770 26.3 3.813 27.2 3.853 28.1 3.891 29.0 3.928 29.9 3.962 30.8
3 3.665 36.4 3.713 37.9 3.759 39.4 3.801 40.8 3.842 42.2 3.880 43.6 3.917 44.9 3.952 46.2
4 3.659 48.6 3.707 50.6 3.753 52.5 3.796 54.4 3.836 56.3 3.875 58.1 3.912 59.9 3.947 61.6
5 3.655 60.7 3.704 63.2 3.749 65.7 3.792 68.1 3.833 70.4 3.872 72.6 3.908 74.8 3.943 77.0
6 3.653 72.9 3.701 75.9 3.747 78.8 3.790 81.7 3.831 84.4 3.869 87.1 3.906 89.8 3.941 92.4
7 3.651 85.0 3.699 88.5 3.745 92.0 3.788 95.3 3.829 98.5 3.868 101.7 3.905 104.7 3.940 107.7
8 3.649 97.2 3.698 101.2 3.744 105.1 3.787 108.9 3.828 112.6 3.867 116.2 3.903 119.7 3.939 123.1
9 3.648 109.3 3.697 113.8 3.743 118.2 3.786 122.5 3.827 126.7 3.866 130.7 3.903 134.7 3.938 138.5
10 3.648 121.4 3.696 126.5 3.742 131.4 3.785 136.1 3.826 140.7 3.865 145.2 3.902 149.6 3.937 153.9
11 3.647 133.6 3.696 139.1 3.741 144.5 3.785 149.7 3.826 154.8 3.864 159.8 3.901 164.6 3.936 169.3
12 3.646 145.7 3.695 151.8 3.741 157.6 3.784 163.3 3.825 168.9 3.864 174.3 3.901 179.6 3.936 184.7
13 3.646 157.9 3.695 164.4 3.740 170.8 3.784 176.9 3.825 183.0 3.863 188.8 3.900 194.5 3.936 200.1
14 3.645 170.0 3.694 177.1 3.740 183.9 3.783 190.6 3.824 197.0 3.863 203.3 3.900 209.5 3.935 215.5
15 3.645 182.2 3.694 189.7 3.740 197.0 3.783 204.2 3.824 211.1 3.863 217.9 3.900 224.4 3.935 230.9
20 3.644 242.9 3.693 252.9 3.739 262.7 3.782 272.2 3.823 281.5 3.862 290.5 3.899 299.3 3.934 307.8
25 3.643 303.6 3.692 316.2 3.738 328.4 3.781 340.3 3.822 351.8 3.861 363.1 3.898 374.1 3.933 384.8
30 3.643 364.3 3.691 379.4 3.737 394.1 3.781 408.3 3.822 422.2 3.861 435.7 3.898 448.9 3.933 461.8
40 3.642 485.8 3.691 505.9 3.737 525.4 3.780 544.4 3.821 562.9 3.860 580.9 3.897 598.5 3.932 615.7
60 3.641 728.7 3.690 758.8 3.736 788.2 3.779 816.7 3.821 844.4 3.859 871.4 3.896 897.8 3.932 923.5
∞ 3.640 ∞ 3.689 ∞ 3.735 ∞ 3.778 ∞ 3.819 ∞ 3.858 ∞ 3.895 ∞ 3.931 ∞
Source: The approximation for d2* is based on the approximation given by P. B. Patnaik in the paper, “The Use of Mean Range As an Estimator of Variance in Statistical Tests,” Biometrika 37 (1950): 78–87. The calculation for the degrees of freedom is based on an extension to the approximation given by P. B. Patnaik, which was presented by H. A. David in the paper, “Further Applications of Range to the Analysis of Variance,” Biometrika 38 (1951): 393–407, to improve the accuracy for k > 5, in particular. This table has been extended and the values have been recalculated by D. V. Neubauer.
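Using a d2* value read from Table A.11 (for example, k = 3 samples of ng = 3 gives d2* = 1.769), the estimate itself is a one-step calculation; a sketch (the function name is my own):

```python
def sigma_from_ranges(ranges, d2_star):
    """Estimate sigma from k independent subgroup ranges using the
    adjusted factor d2* of Table A.11: sigma_hat = Rbar / d2*.
    (Per the table note, (Rbar/d2*)**2 is the unbiased estimate of
    sigma**2.)"""
    rbar = sum(ranges) / len(ranges)
    return rbar / d2_star

# e.g. three subgroup ranges of size ng = 3 with Rbar = 3.0:
# sigma_from_ranges([2.0, 3.0, 4.0], 1.769) ≈ 1.70
```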

Table A.12a F distribution, upper five percent points (F0.95) (one-sided).
v1 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 40 60 120 ∞

v2
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 243.9 245.9 248.0 249.1 250.1 251.1 252.2 253.3 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.41 19.43 19.45 19.45 19.46 19.47 19.48 19.49 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.74 8.70 8.68 8.64 8.62 8.59 8.57 8.55 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.08 6.04 6.00 5.96 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40 4.36
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97 2.93
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75 2.71
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.91 2.85 2.77 2.74 2.70 2.66 2.62 2.58 2.54
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.79 2.72 2.65 2.61 2.57 2.53 2.49 2.45 2.40
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.69 2.62 2.54 2.51 2.47 2.43 2.38 2.34 2.30
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.60 2.53 2.46 2.42 2.38 2.34 2.30 2.25 2.21
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.53 2.46 2.39 2.35 2.31 2.27 2.22 2.18 2.13
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.48 2.40 2.33 2.29 2.25 2.20 2.16 2.11 2.07
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.42 2.35 2.28 2.24 2.19 2.15 2.11 2.06 2.01
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 2.38 2.31 2.23 2.19 2.15 2.10 2.06 2.01 1.96
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.34 2.27 2.19 2.15 2.11 2.06 2.02 1.97 1.92
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 2.31 2.23 2.16 2.11 2.07 2.03 1.98 1.93 1.88
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.28 2.20 2.12 2.08 2.04 1.99 1.95 1.90 1.84
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.25 2.18 2.10 2.05 2.01 1.95 1.92 1.87 1.81
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.23 2.15 2.07 2.03 1.98 1.94 1.89 1.84 1.78
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 2.20 2.13 2.05 2.01 1.96 1.91 1.86 1.81 1.76
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.18 2.11 2.03 1.98 1.94 1.89 1.84 1.79 1.73
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 2.16 2.09 2.01 1.96 1.92 1.87 1.82 1.77 1.71
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 2.15 2.07 1.99 1.95 1.90 1.85 1.80 1.75 1.69
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 2.13 2.06 1.97 1.93 1.88 1.84 1.79 1.73 1.67
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 2.12 2.04 1.96 1.91 1.87 1.82 1.77 1.71 1.65
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 2.10 2.03 1.94 1.90 1.85 1.81 1.75 1.70 1.64
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 2.09 2.01 1.93 1.89 1.84 1.79 1.74 1.68 1.62
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 2.00 1.92 1.84 1.79 1.74 1.69 1.64 1.58 1.51
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.92 1.84 1.75 1.70 1.65 1.59 1.53 1.47 1.39
120 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96 1.91 1.83 1.75 1.66 1.61 1.55 1.50 1.43 1.35 1.25
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.75 1.67 1.57 1.52 1.46 1.39 1.32 1.22 1.00
Source: E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, 3rd ed. (London: University College, 1966). (Reproduced by permission of the
Biometrika trustees.)
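The F percentage points can be checked numerically from the regularized incomplete beta function, since P(F ≤ f) = I_x(v1/2, v2/2) with x = v1·f/(v1·f + v2). Below is a slow, stdlib-only sketch (Simpson's rule plus bisection; the helper names are my own), not the method used to compute the table; it is adequate when both v1/2 and v2/2 are at least 1.

```python
import math

def beta_cdf(x, a, b, steps=4000):
    """Regularized incomplete beta I_x(a, b) by Simpson's rule
    (adequate for a, b >= 1, where the integrand has no singularity)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps  # steps is even, as Simpson's rule requires
    f = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += f(i * h) * (4 if i % 2 else 2)
    area = total * h / 3.0
    return area * math.gamma(a + b) / (math.gamma(a) * math.gamma(b))

def f_quantile(p, v1, v2):
    """Quantile f with P(F(v1, v2) <= f) = p, by bisection on the CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        x = v1 * mid / (v1 * mid + v2)
        if beta_cdf(x, v1 / 2, v2 / 2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Checks against the v2 = 10 row of Table A.12a:
# f_quantile(0.95, 4, 10) ≈ 3.48 and f_quantile(0.95, 2, 10) ≈ 4.10.
```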
Table A.12b F distribution, upper 2.5 percent points (F0.975) (one-sided).
v1 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 40 60 120 ∞
v2
1 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.7 963.3 968.6 976.7 984.9 993.1 997.2 1001 1006 1010 1014 1018
2 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.41 39.43 39.45 39.46 39.46 39.47 39.48 39.49 39.50
3 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.34 14.25 14.17 14.12 14.08 14.04 13.99 13.95 13.90
4 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.75 8.66 8.56 8.51 8.46 8.41 8.36 8.31 8.26
5 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.52 6.43 6.33 6.28 6.23 6.18 6.12 6.07 6.02
6 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.37 5.27 5.17 5.12 5.07 5.01 4.96 4.90 4.85
7 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76 4.67 4.57 4.47 4.42 4.36 4.31 4.25 4.20 4.14
8 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.20 4.10 4.00 3.95 3.89 3.84 3.78 3.73 3.67
9 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.87 3.77 3.67 3.61 3.56 3.51 3.45 3.39 3.33
10 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.62 3.52 3.42 3.37 3.31 3.26 3.20 3.14 3.08
11 6.72 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53 3.43 3.33 3.23 3.17 3.12 3.06 3.00 2.94 2.88
12 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.28 3.18 3.07 3.02 2.96 2.91 2.85 2.79 2.72
13 6.41 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25 3.15 3.05 2.95 2.89 2.84 2.78 2.72 2.66 2.60
14 6.30 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15 3.05 2.95 2.84 2.79 2.73 2.67 2.61 2.55 2.49
15 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.96 2.86 2.76 2.70 2.64 2.59 2.52 2.46 2.40
16 6.12 4.69 4.08 3.73 3.50 3.34 3.22 3.12 3.05 2.99 2.89 2.79 2.68 2.63 2.57 2.51 2.45 2.38 2.32
17 6.04 4.62 4.01 3.66 3.44 3.28 3.16 3.06 2.98 2.92 2.82 2.72 2.62 2.56 2.50 2.44 2.38 2.32 2.25
18 5.98 4.56 3.95 3.61 3.38 3.22 3.10 3.01 2.93 2.87 2.77 2.67 2.56 2.50 2.44 2.38 2.32 2.26 2.19
19 5.92 4.51 3.90 3.56 3.33 3.17 3.05 2.96 2.88 2.82 2.72 2.62 2.51 2.45 2.39 2.33 2.27 2.20 2.13
20 5.87 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.77 2.68 2.57 2.46 2.41 2.35 2.29 2.22 2.16 2.09
21 5.83 4.42 3.82 3.48 3.25 3.09 2.97 2.87 2.80 2.73 2.64 2.53 2.42 2.37 2.31 2.25 2.18 2.11 2.04
22 5.79 4.38 3.78 3.44 3.22 3.05 2.93 2.84 2.76 2.70 2.60 2.50 2.39 2.33 2.27 2.21 2.14 2.08 2.00
23 5.75 4.35 3.75 3.41 3.18 3.02 2.90 2.81 2.73 2.67 2.57 2.47 2.36 2.30 2.24 2.18 2.11 2.04 1.97
24 5.72 4.32 3.72 3.38 3.15 2.99 2.87 2.78 2.70 2.64 2.54 2.44 2.33 2.27 2.21 2.15 2.08 2.01 1.94
25 5.69 4.29 3.69 3.35 3.13 2.97 2.85 2.75 2.68 2.61 2.51 2.41 2.30 2.24 2.18 2.12 2.05 1.98 1.91
26 5.66 4.27 3.67 3.33 3.10 2.94 2.82 2.73 2.65 2.59 2.49 2.39 2.28 2.22 2.16 2.09 2.03 1.95 1.88
27 5.63 4.24 3.65 3.31 3.08 2.92 2.80 2.71 2.63 2.57 2.47 2.36 2.25 2.19 2.13 2.07 2.00 1.93 1.85
28 5.61 4.22 3.63 3.29 3.06 2.90 2.78 2.69 2.61 2.55 2.45 2.34 2.23 2.17 2.11 2.05 1.98 1.91 1.83
29 5.59 4.20 3.61 3.27 3.04 2.88 2.76 2.67 2.59 2.53 2.43 2.32 2.21 2.15 2.09 2.03 1.96 1.89 1.81

30 5.57 4.18 3.59 3.25 3.03 2.87 2.75 2.65 2.57 2.51 2.41 2.31 2.20 2.14 2.07 2.01 1.94 1.87 1.79
40 5.42 4.05 3.46 3.13 2.90 2.74 2.62 2.53 2.45 2.39 2.29 2.18 2.07 2.01 1.94 1.88 1.80 1.72 1.64
60 5.29 3.93 3.34 3.01 2.79 2.63 2.51 2.41 2.33 2.27 2.17 2.06 1.94 1.88 1.82 1.74 1.67 1.58 1.48
120 5.15 3.80 3.23 2.89 2.67 2.52 2.39 2.30 2.22 2.16 2.05 1.94 1.82 1.76 1.69 1.61 1.53 1.43 1.31
∞ 5.02 3.69 3.12 2.79 2.57 2.41 2.29 2.19 2.11 2.05 1.94 1.83 1.71 1.64 1.57 1.48 1.39 1.27 1.00

Source: E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, 3rd ed. (London: University College, 1966). (Reproduced by permission of the
Biometrika trustees.)
Table A.12c F distribution, upper one percent points (F0.99) (one-sided).
v1 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 40 60 120 ∞

v2
1 4052 4999.5 5403 5625 5764 5859 5928 5982 6022 6056 6106 6157 6209 6235 6261 6287 6313 6339 6366
2 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40 99.42 99.43 99.45 99.46 99.47 99.47 99.48 99.49 99.50
3 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23 27.05 26.87 26.69 26.60 26.50 26.41 26.32 26.22 26.13
4 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.37 14.20 14.02 13.93 13.84 13.75 13.65 13.56 13.46
5 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05 9.89 9.72 9.55 9.47 9.38 9.29 9.20 9.11 9.02
6 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.72 7.56 7.40 7.31 7.23 7.14 7.06 6.97 6.88
7 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.47 6.31 6.16 6.07 5.99 5.91 5.82 5.74 5.65
8 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.67 5.52 5.36 5.28 5.20 5.12 5.03 4.95 4.86
9 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 5.11 4.96 4.81 4.73 4.65 4.57 4.48 4.40 4.31
10 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.71 4.56 4.41 4.33 4.25 4.17 4.08 4.00 3.91
11 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54 4.40 4.25 4.10 4.02 3.94 3.86 3.78 3.69 3.60
12 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.16 4.01 3.86 3.78 3.70 3.62 3.54 3.45 3.36
13 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10 3.96 3.82 3.66 3.59 3.51 3.43 3.34 3.25 3.17
14 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 3.94 3.80 3.66 3.51 3.43 3.35 3.27 3.18 3.09 3.00
15 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.67 3.52 3.37 3.29 3.21 3.13 3.05 2.96 2.87
16 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69 3.55 3.41 3.26 3.18 3.10 3.02 2.93 2.84 2.75
17 8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59 3.46 3.31 3.16 3.08 3.00 2.92 2.83 2.75 2.65
18 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60 3.51 3.37 3.23 3.08 3.00 2.92 2.84 2.75 2.66 2.57
19 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43 3.30 3.15 3.00 2.92 2.84 2.76 2.67 2.58 2.49
20 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37 3.23 3.09 2.94 2.86 2.78 2.69 2.61 2.52 2.42
21 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40 3.31 3.17 3.03 2.88 2.80 2.72 2.64 2.55 2.46 2.36
22 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26 3.12 2.98 2.83 2.75 2.67 2.58 2.50 2.40 2.31
23 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21 3.07 2.93 2.78 2.70 2.62 2.54 2.45 2.35 2.26
24 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17 3.03 2.89 2.74 2.66 2.58 2.49 2.40 2.31 2.21
25 7.77 5.57 4.68 4.18 3.85 3.63 3.46 3.32 3.22 3.13 2.99 2.85 2.70 2.62 2.54 2.45 2.36 2.27 2.17
26 7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18 3.09 2.96 2.81 2.66 2.58 2.50 2.42 2.33 2.23 2.13
27 7.68 5.49 4.60 4.11 3.78 3.56 3.39 3.26 3.15 3.06 2.93 2.78 2.63 2.55 2.47 2.38 2.29 2.20 2.10
28 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3.23 3.12 3.03 2.90 2.75 2.60 2.52 2.44 2.35 2.26 2.17 2.06
29 7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.09 3.00 2.87 2.73 2.57 2.49 2.41 2.33 2.23 2.14 2.03
30 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98 2.84 2.70 2.55 2.47 2.39 2.30 2.21 2.11 2.01
40 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89 2.80 2.66 2.52 2.37 2.29 2.20 2.11 2.02 1.92 1.80
60 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63 2.50 2.35 2.20 2.12 2.03 1.94 1.84 1.73 1.60
120 6.85 4.79 3.95 3.48 3.17 2.96 2.79 2.66 2.56 2.47 2.34 2.19 2.03 1.95 1.86 1.76 1.66 1.53 1.38
∞ 6.63 4.61 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32 2.18 2.04 1.88 1.79 1.70 1.59 1.47 1.32 1.00
Source: E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, 3rd ed. (London: University College, 1966). (Reproduced by permission of the
Biometrika trustees.)
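In use, these one-sided upper percentage points are compared against the variance ratio with the larger sample variance in the numerator. The sketch below is illustrative only: the function name, data, and the hard-coded critical value F0.99(6, 6) = 8.47 (from Table A.12c) are this example's own choices, not part of the table itself.

```python
import statistics

def f_ratio(sample1, sample2):
    """Variance ratio F* with the larger sample variance in the numerator."""
    v1 = statistics.variance(sample1)   # sample variance, n - 1 divisor
    v2 = statistics.variance(sample2)
    return max(v1, v2) / min(v1, v2)

# Two small samples of a measured characteristic (illustrative values only)
a = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0]   # n = 7, so df = 6
b = [10.3, 11.2, 9.1, 10.8, 8.9, 11.5, 9.6]    # n = 7, so df = 6

F_star = f_ratio(a, b)
F_crit = 8.47                   # F0.99 for (6, 6) df, from Table A.12c
significant = F_star > F_crit   # True for these data: variances differ at the 0.01 level
```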

Table A.13 Critical values of the Tukey-Duckworth sum.


Approximate risk    Two-sided critical value of the sum, a + b    One-sided critical value of the sum
0.09                 6                                             5
0.05                 7                                             6
0.01                10                                             9
0.001               13                                            12
Source: John W. Tukey, “A Quick, Compact, Two-Sample Test to Duckworth’s
Specifications,” Technometrics 1, no. 1 (February 1959): 31–48. (Reproduced
by permission.)
The critical 0.09 value was given by Peter C. Dickinson. One-sided values were
contributed by Dr. Larry Rabinowitz.
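The sum in Table A.13 is the Tukey–Duckworth end count: when one sample contains the overall maximum and the other the overall minimum, count the values in the first sample above the other's maximum (a) and the values in the second below the first's minimum (b). A simplified sketch follows; the function name and data are illustrative, and boundary ties are simply not counted rather than counted as 1/2 as in Tukey's full procedure.

```python
def tukey_duckworth_sum(x, y):
    """End-count sum a + b for the Tukey-Duckworth two-sample test.

    Defined only when one sample holds the overall maximum and the
    other the overall minimum; returns 0 when the test does not apply.
    """
    if max(x) >= max(y) and min(x) > min(y):
        hi, lo = x, y
    elif max(y) >= max(x) and min(y) > min(x):
        hi, lo = y, x
    else:
        return 0  # one sample holds both extremes; test not applicable
    a = sum(1 for v in hi if v > max(lo))   # end count at the high end
    b = sum(1 for v in lo if v < min(hi))   # end count at the low end
    return a + b

x = [14, 15, 16, 17, 18, 19]
y = [10, 11, 12, 13, 14]
s = tukey_duckworth_sum(x, y)
# s = 9 here, which exceeds the two-sided critical value 7
# (risk about 0.05) in Table A.13
```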

Table A.14 Values of Hα, k = 2, ANOM (two-tailed test).


df α = 0.10 0.05 0.01
2 2.06 3.04 7.02
3 1.66 2.25 4.13
4 1.51 1.96 3.26
5 1.42 1.82 2.85
6 1.37 1.73 2.62
7 1.34 1.67 2.47
8 1.32 1.63 2.37
9 1.30 1.60 2.30
10 1.28 1.58 2.24
12 1.26 1.54 2.16
15 1.24 1.51 2.08
18 1.23 1.49 2.04
20 1.22 1.48 2.01
25 1.21 1.46 1.97
30 1.20 1.44 1.94
40 1.19 1.43 1.91
60 1.18 1.41 1.88
∞ 1.16 1.39 1.82

Table A.15 Distribution of Student’s t (two-tail).


Values of t corresponding to selected probabilities. Each
probability is the sum of two equal areas under the two tails
of the t curve. For example, the probability is 0.05 = 2(0.025)
that a difference with df = 20 would have |t| ≥ 2.09.

[Figure: two-tailed t distribution curve with shaded tail areas of .025 at –t and +t]

Probability
df .10 .05 .02 .01
6 1.94 2.45 3.14 3.71
7 1.90 2.37 3.00 3.50
8 1.86 2.31 2.90 3.35
9 1.83 2.26 2.82 3.25
10 1.81 2.23 2.76 3.17
11 1.80 2.20 2.72 3.11
12 1.78 2.18 2.68 3.06
13 1.77 2.16 2.65 3.01
14 1.76 2.15 2.62 2.98
15 1.75 2.13 2.60 2.95
20 1.73 2.09 2.52 2.85
25 1.70 2.06 2.49 2.79
30 1.70 2.04 2.46 2.75
50 1.68 2.01 2.40 2.68
∞ 1.645 1.960 2.326 2.576
Source: This table is a modification of the one by E. T. Federighi,
“Extended Tables of the Percentage Points of Student’s t
Distribution,” Journal of the American Statistical Association 54
(1959): 684. (Reproduced by permission of ASA.)
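The last (df = ∞) row of Table A.15 can be checked directly, since Student's t converges to the standard normal distribution as df grows. A two-tail probability p leaves p/2 in each tail, so the critical value is the (1 − p/2) quantile; the Python standard library's NormalDist reproduces the row.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

# With df = infinity, Student's t reduces to the normal distribution,
# so the 1 - p/2 normal quantile matches the last row of Table A.15.
for p, t_inf in [(.10, 1.645), (.05, 1.960), (.02, 2.326), (.01, 2.576)]:
    assert round(z.inv_cdf(1 - p / 2), 3) == t_inf
```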

Table A.16 Nonrandom uniformity, Nα (no standard given).


α = .05                    α = .01
k*  df: 10 15 30 ∞         k  df: 10 15 30 ∞
3 .20 .20 .20 .20 3 .09 .09 .09 .09
4 .35 .35 .35 .35 4 .19 .19 .20 .20
5 .46 .46 .46 .47 5 .29 .29 .29 .30
6 .55 .55 .56 .56 6 .37 .37 .38 .38
7 .62 .63 .64 .65 7 .43 .44 .45 .46
8 .69 .70 .70 .72 8 .49 .50 .51 .53
9 .74 .75 .77 .78 9 .54 .56 .57 .59
* k = number of means being compared.
Source: K. R. Nair, “The Distribution of the Extreme Deviate from the Sample Mean and Its Studentized Form,”
Biometrika 35 (1948): 118–44. (Reproduced by permission of the Biometrika trustees.)

Table A.17 Some blocked full factorials.


Design:    0    1    2    2A    3
Factors:   2    3    4    4     5
Blocks:    2    2    2    4     2
Runs:      4    8    16   16    32

Design 0 (AB confounded with blocks):
  B1: 1, ab
  B2: b, a

Design 1 (ABC confounded with blocks):
  B1: 1, ac, bc, ab
  B2: c, a, b, abc

Design 2 (ABCD confounded with blocks):
  B1: 1, ad, bd, ab, cd, ac, bc, abcd
  B2: d, a, b, abd, c, acd, bcd, abc

Design 2A (AD, ABC, BCD confounded with blocks):
  B1: 1, bc, abd, acd
  B2: a, abc, bd, cd
  B3: b, c, ad, abcd
  B4: d, bcd, ab, ac

Design 3 (ABCDE confounded with blocks):
  B1: 1, ae, be, ab, ce, ac, bc, abce, de, ad, bd, abde, cd, acde, bcde, abcd
  B2: e, a, b, abe, c, ace, bce, abc, d, ade, bde, abd, cde, acd, bcd, abcde

B1 = Block 1, B2 = Block 2, B3 = Block 3, B4 = Block 4
The interaction(s) confounded with blocks are shown for each design.
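The block assignments in Table A.17 follow a simple parity rule: a treatment combination goes into the principal block when it shares an even number of letters with the interaction confounded with blocks. That rule can be sketched in a few lines; the helper name is this sketch's own, and '1' denotes the run with all factors at their low level.

```python
from itertools import combinations

def blocks(factors, defining_word):
    """Split a 2^k full factorial into two blocks by the parity of the
    number of letters each treatment shares with the confounded word."""
    letters = [chr(ord('a') + i) for i in range(factors)]
    runs = ['1']  # '1' denotes all factors at their low level
    for r in range(1, factors + 1):
        runs += [''.join(c) for c in combinations(letters, r)]
    word = set(defining_word.lower())
    b1 = [t for t in runs if len(set(t) & word) % 2 == 0]  # even parity
    b2 = [t for t in runs if len(set(t) & word) % 2 == 1]  # odd parity
    return b1, b2

# Design 1 of Table A.17: three factors, ABC confounded with blocks
b1, b2 = blocks(3, 'ABC')
# b1 contains 1, ab, ac, bc and b2 contains a, b, c, abc
```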
Table A.18 Some fractional factorials.

Design:      0     1     2     4     3
Resolution:  II    III   IV    III   V
Factors:     2     3     4     5     5
Fraction:    1/2   1/2   1/2   1/4   1/2
Runs:        2     4     8     8     16

Each design lists TRT (treatments in Yates order) and EFF (the effect
estimated for the corresponding row in Yates analysis, ignoring
interactions of more than two factors); I = defining relation.

Design 0, I = AB:
  (1)      T
  a(b)     A + B

Design 1, I = –ABC:
  (1)      T
  a(c)     A – BC
  b(c)     B – AC
  ab       AB – C

Design 2, I = ABCD:
  (1)      T
  a(d)     A
  b(d)     B
  ab       AB + CD
  c(d)     C
  ac       AC + BD
  bc       BC + AD
  abc(d)   D

Design 4, I = –BCE = –ADE = ABCD:
  (1)      T
  a(d)     A – DE
  b(de)    B – CE
  ab(e)    AB + CD
  c(de)    C – BE
  ac(e)    AC + BD
  bc       –E + BC + AD
  abc(d)   D – AE

Design 3, I = –ABCDE:
  (1)      T
  a(e)     A
  b(e)     B
  ab       AB
  c(e)     C
  ac       AC
  bc       BC
  abc(e)   –DE
  d(e)     D
  ad       AD
  bd       BD
  abd(e)   –CE
  cd       CD
  acd(e)   –BE
  bcd(e)   –AE
  abcd     –E
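The aliases shown in the EFF column come from multiplying each effect by a word of the defining relation, with repeated letters cancelling (A·A = I). That multiplication is just the symmetric difference of the two sets of letters; the sketch below (function name ours, signs ignored) reproduces the alias pairs for Design 2.

```python
def alias(effect, defining_word):
    """Generalized interaction of an effect with a word of the defining
    relation: letters appearing in both cancel, so the alias is the
    symmetric difference of the two letter sets (signs not tracked)."""
    return ''.join(sorted(set(effect) ^ set(defining_word)))

# Design 2 of Table A.18 has I = ABCD, so AB is aliased with CD
# and each main effect with a three-factor interaction:
ab_alias = alias('AB', 'ABCD')   # 'CD'
a_alias = alias('A', 'ABCD')     # 'BCD'
```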
Table A.19 Sidak factors for analysis of means for treatment effects, hα* (two-sided).
Significance level = 0.10
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 24 30 40 60
2 2.920
3 2.352 3.683
4 2.132 3.148 3.450
5 2.015 2.881 3.128 3.326
6 1.943 2.723 2.938 3.109 3.252
7 1.895 2.618 2.814 2.968 3.096 3.206
8 1.860 2.544 2.726 2.869 2.987 3.088 3.175
9 1.833 2.488 2.661 2.796 2.907 3.001 3.083 3.155
10 1.812 2.446 2.611 2.739 2.845 2.934 3.011 3.080 3.142
11 1.796 2.412 2.571 2.695 2.796 2.881 2.955 3.021 3.079 3.133
12 1.782 2.384 2.539 2.658 2.756 2.838 2.910 2.973 3.029 3.080 3.127
13 1.771 2.361 2.512 2.628 2.723 2.803 2.872 2.933 2.988 3.037 3.082 3.124
14 1.761 2.342 2.489 2.603 2.696 2.774 2.841 2.900 2.953 3.001 3.045 3.085 3.122
15 1.753 2.325 2.470 2.582 2.672 2.748 2.814 2.872 2.924 2.970 3.013 3.052 3.088 3.122
16 1.746 2.311 2.453 2.563 2.652 2.726 2.791 2.848 2.898 2.944 2.985 3.024 3.059 3.092 3.122
17 1.740 2.298 2.439 2.547 2.634 2.708 2.771 2.826 2.876 2.921 2.962 2.999 3.034 3.066 3.096 3.124
18 1.734 2.287 2.426 2.532 2.619 2.691 2.753 2.808 2.857 2.901 2.941 2.977 3.011 3.043 3.072 3.100 3.126
19 1.729 2.277 2.415 2.520 2.605 2.676 2.738 2.791 2.839 2.883 2.922 2.958 2.992 3.023 3.052 3.079 3.104 3.129
20 1.725 2.269 2.405 2.508 2.593 2.663 2.724 2.777 2.824 2.867 2.906 2.941 2.974 3.005 3.033 3.060 3.085 3.109 3.132
24 1.711 2.241 2.373 2.473 2.554 2.622 2.680 2.731 2.777 2.818 2.855 2.889 2.920 2.949 2.976 3.002 3.026 3.048 3.070 3.146
30 1.697 2.215 2.342 2.439 2.517 2.582 2.638 2.687 2.731 2.770 2.805 2.838 2.868 2.895 2.921 2.946 2.968 2.990 3.010 3.082 3.169
40 1.684 2.189 2.312 2.406 2.481 2.544 2.597 2.644 2.686 2.723 2.757 2.788 2.817 2.843 2.868 2.891 2.913 2.933 2.952 3.021 3.103 3.208
60 1.671 2.163 2.283 2.373 2.446 2.506 2.558 2.603 2.643 2.678 2.711 2.740 2.768 2.793 2.816 2.838 2.859 2.878 2.897 2.962 3.040 3.139 3.276
120 1.658 2.138 2.254 2.342 2.411 2.469 2.519 2.562 2.600 2.635 2.666 2.694 2.720 2.744 2.767 2.787 2.807 2.826 2.843 2.904 2.979 3.072 3.201
∞ 1.645 2.114 2.226 2.311 2.378 2.434 2.482 2.523 2.560 2.592 2.622 2.649 2.674 2.697 2.718 2.738 2.757 2.774 2.791 2.849 2.920 3.008 3.129
Significance level = 0.05
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 24 30 40 60
2 4.303
3 3.179 4.804
4 2.776 3.936 4.283
5 2.570 3.517 3.789 4.008
6 2.447 3.273 3.505 3.689 3.843
7 2.365 3.115 3.321 3.484 3.619 3.735
8 2.306 3.004 3.193 3.341 3.464 3.569 3.661
9 2.262 2.923 3.099 3.237 3.351 3.448 3.532 3.607
10 2.228 2.860 3.027 3.157 3.264 3.355 3.434 3.504 3.567
11 2.201 2.811 2.970 3.094 3.196 3.282 3.357 3.424 3.483 3.537
12 2.179 2.770 2.924 3.044 3.141 3.224 3.296 3.359 3.416 3.467 3.515
13 2.160 2.737 2.886 3.002 3.096 3.176 3.245 3.306 3.360 3.410 3.455 3.497
14 2.145 2.709 2.854 2.967 3.058 3.135 3.202 3.261 3.314 3.362 3.406 3.446 3.483
15 2.131 2.685 2.827 2.937 3.026 3.101 3.166 3.224 3.275 3.321 3.363 3.402 3.438 3.472
16 2.120 2.665 2.804 2.911 2.998 3.072 3.135 3.191 3.241 3.286 3.327 3.365 3.400 3.433 3.464
17 2.110 2.647 2.783 2.889 2.974 3.046 3.108 3.163 3.212 3.256 3.296 3.333 3.367 3.399 3.429 3.457
18 2.101 2.631 2.766 2.869 2.953 3.024 3.085 3.138 3.186 3.229 3.269 3.305 3.338 3.370 3.399 3.426 3.452
19 2.093 2.617 2.750 2.852 2.934 3.004 3.064 3.116 3.163 3.206 3.245 3.280 3.313 3.343 3.372 3.399 3.424 3.448
20 2.086 2.605 2.736 2.836 2.918 2.986 3.045 3.097 3.143 3.185 3.223 3.258 3.290 3.320 3.348 3.375 3.399 3.423 3.445
24 2.064 2.566 2.692 2.788 2.866 2.931 2.988 3.037 3.081 3.121 3.157 3.190 3.220 3.249 3.275 3.300 3.323 3.345 3.366 3.440
30 2.042 2.528 2.649 2.742 2.816 2.878 2.932 2.979 3.021 3.058 3.092 3.124 3.153 3.180 3.205 3.228 3.250 3.271 3.291 3.360 3.445
40 2.021 2.492 2.608 2.696 2.768 2.827 2.878 2.923 2.963 2.998 3.031 3.060 3.088 3.113 3.137 3.159 3.180 3.199 3.218 3.283 3.363 3.464
60 2.000 2.456 2.568 2.653 2.721 2.777 2.826 2.869 2.906 2.940 2.971 2.999 3.025 3.049 3.071 3.092 3.112 3.130 3.148 3.210 3.284 3.379 3.511
120 1.980 2.422 2.529 2.610 2.675 2.729 2.776 2.816 2.852 2.884 2.913 2.940 2.965 2.987 3.008 3.028 3.047 3.064 3.081 3.139 3.209 3.298 3.421
∞ 1.960 2.388 2.491 2.569 2.631 2.683 2.727 2.766 2.800 2.830 2.858 2.883 2.906 2.928 2.948 2.967 2.984 3.001 3.016 3.071 3.137 3.220 3.335
Significance level = 0.01
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 24 30 40 60
2 9.925
3 5.795 8.399
4 4.594 6.213 6.703
5 4.029 5.232 5.585 5.870
6 3.706 4.690 4.971 5.196 5.385
7 3.499 4.351 4.589 4.779 4.937 5.073
8 3.355 4.119 4.329 4.496 4.635 4.753 4.858
9 3.250 3.951 4.143 4.293 4.418 4.525 4.618 4.701
10 3.169 3.825 4.002 4.141 4.255 4.353 4.439 4.515 4.583
11 3.106 3.726 3.892 4.022 4.129 4.220 4.300 4.370 4.434 4.491
12 3.055 3.647 3.804 3.927 4.028 4.114 4.189 4.255 4.315 4.369 4.418
13 3.012 3.582 3.732 3.850 3.946 4.028 4.099 4.162 4.218 4.269 4.316 4.360
14 2.977 3.528 3.673 3.785 3.878 3.956 4.024 4.084 4.138 4.187 4.232 4.273 4.311
15 2.947 3.482 3.622 3.731 3.820 3.895 3.961 4.019 4.070 4.117 4.160 4.200 4.237 4.271
16 2.921 3.443 3.579 3.684 3.770 3.843 3.907 3.963 4.013 4.058 4.100 4.138 4.173 4.206 4.237
17 2.898 3.409 3.541 3.644 3.728 3.799 3.860 3.914 3.963 4.007 4.047 4.084 4.118 4.150 4.180 4.208
18 2.878 3.379 3.508 3.609 3.690 3.760 3.820 3.872 3.920 3.962 4.001 4.037 4.071 4.102 4.131 4.158 4.184
19 2.861 3.353 3.479 3.578 3.658 3.725 3.784 3.835 3.881 3.923 3.961 3.996 4.029 4.059 4.087 4.114 4.139 4.162
20 2.845 3.329 3.454 3.550 3.629 3.695 3.752 3.802 3.848 3.888 3.926 3.960 3.991 4.021 4.049 4.074 4.099 4.122 4.144
24 2.797 3.257 3.375 3.465 3.539 3.601 3.654 3.702 3.744 3.782 3.816 3.848 3.878 3.905 3.930 3.955 3.977 3.999 4.019 4.091
30 2.750 3.188 3.298 3.384 3.453 3.511 3.561 3.605 3.644 3.680 3.712 3.742 3.769 3.794 3.818 3.840 3.861 3.881 3.900 3.966 4.048
40 2.704 3.121 3.225 3.305 3.370 3.425 3.472 3.513 3.549 3.582 3.612 3.640 3.665 3.689 3.711 3.732 3.751 3.769 3.787 3.848 3.923 4.019
60 2.660 3.056 3.155 3.230 3.291 3.342 3.386 3.425 3.459 3.489 3.517 3.543 3.567 3.589 3.609 3.628 3.646 3.663 3.679 3.736 3.805 3.893 4.015
120 2.617 2.994 3.087 3.158 3.215 3.263 3.304 3.340 3.372 3.401 3.427 3.451 3.473 3.493 3.512 3.530 3.546 3.562 3.577 3.629 3.693 3.774 3.886
∞ 2.576 2.934 3.022 3.089 3.143 3.188 3.226 3.260 3.289 3.316 3.340 3.362 3.383 3.402 3.419 3.436 3.451 3.466 3.480 3.528 3.587 3.661 3.764

Significance level = 0.001
Number of means, k
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 24 30 40 60
2 31.599
3 12.386 17.362
4 8.497 11.161 11.968
5 6.834 8.608 9.131 9.555
6 5.945 7.282 7.668 7.978 8.239
7 5.402 6.488 6.796 7.042 7.249 7.427
8 5.038 5.965 6.225 6.431 6.603 6.752 6.882
9 4.779 5.598 5.825 6.004 6.153 6.282 6.394 6.494
10 4.586 5.327 5.530 5.690 5.823 5.937 6.037 6.126 6.206
11 4.436 5.119 5.304 5.451 5.571 5.675 5.765 5.845 5.918 5.984
12 4.317 4.954 5.126 5.262 5.373 5.469 5.552 5.626 5.692 5.753 5.808
13 4.221 4.821 4.983 5.110 5.214 5.303 5.380 5.449 5.511 5.567 5.619 5.666
14 4.140 4.712 4.865 4.984 5.083 5.166 5.239 5.304 5.362 5.415 5.463 5.508 5.550
15 4.073 4.620 4.766 4.879 4.973 5.053 5.122 5.183 5.238 5.288 5.334 5.376 5.415 5.452
16 4.015 4.542 4.681 4.790 4.880 4.956 5.022 5.081 5.133 5.181 5.224 5.265 5.302 5.337 5.370
17 3.965 4.474 4.609 4.714 4.800 4.873 4.937 4.993 5.043 5.089 5.131 5.169 5.205 5.238 5.270 5.299
18 3.922 4.416 4.546 4.648 4.731 4.801 4.863 4.917 4.965 5.009 5.049 5.087 5.121 5.153 5.183 5.211 5.238
19 3.883 4.365 4.491 4.590 4.670 4.738 4.798 4.850 4.897 4.940 4.978 5.014 5.048 5.079 5.108 5.135 5.161 5.185
20 3.849 4.319 4.443 4.538 4.617 4.683 4.741 4.791 4.837 4.878 4.916 4.951 4.983 5.013 5.041 5.067 5.092 5.116 5.138
24 3.745 4.181 4.294 4.382 4.453 4.514 4.566 4.613 4.654 4.692 4.726 4.757 4.786 4.814 4.839 4.863 4.885 4.907 4.927 4.999
30 3.646 4.049 4.153 4.234 4.299 4.355 4.402 4.445 4.482 4.516 4.547 4.576 4.602 4.627 4.650 4.671 4.692 4.711 4.729 4.794 4.873
40 3.551 3.925 4.020 4.094 4.154 4.204 4.248 4.286 4.321 4.351 4.380 4.405 4.429 4.451 4.472 4.492 4.510 4.527 4.544 4.602 4.673 4.764
60 3.460 3.806 3.894 3.962 4.017 4.063 4.103 4.137 4.168 4.196 4.222 4.245 4.267 4.287 4.306 4.323 4.340 4.355 4.370 4.423 4.486 4.568 4.682
120 3.373 3.694 3.775 3.837 3.887 3.929 3.965 3.997 4.025 4.051 4.074 4.095 4.115 4.133 4.150 4.165 4.180 4.194 4.208 4.255 4.312 4.385 4.487
∞ 3.291 3.588 3.662 3.719 3.765 3.803 3.836 3.865 3.891 3.914 3.935 3.954 3.972 3.988 4.003 4.018 4.031 4.044 4.056 4.098 4.149 4.215 4.306
* The values for k > 3 in this table are upper bounds as calculated by E. G. Schilling and D. Smialek [“Simplified Analysis of Means for Crossed and Nested
Experiments,” Proceedings of the 43rd Annual Quality Control Conference, Rochester Section, ASQC (March 10, 1987)] from the inequality of Z. Sidak
[“Rectangular Confidence Regions for the Means of Multivariate Normal Distributions,” Journal of the American Statistical Association 62 (1967): 626–33] as
suggested by L. S. Nelson [“Exact Critical Values for Use with the Analysis of Means,” Journal of Quality Technology 15, no. 1 (January 1983): 40–44]. For unequal
sample sizes, plot individual limits with the factor hα*[(N – ni)/(Nni)]^(1/2). Less conservative limits can be obtained using the factor mα[(N – ni)/(Nni)]^(1/2), where mα is the
upper alpha quantile of the Studentized maximum modulus (SMM) distribution described and tabulated by P. R. Nelson [“Multiple Comparisons of Means Using
Simultaneous Confidence Intervals,” Journal of Quality Technology 21, no. 4 (October 1989): 232–41] based on α, k, and ν degrees of freedom. These table values
have been recalculated by D. V. Neubauer.
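The Sidak inequality underlying these upper bounds converts a joint significance level α for k simultaneous comparisons into a per-comparison level of 1 − (1 − α)^(1/k). A minimal sketch (the function name is ours) shows the adjustment and how it compares with the more conservative Bonferroni value α/k:

```python
def sidak_alpha(alpha, k):
    """Per-comparison significance level from the Sidak inequality,
    so that k simultaneous comparisons have joint level alpha."""
    return 1 - (1 - alpha) ** (1 / k)

a10 = sidak_alpha(0.05, 10)
# Slightly less conservative than the Bonferroni value alpha/k:
# 0.05/10 = 0.00500, versus about 0.00512 from the Sidak adjustment
```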


Table A.20 Criteria for the ratio F* = ŝLT²/ŝST² for the X̄ chart with ng = 5.*
Total number of
observations
n α = 0.10 α = 0.05 α = 0.01 α = 0.001
5 2.297 3.004 4.913 7.187
10 1.809 2.163 2.949 4.131
15 1.642 1.908 2.492 3.293
20 1.554 1.770 2.184 2.688
25 1.484 1.666 2.129 2.716
30 1.440 1.592 1.929 2.334
35 1.400 1.552 1.848 2.286
40 1.361 1.498 1.782 2.112
45 1.358 1.482 1.734 2.008
50 1.330 1.442 1.659 1.827
55 1.316 1.427 1.650 1.894
60 1.300 1.399 1.585 1.910
65 1.280 1.380 1.589 1.808
70 1.270 1.367 1.550 1.858
75 1.262 1.357 1.539 1.806
80 1.252 1.337 1.520 1.719
85 1.243 1.328 1.498 1.649
90 1.246 1.325 1.489 1.699
95 1.228 1.301 1.445 1.636
100 1.232 1.308 1.452 1.614
110 1.212 1.277 1.420 1.623
120 1.209 1.270 1.403 1.539
130 1.194 1.256 1.380 1.526
140 1.189 1.253 1.371 1.490
150 1.182 1.241 1.359 1.500
160 1.179 1.231 1.343 1.476
* E. N. Cruthis and S. E. Rigdon, “Comparing Two Estimates of the Variance to Determine the Stability of the
Process,” Quality Engineering 5, no. 1 (1992–1993): 67–74.
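The ratio F* compares a long-term variance estimate (the overall sample variance) with a short-term estimate obtained from the average within-subgroup range R̄ through the d2 factor (d2 = 2.326 for subgroups of size ng = 5). A hedged sketch with illustrative data; the function name and numbers are this example's own.

```python
import statistics

D2 = 2.326  # d2 factor for subgroups of size ng = 5

def stability_ratio(subgroups):
    """F* = s_LT^2 / s_ST^2: overall (long-term) sample variance divided
    by the squared short-term estimate R-bar/d2 from subgroup ranges."""
    all_obs = [x for g in subgroups for x in g]
    s_lt_sq = statistics.variance(all_obs)               # long-term estimate
    r_bar = statistics.fmean(max(g) - min(g) for g in subgroups)
    s_st_sq = (r_bar / D2) ** 2                          # short-term estimate
    return s_lt_sq / s_st_sq

# Five subgroups of ng = 5 (illustrative data).  An F* near 1 suggests a
# stable process; compare with the n = 25 row of Table A.20.
groups = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.3, 9.7, 10.1, 9.9],
    [9.8, 10.0, 10.2, 9.9, 10.1],
    [10.2, 9.9, 10.0, 9.8, 10.1],
    [10.0, 10.1, 9.9, 10.2, 9.8],
]
F_star = stability_ratio(groups)   # about 0.70, below the 1.484 criterion
```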

Table A.21a Tolerance factors, K, using the standard deviation s to obtain intervals containing
P percent of the population with γ = 95 percent confidence, for samples of size n,
assuming a normal distribution.*
n P = 90% P = 95% P = 99% P = 99.9%
2 32.019 37.674 48.430 60.573
3 8.380 9.916 12.861 16.208
4 5.369 6.370 8.299 10.502
5 4.275 5.079 6.634 8.415
6 3.712 4.414 5.775 7.337
7 3.369 4.007 5.248 6.676
8 3.136 3.732 4.891 6.226
9 2.967 3.532 4.631 5.899
10 2.839 3.379 4.433 5.649
11 2.737 3.259 4.277 5.452
12 2.655 3.162 4.150 5.291
13 2.587 3.081 4.044 5.158
14 2.529 3.012 3.955 5.045
15 2.480 2.954 3.878 4.949
16 2.437 2.903 3.812 4.865
17 2.400 2.858 3.754 4.791
18 2.366 2.819 3.702 4.725
19 2.337 2.784 3.656 4.667
20 2.310 2.752 3.615 4.614
21 2.286 2.723 3.577 4.567
22 2.264 2.697 3.543 4.523
23 2.244 2.673 3.512 4.484
24 2.225 2.651 3.483 4.447
25 2.208 2.631 3.457 4.413
26 2.193 2.612 3.432 4.382
27 2.178 2.595 3.409 4.353
28 2.164 2.579 3.388 4.326
29 2.152 2.564 3.368 4.301
30 2.140 2.549 3.350 4.278
35 2.090 2.490 3.272 4.179
40 2.052 2.445 3.213 4.104
45 2.021 2.408 3.165 4.042
50 1.996 2.379 3.126 3.993
60 1.958 2.333 3.066 3.916
70 1.929 2.299 3.021 3.859
80 1.907 2.272 2.986 3.814
90 1.889 2.251 2.958 3.778
100 1.874 2.233 2.934 3.748
150 1.825 2.175 2.859 3.652
200 1.798 2.143 2.816 3.597
250 1.780 2.121 2.788 3.561
300 1.767 2.106 2.767 3.535
400 1.749 2.084 2.739 3.499
500 1.737 2.070 2.721 3.475
1000 1.709 2.036 2.676 3.418
∞ 1.645 1.960 2.576 3.291
* Selected values condensed from: Statistical Research Group, Columbia University, Techniques of Statistical
Analysis (New York: McGraw-Hill, 1947): 102–7.
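Applying a tolerance factor from Table A.21a is a one-line computation, x̄ ± Ks. The sketch below (names and data illustrative) uses K = 3.379, the table's entry for n = 10, P = 95 percent, γ = 95 percent confidence.

```python
import statistics

def tolerance_interval(data, K):
    """Two-sided tolerance interval x-bar +/- K*s, with K taken from
    Table A.21a for the chosen n, coverage P, and confidence."""
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)   # sample standard deviation, n - 1 divisor
    return xbar - K * s, xbar + K * s

# n = 10 observations; K = 3.379 gives 95 percent confidence that the
# interval covers at least 95 percent of the population (Table A.21a)
data = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7, 20.0, 20.2]
low, high = tolerance_interval(data, K=3.379)
```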


Table A.21b Tolerance factors K* using the average range R of samples of ng = 5 to obtain
intervals containing P percent of the population with γ = 95 percent confidence
assuming a normal distribution.
k P = 90% P = 95% P = 99% P = 99.9% n
4 0.999 1.190 1.563 1.996 20
5 0.961 1.145 1.505 1.921 25
6 0.934 1.113 1.463 1.868 30
7 0.914 1.089 1.431 1.829 35
8 0.898 1.070 1.406 1.797 40
9 0.885 1.055 1.386 1.771 45
10 0.874 1.042 1.369 1.749 50
11 0.865 1.031 1.355 1.731 55
12 0.857 1.022 1.343 1.715 60
13 0.851 1.013 1.332 1.702 65
14 0.844 1.006 1.322 1.689 70
15 0.839 1.000 1.314 1.679 75
16 0.834 0.994 1.307 1.669 80
17 0.830 0.989 1.300 1.660 85
18 0.826 0.984 1.294 1.652 90
19 0.823 0.980 1.288 1.645 95
20 0.819 0.976 1.283 1.639 100
25 0.806 0.960 1.262 1.612 125
30 0.796 0.949 1.247 1.593 150
40 0.783 0.933 1.226 1.567 200
50 0.774 0.923 1.213 1.549 250
75 0.761 0.907 1.192 1.523 375
100 0.753 0.898 1.180 1.507 500
∞ 0.707 0.843 1.107 1.415 ∞
(k = number of subgroups, n = total sample size)
* Condensed from R. S. Bingham, “Tolerance Limits for Process Capability Studies,” Industrial
Quality Control 19, no. 1 (July 1962): 38.
INDEX

Index Terms Links

acceptable process level (APL) 205 217


acceptable-quality level (AQL) 161 162 252
concept 163
acceptance control chart 204
for attributes 209
average run length 210
risks 204
selection and use of 242
acceptance control limit 205
acceptance sampling 139 153
accuracy 286 526
as bias 526
adaptive control charts 232
aliasing 299
pattern 300
alpha risk 55 56
analysis of fully nested designs 479
analysis of means (ANOM)
as an alternative to chi-square analysis 324
analysis of 2^p and 2^(p–1) designs 456
to analyze variability 507
case histories using 330
for count data 329
development of 516
distinguished from analysis of means for effects 470
exact factors, Hα , for one-way (A.8) 593
in half-replicate of a two-cubed design 367
for measurement data 434

This page has been reformatted by Knovel to provide easier navigation.


with no standard given 322
with no standard given, more than one
independent variable 469
with one independent variable with k levels 331
for proportions 324
relation to analysis of variance 477
when sample sizes are unequal 505
with standard given 318
t test compared with 419
with three independent variables 355
with three independent variables, in 2^3 factorial design 444
and transformations 408
for treatment effects 490
and Tukey procedure 413
with two independent variables 341
in 2^2 factorial design 436
for variables data 415
analysis of means for effects (ANOME) 470
for count data 503
for crossed experiments with multiple factors 484
distinguished from analysis of means 470
limits for 2^p experiments, calculation of 515
for main effects, one-way 515
for proportion data 498
analysis of ranges (ANOR) 549
analysis of two-factor crossed designs 470
analysis of variance (ANOVA)
between-group variation 440
F tests 440
relation to analysis of means 477
table 292
within-group variation 440

This page has been reformatted by Knovel to provide easier navigation.


Index Terms Links

analytic study 199


Analyze (A) 283 285
See also DMADV; DMAIC process
ANOM. See analysis of means
ANOME. See analysis of means for effects
ANOVA. See analysis of variance
ANSI/ASQ Z1.4 system 162 163
appraiser variation (AV) 534
%AV 538
plot 546
areas of differences 316
areas under the normal curve (A.1) 576
arithmetic average 12
arithmetic moving average 212 214
assignable causes 8
evidence of 65
identifying presence of 71
and process average 84
variability from 8
ASTM Manual on Presentation of Data 36
ATT Statistical Quality Control Handbook 249
attributes data 127 272
analysis of means, one independent variable 322
that approximate a Poisson distribution 139
multifactor experiments with 498
sequences of 316
troubleshooting with 315
autoregressive integrated moving average
(ARIMA) model 213
average, desired or specified 21n
average outgoing quality (AOQ) 158
computing 160
average outgoing quality limit (AOQL) 159
average range 32

average run length (ARL) 73 210


average run length curve 74

Barnard CUSUM chart 217 220 225 226


batch analyses 102
before-and-after study 394 395n 396
bell-shaped distribution 24
beta risk 55 56
between-factor variation 440
bias
accuracy as 526
in variability estimates 109
Bicking’s checklist 276
Bingham, R. S. 43
binomial distribution 323n
measure of variability for 132
binomial probability tables for n 130 582 (A.5)
simple probability 155
binomial theorem 128
Black Belts, Six Sigma 281
blocked full factorials (A.17) 607
blocking 298
Bonferroni inequality 516
Box–Cox transformations 405
box plots 38
bunch-type factors 270n

c control chart 141


capability and specifications 251
capability flow-up 284

capability index (Cp) 252 258


Cpk 253 257
causative variables 270
cause-and-effect diagram 263
cause-and-effect relationships 272
cell boundaries 8 9
central value 12
Champions, Six Sigma 281
characteristics diagram 263
chi-square analysis 60 324
as an alternative to analysis of means 324
chunky-type factors 270n
classification data 272
coding of data 17
combined Shewhart–CUSUM chart 228
common causes 6
compact disk (CD) 555
ANOM critical values 566
ANOM program for balanced experimental
designs 565
ANOM program for one-way analysis of attribute
and variables data 565
ANOM FORTRAN program for balanced
experimental designs 565
data sets and solutions to practice exercises 555
Excel 97 and 2000 viewer 556
Excel ANOM add-in 558
mica thickness data spreadsheet 557
statistical table generation within Excel 566
Word 97 and 2000 viewer 556
comparison of long-term and short-term variation 122
compressed gauges. See narrow-limit gauging
confidence 28 30
confidence intervals 27

consumer’s risk 162


continuous variables 271
as possible causative variables 432
Control (C) 283
See also DMAIC process
control chart limits 66
for samples of ng (A.4) 580
control charts 51 54
acceptance control charts 204
adaptive control charts 232
applying 241
arithmetic moving average charts 212 214
average run length 73
cumulative sum 216
exponentially weighted moving average
(EWMA) chart 213
keeping notes 146
manual adjustment charts 232
mechanics for preparing 63
median charts 201
midrange chart 201
modified limits chart 211
multivariate control charts 240
narrow limit chart 232
with no standard given 197
in process optimization programs 261
progression of 243
recommended criteria 67
selecting and applying 241
Shewhart 62
short run control charts 237
standard deviation chart 203
with standard given 197
for trends 88

types and uses 197
control limits 63
for exponentially weighted moving average 215
how to compute 200
for moving average 214
convenience samples 158
correlation 251 400
correlation coefficient 397 399
correlation matrix 402
count data, analysis of means for 329
critical-to-quality characteristics (CTQs) 282 283 284
crossed designs versus nested designs 530
crossed experiments, multiple factors 484
cumulative sum (CUSUM) charts 216
combined Shewhart–CUSUM chart 228
computational method 221 229
fast initial response (FIR) 228
scaling of 220
Shewhart chart versus 216 227
snub-nosed mask 227
special charts 227
standardized 228
V-mask 217 222 225 226
229
customer requirements 284
cycle time 286

data
causes for peculiarities 53
coding 17
collection 3 341
graphing 4

grouping when n is large 8
organizing 8
plotting 433
Poisson type 140
from a scientific or production process 54
statistical analysis 101
data collection plans, troubleshooting improved by 341
decision errors 397
decision lines
applicable to k points simultaneously 317
half-replicate of a two-cubed design 372
main effects and two-factor interaction 437 442
upper and lower 318
defect classification 164
defectives 127 128
defects
incentives to correct 166
reducing 280
Define (D) 282 284
See also DMADV; DMAIC process
defining contrast 301 302
defining relation 299
degrees of freedom (df) 109 114
and F ratio 119
values of adjusted d2 factor and (A.11) 599
demerit per unit 241
Deming,W. E. 199
dependent variables 270 272
Design (D) 285
See also DMADV
design elements 284
Design for Six Sigma (DFSS) 284
design matrix 389

design of experiments, statistical 287


design resolution 300
desired average 21n
desired measure of process variability 21n
determination 440
difference between two means 420
difference charts 238
digidot plot 95
disassembly and reassembly 383
discrimination ratio (DR) 537
distributions
nonnormal 7
normal 6
of sample averages 23
Dixon criteria for testing for extreme mean (A.9) 597
Dixon’s test
for a pair of outliers 106
for a single outlier 105
DMADV 284
DMAIC process 282
Dodge–Romig plans 162
dot plot 41
Duckworth. See Tukey–Duckworth procedure

economic significance 420


effective measurement resolution 554
effects 288
aliased 299
blocking 298
calculation of 288 293
contrasts 292
interaction of 289
machine 288

measure of variance associated with 292
operator 289 290
plotting 304
Yates method for calculating 291
enumerative study 199
equipment variation (EV) 534
%EV 538
error sum of squares 292
estimate 11
best linear 16
mean deviation from median 16
range 16
unbiased 16 109
estimating σ and σ² from data 111
evolutionary operation (EVOP) 432
experimental designs, multifactor 355
experimental error, nonreplicated design 306
experimental plan 287
experimentation
appropriateness and efficiency of 287
in statistical quality control 197
experiments, principles
and method of 267
exploratory studies 431
exponentially weighted moving average
(EWMA) chart 213 215
use with manual adjustment charts 236

F, criteria for the ratio (A.20) 613


F distribution (A.12a, A.12b, A.12c) 602
F test (variance ratio test) 114 292 417

factor effects, plotting 304


See also effects
factor sparsity. See sparsity of effects
factorial design 341
a×b 341 346
three-by-four 347
factorials
some blocked full (A.17) 607
some fractional (A.18) 608
fast initial response (FIR) 228
feedback system 155
and outgoing product quality rating 167
sampling to provide 163
where it should begin 166
fishbone diagram 263
fixed effects 530
fixed stable standard deviation 21
FR (range–square–ratio test) 119
fractional factorials 299 608 (A.18)
frequency 8
frequency distribution 9 82
spread or variability of 13

gauge accuracy 539


gauge error, causes of 532
gauge linearity 539
gauge measurement capability, assessing 530
gauge R&R studies 530
graphical analysis 546
long method 532
gauge run chart 546
gauge stability 539
gauge system error 539

gauging. See go/no-go data; narrow-limit gauging
Gaussian distribution 24
geometric moving average chart 213
See also exponentially weighted
moving average chart
selection and use of 242
go/no-go data 127
advantages and disadvantages 180
grab sample 153
graphical analysis of means, advantages 571
Green Belts, Six Sigma 281
grouped frequency distribution 8
Grubbs criteria for testing two outliers 106 598 (A.10)

half-normal probability plot 306


half-replicate of a two-cubed design
troubleshooting with attributes data 367
troubleshooting with variables data 450
higher-order interactions, calculating the differentials 484
histogram 8

Improve (I) 283


See also DMAIC process
independent random samples analysis, one
independent variable,
with no standard given 463
with standard given 462
independent variables. See also one independent
variable; three independent variables; two
independent variables

levels of 432
with more than two levels 461
types of 271
index of summation 12n
indifference, point of 161
individual batch analysis 102
inequality of variances 418
inherent variability 6
innovation, process improvement and 261
integrated moving average (IMA) 213
interaction of effects 289 295 344
plots 302
signs of 290
three-factor interactions 355
two-factor interactions 355
interaction plots 302
interactions, higher order 484
interpretation, in statistical quality control 197
investigation, principles and method of 267
investigations, suggestions in planning 432
ISO/TR 7871 CUSUM standard 217 226

k 44
K1 534n
K2 534n
K3 537n
Kepner and Tregoe approach for problem analysis 279 570

least squares 398


Lewis, S. S., and development of analysis of means 516


line of best fit 397


line of no change 396
linear estimate 16
lognormal distribution 7
long-term variation 35
lot 155
lot tolerance percent defective (LTPD) plans 161 162
lower action limit 90
lower control limits (LCL) 63 64 65 134
lower specification limit 90

“Magnificent 7” tools 569


main effect plots 302
main effects
in 2³ factorial design 355
and two-factor interaction 436
manual adjustment charts 232
assumptions, based on 236
Master Black Belts, Six Sigma 281
maverick. See outlier
mean,
arithmetic value of 12
computations of 14
cumulative sum chart for 223
Dixon criteria for testing for extreme (A.9) 597
mean deviation from median 16
mean squares (MS) 291n 292
means, differences between two 420
measure (M) 282 284
See also DMADV; DMAIC process
measurement data, analysis of means for 434
example 435
measurement error 528


measurement system
investigating 528
problems associated with 528
measurements as a process, assessing 525
median 12 18 157
mean deviation from the 16
probabilities associated with control charts 71
runs above and below 57 578 (A.2) 579 (A.3)
median chart
conversion to 200
in statistical process control 200
median range 200
median uncertainty 554
midrange 12
midrange chart 201
minimum average total inspection 161
modeling 284
modified control limits 211
moving average charts 212
arithmetic and geometric 212 213 214
selection and use of 242
moving range (MR) chart as a test for outliers 103
μ 21n
multifactor experimental designs 355
multifactor experiments with attributes data 498
multi-vari plot 546
multivariate control charts 240
selection and use of 242

n (sample size) 29 44 432
See also sample size (n)
ng (subgroup size) 44
narrow-limit control charts 232


narrow-limit gauging 180 232
basic assumptions for 182
OC curves of plans 187
optimal narrow limit plan 192
outline of a plan 181
for process capability 257
selection of a simple sampling plan 182 192
Nelson, L. S. 464 516
Nelson, P. R. 497 516 518
nested 479
nested designs
analysis of fully 479
versus crossed designs 530
nested factorial experiments 497
no standard given
analysis of k independent samples, one
independent variable 463
analysis of means with 322
analysis of means with more than one
independent variable 469
for control charts 197
nonrandom uniformity 515
nominal 238
nonconforming items 127
nonnormal distributions 7
nonrandom uniformity 512
objective test of 515
with no standard given 515
with standard given 513
nonrandom variability 319
scatter diagrams and 394
with standard given (A.7) 592
nonrandomness 53 54 269
of single variable 394


nonreplicated design 306


normal 24
normal curve 20
areas under 24 576 (A.1)
normal distribution 6
normal probability plots 18
of effects 303
notation 44
notes on control charts 146
number of distinct categories (NDC) 537

OC curves. See operating-characteristic curves


Olmstead, Paul 55
omnibus-type variables 270 341
100 percent inspection, sampling versus 153 165
one independent variable
analysis of k independent samples 462
analysis of means, attributes data 331
online acceptance sampling plans 163
online inspection stations 163
online quality control 195
operating-characteristic (OC) curves 71
associated with other criteria 77
computations associated with 75
of narrow-limit gauge plans 187
of a single sampling plan 156
of X̄ charts 72
OPQR. See outgoing
product quality rating
order statistics 105
Ott, E. R., and development of analysis of means 516
outages 64n 65 99n 102
134

outgoing product quality rating (OPQR) 167
chart 241
outliers 99
detecting reasons for 99
Dixon’s test for single 105
Grubbs criteria for testing two 106 598 (A.10)
objective tests for 103
tests for a pair 106
two suspected, on same end of a sample of n 106

p 44
Pareto analysis 167 250 263
part-to-part variation (PV) 536
%PV 539
parts per million (ppm) 252
narrow-limit gauging 257
patterns of data, troubleshooting strategies using 379
p-chart, stabilized 326
Pearson type III distribution 191
percent frequency 9
percent tolerance analysis 538
percent total variation analysis 538
plotting on normal probability paper 18
point of indifference 161
Poisson curves 141
Poisson distribution
and analysis of means for count data 329
attributes data, approximating 139
variation predicted in samples from 140
Poisson probability curves (A.6) 591
Poisson type data 140
power transformations 404
practical significance 420


precision 526
precontrol, in statistical process control 229
procedure 231
rules 230
schematic 231
prioritization 263
probability 30
probability of acceptance (Pa) 155
probability paper, plotting on normal 18
probability plot of effects
half-normal 306
normal 303
problem analysis, Kepner and Tregoe 279 570
problem finding, strategies in 272
problem identification 263
problem solving, strategies in 272
problem solving skills 278
process adjustments 233
process averages 21n
analysis of means 415
assignable causes producing shifting 84
comparing two 413
sample size needed to estimate 31
Tukey–Duckworth procedure 413
process capability, in statistical process control 196 249
estimation methods for nonnormal distributions 261
and specifications 251
process capability study 250 251
process change 196 262
statistical tool for 433
process control 122 196 197
objective 262
studies 199
process improvement 261


process improvement program 433


process improvement study 250 251
steps 262
process inputs 286
process maintenance and improvement 20
process misbehavior 397
process monitoring 233
process optimization programs 250
control charts and 261
process outputs 285
process performance 259
process performance check 250
process performance evaluation 250
process performance index (Pp) 259
relation to process capability 259
process problems, solving 278
process quality control 195 265
See also statistical
process control implementation
key aspects of 196
process regulation 233
process stability 128 134
process variability 21n
process variation 261
producer’s risk 162 163
production control chart 159n
profitability 286
proportions, analysis of means for 324

Q charts 239
quality 162
quality characteristics 260
as attributes 368

types of 272
quality control committee 174
Quality Function Deployment (QFD) 284
quartiles, estimating 40

r 44
random effects 530
random variation 56
randomization, rational subgrouping versus 199
R&R plot 546
%R&R 538
range (R) chart
as evidence of outliers 100
of moving ranges as test for outliers 103
range estimate 16
range of a sample 32
range–square–ratio test (FR) 114 119
rational subgrouping
versus randomization 199
rational subgroups 37 63 66 199
reassembly, disassembly and 383
reduced inspection 163
reference target value 526
regression 251
regression line 397n
rejectable process level (RPL) 205 218
rejects 127
relationship, measuring the degree of 397
relative frequency 8
repeatability 531 534
repeated observations 112
replicated observations 112


replicates 440
Yates method with 293 294
reproducibility 530 534
requirements flow-down 284
resolution 300
response 288
response variables 272
risks 55 162
Roberts, S. W. 213
run analysis 54 62
run criteria 51 56
run-down, longest 61 62
runs
above and below the median 57
average expected number of 59
average expected number of length s (A.3) 579
critical values of number of (A.2) 578
interpretations of 58
lengths of 59
standard deviation of 58
total number of 58
run-up, longest 61

SADE-Q 265
Salvosa tables 191
sample, how to 138
sample difference 420
sample size (n) 29 432
binomial probability tables 130 582 (A.5)
changes in 29
data from n observations consisting of k subsets of
ng = r 112
estimating standard deviation from one sample 111

required to estimate percent defective 128 135
required to estimate process average 31
unequal 505
sample variance (s²)
sampling
ANSI/ASQC Z1.4 163
and AOQ 158
minimizing total amount of inspection 161
reasons for 138
versus 100 percent inspection 153 165
sampling plans 154
to accept or reject 155
cost of 157
Dodge–Romig 163
feedback of information 163
as a feedback system 155
a good plan 157
important concepts 161
narrow-limit gauging, selection of a 182
quality of a plan 157
single, OC curves of a 156
tabulated 162
sampling variation, predicting 20
sampling versus 100 percent inspection 165
scatter diagrams 394 396 400 401
scatter plot matrix 402
Schilling, E. G., and development of analysis of means 516 517
Schilling and Sommers narrow-limit plans 190
scientific discovery 54
scientific process, data from 55
screening program for treatments 387
assumptions 388
examples 389

other screening strategies 393
theorem 388
semi-interquartile range 41
sequences of attributes data 316
Sheesley, J. H. 517
Shewhart, Walter 195 244 433 526
Shewhart control charts 62 227 229 433
selection and use of 242
short-run control charts 237
Bothe X̄, R charts 238
difference charts 238
Q charts 239
standardized charts 238
short-run X̄ and R charts 238
short-term variation 35
Sidak approximation 517
Sidak factors 515
for analysis of means for
treatment effects (A.19) 609
sigma hat 11n 13
See also standard deviation
sigma prime 21n
sin of commission 55
sin of omission 55
SIPOC model 285
Six Sigma
Black Belts 281
Champions 281
Green Belts 281
Master Black Belts 281
methodology 280 569
training 281
Smialek, D. C. 516


Snedecor, George W. 114


Snee, R. D. 517
snub-nosed mask 227
sparsity of effects 298
special causes 8
See also assignable causes
specification tolerance (TOL) 538
specifications and process capability 251
specified average 21n
specified measure of process variability 21n
stabilized p-chart 326
stable process 128 134
standard deviation
computing 13
estimating 19 111
fixed stable 21
predictions about 29
of runs 58
statistical efficiency of 110
theorems 23 34
variation of 30
standard deviation chart (s chart) 203
standardized charts 238
standardized CUSUM charts 228
standards given
analysis of k independent samples,
one independent variable 462
analysis of means with 318
for control charts 197
limits 462
nonrandom uniformity 513
nonrandom variability (A.7) 592
statistical control 233
criteria for 67


statistical design of experiments 287


statistical efficiency
of standard deviation 111
in variability estimates 109
statistical process control implementation
acceptance control charts 204
adaptive control charts 232
analysis of means for count data 329
analysis of means for measurement data 434
analysis of means for proportions 324
applying control charts 241
arithmetic and geometric moving-average charts 212
capabilities and specifications 251
check sequence for control chart implementation 244 245
cumulative sum charts 216
experimentation in 197
interpretation in 197
key aspects of process quality control 196
lifecycle of control chart application 244
median charts 200
modified control limits 211
multivariate control charts 240
narrow-limit control charts 232
OPQR charts 241
precontrol 229
prioritization 263
problem identification 263
process capability 196 249
process change 196 262
process control 196 197
process improvement 261
process optimization studies 250
progression of control charts 243
rational subgroups 199

selection of control charts 242
special control charts 240
standard deviation charts 203
T² control charts 240
use of control charts 242
statistical thinking 568
stem-and-leaf diagram 37 41 95
structure, design and 287
Student’s t test 417
compared with analysis of means 419
distribution of (A.15) 606
studentized maximum modulus (SMM) critical values 506
Sturges’rule 10
summation (Σ) 12
sums of squares (SS) 291
residual 440
Yates method for calculating 293

T² control charts 240 242


t test 417
compared with analysis of means 419
distribution of (A.15) 606
tally sheet 9
team charter 284
test programs, planning 276
theorems, standard deviation 23
three independent variables, analysis of means 355
three-factor interactions 355 386 450
tightened inspection plans 163
time sequences 51
diagnosing the behavior of the data 51
Tippett, L. H. C. 191


tokusei yoinzu 263


tolerance factors 43
using average range (A.21b) 615
using standard deviation (A.21a) 614
tolerance intervals 42
nonparametric 43
total process variation (TV) 537 538
transformations 403
and analysis of means 408
Box–Cox 405
power 404
use of 403
treatment combinations 290
treatment effects
analysis of means for 490
calculating for a higher-order interaction 484
Sidak factors for analysis of means for (A.19) 609
trends, charts for 88
action limits 89
basic procedure for establishing 94
control limits 92
estimation of tool life 94
forced intercept 93
trend line 90
trivial many 263
troubleshooting
with attributes data 315
basic ideas and methods of 269
comparing two process averages 413
improving with data collection plans 341
patterns of data and 379
principles of 3
special strategies in 379
statistical design of experiments, concepts of 287

strategies in 272
with variables data 431
true average 526
true value 526
Tukey, John 37
box plots 38
stem-and-leaf diagram 37
Tukey–Duckworth procedure 413
Tukey–Duckworth sum,
critical values of (A.13) 605
2³ factorial design 355
three-factor interactions 450
two-factor interactions 449
two independent variables, analysis of means 341
two-factor crossed designs, analysis of 470
two-factor interactions 355 386 449
2ᵖ designs, graphical analysis 302
2² factorial design 341

u chart 142
Ullman, N. 517
unbiased estimate 109 112
upper action limit 90
upper control limits (UCL) 63 64 65 134
upper specification limit 90

variabilities
comparing 506
comparing, of two populations 114
estimating and comparing 102


variability 6
analysis of means to analyze 507
from assignable causes 8
inherent in process 6
long term and short term 35
patterns 6
reducing 291
of sample measurements 526
variables
causative 270
continuous 271 432
dependent 270 272
independent. See independent variables
relationship of one to another 394
two populations, comparing 114
variables data
ideas from outliers 99
troubleshooting with 431
variance component designs 530
variance ratio test (F test) 114 292 417
variance-stabilizing transformations.
See power transformations
variation
and attributes data 133
comparison of long-term and short-term 122
expected, in stable process 128 133
long-term 35
measures of 13
random 56
sampling 20
short-term 35
of 30
of X̄ 29


Verify (V) 285
See also DMADV
vital few 263
V-mask 217 222 225 226
229
voice of the customer (VOC) 282

Welch–Aspin test 419


wild-shot. See outliers
within-group variation 440
within-subgroup variability 445
Wludyka, P. S. 518
word 300
worst case uncertainty 554

X̄ chart, converting to a median chart 200


X-bar 11n
X-bar prime 21n

Yates method 293
with fractional factorials 299
in larger experiments 301
steps 293
Yates order 291 293
Youden squares 517

Z-charts, variations 238
