ABSTRACT and indicate how they can be implemented with short SAS
programs.
Although the Shewhart chart serves well as the fundamental
tool for statistical process control (SPC) applications, its as A common theme in the sections that follow is the use
sumptions are challenged by many modern manufacturing of SAS procedures for statistical modeling in conjunction
environments. For example, when standard control limits with the SHEWHART procedure for control charting. This
are used in applications where the process is sampled fre tandem is highly effective in a variety of nonstandard SPC
quently, autocorrelation in the measurements can result in applications. The example SAS code provided in this paper
too many outofcontrol signals. Other situations considered is readily extended to SAS programs that can handle large
in this paper include the presence of multiple components numbers of processes. Likewise, the examples can be
of variation, the issue of short run process control, and the incorporated in customized pointandclick interfaces devel
effect of nonnormal process data. While some of these prob oped with SAS/AF software.
lems are subjects of ongoing research, statistical models
that extend the basic Shewhart model can provide effec AUTOCORRELATION IN PROCESS DATA
tive solutions. This paper demonstrates the use of SAS
procedures for statistical modeling in conjunction with the Autocorrelation has long been recognized as a natural phe
SHEWHART procedure. nomenon in process industries where parameters such as
temperature and pressure vary slowly relative to the rate at
which they are measured. Only in recent years has autocor
INTRODUCTION
relation become an issue in SPC applications, particularly
One of the main outcomes of the modern industrial qual in parts industries where autocorrelation is viewed as a
ity movement has been the widespread use of statistical problem that can undermine the interpretation of Shewhart
process control methods to eliminate special causes of vari charts. One reason for this concern is that, as automated
ation in manufacturing processes and to reduce common data collection becomes prevalent in parts industries, pro
causes of variation. Companies throughout the world have cesses are sampled more frequently and it is possible to
adopted the Shewhart chart as their main SPC method, and recognize autocorrelation that was previously undetected.
Shewhart charting is now taught and applied on a large Another reason, noted by Box and Kramer (1992), is that the
scale in both discrete components (parts) and continuous distinction between parts and process industries is becom
process industries. ing blurred in areas such as computer chip manufacturing.
For two other very readable discussions of this issue, refer
While the Shewhart chart is remarkably versatile, it is not to Schneider and Pruett (1994) and Woodall (1993).
the best solution for every SPC application. There is a
growing awareness among quality engineers that standard The standard Shewhart analysis of individual measure
control charts are inappropriate or of limited value in a ments assumes that the process operates with a constant
number of manufacturing situations, and questions such as mean , and that xt (the measurement at time t) can be
the following are being asked: represented as
xt = + t
“What can I do about autocorrelation in my process where t is a random displacement or error from the process
data?” mean . The errors are typically assumed to be statistically
“Is there a way to adjust control charts for multiple independent in the derivation of the conventional control
sources of variation?” limits that are displayed at three standard deviations above
“How can I do short run process control?” and below the center line, which represents an estimate for
.
“Are the standard control limits appropriate for non
normal data?” When Shewhart charts are constructed for measurements
that are autocorrelated, the result can be too many false sig
Most of these questions are subjects of current research nals, and users sometimes comment that “the control limits
and debate, and it is not the goal of this paper to provide are too tight.” This situation is illustrated in Figure 1, which
definitive solutions. Instead the purpose is to describe some displays an individual measurement and moving range chart
of the more basic approaches that have been proposed for 100 observations of a chemical process.
1
the process measurements as vertical displacements from
a common mean.
x~t xt = 0 + 1 x~t 1 + t
The chart in Figure 1 is created with the IRCHART statement Output 3. Partial Autocorrelation Plot
in the SHEWHART procedure as follows:
Partial Autocorrelations
Lag Correlation 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
title ’Individual Measurements Chart’; 1 0.83026  . ***************** 
proc shewhart graphics data=chemical; 2 0.09346  . ** . 
irchart xt*t / npanel = 100 3 0.00385  .  . 
4 0.07340  . * . 
needles 5 0.00278  .  . 
split = ’/’; 6 0.09013  . ** . 
label xt = ’Observed/Mvg Rng’; 7 0.08781  . ** . 
8 0.10327  . ** . 
run; 9 0.07240  . * . 
10 0.11637  . ** . 
11 0.08210  . ** . 
This is a standard application of the SHEWHART proce 12 0.07580  . ** . 
dure.y Note that the NEEDLES option is used to depict 13
14
0.04429 
0.11661 
.
.
* .
** .


15 0.10446  . ** . 
These are patterned after the values plotted in Figure 1
of Montgomery and Mastrangelo (1991).
y Refer to SAS Institute Inc. (1989b, 1994) for addi You can fit this model with the ARIMA procedure:
tional standard examples of the SHEWHART procedure in
SAS/QC software. Also refer to SAS Institute Inc. (1991b) proc arima data=chemical;
for an introduction to the SQC Menu System, which is in identify var = xt;
cluded with SAS/QC software and provides a pointandclick estimate p = 1 method = ml;
run;
interface to the SHEWHART procedure that is intended for
standard applications.
2
The results are shown in Output 4. The equation of the fitted that the TRENDVAR= option is needed to specify this ar
model is rangement (otherwise, a standard individual measurements
x~t = 13:05 + 0:847x~t 1 chart for XT would be produced by default).
Approx.
Parameter Estimate Std Error T Ratio Lag
MU 85.28375 2.32973 36.61 0
AR1,1 0.84694 0.05221 16.22 1
Strategies
The EWMA chart (which you can construct with the MA
proc arima data=chemical; CONTROL procedure) is based on the assumption that the
identify var=xt; observations xt are independent. However, in the context of
estimate p=1 method=ml;
autocorrelated process data (and more generally in time se
ries analysis), the EWMA statistic zt plays a different role :
forecast out=results id=t;
run;
it is the optimal onestepahead forecast for a process that
can be modeled by an ARIMA(0,1,1) model
The output data set (named RESULTS) saves the onestep
ahead forecasts as a variable named FORECAST, and it xt = xt 1 + t t 1
also contains the original variables XT and T. You can
create a Shewhart chart for the residuals by using the data provided that the weight parameter is chosen as = 1 .
set RESULTS as input to the SHEWHART procedure: This statistic is also a good predictor when the process can
be described by a subset of ARIMA models for which the
process is “positively autocorrelated and the process mean
does not drift too quickly.”y
title ’Residual Analysis Using AR(1) Model’;
proc shewhart data=results graphics;
xchart xt*t / trendvar = forecast
You can fit an ARIMA(0,1,1) model to the chemical data
xsymbol = xbar
npanel = 100 with the following statements:
ypct1 = 40
split = ’/’
For a discussion of these roles refer to Hunter (1986).
nolegend;
label xt=’Residual/Forecast’;
run; y Refer to Montgomery and Mastrangelo (1991) and the
discussion that follows their paper.
The result is displayed in Figure 2. The lower chart plots the
values of FORECAST, and the upper chart plots the resid
uals (XT  FORECAST) together with their 3 limits. Note
3
proc arima data=chemical; The EWMA center line control chart plots the forecasts from
identify var=xt(1); the ARIMA(0,1,1) model as the center “line,” and it uses the
estimate q=1 method=ml noint;
standard errors of prediction to determine upper and lower
forecast out=ewma id=t;
run; control limits. You can construct this chart with the following
statements:
A summary of the fitted model is shown in Output 5.
data ewmatab;
Output 5. Fitted ARIMA(0,1,1) Model length _var_ $ 8 ;
set ewma
Maximum Likelihood Estimation
(rename=(forecast=_mean_ xt=_subx_));
Approx. _var_ = ’xt’;
Parameter Estimate Std Error T Ratio Lag _sigmas_ = 3;
MA1,1 0.15041 0.10021 1.50 1
_limitn_ = 1;
Variance Estimate = 14.970245 _lclx_ = _mean_  3 * std;
Std Error Estimate = 3.86914009 _uclx_ = _mean_ + 3 * std;
AIC = 549.868021
SBC = 552.46314 _subn_ = 1;
Number of Residuals= 99 run;
4
Again, the conclusion is that the process is in control. A company that manufactures polyethylene film is con
While Figure 2 and Figure 4 are not the only displays that cerned with monitoring statistical control of an extrusion
can be considered for analyzing the chemical data, their process that produces a continuous sheet of film. At peri
construction illustrates the conjunctive use of the ARIMA odic intervals of time, samples are taken at four locations
and SHEWHART procedures in control chart applications. (referred to as lanes) along a cross section of the sheet,
and a test measurement is made of each sample. The test
A third view of autocorrelation in process data treats depen values are saved in a SAS data set named FILM. A partial
dent structure as a phenomenon to be exploited rather than listing of FILM is shown in Output 6.
removed. This approach, referred to as automatic process
control (APC) or engineering process control, has a long Output 6. SAS Data Set FILM
history in process industries where it is common practice for
Polyethylene Sheet Measurements
chemical engineers to devise feedback control algorithms
that continuously respond to deviations from target. In con SAMPLE LANE TESTVAL
5
Figure 5. Outlier Analysis for FILM Data
Figure 7. Comparative Histogram
6
Note that the LIMITN= option in the preceding XRCHART You can use a variety of SAS procedures for fitting linear
statement requests control limits for a nominal sample size models to estimate the variance components. The following
of four, and the ALLN option specifies that these limits statements show how this can be done with the MIXED
are to be displayed with all subgroup samples regardless procedure:
of sample size (not all samples have four measurements
because outliers have been removed). If LIMITN= and
proc mixed data=film;
ALLN were not specified, the control limits would vary with class sample;
sample size. model testval = / s;
random sample;
The number of outofcontrol points in Figure 8 would ordi make ’solutionf’ out=sf;
narily indicate that the process is not in statistical control. make ’covparms’ out=cp;
In this situation, however, the process is known to be quite run;
stable, and the data have been screened for outliers. Thus,
it is suspected that the control limits are not appropriate for
the data. c=
estimates are B
2 c=
The results are shown in Output 7. Note that the parameter
19:25, W
2
^ = 88:90.
39:68, and
Recall that the standard Shewhart analysis assumes that
Output 7. Partial Output from MIXED Procedure
sampling variation, also referred to as withingroup variation,
is the only source of variation. Writing xij for the j th Covariance Parameter Estimates (REML)
measurement within the i th subgroup, you can express the
model for the conventional X and R chart as
Cov Parm Ratio Estimate Std Error Z
0.0007
are assumed to be independent with zero mean and unit 0.0001
variance, and W
2
is the withinsubgroup variance. The
parameter denotes the process mean. Solution for Fixed Effects
In a process such as film manufacturing, however, this Parameter Estimate Std Error DDF T Pr > T
model is not adequate because there is additional variation INTERCEPT 88.89629708 0.72504535 55 122.61 0.0001
due to changes in temperature, pressure, raw material, and
other factors. Instead, a useful model is
The following statements merge the output data sets from
xij = + B !i + W ij the MIXED procedure into a SAS data set named NEWLIM
that contains the appropriately derived control limit param
where B
etersy for the average test value:
2
is the betweensubgroup variance, the random
variables !i are independent with zero mean and unit vari
ance, and the random variables !i are independent of the
random variables ij . data sf; set sf; rename _est_=est;
data cp; set cp sf; keep est;
X x =n
To plot the subgroup averages proc transpose data=cp out=newlim;
n
xi:
data newlim;
ij set newlim;
j=1 drop _name_ _label_ col1col3;
length _var_ _subgrp_ _type_ $8;
on a control chart, you need expressions for the expectation _var_ = ’testval’;
and variance of xi: . These are _subgrp_ = ’sample’;
E (xi: ) =
_type_ = ’estimate’;
_limitn_ = 4;
Var(xi: )= B2 + (W2 =n) _mean_ = col3;
_stddev_ = sqrt(4*col1 + col2);
^, and 3
Thus the center line should be located at
qc c
limits output;
should be located at run;
c c
2
d c + c 2
4 B 2
This is the notation used in Chapter 3 of Wetherill and adj W
Brown (1991), which provides helpful discussion of this y See the Appendix for additional discussion.
issue.
7
In the next set of statements the SHEWHART procedure A simple alternative to the chart in Figure 9 is a con
reads these parameter estimates and displays the X and R ventional “individual measurements” control chart for the
chart shown in Figure 9. The control limits for the X chart subgroup means which you can construct with the following
are displayed at
dp
^ 3adj = n
statements:
8
linear model to measurements taken while the process is in
statistical control and estimating the variance components.
The MIXED and GLM procedures (among others) are use
ful for fitting the model and creating output SAS data sets
with estimates that can subsequently serve as input to the
SHEWHART procedure. This approach offers considerable
generality because it is possible to fit mixed models that
incorporate time dependent structure as well as mixed and
fixed effects.
lsmeans lane;
run; the difference from nominal approach. A product
specific nominal value is subtracted from each mea
The results of this analysis are given in Output 8. sured value, and the differences (together with appro
priate control limits) are charted. Here it is assumed
Output 8. Partial Output from MIXED Procedure that the nominal value represents the central location
of the process (ideally estimated with historical data)
Covariance Parameter Estimates (REML)
and that the process variability is constant across
Cov Parm Ratio Estimate Std Error Z
products.
SAMPLE
Residual
0.82943785
1.00000000
22.52245916
27.15388406
5.66313893
3.01371956
3.98
9.01
the standardization approach. Each measured value
is standardized with a productspecific nominal and
standard deviation values. This approach is followed
Covariance Parameter Estimates (REML)
when the process variability is not constant across
Pr > Z
products.
0.0001
0.0001
These approaches are highlighted in this section because
of their popularity, but it is worth noting two alternatives that
Tests of Fixed Effects
are technically more sophisticated:
Source NDF DDF Type III F Pr > F
LANE 3 162 26.37 0.0001 Hillier (1969) provided a method for modifying the
usual control limits for X and R charts in startup
Least Squares Means situations where fewer than 25 subgroup samples
Level LSMEAN Std Error DDF T Pr > T are available for estimating the process mean
and standard deviation ; also refer to Quesenberry
LANE A 88.04445518 0.94862578 162 92.81 0.0001
LANE B 93.22627337 0.94862578 162 98.28 0.0001 (1993).
LANE
LANE
C
D
89.89900064
84.60714286
0.94862578
0.94184795
162
162
94.77
89.83
0.0001
0.0001
Quesenberry (1991a, 1991b) introduced the so
called Q chart for short (or long) production runs,
which standardizes and normalizes the data using
Note that the lane effect is significant, and this suggests probability integral transformations.
that it might be wise to maintain separate control charts for
each lane. SAS examples illustrating these alternatives are given by
Rodriguez and Bynum (1992).
Summary
9
Difference From Nominal Note that an IN= variable is used in the MERGE statement
The following example is adapted from an application
to allow for the fact that NOMVAL includes nominal values
for product types that are not represented in OLD. Output 11
in aircraft component manufacturing. A metal extrusion lists the merged data set OLD.
process is used to make three slightly different models of
the same component. The three product types (labeled M1, Output 11. Data Merged with Nominal Values
M2, and M3) are produced in small quantities because the
SAMPLE PRODTYPE NOMINAL DIAMETER DIFF
process is expensive and timeconsuming.
1 M3 14.8 13.99 0.81
Output 9 indicates the structure of a SAS data set named 2 M3 14.8 14.69 0.11
3 M3 14.8 13.86 0.94
OLD in which the diameter measurements for various short 4 M3 14.8 14.32 0.48
runs have been saved. Samples 1 to 30 are to be used to 5 M3 14.8 13.23 1.57
10
Output 12. Values of R by Product Type (numbered 31 to 60) are measured and saved in a SAS
data set named NEW, as listed in Output 14.
Control Limits By Product Type
Output 14. Data Set NEW
PRODTYPE _VAR_ _SUBGRP_ _TYPE_ _LIMITN_ _ALPHA_ _SIGMAS_
New Data For Short Production Runs
M1 DIFF SAMPLE ESTIMATE 2 .0026998 3
M2 DIFF SAMPLE ESTIMATE 2 .0026998 3
SAMPLE PRODTYPE DIAMETER
M3 DIFF SAMPLE ESTIMATE 2 .0026998 3
31 M2 14.69
_LCLI_ _MEAN_ _UCLI_ _LCLR_ _R_ _UCLR_ _STDDEV_
32 M2 15.39
33 M2 14.56
3.13153 0.12924 3.39001 0 1.22646 4.00628 1.08692
34 M2 15.02
1.78485 0.06551 1.65382 0 0.64669 2.11243 0.57311
35 M2 13.93
3.22326 0.19034 2.84258 0 1.14076 3.72633 1.01097
36 M1 17.55
37 M1 14.26
38 M3 14.42
39 M3 12.77
The data set DIFFLIM is structured for subsequent use as proc sort data=new;
an input LIMITS= data set by the SHEWHART procedure by prodtype;
(below). The variables in a LIMITS= data set provide pre
computed control limits or—as in this case—the parameters data new;
from which control limits are to be computed. These vari merge nomval new(in = a);
ables have reserved names that begin and end with the by prodtype; if a;
diff = diameter  nominal;
underscore character. Here the variable STDDEV saves run;
the estimate of , and the variable MEAN saves the mean
of the differences from nominal (recall that this mean is zero
To construct a short run individual measurements and mov
since the nominal values are assumed to represent the pro
ing range chart, first sort the observations in NEW by
cess mean for each product type.) The identifier variables
SAMPLE, and then use the SHEWHART procedure with
VAR and SUBGRP record the names of the process and
DIFFLIM specified as an input LIMITS= data set.
subgroup variables (these variables are critical in applica
tions involving many product types.) The variable LIMITN
is assigned a value of two to specify moving ranges of two proc sort data=new;
consecutive measurements, and the variable SIGMAS is by sample;
assigned a value of three to specify 3 limits. The data set
title ’Chart for Difference from Nominal’;
DIFFLIM is listed in Output 13.
symbol1 v=dot ;
Output 13. Estimates of Mean and Standard Deviation symbol2 v=plus ;
symbol3 v=circle;
11
specifying PRODTYPE as a symbol variable after the = In some applications, it may be useful to replace the moving
sign in the IRCHART statement. range chart with a plot of the nominal values. You can do this
with the TRENDVAR= option in the XCHART statement
provided that you reset the value of LIMITN to one (this
specifies a subgroup sample of size one).
data difflim;
set difflim;
_var_ = ’diameter’;
_limitn_ = 1;
run;
proc shewhart The display is shown in Figure 15. Note that you identify the
data=new (rename=(prodtype=_phase_)) product types by specifying PRODTYPE as a block variable
limits=difflim graphics; enclosed in parentheses after the subgroup variable SAM
irchart diff * sample / PLE. The BLOCKLABTYPE= option specifies that values of
readlimits the block variable are to be scaled (if necessary) to fit the
readphases=all
space available in the block legend. The BLOCKLABEL
POS= optiony specifies that the label of the block variable
phaseref
phasebreak
phaselegend is to be displayed to the left of the block legend.
split=’/’;
label diff = ’Difference/Mvg Rng’;
run;
12
then various authors recommend standardizing the differ
ences from nominal and displaying them on a common chart
Are the Variances Constant?
with control limits at 3.
The differencefromnominal chart should be accompanied To illustrate this method, assume that the hypothesis of
by a test that checks whether the variances for each prod homogeneity is rejected for the differences in OLD. Then
uct type are identical (homogeneous). Wheeler (1991a) you can use the productspecific estimates of in BASELIM
describes a simple rangebased method. A second method, to standardize the differences from nominal in NEW as
used by a number of manufacturing companies, is to apply follows:
a KruskalWallis test to the moving ranges for each product
type. Both methods can be implemented with short SAS
programs. proc sort data=baselim; by prodtype;
proc sort data=new; by prodtype;
This section illustrates a third method, Levene’s test of
homogeneity, which is particularly appropriate for short data new;
run applications because it is robust to departures from keep sample prodtype z diameter
diff nominal _stddev_;
normality; refer to Snedecor and Cochran (1980). One label sample = ’Sample Number’
reason Levene’s test has not been widely used for this format diff 5.2 ;
purpose is that it is more complicated than the first two merge baselim new(in = a);
methods. However, you can implement Levene’s method by prodtype; if a;
by using the GLM procedure to construct a oneway analysis z = diff / _stddev_ ;
of variance for the absolute deviations of the diameters from
proc sort data=new; by sample;
averages within product types.
run;
proc sort data=old; by prodtype; The following statements create the standardized chart:
proc means data=old noprint;
var diameter; title ’Standardized Chart’;
output out=oldmean (keep=prodtype diammean) symbol v=dot;
mean=diammean; proc shewhart data=new graphics;
by prodtype; irchart z*sample (prodtype) /
blocklabtype = scaled
data old; mu0 = 0
merge old oldmean; by prodtype; sigma0 = 1
absdev = abs( diameter  diammean ); split = ’/’;
label prodtype = ’Product Classification’
proc means data=old noprint; z = ’Std Diff/Mvg Rng’;
var absdev; run;
by prodtype;
output out=stats
n=n mean=mean css=css std=std; Note that the options MU0= and SIGMA= specify that the
control limits for the standardized differences from nominal
title ’Test for Constant Variance’; are to be based on the parameters = 0 and = 1. The
proc glm data=old outstat=glmout ; chart is displayed in Figure 16.
class prodtype;
model absdev = prodtype;
run;
Error 27 12.27280629
13
Summary
title ’Standard Analysis of Individual ’
The two most basic approaches to short run process control ’Delays’;
are to chart differences from productspecific nominal values
when the variability is constant across products and to proc shewhart data=calls graphics ;
chart standardized differences when the variability is not irchart time * recnum /
rtmplot = schematic
constant. In practice, this requires maintaining a database
outlimits = delaylim
of nominal values and standard deviations for products in cboxfill = grey
addition to the process data. Charting the differences is nochart2;
a straightforward application of the SAS DATA step and run;
the SHEWHART procedure, and the examples presented
here can be extended to any number of product types. It The chart is displayed in Figure 17.
is important to check whether the variances are constant
(homogeneous), and this can be done with Levene’s test by
using the GLM procedure.
14
corresponding percentiles from a fitted lognormal distribu Output 18. Summary of Fitted Lognormal Model
tion. The equation for the lognormal density function is
Fitted Distribution: Lognormal
f (x) = p 1
exp([log(x) ] =2 ) ; x > 0 ;
2 2 Parameters EDF GoodnessofFit
x 2
Threshold (Theta) 2.3 AD (ASquare) 0.34761167
where denotes the shape parameter and denotes the
Scale (Zeta) 0.6891925 Pr > ASquare 0.4765
Shape (Sigma) 0.64121248
scale parameter. Est Mean 2.91655007 Cvon M (WSquare) 0.05863868
Est Std Dev 0.4396814 Pr > WSquare 0.4106
The following statements use the CAPABILITY procedure ChiSquare GoodnessofFit Kolmogorov (D) 0.09208575
to fit a lognormal model and superimpose the fitted density ChiSquare 6.81723374
Pr > D >.15
proc capability data=calls graphics ; This information is also saved in a SAS data set named
histogram time / LNFIT, which is listed below.
lognormal( threshold = 2.3 )
cfill = grey
Output 19. Output Data Set LNFIT
outfit = lnfit
Parameters of Fitted Lognormal Model
nolegend ;
inset n = ’Number of Calls’ _VAR_ _CURVE_ _LOCATN_ _SCALE_ _SHAPE1_ _MIDPTN_
lognormal( sigma=’Shape’ (4.2)
TIME LNORMAL 2.3 0.68919 0.64121 4.2
zeta =’Scale’ (5.2)
theta ) / _ADASQ_ _ADP_ _CVMWSQ_ _CVMP_ _KSD_ _KSP_
pos = ne;
0.34761 0.47648 0.058639 0.41060 0.092086 0.15
run;
The LOGNORMAL option in the HISTOGRAM statement is The following statements use the information in LN
used to request the lognormal fit, and the INSET statement FIT to replace the control limits in DELAYLIM with lim
is used to request the boxed information displayed on the its computed from percentiles of the fitted lognormal
histogram. The histogram is shown in Figure 18. model. The th percentile of the lognormal distribution
is P = exp( 1 () + ), where 1 denotes the inverse
standard normal cumulative distribution function.
data delaylim;
merge delaylim lnfit;
drop _sigmas_ ;
_lcli_ = _locatn_ +
exp(_scale_+probit(0.5*_alpha_)*_shape1_);
_ucli_ = _locatn_ +
exp(_scale_+probit(10.5*_alpha_)*_shape1_);
_mean_ = _locatn_ +
exp(_scale_+0.5*_shape1_*_shape1_);
run;
Refer to Rodriguez and Bynum (1993) or SAS Institute The chart is displayed in Figure 19.
Inc. (1994) for details concerning recent enhancements to
the CAPABILITY procedure, including the INSET statement
and additional goodnessoffit tests based on the empirical
distribution function (EDF).
15
Quality Technology , 24, 138144.
 and RChart Control Limits Based
Hillier, F. S. (1969), “X
On a Small Number of Subgroups,” Journal of Quality Tech
nology , 1, 1726.
MacGregor, J. (1987), “Interfaces Between Process Control
and Online Statistical Process Control,” Computing and
Systems Technology Division Communications , 10, 920.
MacGregor, J. (1990), “A Different View of the Funnel
Experiment,” Journal of Quality Technology , 22, 255259.
The assumption of normality is critical to the interpretation Quesenberry, C. P. (1991a), “SPC Q Charts for StartUp
of control charts for individual measurements. This can Processes and Short or Long Runs,” Journal of Quality
be checked with graphical displays or with goodnessoffit Technology , 23, 213224.
tests available with the CAPABILITY procedure. You can
Quesenberry, C. P. (1991b), “SPC Q Charts for a Bino
also use this procedure to fit nonnormal models such as
mial Parameter p: Short or Long Runs,” Journal of Quality
the lognormal distribution and then derive modified control
limits from the fitted model.
Technology , 23, 239246.
Quesenberry, C. P. (1993), “The Effect of Sample Size
ACKNOWLEDGEMENTS on Estimated Effects,” Journal of Quality Technology , 25,
237247.
I am grateful to Richard A. Bynum, Sharad S. Prabhu, and
Russell D. Wolfinger of the Applications Division at SAS Rodriguez, R. N. and Bynum, R. A. (1992), Examples of
Institute Inc. for their valuable assistance in the preparation Short Run Process Control Methods With the SHEWHART
of this paper. Procedure in SAS/QC Software . Unpublished manuscript
available from the authors.
Farnum, N. R. (1992), “Control Charts for Short Runs: SAS Institute Inc. (1993), SAS/ETS User’s Guide, Version
Nonconstant Process and Measurement Error,” Journal of 6, Second Edition, Cary, NC: SAS Institute Inc.
16
SAS Institute Inc. (1994), SAS/QC Software: Reference Examples: The SAS data set CHEMICAL listed in Output 1
and Usage, Cary, NC: SAS Institute Inc. In preparation. is read as a DATA= input data set because it contains a
process variable (XT) and a subgroup variable (T). The
Schilling, E. G. and Nelson, P. R. (1976), “The Effect of following statements create a chart for individual measure
NonNormality on the Control Limits of X Charts,” Journal ments and moving ranges (not shown):
of Quality Technology , 8, 183187.
Schneider, H. and Pruett, J. M. (1994), “Control Charting proc shewhart data=chemical graphics;
Issues in the Process Industries,” Quality Engineering , 6, irchart xt * t;
347373. run;
DATA= Data Sets for Reading Raw Data proc shewhart history=machine graphics;
xrchart value * time;
The simplest input data set is the DATA= data set, which you run;
use to provide the SHEWHART procedure with raw mea
surements. This data set must contain at least one process
Note that the prefix VALUE is specified in the XRCHART
statement.
variable , whose values are the process measurements, and
one subgroup variable , whose values determine the rational None of the earlier examples in this paper use HISTORY=
subgroups into which the measurements are classified (or, input data sets, but examples are provided in SAS Institute
in the case of an “individual measurements” chart, the order Inc. (1989b, 1994).
of the measurements.) Subgrouped measurements in a
You should use HISTORY= data sets in applications where
DATA= data set must be arranged in strungout form (one
the data arrive in summarized form (for instance, from a
measurement per observation).
data collection device). You can use a variety of SAS
summarization procedures to create a HISTORY= data
Although the discussion here refers to control charts for
set, and you can also create a HISTORY= data set as a
variables (continuous measurements), similar guidelines SHEWHART output data set by using the OUTHISTORY=
apply in the case of control charts for attributes. option in any chart statement.
17
LIMITS= Data Sets for Reading Control Limit Informa TABLE= Data Sets for Creating Customized Charts
tion
Unlike a LIMITS= data set, which is useful in situations
You use a LIMITS= data set to provide the SHEWHART where you want to display predetermined limits for standard
procedure with either a complete set of precomputed fixed control charts, a TABLE= input data set is appropriate for
control limits or a set of parameters from which the displayed nonstandard applications where you want to display highly
control limits are to be computed. You can specify a LIMITS= customized, precomputed chart statistics and control limits.
data set in conjunction with a DATA= or HISTORY= data
set. If you do not specify a LIMITS= data set, the control Each block of observations in a TABLE= data set provides
limits are computed from the data. a set of chart statistics and control limits to be displayed for
a particular process variable. These blocks are identified
Each observation in a LIMITS= data set provides a set of by a variable named VAR whose values are the process
control limits (or parameters) to be displayed with a BY variables specified in the chart statement. The particular
group of observations for a particular process variable. The variables that you provide in a TABLE= data set depend
variables that you provide in a LIMITS= data set depend upon the chart statement that you are using, and they have
upon the chart statement that you are using, and they have reserved names beginnning and ending with the underscore
reserved names beginning and ending with the underscore character. A TABLE= data set must also include a subgroup
character. You can create a LIMITS= data set with a DATA variable.
step program, or you can create a LIMITS= data set as
a SHEWHART output data set by using the OUTLIMITS= You can create a TABLE= data set with a DATA step
option in any chart statement. The latter method is partic program, and you can also create a TABLE= data set as
ularly useful if you compute startup limits to be saved for a SHEWHART output data set by using the OUTTABLE=
monitoring new data. option in any chart statement.
Examples. The data set DELAYLIM listed in Output 17 Example . In the section on “Autocorrelation in Process
can be read as a LIMITS= data set to provide a full set of Data”, the data set EWMATAB is read as a TABLE= in
predetermined individual measurement and moving range put data set. A partial listing of EWMATAB is shown in
control limits for a process variable named TIME. The fol Output 20.
lowing statements create this chart (not shown): Output 20. Partial Listing of EWMATAB
_VAR_ T _LCLX_ _SUBX_ _MEAN_ _UCLX_ _SIGMAS_ _LIMITN_ _SUBN_
proc shewhart data=calls
limits=delaylim graphics; xt 2 84.2620 90 96.0000 107.738 3 1 1
xt 3 79.2722 89 90.8825 102.493 3 1 1
irchart time * recnum / readlimits; xt 4 77.6755 89 89.2830 100.890 3 1 1
run; xt 5 77.4351 94 89.0426 100.650 3 1 1
xt 6 81.6469 91 93.2543 104.862 3 1 1
xt 7 79.7317 89 91.3391 102.947 3 1 1
Note that you specify the READLIMITS option to indicate xt 8 77.7444 93 89.3518 100.959 3 1 1
that the control limits are to be read from the LIMITS= data . . . . . . . . .
set rather than computed from the data.
. . . . . . . . .
xt 98 71.0715 83 82.6790 94.286 3 1 1
xt 99 71.3443 86 82.9517 94.559 3 1 1
In contrast, the data set DIFFLIM listed in Output 13 does xt 100 73.9341 88 85.5415 97.149 3 1 1
18