
Control Charts for Multivariate Processes

Author(s): Regina Y. Liu


Reviewed work(s):
Source: Journal of the American Statistical Association, Vol. 90, No. 432 (Dec., 1995), pp. 1380-1387
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2291529
Accessed: 28/08/2012 14:20

Control Charts for Multivariate Processes
Regina Y. Liu

This article uses the concept of data depth to introduce several new control charts for monitoring processes of multivariate quality measurements. For any dimension of the measurements, these charts are in the form of two-dimensional graphs that can be visualized and interpreted just as easily as the well-known univariate $X$, $\bar{X}$, and CUSUM charts. Moreover, they have several significant advantages. First, they can detect simultaneously the location shift and scale increase of the process, unlike the existing methods, which can detect only the location shift. Second, their construction is completely nonparametric; in particular, it does not require the assumption of normality for the quality distribution, which is needed in standard approaches such as the $\chi^2$ and Hotelling's $T^2$ charts. Thus these new charts generalize the principle of control charts to multivariate settings and apply to a much broader class of quality distributions.

KEY WORDS: Control charts; Q chart; Quality control; r chart; S chart; Statistical process control.

1. INTRODUCTION

Control charts are useful tools for monitoring/controlling a manufacturing process. With properly chosen control limits, a control chart can detect a shift from a "good" quality distribution to a "bad" one. When the measurement, denoted by $X$, of a particular characteristic of a product is used to gauge the quality of the product, the most commonly used charts are the $X$ chart, the $\bar{X}$ chart (or Shewhart chart), and the cumulative sum (CUSUM) chart. These charts are easy to construct, visualize, and interpret, and most important, have been proven effective in practice. However, they are usually suitable only when the observation $X$ is univariate, and their validity often relies on the assumption of normality of $X$, which is not always realistic.

In real life, we often encounter multivariate quality measurements rather than univariate, since the overall quality of a product is usually determined by more than one quality characteristic. For example, the quality of a certain type of tablets may be determined by weight, degree of hardness, thickness, width, and length. These quality characteristics are clearly correlated, and control charts for monitoring individual quality characteristics may not be adequate for detecting changes in the overall quality of the product. Thus it is desirable to have control charts that can monitor multivariate measurements directly.

There are some methods for constructing multivariate control charts in the literature (see, for example, Alt and Smith 1988 for a thorough survey and for further references). However, these methods are usually restricted to the case of normal distributions and are difficult to visualize and interpret. The main idea behind our control charts is to reduce each multivariate measurement to a univariate index, namely, its relative center-outward ranking induced by a data depth (cf. Sec. 2). Representing the original quality measurements by their corresponding univariate ranks, we are able to develop control charts based on these ranks following the same principles for the univariate $X$, $\bar{X}$, and CUSUM charts. The geometric nature of the notion of data depth makes it easy to interpret the values of statistics derived from those ranks and to visualize their plots. This approach is completely nonparametric, and thus the resulting charts are valid without parametric assumptions on the process model. Moreover, these charts allow us to detect simultaneously the location change and the scale increase in a process. In Section 3 three types of control charts (the r, Q, and S charts) are proposed and justified. They can be viewed as data-depth-based multivariate generalizations of the univariate $X$, $\bar{X}$, and CUSUM charts. Their names are suggested respectively by the relative ranks of sample points with respect to a reference sample, by the quality index introduced in Liu and Singh (1993), and by a plot of sums of deviations. In Section 2 a brief description of data depths and definitions of the relevant statistics suitable for plotting in control charts are presented. A simulated bivariate data set is used to demonstrate the construction of the proposed charts. The results, presented in Figures 1-5, appear to support our methods. A detailed discussion of the simulation is given in Section 4, and some concluding remarks are presented in Section 5.

2. SOME STATISTICS DERIVED FROM DATA DEPTH

Assume that $k$ ($k \ge 1$) characteristics of each product are used to determine the quality of the product. The process is considered to be in control if the measurements are following a prescribed quality distribution (required by customers or designing engineers). Let $G$ denote the prescribed $k$-dimensional distribution, and let $Y_1, \ldots, Y_m$ be $m$ random observations from $G$. The sample $Y_1, \ldots, Y_m$ is generally referred to as a reference sample in the context of quality control, and considered as the measurements of products produced by an in-control process. Let $X_1, X_2, \ldots$ be the new observations from the manufacturing process. Assume that the $X_i$'s follow a distribution $F$. Based on the observations $X_i$'s, we would like to determine whether the quality of the product has deteriorated or whether the process is out of control. This would mean that the $X_i$'s are not meeting
Regina Y. Liu is Professor, Department of Statistics, Rutgers University, New Brunswick, NJ 08903. The author gratefully acknowledges support from National Science Foundation Grants DMS-90-04658 and DMS-90-22126. The author thanks Kay Tatsuoka for his computing assistance and the referees, associate editor, and editor for their helpful comments.

© 1995 American Statistical Association. Journal of the American Statistical Association, December 1995, Vol. 90, No. 432, Theory and Methods.
the prescribed $G(\cdot)$ in a certain sense. Thus we need to compare $F$ with $G$. The statistics that we use to characterize certain aspects of the difference between $G$ and $F$ are based on the notion of data depth, so we begin by describing some concepts of data depth.

For any point $y$ in $R^k$, the simplicial depth (Liu 1990) of $y$ with respect to $G$ is defined to be

$$SD_G(y) = P_G\{y \in s[Y_1, \ldots, Y_{k+1}]\}, \qquad (1)$$

where $s[Y_1, \ldots, Y_{k+1}]$ is the open simplex whose vertices $Y_1, \ldots, Y_{k+1}$ are $(k + 1)$ random observations from $G$. The value of $SD_G$ is a measure of how "deep," or how "central," $y$ is with respect to $G$. When $G$ is unknown and only a sample $\{Y_1, \ldots, Y_m\}$ is given, the sample simplicial depth of $y$ is defined as

$$SD_{G_m}(y) = \binom{m}{k+1}^{-1} \sum_{(*)} I\left(y \in s[Y_{i_1}, \ldots, Y_{i_{k+1}}]\right), \qquad (2)$$

which measures how deep $y$ is within the data cloud $\{Y_1, \ldots, Y_m\}$. Here $I(\cdot)$ is the indicator function; that is, $I(A) = 1$ if $A$ occurs and $I(A) = 0$ otherwise. The function $G_m(\cdot)$ denotes the empirical distribution of $\{Y_1, \ldots, Y_m\}$ and $(*)$ runs over all possible subsets of $\{Y_1, \ldots, Y_m\}$ of size $(k + 1)$. A fuller motivation together with the basic properties of $SD_G(\cdot)$ can be found in an earlier work (Liu 1990), where it was shown in particular that $SD_G(\cdot)$ is affine invariant and that $SD_{G_m}(\cdot)$ converges uniformly and strongly to $SD_G(\cdot)$. The affine invariance will ensure that our proposed control charts are coordinate free, and the convergence of $SD_{G_m}$ to $SD_G$ will allow us to approximate $SD_G(\cdot)$ by $SD_{G_m}(\cdot)$ when $G$ is not specified.

[Figure 1. r Chart.]

[Figure 2. Q Chart (n = 4).]

Another notion of depth is based on the Mahalanobis distance. Here how deep a point $y$ is with respect to a given distribution $G$ is measured by how small its quadratic distance is to the mean

$$MD_G(y) = 1/[1 + (y - \mu_G)' \Sigma_G^{-1} (y - \mu_G)], \qquad (3)$$

where $\mu_G$ and $\Sigma_G$ denote the mean and the covariance matrix of $G$, "$'$" denotes the transpose of a $(k \times 1)$ vector, and "$-1$" denotes the inverse of a matrix. The empirical version of $MD_G(y)$ is

$$MD_{G_m}(y) = 1/[1 + (y - \bar{Y})' S^{-1} (y - \bar{Y})], \qquad (4)$$

where $\bar{Y}$ is the sample mean of $Y_1, \ldots, Y_m$ and $S$ is the sample covariance matrix. We observe that $MD_G(\cdot)$ is also affine invariant.

There are several other affine-invariant notions of data depth, including Tukey's depth (Tukey 1975) and the majority depth of Singh (Liu and Singh 1993). As a matter of fact, all control charts proposed herein are also valid for these two depths. (See Liu and Singh 1993 for a fuller discussion of various notions of data depth.) The simplicial depth and the Mahalanobis depth suffice for our purposes, because they illustrate well the contrasting properties of probabilistic geometry and metric distances. Henceforth we use the same notation $D_G(\cdot)$ to denote either notion of depth, unless indicated otherwise. We also assume that $G$ and $F$ are two absolutely continuous distributions.

Clearly, a data depth induces a center-outward ordering of the sample points if depth values for all points are computed and compared. More specifically, if we arrange all $D_G(Y_i)$'s in an ascending order and use $Y_{[j]}$ to denote the sample point associated with the $j$th smallest depth value, then $Y_{[1]}, Y_{[2]}, \ldots, Y_{[m]}$ are the order statistics of $Y_i$'s, with

$Y_{[m]}$ being the most central point. The smaller the order (or the rank) of a point, the more outlying that point with respect to the underlying distribution $G(\cdot)$. We now proceed to list some statistics derived from data depth that are used in the next section to construct control charts. We write $Y \sim G$ to indicate that the random variable $Y$ follows the distribution $G$, and set

$$r_G(y) = P\{D_G(Y) \le D_G(y) \mid Y \sim G\} \qquad (5)$$

and

$$r_{G_m}(y) = \#\{Y_j \mid D_{G_m}(Y_j) \le D_{G_m}(y),\ j = 1, \ldots, m\}/m. \qquad (6)$$

Let $F_n(\cdot)$ denote the empirical distribution of the sample $\{X_1, \ldots, X_n\}$. We can now define

$$Q(G, F) = P\{D_G(Y) \le D_G(X) \mid Y \sim G, X \sim F\} \quad (= E_F[r_G(X)]), \qquad (7)$$

$$Q(G, F_n) = \frac{1}{n} \sum_{i=1}^{n} r_G(X_i), \qquad (8)$$

and

$$Q(G_m, F_n) = \frac{1}{n} \sum_{i=1}^{n} r_{G_m}(X_i). \qquad (9)$$

[Figure 3. Q Chart (n = 10).]

3. CONTROL CHARTS BASED ON DATA DEPTH

We now introduce three control charts (the r chart, Q chart, and S chart), which can be viewed as the $X$ chart, $\bar{X}$ chart, and CUSUM chart, after the multivariate data have been transformed into univariate data by data depth. In principle, a control chart consists of critical values, the upper control limit (UCL) and the lower control limit (LCL), for a sample quality measurement. Between the two control limits is the center line (CL), which represents no deviation from the prescribed distribution. Samples from the manufacturing process are recorded in time order, and their measurements are plotted on the chart. By convention, those sample points are connected by a straight line, so that the sequence of activities over time can be easily visualized. The region above UCL or below LCL is termed the out-of-control region. A sample point falling in the out-of-control region is interpreted as evidence that the process is out of control, and a proper corrective action is sought. If the process is declared out-of-control when in fact it is not, we say that we have a "false alarm." The UCL and LCL are chosen so that the false alarm rate is small, say $\alpha$. Thus a control chart at every plotted point is a visualization of an $\alpha$-level test with the null hypothesis $H_0$: $G = F$. The rejection region in this test corresponds to the out-of-control region in the control chart. (A more detailed discussion of control charts can be found in, for example, Banks 1989 and Wadsworth, Stephen, and Godfrey 1986.)

3.1 The r Charts

The r chart introduced in this section is similar to the $X$ chart for univariate data. It is based on the statistics $r_{*}(\cdot)$ of (5) and (6). First we discuss the $X$ chart. Assume that the observations $Y_1, \ldots, Y_m$ and $X_1, \ldots, X_n$ are univariate and that our main concern is a possible shift in the mean in the $X_i$'s. If $G$ is a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the following is a typical $X$ chart of $X_i$'s:

[Sketch of a typical $X$ chart: $X_i$ plotted against $i = 1, \ldots, 10$, with the UCL marked.]

In this example, UCL = CL + $z_{\alpha/2}\sigma$, LCL = CL $-$ $z_{\alpha/2}\sigma$, and CL = $\mu$ if $\mu$ is known and = $\bar{Y}$ otherwise. Here $z_\alpha$ indicates the upper $\alpha$ critical value of the standard normal distribution; that is, $\alpha = P(Z > z_\alpha)$, where $Z \sim N(0, 1)$. The $X$ chart allows us to detect a possible mean shift from the prescribed value $\mu$ or the existence of any trend or pattern in the sequence of observations. It is a simple but effective tool for monitoring a univariate process; however, it does not generalize easily to the multivariate case. For bivariate normal $G$, a bivariate $X$ chart with elliptical contours as control limits, also called control ellipses, was studied by Alt and Smith (1988). Besides the restriction of normality, it is also difficult to visualize and detect any pattern or trend, because the chronological order of the observations
is lost in the plot. Furthermore, when the dimension $k$ goes beyond 3, it does not seem possible to follow the same idea to construct charts that are easy to visualize.

[Figure 4. S Chart.]

[Figure 5. S* Chart.]

Our r chart is constructed as follows. Compute $\{r_G(X_1), r_G(X_2), \ldots\}$ (or $r_{G_m}(X_1), r_{G_m}(X_2), \ldots$ if only $Y_1, \ldots, Y_m$ are available, but not $G$), following (5) (or (6)). The r chart is the plot of $r_G(X_i)$'s (or $r_{G_m}(X_i)$'s) against time $i$, with CL = .5 and the control limit $\alpha$. The process is declared out-of-control if $r_G(\cdot)$ falls below $\alpha$. Recall that $\alpha$ is the false alarm rate, which generally is close to zero, so the r chart only has LCL = $\alpha$ but no UCL. The motivation and justification of the r chart as a control chart are given next.

The expression (6) shows that $r_{G_m}(X)$ is an indication of how outlying $X$ is with respect to the data cloud $Y_i$'s. A very small value of $r_{G_m}(X)$ means that only a very small proportion of $Y_i$'s are more outlying than $X$. Thus $X$ is at the "outskirt" and is not conforming to most of the central part of the good data set. Assuming that $X \sim F$, a small value of $r_{G_m}(X)$ then suggests a possible deviation from $G$ to $F$. Since $r_{G_m}(\cdot)$ is defined according to data depth, the possible deviation here can be a shift in "center" and/or an increase in scale. (A detailed mathematical justification of this interpretation can be derived from Liu and Singh 1993, sec. 3.) Thus the r chart with LCL = $\alpha$ corresponds to an $\alpha$-level test of the following hypotheses:

$$H_0: F = G \quad \text{vs.} \quad H_a: \text{there is a location shift and/or a scale increase from } G \text{ to } F. \qquad (10)$$

We observe that the alternative hypothesis is particularly suitable for detecting quality deterioration in quality control, as it presents a loss of accuracy and/or a loss of precision. This also justifies viewing the process as out-of-control when $H_0$ is rejected or, equivalently, when an observation falls below $\alpha$ in the r chart.

To explain the choice of CL = .5 and LCL = $\alpha$ in the r chart, we require the properties of $r_G(X)$ and $r_{G_m}(X)$ established by Liu and Singh (1993) and listed in Proposition 3.1.

Proposition 3.1. Assume that $F = G$ and $X \sim F$. Let $U[0, 1]$ denote a uniform distribution supported in $[0, 1]$, and let the notation $\to_{\mathcal{L}}$ stand for convergence in law. If $D_G(X)$ has a continuous distribution, then

a. $r_G(X) \sim U[0, 1]$, and
b. as $m \to \infty$, $r_{G_m}(X) \to_{\mathcal{L}} U[0, 1]$ along almost all $\{Y_1, \ldots, Y_m\}$ sequences, provided that $D_{G_m}(\cdot)$ converges to $D_G(\cdot)$ uniformly as $m \to \infty$.

Remark 3.1. The uniform convergence of $D_{G_m}(\cdot)$ holds for the simplicial depth if $G$ is absolutely continuous, and for the Mahalanobis depth if $G$ has a bounded second absolute moment.

Under $H_0$: $F = G$, Proposition 3.1 implies that the expected value of $r_G(X)$ is .5 and that of $r_{G_m}(X)$ is .5 almost surely for all sequences $\{Y_1, \ldots, Y_m\}$ for large $m$. This justifies choosing .5 to be CL of the r chart. When $r_G(X)$ (or $r_{G_m}(X)$) is much smaller than .5, there is doubt for $H_0$ and evidence to support $H_a$, signaling a possible quality deterioration. When $r_G(X)$ (or $r_{G_m}(X)$) is larger than .5, there is indication of a decrease in scale with perhaps a negligible location shift. This is seen as an improvement in quality, termed a gain in precision, and thus the process should not be viewed as out-of-control. Therefore, there is only an LCL in the r chart. The uniform distribution of $r_G(X)$ (or $r_{G_m}(X)$) implies clearly that LCL should be $\alpha$.
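As a rough illustration of the r chart construction in Section 3.1, the following sketch computes the rank statistic (6) from the sample Mahalanobis depth (4) and flags observations that fall below LCL = $\alpha$. The function names, the use of NumPy, and the simulated data in the demonstration are assumptions made for this example; they are not the code or the data of the simulation study, which used simplicial depth.

```python
import numpy as np

def mahalanobis_depth(points, reference):
    """Sample Mahalanobis depth (4) of each row of `points` w.r.t. `reference`."""
    reference = np.asarray(reference, dtype=float)
    points = np.atleast_2d(np.asarray(points, dtype=float))
    mean = reference.mean(axis=0)                          # sample mean Y-bar
    cov = np.atleast_2d(np.cov(reference, rowvar=False))   # sample covariance S
    diff = points - mean
    # Quadratic form (y - Ybar)' S^{-1} (y - Ybar), one value per row.
    quad = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + quad)

def r_statistic(new_points, reference):
    """Statistic (6): fraction of reference points whose depth is <= that of each new point."""
    ref_depth = mahalanobis_depth(reference, reference)
    new_depth = mahalanobis_depth(new_points, reference)
    return (ref_depth[None, :] <= new_depth[:, None]).mean(axis=1)

def r_chart_signals(r_values, alpha=0.025):
    """Time indices at which the r chart signals, i.e. r falls below LCL = alpha."""
    return np.flatnonzero(np.asarray(r_values, dtype=float) < alpha)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.standard_normal((500, 2))            # hypothetical reference sample
    in_control = rng.standard_normal((40, 2))            # hypothetical in-control stream
    shifted = 3.0 + 2.0 * rng.standard_normal((40, 2))   # hypothetical shift plus scale increase
    r = r_statistic(np.vstack([in_control, shifted]), reference)
    print("out-of-control signals at indices:", r_chart_signals(r))
```

Because only an LCL is used, large values of r (a gain in precision) never trigger a signal in this sketch, in line with the one-sided alternative in (10).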
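The r statistic can equally be driven by the sample simplicial depth (2). For bivariate data the simplices are triangles, and a direct, unoptimized evaluation simply counts the open triangles that contain the point. The sketch below is such a brute-force version, written under the assumption that NumPy is available; the function names are hypothetical, and this is not the Rousseeuw and Ruts (1992) algorithm used in the simulation study.

```python
from itertools import combinations

import numpy as np

def _cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_open_triangle(y, a, b, c):
    """True if y lies strictly inside the triangle with vertices a, b, c."""
    d1, d2, d3 = _cross(a, b, y), _cross(b, c, y), _cross(c, a, y)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)

def simplicial_depth_2d(y, reference):
    """Sample simplicial depth (2) for k = 2: fraction of triangles containing y."""
    reference = np.asarray(reference, dtype=float)
    triples = list(combinations(range(len(reference)), 3))
    hits = sum(
        in_open_triangle(y, reference[i], reference[j], reference[k])
        for i, j, k in triples
    )
    return hits / len(triples)

if __name__ == "__main__":
    corners = [(0, 0), (4, 0), (0, 4), (4, 4)]
    print(simplicial_depth_2d((1, 2), corners))   # two of the four triangles contain (1, 2)
```

The cost grows with the number of triples of reference points, so this version is only practical for small reference samples; it is meant to make definition (2) concrete rather than to be efficient.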
Table 1. Simplicial Depth Values and Ranks

  X    D(X)    r(X)      X    D(X)    r(X)
  1   .0028   .082      41   0       .022
  2   .2263   .948      42   0       .022
  3   .1794   .840      43   0       .022
  4   .0196   .256      44   .0107   .194
  5   .1144   .670      45   0       .022
  6   .0025   .074      46   .0041   .100
  7   .0115   .196      47   0       .022
  8   .0443   .392      48   0       .022
  9   .0389   .358      49   .0111   .194
 10   .0268   .296      50   .0261   .290
 11   0       .022      51   0       .022
 12   .1962   .888      52   0       .022
 13   .1651   .812      53   0       .022
 14   .1835   .852      54   0       .022
 15   .0249   .280      55   0       .022
 16   .0583   .446      56   0       .022
 17   .1106   .658      57   0       .022
 18   .0022   .068      58   0       .022
 19   .2315   .962      59   0       .022
 20   .0366   .348      60   0       .022
 21   .0711   .502      61   .0932   .588
 22   .0645   .472      62   0       .022
 23   .0103   .186      63   0       .022
 24   .0797   .542      64   0       .022
 25   .0870   .566      65   0       .022
 26   .0051   .114      66   0       .022
 27   .0518   .424      67   0       .022
 28   0       .022      68   .0123   .202
 29   .0044   .102      69   0       .022
 30   .0903   .576      70   .1984   .896
 31   .1900   .866      71   .0250   .280
 32   .1621   .800      72   .0087   .160
 33   .1499   .768      73   0       .022
 34   .0757   .528      74   0       .022
 35   .0514   .420      75   0       .022
 36   .0581   .444      76   0       .022
 37   .1096   .656      77   0       .022
 38   .0570   .436      78   0       .022
 39   .2082   .920      79   0       .022
 40   .1927   .876      80   0       .022

Remark 3.2. Even though the r chart does not have the UCL to make its CL the center line of the in-control region, the CL here does serve as a reference point to allow us to observe whether a pattern or trend is developing in a sequence of samples.

3.2 The Q Charts

The idea behind the Q chart is similar to that of the univariate $\bar{X}$ chart. When $X_1, X_2, \ldots$ are univariate and $G$ is normal, the $\bar{X}$ chart plots the averages of consecutive subsets of the $X_i$'s. The $\bar{X}$ chart may prevent a false alarm when the process is actually in control but some individual sample point falls outside the control limits merely due to random fluctuations. This is an advantage over the $X$ chart. In the multivariate setting we propose to plot the averages of subsets of the $r_G(X_i)$'s (or $r_{G_m}(X_i)$'s). Assume that each subset has size $n$. In the notation of (8) and (9), the averages of the $r_G(X_i)$'s and $r_{G_m}(X_i)$'s are given by $Q(G, F_n^j)$ and $Q(G_m, F_n^j)$. Here $F_n^j$ is the empirical distribution of the $X_i$'s in the $j$th subset, $j = 1, 2, \ldots$. The Q chart plots

$$\{Q(G, F_n^1), Q(G, F_n^2), \ldots\}$$

or

$$\{Q(G_m, F_n^1), Q(G_m, F_n^2), \ldots\}$$

if only $Y_1, \ldots, Y_m$ are available.

The main issue now is to set the correct values for CL and LCL in this Q chart. This depends on the choice of $n$. We shall see that when $n$ is large, in view of the approximations described in Proposition 3.2, CL should be .5, whereas LCL should be $(.5 - z_\alpha (12n)^{-1/2})$ for plotting $\{Q(G, F_n^j)\}$'s and $\{.5 - z_\alpha \sqrt{(1/12)[(1/m) + (1/n)]}\}$ for plotting $\{Q(G_m, F_n^j)\}$'s (cf. Fig. 3). This approximation seems to be quite reasonable even when $n$ is as small as 5. In practice, however, $n$ can be even smaller, say 3 or 4. In this case, we may use the exact distributions for $Q(G, F_n)$ given in Proposition 3.3. It turns out that for a small $\alpha$ value the Q chart should have CL = .5 and LCL = $(n!\alpha)^{1/n}/n$.

First we describe the large $n$ asymptotics. The Q chart corresponds to the $\alpha$-level test based on $Q(G, F_n)$ (or $Q(G_m, F_n)$) for testing the same set of hypotheses in (10). These are actually two of the several multivariate rank tests studied by Liu (1992) and Liu and Singh (1993). Their main asymptotic properties are as follows.

Proposition 3.2. Assume that the conditions in Proposition 3.1 hold. Then

a. as $n \to \infty$, $[Q(G, F_n) - \tfrac{1}{2}] \to_{\mathcal{L}} N(0, 1/(12n))$; and
b. as $\min(m, n) \to \infty$, $[Q(G_m, F_n) - \tfrac{1}{2}] \to_{\mathcal{L}} N\{0, [(1/m) + (1/n)]/12\}$, under the following additional condition: if $MD(\cdot)$ is used to define $Q(\cdot, \cdot)$, and $G$ has a bounded fourth absolute moment; if $SD(\cdot)$ is used to define $Q(\cdot, \cdot)$, and $G$ is a one-dimensional distribution and its density is bounded above and below in a neighborhood of the median (or center).

The statement (a) is a straightforward application of the central limit theorem, because $Q(G, F_n)$ is just the average of $n$ iid uniform random variables. The statement (b) has been established by Liu and Singh (1993). Although (b) has been proven only for $R^1$ in the case of $SD$, it was conjectured by Liu and Singh (1993) with the support of simulation results that it actually holds for any $k$-dimensional $G$. It is now evident that CL and LCL should be set to the values indicated earlier when $n$ is large.

When $n$ is small, the foregoing asymptotic results may not be applicable. Since LCL in this case is the $\alpha$th quantile of the distribution of $Q(G, F_n) = (1/n) \sum_{i=1}^{n} r_G(X_i)$, we need the distribution of the average of uniform random variables (cf. Prop. 3.1). This follows directly from the formula for the distribution of the sum of uniform random variables provided in Proposition 3.3.

Proposition 3.3. Let $\{U_1, \ldots, U_n\}$ be an iid sample from $U[0, 1]$, and let $H_n(t)$ be the distribution function of $\sum_{i=1}^{n} U_i$; that is, $H_n(t) = P\{\sum_{i=1}^{n} U_i \le t\}$. Then for each $n = 1, 2, \ldots$, $H_n(t) = 0$ for $t \le 0$ and

$$H_n(t) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t - k)_+^n, \qquad (11)$$
Table 2. Q-values (n = 4)

.5315  .3330  .3910  .5975  .5090  .4255  .2815  .5860  .5400  .7220
.0650  .0415  .1320  .0220  .0220  .1635  .0670  .3395  .0220  .0220

where

$$(x)_+^n = \begin{cases} 0, & \text{if } x < 0; \\ x^n, & \text{if } x \ge 0. \end{cases}$$

This formula has been derived by Feller (1971). The expression (11) shows that $H_n(\cdot)$ is a piecewise polynomial. For our purpose, the most relevant part of the polynomial is

$$H_n(t) = \begin{cases} \dfrac{1}{n!} t^n, & \text{if } 0 \le t < 1; \\[4pt] \dfrac{1}{n!} \left(t^n - n(t-1)^n\right), & \text{if } 1 \le t < 2; \\[4pt] \dfrac{1}{n!} \left(t^n - n(t-1)^n + \dfrac{n(n-1)}{2}(t-2)^n\right), & \text{if } 2 \le t < 3. \end{cases} \qquad (12)$$

To determine LCL for our Q chart for small $n$, we need to find the value $w_\alpha$ such that $P((1/n)\sum_{i=1}^{n} U_i \le w_\alpha) = \alpha$ or, equivalently, $H_n(n w_\alpha) = \alpha$. Formula (12) implies that for $\alpha \le 1/n!$, $(n w_\alpha)^n / n! = \alpha$. Consequently, $w_\alpha = (n!\alpha)^{1/n}/n$. This justifies our choice of LCL for the Q chart. For example, when $n = 4$ and $\alpha = .025$, then $w_{.025} = [24(.025)]^{1/4}/4 = .220$. This value is used as the LCL for the Q chart in Figure 2, where the $X_i$'s are grouped in sets of 4. It is also clear that CL here should be .5, because it is the expected value of the average of $n$ iid $U[0, 1]$ random variables. Note that in practical situations in quality control, $\alpha$ is usually chosen to be .0027 or smaller. Thus when $n$ is not greater than 4, the LCL $w_\alpha$ is given by $(n!\alpha)^{1/n}/n$ as shown earlier. However, if for whatever reasons, $\alpha$ is chosen to be greater than $1/n!$, then the proper piecewise formula in (12) should be used to determine the value for $w_\alpha$. For example, for $n = 4$ and $\alpha = .1$, we would need to solve the equation $(1/4!)((4w_{.1})^4 - 4((4w_{.1}) - 1)^4) = .1$. The solution is unique, because $H_n(\cdot)$ is a strictly increasing function. In general, there are no convenient closed forms for solutions of polynomial equations of high orders. However, they can be easily obtained by using Newton's method or by using computer algorithms in, say, Mathematica.

3.3 The S Charts

We shall use the univariate CUSUM chart to motivate the S chart. When the $X_i$'s are univariate, the simplest CUSUM chart is basically the plot of $\sum_{i=1}^{n} (X_i - \mu)$, which reflects the pattern of the total deviation from the expected value. It is more effective than the $X$ chart or the $\bar{X}$ chart in detecting small process change and is perhaps the most used chart. In the multivariate setting, the idea of CUSUM chart naturally suggests plotting the values $S_n(G)$ and $S_n(G_m)$ defined by

$$S_n(G) = \sum_{i=1}^{n} \left[r_G(X_i) - \tfrac{1}{2}\right] \qquad (13)$$

and

$$S_n(G_m) = \sum_{i=1}^{n} \left[r_{G_m}(X_i) - \tfrac{1}{2}\right]. \qquad (14)$$

Since $S_n(G) = n[Q(G, F_n) - 1/2]$ and $S_n(G_m) = n[Q(G_m, F_n) - 1/2]$, we can immediately deduce the following from Proposition 3.2.

Proposition 3.4. Under the conditions described in Proposition 3.2, we have

a. $S_n(G) \to_{\mathcal{L}} N(0, n/12)$ as $n \to \infty$, and
b. $S_n(G_m) \to_{\mathcal{L}} N(0, n^2[(1/m) + (1/n)]/12)$, as $\min(m, n) \to \infty$.

Proposition 3.4 implies that the LCL for the S chart based on $S_n(G)$ is $-z_\alpha (n/12)^{1/2}$ and the LCL for the S chart based on $S_n(G_m)$ is $-z_\alpha \sqrt{n^2[(1/m) + (1/n)]/12}$. We observe that the control limit here is a curve rather than a line, as shown in Figure 4. In fact, the control limit curves down following $\sqrt{n}$. When $n$ is large, the S chart can easily exceed the standard paper size, which is impractical. Thus it is convenient to standardize all the CUSUM's to have a straight line control limit (see Fig. 5). This means plotting $S_n^*(G) = S_n(G)/\sqrt{n/12}$ or $S_n^*(G_m) = S_n(G_m)/\sqrt{n^2[(1/m) + (1/n)]/12}$ for $n = 1, 2, \ldots$. This S* chart has CL = 0 and LCL = $-z_\alpha$.

4. SIMULATION RESULTS

In this section we use a bivariate data set to illustrate the construction of the control charts discussed earlier. The simulation is carried out using S language on a SUN workstation.

The data set is obtained as follows. Let G ~ N ( (8) , (6 V) ). We generate 540 sample points from $G$, labeling the first 500 as $Y_1, \ldots, Y_{500}$ and the last 40 as $X_1, \ldots, X_{40}$. We also generate 40 sample points from the distribution N ( (n, (6 ~)) and label these 40 sample points as $X_{41}, \ldots, X_{80}$. The distributions here have been chosen to be normal just to make the evaluation of the outcome easier. Normality is not required for the applicability of the charts. Note that there is a clear mean shift and a scale increase in the distribution for the last 40 $X_i$'s. In principle, we should expect all our charts to detect this change. As Figures 1-5 show, this is indeed the case.

For each $X_i$, we compute its simplicial depth, using the FORTRAN algorithm developed by Rousseeuw and Ruts
Table 3. Q-values (n = 10)

.4112  .5336  .3506  .6714  .0910  .0220  .1840  .0616
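The Q-values tabulated here are group averages of the rank statistics, and the two kinds of LCL for the Q chart (the large-$n$ normal approximation and the exact small-$n$ quantile from (11) and (12)) are simple to evaluate numerically. The sketch below is one possible way to do so; the function names are hypothetical, NumPy and the standard-library `statistics.NormalDist` are assumed, and bisection is used here in place of the Newton or Mathematica solvers mentioned in the text.

```python
from math import comb, factorial
from statistics import NormalDist

import numpy as np

def q_values(r_values, n):
    """Averages of consecutive groups of n rank statistics, as in (8) and (9)."""
    r = np.asarray(r_values, dtype=float)
    groups = len(r) // n                      # an incomplete trailing group is dropped
    return r[: groups * n].reshape(groups, n).mean(axis=1)

def q_chart_lcl_large_n(n, alpha, m=None):
    """Normal-approximation LCL; pass m when the reference sample replaces G."""
    z = NormalDist().inv_cdf(1.0 - alpha)     # upper alpha critical value z_alpha
    var = 1.0 / (12.0 * n) if m is None else (1.0 / m + 1.0 / n) / 12.0
    return 0.5 - z * var ** 0.5

def irwin_hall_cdf(t, n):
    """H_n(t) of (11): distribution function of the sum of n iid U[0, 1] variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    # Terms with t - k < 0 vanish, so the sum stops at k = floor(t).
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(int(t) + 1))
    return total / factorial(n)

def q_chart_lcl_exact(n, alpha, tol=1e-12):
    """Small-n LCL w_alpha solving H_n(n * w) = alpha."""
    if alpha <= 1.0 / factorial(n):
        return (factorial(n) * alpha) ** (1.0 / n) / n   # closed form from the first piece of (12)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:                      # bisection works because H_n is increasing
        mid = 0.5 * (lo + hi)
        if irwin_hall_cdf(n * mid, n) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(q_values([0.082, 0.948, 0.840, 0.256], n=4))   # first group of Table 1 ranks
    print(round(q_chart_lcl_exact(4, 0.025), 3))         # LCL used for the n = 4 chart
    print(round(q_chart_lcl_large_n(10, 0.025, m=500), 4))
```

With $n = 4$ and $\alpha = .025$ the exact routine reproduces the value .220 quoted for Figure 2, and with $n = 10$, $m = 500$ the approximation gives the .3193 quoted for Figure 3.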


Table 4. S-values

   -.418     .030     .370     .126     .296    -.130    -.434    -.542    -.684    -.888
  -1.366    -.978    -.666    -.314    -.534    -.588    -.430    -.862    -.400    -.552
   -.550    -.578    -.892    -.850    -.784   -1.170   -1.246   -1.724   -2.122   -2.046
  -1.680   -1.380   -1.112   -1.084   -1.164   -1.220   -1.064   -1.128    -.708    -.332
   -.810   -1.288   -1.766   -2.072   -2.550   -2.950   -3.428   -3.906   -4.212   -4.422
  -4.900   -5.378   -5.856   -6.334   -6.812   -7.290   -7.768   -8.246   -8.724   -9.202
  -9.114   -9.592  -10.070  -10.548  -11.026  -11.504  -11.982  -12.280  -12.758  -12.362
 -12.582  -12.922  -13.400  -13.878  -14.356  -14.834  -15.312  -15.790  -16.268  -16.746

(1992). This algorithm is highly efficient, because it requires only $O(m \log m)$ steps in computing the simplicial depths for $m$ data points, instead of $O(m^4)$ steps as required by direct computation based on solving systems of linear equations. The simplicial depth values of $X_i$'s are recorded in the first column of Table 1. Based on these values we can compute all $r_{G_m}(X_i)$ using (6), and record them in the second column of Table 1. Figure 1 gives the plot of the $r_{G_m}(X_i)$'s with CL = .5 and LCL = .025, which is the $\alpha$ value that we choose for all five charts. It clearly shows that the process is out-of-control in the second half, with most of the $r_{G_m}(X_i)$'s falling below LCL. The few false alarms in the first half of the $X_i$'s should be attributed to random fluctuations in the same manner that false alarms are characterized in a univariate $X$ chart.

Figures 2 and 3 show the Q charts with the group size $n = 4$ and $n = 10$. The $\{Q(G_m, F_n^j), j = 1, 2, \ldots\}$ are computed according to the definition (9) and are recorded in Tables 2 and 3. For Figure 2, the CL has been set to .5 and the LCL has been set to .220, following Proposition 3.3. In Figure 3, the results in Proposition 3.2 lead to the choice of CL = .5 and LCL = $\{.5 - z_\alpha \sqrt{(1/12)[(1/m) + (1/n)]}\}$, which turns out to be .3193 when $\alpha = .025$. Both plots clearly show that the process is out-of-control in the second half. We also observe that the averaging of $r_{G_m}(\cdot)$'s in Q has eliminated the random fluctuations appearing in the first half of the r chart in Figure 1. In principle, because the underlying distribution here is specified, we can use for example the computing package Mathematica to compute the exact values of $D_G(\cdot)$'s and hence $Q(G, F_n^j)$, $j = 1, 2, \ldots$ and give the corresponding Q chart. The difference of this chart and our Figure 2 appears to be negligible.

Figure 4 illustrates the S chart of the $S_n(G_m)$ values in Table 4. Since the S values are not standardized here, the LCL is $-z_\alpha \sqrt{(n^2/12)[(1/m) + (1/n)]}$. To keep the chart within standard paper size, we need to adopt a much smaller scale for the S axis. By contrast, in Figure 5, the S values have been standardized, and hence no severe rescaling is needed. The standardized S values are recorded in Table 5, labeled as S*. The control limit LCL is a straight line $-z_\alpha$, which is $-1.96$ in this case. For both figures, CL equals zero.

In the simulation here, we have chosen $m = 500$. Clearly, larger values of $m$ give better approximations to the limiting distributions stated in Propositions 3.1, 3.2, and 3.4 and to LCL's for the r, Q, and S charts. Our experience shows that the approximation results are reasonable when $m$ is as small as 50 in the bivariate case. We would recommend larger values for higher-dimensional observations.

5. CONCLUDING REMARKS

In addition to the $X$, $\bar{X}$, and CUSUM charts, there are more complicated control charts for monitoring a univariate process mean change, such as the moving average chart, the EWMA chart, and the CUSUM chart with a V mask (cf. Wetherill 1977). It would be interesting to develop our charts further along these lines. For example, a moving average chart based on the $r_{*}(\cdot)$ values in (5) or (6) can be readily constructed. To obtain proper control limits for this chart, one may apply the moving blocks bootstrap techniques of Liu and Singh (1992) to develop the distributions of the moving averages.

As discussed by Alt and Smith (1988), the classical multivariate control charts based on the $\chi^2$ or Hotelling's $T^2$ statistics (Hotelling 1949) are valid only when the process follows a normal distribution and can be used to detect a mean shift only. When the process is bivariate, a control ellipse may be used instead of the foregoing two charts. The control ellipse approach also requires the normality assumption for the underlying process, and it loses the chronological order of the plotted observations. In a different direction, one may use separate $X$ charts for individual component variables and then apply Bonferroni's inequality to provide a bound for the level of the combined test. As pointed out by Alt (1982), this inequality is not sharp enough to give an accurate level unless the component variables are independent. More precisely, this approach tends to overestimate the probability for asserting that the process is in control.

Since the sample Mahalanobis depth defined in (4) and Hotelling's $T^2$ are both measuring the quadratic distance of

Table 5. S*-values

 -1.447    .073    .738    .217    .456   -.183   -.564   -.659   -.783   -.963
 -1.411   -.966   -.632   -.287   -.471   -.501   -.355   -.691   -.312   -.419
  -.407   -.418   -.630   -.587   -.530   -.775   -.809  -1.098  -1.327  -1.257
 -1.014   -.819   -.649   -.623   -.659   -.680   -.585   -.611   -.378   -.175
  -.421   -.661   -.895  -1.037  -1.261  -1.442  -1.656  -1.866  -1.989  -2.066
 -2.264  -2.459  -2.650  -2.837  -3.020  -3.200  -3.377  -3.550  -3.721  -3.889
 -3.816  -3.980  -4.142  -4.300  -4.457  -4.610  -4.762  -4.840  -4.987  -4.794
 -4.840  -4.932  -5.075  -5.216  -5.355  -5.492  -5.627  -5.760  -5.892  -6.022
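The S and S* values follow mechanically from the rank statistics through (13), (14), and the standardization of Section 3.3. The sketch below shows one way this could be computed; the function names are hypothetical, NumPy is assumed, and the five input ranks are the first five $r(X)$ entries of Table 1, used only as a small check.

```python
import numpy as np

def s_values(r_values):
    """Cumulative sums S_n of (r - 1/2), as in (13) and (14)."""
    return np.cumsum(np.asarray(r_values, dtype=float) - 0.5)

def s_star_values(r_values, m=None):
    """Standardized CUSUM values S*_n; pass m when ranks come from a reference sample."""
    s = s_values(r_values)
    n = np.arange(1, len(s) + 1)
    # Variance n/12 when G is known, n^2 [(1/m) + (1/n)] / 12 otherwise (Proposition 3.4).
    var = n / 12.0 if m is None else n ** 2 * (1.0 / m + 1.0 / n) / 12.0
    return s / np.sqrt(var)

if __name__ == "__main__":
    ranks = [0.082, 0.948, 0.840, 0.256, 0.670]
    print(np.round(s_values(ranks), 3))              # compare with the start of Table 4
    print(np.round(s_star_values(ranks, m=500), 3))  # compare with the start of Table 5
```

On the S* scale a signal corresponds to a value below $-z_\alpha$, so the same straight-line limit applies at every $n$.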

a point to its mean, one may attempt to equate Hotelling's $T^2$ chart to our r or Q charts when Mahalanobis depth is used. Note that in our approach, Mahalanobis depth serves only as a stepping stone to reduce the observations to "ranks." What we chart here are the "ranks" but not the Mahalanobis depth values themselves. The determination of the control limit in Hotelling's $T^2$ plot requires the exact sampling distribution of Hotelling's $T^2$ statistic, whereas this is not needed in our charts due to the further transformation of statistics into ranks. Consequently, our charts based on Mahalanobis depth are different from the Hotelling $T^2$ plots. Regarding the choice of data depth for our charts, we note that if the underlying distribution is close to elliptical, then it is more efficient to use Mahalanobis depth. Otherwise, the more geometric type of depth, such as majority depth, simplicial depth, and Tukey's depth, may be more desirable, because they do not require moment conditions.

[Received September 1993. Revised January 1995.]

REFERENCES

Alt, F. (1982), "Multivariate Quality Control: State of the Art," ASQC Annual Quality Congress Transactions, pp. 886-893.
Alt, F., and Smith, N. (1988), "Multivariate Process Control," in Handbook of Statistics, 7, eds. P. R. Krishnaiah and C. R. Rao, Amsterdam: Elsevier, pp. 333-351.
Banks, J. (1989), Principles of Quality Control, New York: John Wiley.
Feller, W. (1971), Introduction to Probability Theory and Its Applications (2nd ed.), New York: John Wiley.
Hotelling, H. (1949), "Multivariate Quality Control," in Techniques in Statistical Analysis, eds. C. Eisenhart, M. W. Hastay, and W. A. Wallis, New York: McGraw-Hill.
Liu, R. (1990), "On a Notion of Data Depth Based on Random Simplices," The Annals of Statistics, 18, 405-414.
-- (1992), "Data Depth and Multivariate Rank Tests," in L1-Statistical Analysis and Related Methods, ed. Y. Dodge, Amsterdam: Elsevier, pp. 279-294.
Liu, R., and Singh, K. (1992), "Moving Blocks Bootstrap and Jackknife Capture Weak Dependence," in Exploring the Limits of Bootstrap, eds. R. LePage and L. Billard, New York: John Wiley, pp. 225-248.
-- (1993), "A Quality Index Based on Data Depth and Multivariate Rank Tests," Journal of the American Statistical Association, 88, 252-260.
Mahalanobis, P. C. (1936), "On the Generalized Distance in Statistics," Proceedings of the National Academy India, 12, 49-55.
Rousseeuw, P. J., and Ruts, I. (1992), "Bivariate Simplicial Depth," technical report, University of Antwerp, Dept. of Mathematics and Computer Science.
Tukey, J. W. (1975), "Mathematics and Picturing Data," Proceedings of the 1975 International Congress of Mathematics, 2, 523-531.
Wadsworth, H., Stephen, K. S., and Godfrey, A. B. (1986), Modern Methods for Quality Control and Improvement, New York: John Wiley.
Wetherill, G. B. (1977), Sampling Inspection and Quality Control (2nd ed.), New York: Chapman and Hall.
