Editorial Board
Robert N. Rodriguez, SAS Institute, Inc., Editor-in-Chief
Gary C. McDonald, General Motors R&D Center

John C. Young
McNeese State University, Lake Charles, Louisiana
InControl Technologies, Inc., Houston, Texas

Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania
American Statistical Association, Alexandria, Virginia
Copyright © 2002 by the American Statistical Association and the Society for Industrial
and Applied Mathematics.
10 9 8 7 6 5 4 3 2 1
All rights reserved. Printed in the United States of America. No part of this book may be
reproduced, stored, or transmitted in any manner without the written permission of the
publisher. For information, write to the Society for Industrial and Applied Mathematics,
3600 University City Science Center, Philadelphia, PA 19104-2688.
The materials on the CD-ROM are for demonstration only and expire after 90 days
of use. These materials are subject to the same copyright restrictions as hardcopy
publications. No warranties, expressed or implied, are made by the publisher, authors,
and their employers that the materials contained on the CD-ROM are free of error.
You are responsible for reading, understanding, and adhering to the licensing terms
and conditions for each software program contained on the CD-ROM. By using this
CD-ROM, you agree not to hold any vendor or SIAM responsible, or liable, for any
problems that arise from use of a vendor's software.
Contents
Preface xi
Bibliography 253
Index 259
Preface
Industry continually faces many challenges. Chief among these is the requirement to
improve product quality while lowering production costs. In response to this need,
much effort has been given to finding new technological tools. One particularly im-
portant development has been the advances made in multivariate statistical process
control (SPC). Although univariate control procedures are widely used in industry
and are likely to be part of a basic industrial training program, they are inadequate
when used to control processes that are inherently multivariate. What is needed
is a methodology that allows one to monitor the relationships existing among and
between the process variables. The T2 statistic provides such a procedure.
Unfortunately, the area of multivariate SPC can be confusing and complicated
for the practitioner who is unfamiliar with multivariate statistical techniques. Lim-
ited help comes from journal articles on the subject, as they usually include only
theoretical developments and a limited number of data examples. Thus, the prac-
titioner is not well prepared to face the problems encountered when applying a
multivariate procedure to a real process situation. These problems are further
compounded by the lack of adequate computer software to do the required complex
computations.
The motivation for this book came from facing these problems in our data con-
sulting and finding only a limited array of solutions. We soon decided that there
was a strong need for an applied text on the practical development and application
of multivariate control techniques. We also felt that limiting discussions to strate-
gies based on Hotelling's T2 statistic would be of most benefit to practitioners. In
accomplishing this goal, we decided to minimize the theoretical results associated
with the T2 statistic, as well as the distributional properties that describe its be-
havior. These results can be found in the many excellent texts that exist on the
theory of multivariate analysis and in the numerous published papers pertaining
to multivariate SPC. Instead, our major intent is to present to the practitioner
a modern and comprehensive overview on how to establish and operate an ap-
plied multivariate control procedure based on our conceptual view of Hotelling's
T2 statistic.
The intended audience for this book comprises professionals and students involved with
multivariate quality control. We have assumed the reader is knowledgeable about
univariate statistical estimation and control procedures (such as Shewhart charts)
and is familiar with certain probability functions, such as the normal, chi-square,
t, and F distributions. Some exposure to regression analysis also would be helpful.
Robert L. Mason
John C. Young
Chapter 1
Introduction to the T2 Statistic
retrieved from the data net for both a good-run time period and the upset time
period. He states that the operations staff is demanding that the source of the
problem be identified. You immediately empathize with them. Having lived
through your share of unit upsets, you know no one associated with the unit
will be happy until production is restored and the problem is resolved. There
is an entire megabyte of data stored on the diskette, and you must decide how
to analyze it to solve this problem.
What are your options? You import the data file to your favorite spreadsheet
and observe that there are 10,000 observations on 35 variables. These
variables include characteristics of the feedstock, as well as observations on
the process, production, and quality variables. The electronic data collector
has definitely done its job.
You remember a previous upset condition on the unit that was caused by a
significant change in the feedstock. Could this be the problem? You scan the
10,000 observations, but there are too many numbers and variables to see any
patterns. You cannot decipher anything.
The thought strikes you that a picture might be worth 1,000 observations.
Thus, you begin constructing graphs of the observations on each variable plot-
ted against time. Is this the answer? Changes in the observations on a variable
should be evident in its time-sequence graph. With 35 variables and 10,000
observations, this may involve a considerable time investment, but it should
be worthwhile. You readily recall that your college statistics professor used to
emphasize that graphical procedures were an excellent technique for gaining
data insight.
You initially construct graphs of the feedstock characteristics. Success
eludes you, however, and nothing is noted in the examination of these plots.
All the input components are consistent over the entire data set, including
over both the prior good-run period and the upset period. From this analysis,
you conclude that the problem must be associated with the 35 process vari-
ables. However, the new advanced process control (APC) system was working
well when you left the unit. The multivariable system keeps all operational
variables within their prescribed operational range. If a variable exceeded the
range, an alarm would have signaled this and the operator would have taken
corrective action. How could the problem be associated with the process when
all the variables are within their operational ranges?
Having no other options, you decide to go ahead and examine the process
variables. You recall from working with the control engineers in the instal-
lation of the APC system that they had been concerned with how the process
variables vary together. They had emphasized studying and understanding the
correlation structure of these variables, and they had noted that the variables
did not move independently of one another, but as a group. You decide to
examine scatter plots of the variables as well as time-sequence plots. Again,
you recall the emphasis placed on graphical techniques by that old statistics
professor. What was his name?
You begin the laborious task, soon realizing the enormity of the job. From
experience, it is easy to identify the most important control variables and the
1.1 Introduction
The problem confronting the young engineer in the above situation is common in
industry. Many dollars have been invested in electronic data collectors because
of the realization that the answer to most industrial problems is contained in the
observations. More money has been spent on multivariable control or APC sys-
tems. These units are developed and installed to ensure the containment of process
variables within prescribed operational ranges. They do an excellent job in reduc-
ing overall system variation, as they restrict the operational range of the variables.
However, an APC system does not guarantee that a process will satisfy a set of
baseline conditions, and it cannot be used to determine causes of system upsets.
As our young engineer will soon realize, a multivariate SPC procedure is needed
to work in unison with the electronic data collector and the APC system. Such a
procedure will signal process upsets and, in many cases, can be used to pinpoint
precursors of the upset condition before control is lost. When signals are identified,
the procedure allows for the decomposition of the signal in terms of the variables
that contributed to it. Such a system is the main subject of this book.
In addition, many of the variables follow certain mathematical relationships and form
a highly correlated set.
The correlation among the variables of a multivariate system may be due to
either association or causation. Correlation due to association in a production unit
often occurs because of the effects of some unobservable variable. For example, the
blades of a gas or steam turbine will become contaminated (dirty) from use over
time. Although the accumulation of dirt is not measurable, megawatt production
will show a negative correlation with the length of time from the last cleaning of
the turbine. The correlation between megawatt production and length of time since
last cleaning is one of association.
An example of a correlation due to causation is the relationship between tem-
perature and pressure since an increase in the temperature will produce a pressure
change. Such correlation inhibits examining each variable by univariate procedures
unless we take into account the influence of the other variable.
Multivariate process control is a methodology, based on control charts, that is
used to monitor the stability of a multivariate process. Stability is achieved when
the means, variances, and covariances of the process variables remain stable over
rational subgroups of the observations.
The analysis involved in the development of multivariate control procedures
requires one to examine the variables relative to the relationships that exist among
them. To understand how this is done, consider the following example. Suppose
we are analyzing data consisting of four sets of temperature and pressure readings.
The coordinates of the points are given as
where the first coordinate value is the temperature and the second value is the
pressure. These four data points, as well as the mean point of (175, 75), are plotted
in the scatter plot given in Figure 1.3. There also is a line fitted through the points
and two circles of varying sizes about the mean point.
If the mean point is considered to be typical of the sample data, one form of
analysis consists of calculating the distance each point is from the mean point. The
distance, say D, between any two points, (a1, a2) and (b1, b2), is given by the
formula

D = √[(a1 − b1)² + (a2 − b2)²].
From these calculations, it is seen that points 1 and 4 are located an equal distance
from the mean point on a circle centered at the mean point and having a radius of
3.16. Similarly, points 2 and 3 are located at an equal distance from the mean but
on a larger circle with a radius of 7.07.
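As a quick numerical check of these radii, the Euclidean distances can be computed directly. The four (temperature, pressure) points below are our own illustrative choice, since the original data listing is not reproduced here; they are chosen to be consistent with the stated mean point (175, 75) and the stated radii of 3.16 and 7.07:

```python
import math

# Hypothetical (temperature, pressure) readings -- chosen to be consistent
# with the text's mean point (175, 75) and radii 3.16 and 7.07; the book's
# actual data listing is not reproduced here.
points = [(172, 74), (170, 70), (180, 80), (178, 76)]
mean = (175, 75)

# Euclidean distance D between each point and the mean point
distances = [math.hypot(t - mean[0], p - mean[1]) for (t, p) in points]
for i, d in enumerate(distances, start=1):
    print(f"point {i}: D = {d:.2f}")
```

Points 1 and 4 fall on the circle of radius √10 ≈ 3.16, and points 2 and 3 on the circle of radius √50 ≈ 7.07.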
There are two major criticisms of this analysis. First, the variation in the
two variables has been completely ignored. From Figure 1.3, it appears that the
temperature readings contain more variation than the pressure readings, but this
could be due to the difference in scale between the two variables. However, in this
particular case the temperature readings do contain more variation.
The second criticism of this analysis is that the covariation between tempera-
ture and pressure has been ignored. It is generally expected that as temperature
increases, the pressure will increase. The straight line given in Figure 1.3 depicts
this relationship. Observe that as the temperature increases along the horizontal
axis, the corresponding value of the pressure increases along the vertical axis. This
poses an interesting question. Can a measure of the distance between two points
be devised that accounts for the presence of a linear relationship between the corre-
sponding variables and the difference in the variation of the variables? The answer
is yes; however, the distance is statistical rather than Euclidean and is not as easy
to compute.
To calculate statistical distance (SD), a measure of the correlation between the
variables of interest must be obtained. This is generally expressed in terms of the
covariance between the variables, as covariance provides a measure of how variables
vary together. For our example data, the sample covariance between temperature
and pressure, denoted as s12, is computed using the formula

s12 = Σ (x1i − x̄1)(x2i − x̄2)/(n − 1),

where x̄1 and x̄2 denote the sample means of temperature and pressure.
The sample variances for temperature and pressure as determined from the example
data are 22.67 and 17.33, respectively.
Using the value of the covariance, and the values of the sample variances and the
sample means of the variables, the squared statistical distance, (SD)2, is computed
using the formula

(SD)² = [s22(x1 − x̄1)² − 2s12(x1 − x̄1)(x2 − x̄2) + s11(x2 − x̄2)²]/(s11s22 − s12²),

where s11 and s22 denote the sample variances of temperature and pressure.
From this analysis it is concluded that our four data points are the same statistical
distance from the mean point. This result is illustrated graphically in Figure 1.4.
All four points satisfy the equation of the ellipse superimposed on the plot.
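The equal-statistical-distance result can be verified numerically. The four (temperature, pressure) points below are an illustrative reconstruction consistent with the quoted mean point (175, 75) and sample variances 22.67 and 17.33, since the original data listing is not reproduced here:

```python
import numpy as np

# Illustrative (temperature, pressure) data consistent with the summary
# statistics quoted in the text; the book's actual listing is not reproduced.
X = np.array([[172, 74], [170, 70], [180, 80], [178, 76]], dtype=float)
mean = X.mean(axis=0)                       # the mean point (175, 75)
S = np.cov(X, rowvar=False)                 # 2x2 sample covariance matrix
S_inv = np.linalg.inv(S)

# Squared statistical distance of each point from the mean point
sd2 = np.array([(x - mean) @ S_inv @ (x - mean) for x in X])
print(np.round(sd2, 4))                     # all four values coincide
```

Despite their different Euclidean distances, the four squared statistical distances are identical, which is exactly the result pictured by the ellipse.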
From a visual perspective, this result appears to be unreasonable. It is obvious
that points 1 and 4 are closer to the mean point in Euclidean distance than points
2 and 3. However, when the differences in the variation of the variables and the
relationships between the variables are considered, the statistical distances are the
same. The multivariate control procedures presented in this book are developed
using methods based on the above concept of statistical distance.
Although many different multivariate control procedures exist, it is our belief that
a control procedure built on the T2 statistic possesses all the above characteristics.
Like many multivariate charting statistics, the T2 is a univariate statistic. This is
true regardless of the number of process variables used in computing it. However,
because of its similarity to a univariate Shewhart chart, the T2 control chart is
sometimes referred to as a multivariate Shewhart chart. This relationship to common
univariate charting procedures facilitates the understanding of this charting
method.
Signal interpretation requires a procedure for isolating the contribution of each
variable and/or a particular group of variables. As with univariate control, out-
of-control situations can be attributed to individual variables being outside their
allowable operational range; e.g., the temperature is too high. A second cause of
a multivariate signal may be attributed to a fouled relationship between two or
more variables; e.g., the pressure is not where it should be for a given temperature
reading.
The signal interpretation procedure covered in this text is capable of separating
a T2 value into independent components. One type of component determines the
contribution of the individual variables to a signaling observation, while the other
components check the relationships among groups of variables. This procedure is
global in nature and not isolated to a particular data set or type of industry.
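For the bivariate case, this separation can be sketched as follows. The data are simulated and the observation is hypothetical; the conditional term is obtained here by subtraction, using the fact that the full T2 splits exactly into an unconditional term for one variable plus a conditional term for the other:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated bivariate "historical" data (illustrative only)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=100)
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

def t2(x, mean, cov):
    # Squared statistical distance (x - mean)' cov^{-1} (x - mean)
    d = np.atleast_1d(x) - np.atleast_1d(mean)
    return float(d @ np.linalg.inv(np.atleast_2d(cov)) @ d)

x_new = np.array([1.0, -1.0])          # hypothetical signaling observation

t2_full = t2(x_new, xbar, S)           # overall T^2 on both variables
t2_1 = t2(x_new[0], xbar[0], S[0, 0])  # unconditional term for variable 1
t2_2g1 = t2_full - t2_1                # conditional term: variable 2 given 1

print(round(t2_full, 3), round(t2_1, 3), round(t2_2g1, 3))
```

A large unconditional term points at a variable that is outside its own range, while a large conditional term points at a broken relationship between the variables.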
The T2 statistic is one of the more flexible multivariate statistics. It gives ex-
cellent performance when used to monitor independent observations from a steady-
state continuous process. It also can be based on either a single observation or the
mean of a subgroup of n observations. Minor adjustments in the statistic and its
distribution allow the movement from one form to the other.
Many industrial processes produce observations containing a time dependency.
For example, process units with a decaying cycle often produce observations that
can be modeled by some type of time-series function. The T2 statistic can be readily
adapted to these situations and can be used to produce a time-adjusted statistic.
The T2 statistic also is applicable to situations where the time correlation behaves
as a step function.
We have experienced no problems in applying the T2 statistic to batch or semi-
batch processes with targets specified or unspecified. In the case of target specifica-
tion, the T2 statistic measures the statistical distance the observed value is from the
specified target. In cases where rework is possible, such as blending, components
of the T2 decomposition can be used in determining the blending process.
Sensitivity to small process change is achieved with univariate control proce-
dures, such as Shewhart charts, through applications of zonal charts with run rules.
Small, consistent process changes in a T2 chart can be detected by using certain
components of the decomposition of a T2 statistic. This is achieved by monitoring
the residual error inherent to these terms. The detection of small process shifts is
so important that a whole chapter of the text is devoted to this procedure.
An added benefit of the T2 charting procedure is the potential to do on-line
experimentation that can lead to local optimization.
1.5 Summary
Industrial process control generally involves monitoring a set of correlated variables.
Such correlation confounds the interpretation of univariate procedures run on indi-
vidual variables. One method of overcoming this problem is to use a Hotelling's T2
statistic. As demonstrated in our discussion, this statistic is based on the concept
of statistical distance. It consolidates the information contained in a multivariate
observation to a single value, namely, the statistical distance the observation is from
the mean point.
Desirable characteristics for a multivariate control chart include ease of applica-
tion, adequate signal interpretation, flexibility, sensitivity to small process changes,
and available software to use it. One multivariate charting procedure that possesses
all these characteristics is the method based on the T2 statistic. In the following
chapters of this book, we explore the various properties of the T2 charting procedure
and demonstrate its value.
Chapter 2
Basic Concepts about the
T2 Statistic
2.1 Introduction
Some fundamental concepts about the T2 statistic must be presented before we
can discuss its usage in constructing a multivariate control chart. We begin with
a discussion of statistical distance and how it is related to the T2 statistic. How
statistical distance differs from straight-line or Euclidean distance is an important
part of the coverage. Included also is a discussion of the relationship between
the univariate Student t statistic and its multivariate analogue, the T2 statistic.
The results lead naturally to the understanding of the probability functions used
to describe the T2 statistic under a variety of different circumstances. Having
knowledge of these distributions aids in determining the UCL value for a T2 chart,
as well as the corresponding false alarm rate.
The reader unfamiliar with vectors and matrices may find the
definitions and details given in this chapter's appendix (section 2.8) to be helpful
in understanding these results. Suppose we denote a multivariate observation on p
variables in vector form as X' = (x1, x2, ..., xp). Our main concern is in processing
the information available on these p variables. One approach is to use graphical
techniques, which are usually excellent for this task, but plotting points in a p-
dimensional space (p > 3) is severely limited. This restriction inhibits overall
viewing of the multivariate situation. Another method for examining information
provided in a p-dimensional observation is to reduce the multivariate data vector
to a single univariate statistic. If the resulting statistic contains information on
all p variables, it can be interpreted and used in making decisions as to the status
of a process. There are numerous procedures for achieving this result, and we
demonstrate two of them below.
Suppose a process generates uncorrelated bivariate observations, (x1, x2), and
it is desired to represent them graphically. It is common to construct a two-
dimensional scatter plot of the points. Also, suppose there is interest in determining
the distance a particular point is from the mean point. The distance between two
points is always measured as a single number or value. This is true regardless of
how many dimensions (variables) are involved in the problem.
The usual straight-line (Euclidean) distance measures the distance between two
points by the number of units that separate them. The squared straight-line dis-
tance, say D, between a point (£1,22) and the population mean point (//i,/^) is
defined as
Note that we have taken the bivariate observation, (x1, x2), and converted it
to a single number D, the distance the observation is from the mean point. If this
distance, D, is fixed, all points that are the same distance from the mean point can
be represented as a circle with center at the mean point and a radius of D (i.e., see
Figure 2.1). Also, any point located inside the circle has a distance to the mean
point less than D.
Unfortunately, the Euclidean distance measure is unsatisfactory for most statistical
work (e.g., see Johnson and Wichern (1998)). Although each coordinate of an
observation contributes equally to determining the straight-line distance, no consideration
is given to differences in the variation of the two variables as measured
by their variances, σ1² and σ2², respectively. To correct this deficiency, consider the
standardized values (x1 − μ1)/σ1 and (x2 − μ2)/σ2, which lead to the squared
statistical distance

(SD)² = (x1 − μ1)²/σ1² + (x2 − μ2)²/σ2². (2.1)
The value SD, the square root of (SD)2 in (2.1), is known as statistical distance.
For a fixed value of SD, all points satisfying (2.1) are the same statistical distance
from the mean point. The graph of such a group of points forms an ellipse, as is
illustrated in the example given in Figure 2.2. Any point inside the ellipse will have
a statistical distance less than SD, while any point located outside the ellipse will
have a statistical distance greater than SD.
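For the uncorrelated case, (SD)² is just the sum of the squared standardized coordinates. A minimal sketch, with means and standard deviations of our own choosing, shows why the contour is an ellipse: two points at the same Euclidean distance from the mean can have very different statistical distances:

```python
import math

# Assumed population parameters (illustrative only)
mu1, mu2 = 100.0, 50.0
sigma1, sigma2 = 10.0, 2.0

def statistical_distance(x1, x2):
    # Uncorrelated case: SD^2 = ((x1-mu1)/sigma1)^2 + ((x2-mu2)/sigma2)^2
    z1 = (x1 - mu1) / sigma1
    z2 = (x2 - mu2) / sigma2
    return math.sqrt(z1 ** 2 + z2 ** 2)

# Both points lie 4 Euclidean units from the mean point (100, 50):
d_a = statistical_distance(104, 50)   # shift in the high-variance variable
d_b = statistical_distance(100, 54)   # shift in the low-variance variable
print(d_a, d_b)
```

The 4-unit shift in the low-variance variable yields SD = 2.0, five times the SD = 0.4 produced by the same shift in the high-variance variable.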
In comparing statistical distance to straight-line distance, there are some major
differences to be noted. First, since standardized variables are utilized, the statis-
tical distance is dimensionless. This is a useful property in a multivariate process
since many of the variables may be measured in different units. Second, any two
points on the ellipse in Figure 2.2 have the same SD but could have possibly dif-
ferent Euclidean distances from the mean point. If the two variables have equal
variances and are uncorrelated, the statistical and Euclidean distance, apart from
a constant multiplier, will be the same; otherwise, they will differ.
The major difference between statistical and Euclidean distance in Figure 2.2
is that the two variables used in statistical distance are weighted inversely by their
standard deviations, while both variables are equally weighted in the straight-line
distance. Thus, a change in a variable with a small standard deviation will con-
tribute more to statistical distance than a change in a variable with a large standard
deviation. In other words, statistical distance is a weighted straight-line distance
where more importance is placed on the variable with the smaller standard devia-
tion to compensate for its size relative to its mean.
It was assumed that the two variables in the above discussion are uncorrelated.
Suppose this is not the case and that the two variables are correlated. A scatter
plot of two positively correlated variables is presented in Figure 2.3. To construct
a statistical distance measure to the mean of these data requires a generalization
of (2.1).
a11(x1 − μ1)² + a12(x1 − μ1)(x2 − μ2) + a22(x2 − μ2)² = c, (2.2)

where the aij are specified constants satisfying the relationship (a12² − 4a11a22) < 0,
and c is a fixed value. By properly choosing the aij in (2.2), we can rotate the
ellipse while keeping the scatter of the two variables fixed, until a proper alignment
is obtained. For example, the ellipse given in Figure 2.4 is centered at the mean of
the two variables yet rotated to reflect the correlation between them.
The bivariate normal probability function underlying such data is given by

f(x1, x2) = [2πσ1σ2√(1 − ρ²)]⁻¹ exp[−(SD)²/2],

where −∞ < xi < ∞ for i = 1, 2, and σi > 0 represents the standard deviation of
xi. The value of (SD)² is given by

(SD)² = [1/(1 − ρ²)][(x1 − μ1)²/σ1² − 2ρ(x1 − μ1)(x2 − μ2)/(σ1σ2) + (x2 − μ2)²/σ2²], (2.4)

where ρ represents the correlation between the two variables, with −1 < ρ < 1.
The cross-product term between x1 and x2 in (2.4) accounts for the fact that
the two variables vary together and are dependent. When x1 and x2 are correlated,
the major and minor axes of the resulting ellipse differ from that of the variable
space (x1, x2). If the correlation is positive, the ellipse will tilt upward to the right,
and if the correlation is negative, the ellipse will tilt downward to the right.
In matrix notation, the squared statistical distance in (2.4) can be written as

(SD)² = (X − μ)′Σ⁻¹(X − μ), (2.5)

where X' = (x1, x2), μ' = (μ1, μ2), and Σ⁻¹ is the inverse of the matrix

Σ = [ σ1²  σ12 ]
    [ σ21  σ2² ],

where σ12 = σ21 = ρσ1σ2 is the covariance between x1 and x2. The matrix Σ is
referred to as the covariance matrix of x1 and x2. The expression in (2.5) is a form
of Hotelling's T2 statistic.
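The agreement between the matrix form in (2.5) and the expanded form in (2.4) can be sketched numerically; the parameter values here are our own illustrative assumptions:

```python
import numpy as np

# Assumed illustrative parameters
mu = np.array([100.0, 50.0])
sigma1, sigma2, rho = 10.0, 2.0, 0.8

# Covariance matrix with off-diagonal sigma12 = rho * sigma1 * sigma2
Sigma = np.array([[sigma1 ** 2,            rho * sigma1 * sigma2],
                  [rho * sigma1 * sigma2,  sigma2 ** 2]])

x = np.array([108.0, 52.0])

# Matrix form: (X - mu)' Sigma^{-1} (X - mu)
sd2_matrix = (x - mu) @ np.linalg.inv(Sigma) @ (x - mu)

# Expanded bivariate form with the cross-product (correlation) term
z1 = (x[0] - mu[0]) / sigma1
z2 = (x[1] - mu[1]) / sigma2
sd2_expanded = (z1 ** 2 - 2 * rho * z1 * z2 + z2 ** 2) / (1 - rho ** 2)

print(sd2_matrix, sd2_expanded)   # the two forms agree
```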
Equations for the contours of a bivariate normal density are obtained by fixing
the value of SD in (2.4). This can be seen geometrically by examining the bivariate
normal probability function presented in Figure 2.6. The locus, or path, of the
point X' = (x1, x2) traveling around the probability function at a constant height
is an ellipse. Ellipses of constant density are referred to as contours and can be
determined mathematically to contain a fixed amount of probability. For example,
the 75% and 95% contours for the bivariate normal function illustrated in Figure
2.6 are presented in Figure 2.7. The elliptical contours represent all points having
the same statistical distance or T2 statistic value.
Figure 2.7: Bivariate normal contours containing 75% and 95% of the probability.
This result can be generalized to the situation where X' = (x1, x2, ..., xp) is
described by the p-variate normal (multivariate normal (MVN)) probability function
given by

f(X) = (2π)^(−p/2) |Σ|^(−1/2) exp[−(1/2)(X − μ)′Σ⁻¹(X − μ)], (2.6)

where −∞ < xi < ∞ for i = 1, 2, ..., p. The mean vector of X' is given by
μ' = (μ1, μ2, ..., μp), and the covariance matrix is given by the p × p matrix Σ = (σij).
A diagonal element, σii, of the matrix Σ represents the variance of the ith
variable, and an off-diagonal element, σij, represents the covariance between the
ith and jth variables. Note that Σ is a nonsingular, symmetric, and positive definite
matrix. In this setting, the equation for an ellipsoidal contour of the MVN
distribution in (2.6) is given by

(X − μ)′Σ⁻¹(X − μ) = c², (2.7)

for a fixed constant c².
2.4 Student t versus Hotelling's T2

The squared form of the familiar Student t statistic, based on a sample of n
observations from a normal population, is

t² = (x̄ − μ)²/(s²/n), (2.9)

and its value is defined as the squared statistical distance between the sample mean
and the population mean.
The numerator of (2.9) is the squared Euclidean distance between x̄ and μ.
Thus, it is a measure of the closeness of the sample mean to the population mean.
As x̄ gets closer to μ, the value of t² approaches zero. Division of the squared
Euclidean distance by the estimated variance of x̄ (i.e., by s²/n) produces the squared
statistical distance. Hotelling (1931) extended the univariate t statistic to the
multivariate case using a form of the T2 statistic based on sample estimates (rather
than known values) of the covariance matrix. His derivation is described as follows.
Consider a sample of n observations X1, X2, ..., Xn, where Xi′ = (xi1, xi2, ...,
xip), i = 1, 2, ..., n, is taken from a p-variate normal distribution having a mean
vector μ and a covariance matrix Σ. A multivariate generalization of the t² statistic
is given by

T² = n(X̄ − μ)′S⁻¹(X̄ − μ), (2.10)

where X̄ is the vector of sample means and

S = (sij), i, j = 1, 2, ..., p, (2.11)

where sii is the sample variance of the ith variable and sij is the sample covariance
between the ith and jth variables.
between the ith and jth variables. The matrix S has many special properties. Those
properties that pertain to our use of the T2 as a control statistic for multivariate
processes are discussed in later sections.
In terms of probability distributions, the square of the t statistic in (2.9) has
the form

t² = (normal random variable) × (chi-square random variable/df)⁻¹ × (normal random variable),

where df represents the n − 1 degrees of freedom of the chi-square variate,
(n − 1)s²/σ², and the normal random variable is given by √n(x̄ − μ)/σ. In this
representation, the random variable x̄ and the random variable s² are statistically
independent. Similarly, the T2 statistic in (2.10) may be expressed as

T² = (multivariate normal vector)′ × (Wishart random matrix/df)⁻¹ × (multivariate normal vector), (2.12)

where the multivariate normal vector is given by √n(X̄ − μ). The random vector X̄ and
the random matrix S are statistically independent. The Wishart distribution (see
section 2.8.6 for details) in (2.12) is the multivariate generalization of the univariate
chi-square distribution.
Using the two forms presented in (2.10) and (2.12), it is possible to extend
Hotelling's T2 statistic to represent the squared statistical distance between many
different combinations of p-dimensional points. For example, one can use the T2
statistic to find the statistical distance between an individual observation vector
X and either its known population mean μ or its population mean estimate X̄.
Hotelling's T2 also can be computed between a sample mean, X̄i, of a subgroup
and the overall mean, X̄, of all the subgroups.
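A small simulation sketch (with data of our own making) shows the individual-observation form of this computation; the check at the end uses the identity that T2 values computed against the sample mean and S always sum to p(n − 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated historical data set (illustrative): n observations on p variables
n, p = 50, 3
X = rng.normal(size=(n, p))

Xbar = X.mean(axis=0)            # estimated mean vector
S = np.cov(X, rowvar=False)      # common covariance estimator S
S_inv = np.linalg.inv(S)

# T^2 value for each individual observation relative to the estimated mean
t2 = np.array([(x - Xbar) @ S_inv @ (x - Xbar) for x in X])

# Check: these T^2 values always sum to p(n - 1)
print(round(t2.sum(), 3), p * (n - 1))
```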
The beta distribution is confined to the unit interval (0,1). However, within this interval, the distribution can take
on many familiar shapes, such as those associated with the normal, chi-square, and
F distributions. Examples of the beta distribution for various parameter values are
depicted in Figure 2.10, and percentage points of the beta distribution are given in
Table A.5 in the appendix.
It was stated earlier that the distribution used in describing a T2 statistic when
the parameters of the underlying normal distribution are unknown is some form of
an F distribution. However, in (2.15) we have used the beta distribution to describe
the T2 statistic. The T2, for this case, can be expressed as an F statistic by using
a relationship that exists between the F and beta probability functions. The result
is given by
where
In practice, we generally choose to use the beta distribution rather than the F dis-
tribution in (2.16). Although this is done to emphasize that the observation vector
X is not independent of the estimates obtained from the HDS, either distribution
is acceptable.
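As a sketch of how the beta form is used in practice, an upper control limit (UCL) for the individual-observation case can be computed from a beta percentage point. The scale factor (n − 1)²/n and the parameters (p/2, (n − p − 1)/2) are the standard choices for an observation included in the HDS; since the displayed equations are not reproduced here, treat them as assumptions:

```python
from scipy.stats import beta

# Assumed standard beta form for an observation included in the HDS:
#   T^2 ~ [(n-1)^2 / n] * B(p/2, (n-p-1)/2)
n, p, alpha = 50, 3, 0.05

ucl = ((n - 1) ** 2 / n) * beta.ppf(1 - alpha, p / 2, (n - p - 1) / 2)
print(round(ucl, 3))
```

Any T2 value from the HDS exceeding this UCL would be flagged as a potential outlier at the chosen false alarm rate alpha.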
Since each T2 value obtained from the HDS depends on the same values of X̄
and S, a weak interdependence among the T2 values is produced. The correlation
between any two T2 values computed from an HDS is given as −1/(n − 1) (see
Kshirsagar and Young (1971)). It is easily seen that even for modest values of
n, this correlation rapidly approaches zero. Although this is not justification for
assuming independence, it has been shown (see David (1970) and Hawkins (1981))
that, as n becomes large, the set of T2 values behaves like a set of independent
observations. This fact becomes important when subjecting the T2 values of an
HDS to other statistical procedures.
Process control in certain situations is based on the monitoring of the mean of
a sample (i.e., subgroup) of m observations taken at each of k sampling intervals.
The distribution that describes the statistical distance between the sample mean
of the ith observation vector Xi and the HDS mean X is given by
where Si represents the sample covariance estimate for the data taken during the
ith sampling period, i = 1, 2, ..., k. With this estimate, the form and distribution
of the T2 statistic becomes
where
is based on a result given in Scholz and Tosch (1994). Note that the formula in
(2.21) contains a correction to the expression given in the Scholz and Tosch article. In selected
situations, the statistic in (2.21) serves as an alternative to the common T2 in
detecting step and ramp shifts in the mean vector.
Other covariance estimators have been constructed in a similar fashion by parti-
tioning the sample in different ways. For example, Wierda (1994) suggested forming
a covariance estimator by partitioning the data into independent, nonoverlapping
groups of size 2. Consider a sample of size n, where n is even. Suppose group 1
= {X1, X2}, group 2 = {X3, X4}, ..., group (n/2) = {Xn-1, Xn}. The estimated
covariance matrix for each group, Ci, i = 1, 2, ..., (n/2), is
process is in-control, does not represent random fluctuation about a constant mean
vector. As we will see later, a multivariate control procedure can accommodate
such systematic variations in the mean vector. The common estimator S captures
the total variation, including the systematic variation of the mean, whereas many
of the above-mentioned alternative estimators estimate only random variation, i.e.,
stationary variation. Thus, these alternative estimators require additional modeling
of the systematic variation to be effective. With autocorrelation, the T2 charts
based on S are less likely to signal, either when the data are out-of-control or
in-control, unless similar additional modeling is performed (i.e., see Chapter 10).
When data are collected from an MVN distribution, the common covariance
estimator has many interesting properties. For example, S is an unbiased estimator
and is the maximum likelihood estimator. The probability function that describes
S is known, and this is important in deriving the distribution of the T2 statistic.
Another important observation on S that is useful for later discussion is that its
value is invariant to a permutation of the data. Thus, the value of the estimator is
the same regardless of which one of the many possible arrangements of the data,
X1, X2, ..., Xn, is used in its computation.
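The permutation invariance of S is easy to confirm empirically. A minimal NumPy sketch (the data here are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # 20 observations on p = 3 variables

S1 = np.cov(X, rowvar=False)          # common covariance estimator S
S2 = np.cov(X[rng.permutation(20)], rowvar=False)   # same rows, shuffled

print(np.allclose(S1, S2))            # True: S ignores the ordering
```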
2.7 Summary
In this chapter, we have demonstrated the relationship between the univariate t
statistic and Hotelling's T2 statistic. Both are shown to be a measure of the statis-
tical distance between an observed sample mean and its corresponding population
mean. This concept of statistical distance, using the T2 statistic, was expanded to
include the distance a single observation is from the population mean (or its sample
estimate) and the distance a subgroup mean is from an overall mean.
With the assumption of multivariate normality, we presented several probability
functions used to describe the T2 statistic. This was done for control procedures
based on the monitoring of a single observation or the mean of a subgroup of obser-
vations. Since there are many occasions in which the T2 statistic is slightly modified
to accommodate a specific purpose, we will continue to introduce appropriate forms
of the T2 statistic and its accompanying distribution. The ability to construct a
Hotelling's T2 for these different situations adds to its versatility as a useful tool in
the development of a multivariate control procedure.
where the ai are constant values. Vectors are matrices with either one row (a row
vector) or one column (a column vector).
Consider a multivariate process involving p process variables. We denote the
first process variable as x1, the second as x2, ..., and the pth process variable as xp.
A simple way of denoting an observation (at a given point in time) on all process
variables is by using a (p x 1) column vector X, where
Another important form of the data matrix is achieved by subtracting the mean
vector from each observation vector. This form is given as
The inverse matrix exists only if the determinant of A is nonzero. This implies
that the matrix A must be nonsingular. If the inverse does not exist, the matrix
A is singular. Sophisticated computer algorithms exist to compute accurately the
inverses of large matrices and are contained in most computer packages used in the
analysis of multivariate data.
The definition dictates that the matrix A must be a square matrix, so that the
number of rows equals the number of columns. It also implies that the off-diagonal
elements of the matrix A, denoted by aij, i ≠ j, are equal, i.e., that aij = aji.
Performing the matrix and vector multiplication produces the following univariate
expression:
With the assumption that A is a symmetric matrix, so that a12 = a21, the above
expression can be written as
In this form, it is easy to see the relationship between the algebraic expression and
the matrix notation for a quadratic form. The matrix A of the quadratic form is
defined to be positive definite if the quadratic expression is larger than zero for all
nonzero values of the vector X.
To demonstrate this procedure, consider the quadratic form in three variables
given by
This expression is written in matrix notation as X'AX, where X' = (x1, x2, x3)
and
With a little manipulation, the above T2 can be written as the quadratic form
where X' = (x1, x2) and μ' = (μ1, μ2). The inverse of the matrix Σ is given by
where σ12 = σ21 = ρσ1σ2 is the covariance between x1 and x2, ρ is the corresponding
correlation coefficient, and σi is the square root of σii, i = 1, 2.
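Since the displayed formulas are not reproduced in this extract, the sketch below verifies the p = 2 algebra numerically: the matrix form of the quadratic agrees with its expanded expression in terms of ρ, σ1, and σ2 (all numeric values are arbitrary illustrations):

```python
import numpy as np

mu = np.array([1.0, 2.0])
s1, s2, rho = 2.0, 3.0, 0.8
Sigma = np.array([[s1 ** 2, rho * s1 * s2],
                  [rho * s1 * s2, s2 ** 2]])

x = np.array([2.5, 4.0])
d = x - mu

# Matrix form of the quadratic (statistical distance)
t2_matrix = d @ np.linalg.inv(Sigma) @ d

# Expanded algebraic form for p = 2
z1, z2 = d[0] / s1, d[1] / s2
t2_alg = (z1 ** 2 - 2 * rho * z1 * z2 + z2 ** 2) / (1 - rho ** 2)

print(np.isclose(t2_matrix, t2_alg))   # True
```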
is labeled a Wishart matrix. This name comes from Wishart (1928), who generalized
the joint distribution of the p(p + 1)/2 unique elements of this matrix. This result
is predicated on the assumption that the original random sample of n observation
vectors is obtained from an Np(μ, Σ), so that each Xi ~ Np(μ, Σ) for i = 1, 2, ..., n.
It is also assumed that the matrix S is a symmetric matrix.
The Wishart probability function is given as
The matrix S is positive definite and Γ(·) is the gamma function (e.g., see Anderson
(1984)). Unlike the MVN distribution, the Wishart density function has very little
use other than in theoretical derivations (e.g., see Johnson and Wichern (1998)).
For the case of p = 1, it can be shown that the Wishart distribution reduces to a
constant multiple of a univariate chi-square distribution. It is for this reason that
the distribution is thought of as being the multivariate analogue of the chi-square
distribution.
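The p = 1 reduction can be checked by simulation. The sketch below (an illustration of ours, not part of the text) compares the simulated mean of (n - 1)S/σ2 with the mean of a chi-square with n - 1 degrees of freedom:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
n, sigma = 10, 2.0

# 20,000 replications of the 1x1 "Wishart" matrix (n - 1) * S
w = np.array([(n - 1) * np.var(rng.normal(0.0, sigma, n), ddof=1)
              for _ in range(20_000)])

# For p = 1, (n - 1)S / sigma^2 should be chi-square with n - 1 df:
print(np.mean(w / sigma ** 2), chi2(df=n - 1).mean())   # both near n - 1 = 9
```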
Chapter 3
Checking Assumptions for Using a T2 Statistic
3.1 Introduction
As was indicated in Chapter 2, the distributions of various forms of the T2 statistic
are well known when the set of p-dimensional variables being sampled follows an
MVN distribution. The MVN assumption for the observation vectors guarantees the
satisfaction of certain conditions that lead to the known T2 distributions. However,
validating this assumption for the observation vectors is not an easy task. An
alternative approach for use with nonnormal distributions is to approximate the
sampling distribution of the T2. In this chapter, we take this latter approach by
seeking to validate only the univariate distribution of the T2 statistic, rather than
the MVN distribution of the observation vectors.
There are a number of other basic assumptions that must be made and re-
quirements that must be met in order to use the T2 as a control statistic. These
conditions include: (1) selecting a sample of independent (random) observations,
(2) determining the UCL to use in signal detection, (3) collecting a sufficient sample
size, and (4) obtaining a consistent estimator of the covariance matrix for the vari-
ables. In this and later chapters, we discuss these assumptions and requirements
and show how they relate to the T2 statistic. We also demonstrate techniques
for checking their validity and offer alternative procedures when these assumptions
cannot be satisfied.
where the T2 statistic is based on the formula given in (2.15). Since the T2 in
(3.1) is a univariate statistic with a univariate distribution, we propose performing
a goodness-of-fit test on its values to determine if the beta is the appropriate dis-
tribution, rather than on the values of X to determine if the MVN is the correct
distribution.
Although observations taken from an MVN can be transformed to a T2 statistic
having a beta distribution, it is unknown whether other multivariate distributions
possess this same property. The mathematics for the nonnormal situations quickly
becomes intractable. However, we do know that the beta distribution obtained
under MVN theory provides a good approximation for some nonnormal situations.
We illustrate this phenomenon below using a bivariate example.
We begin by generating 1,000 standardized bivariate normal observations having
a correlation of 0.80. A scatter plot of these observations is presented in Figure
3.1. For discussion purposes, three bivariate normal contours (i.e., equal altitudes
on the surface of the distribution) at fixed T2 values are superimposed on the data.
These also are illustrated on the graph.
Note the symmetrical dispersion of the points between the concentric T2 ellipses.
The concentration of points diminishes from the center outward. Using 31 contours
to describe the density of the 1,000 observations, and summing the number of
observations between these contours, we obtain the histogram of the T2 values
presented in Figure 3.2. The shape of this histogram corresponds to that of a beta
distribution.
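This experiment is easy to reproduce. The sketch below generates the same kind of data and checks the T2 values against the scaled beta distribution with a goodness-of-fit test (a Kolmogorov-Smirnov test stands in here for whatever test the reader prefers):

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(2)
n, p, rho = 1000, 2, 0.80
Sigma = np.array([[1.0, rho], [rho, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], Sigma, size=n)

xbar = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
d = X - xbar
t2 = np.einsum('ij,jk,ik->i', d, S_inv, d)   # T2 for each observation

# Scale T2 back to (0,1); it should behave like Beta(p/2, (n-p-1)/2)
u = t2 * n / (n - 1) ** 2
ks = kstest(u, 'beta', args=(p / 2, (n - p - 1) / 2))
print(ks.pvalue)   # typically large: no evidence against the beta fit
```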
In contrast to the above example, a typical scatter plot of a nonnormal bivariate
distribution is presented in Figure 3.3. Note the shape of the plot. The observa-
tions in Figure 3.3 are generated from the observations in Figure 3.1 by truncating
variable x1 at the value of 1. Distributions such as this occur regularly in industries
where run limits are imposed on operational variables. Truncation, which produces
long-tailed distributions, also occurs with the use of certain lab data. This can be
due to the detection limit imposed by the inability of certain types of equipment
to make determinations below (or above) a certain value.
In Figure 3.4, three bivariate normal contours at fixed T2 values, computed
using the mean vector and covariance matrix of the truncated distribution, are su-
perimposed on the nonnormal data. A major distinction between the scatter plots
in Figures 3.1 and 3.4 is the dispersion of points between the elliptical contours.
For the bivariate normal data in Figure 3.1, the points are symmetrically dispersed
between the contours. This is not the case for the nonnormal data in Figure 3.4.
For example, note the absence of points in the lower left area between the two
outer contours. Nevertheless, possibly due to this particular pattern of points, or
due to the size of the sample, the corresponding T2 histogram given in Figure 3.5
for this distribution has a strong resemblance to the histogram given in Figure 3.2.
Agreement of this empirical distribution to a beta distribution can be determined
by performing a univariate goodness-of-fit test.
role, consider first the kurtosis, denoted by α4, for a univariate distribution with a
known mean μ and a known standard deviation σ. The kurtosis is usually defined
as being the expected value of the fourth standardized moment, i.e.,
When X follows an MVN distribution, Np(μ, Σ), the kurtosis value in (3.3) reduces
to
where T2 is based on the formula given in (2.15). The relationship in (3.5) indicates
that large T2 values directly influence the magnitude of the kurtosis measure.
We can use the above results to relate the kurtosis value of a multivariate non-
normal distribution to that of an MVN distribution. As an example, consider two
uncorrelated variables, (x1, x2), having a joint uniform distribution represented by
a unit cube. Both marginal distributions are uniform and have a kurtosis value
of 1.8. Thus, these are "very flat" distributions relative to a univariate normal.
This "flatness" carries over to the joint distribution of the two variables. Using
(3.4), the kurtosis of the bivariate normal, with known parameters, is given by
p(p + 2) = 2(4) = 8. The kurtosis value for the joint uniform is found by evaluating
(3.3), where μ' = (0.5, 0.5) and Σ is a diagonal matrix with entries of 1/12 on
the diagonal. The value is calculated as 5.6. This implies that a bivariate uniform
distribution is considerably "flatter" than a bivariate normal distribution.
In contrast to the above, there are many combinations of distributions of p in-
dependent nonnormal variables that can produce a multivariate distribution that
has the same kurtosis value as a p-variate normal. For example, consider a mul-
tivariate distribution composed of two independent variables: x\ distributed as a
uniform (0,1) and x2 distributed as an exponential (i.e., f(x) = e^(-x) for x > 0 and
zero elsewhere). Using (3.3), the kurtosis of this bivariate nonnormal distribution
is 12.8. In comparison, the kurtosis of a bivariate normal distribution is 8. Thus,
this distribution is heavier in the tails than a bivariate normal. However, suppose
we keep adding another independent uniform variate to the above nonnormal dis-
tribution and observe the change in the kurtosis value. The results are provided in
Table 3.1.
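Table 3.1 can be reproduced from a closed form for independent variables: assuming the kurtosis measure in (3.3) is Mardia's, it equals the sum of the marginal kurtoses plus p(p - 1). The sketch below tabulates the mixed uniform-exponential values against the MVN benchmark p(p + 2):

```python
# Kurtosis of p independent standardized variables: sum of the marginal
# kurtoses plus p(p - 1); for an MVN it is p(p + 2).  (Assumes (3.3) is
# Mardia's multivariate kurtosis measure.)
KURT_UNIFORM, KURT_EXPONENTIAL = 1.8, 9.0

def kurt_mixed(num_uniform):
    """num_uniform independent uniforms plus one exponential (p = u + 1)."""
    p = num_uniform + 1
    return num_uniform * KURT_UNIFORM + KURT_EXPONENTIAL + p * (p - 1)

for u in range(1, 7):
    p = u + 1
    print(p, round(kurt_mixed(u), 1), p * (p + 2))
# p = 2 gives 12.8 vs. 8; equality (48 vs. 48) occurs at p = 6,
# i.e., five uniforms plus one exponential, as in the text.
```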
As the number of uniform variables increases in the multivariate nonnormal
distribution, the corresponding kurtosis value of the MVN distribution approaches
and then exceeds the kurtosis value of the nonnormal distribution. Equivalence
of the two kurtosis values occurs at the combination of five uniform variables and
one exponential variable. For this combination, the tails of the joint nonnormal
distribution are similar in shape to the tails of the corresponding normal. The
result also implies that the T2 statistic based on this particular joint nonnormal
distribution will have the same variance as the T2 statistic based on the MVN
distribution.
This example indicates that there do exist combinations of many (independent)
univariate nonnormal distributions having the same kurtosis value as that achieved
under an MVN assumption. For these cases, the mean and variance of the T2
statistic based on the nonnormal data are the same as for the T2 statistic based on
the corresponding normal data. This result does not guarantee a perfect fit of the
T2 sampling distribution to a beta (or chi-square or F) distribution, as this would
require that all (higher) moments of the sampling distribution of the T2 statistic
be identical to those of the corresponding distribution. However, such agreement of
the lower moments suggests that, in data analysis using a multivariate nonnormal
distribution, it may be beneficial to determine if the sampling distribution of the
T2 statistic fits a beta (or chi-square or F) distribution. If such a fit is obtained,
the data can then be analyzed as if the MVN assumption were true.
sampling distributions are also equal. Under the MVN assumption, the T2 statistic
follows a chi-square distribution with p degrees of freedom. We now illustrate the
appropriateness of the same chi-square distribution for the T2 statistic generated
from the nonnormal distribution. Two hundred observations for five independent
uniform variables and one independent exponential are generated. The T2 chart
for the 200 observations is presented in Figure 3.9, where UCL = 16.81 is based
on α = 0.01.
The corresponding Q-Q plot using chi-square quantiles is presented in Figure
3.10. Our interest lies in the tail of the distribution. For α = 0.01, we would expect
two values greater than the chi-square value of 16.81. Although two T2 values in
Figure 3.9 are near the UCL, the T2 chart indicates no signaling T2 values. These
two large T2 values are located in the extreme right-hand corner of the Q-Q plot in
Figure 3.10. The Q-Q plot follows a linear trend and displays little deviation from
Figure 3.10: Q-Q plot based on simulated data from nonnormal distribution.
it. Thus, the T2 values appear to follow a chi-square distribution despite the fact
that the underlying multivariate distribution is nonnormal.
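The sketch below mirrors this experiment with known parameters (the diagonal covariance makes the T2 a sum of squared standardized components); the UCL of 16.81 is the 0.99 chi-square quantile with 6 degrees of freedom:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n = 200

U = rng.uniform(size=(n, 5))           # five uniforms: mean 0.5, var 1/12
E = rng.exponential(size=(n, 1))       # one exponential: mean 1, var 1
X = np.hstack([U, E])

mu = np.array([0.5] * 5 + [1.0])       # known mean vector
var = np.array([1 / 12] * 5 + [1.0])   # known (diagonal) covariance

d = X - mu
t2 = np.sum(d * d / var, axis=1)       # T2 with known, diagonal Sigma

ucl = chi2.ppf(0.99, df=6)             # the 16.81 UCL used in the text
print(round(ucl, 2), int(np.sum(t2 > ucl)))   # about 2 exceedances expected
```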
In contrast, consider a bivariate nonnormal distribution based on one indepen-
dent uniform and one independent exponential variable. From Table 3.1, it is noted
that the kurtosis of this distribution is larger than that of a bivariate normal. This
implies a heavier tail (i.e., more large T2 values) than would be expected under
normality. We generate 200 observations from this nonnormal distribution and
construct the T2 chart presented in Figure 3.11 using UCL = 9.21 and α = 0.01.
With this α, we would expect to observe two signals in the chart. However, there
are five signaling T2 values, indicating the heavy-tailed distribution expected from
the high kurtosis value.
The corresponding Q-Q plot for these data is presented in Figure 3.12. Note the
severe deviation from the trend line in the upper tail of the plot of the T2 values.
Figure 3.12: Q-Q plot based on simulated data from nonnormal distribution.
The conclusion is that a chi-square distribution does not provide a good fit to
these data.
To illustrate the appropriateness of the use of the beta distribution to describe
the sampling distribution of the T2 statistic, we consider 104 bivariate observations
taken from an actual industrial process. A scatter plot of the observations on the
variables x1 and x2 is presented in Figure 3.13. Observe the elongated elliptical
shape of the data swarm. This is a characteristic of the correlation, r = —0.45,
between the two variables and not of the form of their joint distribution. Observe
also the presence of one obvious outlier that does not follow the pattern established
by the bulk of the data.
The presence of outliers poses a problem in assessing the distribution of the T2
statistic. Thus, we must remove the three outliers and recompute the T2 values
of the remaining data. These remaining values are plotted in the T2 chart given
in Figure 3.14. There remain some large T2 values associated with some of the
observations, but none are signals of out-of-control points. Observations of this
type (potential outliers) are not removed in this example, although they could
possibly affect the fit of the corresponding beta distribution.
The corresponding Q-Q plot for these data is presented in Figure 3.15. Since
p = 2 and n = 103, the beta distribution fit to the data, using (3.1), is B(1, 50).
From inspection, the Q-Q plot exhibits a very strong linear trend that closely follows
the 45° line imposed on the plot. This indicates an excellent fit between the T2
sampling distribution and the appropriate beta distribution.
A question of interest is whether the above beta distribution is describing the
process data because the actual observations follow a bivariate normal or because
the fit provides a good approximation to the sampling distribution. To address
this question, we examine estimates of the marginal distributions of the individual
variables x1 and x2. If the joint distribution is bivariate normal, then each marginal
where k is a chosen constant such that k > 1 and where μ and σ2 are the mean and
variance, respectively, of x. For example, the probability that a random variable
x will take on a value within k = 3.5 standard deviations of its mean is at least
1 − 1/k2 = 1 − 1/(3.5)2 = 0.918. Conversely, the probability that x would take on
a value outside this interval is no greater than 1/k2 = 1 − 0.918 = 0.082.
To use the Chebyshev procedure in a T2 control chart, calculate the mean, T,
and the standard deviation, sT, of the T2 values obtained from the HDS. Using these
as estimates of the parameters μT and σT of the T2 distribution, an approximate
UCL is given as
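The displayed formula itself is not reproduced in this extract; the sketch below implements one plausible form (mean plus k standard deviations, with k = 1/√α from the two-sided inequality), so treat it as an illustration rather than the book's exact equation:

```python
import numpy as np

def chebyshev_ucl(t2_values, alpha=0.05):
    """Distribution-free approximate UCL for a set of T2 values.
    Illustrative form only: two-sided Chebyshev gives
    P(|T2 - mu| >= k*sigma) <= 1/k**2, so k = 1/sqrt(alpha) bounds
    the exceedance probability by alpha."""
    t2_values = np.asarray(t2_values, dtype=float)
    k = 1.0 / np.sqrt(alpha)
    return t2_values.mean() + k * t2_values.std(ddof=1)

# The worked numbers from the text: coverage within k = 3.5 standard
# deviations is at least
print(1 - 1 / 3.5 ** 2)   # 0.9183673469387755
```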
replication, one would also need multiple observations at various loads (megawatt
production) for the different temperatures.
An alternative solution to choosing a large sample size is to seek to reduce
the dimensionality of the multivariate problem. This can be achieved by reducing
the number of parameters that need to be estimated. One useful solution to this
problem involves applying principal component analysis (Jackson (1991)).
3.10 Summary
Fundamental to the use of any statistic as a decision-making tool is the probability
function describing its behavior. For the T2 statistic, this is either the chi-square,
the beta, or the F distribution. Multivariate normal observation vectors are the
basis for these distributions. Since multivariate normality is not easily validated,
and s − r is a minimum. For large n, one may approximate r and s by the two
values
where z(γ/2) is the upper γ/2 quantile of the standard normal distribution.
The CIs obtained from (A3.1) and (A3.2) are generally very similar when n is
large and γ ≈ 0.95. We choose to use the inequality in (A3.1) to obtain r and s.
From (A3.1), one can be at least 100γ% sure that the UCL is somewhere between
T2(r) and T2(s). Since there are infinitely many values between T2(r) and T2(s),
there are infinitely many choices for the UCL. For convenience, we choose the
midpoint of the interval as an approximate value for the UCL. It is given by
This page intentionally left blank
Chapter 4
Construction of Historical Data Set
The statistic used to make the needed comparison is a Hotelling's T2. You
twice read the section explaining how this is done. The theory is complex,
but from an intuitive point of view, you now understand how multivariate
control procedures work. A T2 statistic, the multivariate analogue of a com-
mon t-statistic, can assess all 35 variables at the same time. It is written as
a quadratic form in matrix notation. You never appreciated that course in
matrix algebra until now. It is all very amazing.
Suddenly, you realize you still have a most serious problem. How is all
of this computing to be done? You can't do it with your favorite spreadsheet
without spending days writing macros. How was it done in the text? What
software did they use in all of their data examples? A quick search provides the
answer, QualStat™, a product of InControl Technologies, Inc. You note that
a CD-ROM containing a demonstration version of this program is included
with the book. (This chicken salad sandwich is good. You must remember
to tell the cafeteria staff that their new recipe is excellent.) Following the
instructions on your computer screen, you quickly upload the software. Now,
you are ready to work on a Phase I operation and create an HDS.
4.1 Introduction
An in-control set of process data is a necessity in multivariate control procedures.
Such a data set, often labeled historical, baseline, or reference, provides the basis
for establishing the initial control limits and estimating any unknown parameters.
However, the construction of a multivariate HDS is complicated and involves prob-
lem areas that do not occur in a univariate situation. It is the purpose of this
chapter to explore in detail some of these problem areas and offer possible solutions.
The development of the HDS is referred to as a Phase I operation. Using it as
a baseline to determine if new observations conform to its structure is termed a
Phase II operation. Since there is only one variable to consider, univariate Phase I
procedures are easy to apply. Upon deciding which variable to chart, one collects
a sample of independent observations (preliminary data) on this variable from the
in-control process. The resulting data provide initial estimates of the parameters
that characterize the distribution of the variable of interest.
The parameter estimates are used to construct a preliminary control procedure
whose major purpose is to purge the original data set of any observations that do not
conform to the structure of the HDS. These nonconforming or atypical observations
are labeled outliers. After the outliers are removed from the preliminary data set,
new estimates of the parameters are obtained and the purging process is repeated.
This is done as many times as necessary to obtain a homogeneous data set as
defined by the control procedure. After all outliers are removed, the remaining
data is referred to as the HDS.
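A univariate Phase I purge of this kind can be sketched in a few lines; the 3-sigma rule below is an illustrative choice of ours, not a prescription from the text:

```python
import numpy as np

def purge(data, k=3.0):
    """Iteratively remove points outside mean +/- k*sd until the
    remaining data are homogeneous (a univariate Phase I sketch;
    the k = 3 rule here is illustrative only)."""
    data = np.asarray(data, dtype=float)
    while True:
        m, s = data.mean(), data.std(ddof=1)
        keep = np.abs(data - m) <= k * s
        if keep.all():
            return data          # no outliers left: this is the HDS
        data = data[keep]        # drop outliers, re-estimate, repeat

rng = np.random.default_rng(4)
sample = np.concatenate([rng.normal(10.0, 1.0, 100), [25.0, -5.0]])
hds = purge(sample)
print(len(sample), len(hds))     # the two planted outliers are purged
```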
The role of a multivariate HDS is the same as in the univariate situation. It pro-
vides a baseline for the control procedure by characterizing the in-control process.
However, construction of a historical data set becomes more complicated when us-
ing multivariate systems. For example, we must decide which variables to include
and their proper functional forms. This determination may require in-depth process
(Figure: overview of Phase I construction of an HDS — planning (establish goals, study and map process, define good process operations); variable form (theoretical and empirical relationships, transformations); data collection procedures (human and electronic errors); missing data (estimation, deletion); collinearity and autocorrelation effects (detection and removal); outlier detection and the purging process; alternative covariance estimators.)
• The production of caustic soda and chlorine gas is a major industry in the
United States. One method of production is through the electrolysis of brine (i.e.,
salt water). This work is done in an electrolyzer that is composed of one or more
cells. A cell is the basic unit where the conversion takes place. The major purpose
of a control procedure is to locate cells whose conversion efficiency has dropped so
that they can be treated to restore their efficiency.
• The brine (feed stock) for an electrolyzer must be treated to remove impurities.
This operation takes place in a brine treatment facility. The primary purpose
of a control procedure on this unit is to maintain the quality of the feed stock
for the electrolyzer. "Bad" brine has the potential of destroying the cells and
contaminating the caustic soda being produced.
• From the electrolyzer, caustic soda is produced in a water solution. The water
is removed through evaporation in an evaporating unit. A control procedure on this
unit maintains maximum production for a given set of run conditions, maintains
the desired caustic strength, and helps locate sources of problems.
• One method of transporting the finished caustic product is by railroad tank
cars. Overweight tank cars present a number of major problems. Control proce-
dures on the loading of the tank cars can ensure that no car will be loaded above
its weight limit.
• Control procedures on steam and gas turbines, used in the production of
electricity for the electrolysis of the brine, detect deviations in the efficiency of the
turbines. Also, they are used to locate sources of problems that occur in operations.
Boilers, used in steam production for the evaporation of water, are controlled in a
similar fashion.
• Control procedures on large pumps and compressors are used for maintenance
control to detect any deviation from a set of "ideal" run conditions. It is less
expensive to replace worn parts than to replace a blown compressor.
• Control procedures on various reactors are used to maintain maximum effi-
ciency for a given set of run conditions and to locate the source of the problem
when upsets occur.
To understand these concepts, consider the data given in Table 4.1. It consists
of a sample of 30 observations taken on eight variables, (X1, X2, ..., X8), measured
on a chemical process.
It is assumed at this stage of data investigation that a decision has been made as
to the purposes and types of control procedures required for this process. Suppose
it is desired to construct a control procedure using only the observations on the
first seven variables presented in Table 4.1. Further, suppose it is most important
to maintain variable X4 above a critical value of 5. Any drifting or changes in
relationships of the other process variables from values that help maintain X4 above
its critical value need to be detected so that corrective action can be taken.
Initially, the data must be filtered to obtain a preliminary data set from which
the HDS can be constructed. There are 17 observations with X4 above its critical
value of 5. The obvious action is to sort the data on X4 and remove all observations
in which X4 has a value below its critical value.
of data with the desired characteristics.
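With the data in a matrix, this filtering step is a one-liner. The sketch below uses randomly generated numbers as a hypothetical stand-in for Table 4.1:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical stand-in for Table 4.1: 30 runs on 8 process variables
data = rng.normal(5.0, 1.0, size=(30, 8))

x4 = data[:, 3]                  # variable X4 (0-based column 3)
in_spec = data[x4 > 5.0]         # keep runs with X4 above its critical value
out_spec = data[x4 <= 5.0]

# Compare group means variable by variable, as in Table 4.2
print(in_spec.mean(axis=0).round(2))
print(out_spec.mean(axis=0).round(2))
```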
The filtering of data can provide valuable process information. For example,
suppose the out-of-specification runs (i.e., X4 < 5) are labeled Group 1 and the in-
specification runs (i.e., X4 > 5) are labeled Group 2. The means of the variables of
the two groups are presented in Table 4.2. Valuable process information is obtained
by closely examining the data. A mean difference on variable 4 is to be expected
since it was used to form the groups. However, large percentage differences in the
means are observed on variables 1 and 5, and a moderate difference is observed on
variable 3. Further investigation is needed in determining how these variables are
influencing variable 4.
A preliminary data set should be thoroughly examined using both statistical
procedures and graphical tools. For example, consider the graph presented in Figure
4.1 of fuel consumption, labeled Fuel, and megawatt-hours production, labeled
Megawatts (or MW), of a steam turbine over time (in days of operation). Close
examination of the graph produces interesting results. Note the valleys and peaks
in the MW trace. These indicate load changes on the unit, whereas the plateaus
reflect production at a constant load. When the load is reduced, the MW usage
curve follows the fuel graph downward. Similarly, the MW graph follows the fuel
graph upwards when the load is increased. This trend indicates there is a lag,
during a load change, in the response time of the turbine to the amount of fuel
being supplied. This is very similar to the operation of a car, since accelerating or
decelerating it does not produce an instantaneous response.
Lags in only part of the data, as seen in the example in Figure 4.1, often can
be easily recognized by graphical inspection. Other methods, however, must be
used to detect a lag time extending across an entire processing unit. Observations
across a processing unit are made at a single point in time. Before one can use the
observations in this form, there must be some guarantee that the output observa-
tions match the input observations. Otherwise, the lag time must be determined
and the appropriate parts of the observation vector shifted to match the lag. Some
processes, such as strippers used to remove unwanted chemical compounds, work
instantaneously from input to output. Other processes, such as silica production,
have a long retention time from input to output. Consultation with the operators
and process engineers can be most helpful in determining the correct lag time of
the process.
A helpful method for determining if lag relationships exist between two vari-
ables is to compute and compare their pairwise correlation with the correlation be-
tween one variable and the lag of the other variable. For example, consider hourly
observations taken (at the same sampling period) on two process variables. A
sample of size 27 is presented in Table 4.3, where the variable x is a feedstock
characteristic and the variable y is an output quality characteristic. We begin by
calculating the correlation between x and y. The value is given as 0.148. Next
we lag the y values one time period and reconstruct the data set, as presented in
Table 4.3. The resulting variable, labeled ylagl, has a correlation of 0.447 when
compared to the x variable. We note an increase in the correlation.
The observations on the quality variable y could be successively shifted down-
ward until the maximum correlation with x is obtained. The correlations for three
consecutive lags are presented in Table 4.4. Maximum correlation is obtained by
lagging the quality characteristic two time periods. Note the decrease in the cor-
relation for three lags of the quality characteristic. Thus, we estimate the time
through the system as being two hours.
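The lag-correlation comparison described above can be sketched in Python. The series here are simulated (the Table 4.3 data are not reproduced in this excerpt), with the output constructed to lag the input by two sampling periods:

```python
import numpy as np

def lag_correlations(x, y, max_lag=3):
    """Correlate x[t] with y[t + lag] for lag = 0, 1, ..., max_lag.

    Lagging y by k periods pairs each input value with the output
    observed k sampling periods later."""
    out = {}
    for k in range(max_lag + 1):
        if k == 0:
            xs, ys = x, y
        else:
            xs, ys = x[:-k], y[k:]
        out[k] = np.corrcoef(xs, ys)[0, 1]
    return out

# Illustrative series: y echoes x two periods later, plus noise.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = np.concatenate([rng.normal(size=2), x[:-2]]) + 0.3 * rng.normal(size=200)

corrs = lag_correlations(x, y)
best = max(corrs, key=lambda k: corrs[k])   # lag with maximum correlation
```

As in the text's example, the correlation is small at lag zero, peaks at the true lag, and falls off again for longer lags; the lag at the maximum estimates the time through the system.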
One needs to examine closely how each observation is determined and the recycle
time of each piece of equipment.
Another problem area in data collection involves incorrect observations on com-
ponents. These may result from faulty equipment such as transmitters and temper-
ature probes. Problems of this type can be identified using various forms of data
plots or data verification techniques (usually included in the purging process).
and the estimated value of Cl2 is computed as 98.58. This is in close agreement
with the actual value of 98.41 given in Table 4.6.
The T2 value, with this estimate of the missing component, is 5.94 as compared
to the actual value of 5.92. Thus, substituting the missing value has little influence
on the T2 statistic. Similarly, there is negligible change in the mean of x5. The
mean of the observations without the missing value is 98.21 versus a mean of 98.23
when the estimated value is included. A comparison of the correlations between
x5 and the other five variables, with and without the predicted value, is given in
Table 4.7. There appears to be little difference between these correlations.
The fill-in-the-value approach presented above is a simple and quick method for
estimating missing values in an HDS. However, among its limitations are the fact
that the estimated value is only as good as the prediction equation that produced
it, and the fact that estimation may affect the variance estimates. Many other
solutions exist (e.g., see Little and Rubin (1987)), and these can be used when
better estimation techniques are preferred.
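The fill-in-the-value idea can be sketched as a regression imputation: fit a prediction equation for the incomplete variable on the complete rows, then substitute the fitted value. The data and the helper name below are illustrative assumptions, not from the text:

```python
import numpy as np

def regression_impute(X, col):
    """Fill missing entries (NaN) in column `col` of data matrix X by
    ordinary least squares on the remaining columns, fit on complete rows.
    A minimal sketch of the fill-in-the-value idea; see Little and
    Rubin (1987) for methods that also respect the variance structure."""
    X = X.astype(float).copy()
    miss = np.isnan(X[:, col])
    others = [j for j in range(X.shape[1]) if j != col]
    A = np.column_stack([np.ones((~miss).sum()), X[~miss][:, others]])
    coef, *_ = np.linalg.lstsq(A, X[~miss, col], rcond=None)
    B = np.column_stack([np.ones(miss.sum()), X[miss][:, others]])
    X[miss, col] = B @ coef
    return X

# Example: column 2 depends linearly on columns 0 and 1.
rng = np.random.default_rng(0)
Z = rng.normal(size=(30, 2))
X = np.column_stack([Z, 2.0 * Z[:, 0] - Z[:, 1] + 98.0])
X[5, 2] = np.nan
filled = regression_impute(X, 2)
```

As the text cautions, the substituted value is only as good as the prediction equation, and repeated use of fitted values tends to understate the variance of the imputed variable.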
A brief summary of PCA and its relationship to the eigenvalue problem is contained
in subsection 4.11.2 of this chapter's appendix. The reader unfamiliar with these
concepts is encouraged to review this appendix before continuing with this section.
A more detailed discussion of PCA is provided by Jackson (1991).
The effects of a near-singular covariance matrix on the performance of a T2
statistic will be demonstrated in the following example. We begin by expressing
the inverse of the sample covariance matrix in spectral form as

S^(-1) = Σ_{j=1}^{p} (1/λj) Uj Uj',

where λ1 > λ2 > ... > λp are the eigenvalues of S and Uj, j = 1, 2, ..., p, are
the corresponding eigenvectors. If λp is close to zero, the ratio (1/λp) becomes
very large and can have a disproportionate effect on the calculation of the inverse
matrix. This distorts any statistic, such as the T2, that uses the inverse matrix in
its calculation.
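A small numeric sketch of this effect: for a hypothetical 2 × 2 near-singular covariance matrix, the term (1/λ2)U2U2' dominates the inverse, and a deviation lying along the last eigenvector produces an inflated T2:

```python
import numpy as np

# Inverse of a covariance matrix from its eigendecomposition:
# S^(-1) = sum_j (1/lambda_j) U_j U_j'.  A tiny smallest eigenvalue
# makes its term dominate and distorts any statistic built on S^(-1).
S = np.array([[1.0, 0.999],
              [0.999, 1.0]])          # near-singular: lambda_min = 0.001
lam, U = np.linalg.eigh(S)            # eigenvalues in ascending order
S_inv = sum((1.0 / lam[j]) * np.outer(U[:, j], U[:, j]) for j in range(2))

x = np.array([0.1, -0.1])             # small deviation *against* the correlation
t2 = x @ S_inv @ x                    # inflated by 1/lambda_min
```

Here the deviation (0.1, -0.1) lies along the eigenvector with eigenvalue 0.001, so t2 = 0.02/0.001 = 20, whereas the same-sized deviation (0.1, 0.1) along the large eigenvalue would give only about 0.01.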
To demonstrate how to examine the eigenstructure of a matrix, we consider
the correlation matrix of a chlorine (Cl2)/caustic (NaOH) production unit. A
unit schematic is presented in Figure 4.4. This particular process is based on
the electrolysis of brine (salt). A current is passed through a concentration of brine
solution where the anode and cathode are separated by a porous diaphragm. The
chlorine is displaced as a gas and the remaining water/brine solution contains the
caustic. The unit performing this work is referred to as a cell, and several of these
are housed together (as a unit) to form an electrolyzer. Overall performance of
the cell is measured by the percentage of the available power being used in the
conversion process. This percentage is a computed variable and is referred to as
conversion efficiency (CE). High values of this variable are very desirable.
Many variables other than CE are used as indicators of cell performance. Mea-
sured variables are the days of life of the cell (DOL), cell gases including chlorine
4.7. Detecting Collinearities 67
and oxygen (Cl2 and O2), caustic (NaOH), salt (NaCl), and impurities production
(I1 and I2). The levels of impurities are important since their production indicates
a waste of electrical power, and they contaminate the caustic.
Table 4.8 is the correlation matrix for an HDS (n = 416) based on seven of these
variables. Its eigenstructure will be examined in order to determine if a severe
collinearity exists among the computed variables. Inspection of the correlation
matrix reveals some very large pairwise correlations. For example, the correlation
between the two measured gases, Cl2 and O2, has a value of -0.956. Also, the
computed CE variable, which contains both Cl2 and O2, has a correlation of 0.956
with Cl2 and -0.999 with O2.
Using a PCA, the seven eigenvalues and eigenvectors for the correlation matrix
are presented in Table 4.9. Also included is the proportion of the correlation varia-
tion explained by the corresponding eigenvectors as well as the cumulative percent-
age of variation explained. A recommended guideline for identifying a near-singular
matrix is based on the size of the square root of the ratio of the maximum eigen-
value to each of the other eigenvalues. These ratios are labeled as condition indices.
A condition index greater than 30 implies that a severe collinearity is present. The
value of the largest index, labeled the condition number, for the data in Table 4.9
far exceeds this guideline,
which clearly indicates the presence of a severe collinearity among the variables.
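The condition-index calculation can be sketched as follows; the 3 × 3 correlation matrix is hypothetical (the Table 4.8 values are not reproduced here), built with one near-exact linear dependency:

```python
import numpy as np

def condition_indices(R):
    """Square roots of (largest eigenvalue / each eigenvalue) of a
    correlation matrix.  An index above roughly 30 is the guideline
    in the text for a severe collinearity."""
    lam = np.linalg.eigvalsh(R)[::-1]        # eigenvalues, descending
    return np.sqrt(lam[0] / lam)

# Hypothetical correlation matrix: variables 1 and 3 nearly collinear.
R = np.array([[1.0, 0.2, -0.999],
              [0.2, 1.0, -0.2],
              [-0.999, -0.2, 1.0]])
idx = condition_indices(R)
severe = idx[-1] > 30                        # condition number check
```

The first index is always 1; the last is the condition number, and for this matrix it is far above 30, flagging the same kind of severe collinearity seen in the chlorine/caustic data.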
A severe collinearity in the correlation matrix translates into the presence of a
severe collinearity in the associated covariance matrix. Since it is not possible or
advisable to use a T2 control statistic when the covariance matrix is singular or near
singular, several alternatives are suggested. The first, and simplest to implement, is
to remove one of the variables involved in the collinearity. This is especially useful
when one of the collinear variables is computed from several others, since deletion
of one of these variables will not remove any process information.
To determine which variables are involved in a severe collinearity, one need
only examine the linear combination of variables provided by the eigenvector cor-
responding to the smallest eigenvalue. From Table 4.9, this linear combination
corresponding to the smallest eigenvalue of 0.0003 is given by
Ignoring the variables with small coefficients (i.e., small loadings) gives the linear
relationship between the two variables that is producing the collinearity problem.
This relationship is given as
This relationship confirms the large negative correlation, -0.999, found between
CE and O2 in Table 4.8.
The information contained in the computed variable CE is redundant with that
contained in the measured variable O2. This relationship is producing a near
singularity in the correlation matrix. Since CE is a computed variable that can be
removed with no loss of additional information, one means of correcting this data
deficiency is to compute the T2 statistic using only the remaining six variables.
The revised correlation matrix for these six variables is obtained from the correla-
tion matrix presented in Table 4.8 by deleting the row and column corresponding
to CE.
Another method for removing a collinearity from a covariance matrix is to re-
construct the matrix by excluding the eigenvectors corresponding to the near-zero
eigenvalues. The contribution of the smallest eigenvalues would be removed and
S^(-1) would be computed using only the larger ones; i.e.,

S^(-1) ≈ Σ_{j=1}^{k} (1/λj) Uj Uj',

where only the k largest eigenvalues and their eigenvectors are retained.
This approach should be used with caution since, in reducing the number of prin-
cipal components, one may lose the ability to identify shifts in some directions in
terms of the full set of the original variables.
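A sketch of this reconstruction, assuming the k largest eigenvalues are retained:

```python
import numpy as np

def truncated_inverse(S, k):
    """Approximate S^(-1) from the k largest eigenvalues only,
    dropping near-zero eigenvalues: sum_{j<=k} (1/lambda_j) U_j U_j'.
    Caution (as in the text): shifts along the discarded eigenvectors
    become invisible to a T2 built from this inverse."""
    lam, U = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]            # sort eigenvalues descending
    lam, U = lam[order], U[:, order]
    return sum((1.0 / lam[j]) * np.outer(U[:, j], U[:, j]) for j in range(k))

# Near-singular 2x2 example: keep only the dominant eigenvalue.
S = np.array([[1.0, 0.999],
              [0.999, 1.0]])
S_trunc = truncated_inverse(S, k=1)          # drop the near-zero eigenvalue
```

With k = 1, a deviation along the discarded eigenvector contributes nothing to the quadratic form, which is exactly the loss of detection ability the text warns about.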
The second form of autocorrelated data is labeled stage decay (e.g., see Mason,
Tracy, and Young (1996)). This occurs when the time change in the variable is in-
consistent over shorter time periods, but occurs in a stepwise fashion over extended
periods of time. This can occur in certain types of processes where change with
time occurs very slowly. The time relationship comes from the performance in one
stage being dependent on the process performance in the previous stage(s). The
graph of a process variable having two stages of decay is presented in Figure 4.6.
If autocorrelation is undetected or ignored, it can create serious problems with
control procedures that do not adjust for it. The major problem is similar to the one
that occurs when using univariate control procedures on variables of a multivariate
process. Univariate procedures ignore relationships between variables. Thus, the
effect of one variable is confounded with the effects of other correlated variables. A
similar situation occurs with autocorrelated data when the time dependencies are
not removed. Adjustment is necessary in order to obtain an undistorted observation
on process performance at a given point in time.
Control procedures for autocorrelated data in a univariate setting often make
these adjustments by modeling the time dependency and plotting the resultant
residuals. Under proper assumptions, these residual errors, or adjusted values (i.e.,
effect of the time dependency removed), can be shown to be independent and
normally distributed. Hence, they can be used as the charting statistic for the
time-adjusted process. It is also useful to look at forecasts of charting statistics
since processes with in-control residuals can drift far from the target values (e.g.,
see Montgomery (1997)).
We offer a somewhat similar solution for autocorrelated data from multivari-
ate processes. However, the problem becomes more complicated. We must be
concerned not only with autocorrelated data on some of the variables, but also
with how the time variables relate to the other process variables. Autocorrelation
4.8. Detecting Autocorrelation 71
does not eliminate these relationships, but instead confounds them and thus must
be removed for clear signal interpretation. How this is done is a major focus of
Chapter 10.
One simple method of detecting autocorrelation in univariate processes is ac-
complished by plotting the variable against time. Depending on the nature of the
autocorrelation, the points in a graph of the process variable versus time will either
move up or down or oscillate back and forth. Subsequent data analysis is used
to verify the presence of autocorrelation, determine lag times, and fit appropriate
autoregressive models.
Observations from a multivariate process are p-dimensional and the components
are usually correlated. The simple method of plotting graphs of individual compo-
nents against time can be inefficient when there are a large number of variables.
Also, these time-sequence plots may be influenced by other correlated variables,
resulting in incorrect interpretations. For example, considering the cyclic nature
over time of the variable depicted in Figure 4.7, one might suspect that some form
of autocorrelation is present. However, this effect is due to the temperature of the
coolant, which has a seasonal trend. Nevertheless, even with this drawback, we
have found that graphing each variable over time is useful.
To augment the above graphical method and to reduce the number of individual
graphs for study, one could introduce a time-sequence variable in the data set and
examine how the individual variables relate to it. If a process variable correlates
with the time-sequence variable, it is highly probable that the process variable
correlates with itself in time. Using this method, one can locate potential variables
that are autocorrelated. Detailed analysis, including the graphing of the variable
over time, will either confirm or deny the assertion for individual variables.
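The time-sequence-variable screen can be sketched as follows; the 0.5 flagging threshold and the simulated data are illustrative assumptions, not from the text:

```python
import numpy as np

def time_correlation_screen(X, threshold=0.5):
    """Correlate each column of X (rows in time order) with the time
    index t = 0, 1, 2, ...; columns exceeding the threshold are flagged
    as candidates for autocorrelation, to be confirmed by detailed
    plots and model fitting.  The 0.5 cutoff is illustrative."""
    n, p = X.shape
    t = np.arange(n, dtype=float)
    flags = []
    for j in range(p):
        r = np.corrcoef(t, X[:, j])[0, 1]
        if abs(r) > threshold:
            flags.append((j, r))
    return flags

# Illustrative data: column 0 drifts with time, column 1 is white noise.
rng = np.random.default_rng(7)
n = 150
X = np.column_stack([0.05 * np.arange(n) + rng.normal(size=n),
                     rng.normal(size=n)])
flagged = time_correlation_screen(X)
```

Only the drifting column is flagged, which mirrors the text's point: the screen narrows attention to probable cases, and the individual time plots then confirm or deny each one.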
Source        df      SS        MS        F       p-value
Regression     1    506.304   506.304   109.711    0.000
Residual      21     96.912     4.614
Total         22    603.217
where b0 and b1 are the estimated coefficients of the model relating the heat trans-
fer variable yt to its lagged value yt-1. The small p-value for the F statistic in the
table implies that there is strong evidence that the immediate past heat transfer
coefficient is an important predictor of the current heat transfer coefficient.
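The lag-one regression behind the ANOVA table can be sketched with ordinary least squares; the simulated autocorrelated series below stands in for the heat transfer data, which are not reproduced in this excerpt:

```python
import numpy as np

def lag_one_regression(y):
    """OLS fit of y_t = b0 + b1 * y_{t-1} + e_t, returning the ANOVA
    F statistic for the regression (regression df = 1, residual
    df = n - 2, where n is the number of (y_t, y_{t-1}) pairs)."""
    yt, ylag = y[1:], y[:-1]
    n = len(yt)
    A = np.column_stack([np.ones(n), ylag])
    (b0, b1), *_ = np.linalg.lstsq(A, yt, rcond=None)
    fit = b0 + b1 * ylag
    ss_reg = np.sum((fit - yt.mean()) ** 2)   # regression sum of squares
    ss_res = np.sum((yt - fit) ** 2)          # residual sum of squares
    F = ss_reg / (ss_res / (n - 2))
    return b0, b1, F

# Strongly autocorrelated series: a large F is expected.
rng = np.random.default_rng(3)
y = np.zeros(100)
for t in range(1, 100):
    y[t] = 0.8 * y[t - 1] + rng.normal()
b0, b1, F = lag_one_regression(y)
```

A large F (small p-value), as in the table above, says the immediate past value is an important predictor of the current value, i.e., the series is autocorrelated at lag one.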
As another example, consider the techniques necessary for detecting autocorre-
lation in data collected from a reactor used to convert ethylene (C2H4) to ethylene
dichloride (EDC). EDC is the basic building block for much of the vinyl product
industry. Feedstock for the reactor includes hydrochloric acid gas (HCl), ethylene,
and oxygen (O2). Conversion of the feedstock to EDC occurs under high
temperature in the reactor. The conversion process is labeled oxyhydrochlorination
(OHC).
There are many different types of OHC reactors available to perform the conver-
sion of ethylene and HCl to EDC. One type, a fixed-life or fixed-bed reactor, must
have critical components replaced at the end of each run cycle. The components
are slowly depleted during operation and performance of the reactor follows the
depletion of the critical components. The best performance of the reactor is at the
beginning of the cycle, as the reactor gradually becomes less efficient during the
remainder of the cycle. While other variables have influence on the performance of
the reactor, this inherent decay of the reactor produces a time dependency in many
of the process and quality variables.
We have chosen seven variables to demonstrate how to detect and adjust for
autocorrelated data in this type of process. These are presented in Table 4.12. The
first variable, RP1, is a measure of feed rate. The next four, Temperature, L1, L2,
and L3, are process variables, and the last two are output variables. Variable P1
is an indication of the amount of production for the reactor and variable C1 is an
4.9. Example of Autocorrelation Detection Techniques 75
undesirable by-product of the production system. All variables, with the exception
of feed rate, show some type of time dependency.
Temperature measurements are available from many different locations on a
reactor. All of them are important elements in the performance and control of
the reactor and increase over the life cycle. To demonstrate the time decay of the
measured temperatures, we present in Figure 4.10 a graph of the average tempera-
ture over a good production run. The graph indicates the average temperature of
the reactor is initially stable, but then it gradually increases over the life cycle of
the unit.
The time-sequence graphs in Figures 4.11 and 4.12 of the two process variables,
L3 and L1, present two contrasting patterns. In Figure 4.11, L3 increases linearly
with time and has the appearance of a first-order lag relationship. This is confirmed
by the fact that r1 = 0.7533. However, the graph of L1 in Figure 4.12 is depicted as
quadratic or exponential across time, but can still be approximated by a first-order
lag relationship. In this case, r2 = 0.9331.
The graph of C1 versus time is presented in Figure 4.13. The time trend in
this graph differs somewhat from the previous graphs. There appear to be separate
stages in the plot: one at the beginning, another in the middle, and the third stage
at the end.
Of the remaining three variables, none show strong time dependencies. As an
example, consider the time-sequence plot for RP1 given in Figure 4.14. Across
time, the plot of the data is nearly horizontal and shows no trends or patterns.
4.10 Summary
Control procedures are designed to detect and help in determining the cause of
unusual process events. The point of reference for "unusual events" is the historical
data set. This is the baseline of any control procedure and must be constructed with
great care. The first step in its construction is to acquire an understanding of the
process. This knowledge can be obtained from the operators and process engineers.
A study of the overall system will reveal problem areas where the applications of a
control procedure would be most helpful. This is necessary to determine the type
and purpose of the control procedure.
With the selection of an appropriate area for application of a control procedure,
we can obtain a preliminary data set. However, the data must be carefully filtered
of its impurities so that the resulting data set is clean. Graphical tools can be
a great aid in this process, as they can be used to identify obvious outliers and,
in some cases, determine useful functional relationships among the variables. In
addition, data collection and data verification procedures must be examined and
any missing data replaced or estimated, or else one must remove the associated
observation vector or process variable.
After filtering the preliminary data, we strongly recommend checking on the
singularity of the covariance matrix. The problems of a singular covariance matrix,
or of collinearity among the variables, can be quite critical. Collinearity often
occurs when there are many variables to consider or when some of the variables
are computed from measured ones. These situations can be detected using the
eigenvalues of the covariance or correlation matrices. Principal component analysis
can be a useful tool in this determination, as can consultation with the process
engineers. Since a severe collinearity can inflate the T2 statistic, appropriate action
must be taken to remove this problem.
Steady-state control procedures do not work well on autocorrelated processes.
Thus, one must investigate for the presence of autocorrelation in the preliminary
data set. We offer two procedures for detecting the presence of autocorrelated data
in a multivariate system. The first is based on plotting a variable over time and
looking for trends or patterns in the plot. The second is based on plotting the sample
autocorrelations between observations separated by a specified lag time versus time
and examining the observed trends. Large autocorrelations will pinpoint probable
cases for further study. In Chapter 10, we discuss procedures for removing the
effects of these time dependencies on the T2 statistic.
4.11 Appendix
4.11.1 Eigenvalues and Eigenvectors
Consider a square (p × p) matrix A. We seek to find scalar (constant) values λi,
i = 1, 2, ..., p, and the corresponding (p × 1) vectors Ui, i = 1, 2, ..., p, such that
the matrix equation

A Ui = λi Ui                                                (A4.1)

is satisfied. Nontrivial solutions for the Ui exist only when λ satisfies the charac-
teristic equation |A − λI| = 0,
where |A − λI| represents the determinant of the matrix (A − λI). The corresponding
eigenvectors Ui are then obtained by solving the homogeneous system of equations
given in (A4.1).
The eigenvalues (λ1, λ2, ..., λp) are unique to the matrix A; however, the cor-
responding eigenvectors (U1, U2, ..., Up) are not unique. In statistical analysis the
eigenvectors are often scaled to unity, or normalized, so that

Ui'Ui = 1,  i = 1, 2, ..., p.

Note that the eigenvalues of A^(-1) are the reciprocals of the eigenvalues of A. The
corresponding eigenvectors of A^(-1) are the same as those of A.
Covariance matrices, such as S, that are associated with a T2 statistic are
symmetric, positive definite matrices. For symmetric matrices, the correspond-
ing eigenvalues must be real numbers. For positive definite symmetric matrices,
the eigenvalues must be greater than zero. Also, with symmetric matrices the
eigenvectors associated with distinct eigenvalues are orthogonal, so that Ui'Uj = 0
for i ≠ j.
Near-singular conditions (i.e., collinearities) exist when one or more eigenvalues are
close to zero. Closeness is judged by the size of an eigenvalue relative to the largest
eigenvalue. The square root of the ratio of the largest eigenvalue (λ1) to any other
eigenvalue (λi) of a matrix A is known as a condition index and is given by

(λ1/λi)^(1/2).
This equation produces the collinear relationship that exists between xj and the
other variables of the system.
The theoretical development of PCA is covered in the many texts on multivariate
analysis, e.g., Morrison (1990), Seber (1984), and Johnson and Wichern (1998). An
especially helpful reference on the applications of PCA is given in Jackson (1991).
Chapter 5

Charting the T2 Statistic in Phase I
5.1 Introduction
In this chapter we discuss methods, based on the T2 statistic, for identifying atypical
observations in an HDS. We also include some examples of detection schemes based
on distribution-free methods. When attempting to detect such observations, it is
assumed that good preliminary data are available and that all other potential data
problems have been investigated and resolved.
The statistical purging of unusual observations in a Phase I operation is essen-
tially the same as an outlier detection problem. An outlier is an atypical observation
located at an extreme distance from the main part of the sample data. Several use-
ful statistical tests have been presented for identifying these observations, and these
techniques have been described in numerous articles and books (e.g., see Barnett
and Lewis (1994), Hawkins (1980), and Gnanadesikan (1977)).
Although the T2 statistic is not necessarily the optimal method for identifying
outliers, particularly when used repeatedly as in a control chart, it is a simple pro-
cedure to apply and can be very helpful in locating individual outlying observations.
Further, as shown in Chapter 7, the T2 statistic has the additional advantage of
being capable of determining the process variables causing an observation to signal.
For these reasons, we will concentrate only on the T2 statistic.
The observations plotted as black circles on the graph will bias the estimates of
the variance of the two variables and/or the estimates of the correlation between
these two variables. For example, the inclusion of the Group A data will increase
the variation in both variables but will have little effect on their pairwise correla-
tion. In contrast, including the Group C data will distort the correlation between
the two variables, though it will increase the variation of mainly the x1 variable.
Why do atypical observations, similar to those presented above, occur in an
initial sample from an in-control multivariate process? There are many reasons,
such as a faulty transmitter sending wrong signals, human error in transcribing a log
entry, or units operating under abnormal conditions. Most atypical information can
be identified using graphs and scatter plots of the variables or by consulting with
the process engineer. For example, several of the observations in Groups B and C
of Figure 5.1, such as points Bl and Cl, are obvious outliers; however, others may
not be as evident. It is for this reason that a good purging procedure is needed.
Detecting atypical observations is not as straightforward in multivariate systems
as in univariate ones. A nonconforming observation vector in the multivariate sense
is one that does not conform to the group. The purging procedure must be able to
identify both the components of the observation vectors that are out of tolerance
as well as those that have atypical relationships with other components.
purging it of outliers. Any observation in the data set that is beyond the control
limits of the chart is removed from further consideration.
Suppose that the Shewhart upper and lower control limits, denoted UCL and
LCL, for this data set are those depicted in Figure 5.2. We assume that any outlier is
an observation that does not come from this distribution, but from another normal
distribution, N(μ + d, σ²), having the same standard deviation, but with the mean
shifted d units to the right. Both distributions are depicted in Figure 5.3.
Detecting an outlier in this setting is equivalent to the testing of a statistical
null hypothesis. To decide if the given observation is taken from the shifted normal
distribution, and thus declared an outlier, we test the null hypothesis
that all observations arise from the normal distribution N(μ, σ²) against the alter-
native hypothesis
that all observations arise from the shifted normal distribution N(μ + d, σ²). If the
null hypothesis is rejected, we declare the observation to be an outlier and remove
it from the preliminary data set.
In the above hypothesis test, the distribution under the null hypothesis is re-
ferred to as the null distribution and the distribution under the alternative hypoth-
esis is labeled the nonnull distribution. The power of the hypothesis test is denoted
in Figure 5.3 by the area of the shaded region under the nonnull distribution,
which is the distribution shifted to the right. This is the probability of detecting
an observation as an outlier when it indeed comes from the shifted distribution.
Comparisons are made among different outlier detection schemes by comparing the
power function of the procedures across all values of the mean shift, denoted by d.
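Under the normal model, this power has a closed form. The sketch below considers only the upper Shewhart limit at μ + 3σ (a one-sided simplification for illustration; for positive shifts d the lower-tail term of the two-sided chart is negligible):

```python
import math

def shewhart_power(d, L=3.0):
    """Probability that an observation from N(mu + d*sigma, sigma^2)
    falls above the upper Shewhart limit mu + L*sigma -- the shaded
    area under the shifted (nonnull) distribution, computed as
    P(Z > L - d) for standard normal Z."""
    return 0.5 * math.erfc((L - d) / math.sqrt(2.0))

power_at_0 = shewhart_power(0.0)   # d = 0: the false-alarm rate, ~0.00135
power_at_3 = shewhart_power(3.0)   # shift equal to the limit: power 0.5
```

Evaluating this function over a grid of d values traces out the power function used to compare detection schemes.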
Many analysts use univariate control chart limits of individual variables to re-
move outlying observations. In this procedure, all observation vectors that contain
84 Chapter 5. Charting the T2 Statistic in Phase I
an observation on a variable outside the 3σ range are excluded. This is equivalent
to using the univariate Shewhart limits for individual variables to detect outlying
observations. A comparison of this procedure with the T2 procedure is illustrated
in Figure 5.4 for the case of two variables.
5.4. Multivariate Outlier Detection 85
The shaded box in the graph is defined by the univariate Shewhart chart for each
variable. For moderate-to-strong correlations between the two variables, the T2
control ellipse usually extends beyond the box. This indicates that the operational
range of the variables of a multivariate correlated system can be larger than the
control chart limits of independent variables.
Use of the univariate control chart limits of a set of variables ignores the con-
tribution of their correlations and in most cases restricts the operational range of
the individual variables. This restriction produces a conservative control region for
the control procedure, which in turn generates an increased number of false signals.
This is one of the main reasons for not using univariate control procedures to detect
outliers in a multivariate system.
and where B[α; p/2, (n-p-1)/2] is the upper αth quantile of the beta distribution
B[p/2, (n-p-1)/2]. If an observation vector has a value greater than the UCL, it
is to be purged from the preliminary data. With the remaining observations, we
calculate new estimates of the mean vector and covariance matrix. A second pass
through the data is now made. Again, we remove all detected outliers and repeat
the process until a homogeneous set of observations is obtained. The final set of
data is the HDS.
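The purging cycle described above can be sketched as follows. The α value and the simulated data are illustrative, and the scaling ((n-1)²/n) on the beta quantile is the standard Phase I form assumed here, since the displayed UCL equation is not reproduced in this excerpt:

```python
import numpy as np
from scipy.stats import beta

def purge_outliers(X, alpha=0.001):
    """Iterative Phase I purge: compute T2 for each row against the
    current mean/covariance, drop rows whose T2 exceeds the beta-based
    UCL ((n-1)^2/n) * B(1-alpha; p/2, (n-p-1)/2), re-estimate, and
    repeat until no row signals.  The surviving rows form the HDS."""
    X = np.asarray(X, dtype=float)
    while True:
        n, p = X.shape
        xbar = X.mean(axis=0)
        S_inv = np.linalg.inv(np.cov(X, rowvar=False))
        d = X - xbar
        t2 = np.einsum("ij,jk,ik->i", d, S_inv, d)   # T2 per observation
        ucl = ((n - 1) ** 2 / n) * beta.ppf(1 - alpha, p / 2, (n - p - 1) / 2)
        keep = t2 <= ucl
        if keep.all():
            return X
        X = X[keep]

# Illustrative data: 50 in-control observations plus one gross outlier.
rng = np.random.default_rng(5)
clean = rng.normal(size=(50, 3))
data = np.vstack([clean, [[12.0, -12.0, 12.0]]])
hds = purge_outliers(data)
```

On each pass the mean vector and covariance matrix are re-estimated from the surviving rows, exactly as in the text's repeated passes through the data.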
When process control is to be based on monitoring the subgroup means of k
samples of observations, the actual purging process of the preliminary data set is
the same as for individual observations. The data are recorded in samples of size
mi, i = 1, 2, ..., k, yielding a total sample size of n = Σ_{i=1}^{k} mi. Since each
individual observation vector comes from the same MVN distribution, we can
disregard the k subgroups and treat the observations as one group. With the
overall group, we obtain the estimates X̄ and S, and proceed as before. When the
process is in control,
this approach produces the most efficient estimator of the covariance matrix (e.g.,
see Wierda (1994) or Chou, Mason, and Young (1999)).
New estimates of the mean vector and covariance matrix are computed from
the remaining 24 observations, and the purging process is repeated. The new
correlation matrix is presented in Table 5.4. Comparing the correlation matrices of
the purged data and the unpurged data, we find a definite change. For example,
with the removal of observation 9 the correlation between t1 and t3 increases from
0.584 to 0.807. This illustrates the effect a single outlier can have on a correlation
coefficient when there is a small sample size (n = 25). Note that such a significant
change in the correlation matrix implies a similar change in the covariance matrix.
The new T2 values are presented in Table 5.5. Since the new UCL for the
reduced set of 24 observations is 17.00, the second pass through the data produces
above temperature data, with p = 2 and n = 25, the beta distribution for the T2
statistic using the formula given in (2.15) is
A Q-Q plot of the T2 values, converted to beta values by dividing them by 0.922, is
presented in Figure 5.5. Several of the plotted points do not fall on the given line
through the data. This is especially true for the few points located in the upper
right corner of the graph. This is supported by the T2 values given in Table 5.3,
where four points, 1, 4, 9, and 21, have T2 values larger than 10. Observation 9,
located at the upper end of the line of plotted values, is somewhat removed from
the others. Given this result, the point should be investigated as a potential outlier.
Figure 5.9: Q-Q plot for transformer data after outlier removal.
The UCL is recalculated as 44.528. Observe that the system appears to be very
consistent and all observations have T2 values below the UCL. The corresponding
Q-Q plot of the T2 values is presented in Figure 5.9. Observe the strong linear
trend exhibited in the plot and the absence of observations far off the trend line.
For a given α level, the UCL for the purging process is determined using

UCL = χ²(α; p),

where χ²(α; p) is the upper αth quantile of a chi-square distribution having p degrees
of freedom.
To illustrate this procedure and contrast it to the case where the parameters
are unknown, assume the sample mean vector and covariance matrix of the data
in Table 5.1 are the true population values. Using α = 0.001, the UCL is
χ²(0.001; p) = 26.125. Comparing the observed T2 values in Table 5.3 to this value,
we find that no observation is declared an outlier. Thus, observation 9 would not
be deleted.
A major difference between the T2 statistics in (5.1) and (5.3) is due to how we
determine the corresponding UCL. When the mean vector and covariance matrix are
estimated, as in (5.1), the beta distribution is applicable, but when these parameters
are known, as in (5.4), the chi-square distribution should be used. It can be shown
that for large n, the UCL as calculated under the beta distribution (denoted as
with k = 4.472 and α < 0.10. The estimated UCL for this first pass is 19.617,
and it is used to remove 13 outliers. The estimation of the UCL and the resultant
purging process are repeated until a homogeneous data set of T2 values is obtained.
Five passes are required and 28 observations are deleted. The results of each pass
of this procedure are presented in Table 5.7.
5.7. Unknown T2 Distribution 93
If one assumes that the T2 values are described by a beta distribution and
calculates the UCL using (5.2) with α = 0.01, the same 28 observations are
removed. However, the order of removal is not the same, and only four passes
are required. These results are presented in Table 5.8. In this case, the major
difference between the two procedures is that the probability of a Type I error is
fixed at α = 0.01 for the beta distribution, whereas the error rate for the Chebyshev
approach is only bounded by α < 0.10.
To demonstrate the procedure based on the quantile technique, the 491 T2
values are arranged in descending order and an approximate UCL is calculated
using α = 0.01 and the formula in (A3.3) from Chapter 3, i.e.,
The estimated UCL for this first pass is 22.952, and it is used to remove five
outliers. The estimation of the UCL and the purging process are repeated until a
homogeneous data set of T2 values is obtained.
In this procedure, there will always be at least one T2 value exceeding the
estimated UCL in each pass. Thus, one must stop at the step where only a single
outlier is encountered. Since this occurs at step 6 for our data example, only five
passes are required and 27 observations are deleted. The results of each pass of this
procedure are presented in Table 5.9.
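The quantile-based purging can be sketched as follows. The book's approximate-quantile formula (A3.3) is not reproduced in this excerpt, so a simple empirical quantile with linear interpolation is used as a stand-in; the stopping rule follows the text, halting at the step where only a single point exceeds the estimated UCL.

```python
def quantile_ucl(t2_values, alpha=0.01):
    """Approximate (1 - alpha) quantile of the T2 values.

    Stand-in for the book's formula (A3.3): an empirical quantile with
    linear interpolation between adjacent order statistics.
    """
    s = sorted(t2_values)
    n = len(s)
    pos = (1.0 - alpha) * (n - 1)        # 0-based fractional rank
    lo = int(pos)
    frac = pos - lo
    hi = min(lo + 1, n - 1)
    return s[lo] + frac * (s[hi] - s[lo])

def quantile_purge(t2_values, alpha=0.01):
    """Repeat UCL estimation and purging; stop when only one point exceeds."""
    kept = list(t2_values)
    while True:
        ucl = quantile_ucl(kept, alpha)
        outliers = [t for t in kept if t > ucl]
        if len(outliers) <= 1:           # stop at the single-outlier step
            return kept, ucl
        kept = [t for t in kept if t <= ucl]
```

Because an empirical quantile of the retained data always leaves at least one point in the tail, the single-outlier stopping rule is what terminates the loop.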
The third method for obtaining an appropriate UCL is to fit a distribution to the
T2 statistic using the kernel smoothing technique. The UCL can be approximated
using the (1 — a)th quantile of the fitted kernel distribution function of the T2. We
begin by using the preliminary data of n observations to obtain the estimates X
and S of the parameters μ and Σ. Using these estimates, we compute the T2 values.
These n values provide the empirical distribution of the T2 statistic for a Phase I
operation. As previously noted, we are assuming that the intercorrelation common
to the T2 values has little effect on the application of these statistical procedures.
We apply the kernel smoothing procedure described by Polansky and Baker
(2000) to obtain FK(t), the kernel estimate of the distribution of T2, or simply the
Table 5.10: Results of the purging process using the kernel smoothing technique.

Pass                      1        2        3        4        5        6        7
UCL                    23.568   22.531   19.418   18.388   17.276   16.626   16.045
# of Outliers Removed     5        4        4        4        5        4        3
FK(UCL) = 1 − α,   i.e.,   UCL is the (1 − α)th quantile of FK.        (5.7)
Since FK is generally a skewed distribution, the UCL can be large for small α
values, such as 0.01 and 0.001. The (1 − α)th sample quantile of the T2(j), j =
1, ..., n, can be used as the initial value for the UCL in (5.7).
Since the kernel distribution tends to fit the data well, for a moderate α value
between 0 and 1, approximately nα (rounded to the nearest integer) of the T2 values
are beyond the UCL, or the upper 100αth percentile of the kernel distribution. For
n = 491 and α = 0.01, nα = 4.91 ≈ 5, and one may expect that four to five values
of the T2 are above the UCL.
After these outliers are removed in each pass, there are always some points
above the newly calculated UCL. This seems to be inevitable unless n or α is very
small, so that nα is around 0. Because the kernel method is based solely on data,
one way of determining the UCL for the final stage is to compare the UCLs and
the kernel distribution curves for successive passes. If the UCLs for two consecutive
passes are very different, this implies that the kernel distribution curves also differ
significantly after outliers are removed. However, if the UCLs and the curves for
two consecutive passes are nearly the same, this implies that the UCL is the desired
UCL for the final stage.
For the data in the example, the UCLs for Passes 7 and 8 are 16.045 and 15.829,
respectively. The difference between the bandwidths of these kernel estimates is
only 0.004. Therefore, the three points in Pass 7 cannot be viewed as outliers and
should be kept in the HDS. After six passes, 26 observations are removed and the
remaining 465 observations form the HDS. The UCL for the T2 chart should be set
at 16.045, as Pass 7 is the final pass. Table 5.10 presents the results of the entire
purging process.
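The kernel-based UCL can be sketched as follows. The Gaussian-kernel CDF estimate and a bisection search for its (1 − α)th quantile are standard; the bandwidth below is a normal-reference rule of thumb, not the Polansky and Baker (2000) plug-in bandwidth used in the book, so the numbers it produces are illustrative only.

```python
from statistics import NormalDist, stdev

def kernel_cdf(t, data, h):
    """Gaussian-kernel estimate FK(t) of the T2 distribution function."""
    phi = NormalDist().cdf
    return sum(phi((t - x) / h) for x in data) / len(data)

def kernel_ucl(data, alpha=0.01):
    """UCL as the (1 - alpha)th quantile of FK, found by bisection.

    Assumption: a rule-of-thumb bandwidth, in place of the plug-in
    bandwidth of Polansky and Baker (2000).
    """
    h = 1.06 * stdev(data) * len(data) ** (-0.2)
    lo, hi = min(data) - 5 * h, max(data) + 5 * h
    for _ in range(60):                  # FK is monotone, so bisect
        mid = (lo + hi) / 2
        if kernel_cdf(mid, data, h) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

In the purging loop of Table 5.10, `kernel_ucl` would be re-estimated on the retained T2 values at each pass.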
5.8 Summary
Historical data sets are very important in process control as they provide a baseline
for comparison. Any process deviation from this reference data set is considered
out of control, even in situations where the system improves. It is therefore critical
that the process be in control and on target when the HDS observations are selected.
Inclusion of atypical observations will inflate the variation and distort the
correlations among the variables; a single outlying observation is enough to do this.
It is for these reasons that a good outlier-purging procedure, such as
one based on the T2 statistic, is needed.
A common misconception is that the operational range of a variable in a mul-
tivariate process is the same as the operational range of a variable in a univariate
process. This is only true for independent variables. For correlated variables, the
operational range of the variables is increased. Correct outlier purging procedures
will determine the appropriate operational ranges on the variables.
Several different forms of the T2 can be used in detecting outliers in a Phase I op-
eration. These include situations where the population parameters are both known
and unknown and where observations are either individually charted or charted
as means. In addition, three alternative procedures are available for use when
the assumption of multivariate normality is invalid. These include the Chebyshev
approach, the quantile method, and the kernel smoothing approach.
Chapter 6
Charting the T2 Statistic
in Phase II
6.1 Introduction
A number of items need to be considered when choosing the appropriate T2 chart-
ing procedure for a Phase II operation. These include computing the appropriate
charting statistic, selecting a Type I error probability, and determining the UCL.
For example, if we monitor a steady-state process that produces independent obser-
vations, a T2 charting procedure will suffice. However, if the observations exhibit
a time dependency, such as that which is inherent to decay processes, some adjustment
for the time dependency must be made to the T2 statistic (see Chapter 10).
The charting of the T2 statistic in a Phase II operation is very similar to the
approach used in charting the statistic for a Phase I operation. The major difference
is in the probability functions used in determining the control region. Two cases
exist. When the mean vector and covariance structure are known, a chi-square
distribution is used to describe the behavior of the statistic and determine the
upper control limit. When the mean and covariance parameters are unknown and
must be estimated from the historical data, an F distribution is used to describe
the statistic and locate the upper control limit.
In this chapter, we address several different charting procedures for the T2
statistic and examine the advantages and disadvantages of each. We initially
discuss monitoring a process using a T2 statistic when only a single observation
vector is collected at each time point. This is later extended to the situation
where the process is monitored using the mean of a subgroup of observations taken
at each time point. Other topics discussed include the choice of the probabil-
ity of a Type I error, procedures for calculating the average run length to de-
tect a given mean shift, and charts for the probability of detecting a shift in the
mean vector. Any nonrandom pattern displayed in a T2 chart can imply process
change. For this reason, we include a section on detecting systematic patterns
in T2 charts.
decision, since a value of α can be chosen such that the T2 value of an observation
exceeds the UCL, even though the observation really contains no signal. Note, also,
that the size of the control region is 1 − α. This is the probability of concluding
that the process is in control when in fact control is being maintained on all process
variables.
The size of α cannot be considered without discussion of β, the probability of a
Type II error. This is the error of concluding there is no signal when in fact a signal
is present. Type I and Type II errors are interrelated in that an increase in the
probability of one will produce a decrease in the probability of the other. Careful
consideration must be given to the consequences produced by both types of error.
For example, suppose a chemical process is producing a product that becomes
hazardous when a particular component increases above a given level. Assume that
this component, along with several other correlated components, is observed on a
regular basis. A T2 control procedure is used to check the relationships among the
components as well as to determine if each is in its desired operational range. If a
Type I error is made, needless rework of the product is required since the process is
in control. If a Type II error is made, dangerous conditions immediately exist. Since
dangerous conditions override the loss of revenue, a very small β is desirable. Given
this preference for a small β, a large α would be acceptable.
The value of α chosen for a T2 chart in a Phase II operation does not have
to agree with the value used in constructing the Phase I chart. Instances do exist
where making a Type I error in a Phase II operation is not so crucial. For example,
suppose change is not initiated in the actual control of a process until more than one
signal is observed. This reduces the risk of overcontrolling the process. Situations
such as these require a larger α in the Phase II operation. In contrast, a large α
for Phase I can produce a conservative estimate of both the mean vector and the
covariance matrix, so some balance is necessary.
The choice of a for a univariate charting procedure pertains only to the false
alarm rate for the specified variable being monitored. For example, the control
limits of a Shewhart chart are frequently located at plus or minus three standard
deviations from the center line of the charted statistic. This choice fixes the false
alarm rate α at a value of 0.0027 and fixes the size of the control region at (1 − α),
or 0.9973. This translates to a false alarm rate of about 3 observations per 1,000.
The choice of α in monitoring a multivariate process is more complex, as it
reflects the simultaneous risk associated with an entire set of variables (e.g., Timm
(1996)). Establishing a control procedure for each component of the observation
vector would lead to an inappropriate control region for the variables as a group,
as individual control does not consider relationships existing among the variables.
This is illustrated with the following example.
Suppose X′ = (x1, x2) is a bivariate normal observation on a process that is to
be monitored by a joint control region defined by using a 3-sigma Shewhart procedure
for each individual variable. The shaded box given in Figure 6.1 illustrates the
control region. A major problem with this approach is that it ignores the relation-
ship that exists between the process variables and treats them independently. The
true joint control procedure, if the two variables were correlated, would be similar
to the ellipse, which is superimposed on the box in Figure 6.1.
where α is the false alarm rate for each individual variable. Thus the value of α_s
increases as p, the number of variables, increases. For example, if α = 0.0027, the
simultaneous false alarm rate for p = 2 is 1 − (1 − 0.0027)² = 0.0054, but for p = 4, the
rate increases to 0.0108. This example produces exact probabilities for a process
with independent variables. In reality, a process usually consists of a group of
correlated variables. Such situations tend to increase the true false alarm rate even
beyond (6.1).
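The simultaneous rate in (6.1), α_s = 1 − (1 − α)^p for p independently charted variables, is easy to compute directly; a minimal sketch:

```python
def simultaneous_alpha(alpha, p):
    """Simultaneous false alarm rate (6.1) for p independently
    charted variables, each with individual false alarm rate alpha."""
    return 1.0 - (1.0 - alpha) ** p

# For the Shewhart choice alpha = 0.0027, the rate grows with p:
# p = 2 gives about 0.0054 and p = 4 about 0.0108.
```

For correlated variables this is only a reference point, since, as noted above, correlation tends to push the true rate beyond (6.1).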
where the common estimates X and S are obtained from the HDS following the
procedures described in Chapter 4. In this Phase II setting, the T2 statistic in (6.2)
follows the F distribution given in (2.14). For a given α, the UCL is computed as

UCL = [p(n + 1)(n − 1) / (n(n − p))] F(α; p, n − p),

where n is the size of the HDS and F(α; p, n − p) is the αth quantile of F(p, n − p).
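This UCL computation can be sketched as follows. The multiplier p(n + 1)(n − 1)/(n(n − p)) is the standard constant for a single new observation with estimated parameters; treat it as an assumption to be verified against (2.14). The F quantile is passed in rather than computed, since it comes from tables or a statistics library.

```python
def phase2_ucl(n, p, f_quantile):
    """UCL for the Phase II T2 based on a single new observation,
    with parameters estimated from an HDS of size n.

    f_quantile is the required quantile of F(p, n - p), e.g. from a
    table or from scipy.stats.f.ppf(1 - alpha, p, n - p).
    """
    return p * (n + 1) * (n - 1) / (n * (n - p)) * f_quantile
```

Note that, unlike the known-parameter chi-square limit, this UCL depends on the HDS size n, and it approaches the chi-square limit as n grows.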
and

      [  5.25E07    2.76E07   -11749.5     3313.42    -408.673   -169.391  ]
      [  2.76E07    1.91E07   -8112.06     2302.14    -203.795   -115.969  ]
  S = [ -11749.5   -8112.06       8.6918     -0.93735    0.152381   0.032143 ]
      [  3313.42    2302.14      -0.93735    0.285332   -0.02312   -0.0134   ]
      [ -408.673   -203.795       0.152381   -0.02312    0.043598   0.003757 ]
      [ -169.391   -115.969       0.032143   -0.0134     0.003757   0.002474 ]
Considering the data in Table 6.1 as the HDS, T2 values for 16 new incoming
observations are computed. Table 6.2 contains these observations along with their
corresponding T2 values. The T2 values are computed using (6.2) and the parameter
estimates obtained from the historical data in Table 6.1. For example, the T2 value
for observation 1 is
where
Reaction to a Signal
Any observation that produces a T2 value falling outside its control region is a
signal. This implies that conditions have changed from the historical situation.
the critical components, and a risk analysis to determine the consequences of the
actions to be taken.
The probability function used to describe this T2 statistic is the chi-square distri-
bution with p degrees of freedom as given in (2.13). This is the same distribution
used to purge outliers when constructing the HDS. For a given value of α, the UCL
is determined as

UCL = χ²(α, p),

where χ²(α, p) is the upper αth quantile of the chi-square distribution with p degrees
of freedom. In this case, the control limit is independent of the size of the HDS.
To illustrate the procedure for signal detection in the known parameter case,
consider a bivariate industrial process. A sample of 11 new observations and their
mean corrected values are presented in Table 6.3. The purpose of the control
procedure is to maintain the relationship between the two variables (x1, x2) and to
guarantee that the two variables stay within their operational ranges. The mean
vector and covariance matrix are given as
and
The T2 chart for the data in Table 6.3 is presented in Figure 6.5. Letting
α = 0.05, observations 1 and 10 produce signals since their T2 values are above the
UCL = χ²(0.05, 2) = 5.99.
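The known-parameter signal check just illustrated can be sketched for the bivariate case. Since the mean vector and covariance matrix of Table 6.3's process are not reproduced in this excerpt, the values used below are hypothetical; the code shows only the mechanics of T2 = (x − μ)′Σ⁻¹(x − μ) with an explicit 2x2 inverse.

```python
def t2_known(x, mu, sigma):
    """T2 = (x - mu)' Sigma^{-1} (x - mu) for a bivariate observation,
    using the explicit 2x2 inverse [[c, -b], [-b, a]] / det."""
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    det = a * c - b * b
    return (c * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det

# Signal rule for alpha = 0.05, p = 2:
# t2_known(obs, mu, sigma) > 5.99 implies a signal.
```

With p = 2 the quadratic form expands by hand as shown; for larger p a proper matrix inverse (or Cholesky solve) would be used instead.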
The occurrence of situations where the distribution parameters are known is rare
in industry. Processing units, especially in the chemical industry, are in a constant
state of flux. A change in the operational range of a single variable of a process
can produce a ripple effect throughout the system. Many times these changes
are initiated through pilot-plant studies, research center studies under controlled
conditions, or from data obtained through other types of experiments. This requires
constant updating of the baseline conditions, which in turn demands the use of new
estimates of the parameters. Managers, performance engineers, process engineers,
and the operators are constantly striving to improve the performance of the unit.
There is no status quo.
p-variate normal distribution Np(μ, Σ), we are assured that the mean vector X of
a sample of m observations is distributed as a p-variate normal Np(μ, Σ/m) with
the same mean vector μ as an individual observation, but with a covariance matrix
given by Σ/m. If the individual observation vectors are not multivariate normally
distributed, we are assured by the central limit theorem that as m increases in size,
the distribution of X becomes more like that of an Np(μ, Σ/m). This produces the
following changes in the charting procedure.
When the parameters of the underlying MVN distribution are known, the T2
statistic for the ith subgroup mean Xi is computed by
and the UCL for a given a is determined by using (6.5). The control limit is
independent of the sample size of either the subgroup or the HDS.
When the parameters of the underlying MVN distribution are unknown, the T2
statistic for the ith sample mean Xi is computed as
where X and S are the common estimates of μ and Σ obtained from the HDS. The
distribution of a T2 statistic based on the mean of a subgroup of m observation
vectors is given in (2.17). For a given a, the UCL for use with the statistic given
in (6.7) is computed as
The T2 values for the average of the four cells for each electrolyzer are listed in
Table 6.5. When compared to UCL = 10.28, we conclude that electrolyzers 573
and 963 are to be removed from service and refurbished.
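The subgroup-mean statistic used for the electrolyzers can be sketched as follows, for the known-parameter form T2 = m(X̄ − μ)′Σ⁻¹(X̄ − μ). This is a hypothetical illustration: the inverse covariance matrix is supplied directly, and m would be the subgroup size, e.g. the four cells averaged per electrolyzer.

```python
def t2_subgroup_mean(xbar, mu, sigma_inv, m):
    """T2 for a subgroup mean of size m:
    m * (xbar - mu)' Sigma^{-1} (xbar - mu)."""
    p = len(mu)
    d = [xbar[i] - mu[i] for i in range(p)]
    return m * sum(d[i] * sigma_inv[i][j] * d[j]
                   for i in range(p) for j in range(p))
```

Compared with the individual-observation statistic, the factor m tightens the control region, reflecting the smaller covariance Σ/m of a subgroup mean.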
points distributed randomly about the center line. This occurs because, for a
normal distribution, approximately 68% of the observations are contained within
one standard deviation of the mean (centerline). Does something similar occur in
a T2 chart? The answer is no, but it is not emphatic.
In some types of industries, T2 charts are often unique and can be used to
characterize the behavior of the process. Close study of the plotted statistic can
produce valuable insight on process performance. Upset conditions, after they
occur, become obvious to those involved. However, process conditions leading to
upsets are not as obvious. If they were, there would be few upsets. If the precursor
conditions can be identified by examining the T2 plot, it is sometimes possible to
avoid the upset.
Figure 6.6 is the plotted T2 statistic for a control procedure on a mercury cell (Hg
cell), which is another type of processing unit used to produce chlorine gas and caus-
tic soda. Seven process variables are observed simultaneously in monitoring the per-
formance of the cell. The plotted T2 statistics over the given time period illustrated
in Figure 6.6 indicate a very steady-state, in-control process relative to the baseline
data set. There is very little change in the pattern of the T2 statistic. The UCL, as
determined from the HDS, has a value of 18.393; however, the values of the plotted
T2 statistic are consistently located at a substantial distance below this value.
Any erratic or consistent movement of the observed T2 values from the estab-
lished pattern of Figure 6.6 would indicate a process change. Figure 6.7 illustrates
such a condition, where the T2 values are increasing towards the control limit. Had
process intervention been initiated around observation 1000, it may have been pos-
sible to prevent the upset conditions that actually occurred at the end of the chart.
Of course, one needs tools to determine what variable or group of variables is the
precursor of the upset conditions. These will be discussed in Chapter 7.
Figures 6.6 and 6.7 present another interesting T2 pattern. Notice the running
U pattern contained in both charts. This is due to the fluctuation in the ambient
conditions from night to day over a 24-hour period, and it represents a source of
extraneous variation in the T2 charts. Such variation can distort the true rela-
tionships between the variables, and can increase the overall variation of the T2
statistic. For example, the cluster of points between observations 800 and 900 in
Figure 6.7 is U-shaped. Does it represent a process change or a change in ambient
conditions? Removing the effect of ambient conditions would produce a clearer
picture of the process.
Another example of a T2 chart is presented in Figure 6.8. Given are the T2
values for data collected on 45 process variables measured in the monitoring of a
furnace used in glass production. Observe the steady-state operating conditions of
the process from the beginning of the charting to around observation 500. This
pattern reflects a constant trend with minimal variation. After observation 500,
the T2 values exhibit a slow trend toward the UCL with upset conditions occurring
around observations 500, 550, 600, and 650. Corrections were made to the process,
and control was regained at about observation 650. However, note the increase in
variation of the T2 values and some signals between observations 650 and 1350.
The T2 plot flattens out beyond this point and the steady-state pattern returns.
These examples illustrate another important use of the T2 control chart. Af-
ter the trends in the T2 chart have been established and studied for a process,
any deviation from the established pattern indicates some type of process change.
Sometimes the change is for the better, and valuable process knowledge is gained.
Other times, the change is for the worse, and upset conditions occur. In either case,
we recommend the investigation of any change in the plotted T2 values. Using this
approach, expensive upsets that lead to chaotic conditions can be avoided.
where p represents the probability of being outside the control region. For a process
that is in control, this probability is equal to α, the probability of a Type I error
(see section 6.2). The ARL has a number of uses in both univariate and multivariate
control procedures. For example, it can be used to calculate the number
of observations that one would expect to observe, on average, before a false alarm
occurs. This is given by

ARL = 1/α.
Another use of the ARL is to compute the number of observations one would
expect to observe before detecting a given shift in the process. Consider the two
univariate normal distributions presented in Figure 6.9. One is located at the
center line (CL) and the other is shifted to the right and located at the UCL. The
probability of detecting the shift (i.e., the probability of being in the shaded region
in Figure 6.9) equals (1 − β), where β is the probability of a Type II error (see
section 6.2). Given the shift, this probability can be determined using standard
statistical formulas (e.g., see Montgomery (2001)). The ARL for detecting the shift
is given by

ARL = 1/(1 − β).
From Chapter 5, we recognize that the probability (1 − β) represents the power
of the test of a statistical hypothesis that the mean has shifted. This result produces
another major use of the ARL, which consists of comparing one control procedure
to another. This is done by comparing the ARLs of the two procedures for a given
process shift.
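The univariate ARL relations, ARL = 1/α for a false alarm and ARL = 1/(1 − β) for detecting a shift, can be sketched for the Shewhart case of Figure 6.9. The normal CDF comes from the standard library; the shift delta is assumed to be expressed in standard-deviation units of the charted statistic.

```python
from statistics import NormalDist

def arl_false_alarm(alpha):
    """Expected observations before a false alarm: ARL = 1 / alpha."""
    return 1.0 / alpha

def arl_shift_shewhart(delta, L=3.0):
    """ARL to detect a mean shift of delta (in sigma units) on a
    Shewhart chart with limits at +/- L sigma: ARL = 1 / (1 - beta),
    where 1 - beta is the probability a point falls beyond a limit."""
    z = NormalDist()
    detect = (1.0 - z.cdf(L - delta)) + z.cdf(-L - delta)
    return 1.0 / detect
```

For example, with 3-sigma limits and no shift, the ARL is about 370 observations between false alarms, while a one-sigma shift is detected in roughly 44 observations on average.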
Shifts and the probability of detection, (1 − β), are easy to compute in the uni-
variate case. However, it is more difficult to do these calculations in the multivariate
case. Consider a bivariate control region and a mean shift of the process as repre-
sented in Figure 6.10. We assume that the covariance matrix has not changed and
remains constant for the multivariate distributions. Hence, the orientation of the
control region for the shifted distribution is the same as that of the control region
for the in-control process. The area of the shifted distribution that corresponds
to the shaded region in Figure 6.10 equals the probability (1 − β) of detecting the
shift. This probability can be computed analytically for p = 2, but becomes very
difficult for higher dimensions. However, using additional statistical theory and the
nonnull distribution of the T2 statistic, the problem can be simplified.
Suppose the parameters, μx and Σ, of the MVN distribution are known. The
T2 control region for an in-control observation vector X is described by a chi-
square distribution (see section 6.4) and can be compared to the UCL based on
that distribution; i.e.,
but it cannot be described by the central chi-square distribution. This is because the
MVN that describes the vector (Y − μx) has a mean different from zero. However,
we can determine the mean of the normal vector (Y − μx) in terms of μx and μy.
Consider
where δ = (μy − μx) represents the mean shift. With this result, the distribution
of T2 is given by
Recall from Chapter 2 that the curve T2 = UCL establishes an elliptical control
region. An example of such a control region, where x1 and x2 are positively
correlated, is illustrated in Figure 6.13.
A number of observations can be made about the control region represented in
Figure 6.13. It is referenced by three different coordinate systems. The first is the
variable space, represented by (x1, x2). This is obtained by expanding (6.10) as a
function of x1 and x2 and constructing the graph.
If we standardize x1 and x2, using the values
we obtain the translated axes (y1, y2) located at the center of the ellipse in Figure
6.13. The T2 statistic in terms of y1 and y2 takes the form
where Z′ = (z1, z2) and the matrix Λ is a diagonal matrix with the eigenvalues of
P along the diagonal.
The above rotation of the (x1, x2) space to the (z1, z2) space removes the dependency
(correlation) between x1 and x2. In the (z1, z2) space, the elliptical control
region is not tilted, since z1 and z2 are independent. Further, the z1 and z2 values
are expressed as linear combinations of y1 and y2 and, hence, ultimately as linear
combinations of x1 and x2. As such, these variables are the principal components
of the correlation matrix for x1 and x2. If x1 and x2 are not standardized, the
z1 and z2 variables are the principal components of the corresponding covariance
matrix Σ and will have a different representation. For purposes of plotting, it is
usually best to use the correlation matrix.
The control region for the T2 statistic can be written in terms of z1 and z2 by
expanding the matrix multiplication of (6.10) to obtain
where ρ is the population pairwise correlation between x1 and x2. Note that the
eigenvalues of the correlation matrix for x1 and x2 are (1 + ρ) and (1 − ρ), so the
T2 statistic is now expressed in terms of these variables.
If the equation in (6.13) is set equal to the UCL, it forms a bivariate elliptical
control region that can be used as a charting procedure in the principal component
space, i.e., in terms of z1 and z2. For example, given a bivariate observation (x1, x2),
we can standardize the observations using (6.11). The principal components z1
and z2 are computed using (6.12) and plotted in the principal component space.
Observations plotting outside the elliptical region are out of control, as they do
not conform to the HDS. The point A in the principal component control region
presented in Figure 6.14 illustrates this.
The method of principal component translation can be generalized to the p-
dimensional case. The control region for the T2 statistic can be expressed in terms
of the p principal components of the correlation matrix as
where λ1 > λ2 > ⋯ > λp are the eigenvalues of the correlation matrix. Each zi is
computed as
The graph of (6.14), when set equal to the control limit, is a hyperellipsoid in a p-
dimensional space. However, plotting this region for p > 3 is not currently possible.
As an alternative, one can plot any combination of the principal components in a
subspace of three or fewer dimensions. This procedure also has a major drawback.
Any point (zi, zj, zk) that plots outside the region defined by (6.14) will produce
a signal, but there is no guarantee that a point plotting inside the region does not
contain a signal on another principal component.
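For the bivariate case, the principal-component form of the T2 in (6.13) can be evaluated directly. A minimal sketch, assuming the means, standard deviations, and correlation r come from the HDS: the rotation z1 = (y1 + y2)/√2, z2 = (y1 − y2)/√2 corresponds to the eigenvalues (1 + r) and (1 − r) of the 2x2 correlation matrix.

```python
from math import sqrt

def pc_t2(x1, x2, mean1, mean2, s1, s2, r):
    """T2 via principal components of the correlation matrix (p = 2).

    Standardize to (y1, y2), rotate to z1 = (y1 + y2)/sqrt(2) and
    z2 = (y1 - y2)/sqrt(2), then weight each z by the reciprocal of
    its eigenvalue, 1 + r and 1 - r."""
    y1 = (x1 - mean1) / s1
    y2 = (x2 - mean2) / s2
    z1 = (y1 + y2) / sqrt(2.0)
    z2 = (y1 - y2) / sqrt(2.0)
    return z1 * z1 / (1.0 + r) + z2 * z2 / (1.0 - r)
```

Expanding this weighted sum recovers the standardized quadratic form (y1² − 2r·y1·y2 + y2²)/(1 − r²), confirming that the rotation leaves the statistic unchanged.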
6.9 Summary
Signal detection is an important part of any control procedure. In this chapter, we
have discussed charting the T2 statistic when monitoring a process in a Phase II op-
eration. This includes the charting of the T2 statistic based on a single observation
and the T2 statistic based on the mean of a subgroup of observations. Included are
procedures to follow when the parameters of the underlying MVN distribution are
known and when they are unknown. In both cases it is assumed that the covariance
structure is nonsingular.
Also contained in this chapter is a discussion on determining the ARL for a T2
chart. To calculate an ARL for a given mean shift in a multivariate distribution
involves the introduction of noncentral distributions and the evaluation of some
complicated integrals. For the reader who has a deeper interest in this area, there
are many excellent texts in multivariate analysis on this subject, e.g., Johnson and
Wichern (1998), Fuchs and Kenett (1998), and Wierda (1994).
An optional section on the plotting of the T2 statistic in a principal component
space was presented. As pointed out in the discussion, the procedure has both
advantages and disadvantages. A major advantage is that one can plot and observe
signals on particular principal components in a subspace of the principal compo-
nent space. However, a major disadvantage is that each principal component is a
linear combination of all the process variables. This often inhibits a straightforward
interpretation procedure in terms of the process variables.
Chapter 7
Interpretation of T2 Signals for
Two Variables
7.1 Introduction
Univariate process control usually involves monitoring control charts for location
and variation. For example, one might choose to monitor mean shifts with an X
chart and variation shifts with an R chart, as both procedures are capable of de-
tecting deviations from the historical baseline. In this setting, signal interpretation
is simplified, as only one variable needs to be examined. A signal indicates that
the process mean has shifted, the process variation has changed, or both.
In multivariate SPC, the situation becomes more complicated. Nonconformity
to a given baseline data set can be monitored using the T2 statistic. If the observed
T2 value falls outside the control region, a signal is detected. The simplicity of the
monitoring scheme, however, stops with signal detection, as a variety of variable
relationships can produce a signal.
For example, an observation may be identified as being out of control because its
value for an individual variable is outside the bounds of process variation established
by the HDS. Another cause of a signal is when values on two or more variables do
not adhere to the linear correlation structure established by the historical data. The
worst case is when the signal is a combination of the above, with some variables
being out of control and others being countercorrelated.
Several solutions have been posed for the problem of interpreting a multivariate
signal. For example, Doganaksoy, Faltin, and Tucker (1991) proposed ranking the
components of an observation vector according to their relative contribution to a
signal using a univariate t statistic as the criterion. Hawkins (1991, 1993) and Wade
and Woodall (1993) separately used regression adjustments for individual variables
to improve the diagnostic power of the T2 after signal detection. Runger, Alt, and
Montgomery (1996) proposed using a different distance metric, and Timm (1996)
used a stepdown procedure for signal location and interpretation. An overview
of several of these multivariate process control procedures, including additional
ones by Kourti and MacGregor (1996) and Wierda (1994), can be found in Mason,
Champ, Tracy, Wierda, and Young (1997). Also, several comparisons are given in
Fuchs and Kenett (1998).
In this chapter, we present a method of signal interpretation that is based on
the orthogonal decomposition of the T2 statistic. The independent decomposition
components, each similar to an individual T2 variate, are used to isolate the source
of a signal and simplify its interpretation. The discussion is limited to a two-variable
problem, as it is the easiest to geometrically visualize. The more general p-variable
case is presented in Chapter 8.
where c is an appropriately chosen constant that specifies the size of the control
region (see section 6.2). A typical control region is illustrated by the interior of
the ellipse given in Figure 7.1. Any sample point located on the ellipse would be
located the same statistical distance from the sample mean as any other point on
the ellipse.
There are two additive components to the T2 statistic given in (7.1), and these
provide a natural decomposition of the corresponding statistic. The components,
in fact, are independent due to the independence of the two original x variables.
This property is what causes the ellipse not to be tilted. Since the components are
unequally weighted due to their unequal variances, we will transform the variables
to a form that will provide equal weights. Doing so will produce a circular region
and make the statistical distance, represented by the square root of the T2 in (7.1),
equivalent to the corresponding Euclidean, or straight-line, distance. This is the
view we need in order to interpret the T2 value.
Let

    y1 = (x1 - x̄1)/s1  and  y2 = (x2 - x̄2)/s2    (7.3)

represent the standardized values of x1 and x2, respectively. Using this transformation,
we can re-express the T2 value in (7.1) as

    T2 = y1^2 + y2^2.
The T2 value is again separated into two independent components as in (7.1), but
now the components have equal weight. The first component, y1^2, measures the
contribution of x1 to the overall T2 value, and the second component, y2^2, measures
the contribution of x2 to the overall T2 value. Careful examination of the magnitude
of these components will isolate the cause of a signal.
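The equal-weight split just described can be sketched numerically. The following Python fragment is a minimal illustration only; the data, seed, and names (`hds`, `t2_components`) are hypothetical and not taken from the book's examples:

```python
import numpy as np

# Illustrative historical data set (HDS) for two independent variables.
rng = np.random.default_rng(7)
hds = np.column_stack([rng.normal(50.0, 2.0, 200),
                       rng.normal(30.0, 5.0, 200)])

xbar = hds.mean(axis=0)       # sample means (x1-bar, x2-bar)
s = hds.std(axis=0, ddof=1)   # sample standard deviations (s1, s2)

def t2_components(x):
    """Equally weighted components (y1^2, y2^2) of T2, as in (7.3)."""
    y = (np.asarray(x) - xbar) / s
    return y ** 2

x_new = [54.0, 22.0]
comps = t2_components(x_new)
t2 = comps.sum()              # T2 = y1^2 + y2^2
print(comps, t2)
```

Each component can then be inspected separately to see which variable drives a large T2 value.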
The control region using the transformation given in (7.3) is illustrated by the
interior of the circle depicted in Figure 7.2. A T2 value in this orthogonally
transformed space is the same as the squared Euclidean distance of the point
(y1, y2) from the origin (0, 0), and it can be represented by the squared
hypotenuse of the enclosed right triangle depicted in Figure 7.2. This is
equivalent to the statistical distance of the point (x1, x2) from the mean vector
(x̄1, x̄2). Thus, all points with the same statistical distance are located on the
circle in Figure 7.2, as well as on the ellipse in Figure 7.1.

Figure 7.2: Bivariate orthogonal control region with equal variances. SD refers to
statistical distance.
Consider a situation where the variables of X' = (x1, x2) are not independent.
Since the pairwise correlation r between x1 and x2 is nonzero, the T2 value would
be given as

    T2 = [1/(1 - r^2)] [(x1 - x̄1)^2/s1^2 - 2r(x1 - x̄1)(x2 - x̄2)/(s1 s2) + (x2 - x̄2)^2/s2^2],    (7.5)

and the corresponding elliptical control region would be tilted. This is illustrated
in Figure 7.3. Again letting y1 and y2 represent the standardized values of x1 and
x2, the T2 value in (7.5) can be written as

    T2 = (y1^2 - 2r y1 y2 + y2^2)/(1 - r^2).    (7.6)
For the components of the decomposition to be independent, they must correspond
to the axes of the ellipse. This is not done in either Figure 7.3 or Figure 7.4. For
example, the axes of the ellipse in Figure 7.4 do not correspond to the axes of the
(y1, y2) space. The axes can be aligned only through an orthogonal transformation.
In this nonindependent case, the transformation in (7.6) is incomplete, as it does
not provide the axis rotation that is needed.
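The agreement between the quadratic-form T2 and the standardized expression in (7.6) can be checked numerically. This sketch uses illustrative data; the names (`hds`, `t2_y`) are assumptions, not the book's notation:

```python
import numpy as np

# Hypothetical correlated bivariate HDS.
rng = np.random.default_rng(11)
z = rng.normal(size=(300, 2))
hds = z @ np.array([[2.0, 0.0], [1.5, 1.0]]) + np.array([50.0, 30.0])

xbar = hds.mean(axis=0)
S = np.cov(hds, rowvar=False)                 # estimated covariance matrix
r = np.corrcoef(hds, rowvar=False)[0, 1]      # estimated correlation
s = np.sqrt(np.diag(S))

x = np.array([53.0, 27.0])

# Quadratic-form T2 based on the estimated covariance matrix.
t2 = (x - xbar) @ np.linalg.inv(S) @ (x - xbar)

# Standardized form (7.6): (y1^2 - 2 r y1 y2 + y2^2) / (1 - r^2).
y1, y2 = (x - xbar) / s
t2_y = (y1**2 - 2 * r * y1 * y2 + y2**2) / (1 - r**2)

print(t2, t2_y)   # the two forms agree
```

The cross-product term in (7.6) is exactly what tilts the elliptical control region when r is nonzero.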
The T2 statistic in (7.6) can be separated into two additive components using
the orthogonal transformation

    z1 = (y1 + y2)/sqrt(2)

and

    z2 = (y1 - y2)/sqrt(2).

Figure 7.5: Bivariate principal component control region with unequal variances.
As was shown in Chapter 6, the values z1 and z2 are the first and second principal
components of the correlation matrix for x1 and x2 (see also the appendix to this
chapter, section 7.10). Using this transformation, we can decompose the T2 value
in (7.6) as

    T2 = z1^2/(1 + r) + z2^2/(1 - r)

and, with w1 = z1/sqrt(1 + r) and w2 = z2/sqrt(1 - r), as

    T2 = w1^2 + w2^2.    (7.10)
Figure 7.6: Bivariate principal component control region with equal variances.
The resultant control region, presented in Figure 7.6, is now circular, and the
(squared) statistical distance is represented by the hypotenuse of a right triangle.
The transformation given in (7.10) provides an orthogonal decomposition of the
T2 value. Thus, it will successfully separate a bivariate T2 value into two additive
and orthogonal components. However, each w1 and w2 component in (7.10) is a
linear combination of both x1 and x2. Since each component consists of both variables,
clear interpretation of the source of the signal in terms of the individual
process variables is hampered. This problem becomes more severe as the number
of variables increases. What is needed instead is a methodology that will pro-
vide both an orthogonal decomposition and a means of interpreting the individual
components. One such procedure is given by the MYT (Mason-Young-Tracy) de-
composition, which was first introduced by Mason, Tracy, and Young (1995).

7.3 The MYT Decomposition

As noted above, the components of a principal component decomposition can be
difficult to interpret, as they are linear combinations of the p variables of the
observation vector. The components of the MYT decomposition of the T2 statistic,
in contrast, have global meaning. This is one of the most desirable characteristics
of the method.
We will demonstrate the MYT procedure for a bivariate observation vector X' =
(x1, x2), where x1 and x2 are correlated. Details on the more general p-variable
case can be found in Chapter 8. The MYT decomposition uses an orthogonal
transformation to express the T2 value as two orthogonal and equally weighted
terms. One such decomposition is given by

    T2 = (x1 - x̄1)^2/s1^2 + (x2 - x̄2.1)^2/s2.1^2,    (7.11)

where

    x̄2.1 = x̄2 + b(x1 - x̄1)

and

    s2.1^2 = s2^2 (1 - r^2).

In this formulation, x̄2.1 is the estimator of the conditional mean of x2 for a given
value of x1, and s2.1^2 is the corresponding estimator of the conditional variance of
x2 for a given value of x1. Details on these estimators are given in the last section
of this chapter.
The first term of the MYT decomposition in (7.11), T1^2, is referred to as an
unconditional term, as it depends only on x1. The second term of the orthogonal
decomposition, written as T2.1^2, is referred to as a conditional term, as it is
conditioned on the value of x1. Using the above notation, we can write (7.11) as

    T2 = T1^2 + T2.1^2.

The square root of this T2 value can be plotted and viewed in the (T1, T2.1) space.
This is illustrated in Figure 7.7.
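The bivariate MYT decomposition can be sketched in code. The data and names below (`myt`, `x2bar_1`) are illustrative assumptions; the check at the end confirms that the two terms reproduce the full quadratic-form T2:

```python
import numpy as np

# Hypothetical correlated HDS for (x1, x2).
rng = np.random.default_rng(3)
x1h = rng.normal(100.0, 4.0, 250)
x2h = 0.8 * x1h + rng.normal(0.0, 2.0, 250)
hds = np.column_stack([x1h, x2h])

xbar = hds.mean(axis=0)
S = np.cov(hds, rowvar=False)
r = np.corrcoef(hds, rowvar=False)[0, 1]
s1, s2 = np.sqrt(np.diag(S))
b = S[0, 1] / S[0, 0]                  # regression coefficient of x2 on x1

def myt(x):
    """Bivariate MYT decomposition: T2 = T1^2 + T2.1^2 as in (7.11)."""
    x1, x2 = x
    t1_sq = (x1 - xbar[0])**2 / s1**2           # unconditional term
    x2bar_1 = xbar[1] + b * (x1 - xbar[0])      # estimated conditional mean
    s2_1_sq = s2**2 * (1 - r**2)                # estimated conditional variance
    t21_sq = (x2 - x2bar_1)**2 / s2_1_sq        # conditional term
    return t1_sq, t21_sq

x = np.array([104.0, 74.0])
t1_sq, t21_sq = myt(x)
t2_full = (x - xbar) @ np.linalg.inv(S) @ (x - xbar)
print(t1_sq + t21_sq, t2_full)         # the two terms sum to the full T2
```

The decomposition is an exact algebraic identity, so the sum of the terms matches the quadratic form to machine precision.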
The orthogonal decomposition given in (7.11) is one of two possible MYT
decompositions of a T2 value for p = 2. The other decomposition is given as

    T2 = T2^2 + T1.2^2,    (7.14)

where

    T2^2 = (x2 - x̄2)^2/s2^2

and

    T1.2^2 = (x1 - x̄1.2)^2/s1.2^2.
(x1, x2) presented in Figure 7.8. As the value of x1 changes, so does the value of
x̄2.1. Consider a fixed value of x1, say x1 = a. This is represented by the vertical
line drawn upward from the x1 axis. For process control to be maintained at this
value of x1, the corresponding value of x2 must come from the shaded interval along
the x2 axis. This means the value of x2 must be contained in this portion of the
conditional density; otherwise, a signal will be obtained on the T2.1^2 component.
A similar discussion can be used to illustrate a signal on the T1.2^2 term. Suppose
the value of x2 is fixed at a point b and we examine the restricted (conditional)
interval that must contain the observation on x1. This is depicted in Figure 7.9. If
the value of x1 is not contained in the shaded interval, a signal will be obtained on
the T1.2^2 component.
A large value on a conditional term implies that the observed value of one
variable is not where it should be relative to the observed value of the other variable.
Observations on the variables (x1, x2) that produce a signal of this type are said to
be countercorrelated, as something is astray with the relationship between x1 and
x2. Countercorrelations are a frequent cause of a multivariate signal.
7.5 Regression Perspective

Consider the estimated mean of xi adjusted for xj, i.e., x̄i.j. This is given as

    x̄i.j = x̄i + b(xj - x̄j),    (7.19)

where x̄i and x̄j are the sample means of xi and xj obtained from the historical
data, and b is the estimated regression coefficient relating xi to xj in this data set.
The left-hand side of (7.19) contains x̄i.j, which is the predicted value of xi based
on the corresponding value of xj (i.e., xi is the dependent variable and xj is the
predictor variable). Thus, the numerator of (7.18) is a regression residual; i.e.,

    ri.j = xi - x̄i.j.

Writing the denominator of (7.18) as si.j^2 = si^2 (1 - Ri.j^2), where Ri.j^2 is the
squared multiple correlation between xi and xj, and substituting ri.j for (xi - x̄i.j),
we can re-express Ti.j^2 as

    Ti.j^2 = ri.j^2 / [si^2 (1 - Ri.j^2)],

or as the square of the standardized residual ri.j / (si sqrt(1 - Ri.j^2)). We use this
notation for consistency with the formula used in the p-dimensional case discussed
in Chapter 8.
For the two decompositions in (7.11) and (7.14), the conditional terms become

    T2.1^2 = r2.1^2 / [s2^2 (1 - r^2)]    (7.21)

and

    T1.2^2 = r1.2^2 / [s1^2 (1 - r^2)],    (7.22)

where r2.1 = (x2 - x̄2.1) and r1.2 = (x1 - x̄1.2) are residuals from the respective
regression fits of x2 on x1 and x1 on x2. These residuals are illustrated in Figures
7.10 and 7.11.
Notice that the two conditional values in (7.21) and (7.22), apart from the
Ri.j^2 term, are actually standardized residuals having the form ri.j/si. When the
residuals (after standardizing) in Figures 7.10 and 7.11 are large, the conditional T2
terms signal. This would occur only when the observed value of x1 differs from the
value predicted by x2, or the observed value of x2 differs from the value predicted
by x1, where prediction is derived from the HDS.
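The regression view of a conditional term reduces, in code, to a squared standardized residual. The numbers below are hypothetical; the names (`resid`, `t21_sq`) are assumptions used only for illustration:

```python
import numpy as np

# Hypothetical HDS for (x1, x2) with a strong linear relationship.
rng = np.random.default_rng(5)
x1h = rng.normal(10.0, 1.0, 150)
x2h = 2.0 * x1h + rng.normal(0.0, 0.5, 150)

x1bar, x2bar = x1h.mean(), x2h.mean()
s1, s2 = x1h.std(ddof=1), x2h.std(ddof=1)
r = np.corrcoef(x1h, x2h)[0, 1]
b = r * s2 / s1                          # slope of the fit of x2 on x1

x1, x2 = 11.0, 20.0                      # a new observation
resid = x2 - (x2bar + b * (x1 - x1bar))  # r2.1 = x2 - x2.1-bar
t21_sq = resid**2 / (s2**2 * (1 - r**2)) # conditional term, as in (7.21)
print(resid, t21_sq)
```

A large `t21_sq` flags an observation whose x2 value disagrees with the value predicted from x1 by the historical regression fit.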
7.6 Distribution of the T2 Components

The unconditional terms of the decomposition are distributed as

    Tj^2 ~ [(n + 1)/n] F(1, n - 1)    (7.23)

for j = 1, 2. Similarly, the conditional terms, Ti.j^2, used in checking the linear
relationships between the variables are distributed as

    Ti.j^2 ~ [(n + 1)(n - 1)/(n(n - 2))] F(1, n - 2),    (7.24)

a result given in Mason, Tracy, and Young (1995). Thus, one can use the F distribution
to determine when an individual unconditional or conditional term of the
decomposition is significantly large and makes a contribution to the signal.
The procedure for making this determination is as follows. For a specified α
level and an HDS sample of size n, obtain F(α, 1, n-k-1) from the appropriate F
table. Compute the UCL for individual terms using (7.23) and (7.24). Any
unconditional term greater than its UCL would imply that the corresponding xj is
contributing to the signal. Likewise, any conditional term greater than its UCL
would imply that the relationship between the corresponding variables is
contributing to the signal.
does not require the assumption of multivariate normality. However, without this
assumption, or some distributional assumption, it is difficult to detect a signal.
We would like the parallelogram and ellipse in Figure 7.12 to be similar in size.
The size of the ellipse is determined by the choice of the overall probability, labeled
α1, of making a Type I error, i.e., of saying the process is out of control when in fact
control is being maintained. The size of the parallelogram is controlled by the choice
of the specific probability, labeled α2, used for testing the "largeness" of individual
terms. Thus, α2 represents the probability of saying a component is part of the
signal when in fact it is not. The two α's are not formally related in this situation.
However, ambiguities can be reduced by making the two regions agree in size.
We use the F distributions in (7.23) and (7.24) to locate large values among the
T2 decomposition terms. This is done because, given that an overall signal exists,
the most likely candidates among the unique terms of a total MYT decomposition
are the components with large values that occur with small a priori probabilities.
Our interest in locating the signaling terms of the MYT decomposition is due to
the ease of interpretation for these terms.
Consider an example to illustrate the methodology of this section. For p = 2,
the control region is illustrated in Figure 7.13. Signals are indicated by points A,
B, C, and D. Note that the box encompassing the control region represents the
tolerance on variables x1 and x2 for α = 0.05, as specified by (7.26). The tolerance
regions are defined by the Shewhart control limits of the individual variables for
the appropriate α level.
Figure 7.16: Time-sequence chart for first few points of boiler data.
dominated by moderate-to-high values of the two variables, making the low values
farther from the mean vector. Since the T2 statistic is a squared number, values
far from the mean are large and positive.
A Q-Q plot of the 500 ordered T2 values versus the corresponding beta values
for the boiler historical data is presented in Figure 7.17. The upper four points in
the graph correspond to the four T2 values of Figure 7.14 that are greater than
or equal to a value of 8. Although the linear trend for these four points is not
consistent with the linear trend of the other 496 points, the deviation is not severe
enough to disqualify the use of the T2 statistic for detecting signals in the Phase
II operation.
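The Q-Q construction behind a plot like Figure 7.17 can be sketched as follows. The data are simulated, not the boiler HDS; the scaled-beta form of the Phase I T2 distribution used in the comment is a standard result that is treated here as an assumption:

```python
import numpy as np
from scipy import stats

# Simulated Phase I HDS with p = 2 variables.
rng = np.random.default_rng(17)
n, p = 500, 2
hds = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=n)

xbar = hds.mean(axis=0)
S_inv = np.linalg.inv(np.cov(hds, rowvar=False))
t2 = np.einsum('ij,jk,ik->i', hds - xbar, S_inv, hds - xbar)

# Assumed Phase I result: T2 ~ ((n-1)^2 / n) * Beta(p/2, (n-p-1)/2).
probs = (np.arange(1, n + 1) - 0.5) / n
beta_q = ((n - 1)**2 / n) * stats.beta.ppf(probs, p / 2, (n - p - 1) / 2)

# Pair sorted T2 values with beta quantiles for a Q-Q plot.
qq = np.column_stack([beta_q, np.sort(t2)])
print(qq[-1])    # largest T2 value vs. largest beta quantile
```

A roughly linear scatter of the paired columns supports using the HDS for Phase II signal detection, as argued above for the boiler data.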
Summary statistics for the boiler HDS are presented in Table 7.2. The minimum
and maximum values of the variables give a good indication of the operational
ranges on the two variables. For example, fuel usage in the HDS ranges from a low
of 186.46 units to a high of 524.70 units.
With statistical understanding of the boiler system through examination of the
HDS, we are ready to move to signal interpretation. The control region is pre-
sented in Figure 7.18 in the variable space of fuel usage and steam flow. Also
included are three signaling points, designated as points 1, 2, and 3. Examination
of the signals in this graphical setting provides insight as to how the terms of the
MYT decomposition identify the source of the signal and how the signals are to be
interpreted.
For example, note the (Euclidean) distance point 1 is from the body of the data
(HDS). Also, note the "closeness" of points 2 and 3 to the control region, especially
point 2. This leads one to think that the signal for point 1 is more severe than
the signals for the other two points. However, this is not the case. Observe the
T2 values presented in Table 7.3 for the three signaling points. Point 3 clearly
has the largest T2 value, while point 2 has the smallest of the three signaling T2
values. To understand why this occurs, we need to examine the values of the MYT
decomposition terms that are presented in Table 7.4.
Since there are only two variables, the three signaling points can be plotted in
either the (T1, T2.1) space or the (T2, T1.2) space. A representation in the (T1,
T2.1) space is presented in Figure 7.19. Geometrically, the circle in Figure 7.19
represents a rotation of the elliptical control region given in Figure 7.18. In the
transformed space, the T2 statistic is represented by the square of the length of the
arrows designated in the plot. The UCL of 9.33 defines the square of the radius
of the circular control region. The coordinates of point 1 in Figure 7.19 are (3.27,
1.57). The sum of squares of these values equals the T2 value of point 1, i.e.,
The coordinates of point 2 are (1.3, 5.05), and those of point 3 are (2.04, 20.01).
Scalewise, point 3 would be located off the graph of Figure 7.19. However, this
Figure 7.19: Control region in (Ti, T2.i) space for boiler data.
The corresponding observed fuel value of 400.00 is too large for this value of steam.
Likewise, the difference between the actual steam value and the predicted steam
Figure 7.20: Time-sequence graph of three signaling points.
value for this point is too large. The predicted steam value is 286.26 units. When
compared to the actual value of 300.00, the residual of 13.74 steam units is
too large to be attributed to random fluctuation.
Point 3 also has T2 signals on both the conditional terms. This indicates that
the linear relationship between the two variables is astray. The reason for this can
best be seen by examining the time-sequence graph presented in Figure 7.20 for the
three signaling points. These were derived from the observation plot of the HDS
in Figure 7.16, where it was established that fuel must be above the corresponding
value of steam. For point 3 in Figure 7.20, the relationship is reversed as the value
of steam is above the corresponding fuel value. This counterrelationship produces
large signaling values on the two conditional T2 terms.
through the joint density at the fixed value of y, i.e., at y = b. This is illustrated
in Figure 7.22 for various values of the constant b.
For the MVN distribution with p = 2, the conditional density of x given y is
normal, with conditional mean

    μx|y = μx + ρ(σx/σy)(y - μy)    (7.28)

and conditional variance

    σx|y^2 = σx^2 (1 - ρ^2).    (7.29)

Examination of (7.28) reveals that μx|y depends on the specified value of y. For
example, the conditional mean of the distribution of x for y = b is given as

    μx|y=b = μx + ρ(σx/σy)(b - μy).
For various values of the constant b (i.e., for values of y), it can be proven that the
line connecting the conditional means (as illustrated in Figure 7.23) is the regression
line of x on y. This can also be seen in (7.28) by noting that the regression coefficient
β (of x on y) is given by

    β = ρ(σx/σy).

Thus, another form of the conditional mean (of the line connecting the means of
the conditional densities) is given by

    μx|y = μx + β(y - μy).
In contrast to the conditional mean, the conditional variance in (7.29) does not
depend on the particular value of y. However, it does depend on the strength of
the correlation, ρ, existing between the two variables.
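The behavior of (7.28) and (7.29) is easy to verify directly. The parameter values below are illustrative assumptions:

```python
import numpy as np

# Illustrative population parameters of a bivariate normal (x, y).
mu_x, mu_y = 5.0, 2.0
sigma_x, sigma_y, rho = 2.0, 1.5, 0.8

def conditional_of_x_given_y(y):
    """Mean and variance of x | y, per (7.28) and (7.29)."""
    mean = mu_x + rho * (sigma_x / sigma_y) * (y - mu_y)
    var = sigma_x**2 * (1 - rho**2)
    return mean, var

# The conditional mean traces the regression line of x on y;
# the conditional variance does not change with y.
for b in (0.0, 2.0, 4.0):
    m, v = conditional_of_x_given_y(b)
    print(b, m, v)
```

Running the loop shows the mean shifting linearly with y while the variance stays fixed, which is exactly the point made in the text.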
7.9 Summary
In this chapter, we have discussed the essentials for using the MYT decomposition in
the interpretation of signals for a bivariate process. We have shown that a signaling
T2 value for a bivariate observation vector has two possible MYT decompositions.

7.10 Appendix: Principal Component Form of T2

The T2 statistic can be expressed in terms of the principal components of the
estimated covariance matrix S as

    T2 = (X - X̄)' S^(-1) (X - X̄)

or

    T2 = z1^2/λ1 + z2^2/λ2 + ... + zp^2/λp,    (A7.1)
where λ1 > λ2 > ... > λp are the eigenvalues of the estimated covariance matrix S
and the zi, i = 1, ..., p, are the corresponding principal components. A principal
component is obtained by multiplying the vector quantity (X - X̄) by the transpose
of the normalized eigenvector ui of S corresponding to λi; i.e.,

    zi = ui'(X - X̄).    (A7.2)
Note that the T2 statistic also can be expressed in terms of the estimated
correlation matrix R as

    T2 = Y' R^(-1) Y,

where Y' = (y1, y2, ..., yp) is the standardized form of the observation vector.
The matrix R (obtained from S) is a positive definite symmetric matrix and can
be represented in terms of its eigenvalues and eigenvectors. Using a transformation
similar to (A7.2), the above T2 can be written as

    T2 = w1^2/γ1 + w2^2/γ2 + ... + wp^2/γp,    (A7.4)
where w1, w2, ..., wp are the principal components of the correlation matrix R
and the γi are the eigenvalues of R. The principal component values are given by
wi = vi'(X - X̄), where the vi are the normalized eigenvectors of R.
Equation (A7.4) is not to be confused with (A7.1). The first equation is written
in terms of the eigenvalues and eigenvectors of the covariance matrix, and the second
is in terms of the eigenvalues and eigenvectors of the estimated correlation matrix.
These are two very different forms of the same Hotelling's T2, as the mathematical
transformations are not equivalent. Similarly, (A7.4) should not be confused with
(6.13). The equation in (6.13) refers to a situation where the correlation matrix is
known, while (A7.4) is for the case where the correlation matrix is estimated.
The principal component representation of the T2 plays a number of roles in
multivariate SPC. For example, it can be used to show that the control region is
elliptical in shape. Consider a control region defined by a UCL. The observations
contained in the HDS have T2 values less than the UCL; i.e., for each Xi,

    (Xi - X̄)' S^(-1) (Xi - X̄) < UCL,

and thus, by (A7.4),

    w1^2/γ1 + w2^2/γ2 + ... + wp^2/γp < UCL.

In the principal component space of the estimated correlation matrix, this reduces
to

    w1^2/γ1 + w2^2/γ2 = UCL,    (A7.5)

which gives the equation of the control ellipse. The length of the major axis of the
ellipse in (A7.5) is given by γ1, and the length of the minor axis is given by γ2.
The axes of this space are the principal components, w1 and w2. The absence of
a product term in this representation indicates the independence between w1 and
w2. This is a characteristic of principal components, since they are transformed to
be independent.
Assuming that the estimated correlation r is positive, it can be shown that
γ1 = (1 + r) and γ2 = (1 - r). For negative correlations, the γi values are reversed.
One can also show that the principal components can be expressed as

    w1 = (y1 + y2)/sqrt(2)  and  w2 = (y1 - y2)/sqrt(2).

From these equations, one can obtain the principal components as functions of the
original variables.
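The principal component form (A7.4) can be checked numerically. In this sketch the transformation is applied to the standardized observation vector; the data, the use of `numpy.linalg.eigh`, and the sorting step are implementation details assumed here, not the book's notation:

```python
import numpy as np

# Hypothetical bivariate HDS.
rng = np.random.default_rng(29)
hds = rng.multivariate_normal([0, 0], [[4.0, 2.4], [2.4, 4.0]], size=400)

xbar = hds.mean(axis=0)
s = hds.std(axis=0, ddof=1)
R = np.corrcoef(hds, rowvar=False)        # estimated correlation matrix
r = R[0, 1]

gammas, V = np.linalg.eigh(R)             # eigenvalues/eigenvectors of R
order = np.argsort(gammas)[::-1]          # sort so gamma1 >= gamma2
gammas, V = gammas[order], V[:, order]

x = np.array([3.0, -1.0])
y = (x - xbar) / s                        # standardized observation
w = V.T @ y                               # principal component scores

t2_pc = np.sum(w**2 / gammas)             # T2 in the form (A7.4)
t2_dir = y @ np.linalg.inv(R) @ y         # T2 as Y' R^(-1) Y
print(t2_pc, t2_dir)                      # gamma1 ~ 1 + r, gamma2 ~ 1 - r
```

For a 2 x 2 correlation matrix the eigenvalues come out as 1 + r and 1 - r, matching the appendix result.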
Chapter 8
Interpretation of T2 Signals for
the General Case
8.1 Introduction
In this chapter, we extend the interpretation of signals from a T2 chart to the
setting where there are more than two process variables. The MYT decomposition
is the primary tool used in this effort, and we examine many interesting properties
associated with it. For example, we show that the decomposition terms contain
information on the residuals generated by all possible linear regressions of one
variable on any subset of the other variables. In addition to being an excellent
aid in locating the source of a signal in terms of individual variables or subsets of
variables, this property has two other major functions. First, it can be used to
increase the sensitivity of the T2 statistic in the area of small process shifts (see
Chapter 9). Second, the property is very useful in the development of a control
procedure for autocorrelated observations (see Chapter 10).
elements of the mean vector. Suppose we similarly partition the matrix S so that

    S = [ Sxx    sxx  ]
        [ sxx'   sp^2 ],

where Sxx is the (p - 1) x (p - 1) covariance matrix for the first (p - 1) variables,
sp^2 is the variance of xp, and sxx is a (p - 1)-dimensional vector containing the
covariances between xp and the remaining (p - 1) variables.
The T2 statistic in (8.1) can be partitioned into two independent parts (see
Rencher (1993)). These components are given by

    T2 = T2(X(p-1)) + Tp.1,2,...,p-1^2,    (8.3)

where the first term,

    T2(X(p-1)) = (X(p-1) - X̄(p-1))' Sxx^(-1) (X(p-1) - X̄(p-1)),

uses the first (p - 1) variables and is itself a T2 statistic. The last term in (8.3)
can be shown (see Mason, Tracy, and Young (1995)) to be the square of the pth
component of the vector X adjusted by the estimates of the mean and standard
deviation of the conditional distribution of xp given (x1, x2, ..., xp-1). It is given
as

    Tp.1,2,...,p-1^2 = (xp - x̄p.1,2,...,p-1)^2 / sp.1,2,...,p-1^2,

where

    x̄p.1,2,...,p-1 = x̄p + sxx' Sxx^(-1) (X(p-1) - X̄(p-1))

and

    sp.1,2,...,p-1^2 = sp^2 - sxx' Sxx^(-1) sxx.
Since the first term of (8.3) is a T2 statistic, it too can be separated into two
orthogonal parts. Continuing this partitioning yields

    T2 = T1^2 + T2.1^2 + T3.1,2^2 + ... + Tp.1,2,...,p-1^2.    (8.6)

The T1^2 term in (8.6) is the square of the univariate t statistic for the first variable
of the vector X and is given as

    T1^2 = (x1 - x̄1)^2 / s1^2.    (8.7)
Note this term is not a conditional term, as its value does not depend on a condi-
tional distribution. In contrast, all other terms of the expansion in (8.6) are condi-
tional terms, since they represent the value of a variable adjusted by the mean and
standard deviation from the appropriate conditional distribution. We will repre-
sent these terms with the standard dot notation used in multivariate analysis (e.g.,
see Johnson and Wichern (1999)) to denote conditional distributions. Thus, Ti.j,k^2
corresponds to the conditional T2 associated with the distribution of xi adjusted
for, or conditioned on, the variables xj and xk.
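A general conditional term can be computed directly from the partitioned covariance matrix. The sketch below uses hypothetical data; the function name `conditional_term` and its interface are assumptions, and the final check confirms that one complete MYT decomposition reproduces the full T2:

```python
import numpy as np

# Hypothetical HDS with p = 3 variables.
rng = np.random.default_rng(41)
n, p = 100, 3
A = np.array([[1.0, 0.0, 0.0],
              [0.6, 0.8, 0.0],
              [0.5, 0.3, 0.7]])
hds = rng.normal(size=(n, p)) @ A.T + np.array([10.0, 20.0, 30.0])

xbar = hds.mean(axis=0)
S = np.cov(hds, rowvar=False)

def conditional_term(x, j, cond):
    """T2 term for variable j conditioned on the variables in `cond`,
    built from the partitioned covariance matrix (sketch, assumed names)."""
    x = np.asarray(x)
    cond = list(cond)
    Sxx = S[np.ix_(cond, cond)]          # covariances among conditioning vars
    sxj = S[np.ix_(cond, [j])].ravel()   # covariances with variable j
    coef = np.linalg.solve(Sxx, sxj)     # regression coefficients
    mean_j = xbar[j] + coef @ (x[cond] - xbar[cond])   # conditional mean
    var_j = S[j, j] - coef @ sxj                       # conditional variance
    return (x[j] - mean_j)**2 / var_j

x = np.array([10.5, 21.0, 29.0])
t2_full = (x - xbar) @ np.linalg.inv(S) @ (x - xbar)
# One complete MYT decomposition: T1^2 + T2.1^2 + T3.1,2^2.
t1 = (x[0] - xbar[0])**2 / S[0, 0]
decomp = t1 + conditional_term(x, 1, [0]) + conditional_term(x, 2, [0, 1])
print(decomp, t2_full)                   # the terms sum to the full T2
```

Reordering the variables in the calls produces any of the other possible decompositions of the same T2 value.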
Continuing in this fashion, we can compute the T2 values for all subvectors of the
original vector X. The last subvector, consisting of the first component X(1) = (x1),
is used to compute the unconditional T2 term given in (8.7). All the T2 values,
T2(x1, x2, ..., xp), T2(x1, x2, ..., xp-1), ..., T2(x1), are computed using the
corresponding partitions of the estimated mean vector and covariance matrix
obtained from the HDS.
To illustrate this method for computing the conditional and unconditional terms
of a MYT decomposition, consider an industrial situation characterized by three
process variables. The in-control HDS is represented by 23 observations, and the
estimates of the covariance matrix and mean vector are given by
To obtain T2(X(2)), we partition the original estimates of the mean vector and
covariance structure to obtain the mean vector and covariance matrix of the subvector
X(2)' = (x1, x2). The corresponding partitions are given as
and the smallness of the first two terms, T1^2 and T2.1^2, implies that the signal is
contained in the third term, T3.1,2^2.
Only one possible MYT decomposition was chosen above to illustrate a computing
technique for the decomposition terms. Had we chosen another MYT decomposition,
such as

    T2 = T2^2 + T3.2^2 + T1.2,3^2,

other terms of the decomposition would have had large values. With a signaling
overall T2 value, we are guaranteed that at least one term of any particular
decomposition will be large. We illustrate this important point in later sections.
Table 8.1: Regressions and corresponding conditional T2 terms for p = 3.

    Regression          Conditional T2
    x1 on x2            T1.2^2
    x1 on x3            T1.3^2
    x1 on x2, x3        T1.2,3^2
    x2 on x1            T2.1^2
    x2 on x3            T2.3^2
    x2 on x1, x3        T2.1,3^2
    x3 on x1            T3.1^2
    x3 on x2            T3.2^2
    x3 on x1, x2        T3.1,2^2
The smallness of this latter T2 value suggests that there are no problems with
the observations on variables x1 and x2. Thus, all T2 terms, both conditional and
unconditional, involving only x1 and x2 will have small values. Our calculations
confirm this result. From this type of analysis one can conclude that the signal is
caused by the observed value on x3.
Another important property of the T2 statistic is the fact that the p(2^(p-1) - 1)
unique conditional terms of a MYT decomposition contain the residuals from all
possible linear regressions of each variable on all subsets of the other variables. For
example, for p = 3, a list of the nine (i.e., 3(2^(3-1) - 1)) linear regressions of each
variable on all possible subgroups of the other variables is presented in Table 8.1
along with the corresponding conditional T2 terms. It will be shown in Chapter
9 that this property of the T2 statistic provides a procedure for increasing the
sensitivity of the T2 statistic to process shifts.
8.5 Locating Signaling Variables

One method for locating the variables contributing to the signal is to develop a
forward-iterative scheme. This is accomplished by finding the subset of variables
that do not contribute to the signal.
Recall from (8.3) and (8.5) that a T2 statistic can be constructed on any subset of
the variables x1, x2, ..., xp. Construct the T2 statistic for each individual variable
xj, j = 1, 2, ..., p, so that

    Tj^2 = (xj - x̄j)^2 / sj^2,

where x̄j and sj^2 are the corresponding mean and variance estimates as determined
from the HDS. Compare these individual T2 values to their UCL, where the UCL
is computed for an appropriate α level and for a value of p = 1. Exclude from the
original set of variables all xj for which Tj^2 exceeds the UCL, since observations
on this subset of variables are definitely contributing to the signal.
From the set of variables not contributing to the signal, compute the T2 statistic
for all possible pairs of variables. For example, for all (xi, xj) with i ≠ j, compute
T2(xi, xj) and compare it to
for conditional terms, where k equals the number of conditioned variables. For
k = 0, the distribution in (8.13) reduces to the distribution in (8.12). Using these
distributions, critical values (CVs) for a specified α level and an HDS sample of
size n are obtained for both conditional and unconditional terms as

    CV = [(n + 1)(n - 1)/(n(n - k - 1))] F(α, 1, n - k - 1).
We can compare each individual term of the decomposition to its critical value and
make the appropriate decision.
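The CV computation and term-by-term comparison can be sketched with scipy. The values of alpha, n, and the term entries are illustrative; the multipliers follow the F-distribution results quoted above and should be treated as assumptions to be checked against (8.12)-(8.13):

```python
from scipy import stats

# Illustrative settings; n = 23 mirrors the size of an HDS like the one above.
alpha, n = 0.05, 23

def cv(k):
    """CV for a term conditioned on k variables (k = 0: unconditional)."""
    return ((n + 1) * (n - 1)) / (n * (n - k - 1)) * \
        stats.f.ppf(1 - alpha, 1, n - k - 1)

# Hypothetical decomposition term values, keyed by (k, value).
terms = {"T1^2": (0, 1.2), "T2.1^2": (1, 0.8), "T3.1,2^2": (2, 28.2)}
for name, (k, value) in terms.items():
    flag = "signals" if value > cv(k) else "does not signal"
    print(name, round(value, 4), flag, "(CV =", round(cv(k), 4), ")")
```

Note that for k = 0 the multiplier reduces algebraically to (n + 1)/n, so a single function covers both the unconditional and conditional cases.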
To illustrate the above discussion, recall the T2 value of 79.9441 for the observation
vector X' = (533, 514, 528) taken from the example described in section 8.3.
Table 8.2 contains the 12 unique terms and their values for a total decomposition
of this T2 value. A large component is determined by comparing the value of each
term to the appropriate critical value. The T2 values with asterisks designate those
terms that contribute to the overall T2 signal, e.g., T3^2, T1.3^2, T2.3^2, T3.1^2,
T3.2^2, T1.2,3^2, T2.1,3^2, and T3.1,2^2. All such terms contain the observation
on x3. This was the same variable designated by the exact method for detecting
signaling variables. Thus, one could conclude that a problem must exist in this
variable. However, a strong argument also could be made that the problem is due
to the other two variables, since four of the signaling terms contain x1 and four
terms contain x2. To address this issue, more understanding of what produces a
signal in terms of the decomposition is needed.
The ellipsoid in Figure 8.3 represents the control region for the overall T2 statistic.
Various signaling points are also included for discussion below.
If an observation vector plots outside the box, the signaling univariate Tj^2 values
identify the out-of-control variables, since the observation on the particular variable
is varying beyond what is allowed (as determined by the HDS). This is illustrated in
Figure 8.3 by point A. Thus, when an unconditional term produces a signal, the
implication is that the observation on the particular variable is outside its allowable
range of variation. The point labeled C also lies outside the box region, but the
overall T2 value for this point would not have signaled, since the point is inside the
elliptical control region.
This part of the T2 signal analysis is equivalent to ranking the individual t values
of the components of the observation vector (see Doganaksoy, Faltin, and Tucker
(1991)). While these components are a part of the T2 decomposition, they represent
only the p unconditional terms. Additional insight into the location and cause of a
signal comes from examination of the conditional terms of the decomposition.
Consider the form of a general conditional term given as

    Tj.1,2,...,j-1^2 = (xj - x̄j.1,2,...,j-1)^2 / sj.1,2,...,j-1^2.    (8.17)

For this term to be small, its numerator must be small, as the denominator of these
terms is fixed by the historical data. This implies that component xj from the
observation vector X' = (x1, x2, ..., xj, ..., xp) is contained in the conditional
distribution of xj given x1, x2, ..., xj-1 and falls in the elliptical control region.
A signal occurs on the term in (8.17) when xj is not contained in the conditional
distribution of xj given x1, x2, ..., xj-1, i.e., when the term exceeds its critical
value. This implies that something is wrong with the relationship existing between
and among the variables x1, x2, ..., xj. For example, a signal on Tj.1,2,...,j-1^2
implies that the observation on xj is not where it should be relative to the values
of x1, x2, ..., xj-1. The relationship between xj and the other variables is counter
to the relationship observed in the historical data.
To illustrate a countercorrelation, consider the trace of the control region of
Figure 8.3 in the (x1, x3) space, as presented in Figure 8.4. The signaling point
B of Figure 8.3 is located in the upper right-hand corner of Figure 8.4, inside the
operational ranges of x1 and x3 but outside the T2 control region. Thus, neither
the T1^2 nor the T3^2 term would signal, but both the T1.3^2 and T3.1^2 terms would.
Conditional distributions are established by the correlation structure among
the variables, and the conditional terms of an MYT decomposition depend on the
relationship among the variables contained in each term. All of these variables
would need to be examined to identify a possible cause.
More information pertaining to the HDS given in section 8.3 is needed in order to
expand our previous example to include interpretation of the signaling components
for the observation vector X' = (533, 514, 528). This information is summarized in
Tables 8.3a and 8.3b.
Consider from Table 8.2 the value of T1.3^2 = 28.2305, which is declared large
since it exceeds the CV = 8.2956. The size of T1.3^2 implies that something is wrong
with the relationship between the observed values on variables x1 and x3. Note
from Table 8.3 that, as established by the HDS, the correlation between these two
variables is 0.725. This implies that the two variables vary together in a positive
direction. However, for our observation vector X' = (533, 514, 528), the value
of x1 = 533 is somewhat above its mean value of 525.435, whereas the value of
x3 = 528 is well below its mean value of 539.913. This contradictory result is an
example of the observations on x1 and x3 being countercorrelated. To reestablish
control of the process, either x1 must be lowered, if possible, or the value of x3
must be increased.
To determine which variable to move requires one to be familiar with the process
and the process variables. This includes knowing which variable is easiest to control.
If x1 is controllable and x3 is not, then x1 should be lowered. If x3 is controllable
and x1 is not, then x3 should be increased. If both are controllable, then one might
consider the large size of the unconditional term T3^2 in Table 8.2.
A large value on an unconditional term implies that the observation on that variable
is outside the Shewhart box. This is the case for the observed value of x3 = 528,
as it is considerably less than the minimum value of 532 listed in the HDS. Hence,
to restore control in this situation, one would adjust variable x3 upward.
where x̄j is the sample mean of xj obtained from the historical data. The subvector
X(j-1) is composed of the observations on (x1, x2, ..., xj-1), and X̄(j-1) is the
corresponding estimated mean vector obtained from the historical data. The vector
of estimated regression coefficients Bj is obtained from partitioning the submatrix
Sjj, the covariance matrix of the first j components of the vector X. To obtain
Sjj, partition S as follows:
Then
Since the left-hand side of (8.18) contains x̄j.1,2,...,j-1, the predicted value of xj
from the given values of x1, x2, ..., xj-1, the numerator of (8.17) is a regression
residual represented by

    rj.1,2,...,j-1 = xj - x̄j.1,2,...,j-1

(see, e.g., Rencher (1993)). Substituting rj.1,2,...,j-1 for (xj - x̄j.1,2,...,j-1), we
can re-express Tj.1,2,...,j-1^2 as a squared standardized residual having the form

    Tj.1,2,...,j-1^2 = rj.1,2,...,j-1^2 / [sj^2 (1 - Rj.1,2,...,j-1^2)].    (8.19)
The conditional term in this form measures how well a future observation on a
particular variable agrees with the value predicted by a set of the other
variate values of the vector, using the covariance matrix constructed from the HDS.
Unless the denominator in (8.19) is very small, as occurs when R2 is near 1,
the "largeness" of the conditional T2 term will be due to the numerator, which is a
function of the agreement between the observed and predicted values of xj. Even
when the denominator is small, as occurs with large values of R2, we would expect
very close agreement between the observed and predicted xj values. A significant
deviation between these values will produce a large T2 term.
When the conditional T2 term in (8.19) involves many variables, its size is
directly related to the magnitude of the standardized residual resulting from the
prediction of xj using x1, x2, . . . , xj−1 and the HDS. When the standardized residual
is large, the conditional T2 signals.
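As a concrete sketch, the conditional term can be computed directly as a squared standardized residual from a least-squares fit to the historical data. This illustrates the idea rather than reproducing the book's covariance-partition computation (the two differ only in finite-sample scaling), and the historical data below are simulated.

```python
import numpy as np

def conditional_t2(x_new, X_hist, j):
    """T2_{j.1,...,j-1}: squared standardized residual of x_j predicted
    from x_1, ..., x_{j-1}, with the regression fit on historical data."""
    y = X_hist[:, j - 1]                                  # response x_j
    Z = np.column_stack([np.ones(len(y)), X_hist[:, :j - 1]])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)             # fitted coefficients
    resid = y - Z @ b
    s = np.sqrt(resid @ resid / (len(y) - j))             # residual std. error
    pred = np.concatenate(([1.0], x_new[:j - 1])) @ b     # predicted x_j
    return float(((x_new[j - 1] - pred) / s) ** 2)

rng = np.random.default_rng(0)
x1 = rng.normal(50.0, 5.0, 200)
x2 = 0.8 * x1 + rng.normal(0.0, 1.0, 200)   # historically, x2 tracks x1
X_hist = np.column_stack([x1, x2])

# An observation that contradicts the historical x1-x2 relationship
# yields a large conditional term (a countercorrelated pair), while a
# conforming observation yields a small one.
t2_counter = conditional_t2(np.array([55.0, 35.0]), X_hist, j=2)
t2_conform = conditional_t2(np.array([55.0, 44.0]), X_hist, j=2)
```

The countercorrelated point signals even though both of its coordinates are individually unremarkable, which is exactly the behavior the conditional terms are designed to detect.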
The above results indicate that a T2 signal may occur if something goes astray
with the relationships between subsets of the various variables. This situation
can be determined by examination of the conditional T2 terms. A signaling value
indicates that a contradiction with the historical relationship between the variables
has occurred either (1) due to a standardized component value that is significantly
larger or smaller than that predicted by a subset of the remaining variables, or (2)
due to a standardized component value that is marginally smaller or larger than
that predicted by a subset of the remaining variables when there is a very severe
collinearity (i.e., a large R2 value) among the variables. Thus, a signal results when
an observation on a particular variable, or set of variables, is out of control and/or
when observations on a set of variables are counter to the relationship established
by the historical data.
Variable    x1      x2      x3      x4      x5
T2          7.61*   0.67    0.76    0.58    3.87

* Denotes significance at the 0.05 level, based on
one-sided UCL = 4.28.
variables that contribute to rejection. These new values conform not only to the
HDS, but also to the observations on the variables of the data vector that did not
contribute to the signal.
The HDS on the seven (coded) quality variables is characterized by the sum-
mary statistics and the correlation matrix presented in Tables 8.6a and 8.6b. As
demonstrated later, these statistics play an important role in signal interpretation
for the individual components of the decomposition.
Individual observation vectors are not presented for proprietary reasons;
however, an understanding of the process variation for this product can be gained
by observing a graph of the T2 values for the HDS. This graph is presented in Figure
8.5. The time ordering of the T2 values corresponds to the order of lot production
and acceptance by the customer. Between any two T2 values, other products could
have been produced as well as other rejected lots.
The seven process variables represent certain chemical compositions contained in
the product. Not only do the observations on these variables have to be maintained
in strict operation ranges, but they also must conform to relationships specified by
the correlation matrix of the HDS. This situation is representative of a typical
multivariate system.
Consider an observation vector X' = (89.0, 7.3, 3.2, 0.33, 0.05, 0.86, 1.08) for a
rejected lot. The lot was rejected because the T2 value of the observation vector
was greater than its UCL; i.e.,
A relatively large value of the Type I error rate (i.e., α = 0.05) is used to protect the
customer from receiving an out-of-control lot. The risk analysis used in assessing
the value of the Type I error rate deemed it more acceptable to reject an in-control
lot than to ship an out-of-control lot to the customer.
The T2 value of this signaling observation vector is decomposed using the com-
puting scheme described in section 8.8. Table 8.7 contains the individual T2 values
for the seven unconditional terms. Significance is determined by comparing the
individual unconditional T2 values to a critical value computed from (8.14). Using
n = 85 and α = 0.05, we obtain
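The arithmetic behind such a critical value can be sketched as follows. This assumes (8.14) takes the usual scaled-F form for an unconditional MYT term, ((n+1)/n)·F(α; 1, n−1); the exact constant should be checked against the book's equation.

```python
from scipy.stats import f

def unconditional_cv(n, alpha):
    # Assumed form of (8.14): ((n+1)(n-1) / (n(n-1))) * F_{1-alpha}(1, n-1),
    # which simplifies to ((n+1)/n) * F_{1-alpha}(1, n-1).
    return (n + 1) / n * f.ppf(1.0 - alpha, 1, n - 1)

cv = unconditional_cv(n=85, alpha=0.05)  # roughly 4.0 under this assumption
```

Each of the seven unconditional terms in Table 8.7 would then be compared against this common cutoff.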
Note that x4 is not in the equation, as it was not significant. The predicted value
of x1 using this equation is 87.16. The T2 value using this predicted value of x1
and the observed values of the remaining variables is
This small value indicates that, if the value x1 = 87.16 is attainable in the re-work
process, the lot will be acceptable to the customer. This judgment is made because
the T2 value is insignificant when compared to a critical value of 16.2411.
A second observation vector of a rejected lot is given as X' = (87.2, 7.2, 3.1,
0.36, 0.05, 0.86, 1.08). Again, its T2 statistic is greater than the critical value; i.e.,
8.10 Summary
The interpretation of a signaling observation vector in terms of the variables of a
multivariate process is a challenging problem. Whether it involves many variables
or few, the problem remains the same: how to locate the signaling variable(s). In
this chapter, we have shown that the MYT decomposition of the T2 statistic is a
solution to this problem. This orthogonal decomposition is a powerful tool, as it
allows examination of a signaling T2 from numerous perspectives. For example,
the signaling of unconditional components readily locates the cause in terms of an
individual variable or group of variables, while signaling of conditional terms locates
countercorrelated relationships between variables as the cause.
As we will see in Chapter 9, the monitoring of the regression residuals contained
in the individual conditional T2 terms allows the detection of both large and small
process shifts. By enhancing the models that are inherent to all conditional terms
of the decomposition, the sensitivity of the overall T2 can be increased.
Unfortunately, the T2 statistic with the MYT decomposition is not the solution
to all process problems. For example, although this technique will identify the
variable or set of variables causing a signal, it does not distinguish between mean
shifts and shifts in the variability of these variables. Nevertheless, Hotelling's T2
with the MYT decomposition has been found to be very flexible and versatile in
industrial applications requiring multivariate SPC.
Chapter 9
Improving the Sensitivity of the
T2 Statistic
As you wait for the return call, your anxiety level begins to rise. Unlike this
morning, there is no edge of panic. Only the unanswered question of whether
this is the solution to the problem. As the minutes slowly tick away, the
boss appears with his customary question, "Have we made any progress?" At
that moment the telephone rings. Without answering him, you reach for the
telephone. As the lead operator reports the findings, you slowly turn to the
boss and remark, "Old Blue is back on line and running fine." You can see
the surprise and elation as the boss leaves for his urgent meeting. He yells
over his shoulder, "You need to tell me later how you found the problem so
quickly."
9.1 Introduction
In Chapter 8, a number of properties of the MYT decomposition were explored
for use in the interpretation of a signaling T2 statistic. For example, through
the decomposition, we were able to locate the variable or group of variables that
contributed to the signal. The major goal of this chapter is to further investigate
ways of using the decomposition for improving the sensitivity of the T2 in signal
detection.
Previously, we showed the T2 statistic to be a function of all possible regres-
sions existing among a set of process variables. Furthermore, we showed that the
residuals of the estimated regression models are contained in the conditional terms
of the MYT decomposition. Large residuals produce large T2 components for the
conditional terms and are interpreted as indicators of counterrelationships among
the variables. However, a large residual also could imply an incorrectly specified
model. This result suggests that it may be possible to improve the performance of
the T2 statistic by more carefully describing the functional relationships existing
among the process variables. Minimizing the effects of model misspecification on
the signaling ability of the T2 should improve its performance in detecting abrupt
process shifts (see Mason and Young (1999)).
When compared to other multivariate control procedures, the T2 lacks sensitivity
in detecting small process shifts. In this chapter, we show that this problem
can be overcome by monitoring the error residuals of the regressions contained in
the conditional terms of the MYT decomposition of the T2 statistic. Furthermore,
we show that such monitoring can be helpful in certain types of on-line experimen-
tation within a processing unit.
This is the square of the jth variable of the observation vector adjusted by the
estimates of the mean and variance of the conditional distribution of xj given x1,
x2, . . . , xj−1.
This was achieved by noting that x̂j.1,2,...,j−1 can be obtained from the regression
of xj on x1, x2, . . . , xj−1; i.e.,
where the bj are the estimated regression coefficients. Since x̂j.1,2,...,j−1 is the predicted
value of xj, the numerator of (9.1) is the raw regression residual
given in (9.2).
Another form of the conditional term in (9.1) is obtained by substituting the
following quantity for the conditional variance contained in (9.2); i.e., by substituting
where R2_j.1,2,...,j−1 is the squared multiple correlation between xj and x1, x2, . . . , xj−1.
This yields
using both approaches to improve model specification, and thereby increase the
sensitivity of the T2 statistic.
should lead to smaller residuals in (9.5). This is demonstrated in the following data
examples.
problem may occur in the boiler, the turbine-generator, or the condenser. Knowing
where to look for the source of a problem is very important in large systems such
as these.
The overall process is monitored by a T2 statistic on observations taken on the
following key variables:
(1) F = fuel to the boiler,
(2) S = steam produced to the turbine,
(3) ST = steam temperature,
(4) W or MW = megawatts of electricity produced,
(5) P = absolute pressure (vacuum) associated with the condenser, and
(6) RT = temperature of the river water.
fuel is required to produce a megawatt. The reverse occurs in the summer months,
when the river water temperature is high, as more fuel is required to produce a
megawatt of electricity.
Typical HDSs for a steam turbine are too large to present in this book. However,
we can present graphs of the individual variables over an extended period of time.
For example, Figure 9.2 presents a graph of megawatt production for the same time
period as is given in Figure 9.1. The irregular movement of megawatt production in
this plot indicates the numerous load changes made on the unit in the time period
of operation. This is not uncommon in a large industrial facility. For example,
sometimes it is less expensive to buy electricity from another supplier than it is
to generate it. If this is the situation, one or more of the units, usually the most
expensive to operate, will take the load reduction.
Fuel usage, for the same time period as in Figures 9.1 and 9.2, is presented in
Figure 9.3. Perhaps a more realistic representation is contained in Figure 9.4, where
a fuel usage plot is superimposed on an enlarged section of the graph of megawatt
production.
For a constant load, the fuel supplied to the boiler remains constant. However, to
increase the load on the generator (i.e., increase megawatt production), additional
fuel must be supplied to the boiler. Thus, the megawatt production curve moves
upward as fuel usage increases. We must use more fuel to increase the load than
would be required to sustain a given load. The opposite occurs when the load is
reduced. To decrease the load, the fuel supply is reduced. The generator continues
to produce megawatts, and the load curve follows the fuel graph downwards until a
sustained load is reached. In other words, we recoup the additional cost to increase
a load when we reduce the load.
178 Chapter 9. Improving the Sensitivity of the T2 Statistic
Steam production over the given time period is presented in Figure 9.5. Ex-
amination of this graph and the megawatt graph of Figure 9.2 shows the expected
relationship between the two variables: megawatt production follows steam pro-
duction. Again, the large shifts in the graphs are due to load changes.
Steam temperature over the given time period is presented in Figure 9.6. Note
the consistency of the steam temperature values. This is to be expected, since
steam temperature does not vary with megawatt production or the amount of steam
produced.
F = α0 + α1W,    (9.6)
where the αi are the unknown regression coefficients. For example, the correlation
coefficient for the data plotted in Figure 9.8 is 0.989, indicating a very strong linear
relationship between the two variables.
The theory on steam turbines, however, indicates that a second-order polynomial
relationship exists between F and W. This is described by an I/O curve defined by
F = β0 + β1W + β2W²,    (9.7)
where the βj are the unknown coefficients. Without knowledge of this theory, the
power engineer might have used a control procedure based only on the simple linear
model given in (9.6).
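To make the comparison concrete, here is a sketch of fitting both a linear model of the form in (9.6) and a quadratic I/O model of the form in (9.7) to simulated (W, F) data. The coefficients and noise level are illustrative, not the turbine's actual values.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.uniform(40.0, 160.0, 300)                 # megawatt load
# Hypothetical I/O behavior: fuel grows slightly faster than linearly.
F = 1200.0 + 55.0 * W + 0.08 * W**2 + rng.normal(0.0, 120.0, 300)

lin = np.polyfit(W, F, 1)    # F = a0 + a1*W, the form of (9.6)
quad = np.polyfit(W, F, 2)   # F = b0 + b1*W + b2*W^2, the form of (9.7)

def r2(y, yhat):
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

r2_lin = r2(F, np.polyval(lin, W))
r2_quad = r2(F, np.polyval(quad, W))
# Both R2 values are close to 1, yet the fitted curves separate near the
# middle and the endpoints of the W range, as described in the text.
```

Because the models are nested and fit by least squares, the quadratic R2 can never fall below the linear R2; the practical question is whether the small gap matters for signal detection, which the discussion below answers in the affirmative.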
To demonstrate how correct model specification can be used to increase the
sensitivity of the T2 statistic, suppose we compare these two models. Treating
Similarly, the following equation was obtained using the I/O curve given in (9.7):
A comparison of the graphs of these two functions is presented in Figure 9.9. The
use of the linear model given in (9.6) implies that fuel usage changes at a constant
rate as the load increases; i.e., the incremental fuel requirement per megawatt would
remain the same regardless of the power level. We wish this were the case. Unfortunately,
in operating steam turbines, each additional megawatt of load requires more fuel
than the last. Only the quadratic I/O curve describes this type of relationship.
In Figure 9.9, both functions provide a near-perfect fit to the data. The linear
equation has an R2 value of 0.9775, while the quadratic equation has an R2 value
of 0.9782. Although the difference in these R2 values is extremely small, the two
curves do slightly deviate from one another near the middle and at the endpoints
of the range for W. In particular, the linear equation predicts less fuel usage at
the ends and more in the middle than the quadratic equation. In these areas,
there could exist a set of run conditions acceptable using the linear model in the
T2 statistic but unacceptable using the quadratic I/O model. Since the quadratic
model is theoretically correct, use of it should improve the sensitivity of the T2
statistic to signal detection in the described region.
The corresponding T2 statistic based on only the two variables (F, W) has a value
of 18.89. This is insignificant (for α = 0.001) when compared to a critical value of
19.56 and indicates that there is no problem with the observation.
This is in disagreement with the result using the three variables x1 = F, x2 = W,
and x3 = W2. The resulting T2 value of 25.74 is significantly larger than the
critical value of 22.64 (for α = 0.001). Investigation of this signal using the T2
decomposition indicates that T2_1.3 = 18.54 and T2_1.2,3 = 14.34 are large. The large
conditional T2_1.3 term indicates that there is a problem in the relationship between
F and W2. It would appear that the value F = 9675 is smaller than the predicted
value based on a model using W2 and the HDS. The large conditional T2_1.2,3 term
indicates something is wrong with the fit to the I/O model in (9.8). It appears
again that the fuel value is too low relative to the predicted value.
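The mechanics of the two computations can be sketched as follows, using a simulated HDS in place of the proprietary one; the point is how T2 on (F, W) compares with T2 on (F, W, W2), not the specific values quoted in the text.

```python
import numpy as np

def t2(x, X_hist):
    """Hotelling's T2 of observation x against historical data X_hist."""
    mean = X_hist.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X_hist, rowvar=False))
    d = x - mean
    return float(d @ S_inv @ d)

rng = np.random.default_rng(2)
W = rng.uniform(40.0, 160.0, 300)
F = 1200.0 + 55.0 * W + 0.08 * W**2 + rng.normal(0.0, 120.0, 300)

hds2 = np.column_stack([F, W])          # (F, W)
hds3 = np.column_stack([F, W, W**2])    # (F, W, W^2): deliberate collinearity

w_new = 150.0
f_new = 1200.0 + 55.0 * w_new + 0.08 * w_new**2 - 350.0  # fuel running low
t2_two = t2(np.array([f_new, w_new]), hds2)
t2_three = t2(np.array([f_new, w_new, w_new**2]), hds3)
# By the MYT decomposition, adding a variable can only add a nonnegative
# conditional term, so t2_three >= t2_two; the W^2 column sharpens the
# statistic's view of the fuel-versus-load relationship.
```

With the fuel value deliberately low, the three-variable statistic exceeds the two-variable one by roughly the size of the conditional term, which is the mechanism behind the quadratic model's extra sensitivity.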
The ability of the quadratic model to detect a signal, when the linear model
failed and when both models had excellent fits, is perplexing. When comparing the
two models in Figure 9.9, the curves were almost identical except in the tails. This
result occurred because the correlation between W and W2 was extremely high (i.e.,
R2 = 0.997), indicating that these two variables were basically redundant in the
HDS. If initial screening tools had been used in analyzing this data set, the severe
collinearity would have been detected and the redundant squared megawatt variable
probably would have been deleted. However, because of theoretical knowledge
about the process, we found that the I/O model needed to be quadratic in the
megawatt variable. Thus, the collinearity was an inherent part of the process and
could not be excluded. This information helped improve model specification, which
reduced the regression residuals and ultimately enhanced the sensitivity of the T2
statistic to a small process shift.
As an additional note, the (I/O) models created for a steam turbine control
procedure can play another important role in certain situations. Consider a number
of units operating in parallel, each doing its individual part to achieve a common
goal. Examples of this would be a powerhouse consisting of a number of steam
turbines used in the generation of a fixed load, a number of pumps in service to
meet the demand of a specific flow, and a number of processing units that must
process a fixed amount of feedstock. For a system with more than one unit, the
proper division of the load is an efficiency problem. Improper load division may
appreciably decrease the efficiency of the overall system.
One solution to the problem of improper load division is to use "equal-incremental-
rate formulation." The required output of such a system can be achieved in many
ways. For example, suppose we need a total of 100 megawatts from two steam
turbines. Ideally, we might expect each turbine to produce 50 megawatts, but re-
alistically this might not be the most efficient way to obtain the power. For a fixed
system output, equal-incremental-rate formulation divides the load among the in-
dividual units in the most economic way; i.e., it minimizes the amount of input.
This requires the construction of I/O models that accurately describe all involved
units. It can be shown that the incremental solution to the load division problem
is given at the point where the slopes of the I/O curves are equal.
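A small worked example of the equal-incremental-rate idea, using two hypothetical quadratic I/O curves (the coefficients below are made up for illustration): setting the marginal fuel rates equal gives the most economic split of a 100-megawatt demand.

```python
# Hypothetical I/O curves: F_i(W) = a_i + b_i*W + c_i*W^2 (illustrative).
a1, b1, c1 = 1000.0, 50.0, 0.10
a2, b2, c2 = 1100.0, 60.0, 0.05
TOTAL = 100.0  # required combined output, in megawatts

def fuel(a, b, c, w):
    return a + b * w + c * w * w

# Equal-incremental-rate condition: F1'(W1) = F2'(TOTAL - W1), i.e.
#   b1 + 2*c1*W1 = b2 + 2*c2*(TOTAL - W1)
W1 = (b2 - b1 + 2.0 * c2 * TOTAL) / (2.0 * (c1 + c2))
W2 = TOTAL - W1

best = fuel(a1, b1, c1, W1) + fuel(a2, b2, c2, W2)
even = fuel(a1, b1, c1, 50.0) + fuel(a2, b2, c2, 50.0)
# The equal-incremental split uses less total fuel than a naive
# 50/50 division of the load.
```

With these coefficients the flatter curve carries the larger share of the load, and the slopes of the two I/O curves are exactly equal at the solution, as the text states.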
the vapor to the condensing unit. However, if the turbine allows hot steam to pass
(i.e., running too hot), less vacuum is needed to move the vapor.
The vacuum is a function of the temperature of the coolant and the amount
of coolant available in the condensing unit. A lower temperature for the coolant,
which is river water, increases the tendency of the warm vapor to move to the
condenser. The amount of the coolant available depends on the cleanness of the
unit. For example, if the tubes that the coolant passes through become clogged
or dirty, inhibiting the flow, less coolant is available to help draw the warm steam
vapor through the unit; hence, a change in the vacuum is created.
Without data exploration, one might control the system using the three variables
vacuum (V), coolant temperature (T), and megawatt load (W). The conditional
term T2_V.T,W would contain the regression of vacuum on temperature and
megawatt load. In equation form, this is given as
where the bi are estimated constants. The sensitivity of the T2 statistic for the
condensing unit will be improved if the regression of vacuum on temperature and
megawatt load is improved, as T2_V.T,W is an important term in the decomposition
of this statistic. The theoretical functional form of this relationship is unknown,
but it can be approximated using data exploration techniques.
The HDS for this unit contains hundreds of observations on the three process
variables taken in time sequence over a time period of one year. For discussion
purposes, a partial HDS consisting of 30 points is presented in Table 9.1. However,
our analyses are based on the overall data set.
Results for the fitting of the regression model given in (9.9) for the overall data
set are presented in Table 9.2. The R2 value of 0.9517 in Table 9.2 indicates a good
fit to this data. However, the standard error of prediction has a value of 0.1658,
which is more than 5% of the average vacuum. This is considered somewhat large
for this type of data and, if possible, needs to be reduced.
A graph of the standardized residuals for the model in (9.9) is presented in
Figure 9.10. There is a definite cyclical pattern in the plot. This needs to be
dampened or removed.
Figure 9.11 contains a graph of the megawatt load on the generator over this
same time period. From the graph, it appears that the generator is oscillating over
its operation range throughout the year, with somewhat lower loads occurring at
the end of the year. A quick comparison of this graph to that of the standardized
residuals in Figure 9.10 gives no indication of a connection to the cyclic nature of
the errors.
An inspection of the graph of the vacuum over time, given in Figure 9.12,
indicates a strong relationship between vacuum and time. This can be explained
by noting the seasonal variation of the coolant temperature displayed in Figure
9.13. The similar shapes of the graphs in Figures 9.12 and 9.13 also indicate that
a strong relationship exists between the vacuum and the temperature of the coolant.
This is confirmed by the correlation of 0.90 between these two variables.
The curvature in the above series of plots suggests the need for squared terms
in temperature and load in the vacuum model. We also will add a cross-product
term between T and W, since this will help compensate for the two
variables varying together. In functional form, the model in (9.9) is respecified to
obtain the prediction equation
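A respecified model of this kind, with squared terms in T and W and a T×W cross-product, can be fit by ordinary least squares. The sketch below uses simulated data with illustrative coefficients; the nested comparison shows why the extra terms can only reduce the residual sum of squares.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
T = 10.0 + 15.0 * (1.0 + np.sin(np.linspace(0.0, 2.0 * np.pi, n)))  # seasonal temp
W = rng.uniform(80.0, 160.0, n)                                      # load
# Hypothetical vacuum response with curvature and a T*W interaction:
V = (28.0 - 0.12 * T - 0.02 * W + 0.004 * T**2
     + 0.0006 * T * W + rng.normal(0.0, 0.1, n))

def fit_sse(Z, y):
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ b
    return float(r @ r)

ones = np.ones(n)
sse_linear = fit_sse(np.column_stack([ones, T, W]), V)   # the form of (9.9)
sse_full = fit_sse(np.column_stack([ones, T, W, T**2, W**2, T * W]), V)
# The respecified model cannot fit worse than the linear one; here the
# curvature and interaction terms absorb the systematic error that would
# otherwise show up as a cyclical residual pattern.
```

Smaller residuals from the improved fit translate directly into a tighter, more sensitive conditional term for the vacuum variable.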
The upper and lower control limits for these plots are given by
where MSE is the mean squared error from the regression fit of xj on x1, x2, . . . , xj−1.
With this form it is easy to see that, apart from the constant term in the denomi-
nator, the square root of the conditional T2 term is simply a standardized residual
from the above regression fit. Hence, rather than plot the conditional term, we can
simply plot the standardized residuals
As a note of interest, HDSs for a steam turbine contain a year of data taken
over the normal operational range (megawatt production) of the unit and have
these outliers (peaks) removed. The data in Figure 9.15 are very indicative of the
performance of a unit operating at maximum efficiency.
Monitoring the performance of a steam turbine can be accomplished by examining
incoming observations (F, W, W2) and computing and charting the overall
T2 statistic. This statistic indicates when an abrupt change occurs in the system
and can be used as a diagnostic tool for determining the source of the signal. In
addition, the standardized residuals from the fit of F to W and W2 can be plot-
ted in a Shewhart chart and monitored to determine if small process changes have
occurred in the fuel consumption.
The residual plot in Figure 9.16, with its accompanying control limits at ±3,
represents a time period when a small process change occurred in the operation
of the steam turbine. Residual values plotted above the zero line indicate that
fuel usage exceeded the amount established in the HDS, while those plotted below
the zero line indicate that the opposite is true. Thus, a run of positive residuals
indicates that the unit is less efficient in operation than that established in the
baseline period. In contrast, a run of negative residuals indicates that the unit is
using less fuel than it did in the baseline period.
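The run-of-residuals behavior described here can be screened automatically. A minimal sketch on simulated residuals follows; the 0.8-sigma upward shift is an illustrative upset, deliberately too small to breach the ±3 limits.

```python
import numpy as np

def longest_positive_run(resid):
    """Length of the longest consecutive run of residuals above zero."""
    best = cur = 0
    for r in resid:
        cur = cur + 1 if r > 0 else 0
        best = max(best, cur)
    return best

rng = np.random.default_rng(4)
before = rng.normal(0.0, 1.0, 100)   # baseline: residuals centered at zero
after = rng.normal(0.8, 1.0, 100)    # upset: small upward shift, inside +/-3
resid = np.concatenate([before, after])

# A sustained run above zero flags reduced efficiency long before any
# single residual crosses the +/-3 control limits.
run_after_upset = longest_positive_run(after)
```

A simple run count like this is only a screening aid; as noted below, rigorous run rules for these plots are not yet available, so trends should still be confirmed by inspection.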
The trends in the graph in Figure 9.16 indicate that the unit became less efficient
around the time period labeled "upset." At that point, the residuals moved above
the zero line, implying fuel usage was greater than that predicted by the model
given in (9.7). Notice that, while this pattern of positive residuals is consistent,
the residuals themselves are well within the control limits of the chart. The only
exceptions are the spikes in the plot, which occur with radical load changes.
Another example of using trends from plots of the conditional terms of the T2
to detect small process shifts is given in Figure 9.17. This is a residual plot using
the regression model given in (9.10) for the vacuum on a condensing unit. The
upset condition, indicated with a label on the graph, occurred when a technician
inadvertently adjusted the barometric gauge used in the calculation of the absolute
Figure 9.17: Standardized residual plot of vacuum model with an upset condition.
pressure. After that point, the residual values shift upward, although they remain
within the control limits of the standardized residuals.
Plots of standardized residuals, such as those in Figures 9.15-9.17, provide a
useful tool for detecting small process shifts. However, they should be used with
an overall T2 chart in order to avoid the risk of extrapolating values outside the
operational range of the variable. While we seek to identify systematic patterns in
these plots, a set of rigorous run rules is not yet available. Thus, we recommend
that a residual pattern be observed for an extensive period of time before taking
action.
The proposed technique using the standardized residuals of the fitted models
associated with a specific conditional T2 term is not a control procedure per se.
Rather, it is a tool to monitor process performance in situations where standard
control limits would be too wide. Any consistent change in the process is of interest,
and not just signaling values. While there is some subjectivity involved in the
determination of a trend in these plots, process data often is received continuously,
so that visual inspection of runs of residuals above or below zero is readily available.
9.8 Summary
The T2 statistic can be enhanced by improving its ability to detect (1) abrupt
process changes as well as (2) gradual process shifts. Abrupt changes can be better
identified by correctly modeling in Phase I operations the functional relationships
existing among the variables. One means of doing this is by examining the square
root of the conditional terms in the T2 decomposition or the corresponding related
regression residual plots. These values represent the corresponding standardized
residuals obtained from fitting a regression model. Gradual process shifts can be
"We cannot ignore input from experts in the scientific discipline in-
volved. Statistical procedures are vehicles that lead us to conclusions;
but scientific logic paves the road along the way.... [F]or these reasons,
a proper marriage must exist between the experienced statistician and
the learned expert in the discipline involved."
Chapter 10
Autocorrelation in T2 Control Charts
10.1 Introduction
Development and use of the T2 as a control statistic for a multivariate process has
required the assumption of independent observations. Certain types of processing
units may not meet this assumption. For example, many units produce time-
dependent or autocorrelated observations. This may be due to factors such as
equipment degradation, depletion of critical process components, environmental
and industrial contamination, or the effect of an unmeasured "lurking" variable.
The use of the T2 as a control statistic, without proper adjustment for a time
dependency, can lead to incorrect signals (e.g., see Alt, Deutsch, and Walker (1977)
or Montgomery and Mastrangelo (1991)).
In Chapter 4 we discussed detection procedures for autocorrelated data. These
included examination of trends in time-sequence plots of individual variables and
the determination of the pairwise correlation between process variables and a cate-
gorical time-sequence variable. In this chapter, we add a third procedure. We show
that special patterns occurring in the graph of a T2 chart can be used to indicate
the presence of autocorrelation in the process data. We also demonstrate that if
autocorrelation is detected and ignored, one runs the risk of weakening the overall
T2 control procedure. This happens because the main effect of the autocorrelated
variable is confounded with the time dependency. Furthermore, relationships with
other variables may be masked by the time dependency.
When autocorrelation is present, an adjustment procedure is needed in order
to obtain a true picture of process performance. In a univariate setting, one such
adjustment involves modeling the time dependency with an appropriate autore-
gressive model and examining the resulting regression residuals. The residuals are
free of the time dependency and, under proper assumptions, can be shown to be
independent and normally distributed. The resulting control procedure is based on
these autoregressive residuals (e.g., see Montgomery (2001)).
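The univariate adjustment just described can be sketched as follows: fit an AR(1) model to the series and chart the residuals, which are free of the lag-1 dependency. The data are simulated, and the AR coefficient 0.8 is illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n, phi, mu = 500, 0.8, 10.0
x = np.empty(n)
x[0] = mu
for t in range(1, n):                    # simulate an AR(1) process
    x[t] = mu + phi * (x[t - 1] - mu) + rng.normal(0.0, 1.0)

z = x - x.mean()
phi_hat = (z[1:] @ z[:-1]) / (z[:-1] @ z[:-1])   # least-squares AR(1) fit
resid = z[1:] - phi_hat * z[:-1]                 # autoregressive residuals

# The residuals are approximately independent, so standard control
# limits apply to them even though they do not apply to x itself.
lag1 = (resid[1:] @ resid[:-1]) / (resid @ resid)
```

The lag-1 autocorrelation of the residuals is near zero, while that of the raw series is near 0.8; charting the residuals therefore restores the independence assumption that the control procedure requires.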
corresponding mean line, are presented in Figures 10.4 and 10.5. Note the upward
trend in the plot of the points in both graphs.
To investigate the effects of these two autocorrelated variables on the behavior
of the T2 statistic, we examine in Figure 10.6 a graph of the corresponding T2
statistic. Observe the very slight, U-shaped curvature in the graph of the statistic
over the operational range of the two variables. Note also the large variation in
the T2 values and the absence of numerous values close to zero. This is in direct
contrast to the trends seen in Figure 10.3 for the variables with no time dependency.
Since the T2 statistic should exhibit only random fluctuation in its graph, further
examination is required in order to determine the reason for this systematic pattern.
The plots in Figures 10.4 and 10.5 of the correlated data indicate the presence of
large deviations from the respective mean values of both variables at the beginning
and end of their sampling periods. Since the T2 is a squared statistic, such a
trend produces large T2 values. For example, while deviations below the mean are
negative, squaring them produces large positive values. As the variables approach
their mean values (in time), the value of the T2 declines to smaller values. The
curved U-shaped pattern in Figure 10.6 is thus due to the linear time dependency
inherent in the observations. This provides a third method for detecting process
data with a time dependency.
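The U-shaped pattern can be reproduced with a small simulation: give two variables a common linear time trend, compute the T2 series against the overall mean and covariance, and the statistic comes out large at both ends of the record and small in the middle. The trend slopes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
t_idx = np.arange(n)
# Two hypothetical process variables sharing a linear time trend:
x1 = 0.05 * t_idx + rng.normal(0.0, 1.0, n)
x2 = 0.04 * t_idx + rng.normal(0.0, 1.0, n)
X = np.column_stack([x1, x2])

mean = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
D = X - mean
t2 = np.einsum('ij,jk,ik->i', D, S_inv, D)   # T2 value at each time point

ends = 0.5 * (t2[:20].mean() + t2[-20:].mean())
middle = t2[90:110].mean()
# ends >> middle: the U-shaped signature of a linear time dependency.
```

Because the trend inflates the sample covariance, the middle T2 values are also deflated, illustrating the excess, nonrandom variation that the text attributes to unadjusted autocorrelation.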
Autocorrelation of the form described in the previous example reflects a cause-and-effect
relationship between the process variables and time: the observation on the process
variable is proportional to the value of the variable at some prior time. In other cases,
the time relationship may be only empirical and not due to a cause-and-effect relationship.
In this situation, the current observed value is not determined by a prior value, but
only associated with it. The association is usually due to a "lurking" variable.
Figure 10.7: Time-sequence plot of process variable with cyclical time effect.
Consider the cyclic nature of the process variable depicted in Figure 10.7. The
cyclical or seasonal variation is due to the rise and fall of the ambient temperature
for the corresponding time period. This is illustrated in Figure 10.8. Cyclical or
seasonal variation over time is assumed to be based on systematic causes; i.e., the
variation does not occur at random, but reflects the influence of "lurking" variables.
Variables with a seasonal effect will have a very regular cycle, whereas variables with
a cyclical trend may have a somewhat irregular cycle. Such trends will be reflected
in the T2 chart, and the curved U-shaped pattern seen previously in other T2 charts
may have short cycles.
A T2 chart including the cyclical process variable in Figure 10.7 with no adjust-
ment for the seasonal trend is presented in Figure 10.9. Close examination of the
run chart reveals a cyclic pattern due to the seasonal variation of the ambient tem-
perature. As the temperature approaches its maximum and minimum values, the
T2 statistic moves upward. When the temperature approaches its average value,
the T2 moves toward zero. Also notice the excess variation due to the temperature
swings in the T2 values.
The above examples illustrate a number of the problems occurring with autocor-
related data and the T2 statistic. Autocorrelation produces some type of systematic
pattern over time in the observations on the variables. If not corrected, the pat-
terns are transformed to nonrandom patterns in the T2 charts. As illustrated in
the following sections, these patterns can greatly affect signals. The presence of
autocorrelation also increases variation in the T2 statistic. This increased variation
can smother the detection of process movement and hamper the sensitivity of the
T2 statistic to small but consistent process shifts. As in other statistical procedures,
nonrandom variation of this form is explainable, but it can and should be removed.
10.3. Control Procedure for Uniform Decay
To accurately assess the T2 statistic, this time effect must be separated and removed
from the random error.
As an example, reconsider the data for x1 given in Figure 10.4. The time
dependency can be explained by a first-order autoregressive model,

x_{1,t} = β0 + β1 x_{1,t−1} + ε_t,    (10.2)

where β0 and β1 are the unknown regression coefficients, x_{1,t} is the current
observation, and x_{1,t−1} is the immediate prior observation. The mean of x1,
conditioned on time, is then given by

E(x_{1,t} | x_{1,t−1}) = β0 + β1 x_{1,t−1}.
The above relationship suggests a method for computing the T2 statistic for an
observation vector with some observations exhibiting a time dependency. This is
achieved using the formula

T_t^2 = (X − X̄_t)' S_t^{-1} (X − X̄_t),    (10.3)

where X̄_t is the time-adjusted mean vector. For a variable with no time
dependency, the corresponding component of X̄_t is equal to the unadjusted mean,
x̄_j. However, for those variables with a first-order time dependency, x̄_{j|t}
would be obtained using a regression equation based on the model
in (10.2) or some similar autoregressive function. Thus, for an AR(1) process,

x̄_{j|t} = b0 + b1 x_{j,t−1}    (10.4)

when a time dependency is present, where b0 and b1 are the estimated regression
coefficients.
The common estimator of S for a sample of size n is usually given as

S = [1/(n−1)] Σ (X_i − X̄)(X_i − X̄)',

where X̄ is the overall sample mean. However, if some of the components of the
observation vector X have a time dependency, S also must be corrected. This is
achieved by taking deviations from X̄_t; i.e.,

S_t = [1/(n−1)] Σ (X_i − X̄_t)(X_i − X̄_t)'.

The variance terms of S_t will be denoted as s²_{j|t} to indicate a time adjustment
has been made, while the general conditional terms will be designated as
s_{j·1,2,...,p−1|t}.
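The time-adjusted covariance matrix can be sketched the same way: deviations are taken from the fitted AR(1) means rather than from the sample mean. Simulated data and the helper name are ours, not the book's:

```python
import numpy as np

def ar1_adjusted_mean(x):
    """Fit x_t = b0 + b1*x_{t-1} by least squares and return the
    fitted (time-adjusted) means for t = 1, ..., n-1."""
    A = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    b0, b1 = np.linalg.lstsq(A, x[1:], rcond=None)[0]
    return b0 + b1 * x[:-1]

rng = np.random.default_rng(3)
n = 300
x1 = np.empty(n)
x2 = np.empty(n)
x1[0] = 0.0
x2[0] = 0.0
for i in range(1, n):
    x1[i] = 0.9 * x1[i - 1] + rng.normal(0, 1.0)
    x2[i] = 0.7 * x2[i - 1] + rng.normal(0, 1.0)

# Deviations from the time-adjusted means X_t rather than X-bar:
D = np.column_stack([x1[1:] - ar1_adjusted_mean(x1),
                     x2[1:] - ar1_adjusted_mean(x2)])
S_t = D.T @ D / (len(D) - 1)
```

The time-adjusted variances in `S_t` come out smaller than the raw sample variances, since the autoregressive drift no longer inflates them.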
Decomposition of T2t
Suppose the components of an observation vector with a time dependency have been
determined using the methods of section 10.1, and the appropriate autoregressive
functions fitted. We assume that Xt and St have been computed from the HDS.
To calculate the T2 value for a new incoming observation, we compute (10.3).
The general form of the MYT decomposition of the T2 value associated with a
signaling p-dimensional data vector X' = (x1, ..., xp) is given in Chapter 8. The
decomposition of T2 follows a similar procedure but uses time adjustments similar
to (10.4).
If a signal is observed, we decompose the T2 statistic, adjusted for time effects,
as follows:

T_t^2 = T_{1|t}^2 + T_{2·1|t}^2 + ··· + T_{p·1,2,...,p−1|t}^2.

Close examination reveals how the time effect is removed. Consider the conditional
term T_{2·1|t}^2. Observations on both x1 and x2 are corrected for the time
dependency by subtracting the appropriate x̄_t term. The standard deviation is
time corrected in a similar manner.
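For p = 2 the MYT decomposition can be verified numerically. The sketch below checks the identity T2 = T1^2 + T_{2·1}^2 on simulated data; for brevity it uses the unadjusted mean and covariance, but X̄_t and S_t would replace them for time-adjusted observations:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(10, 2, n)
x2 = 3 + 0.5 * x1 + rng.normal(0, 1, n)  # correlated second variable
X = np.column_stack([x1, x2])

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

x = X[0]                  # decompose the first observation
d = x - xbar
t2_full = d @ np.linalg.inv(S) @ d

# Unconditional term for x1:
t2_1 = d[0] ** 2 / S[0, 0]

# Conditional term for x2 given x1 (regression-adjusted mean and variance):
x2_given_1 = xbar[1] + (S[0, 1] / S[0, 0]) * d[0]
s2_given_1 = S[1, 1] - S[0, 1] ** 2 / S[0, 0]
t2_21 = (x[1] - x2_given_1) ** 2 / s2_given_1

assert np.isclose(t2_full, t2_1 + t2_21)  # the MYT identity for p = 2
```

The identity holds exactly for any data set; it is an algebraic property of the partitioned covariance matrix, not a statistical approximation.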
cycle and thus would contain more variation than a steady-state variable, which
would remain relatively constant. If we fail to consider the decay in the process,
any efficiency value between 85% and 98% would be acceptable, even 85% at the
beginning of a cycle.
As discussed in Chapter 8, a deviation beyond its operational range (established
using in-control historical data) for a process variable can be detected using the
corresponding unconditional T2 term of the MYT decomposition. In addition,
incorrect movement of the variable within its range because of improper linear
relationships with other process variables can be detected using the conditional T2
terms. However, this approach does not account for the effects of movement due
to time dependencies.
df SS MS F Significance of F
Regression 1 2077.81 2077.81 99.81 < 0.0001
Residual 77 1602.93 20.82
Total 78 3680.74
This nonrandom trend coupled with the moderate value of 0.565 for the R2 statistic
supports our belief.
Process variable x\ shows a strong upward linear time trend in its time-sequence
plot given in Figure 10.12. This is confirmed by its high correlation (0.880) with
10.4. Example of a Uniform Decay Process 205
df SS MS F Significance of F
Regression 1 11.69 11.69 518.64 < 0.0001
Residual 77 1.73 0.02
Total 78 13.42
time given in Table 10.1. The analysis-of-variance table from the regression analysis
for an AR(1) model for this variable is presented in Table 10.4. The fit is highly
significant (p < 0.0001) and indicates that there is a linear relationship between
x\ and its immediate past value. Summary statistics for this fit are presented in
Table 10.5. The large R2 value, 0.871, in addition to the small residuals, given in
the residual plot in Figure 10.15, indicates a good fit over most of the data. The
increase in variation at the end of the plot is due to decreasing unit efficiency as
the unit life increases.
The third variable to show a time dependency is the average reactor temperature
(Temp). As noted in Figure 10.10, reactor temperature has a nonlinear (i.e., curved)
relationship with time. Thus, an AR(2) model of the form
where the (3j are the unknown regression coefficients, might result in decreasing the
error seen at the end of the cycle in Figure 10.10. However, for simplicity, we will
use the AR(1) model.
Although the pairwise correlation between this variable and the time-sequence
variable is only 0.691 in Table 10.1, this is mainly a result of the flatness of the plot
at the earlier time points. The analysis-of-variance table for the AR(1) model fit to
the average temperature is presented in Table 10.6. The fit is significant (p < 0.0001)
and indicates that there is a linear relationship between average temperature and
its immediate past value. Summary statistics for the AR(1) fit are presented in
Table 10.7. The R2 value of 0.565 is moderate, and the larger residuals in the
residual plot in Figure 10.16 confirm this result.
For the three variables exhibiting some form of autocorrelation, the simplest au-
toregressive function was fit. This is to simplify the discussion of the next section.
The fitted AR(1) models depend only on the first-order lag of the data. A substan-
tial amount of lack of fit was noted in the discussion of the residual plots. These
models could possibly be improved by the addition of different lag terms. The use
of a correlogram, which displays the lag correlations as a function of the lag value
(see section 4.8), can be a useful tool in making this decision. The correlogram for
the three variables x1, x2, and Temp is presented in tabular form for the first three
lags in Table 10.8. In the case of variable x1, the correlogram suggests using all
three lags, as the lag correlations remain near 1 for all three time points.
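Lag correlations of this kind can be computed directly from the sample. A sketch on a simulated AR(1) series (the coefficient 0.9 is an invented value, not one of the reactor variables):

```python
import numpy as np

def lag_correlations(x, max_lag=3):
    """Sample correlation between x_t and x_{t-k} for k = 1..max_lag,
    i.e., the first few points of a correlogram."""
    return [np.corrcoef(x[k:], x[:-k])[0, 1] for k in range(1, max_lag + 1)]

rng = np.random.default_rng(11)
n = 500
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = 0.9 * x[i - 1] + rng.normal(0, 1.0)

r1, r2, r3 = lag_correlations(x)
# For an AR(1) process the lag-k correlation decays roughly as 0.9**k,
# so the correlogram tails off geometrically rather than staying near 1.
```

A slow geometric decay like this points to a single-lag model, whereas correlations that remain near 1 across several lags (as for x1 in Table 10.8) suggest including additional lag terms.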
df SS MS F Significance of F
Regression 1 4861.44 4861.44 108.63 < 0.0001
Residual 77 3445.93 44.75
Total 78 8307.37
10.4.3 Estimates
Using the results of section 10.3, we can construct time-adjusted estimates of the
mean vector and covariance matrix for our reactor data. For notational purposes,
the four variables are denoted by x1 (for process variable x1), x2 (for process variable
x2), x3 (for Temp), and x4 (for Feed). The estimate of the time-adjusted mean
vector is given as

X̄_t' = (x̄_{1|t}, x̄_{2|t}, x̄_{3|t}, x̄_4),

where the first three components are obtained from the fitted autoregressive
equations.
Since Feed has no time dependency, no time adjustments are needed, and the
average of the Feed data is used.
Removing the time dependency from the original data produces some interesting
results. For example, consider the correlation matrix of the 79 observations with
the time dependency removed. This is calculated by computing the time-adjusted
covariance estimate S_t, taking deviations from X̄_t, and converting the covariance
estimate to a correlation matrix. The resulting estimated correlation matrix is
presented in Table 10.9. In contrast to the unadjusted
correlation matrix given in Table 10.1, there now is only a weak correlation between
the time-sequence variable and each of the four process variables. Other correlations
not directly involving the time-sequence variable were also affected. For example,
the original correlation between temperature and x1 was 0.737. Corrected for time,
the value is now −0.026. Thus, these two variables were correlated only due to
the time effect. Also, observe the correlation between x1 and x2. Originally, a
correlation of 0.795 was observed in Table 10.1, but correcting for time decreases
this value to 0.586.
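The effect is easy to demonstrate: two independent noise series that share a linear trend are strongly correlated, but the correlation of their AR(1)-adjusted deviations is near zero. The slopes and noise levels below are invented, and the helper name is ours:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 79
t = np.arange(n, dtype=float)

# Two otherwise unrelated variables that share a time trend
# (a stand-in for the temperature/x1 situation described above).
u = 0.3 * t + rng.normal(0, 1.5, n)
v = 0.2 * t + rng.normal(0, 1.5, n)

raw_corr = np.corrcoef(u, v)[0, 1]  # large, driven by the shared trend

def ar1_residuals(x):
    """Deviations of x_t from its fitted first-order lag model."""
    A = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    coef = np.linalg.lstsq(A, x[1:], rcond=None)[0]
    return x[1:] - A @ coef

# Correlation after removing the lag-1 time dependency from each variable:
adj_corr = np.corrcoef(ar1_residuals(u), ar1_residuals(v))[0, 1]
```

Here `raw_corr` is close to 1 while `adj_corr` is near zero, mirroring the drop from 0.737 to −0.026 seen for temperature and x1.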
The T2 values of the preliminary data are plotted in Figure 10.17. These values
are computed without any time adjustment. Close inspection of this graph reveals
the U-shaped trend in the data that is common to autocorrelated processes. The
upward trend prevails more at the end of the cycle than at its beginning. This is
mainly due to the instability of the reactor as it nears the end of its life cycle.
Consider the T2 graph for the time-adjusted data. This is presented in Figure
10.18. In comparison to Figure 10.17, there is no curvature in the plotted points.
However, the expanded variation at the end of the life cycle is still present, as it
is not due to the time variable. Note also that this plot identifies eight outliers
in the data set as compared to only five outliers being detected in the plot of the
uncorrected data.
where the symbol (**) denotes that the unconditional T2 term for temperature
10.5. Control Procedure for Stage Decay Processes 211
produces a signal as it exceeds the critical value of 7.559. The usual interpretation
for a signal on an unconditional term is that the observation on the variable is
outside the operational range. However, for time-adjusted variables, the implication
is different. In this example, the observed temperature value, 526, is not where it
should be relative to its lag value of 506. For observation 10, the increase in
temperature from the value observed for observation 9 was much more than that
predicted using the historical data.
Removing the Temp variable from observation 10 and examining its subvector
(Feed, x_{1,t}, x_{2,t}) produced a T2 value of 15.910. When compared to a critical value
of 13.38, a signal was still present. Further decomposition of the T2 value on this
subvector produced the following two-way conditional T2 terms:
where the symbol (**) denotes the term that exceeds the critical value of 7.675.
There are two signaling conditional terms, and these imply that the relationship
between the operational variables x2 and x4 (Feed), after adjustment for time, does
not agree with the historical situation.
These results indicate the need to remove the process variables x2 and x4. With
their removal, the only variable left to be examined is x\. However, the small value
of the unconditional T2 term, T2 = 1.3124, indicates that no signal is present in
observation 10 on this variable.
10.6 Summary
The charting of autocorrelated multivariate data in a control procedure presents a
number of serious challenges. A user must not only examine the linear relationships
existing between the process variables to determine if any are unusual, but also
adjust the control procedure for the effects of the time dependencies existing among
these variables. This chapter presents one possible solution to problems associated
with constructing multivariate control procedures for processes experiencing either
uniform decay or stage decay.
Autocorrelated observations are common in many industrial processes. This is
due to the inherent nature of the processes, especially any type of decay process.
Because of the potentially serious effects of autocorrelation on control charts, it
is important to be able to detect its presence. We have offered two methods of
detection. The first involves examining the correlations between each variable and
a constructed time-sequence variable. A large correlation implies some type of time
dependency. Graphical techniques are a second aid in detecting time dependencies.
Trends in the plot of an individual variable versus time will give insight to the type
of autocorrelation that is present. Correlogram plots for individual variables also
can be helpful in locating the lag associated with the autocorrelation.
For uniform decay data that can be fit to an autoregressive model, the cur-
rent value of an autocorrelated variable is corrected for its time dependency. The
proposed control procedure is based on using the T2 value of the time-adjusted
observation and decomposing it into components that lead to an interpretation of
the time-adjusted signal. The resulting decomposition terms can be used to moni-
tor relationships with the other variables and to determine if they are in agreement
with those found in the HDS. This property is also helpful in examining stage-
decay processes as the decay occurs sequentially and thus lends itself to analysis by
repeated decompositions of the T2 statistic obtained at each stage.
Chapter 11
The T2 Statistic and Batch Processes
11.1 Introduction
Our development of a multivariate control procedure has been limited to applica-
tions to continuous processes. These are processes with continuous input, continu-
ous processing, and continuous output. We conclude the text with a description of
a T2 control procedure for batch processes. These are processes that use batches
as input (e.g., see Fuchs and Kenett (1998)).
There are several similarities between the T2 procedures for batch processes
and for continuous processes. Phase I still consists of constructing an HDS, and
Phase II continues to be reserved for monitoring new (future) observations. Also,
control procedures for batch processes can be constructed for the overall process,
or for individual components of the processing unit. In some settings, multiple
observations on the controlled component may be treated as a subgroup with control
based on the sample mean. In other situations, a single observation may be used,
such as monitoring the quality of the batch or grade being produced.
Despite these similarities, differences do exist when monitoring batch processes.
For example, the estimators of the covariance matrix and the overall mean vector
may vary. Changes also may occur in the form of the T2 statistic and the probability
function used to describe its behavior. A detailed discussion of the application of
the T2 statistic to batch processes can be found in Mason, Chou, and Young (2001).
where X̄_i represents the mean of the ith batch. The total sample size N is obtained
as the sum of the batch sizes, i.e., N = n1 + n2 + ··· + nk. The estimate of the
covariance matrix is computed as
The quantity SS_T in (11.2) is referred to as the total sum of squares of variation.
It can be separated into two components. One part, referred to as the
within-variation, represents the variation within the batches. The other part,
labeled the between-variation, is the variation between the batches. We write this
as

SS_T = SS_W + SS_B.    (11.3)
The component SS_B represents the between-batch variation and, when signifi-
cant, can distort the common estimator S. However, for Category 1 processes, it is
assumed that the between-batch, as well as the within-batch, variation is minimal
and due to random variation. Therefore, for a Category 1 situation, we estimate the
overall mean using (11.1) and the covariance matrix using (11.2). We emphasize
that these are the appropriate estimates only if we adhere to the basic assumption
that a single multivariate distribution can describe the process.
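A sketch of the Category 1 estimates. The batch sizes 99, 100, and 100 echo the example later in this chapter, but the mean and covariance values are invented, and the divisor N − 1 in the common estimator is our assumption about the form of (11.2):

```python
import numpy as np

rng = np.random.default_rng(4)
# Three batches drawn from ONE bivariate distribution
# (Category 1: no between-batch differences).
mean = np.array([100.0, 200.0])
cov = np.array([[4.0, 3.0],
                [3.0, 9.0]])
batches = [rng.multivariate_normal(mean, cov, n) for n in (99, 100, 100)]

# (11.1)-style overall mean: the mean of all N = n1 + n2 + n3
# pooled observations.
X_all = np.vstack(batches)
N = len(X_all)
xbar = X_all.mean(axis=0)

# (11.2)-style common estimator: deviations from the overall mean.
D = X_all - xbar
S = D.T @ D / (N - 1)
```

Because all batches share one distribution, `S` recovers the true covariance; the next section shows how this breaks down when the batch means differ.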
For a Category 2 process, we have multiple distributions describing the process.
Multiplicity comes from the possibility that the mean vectors of the various batches
may differ, i.e., that μ_i ≠ μ_j for some i and j. For this case, the overall mean is still
estimated using (11.1). However, the covariance matrix estimator in (11.2) is no
longer applicable due to the effects of the between-batch variation.
As an illustration of the effects of between-batch variation, consider the plot
given in Figure 11.7 for two variables, x1 and x2, and two batches of data. The
orientation of the two sets of data implies that x1 and x2 have the same correlation
in each batch, but the batch separation implies that the batches have different
means. If the batch classification is ignored, the overall sample covariance matrix,
S, will be based on deviations taken from the overall mean, indicated by the center
of the ellipse, and will contain any between-group variation.
For a Category 2 process, the covariance matrix is estimated as

S_W = SS_W/(N − k) = [(n1 − 1)S1 + (n2 − 1)S2 + ··· + (nk − 1)Sk]/(N − k),    (11.4)

where S_i is the covariance matrix estimate for the ith batch and SS_W represents
the within-batch variation as defined in (11.3). Close inspection of (11.4) reveals
the estimator Sw to be a weighted average (weighted on the degrees of freedom)
of the within-batch covariance matrix estimators. With mean differences between
the batches, the common estimator obtained by considering the observations from
all batches as one group would be contaminated with the between-batch variation,
represented by SS_B. Using only the within-batch variation to construct the
estimator of the common covariance matrix produces a true estimate of the
relationships among the process variables. We demonstrate this in a later example
(see section 11.7).
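The contamination of S by between-batch variation can be illustrated numerically. The batch means below are the ones quoted in the section 11.7 example; the common covariance matrix and batch sizes of 100 are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(8)
cov = np.array([[2.0, 1.2],
                [1.2, 2.0]])
# Category 2: same covariance, different batch means.
means = [np.array([133.36, 200.78]),
         np.array([149.08, 202.59]),
         np.array([147.6, 190.60])]
batches = [rng.multivariate_normal(m, cov, 100) for m in means]

# S_W as in (11.4): within-batch sums of squares pooled over batches,
# i.e., a degrees-of-freedom-weighted average of the batch estimates.
k = len(batches)
N = sum(len(b) for b in batches)
SS_W = sum((b - b.mean(axis=0)).T @ (b - b.mean(axis=0)) for b in batches)
S_W = SS_W / (N - k)

# Ignoring batch membership contaminates S with between-batch variation:
X_all = np.vstack(batches)
S = np.cov(X_all, rowvar=False)
```

Here `S_W` stays close to the true covariance, while the diagonal of `S` is badly inflated by the separation of the batch means.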
11.4. Outlier Removal for Category 1 Batch Processes 219
where X̄ and S are the common estimators obtained from (11.1) and (11.2), respec-
tively, and B(p/2, (N−p−1)/2) represents the beta distribution with parameters (p/2)
and ((N − p − 1)/2), where N is the total sample size (all batches combined). The
UCL, used for outlier detection in a Phase I operation, is given as

UCL = [(N − 1)²/N] B[α; p/2, (N−p−1)/2],

where B[α; p/2, (N−p−1)/2] is the upper αth quantile of B[p/2, (N−p−1)/2]. For this
category, the distribution of the T2 statistic and the purging procedure for outlier
removal are the same as those used for a continuous process.
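For p = 2 the beta quantile has a closed form, so the UCL can be computed without statistical tables. The sketch assumes the Phase I UCL is the ((N−1)²/N)-scaled upper-α beta quantile, which is the standard result for the T2 of a single observation:

```python
def phase1_ucl_p2(alpha, N):
    """Phase I UCL for T2 with p = 2 variables, assuming
    UCL = ((N-1)^2/N) * B[alpha; 1, (N-3)/2].
    For p = 2 the beta distribution is B(1, q) with q = (N-3)/2,
    whose CDF is 1 - (1-x)**q, so the upper-alpha quantile is
    1 - alpha**(1/q) in closed form."""
    q = (N - 3) / 2
    return (N - 1) ** 2 / N * (1 - alpha ** (1 / q))

ucl = phase1_ucl_p2(0.05, 299)  # e.g., N = 299 pooled observations
```

Smaller α gives a larger (more conservative) limit, and for large N the limit approaches the corresponding chi-squared quantile.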
We emphasize that the statistic in (11.5) can only be used when there is no
between-batch variation. All observations from individual batches must come from
the same multivariate distribution. This assumption is so critical that it is strongly
recommended that a test of hypothesis be performed to determine if the batch
means are equal. This creates the dilemma of whether to remove outliers first or
to test the equality of the batch means, since mean differences could be due to
individual batches containing atypical observations.
As an example, reconsider the two-variable control region and data set illus-
trated in Figure 11.7. If one treats the two batches of data as one overall group,
the asterisk in the middle of the ellipse in the graph represents the location of the
overall mean vector. Since deviations used in computing the common covariance
matrix S are taken from this overall mean, the variances of x1 and x2 would be
considerably larger when using S instead of S_W. Also, the estimated correlation
between the two variables would be distorted, as the orientation of the ellipse is not
the same as the orientation of the two separate batches. Finally, note that the two
potential outliers in the ellipse would not be detected using the common estimator
S as these points are closer to the overall mean than any of the other points of the
two separate batches.
The solution to the problem of outliers in batches is provided in Mason, Chou,
and Young (2001). These authors recommend the following procedure for this
situation.
Step 1. Center all the individual batch data by subtracting the particular batch
mean from the batch observation; i.e., compute

X*_{ij} = X_{ij} − X̄_i,  j = 1, ..., n_i,

and remove outliers following the established procedures (see Chapter 5).
Step 4. After outlier removal, S_W and X̄ must be recalculated using only
the retained observations. To test the hypothesis of equal batch means, apply the
outlier removal procedure to the batch mean vectors. The T2 statistic for this
procedure is given as

T_i^2 = (X̄_i − X̄)' S_W^{-1} (X̄_i − X̄),    (11.9)
where S_W is the sample covariance matrix computed using (11.7) and the translated
data with the individual outliers removed. The distribution of the statistic in (11.9),
under the assumption of a true null hypothesis, is that of an F variable (e.g., see
Wierda (1994)), and is given by
11.5. Example: Category 1 Batch Process 221
where F(p, nk−k−p+1) represents the F distribution with parameters (p) and (nk −
k − p + 1). For a given α level, the UCL for the T2 statistic is computed as
Scatter plots of the data for the three separate batches are presented in Figures
11.8, 11.9, and 11.10, respectively. Observe the same general shape of the data
swarm for each of the different batches. Note also the potential outliers in each
batch: for example, the observation located at the extreme left end of the data
swarm of Figure 11.8 and the cluster of points located in the extreme right-hand
corner of Figure 11.9.
Summary statistics for the three separate batches are presented in Table 11.1.
Observe the similarities among the statistics for the three batches, especially for the
pairwise correlations between x1 and x2. This is also true for the separate standard
deviations for each variable.
Centering of the data for the three separate batches is achieved by subtracting
the respective mean. For example, the translated vector based on centering the
observations of Batch 1 is obtained by subtracting the Batch 1 mean vector from
each of its observation vectors.
The summary statistics for the combined translated data are given in the last column
of Table 11.1, and a graph of the translated data is presented in Figure 11.11.
Observe the similarities between the standard deviations of the individual variables
in the three separate batches with the standard deviation of the variables in the
overall translated batch. Likewise, the same is true for the pairwise correlation
between the variables in the separate and combined batches.
Only one observation appears as a potential outlier in the scatter plot presented
in Figure 11.11. This is observation 46 of Batch 1, and it is located at the (extreme)
left end of the data swarm. This observation also was noted as a potential outlier
in a similar scatter plot of the Batch 1 data presented in Figure 11.8. A T2 chart
based on (11.8) and the combined translated data is presented in Figure 11.12. The
first 99 T2 values correspond to Batch 1; the second 100 values correspond to Batch
2; and the last 100 values refer to Batch 3. The results confirm that the T2 value
of observation 46 exceeds the UCL. Thus, it is removed from the data set.
A revised T2 value chart, based on 298 observations, is given in Figure 11.13.
The one large T2 value in the plot corresponds to observation 181 from Batch 2.
However, the change in N from 299 to 298, by excluding observation 46 in Batch
1, does not reduce the UCL sufficiently to warrant further deletion. Thus, the
previous removal of the one outlier is adequate to produce a homogeneous data set.
The distribution of the T2 statistic is verified by examining a Q-Q plot of the
HDS. This plot is presented in Figure 11.14. The plot has a strong linear trend,
and no serious deviations from it are noted other than the few points located in
the upper right-hand corner of the plot. The extreme value is observation 181 from
Figure 11.13. It appears that the beta distribution can be used in Phase I analyses,
and the corresponding F distribution should be appropriate for Phase II operations.
Using the combined translated data with the single outlier removed, estimates
of S_W and X̄ are obtained and the mean test given in Step 4 of section 11.4 is
performed. Summary statistics for the HDS are presented in Table 11.2. Very close
agreement is observed when the overall mean and standard deviation are compared
to the individual batch means and standard deviations of Table 11.1.
The T2 values for the three individual batch means are computed using (11.9)
and are presented in Table 11.3. All three values are extremely small due to the
closeness of the group means to the overall mean (see Tables 11.1 and 11.2). When
compared to the UCL value of 0.0622 as computed using (11.11) with p = 2, k = 3,
and n = 100, none are significantly different from the others. From these results,
we conclude that all three batches are acceptable for use in the HDS.
where B(k/2, (k−p−1)/2) represents the beta distribution with parameters (k/2) and
((k−p−1)/2), S_B = SS_B/k is the covariance estimate defined in (11.3), and X̄ is
the overall mean computed using (11.1). The corresponding UCL is given by
and considerably lower than those for Batches 2 and 3. This is due to the mean
separation of the individual batches. Even though all three batches represent an
in-control process, combining them into one data set can mask the true correlation
between the variables or create a false correlation. Note also the difference between
the standard deviations of the variables within each batch and the standard
deviation of the variables for the overall batch. The latter is larger for x1 and
nearly twice as large for x2, and it does not reflect the true variation for an
individual batch.
We begin the outlier detection procedure by translating the three sets of batch
data to a common group that is centered at the origin. This is achieved by subtract-
ing the respective batch mean from each observation vector within the batch. For
an observation vector from Batch 1, we use (x11 − 133.36, x12 − 200.78); for Batch
2, we compute (x21 − 149.08, x22 − 202.59); and for Batch 3, we use (x31 − 147.6,
x32 − 190.60). A scatter plot of the combined data set after the translation is
presented in Figure 11.16, and summary statistics are presented in the last row of
Table 11.4. Comparing the summary statistics of the within-batches to those of
the translated batch presents a more agreeable picture. The standard deviations
of the variables of the overall translated batch compare favorably to the standard
deviation of any individual batch. Likewise, the correlation of the overall translated
group is more representative of the true linear relationship between the two process
variables.
Translation of the data in Figure 11.16 presents a different perspective. Overall,
the scatter plot of the data does not indicate obvious outliers. Three observations
from Batch 3 lie at the extremes of the scatter but do not appear to be obvious
outliers. Using α = 0.05, the T2 statistic based on the common overall batch
11.7. Example: Category 2 Batch Process 229
was used to detect observations located a significant distance from the mean of
(0,0). These results are presented in the T2 chart given in Figure 11.17, with Batch
1 data first, then Batch 2 data, followed by Batch 3 data. No observation has a T2
value larger than the UCL of 5.818. Also, the eighth observation of Batch 1 has
the largest T2 value, though this was not obvious when examining the scatter plot
given in Figure 11.15.
If there is an indication of a changing covariance matrix among the different
data runs or batches, a test of the hypothesis of equality of covariance matrices may
be performed (e.g., see Anderson (1984)). Rejection of the null hypothesis of equal
group covariance matrices would imply that different MVN distributions are needed
to describe the different runs or batches, and that the data cannot be pooled or
translated to a common group. From a practical point of view, this would imply a
very unstable process with no repeatability, and each run or batch would need to
be treated separately.
T2 = (X − X̄)' S^{-1} (X − X̄) ∼ [p(N+1)(N−1)/(N(N−p))] F(p, N−p),    (11.14)

where N is the total sample size of the HDS. The common covariance matrix
estimate S and the target mean vector estimate X̄ are obtained using the HDS,
and F(p, N−p) denotes the F distribution with p and N − p degrees of freedom. For
a given value of α and the appropriate values of N and p, we compute the UCL
using

UCL = [p(N+1)(N−1)/(N(N−p))] F(α; p, N−p),    (11.15)

where F(α; p, N−p) is the upper αth quantile of F(p, N−p). If, for a given observation
X, the T2 value does not exceed the UCL, it is concluded that control of the process
is being maintained; otherwise, a signal is declared. The T2 distribution in (11.14)
and the UCL in (11.15) would be appropriate for monitoring a Phase II operation
for a Category 1 batch process.
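The UCL in (11.15) can be wrapped in a small helper. The sketch below assumes the classical Phase II form for a single new observation with X̄ and S estimated from an HDS of size N; it is an illustration, not the book's code:

```python
from scipy.stats import f

def phase2_ucl(alpha, p, N):
    """Assumed Phase II UCL for a single new observation:
    [p(N+1)(N-1)/(N(N-p))] * F(alpha; p, N-p), the scaled upper-alpha
    F quantile with p and N-p degrees of freedom."""
    c = p * (N + 1) * (N - 1) / (N * (N - p))
    return c * f.ppf(1 - alpha, p, N - p)

ucl = phase2_ucl(0.05, 2, 299)  # illustrative alpha, p, and N
```

A new observation whose T2 value exceeds `ucl` would be declared a signal; otherwise control is judged to be maintained.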
Table 11.5: Phase II formulas for batch processes.

Subgroup size m = 1, target mean known (μ_T), covariance estimator S:
  T2 statistic:     (X − μ_T)' S^{-1} (X − μ_T)
  T2 distribution:  [p(N−1)/(N−p)] F(p, N−p)
  UCL:              [p(N−1)/(N−p)] F(α; p, N−p)

Subgroup size m = 1, target mean unknown (X̄), covariance estimator S_W:
  T2 statistic:     (X − X̄)' S_W^{-1} (X − X̄)
  T2 distribution:  [p(N−k)(N+1)/(N(N−k−p+1))] F(p, N−k−p+1)
  UCL:              [p(N−k)(N+1)/(N(N−k−p+1))] F(α; p, N−k−p+1)
The changes in the distribution of the T2 statistic when using the estimator S_W
are given by

T2 ∼ [p(N−k)(N+1)/(N(N−k−p+1))] F(p, N−k−p+1)

and

UCL = [p(N−k)(N+1)/(N(N−k−p+1))] F(α; p, N−k−p+1).    (11.16)
The T2 distribution and the UCL given in (11.16) would be appropriate for
monitoring a Phase II operation for a Category 2 batch process. This statistic
can also be used to monitor a Category 1 batch process, but it produces a more
conservative control region.
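The S_W-based limit from Table 11.5 can be sketched the same way; the constants below are the ones shown in the table, but the helper itself is ours:

```python
from scipy.stats import f

def phase2_ucl_sw(alpha, p, N, k):
    """Assumed Phase II UCL when the within-batch estimator S_W is used
    (constants as in Table 11.5):
    [p(N-k)(N+1)/(N(N-k-p+1))] * F(alpha; p, N-k-p+1)."""
    c = p * (N - k) * (N + 1) / (N * (N - k - p + 1))
    return c * f.ppf(1 - alpha, p, N - k - p + 1)

# With k = 1 (no batch structure) this reduces to the usual
# [p(N-1)(N+1)/(N(N-p))] F(alpha; p, N-p) control limit, and
# increasing k costs degrees of freedom, widening the limit slightly.
ucl = phase2_ucl_sw(0.05, 2, 300, 3)  # illustrative values
```

The loss of k − 1 degrees of freedom is why this limit is somewhat more conservative than the Category 1 version.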
The changes that occur in the two statistics when a target mean vector μ_T is
specified are given as
where S is again obtained from the HDS. For a given value of a, the UCL is
computed using
When the target mean vector is specified but the estimator Sw is used, the T2
statistic and its distribution are given as
Since the T2 values of all observations are well below the UCL of 9.4175, the
process appears to be in control.
However, closer inspection of the chart reveals a definite linear trend in the T2
values. An inspection of the data leads to the cause for this pattern. Consider
the scatter plot presented in Figure 11.20. When compared to the scatter plot of
the HDS given in Figure 11.11 or to the scatter plots of the individual batch data
given in Figures 11.8-11.10, the reason becomes obvious. The process is operating
in only a portion (i.e., the lower left-hand corner of the plot in Figure 11.20) of the
variable range specified in the HDS.
Further investigation confirms this conclusion and also gives a strong indication,
as does the T2 chart, that the entire process is moving beyond the operational region
of the variables. This is exhibited in the time-sequence plots of the individual
variables that are presented in Figures 11.21 and 11.22. From this analysis, it is
concluded that the process must be immediately recentered; otherwise the noted
drift in both variables will lead to upset conditions.
11.10 Summary
When monitoring batch processes, the problems of outlier detection, covariance
estimation, and batch mean differences are interrelated. To identify outliers and
estimate the covariance matrix, we recommend translating the data from the
different batches to the origin prior to analysis. This is achieved by subtracting the
individual batch mean from the batch observations. With this translation, outliers
can be removed using the procedures identified in Chapter 5. To detect batches
with atypical means, we recommend testing for mean batch differences following
the procedures described in this chapter.
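The translate-then-pool recipe above can be sketched directly; the two small batches below are synthetic and purely illustrative:

```python
# Estimate S_W by translating each batch to the origin (subtracting its own
# mean vector) and pooling the sums of squares over N - k degrees of freedom.
batches = [
    [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)],   # batch 1 (synthetic data)
    [(0.0, 0.0), (2.0, 2.0)],               # batch 2 (synthetic data)
]
p = 2                                        # number of variables
N = sum(len(b) for b in batches)             # total observations
k = len(batches)                             # number of batches

ss = [[0.0] * p for _ in range(p)]           # pooled sums of squares
for batch in batches:
    mean = [sum(x[j] for x in batch) / len(batch) for j in range(p)]
    for x in batch:
        d = [x[j] - mean[j] for j in range(p)]   # translated observation
        for r in range(p):
            for c in range(p):
                ss[r][c] += d[r] * d[c]

S_W = [[ss[r][c] / (N - k) for c in range(p)] for r in range(p)]
print(S_W)
```

In practice the outlier screening of Chapter 5 would be applied to the translated observations before the pooling step.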
Old Blue: Epilogue
As you walk out of your office with your boss, you explain how you used
multivariate statistical process control to locate the cause of the increased fuel usage
on Old Blue. You add that this would be an excellent tool to use in real-time
applications within the unit. You also ask permission to make a presentation
at the upcoming staff meeting on what you've learned from reading this new
book on multivariate statistical process control.
The boss notices the book in your hand and asks who wrote it. You glance
at the names of the authors, and comment: Mason and Young. Then it all
connects. That old statistics professor wasn't named Dr. Old . . . his name
was Dr. Young.
Appendix
Distribution Tables
Table A.1: Cumulative probabilities of the standard normal distribution.*
z Value 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
3.5 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998
3.6 0.9998 0.9998 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.7 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.8 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
* Entries in the table are the probability that a standard normal variate is less than
or equal to the given z value.
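Entries of this table can be reproduced from the error function available in any standard math library; a quick spot-check of two entries (Python standard library only):

```python
import math

def phi(z):
    """Cumulative standard normal probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Spot-check against the table: z = 1.00 -> 0.8413 and z = 1.96 -> 0.9750.
print(round(phi(1.00), 4), round(phi(1.96), 4))
```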
Table A.2: Percentage points of the t distribution.
Alpha
DF 0.2 0.15 0.1 0.05 0.025 0.01 0.005 0.001 0.0005
1 1.376 1.963 3.078 6.314 12.706 31.821 63.656 318.29 636.58
2 1.061 1.386 1.886 2.920 4.303 6.965 9.925 22.328 31.600
3 0.978 1.250 1.638 2.353 3.182 4.541 5.841 10.214 12.924
4 0.941 1.190 1.533 2.132 2.776 3.747 4.604 7.173 8.610
5 0.920 1.156 1.476 2.015 2.571 3.365 4.032 5.894 6.869
Table A.3: Percentage points of the chi-square distribution.*
Alpha
DF 0.001 0.005 0.01 0.025 0.05 0.1 0.9 0.95 0.975 0.99 0.995 0.999
1 0.00 0.00 0.00 0.00 0.00 0.02 2.71 3.84 5.02 6.63 7.88 10.83
2 0.00 0.01 0.02 0.05 0.10 0.21 4.61 5.99 7.38 9.21 10.60 13.82
3 0.02 0.07 0.11 0.22 0.35 0.58 6.25 7.81 9.35 11.34 12.84 16.27
4 0.09 0.21 0.30 0.48 0.71 1.06 7.78 9.49 11.14 13.28 14.86 18.47
5 0.21 0.41 0.55 0.83 1.15 1.61 9.24 11.07 12.83 15.09 16.75 20.51
6 0.38 0.68 0.87 1.24 1.64 2.20 10.64 12.59 14.45 16.81 18.55 22.46
7 0.60 0.99 1.24 1.69 2.17 2.83 12.02 14.07 16.01 18.48 20.28 24.32
8 0.86 1.34 1.65 2.18 2.73 3.49 13.36 15.51 17.53 20.09 21.95 26.12
9 1.15 1.73 2.09 2.70 3.33 4.17 14.68 16.92 19.02 21.67 23.59 27.88
10 1.48 2.16 2.56 3.25 3.94 4.87 15.99 18.31 20.48 23.21 25.19 29.59
11 1.83 2.60 3.05 3.82 4.57 5.58 17.28 19.68 21.92 24.73 26.76 31.26
12 2.21 3.07 3.57 4.40 5.23 6.30 18.55 21.03 23.34 26.22 28.30 32.91
13 2.62 3.57 4.11 5.01 5.89 7.04 19.81 22.36 24.74 27.69 29.82 34.53
14 3.04 4.07 4.66 5.63 6.57 7.79 21.06 23.68 26.12 29.14 31.32 36.12
15 3.48 4.60 5.23 6.26 7.26 8.55 22.31 25.00 27.49 30.58 32.80 37.70
16 3.94 5.14 5.81 6.91 7.96 9.31 23.54 26.30 28.85 32.00 34.27 39.25
17 4.42 5.70 6.41 7.56 8.67 10.09 24.77 27.59 30.19 33.41 35.72 40.79
18 4.90 6.26 7.01 8.23 9.39 10.86 25.99 28.87 31.53 34.81 37.16 42.31
19 5.41 6.84 7.63 8.91 10.12 11.65 27.20 30.14 32.85 36.19 38.58 43.82
20 5.92 7.43 8.26 9.59 10.85 12.44 28.41 31.41 34.17 37.57 40.00 45.31
21 6.45 8.03 8.90 10.28 11.59 13.24 29.62 32.67 35.48 38.93 41.40 46.80
22 6.98 8.64 9.54 10.98 12.34 14.04 30.81 33.92 36.78 40.29 42.80 48.27
23 7.53 9.26 10.20 11.69 13.09 14.85 32.01 35.17 38.08 41.64 44.18 49.73
24 8.08 9.89 10.86 12.40 13.85 15.66 33.20 36.42 39.36 42.98 45.56 51.18
25 8.65 10.52 11.52 13.12 14.61 16.47 34.38 37.65 40.65 44.31 46.93 52.62
26 9.22 11.16 12.20 13.84 15.38 17.29 35.56 38.89 41.92 45.64 48.29 54.05
27 9.80 11.81 12.88 14.57 16.15 18.11 36.74 40.11 43.19 46.96 49.65 55.48
28 10.39 12.46 13.56 15.31 16.93 18.94 37.92 41.34 44.46 48.28 50.99 56.89
29 10.99 13.12 14.26 16.05 17.71 19.77 39.09 42.56 45.72 49.59 52.34 58.30
30 11.59 13.79 14.95 16.79 18.49 20.60 40.26 43.77 46.98 50.89 53.67 59.70
40 17.92 20.71 22.16 24.43 26.51 29.05 51.81 55.76 59.34 63.69 66.77 73.40
50 24.67 27.99 29.71 32.36 34.76 37.69 63.17 67.50 71.42 76.15 79.49 86.66
60 31.74 35.53 37.48 40.48 43.19 46.46 74.40 79.08 83.30 88.38 91.95 99.61
70 39.04 43.28 45.44 48.76 51.74 55.33 85.53 90.53 95.02 100.43 104.21 112.32
80 46.52 51.17 53.54 57.15 60.39 64.28 96.58 101.88 106.63 112.33 116.32 124.84
90 54.16 59.20 61.75 65.65 69.13 73.29 107.57 113.15 118.14 124.12 128.30 137.21
100 61.92 67.33 70.06 74.22 77.93 82.36 118.50 124.34 129.56 135.81 140.17 149.45
150 102.11 109.14 112.67 117.98 122.69 128.28 172.58 179.58 185.80 193.21 198.36 209.27
200 143.84 152.24 156.43 162.73 168.28 174.84 226.02 233.99 241.06 249.45 255.26 267.54
250 186.55 196.16 200.94 208.10 214.39 221.81 279.05 287.88 295.69 304.94 311.35 324.83
500 407.95 422.30 429.39 439.94 449.15 459.93 540.93 553.13 563.85 576.49 585.21 603.45
* Entries in the table are the chi-square values for which the cumulative probability equals the
column heading (Alpha) for the given degrees of freedom (DF); the upper-tail area beyond each
entry is 1 - Alpha.
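For degrees of freedom not covered by the table, the Wilson-Hilferty approximation reproduces these quantiles closely; a sketch using only the Python standard library, checked against two table entries:

```python
import math
from statistics import NormalDist

def chi2_quantile_wh(q, df):
    """Wilson-Hilferty approximation to the chi-square quantile at
    cumulative probability q with df degrees of freedom."""
    z = NormalDist().inv_cdf(q)
    h = 2.0 / (9.0 * df)
    return df * (1.0 - h + z * math.sqrt(h)) ** 3

# Compare with the table: DF = 10 at Alpha = 0.95 -> 18.31,
# DF = 30 at Alpha = 0.95 -> 43.77 (the approximation agrees to ~0.05 here).
print(chi2_quantile_wh(0.95, 10), chi2_quantile_wh(0.95, 30))
```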
Table A.4a: Percentage points of the F distribution at Alpha = 0.05.*
Denominator Numerator DF
DF 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 60 90 120 150 200 250 500 oo
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 243.9 245.9 248.0 249.0 250.1 252.2 252.9 253.2 253.5 253.7 253.8 254.1 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.41 19.43 19.45 19.45 19.46 19.48 19.48 19.49 19.49 19.49 19.49 19.49 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.74 8.70 8.66 8.64 8.62 8.57 8.56 8.55 8.54 8.54 8.54 8.53 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.91 5.86 5.80 5.77 5.75 5.69 5.67 5.66 5.65 5.65 5.64 5.64 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.68 4.62 4.56 4.53 4.50 4.43 4.41 4.40 4.39 4.39 4.38 4.37 4.36
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.00 3.94 3.87 3.84 3.81 3.74 3.72 3.70 3.70 3.69 3.69 3.68 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.57 3.51 3.44 3.41 3.38 3.30 3.28 3.27 3.26 3.25 3.25 3.24 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.28 3.22 3.15 3.12 3.08 3.01 2.98 2.97 2.96 2.95 2.95 2.94 2.93
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.07 3.01 2.94 2.90 2.86 2.79 2.76 2.75 2.74 2.73 2.73 2.72 2.71
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.91 2.85 2.77 2.74 2.70 2.62 2.59 2.58 2.57 2.56 2.56 2.55 2.54
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.69 2.62 2.54 2.51 2.47 2.38 2.36 2.34 2.33 2.32 2.32 2.31 2.30
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.48 2.40 2.33 2.29 2.25 2.16 2.13 2.11 2.10 2.10 2.09 2.08 2.07
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.28 2.20 2.12 2.08 2.04 1.95 1.91 1.90 1.89 1.88 1.87 1.86 1.84
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.18 2.11 2.03 1.98 1.94 1.84 1.81 1.79 1.78 1.77 1.76 1.75 1.73
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 2.09 2.01 1.93 1.89 1.84 1.74 1.70 1.68 1.67 1.66 1.65 1.64 1.62
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.92 1.84 1.75 1.70 1.65 1.53 1.49 1.47 1.45 1.44 1.43 1.41 1.39
90 3.95 3.10 2.71 2.47 2.32 2.20 2.11 2.04 1.99 1.94 1.86 1.78 1.69 1.64 1.59 1.46 1.42 1.39 1.38 1.36 1.35 1.33 1.30
120 3.92 3.07 2.68 2.45 2.29 2.18 2.09 2.02 1.96 1.91 1.83 1.75 1.66 1.61 1.55 1.43 1.38 1.35 1.33 1.32 1.30 1.28 1.25
150 3.90 3.06 2.66 2.43 2.27 2.16 2.07 2.00 1.94 1.89 1.82 1.73 1.64 1.59 1.54 1.41 1.36 1.33 1.31 1.29 1.28 1.25 1.22
200 3.89 3.04 2.65 2.42 2.26 2.14 2.06 1.98 1.93 1.88 1.80 1.72 1.62 1.57 1.52 1.39 1.33 1.30 1.28 1.26 1.25 1.22 1.19
250 3.88 3.03 2.64 2.41 2.25 2.13 2.05 1.98 1.92 1.87 1.79 1.71 1.61 1.56 1.50 1.37 1.32 1.29 1.27 1.25 1.23 1.20 1.17
500 3.86 3.01 2.62 2.39 2.23 2.12 2.03 1.96 1.90 1.85 1.77 1.69 1.59 1.54 1.48 1.35 1.29 1.26 1.23 1.21 1.19 1.16 1.11
oo 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.75 1.67 1.57 1.52 1.46 1.32 1.26 1.22 1.20 1.17 1.15 1.11 1.00
* Entries in the table are the F values for an area (Alpha probability) in the upper tail of the F distribution for the given denominator and numerator
degrees of freedom (DF).
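The last (oo) row of each F table is not independent data: as the denominator DF grows, the F percentage point tends to the corresponding chi-square quantile divided by the numerator DF, so those rows can be checked against the chi-square table above. A quick check with two entries quoted from the tables (the 0.95 chi-square column corresponds to an upper-tail area of 0.05):

```python
# Limiting relation: F(alpha; k, infinity) = chi2(alpha; k) / k.
chi2_table = {5: 11.07, 10: 18.31}   # cumulative 0.95 entries quoted above

f_infinity = {df: round(value / df, 2) for df, value in chi2_table.items()}
print(f_infinity)
```

These reproduce the 2.21 and 1.83 entries of the oo row of the Alpha = 0.05 F table.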
Table A.4b: Percentage points of the F distribution at Alpha = 0.025.*
Denominator Numerator DF
DF 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 60 90 120 150 200 250 500 oo
1 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.6 963.3 968.3 976.2 984.9 993.1 997.3 1001 1010 1013 1014 1015 1016 1016 1017 1018
2 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.41 39.43 39.45 39.46 39.46 39.48 39.49 39.49 39.49 39.49 39.49 39.50 39.50
3 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.34 14.25 14.17 14.12 14.08 13.99 13.96 13.95 13.94 13.93 13.92 13.91 13.90
4 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.75 8.66 8.56 8.51 8.46 8.36 8.33 8.31 8.30 8.29 8.28 8.27 8.26
5 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.52 6.43 6.33 6.28 6.23 6.12 6.09 6.07 6.06 6.05 6.04 6.03 6.02
6 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.37 5.27 5.17 5.12 5.07 4.96 4.92 4.90 4.89 4.88 4.88 4.86 4.85
7 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76 4.67 4.57 4.47 4.41 4.36 4.25 4.22 4.20 4.19 4.18 4.17 4.16 4.14
8 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.20 4.10 4.00 3.95 3.89 3.78 3.75 3.73 3.72 3.70 3.70 3.68 3.67
9 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.87 3.77 3.67 3.61 3.56 3.45 3.41 3.39 3.38 3.37 3.36 3.35 3.33
10 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.62 3.52 3.42 3.37 3.31 3.20 3.16 3.14 3.13 3.12 3.11 3.09 3.08
12 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.28 3.18 3.07 3.02 2.96 2.85 2.81 2.79 2.78 2.76 2.76 2.74 2.72
15 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.96 2.86 2.76 2.70 2.64 2.52 2.48 2.46 2.45 2.44 2.43 2.41 2.40
20 5.87 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.77 2.68 2.57 2.46 2.41 2.35 2.22 2.18 2.16 2.14 2.13 2.12 2.10 2.09
24 5.72 4.32 3.72 3.38 3.15 2.99 2.87 2.78 2.70 2.64 2.54 2.44 2.33 2.27 2.21 2.08 2.03 2.01 2.00 1.98 1.97 1.95 1.94
30 5.57 4.18 3.59 3.25 3.03 2.87 2.75 2.65 2.57 2.51 2.41 2.31 2.20 2.14 2.07 1.94 1.89 1.87 1.85 1.84 1.83 1.81 1.79
60 5.29 3.93 3.34 3.01 2.79 2.63 2.51 2.41 2.33 2.27 2.17 2.06 1.94 1.88 1.82 1.67 1.61 1.58 1.56 1.54 1.53 1.51 1.48
90 5.20 3.84 3.26 2.93 2.71 2.55 2.43 2.34 2.26 2.19 2.09 1.98 1.86 1.80 1.73 1.58 1.52 1.48 1.46 1.44 1.43 1.40 1.37
120 5.15 3.80 3.23 2.89 2.67 2.52 2.39 2.30 2.22 2.16 2.05 1.94 1.82 1.76 1.69 1.53 1.47 1.43 1.41 1.39 1.37 1.34 1.31
150 5.13 3.78 3.20 2.87 2.65 2.49 2.37 2.28 2.20 2.13 2.03 1.92 1.80 1.74 1.67 1.50 1.44 1.40 1.38 1.35 1.34 1.31 1.27
200 5.10 3.76 3.18 2.85 2.63 2.47 2.35 2.26 2.18 2.11 2.01 1.90 1.78 1.71 1.64 1.47 1.41 1.37 1.35 1.32 1.30 1.27 1.23
250 5.08 3.74 3.17 2.84 2.62 2.46 2.34 2.24 2.16 2.10 2.00 1.89 1.76 1.70 1.63 1.46 1.39 1.35 1.33 1.30 1.28 1.24 1.20
500 5.05 3.72 3.14 2.81 2.59 2.43 2.31 2.22 2.14 2.07 1.97 1.86 1.74 1.67 1.60 1.42 1.35 1.31 1.28 1.25 1.24 1.19 1.14
oo 5.02 3.69 3.12 2.79 2.57 2.41 2.29 2.19 2.11 2.05 1.94 1.83 1.71 1.64 1.57 1.39 1.31 1.27 1.24 1.21 1.18 1.13 1.00
* Entries in the table are the F values for an area (Alpha probability) in the upper tail of the F distribution for the given denominator and numerator
degrees of freedom (DF).
Table A.4c: Percentage points of the F distribution at Alpha = 0.01.*
Denominator Numerator DF
DF 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 60 90 120 150 200 250 500 oo
1 4052 4999 5403 5624 5764 5859 5928 5981 6022 6056 6107 6157 6209 6234 6260 6313 6331 6339 6345 6350 6353 6359 6366
2 98.50 99.00 99.16 99.25 99.30 99.33 99.36 99.38 99.39 99.40 99.42 99.43 99.45 99.46 99.47 99.48 99.49 99.49 99.49 99.49 99.50 99.50 99.50
3 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.34 27.23 27.05 26.87 26.69 26.60 26.50 26.32 26.25 26.22 26.20 26.18 26.17 26.15 26.13
4 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.37 14.20 14.02 13.93 13.84 13.65 13.59 13.56 13.54 13.52 13.51 13.49 13.46
5 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05 9.89 9.72 9.55 9.47 9.38 9.20 9.14 9.11 9.09 9.08 9.06 9.04 9.02
6 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.72 7.56 7.40 7.31 7.23 7.06 7.00 6.97 6.95 6.93 6.92 6.90 6.88
7 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.47 6.31 6.16 6.07 5.99 5.82 5.77 5.74 5.72 5.70 5.69 5.67 5.65
8 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.67 5.52 5.36 5.28 5.20 5.03 4.97 4.95 4.93 4.91 4.90 4.88 4.86
9 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 5.11 4.96 4.81 4.73 4.65 4.48 4.43 4.40 4.38 4.36 4.35 4.33 4.31
10 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.71 4.56 4.41 4.33 4.25 4.08 4.03 4.00 3.98 3.96 3.95 3.93 3.91
12 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.16 4.01 3.86 3.78 3.70 3.54 3.48 3.45 3.43 3.41 3.40 3.38 3.36
15 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.67 3.52 3.37 3.29 3.21 3.05 2.99 2.96 2.94 2.92 2.91 2.89 2.87
20 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37 3.23 3.09 2.94 2.86 2.78 2.61 2.55 2.52 2.50 2.48 2.47 2.44 2.42
24 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17 3.03 2.89 2.74 2.66 2.58 2.40 2.34 2.31 2.29 2.27 2.26 2.24 2.21
30 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98 2.84 2.70 2.55 2.47 2.39 2.21 2.14 2.11 2.09 2.07 2.06 2.03 2.01
60 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63 2.50 2.35 2.20 2.12 2.03 1.84 1.76 1.73 1.70 1.68 1.66 1.63 1.60
90 6.93 4.85 4.01 3.53 3.23 3.01 2.84 2.72 2.61 2.52 2.39 2.24 2.09 2.00 1.92 1.72 1.64 1.60 1.57 1.55 1.53 1.49 1.46
120 6.85 4.79 3.95 3.48 3.17 2.96 2.79 2.66 2.56 2.47 2.34 2.19 2.03 1.95 1.86 1.66 1.58 1.53 1.51 1.48 1.46 1.42 1.38
150 6.81 4.75 3.91 3.45 3.14 2.92 2.76 2.63 2.53 2.44 2.31 2.16 2.00 1.92 1.83 1.62 1.54 1.49 1.46 1.43 1.42 1.38 1.33
200 6.76 4.71 3.88 3.41 3.11 2.89 2.73 2.60 2.50 2.41 2.27 2.13 1.97 1.89 1.79 1.58 1.50 1.45 1.42 1.39 1.37 1.33 1.28
250 6.74 4.69 3.86 3.40 3.09 2.87 2.71 2.58 2.48 2.39 2.26 2.11 1.95 1.87 1.77 1.56 1.48 1.43 1.40 1.36 1.34 1.30 1.24
500 6.69 4.65 3.82 3.36 3.05 2.84 2.68 2.55 2.44 2.36 2.22 2.07 1.92 1.83 1.74 1.52 1.43 1.38 1.34 1.31 1.28 1.23 1.16
oo 6.63 4.61 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32 2.18 2.04 1.88 1.79 1.70 1.47 1.38 1.32 1.29 1.25 1.22 1.15 1.00
* Entries in the table are the F values for an area (Alpha probability) in the upper tail of the F distribution for the given denominator and numerator
degrees of freedom (DF).
Table A.4d: Percentage points of the F distribution at Alpha = 0.005.*
Denominator Numerator DF
DF 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 60 90 120 150 200 250 500 oo
1 ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++
2 198.5 199.0 199.2 199.2 199.3 199.3 199.4 199.4 199.4 199.4 199.4 199.4 199.4 199.4 199.5 199.5 199.5 199.5 199.5 199.5 199.5 199.5 199.5
3 55.55 49.80 47.47 46.20 45.39 44.84 44.43 44.13 43.88 43.68 43.39 43.08 42.78 42.62 42.47 42.15 42.04 41.99 41.96 41.92 41.91 41.87 41.83
4 31.33 26.28 24.26 23.15 22.46 21.98 21.62 21.35 21.14 20.97 20.70 20.44 20.17 20.03 19.89 19.61 19.52 19.47 19.44 19.41 19.39 19.36 19.32
5 22.78 18.31 16.53 15.56 14.94 14.51 14.20 13.96 13.77 13.62 13.38 13.15 12.90 12.78 12.66 12.40 12.32 12.27 12.25 12.22 12.21 12.17 12.14
6 18.63 14.54 12.92 12.03 11.46 11.07 10.79 10.57 10.39 10.25 10.03 9.81 9.59 9.47 9.36 9.12 9.04 9.00 8.98 8.95 8.94 8.91 8.88
7 16.24 12.40 10.88 10.05 9.52 9.16 8.89 8.68 8.51 8.38 8.18 7.97 7.75 7.64 7.53 7.31 7.23 7.19 7.17 7.15 7.13 7.10 7.08
8 14.69 11.04 9.60 8.81 8.30 7.95 7.69 7.50 7.34 7.21 7.01 6.81 6.61 6.50 6.40 6.18 6.10 6.06 6.04 6.02 6.01 5.98 5.95
9 13.61 10.11 8.72 7.96 7.47 7.13 6.88 6.69 6.54 6.42 6.23 6.03 5.83 5.73 5.62 5.41 5.34 5.30 5.28 5.26 5.24 5.21 5.19
10 12.83 9.43 8.08 7.34 6.87 6.54 6.30 6.12 5.97 5.85 5.66 5.47 5.27 5.17 5.07 4.86 4.79 4.75 4.73 4.71 4.69 4.67 4.64
12 11.75 8.51 7.23 6.52 6.07 5.76 5.52 5.35 5.20 5.09 4.91 4.72 4.53 4.43 4.33 4.12 4.05 4.01 3.99 3.97 3.96 3.93 3.90
15 10.80 7.70 6.48 5.80 5.37 5.07 4.85 4.67 4.54 4.42 4.25 4.07 3.88 3.79 3.69 3.48 3.41 3.37 3.35 3.33 3.31 3.29 3.26
20 9.94 6.99 5.82 5.17 4.76 4.47 4.26 4.09 3.96 3.85 3.68 3.50 3.32 3.22 3.12 2.92 2.84 2.81 2.78 2.76 2.75 2.72 2.69
24 9.55 6.66 5.52 4.89 4.49 4.20 3.99 3.83 3.69 3.59 3.42 3.25 3.06 2.97 2.87 2.66 2.58 2.55 2.52 2.50 2.49 2.46 2.43
30 9.18 6.35 5.24 4.62 4.23 3.95 3.74 3.58 3.45 3.34 3.18 3.01 2.82 2.73 2.63 2.42 2.34 2.30 2.28 2.25 2.24 2.21 2.18
60 8.49 5.79 4.73 4.14 3.76 3.49 3.29 3.13 3.01 2.90 2.74 2.57 2.39 2.29 2.19 1.96 1.88 1.83 1.81 1.78 1.76 1.73 1.69
90 8.28 5.62 4.57 3.99 3.62 3.35 3.15 3.00 2.87 2.77 2.61 2.44 2.25 2.15 2.05 1.82 1.73 1.68 1.65 1.62 1.60 1.56 1.52
120 8.18 5.54 4.50 3.92 3.55 3.28 3.09 2.93 2.81 2.71 2.54 2.37 2.19 2.09 1.98 1.75 1.66 1.61 1.57 1.54 1.52 1.48 1.43
150 8.12 5.49 4.45 3.88 3.51 3.25 3.05 2.89 2.77 2.67 2.51 2.33 2.15 2.05 1.94 1.70 1.61 1.56 1.53 1.49 1.47 1.42 1.37
200 8.06 5.44 4.41 3.84 3.47 3.21 3.01 2.86 2.73 2.63 2.47 2.30 2.11 2.01 1.91 1.66 1.56 1.51 1.48 1.44 1.42 1.37 1.31
250 8.02 5.41 4.38 3.81 3.44 3.18 2.99 2.83 2.71 2.61 2.45 2.27 2.09 1.99 1.88 1.64 1.54 1.48 1.45 1.41 1.39 1.33 1.27
500 7.95 5.35 4.33 3.76 3.40 3.14 2.94 2.79 2.66 2.56 2.40 2.23 2.04 1.94 1.84 1.58 1.48 1.42 1.39 1.35 1.32 1.26 1.18
oo 7.88 5.30 4.28 3.72 3.35 3.09 2.90 2.74 2.62 2.52 2.36 2.19 2.00 1.90 1.79 1.53 1.43 1.36 1.32 1.28 1.25 1.17 1.00
* Entries in the table are the F values for an area (Alpha probability) in the upper tail of the F distribution for the given denominator and numerator degrees of
freedom (DF).
++ F values exceed 16,000.
Table A.4e: Percentage points of the F distribution at Alpha = 0.001.*
Denominator Numerator DF
DF 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 60 90 120 150 200 250 500 oo
1 ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++
2 998.4 998.8 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.3 999.5
3 167.1 148.5 141.1 137.1 134.6 132.8 131.6 130.6 129.9 129.2 128.3 127.4 126.4 125.9 125.4 124.4 124.2 124.0 123.9 123.7 123.7 123.6 123.5
4 74.13 61.25 56.17 53.43 51.72 50.52 49.65 49.00 48.47 48.05 47.41 46.76 46.10 45.77 45.43 44.75 44.51 44.40 44.33 44.27 44.22 44.14 44.05
5 47.18 37.12 33.20 31.08 29.75 28.83 28.17 27.65 27.24 26.91 26.42 25.91 25.39 25.13 24.87 24.33 24.15 24.06 24.00 23.95 23.92 23.85 23.78
6 35.51 27.00 23.71 21.92 20.80 20.03 19.46 19.03 18.69 18.41 17.99 17.56 17.12 16.90 16.67 16.21 16.06 15.98 15.93 15.89 15.86 15.80 15.75
7 29.25 21.69 18.77 17.20 16.21 15.52 15.02 14.63 14.33 14.08 13.71 13.32 12.93 12.73 12.53 12.12 11.98 11.91 11.87 11.82 11.80 11.75 11.70
8 25.41 18.49 15.83 14.39 13.48 12.86 12.40 12.05 11.77 11.54 11.19 10.84 10.48 10.30 10.11 9.73 9.60 9.53 9.49 9.45 9.43 9.38 9.33
9 22.86 16.39 13.90 12.56 11.71 11.13 10.70 10.37 10.11 9.89 9.57 9.24 8.90 8.72 8.55 8.19 8.06 8.00 7.96 7.93 7.90 7.86 7.81
10 21.04 14.90 12.55 11.28 10.48 9.93 9.52 9.20 8.96 8.75 8.45 8.13 7.80 7.64 7.47 7.12 7.00 6.94 6.91 6.87 6.85 6.81 6.76
12 18.64 12.97 10.80 9.63 8.89 8.38 8.00 7.71 7.48 7.29 7.00 6.71 6.40 6.25 6.09 5.76 5.65 5.59 5.56 5.52 5.50 5.46 5.42
15 16.59 11.34 9.34 8.25 7.57 7.09 6.74 6.47 6.26 6.08 5.81 5.54 5.25 5.10 4.95 4.64 4.53 4.48 4.44 4.41 4.39 4.35 4.31
20 14.82 9.95 8.10 7.10 6.46 6.02 5.69 5.44 5.24 5.08 4.82 4.56 4.29 4.15 4.00 3.70 3.60 3.54 3.51 3.48 3.46 3.42 3.38
24 14.03 9.34 7.55 6.59 5.98 5.55 5.24 4.99 4.80 4.64 4.39 4.14 3.87 3.74 3.59 3.29 3.19 3.14 3.10 3.07 3.05 3.01 2.97
30 13.29 8.77 7.05 6.12 5.53 5.12 4.82 4.58 4.39 4.24 4.00 3.75 3.49 3.36 3.22 2.92 2.81 2.76 2.73 2.69 2.67 2.63 2.59
60 11.97 7.77 6.17 5.31 4.76 4.37 4.09 3.86 3.69 3.54 3.32 3.08 2.83 2.69 2.55 2.25 2.14 2.08 2.05 2.01 1.99 1.94 1.89
90 11.57 7.47 5.91 5.06 4.53 4.15 3.87 3.65 3.48 3.34 3.11 2.88 2.63 2.50 2.36 2.05 1.93 1.87 1.83 1.79 1.77 1.72 1.66
120 11.38 7.32 5.78 4.95 4.42 4.04 3.77 3.55 3.38 3.24 3.02 2.78 2.53 2.40 2.26 1.95 1.83 1.77 1.73 1.68 1.66 1.60 1.54
150 11.27 7.24 5.71 4.88 4.35 3.98 3.71 3.49 3.32 3.18 2.96 2.73 2.48 2.35 2.21 1.89 1.77 1.70 1.66 1.62 1.59 1.53 1.47
200 11.15 7.15 5.63 4.81 4.29 3.92 3.65 3.43 3.26 3.12 2.90 2.67 2.42 2.29 2.15 1.83 1.71 1.64 1.60 1.55 1.52 1.46 1.39
250 11.09 7.10 5.59 4.77 4.25 3.88 3.61 3.40 3.23 3.09 2.87 2.64 2.39 2.26 2.12 1.80 1.67 1.60 1.56 1.51 1.48 1.42 1.34
500 10.96 7.00 5.51 4.69 4.18 3.81 3.54 3.33 3.16 3.02 2.81 2.58 2.33 2.20 2.05 1.73 1.60 1.53 1.48 1.43 1.39 1.32 1.23
oo 10.83 6.91 5.42 4.62 4.10 3.74 3.47 3.27 3.10 2.96 2.74 2.51 2.27 2.13 1.99 1.66 1.52 1.45 1.40 1.34 1.30 1.21 1.00
* Entries in the table are the F values for an area (Alpha probability) in the upper tail of the F distribution for the given denominator and numerator degrees of
freedom (DF).
++ F values exceed 400,000.
Table A.5: Percentage points of the beta distribution.*
(n-p-1)/2
p/2 Alpha 5 6 7 8 9 10 20 30 40 50 60 70 80 90 120 150 200 250 500
0.5 0.999 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.75 0.0106 0.0088 0.0075 0.0065 0.0058 0.0052 0.0026 0.0017 0.0013 0.0010 0.0008 0.0007 0.0006 0.0006 0.0004 0.0003 0.0003 0.0002 0.0001
0.10 0.2473 0.2093 0.1814 0.1600 0.1431 0.1295 0.0662 0.0445 0.0335 0.0268 0.0224 0.0192 0.0168 0.0150 0.0112 0.0090 0.0067 0.0054 0.0027
0.05 0.3318 0.2835 0.2473 0.2193 0.1969 0.1787 0.0927 0.0625 0.0472 0.0379 0.0316 0.0272 0.0238 0.0212 0.0159 0.0127 0.0096 0.0077 0.0038
0.025 0.4096 0.3532 0.3103 0.2765 0.2493 0.2269 0.1194 0.0810 0.0612 0.0492 0.0412 0.0354 0.0310 0.0276 0.0208 0.0166 0.0125 0.0100 0.0050
0.01 0.5011 0.4374 0.3876 0.3478 0.3152 0.2882 0.1546 0.1055 0.0801 0.0645 0.0540 0.0464 0.0407 0.0363 0.0273 0.0219 0.0165 0.0132 0.0066
0.005 0.5619 0.4948 0.4413 0.3979 0.3621 0.3321 0.1808 0.1240 0.0944 0.0761 0.0638 0.0549 0.0482 0.0429 0.0324 0.0260 0.0195 0.0157 0.0079
0.001 0.6778 0.6084 0.5505 0.5019 0.4608 0.4256 0.2397 0.1664 0.1273 0.1031 0.0866 0.0747 0.0656 0.0585 0.0442 0.0355 0.0267 0.0214 0.0108
1.0 0.999 0.0002 0.0002 0.0001 0.0001 0.0001 0.0001 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.75 0.0559 0.0468 0.0403 0.0353 0.0315 0.0284 0.0143 0.0095 0.0072 0.0057 0.0048 0.0041 0.0036 0.0032 0.0024 0.0019 0.0014 0.0012 0.0006
0.10 0.3690 0.3187 0.2803 0.2501 0.2257 0.2057 0.1087 0.0739 0.0559 0.0450 0.0376 0.0324 0.0284 0.0253 0.0190 0.0152 0.0114 0.0092 0.0046
0.05 0.4507 0.3930 0.3482 0.3123 0.2831 0.2589 0.1391 0.0950 0.0722 0.0582 0.0487 0.0419 0.0368 0.0327 0.0247 0.0198 0.0149 0.0119 0.0060
0.025 0.5218 0.4593 0.4096 0.3694 0.3363 0.3085 0.1684 0.1157 0.0881 0.0711 0.0596 0.0513 0.0451 0.0402 0.0303 0.0243 0.0183 0.0146 0.0074
0.01 0.6019 0.5358 0.4821 0.4377 0.4005 0.3690 0.2057 0.1423 0.1087 0.0880 0.0739 0.0637 0.0559 0.0499 0.0376 0.0302 0.0228 0.0183 0.0092
0.005 0.6534 0.5865 0.5309 0.4843 0.4450 0.4113 0.2327 0.1619 0.1241 0.1005 0.0845 0.0729 0.0641 0.0572 0.0432 0.0347 0.0261 0.0210 0.0105
0.001 0.7488 0.6838 0.6272 0.5783 0.5358 0.4988 0.2921 0.2057 0.1586 0.1290 0.1087 0.0940 0.0827 0.0739 0.0559 0.0450 0.0339 0.0273 0.0137
1.5 0.999 0.0023 0.0019 0.0017 0.0015 0.0013 0.0012 0.0006 0.0004 0.0003 0.0002 0.0002 0.0002 0.0002 0.0001 0.0001 0.0001 0.0001 0.0000 0.0000
0.75 0.1093 0.0926 0.0803 0.0709 0.0635 0.0575 0.0295 0.0198 0.0150 0.0120 0.0100 0.0086 0.0075 0.0067 0.0050 0.0040 0.0030 0.0024 0.0012
0.10 0.4500 0.3944 0.3509 0.3158 0.2871 0.2631 0.1431 0.0982 0.0747 0.0603 0.0506 0.0435 0.0382 0.0340 0.0257 0.0206 0.0155 0.0124 0.0062
0.05 0.5266 0.4660 0.4174 0.3778 0.3450 0.3173 0.1755 0.1212 0.0925 0.0748 0.0628 0.0541 0.0475 0.0424 0.0320 0.0257 0.0193 0.0155 0.0078
0.025 0.5915 0.5280 0.4761 0.4332 0.3972 0.3666 0.2062 0.1432 0.1096 0.0888 0.0747 0.0644 0.0566 0.0505 0.0381 0.0306 0.0231 0.0185 0.0093
0.01 0.6628 0.5981 0.5438 0.4981 0.4591 0.4255 0.2444 0.1710 0.1315 0.1068 0.0899 0.0776 0.0682 0.0609 0.0461 0.0370 0.0279 0.0224 0.0113
0.005 0.7080 0.6437 0.5887 0.5417 0.5012 0.4660 0.2718 0.1912 0.1474 0.1199 0.1011 0.0873 0.0769 0.0687 0.0520 0.0418 0.0315 0.0253 0.0127
0.001 0.7902 0.7298 0.6758 0.6281 0.5858 0.5485 0.3309 0.2358 0.1830 0.1494 0.1263 0.1093 0.0964 0.0862 0.0654 0.0527 0.0398 0.0320 0.0161
2.0 0.999 0.0083 0.0070 0.0060 0.0053 0.0048 0.0043 0.0022 0.0015 0.0011 0.0009 0.0008 0.0006 0.0006 0.0005 0.0004 0.0003 0.0002 0.0002 0.0001
0.75 0.1612 0.1380 0.1206 0.1072 0.0964 0.0876 0.0458 0.0310 0.0235 0.0189 0.0158 0.0135 0.0119 0.0106 0.0079 0.0064 0.0048 0.0038 0.0019
0.10 0.5103 0.4526 0.4062 0.3684 0.3368 0.3102 0.1729 0.1198 0.0916 0.0741 0.0623 0.0537 0.0472 0.0421 0.0318 0.0255 0.0192 0.0154 0.0077
0.05 0.5818 0.5207 0.4707 0.4291 0.3942 0.3644 0.2067 0.1441 0.1106 0.0897 0.0754 0.0651 0.0572 0.0511 0.0386 0.0310 0.0234 0.0188 0.0094
0.025 0.6412 0.5787 0.5265 0.4825 0.4450 0.4128 0.2382 0.1670 0.1286 0.1045 0.0880 0.0760 0.0669 0.0597 0.0452 0.0363 0.0274 0.0220 0.0111
0.01 0.7057 0.6434 0.5899 0.5440 0.5044 0.4698 0.2768 0.1957 0.1512 0.1232 0.1039 0.0899 0.0792 0.0707 0.0536 0.0432 0.0326 0.0262 0.0132
0.005 0.7460 0.6849 0.6315 0.5850 0.5443 0.5086 0.3043 0.2163 0.1677 0.1368 0.1156 0.1000 0.0882 0.0788 0.0598 0.0482 0.0364 0.0292 0.0147
0.001 0.8186 0.7625 0.7113 0.6651 0.6237 0.5866 0.3630 0.2613 0.2039 0.1671 0.1416 0.1228 0.1084 0.0970 0.0738 0.0595 0.0450 0.0362 0.0183
2.5 0.999 0.0182 0.0155 0.0135 0.0120 0.0107 0.0097 0.0051 0.0034 0.0026 0.0021 0.0017 0.0015 0.0013 0.0012 0.0009 0.0007 0.0005 0.0004 0.0002
0.75 0.2092 0.1808 0.1592 0.1422 0.1286 0.1173 0.0625 0.0426 0.0323 0.0260 0.0218 0.0187 0.0164 0.0146 0.0110 0.0088 0.0066 0.0053 0.0027
0.1 0.5577 0.4994 0.4517 0.4122 0.3789 0.3505 0.1997 0.1395 0.1072 0.0870 0.0732 0.0632 0.0556 0.0496 0.0375 0.0302 0.0227 0.0182 0.0092
0.05 0.6245 0.5641 0.5137 0.4713 0.4351 0.4040 0.2344 0.1648 0.1271 0.1034 0.0871 0.0753 0.0663 0.0592 0.0448 0.0361 0.0272 0.0218 0.0110
0.025 0.6793 0.6185 0.5668 0.5225 0.4844 0.4512 0.2663 0.1884 0.1457 0.1188 0.1002 0.0867 0.0764 0.0683 0.0518 0.0417 0.0315 0.0253 0.0127
0.01 0.7381 0.6785 0.6264 0.5810 0.5413 0.5063 0.3052 0.2177 0.1690 0.1381 0.1168 0.1011 0.0892 0.0798 0.0606 0.0488 0.0369 0.0296 0.0150
0.005 0.7746 0.7167 0.6652 0.6196 0.5792 0.5435 0.3326 0.2386 0.1858 0.1522 0.1288 0.1117 0.0985 0.0882 0.0670 0.0540 0.0409 0.0328 0.0166
0.001 0.8398 0.7875 0.7389 0.6944 0.6541 0.6176 0.3906 0.2839 0.2226 0.1831 0.1554 0.1350 0.1193 0.1069 0.0814 0.0658 0.0498 0.0401 0.0203
3.0 0.999 0.0316 0.0270 0.0237 0.0210 0.0189 0.0172 0.0090 0.0061 0.0046 0.0037 0.0031 0.0027 0.0023 0.0021 0.0016 0.0013 0.0009 0.0008 0.0004
0.75 0.2531 0.2206 0.1955 0.1756 0.1593 0.1459 0.0790 0.0542 0.0413 0.0333 0.0279 0.0240 0.0211 0.0188 0.0142 0.0114 0.0086 0.0069 0.0034
0.1 0.5962 0.5382 0.4901 0.4496 0.4152 0.3855 0.2242 0.1579 0.1218 0.0991 0.0836 0.0722 0.0636 0.0568 0.0430 0.0346 0.0261 0.0210 0.0106
0.05 0.6587 0.5997 0.5496 0.5069 0.4701 0.4381 0.2595 0.1839 0.1424 0.1162 0.0981 0.0849 0.0748 0.0669 0.0507 0.0408 0.0308 0.0248 0.0125
0.025 0.7096 0.6509 0.6001 0.5561 0.5178 0.4841 0.2916 0.2081 0.1616 0.1321 0.1117 0.0968 0.0853 0.0763 0.0580 0.0467 0.0353 0.0284 0.0143
0.01 0.7637 0.7068 0.6563 0.6117 0.5723 0.5373 0.3305 0.2377 0.1855 0.1520 0.1288 0.1117 0.0986 0.0882 0.0671 0.0541 0.0410 0.0329 0.0166
0.005 0.7970 0.7422 0.6926 0.6482 0.6085 0.5730 0.3577 0.2588 0.2026 0.1663 0.1411 0.1225 0.1082 0.0969 0.0738 0.0596 0.0451 0.0363 0.0183
0.001 0.8562 0.8073 0.7612 0.7185 0.6793 0.6436 0.4151 0.3042 0.2397 0.1977 0.1682 0.1463 0.1295 0.1161 0.0886 0.0717 0.0543 0.0438 0.0222
3.5 0.999 0.0473 0.0408 0.0359 0.0320 0.0289 0.0264 0.0140 0.0095 0.0072 0.0058 0.0049 0.0042 0.0037 0.0033 0.0025 0.0020 0.0015 0.0012 0.0006
0.75 0.2929 0.2572 0.2294 0.2070 0.1886 0.1732 0.0954 0.0659 0.0503 0.0407 0.0341 0.0294 0.0258 0.0230 0.0174 0.0140 0.0105 0.0084 0.0042
0.1 0.6282 0.5711 0.5230 0.4821 0.4470 0.4165 0.2468 0.1751 0.1356 0.1107 0.0935 0.0809 0.0713 0.0637 0.0484 0.0389 0.0294 0.0236 0.0119
0.05 0.6870 0.6296 0.5802 0.5376 0.5005 0.4681 0.2824 0.2018 0.1569 0.1283 0.1085 0.0940 0.0829 0.0742 0.0564 0.0454 0.0343 0.0276 0.0139
0.025 0.7344 0.6778 0.6282 0.5848 0.5466 0.5128 0.3147 0.2263 0.1765 0.1447 0.1226 0.1063 0.0939 0.0840 0.0639 0.0516 0.0390 0.0314 0.0158
0.01 0.7845 0.7302 0.6814 0.6379 0.5990 0.5642 0.3534 0.2562 0.2008 0.1650 0.1400 0.1216 0.1075 0.0963 0.0734 0.0593 0.0449 0.0361 0.0183
0.005 0.8152 0.7632 0.7156 0.6724 0.6335 0.5984 0.3804 0.2774 0.2181 0.1796 0.1526 0.1327 0.1173 0.1052 0.0802 0.0648 0.0491 0.0396 0.0200
0.001 0.8695 0.8235 0.7797 0.7387 0.7007 0.6658 0.4370 0.3228 0.2556 0.2114 0.1802 0.1570 0.1390 0.1248 0.0954 0.0773 0.0586 0.0473 0.0240
4.0 0.999 0.0648 0.0562 0.0496 0.0444 0.0402 0.0368 0.0198 0.0135 0.0103 0.0083 0.0069 0.0060 0.0052 0.0047 0.0035 0.0028 0.0021 0.0017 0.0009
0.75 0.3291 0.2910 0.2609 0.2364 0.2162 0.1991 0.1114 0.0774 0.0593 0.0481 0.0404 0.0348 0.0306 0.0273 0.0207 0.0166 0.0125 0.0100 0.0050
0.1 0.6554 0.5994 0.5517 0.5108 0.4753 0.4443 0.2678 0.1914 0.1488 0.1217 0.1030 0.0892 0.0787 0.0704 0.0535 0.0431 0.0326 0.0262 0.0132
0.05 0.7108 0.6551 0.6066 0.5644 0.5273 0.4946 0.3036 0.2185 0.1706 0.1398 0.1185 0.1028 0.0908 0.0813 0.0618 0.0499 0.0378 0.0304 0.0153
0.025 0.7551 0.7007 0.6525 0.6097 0.5719 0.5381 0.3359 0.2433 0.1906 0.1566 0.1329 0.1154 0.1020 0.0914 0.0696 0.0562 0.0426 0.0343 0.0173
0.01 0.8018 0.7500 0.7029 0.6604 0.6222 0.5878 0.3745 0.2735 0.2152 0.1773 0.1508 0.1311 0.1160 0.1040 0.0794 0.0642 0.0486 0.0392 0.0198
0.005 0.8303 0.7809 0.7351 0.6933 0.6552 0.6206 0.4012 0.2947 0.2327 0.1921 0.1636 0.1424 0.1261 0.1131 0.0864 0.0699 0.0530 0.0427 0.0217
0.001 0.8804 0.8371 0.7954 0.7560 0.7192 0.6851 0.4569 0.3401 0.2703 0.2242 0.1915 0.1670 0.1481 0.1331 0.1019 0.0826 0.0628 0.0506 0.0257
*Entries in the table are the beta values B(α, p/2, (n-p-1)/2) for an area (Alpha probability) in the upper tail of the beta distribution for the given parameter values
of p/2 and (n-p-1)/2.
Table A.5 (continued): Percentage points of the beta distribution.*
(n-p-1)/2
p/2 Alpha 5 6 7 8 9 10 20 30 40 50 60 70 80 90 120 150 200 250 500
4.5 0.999 0.0834 0.0727 0.0644 0.0579 0.0526 0.0481 0.0262 0.0180 0.0137 0.0111 0.0093 0.0080 0.0070 0.0063 0.0047 0.0038 0.0029 0.0023 0.0011
0.75 0.3620 0.3220 0.2901 0.2640 0.2422 0.2238 0.1271 0.0888 0.0683 0.0554 0.0467 0.0403 0.0354 0.0316 0.0239 0.0192 0.0145 0.0116 0.0059
0.1 0.6787 0.6241 0.5770 0.5362 0.5006 0.4693 0.2874 0.2068 0.1614 0.1324 0.1122 0.0973 0.0859 0.0769 0.0585 0.0472 0.0357 0.0287 0.0145
0.05 0.7311 0.6771 0.6297 0.5880 0.5512 0.5185 0.3234 0.2343 0.1836 0.1509 0.1281 0.1113 0.0983 0.0881 0.0671 0.0542 0.0411 0.0330 0.0167
0.025 0.7728 0.7204 0.6735 0.6317 0.5942 0.5607 0.3555 0.2593 0.2040 0.1680 0.1428 0.1242 0.1099 0.0985 0.0752 0.0608 0.0461 0.0371 0.0188
0.01 0.8165 0.7669 0.7215 0.6802 0.6427 0.6087 0.3938 0.2897 0.2288 0.1890 0.1610 0.1402 0.1242 0.1114 0.0851 0.0689 0.0523 0.0421 0.0214
0.005 0.8430 0.7960 0.7520 0.7115 0.6743 0.6403 0.4203 0.3109 0.2464 0.2040 0.1740 0.1517 0.1344 0.1207 0.0923 0.0748 0.0568 0.0458 0.0232
0.001 0.8896 0.8487 0.8089 0.7710 0.7354 0.7022 0.4752 0.3561 0.2842 0.2363 0.2022 0.1767 0.1568 0.1410 0.1082 0.0878 0.0668 0.0539 0.0274
5.0 0.999 0.0898 0.0799 0.0721 0.0656 0.0602 0.0331 0.0229 0.0175 0.0141 0.0119 0.0102 0.0090 0.0080 0.0060 0.0049 0.0037 0.0029 0.0015
0.75 0.3507 0.3173 0.2898 0.2668 0.2471 0.1424 0.1001 0.0771 0.0628 0.0529 0.0457 0.0403 0.0360 0.0272 0.0219 0.0165 0.0133 0.0067
0.1 0.6458 0.5995 0.5590 0.5234 0.4920 0.3059 0.2215 0.1735 0.1426 0.1210 0.1051 0.0929 0.0832 0.0634 0.0512 0.0388 0.0312 0.0158
0.05 0.6965 0.6502 0.6091 0.5726 0.5400 0.3418 0.2493 0.1961 0.1615 0.1373 0.1194 0.1057 0.0947 0.0723 0.0584 0.0443 0.0357 0.0181
0.025 0.7376 0.6921 0.6511 0.6143 0.5810 0.3738 0.2745 0.2167 0.1789 0.1524 0.1327 0.1175 0.1054 0.0805 0.0652 0.0494 0.0398 0.0202
0.01 0.7817 0.7378 0.6976 0.6609 0.6274 0.4118 0.3049 0.2418 0.2002 0.1708 0.1489 0.1320 0.1185 0.0908 0.0735 0.0558 0.0450 0.0229
0.005 0.8091 0.7668 0.7275 0.6913 0.6579 0.4379 0.3262 0.2595 0.2153 0.1840 0.1606 0.1424 0.1280 0.0981 0.0795 0.0604 0.0488 0.0248
0.001 0.8587 0.8206 0.7841 0.7496 0.7173 0.4920 0.3712 0.2974 0.2479 0.2124 0.1859 0.1652 0.1486 0.1142 0.0928 0.0706 0.0570 0.0290
6.0 0.999 0.1120 0.1016 0.0929 0.0857 0.0482 0.0335 0.0257 0.0209 0.0176 0.0152 0.0133 0.0119 0.0090 0.0072 0.0055 0.0044 0.0022
0.75 0.3663 0.3368 0.3117 0.2902 0.1717 0.1220 0.0946 0.0773 0.0653 0.0566 0.0499 0.0446 0.0339 0.0273 0.0206 0.0166 0.0084
0.1 0.6377 0.5982 0.5631 0.5317 0.3397 0.2490 0.1964 0.1621 0.1380 0.1202 0.1064 0.0954 0.0729 0.0590 0.0448 0.0361 0.0183
0.05 0.6848 0.6452 0.6096 0.5774 0.3754 0.2772 0.2195 0.1817 0.1550 0.1351 0.1197 0.1075 0.0823 0.0666 0.0506 0.0408 0.0207
0.025 0.7233 0.6842 0.6486 0.6162 0.4070 0.3026 0.2405 0.1995 0.1705 0.1488 0.1320 0.1186 0.0909 0.0737 0.0560 0.0452 0.0230
0.01 0.7651 0.7271 0.6920 0.6597 0.4443 0.3330 0.2659 0.2213 0.1894 0.1655 0.1470 0.1322 0.1015 0.0824 0.0627 0.0506 0.0257
0.005 0.7915 0.7546 0.7201 0.6882 0.4698 0.3542 0.2838 0.2366 0.2028 0.1774 0.1577 0.1419 0.1091 0.0886 0.0675 0.0545 0.0278
0.001 0.8401 0.8062 0.7738 0.7432 0.5222 0.3987 0.3217 0.2695 0.2317 0.2032 0.1809 0.1630 0.1257 0.1023 0.0781 0.0631 0.0322
7.0 0.999 0.1315 0.1209 0.1119 0.0642 0.0451 0.0348 0.0283 0.0239 0.0206 0.0182 0.0162 0.0123 0.0099 0.0075 0.0060 0.0030
0.75 0.3782 0.3518 0.3289 0.1994 0.1431 0.1117 0.0915 0.0776 0.0673 0.0594 0.0532 0.0405 0.0327 0.0247 0.0199 0.0101
0.1 0.6309 0.5965 0.5654 0.3700 0.2742 0.2177 0.1805 0.1541 0.1345 0.1192 0.1071 0.0821 0.0665 0.0506 0.0408 0.0207
0.05 0.6750 0.6404 0.6090 0.4054 0.3027 0.2413 0.2006 0.1716 0.1499 0.1331 0.1196 0.0918 0.0745 0.0567 0.0457 0.0233
0.025 0.7114 0.6771 0.6457 0.4365 0.3281 0.2626 0.2188 0.1874 0.1640 0.1457 0.1311 0.1008 0.0818 0.0623 0.0503 0.0256
0.01 0.7512 0.7177 0.6866 0.4729 0.3584 0.2882 0.2408 0.2067 0.1811 0.1611 0.1451 0.1118 0.0909 0.0693 0.0560 0.0286
0.005 0.7766 0.7439 0.7132 0.4977 0.3794 0.3060 0.2563 0.2204 0.1933 0.1721 0.1551 0.1196 0.0973 0.0743 0.0600 0.0307
0.001 0.8241 0.7936 0.7645 0.5485 0.4234 0.3439 0.2894 0.2496 0.2194 0.1957 0.1766 0.1366 0.1114 0.0851 0.0689 0.0353
*Entries in the table are the beta values B(α, p/2, (n-p-1)/2) for an area (Alpha probability) in the upper tail of the beta distribution for the given parameter values
of p/2 and (n-p-1)/2.
Table A.5 (continued): Percentage points of the beta distribution.*
(n-p-1)/2
p/2 Alpha 9 10 20 30 40 50 60 70 80 90 120 150 200 250 500
8.0 0.999 0.1487 0.1381 0.0809 0.0573 0.0444 0.0362 0.0306 0.0265 0.0233 0.0209 0.0158 0.0128 0.0096 0.0077 0.0039
0.75 0.3877 0.3638 0.2255 0.1635 0.1282 0.1055 0.0896 0.0779 0.0689 0.0617 0.0471 0.0381 0.0288 0.0232 0.0118
0.1 0.6250 0.5945 0.3974 0.2976 0.2377 0.1979 0.1694 0.1481 0.1316 0.1184 0.0909 0.0738 0.0562 0.0454 0.0231
0.05 0.6666 0.6360 0.4323 0.3262 0.2616 0.2183 0.1873 0.1640 0.1458 0.1313 0.1010 0.0821 0.0626 0.0505 0.0258
0.025 0.7012 0.6708 0.4628 0.3516 0.2831 0.2368 0.2035 0.1784 0.1588 0.1430 0.1103 0.0897 0.0684 0.0553 0.0282
0.01 0.7393 0.7094 0.4984 0.3817 0.3087 0.2591 0.2231 0.1959 0.1745 0.1574 0.1216 0.0990 0.0756 0.0612 0.0313
0.005 0.7638 0.7344 0.5226 0.4025 0.3266 0.2747 0.2369 0.2082 0.1857 0.1676 0.1296 0.1056 0.0808 0.0654 0.0335
0.001 0.8101 0.7824 0.5718 0.4458 0.3644 0.3078 0.2663 0.2347 0.2097 0.1895 0.1470 0.1201 0.0920 0.0745 0.0382
9.0 0.999 0.1639 0.0978 0.0698 0.0543 0.0445 0.0376 0.0326 0.0288 0.0258 0.0196 0.0158 0.0120 0.0096 0.0049
0.75 0.3954 0.2500 0.1830 0.1443 0.1191 0.1015 0.0883 0.0782 0.0702 0.0537 0.0434 0.0330 0.0266 0.0135
0.1 0.6198 0.4224 0.3194 0.2566 0.2144 0.1841 0.1613 0.1435 0.1292 0.0995 0.0809 0.0617 0.0499 0.0255
0.05 0.6594 0.4567 0.3479 0.2807 0.2351 0.2023 0.1775 0.1581 0.1425 0.1099 0.0895 0.0683 0.0553 0.0282
0.025 0.6924 0.4867 0.3732 0.3022 0.2538 0.2187 0.1921 0.1713 0.1545 0.1194 0.0973 0.0744 0.0602 0.0308
0.01 0.7290 0.5214 0.4031 0.3279 0.2762 0.2385 0.2099 0.1873 0.1692 0.1310 0.1069 0.0818 0.0662 0.0339
0.005 0.7526 0.5449 0.4236 0.3458 0.2919 0.2525 0.2224 0.1987 0.1795 0.1392 0.1137 0.0871 0.0705 0.0362
0.001 0.7977 0.5925 0.4663 0.3833 0.3251 0.2821 0.2491 0.2229 0.2018 0.1570 0.1284 0.0985 0.0799 0.0411
10 0.999 0.1148 0.0826 0.0645 0.0530 0.0449 0.0390 0.0345 0.0309 0.0235 0.0190 0.0144 0.0116 0.0059
0.75 0.2732 0.2017 0.1599 0.1325 0.1131 0.0986 0.0875 0.0786 0.0602 0.0488 0.0371 0.0299 0.0152
0.1 0.4452 0.3397 0.2744 0.2301 0.1981 0.1739 0.1549 0.1397 0.1079 0.0879 0.0671 0.0543 0.0278
0.05 0.4790 0.3682 0.2986 0.2511 0.2166 0.1904 0.1698 0.1533 0.1186 0.0967 0.0739 0.0599 0.0307
0.025 0.5083 0.3933 0.3202 0.2699 0.2332 0.2053 0.1833 0.1656 0.1283 0.1047 0.0802 0.0649 0.0333
0.01 0.5422 0.4228 0.3459 0.2925 0.2532 0.2232 0.1996 0.1805 0.1401 0.1145 0.0878 0.0712 0.0365
0.005 0.5651 0.4431 0.3637 0.3082 0.2672 0.2359 0.2111 0.1910 0.1485 0.1215 0.0932 0.0756 0.0389
0.001 0.6113 0.4851 0.4010 0.3413 0.2970 0.2627 0.2356 0.2134 0.1665 0.1365 0.1049 0.0852 0.0439
15 0.999 0.1963 0.1462 0.1166 0.0970 0.0831 0.0726 0.0645 0.0581 0.0446 0.0363 0.0276 0.0223 0.0114
0.75 0.3712 0.2846 0.2308 0.1941 0.1675 0.1473 0.1315 0.1187 0.0920 0.0750 0.0574 0.0465 0.0239
0.1 0.5361 0.4245 0.3511 0.2991 0.2605 0.2307 0.2071 0.1878 0.1467 0.1204 0.0927 0.0754 0.0389
0.05 0.5668 0.4519 0.3753 0.3206 0.2798 0.2482 0.2230 0.2024 0.1585 0.1302 0.1004 0.0817 0.0423
0.025 0.5930 0.4758 0.3965 0.3397 0.2970 0.2638 0.2372 0.2155 0.1691 0.1391 0.1073 0.0874 0.0453
0.01 0.6230 0.5035 0.4216 0.3622 0.3174 0.2824 0.2543 0.2313 0.1818 0.1498 0.1157 0.0943 0.0490
0.005 0.6430 0.5224 0.4387 0.3778 0.3315 0.2953 0.2662 0.2422 0.1907 0.1573 0.1217 0.0992 0.0516
0.001 0.6830 0.5609 0.4743 0.4103 0.3612 0.3226 0.2913 0.2655 0.2098 0.1733 0.1344 0.1097 0.0572
*Entries in the table are the beta values B(α, p/2, (n-p-1)/2) for an area (Alpha probability) in the upper tail of the beta distribution
for the given parameter values of p/2 and (n-p-1)/2.
Table A.5 (continued): Percentage points of the beta distribution.*
(n-p-1)/2
p/2 Alpha 30 40 50 60 70 80 90 120 150 200 250 500
20 0.999 0.2057 0.1670 0.1406 0.1215 0.1069 0.0955 0.0863 0.0669 0.0547 0.0419 0.0339 0.0174
0.75 0.3524 0.2913 0.2482 0.2163 0.1916 0.1720 0.1560 0.1221 0.1003 0.0772 0.0628 0.0325
0.1 0.4893 0.4122 0.3560 0.3131 0.2795 0.2523 0.2300 0.1816 0.1501 0.1164 0.0950 0.0496
0.05 0.5152 0.4358 0.3774 0.3326 0.2973 0.2688 0.2452 0.1941 0.1606 0.1247 0.1019 0.0533
0.025 0.5376 0.4564 0.3962 0.3498 0.3131 0.2834 0.2587 0.2052 0.1700 0.1322 0.1081 0.0566
0.01 0.5634 0.4804 0.4182 0.3701 0.3319 0.3007 0.2749 0.2185 0.1813 0.1411 0.1156 0.0606
0.005 0.5809 0.4967 0.4334 0.3841 0.3448 0.3127 0.2861 0.2278 0.1891 0.1474 0.1208 0.0634
0.001 0.6162 0.5303 0.4647 0.4132 0.3719 0.3380 0.3097 0.2474 0.2059 0.1609 0.1320 0.0695
25 0.999 0.2595 0.2139 0.1821 0.1586 0.1404 0.1260 0.1143 0.0894 0.0734 0.0566 0.0460 0.0238
0.75 0.4089 0.3432 0.2958 0.2599 0.2318 0.2092 0.1906 0.1505 0.1243 0.0964 0.0787 0.0411
0.1 0.5407 0.4625 0.4038 0.3583 0.3219 0.2922 0.2676 0.2134 0.1775 0.1386 0.1137 0.0598
0.05 0.5651 0.4852 0.4248 0.3777 0.3399 0.3089 0.2831 0.2263 0.1885 0.1474 0.1210 0.0638
0.025 0.5860 0.5049 0.4432 0.3947 0.3557 0.3236 0.2969 0.2378 0.1982 0.1552 0.1275 0.0674
0.01 0.6100 0.5278 0.4646 0.4146 0.3743 0.3410 0.3131 0.2514 0.2099 0.1646 0.1354 0.0717
0.005 0.6262 0.5433 0.4792 0.4283 0.3871 0.3530 0.3244 0.2608 0.2180 0.1712 0.1409 0.0747
0.001 0.6587 0.5750 0.5093 0.4567 0.4137 0.3781 0.3480 0.2808 0.2352 0.1851 0.1526 0.0812
*Entries in the table are the beta values B(α, p/2, (n-p-1)/2) for an area (Alpha probability) in the upper tail of the beta
distribution for the given parameter values of p/2 and (n-p-1)/2.
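The tabled percentage points can be checked numerically. The sketch below is illustrative code, not from the book: it assumes only the Python standard library, evaluates the regularized incomplete beta function (the beta CDF) by composite Simpson's rule, and inverts it by bisection to obtain the upper-tail point B(α, p/2, (n-p-1)/2). Function names are chosen here for illustration.

```python
from math import exp, lgamma, log

def beta_pdf(x, a, b):
    """Beta(a, b) density, computed on the log scale for stability."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1.0) * log(x) + (b - 1.0) * log(1.0 - x))

def beta_cdf(x, a, b, n=4000):
    """Regularized incomplete beta I_x(a, b) via composite Simpson's rule.

    Accurate for a, b > 1 (the density vanishes at both endpoints),
    which covers every parameter pair in Table A.5.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n  # n must be even for Simpson's rule
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0

def beta_upper_point(alpha, a, b):
    """Point with upper-tail area alpha, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Tabled entry for p/2 = 10, (n-p-1)/2 = 20, Alpha = 0.05 is 0.4790;
# the computed point should agree to about three decimal places.
print(beta_upper_point(0.05, 10.0, 20.0))
```

Any table cell can be reproduced the same way by passing Alpha, p/2, and (n-p-1)/2 as the three arguments.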
Bibliography
Agnew, J.L., and Knapp, R.C. (1995). Linear Algebra with Applications,
Brooks/Cole, Pacific Grove, CA.
Alt, F.B. (1982). "Multivariate Quality Control: State of the Art," Quality Congress
Transactions, American Society for Quality, Milwaukee, WI, pp. 886-893.
Alt, F.B., Deutch, S.J., and Walker, J.W. (1977). "Control Charts for Multivariate,
Correlated Observations," Quality Congress Transactions, American Society for
Quality, Milwaukee, WI, pp. 360-369.
Anderson, D.R., Sweeney, D.J., and Williams, T.A. (1994). Introduction to Statis-
tics Concepts and Applications, West Publishing Company, New York.
Barnett, V., and Lewis, T. (1994). Outliers in Statistical Data, 3rd ed., Wiley, New
York.
Belsley, D.A., Kuh, E., and Welsch, R.E. (1980). Regression Diagnostics: Identify-
ing Influential Data and Sources of Collinearity, Wiley, New York.
Box, G.E.P., and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and
Control, Holden-Day, San Francisco, CA.
Chatterjee, S., and Price, B. (1999). Regression Analysis by Example, 3rd ed., Wiley,
New York.
Chou, Y.M., Mason, R.L., and Young, J.C. (1999). "Power Comparisons for a
Hotelling's T2 Statistic," Commun. Statist. Simulation Comput., 28, pp. 1031-
1050.
Chou, Y.M., Mason, R.L., and Young, J.C. (2001). "The Control Chart For Individ-
ual Observations from a Multivariate Non-Normal Distribution," Comm. Statist.,
30, pp. 1937-1949.
Chou, Y.M., Polansky, A.M., and Mason, R.L. (1998). "Transforming Non-Normal
Data to Normality in Statistical Process Control," J. Quality Technology, 30, pp.
133-141.
Conover, W. J. (2000). Practical Nonparametric Statistics, 3rd ed., Wiley, New York.
David, H.A. (1970). Order Statistics, Wiley, New York.
Doganaksoy, N., Faltin, F.W., and Tucker, W.T. (1991). "Identification of Out-
of-Control Quality Characteristics in a Multivariate Manufacturing Environment,"
Comm. Statist. Theory Methods, 20, pp. 2775-2790.
Dudewicz, E.J., and Mishra, S.N. (1988). Modern Mathematical Statistics, Wiley,
New York.
Duncan, A.J. (1986). Quality Control and Industrial Statistics, 5th ed., Richard D.
Irwin, Homewood, IL.
Fuchs, C., and Kenett, R.S. (1998). Multivariate Quality Control, Dekker, New
York.
Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate
Observations, Wiley, New York.
Hawkins, D.M. (1980). Identification of Outliers, Chapman and Hall, New York.
Hawkins, D.M. (1981). "A New Test for Multivariate Normality and Homoscedas-
ticity," Technometrics, 23, pp. 105-110.
Hawkins, D. M. (1991). "Multivariate Quality Control Based on Regression-
Adjusted Variables," Technometrics, 33, pp. 61-75.
Hawkins, D.M. (1993). "Regression Adjustment for Variables in Multivariate Qual-
ity Control," J. Quality Technology, 25, pp. 170-182.
Holmes, D.S., and Mergen, A.E. (1993). "Improving the Performance of the T2
Control Chart," Quality Engrg., 5, pp. 619-625.
Hotelling, H. (1931). "The Generalization of Student's Ratio," Ann. Math. Statist.,
2, pp. 360-378.
Kourti, T., and MacGregor, J.F. (1996). "Multivariate SPC Methods for Process
and Product Monitoring," J. Quality Technology, 28, pp. 409-428.
Kshirsagar, A.M., and Young, J.C. (1971). "Correlation Between Two Hotelling's
T2," Technical Report, Department of Statistics, Southern Methodist University,
Dallas, TX.
Langley, M.P., Young, J.C., Tracy, N.D., and Mason, R.L. (1995). "A Computer
Program for Monitoring Multivariate Process Control," in Proceedings of the Sec-
tion on Quality and Productivity, American Statistical Association, Alexandria, VA,
pp. 122-123.
Little, R.J.A., and Rubin, D.B. (1987). Statistical Analysis with Missing Data,
Wiley, New York.
Looney, S.W. (1995). "How to Use Tests for Univariate Normality to Assess Mul-
tivariate Normality," Amer. Statist., 49, pp. 64-70.
Mahalanobis, P.C. (1930). "On Tests and Measures of Group Divergence." J. Proc.
Asiatic Soc. Bengal, 26, pp. 541-588.
Mardia, K.V., Kent, J.T., and Bibby, J.M. (1979). Multivariate Analysis, Academic
Press, New York.
Mason, R.L., Champ, C.W., Tracy, N.D., Wierda, S.J., and Young, J.C. (1997).
"Assessment of Multivariate Process Control Techniques," J. Quality Technology,
29, pp. 140-143.
Mason, R.L., Chou, Y.M., and Young, J. C. (2001). "Applying Hotelling's T2 Statis-
tic to Batch Processes," J. Quality Technology, 33, pp. 466-479.
Mason, R.L., Tracy, N.D., and Young, J.C. (1995). "Decomposition of T2 for Mul-
tivariate Control Chart Interpretation," J. Quality Technology, 27, pp. 99-108.
Mason, R.L., Tracy, N.D., and Young, J.C. (1996). "Monitoring a Multivariate Step
Process," J. Quality Technology, 28, pp. 39-50.
Mason, R.L., Tracy, N.D., and Young, J.C. (1997). "A Practical Approach for
Interpreting Multivariate T2 Control Chart Signals," J. Quality Technology, 29,
pp. 396-406.
Mason, R.L., and Young, J.C. (1997). "A Control Procedure for Autocorrelated
Multivariate Process," in Proceedings of the Section on Quality and Productivity,
American Statistical Association, Alexandria, VA, pp. 143-145.
Mason, R.L., and Young, J.C. (1999). "Improving the Sensitivity of the T2 Statistic
in Multivariate Process Control," J. Quality Technology, 31, pp. 155-165.
Mason, R.L., and Young, J.C. (2000). "Autocorrelation in Multivariate Processes,"
in Statistical Monitoring and Optimization for Process Control, edited by S. Park
and G. Vining, Marcel Dekker, New York, pp. 223-240.
Montgomery, D.C., and Mastrangelo, C.M. (1991). "Some Statistical Process Con-
trol Methods for Autocorrelated Data (with Discussion)," J. Quality Technology,
23, pp. 179-204.
Montgomery, D.C. (2001). Introduction to Statistical Quality Control, 5th ed., Wi-
ley, New York.
Morrison, D.F. (1990). Multivariate Statistical Methods, 3rd ed., McGraw-Hill, New
York.
Myers, R.H. (1990). Classical and Modern Regression with Applications, 2nd ed.,
Duxbury Press, Boston, MA.
Myers, R.H., and Milton, J. (1991). A First Course in the Theory of Linear Statis-
tical Models, PWS-Kent, Boston, MA.
Polansky, A.M., and Baker, E.R. (2000). "Multistage Plug-In Bandwidth Selection
for Kernel Distribution Function Estimates," J. Statist. Comput. Simulation, 65,
pp. 63-80.
Rencher, A.C. (1993). "The Contribution of Individual Variables to Hotelling's T2,
Wilks' Λ, and R2," Biometrics, 49, pp. 479-489.
Runger, G.C., Alt, F.B., and Montgomery, D.C. (1996). "Contributors to a Multi-
variate Statistical Process Control Chart Signal," Comm. Statist. Theory Methods,
25, pp. 2203-2213.
Ryan, T.P. (2000). Statistical Methods for Quality Improvement, 2nd ed., Wiley,
New York.
Scholz, F.W., and Tosch, T.J. (1994). "Small Sample Uni- and Multivariate Control
Charts for Means," in Proceedings of the American Statistical Association, Quality
and Productivity Section, American Statistical Association, Alexandria, VA, pp.
17-22.
Seber, G.A.F. (1984). Multivariate Observations, Wiley, New York.
Sharma, S. (1995). Applied Multivariate Techniques, Wiley, New York.
Sullivan, J.H., and Woodall, W.H. (1996). "A Comparison of Multivariate Control
Charts for Individual Observations," J. Quality Technology, 28, pp. 398-408.
Sullivan, J.H., and Woodall, W.H. (2000). "Change-Point Detection of Mean Vec-
tor or Covariance Matrix Shifts Using Multivariate Individual Observations," IIE
Trans., 32, pp. 537-549.
Timm, N.H. (1996). "Multivariate Quality Control Using Finite Intersection Tests,"
J. Quality Technology, 28, pp. 233-243.
Tracy, N.D., Young, J.C., and Mason, R.L. (1992). "Multivariate Control Charts
for Individual Observations," J. Quality Technology, 24, pp. 88-95.