To cite this article: Piao Chen, Bing Xing Wang & Zhi-Sheng Ye (2019) Yield-based process
capability indices for nonnormal continuous data, Journal of Quality Technology, 51:2, 171-180,
DOI: 10.1080/00224065.2019.1571342
RESEARCH ARTICLE
ABSTRACT
Process capability indices (PCIs) are widely used to assess whether an in-control process meets manufacturing specifications. In most applications of classical PCIs, the process characteristic is assumed normally distributed. However, the normal distribution has been found inappropriate in various applications. In the literature, the percentile-based PCIs are widely used to deal with the nonnormal process. One problem associated with the percentile-based PCIs is that they do not provide a quantitative interpretation to the process capability. In this study, new PCIs that have a consistent quantification to the process capability for both normal and nonnormal processes are proposed. The proposed PCIs are generalizations of the classical normal PCIs in the sense that they are the same as the classical PCIs when the process characteristic follows a normal distribution, and they offer the same interpretation to the process capability as the classical PCIs when the process characteristic is nonnormal. We then discuss nonparametric and parametric estimation of the proposed PCIs. The nonparametric estimator is based on the kernel density estimation and confidence limits are obtained by the nonparametric bootstrap, while the parametric estimator is based on the maximum likelihood estimation and confidence limits are constructed by the method of generalized pivots. The proposed methodologies are demonstrated using a real example from a manufacturing factory.

KEYWORDS
confidence limits; coverage probability; kernel estimation; nonparametric bootstrap
CONTACT Bing Xing Wang wangbingxing@163.com School of Statistics, Zhejiang Gongshang University, Hangzhou, P.R. China.
Piao Chen is now affiliated with the Institute of High Performance Computing, Singapore.
Color versions of one or more figures in the article can be found online at www.tandfonline.com/ujqt.
© 2019 American Society for Quality
172 P. CHEN, B. XING WANG, AND Z.-S. YE
also questionable in many service and transaction systems (Aldowaisan et al. 2015).

When the process characteristic X is nonnormal, a serious consequence is that the process capability measured based on $C_p$, $C_{pk}$, $C_{pm}$, and $C_{pmk}$ in Eq. [1] can be misleading (Wang et al. 2016). One natural remedy is to first identify an appropriate distribution for the process characteristic data and then use PCIs tailored for this distribution (Clements 1989; Rodriguez 1992). For a nonnormal process, the percentile-based PCIs are probably the most popular ones (Kotz and Lovelace 1998). The percentile-based PCIs were proposed by Clements (1989) and are given by

$$C_{p(q)} = \frac{\mathrm{USL} - \mathrm{LSL}}{X_{0.99865} - X_{0.00135}} \quad \text{and} \quad C_{pk(q)} = \min\left\{ \frac{\mathrm{USL} - X_{0.5}}{X_{0.99865} - X_{0.5}},\ \frac{X_{0.5} - \mathrm{LSL}}{X_{0.5} - X_{0.00135}} \right\}, \qquad [2]$$

where $X_c$ is the c percentile of X. It is obvious that if X follows the normal distribution, then $C_{p(q)} = C_p$ and $C_{pk(q)} = C_{pk}$. The other two PCIs $C_{pm(q)}$ and $C_{pmk(q)}$ could be similarly obtained (Pearn and Kotz 1994). Unlike the classical PCIs, the percentile-based PCIs do not bear a quantitative relationship with the process capability. In other words, one cannot infer the process yield based merely on the percentile-based PCIs. Moreover, processes with different distributions but the same percentile-based PCIs are very likely to have different capabilities. Therefore, percentile-based PCIs for different distributions cannot be compared directly. This is undesirable, as the PCIs are commonly used as indices for tracking performance and comparing several processes (Chen and Chen 2004). For example, the process capability is an important criterion for companies in selecting suppliers (e.g., Linn et al. 2006). If the process distributions of the suppliers are different, it would be misleading to compare the process capabilities of these suppliers based on the percentile-based PCIs.

In brief, the classical PCIs can only be used for a normal process and the process capability cannot be quantified using the percentile-based PCIs. This study aims to propose new PCIs that are capable of quantifying the process capability for both normal and nonnormal processes. The underlying idea of the proposed PCIs is to transform the process characteristic X to normality. In the literature, transforming the process characteristic to normality and then using the classical PCIs is an alternative method for the nonnormal process. The primary impediment to the transformation method is that the closeness of the transformed data to normality cannot be guaranteed. Although several transformation methods such as the Box–Cox transformation and the Johnson transformation have been proposed (Tang and Than 1999), none of these methods is exact and thus the transformation method is often not tempting for practitioners (Ryan 2011, chap. 7). On the contrary, given a known continuous CDF of X, our method to transform X to normality is exact. In addition, the proposed PCIs are generalizations of the classical PCIs as they degenerate to the classical PCIs when the process is normal, and they offer the same interpretation to the process capability as the classical PCIs when the process is nonnormal. As a result, the proposed PCIs can be safely used in various practical applications, such as the aforementioned supplier-selection problem.

Both nonparametric and parametric inference procedures for the proposed PCIs are developed. The nonparametric inference is based on the kernel density estimation, which is a popular approach to smooth nonparametric estimation of a density (Wand and Jones 1994). In addition, nonparametric confidence limits for the proposed PCIs are constructed by the nonparametric bootstrap. On the other hand, the parametric estimator is based on the maximum likelihood (ML) estimation, and confidence limits are obtained by the generalized pivotal quantity (GPQ) method. Simulation studies are used to assess the proposed inference methods.

The remainder of the article is organized as follows. The second section introduces the new PCIs and investigates their properties. The third section discusses the nonparametric inference of the proposed PCIs, and the fourth section discusses the parametric inference. The proposed PCIs and the developed methodologies are illustrated by a real example from a manufacturing factory in the fifth section. The sixth section concludes the article.

2. The proposed PCIs

As argued, the percentile-based PCIs for a nonnormal process fail to quantify the process yield, and they cannot be compared across different distribution families. This section proposes new PCIs that have a consistent quantification to the process yield. Consider a process characteristic X with CDF F(x). By the fundamental theorem of simulation, the random variable F(X) follows the standard uniform distribution U(0, 1) when $F(\cdot)$ is continuous (Casella and Berger 2002, Theorem 2.1.10). Therefore, the random variable $\Phi^{-1}(F(X))$ follows the standard normal distribution N(0, 1), where $\Phi^{-1}(\cdot)$ is the standard normal quantile.
JOURNAL OF QUALITY TECHNOLOGY 173
This means that given X with a continuous CDF, the transformation $Y = \Phi^{-1}(F(X))$ always ensures $Y \sim N(0, 1)$. This relationship motivates the definition of our new PCIs.

Assume that the process characteristic X has a continuous CDF. Let USL and LSL be the upper and lower specification limits of X. After the transformation $Y = \Phi^{-1}(F(X))$, the upper and lower specification limits of Y are then $\Phi^{-1}(F(\mathrm{USL}))$ and $\Phi^{-1}(F(\mathrm{LSL}))$, respectively. Because Y is standard normal, its PCIs can be determined using Eq. [1]. Since the process characteristic X is a transformation of Y, it is reasonable to require that the PCIs for X are the same as the classical PCIs for Y, that is,

$$C_{p(Q)} = \frac{\Phi^{-1}(F(\mathrm{USL})) - \Phi^{-1}(F(\mathrm{LSL}))}{6} \qquad [3]$$

and

$$C_{pk(Q)} = \frac{\min\left\{ \Phi^{-1}(F(\mathrm{USL})),\ -\Phi^{-1}(F(\mathrm{LSL})) \right\}}{3}. \qquad [4]$$

The PCIs with respect to $C_{pm}$ and $C_{pmk}$ can be similarly constructed as

$$C_{pm(Q)} = \frac{\Phi^{-1}(F(\mathrm{USL})) - \Phi^{-1}(F(\mathrm{LSL}))}{6\sqrt{1 + [\Phi^{-1}(F(T))]^2}} \qquad [5]$$

and

$$C_{pmk(Q)} = \frac{\min\left\{ \Phi^{-1}(F(\mathrm{USL})),\ -\Phi^{-1}(F(\mathrm{LSL})) \right\}}{3\sqrt{1 + [\Phi^{-1}(F(T))]^2}}, \qquad [6]$$

where T is the target value for X.

Because the proposed PCIs for X are essentially the same as the classical PCIs for Y, it is not surprising to find that the proposed PCIs inherit most of the good properties from the classical PCIs. Some of the key properties are summarized below, and the proofs are given in the Appendix.

Property 1. When X is normally distributed, the proposed PCIs are the same as the classical PCIs in Eq. [1] defined for normally distributed characteristics.

Property 2. If the proposed PCI $C_{pk(Q)}$ in Eq. [4] is known, the process yield $P(\mathrm{LSL} \le X \le \mathrm{USL})$ is bounded as

$$2\Phi(3C_{pk(Q)}) - 1 \le \text{Yield} \le \Phi(3C_{pk(Q)}).$$

Property 3. If the proposed PCIs $C_{p(Q)}$ and $C_{pk(Q)}$ in Eq. [3] and Eq. [4] are known, the process yield $P(\mathrm{LSL} \le X \le \mathrm{USL})$ is exactly

$$\text{Yield} = 1 - \Phi(3C_{pk(Q)} - 6C_{p(Q)}) - \Phi(-3C_{pk(Q)}).$$

The preceding properties reveal the same properties between the proposed PCIs and the classical PCIs in the case of two-sided specification limits. In fact, the case of one-sided specification limit is also common in industry. Usually, a smaller-the-better process only has an upper specification limit USL, and a larger-the-better process only has a lower specification limit LSL (e.g., Hubele et al. 2005; Wang and Tamirat 2016). In such cases, Property 4 shows that the process yield has an exact relationship with the proposed $C_{pk(Q)}$.

Property 4. Consider a process characteristic X with either an upper specification limit USL or a lower specification limit LSL. If the proposed PCI $C_{pk(Q)}$ in Eq. [4] is known, then the process yield $P(X \le \mathrm{USL})$ (or $P(X \ge \mathrm{LSL})$) is exactly

$$\text{Yield} = \Phi(3C_{pk(Q)}).$$

The intimate relations with the yield are a major advantage of the proposed PCIs over the percentile-based PCIs when X is nonnormal, as the percentile-based PCIs do not have a quantitative relationship with the process yield. In addition, because the proposed PCIs quantify the process yield regardless of the distribution families, they can be compared directly even if the underlying distributions of the processes are different. As an example, consider two process characteristics $X_1$ and $X_2$ with the same one-sided specification limit. If the distributions of $X_1$ and $X_2$ are different, the process yields of the two processes cannot be compared based on the percentile-based PCIs. On the contrary, the proposed $C_{pk(Q)}$ can be safely used for capability comparison. Next, we show that the proposed PCIs have the following invariance property.

Property 5. If $g(\cdot)$ is a strictly increasing function on the support of X, the proposed PCIs are invariant under the transformation $X \to g(X)$.

The invariance property is useful in various cases. For example, consider the lifetime X as the process characteristic. The distribution of the lifetime is often nonnormal, and the logarithm transformation of the lifetime is often seen in many practical applications (Lawless 2003). In such cases, the percentile-based PCIs based on the original data and the log-transformed data would be different, which may confuse the practitioners. On the contrary, the proposed PCIs are consistent as the logarithm transformation is monotonic.
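As a rough illustration of the definitions and properties in this section, the following Python sketch evaluates $C_{p(Q)}$ and $C_{pk(Q)}$ for a known continuous CDF, recovers the two-sided yield through Property 3, and checks the invariance of Property 5 under the logarithm. The unit-rate exponential process, the specification limits 0.05 and 5, and all function names are assumptions made only for this example; none of them comes from the article.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is Phi, inv_cdf is the standard normal quantile


def proposed_pcis(cdf, lsl, usl):
    """Cp(Q) and Cpk(Q) of Eqs. [3]-[4]; requires 0 < F(LSL) < F(USL) < 1."""
    upper = STD_NORMAL.inv_cdf(cdf(usl))  # transformed upper specification limit
    lower = STD_NORMAL.inv_cdf(cdf(lsl))  # transformed lower specification limit
    cp = (upper - lower) / 6.0
    cpk = min(upper, -lower) / 3.0
    return cp, cpk


def yield_from_pcis(cp, cpk):
    """Two-sided yield implied by Property 3."""
    return 1.0 - STD_NORMAL.cdf(3.0 * cpk - 6.0 * cp) - STD_NORMAL.cdf(-3.0 * cpk)


def exp_cdf(x):
    """CDF of a unit-rate exponential characteristic (hypothetical process)."""
    return 1.0 - exp(-x)


def log_exp_cdf(z):
    """CDF of Z = log(X) when X is unit-rate exponential."""
    return exp_cdf(exp(z))


LSL, USL = 0.05, 5.0  # made-up specification limits for the illustration

cp, cpk = proposed_pcis(exp_cdf, LSL, USL)
cp_log, cpk_log = proposed_pcis(log_exp_cdf, log(LSL), log(USL))

print("Cp(Q), Cpk(Q):", cp, cpk)
print("yield via Property 3:", yield_from_pcis(cp, cpk))
print("yield computed directly:", exp_cdf(USL) - exp_cdf(LSL))
print("PCIs after the log transform:", cp_log, cpk_log)
```

The two yield values should agree up to floating-point error, and the indices computed on the log scale should match those on the original scale, since only F(LSL) and F(USL) enter Eqs. [3] and [4].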
Figure 1. Coverage probabilities for the 95 percent nonparametric lower confidence limit when n = 50, 100, 200.
GPQs for the model parameters are available, a GPQ for a function of the model parameters can be obtained by simply plugging in the parameter GPQs, and then the confidence interval for the function can be constructed. Under mild assumptions, Hannig et al. (2006) showed that the GPQ method has asymptotically correct frequentist coverage. Successful applications of the GPQ methods have been reported in Hannig et al. (2006, 2016), among others.

The first step in applying the GPQ method to the proposed PCIs is to find GPQs for the parameters $\theta$. GPQs for parameters of the location-scale parametric distributions can be found in Krishnamoorthy and Mathew (2009, sec. 1.4.2), and GPQs for the gamma distribution are given in Appendix B. Once the GPQs of $\theta$, denoted by $G_\theta$, are available, they could be plugged into Eqs. [3]–[6] to obtain the GPQs of the proposed PCIs. For instance, the GPQ of $C_{p(Q)}$ can be constructed as

$$G_{C_{p(Q)}} = \frac{\Phi^{-1}(F(\mathrm{USL}; G_\theta)) - \Phi^{-1}(F(\mathrm{LSL}; G_\theta))}{6}, \qquad [8]$$

and the $\alpha$ percentile of this can then be used for constructing a $100(1-\alpha)$ percent lower confidence limit of $C_{p(Q)}$. Although the exact distribution of Eq. [8] may not be easy to obtain, it can be estimated through the Monte Carlo simulation. Algorithm 2 summarizes the procedure for constructing the lower confidence limit of $C_{p(Q)}$. Similar procedures can be applied to construct lower confidence limits of the other proposed PCIs.

Algorithm 2. Constructing a parametric lower confidence limit for $C_{p(Q)}$.

Step 1. Generate B realizations of $\theta$ from the distributions of their GPQs $G_\theta$, and denote them as $(\theta^{(1)}, \ldots, \theta^{(B)})$.
Step 2. For each $\theta^{(b)}$, $b = 1, \ldots, B$, compute the value of $C_{p(Q)}$ defined in Eq. [3] and denote it as $C_{p(Q)}^{(b)}$.
Step 3. Use the $\alpha$ percentile of $\{C_{p(Q)}^{(1)}, \ldots, C_{p(Q)}^{(B)}\}$ as the $100(1-\alpha)$ percent lower confidence limit of $C_{p(Q)}$.

4.2. Simulation study

Simulation is conducted to assess the performance of the GPQ method in constructing the lower confidence limits of the proposed PCIs. The general setting is almost identical to that in the third section, second subsection. In brief, 95 percent lower confidence limits of $C_{p(Q)}$, $C_{pk(Q)}$, and $C_{pm(Q)}$ are constructed by assuming that X follows $WB(k, 1)$, $LN(\mu, 1)$, and $GA(\vartheta, 1)$, respectively. The parameters $k$, $\mu$, and $\vartheta$ are set as 0.5, 1, ..., 4.5, 5. The respective USL and LSL are set as the 0.999 and 0.001 percentiles of the true distribution, and the target value T is set as the mean of the true distribution. In addition, B is set as 10,000 and sample sizes n = 10, 20, 50 are considered. The estimated coverage probabilities shown in Figure 2 are obtained based on 10,000 replications. As can be seen, the coverage probabilities are very close to the nominal values regardless of the sample size n.

Figure 2. Coverage probabilities for the log-normal $C_{p(Q)}$, the Weibull $C_{pk(Q)}$, and the gamma $C_{pm(Q)}$ when n = 10, 20, 50 and the nominal value is 0.95.

5. Illustrative example

An example from a manufacturing factory is used to illustrate the proposed methods. In the production process of the factory, cutting machines are used. The drill is one of the important components in the cutting machine, and drills of different sizes are needed in the production process. In this study, we focus on drills of size 1.88 mm. Currently, the factory purchases the 1.88-mm drills from two different suppliers. To
make a subsequent purchase decision, the factory is interested in knowing which supplier is more reliable. The lifetime of the 1.88-mm drill is treated as the quality characteristic and the comparison is based on the PCIs. Lifetime data for the drills are collected during the production process, as shown in Table 1. According to the process record, all the drill lifetime data are collected from an in-control process.

Based on the histograms in Figure 3, the drill lifetimes of both suppliers seem to have skewed distributions. The log-normal distribution, the Weibull distribution, and the gamma distribution are then used to fit the lifetime data. Based on the values of Akaike information criterion (AIC) in Table 2, the gamma distribution seems to provide the best fit. Because the lifetime is larger-the-better, the factory sets LSL = 80 and puts no restriction on USL, that is, USL = ∞. Given the one-sided specification limit, Property 4 in the second section shows that the process yield is exactly $\Phi(3C_{pk(Q)})$. Therefore, $C_{pk(Q)}$ defined in Eq. [4] is used for comparison. On one hand, the kernel estimator of $C_{pk(Q)}$ is obtained with the kernel function K set as the triweight function and the bandwidth h is selected by the plug-in method in Polansky and Baker (2000). Algorithm 1 is then used to construct the nonparametric lower confidence limits of $C_{pk(Q)}$. On the other hand, the gamma distribution is used to fit the lifetime data, and Algorithm 2 is used for interval estimation. All these estimation results are shown in Table 3. As seen, the difference between the nonparametric confidence limit and the nonparametric point estimate is smaller than that of the parametric estimation. This is reasonable as the nonparametric bootstrap generally requires a large sample size to ensure a satisfactory nonparametric confidence limit (see the third section, second subsection), while the parametric lower confidence limit based on the GPQ method can be well estimated regardless of the sample size (see the fourth section, second subsection). Therefore, we suggest using parametric models when the sample size is not very large (e.g., n ≤ 100). At last, all the inference results indicate that the first supplier is more reliable, and this information may help the factory to select a better supplier.

6. Conclusion

This study is the first attempt at developing yield-based PCIs for nonnormal processes. In the literature, the use of classical PCIs such as $C_p$ and $C_{pk}$ is based on the normality assumption of the process characteristic X. If X is nonnormal, the percentile-based PCIs cannot quantify the process yield, which limits their usefulness in various applications such as the supplier-selection problem. On the contrary, our proposed PCIs degenerate to the classical PCIs when X is normally distributed, and they have the same

Table 1. Lifetimes (in minutes) of 1.88-mm drill from two suppliers.
X1   135  98 114 137 138 144  99  93 115 106 132 122  94  98 127
     122 102 133 114 120  93 126 119 104 119 114 125 107  98 117
     111 106 108 127 126 135 112  94 127  99 120 120 121 122  96
     109 123 105
X2   105 105  95  87 112  80  95  97  77 103  78  87 107  96  79
      91 108  97  80  76  92  85  76  96  77  80 100  94  82 104
      91  95  93  99  99  94  84  99  91  85  86  79  89  89 100

Table 2. AIC values based on the log-normal, Weibull, and gamma distributions.
      Log-normal   Weibull   Gamma
X1    390.08       391.96    389.87
X2    335.38       337.80    335.27
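The Monte Carlo steps of Algorithm 2 can be sketched on the Table 1 lifetimes with LSL = 80 as follows. This is only a minimal sketch under a swapped-in model: it assumes a log-normal fit, because the GPQs of a normal mean and standard deviation (applied to the log lifetimes) have a simple closed form, whereas the article's analysis uses the gamma model with the GPQs of Appendix B. The function names, the seed, and the number of draws are likewise choices made for this illustration, so the resulting numbers are not expected to reproduce Table 3.

```python
import random
from math import log, sqrt
from statistics import mean, pstdev, stdev

# Lifetimes (in minutes) transcribed from Table 1.
X1 = [135, 98, 114, 137, 138, 144, 99, 93, 115, 106, 132, 122, 94, 98, 127,
      122, 102, 133, 114, 120, 93, 126, 119, 104, 119, 114, 125, 107, 98, 117,
      111, 106, 108, 127, 126, 135, 112, 94, 127, 99, 120, 120, 121, 122, 96,
      109, 123, 105]
X2 = [105, 105, 95, 87, 112, 80, 95, 97, 77, 103, 78, 87, 107, 96, 79,
      91, 108, 97, 80, 76, 92, 85, 76, 96, 77, 80, 100, 94, 82, 104,
      91, 95, 93, 99, 99, 94, 84, 99, 91, 85, 86, 79, 89, 89, 100]
LSL = 80.0  # one-sided lower specification limit from the example


def cpk_q_lognormal(mu, sigma, lsl):
    """Cpk(Q) of Eq. [4] with USL = infinity when log(X) ~ N(mu, sigma^2)."""
    # Phi^{-1}(F(LSL)) = (log(LSL) - mu) / sigma, so Cpk(Q) is minus that over 3.
    return (mu - log(lsl)) / (3.0 * sigma)


def point_estimate(data, lsl):
    """ML plug-in estimate under the assumed log-normal model."""
    y = [log(x) for x in data]
    return cpk_q_lognormal(mean(y), pstdev(y), lsl)


def gpq_lower_limit(data, lsl, alpha=0.05, draws=10000, seed=2019):
    """Alpha percentile of Monte Carlo draws of the GPQ of Cpk(Q)."""
    rng = random.Random(seed)
    y = [log(x) for x in data]
    n, ybar, s = len(y), mean(y), stdev(y)
    values = []
    for _ in range(draws):
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)  # chi-square with n - 1 df
        z = rng.gauss(0.0, 1.0)
        g_sigma = s * sqrt((n - 1) / chi2)           # GPQ of sigma
        g_mu = ybar - z * g_sigma / sqrt(n)          # GPQ of mu
        values.append(cpk_q_lognormal(g_mu, g_sigma, lsl))
    values.sort()
    return values[int(alpha * draws)]


est1, est2 = point_estimate(X1, LSL), point_estimate(X2, LSL)
lcl1, lcl2 = gpq_lower_limit(X1, LSL), gpq_lower_limit(X2, LSL)
print("supplier 1: estimate", est1, "95% lower limit", lcl1)
print("supplier 2: estimate", est2, "95% lower limit", lcl2)
```

Replacing the two GPQ lines inside the loop with draws of the gamma-parameter GPQs, and `cpk_q_lognormal` with the gamma counterpart, would bring the sketch closer to the procedure described in the text.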
distribution: Censored and uncensored cases. Environmetrics 27 (8):479–93.
Lawless, J. F. 2003. Statistical models and methods for lifetime data. New York, NY: John Wiley & Sons.
Linn, R. J., F. Tsung, and L. W. C. Ellis. 2006. Supplier selection based on process capability and price analysis. Quality Engineering 18 (2):123–9.
Mathew, T., G. Sebastian, and K. Kurian. 2007. Generalized confidence intervals for process capability indices. Quality and Reliability Engineering International 23 (4):471–81.
Pearn, W., and S. Kotz. 1994. Application of Clements' method for calculating second- and third-generation process capability indices for non-normal Pearsonian populations. Quality Engineering 7 (1):139–45.
Pearn, W., and M.-H. Shu. 2003. Lower confidence bounds with sample size information for Cpm applied to production yield assurance. International Journal of Production Research 41 (15):3581–99.
Perakis, M., and E. Xekalaki. 2016. On the relationship between process capability indices and the proportion of conformance. Quality Technology & Quantitative Management 13 (2):207–20.
Polansky, A. M. 2014. Assessing the capability of a manufacturing process using nonparametric Bayesian density estimation. Journal of Quality Technology 46 (2):150–70.
Polansky, A. M., and E. R. Baker. 2000. Multistage plug-in bandwidth selection for kernel distribution function estimates. Journal of Statistical Computation and Simulation 65 (1–4):63–80.
Rodriguez, R. N. 1992. Recent developments in process capability analysis. Journal of Quality Technology 24 (4):176–87.
Ryan, T. P. 2011. Statistical methods for quality improvement. New York, NY: John Wiley & Sons.
Shao, J., and D. Tu. 2012. The jackknife and bootstrap. New York, NY: Springer Science & Business Media.
Tang, L. C., and S. E. Than. 1999. Computing process capability indices for nonnormal data: A review and comparative study. Quality and Reliability Engineering International 15 (5):339–53.
Wand, M. P., and M. C. Jones. 1994. Kernel smoothing. Boca Raton, FL: CRC Press.
Wang, B. X., and F. Wu. 2017. Inference on the gamma distribution. Technometrics 60 (2):235–44.
Wang, F.-K., and Y. Tamirat. 2016. Multiple comparisons with the best for process selection for linear profiles with one-sided specifications. Quality and Reliability Engineering International 32 (2):697–704.
Wang, H., J. Yang, and S. Hao. 2016. Two inverse normalizing transformation methods for the process capability analysis of non-normal process data. Computers & Industrial Engineering 102:88–98.
Weber, S., T. Ressurreição, and C. Duarte. 2016. Yield prediction with a new generalized process capability index applicable to non-normal data. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 35 (6):931–42.
Weerahandi, S. 1993. Generalized confidence intervals. Journal of the American Statistical Association 88 (423):899–905.
Wu, C.-W., M. Aslam, and C.-H. Jun. 2012. Variables sampling inspection scheme for resubmitted lots based on the process capability index Cpk. European Journal of Operational Research 217 (3):560–6.

Appendix A: Proofs of the properties

We first prove Property 5, which includes Property 1 as its special case. We only need to show that $C_{p(Q)}$ is invariant under the transformation $X \to g(X)$, as the invariance property of the other proposed PCIs can be proved in a similar way. Let $Z = g(X)$ and the CDF of Z be $F_Z(\cdot)$. Then,

$$\Phi^{-1}(F_Z(g(\mathrm{USL}))) = \Phi^{-1}(P(Z \le g(\mathrm{USL}))) = \Phi^{-1}(P(X \le \mathrm{USL})) = \Phi^{-1}(F(\mathrm{USL})).$$

Similarly, we can show that $\Phi^{-1}(F_Z(g(\mathrm{LSL}))) = \Phi^{-1}(F(\mathrm{LSL}))$. Therefore, $C_{p(Q)}$ for Z can be expressed as

$$\frac{\Phi^{-1}(F_Z(g(\mathrm{USL}))) - \Phi^{-1}(F_Z(g(\mathrm{LSL})))}{6} = \frac{\Phi^{-1}(F(\mathrm{USL})) - \Phi^{-1}(F(\mathrm{LSL}))}{6},$$

which completes the proof.

To prove Property 2 and Property 3, we need to show that the process yields of X and $Y = \Phi^{-1}(F(X))$ are the same, that is,

$$P(\mathrm{LSL} \le X \le \mathrm{USL}) = P(\Phi^{-1}(F(\mathrm{LSL})) \le Y \le \Phi^{-1}(F(\mathrm{USL}))).$$

Based on Theorem 2.1.10 in Casella and Berger (2002), we have $P(\mathrm{LSL} \le X \le \mathrm{USL}) = P(F(\mathrm{LSL}) \le F(X) \le F(\mathrm{USL}))$. On the other hand, since $\Phi^{-1}(\cdot)$ is strictly increasing, the preceding equality obviously holds. Because the proposed PCIs for X are the same as the classical PCIs for Y, and the classical PCIs have the same relationship with the process yield as that in Property 2 and Property 3, the proof is complete. Given Property 3, Property 4 is obvious as $C_{p(Q)} = \infty$ in the case of one-sided specification limit.

Appendix B: Gamma distribution

Consider a random variable X following a gamma distribution $GA(\vartheta, \beta)$ with shape parameter $\vartheta$ and rate parameter $\beta$. The GPQ of the shape parameter $\vartheta$ can be constructed based on the Cornish–Fisher expansion to $W = \log(\tilde{X}/\bar{X})$ (Wang and Wu 2017), where $\bar{X} = \sum_i X_i / n$ and $\tilde{X} = \left(\prod_i X_i\right)^{1/n}$. The Cornish–Fisher expansion is a useful tool to approximate the quantiles of a distribution by using its cumulants (Shao and Tu 2012). Since W is independent of the rate parameter $\beta$, its pth quantile $W_p$ can be approximated by $c_1(\vartheta) + [c_2(\vartheta)]^{1/2} Z(\vartheta, p)$, where $c_i$ is the ith cumulant of W and Z is a function of the $c_i$'s. For example, if the first five cumulants of W are used for the approximation, then

$$Z(\vartheta, p) = z_p + \frac{1}{6}\tilde{c}_3 (z_p^2 - 1) + \frac{1}{24}\tilde{c}_4 (z_p^3 - 3z_p) - \frac{1}{36}(\tilde{c}_3)^2 (2z_p^3 - 5z_p) + \frac{1}{120}\tilde{c}_5 (z_p^4 - 6z_p^2 + 3) - \frac{1}{24}\tilde{c}_3 \tilde{c}_4 (z_p^4 - 5z_p^2 + 2) + \frac{1}{324}(\tilde{c}_3)^3 (12z_p^4 - 53z_p^2 + 17),$$

where $\tilde{c}_i = c_i/(c_2)^{i/2}$ and $z_p$ is the pth quantile of a standard normal distribution. After some manipulation, it can be shown that $c_1 = \psi(\vartheta) - \psi(n\vartheta) + \log(n)$ and $c_i = \psi^{(i-1)}(\vartheta)/n^{i-1} - \psi^{(i-1)}(n\vartheta)$, $i = 2, 3, \ldots$ for W, where $\psi(x) = \Gamma'(x)/\Gamma(x)$ and $\psi^{(m)}(x) = d^m \psi(x)/dx^m$.
Since W is a monotone function of $\vartheta$ (Iliopoulos 2016) based on the observed data, a GPQ $G_\vartheta$ for $\vartheta$ can be the solution to

$$\log(\tilde{X}/\bar{X}) = c_1(G_\vartheta) + [c_2(G_\vartheta)]^{1/2} Z[G_\vartheta, F_W(W)],$$

approximately follows $N(\mu, \sigma^2)$, where the parameters have the following relationship:
1=3