Generalized Linear Model Based Monitoring Methods For High-Yield Processes

Received: 6 July 2019 Revised: 8 December 2019 Accepted: 20 February 2020
DOI: 10.1002/qre.2646
RESEARCH ARTICLE
Tahir Mahmood
Department of Systems Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong

Correspondence
Tahir Mahmood, Department of Systems Engineering and Engineering Management, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong.
Email: tmahmood5-c@my.cityu.edu.hk

Funding information
Research Grant Council of Hong Kong, Grant/Award Numbers: CityU 11213116, T32-101/15-R; National Natural Science Foundation of China, Grant/Award Number: 71532008

Abstract
Advances in technology have brought well-organized manufacturing systems that produce high-quality items. As a result, monitoring and control of products have become a challenging task for quality inspectors. Items produced by these highly efficient processes are mostly zero-defect and are modeled with zero-inflated distributions. The zero-inflated Poisson (ZIP) and zero-inflated Negative Binomial (ZINB) distributions are the most common distributions used to model high-yield and rare health-related processes. Therefore, data-based control charts under the ZIP and ZINB distributions (i.e., Y-ZIP and Y-ZINB) have been proposed for monitoring high-quality processes. Usually, a few covariates are measured in the process along with the defect counts, and generalized linear models based on the ZIP and ZINB distributions are used to estimate their parameters. In this study, we have designed monitoring structures (i.e., PR-ZIP and PR-ZINB) based on the ZIP and ZINB regression models, which monitor defect counts while accounting for a single covariate. Further, the proposed model-based charts are compared with the existing data-based charts. A simulation study is designed to assess the performance of the monitoring methods in terms of run length properties, and a case study on the number of flight delays between Atlanta and Orlando during 2012–2014 is also provided to highlight the importance of the stated research.
KEYWORDS
Pearson residuals, statistical process control, zero-defect, zero-inflated negative binomial regression, zero-inflated Poisson regression
1 | INTRODUCTION
Count process data are discrete, non-negative integers and are most commonly modeled by the Poisson and Negative Binomial (NB) distributions. The Poisson distribution is known as an equi-dispersed distribution because its mean equals its variance, and the standard c and u control charts are well-known charts designed on the basis of the Poisson distribution. The NB distribution, on the other hand, is known as an over-dispersed distribution because its variance is higher than its mean. Ohta et al.1 proposed the CCC-r control chart based on the NB distribution. Often, count data sets contain more zeros than the ordinary Poisson distribution inherently allows. In such a situation, the excess of zeros in the sample may violate the equi-dispersion assumption, and ordinary models provide biased and inadequate estimates. Therefore, to overcome the effect of excess zeros, Lambert2 introduced the zero-inflated Poisson (ZIP) distribution as an alternative to the ordinary Poisson distribution. Similarly, the zero-inflated Negative Binomial (ZINB) distribution is an alternative to the ordinary NB distribution (cf. McCullagh and Nelder3). In the literature, a process is said to be a zero-defect process if the variable of interest contains a large number of zeros. Moreover, in industrial terminology, a zero-defect process is referred to as a high-yield process, and in healthcare studies, it is related to rare health-related events.
© 2020 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/qre Qual Reliab Engng Int. 2020;36:1570–1591.
Usually, two types of variation are present in the quality characteristics of any process: natural cause variation and assignable cause variation (cf. Adegoke et al.4). Natural variations are an inherent part of the process, while assignable cause variations need to be monitored and removed. Control charts are the most widely used tool for distinguishing such variations in a process. A typical control chart structure depends on two decision lines, the upper control limit (UCL) and the lower control limit (LCL). A process is declared in-control (IC) or stable when the sample points lie within the decision lines, whereas it is declared out-of-control (OOC) or unstable when the sample points exceed the decision lines (cf. Adegoke et al.5). Moreover, when the plotting statistic is the number of defects or the number of defects per unit, the control chart is known as a c chart or a u chart, respectively. A detailed review of these charts can be found in Saghir and Lin6 and Ali et al.7.
In monitoring a zero-defect process, the conventional u chart produces a high false alarm rate, even when the exact probability limits of the ordinary Poisson distribution are used. For such cases, Xie and Goh8 proposed a Shewhart mechanism based on the ZIP distribution, which was further extended by Xie et al.9 using a combined version of the c and CCC control charts and by Chang and Gan10 using a combined version of the g and CCC control charts. The effect of estimation errors in the Shewhart ZIP control chart was discussed by He et al.11. Fatahi et al.12 proposed an exponentially weighted moving average (EWMA) structure for monitoring ZIP parameters, which was further extended into a combined EWMA structure by Leong and Tan13. He et al.14 introduced a combined cumulative sum (CUSUM) structure to monitor a ZIP process. Usually, control charts are designed by assuming independence between observations of the ZIP process, but in some real scenarios, the ZIP observations are autocorrelated. To address this issue, Rakitzis et al.15 and Huh et al.16 proposed control charting structures for detecting shifts in the mean based on the ZIPINAR(1), ZIPINARCH(1), INGARCH(1,1) and ZIP INGARCH(1,1) models. Kim and Lee17 created an R package known as “attrCUSUM” to implement VSI-CUSUM control charts for the ZINB and ZIP models. Recently, Tian et al.18 proposed charts based on the threshold-Poisson distribution, and Mukherjee and Rakitzis19 proposed progressive monitoring procedures based on Wald, max and likelihood ratio-based statistics for the ZIP process. Finally, a comprehensive review of these charts can be found in Mahmood and Xie20.
In practice, along with the variable of interest, some linearly related extra information, called a covariate, is also observed in a process. If the variable of interest follows the Poisson distribution, then the data with covariates are modeled by the Poisson regression model, and when the variable of interest follows the NB distribution, then the Negative Binomial regression model is used to fit the data. In the statistical process monitoring literature, control charts based on regression models are referred to as model-based control charts. There is a vast literature on model-based control charts, but in this study, we focus mainly on count-related models such as the Poisson and NB models. Skinner et al.21 proposed a control chart, named the r-chart, based on the generalized linear model (GLM) when the response variable follows a Poisson distribution. Further, Asgari et al.22 suggested a new link function for the Poisson regression and proposed the SR-chart based on the standardized residuals of the Poisson regression. They also extended the r-chart and SR-chart to the EWMA structure to improve efficiency for small shifts. Alencar et al.23 proposed a likelihood ratio based CUSUM control chart for monitoring the parameter of the NB model. Recently, Maleki et al.24 studied the effect of parameter estimation on monitoring methods based on the Poisson regression profile, and Kinat et al.25 proposed control charts for inverse-Gaussian regression profiles. A comprehensive overview of model-based control charts is provided in Maleki et al.26.
The above-mentioned model-based studies assume that there is no zero-inflation in the response variable. However, in most real scenarios, the count response variable suffers from zero-inflation, and the conventional Poisson and NB regression models provide misleading estimates. To overcome the biased estimates, the zero-inflated versions of the Poisson (ZIP) and Negative Binomial (ZINB) regression models are the best alternatives for obtaining adequate estimates of the process. These models are widely used to fit real data sets from several sectors, such as manufacturing (Li et al.27, Zhang et al.28), healthcare (Lee et al.29, Gauran et al.30), and business and finance (Falk and Hagsten31, Metulini et al.32). Motivated by the ZIP and ZINB regression models, we have designed PR-charts based on the Pearson residuals of the ZIP and ZINB regression models (i.e., the PR-ZIP and PR-ZINB control charts), following Asgari et al.22, and compared these charts with the existing data-based control charts for the ZIP and ZINB distributions (i.e., the Y-ZIP and Y-ZINB control charts), following Xie and Goh8.
The rest of the article is organized as follows. Section 2 describes the count models used in this study. Section 3 gives the model selection procedure. Section 4 presents the structure of the proposed control charts based on zero-inflated models, and Section 5 discusses the performance evaluations. Section 6 provides a comparison of the proposed charts. Section 7 presents the case study on the number of flight delays between Atlanta and Orlando during 2012–2014. Finally, Section 8 states the summary, conclusions and recommendations.
2 | THE COMMON HIGH-YIELD MODELS
Linear models are the most commonly used methods and provide good estimates under the assumption that their residuals follow a normal distribution. When a continuous response variable is skewed, a transformation of the response variable can produce residuals that follow an approximately normal distribution. However, the response variable of interest is often a count of the occurrences of an event, or otherwise discrete in nature, and then a transformation does not yield normally distributed residuals. Hence, key assumptions of the ordinary least-squares model (i.e., homoscedasticity, normality, and linearity) are violated, and using such a model gives biased and inefficient estimates, which lead to false conclusions. Therefore, to avoid such issues, count models have been established to obtain estimates when the response variable is discrete. Several count models exist, but in this study, we discuss the zero-inflated versions of the Poisson and Negative Binomial models.
2.1 | The zero-inflated Poisson model
The Poisson distribution is the most commonly used probability distribution for modeling count data because it inherently allows zero counts and positive skewness and is easy to use and interpret (Montgomery33). Often, count data sets contain more zeros than the ordinary Poisson model inherently allows. In such a situation, the excess of zeros in the sample may violate the equi-dispersion assumption, and the Poisson model provides biased and inadequate estimates. Therefore, to overcome the effect of excess zeros, Lambert2 introduced the zero-inflated Poisson (ZIP) model as an alternative to the ordinary Poisson model. The ZIP model assumes that the outcomes emanate from two processes: (i) the first process generates zeros, of which a proportion $(1-p)e^{-\lambda_i}$ comes from the Poisson distribution and a proportion $p$ comes from zero-inflation; and (ii) the second process generates the nonzero counts from the zero-truncated Poisson model. Thus, the model can be formulated as follows:
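$$
P(Y_i = y_i \mid X_i, Z_i) =
\begin{cases}
P_i + (1 - P_i)\,e^{-\lambda_i}, & \text{if } y_i = 0,\\[4pt]
(1 - P_i)\,\dfrac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}, & \text{if } y_i > 0,
\end{cases}
\tag{1}
$$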
where $Z_i$ is the vector of covariates defining the probability of excess zeros $P_i$. The conditional mean and the conditional variance of the ZIP model are $E(Y_i \mid X_i, Z_i) = (1 - P_i)\lambda_i$ and $\mathrm{Var}(Y_i \mid X_i, Z_i) = (1 - P_i)(\lambda_i + P_i\lambda_i^2)$. Note that when $P_i \to 0$ the ZIP model reduces to the ordinary Poisson model; otherwise, it is over-dispersed because its variance exceeds its mean. The Negative Binomial model, which is discussed in the next section, attributes over-dispersion to heterogeneity in the data, whereas the ZIP model attributes over-dispersion to the splitting of the data into two statistical processes because of the excess zeros.
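As a quick numerical check of the ZIP probability function and its moments, the following Python sketch evaluates Equation (1) directly and compares the resulting mean and variance with the closed forms stated above. The values p = 0.3 and λ = 2.5 and all function names are illustrative assumptions, not taken from the study.

```python
import math

def zip_pmf(y, p, lam):
    """P(Y = y) for a ZIP variable with excess-zero probability p and Poisson rate lam."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return p + (1 - p) * poisson if y == 0 else (1 - p) * poisson

def zip_moments(p, lam):
    """Closed-form conditional mean and variance of the ZIP model."""
    mean = (1 - p) * lam
    var = (1 - p) * (lam + p * lam ** 2)
    return mean, var

# Illustrative parameter values (assumed for this sketch only)
p, lam = 0.3, 2.5
support = range(0, 200)  # the tail beyond 200 is negligible for this rate
total = sum(zip_pmf(y, p, lam) for y in support)
mean_num = sum(y * zip_pmf(y, p, lam) for y in support)
var_num = sum(y * y * zip_pmf(y, p, lam) for y in support) - mean_num ** 2
mean, var = zip_moments(p, lam)
print(total, mean_num, mean, var_num, var)
```

The numerically summed moments should agree with the closed forms up to floating-point error, which also confirms that the variance exceeds the mean whenever p > 0.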
In the ZIP model, $P_i$ is modeled using a logit model:
$$
P_i = \frac{\exp(Z_i'\gamma)}{1 + \exp(Z_i'\gamma)},
$$
where the vector $Z_i' = (z_{i,0}, z_{i,1}, \ldots, z_{i,m})$ contains $m$ covariates defining the probability of excess zeros $P_i$, and $\gamma' = (\gamma_0, \gamma_1, \ldots, \gamma_m)$ is the vector of $m$ unknown parameters. Usually, the vector $Z_i$ includes elements of $X_i$, and the logit model can be replaced by a probit specification. Moreover, the parameter $P_i$ can also be related to $\lambda_i$, but in this study we assume that $y_1, y_2, \ldots, y_n$ are independent and that $P_i$ is not related to $\lambda_i$. Thus, the likelihood function for the response variable can be defined as:
$$
l(Y, \beta, \gamma) = \prod_{y_i = 0}\left[P_i + (1 - P_i)\,e^{-\lambda_i}\right]\prod_{y_i > 0}\left[(1 - P_i)\,\frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}\right],
$$
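and, with $\lambda_i = \exp(X_i'\beta)$, the log-likelihood function is defined as:
$$
\ln l(Y, \beta, \gamma) = \sum_{y_i = 0}\ln\left[\exp(Z_i'\gamma) + \exp\left(-\exp(X_i'\beta)\right)\right] + \sum_{y_i > 0}\left[y_i X_i'\beta - \exp(X_i'\beta) - \ln(y_i!)\right] - \sum_{i=1}^{n}\ln\left[1 + \exp(Z_i'\gamma)\right].
$$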
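Let $I_i$ be an indicator variable defined as follows:
$$
I_i =
\begin{cases}
1, & \text{if } y_i = 0,\\
0, & \text{if } y_i > 0.
\end{cases}
$$
Then the log-likelihood function becomes
$$
\ln l(Y, \beta, \gamma) = \sum_{i=1}^{n}\left\{ I_i \ln\left[\exp(Z_i'\gamma) + \exp\left(-\exp(X_i'\beta)\right)\right] + (1 - I_i)\left[y_i X_i'\beta - \exp(X_i'\beta) - \ln(y_i!)\right] - \ln\left[1 + \exp(Z_i'\gamma)\right]\right\}.
$$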
Further, this log-likelihood function can be maximized using the EM algorithm. The interpretation of the parameter λ is the same as in Poisson regression, but the parameter γ can be interpreted through odds ratios (for more details, see Lambert2). In regression analysis, residual analysis is usually employed to assess the fitted model. Examination of residuals is used to check for model misspecification and departures from the error and variance assumptions, and to detect outliers. Several types of residuals are available for regression models, but for zero-inflated models, Pearson residuals (standardized ordinary residuals) may be a helpful tool for validating the fitted model (cf. Garay et al.34). The Pearson residuals can be determined by the following expression:
$$
PR_i = \frac{y_i - (1 - P_i)\lambda_i}{\sqrt{(1 - P_i)\left(\lambda_i + P_i\lambda_i^2\right)}}. \tag{2}
$$
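To make the residual computation concrete, here is a minimal Python sketch of the ZIP Pearson residual for one observation, assuming a log link for the rate and a logit link for the excess-zero probability with a single covariate in each part. The function name and the coefficient values in the usage lines are illustrative assumptions.

```python
import math

def zip_pearson_residual(y, x, z, beta, gamma):
    """Pearson residual of a ZIP regression for one observation.

    beta = (beta0, beta1) drives the Poisson rate through a log link and
    gamma = (gamma0, gamma1) drives the excess-zero probability through a logit link.
    """
    lam = math.exp(beta[0] + beta[1] * x)
    eta = gamma[0] + gamma[1] * z
    p = 1.0 / (1.0 + math.exp(-eta))          # equals exp(eta) / (1 + exp(eta))
    mean = (1 - p) * lam                      # conditional mean
    var = (1 - p) * (lam + p * lam ** 2)      # conditional variance
    return (y - mean) / math.sqrt(var)

# Illustrative coefficients and covariate values (assumed for this sketch only)
beta, gamma = (1.0, 1.5), (0.5, -0.5)
r_zero = zip_pearson_residual(0, 0.0, 3.0, beta, gamma)
r_large = zip_pearson_residual(8, 0.0, 3.0, beta, gamma)
print(r_zero, r_large)
```

A zero count gives a negative residual and a count well above the fitted mean gives a large positive one, which is the behaviour a residual-based chart relies on.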
2.2 | The zero-inflated negative binomial model
Commonly, when data are over-dispersed due to heterogeneity, the Negative Binomial (NB) distribution is used to model the data. The NB distribution is a continuous mixture of Poisson distributions in which the Poisson mean is gamma distributed; in this way, over-dispersion is modeled. Similar to the ZIP distribution, the zero-inflated Negative Binomial (ZINB) distribution is a mixture of two processes: (i) the first process generates zeros, of which a proportion $(1-p)\left(1 + \frac{\lambda_i}{\tau}\right)^{-\tau}$ comes from the NB distribution and a proportion $p$ comes from zero-inflation; and (ii) the second process generates the nonzero counts from the zero-truncated NB model. Thus, the model can be formulated as follows:
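$$
P(Y_i = y_i \mid X_i, Z_i) =
\begin{cases}
P_i + (1 - P_i)\left(1 + \dfrac{\lambda_i}{\tau}\right)^{-\tau}, & \text{if } y_i = 0,\\[8pt]
(1 - P_i)\,\dfrac{\Gamma(y_i + \tau)}{y_i!\,\Gamma(\tau)}\left(1 + \dfrac{\lambda_i}{\tau}\right)^{-\tau}\left(1 + \dfrac{\tau}{\lambda_i}\right)^{-y_i}, & \text{if } y_i > 0,
\end{cases}
\tag{3}
$$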
where $Z_i$ is the vector of covariates defining the probability of excess zeros $P_i$. The conditional mean and the conditional variance of the ZINB model are $E(Y_i \mid X_i, Z_i) = (1 - P_i)\lambda_i$ and $\mathrm{Var}(Y_i \mid X_i, Z_i) = \lambda_i(1 - P_i)(1 + P_i\lambda_i + \lambda_i/\tau)$. Note that the ZINB model reduces to the NB model when $P_i \to 0$, to the ZIP model when $1/\tau \to 0$, and to the Poisson model when both $1/\tau$ and $P_i$ approach 0.
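The ZINB probability function and its moments can be checked numerically in the same way. The sketch below evaluates the probability function on the log scale for stability and compares the summed mean and variance with the closed forms stated above. The parameter values p = 0.25, λ = 3 and τ = 2 and the function names are assumptions made only for illustration.

```python
import math

def zinb_pmf(y, p, lam, tau):
    """P(Y = y) for a ZINB variable with excess-zero probability p, NB mean lam and dispersion tau."""
    log_nb = (math.lgamma(y + tau) - math.lgamma(y + 1) - math.lgamma(tau)
              - tau * math.log1p(lam / tau) - y * math.log1p(tau / lam))
    nb = math.exp(log_nb)
    return p + (1 - p) * nb if y == 0 else (1 - p) * nb

def zinb_moments(p, lam, tau):
    """Closed-form conditional mean and variance of the ZINB model."""
    mean = (1 - p) * lam
    var = lam * (1 - p) * (1 + p * lam + lam / tau)
    return mean, var

# Illustrative parameter values (assumed for this sketch only)
p, lam, tau = 0.25, 3.0, 2.0
support = range(0, 2000)  # long enough for the heavier NB tail to be negligible
total = sum(zinb_pmf(y, p, lam, tau) for y in support)
mean_num = sum(y * zinb_pmf(y, p, lam, tau) for y in support)
var_num = sum(y * y * zinb_pmf(y, p, lam, tau) for y in support) - mean_num ** 2
mean, var = zinb_moments(p, lam, tau)
print(total, mean_num, mean, var_num, var)
```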
In practice, the parameters $\lambda_i$ and $P_i$ depend on the vectors of explanatory variables $X_i$ and $Z_i$, respectively. That is,
$$
\lambda_i = \exp(X_i'\beta); \qquad P_i = \frac{\exp(Z_i'\gamma)}{1 + \exp(Z_i'\gamma)},
$$
where the vector $X_i' = (x_{i,0}, x_{i,1}, \ldots, x_{i,m})$ contains $m$ covariates and $\beta' = (\beta_0, \beta_1, \ldots, \beta_m)$ is the vector of $m$ unknown parameters. Likewise, the vector $Z_i' = (z_{i,0}, z_{i,1}, \ldots, z_{i,m})$ contains $m$ covariates defining the probability of excess zeros $P_i$, and $\gamma' = (\gamma_0, \gamma_1, \ldots, \gamma_m)$ is the vector of $m$ unknown parameters. The ZINB log-likelihood given the observed data is obtained as follows:
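$$
\begin{aligned}
\ln l(Y, \beta, \gamma, \tau) ={}& -\sum_{i=1}^{n}\ln\left(1 + e^{Z_i'\gamma}\right) + \sum_{i:\,y_i = 0}\ln\left[e^{Z_i'\gamma} + \left(\frac{e^{X_i'\beta} + \tau}{\tau}\right)^{-\tau}\right] \\
& - \sum_{i:\,y_i > 0}\left[\tau\ln\left(\frac{e^{X_i'\beta} + \tau}{\tau}\right) + y_i\ln\left(1 + e^{-X_i'\beta}\,\tau\right)\right] \\
& - \sum_{i:\,y_i > 0}\left[\ln\Gamma(\tau) + \ln\Gamma(1 + y_i) - \ln\Gamma(\tau + y_i)\right].
\end{aligned}
$$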
The parameters of the ZINB model are estimated by the BFGS method. As discussed above, residual analysis plays a crucial role in assessing the fitted model. Therefore, for the ZINB model, the Pearson residuals are given by the following expression:
$$
PR_i = \frac{y_i - (1 - P_i)\lambda_i}{\sqrt{\lambda_i(1 - P_i)(1 + P_i\lambda_i + \lambda_i/\tau)}}. \tag{4}
$$
3 | MODEL SELECTION PROCEDURE
Usually, it is difficult to differentiate between the ZIP and ZINB models, so selecting an appropriate model for the data set is an important step. A wrong choice of model may lead to estimation errors, and a monitoring method based on it may lead practitioners to false conclusions. Therefore, it is necessary to select an appropriate model before executing the monitoring method.
In this study, we have adopted the two-step method discussed by Walters35. The two-step method is known as the LRT–Vuong method, and its procedure is given in Figure 1. The likelihood ratio test (LRT) is used to test for over-dispersion in the data, and the Vuong test is used to check for zero-inflation. Because the Poisson model is a particular case of the NB model, testing whether the Poisson model is adequate within the NB family corresponds to testing H0 : τ = 0 versus H1 : τ > 0. For a general NB regression model, the LRT for τ is defined by:
$$
\mathrm{LRT}_\tau = -2\left[\,l(Y, \hat\beta) - l(Y, \hat\beta, \hat\tau)\right],
$$
where $l(Y, \hat\beta)$ is the maximized log-likelihood under Poisson regression and $l(Y, \hat\beta, \hat\tau)$ is the maximized log-likelihood under NB regression. Under the null hypothesis, the LRT statistic is assumed to follow a chi-square distribution with one degree of freedom, and a significant LRT result provides evidence of over-dispersion.
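The LRT computation is short enough to sketch in Python. The example below uses the fact that the upper tail of a chi-square distribution with one degree of freedom can be written with the complementary error function, so only the standard library is needed. The function name and the two log-likelihood values in the usage lines are hypothetical.

```python
import math

def lrt_overdispersion(loglik_poisson, loglik_nb):
    """LRT statistic for Poisson versus NB and its chi-square(1) p-value."""
    stat = -2.0 * (loglik_poisson - loglik_nb)
    stat = max(stat, 0.0)  # guard against tiny negative values from numerical optimisation
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods, for illustration only
stat, p_value = lrt_overdispersion(-100.0, -98.0)
print(stat, p_value)
```

A p-value below the chosen significance level would be read as evidence of over-dispersion, so the NB family would be carried into the second step.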
FIGURE 1 Flow chart for the LRT–Vuong method
The Vuong test is used to test for zero-inflation in the data, that is, to compare the Poisson model with the ZIP model and the NB model with the ZINB model. Let $P(Y_i = y_i \mid X_i)$ be the predicted probability of an observed count from the standard model (e.g., the Poisson or NB model) and $P(Y_i = y_i \mid X_i, Z_i)$ be the predicted probability of an observed count from the zero-inflated model (e.g., the ZIP or ZINB model). Then the statistic $m_i$ based on their ratio is defined as:
$$
m_i = \log\left[\frac{P(Y_i = y_i \mid X_i)}{P(Y_i = y_i \mid X_i, Z_i)}\right].
$$
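Hence, the Vuong test statistic for testing $H_0 : E(m_i) = 0$ versus $H_1 : E(m_i) \neq 0$ is defined as:
$$
V = \frac{\sqrt{n}\left(\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} m_i\right)}{\sqrt{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\left(m_i - \dfrac{1}{n}\sum_{i=1}^{n} m_i\right)^{2}}}.
$$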
The Vuong statistic is asymptotically normally distributed under H0. Therefore, at the 5% level of significance, the standard model is preferred when V > 1.96, the zero-inflated model is preferred when V < −1.96, and both models are considered equivalent when |V| < 1.96.
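The Vuong statistic can be computed directly from the per-observation log predicted probabilities of the two fitted models. The following Python sketch assumes those log probabilities are already available as two lists of equal length. The function names and the toy inputs are illustrative assumptions.

```python
import math

def vuong_statistic(logp_standard, logp_zero_inflated):
    """Vuong statistic from per-observation log predicted probabilities of the two models."""
    m = [a - b for a, b in zip(logp_standard, logp_zero_inflated)]
    n = len(m)
    m_bar = sum(m) / n
    s = math.sqrt(sum((mi - m_bar) ** 2 for mi in m) / n)  # divisor n, as in the formula above
    return math.sqrt(n) * m_bar / s

def vuong_decision(v, critical=1.96):
    """Decision rule at the 5% level."""
    if v > critical:
        return "standard"
    if v < -critical:
        return "zero-inflated"
    return "equivalent"

# Toy log probabilities, for illustration only
v = vuong_statistic([-2.0, -3.0, -4.0, -5.0], [-1.0, -1.0, -1.0, -1.0])
print(v, vuong_decision(v))
```

In this toy case the zero-inflated model assigns higher probability to every observation, so the statistic is strongly negative and the zero-inflated model is preferred.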
In this diagnostic procedure, the assessment of over-dispersion is the first step (shaded yellow in Figure 1). Both the Poisson and NB models are fitted to the data, and the LRT is then used to determine whether the NB model or the Poisson model provides the better fit. If the LRT is significant in favor of the NB model, it reveals over-dispersion in the data; otherwise, no over-dispersion is assumed. Checking for zero-inflation is the second step, for which the Vuong test is used (shaded pink in Figure 1). If over-dispersion is concluded in the first step, then the NB model is compared with the ZINB model; otherwise, the Poisson model is compared with the ZIP model. If the zero-inflated model provides a better fit than its respective standard model, this is evidence of zero-inflation in the data. So, when zero-inflation exists in the data, the zero-inflated models are useful for estimation and for drawing efficient conclusions. Moreover, several score tests are also available to test for both over-dispersion and zero-inflation in data (see Ridout et al.36, Deng and Paul37, Yang et al.38). Hence, once the underlying model of the data has been identified by the LRT–Vuong method, the respective monitoring method is applied to monitor the online process.
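The two-step decision flow can be summarised as a small function. This is a sketch only: it takes the LRT p-value and the Vuong statistic as inputs, and it assumes that the standard model is retained when the Vuong test finds the two models equivalent, a tie-breaking choice that is not stated in the text.

```python
def select_count_model(lrt_p_value, vuong_v, alpha=0.05, critical=1.96):
    """Two-step LRT-Vuong selection among Poisson, NB, ZIP and ZINB."""
    # Step 1: over-dispersion decides between the Poisson and NB families
    overdispersed = lrt_p_value < alpha
    standard, inflated = ("NB", "ZINB") if overdispersed else ("Poisson", "ZIP")
    # Step 2: zero-inflation; the statistic is built as standard over zero-inflated,
    # so a strongly negative value favours the zero-inflated model
    if vuong_v < -critical:
        return inflated
    return standard  # assumption: keep the simpler model when not clearly worse

print(select_count_model(0.001, -3.2))  # over-dispersed and zero-inflated
print(select_count_model(0.40, -3.2))   # zero-inflated only
```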
4 | MONITORING METHODS BASED ON ZERO-INFLATED MODELS
Control charts are the key tool among the seven magnificent tools of statistical process control (Montgomery33). An online process is monitored through a control chart, and if any assignable cause appears in the process, the control chart signals it to the engineers in a timely manner. For high-quality processes, control charts are designed based on zero-inflated distributions and zero-inflated models, which are described in the following sections.
4.1 | The Y-ZIP control chart
The ZIP model is discussed in detail in section 2.1. When $Y_i$ follows the ZIP distribution with parameters $P_i$ and $\lambda_i$, its conditional mean and variance are defined as follows:
$$
\mu_{ZIP} = E(Y_i \mid X_i, Z_i) = (1 - P_i)\lambda_i,
$$
$$
\sigma^2_{ZIP} = \mathrm{Var}(Y_i \mid X_i, Z_i) = (1 - P_i)\left(\lambda_i + P_i\lambda_i^2\right).
$$
Xie and Goh8 proposed a Shewhart-type control chart based on the ZIP distribution and used probability limits as the decision lines. In this study, we use the same mechanism but with Lσ limits. In the Y-ZIP control chart, observations are plotted against the upper control limit, which is obtained by the following expression:
$$
UCL = \mu_{ZIP} + L_1\sqrt{\sigma^2_{ZIP}},
$$
where $L_1$ is the width of the control limit and is specified for a fixed IC average run length (ARL0). When an observation exceeds the upper control limit, the chart signals an OOC situation; otherwise, the process is considered IC.
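As a small worked example of the Y-ZIP limit, the Python sketch below computes the upper control limit from the ZIP mean and variance and flags the observations that exceed it. The parameter values P = 0.3, λ = 2.5 and L1 = 3, the observations, and the function name are illustrative assumptions.

```python
import math

def y_zip_ucl(p, lam, L1):
    """Upper control limit of the Y-ZIP chart: mean plus L1 standard deviations."""
    mu = (1 - p) * lam
    sigma2 = (1 - p) * (lam + p * lam ** 2)
    return mu + L1 * math.sqrt(sigma2)

# Illustrative values (assumed): mean 1.75 and standard deviation 1.75, so the limit is close to 7
ucl = y_zip_ucl(0.3, 2.5, 3.0)
observations = [0, 2, 0, 5, 9, 1]
signals = [i for i, y in enumerate(observations) if y > ucl]
print(ucl, signals)
```

Only the count of 9 falls above the limit here, so the chart would signal at that observation.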
4.2 | The PR-ZIP control chart
Residuals play an essential role in checking for model misspecification and departures from the error and variance assumptions, and in detecting outliers. In the PR-ZIP control chart, the Pearson residuals of the ZIP model given in equation (2) are plotted against their control limits, which are given below:
$$
UCL = \mu_{PR-ZIP} + L_2\sqrt{\sigma^2_{PR-ZIP}},
$$
$$
LCL = \mu_{PR-ZIP} - L_2\sqrt{\sigma^2_{PR-ZIP}},
$$
where $\mu_{PR-ZIP}$ and $\sigma^2_{PR-ZIP}$ are the mean and variance of the Pearson residuals, and $L_2$ is the width of the control limits for the PR-ZIP control chart, which is specified for a fixed ARL0. When any Pearson residual falls outside the limits, the PR-ZIP chart signals an OOC situation; otherwise, the process is considered IC.
4.3 | The Y-ZINB control chart
The ZINB model is briefly discussed in section 2.2. When $Y_i$ follows the ZINB distribution with parameters $P_i$, $\lambda_i$ and $\tau$, its conditional mean and variance are defined as follows:
$$
\mu_{ZINB} = E(Y_i \mid X_i, Z_i) = (1 - P_i)\lambda_i,
$$
$$
\sigma^2_{ZINB} = \mathrm{Var}(Y_i \mid X_i, Z_i) = \lambda_i(1 - P_i)(1 + P_i\lambda_i + \lambda_i/\tau).
$$
In the Y-ZINB control chart, observations are plotted against the upper control limit, which is obtained by the following expression:
$$
UCL = \mu_{ZINB} + L_3\sqrt{\sigma^2_{ZINB}},
$$
where $L_3$ is the width of the control limit and is specified for a fixed ARL0. When an observation exceeds the upper control limit, the Y-ZINB chart signals an OOC situation; otherwise, the process is considered IC.
4.4 | The PR-ZINB control chart
Similar to the PR-ZIP control chart, the PR-ZINB control chart is designed to plot the Pearson residuals of the ZINB model, which are given in equation (4). The control limits for the PR-ZINB control chart are obtained by the following expressions:
$$
UCL = \mu_{PR-ZINB} + L_4\sqrt{\sigma^2_{PR-ZINB}},
$$
$$
LCL = \mu_{PR-ZINB} - L_4\sqrt{\sigma^2_{PR-ZINB}},
$$
where $L_4$ is the width of the control limits for the PR-ZINB control chart, which is specified for a fixed ARL0. When any Pearson residual from the ZINB model falls outside the limits, the PR-ZINB chart signals an OOC situation; otherwise, the process is said to be IC.
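The two residual-based charts share the same limit structure, so one helper covers both. The Python sketch below assumes that the residual mean and standard deviation are estimated from an in-control set of Pearson residuals, using the population form of the standard deviation; both choices, the function names and the toy residuals are assumptions made for illustration.

```python
import statistics

def pr_chart_limits(ic_residuals, L):
    """Lower and upper control limits from in-control Pearson residuals and a charting constant L."""
    mu = statistics.fmean(ic_residuals)
    sigma = statistics.pstdev(ic_residuals)  # assumption: population standard deviation
    return mu - L * sigma, mu + L * sigma

def out_of_control(residuals, lcl, ucl):
    """Indices of the residuals that fall outside the control limits."""
    return [i for i, r in enumerate(residuals) if r < lcl or r > ucl]

# Toy residuals, for illustration only
lcl, ucl = pr_chart_limits([-1.0, 0.0, 1.0], L=2.0)
flagged = out_of_control([-2.0, 0.5, 1.7], lcl, ucl)
print(lcl, ucl, flagged)
```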
Note that the Y-ZIP and Y-ZINB control charts are existing schemes that are based on the zero-inflated distributions and are categorized as data-based charts, whereas the PR-ZIP and PR-ZINB control charts are the newly proposed methods, which are based on the zero-inflated regression models and are referred to as model-based charts. It would not be fair to compare the model-based charts built on the ZIP and ZINB regressions with the conventional model-based charts built on the Poisson (cf. Asgari et al.22) and NB (cf. Alencar et al.23) regression models. Hence, in this study, we compare the model-based charts with the data-based charts developed for the same zero-inflated process.
5 | PERFORMANCE EVALUATIONS
In this section, we discuss the performance measures used to compare the control charts. The analysis and diagnosis of the IC models, together with the development of the control limit coefficients, are also discussed. Moreover, to evaluate the run-length properties, different shifts are imposed on the IC model, which are also discussed in the following subsections.
5.1 | Performance measures
Saghir and Lin6 discussed several measures that are used to analyze the performance of control charts. In this comparative study, control charts are assessed using properties of the run length: the average run length (ARL), the standard deviation of the run length (SDRL) and the median of the run length (MDRL). The ARL is a very popular measure, defined as the average number of samples until a signal occurs. The ARL is categorized into the IC average run length (ARL0) and the OOC average run length (ARL1). The ARL1 is the performance measure of a chart when the process is unstable, while the ARL0 is the performance measure of a chart when the process is stable. For a fixed ARL0, the chart with the minimum ARL1 is declared the best among those under consideration (cf. Adegoke et al.39). Further, along with the ARL, some researchers prefer to report other properties of the run length: its dispersion is described by the SDRL and, because the run length distribution is skewed, some suggest reporting the MDRL. Hence, the results of this study are discussed in terms of the ARL, SDRL and MDRL.
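The three run-length measures can be illustrated with a short simulation. The Python sketch below assumes a chart whose observations are independent and whose signal probability is the same at every sample, in which case the run length is geometric; the signal probability 1/200, the number of replications and the seed are illustrative assumptions.

```python
import random
import statistics

def run_length(signal_prob, rng):
    """Number of samples until the first signal when each sample signals independently."""
    n = 1
    while rng.random() >= signal_prob:
        n += 1
    return n

rng = random.Random(1)
signal_prob = 1 / 200            # assumed per-sample signal probability
runs = [run_length(signal_prob, rng) for _ in range(5000)]

arl = statistics.fmean(runs)     # average run length
sdrl = statistics.pstdev(runs)   # standard deviation of the run length
mdrl = statistics.median(runs)   # median run length
print(arl, sdrl, mdrl)
```

Under these assumptions the ARL is close to 1/signal_prob and the median is noticeably smaller than the mean, which reflects the right-skewed run length distribution mentioned above.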
5.2 | Designing of IC models and control limit coefficients
This study is designed to monitor a process that is modeled by either the ZIP or the ZINB model. Therefore, for the ARL study, we have defined two IC models, which are discussed in the following subsections. Moreover, the control limit coefficients for the proposed charts, which follow from the IC models, are also discussed.
5.2.1 | The simulated ZIP model
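The simulated data are generated from the ZIP distribution using the following specifications:
$$
Y_i \mid X_i, Z_i =
\begin{cases}
0, & \text{if } c_i = 1,\\
\text{Poisson}(\lambda_i), & \text{if } c_i = 0,
\end{cases}
$$
$$
c_i \sim \text{Bernoulli}(p_i), \quad X_i \sim N\left(\mu_X = 0, \sigma_X^2 = 0.5\right), \quad Z_i \sim N\left(\mu_Z = 3, \sigma_Z^2 = 0.5\right),
$$
$$
\lambda_i = e^{\beta_0 + \beta_1 X_i}, \quad p_i = \frac{e^{\gamma_0 + \gamma_1 Z_i}}{1 + e^{\gamma_0 + \gamma_1 Z_i}}.
$$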
Liu et al.40 provide two possible choices of the coefficients: (i) β0 = 0.1, β1 = 1.0, γ0 = 0.5 and γ1 = −1.0; (ii) β0 = 1.0, β1 = 1.5, γ0 = 0.5 and γ1 = −0.5. The second choice yields a larger μ as well as a larger p than the first choice. Accordingly, the expected proportion of structural zeros under the second choice is 26.89%, compared with about 7.59% under the first choice. Miller41 argued that when only a small portion of zeros is available, a zero may be attributed to the Poisson model rather than categorized as an inflated zero. Hence, in terms of bias, type I error and inference, the second choice is appropriate for generating the simulated model.
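The two percentages of structural zeros can be reproduced by evaluating the logit expression for the excess-zero probability at the mean of the covariate, Z = μZ = 3. The short Python check below does exactly that; evaluating at the covariate mean is an assumption about how the figures were obtained, and the function name is illustrative.

```python
import math

def excess_zero_prob(gamma0, gamma1, z):
    """Logit-link probability of a structural zero."""
    eta = gamma0 + gamma1 * z
    return math.exp(eta) / (1.0 + math.exp(eta))

mu_z = 3.0  # mean of the covariate Z in the simulated ZIP model
first = excess_zero_prob(0.5, -1.0, mu_z)    # first choice of coefficients
second = excess_zero_prob(0.5, -0.5, mu_z)   # second choice of coefficients
print(round(100 * first, 2), round(100 * second, 2))  # 7.59 26.89
```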
To perform a comparative study based on the simulated ZIP model, we generated 1000 samples (i.e., n = 1000) using the second choice of coefficients. The histogram of the generated response variable is plotted in Figure 2(a). The response variable consists of 362 zeros, 148 ones, 112 twos, 76 threes, 81 fours and 221 other values. For estimation, we fitted the Poisson, NB, ZIP and ZINB models to the simulated data; the results, together with the diagnostic analysis, are reported in Table 1. Although the LR test comparing the Poisson and NB models provides evidence that the NB model gives reasonable estimates, the LR test comparing the ZIP and ZINB models shows that the ZIP model is more appropriate. Further, the minimum AIC = 3586.57 and BIC = 3606.21 also favor the ZIP model.
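A data set with the simulated ZIP structure can be generated with the Python standard library alone. The sketch below uses the second choice of coefficients and draws Poisson counts with Knuth's multiplication method; the sampler, the seed and the function names are assumptions of this sketch, so the resulting counts will not match the frequencies reported above.

```python
import math
import random

def poisson_draw(lam, rng):
    """Poisson draw by Knuth's method; large rates are split into independent pieces."""
    total = 0
    while lam > 30.0:
        total += poisson_draw(30.0, rng)
        lam -= 30.0
    limit, prod, k = math.exp(-lam), rng.random(), 0
    while prod > limit:
        prod *= rng.random()
        k += 1
    return total + k

def simulate_zip(n, beta, gamma, rng):
    """Return a list of (y, x, z) triples from the simulated ZIP structure."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, math.sqrt(0.5))   # X ~ N(0, 0.5), where 0.5 is the variance
        z = rng.gauss(3.0, math.sqrt(0.5))   # Z ~ N(3, 0.5)
        lam = math.exp(beta[0] + beta[1] * x)
        p = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * z)))
        structural_zero = rng.random() < p   # c_i ~ Bernoulli(p_i)
        y = 0 if structural_zero else poisson_draw(lam, rng)
        data.append((y, x, z))
    return data

rng = random.Random(2020)
sample = simulate_zip(1000, beta=(1.0, 1.5), gamma=(0.5, -0.5), rng=rng)
zero_share = sum(1 for y, _, _ in sample if y == 0) / len(sample)
print(zero_share)
```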
5.2.2 | The simulated ZINB model
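For the simulated ZINB model, a data set of 1000 samples is generated from the ZINB distribution using the following specifications:
$$
Y_i \mid X_i, Z_i =
\begin{cases}
0, & \text{if } c_i = 1,\\
\text{NB}(\lambda_i, \tau), & \text{if } c_i = 0,
\end{cases}
$$
$$
c_i \sim \text{Bernoulli}(p_i), \quad X_i \sim N\left(\mu_X = 0, \sigma_X^2 = 1\right), \quad Z_i \sim N\left(\mu_Z = 0, \sigma_Z^2 = 1\right),
$$
$$
\lambda_i = e^{\beta_0 + \beta_1 X_i}, \quad p_i = \frac{e^{\gamma_0 + \gamma_1 Z_i}}{1 + e^{\gamma_0 + \gamma_1 Z_i}}.
$$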
Williamson et al.42 used β0 = 1.609, β1 = 0.25, γ0 = −0.406, γ1 = −0.65 and τ = 0.2 as the parameters of the simulated ZINB model. As discussed above, the ZINB model reduces to the ZIP model in a limiting case of the dispersion parameter τ. Therefore, we have used the same parameters except for the dispersion parameter, which is set to 11 (i.e., τ = 11). The histogram of the generated response variable is plotted in Figure 2(b). The response variable consists of 439 zeros, 33 ones, 56 twos, 90 threes, 77 fours and 395 other values. To assess the best-fitted model, we fitted the Poisson, NB, ZIP and ZINB models to the simulated data set, and the estimates, together with the diagnostic analysis, are reported in Table 1.
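The simulated ZINB structure can be generated in a similar way. The Python sketch below draws the NB component as a gamma-mixed Poisson, so that it has mean λ and variance λ + λ²/τ, which matches the NB part of the variance formula in section 2.2; this sampler, the seed and the function names are assumptions of the sketch rather than the procedure used in the study.

```python
import math
import random

def poisson_draw(lam, rng):
    """Poisson draw by Knuth's method; large rates are split into independent pieces."""
    total = 0
    while lam > 30.0:
        total += poisson_draw(30.0, rng)
        lam -= 30.0
    limit, prod, k = math.exp(-lam), rng.random(), 0
    while prod > limit:
        prod *= rng.random()
        k += 1
    return total + k

def nb_draw(lam, tau, rng):
    """NB draw with mean lam and variance lam + lam**2 / tau, as a gamma-mixed Poisson."""
    rate = rng.gammavariate(tau, lam / tau)  # shape tau, scale lam / tau
    return poisson_draw(rate, rng)

def simulate_zinb(n, beta, gamma, tau, rng):
    """Return a list of (y, x, z) triples from the simulated ZINB structure."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        lam = math.exp(beta[0] + beta[1] * x)
        p = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * z)))
        y = 0 if rng.random() < p else nb_draw(lam, tau, rng)
        data.append((y, x, z))
    return data

rng = random.Random(42)
sample = simulate_zinb(2000, beta=(1.609, 0.25), gamma=(-0.406, -0.65), tau=11.0, rng=rng)
zero_share = sum(1 for y, _, _ in sample if y == 0) / len(sample)
print(zero_share)
```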
Following the LRT–Vuong procedure, the LR test is first used to compare the Poisson and NB models. The LR test gives significant results, which is evidence of over-dispersion in the response variable of the simulated data set. Further, the Vuong test is used to compare the NB model with the ZINB model. The Vuong test also gives significant findings, which is evidence that the ZINB model is the most appropriate among the candidates. Further, the minimum AIC = 3903.53 and BIC = 3928.07 also favor the ZINB model.
FIGURE 2 Histograms of the response variable (a) for the simulated ZIP model and (b) for the simulated ZINB model
5.2.3 | Algorithm for charting constants
As mentioned above, the control limits of the Y-ZIP and PR-ZIP charts depend on the charting constants L1 and L2, respectively. Similarly, the charting constants L3 and L4 determine the width of the Y-ZINB and PR-ZINB control limits, respectively. The procedure to find the charting constants for these charts is as follows:
i. Generate a data set of fixed sample size n from the specified simulated model (given in Sections 5.2.1 and 5.2.2).
ii. For the PR-ZIP and PR-ZINB control charts, fit the respective GLM to the data set and obtain the Pearson residuals (use equation (2) for the ZIP model and equation (4) for the ZINB model).
iii. For the PR-ZIP and PR-ZINB control charts, obtain the mean and standard error of the Pearson residuals; for the Y-ZIP and Y-ZINB control charts, calculate the mean and standard error of the response variable.
iv. Choose an arbitrary value as the charting constant of the respective control chart and obtain its control limit(s) from the estimates in step iii and this charting constant.
v. For the PR-ZIP and PR-ZINB control charts, plot the Pearson residuals against the respective control limits; for the Y-ZIP and Y-ZINB control charts, plot the response values against their respective control limits.
vi. Repeat steps i–v a large number of times to estimate the ARL0.
vii. If the specified ARL0 is not achieved, adjust the charting constant and repeat steps i–vi until the specified ARL0 is obtained. The charting constants are reported in Table 2 for different choices of ARL0.
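The steps above can be sketched in Python for a data-based chart. This is a minimal illustration under stated assumptions, not the author's implementation: the ZIP parameters and standard normal covariates are hypothetical, the upper control limit is taken as mean + L × sample standard deviation, the run length is the position of the first point beyond that limit, and the trial-and-error adjustment of step vii is replaced by a bisection search. A model-based chart would add a GLM fit and Pearson residuals in step ii.

```python
import numpy as np

def simulate_zip(n, rng, beta=(1.0, 0.5), gamma=(0.0, -0.5)):
    """One in-control ZIP sample; all parameter values are hypothetical."""
    x = rng.normal(size=n)
    z = rng.normal(size=n)
    lam = np.exp(beta[0] + beta[1] * x)
    p = 1.0 / (1.0 + np.exp(-(gamma[0] + gamma[1] * z)))
    y = rng.poisson(lam)
    y[rng.random(n) < p] = 0  # structural zeros
    return y

def estimate_arl0(L, rng, runs=200, n=2000):
    """Steps i-vi for a chart with upper limit mean + L * sd (assumed form)."""
    run_lengths = []
    for _ in range(runs):
        y = simulate_zip(n, rng)                # step i
        ucl = y.mean() + L * y.std(ddof=1)      # steps iii-iv
        signals = np.nonzero(y > ucl)[0]        # step v
        # position of the first signal; censored at n when nothing signals
        run_lengths.append(signals[0] + 1 if signals.size else n)
    return float(np.mean(run_lengths))          # step vi

def calibrate(target_arl0, lo=1.0, hi=8.0, iters=10, seed=0):
    """Step vii as a bisection on L, assuming ARL0 increases with L."""
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if estimate_arl0(mid, rng) < target_arl0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("charting constant for ARL0 = 200:", round(calibrate(200), 2))
```

Because the counts are discrete and each ARL0 estimate is noisy, the returned constant is only approximate; more runs per evaluation tighten it at the cost of computing time.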
TABLE 1 Estimation and diagnosis analysis of selected count models for simulated data sets
Each coefficient cell lists Est., S.E., z value and P value; "-" marks a cell that does not apply. The Vuong test rows compare the Poisson model with the ZIP model and the NB model with the ZINB model.
Parameter | Poisson model | NB model | ZIP model | ZINB model
Simulated ZIP data
β0 | 0.68 0.02 27.64 0.00 | 0.69 0.04 18.72 0.00 | 1.01 0.03 36.84 0.00 | 1.01 0.03 36.84 0.00
β1 | 1.55 0.04 38.04 0.00 | 1.53 0.08 20.29 0.00 | 1.52 0.04 35.02 0.00 | 1.52 0.04 35.01 0.00
Log(τ) | - | - | - | 10.08 15.22 3.38 0.50
γ0 | - | - | 1.12 0.50 2.23 0.02 | 1.12 0.50 2.23 0.02
γ1 | - | - | −0.71 0.17 −4.17 0.00 | −0.71 0.17 −4.17 0.00
τ | - | 1.396 | - | 23893.75
Log-Lik (df) | −2200.68 (df = 2) | −1948.54 (df = 3) | −1789.29 (df = 4) | −1789.29 (df = 5)
AIC | 4405.37 | 3903.07 | 3586.57 | 3588.58
BIC | 4415.18 | 3917.79 | 3606.21 | 3613.12
LR test (p-value) | - | 504.69 (0.00) | - | 0.0086 (0.9262)
Vuong test | Poisson model vs ZIP model | NB model vs ZINB model
Raw | −9.81 (0.00) ZIP > Poisson | −10.82 (0.00) ZINB > NB
AIC-corrected | −9.77 (0.00) ZIP > Poisson | −10.68 (0.00) ZINB > NB
BIC-corrected | −9.64 (0.00) ZIP > Poisson | −10.35 (0.00) ZINB > NB
Simulated ZINB data
β0 | 1.06 0.02 56.28 0.00 | 1.06 0.05 22.33 0.00 | 1.63 0.02 84.54 0.00 | 1.62 0.02 69.61 0.00
β1 | 0.23 0.02 12.32 0.00 | 0.23 0.05 4.72 0.00 | 0.24 0.02 12.55 0.00 | 0.24 0.02 10.45 0.00
Log(τ) | - | - | - | 2.47 0.21 11.61 0.00
γ0 | - | - | −0.29 0.07 −4.36 0.00 | −0.32 0.07 −4.62 0.00
γ1 | - | - | 0.57 0.07 8.06 0.00 | 0.58 0.07 8.04 0.00
τ | - | 0.53 | - | 11.87
Log-Lik (df) | −3034.71 (df = 2) | −1966.57 (df = 3) | −2177.12 (df = 4) | −1946.77 (df = 5)
AIC | 6073.41 | 4360.24 | 3941.13 | 3903.53
BIC | 6083.23 | 4374.96 | 3960.76 | 3928.07
LR test (p-value) | - | 1715.20 (0.00) | - | 39.59 (0.00)
Raw | −20.94 (0.00) ZIP > Poisson | −13.05 (0.00) ZINB > NB
AIC-corrected | −20.91 (0.00) ZIP > Poisson | −12.94 (0.00) ZINB > NB
BIC-corrected | −20.81 (0.00) ZIP > Poisson | −12.66 (0.00) ZINB > NB
5.3 | Shifts for the performance evaluation
As discussed above, the zero-inflation in both models is controlled by the parameter pi. Fatahi et al.43 argued that this parameter is simply the probability of a shock occurring in the system, which may be considered constant or subject only to negligible change. On the other hand, λi in the ZIP model and λi and τ in the ZINB model are the most critical parameters. Hence, we evaluate the performance of the control charts under different amounts of direct and indirect shifts in λi for the ZIP model and in λi and τ for the ZINB model. These direct and indirect shifts are described as follows:
i. For the ZIP and ZINB models, direct shifts in λi are introduced by changing λi into $\lambda_i + \delta\sqrt{\sigma^2_{\mathrm{ZIP}}}$ or $\lambda_i + \delta\sqrt{\sigma^2_{\mathrm{ZINB}}}$, respectively.
ii. For the ZIP and ZINB models, indirect shifts in λi are introduced by changing β0 into β0 + η or by changing the mean of the covariate Xi from μX to μX + ψ.
iii. As the ZINB model has an additional parameter τ, performance is also assessed by introducing shifts in τ of the form τ − Ω.
For the above-mentioned shifts, we have carried out an extensive simulation study with 10^6 iterations to evaluate the run length properties in terms of ARL, SDRL and MDRL, and the results are reported in Tables 3–6.
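The direct and indirect shifts in λi listed above can be written out for the ZIP case as follows. This is a sketch under assumptions: σ²ZIP is taken to be the variance of a ZIP variable, (1 − p)λ(1 + pλ), evaluated at the in-control λi and pi, and all numerical inputs and function names are hypothetical. The shift in τ for the ZINB model is omitted.

```python
import math

def zip_variance(lam, p):
    """Variance of a ZIP variable with Poisson mean lam and zero-inflation p."""
    return (1.0 - p) * lam * (1.0 + p * lam)

def direct_shift(lam, p, delta):
    """Shift i: lam -> lam + delta * sqrt(sigma^2_ZIP)."""
    return lam + delta * math.sqrt(zip_variance(lam, p))

def indirect_shift_beta0(x, beta0, beta1, eta):
    """Shift ii (intercept): beta0 -> beta0 + eta under the log link."""
    return math.exp(beta0 + eta + beta1 * x)

def indirect_shift_covariate(x, beta0, beta1, psi):
    """Shift ii (covariate): each x moves by psi, so mu_X -> mu_X + psi."""
    return math.exp(beta0 + beta1 * (x + psi))

# hypothetical in-control setting: beta0 = 1.0, beta1 = 0.5, x = 0.2, p = 0.3
lam0 = math.exp(1.0 + 0.5 * 0.2)
print(direct_shift(lam0, p=0.3, delta=1.0))
print(indirect_shift_beta0(0.2, 1.0, 0.5, eta=0.4) / lam0)      # ratio exp(eta)
print(indirect_shift_covariate(0.2, 1.0, 0.5, psi=0.2) / lam0)  # ratio exp(beta1 * psi)
```

Under the log link, both indirect shifts act multiplicatively on λi, whereas the direct shift is additive and scaled by the standard deviation of the response.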
6 | RESULTS AND DISCUSSIONS
In this section, we evaluate the comparative results of the charts based on the ZIP and ZINB models. The ARL, SDRL and MDRL for ARL0 = 200, 370 and 500 are provided in Tables 3–6.
Table 3 contains the results for shifts introduced directly in λi as $\lambda_i + \delta\sqrt{\sigma^2_{\mathrm{ZIP}}}$ and $\lambda_i + \delta\sqrt{\sigma^2_{\mathrm{ZINB}}}$. When ARL0 is fixed at 200, a 1σ shift in λi (δ = 1) decreases the ARL1 of the Y-ZIP and PR-ZIP control charts by 145.87 and 182.42 units, respectively. At ARL0 = 370, reductions of 273.66 and 340.93 units are reported in the ARL1 of the Y-ZIP and PR-ZIP control charts. Further, decreases of approximately 359.11 and 458.83 units in the ARL1 of the Y-ZIP and PR-ZIP control charts are reported at ARL0 = 500. For the Y-ZINB and PR-ZINB control charts, a 2σ shift in λi (δ = 2) gives ARL1 values of around 6.64 and 5.42 at ARL0 = 200, respectively. The ARL1 is around 9.17 and 6.97 for the Y-ZINB and PR-ZINB control charts at ARL0 = 370. Further, at ARL0 = 500, the ARL1 values of the Y-ZINB and PR-ZINB control charts are 10.98 and 8.36. Hence, the results provide evidence that the model-based control charts (PR-ZIP and PR-ZINB) outperform the data-based control charts (Y-ZIP and Y-ZINB) in the detection of direct shifts in λi.
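The unit reductions quoted above are differences between the nominal ARL0 and the tabulated ARL1. The short Python sketch below reproduces that arithmetic for the δ = 1.00 rows of Table 3 and adds the relative reduction for comparison; the relative figure and the helper name `arl_reduction` are additions for illustration only.

```python
# ARL1 values at delta = 1.00 read from Table 3, keyed by nominal ARL0
arl1_at_delta_1 = {
    "Y-ZIP":  {200: 54.13, 370: 96.34, 500: 140.89},
    "PR-ZIP": {200: 17.58, 370: 29.07, 500: 41.17},
}

def arl_reduction(arl0, arl1):
    """Absolute and percentage drop of ARL1 below the nominal ARL0."""
    drop = arl0 - arl1
    return round(drop, 2), round(100.0 * drop / arl0, 1)

for chart, row in arl1_at_delta_1.items():
    for arl0, arl1 in row.items():
        print(chart, arl0, arl_reduction(arl0, arl1))
```

Note that the differences are taken from the nominal ARL0 (200, 370, 500) rather than from the simulated in-control values in the δ = 0.00 rows.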
The first type of indirect shift in λi is introduced by changing β0 into β0 + η, and the findings based on these shifts are reported in Table 4. For the Y-ZIP and PR-ZIP control charts, a shift of η = 0.8 gives ARL1 values of around 19.26 and 5.60 at ARL0 = 200, respectively. The ARL1 is around 30.37 and 8.10 for the Y-ZIP and PR-ZIP control charts at ARL0 = 370. Further, at ARL0 = 500, the ARL1 values of the Y-ZIP and PR-ZIP control charts are 42.01 and 10.46. When ARL0 is fixed at 200, a shift of η = 0.2 decreases the ARL1 of the Y-ZINB and PR-ZINB control charts by 130.12 and 135.50 units, respectively. At ARL0 = 370, reductions of 245.33 and 251.47 units are reported in the ARL1 of the Y-ZINB and PR-ZINB control charts. Further, decreases of approximately 319.58 and 328.46 units in the ARL1 of the Y-ZINB and PR-ZINB control charts are reported at ARL0 = 500. Hence, the results provide similar evidence that the model-based control charts perform better than the data-based control charts in the detection of indirect shifts in λi due to a change in β0.
TABLE 2 Control charting constants with respect to specified ARL0
ARL0 L1 L2 L3 L4
200 4.72 3.091 3.37 3.21
370 5.73 3.48 3.79 3.58
500 6.44 3.73 4.05 3.81
TABLE 3 Run-length profile of control charts under direct shifts in λi
ARL0 = 200 ARL0 = 370 ARL0 = 500
Chart δ ARL SDRL MDRL ARL SDRL MDRL ARL SDRL MDRL
Y-ZIP 0.00 200.82 191.99 145.00 369.98 300.04 288.00 501.98 347.92 453.00
0.25 135.06 133.16 92.00 252.14 230.75 183.00 368.15 302.95 283.00
0.50 96.94 95.89 67.00 180.03 169.30 129.00 266.37 239.45 195.00
0.75 70.97 69.60 50.00 130.08 126.65 91.00 191.60 182.13 135.00
1.00 54.13 53.95 38.00 96.34 94.73 66.00 140.89 135.57 100.50
1.25 43.29 43.28 30.00 73.83 73.99 50.00 106.52 104.04 75.00
1.50 33.61 33.20 23.00 57.87 57.60 40.00 82.63 82.80 58.00
1.75 28.59 28.59 20.00 47.40 48.42 33.00 66.95 65.46 48.00
2.00 23.51 23.10 17.00 38.74 39.11 26.00 54.00 54.75 37.00
2.25 19.71 19.49 14.00 32.05 31.76 23.00 44.71 44.32 31.00
2.50 16.84 16.59 12.00 26.84 26.95 19.00 37.53 37.42 26.00
2.75 14.49 14.35 10.00 23.23 23.12 16.00 31.84 31.42 22.00
3.00 12.86 12.61 9.00 20.10 20.06 14.00 27.53 27.76 19.00
PR-ZIP 0.00 200.26 199.14 136.00 371.15 311.06 280.00 501.24 352.79 450.00
0.25 90.11 93.55 62.00 173.97 175.86 117.00 258.50 245.23 182.00
0.50 47.82 48.81 32.00 85.83 87.97 60.00 128.45 130.92 88.00
0.75 27.41 27.92 19.00 48.22 48.96 33.00 68.57 70.34 46.00
1.00 17.58 17.40 12.00 29.07 29.63 20.00 41.17 41.73 28.00
1.25 11.69 11.37 8.00 18.37 18.09 13.00 25.91 25.74 18.00
1.50 8.45 8.02 6.00 12.57 12.40 9.00 17.38 17.31 12.00
1.75 6.16 5.71 4.00 9.02 8.50 7.00 11.83 11.51 8.00
2.00 4.76 4.19 3.00 6.80 6.45 5.00 8.76 8.26 6.00
2.25 3.89 3.39 3.00 5.28 4.85 4.00 6.61 6.14 5.00
2.50 3.21 2.72 2.00 4.20 3.64 3.00 5.13 4.65 4.00
2.75 2.78 2.27 2.00 3.53 3.01 3.00 4.19 3.62 3.00
3.00 2.46 1.90 2.00 2.97 2.42 2.00 3.45 2.94 2.00
Y-ZINB 0.00 200.21 201.21 139.00 370.50 314.23 281.00 500.49 353.39 436.00
0.25 93.94 95.95 64.00 176.34 176.57 121.00 249.44 239.47 173.00
0.50 48.82 48.97 33.00 88.10 91.81 59.00 125.58 131.44 84.00
0.75 30.20 30.50 21.00 50.87 53.25 34.00 67.92 70.94 46.00
1.00 19.88 19.94 14.00 31.43 30.83 22.00 41.22 41.95 28.00
1.25 13.94 13.69 10.00 21.33 21.31 15.00 27.09 27.64 19.00
1.50 10.64 10.26 7.00 15.45 14.62 11.00 19.08 18.52 13.00
1.75 8.14 7.74 6.00 11.48 10.82 8.00 14.15 13.73 10.00
2.00 6.64 6.17 5.00 9.17 8.93 6.00 10.98 10.54 8.00
2.25 5.60 5.09 4.00 7.27 6.86 5.00 8.72 8.28 6.00
2.50 4.80 4.34 3.00 6.22 5.56 5.00 7.22 6.83 5.00
2.75 4.28 3.80 3.00 5.28 4.72 4.00 6.13 5.67 4.00
3.00 3.70 3.17 3.00 4.56 4.04 3.00 5.39 4.84 4.00
PR-ZINB 0.00 200.72 200.88 135.00 370.70 314.55 285.00 500.67 355.82 452.00
0.25 88.99 92.00 61.00 163.99 166.85 111.00 242.47 235.37 164.00
0.50 45.63 46.61 31.00 78.77 80.89 54.00 111.15 113.98 76.00
0.75 26.27 26.19 18.00 42.25 43.23 29.00 58.88 59.78 40.00
1.00 16.80 16.98 12.00 25.93 25.22 18.00 34.21 34.64 24.00
1.25 11.55 11.36 8.00 17.06 16.93 12.00 21.60 21.57 15.00
1.50 8.41 7.85 6.00 11.81 11.58 8.00 14.83 14.33 10.00
1.75 6.66 6.06 5.00 8.81 8.42 6.00 11.07 10.53 8.00
2.00 5.42 4.95 4.00 6.97 6.35 5.00 8.36 7.83 6.00
2.25 4.58 4.12 3.00 5.70 5.14 4.00 6.74 6.28 5.00
2.50 3.87 3.37 3.00 4.84 4.34 3.00 5.45 4.92 4.00
2.75 3.42 2.85 3.00 4.13 3.62 3.00 4.72 4.23 3.00
3.00 3.06 2.48 2.00 3.63 3.06 3.00 4.09 3.66 3.00
The second type of indirect shift in λi is introduced by changing the mean of the covariate Xi from μX to μX + ψ, and the findings based on these shifts are given in Table 5. The results reveal that when ARL0 = 200, a shift of ψ = 0.2 reduces the ARL1 of the Y-ZIP and PR-ZIP control charts by 77.11 and 99.71 units, respectively. At ARL0 = 370, reductions of 144.78 and 168.35 units are reported in the ARL1 of the Y-ZIP and PR-ZIP control charts. Further, decreases of approximately 174.95 and 196.02 units in the ARL1 of the Y-ZIP and PR-ZIP control charts are reported at ARL0 = 500. For the Y-ZINB and PR-ZINB control charts, a shift of ψ = 2 reduces the ARL1 by around 181.41 and 184.24 units at ARL0 = 200, respectively. The reduction in the ARL1 is around 341.09 and 345.51 units for the Y-ZINB and PR-ZINB control charts at ARL0 = 370. Further, at ARL0 = 500, the reductions in the ARL1 of the Y-ZINB and PR-ZINB control charts are 462.58 and 468.27 units. Hence, the results again provide evidence that the model-based control charts are better than the data-based control charts in the detection of indirect shifts in λi due to a change in the mean of the covariate Xi.
Furthermore, the ZINB model has an extra parameter that controls the dispersion of the data. So, the performance of the Y-ZINB and PR-ZINB charts is also assessed by introducing shifts in τ of the form τ − Ω (Table 6). The findings show that when ARL0 = 200, a shift of Ω = 4 reduces the ARL1 of the Y-ZINB and PR-ZINB control charts by 62.94 and 73.48 units, respectively. At ARL0 = 370, reductions of 126.71 and 138.79 units are reported in the ARL1 of the Y-ZINB and PR-ZINB control charts. Further, decreases of approximately 162.84 and 176.90 units in the ARL1 of the Y-ZINB and PR-ZINB control charts are reported at ARL0 = 500. Hence, for monitoring a reduction in the dispersion parameter, the PR-ZINB control chart outperforms the Y-ZINB control chart.
Overall, it is concluded that the PR-ZIP control chart performs better than the Y-ZIP control chart under different types of shifts in λi. Further, the PR-ZINB control chart outperforms the Y-ZINB control chart under different kinds of shifts in λi and τ. Hence, when a covariate is measured along with the variable of interest, the model-based control charts (e.g., PR-ZIP and PR-ZINB charts) provide better performance than the data-based control charts (e.g., Y-ZIP and Y-ZINB charts).
7 | A CASE STUDY ON FLIGHT DELAYS
Flight delays are a persistent issue that has grown in recent years. Statistics from the U.S. Department of Transportation show that during 2012–2014, 18,287,150 flights were scheduled within the U.S., and around 26,413 of them were scheduled between Hartsfield–Jackson Atlanta International Airport (ATL) and Orlando International Airport (MCO), Orlando, Florida. We obtained this data set from the Bureau of Transportation Statistics, U.S. Department of Transportation44. The distance between ATL and MCO is around 650 km, and the route is plotted in Figure 3. For the ATL to MCO flights, we ignored records with missing values and obtained a data set of 26,256 flights between ATL and MCO during 2012–2014.
TABLE 4 Run-length profile of control charts under indirect shifts in λi due to change in β0
ARL0 = 200 ARL0 = 370 ARL0 = 500
Chart η ARL SDRL MDRL ARL SDRL MDRL ARL SDRL MDRL
Y-ZIP 0.00 200.97 189.21 147.00 370.55 300.43 292.00 500.11 344.90 437.00
0.20 104.39 102.07 73.00 191.60 179.10 137.00 281.27 249.27 210.00
0.40 56.77 55.79 40.00 98.47 96.71 70.00 143.40 137.66 102.00
0.60 32.88 32.46 23.00 53.33 53.34 37.00 73.84 73.38 52.00
0.80 19.26 18.93 13.50 30.37 30.35 21.00 42.01 42.43 29.00
1.00 12.31 11.93 9.00 18.34 18.09 13.00 23.89 23.80 17.00
1.20 8.17 7.64 6.00 11.56 11.15 8.00 14.55 14.09 10.00
1.40 5.66 5.25 4.00 7.72 7.23 5.50 9.70 9.26 7.00
1.60 4.22 3.71 3.00 5.46 4.99 4.00 6.54 6.14 5.00
1.80 3.19 2.65 2.00 4.11 3.61 3.00 4.80 4.35 3.00
2.00 2.65 2.11 2.00 3.18 2.63 2.00 3.60 3.08 3.00
PR-ZIP 0.00 200.02 198.45 138.00 370.10 311.15 274.00 500.37 353.04 435.00
0.20 81.16 82.35 55.50 156.87 157.29 108.00 244.54 235.38 170.00
0.40 31.02 30.81 21.00 57.81 59.09 39.00 86.45 89.75 57.00
0.60 12.17 11.87 9.00 20.58 20.54 14.00 29.12 29.44 20.00
0.80 5.60 5.17 4.00 8.10 7.60 6.00 10.46 10.04 7.00
1.00 3.08 2.49 2.00 3.84 3.35 3.00 4.60 4.10 3.00
1.20 2.09 1.53 2.00 2.45 1.86 2.00 2.68 2.12 2.00
1.40 1.70 1.08 1.00 1.84 1.24 1.00 1.95 1.36 1.00
1.60 1.53 0.90 1.00 1.58 0.98 1.00 1.61 1.00 1.00
1.80 1.45 0.83 1.00 1.47 0.84 1.00 1.51 0.85 1.00
2.00 1.40 0.75 1.00 1.41 0.76 1.00 1.43 0.78 1.00
Y-ZINB 0.00 200.45 202.95 138.00 370.60 312.69 282.00 500.54 354.01 433.00
0.20 69.88 72.72 47.00 124.67 128.85 84.00 180.42 184.01 121.00
0.40 27.40 27.29 19.00 45.62 46.37 31.00 60.89 63.03 41.00
0.60 12.92 12.56 9.00 18.51 18.24 13.00 23.66 23.84 16.00
0.80 6.95 6.46 5.00 9.34 8.76 7.00 11.23 11.03 8.00
1.00 4.28 3.78 3.00 5.27 4.66 4.00 5.96 5.48 4.00
1.20 2.99 2.44 2.00 3.56 3.00 3.00 3.87 3.32 3.00
1.40 2.35 1.79 2.00 2.59 2.09 2.00 2.77 2.23 2.00
1.60 2.00 1.40 2.00 2.13 1.55 2.00 2.23 1.67 2.00
1.80 1.84 1.25 1.00 1.91 1.33 1.00 1.96 1.37 1.00
2.00 1.77 1.19 1.00 1.77 1.19 1.00 1.80 1.19 1.00
PR-ZINB 0.00 199.18 196.46 136.00 370.92 315.05 277.50 500.56 353.64 442.00
0.20 64.50 66.87 44.00 118.53 119.29 80.00 171.54 171.94 118.00
0.40 24.64 24.61 17.00 39.13 39.13 27.00 53.39 54.19 37.00
0.60 10.69 10.41 7.00 15.38 14.84 11.00 19.83 19.73 14.00
0.80 5.61 5.12 4.00 7.50 6.97 5.00 8.86 8.26 6.00
1.00 3.55 2.98 3.00 4.17 3.71 3.00 4.71 4.20 3.00
1.20 2.54 1.95 2.00 2.78 2.27 2.00 3.08 2.55 2.00
1.40 2.08 1.52 2.00 2.20 1.62 2.00 2.31 1.72 2.00
1.60 1.83 1.23 1.00 1.93 1.33 1.00 1.97 1.36 1.00
1.80 1.76 1.17 1.00 1.76 1.13 1.00 1.80 1.21 1.00
2.00 1.71 1.09 1.00 1.73 1.12 1.00 1.74 1.13 1.00
TABLE 5 Run-length profile of control charts under indirect shifts in λi due to change in μX
ARL0 = 200 ARL0 = 370 ARL0 = 500
Chart ψ ARL SDRL MDRL ARL SDRL MDRL ARL SDRL MDRL
Y-ZIP 0.00 200.65 191.80 145.00 369.92 301.50 285.00 499.07 346.05 441.50
0.05 179.76 172.32 128.00 328.91 279.08 247.00 456.71 332.78 383.00
0.10 155.58 150.16 110.00 292.55 256.80 216.00 411.93 316.69 332.00
0.15 138.50 131.73 99.00 253.46 226.72 190.00 362.90 298.06 279.00
0.20 122.89 119.87 86.00 225.22 208.83 162.00 325.05 274.28 248.00
0.25 109.95 106.85 78.00 198.64 189.57 139.00 290.02 256.04 212.00
0.30 95.99 94.63 66.00 172.84 165.86 122.00 252.17 228.05 183.00
0.35 86.66 85.01 61.00 153.15 146.04 108.00 224.14 207.29 163.00
0.40 76.13 74.19 54.00 137.31 131.98 97.00 195.41 186.35 139.00
0.45 67.60 68.64 46.00 120.54 117.25 84.00 175.60 169.85 123.00
0.50 59.95 60.21 42.00 107.23 104.83 76.00 151.33 146.06 107.00
PR-ZIP 0.00 199.87 198.52 135.00 369.31 306.66 277.50 500.27 356.08 439.00
0.05 167.99 168.79 115.00 319.27 283.99 229.00 454.59 340.96 371.00
0.10 146.05 149.61 100.00 277.51 255.29 196.00 402.19 325.60 307.00
0.15 119.80 124.44 80.00 236.17 230.98 161.00 352.89 305.53 257.00
0.20 100.29 102.47 69.00 201.65 200.92 139.00 303.98 277.27 215.00
0.25 84.11 87.97 57.00 167.22 168.78 114.00 250.66 240.42 172.00
0.30 73.09 75.31 50.00 143.26 142.24 99.00 216.29 212.54 148.00
0.35 60.40 60.74 42.00 115.73 118.70 79.00 178.03 181.01 121.00
0.40 50.58 51.52 35.00 95.67 99.21 65.00 147.83 151.81 100.00
0.45 43.03 43.83 29.00 80.03 83.60 55.00 122.17 123.36 83.00
0.50 34.87 34.99 24.00 66.04 66.87 45.00 97.87 100.14 67.00
Y-ZINB 0.00 200.17 206.30 137.00 370.12 311.68 289.00 499.15 351.46 430.00
0.50 102.99 106.04 69.00 193.54 192.42 133.00 275.47 257.29 192.00
1.00 53.70 55.12 37.00 97.01 102.01 67.00 134.94 136.77 91.00
1.50 30.71 30.34 22.00 50.84 51.90 34.00 68.76 69.49 48.00
2.00 18.59 18.43 13.00 28.91 28.98 20.00 37.42 38.54 25.00
2.50 11.92 11.58 8.00 17.23 17.04 12.00 21.41 20.94 15.00
3.00 7.84 7.52 6.00 10.77 10.37 7.00 13.31 13.29 9.00
3.50 5.68 5.15 4.00 7.34 7.08 5.00 8.69 8.17 6.00
4.00 4.34 3.80 3.00 5.30 4.85 4.00 6.11 5.66 4.00
PR-ZINB 0.00 200.23 200.61 139.00 370.48 310.84 281.00 501.54 353.89 456.00
0.50 96.11 98.50 66.00 184.14 186.04 124.00 274.28 257.22 192.00
1.00 50.09 50.45 35.00 88.91 90.77 61.00 128.59 130.33 88.00
1.50 27.27 27.29 19.00 44.72 46.12 30.00 61.66 63.26 42.00
2.00 15.76 15.23 11.00 24.49 24.32 17.00 31.73 31.87 22.00
2.50 9.91 9.42 7.00 14.35 14.20 10.00 17.54 17.74 12.00
3.00 6.41 5.92 5.00 9.00 8.60 6.00 10.46 9.92 7.00
3.50 4.62 4.12 3.00 5.95 5.51 4.00 6.86 6.34 5.00
4.00 3.45 2.94 2.00 4.24 3.75 3.00 4.74 4.24 3.00
The U.S. Department of Transportation considers a flight delayed if it arrives at the destination gate 15 or more minutes after its scheduled arrival time. The histogram of the number of flight delays during 2012–2014 is plotted in Figure 4(a), which shows that 720 days had zero flight delays, 194 days had one flight delay, 72 days had two flight delays, and other numbers of flight delays were recorded on 110 days. Commonly, flight delays depend on several causes such as airport and airspace congestion, the size of the aircraft and the weather conditions. In this case study, we focus on flight delays caused by weather, specifically wind speed. Therefore, monthly wind speed data, accessed through Underground45, are used as the covariate for the number of flight delays in the corresponding months. Furthermore, the number of delayed flights from the departure airport (ATL) is considered as the variable explaining the zero-inflation part.
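The 15-minute rule above turns flight-level records into the daily counts that the charts monitor. The Python sketch below shows one way to do this; the record layout (flight date, arrival delay in minutes) and the sample records are hypothetical, and days without any delayed flight are kept as zeros because they form the zero-inflated part of the response.

```python
from collections import Counter
from datetime import date, timedelta

DELAY_THRESHOLD_MIN = 15  # arrival 15 or more minutes late counts as delayed

def daily_delay_counts(flights, start, end):
    """Count delayed flights per calendar day, keeping days with zero delays."""
    delayed = Counter(day for day, arr_delay in flights
                      if arr_delay >= DELAY_THRESHOLD_MIN)
    n_days = (end - start).days + 1
    return [delayed.get(start + timedelta(days=k), 0) for k in range(n_days)]

# hypothetical records: (flight date, arrival delay in minutes)
flights = [
    (date(2012, 1, 1), 3), (date(2012, 1, 1), 22),
    (date(2012, 1, 2), -5), (date(2012, 1, 3), 15), (date(2012, 1, 3), 47),
]
counts = daily_delay_counts(flights, date(2012, 1, 1), date(2012, 1, 4))
print(counts)  # [1, 0, 2, 0]
```

Applied to the full period from 1 January 2012 to 31 December 2014, such a function would return 1096 daily counts, which matches the 720 + 194 + 72 + 110 days reported for Figure 4(a).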
TABLE 6 Run-length profile of ZINB model-based control charts under shifts in τ
ARL0 = 200 ARL0 = 370 ARL0 = 500
Chart Ω ARL SDRL MDRL ARL SDRL MDRL ARL SDRL MDRL
Y-ZINB 0.00 200.57 201.55 139.00 370.82 313.15 280.50 499.46 350.28 430.00
1.00 191.29 192.17 128.00 346.70 300.68 254.00 461.97 344.68 376.00
2.00 169.55 170.87 115.00 314.19 283.05 224.00 417.60 330.31 325.00
3.00 152.31 156.54 103.00 279.20 258.58 196.50 379.45 316.25 283.00
4.00 137.06 140.08 94.00 243.29 233.35 171.00 337.16 294.50 245.00
5.00 115.98 119.69 79.00 203.22 200.93 140.00 277.22 255.56 197.50
6.00 96.38 98.26 67.00 161.02 160.10 111.00 223.96 220.10 153.00
7.00 75.83 77.67 52.00 123.67 125.16 84.00 168.27 169.76 114.00
8.00 58.17 57.02 41.00 88.95 90.34 61.00 113.74 116.89 77.00
9.00 39.95 39.73 27.00 56.34 56.71 39.00 68.54 69.35 47.00
10.00 24.08 23.73 17.00 30.35 30.08 22.00 34.84 34.54 24.00
PR-ZINB 0.00 200.74 200.66 142.00 370.83 314.70 274.00 500.19 353.77 433.00
1.00 182.72 181.50 123.00 341.97 297.42 247.50 467.71 345.06 388.00
2.00 168.14 169.11 116.00 308.93 277.18 220.00 425.88 330.51 336.50
3.00 147.04 148.82 101.00 269.67 253.74 189.00 383.54 314.59 293.00
4.00 126.52 128.88 86.00 231.21 225.51 157.00 323.10 285.18 234.00
5.00 108.06 110.44 73.00 190.53 189.55 130.00 265.69 247.96 188.00
6.00 87.39 88.14 60.00 148.74 150.52 102.00 204.25 201.85 141.00
7.00 67.37 67.77 47.00 109.26 112.57 74.00 145.29 145.68 100.00
8.00 50.61 50.43 35.00 75.26 76.33 52.00 95.46 97.77 66.00
9.00 34.04 33.80 24.00 47.30 47.79 32.00 56.67 56.18 40.00
10.00 20.48 20.15 14.00 25.37 24.59 18.00 29.16 29.00 20.00
FIGURE 3 Flight route from Hartsfield–Jackson Atlanta International Airport (ATL) to Orlando International Airport (MCO), Orlando, Florida [Colour figure can be viewed at wileyonlinelibrary.com]
To obtain the best-fitted model, we have applied the Poisson, NB, ZIP and ZINB models to the collected data set, and the estimates with diagnostic analysis are reported in Table 7.
The LR-Vuong test is used to select an appropriate model, where the LR test first compares the Poisson and NB models. The LR test is significant, which is evidence of over-dispersion in the number of flight delays during 2012–2014. Next, the Vuong test is used to compare the NB model with the ZINB model, which provides evidence that the ZINB model is the most appropriate among the candidates. Further, the minimum AIC and BIC values also favor the ZINB model. Hence, the Y-ZINB and PR-ZINB control charts are used to diagnose abrupt changes.
The model fitted to the number of flight delays during 2012–2014 is used as the IC model. Further, by fixing the parameters of the IC model and ARL0 = 200, the charting constants (i.e., L3 = 4.7 and L4 = 4.84) were obtained through the algorithm described in Section 5.2.3. For the OOC data set, the parameters of the IC model were retained and a direct shift in λi was introduced by changing λi into $\lambda_i + 1\sqrt{2.034463}$. The histogram of the number of flight delays in the shifted data set is plotted in Figure 4(b), which shows that 601 days have zero flight delays, 127 days have one flight delay, 120 days have two flight delays, and other numbers of flight delays are recorded on the remaining 248 days. To obtain the best-fitted model for the shifted data, we have applied the Poisson, NB, ZIP and ZINB models to the shifted data set, and the estimates with diagnostic analysis are also reported in Table 7. The LR-Vuong test, AIC and BIC criteria show that the shifted data are also best fitted by the ZINB model.
FIGURE 4 Histograms for the number of flight delays of (a) IC dataset and (b) OOC dataset, with Y on the horizontal axis and Count on the vertical axis [Colour figure can be viewed at wileyonlinelibrary.com]
TABLE 7 Estimation and diagnosis analysis of selected count models for flight delay data sets
Each coefficient cell lists Est., S.E., z value and P value; "-" marks a cell that does not apply. The Vuong test rows (Raw, AIC-corrected, BIC-corrected) compare the Poisson model with the ZIP model and the NB model with the ZINB model.
Parameter | Poisson model | NB model | ZIP model | ZINB model
IC data set
β0 | −2.88 0.37 −7.69 0.00 | −3.11 0.62 −5.04 0.00 | −1.04 0.46 −2.25 0.00 | −1.95 0.62 −3.16 0.00
β1 | 0.88 0.13 6.99 0.00 | 0.95 0.21 4.55 0.00 | 0.54 0.15 3.54 0.00 | 0.74 0.21 3.61 0.00
Log(τ) | - | - | - | 0.21 0.21 0.99 0.32
γ0 | - | - | 2.30 0.23 10.10 0.00 | 2.19 0.31 7.13 0.00
γ1 | - | - | −0.19 0.02 −9.04 0.00 | −0.26 0.04 −7.09 0.00
τ | - | 0.4309 | - | 1.2306
Log-Lik (df) | −1499.8 (df = 2) | −1249.3 (df = 3) | −1251.3 (df = 4) | −1206.8 (df = 5)
AIC | 3003.68 | 2504.60 | 2510.66 | 2423.49
BIC | 3013.68 | 2519.60 | 2530.66 | 2448.49
LR test (p-value) | - | 501.08 (0.00) | - | 89.17 (0.00)
Raw | −7.63 (0.00) ZIP > Poisson | −4.52 (0.00) ZINB > NB
AIC-corrected | −7.57 (0.00) ZIP > Poisson | −4.31 (0.00) ZINB > NB
BIC-corrected | −7.41 (0.00) ZIP > Poisson | −3.77 (0.00) ZINB > NB
OOC data set
β0 | −1.81 0.26 −6.98 0.00 | −1.53 0.56 −2.75 0.00 | −0.13 0.27 −0.50 0.62 | −0.55 0.44 −1.27 0.21
β1 | 0.76 0.09 8.74 0.00 | 0.67 0.19 3.51 0.00 | 0.44 0.09 4.92 0.00 | 0.52 0.15 3.53 0.00
Log(τ) | - | - | - | 0.41 0.17 2.33 0.00
γ0 | - | - | 1.93 0.19 9.98 0.00 | 2.06 0.26 7.90 0.00
γ1 | - | - | −0.18 0.02 −9.91 0.00 | −0.24 0.03 −7.68 0.00
τ | - | 0.4116 | - | 1.2707
Log-Lik (df) | −2445.8 (df = 2) | −1777.1 (df = 3) | −1808.5 (df = 4) | −1701.0 (df = 5)
AIC | 4872.86 | 3560.15 | 3624.89 | 3411.99
BIC | 4882.86 | 3575.15 | 3644.89 | 3436.98
LR test (p-value) | - | 1314.7 (0.00) | - | 214.91 (0.00)
Raw | −13.48 (0.00) ZIP > Poisson | −6.68 (0.00) ZINB > NB
AIC-corrected | −13.42 (0.00) ZIP > Poisson | −6.50 (0.00) ZINB > NB
BIC-corrected | −13.33 (0.00) ZIP > Poisson | −6.06 (0.00) ZINB > NB
FIGURE 5 Control charts for flight delay data sets, panels (A) and (B) [Colour figure can be viewed at wileyonlinelibrary.com]
Further, the limits of the Y-ZINB and PR-ZINB control charts are derived, and their plotting statistics are plotted against the control limits in Figure 5. In Figure 5(a), the data values are plotted against the control limits of the Y-ZINB control chart, and in Figure 5(b) the Pearson residuals calculated through equation (4) are plotted against the PR-ZINB control chart limits. Red points belong to the IC data set and blue points to the OOC data set. The findings reveal that both the Y-ZINB and PR-ZINB control charts showed a false alarm. Moreover, in the OOC state, the Y-ZINB control chart signaled 41 OOC points while the PR-ZINB control chart signaled 47 OOC points. Hence, there is evidence that the Pearson residual-based control chart provides better performance in the detection of OOC points.
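The plotting statistic of a residual-based chart and the count of signaling points can be sketched as follows. This is a generic Pearson residual, (y − mean)/√variance, and may differ in detail from equation (4). It assumes that τ acts as a negative binomial size parameter, so that the ZINB mean is (1 − p)λ and the variance is (1 − p)λ(1 + λ(p + 1/τ)); the fitted values, the two-sided limits and the function names are hypothetical.

```python
import math

def zinb_pearson_residual(y, lam, p, tau):
    """Pearson residual of y under a ZINB model (size-type tau assumed)."""
    mean = (1.0 - p) * lam
    var = (1.0 - p) * lam * (1.0 + lam * (p + 1.0 / tau))
    return (y - mean) / math.sqrt(var)

def count_signals(residuals, lcl, ucl):
    """Number of plotted points falling outside the control limits."""
    return sum(1 for r in residuals if r < lcl or r > ucl)

# hypothetical fitted values (lam, p, tau) applied to three daily counts
res = [zinb_pearson_residual(y, lam=4.0, p=0.5, tau=2.0) for y in (0, 2, 12)]
print([round(r, 3) for r in res])
print(count_signals(res, lcl=-3.0, ucl=3.0))
```

In practice λi and pi vary from day to day with the covariates, so the residual standardizes each observation against its own fitted mean and variance before it is compared with the limits.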
8 | SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
Owing to the revolution in technology, manufacturing processes are now embedded with high-quality systems. This revolutionary change brings zero-defect products to the market. However, monitoring and controlling zero-defect products has become a challenging task for quality inspectors. Generally, zero-defect products are well modeled by zero-inflated distributions, and control charts based on zero-inflated distributions are used to diagnose any abrupt change in high-quality processes. The ZIP and ZINB distributions are the most common of these, and the charts based on them (i.e., the Y-ZIP and Y-ZINB control charts) are widely used to monitor high-yield and rare health-related processes. Usually, a few covariates are also measured in the process along with the defect counts, and generalized linear models based on the ZIP and ZINB distributions are used to estimate their parameters. In this study, we have designed monitoring structures based on the ZIP and ZINB regression models (i.e., the PR-ZIP and PR-ZINB control charts) and compared them with the existing data-based control charts (i.e., the Y-ZIP and Y-ZINB control charts).
Several types of direct and indirect shifts are introduced in the parameter λi to compare the performance of the model-based and data-based control charts. From the simulation study, it is concluded that the PR-ZIP control chart performs better than the Y-ZIP control chart, and that the PR-ZINB control chart performs better than the Y-ZINB control chart. Hence, when a covariate is measured along with the variable of interest, the model-based control charts (e.g., PR-ZIP and PR-ZINB charts) provide better performance than the data-based control charts (e.g., Y-ZIP and Y-ZINB charts). A case study based on a flight delay data set is discussed, which reveals that the covariate (wind speed) provides better estimates and that the model-based charts are adequate to detect a change in the mean number of flight delays.
The scope of this study is limited to a single covariate used for the estimation of the Poisson parameter λi; one may extend these methods to two or more covariates. As this study is designed under the Shewhart structure, which performs better for large shifts, one may also extend these methods to EWMA, CUSUM and mixed structures to gain efficiency under small to moderate shifts in the Poisson parameter λi.
ACKNOWLEDGEMENTS

The author is thankful to the anonymous reviewers for the constructive comments that helped in improving the last
version of the paper. The work described in this paper was supported by the Research Grant Council of Hong Kong
(CityU 11213116 and T32-101/15-R) and also by the National Natural Science Foundation of China under a Key Project
(71532008). Moreover, the author is also thankful to Muhammad Sajjad, School of Energy and Environments, City Uni-
versity of Hong Kong, Kowloon, Hong Kong, for helping in the construction of Figure 3.
ORCID
Tahir Mahmood https://orcid.org/0000-0002-8748-5949
REFERENCES
1. Ohta H, Kusukawa E, Rahim A. A CCC-r chart for high-yield processes. Qual Reliab Eng Int. 2001;17(6):439-446.
2. Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992;34(1):1-14.
3. McCullagh P, Nelder J. Generalised linear models. 2nd ed. London: Chapman and Hall; 1983.
4. Adegoke NA, Abbasi SA, Dawod AB, Pawley MD. Enhancing the performance of the EWMA control chart for monitoring the process
mean using auxiliary information. Qual Reliab Eng Int. 2019;35(4):920-933.
5. Adegoke NA, Smith AN, Anderson MJ, Sanusi RA, Pawley MD. Efficient homogeneously weighted moving average chart for monitoring
process mean using an auxiliary variable. IEEE Access. 2019;7:94021-94032.
6. Saghir A, Lin Z. Control charts for dispersed count data: an overview. Qual Reliab Eng Int. 2015;31(5):725-739.
7. Ali S, Pievatolo A, Göb R. An overview of control charts for high-quality processes. Qual Reliab Eng Int. 2016;32(7):2171-2189.
8. Xie M, Goh T. Spc of a near zero-defect process subject to random shocks. Qual Reliab Eng Int. 1993;9(2):89-93.
9. Xie W, Xie M, Goh T. Control charts for processes subject to random shocks. Qual Reliab Eng Int. 1995;11(5):355-360.
10. Chang T, Gan F. Charting techniques for monitoring a random shock process. Qual Reliab Eng Int. 1999;15(4):295-301.
11. He B, Xie M, Goh TN, Ranjan P. On the estimation error in zero-inflated Poisson model for process control. Int J Reliab Qual Saf Eng.
2003;10(02):159-169.
12. Fatahi AA, Noorossana R, Dokouhaki P, Moghaddam BF. Zero inflated Poisson EWMA control chart for monitoring rare health-related
events. J Mech Med Biol. 2012;12:12500651-125006514.
13. Leong RNF, Tan DSY. Some zero inflated Poisson-based combined exponentially weighted moving average control charts for disease
surveillance. Philipp Stat. 2015;64:17-28.
14. He S, Li S, He Z. A combination of CUSUM charts for monitoring a zero-inflated Poisson process. Commun Stat-Simul C. 2014;43(10):
2482-2497.
15. Rakitzis AC, Weiß CH, Castagliola P. Control charts for monitoring correlated Poisson counts with an excessive number of Zeros. Qual
Reliab Eng Int. 2017;33(2):413-430.
16. Huh J, Kim H, Lee S. Monitoring parameter shift with Poisson integer-valued GARCH models. J Stat Comput Simul. 2017;87(9):1754-1766.
17. Kim H, Lee S. attrCUSUM: Tools for Attribute VSI CUSUM Control Chart. 2016. Available at: https://CRAN.R-project.org/package=attrCUSUM.
18. Tian W, You H, Zhang C, Kang S, Jia X, Chien W-TK. Statistical process control for monitoring the particles with excess zero counts in
semiconductor manufacturing. IEEE Trans Semicond Manuf. 2018;32:93-103.
19. Mukherjee A, Rakitzis AC. Some simultaneous progressive monitoring schemes for the two parameters of a zero-inflated Poisson process
under unknown shifts. J Qual Technol. 2019;51(3):257-283.
20. Mahmood T, Xie M. Models and monitoring of zero-inflated processes: the past and current trends. Qual Reliab Eng Int. 2019;35(8):2540-2557.
21. Skinner KR, Montgomery DC, Runger GC. Process monitoring for multiple count data using generalized linear model-based control
charts. Int J Prod Res. 2003;41(6):1167-1180.
22. Asgari A, Amiri A, Niaki STA. A new link function in GLM-based control charts to improve monitoring of two-stage processes with
Poisson response. J Adv Manuf Tech. 2014;72(9-12):1243-1256.
23. Alencar AP, Lee Ho L, Albarracin OYE. CUSUM control charts to monitor series of negative binomial count data. Stat Methods Med Res.
2017;26(4):1925-1935.
24. Maleki M, Castagliola P, Amiri A, Khoo MB. The effect of parameter estimation on phase II monitoring of Poisson regression profiles.
Commun Stat-Simul C. 2019;48(7):1964-1978.
25. Kinat S, Amin M, Mahmood T. GLM-based control charts for the inverse-Gaussian distributed response variable. Qual Reliab Eng Int.
2019;36(2):765-783. https://doi.org/10.1002/qre.2603
26. Maleki MR, Amiri A, Castagliola P. An overview on recent profile monitoring papers (2008–2018) based on conceptual classification
scheme. Comput Ind Eng. 2018;126:705-728.
27. Li G, Tan KKR, Ng SH, Chua DH. A multilevel zero-inflated model for the study of copper hillocks growth in integrated circuits
manufacturing. IEEE Trans Semicond Manuf. 2018;31(3):385-394.
28. Zhang X, Kano M, Tani M, Mori J, Harada K. Defect Data Modeling and Analysis for Improving Product Quality and Productivity in
Steel Industry. In: Computer Aided Chemical Engineering. Vol.44. San Diego, California, USA: Elsevier; 2018:2233-2238.
29. Lee SM, Karrison T, Nocon RS, Huang E. Weighted zero-inflated Poisson mixed model with an application to Medicaid utilization data.
Commun Stat Appl Methods. 2018;25(2):173-184.
30. Gauran IIM, Park J, Lim J, et al. Empirical null estimation using zero-inflated discrete mixture distributions and its application to
protein domain data. Biometrics. 2018;74(2):458-471.
31. Falk M, Hagsten E. The art of attracting international conferences to European cities. Tour Econ. 2018;24(3):337-351.
32. Metulini R, Patuelli R, Griffith DA. A spatial-filtering zero-inflated approach to the estimation of the gravity model of trade.
Econometrics. 2018;6(1):9. https://doi.org/10.3390/econometrics6010009
33. Montgomery DC. Introduction to statistical quality control. 6th ed. New York: John Wiley & Sons; 2009.
34. Garay AM, Hashimoto EM, Ortega EM, Lachos VH. On estimation and influence diagnostics for zero-inflated negative binomial
regression models. Comput Stat Data An. 2011;55(3):1304-1318.
35. Walters GD. Using Poisson class regression to analyze count data in correctional and forensic psychology: a relatively old solution to a
relatively new problem. Crim Justice Behav. 2007;34(12):1659-1674.
36. Ridout M, Hinde J, Demétrio CG. A score test for testing a zero-inflated Poisson regression model against zero-inflated negative
binomial alternatives. Biometrics. 2001;57(1):219-223.
37. Deng D, Paul SR. Score tests for zero-inflation and over-dispersion in generalized linear models. Stat Sin. 2005;15:257-276.
38. Yang Z, Hardin JW, Addy CL. Score tests for zero-inflation in overdispersed count data. Commun Stat-Theor M. 2010;39(11):2008-2030.
39. Adegoke NA, Riaz M, Sanusi RA, Smith AN, Pawley MD. EWMA-type scheme for monitoring location parameter using auxiliary
information. Comput Ind Eng. 2017;114:114-129.
40. Liu X, Winter B, Tang L, Zhang B, Zhang Z, Zhang H. Simulating comparisons of different computing algorithms fitting zero-inflated
Poisson models for zero abundant counts. J Stat Comput Simul. 2017;87(13):2609-2621.
41. Miller JM. Comparing Poisson, Hurdle, and ZIP model fit under varying degrees of skew and zero-inflation. USA: PhD Thesis, University
of Florida; 2007.
42. Williamson JM, Lin H, Lyles RH, Hightower AW. Power calculations for ZIP and ZINB models. Data Sci J. 2007;5:519-534.
43. Fatahi AA, Noorossana R, Dokouhaki P, Moghaddam BF. Zero inflated Poisson EWMA control chart for monitoring rare health-related
events. J Mech Med Biol. 2012;12(04):12500651–125006514.
44. Department of Transportation US. Reporting carrier on-time performance (1987-present). Available at:
https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236. (Accessed 12, December)
45. Underground W. Weather History for Orlando International Airport, Florida. Available at:
https://www.wunderground.com/history/daily/us/fl/orlando-international/KMCO/date/2018-12-12?cm_ven=localwx_history. (Accessed 12, December)
AUTHOR BIOGRAPHY
Tahir Mahmood received his BS (Hons.) degree in Statistics with distinction (Gold Medalist) from the Department of
Statistics, University of Sargodha, Sargodha, Pakistan, in 2012 and served as a Teaching Assistant from 2012 to 2015.
In April 2017, he received his MS degree in Applied Statistics from the Department of Mathematics and Statistics,
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. He is currently a Ph.D. student in the
Department of Systems Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong.
His current research interests include statistical process control and experimental design. His e-mail addresses are
tmahmood5-c@my.cityu.edu.hk and rana.tm.19@gmail.com.
How to cite this article: Mahmood T. Generalized linear model based monitoring methods for high-yield
processes. Qual Reliab Engng Int. 2020;36:1570–1591. https://doi.org/10.1002/qre.2646
