
Received: 5 September 2018 Revised: 25 April 2019 Accepted: 30 June 2019

DOI: 10.1002/qre.2551

RESEARCH ARTICLE

Deep recurrent neural network‐based residual control chart for autocorrelated processes
Shumei Chen | Jianbo Yu

School of Mechanical Engineering, Tongji University, Shanghai, China

Correspondence
Jianbo Yu, School of Mechanical Engineering, Tongji University, Shanghai 201804, China.
Email: jianboyu.bob@gmail.com

Funding information
Fundamental Research Fund for the Central University; National Natural Science Foundation of China, Grant/Award Number: 71777173

Abstract
With the growth of automation in process industries, correlation arises among the process variables. Deep learning has achieved many great successes in image and visual analysis. This paper concentrates on developing a deep recurrent neural network (RNN) model to characterize process variables at varying time lags, and then a residual chart is developed to detect mean shifts in autocorrelated processes. The experimental results indicate that the RNN‐based residual chart outperforms other typical methods (eg, autoregressive [AR]‐based control chart, back propagation network [BPN]‐based residual chart). This paper provides a guideline for employing deep learning techniques as effective tools in autocorrelated process control.

KEYWORDS
autocorrelated process, deep learning, recurrent neural network, residual control chart, statistical
process control

1 | INTRODUCTION

In the last few decades, considerable efforts have been devoted to the monitoring of time‐series data from
autocorrelated manufacturing processes. Statistical process control (SPC) is an effective tool used to detect whether
the observed process variables deviate from the normal state. Although regular SPC techniques have witnessed many successes in monitoring discrete manufacturing processes (Deming1), they are not applicable in continuous and batch process industries (Zobel et al2). The fundamental assumption underlying SPC is the independence of the observed process variables; in an autocorrelated process, by contrast, the value of a process variable depends upon its previous values in the time series. Continuous-process industries, including the manufacture of food, chemicals, paper, and wood products, all belong to this process category (Cook and Chiu3). Using conventional techniques to identify mean shifts in continuous manufacturing operations will result in a large number of false alarms. Once a false alarm is triggered in the process, quality engineers need to investigate and eliminate the supposed assignable causes from the process (English et al4), which results in costly over-control of the process. Hence, alternative approaches have been developed to monitor mean shifts in autocorrelated processes (Zobel et al2).
Some efforts have been devoted to extend the SPC techniques to detect shifts of correlated processes. Some
researchers, including Alwan and Roberts,5 Harries and Ross,6 Wardell et al,7 Runger et al,8 as well as Box and
Luceno,9 proposed time‐series model‐based methods to model the process variables collected from the autocorrelated
processes. They used a control chart to monitor residuals from a time-series model fitted to adequate correlated process data. If the observed process data are in the normal state, the mean of the residuals on the control chart is close to zero and the autocorrelations at all lags are nonsignificant (Yourstone and Montgomery10). Wardell et al11 evaluated properties of those time-series control methods, including Shewhart, exponentially weighted moving average
Qual Reliab Engng Int. 2019;35:2687–2708. wileyonlinelibrary.com/journal/qre © 2019 John Wiley & Sons, Ltd. 2687
10991638, 2019, 8, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/qre.2551 by UNIVERSITY FEDERAL DE SANTA MARIA, Wiley Online Library on [13/11/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2688 CHEN AND YU

(EWMA), common-cause control (CCC), and residual charts. Their analysis results show the effectiveness of EWMA in monitoring correlated data. Zhang12 proposed the EWMAST chart, which extends the traditional EWMA to autocorrelated data from (weakly) stationary processes. Wright et al13 used a joint estimation method (Chen and Liu14) based on ARIMA models to monitor abnormal observations, recognizing four types of outliers that characterize four different problems in correlated processes.
In recent years, considerable research has been devoted to the development of state-of-the-art control technologies for autocorrelated processes. Huang et al15 proposed an autoregressive (AR) moving average method to describe a time-series model, and two special-cause charts were developed to monitor the predicted values. Pirhooshyaran and
Niaki16 developed a double‐max multivariate exponentially weighted moving average (DM‐MEWMA) chart based on
a novel statistic to simultaneously detect shifts in mean and variability of multistage processes. Akhundjanov and
Pascual17 adopted the EWMA chart to monitor the correlated multivariate processes of Poisson distribution without
an assumption of negative correlations. Bodnar and Schmid18 modified several CUSUM control charts to deal with
the detection of covariance matrix variation manifested in multivariate time‐series model of Gaussian distribution.
Amiri et al19 proposed three MEWMA control charts to simultaneously monitor the sum of squared residuals and the regression parameters, based on principal component analysis, in Phase II of multivariate regression profiles in the presence of correlation. Maleki et al20 utilized an extended logistic regression model to describe the autocorrelation existing within profiles and combined the logistic function with the particle swarm optimization algorithm to estimate the parameters. Huang and Khachatryan21 proposed a dimension reduction-based method for a multivariate
time‐series process control.
Artificial neural networks (ANNs) are widely used as a promising tool to recognize shifts in correlated processes
without the assumption of data independence. Cook and Chiu3 proposed a radial basis function (RBF)‐based model
to identify shifts in process variables based on two industrial datasets (ie, the papermaking [Pandit22] and viscosity
[Box and Jenkins23] dataset). Smith24 proposed back propagation network (BPN)‐based model to identify both mean
and variance shifts of various types of out‐of‐control patterns. Ho and Chang25 adopted a combined neural network
model to detect both mean and variance shifts and compared with other traditional SPC methods using average run
length (ARL) and recognition rate. Yang and Zhou26 developed an ANN ensemble‐enabled AR coefficient‐invariant
control chart patterns recognition (CCPR) model to detect seven representative types of unnatural process patterns.
The process was assumed to be an AR process of lag 1 (AR(1)) over time, with the constant AR coefficient unknown. Purintrapiban and Corley27 proposed an ANN method to detect cyclic patterns in AR(1) data after applying the fractal dimension calculation to extract representative features from process observations. Hwarng and
Wang28 proposed an ANN‐based identifier for multivariate autocorrelated process to detect shifts of process mean
and recognize source(s) of the shift(s). Yuangyai and Abrahams29 constructed an ANN model to classify various
out‐of‐control patterns in correlated processes. Prajapati and Singh30 proposed the sum of chi‐square theory to deal
with autocorrelation in process response values. White and Safi31 proposed an ANN‐based model to forecast time‐
series observations with a comparison with AR integrated moving average and regression model.
Some other machine learning‐based methods have been proposed for autocorrelated processes control. Chinnam32
developed a support vector machine (SVM)‐based model to identify mean shifts of autocorrelated processes. Apley and
Shi33 proposed a generalized likelihood ratio test (GLRT)‐based method to monitor and estimate mean shifts in
autocorrelated processes. In sensory data for process monitoring, the informative and discriminative signals may be
separated by many indiscriminative signals or even noisy signals occupying a long time period. Thus, the long delays
that separate some important features in time‐scale may lead to failures of these typical time‐series models or machine
learning models.
In recent years, feature learning with a deep structure has aroused wide concerns in some research fields
(Yuangyai and Abrahams29). Deep learning, also known as deep neural networks (DNN), is a new feature learning
method with multiple hidden layers of representation. It employs a hierarchical structure with multiple neural layers
and extracts information from input data through a layer‐by‐layer process. This deep structure allows it to learn the
representations of complicated raw data with multiple levels of abstraction, which makes it easier to extract useful
features when constructing classifiers or predictors (Lecun et al34). A number of deep learning algorithms have been proposed to automatically learn more abstract and useful features via multilayer nonlinear transformations in deep network architectures (Lecun et al34). Recurrent neural network (RNN) is a DNN capable of learning dynamic information in time-series data through the recurrent connections of its hidden neurons, which enables the prediction of time-series data. Unlike feedforward neural networks, an RNN can store, learn, and express context of arbitrary length without limitation to spatial boundaries. Considerable research efforts have been devoted to the extensive

applications of RNN in different fields (eg, image tagging, handwriting recognition, and machine translation). Auli et al35
constructed a joint language and translation system to predict the target content based on a series of both source
and target content through RNN. Liu et al36 developed a recursive recurrent neural network (R2NN) to fit the
end‐to‐end decoding process for an application of statistical machine translation. Mulder et al37 presented a survey
that covers the applications of RNN in statistical language modeling and introduced some recent significant exten-
sions of RNN to overcome the long training period and the constraints on the context words of RNN. Socher and
Lin38 developed a max‐margin structure based on RNN as a syntactic parser to predict natural language sentences
from a typical dataset, the Penn Treebank. However, there is little research on RNN-based time-series prediction models for autocorrelated processes. RNN is capable of predicting process variables at varying time lags and is therefore a very promising tool for identifying mean shifts in autocorrelated processes.
In general, a process control system consists of fault detection and fault diagnosis. The fault diagnosis is also a very
essential issue in process control, and considerable efforts have been devoted to recognize the fault patterns based on
many typical machine‐learning methods (Zou et al39), eg, random forest (Hsieh et al40), SVM (Yélamos et al, Chiang
et al41,42), and ANN (Tamilselvan and Wang, Tamilselvan, Fan43-45). However, the detection of abnormal signals is usually the first issue to be solved in process control, and thus this study focuses on it for autocorrelated processes. In this study, an RNN-based method is proposed to identify mean shifts in autocorrelated manufacturing processes. The proposed RNN model is capable of learning dynamic information and capturing long-term dependencies in time-series data without any assumption about the data distribution. This salient feature is also useful during the modeling phase, as the model can be applied to any process data. An RNN-based residual control chart is further proposed to detect mean shifts in autocorrelated processes. The experimental results indicate that the proposed method provides a significant improvement over regular monitoring methods in identifying mean shifts of autocorrelated processes.
The remaining parts of the paper are organized as follows. RNN is introduced in Section 2. In Section 3, the RNN‐
based residual control chart is proposed for monitoring of autocorrelated processes. In Section 4, the experimental
results are presented to illustrate effectiveness of the RNN‐based residual control chart to detect mean shifts of
autocorrelated processes. The concluding remarks are provided in Section 5.

2 | RECURRENT NEURAL NETWORK

The most distinctive feature of RNN is the inclusion of at least one feedback connection inside the network, which allows activations to circulate in a loop. The connections spanning adjacent time intervals distinguish RNN from regular ANNs, and thereby the notion of time is explicitly introduced into the network (Lipton et al46). RNN is capable of remembering temporal sequence information, which enables it to perform time-sequence prediction tasks. RNN has accordingly gained popularity in applications such as speech recognition with complex time-varying signals, temporal identification, etc. The sophisticated architectures inside RNN, along with the activation functions, provide powerful performance in mapping the input to the target vector in a nonlinear way. Units of a single layer inside RNN connect to each other directly, which contributes to dealing with context from different time intervals (Karim et al47). Connections between hidden layers allow the circular use of unbounded historical information from any time interval. Sequence learning strategies can work effectively because RNN provides all the required data.

FIGURE 1 The basic architecture of RNN [Colour figure can be viewed at wileyonlinelibrary.com]

The traditional (feedforward) edges inside an RNN contain no cycles, whereas the recurrent edges that connect adjacent time steps may form cycles. However, the application of RNN to sophisticated and nonmonotonic sequence prediction, where the sequences are of different lengths, still needs further consideration (Sutskever et al48). Figure 1 presents the basic architecture of RNN.
As shown in Figure 1, RNN consists of three main components, ie, input layer x, hidden layer h, and output layer o. The sequence (xt,1, xt,2, …, xt,n) with n data points is the input of RNN at time t, W(hx) ∈ R^(Dh×n) denotes the weight matrix between the input layer and the hidden layer, and W(oh) ∈ R^(Do×Dh) is the weight matrix between the hidden layer and the output layer. Unlike regular neural networks (eg, BPN), there are recurrent connections between the hidden layer and itself at adjacent time steps in RNN. At time point t, the nodes in the hidden layer ht receive the input of the current time series (xt,1, xt,2, …, xt,n) and the hidden-layer values of the previous state ht−1. The hidden node value ht is then fed to the output layer to generate the output value ot at each time t. Thus, the input at time t − 1, (xt−1,1, xt−1,2, …, xt−1,n), can influence the output ot at time t through the recurrent connections. That is, cycles feed the state from the previous time step into the current network as an input to generate the output value at the current time step. We take advantage of this calculation mechanism of RNN for prediction at different time steps based on a dynamically changing contextual window over the input series flow (Sak et al49). Thus, RNN is very appropriate as an effective predictor for time-sequence prediction.
The outputs of the hidden layer ht and the output layer ot at time t are calculated as follows:

ht = f(W(hh) ht−1 + W(hx) xt + bh)    (1)

ot = f(W(oh) ht + bo)    (2)

where bh and bo denote the biases of the hidden layer and the output layer, respectively.
The cross-entropy error is used as the loss function of RNN over a sequence of size T, which is calculated as follows:

J = −(1/T) Σ_{t=1}^{T} J(t)(θ) = −(1/T) Σ_{t=1}^{T} Σ_{j=1}^{|V|} ot,j × log(ôt,j)    (3)

where |V| is the size of the output layer, ôt,j is the predicted output of the jth neuron in the output layer, and ot,j denotes the expected output of the jth neuron in the output layer.
Given the time sequence data (xt,1, xt,2, …, xt,n) for the well‐trained RNN model, the output of the hidden layer ht at
time t is generated by using Equation (1), and then the output (ie, a prediction value of xt,n+1) of the output layer ot
is generated by using Equation (2).
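As a concrete illustration, the forward pass of Equations (1) and (2) can be sketched in a few lines of NumPy. The tanh activation, the linear output, and the random weight initialization are assumptions made only for this sketch; the paper does not specify them.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_hh, W_hx, W_oh, b_h, b_o):
    """One forward step of Equations (1) and (2).

    x_t    : input window at time t, shape (n,)
    h_prev : previous hidden state h_{t-1}, shape (D_h,)
    Returns the new hidden state h_t and the output o_t.
    """
    h_t = np.tanh(W_hh @ h_prev + W_hx @ x_t + b_h)   # Equation (1)
    o_t = W_oh @ h_t + b_o                            # Equation (2), linear f
    return h_t, o_t

# Toy dimensions matching the "24-34-1" architecture of Table 1.
rng = np.random.default_rng(0)
n, D_h, D_o = 24, 34, 1
W_hh = rng.normal(scale=0.1, size=(D_h, D_h))
W_hx = rng.normal(scale=0.1, size=(D_h, n))
W_oh = rng.normal(scale=0.1, size=(D_o, D_h))
b_h, b_o = np.zeros(D_h), np.zeros(D_o)

# Feed three consecutive windows; h carries context across time steps,
# which is how the input at time t - 1 influences the output at time t.
h = np.zeros(D_h)
for _ in range(3):
    x = rng.normal(size=n)
    h, o = rnn_step(x, h, W_hh, W_hx, W_oh, b_h, b_o)
```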
The back propagation through time (BPTT) algorithm is employed for RNN training; it propagates the errors back through a certain number of time steps of the recurrent units. The specific sequence information in the hidden layers is obtained when BPTT is performed. All the weights inside RNN are replicated spatially for an arbitrary number of time steps. Hence, each node with a direct or indirect activation connection along the recurrent units has a certain number of copies.
In general, the update of the weights inside RNN is performed at each time step. Thus, the historical information of the input and the state of RNN at past time steps needs to be stored for online parameter updates. The

FIGURE 2 Moving window on the process [Colour figure can be viewed at wileyonlinelibrary.com]

sequence information within only a specific number of time steps is considered to compute the gradients, which makes the weight update feasible.

3 | THE METHODOLOGY

This section presents the RNN‐based residual chart including the development of datasets, model construction, and
application procedure to detect mean shifts of autocorrelated processes.

3.1 | Data presentation

In this study, a moving window method (see Figure 2) including n data points collected from the time series is
employed to produce dataset for RNN. The number of neurons in the input layer depends upon the window size
n. The window size has a significant influence on the performance of the monitoring model to detect process vari-
ability. A small window size is usually capable of identifying abnormal signals with large shift magnitude quickly,
which results in a short in‐control ARL, ie, a large Type I error. A large window contains more observation points
for detection of abnormal signals with small shift magnitude but might increase the required time for detection of
these abnormal signals (ie, a long out-of-control ARL and a large Type II error). Hence, a suitable window size should be determined to balance the Type I and Type II errors. Usually, the window size n is determined by the trial-and-error method: a training dataset is collected in advance to test the performance of the RNN-based residual chart with different window sizes n, and an optimal n is determined based on a good balance between the Type I and Type II errors. In this study, n is set to 24; that is, a window size of 24 is selected to perform the experiments. In addition, various window sizes are considered to carry out more experiments toward investigating the performance of RNN to
monitor the inherent disturbance of autocorrelated processes. In this study, we consider different autocorrelated pro-
cesses to verify effectiveness of the proposed RNN‐based residual control chart. Various shift magnitudes will be con-
sidered on each autocorrelated process with an autocorrelation in the testing phase.
We collect time‐series data online that describe the normal state of the autocorrelated process. In the training phase,
we use the normal data from this process to train an RNN model. In the testing phase, we consider different shift magnitudes for each autocorrelated process to test the performance of the RNN-based residual chart. A general procedure for developing the dataset for an autocorrelated process is as follows.

Step 1: Generate t data points (x1, x2, …, xt) from an autocorrelated process as follows:

x(t) = Φ·x(t − 1) + εt    (4)

where x(t) and x(t − 1) denote the data points at times t and (t − 1), respectively, Φ is the autoregressive coefficient in the range [−1, 1], and εt is an error term following a normal and independent distribution.

Step 2: Generate the moving window vectors based on the collected data points;
Step 3: Construct the input matrix X using Equation (5) and the output vector Y = [xn+1, xn+2, …, xt] (n denotes the window size) to feed into the RNN model for training;

X = | x1      x2       …  xn    |
    | x2      x3       …  xn+1  |    (5)
    | ⋮       ⋮        ⋱  ⋮     |
    | xt−n    xt−n+1   …  xt−1  |
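Steps 1 to 3 above can be sketched as follows; the series length, the Φ value, and the seed are illustrative choices, not values from the paper.

```python
import numpy as np

def generate_ar1(t, phi, rng, delta=0.0, shift_at=None):
    """Generate t data points from Equation (4); if shift_at is given,
    a shift delta is added from that index onward, as in Equation (6)."""
    x = np.zeros(t)
    for i in range(1, t):
        x[i] = phi * x[i - 1] + rng.normal()
        if shift_at is not None and i >= shift_at:
            x[i] += delta
    return x

def moving_windows(x, n):
    """Build the input matrix X of Equation (5) and the target vector Y."""
    X = np.array([x[i:i + n] for i in range(len(x) - n)])
    Y = x[n:]
    return X, Y

rng = np.random.default_rng(1)
series = generate_ar1(t=500, phi=0.5, rng=rng)
X, Y = moving_windows(series, n=24)   # X: (476, 24), Y: (476,)
```

Each row of X is one moving window, and the corresponding entry of Y is the next point the network should predict.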

The development of the testing dataset for the autocorrelated process with various shift magnitudes consists of the
following four steps:

 
Step 1: Generate data points (x′1, x′2, …, x′t) from the autocorrelated process with various shift magnitudes:

x(t) = Φ·x(t − 1) + εt + δ    (6)

where δ is a shift that occurs at time point t (the shift begins at time point 71 in this study);

Step 2: Generate the moving window vectors based on the collected data points;
Step 3: Feed the input matrix X′ of Equation (7) into the well-trained RNN to generate the residuals R′ = (x′n+1 − o′n+1, x′n+2 − o′n+2, …, x′t − o′t), where o′t denotes the prediction value at time point t;

X′ = | x′1     x′2       …  x′n    |
     | x′2     x′3       …  x′n+1  |    (7)
     | ⋮       ⋮         ⋱  ⋮      |
     | x′t−n   x′t−n+1   …  x′t−1  |

Step 4: These residuals are plotted in the RNN‐based chart for monitoring.
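The four steps above can be sketched as follows. Since a trained RNN is not available in a short sketch, the optimal AR(1) one-step predictor Φ·x(t − 1) stands in for the network's prediction o′t; for an AR(1) process with a sustained shift, the residual x(t) − Φ·x(t − 1) equals εt before the shift and εt + δ afterwards.

```python
import numpy as np

rng = np.random.default_rng(2)
phi, n, t_total, shift_at, delta = 0.5, 24, 200, 71, 2.0

# Step 1: in-control data followed by a shifted segment (Equation (6)).
x = np.zeros(t_total)
for i in range(1, t_total):
    x[i] = phi * x[i - 1] + rng.normal() + (delta if i >= shift_at else 0.0)

# Steps 2-3: slide the window and compute the residuals R'.  The stand-in
# predictor phi * x[i-1] replaces the well-trained RNN's output o'_i.
predictions = phi * x[n - 1:-1]       # one-step predictions for x[n:]
residuals = x[n:] - predictions       # R' of Step 3

# Step 4 would plot these residuals on the chart; here we just split them
# at the shift to see the mean move away from zero.
in_control = residuals[: shift_at - n]
out_of_control = residuals[shift_at - n:]
```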

The special-cause control (SCC) chart, namely, the residual chart based on RNN, is applied for online monitoring. Hence, if a residual on the RNN-based SCC chart exceeds the threshold, the process is considered to be in an abnormal state. All processes start in an in-control operation, and then a shift is added to the process mean. The moving window of size n contains m initial points from the normal state and n − m out-of-control points. The parameter m is the number of in-control data points entering a moving window at different time points when testing the out-of-control ARL of the proposed control chart; it is a descriptive variable and is not determined in advance. That is to say, the window ending at point m + 24 is the last in-control window vector, and the monitored process is supposed to shift to an out-of-control condition when the window vector at point m + 24 is fed to RNN. Thus, when the first abnormal point enters the moving window, the residual on the SCC is supposed to exceed the control limit. In practice, an out-of-control signal usually occurs on the SCC after some cycles in which the process is operating in a normal condition, and the starting time of the shift point is often unknown (Guh50). When data points with large shift magnitudes enter the moving window, an alarm will be given quickly (ie, a small Type II error), which means that m is small; when small shifts enter the window, more data points will be needed to give an out-of-control alarm (ie, a large m and a large Type II error). It should be mentioned that the Type I error is in one-to-one correspondence with the in-control ARL (Khoo and Ariffin, Gültekin et al, Marcos et al, Yourstone and Zimmer, Yu and Liu51-55). In this study, an in-control ARL of 370 is used to determine the threshold (ie, the upper control limit) of the proposed control chart, which corresponds to a Type I error of 0.27% (Khoo and Ariffin51). The Type II error is obtained in the testing phase when an alarm is triggered, giving an out-of-control ARL.
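The relation ARL0 = 1/α links the in-control ARL of 370 to the Type I error of 0.27%. A Monte Carlo sketch of setting the upper control limit from in-control residuals follows; the Gaussian residual assumption and the use of absolute residuals are illustrative only.

```python
import numpy as np

alpha = 1.0 / 370.0        # in-control ARL of 370  <->  Type I error ~ 0.27%

# In-control residuals of a well-fitted model are approximately i.i.d.
# noise; a standard normal is assumed here purely for illustration.
rng = np.random.default_rng(3)
abs_residuals = np.abs(rng.normal(size=200_000))

# Threshold (upper control limit): the (1 - alpha) quantile of |residual|.
ucl = float(np.quantile(abs_residuals, 1.0 - alpha))

# Sanity check: points exceed the limit at rate ~ alpha, so the empirical
# in-control ARL (1 / exceedance rate) is close to 370.
arl0 = 1.0 / np.mean(abs_residuals > ucl)
```

For standard normal residuals the limit comes out near 3.0, recovering the familiar three-sigma rule.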

TABLE 1 The network structure parameters of the RNN

Structure parameters   Input layer         24
                       Hidden layer        34
                       Output layer        1
                       Transfer function   Trainlm
Learning parameters    Batch size          200/500
                       Epochs number       3000
                       Goal                0.00001
                       Layer delays        1:2

3.2 | Construction of RNN

In this section, RNN is constructed as a monitoring model for recognizing mean shifts of autocorrelated processes.
Table 1 shows the architecture parameters of the RNN model. The network structure and learning parameters are set up by using the trial-and-error method in order to obtain a well-trained RNN. We adopted a three-layer architecture of “24-34-1” to predict the process value at the next time point based on input window vectors with 24 points. The number of neurons in the input layer of RNN is consistent with the size of the moving window. In this study, the window size is determined by the trial-and-error method on data collected in advance. Namely, different window sizes within a range are used to test the performance of the RNN-based control chart on the collected data, and the window size achieving a good balance between the Type I and Type II errors of the RNN-based control chart is selected. A sensitivity analysis over different window sizes is also performed in the experimental section to illustrate the influence of the window size on the performance of the RNN-based control chart. According to prior experience, the size of the hidden layer is usually less than 2i + 1 and greater than i (i is the size of the input layer), and it is finally determined by deleting nodes in the hidden layer from 2i + 1 down to i to obtain good prediction performance on the training dataset. The size of the output layer is set to 1 for prediction of the next time point in each autocorrelated process.
In the testing phase, the window vectors are fed to RNN to generate the SCC chart. Based on the SCC chart, we can observe whether the process remains in an in-control condition or deviates from the normal state. The average out-of-control ARL over 1000 simulation runs is computed along with its standard error for each case with different autocorrelations and shift magnitudes. The procedure of RNN construction is presented in Figure 3.
The dataset is developed based on the moving window method. The window vector containing n data points (xt,1, xt,2, …, xt,n) at time t is fed into the RNN to learn the mapping between the input window and the output xt,n+1 in the training phase, and to predict the next data point xt,n+1 in the testing phase by using Equation (2). The procedure for setting up the RNN-based residual control chart is as follows:

Step 1: Collect normal data from an autocorrelated process;


Step 2: Develop the training dataset based on the moving window method;
Step 3: Set up the network structure and learning parameters as shown in Table 1;
Step 4: The training dataset is used to train the RNN model;
Step 5: When the prediction errors meet the predetermined requirement in the training phase, a well-trained RNN is constructed;

FIGURE 3 The procedure of RNN construction [Colour figure can be viewed at wileyonlinelibrary.com]

Step 6: The prediction values are compared with real values to calculate prediction error for construction of the RNN‐
based residual control chart;
Step 7: The threshold (ie, upper control limit) of the RNN‐based residual control chart is determined based on in‐
control ARL = 370 (ie, Type I error = 0.27%);
Step 8: The testing samples in online monitoring phase are fed to the well‐trained RNN to obtain the prediction error
that will be plotted on the residual control chart.

3.3 | Application procedure

The RNN-based method consists of two phases, ie, offline modeling and online process monitoring (see Figure 4). The offline phase trains an RNN model to learn the alignment between the input sequence and the target output. In the testing phase, the input data are fed into the well-trained RNN to predict process changes, and then the SCC chart is generated to analyze the process state. As shown in Figure 4, the detailed application procedure of the RNN-based residual control chart for detecting mean shifts in autocorrelated processes is as follows:

Step 1: Develop the training dataset from an autocorrelated process based on the moving window method;
Step 2: Feed the training dataset into the RNN model for training;
Step 3: Construct a well-trained RNN model that fits the time-series data;
Step 4: Plot prediction errors of the RNN model on the residuals control chart, and a threshold (ie, upper control limit)
is determined based on an in‐control ARL of 370 (ie, Type I error = 0.27%);
Step 5: In the testing phase, feed the input window vectors into the well‐trained RNN to obtain residuals;
Step 6: Plot each residual on the residual control chart to determine whether it exceeds the predetermined threshold;
Step 7: If the residual exceeds the threshold, an alarm is triggered for an out-of-control signal.
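Steps 5 to 7 can be condensed into a small online monitoring loop. The stand-in predictor (the optimal AR(1) one-step forecast in place of the well-trained RNN), the threshold of 3.0, and the simulated stream are all illustrative assumptions.

```python
import numpy as np

def online_monitor(stream, predict, ucl, n=24):
    """Slide a window of size n over the stream, compare each new point
    with the predictor's output, and return the index of the first
    residual exceeding the threshold (Steps 5-7), or None."""
    window = []
    for t, x in enumerate(stream):
        if len(window) == n:
            residual = x - predict(window)
            if abs(residual) > ucl:
                return t               # alarm: out-of-control signal
            window.pop(0)
        window.append(x)
    return None                        # process stayed in control

# Illustrative run: AR(1) stream with a mean shift injected at t = 71,
# monitored with the AR(1) one-step forecast as a stand-in RNN.
phi, rng = 0.5, np.random.default_rng(5)
x, data = 0.0, []
for t in range(200):
    x = phi * x + rng.normal() + (3.0 if t >= 71 else 0.0)
    data.append(x)

alarm = online_monitor(data, predict=lambda w: phi * w[-1], ucl=3.0)
```

With a shift of this magnitude the loop flags the process shortly after the shift point; smaller shifts would take more points, mirroring the Type II error discussion above.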

FIGURE 4 The application procedure of the RNN‐based residual control chart [Colour figure can be viewed at wileyonlinelibrary.com]

4 | EXPERIMENT AND RESULT ANALYSIS

4.1 | The model of interest: AR (1) with step shift

In real-world industries, many manufacturing systems manifest an AR characteristic of lag 1 (ie, AR(1)). The value of a process variable depends upon its previous value at varying time lags due to the autoregressive form. The relationship between the values of the time series in an AR(1) model can be presented as follows:

Z(t) = μ + Φ·(Z(t − 1) − μ) + εt    (8)

where Z(t) and Z(t − 1) denote the data points at times t and (t − 1), respectively, μ is the mean of the process data, Φ is the autoregressive coefficient in the range [−1, 1], and εt is an error term following a normal and independent distribution.

TABLE 2 ARL comparison among SCC, BPN, LRProb, and RNN

               SCCa             BPNb             LRProbc          RNN
Φ      δ       ARL      SRL     ARL      SRL     ARL      SRL     ARL      SRL     Threshold
0.00 0.00 370.40 369.88 372.96 370.10 379.86 324.27 372.33 168.44 0.523
0.50 152.22 154.72 25.38 17.93 20.53 14.58 31.73 19.499
1.00 43.89 43.39 8.29 5.76 8.92 4.49 10.86 8.314
2.00 6.30 5.78 2.47 1.51 3.95 1.68 1.23 0.621
3.00 2.00 1.41 1.29 0.61 2.65 1.01 1.00 0.032
0.25 0.00 370.40 N/Ad 371.23 385.70 374.06 364.2 369.78 168.62 0.480
0.50 206.04 N/A 32.46 24.96 35.09 31.66 30.83 24.84
1.00 75.42 N/A 11.87 9.03 11.69 6.65 5.29 5.54
2.00 12.24 N/A 3.39 2.19 5.64 2.21 1.16 0.48
3.00 2.85 N/A 1.63 0.99 3.87 1.62 1.00 0.03
0.50 0.00 370.40 N/A 371.3 373.57 375.68 352.01 371.74 169.00 0.387
0.50 258.42 N/A 52.07 45.74 43.24 39.71 30.25 22.16
1.00 123.82 N/A 16.74 12.88 15.45 9.13 7.36 6.09
2.00 24.22 N/A 4.84 3.26 6.42 3.63 1.15 0.61
3.00 4.14 N/A 2.22 1.52 3.86 1.95 1.00 0.03
0.75 0.00 370.40 N/A 370.6 368.36 373.23 345.21 371.75 167.27 0.241
0.50 311.23 N/A 91.72 94.81 70.91 70.95 112.60 52.53
1.00 197.74 N/A 35.42 32.71 25.53 20.08 33.98 34.34
2.00 40.24 N/A 8.95 9.23 11.37 6.36 1.24 1.85
3.00 3.01 N/A 3.52 3.28 6.83 4.21 1.00 0.00
0.95 0.00 370.40 369.88 370.37 374.11 379.20 336.06 372.38 169.50 0.125
0.50 330.96 357.37 152.09 148.81 130.12 142.58 129.19 47.84
1.00 138.84 267.19 77.00 69.16 54.30 62.40 73.58 64.46
2.00 1.08 6.21 32.07 27.57 16.97 15.49 4.41 20.54
3.00 1.00 0 10.17 1.00 8.99 7.33 1.01 0.35
a Results taken from Wardell et al.7
b Results taken from Hwarng.56
c Results taken from Yu and Liu.55
d Not reported in the original paper.

If a shift δ occurs at point t, the process changes to an abnormal state as follows:

Z(t) = δ + Z(t − 1). (9)

We can see that the expected difference between the abnormal value of Z(t) and the in‐control value of Z(t − 1) is δ. Thus, the expected change at time point (t + k) is supposed to be (1 − Φ)δ for k ≥ 1. As a result, if the previous abnormal points cannot be detected immediately, it becomes more difficult to detect the shift as more data points enter the process.
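This residual pattern can be verified directly: for a noise-free AR(1) path with a sustained shift δ, the one-step prediction error of the exact model equals δ at the shift point and (1 − Φ)δ afterwards. A small self-contained check, with illustrative parameter values:

```python
def residuals_after_shift(phi, delta, n=6, shift_at=2, mu=0.0):
    """One-step residuals e(t) = Z(t) - (mu + phi*(Z(t-1) - mu)) for a
    noise-free AR(1) path whose mean shifts by delta from time shift_at on."""
    z = [mu + (delta if t >= shift_at else 0.0) for t in range(n)]
    return [z[t] - (mu + phi * (z[t - 1] - mu)) for t in range(1, n)]

print(residuals_after_shift(phi=0.5, delta=2.0))   # → [0.0, 2.0, 1.0, 1.0, 1.0]
```

With Φ = 0.5 and δ = 2, the residual is δ = 2 at the shift point and (1 − Φ)δ = 1 thereafter, which is why late detection becomes harder.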

4.2 | ARL performance comparison with SCC, BPN, and LRProb

The ARL comparison is first provided among the RNN‐based residual chart, the BPN‐based residual chart, and the AR‐based SCC. The in‐control condition is a normally distributed process with mean of zero and variance of one. Based on Equations (4) and (6), 500 in‐control moving windows of size 24 were generated for each autocorrelation parameter Φ (ie, 0.0, 0.25, 0.475, 0.50, 0.75, and 0.95) as the training datasets of RNN. Thus, six RNNs were constructed, one for each autocorrelation coefficient. The threshold set up for each Φ is listed in the last column of Table 2.
These mean shifts δ = 0.50, 1.00, 2.00, and 3.00 are considered using Equations (4) and (6) to generate the testing dataset for each Φ, where shifts are added at point 71. It should be pointed out that mean shifts ranging from 0.5 to 3.0 are often considered in many works (Wardell et al,7 Hwarng,56 and Yu and Liu55) to test the sensitivity of their
control charts to various shift magnitudes. Table 2 presents the ARLs and SRLs (the standard deviation of the run length) of the three methods. Each ARL and threshold is computed through 1000 simulations. The results of the SCC, BPN, and LRProb charts are taken directly from Wardell et al,7 Hwarng,56 and Yu and Liu,55 respectively. We can see from Table 2 that the abnormal signal is detected when the first shifted window vector enters RNN for large shifts δ ≥ 2.00, except when δ = 2.00 of Φ = 0.95. This illustrates that RNN is quite effective in detecting large mean shifts. RNN outperforms SCC in all cases except when δ ≥ 2.00 of Φ = 0.95. Compared with LRProb, RNN shows a quicker response to process changes in most cases except when δ ≤ 1.00 of Φ = 0, δ = 0.5 of Φ = 0.75, and δ = 1.0 of

TABLE 3 ARL comparison among RNN, SCC, X, EWMA, EWMAST, ARST, MQE, and LRProb

Φ      δ (in σ)   SCCa      Xa        EWMAa     EWMASTb   ARSTc     MQEd      LRProbe   RNN       Threshold
0.00 0.00 370.38 370.40 369.00 370.40 N/A 371.41 372.86 372.33 0.523
0.50 152.22 152.22 28.19 28.19 N/A 54.12 20.53 31.73
1.00 43.89 43.89 9.73 9.73 N/A 10.9 8.12 10.86
2.00 6.30 6.30 4.18 4.18 N/A 4.81 3.85 1.23
3.00 2.00 2.00 2.76 2.76 N/A 2.97 2.55 1.00
0.475 0.00 370.38 365.34 376.53 373.66 370.00 370.39 371.43 371.76 0.510
0.50 253.13 166.77 70.05 77.91 65.60 59.36 42.85 55.09
1.00 117.96 51.05 20.69 22.45 20.30 17.79 15.42 12.88
2.00 22.64 8.69 7.16 6.06 6.61 5.84 6.54 1.98
3.00 4.02 2.50 4.28 3.35 3.67 3.74 3.67 1.02
0.95 0.00 370.38 369.15 365.16 368.31 370.00 379.20 372.27 372.38 0.127
0.50 330.95 259.73 245.67 222.07 226.00 130.12 130.10 129.19
1.00 138.84 118.92 107.83 105.12 102.00 54.30 56.00 73.58
2.00 1.08 22.44 27.79 23.33 25.80 16.97 15.35 4.41
3.00 1.00 1.43 10.01 7.41 8.65 8.99 6.06 1.01
a Results taken from Wardell et al.7
b Results taken from Zhang.12
c Results taken from Tsui and Woodall.57
d Results taken from Yu and Xi.58
e Results taken from Yu and Liu.55

Φ = 0.95. The LRProb chart shows performance similar to that of the BPN‐based residual chart. Thus, the RNN‐based residual chart shows the best results in most cases. It should be noted that the SRLs of RNN are the smallest among all monitoring methods in most cases except when δ = 0.5 and δ = 1.00 of Φ = 0, δ = 1.0 of Φ = 0.75, and δ = 1.00 and δ = 2.00 of Φ = 0.95. This indicates its high robustness in shift detection. Moreover, RNN is capable of dealing with large time‐series data at little time cost: it took only 1.56649 s for RNN to process 200 000 window vectors (window size 24) in each case. RNN thus shows a significant improvement over the other monitoring methods in terms of accuracy, time cost, and robustness.
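The ARL and SRL figures reported in Tables 2 and 3 come from repeated simulation runs; a generic sketch of that bookkeeping (ours, not the authors' code) is:

```python
def run_length(residuals, threshold):
    """1-based index of the first residual beyond the threshold."""
    for i, r in enumerate(residuals, start=1):
        if abs(r) > threshold:
            return i
    return len(residuals)  # censored: no alarm within the simulated run

def arl_srl(run_lengths):
    """Average run length and its standard deviation over many simulations."""
    n = len(run_lengths)
    arl = sum(run_lengths) / n
    srl = (sum((x - arl) ** 2 for x in run_lengths) / n) ** 0.5
    return arl, srl
```

For in-control runs, the threshold is tuned until the average of these run lengths is close to 370; for shifted runs, smaller ARLs mean faster detection.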

4.3 | ARL performance comparison with other monitoring methods

To further verify the performance of the RNN‐based residual chart for monitoring shifts, a comparison between RNN and
other monitoring methods (ie, SCC [Wardell et al7], X [Wardell et al7], EWMA [Wardell et al7], EWMAST [Zhang12],
ARST [Tsui and Woodall57], MQE [Yu and Xi58], and LRProb [Yu and Liu55]) is provided in Table 3. We can
see that the RNN‐based residual chart shows the best results except when δ ≤ 1.00 of Φ = 0, δ = 0.5 of Φ = 0.475, and
δ ≥ 1.0 of Φ = 0.95. The RNN‐based residual chart gives an out‐of‐control alarm immediately for the large shift
δ = 3.00. SCC and X outperform RNN for δ ≥ 1.0 of Φ = 0.95. The reason is that only one observation is monitored at a
time in the SCC and X charts, whereas a moving window vector (window size 24) enters RNN for the next‐observation
prediction. Hence, the immediate deviation of the first shifted window vector, which consists of 23 in‐control points
and a single out‐of‐control point, cannot be recognized by the RNN‐based residual chart. Only when more shifted
points enter the moving window can RNN detect the process change. Consequently, combining RNN with other
statistic‐based control charts such as SCC could provide a significant improvement in the monitoring of autocorrelated processes.
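One simple way to realize the suggested combination is a union rule that signals when either chart flags the observation. This is only a sketch of the idea, and in practice the individual limits would have to be widened to keep the joint false-alarm rate at the desired level:

```python
def combined_alarm(rnn_residual, scc_residual, rnn_ucl, scc_cl):
    """Union rule: out-of-control if either chart signals.
    The RNN chart uses an upper limit on |residual|; SCC uses symmetric limits."""
    return abs(rnn_residual) > rnn_ucl or abs(scc_residual) > scc_cl

# Example with illustrative limits: the RNN residual is quiet but SCC reacts.
print(combined_alarm(0.10, 1.05, rnn_ucl=0.52, scc_cl=0.98))   # → True
```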

4.4 | Two process cases

In this study, two typical autocorrelated processes are further considered to analyze the performance of the RNN‐
based residual chart. The two datasets are collected from a papermaking and a viscosity manufacturing process, respec-
tively. The papermaking dataset consists of 160 observations as shown in Figure 5, and the viscosity dataset consists of
310 observations as shown in Figure 6.
The first‐order AR model (AR (1)) (Cook and Chiu3) is employed to fit the time series for papermaking and viscosity, with
autocorrelation coefficients of 0.8981 and 0.8615, respectively:
Papermaking process: Z(t) = 0.8981Z(t − 1) + εt, εt ∼ N(0, 0.1059).
Viscosity process: Z(t) = 0.8615Z(t − 1) + εt, εt ∼ N(0, 0.0934).
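Under the fitted models above, test data with a sustained mean shift can be simulated as follows. This is a sketch: the function and its defaults are ours, with the shift expressed (by assumption) in units of the innovation standard deviation and entering at t = 71 as in the text:

```python
import random

def simulate_case(n, phi, var, delta=0.0, shift_at=71, seed=0):
    """Zero-mean AR(1) process Z(t) = phi*Z(t-1) + eps_t, eps_t ~ N(0, var),
    with a sustained mean shift of delta (in units of the innovation standard
    deviation, an assumption) superimposed from time shift_at (1-based) on."""
    rng = random.Random(seed)
    sigma = var ** 0.5
    z, out = 0.0, []
    for t in range(1, n + 1):
        z = phi * z + rng.gauss(0.0, sigma)
        out.append(z + (delta * sigma if t >= shift_at else 0.0))
    return out

papermaking = simulate_case(160, phi=0.8981, var=0.1059, delta=1.0)
viscosity = simulate_case(310, phi=0.8615, var=0.0934, delta=2.0)
```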
The data collected from the two process cases operating in the normal situation are preprocessed to generate window
vectors of 24 points, without prior knowledge of the distribution, and are then fed to the RNN model for training.
Because no abnormal data were generated from these two processes in reality, we generate abnormal signals with
different shift magnitudes using Equation (6) by simulation as the testing dataset. In this study, we select
δ = 1.00σ and δ = 2.00σ as the small and large shift magnitudes for the two processes, respectively. The test dataset consists
of normal and abnormal signals. Besides, the inherent disturbance (δ = 1.00σ or δ = 2.00σ) is supposed to enter
the process at time point t = 71, which indicates that the first out‐of‐control window vector occurs at 47. After the first

FIGURE 5 Papermaking data (Pandit22)



FIGURE 6 Viscosity data (Box and Jenkins23)

abnormal window vector is generated, the new shifted window vectors are sequentially fed to RNN for process
monitoring.
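The bookkeeping between the shift point and the window index is worth making explicit: with window size 24, the window whose prediction target is the shifted point t = 71 is window 71 − 24 = 47, matching the text. A one-line helper, under our indexing assumption that window w holds points w to w + 23 and predicts point w + 24:

```python
def first_shifted_window(shift_point, window_size):
    """Index of the first window whose prediction target is the shifted point,
    assuming window w contains points w .. w+window_size-1 and predicts
    point w + window_size."""
    return shift_point - window_size

print(first_shifted_window(71, 24))   # → 47
```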

4.5 | Sensitivity analysis of RNN with different window sizes

The ARL results for different window sizes (ranging from 8 to 24) are presented in Tables 4 and 5 for the
two cases, respectively. From Table 4, we can see that the performance of the RNN‐based residual chart in detecting small
shifts (δ ≤ 1.5σ) improves as the window size increases, whereas a small window shows better performance in detecting
large shifts (δ ≥ 2.0σ), except for δ = 2.0σ with a window of size 24. In the viscosity process, Table 5 shows analogous
results: a small window is more capable of detecting large shifts (δ ≥ 2.0σ), except for δ = 2.0σ with window size
16. The detection performance of the RNN‐based residual chart for small shifts (δ ≤ 1.5σ) deteriorates as the window
size becomes smaller. The reasons why window size has a distinct impact on the performance in detecting various shift
magnitudes can be summarized as follows:

1. The window with large size contains more abnormal observation points, and smaller shifts in the process require
more evidence for a signal. Hence, a bigger window is needed to detect small shifts. A smaller window obviously
will not be able to accumulate as much evidence as a bigger window.
2. A small window works well for large shifts due to the bigger differences among the elements of a moving window
that consists of in‐control and out‐of‐control observations. Evidence of a shift is stronger in this case, and a large
window is not needed. Thus, the window with small size can detect process variation more easily when large shifts
occur in the processes.
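The evidence-accumulation argument in items 1 and 2 can be made concrete by counting how many points of a given window lie after the shift (same indexing assumption as before: window w holds points w to w + size − 1):

```python
def shifted_points_in_window(w, size, shift_at):
    """Number of points in window w (covering points w .. w+size-1) that fall
    at or after the shift point shift_at."""
    return max(0, min(size, w + size - shift_at))

# At the same window start, a larger window already holds more shifted points:
for size in (8, 24):
    print(size, [shifted_points_in_window(w, size, 71) for w in (60, 64, 68)])
```

For a small shift, the extra shifted points in a large window supply the additional evidence needed; for a large shift, even the one or two shifted points in a small window stand out against its in-control neighbors.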

Detection rates of RNN for different shift magnitudes and window sizes are presented in Figures 7 and 8. For
papermaking, the RNN‐based residual chart with a large window size exhibits competitive performance in detecting
small shifts δ ≤ 1.5σ. The abnormal change pattern with large shifts δ ≥ 2σ can be detected using a moving window of
small size, except for δ = 2.0σ with a window of size 24 (see Figure 7). For viscosity, a small window detects large
shifts (δ ≥ 2.0σ) more easily, except for δ = 2.0σ with window size 16, and a large window facilitates the detection of small
shifts δ ≤ 1.5σ (see Figure 8).

TABLE 4 ARL of RNN‐based residual chart for different window sizes in papermaking process

                Shift Magnitude (in σ)
Window Size     0.5       1.0       1.5      2.0      2.5     3.0     0.0       Threshold (λΦ)
8               135.089   111.847   58.596   7.317    1.813   1.000   371.993   0.161
12              134.212   107.653   58.140   13.299   1.870   1.000   371.914   0.149
16              128.775   105.79    56.455   13.556   2.093   1.056   371.336   0.1665
20              127.680   101.770   54.544   15.407   2.178   1.125   369.982   0.191
24              123.351   98.258    39.567   14.269   2.877   1.150   372.247   0.176

TABLE 5 ARL of RNN‐based residual chart for different window sizes in viscosity process

                Shift Magnitude (in σ)
Window Size     0.5       1.0       1.5      2.0      2.5      3.0     0.0       Threshold (λΦ)
8               138.673   128.732   69.831   26.292   6.245    1.328   369.528   0.187
12              136.271   120.075   78.042   31.564   6.326    1.367   370.813   0.1874
16              135.063   119.181   94.637   38.525   6.539    1.426   369.790   0.210
20              128.097   115.909   79.621   32.064   8.577    1.656   371.530   0.1965
24              127.929   113.067   79.780   33.275   10.877   2.050   370.245   0.1959

FIGURE 7 Detection rate of RNN‐based residual chart with different window sizes for papermaking

FIGURE 8 Detection rate of RNN‐based residual chart with different window sizes for viscosity

4.6 | RNN‐based residual chart

In this section, the two cases were further used to analyze the prediction accuracy and shift detection performance of
RNN. The typical methods, the BPN‐based residual control chart and the AR‐based residual control chart (ie, SCC), are also
implemented for comparison purposes. The mean shifts δ = 1.00σ and δ = 2.00σ occur at point t = 71 (ie, window vector
47) in the papermaking and viscosity processes, respectively (see Figures 9 and 10).

4.6.1 | RNN fitting performance analysis

The fitting performance of RNN is further compared with that of BPN and AR. Figures 11 and 12 present the fitting
results of the three methods (ie, RNN, BPN, and AR (1)) for the papermaking and viscosity processes, respectively. It is clear
from Figures 11 and 12 that RNN and BPN outperform AR (1) in time‐series fitting. Figures 13 and 14 present the
fitting errors of the three methods for papermaking and viscosity, respectively. The sums of absolute errors of the three
methods (ie, RNN, BPN, and AR) are 17.5492, 19.383, and 24.7713 for papermaking and 14.6396, 21.8334, and 41.7800
for viscosity, respectively. It is clear that RNN shows the best fitting accuracy, with the smallest
absolute error among the three methods.

FIGURE 9 Papermaking process. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

4.6.2 | Shift detection performance analysis

To maintain an in‐control ARL of 370 for the RNN‐based residual chart, 0.17 and 0.263 were used as the thresholds for
papermaking and viscosity, respectively. In this study, the control limit is determined from the absolute errors
between the predicted and true values so as to obtain an in‐control ARL of 370, which sets up the upper control limit
of the RNN‐based residual chart. The detection results over all shift magnitudes for the two processes are presented in
Figures 15 and 16, respectively. For both processes, RNN can detect the small and large shifts δ = 1.00σ and δ = 2.00σ
immediately at window vector 47 (equivalent to point 71 in Figures 9 and 10). This indicates that RNN is sensitive to
small shifts in an autocorrelated process with high autocorrelation. In general, it becomes difficult to detect shifts as
more points enter the moving window. However, RNN can still detect several abnormal points for the large shift
δ = 2.00σ.
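The thresholds above are tuned so that the in-control ARL is about 370. A generic way to do this (a Monte-Carlo bisection sketch of ours, not the authors' procedure) is to raise or lower the limit until the simulated in-control ARL hits the target; with standard normal residuals the search settles near the familiar 3σ limit:

```python
import random

def in_control_arl(threshold, sample_residual, runs=200, cap=2000):
    """Monte-Carlo estimate of the in-control ARL for a given control limit."""
    total = 0
    for _ in range(runs):
        t = 1
        while t < cap and abs(sample_residual()) <= threshold:
            t += 1
        total += t
    return total / runs

def calibrate(sample_residual, target=370.0, lo=0.0, hi=4.0, iters=12):
    """Bisection on the control limit so the in-control ARL meets the target."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_control_arl(mid, sample_residual) < target:
            lo = mid   # too many false alarms: widen the limit
        else:
            hi = mid   # alarms too rare: tighten the limit
    return (lo + hi) / 2

rng = random.Random(0)
ucl = calibrate(lambda: rng.gauss(0.0, 1.0))   # close to 3.0 for N(0, 1) residuals
```

For the RNN chart, `sample_residual` would instead draw from the empirical in-control prediction errors, yielding the process-specific limits such as 0.17 and 0.263.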
We also investigated the performance of the BPN‐based residual chart in detecting shifts, with thresholds of 0.175 and
0.217 for papermaking and viscosity, respectively (see Figures 17 and 18). For papermaking, only one point was detected,
at window 97, for the small shift magnitude δ = 1.00σ, which is equivalent to point 121 in Figure 9A. It cannot detect this shift at window 47

FIGURE 10 Viscosity process. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

FIGURE 11 Fitting results of three methods for papermaking process. A, RNN; B, BPN; and C, AR

FIGURE 12 Fitting results of three methods for viscosity process. A, RNN; B, BPN; and C, AR

FIGURE 13 Fitting error of three methods for papermaking process. A, RNN; B, BPN; and C, AR

FIGURE 14 Fitting error of three methods for viscosity process. A, RNN; B, BPN; and C, AR

once a shift occurs in the process. The BPN‐based residual chart can detect the large shift δ = 2.00σ immediately at
window number 47. For viscosity, the first shift point detected by BPN occurs at window number 54 (equivalent to point 78
in Figure 10A) for the small shift magnitude δ = 1.00σ. However, it still cannot detect this shift at window 47. As for the large shift
δ = 2.00σ, the BPN‐based residual chart can detect the shift immediately at window 47. The RNN‐based residual chart is
capable of detecting the shifts (δ = 1.00σ, δ = 2.00σ) immediately when the first shifted point occurs in the papermaking
and viscosity processes. In contrast, the BPN‐based residual chart fails to detect shifted points immediately in the

FIGURE 15 Papermaking process: RNN‐based residual chart with threshold = 0.17. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

FIGURE 16 Viscosity process: RNN‐based residual chart with threshold = 0.263. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

papermaking process when δ = 1.00σ, 2.00σ and in the viscosity process when δ = 1.00σ. In this case, the RNN‐based
residual chart shows better detection performance than the BPN‐based residual chart.
The AR‐based SCC is widely used for autocorrelated process control. For papermaking, the control limit (ie, CL) of SCC is
±0.9763. For the small shift δ = 1.00σ, when the first abnormal point enters the process, the AR‐based SCC fails to give
an alarm signal. The first shift point detected by the AR‐based SCC occurs at t = 105, which shows a delay in shift
detection (the first shift point occurs at t = 71), as shown in Figure 19A. For the large shift δ = 2.0σ, SCC can detect
the shift point immediately after the first abnormal point enters the process (see Figure 19B). For viscosity, the
CL of SCC is ±0.9187. SCC gives false out‐of‐control alarms at t = 71 and 196 (see Figure 20). It is clear that the
AR‐based SCC is not sensitive to small shifts in the papermaking process. In this case, compared with SCC, the
RNN‐based residual chart is more effective for detecting small shifts in autocorrelated processes. Cook
and Chiu3 adopted the AR (1) model developed by Newton59 to model the time‐series data of the papermaking and
viscosity processes. Based on the AR (1) model, the residual control chart (ie, the SCC chart) was then developed to monitor the
two processes. Their results indicated that their method provided a substantial improvement over the residual control
chart. Thus, it is appropriate to use AR (1) to model the time sequences of the two processes. In this study, we also

FIGURE 17 Papermaking process: BPN‐based residual chart with threshold = 0.175. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

FIGURE 18 Viscosity process: BPN‐based residual chart with threshold = 0.217. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

consider AR (1) as a comparison method for the two processes with the proposed RNN model. The results show that
the performance of the AR (1) method is inferior to that of the proposed method.

4.7 | A nonstationary process

With the growing complexity of operations in manufacturing processes, increasing importance has been attached to
nonstationary processes. Compared with a stationary process, a nonstationary process is essentially unstable and
drifting, which makes it hard for learning models to capture its inherent characteristics statistically.
In order to illustrate the effectiveness of RNN on a nonstationary process, the fitting performance of
the RNN model is further compared with that of the BPN and AR models based on a sunspot dataset, as shown in
Figure 21.
The sunspot dataset has served as a benchmark in much work on nonlinear and nonstationary processes (Cao
and Gu60); it provides a record of 289 yearly averaged sunspot numbers from 1700 to 1979. The nonstationary process
model is constructed based on a training set with 100 window vectors of 12 points; that is, the current sunspot number
is predicted based on the 12 sunspot numbers preceding it. The testing dataset with 77 samples is utilized to validate the fitting
performance of the three methods (ie, RNN, BPN, and AR (1)).
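The windowing just described can be sketched as follows (assumed layout: each target value is paired with the 12 values preceding it, with the first 100 windows used for training and the next 77 for testing):

```python
def lagged_windows(series, lag=12):
    """Pair every point with the `lag` values that precede it."""
    X = [series[i - lag:i] for i in range(lag, len(series))]
    y = [series[i] for i in range(lag, len(series))]
    return X, y

data = list(range(289))                   # stand-in for the 289 yearly values
X, y = lagged_windows(data)
X_train, y_train = X[:100], y[:100]       # 100 training windows
X_test, y_test = X[100:177], y[100:177]   # 77 testing windows
```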

FIGURE 19 Papermaking process: SCC with CL = 0.9763. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

FIGURE 20 Viscosity process: SCC with CL = 0.9187. A, Shift = 1.00, t = 71; B, shift = 2.00, t = 71

FIGURE 21 Sunspot data



FIGURE 22 Prediction results of three methods for the nonstationary process. A, RNN; B, BPN; and C, AR

FIGURE 23 Prediction error of three methods for sunspot dataset. A, RNN; B, BPN; and C, AR

Because the residual control chart is constructed based on prediction errors, a good fitting performance of the
prediction model means it will be effective in detecting various shifts occurring in processes. The better the fitting
performance is, the more sensitive the control chart is to small shifts. In this section, we compare the fitting performance of
RNN, BPN, and AR (1) in order to demonstrate that the prediction performance of RNN remains good on this complex
autocorrelated process. Consequently, the sunspot series, a typical nonlinear and nonstationary process, is very appropriate
for comparing the prediction performance of RNN, BPN, and AR (1). The testing results imply that RNN provides
the best prediction result and could potentially be applied to the monitoring of nonstationary processes. Figure 22
presents the prediction results of the three methods (ie, RNN, BPN, and AR (1)) on the testing data. As can be seen from
Figure 22, RNN and BPN outperform AR (1) in fitting the nonstationary time series. Figure 23 shows the prediction
errors of the three methods. The absolute errors of RNN, BPN, and AR for the sunspot data are 5.7715, 7.3334, and
9.5327, respectively. It should be noted that the differences among the absolute errors of the three methods are not large
because the test set is small. However, it is clear that RNN shows the best prediction accuracy, with the smallest absolute
error among the three methods, on this nonstationary process.
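The fit metric used throughout this comparison is simply the sum of absolute one-step prediction errors over the test set:

```python
def sum_abs_error(actual, predicted):
    """Sum of absolute one-step prediction errors over a test set."""
    return sum(abs(a - p) for a, p in zip(actual, predicted))

print(sum_abs_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))   # → 1.5
```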

5 | CONCLUSIONS

In this paper, a novel RNN‐based residual chart with deep learning is proposed to recognize mean shifts in
autocorrelated processes. By using a deep recurrent network architecture, the proposed method is capable of identifying
mean shifts in correlated processes with different autocorrelation coefficients. RNN is employed for the first time to monitor
shifts in autocorrelated processes. The comparison between the proposed method and other typical methods (eg,
AR‐based SCC and the BPN‐based residual chart) demonstrates that the proposed method provides good performance
for monitoring mean shifts in autocorrelated manufacturing processes. This work could be extended to multivariate mon-
itoring of autocorrelated manufacturing processes with RNNs dealing with the prediction of time‐series data. In addition,
deep learning‐based fault diagnosis in autocorrelated processes is an interesting direction for future research.

ACKNOWLEDGEMENTS


This work was supported by the National Natural Science Foundation of China (No. 71777173) and the Fundamental
Research Fund for the Central University.

ORCID
Jianbo Yu https://orcid.org/0000-0003-3204-2486

REFERENCES
1. Deming WE. Out of the crisis. Massachusetts Institute of Technology 1986.
2. Zobel CW, Cook DF, Nottingham QJ. An augmented neural network classification approach to detecting mean shifts in correlated
manufacturing process parameters. International Journal of Production Research. 2004;42(4):741‐758.
3. Cook DF, Chiu CC. Using radial basis function neural networks to recognize shifts in correlated manufacturing process parameters. IIE
Transactions. 1998;30(3):227‐234.
4. English JR, Lee SC, Martin TW, Tilmon C. Detecting changes in autoregressive processes with X̄ and EWMA charts. IIE Transactions.
2000;32(12):1103‐1113.
5. Alwan LC, Roberts HV. Time‐series modeling for statistical process control. Journal of Business & Economic Statistics. 1988;6(1):87.
6. Harris TJ, Ross WH. Statistical process control procedures for correlated observations. Canadian Journal of Chemical Engineering.
1991;69(1):48‐57.
7. Wardell DG, Moskowitz H, Plante RD. Run‐length distributions of special‐cause control charts for correlated processes. Technometrics.
1994;36(1):3‐17.
8. Runger GC, Willemain TR, Prabhu S. Average run lengths for CUSUM control charts applied to residuals. Communications in Statistics—
Theory and Methods. 1995;24(1):273‐282.
9. Box G E P, Luceno A. Statistical control by monitoring and feedback adjustment. 1997.
10. Yourstone SA, Montgomery DC. A time‐series approach to discrete real‐time process quality control. Quality and Reliability Engineering
International. 1989;5(4):309‐317.
11. Wardell DG, Moskowitz H, Plante RD. Control charts in the presence of data correlation. Management Science. 1992; 38(8):1084‐1105.
12. Zhang NF. A statistical control chart for stationary process data. Technometrics. 1998;40(1):24‐38.
13. Wright CM, Booth DE, Hu MY. Joint estimation: SPC method for short‐run autocorrelated data. Journal of Quality Technology.
2001;33(3):1‐11.
14. Chen C, Liu LM. Joint estimation of model parameters and outlier effects in time series. J Am Stat Assoc. 1993;88(421):284‐297.
15. Huang X, Bisgaard S, Xu N. Model‐based multivariate monitoring charts for autocorrelated processes. Quality and Reliability Engineering
International. 2014;33(4):527‐534.
16. Pirhooshyaran M, Niaki STA. A double‐max MEWMA scheme for simultaneous monitoring and fault isolation of multivariate multistage
auto‐correlated processes based on novel reduced‐dimension statistics. Journal of Process Control. 2015;29(C):11‐22.
17. Akhundjanov SB, Pascual FG. Exponentially weighted moving average charts for correlated multivariate Poisson processes. Communica-
tions in Statistics—Theory and Methods. 2017;46(10):4977‐5000.
18. Bodnar O, Schmid W. CUSUM control schemes for monitoring the covariance matrix of multivariate time series. Stat. 2017;51(4):722‐744.
19. Amiri A, Sogandi F, Ayoubi M. Simultaneous monitoring of correlated multivariate linear and GLM regression profiles in phase II. Qual-
ity Technology and Quantitative Management. 2016;3703:1‐24.
20. Maleki MR, Amiri A, Taheriyoun AR. Phase II monitoring of binary profiles in the presence of within‐profile autocorrelation based on
Markov model. Commun Stat Simul Comput. 2017;46(10):7710‐7732.
21. Huang X, Khachatryan D. Dimension reduction for a multivariate time series process of a regenerative glass furnace. Journal of Quality
Technology. 2018;50(1):98‐116.
22. Pandit SM. Time Series and System Analysis with Applications. New York: Wiley; 1983.
23. Box GEP, Jenkins GM, Reinsel GC. Time series analysis: forecasting and control. 1994;37(2):238‐242.
24. Smith AE. X‐bar and R control chart interpretation using neural computing. International Journal of Production Research.
1994;32(2):309‐320.
25. Ho ES, Chang SI. An integrated neural network approach for simultaneous monitoring of process mean and variance shifts a comparative
study. International Journal of Production Research. 1999;37(8):1881‐1901.
26. Yang WA, Zhou W. Autoregressive coefficient‐invariant control chart pattern recognition in autocorrelated manufacturing processes
using neural network ensemble. J Intell Manuf. 2015;26(6):1161‐1180.
27. Purintrapiban U, Corley HW. Neural networks for detecting cyclic behavior in autocorrelated process. Computers and Industrial Engineering. 2012;62(4):1093‐1108.
28. Hwarng HB, Wang Y. Shift detection and source identification in multivariate autocorrelated processes. International Journal of Production Research. 2010;48(3):835‐859.
29. Yuangyai C, Abrahams R. Statistical process control with autocorrelated data using neural networks. In: IEEE International Conference on Quality and Reliability. IEEE; 2011:283‐287.
30. Prajapati DR, Singh S. Application of ANN to monitor the correlated process using higher sample size. International Journal of Performability Engineering. 2015;11(4):395‐404.
31. White AK, Safi SK. The efficiency of artificial neural networks for forecasting in the presence of autocorrelated disturbances. 2016;5(2):51‐58.
32. Chinnam RB. Support vector machines for recognizing shifts in correlated and other manufacturing processes. International Journal of
Production Research. 2002;40(17):4449‐4466.
33. Apley DW, Shi J. The GLRT for statistical process control of autocorrelated processes. IIE Transactions (Institute of Industrial Engineers).
1999;31(12):1123‐1134.
34. Lecun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436‐444.
35. Auli M, Galley M, Quirk C, Zweig G. Joint language and translation modeling with recurrent neural networks. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2013:1044‐1054.
36. Liu S, Yang N, Li M, Zhou M. A recursive recurrent neural network for statistical machine translation. Proceedings of the 52nd Annual
Meeting of the Association for Computational Linguistics (ACL 2014). 2014:1491–1500.
37. Mulder WD, Bethard S, Moens MF. A survey on the application of recurrent neural networks to statistical language modeling. Computer
Speech and Language. 2015;30(1):61‐98.
38. Socher R, Lin CC, Manning C, Ng AY. Parsing natural scenes and natural language with recursive neural networks. In: Proceedings of the 28th International Conference on Machine Learning (ICML‐11). 2011:129‐136.
39. Zou W, Xia Y, Li H. Fault diagnosis of Tennessee‐Eastman process using orthogonal incremental extreme learning machine based on
driving amount. IEEE Transactions on Cybernetics. 2018;48(12):3403‐3410.
40. Hsieh CH, Lu RH, Lee NH, Chiu WT, Hsu MH, Li YC(J). Novel solutions for an old disease: diagnosis of acute appendicitis with random
forest, support vector machines, and artificial neural networks. Surgery. 2011;149(1):87‐93.
41. Yélamos I, Escudero G, Graells M, Puigjaner L. Performance assessment of a novel fault diagnosis system based on support vector
machines. Computers & Chemical Engineering. 2009;33(1):244‐255.
42. Chiang LH, Kotanchek ME, Kordon AK. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Computers
& Chemical Engineering. 2004;28(8):1389‐1401.
43. Tamilselvan P, Wang P. Failure diagnosis using deep belief learning based health state classification. Reliability Engineering & System
Safety. 2013;115:124‐135.
44. Eslamloueyan R. Designing a hierarchical neural network based on fuzzy clustering for fault diagnosis of the Tennessee–Eastman process. Applied Soft Computing. 2011;11(1):1407‐1415.
45. Fan JY, Nikolaou M, White RE. An approach to fault diagnosis of chemical processes via neural networks. AIChE Journal.
1993;39(1):82‐88.
46. Lipton ZC, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019; 2015:1‐38.
47. Karim F, Majumdar S, Darabi H, Chen S. LSTM fully convolutional networks for time series classification. IEEE Access. 2018;6:1662‐1669.
48. Sutskever I, Vinyals O, Le QV. Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems
(NIPS). 2014:3104–3112.
49. Sak H, Senior A, Beaufays F. Long short‐term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv preprint arXiv:1402.1128; 2014.
50. Guh RS. A hybrid learning‐based model for on‐line detection and analysis of control chart patterns. Computers and Industrial Engineering.
2005;49(1):35‐62.
51. Khoo MBC, Ariffin KN. Two improved runs rules for the Shewhart X control chart. Quality Engineering. 2006;18(2):173‐178.
52. Gültekin M, English JR, Elsayed EA. Cross‐correlation and X‐bar‐trend control charts for processes with linear shift. International Journal
of Production Research. 2002;40(5):1051‐1064.
53. Marcos AA, Heuchenne C, Faraz A. Nonparametric control charts: economic statistical design. 2016.
54. Yourstone SA, Zimmer WJ. Non‐normality and the design of control charts for averages. Decision Sciences. 1992;23(5):1099‐1113.
55. Hwarng HB. Detecting process mean shift in the presence of autocorrelation: a neural‐network based monitoring scheme. International
Journal of Production Research. 2004;42(3):573‐595.
56. Yu JB, Liu JP. LRProb control chart based on logistic regression for monitoring mean shifts of auto‐correlated manufacturing processes.
International Journal of Production Research. 2011;49(8):2301‐2326.
57. Jiang W, Tsui KL, Woodall WH. A new SPC monitoring method: the ARMA chart. Technometrics. 2000;42(4):399‐410.
58. Yu JB, Xi LF. Using an MQE chart based on a self‐organizing map NN to monitor out‐of‐control signals in manufacturing processes. International Journal of Production Research. 2008;46(21):5907‐5933.
59. Newton HJ. Timeslab: A Time Series Analysis Laboratory. Pacific Grove, CA: Wadsworth and Brooks/Cole Publishing Company; 1988.
60. Cao L, Gu Q. Dynamic support vector machines for non‐stationary time series forecasting. Intelligent Data Analysis. 2002;6(1):67‐83.
AUTHOR BIOGRAPHIES
Shumei Chen received the BSc degree from the Nanjing University of Science and Technology, Nanjing, China, in
2017. She is currently pursuing the master's degree from Tongji University, Shanghai, China. Her current research
interests include machine learning, profile monitoring, and process control.
Jianbo Yu received the B.Eng. degree from the Department of Industrial Engineering, Zhejiang University of Technology, Zhejiang, China, in 2002, the M.Eng. degree from the Department of Mechanical Automation Engineering, Shanghai University, Shanghai, China, in 2005, and the PhD degree from the Department of Industrial Engineering
Shanghai University, Shanghai, China, in 2005, and the PhD degree from the Department of Industrial Engineering
and Management, Shanghai Jiaotong University, Shanghai, in 2009. In 2008, he joined the Center for Intelligent
Maintenance System, University of Cincinnati, Cincinnati, OH, USA, as a Visiting Scholar. From 2009 to 2013, he
was an Associate Professor with the Department of Mechanical Automation Engineering, Shanghai University. Since
2016, he has been a Professor with the School of Mechanical Engineering, Tongji University, Shanghai. His current
research interests include intelligent condition‐based maintenance, machine learning, quality control, and statistical
analysis. Dr Yu is the Editorial Board Member of Advances in Mechanical Engineering, Chinese Journal of Engineering, and Journal of Advanced Manufacturing Research.
How to cite this article: Chen S, Yu J. Deep recurrent neural network‐based residual control chart for
autocorrelated processes. Qual Reliab Engng Int. 2019;35:2687–2708. https://doi.org/10.1002/qre.2551