IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 28, NO. 1, FEBRUARY 2015

Abstract—Fault detection techniques are essential for improving the overall equipment efficiency of the semiconductor manufacturing industry. It has been recognized that fault detection based on the k nearest neighbor rule (kNN) can effectively deal with certain characteristics of semiconductor processes, such as multimode batch trajectories and nonlinearity. However, the computational complexity and storage space involved in the neighbor search of kNN prevent its use for online monitoring, especially in high-dimensional cases. To deal with this difficulty, principal component-based kNN has been presented in the literature, in which dimension reduction is performed by principal component analysis (PCA) before the kNN rule is applied to fault detection. However, dimension reduction by PCA may distort the distances between pairs of samples (trajectories). Thus, the false alarms and missing detections of kNN-based fault detection may increase in the principal component subspace, because PCA fails to preserve pairwise distances in that subspace. To overcome this drawback, we propose a new fault detection method based on random projection and the kNN rule, which combines the advantages of random projection in distance preservation (in expectation) and of the kNN rule in dealing with the problems of multimodality and nonlinearity that often coexist in semiconductor manufacturing processes. An industrial example illustrates the performance of the proposed method.

Index Terms—Fault detection, k-nearest neighbor rule (kNN), random projection (RP), distance preservation.

I. INTRODUCTION

FAULT detection and classification techniques continuously play an important role in the sustained growth of the semiconductor manufacturing industry. Combined with advanced process control techniques, they can characterize and control variability in critical manufacturing processes and thus reduce wafer scrap, increase equipment uptime and reduce the usage of test wafers [1]–[4]. Traditional multivariate statistical process monitoring (MSPM) methods, such as principal component analysis (PCA) and partial least squares (PLS), have been extensively used in semiconductor manufacturing process monitoring [5]–[8]. However, as a typical batch process, semiconductor processes have characteristics such as multimode batch trajectories, nonlinearity and non-Gaussian distributed data (see Fig. 11 in [1] and Fig. 3 in [9]), which have posed challenges to these MSPM methods. Several modified PCA-based methods for nonlinearity [10], [11], multimodality [12], [13], and non-Gaussian data [14] also encounter difficulties when these problems coexist.

To overcome these limitations, He and Wang [1] proposed a fault detection method based on the k nearest neighbor rule (FD-kNN). Unlike the well-known k nearest neighbor rule for multi-class classification, it is used as an anomaly detection algorithm in which only normal data are available for model building. It performs fault detection based on the following criterion: the trajectory of a normal test sample is similar to the trajectories of the normal training samples (obtained under normal operating conditions), while the trajectory of a faulty sample deviates significantly from the trajectories of the normal samples in the training set. The deviation is measured by the kNN distance, which is the average squared distance between the incoming test sample and its k nearest neighbors in the normal training set. The reason why FD-kNN is superior to PCA for fault detection in semiconductor processes is that PCA fails to capture the above characteristics of semiconductor manufacturing processes: it implicitly assumes a multivariate Gaussian distribution of the measurement variables and tries to characterize the global variation of the underlying data. In contrast, FD-kNN overcomes these problems by utilizing the relations of distances among local samples to perform fault detection. This has been clearly illustrated through simulation and industrial examples in [1].

In general, the measurement data of a batch process are collected in a 3-D array, denoted as X ∈ R^(I×J×K), where I, J and K represent the numbers of batches, variables and sampling times, respectively. Before applying PCA, several processing steps are needed, including unfolding the 3-D array into a 2-D matrix X. The way adopted in FD-kNN is batch-wise unfolding (X ∈ R^(I×JK)), in which each batch trajectory, characterizing the variation over the whole batch duration, is represented by a higher

Manuscript received June 23, 2014; revised October 11, 2014; accepted November 19, 2014. Date of publication November 26, 2014; date of current version January 30, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 61034006, Grant 61104028, Grant 61273170, Grant 61203094, Grant 61290324, and Grant 61333005, and in part by the National High-tech R&D Program of China (863 Program) under Grant 2012AA041709. A brief version of the paper was presented at the 33rd Chinese Control Conference, July 28, 2014. (Corresponding author: C. Wen.)

Z. Zhou and C. Yang are with the State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University, Hangzhou 310027, China (e-mail: zhouzhe@zju.edu.cn; cjyang@iipc.zju.edu.cn).

C. Wen is with the Institute of Systems Science and Control Engineering, School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China, and also with the College of Electrical Engineering, Henan University of Technology, Zhengzhou 450007, China (e-mail: wencl@hdu.edu.cn).

Digital Object Identifier 10.1109/TSM.2014.2374339
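The FD-kNN criterion just described, thresholding the average squared distance from a test trajectory to its k nearest normal training trajectories, can be sketched in a few lines of NumPy. The code below is our own minimal illustration on synthetic data, not the implementation of [1]:

```python
import numpy as np

def knn_sq_distance(x, X, k):
    """Average squared Euclidean distance from x to its k nearest rows of X."""
    d2 = np.sort(((X - x) ** 2).sum(axis=1))
    return d2[:k].mean()

def fd_knn_threshold(X, k=3, alpha=0.05):
    """Control limit: (1 - alpha) empirical quantile of the leave-one-out
    kNN distances of the normal training samples."""
    stats = []
    for i in range(len(X)):
        rest = np.delete(X, i, axis=0)   # a sample is not its own neighbor
        stats.append(knn_sq_distance(X[i], rest, k))
    return np.quantile(stats, 1.0 - alpha)

# Toy usage: normal data form one cluster; a distant sample is flagged.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))      # 100 normal "trajectories"
limit = fd_knn_threshold(X_train, k=3, alpha=0.05)
is_faulty = lambda y: knn_sq_distance(y, X_train, 3) > limit
```

A trajectory far from all normal ones (for example np.full(5, 10.0)) exceeds the limit, while a typical normal trajectory does not; only normal data enter the model, which is what makes this an anomaly detector rather than a classifier.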
ZHOU et al.: FAULT DETECTION USING RANDOM PROJECTIONS AND kNN RULE FOR SEMICONDUCTOR MANUFACTURING PROCESSES 71
72 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 28, NO. 1, FEBRUARY 2015
Fig. 1. Fault detection using FD-kNN with k = 3.
Fig. 2. Fault detection using PC-kNN with PCs = 1.

A. Distance Preservation by PCA

The conclusion is that projection by PCA cannot preserve the capability of the kNN rule.

Here, a two-dimensional example is given to show that dimension reduction by PCA fails to preserve the distances among samples. Fig. 1 is an illustration of fault detection by FD-kNN with k = 3. It is a nonlinear case with two variables, and the ten normal samples are represented by circles. It can be seen from the distribution of the normal samples that the relation between the variables is approximately quadratic. The faulty samples are represented by three red squares. Obviously, all three faulty samples deviate significantly from their neighbors among the normal samples, so FD-kNN easily detects all three faulty samples correctly.

When PC-kNN is applied to this example, PCA is first applied to the normal samples to learn the basis of the PCS. The variance along the direction of axis x1 is the largest, so an orthonormal basis along the direction x1 is selected as the basis of the PCS. Then the normal samples are projected onto the PCS and the threshold is calculated in the PCS. For online monitoring, a new incoming sample is first projected onto the PCS; then its kNN distance is calculated and compared with the threshold to perform fault detection. The results are shown in Fig. 2. From Fig. 2, we can see that the projections of the three faulty samples from the original space are mixed with the projections of the normal samples in the PCS, and their neighbors in the PCS are not identical to those in the original space. Therefore, kNN fails to detect the three faulty samples in the PCS, even though the threshold in the PCS differs from the original one. This means that dimension reduction by PCA changes the pairwise distances in the PCS and is thus unable to retain the advantages of kNN there, which may result in an increase of false alarms or missing detections.

In addition, it is hard to determine the number of retained principal components l due to the difficulty of measuring the effect of l on distance preservation (which further affects kNN). As will be seen from the industrial example in Section V, the best result of PC-kNN is obtained when the number of PCs is small (capturing only about 10% of the variance); worse results are obtained with more PCs.

If the fault occurs in a direction dominated by the first few principal components, it may also be detected by kNN in the PCS. However, it can never be detected by kNN in the PCS if the fault occurs in the residual subspace. In fact, a fault is more likely to be detected in the residual subspace, as has been indicated in several MSPM-related works [6], [23], [24]. As in the two-dimensional example given in this subsection, if the information in the residual subspace could be utilized, for example by implementing the kNN rule in the residual subspace as well, those three faults could be detected correctly. However, the advantages of low computational cost and low storage requirements would then also disappear.

B. Distance Preservation by Random Projection

It is evident that projection based on PCA captures a global property and cannot provide any local guarantees. Random projection, which is introduced in this subsection, can preserve pairwise distances. It has been extensively used in machine learning [16]–[18], image processing [19], [20] and compressed sensing [21], [22].

An important result, the Johnson-Lindenstrauss (JL) lemma [30], says that any set of n points in a (high) d-dimensional Euclidean space can be mapped into an O(ε^(−2) log n)-dimensional Euclidean space such that the distance between any two points is distorted by only a factor of 1 ± ε (0 < ε < 1) [31]. Later, several researchers provided simpler proofs of the original JL lemma and specified bounds on the reduced dimension L [32]–[34]. To further improve the efficiency of the projection operation, Achlioptas [34] designed two very simple projection matrices whose elements are either 0 or ±1, and provided a new bound on L similar to the result in [33].

Theorem 1: Let Q be an arbitrary set of n points in R^d, represented as an n × d matrix A. Given ε, β > 0, let

    L0 = (4 + 2β) / (ε²/2 − ε³/3) · log n    (12)

For integer L ≥ L0, let R be a d × L random matrix with R(i, j) = r_{i,j}, where {r_{i,j}} are independent random variables
from either one of the following two probability distributions:

    r_{i,j} = { +1 with probability 1/2
                −1 with probability 1/2    (13)

    r_{i,j} = { +1 with probability 1/6
                 0 with probability 2/3    (14)
                −1 with probability 1/6

Let

    E = (1/√L) A R    (15)

Let f : R^d → R^L map the ith row of A to the ith row of E. With probability at least 1 − n^(−β), for all u, v ∈ Q

    (1 − ε) ‖u − v‖² ≤ ‖f(u) − f(v)‖² ≤ (1 + ε) ‖u − v‖²    (16)

where the parameter ε controls the accuracy of distance preservation and β controls the probability of success.

From the JL lemma and Theorem 1, it can be seen that the reduced dimension L does not depend on the dimension d of the original space, but mainly on the number of points n and the distortion parameter ε. This means random projection is extremely suitable for applications with high dimension and limited samples. In addition, the projection matrix is nonadaptive and independent of the underlying data used for model building. This is different from PCA, where a complex eigendecomposition of the high-dimensional X^T X needs to be solved. It is worth noting that the reduced dimension L may be larger than d if the number of samples n is large enough or ε is very small.

IV. RANDOM PROJECTION BASED kNN RULE

In this section, we propose a new fault detection method based on random projection and the k nearest neighbor rule, which combines the advantages of random projection in distance preservation and of the kNN rule in fault detection.

A. Random Projection-Based kNN (RPkNN)

The details of the RPkNN algorithm are as follows:
• Dimension reduction by random projection
  – Construct the projection matrix R according to either (13) or (14).
  – Project X onto the random subspace (RS): T_RP = XR.
• Model building based on T_RP
  – Find the k nearest neighbors of each sample in T_RP using the Euclidean distance.
  – Calculate the average squared Euclidean distance between each sample and its k nearest neighbors:

        D̄_i² = (1/k) Σ_{j=1}^{k} d̄_{i,j}²    (17)

    where d̄_{i,j}² denotes the squared distance between sample i and its jth nearest neighbor.
  – Determine the threshold D̄_α² (control limit). The threshold is chosen as the (1 − α)-empirical quantile of the D̄_i² as

        D̄_α² = D̄²_([n(1−α)])    (18)

    where D̄²_(i), i = 1, . . . , n is the rearrangement of D̄_i², i = 1, . . . , n in ascending order, and [n(1 − α)] represents the integer part of n(1 − α).
• Fault detection
  For a new sample y:
  – Project onto the subspace: t_y = R^T y.
  – Calculate t_y's kNN distance D̄_{t_y}² using (17).
  – If D̄_{t_y}² > D̄_α², then y is classified as a faulty sample; otherwise, y is a normal sample.

Note that:
1) For a given set of samples, the dimension of the random subspace depends on the distortion parameter ε. If a small value of ε is chosen, the degree of distortion is low and the performance of kNN is well preserved. However, small ε (large L) results in less effective dimension reduction. In contrast, a large value of ε results in a higher degree of distortion, which further affects the performance of kNN, though the dimension can be reduced extensively in this case. It is also worth noting that ε relates to the worst case, meaning that the largest distortion of any pairwise distance is no larger than ε. That is to say, in order to preserve all pairwise distances, a small value of ε may be dictated by a few outliers that deviate from the other samples. At least the distortion can be controlled by random projection, which is superior to PCA.
2) The nonadaptive design of the random projection matrix R may be useful for the problem of model migration, where there are not enough data for learning a reliable projection model.

B. Choice of Reduced Dimension L

In this subsection, we discuss how to select an appropriate L. For the purpose of fault detection, the optimal L is the one that reduces the computational complexity as much as possible while simultaneously guaranteeing the performance of fault detection.

It is indicated in (12) that the parameter ε determines the degree to which pairwise distances are retained and further affects the chosen neighbors of the online test sample. This means the chosen neighbors of a test sample in the original space will not be identical to those in the random subspace if an inappropriate ε is selected. Hence, a faulty sample detected in the original space by kNN may be recognized as a normal sample in the subspace, and vice versa. Therefore, we want to determine ε under the criterion of maintaining the same neighbors of the test sample in both the original space and the random subspace; L can then be determined according to (12).

Assume all the pairwise distances in the training set are denoted as

        ⎡ d_{1,(1)} · · · d_{1,(k)}  d_{1,(k+1)} · · · d_{1,(n−1)} ⎤
        ⎢     ⋮               ⋮           ⋮                ⋮       ⎥
    D = ⎢ d_{i,(1)} · · · d_{i,(k)}  d_{i,(k+1)} · · · d_{i,(n−1)} ⎥    (19)
        ⎢     ⋮               ⋮           ⋮                ⋮       ⎥
        ⎣ d_{n,(1)} · · · d_{n,(k)}  d_{n,(k+1)} · · · d_{n,(n−1)} ⎦
TABLE I
INDUCED FAULTS

TABLE II
PROCESS VARIABLES USED FOR MONITORING

where d_{i,(j)}, j = 1, . . . , n − 1 represents the distance between the ith sample and its jth nearest neighbor in the training set, excluding the ith sample itself. We denote by N_index^i the index set of the k nearest neighbors of the ith sample.

Furthermore, the distance matrix and the index set in the projection subspace are denoted as D̄ = [d̄_{i,(j)}] and N̄_index^i, respectively. The following theorem gives the condition on ε under which the chosen neighbors of each sample are identical in the original space and the random subspace, namely N_index^i = N̄_index^i.

Theorem 2: Let Q be an arbitrary set of n points in R^d, and let the distance matrix D contain the Euclidean distances between any two samples in Q. Let N_index^i represent the index set of the k nearest neighbors of the ith sample. For a random matrix R ∈ R^(d×L) generated according to (13) or (14), project these n points onto the random subspace using R. Let D̄ = [d̄_{i,(j)}] and N̄_index^i denote the distance matrix and the index set of each sample in the projection subspace, respectively. If ε satisfies

    ε ≤ ( min_{i∈{1,...,n}} d_{i,(k+1)}/d_{i,(k)} − 1 ) / ( min_{i∈{1,...,n}} d_{i,(k+1)}/d_{i,(k)} + 1 )    (20)

then

    N̄_index^i = N_index^i,  i = 1, . . . , n    (21)

The proof can be seen in the Appendix.

Remark 1: Once ε is calculated according to the result of Theorem 2, L can be further determined. It is worth noting that we should calculate L using n + 1 in (12), in order to account for the effect of the random matrix on the incoming test sample.

Remark 2: The bound on ε in Theorem 2 is slightly relaxed: it only guarantees that the sets of neighbors of each sample are the same in the original space and the subspace, not the permutation (ordering) of the k nearest neighbors.

Remark 3: The JL lemma is based on the worst case, which guarantees that the distance distortion of any two samples is less than or equal to ε. So the bound on ε in Theorem 2 is also derived from the worst case. This means the value of ε may be very small due to extreme cases, which leads to trivial results (i.e., the dimension of the random subspace may be larger than that of the original space, L > d). However, the performance of the kNN rule in the random subspace may not degrade much even if ε violates this bound.

V. INDUSTRY EXAMPLE

In this section, a benchmark industrial data set is used to demonstrate the performance of the proposed fault detection method. Three relevant methods are compared in this experiment.

A. Data Description

The data are collected from an Al stack etch process performed on a commercially available Lam 9600 plasma etch tool [6], [35]. The goal of this process is to etch the TiN/Al-0.5% Cu/TiN/oxide stack with an inductively coupled BCl3/Cl2 plasma. The data consist of 108 normal wafers and 21 faulty wafers, which were intentionally induced by changing the values of the variables shown in Table I. These data were obtained from three experiments (in Feb., Mar., and Apr. 1996, respectively). Due to a large amount of missing data in two batches (one each in the normal and fault sets), only 107 normal wafers and 20 faulty wafers are used in this case study. The standard etch process consists of six steps; similar to [1], only the samples from steps 4 and 5 are used, and 17 non-setpoint process variables³ (see Table II) are used for fault detection in the experiments.

B. Data Preprocessing

In order to maximize the level of automation in fault detection for industrial applications, relevant fault detection methods with minimum data preprocessing are compared in the experiments [1], [3]. First, equal-length batch records are obtained by removing the initial five sample points, so that the effect of initial fluctuations is eliminated, and keeping 85 sample points in all batches in order to accommodate the shorter batches. Then the equal-length batch array is unfolded, and the obtained two-dimensional data matrix is scaled to zero mean and unit variance for each variable. The same preprocessing is applied to all training, validation and test data.

C. Results of Fault Detection

In order to reduce the effect of randomness on the results, the experiments are repeated 100 times. In each experiment, the 107 normal wafers are further randomly separated into two parts: 97 wafers for training and 10 for validation. This means a different training set is used to build the model in each experiment. The parameter settings are as follows: the number of neighbors is k = 3 and the confidence level is set

³Two variables (bias RF reflected power and TCP reflected power) remain almost zero during the batch.
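The bound (20) of Theorem 2 can be evaluated directly from the ordered training distances. The sketch below is our own illustrative code, not the authors' implementation; it returns the largest ε for which every sample keeps the same k-nearest-neighbor set:

```python
import numpy as np

def eps_bound(X, k):
    """Largest eps satisfying Eq. (20): with r = min_i d_{i,(k+1)} / d_{i,(k)}
    over all samples i, the bound is eps <= (r - 1) / (r + 1)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)        # a sample is not its own neighbor
    d = np.sqrt(np.sort(d2, axis=1))    # d[i, j-1] = d_{i,(j)}, ascending
    r = (d[:, k] / d[:, k - 1]).min()   # min over i of d_{i,(k+1)} / d_{i,(k)}
    return (r - 1.0) / (r + 1.0)
```

For the four one-dimensional points 0, 1, 3, 7 with k = 1, the minimum ratio d_{i,(2)}/d_{i,(1)} is 1.5, so the bound evaluates to ε ≤ 0.2; a tie between the kth and (k+1)th neighbor drives the bound to 0, which is exactly the worst-case behavior Remark 3 warns about.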
TABLE III
FAULT DETECTION BY THREE METHODS ON VALIDATION SET

TABLE IV
FAULT DETECTION RATES OF THREE METHODS
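Tables III and IV compare the three methods through their fault detection and false alarm rates. As a hedged sketch (the function and variable names are ours, not the authors'), these two rates follow directly from per-wafer decisions:

```python
def detection_rates(is_fault_true, flagged):
    """Return (fault detection rate, false alarm rate) from parallel
    boolean sequences: ground truth and detector decision per wafer."""
    tp = sum(t and f for t, f in zip(is_fault_true, flagged))      # detected faults
    fp = sum(f and not t for t, f in zip(is_fault_true, flagged))  # false alarms
    n_fault = sum(is_fault_true)
    n_normal = len(is_fault_true) - n_fault
    return tp / n_fault, fp / n_normal
```

For example, with two faulty and two normal wafers where one fault is caught and one normal wafer is flagged, both rates are 0.5.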
TABLE V
COMPARISON OF COMPUTATION SPEED

of the idea of contribution plots. And the estimation of fault magnitude is useful for fault-tolerant control. The distances from the normal samples are, to a certain extent, related to the fault magnitude.
• To apply RPkNN to other batch or continuous processes. The proposed method is not limited to semiconductor processes; it can also be applied to other high-dimensional batch processes and continuous processes.
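The complete RPkNN scheme of Section IV contains nothing specific to etch data, so reusing it on another high-dimensional process amounts to fitting the same detector on a different matrix. The sketch below is our own illustration (the class and helper names are not from the paper); the 1/√L scaling of (15) is omitted because the threshold and the test statistic are computed in the same projected space, so the factor cancels in the comparison:

```python
import numpy as np

def achlioptas_matrix(d, L, rng):
    """Sparse projection matrix of Eq. (14): +1 w.p. 1/6, 0 w.p. 2/3, -1 w.p. 1/6."""
    return rng.choice([1.0, 0.0, -1.0], size=(d, L), p=[1 / 6, 2 / 3, 1 / 6])

def jl_dimension(n, eps, beta=1.0):
    """Lower bound L0 of Eq. (12) on the reduced dimension."""
    return int(np.ceil((4 + 2 * beta) / (eps ** 2 / 2 - eps ** 3 / 3) * np.log(n)))

class RPkNN:
    """Random-projection kNN fault detector (sketch of Section IV-A)."""

    def __init__(self, k=3, alpha=0.05, eps=0.3, beta=1.0, seed=0):
        self.k, self.alpha, self.eps, self.beta = k, alpha, eps, beta
        self.rng = np.random.default_rng(seed)

    def fit(self, X):
        n, d = X.shape
        # Remark 1: use n + 1 points so the bound also covers the test sample;
        # cap at d, since L > d gives no reduction (Remark 3).
        L = min(jl_dimension(n + 1, self.eps, self.beta), d)
        self.R = achlioptas_matrix(d, L, self.rng)
        self.T = X @ self.R                                # T_RP = X R
        d2 = ((self.T[:, None, :] - self.T[None, :, :]) ** 2).sum(axis=2)
        np.fill_diagonal(d2, np.inf)                       # exclude self-distance
        d2.sort(axis=1)
        knn = d2[:, : self.k].mean(axis=1)                 # Eq. (17) per sample
        self.limit = np.quantile(knn, 1.0 - self.alpha)    # Eq. (18)
        return self

    def is_faulty(self, y):
        d2 = np.sort(((self.T - y @ self.R) ** 2).sum(axis=1))
        return d2[: self.k].mean() > self.limit
```

On other high-dimensional batch or continuous process data the same detector applies unchanged: only the normal-operation matrix passed to fit differs.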
[5] B. M. Wise, N. B. Gallagher, and E. B. Martin, "Application of PARAFAC2 to fault detection and diagnosis in semiconductor etch," J. Chemometr., vol. 15, no. 4, pp. 285–298, 2001.
[6] B. M. Wise, N. B. Gallagher, S. W. Butler, D. D. White, and G. G. Barna, "A comparison of principal component analysis, multiway principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process," J. Chemometr., vol. 13, nos. 3–4, pp. 379–396, 1999.
[7] G. A. Cherry and S. J. Qin, "Multiblock principal component analysis based on a combined index for semiconductor fault detection and diagnosis," IEEE Trans. Semicond. Manuf., vol. 19, no. 2, pp. 159–172, May 2006.
[8] Z. Q. Ge and Z. H. Song, "Semiconductor manufacturing process monitoring based on adaptive substatistical PCA," IEEE Trans. Semicond. Manuf., vol. 23, no. 1, pp. 99–108, Feb. 2010.
[9] Q. P. He and J. Wang, "Statistics pattern analysis: A new process monitoring framework and its application to semiconductor batch processes," AIChE J., vol. 57, no. 1, pp. 107–121, 2011.
[10] S. W. Choi, C. Lee, J.-M. Lee, J. H. Park, and I.-B. Lee, "Fault detection and identification of nonlinear processes based on kernel PCA," Chemometr. Intell. Lab. Syst., vol. 75, no. 1, pp. 55–67, 2005.
[11] Z. Q. Ge, C. J. Yang, and Z. H. Song, "Improved kernel PCA-based monitoring approach for nonlinear processes," Chem. Eng. Sci., vol. 64, no. 9, pp. 2245–2255, 2009.
[12] Z. Q. Ge and Z. H. Song, "Mixture Bayesian regularization method of PPCA for multimode process monitoring," AIChE J., vol. 56, no. 11, pp. 2838–2849, 2010.
[13] C. H. Zhao, Y. Yao, F. R. Gao, and F. L. Wang, "Statistical analysis and online monitoring for multimode processes with between-mode transitions," Chem. Eng. Sci., vol. 65, no. 22, pp. 5961–5975, 2010.
[14] X. Q. Liu, L. Xie, U. Kruger, T. Littler, and S. Q. Wang, "Statistical-based monitoring of multivariate non-Gaussian systems," AIChE J., vol. 54, no. 9, pp. 2379–2391, 2008.
[15] S. X. Ding, Model-Based Fault Diagnosis Techniques: Design Schemes, Algorithms, and Tools, 2nd ed. Berlin, Germany: Springer, 2013.
[16] D. Fradkin and D. Madigan, "Experiments with random projections for machine learning," in Proc. 9th ACM SIGKDD Int. Conf. Knowl. Disc. Data Min., Washington, DC, USA, Aug. 2003, pp. 517–522.
[17] Q. F. Shi, C. H. Shen, R. Hill, and A. van den Hengel, "Is margin preserved after random projection?" in Proc. 29th Int. Conf. Mach. Learn., Edinburgh, U.K., 2012, pp. 591–598.
[18] C. Boutsidis, A. Zouzias, and P. Drineas, "Random projections for κ-means clustering," in Proc. Adv. Neural Inf. Process. Syst., Vancouver, BC, Canada, 2010, pp. 298–306.
[19] E. Bingham and H. Mannila, "Random projection in dimensionality reduction: Applications to image and text data," in Proc. 7th ACM SIGKDD Int. Conf. Knowl. Disc. Data Min., San Francisco, CA, USA, Aug. 2001, pp. 245–250.
[20] A. Eftekhari, M. Babaie-Zadeh, and H. A. Moghaddam, "Two-dimensional random projection," Signal Process., vol. 91, no. 7, pp. 1589–1603, 2011.
[21] E. Candes and J. Romberg, "Practical signal recovery from random projections," in Proc. SPIE Int. Symp. Electron. Imag. Comput. Imag. III, pp. 76–86, 2005.
[22] E. Candes and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies?" IEEE Trans. Inf. Theory, vol. 52, no. 12, pp. 5406–5425, Dec. 2006.
[23] J. E. Jackson and G. S. Mudholkar, "Control procedures for residuals associated with principal component analysis," Technometrics, vol. 21, no. 3, pp. 341–349, 1979.
[24] S. J. Qin, "Statistical process monitoring: Basics and beyond," J. Chemometr., vol. 17, nos. 8–9, pp. 480–502, 2003.
[25] T. Denoeux, "A k-nearest neighbor classification rule based on Dempster–Shafer theory," IEEE Trans. Syst., Man, Cybern., vol. 25, no. 5, pp. 804–813, May 1995.
[26] J. M. Keller, M. R. Gray, and J. A. Givens, "A fuzzy k-nearest neighbor algorithm," IEEE Trans. Syst., Man, Cybern., vol. SMC-15, no. 4, pp. 580–585, Jul. 1985.
[27] H. B. Shen and K. C. Chou, "Using optimized evidence-theoretic k-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types," Biochem. Biophys. Res. Commun., vol. 334, no. 1, pp. 288–292, 2005.
[28] Y. C. Lee, "Handwritten digit recognition using k nearest-neighbor, radial-basis function, and backpropagation neural networks," Neural Comput., vol. 3, no. 3, pp. 440–449, 1991.
[29] C. Schmidt et al., "Fault detection and classification (FDC) for a via etching process," in Proc. 5th Eur. AEC/APC Conf., Dresden, Germany, Apr. 2004.
[30] W. B. Johnson and J. Lindenstrauss, "Extensions of Lipschitz mappings into a Hilbert space," in Proc. Conf. Mod. Anal. Probab., vol. 26. New Haven, CT, USA, 1984, pp. 189–206.
[31] D. Sivakumar, "Algorithmic derandomization via complexity theory," in Proc. 34th Annu. ACM Symp. Theory Comput., Montreal, QC, Canada, May 2002, pp. 619–626.
[32] P. Frankl and H. Maehara, "The Johnson-Lindenstrauss Lemma and the sphericity of some graphs," J. Comb. Theory B, vol. 44, no. 3, pp. 355–362, 1988.
[33] S. Dasgupta and A. Gupta, "An elementary proof of a theorem of Johnson and Lindenstrauss," Random Struct. Algorithms, vol. 22, no. 1, pp. 60–65, 2003.
[34] D. Achlioptas, "Database-friendly random projections," in Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. Prin. Database Syst., Montreal, QC, Canada, May 2001, pp. 274–281.
[35] B. M. Wise. (1999). Metal Etch Data for Fault Detection Evaluation. [Online]. Available: http://software.eigenvector.com/Data/Etch/index.html

Zhe Zhou received the B.E. and M.S. degrees from the School of Automation, Hangzhou Dianzi University, Hangzhou, China, in 2009 and 2012, respectively. He is currently pursuing the Ph.D. degree with the Department of Control Science and Engineering, Zhejiang University, Hangzhou. His current research interests are data-driven fault diagnosis and its applications in industry.

Chenglin Wen (M'10) received the bachelor's and master's degrees in mathematics from Henan University, Kaifeng, China, and Zhengzhou University, Zhengzhou, China, and the Ph.D. degree from Northwestern Polytechnical University, Xi'an, China, in 1986, 1996, and 1999, respectively. He is currently a Professor and the Chair with the Institute of Systems Science and Control Engineering, School of Automation, Hangzhou Dianzi University, Hangzhou, China. His current research interests include multisensor networked information fusion theory, multitarget tracking, fault diagnosis of complex systems and devices, reliability assessment and health control, and recognition and tracking of hypersonic vehicles. He is currently a Committee Member of the Intelligent Automation Committee and the Process Fault Diagnosis and Security Committee of the Chinese Association of Automation.

Chunjie Yang received the Ph.D. degree in control theory and engineering from Zhejiang University, Hangzhou, China, in 1998. He is currently a Professor with the Department of Control Science and Engineering, Zhejiang University. His research interests include system modeling, control, and fault diagnosis of industrial processes, soft sensor technology, and implementation for complex industrial systems.