238 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 25, NO. 2, MAY 2012

Dynamic-Moving-Window Scheme for Virtual-Metrology Model Refreshing
Wei-Ming Wu, Student Member, IEEE, Fan-Tien Cheng, Fellow, IEEE, and Fan-Wei Kong
Abstract: Virtual metrology (VM) is a method to conjecture the manufacturing quality of a process tool based on data sensed from the process tool, without physical metrology operations. Historical data are used to produce the initial VM models, and these models are then applied in a process drift or shift environment. The accuracy of VM highly depends on the modeling samples adopted during the initial-creating and online-refreshing periods. Since large resources are required, design-of-experiments may not be performed. In that case, how can the stability of the models and predictions be guaranteed as they move into the unknown environment? Conventionally, static-moving-window (SMW) schemes with a fixed window size are adopted in the online-refreshing period. The purpose of this paper is to propose a dynamic-moving-window (DMW) scheme for VM model refreshing to enhance prediction accuracy. The DMW scheme adds a new sample into the model and applies a clustering technology to do similarity clustering. Next, the number of elements in each cluster is checked. If the largest number of elements is greater than the predefined threshold, then the oldest sample in the cluster with the largest population is deleted. Both the adaptive-resonance-theory-2 and the newly proposed weighted-Euclidean-distance methods are applied to do similarity clustering.

Index Terms: Dynamic-moving-window (DMW) scheme, model refreshing, static-moving-window (SMW) scheme, virtual metrology (VM), weighted-Euclidean-distance (WED) method.

Manuscript received October 1, 2011; accepted December 4, 2011. Date of publication January 9, 2012; date of current version May 4, 2012. This work was supported in part by the National Science Council of Taiwan, under Contracts NSC100-2221-E-006-002 and NSC100-2622-E-006-011-CC2, and in part by the Ministry of Education, Taiwan, under Project AIM-HI. There are currently patents pending for the work presented in this paper in Taiwan, the U.S., China, Japan, and Korea with application no. 100147447.
The authors are with the Institute of Manufacturing Information and Systems, National Cheng Kung University, Tainan 70101, Taiwan (e-mail: min@super.ime.ncku.edu.tw; chengft@mail.ncku.edu.tw; fwkung@fs-technology.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSM.2012.2183398
0894-6507/$31.00 © 2012 IEEE

I. Introduction

Recently, a promising technology, denoted virtual metrology (VM), has bloomed [1], [2]. The International SEMATECH Manufacturing Initiative added VM into its next-generation factory realization roadmap of the semiconductor industry [3]. The International Technology Roadmap for Semiconductors also designated VM as one of the focus areas on factory information and control systems and advanced process control (APC) [4], [5]. VM can convert sampling and inspection with metrology delay into real-time and online total inspection. The authors have developed the so-called automatic virtual metrology system (AVM) to implement and deploy the VM operations automatically [6], [7].

Historical data are used to produce the initial VM models, and these models are then applied in a process drift or shift environment. To create a VM model, n sets of historical data are collected, including process data (Xa, a = 1, 2, ..., n) from a production tool and the corresponding metrology data (ya, a = 1, 2, ..., n) from a metrology tool. Each set of process data contains individual parameters (from parameter 1 to parameter p). The linking between the process and the metrology data of each set should be assured before considering this set as a valid modeling set. Furthermore, if the historical data sets possess correlation across data streams and across time (i.e., autocorrelation), then the historical data of a typical entire correlation period should be collected for modeling so as to improve the modeling and conjecturing accuracy [8].

Nevertheless, during the initial model-creating phase, it is impossible to collect all the possible drift or shift samples simply from historical data; it is also impossible to generate all the probable drift or shift samples by performing design-of-experiments (DOEs) because this requires a large amount of resources. Since DOE may not be performed exhaustively, how can the stability of the models and predictions be guaranteed while they move into the unknown environment? Furthermore, during the model-refreshing phase, how can those important drift or shift samples collected online be added into the set of model-refreshing samples automatically?

Few researchers have addressed the issues concerning model creating or refreshing. Among them, Cheng et al. [9] proposed the dual-phase VM scheme to solve the online model-refreshing problem. Once the VM model is built and the dual-phase VM algorithm [9] is running, the drift or shift samples encountered during manufacturing processes can be added to the model automatically, in order to enhance the prediction accuracy of the future VM values. However, the issue of the size of modeling samples is not considered in [9].

Khan et al. [2], [10] applied a static-moving-window (SMW) scheme for partial-least-squares (PLS) modeling with the regression coefficient matrix being recursively updated when actual measurement of output is available. The size of this SMW is found by reaching a tradeoff between the speed of adaptation of the PLS model (requiring smaller size) and avoiding any response to noisy data (requiring larger size). In one approach for the moving window, the number (n) of data points in X and y remains fixed while old data is replaced by
new ones. Alternatively, some of the DOE data can be retained while new values of X and y replace the older ones; this second approach seeks to preserve the richness of the DOE data in the reformulation [2], [10].

Lynn et al. [11] proposed another SMW modeling scheme, the so-called weighted-PLS scheme, to virtually measure etch rates in an industrial plasma etch process. The weights applied to each sample are determined depending on their relevance to the sample at the front of the window. The sample weights vary in accordance with the tool maintenance history to satisfy two assumptions. First, due to process drift, it is assumed that samples closer to the front of the window are more relevant to prediction. Second, it is assumed that samples contained within the same predictive maintenance (PM) cycle as the target sample are more relevant than samples from previous PM cycles. In order to fulfill these assumptions, samples are first assigned a linearly decreasing weight across the window length, with the most recent sample given a weight of 1 and the oldest samples a weight of 0. Next, the weights of the samples are adjusted according to the number of PM cycles spanned by the window length. The samples contained in the most recent PM cycle are incremented first, and older PM cycles are incremented by progressively smaller amounts.

The window size of all the SMW schemes mentioned above is fixed, with old samples being replaced by new ones. As such, those important and rare drift or shift samples collected both in the initial-creating and online-refreshing periods will be deleted sooner or later. This problem should be studied and resolved. Thus, a new scheme that keeps the important and rare samples and deletes the redundant and common ones should be developed.

The authors presented a preliminary study of a dynamic-moving-window (DMW) scheme for VM model refreshing in [12]. The DMW scheme adds a new sample into the model and applies a clustering technology to do similarity clustering. Next, the number of elements in each cluster is checked. If the largest number of elements is greater than the predefined threshold, then the oldest sample in the group with the largest population is deleted. The predefined threshold is set to be three, which will be explained later. In the preliminary study, the adaptive-resonance-theory-2 (ART2) was applied for similarity clustering. However, after performance evaluation, it was found that the number of clusters of the ART2-based DMW is rather high. The number of clusters should not be close to the window size of model creation, otherwise the number of elements in each cluster will be relatively small. Further, the execution time of ART2 is comparatively high as well. In this paper, not only the ART2 but also the newly proposed weighted-Euclidean-distance (WED) method is applied to do similarity clustering. The performance (including the number of clusters and execution times) of the ART2-based DMW and WED-based DMW is also compared in this paper.

The photo processes of a fifth-generation thin-film transistor liquid crystal display (TFT-LCD) factory are adopted in this paper to test and compare the conjecture accuracy among the SMW, ART2-based DMW, and WED-based DMW schemes. Testing results show that the DMW schemes have better online conjecture accuracy than the conventional SMW scheme.

The remainder of this paper is organized as follows. Section II details the DMW schemes by applying ART2 [13], [14] and WED. Section III then presents and compares the experimental results among the SMW, ART2-based DMW, and WED-based DMW schemes. The implications of experimental results are also discussed there. Finally, a summary and conclusions are made in Section IV.

II. DMW Scheme

The conjecture accuracy of VM should be equally good for the entire possible range of the corresponding actual metrology. To achieve this goal, the modeling samples should be scattered over the entire range as equally as possible. Therefore, the rule of thumb of designing the DMW scheme is to keep the samples from sparsely populated regions and delete the samples from abundantly populated regions within the entire range. The flowchart of the DMW scheme is shown in Fig. 1 and is explained as follows.
Step 1) Collect a new sample for modeling.
Step 2) Add the new sample into the modeling samples.
Step 3) Perform similarity clustering.
Step 4) Search for the cluster that has the largest number of samples.
Step 5) Check whether there are two or more clusters with the same largest number of samples; if yes, then go to Step 6, otherwise, jump to Step 7.
Step 6) Find the cluster (of those two or more) that has the oldest sample.
Step 7) Check whether there are fewer than three samples in the cluster that has the largest number of samples; if yes, then go to Step 8, otherwise, jump to Step 9.
Step 8) Save this new sample for modeling without purging any historical sample, and then stop.
Step 9) Save this new sample for modeling and purge the oldest sample in the cluster found above, and then stop.

Intuitively, the historical actual metrology data (y) may be used as the element to do similarity clustering. However, the clustering discrimination will not be good enough if y alone is applied as the element for clustering, because several different process data sets (Xs) may produce the same metrology datum (y). To enhance the clustering discrimination, instead of y, the process data set (X) should be adopted as the element to do similarity clustering.

Two clustering methods, ART2 [13], [14] and the proposed WED scheme, are adopted in this paper. Both methods utilize X as the element to do clustering. They are presented in the following two sections, respectively.

A. Applying ART2 for Similarity Clustering

ART2 is a kind of unsupervised neural network (NN). It has the features of both stability and flexibility and is able to rapidly learn the new characteristics of clusters without predetermination of the number of clusters. However, the quality of ART2 clustering crucially depends on the setting of the vigilance parameter, ρ, of the resonance layer [13].
The optimal value of ρ is usually obtained by the trial-and-error method and must be fine-tuned by the user according to different data sets.

The authors have applied ART2 clustering to create the metrology data-quality-index (DQIy) model [16] by the proposed auto-ρ procedure that automatically searches for the most suitable ρ value. The most suitable ρ is obtained by finding the shortest distance between each pair of two clusters under the condition of minimizing intracluster and maximizing intercluster variations [16]. The ART2 clustering with auto-ρ method is also adopted in this paper to perform Step 3 of the DMW scheme.

Fig. 1. DMW scheme.

B. Applying WED for Similarity Clustering

The WED method performed in Step 3 of the DMW scheme consists of the following three steps.
Step 1) Calculate the weighted correlation coefficient (Wj) between the jth process parameter, j = 1, 2, ..., p, and the real metrology.
Step 2) Calculate the WED (WEDi) [15] for each modeling sample, i = 1, 2, ..., n.
Step 3) Put all of the WEDi of the modeling samples in a histogram and calculate the desired number (C) of clusters by Sturges' rule [18]; finally, divide the histogram into C clusters.
The details of Steps 1–3 are given below.

Step 1: Calculate the weighted correlation coefficient (Wj)

\[
W_j = \frac{r(X_j, y)}{\sum_{k=1}^{p} r(X_k, y)}, \quad j = 1, 2, \ldots, p \tag{1}
\]

where r(Xj, y) is the correlation coefficient [17] between the jth process parameter, Xj, and the real metrology values, y, with \(X_j = [x_{1,j}, x_{2,j}, \ldots, x_{n,j}]^T\) being the jth set of individual parameters containing n modeling process data, and \(y = [y_1, y_2, \ldots, y_n]^T\) the corresponding set of modeling actual measurement values. As a result, the weighted-correlation-coefficient matrix (W) can be expressed as

\[
W =
\begin{bmatrix}
W_1 & 0 & \cdots & 0 \\
0 & W_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & W_p
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{r(X_1, y)}{\sum_{k=1}^{p} r(X_k, y)} & 0 & \cdots & 0 \\
0 & \dfrac{r(X_2, y)}{\sum_{k=1}^{p} r(X_k, y)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \dfrac{r(X_p, y)}{\sum_{k=1}^{p} r(X_k, y)}
\end{bmatrix}
\tag{2}
\]

where x_{i,j} is the jth process parameter in the ith set of process data, y_i is the ith actual measurement value, i represents the ith sample, i = 1, 2, ..., n, and j represents the jth parameter, j = 1, 2, ..., p.

Step 2: Calculate the weighted Euclidean distance (WEDi). Before the WEDi is constructed, the process data must be standardized. The equations for standardizing the process data are

\[
Z_{x_{i,j}} = \frac{x_{i,j} - \bar{x}_j}{\sigma_{x_j}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, p \tag{3}
\]
\[
\bar{x}_j = \frac{1}{n}\left(x_{1,j} + x_{2,j} + \cdots + x_{n,j}\right) \tag{4}
\]

\[
\sigma_{x_j} = \sqrt{\frac{1}{n-1}\left[\left(x_{1,j} - \bar{x}_j\right)^2 + \left(x_{2,j} - \bar{x}_j\right)^2 + \cdots + \left(x_{n,j} - \bar{x}_j\right)^2\right]} \tag{5}
\]

where Z_{x_{i,j}} is the standardized jth process parameter in the ith set of process data, \(\bar{x}_j\) is the mean of the jth process data, and \(\sigma_{x_j}\) is the standard deviation of the jth process data.

Let \(Z_i = [Z_{x_{i,1}}, Z_{x_{i,2}}, \ldots, Z_{x_{i,p}}]^T\) be the vector of the ith set of standardized process data. The model set of the process parameters is defined as \(X_M = [x_{M,1}, x_{M,2}, \ldots, x_{M,p}]^T\), where x_{M,j} equals \(\bar{x}_j\), j = 1, 2, ..., p, so that each element in the model set after standardization (also denoted as the standardized model parameter, Z_{M,j}) has a value of 0. Restated, all of the elements in \(Z_M = [Z_{M,1}, Z_{M,2}, \ldots, Z_{M,p}]^T\) are 0. Thereafter, the WEDi is expressed as

\[
\mathrm{WED}_i = (Z_i - Z_M)\, W\, I\, W^T (Z_i - Z_M)^T \tag{6}
\]

where I is the identity matrix. Because all of the elements in Z_M are 0, the WEDi can be simplified as

\[
\mathrm{WED}_i = Z_i\, W\, I\, W^T Z_i^T. \tag{7}
\]

Step 3: Perform clustering on WEDi by Sturges' rule. After all of the WEDi, i = 1, 2, ..., n, are obtained, they are first plotted in a histogram. Then, Sturges' rule [18] is applied to calculate the desired number (C) of clusters. Sturges' rule is

\[
\text{Group number} = 1 + \log_2(n) \tag{8}
\]

where n is the number of samples for modeling. The nearest integer to the group number in (8) is adopted as the C value. Finally, those WEDi, i = 1, 2, ..., n, are equally divided into C clusters with

\[
\text{Width of a cluster} = \left[\max(\mathrm{WED}_i) - \min(\mathrm{WED}_i)\right] / C. \tag{9}
\]

III. Illustrative Examples

Two examples are chosen to be tested and compared. All the experimental data are collected from a photo tool. This tool is practically operating in a fifth-generation TFT-LCD factory in Taiwan. In Examples 1 and 2, 14.1-in product glass is divided into two shots for photo processing. Each shot has eight measurement positions, as depicted in Fig. 2. Shot 2 is used in both examples.

Fig. 2. Measurement positions of photo equipment (14.1-in product glass).

According to the physical properties of the photo equipment, 21 key process parameters are chosen as inputs of the conjecture model. The process data were extracted from time-series-trace data by taking averages over a window of time. The conjecture accuracy calculated from the testing data is quantified by the mean absolute percentage error (MAPE) [8], [9]. Its formula is represented as follows:

\[
\mathrm{MAPE} = \frac{\sum_{i=1}^{q} \left| (\hat{y}_i - y_i)/y \right|}{q} \times 100\% \tag{10}
\]

where \(\hat{y}_i\) is the VM conjecture value, y_i is the actual metrology value, y is the target value, and q is the conjecture sample size. The closer the MAPE value is to zero, the better the conjecture accuracy of the model would be.

TABLE I
VM Conjecture Accuracy of Various Moving Window Schemes for Example 1 (All Positions)
     | NN MAPE (%)            | NN Max Error (%)       | MR MAPE (%)            | MR Max Error (%)
Pos. |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED
1    |  1.68     0.23    0.31 |  5.91     0.41    0.70 |  2.06     0.46    0.43 |  4.25     0.97    0.98
2    |  1.08     0.33    0.37 |  5.29     1.34    1.65 |  0.92     0.29    0.32 |  2.85     0.61    0.68
3    |  2.18     0.34    0.38 |  7.42     1.36    1.51 |  1.39     0.30    0.32 |  3.90     0.62    0.64
4    |  1.24     0.41    0.32 |  3.32     1.15    0.76 |  1.09     0.40    0.40 |  2.50     1.04    1.04
5    |  0.84     0.30    0.43 |  3.11     0.72    1.04 |  2.27     0.46    0.46 |  4.86     0.93    1.42
6    |  1.62     0.38    0.43 |  3.26     1.14    0.98 |  3.92     0.44    0.49 |  9.35     1.28    1.21
7    |  0.74     0.52    0.30 |  2.09     0.97    0.50 |  2.89     0.34    0.41 |  7.60     0.76    0.80
8    |  1.09     0.44    0.35 |  2.69     1.13    0.88 |  6.32     0.52    0.47 | 15.12     0.92    0.98
Mean |  1.31     0.37    0.36 |  4.14     1.03    1.00 |  2.61     0.40    0.41 |  6.30     0.89    0.97

A. Conjecture Results of Example 1

The purpose of Example 1 is to demonstrate the DMW's capability of keeping the golden samples that should be kept in the model permanently. To begin with, 12 experimental sets are adopted. Among these 12 sets, Samples 3–9 are especially chosen to perform a critical-dimension (CD) spread test with the adjustment of a major parameter (ActProcess Time) on
the photo equipment. These seven spread-test samples are the so-called golden samples. Then, 63 additional historical samples are collected. As a result, 75 sets, which include 12 experimental sets in the front and 63 historical samples at the back, are adopted as modeling sets to establish the VM conjecture model. The following 19 sets are utilized to tune or retrain the VM model by the dual-phase scheme [9] one by one. Finally, the process data of the 12 experimental sets (which contain the golden samples) are used again as the samples for the free-running mode VM conjecturing test, whereas the corresponding actual metrology values (for CD) of those 12 experimental sets are adopted for evaluating the VM conjecture accuracy.

Fig. 3. Samples 95–106 VM conjecture results of various moving window schemes for Example 1. (a) Position 2. (b) Position 8.

The SMW, ART2-based DMW, and WED-based DMW schemes are applied to perform this free-running test. The eight positions of Shot 2, as shown in Fig. 2, are adopted for evaluation. Table I presents the VM conjecture accuracy of various schemes for all the measurement positions (1–8) by applying back-propagation NNs and multiregression (MR). Among those eight positions, the conjecture results of positions 2 and 8 for various schemes are depicted in Fig. 3. Observing Fig. 3, the global-similarity-index (GSI) values [19]
of the SMW scheme at Samples 97–103 are much higher than the GSI threshold (9), while those of the ART2-based DMW and WED-based DMW schemes at Samples 97–103 are both lower than the GSI threshold. In fact, Samples 97–103 are the golden Samples 3–9 mentioned above. This phenomenon indicates that those golden samples have been deleted by the SMW scheme and are still kept in the VM model by both the ART2-based DMW and WED-based DMW schemes. Observing Table I, because the golden samples are missing in the SMW case yet preserved in the DMW cases, the MAPEs and max errors of the SMW scheme are larger than those of the DMW schemes. Furthermore, Fig. 4 shows the key process data (ActProcess Time) of Example 1. It clearly indicates that after tuning or retraining 19 samples: 1) all of the golden samples are discarded by the SMW scheme; 2) golden Samples 4–9 are still kept in the VM model by the ART2-based DMW; and 3) all of the golden Samples 3–9 are still kept in the VM model by the WED-based DMW scheme.

Fig. 4. Key process data (ActProcess Time) of Example 1.

TABLE II
VM Conjecture Accuracy of Various Moving Window Schemes for Example 2 (All Positions)
     | NN MAPE (%)            | NN Max Error (%)       | MR MAPE (%)            | MR Max Error (%)
Pos. |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED |   SMW DMW-ART2 DMW-WED
1    |  0.90     0.77    0.73 |  3.38     2.28    2.15 |  0.83     0.78    0.65 |  4.18     2.46    2.12
2    |  0.35     0.37    0.33 |  2.12     1.09    1.04 |  0.38     0.38    0.41 |  2.58     1.08    1.14
3    |  0.70     0.67    0.67 |  3.32     1.92    1.70 |  0.72     0.66    0.70 |  3.90     1.85    1.76
4    |  0.59     0.55    0.56 |  2.13     1.99    1.82 |  0.66     0.60    0.62 |  2.20     1.98    1.85
5    |  0.51     0.58    0.53 |  1.45     1.59    1.66 |  0.57     0.62    0.63 |  1.42     1.81    1.44
6    |  0.59     0.57    0.56 |  1.92     1.63    1.48 |  0.64     0.60    0.60 |  2.32     1.83    1.55
7    |  0.55     0.52    0.55 |  1.56     1.68    1.52 |  0.58     0.55    0.54 |  1.82     1.38    1.50
8    |  0.92     0.87    0.75 |  2.54     2.42    2.45 |  0.90     0.87    0.79 |  3.58     2.37    2.50
Mean |  0.64     0.61    0.58 |  2.30     1.83    1.73 |  0.66     0.63    0.62 |  2.75     1.84    1.73

B. Conjecture Results of Example 2

The purpose of Example 2 is to evaluate the performance of various moving-window schemes for an ordinary running case by applying the dual-phase VM scheme [9]. In this example, 75 historical samples are collected and adopted as modeling sets to establish the VM conjecture model. The following 56 sets are utilized to test and tune or retrain the VM model by the dual-phase scheme [9] one by one. The corresponding actual
metrology values (for CD) of those 56 samples are used to evaluate the VM conjecture accuracy.

The SMW, ART2-based DMW, and WED-based DMW schemes are, again, applied to perform this dual-phase running test. The eight positions of Shot 2, as shown in Fig. 2, are adopted for evaluation. Table II presents the VM conjecture accuracy of various schemes for all the measurement positions (1–8). Among those eight positions, the conjecture results of positions 2 and 6 for various schemes are displayed in Fig. 5. Moreover, a process datum (Lamp Illumination) of Example 2 for all of the 131 samples is depicted in Fig. 6.

Fig. 5. Samples 76–131 VM conjecture results of various moving window schemes for Example 2. (a) Position 2. (b) Position 6.

Table II shows that the mean MAPEs and max errors of the SMW scheme are worse than those of the ART2-based DMW and WED-based DMW schemes, respectively. Comparing the accuracy of the ART2-based DMW with that of the WED-based DMW, the WED-based DMW is slightly better. The phenomena mentioned above are supported by the results shown in Figs. 5 and 6.

Observing Fig. 5, the GSI value of the SMW scheme at Sample 126 is much higher than the GSI threshold (9), while those of the ART2-based DMW and WED-based DMW schemes at Sample 126 are both lower than the GSI threshold.
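For reference, the WED-based clustering evaluated in these examples (Section II-B, equations (1) to (9)) can be sketched as follows. This is a hedged illustration using only the Python standard library, not the authors' MATLAB code. The function names `pearson`, `sturges_bins`, and `wed_cluster` are hypothetical. The sketch assumes that (7) is used exactly as printed (the quadratic form, with no square root), that the correlations in (1) are signed and do not sum to zero, and that no process parameter is constant over the window.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Correlation coefficient r(a, b)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def sturges_bins(n: int) -> int:
    """Eq. (8): nearest integer to 1 + log2(n)."""
    return round(1 + math.log2(n))


def wed_cluster(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[int]:
    """Assign each of the n samples (rows of X) to one of C equal-width WED bins."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    # Eq. (1): correlation of each parameter with y, normalized by the sum.
    r = [pearson(col, y) for col in cols]
    r_sum = sum(r)
    w = [rj / r_sum for rj in r]
    # Eqs. (4)-(5): per-parameter mean and standard deviation (n - 1 divisor).
    mean = [sum(col) / n for col in cols]
    std = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(cols, mean)
    ]
    # Eqs. (3) and (7): with diagonal W, Z_i W I W^T Z_i^T = sum_j (W_j * Z_ij)^2.
    wed = [
        sum((w[j] * (row[j] - mean[j]) / std[j]) ** 2 for j in range(p))
        for row in X
    ]
    # Eqs. (8)-(9): C equal-width clusters over the WED range.
    c = sturges_bins(n)
    lo, hi = min(wed), max(wed)
    width = (hi - lo) / c
    if width == 0:
        return [0] * n  # all samples share one WED value
    # The maximum WED falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), c - 1) for v in wed]


if __name__ == "__main__":
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 20]]
    y = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.0]
    print(sturges_bins(75), wed_cluster(X, y))
```

For a window of n = 75 samples, `sturges_bins` returns 7, which matches the number of WED clusters reported for the examples. A function like `wed_cluster` returns one label per sample, so it could serve as the similarity-clustering step (Step 3) of the DMW scheme once the metrology values are bound to it.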
Further, the GSI value of the ART2-based DMW is slightly higher than that of the WED-based DMW at Sample 126. As a result, the prediction error of the SMW at Sample 126 is larger than those of the ART2-based DMW and WED-based DMW. Moreover, the prediction error of the ART2-based DMW is slightly larger than that of the WED-based DMW at Sample 126.

Figure 6 shows the process datum, Lamp Illumination, of Example 2 for all of the 131 samples. The value of Lamp Illumination at Sample 126 is: 1) quite different from those of the modeling samples of the SMW scheme; 2) relatively similar to Samples 35 and 36 kept in the modeling sets of the ART2-based DMW scheme; and 3) highly similar to Sample 30 kept in the modeling sets of the WED-based DMW scheme. Consequently, the phenomena depicted in Fig. 6 support the results shown in Table II and Fig. 5.

Fig. 6. Process datum (Lamp Illumination) of Example 2.

C. Discussion

The execution times of the ART2-based DMW and WED-based DMW schemes in Examples 1 and 2 are 0.891 s and 0.030 s on average, respectively. The computer used to test the execution time has an Intel Core 2 CPU at 2.13 GHz with 1.48 GB RAM. Microsoft Windows XP is adopted as the operating system. Moreover, both the ART2-based and WED-based schemes are developed by applying MATLAB 7.6.

With the initial model-creation window size being 75, the numbers of clusters of the ART2-based DMW and WED-based DMW schemes in Examples 1 and 2 are 40 and 7, respectively. The number of clusters should not be close to the model-creation window size, otherwise the number of elements in each cluster will be relatively small. According to the experience gained from [16], all the elements in the same cluster sorted by ART2 have the same properties. To sustain the properties of each cluster, the minimal number of elements is three (3) [16]. Therefore, the predefined threshold of the number of elements in each cluster is set to be three as shown in Step 7 of Fig. 1.

Because the WED-based DMW scheme performs better than the ART2-based scheme in both execution time and number of clusters, the WED-based DMW scheme is recommended for VM model refreshing.

IV. Conclusion

This paper proposed a DMW scheme for VM model refreshing. The DMW scheme added a new sample into the model and applied a clustering technology to do similarity clustering. Next, the number of elements in each cluster was checked. If the largest number of elements was greater than three, then the oldest sample in the group with the largest population was deleted. Two clustering methods, ART2 and the proposed WED, were applied in this paper. Process and metrology data of a photo tool that is practically operating in a fifth-generation TFT-LCD factory in Taiwan were adopted as the illustrative examples. Test results showed that the DMW scheme has better online conjecture accuracy than the SMW scheme. Further, the execution time of the WED-based DMW scheme
was at least ten times faster than that of the ART2-based DMW scheme. Therefore, the WED-based DMW scheme is recommended for VM model refreshing.

Acknowledgment

The authors would like to thank Chi Mei Optoelectronics Corporation, Taiwan, for providing the raw data of the photo processes used in the illustrative examples.

References
[1] A. Weber, “Virtual metrology and your technology watch list: Ten things you should know about this emerging technology,” Future Fab Int., vol. 22, no. 4, pp. 52–54, Jan. 2007.
[2] A. A. Khan, J. R. Moyne, and D. M. Tilbury, “An approach for factory-wide control utilizing virtual metrology,” IEEE Trans. Semicond. Manuf., vol. 20, no. 4, pp. 364–375, Nov. 2007.
[3] O. Rothe. (2008, Jul.). ISMI next generation factory. Presented at the e-Manufacturing Workshop, SEMICON West [Online]. Available: http://www.sematech.org/meetings/archives/emanufacturing/8546/01-NGF.pdf
[4] J. Moyne, “International technology roadmap for semiconductors (ITRS) perspective on AEC/APC,” presented at the 21st ISMI AEC/APC Symposium, Ann Arbor, MI, Sep. 2009.
[5] J. Moyne, “PCS mechanisms for fab-wide development and latest trends, new directions in PCS: Virtual metrology,” presented at the 21st ISMI AEC/APC Symposium, Ann Arbor, MI, Sep. 2009.
[6] F.-T. Cheng, H.-C. Huang, and C.-A. Kao, “Developing an automatic virtual metrology system,” IEEE Trans. Automat. Sci. Eng., vol. 9, no. 1, pp. 181–188, Jan. 2012.
[7] F.-T. Cheng, J. Y.-C. Chang, H.-C. Huang, C.-A. Kao, Y.-L. Chen, and J.-L. Peng, “Benefit model of virtual metrology and integrating AVM into MES,” IEEE Trans. Semicond. Manuf., vol. 24, no. 2, pp. 261–272, May 2011.
[8] W.-M. Wu, F.-T. Cheng, T.-H. Lin, D.-L. Zeng, and J.-F. Chen, “Selection schemes of dual virtual-metrology outputs for enhancing prediction accuracy,” IEEE Trans. Automat. Sci. Eng., vol. 8, no. 2, pp. 311–318, Apr. 2011.
[9] F.-T. Cheng, H.-C. Huang, and C.-A. Kao, “Dual-phase virtual metrology scheme,” IEEE Trans. Semicond. Manuf., vol. 20, no. 4, pp. 566–571, Nov. 2007.
[10] A. A. Khan, J. R. Moyne, and D. M. Tilbury, “Virtual metrology and feedback control for semiconductor manufacturing process using recursive partial least squares,” J. Process Contr., vol. 18, pp. 961–974, Apr. 2008.
[11] S. Lynn, J. V. Ringwood, and N. MacGearailt, “Weighted windowed PLS models for virtual metrology of an industrial plasma etch process,” in Proc. IEEE Int. Conf. Indust. Tech., Mar. 2010, pp. 271–276.
[12] W.-M. Wu and F.-T. Cheng, “Preliminary study of a dynamic-moving-window scheme for virtual-metrology model refreshing,” in Proc. IEEE Int. Conf. Robot. Automat., May 2012, to be published.
[13] G. A. Carpenter and S. Grossberg, “ART 2: Self-organization of stable category recognition codes for analog input patterns,” Appl. Optics, vol. 26, no. 12, pp. 4919–4930, Dec. 1987.
[14] G. A. Carpenter and S. Grossberg, “A massively parallel architecture for a self-organizing neural pattern recognition machine,” Comput. Vision Graphics Image Process., vol. 13, no. 7, pp. 37–54, Jul. 1987.
[15] E. Deza and M. M. Deza, Encyclopedia of Distances. New York: Springer, 2009.
[16] Y.-T. Huang and F.-T. Cheng, “Automatic data quality evaluation for the AVM system,” IEEE Trans. Semicond. Manuf., vol. 24, no. 3, pp. 445–454, Aug. 2011.
[17] R. V. Hogg and E. Tanis, Probability and Statistical Inference. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[18] D. W. Scott, “Sturges' rule,” Wiley Interdisciplinary Rev.: Computat. Statist., vol. 1, no. 3, pp. 303–306, Nov. 2009.
[19] F.-T. Cheng, Y.-T. Chen, Y.-C. Su, and D.-L. Zeng, “Evaluating reliance level of a virtual metrology system,” IEEE Trans. Semicond. Manuf., vol. 21, no. 1, pp. 92–103, Feb. 2008.

Wei-Ming Wu (S'08) was born in Tainan, Taiwan, on March 27, 1981. He received the B.S. degree from the Department of Industrial Engineering and Management, Taipei University of Technology, Taipei, Taiwan, in 2004, and the M.S. degree from the Institute of Manufacturing Engineering, National Cheng Kung University, Tainan, in 2006. He is currently pursuing the Ph.D. degree with the Institute of Manufacturing Information and Systems, National Cheng Kung University.
His current research interests include factory automation and virtual metrology for semiconductor manufacturing, key-variable selection, artificial intelligence, and statistics analysis.

Fan-Tien Cheng (S'87–M'89–SM'98–F'08) received the B.S. degree from National Cheng Kung University (NCKU), Tainan, Taiwan, in 1976, the Masters degree in 1982 and the Ph.D. degree in 1989 from Ohio State University, Columbus, all in electrical engineering.
He is currently the Chair Professor of NCKU. From August 1998 to July 2001, he was the Director of the Institute of Manufacturing Engineering (IME), NCKU. He built a web-enabled experimental manufacturing execution system and a supply chain information system for integrated circuit packaging. He also established testing beds for e-diagnostics, equipment-engineering-system, engineering-chain-management-system, and automatic-virtual-metrology frameworks for semiconductor manufacturing at IME Automation Laboratory for educational and research purposes. He is the founder of the e-Manufacturing Research Center (eMRC), NCKU, and has been the Director of eMRC since January 2008. His current research interests include semiconductor manufacturing automation, e-manufacturing, virtual metrology, and intelligent PM.
Prof. Cheng received the Senior Scientist Award from the DoD, Taiwan, in 1994, the Kayamori Best Automation Paper Award at the IEEE ICRA in 1999, the Outstanding Industry-University-Cooperation (IUC) Award from the MoE, Taiwan, in 2003, the NCKU Distinguished IUC-Professor Award in 2004 and 2008, the Taiwan National Science Council (NSC) Outstanding IUC Award (as the only awardee) in 2006, the Taiwan NSC Outstanding Research Award in 2006 and 2009, the University Industry Economy Contribution Award, Individual Award from the Ministry of Economic Affairs (MOEA), Taiwan, in 2008, the TECO Award from the TECO Technology Foundation, Taiwan, in 2010, the National (Silver) Invention and Creation Award from MOEA, Taiwan, in 2011, and the Award for Outstanding Contributions in Science and Technology from the Executive Yuan, Taiwan, in 2011. He was an Associate Editor of the IEEE Transactions on Robotics and Automation from 2000 to 2004. He was the IEEE ICRA Kayamori Best Automation Paper Award Committee Chair in 2006, the Senior Program Committee Member of ICRA in 2011, the Program Chair of IEEE WCICI in 2011, and the Convener and the Program Director of the NSC Automation Engineering Program, Taiwan, from 2007 to 2009. He will be the Program Chair of the IEEE CASE in 2014.

Fan-Wei Kong was born in Tainan, Taiwan, on December 5, 1984. He received the B.S. degree from the Department of Industrial Engineering and System Management, Chung Yuan Christian University, Taichung, Taiwan, in 2005, and the M.S. degree from the Institute of Manufacturing Engineering, National Cheng Kung University, Tainan, in 2011.
His current research interests include semiconductor manufacturing automation, equipment engineering system, e-manufacturing, artificial intelligence, and statistics.
