
Proceedings of the 2010 Industrial Engineering Research Conference

An Agent-Based Approach to Enhance Bio-Manufacturing Quality Control Using Data Mining

Tzu-Liang (Bill) Tseng¹, Richard Chiou², Chun-Che Huang³ and Johnny C. Ho⁴
¹ Department of Industrial Engineering, The University of Texas at El Paso, USA
² Applied Engineering Technology, Drexel University, USA
³ Department of Information Management, National Chi Nan University, Taiwan
⁴ Turner College of Business and Computer Science, Columbus State University, USA

Abstract

Quality control (QC) is a process employed to ensure a certain level of quality in a product or service. One
QC technique is to predict product quality from product features. However, traditional QC techniques have
drawbacks: they depend heavily on the collection and analysis of data, and they must frequently deal with
uncertainty. To improve the effectiveness of the QC process, this paper proposes an agent-based hybrid
approach that incorporates data mining techniques such as rough set theory (RST). Under the agent-based
framework, each agent performs one or more functions during the QC process. An empirical case study in
bio-manufacturing suggests that the proposed approach is promising for QC processes.

Keywords
Quality control, Agent technology, Rough set theory, Bio-manufacturing

1. Introduction
Quality has become one of the major manufacturing strategies and perhaps the single most important way to
achieve success in a highly competitive manufacturing market. High-quality production offers advantages
such as reduced scrap and re-machining costs and increased market share. To ensure quality in a machining
process, it is important to respond quickly to a dynamic environment. According to the literature, decision
rules are valued as support for QC procedures when variations occur in the machining process. Consequently,
an effective prediction model that uses the significant features of part quality is required in contemporary
manufacturing, including so-called e-manufacturing. Traditionally, statistical process control (SPC) seeks to
control and minimize variation in the manufacturing process. However, with SPC it is very difficult to
establish the best manufacturing specifications in plants with complex sequential processes [1]. Moreover,
statistical methods cannot handle linguistic variables or uncertain and incomplete information. Therefore, a
hybrid data mining approach for a QC system, which integrates rough set theory, a fuzzy logic system and a
genetic algorithm, is proposed and applied.
To make the data mining approach more effective and extensible, the way information is gathered,
managed, distributed and delivered to decision-makers must be improved. Agent technology is a potential
tool for this purpose. In this paper, an agent approach is proposed to augment part quality control. Under the
agent-based framework, three main stages are identified and constructed for the QC prediction system:
(1) Stage I - Quality control rule induction stage: a Rough Set (RS) based approach is used to select
significant features and derive decision rules. (2) Stage II - Process variation modeling stage: after the
significant features are identified at Stage I, a Fuzzy Set (FS) approach is used because it can model and
compensate for process variations effectively. (3) Stage III - Solution
optimization stage: Genetic Algorithms (GA) are used to train the membership functions from Stage II in
order to optimize the fuzzy solution.

2. Literature Survey
The traditional way of achieving and ensuring quality standards is mainly through statistical process control
(SPC) procedures [7]. However, in sequential manufacturing processes, product quality is influenced by
many factors that are causally related and interact with each other. Thus, it is very difficult to set up the
best manufacturing specifications for SPC by executing design of experiments (DOE) in plants that have
large equipment or sequential processes [1]. Conventional SPC and six sigma techniques also rely on several
statistical assumptions, such as normally distributed variables and constant variance. It is hard to meet all
these assumptions in practice.
A data mining approach that combines variable precision rough sets and fuzzy sets would produce a
model that is more capable of handling noisy, fuzzy, uncertain and complicated problems than its individual
components [7]. The literature also shows that data mining can improve quality control; see, for
example, [6].
Current statistical approaches have difficulty analyzing qualitative information, such as a qualitative
variable characterized at several levels. In addition, the uncertainty (i.e., variation) of vague observations is
essentially non-statistical in nature, and hence these observations may not adequately support the random
variation assumption inherent in statistical quality control methods. Moreover, the final solutions derived
from standard statistical techniques may not be optimal because these methodologies are not able to learn
from historical data. To address these deficiencies, a hybrid data mining approach that integrates rough set
theory (RST), fuzzy set theory (FST), a genetic algorithm and agent-based technology is proposed. Compared
with standard statistical tools, which use a population-based approach, RST uses an individual,
object-model-based approach that makes it a very good tool for analyzing quality control problems [3].
Furthermore, FST has demonstrated its ability in a number of applications, especially for the control of
complex non-linear systems that may be difficult to model analytically. The Genetic Algorithm (GA) operates
on a population of solutions rather than a single solution [2]. To resolve the drawbacks of statistical
methodologies in quality control, the proposed approach aims to optimize prediction for the lowest
defective rate.

3. Solution Approaches to the QC Problem


This section integrates the essence of RST, FST and GA with agent technology to provide solution
alternatives for QC problems. The holistic multi-agent environment and architecture, quality control rule
induction, a fuzzy rule system for part quality prediction and process variation modeling, and solution
optimization through a genetic algorithm are presented in Sections 3.1-3.4, respectively.
3.1. The Agent-based hybrid approach
The multi-agent environment, which aims to enhance communication effectiveness for the quality prediction
of machined parts, is presented in Figure 1. The environment comprises three stages and eleven types of
agents. In this agent-based system, each agent is able to perform one or more services. A service
corresponds to some problem-solving activity in the QC process. Each stage has several agents and may use
domain resources. The domain resources include not only databases and jobs, but also other agents. The
latter case allows a nested (hierarchical) agent system to be constructed in which higher-level agents realize
their functionality through lower-level agents (the lower-level agents have the same structure as the
higher-level agents and can, therefore, have sub-agents as well as jobs). The nesting of services can be
arbitrarily complex, and at the topmost level the entire business process can ultimately be viewed as a
service. Each service is managed by one agent, although the execution of its sub-services may involve a
number of other agents.
Since agents are autonomous, there are no control dependencies between them. Therefore, if an agent
requires a service that is managed by another agent, it cannot simply instruct that agent to start the service.
Rather, the agents must come to a mutually acceptable agreement about the terms and conditions under which
the desired service will be performed. Agreements are reached through negotiation, a joint decision-making
process in which the parties state their (possibly contradictory) demands and then move towards agreement
by a process of concession or a search for new alternatives.

Figure 1: Environment of the proposed agent-based approach


3.2 The feature & rule extraction stage and its solution procedure
The reduct generation agent reduces the complexity of the original data records. The rule extraction agent
extracts significant features from useful reduct rules. Finally, the validation agent validates the critical
rules, and the significant features are extracted. These three types of agents are described in the following
sections.
3.2.1. The reduct generation agent
A reduct, which is generated by the reduct generation agent of the feature & rule extraction stage, is a
minimal sufficient subset of features that provides the same quality of discriminating concepts as the
original set of features. Most rough set based approaches may generate more than one reduct for an
object. This paper adopts the reduct generation procedure proposed by Pawlak [4].
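To make the notion of a reduct concrete, the following is a minimal Python sketch. It is not the procedure of [4]: it is a brute-force search, it computes reducts for a whole (consistent) decision table rather than per object, and the toy table, the feature names and the helper names `is_consistent` and `find_reducts` are all hypothetical.

```python
from itertools import combinations

def is_consistent(rows, features, decision):
    """True if objects that agree on `features` always share the same decision."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in features)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True

def find_reducts(rows, features, decision):
    """Brute-force search for minimal feature subsets that keep the table consistent."""
    reducts = []
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            # Supersets of an already-found reduct are not minimal, so skip them.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(rows, subset, decision):
                reducts.append(subset)
    return reducts

# Hypothetical toy decision table (not data from the case study).
table = [
    {"F1": 0, "F2": 1, "F3": 0, "Q": "good"},
    {"F1": 0, "F2": 0, "F3": 1, "Q": "defective"},
    {"F1": 1, "F2": 1, "F3": 1, "Q": "good"},
    {"F1": 1, "F2": 0, "F3": 0, "Q": "defective"},
]
reducts = find_reducts(table, ["F1", "F2", "F3"], "Q")
print(reducts)  # [('F2',), ('F1', 'F3')]
```

In this toy table two different reducts exist, which illustrates why a later step is needed to choose among candidate reduct rules. The exhaustive search is exponential in the number of features and is only practical for small feature sets.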
3.2.2. The rule extraction agent
To obtain significant features from the data set, the rule extraction agent is based on the heuristic
algorithm developed by Tseng and Huang [5]. The data set is randomly divided into a training set and a
testing set. With this algorithm, feature sets are used to predict an object's outcome based on the training
set. A procedure is also required to validate the derived reduct rules against the testing set. The rule
extraction procedure consists of the following steps:
Step 1. Define a proper feature set and a target file. This step is critical to obtaining high-accuracy
outcomes from the algorithm.
Step 2. Examine each object in the set for completeness. If the object is incomplete, delete it from the
file; repeat for all objects.
Step 3. Determine the final rules from the candidate decision rules generated by the reduct generation
procedure. If all of the candidate decision rules are satisfied, go to Step 6; otherwise go to Step 4.
Step 4. If the candidate reduct rules are satisfied, transfer the reduct rules to Step 5; otherwise, restore the
objects associated with the unsatisfied rule and go to Step 3.
Step 5. Collect all of the satisfied reduct rules and go to Step 6; otherwise, go to Step 4.
Step 6. Stop and output the results (i.e., the potential rules for further validation).
3.2.3. The rule-validation agent
The rule-validation agent validates the decision rules derived from the reduct rules against a threshold
determined by the domain expert; the validation procedure is presented in Figure 2.
After the RST approach is applied, the decision rules are generated. Domain experts and knowledge workers
select the features that appear most frequently in the premises of the decision rules as significant
features. The significant features are used in the fuzzy logic system described in the next section.
Step 1. Compare each reduct rule derived from the rule-extraction algorithm with each new object from the
test set. Count how many objects match the rule;
Step 2. Repeat the comparison of reduct rules with objects from the test set until no reduct rule is left;
Step 3. Calculate the accuracy of each rule by dividing the number of correctly matched objects by the sum
of the correctly matched and incorrectly matched objects. If the accuracy of the rule is greater than a
predefined threshold value (e.g., 60%), then go to Step 4; otherwise, remove the rule;
Step 4. Stop and output the results.
Figure 2: The procedure of rule validation agent
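The validation procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: accuracy is read as correctly matched objects divided by all matched objects, a rule that matches no test object is dropped, and the rules, the test objects and the names `matches` and `validate_rules` are hypothetical.

```python
def matches(premise, obj):
    """True if the object satisfies every feature = value condition of the premise."""
    return all(obj.get(f) == v for f, v in premise.items())

def validate_rules(rules, test_set, threshold=0.6):
    """Keep the rules whose accuracy on the test set exceeds the threshold."""
    kept = []
    for premise, outcome in rules:
        matched = [o for o in test_set if matches(premise, o)]
        if not matched:
            continue  # assumption: a rule with no matching test object is removed
        correct = sum(1 for o in matched if o["Product"] == outcome)
        accuracy = correct / len(matched)
        if accuracy > threshold:
            kept.append((premise, outcome, len(matched), accuracy))
    return kept

# Hypothetical candidate rules and test objects.
rules = [
    ({"F4": 1, "F9": 1}, "Good Part"),
    ({"F4": 0}, "Defective"),
]
test_set = [
    {"F4": 1, "F9": 1, "Product": "Good Part"},
    {"F4": 1, "F9": 1, "Product": "Good Part"},
    {"F4": 1, "F9": 1, "Product": "Defective"},
    {"F4": 0, "F9": 2, "Product": "Good Part"},
    {"F4": 0, "F9": 3, "Product": "Defective"},
]
validated = validate_rules(rules, test_set)
```

Here the first rule matches three test objects and is correct for two of them (accuracy about 67%), so it passes the 60% threshold; the second rule is correct for one of two matches (50%) and is removed.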

3.3 The quality prediction stage


The proposed hybrid fuzzy logic system (FLS) takes the significant features from the feature & rule
extraction stage as input and predicts part quality by using an inference engine. It consists of four
components: fuzzifier, inference, defuzzifier, and rules. To develop the FLS, the quality prediction stage
involves four agents. The fuzzifier agent converts crisp numbers to fuzzy sets. A fuzzy set F is defined on a
universe of discourse X and is characterized by a membership function $\mu_F(x)$ that takes on values in the
interval [0, 1]. The rule determination agent defines fuzzy rules corresponding to all combinations of levels
across the significant features. The inference agent infers the prediction based on the fuzzy rules. The
defuzzification agent maps the fuzzy set from the inference agent into a crisp result.
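As an illustration of the fuzzifier and defuzzification agents, the sketch below uses triangular membership functions (the shape adopted later in the case study). The three linguistic levels, their breakpoints, the weighted-average defuzzifier and the function names are assumptions made for this example, not values taken from the paper.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic levels for one feature normalized to [0, 1].
LEVELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

def fuzzify(x):
    """Map a crisp value to its degree of membership in each level."""
    return {name: triangular(x, *abc) for name, abc in LEVELS.items()}

def defuzzify(memberships, centers):
    """Weighted average of level centers (a simple height defuzzifier).

    Assumes at least one membership degree is non-zero, which holds for
    any x in [0, 1] with the levels defined above.
    """
    total = sum(memberships.values())
    return sum(memberships[k] * centers[k] for k in memberships) / total

degrees = fuzzify(0.25)
crisp = defuzzify(degrees, {"low": 0.0, "medium": 0.5, "high": 1.0})
print(degrees)  # {'low': 0.5, 'medium': 0.5, 'high': 0.0}
print(crisp)    # 0.25
```

A crisp input of 0.25 belongs equally to "low" and "medium", and the weighted average maps the fuzzy result back to a single crisp number.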
3.4 The optimization stage
The GA agent is used to search for the optimal solution of the quality prediction. A good fuzzy rule base is
determined by well-fitted membership functions and fuzzy rules. The GA agent can work in two different
ways. One is to fix the fuzzy rules and adapt the membership functions; the other is to fix the membership
functions and adapt the fuzzy rules. Since the GA is suited to self-learning and self-organizing, it is
effective in searching for the optimal solution in the FLS. In this paper, the former method is used because
adapting the fuzzy rules is complicated when the number of rules is much larger than the number of
input variables.
The solution procedure for the GA agent is as follows:
Step 1. Initialize the population according to the principle of gene encoding. All the fuzzy rules are encoded
as chromosomes.
Step 2. Determine the number of individuals for the next stage. If the number of individuals is too low, the
search progresses slowly; if it is too large, the computation becomes complicated.
Step 3. At the evaluation stage, the purpose of evaluation is to provide a standard for selection. The initial
individuals are tested extensively and provide feedback, which expresses the performance of an
individual as its fitness. For example, the evaluation function would rate the chromosomes as follows:
$\mathrm{eval}(v_1) = f(x_1) = e_1$ (1)
where the chromosome $v$ represents the real value $x$, and $e$ is the result of the fitness function
$\mathrm{eval}(v)$.
Step 4. At the selection stage, choose individuals as parents according to their fitness value $e$. Generally,
a higher fitness value is preferred.
Step 5. Determine the mutation probability used to mutate the new population members randomly.
Step 6. Cross over two members of the selected part of the population at a time to form a new population
member.
Step 7. Produce the next generation of the population. Repeat Step 3 to Step 7 until a satisfactory solution
is obtained.

The purpose of GA adaptation is to adapt the membership function of each fuzzy rule so that the
inference agent can predict more accurately. The GA adaptation starts with approximate control rules
derived from the empirical models and refines them through a learning process when process variations
occur. The fuzzy input remains the same, while the fuzzy output membership functions are adapted to
minimize errors. The GA fitness function is given as follows:
$\min E(i) = \frac{1}{2}\sum (y_i - d_i)^2$ (2)
where $E(i)$ is the error between the actual defective rate and the fuzzy output, $y_i$ is the fuzzy output,
and $d_i$ is the resulting part quality.
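The GA steps and the squared-error fitness of Eq. (2) can be sketched in Python as below. This is a simplified illustration, not the authors' GA agent: the chromosome is assumed to hold only the centers of three output membership functions, the input sets are fixed triangles, selection is by truncation with elitism, crossover is applied before mutation (the conventional order), and the data points and all parameter values are hypothetical.

```python
import random

def triangular(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Fixed fuzzy input sets (hypothetical); only the output centers are adapted.
INPUT_SETS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]

def predict(centers, x):
    """Fuzzy output y: membership-weighted average of the output centers."""
    w = [triangular(x, *s) for s in INPUT_SETS]
    return sum(wi * ci for wi, ci in zip(w, centers)) / sum(w)

def error(centers, data):
    """E = 1/2 * sum (y_i - d_i)^2, following Eq. (2)."""
    return 0.5 * sum((predict(centers, x) - d) ** 2 for x, d in data)

def run_ga(data, pop_size=20, generations=40, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in INPUT_SETS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: error(c, data))      # evaluation
        history.append(error(pop[0], data))
        parents = pop[: pop_size // 2]              # selection (truncation)
        children = [pop[0][:]]                      # elitism: keep the best
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, len(p1))
            child = p1[:cut] + p2[cut:]             # one-point crossover
            if rng.random() < p_mut:                # mutation of one gene
                i = rng.randrange(len(child))
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = children                              # next generation
    best = min(pop, key=lambda c: error(c, data))
    history.append(error(best, data))
    return best, history

# Hypothetical (normalized feature value, observed defective rate) pairs.
data = [(0.0, 0.10), (0.25, 0.20), (0.5, 0.30), (0.75, 0.45), (1.0, 0.60)]
best, history = run_ga(data)
```

Because the best individual is copied unchanged into each new generation, the recorded error never increases from one generation to the next; how close it gets to zero depends on the population size, mutation settings and number of generations.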

4. Case Study
This case study, conducted by the authors, applies the methodology presented in Section 3 to implement
the integrated data mining approach in a bio-manufacturing (BM) process called the “Dip-Spinning Coating”
process. Section 4.1 gives the background and describes the features that affect the quality of the final
product at ABC Inc. The remainder of Section 4 solves the quality control problem by applying the
methodology discussed in this paper and analyzes the computational results.

4.1 Background and problem description


The bio-manufacturing company ABC Inc. plans to investigate the features that affect the quality of the
abdominal aortic aneurysm model and to develop a prediction model for the selected features (see Figure 3)
during the Dip-Spinning Coating process. The features of the Dip-Spinning Coating process include
Dipping orientation (F1), Curing temperature (F2), Curing time (F3), Number of dippings (F4), Rotation
speed (F5), Prototype mold (F6), Withdrawal rate (F7), Silicone solution viscosity (F8) and Diameter of the
mold (F9), while the output feature is whether the part is good or defective. The analysts of the BM
department of ABC Inc. were responsible for extracting reliable and concise decision rules from the given
process features. In other words, this project focuses on rules with strong evidence, i.e., rules supported by
more examples and using as few attributes as possible, as well as on the development of the prediction
model. The analysts were also required to determine which attributes are significant to the derived rules
and to show that the rules are valid in determining the relationship between the features and the quality of
the final product.

Figure 3: Example abdominal aortic aneurysm model manufactured in the UTEP Keck Center for a
biomedical device manufacturer, showing the geometric computer model of the patient-specific anatomy
(left) and the resulting flexible models manufactured for cardiovascular device deployment testing (right).

4.2 Computational results


The data, consisting of 65 customer order records, are cleansed and incomplete records are removed by the
data collection agent. The 65 records were reduced to 55 because the data in 10 orders were incomplete.
The 55 records are divided into two groups: a training data set, which is used to derive the decision rules,
and a testing data set, which is used to verify the decision rules. The result of reduct generation and the
validated decision rules are presented below. Note that two values are listed in the brackets: the first is the
number of supporting objects and the second is the accuracy.
(1) IF (F3 = 0) AND (F4 = 1) AND (F9 = 1) THEN (Product = “Good Part”) [10, 95%]
(2) IF (F1 = 1) AND (F2 = 2) THEN (Product = “Good Part”) [8, 89%]
(3) IF (F4 = 0) AND (F9 = 3) THEN (Product = “Defective”) [5, 75%]
(4) IF (F3 = 2) AND (F4 = 3) AND (F7 = 1) THEN (Product = “Good Part”) [4, 75%]
(5) IF (F2 = 0) AND (F3 = 1) THEN (Product = “Good Part”) [3, 82%]
(6) IF (F4 = 1) AND (F6 = 1) THEN (Product = “Good Part”) [3, 90%]
(7) IF (F1 = 2) AND (F6 = 1) AND (F9 = 1) THEN (Product = “Defective”) [4, 80%]
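The listed rules can be encoded as plain data and applied to a new part, as in the Python sketch below. The rule contents are copied from the list above; the conflict-resolution policy (prefer the matching rule with the most supporting objects), the function name `predict_quality` and the example parts are assumptions for illustration only.

```python
# (premise, outcome, supporting objects, accuracy), copied from rules (1)-(7).
RULES = [
    ({"F3": 0, "F4": 1, "F9": 1}, "Good Part", 10, 0.95),
    ({"F1": 1, "F2": 2}, "Good Part", 8, 0.89),
    ({"F4": 0, "F9": 3}, "Defective", 5, 0.75),
    ({"F3": 2, "F4": 3, "F7": 1}, "Good Part", 4, 0.75),
    ({"F2": 0, "F3": 1}, "Good Part", 3, 0.82),
    ({"F4": 1, "F6": 1}, "Good Part", 3, 0.90),
    ({"F1": 2, "F6": 1, "F9": 1}, "Defective", 4, 0.80),
]

def predict_quality(part):
    """Return the outcome of the best-supported matching rule, or None.

    Assumption: when several rules match, the one with the most
    supporting objects wins; the paper does not specify this policy.
    """
    hits = [r for r in RULES if all(part.get(f) == v for f, v in r[0].items())]
    if not hits:
        return None
    return max(hits, key=lambda r: r[2])[1]

# Hypothetical parts described by their coded feature levels.
print(predict_quality({"F3": 0, "F4": 1, "F9": 1}))            # Good Part
print(predict_quality({"F4": 0, "F9": 3}))                     # Defective
print(predict_quality({"F1": 2, "F3": 1, "F4": 1, "F6": 1, "F9": 1}))  # Defective
print(predict_quality({"F4": 2}))                              # None
```

The third part matches both rule (6) and rule (7); under the assumed policy rule (7) wins because it has four supporting objects against three.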

4.3 The results at the quality prediction stage


At the feature & rule extraction stage, Feature 4 (Number of dippings), Feature 6 (Prototype mold) and
Feature 9 (Diameter of the mold) are identified as significant in the fabrication of the part, since the Rough
Set software developed at the University of Texas at El Paso selects them from the premises of the eight
decision rules. These three features differ in nature. Depending on the level of uncertainty, different
types (e.g., Type I and Type II) of Fuzzy Logic System (FLS) are needed for further investigation. The Type I
FLS membership function is applied to Feature 6, while the Type II FLS membership function is used for
Feature 4 and Feature 9, since Features 4 and 9 are very similar and their combination carries more
uncertainty than Feature 6. Since two different types of membership functions are incorporated, the
combined FLS is called a Hybrid FLS. In this case, the triangular membership function is applied since it is
simple and suitable for most conditions.
4.4 Evaluation of Type I and Hybrid FLS using GA agent
To provide a point of comparison for the Hybrid system, a Type I FLS was applied as a baseline. Since the
construction of fuzzy membership functions and the derivation of rules depend on the expert’s knowledge
and experience, a subjective
judgment might lead to inaccuracy. Consequently, the software is used to refine the original fuzzy
membership function with the empirical data. Three cases are compared in terms of prediction
performance: (1) Type I FLS (after GA training), (2) Hybrid FLS (before GA training), and (3) Hybrid FLS
(after GA training). Figure 4 shows the performance of these three cases.
In the case study, the proposed approach provides higher quality-prediction accuracy than the other
approaches compared, since the curve generated by the hybrid approach (after training) is closer to the real
quality curve in most conditions. The decision rules generated during the feature & rule extraction stage
provide decision support for quality improvement in manufacturing processes. After this investigation,
operators, process designers, quality engineers and inspectors can focus on the selected features, since
overall part quality will be improved by paying close attention to these significant features.
[Figure 4 plot: Quality (0 to 0.7) versus Part No. (1 to 12) for four series: Real Quality; Case I (Type I, After Training); Case II (Hybrid, Before Training); Case III (Hybrid, After Training)]

Figure 4: Comparison of prediction performance of three cases of FLS

5. Conclusions
In this paper, a hybrid approach is developed in three stages within an environment of twelve agents, each
of which has its own functionality and unique solution procedure. The proposed approach compensates for
the weaknesses of traditional quality techniques; for example, traditional techniques applied alone yield
non-optimal results. The feature & rule extraction stage produces significant features that can be used to
model and compensate for process variations effectively through the FLS. Finally, the GA approach searches
for optimal solutions in conjunction with the constructed FLS. The rules derived from the data set indicate
how to study this problem further and pave the way for effective further investigation. This paper forms a
basis for solving many similar problems that occur in manufacturing industries.

Acknowledgements
This work was supported by the US National Science Foundation (CCLI Phase I DUE-0737539) and the
US Dept. of Education (Award #P116B080100A). The authors wish to express sincere gratitude for their
financial support.

References
[1] Boo, S. K., Deok, H. C., Sang, C. P., 1999, “Intelligent process control in manufacturing industry with
sequential processes,” International Journal of Production Economics, Vol. 60-61, pp. 583-590.
[2] Goldberg, D.E., 1989, Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley,
Reading, MA.
[3] Kusiak, A., 2001, “Rough Set Theory: A data mining tool for semiconductor manufacturing,” IEEE
Transactions on Electronics Packaging Manufacturing, Vol. 24, No. 1, pp. 44-50.
[4] Pawlak, Z., 1991, Rough Sets: Theoretical Aspects of Reasoning about Data, Boston: Kluwer Academic
Publishers.
[5] Tseng, T.-L.(Bill), Huang, C. C., 2006, “Rough set-based approach to feature selection in customer relationship
management,” Omega, In Press, Corrected Proof, Available online.
[6] Tseng, T.-L.(Bill), Yongjin, K., Yalcin, M., 2005, “Feature-based rule induction in machining operation using
rough set theory for quality assurance,” Robotics and Computer Integrated Manufacturing, Vol. 21(6), pp. 559-567.
[7] Zhai, J., Xu, X., Xie, C., Luo, M., 2004, “Fuzzy control for manufacturing quality based on variable precision
rough set,” Intelligent Control and Automation, Vol. 3, pp. 2347-2351.