Professional Documents
Culture Documents
Application
Abstract
In traditional customer service support of a manufacturing environment, a customer service database usually stores two
types of service information: (1) unstructured customer service reports record machine problems and its remedial actions and
(2) structured data on sales, employees, and customers for day-to-day management operations. This paper investigates how to
apply data mining techniques to extract knowledge from the database to support two kinds of customer service activities:
decision support and machine fault diagnosis. A data mining process, based on the data mining tool DBMiner, was
investigated to provide structured management data for decision support. In addition, a data mining technique that integrates
neural network, case-based reasoning, and rule-based reasoning is proposed; it would search the unstructured customer service
records for machine fault diagnosis. The proposed technique has been implemented to support intelligent fault diagnosis over
the World Wide Web. # 2000 Elsevier Science B.V. All rights reserved.
Keywords: Data mining; Knowledge discovery in databases; Customer service support; Decision support; Machine fault diagnosis
0378-7206/00/$ ± see front matter # 2000 Elsevier Science B.V. All rights reserved.
PII: S 0 3 7 8 - 7 2 0 6 ( 0 0 ) 0 0 0 5 1 - 3
2 S.C. Hui, G. Jha / Information & Management 38 (2000) 1±13
2. Data mining
a priority that determines the sequence in which it can regression tree (CART), k-means algorithm, and case-
be exercised and a help ®le that gives visual details on based reasoning for classi®cation, prediction, and
how to carry out the checkpoint. An example of a clustering functions. There are also some tools that
fault-condition and checkpoint information for a ser- only aim at a speci®c data mining function. This
vice record is given in Fig. 2. provides ¯exibility; the users can select different
In addition, the customer service database also data mining tools for their problem domains to achieve
stores data related to sales, customers, and employees: the best results.
six major tables are de®ned in the customer service The choice of data mining tool must be based on the
database for this. Two, namely, MACHINE_FAULT application domain and its supported features. Certain
and CHECKPOINT, are used to store the knowledge applications may require only one data mining func-
base on common machine fault-conditions and their tion; others may require more than one. In this
checkpoints. These are unstructured textual data. The research, DBMiner [12] was chosen. This system
remaining four tables are used to store information on was developed by the DBMiner Research Group from
customers (CUSTOMER), employees (EMPLOYEE), the Intelligent Database Systems Research Laboratory
sales (MACHINE) and maintenance (SERVICE_RE- at Simon Fraser University in Canada. The system,
PORT). These four store only the structured data. which integrates data warehousing, on-line analytical
There are over 70 000 service records. Since each processing (OLAP) [6] and data mining techniques,
of the fault-conditions has several checkpoints, there supports the discovery of various kinds of knowledge
are over 50 000 checkpoints. Information on over at multiple conceptual levels from large relational
4000 employees, 500 customers, 300 different databases. The DBMiner system supports most of
machine models and 10 000 sales transactions are the major functions. It was implemented using many
also stored. advanced data mining techniques. In addition, it pro-
vides multidimensional data visualization support and
3.1. Mining structured data interacts with standard data sources through open
database connectivity (ODBC) interface.
A list of most popular data mining tools available
commercially or in public domain is given in the 3.2. Mining unstructured data
KDNuggets website [17]. These tools can mine the
structured data of sales, maintenance, and particulars Although DBMiner is an excellent data mining tool
of employees and customers in the customer service for large databases with structured data, it is unsuitable
database. It is interesting to see that a number of tools for extracting knowledge from the textual data of the
support multiple approaches; i.e., more than one customer service database. As the information or
data mining techniques. For example, Darwin from knowledge on common faults and their suggested
Thinking Machine Corp. supports neural networks, remedies are stored in textual format as fault-condi-
4 S.C. Hui, G. Jha / Information & Management 38 (2000) 1±13
tions and checkpoints, new techniques are needed to approaches, such as hybrid case-based reasoning
extract knowledge from this database for machine and neural network [21,24], have also been developed.
fault diagnosis. This is known as text mining Here, a data mining technique that integrates case-
[2,19,22]. based reasoning, neural network and rule-based rea-
Traditionally, case-based reasoning (CBR) has been soning is de®ned. These two are incorporated into the
successfully applied to fault diagnosis for customer framework of the CBR cycle. Instead of using the
service support [20,23,25]. CBR systems rely on nearest neighbour technique of traditional CBR sys-
building a large repository of diagnostic cases (or past tems, a neural network is used for indexing and
service reports) in order to circumvent the dif®cult retrieval of most appropriate service records based
task of extracting and encoding expert domain knowl- on user's fault description. Rule-based reasoning is
edge [1]. It is one of the most appropriate techniques used to guide the reuse of checkpoint solutions.
for machine fault diagnosis, as it learns by experience
gained in solving problems and hence emulates
human-like intelligence. However, the performance 4. Data mining for decision support
of CBR systems critically depend on the adequacy as
well as the organization of cases and the algorithms Information, such as the best selling machines, the
used for retrieval from a large case database. Most customers of a particular machine, a comparison of
CBR systems [28] use the nearest neighbour algorithm sales among different machines, and the performance
for retrieval from the ¯at-indexed case database; this is of different service engineers are highly desirable for
inef®cient, especially for large case database. Other the management team.
CBR systems use hierarchical indexing such as CART
[5], decision trees [26], and C4.5 [27]. Although this 4.1. Data mining process
performs ef®cient retrieval, building a hierarchical
index needs the knowledge of an expert during the Fig. 3 depicts the data mining process for extracting
case-authoring phase. hidden knowledge from large databases. The process
The neural network approach [8] provides an ef®- focuses on ®nding interesting patterns that can be
cient learning capability when provided detailed interpreted as useful knowledge. It consists of seven
examples. Neural networks may be either supervised steps.
or unsupervised, depending on the method of training.
It performs retrieval based on nearest neighbour 4.1.1. Establishing the mining goals
matching, since it stores the weight vectors as the This involves the understanding of the customer
code-book or exemplar vector for the input patterns. service support process, its database, and the admin-
The matching is based on a competitive process that istrative procedures of the company. A number of
determines the output unit that is the best match for the mining goals were identi®ed:
input vector, similar to the nearest neighbour rule.
However, the search space in a neural network is Marketing: identify the machine models with poor
greatly reduced because of the generalisation of sales and determine possible reasons; then improve
knowledge through training. In contrast, the CBR the design and reliability of those machine models
systems need to store all the cases in the case database in order to increase sales. Target the customers with
in order to perform accurate retrieval. The CBR mail campaigns of the machine models in which
systems that store only relevant cases for an ef®cient they are likely to be interested.
retrieval lack the accuracy as well as the learning Customer support: provide the best possible service
feature. Thus, neural networks are very suitable for to customers based on the machine model, the
case indexing and retrieval. nature of the problem, and geographical location.
Other data mining techniques include rule-based Resource management: assign duties to service
reasoning, fuzzy logic, genetic algorithms, decision engineers based on their expertise and past experi-
trees, inductive learning systems, and statistical pat- ence. Promote service engineers based on their
tern classi®cation systems [3]. In addition, hybrid performance.
S.C. Hui, G. Jha / Information & Management 38 (2000) 1±13 5
machine models serviced by service engineers using a `KL006' would be the most suitable person to be
bar chart. This summarization is very useful for under- assigned to serve `TAIT' in future. The next two rules
standing the expertise of service engineers. This show that the machine model `AVK_2013S' is ser-
information can be used to assign appropriate engi- viced by the service engineer `KL006' only and the
neers to servicing particular machine models. From faults were reported in the year `1996'. In addition,
the ®gure, it can be seen that the machine model Rule 8 indicates that the service engineer `10530' had
`AVK_2013S' has been serviced only by service resolved all the fault problems within a day.
engineer `KL006.' However, this also shows that
service engineer `KL006' has not worked on any other 4.1.7. Evaluating the mining results
machine model. Different data mining functions have been exer-
Another example is given in Fig. 6 to illustrate the cised, providing data. The information obtained is
use of association rules mining. The association rules next analyzed. The results are:
determine how the various attributes are related. Here,
there is a strong association among the attributes in a Marketing: OLAP analysis and summarization have
textual format. The rules have a minimum support of been applied to identify machine models with poor
8% and a minimum con®dence of 98%. High con- sales and high frequency faults. Clustering is used
®dence rules represent distinctive patterns within a to identify customers suitable for cross sales.
database: the ®rst two association rules state that the Customer support: association rules, classification,
customer `TAIT' has reported all the faults during the and clustering are used to identify those who have
year `1996' and it was serviced by the service engineer recently reported many faults. Better quality of
`KL006' only. This shows that the service engineer service can then be provided based on customer
geographical location and machine model pur- data mining technique based on the integration of
chased. neural network, case-based reasoning, and rule-based
Resource management: summarization can be used reasoning has been applied to the customer service
to identify the expertise of different service engi- database to support intelligent machine fault diagno-
neers. Association rules provide useful information sis.
on efficient engineers who can repair machine faults
within a day. Prediction analysis can be used to 5.1. Data mining process
compare different service engineers who have
repaired machine faults under the same conditions. Fig. 7 shows the framework of the data mining
With this, the company can assign job duties to process. It consists of two major processes: the off-line
service engineers based on their expertise and knowledge extraction process and the on-line fault
efficiency. diagnosis process. The ®rst extracts knowledge from
the customer service database to form a knowledge
base that contains the neural network models and a
5. Data mining for machine fault diagnosis rule-base. The neural network models and the rule-
base work within the CBR cycle to support the second,
The unstructured textual data of fault-condition and which uses the four stages of CBR cycle (retrieve,
checkpoint information of the customer service data- reuse, revise, and retain) to diagnose customer
base provides useful machine service information. A reported problems. It accepts user's problem descrip-
tion as input, maps the description into the closest neural network and the unsupervised Kohonen self-
fault-conditions of the faults previously stored from organizing map (KSOM) neural network [18]. LVQ3
the knowledge base, and retrieves the corresponding and KSOM are used as classi®cation and clustering
checkpoint solutions for the user. The user's feedback techniques for intelligent fault diagnosis, respectively.
on the fault diagnosis process is used to revise the Classi®cation techniques are used to put an instance of
problem and its solution. The new result is ultimately a new fault description into one of the known classes
retained as knowledge for enhancing performance of of faults and then use the suggested solution of the
future problems. known fault for the current problem. Clustering tech-
nique can be used to extract information from the
5.2. Knowledge extraction process customer service database to form groups of similar
faults and then de®ne a new problem instance in one of
Fig. 8 shows the knowledge extraction process for the clusters. The classi®cation into a speci®c fault-
retrieving information from the unstructured textual condition can be determined based on the closest
data of the fault-conditions and checkpoints in the match of the fault-condition with the input pattern
customer service database. There are two major within the cluster. The clustering technique has better
generation steps: neural network model and rule ef®ciency, but lower accuracy compared to the clas-
base. si®cation approach.
The neural network model generation phase The rule base generation process involves the
extracts the knowledge from the fault-conditions to extraction of knowledge from the checkpoint solutions
train the neural network to build neural network of the fault-conditions to generate a rule-base to guide
models for classi®cation and clustering. The fault- the reuse of checkpoint solution in the most effective
conditions in the customer service database are ®rst way. The rule-base consists of control and checkpoint
pre-processed to extract keywords. The pre-proces- rules. Control rules are coded manually to specify the
sing uses a word-list, stop-list, and algorithms from diagnostic procedure for the ®ring of checkpoint rules
Wordnet [11]. The extracted keywords are used to so that the checkpoints can be exercised one by one
form weight vectors to initialize the neural networks. procedure according to its priority. As each service
Then, the neural networks are trained to generate the record in the database contains a fault-condition and
neural network models. its prioritized checkpoints to be exercised for diag-
Two types of neural networks were investigated: the nosing the fault-condition, checkpoint rules can be
supervised learning vector quantization (LVQ3) generated automatically to provide speci®c diagnostic
10 S.C. Hui, G. Jha / Information & Management 38 (2000) 1±13
user for the fault-condition retrieved by the LVQ3 If the problem persists after trying all the check-
neural network. points for all the retrieved fault-conditions, the user
reports a failure by ®lling in a service request form via
5.3.3. Reuse of service records the web. In this case, a service engineer will rapidly
The checkpoints are presented in the order according contact the user. If the problem is too complicated, a
to the checkpoint rules ®red. They operate in a compe- service engineer will be dispatched for on-site repair.
titive manner to display the checkpoints in the order The service report subsequently generated will be
of their priority in solving the fault-condition. Help can updated in the customer service database.
also be obtained by clicking the `Help' option. This will
load the help ®le for the corresponding checkpoint to 5.4. Performance evaluation
help the user to carry out the remedial task.
Performance evaluation has been carried out to
5.3.4. Revise and retain with user feedback compare the retrieval performance of the LVQ3 neural
The neural network indexing database, the check- network for classi®cation and KSOM neural network
point rules, and the service records in the customer for clustering with the k nearest neighbour (kNN)
service database are next updated, based on user technique used in the traditional CBR systems [15].
feedback about the effectiveness of the fault diagnosis Two popular variations of kNN techniques were cho-
process. The input problem description and its past sen for comparison. The ®rst variation, known as
checkpoint solutions are revised through user's feed- kNN1, stores cases in a ¯at memory structure, extracts
back and updated into the relevant databases. As keywords from the textual descriptions and uses nor-
shown in Fig. 10, the user states whether the problem malized Euclidean distance for matching. The second
is resolved or not. If the problem is resolved, then the variation, known as kNN2, uses the fuzzy-trigram
neural network indexing database and the checkpoint technique [14] for matching. The performance com-
rule-base are updated. parison of the neural network techniques with the two
Table 1
Performance comparison between LVQ3, KSOM and kNNs
variations of the kNN techniques was measured in respectively. Fuzzy-trigram matching has a better
terms of ef®ciency and precision. Two experiments performance than the Euclidean distance matching
were carried out. The ®rst was based on the testing set because of its ability to handle spelling mistakes
of service records, while the second was based on user and grammatical variations in the user input. However,
fault descriptions. By doing this, a better evaluation its retrieval accuracy is lower than that of the two
can be obtained. The results are given in Table 1. neural networks.
For the ®rst experiment, the testing set of 15 850
service records was used. Both LVQ3 and KSOM
perform better than both variations of the kNN meth- 6. Conclusions
ods for retrieval in both the speed and accuracy because
of their ability to generalize information through As a collaborative research project with a multi-
training. In kNN1 technique using Euclidean distance national company, this research investigated the appli-
for matching, equal weights are assigned to the indi- cation of data mining techniques to extract knowledge
vidual attributes (i.e., keywords). Therefore, the retrie- from the customer service database for two kinds of
val is less accurate. In kNN2 using fuzzy-trigram customer service activities: decision support and
matching, positive scores are assigned for every machine fault diagnosis. The information stored in
sequence of three-letters. Although this technique the customer service database are classi®ed as struc-
may be useful in checking spelling errors and gram- tured and unstructured textual data. The structured
matical variations, the retrieval is quite inaccurate data are mined to enhance the decision making process
compared to the neural network techniques. Moreover, for better management of resources and marketing of
the major drawback in both of these kNN techniques is products. The unstructured data are mined to support
that new cases are indexed separately into the ¯at intelligent diagnosis of machine faults over the World
memory structure and thus the search space keeps on Wide Web.
increasing, thereby decreasing the ef®ciency. In order to mine the structured data in the customer
In the second experiment, a total of 50 fault descrip- service database, a data mining process based on the
tions were taken from non-expert users. The purpose data mining tool, DBMiner was proposed. To support
of this experiment was to test the performance when machine fault diagnosis, a data mining technique
the input was less precise in describing the fault- based on the integration of neural network, case-based
condition. Unlike the service records tested in the ®rst reasoning, and rule-based reasoning is incorporated.
experiment, user input was less accurate than that of This data mining technique can operate within a
service engineers. In this test, all the four retrieval system to provide ef®cient on-line machine fault
techniques were found to have lower retrieval accu- diagnosis over the World Wide Web.
racy due to the impreciseness and grammatical varia-
tion of the user input. The retrieval accuracy of the
LVQ3 neural network and the KSOM neural network References
were 88 and 86%, respectively. On the other hand, the
retrieval accuracy of the nearest neighbor and fuzzy- [1] A. Aamodt, E. Plaza, Case-based reasoning: foundational
trigram matching techniques were 72 and 76%, issues, methodological variations, and system approaches,
S.C. Hui, G. Jha / Information & Management 38 (2000) 1±13 13
Proceedings of AICom Ð Artificial Intelligence Commu- [19] K. Lagus, T. Honkela, S. Kaski, T. Kohonen, WEBSOM for
nications 7 (1), 1994, pp. 39±59. textual data mining, Artificial Intelligence Review, On-line
[2] H. Ahonen, O. Heinonen, M. Klemettinen, A.I. Verkamo, document available at <URL: http://websom.hut.fi/websom/
Applying data mining techniques for descriptive phrase doc/publications.html>, 1999, in press.
extraction in digital document collections, IEEE Forum on [20] Y.F.D. Law, S.W. Foong, S.E.J. Kwan, An integrated case-
Research and Technology Advances in Digital Libraries based reasoning approach for intelligent help desk fault
(ADL'98), 1998, pp. 2±11. management, Expert Systems with Applications 13 (4), 1997,
[3] K. Balakrishnan, V. Honavar, Intelligent diagnosis systems, pp. 265±274.
Journal of Intelligent Systems, 8(3) (1998), also available at [21] B. Lees, J. Corchado, Case based reasoning in a hybrid agent-
URL: http://www.cs.iastate.edu/honavar/publist.html>. oriented system, in: Proceedings of 5th German Workshop on
[4] R.J. Brachman, T. Khabaza, W. Kloesgen, G. Piatetsky- Case-Based Reasoning, 1997, pp. 139±144.
Shapiro, E. Simoudis, Mining business databases, Commu- [22] B. Lent, R. Agrawal, R. Srikant, Discovering trends in text
nication of the ACM 39 (11), 1996, pp. 42±48. databases, in: Proceedings of 3rd International Conference on
[5] L. Breiman, J. Friedman, R. Olshen, C. Stone, Classification Knowledge Discovery and Data Mining (KDD'97), 1997,
of Regression Trees, Wadsworth, 1984. pp. 227±230.
[6] S. Chaudhuri, U. Dayal, An overview of data warehousing and [23] Z.Q. Liu, F. Yan, Fuzzy neural network in case-based
olap technology, ACM SIGMOD Record 26 (1), 1997, pp. 65±75. diagnostic system, IEEE Transactions on Fuzzy Systems 5
[7] M.S. Chen, J. Han, P.S. Yu, Data mining: an overview from a (2), 1997, pp. 209±222.
database perspective, IEEE Transactions on Knowledge and [24] M. Papagni, V. Cirillo, A. Micarelli, A hybrid architecture for
Data Engineering 8 (6), 1996, pp. 866±883. a user-adapted training system, in: Proceedings of 5th
[8] R. Eustace, G. Merrington, A probabilistic neural network German Workshop on Case-Based Reasoning, 1997,
approach to jet engine fault diagnosis, in: Proceedings of the pp. 181±188.
8th International Conference on Industrial and Engineering [25] D.W.R. Patterson, J.G. Hughes, Case-based reasoning for
Applications of Artificial Intelligence and Expert Systems fault diagnosis, The New Review of Applied Expert Systems
(IEA/AIE'95), Melbourne, Australia, ACM, 6±8 June 1995, 3, 1997, pp. 15±26.
pp. 67±76. [26] J.R. Quinlan, Induction of decision trees, Machine Learning
[9] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, From data 1, 1986, pp. 81±106.
mining to knowledge discovery: an overview, in: U.M. [27] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan
Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy Kaufmann, 1993.
(Eds.), Advances in Knowledge Discovery and Data Mining. [28] I.D. Watson, Applying case-based reasoning: techniques for
AAAI Press, Cambridge, MA, 1996, pp. 1±34. enterprise systems, Morgan Kaufman Publishers, 1997.
[10] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, The KDD
process for extracting useful knowledge from volumes of S.C. Hui is an Associate Professor in the
data, Communications of the ACM 39 (11), 1996, pp. 27±34. School of Applied Science at Nanyang
[11] C. Fellbaum (Ed.), Wordnet: An Electronic Lexical Database, Technological University, Singapore. He
The MIT Press, 1998. receivedhis B.Sc. Degree inMathematicsin
[12] J. Han et al., DBMiner: a system for data mining in relational 1983 and a D. Phil. Degree in Computer
databases and data warehouses, in: Proceedings of CASCON Science in 1987 from the University of
'97: Meeting of Minds, Toronto, Canada, 1997. Sussex,UK.HeworkedinIBMChina/Hong
[13] J. Han, Towards on-line analytical mining in large databases, Kong Corporation as a system engineer
SIGMOD Record 27 (1), 1998, pp. 97±107. from 1987 to 1990. His current research
[14] Inference Corporation, CBR Content Navigator, Online interests include data mining, internet
document available at <URL: http://www.inference.com/ technology, and multimedia systems.
products/>, 1999.
[15] G. Jha, Data mining for customer service support, M.A.Sc G. Jha is currently a software engineer in
Thesis. School of Applied Science, Nanyang Technological eGain Communications Corp., Sunnyvale,
University, Singapore, 1999. USA. He received his Bachelor of Tech-
[16] Johnson Space Centre's homepage, CLIPS (C Language nology Degree in Computer Science and
Integrated Production System), On-line document available at Engineering from Indian Institute of
<URL: http://www.jsc.nasa.gov/clips/CLIPS.html>, 1999. Kanpur, India, and his M.A.Sc in Compu-
[17] K.D. Nuggets, Tools (Siftware) for Data Mining and Knowl- ter Engineering from the School of
edge Discovery, On-line document available at <URL: http:// Applied Science, Nanyang Technological
www.kdnuggets.com/tools.html>, 1999. University, Singapore. His research inter-
[18] T. Kohonen, Self-Organizing Maps, 2nd Extended Edition, ests include data mining, machine learning
Springer Series in Information Sciences, Vol. 30. Springer, and information management.
Berlin, 1997.