
Applied Intelligence 21, 301–310, 2004


© 2004 Kluwer Academic Publishers. Manufactured in The United States.

A Fuzzy Integral Based Query Dispatching Model in Collaborative Case-Based Reasoning

SIMON C.K. SHIU, YAN LI AND FENG ZHANG


Department of Computing, The Hong Kong Polytechnic University, China
csckshiu@comp.polyu.edu.hk
csyli@comp.polyu.edu.hk
cszhangf@comp.polyu.edu.hk

Abstract. In a collaborative (distributed) Case-Based Reasoning (CBR) environment, an input query case can be
compared with the old cases that reside in many different CBR agents in the network. Obtaining the best
solution effectively and efficiently from this distributed CBR network depends on a carefully designed
query dispatching strategy. In this paper, we propose a fuzzy integral based approach to measure the competence
of different CBR agents in the network and suggest three query dispatching policies that can be used for
this task: the To-Top policy, the Strong-Strong policy and the Best-Committee policy. The experimental results show
that the proposed policies are comparatively better than the existing ones developed by Plaza and Ontañón.

Keywords: fuzzy integral, query dispatching, collaborative case-based reasoning

1. Introduction

Traditionally, intelligent systems have been developed in a standalone and isolated manner. However, the recent growth of the World Wide Web and of multi-agent systems has created a need to design intelligent systems in a distributed and collaborative manner. As a successful intelligent system technology, Case-Based Reasoning (CBR) likewise needs to extend its applications into a full-fledged distributed environment. Currently, there are two main approaches for selecting CBR agents: the first, proposed by Nagendra Prasad et al. [1], is based on the concept of task decomposition, and the second, proposed by Plaza and Arcos [2], is based on the random selection of agents. Task decomposition is only effective when the problem can be cleanly decomposed into a set of sub-problems, each of which can be solved by an individual CBR agent. However, if conflicts exist (e.g. the solutions of two sub-problems cannot be integrated), additional heuristics from the users may be needed. Sometimes this way of collaboration may be even worse than a single, isolated system [3].

Plaza and Arcos [2] proposed two modes of cooperation among CBR agents: Distributed Case-based Reasoning (DistCBR) and Collective Case-based Reasoning (ColCBR). DistCBR means that a problem can be dispatched to any agent for solving, regardless of who generated the problem. ColCBR means that the owner of the problem tries to collect the useful cases and methods from other agents and then decides how to solve the problem. Three collaboration policies (i.e. Committee policy, Peer-Counsel policy and Bounded-Counsel policy) were developed for the DistCBR framework by Plaza and Ontañón [4]. However, these policies are all based on a random selection of the agents; therefore, in most cases, it is very time consuming to get the best solution.

Instead of selecting agents at random, we propose policies based on the concept of "competence", which can be defined as the range of problems that a particular agent (or case) can solve. Since these policies are based on "competence", they are comparatively better than the previous ones. The structure of this paper is as follows: Section 2 reviews two methods of calculating case competence. The
first one is proposed by Smyth and Keane [5], while the second one is proposed by the authors. The computation of the CBR agents' competence and the ranking policy are given in Section 3. In Section 4, three policies for dispatching a new query case are proposed. Each of these policies is based on a different assumption of how to obtain the best solution. An experimental comparison of our approaches to the existing ones is provided in Section 5. Finally, the conclusion is given in Section 6.

2. Case-Base Competence Modeling

The concept of case-base competence was first proposed by Smyth and Keane [5] (referred to as the S-K model in this paper), and it has subsequently been developed further into a whole range of concepts which are useful for measuring the problem solving ability of case-bases. In the S-K model, many statistical properties of a case base, such as the size and density of cases, are used as input parameters for measuring competence. However, this model assumes that there is no overlap among different groups of cases (e.g. feature interaction [6] is a common cause of overlaps). Therefore, if the group competence is simply taken as the sum of the individual case competences, and each individual case competence is computed independently without considering the overlapping effects, the resulting group competence may be over- or underestimated. This feature overlap problem has been tackled by Shiu et al. [7] using the fuzzy integral (referred to as the S-L model in this paper). These two models will be used as the basis to develop our query case dispatching strategies. Details are explained in the following sections.

2.1. The S-K Model

In this model, two fundamental concepts are used: coverage and reachability. Coverage of a case refers to the set of problems that the case can solve. Reachability of a case is the set of cases that can be used to provide solutions to a case. Furthermore, the competence of a group of cases G (i.e. the group coverage of G) depends on the number of cases in the group and their density (see Eq. (1)).

$$\mathrm{GroupCoverage}(G) = 1 + |G| \cdot (1 - \mathrm{GroupDensity}(G)) \qquad (1)$$

where GroupDensity is defined as the average CaseDensity of the group (see Eq. (2)), and |G| means the size of the competence group G, i.e. the number of cases in the group G.

$$\mathrm{GroupDensity}(G) = \sum_{c \in G} \mathrm{CaseDensity}(c, G) \,/\, |G| \qquad (2)$$

where CaseDensity is defined by Eq. (3)

$$\mathrm{CaseDensity}(c, G) = \sum_{c^* \in G - \{c\}} \mathrm{Similarity}(c, c^*) \,/\, (|G| - 1). \qquad (3)$$

Different ways of computing Similarity can be used depending on the problem at hand. For a given case-base with competence groups $G = \{G_1, G_2, \ldots, G_n\}$, $n = 1, 2, \ldots$, the total coverage is defined by Eq. (4).

$$\mathrm{Coverage}(G) = \sum_{G_i \in G} \mathrm{GroupCoverage}(G_i) \qquad (4)$$

2.2. Problem of the S-K Model

Suppose that in some problem domain, we have a group of non-uniformly distributed cases as depicted in Fig. 1. It can be shown that the S-K model is not a good predictor of the group competence because this model assumes that the cases are distributed uniformly, such as those shown in Fig. 2. Assume that Size(G) = Size(G'), i.e. |G| = |G'|, and GroupDensity(G) = GroupDensity(G'). Then, from Eq. (1), we have GroupCoverage(G) = GroupCoverage(G'), where G is an arbitrary competence group in a case-base, so similar results can be obtained between two case-bases, in which one has its cases non-uniformly distributed and the other uniformly distributed.

However, from Figs. 1 and 2, it is obvious that the coverage of the two competence groups cannot possibly be the same. There are coverage holes in Fig. 1 compared with Fig. 2. If we calculate the competence of the groups in Fig. 1 using the S-K model, then the actual competence will be overestimated. This is because the S-K model only considers the group density, but ignores the distribution of the cases. There are possibly many ways in which cases can be distributed; therefore, a more accurate way of modeling case-base competence is required.

An example based on Figs. 1 and 2 further illustrates this point.
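The S-K quantities of Eqs. (1)-(4) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names and the toy similarity matrix are hypothetical, a competence group is represented simply by its pairwise similarity matrix `sim` (with `sim[a][b]` standing for Similarity(a, b)), and a one-case group is assumed to have density 0.

```python
from typing import List

Matrix = List[List[float]]


def case_density(sim: Matrix, c: int) -> float:
    """Eq. (3): mean similarity of case c to every other case in the group."""
    n = len(sim)
    if n < 2:  # assumption: a one-case group has density 0
        return 0.0
    return sum(sim[c][o] for o in range(n) if o != c) / (n - 1)


def group_density(sim: Matrix) -> float:
    """Eq. (2): average CaseDensity over all cases of the group."""
    n = len(sim)
    return sum(case_density(sim, c) for c in range(n)) / n


def group_coverage(sim: Matrix) -> float:
    """Eq. (1): 1 + |G| * (1 - GroupDensity(G))."""
    return 1 + len(sim) * (1 - group_density(sim))


def total_coverage(groups: List[Matrix]) -> float:
    """Eq. (4): sum of the group coverages of all competence groups."""
    return sum(group_coverage(g) for g in groups)


# Hypothetical toy group: three cases whose pairwise similarity is 0.8.
toy = [[1.0, 0.8, 0.8],
       [0.8, 1.0, 0.8],
       [0.8, 0.8, 1.0]]
coverage = group_coverage(toy)  # 1 + 3 * (1 - 0.8)
```

Note that only the size and the average density enter Eq. (1), so two groups with the same size and density receive the same coverage however their cases are spread out, which is exactly the weakness discussed in Section 2.2.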
Figures 1 and 2. (1) Non-uniformly distributed case base. (2) Uniformly distributed case base.

Suppose that the densities of the groups $G_1$ and $G_2$ in Fig. 1 are both 0.8 and that they are assumed to have uniform distribution (i.e. the case density of each case, in either $G_1$ or $G_2$, is 0.8). The density of the whole group is 0.2, and the coverage of $c^*$ is three cases. The overlap coverage of $c^*$ and $G_1 \cup G_2$ is two, and $c^*$ is a pivotal case. It is rather straightforward to get the coverage of the whole competence group G as follows (note that GroupCoverage1 means the computed group coverage in Fig. 1 while GroupCoverage2 means the computed group coverage in Fig. 2):

$$
\begin{aligned}
\mathrm{GroupCoverage1}(G) &= \mathrm{GroupCoverage}(G_1) + \mathrm{GroupCoverage}(G_2) + [\mathrm{coverage}(c^*) - \mathrm{coverage}(c^*) \cap \mathrm{Coverage}(G_1 \cup G_2)] \\
&= 1 + [|G_1|(1 - \mathrm{GroupDensity}(G_1))] + 1 + [|G_2|(1 - \mathrm{GroupDensity}(G_2))] + 1 \\
&= 1 + 5(1 - 0.8) + 1 + 7(1 - 0.8) + 1 \\
&= 5.4
\end{aligned}
$$

However, according to the S-K competence model (Eq. (1)), the result becomes:

$$
\begin{aligned}
\mathrm{GroupCoverage2}(G) &= 1 + [|G| \cdot (1 - \mathrm{GroupDensity}(G))] \\
&= 1 + 12(1 - 0.2) \\
&= 10.6
\end{aligned}
$$

The above two results are very different, and the problem is caused by the inaccurate assumption that group G is uniformly distributed, which is not true. This also results in a computing error that cannot be ignored.

For a case distribution such as that of Fig. 1, the difference between GroupCoverage1 and GroupCoverage2 can be further investigated as follows:

$$
\begin{aligned}
\mathrm{GroupCoverage2} - \mathrm{GroupCoverage1} &= \{1 + [|G| \cdot (1 - \mathrm{GroupDensity}(G))]\} \\
&\quad - \{1 + [|G_1| \cdot (1 - \mathrm{GroupDensity}(G_1))] + 1 + [|G_2| \cdot (1 - \mathrm{GroupDensity}(G_2))] + 1\} \\
&= |G| \cdot [\mathrm{GroupDensity}(G_1) - \mathrm{GroupDensity}(G)] - 1 - \mathrm{GroupDensity}(G_1) \qquad (5) \\
&\ge |G| \cdot [\mathrm{GroupDensity}(G_1) - \mathrm{GroupDensity}(G)] - 2 \qquad (6)
\end{aligned}
$$

Given Eqs. (5) and (6), if the number of cases increases and tends to ∞ (in the extreme case), the value [GroupDensity($G_1$) − GroupDensity(G)] also increases at the same time. As a result, the computing error also tends to ∞. Therefore, the precision error of Smyth and Keane's competence model can be arbitrarily large in some case-bases.

2.3. Fuzzy Integral Competence Model

In both Figs. 1 and 3, we can easily see that the cases $c^*$ and $c^{**}$ have an important role to play because they affect the overall competence distribution in the group. Therefore, it is important to detect such cases (which are called weak-links in this paper) for possible identification of smaller competence groups which have more evenly distributed cases. These smaller
groups' competence can then be computed using the S-K model. A new way of computing the group competence (referred to as the S-L model in this paper) is then developed to tackle the above problem. Details are as follows.

Figure 3. An example of non-uniform distribution of cases in a case-base.

In general, competence groups such as $G_1$ and $G_2$ in Fig. 1 do not necessarily have a strictly uniform distribution, and the weak link case $c^*$ is not necessarily a pivotal case. To deal with this situation without influencing the results in Eq. (5), GroupDensity($G_1$) can be replaced by the average group density of groups $G_1$ and $G_2$, which is denoted by GroupDensity($G_i$), $i \in \{1, 2\}$. So [GroupDensity($G_1$) − GroupDensity(G)] is equal to [GroupDensity($G_i$) − GroupDensity(G)], which is denoted by ΔGroupDensity. A concept called quasi-uniform distribution can be used to describe a distribution which is near to the uniform distribution. As mentioned, the other assumption, that $c^*$ is a pivotal case in the example, is not necessarily true in many cases. To address this problem, we consider the individual competence of $c^*$ as its relative coverage, which is defined as follows (Eq. (7)):

$$\mathrm{RelativeCoverage}(c) = \sum_{c' \in \mathrm{CoverageSet}(c)} \frac{1}{|\mathrm{ReachabilitySet}(c')|} \qquad (7)$$

Hence, according to Eqs. (5) and (6), we have

$$
\begin{aligned}
\mathrm{CompetenceError}(c^*) &= |G| \, \Delta\mathrm{GroupDensity} - \mathrm{GroupDensity}(G_i) - \mathrm{RelativeCoverage}(c^*) \\
&\ge |G| \, \Delta\mathrm{GroupDensity} - (\mathrm{RelativeCoverage}(c^*) + 1) \qquad (8)
\end{aligned}
$$

According to inequality (8), since RelativeCoverage($c^*$) is small, we can see that it is ΔGroupDensity which leads to the competence error.

2.3.1. Detecting Weak Links. In order to tackle the problem of non-uniformly distributed cases, we should first identify the weak links in each competence group. The definition of a weak link, as well as several related concepts, are directly related to the competence of the group in question, and are defined as follows:

Definition. Let $G = \{c_1, c_2, \ldots, c_n\}$ be a given competence group in a case-base C. A case $c^* \in G$ is called a weak link if

$$\mathrm{CompetenceError}(c^*) = |G| \, \Delta\mathrm{GroupDensity} - \mathrm{GroupDensity}(G_i) - \mathrm{RelativeCoverage}(c^*) \ge \alpha$$

where α is a parameter which depends on the problem at hand. If $\exists c^* \in G$ such that $c^*$ is a weak link, then the competence group G is called a non-uniform distributed competence group. Otherwise, if CompetenceError(c) < α for $\forall c \in G$, then G is called a quasi-uniform distributed competence group.

It is obvious that $\bigcup G_i = G - \{c^*\}$. With this definition, we propose a recursive method to detect the weak links in a given competence group G, which is described as follows:

Weak-link Detection Algorithm:
1. W-SET ← { }, G-SET ← { }, i = |G|;
2. If (i ≠ 0) {Consider each given competence group G in the S-K competence model, compute CompetenceError(c), ∀c ∈ G; i = i − 1}
3. If there is no weak link, add G to G-SET, end;
4. If there is a weak link $c^*$, identify the competence groups $G_1, \ldots, G_n$ (n ≥ 1) in $G - \{c^*\}$ using the S-K competence model, add $c^*$ to the set of weak-links W-SET.
5. For (1 ≤ i ≤ n) {G ← $G_i$; repeat Steps 1 to 4}.

Thus, we can obtain the set of weak links W-SET in a given competence group G and the set of new competence groups G-SET. In this way, a given competence group G is repartitioned (i.e. divided into smaller groups) by identifying the weak links in it. The groups in G-SET are called new competence groups.
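The recursive repartitioning described above can be sketched as follows. This is a hedged illustration, not the authors' code: `competence_error` and `regroup` are hypothetical callables standing in for CompetenceError and for the S-K regrouping of G − {c*}, cases are represented by integers, the recursion is unrolled into an explicit work list, and picking the candidate with the largest error as c* is an assumption (the text only says "a weak link").

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

Group = FrozenSet[int]


def detect_weak_links(
    group: Iterable[int],
    competence_error: Callable[[int, Group], float],
    regroup: Callable[[Group], List[Group]],
    alpha: float,
) -> Tuple[Set[int], List[Group]]:
    """Return (W-SET, G-SET) for one competence group."""
    w_set: Set[int] = set()
    g_set: List[Group] = []
    pending: List[Group] = [frozenset(group)]
    while pending:
        g = pending.pop()
        errors = {c: competence_error(c, g) for c in g}
        weak = [c for c, e in errors.items() if e >= alpha]
        if not weak:
            g_set.append(g)  # quasi-uniform group: keep it as is
            continue
        c_star = max(weak, key=lambda c: errors[c])  # assumed tie-breaking rule
        w_set.add(c_star)
        # Re-identify the groups in G - {c*} and examine each of them again.
        pending.extend(sub for sub in regroup(g - {c_star}) if sub)
    return w_set, g_set


# Hypothetical one-dimensional toy data: the "error" of a case is its distance
# to the nearest other case, and cases more than 3 apart fall into different groups.
def toy_error(c: int, g: Group) -> float:
    gaps = [abs(c - o) for o in g if o != c]
    return float(min(gaps)) if gaps else 0.0


def toy_regroup(cases: Group) -> List[Group]:
    groups: List[Group] = []
    current: List[int] = []
    for c in sorted(cases):
        if current and c - current[-1] > 3:
            groups.append(frozenset(current))
            current = []
        current.append(c)
    if current:
        groups.append(frozenset(current))
    return groups


W_SET, G_SET = detect_weak_links({0, 1, 2, 5, 8, 9, 10}, toy_error, toy_regroup, alpha=2.0)
```

In the toy data, case 5 bridges two dense clusters, so it ends up in W-SET and the two clusters become the new competence groups. The loop terminates because every split removes at least one case from the group being examined.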
2.3.2. Computing the Overall Coverage Using Fuzzy Integral. After detecting the weak links in a competence group G, several new competence groups $G_1, \ldots, G_n$ (n ≥ 1) are produced. According to the definition of a weak link, each newly produced group is said to be quasi-uniformly distributed. The next task is to compute the overall coverage of G. In the example described in Fig. 1, we simply sum the competence of $G_i$ (1 ≤ i ≤ n) and the relative coverage of $c^*$, but this method is only suitable in simple situations. There are more complicated scenarios, such as the one given in Fig. 3. It is difficult to clearly identify the contribution of each weak link (i.e. it is difficult to tell whether $c^*$ has more influence on the coverage of G than $c^{**}$ or vice versa). To describe this complex relationship, we apply a powerful tool called the fuzzy integral (or non-linear integral) with respect to a fuzzy measure (a non-additive set function). Details are described in the next section.

2.3.3. Non-Additive Set Function. Let X be a nonempty set and $\mathcal{P}(X)$ be the power set of X. We use the symbol µ to denote a non-negative set function defined on $\mathcal{P}(X)$ with the property $\mu(\emptyset) = 0$. If µ(X) = 1, µ is said to be regular. When X is finite, µ is usually called a fuzzy measure if it satisfies monotonicity, i.e., $A \subseteq B \Rightarrow \mu(A) \le \mu(B)$ for $A, B \in \mathcal{P}(X)$.

For a non-negative set function µ, there are some associated concepts. µ is said to be additive if $\mu(A \cup B) = \mu(A) + \mu(B)$ for $A, B \in \mathcal{P}(X)$; to be sub-additive if $\mu(A \cup B) \le \mu(A) + \mu(B)$ for $A, B \in \mathcal{P}(X)$; and to be super-additive if $\mu(A \cup B) \ge \mu(A) + \mu(B)$ for $A, B \in \mathcal{P}(X)$.

Let $X = \{G_1, \ldots, G_n\}$ be the space of the new competence groups, and let A and B be two elements of the power set of X. Here, A and B can be a single new group $G_i$ or the union of several groups. If we consider µ(A) as the importance of subset A, then the additivity of the set function means that the joint importance of the groups is just the sum of their respective importance, which implies that there is no interaction among the competence groups. However, this is not true in the problem considered. In fact, most measures of importance are non-additive.

Sub-additivity and super-additivity are two special types of non-additivity. Super-additivity means that the joint importance of the two sets is greater than or equal to the sum of their respective importance, which indicates that the two sets are enhancing each other. In contrast, sub-additivity means that the joint importance of two sets is less than or equal to the sum of their respective importance, which indicates that the two sets are resisting each other.

In our problem, consider $X = \{G_1, \ldots, G_n\}$ as the factor space. There are weak links among the competence groups, linking them into one group G. Here, weak links such as $c^*$ and $c^{**}$ are enhancing the overall coverage of G. Hence, the importance measure µ defined on the power set $\mathcal{P}(X)$ is a super-additive measure. So here we have

$$\mu(A \cup B) \ge \mu(A) + \mu(B) \quad \text{for } A, B \in \mathcal{P}(X).$$

For example, in Fig. 3, $c^*$ enhances the importance of $G_1 \cup G_2$, $c^{**}$ enhances the contribution of $G_2 \cup G_3$, and there is no case to enhance or reduce the contribution of $G_1 \cup G_3$, so we have

$$
\begin{aligned}
\mu(G_1 \cup G_2) &\ge \mu(G_1) + \mu(G_2) \\
\mu(G_2 \cup G_3) &\ge \mu(G_2) + \mu(G_3) \\
\mu(G_1 \cup G_3) &= \mu(G_1) + \mu(G_3)
\end{aligned}
$$

When using the fuzzy integral to compute the overall coverage of the original competence group G, we should determine the importance measure µ in advance. However, for a factor space including n factors, there are $(2^n - 1)$ parameters to decide. In the situation of Fig. 3, seven values of the importance measure should be determined, which are:

$$\mu(G_1),\ \mu(G_2),\ \mu(G_3),\ \mu(G_1 \cup G_2),\ \mu(G_1 \cup G_3),\ \mu(G_2 \cup G_3),\ \mu(G_1 \cup G_2 \cup G_3).$$

To reduce the load, we apply a kind of fuzzy measure called the λ-fuzzy measure, which takes the following form:

$$\mu(A \cup B) = \mu(A) + \mu(B) + \lambda \cdot \mu(A) \cdot \mu(B), \qquad \lambda \in (-1, \infty)$$

If λ ≤ 0, µ is a sub-additive measure; if λ ≥ 0, µ is a super-additive measure; µ is additive if and only if λ = 0. So here we have λ ≥ 0. Applying the λ-fuzzy measure to determine the importance measure µ, we only need to determine the importance of each of the n single factors and λ.
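The saving brought by the λ-fuzzy measure can be made concrete with a short sketch. It is an illustration under assumptions rather than the authors' procedure: the function names are hypothetical, the groups are treated as pairwise disjoint so that the λ-rule can be applied one group at a time, and a single λ is used for every pair of groups (a simplification, since Fig. 3 suggests different interactions between different pairs).

```python
from functools import reduce
from itertools import combinations
from typing import Dict, FrozenSet, Sequence


def lambda_measure(singletons: Sequence[float], lam: float) -> float:
    """Measure of a union of disjoint sets, built by repeatedly applying
    mu(A u B) = mu(A) + mu(B) + lam * mu(A) * mu(B)."""
    return reduce(lambda a, b: a + b + lam * a * b, singletons, 0.0)


def all_measures(names: Sequence[str], single: float, lam: float) -> Dict[FrozenSet[str], float]:
    """Tabulate mu for every non-empty subset (2^n - 1 values) from just
    the singleton importance and lam."""
    table: Dict[FrozenSet[str], float] = {}
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            table[frozenset(subset)] = lambda_measure([single] * k, lam)
    return table


# Three groups as in Fig. 3, each with importance 1 and a hypothetical lam = 0.5.
table = all_measures(["G1", "G2", "G3"], single=1.0, lam=0.5)
pair = table[frozenset({"G1", "G2"})]        # 1 + 1 + 0.5 * 1 * 1
whole = table[frozenset({"G1", "G2", "G3"})]  # 2.5 + 1 + 0.5 * 2.5 * 1
```

With λ = 0 the same function reduces to a plain sum, which is the additive case; any λ > 0 makes the measure of a union exceed the sum of its parts.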
2.3.4. Determining the λ-fuzzy Measure µ. In our model, we consider that the importance of each competence group is equal to 1, i.e. $\mu(G_i) = 1$ (1 ≤ i ≤ n). This is a reasonable assumption because each group makes a unique contribution to the overall coverage, that is, the status of each group is considered to be equal.

The next task is to determine the parameter λ, which is critical for determining µ. It is obvious that the properties of the weak links between two groups are important factors for determining λ. In our model, the coverage of a group refers to the area of the target problem space covered by the group. In this sense, the value of λ is closely related to the coverage of the weak links and the density of their coverage sets. Consider two arbitrary new groups $G_i$ and $G_j$, and let the W-SET between them be $C^* = \{c_1^*, \ldots, c_h^*\}$. We define Coverage($C^*$) and Density($C^*$) as follows:

$$\mathrm{Coverage}(C^*) = \sum_{i=1}^{h} \mathrm{RelativeCoverage}(c_i^*)$$

$$\mathrm{Density}(C^*) = \sum_{i=1}^{h} \mathrm{GroupDensity}(\mathrm{Cov}(c_i^*)) \,/\, h$$

where Cov($c_i^*$) is the coverage set of one of the weak links between $G_i$ and $G_j$.

The coverage contribution of $G_i \cup G_j$ must be directly proportional to Coverage($C^*$) and inversely proportional to Density($C^*$). With these assumptions, the parameter λ is given by the formula in Eq. (9).

$$\lambda = \mathrm{Coverage}(C^*) \cdot (1 - \mathrm{Density}(C^*)) \qquad (9)$$

The λ-fuzzy measure µ is then determined.

2.3.5. Using the Choquet Integral to Compute Competence. Due to the non-additivity of the set function µ, some new types of integrals (known as non-linear integrals) have to be used. A common type of nonlinear integral with respect to non-negative monotone set functions is the Choquet integral.

Let f be a nonnegative real-valued measurable function defined on X, and µ be a non-negative monotone set function as introduced in the above section. The Choquet integral of f on X with respect to µ, $(c)\int f \, d\mu$, is defined by the formula

$$(c)\int f \, d\mu = \int_0^{\infty} \mu(F_\alpha) \, d\alpha,$$

where $F_\alpha = \{x \mid f(x) \ge \alpha\}$ for any $\alpha \in [0, \infty)$. When X is finite, the Choquet integral can also be defined in the same way with respect to a non-negative set function that is not necessarily monotone.

In our model, $X = \{G_1, \ldots, G_n\}$ is finite, $f_i = \mathrm{GroupCoverage}(G_i)$, and the importance measure µ satisfies:

$$\mu(G_i) = 1 \quad (1 \le i \le n);$$
$$\mu(A \cup B) = \mu(A) + \mu(B) + \lambda \cdot \mu(A) \cdot \mu(B) \quad (\lambda \ge 0),$$

where λ is determined by Eq. (9).

The process of calculating the value of the Choquet integral is as follows:

(1) Rearranging $\{f_1, f_2, \ldots, f_n\}$ into a non-decreasing order such that

$$f_1^* \le f_2^* \le \cdots \le f_n^*$$

where $(f_1^*, f_2^*, \ldots, f_n^*)$ is a permutation of $(f_1, f_2, \ldots, f_n)$;

(2) Computing

$$(c)\int f \, d\mu = \sum_{j=1}^{n} [f_j^* - f_{j-1}^*] \cdot \mu(\{G_j^*, G_{j+1}^*, \ldots, G_n^*\})$$

where $f_0^* = 0$.

The value of the Choquet integral is considered as the coverage of the considered competence group. Each competence group in the S-K model is considered in the same way, and the sum of all group coverages is the overall coverage of the given case-base.

3. Competence of the CBR Agents

In the previous section, we have given the fuzzy integral approach for calculating the competence of each case group. In this section, we compute the competence of the CBR agents using the S-K and the S-L models respectively. The policies proposed here are based on the DistCBR mode (see Section 1), in which different CBR agents are able to communicate and cooperate with one another for recommending a solution. For instance, when agent $A_i$ is unable to solve a problem, it will delegate its authority of solving the problem to $A_j$.

For a given Collaborative Case-Based Reasoning (CCBR) system, there are n (n ≥ 1) case-based reasoners, denoted by $\mathrm{CBR}_1, \mathrm{CBR}_2, \ldots, \mathrm{CBR}_n$.
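Since the S-L competence of an agent's case-base rests on Eq. (9) and on the two-step Choquet procedure of Section 2.3.5, a minimal sketch of that computation may be helpful. It is written under assumptions and is not the authors' implementation: the function names are hypothetical, every group has importance 1 as in Section 2.3.4, a single λ is shared by all groups, and at least one weak link is assumed when Eq. (9) is evaluated.

```python
from functools import reduce
from typing import Sequence


def lambda_measure(k: int, lam: float, single: float = 1.0) -> float:
    """mu of a union of k groups, each of importance `single`, under the lambda-rule."""
    return reduce(lambda a, b: a + b + lam * a * b, [single] * k, 0.0)


def weak_link_lambda(relative_coverages: Sequence[float],
                     coverage_set_densities: Sequence[float]) -> float:
    """Eq. (9): lambda = Coverage(C*) * (1 - Density(C*)) for the weak links C*."""
    h = len(relative_coverages)  # assumption: h >= 1
    coverage = sum(relative_coverages)
    density = sum(coverage_set_densities) / h
    return coverage * (1 - density)


def choquet_integral(group_coverages: Sequence[float], lam: float) -> float:
    """Steps (1) and (2): sort the f_i, then sum (f_j* - f_{j-1}*) * mu({G_j*, ..., G_n*})."""
    f = sorted(group_coverages)
    n = len(f)
    total, previous = 0.0, 0.0
    for j, value in enumerate(f):
        # The n - j groups from position j onward all have coverage >= value.
        total += (value - previous) * lambda_measure(n - j, lam)
        previous = value
    return total


# Hypothetical numbers: one weak link with relative coverage 1.0 whose
# coverage set has density 0.5, linking two groups of coverage 3.0 and 2.0.
lam = weak_link_lambda([1.0], [0.5])             # 1.0 * (1 - 0.5)
s_l_coverage = choquet_integral([3.0, 2.0], lam)  # 2 * 2.5 + 1 * 1
additive = choquet_integral([3.0, 2.0], 0.0)      # plain sum when lam = 0
```

With λ = 0 the integral collapses to the sum of the group coverages, while a positive λ credits the groups for the extra coverage contributed through their weak links.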
These CBRs can be regarded as n agents $A_1, A_2, \ldots, A_n$ for problem solving in a distributed manner. The corresponding case-bases are $\mathrm{CB}_1, \mathrm{CB}_2, \ldots, \mathrm{CB}_n$, with competence groups $G_1, G_2, \ldots, G_n$ respectively.

Compute the Group Competence

In computing the competence, we define the similarity between two cases p and q by the following equation:

$$SM_{pq} = 1 \Big/ \left(1 + \sqrt{\sum_{j=1}^{n} (x_{pj} - x_{qj})^2}\right),$$

where $x_{ij}$ corresponds to the value of feature $F_j$ (1 ≤ j ≤ n) of case i (i = 1, ..., n).

Step 1. Detecting the weak-links in the above competence group $G_i$ (i = 1, 2, ..., n):
If $\exists c^* \in G_i$ s.t. CompetenceError($c^*$) ≥ α, then the competence group $G_i$ is a non-uniform distributed competence group. Otherwise, $G_i$ is a quasi-uniform distributed competence group.

Step 2. Partition the CBR agents according to their competence:

1. CCBR1 ← ∅, CCBR2 ← ∅, i = |G|; where CCBR1 consists of those agents who have no feature interactions (or no overlaps among competence groups), while CCBR2 consists of those agents who have feature interactions (or overlaps among competence groups).
2. If (i ≠ 0), compute CompetenceError(c), ∀c ∈ G, i = i − 1;
3. If there is no weak-link in G, then G is called a quasi-uniform distributed competence group and is added to CCBR1; otherwise G is called a non-uniform distributed competence group and is added to CCBR2; end;
4. For (1 ≤ j ≤ n), G ← $G_j$ (j = 1, 2, ..., n), repeat the above Steps 1 to 3.

Step 3. Compute the competence of each CBR agent in CCBR1 and CCBR2:

1. Compute the competence of each CBR agent in the CCBR1 group according to the S-K model. Since the cases are distributed uniformly in each CBR agent, we can get the competence using the S-K competence model [5] directly. The agents are then ranked in descending order according to their competence, and the competences are denoted as $C_1^1, C_2^1, \ldots, C_{m_1}^1$.
2. Compute the competence of each CBR agent in the CCBR2 group according to the S-L model, and rank them as $C_1^2, C_2^2, \ldots, C_{m_2}^2$.

Step 4. Rank the CBR agents according to the respective competence of each CBR agent in CCBR1 and CCBR2:
According to the competence, rank the CBR agents in the CCBR1 and CCBR2 systems in descending order as $\{A_1^1, A_2^1, \ldots, A_{m_1}^1\}$, $\{A_1^2, A_2^2, \ldots, A_{m_2}^2\}$.

4. Query Dispatching Policies

Three query dispatching policies based on the competence of the CBR agents are proposed here: the To-Top policy, the Strong-Strong policy and the Best-Committee policy. These are:

4.1. To-Top Policy

The main idea of this policy is to choose the CBR agent which has the maximal competence in the corresponding CCBR system, i.e. $A_1^1$ or $A_1^2$ in the CCBR1 and CCBR2 systems respectively. CCBR1 is chosen as the problem-solving system if there are no feature interactions; otherwise CCBR2 is chosen. For example, in a travel-planning problem which will be described in Section 5, the hotels are classified by the number of stars; therefore, when the user specifies the type of accommodation (e.g. the number of stars), this will limit the choices of the available hotels. In this case, the features "accommodation" and "hotel" are interacting. The dispatching procedure is as follows: if agent $A_i$ receives an input query, it will try to solve it. When the solution is satisfactory (i.e. within a user defined threshold of solution accuracy and efficiency), it becomes the answer to the input query. Otherwise, the agent will dispatch the problem to $A_1^1$ or $A_1^2$ for solving. If $A_i$ is one of the agents in CCBR1, then it dispatches the problem to the agent $A_1^1$; otherwise the problem will go to $A_1^2$ in CCBR2.

4.2. Strong-Strong Policy

In this policy, we assume that we cannot determine whether the features interact or not. Consider the travel-planning problem again: we are not sure whether there are feature interactions or not between the features "season" and "hotel" or between
the features "holiday duration" and "season". Thus, it is better to ask more than one agent to suggest solutions. We will choose the most competent agents (i.e. one from each collaborative CBR system). That is, if the agent $A_i^j$ (i ≠ 1, j = 1, 2) receives the problem and cannot solve it satisfactorily, it will ask agent $A_1^1$ in CCBR1 and agent $A_1^2$ in CCBR2 to solve it in parallel. One of these suggested solutions will be used, based on an earlier assessment of these two agents' ability. (Note that these two agents belong to the two CCBR systems, therefore the selection has already considered the feature interaction property.)

4.3. Best-Committee Policy

If time is not critical, and getting a better solution is the main concern, then the user can ask more agents for suggested solutions. In general, the more agents are involved, the more accurate the answer will be. That is, if the agent $A_i^j$ (i ≠ 1, j = 1, 2) receives the problem, it could follow the To-Top and Strong-Strong policies first for solving the problem. However, if the solution is not satisfactory, it can ask the agents $A_1^j, A_2^j, \ldots, A_{i-1}^j$ (i.e. those agents that are better in competence) to solve the problem. Each agent will offer a solution to the problem, and the final solution is chosen according to the user's preference, such as preferred accuracy. This policy provides the user with a flexible choice: when the best solution is wanted, the user can ask all the agents for suggestions. In that case, this policy is the same as the "Committee" policy proposed by Plaza and Ontañón.

5. Experimental Evaluation

In order to evaluate our approach, we used a set of test cases that are available from the AI-CBR web-site (i.e. www.ai-cbr.org). This travel case-base contains 1470 cases, and we randomly selected 1100 cases for our experiment. Each test case describes a holiday-package tour from Europe/North Africa, and consists of 9 features. In dividing the cases into different groups for measuring competence, we use a random selection approach. The reason is that in real life, there may be missing values in the cases, therefore a feature based grouping of cases may not be possible. In the experiment, 800 cases are chosen randomly as learning data, and 300 cases are chosen as testing data. The learning data are further divided into 4, 5, 6, 7 and 8 groups (i.e. each group represents one CBR agent, therefore if 4 agents are used, each of them consists of 200 cases, etc.). The feature "price" is chosen as the solution feature. The testing is based on the evaluation of the solution accuracy and the mean cost of solving (i.e. time consumption).

The objective of the experiment is to determine the "price" of each travel plan using our proposed policies. A comparison of our approach to some existing ones [4] is also carried out. The mean relative error (i.e. the difference between the actual result and the predicted result, divided by the actual result) is used to compute the accuracy. For example, in our experiment, if four agents are used to predict the "price" of a particular testing case (such as Case number 987), our three policies give the following results respectively: $4,708.25, $3,536.72, and $4,561.35. The accuracies are 84.94%, 86.43% and 88.26% respectively (see Fig. 4). The mean cost is measured relative to the CPU time of the isolated agent: assuming the mean time cost of the isolated agent is ONE unit, the mean time costs of the collaborative policies are given in Fig. 5. We have conducted five testing runs, and each run has a different number of agents. These agents are formed by randomly re-organizing the 800 learning cases.

The result shows that our policies use less time and can still achieve the same accuracy as the other existing ones. In detail, Fig. 4 shows that all six case dispatching policies are better than the isolated agent. The Strong-Strong policy is better than the To-Top policy, and the Best-Committee policy is the best one.

Figure 4. The average accuracy of collaborative policies.
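To make the evaluated quantities concrete, the following sketch shows which agents each policy of Section 4 would consult and how an accuracy figure can be derived from the mean relative error. It is a hedged illustration, not the experimental code: the `Agent` class, the function names and all numbers are hypothetical, and taking accuracy as one minus the mean relative error is an assumption about how the reported percentages were obtained.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Agent:
    name: str
    competence: float  # S-K competence in CCBR1, S-L competence in CCBR2


def ranked(system: Sequence[Agent]) -> List[Agent]:
    """Agents of one CCBR system in descending order of competence."""
    return sorted(system, key=lambda a: a.competence, reverse=True)


def to_top(system: Sequence[Agent]) -> Agent:
    """To-Top: dispatch to the most competent agent of the sender's own system."""
    return ranked(system)[0]


def strong_strong(ccbr1: Sequence[Agent], ccbr2: Sequence[Agent]) -> List[Agent]:
    """Strong-Strong: ask the top agent of each system in parallel."""
    return [to_top(ccbr1), to_top(ccbr2)]


def best_committee(sender: Agent, system: Sequence[Agent]) -> List[Agent]:
    """Best-Committee: ask every agent ranked above the sender in its system."""
    order = ranked(system)
    return order[:order.index(sender)]


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |actual - predicted| / actual over the test cases."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed definition: accuracy = 1 - mean relative error."""
    return 1.0 - mean_relative_error(actual, predicted)


# Hypothetical agents and prices, for illustration only.
ccbr1 = [Agent("a", 0.5), Agent("b", 0.9), Agent("c", 0.7)]
ccbr2 = [Agent("d", 0.6), Agent("e", 0.8)]
top = to_top(ccbr1)
pair = strong_strong(ccbr1, ccbr2)
committee = best_committee(ccbr1[0], ccbr1)
acc = accuracy([100.0, 200.0], [90.0, 220.0])
```

The sketch only selects agents; the step in which the receiving agent first tries to solve the query itself and checks the result against the user-defined threshold is left out.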
The Strong-Strong policy has similar accuracy to the Bounded-Counsel policy and the Peer-Counsel policy. Here we did not include the Committee policy in the experiment because it can be viewed as a special case of the Best-Committee policy.

Figure 5. The mean cost of the collaborative policies.

Some limitations of our experiment include: (1) since the number of cases is fixed, an increase in the number of agents will decrease their competence correspondingly; as a result, the experimental accuracy will decrease with the increasing number of agents; and (2) a pre-processing of the agents' competence is required. On the other hand, the merits of our method are: (1) the computation time for finding a satisfactory solution is comparatively less than that of the current approaches, (2) our three policies provide alternative case dispatching methods to users according to their preference, and (3) our approach can be used to model feature interaction among cases.

6. Conclusions

In this paper, we have presented our approach to dispatching queries to different CBR agents. The policies are based on the concept of case and group competence. The problem of feature interaction among cases is also tackled using the fuzzy integral model. Our approach has been demonstrated empirically with some testing cases from the travel domain, and the result shows that our approach is better than the existing ones. Further research includes a more detailed investigation of case feature interactions, as well as their modeling in distributed CBR environments. Furthermore, a more theoretical analysis and evaluation of our approach will be carried out.

Acknowledgment

This project is supported by the Hong Kong Polytechnic University research grant H-ZJ90 and CERG research grant B-Q379.

References

1. M.V.N. Prasad, V. Lesser, and S. Lander, "Retrieval and reasoning in distributed case-bases," Journal of Visual Communication and Image Representation, Special Issue on Digital Libraries, vol. 7, no. 1, pp. 74–87, 1996.
2. E. Plaza, J.L. Arcos, and F.J. Martín, "Cooperative case-based reasoning," in Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments, edited by G. Weiss, 1997, pp. 180–201.
3. D.B. Leake and R. Sooriamurthi, "When two cases are better than one: Exploiting multiple case-bases," in Proc. of the 4th Int. Conf. on Case-Based Reasoning, ICCBR 2001, Vancouver, BC, Canada, 2001, pp. 321–335.
4. E. Plaza and S. Ontañón, "Ensemble Case-Based Reasoning: Collaboration policies for multi-agent cooperative CBR," in Proc. of the 4th Int. Conf. on Case-Based Reasoning, ICCBR 2001, Vancouver, BC, Canada, 2001, pp. 437–451.
5. B. Smyth and E. McKenna, "Modeling the competence of case-bases," in Proc. of the 4th European Workshop, EWCBR 1998, Dublin, Ireland, 1998, pp. 208–220.
6. X.Z. Wang and D.S. Yeung, "Using fuzzy integral to modeling case-based reasoning with feature interaction," IEEE Int. Conf. on Systems, Man, and Cybernetics, vol. 5, pp. 3660–3665, 2000.
7. S.C.K. Shiu, Y. Li, and X.Z. Wang, "Using fuzzy integral to model case-base competence," in Proc. of Soft Computing in Case-Based Reasoning Workshop, in conjunction with the 4th Int. Conf. in Case-Based Reasoning, ICCBR 2001, Vancouver, Canada, 2001, pp. 206–212. Or available from http://www.aic.nrl.navy.mil/papers/2001/AIC-01-003/ws5/ws5toc6.pdf, 2002.
8. Z.Y. Wang and G.J. Klir, Fuzzy Measure Theory, Plenum: New York, USA, 1992, pp. 42–43.

Simon C.K. Shiu is an Assistant Professor at the Department of Computing, Hong Kong Polytechnic University, Hong Kong. He received an M.Sc. degree in Computing Science from the University of Newcastle Upon Tyne, U.K. in 1985, an M.Sc. degree in Business Systems Analysis and Design from City University, London in 1986 and a Ph.D. degree in Computing in 1997 from Hong Kong Polytechnic University. He worked as a system analyst and project manager
between 1985 and 1990 in several business organizations in Hong Kong. His current research interests include Case-Based Reasoning, Machine Learning and Soft Computing. He has co-authored (with Professor Sankar K. Pal) a research monograph, Foundations of Soft Case-Based Reasoning, published by John Wiley in 2004. Dr. Shiu is a member of the British Computer Society and the IEEE.

Yan Li received the B.Sc. and M.Sc. degrees in Mathematics in 1998 and 2001 respectively from the College of Computer and Mathematics, Hebei University, P.R. China. Currently, she is a Ph.D. student in the Department of Computing, the Hong Kong Polytechnic University. Her interests include fuzzy mathematics, case-based reasoning, rough sets theory and information retrieval.

Feng Zhang received the B.Sc. and M.Sc. degrees in Mathematics in 1998 and 2001 respectively from the College of Computer and Mathematics, Hebei University, P.R. China. Currently, she is a Lecturer in the College of Computer and Mathematics, Hebei University. Her interests include fuzzy mathematics, neural networks, case-based reasoning and information retrieval.