

Short communication

On representation of fuzzy measures for learning Choquet and Sugeno integrals✩

Gleb Beliakov a,∗, Dmitriy Divakov b

a School of Information Technology, Deakin University, Geelong, 3220, Australia
b Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

∗ Corresponding author. E-mail address: gleb@deakin.edu.au (G. Beliakov).
✩ No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105134.

Article history: Received 7 May 2019; Received in revised form 9 September 2019; Accepted 15 October 2019; Available online xxxx.

Keywords: Aggregation functions; Fuzzy measures; Capacities; Choquet integral; Sugeno integral; Multicriteria decision making

Abstract: This paper examines the marginal contribution representation of fuzzy measures, used to construct fuzzy measures from empirical data through an optimization process. We show that the number of variables can be drastically reduced, and the constraints simplified, by using an alternative representation. This technique makes optimizing the fitting criteria more efficient numerically, and allows one to tackle learning problems with a higher number of correlated decision criteria.

© 2019 Elsevier B.V. All rights reserved.

1. Introduction

This paper addresses computational aspects of learning non-additive measures, often called fuzzy measures, or capacities [1–3]. Fuzzy measures are normalized monotone set functions which model decision making problems where the decision criteria interact (are complementary or redundant). Combined with nonlinear integrals, such as the Choquet and Sugeno integrals, fuzzy measures have been successfully applied to decision making problems in numerous studies, see, e.g., [2,4–8].

Fuzzy measures extend the traditional probability measures, enabling one to efficiently represent the variety of cases of interaction among multiple decision criteria [2,4,9]. An additive (probability) measure reflects independence among the decision criteria, whereas superadditivity (more generally, supermodularity) reflects complementarity, and subadditivity (submodularity) reflects redundancy.

For n decision criteria there are 2^n parameters that identify a fuzzy measure µ (and hence all the input interactions), including the fixed values µ(∅) and µ(N), N = {1, 2, . . . , n}. Even for moderate 10 < n < 20 this flexibility of representation becomes expensive: a) eliciting the values of a fuzzy measure and their interpretation is complicated, as domain experts are unable to specify and interpret all interactions between the decision criteria, and b) machine learning methods based on fitting fuzzy measures to empirical data struggle with the computational complexity of the optimization problem.

Simplifications of fuzzy measures include k-additivity, symmetry, k-maxitivity and k-interactivity [2,10,11]. The interactions between the criteria are restricted to subsets of cardinality up to k, and hence the number of parameters to specify is also reduced. In some cases the number of monotonicity constraints, essential in the fuzzy measure definition, is also reduced, but in other cases it is not, which only partially solves the issue of computational complexity.

The methods based on learning fuzzy measure values from data include least squares and least absolute deviation regression [4,5,12–17] and ordinal regression [18–21].

Suitable representations of fuzzy measures play an important role in reducing the complexity of the learning process. The standard representation µ, the Möbius representation M and the possibilistic Möbius representation MP are the three most frequently used representations. The two Möbius representations in particular are convenient for representing k-additive and k-maxitive fuzzy measures, as these measures have the parameters at subsets of higher cardinality fixed at 0. Unfortunately, this does not help reduce the number of monotonicity constraints.

Another representation is based on the marginal contributions [2], which are the differences µ(A ∪ {i}) − µ(A) and represent the discrete derivatives of the set function. The marginal contributions conveniently express the entropy of the fuzzy measure and help express its super- (sub-)modularity. Furthermore, the monotonicity of fuzzy measures translates into simple non-negativity of the marginal contributions. However, this also increases the number of parameters to n2^n and adds up to n! equality constraints (some of which may be redundant).

In this contribution we develop a different representation related to the marginal contributions, in which the number of parameters is kept to 2^n and the equality constraints are explicitly resolved. The monotonicity constraints are also expressed as simple non-negativity, which is very convenient when formulating the learning problem as a linear programming problem, in which all the variables are non-negative by default. We also combine this representation with the notions of k-maxitivity and k-interactivity, which further reduce the number of parameters.

The proposed representation helps express the problems of learning the Choquet and Sugeno integrals in a more efficient way, thus helping one to reduce the computing time and memory requirements. In turn, this benefits capacity-based multicriteria decision making models and decision support. The proposed representation is also suitable when some monotonicity constraints are removed in the context of pre-aggregation functions [22] based on fuzzy integrals, such as CC-integrals [8], but its detailed treatment merits a separate investigation.

The paper is structured as follows. Section 2 provides some preliminary definitions and notation. Section 3 presents the new representation of fuzzy measures, Section 4 specifies the learning problems in this representation, and Section 5 concludes.



2. Preliminaries

We fix the number of criteria in the multicriteria decision problem, n ⩾ 2. Aggregation of the values of the criteria is achieved by using fuzzy integrals. Fuzzy integrals are defined with respect to fuzzy measures (also called capacities). In the sequel we assume that the fuzzy measure values and all the inputs range over the unit interval. The subsequent definitions can be found in references [1–4].

Definition 1. Let N = {1, 2, . . . , n}. A fuzzy measure is a set function µ : 2^N → [0, 1] which is:
• monotonic, i.e., µ(A) ⩽ µ(B) if A ⊂ B,
• normalized, i.e., satisfies µ(∅) = 0 and µ(N) = 1.

Definition 2. The Choquet integral with respect to a fuzzy measure µ and an input vector x is given by

    C_µ(x) = Σ_{i=1}^{n} (x_(i) − x_(i−1)) h_i,    (1)

where x_(i) denotes the ordered inputs x_(1) ⩽ x_(2) ⩽ · · · ⩽ x_(n), h_i = µ({σ(i), σ(i+1), . . . , σ(n)}) corresponds to the chain of nested subsets induced by the ordering permutation σ : N → N of x, and x_(0) = 0 by convention.

Definition 3. The Sugeno integral with respect to a fuzzy measure µ and an input vector x ∈ [0, 1]^n is given by

    S_µ(x) = ⋁_{A⊆N} ((min_{i∈A} x_i) ∧ µ(A)),    (2)

where ∨ is the maximum and ∧ is the minimum.

The Sugeno integral can also be written as

    S_h(x) = max_{i=1,...,n} min{x_(i), h_i},    (3)

where h is defined as before.
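To make Definitions 2 and 3 concrete, the following minimal Python sketch (not part of the original paper; the function names and the dictionary-of-frozensets storage of µ are illustrative assumptions) evaluates (1) and (3) directly for a given fuzzy measure.

```python
# Minimal sketch: direct evaluation of (1) and (3) for a fuzzy measure
# given as a dict mapping frozensets of criteria indices to values in [0, 1].

def choquet(x, mu):
    """Choquet integral (1): sum of (x_(i) - x_(i-1)) * mu({sigma(i), ..., sigma(n)})."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])      # sigma: increasing order of x
    xs = [0.0] + [x[i] for i in order]                # x_(0) = 0 by convention
    total = 0.0
    for i in range(1, n + 1):
        h_i = mu[frozenset(order[i - 1:])]            # h_i = mu({sigma(i), ..., sigma(n)})
        total += (xs[i] - xs[i - 1]) * h_i
    return total

def sugeno(x, mu):
    """Sugeno integral (3): max_i min(x_(i), h_i) with the same chain of subsets."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    return max(min(x[order[i]], mu[frozenset(order[i:])]) for i in range(n))

# Example with n = 3 and a hypothetical fuzzy measure (0-based criteria indices).
mu = {frozenset(): 0.0, frozenset({0}): 0.2, frozenset({1}): 0.3, frozenset({2}): 0.1,
      frozenset({0, 1}): 0.6, frozenset({0, 2}): 0.4, frozenset({1, 2}): 0.5,
      frozenset({0, 1, 2}): 1.0}
print(choquet([0.5, 0.8, 0.2], mu), sugeno([0.5, 0.8, 0.2], mu))
```

The same chain of nested subsets h_i serves both integrals, which is why both functions sort the inputs first.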

Definition 4 ([23]). The entropy of a fuzzy measure µ on N is defined by

    E(µ) = Σ_{i=1}^{n} Σ_{A⊆N\{i}} ((n − |A| − 1)! |A|! / n!) g(µ(A ∪ {i}) − µ(A)),    (4)

where g(x) = −x ln x if x > 0, and g(x) = 0 if x = 0.

The symmetric additive fuzzy measure has the largest entropy value, ln(n). The Choquet integral with respect to such a measure coincides with the arithmetic mean function.

The monotonicity condition can be equivalently rewritten as

    ∆_i µ(B) = µ(B ∪ {i}) − µ(B) ⩾ 0,  ∀i ∈ N, B ⊆ N \ {i}.    (5)

That is, the marginal contribution of any criterion i to any subset B, denoted by ∆_i µ(B), is always nonnegative.

In the context of pseudo-Boolean functions [24], ∆_i µ(B) is called the ith derivative of µ at B. We shall use the non-negative variables ∆_i µ(B) as an alternative representation of capacities. There are O(n2^n) such variables, and they satisfy the following constraints [2]. Let π denote a permutation of N. Then µ is a fuzzy measure on N if and only if ∆_i µ(B) ⩾ 0 for all i ∈ N and B ⊆ N \ {i}, and

    Σ_{i=1}^{n} ∆_{π(i)} µ(h_π(i−1)) = 1  for all π,    (6)

where h_π(i) = {π(1), . . . , π(i)} and h_π(0) = ∅.
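As a sanity check of (5) and (6), the short sketch below (again an illustrative assumption, reusing the dictionary-of-frozensets format of the previous snippet) enumerates the marginal contributions of a measure and verifies the chain condition for every permutation.

```python
# Sketch: list all Delta_i mu(B) and check conditions (5) and (6).
from itertools import permutations

def marginal_contributions(mu, n):
    """All Delta_i mu(B) = mu(B | {i}) - mu(B) with i not in B; n * 2^(n-1) values in total."""
    deltas = {}
    for B in mu:
        for i in range(n):
            if i not in B:
                deltas[(i, B)] = mu[B | {i}] - mu[B]
    return deltas

def check(mu, n):
    deltas = marginal_contributions(mu, n)
    monotone = all(d >= 0 for d in deltas.values())                  # condition (5)
    chains_ok = True
    for pi in permutations(range(n)):                                # condition (6)
        chain = sum(deltas[(pi[i], frozenset(pi[:i]))] for i in range(n))
        chains_ok = chains_ok and abs(chain - 1.0) < 1e-12
    return monotone, chains_ok

# e.g. check(mu, 3) with the example measure from the previous snippet returns (True, True)
```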
3. Method

3.1. Capacity learning

When eliciting a fuzzy measure from empirical data, the dataset usually consists of observed input vectors x^(k) and observed outputs y^(k), k = 1, . . . , K. The aim of data fitting is to find the parameters of a function f such that f matches the unknown function g that generated the data. Fitting (or model learning) can be quantified using different objectives, such as minimizing the sum of (absolute, squared) differences between the predicted f(x^(k)) and observed y^(k) values. Once the objective is chosen and the constraints on the parameters are specified, an optimization problem is solved so as to deliver the optimal model f. In our case the parameters of the model are the values of the fuzzy measure, and the function f is a fuzzy integral.

The problem of fitting capacities to data has been considered using both the Choquet and Sugeno integrals, see for example [4,12,14–17,25]. In particular, for numerical data, learning capacities in the Choquet integral setting in the least absolute deviation sense has been performed by using linear programming techniques [5,13,26]. In the least squares approach, the methods of quadratic programming (QP) and heuristics [4,14,27] have been used. The limitation of those methods is that the solutions are computationally feasible only for small n ⩽ 10, because of an excessive number of monotonicity constraints.

However, when f is the Sugeno integral, the learning problem becomes much more complicated because of the non-convex multiextremal objective [18]. Furthermore, ordinal regression [21] implies a different objective: it is not required to fit the target outputs but rather to preserve their relative ordering.

Simplification strategies are employed to reduce the number of degrees of freedom and constraints of a capacity. The k-additive capacities [10] restrict marginal criteria interactions to subsets of cardinality at most k. However, except for the special case of 2-additive capacities, this technique does not reduce the number of monotonicity constraints. Another approach is to use k-interactivity [11], in which the capacity values for subsets of cardinality larger than k are fixed so as to maximize the capacity entropy.


3.2. Alternative representation

The difficulty with the marginal contributions representation is the large number of redundant variables, and hence of equality constraints, which makes the fitting algorithms less efficient. In this section we propose an alternative set of variables obtained by partially resolving the constraints. Recall that the marginal contributions ∆_i(A) are defined for every A ⊊ N and every i ∉ A by (5), and there are O(n2^n) such contributions.

Consider a subset A ⊆ N. The value µ(A) can be written as the partial sum

    µ(A) = Σ_{i=1}^{|A|} ∆_{σ(i)} µ(B),    (7)

where B = {σ(1), . . . , σ(i − 1)} and σ(1), . . . , σ(|A|) is any ordering of the elements of A. In particular, let σ be the natural increasing ordering of the elements of A. Then, for example,

    µ({1, 3, 6, 7}) = ∆_1 µ(∅) + ∆_3 µ({1}) + ∆_6 µ({1, 3}) + ∆_7 µ({1, 3, 6}).

Let us now take the subset of marginal contributions that satisfy the additional constraints

    ∆_i µ(A) such that i > e for every e ∈ A, and A ∪ {i} ≠ N.    (8)

For example, the above mentioned variables ∆_1 µ(∅), ∆_3 µ({1}), ∆_6 µ({1, 3}), ∆_7 µ({1, 3, 6}) satisfy (8), but ∆_1 µ({3}) and ∆_6 µ({1, 3, 7}) do not.

Theorem 1. The subset of marginal contributions that satisfy (8) forms a basis in the space of fuzzy measures.

Proof. The dimension of the space of fuzzy measures is 2^n − 2, accounting for the two fixed values µ(∅) = 0 and µ(N) = 1. Every value µ(A) can be uniquely expressed as (7) for the chosen natural ordering. Moreover, each sum in (7) contains one term ∆_i µ(A \ {i}), which is not present in any expression for µ(B), B ⊊ A. It follows that the ∆_i µ(A) which satisfy (8) are linearly independent. It also follows that there are 2^n − 2 marginal contributions that satisfy (8).

Therefore, with the addition of the variables µ(∅) = 0 and ∆_n µ({1, 2, . . . , n − 1}), the subset of marginal contributions which satisfy (8) is a representation of a fuzzy measure. There are 2^n such variables, compared to O(n2^n) marginal contributions.

It is now possible to formulate the fuzzy measure learning problem in terms of the non-negative variables ∆_i µ(A), which will be exemplified in the sequel.
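The change of variables in Theorem 1 is easy to implement. The sketch below is an illustration, not the authors' code; it reuses the dictionary-of-frozensets format from Section 2 and keys each basis variable ∆_i µ(A) by the union set A ∪ {i}. It converts a capacity to the basis of marginal contributions and back via the partial sums (7).

```python
from itertools import combinations

def to_basis(mu):
    """One variable Delta_max(A) mu(A minus max A) per nonempty subset A, keyed by A itself."""
    return {A: mu[A] - mu[A - {max(A)}] for A in mu if A}

def from_basis(delta, n):
    """Invert via the partial sums (7), taken along the natural increasing ordering of A."""
    mu = {frozenset(): 0.0}
    for size in range(1, n + 1):
        for A in map(frozenset, combinations(range(n), size)):
            mu[A] = mu[A - {max(A)}] + delta[A]
    return mu

# e.g., with the example measure mu from the first snippet:
#   delta = to_basis(mu); rebuilt = from_basis(delta, 3)
#   all(abs(rebuilt[A] - mu[A]) < 1e-12 for A in mu)   -> True (round trip recovers the capacity)
```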
4. Learning problem formulations

4.1. Choquet integral setting

Assume there is a data set D composed of J samples described by n attributes x_i^j, i = 1, . . . , n, j = 1, . . . , J, and the target values y_j, j = 1, . . . , J, organized into a table. The goal of learning a fuzzy measure is to determine a measure µ such that the Choquet (or Sugeno) integral C_µ(x) matches the target value y for all the samples j = 1, . . . , J. This is achieved by minimizing the norm of the residuals

    Minimize ∥r∥ = ∥C_µ(x) − y∥,

where we typically employ the Euclidean l2 or the l1 norm.

Traditionally, the variables of the minimization problem are the values µ(A) for all A ⊂ N, and the constraints are the monotonicity constraints, as well as other suitable constraints such as k-additivity or k-maxitivity.

We now formulate the fitting problem in terms of the specified set of marginal contributions. We shall use the l1 norm, as it is less sensitive to outliers in the data and allows us to translate the problem into a linear programming problem, which in the presence of many parameters and constraints offers computational advantages.

Using the standard expression for the residuals r_j = C_µ(x_j) − y_j and splitting them into positive and negative parts, r_j = r_j^+ − r_j^−, r_j^+, r_j^− ⩾ 0 and r_j^+ · r_j^− = 0, we get |r_j| = r_j^+ + r_j^−. Note that at least one of r_j^+, r_j^− is necessarily 0 at an optimal point, and therefore the constraint r_j^+ · r_j^− = 0 is implied but not needed explicitly.

Taking the following set of non-negative decision variables, r_j^+, r_j^−, j = 1, . . . , J, and µ(A), A ⊆ N, A ≠ ∅, we have the equivalent linear programming problem

    minimize  Σ_{j=1}^{J} (r_j^+ + r_j^−),    (9)
    s.t.  r_j^+ − r_j^− = Σ_{A⊆N} µ(A) g_A(x_j) − y_j,  j = 1, . . . , J,
          µ(A) ⩾ µ(A \ {i}), ∀i ∈ A, for all A ⊆ N,
          µ(N) = 1,

where g_A(x) = max(0, min_{i∈A} x_i − max_{i∈N\A} x_i), and the decision variables µ are organized into a vector using the relevant cardinality-based numbering system.

Now, since (7) is a linear transformation, problem (9) remains a linear programming problem in the new variables ∆_i µ(A), but some monotonicity constraints simplify to simple non-negativity, which is implicit in an LP, and hence can be omitted. Of course, this does not eliminate the other monotonicity constraints, relating to the non-negativity of those marginal contributions which are not part of the chosen basis. Similarly, when fitting the data in the least squares sense, we obtain a non-negative least squares problem, which is well studied [28].
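The following sketch sets up (9) literally, in the standard µ(A) variables, and solves it with scipy.optimize.linprog; the choice of SciPy, and all function and variable names, are assumptions made for illustration, since the paper does not prescribe a solver. Re-expressing the problem in the ∆_i µ(A) variables amounts to substituting the linear change of variables (7) into the same matrices.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def fit_choquet_lad(X, y):
    """Least absolute deviation fit of a capacity, problem (9), in the mu(A) variables."""
    J, n = X.shape
    subsets = [frozenset(c) for size in range(1, n + 1)
               for c in combinations(range(n), size)]           # all nonempty A, by cardinality
    idx = {A: p for p, A in enumerate(subsets)}
    M = len(subsets)

    def gA(A, x):
        rest = [x[i] for i in range(n) if i not in A]
        return max(0.0, min(x[i] for i in A) - (max(rest) if rest else 0.0))

    # decision variables: [mu(A) for A in subsets, r+_1..r+_J, r-_1..r-_J]
    c = np.concatenate([np.zeros(M), np.ones(2 * J)])

    A_eq = np.zeros((J + 1, M + 2 * J)); b_eq = np.zeros(J + 1)
    for j in range(J):
        for A in subsets:
            A_eq[j, idx[A]] = gA(A, X[j])       # sum_A mu(A) g_A(x_j) - r+_j + r-_j = y_j
        A_eq[j, M + j] = -1.0
        A_eq[j, M + J + j] = 1.0
        b_eq[j] = y[j]
    A_eq[J, idx[frozenset(range(n))]] = 1.0; b_eq[J] = 1.0      # mu(N) = 1

    rows = []                                                   # monotonicity: mu(A \ {i}) <= mu(A)
    for A in subsets:
        for i in A:
            row = np.zeros(M + 2 * J)
            row[idx[A]] = -1.0
            if len(A) > 1:
                row[idx[A - {i}]] = 1.0
            rows.append(row)
    A_ub, b_ub = np.vstack(rows), np.zeros(len(rows))

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * M + [(0, None)] * (2 * J), method="highs")
    return {A: res.x[idx[A]] for A in subsets}, res
```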


4.2. Sugeno integral setting

To fit the Sugeno integral, which is more suitable for modeling aggregation on an ordinal scale [29], such as linguistic labels, we apply ordinal regression, which translates into the problem [19]

    minimize  F(µ) = Σ_{j,k : y^(j) ⩽ y^(k)} 0 ∨ (S_µ(x^(j)) − S_µ(x^(k)))    (10)
    s.t.  µ is monotone and µ(∅) = 0, µ(N) = 1.

Substituting the expression for the Sugeno integral we get

    minimize  F(h) = Σ_{j,k : y^(j) ⩽ y^(k)} 0 ∨ (max_{i=1,...,n} min{x_i^(j), h_i^j} − max_{i=1,...,n} min{x_i^(k), h_i^k})    (11)
    s.t.  1 = h_1^j ⩾ h_2^j ⩾ · · · ⩾ h_n^j ⩾ 0,  j = 1, . . . , J,

where, for brevity of notation, we now assume that each x^(k) has been increasingly ordered in advance, and where the vectors h^j are defined for each x^(j) as in (1), according to the ordering of the components of each input. The objective here is a piecewise linear function, and as such it is Lipschitz and also a DC (difference of convex) function. The latter is quite important, as it allows us to apply the methods of DC optimization [30,31] to numerically solve a challenging global optimization problem [19].

Similarly to the case of the Choquet integral, by replacing the variables µ(A) with ∆_i µ(A) using the linear transformation (7), we do not lose the DC structure of the objective, but translate multiple linear monotonicity constraints into simple non-negativity. Furthermore, here we also eliminate the boundary constraint µ(N) = Σ_{i=1}^{n} ∆_{σ(i)} µ({σ(1), . . . , σ(i − 1)}) = 1, which ensures that all µ(A) ⩽ 1. The reason is that in the expression for the Sugeno integral, min(x_(i), µ(A)) = min(x_(i), min(1, µ(A))), therefore µ(A) need not be restricted. That makes the use of nonlinear optimization methods and DC programming, subject to fewer non-negativity constraints, simple and efficient.

The Sugeno integral fitting problem is multiextremal, therefore a global optimization approach is needed. Multistart local non-smooth and DC optimization were successfully applied in [18,19].
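For completeness, here is a small sketch (illustrative only; it relies on the sugeno() helper defined in the Section 2 snippet) that evaluates the ordinal-regression objective of (10)-(11) for a candidate measure. It only computes the objective; minimizing it still requires the non-smooth and DC global optimization methods cited above.

```python
def ordinal_objective(mu, X, y):
    """Sum over pairs with y_j <= y_k of the violation max(0, S(x_j) - S(x_k))."""
    # sugeno(x, mu) is the helper from the Section 2 snippet.
    J = len(y)
    total = 0.0
    for j in range(J):
        for k in range(J):
            if y[j] <= y[k]:
                total += max(0.0, sugeno(X[j], mu) - sugeno(X[k], mu))
    return total
```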
4.3. Reduction of the problem size

We now mention the k-maxitive and k-interactive fuzzy measures, which reduce the number of fuzzy measure parameters and constraints. For k-interactive fuzzy measures [11] we fix their values for all sets of cardinality greater than k < n in such a way as to maximize the entropy, which in turn ensures that all the criteria have the same chance to affect the aggregated value [23]. Then only the values for the subsets of cardinality no greater than k need to be determined from the data, by solving a problem similar to (9):

    minimize  Σ_{j=1}^{J} (r_j^+ + r_j^−),    (12)
    s.t.  r_j^+ − r_j^− − Σ_{A⊂N, |A|⩽k} µ(A) g_A(x_j) = ((1 − K)/(n − k − 1)) Σ_{i=1}^{n−k−1} x_(i)^j + K x_(n−k)^j − y_j,
          µ(A) ⩾ µ(A \ {i}), ∀i ∈ A, for all A ⊂ N, 0 < |A| ⩽ k,
          µ(A) ⩽ K, |A| = k,

where K ⩽ 1 is a chosen parameter.

We see that in the new representation ∆_i µ(A) the problem remains an LP, and many monotonicity constraints are eliminated.

In the case of k-maxitive fuzzy measures we fix µ(A) = 1 for all A with |A| > k, which also simplifies problem (11) while preserving its structure. This type of fuzzy measure is called k-tolerant.

4.4. Learning super- and sub-modular capacities

The proposed change of variables has eliminated the monotonicity constraints only partially. The next result shows that for two important special cases, supermodular and submodular fuzzy measures, all monotonicity constraints are redundant.

Theorem 2. If the marginal contributions that satisfy (8) are non-negative, the supermodularity constraints imply monotonicity.

Proof. Supermodularity is equivalent to the following conditions, as per Corollary 2.23 in [1]: ∆_i µ(A) ⩽ ∆_i µ(B) for all A ⊂ N and B = A ∪ {j}, j ∉ A, i ∈ N \ B. This means that the marginal contributions (for a fixed i) form a non-decreasing sequence with respect to subset cardinality. Since all ∆_i µ(∅) ⩾ 0, we obtain the result.

The case of submodular fuzzy measures is handled by duality: a fuzzy measure is submodular if and only if its dual is supermodular. Submodular fuzzy measures arise in distance metric learning [32] to ensure the triangle inequality.

Consequently, in the new fuzzy measure representation ∆_i µ(A) satisfying (8), learning supermodular fuzzy measures requires only the supermodularity constraints ∆_i µ(A) ⩽ ∆_j µ(A ∪ {i}) (of which there are 2^n − n − 1), but no monotonicity constraints, which is a significant advance over [32].
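In the union-set keying used in the earlier snippets (each basis variable ∆_i µ(A) stored under A ∪ {i}), this constraint family can be enumerated as follows; the sketch is an illustrative reading of the constraints, written with 0-based criteria indices, and is not taken from the paper.

```python
from itertools import combinations

def supermodularity_constraints(n):
    """One constraint delta[S - {max S}] <= delta[S] per subset S with |S| >= 2."""
    cons = []
    for size in range(2, n + 1):
        for S in map(frozenset, combinations(range(n), size)):
            cons.append((S - {max(S)}, S))     # lhs <= rhs, both keyed by union sets
    return cons

print(len(supermodularity_constraints(4)))     # 11 = 2^4 - 4 - 1, as in the example of Section 4.6
```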
4.5. Encoding the variables

Finally, we present an efficient method of encoding the variables ∆_i µ(A) into a numerical vector of parameters v that is used in the mentioned optimization problems. We recall the binary ordering used to encode the values µ(A) into a 2^n-vector [2]. The position of µ(A) in a 0-based vector v is found in binary form by associating the ith bit with the presence or absence of the element i in the set A. For example, the first few terms in this vector are v = (µ(∅), µ({1}), µ({2}), µ({1, 2}), µ({3}), µ({1, 3}), . . .).

Since we established that each µ(A) can be expressed through (7) uniquely when we fix σ to the natural increasing ordering, we can now encode the variables ∆_i µ(A) into a numerical vector w according to the same scheme, by placing ∆_i µ(A) at the position occupied in v by µ(A ∪ {i}). So the first few elements of the vector are w = (µ(∅), ∆_1 µ(∅), ∆_2 µ(∅), ∆_2 µ({1}), . . .). Note that only the values which satisfy (8) are encoded.

The linear transformation between v and w is achieved by the matrix–vector multiplication v = Aw, where A is a suitable 0–1 matrix, which facilitates setting up the optimization problems and performing calculations.
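A direct implementation of this encoding might look as follows (a sketch under the same illustrative assumptions as before, with 0-based criteria indices; the matrix built here plays the role of the 0–1 matrix A in v = Aw).

```python
import numpy as np

def pos(A):
    """Binary position of mu(A): the ith bit records whether criterion i is in A."""
    return sum(1 << i for i in A)

def transformation_matrix(n):
    """0-1 matrix with v = A_mat @ w, i.e. the partial sums (7) along the increasing ordering."""
    A_mat = np.zeros((2 ** n, 2 ** n))
    A_mat[0, 0] = 1.0                               # mu(empty set) = w[0]
    for code in range(1, 2 ** n):
        elems = [i for i in range(n) if code & (1 << i)]
        prefix = []
        for i in sorted(elems):                     # chain of prefixes of A in increasing order
            prefix.append(i)
            A_mat[code, pos(prefix)] = 1.0          # add the Delta stored at the position of mu(prefix)
    return A_mat

print(transformation_matrix(2))   # rows rebuild v = (mu(empty), mu({1}), mu({2}), mu({1,2})), 1-based labels
```

With this matrix, constraint rows written for v, such as those of (9), are converted to the w variables by multiplying their coefficient rows by the matrix.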
A linear programming formulation such as (9) facilitates sensitivity analysis of the model using standard linear programming tools. In this way the dependence of the solution on the empirical data can be evaluated. Furthermore, in the case of multiple optimal solutions, the whole set of optimal solutions can be identified and studied, as in Non-additive Robust Ordinal Regression [21,33]. Sensitivity analysis is much more complicated in the case of the Sugeno integral and the associated nonlinear programming problem.

4.6. Illustrative example

Consider the case of n = 4 criteria, and hence 2^n = 16 fuzzy measure values (two of which are fixed). For example, the interdependent criteria in military hardware assessment are Survivability, Lethality, Mobility and Communications. The data set D is composed of J samples (x_1^j, x_2^j, x_3^j, x_4^j, y_j), and the goal is to find an optimal fuzzy measure using model (9), which in addition is superadditive. Instead of µ(A) we use the marginal contributions which satisfy (8), namely the non-negative variables

    µ(∅), ∆_1(∅), ∆_2(∅), ∆_2({1}), ∆_3(∅), ∆_3({1}), ∆_3({2}), ∆_3({1, 2}), ∆_4(∅), ∆_4({1}), ∆_4({2}), ∆_4({1, 2}), ∆_4({3}), ∆_4({1, 3}), ∆_4({2, 3}), ∆_4({1, 2, 3}),

organized into a vector in the mentioned ordering.

The monotonicity constraints are now implicit, and the eleven superadditivity constraints are

    ∆_1(∅) ⩽ ∆_2({1}) ⩽ ∆_3({1, 2}) ⩽ ∆_4({1, 2, 3}),
    ∆_1(∅) ⩽ ∆_3({1}) ⩽ ∆_4({1, 3}),    ∆_1(∅) ⩽ ∆_4({1}),
    ∆_2(∅) ⩽ ∆_3({2}) ⩽ ∆_4({2, 3}),    ∆_2(∅) ⩽ ∆_4({2}),
    ∆_2({1}) ⩽ ∆_4({1, 2}),    ∆_3(∅) ⩽ ∆_4({3}),

as well as the two equality constraints

    µ(∅) = 0,    ∆_1(∅) + ∆_2({1}) + ∆_3({1, 2}) + ∆_4({1, 2, 3}) = 1.


In contrast, learning such a measure in the Möbius representation requires 28 monotonicity and 24 supermodularity constraints [32]. To see the trend, when n = 8 we only have 55 linear constraints, as compared to 1016 + 1792 monotonicity and superadditivity constraints in the Möbius representation, with the same number of decision variables.

Once the constraints are specified, it is not difficult to use problem formulation (9) to construct an optimal fuzzy measure that fits D, and then convert the optimal solution to µ using the partial sums (7).

5. Conclusions

In this contribution we provided a new and efficient representation of fuzzy measures in terms of a suitably chosen subset of marginal contributions. The main achievement is the elimination of many monotonicity constraints on fuzzy measures, which translate into simpler non-negativity constraints. In the cases of supermodular and submodular fuzzy measures, monotonicity constraints can be eliminated altogether. Since the change of variables is linear, the structure of the fuzzy measure learning problems is preserved in the new variables (whether expressed as an LP, QP or DC problem), yet is simplified due to the simpler constraints, and in the case of an LP these constraints are in fact implicit. Furthermore, the new scheme is also efficient for k-interactive and k-maxitive fuzzy measures, and enjoys the same reduction of the number of parameters under these simplifications.

We also specified an efficient scheme for encoding the new variables into a 2^n-vector, which facilitates variable transformations and the calculation of the fuzzy integrals and the objective functions by using standard linear algebra.

Acknowledgment

The publication has been prepared with the support of the "RUDN University Program 5-100".

References

[1] M. Grabisch, Set Functions, Games and Capacities in Decision Making, Springer, Berlin, New York, 2016.
[2] G. Beliakov, S. James, J.-Z. Wu, Discrete Fuzzy Measures: Computational Aspects, Springer, Cham, 2019.
[3] G. Choquet, Theory of capacities, Ann. Inst. Fourier 5 (1953) 131–295.
[4] M. Grabisch, I. Kojadinovic, P. Meyer, A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package, European J. Oper. Res. 186 (2) (2008) 766–785.
[5] G. Beliakov, S. James, G. Li, Learning Choquet-integral-based metrics for semisupervised clustering, IEEE Trans. Fuzzy Syst. 19 (2011) 562–574.
[6] R.R. Yager, Using fuzzy measures to construct multi-criteria decision functions, in: Soft Computing Based Optimization and Decision Models, Springer, 2018, pp. 231–239.
[7] G. Beliakov, S. James, T. Wilkin, T. Calvo, Robustifying OWA operators for aggregating data with outliers, IEEE Trans. Fuzzy Syst. 26 (2018) 1823–1832.
[8] G. Lucca, G. Dimuro, J. Fernández, H. Bustince, B. Bedregal, J. Sanz, Improving the performance of fuzzy rule-based classification systems based on a non-averaging generalization of CC-integrals named C_{F1,F2}-integrals, IEEE Trans. Fuzzy Syst. 27 (2018) 124–134.
[9] J.-Z. Wu, G. Beliakov, Probabilistic bipartition interaction index of multiple decision criteria associated with the nonadditivity of fuzzy measures, Int. J. Intell. Syst. 34 (2019) 247–270.
[10] M. Grabisch, k-order additive discrete fuzzy measures and their representation, Fuzzy Sets and Systems 92 (1997) 167–189.
[11] G. Beliakov, J.-Z. Wu, Learning fuzzy measures from data: simplifications and optimisation strategies, Inform. Sci. 494 (2019) 100–113.
[12] J.-L. Marichal, M. Roubens, Determination of weights of interacting criteria from a reference set, European J. Oper. Res. 124 (3) (2000) 641–650.
[13] G. Beliakov, Construction of aggregation functions from data using linear programming, Fuzzy Sets and Systems 160 (2009) 65–75.
[14] A. Tehrani, W. Cheng, K. Dembczynski, E. Hüllermeier, Learning monotone nonlinear models using the Choquet integral, Mach. Learn. 89 (1–2) (2012) 183–211.
[15] M.A. Islam, D.T. Anderson, A.J. Pinar, T.C. Havens, Data-driven compression and efficient learning of the Choquet integral, IEEE Trans. Fuzzy Syst. 26 (4) (2018) 1908–1922.
[16] A. Mendez-Vazquez, P. Gader, J.M. Keller, K. Chamberlin, Minimum classification error training for Choquet integrals with applications to landmine detection, IEEE Trans. Fuzzy Syst. 16 (1) (2008) 225–238.
[17] M.F. Anderson, D.T. Anderson, D.J. Wescott, Estimation of adult skeletal age-at-death using the Sugeno fuzzy integral, Am. J. Phys. Anthropol. 142 (1) (2010) 30–41.
[18] M. Gagolewski, S. James, G. Beliakov, Supervised learning to aggregate data with the Sugeno integral, IEEE Trans. Fuzzy Syst. 27 (2019) 810–815.
[19] G. Beliakov, M. Gagolewski, S. James, Aggregation on ordinal scales with the Sugeno integral for biomedical applications, Inform. Sci. 501 (2019) 377–387.
[20] Q. Brabant, M. Couceiro, K-maxitive Sugeno integrals as aggregation models for ordinal preferences, Fuzzy Sets and Systems 343 (2018) 65–75.
[21] S. Angilella, M. Bottero, S. Corrente, V. Ferretti, S. Greco, I.M. Lami, Non additive robust ordinal regression for urban and territorial planning: An application for siting an urban waste landfill, Ann. Oper. Res. 245 (1–2) (2016) 427–456.
[22] G. Lucca, J. Sanz, G. Dimuro, B. Bedregal, R. Mesiar, A. Kolesárová, H. Bustince, Preaggregation functions: Construction and an application, IEEE Trans. Fuzzy Syst. 24 (2016) 260–272.
[23] J.-L. Marichal, Entropy of discrete Choquet capacities, European J. Oper. Res. 137 (3) (2002) 612–624.
[24] M. Grabisch, J.-L. Marichal, M. Roubens, Equivalent representations of set functions, Math. Oper. Res. 25 (2) (2000) 157–178.
[25] D. Anderson, J. Keller, T. Havens, Learning fuzzy-valued fuzzy measures for the fuzzy-valued Sugeno fuzzy integral, Lecture Notes Artif. Intell. 6178 (2010) 502–511.
[26] G. Beliakov, Rfmtool package, version 3, 2018, https://CRAN.R-project.org/package=Rfmtool.
[27] M. Grabisch, A new algorithm for identifying fuzzy measures and its application to pattern recognition, in: Proceedings of the 1995 IEEE International Conference on Fuzzy Systems (International Joint Conference of the Fourth IEEE International Conference on Fuzzy Systems and the Second International Fuzzy Engineering Symposium), IEEE, 1995, pp. 145–150.
[28] C. Lawson, R. Hanson, Solving Least Squares Problems, SIAM, Philadelphia, 1995.
[29] D. Dubois, J.-L. Marichal, H. Prade, M. Roubens, R. Sabbadin, The use of the discrete Sugeno integral in decision-making: A survey, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 9 (05) (2001) 539–561.
[30] R. Horst, N. Thoai, DC programming: overview, J. Optim. Theory Appl. 103 (1999) 1–43.
[31] A. Ferrer, A. Bagirov, G. Beliakov, Solving DC programs using the cutting angle method, J. Global Optim. 61 (2015) 71–89.
[32] G. Beliakov, S. James, G. Li, Learning Choquet integral-based metrics in semi-supervised classification, IEEE Trans. Fuzzy Syst. 19 (2011) 562–574.
[33] S. Angilella, S. Greco, B. Matarazzo, Non-additive robust ordinal regression: A multiple criteria decision model based on the Choquet integral, European J. Oper. Res. 201 (1) (2010) 277–288.
