Available online at www.sciencedirect.com

Applied Soft Computing 8 (2008) 1178–1188


www.elsevier.com/locate/asoc

Fuzzy functions with LSE


I. Burhan Türkşen a,b,*
a Head Department of Industrial Engineering, TOBB-Economy and Technology University, Sögütözü Cad. No. 43, Sögütözü 06560, Ankara, Turkey
b Director-Knowledge/Intelligence Systems Laboratory, Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, Ontario M5S 3G8, Canada
Received 21 February 2007; accepted 23 February 2007
Available online 23 December 2007

Abstract
“Fuzzy Functions” are proposed to be determined by the least squares estimation (LSE) technique for the development of fuzzy system models. These functions, “Fuzzy Functions with LSE”, are proposed as alternative representation and reasoning schemas to the fuzzy rule base approaches. These “Fuzzy Functions” can be more easily obtained and implemented by those who do not have an in-depth knowledge of fuzzy theory. Working knowledge of a fuzzy clustering algorithm such as FCM or its variations would be sufficient to obtain membership values of input vectors. The membership values together with scalar input variables are then used by the LSE technique to determine “Fuzzy Functions” for each cluster identified by FCM. These functions are different from “Fuzzy Rule Base” approaches as well as “Fuzzy Regression” approaches. Various transformations of the membership values are included as new variables in addition to the original selected scalar input variables; and at times, a logistic transformation of non-scalar original selected input variables may also be included as a new variable. A comparison of the “Fuzzy Functions-LSE” approach with the “Ordinary Least Squares Estimation” (OLSE) approach shows that “Fuzzy Functions-LSE” provides better results, on the order of 10% or better with respect to the RMSE measure, for both training and test cases of data sets.
© 2008 Elsevier B.V. All rights reserved.

Keywords: Fuzzy functions; Rule bases; Membership values; Transformations; Input–output variables; Scalar and non-scalar; Reasoning; Least squares; Logistic

* Correspondence address: Head Department of Industrial Engineering, TOBB-Economy and Technology University, Sögütözü Cad. No. 43, Sögütözü 06560, Ankara, Turkey. Tel.: +90 312 292 4068/+1 416 978 1278; fax: +90 312 292 4092.
E-mail addresses: bturksen@etu.edu.tr, turksen@mie.utoronto.ca.
URL: http://www.mie.utoronto.ca/staff/profiles/turksen.html, http://www.etu.edu.tr
1568-4946/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.asoc.2007.12.004

1. Introduction

Fuzzy Functions, for short, FF, are proposed for the structure identification of system models and reasoning with them. These fuzzy functions can be determined by any function identification method such as least squares’ estimates, LSE, maximum likelihood estimates, MLE, support vector machine estimates, SVM, etc. Here we discuss only the Fuzzy Functions with Least Squares’ Estimates (FF-LSE). In a future work, we plan to discuss Fuzzy Functions with Support Vector Machines (FF-SVM). Furthermore, we discuss only the Type 1 Fuzzy Functions. Our work on Type 2 Fuzzy Functions is planned to follow in the future.

The proposed Fuzzy Functions-LSE (FF-LSE) are formed with a selected set of original scalar input variables plus logistic transformations of non-scalar input variables as well as suitable transformations of membership values of a given input vector’s belonging to a fuzzy cluster. The membership values are determined by a fuzzy clustering algorithm with the analysis of training vectors. The models are validated by test vectors.

Next we briefly explain how the proposed structure identification of system models with FF-LSE is unique and structurally different and distinct from the well-known structure identification approaches which are

(i) Most commonly applied methods of developing fuzzy rule bases which are determined either by experts or fuzzy clustering methods, such as FCM [1] in order to obtain the membership descriptions of the input fuzzy sets that form the left-hand sides and the output fuzzy sets that form the right-hand sides of a fuzzy rule base. This approach was initially proposed by Zadeh [2,3] and originally applied by Mamdani and Assilian [4]. There are two basic variations of
this approach among many others that can be found in current literature:
(a) Sugeno-Yasukawa (1993) approach where fuzzy sets of both the right- and left-hand sides are determined either by experts or by fuzzy clustering algorithms such as FCM [1] with several alternatives.
(b) Takagi and Sugeno [5] approach where fuzzy sets of the left-hand sides of a fuzzy rule base are determined either by experts or by fuzzy clustering algorithms such as FCM [1] and the right-hand sides are functions determined either by experts or by function estimation methods.
(ii) There appear to be two different approaches under the heading of “Fuzzy Regression”:
(a) Methods that were proposed by Tanaka [6] and investigated by Tanaka et al. [7,8], Tanaka and Ishibuchi [9], Celmins [10], Celmins [11], Savic and Pedrycz [12] in current literature, where the coefficients of input variables are assumed to be fuzzy numbers. These fuzzy regression models are based on the possibility theory instead of the probability theory.
(b) Method proposed by Hathaway and Bezdek [13] where first the fuzzy clusters determined by an FCM method define how many ordinary regressions are to be constructed, one for each cluster. Next each fuzzy cluster is used essentially for switching purposes to determine the most appropriate ordinary regression that is to be applied for a new input from amongst a number of ordinary regressions determined in the first place.

It is to be noted that we are proposing “Fuzzy Functions-LSE” as an alternative to “Fuzzy Rule Base” approaches as well as “Fuzzy Regression” approaches that we have briefly reviewed above. Thus once we specify the details of our proposed approach of “Fuzzy Functions with LSE” (FF-LSE), we will compare our results with the “Ordinary Least Squares Estimation” (OLSE) approach.

2. Background of fuzzy system models

Before we discuss the details of our proposed “Fuzzy Functions”, let us briefly review the structure of fuzzy rule bases specified in (i) (a) and (b).

2.1. Fuzzy rule base models

The most commonly applied fuzzy rule bases are fuzzy system models which attempt to identify the underlying relationship between input and output variables of a system by fuzzy sets. In this paper, we will only deal with Multi-Input Single Output (MISO) systems. Generally fuzzy system models represent relationships between the input and output variables which are expressed as a collection of IF-THEN rules that utilize linguistic labels, which are represented with fuzzy sets. The general fuzzy rule base structure can be written as follows:

\[ R : \operatorname*{ALSO}_{i=1}^{c^*} \left( \text{IF } \mathrm{antecedent}_i \text{ THEN } \mathrm{consequent}_i \right) \tag{1} \]

where $c^*$ is the number of rules in a rule base, either given by experts or determined by a fuzzy clustering algorithm such as FCM. The fuzzy rule base structures determined by alternatives (i), (a) and (b) stated above mainly differ in the representation of the consequents in structure (1). If the consequent is represented with fuzzy sets then the fuzzy rule base can be categorized as alternative (i) (a). This is the one initially proposed by Zadeh [3] and originally applied by Mamdani and Assilian [4], and a modified version is proposed by Sugeno and Yasukawa (1993). Thus, we will call this the Sugeno-Yasukawa Fuzzy Rule Base, SY-FRB, structure. Whereas, if the consequents are represented with linear equations of input variables, then the rule base structure is the alternative (i) (b) which we call the Takagi-Sugeno Fuzzy Rule Base [5], TS-FRB, structure. Thus SY-FRB and TS-FRB are considered to be special cases of Zadeh Fuzzy Rule Bases, Z-FRB. These special cases of Z-FRB, i.e., SY-FRB and TS-FRB structures, can be formalized as follows.

In general, let $nv$ be the number of selected input variables in the system. Then, the multidimensional antecedent, $x$, can be defined as $x = (x_1, x_2, \ldots, x_{nv})$, where $x_j$ is the jth input variable of the antecedent, and the domain $X$ of $x$ can be defined as $X = X_1 \times X_2 \times \cdots \times X_{nv}$, where $X_j \subseteq \mathbb{R}$ is the domain of variable $x_j$. Similarly, the domain of the output variable, $y$, will be denoted as $Y \subseteq \mathbb{R}$. Then, the ith rule, $R_i$, and rule base, $R$, in the SY-FRB structure can be defined as

\[ R_i : \text{IF } \operatorname*{AND}_{j=1}^{nv} \left( x_j \in X_j \text{ isr } A_{ij} \right) \text{ THEN } y \in Y \text{ isr } B_i, \quad \forall\, i = 1, \ldots, c^* \tag{2} \]

\[ R : \operatorname*{ALSO}_{i=1}^{c^*} \left( \text{IF } \operatorname*{AND}_{j=1}^{nv} \left( x_j \in X_j \text{ isr } A_{ij} \right) \text{ THEN } y \in Y \text{ isr } B_i \right) \tag{3} \]

where $A_{ij}$ is the linguistic label, i.e., fuzzy subset, associated with the jth input variable of the antecedent in the ith rule, $R_i$, with membership function $\mu_i(x_j) : X_j \to [0,1]$ and similarly $B_i$ is the consequent linguistic label, i.e., consequent fuzzy subset, of the ith rule with membership function $\mu_i(y) : Y \to [0,1]$, and $c^*$ is the number of rules in the model. In this structure, the challenges for knowledge representation are (1) to identify the membership functions of fuzzy sets on the left- and right-hand sides of the rules and (2) to identify the most suitable t-norm and t-conorm combinations that represent in a one-to-one correspondence the linguistic “AND” and “OR” for the combination of left-hand side fuzzy subsets together with the implication operator, “IMP”, that will carry the left-hand side membership degree, i.e., the degree of firing, to the right-hand side consequent fuzzy subset. As well, one needs to know and be able to apply fuzzy logic to carry out approximate reasoning. It should be recalled that Mamdani and Assilian [4] applied the Min operator for both “AND” and “IMP”, which is a very special case, whereas SY-FRB is more general. On the other hand, when
the linguistic “AND” and “OR” operators cannot be represented in a one-to-one correspondence with a t-norm and a t-conorm, respectively, as it is shown by Türkşen [14,15], then the FDCF and FCCF, Fuzzy Disjunctive and Conjunctive Canonical Forms, are to be used for the representation of rules and for reasoning with them. However, such models fall into Interval-Valued Type 2 fuzzy systems analyses which are not dealt with in this paper. Finally, one has to carry out defuzzification computations in all fuzzy rule base models.

Furthermore the above structure assumes non-interactivity between input variables [3]. In fact, this is the underlying assumption when the fuzzy subsets for the left- and right-hand sides are obtained from experts by interview techniques. In order to eliminate the non-interactivity assumption, Delgado et al. [16], Babuska and Verbruggen [17], and Uncu and Türkşen [18] used multi-dimensional Type 1 fuzzy subsets to represent the antecedent part of the rules. In such investigations, generally a multi-dimensional fuzzy clustering technique, e.g., FCM, is implemented to obtain multi-dimensional fuzzy subsets that capture the interactivity (or joint effect) of input variables. Hence, the Z-FRB structure can be expressed as follows:

\[ R : \operatorname*{ALSO}_{i=1}^{c^*} \left( \text{IF } x \in X \text{ isr } A_i \text{ THEN } y \in Y \text{ isr } B_i \right) \tag{4} \]

where the multi-dimensional antecedent fuzzy subset of the ith rule is $A_i$. This multi-dimensional antecedent fuzzy subset determination eliminates the search for the appropriate t-norm for the combination of antecedent fuzzy subsets with “AND”. Thus, the degree of firing, say, for the ith rule, is determined directly from the corresponding ith multi-dimensional antecedent fuzzy subset $A_i$ and applied to the consequent fuzzy subset with the selection of an appropriate implication operator, “IMP”. This, however, again requires a search for and a selection of a t-norm/conorm-based “IMP” operator. In particular, the Sugeno-Yasukawa, SY-FRB, structure would thus be expressed as Eq. (4) above; and the Takagi-Sugeno, TS-FRB, Fuzzy Rule Base structure would thus be expressed with Eq. (5) as follows:

\[ R : \operatorname*{ALSO}_{i=1}^{c^*} \left( \text{IF } \mathrm{antecedent}_i \text{ THEN } y_i = a_i x^T + b_i \right) \tag{5} \]

where $\mathrm{antecedent}_i = x \in X \text{ isr } A_i$, and $a_i = (a_{i,1}, \ldots, a_{i,nv})$ is the regression coefficient vector associated with the ith rule in Eq. (5) whereas the $b_i$’s are the scalars associated with the ith rule of Eq. (5). For these special cases of Z-FRB, again each degree of firing, $d_i$, associated with the ith rule, is determined directly from the corresponding ith multi-dimensional antecedent fuzzy subset $A_i$ and applied to the consequent fuzzy subset for the SY-FRB or to the classical ordinary regression for the case of TS-FRB.

3. Fuzzy functions

A conceptual origin of our proposed FF-LSE may be found in Demirci [19,20] and Demirci and Recasens [21], who discuss the general properties of “Fuzzy Functions” from a perspective of mathematical theory. In particular, Demirci (2003) suggests an application possibility as follows: an input/output system S can be comprehended as an input universe of discourse X and an output universe of discourse Y [20]. In general, X and Y stand for the set of all possible values of the input variable x and the set of all possible values of the output variable y, respectively. The relation between x and y plays a crucial role in the analysis of the system S. It is usually desirable to find a functional relation between x and y, which is conceived as an ordinary function f: X → Y in general. Let us consider an ordinary function f: X → Y proposed for the functional dependency of y to x. To investigate the appropriateness of f for the functional dependency of y to x, for a certain state of x and for the corresponding state of y, let us measure the values of x and y. From a theoretical point of view, for the measured values x-measured of x and y-measured of y, it is expected that the value y-measured should coincide with the theoretically expected value f(x-measured) of y. However, practically, these two values are not usually equal, but they can be very close or very similar to each other. The classical assumption, asserting the functional dependency of y to x as an ordinary function f: X → Y, does not tell us anything about how the measured value x-measured of x relates to the measured value y-measured of y, and how the measured value y-measured of y and the hypothetically claimed value f(x-measured) of y relate to each other. Demirci [20] states that the main difficulty behind these two problems in the classical approach results from the assumption that each possible value of x is related to a unique possible value of y, and both the indistinguishability of the input values and the indistinguishability of the output values are always omitted mathematically. Instead of the classical assumption that accepts the functional dependency of y to x as an ordinary function f: X → Y, Demirci [20] proposes a vaguely defined function from X to Y for the description of the functional dependency of y to x to solve these problems.

In other words, taking the M-equivalence relations E on X and F on Y into account, Demirci [20] suggests that a strong fuzzy function r in L[X × Y] from X to Y w.r.t. E and F can be taken as the mathematical representation of the functional dependency of y to x, where an M-equivalence relation on X is called an M-equivalent similarity relation [22] such that M = (L, ≤, *) denotes an integral, commutative cqm-lattice with L = [0,1] and * a t-norm. For the proposed ordinary function, f: X → Y, to represent the functional dependency of y to x, f can be thought of as a hypothetical or an ideal description of the functional dependency of y to x. For M-equivalence relations E on X and F on Y, a strong fuzzy function r in L[X × Y] from X to Y w.r.t. E and F with the property f in ORD(r) can also be conceived as a realistic and comprehensive description of the functional dependency of y to x which does not ignore the indistinguishability of input values and the indistinguishability of output values. For each x in X and y in Y, the element r(x, y) of L can be interpreted as the degree of the truth of the statement “y takes the value y for a given value x of x,” where the top element 1 (the bottom element 0) of L denotes the completely true (false) case of this statement. For the sake of simplicity, if we denote the output variable y by y(x) whenever x takes the value x in X, then, for each x in X and y in Y, r(x, y) will be nothing but the degree of the truth of “y(x) = y” [20].
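
To make the reading of r(x, y) as a degree of truth more concrete, the following is a minimal Python sketch. It is not Demirci's construction: the ideal function `ideal_f`, the tolerance `delta`, and the triangular similarity used to model indistinguishability of output values are all hypothetical choices made only for illustration.

```python
def ideal_f(x):
    """Hypothetical ideal (crisp) dependency y = f(x); chosen only for illustration."""
    return 2.0 * x + 1.0


def similarity(a, b, delta=1.0):
    """Assumed indistinguishability of two output values, as a number in [0, 1].

    Equals 1 when a == b and falls linearly to 0 once |a - b| >= delta.
    """
    return max(0.0, 1.0 - abs(a - b) / delta)


def degree_of_truth(x, y, delta=1.0):
    """Illustrative r(x, y): degree to which the statement 'y(x) = y' holds."""
    return similarity(y, ideal_f(x), delta)


if __name__ == "__main__":
    x_measured = 1.0  # ideal_f(1.0) is 3.0
    for y_measured in (3.0, 3.2, 5.0):
        # Exact agreement, near agreement, and clear disagreement with f(x).
        print(y_measured, degree_of_truth(x_measured, y_measured))
```

In this sketch a measured output that coincides with f(x-measured) receives degree 1, a nearby value receives an intermediate degree, and a value farther than `delta` away receives degree 0, which mirrors the "completely true" and "completely false" cases described above.
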
It is to be noted that while such theoretical analyses give us a base to start the formation of “Fuzzy Functions”, they do not specify how we are to obtain such “Fuzzy Functions”. In this regard, our proposed approach, to be discussed next, provides a novel way where such fuzzy functions can be determined in practical engineering applications.

3.1. Proposed FF-LSE

The proposed FF-LSEs are structurally different from Z-FRB, SY-FRB, TS-FRB, and the “Fuzzy Regression” models of Tanaka et al. [7] and their variations which are explained above, and from the Hathaway and Bezdek [13] model, because the proposed approach introduces membership values and their transformations as well as a logistic transformation of non-scalar original input variables as new input variables in addition to the original scalar input variables for “Fuzzy Function” estimation with LSE. With this new set of (augmented) input variables, one executes a fuzzy clustering algorithm such as FCM and first determines the (local) optimum number of fuzzy clusters and hence the associated membership values and then identifies a fuzzy function to represent each fuzzy cluster separately. Thus there are as many fuzzy functions as there are fuzzy clusters, similar to the Hathaway and Bezdek [13] model, but they include membership functions as input arguments and thus there is no need to use the cluster information for switching purposes. These fuzzy functions are estimated by the least squares method in this paper. Therefore, it is structurally a new and unique approach for the determination of fuzzy functions instead of fuzzy rule bases. They represent fuzzy rule bases indirectly. It is to be noted for the sake of emphasis that coefficients of the inputs, whether they be membership values or their transformations or original input variables, are not fuzzy sets in our proposed approach. Instead membership values and their transformations enter into an augmented input set as new and additional variables. In our experience, it is found that this approach is most suitable for those analysts who are familiar with a function estimation technology, e.g., the least squares technology, etc. They only need to develop an understanding of fuzzy clustering algorithms without studying many aspects of fuzzy theory. That is, they generally do not need to know or to develop an in-depth understanding of essential concepts for the development and use of fuzzy rule bases such as the estimation of membership functions, the selection of t-norms and co-norms for the combination of the left-hand side membership functions and the selection of the implication operator to determine the effect of the left-hand side on the right-hand side membership function. Furthermore, they do not need to study fuzzy logic, such as GMP, Generalized Modus Ponens, fuzzification, de-fuzzification and various alternatives associated with all these essential concepts. Generally such concepts are new to most mathematicians and statisticians in industry who are not working within fuzzy theory nor would they have time to study it. Nor would such researchers have to develop an understanding of the essential concepts of fuzzy numbers, their centers and widths, and how to use them in least squares estimation methods as required in the current fuzzy regression methodologies.

All they have to understand is the notion of membership values and how they can be obtained from a fuzzy clustering algorithm such as FCM in addition to their usual background knowledge of a function estimation technique, e.g., LSE, etc.

Thus we propose a novel approach in order to provide an easy entry into fuzzy system modeling for mathematicians and statisticians who are working in industry and for other novices. For this purpose, we first review the basic structure of the well-known “Least Squares” method of function estimation and then present our generalization of it which includes membership values and their transformations as well as a logistic transformation of non-scalar input variables as new input variables in addition to the original scalar input variables. It is to be noted that, with a logistic analysis of non-scalar input variables, we obtain the frequency information with which non-scalar variables occur in system behavior description. This valuable information is extracted via a logistic analysis because non-scalar variables cannot be used directly in FCM. It should be recalled that FCM requires that all its variables be scalar. In real life databases, generally there are many non-scalar variables, i.e., nominal, binary and ordinal variables; and they contain valuable information that affects output variables based on some performance measure.

3.2. OLSE method

In the OLSE method, the dependent variable, $y$, is assumed to be a linear function of one or more independent, input, variables, $x$, plus an error component as follows:

\[ y = \beta_0 + \beta_1 x_1 + \ldots + \beta_{nv} x_{nv} + \varepsilon \tag{9} \]

where $y$ is the dependent output, the $x_j$’s are the input or explanatory variables, for $j = 1, \ldots, nv$, $nv$ is the number of selected inputs and $\varepsilon$ is the independent error term which is typically assumed to be normally distributed. The goal of the least squares method is to obtain estimates of the unknown parameters, the $\beta_j$’s, $j = 0, 1, \ldots, nv$, which indicate how a change in one of the independent variables affects the dependent variable. The usual method is known as the “Ordinary Least Squares”, OLS. It should be emphasized that only one function is estimated for the whole training data set. In OLS exercises, generally, there is no prior clustering, in fact no fuzzy clustering that is determined ahead of time. Only one function is fitted to the overall training data.

In matrix notation, the general linear model is expressed as

\[ Y = X\beta + \varepsilon \]

where $Y$ is the $[nd, 1]$ vector of response values, $X$ is the $[nd, nv + 1]$ matrix of known constants which are inputs, $nd$ represents the number of input–output vectors in a training data set and $nv$ is the number of selected input variables, $\beta$ is the $[(nv + 1), 1]$ vector
of parameters and $\varepsilon$ is the $[nd, 1]$ vector of errors such that:

\[ y^{T}_{nd,1} = [y_1, y_2, \ldots, y_{nd}] \]
\[ \beta^{T}_{nv+1,1} = [\beta_0, \beta_1, \beta_2, \ldots, \beta_{nv}] \]
\[ \varepsilon^{T}_{nd,1} = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{nd}] \]
\[ X_{nd,nv+1} = [1, x_{k,j} \mid k = 1, 2, \ldots, nd;\ j = 1, 2, \ldots, nv] \]

The objective is to minimize the total squared residual error for the estimation of the parameters of the model, i.e.

\[ \min Q = \sum_{k=1}^{nd} \left( y_k - (\beta_0 + \beta_1 x_{k1} + \ldots + \beta_{nv} x_{k\,nv}) \right)^2 \]

In matrix notation, we re-write it and then take the partial derivatives with respect to the $\beta$’s:

\[ \min Q = (y - X\beta)^T (y - X\beta), \]
\[ \frac{\partial}{\partial \beta} \left[ (y - X\beta)^T (y - X\beta) \right] = 0, \]
\[ 2 (X^T X) \beta = 2 X^T y, \]
\[ \beta = (X^T X)^{-1} X^T y, \]

provided that $X^T X$ is not singular.

3.3. Proposed generalization of OLSE as FF-LSE

The proposed generalization of OLSE as FF-LSE requires that a fuzzy clustering algorithm, such as FCM [1], be available to determine the interactive (joint) membership values of input–output variables in each of the fuzzy clusters that can be identified for a given training data set. Let $(X_k, Y_k)$, $k = 1, \ldots, nd$, be the set of observations in a training data set, such that

\[ X_k = (x_{jk} \mid j = 1, \ldots, nv;\ k = 1, \ldots, nd) \]

First, one determines the optimal $(m^*, c^*)$ pair for a particular performance measure, i.e., a cluster validity index, with an iterative search and an application of the FCM algorithm, where $m$ is the level of fuzziness (in our experiments we usually take $m = 1.1, \ldots, 2.5$), and $c$ is the number of clusters (in our experiments we usually take $c = 2, \ldots, 10$). It should be recalled that the well-known FCM algorithm is stated as [1]

\[ \min J(U, V) = \sum_{k=1}^{nd} \sum_{i=1}^{c} (u_{ik})^{m} \, \lVert x_k - v_i \rVert_A^{2} \]
\[ \text{s.t.} \quad 0 \le u_{ik} \le 1, \quad \forall\, i, k \]
\[ \sum_{i=1}^{c} u_{ik} = 1, \quad \forall\, k \]
\[ 0 < \sum_{k=1}^{nd} u_{ik} < nd, \quad \forall\, i \]

where $J$ is the objective function to be minimized and $\lVert \cdot \rVert_A$ is a norm that specifies a distance-based similarity between the data vector $x_k$ and a fuzzy cluster center $v_i$. In particular, $A = I$ is the Euclidean norm and $A = C^{-1}$ is the Mahalanobis norm, etc.

Once the optimal pair $(m^*, c^*)$ is determined with the application of the FCM algorithm, one next identifies the cluster centers for $m = m^*$ and each cluster $j = 1, \ldots, c^*$ as

\[ \left. v_{X|Y,j} \right|_{m} = (x^{c}_{1,j}, x^{c}_{2,j}, \ldots, x^{c}_{nv,j}, y^{c}_{j}) \]

From this, we identify the cluster centers of the “input space”, again for $m = m^*$ and $j = 1, \ldots, c^*$, as

\[ \left. v_{X,j} \right|_{m} = (x^{c}_{1,j}, x^{c}_{2,j}, \ldots, x^{c}_{nv,j}) \]

Next, one computes the normalized membership values of each vector of observations in the training data set with the use of the cluster center values determined in the previous step. There are generally two steps in these calculations.

(1) First, we determine the (local) optimum membership values $u_{ik}$ and then determine the $\mu_{ik}$’s that are above an $\alpha$-cut in order to eliminate harmonics generated by FCM as

\[ u_{ik} = \left( \sum_{j=1}^{c} \left( \frac{\lVert x_k - v_{X,i} \rVert}{\lVert x_k - v_{X,j} \rVert} \right)^{\frac{2}{m-1}} \right)^{-1}, \qquad \mu_{ik} \ge \alpha \tag{10} \]

where $\mu_{ik}$ denotes the membership value of the kth vector, $k = 1, \ldots, nd$, in the ith rule, $i = 1, \ldots, c^*$, and $x_k$ denotes the kth vector, for all the input variables $j = 1, \ldots, nv$, in the input space.

(2) Next, we normalize them as

\[ \gamma_{ij}(x_j) = \frac{\mu_{ij}(x_j)}{\sum_{i'=1}^{c} \mu_{i'j}(x_j)} \tag{11} \]

where $\gamma_{ij}(x_j)$ are the normalized membership values of $x_j$, $j = 1, \ldots, nd$, in the ith rule, $i = 1, \ldots, c^*$; these constitute a new input variable in our proposed scheme of function identification for the representation of the ith cluster. Let $\Gamma_i = (\gamma_{ij} \mid i = 1, \ldots, c^*;\ j = 1, \ldots, nd)$ be the membership values of $X$ in the ith cluster, i.e., ith rule.

Next we determine a new augmented input matrix $X$ for each of the clusters, which could take on several forms depending on which transformations of membership values we want to or need to include in our system structure identification for our intended system analyses. Examples of these are

\[ X'_i = [1, \Gamma_i, X], \quad \text{or} \quad X''_i = [1, \Gamma_i^{2}, X], \quad \text{or} \quad X'''_i = [1, \Gamma_i^{2}, \Gamma_i^{m}, \exp(\Gamma_i), X], \quad \text{etc.} \]

where $X'_i$, $X''_i$, $X'''_i$ are the augmented input matrices to be used in least squares estimation of a new system structure identification and $\Gamma_i = (\gamma_{ij} \mid i = 1, \ldots, c^*;\ j = 1, \ldots, nd)$. The choice depends on whether we want to or need to include just the membership values or some of their transformations as new input variables in order to obtain a best representation of a system behavior. A new augmented input matrix, say $X'_i$, would look as follows for the special case of $X = X_j$, i.e., the matrix $X$ is just a vector of a
single variable, $X_j = (x_{jk} \mid k = 1, \ldots, nd)$ for the jth input variable:

\[ X'_{ij} = [1, \Gamma_i, X_{ij}] = \begin{bmatrix} 1 & \gamma_{i1} & x_{ij1} \\ \vdots & \vdots & \vdots \\ 1 & \gamma_{i\,nd} & x_{ij\,nd} \end{bmatrix} \]

Thus the function $Y_i = \beta_{i0} + \beta_{i1} \Gamma_i + \beta_{i2} X_{ij}$, that represents the ith rule corresponding to the ith interactive (joint) cluster in $(Y_i, \Gamma_i, X_j)$ space, would be estimated as follows:

\[ \beta_i = \left( X'^{T}_{ij} X'_{ij} \right)^{-1} \left( X'^{T}_{ij} Y_i \right), \quad \text{where } X'_{ij} = [1, \Gamma_i, X_{ij}], \]

provided the inverse exists, such that $\beta_i = (\beta_{i0}, \beta_{i1}, \beta_{i2})$ and the estimate of $Y_i$ would be obtained as $Y_i = \beta_{i0} + \beta_{i1} \Gamma_i + \beta_{i2} X_{ij}$. Within the proposed framework, the general form of the shape of a cluster for the case of a single input variable $X_j$ and for the ith cluster can be conceptually captured by a second order (cone) function when one introduces the square of membership values into the augmented input matrix in the space of $U \times X \times Y$, which can be illustrated with a prototype shown in Fig. 1.

Fig. 1. A fuzzy cluster in U × X × Y space.

In a number of real life case studies, we have in fact found out that generally some second order or exponential functions give a good approximation from amongst the following 20 alternatives we have experimented with:

Model1   $X' = (1, \Gamma, X)$
Model2   $X' = (1, \Gamma, \Gamma^{2}, X)$
Model3   $X' = (1, \Gamma^{m}, X)$
Model4   $X' = (1, e^{\Gamma}, X)$
...
Model20  $X' = (1, \Gamma, e^{\Gamma^{m}}, \Gamma^{m}, X)$

4. Real life case study applications

The proposed FF-LSEs were developed for and applied to two real life case studies. We briefly review here these two case studies: (1) desulfurization process of steel for a steel company, and (2) income prediction of customers for a bank. The results of the fuzzy functions developed in these investigations are compared with the “Ordinary Least Squares Estimation”, OLSE, functions on the basis of the R-square measure of performance. It is shown that the predictions made by “Fuzzy Functions with LSE” are better by 10% or more than the predictions made by OLSE with respect to R-square.

4.1. Desulfurization case study

A desulfurization facility in a large scale steel industry is used to remove the sulfur from hot molten steel coming from blast furnaces before it is sent to the next operation in the processing sequence. Desulfurization is carried out by the injection of two different powdered reagents, Reagent1 and Reagent2, directly into the hot molten steel by means of a lance. The reagents react with the sulfur in the hot metal. Then the sulfur rich slag is separated from the steel.

4.1.1. Objective
The purpose of the modeling activity is that a reduction in reagent consumption would be possible if a more precise and reliable model can be developed to estimate the right amount of reagents to be used in the desulphurization process. This is based on the fact that in the desulphurization process, the target amount of sulfur, i.e., the aim sulfur, is often set much lower than the actual sulfur value that comes from blast furnaces. The “aim sulfur” specifies the quality of steel demanded by customers. One of the key concerns in this case study is that when a model with poor predictive capability is used, many batches of hot metal have to be desulphurized again and again. It should be noted that the desulphurization process is a highly expensive process.

Therefore, the main objective in this exercise is to minimize the number of desulphurization processes by an increase of a model’s prediction ability. For this purpose, FF-LSEs were developed and applied to provide a more accurate and reliable determination of reagent amounts in order to desulfurize each new batch of hot metal. As well, the same dataset is used to determine OLSE, Ordinary LSE, in order to show that it is advantageous to use the proposed approach.

In summary, the objective of the desulphurization process is basically to adjust the level of sulfur in the steel to meet a quality specification. The sulfur is removed from the metal by adding two different reagents, Reagent1 and Reagent2. These reagents bind the sulfur and move it into the slag layer, which forms on top of the hot metal. After the sulfur is removed from the hot metal, the metal is used to produce the final steel product. As each batch of hot metal arrives at the station, a model is used to predict “what amount of each of the reagents” will be required to produce the final steel product quality.

4.1.2. Dataset
The input variables used in the dataset can be divided into two parts: scalar and non-scalar variables. Scalar variables have continuous values between (−∞ and +∞) whereas non-scalar variables can be either binary or ordinal variables. Hence some of the variables of the Desulfurization model are (1) scalar variables: start sulfur, KGS, Temp, FB, aim-sulfur, end-sulfur,
compounds 1–5 and (2) binary or ordinal variables: Car-Type, Pos, Practice 1–6, Injection #, Equipment Type.

Note that, since the goal of the model is to predict what amount of each of the reagents are required to produce the final steel product, the model has two dependent output, Y, variables, one for Reagent1 and the other for Reagent2. It is found that these two variables in general are highly correlated.

4.1.3. Data pre-processing
In the beginning of the desulfurization project, various data pre-processing techniques have been applied to construct a dataset to determine FF-LSE’s as well as OLSE’s in order to estimate the necessary amounts of reagents needed in each process.

We were given a desulphurization dataset which contained approximately 13,000 observations, i.e., vectors, with 27 variables composed of binary, ordinal and scalar variables as well as a few categorical variables. Each observation represents a data vector for one batch of hot metal.

First various outlier treatments have been conducted applying expert knowledge. After the outlier treatments, the remaining dataset contained only 9475 observations. Next

probability of a certain event occurring. From the equation shown above, the probability of Y = 1 is calculated in the general case as follows:

\[ p(Y = 1) = \frac{e^{\beta_0 + \sum_{i=1}^{n} \beta_i x_i}}{1 + e^{\beta_0 + \sum_{i=1}^{n} \beta_i x_i}} \]

Logistic regression thus forms a predictor variable, $\log(p/(1 - p))$, which is a linear combination of the binary or ordinal explanatory variables. The values of this predictor variable are then transformed into probabilities by a logistic function.

In this study, only the discrete explanatory variables, i.e., binary and ordinal, in the dataset are used to model logistic regression. Because the original response variables are scalar, each response variable is sorted and then divided into two groups, each of which has the equal amount of observations. Thereby, two new binary output variables are constructed. As a result, using the new binary response variables and the same discrete explanatory variables, two logistic regression models are constructed using the probabilities as the fitted values.

The correlations of probabilities of fitted logistic regression
the dataset was partitioned into two separate datasets, namely model with the binary response variable are calculated and the
the training and testing datasets, using a proposed sampling probabilities having the highest correlation to the response
technique, known as Partial Iterative Sampling Method (PISM). variables are added as the new input variable to the original
PISM is an iterative method used to select best chunk of data to input variable dataset to form the augmented input variable data
be used for modeling. Different training and testing datasets are set.
prepared in each iteration. The steps of this iterative process are It should be recalled that the two response variables,
omitted for the sake of page restriction. Reagent1 and Reagent2, are highly correlated; hence the
selected probabilities of these two response variables are more
4.1.4. Variable selection or less identical. Based on this, the probability from Reagent1’s
The process of selecting input variables of the system logistic regression model is used as the new input for the
includes correlation analysis, descriptive statistics, and expert formation of FF-LSE.
knowledge. Recall that FCM requires that explanatory and
response variables be scalar. In addition, the effect of non-scalar 4.1.6. Formation of FF-FRB-LSE
variables can also be determined by a logistic regression We have developed and search through a good number of
analysis. With the application of Logistic Regression, one can FF-LSE models. Amongst these, we very briefly discuss four
capture the probability by which non-scalar input variables can specific models which are labeled as FF-LSE-M1, FF-LSE-M2,
affect an output variable. These probabilities are included as FF-LSE-M3, and FF-LSE-M4. The augmented matrices of
new additional variables into the augmented input dataset. these four models are specific selections from the 20 models
partly shown above: Model1–Model20. As well, Fuzzy
4.1.5. Logistic regression Functions-LSE are constructed for each of the two output
Logistic regression is a variation of ordinary regression. variables, Reagent1 and Reagent2, separately. These four FF-
Unlike ordinary linear regression, logistic regression does not LSE models are built using our FSM tool developed with SAS
assume that the relationship between the independent variables 9.1 in KIS Lab, Knowledge/Intelligence Laboratory, by KIS
and the dependent variable is a linear one. Nor does it assume Lab Software development group. (For more information about
that the dependent variable or the error terms are distributed the software see http://www.mie.utoronto.ca/labs/fuzzy/
normally. Fsmdemo.html.) Again for each of the four different model
The form of the model in general is structures, a heuristic search has been applied to find the
  optimum model parameters. For comparative purposes, the
p results from each of the four sub-optimum FF-LSE models
log ¼ b0 þ b1 X 1 þ b2 X 2 þ    þ bk X k
1 p together with OLSE are shown in Table 1. (Recall that FCM
identifies sub-optimal clusters.)
where p is the probability that Y = 1 and X1, X2, . . ., Xk are the As it was stated before, the FF-LSE models are better
binary or ordinal independent variables (predictors); and b0, b1, predictors of the outputs by 10% or more as shown above.
b2, . . ., bk are known as the regression coefficients, which have The coefficients of the best FF-LSE’s, i.e., FF-LSE-M3, are
to be estimated from the data. Logistic regression estimates the shown in Table 2.
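As a rough illustration of how a fuzzy function for one cluster might be assembled, the sketch below builds one row of the augmented input matrix from a membership value, optional transformations of it, and the scalar inputs, and then estimates the coefficients by ordinary least squares through the normal equations. The helper names (`augmented_row`, `table2_transforms`, `least_squares`), the toy numbers, and the choice to fit on every observation with its membership in that cluster are assumptions made for illustration only. The transformation list follows the regressors named in the footnote to Table 2.

```python
import math


def table2_transforms(m):
    """Membership transformations listed in the Table 2 footnote (m: fuzziness)."""
    return [lambda g: g, lambda g: g ** 2, math.exp,
            lambda g: math.exp(g ** m), lambda g: g ** m]


def augmented_row(gamma, x, transforms):
    """Constant term, transformed membership values, then the scalar inputs."""
    return [1.0] + [t(gamma) for t in transforms] + list(x)


def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def least_squares(rows, y):
    """Ordinary LSE via the normal equations (A^T A) beta = A^T y."""
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(ata, aty)


# Hypothetical data for one cluster: membership values and one scalar input.
gammas = [0.1, 0.4, 0.5, 0.8, 0.9]
xs = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0 + 3.0 * g + 0.5 * x for g, x in zip(gammas, xs)]

# Simplest fuzzy function: only the membership value is added to the inputs.
rows = [augmented_row(g, [x], [lambda g: g]) for g, x in zip(gammas, xs)]
beta = least_squares(rows, y)
```

The toy fit uses only the untransformed membership value because the full set of transformations is strongly collinear on so few points; a numerically more robust solver would be advisable for richer augmented matrices.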

Table 1
R-Square values of the Reagent1 and Reagent2 estimations using OLSE and FF-LSE approach for the test dataset

| Model name: desulfurization | R-Square (Reagent1) | R-Square (Reagent2) | R-Square (Average) | m* | c* |
|---|---|---|---|---|---|
| Ordinary Least Square Estimation | 0.82 | 0.76 | 0.79 | | |
| FF-LSE-M1 (12 input) | 0.92 | 0.93 | 0.925 | 1.5 | 3 |
| FF-LSE-M2 (13 input) | 0.925 | 0.934 | 0.93 | 1.5 | 3 |
| FF-LSE-M3 (16 input) | 0.9243 | 0.936 | **0.931** | 1.5 | 3 |
| FF-LSE-M4 (18 input) | 0.9255 | 0.93 | 0.928 | 1.5 | 3 |

m*: optimum degree of fuzziness; c*: optimum cluster size. The "Average" column is the average of the R-square values for the Reagent1 and Reagent2 models. Bold numbers indicate optimal results.
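The size of the improvement over OLSE can be checked directly from the Table 1 entries. The short sketch below computes the relative gain in R-square of FF-LSE-M3 over OLSE for each reagent; reading the "10% or more" statement as a relative gain in R-square is an assumption made here for illustration, and the function name is hypothetical.

```python
# R-square values copied from Table 1 (test dataset).
r2_olse = {"Reagent1": 0.82, "Reagent2": 0.76}
r2_ff_lse_m3 = {"Reagent1": 0.9243, "Reagent2": 0.936}


def relative_gain(baseline, candidate):
    """Fractional improvement of candidate over baseline."""
    return (candidate - baseline) / baseline


gains = {name: relative_gain(r2_olse[name], r2_ff_lse_m3[name])
         for name in r2_olse}

for name, gain in gains.items():
    print(f"{name}: {gain:.1%}")
```

Under this reading the gains are roughly 12.7% for Reagent1 and 23.2% for Reagent2, both above the 10% mark.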

Table 2
The coefficients of the optimum FF-LSE method, where optimum c* = 3, and optimum m* = 1.5

Reagent1

| Coefficient | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| $\beta_0$ | 604.362 | 100.462 | 238.463 |
| $\beta_{\Gamma}$ | 1306.790 | 137.070 | 320.120 |
| $\beta_{\Gamma^2}$ | 4567.480 | 110.800 | 196.700 |
| $\beta_{e^{\Gamma}}$ | 1568.750 | 119.660 | 312.560 |
| $\beta_{e^{\Gamma^m}}$ | 964.406 | 18.830 | 74.002 |
| $\beta_{\Gamma^m}$ | 4296.000 | 74.370 | 106.630 |
| $\beta_{x_1}$ | 0.478 | 0.475 | 0.475 |
| $\beta_{x_2}$ | 0.268 | 0.255 | 0.255 |
| $\beta_{x_3}$ | 0.215 | 0.219 | 0.220 |
| $\beta_{x_4}$ | 0.095 | 0.021 | 0.031 |
| $\beta_{x_5}$ | 0.059 | 0.068 | 0.068 |
| $\beta_{x_6}$ | 0.007 | 0.044 | 0.037 |
| $\beta_{x_7}$ | 0.048 | 0.039 | 0.038 |
| $\beta_{x_8}$ | 0.005 | 0.005 | 0.004 |
| $\beta_{x_9}$ | 0.135 | 0.131 | 0.132 |
| $\beta_{x_{10}}$ | 0.097 | 0.005 | 0.008 |
| $\beta_{x_{11}}$ | 0.379 | 0.377 | 0.377 |

Reagent2

| Coefficient | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| $\beta_0$ | 310.880 | 219.165 | 339.778 |
| $\beta_{\Gamma}$ | 690.369 | 297.991 | 457.103 |
| $\beta_{\Gamma^2}$ | – | 220.290 | 287.490 |
| $\beta_{e^{\Gamma}}$ | 790.615 | 270.785 | 442.356 |
| $\beta_{e^{\Gamma^m}}$ | 479.740 | 51.048 | 102.313 |
| $\beta_{\Gamma^m}$ | 2194.800 | 140.170 | 159.820 |
| $\beta_{x_1}$ | 0.504 | 0.497 | 0.497 |
| $\beta_{x_2}$ | 0.281 | 0.274 | 0.274 |
| $\beta_{x_3}$ | 0.173 | 0.177 | 0.178 |
| $\beta_{x_4}$ | 0.084 | 0.025 | 0.043 |
| $\beta_{x_5}$ | 0.049 | 0.057 | 0.057 |
| $\beta_{x_6}$ | 0.001 | 0.039 | 0.026 |
| $\beta_{x_7}$ | 0.063 | 0.049 | 0.049 |
| $\beta_{x_8}$ | 0.025 | 0.031 | 0.031 |
| $\beta_{x_9}$ | 0.156 | 0.154 | 0.155 |
| $\beta_{x_{10}}$ | 0.067 | 0.012 | 0.012 |
| $\beta_{x_{11}}$ | 0.394 | 0.385 | 0.385 |

$\beta_0$: the constant coefficient; $\beta_{\Gamma}$: the coefficient for the membership value; $\beta_{\Gamma^2}$: the coefficient for the squared membership value; $\beta_{e^{\Gamma}}$: the coefficient for the exponential membership value; $\beta_{e^{\Gamma^m}}$: the coefficient for the exponential of the m powered membership value; $\beta_{\Gamma^m}$: the coefficient of the m powered membership value; $\beta_{x_1}$ to $\beta_{x_{11}}$ are the coefficients of the input variables, i.e., start-sulphur, aim-sulphur, KGS, TEMP, FB, compound1 to compound5, logistic regression probability, respectively.

4.2. Income prediction case study

The objective of the Income Prediction Model is to develop models, based on the available personal information, that estimate the income of a bank's customers and so provide the bank's decision makers with the necessary decision support indicators.

4.2.1. Objectives
The requirement underlying the income prediction modeling case study can be summarized as follows: "A financial institution is willing to offer its customers a special financial package such as credit cards, loans, mortgages, etc. But they needed to know if a customer is suitable or not to be eligible for this offer. For this reason, the income of its future customers needs to be predicted beforehand based on certain information they have gathered from other sources". The financial institution already has customers whose income they know beforehand. The aim is to determine estimates of a customer's income with the empirical data at hand.
In this case study, FF-LSE and OLSE methods are applied to the data at hand in order to show the predictive performance advantage of the proposed FF-LSE approach in comparison to the OLSE.

4.2.2. Income prediction data set
There were roughly 500,000 data records available for this investigation, which include 200 variables such as general demographical and geographical information (up to 15 variables), financial relationship summary (35 variables), product and activity descriptions (80 variables), and aggregate information (70 variables).

4.2.3. Data pre-processing
Missing values were treated by the application of certain criteria decided together with the experts from the bank. Datasets were segmented by suitable categorical variables, i.e., residential area (categorized into three groups), gender, and product types (the number of products used). As a result of this partition of the dataset, there were 11 segments. In this case study, a dataset from a particular segment, which represents the females living in the mid-income class range that had two different financial products of the financial company, is
implemented to demonstrate and show the advantage of the proposed FF-LSE method.
The dataset of the particular segment used in this study consists of 35,095 customers after we applied the data cleaning processes. Various statistical analyses, including descriptive analysis, histograms, scatter diagrams, correlation analysis, etc., were conducted for the input variable selection on the input dataset for the selected segment. In particular, 11 scalar variables and 15 nominal and ordinal variables are chosen among the 200 variables based on their highest explanatory power with respect to the income variable.
The dataset is partitioned into training and testing datasets. For each modeling technique, a separate Partial Iterative Sampling Method (PISM), introduced previously, is applied; as a result of this sampling method, 5700 and 5000 observations are selected for the training and the testing datasets, respectively.
Again, logistic regression is applied to determine probabilities, associated with the nominal, binary and ordinal variables, which are included in FF-LSE's as additional inputs in the augmented input matrix. For this purpose, the output variable, "income", has been divided into three groups according to σ values as follows: the first group includes the input vectors with income values that fall within the (−∞, −σ) interval, the second group includes the input vectors with income values that fall within the [−σ, +σ] interval, and the third group includes the input vectors with income values that fall within the (+σ, ∞) interval. An ordinal income variable is created for each vector, which takes the value 0 if the actual income variable falls within the first group, 1 if it falls within the second, and 2 if it falls within the third group. Ordinal logistic regression is applied using the ordinal income variable as the output and the binary, ordinal and nominal input variables as the explanatory variables. The ordinal logistic regression results in three probability variables for the three discrete values of the ordinal income variable. Each probability variable represents the observed probability of each value of the ordinal output given the inputs. For instance, the first probability variable states the probability of the ordinal income being "0" given the inputs, the second probability variable states the probability of the ordinal income being "1" given the inputs, and the third probability variable states the probability of the ordinal income being "2". The probability having the highest correlation with the scalar income variable in the training dataset is selected as an additional input to the augmented input matrix; as a result, each fuzzy model has one additional input from logistic regression. In order to find the logistic regression probabilities of a new observation, e.g., in the test dataset, the estimated logistic regression coefficients are used together with whichever probability is chosen for the training dataset, in order to determine the probability as the additional input variable value in FF-LSE's.

4.2.4. Application of FF-LSE
Six FF-LSE models were developed, namely, FF-LSE-M1, ..., FF-LSE-M6, with 12 scalar inputs which include the logistic regression probability value together with the six alternative additions of membership value transformations. The augmented matrices of these six models are formed with certain particular membership function transformations and the original input matrix, X, which includes the logistic transformation of certain non-scalar input variables as follows:
The twelve input variables used for each model are
Inputs: (1) withdrawl_am, (2) system_am, (3) MNYIN_ACTIVE_TOTBAL_AM, (4) MNYIN_ACTIVE_PRTBAL_AM, (5) branch_am, (6) AGE_YR, (7) withdrawl_ct, (8) system_ct, (9) PSYTE_CLUSTR_ID, (10) DEM_CHQ_ACTIVE_CT, (11) branch_ct, (12) probability from logistic regression
The comparisons of these six models together with the OLSE model are shown in Table 3 below.

Table 3
R-square values of income estimations for FF-LSE-M1, ..., FF-LSE-M6 are shown with m*: optimum degree of fuzziness, c*: optimum number of clusters

| Model name: income | R2 | m* | c* |
|---|---|---|---|
| Ordinary least square estimation | 0.183 | – | – |
| FF-LSE-M1 (13 input) | 0.6228 | 2 | 5 |
| FF-LSE-M2 (14 input) | 0.622 | 1.5 | 5 |
| FF-LSE-M3 (15 input) | 0.6232 | 2.0 | 5 |
| FF-LSE-M4 (14 input) | 0.6235 | 2.0 | 5 |
| FF-LSE-M5 (14 input) | **0.624** | 1.5 | 5 |
| FF-LSE-M6 (13 input) | 0.62 | 2.0 | 3 |

Bold numbers indicate optimal results.

The best FF-LSE model is FF-LSE-M5, with optimum c* = 5 and optimum m* = 1.5. Its original selected scalar input variables, including the logistic transformation of the non-scalar original input variables, are wdr_am: withdrawl_am, syst_am: system_am, mnatbam: MNYIN_ACTIVE_TOTBAL_AM, mnatpam: MNYIN_ACTIVE_PRTBAL_AM, brncham: branch_am, wgdwl_ct: withdrawl_ct, sys_ct: system_ct, psy_id: PSYTE_CLUSTR_ID, dmcq_ct: DEM_CHQ_ACTIVE_CT, brnch_ct: branch_ct, Lr_Prob: probability from logistic transformation.

5. Conclusions

"Fuzzy Functions with LSE" is proposed for the development of FSM's for system representation and reasoning with them. The proposed fuzzy functions are distinct in structure identification and reasoning from: (1) the rule bases originally proposed by Zadeh and initially implemented by Mamdani, and their important variations proposed in the Sugeno-Yasukawa and Takagi-Sugeno models, and (2) fuzzy regression models originally proposed by Tanaka et al. [7] and its variations, as well as Hathaway and Bezdek [13].
In the first case, the structural difference is in terms of functional representation versus the rule base representation. In fuzzy functions, membership values and their transformations obtained from a fuzzy clustering algorithm enter into an augmented input matrix together with a logistic transformation of non-scalar original input variables, in addition to original scalar input variables, for function identification with the Least Squares estimation technique. Therefore, the reasoning
only requires the substitution of the membership values of a new input vector (observation), determined by an execution of a fuzzy clustering algorithm, together with the input variables of the new input vector into a number of fuzzy functions. The number of fuzzy functions is equal to the number of rules, i.e., fuzzy clusters, that is determined by, say, FCM. In fuzzy rule bases, by contrast, membership functions of fuzzy sets associated with the original input variables as well as the output variable must be estimated either by experts or by a fuzzy clustering technique for representation. Next, for approximate reasoning, the system analyst needs to identify combination operators known as t-norms and t-conorms and the associated implication operator, and also needs to implement fuzzification, Generalized Modus Ponens and defuzzification. Thus, for the representation and reasoning with fuzzy functions, application oriented mathematicians and statisticians: (1) do not need to deal with the membership function estimation of the fuzzy sets that make up a fuzzy rule base, after membership values are obtained either from experts or from a fuzzy clustering algorithm, nor to fuzzify a new input with respect to the estimated membership functions; (2) do not have to know anything about t-norms and t-conorms, or how to choose an appropriate t-norm, t-conorm and associated implication operator to identify a good approximate reasoning structure within the alternatives of Generalized Modus Ponens; and finally (3) do not need to choose a defuzzification technique. Instead, these application oriented system modelers need to know just a fuzzy clustering technique together with the least squares estimation technique. Finally, it is important to recall that in fuzzy function structure identification, we include interactive membership values and their transformations as new input variables. Thus we do not need to make an independence assumption for the input variables.
In the second case, the structural difference is in terms of the schema with which the original variables enter into the formation of a function. First, generally, there is only one fuzzy regression generated with the use of all data, whereas there are a number of fuzzy functions, equal to the number of clusters, i.e., rules. Secondly, in the fuzzy regression approach, original input variables are the only ones that enter into the right-hand side of the function to be estimated by the least squares method [7]. Alternately, fuzzy clustering results are used as switching functions in applications of fuzzy regression suggested by Hathaway and Bezdek [13]. In the "Fuzzy function-LSE" approach, by contrast, membership values and their transformations, together with a logistic transformation of the original non-scalar input variables, enter into an augmented input matrix in addition to the original scalar variables for the application of the least squares method. Thirdly, the coefficients of the fuzzy regression equations are assumed to be triangular membership functions, generally but not necessarily symmetric [7]. In "Fuzzy function-LSE", membership values and their transformations enter into the right-hand side of the equations as new variables.
To demonstrate the advantages of the proposed fuzzy functions, we have presented two real-life case studies. In these applications, we have compared the results of the fuzzy function approach only to the ordinary least squares estimation. Comparisons with other approaches could be made, but this is left for our future studies. In these comparisons, we have shown several models of fuzzy functions together with the best amongst them. It should be noted that the best model result is generally obtained by a more complicated augmented matrix. In practice, the simplest fuzzy functions, which are obtained with the augmented matrix that contains only the membership variable in addition to the original variables, might be good enough because of their operational simplicity and their closeness to the best model with respect to a performance measure. However, one generally obtains a better estimate when transformations of membership values are included in the augmented matrix.

References
[1] J.C. Bezdek, Fuzzy Mathematics in Pattern Classification, PhD Thesis, Applied Mathematics Centre, Cornell University, Ithaca, 1973.
[2] L.A. Zadeh, Fuzzy sets, Inf. Control 8 (1965) 338–353.
[3] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inf. Sci. 8 (1975) 199–249.
[4] E.H. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller, in: E.H. Mamdani, B.R. Gaines (Eds.), Fuzzy Reasoning and Its Applications, Academic Press, New York, 1981, pp. 311–323.
[5] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst. Man Cybern. SMC-15 (1) (1985) 116–132.
[6] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets Syst. 24 (1991) 363–375.
[7] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Syst. Man Cybern. SMC-2 (1982) 903–907.
[8] H. Tanaka, H. Ishibuchi, S. Yoshikawa, Exponential possibility regression analysis, Fuzzy Sets Syst. 69 (1995) 305–318.
[9] H. Tanaka, H. Ishibuchi, Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters, Fuzzy Sets Syst. 41 (1991) 145–160.
[10] A. Celmins, Least squares model fitting to fuzzy vector data, Fuzzy Sets Syst. 22 (1987) 245–269.
[11] A. Celmins, Multidimensional least squares model fitting of fuzzy models, Math. Model. 9 (1987) 669–690.
[12] D. Savic, W. Pedrycz, Evolution of fuzzy linear regression models, Fuzzy Sets Syst. 39 (1991) 51–63.
[13] R.J. Hathaway, J.C. Bezdek, Switching regression models and fuzzy clustering, IEEE Trans. Fuzzy Syst. 1 (3) (1993) 195–203.
[14] I.B. Türkşen, Interval valued fuzzy sets based on normal forms, Fuzzy Sets Syst. 20 (1986) 191–210.
[15] I.B. Türkşen, Type 2 representation and reasoning for CWW, Fuzzy Sets Syst. 127 (2002) 17–36.
[16] M.R. Delgado, A.F. Gomez-Skarmeta, F. Martin, Rapid prototyping of fuzzy models, in: H. Hellendoorn, D. Driankov (Eds.), Fuzzy Model Identification: Selected Approaches, Springer, Berlin, Germany, 1997, pp. 53–90.
[17] R. Babuska, H.B. Verbruggen, Constructing fuzzy models by product space clustering, in: H. Hellendoorn, D. Driankov (Eds.), Fuzzy Model Identification: Selected Approaches, Springer, Berlin, Germany, 1997, pp. 53–90.
[18] Ö. Uncu, I.B. Türkşen, A novel fuzzy system modeling approach: multi-dimensional structure identification and inference, in: Proceedings of the Tenth IEEE International Conference on Fuzzy Systems, Melbourne, Australia, December 2001, pp. 557–562.
[19] M. Demirci, Fuzzy functions and their fundamental properties, Fuzzy Sets Syst. 106 (1999) 239–246.
[20] M. Demirci, Foundations of fuzzy functions and vague algebra based on many-valued equivalence relations. Part I. Fuzzy functions and their applications I, J. Gen. Syst. 32 (2003) 123–155.
[21] M. Demirci, J. Recasens, Fuzzy groups, fuzzy functions and fuzzy equivalence relations, Fuzzy Sets Syst. 144 (2004) 441–458.
[22] U. Höhle, Quotients with respect to similarity relations, Fuzzy Sets Syst. 27 (1988) 31–44.