
Normalization and Lossless Join Decomposition of Similarity-Based Fuzzy Relational Databases

Özgün Bahar, Adnan Yazıcı*
Department of Computer Engineering, Middle East Technical University,
06531, Ankara, Turkey
Fuzzy relational database models generalize the classical relational database model by allowing uncertain and imprecise information to be represented and manipulated. In this article, we introduce fuzzy extensions of the normal forms for the similarity-based fuzzy relational database model. Within this framework of fuzzy data representation, similarity, conformance of tuples, the concept of fuzzy functional dependencies, and partial fuzzy functional dependencies are utilized to define the fuzzy key notion, transitive closures, and the fuzzy normal forms. Algorithms for dependency preserving and lossless join decompositions of fuzzy relations are also given. We include examples to show how normalization, dependency preserving, and lossless join decomposition based on the fuzzy functional dependencies of fuzzy relations are done and applied to some real-life applications. © 2004 Wiley Periodicals, Inc.
1. INTRODUCTION
The relational data model proposed by Codd [1] is based on set-theoretic concepts and enables a well-defined, unambiguous, and exact representation of the data of an application. However, in many real-world applications, such as biology and genetics, geographical information systems, economic and weather forecasting systems, and so on, data is often partially known or imprecise, and queries may include vague terms. To cope with various types of imperfectness and to capture more of the meaning of the data in databases, several extensions to the classical relational database model have been proposed in the literature [1–8]. Properly formulating a database model in terms of relation schemas is a key requirement in fuzzy database design. The main frameworks for fuzzy data representation based on fuzzy set theory [9] allow imprecise data for the attribute values and may be categorized into the partial membership-based approach [5,6], the similarity-based approach [10], the possibility-based approach [11], and the extended possibility-based approach [12]. The similarity-based framework is the approach used in this study.

*Author to whom all correspondence should be addressed: e-mail: yazici@ceng.metu.edu.tr.
e-mail: ozgun.bahar@isbank.com.tr.
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, VOL. 19, 885–917 (2004)
© 2004 Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/int.20029

One of the primary purposes of any database is to decrease data redundancy and to provide data reliability [13,14]. Data redundancies and update anomalies have also been of great concern in fuzzy relational database design [2,6,15,16], and integrity constraints play an important role in fuzzy relational database design theory. Various types of data dependencies, such as functional and multivalued dependencies, are used as guidelines for the design of classical relational schemas that are conceptually meaningful and free of certain anomalies. For example, if one attribute determines another, we say that there exists a functional dependency between these attributes. This determination is unique in a classical (crisp) relational model, whereas it need not be in a fuzzy relational database model. In a crisp database model, functional and multivalued dependencies are precise determinants, and this is not the case for some real-world applications [17]. As the relational data model is extended to deal with fuzzy data, integrity constraints have also been extended, and, in the literature, there are a number of ways to impose fuzzy data dependencies on fuzzy data in fuzzy database relations [6,11,15,16,18–20,21]. The following is an example of fuzzy functional dependencies (ffds).
One of the areas in which fuzziness may be used is business and finance applications. To evaluate the creditworthiness of a customer, multiple financial and personal factors are used. Economic thinking and social integrity are two of these personal factors in the creditworthiness assessment for consumer credit. "Economic thinking and conformity with social and economic standards more or less determine the business behaviour" is a valid constraint in this application. In this example, business behaviour, economic thinking, and social integrity are all attributes of a person with inexact values. The "more or less" part in the example causes the constraint itself to be fuzzy. The dependency does not determine the precise level of determinacy, but the minimum level. The data dependency in this example application is an ffd, and such a dependency cannot be enforced by a crisp relational database system.
There have been a number of studies on extending data dependencies for fuzzy relational database models. Among these, Raju and Majumdar [6] have proposed ffds in terms of a membership function of the elements of a fuzzy relation. Chen et al. [2] have given a definition of ffds in terms of closeness measures for the equality of possibility distributions and fuzzy implication operators. Shenoi et al. [15] have extended Buckles and Petry's approach [10] by defining ffds based on equivalence classes from finite domain partitions alone. Liu [16] has defined ffds based on the concept of semantic proximity in [0,1] between two fuzzy attribute values $v_1$ and $v_2$, which are intervals. Yazıcı [8] and Yazıcı and Sözat [17,20] have defined ffds between two fuzzy attribute values and proved the soundness and completeness of the inference rules of those ffds. The studies related to the normalization process, that is, analyzing the given relation schemas to achieve the desirable properties of minimizing redundancy and minimizing the insertion, deletion, and update anomalies, take the fuzzy data and ffds into account [2,6,15]. In these studies, the main goal is that a fuzzy relation that is not in a certain normal form is decomposed into multiple fuzzy relation schemas of the desired normal forms. Dependency preserving and lossless join decompositions are used to achieve desirable decompositions. Chen et al. [22,23] and Raju and Majumdar [6] have studied such decompositions for fuzzy relational databases.
Our study differs from the previous research efforts in literature in a number
of aspects. First of all, the similarity-based fuzzy relational database model is used
as the reference model in our study. We deal with a number of issues to design the
similarity-based fuzzy relational databases in order to reduce data redundancy and
eliminate update anomalies. The formal definitions of ffd and partial ffd are given
based on the conformance of tuples. In addition, the fuzzy key concept and the transitive closure of the ffds are presented for the definitions of the fuzzy normal forms.
Second, we introduce a number of fuzzy normal forms based on the ffds. We first
define the fuzzy first normal form (1NF). Afterward, the fuzzy second (2NF), fuzzy
third (3NF), and fuzzy Boyce–Codd (BCNF) normal forms are introduced. Third,
the dependency preserving and lossless join properties in decompositions into the
fuzzy normal forms with respect to ffds are defined. Finally, all these concepts are
described along with examples to demonstrate how these concepts are used in
some real-life applications.
The article is organized as follows: The following section discusses some
background information, including fuzzy relational databases, similarity relations,
and similarity-based fuzzy relational databases. In Section 3, fuzzy functional
dependencies (ffds), tuple conformance, inference rules for ffds, four fuzzy nor-
mal forms, namely, fuzzy 1NF, fuzzy 2NF, fuzzy 3NF, and fuzzy BCNF forms,
and their decomposition algorithms are provided. In this section two testing algo-
rithms for the dependency preserving and lossless join properties of the decompo-
sitions are given. We also include a real-life application, fraud detection, to show
how normalization and dependency preserving and lossless join properties of the
similarity-based fuzzy relational databases are utilized. Finally, the conclusion is
given in Section 4.
2. BACKGROUND
In this section, we first define fuzzy relational databases. Then, the similarity relations are described as defined by Zadeh [9], and the similarity-based fuzzy database model, as the reference model of this study, is briefly explained.
2.1. Fuzzy Relational Databases
The relational data model uses the single concept of a relation both for data representation and data association, and it is supported by set theory. In this model, every value in a relation must be atomic. Except for the null value, every attribute must have a precise value and cannot have fuzzy or uncertain values.
Several approaches have been proposed for extending the classical relational database model to a fuzzy relational database model. Fuzzy relational databases are databases that can represent fuzzy and uncertain data. An extensive list of references to the relevant literature can be found in Refs. 5 and 24. The main contributions in this area are as follows. In the fuzzy relational data model proposed by Umano and Freedom [7], fuzzy data are represented by possibility distributions, and a grade of membership is used to represent the association between values. This grade of membership may itself be a possibility distribution. Buckles and Petry [10] introduced fuzzy similarity relations. These fuzzy similarity relations facilitate the estimation of the extent to which possible values of an attribute can be regarded as interchangeable. Prade and Testemale [11] generalized the representation of Umano by introducing an extra element, e, for the situations where a nonzero possibility can mean the nonapplicability of an attribute. They also proposed the use of possibility distributions to represent fuzzy values as well as uncertainty concerning the value of an attribute. To handle incomplete information and missing and nonapplicable values, Imielinski and Lipski [3] proposed a method where incomplete information is represented as a list of possible values. Lipski does not assume that null means a value that is completely unknown.
Two different causes of imprecise attribute values in database systems motivated two approaches for representing fuzzy data. The similarity-based approach [10] uses linguistic terms to describe attribute values. The impreciseness of these terms is characterized by a similarity matrix, which records the degree of similarity between the pairs of linguistic terms in a domain. The possibility-based model is an alternative approach for representing imprecise data using a possibility distribution as the value of an attribute. The possibility measure and the necessity measure are the two kinds of matching degrees calculated for this approach. There have also been some mixed models combining these approaches [12,19].
2.2. Similarity Relations
The identity relation used in nonfuzzy relational databases induces equivalence classes over a domain base set, $D_j$, which affects the result of certain operations and the removal of redundant tuples. The equivalence classes are most frequently singleton sets. The identity relation is a special case of the similarity relation.
Similarity relations are useful for describing how similar two elements from the same domain are. A similarity relation [9,10], $s(x, y)$, for a given domain $D_j$, is a mapping of every pair of elements in the domain onto the unit interval [0,1]. A similarity relation is reflexive and symmetric, as is an equivalence relation. The similarity relation should also have the transitive property. These three properties of a similarity relation are stated below.
Definition. A similarity relation is a mapping, $s: D \times D \to [0, 1]$, such that for $x, y, z \in D$,
\[ s(x, x) = 1 \quad \text{(reflexivity)}, \]
\[ s(x, y) = s(y, x) \quad \text{(symmetry)}, \]
\[ s(x, z) \ge \max_{y \in D} \bigl( \min( s(x, y), s(y, z) ) \bigr) \quad \text{(max-min transitivity)}. \]
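The three properties in the definition can be checked mechanically for a finite domain. The following is a minimal sketch, not part of the original article: the function name `is_similarity_relation`, the nested-dictionary layout, and the three-term domain with its values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def is_similarity_relation(s, domain, tol=1e-9):
    """Check reflexivity, symmetry, and max-min transitivity of s on a finite domain.

    s is a nested dict (an assumed layout): s[x][y] is the similarity of x and y.
    """
    reflexive = all(abs(s[x][x] - 1.0) <= tol for x in domain)
    symmetric = all(
        abs(s[x][y] - s[y][x]) <= tol for x, y in product(domain, repeat=2)
    )
    # s(x, z) >= max over y of min(s(x, y), s(y, z))
    transitive = all(
        s[x][z] + tol >= max(min(s[x][y], s[y][z]) for y in domain)
        for x, z in product(domain, repeat=2)
    )
    return reflexive and symmetric and transitive


# Hypothetical domain and similarity values, for illustration only.
domain = ["low", "medium", "high"]
s = {
    "low": {"low": 1.0, "medium": 0.7, "high": 0.4},
    "medium": {"low": 0.7, "medium": 1.0, "high": 0.4},
    "high": {"low": 0.4, "medium": 0.4, "high": 1.0},
}
print(is_similarity_relation(s, domain))
```

Lowering only the low/high entry (for example to 0.1) would break max-min transitivity, because the path through "medium" then yields a larger value than the direct similarity.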
2.3. Similarity-Based Fuzzy Relational Databases
The similarity-based fuzzy relational model is not an extension to the original
relational model, but actually a generalization of it. It allows a set of values for an
attribute rather than only atomic values, and replaces the identity concept with a
similarity concept.
The similarity-based relational model allows a set of values for a single
attribute provided that all the values are from the same domain. Thus, while allow-
ing multiple values, similarity-based relational model keeps the strongly typed
attribute value property of the original model. This property is useful for query
processing and update operations. If the attribute value is precise and crisp, then the value is atomic; if it is imprecise and inexact, then a set of values that are similar to this value is stated in place of it. The level of similarity among the values is defined by the explicitly declared similarity relation for the domain of the attribute values.
The original model compares two attribute values by checking whether the two values are equal or not. The identity relation reflects this fact: $i(x, y) = 1$ if and only if $x = y$, and $i(x, y) = 0$ otherwise. The similarity-based relational model [10] compares two attributes by measuring the closeness of the values in terms of the explicitly declared similarity relation of the attribute domain. A tuple in this model is called redundant if it can be merged with another through the set union of corresponding domain values.
3. FUZZY NORMAL FORMS FOR FUZZY RELATIONS
In a logical database design, integrity constraints have a critical role. One of
the most important integrity constraints is the functional dependency. Functional
dependencies reflect a kind of semantic knowledge about the relationships between
the attributes. They help the database designer remove some of the redundant information in the relations. To provide guidance for good fuzzy database design, several fuzzy normal forms based on fuzzy functional dependencies have been proposed.
3.1. Fuzzy Functional Dependencies
Fuzzy functional dependencies reflect some kind of semantic knowledge about
attribute subsets of the real world. Ffds are used to design fuzzy databases where
data redundancy and update anomalies are reduced.
In the classical relational data model, a functional dependency $X \to Y$ states that equal Y values correspond to equal X values. However, the definition of functional dependency is not directly applicable to similarity-based fuzzy databases, because the concept of equality does not totally apply to fuzzy relational database models. In a fuzzy relational data model, the degree of "X determines Y" may not necessarily be 1 as in the crisp case. Naturally, a value ranging over the interval [0,1] may be accepted. Then the definition of ffd turns into "similar Y values correspond to similar X values."
Ffds are functional constraints that are specified among the attributes of a fuzzy relation schema. In the definition of the ffds, we use the conformance concept [8,17,20]. According to the definition of conformance, a tuple is similar to itself independent of its attribute values, the uncertainty is kept even in the presence of ffds imposed on the relation, and this definition of conformance is transitive, symmetric, and reflexive. For precise ffds, the similarity of Y values has to be greater than or equal to the similarity of X values, where similarity is measured in terms of conformance. For imprecise ffds, the impreciseness of the dependency is a threshold on the similarity of Y values, weakening the dependency. Using the definition of ffd, we have defined the partial ffd, to be used in the definition of fuzzy 2NF.
3.1.1. Conformance of Tuples
An ffd can be represented as $X \to_{q} Y$, where q is the linguistic strength (like "more or less," "sometimes," etc.). An ffd, $X \to_{q} Y$, states that similar Y values correspond to similar X values. Here similarity (or closeness) refers to conformance of tuples. The similarities of the attribute values define how conformant the two tuples are on that attribute. A formal definition of conformance [7] is given below.
Definition. The conformance of attribute $A_k$ defined on domain $D_k$ for any two tuples $t_1$ and $t_2$ present in relation instance r, denoted by $C(A_k[t_1, t_2])$, is given as
\[ C(A_k[t_1, t_2]) = \min\Bigl\{ \min_{x \in d_1} \bigl\{ \max_{y \in d_2} \{ s(x, y) \} \bigr\},\ \min_{x \in d_2} \bigl\{ \max_{y \in d_1} \{ s(x, y) \} \bigr\} \Bigr\}, \]
where $d_1$ is the value set of attribute $A_k$ for tuple $t_1$, $d_2$ is the value set of attribute $A_k$ for tuple $t_2$, $s(x, y)$ is a similarity relation for values x and y, and s is a mapping of every pair of elements in the domain $D_k$ onto the interval [0, 1].
In the case of an ordinary relational data model, both $d_1$ and $d_2$ have to be singleton sets, and the similarity of any two tuples can have the value of either 0 or 1. Here, the identity relation is replaced by the explicitly declared $s(x, y)$, of which the identity relation is a special case. To describe the closeness of two tuples on a set of attributes rather than on a single attribute, the definition of conformance is extended in Ref. 8 as follows.
Definition. The conformance of attribute set X for any two tuples $t_1$ and $t_2$ present in relation instance r, denoted by $C(X[t_1, t_2])$, is given as
\[ C(X[t_1, t_2]) = \min_{A_k \in X} \{ C(A_k[t_1, t_2]) \}. \]
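The two conformance definitions translate almost directly into code. The sketch below is illustrative only: the function names, the representation of tuples as dictionaries of value sets, and the SIZE domain with its similarity values are assumptions that do not come from the article.

```python
def conformance(d1, d2, s):
    """C(A_k[t1, t2]) for value sets d1 and d2 under a similarity function s(x, y)."""
    forward = min(max(s(x, y) for y in d2) for x in d1)
    backward = min(max(s(x, y) for y in d1) for x in d2)
    return min(forward, backward)


def set_conformance(t1, t2, attrs, sims):
    """C(X[t1, t2]): the minimum single-attribute conformance over attribute set X."""
    return min(conformance(t1[a], t2[a], sims[a]) for a in attrs)


def identity(x, y):
    """Crisp special case: similarity is 1 for equal values and 0 otherwise."""
    return 1.0 if x == y else 0.0


# Hypothetical SIZE domain; the similarity values are illustrative.
_SIZE = {
    frozenset(["tiny", "small"]): 0.8,
    frozenset(["tiny", "large"]): 0.1,
    frozenset(["small", "large"]): 0.1,
}


def size_sim(x, y):
    return 1.0 if x == y else _SIZE[frozenset([x, y])]


sims = {"ID": identity, "SIZE": size_sim}
t1 = {"ID": {"a"}, "SIZE": {"small", "tiny"}}
t2 = {"ID": {"a"}, "SIZE": {"small"}}

print(conformance(t1["SIZE"], t2["SIZE"], size_sim))  # 0.8
print(set_conformance(t1, t2, ["ID", "SIZE"], sims))  # 0.8
```

With singleton value sets and the identity function, `conformance` returns only 0 or 1, which matches the crisp relational case.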
3.1.2. Definition of Fuzzy Functional Dependencies
A formal definition for the ffd can be given as follows.
Definition. Let r be any fuzzy relation instance on schema $R(A_1, \ldots, A_n)$, U be the universal set of attributes $A_1, \ldots, A_n$, and both X and Y be subsets of U. Fuzzy relation instance r is said to satisfy the ffd, $X \to_{q} Y$, if for every pair of tuples $t_1$ and $t_2$ in r,
\[ C(Y[t_1, t_2]) \ge \min(q, C(X[t_1, t_2])), \]
where q is a real number within the range [0, 1], describing the linguistic strength.
As for their crisp counterparts, the ffds should also be checked whenever
tuples are inserted into the fuzzy relational database or they are modified, so that
the integrity constraints imposed by these ffds are not violated.
Example 1. Consider a fuzzy relation instance Person = (Name, Performance, Earning). The similarity relations of the attribute domains are given in Tables I–III.
The integrity constraint for the Person relation is "Performance of the employee more or less determines his/her earning." That is, the ffd of this relation is $\mathrm{PERFORMANCE} \to_{0.6} \mathrm{EARNING}$, where 0.6 is the linguistic strength, "more or less." This ffd should be checked whenever new tuples are to be inserted, to see whether the new tuple violates the ffd. Below, a couple of tuples are inserted to investigate the tuple conformance concept.
Step 1: Insertion of the first tuple
⟨{Kelly}, {poor, very poor}, {little}⟩
Since this is the first tuple, it does not violate the ffd.
Step 2: Insertion of the second tuple
⟨{Matthew}, {average}, {moderate, average}⟩
Table I. Similarity relation for attribute NAME.
NAME       Kelly   Jerry   Matthew   Sandra
Kelly      1       0       0         0
Jerry      0       1       0         0
Matthew    0       0       1         0
Sandra     0       0       0         1

Table II. Similarity relation for attribute PERFORMANCE.
PERFORMANCE   Very poor   Poor   Average   Good   Excellent
Very poor     1           0.75   0.3       0.3    0.3
Poor          0.75        1      0.3       0.3    0.3
Average       0.3         0.3    1         0.6    0.6
Good          0.3         0.3    0.6       1      0.65
Excellent     0.3         0.3    0.6       0.65   1

The conformance values of the left- and right-hand side attributes of the ffd are
\[ C(\mathrm{Perf}[t_1, t_2]) = 0.3, \quad C(\mathrm{Earn}[t_1, t_2]) = 0.2. \]
Here, the ffd $\mathrm{Performance} \to_{0.6} \mathrm{Earning}$ is violated because $C(\mathrm{Earn}[t_1, t_2]) < \min(0.6, C(\mathrm{Perf}[t_1, t_2]))$, that is, 0.2 < 0.3, so the tuple is not inserted.
Step 3: Insertion of the third tuple
⟨{Jerry}, {average, good}, {moderate}⟩
There is only one tuple to be dealt with for the conformance check, because the tuple in step 2 is not inserted.
\[ C(\mathrm{Perf}[t_1, t_2]) = 0.3, \quad C(\mathrm{Earn}[t_1, t_2]) = 0.8 \]
Then the ffd $\mathrm{Performance} \to_{0.6} \mathrm{Earning}$ is not violated because $C(\mathrm{Earn}[t_1, t_2]) \ge \min(0.6, C(\mathrm{Perf}[t_1, t_2]))$, so the tuple is inserted. Now, we have two tuples in our relation:
$t_1$: ⟨{Kelly}, {poor, very poor}, {little}⟩
$t_2$: ⟨{Jerry}, {average, good}, {moderate}⟩
Step 4: Insertion of the fourth tuple
⟨{Sandra}, {average}, {little}⟩
There are two tuples to be dealt with for the conformance check, because the tuple in step 2 is not inserted.
\[ C(\mathrm{Perf}[t_1, t_3]) = 0.3, \quad C(\mathrm{Earn}[t_1, t_3]) = 1, \]
\[ C(\mathrm{Perf}[t_2, t_3]) = 0.6, \quad C(\mathrm{Earn}[t_2, t_3]) = 0.8 \]
Then the ffd $\mathrm{Performance} \to_{0.6} \mathrm{Earning}$ is not violated because both
\[ C(\mathrm{Earn}[t_1, t_3]) \ge \min(0.6, C(\mathrm{Perf}[t_1, t_3])) \]
and
\[ C(\mathrm{Earn}[t_2, t_3]) \ge \min(0.6, C(\mathrm{Perf}[t_2, t_3])) \]
Table III. Similarity relation for attribute EARNING.
EARNING     Little   Moderate   Average   High   Very high
Little      1        0.8        0.2       0.2    0.2
Moderate    0.8      1          0.2       0.2    0.2
Average     0.2      0.2        1         0.6    0.6
High        0.2      0.2        0.6       1      0.8
Very high   0.2      0.2        0.6       0.8    1

so the tuple is inserted. Thus, we have three tuples in the relation:
$t_1$: ⟨{Kelly}, {poor, very poor}, {little}⟩
$t_2$: ⟨{Jerry}, {average, good}, {moderate}⟩
$t_3$: ⟨{Sandra}, {average}, {little}⟩
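The four insertion steps of Example 1 can be replayed in a few lines. This is a sketch under stated assumptions: the similarity values are copied from Tables II and III and the threshold 0.6 from the example, while the helper names (`make_sim`, `conformance`, `try_insert`) and the dictionary layout of a tuple are hypothetical.

```python
PERF_LEVELS = ["very poor", "poor", "average", "good", "excellent"]
PERF = [  # Table II
    [1, 0.75, 0.3, 0.3, 0.3],
    [0.75, 1, 0.3, 0.3, 0.3],
    [0.3, 0.3, 1, 0.6, 0.6],
    [0.3, 0.3, 0.6, 1, 0.65],
    [0.3, 0.3, 0.6, 0.65, 1],
]
EARN_LEVELS = ["little", "moderate", "average", "high", "very high"]
EARN = [  # Table III
    [1, 0.8, 0.2, 0.2, 0.2],
    [0.8, 1, 0.2, 0.2, 0.2],
    [0.2, 0.2, 1, 0.6, 0.6],
    [0.2, 0.2, 0.6, 1, 0.8],
    [0.2, 0.2, 0.6, 0.8, 1],
]


def make_sim(levels, matrix):
    """Turn a similarity matrix into a lookup function s(x, y)."""
    index = {name: i for i, name in enumerate(levels)}
    return lambda x, y: matrix[index[x]][index[y]]


def conformance(d1, d2, s):
    """Conformance of two value sets, as in the definition of C(A_k[t1, t2])."""
    forward = min(max(s(x, y) for y in d2) for x in d1)
    backward = min(max(s(x, y) for y in d1) for x in d2)
    return min(forward, backward)


perf_sim = make_sim(PERF_LEVELS, PERF)
earn_sim = make_sim(EARN_LEVELS, EARN)


def try_insert(relation, new, q=0.6):
    """Insert `new` only if PERFORMANCE ->_q EARNING holds against every stored tuple."""
    for t in relation:
        c_perf = conformance(t["perf"], new["perf"], perf_sim)
        c_earn = conformance(t["earn"], new["earn"], earn_sim)
        if c_earn < min(q, c_perf):
            return False
    relation.append(new)
    return True


candidates = [
    {"name": "Kelly", "perf": {"poor", "very poor"}, "earn": {"little"}},
    {"name": "Matthew", "perf": {"average"}, "earn": {"moderate", "average"}},
    {"name": "Jerry", "perf": {"average", "good"}, "earn": {"moderate"}},
    {"name": "Sandra", "perf": {"average"}, "earn": {"little"}},
]
relation = []
results = [try_insert(relation, c) for c in candidates]
print(results)  # [True, False, True, True]
print([t["name"] for t in relation])  # ['Kelly', 'Jerry', 'Sandra']
```

The rejected tuple is the one from step 2, for which the EARNING conformance 0.2 falls below min(0.6, 0.3).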
3.1.2.1. Partial Fuzzy Functional Dependencies. Using the definition of the ffd, we can define a partial ffd, which is used in the definition of the fuzzy 2NF.
Definition. Y is called partially fuzzy functionally dependent on X to the degree q, $X \to_{q} Y$ partially, if and only if $X \to_{q} Y$ and there exists an $X' \subset X$, $X' \ne \emptyset$, such that $X' \to_{a} Y$ where $a \ge q$.
In more relaxed terms, an ffd $X \to_{q} Y$ is a partial ffd if the dependency still holds after the removal of an attribute A from X. That is, for an attribute $A \in X$, $X - \{A\}$ still fuzzy functionally determines Y to the degree $a \ge q$.
Example 2. Let the relational schema be $R = (A, B, C)$ and the ffds be $AB \to_{0.8} C$ and $A \to_{0.9} C$. After removing attribute B from the first ffd, the dependency still holds; hence $AB \to_{0.8} C$ is a partial ffd.
3.1.3. Inference Rules for Fuzzy Functional Dependencies
An important concept related to data dependencies is the concept of infer-
ence rules. Given a set of dependencies, inference rules introduce other dependen-
cies that are logical consequences of the given dependencies. These rules are
dependency generators and so they are closely related to the definition and seman-
tics of the dependencies.
The fuzzy inference rules are listed below for ffds. They reduce to those of
the classic fds. The inference rules presented below have already been shown to be
sound and complete in Ref. 17.
(1) Inclusive rule for ffds:
If $X \to_{u_1} Y$ holds and $u_1 \ge u_2$, then $X \to_{u_2} Y$ holds.
(2) Reflexive rule for ffds:
If $X \supseteq Y$, then $X \to_{u} Y$ holds for all $u \in [0, 1]$.
(3) Augmentation rule for ffds:
Whenever r satisfies $X \to_{u} Y$, it also satisfies $XZ \to_{u} YZ$.
(4) Transitivity rule for ffds:
Whenever r satisfies $X \to_{u_1} Y$ and $Y \to_{u_2} Z$, it also satisfies $X \to_{\min(u_1, u_2)} Z$.
By successive application of the above inference rules, additional inference rules for the ffds can be stated:
(1) Union rule for ffds:
Whenever $X \to_{u_1} Y$ and $X \to_{u_2} Z$ are satisfied by r, $X \to_{\min(u_1, u_2)} YZ$ is also satisfied.
(2) Pseudotransitivity rule for ffds:
Whenever r satisfies $X \to_{u_1} Y$ and $WY \to_{u_2} Z$, it also satisfies $WX \to_{\min(u_1, u_2)} Z$.
(3) Decomposition rule for ffds:
If $X \to_{u} Y$ holds and $Z \subseteq Y$, then $X \to_{u} Z$ holds.
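As an illustration of how the additional rules follow from the basic ones, the union rule can be derived with two augmentations and one application of transitivity. The derivation below is a sketch written for this purpose and is not reproduced from Ref. 17; it uses only the augmentation and transitivity rules as stated above.

```latex
\begin{align*}
&X \to_{u_1} Y            && \text{(given)} \\
&X \to_{u_1} XY           && \text{(augmentation with } X\text{, since } XX = X) \\
&X \to_{u_2} Z            && \text{(given)} \\
&XY \to_{u_2} YZ          && \text{(augmentation with } Y) \\
&X \to_{\min(u_1, u_2)} YZ && \text{(transitivity on lines 2 and 4)}
\end{align*}
```

The pseudotransitivity rule can be obtained in the same way: augmenting $X \to_{u_1} Y$ with W gives $WX \to_{u_1} WY$, and transitivity with $WY \to_{u_2} Z$ gives $WX \to_{\min(u_1, u_2)} Z$.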
3.2. Fuzzy Keys
Like its classical relational counterpart, fuzzy normal forms are based on the
concept of ffd and the concept of fuzzy key. Therefore, we define the fuzzy key
concept in this section.
A primary key is a special case of functional dependency in classical relational database models. The role of X in the functional dependency $X \to Y$ is played by the attributes in the key, and the set of all other attributes in the relation plays the role of Y. That is, a key K, a subset of U, of a relation schema R means that the values of U are determined by the K values for all tuples of any relation of R. In the classical relational data model, identical K values lead to identical U values. In the fuzzy relational data model, the concept of being identical again gives way to similarity (or closeness). The determination is reflected by the relationship that identical K values lead to identical U values, and close K values lead to close U values to a certain extent. In fuzzy relational databases, the classical primary key is extended to a "fuzzy key with strength q," where q is the extent mentioned before. A more formal definition can be given as follows.
Definition. Let $K, S \subseteq U$, and F be a set of ffds for R. K is called a fuzzy key of R with strength q if and only if $K \to_{q_i} U \in F$ and $K \to_{q_i} U$ is not a partial ffd, where $q = \min q_i$ and $q > 0$.
Example 3. As a symbolic example, let us have a relation R where $R = (A, B, C, D)$, with the ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$. Then A is called the fuzzy key of the relation with strength 0.7, because B values are determined by A to the degree 0.7, and C and D values are determined by A to the degree 0.9. Our $q_i$ values are $q_1 = 0.7$ and $q_2 = 0.9$, and the q value is then the minimum of {0.7, 0.9}, that is, 0.7.
A fuzzy key can take the same kinds of values as an ordinary attribute. It can have multiple values such as {a, b}, where a and b are similar to each other to a certain degree. The only restriction on the values of a fuzzy key, as on the values of other attributes, is that the values should not be AND-combined, as will be explained later.
3.2.1. Transitive Closure of Fuzzy Functional Dependencies
Given a set of ffds for a relation, the fuzzy key of that relation can be found utilizing the concept of transitive closure. Chen, Kerre, and Vandenbulcke [18] studied the ffd transitive closure and axiomatization of fuzzy functional dependence. Transitive closure comes into play when we want to know whether a given ffd can be derived using the ffd set F of a relation and the inference rules for ffds. However, it is not a simple task to compute the set of all ffds that are derived from F using the inference rules, because the set is infinite. Instead of computing this whole set, the algorithm below finds all attributes that are fuzzy functionally dependent on attribute(s) X, and the maximal degree to which the dependencies hold, namely the transitive closure of X.
Algorithm. Transitive Closure Computation Algorithm. Let X be a set of k attributes, $X = X_1 X_2 \ldots X_k$:
(1) Initially construct the closure list of X, XList, with the attributes in X with the maximal degree, 1, for each.
XList = {($X_1$, 1), ($X_2$, 1), ..., ($X_k$, 1)}
The domain, Dom, contains the attributes in the XList; $X_1, X_2, \ldots, X_k$ initially. BList is a temporary closure list, and is initialized as empty at the beginning.
(2) For each ffd $V \to_{a} W$ in F, if the left-hand side of the ffd is a subset of the domain, $V \subseteq \mathrm{Dom}$:
• Find the minimum strength in XList, among the elements of XList whose attributes are the elements of V, minstrength.
• Set f as the minimum of a and the strength found in the previous step, f = min(a, minstrength).
• For each attribute $W_j$ of the right-hand side W, add the entry ($W_j$, f) to the BList.
(3) Combine BList into XList using the fuzzy union operation.
(4) If there is a change in XList, reset the BList, adjust the domain, Dom, according to the new elements of XList, and go to step 2. Otherwise stop; XList is the transitive closure of X.
Example 4. If we consider the relation in Example 3, the relation R has the attribute set {A, B, C, D} and the ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$. Let us compute the transitive closure of attribute A.
Initially,
XList = {(A, 1)}, Dom = {A}, BList = ∅
For the first ffd, $A \to_{0.7} B$:
Minstrength = 1, f = min(1, 0.7) = 0.7
BList = {(B, 0.7)}
For the second ffd, $A \to_{0.9} CD$:
Minstrength = 1, f = min(1, 0.9) = 0.9
BList = {(B, 0.7), (C, 0.9), (D, 0.9)}
Combining BList into XList, XList = {(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)}. Because XList is changed, we reset BList and our new domain is Dom = {A, B, C, D}. Then the two ffds should again be considered in the same way. This time, however, there is no change in XList, and hence the transitive closure of A is {(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)}.
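The closure computation can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name, the encoding of an ffd as a `(lhs, rhs, strength)` triple, and the reading of "fuzzy union" as taking the maximum degree per attribute are assumptions made here.

```python
def transitive_closure(attrs, ffds):
    """Return {attribute: degree} for the attributes determined by `attrs`.

    ffds is a list of (lhs, rhs, strength) triples, where lhs and rhs are
    tuples of attribute names (an assumed encoding).
    """
    xlist = {a: 1.0 for a in attrs}  # step 1: each attribute of X with degree 1
    changed = True
    while changed:
        blist = {}  # temporary closure list, reset on every pass
        for lhs, rhs, strength in ffds:  # step 2
            if set(lhs) <= set(xlist):
                minstrength = min(xlist[a] for a in lhs)
                f = min(strength, minstrength)
                for w in rhs:
                    blist[w] = max(blist.get(w, 0.0), f)
        changed = False
        for attr, degree in blist.items():  # step 3: fuzzy union, taken as max
            if degree > xlist.get(attr, 0.0):
                xlist[attr] = degree
                changed = True  # step 4: repeat until XList is stable
    return xlist


# The ffds of Examples 3 and 4: A ->_0.7 B and A ->_0.9 CD.
ffds = [(("A",), ("B",), 0.7), (("A",), ("C", "D"), 0.9)]
print(transitive_closure(("A",), ffds))
# {'A': 1.0, 'B': 0.7, 'C': 0.9, 'D': 0.9}
```

The loop terminates because degrees only increase and can take only finitely many values (1 or a minimum of given strengths).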
3.2.2. Finding the Fuzzy Key of a Relation
To find the fuzzy key of a relation, the concept of transitive closure for ffds is used. The exhaustive way is to analyze the transitive closure of all the combinations of all of the attributes in the relation and check whether the transitive closures found include all the attributes. This means that the attribute combination determines all the attributes in the relation to the respective degrees in the closure list, and the minimum of these strength values would be the strength of the fuzzy key. However, there is no need to consider the transitive closures of all attributes, because for an attribute to be a part of a fuzzy key, it should belong to the left-hand side of some ffd, or it should not appear in any of the ffds of the relation. That is, to find a fuzzy key, the attributes that appear only on the right-hand sides of the ffds in the relation need not be considered in finding the transitive closures. Below is the algorithm to find the fuzzy keys of a given relation with a set of ffds F.
Algorithm. Fuzzy Key Finding Algorithm. Let F be the set of ffds of R:
(1) Find all the left-hand side attributes of ffds in F.
(2) Find the attributes not contained in any of the ffds of F.
(3) Get the union of the two sets found in the first two steps above into AttributeList.
(4) Beginning with the single attribute combinations, for all the ascending combinations of attributes in AttributeList (say comb for the combination):
• If comb contains a key found before, continue with another combination.
• Find the transitive closure of the comb.
• If the transitive closure found contains all the attributes of the relation, set a to the minimum of the strengths in the transitive closure, and add comb to the key list with the degree a.
With this algorithm, all the candidate keys can be found. The first control of the
fourth step in the algorithm ensures the full fuzzy functional dependence of the
attributes of the relation on the fuzzy key.
Example 5. Let us consider Example 3 again. To find all the fuzzy keys of the relation $R = (A, B, C, D)$ with ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$, we apply the algorithm above. The set of left-hand-side attributes of R is {A}. There is no attribute that is not contained in any of the ffds, so AttributeList = {A}. Because there is only one attribute in AttributeList, only one transitive closure set, that of attribute A, should be computed. The transitive closure of A is {(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)}. Because the transitive closure contains all the attributes in the relation, A is the fuzzy (candidate) key of the relation with strength 0.7, that is, the minimum of (1, 0.7, 0.9).
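The key-finding steps can be sketched on top of a closure routine. As before, this is an illustrative sketch: the names `closure` and `fuzzy_keys`, the `(lhs, rhs, strength)` encoding of an ffd, and the use of the maximum as fuzzy union are assumptions, not details confirmed by the article.

```python
from itertools import combinations


def closure(attrs, ffds):
    """Transitive closure of `attrs` as {attribute: degree} (fuzzy union taken as max)."""
    xlist = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, strength in ffds:
            if set(lhs) <= set(xlist):
                f = min(strength, min(xlist[a] for a in lhs))
                for w in rhs:
                    if f > xlist.get(w, 0.0):
                        xlist[w] = f
                        changed = True
    return xlist


def fuzzy_keys(attributes, ffds):
    """Return a list of (key, strength) pairs for the relation."""
    lhs_attrs = {a for lhs, _, _ in ffds for a in lhs}            # step 1
    mentioned = lhs_attrs | {a for _, rhs, _ in ffds for a in rhs}
    unmentioned = set(attributes) - mentioned                      # step 2
    attribute_list = sorted(lhs_attrs | unmentioned)               # step 3
    keys = []
    for size in range(1, len(attribute_list) + 1):                 # step 4
        for comb in combinations(attribute_list, size):
            if any(set(key) <= set(comb) for key, _ in keys):
                continue  # comb contains a key found before
            cl = closure(comb, ffds)
            if set(attributes) <= set(cl):
                keys.append((comb, min(cl.values())))
    return keys


attributes = ("A", "B", "C", "D")
ffds = [(("A",), ("B",), 0.7), (("A",), ("C", "D"), 0.9)]
print(fuzzy_keys(attributes, ffds))  # [(('A',), 0.7)]
```

Skipping any combination that contains an already found key is what keeps the reported keys minimal, which corresponds to the first check of step 4.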
3.2.3. Fuzzy Prime and Nonprime Attributes
To be able to state the condition for the fuzzy 2NF, it is also necessary to
define fuzzy prime and fuzzy nonprime attributes for a relation.
Definition. Let $A \in U$, $X \subseteq U$, and K be a fuzzy key set of R. A is called a fuzzy prime attribute if and only if $A \in K$; X is called fuzzy prime if and only if $X \subseteq K$. Those attributes that are not fuzzy prime are called fuzzy nonprime.
For an attribute to be a fuzzy prime attribute, it should be a part of at least one
of the fuzzy candidate keys of the relation. Similarly, for an attribute to be a fuzzy
nonprime attribute, it should not appear in any of the fuzzy candidate keys of the
relation. In Example 5, the attribute A is a prime attribute with a degree of 0.7.
3.3. Fuzzy First Normal Form
The first one of the classical normal forms that is extended and generalized
within the framework of similarity-based fuzzy relational model is the 1NF.
Definition. Let $D_k$ be the domain of attribute $A_k$. A relation schema R is said to be in fuzzy 1NF if and only if, for any relation r in R, none of the attributes has multivalued (AND-combined) values.
When a relation schema is not in fuzzy 1NF, the algorithm below can be used
to normalize the relation to be in fuzzy 1NF.
Algorithm. Fuzzy 1NF Decomposition Algorithm.
When the relation is not in fuzzy 1NF, remove the tuple whose attributes vio-
late fuzzy 1NF.
Place these attributes in separate tuples along with the other attributes to
achieve the fuzzy 1NF.
Example 6. Consider a relation schema R, and let its attributes be NAME, AGE,
and LANGUAGE-SPOKEN. A relation r of R consists of four tuples given as
$t_1$ = (Kelly, 35, English)
$t_2$ = (Jerry, [very young, young], {English, French})
$t_3$ = (Matthew, middle-aged, an oriental language)
$t_4$ = (Sandra, 60, German)
In r, $t_1$ means Kelly is 35 years old and speaks English, $t_2$ means that Jerry, quite
young, speaks English and French, $t_3$ means Matthew, who is middle-aged, speaks
Japanese, and $t_4$ means Sandra, aged 60, speaks German.
This schema does not satisfy fuzzy 1NF because of the second tuple. In this
tuple, Jerry speaks two languages, and this is an example of multivalued
(AND-combined) data. When we apply the algorithm to make the relation in fuzzy 1NF,
the tuples become
$t_1$ = (Kelly, 35, English)
$t_2 \Rightarrow t_5$ = (Jerry, [very young, young], English)
$t_6$ = (Jerry, [very young, young], French)
$t_3$ = (Matthew, middle-aged, Japanese)
$t_4$ = (Sandra, 60, German)
where the relation is now in fuzzy 1NF.
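The tuple splitting of Example 6 can be sketched as below. The representation is an assumption made only for this illustration: an AND-combined multivalued entry is modeled as a Python set, whereas an imprecise range such as [very young, young] is modeled as a tuple and treated as a single value. The function name `to_fuzzy_1nf` is hypothetical.

```python
def to_fuzzy_1nf(tuples):
    """Replace each tuple holding a set-valued entry by one tuple per member."""
    result = []
    for row in tuples:
        rows = [()]
        for value in row:
            if isinstance(value, (set, frozenset)):
                # AND-combined data: one output tuple per member of the set.
                rows = [r + (v,) for r in rows for v in sorted(value)]
            else:
                # Crisp or imprecise single value: copied unchanged.
                rows = [r + (value,) for r in rows]
        result.extend(rows)
    return result


r = [
    ("Kelly", 35, "English"),
    ("Jerry", ("very young", "young"), {"English", "French"}),
    ("Matthew", "middle-aged", "Japanese"),
    ("Sandra", 60, "German"),
]
normalized = to_fuzzy_1nf(r)
```

Only the tuple for Jerry is split; the other tuples pass through unchanged.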
3.4. Fuzzy Second Normal Form
The fuzzy second normal form, fuzzy 2NF, is based on the concept of the full
ffd. By using the concepts of fuzzy key and partial fuzzy functional dependence,
we can define the fuzzy 2NF.
Definition. Let F be the set of ffds for schema R and K be a fuzzy key of R with
strength q. R is called to be in fuzzy 2NF if and only if none of the fuzzy nonprime
attributes is partially fuzzy functionally dependent on the fuzzy key, K.
Example 7. Let us consider a symbolic example, where a relation schema is
$R(A, B, C, D)$, and the ffds are $AB \to_{0.8} D$ and $A \to_{0.9} C$. Then AB is
the fuzzy key with strength 0.8. Because a fuzzy nonprime attribute, C, is partially
fuzzy functionally dependent on the fuzzy key of R, AB, R is not in fuzzy 2NF.
3.4.1. Fuzzy Second Normal Form Control
Because the definition of fuzzy 2NF involves the control of partial ffd of
fuzzy nonprime attributes on the fuzzy key of R, an algorithm is used to control
partial fuzzy functional dependence and it is given below.
Algorithm. Partial Dependency Control Algorithm. Let the ffd to be investigated
for being partial be $X \to_{a} Y$.
(1) If the left-hand side of the ffd, X, contains a single attribute, the test need not be applied
at all; the ffd is not partial. Otherwise,
(2) Beginning with the single attribute combinations, for all the ascending combinations
of the attributes of X, except for the combination containing all the attributes;
• Find the transitive closure of the combination.
• If the transitive closure contains all the attributes of the right-hand side of the ffd, Y,
and the corresponding strengths are greater than or equal to a, then the ffd is partial.
The algorithm above is based on the fact that, if a proper subset of left-hand side
attributes of a ffd fuzzy functionally determines the right-hand side to a degree
greater than or equal to the strength of the ffd, then the ffd is partial.
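A minimal Python sketch of the partial dependency control is given below. It is illustrative only: the names `fuzzy_closure` and `is_partial` are hypothetical, attribute sets are written as strings of single-letter names, and the closure is assumed to combine strengths with the minimum.

```python
from itertools import combinations


def fuzzy_closure(attrs, ffds):
    """Return {attribute: strength} derivable from attrs (min-combination assumed)."""
    closure = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, q in ffds:
            if all(a in closure for a in lhs):
                strength = min([q] + [closure[a] for a in lhs])
                for b in rhs:
                    if strength > closure.get(b, 0.0):
                        closure[b] = strength
                        changed = True
    return closure


def is_partial(lhs, rhs, strength, ffds):
    """True if a proper subset of lhs determines rhs with at least the given strength."""
    lhs = sorted(lhs)
    if len(lhs) == 1:
        return False  # step (1): a single-attribute left-hand side is never partial
    # Step (2): ascending combinations, excluding the full left-hand side.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            closure = fuzzy_closure(subset, ffds)
            if all(closure.get(a, 0.0) >= strength for a in rhs):
                return True
    return False


# Example 7: R(A, B, C, D) with AB ->_0.8 D and A ->_0.9 C; AB is the key (0.8).
ffds = [("AB", "D", 0.8), ("A", "C", 0.9)]
c_is_partial = is_partial("AB", "C", 0.8, ffds)
d_is_partial = is_partial("AB", "D", 0.8, ffds)
```

For Example 7, the dependence of C on the key AB is reported as partial, because A alone reaches C with strength 0.9, while the dependence of D on AB is not.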
To understand whether a given relation is in its fuzzy 2NF, all the fuzzy non-
prime attributes of the relation should be checked to see whether they are partially
fuzzy functionally dependent on any of the fuzzy keys of the relation. The algo-
rithm below is developed for the fuzzy 2NF control for a given relation.
Algorithm. Fuzzy 2NF Control Algorithm. Let K be the set of fuzzy keys of relation R.
For each candidate key $K_i$ of the relation,
• If the fuzzy key contains a single attribute, it has already no partial ffd, continue with
another candidate key.
• For each nonprime attribute $A_j$ of the relation,
  • Let the ffd be $K_i \to_{a_i} A_j$, where $a_i$ is the strength of $K_i$.
  • Apply the partial dependency control algorithm to find out whether the ffd is a partial
  ffd. If so, stop, the relation is not in fuzzy 2NF.
3.4.2. Decomposition into Fuzzy Second Normal Form
If a relation schema is not in fuzzy 2NF, it can be normalized into a number of
smaller relations in fuzzy 2NF by the following algorithm.
Algorithm. Fuzzy 2NF Decomposition Algorithm: If the relation is not in fuzzy
2NF, using the Fuzzy 2NF control algorithm, find the partial fuzzy keys and their
dependent fuzzy nonprime attributes.
• Decompose and set up a new relation for each partial fuzzy key with its dependent
attributes.
• Extract the fuzzy nonprime attributes that are partially fuzzy functionally dependent on
any fuzzy key of the relation from the original relation and set up a new relation with the
remaining attributes.
Example 8. If we consider Example 7 again, the relation was R(A, B, C, D),
ffds were $AB \to_{0.8} D$ and $A \to_{0.9} C$, and AB was the fuzzy key of the relation with
strength 0.8. The second ffd $A \to_{0.9} C$ contains a part of the fuzzy key as its
left-hand side, so we have to decompose the relation. According to our algorithm,
the decomposition will be like R1(A, C) and R2(A, B, D) where A is the
fuzzy key of the first relation with strength 0.9 and AB is the fuzzy key of the
second relation with strength 0.8.
3.5. An Example Application: Leasing Risk Assessment
To automate the risk assessment evaluation for car leasing contracts, a fuzzy
enhanced score card system is developed. There are three different customer types:
private, self-employed, and corporate customers. For modeling private customers,
factors such as age, marital status, length of time at present address, and so forth
are used, that is, the attributes are generally crisp. On the other hand, corporate
customers have more input variables that are a bit more complicated and contain
fuzzy data. Attributes of the relation Leasing Risk Assessment are (Capital, Revenue,
Workforce, CompAge, LegalType, FinanBack, CompStruct, IlliquidRisk, CreditRating),
where
Capital: Company's capital basis
Revenue: Company's annual revenue
Workforce: Number of employees
CompAge: Age of the company
LegalType: Legal status of the company
FinanBack: Financial background evaluation
CompStruct: Company structure evaluation
IlliquidRisk: Evaluation of the risk of company becoming illiquid
CreditRating: Credit rating for the current leasing contract
with the ffds specified below:
FFD1: Company's capital basis and annual revenue generally determine its
financial background.
{Capital, Revenue} $\to_{0.8}$ FinanBack
FFD2: Number of employees, age of the company, and its legal status together
more or less determine the structure of the company.
{Workforce, CompAge, LegalType} $\to_{0.7}$ CompStruct
FFD3: Financial background and structure of the company mostly determine
the risk of the company becoming illiquid.
{FinanBack, CompStruct} $\to_{0.9}$ IlliquidRisk
FFD4: Evaluation of the risk of the company becoming illiquid more or less
determines the credit rating of the company.
IlliquidRisk $\to_{0.7}$ CreditRating
In this relation (Capital, Revenue, Workforce, CompAge, LegalType) is the fuzzy
key with strength 0.7. FFD1 contains a part of the fuzzy key as its left-hand side,
that is, FinanBack is partially fuzzy functionally dependent on the fuzzy key. Also
in FFD2, CompStruct is partially fuzzy functionally dependent on the fuzzy key.
So the relation is not in fuzzy 2NF; it should be decomposed. The decomposition
is as follows:
R1 (Capital, Revenue, FinanBack)
where Capital, Revenue is the fuzzy key with strength 0.8, and its ffd is
{Capital, Revenue} $\to_{0.8}$ FinanBack
R2 (Workforce, CompAge, LegalType, CompStruct)
where Workforce, CompAge, LegalType is the fuzzy key with strength 0.7, and its
ffd is
{Workforce, CompAge, LegalType} $\to_{0.7}$ CompStruct
We must also make sure to keep a relation with the remaining attributes, removing
FinanBack and CompStruct from the relation. So, we have the below third relation
with (Capital, Revenue, Workforce, CompAge, LegalType) being the fuzzy key
with strength 0.7. Then, the last relation is
R3 (Capital, Revenue, Workforce, CompAge, LegalType,
IlliquidRisk, CreditRating)
and the corresponding ffds are
{Capital, Revenue, Workforce, CompAge, LegalType} $\to_{0.9}$ IlliquidRisk
IlliquidRisk $\to_{0.7}$ CreditRating
At this point, all three relations, R1, R2, and R3 are in fuzzy 2NF.
3.6. Fuzzy Third Normal Form
The normalization process takes a relation schema through a series of tests to
certify whether it satisfies a certain normal form. The process proceeds in a top-
down fashion. In a database design satisfying the fuzzy 3NF, insertion, deletion,
and update anomalies will be minimum.
Definition. Let F be the set of ffds for R, and K be the fuzzy key of R with
strength q. R is said to be in fuzzy 3NF if and only if R is in fuzzy 2NF and for
any $X \to_{a} A$ in F where A is not in X, either X contains the fuzzy key or A is fuzzy
prime.
3.6.1. Fuzzy Third Normal Form Control
The definition of fuzzy 3NF can directly be used to control whether a given
relation is in fuzzy 3NF. All of the ffds should be checked against the conditions:
If the left-hand side attributes contain all the attributes of the right-hand side, that
ffd does not violate fuzzy 3NF. Similarly if the left-hand side contains any of the
fuzzy keys of the relation, fuzzy 3NF is not violated. And finally, if the right-hand
side attributes of the ffd are all fuzzy prime attributes, fuzzy 3NF is also not
violated. These are composed together in the algorithm below.
Algorithm. F3NF Control Algorithm. Let K be the fuzzy key set of relation R.
(1) For every ffd $X \to_{a} Y$ in the relation,
• If $X \supseteq Y$, fuzzy 3NF is not violated; otherwise,
• If $X \supseteq K_i$, for any $K_i \in K$, fuzzy 3NF is not violated; otherwise,
• Let P be the set of fuzzy prime attributes of R. If $Y \subseteq P$, fuzzy 3NF is also not
violated.
(2) If none of the above conditions are satisfied for at least one of the ffds in the relation,
the relation is not in fuzzy 3NF.
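The three tests of the control algorithm translate almost directly into set comparisons. The sketch below is illustrative only: the function names are hypothetical, attribute sets are written as strings of single-letter names, and the fuzzy 2NF precondition of the definition is not rechecked here. The sample schema assumes that the relation contains every attribute mentioned in its ffds.

```python
def violates_f3nf(lhs, rhs, keys):
    """Apply the three tests of step (1) to one ffd lhs -> rhs."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return False  # the left-hand side contains the right-hand side
    if any(set(k) <= lhs for k in keys):
        return False  # the left-hand side contains a fuzzy key
    prime = set().union(*(set(k) for k in keys))
    if rhs <= prime:
        return False  # every right-hand-side attribute is fuzzy prime
    return True


def in_fuzzy_3nf(ffds, keys):
    """Step (2): the relation fails as soon as one ffd meets none of the tests."""
    return not any(violates_f3nf(lhs, rhs, keys) for lhs, rhs, _ in ffds)


# Sample relation over attributes A..E with fuzzy key AB.
ffds = [("AB", "C", 0.9), ("AC", "D", 0.8), ("C", "E", 0.6)]
keys = ["AB"]
violations = [(lhs, rhs) for lhs, rhs, _ in ffds if violates_f3nf(lhs, rhs, keys)]
result = in_fuzzy_3nf(ffds, keys)
```

Here the first ffd passes because its left-hand side is the key, while the other two ffds are reported as violations.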
Example 9. For a symbolic example, let R(A, B, C, D, E) and the ffds be
$AB \to_{0.9} C$, $AC \to_{0.8} D$, and $C \to_{0.6} E$. The first ffd has the fuzzy key as its
left-hand side, so it does not violate the fuzzy 3NF. But the second and third ffds,
$AC \to_{0.8} D$ and $C \to_{0.6} E$, violate the fuzzy 3NF definition; their left-hand sides
do not contain the fuzzy key, AB, and D and E are not fuzzy prime. Then R is not in
fuzzy 3NF.
3.6.2. Decomposition into Fuzzy Third Normal Form
The normalization process based on ffds uses a number of decompositions
while normalizing the relations. But normal forms do not always guarantee a good
database design. Generally it is not sufficient to only check that each relation schema
in the database is in one of the fuzzy normal forms, fuzzy 3NF, or in fuzzy
Boyce-Codd Normal Form (BCNF). The normalization process should also confirm the
existence of two additional and desirable properties, the dependency preservation
property and the lossless join property. The decomposition algorithms having these
properties will be given in the following sections.
3.6.2.1. Minimal Cover. In the next two sections, two algorithms are given
both for the dependency preserving and lossless join decompositions. But for the
decompositions to possess the two desired properties, the initial ffd set should be a
minimal cover and it should be free of partial ffds. A minimal cover of a set of
dependencies F is a set of dependencies that is equivalent to F with no redundancies.
A set of ffds F is minimal if the following conditions hold: (1) every dependency
in F has a single attribute for its right-hand side, (2) we cannot replace any
$X \to_{u} A$ with $Y \to_{a} A$ where Y is a proper subset of X and $a \ge u$ and still have a
set of ffds equivalent to F, and (3) we cannot remove any dependency from F and
still have a set of ffds equivalent to F.
Partial ffd free means that there is no partial ffd in the set of ffds. The algorithm
below finds the minimal cover of a given ffd set and makes it partial ffd free.
Algorithm. Minimal Cover Algorithm: Let F be the set of ffds, and assign F to
G, G := F.
(1) Replace each ffd $X \to_{q_i} \{A_1, A_2, \ldots, A_n\}$ in G by n ffds $X \to_{q_i} A_1$,
$X \to_{q_i} A_2$, ..., $X \to_{q_i} A_n$.
(2) For each ffd $X \to_{q_i} A_k$ in G
    For each attribute $B \in X$
        If $((G - \{X \to_{q_i} A_k\}) \cup \{(X - \{B\}) \to_{a} A_k\})$ where $a \ge q_i$ is equivalent to G
        Then replace $X \to_{q_i} A_k$ with $(X - \{B\}) \to_{a} A_k$ in G.
(3) For each remaining ffd $X \to_{q_i} A_k$ in G
    If $(G - \{X \to_{q_i} A_k\})$ is equivalent to G, then remove $X \to_{q_i} A_k$ from G.
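Step (2) of the algorithm, the reduction of left-hand sides, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' procedure: the equivalence test is approximated by checking that the reduced left-hand side already reaches the right-hand-side attribute in the fuzzy closure with a strength at least as large as that of the ffd, the closure combines strengths with the minimum, and the names `fuzzy_closure` and `left_reduce` are hypothetical. Steps (1) and (3) are not covered.

```python
def fuzzy_closure(attrs, ffds):
    """Closure for ffds with a single right-hand-side attribute (min-combination assumed)."""
    closure = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, q in ffds:
            if all(a in closure for a in lhs):
                strength = min([q] + [closure[a] for a in lhs])
                if strength > closure.get(rhs, 0.0):
                    closure[rhs] = strength
                    changed = True
    return closure


def left_reduce(ffds):
    """Drop left-hand-side attributes that are not needed (step (2) only)."""
    g = list(ffds)
    for i in range(len(g)):
        lhs, rhs, q = g[i]
        for b in sorted(lhs):
            if len(lhs) == 1:
                break
            reduced = lhs - {b}
            # Strength with which the reduced left-hand side already reaches rhs.
            strength = fuzzy_closure(reduced, g).get(rhs, 0.0)
            if strength >= q:
                lhs, q = reduced, strength
                g[i] = (lhs, rhs, q)
    return g


fs = frozenset
# Ffds of Example 10, each with a single attribute on the right-hand side.
F = [
    (fs("CD"), "A", 0.7), (fs("CD"), "B", 0.7), (fs("AD"), "E", 0.5),
    (fs("CD"), "E", 0.7), (fs("A"), "B", 0.8), (fs("B"), "E", 0.6),
]
G = left_reduce(F)
```

On the ffds of Example 10, only the ffd with left-hand side AD changes: D is dropped, and the strength becomes the 0.6 obtained through B.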
3.6.2.2. Dependency Preserving Decomposition into Fuzzy Third Normal
Form. In fuzzy databases, it is important to preserve the dependencies while
decomposing the relations like their classical counterparts, because each dependency
in the fuzzy database represents a constraint in the database. If one of the
dependencies is not represented in some individual relation $R_i$, we have to join
two or more relations in the decomposition and then proceed, and that is inefficient
and impractical. The dependency preservation property ensures that each ffd
is represented in some individual relation resulting after decomposition.
Now, we give the algorithm that creates a dependency-preserving decompo-
sition of a relation R based on a set of ffds, F, such that each relation in the decom-
position is in fuzzy 3NF.
Algorithm. Dependency Preserving Decomposition into Fuzzy 3NF Algorithm.
• Find the minimal cover G for F, and make it partial ffd free by using the Minimal
Cover Algorithm above.
• Place any attributes that have not been included in any of the ffds of G in a separate
relation schema, and eliminate them from R.
• If any of the ffds in G involves all the attributes of R, then the decomposition is R.
• Else, for each left-hand side X of ffds in G, create a new relation schema in D with
attributes $\{X \cup \{A_1\} \cup \ldots \cup \{A_k\}\}$ where $X \to_{q_1} A_1$, $X \to_{q_2} A_2$, ...,
$X \to_{q_k} A_k$ are the ffds in G, and X is the fuzzy key of this new relation with
strength $\min_i q_i$.
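The grouping performed by the last steps of this algorithm can be sketched as below. The sketch assumes that a minimal cover with single-attribute right-hand sides is already available; the function name `decompose_f3nf` and the (attributes, key, strength) result triples are hypothetical choices made for this illustration.

```python
def decompose_f3nf(schema, cover):
    """Group the ffds of a minimal cover by left-hand side.

    Returns a list of (attributes, fuzzy key, strength) triples; the key and
    strength are None for the relation that collects unused attributes.
    """
    schema = frozenset(schema)
    used = set()
    for lhs, rhs, _ in cover:
        used |= set(lhs) | {rhs}
    relations = []
    leftover = schema - used
    if leftover:
        # Attributes that occur in no ffd go to a separate relation schema.
        relations.append((frozenset(leftover), None, None))
    rest = schema - leftover
    for lhs, rhs, q in cover:
        if set(lhs) | {rhs} == rest:
            # One ffd involves all remaining attributes: keep them together.
            return relations + [(rest, frozenset(lhs), q)]
    groups = {}
    for lhs, rhs, q in cover:
        groups.setdefault(frozenset(lhs), []).append((rhs, q))
    for lhs, deps in groups.items():
        attrs = lhs | {a for a, _ in deps}
        relations.append((attrs, lhs, min(q for _, q in deps)))
    return relations


fs = frozenset
# Minimal cover stated in Example 10.
cover = [
    (fs("CD"), "A", 0.7), (fs("CD"), "B", 0.7), (fs("CD"), "E", 0.7),
    (fs("A"), "B", 0.8), (fs("B"), "E", 0.6),
]
relations = decompose_f3nf("ABCDE", cover)
```

For the minimal cover of Example 10 this yields one relation per distinct left-hand side, with the strength of each key taken as the minimum over its ffds.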
Example 10. Let $R(A, B, C, D, E)$ and the ffds be $CD \to_{0.7} A$, $CD \to_{0.7} B$,
$AD \to_{0.5} E$, $CD \to_{0.7} E$, $A \to_{0.8} B$, and $B \to_{0.6} E$. Hence CD is the fuzzy key
of the relation with strength 0.7.
First of all, the minimal cover algorithm is applied. G is initialized to the set
of ffds, F, that is, G := F. All the ffds are in the form of $X \to_{q_i} A_i$, meaning that
every ffd has a single attribute on its right-hand side. In the third step, for the ffd
$AD \to_{0.5} E$, for the attribute $D \in \{A, D\}$, $(G - \{AD \to_{0.5} E\}) \cup \{A \to_{0.6} E\}$
is equivalent to G because $0.6 \ge 0.5$. In this step, $A \to_{0.6} E$ is obtained from the
two ffds $A \to_{0.8} B$ and $B \to_{0.6} E$ using the transition property. So $AD \to_{0.5} E$ is
replaced with $A \to_{0.6} E$ in G. In the last step, the ffd $A \to_{0.6} E$, obtained in the
previous step, is removed just because it can be obtained from the last two ffds
$A \to_{0.8} B$ and $B \to_{0.6} E$. Then, the minimal cover G is
$CD \to_{0.7} A$, $CD \to_{0.7} B$, $CD \to_{0.7} E$, $A \to_{0.8} B$, and $B \to_{0.6} E$.
For the second step of the dependency-preserving decomposition algorithm,
for each left-hand side of the ffds, where CD is the fuzzy key of the relation with
strength 0.7, a relation schema is created with attributes A, B, C, D, and E whose
ffds are $CD \to_{0.7} A$, $CD \to_{0.7} B$, and $CD \to_{0.7} E$, with CD as the fuzzy key with
strength 0.7. Then for the remaining ffds $A \to_{0.8} B$ and $B \to_{0.6} E$, two separate
relation schemas, one with attributes A and B, the other with attributes B and E, are
created. At the end, after the dependency-preserving decomposition three relation
schemas are obtained. The first one is R1(A, B, C, D, E) with fuzzy functional
dependencies $CD \to_{0.7} A$, $CD \to_{0.7} B$, and $CD \to_{0.7} E$, the second one is
R2(A, B) with $A \to_{0.8} B$, and the third one is R3(B, E) with $B \to_{0.6} E$.
3.6.2.3. Lossless Join Decomposition into Fuzzy Third Normal Form.
Another desired property of a decomposition is the lossless join property. If a
decomposition does not have the lossless join property, then we may get spurious
tuples after joining those relations in that decomposition. These spurious tuples
represent erroneous information. Therefore, this property is critical and must cer-
tainly be achieved. Lossless join property guarantees that spurious tuple genera-
tion problem does not occur with respect to the relation schemas created after
decomposition. The algorithm below provides a lossless join decomposition into
fuzzy 3NF.
Algorithm. Lossless Join Decomposition into Fuzzy 3NF Algorithm.
• Find the minimal cover G for F, and make it partial ffd free.
• Place any attributes that have not been included in any of the ffds of G in a separate
relation schema, and eliminate them from R.
• If any of the ffds in G involves all the attributes of R, then the decomposition is R.
• Else, for each left-hand side X of ffds in G, create a new relation schema in D with
attributes $\{X \cup \{A_1\} \cup \ldots \cup \{A_k\}\}$ where $X \to_{q_1} A_1$, $X \to_{q_2} A_2$, ...,
$X \to_{q_k} A_k$ are the ffds in G, and X is the fuzzy key of this new relation with
strength $\min_i q_i$.
• If none of the relation schemas contains the fuzzy key of R, create one more relation
schema that contains attributes that form the fuzzy key of R.
The testing algorithms for these two properties, dependency preserving and loss-
less join properties, are presented in the following sections after fuzzy BCNF.
Example 11. The lossless join decomposition algorithm brings an additional step
into the dependency-preserving decomposition at the end, by creating a new rela-
tion schema for the fuzzy key of the relation. If we consider the relation in Exam-
ple 10 again, in order to get a lossless join decomposition, we must go through all
the steps of the dependency-preserving decomposition again and at the end we
must create a new relation for the fuzzy key, CD, if it is not contained in any of the
decomposed relations. But, in our case, the fuzzy key CD is already contained in
one of the decomposed relations, so there is no need to create a new relation.
Then, after a lossless join decomposition into fuzzy 3NF, we have three relations,
R1 (A, B, C, D, E), R2 (A, B), and R3 (B, E), as in Example 10.
3.6.2.4. An Example Application: Leasing Risk Assessment. The Leasing
Risk Assessment relation analyzed above can be further analyzed for fuzzy 3NF.
When the conditions for the fuzzy 3NF are considered, R1 and R2 in the decom-
posed relation R do not violate the fuzzy 3NF, because in each relation there is
only one ffd, and their left-hand sides are the fuzzy keys of the corresponding
relations. But in the third relation, the ffds are
{Capital, Revenue, Workforce, CompAge, LegalType} $\to_{0.9}$ IlliquidRisk
IlliquidRisk $\to_{0.7}$ CreditRating
According to fuzzy 3NF control algorithm, the first ffd satisfies the second condi-
tion so it does not violate the fuzzy 3NF, but in the second ffd none of the condi-
tions are met, the left-hand side does not contain the right-hand side, and also it
does not contain the key, and lastly CreditRating is not a fuzzy prime attribute.
Consequently, the third relation is not in fuzzy 3NF, and it must be decomposed.
Applying the Dependency-Preserving Decomposition into fuzzy 3NF algorithm,
we get the decomposed relations as follows:
R4 (Capital, Revenue, Workforce, CompAge, LegalType, IlliquidRisk)
where Capital, Revenue, Workforce, CompAge, LegalType is the fuzzy key with
strength 0.9, and its ffd is
{Capital, Revenue, Workforce, CompAge, LegalType} $\to_{0.9}$ IlliquidRisk
R5 (IlliquidRisk, CreditRating)
where IlliquidRisk is the fuzzy key with strength 0.7, and its ffd is
IlliquidRisk $\to_{0.7}$ CreditRating
Lossless Join Decomposition into fuzzy 3NF Algorithm has only one additional
step with respect to the dependency-preserving decomposition algorithm: If none
of the relation schemas contains the fuzzy key of R3, create one more relation
schema that contains attributes that form the fuzzy key of R3. But in our case,
relation R4 has the fuzzy key of R3; hence the decomposition is also a lossless join
decomposition.
3.7. Fuzzy Boyce Codd Normal Form
Like its classical counterpart, fuzzy Boyce-Codd normal form (fuzzy BCNF)
is a stricter form of fuzzy 3NF. Fuzzy BCNF ensures that there is no redundancy
that can be detected using ffd information alone. It is the most desirable normal
form from the point of view of redundancy. The formal definition of the fuzzy
BCNF can be given as follows.
Definition. Let F be the set of ffds for schema R, and K be the fuzzy key of R
with strength q. R is said to be in fuzzy BCNF if and only if R is in fuzzy 3NF and
for any $X \to_{p} A$ in F, either A is in X or X is a fuzzy superkey of R, that is,
$X \supseteq K$.
To check whether a given relation is in fuzzy BCNF, all of the ffds in the
relation should be checked against the specified two conditions. If the left-hand
side of the ffd contains all the attributes of the right-hand side or any of the fuzzy
keys of the relation, that ffd does not violate the fuzzy BCNF. The algorithm for
the decomposition into fuzzy BCNF is given below. The algorithm ensures that
the decomposition is a lossless join decomposition.
Algorithm. Decomposition into Fuzzy BCNF Algorithm: Let the ffd that violates
fuzzy BCNF be $X \to_{p} A$, where $A, X \subseteq R$ and A is the single attribute.
• Decompose R into two relation schemas $R - A$ and $XA$.
• Recursively apply the previous step for all the ffds that violate the fuzzy BCNF, until
there is no ffd in the relation violating fuzzy BCNF.
The ffds being checked against fuzzy BCNF are already in fuzzy 3NF and their
right-hand sides consist of single attributes, because of the fuzzy 3NF decomposi-
tion algorithm.
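The recursive splitting can be sketched in Python as follows. Two simplifications are assumed here and are not taken from the text: the ffds of a subrelation are obtained simply by keeping those ffds whose attributes all lie inside it, and the superkey test is done through the fuzzy closure (a left-hand side is treated as a fuzzy superkey when its closure covers the whole subrelation). The names `fuzzy_closure` and `bcnf_decompose` are hypothetical, and the sample data is the schema of Example 12 with the key dependency split into single-attribute ffds.

```python
def fuzzy_closure(attrs, ffds):
    """Closure for ffds with a single right-hand-side attribute (min-combination assumed)."""
    closure = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, q in ffds:
            if all(a in closure for a in lhs):
                strength = min([q] + [closure[a] for a in lhs])
                if strength > closure.get(rhs, 0.0):
                    closure[rhs] = strength
                    changed = True
    return closure


def bcnf_decompose(schema, ffds):
    """Split schema on the first violating ffd X -> A into (schema - A) and XA."""
    schema = frozenset(schema)
    # Simplified projection: keep only ffds lying entirely inside this schema.
    local = [(l, a, q) for l, a, q in ffds if set(l) <= schema and a in schema]
    for lhs, a, q in local:
        if a in lhs:
            continue  # the right-hand side is contained in the left-hand side
        if schema <= set(fuzzy_closure(lhs, local)):
            continue  # the left-hand side acts as a fuzzy superkey here
        return (bcnf_decompose(schema - {a}, local)
                + bcnf_decompose(frozenset(lhs) | {a}, local))
    return [schema]


# Schema of Example 12: R(A, ..., G) with fuzzy key A of strength 0.8.
ffds = [("CE", "A", 0.7), ("BD", "E", 0.6), ("C", "B", 0.9)]
ffds += [("A", x, 0.8) for x in "BCDEFG"]
schemas = bcnf_decompose("ABCDEFG", ffds)
```

Under these assumptions the schema is split first on the ffd with left-hand side BD and then on the ffd with left-hand side C, giving the three schemas BDE, BC, and ACDFG.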
Example 12. Consider a relation schema R(A, B, C, D, E, F, G) with ffds
$CE \to_{0.7} A$, $BD \to_{0.6} E$, and $C \to_{0.9} B$, and A is the fuzzy key of the relation
with strength 0.8, that is, $A \to_{0.8} BCDEFG$. The relation schema is in fuzzy 2NF
because there is no partial dependence (the fuzzy key of the relation, A, is already
a single attribute). Now, we have to check if the relation is in fuzzy 3NF. To be in
fuzzy 3NF, either the left-hand side of the ffds should contain the fuzzy key, A, or
the right-hand side is fuzzy prime, that is, a part of the fuzzy key. In our example,
the second ffd violates this constraint, so the relation is not in fuzzy 3NF and
consequently not in fuzzy BCNF. According to our algorithm, we decompose the
relation into two; one with attributes $\{B, D, E\}$ and ffd $BD \to_{0.6} E$ with BD as
the fuzzy key with strength 0.6, and the other with the attributes $\{A, B, C, D, F, G\}$
and ffd $C \to_{0.9} B$ with still A as the fuzzy key with strength 0.8. In this
decomposition, the second relation schema is still not in fuzzy BCNF because of the ffd
$C \to_{0.9} B$. So we decompose it again into two new relations, the first one with
attributes $\{A, C, D, F, G\}$ and A being the fuzzy key with strength 0.8, and the
second one with the attributes $\{B, C\}$ and ffd $C \to_{0.9} B$ with C as the fuzzy key
with strength 0.9. Thus each of the schemas BDE, BC, and ACDFG is in fuzzy
BCNF.
In the Leasing Risk Assessment example, all the decomposed relations are
in fuzzy BCNF.
3.8. Dependency Preservation Property Testing in Decompositions
While discussing the fuzzy 3NF, two algorithms are given for the decomposition
into fuzzy 3NF, one achieving the dependency preservation property, and
the other also having the lossless join property. Also the algorithm for normalization
into fuzzy BCNF ensures the lossless join property. The dependency preservation
property of the decompositions in the fuzzy relational data model is studied
extensively in Ref. 23. In this section, an algorithm is presented to test the dependency
preservation property of decompositions.
Algorithm. Dependency Preservation Testing Algorithm: For every ffd, $X \to_{a} Y$,
where $X = X_1 X_2 \ldots X_m$,
(1) Construct a transitive closure list, ZList, initially for all the attributes of the left-hand
side of the ffd, X, with maximum strengths.
    ZList $= \{(X_1, 1), (X_2, 1), \ldots, (X_m, 1)\}$
(2) While (true)
    i. ZList2 $\leftarrow$ ZList.
    ii. Reset domain.
    iii. For each decomposed relation $R_i$ (i = 1 to k),
        • Reset domain.
        • For each element of ZList2,
            If the attribute of this element is in $U_i$, where $U_i$ is the attribute set of $R_i$, add
            the attribute to the domain.
        • Find the transitive closure of the domain, ZList$_i$.
        • For each element of ZList$_i$,
            If the attribute of this element is in $U_i$, add the element to TList$_i$.
        • Combine TList$_i$ into ZList2 using fuzzy union operation.
    iv. If ZList $=$ ZList2, break.
    v. Else ZList $\leftarrow$ ZList2.
(3) If all the attributes in Y occur in ZList with the strength a or greater, then dependency
preserving property is not violated, continue with the other ffd.
(4) Else not dependency preserving, break.
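A hedged Python sketch of this test is given below. It follows the loop structure of the algorithm, with two assumptions that are not stated in the text: the fuzzy union keeps the maximum strength per attribute, and the closure of the domain is seeded with the strengths already recorded in ZList2 rather than with 1, so that the union cannot raise the strength of an attribute that was itself only derived. All function names are hypothetical.

```python
def seeded_closure(seed, ffds):
    """Closure starting from {attribute: strength} (min-combination assumed)."""
    closure = dict(seed)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, q in ffds:
            if all(a in closure for a in lhs):
                strength = min([q] + [closure[a] for a in lhs])
                for b in rhs:
                    if strength > closure.get(b, 0.0):
                        closure[b] = strength
                        changed = True
    return closure


def preserves(ffd, ffds, decomposition):
    """Steps (1)-(3) for a single ffd X ->_a Y."""
    lhs, rhs, strength = ffd
    zlist = {a: 1.0 for a in lhs}
    while True:
        zlist2 = dict(zlist)
        for u in decomposition:
            # Domain: attributes reached so far that belong to this relation.
            domain = {a: s for a, s in zlist2.items() if a in u}
            for a, s in seeded_closure(domain, ffds).items():
                # Keep only attributes of this relation; union by maximum.
                if a in u and s > zlist2.get(a, 0.0):
                    zlist2[a] = s
        if zlist2 == zlist:
            break
        zlist = zlist2
    return all(zlist.get(a, 0.0) >= strength for a in rhs)


def is_dependency_preserving(ffds, decomposition):
    return all(preserves(f, ffds, decomposition) for f in ffds)


# R(A, B, C) with A ->_0.9 B and B ->_0.7 C.
ffds = [("A", "B", 0.9), ("B", "C", 0.7)]
good = is_dependency_preserving(ffds, ["AB", "BC"])
bad = is_dependency_preserving(ffds, ["AB", "AC"])
```

With the decomposition into (A, B) and (B, C) both ffds are found in the final list, whereas with (A, B) and (A, C) the second ffd cannot be recovered from any single relation.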
Example 13. Let the attribute set for a relation R be $\{A, B, C\}$, the decomposed
relations be R1(A, B) and R2(B, C), and the ffds be $A \to_{0.9} B$ and $B \to_{0.7} C$.
For the first ffd, $A \to_{0.9} B$, transitive closure of A is ZList $= \{(A, 1)\}$ initially.
ZList2 $= \{(A, 1)\}$
R1: domain $= \{A\}$
    ZList$_1$ $= \{(A, 1), (B, 0.9), (C, 0.7)\}$
    TList$_1$ $= \{(A, 1), (B, 0.9)\}$
    ZList2 $= \{(A, 1), (B, 0.9)\}$
R2: domain $= \{B\}$
    ZList$_2$ $= \{(B, 1), (C, 0.7)\}$
    TList$_2$ $= \{(B, 1), (C, 0.7)\}$
    ZList2 $= \{(A, 1), (B, 0.9), (C, 0.7)\}$
Because ZList $\neq$ ZList2,
ZList $\leftarrow \{(A, 1), (B, 0.9), (C, 0.7)\}$
In the second pass, ZList $=$ ZList2, exiting the loop, we see that the right-hand-side
attribute B occurs in ZList with strength 0.9, so the dependency preservation
property is not violated and we continue with the second ffd, $B \to_{0.7} C$. Similarly, at
the end we find ZList $= \{(B, 1), (C, 0.7)\}$, and because attribute C occurs in
ZList with strength 0.7, the dependency preservation property is not violated. Hence
the decomposition is dependency preserving.
3.9. Lossless Join Property Testing in Decompositions
Chen, Kerre, and Vandenbulcke impose a restriction on the extended algebraic
operations in their study (Ref. 22). In accordance with the design issues and to achieve
a complete information reconstruction, they restricted the eight algebraic operations,
namely product, union, intersection, natural join, projection, selection, minus,
and division, so that they are performed for base relations only on identical elements
or tuples, not on close ones. That means that whenever tuple merging is of
concern, it refers to identical elements. Raju and Majumdar (Ref. 6) restrict the fuzzy
resemblance relation and name the class of ffds where the fuzzy resemblance
relation is restricted as restricted ffd. With this choice of restrictions, both Chen
et al. (Ref. 22) and Raju and Majumdar (Ref. 6) use a classic algorithm to test lossless
join decomposition of fuzzy relations with ffds.
In this article, we also utilize the classic algorithm to test whether a de-
composition has lossless join property. The logic in using the classical testing
algorithm in the similarity-based fuzzy relational database model is as follows.
The table created during the application of the algorithm is used only to deter-
mine whether there is a joining attribute between the decomposed relations.
Fuzziness is taken into the consideration after this point. If there is a joining
attribute, the decision whether they can be joined or not is given according to
their similarity levels and a predefined threshold. The tuples should also satisfy
all the ffds of the relation; that is, for every pair of tuples, for each ffd $X \to_{q} Y$,
$C(Y[t_1, t_2]) \ge \min(q, C(X[t_1, t_2]))$.
Algorithm. Lossless Join Testing Algorithm: Let the relation schema be R
with the attributes $A_1, A_2, \ldots, A_n$, F be the ffds, and $r = \{R_1, R_2, \ldots, R_k\}$ be the
decomposition.
(1) Create an initial table T with one row i, for each relation $R_i$ in the decomposition and
one column j for each attribute $A_j$ in the relation being decomposed, R.
(2) Put $b_{ij}$ in every cell of the table.
(3) For each row i and column j,
    If $A_j$ is in attribute domain, $U_i$, of $R_i$, then set $T_{ij} = a_j$.
(4) Repeat until there are no changes in T.
    For each ffd $X \to_{a} Y$ in F,
        For all rows in T, look for those rows which have the same symbols in all columns
        corresponding to attributes in X,
            For any two rows make the symbols in all columns for the attributes in Y be
            the same as follows: if any of the symbols is an a symbol set the other to
            that same a symbol.
(5) At the end, if a row is entirely of a symbols then the decomposition has lossless join
property. Otherwise, it is not lossless join decomposition.
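The table-based test can be sketched in Python as follows. This sketch covers only the symbolic table manipulation; the strengths of the ffds and the similarity thresholds discussed above are ignored. As in the classical version of the test, two differing b symbols in a column are also made equal, which is an assumption beyond the wording of step (4). The name `is_lossless` and the tuple encoding of the symbols are hypothetical.

```python
def is_lossless(schema, ffds, decomposition):
    """Symbolic lossless join test: ("a", j) is a_j and ("b", i, j) is b_ij."""
    attrs = list(schema)
    col = {a: j for j, a in enumerate(attrs)}
    # Steps (1)-(3): a_j where the relation has the attribute, b_ij elsewhere.
    table = [
        [("a", j) if a in u else ("b", i, j) for j, a in enumerate(attrs)]
        for i, u in enumerate(decomposition)
    ]
    changed = True
    while changed:  # step (4)
        changed = False
        for lhs, rhs, _ in ffds:
            for r1 in range(len(table)):
                for r2 in range(r1 + 1, len(table)):
                    if any(table[r1][col[x]] != table[r2][col[x]] for x in lhs):
                        continue
                    for y in rhs:
                        j = col[y]
                        s1, s2 = table[r1][j], table[r2][j]
                        if s1 == s2:
                            continue
                        # An a symbol sorts before any b symbol, so it is kept.
                        keep, drop = min(s1, s2), max(s1, s2)
                        for row in table:
                            if row[j] == drop:
                                row[j] = keep
                        changed = True
    # Step (5): lossless if some row consists of a symbols only.
    return any(all(s[0] == "a" for s in row) for row in table)


# R(A, B, C, D, E, F) decomposed into (B, E) and (A, C, D, E, F).
ffds_1 = [("A", "B", 0.6), ("C", "DE", 0.5), ("AC", "F", 0.8)]
lossy = is_lossless("ABCDEF", ffds_1, ["BE", "ACDEF"])

# R(A, ..., G) decomposed into (A, B, C, D, E), (D, E, F), and (F, G).
ffds_2 = [("ABC", "D", 0.7), ("ABC", "E", 0.8), ("DE", "F", 0.7), ("F", "G", 0.6)]
lossless = is_lossless("ABCDEFG", ffds_2, ["ABCDE", "DEF", "FG"])
```

The first decomposition never produces a row made up entirely of a symbols, while in the second one the row of the first relation ends up consisting of a symbols only.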
Example 14. Let the relation schema be R(A, B, C, D, E, F) and ffds be
$A \to_{0.6} B$, $C \to_{0.5} DE$, and $AC \to_{0.8} F$. Here AC is the fuzzy key of the relation
with strength 0.5. Suppose we decompose R into two relations R1(B, E) and
R2(A, C, D, E, F), and then test for the lossless join. The initial table T has
i = 2 rows for relations R1 and R2, and j = 6 columns for the attributes A, B, C, D,
E, and F. For the second step, we initialize each cell with $b_{ij}$. The initial table can
be seen in Table IV.
For the first row, $T_{12}$ and $T_{15}$ are set to $a_2$ and $a_5$, respectively, because
relation R1 contains the attributes $A_2 = B$ and $A_5 = E$. Similarly for the second row,
entries $T_{21}$, $T_{23}$, $T_{24}$, $T_{25}$, and $T_{26}$ are set to $a_1$, $a_3$, $a_4$, $a_5$, and $a_6$,
respectively, because R2 contains the attributes A, C, D, E, and F as in Table V.
For the first ffd $A \to_{0.6} B$, R1 and R2 do not have the same symbols in the
first column, the column for the attribute A, so there is no change in B column.
Considering the second ffd, $C \to_{0.5} DE$, again R1 and R2 do not have the same
symbols in the column for C, and there is no change in the table. The situation is
the same for the last ffd, $AC \to_{0.8} F$. At the end, because there is no row consisting
of entirely a symbols, the decomposition is not lossless join decomposition.
Table IV. Initial table for relation R(A, B, C, D, E, F).
T    A     B     C     D     E     F
R1   b_11  b_12  b_13  b_14  b_15  b_16
R2   b_21  b_22  b_23  b_24  b_25  b_26

Table V. Table after applying the third step of lossless join testing algorithm to R.
T    A     B     C     D     E     F
R1   b_11  a_2   b_13  b_14  a_5   b_16
R2   a_1   b_22  a_3   a_4   a_5   a_6

Example 15. Now we give a lossless join decomposition example. Let the relation
schema be R(A, B, C, D, E, F, G) and ffds be $ABC \to_{0.7} D$, $ABC \to_{0.8} E$,
$DE \to_{0.7} F$, and $F \to_{0.6} G$. Here ABC is the fuzzy key of the relation with
strength 0.7. Suppose we decompose R into three relations R1(A, B, C, D, E),
R2(D, E, F), and R3(F, G), and then test for the lossless join. The initial
table T has i = 3 rows for relations R1, R2, and R3, and j = 7 columns for the
attributes A, B, C, D, E, F, and G. For the second step, we initialize each entry with
$b_{ij}$, Table VI.
For the first row, $T_{11}$, $T_{12}$, $T_{13}$, $T_{14}$, and $T_{15}$ are set to $a_1$, $a_2$, $a_3$, $a_4$,
and $a_5$, respectively, because relation R1 contains the attributes $A_1 = A$, $A_2 = B$,
$A_3 = C$, $A_4 = D$, and $A_5 = E$. Similarly for the second row, entries $T_{24}$, $T_{25}$, and
$T_{26}$ are set to $a_4$, $a_5$, and $a_6$, respectively, because R2 contains the attributes D, E,
and F. And finally, entries $T_{36}$ and $T_{37}$ are set to $a_6$ and $a_7$, respectively, because
R3 contains the attributes F and G, and the table becomes as Table VII.
For the first and second ffds $ABC \to_{0.7} D$ and $ABC \to_{0.8} E$, R1, R2, and R3
do not have the same symbols in the columns for the attributes A, B, and C, so
there is no change in the D column. Considering the third ffd, $DE \to_{0.7} F$, R1 and
R2 have the same symbols in the columns for D and E, so the column of attribute F
for relation R1, $b_{16}$, will be changed into $a_6$ in the table. Then we get Table VIII.
Finally, for the last ffd $F \to_{0.6} G$, R1, R2, and R3 have the same symbols in
the column for F, so the column of attribute G for relation R1 and R2 will be
changed into $a_7$ in the table, and the table becomes as the one in Table IX.
At the end, because there is a row consisting of entirely a symbols, that is,
the first row, the decomposition is lossless join decomposition.
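The table manipulation in this example can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the testing algorithm verbatim: it assumes that the ffd strengths play no role when symbols are equated (the worked example never uses them), it repeats the pass over the ffds until nothing changes, and the function name `lossless_join` and the symbol encoding (`a4`, `b1_6`) are illustrative choices.

```python
def lossless_join(attributes, relations, ffds):
    """Tableau test: return (is_lossless, final_table).

    attributes: ordered attribute names of R
    relations:  one attribute collection per decomposed relation
    ffds:       (lhs, rhs) pairs; strengths are ignored (assumption)
    """
    col = {a: j for j, a in enumerate(attributes)}
    # Steps 1-2: one row per relation, every entry a distinct b symbol.
    table = [[f"b{i + 1}_{j + 1}" for j in range(len(attributes))]
             for i in range(len(relations))]
    # Step 3: put a_j wherever relation i contains attribute j.
    for i, rel in enumerate(relations):
        for a in rel:
            table[i][col[a]] = f"a{col[a] + 1}"
    # Step 4: for each ffd X -> Y, rows that agree on X are made to agree on Y.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in ffds:
            groups = {}
            for row in table:
                groups.setdefault(tuple(row[col[a]] for a in lhs), []).append(row)
            for rows in groups.values():
                for a in rhs:
                    j = col[a]
                    symbols = {row[j] for row in rows}
                    if len(symbols) < 2:
                        continue
                    a_sym = f"a{j + 1}"
                    # Prefer the a symbol; otherwise pick one b symbol.
                    target = a_sym if a_sym in symbols else min(symbols)
                    for row in table:  # rename every occurrence in column j
                        if row[j] in symbols:
                            row[j] = target
                    changed = True
    lossless = any(all(s.startswith("a") for s in row) for row in table)
    return lossless, table


attributes = list("ABCDEFG")
relations = [list("ABCDE"), list("DEF"), list("FG")]
ffds = [("ABC", "D"), ("ABC", "E"), ("DE", "F"), ("F", "G")]
ok, table = lossless_join(attributes, relations, ffds)
for name, row in zip(["R1", "R2", "R3"], table):
    print(name, *row)
print("lossless:", ok)
```

With the three relations and four ffds of this example, the first row ends up holding only a symbols, which is the condition the text checks for.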
Table VI. Initial table for relation R (A, B, C, D, E, F, G).

T    A     B     C     D     E     F     G
R1   b11   b12   b13   b14   b15   b16   b17
R2   b21   b22   b23   b24   b25   b26   b27
R3   b31   b32   b33   b34   b35   b36   b37

Table VII. Result of the third step of the lossless join testing algorithm applied to R.

T    A     B     C     D     E     F     G
R1   a1    a2    a3    a4    a5    b16   b17
R2   b21   b22   b23   a4    a5    a6    b27
R3   b31   b32   b33   b34   b35   a6    a7

3.10. An Example Application: Fraud Detection
An increasing number of transactions are carried out remotely and electronically in today's financial world. As the system grows more complex, the opportunities for criminals to conduct fraudulent transactions rise. Credit cards are one of the areas where fraudulent behavior is of particular concern to financial institutions. Fraudulent behavior can arise in different ways. In one, the criminals are individuals who steal credit cards and then use them for purchases. In another, criminal groups steal new credit cards and duplicate them. There is also customer-induced fraud, in which customers claim that their credit card was stolen after making some expensive purchases. Most credit card companies use sophisticated systems to detect fraudulent behavior, because opportunities for it still exist even though most credit card purchases are electronically verified before the actual transaction. These systems have to work with very little significant data; they know only the past customer history and the current transaction information. At the same time, they should not too easily decline nonfraudulent transactions, so as not to dissatisfy customers.
At this point, the companies are unwilling to disclose system details or even the fact that they use fuzzy logic fraud detection systems. Our case concerns a financial service provider. The company offers its customers both banking and insurance services, and the system is used for the detection of insurance fraud. Each insurance claim in the field of home insurance is evaluated to assess the likelihood of fraudulent behavior. The company wanted to implement a fraud detection system that looks at multiple factors in every insurance claim and selects only those claims that have a certain degree of likelihood of fraud.
All information about the customers is held in a database. By using the system, the insurance claim is evaluated, and if the assessed likelihood of fraud is lower than a certain predefined threshold, the claim is immediately paid out to the customer. If the result is higher than the threshold, the claim is passed on to a claims auditor together with the reason for the result. After the auditor's manual review, final decisions on further steps are made.
Table VIII. Result of the fourth step of the lossless join testing algorithm for the first three ffds of R.

T    A     B     C     D     E     F     G
R1   a1    a2    a3    a4    a5    a6    b17
R2   b21   b22   b23   a4    a5    a6    b27
R3   b31   b32   b33   b34   b35   a6    a7

Table IX. Table for R (A, B, C, D, E, F, G) at the end of the lossless join testing algorithm.

T    A     B     C     D     E     F     G
R1   a1    a2    a3    a4    a5    a6    a7
R2   b21   b22   b23   a4    a5    a6    a7
R3   b31   b32   b33   b34   b35   a6    a7

We have the following attributes for the system: number of claims in the last 12 months, amount of the current claim, time with insurance, average balance on all banking accounts over the last 12 months, number of overdrafts over the last 12 months, annual income of the customer, recent changes in status, insurance history evaluation, banking history evaluation, personal evaluation, fraud likelihood, and fraud reason explanation. The first three attributes give information about the insurance contract and the claim itself, the next two attributes describe the banking background of the customer, and the sixth and seventh attributes provide the personal background. Then our relation schema and the fuzzy functional dependencies will be as follows:
R: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason)
FFD1: Number of claims in the last 12 months, amount of current claim, and time with insurance mostly determine the insurance history evaluation.
$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$
FFD2: Average balance on all banking accounts over the last 12 months and number of overdrafts over the last 12 months generally determine the banking history evaluation.
$\{AvgAmnt, NumOvr\} \rightarrow_{0.7} HistBank$
FFD3: Annual income of the customer and recent changes in status more or less determine the personal evaluation.
$\{Income, StatChng\} \rightarrow_{0.6} Personal$
FFD4: Insurance history evaluation, banking history evaluation, and personal evaluation mostly determine the fraud reason explanation.
$\{HistIns, HistBank, Personal\} \rightarrow_{0.8} Reason$
FFD5: Fraud reason explanation more or less determines fraud likelihood.
$Reason \rightarrow_{0.6} Fraud$
The attributes can be briefly explained as follows. NumClaim gives an indi-
cation of how often the customer has used the insurance in the past year. Amount
expresses how significant the current claim is. CustSince takes into account how
long the insurance contract has been in existence. HistIns indicates how much the
customer has exercised their insurance contract in the past and present. AvgAmnt
is the average total balance on all banking accounts of the customer. NumOvr is
the number of overdrafts on checking accounts. HistBank evaluates the banking
history of the customer and its relevance to the insurance claim. Personal assesses the customer's basic situation and detects possible motives within the customer's lifestyle that could motivate fraudulent behavior. StatChng indicates whether a fundamental change in the customer's life has occurred over the past four months.
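As a rough check on the five dependencies listed above, the closure of the seven directly observed attributes can be computed together with the strength at which each remaining attribute is reached. The sketch below assumes that strengths combine by taking the minimum along a derivation chain (the transitivity rule commonly used for ffds); it is not the closure algorithm of this article, and the names `fuzzy_closure` and `FFDS` are illustrative.

```python
ALL_ATTRS = {"NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr", "Income",
             "StatChng", "HistIns", "HistBank", "Personal", "Fraud", "Reason"}

# (left-hand side, right-hand side, strength) for FFD1 through FFD5.
FFDS = [
    ({"NumClaim", "Amount", "CustSince"}, "HistIns", 0.8),
    ({"AvgAmnt", "NumOvr"}, "HistBank", 0.7),
    ({"Income", "StatChng"}, "Personal", 0.6),
    ({"HistIns", "HistBank", "Personal"}, "Reason", 0.8),
    ({"Reason"}, "Fraud", 0.6),
]


def fuzzy_closure(start, ffds):
    """Map every derivable attribute to the best strength found for it.

    Assumption: a derived attribute gets min(ffd strength, strengths of
    all left-hand-side attributes); start attributes have strength 1.0.
    """
    strength = {a: 1.0 for a in start}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, theta in ffds:
            if all(a in strength for a in lhs):
                derived = min([theta] + [strength[a] for a in lhs])
                if derived > strength.get(rhs, 0.0):
                    strength[rhs] = derived
                    changed = True
    return strength


candidate = {"NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr",
             "Income", "StatChng"}
closure = fuzzy_closure(candidate, FFDS)
degree = min(closure.values())
print(sorted(closure.items()))
print("covers all attributes:", set(closure) == ALL_ATTRS, "degree:", degree)
```

Under this assumption the weakest link in the chain is 0.6, and removing any one of the seven attributes leaves some attribute unreachable (without StatChng, for instance, Personal can no longer be derived).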
The normalization process begins with the fuzzy 1NF, but because there are no tuples at the beginning, we continue with the fuzzy 2NF. Analyzing the ffds, the fuzzy key is {NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng} with a degree of 0.6, because the transitive closure of this attribute set contains all the attributes of the relation. In this case, HistIns, HistBank, Personal, Fraud, and Reason are fuzzy nonprime attributes. For the relation to be in fuzzy 2NF, none of these fuzzy nonprime attributes may be partially fuzzy functionally dependent on the fuzzy key. In our case this restriction is violated (HistIns, for example, depends on {NumClaim, Amount, CustSince} alone), so the relation is not in fuzzy 2NF and should be normalized into a number of smaller relations that are in fuzzy 2NF. Using the decomposition algorithm 3.4.2.1, R is decomposed into four new relations, R1 through R4.
R1: (NumClaim, Amount, CustSince, HistIns) with the fuzzy functional dependency
$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$
R2: (AvgAmnt, NumOvr, HistBank) with the fuzzy functional dependency
$\{NumOvr, AvgAmnt\} \rightarrow_{0.7} HistBank$
R3: (Income, StatChng, Personal) with the fuzzy functional dependency
$\{Income, StatChng\} \rightarrow_{0.6} Personal$
and a relation with the remaining attributes, after removing the fuzzy nonprime attributes that are partially fuzzy functionally dependent on the fuzzy key of the original relation,
R4: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, Fraud, Reason) with the fuzzy functional dependencies
$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$
$Reason \rightarrow_{0.6} Fraud$
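The split into R1 through R4 can be reproduced with a short Python sketch. It is a simplified illustration, not the decomposition algorithm 3.4.2.1 itself: it assumes that an ffd is partial whenever its left-hand side is a proper subset of the fuzzy key and its right-hand side is nonprime, it ignores the strength conditions in the definition of partial ffds, and the function name `split_partial_dependencies` is hypothetical.

```python
ATTRIBUTES = ["NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr", "Income",
              "StatChng", "HistIns", "HistBank", "Personal", "Fraud", "Reason"]
KEY = ATTRIBUTES[:7]  # the fuzzy key stated in the text

FFDS = [
    (["NumClaim", "Amount", "CustSince"], "HistIns", 0.8),
    (["AvgAmnt", "NumOvr"], "HistBank", 0.7),
    (["Income", "StatChng"], "Personal", 0.6),
    (["HistIns", "HistBank", "Personal"], "Reason", 0.8),
    (["Reason"], "Fraud", 0.6),
]


def split_partial_dependencies(attributes, key, ffds):
    """Return (split relations, attributes of the remainder relation).

    Simplifying assumption: an ffd is a partial dependency when its
    left-hand side is a proper subset of the key and its right-hand
    side is not part of the key.
    """
    key_set = set(key)
    split, removed = [], set()
    for lhs, rhs, theta in ffds:
        if set(lhs) < key_set and rhs not in key_set:
            split.append((list(lhs) + [rhs], theta))
            removed.add(rhs)
    remainder = [a for a in attributes if a not in removed]
    return split, remainder


split, remainder = split_partial_dependencies(ATTRIBUTES, KEY, FFDS)
for i, (attrs, theta) in enumerate(split, start=1):
    print(f"R{i}: {attrs} (strength {theta})")
print(f"R{len(split) + 1}: {remainder}")
```

The sketch only determines the attribute sets. The dependency from the fuzzy key to Reason that accompanies R4 is a derived ffd and is not computed here.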
After achieving the fuzzy 2NF, conditions for the fuzzy 3NF should be tested. For a relation to be in fuzzy 3NF, it should already be in fuzzy 2NF, and additionally, for each ffd in the relation, either the left-hand side contains the fuzzy key of the relation or the right-hand side consists of fuzzy prime attributes. For the first three of the relations, the left-hand sides of the ffds contain the respective fuzzy keys of the relations. But in the fourth relation, the left-hand side of the second ffd does not contain the fuzzy key, that is, {NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng}, and the right-hand side attribute Fraud is not fuzzy prime. So the last relation should be decomposed into fuzzy 3NF. To be able to make a lossless join decomposition into fuzzy 3NF, the minimal cover of the ffds of R4 should be found first. After applying the minimal cover algorithm, we find the minimal cover for R4 as shown below:
$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$,
$Reason \rightarrow_{0.6} Fraud$
Then by using the lossless join decomposition into fuzzy 3NF algorithm, R4 is decomposed into two new relations, R5 and R6.
R5: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, Reason) with the fuzzy functional dependency
$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$
R6: (Reason, Fraud) with the fuzzy functional dependency
$Reason \rightarrow_{0.6} Fraud$
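The step from the minimal cover of R4 to R5 and R6 follows the familiar synthesis pattern: one relation per left-hand side in the minimal cover, plus a relation for the key if no relation already contains it. The Python sketch below illustrates that pattern under the assumption that the strengths are simply carried along and do not influence which relations are formed; `synthesize_3nf` is a hypothetical name, and the code is not the fuzzy 3NF algorithm of this article.

```python
KEY = ["NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr", "Income",
       "StatChng"]

# Minimal cover of R4 as given in the text: (lhs, rhs, strength).
MINIMAL_COVER = [
    (KEY, "Reason", 0.6),
    (["Reason"], "Fraud", 0.6),
]


def synthesize_3nf(minimal_cover, key):
    """Build one relation per left-hand side; add the key if it is missing.

    Assumption: strengths are ignored when forming the attribute sets.
    """
    grouped = {}
    for lhs, rhs, _theta in minimal_cover:
        grouped.setdefault(tuple(lhs), []).append(rhs)
    relations = [list(lhs) + rhs_list for lhs, rhs_list in grouped.items()]
    # Keep the join lossless: some relation must contain the whole key.
    if not any(set(key) <= set(rel) for rel in relations):
        relations.append(list(key))
    return relations


r5, r6 = synthesize_3nf(MINIMAL_COVER, KEY)
print("R5:", r5)
print("R6:", r6)
```

Because the first relation already contains the whole fuzzy key, no separate key relation is added, and exactly two relations result.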
At this point, all the relations are also in fuzzy BCNF. Applying the Dependency Preservation Testing Algorithm, we see that the decomposition has the property of dependency preservation. We can also check whether the decomposition of R into R1, R2, R3, R5, and R6 has the lossless join property by using the lossless join property testing algorithm. The table has five rows, one for each decomposed relation, and 12 columns, one for each attribute. The table after the entries are initialized with respect to the decomposed relations is shown in Table X. Then the table should be processed for each fuzzy functional dependency. The ffds to be processed are
$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$,
$\{AvgAmnt, NumOvr\} \rightarrow_{0.7} HistBank$,
$\{Income, StatChng\} \rightarrow_{0.6} Personal$,
$\{HistIns, HistBank, Personal\} \rightarrow_{0.8} Reason$,
$Reason \rightarrow_{0.6} Fraud$
The result of this step is shown in Table XI. Because there is a row, that is, the fourth row, made up entirely of a symbols, the decomposition satisfies the lossless join property.
Table X. Initial table for relation R (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason) after setting the entries with respect to decomposed relations.

R   NumClaim  Amount  CustSince  AvgAmnt  NumOvr  Income  StatChng  HistIns  HistBank  Personal  Fraud  Reason
R1  a1        a2      a3         b14      b15     b16     b17       a8       b19       b110      b111   b112
R2  b21       b22     b23        a4       a5      b26     b27       b28      a9        b210      b211   b212
R3  b31       b32     b33        b34      b35     a6      a7        b38      b39       a10       b311   b312
R5  a1        a2      a3         a4       a5      a6      a7        b48      b49       b410      b411   a12
R6  b51       b52     b53        b54      b55     b56     b57       b58      b59       b510      a11    a12

Table XI. Table for relation R (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason) at the end of the lossless join testing algorithm.

R   NumClaim  Amount  CustSince  AvgAmnt  NumOvr  Income  StatChng  HistIns  HistBank  Personal  Fraud  Reason
R1  a1        a2      a3         b14      b15     b16     b17       a8       b19       b110      b111   b112
R2  b21       b22     b23        a4       a5      b26     b27       b28      a9        b210      b211   b212
R3  b31       b32     b33        b34      b35     a6      a7        b38      b39       a10       b311   b312
R5  a1        a2      a3         a4       a5      a6      a7        a8       a9        a10       a11    a12
R6  b51       b52     b53        b54      b55     b56     b57       b58      b59       b510      a11    a12

4. CONCLUSION
Like classical databases, fuzzy databases that are not properly designed suffer from the problems of data redundancy and update anomalies. To provide a good fuzzy relational database design, the concept of ffd is used to define the fuzzy normal forms and the dependency-preserving and lossless join properties.
In this article, we begin with the first step of the normalization process and define the Fuzzy 1NF. Then the concept of fuzzy key is introduced. It constitutes a base for the remaining fuzzy normal forms, Fuzzy 2NF, Fuzzy 3NF, and Fuzzy BCNF. To state the conditions for the fuzzy normal forms, the definitions of fuzzy prime and fuzzy nonprime attributes are introduced. We also discuss the two desirable properties of decompositions, namely the dependency preservation property and the lossless join property, which are both used by the design algorithms to achieve desirable decompositions. Normal forms are insufficient on their own as criteria for a good database design. The relations must collectively satisfy these two additional properties to qualify as a good design. The situation is the same when we deal with fuzzy data and fuzzy normal forms. We illustrate by examples how these fuzzy normal forms can be used to decompose an unnormalized relation into a set of normalized relations.
We have implemented a system (using Borland C++ 4.0) within this framework. The implementation consists of two main parts. The first part defines the attributes and their properties and provides an interface to accept tuples and check their conformance. The second part consists of normalization procedures, which check the normal form level of a relation and decompose it into the various normal forms with the dependency preservation and lossless join properties.
Further study involving fuzzy multivalued dependencies, fuzzy join dependencies, fuzzy inclusion dependencies, and the related normal forms is ongoing.
References
1. Codd E. A relational model for large shared data banks. Commun ACM 1970;13:377–387.
2. Chen G, Kerre EE, Vandenbulcke J. Normalization based on ffd in a fuzzy relational data model. Inform Syst 1996;21:299–310.
3. Imielinski T, Lipski W. Incomplete information in relational databases. J ACM 1984;31:701–791.
4. Medina J, Pons O, Vila M. GEFRED: A generalized model to implement fuzzy relational databases. Inform Sci 1994;47:234–254.
5. Petry FE. Fuzzy databases: Principles and applications. Boston: Kluwer Academic Publishers; 1996.
6. Raju KVSVN, Majumdar AK. Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems. ACM Trans Database Syst 1988;13:129–166.
7. Umano M. Freedom-0: A fuzzy database system. In: E. Sanchez, M. M. Gupta, editors. Fuzzy Information and Decision Processes. Amsterdam: North Holland; 1982. pp 339–347.
8. Yazıcı A, George R. Fuzzy database modeling. Heidelberg: Physica-Verlag; 1999.
9. Zadeh L. Similarity relations and fuzzy orderings. Inform Sci 1971;3:177–206.
10. Buckles BP, Petry FE. A fuzzy representation of data for relational databases. Fuzzy Set Syst 1982;7:213–226.
11. Prade H, Testemale C. Representation of soft constraints and fuzzy attribute values by means of possibility distributions in databases. In: James Bezdek, editor. Analysis of Fuzzy Information: Vol. II, Artificial Intelligence and Decision Systems. Boca Raton, FL: CRC Press; 1987. pp 213–229.
12. Rundensteiner E, Hawkes L, Bandler W. On nearness measures in fuzzy relational data models. Int J Approx Reason 1989;3:267–298.
13. Codd E. Further normalization of the database relational model. In: Rustin, editor. Data base systems. New York: Prentice-Hall; 1972. pp 33–64.
14. Elmasri R, Navathe SB. Fundamentals of database systems. New York: Benjamin Cummings Publishing Co.; 2000.
15. Shenoi S, Melton A, Fan LT. Functional dependencies and normal forms in fuzzy relational database model. Inform Sci 1992;60:1–28.
16. Liu W-Y. Fuzzy data dependencies and implication of fuzzy data dependencies. Fuzzy Set Syst 1997;92:341–348.
17. Yazıcı A, Sözat MI. A complete axiomatization for fuzzy functional and multivalued dependencies in fuzzy database relations. Fuzzy Set Syst 2001;117:161–181.
18. Chen G, Kerre EE, Vandenbulcke J. A computational algorithm for the FFD transitive closure and a complete axiomatization of fuzzy functional dependence (FFD). Int J Intell Syst 1994;9:421–439.
19. Nakata M, Murai T. Updating under integrity constraints in fuzzy databases. In: Proc Sixth IEEE Conf on Fuzzy Systems (FUZZ-IEEE'97). Barcelona: IEEE; 1997. pp 713–719.
20. Yazıcı A, Sözat MI. The integrity constraints for similarity-based fuzzy relational databases. Int J Intell Syst 1998;13:641–660.
21. Saxena PC, Tyagi BK. Fuzzy functional dependencies and independencies in extended fuzzy relational database models. Fuzzy Set Syst 1995;69:65–89.
22. Chen G, Kerre EE, Vandenbulcke J. On the lossless join decomposition of relation scheme(s) in a fuzzy relational data model. In: Bilal M. Ayyub, editor. Proc ISUMA '93, Second International Symposium on Uncertainty Modeling and Analysis. Los Alamitos, CA: IEEE Computer Society Press; 1993. pp 440–446.
23. Chen G, Kerre EE, Vandenbulcke J. The dependency preserving decomposition and a testing algorithm in a fuzzy relational data model. Fuzzy Set Syst 1995;72:27–37.
24. Kerre E, Zenner R, De Caluwe R. The use of fuzzy set theory in information retrieval and databases: A survey. J Am Soc Inform Sci 1986;37:341–345.