
Decomposition of Similarity-Based

Fuzzy Relational Databases

Özgün Bahar,

Adnan Yazıcı*

Department of Computer Engineering, Middle East Technical University,

06531, Ankara, Turkey

Fuzzy relational database models generalize the classical relational database model by allowing

uncertain and imprecise information to be represented and manipulated. In this article, we intro-

duce fuzzy extensions of the normal forms for the similarity-based fuzzy relational database

model. Within this framework of fuzzy data representation, similarity, conformance of tuples,

the concept of fuzzy functional dependencies, and partial fuzzy functional dependencies are

utilized to define the fuzzy key notion, transitive closures, and the fuzzy normal forms. Algo-

rithms for dependency preserving and lossless join decompositions of fuzzy relations are also

given. We include examples to show how normalization, dependency preserving, and lossless

join decomposition based on the fuzzy functional dependencies of fuzzy relation are done and

applied to some real-life applications. © 2004 Wiley Periodicals, Inc.

1. INTRODUCTION

The relational data model proposed by Codd [1] is based on set-theoretic concepts and enables well-defined, unambiguous, and exact representation of the data of an application. However, in many real-world applications, such as biology and genetics, geographical information systems, economic and weather forecasting systems, and so on, data is often partially known or imprecise and queries may include vague terms. To cope with various types of imperfectness and to capture more meaning of the data in databases, several extensions to the classical relational database model have been proposed in the literature [1–8]. Properly formulating a database model in terms of relation schemas is a key requirement in fuzzy database design. The main frameworks for fuzzy data representation based on fuzzy set theory [9] allow imprecise data for the attribute values and may be categorized into the partial membership-based approach [5,6], the similarity-based approach [10], the possibility-based approach [11], and the extended possibility-based approach [12]. The similarity-based framework is the approach used in this study.

*Author to whom all correspondence should be addressed: e-mail: yazici@ceng.metu.edu.tr.

e-mail: ozgun.bahar@isbank.com.tr.

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, VOL. 19, 885–917 (2004)

© 2004 Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com).

DOI 10.1002/int.20029

One of the primary purposes of any database is to decrease data redundancy and to provide data reliability [13,14]. Data redundancies and update anomalies have also been of great concern in fuzzy relational database design [2,6,15,16], and integrity constraints play an important role in fuzzy relational database design theory. Various types of data dependencies, such as functional and multivalued dependencies, are used as guidelines for the design of classical relational schemas that are conceptually meaningful and free of certain anomalies. For example, if one attribute determines another, we say that there exists a functional dependency between these attributes. This determination is unique in a classical (crisp) relational model, whereas it need not be in a fuzzy relational database model. In a crisp database model, functional and multivalued dependencies are precise determinants, and this is not the case for some real-world applications [17]. As the relational data model is extended to deal with fuzzy data, integrity constraints have also been extended, and in the literature there are a number of ways to impose fuzzy data dependencies on fuzzy data in fuzzy database relations [6,11,15,16,18–20,21]. The following is an example of fuzzy functional dependencies (ffds).

One of the areas in which fuzziness may be used is business and finance applications. To evaluate the creditworthiness of a customer, multiple financial and personal factors are used. Economic thinking and social integrity are two of these personal factors in the creditworthiness assessment for consumer credit. "Economic thinking and conformity with social and economic standards more or less determine the business behaviour" is a valid constraint in this application. In this example, business behaviour, economic thinking, and social integrity are all attributes of a person with inexact values. The "more or less" part of the constraint causes the constraint itself to be fuzzy. The dependency does not fix the precise level of determination, only the minimum level. The data dependency in this example application is a ffd, and such a dependency cannot be enforced by a crisp relational database system.

There have been a number of studies on extending data dependencies for fuzzy relational database models. Among these, Raju and Majumdar [6] have proposed ffds in terms of a membership function of the elements of a fuzzy relation. Chen et al. [2] have given a definition of ffds in terms of closeness measures () for the equality of possibility distribution and fuzzy implication operators. Shenoi et al. [15] have extended Buckles and Petry's approach [10] by defining ffds based on equivalence classes from finite domain partitions alone. Liu [16] has defined ffds based on the concept of semantic proximity in [0,1] between two fuzzy attribute values $v_1$ and $v_2$, which are intervals. Yazıcı [8] and Yazıcı and Sözat [17,20] have defined ffds between two fuzzy attribute values and proved the soundness and completeness of the inference rules of those ffds. The studies related to the normalization process, which analyzes the given relation schemas to achieve the desirable properties of minimizing redundancy and minimizing insertion, deletion, and update anomalies, take the fuzzy data and ffds into account [2,6,15]. In these studies, the main goal is that a fuzzy relation not in a certain normal form is decomposed into multiple fuzzy relation schemas of the desired normal

886 BAHAR AND YAZICI

forms. Dependency preserving and lossless join decompositions are used to achieve desirable decompositions. Chen et al. [22,23] and Raju and Majumdar [6] have studied such decompositions for fuzzy relational databases.

Our study differs from the previous research efforts in literature in a number

of aspects. First of all, the similarity-based fuzzy relational database model is used

as the reference model in our study. We deal with a number of issues to design the

similarity-based fuzzy relational databases in order to reduce data redundancy and

eliminate update anomalies. The formal definitions of ffd and partial ffd are given

based on the conformance of tuples. In addition, the fuzzy key concept and transi-

tive closure of the ffds are presented for definitions of the fuzzy normal forms.

Second, we introduce a number of fuzzy normal forms based on the ffds. We first

define the fuzzy first normal form (1NF). Afterward, fuzzy second (2NF), fuzzy

third (3NF), and fuzzy Boyce–Codd (BCNF) normal forms are introduced. Third,

the dependency preserving and lossless join properties in decompositions into the

fuzzy normal forms with respect to ffds are defined. Finally, all these concepts are

described along with examples to demonstrate how these concepts are used in

some real-life applications.

The article is organized as follows: The following section discusses some

background information, including fuzzy relational databases, similarity relations,

and similarity-based fuzzy relational databases. In Section 3, fuzzy functional

dependencies (ffds), tuple conformance, inference rules for ffds, four fuzzy nor-

mal forms, namely, fuzzy 1NF, fuzzy 2NF, fuzzy 3NF, and fuzzy BCNF forms,

and their decomposition algorithms are provided. In this section two testing algo-

rithms for the dependency preserving and lossless join properties of the decompo-

sitions are given. We also include a real-life application, fraud detection, to show

how normalization and dependency preserving and lossless join properties of the

similarity-based fuzzy relational databases are utilized. Finally, the conclusion is

given in Section 4.

2. BACKGROUND

In this section, we first define fuzzy relational databases. Then, similarity relations are described as defined by Zadeh [9], and the similarity-based fuzzy database model, the reference model of this study, is briefly explained.

2.1. Fuzzy Relational Databases

The relational data model uses the single concept of a relation both for data representation and data association, and it is grounded in set theory. In this model, every value in a relation must be atomic. Except for the null value, every attribute must have a precise value and cannot have fuzzy or uncertain values.

Several approaches are proposed for extending classical relational database

model to fuzzy relational database model. Fuzzy relational databases are the data-

bases that can represent fuzzy and uncertain data. An extensive list of references

NORMALIZATION OF FRDBs 887

to the relevant literature can be found in Refs. 5 and 24. The main contributions in this area are as follows. In the fuzzy relational data model proposed by Umano and Freedom [7], fuzzy data are represented by possibility distributions, and a grade of membership is used to represent the association between values. This grade of membership may itself be a possibility distribution. Buckles and Petry [10] introduced fuzzy similarity relations, which facilitate the estimation of the extent to which possible values of an attribute can be regarded as interchangeable. Prade and Testemale [11] generalized the representation of Umano by introducing an extra element, e, for situations where a nonzero possibility can mean the nonapplicability of an attribute. They also proposed the use of possibility distributions to represent fuzzy values as well as uncertainty concerning the value of an attribute. To handle incomplete information and missing and nonapplicable values, Imielinski and Lipski [3] proposed a method in which incomplete information is represented as a list of possible values. Lipski does not assume that null means a value that is completely unknown.

Two different causes of imprecise attribute values in database systems motivated two approaches for representing fuzzy data. The similarity-based approach [10] uses linguistic terms to describe attribute values. The impreciseness of these terms is characterized by a similarity matrix, which records the degree of similarity between pairs of linguistic terms in a domain. The possibility-based model is an alternative approach that represents imprecise data using a possibility distribution as the value of an attribute. Possibility measure and necessity measure are the two kinds of matching degrees calculated in this approach. There have also been some mixed models combining these approaches [12,19].

2.2. Similarity Relations

The identity relation used in nonfuzzy relational databases induces equivalence classes over a domain base set, $D_j$, which affects the result of certain operations and the removal of redundant tuples. The equivalence classes are most frequently singleton sets. The identity relation is a special case of the similarity relation.

Similarity relations are useful for describing how similar two elements from the same domain are. A similarity relation [9,10], $s(x, y)$, for a given domain $D_j$, is a mapping of every pair of elements in the domain onto the unit interval [0,1]. A similarity relation is reflexive and symmetric, as is an equivalence relation. The similarity relation should also have the transitive property. These three properties of a similarity relation are stated below.

Definition. A similarity relation is a mapping, $s: D \times D \to [0, 1]$, such that for $x, y, z \in D$,

$s(x, x) = 1$ (reflexivity),

$s(x, y) = s(y, x)$ (symmetry),

$s(x, z) \ge \max_{y \in D} \big( \min( s(x, y), s(y, z) ) \big)$ (max-min transitivity)
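The three properties in the definition can be checked mechanically for any finite domain. The following is a minimal sketch, not part of the article: the function name `is_similarity_relation` and the three-value domain are hypothetical and serve only as an illustration.

```python
from itertools import product


def is_similarity_relation(domain, s):
    """Return True if s[x][y] is reflexive, symmetric and max-min transitive."""
    reflexive = all(s[x][x] == 1 for x in domain)
    symmetric = all(s[x][y] == s[y][x] for x, y in product(domain, repeat=2))
    # Max-min transitivity: s(x, z) >= max over y of min(s(x, y), s(y, z)).
    transitive = all(
        s[x][z] >= max(min(s[x][y], s[y][z]) for y in domain)
        for x, z in product(domain, repeat=2)
    )
    return reflexive and symmetric and transitive


# Hypothetical three-value domain, used only for illustration.
domain = ["low", "medium", "high"]
good = {
    "low": {"low": 1, "medium": 0.8, "high": 0.2},
    "medium": {"low": 0.8, "medium": 1, "high": 0.2},
    "high": {"low": 0.2, "medium": 0.2, "high": 1},
}
# Raising s(medium, high) to 0.9 breaks transitivity:
# min(s(low, medium), s(medium, high)) = 0.8 exceeds s(low, high) = 0.2.
bad = {
    "low": {"low": 1, "medium": 0.8, "high": 0.2},
    "medium": {"low": 0.8, "medium": 1, "high": 0.9},
    "high": {"low": 0.2, "medium": 0.9, "high": 1},
}

print(is_similarity_relation(domain, good))
print(is_similarity_relation(domain, bad))
```

The second matrix is still reflexive and symmetric; only the transitivity test rejects it, which shows that the three conditions are independent checks.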


2.3. Similarity-Based Fuzzy Relational Databases

The similarity-based fuzzy relational model is not an extension to the original

relational model, but actually a generalization of it. It allows a set of values for an

attribute rather than only atomic values, and replaces the identity concept with a

similarity concept.

The similarity-based relational model allows a set of values for a single

attribute provided that all the values are from the same domain. Thus, while allow-

ing multiple values, similarity-based relational model keeps the strongly typed

attribute value property of the original model. This property is useful for query

processing and update operations. If the attribute value is precise and crisp, then

the value is atomic; if it is imprecise and inexact, then a set of values that are

similar to this value is stated in place of it. The level of similarity among the

values is defined by the explicitly defined similarity relation for the domain of the

attribute values.

The original model compares two attribute values by checking whether the two values are equal or not. The identity relation reflects this fact: $i(x, y) = 1$ if and only if $x = y$, and $i(x, y) = 0$ otherwise. The similarity-based relational model [10] compares two attribute values by measuring the closeness of the values in terms of the explicitly declared similarity relation of the attribute domain. A tuple in this model is called redundant if it can be merged with another through the set union of corresponding domain values.

3. FUZZY NORMAL FORMS FOR FUZZY RELATIONS

In a logical database design, integrity constraints have a critical role. One of

the most important integrity constraints is the functional dependency. Functional

dependencies reflect a kind of semantic knowledge about the relationships between

the attributes. They help the database designer remove some of the redundant information in the relations. To provide guidance for good fuzzy database design, several fuzzy normal forms based on fuzzy functional dependencies have been proposed.

3.1. Fuzzy Functional Dependencies

Fuzzy functional dependencies reflect some kind of semantic knowledge about

attribute subsets of the real world. Ffds are used to design fuzzy databases where

data redundancy and update anomalies are reduced.

In the classical relational data model, a functional dependency $X \to Y$ states that equal Y values correspond to equal X values; that is, any two tuples that agree on X also agree on Y. However, this definition of functional dependency is not directly applicable to similarity-based fuzzy databases, because the concept of equality does not totally apply to fuzzy relational database models. In a fuzzy relational data model, the degree of "X determines Y" may not necessarily be 1 as in the crisp case. Naturally, a value ranging over the interval [0,1] may be accepted. The definition of a ffd then becomes "similar Y values correspond to similar X values."


Ffds are functional constraints that are specified among the attributes of a fuzzy relation schema. In the definition of the ffds, we use the conformance concept [8,17,20]. According to the definition of conformance, a tuple is similar to itself independent of its attribute values, the uncertainty is kept even in the presence of ffds imposed on the relation, and conformance is transitive, symmetric, and reflexive. For precise ffds, the similarity of Y values has to be greater than or equal to the similarity of X values, where similarity is measured in terms of conformance. For imprecise ffds, the impreciseness of the dependency acts as a threshold on the similarity of Y values, weakening the dependency. Using the definition of ffd, we define the partial ffd, which is used in the definition of fuzzy 2NF.

3.1.1. Conformance of Tuples

A ffd can be represented as $X \to_q Y$, where $q$ is the linguistic strength (like "more or less," "sometimes," etc.). A ffd, $X \to_q Y$, states that similar Y values correspond to similar X values. Here similarity (or closeness) refers to conformance of tuples. The similarities of the attribute values define how conformant the two tuples are on that attribute. A formal definition of conformance [7] is given below.

Definition. The conformance of attribute $A_k$ defined on domain $D_k$ for any two tuples $t_1$ and $t_2$ present in relation instance $r$, denoted by $C(A_k[t_1, t_2])$, is given as

$$C(A_k[t_1, t_2]) = \min\Big\{ \min_{x \in d_1} \big\{ \max_{y \in d_2} \{ s(x, y) \} \big\},\ \min_{x \in d_2} \big\{ \max_{y \in d_1} \{ s(x, y) \} \big\} \Big\}$$

where $d_1$ is the value set of attribute $A_k$ for tuple $t_1$, $d_2$ is the value set of attribute $A_k$ for tuple $t_2$, $s(x, y)$ is a similarity relation for values $x$ and $y$, and $s$ is a mapping of every pair of elements in the domain $D_k$ onto the interval $[0, 1]$.

In the case of an ordinary relational data model, both $d_1$ and $d_2$ have to be singleton sets, and the similarity of any two tuples can have the value of either 0 or 1. Here, the identity relation is replaced by the explicitly declared $s(x, y)$, of which the identity relation is a special case. To describe the closeness of two tuples on a set of attributes rather than on a single attribute, the definition of conformance is extended in Ref. 8 as follows.

Definition. The conformance of attribute set $X$ for any two tuples $t_1$ and $t_2$ present in relation instance $r$, denoted by $C(X[t_1, t_2])$, is given as

$$C(X[t_1, t_2]) = \min_{A_k \in X} \{ C(A_k[t_1, t_2]) \}.$$
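The two conformance definitions translate directly into nested `min`/`max` expressions. The sketch below is an illustration under stated assumptions: the helper names (`build_similarity`, `conformance`, `set_conformance`) and the dictionary layout of tuples are hypothetical, while the similarity degrees are the ones given for PERFORMANCE and EARNING in Tables II and III of Example 1.

```python
def build_similarity(labels, rows):
    """Turn a square table of similarity degrees into a nested dict s[x][y]."""
    return {x: dict(zip(labels, row)) for x, row in zip(labels, rows)}


# Similarity degrees copied from Tables II and III of Example 1.
PERFORMANCE = build_similarity(
    ["very poor", "poor", "average", "good", "excellent"],
    [
        [1, 0.75, 0.3, 0.3, 0.3],
        [0.75, 1, 0.3, 0.3, 0.3],
        [0.3, 0.3, 1, 0.6, 0.6],
        [0.3, 0.3, 0.6, 1, 0.65],
        [0.3, 0.3, 0.6, 0.65, 1],
    ],
)
EARNING = build_similarity(
    ["little", "moderate", "average", "high", "very high"],
    [
        [1, 0.8, 0.2, 0.2, 0.2],
        [0.8, 1, 0.2, 0.2, 0.2],
        [0.2, 0.2, 1, 0.6, 0.6],
        [0.2, 0.2, 0.6, 1, 0.8],
        [0.2, 0.2, 0.6, 0.8, 1],
    ],
)


def conformance(s, d1, d2):
    """C(A_k[t1, t2]) for value sets d1 and d2 under similarity relation s."""
    forward = min(max(s[x][y] for y in d2) for x in d1)
    backward = min(max(s[x][y] for y in d1) for x in d2)
    return min(forward, backward)


def set_conformance(similarities, attributes, t1, t2):
    """C(X[t1, t2]): the minimum single-attribute conformance over X."""
    return min(conformance(similarities[a], t1[a], t2[a]) for a in attributes)


SIMS = {"performance": PERFORMANCE, "earning": EARNING}
kelly = {"performance": {"poor", "very poor"}, "earning": {"little"}}
matthew = {"performance": {"average"}, "earning": {"moderate", "average"}}

print(conformance(PERFORMANCE, kelly["performance"], matthew["performance"]))
print(conformance(EARNING, kelly["earning"], matthew["earning"]))
print(set_conformance(SIMS, ["performance", "earning"], kelly, matthew))
```

For these two tuples the single-attribute values are 0.3 and 0.2, the same values reported in step 2 of Example 1. The backward term matters here: every value of the first EARNING set has a close match in the second, but "average" in the second set has no close match in the first, which pulls the conformance down to 0.2.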

3.1.2. Definition of Fuzzy Functional Dependencies

A formal definition for the ffd can be given as follows.

Definition. Let $r$ be any fuzzy relation instance on schema $R(A_1, \ldots, A_n)$, $U$ be the universal set of attributes $A_1, \ldots, A_n$, and both $X$ and $Y$ be subsets of $U$. Fuzzy relation instance $r$ is said to satisfy the ffd, $X \to_q Y$, if for every pair of tuples $t_1$ and $t_2$ in $r$, $C(Y[t_1, t_2]) \ge \min(q, C(X[t_1, t_2]))$, where $q$ is a real number within the range $[0, 1]$, describing the linguistic strength.

As for their crisp counterparts, the ffds should also be checked whenever

tuples are inserted into the fuzzy relational database or they are modified, so that

the integrity constraints imposed by these ffds are not violated.

Example 1. Consider a fuzzy relation instance Person (Name, Performance, Earning). The similarity relations of the attribute domains are given in Tables I–III.

The integrity constraint for the Person relation is "Performance of the employee more or less determines his/her earning." That is, the ffd of this relation is PERFORMANCE $\to_{0.6}$ EARNING, where 0.6 is the linguistic strength, "more or less." This ffd should be checked whenever new tuples are to be inserted, to see whether the new tuple violates the ffd. Below, a couple of tuples are inserted to investigate the tuple conformance concept.

Step 1: Insertion of the first tuple

⟨{Kelly}, {poor, very poor}, {little}⟩

Since this is the first tuple, it does not violate the ffd.

Step 2: Insertion of the second tuple

⟨{Matthew}, {average}, {moderate, average}⟩

Table I. Similarity relation for attribute NAME.

NAME      Kelly  Jerry  Matthew  Sandra
Kelly     1      0      0        0
Jerry     0      1      0        0
Matthew   0      0      1        0
Sandra    0      0      0        1

Table II. Similarity relation for attribute PERFORMANCE.

PERFORMANCE  Very poor  Poor  Average  Good  Excellent
Very poor    1          0.75  0.3      0.3   0.3
Poor         0.75       1     0.3      0.3   0.3
Average      0.3        0.3   1        0.6   0.6
Good         0.3        0.3   0.6      1     0.65
Excellent    0.3        0.3   0.6      0.65  1


The conformance values of the left- and right-hand side attributes of the ffd are

$C(\text{Perf}[t_1, t_2]) = 0.3$, $C(\text{Earn}[t_1, t_2]) = 0.2$

Here, the ffd Performance $\to_{0.6}$ Earning is violated because $C(\text{Earn}[t_1, t_2]) < \min(0.6, C(\text{Perf}[t_1, t_2]))$, so the tuple is not inserted.

Step 3: Insertion of the third tuple

⟨{Jerry}, {average, good}, {moderate}⟩

There is only one tuple to be dealt with for the conformance check, because the tuple in step 2 is not inserted.

$C(\text{Perf}[t_1, t_2]) = 0.3$, $C(\text{Earn}[t_1, t_2]) = 0.8$

Then the ffd Performance $\to_{0.6}$ Earning is not violated because $C(\text{Earn}[t_1, t_2]) \ge \min(0.6, C(\text{Perf}[t_1, t_2]))$, so the tuple is inserted. Now, we have two tuples in our relation:

$t_1$: ⟨{Kelly}, {poor, very poor}, {little}⟩

$t_2$: ⟨{Jerry}, {average, good}, {moderate}⟩

Step 4: Insertion of the fourth tuple

⟨{Sandra}, {average}, {little}⟩

There are two tuples to be dealt with for the conformance check, because the tuple in step 2 is not inserted.

$C(\text{Perf}[t_1, t_3]) = 0.3$, $C(\text{Earn}[t_1, t_3]) = 1$,

$C(\text{Perf}[t_2, t_3]) = 0.6$, $C(\text{Earn}[t_2, t_3]) = 0.8$

Then the ffd Performance $\to_{0.6}$ Earning is not violated because both

$C(\text{Earn}[t_1, t_3]) \ge \min(0.6, C(\text{Perf}[t_1, t_3]))$

and

$C(\text{Earn}[t_2, t_3]) \ge \min(0.6, C(\text{Perf}[t_2, t_3]))$,

Table III. Similarity relation for attribute EARNING.

EARNING    Little  Moderate  Average  High  Very high
Little     1       0.8       0.2      0.2   0.2
Moderate   0.8     1         0.2      0.2   0.2
Average    0.2     0.2       1        0.6   0.6
High       0.2     0.2       0.6      1     0.8
Very high  0.2     0.2       0.6      0.8   1


so the tuple is inserted. Thus, we have three tuples in the relation:

$t_1$: ⟨{Kelly}, {poor, very poor}, {little}⟩

$t_2$: ⟨{Jerry}, {average, good}, {moderate}⟩

$t_3$: ⟨{Sandra}, {average}, {little}⟩
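The four insertion steps of Example 1 can be replayed in a few lines. This is a sketch rather than the authors' implementation: the function `try_insert` and the dictionary layout of tuples are hypothetical, and only the similarity degrees (Tables II and III), the strength 0.6, and the four candidate tuples come from the example.

```python
def table(labels, rows):
    """Turn a square table of similarity degrees into a nested dict s[x][y]."""
    return {x: dict(zip(labels, row)) for x, row in zip(labels, rows)}


# Similarity degrees copied from Tables II and III.
PERF = table(
    ["very poor", "poor", "average", "good", "excellent"],
    [
        [1, 0.75, 0.3, 0.3, 0.3],
        [0.75, 1, 0.3, 0.3, 0.3],
        [0.3, 0.3, 1, 0.6, 0.6],
        [0.3, 0.3, 0.6, 1, 0.65],
        [0.3, 0.3, 0.6, 0.65, 1],
    ],
)
EARN = table(
    ["little", "moderate", "average", "high", "very high"],
    [
        [1, 0.8, 0.2, 0.2, 0.2],
        [0.8, 1, 0.2, 0.2, 0.2],
        [0.2, 0.2, 1, 0.6, 0.6],
        [0.2, 0.2, 0.6, 1, 0.8],
        [0.2, 0.2, 0.6, 0.8, 1],
    ],
)


def conformance(s, d1, d2):
    """Conformance of two value sets under similarity relation s."""
    forward = min(max(s[x][y] for y in d2) for x in d1)
    backward = min(max(s[x][y] for y in d1) for x in d2)
    return min(forward, backward)


def try_insert(relation, new, q=0.6):
    """Append `new` only if PERFORMANCE ->q EARNING holds against every stored tuple."""
    for old in relation:
        c_perf = conformance(PERF, old["perf"], new["perf"])
        c_earn = conformance(EARN, old["earn"], new["earn"])
        if c_earn < min(q, c_perf):
            return False  # the ffd would be violated, so reject the tuple
    relation.append(new)
    return True


relation = []
candidates = [
    {"name": "Kelly", "perf": {"poor", "very poor"}, "earn": {"little"}},
    {"name": "Matthew", "perf": {"average"}, "earn": {"moderate", "average"}},
    {"name": "Jerry", "perf": {"average", "good"}, "earn": {"moderate"}},
    {"name": "Sandra", "perf": {"average"}, "earn": {"little"}},
]
results = [try_insert(relation, t) for t in candidates]
names = [t["name"] for t in relation]
print(results)
print(names)
```

Running the sketch accepts Kelly, rejects Matthew, and accepts Jerry and Sandra, so the stored relation ends with the same three tuples as the example.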

3.1.2.1. Partial Fuzzy Functional Dependencies. Using the definition of the ffd, we can define a partial ffd, which is used in the definition of the fuzzy 2NF.

Definition. $Y$ is called partially fuzzy functionally dependent on $X$ to the degree $q$, $X \to_q Y$ partially, if and only if $X \to_q Y$ and there exists an $X' \subset X$, $X' \ne \emptyset$, such that $X' \to_a Y$ where $a \ge q$.

In more relaxed terms, a ffd $X \to_q Y$ is a partial ffd if the dependency still holds after an attribute $A$ is removed from $X$. That is, for an attribute $A \in X$, $X - \{A\}$ still fuzzy functionally determines $Y$ to the degree $a \ge q$.

Example 2. Let the relational schema be $R(A, B, C)$ and the ffds be $AB \to_{0.8} C$ and $A \to_{0.9} C$. After removing attribute $B$ from the first ffd, the dependency still holds; hence $AB \to_{0.8} C$ is a partial ffd.
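A partiality test along the lines of Example 2 can be sketched as follows. It is a deliberate simplification, labeled as such: the hypothetical function `is_partial` only looks at ffds that are listed explicitly and does not consider ffds derivable through inference rules. The representation of a ffd as a `(lhs, strength, rhs)` triple is likewise an assumption made for the illustration.

```python
from itertools import combinations


def is_partial(ffd, ffds):
    """Check whether some proper, nonempty subset of the left-hand side
    already determines the right-hand side with strength a >= q.

    Simplifying assumption: only explicitly listed ffds are inspected.
    """
    lhs, q, rhs = ffd
    for size in range(1, len(lhs)):  # proper nonempty subsets only
        for subset in combinations(sorted(lhs), size):
            for other_lhs, a, other_rhs in ffds:
                if (
                    set(other_lhs) == set(subset)
                    and set(rhs) <= set(other_rhs)
                    and a >= q
                ):
                    return True
    return False


# The ffds of Example 2: AB ->0.8 C and A ->0.9 C.
F = [("AB", 0.8, "C"), ("A", 0.9, "C")]
print(is_partial(F[0], F))
print(is_partial(F[1], F))
```

The first ffd is reported as partial because A alone determines C with strength 0.9, which is at least 0.8. The second has a single-attribute left-hand side, so it has no proper nonempty subset and cannot be partial.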

3.1.3. Inference Rules for Fuzzy Functional Dependencies

An important concept related to data dependencies is the concept of infer-

ence rules. Given a set of dependencies, inference rules introduce other dependen-

cies that are logical consequences of the given dependencies. These rules are

dependency generators and so they are closely related to the definition and seman-

tics of the dependencies.

The fuzzy inference rules are listed below for ffds. They reduce to those of

the classic fds. The inference rules presented below have already been shown to be

sound and complete in Ref. 17.

(1) Inclusive rule for ffds:

If $X \to_{u_1} Y$ holds and $u_1 \ge u_2$, then $X \to_{u_2} Y$ holds.

(2) Reflexive rule for ffds:

If $X \supseteq Y$, then $X \to_u Y$ holds for all $u \in [0, 1]$.

(3) Augmentation rule for ffds:

Whenever $r$ satisfies $X \to_u Y$, it also satisfies $XZ \to_u YZ$.

(4) Transitivity rule for ffds:

Whenever $r$ satisfies $X \to_{u_1} Y$ and $Y \to_{u_2} Z$, it also satisfies $X \to_{\min(u_1, u_2)} Z$.


By successive application of the above inference rules, additional inference rules

for the ffds can be stated:

(1) Union rule for ffds:

Whenever $X \to_{u_1} Y$ and $X \to_{u_2} Z$ are satisfied by $r$, $X \to_{\min(u_1, u_2)} YZ$ is also satisfied.

(2) Pseudotransitivity rule for ffds:

Whenever $r$ satisfies $X \to_{u_1} Y$ and $WY \to_{u_2} Z$, then it also satisfies $WX \to_{\min(u_1, u_2)} Z$.

(3) Decomposition rule for ffds:

If $X \to_u Y$ holds and $Z \subseteq Y$, then $X \to_u Z$ holds.
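As an illustration of how the additional rules follow from the basic ones, the union rule can be obtained from augmentation and transitivity. The derivation below is a sketch of the standard argument, written with the strengths $u_1$ and $u_2$ used in the rules; it is not reproduced from the article.

```latex
\begin{align*}
X \to_{u_1} Y &\;\Rightarrow\; X \to_{u_1} XY
  && \text{(augmentation with } X\text{, since } XX = X\text{)} \\
X \to_{u_2} Z &\;\Rightarrow\; XY \to_{u_2} YZ
  && \text{(augmentation with } Y\text{)} \\
X \to_{u_1} XY,\ XY \to_{u_2} YZ &\;\Rightarrow\; X \to_{\min(u_1, u_2)} YZ
  && \text{(transitivity)}
\end{align*}
```

The resulting strength is the weaker of the two, which matches the $\min(u_1, u_2)$ stated in the union rule.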

3.2. Fuzzy Keys

Like their classical relational counterparts, fuzzy normal forms are based on the concept of ffd and the concept of fuzzy key. Therefore, we define the fuzzy key concept in this section.

A primary key is a special case of functional dependency in classical relational database models. The role of $X$ in the functional dependency $X \to Y$ belongs to the attributes in the key, and the set of all other attributes in the relation plays the role of $Y$. That is, a key $K$, a subset of $U$, of a relation schema $R$ means that the values of $U$ are determined from the $K$ values for all tuples of any relation of $R$. In the classical relational data model, identical $K$ values lead to identical $U$ values. In the fuzzy relational data model, the concept of being identical again gives way to similarity (or closeness). The determination is reflected by the relationship that identical $K$ values lead to identical $U$ values, and close $K$ values lead to close $U$ values to a certain extent. In fuzzy relational databases, the classical primary key is extended to a fuzzy key with strength $q$, where $q$ is the extent mentioned before. A more formal definition can be given as follows.

Definition. Let $K, S \subseteq U$, and $F$ be a set of ffds for $R$: $K$ is called a fuzzy key of $R$ with strength $q$ if and only if $K \to_{q_i} U \in F$ and $K \to_{q_i} U$ is not a partial ffd, where $q = \min q_i$ and $q > 0$.

Example 3. As a symbolic example, consider a relation $R(A, B, C, D)$ with ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$. Then $A$ is called the fuzzy key of the relation with strength 0.7, because $B$ values are determined by $A$ to the degree 0.7, and $C$ and $D$ values are determined by $A$ to the degree 0.9. Our $q_i$ values are $q_1 = 0.7$ and $q_2 = 0.9$, and the $q$ value is then the minimum of $\{0.7, 0.9\}$, that is, 0.7.

A fuzzy key can have the values that an ordinary attribute can take. It can be multivalued, such as $\{a, b\}$, where $a$ and $b$ are similar to each other to a certain degree. The only restriction on the values of a fuzzy key, like the values of other attributes, is that the values should not be AND-combined, as will be explained later.

3.2.1. Transitive Closure of Fuzzy Functional Dependencies

Given a set of ffds for a relation, the fuzzy key of that relation can be found utilizing the concept of transitive closure. Chen, Kerre, and Vandenbulcke [18] studied the ffd transitive closure and axiomatization of fuzzy functional dependence. Transitive closure comes into play when we want to know whether a given ffd can be derived using the ffd set F of a relation and the inference rules for ffds. However, it is not a simple task to compute the set of all ffds that are derived from F using the inference rules, because the set is infinite. Instead of computing this whole set, the algorithm below finds all attributes that are fuzzy functionally dependent on attribute(s) X, and the maximal degree to which the dependencies hold, namely the transitive closure of X.

Algorithm. Transitive Closure Computation Algorithm. Let $X$ be a set of $k$ attributes, $X = X_1 X_2 \ldots X_k$:

(1) Initially construct the closure list of X, XList, with the attributes in X with the maximal degree, 1, for each.

XList = $\{(X_1, 1), (X_2, 1), \ldots, (X_k, 1)\}$

The domain, Dom, contains the attributes in the XList; $X_1, X_2, \ldots, X_k$ initially. BList is a temporary closure list, initialized as empty at the beginning.

(2) For each ffd $V \to_a W$ in F:

If the left-hand side of the ffd is a subset of the domain, $V \subseteq$ Dom,

Find the minimum strength in XList, among the elements of XList whose attributes are the elements of V, minstrength.

Set f as the minimum of a and the strength found in the previous step, f = min(a, minstrength).

For each attribute $W_j$ of the right-hand side W, add the entry $(W_j, f)$ to the BList.

(3) Combine BList into XList using fuzzy union operation.

(4) If there is a change in XList, reset the BList, adjust the domain, Dom, according to the new elements of XList, and go to step 2. Else stop, XList is the transitive closure of X.

Example 4. Consider the relation in Example 3: the relation R has the attribute set $\{A, B, C, D\}$ and ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$. Let us compute the transitive closure of attribute A.

Initially,

XList = $\{(A, 1)\}$, Dom = $\{A\}$, BList = $\emptyset$

For the first ffd, $A \to_{0.7} B$

Minstrength = 1, f = min(1, 0.7) = 0.7

BList = $\{(B, 0.7)\}$

For the second ffd, $A \to_{0.9} CD$

Minstrength = 1, f = min(1, 0.9) = 0.9

BList = $\{(B, 0.7), (C, 0.9), (D, 0.9)\}$

Combining BList into XList, XList = $\{(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)\}$. Because XList is changed, we reset BList and our new domain is Dom = $\{A, B, C, D\}$. Then the two ffds are considered again in the same way. This time, there is no change in XList, and hence the transitive closure of A is $\{(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)\}$.
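The closure computation and Example 4 can be sketched in Python. Two points are assumptions rather than statements from the article: ffds are represented as hypothetical `(lhs, strength, rhs)` triples, and the "fuzzy union" of step 3 is taken to keep, for each attribute, the maximum of the degrees found so far.

```python
def transitive_closure(attributes, ffds):
    """Return {attribute: degree} for everything fuzzy functionally
    dependent on `attributes`, following steps (1)-(4) of the algorithm."""
    xlist = {a: 1.0 for a in attributes}  # step 1: X with maximal degree 1
    while True:
        blist = {}  # temporary closure list, reset on every pass
        for lhs, strength, rhs in ffds:  # step 2
            if all(a in xlist for a in lhs):  # V is a subset of Dom
                minstrength = min(xlist[a] for a in lhs)
                f = min(strength, minstrength)
                for w in rhs:
                    blist[w] = max(blist.get(w, 0.0), f)
        # Step 3: fuzzy union, assumed here to be the per-attribute maximum.
        merged = dict(xlist)
        for attr, degree in blist.items():
            merged[attr] = max(merged.get(attr, 0.0), degree)
        if merged == xlist:  # step 4: stop when nothing changes
            return xlist
        xlist = merged


# The ffds of Examples 3 and 4: A ->0.7 B and A ->0.9 CD.
F = [("A", 0.7, "B"), ("A", 0.9, "CD")]
closure_of_a = transitive_closure("A", F)
print(closure_of_a)
```

For attribute A this yields degree 1 for A, 0.7 for B, and 0.9 for C and D, matching the closure derived in Example 4. Because degrees only grow and come from a finite set of values, the loop terminates.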

3.2.2. Finding the Fuzzy Key of a Relation

To find the fuzzy key of a relation, the concept of transitive closure for ffds is

used. The exhaustive way is to analyze the transitive closure of all the combina-

tions of all of the attributes in the relation and check whether the transitive clo-

sures found include all the attributes. This means that the attribute combination

determines all the attributes in the relation to the respective degrees in the closure

list, and the minimum of these strength values would be the strength of the fuzzy

key. But in this case, there is no need to consider the transitive closures of all

attributes, because for an attribute to be a part of a fuzzy key, it should belong to

the left-hand side of any of the ffds, or it should not exist in any of the ffds in the

relation. That is, to find a fuzzy key, the attributes that appear only on the right-hand sides of the ffds in the relation need not be considered in finding the transitive closures. Below is the algorithm to find the fuzzy keys of a given relation with

a set of ffds F.

Algorithm. Fuzzy Key Finding Algorithm. Let F be the set of ffds of R:

(1) Find all the left-hand side attributes of ffds in F.

(2) Find the attributes not contained in any of the ffds of F.

(3) Get the union of the two sets found in the first two steps above into AttributeList.

(4) Beginning with the single attribute combinations, for all the ascending combinations

of attributes in AttributeList (say comb for the combination):

If the transitive closure found contains all the attributes of the relation, set a to the

minimum of the strengths in the transitive closure, and add comb to the key list with

the degree a.

With this algorithm, all the candidate keys can be found. The first control of the

fourth step in the algorithm ensures the full fuzzy functional dependence of the

attributes of the relation on the fuzzy key.

Example 5. Let us consider Example 3 again. To find all the fuzzy keys of the relation R(A, B, C, D) with ffds $A \to_{0.7} B$ and $A \to_{0.9} CD$, we apply the algorithm above. The set of left-hand-side attributes of R is $\{A\}$. There is no attribute not contained in any of the ffds, so AttributeList $= \{A\}$. Because there is only one attribute in AttributeList, only one transitive closure, that of attribute A, needs to be computed. The transitive closure of A is $\{(A, 1), (B, 0.7), (C, 0.9), (D, 0.9)\}$. Because the transitive closure contains all the attributes in the relation, A is the fuzzy (candidate) key of the relation with strength 0.7, that is, the minimum of (1, 0.7, 0.9).
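The key-finding procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that the strength of a derived attribute is the minimum of the ffd strength and the strengths of the left-hand-side attributes, that the best (maximum) derivation is kept, and that combinations containing an already found key are skipped. The names `fuzzy_closure` and `fuzzy_keys` and the `(lhs, rhs, strength)` tuple encoding are hypothetical.

```python
from itertools import combinations

def fuzzy_closure(attrs, ffds):
    """Strength-annotated transitive closure of `attrs` (assumed min/max semantics)."""
    closure = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, strength in ffds:
            if all(a in closure for a in lhs):
                # Chain strengths with min along the derivation.
                derived = min([strength] + [closure[a] for a in lhs])
                for b in rhs:
                    if derived > closure.get(b, 0.0):
                        closure[b] = derived
                        changed = True
    return closure

def fuzzy_keys(relation, ffds):
    """Return [(key_attributes, strength)] following steps (1)-(4) of the algorithm."""
    lhs_attrs = {a for lhs, _, _ in ffds for a in lhs}
    in_ffds = lhs_attrs | {b for _, rhs, _ in ffds for b in rhs}
    candidates = sorted(lhs_attrs | (set(relation) - in_ffds))
    keys = []
    for size in range(1, len(candidates) + 1):
        for comb in combinations(candidates, size):
            # Assumed minimality check: skip supersets of keys found so far.
            if any(set(k) <= set(comb) for k, _ in keys):
                continue
            closure = fuzzy_closure(comb, ffds)
            if set(relation) <= set(closure):
                keys.append((comb, min(closure.values())))
    return keys

# Example 5: R(A, B, C, D) with A ->0.7 B and A ->0.9 CD
example5 = [("A", "B", 0.7), ("A", "CD", 0.9)]
print(fuzzy_keys("ABCD", example5))  # [(('A',), 0.7)]
```

Attributes are single characters here only to keep the example short; any hashable attribute names would work with lists or tuples in place of strings.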

3.2.3. Fuzzy Prime and Nonprime Attributes

To be able to state the condition for the fuzzy 2NF, it is also necessary to

define fuzzy prime and fuzzy nonprime attributes for a relation.

Definition. Let $A \in U$, $X \subseteq U$, and K be a fuzzy key set of R. A is called a fuzzy prime attribute if and only if $A \in K$; X is called fuzzy prime if and only if $X \subseteq K$. Those attributes that are not fuzzy prime are called fuzzy nonprime.

For an attribute to be a fuzzy prime attribute, it should be a part of at least one

of the fuzzy candidate keys of the relation. Similarly, for an attribute to be a fuzzy

nonprime attribute, it should not appear in any of the fuzzy candidate keys of the

relation. In Example 5, the attribute A is a prime attribute with a degree of 0.7.

3.3. Fuzzy First Normal Form

The first one of the classical normal forms that is extended and generalized

within the framework of similarity-based fuzzy relational model is the 1NF.

Definition. Let $D_k$ be the domain of attribute $A_k$. A relation schema R is said to be in fuzzy 1NF if and only if, for any relation r of R, none of the attributes has multivalued (AND-combined) values.

When a relation schema is not in fuzzy 1NF, the algorithm below can be used

to normalize the relation to be in fuzzy 1NF.

Algorithm. Fuzzy 1NF Decomposition Algorithm.

When the relation is not in fuzzy 1NF, remove each tuple whose attribute values violate fuzzy 1NF.
Place each of these values in a separate tuple, along with the values of the other attributes, to achieve fuzzy 1NF.

Example 6. Consider a relation schema R, and let its attributes be NAME, AGE, and LANGUAGE-SPOKEN. A relation r of R consists of four tuples given as
$t_1$ = (Kelly, 35, English)
$t_2$ = (Jerry, [very young, young], {English, French})
$t_3$ = (Matthew, middle-aged, an oriental language)
$t_4$ = (Sandra, 60, German)
In r, $t_1$ means Kelly is 35 years old and speaks English, $t_2$ means that Jerry, quite young, speaks English and French, $t_3$ means Matthew, who is middle-aged, speaks Japanese, and $t_4$ means Sandra, aged 60, speaks German.

This schema does not satisfy fuzzy 1NF because of the second tuple. In this

tuple, Jerry speaks two languages, and this is an example of multivalued (AND-

combined) data. When we apply the algorithm to make the relation in fuzzy 1NF,

the tuples become

$t_1$ = (Kelly, 35, English)
$t_2 \Rightarrow t_5$ = (Jerry, [very young, young], English)
$t_6$ = (Jerry, [very young, young], French)
$t_3$ = (Matthew, middle-aged, Japanese)
$t_4$ = (Sandra, 60, German)

where the relation is now in fuzzy 1NF.
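The tuple-splitting step can be illustrated with a short Python sketch. It assumes, purely for illustration, that an AND-combined multivalued attribute is modelled as a Python `set`, while a fuzzy range such as [very young, young] is kept as a single value (here a tuple). The function name `to_fuzzy_1nf` is hypothetical.

```python
def to_fuzzy_1nf(tuples):
    """Split every AND-combined multivalued attribute (modelled as a set) into separate tuples."""
    result = []
    for row in tuples:
        expanded = [()]
        for value in row:
            # Only sets are treated as AND-combined; everything else is a single value.
            choices = sorted(value) if isinstance(value, (set, frozenset)) else [value]
            expanded = [prefix + (c,) for prefix in expanded for c in choices]
        result.extend(expanded)
    return result

r = [
    ("Kelly", 35, "English"),
    ("Jerry", ("very young", "young"), {"English", "French"}),
    ("Matthew", "middle-aged", "Japanese"),
    ("Sandra", 60, "German"),
]
for t in to_fuzzy_1nf(r):
    print(t)
```

The second tuple is replaced by two tuples, one per language, while the fuzzy age range stays intact in both.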

3.4. Fuzzy Second Normal Form

The fuzzy second normal form, fuzzy 2NF, is based on the concept of the full

ffd. By using the concepts of fuzzy key and partial fuzzy functional dependence,

we can define the fuzzy 2NF.

Definition. Let F be the set of ffds for schema R and K be a fuzzy key of R with

strength q. R is called to be in fuzzy 2NF if and only if none of the fuzzy nonprime

attributes is partially fuzzy functionally dependent on the fuzzy key, K.

Example 7. Let us consider a symbolic example, where the relation schema is R(A, B, C, D) and the ffds are $AB \to_{0.8} D$ and $A \to_{0.9} C$. Then AB is the fuzzy key with strength 0.8. Because a fuzzy nonprime attribute, C, is partially fuzzy functionally dependent on the fuzzy key of R, AB, R is not in fuzzy 2NF.

3.4.1. Fuzzy Second Normal Form Control

Because the definition of fuzzy 2NF involves the control of partial ffd of

fuzzy nonprime attributes on the fuzzy key of R, an algorithm is used to control

partial fuzzy functional dependence and it is given below.

Algorithm. Partial Dependency Control Algorithm. Let the ffd to be investigated for being partial be $X \to_{\alpha} Y$.
(1) If the left-hand side of the ffd, X, contains a single attribute, the test need not be applied at all; the ffd is not partial. Otherwise,
(2) Beginning with the single-attribute combinations, for all combinations of the attributes of X in ascending order of size, except for the combination containing all the attributes:
Find the transitive closure of the combination.
If the transitive closure contains all the attributes of the right-hand side of the ffd, Y, and the corresponding strengths are greater than or equal to $\alpha$, then the ffd is partial.

The algorithm above is based on the fact that, if a proper subset of left-hand side

attributes of a ffd fuzzy functionally determines the right-hand side to a degree

greater than or equal to the strength of the ffd, then the ffd is partial.
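A minimal Python sketch of this test is given below. It assumes the min/max strength semantics for the transitive closure (strengths are chained with min, and the strongest derivation is kept); the helper names and the `(lhs, rhs, strength)` encoding are hypothetical.

```python
from itertools import combinations

def fuzzy_closure(attrs, ffds):
    """Strength-annotated closure of `attrs` under `ffds` (assumed min/max semantics)."""
    closure = {a: 1.0 for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, strength in ffds:
            if all(a in closure for a in lhs):
                derived = min([strength] + [closure[a] for a in lhs])
                for b in rhs:
                    if derived > closure.get(b, 0.0):
                        closure[b] = derived
                        changed = True
    return closure

def is_partial(lhs, rhs, alpha, ffds):
    """True if a proper subset of `lhs` determines all of `rhs` with strength >= alpha."""
    lhs = sorted(set(lhs))
    if len(lhs) == 1:
        return False  # a single-attribute left-hand side cannot be partial
    for size in range(1, len(lhs)):  # proper subsets only
        for subset in combinations(lhs, size):
            closure = fuzzy_closure(subset, ffds)
            if all(closure.get(b, 0.0) >= alpha for b in rhs):
                return True
    return False

# Example 7: key AB (strength 0.8) with ffds AB ->0.8 D and A ->0.9 C
example7 = [("AB", "D", 0.8), ("A", "C", 0.9)]
print(is_partial("AB", "C", 0.8, example7))  # True: A alone determines C with 0.9 >= 0.8
print(is_partial("AB", "D", 0.8, example7))  # False
```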

To understand whether a given relation is in its fuzzy 2NF, all the fuzzy non-

prime attributes of the relation should be checked to see whether they are partially

fuzzy functionally dependent on any of the fuzzy keys of the relation. The algo-

rithm below is developed for the fuzzy 2NF control for a given relation.

Algorithm. Fuzzy 2NF Control Algorithm. Let K be the set of fuzzy keys of relation R.
For each candidate key $K_i$ of the relation,
If the fuzzy key contains a single attribute, it cannot have a partial ffd; continue with another candidate key.
For each fuzzy nonprime attribute $A_j$ of the relation,
Form the ffd $K_i \to_{\alpha_i} A_j$, where $\alpha_i$ is the strength of $K_i$.
Apply the partial dependency control algorithm to find out whether the ffd is a partial ffd. If so, stop; the relation is not in fuzzy 2NF.

3.4.2. Decomposition into Fuzzy Second Normal Form

If a relation schema is not in fuzzy 2NF, it can be normalized into a number of

smaller relations in fuzzy 2NF by the following algorithm.

Algorithm. Fuzzy 2NF Decomposition Algorithm: If the relation is not in fuzzy

2NF, using the Fuzzy 2NF control algorithm, find the partial fuzzy keys and their

dependent fuzzy nonprime attributes.

Decompose and set up a new relation for each partial fuzzy key with its dependent

attributes.

Extract the fuzzy nonprime attributes that are partially fuzzy functionally dependent on

any fuzzy key of the relation from the original relation and set up a new relation with the

remaining attributes.

Example 8. If we consider Example 7 again, the relation was R(A, B, C, D), the ffds were $AB \to_{0.8} D$ and $A \to_{0.9} C$, and AB was the fuzzy key of the relation with strength 0.8. The second ffd, $A \to_{0.9} C$, has a part of the fuzzy key as its left-hand side, so we have to decompose the relation. According to our algorithm, the decomposition is R1(A, C) and R2(A, B, D), where A is the fuzzy key of the first relation with strength 0.9 and AB is the fuzzy key of the second relation with strength 0.8.


3.5. An Example Application: Leasing Risk Assessment

To automate the risk assessment evaluation for car leasing contracts, a fuzzy

enhanced score card system is developed. There are three different customer types:

private, self-employed, and corporate customers. For modeling private customers,

factors such as age, marital status, length of time at present address, and so forth

are used, that is, the attributes are generally crisp. On the other hand, corporate

customers have more input variables that are a bit more complicated and contain

fuzzy data. Attributes of the relation Leasing Risk Assessment are (Capital, Revenue, Workforce, CompAge, LegalType, FinanBack, CompStruct, IlliquidRisk, CreditRating), where
Capital: Company's capital basis
Revenue: Company's annual revenue
Workforce: Number of employees
CompAge: Age of the company
LegalType: Legal status of the company
FinanBack: Financial background evaluation
CompStruct: Company structure evaluation
IlliquidRisk: Evaluation of the risk of the company becoming illiquid
CreditRating: Credit rating for the current leasing contract

with the ffds specified below:

FFD1: A company's capital basis and annual revenue generally determine its financial background.
$\{\text{Capital}, \text{Revenue}\} \to_{0.8} \text{FinanBack}$
FFD2: The number of employees, the age of the company, and its legal status together more or less determine the structure of the company.
$\{\text{Workforce}, \text{CompAge}, \text{LegalType}\} \to_{0.7} \text{CompStruct}$
FFD3: The financial background and the structure of the company mostly determine the risk of the company becoming illiquid.
$\{\text{FinanBack}, \text{CompStruct}\} \to_{0.9} \text{IlliquidRisk}$
FFD4: The evaluation of the risk of the company becoming illiquid more or less determines the credit rating of the company.
$\text{IlliquidRisk} \to_{0.7} \text{CreditRating}$

In this relation (Capital, Revenue, WorkForce, CompAge, LegalType) is the fuzzy

key with strength 0.7. FFD1 contains a part of the fuzzy key as its left-hand side,

that is, FinanBack is partially fuzzy functionally dependent on the fuzzy key. Also

in FFD2, CompStruct is partially fuzzy functionally dependent on the fuzzy key.


So the relation is not in fuzzy 2NF; it should be decomposed. The decomposition

is as follows:

R1 (Capital, Revenue, FinanBack)
where {Capital, Revenue} is the fuzzy key with strength 0.8, and its ffd is
$\{\text{Capital}, \text{Revenue}\} \to_{0.8} \text{FinanBack}$
R2 (Workforce, CompAge, LegalType, CompStruct)
where {Workforce, CompAge, LegalType} is the fuzzy key with strength 0.7, and its ffd is
$\{\text{Workforce}, \text{CompAge}, \text{LegalType}\} \to_{0.7} \text{CompStruct}$
We must also make sure to keep a relation with the remaining attributes, removing FinanBack and CompStruct from the relation. So, we have the third relation below, with (Capital, Revenue, Workforce, CompAge, LegalType) being the fuzzy key with strength 0.7:
R3 (Capital, Revenue, Workforce, CompAge, LegalType, IlliquidRisk, CreditRating)
and the corresponding ffds are
$\{\text{Capital}, \text{Revenue}, \text{Workforce}, \text{CompAge}, \text{LegalType}\} \to_{0.9} \text{IlliquidRisk}$
$\text{IlliquidRisk} \to_{0.7} \text{CreditRating}$

At this point, all three relations, R1, R2, and R3 are in fuzzy 2NF.

3.6. Fuzzy Third Normal Form

The normalization process takes a relation schema through a series of tests to

certify whether it satisfies a certain normal form. The process proceeds in a top-

down fashion. In a database design satisfying the fuzzy 3NF, insertion, deletion,

and update anomalies will be minimum.

Definition. Let F be the set of ffds for R, and K be the fuzzy key of R with strength $\theta$. R is said to be in fuzzy 3NF if and only if R is in fuzzy 2NF and, for any $X \to_{\alpha} A$ in F where A is not in X, either X contains the fuzzy key or A is fuzzy prime.

3.6.1. Fuzzy Third Normal Form Control

The definition of fuzzy 3NF can directly be used to control whether a given

relation is in fuzzy 3NF. All of the ffds should be checked against the conditions:

If the left-hand side attributes contain all the attributes of the right-hand side, that

ffd does not violate fuzzy 3NF. Similarly if the left-hand side contains any of the


fuzzy keys of the relation, fuzzy 3NF is not violated. And finally, if the right-hand

side attributes of the ffd are all fuzzy prime attributes, fuzzy 3NF is also not vio-

lated. These are composed together in the algorithm below.

Algorithm. F3NF Control Algorithm. Let K be the fuzzy key set of relation R.
(1) For every ffd $X \to_{\alpha} Y$ in the relation,
If $Y \subseteq X$, fuzzy 3NF is not violated; otherwise,
If $X \supseteq K_i$ for any $K_i \in K$, fuzzy 3NF is not violated; otherwise,
Let P be the set of fuzzy prime attributes of R. If $Y \subseteq P$, fuzzy 3NF is also not violated.
(2) If none of the above conditions is satisfied for at least one of the ffds in the relation, the relation is not in fuzzy 3NF.

Example 9. For a symbolic example, let R(A, B, C, D, E) and the ffds be $AB \to_{0.9} C$, $AC \to_{0.8} D$, and $C \to_{0.6} E$. The first ffd has the fuzzy key, AB, as its left-hand side, so it does not violate fuzzy 3NF. But the second and third ffds, $AC \to_{0.8} D$ and $C \to_{0.6} E$, violate the fuzzy 3NF definition: their left-hand sides do not contain the fuzzy key, AB, and D and E are not fuzzy prime. Then R is not in fuzzy 3NF.
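The three conditions of the control algorithm translate almost directly into code. The sketch below assumes the fuzzy keys are already known and passed in; the function name `fuzzy_3nf_violations` and the `(lhs, rhs, strength)` encoding are hypothetical.

```python
def fuzzy_3nf_violations(ffds, keys):
    """Return the ffds that satisfy none of the three fuzzy 3NF conditions."""
    prime = set().union(*keys)  # attributes appearing in at least one fuzzy key
    violations = []
    for lhs, rhs, strength in ffds:
        x, y = set(lhs), set(rhs)
        if y <= x:                          # right-hand side contained in left-hand side
            continue
        if any(set(k) <= x for k in keys):  # left-hand side contains a fuzzy key
            continue
        if y <= prime:                      # right-hand side is fuzzy prime
            continue
        violations.append((lhs, rhs, strength))
    return violations

# Example 9: fuzzy key AB, ffds AB ->0.9 C, AC ->0.8 D, C ->0.6 E
example9 = [("AB", "C", 0.9), ("AC", "D", 0.8), ("C", "E", 0.6)]
print(fuzzy_3nf_violations(example9, ["AB"]))
# [('AC', 'D', 0.8), ('C', 'E', 0.6)]
```

An empty result means no ffd violates the conditions; the fuzzy 2NF requirement of the definition has to be checked separately.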

3.6.2. Decomposition into Fuzzy Third Normal Form

The normalization process based on ffds uses a number of decompositions

while normalizing the relations. But normal forms do not always guarantee a good

database design. Generally it is not sufficient to only check that each relation schema

in the database is in one of the fuzzy normal forms, fuzzy 3NF or fuzzy Boyce-Codd
Normal Form (BCNF). The normalization process should also confirm the

existence of two additional and desirable properties, dependency preservation prop-

erty and lossless join property. The decomposition algorithms having these men-

tioned properties will be given in the following sections.

3.6.2.1. Minimal Cover. In the next two sections, two algorithms are given

both for the dependency preserving and lossless join decompositions. But for the

decompositions to possess the two desired properties, the initial ffd set should be a

minimal cover and it should be free of partial ffds. A minimal cover of a set of

dependencies F is a set of dependencies that is equivalent to F with no redundancies. A set of ffds F is minimal if the following conditions hold: (1) every dependency in F has a single attribute for its right-hand side, (2) we cannot replace any $X \to_{\theta} A$ with $Y \to_{\alpha} A$, where Y is a proper subset of X and $\alpha \geq \theta$, and still have a set of ffds equivalent to F, and (3) we cannot remove any dependency from F and still have a set of ffds equivalent to F.

Partial ffd free means that there is no partial ffd in the set of ffds. The algo-

rithm below finds the minimal cover of a given ffd set and makes it partial ffd free.

Algorithm. Minimal Cover Algorithm: Let F be the set of ffds, and assign F to G, $G := F$.
(1) Replace each ffd $X \to_{\theta_i} \{A_1, A_2, \ldots, A_n\}$ in G by the n ffds $X \to_{\theta_i} A_1$, $X \to_{\theta_i} A_2$, ..., $X \to_{\theta_i} A_n$.
(2) For each ffd $X \to_{\theta_i} A_k$ in G,
For each attribute $B \in X$,
If $(G - \{X \to_{\theta_i} A_k\}) \cup \{(X - \{B\}) \to_{\alpha} A_k\}$, where $\alpha \geq \theta_i$, is equivalent to G,
Then replace $X \to_{\theta_i} A_k$ with $(X - \{B\}) \to_{\alpha} A_k$ in G.
(3) For each remaining ffd $X \to_{\theta_i} A_k$ in G,
If $G - \{X \to_{\theta_i} A_k\}$ is equivalent to G, then remove $X \to_{\theta_i} A_k$ from G.

3.6.2.2. Dependency Preserving Decomposition into Fuzzy Third Normal

Form. In fuzzy databases, it is important to preserve the dependencies while

decomposing the relations like their classical counterparts, because each depen-

dency in the fuzzy database represents a constraint in the database. If one of the

dependencies is not represented in some individual relation R

i

, we have to join

two or more relations in the decomposition and then proceed, and that is ineffi-

cient and impractical. The dependency preservation property ensures that each ffd

is represented in some individual relation resulting after decomposition.

Now, we give the algorithm that creates a dependency-preserving decompo-

sition of a relation R based on a set of ffds, F, such that each relation in the decom-

position is in fuzzy 3NF.

Algorithm. Dependency Preserving Decomposition into Fuzzy 3NF Algorithm.

Find the minimal cover G for F, and make it partial ffd free by using the Min cover

Algorithm above.

Place any attributes that have not been included in any of the ffds of G in a separate

relation schema, and eliminate them from R.

If any of the ffds in G involves all the attributes of R, then the decomposition is R.

Else, for each left-hand side X of ffds in G, create a new relation schema in D with attributes $\{X \cup \{A_1\} \cup \ldots \cup \{A_k\}\}$, where $X \to_{\theta_1} A_1$, $X \to_{\theta_2} A_2$, ..., $X \to_{\theta_k} A_k$ are the ffds in G, and X is the fuzzy key of this new relation with strength $\min_i \theta_i$.

Example 10. Let R(A, B, C, D, E) and the ffds be $CD \to_{0.7} A$, $CD \to_{0.7} B$, $AD \to_{0.5} E$, $CD \to_{0.7} E$, $A \to_{0.8} B$, and $B \to_{0.6} E$. Hence CD is the fuzzy key of the relation with strength 0.7.
First of all, the minimal cover algorithm is applied. G is initialized to the set of ffds, F, that is, $G := F$. All the ffds are of the form $X \to_{\theta_i} A_i$, meaning that every ffd has a single attribute on its right-hand side. In the next step, which removes redundant left-hand-side attributes, for the ffd $AD \to_{0.5} E$ and the attribute $D \in \{A, D\}$, $(G - \{AD \to_{0.5} E\}) \cup \{A \to_{0.6} E\}$ is equivalent to G because $0.6 \geq 0.5$. Here, $A \to_{0.6} E$ is obtained from the two ffds $A \to_{0.8} B$ and $B \to_{0.6} E$ using the transitivity property. So $AD \to_{0.5} E$ is replaced with $A \to_{0.6} E$ in G. In the last step, the ffd $A \to_{0.6} E$, obtained in the previous step, is removed because it can be obtained from the two ffds $A \to_{0.8} B$ and $B \to_{0.6} E$. Then, the minimal cover G is
$CD \to_{0.7} A$, $CD \to_{0.7} B$, $CD \to_{0.7} E$, $A \to_{0.8} B$, and $B \to_{0.6} E$.

In the dependency-preserving decomposition algorithm, a relation schema is then created for each left-hand side of the ffds in G. For the left-hand side CD, which is the fuzzy key of the relation with strength 0.7, a relation schema is created with attributes A, B, C, D, and E, whose ffds are $CD \to_{0.7} A$, $CD \to_{0.7} B$, and $CD \to_{0.7} E$, with CD as the fuzzy key with strength 0.7. Then, for the remaining ffds $A \to_{0.8} B$ and $B \to_{0.6} E$, two separate relation schemas, one with attributes A and B, the other with attributes B and E, are created. At the end, after the dependency-preserving decomposition, three relation schemas are obtained. The first one is R1(A, B, C, D, E) with fuzzy functional dependencies $CD \to_{0.7} A$, $CD \to_{0.7} B$, and $CD \to_{0.7} E$, the second one is R2(A, B) with $A \to_{0.8} B$, and the third one is R3(B, E) with $B \to_{0.6} E$.
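The grouping step of the decomposition, one relation schema per distinct left-hand side, can be sketched as follows. The sketch starts from a minimal cover that is already given and does not implement the earlier steps of the algorithm (attributes outside every ffd, or an ffd covering all attributes). The name `decompose_by_lhs` is hypothetical.

```python
def decompose_by_lhs(min_cover):
    """Group single-attribute ffds (lhs, attr, strength) by left-hand side.

    Returns a list of (attributes, fuzzy_key, key_strength) triples, where the
    key strength is the minimum strength among the grouped ffds.
    """
    groups = {}
    for lhs, attr, strength in min_cover:
        key = frozenset(lhs)
        attrs, strengths = groups.setdefault(key, (set(key), []))
        attrs.add(attr)
        strengths.append(strength)
    return [(sorted(attrs), sorted(key), min(strengths))
            for key, (attrs, strengths) in groups.items()]

# Minimal cover G of Example 10
G = [("CD", "A", 0.7), ("CD", "B", 0.7), ("CD", "E", 0.7),
     ("A", "B", 0.8), ("B", "E", 0.6)]
for attrs, key, strength in decompose_by_lhs(G):
    print(attrs, key, strength)
# ['A', 'B', 'C', 'D', 'E'] ['C', 'D'] 0.7
# ['A', 'B'] ['A'] 0.8
# ['B', 'E'] ['B'] 0.6
```

The three printed schemas correspond to R1, R2, and R3 of Example 10.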

3.6.2.3. Lossless Join Decomposition into Fuzzy Third Normal Form.

Another desired property of a decomposition is the lossless join property. If a

decomposition does not have the lossless join property, then we may get spurious

tuples after joining those relations in that decomposition. These spurious tuples

represent erroneous information. Therefore, this property is critical and must cer-

tainly be achieved. Lossless join property guarantees that spurious tuple genera-

tion problem does not occur with respect to the relation schemas created after

decomposition. The algorithm below provides a lossless join decomposition into

fuzzy 3NF.

Algorithm. Lossless Join Decomposition into Fuzzy 3NF Algorithm.

Find the minimal cover G for F, and make it partial ffd free.

Place any attributes that have not been included in any of the ffds of G in a separate

relation schema, and eliminate them from R.

If any of the ffds in G involves all the attributes of R, then the decomposition is R.

Else, for each left-hand side X of ffds in G, create a new relation schema in D with attributes $\{X \cup \{A_1\} \cup \ldots \cup \{A_k\}\}$, where $X \to_{\theta_1} A_1$, $X \to_{\theta_2} A_2$, ..., $X \to_{\theta_k} A_k$ are the ffds in G, and X is the fuzzy key of this new relation with strength $\min_i \theta_i$.

If none of the relation schemas contains the fuzzy key of R, create one more relation

schema that contains attributes that form the fuzzy key of R.

The testing algorithms for these two properties, dependency preserving and loss-

less join properties, are presented in the following sections after fuzzy BCNF.

Example 11. The lossless join decomposition algorithm brings an additional step

into the dependency-preserving decomposition at the end, by creating a new rela-

tion schema for the fuzzy key of the relation. If we consider the relation in Exam-

ple 10 again, in order to get a lossless join decomposition, we must go through all

the steps of the dependency-preserving decomposition again and at the end we

must create a new relation for the fuzzy key, CD, if it is not contained in any of the

decomposed relations. But, in our case, the fuzzy key CD is already contained in

one of the decomposed relations, so there is no need to create a new relation.


Then, after a lossless join decomposition into fuzzy 3NF, we have three relations,

R1 (A, B, C, D, E), R2 (A, B), and R3 (B, E), as in Example 10.

3.6.2.4. An Example Application: Leasing Risk Assessment. The Leasing

Risk Assessment relation analyzed above can be further analyzed for fuzzy 3NF.

When the conditions for the fuzzy 3NF are considered, R1 and R2 in the decom-

posed relation R do not violate the fuzzy 3NF, because in each relation there is

only one ffd, and their left-hand sides are the fuzzy keys of the corresponding

relations. But in the third relation, the ffds are

$\{\text{Capital}, \text{Revenue}, \text{Workforce}, \text{CompAge}, \text{LegalType}\} \to_{0.9} \text{IlliquidRisk}$
$\text{IlliquidRisk} \to_{0.7} \text{CreditRating}$

According to fuzzy 3NF control algorithm, the first ffd satisfies the second condi-

tion so it does not violate the fuzzy 3NF, but in the second ffd none of the condi-

tions are met, the left-hand side does not contain the right-hand side, and also it

does not contain the key, and lastly CreditRating is not a fuzzy prime attribute.

Consequently, the third relation is not in fuzzy 3NF, and it must be decomposed.

Applying the Dependency-Preserving Decomposition into fuzzy 3NF algorithm,

we get the decomposed relations as follows:

R4 (Capital, Revenue, Workforce, CompAge, LegalType, IlliquidRisk)
where {Capital, Revenue, Workforce, CompAge, LegalType} is the fuzzy key with strength 0.9, and its ffd is
$\{\text{Capital}, \text{Revenue}, \text{Workforce}, \text{CompAge}, \text{LegalType}\} \to_{0.9} \text{IlliquidRisk}$
R5 (IlliquidRisk, CreditRating)
where IlliquidRisk is the fuzzy key with strength 0.7, and its ffd is
$\text{IlliquidRisk} \to_{0.7} \text{CreditRating}$

Lossless Join Decomposition into fuzzy 3NF Algorithm has only one additional

step with respect to the dependency-preserving decomposition algorithm: If none

of the relation schemas contains the fuzzy key of R3, create one more relation

schema that contains attributes that form the fuzzy key of R3. But in our case,

relation R4 has the fuzzy key of R3; hence the decomposition is also a lossless join

decomposition.

3.7. Fuzzy Boyce-Codd Normal Form

Like its classical counterpart, fuzzy Boyce-Codd normal form (fuzzy BCNF)

is a stricter form of fuzzy 3NF. Fuzzy BCNF ensures that there is no redundancy

that can be detected using ffd information alone. It is the most desirable normal

form from the point of view of redundancy. The formal definition of the fuzzy

BCNF can be given as follows.


Definition. Let F be the set of ffds for schema R, and K be the fuzzy key of R with strength $\theta$. R is said to be in fuzzy BCNF if and only if R is in fuzzy 3NF and, for any $X \to_{p} A$ in F, either A is in X or X is a fuzzy superkey of R, that is, $X \supseteq K$.

To check whether a given relation is in fuzzy BCNF, all of the ffds in the

relation should be checked against the specified two conditions. If the left-hand

side of the ffd contains all the attributes of the right-hand side or any of the fuzzy

keys of the relation, that ffd does not violate the fuzzy BCNF. The algorithm for

the decomposition into fuzzy BCNF is given below. The algorithm ensures that

the decomposition is a lossless join decomposition.

Algorithm. Decomposition into Fuzzy BCNF Algorithm: Let the ffd that violates fuzzy BCNF be $X \to_{p} A$, where $A, X \subseteq R$ and A is a single attribute.
Decompose R into two relation schemas, $R - A$ and $XA$.

Recursively apply the previous step for all the ffds that violate the fuzzy BCNF, until

there is no ffd in the relation violating fuzzy BCNF.

The ffds being checked against fuzzy BCNF are already in fuzzy 3NF and their

right-hand sides consist of single attributes, because of the fuzzy 3NF decomposi-

tion algorithm.

Example 12. Consider a relation schema R(A, B, C, D, E, F, G) with ffds $CE \to_{0.7} A$, $BD \to_{0.6} E$, and $C \to_{0.9} B$, where A is the fuzzy key of the relation with strength 0.8, that is, $A \to_{0.8} BCDEFG$. The relation schema is in fuzzy 2NF because there is no partial dependence (the fuzzy key of the relation, A, is a single attribute). Now, we have to check whether the relation is in fuzzy 3NF. To be in fuzzy 3NF, either the left-hand side of each ffd should contain the fuzzy key, A, or the right-hand side should be fuzzy prime, that is, a part of the fuzzy key. In our example, the second ffd violates this constraint, so the relation is not in fuzzy 3NF and consequently not in fuzzy BCNF. According to our algorithm, we decompose the relation into two: one with attributes $\{B, D, E\}$ and ffd $BD \to_{0.6} E$, with BD as the fuzzy key with strength 0.6, and the other with attributes $\{A, B, C, D, F, G\}$ and ffd $C \to_{0.9} B$, still with A as the fuzzy key with strength 0.8. In this decomposition, the second relation schema is still not in fuzzy BCNF because of the ffd $C \to_{0.9} B$. So we decompose it again into two new relations, the first one with attributes $\{A, C, D, F, G\}$ and A as the fuzzy key with strength 0.8, and the second one with attributes $\{B, C\}$ and ffd $C \to_{0.9} B$, with C as the fuzzy key with strength 0.9. Thus each of the schemas BDE, BC, and ACDFG is in fuzzy BCNF.
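The recursive split into $R - A$ and $XA$ can be sketched in Python. The sketch makes two simplifying assumptions that are not taken from the text: the superkey test uses a plain (strength-free) attribute closure, and only ffds whose attributes all lie inside a sub-relation are checked against it, rather than a full projection of the dependency set. The names `attr_closure` and `bcnf_decompose` are hypothetical.

```python
def attr_closure(attrs, ffds):
    """Plain attribute closure; strengths are ignored in this sketch."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, _ in ffds:
            if set(lhs) <= closure and rhs not in closure:
                closure.add(rhs)
                changed = True
    return closure

def bcnf_decompose(relation, ffds):
    """Split on the first violating ffd X -> A into (R - A) and XA, recursively."""
    relation = frozenset(relation)
    for lhs, rhs, _ in ffds:
        x = frozenset(lhs)
        # Skip trivial ffds and ffds that do not live entirely inside this relation.
        if rhs in x or not (x | {rhs}) <= relation:
            continue
        if not relation <= attr_closure(x, ffds):  # X is not a superkey here
            return (bcnf_decompose(relation - {rhs}, ffds)
                    + bcnf_decompose(x | {rhs}, ffds))
    return [relation]

# Example 12, with A ->0.8 BCDEFG split into single-attribute ffds
example12 = [("CE", "A", 0.7), ("BD", "E", 0.6), ("C", "B", 0.9)]
example12 += [("A", b, 0.8) for b in "BCDEFG"]
for schema in bcnf_decompose("ABCDEFG", example12):
    print("".join(sorted(schema)))
```

On this input the sketch yields the schemas ACDFG, BC, and BDE, the same three schemas as in Example 12.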

In the Leasing Risk Assessment example, all the decomposed relations are

in fuzzy BCNF.

3.8. Dependency Preservation Property Testing in Decompositions

While discussing the fuzzy 3NF, two algorithms were given for the decomposition into fuzzy 3NF, one achieving the dependency preservation property, and the other also having the lossless join property. The algorithm for normalization into fuzzy BCNF also ensures the lossless join property. The dependency preservation property of decompositions in the fuzzy relational data model is studied extensively in Ref. 23. In this section, an algorithm is presented to test the dependency preservation property of decompositions.

Algorithm. Dependency Preservation Testing Algorithm: For every ffd $X \to_{\alpha} Y$, where $X = X_1 X_2 \ldots X_m$,
(1) Construct a transitive closure list, ZList, initially for all the attributes of the left-hand side of the ffd, X, with maximum strengths:
ZList $= \{(X_1, 1), (X_2, 1), \ldots, (X_m, 1)\}$
(2) While (true)
i. ZList2 $\leftarrow$ ZList.
ii. Reset domain.
iii. For each decomposed relation $R_i$ ($i = 1$ to k),
Reset domain.
For each element of ZList2, if the attribute of this element is in $U_i$, where $U_i$ is the attribute set of $R_i$, add the attribute to the domain.
Find the transitive closure of the domain, $\text{ZList}_i$.
For each element of $\text{ZList}_i$, if the attribute of this element is in $U_i$, add the element to $\text{TList}_i$.
Combine $\text{TList}_i$ into ZList2 using the fuzzy union operation.
iv. If ZList = ZList2, break.
v. Else ZList $\leftarrow$ ZList2.
(3) If all the attributes in Y occur in ZList with the strength $\alpha$ or greater, then the dependency preservation property is not violated; continue with the next ffd.
(4) Else the decomposition is not dependency preserving; break.

Example 13. Let the attribute set for a relation R be $\{A, B, C\}$, the decomposed relations be R1(A, B) and R2(B, C), and the ffds be $A \to_{0.9} B$ and $B \to_{0.7} C$.
For the first ffd, $A \to_{0.9} B$, the transitive closure of A is initially ZList $= \{(A, 1)\}$.
ZList2 $= \{(A, 1)\}$
R1 $\Rightarrow$ domain $= \{A\}$
$\text{ZList}_1 = \{(A, 1), (B, 0.9), (C, 0.7)\}$
$\text{TList}_1 = \{(A, 1), (B, 0.9)\}$
ZList2 $= \{(A, 1), (B, 0.9)\}$
R2 $\Rightarrow$ domain $= \{B\}$
$\text{ZList}_2 = \{(B, 1), (C, 0.7)\}$
$\text{TList}_2 = \{(B, 1), (C, 0.7)\}$
ZList2 $= \{(A, 1), (B, 0.9), (C, 0.7)\}$
Because ZList $\neq$ ZList2,
ZList $\leftarrow \{(A, 1), (B, 0.9), (C, 0.7)\}$
In the second pass, ZList = ZList2 and the loop exits. We see that the right-hand-side attribute B occurs in ZList with strength 0.9, so the dependency preservation property is not violated and we continue with the second ffd, $B \to_{0.7} C$. Similarly, at the end we find ZList $= \{(B, 1), (C, 0.7)\}$, and because attribute C occurs in ZList with strength 0.7, the dependency preservation property is not violated. Hence the decomposition is dependency preserving.
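A possible Python rendering of this test is sketched below. It is an interpretation rather than the authors' code: the closure inside each decomposed relation is started from the strengths already accumulated in ZList2 (so that chained strengths are combined with min), and the fuzzy union is taken as the per-attribute maximum. Intermediate lists can therefore differ from those shown in Example 13, while the final ZList is the same. All function names are hypothetical.

```python
def fuzzy_closure_from(start, ffds):
    """Closure starting from a dict {attribute: strength} (assumed min/max semantics)."""
    closure = dict(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, strength in ffds:
            if all(a in closure for a in lhs):
                derived = min([strength] + [closure[a] for a in lhs])
                for b in rhs:
                    if derived > closure.get(b, 0.0):
                        closure[b] = derived
                        changed = True
    return closure

def ffd_preserved(lhs, rhs, alpha, ffds, decomposition):
    """Test whether lhs ->alpha rhs can be recovered from the decomposed relations."""
    zlist = {a: 1.0 for a in lhs}
    while True:
        zlist2 = dict(zlist)
        for attrs in decomposition:
            u = set(attrs)
            domain = {a: s for a, s in zlist2.items() if a in u}
            local = fuzzy_closure_from(domain, ffds)
            for a, s in local.items():
                # Keep only attributes of this relation; merge with max.
                if a in u and s > zlist2.get(a, 0.0):
                    zlist2[a] = s
        if zlist2 == zlist:
            break
        zlist = zlist2
    return all(zlist.get(b, 0.0) >= alpha for b in rhs)

def is_dependency_preserving(ffds, decomposition):
    return all(ffd_preserved(l, r, s, ffds, decomposition) for l, r, s in ffds)

# Example 13: R1(A, B), R2(B, C) with A ->0.9 B and B ->0.7 C
example13 = [("A", "B", 0.9), ("B", "C", 0.7)]
print(is_dependency_preserving(example13, ["AB", "BC"]))  # True
print(is_dependency_preserving(example13, ["AB", "AC"]))  # False: B -> C is lost
```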

3.9. Lossless Join Property Testing in Decompositions

Chen, Kerre, and Vandenbulcke impose a restriction on the extended algebraic operations in their study (Ref. 22). In accordance with the design issues and to achieve a complete information reconstruction, they restricted the eight algebraic operations, namely product, union, intersection, natural join, projection, selection, minus, and division, so that they are performed for base relations only on identical elements or tuples, not on close ones. That means that whenever tuple merging is of concern, it refers to identical elements. Raju and Majumdar (Ref. 6) restrict the fuzzy resemblance relation and name the class of ffds for which the fuzzy resemblance relation is restricted the restricted ffds. With these restrictions, both Chen et al. (Ref. 22) and Raju and Majumdar (Ref. 6) use a classical algorithm to test the lossless join decomposition of fuzzy relations with ffds.

In this article, we also utilize the classic algorithm to test whether a de-

composition has lossless join property. The logic in using the classical testing

algorithm in the similarity-based fuzzy relational database model is as follows.

The table created during the application of the algorithm is used only to deter-

mine whether there is a joining attribute between the decomposed relations.

Fuzziness is taken into the consideration after this point. If there is a joining

attribute, the decision whether they can be joined or not is given according to

their similarity levels and a predefined threshold. The tuples should also satisfy all the ffds of the relation; that is, for every pair of tuples $t_1$, $t_2$ and for each ffd $X \to_{\theta} Y$, $C(Y[t_1, t_2]) \geq \min(\theta, C(X[t_1, t_2]))$.

Algorithm. Lossless Join Testing Algorithm: Let the relation schema be R with the attributes $A_1, A_2, \ldots, A_n$, F be the set of ffds, and $\rho = \{R_1, R_2, \ldots, R_k\}$ be the decomposition.
(1) Create an initial table T with one row i for each relation $R_i$ in the decomposition and one column j for each attribute $A_j$ in the relation being decomposed, R.
(2) Put $b_{ij}$ in every cell of the table.
(3) For each row i and column j,
If $A_j$ is in the attribute set, $U_i$, of $R_i$, then set $T_{ij} = a_j$.
(4) Repeat until there are no changes in T:
For each ffd $X \to_{\alpha} Y$ in F,
For all rows in T, look for those rows that have the same symbols in all columns corresponding to attributes in X.
For any two such rows, make the symbols in all columns for the attributes in Y the same, as follows: if either of the symbols is an a symbol, set the other to that same a symbol.
(5) At the end, if a row consists entirely of a symbols, then the decomposition has the lossless join property. Otherwise, it is not a lossless join decomposition.
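The table-based test can be sketched in Python as follows. As in the text, the strengths of the ffds play no role in building or updating the table, so they are ignored here. The encoding of the symbols (the string `"a"` for $a_j$, a pair `("b", i)` for $b_{ij}$) and the function name `has_lossless_join` are choices made for this illustration only.

```python
def has_lossless_join(attributes, decomposition, ffds):
    """Classical tableau test; ffds are (lhs, rhs, strength), strength unused."""
    # Steps (1)-(3): one row per relation, 'a' where the relation has the attribute.
    table = [{a: ("a" if a in rel else ("b", i)) for a in attributes}
             for i, rel in enumerate(decomposition)]
    changed = True
    while changed:  # step (4): repeat until the table is stable
        changed = False
        for lhs, rhs, _ in ffds:
            for r1 in range(len(table)):
                for r2 in range(r1 + 1, len(table)):
                    if any(table[r1][x] != table[r2][x] for x in lhs):
                        continue
                    for y in rhs:
                        s1, s2 = table[r1][y], table[r2][y]
                        if s1 == s2:
                            continue
                        # Prefer the 'a' symbol; otherwise unify on one 'b' symbol.
                        target = "a" if "a" in (s1, s2) else min(s1, s2)
                        for row in table:
                            if row[y] in (s1, s2):
                                row[y] = target
                        changed = True
    # Step (5): lossless if some row consists entirely of 'a' symbols.
    return any(all(v == "a" for v in row.values()) for row in table)

# The decomposition of Example 14
ffds14 = [("A", "B", 0.6), ("C", "DE", 0.5), ("AC", "F", 0.8)]
print(has_lossless_join("ABCDEF", ["BE", "ACDEF"], ffds14))  # False

# The decomposition of Example 15
ffds15 = [("ABC", "D", 0.7), ("ABC", "E", 0.8), ("DE", "F", 0.7), ("F", "G", 0.6)]
print(has_lossless_join("ABCDEFG", ["ABCDE", "DEF", "FG"], ffds15))  # True
```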

Example 14. Let the relation schema be R(A, B, C, D, E, F) and the ffds be $A \to_{0.6} B$, $C \to_{0.5} DE$, and $AC \to_{0.8} F$. Here AC is the fuzzy key of the relation with strength 0.5. Suppose we decompose R into two relations R1(B, E) and R2(A, C, D, E, F), and then test for the lossless join. The initial table T has i = 2 rows for relations R1 and R2, and j = 6 columns for the attributes A, B, C, D, E, and F. For the second step, we initialize each cell with $b_{ij}$. The initial table can be seen in Table IV.
For the first row, $T_{12}$ and $T_{15}$ are set to $a_2$ and $a_5$, respectively, because relation R1 contains the attributes $A_2 = B$ and $A_5 = E$. Similarly for the second row, entries $T_{21}$, $T_{23}$, $T_{24}$, $T_{25}$, and $T_{26}$ are set to $a_1$, $a_3$, $a_4$, $a_5$, and $a_6$, respectively, because R2 contains the attributes A, C, D, E, and F, as in Table V.
For the first ffd, $A \to_{0.6} B$, R1 and R2 do not have the same symbols in the first column, the column for the attribute A, so there is no change in the B column. Considering the second ffd, $C \to_{0.5} DE$, again R1 and R2 do not have the same symbols in the column for C, and there is no change in the table. The situation is the same for the last ffd, $AC \to_{0.8} F$. At the end, because there is no row consisting entirely of a symbols, the decomposition is not a lossless join decomposition.

Table IV. Initial table for relation R(A, B, C, D, E, F).
T    A          B          C          D          E          F
R1   $b_{11}$   $b_{12}$   $b_{13}$   $b_{14}$   $b_{15}$   $b_{16}$
R2   $b_{21}$   $b_{22}$   $b_{23}$   $b_{24}$   $b_{25}$   $b_{26}$

Table V. Table after applying the third step of the lossless join testing algorithm to R.
T    A          B          C          D          E          F
R1   $b_{11}$   $a_2$      $b_{13}$   $b_{14}$   $a_5$      $b_{16}$
R2   $a_1$      $b_{22}$   $a_3$      $a_4$      $a_5$      $a_6$

Example 15. Now we give a lossless join decomposition example. Let the relation schema be R(A, B, C, D, E, F, G) and the ffds be $ABC \to_{0.7} D$, $ABC \to_{0.8} E$, $DE \to_{0.7} F$, and $F \to_{0.6} G$. Here ABC is the fuzzy key of the relation with strength 0.7. Suppose we decompose R into three relations R1(A, B, C, D, E), R2(D, E, F), and R3(F, G), and then test for the lossless join. The initial table T has i = 3 rows for relations R1, R2, and R3, and j = 7 columns for the attributes A, B, C, D, E, F, and G. For the second step, we initialize each entry with $b_{ij}$ (Table VI).

Table VI. Initial table for relation R(A, B, C, D, E, F, G).

T    A       B       C       D       E       F       G
R1   b_{11}  b_{12}  b_{13}  b_{14}  b_{15}  b_{16}  b_{17}
R2   b_{21}  b_{22}  b_{23}  b_{24}  b_{25}  b_{26}  b_{27}
R3   b_{31}  b_{32}  b_{33}  b_{34}  b_{35}  b_{36}  b_{37}

For the first row, $T_{11}$, $T_{12}$, $T_{13}$, $T_{14}$, and $T_{15}$ are set to $a_1$, $a_2$, $a_3$, $a_4$, and $a_5$, respectively, because relation R1 contains the attributes $A_1 = A$, $A_2 = B$, $A_3 = C$, $A_4 = D$, and $A_5 = E$. Similarly, for the second row, entries $T_{24}$, $T_{25}$, and $T_{26}$ are set to $a_4$, $a_5$, and $a_6$, respectively, because R2 contains the attributes D, E, and F. Finally, entries $T_{36}$ and $T_{37}$ are set to $a_6$ and $a_7$, respectively, because R3 contains the attributes F and G, and the table becomes Table VII.

Table VII. Result of the third step of the lossless join testing algorithm applied to R.

T    A       B       C       D       E       F       G
R1   a_1     a_2     a_3     a_4     a_5     b_{16}  b_{17}
R2   b_{21}  b_{22}  b_{23}  a_4     a_5     a_6     b_{27}
R3   b_{31}  b_{32}  b_{33}  b_{34}  b_{35}  a_6     a_7

For the first and second ffds, $ABC \rightarrow_{0.7} D$ and $ABC \rightarrow_{0.8} E$, no two of the rows for R1, R2, and R3 have the same symbols in the columns for the attributes A, B, and C, so there is no change in the D and E columns. Considering the third ffd, $DE \rightarrow_{0.7} F$, the rows for R1 and R2 have the same symbols in the columns for D and E, so the entry in the F column for relation R1, $b_{16}$, is changed into $a_6$. Then we get Table VIII.

Table VIII. Result of the fourth step of the lossless join testing algorithm for the first three FFDs of R.

T    A       B       C       D       E       F       G
R1   a_1     a_2     a_3     a_4     a_5     a_6     b_{17}
R2   b_{21}  b_{22}  b_{23}  a_4     a_5     a_6     b_{27}
R3   b_{31}  b_{32}  b_{33}  b_{34}  b_{35}  a_6     a_7

Finally, for the last ffd, $F \rightarrow_{0.6} G$, the rows for R1, R2, and R3 have the same symbol in the column for F, so the entries in the G column for relations R1 and R2 are changed into $a_7$, and the table becomes the one in Table IX.

Table IX. Table for R(A, B, C, D, E, F, G) at the end of the lossless join testing algorithm.

T    A       B       C       D       E       F       G
R1   a_1     a_2     a_3     a_4     a_5     a_6     a_7
R2   b_{21}  b_{22}  b_{23}  a_4     a_5     a_6     a_7
R3   b_{31}  b_{32}  b_{33}  b_{34}  b_{35}  a_6     a_7

At the end, because there is a row consisting entirely of a symbols, namely the first row, the decomposition is a lossless join decomposition.

3.10. An Example Application: Fraud Detection

An increasing number of transactions are carried out remotely and electronically in today's financial world. With the growing complexity of the system, the opportunities for criminals to conduct fraudulent transactions rise. Credit cards are one of the areas where fraudulent behavior is an extremely important concern for financial institutions. Fraudulent behavior can arise in different ways. In one of these, the criminals are individuals; they steal credit cards and then use them for purchases. In another, criminal groups steal new credit cards and duplicate them. There is also customer-induced fraud, in which customers claim that their credit card was stolen after making some expensive purchases. Most credit card companies use sophisticated systems to detect fraudulent behavior, because various opportunities for fraud still exist even though most credit card purchases are electronically verified before the actual transaction. These systems have to work with very little significant data; they know only the past customer history and the current transaction information. On the other hand, they should not decline nonfraudulent transactions too easily, so as not to dissatisfy the customers.

At this point, the companies are unwilling to disclose system details or even the fact that they use fuzzy logic fraud detection systems. Our case concerns a financial service provider. The company offers its customers both banking and insurance services, and the system is used for the detection of insurance fraud. Each insurance claim in the field of home insurance is evaluated to assess the likelihood of fraudulent behavior. The company wanted to implement a fraud detection system that looks at multiple factors in every insurance claim and selects only those claims that have a certain degree of likelihood of fraud.

All information about the customers is held in a database. By using the system, the insurance claim is evaluated, and if the assessed likelihood of fraud is lower than a certain predefined threshold, the claim is immediately paid out to the customer. If the result is higher than the threshold, then the claim is passed on to a claims auditor together with the reason. After the auditor's manual review, final decisions on further steps are made.

We have the following attributes for the system: number of claims in the last 12 months, amount of the current claim, time with insurance, average balance on all banking accounts over the last 12 months, number of overdrafts over the last 12 months, annual income of the customer, recent changes in status, insurance history evaluation, banking history evaluation, personal evaluation, fraud likelihood, and fraud reason explanation. The first three attributes give information about the insurance contract and the claim itself, the next two attributes describe the banking background of the customer, and the sixth and seventh attributes provide the personal background. Our relation schema and the fuzzy functional dependencies are then as follows:

R: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason)

FFD1: Number of claims in the last 12 months, amount of the current claim, and time with insurance mostly determine the insurance history evaluation.

$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$

FFD2: Average balance on all banking accounts over the last 12 months and number of overdrafts over the last 12 months generally determine the banking history evaluation.

$\{AvgAmnt, NumOvr\} \rightarrow_{0.7} HistBank$

FFD3: Annual income of the customer and recent changes in status more or less determine the personal evaluation.

$\{Income, StatChng\} \rightarrow_{0.6} Personal$

FFD4: Insurance history evaluation, banking history evaluation, and personal evaluation mostly determine the fraud reason explanation.

$\{HistIns, HistBank, Personal\} \rightarrow_{0.8} Reason$

FFD5: Fraud reason explanation more or less determines fraud likelihood.

$Reason \rightarrow_{0.6} Fraud$
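As a rough check of what the seven input attributes determine under FFD1 to FFD5, a closure computation can be sketched as below. The function name `fuzzy_closure` is hypothetical, and the rule for combining strengths is an assumption of this sketch: a derived dependency is taken to be as strong as the weakest link used to derive it (the minimum), and an attribute in the starting set is taken to determine itself with strength 1.0.

```python
def fuzzy_closure(start, ffds):
    """Map each attribute derivable from `start` to the best strength found.

    Assumption: the strength of a derived dependency is the minimum of the
    ffd's own strength and the strengths of its left-hand side attributes.
    """
    strength = {attr: 1.0 for attr in start}
    changed = True
    while changed:
        changed = False
        for lhs, rhs, theta in ffds:
            if not all(attr in strength for attr in lhs):
                continue  # left-hand side not yet fully derived
            derived = min([theta] + [strength[attr] for attr in lhs])
            for attr in rhs:
                if derived > strength.get(attr, 0.0):
                    strength[attr] = derived
                    changed = True
    return strength


FFDS = [
    ({"NumClaim", "Amount", "CustSince"}, {"HistIns"}, 0.8),   # FFD1
    ({"AvgAmnt", "NumOvr"}, {"HistBank"}, 0.7),                # FFD2
    ({"Income", "StatChng"}, {"Personal"}, 0.6),               # FFD3
    ({"HistIns", "HistBank", "Personal"}, {"Reason"}, 0.8),    # FFD4
    ({"Reason"}, {"Fraud"}, 0.6),                              # FFD5
]
BASE = {"NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr",
        "Income", "StatChng"}

closure = fuzzy_closure(BASE, FFDS)
print(len(closure), min(closure.values()))  # 12 0.6
```

Under these assumptions the seven attributes reach all 12 attributes of R, and the weakest derived strength is 0.6.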

The attributes can be briefly explained as follows. NumClaim gives an indication of how often the customer has used the insurance in the past year. Amount expresses how significant the current claim is. CustSince takes into account how long the insurance contract has been in existence. HistIns indicates how much the customer has exercised the insurance contract in the past and present. AvgAmnt is the average total balance on all banking accounts of the customer. NumOvr is the number of overdrafts on checking accounts. HistBank evaluates the banking history of the customer and its relevance to the insurance claim. Personal assesses the customer's basic situation and detects possible motives within the customer's lifestyle that could prompt fraudulent behavior. StatChng indicates whether a fundamental change in the customer's life has occurred over the past four months.

The normalization process begins with the fuzzy 1NF, but because there are no tuples at the beginning, we continue with the fuzzy 2NF. Analyzing the ffds, the fuzzy key is {NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng} with a degree of 0.6, because the transitive closure of this attribute set contains all the attributes of the relation. In this case, HistIns, HistBank, Personal, Fraud, and Reason are fuzzy nonprime attributes. For the relation to be in fuzzy 2NF, none of these fuzzy nonprime attributes may be partially fuzzy functionally dependent on the fuzzy key. In our case this restriction is violated (for example, HistIns depends on {NumClaim, Amount, CustSince} alone), so the relation is not in fuzzy 2NF, and it should be normalized into a number of smaller relations that are in fuzzy 2NF. Using the decomposition algorithm 3.4.2.1, R is decomposed into four new relations R1 through R4.

R1: (NumClaim, Amount, CustSince, HistIns) with the fuzzy functional dependency

$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$

R2: (AvgAmnt, NumOvr, HistBank) with the fuzzy functional dependency

$\{AvgAmnt, NumOvr\} \rightarrow_{0.7} HistBank$

R3: (Income, StatChng, Personal) with the fuzzy functional dependency

$\{Income, StatChng\} \rightarrow_{0.6} Personal$

and a relation with the remaining attributes, after removing the fuzzy nonprime attributes that are partially fuzzy functionally dependent on the fuzzy key of the original relation,

R4: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, Fraud, Reason) with the fuzzy functional dependencies

$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$

$Reason \rightarrow_{0.6} Fraud$
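The attribute split behind R1 through R4 can be illustrated with a short Python sketch. It is not the decomposition algorithm 3.4.2.1 itself: the helper `split_partial` is hypothetical, it inspects only the ffds as given (no closure under inference rules), it assumes the single fuzzy key stated in the text so that the prime attributes are exactly the key attributes, and it does not derive the dependency from the key to Reason that R4 carries.

```python
ATTRS = ["NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr", "Income",
         "StatChng", "HistIns", "HistBank", "Personal", "Fraud", "Reason"]
KEY = frozenset(ATTRS[:7])  # the fuzzy key stated in the text
FFDS = [
    ({"NumClaim", "Amount", "CustSince"}, {"HistIns"}, 0.8),   # FFD1
    ({"AvgAmnt", "NumOvr"}, {"HistBank"}, 0.7),                # FFD2
    ({"Income", "StatChng"}, {"Personal"}, 0.6),               # FFD3
    ({"HistIns", "HistBank", "Personal"}, {"Reason"}, 0.8),    # FFD4
    ({"Reason"}, {"Fraud"}, 0.6),                              # FFD5
]


def split_partial(attrs, key, ffds):
    """Split off one relation per ffd that shows a partial dependency."""
    # An ffd signals a partial dependency here when its left-hand side is a
    # proper subset of the key and its right-hand side has a nonprime attribute.
    violating = [(lhs, rhs) for lhs, rhs, _theta in ffds
                 if lhs < key and rhs - key]
    parts = [sorted(lhs | rhs, key=attrs.index) for lhs, rhs in violating]
    # The remaining relation keeps everything except the split-off attributes.
    removed = set().union(*(rhs - key for _lhs, rhs in violating))
    rest = [a for a in attrs if a not in removed]
    return parts, rest


parts, rest = split_partial(ATTRS, KEY, FFDS)
for i, part in enumerate(parts, start=1):
    print(f"R{i}:", part)
print(f"R{len(parts) + 1}:", rest)
```

With the ffds above, the three split-off attribute sets and the remainder coincide with the attribute lists of R1 through R4.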

After achieving the fuzzy 2NF, the conditions for the fuzzy 3NF should be tested. For a relation to be in fuzzy 3NF, it should already be in fuzzy 2NF, and additionally, for each ffd in the relation, either the left-hand side contains the fuzzy key of the relation or the right-hand side consists of fuzzy prime attributes. For the first three relations, the left-hand sides of the ffds contain the respective fuzzy keys of the relations. But in the fourth relation, for the second ffd, neither does the left-hand side contain the fuzzy key, that is, {NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng}, nor is the right-hand side attribute Fraud fuzzy prime. So the last relation should be decomposed into fuzzy 3NF. To be able to make a lossless join decomposition into fuzzy 3NF, the minimal cover of the ffds of R4 should be found first. After applying the minimal cover algorithm, we find the minimal cover for R4 as shown below:

$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$,

$Reason \rightarrow_{0.6} Fraud$

Then, by using the algorithm for lossless join decomposition into fuzzy 3NF, R4 is decomposed into two new relations, R5 and R6.

R5: (NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, Reason) with the fuzzy functional dependency

$\{NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng\} \rightarrow_{0.6} Reason$

R6: (Reason, Fraud) with the fuzzy functional dependency

$Reason \rightarrow_{0.6} Fraud$
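The per-ffd condition for fuzzy 3NF quoted above can be checked mechanically. The following Python sketch is illustrative only: the function name `fuzzy_3nf_violations` is hypothetical, each relation is assumed to have the single fuzzy key named in the text (so the fuzzy prime attributes are exactly the key attributes), and only the listed ffds are examined.

```python
def fuzzy_3nf_violations(key, ffds):
    """Return the ffds for which neither condition holds: the left-hand side
    contains the fuzzy key, or the right-hand side consists of fuzzy prime
    attributes (taken here to be the attributes of the key)."""
    key = frozenset(key)
    return [(sorted(lhs), sorted(rhs)) for lhs, rhs, _theta in ffds
            if not (key <= set(lhs) or set(rhs) <= key)]


K = {"NumClaim", "Amount", "CustSince", "AvgAmnt", "NumOvr",
     "Income", "StatChng"}

R4_FFDS = [(K, {"Reason"}, 0.6), ({"Reason"}, {"Fraud"}, 0.6)]
R5_FFDS = [(K, {"Reason"}, 0.6)]
R6_FFDS = [({"Reason"}, {"Fraud"}, 0.6)]

print(fuzzy_3nf_violations(K, R4_FFDS))           # [(['Reason'], ['Fraud'])]
print(fuzzy_3nf_violations(K, R5_FFDS))           # []
print(fuzzy_3nf_violations({"Reason"}, R6_FFDS))  # []
```

Only the second ffd of R4 is reported, which is the dependency that motivates splitting R4 into R5 and R6.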

At this point, all the relations are also in fuzzy BCNF. Applying the Dependency Preservation Testing Algorithm, we see that the decomposition has the property of dependency preservation. We can also check whether the decomposition of R into R1, R2, R3, R5, and R6 has the lossless join property by using the lossless join property testing algorithm. Table X has five rows, one for each decomposed relation, and 12 columns, one for each attribute. The table after setting the entries with respect to the decomposed relations is shown in Table X. Then the table should be processed for each fuzzy functional dependency. The ffds to be processed are

$\{NumClaim, Amount, CustSince\} \rightarrow_{0.8} HistIns$,

$\{AvgAmnt, NumOvr\} \rightarrow_{0.7} HistBank$,

$\{Income, StatChng\} \rightarrow_{0.6} Personal$,

$\{HistIns, HistBank, Personal\} \rightarrow_{0.8} Reason$,

$Reason \rightarrow_{0.6} Fraud$

The result of this step is shown in Table XI. Because there is a row, namely the fourth row, made up entirely of a symbols, the decomposition satisfies the lossless join property.

Table X. Initial table for relation R(NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason) after setting the entries with respect to the decomposed relations.

R    NumClaim  Amount    CustSince  AvgAmnt   NumOvr    Income    StatChng  HistIns   HistBank  Personal  Fraud     Reason
R1   a_1       a_2       a_3        b_{14}    b_{15}    b_{16}    b_{17}    a_8       b_{19}    b_{1,10}  b_{1,11}  b_{1,12}
R2   b_{21}    b_{22}    b_{23}     a_4       a_5       b_{26}    b_{27}    b_{28}    a_9       b_{2,10}  b_{2,11}  b_{2,12}
R3   b_{31}    b_{32}    b_{33}     b_{34}    b_{35}    a_6       a_7       b_{38}    b_{39}    a_{10}    b_{3,11}  b_{3,12}
R5   a_1       a_2       a_3        a_4       a_5       a_6       a_7       b_{48}    b_{49}    b_{4,10}  b_{4,11}  a_{12}
R6   b_{51}    b_{52}    b_{53}     b_{54}    b_{55}    b_{56}    b_{57}    b_{58}    b_{59}    b_{5,10}  a_{11}    a_{12}

Table XI. Table for relation R(NumClaim, Amount, CustSince, AvgAmnt, NumOvr, Income, StatChng, HistIns, HistBank, Personal, Fraud, Reason) at the end of the lossless join testing algorithm.

R    NumClaim  Amount    CustSince  AvgAmnt   NumOvr    Income    StatChng  HistIns   HistBank  Personal  Fraud     Reason
R1   a_1       a_2       a_3        b_{14}    b_{15}    b_{16}    b_{17}    a_8       b_{19}    b_{1,10}  b_{1,11}  b_{1,12}
R2   b_{21}    b_{22}    b_{23}     a_4       a_5       b_{26}    b_{27}    b_{28}    a_9       b_{2,10}  b_{2,11}  b_{2,12}
R3   b_{31}    b_{32}    b_{33}     b_{34}    b_{35}    a_6       a_7       b_{38}    b_{39}    a_{10}    b_{3,11}  b_{3,12}
R5   a_1       a_2       a_3        a_4       a_5       a_6       a_7       a_8       a_9       a_{10}    a_{11}    a_{12}
R6   b_{51}    b_{52}    b_{53}     b_{54}    b_{55}    b_{56}    b_{57}    b_{58}    b_{59}    b_{5,10}  a_{11}    a_{12}

4. CONCLUSION

Like classical databases, fuzzy databases that are not properly designed suffer from the problems of data redundancy and update anomalies. To provide a good fuzzy relational database design, the concept of ffd is used to define the fuzzy normal forms and the dependency-preserving and lossless join properties.

In this article, we begin with the first step of the normalization process and define the Fuzzy 1NF. Then the concept of fuzzy key is introduced. It constitutes a base for the remaining fuzzy normal forms, Fuzzy 2NF, Fuzzy 3NF, and Fuzzy BCNF. To state the conditions for the fuzzy normal forms, the definitions of fuzzy prime and fuzzy nonprime attributes are introduced. We also discuss the two desirable properties of decompositions, namely the dependency preservation property and the lossless join property, which are both used by the design algorithms to achieve desirable decompositions. Normal forms are insufficient on their own as criteria for a good database design. The relations must collectively satisfy these two additional properties to qualify as a good design. The situation is the same when we deal with fuzzy data and fuzzy normal forms. We illustrate by examples how these fuzzy normal forms can be used to decompose an unnormalized relation into a set of normalized relations.

We have developed and implemented a system (using Borland C++ 4.0) that works within this framework. The implementation consists of two main parts. The first part defines the attributes and their properties and provides an interface to accept tuples and check their conformance. The second part consists of normalization procedures, which check the level of the normal forms and decompose the relation into various normal forms with the dependency preservation and lossless join properties.

Further study involving fuzzy multivalued dependencies, fuzzy join dependencies, fuzzy inclusion dependencies, and related normal forms is ongoing.

References

1. Codd E. A relational model for large shared data banks. Commun ACM 1970;13:377-387.
2. Chen G, Kerre EE, Vandenbulcke J. Normalization based on ffd in a fuzzy relational data model. Inform Syst 1996;21:299-310.
3. Imielinski T, Lipski W. Incomplete information in relational databases. J ACM 1984;31:701-791.
4. Medina J, Pons O, Vila M. GEFRED: A generalized model to implement fuzzy relational databases. Inform Sci 1994;47:234-254.
5. Petry FE. Fuzzy databases: Principles and applications. Boston: Kluwer Academic Publishers; 1996.
6. Raju KVSVN, Majumdar AK. Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems. ACM Trans Database Syst 1988;13:129-166.
7. Umano M, Freedom O. A fuzzy database system. In: E. Sanchez, M. M. Gupta, editors. Fuzzy Information and Decision Processes. Amsterdam: North Holland; 1982. pp 339-347.
8. Yazıcı A, George R. Fuzzy database modeling. Heidelberg: Physica-Verlag; 1999.
9. Zadeh L. Similarity relations and fuzzy orderings. Inform Sci 1971;3:177-206.
10. Buckles PB, Petry FE. A fuzzy representation of data for relational databases. Fuzzy Set Syst 1982;7:213-226.
11. Prade H, Testemale C. Representation of soft constraints and fuzzy attribute values by means of possibility distributions in databases. In: James Bezdek, editor. Analysis of Fuzzy Information: Vol. II, Artificial Intelligence and Decision Systems. Boca Raton, FL: CRC Press; 1987. pp 213-229.
12. Rundensteiner E, Hawkes L, Bandler W. On nearness measures in fuzzy relational data models. Int J Approx Reason 1989;3:267-298.
13. Codd E. Further normalization of the database relational model. In: Rustin, editor. Data base systems. New York: Prentice-Hall; 1972. pp 33-64.
14. Elmasri R, Navathe SB. Fundamentals of database systems. New York: Benjamin Cummings Publishing Co.; 2000.
15. Shenoi S, Melton A, Fan LT. Functional dependencies and normal forms in fuzzy relational database model. Inform Sci 1992;60:1-28.
16. Liu W-Y. Fuzzy data dependencies and implication of fuzzy data dependencies. Fuzzy Set Syst 1997;92:341-348.
17. Yazıcı A, Sözat MI. A complete axiomatization for fuzzy functional and multivalued dependencies in fuzzy database relations. Fuzzy Set Syst 2001;117:161-181.
18. Chen G, Kerre EE, Vandenbulcke J. A computational algorithm for the FFD transitive closure and a complete axiomatization of fuzzy functional dependence (FFD). Int J Intell Syst 1994;9:421-439.
19. Nakata M, Murai T. Updating under integrity constraints in fuzzy databases. In: Proc Sixth IEEE Conf on Fuzzy Systems (FUZZ-IEEE'97). Barcelona: IEEE; 1997. pp 713-719.
20. Yazıcı A, Sözat MI. The integrity constraints for similarity-based fuzzy relational databases. Int J Intell Syst 1998;13:641-660.
21. Saxena PC, Tyagi BK. Fuzzy functional dependencies and independencies in extended fuzzy relational database models. Fuzzy Set Syst 1995;69:65-89.
22. Chen G, Kerre EE, Vandenbulcke J. On the lossless join decomposition of relation scheme(s) in a fuzzy relational data model. In: Bilal M. Ayyub, editor. Proc ISUMA '93, Second International Symposium on Uncertainty Modeling and Analysis. Los Alamitos, CA: IEEE Computer Society Press; 1993. pp 440-446.
23. Chen G, Kerre EE, Vandenbulcke J. The dependency preserving decomposition and a testing algorithm in a fuzzy relational data model. Fuzzy Set Syst 1995;72:27-37.
24. Kerre E, Zenner R, De Clauwe R. The use of fuzzy set theory in information retrieval and databases: A survey. J Am Soc Inform Sci 1986;37:341-345.
