
Introduction

Computational intelligence includes:

Fuzzy logic
Rough sets and their applications
Soft sets and their applications
Artificial neural networks
Evolutionary algorithms
Dynamically evolving systems based on the above

Some authors also include other, belief-based areas of research, such as:

Dempster-Shafer theory
chaos theory
swarm and collective intelligence.

Application areas include:

pattern recognition
image processing
classification
reduction of knowledge
prediction of missing values
business and video analytics

Areas of research that are closer to statistical learning, such as:

support vector machines

probability theory

Bayesian methods

Markov models

are also sometimes considered to be a part of computational intelligence.

Rough set theory can be considered a specific point on the long road of mathematics.

We can say that it is

a topological bridge from real-life problems to computer science, and consequently to applications
in many fields.

This theory depends on a topological structure called a quasi-discrete topology generated by the
equivalence classes of the relation defined on the collection of data.

The theory was initiated by Pawlak. Example applications include:

Civil engineering,
medical data analysis,
pharmacology,
aircraft pilot performance evaluation,
data mining,
synthesis of switching circuits.

Roughian is an abbreviation for rough set information analysis.

This term appeared and came into use after the rapid development of rough set theory and its
applications. The main aim of this chapter is to give a comprehensive account of the basic concepts of
rough sets and their applications, which play an important role in:
reduction of knowledge
artificial intelligence
data mining
decision making, and others.

The approximation space is a pair (U,R),

where [U] is a non-empty finite set of objects (states, patients, digits, cars, etc.) called the
universe

and [R] is an equivalence relation over U which makes a partition of U,

i.e. a family C = {X1, X2, X3, …, Xn},

such that Xi ⊆ U, Xi ≠ φ,

Xi ∩ Xj = φ for i ≠ j, i,j = 1,2,3,…,n,

and X1 ∪ X2 ∪ … ∪ Xn = U;

the class [C] is called the knowledge base of (U,R).

The universe [U] of objects together with the relation [R] plays an important role in converting data
into knowledge: [R] serves as the tool of a mathematical model for dealing with members and subsets
of [U].

Thus we can say that [R] changes [U] from just being a set into a mathematical model.

EX:
Let [U]={1,2,3,4,5,6,7,8,9} and let [R] be a relation which partitions [U] into two classes, even and odd
numbers;

then the knowledge base[C] is:

U/R= {{1,3,5,7,9},{2,4,6,8}}
X1={1,3,5,7,9}
X2={2,4,6,8}
X1∩X2=φ
X1∪X2=U
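
This partition step is easy to compute. Below is a minimal Python sketch (the helper name partition is ours): the relation "same parity" is applied as a key function, and each distinct key value yields one equivalence class.

    # Partition U by an equivalence relation given as a key function:
    # x R y  iff  key(x) == key(y).
    U = {1, 2, 3, 4, 5, 6, 7, 8, 9}

    def partition(universe, key):
        classes = {}
        for x in universe:
            classes.setdefault(key(x), set()).add(x)
        return list(classes.values())

    C = partition(U, key=lambda x: x % 2)   # parity plays the role of R
    print(C)  # [{1, 3, 5, 7, 9}, {2, 4, 6, 8}] (class order may vary)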

U: the universe to which all the elements belong.

R: an equivalence relation which partitions [U].

C: the knowledge base, containing all the partition classes [U/R].

U/R: the knowledge base [C].


A: the set of all attributes of [U].

B: some subset of the attributes [A].

ρ: a function that maps an element of [U] and an attribute from [A] to a value in [V].

V: the set of values produced by [ρ] for the attributes.

S: the information system, S=(U,A,ρ,V).

An information system [S] can be defined as S=(U,A,ρ,V), where:

[U] is a non-empty finite set of objects called the universe

[A] is a non-empty finite set of all attributes

(features, variables, characteristic conditions, LEDs, etc.).

In an information system S=(U,A,ρ,V), a subset X ⊆ U will be called a concept or a category in [U].

Each attribute [a] ∈ [A] can be viewed as a function [ρ] that maps elements of [U] into a set [Va];

the set [Va] is called the value set of the attribute [a]:

ρ : U × A → V

The following table contains the records of 9 patients, with the symptom attributes temperature,
cough, headache and muscle pain, together with the diagnosis made by a doctor stating whether or
not the patient has flu.

EX:
This is an example of an information system: the objects [U] are the patient IDs,

i.e. U={1,2,3,….,8,9},

and the attributes [A] are the symptoms,

i.e. A={Temp, Cough, Headache, Muscle pain, Flu},

and the values [V] of information table are normal, subfev, high, yes and no.

EX:
The objects are the digits [U]={0,1,2,3,…,9}

and the attributes are the LEDs of the display, A={a,b,c,d,e,f,g}.

We find that the values [V] are Va=Vb=Vc=Vd=Ve=Vf=Vg={0,1}

1 ≡ ON

0 ≡ OFF

a(5)=1

b(5)=0

g(9)=1
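
For concreteness, the LED information system can be written down as a small table. The sketch below is in Python; the names LED, ATTRS and rho are ours, and the 0/1 values follow the standard seven-segment encoding, which agrees with the equivalence classes computed from Table (1.2) in the following sections.

    # The LED information system: objects are the digits of U,
    # attributes are the display segments a..g, values are 0/1.
    LED = {
        #    a  b  c  d  e  f  g
        0: (1, 1, 1, 1, 1, 1, 0),
        1: (0, 1, 1, 0, 0, 0, 0),
        2: (1, 1, 0, 1, 1, 0, 1),
        3: (1, 1, 1, 1, 0, 0, 1),
        4: (0, 1, 1, 0, 0, 1, 1),
        5: (1, 0, 1, 1, 0, 1, 1),
        6: (1, 0, 1, 1, 1, 1, 1),
        7: (1, 1, 1, 0, 0, 0, 0),
        8: (1, 1, 1, 1, 1, 1, 1),
        9: (1, 1, 1, 1, 0, 1, 1),
    }
    ATTRS = "abcdefg"

    def rho(x, a):
        # the information function rho : U x A -> V
        return LED[x][ATTRS.index(a)]

    assert rho(5, "a") == 1 and rho(5, "b") == 0 and rho(9, "g") == 1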

mathematical models
To extract knowledge from an information system, it is necessary to find mathematical models using
the given data.

In the following we give some mathematical concepts that help in modeling information systems.

indiscernibility
For a subset B ⊆ A

The indiscernibility (similarity) relation is:

IND(B) = { (x,y) ∈ U² : a(x)=a(y), ∀ a∈B }


which groups the objects that possess the same features

(the values of the attributes [A]) with respect to B,

i.e. the objects that are indiscernible (such a description is not necessarily unique to one object).

IND(B) is an equivalence relation[R] that partitions [U] and divides it into equivalence classes,

IND(B) = ∩a∈B IND({a})

An equivalence class of IND(B) is denoted by:

[x]B ={y∈U : (x,y) ∈ IND(B)}

Both x,y are objects in [U]

Thus the family of all equivalence classes with respect to [B] is denoted by U/IND(B), where:

U/IND(B) = {[x]B : x∈U}

∀ : For all.

Indiscernibility: an object's description need not be unique; several objects may be indiscernible from one another.
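
Before the worked example below, here is a minimal Python sketch of U/IND(B), reusing LED and rho from the LED example (classes is our own helper name): two objects fall into the same class exactly when their value tuples on B agree.

    def classes(B):
        # U/IND(B): group objects by their tuple of values on B
        buckets = {}
        for x in LED:
            signature = tuple(rho(x, a) for a in B)
            buckets.setdefault(signature, set()).add(x)
        return list(buckets.values())

    print(classes("abc"))  # [{0, 3, 7, 8, 9}, {1, 4}, {2}, {5, 6}]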

EX:
Given a set of attributes B={a,b,c}.

Then from Table (1.2),

we get:

IND(B)={(0,0),(1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8),(9,9),(0,3),(3,0),(0,7),(7,0),(0,8),(8,0),(0,9),(9,0),
(1,4),(4,1),(3,7),(7,3),(3,8),(8,3),(3,9),(9,3),(5,6),(6,5),(7,8),(8,7),(7,9),(9,7),(8,9),(9,8)}

[0]B={0,3,7,8,9}

[1]B={1,4}

[2]B={2}

[5]B={5,6}

U={0,1,2,3,4,5,6,7,8,9}

U/IND(B)={{0,3,7,8,9},{1,4},{2},{5,6}}

Note:
Every element is similar to itself: {(0,0),(1,1),(2,2),(3,3),…,(9,9)}.

The objects {0,3,7,8,9} are similar in the values of the attributes B={a,b,c}, which produces the pairs
(0,3),(3,0),(0,7),(7,0),(3,7),(7,3),(0,8),(8,0),… in IND(B).

The same goes for {1,4} and {5,6}.

{2} is unique.

IND(B): groups the objects of [U] that possess the same features

(the values [V] of the attributes [B]) with respect to [B].

[x]B: an equivalence class of IND(B).

U/IND(B): the family of all equivalence classes [x]B (the knowledge base [C]).


discernibility matrix
An information system [S] defines a matrix MA called the discernibility matrix.

Each entry: MA(x,y)⊆A

consists of a set of attributes that can be used to discern between objects x,y ∈ U :

MA(x,y) = {a ∈ A / a(x) ≠ a(y)}


i.e. the set of all attributes in which the elements x and y differ.

MA is a |U|×|U| matrix. The discernibility matrix has another form:

Mij = {a ∈ A : a(xi) ≠ a(xj)}, i,j ∈ [1,n], n = |U|,

where |U| is the number of elements in U.

The discernibility matrix of Table (1.2) is shown in Table (1.3). Since MA(x,y) = MA(y,x) and the
diagonal entries are empty, only one triangle of the matrix is filled in: the row of object 9 and the
column of object 0 are left blank, and of every pair of symmetric entries only one is shown.

Note:

for example, objects (0,1) differ in the values of (a,d,e,f).

If two objects were similar (an empty entry), we could delete one of them, because the same data is
already represented by the other.
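
A minimal Python sketch of the matrix, reusing LED, ATTRS and rho from the LED example (disc_matrix is our own helper name):

    # M_A(x, y): the set of attributes on which x and y take
    # different values; empty on the diagonal, symmetric in x, y.
    def disc_matrix():
        return {(x, y): {a for a in ATTRS if rho(x, a) != rho(y, a)}
                for x in LED for y in LED}

    M = disc_matrix()
    print(sorted(M[(0, 1)]))  # ['a', 'd', 'e', 'f'], as in the note above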

discernibility function:
The discernibility function is a Boolean function representation of the discernibility matrix.

ƒA(a1,a2,…,an) = ∧{ ∨Mij : 1 ≤ j < i ≤ n, Mij ≠ φ },

which uses the product of sums (P.O.S.) form, n = |U|.

This expression is used for reduction of attributes.

Also ƒA(x) = ∧y∈U { ∨a : a ∈ MA(x,y), MA(x,y) ≠ φ },

which is used for decision making.
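
As a sketch of how the product of sums is reduced in practice (Python again, reusing the matrix M from above; discernibility_clauses is our own helper name): each non-empty matrix entry contributes one sum clause, and the absorption law X.(X+Y) = X deletes every clause that contains a smaller one. For this LED table absorption alone collapses the function to singletons; in general the product of sums must still be expanded into a sum of products to read off all reducts.

    def discernibility_clauses():
        objs = sorted(LED)
        clauses = {frozenset(M[(objs[i], objs[j])])
                   for i in range(len(objs)) for j in range(i)
                   if M[(objs[i], objs[j])]}
        # absorption: drop any clause that properly contains another
        return {c for c in clauses if not any(d < c for d in clauses)}

    for clause in sorted(discernibility_clauses(), key=sorted):
        print(set(clause))  # {'a'} {'b'} {'e'} {'f'} {'g'}: f_A = a.b.e.f.g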

In the following sections, reduction of attributes and decision making will be discussed.

Reduction of knowledge is divided into reduction of objects and reduction of attributes.

reduction of objects:
We can reduce the objects by using the discernibility matrix: if we find an empty set (φ) between two
objects in the discernibility matrix, then we can eliminate one of them.

The two objects are completely similar, so one of them is enough for the calculations.
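
A sketch of this check in Python, reusing the matrix M from above (reduce_objects is our own helper name); for the LED table every off-diagonal entry is non-empty, so no object is removed:

    def reduce_objects():
        # keep an object only if it differs from every object kept so far
        kept = []
        for x in LED:
            if not any(M[(x, y)] == set() for y in kept):
                kept.append(x)
        return kept

    print(reduce_objects())  # [0, 1, 2, ..., 9]: all ten digits survive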

reduction of attributes:
Let B ⊆ A, a ∈ B; then [a] is a superfluous (unnecessary) attribute in [B] if:

U/IND(B) = U/IND(B-{a})

And the reduct of B is RED(B)=B-{a}


The set M is called a minimal reduct of B if:

(i) U/IND(M) = U/IND(B)

(ii) U/IND(M) ≠ U/IND(M-{a}), ∀ a∈M

The core is the set of all characteristics of knowledge which cannot be eliminated in the reduction
process:

CORE(B)= ∩ RED(B)

EX:
Let B = {a,b,c,d}. Then the set of all reducts [RED(B)] is given by:

RED(B)={{a,b}, {a,b,c}, {a,b,d}, {a,c,d}, {c,d}, {b,c,d}}

But the minimal reducts are M={{a,b},{c,d}}.

Then CORE(B) = ∩RED(B) = {a,b}∩{c,d} = φ

So there is no attribute that cannot be eliminated: the core is empty (the value is φ).

MA: the discernibility matrix, a |U|×|U| matrix.

For each pair of objects x,y ∈ U it contains the set of all attributes in which x and y differ,

i.e. the attributes that can be used to discern between them.
ƒA: the discernibility function, a Boolean function representation of the discernibility matrix,

which uses the product of sums (P.O.S.).

M: a minimal reduct of [B].

RED(B): the reduct of [B], RED(B)=B-{a}, where [a] is a superfluous (unnecessary) attribute in [B].

CORE(B): the set of all characteristics of knowledge which cannot be eliminated in the reduction
process, CORE(B)= ∩RED(B).


For object reduction: there is no reduction of objects, because there is no empty set between any
two objects in the discernibility matrix.

For attribute reduction we get:

U/IND(a) = {{0,2,3,5,6,7,8,9},{1,4}}

U/IND(b) = {{0,1,2,3,4,7,8,9},{5,6}}

U/IND(c) = {{0,1,3,4,5,6,7,8,9},{2}}

U/IND(d) = {{0,2,3,5,6,8,9},{1,4,7}}

U/IND(e) = {{0,2,6,8},{1,3,4,5,7,9}}
U/IND(f) = {{0,4,5,6,8,9},{1,2,3,7}}

U/IND(g) = {{0,1,7},{2,3,4,5,6,8,9}}

U/IND(A)= {{0},{1},{2},{3},{4},{5},{6},{7},{8},{9}}
Note that all sets are different.

and we find that :

U/IND(A) ≠ U/IND(A - {a})

U/IND(A) ≠ U/IND(A - {b})

U/IND(A) = U/IND(A - {c})

U/IND(A) = U/IND(A - {d})

U/IND(A) ≠ U/IND(A - {e})


U/IND(A) ≠ U/IND(A - {f})
U/IND(A) ≠ U/IND(A - {g})

Then attributes c and d are superfluous attributes and we get:

U/IND(A) = U/IND(A - {c,d})

Then the class of reducts of [A] is:

RED(A) = {{a,b,d,e,f,g},{a,b,c,e,f,g},{a,b,e,f,g}}

The minimal reduct of [A] is M = {a,b,e,f,g}

And the core of [A] is: CORE(A) = ∩RED(A) = {a,b,e,f,g}
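
This result can also be checked by brute force, which is feasible here because |A| = 7. A minimal Python sketch (reusing classes and ATTRS from above; partition_key and minimal_reducts are our own helper names): a candidate is a reduct when it induces the same partition as A and contains no smaller reduct.

    from itertools import combinations

    def partition_key(B):
        # U/IND(B) in a hashable, order-independent form
        return frozenset(frozenset(c) for c in classes(B))

    def minimal_reducts():
        full = partition_key(ATTRS)
        reducts = []
        for r in range(1, len(ATTRS) + 1):
            for cand in combinations(ATTRS, r):
                if (partition_key(cand) == full
                        and not any(set(R) <= set(cand) for R in reducts)):
                    reducts.append(cand)
        return reducts

    print(minimal_reducts())  # [('a', 'b', 'e', 'f', 'g')]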

Note:
we can obtain the minimal reduct in one step by using discernibility function as shown in the next
example.

Notice that the attributes c and d have been removed (reduced).

We can get to the same result as in the last example by using Table (1.3) and the discernibility
function, as follows:

ƒA(a,b,c,d,e,f,g) =

{a+d+e+f}.{c+f+g}.{e+f+g}.{a+d+e+g}.{b+e+g}.{b+g}.{d+e+f}.{g}.{e+g}

.{a+c+d+e+g}.{a+d+g}.{f+g}.{a+b+d+f+g}.{a+b+d+e+f+g}.{a}.{a+d+e+f+g}.{a+d+f+g}

.{c+e}.{a+c+d+e+f}.{b+c+e+f}.{b+c+f}.{c+d+e+g}.{c+f}.{c+e+f}

.{a+d+f}.{b+f}.{b+e+f}.{d+g}.{e+f}.{f}

.{a+b+d}.{a+b+d+e}.{a+f+g}.{a+d+e}.{a+d}

.{e}.{b+d+f+g}.{b+e}.{b}

.{b+d+e+f+g}.{b}.{b+e}

.{d+e+f+g}.{d+f+g}

.{e}

= a.b.e.f.g

Then RED(A) = ƒA(a,b,c,d,e,f,g) = {a,b,e,f,g},

which is the minimal reduct M of [A]

and CORE(A) = {a,b,e,f,g}

P.O.S.: by the absorption law, take the smallest sets (for example the singleton {a}) and delete every larger set that contains them.

Notice that the attributes c and d have been removed (reduced).

Note:

if ƒA = a+b+e+f+g,

the result would be five alternative single-attribute reducts:

{{a},{b},{e},{f},{g}}

while ƒA = a.b.e.f.g

gives the single reduct:

{a,b,e,f,g}

decision making
In this section we illustrate decision making by example: it gives us a decision table, from which we
find all minimal decision algorithms associated with the table.

The discernibility matrix and the discernibility function are used for decision making.

From Table (1.4),

to finalize the decision-making table we first reduce the attributes.

NOTE that (c,d) are removed (reduced).

Then we make a new discernibility matrix, as in Table (1.5):

By using the discernibility function we get the decision table, which is shown in Table (1.6).

We get the discernibility function for each object from Table (1.5),

by taking the attributes found in the row and the column of each object:

fA(0) = {a+e+f}.{f+g}.{e+f+g}.{a+e+g}.{b+e+g}.{b+g}.{e+f}.{g}.{e+g} = {g}.{e+f} = g.e + g.f

fA(1) = {a+e+f}.{a+e+g}.{a+g}.{f+g}.{a+b+f+g}.{a+b+e+f+g}.{a}.{a+e+f+g}.{a+f+g} = {a}.{f+g} = a.f + a.g

fA(2) = {f+g}.{a+e+g}.{e}.{a+e+f}.{b+e+f}.{b+f}.{e+g}.{f}.{e+f} = e.f

fA(3) = {e+f+g}.{a+g}.{e}.{a+f}.{b+f}.{b+e+f}.{g}.{e+f}.{f} = e.f.g

fA(4) = {a+e+g}.{f+g}.{a+e+f}.{a+f}.{a+b}.{a+b+e}.{a+f+g}.{a+e}.{a} = {f+g}.{a} = a.f + a.g

fA(5) = {b+e+g}.{a+b+f+g}.{b+e+f}.{b+f}.{a+b}.{e}.{b+f+g}.{b+e}.{b} = b.e

fA(6) = {b+g}.{a+b+e+f+g}.{b+f}.{b+e+f}.{a+b+e}.{e}.{b+e+f+g}.{b}.{b+e} = b.e

fA(7) = {e+f}.{a}.{e+g}.{g}.{a+f+g}.{b+f+g}.{b+e+f+g}.{e+f+g}.{f+g} = a.g.{e+f} = a.g.e + a.g.f

fA(8) = {g}.{a+e+f+g}.{f}.{e+f}.{a+e}.{b+e}.{b}.{e+f+g}.{e} = b.e.f.g

fA(9) = {e+g}.{a+f+g}.{e+f}.{f}.{a}.{b}.{b+e}.{f+g}.{e} = a.b.e.f

From the above, we can read off the decision rules for the objects 0,1,2,…,9.

As an example:

the digit 1 appears iff (Va(1)=0 and Vf(1)=0) or (Va(1)=0 and Vg(1)=0);

the digit 6 appears iff Ve(6)=1 and Vb(6)=0;

and so on.
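
The same absorption trick computes these per-object functions, as the following Python sketch shows (reusing the matrix M from above; REDUCED and object_function are our own names, and restricting each matrix entry to {a,b,e,f,g} corresponds to Table (1.5)):

    REDUCED = "abefg"   # the attribute set A after removing c and d

    def object_function(x):
        # clauses of f_A(x): for each other object y, the reduced
        # attributes that separate x from y (never empty, since
        # {a,b,e,f,g} discerns every pair of digits)
        clauses = {frozenset(a for a in M[(x, y)] if a in REDUCED)
                   for y in LED if y != x}
        return {c for c in clauses if not any(d < c for d in clauses)}

    print([set(c) for c in object_function(6)])
    # [{'b'}, {'e'}] (order may vary), i.e. f_A(6) = b.e as computed above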

