(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 4, April 2011
Boundary region: The boundary region of a concept is the set of those elementary sets that have something to say about the concept, excluding the positive region. It consists of those objects that can neither be ruled in nor ruled out as members of the target set. These objects can be assigned the class denoted by $Y_i$ only ambiguously (with confidence less than 100%). Hence, it is trivial that if $\mathrm{BND}_A = \emptyset$, then A is exact. This approach provides a mathematical tool that can be used to find all possible reducts.
$$\mathrm{BND}_A(Y_i) = \overline{R}_A(Y_i) - \underline{R}_A(Y_i) \tag{6}$$

Negative region: The negative region of a concept is the set of those elementary sets that have nothing to say about the concept. These objects cannot be assigned the class denoted by $Y_i$ (their confidence of belonging to class $Y_i$ is in fact 0%).
$$\mathrm{NEG}_A(Y_i) = U - \overline{R}_A(Y_i) \tag{7}$$

Concept set: The concept set consists of the equivalence classes induced by the class, while the elementary sets are the equivalence classes induced by the attributes. As mentioned above, the goal of the rough set approach is to understand the concept in terms of elementary sets. In order to map between elementary sets and concepts, the lower and upper approximations must first be defined. Then the positive, boundary and negative regions can be defined based on the approximations to generate rules for categorization. Once the effect of each subclass of the concept is defined, the last step before rule generation is to define the net effect on the entire set of concepts. Given the effect of a subset of the concept, $\mathrm{POS}_A(Y_i)$, the net effect on the entire set of concepts is defined as:
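$$\begin{aligned}
\mathrm{POS}_A(Y) &= \bigcup_{i=1}^{k} \mathrm{POS}_A(Y_i) \\
\mathrm{BND}_A(Y) &= \bigcup_{i=1}^{k} \mathrm{BND}_A(Y_i) \\
\mathrm{NEG}_A(Y) &= U - \bigcup_{i=1}^{k} \overline{R}_A(Y_i)
\end{aligned} \tag{8}$$

Generating rules: There are two kinds of rules, generated from the POS and the BND regions respectively. For any $X_i \subseteq \mathrm{POS}_A(Y_j)$, we can generate a 100% confidence rule of the form: If $X_i$ then $Y_j$ (or $X_i \rightarrow Y_j$). For any $X_i \subseteq \mathrm{BND}_A(Y_j)$, we can generate a <100% confidence rule of the form: If $X_i$ then $Y_j$ (or $X_i \rightarrow Y_j$), with confidence given as: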
$$\mathrm{conf} = \frac{|X_i \cap Y_j|}{|X_i|} \tag{9}$$

Assessing a rule: As mentioned above, the goal of RS is to generate a set of rules that are high in dependency, discriminating index, and significance. There are three methods of assessing the importance of an attribute:

- Dependency: How much does a class depend on A (a subset of the attributes)?
$$\gamma_A(\mathrm{class}) = \frac{|\mathrm{POS}_A(\mathrm{class})|}{|U|} \tag{10}$$

- Discriminating index: The ability of the attributes A to distinguish between classes.
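$$\alpha_A(\mathrm{class}) = \frac{|U - \mathrm{BND}_A(\mathrm{class})|}{|U|} = \frac{|\mathrm{POS}_A(\mathrm{class}) \cup \mathrm{NEG}_A(\mathrm{class})|}{|U|} \tag{11}$$

- Significance: How much is the dependency affected by the removal of A?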
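$$\sigma_A(\mathrm{class}) = \gamma_{\{A_1, A_2, \ldots, A_d\}}(\mathrm{class}) - \gamma_{\{A_1, A_2, \ldots, A_d\} - A}(\mathrm{class}) \tag{12}$$

Here $\gamma$ is the dependency measure of (10). The significance of A is computed with regard to the entire set of attributes. If the change in the dependency after removing A is large, then A is more significant.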
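The definitions in (6) to (12) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy decision table, the attribute names `a` and `b`, and all function names are hypothetical, and the sketch assumes that $\mathrm{POS}_A(Y_i)$ is the lower approximation of $Y_i$, which is the usual convention but is not restated in this section.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy decision table: object -> (attribute values, class label).
TABLE = {
    "o1": ({"a": 0, "b": 0}, "yes"),
    "o2": ({"a": 0, "b": 0}, "no"),
    "o3": ({"a": 0, "b": 1}, "yes"),
    "o4": ({"a": 1, "b": 1}, "no"),
    "o5": ({"a": 1, "b": 1}, "no"),
    "o6": ({"a": 1, "b": 0}, "no"),
}
U = set(TABLE)

def elementary_sets(attrs):
    """Blocks of objects that are indistinguishable on the attributes attrs."""
    blocks = defaultdict(set)
    for obj, (values, _) in TABLE.items():
        blocks[tuple(values[a] for a in sorted(attrs))].add(obj)
    return list(blocks.values())

def concepts():
    """Blocks of objects that share a class label (the concept set)."""
    blocks = defaultdict(set)
    for obj, (_, label) in TABLE.items():
        blocks[label].add(obj)
    return dict(blocks)

def lower(attrs, Y):
    """Union of the elementary sets fully contained in Y."""
    return set().union(*[X for X in elementary_sets(attrs) if X <= Y])

def upper(attrs, Y):
    """Union of the elementary sets that intersect Y."""
    return set().union(*[X for X in elementary_sets(attrs) if X & Y])

def regions(attrs):
    """Net POS, BND and NEG regions over all concepts, as in (8)."""
    ys = list(concepts().values())
    pos = set().union(*[lower(attrs, Y) for Y in ys])  # assumes POS = lower approx.
    bnd = set().union(*[upper(attrs, Y) - lower(attrs, Y) for Y in ys])
    neg = U - set().union(*[upper(attrs, Y) for Y in ys])
    return pos, bnd, neg

def confidence(X, Y):
    """Rule confidence |X ∩ Y| / |X|, as in (9)."""
    return Fraction(len(X & Y), len(X))

def dependency(attrs):
    """|POS| / |U|, as in (10)."""
    return Fraction(len(regions(attrs)[0]), len(U))

def discriminating_index(attrs):
    """|U - BND| / |U|, as in (11)."""
    return Fraction(len(U - regions(attrs)[1]), len(U))

def significance(attr, all_attrs):
    """Drop in dependency when attr is removed, as in (12)."""
    return dependency(all_attrs) - dependency(all_attrs - {attr})

A = {"a", "b"}
print(dependency(A))            # 2/3
print(discriminating_index(A))  # 2/3
print(significance("a", A))     # 2/3
print(significance("b", A))     # 1/6
```

In this toy table the elementary set {o1, o2} falls in the boundary region, so the rule "if a = 0 and b = 0 then yes" has confidence 1/2. Removing `a` lowers the dependency more than removing `b`, so `a` is the more significant attribute here.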
B. Rough Set Based Attribute Reduction
1) Literature overview
Attribute or feature selection aims to identify the significant features, eliminate the features that are irrelevant or dispensable to the learning task, and build a good learning model. It refers to choosing a subset of attributes from the set of original attributes. Attribute or feature selection of an information system is a key problem in RS theory and its applications. Using computational intelligence tools to solve such problems has recently fascinated many researchers. Computational intelligence tools are practical and robust for many real-world problems, and they are being developed rapidly. Computational intelligence tools and applications have grown rapidly since their inception in the early nineties of the last century [5, 8, 16, 24]. Computational intelligence tools, which are alternatively called soft computing, were at first limited to fuzzy logic, neural networks and evolutionary computing, as well as their hybrid methods [16, 40]. Nowadays, the definition of computational intelligence tools has been extended to cover many other machine learning tools. One of the main computational intelligence classes is Granular Computing [25, 40], which has recently been developed to cover all tools that mainly invoke computing with fuzzy and rough sets.
However, some classes of computational intelligence tools, like memory-based heuristics, have been involved in solving information systems and DM applications, as have other well-known computational intelligence tools such as evolutionary computing and neural networks. One class of promising computational intelligence tools is memory-based heuristics, like Tabu Search (TS), which have shown successful performance in solving many combinatorial search problems [10, 32]. However, the contributions of memory-based heuristics to information systems and data mining applications are still limited compared with other computational
http://sites.google.com/site/ijcsis/ ISSN 1947-5500