Knowledge based Interactive Post mining using association rules and Ontologies

OUTLINE
Introduction  Existing System  Proposed System  Advantages in the Proposed System  Methodologies  System Specification  Conclusion

Introduction
In data mining, the usefulness of association rules is strongly limited by the huge amount of delivered rules. To overcome this drawback, several methods were proposed in the literature, such as itemset concise representations, redundancy reduction, and postprocessing. Since these methods do not guarantee that the rules extracted through mining are useful for the user, it is necessary to help the decision maker with an efficient postprocessing step that reduces the number of rules. Thus, a new interactive approach to prune and filter discovered rules is proposed.

Association Rule Mining
The problem of association rule mining is defined as follows. Let I = {i1, i2, i3, …, in} be a set of n binary attributes called items. Let D = {t1, t2, …, tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. A rule is defined as an implication of the form X ⇒ Y, where X, Y ⊆ I and X ∩ Y = ∅. The sets of items (itemsets for short) X and Y are called the antecedent (left-hand side, or LHS) and the consequent (right-hand side, or RHS) of the rule, respectively.
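The definition above can be made concrete with a small sketch. The item names and the five transactions are invented for illustration, and the helper names (`is_rule`, `support`, `confidence`) are hypothetical. Support and confidence are not defined on this slide; they are the standard measures usually attached to a rule X ⇒ Y and are included here only as an assumption about how rule strength would be expressed.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]

def is_rule(x: Itemset, y: Itemset, items: Itemset) -> bool:
    """Check the definition: X and Y are subsets of I and share no item."""
    return x <= items and y <= items and not (x & y)

def support(itemset: Itemset, database: List[Itemset]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in database if itemset <= t) / len(database)

def confidence(x: Itemset, y: Itemset, database: List[Itemset]) -> float:
    """Support of X and Y together divided by the support of X (0.0 if X never occurs)."""
    base = support(x, database)
    return support(x | y, database) / base if base > 0 else 0.0

# Hypothetical item set I and database D; the list index plays the role of the transaction ID.
ITEMS: Itemset = frozenset({"milk", "bread", "butter", "beer"})
DATABASE: List[Itemset] = [
    frozenset({"milk", "bread"}),
    frozenset({"butter"}),
    frozenset({"beer"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]
```

With this toy database, the rule {milk} ⇒ {bread} is a valid rule in the sense of the definition, while {milk} ⇒ {milk, bread} is not, because its two sides overlap.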

Apriori [1] is the first algorithm proposed in the association rule mining field, and many other algorithms were derived from it. The mining algorithms can discover a prohibitive amount of association rules; for instance, thousands of rules are extracted from a database of several dozens of attributes and several hundreds of transactions. As a result, rules that would surprise the user, and that may be very important, can go unnoticed.
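To illustrate why the number of results grows so quickly, here is a minimal level-wise sketch in the spirit of Apriori. It is a simplified teaching version under stated assumptions, not the optimized algorithm of [1]: the function name `apriori`, the `min_support` parameter, and the toy transactions are all hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

def apriori(database: List[FrozenSet[str]], min_support: float) -> Dict[FrozenSet[str], int]:
    """Return every frequent itemset with its absolute count, level by level."""
    min_count = min_support * len(database)

    # Level 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for transaction in database:
        for item in transaction:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        previous = set(frequent)
        # Join step: merge two frequent (k-1)-itemsets into a k-itemset.
        # Prune step: keep it only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a in previous:
            for b in previous:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in previous for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Count step: one pass over the database per level.
        counts = {c: sum(1 for t in database if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result

# Hypothetical database of four transactions.
TRANSACTIONS = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]
FREQUENT = apriori(TRANSACTIONS, min_support=0.5)
```

Even on four transactions over three items, six itemsets are frequent at a 50% threshold, and each itemset with two or more items can yield several rules, which hints at how the rule count grows on real data.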

Mafia: A Maximal Frequent Itemset Algorithm
Clustering Web documents efficiently and yet accurately is of great interest to Web users and is a key component of the searching accuracy of a Web search engine. To achieve this, this paper introduces a new approach for the clustering of Web documents, called the maximal frequent itemset (MFI) approach. The MFI approach first locates the center points of high-density clusters precisely. These center points are then used as initial points.

Theme   Redundancy Reduction of Association Rules(RRAR) Zaki and Hsiao used frequent closed itemsetsin the APRIORI algorithm in order to generate all frequent closed itemsets. Concise Representations of Frequent Itemsets(CRFI) represents metrics in the process of capturing dependencies and implications between database items. they generated nonredundant association rules using two closed itemsets. They used an itemset-tid set search tree and pursued with the aim of generating a small nonredundant rule set.  . To this goal. and express the strength of the pattern association.

The item-relatedness filter (IRF) was proposed by Shekar and Natarajan. Starting from the idea that the discovered rules are generally obvious, they introduced the idea of relatedness between items, measured by their semantic distance in item taxonomies. This measure computes the relatedness of all the pairs of rule items.
We propose ARIPSO (Association Rule Interactive post-Processing using Schemas and Ontologies) to prune and filter discovered rules. First, we propose to use domain ontologies in order to strengthen the integration of user knowledge in the postprocessing task.

Existing System
In the existing system, the problem is selecting interesting association rules from huge volumes of discovered rules. Ontologies and rule schemas are not used. It takes more time to reduce the rules. Rules become almost impossible to use when their number exceeds 100, and the mining result becomes intractable for a decision maker to analyze. Accuracy and efficiency are very low.

Proposed System
First, we propose to integrate user knowledge in association rule mining using two different types of formalism: ontologies and rule schemas. Second, we propose to use ontologies in order to improve the integration of user knowledge in the postprocessing task. Third, we propose the Rule Schema formalism, extending the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analysis task.

Advantages of Proposed System
Since the delivered rules are minimal, the user gets access to each and every rule. The system has high accuracy. It is more efficient than the existing system. Implementation is easy when compared to the earlier proposed systems, which had many drawbacks.

HARDWARE CONFIGURATION
Hard disk: 40 GB
RAM: 512 MB
Processor: Pentium IV
Monitor: 17" color monitor

SOFTWARE CONFIGURATION
Front End: Java, J2EE
Back End: SQL Server 2000
Server: Apache Tomcat
Tools: Java Eclipse
Operating System: Windows XP

Methodologies  Modules:  Knowledge based Data Mining using Ontologies  Knowledge based Data Mining using Rule Schemas  Operations over Rule Schemas   .

Knowledge based Data Mining using Ontologies
Domain knowledge, defined as the user information concerning the database, is described in our framework using ontologies. Ontologies offer a more complex knowledge representation model by extending the single is-a relation present in a taxonomy with the set R of relations. In addition, the axioms bring important improvements, permitting concept definition starting from existing information in the ontology.
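As a rough illustration of how an ontology goes beyond a taxonomy, the sketch below stores an is-a hierarchy plus one extra relation, and resolves a concept to the database items it covers. Everything in it is hypothetical: the concept names, the `storedIn` relation, the mapping from concepts to items, and the function names. The slides do not specify the actual ontology or how the framework stores it, and a real system would more likely use an ontology language and reasoner than plain dictionaries.

```python
from typing import Dict, Set, Tuple

# Taxonomy part: child concept -> parent concept (the is-a relation).
IS_A: Dict[str, str] = {
    "Apple": "Fruit",
    "Pear": "Fruit",
    "Fruit": "Food",
    "Milk": "Dairy",
    "Dairy": "Food",
}

# Extra relations from the set R: (subject concept, relation name, object).
RELATIONS: Set[Tuple[str, str, str]] = {
    ("Milk", "storedIn", "Fridge"),
    ("Fruit", "storedIn", "Shelf"),
}

# Assumed link between leaf concepts and items of the database.
CONCEPT_ITEMS: Dict[str, Set[str]] = {
    "Apple": {"apple"},
    "Pear": {"pear"},
    "Milk": {"milk"},
}

def subconcepts(concept: str) -> Set[str]:
    """Return the concept together with all of its descendants under is-a."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_A.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

def items_of(concept: str) -> Set[str]:
    """Database items covered by a concept, through its descendants."""
    items: Set[str] = set()
    for c in subconcepts(concept):
        items |= CONCEPT_ITEMS.get(c, set())
    return items

def defined_items(relation: str, target: str) -> Set[str]:
    """Items of a concept defined by a restriction, e.g. everything storedIn Fridge."""
    items: Set[str] = set()
    for subject, rel, obj in RELATIONS:
        if rel == relation and obj == target:
            items |= items_of(subject)
    return items
```

Here `items_of("Fruit")` uses only the taxonomy, while `defined_items("storedIn", "Shelf")` builds a new concept from existing information, which is the kind of definition the axioms are meant to permit.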

Knowledge based Data Mining using Rule Schemas
The rule schema filter is based on operators applied over rule schemas, allowing the user to perform several actions over the discovered rules. We propose two important operators: pruning and filtering. The pruning operator allows the user to remove families of rules that he/she considers uninteresting. The filtering operator is composed of three different operators: conforming, unexpectedness, and exception.

Operations over Rule Schemas
We propose to reuse the operators proposed by Liu et al., conforming and unexpectedness, and we bring two new operators to the postprocessing task: pruning and exceptions.
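The following sketch shows one possible reading of three of these operators on a small list of rules. The matching semantics are simplifying assumptions, not the formal definitions from Liu et al. or from this work: a rule is taken to conform to a schema when each side of the rule contains the corresponding side of the schema, and a rule is taken to be unexpected in its consequent when its antecedent matches but its consequent shares no item with the schema. The exception operator is left out because the slides do not describe it. All names and rules are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple

class Rule(NamedTuple):
    """An association rule lhs => rhs; also reused to represent a rule schema."""
    lhs: FrozenSet[str]
    rhs: FrozenSet[str]

def conforms(rule: Rule, schema: Rule) -> bool:
    """Assumed test: both sides of the schema are contained in the rule."""
    return schema.lhs <= rule.lhs and schema.rhs <= rule.rhs

def conforming(rules: List[Rule], schema: Rule) -> List[Rule]:
    """Filtering: keep only the rules that conform to the schema."""
    return [r for r in rules if conforms(r, schema)]

def prune(rules: List[Rule], schema: Rule) -> List[Rule]:
    """Pruning: remove the family of rules described by the schema."""
    return [r for r in rules if not conforms(r, schema)]

def unexpected_consequent(rules: List[Rule], schema: Rule) -> List[Rule]:
    """Filtering: antecedent matches the schema, consequent is disjoint from it."""
    return [r for r in rules if schema.lhs <= r.lhs and not (schema.rhs & r.rhs)]

# Hypothetical mined rules and one user schema "milk => bread".
R1 = Rule(frozenset({"milk"}), frozenset({"bread"}))
R2 = Rule(frozenset({"milk", "butter"}), frozenset({"bread"}))
R3 = Rule(frozenset({"milk"}), frozenset({"beer"}))
R4 = Rule(frozenset({"beer"}), frozenset({"chips"}))
RULES = [R1, R2, R3, R4]
SCHEMA = Rule(frozenset({"milk"}), frozenset({"bread"}))
```

With this schema, conforming keeps R1 and R2, pruning keeps R3 and R4, and the consequent-unexpectedness filter keeps only R3. In an interactive session the user would apply such operators repeatedly, revising schemas after inspecting each result.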

System Architecture  Use case diagram: User Login Rule Schemas Ontologies Operators Filtered Rules .

QUERIES
