
Quantitative Association Rule Mining Using Rough Set Approach


Dissertation submitted in partial fulfillment of the requirements for the award of the degree of

Master of Technology
in
Software Engineering

by
M. ASHOK KUMAR
(Reg. No: 11131D2501)

Under the esteemed guidance of
Dr. M. Phani Krishna Kishore
PROFESSOR

Department of Information Technology
GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (AUTONOMOUS)
(Affiliated to J.N.T.U., Kakinada)
VISAKHAPATNAM 530 048
2011-2013


CERTIFICATE
This is to certify that the dissertation entitled "Quantitative Association Rule Mining Using Rough Set Approach", submitted by Mr. M. Ashok Kumar in partial fulfillment of the requirements for the award of the degree of M.Tech in SOFTWARE ENGINEERING to the G.V.P. College of Engineering (Autonomous), is a record of bona fide work carried out by him under my guidance and supervision.

The results embodied in this thesis have not been submitted to any other University or Institute for the award of any degree or diploma.

Supervisor

Head of the Department
Department of Information Technology
GVP College of Engineering (Autonomous)
Visakhapatnam-530048.


ACKNOWLEDGEMENT
I am grateful to Dr. M.P.K. Kishore, Professor, Department of Information Technology, who has been my project guide and a constant source of inspiration, guiding me throughout the project.

I express my gratitude and thanks to Prof. K.B. Madhuri, Head of the Department of Information Technology, for providing all the resources necessary for completing this project. I am also thankful to the other faculty members, my parents, and my friends, who supported and helped me in every way towards the completion of my project. I also express my deep sense of gratitude to my esteemed Institute, GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (A), which has provided me with an opportunity to fulfill my most cherished desire and reach my goal.

M. ASHOK KUMAR 11131D2501

ABSTRACT
The goal of data mining is to extract higher-level information from an abundance of raw data. Association rules are a key tool for this purpose. An association rule is a rule of the form X → Y, where X and Y are events. The rule states that, with a certain probability called the confidence of the rule, when X occurs in the given database, so does Y. A well-known application of association rules is market basket data analysis.
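The notion of confidence described above can be illustrated with a minimal Python sketch. The function names `support` and `confidence` and the toy baskets are assumptions made for illustration only; they are not taken from the dissertation.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               x: Iterable[str], y: Iterable[str]) -> float:
    """Estimated probability that Y occurs in a transaction, given that X occurs."""
    x_items, y_items = frozenset(x), frozenset(y)
    support_x = support(transactions, x_items)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return support(transactions, x_items | y_items) / support_x


# Hypothetical market-basket data (illustrative only).
baskets = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

Here bread appears in three of the four baskets and bread together with milk in two, so the rule {bread} → {milk} has confidence 2/3.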

Quantitative association rules are multi-dimensional association rules in which the numeric attributes are dynamically discretized during the mining process so as to satisfy some mining criterion, such as maximizing the confidence or compactness of the rules mined.
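To make the idea of discretizing a numeric attribute concrete, the following Python sketch splits an attribute into intervals wherever two consecutive sorted values are far apart, then maps each value to an interval item. The gap-based criterion, the `min_gap` parameter, and the names `gap_cut_points` and `interval_index` are assumptions chosen for illustration; this is not the discretization procedure of the dissertation.

```python
from bisect import bisect_right
from typing import List, Sequence


def gap_cut_points(values: Sequence[float], min_gap: float) -> List[float]:
    """Place a cut midway between consecutive distinct values whose gap exceeds min_gap."""
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:]) if b - a > min_gap]


def interval_index(value: float, cuts: Sequence[float]) -> int:
    """Index of the interval that `value` falls into, given sorted cut points."""
    return bisect_right(cuts, value)


# Hypothetical numeric attribute (illustrative only).
ages = [21, 22, 23, 40, 41, 43, 65, 66]
cuts = gap_cut_points(ages, min_gap=5)

# Each numeric value becomes a categorical item usable by a rule miner.
items = [f"age_bin{interval_index(a, cuts)}" for a in ages]
```

With these values the cuts fall at 31.5 and 54.0, giving three intervals, so a numeric condition such as "age is between 40 and 43" turns into the single item `age_bin1`.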

In this project, a novel clustering algorithm is proposed to identify inherent clusters in the data, which guide the discretization of quantitative attributes. A rough set approach is also applied to quantitative association rule mining, as the rough set framework is useful for dimensionality reduction and is applicable to imprecise and imperfect data.
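The way rough sets handle imprecise data can be illustrated with the standard lower and upper approximations of a target set. The Python sketch below is a generic textbook construction; the function names and the small decision table are assumptions for illustration, not the dissertation's implementation.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def equivalence_classes(rows: Dict[Hashable, Dict[str, str]],
                        attributes: Sequence[str]) -> List[Set[Hashable]]:
    """Group objects that are indistinguishable on the chosen attributes."""
    classes = defaultdict(set)
    for obj_id, row in rows.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(rows: Dict[Hashable, Dict[str, str]],
                   attributes: Sequence[str],
                   target: Set[Hashable]) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical table: objects 1 and 2 cannot be told apart.
rows = {
    1: {"size": "small", "weight": "light"},
    2: {"size": "small", "weight": "light"},
    3: {"size": "large", "weight": "heavy"},
    4: {"size": "large", "weight": "light"},
}
target = {1, 3}
lower, upper = approximations(rows, ["size", "weight"], target)
```

Object 3 certainly belongs to the target, so it is in the lower approximation, while objects 1 and 2 are indistinguishable and only one of them is in the target, so both appear only in the upper approximation. The gap between the two sets is what captures the imprecision.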


CONTENTS
CHAPTER 1: INTRODUCTION

CHAPTER 2: LITERATURE REVIEW
2.1 Finding Large Itemsets
2.2 Generating Association Rules

CHAPTER 3: THEORETICAL ANALYSIS
3.1 Procedure
3.1.1 Calculation of Alpha cutoff value
3.1.2 Cluster Formation
3.1.3 Cluster interval table
3.1.4 Graph representation
3.1.5 Find frequent itemsets
3.1.6 Generating association rules
3.2 Algorithm
3.2.1 Initial Clustering algorithm
3.2.2 Secondary Clustering algorithm
3.2.3 Cluster merging algorithm
3.2.4 Formation of boolean matrix table
3.2.5 Formation of metadata table
3.2.6 Frequent item sets generation

CHAPTER 4: EXPERIMENTAL INVESTIGATION
4.1 Example
4.1.1 Distance Table for Attribute A1
4.1.2 Distance Table for Attribute 2
4.1.3 Distance Table for Attribute 3
4.1.4 Formation of boolean matrix
4.1.5 Graph representation
4.1.6 Formation of metadata table
4.1.7 Generation of Association rules
4.2 Dataset Description
4.2.1 Abalone Data Set
4.2.2 Iris Data Set
4.2.3 Blogger Data Set

CHAPTER 5: EXPERIMENTAL RESULTS
5.1 Execution 1: with constant cutoff of 80%
5.2 Execution 2: with constant cutoff of 90%
5.3 Graph Analysis

CHAPTER 6: CONCLUSION

REFERENCES
