
MINING FREQUENT ITEMSETS USING HIGH-SPEED

ALGORITHMS AND FP-TREES

A PROJECT REPORT

Submitted by
ANTONY JAYASEELAN.G
MUTHU KUMARAN.D
PRAVVEEN.G
RAKESH.R

in partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING
IN

COMPUTER SCIENCE AND ENGINEERING

SAVEETHA ENGINEERING COLLEGE,


CHENNAI – 602 105

ANNA UNIVERSITY : CHENNAI 600 025


April 2009
ANNA UNIVERSITY : CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “MINING FREQUENT ITEMSETS USING HIGH-SPEED
ALGORITHMS AND FP-TREES” is the bonafide work of “ANTONY JAYASEELAN.G
(21605104003), MUTHU KUMARAN.D (21605104027), PRAVVEEN.G (21605104035),
RAKESH.R (21605104039)”, who carried out the project work under my supervision.

SIGNATURE
Dr.P.Palaniswamy, M.Tech(IIT-M), Ph.D(IISc)
HEAD OF THE DEPARTMENT
Computer Science & Engineering
Saveetha Engineering College,
Saveetha Nagar,
Thandalam,
Chennai – 602 105

SIGNATURE
Mr.Mohana Prakash.T.A, B.E
SUPERVISOR
LECTURER
Computer Science & Engineering
Saveetha Engineering College,
Saveetha Nagar,
Thandalam,
Chennai – 602 105

INTERNAL EXAMINER                    EXTERNAL EXAMINER

ACKNOWLEDGMENT

We would like to thank Prof.R.Dheenadayalu, B.E, M.Sc (Engg.), Dean (ICT),
for his unwavering support during the entire course of this project work.

We are deeply indebted to our Head of the Department, Dr.P.Palaniswamy,
M.Tech (IIT Madras), Ph.D (IISc), who molded us both technically and morally
for achieving greater success in life.

We express our sincere thanks to Senior Lecturer Mr.Saravanan.R for his
constant encouragement and support throughout our course, especially for the
useful suggestions given during the course of the project period.

We are very grateful to our internal guide, Mr.Mohana Prakash.T.A, Lecturer,
for being instrumental in the completion of our project with his complete
guidance.

We would also like to thank our Project Coordinator Mr.Sridharan.K for his
support during the entire course of this project work.

We also thank all the staff members of our college and technicians for their
help in making this project a successful one.

Finally, we take this opportunity to extend our deep appreciation to our
family and friends for all that they meant to us during the crucial stages of
completing our project.

ABSTRACT
Efficient algorithms for mining frequent itemsets are crucial for mining
association rules as well as for many other data mining tasks. Methods for mining
frequent itemsets have been implemented using a prefix-tree structure, known as an
FP-tree, for storing compressed information about frequent itemsets. Numerous
experimental results have demonstrated that these algorithms perform extremely
well.
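As an illustration of how a prefix tree compresses transactions, the following is a minimal Java sketch, not the code of this project. The names `FpTreeSketch`, `Node`, `build`, `countNodes`, and `sampleDb` are hypothetical, and the sketch assumes that no transaction lists the same item twice. It omits the header table and node links that a full FP-tree uses for mining.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: all names here are assumptions, not the report's code.
class FpTreeSketch {
    // One prefix-tree node: an item, a count, and children keyed by item.
    static final class Node {
        final String item;
        int count;
        final Map<String, Node> children = new HashMap<>();
        Node(String item) { this.item = item; }
    }

    // Pass 1: count how many transactions contain each item
    // (assumes no item is repeated inside a transaction).
    static Map<String, Integer> countItems(List<List<String>> transactions) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> t : transactions) {
            for (String item : t) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Pass 2: drop infrequent items, sort the rest by descending count
    // (ties broken alphabetically), and insert each transaction as a path.
    static Node build(List<List<String>> transactions, int minSupport) {
        Map<String, Integer> counts = countItems(transactions);
        Comparator<String> order = (a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        };
        Node root = new Node(null);
        for (List<String> t : transactions) {
            List<String> kept = new ArrayList<>();
            for (String item : t) {
                if (counts.get(item) >= minSupport) kept.add(item);
            }
            kept.sort(order);
            Node current = root;
            for (String item : kept) {
                // Shared prefixes reuse existing nodes; only counts grow.
                current = current.children.computeIfAbsent(item, Node::new);
                current.count++;
            }
        }
        return root;
    }

    // Number of item nodes in the tree (the root carries no item).
    static int countNodes(Node node) {
        int total = node.item == null ? 0 : 1;
        for (Node child : node.children.values()) total += countNodes(child);
        return total;
    }

    // Hypothetical four-transaction database used only for this example.
    static List<List<String>> sampleDb() {
        return Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("a", "b"),
                Arrays.asList("a", "c"),
                Arrays.asList("b", "d"));
    }

    public static void main(String[] args) {
        Node root = build(sampleDb(), 2);
        System.out.println("item nodes in tree: " + countNodes(root));
    }
}
```

With a minimum support of 2, item `d` is dropped, and the eight remaining item occurrences are stored in five nodes because transactions that begin with `a` share a common path.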

In this paper, we present a novel FP-array technique that greatly reduces the
need to traverse FP-trees, thus obtaining significantly improved performance for
FP-tree-based algorithms. Our technique works especially well for sparse data sets.

Furthermore, we present new algorithms for mining all, maximal, and closed
frequent itemsets. Our algorithms use the FP-tree data structure in combination with
the FP-array technique efficiently and incorporate various optimization techniques.
Even though the algorithms consume much memory when the data sets are sparse,
they are still the fastest ones when the minimum support is low. Moreover, they are
always among the fastest algorithms and consume less memory than other methods
when the data sets are dense.
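To make the distinction between all, maximal, and closed frequent itemsets concrete, the sketch below enumerates them by brute force on a tiny hypothetical database. It deliberately swaps in exhaustive subset enumeration for the FP-tree-based algorithms, so it only illustrates the definitions and is usable only for a handful of items. The names `ItemsetKindsSketch`, `tx`, `frequent`, `select`, `maximal`, `closed`, and `sampleDb` are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: brute force, not the FP-tree-based algorithms.
class ItemsetKindsSketch {
    // Convenience builder for a transaction or an itemset.
    static Set<String> tx(String... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    // Support: number of transactions containing every item of the itemset.
    static int support(Set<String> itemset, List<Set<String>> db) {
        int n = 0;
        for (Set<String> t : db) {
            if (t.containsAll(itemset)) n++;
        }
        return n;
    }

    // All frequent itemsets, found by testing every non-empty subset of items.
    static Map<Set<String>, Integer> frequent(List<Set<String>> db, int minSupport) {
        Set<String> universe = new TreeSet<>();
        for (Set<String> t : db) universe.addAll(t);
        List<String> items = new ArrayList<>(universe);
        Map<Set<String>, Integer> result = new LinkedHashMap<>();
        for (int mask = 1; mask < (1 << items.size()); mask++) {
            Set<String> candidate = new TreeSet<>();
            for (int i = 0; i < items.size(); i++) {
                if ((mask & (1 << i)) != 0) candidate.add(items.get(i));
            }
            int s = support(candidate, db);
            if (s >= minSupport) result.put(candidate, s);
        }
        return result;
    }

    // Keeps x unless a frequent proper superset y disqualifies it.
    // sameSupportOnly = false: any frequent proper superset disqualifies (maximal).
    // sameSupportOnly = true: only one with equal support disqualifies (closed).
    static List<Set<String>> select(Map<Set<String>, Integer> fi, boolean sameSupportOnly) {
        List<Set<String>> out = new ArrayList<>();
        for (Set<String> x : fi.keySet()) {
            boolean disqualified = false;
            for (Set<String> y : fi.keySet()) {
                boolean properSuperset = y.size() > x.size() && y.containsAll(x);
                boolean supportMatches = fi.get(y).equals(fi.get(x));
                if (properSuperset && (!sameSupportOnly || supportMatches)) {
                    disqualified = true;
                    break;
                }
            }
            if (!disqualified) out.add(x);
        }
        return out;
    }

    static List<Set<String>> maximal(Map<Set<String>, Integer> fi) {
        return select(fi, false);
    }

    static List<Set<String>> closed(Map<Set<String>, Integer> fi) {
        return select(fi, true);
    }

    // Hypothetical four-transaction database used only for this example.
    static List<Set<String>> sampleDb() {
        return Arrays.asList(tx("a", "b", "c"), tx("a", "b"), tx("a", "c"), tx("b", "d"));
    }

    public static void main(String[] args) {
        Map<Set<String>, Integer> fi = frequent(sampleDb(), 2);
        System.out.println("frequent: " + fi);
        System.out.println("closed:   " + closed(fi));
        System.out.println("maximal:  " + maximal(fi));
    }
}
```

On this sample with a minimum support of 2 there are five frequent itemsets, four closed ones, and two maximal ones. Every maximal itemset is also closed, and every closed itemset is frequent, which is why the maximal and closed sets are compact summaries of the full result.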

This algorithm can be applied in domains such as banking, insurance, and
departmental stores. We implement this algorithm, adapted especially for a
banking application.

TABLE OF CONTENTS
CHAPTER.NO TITLE PAGE NO
ABSTRACT i
LIST OF FIGURES iii
LIST OF ABBREVIATIONS iv

1. INTRODUCTION 1
2. LITERATURE REVIEW 3
2.1 EXISTING SYSTEM 6
2.2 PROPOSED SYSTEM 13
2.3 PROBLEM FORMULATION
3. SYSTEM REQUIREMENTS 15
3.2 PLATFORM 17
3.2.1 Software Requirements 17
3.2.2 Hardware Requirements 19
4. SYSTEM DESIGN 22
3.3 PROJECT DESCRIPTION 26
3.4 ALGORITHM 32
3.4.1 FP-Growth 32
3.4.2 FP-Max 34
3.4.3 CFI-Tree & FP-Close 36
5. IMPLEMENTATION 39
5.1 CODING 39
5.2 TESTING 42

APPENDICES 52
REFERENCES 64

LIST OF FIGURES

FIGURE NO. TITLE PAGE NO.

2.a RELATION BETWEEN DIFFERENT ITEMSETS 20

2.3.a MODULE INTERFACE DIAGRAM 24

2.3.b DATA FLOW DIAGRAM 26

2.3.c CLASS DIAGRAM 36

2.3.d SEQUENCE DIAGRAM 37

2.3.d ER DIAGRAM 38

2.3.d FP GROWTH 39


LIST OF ABBREVIATIONS

FI Frequent Itemset
MFI Maximal Frequent Itemset
CFI Closed Frequent Itemset
FP Frequent Pattern
FP-MAX Frequent Pattern Maximum
FP-CLOSE Frequent Pattern Closed
J2EE Java 2 Platform, Enterprise Edition
AWT Abstract Window Toolkit
API Application Programming Interface
JDBC Java Database Connectivity
DSN Data Source Name

