Parallel Edge Projection and Pruning (PEPP) Based Sequence Graph Protrude Approach for Closed Itemset Mining
Kalli Srinivasa Nageswara Prasad
Research Scholar in Computer Science
Sri Venkateswara University, Tirupati, Andhra Pradesh, India.
Prof. S. Ramakrishna
Department of Computer Science
Sri Venkateswara University, Tirupati, Andhra Pradesh, India.
Past observations have shown that a frequent item set mining algorithm should mine the closed item sets, since this yields a compact yet complete result set and better efficiency. However, the latest closed item set mining algorithms work under a candidate maintenance-and-test paradigm, which is expensive in both runtime and space usage when the support threshold is low or the item sets get long. Here we present PEPP, an efficient algorithm for mining closed sequences without candidate maintenance. It implements a novel sequence closure checking scheme based on Sequence Graph protruding, using an approach called “Parallel Edge Projection and Pruning”, or PEPP for short. A thorough evaluation on sparse and dense real-life data sets shows that PEPP outperforms older algorithms: it consumes less memory and runs faster than the algorithms frequently cited in the literature.
Key words – Data Mining; Graph Based Mining; Frequent itemset; Closed itemset; Pattern Mining; candidate; Itemset Mining; Sequential Itemset Mining.
INTRODUCTION
Sequential item set mining is an important task with many applications, including market, customer, and web log analysis and item set discovery in protein sequences. Efficient mining techniques have been studied extensively, including general sequential item set mining [1, 2, 3, 4, 5, 6], constraint-based sequential item set mining [7, 8, 9], frequent episode mining, cyclic association rule mining, temporal relation mining, partial periodic pattern mining, and long sequential item set mining. It has recently become widely accepted that, when mining frequent item sets, one should mine the closed ones, since they give a compact yet complete result set and better efficiency [15, 16, 17, 18]. Unlike frequent item set mining, however, few methods exist for mining closed sequential item sets. This is because of the difficulty of the problem; CloSpan is the only such algorithm. Like the frequent closed item set mining algorithms, it follows a candidate maintenance-and-test paradigm: it maintains a set of already mined closed sequence candidates, which is used to prune the search space and to check whether a newly found frequent sequence is closed. Unfortunately, a closed item set mining algorithm under this paradigm scales poorly in the number of frequent closed item sets, because a large number of frequent closed item sets (or just candidates) consumes much memory and leads to a large search space for the closure checking of new item sets. This happens when the support threshold is low or the item sets get long.
Finding a way to mine frequent closed sequences without candidate maintenance seems difficult. Here we present a solution that leads to an algorithm, PEPP, which efficiently mines the complete set of frequent closed sequences through a sequence graph protruding approach.
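To make the notion of a closed sequence and the cost of the candidate maintenance-and-test paradigm concrete, the following is a minimal brute-force sketch. It is neither CloSpan nor PEPP. It assumes a simplified setting in which each sequence is an ordered list of single items, and the names `is_subsequence`, `support`, `frequent_sequences`, and `closed_sequences` are hypothetical helpers introduced only for illustration. A frequent sequence is treated as closed when no frequent super-sequence has the same support.

```python
from typing import Dict, List, Sequence, Tuple

Seq = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if pattern occurs in sequence in order (gaps are allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern: Seq, database: List[Seq]) -> int:
    """Number of database sequences that contain pattern."""
    return sum(is_subsequence(pattern, s) for s in database)


def frequent_sequences(database: List[Seq], min_support: int) -> Dict[Seq, int]:
    """Enumerate all frequent sequences by extending frequent prefixes."""
    items = sorted({x for s in database for x in s})
    frequent: Dict[Seq, int] = {}
    frontier: List[Seq] = [()]
    while frontier:
        next_frontier: List[Seq] = []
        for prefix in frontier:
            for item in items:
                candidate = prefix + (item,)
                sup = support(candidate, database)
                if sup >= min_support:
                    frequent[candidate] = sup
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frequent


def closed_sequences(frequent: Dict[Seq, int]) -> Dict[Seq, int]:
    """Keep only sequences with no frequent super-sequence of equal support.

    Every pattern is tested against the whole maintained set, which is
    the memory and search cost that grows with the number of patterns.
    """
    closed: Dict[Seq, int] = {}
    for p, sup in frequent.items():
        absorbed = any(
            len(q) > len(p) and s == sup and is_subsequence(p, q)
            for q, s in frequent.items()
        )
        if not absorbed:
            closed[p] = sup
    return closed
```

In this sketch the closure test compares each pattern against every stored pattern, so its cost grows roughly quadratically with the number of frequent sequences. This illustrates why maintaining candidates becomes expensive at low support thresholds.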
In PEPP, we do not need to keep track of any previously mined frequent closed sequence for a new pattern’s closure checking. This leads to the proposed Sequence Graph edge pruning technique and other optimization techniques.
Our experiments show the performance of PEPP in finding closed frequent itemsets using a Sequence Graph. The comparative study shows some interesting performance improvements over BIDE and other frequently cited algorithms.
Section II reviews the most frequently cited work and its limitations. Section III explains the dataset adoption and formulation. In section IV, introduction to PEPP and its utilization
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 9, September 2011, p. 74. http://sites.google.com/site/ijcsis/ ISSN 1947-5500