Published by ijcsis on Dec 04, 2010. Copyright: Attribution Non-commercial.
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010
Feed Forward Neural Network Algorithm for Frequent Patterns Mining

Amit Bhagat (1), Department of Computer Applications
Dr. Sanjay Sharma (2), Associate Professor, Department of Computer Applications
Dr. K. R. Pardasani (3), Professor, Department of Mathematics
Maulana Azad National Institute of Technology, Bhopal (M.P.) 462051, India
(1) am.bhagat@gmail.com, (2) ssharma66@rediffmail.com, (3) kamalrajp@rediffmail.com

Abstract: Association rule mining is used to find relationships among items in large data sets. Frequent patterns mining is an important aspect of association rule mining. In this paper, an efficient algorithm named Apriori-Feed Forward (AFF), based on the Apriori algorithm and the feed forward neural network, is presented to mine frequent patterns. The Apriori algorithm scans the database many times to generate frequent itemsets, whereas the AFF algorithm scans the database only once. Computational results show that the AFF algorithm performs much faster than the Apriori algorithm.

Keywords: Association rule mining, dataset scan, frequent itemsets, neural network.
I. INTRODUCTION

Data mining has recently attracted considerable attention from database practitioners and researchers because it has been applied to many fields such as market strategy, financial forecasting and decision support [1]. Many algorithms have been proposed to obtain useful and invaluable information from huge databases [2]. One of the most important problems is mining association rules, which was first introduced in [3, 4]. Association rule mining has many important applications in our lives. An association rule is of the form X => Y, and each rule has two measures: support and confidence. The association rule mining problem is to find rules that satisfy a user-specified minimum support and minimum confidence. It mainly includes two steps: first, find all frequent patterns; second, generate association rules from those frequent patterns.

Many algorithms for mining association rules from transaction databases have been proposed [5, 6, 7] since the Apriori algorithm was first presented. However, most of them are based on the Apriori algorithm, which generates and tests candidate itemsets iteratively. This may scan the database many times, so the computational cost is high. In order to overcome the disadvantages of the Apriori algorithm and efficiently mine association rules without generating candidate itemsets, many authors have developed improved algorithms and obtained promising results [9, 10, 11, 12, 13]. Recently, there has been growing interest in developing techniques for mining association patterns without a support constraint or with variable supports [14, 15, 16]. Association rule mining among rare items is also discussed in [17, 18]. So far, very few papers discuss how to combine the Apriori algorithm and a neural network to mine association rules. In this paper, an efficient algorithm named Apriori-Feed Forward (AFF), based on the Apriori algorithm and the feed forward neural network, is proposed; this algorithm efficiently combines the advantages of the Apriori algorithm and the structure of a neural network. Computational results verify the good performance of the AFF algorithm.

The organization of this paper is as follows. Section II briefly reviews the Apriori method and the feed forward neural network method. Section III proposes the efficient Apriori-Feed Forward (AFF) algorithm based on Apriori and the feed forward structure. Experimental results are presented in Section IV. Section V gives the conclusions.
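The two measures above can be computed directly from a transaction database. A minimal Python sketch (the item names and transactions below are invented purely for illustration):

```python
# Support and confidence of an association rule X => Y,
# computed over a small, invented transaction database.

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Conditional frequency of Y given X: support(X ∪ Y) / support(X)."""
    return support(x | y, transactions) / support(x, transactions)

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support({"bread", "milk"}, transactions))       # 2 of 4 transactions -> 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # 0.5 / 0.75
```

A rule X => Y is reported only when support(X ∪ Y) and confidence(X, Y) both clear the user-specified minimums.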
 
II. CLASSICAL MINING ALGORITHM AND NEURAL NETWORK
A. Apriori Algorithm

In [4], Agrawal first proposed an algorithm called Apriori for the problem of mining association rules. The Apriori algorithm is a bottom-up, breadth-first approach: the frequent itemsets are extended one item at a time. Its main idea is to generate the k-th candidate itemsets from the (k-1)-th frequent itemsets and to find the k-th frequent itemsets from the k-th candidate itemsets. The algorithm terminates when the frequent itemsets cannot be extended any further. However, it has to generate a large number of candidate itemsets, and it scans the data set as many times as the length of the longest frequent itemset. The Apriori algorithm can be written in pseudocode as follows.

201 | http://sites.google.com/site/ijcsis/ | ISSN 1947-5500

Procedure Apriori
Input: data set D, minimum support minsup
Output: frequent itemsets L
(1) L1 = find_frequent_1_itemsets(D);
(2) for (k = 2; Lk-1 ≠ ∅; k++)
(3) {
(4)     Ck = apriori_gen(Lk-1, minsup);
(5)     for each transaction t ∈ D
(6)     {
(7)         Ct = subset(Ck, t);
(8)         for each candidate c ∈ Ct
(9)             c.count++;
(10)    }
(11)    Lk = {c ∈ Ck | c.count ≥ minsup};
(12) }
(13) return L = L1 ∪ L2 ∪ ... ∪ Ln;

In the above pseudocode, Ck denotes the k-th candidate itemsets and Lk denotes the k-th frequent itemsets.
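The pseudocode above can be rendered as a short runnable sketch. This is an illustrative Python version, not the authors' implementation; apriori_gen is realized here as a join of (k-1)-itemsets followed by the standard subset-pruning step, and the transaction data is invented:

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return all frequent itemsets (as frozensets) with absolute count >= minsup."""
    # L1: frequent 1-itemsets
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    Lk = {s for s, c in counts.items() if c >= minsup}
    result = set(Lk)
    k = 2
    while Lk:
        # apriori_gen: join frequent (k-1)-itemsets into size-k candidates,
        # then prune any candidate with an infrequent (k-1)-subset
        Ck = {a | b for a in Lk for b in Lk if len(a | b) == k}
        Ck = {c for c in Ck
              if all(frozenset(s) in Lk for s in combinations(c, k - 1))}
        # one pass over the data counts every surviving candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in Ck}
        Lk = {c for c, n in counts.items() if n >= minsup}
        result |= Lk
        k += 1
    return result

transactions = [{"I1", "I2", "I3"}, {"I1", "I3", "I4"}, {"I2", "I4"},
                {"I1", "I2"}, {"I1", "I2", "I3", "I5"}]
print(sorted(sorted(s) for s in apriori(transactions, minsup=2)))
```

Note that the counting pass over every transaction is repeated once per value of k, which is exactly the repeated database scanning that the AFF algorithm of Section III is designed to avoid.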
B. Neural Network

A neural network [19, 20] is a parallel processing network built by simulating the intuitive, image-based thinking of humans, on the basis of research into biological neural networks: the features of biological neurons and networks are simplified, summarized and refined. It uses the idea of nonlinear mapping, the method of parallel processing and the structure of the network itself to express the associated knowledge of input and output. Initially, the prospects of applying neural networks in data mining were not considered promising, mainly because neural networks suffer from complex structure, poor interpretability and long training times. But their advantages, such as high tolerance of noisy data and low error rates, together with the continuous advancement and optimization of network training algorithms, especially network pruning algorithms and rule extraction algorithms, have made the application of neural networks in data mining increasingly favored by the majority of users.
C. Neural Network Method in Data Mining

There are seven common methods and techniques of data mining [21, 22, 23]: statistical analysis, rough sets, covering positive and rejecting inverse cases, formula discovery, fuzzy methods, neural networks, as well as visualization technology. Here, we focus on the neural network method. The neural network method is used for classification, clustering, feature mining, prediction and pattern recognition. It imitates the neuron structure of animals and is based on the M-P model and the Hebb learning rule, so in essence it is a distributed matrix structure. By training on the data, the neural network method gradually calculates (through repeated iteration or cumulative calculation) the connection weights of the network. Neural network models can be broadly divided into the following three types:

(1) Feed-forward networks: represented by the perceptron, the back-propagation model and the function network; mainly used in areas such as prediction and pattern recognition.
(2) Feedback networks: represented by the Hopfield discrete and continuous models; mainly used for associative memory and optimization calculation.
(3) Self-organizing networks: represented by the adaptive resonance theory (ART) model and the Kohonen model; mainly used for cluster analysis.
D. Feedforward Neural Network

Feedforward neural networks (FF networks) are the most popular and most widely used models in many practical applications. They are known by many different names, such as "multi-layer perceptrons."

Figure 2(a) illustrates a one-hidden-layer FF network with inputs x1, ..., xn and one output. Each arrow in the figure symbolizes a parameter in the network. The network is divided into layers. The input layer consists of just the inputs to the network. It is followed by a hidden layer, which consists of any number of neurons, or hidden units, placed in parallel. Each neuron performs a weighted summation of the inputs, which is then passed through a nonlinear activation function, also called the neuron function.

Figure 2(a): A feedforward network with one hidden layer and one output.

Mathematically, the functionality of a hidden neuron is described by

    b_j = σ( Σ_{i=1..n} w_{j,i} x_i + w_{j,0} )

where the weights {w_{j,i}, w_{j,0}} are symbolized by the arrows feeding into the neuron. The network output is formed by another weighted summation of the outputs of the neurons in the hidden layer. This summation on the output is called the output layer. In Figure 2(a) there is only one output in the output layer, since it is a single-output problem. Generally, the number of output neurons equals the number of outputs of the approximation problem. The neurons in the hidden layer of the network in Figure 2(a) are similar in structure to those of the perceptron, with the exception that their activation functions can be any differentiable function. The output of this network is given by

    g(θ, x) = Σ_{j=1..nh} v_j σ( Σ_{i=1..n} w_{j,i} x_i + w_{j,0} ) + v_0

where n is the number of inputs and nh is the number of neurons in the hidden layer. The variables {w_{j,i}, w_{j,0}, v_j, v_0} are the parameters of the network model, represented collectively by the parameter vector θ. In general, the neural network model will be represented by the compact notation g(θ, x) whenever the exact structure of the network is not necessary in the context of a discussion. Note that the sizes of the input and output layers are defined by the numbers of inputs and outputs of the network; therefore, only the number of hidden neurons has to be specified when the network is defined. The network in Figure 2(a) is sometimes referred to as a three-layer network, counting the input, hidden and output layers. However, since no processing takes place in the input layer, it is also sometimes called a two-layer network. To avoid confusion, this network is called a one-hidden-layer FF network throughout this paper.

In training the network, its parameters are adjusted incrementally until the training data satisfy the desired mapping as well as possible; that is, until g(θ, x) matches the desired output y as closely as possible, up to a maximum number of iterations. The nonlinear activation function in the neuron is usually chosen to be a smooth step function. The default is the standard sigmoid σ(x) = 1 / (1 + e^(-x)), shown in Figure 2(b).
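The two summations above translate directly into code. A minimal sketch of the forward pass of a one-hidden-layer FF network (the weights below are arbitrary illustrative values, not trained parameters):

```python
import math

def sigmoid(x):
    """Standard sigmoid activation: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def ff_one_hidden(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer feedforward network.

    Each hidden neuron computes a weighted sum of the inputs plus a bias
    and passes it through the sigmoid; the output layer then forms another
    weighted sum of the hidden activations.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# two inputs, three hidden neurons, one output (arbitrary weights)
W_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [1.0, -1.0, 0.5]
b_out = 0.2
print(ff_one_hidden([1.0, 2.0], W_hidden, b_hidden, w_out, b_out))
```

Only the number of hidden neurons (here three) is a free structural choice; the sizes of the input and output layers are fixed by the problem, as noted above.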
 
Figure 2(b): The standard sigmoid activation function.

III. APRIORI-FEEDFORWARD ALGORITHM (AFF)

In this section, a new algorithm based on Apriori and the feedforward neural network structure is presented, which is called the Apriori-Feedforward (AFF) algorithm. Fig. 3(a) shows the database structure, in which there are different Item IDs and the sets of items purchased against each Item ID.

Fig 3(a): Item ID and item list of the database

Item_ID | Items
001 | I1, I2, I3
002 | I1, I3, I4
003 | I2, I4
004 | I1, I2
005 | I1, I2, I3, I5
... | ...

Fig 3(b): Data structure of nodes in the feedforward neural network

Node | Count
I1 | f1
I2 | f2
I3 | f3
I1I2 | f12
I1I3 | f13

The Apriori-Feedforward algorithm mainly includes two steps. First, a neural network model is prepared according to the maximum number of items present in the data set. Then the first transaction of the data set is scanned to find the frequent 1-itemsets, after which the neural network is updated for frequent 2-itemsets, frequent 3-itemsets, and so on. The data set is scanned only once to build all frequent combinations of items. While updating the frequent 2-itemsets, frequent 3-itemsets, ..., pruning is done at the same time to avoid redundancy of itemsets. At last, the built neural network is mined by the Apriori-FeedForward algorithm. The detailed Apriori-FeedForward algorithm is as follows.

Procedure: Create_Model
Input: data set D, minimum support minsup
Output:
(1) procedure Create_Model(n)
(2) for (i = 1; i ≠ ...; i++)
(3)     for each itemset l1 ∈ Lk-1
(4)         for each itemset l2 ∈ Lk-1
(5)             if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ (l1[3] = l2[3]) ∧ ... ∧ (l1[n] = l2[n])
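The single-scan idea above can be sketched in Python. This is an illustrative reading of the approach, not the authors' exact AFF data structure: every itemset combination encountered in a transaction gets a count node, playing the role of the network nodes f1, f2, f12, ... shown in Fig 3(b):

```python
from itertools import combinations

def single_scan_counts(transactions, max_size):
    """One pass over the data: every itemset of size <= max_size occurring
    in some transaction gets (or updates) a count node, so the database
    never needs to be rescanned."""
    nodes = {}  # itemset -> frequency count, like the f1, f12, ... nodes
    for t in transactions:
        for k in range(1, min(max_size, len(t)) + 1):
            for combo in combinations(sorted(t), k):
                key = frozenset(combo)
                nodes[key] = nodes.get(key, 0) + 1
    return nodes

# the transactions of Fig 3(a)
transactions = [{"I1", "I2", "I3"}, {"I1", "I3", "I4"}, {"I2", "I4"},
                {"I1", "I2"}, {"I1", "I2", "I3", "I5"}]
nodes = single_scan_counts(transactions, max_size=2)
print(nodes[frozenset({"I1", "I2"})])  # I1 and I2 co-occur in 3 transactions
```

The trade-off is memory: the number of count nodes grows combinatorially with transaction width, which is why the algorithm prunes redundant itemsets while updating rather than materializing every combination.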