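(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010
http://sites.google.com/site/ijcsis/ ISSN 1947-5500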

**Feed Forward Neural Network Algorithm for Frequent Patterns Mining**

Amit Bhagat¹
Department of Computer Applications

Dr. Sanjay Sharma²
Associate Prof., Deptt. of Computer Applications

Dr. K. R. Pardasani³
Professor, Deptt. of Mathematics

Maulana Azad National Institute of Technology, Bhopal (M.P.) 462051, India

¹am.bhagat@gmail.com, ²ssharma66@rediffmail.com, ³kamalrajp@rediffmail.com

Abstract: Association rule mining is used to find relationships among items in large data sets. Frequent pattern mining is an important aspect of association rule mining. In this paper, an efficient algorithm named Apriori-Feed Forward (AFF), based on the Apriori algorithm and the feed-forward neural network, is presented to mine frequent patterns. The Apriori algorithm scans the database many times to generate frequent itemsets, whereas the AFF algorithm scans the database only once. Computational results show that the AFF algorithm performs much faster than the Apriori algorithm.

Keywords: Association rule mining, dataset scan, frequent itemsets, neural network.

I. INTRODUCTION

Data mining has recently attracted considerable attention from database practitioners and researchers because it has been applied to many fields, such as market strategy, financial forecasting and decision support [1]. Many algorithms have been proposed to obtain useful and invaluable information from huge databases [2]. One of the most important tasks is mining association rules, first introduced in [3, 4]. Association rule mining has many important applications in everyday life.

An association rule has the form X => Y, and each rule has two measures: support and confidence. The association rule mining problem is to find the rules that satisfy a user-specified minimum support and minimum confidence. It mainly includes two steps: first, find all frequent patterns; second, generate association rules from the frequent patterns.

Many algorithms for mining association rules from transaction databases have been proposed [5, 6, 7] since the Apriori algorithm was first presented. However, most of them are based on the Apriori algorithm, which generates and tests candidate itemsets iteratively. This may require scanning the database many times, so the computational cost is high. To overcome these disadvantages and mine association rules efficiently without generating candidate itemsets, many authors have developed improved algorithms and obtained promising results [9, 10, 11, 12, 13]. Recently, there has been growing interest in techniques for mining association patterns without a support constraint or with variable supports [14, 15, 16]. Association rule mining among rare items is also discussed in [17, 18]. So far, very few papers discuss how to combine the Apriori algorithm with a neural network to mine association rules.

In this paper, an efficient algorithm named Apriori-Feed Forward (AFF), based on the Apriori algorithm and the feed-forward neural network, is proposed. The algorithm combines the advantages of the Apriori algorithm with the structure of a neural network. Computational results verify the good performance of the AFF algorithm.

The paper is organized as follows. Section II briefly reviews the Apriori method and the feed-forward neural network. Section III proposes the Apriori-Feed Forward (AFF) algorithm, which is based on Apriori and the feed-forward structure. Experimental results are presented in Section IV. Section V gives the conclusions.
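The two rule measures named in the introduction can be made concrete with a small example. The sketch below assumes the standard definitions, which the paper does not spell out: the support of X => Y is the fraction of transactions that contain every item of X and Y, and the confidence is the number of transactions containing both X and Y divided by the number containing X. The function names and the five sample transactions are illustrative choices, not taken from the authors' implementation.

```python
def count_containing(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def support(antecedent, consequent, transactions):
    """Fraction of all transactions containing both sides of the rule."""
    both = set(antecedent) | set(consequent)
    return count_containing(both, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions containing the antecedent, the share that also
    contain the consequent."""
    both = set(antecedent) | set(consequent)
    return (count_containing(both, transactions)
            / count_containing(antecedent, transactions))


# Illustrative sample data (hypothetical, five market-basket transactions).
transactions = [
    ["I1", "I2", "I3"],
    ["I1", "I3", "I4"],
    ["I2", "I4"],
    ["I1", "I2"],
    ["I1", "I2", "I3", "I5"],
]

rule_support = support(["I1"], ["I2"], transactions)        # 3 of 5 transactions
rule_confidence = confidence(["I1"], ["I2"], transactions)  # 3 of the 4 with I1
```

A rule is reported only if both values reach the user-specified minimums, which is why the expensive part of the problem is counting how often each itemset occurs.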

II. CLASSICAL MINING ALGORITHM AND NEURAL NETWORK

A. Apriori Algorithm

In [4], Agrawal first proposed an algorithm called Apriori for the problem of mining association rules. Apriori is a bottom-up, breadth-first approach in which the frequent itemsets are extended one item at a time. Its main idea is to generate the k-th candidate itemsets from the (k-1)-th frequent itemsets and to find the k-th frequent itemsets among the k-th candidate itemsets. The algorithm terminates when the frequent itemsets cannot be extended any further. However, it has to generate a large number of candidate itemsets, and it scans the data set as many times as the length of the longest frequent itemset. The Apriori algorithm can be written in pseudocode as follows.

Procedure Apriori
Input: data set D, minimum support minsup
Output: frequent itemsets L

    L1 = find_frequent_1_itemsets(D);
    for (k = 2; Lk−1 ≠ ∅; k++)
    {
        Ck = Apriori_gen(Lk−1, minsup);
        for each transaction t ∈ D
        {
            Ct = subset(Ck, t);
            for each candidate c ∈ Ct
                c.count++;
        }
        Lk = {c ∈ Ck | c.count ≥ minsup};
    }
    return L = {L1 ∪ L2 ∪ ... ∪ Ln};

In the above pseudocode, Ck denotes the k-th candidate itemsets and Lk the k-th frequent itemsets.

B. Neural Network

A neural network [19, 20] is a parallel processing network modeled on the intuitive, image-based thinking of humans. It builds on research into biological neural networks, simplifying, summarizing and refining the features of biological neurons and their networks. It uses non-linear mapping, parallel processing and the structure of the network itself to express the associated knowledge between input and output. Initially, the prospects for neural networks in data mining were not considered promising, mainly because of their complex structure, poor interpretability and long training times. However, their advantages, such as high tolerance of noisy data and a low error rate, together with the continuing improvement of network training algorithms, and especially of network pruning and rule extraction algorithms, have made neural networks increasingly favored by users in data mining.
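The Apriori pseudocode in Section II.A can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the minimum support is assumed to be an absolute transaction count (`min_count`), itemsets are represented as `frozenset` objects, and the join and prune steps stand in for the `Apriori_gen` procedure, whose details the paper does not give. The sample transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for all itemsets occurring in at least
    `min_count` transactions, using level-wise candidate generation."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items with one pass over the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s for s, c in counts.items() if c >= min_count}
    frequent = {s: counts[s] for s in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Support counting: one more full pass over the data per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c for c, n in counts.items() if n >= min_count}
        frequent.update({c: counts[c] for c in current})
        k += 1
    return frequent


# Hypothetical sample data: five market-basket transactions.
transactions = [
    ["I1", "I2", "I3"],
    ["I1", "I3", "I4"],
    ["I2", "I4"],
    ["I1", "I2"],
    ["I1", "I2", "I3", "I5"],
]
result = apriori(transactions, min_count=3)
```

The loop over `transactions` inside the `while` loop is the repeated database scan that the paper identifies as the main cost of Apriori.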
C. Neural Network Method in Data Mining

There are seven common methods and techniques of data mining [21, 22, 23]: statistical analysis, rough sets, covering positive and rejecting inverse cases, formula discovery, fuzzy methods, visualization technology, and neural networks. Here, we focus on the neural network method, which is used for classification, clustering, feature mining, prediction and pattern recognition. It imitates the neuron structure of animals and is based on the M-P model and the Hebb learning rule, so in essence it is a distributed matrix structure. By training on the data being mined, the neural network method gradually calculates (through repeated iteration or cumulative calculation) the weights of the connections in the network. Neural network models can be broadly divided into the following three types:

(1) Feed-forward networks: represented by the perceptron, the back-propagation model and the function network; mainly used in areas such as prediction and pattern recognition.
(2) Feedback networks: represented by the Hopfield discrete and continuous models; mainly used for associative memory and optimization calculation.
(3) Self-organizing networks: represented by the adaptive resonance theory (ART) model and the Kohonen model; mainly used for cluster analysis.

D. Feedforward Neural Network

Feedforward neural networks (FF networks) are the most popular and most widely used models in many practical applications. They are known by many different names, such as "multi-layer perceptrons." Figure 2(a) illustrates a one-hidden-layer FF network with inputs $x_1, \ldots, x_n$ and output $\hat{y}$. Each arrow in the figure symbolizes a parameter in the network. The network is divided into layers. The input layer consists of just the inputs to the network. Then follows a hidden layer, which consists of any number of neurons, or hidden units, placed in parallel. Each neuron performs a weighted summation of the inputs, which is then passed through a nonlinear activation function $\sigma$, also called the neuron function.
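The hidden-layer computation just described, a weighted sum of the inputs passed through an activation function, can be sketched as below. The sketch assumes the standard sigmoid as the activation and adds a bias term to each neuron; the function names, weights and inputs are illustrative values only, not parameters from the paper.

```python
import math


def sigmoid(z):
    """Standard logistic sigmoid, used here as the activation (neuron) function."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(x, weights, biases):
    """Return the outputs of all hidden neurons for the input vector x.

    weights[i][j] is the (hypothetical) weight on the arrow from input j
    to hidden neuron i; biases[i] is the bias of hidden neuron i.
    """
    return [
        sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
        for w_i, b_i in zip(weights, biases)
    ]


# Two inputs feeding three hidden neurons placed in parallel.
x = [1.0, 2.0]
weights = [[0.0, 0.0], [1.0, -0.5], [2.0, 1.0]]
biases = [0.0, 0.0, -1.0]
outputs = hidden_layer(x, weights, biases)
```

Each entry of `outputs` lies strictly between 0 and 1; a neuron whose weighted sum is zero outputs exactly 0.5.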

Figure 2(a): A feedforward network with one hidden layer and one output.

Mathematically, the functionality of a hidden neuron is described by

$$\sigma\left(\sum_{j=1}^{n} w_j x_j + b\right)$$

where the weights $\{w_j, b\}$ are symbolized by the arrows feeding into the neuron. The network output is formed by another weighted summation of the outputs of the neurons in the hidden layer. This summation on the output is called the output layer. In Figure 2(a) there is only one output in the output layer since it is a single-output problem. Generally, the number of output neurons equals the number of outputs of the approximation problem. The neurons in the hidden layer of the network in Figure 2(a) are similar in structure to those of the perceptron, with the exception that their activation functions can be any differentiable function. The output of this network is given by

$$\hat{y}(\theta) = g(\theta, x) = \sum_{i=1}^{n_h} w_i^{2}\, \sigma\left(\sum_{j=1}^{n} w_{i,j}^{1} x_j + b_i^{1}\right) + b^{2}$$

Figure 2(b)

where $n$ is the number of inputs and $n_h$ is the number of neurons in the hidden layer. The variables $\{w_{i,j}^{1}, b_i^{1}, w_i^{2}, b^{2}\}$ are the parameters of the network model, represented collectively by the parameter vector $\theta$. In general, the neural network model will be represented by the compact notation $g(\theta, x)$ whenever the exact structure of the neural network is not necessary in the context of a discussion.

Note that the sizes of the input and output layers are defined by the number of inputs and outputs of the network; therefore, only the number of hidden neurons has to be specified when the network is defined. The network in Figure 2(a) is sometimes referred to as a three-layer network, counting the input, hidden and output layers. However, since no processing takes place in the input layer, it is also sometimes called a two-layer network. To avoid confusion, this network is called a one-hidden-layer FF network throughout this paper.

In training the network, its parameters are adjusted incrementally until the training data satisfy the desired mapping as well as possible; that is, until $\hat{y}(\theta)$ matches the desired output $y$ as closely as possible, up to a maximum number of iterations. The nonlinear activation function in the neuron is usually chosen to be a smooth step function. The default is the standard sigmoid, $\sigma(x) = \frac{1}{1 + e^{-x}}$.

III. APRIORI-FEEDFORWARD ALGORITHM (AFF)

In this section, a new algorithm based on Apriori and the feedforward neural network structure is presented, called the Apriori-Feedforward algorithm (AFF). Fig. 3(a) shows the database structure, in which each item ID is associated with the set of items purchased under it.

| Item_ID | Items |
|---|---|
| 001 | I1, I2, I3 |
| 002 | I1, I3, I4 |
| 003 | I2, I4 |
| 004 | I1, I2 |
| 005 | I1, I2, I3, I5 |
| ... | ... |

Fig 3(a): Item ID and item list of the database

[Fig. 3(b) diagram: first-layer nodes I1, I2, I3 with frequency counters f1, f2, f3; second-layer nodes I1I2, I1I3 with frequency counters f12, f13]

Fig 3(b): Data structure of nodes in the feedforward neural network

The Apriori-Feedforward algorithm mainly includes two steps. First, a neural network model is prepared according to the maximum number of items present in the data set. Then the first transaction of the data set is scanned to find the frequent 1-itemsets, and the neural network is updated for the frequent 2-itemsets, frequent 3-itemsets and so on. The data set is scanned only once to build all frequent item combinations in the data set. While the frequent 2-itemsets, frequent 3-itemsets, … are updated, pruning is done at the same time to avoid redundant itemsets. Finally, the built neural network is mined by the Apriori-FeedForward algorithm. The detailed Apriori-FeedForward algorithm is as follows.

Procedure Create_Model
Input: data set D, minimum support minsup
Output:

    procedure Create_Model(n)
    for (i = 1; i ≠ ; i++)
        for each itemset l1 ∈ lk-1
            for each itemset l2 ∈ lk-1
                if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ (l1[3] = l2[3]) ∧ … ∧ (l1[n] = l2[n])
                then
                    C = l1 x l2
                    if already_generated(l1 x l2) then
                        delete C
                    else add C to Ck
    FNN(Ck)

Procedure FNN(Ck)
Input: itemset model Ck
Output: frequent itemsets L

    procedure FNN(Ck)
    n = recordcount(Dataset)
    for (i = 1; i < n; i++)
    {
        L1 = get_first_transaction(Dataset)
        upf = update_frequency(l1 x l2)
        if (upf >= min_sup)
            print(l1 x l2)
    }

In this algorithm, a complete feed-forward neural network is prepared according to the maximum number of items present in the data set. The first layer of the network holds the frequent 1-itemsets, the second layer the frequent 2-itemsets, and so on, until a final layer is prepared that is a single node comprising all items present in the data set. Each layer n+1 is formed by combining the itemsets of layer n with the other items present at that layer; these layers are generated by calculating the factorial of n+1 items.
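One possible reading of the Create_Model and FNN procedures is sketched below. It is an interpretation under stated assumptions, not the authors' code: the layered model is taken to hold one counter node per item combination (layer k holds all k-itemsets), the single scan increments every node contained in each transaction, and the minimum support is an absolute count (`min_count`). The pruning step mentioned in the text is omitted, and all function names and sample data are hypothetical.

```python
from itertools import combinations


def create_model(items):
    """Prepare one layer per itemset size; every node starts with frequency 0."""
    items = sorted(items)
    return [
        {frozenset(combo): 0 for combo in combinations(items, k)}
        for k in range(1, len(items) + 1)
    ]


def scan_once(model, transactions):
    """Single pass over the data: each transaction updates every node it contains."""
    for transaction in transactions:
        present = sorted(set(transaction))
        for k in range(1, len(present) + 1):
            layer = model[k - 1]
            for combo in combinations(present, k):
                layer[frozenset(combo)] += 1
    return model


def frequent_itemsets(model, min_count):
    """Read off the nodes whose frequency reaches the minimum support count."""
    return {
        itemset: count
        for layer in model
        for itemset, count in layer.items()
        if count >= min_count
    }


# Hypothetical sample data: five market-basket transactions.
transactions = [
    ["I1", "I2", "I3"],
    ["I1", "I3", "I4"],
    ["I2", "I4"],
    ["I1", "I2"],
    ["I1", "I2", "I3", "I5"],
]
items = {item for t in transactions for item in t}
model = scan_once(create_model(items), transactions)
frequent = frequent_itemsets(model, min_count=3)
```

In this sketch, layer k of a model over n items has C(n, k) nodes, so the whole model has 2^n − 1 nodes (4095 for 12 items). The single scan is therefore paid for with memory and per-transaction work that grow exponentially in the number of distinct items, which stays manageable only for small item counts.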

IV. EXPERIMENTAL RESULTS

The test data set consists of frequently purchased items of a supermarket. It contains 7 to 12 different items and 10000 to 50000 records. To verify the performance of the Apriori-FeedForward algorithm, we compare Apriori-FeedForward with Apriori. The algorithms were run on a computer with an i7 processor at 1.60 GHz and 4 GB of memory. The program was developed in NetBeans 6.8. The computational results of the two algorithms are reported in Table 1. A clearer comparison of the two algorithms is given in Fig. 4(a).

Table 1. The running time of the two algorithms (Apriori and Apriori-FeedForward)

| Min. Supp | Apriori | AFF |
|---|---|---|
| 30% | 15000 ms | 1762 ms |
| 25% | 15000 ms | 1545 ms |
| 20% | 15000 ms | 1529 ms |
| 15% | 19000 ms | 1682 ms |
| 10% | 20000 ms | 1634 ms |
| 5% | 30000 ms | 1625 ms |

Fig. 4(a).

Fig. 4(b).

From Fig. 4(a) and Fig. 4(b), we can make the following two statements. First, the Apriori-FeedForward algorithm works much faster than Apriori: in Table 1 its running time is roughly 8.5 to 18.5 times shorter. It uses a different method, FNN, to calculate the support of candidate itemsets. Second, it consumes less memory than Apriori because it does not need to traverse the database again and again; it needs only a single scan of the database.

V. CONCLUSION AND FUTURE WORK

In this paper, we have proposed the Apriori-FeedForward algorithm. The method builds a feed-forward neural network model and scans the database only once to generate frequent patterns. Future work is to further improve the Apriori-FeedForward algorithm and to test it on more and larger data sets.

REFERENCES

[1] M.S. Chen, J. Han, P.S. Yu, “Data mining: an overview from a database perspective”, IEEE Transactions on Knowledge and Data Engineering, 1996, 8, pp. 866-883.
[2] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publisher, San Francisco, CA, USA, 2001.
[3] R. Agrawal, T. Imielinski and A. Swami, “Mining association rules between sets of items in large databases”, in: Proceedings of the Association for Computing Machinery, ACM-SIGMOD, 1993, 5, pp. 207-216.
[4] R. Agrawal, R. Srikant, “Fast algorithms for mining association rules”, Proceedings of the 20th Very Large DataBases Conference (VLDB’94), Santiago de Chile, Chile, 1994, pp. 487-499.
[5] R. Agrawal, R. Srikant, Q. Vu, “Mining association rules with item constraints”, in: The Third International Conference on Knowledge Discovery in Databases and Data Mining, Newport Beach, California, 1997, pp. 67-73.
[6] J. Han, Y. Fu, “Discovery of multiple-level association rules from large database”, in: The Twenty-first International Conference on Very Large Data Bases, Zurich, Switzerland, 1995, pp. 420-431.
[7] T. Fukuda, Y. Morimoto, S. Morishita, T. Tokuyama, “Mining optimized association rules for numeric attributes”, in: The ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1996, pp. 182-191.
[8] J.S. Park, M.S. Chen, P.S. Yu, “Using a hash-based method with transaction trimming for mining association rules”, IEEE Transactions on Knowledge and Data Engineering, 1997, 9(5), pp. 812-825.
[9] J. Han, J. Pei and Y. Yin, “Mining frequent patterns without candidate generation”, in: Proceedings of ACM SIGMOD International Conference Management of Data, 2000, pp. 1-12.
[10] J. Han, J. Wang, Y. Lu and P. Tzvetkov, “Mining top-k frequent closed patterns without minimum support”, in: Proceedings of International Conference Data Mining, 2002, 12, pp. 211-218.
[11] G. Liu, H. Lu, J.X. Yu, W. Wei and X. Xiao, “AFOPT: An efficient implementation of pattern growth approach”, in: IEEE ICDM Workshop Frequent Itemset Mining Implementations, CEUR Workshop Proc., 2003, 80.
[12] J. Wang, J. Han and J. Pei, “CLOSET+: searching for the best strategies for mining frequent closed itemsets”, in: Proceedings of International Conference, Knowledge Discovery and Data Mining, 2003, 8, pp. 236-245.
[13] Tzung-Pei Hong, Chun-Wei Lin, Yu-Lung Wu, “Incrementally fast updated frequent pattern trees”, Expert Systems with Applications, 2008, 34, pp. 2424-2435.
[14] K. Wang, Y. He, D. Cheung, Y. Chin, “Mining confident rules without support requirement”, in: Proceedings of ACM International Conference on Information and Knowledge Management, CIKM, 2001, pp. 89-96.
[15] H. Xiong, P. Tan, V. Kumar, “Mining strong affinity association patterns in data sets with skewed support distribution”, in: Proceedings of the Third IEEE International Conference on Data Mining, ICDM, 2003, pp. 387-394.
[16] Ya-Han Hu, Yen-Liang Chen, “Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism”, Decision Support Systems, 2006, 42, pp. 1-24.
[17] J. Ding, “Efficient association rule mining among infrequent items”, Ph.D. Thesis, University of Illinois at Chicago, 2005.
[18] Ling Zhou, Stephen Yau, “Efficient association rule mining among both frequent and infrequent items”, Computers and Mathematics with Applications, 2007, 54, pp. 737-749.
[19] J.A. Anderson, Introduction to Neural Networks, MIT Press, Cambridge, MA, 1995.
[20] M.M. Van Hulle, Faithful Representations and Topographic Maps: From Distortion- to Information-Based Self-Organization, Wiley, New York, 2000.
[21] L. Cristofor, D. Simovici, “Generating an informative cover for association rules”, in: Proceedings of the IEEE International Conference on Data Mining, 2002.
[22] Y. Yuan, T. Huang, “A matrix algorithm for mining association rules”, Lecture Notes in Computer Science, Volume 3644, Sep 2005, pp. 370-379.
[23] Sotiris Kotsiantis, Dimitris Kanellopoulos, “Association rules mining: a recent overview”, GESTS International Transactions on Computer Science and Engineering, Vol. 32 (1), 2006, pp. 71-82.
