(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010
find the k-th frequent itemsets from the k-th candidate itemsets. The algorithm terminates when the frequent itemsets cannot be extended any further. However, it has to generate a large number of candidate itemsets, and it scans the data set as many times as the length of the longest frequent itemset. The Apriori algorithm can be written in pseudocode as follows.

Procedure Apriori
Input: data set D, minimum support minsup
Output: frequent itemsets L

    L_1 = find_frequent_1_itemsets(D);
    for (k = 2; L_{k-1} ≠ ∅; k++)
    {
        C_k = Apriori_gen(L_{k-1}, minsup);
        for each transaction t ∈ D
        {
            C_t = subset(C_k, t);
            for each candidate c ∈ C_t
                c.count++;
        }
        L_k = {c ∈ C_k | c.count > minsup};
    }
    return L = L_1 ∪ L_2 ∪ ... ∪ L_n;

In the above pseudocode, C_k denotes the k-th candidate itemsets (the candidates of size k) and L_k denotes the k-th frequent itemsets (the frequent itemsets of size k).
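As a minimal sketch of the procedure above, the following Python code mirrors the pseudocode step by step. It rests on several assumptions that are not in the original: minsup is treated as an absolute transaction count, the strict `count > minsup` test is kept from the pseudocode (many presentations use `>=`), and subset(C_k, t) is replaced by a direct containment check. The helper signatures are illustrative choices, not the authors' implementation.

```python
from itertools import combinations


def find_frequent_1_itemsets(transactions, minsup):
    """Count single items and keep those whose count exceeds minsup."""
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    return {c: n for c, n in counts.items() if n > minsup}


def apriori_gen(prev_frequent, k):
    """Join frequent (k-1)-itemsets into k-candidates and prune every
    candidate that has an infrequent (k-1)-subset."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in prev for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def apriori(transactions, minsup):
    """Return {itemset: count} for every itemset with count > minsup."""
    transactions = [frozenset(t) for t in transactions]
    level = find_frequent_1_itemsets(transactions, minsup)  # L_1
    frequent = dict(level)
    k = 2
    while level:  # continue while L_{k-1} is non-empty
        counts = {c: 0 for c in apriori_gen(level, k)}  # C_k
        for t in transactions:  # one scan of the data set per level
            for c in counts:
                if c <= t:  # candidate is contained in the transaction
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n > minsup}  # L_k
        frequent.update(level)
        k += 1
    return frequent
```

The `while` loop makes the cost noted in the text visible: every level k triggers one full pass over the transactions, so the number of scans grows with the length of the longest frequent itemset.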
B. Neural Network
A neural network [19,20] is a parallel processing network that simulates the intuitive, image-based thinking of humans. It is built on research into biological neural networks, by simplifying, summarizing and refining the features of biological neurons and neural networks. It uses the idea of non-linear mapping, the method of parallel processing and the structure of the neural network itself to express the associated knowledge of input and output. Initially, the prospects for applying neural networks in data mining were not considered promising, mainly because neural networks suffer from complex structure, poor interpretability and long training times. However, their advantages, such as high tolerance to noisy data and low error rates, together with the continuous advancement and optimization of network training algorithms, and especially of network pruning and rule extraction algorithms, have made neural networks increasingly favored by users in data mining.
C. Neural Network Method in Data Mining
There are seven common methods and techniques of data mining [21,22,23]: statistical analysis, rough sets, covering positive and rejecting inverse cases, formula discovery, fuzzy methods, visualization technology, and neural networks. Here, we focus on the neural network method. The neural network method is used for classification, clustering, feature mining, prediction and pattern recognition. It imitates the neuron structure of animals and is based on the M-P model and the Hebb learning rule, so in essence it is a distributed matrix structure. Through training on the data being mined, the neural network method gradually calculates (by repeated iteration or cumulative calculation) the weights of the connections in the network. Neural network models can be broadly divided into the following three types:
(1) Feed-forward networks: represented by the perceptron, the back-propagation model and the function network, and mainly used in areas such as prediction and pattern recognition;
(2) Feedback networks: represented by the Hopfield discrete model and continuous model, and mainly used for associative memory and optimization calculation;
(3) Self-organization networks: represented by the adaptive resonance theory (ART) model and the Kohonen model, and mainly used for cluster analysis.
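To make the remark about the M-P model and the Hebb learning rule more concrete, the following is a small illustrative sketch, not taken from the paper. It assumes a threshold unit in the McCulloch-Pitts style and a supervised form of the Hebb rule (each weight grows by rate × input × target). The bipolar (+1/-1) encoding, the logical AND task, the learning rate and the threshold value 2 are all assumptions chosen for the example.

```python
def mp_neuron(weights, inputs, threshold):
    """Threshold unit: output 1 if the weighted sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def hebb_update(weights, inputs, output, rate=1.0):
    """Hebb rule: change each weight in proportion to input * output."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


# Cumulative training on bipolar samples of logical AND (assumed task).
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
weights = [0.0, 0.0]
for inputs, target in samples:
    weights = hebb_update(weights, inputs, target)

# With the assumed threshold of 2, the unit fires only for the input (1, 1).
outputs = [mp_neuron(weights, inputs, 2) for inputs, _ in samples]
```

Each pass through the loop adds one sample's contribution to the weights, which is the kind of cumulative calculation the text refers to.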
D. Feedforward Neural Network
Feedforward neural networks (FF networks) are the most popular and most widely used models in many practical applications. They are known by many different names, such as "multi-layer perceptrons."

Figure 2(a) illustrates a one-hidden-layer FF network with inputs $x_1, \ldots, x_n$ and output $\hat{y}$. Each arrow in the figure symbolizes a parameter in the network. The network is divided into layers. The input layer consists of just the inputs to the network. Then follows a hidden layer, which consists of any number of neurons, or hidden units, placed in parallel. Each neuron performs a weighted summation of the inputs, which is then passed through a nonlinear activation function, also called the neuron function.

Figure 2(a) A feedforward network with one hidden layer and one output.

Mathematically, the functionality of a hidden neuron is described by

$$\sigma\left(\sum_{j=1}^{n} w_j x_j + b\right)$$

where the weights $\{w_j, b\}$ are symbolized by the arrows feeding into the neuron. The network output is formed by another weighted summation of the outputs of the neurons in the hidden layer. This summation on the output is called the output layer. In Figure 2(a) there is only one output in the output layer since it is a single-output problem. Generally, the
http://sites.google.com/site/ijcsis/ ISSN 1947-5500