Published by: ijcsis on Dec 04, 2010
Copyright:Attribution Non-commercial

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010

Efficient Probabilistic Classification Methods for NIDS

S. M. Aqil Burney, Department of Computer Science, University of Karachi, Karachi, Pakistan
M. Sadiq Ali Khan, Department of Computer Science, University of Karachi, Karachi, Pakistan, msakhan@uok.edu.pk
Mr. Jawed Naseem, Principal Scientific Officer, PARC
Abstract: As technology improves, attackers try to gain access to network system resources by many means, and open loopholes in the network allow them to penetrate it more easily. Various approaches have been tried for the classification of attacks. In this paper we compare two methods, Naïve Bayes and the Junction Tree Algorithm, on a reduced set of features, improving performance compared to the full data set. PCA is used for feature reduction, which helped in proposing a new method for efficient classification. We propose a Bayesian network-based model with a reduced set of features for intrusion detection. Our proposed method generates a lower false positive rate, which increases detection efficiency by reducing the workload and thereby increases the overall performance of an IDS. We also investigate whether conditional independence really affects the detection of attacks/threats.

Keywords: Network Intrusion Detection System (NIDS); Bayesian Networks; Junction Tree Algorithm
I. INTRODUCTION
Network security, whether in a commercial organization or in a critically important research network, is a major concern; with the increasing use of the web, even personal information is under threat. An efficient network intrusion detection system is the only solution to such threats [4]. An IDS is a system that monitors networks in order to secure them from cyber terrorists; equivalently, it is the process of examining the events occurring in a network or computer system and detecting signs of incidents that threaten computer security policies. The network is monitored by the IDS to detect any rule violation. When such a violation occurs, an efficient IDS raises an alarm that alerts the administrator to take steps/measures against the vulnerability. Common intrusion attacks are classified on the basis of various features/parameters. The KDD-99 data set is usually used for investigating the nature of attacks. The data set lists 41 features. The information value of these features and the interdependence among them are of interest for investigation. How much can the features be reduced without reducing the efficiency of the classification algorithm, and does interdependency really contribute to detection efficiency? We try to answer such questions in this paper. PCA is an effective technique for reducing data dimension. The Naïve Bayes classifier and the Bayesian network both use a probabilistic approach to determine the probability of an attack. The Naïve Bayes classifier assumes conditional independence, while the Bayesian network assumes conditional dependence. The two methods can therefore be used to compare whether conditional independence or interdependence really contributes to the probability of attack.
In the next section we discuss related work; in Section 3 we discuss the two classification methods; in Section 4 we describe the methodology; and in Section 5 we present the results and discussion.

II. BACKGROUND
Most network-based systems become targets for hackers, so building an efficient IDS is the main task nowadays [4]. Intrusion detection systems need a component that generates alerts on the basis of a rule set; to detect malicious activity correctly, it is necessary to manage the alerts correctly [1]. Researchers apply data mining approaches to attack detection in their intrusion detection systems [2]. Probabilistic approaches for reducing the false alarm rate have also been proposed; see, for example, [3]. An enormous amount of network traffic data accumulates each day. A number of data mining approaches are used to collect domain knowledge for intrusion detection, including clustering, association rules, and classification [12]. Data mining techniques support data analysis and have become an important component of intrusion detection systems. The main concern in using data mining techniques in an attack detection system is to differentiate between normal and abnormal packets. To apply data mining to intrusion detection we need a data set and a classification model. That classification model may be a Bayesian network, a neural network, a rule-based decision tree, or another soft computing technique such as Support Vector Machines (SVM) [10,11]. An intrusion detection system is now a necessity for an organization's security system, and its credibility may depend on the data mining techniques used.
2.1 Clustering
The process of labeling data and arranging it in groups is called clustering. Grouping improves the performance of the different classifiers used. A genuine cluster contains data corresponding to a single category [5]. The data belonging to a cluster are modeled with respect to their existing features. Clustering can be defined as an unsupervised machine learning mechanism for pattern matching in unlabeled data with numerous aspects.

http://sites.google.com/site/ijcsis/ ISSN 1947-5500
2.2 Classification
In classification we break the data set into different classes; it is much less exploratory than clustering. Here, classification is needed to sort data into a set of classes, normal / not normal, and to sub-classify it into different attack types. Naïve Bayes is used as the classification algorithm in this research, by which data classification for intrusion detection is achieved. Because a huge amount of traffic data must be collected, classification is less popular [6].

III. CLASSIFICATION METHODS
3.1 Naïve Bayes Classifier 
The Naïve Bayes classifier is an effective technique for classifying data. It is particularly useful for high-dimensional data. Naïve Bayes is a special case of Bayes' theorem that presupposes independence among data attributes [7]. Even though Naïve Bayes assumes independence, its performance is efficient and on par with other techniques that assume conditional dependence. The Naïve Bayes classifier can handle continuous or categorical data. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a set of given variables with possible outcomes $O = \{o_1, o_2, \ldots, o_n\}$. The posterior probability of the dependent variable is obtained by Bayes' rule:

$$P(O_j \mid x_1, x_2, \ldots, x_n) \propto P(x_1, x_2, \ldots, x_n \mid O_j)\, P(O_j)$$

Under the independence assumption the likelihood factorizes over the attributes, so a new case $X$ is given the class label $O_j$ that has the highest posterior probability:

$$P(O_j \mid X) \propto P(O_j) \prod_{k=1}^{n} P(x_k \mid O_j)$$

The efficiency of the Naive Bayes classifier lies in the fact that it converts a multi-dimensional density estimation problem into one-dimensional density estimations. The occurrences of evidence do not affect the posterior probability, so the classification task is generally efficient. The same is shown in this study, where the Naive Bayes classifier is compared with the Junction Tree algorithm. For modeling the Naive Bayes classifier, several distributions can be employed, including the normal, gamma, or Poisson density functions.
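As an illustration of the classification rule, the following is a minimal sketch of a Naive Bayes classifier for categorical attributes. It is not the BN classifier software used in the study; the function names, the add-one (Laplace) smoothing, and the toy connection records are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Estimate class counts and per-feature value counts from categorical data."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] = number of training rows
    value_counts = defaultdict(Counter)
    # distinct values seen for each feature, needed for the smoothing term
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for k, value in enumerate(row):
            value_counts[(k, label)][value] += 1
            feature_values[k].add(value)
    return class_counts, value_counts, feature_values


def log_posteriors(model, row):
    """Return log P(O_j) + sum_k log P(x_k | O_j) (unnormalised) per class."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(O_j)
        for k, value in enumerate(row):
            # Add-one smoothing is an assumption of this sketch.
            numerator = value_counts[(k, label)][value] + 1
            denominator = count + len(feature_values[k])
            score += math.log(numerator / denominator)
        scores[label] = score
    return scores


def classify(model, row):
    """Pick the class label with the highest posterior score."""
    scores = log_posteriors(model, row)
    return max(scores, key=scores.get)


# Toy records (protocol, flag); the values are illustrative only.
train_rows = [("tcp", "S0"), ("tcp", "S0"), ("tcp", "SF"),
              ("udp", "SF"), ("tcp", "SF"), ("udp", "SF")]
train_labels = ["neptune", "neptune", "normal", "normal", "normal", "normal"]
model = train_naive_bayes(train_rows, train_labels)
print(classify(model, ("tcp", "S0")))  # prints "neptune"
```

Working in log space avoids numerical underflow when many attribute probabilities are multiplied together, which matters once the number of features grows beyond a handful.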
3.2 Junction Tree Algorithm
It is a graphical method of belief updating, or probabilistic reasoning. For probabilistic reasoning we use Bayesian Networks and Decision Graphs (BNDG), for which details can be found in [9]. The basic concept in a junction tree is the clustering of predicted attributes [8]. In belief updating, instead of approximating the joint probability distribution of all targeted variables, clusters of attributes (cliques) are formed and the potentials of the clusters are used to approximate probabilities. So a junction tree is basically the graphical representation of the potential cluster nodes, or cliques, together with a suitable algorithm to update these potentials. The junction tree algorithm involves several steps: moralizing the graph, triangulation, junction tree formulation, assigning probabilities to cliques, message passing, and reading the cliques' marginal potentials from the junction tree.

Using the junction tree algorithm requires that the directed graph first be changed to an undirected graph to ensure uniform application. This process is called moralization, which involves adding edges between parents and dropping the directions. Let a directed graph be changed into an undirected graph $G(N_G, E_G)$; in effect, two new sets are required to be added along with $E_G$ [the set definitions are missing in the source]. The new undirected moralized graph is obtained from these sets [equation missing in the source].

A junction tree is formed after moralization; it is basically a hypergraph of cliques. If the cliques of the undirected graph $G$ are given by $C(G)$, then the junction tree over $C(G)$ has the property that the intersection of any two nodes is contained in every node on the unique path joining them.

Consider a cluster representation with two neighboring clusters U and V sharing a variable S in common.
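The moralization step described earlier in this section (connect the parents of each node, then drop edge directions) can be sketched as follows. This is a minimal illustration and not the Netica procedure used in the study; the `moralize` function and the three-node network are hypothetical.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moral graph.

    `parents` maps each node to the list of its parents in the directed graph.
    """
    edges = set()
    for child, node_parents in parents.items():
        # Keep every original edge, dropping its direction.
        for parent in node_parents:
            edges.add(frozenset((parent, child)))
        # "Marry" the parents: connect every pair of parents of the same child.
        for a, b in combinations(node_parents, 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical three-node network: count -> attack <- src_byte
dag = {"count": [], "src_byte": [], "attack": ["count", "src_byte"]}
moral_edges = moralize(dag)
print(sorted(tuple(sorted(edge)) for edge in moral_edges))
```

Storing each edge as a `frozenset` makes the pair unordered, so the same edge is never counted twice regardless of the direction it had in the original graph.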
The aim of JTA is to modify the potentials in such a way that the distribution P(V) is given by the modified potential Ψ(V). In that case the probability of S is

$$P(S) = \sum_{V \setminus S} \Psi(V)$$

Similarly,

$$P(S) = \sum_{U \setminus S} \Psi(U)$$

Let Ψ(S) represent the modified separator potential, so that Ψ(S) = P(S). If a potential, say Ψ(V), is changed as a result of new evidence, the potentials Ψ(S) and Ψ(U) can be updated by realizing the equivalence

$$\sum_{U \setminus S} \Psi(U) = P(S) = \sum_{V \setminus S} \Psi(V)$$

Belief updating in a junction tree is carried out through message passing. Let V and W be two adjacent nodes with separator S. The task is to absorb from V to W through S, that is, to find new potentials Ψ*(W) and Ψ*(S) satisfying the condition

$$\sum_{W \setminus S} \Psi^{*}(W) = \Psi^{*}(S) = \sum_{V \setminus S} \Psi^{*}(V)$$

In absorption, Ψ*(S) and Ψ*(W) are replaced as follows, with Ψ*(V) = Ψ(V) left unchanged:

$$\Psi^{*}(S) = \sum_{V \setminus S} \Psi(V)$$

$$\Psi^{*}(W) = \Psi(W)\, \frac{\Psi^{*}(S)}{\Psi(S)}$$
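A single absorption from a clique V to a neighboring clique W through their separator S can be sketched as below. The two-clique layout (V over variables A and B, W over B and C, separator B), the binary states, and all potential values are assumptions chosen for illustration; the sketch also assumes the old separator potential is nonzero.

```python
from itertools import product

STATES = (0, 1)  # every variable is binary in this toy example

# Clique V = (A, B), clique W = (B, C), separator S = (B,).
# The numbers are made up: psi_v holds P(A, B), psi_w holds P(C | B).
psi_v = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}
psi_w = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.6, (1, 1): 0.4}
psi_s = {(0,): 1.0, (1,): 1.0}


def absorb(psi_v, psi_s, psi_w):
    """Absorb from V to W through S; return the updated (psi_s, psi_w)."""
    # New separator potential: sum Psi(V) over V \ S (sum out A, keep B).
    new_s = {(b,): sum(psi_v[(a, b)] for a in STATES) for b in STATES}
    # New W potential: Psi(W) * new Psi(S) / old Psi(S), matched on B.
    new_w = {(b, c): psi_w[(b, c)] * new_s[(b,)] / psi_s[(b,)]
             for b, c in product(STATES, STATES)}
    return new_s, new_w


new_s, new_w = absorb(psi_v, psi_s, psi_w)
# Consistency check: summing the new W potential over W \ S
# reproduces the new separator potential.
for b in STATES:
    marginal = sum(new_w[(b, c)] for c in STATES)
    print(b, round(new_s[(b,)], 3), round(marginal, 3))
```

After the call, W carries the joint distribution of B and C implied by V, which is the local consistency that message passing propagates through the rest of the tree.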
In this way, the belief of the whole network is updated through message passing.

IV. METHODOLOGY
The KDD’99 intrusion detection data set was used. PCA was applied and 14 features were selected on the basis of the analysis. The selection of data sets for training and testing plays a vital role in prediction accuracy. In intrusion detection, some attacks are far more frequent than others. To ensure that all attack types were included in learning, stratified random samples were drawn in proportion to each attack type. This produces better results than simple random sampling. For Naive Bayes classification, two data sets (stratified samples of equal size, 10000) were used for learning and testing with the BN classifier software. For the junction tree algorithm, structure learning was carried out in Netica on a random sample of 5000 drawn from the KDD data set. Then five data sets, each of size 1000, were selected through simple random sampling; data set 1 was used for learning and drawing the junction tree, and data sets 2 to 5 were used for testing the belief update learned by the junction tree.

V. RESULTS & DISCUSSION
The 41 features of the KDD’99 data set were reduced to 14 features. PCA identified 12 major components with eigenvalues greater than [value missing in source]; around 80% of the variability of the data is explained by these components, while 98% of the variability can be explained by 24 components. The difference in explained variability between selecting 24 and 14 features is only 18%, but the computational cost increases greatly if 24 parameters are selected, so 14 were selected to optimize processing speed. It is evident from the scree plot (Figure 1) that the first 24 components represent 98.866% of the data and that 14 components explain 80% of the variability, which is quite sufficient; the work was therefore carried out on these components only, neglecting the other components, which seem less worthy. Besides this, structure learning also supports the selection of 14 features. The Bayesian network model shown in Figure 2 represents the interdependence among the various attributes. It is evident that mainly two factors, count and src_byte, are affected by various features, and in turn these two ultimately affect the attack types. The KDD’99 data set classification lists 18 attack types; however, normal and neptune are the most frequent.
Figure 1: Scree plot of attributes.
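The trade-off between the number of retained components and the variability they explain can be made concrete with a small helper that reads the answer off a list of eigenvalues. This is a sketch only: `components_needed` is a hypothetical function, and the eigenvalues below are invented for illustration, not the values obtained from the KDD’99 analysis.

```python
def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share of
    variance reaches `threshold` (a fraction between 0 and 1)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues, NOT those of the KDD'99 data set.
eigenvalues = [4.0, 2.0, 1.5, 1.0, 0.8, 0.4, 0.2, 0.1]
print(components_needed(eigenvalues, 0.80))  # prints 4
print(components_needed(eigenvalues, 0.98))  # prints 7
```

In this toy case, raising the target from 80% to 98% of the variance nearly doubles the number of components, which mirrors the cost argument made for choosing 14 features over 24.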
[Figure: neighboring clusters U and V joined through separator S, with potentials Ψ(U), Ψ(S) and Ψ(V)]