False Positive Reduction using IDS Alert Correlation Method based on the Apriori Algorithm

Published by: ijcsis on Nov 02, 2010
Copyright: Attribution Non-commercial

Homam El-Taj, Omar Abouabdalla, Ahmed Manasrah, Mohammed Anbar, Ahmed Al-Madi
National Advanced IPv6 Center of Excellence (NAv6)
Universiti Sains Malaysia
Penang, Malaysia
{homam, omar, ahmad, anbar, almadi}@nav6.org
Abstract
Correlating Intrusion Detection System (IDS) alerts is a challenging topic in the field of network security. Correlating IDS alerts has several benefits: it reduces the huge number of alerts that an IDS triggers, reduces the false positive ratio, and reveals the relations between alerts, which gives a better understanding of the attacks. One family of correlation techniques is based on data mining. In this paper we develop a new IDS alert Group Correlation Method (GCM) that operates on alerts aggregated by the Threshold Aggregation Framework (TAF). We create our correlation method by adapting the Apriori algorithm for large data. The method is used to reduce the number of aggregated alerts and the ratio of false positive alerts.

Keywords: Intrusion Detection System; False Positive Alerts; Alert Correlation; Data Mining.
I. INTRODUCTION
With the essential and extensive use of the Internet and its applications, threats and intrusions have become wider and smarter. Because an IDS triggers a huge number of alerts, studying these alerts has become essential too. The study of IDS alerts has brought to light several IDS issues that should be studied: how to group the alerts, how to define the relations between alerts, and how to reduce false alerts.

II. INTRUSION DETECTION SYSTEM (IDS)
An IDS monitors the activities of the protected network and analyzes them, triggering alerts if any malicious activity occurs. An IDS can detect these activities using anomaly detection methods [1], misuse detection methods [2], or a combination of both. Anomaly methods detect malicious traffic by measuring, against a chosen threshold, how far the suspicious activity flow deviates from the normal flow, whereas misuse methods detect malicious activities based on their signatures. The main differences between these methods lie in the detection of novel attacks and in the false positive ratio: misuse methods produce the fewest false positives, while anomaly methods can detect novel attacks.

III. IDS ALERTS CORRELATION STUDIES
Correlation is the part of intrusion detection studies that facilitates the analysis of intrusion alerts based on the similarity between alert attributes. This can be represented mathematically as follows:

$$\mathrm{Corr\_Alert} = \{\mathrm{Alert}_1, \mathrm{Alert}_2, \ldots, \mathrm{Alert}_n\}$$

where Corr_Alert represents the group of alerts {Alert_1, Alert_2, …, Alert_n} that share the same features and are related to each other. However, most correlation methods focus on IDS alerts by examining other intrusion evidence provided by system monitoring tools or scanning tools. The aim of correlation analysis is to detect relationships among alerts so that attack scenarios are easy to build.
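To make the Corr_Alert definition concrete, the following is a minimal Python sketch that groups alerts sharing the same feature values. The alert records, the field names (`src_ip`, `dst_ip`, `signature`) and the function `correlate` are assumptions made for illustration; they are not taken from the paper or from any particular IDS.

```python
from collections import defaultdict

# Hypothetical alert records; field names are assumptions for illustration.
alerts = [
    {"id": 1, "src_ip": "10.0.0.5", "dst_ip": "192.168.1.10", "signature": "port-scan"},
    {"id": 2, "src_ip": "10.0.0.5", "dst_ip": "192.168.1.10", "signature": "port-scan"},
    {"id": 3, "src_ip": "10.0.0.9", "dst_ip": "192.168.1.20", "signature": "sql-injection"},
    {"id": 4, "src_ip": "10.0.0.5", "dst_ip": "192.168.1.10", "signature": "port-scan"},
]


def correlate(alerts, features):
    """Group alert ids whose values agree on every feature in `features`."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert[f] for f in features)
        groups[key].append(alert["id"])
    return dict(groups)


# Each value in the result plays the role of one Corr_Alert group.
corr_alerts = correlate(alerts, ("src_ip", "dst_ip", "signature"))
for key, ids in corr_alerts.items():
    print(key, ids)
```

Exact matching on a fixed feature tuple is the simplest possible relation between alerts; the techniques surveyed in this section use richer relations.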
A. Classification of Alert Correlation Techniques
IDS alert correlation studies approach the problem from many angles, using methods and techniques that can be categorized as: similarity-based, pre-defined attack scenarios, pre-requisites and consequences, and statistical causal analysis.

a) Similarity-Based
This technique compares alert features to see whether they are similar. The correlation is mainly based on these features: source IPs, destination IPs, source ports and destination ports.
Valdes and Skinner [3] correlated IDS alerts in three phases. The first phase applies the minimum similarity, based on the source and destination IPs. The second phase bases similarity on attack class and attack name plus source and destination IPs; this phase ensures that the same alert from different sensors is correlated. In the last phase, a threshold value is applied to correlate two alerts based on the similarity of attack class, with no consideration of other features.
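As a rough illustration of feature comparison with a threshold, the sketch below scores two alerts by the fraction of the four listed features on which they agree. The scoring rule, the default threshold of 0.5, the field names and the functions `similarity` and `is_correlated` are all assumptions for illustration; this is not the similarity measure of Valdes and Skinner [3].

```python
# The four features named in the text; the dictionary keys are assumed names.
FEATURES = ("src_ip", "dst_ip", "src_port", "dst_port")


def similarity(a, b, features=FEATURES):
    """Fraction of the listed features on which two alerts agree (0.0 to 1.0)."""
    matches = sum(1 for f in features if a[f] == b[f])
    return matches / len(features)


def is_correlated(a, b, threshold=0.5):
    """Treat two alerts as correlated when their similarity reaches the threshold."""
    return similarity(a, b) >= threshold


a = {"src_ip": "10.0.0.5", "dst_ip": "192.168.1.10", "src_port": 4444, "dst_port": 80}
b = {"src_ip": "10.0.0.5", "dst_ip": "192.168.1.10", "src_port": 4445, "dst_port": 80}
c = {"src_ip": "10.0.0.9", "dst_ip": "192.168.1.20", "src_port": 5555, "dst_port": 80}

print(similarity(a, b), is_correlated(a, b))
print(similarity(a, c), is_correlated(a, c))
```

Raising the threshold makes the correlation stricter; a real system would likely weight features differently rather than count them equally.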
This research was sponsored by the National Advanced IPv6 Center of Excellence (NAv6) Fellowship in Universiti Sains Malaysia (USM).
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 7, October 2010, http://sites.google.com/site/ijcsis/, ISSN 1947-5500
 
 
b) Pre-Defined Attack Scenarios
The idea of studying attack scenarios comes from the fact that an intrusion usually takes several actions to mount a successful attack.
Debar and Wespi [4] proposed a system to correlate and aggregate IDS alerts triggered by different sensors. Their system has two steps. It first removes redundant alerts that come from different sensors. It then correlates the alerts by applying consequence rules, which specify that an alert should be followed by another type of alert. Once the alerts are correlated according to these rules, the aggregation phase checks whether there is any similarity between the source and destination IPs and the attack class.
c) Pre-Requisites and Consequences
This technique sits between feature-similarity correlation and scenario-based correlation. Pre-requisites are the essential conditions that must exist for an attack to succeed, and the consequences of an attack are the conditions that might exist after a specific attack has occurred.
Cuppens and Miege [5] proposed a cooperation module for IDS alerts with five main functions: an alert base management function, which normalizes the alerts; alert clustering and alert merging functions, which detect similarity so that alerts are clustered and merged with each other; an alert correlation function, which uses explicit correlation rules with pre-defined and consequence statements to perform the correlation; an intention recognition function, which extrapolates the intruder's actions and provides a global diagnosis of the intruder's past, present and future actions; and a reaction function, which helps system administrators choose the best measure to prevent the intruder's malicious actions.
d) Statistical Causal Analysis
This technique ranks IDS alerts using a statistical model in order to correlate them.
Kumar et al. [6] implemented anomaly detection using the Granger Causality Test, a time series analysis method, to correlate alerts in attack scenario analysis. The technique aims to reduce the number of raw alerts by merging alerts based on their features; statistical causal analysis uses a clustering technique to rank the alerts based on the relations between attacks. It is a purely statistical causality analysis that needs no pre-defined knowledge of attack scenarios.

IV. PROPOSED ALERT CORRELATION METHOD USING THE APRIORI ALGORITHM
Our correlation method is based on IDS alerts aggregated using the Threshold Aggregation Framework (TAF). The TAF output is a set of accurate aggregated alerts with no redundant or incomplete alerts. In TAF, a threshold value should be applied when aggregating two or more alerts, to give more accurate combination results [7].
Figure 4.1 shows the TAF flowchart. TAF has two types of input: the IDS alerts and the user aggregation options. The aggregation is performed depending on these two inputs; the user chooses which type of aggregation method is used to aggregate the IDS alerts.
We propose the Group Correlation Method (GCM), which uses the output of TAF to correlate the alerts using the Apriori algorithm.
As the GCM flowchart in Figure 4.2 shows, an alert counter checker tests whether the number of alerts in the file is less than or equal to 2; if so, the alerts are dropped, since there is no need to correlate them.
[Figure 4.1: TAF flowchart [7]. Blocks: IDS Alerts, User Selection, Selection Criteria, Receiving Threshold Value (with or without Thr), Query Generator, Data Parser, Check Parsing, Data Manipulator, Data Analyzer, Alert Checker, Drop Alert (bad parsing, missing features), Aggregation Data, Database Container, Generating Results, Show Results to User.]
 
 
 
A. Apriori Algorithm
We chose the Apriori algorithm because it is one of the fastest data mining algorithms for finding all frequent itemsets in a large database [8]. The Apriori algorithm depends on two predefined threshold values (Support and Confidence) to determine whether the items in an itemset (a group of alerts) are related to each other. The Support value equals the frequency of the items in the itemset, while the Confidence value is calculated by equation (1):

[Equation (1): Confidence, expressed as a percentage in terms of LHD and RHD; the formula is not recoverable from the source text.]

where LHD is the support of the left side and RHD is the support of the right side.
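For readers unfamiliar with the two measures, here is a small Python sketch using the standard association-rule definitions: support as the fraction of transactions containing an itemset, and the confidence of a rule lhs -> rhs as support(lhs and rhs together) divided by support(lhs). These textbook definitions are an assumption here and may differ in form from equation (1); the alert type names and the function names are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """Confidence of the rule lhs -> rhs, as a percentage (standard definition)."""
    lhs_support = support(lhs, transactions)
    if lhs_support == 0:
        return 0.0  # the rule never applies, so report zero confidence
    both = support(set(lhs) | set(rhs), transactions)
    return 100.0 * both / lhs_support


# Each transaction is one hypothetical group of aggregated alert types.
transactions = [
    frozenset({"scan", "bruteforce"}),
    frozenset({"scan", "bruteforce", "exploit"}),
    frozenset({"scan"}),
    frozenset({"exploit"}),
]

print(support({"scan"}, transactions))
print(confidence({"scan"}, {"bruteforce"}, transactions))
print(confidence({"bruteforce"}, {"scan"}, transactions))
```

In this toy data every group containing "bruteforce" also contains "scan", so the rule bruteforce -> scan has full confidence, while the reverse rule does not.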
[Figure 4.2: GCM flowchart. Blocks: Files of Aggregated Alerts, Alert Amount Checker, Drop Alert (Amount ≤ 2), Determined MinSupp and MinCon, Generate Itemset I_a, Calculate Support and Confidence for each i_a, tests "i_a Support < MinSupp" and "i_a Confidence < MinCon", Save, Database Container, Show Results to User.]

The support value is calculated first for each itemset in the current iteration, and only the itemsets whose support is greater than the threshold value minSupp are kept. The second step is to calculate the confidence using equation 1. This step is done for each itemset in the current iteration; the confidence value is compared with the second threshold value minCon to determine whether the current itemset will be used in the next iteration. The main idea of Apriori, however, is to determine whether there is a relationship between the alerts, which is indicated by the confidence value. Apriori works as illustrated in Figure 4.3:
 
(1) Read the aggregated alert
(2) Get two Items as a set of the First Item and the value of the redundant of that Item in the second item group as one set of $S\{i_1, i_2, \ldots, i_n\}$
(3) Set minSupp & Set minCon
(4) Calculate support value for each $i_n$ in $S$
(5) Iteration $I = n-1$
(6) While $I \geq 1$
(7) Do $\{i_{a_r}\}_{r=1}^{n}$
(8) Calculate Support and Confidence for $i_n$ in $D\{j_1, j_2, \ldots, j_m\}$ where $D \subseteq S$
(9) For each $j_m$ in $D$, if Support < minSupp OR Confidence < minCon, drop the itemset.
(10) $I = I-1$

Figure 4.3 Apriori Algorithm
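The steps in Figure 4.3 can be compared with a compact Python sketch of the textbook level-wise Apriori procedure. This is not the GCM adaptation itself: it grows itemsets from size 1 upward instead of following the iteration scheme of the figure, it uses support counts instead of percentages, and the names `apriori`, `rules`, `min_supp`, `min_con` and the alert type labels are assumptions for illustration.

```python
from itertools import combinations


def apriori(transactions, min_supp):
    """Return {itemset: support count} for every itemset with count >= min_supp."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    items = {frozenset([i]) for t in transactions for i in t}
    current = {s: n for s, n in count(items).items() if n >= min_supp}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {s: n for s, n in count(candidates).items() if n >= min_supp}
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_con):
    """List (lhs, rhs, confidence %) for rules meeting the confidence threshold."""
    out = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = 100.0 * n / frequent[lhs]  # subsets of frequent sets are frequent
                if conf >= min_con:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Hypothetical groups of aggregated alert types.
groups = [
    {"scan", "bruteforce", "exploit"},
    {"scan", "bruteforce"},
    {"scan", "exploit"},
    {"scan", "bruteforce", "exploit"},
    {"dos"},
]
frequent = apriori(groups, min_supp=3)
found = rules(frequent, min_con=80.0)
for lhs, rhs, conf in found:
    print(sorted(lhs), "->", sorted(rhs), conf)
```

Itemsets that fall below either threshold are dropped, mirroring step (9) of the figure; here the isolated "dos" group and the weak bruteforce/exploit pair are discarded.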
B. Mathematical Representation of the Apriori Algorithm
For a better understanding of the Apriori algorithm, we represent it mathematically as follows:
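The Initial Step:
Let the itemset $S = \{i_1, i_2, \ldots, i_n\}$, $R = \{1, 2, 3, \ldots, g\}$ and $I$ = Iteration.

Iteration I = 0:
[Equations not recoverable from the source text. The surviving fragments define two tuples indexed $1, 2, \ldots$, an index set $\{1, 2, 3, \ldots\}$, and a quantity given by a cardinality $|\cdot|$.]

Iteration I = 1:
We make an intersection between two of the sets defined above. [The defining equations are not recoverable from the source text. The surviving fragments show two tuples combined into a third, indices ranging over $1, 2, 3, \ldots$, and a quantity given by a cardinality $|\cdot|$.]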