Abstract—In recent years, several supervised intrusion detection systems have been proposed. However, these methods require labeled data for training and cannot automatically adapt to frequently changing network traffic scenarios. The data must also be updated periodically and the model retrained to detect new attacks. This emphasizes the need for unsupervised detection systems that can target zero-day attacks. In this paper, we propose an unsupervised solution that detects attacks in a discrete-time sliding window using the distance between statistical features. The proposed algorithm uses generated cluster profiles and estimates the distance between their statistical features, triggering an attack event if the distance exceeds a predefined threshold. The proposed method was applied to the CICIDS-2018 dataset and tested for FTP Brute Force and HTTP Distributed Denial of Service attacks.

I. INTRODUCTION

With the rise of the Internet, attacks such as DDoS, brute force, and botnets have become destructive and pose a great threat to overall security. The number of attacks on computer networks has increased considerably over the years. These attacks can hamper network performance and cause huge financial and credibility losses to an enterprise. According to McAfee [1], the potential cost of cyber-crime to the global community is around $1 trillion, and a data breach costs the average company about $3.8 million, with an average interruption to operations of about 18 hours. Various open-source tools such as LOIC [13], Slowloris [16], Nmap [10], and Cain & Abel can be used to perform different types of attacks easily. These attacks exploit existing vulnerabilities in networking protocols such as SSH, RDP, FTP, HTTP, and ICMP. Each attack affects network parameters differently. For example, port scanning sends many packets from one source port to multiple destination ports, which changes the distribution of destination ports. During high-volume DDoS attacks, a sudden increase in the packet transfer rate is observed. The major challenge is to identify these malicious activities as attack scenarios change, because the model has to be updated over time.

Most proposed supervised learning-based models are trained on labeled data and later used in production systems to detect attack activities. Most of these supervised learning-based Intrusion Detection Systems (IDS) require huge manual effort for label generation and require data updates over time to incorporate new attacks. This emphasizes the need for an unsupervised learning method that can identify abnormal behavior automatically. Moreover, for continuous network traffic, the model is expected to learn new patterns automatically, which is not possible with offline systems. Since system behavior and applications change over time, the model is expected to update itself and adapt to a frequently changing environment so that it can be applied in real time for attack detection. Hence, an IDS is needed that can operate online using an unsupervised machine learning method. We propose a solution that relies on identifying anomalies in networking parameters over a sliding time window. The proposed system processes the data for each time window and detects whether attack activity is present. Our solution does not rely on input from previous windows and looks only at the current processing window for attack detection.

A. Contributions

The main contributions of the paper can be summarized as follows:

• We developed an unsupervised algorithm for network attack detection using a clustering method that eliminates the need for labeled data.
• We proposed an online method for detecting attacks independently for every time slot using flow-level information.
• The proposed system is based on extracting statistical features from the data and using a specific distance threshold for attack detection.
• We proposed a unified approach to detect multiple attacks using different distance thresholds.
• We used a recent benchmark dataset from the Canadian Institute for Cybersecurity (CICIDS2018) [14] to validate the proposed method.

The paper is organized as follows. In Section 2, we discuss the solutions proposed for online attack detection for various attacks. Section 3 describes the proposed methodology and its components in detail. Section 4 presents the implementation of the proposed approach and the experimental results for the various stages of processing. Section 5 provides conclusions and future work.
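To make the per-window idea outlined above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes fixed, non-overlapping windows keyed by a timestamp in seconds, and it assumes that each window has already been reduced to a list of numeric cluster profiles by an earlier clustering step. The function names (`split_into_windows`, `max_profile_distance`, `attack_detected`) are hypothetical, and the distance is taken to be the sum of absolute differences between two profiles.

```python
from collections import defaultdict
from itertools import combinations


def split_into_windows(flows, window_sec):
    """Group (timestamp_sec, record) pairs into consecutive fixed-size windows."""
    buckets = defaultdict(list)
    for ts, record in flows:
        buckets[int(ts // window_sec)].append(record)
    # Return windows in time order; windows without any flow are skipped.
    return [buckets[k] for k in sorted(buckets)]


def max_profile_distance(profiles):
    """Largest sum of absolute differences between any two cluster profiles."""
    best = 0.0
    for a, b in combinations(profiles, 2):
        best = max(best, sum(abs(x - y) for x, y in zip(a, b)))
    return best


def attack_detected(profiles, theta):
    """Flag the window when the maximum inter-cluster distance exceeds theta."""
    return max_profile_distance(profiles) > theta
```

Because `attack_detected` only receives the profiles of one window, each window can be evaluated on its own, without state carried over from earlier windows.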
Authorized licensed use limited to: UNIVERSITY OF HERTFORDSHIRE. Downloaded on June 26,2023 at 12:07:46 UTC from IEEE Xplore. Restrictions apply.
978-1-6654-4893-2/21/$31.00 ©2021 IEEE 125
2021 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS)
II. BACKGROUND
range [0,1]. If the feature values are represented by $X$, then the transformed value $X_t$ is

$$X_t = \frac{X - X_{min}}{X_{max} - X_{min}} \qquad (1)$$

2) Model Building
The main objective of this stage is to build the unsupervised model and generate the statistical features for attack detection. To reduce the search space, similar records need to be combined into groups. We perform clustering for this purpose and generate the optimum clusters by analyzing the clustering quality using evaluation measures such as purity and homogeneity score. After the final clustering, similar flow-level information is grouped together. The purpose of this step is to allow the abnormal cluster to be identified in later stages. Once the clusters are created, we analyze them to identify whether any attack cluster with differentiating behavior exists. Profile generation is then executed to build statistical insights from the clustering; it computes the statistical features for every cluster. These statistical features are computed on each identified cluster and consist of various 1D and 2D features, as described in Table II. Let $F_i$ and $F_j$ be two features with value sets $X = \langle x_1, x_2, \ldots, x_n \rangle$ and $Y = \langle y_1, y_2, \ldots, y_n \rangle$.

| Dimension | Feature | Formula |
|-----------|---------|---------|
| 1D | Mean | $\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i$ |
| 1D | Variance | $\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_x)^2$ |
| 2D | Cosine Distance | $1 - \frac{X \cdot Y}{\lVert X \rVert \lVert Y \rVert}$ |
| 2D | Correlation Distance | $1 - \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sigma_x \cdot \sigma_y}$ |

TABLE II
LIST OF STATISTICAL FEATURES

After this process, each generated feature $V_i$ is named <$S_i$-$F_i$>, where $S_i$ represents a statistical parameter and $F_i$ represents the feature or group of features on which it is computed. Once the statistical features are computed, the cluster is represented as a vector $V$, called the cluster profile, as shown in Table III. Profile generation is an important step that maps the clusters to a vector representation, which is then used to identify the attack.

3) Attack Detection
Attack detection is the main stage, which identifies whether any attack activity is present in the current time window. Once the cluster profiles are generated, the attack detection procedure computes the inter-cluster distance between every pair of clusters. These distances are computed individually for each profile parameter and summed to obtain the total distance, as represented in equation (2). We take the absolute value of each difference because some features affect the statistical distance positively and others negatively.

$$D_{ij} = \sum_{n=1}^{N} \left| V^i_n - V^j_n \right| \qquad (2)$$

Here the sum runs over the profile parameters, and $V^i_n$ denotes the n-th parameter in the profile of cluster i. For N clusters (N here denotes the number of clusters rather than the number of profile parameters), a total of $(N^2 - N)$ distances are computed over all ordered pairs of clusters. The maximum of the computed distances is compared with a defined threshold θ. If this distance is greater than θ, an attack is reported. The threshold value is adjusted based on administrator input on the outcome of the last N window results. It can also be updated to reduce false positives or false negatives.

C. Algorithm
The proposed algorithm involves multiple stages of execution, as described earlier, each of which plays a specific role in the overall process. The flow-level records in a particular time window ∆t and the distance threshold θ are provided as input to the algorithm. On this subset of flow records, the required features are first selected and scaled. Clustering is then performed to group similar points. Next, statistical features are computed on each cluster to generate a cluster profile. The objective is then to identify the maximum distance attained among pairs of clusters. This maximum distance is compared with the predefined distance threshold to identify an attack. The overall complexity of the proposed algorithm is $O(N \log N) + O(C^2)$, where N is the number of data points in time slot t and C is the number of clusters created from the data, assuming DBSCAN is used for clustering with a spatial index, in which case its average complexity is $O(N \log N)$ (the worst case is $O(N^2)$). The whole process is given in Algorithm 1.

IV. EXPERIMENTATION RESULTS

A. Dataset
We used the CICIDS-2018 dataset [14] to train and test the proposed attack detection system. This recent dataset was developed at the Canadian Institute for Cybersecurity. Its records were collected on a testbed whose networking infrastructure contains 50 attacking machines, while the victim organization has 5 departments and includes 420 machines and 30 servers. The FTP-Patator and SSH-Patator tools [12] were used to perform brute force attacks, and the LOIC tool was used for the DDoS attack. Flows were captured using CICFlowMeter-V3 [7]. In total, 7 different types of attacks were captured on different dates. We selected only the data for which the FTP Brute Force and HTTP DDoS attacks were performed, along with benign traffic, as shown in Table V.

B. Implementation
We used flow-level data captured on 14-02-2018 and 20-02-2018, specifically for the FTP Brute Force attack and the HTTP DDoS attack. We do not require the associated label information
to start the process. The complete implementation was done on macOS with a 6-core Intel Core i7 processor and 16 GB RAM. Feature selection is performed based on attack properties, as described earlier. Table VI provides details on the features selected for each type of attack detection. We used the DBSCAN clustering algorithm [4] for the generation of clusters.

TABLE III
SAMPLE PROFILE

TABLE V
ATTACK DATASET

1) Clustering Metrics: We used the labels from the dataset to compute the clustering measures. These measures also help us select the best hyperparameters for clustering.
(a) Purity Score: The purity score is the number of data points whose cluster is matched to the correct class, divided by the total number of data points. The score is bounded in [0, 1], and a higher score denotes better clustering. Let N be the number of data points, K the number of clusters, $C_i$ a cluster in C, and $T_j$ the classification
which has the maximum count for cluster $C_i$:

$$\text{purity score} = \frac{1}{N} \sum_{i=1}^{K} \max_j |C_i \cap T_j| \qquad (3)$$

(b) Homogeneity Score: The homogeneity score measures whether a cluster contains only members of a single class. It is bounded in [0, 1], and a higher score denotes better clustering.

| Window Size | Purity Score | Homogeneity Score |
|-------------|--------------|-------------------|
| 2 min. | 0.99 | 0.99 |
| 1 min. | 0.99 | 0.97 |
| 30 sec. | 0.98 | 0.96 |
| 15 sec. | 0.98 | 0.93 |

TABLE VIII
CLUSTERING RESULTS: FTP BRUTE FORCE
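The purity score of equation (3) and the homogeneity score can be computed from two parallel label lists, as in the sketch below. The paper does not spell out a homogeneity formula, so this sketch assumes the common entropy-based definition, 1 − H(class | cluster) / H(class), which is also what scikit-learn's `homogeneity_score` implements. The helper names are illustrative, and the handling of DBSCAN noise points is left out.

```python
import math
from collections import Counter


def purity_score(cluster_labels, class_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    per_cluster = {}
    for c, t in zip(cluster_labels, class_labels):
        per_cluster.setdefault(c, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_labels)


def entropy(counts, total):
    """Shannon entropy (natural log) of a collection of counts."""
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def homogeneity_score(cluster_labels, class_labels):
    """1 - H(class | cluster) / H(class); taken as 1.0 when H(class) is 0."""
    total = len(class_labels)
    h_class = entropy(Counter(class_labels).values(), total)
    if h_class == 0.0:
        return 1.0
    per_cluster = {}
    for c, t in zip(cluster_labels, class_labels):
        per_cluster.setdefault(c, Counter())[t] += 1
    h_cond = 0.0
    for counts in per_cluster.values():
        size = sum(counts.values())
        # Weight each cluster's class entropy by the cluster's share of points.
        h_cond += (size / total) * entropy(counts.values(), size)
    return 1.0 - h_cond / h_class
```

Both functions return 1.0 when every cluster contains a single class. Homogeneity drops to 0 when the clusters carry no information about the class labels, whereas purity can never fall below the share of the largest class.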
We compute the mean of the above scores over the group of clusters identified in a time slot. Our approach also requires each cluster to contain examples of a single class and to have a high purity score.

2) Classification Metrics: To estimate the performance of the algorithm, we use precision, recall, F1-score, and accuracy, as described in Table VII. We used timestamps to identify the event (attack or benign) associated with each window according to the dataset. Let TP denote the true positives, TN the true negatives, FP the false positives, and FN the false negatives.

| Metric | Description | Formula |
|--------|-------------|---------|
| Precision | Percentage of correctly detected attacks out of all windows flagged as attack | $\frac{TP}{TP + FP}$ |
| Recall | Percentage of attacks detected out of total attacks | $\frac{TP}{TP + FN}$ |
| F1-Score | Harmonic mean of precision P and recall R | $\frac{2 \cdot P \cdot R}{P + R}$ |
| Accuracy | Prediction accuracy of the built model | $\frac{TP + TN}{TP + FP + TN + FN}$ |

TABLE VII
CLASSIFICATION METRICS

D. Results and Analysis
1) Detection Performance: We evaluated the detection performance of the proposed method for the FTP Brute Force and HTTP DDoS attacks individually. According to the CICIDS-2018 dataset, the FTP Brute Force attack was performed from 10:32 am to 12:09 pm on 14-02-2018. We took data from 10:00 am to 1:00 pm, considering the first 30 min and last 30 min as benign traffic. Similarly, the HTTP DDoS attack was performed on 20-02-2018 from 10:12 am to 11:17 am. We took data from 9:00 am to 12:30 pm to include benign traffic before and after the attack. We estimated results for time window sizes of 2 min, 1 min, 30 sec, and 15 sec. We used the DBSCAN algorithm for clustering with configuration parameters eps = 0.005 and min-sample = 10 for both attacks. The clustering evaluation metrics attained for the various window sizes are shown in Table VIII for FTP Brute Force and Table IX for HTTP DDoS attacks. To estimate the detection performance, we divided the time range into slots based on the window size and estimated the results based on the attack time duration. The overall classification metrics are based on 2-class (benign or attack) classification results. A threshold value of 2.5 is used for FTP Brute Force attack detection and a value of 47 for HTTP DDoS attack detection. Detection results for varying window sizes are given in Tables X and XI, respectively.

| Window Size | Purity Score | Homogeneity Score |
|-------------|--------------|-------------------|
| 2 min. | 0.86 | 0.68 |
| 1 min. | 0.87 | 0.71 |
| 30 sec. | 0.89 | 0.73 |
| 15 sec. | 0.93 | 0.82 |

TABLE IX
CLUSTERING RESULTS: HTTP DDOS

| Window Size | Precision | Recall | F1-score | Accuracy |
|-------------|-----------|--------|----------|----------|
| 2 min. | 97% | 81% | 88% | 88% |
| 1 min. | 98% | 85% | 91% | 91% |
| 30 sec. | 98% | 93% | 95% | 95% |
| 15 sec. | 95% | 87% | 91% | 90% |

TABLE X
CLASSIFICATION RESULTS: FTP BRUTE FORCE

| Window Size | Precision | Recall | F1-score | Accuracy |
|-------------|-----------|--------|----------|----------|
| 2 min. | 90% | 90% | 90% | 93% |
| 1 min. | 91% | 96% | 93% | 95% |
| 30 sec. | 74% | 91% | 81% | 85% |
| 15 sec. | 54% | 80% | 64% | 69% |

TABLE XI
CLASSIFICATION RESULTS: HTTP DDOS

2) Time Window Selection: A proper time window needs to be selected so that the results are processed within the expected time, depending on the flow arrival rate. We measured the time taken to process each batch for varying window sizes, as shown in Tables XII and XIII, respectively. The per-batch execution time decreases with a smaller window size, and the detection performance results show that reducing the window size beyond a certain point reduces the overall classification metrics. This helps us select the optimum window size based on the flow arrival rate. For the HTTP DDoS attack, the processing time is larger than for FTP Brute Force
because of the higher volume of traffic generated during the DDoS attack.

| Window Size | No. of windows | Total Time (sec.) | Time per Batch (sec.) |
|-------------|----------------|-------------------|-----------------------|
| 2 min. | 89 | 43 | 0.48 |
| 1 min. | 177 | 44 | 0.24 |
| 30 sec. | 348 | 46 | 0.13 |
| 15 sec. | 674 | 59 | 0.08 |

TABLE XII
FTP BRUTE FORCE: WINDOW SIZE VS TIME

| Window Size | No. of windows | Total Time (sec.) | Time per Batch (sec.) |
|-------------|----------------|-------------------|-----------------------|
| 2 min. | 91 | 884 | 9.71 |
| 1 min. | 181 | 524 | 2.89 |
| 30 sec. | 358 | 405 | 1.13 |
| 15 sec. | 693 | 404 | 0.58 |

TABLE XIII
HTTP DDOS: WINDOW SIZE VS TIME

3) Threshold Selection: The threshold plays a vital role in obtaining good results. To decide the threshold, we check the performance metrics at various thresholds. We take the minimum and maximum distances achieved for a time window size of 2 min and estimate the metrics over this range. We select the threshold that gives the best performance metrics, which is 2.5 for FTP Brute Force and 47 for HTTP DDoS attacks, as shown in Figures 2 and 3.

V. CONCLUSION AND FUTURE WORK
We presented an online approach for attack detection that works in an unsupervised manner. The proposed system removes the overhead of labeling every network flow. The administrator can decide the threshold based on the networking scenario and the kind of attack to be detected. Our implementation of intrusion detection is based on a recent and representative flow-level dataset. The current approach does not require the context of previous time windows, which allows it to execute independently for each time step. In the future, we plan to extend the strategy to include context for improvement. The proposed scheme can also be extended to detect various other attacks such as denial of service and botnets.

REFERENCES
[1] McAfee report. https://ir.mcafee.com/news-releases/news-release-details/new-mcafee-report-estimates-global-cybercrime-losses-exceed-1 (2020)
[2] Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V., Prettenhofer, P., Gramfort, A., Grobler, J., Layton, R., VanderPlas, J., Joly, A., Holt, B., Varoquaux, G.: API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning. pp. 108–122 (2013)
[3] Dromard, J., Roudière, G., Owezarski, P.: Online and scalable unsupervised network anomaly detection method. IEEE Transactions on Network and Service Management 14(1), 34–47 (2016)
[4] Ester, M., Kriegel, H.P., Sander, J., Xu, X., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD. vol. 96, pp. 226–231 (1996)
[5] Ippoliti, D., Jiang, C., Ding, Z., Zhou, X.: Online adaptive anomaly detection for augmented network flows. ACM Transactions on Autonomous and Adaptive Systems (TAAS) 11(3), 1–28 (2016)
[6] Lakhina, A., Crovella, M., Diot, C.: Mining anomalies using traffic feature distributions. ACM SIGCOMM Computer Communication Review 35(4), 217–228 (2005)
[7] Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I., Ghorbani, A.A.: Characterization of Tor traffic using time based features. In: ICISSP. pp. 253–262 (2017)
[8] Liao, J., Teo, S.G., Kundu, P.P., Truong-Huu, T.: Enad: An ensemble framework for unsupervised network anomaly detection
[9] Lima Filho, F.S.d., Silveira, F.A., de Medeiros Brito Junior, A., Vargas-Solar, G., Silveira, L.F.: Smart detection: an online approach for DoS/DDoS attack detection using machine learning. Security and Communication Networks 2019 (2019)
[10] Lyon, G.F.: Nmap network scanning: The official Nmap project guide to network discovery and security scanning. Insecure.Com LLC (US) (2008)
[11] Mirsky, Y., Doitshman, T., Elovici, Y., Shabtai, A.: Kitsune: an ensemble of autoencoders for online network intrusion detection. arXiv preprint arXiv:1802.09089 (2018)
[12] Patator: FTP Patator and SSH Patator. https://github.com/lanjelot/patator
[13] Sauter, M.: “LOIC will tear us apart”: The impact of tool design and media portrayals in the success of activist DDoS attacks. American Behavioral Scientist 57(7), 983–1007 (2013). https://doi.org/10.1177/0002764213479370
[14] Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A.: Toward generating a new intrusion detection dataset and intrusion traffic characterization. (2018)
[15] Shi, Z., Li, J., Wu, C., Li, J.: Deepwindow: An efficient method for online network traffic anomaly detection. In: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). pp. 2403–2408. IEEE (2019)
[16] Yaltirakli, G.: Slowloris. https://github.com/gkbrk/slowloris (2015)