

Ensemble Learning Methods for Anomaly Intrusion Detection System in Smart Grid

Tala Talaei Khoei, Ghilas Aissou, Wen-Chen Hu, and Naima Kaabouch
School of Electrical Engineering and Computer Science, University of North Dakota
Grand Forks, ND 58202 USA
2021 IEEE International Conference on Electro Information Technology (EIT) | 978-1-6654-1846-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/EIT51626.2021.9491891

Abstract— Smart grid is an emerging technology that delivers electricity intelligently to end-users through two-way communication. However, this technology can be subject to several cyber-attacks due to the network's inherent weaknesses. One practical solution to secure smart grid networks is using an intrusion detection system (IDS). An IDS improves the smart grid's security by detecting malicious activities in the network. However, existing systems have several shortcomings, such as a low detection rate and a high false alarm rate. For this purpose, several studies have focused on addressing these issues using techniques including traditional machine learning models. In this paper, we investigate the performance of three different ensemble learning techniques: bagging-based, boosting-based, and stacking-based. Their results are compared to those of three traditional machine learning techniques, namely K-nearest neighbor, random forest, and Naïve Bayes. To train, evaluate, and test the proposed methods, we used the CICDDoS 2019 benchmark, which consists of several DDoS attacks. Two feature selection techniques are used to identify the most important features. The performance evaluation is based on the probability of detection, probability of false alarm, probability of misdetection, and accuracy. The simulation results show that the stacking-based ensemble learning techniques outperform the other algorithms in terms of the four evaluation metrics.

Keywords— Smart grid, intrusion detection system, anomaly-based, DDoS attacks, supervised machine learning, ensemble learning, bagging, boosting, stacking, performance metrics.

I. INTRODUCTION

Smart grid is an intelligent electricity network that can perform various operations and energy measures through components such as smart meters, renewable energy resources, and energy-efficient resources. This technology can deliver electricity to consumers through two-way communication. However, it suffers from some limitations, including a lack of security [1]. Weak security can lead to tremendous security risks in this heterogeneous network, which is a major concern when using smart grid.

For this purpose, several methods have been proposed to improve the smart grid's security. For instance, the authors of [2] proposed an intrusion detection system (IDS) that can detect unknown threats, vulnerabilities, and cyber-attacks in any heterogeneous system. An intrusion detection system is a holistic solution that detects malicious activities and sophisticated cyber-attacks, including zero-day attacks that traditional methods cannot detect.

IDSs can be classified into three categories, namely signature-based, specification-based, and anomaly-based. Signature-based IDSs detect patterns in malicious behaviors by using a dataset of well-known attack signatures. Specification-based IDSs mainly identify deviations from specified legitimate behaviors. Anomaly-based IDSs use statistical measures to distinguish malicious activities from normal behaviors [3]. Several papers have focused on these categories and discussed their advantages and disadvantages. For example, the authors of [4] highlighted the major advantages of anomaly-based IDSs over signature-based and specification-based ones, including the ability to detect zero-day or other multi-step attacks in the network. Anomaly-based IDSs can also be used to detect attacks in real-time systems, such as smart grid, and to analyze protocol-based attacks or multidimensional traffic. In addition to these advantages, the authors of [5] state that anomaly-based techniques are better suited to smart grid than the other two categories due to their ability to detect multi-step attacks and protect advanced metering infrastructure (AMI) against malicious activities.

However, anomaly-based IDSs have some limitations, including high false positive rates (FPR) and low detection rates [5, 6]. Several studies have been done to improve the accuracy of anomaly-based IDSs so that they detect sophisticated suspicious activities in the network and provide better results. For instance, the authors of [7] proposed an intelligent architecture for IDS using machine learning techniques. They compared the performance of Naïve Bayes, Bayesian network, J48, ZeroR, OneR, Simple Logistic, support vector machine (SVM), multi-layer perceptron, and random forest. Their proposed system classifies traffic activities and categorizes the network's attack types. Their simulation results show that J48 achieves the best results and that their proposed system can effectively distinguish between malicious and normal activities in the network.

In [8], the authors also proposed an intrusion detection system, DeepCoin, that integrates deep learning techniques, such as a recurrent neural network algorithm, with a blockchain framework; they used three different datasets, including CICDDoS 2017, Power System, and BoT-IoT, to evaluate their results. The simulation results using these three datasets show that the proposed system provides a good detection rate. In [9], the authors proposed a system that detects Denial of Service (DoS) attacks that compromise smart grid network availability. They employed cumulative sum (CUSUM) and a method called abnormal behavior detection to detect any malicious node in the network. Their proposed IDS can achieve a good detection rate.

In [10], the authors used an unsupervised machine learning system for detecting covert data integrity assaults on the smart grid. For their evaluation, they used the SE-MF dataset and proposed an algorithm, iForest. The proposed algorithm was evaluated on the standard IEEE 14-bus, 39-bus, 57-bus, and 118-bus systems. Their simulation results show that iForest could significantly improve the attack detection rate in the proposed system. The authors of [11] also exploited machine-learning techniques on the CICDDoS 2019 benchmark. They proposed two machine-learning models, a convolutional neural network (CNN) and a long short-term memory (LSTM) network, for the detection of DDoS attacks in the network. Evaluation results show that both models' accuracies are high.

In [12, 13], the authors also used deep learning techniques to detect attacks in a network. Both studies mainly focus on detecting several types of attacks, including Botnet, UDP, SYN, broadcast, sleep deprivation, and barrage attacks. The simulation results show that these models are promising in detecting attacks. Several other studies also used other techniques to detect cyber-attacks in networks. In [14], the authors developed a lightweight hybrid approach for detecting Hello Flood and Sybil attacks. They used the Routing Protocol for Low-Power and Lossy Networks (RPL) in their proposed system. Their results showed that this hybrid approach provides a good detection rate.

As previously mentioned, most studies applied different machine learning methods for detecting intrusions. However, in some cases, individual machine learning techniques can detect some attacks and miss others, leading to poor performance. One solution to address this issue is to use ensemble-learning techniques [15]. These techniques integrate the decisions of multiple machine learning models to improve the overall detection performance. For example, the authors of [16] used a stacked ensemble learning technique to detect attacks in networks. Their results show that this technique performs much better than any of the individual machine learning techniques, namely artificial neural network (ANN), decision tree (CART), random forest, and support vector machine (SVM). The authors of [17] describe an adaptive ensemble learning model that integrates several traditional machine learning methods, including decision tree, support vector machine, k-nearest neighbors, AdaBoost, and random forest, to achieve a good detection rate.

To the best of our knowledge, no previous study has compared the performance of ensemble learning techniques using different feature selection methods for anomaly-based IDS in smart grid networks. This study fills that gap by evaluating the performance of three different ensemble-learning techniques, namely bagging, boosting, and stacking. The results of these techniques are compared with those of three traditional machine-learning methods, namely random forest (RF), Naïve Bayes (NB), and K-nearest neighbor (KNN). The CICDDoS 2019 benchmark dataset [18] is selected for learning, testing, and validation. The evaluation is performed based on several metrics, namely the probability of detection (TPR), probability of false alarm (FPR), probability of misdetection (FNR), and accuracy. In short, the contributions of this paper can be summarized as follows:

• Review of ensemble learning techniques.
• Use of two feature selection methods to identify the most important features.
• A comparative analysis of traditional machine learning and ensemble learning methods using TPR, FPR, FNR, and accuracy.

This paper's remainder is organized as follows: Section II describes the data and methodology used in this study. The results are analyzed and discussed in Section III. Finally, the conclusion and future directions are drawn in Section IV.

II. METHODOLOGY

This section discusses the data and methods used in this work, including data description, preprocessing, feature selection techniques, classification methods, and evaluation metrics.

A. Data

The dataset used in this study is CICDDoS 2019 [18], which was generated by the Canadian Institute for Cybersecurity (CIC) and the University of New Brunswick (UNB) [19, 20]. This database has more than 1 million normal flows and 30 million malicious attacks, and it is one of the newest datasets related to intrusion. CICDDoS 2019 contains training data corresponding to 13 different DDoS attacks, as shown in Table I.

In [18], the authors classified DDoS attacks into two categories: reflection-based and exploitation-based attacks. In reflection-based attacks, the malicious user's identity remains hidden through the use of standard third-party devices. Reflection-based attacks are mostly performed through application layer protocols, using transport layer protocols, including Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or a combination of both. MSSQL, SSDP, DNS, LDAP, NetBIOS, SNMP, PortMap, CharGen, NTP, and TFTP are examples of reflection-based attacks.

Exploitation-based attacks are similar to reflection-based attacks, but with slight differences. Exploitation-based attacks are triggered only by specific features of protocols and bug implementations. In this case, a malicious user initializes the attack by sending many messages to the victim. Then, when the victim replies to these messages, the malicious user refuses to respond. Ultimately, the victim is exhausted, and no connection can be continued. Examples of these attacks include SYN flood, UDP, and UDP-Lag attacks.

In this study, since it is impossible to show the results for all 13 attacks in the corresponding dataset, we present our results in terms of the two attack categories, namely reflection-based and exploitation-based attacks. These categories cover the cyber-attacks mentioned earlier, except PortScan and CharGen. These two attacks do not have enough instances in the given dataset, so we removed them from our training model.
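The category split and the attack/benign encoding described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the label spellings (e.g., "Syn", "UDPLag") mirror common CICDDoS 2019 CSV conventions but should be verified against the actual files, and a tiny in-memory table stands in for flows loaded from the dataset.

```python
import pandas as pd

# Hypothetical mapping of per-flow labels to the two attack categories
# discussed above; PortScan/CharGen-style labels fall into "other".
REFLECTION = {"MSSQL", "SSDP", "DNS", "LDAP", "NetBIOS", "SNMP", "NTP", "TFTP"}
EXPLOITATION = {"Syn", "UDP", "UDPLag"}

def categorize(label: str) -> str:
    """Map a per-flow label to benign / reflection / exploitation / other."""
    if label == "BENIGN":
        return "benign"
    if label in REFLECTION:
        return "reflection"
    if label in EXPLOITATION:
        return "exploitation"
    return "other"  # e.g., PortScan and CharGen, which are dropped

# Toy stand-in for flows read from the CICDDoS 2019 CSVs.
df = pd.DataFrame({"Label": ["BENIGN", "DNS", "Syn", "Portmap", "UDP"]})
df["category"] = df["Label"].map(categorize)
df = df[df["category"] != "other"]                   # drop under-represented attacks
df["y"] = (df["category"] != "benign").astype(int)   # DDoS = 1, benign = 0
print(df["y"].tolist())  # [0, 1, 1, 1]
```

In a real run, the same mapping would be applied after concatenating the per-attack CSV files and before the rebalancing step described in Section II.B.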


TABLE I. DISTRIBUTION OF BENIGN AND MALICIOUS FLOWS IN CICDDOS 2019.

DDoS Attack                                        | Number of Instances
Domain name system (DNS)                           | 5,071,011
Lightweight directory access protocol (LDAP)       | 2,179,930
Simple network management protocol (SNMP)          | 5,159,870
Microsoft SQL server (MSSQL)                       | 5,781,928
Network basic input/output system (NetBIOS)        | 4,092,937
Network time protocol (NTP)                        | 1,202,649
Simple service discovery protocol (SSDP)           | 2,610,611
Synchronized flooding (SYN flooding)               | 4,444,750
Trivial file transfer protocol (TFTP)              | 20,082,580
User datagram protocol (UDP)                       | 3,134,645
User datagram protocol lag (UDP-Lag)               | 366,461
Port scanning (PortScan)                           | 2,312
Character generator (CharGen)                      | 232
Total Benign                                       | 5,693,110

B. Data Preprocessing

In this work, data preprocessing consists of several steps, namely class rebalancing and sample size reduction, removing features, missing-value imputation, normalization of input data, and encoding labeled data. In the following, we discuss these steps in more detail.

1. Class Rebalancing and Sample Size Reduction

As illustrated in Table I, CICDDoS 2019 suffers from imbalanced class labels, which may impact the performance of the models. In addition, the many packets in the imbalanced dataset may require high computational power and processing time. Thus, we randomly selected an equal number of instances of the different DDoS attacks from CICDDoS 2019. Since the smallest amount of benign traffic for one DDoS attack is 29,808 instances, we chose this number for the benign traffic for every attack. Therefore, we selected 3,726 instances from each of the reflection-based attacks MSSQL, SSDP, DNS, LDAP, NetBIOS, SNMP, NTP, and TFTP, while 9,936 instances are selected for each of the exploitation-based attacks SYN, UDP, and UDP-Lag.

2. Features

The CICDDoS 2019 database consists of 88 features extracted using CICFlowMeter [21]. In this work, we reduced the feature dimensions to 37 relevant features. Removed features include Flow ID, Source IP, Destination IP, and Timestamp. For example, Source IP and Destination IP do not impact the model training. In addition, they increase overfitting issues, since both malicious and normal users can use the same IP addresses. Universal features, including the number of packets or flow characteristics, can have a significant impact on the detection accuracy of malicious traffic in the network.

In our datasets, we selected several features related to the flow (such as Flow Duration, Flow Packets, etc.) to identify malicious flows. We also selected some other features related to packets' characteristics (including Packet Length Mean, Packet Length Variance, and Packet Length Std), along with other features related to the forward and backward directions (such as Fwd Packets, Bwd Packets, etc.). We also chose some major features related to time to distinguish malicious users from normal users.

3. Missing-value Imputation

Real-world datasets suffer from missing-value issues, such as NaN values, blanks, or other types of placeholders. Training models with missing values can influence the evaluation of these models. Several techniques have been proposed to solve this issue in the last few years, including hot- and cold-deck imputation, mean imputation, extrapolation and interpolation imputation, and regression-based imputation. In this study, we applied mean imputation, in which the missing values for a feature are replaced with the mean of the values of that feature over all the available cases in the dataset.

4. Normalization of Input Data

In CICDDoS 2019, the scales of some features are quite different. Normalization (standardization) can prevent biases, which directly affect the results. Training a model without using any normalization technique may cause errors in the classification. Several methods have been used to normalize data, such as the standard scaler, min-max scaler, power transformer scaler, and unit vector scaler. In this study, we use the standard scaler as a normalization technique, which rescales features to zero mean and unit variance.

5. Encoding Labeled Data

One of the important steps in preprocessing is to encode the labeled data. In this study, we encoded all DDoS attacks as 1 and normal traffic as 0. This strategy is applied to all DDoS attacks (reflection-based and exploitation-based) in the used dataset.

C. Feature Selection Methods

Feature selection techniques play an important role in selecting the most important features. In CICDDoS 2019, we selected 37 features, as previously mentioned, to achieve optimal results. However, some of these features may still not be effective in our models' training. Several features in the training process may be correlated, increasing the model complexity and processing time and reducing detection accuracy. We used two feature selection methods to address these problems, namely Pearson's correlation coefficient and the Extra Trees classifier [8]. In the following, we briefly explain these feature selection techniques.

Pearson's correlation measures the strength of the linear relationship between two features on a scale from -1 to 1. When the result is close to 1 or -1, the features have a high positive or negative correlation, respectively. In this work, we consider two variables highly correlated when their correlation coefficient is higher than 0.9 [22, 23]. Therefore, when the correlation between two variables is greater than 0.9, one of the two features is removed from our datasets. Pearson's correlation coefficient (r) is given as:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}    (1)

where x and y are the two variables, \bar{x} and \bar{y} are their means, and n is the number of samples.
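The preprocessing and correlation-based pruning steps above can be sketched as follows, using synthetic data in place of the real flows. This is a minimal illustration, not the authors' implementation: only the mean imputation, standard scaling, and the 0.9 threshold come from the text, while the toy feature names and data are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the selected flow features; "flow_packets" is
# constructed to be almost perfectly correlated with "flow_duration".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "flow_duration": a,
    "flow_packets": 2 * a + rng.normal(scale=0.01, size=200),
    "pkt_len_mean": rng.normal(size=200),
})
df.iloc[0, 2] = np.nan                 # inject a missing value

df = df.fillna(df.mean())              # mean imputation (Section II.B.3)
df = (df - df.mean()) / df.std()       # standard scaling (Section II.B.4)

# Drop one feature from every pair whose Pearson |r| exceeds 0.9.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
df = df.drop(columns=to_drop)
print(to_drop)  # ['flow_packets']
```

Masking the upper triangle of the correlation matrix ensures each highly correlated pair contributes only one candidate for removal, matching the "one of the two features is removed" rule above.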


In this work, we also employ another feature selection technique, namely a tree-based algorithm. Based on the Gini importance, tree-based feature selection techniques rank the importance of the features and remove irrelevant ones from the dataset. The features with an importance score of less than 0.01 are discarded, and removing them does not influence the model efficiency.

D. Classification Methods

In some cases, traditional machine learning techniques do not achieve high-performance results, particularly when the data is complex, noisy, or imbalanced. One solution to enhance the performance is to employ ensemble-learning techniques. These techniques can more effectively use several machine-learning classifiers based on feature extraction of the given dataset and fuse the results with different voting techniques. In general, ensemble-learning techniques are divided into three categories: bagging-based, boosting-based, and stacking-based ensemble learning.

1. Bagging-based Ensemble Learning

Bagging-based ensemble learning, or bootstrap aggregating, is used to enhance classification models' accuracy by integrating generated training sets [24]. This method generates several sample subsets by random sampling from the training dataset. An example of a bagging-based ensemble learning technique is the randomized random forest, which generates a number of randomized decision trees from sample subsets. The final result is then obtained by voting on or averaging the classification or regression results. One major advantage of this method is that it decreases the base algorithm's variance and increases the model accuracy [25].

2. Boosting-based Ensemble Learning

Boosting is one of the most widely used ensemble learning algorithms in classification and regression problems. A boosting technique integrates a set of models (usually decision trees) with weak performance and adapts the model to achieve high accuracy. The main aim of this technique is to increase the suitability of the data model [26]. In this work, we use the Adaptive Boosting (AdaBoost) model. In this model, the initial learner is trained according to the training weights, which are updated based on the previous iterations' performance. The weight of each instance is adjusted according to the probability of its being correctly predicted and the resulting error. Each new classifier's ultimate decision for a new instance is weighted by its accuracy during the training process [27].

3. Stacking (Voting)-based Ensemble Learning

A stacking-based ensemble learning technique is a model that integrates several diverse classification algorithms [28]. This technique mainly finds the optimal solution from multiple machine learning techniques. It usually operates at two levels, 0 and 1. In level 0 (base learners), the algorithm trains diverse models and collects their prediction results, while in level 1, the model learns from the best estimates of the previous level's models [25]. The improvement brought by this technique is significant, particularly when there is diversity among the techniques at the different levels [29]. In this work, we use three different classification methods as base learners, namely K-nearest neighbor (KNN), random forest (RF), and Naïve Bayes (NB).

E. Evaluation Metrics

To evaluate the efficiency of the classification models, four metrics are used: the probability of detection (TPR), probability of misdetection (FNR), probability of false alarm (FPR), and accuracy. These metrics are estimated using the following equations:

Probability of Detection (TPR) = \frac{TP}{TP + FN} \times 100    (2)

Probability of False Alarm (FPR) = \frac{FP}{FP + TN} \times 100    (3)

Probability of Misdetection (FNR) = \frac{FN}{FN + TP} \times 100    (4)

Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100    (5)

where TP is the number of correctly predicted malicious flows, TN is the number of correctly predicted normal flows, FP is the number of normal flows incorrectly predicted as malicious, and FN is the number of malicious flows incorrectly predicted as normal.

III. RESULTS

For the classification experiments, 5-fold cross-validation and a grid search hyper-parameter technique are used to train and test the classifiers and find optimal results. With these techniques, the different classification algorithms are trained with 80% and tested with 20% of the data from our dataset. Ultimately, we used several evaluation metrics to show the importance of the selected features. This section discusses the results of the feature selection techniques as well as the evaluation results of the classification methods using the previously discussed metrics.

A. Feature Selection Results

The results of Pearson's correlation coefficient are illustrated in Fig. 1. As one can see, several features (in green) are highly correlated with a coefficient >0.9; thus, these features are removed from our dataset. As a result, ten features are considered highly correlated.

Fig. 1. Pearson's Correlation Coefficient Heatmap.

Thus, 27 features (Fwd IAT Total, Packet Length Variance, Fwd Packets, Fwd IAT Mean, Fwd IAT Std, Flow IAT Std, Flow IAT Max, Fwd IAT Min, Min Packet Length, Packet Length Mean, Bwd Packet Length Mean, Bwd Packet Length Std, Flow IAT Mean, Flow IAT Min, Bwd IAT Mean, Bwd IAT Std, Total Length of Fwd Packets, Flow Bytes, Flow Packets, Bwd Packets, Max Packet Length, Total Fwd Packets, Total Bwd Packets, Fwd Packet Length Std, Bwd Packet Length Min, Bwd IAT Total, and Bwd IAT Min) remain as selected features.

Fig. 2 provides the feature importance scores for DDoS attacks according to the Extra Trees classifier. As shown in this figure, Min Packet Length is the most important feature based on this technique; however, other features (Fwd Packets, Flow Packets, Packet Length Mean, Flow Bytes, Max Packet Length, Total Length of Fwd Packets, Total Fwd Packets, Fwd IAT Total, Flow IAT Std, Fwd Packet Length Std, Flow IAT Max, Flow IAT Mean, Fwd IAT Std, Bwd Packet Length Min, Packet Length Variance, Fwd IAT Mean, Total Backward Packets, Bwd Packet Length Mean, Bwd IAT Total, and Bwd Packets) are also considered important features.

Fig. 2. Importance of the features based on the Extra Tree classifier.

In this technique, we also removed the features that have a feature importance score of less than 0.01. These features (Flow IAT Min, Bwd Packet Length Std, Bwd IAT Std, Bwd IAT Mean, Bwd IAT Min, and Fwd IAT Min) are discarded from the final dataset. Therefore, 21 features can be considered significant for classifying reflection-based and exploitation-based attacks.

B. Classification Results

In this study, six classifiers are selected according to their characteristics, such as the ability to support multi-class classification and the evaluation results required for the model to classify the data (attacks and benign). In this work, the classification algorithms consist of several ensemble learning algorithms (bagging, boosting, and stacking) and several traditional machine learning algorithms, namely K-nearest neighbor (KNN), Naïve Bayes (NB), and random forest (RF). Since we use three different classifiers in the stacking-based ensemble learning technique, we selected these same three traditional classifiers to compare their results with the ensemble learning techniques for reflection and exploitation attacks. We also use grid search as a hyper-parameter tuning technique to ensure optimal results. For this purpose, we set n_estimators to 10, 11, and 12, weights to uniform and distance, and min_samples_leaf to 1, 2, and 3 to get the optimal results.

1. Reflection-based Attacks

The experimental results of the six different classifiers in terms of TPR, FPR, FNR, and accuracy are shown in Table II for reflection attacks. According to this table, the stacking-based classifier provides the best performance results in terms of TPR, FNR, FPR, and accuracy, followed by bagging, boosting, KNN, random forest, and Naïve Bayes. To be precise, the stacking-based technique, using Flow Bytes as the most important feature, achieves the best results for reflection-based attacks.

As shown in Table II, the TPR, FNR, FPR, and accuracy of the stacking-based classifier are 96%, 4.1%, 8.9%, and 93.4%, respectively; it is also apparent that the bagging-based classifier achieves better results in comparison with the boosting-based ensemble technique.

TABLE II. EVALUATION RESULTS FOR REFLECTION-BASED ATTACKS.

Classifier     | TPR (%) | FNR (%) | FPR (%) | Accuracy (%)
Stacking       | 96      | 4.1     | 8.9     | 93.4
Bagging        | 94.8    | 5.2     | 9.5     | 93
Boosting       | 94.04   | 5.9     | 9.3     | 92.2
Random Forest  | 92.2    | 6.7     | 8.2     | 90
Naïve Bayes    | 90      | 7       | 27.1    | 83.4
KNN            | 93.4    | 6.5     | 9       | 91.2

For example, the TPR, FNR, FPR, and accuracy of the bagging-based classifier are 94.8%, 5.2%, 9.5%, and 93%, which are considered better results than those of the boosting technique. Moreover, the boosting technique has the worst results compared to the other ensemble techniques. It is also apparent from the table that traditional machine learning

methods, random forest and KNN, provide better results attacks. For example, the boosting-based method provides
compared to the Naive Bayes technique for the reflection- 95.9% TPR, 0.9% FPR, 1.2 % FNR, and 96.7% accuracy,
based attacks. For instance, the TPR, FNR, FPR, and while the bagging-based method achieves 95% TPR, 1.2%
accuracy of KNN are 93.1%, 6.5%, 9%, and 91.9%, FPR, 1 % FNR, and 95 % accuracy.
respectively, which are relatively good results.
In exploitation attacks, the random forest method,
In addition, Table III shows that Naïve Bayes provides the compared to other traditional machine learning techniques,
worst TPR, FNR, FPR, and accuracy. As seen in this table, generally performs much better. For example, It achieves
this classifier provides a TPR of 90%, a FNR of 7%, a FPR 95% TPR, 1.2% FPR, 1.9 % FNR, and 94 % accuracy. Naïve
of 27.1%, and an accuracy of 83.4%. Consequently, a Bayes classifier performs the worst in comparison to the other
comparison of traditional machine learning techniques with classifiers, as shown in Table III. To conclude, the
ensemble techniques for reflection attacks shows that comparison of traditional machine learning techniques with
ensemble techniques overall perform better in terms of the ensemble techniques for exploitation attacks show that
four evaluation metrics. ensemble techniques overall perform better.

2. Exploitation-based Attacks In short, the key insights of these results are:


The simulation results of the selected classifiers in terms • Two feature selection techniques were used to discard
of TPR, FPR, FNR, and accuracy are illustrated in Table IV correlated and less important features from the used
for exploitation attacks. dataset.

TABLE III. EVALUATION RESULTS FOR EXPLOITATION-BASED ATTACKS.. • Ensemble-learning techniques seem to be better
classifier algorithms as they obtain acceptable results
Classifier TPR FNR FPR Accuracy
(%) (%) (%) (%) compared to the traditional machine learning
Stacking 96 1 0.7 97.3 techniques for reflection and exploitation attacks.
Bagging 95.0 1.2 1 95
Boosting 95.9 1.0 0.9 96.7 • Among ensemble-learning techniques, stacking-
Random Forest 94 1.9 1.2 94 based ensemble learning methods are the best in terms
Naïve Bayes 87 13 27 77.1 of TPR, FNR, FPR, and accuracy.
KNN 94.4 2 1.4 94.6
• Among traditional machine learning techniques, the
Naive Bayes does not provide acceptable results in
As shown in this table, the Stacking-based ensemble
terms of the four metrics.
learning technique provides the best performance results
among all the other classifiers for exploitation attacks. As can IV. CONCLUSION
be seen, this technique obtains 96% TPR, 1 % FPR, 0.7 %
In this paper, we evaluated the performance of three
FNR, and 97.3 % accuracy, respectively. In addition, other
ensemble-learning techniques, namely bagging, boosting,
ensemble techniques, namely boosting and bagging provide
and stacking-based, and three traditional machine-learning
acceptable results for exploitation attacks based on used
techniques, namely K nearest neighbor, Random Forest, and
evaluation metric. In fact, the boosting methods particularly
Naïve Bayes, for intrusion detection in smart grid networks.
perform better than the bagging techniques for exploitation

Authorized licensed use limited to: University of Adelaide. Downloaded on February 02,2024 at 15:35:53 UTC from IEEE Xplore. Restrictions apply.
135

The benchmark of CICDDoS 2019 is selected for the [19] V. Kanimozhi and T. P. Jacob, “Artificial Intelligence-based Network
Intrusion Detection with hyper-parameter optimization tuning on the realistic
assessment. We selected two feature selection techniques,
cyber dataset CSE-CIC-IDS 2018 using cloud computing,” International
Pearson's correlation coefficient and tree-based, to remove Conference on Communication and Signal Processing (ICCSP), pp. 0033–
correlated features and choose those that are most important 0036, 2019.
for reflection and exploitation attacks. Several performance [20] S. Chesney, K. Roy, and S. Khorsandroo, “Machine learning algorithms
metrics are employed, including the probability of detection, for preventing IoT cybersecurity attacks,” In Proceedings of SAI Intelligent
Systems Conference, pp. 679-686, Springer, Cham, 2021.
probability of false alarm, probability of miss detection, and
accuracy. The simulation results show that stacking-based [21] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, and A. A. Ghorbani,
“Characterization of encrypted and VPN traffic using time-related features,”
ensemble learning techniques outperform the other ensemble International conference on information systems security and privacy
learning and traditional machine learning techniques. (ICISSP), pp. 407–414, 2016.
REFERENCES

[1] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: techniques, datasets, and challenges,” Cybersecurity, vol. 2, no. 1, p. 20, 2019.
[2] D. E. Denning, “An Intrusion-Detection Model,” Transactions on Software Engineering, vol. 13, no. 2, pp. 222-232, 1987.
[3] Z. El Mrabet, H. El Ghazi, and N. Kaabouch, “A performance comparison of data mining algorithms-based intrusion detection system for smart grid,” International Conference on Electro Information Technology (EIT), pp. 298-303, 2019.
[4] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: techniques, datasets and challenges,” Cybersecurity, vol. 2, no. 1, p. 20, 2019.
[5] M. A. Faisal, Z. Aung, J. R. Williams, and A. Sanchez, “Data-stream based intrusion detection system for advanced metering infrastructure in smart grid: A feasibility study,” Systems, vol. 9, no. 1, pp. 31–44, 2015.
[6] J. L. Puga, M. Krzywinski, and N. Altman, “Points of Significance: Bayes' theorem,” Nature Methods, 2015.
[7] E. Anthi, L. Williams, M. Słowińska, G. Theodorakopoulos, and P. Burnap, “A supervised intrusion detection system for smart home IoT devices,” Internet of Things Journal, vol. 6, no. 5, pp. 9042-9053, 2019.
[8] S. Chesney, K. Roy, and S. Khorsandroo, “Machine Learning Algorithms for Preventing IoT Cybersecurity Attacks,” Intelligent Systems Conference, pp. 679-686, 2020.
[9] M. Attia, S. M. Senouci, H. Sedjelmaci, E. H. Aglzim, and D. Chrenko, “An efficient Intrusion Detection System against cyber-physical attacks in the smart grid,” Computers and Electrical Engineering, vol. 68, pp. 499-512, 2018.
[10] S. Ahmed, Y. Lee, S. H. Hyun, and I. Koo, “Unsupervised Machine Learning-Based Detection of Covert Data Integrity Assault in Smart Grid Networks Utilizing Isolation Forest,” Transactions on Information Forensics and Security, vol. 14, no. 10, pp. 2765-2777, 2019.
[11] Y. Jia, F. Zhong, A. Alrawais, B. Gong, and X. Cheng, “FlowGuard: An Intelligent Edge Defense Mechanism Against IoT DDoS Attacks,” Internet of Things Journal, 2020.
[12] O. Brun, Y. Yin, E. Gelenbe, Y. M. Kadioglu, J. Augusto-Gonzalez, and M. Ramos, “Deep learning with dense random neural networks for detecting attacks against IoT-connected home environments,” International ISCIS Security Workshop, pp. 79-89, 2018.
[13] Y. Meidan, M. Bohadana, Y. Mathov, Y. Mirsky, A. Shabtai, D. Breitenbacher, and Y. Elovici, “N-BaIoT: Network-based detection of IoT botnet attacks using deep autoencoders,” Pervasive Computing, vol. 17, no. 3, pp. 12–22, 2018.
[14] R. Stephen and L. Arockiam, “Intrusion Detection System to Detect Sinkhole Attack on RPL Protocol in Internet of Things,” International Journal of Electrical Electronics and Computer Science, vol. 4, no. 4, 2017.
[15] Y. Liu and X. Yao, “Ensemble learning via negative correlation,” Neural Networks, vol. 12, no. 10, pp. 1399-1404, 1999.
[16] H. Rajadurai and U. D. Gandhi, “A stacked ensemble learning model for intrusion detection in wireless network,” Neural Computing and Applications, 2020.
[17] X. Gao, C. Shan, C. Hu, Z. Niu, and Z. Liu, “An Adaptive Ensemble Machine Learning Model for Intrusion Detection,” Access, vol. 7, pp. 82512-82521, 2019.
[18] I. Sharafaldin, A. H. Lashkari, S. Hakak, and A. A. Ghorbani, “Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy,” International Carnahan Conference on Security Technology, 2019.
[19] V. Kanimozhi and T. P. Jacob, “Artificial Intelligence-based Network Intrusion Detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS 2018 using cloud computing,” International Conference on Communication and Signal Processing (ICCSP), pp. 0033–0036, 2019.
[20] S. Chesney, K. Roy, and S. Khorsandroo, “Machine learning algorithms for preventing IoT cybersecurity attacks,” Proceedings of SAI Intelligent Systems Conference, pp. 679-686, Springer, Cham, 2021.
[21] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, and A. A. Ghorbani, “Characterization of encrypted and VPN traffic using time-related features,” International Conference on Information Systems Security and Privacy (ICISSP), pp. 407–414, 2016.
[22] A. Thakkar and R. Lohiya, “Role of the swarm and evolutionary algorithms for intrusion detection system: A survey,” Swarm and Evolutionary Computation, vol. 53, p. 100631, 2020.
[23] D. E. Hinkle, W. Wiersma, and S. G. Jurs, “Applied statistics for the behavioral sciences,” Houghton Mifflin College Division, vol. 663, 2003.
[24] J. Dou, A. P. Yunus, D. T. Bui, A. Merghadi, M. Sahana, Z. Zhu, C. W. Chen, Z. Han, and B. T. Pham, “Improved landslide assessment using support vector machine with bagging, boosting, and stacking ensemble machine learning framework in a mountainous watershed, Japan,” Landslides, vol. 17, no. 3, pp. 641-658, 2020.
[25] M. H. D. M. Ribeiro and L. dos Santos Coelho, “Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series,” Applied Soft Computing, vol. 86, p. 105837, 2020.
[26] R. E. Schapire, “The strength of weak learnability,” Machine Learning, vol. 5, no. 2, pp. 197–227, 1990.
[27] Y. Freund and R. E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, 1997.
[28] S. González, S. García, J. Del Ser, L. Rokach, and F. Herrera, “A practical tutorial on bagging and boosting based ensembles for machine learning: Algorithms, software tools, performance study, practical perspectives and opportunities,” Information Fusion, vol. 64, pp. 205-237, 2020.
[29] J. Mendes-Moreira, C. Soares, A. M. Jorge, and J. F. De Sousa, “Ensemble approaches for regression: A survey,” ACM Computing Surveys (CSUR), vol. 45, no. 1, pp. 1-40, 2012.
