
Accepted by 2024 IEEE International Conference on Communications (ICC), ©2023 IEEE

X-CBA: Explainability Aided CatBoosted Anomal-E for Intrusion Detection System

Kiymet Kaya 1, Elif Ak 1,3, Sumeyye Bas 2,3, Berk Canberk 4, Sule Gunduz Oguducu 2
1 Istanbul Technical University, Department of Computer Engineering, Istanbul, Turkey
2 Istanbul Technical University, Department of Artificial Intelligence and Data Engineering, Istanbul, Turkey
3 BTS Group, Istanbul, Turkey
4 School of Computing, Engineering and Built Environment, Edinburgh Napier University, Edinburgh
Email: {kayak16, akeli, bass20}@itu.edu.tr, B.Canberk@napier.ac.uk, sgunduz@itu.edu.tr

Abstract—The effectiveness of Intrusion Detection Systems (IDS) is critical in an era where cyber threats are becoming increasingly complex. Machine learning (ML) and deep learning (DL) models provide an efficient and accurate solution for identifying attacks and anomalies in computer networks. However, using ML and DL models in IDS has led to a trust deficit due to their non-transparent decision-making. This transparency gap in IDS research is significant, affecting confidence and accountability. To address this, the paper introduces a novel Explainable IDS approach, called X-CBA, that leverages the structural advantages of Graph Neural Networks (GNNs) to effectively process network traffic data, while also adapting a new Explainable AI (XAI) methodology. Unlike most GNN-based IDS that depend on labeled network traffic and node features, thereby overlooking critical packet-level information, our approach leverages a broader range of traffic data through network flows, including edge attributes, to improve detection capabilities and adapt to novel threats. Through empirical testing, we establish that our approach not only achieves high accuracy, 99.47%, in threat detection but also advances the field by providing clear, actionable explanations of its analytical outcomes. This research also aims to bridge the current gap and facilitate the broader integration of ML/DL technologies in cybersecurity defenses by offering a local and global explainability solution that is both precise and interpretable.

Index Terms—network intrusion detection system, graph neural networks, explainable artificial intelligence, self-supervised learning, edge embedding, CatBoost
I. INTRODUCTION

In the contemporary digital environment, the continued increase in sophisticated cyber threats still leaves security mechanisms inadequate. Gartner forecasts that by 2024, a minimum of 50% of organizations will adopt Machine Learning (ML) or Deep Learning (DL) aided Security Operations Centers (SOCs) for faster cyberattack detection, a shift that is already underway with substantial investments from leading firms in AI for enhanced security. However, it is also well known that there is notable hesitance among enterprises to adopt ML/DL-augmented network Intrusion Detection Systems (IDS) due to their opaque black-box decision-making processes, which are perceived as complicated, unpredictable, and unreliable [1]. Addressing these concerns, the current IDS literature mainly prioritizes (i) developing advanced ML/DL models for sophisticated attack detection [2], [3], and (ii) employing explainable AI to demystify ML/DL decision-making [1]. For the first aspect, recent studies indicate that Graph Neural Networks (GNNs), a subset of DL models, are particularly promising for IDS [4], since the natural structure of a network IDS is graph-based: nodes are the network devices (e.g., routers, hosts) and edges are connections, packet transfers, or network flows between network devices. In this way, GNNs reveal the impact of malicious activities on the network's topology and leverage the neighboring information among network entities for improved detection. Moreover, current studies show that using network flows rather than individual packet-based monitoring is more suitable in IDS studies [2], since aggregating network features helps to reveal diverse and heterogeneous characteristics of cyberattacks. Expanding on the second aspect, there is a concerted academic effort to incorporate Explainable AI (XAI) models into various ML/DL frameworks to provide local and global explanations of their operations. This endeavor aims to expose the rationale behind model predictions, whether by clarifying the significance of specific data points (local explainability) or by shedding light on the model's overall behavior (global explainability) [1]. Considering the research efforts on IDS and the security needs of organizations, it is evident that an advanced GNN-based methodology, combined with an appropriate XAI framework, would bridge the existing divide. Nevertheless, there is a notable scarcity in the literature, with few studies, such as the one by Baahmed et al. [5], implementing an advanced GNN model alongside an XAI tool. To fill this gap, the proposed X-CBA enhances the attack detection results of state-of-the-art IDS, and it demonstrates that PGExplainer [6] provides superior performance in explaining the operational dynamics of sophisticated GNNs for network flows using local and global explainability. The main contributions of our study can be summarized as follows:

• Flow-based Network Data and Graph Edge Embeddings: The proposed intrusion detection system (IDS) approach uses network flows with many critical network metrics, representing them as graph edge embeddings. Using flow-based network data with well-modeled graph embeddings is particularly effective in detecting a range of cyber threats, including BruteForce, DDoS, DoS, and sophisticated attacks like Bot, by uncovering distinctive threat patterns.
• Enhanced Detection Performance with GNN and CatBoost Integration: Our novel method integrates a GNN-based detection pipeline with the CatBoost classifier, achieving higher accuracy, F1 score, and detection rates compared to existing state-of-the-art solutions in intrusion detection.
• Advanced Explainability with PGExplainer Implementation: The system implements PGExplainer, offering both local and global insights into the decision-making processes of the GNN-based IDS. This approach outperforms baseline explainability models, particularly due to its ability to operate in inductive settings, offer a non-black-box evaluation approach, and explain multiple instances collectively, making it highly suitable for flow-based IDS network data.

The rest of the paper is organized as follows. Section II presents related works. Section III gives details of our proposed model and background methodology. Section IV presents baseline models and experimental results. Lastly, Section V concludes the paper.
[Fig. 1 is a pipeline diagram organized into Pre-Modeling, Modeling, and Post-Modeling stages: Step 1, flow-based network data capturing; Step 2, data preprocessing of flow records (IPV4_SRC_ADDR, IPV4_DST_ADDR, L4_SRC_PORT, IN_PKTS, OUT_PKTS, ..., LABEL); Step 3, time-series IDS dataset; Step 4, graph construction G = (V, E, e_uv); Step 5, self-supervised edge embedding with DGI (Step 5a: encoding with the E-GraphSAGE encoder E; Step 5b: decoding with the corruption function C, read-out R, and discriminator D), which trains the embedding model; Step 6, intrusion detection with CatBoost (benign vs. attack); Step 7, explainability with PGExplainer, which feeds revisions back into the trained embedding model.]

Fig. 1: X-CBA: Explainability Aided CatBoosted Anomal-E Intrusion Detection System based on the DARPA [7] Recommendation

II. LITERATURE REVIEW

The benefit of GNNs in intrusion detection is that they take into account the properties of the computer network topology, such as relationships between nodes. However, this also increases the complexity of the prediction model and can be computationally costly in large graphs and complex network topologies [8], [9]. Computational speed is important, especially in large networks and real-time applications like intrusion detection. For this reason, many current works [10] utilize a less complex GNN model for representation learning and predict network anomalies mainly with tree-based ensemble models. Moreover, there are still some works [11], [12] that only use tree-based eXtreme Gradient Boosting (XGBoost) and CatBoost methods for forecasting, since it is hard to learn a powerful GNN when computing resources are limited [8].

Among the studies on intrusion detection on NSL-KDD data, in [13] the prediction model, a 3-layer neural network (NN), was explained with SHAP and LIME, and in [14] anomalies were predicted with RF and the explanations were conducted with SHAP. Patil et al. performed intrusion detection with Random Forest models and used LIME for model explanation [15]. When reviewing the literature on explainability in intrusion detection models, most studies favor non-GNN-based explanation methods like SHAP and LIME, despite employing GNN-based prediction models. Only a few studies [5], [16] include GNN-based models and explanations. In both of these studies, the authors only observed the performance of GNNExplainer in finding the important components of the network, whereas comparative results across XAI models were not reported. To this end, this study focuses on improving the predictive accuracy of intrusion detection models and conducting a comparative analysis of XAI methods on computer network data, providing additional insights to understand and interpret the conclusions of IDS.

III. METHODOLOGY

The overview of the methodology is organized in two steps. First, the background methods E-GraphSAGE, DGI, CatBoost, and PGExplainer, shown in red text in Fig. 1, are presented to make the proposed model easy to understand. Second, the flow of X-CBA is presented, as illustrated in Fig. 1.

A. Background Methods

1) Edge Embedding: E-GraphSAGE: The E-GraphSAGE [17] algorithm requires a graph input G(V, E) consisting of nodes (V) and edges (E). As an extended version of the GraphSAGE algorithm, it allows the use of edge features ({e_uv, ∀uv ∈ E}) between each node u and node v, in addition to node features (x_v), for message propagation. By utilising the edge attributes and graph topology, E-GraphSAGE generates output containing a new vectorial representation, a.k.a. embedding, for each node (z_v^K) and each edge (z_uv^K). Flow-based NIDS datasets consist only of flow (edge) features rather than node features. Therefore, the feature vector of each node is set as x_v = {1, . . . , 1}, where the dimension of this all-ones constant vector is the same as the number of edge features. E-GraphSAGE samples a fixed number of neighbors k for each node in the graph data and aggregates information from the sampled neighbors to create an embedding for the destination node and edge embeddings. The initial value at k=0 (h_v^0) is the feature vector (x_v) of that node, for all nodes.

$$h^k_{N(v)} \leftarrow \mathrm{AGG}_k\big(\{h^{k-1}_u \,\|\, e^{k-1}_{uv},\ \forall u \in N(v),\ uv \in E\}\big); \qquad h^k_v \leftarrow \sigma\big(W^k \cdot \mathrm{CONCAT}(h^{k-1}_v,\, h^k_{N(v)})\big) \tag{1}$$

In each iteration from k=1 to K, for all nodes in the node set V, the node v's neighborhood is first sampled and the information from the sampled nodes is collected into a single vector. Next, as in (1), for all k and v values, the aggregated information h^k_{N(v)} at the k-th layer and at node v, based on the sampled neighborhood N(v), is calculated with the help of the neighborhood aggregator function AGG_k. Here, e^{k-1}_{uv} are the features of edge uv from N(v), the sampled neighborhood of node v, at layer k-1.

$$z^K_{uv} \leftarrow \mathrm{CONCAT}\big(z^K_u,\, z^K_v\big) \tag{2}$$
The aggregated embeddings of the sampled neighborhood h^k_{N(v)} are then concatenated with the node's embedding from the previous layer h^{k-1}_v. Here, the critical difference from GraphSAGE is the inclusion of the edge features. The final node embeddings are assigned at depth K, and the edge embedding z^K_{uv} for each edge uv is calculated as the concatenation of the node embeddings of nodes u and v, as in (2).
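To make the message-passing scheme concrete, the following minimal PyTorch sketch implements one E-GraphSAGE-style layer following (1) and (2). It is an illustrative reimplementation under assumptions, not the authors' code: the mean aggregator and the tensor names (src, dst, edge_feat) are our choices, and the official implementation is linked in the paper's repository.

```python
import torch
import torch.nn as nn

class EGraphSAGELayer(nn.Module):
    """One E-GraphSAGE layer: aggregates neighbor states together with
    edge features (Eq. 1), sketched here with a mean aggregator."""
    def __init__(self, node_dim: int, edge_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(node_dim + node_dim + edge_dim, out_dim)

    def forward(self, h, edge_index, edge_feat):
        # h: (num_nodes, node_dim); edge_index: (2, num_edges) rows (src, dst)
        src, dst = edge_index
        # Message for edge (u, v): concat(h_u, e_uv)
        msg = torch.cat([h[src], edge_feat], dim=1)
        # Mean-aggregate messages per destination node
        agg = torch.zeros(h.size(0), msg.size(1), device=h.device)
        agg.index_add_(0, dst, msg)
        deg = torch.zeros(h.size(0), device=h.device).index_add_(
            0, dst, torch.ones(dst.size(0), device=h.device)).clamp(min=1)
        agg = agg / deg.unsqueeze(1)
        # Update: h_v^k = sigma(W^k . CONCAT(h_v^{k-1}, h_N(v)^k))  (Eq. 1)
        return torch.relu(self.W(torch.cat([h, agg], dim=1)))

def edge_embeddings(h, edge_index):
    """Edge embedding as concatenation of endpoint embeddings (Eq. 2)."""
    src, dst = edge_index
    return torch.cat([h[src], h[dst]], dim=1)
```

Since flow-based NIDS data has no node features, h can be initialized as an all-ones tensor of width equal to the number of edge features, matching the x_v = {1, . . . , 1} convention above.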
2) Self-supervised Learning: Deep Graph Infomax: Self-supervised learning aims to learn the underlying features of the data by creating its own pseudo-labels from unlabeled data. The pseudo-labels are labels automatically produced by the self-supervised model, not the ground truths of the data. Learning from these pseudo-labels contributes to the creation of good representations of the data and can improve the prediction performance of the supervised learning model used later. DGI [18] provides self-supervised learning by maximizing local-global mutual information, and the trained encoder of DGI can be reused to generate edge/node embeddings for subsequent tasks such as edge/node classification. The details of the DGI model, presented in Step 5 of Fig. 1, are as follows:

• A corruption function C (i.e., a random permutation of the input graph node features, which adds or removes nodes from the adjacency matrix A) is used to generate a negative (corrupted) graph representation H = C(G) from the input graph (G).
• An encoder E, which can be any existing GNN such as E-GraphSAGE, generates edge embeddings both for the input graph (G) and the corrupted graph (H).
• The readout function R, which is at the core of DGI, aggregates edge embeddings by taking their average and then processes the result through a sigmoid function to calculate a global graph summary s (a single embedding vector of the entire graph).
• The discriminator D then assesses these edge embeddings (a real edge embedding z_uv(i) and a corrupted edge embedding z̃_uv(j)) using the global summary s as a guide.

$$\mathcal{L} = \frac{1}{P+S}\Bigg(\sum_{i=1}^{P}\mathbb{E}_{(X,A)}\Big[\log D\big(z_{uv(i)},\, s\big)\Big] + \sum_{j=1}^{S}\mathbb{E}_{(\widetilde{X},\widetilde{A})}\Big[\log\Big(1 - D\big(\widetilde{z}_{uv(j)},\, s\big)\Big)\Big]\Bigg) \tag{3}$$

• Comparisons by the discriminator D provide a score between 0 and 1, and the binary cross-entropy loss objective in (3) is used to discriminate the embedding of a real edge from that of a corrupted edge, thereby training the encoder E.
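A minimal sketch of this objective over edge embeddings is given below, reusing EGraphSAGELayer and edge_embeddings from the earlier sketch. The bilinear discriminator follows the original DGI formulation; the corruption here shuffles edge features, a choice we assume is suited to flow graphs whose node features are constant, and all names are illustrative.

```python
import torch
import torch.nn as nn

class EdgeDGI(nn.Module):
    """DGI over edge embeddings: encoder E, corruption C, readout R,
    bilinear discriminator D, trained with the BCE objective of Eq. (3)."""
    def __init__(self, encoder: nn.Module, emb_dim: int):
        super().__init__()
        self.encoder = encoder                        # e.g., 1-layer E-GraphSAGE
        self.disc = nn.Bilinear(emb_dim, emb_dim, 1)  # logits of D(z, s)

    def forward(self, h, edge_index, edge_feat):
        # Positive edge embeddings from the real graph G
        z_pos = edge_embeddings(self.encoder(h, edge_index, edge_feat), edge_index)
        # Corruption C: shuffle edge features to build the negative graph H
        perm = torch.randperm(edge_feat.size(0))
        z_neg = edge_embeddings(self.encoder(h, edge_index, edge_feat[perm]), edge_index)
        # Readout R: mean of real edge embeddings squashed by a sigmoid
        s = torch.sigmoid(z_pos.mean(dim=0, keepdim=True))
        return z_pos, z_neg, s

    def loss(self, z_pos, z_neg, s):
        # BCE objective of Eq. (3): real edges -> 1, corrupted edges -> 0;
        # using logits here is equivalent to applying sigmoid inside D.
        pos_logits = self.disc(z_pos, s.expand_as(z_pos)).squeeze(-1)
        neg_logits = self.disc(z_neg, s.expand_as(z_neg)).squeeze(-1)
        bce = nn.functional.binary_cross_entropy_with_logits
        return bce(pos_logits, torch.ones_like(pos_logits)) + \
               bce(neg_logits, torch.zeros_like(neg_logits))
```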
3) Intrusion Detection: CatBoost Classifier: CatBoost is a gradient-boosting ML algorithm known for its high predictive accuracy and speed. It employs techniques such as ordered boosting and oblivious trees to handle various data types effectively and mitigate overfitting.
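As a minimal illustration of how such a classifier consumes the learned edge embeddings, the following sketch uses the public catboost API; the variable names and hyperparameter values are placeholders rather than the paper's exact configuration (the tuned values are reported in Section IV).

```python
from catboost import CatBoostClassifier

# Assumed inputs:
#   edge_emb_train / edge_emb_test: (num_flows, emb_dim) DGI edge embeddings
#   labels_train: (num_flows,) binary labels, 1 = attack, 0 = benign
model = CatBoostClassifier(depth=8, iterations=500, verbose=False)
model.fit(edge_emb_train, labels_train)
pred = model.predict(edge_emb_test)
```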
4) Explainability: PGExplainer: PGExplainer [6] offers explanations at a global level across numerous instances by training a shared explanation network on the node and graph representations of the GNN model. It seeks to locate a crucial subgraph containing the components most important to the predictions of the given trained GNN model; it removes nodes and attributes and analyzes their effects on the output of the GNN model. Removing a node also removes the edges incident to that node from the graph, which leads to the identification of crucial pathways. PGExplainer divides the input graph G into two subgraphs, as in (4), where G_S represents the crucial subgraph and ΔG comprises the unnecessary edges.

$$G = G_S + \Delta G \tag{4}$$

$$\max_{G_S} MI(Y, G_S) = H(Y) - H(Y \mid G = G_S) \tag{5}$$

PGExplainer determines G_S by maximizing the mutual information MI between the predictions Y and this underlying structure, with the help of the entropy term H, as in (5). The goal of the PGExplainer approach is to specifically explain the graph topologies found in GNNs and identify G_S such that the conditional entropy is minimized. However, because there are so many candidate values for G_S, direct optimization is infeasible. Therefore, a relaxation approach is used: assuming G_S follows a Gilbert random graph distribution, selections of edges from the original input graph G are conditionally independent of each other. With this relaxation, PGExplainer can recast the objective as an expectation, making optimization more manageable.
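A compact sketch of the relaxed objective in (5) is given below, assuming a PyTorch edge-mask parameterization: a shared MLP maps each edge embedding to a mask logit, masked predictions are drawn from the trained GNN, and the explainer is trained to keep the prediction distribution close to the original, a common cross-entropy surrogate of the MI objective. This is a schematic outline, not the reference implementation; the regularizer coefficients and names are illustrative.

```python
import torch
import torch.nn as nn

class EdgeMaskExplainer(nn.Module):
    """PGExplainer-style parameterized explainer: a shared MLP scores every
    edge, and sigmoid scores act as a soft mask over the input graph."""
    def __init__(self, edge_emb_dim: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(edge_emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, edge_emb):
        # One importance score in (0, 1) per edge
        return torch.sigmoid(self.mlp(edge_emb)).squeeze(-1)

def explainer_loss(masked_logits, orig_pred, mask, size_coef=0.005, ent_coef=0.1):
    """Cross-entropy surrogate of Eq. (5): the prediction on the masked graph
    should match the original prediction while using as few edges as possible.
    masked_logits: (batch, classes); orig_pred: (batch,) class indices."""
    ce = nn.functional.cross_entropy(masked_logits, orig_pred)
    size_reg = size_coef * mask.sum()                   # sparsity budget
    ent = -(mask * (mask + 1e-8).log()                  # push mask toward 0/1
            + (1 - mask) * (1 - mask + 1e-8).log()).mean()
    return ce + size_reg + ent_coef * ent
```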
B. X-CBA: Explainability Aided CatBoosted Anomal-E

The flow of the Explainability Aided CatBoosted Anomal-E framework is presented in Fig. 1. In the proposed framework, tabular intrusion detection data is transformed into attributed multigraph data after the required preprocessing steps are completed. Here, routers represent nodes and data flows represent edges. The self-supervised DGI model is fitted with an E-GraphSAGE encoder, and model training is performed on the multigraph intrusion detection data. Gradient descent optimization, powered by the Adam optimizer and the binary cross-entropy (BCE) loss in (3), is used to iteratively optimize the D, R, and E of DGI. The encoder of the DGI uses a mean aggregation function on a 1-layer E-GraphSAGE model. E-GraphSAGE uses a hidden layer size of 256 units, and ReLU is used as the activation function. For the generation of the global graph summary, we average edge embeddings and pass them through a sigmoid function. BCE is used as the loss function, and gradient descent is used for backpropagation with the Adam optimizer at a learning rate of 0.001. With the trained encoder, edge embeddings are obtained. Intrusion detection is then performed with the CatBoost classifier using the edge embeddings, which outperform the meta-features of the data. The prediction results are explained with the help of PGExplainer, an XAI method designed specifically for graph data. In this way, the edges (data flows) that contribute the most to the prediction and are critical in the network topology are identified. The implementation details and the code repository of the proposed IDS are available online.¹

¹ https://github.com/kiymetkaya/xai-catboosted-anomale
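Putting the pieces together, the following end-to-end sketch mirrors the training recipe just described (1-layer E-GraphSAGE encoder with 256 hidden units, Adam at learning rate 0.001, BCE loss, then CatBoost on the frozen embeddings), reusing the classes from the earlier sketches. It is a schematic outline of the published pipeline under our assumptions, not the repository code; the epoch count in particular is illustrative.

```python
import torch
from catboost import CatBoostClassifier

# Assumed inputs: edge_index (2, E) long tensor, edge_feat (E, F) float
# tensor, labels (E,) tensor with 1 = attack, 0 = benign.
num_nodes = int(edge_index.max()) + 1
h0 = torch.ones(num_nodes, edge_feat.size(1))   # all-ones node features

encoder = EGraphSAGELayer(node_dim=edge_feat.size(1),
                          edge_dim=edge_feat.size(1), out_dim=256)
dgi = EdgeDGI(encoder, emb_dim=2 * 256)
opt = torch.optim.Adam(dgi.parameters(), lr=0.001)

for epoch in range(200):                        # epoch count is illustrative
    opt.zero_grad()
    z_pos, z_neg, s = dgi(h0, edge_index, edge_feat)
    loss = dgi.loss(z_pos, z_neg, s)
    loss.backward()
    opt.step()

# Frozen edge embeddings feed the CatBoost detector (Step 6 of Fig. 1)
with torch.no_grad():
    z, _, _ = dgi(h0, edge_index, edge_feat)
model = CatBoostClassifier(depth=8, verbose=False).fit(z.numpy(), labels.numpy())
```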
IV. PERFORMANCE EVALUATION

We first evaluate the proposed X-CBA approach in terms of intrusion detection performance, comparing it with the baselines in Section IV-A. We then provide the explainability performance analysis against the state-of-the-art methods in Section IV-B.

A. X-CBA Intrusion Detection Evaluation

For the intrusion detection experiments, we have chosen baselines from two categories: (i) Step 5 in Modeling and (ii) Step 6 in Post-Modeling, as shown in Fig. 1. Here, the Anomal-E, DGI, and GraphSAGE models are considered edge embedding baselines for the evaluation of the X-CBA model in the Step 5 category. For the second category, from Step 6, the classifier models used as baselines are as follows:

Principal Component Analysis (PCA) is adapted for intrusion detection, using a correlation matrix from "benign" samples to identify attacks based on their deviation from the benign correlation.

Isolation Forest (IF) utilizes tree structures to isolate attacks: samples closer to the root of a tree are identified as attacks, while deeper ones in the tree are considered "benign", creating an ensemble of trees for efficient and effective intrusion detection.

Clustering-Based Local Outlier Factor (CBLOF) treats intrusion detection as a clustering-based problem, assigning outlier factors based on cluster size and the distance between a sample and its nearest cluster to determine whether it is an attack.

Histogram-Based Outlier Score (HBOS) employs a histogram-based approach: it constructs univariate histograms for each feature and calculates bin densities, using these density values to determine the HBOS score and classify samples as attacks or not.

AutoEncoder (AE) is an unsupervised deep learning model consisting of four linear layers, the first two used for encoding and the last two for decoding, which respectively transform {number of features, 16, 8, 16, number of features}. After each linear layer, ReLU is used as the activation function.
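A minimal PyTorch sketch of this baseline autoencoder, following the stated architecture ({n_features, 16, 8, 16, n_features} with ReLU after each linear layer), is shown below; the reconstruction-error thresholding rule in the comment is our assumption, since the paper does not state it.

```python
import torch.nn as nn

class BaselineAE(nn.Module):
    """Four linear layers: n_features -> 16 -> 8 -> 16 -> n_features,
    with ReLU after each layer, as described for the AE baseline."""
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16), nn.ReLU(),   # encoder
            nn.Linear(16, 8), nn.ReLU(),
            nn.Linear(8, 16), nn.ReLU(),            # decoder
            nn.Linear(16, n_features), nn.ReLU())

    def forward(self, x):
        return self.net(x)

# Flows whose reconstruction error exceeds a threshold are flagged as
# attacks (thresholding rule assumed, e.g., a high percentile of the
# reconstruction errors observed on benign training flows).
```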
Random Forest Classifier (RFC) is a bootstrap ensemble model. It builds several decision trees on data samples and then selects the best solution by voting. RFC is chosen over a single decision tree because it reduces overfitting and has lower variance.

XGBoost Classifier (XGBC) stands for eXtreme Gradient Boosting. XGBoost is an implementation of gradient boosting whose key idea is improving speed and performance through techniques such as regularization and sparse-data handling.

LightGBM Classifier is a histogram-based gradient boosting algorithm that reduces computational cost by converting continuous variables into discrete ones. Since the training time of decision trees is directly proportional to the number of computations and hence the number of splits, LightGBM provides shorter model training times and efficient resource utilization.

TABLE I: Comparative Performance Evaluation of Edge Embedding Baselines

                 NF-UNSW-NB15-v2               NF-CSE-CIC-IDS2018-v2         Average across datasets
             F1-Macro   Acc      DR        F1-Macro   Acc      DR        F1-Macro
Anomal-E      92.35%   98.66%   98.77%     94.38%    97.80%   82.67%     93.36%
DGI           48.99%   96.02%    0.00%     46.82%    88.03%    0.00%     47.90%
GraphSAGE     54.77%   88.60%   26.63%     94.61%    97.90%   82.60%     74.69%

Table I summarizes the prediction results of Anomal-E, DGI, and GraphSAGE, each paired with the best corresponding baseline classifier model from the list above. The chosen performance evaluation metrics are the 'Macro Average F1-score' (F1-Macro), which is the average F1 score over all classes, 'Accuracy' (Acc), and 'Detection Rate' (DR, also known as Recall), which measures the percentage of actual attack observations that the model correctly classified. As can be seen from the results in Table I, the superiority of Anomal-E over the state-of-the-art methods DGI and GraphSAGE has been demonstrated by the experiments [10] on the NF-UNSW-NB15-v2 and NF-CSE-CIC-IDS2018-v2 datasets.
After these preliminary experiments, in the second step, the proposed X-CBA approach is compared with Anomal-E [10], since Anomal-E is the strongest edge embedding approach among the state-of-the-art, as shown in Table I. For further detailed analysis, X-CBA and Anomal-E are evaluated with the various classifier baselines located in Step 6 of the Post-Modeling stage. According to the results in Table II, where the prediction results are presented in ascending order of F1-Macro, the proposed X-CBA produced more accurate results than Anomal-E in both the unsupervised (with AutoEncoder) and supervised (with CatBoost) settings among the baseline models. In other words, Anomal-E utilizes edge embeddings and predicts attacks using the IF, HBOS, PCA, and CBLOF methods together; the reason for this is that no one of these methods alone produces the best results in terms of all performance evaluation metrics.

TABLE II: Network Intrusion Detection Prediction Results (NF-CSE-CIC-IDS2018-v2)

                        F1-Macro     Acc       DR
Anomal-E - IF            81.11%    89.79%    91.84%
Anomal-E - HBOS          91.89%    96.86%    77.79%
Anomal-E - PCA           92.57%    97.11%    79.16%
Anomal-E - CBLOF         94.38%    97.80%    82.67%
X-CBA - RFC              96.53%    98.61%    88.50%
X-CBA - AutoEncoder      97.76%    99.13%    94.92%
X-CBA - XGBC             98.45%    99.36%    94.85%
X-CBA - LightGBM         98.56%    99.40%    95.19%
X-CBA - CatBoost         98.73%    99.47%    95.74%

On the other hand, when we utilize only CatBoost in the prediction phase of the X-CBA approach, we obtain the most accurate predictions with a single boosting model, as seen in the last row of Table II. Table II presents the results obtained on the NF-CSE-CIC-IDS2018-v2 dataset with the ETC, RFC, AE, XGBC, LightGBM, and CatBoost ensemble models against the prediction algorithms in Anomal-E. The NF-CSE-CIC-IDS2018-v2 dataset contains 18,893,708 network flows: 16,635,567 (88.05%) benign samples and 2,258,141 (11.95%) attack samples. For a fair comparison, we follow the same preprocessing steps, training procedures, and train-test split (70% training, 30% testing) as in Anomal-E [10]. We then present our results using the same test set for consistency. Moreover, the best CatBoost model was determined with Scikit-learn GridSearch 5-fold cross-validation, just like the other ensemble models. Accordingly, the best hyperparameters for CatBoost are: "min. samples split": 4, "min. samples leaf": 4, "max. depth": 8.
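As an illustration of this tuning step, the sketch below runs scikit-learn's GridSearchCV over a CatBoost model. Note that the parameter names reported above follow the shared scikit-learn-style grid; the CatBoost-native equivalents used here (depth, min_data_in_leaf, which requires a Depthwise grow policy) and the grid values are our assumptions.

```python
from catboost import CatBoostClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "depth": [4, 6, 8],              # best reported: 8 ("max. depth")
    "min_data_in_leaf": [2, 4, 8],   # best reported: 4 ("min. samples leaf")
}
search = GridSearchCV(
    CatBoostClassifier(grow_policy="Depthwise", verbose=False),
    param_grid, cv=5, scoring="f1_macro")
search.fit(z_train, y_train)         # z_train: DGI edge embeddings (assumed)
print(search.best_params_)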
Anomal-E in both unsupervised (with AutoEncoder) and supervised Moreover, we evaluated the performance of network-flow impor-
(with CatBoost) approaches among baseline models. In other words, tance (a.k.a edge importance) with two metrics [26] as used in
Anomal-E utilizes edge embeddings and predicts attacks using IF, edge explainability performance evaluation: Sparsity metric in 6 and
TABLE III: XAI Methods Comparison in terms of "GNN Design", "Black Box", and "Target"

Method                      GNN Design   Black Box   Target
CAM, Grad-CAM [19]              -            -         N
LRP [20]                        -            -         N
SHAP [21]                       -            +         NF
GraphLIME [22]                  +            +         NF
PGM-Explainer [23]              +            +         N
ZORRO [24]                      +            +         N
GNNExplainer [25]               +            +         E
X-CBA - PGExplainer [6]         +            -         E

Moreover, we evaluate the performance of network-flow importance (a.k.a. edge importance) with the two metrics [26] used in edge explainability performance evaluation: the Sparsity metric in (6) and the Fidelity+F metric in (7). Sparsity, whose formula is given in (6), measures the proportion of important components (N, NF, E) identified by the XAI method. Here, |m_i| is the number of important edges determined by the XAI method, |M_i| is the number of all edges in the graph, and K is the number of graphs. Fidelity+F, whose formula is given in (7), studies the change in prediction score, where F is the performance evaluation function (F1-Macro, Accuracy, or DR) [26]. G_i^{1-m_i} represents the new graph obtained by keeping the features of G_i based on the complementary mask (1 - m_i), and y_i is the original prediction of the GNN model.

$$\mathrm{Sparsity} = \frac{1}{K}\sum_{i=1}^{K}\left(1 - \frac{|m_i|}{|M_i|}\right) \tag{6}$$

$$\mathrm{Fidelity{+}F} = \frac{1}{K}\sum_{i=1}^{K}\left(F(G_i)_{y_i} - F\big(G_i^{1-m_i}\big)_{y_i}\right) \tag{7}$$
Fig. 3 presents the global graph's edge explanation results for PGExplainer and GNNExplainer in terms of Fidelity+F and Sparsity. Fidelity+F measures the F1-Macro, Accuracy, and DR drops when important edges are removed; higher Fidelity+F scores indicate that the identified edges are more important for the proposed X-CBA. According to the results in Fig. 3, as expected, PGExplainer outperforms GNNExplainer significantly and consistently. This is because PGExplainer's non-black-box explanation ability reveals the explanations for flow-based intrusion packets better.

[Fig. 3 contains three panels plotting Fidelity+F against Sparsity, one panel each for F1-Macro, Accuracy, and DR.]

Fig. 3: The Fidelity+F comparisons over F1-Macro, Accuracy and DR under different Sparsity levels

Moreover, we investigate the most influential subgraphs that affect the prediction of an edge of a given attack type, to locally observe the differences between PGExplainer and GNNExplainer. NF-CSE-CIC-IDS2018-v2 includes data for benign traffic and fifteen different attack types. For each attack type, we make use of the edges found by each XAI model to maximize mutual information. In Fig. 2, we illustrate the distribution of edge types in the subgraphs identified for three selected classes: Benign, Bot, and Infiltration. These classes were chosen from a total of 16 (15 attack classes and one benign class) to provide a diverse range of examples for analysis: Benign represents a non-attack scenario, Infiltration serves as a sample attack case, and Bot exemplifies a sophisticated attack.

[Fig. 2 contains three panels: 'Benign Network Flow Explainability', 'Bot Network Flow Explainability', and 'Infiltration Network Flow Explainability'.]

Fig. 2: Distribution of edges according to attack type within the edge explanation maximizing mutual information for 'Benign', 'Bot' and 'Infiltration'
The prevalence of benign edges within the subgraphs of various attacks is unsurprising, given the inherent dataset imbalance. The subgraph for 'Benign' in Fig. 2 is therefore expected to contain only benign edges, which is achieved only by PGExplainer; GNNExplainer fails to do so.

In contrast, Infiltration and Bot attacks are marked by a predominance of similar connection types in their immediate network environments. These attacks are known for their ability to spread across the network, often utilizing similar connections. In Fig. 2, as highlighted by the red rectangular boxes, PGExplainer distinguishes itself in identifying key features of both Infiltration and Bot attacks. However, its performance advantage is somewhat less pronounced for Bot attacks. This is because Bot attacks in our dataset are consistently linked to a specific node (IPV4 ADDR: 18.219.211.138), a pattern easily detected by both XAI methods. Consequently, the performance of the two methods is similar for Bot attacks.

PGExplainer's true strength is demonstrated in its ability to identify Infiltration flows. These flows are often challenging to detect, as they are not always directly connected to the Infiltration attack. GNNExplainer, on the other hand, significantly underperforms in this area, frequently misidentifying Infiltration attack instances as Bot attacks and missing most of the real Infiltration network flows. This discrepancy can be attributed to PGExplainer's operation in inductive settings and its capability to collectively explain multiple instances, enabling it to more effectively uncover local network-flow relationships.

V. CONCLUSION AND FUTURE WORKS

In this study, we propose a novel IDS methodology, called X-CBA, that synergizes the strengths of Graph Neural Networks (GNNs) and Explainable AI (XAI). X-CBA not only excels in detecting a wide array of cyber threats through its use of network flows and graph edge embeddings but also marks a significant leap in the accuracy and reliability of threat detection, as evidenced by its 99.47% accuracy, 98.73% F1 score, and 95.74% recall. Most importantly, X-CBA addresses the critical issue of transparency in ML/DL-based security solutions. We evaluated the baseline XAI methods to show the strong explainability of our proposed framework in terms of its ability to find important edges. By integrating PGExplainer, it provides both local and global explanations of its decision-making process and gives much more accurate results in terms of the sparsity and fidelity metrics compared to the baselines, enhancing trust and accountability in its operations.

ACKNOWLEDGEMENTS

This research is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 1515 Frontier R&D Laboratories Support Program for BTS Advanced AI Hub: BTS Autonomous Networks and Data Innovation Lab. project number 5239903, TUBITAK 1501 project number 3220892, and the ITU Scientific Research Projects Fund under grant number MÇAP-2022-43823.

REFERENCES

[1] S. Neupane, J. Ables, W. Anderson, S. Mittal, S. Rahimi, I. Banicescu, and M. Seale, "Explainable intrusion detection systems (X-IDS): A survey of current methods, challenges, and opportunities," IEEE Access, vol. 10, pp. 112392–112415, 2022.
[2] K. He, D. D. Kim, and M. R. Asghar, "Adversarial machine learning for network intrusion detection systems: A comprehensive survey," IEEE Communications Surveys & Tutorials, vol. 25, no. 1, pp. 538–566, 2023.
[3] E. Ak and B. Canberk, "FSC: Two-scale AI-driven fair sensitivity control for 802.11ax networks," in GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020, pp. 1–6.
[4] T. Bilot, N. E. Madhoun, K. A. Agha, and A. Zouaoui, "Graph neural networks for intrusion detection: A survey," IEEE Access, vol. 11, pp. 49114–49139, 2023.
[5] A. R. E.-M. Baahmed, G. Andresini, C. Robardet, and A. Appice, "Using graph neural networks for the detection and explanation of network intrusions," International Workshop on eXplainable Knowledge Discovery in Data Mining, 2023.
[6] D. Luo, W. Cheng, D. Xu, W. Yu, B. Zong, H. Chen, and X. Zhang, "Parameterized explainer for graph neural network," Advances in Neural Information Processing Systems, vol. 33, pp. 19620–19631, 2020.
[7] D. Gunning and D. Aha, "DARPA's explainable artificial intelligence (XAI) program," AI Magazine, vol. 40, no. 2, pp. 44–58, 2019.
[8] S. S. Du, K. Hou, R. R. Salakhutdinov, B. Poczos, R. Wang, and K. Xu, "Graph neural tangent kernel: Fusing graph neural networks with graph kernels," Advances in Neural Information Processing Systems, vol. 32, 2019.
[9] G. Secinti, P. B. Darian, B. Canberk, and K. R. Chowdhury, "Resilient end-to-end connectivity for software defined unmanned aerial vehicular networks," in 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), 2017, pp. 1–5.
[10] E. Caville, W. W. Lo, S. Layeghy, and M. Portmann, "Anomal-E: A self-supervised network intrusion detection system based on graph neural networks," Knowledge-Based Systems, vol. 258, p. 110030, 2022.
[11] S. Fraihat, S. Makhadmeh, M. Awad, M. A. Al-Betar, and A. Al-Redhaei, "Intrusion detection system for large-scale IoT NetFlow networks using machine learning with modified arithmetic optimization algorithm," Internet of Things, p. 100819, 2023.
[12] E. Ak and B. Canberk, "Forecasting quality of service for next-generation data-driven WiFi6 campus networks," IEEE Transactions on Network and Service Management, vol. 18, no. 4, pp. 4744–4755, 2021.
[13] S. Mane and D. Rao, "Explaining network intrusion detection system using explainable AI framework," arXiv preprint arXiv:2103.07110, 2021.
[14] M. Wang, K. Zheng, Y. Yang, and X. Wang, "An explainable machine learning framework for intrusion detection systems," IEEE Access, vol. 8, pp. 73127–73141, 2020.
[15] S. Patil, V. Varadarajan, S. M. Mazhar, A. Sahibzada, N. Ahmed, O. Sinha, S. Kumar, K. Shaw, and K. Kotecha, "Explainable artificial intelligence for intrusion detection system," Electronics, vol. 11, no. 19, p. 3079, 2022.
[16] W. W. Lo, G. Kulatilleke, M. Sarhan, S. Layeghy, and M. Portmann, "XG-BoT: An explainable deep graph neural network for botnet detection and forensics," Internet of Things, vol. 22, p. 100747, 2023.
[17] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher, and M. Portmann, "E-GraphSAGE: A graph neural network based intrusion detection system for IoT," in NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium. IEEE, 2022, pp. 1–9.
[18] P. Veličković, W. Fedus, W. L. Hamilton, P. Liò, Y. Bengio, and R. D. Hjelm, "Deep graph infomax," arXiv preprint arXiv:1809.10341, 2018.
[19] P. E. Pope, S. Kolouri, M. Rostami, C. E. Martin, and H. Hoffmann, "Explainability methods for graph convolutional neural networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10772–10781.
[20] F. Baldassarre and H. Azizpour, "Explainability techniques for graph convolutional networks," in ICML 2019 Workshop "Learning and Reasoning with Graph-Structured Representations", 2019.
[21] S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," Advances in Neural Information Processing Systems, vol. 30, 2017.
[22] Q. Huang, M. Yamada, Y. Tian, D. Singh, and Y. Chang, "GraphLIME: Local interpretable model explanations for graph neural networks," IEEE Transactions on Knowledge and Data Engineering, 2022.
[23] M. Vu and M. T. Thai, "PGM-Explainer: Probabilistic graphical model explanations for graph neural networks," Advances in Neural Information Processing Systems, vol. 33, pp. 12225–12235, 2020.
[24] T. Funke, M. Khosla, M. Rathee, and A. Anand, "Zorro: Valid, sparse, and stable explanations in graph neural networks," IEEE Transactions on Knowledge and Data Engineering, 2022.
[25] Z. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec, "GNNExplainer: Generating explanations for graph neural networks," Advances in Neural Information Processing Systems, vol. 32, 2019.
[26] H. Yuan, H. Yu, S. Gui, and S. Ji, "Explainability in graph neural networks: A taxonomic survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 5, pp. 5782–5799, 2022.
