
Learning Deep Boltzmann Machines using Multi Ant Colony Optimization to discover intrusion, anomaly and cyber attacks

Asalah A. Al-Swidi
Thamar University
Master's Student in Computer Science
Thamar, Yemen

Prof. Basher M. Al-Magalh, PhD
Thamar University
Dean of the computing college
Thamar, Yemen

I. INTRODUCTION
In the last few years, the field of artificial intelligence has developed greatly, especially in deep learning, where hardly a day goes by without hearing of the latest advancements and improvements that artificial intelligence and deep learning have brought to a wide spectrum of domains, including medicine, engineering, science and many others. Deep learning has become a science in itself, inspired by the way the brain works. It began with the emergence of neural networks; as these developed and the number of nodes and layers in them increased, deep learning appeared. Deep learning networks have developed greatly in recent times and have been used in various fields, and several algorithms and architectures have appeared, such as backpropagation, autoencoders, variational autoencoders, restricted Boltzmann machines, deep Boltzmann machines, deep belief networks, convolutional neural networks, recurrent neural networks, generative adversarial networks, CapsNets, transformers, embeddings from language models, bidirectional encoder representations from transformers, and attention in natural language processing [1]. Each of these algorithms has its own working mechanism, uses, applications, advantages, disadvantages and challenges, and each is classified as a supervised, unsupervised or semi-supervised algorithm.

In this proposal, the deep Boltzmann machine (DBM) is used, which is a development of the restricted Boltzmann machine (RBM). The RBM is an unsupervised learning algorithm that can automatically find patterns in its data by reconstructing the inputs. It is a network consisting of two layers, the visible layer and the hidden layer, with no output layer. It forms a symmetrical bipartite graph, because every node in the visible layer is connected to every node in the hidden layer and vice versa. It works in both the forward and backward directions, and this is what matters most for discovering patterns and features, in which weights and biases play an important role. It is called restricted because there is no communication between nodes of the same layer. A DBM is an RBM with several hidden layers, which help to discover patterns and features accurately and efficiently, as in Figure 1.

Fig. 1 Deep Boltzmann Machine

This algorithm is usually trained using algorithms based on probabilities and derivatives, such as those based on Markov chain Monte Carlo (MCMC). These algorithms include:

• Variational Learning
• Stochastic Approximation Algorithm
• Adaptive MCMC

For more details about these algorithms, refer to [2]. MCMC-based algorithms use the following equations.

The energy of the state {v, h} is defined as:

$$E(v, h; \theta) = -v^{\top} W^{1} h^{1} - h^{1\top} W^{2} h^{2} \qquad (1)$$

where $v$ is the visible layer, $h = \{h^{1}, h^{2}\}$ is the set of hidden units, and $\theta = \{W^{1}, W^{2}\}$ are the model parameters, representing visible-to-hidden and hidden-to-hidden symmetric interaction terms. The probability that the model assigns to a visible vector $v$ is:

$$P(v; \theta) = \frac{P^{*}(v; \theta)}{Z(\theta)} = \frac{1}{Z(\theta)} \sum_{h} \exp\bigl(-E(v, h^{1}, h^{2}; \theta)\bigr).$$

The derivative of the log-likelihood with respect to the parameter matrix $W^{1}$ takes the following form:
$$\frac{\partial \log P(v; \theta)}{\partial W^{1}} = E_{P_{\text{data}}}\bigl[v h^{1\top}\bigr] - E_{P_{\text{model}}}\bigl[v h^{1\top}\bigr], \qquad (2)$$

where $E_{P_{\text{data}}}[\cdot]$ denotes an expectation with respect to the completed data distribution $P_{\text{data}}(h, v; \theta) = P(h \mid v; \theta)\, P_{\text{data}}(v)$, with $P_{\text{data}}(v) = \frac{1}{N} \sum_{n} \delta(v - v_{n})$ representing the empirical distribution, and $E_{P_{\text{model}}}[\cdot]$ is an expectation with respect to the distribution defined by the model. The derivatives with respect to the parameters $W^{2}$ take a similar form. Exact maximum likelihood learning in this model is intractable, because neither the data-dependent nor the model-dependent expectations can be computed analytically in less than exponential time [3]. The main disadvantages of these algorithms are that training is slow, it is difficult to train and adjust the weights, and the required loss cannot be tracked (let alone differentiated); DBM training is significantly slower than backpropagation. The deep Boltzmann machine algorithm is used in several applications, including dimensionality reduction, classification, regression, collaborative filtering, property learning, and modeling [1], and it has been used to extract more complex or sophisticated features and hence can be used for more complex tasks. It has been used in the medical field, image analysis, feature discovery and cybersecurity, which is the field the deep Boltzmann machine will be trained for in this research. Many studies have used DBM in cybersecurity, for example for intrusion detection, attacks, anomalies in networks and others. In all these studies, the DBM trained with an algorithm based on probabilities and derivatives was very precise compared to the other algorithms, but no study has trained a DBM using the multi ant colony algorithm in any of the researched areas.

The multi ant colony optimization algorithm will be used to train the deep Boltzmann machine in this research. This algorithm is a development of the ant colony optimization algorithm, a nature-inspired algorithm modeled on swarms of ants. It is used to determine the best path: paths are initially chosen at random and are then optimized through the pheromone, on which the optimization process depends for the rest of the ants. The method relies on the pheromone and on probabilities to choose the shortest path. In the multi-colony development, the search starts with several ants and the improvement is done in parallel, and thus it is faster and more efficient than the traditional algorithm.

II. RELATED WORKS

In [1], A Review of Deep Learning Algorithms and Their Applications in Healthcare is proposed. This review aims to categorically cover several widely used deep learning algorithms along with their architectures, their practical applications, challenges, pros and cons, their applications in healthcare and the future direction of this domain.

In [4], Deep medical image reconstruction with autoencoders using deep Boltzmann machine training is suggested. In this method a Deep Autoencoder is implemented, as it has been considered for high dimensionality reduction. Layer-by-layer pretraining is achieved using an approximate inference algorithm called Deep Boltzmann Machine. The proposed method proves to be efficient when compared with the performance of other autoencoders such as Deep Autoencoder with multiple Backpropagation (DA-MBP), Deep Autoencoder with RBM (DA-RBM) and Deep Convolutional Autoencoder with RBM (DCA-RBM).

In [5], Deep appearance models: A deep Boltzmann machine approach for face modeling is introduced. This paper presents a novel Deep Appearance Models (DAMs) approach, an efficient replacement for AAMs, to accurately capture both shape and texture of face images under large variations. In this approach, three crucial components represented in hierarchical layers are modeled using Deep Boltzmann Machines (DBM) to robustly capture the variations of facial shapes and appearances. Compared to AAMs and other deep learning-based approaches, the proposed DAMs achieve competitive results in those applications, which shows their advantages in handling occlusions, facial representation, and reconstruction.

In [6], Listening with your eyes: Towards a practical visual speech recognition system using deep Boltzmann machines is suggested. This work presents a novel feature learning method for visual speech recognition using Deep Boltzmann Machines (DBM). Unlike all existing visual feature extraction techniques, which solely extract features from video sequences, this method is able to explore both acoustic information and visual information to learn a better visual feature representation.

In [7], Personalized prediction of genes with tumor-causing somatic mutations based on multimodal deep Boltzmann machine is proposed. An early cancer monitoring method is developed based on a multi-modal deep Boltzmann machine to learn the association between matched germline and somatic mutations captured by whole exome sequencing from available samples of cancer patients, and to predict patient-specific high-risk genes whose somatic mutations are required to drive normal tissues to a tumor state. The experiments on a set of breast cancer samples show that the method significantly outperforms the currently used frequency-based method in the personalized prediction of genes carrying critical mutations.

In [8], Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study is presented. In this paper, a survey of deep learning approaches for cyber security intrusion detection is presented, together with the datasets used and a comparative study. Specifically, a review of intrusion detection systems based on deep learning approaches is provided.

In [9], Deep learning techniques for cyber security intrusion detection: A detailed analysis is suggested. In this study, a detailed analysis of deep learning techniques is presented for intrusion detection. Specifically, seven deep learning models are analyzed, and the most important performance indicators, namely accuracy, detection rate, and false alarm rate, are used for evaluating the efficiency of several methods.

In [10], A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions is proposed. This study aims to present a comprehensive review of IoT systems-related technologies, protocols, architecture and threats emerging from compromised IoT devices, along with an overview of intrusion detection models. This work also covers the analysis of various machine learning and deep learning-based techniques suitable for detecting cyber-attacks related to IoT systems.
In [11], Application of deep learning architectures for cyber security is presented. In this work, to leverage the application of deep learning architectures towards cyber security, intrusion detection, traffic analysis and Android malware detection are considered. In all the intrusion detection experiments, deep learning architectures performed well compared to classical machine learning algorithms. Moreover, deep learning architectures achieved good performance in traffic analysis and Android malware detection too.

In [12], Emergent deep learning for anomaly detection in internet of everything is introduced. This paper introduces a new generic deep learning framework for anomalous data detection in internet of everything environments. It combines decomposition methods, deep neural networks, and evolutionary computation. A new recurrent neural network is proposed for training time series data. Furthermore, two evolutionary computation algorithms, the genetic and the bees swarm, are proposed to accurately tune the training step. These algorithms consider the hyper-parameters of the trained models and attempt to find the optimal values.

In [13], Improved Object Detection Algorithm using Ant Colony Optimization and Deep Belief Networks Based Image Segmentation is proposed. This paper presents a variety of strategies for object detection and an efficient object detection framework using a saliency prior and DBNs for remote sensing images. The work proposes efficient object detection using ant colony optimization and deep belief networks.

In [14], A Novel Ant Colony Based DBN Framework to Analyze the Drug Reviews is suggested. In this method a novel Ant Colony-based Deep Belief Neural Network (AC-DBN) framework is proposed. Drug review tweets are chosen to perform sentiment classification using the proposed framework in a Python environment. A model fitness function is initiated in the DL framework, and it is observed that it attains high accuracy with low computation time. Additionally, the results obtained from the proposed framework are validated against existing methods to evaluate the efficiency of the proposed AC-DBN approach.

In [15], Greedy learning of deep Boltzmann machine (GDBM)'s variance and search algorithm for efficient image retrieval is proposed. Initially, a preprocessing technique is introduced that uses a median filter to remove noise to achieve improved accuracy and reliability. Then, Fourier and circularity descriptors are extracted effectively, corresponding to the texture and affine shape adaptation features. In addition, various descriptors, such as color histogram, color moment, color auto correlogram and color coherency vector, are extracted as the invariant color features. The multiple ant colony optimization (MACOBTC) approach is implemented with the whole feature set to find relevant features. Finally, the relevant features are utilized for the greedy learning of the deep Boltzmann machine classifier (GDBM). The proposed approach obtains effective performance and accurate results on four datasets and is analyzed with various parameters such as accuracy, precision, recall, Jaccard, Dice, and Kappa coefficients. The GDBM provides a 25% increase in accuracy compared with existing techniques, such as the a priori classification algorithm.

III. PROBLEM STATEMENT

Despite the successful applications of deep learning algorithms, especially the deep Boltzmann machine (DBM), which has been used in many fields, its training process is very difficult and slow, and it gets slower as the number of hidden layers increases. Another disadvantage is that it is unable to track the loss that is required (let alone take derivatives with respect to it). This is why this research is proposed: the deep Boltzmann machine will be trained using the multi ant colony optimization algorithm (MACO), because of what is known about this algorithm in terms of speed and efficiency in reaching the best path and best solution in the shortest time. It will therefore be used to reach the correct and required weights and biases quickly in order to increase training speed, and the trained model will be used for the detection of cyber-attacks, intrusions and anomalies in networks. Such detection requires high speed and extreme accuracy in order to respond to the threat in a timely manner, as it is a time-critical application.

IV. OBJECTIVES

The main objective of this research is to train the Deep Boltzmann Machine (DBM) with the multi ant colony optimization algorithm (MACO) in order to increase its training speed, efficiency and effectiveness, so that the appropriate weights and biases are reached with high speed and precision.

V. METHODOLOGY

In this proposed methodology, the algorithm will be built and experiments will be conducted in which the input to the DBM is the behavior that occurs within networks. This input is collected from datasets and preprocessed first. Since the DBM is a symmetrical algorithm, its outputs are fed back to it as inputs, and this process continues until the final output is reached, which is the detection of attacks, threats, intrusions and anomalies. The security domain was selected for the experiments because DBM was more successful than other algorithms at discovering intrusions, anomalies and cyber-attacks in previous studies. The DBM will be trained with the MACO algorithm, which needs four things:

• Ant: the inputs to the DBM, i.e., the general behavior of the network

• Pheromone: the weights and biases, which the ant controls and whose concentration it increases or decreases in order to reach the best and shortest path (the best path is the set of weights and biases that yields a trained DBM in the shortest way)

• Path: the route from the input (general behavior within the network) to the output (detection of cyber-attacks, intrusions and anomalies in networks)

• Target: the shortest path, i.e., reaching the correct weights and biases as quickly as possible, thus speeding up the training of the network
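The four elements above can be illustrated with a minimal sketch. It is not the proposed implementation, only one possible reading under stated assumptions: each ant builds a candidate set of weights by picking one discrete level per weight (a simplification of the ant and path roles described above), the pheromone is a table over those levels, several colonies run side by side and share the best solution found so far, and the fitness is a one-pass reconstruction error used as a stand-in for the intractable likelihood. The names `maco_train` and `reconstruction_error`, the weight levels, the evaporation rate `rho` and the layer sizes are all hypothetical. Biases are omitted, as in Eq. (1).

```python
import numpy as np


def energy(v, h1, h2, W1, W2):
    """DBM energy of Eq. (1): E = -v^T W1 h1 - h1^T W2 h2 (no biases)."""
    return float(-(v @ W1 @ h1) - (h1 @ W2 @ h2))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruction_error(V, W1, W2):
    """Surrogate fitness (an assumption): one deterministic up-down pass."""
    H1 = sigmoid(V @ W1)
    H2 = sigmoid(H1 @ W2)
    H1_down = sigmoid(V @ W1 + H2 @ W2.T)
    V_rec = sigmoid(H1_down @ W1.T)
    return float(np.mean((V - V_rec) ** 2))


def maco_train(V, n_h1=4, n_h2=3, n_colonies=3, n_ants=8, n_iters=30,
               rho=0.1, seed=0):
    """Hypothetical multi-colony ant search over discretized DBM weights."""
    rng = np.random.default_rng(seed)
    levels = np.linspace(-2.0, 2.0, 9)        # candidate values per weight
    n_v = V.shape[1]
    split = n_v * n_h1
    n_params = split + n_h1 * n_h2
    idx = np.arange(n_params)
    tau = np.ones((n_colonies, n_params, levels.size))   # pheromone tables
    best_err, best_choice, history = np.inf, None, []

    def decode(choice):
        w = levels[choice]
        return w[:split].reshape(n_v, n_h1), w[split:].reshape(n_h1, n_h2)

    for _ in range(n_iters):
        for c in range(n_colonies):
            # Per-weight cumulative probabilities proportional to pheromone.
            cdf = np.cumsum(tau[c] / tau[c].sum(axis=1, keepdims=True), axis=1)
            col_err, col_choice = np.inf, None
            for _ant in range(n_ants):
                u = rng.random((n_params, 1))
                # Inverse-CDF sampling of one level index per weight.
                choice = np.minimum((u > cdf).sum(axis=1), levels.size - 1)
                err = reconstruction_error(V, *decode(choice))
                if err < col_err:
                    col_err, col_choice = err, choice
            tau[c] *= (1.0 - rho)                             # evaporation
            tau[c, idx, col_choice] += 1.0 / (1.0 + col_err)  # colony deposit
            if col_err < best_err:
                best_err, best_choice = col_err, col_choice
        # Shared memory: every colony is reinforced along the global best.
        tau[:, idx, best_choice] += 1.0 / (1.0 + best_err)
        history.append(best_err)

    W1, W2 = decode(best_choice)
    return W1, W2, best_err, history
```

Calling `maco_train(V)` on a binary matrix `V` with one row per sample returns the best weights found, their surrogate error, and the best error per iteration, which never increases. Discretizing the weights keeps the pheromone update close to classical ant colony optimization; a continuous variant would need a different sampling rule.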
The algorithm will be implemented in a Python environment because it is accurate and easy to use with deep learning. After the algorithm is implemented and applied, its accuracy and effectiveness will be measured and its results will be compared with previous studies. Finally, the conclusion and future work will be presented.

VI. IMPORTANCE AND SCOPE

The importance of the research is to find a new way to train DBM and to increase the accuracy, efficiency and speed of its results in the chosen field, which is cybersecurity and network security, where it is used to discover cyber-attacks, intrusions and anomalies in networks that require high speed and accuracy in order to respond in a timely manner.

VII. CONCLUSIONS

It was concluded that MACO succeeded in training DBM with high speed and accuracy, because the ants work in parallel and jointly improve the weights and biases through the memory shared among all ants, as shown by its results compared to previous studies. This method was used in network security and cybersecurity, but it can be generalized to any field in which DBM can be used, and we recommend further research on training DBM in other fields.

REFERENCES

[1] H. Abdel-jaber, D. Devassy, A. Al Salam, L. Hidaytallah and M. EL-Amir, "A Review of deep learning algorithms and their applications in healthcare," Algorithms. J. Kyoto. Switzerland, vol. 15, pp. 71, February 2022.
[2] R. Salakhutdinov and G. Hinton, "Deep Boltzmann machines," In Proceedings of the International Conference on Artificial Intelligence and Statistics, 2009, pp. 448-455.
[3] R. Salakhutdinov, "Learning deep Boltzmann machine using adaptive MCMC," In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 943-950.
[4] S. Saravanan and S. Juliet, "Deep medical image reconstruction with autoencoders using deep Boltzmann machine training," EAI Endorsed Transactions on Pervasive Health and Technology. J. Belgium, vol. 6, pp. e2, September 2020.
[5] C. N. Duong, K. Luu, K. G. Quach and T. D. Bui, "Deep appearance models: A deep Boltzmann machine approach for face modeling," International Journal of Computer Vision. J. Springer. Netherlands, vol. 127, pp. 437-455, December 2019.
[6] C. Sui and M. Bennamoun, "Listening with your eyes: Towards a practical visual speech recognition system using deep Boltzmann machines," In Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 154-162.
[7] Y. Li, F. Fauteux, J. Zou, A. Nantel and Y. Pan, "Personalized prediction of genes with tumor-causing somatic mutations based on multimodal deep Boltzmann machine," Neurocomputing. J. Elsevier. Netherlands, vol. 324, pp. 51-62, February 2018.
[8] M. A. Ferrag, L. Maglaras, S. Moschoyiannis and H. Janicke, "Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study," Journal of Information Security and Applications. J. Elsevier. United Kingdom, vol. 50, pp. 102419, February 2020.
[9] M. A. Ferrag, L. Maglaras, H. Janicke and R. Smith, "Deep learning techniques for cyber security intrusion detection: A detailed analysis," In 6th International Symposium for ICS & SCADA Cyber Security Research, 2019, pp. 126-136.
[10] J. Asharf, N. Moustafa, H. Khurshid, E. Debie, W. Haider and A. Wahab, "A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions," Electronics. J. MDPI. Switzerland, vol. 9, pp. 1177, July 2020.
[11] R. Vinayakumar, K.P. Soman, P. Poornachandran and S. Akarsh, "Application of deep learning architectures for cyber security," In Cybersecurity and Secure Information Systems, 2019, pp. 125-160.
[12] Y. Djenouri, D. Djenouri, A. Belhadi, G. Srivastava and J. Chun-Wei Lin, "Emergent deep learning for anomaly detection in internet of everything," IEEE Internet of Things Journal. J. United States, 2021.
[13] A. Kaur and N. Kaur, "Improved Object Detection Algorithm using Ant Colony Optimization and Deep Belief Networks Based Image Segmentation," International Journal of Latest Technology in Engineering, Management & Applied Science IJLTEMAS. J. India, vol. 6, pp. 86-90, July 2017.
[14] N. Tazeen and K. S. Rani, "A Novel Ant Colony Based DBN Framework to Analyze the Drug Reviews," International Journal of Intelligent Systems and Applications. J. China, vol. 13, pp. 25-39, December 2021.
[15] M. J. J. Ghrabat, G. Ma, Z. A. Abduljabbar, M. Al Sibahee and S. J. Jassim, "Greedy learning of deep Boltzmann machine (GDBM)'s variance and search algorithm for efficient image retrieval," IEEE Access. J. United States, vol. 7, pp. 169142-169159, October 2019.
