
Multiclass classification of DGA-based malware using deep learning

The binary experiment is designed to answer the ML question of separating legitimate FQDNs
from malicious AGDs, considering all malware families as a single category. The multiclass
experiment is designed to go beyond the binary experiment: it not only identifies the
legitimate FQDNs but also sorts the malware samples according to their families
(Mattia Zago et al., 2019).
Machine learning models that attempt to do DGA classification based only on the domain name
itself, such as the ones considered in this paper, might not be sufficient to detect a DGA like
CharBot. The result highlights the need for ML models that exploit additional context features
such as the IP-addresses that the domains are mapped to, or temporal access patterns (e.g.
how often the domain was requested, and when) [3], [16]–[18], as was done successfully for
dictionary DGAs [10] (Jonathan Peck et al., 2019).
Research questions
• Which feature reduction strategy best approximates the data? Preliminary results using
nonlinear feature reduction techniques seem promising. Candidate feature types include
character features, Unicode features, and word-bag (n-gram) models.
• Can we improve the performance of multiclass classification by balancing our data?
• Can we detect all malware families?
• Can the choice of encoding technique improve performance?
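
As a rough illustration of what "encoding technique" and "n-gram" features could mean for a
domain name, the minimal sketch below maps a domain to a fixed-length sequence of integer
character IDs and counts its character bigrams. The alphabet, the reserved IDs, the maximum
length, and the function names are all assumptions made for this example; they are not taken
from any of the cited papers.

```python
from collections import Counter
from typing import List

# Assumed alphabet for the example: lowercase letters, digits, hyphen, dot, underscore.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._"
PAD_ID = 0  # reserved for padding
UNK_ID = 1  # reserved for characters outside the alphabet
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_domain(domain: str, max_len: int = 20) -> List[int]:
    """Map a domain name to a fixed-length list of integer character IDs."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in domain.lower()[:max_len]]
    # Right-pad short domains so every sample has the same length.
    return ids + [PAD_ID] * (max_len - len(ids))


def char_ngrams(domain: str, n: int = 2) -> Counter:
    """Count overlapping character n-grams (a bag-of-n-grams representation)."""
    text = domain.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

An integer sequence of this kind is the usual input to an embedding layer in front of a CNN or
LSTM, whereas the n-gram counts suit classical feature-based models; which of the two works
better for a given malware family is exactly what the encoding question above asks.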
Objectives
• To study and analyze the properties of each malware family
• To apply a multiclass classification solution using deep learning
• To analyze the statistical properties of the malicious domains of each specific family

Literature review (Gap analysis)


In 2020, Karunakaran P. proposed a deep learning approach to DGA classification for detecting
DGAs that generate malicious domains randomly []. The approach achieved 94.9% accuracy for DGA
classification with the help of additional feature extraction and knowledge-based extraction in
the deep learning architecture (CNN and RNN). However, the dataset covered only 20 malware
types, which suggests that the model may fail to detect other malware types. The paper also does
not report how many of these types were detected in the training or testing phase. In addition,
the model was evaluated using accuracy alone, whereas other metrics such as the false positive
rate (FPR) are essential in actual deployment (Bin Yu et al., 2018).
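
To make the point about accuracy versus FPR concrete, the short sketch below computes both from
binary labels (1 = malicious, 0 = legitimate). The function name and the label convention are
assumptions for illustration only.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for 0/1 labels, where 1 = malicious."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    # FPR is the share of legitimate domains that are wrongly flagged as malicious.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr
```

For example, with eight legitimate and two malicious domains, a model that flags two of the
legitimate domains and both malicious ones reaches 80% accuracy while still blocking a quarter
of the legitimate traffic, which accuracy alone would hide.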
In 2019, Mattia Zago et al. presented the state of the art with the aim of refining the feature
discovery process, which is the single most time-consuming part of any ML approach to detecting
DGA-based botnets []. Their results show that only a minor fraction of the defined features are
actually practical and informative, especially when considering 0-day botnet identification.
Both binary and multiclass classification experiments were conducted. The multiclass experiment
performed worse than the binary one because the model was unable to distinguish similar malware
families such as Oakbot and Matsnu.
In 2019, Ryan R. Curtin et al. proposed detecting DGA domains with a combination of a novel
recurrent neural network architecture and domain registration side information. Their experiment
was able to detect DGA families with a high smash-word score, but with low accuracy. In
addition, the difficult-to-detect CharBot DGA was not included in the experiment. Their model
was also unable to detect DGA families that do not look like natural domain names.

In the same year, Daniel S. Berman proposed a 1D application of capsule networks to DGA
detection. The study used CapsNet, CNN, and LSTM models to detect different types of DGA
malware []. The experiment failed to detect some families, such as vawtrak, Vidro, Sphinx,
corebot, virut, and cryptowall.

The greatest weakness of all the models tested is their deficiency in detecting real word-based
DGAs. In some cases, some of these real word-based DGAs use a limited dictionary to
generate domain names and change that dictionary after some time. This manifests in three
ways. The first is that when the model is trained on data from that DGA, time is not taken into
account and the model fails to detect the malicious domain names, as is the case for matsnu
and gozi. The second is when the model can only detect the malicious domain names when it is
trained on data from that DGA, regardless of time, but fails to detect it otherwise, as is the case
for unknowndropper. Finally, there are models that initially perform well but after time passes,
performance significantly declines because of a change in the DGA generator, as is the case
with pizd and suppobox. Developing a model capable of detecting malicious domains in all
three of these situations is critical, and all models tested here fail to do so []
(Ryan R. Curtin et al., 2019).
Jonathan Peck et al. presented a novel DGA called CharBot, which is capable of producing large
numbers of unregistered domain names. In their experiments, state-of-the-art classifiers for
real-time DGA detection performed very poorly against CharBot, including the recently
published methods FANCI (a random forest based on human-engineered features) and LSTM.MI
(a deep learning approach). The authors highlight a dangerous weakness of modern DGA
classifiers, namely their vulnerability to extremely simple attacks that make no use of
sophisticated machine learning techniques.
Yanchen Qiao et al. proposed a DGA domain name classification method based on Long Short-Term
Memory (LSTM) with an attention mechanism []. They used the character sequence of the domain
name as the feature, but because of the imbalanced dataset they achieved poor performance on 10
of the 18 malware classes.
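
One common way to counter this kind of class imbalance is to weight each class inversely to
its frequency during training. The sketch below is a minimal illustration of that idea and is
not the method used by Qiao et al.; the function name and the example family labels are
assumptions. The formula, n_samples / (n_classes * class_count), is the same heuristic that
scikit-learn applies for its "balanced" class weights.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class so that rare classes count more in the loss."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Each class receives n_samples / (n_classes * class_count):
    # a perfectly balanced dataset yields a weight of 1.0 for every class.
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

The resulting dictionary can be passed to a training loop as per-class loss weights;
oversampling the rare families is an alternative that changes the data rather than the loss.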

Xiaochun Yun et al. proposed Khaos, a novel DGA with high anti-detection ability based on neural
language models and the Wasserstein Generative Adversarial Network (WGAN). The experimental
results show that Khaos outperforms the other nine on all detection indices of the detection
approaches, but the others were detected with poor performance.

Much research has been done on the classification and detection of DGA-based malware, but it has
been unsuccessful in identifying some malware. Current detection systems have difficulty
categorizing samples according to their malware family because of the nature of the reduction and
classification algorithms used, inappropriate context, and unbalanced data (Duc Tran, ). This
research aims to fill the gap in multiclass classification by using different encoding
techniques in deep learning.
