
This article has been accepted for publication in IEEE Internet of Things Journal.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2024.3353250

Deep Hashing for Malware Family Classification and New Malware Identification
Yunchun Zhang, Member, IEEE, Zikun Liao, Ning Zhang, Shaohui Min, Qi Wang,
Tony Q. S. Quek, Fellow, IEEE, and Mingxiong Zhao, Member, IEEE

Abstract—Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, effectively detecting malware on a large-scale sample set and identifying zero-day or new malware variants still pose significant challenges. To address this issue, a deep hashing-based malware classification model is designed for malware identification, including two parts: ResNet50-based deep hashing for malware retrieval and voting-based malware classification. Specifically, multiple deep hashing models are developed by extracting the high-layer outputs (feature maps) from the ResNet50 trained with malware gray-scale images in the first part. In this case, to maximize the Hamming distance or dissimilarity among hash values computed with malware samples under different families, a ResNet50-based deep polarized network (RNDPN) is designed to return Top-K similar samples. In the second part, we propose a majority-voting and a Hamming-distance-based voting for malware identification according to the retrieved results. The experiment results show that RNDPN outperforms the other six deep hashing models with 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved, where the best results for all deep hashing models are observed with 48 bits hashing code length. Furthermore, the Hamming distance-based voting method implemented with RNDPN demonstrates unparalleled performance in malware classification compared to other models. Notably, it achieves exceptional results in two key aspects: malware classification accuracy with an impressive accuracy rate of 96.5%, and the identification of new or zero-day malware with a commendable accuracy of 85.7%.

Index Terms—Deep hashing, Image retrieval, Malware classification, Malware images, Deep neural networks.

This work was supported in part by the National Natural Science Foundation of China under Grant 61801418 and 62361056, in part by the Applied Basic Research Foundation of Yunnan Province under Grant 202201AT070203, 202201AT070156, 202301AT070194, and 202301AT070422, and in part by the Opening Foundation of Yunnan Key Laboratory of Smart City in Cyberspace Security (No. 202105AG070010) under Grant 202105AG070010-ZN-10 and 202105AG070010-JC-11. Corresponding author: Mingxiong Zhao (Email: jimmyzmx@gmail.com).
Y. Zhang, Z. Liao, N. Zhang, S. Min, Q. Wang, and M. Zhao are with the Engineering Research Center of Cyberspace, National Pilot School of Software, Yunnan University, Kunming, 650500, China.
T. Q. S. Quek is with Singapore University of Technology and Design, Singapore 487372. E-mail: tonyquek@sutd.edu.sg.

I. INTRODUCTION

Over the past decade, significant advancements have been made in deep learning-based image processing, with numerous deep neural networks (DNNs) achieving impressive accuracy in image classification tasks, such as VGG [1], AlexNet [2], and ResNet [3]. Due to their excellent automatic feature extraction capability and outstanding performance on image classification, deep learning models have made significant strides in the field of malware classification, particularly in the context of analyzing malware grayscale images [4]–[7]. Image texture-based models, particularly prominent deep learning benchmarks in the field of image processing, demonstrate comparable performance to dynamic features such as API call sequences, albeit within a shorter timeframe [8], [9]. Additionally, various deep neural networks (DNNs) have been specifically designed to extract representative feature maps for different malware families, further enhancing classification accuracy [5], [10]. Inspired by these groundbreaking studies, it has become widely recognized that classifying malware using deep neural networks trained on benchmark datasets is a common approach.

Despite the commendable performance achieved by deep learning models in malware classification, current research faces several challenges that necessitate attention. Firstly, many existing malware classification models are supervised and trained using known samples [8], [9], demonstrating proficiency with known threats but struggling when confronted with unknown samples, new variants, and zero-day samples in Internet-of-Things (IoT) devices [11]. Secondly, the escalating vulnerability exposure in deep learning models poses a threat to the robustness of the existing frameworks [12]. Thirdly, the inadequate consideration of unsupervised learning models and sophisticated malware variant generation techniques contributes to the difficulty of identifying obfuscated, encrypted, or shelled malware samples [13]. Fourthly, the detection and family attribution of malware in IoT present an ongoing challenge as the proliferation of IoT devices is accompanied by incidents involving new malware [11], [14].

To address these issues, deep hashing has emerged as a potential solution, since it enables the extraction of more intricate features for dissimilarity computation [15]. In contrast to traditional hashing in cryptography, deep hashing is a technique that involves the transformation of high-dimensional features extracted from deep neural networks into compact binary codes. This process generates similar binary codes for efficient data retrieval [16]. Moreover, it has also demonstrated remarkable success in image retrieval tasks within computer vision, returning images that belong to the same category or are closest in similarity [17]. Both supervised and unsupervised deep hashing techniques exhibit notable improvements in performance and reduced retrieval times [18]. Recent advancements in deep hashing methods aim to create more nuanced distinctions within binary codes across different categories. These efforts contribute to enhanced retrieval speeds and reduced time complexity in training deep hashing models. Consequently, the application of deep hashing for malware identification based on benchmark datasets appears promising.
By leveraging the principle that hashing values serve as

Authorized licensed use limited to: DELHI TECHNICAL UNIV. Downloaded on January 20,2024 at 14:03:00 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.

representative indicators for malware classification, a convolutional neural network (CNN) was trained in [19] for this purpose. The CNN utilized hash values generated from raw malware files as inputs. While this approach shares similarities with other models that perform classification within a broad category, it lacks the image retrieval capabilities found in deep hashing techniques. On the other hand, deep hashing operates on the same principle as [19], but takes a different approach by extracting more intricate image features and producing multiple samples that exhibit significant similarities. Deep hashing thus acts as a bridge between existing malware classification models and the hashing features of malware. However, little attention has been given to exploring the potential of deep hashing for large-scale malware retrieval and classification.

Therefore, following this line of thought, our study aims to capitalize on the insights derived from the high-level feature extraction capabilities of DNNs and deep hashing for effective malware classification. Recognizing the importance of discriminative representations, especially in the context of deep hashing, this paper proposes multiple deep hashing models based on state-of-the-art DNNs for malware classification. The objective is to assess the performance of these different deep hashing models for malware classification. In addition to effectively classifying known malware samples, our study also addresses the challenges associated with detecting unknown samples, new variants, and promoting the detection of zero-day malware. The main contributions of this paper are summarized as follows:

• This paper aims to overcome the existing challenges in malware classification by leveraging deep hashing techniques. By extracting advanced features for dissimilarity calculation, these techniques greatly improve the retrieval of large-scale malware. In addition to introducing hashing codes as a novel feature for malware classification, this research establishes a crucial link between deep hashing, primarily employed in image retrieval, and traditional deep learning-based image classification.
• This paper presents a novel approach by combining deep hashing with grayscale images of malware to enable large-scale malware retrieval. Our study is the first to integrate ResNet50 with DPN for efficient malware retrieval. By combining distinctive features extracted by ResNet50 with target hash codes, we optimized the polarization loss $L_p$ to strengthen the discriminative power of generated hash code for differentiating malware among different families. Remarkably, our method outperforms six benchmark deep hashing models, achieving a classification accuracy of 97.54% with an mAP@40 metric, where mAP is mean average precision. It is worth noting that all deep hashing models demonstrate optimal results when the hashing code length ($L$) is set to 48 bits. Additionally, we evaluate the performance of VGG16 and AlexNet with RNDPN, and our findings demonstrate that RNDPN surpasses the others in terms of classification accuracy when evaluated across all retrieved samples using mAP@all as the metric.
• Expanding on RNDPN, this paper introduces two voting models for malware classification: the majority voting and the Hamming distance-based voting. When the number of retrieved samples ranges from 5 to 50, both voting models consistently demonstrate superior performance in terms of accuracy, macro precision (macro P), and macro recall (macro R). Notably, the Hamming distance-based voting model surpasses the majority voting model across all evaluation metrics. As a result, RNDPN proves to be a highly effective and efficient approach for malware classification, achieving an impressive accuracy of 96.5%.
• The Hamming distance-based voting with RNDPN model exhibits a substantial advantage over other models, including three DNNs and the majority voting-based RNDPN, in detecting new and unknown malware. It achieves outstanding results with an accuracy of 85.7%, macro precision of 86.8%, and macro recall of 85.3%. However, it is worth noting that RNDPN does require longer training and classification times in comparison to the three DNNs. Despite this, the Hamming distance-based voting model with RNDPN emerges as the most effective approach for promoting malware classification and the identification of new malware.

This paper is structured as follows. In Section II, we provide a comprehensive overview of the related works in both malware detection and deep hashing. Section III presents the proposed model for deep hashing-based malware classification and detection, which combines ResNet, deep hashing algorithms, and voting methods. Detailed experimental results, including various models and hashing algorithms tested under different configurations and parameter settings, are presented in Section IV. Finally, we conclude the paper with a summary of future directions for research in the last section.

II. RELATED WORKS

A. Deep hashing for Image Retrieval

Deep hashing models, which have garnered persistent attention for large-scale image retrieval [20], aim to convert images into compact binary codes, thereby preserving the data structure of the original space [21]. This preservation allows for the integration of hashing with deep neural networks, enabling fine-grained image classification and retrieval. In general, these deep hashing models can be broadly categorized into two types: supervised and unsupervised models.

1) Supervised deep hashing model: Supervised deep hashing methods have been introduced to enhance search speed and reduce memory consumption, leveraging labeled samples. Existing research highlights the binary hashing code and semantic features of an image.

a) Binary code-based deep hashing: Liu et al. [18] proposed Deep Supervised Hashing (DSH) that utilizes compact similarity-preserving binary codes for image retrieval. To improve the time efficiency of conventional deep hashing, Jiang et al. [22] proposed Asymmetric Deep Supervised Hashing (ADSH) specifically for large-scale nearest neighbor search. Cao et al. [23] presented HashNet, which learns binary hashing codes directly from imbalanced similarity data. Recognizing that semantic labels are governed by latent attributes, Yang


et al. [24] introduced supervised semantics-preserving deep hashing to unify image classification and retrieval. Zhang et al. [25] proposed a supervised learning framework for generating compact and bit-scalable hashing codes directly from raw images.

b) Semantic-based deep hashing: In addition to handcrafted features, Lu et al. [26] proposed hierarchical recurrent neural hashing to extract more useful texture features for image retrieval. To address the failure of existing deep neural networks in capturing underlying data structures, a deep fuzzy hashing network [27] that successfully incorporates fuzzy rules to model underlying data uncertainties is proposed. Various loss functions have also been explored to improve hashing performance. Yue et al. [28] designed a pairwise cross-entropy loss based on the Cauchy distribution, while Li et al. [29] aimed to optimize maximum class separability in the binary space. Jin et al. [30] emphasized the importance of preserving the local spatial structure and proposed deep ordinal hashing, which learns ordinal representations to generate ranking-based hashing codes for enhanced image retrieval performance. Shen et al. [31] proposed deep asymmetric pairwise hashing by revealing similarities indicated by semantic labels outputted by jointly trained deep neural networks for multimedia retrieval. Zhai et al. [32] proposed Deep Transfer Hashing, inspired by knowledge distillation for model compression, which uses knowledge from a teacher model as supervised information. Zheng et al. [33] addressed the quantization error resulting from continuous relaxation and proposed Deep Balanced Discrete Hashing (DBDH). Yuan et al. [34] proposed Central Similarity Quantization (CSQ) to optimize the central similarity between data points based on central points.

In conclusion, the majority of supervised hashing approaches have traditionally relied on handcrafted sampling patterns or label annotations provided by dataset providers. However, it becomes challenging to curate high-quality datasets with predefined labels in numerous applications. Existing studies face the challenge of optimizing generated hash codes to maximize intra-class similarity while minimizing inter-class similarity.

2) Unsupervised deep hashing model: In contrast to supervised deep hashing, unsupervised deep hashing primarily relies on the similarity computation of semantic features for unsupervised data samples. Clustering algorithms are commonly employed in unsupervised deep hashing approaches. Besides feature optimization, others introduce more advanced DNNs for unsupervised deep hashing.

a) Clustering-based unsupervised deep hashing: Wu et al. [35] proposed unsupervised deep video hashing, which successfully preserves the neighborhood structure by integrating feature clustering and feature binarization. For efficient visual object matching, Lin et al. [36] proposed DeepBit, aiming to learn compact binary descriptors. Jin et al. [37] proposed unsupervised semantic deep hashing, utilizing semantic information extracted by the convolutional layer to guide network training. Cui et al. [38] observed limited performance improvement in previous unsupervised hashing research due to a lack of semantic guidance. Therefore, they proposed SCAlable deep hashing to learn enhanced hashing codes for social image retrieval.

b) Optimizing DNNs for unsupervised deep hashing: Dizaji et al. [39] introduced HashGAN, which leverages the adversarial loss computed by a Generative Adversarial Network (GAN). Similarly, Deng et al. [40] designed an unsupervised hashing algorithm that exploits the semantic similarity of the training dataset using GAN, considering both feature and neighbor similarity. Zhang et al. [41] incorporated a hashing layer as a hidden layer of an autoencoder and proposed Autoencoder-based Unsupervised Clustering and Hashing. Wang et al. [42] integrated graph convolutional networks and proposed an unsupervised deep hashing method called node representation deep hashing for image retrieval.

In summary, current unsupervised models concentrate on similarity computation grounded in semantic features. While these models exhibit proficiency with unlabeled samples, they falter in detecting unknown samples and new instances that markedly differ in similarity from the already-known samples. Diverging from the existing unsupervised deep hashing approaches, this study extends supervised deep hashing through the incorporation of voting methods to address the aforementioned challenge.

Previous research has explored supervised and unsupervised hashing techniques, but none of them have specifically examined deep hashing in the context of malware images. This study is the first attempt to address the use of deep hashing specifically for malware image classification. While many DNNs have been successful in extracting features from malware images, our study provides further insights into the fine-grained features that distinguish different malware families and variants.

B. Malware Image Classification

Building upon the advancements in deep learning models for image classification, it is common to train deep learning models specifically for malware image classification [43], [44]. These deep learning-based models focus on optimizing DNNs to automatically extract feature maps.

1) Deep learning for malware classification: In the realm of malware detection, many existing deep learning models utilize grayscale images of malware [45]. For instance, Darus et al. [46] performed malware detection using three machine learning models that leverage GIST features extracted from grayscale images. However, deep learning models have proven to outperform state-of-the-art machine learning models when it comes to automatic feature extraction. Cui et al. [9] proposed a CNN specifically designed for detecting malware variants. Gibert et al. [47] observed a high degree of similarity among malware variants within the same family and proposed a file-agnostic deep learning system that learns visual features from executable files to classify malware into families. Furthermore, Wang et al. [14] proposed a multi-classification method for Android malware family detection in IoT devices.

2) Hashing as a feature for malware detection: The remarkable progress made in detecting malware through the analysis of malware images has led to significant research efforts emphasizing the role of hashing functions in achieving


[Figure 1 diagram: the query image and the database images pass through DNN feature extraction to obtain image features (query image feature; image features 1 to N), a hash function maps them to binary codes (query binary code; image binary codes 1 to N), and a similarity measure (Hamming distance) returns the database images ranked by similarity, from Top 1 to K.]

Fig. 1: The system architecture of ResNet50-based deep hashing for malware detection and family classification.
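The retrieval path in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the codes are assumed to be available already as NumPy arrays with entries in {−1, 1}, and the function names (`hamming_distances`, `retrieve_top_k`, `majority_vote`), the toy 4-bit codes, and the family labels are all hypothetical. Only a plain majority vote over the retrieved labels is shown; the Hamming-distance-based voting variant is not specified at this point in the text.

```python
from collections import Counter

import numpy as np


def hamming_distances(query_code, database_codes):
    """Count the differing positions between one code and every database code."""
    return np.sum(database_codes != query_code, axis=1)


def retrieve_top_k(query_code, database_codes, k):
    """Return the indices and distances of the k closest database codes."""
    distances = hamming_distances(query_code, database_codes)
    # A stable sort keeps the database order among equally distant codes.
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


def majority_vote(labels):
    """Pick the most frequent label among the retrieved samples."""
    return Counter(labels).most_common(1)[0][0]


# Toy database (hypothetical): four 4-bit codes with made-up family labels.
database_codes = np.array([
    [1, 1, -1, -1],
    [1, 1, -1, 1],
    [-1, -1, 1, 1],
    [1, -1, -1, -1],
])
database_labels = ["family_A", "family_A", "family_B", "family_A"]
query_code = np.array([1, 1, -1, -1])

top_idx, top_dist = retrieve_top_k(query_code, database_codes, k=3)
predicted = majority_vote([database_labels[i] for i in top_idx])
print(top_idx, top_dist, predicted)
```

Because the codes are binary, ranking needs only elementwise comparisons and a sort, which is what makes hash-based retrieval attractive for large databases.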

efficient and rapid malware detection. One notable contribution in this area is the work of Ni et al. [48], who introduced a method called Malware Classification using SimHash and CNN. This method demonstrates the ability to identify new malware with an impressive average time of 1.41 seconds. It is important to recognize that hashing techniques are not limited to malware images alone; they can also be applied to other malware features. Alotaibi et al. [19] leveraged the Ruzicka index to validate the hash values of various malware features, resulting in accelerated malware detection.

Although deep learning has demonstrated effectiveness in malware detection, current models encounter difficulties in detecting malware variants and zero-day malware, especially in IoT devices that are targeted by new malware [11], [49]. Furthermore, current research on hashing-based malware classification primarily adopts a supervised approach and exhibits limited generalization across diverse models. Simultaneously, scant attention has been directed towards optimizing the hashing codes generated by deep hashing models. In this regard, our paper stands out as it is the first attempt to introduce a dedicated deep hashing-based model explicitly tailored for malware classification and the identification of new malware.

III. DEEP HASHING-BASED MALWARE CLASSIFICATION

Deep hashing is commonly employed in conjunction with a deep learning model, and ResNet50¹ [3] is a state-of-the-art CNN often utilized for malware detection and classification. We employ malware gray-scale images as inputs because both semantic and texture features are preserved [44], [45]. The primary objective of this paper is to achieve high-performance malware detection and classification through the utilization of deep hashing in combination with ResNet50. The system architecture, illustrated in Fig. 1, consists of three sequential components: the DNN for malware detection, the deep hashing module, and the similarity-based prediction mechanism.

¹ With minor modifications to the deep hashing process, other deep neural networks for malware classification can substitute ResNet50 in our system.

A. Deep Neural Network for Malware Classification

Many state-of-the-art CNNs are available for malware image classification, including VGG [1], ResNet [3], and AlexNet [2]. When taking malware gray-scale images as inputs, each neural network is trained as the base model for malware classification. In our experiments, we demonstrate that ResNet50 [3] achieves better performance on malware detection and classification than the other models. Therefore, this paper further aims to optimize ResNet50 for malware classification while focusing on the detection of malware variants that belong to different families. Consequently, the training and optimization of ResNet50, which corresponds to the DNN feature extraction component in Fig. 1, are presented as follows.

When provided with a set of $N$ malware gray-scale images $X = \{x_1, x_2, \cdots, x_N\}$ from $M$ families, the trained ResNet50 is modeled as a function $f : X \to C$, where $x_i \in X$ and $C = \{c_1, c_2, \cdots, c_M\}$ is a set of $M$ target classes. Since ResNet50 is mainly configured with many bottleneck residual blocks [3], it has been redesigned to take gray-scale images as inputs instead of color images. The configuration for the different layers of ResNet50 is shown in Table I. As shown in Table I, the last average pooling layer and the Softmax layer are deleted while we use the feature maps extracted by ResNet50 for building the deep hash. When trained with benchmark datasets, we denote the trained model as $f(x_i; \theta)$, where $\theta$ denotes the values of the parameters once the model converges.

The performance of the trained deep neural networks on malware classification directly determines the quality of the deep hash. Thus, multiple deep-learning models for malware classification are compared. To extract useful and important feature maps for malware samples, the designed deep neural networks should achieve satisfactory performance on all metrics, including accuracy, precision, recall, and Area Under Curve (AUC). Three models for malware classification and detection are designed in our study. All three models achieve satisfactory performance with > 90.0% accuracy on average for malware samples in our experiments. Once the trained


models are acceptable in accuracy, the high-layer features are extracted for building deep hash. In ResNet50, we extract output features from the last $k$ layers (where we usually set $k = 1$ and use the outputs from the 5-th convolutional block) and input them to the designed deep hashing models. As shown in Table I, the output features from the 5-th convolutional block are in the size of 2048 × 7 × 7.

Besides malware images, other malware features are also applicable here, such as MalConv [50] and AvastConv [51] that are trained from raw executable malware files. Once the deep neural networks show satisfactory performance on malware classification, we can extract the high-layer features and input them into the hashing algorithms. The designed system, as shown in Fig. 1, is flexible and scalable because the substitution of ResNet with other DNNs needs no changes to other components.

B. Deep Hashing-based Malware Classification

The inputs to a deep hashing algorithm are mainly feature maps extracted from neural networks. The better the quality of the extracted feature maps, the better the quality of the deep hashing. The authors in [52] prove that the feature maps outputted by the last $k$ convolutional layers are deterministic and important for final classification. In general, we set $k = 1$ while taking the output feature maps from the last convolutional layers as the inputs for the deep hashing algorithm. Based on ResNet50 for malware classification, the designed deep hashing for malware image retrieval is shown in Fig. 2.

1) ResNet-based deep hashing: Based on the features extracted from the ResNet50, this study creatively builds a deep hashing method for malware image retrieval as follows. The deep neural network, as a feature learning backbone [17], projects a raw malware gray-scale image $x_i$ into a high-dimension tensor $w_i \in \mathbb{R}^{H \times W \times D}$ as $w_i = f(x_i; \theta)$, where $H$ and $W$ are the height and width of the spatial dimensions, $D$ is the channel dimension. As shown in Fig. 1, we set $H = 7$, $W = 7$, and $D = 2048$ as an example. $\theta$ represents the parameters of the ResNet50 trained with malware images. Subsequently, the tensor $w_i$ is down-sampled to a vector $u_i \in \mathbb{R}^{1 \times D}$ with the global average pooling (GAP) as $u_i = f_{GAP}(w_i) = f_{GAP}(f(x_i; \theta))$.

The hashing algorithm usually computes a fixed-length code with $L$ bits. Thus, the hashing module takes the vector $u_i$ as inputs and outputs a compact real-valued vector $v_i \in \mathbb{R}^{1 \times L}$. A fully connected layer $f_c(\cdot)$ with $L$ channels to map the $u_i$ to $v_i$ is configured as $v_i = f_c(u_i)$.

Therefore, a hashing mapping function $\Phi(\cdot)$ takes the real-valued vector $v_i$ as input and yields a fixed-length binary hashing code, given as $b_i = \Phi(v_i)$, where we set $\Phi(\cdot) = \operatorname{sign}(\cdot)$ as a function that maps each element of $v_i[j]$, where $j \in [0, L - 1]$, into binary values of either $-1$ or $1$ as

$$\Phi(v_i[j]) = \operatorname{sign}(v_i[j]) = \begin{cases} -1, & \text{if } v_i[j] < 0 \\ 1, & \text{if } v_i[j] \geq 0 \end{cases}. \tag{1}$$

Based on this, the final hashing value of the vector $b_i \in \{-1, 1\}^L$ with $L$ elements is computed. In summary, this study denotes the hash function trained with ResNet50 as $\Phi(x_i)$.

Given a binary hashing code $b_i$, the deep hashing function aims to map similar data into similar binary codes with a small Hamming distance [53]. However, the hashing algorithm introduces high computation complexity because it requires computations of pairwise or triplet label information. In general, the Learning-to-Hash (LtH) [54] models the deep hashing algorithm as an optimization problem as

$$\min_{\Phi(\cdot)} \sum_{i,j} \mathcal{L}(\Phi(x_i), \Phi(x_j), f_{ij}), \tag{2}$$

where $x_i$ and $x_j$ denote two pairwise samples chosen to compare, and $f_{ij}$ indicates whether the labels for the chosen pairwise samples, $x_i$ and $x_j$, are the same, defined as

$$f_{ij} = \begin{cases} -1, & \text{if } c_i = c_j \\ 1, & \text{if } c_i \neq c_j \end{cases}. \tag{3}$$

In Eq. (2), the choice of the loss function $\mathcal{L}(\cdot)$ plays a crucial role in learning effective similarity measures. In [54], the authors formulate their learning objective based on pairwise or triplet similarities. However, it is also demonstrated that pairwise or triplet label information is not necessary in the polarization loss [55]. In the context of DPN [55], the objective is to learn accurate binary codes for image retrieval without relying on sample pairs or triples. The polarization loss $L_p$ utilized in DPN encourages the binary codes to have large Hamming distances, thereby improving their discriminative power. This loss is defined as the sum of pairwise Hamming distances between the binary codes, given by

$$L_p = \sum_{i=1}^{N} \max(\delta - v_i \cdot t_i, 0), \tag{4}$$

where $N$ is the number of training samples, $v_i$ is the binary code computed by the deep hashing model, and $t_i$ is the precomputed target vector of the $i$-th sample $x_i$. $\delta \geq 1$ is a margin parameter that controls the minimum distance between different classes and is usually set as 1.

To derive an optimized solution for Eq. (4), we need to compute the target vector $t_i$ of all class labels. Assume that there are $M$ kinds of malware classes, and the computation of $t_i$ is usually in a random assignment way, presented as follows.

2) Random assignment of target vector $t_i$ of all labels: The basic idea of Locality Sensitive Hashing [56] is to reduce the computational complexity by mapping high-dimensional data points into a low-dimensional hash space through a random projection or randomization function. In the hash space, the

TABLE I: Detailed configurations of the ResNet50 for malware image classification.

Conv. Block   Layers                         Kernel Size                              Output Size          Channel   Feature Size
1             [Conv2D, MaxPooling] × 1       64 × 7 × 7, 64 × 3 × 3                   112 × 112, 56 × 56   64        64 × 56 × 56
2             [Conv2D, Conv2D, Conv2D] × 3   64 × 1 × 1, 64 × 3 × 3, 64 × 3 × 3       56 × 56              256       256 × 56 × 56
3             [Conv2D, Conv2D, Conv2D] × 4   128 × 1 × 1, 128 × 3 × 3, 512 × 1 × 1    28 × 28              512       512 × 28 × 28
4             [Conv2D, Conv2D, Conv2D] × 6   256 × 1 × 1, 256 × 3 × 3, 1024 × 1 × 1   14 × 14              1024      1024 × 14 × 14
5             [Conv2D, Conv2D, Conv2D] × 3   512 × 1 × 1, 512 × 3 × 3, 2048 × 1 × 1   7 × 7                2048      2048 × 7 × 7


similarity between the data points can be measured by the Hamming distance between the hashing codes. Furthermore, we achieve an efficient approximate nearest neighbor search based on similarity measures. Through detailed experiments, a random assignment policy is introduced to ensure that the generated hashing codes have sufficient inter-class Hamming distance.

In general, two binary target vectors $t_i, t_j \in H^L$ in an $L$-bit Hamming space $H^L = \{-1, 1\}^L$ are sampled, where each bit is set to 1 with probability $p$. The Hamming distance between $t_i$ and $t_j$, denoted as $D_h(t_i, t_j)$, is the number of differing bits at corresponding positions. The expectation of the Hamming distance between these two vectors is $E_p(D_h(t_i, t_j)) = 2L \cdot p(1 - p)$, which reaches its maximum value of $0.5L$ at $p = 0.5$. The computation of the target hashing codes based on a predefined positive rate $p$ and the code length $L$ is shown in Algorithm 1.

Algorithm 1: Target hashing code generation
Input: hashing code length $L$, positive bit rate $p$, $M$ class labels.
Output: code set $\{t_1, t_2, \cdots, t_M\}$, where $t_i \in \{-1, 1\}^L$ satisfies $\min(D_h(t_i, t_j)) \ge \delta$.
  for i ← 1 to M do
    Initialize an empty hashing code t_i ← NULL;
    for j ← 0 to L − 1 do
      Generate a random number r ∈ [0, 1];
      if r < p then
        t_i[j] = 1;
      else
        t_i[j] = −1;
      end if
    end for
  end for
  Return {t_1, t_2, ..., t_M};
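The random assignment in Algorithm 1 and the loss of Eq. (4) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the `seed` and `max_tries` parameters, and the resampling step that enforces a minimum pairwise distance are assumptions added here, since Algorithm 1 as listed draws the codes once and does not show how the distance constraint is checked.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two {-1, +1} codes differ."""
    return sum(x != y for x, y in zip(a, b))


def generate_target_codes(num_classes, code_len, p=0.5, min_dist=0,
                          seed=0, max_tries=1000):
    """Draw one random {-1, +1} code per class, as in Algorithm 1.

    Assumption: the whole set is redrawn until every pair of codes is
    at least `min_dist` bits apart (the listing does not spell this out).
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        codes = [[1 if rng.random() < p else -1 for _ in range(code_len)]
                 for _ in range(num_classes)]
        if all(hamming(a, b) >= min_dist for a, b in combinations(codes, 2)):
            return codes
    raise RuntimeError("no code set met min_dist; lower it or use longer codes")


def polarization_loss(outputs, targets, delta=1.0):
    """Eq. (4): sum over samples of max(delta - v_i . t_i, 0)."""
    total = 0.0
    for v, t in zip(outputs, targets):
        agreement = sum(a * b for a, b in zip(v, t))
        total += max(delta - agreement, 0.0)
    return total


# Example: 25 classes (as in Malimg) with 48-bit codes.
codes = generate_target_codes(num_classes=25, code_len=48, p=0.5, min_dist=10)
pair_dists = [hamming(a, b) for a, b in combinations(codes, 2)]
mean_dist = sum(pair_dists) / len(pair_dists)
expected = 2 * 48 * 0.5 * (1 - 0.5)  # 2Lp(1-p)
print(min(pair_dists), round(mean_dist, 2), expected)
```

With $p = 0.5$ the empirical mean pairwise distance should sit close to $2Lp(1-p) = 0.5L$, which is why the balanced setting is the natural default for spreading class targets apart.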

[Figure: in the feature learning part, a malware image passes through the ResNet50 backbone (Stage 0 to Stage 4, parameters $\theta$) to obtain $w_i$. In the hash code learning part, GAP yields $u_i$, an $L$-bit hash layer yields $v_i$, and a sign function yields the binary codes $b_i$ for Family 1 to Family N. The polarization loss $\mathcal{L}_P(v_i, t_i) = \sum_{i=1}^{N} \max(\delta - v_i \cdot t_i, 0)$ compares $v_i$ with the target hash code of every class, whose bits are -1 or 1.]
Fig. 2: The network configuration of the ResNet50-based deep hashing for malware classification.

3) Hamming voting-based malware classification: As the RNDPN is specifically designed for malware image retrieval, this study is the first to apply it to malware identification. Two voting methods, based on the Top K retrieved samples, have been devised, namely majority voting and Hamming-distance-weighted voting.

For a query image $x_q$ and a set of images in a database $D = \{x_1, x_2, \cdots, x_N\}$, their binary hashing codes are represented as $b_q$ and $B = \{b_1, b_2, \cdots, b_N\}$, respectively. Let $C = \{c_1, c_2, \cdots, c_N\}$ be the label information corresponding to each image, where $c_i \in C$ is the label of the $i$-th image.

For a given query image $x_q$, the hash model computes its hash value and launches an image retrieval operation based on $B$. By solving the optimization function defined in Eq. (4), a set of index values of the Top K nearest neighbor samples $b_1, b_2, \cdots, b_K$ is computed. Each index corresponds to a single sample $x \in D$. Once finished, all retrieved samples are denoted as $X' = \{x'_1, x'_2, \cdots, x'_K\}$. In general, those samples often carry different class labels. Based on all retrieved samples, it is natural to launch a majority voting among them to decide the final label for a queried sample. In a majority voting-based malware classification, the final label for a queried sample is defined based on all retrieved samples and their labels $\langle x_i, c_i \rangle$ as

$$c_q = i, \quad \text{where } i = \arg\max_{i \in \{1, 2, \cdots, M\}} \mathrm{Count}(c_i), \tag{5}$$


where $\mathrm{Count}(\cdot)$ is a count function that returns the total number of samples belonging to the same label.

The majority voting-based malware classification performs well when a large number of samples belonging to the same family are collected. However, the classification performance decreases remarkably when detecting unknown and new malware variants. Thus, the Hamming distances between the retrieved samples and the query image $x_q$ are computed as $D_h(b_q, b_i)$. A shorter distance usually means a higher probability that the query image $x_q$ is assigned the same label as the retrieved image. Thus, our study denotes the distance weights as $w = \{w_1, w_2, \cdots, w_K\}$, where $w_i = \frac{1}{D_h(b_q, b_i)}$. For each triple $\langle x_i, b_i, c_i \rangle$, the final score $S_{c_i}$ under each class label $c_i \in C$ for a queried sample $x_q$ is computed as

$$S_{c_i} = \sum_{i=1}^{K} w_{c_i}, \quad \text{where } f(x_i) = c_i, \tag{6}$$

where $S$ is a vector with a length of $M$. $S_{c_i}$ denotes the weighted probability that $x_q$ belongs to the class $c_i$ based on all retrieved samples in $X'$. Based on the weighted probabilities in $S$, $x_q$ is assigned the predicted label $c_q$ with the maximum value in $S$ according to the principle of the weighted majority voting:

$$c_q = i, \quad \text{where } i = \arg\max_{i \in \{1, 2, \cdots, M\}} S_{c_i}. \tag{7}$$

The Hamming distance-based voting excels when retrieved samples correspond to multiple labels. Nevertheless, there is a possibility that all retrieved samples belong to different labels, resulting in Eq. (7) returning no viable result. In the malware domain, this scenario may occur when an unknown or new malware is detected. In such cases, the query sample is designated as new.

The Hamming distance-based voting outlined in Eq. (7) balances the efficacy of deep hashing for large-scale image retrieval with the weighted majority voting based on Hamming distance. To our knowledge, this study is the first to apply this method to malware classification, particularly to malware family detection, leveraging deep hashing tailored for large-scale malware image retrieval. The detailed implementation of the RNDPN-based Hamming voting for malware classification is shown in Algorithm 2.

Algorithm 2: The RNDPN-based Hamming voting for Malware Classification
Input: dataset $X = \{x_1, x_2, \cdots, x_N\}$, target binary vectors $T = \{t_1, t_2, \cdots, t_M\}$, learning rate $\alpha$, query sample $x_q$, number of returned samples Top K, labels of all samples $C = \{c_1, c_2, \cdots, c_N\}$;
Output: ResNet50 $f(\,; \theta)$, trained ResNet50-based DPN model $\Phi(\,; \theta)$, label $c_q$ of $x_q$.
  Training:
  Initialize network weights θ for f(; θ);
  for number of epochs do
    Randomly select a sample x from X;
    v ← f_GAP(f(x; θ)); L ← L_p(t, v); θ ← θ − α∇_θ L;
  end for
  Testing:
  b_i ← Φ(x_i; θ) for each x_i in X; B = {b_1, b_2, ..., b_N};
  b_q ← Φ(x_q; θ); d ← [];
  for i ← 1 to N do
    d_i ← D_h(b_q, b_i);
    append (d_i, c_i) to d;
  end for
  Sort d by the first element of each tuple in ascending order;
  for j ← 1 to Top K do
    w_j ← 1 / d_j;
  end for
  c_q ← argmax_{i ∈ {1, 2, ..., M}} Σ_{j=1}^{K} w_{c_j};
  Return Φ(; θ), c_q

IV. EXPERIMENT AND ANALYSIS

A. Dataset and Performance Metrics

Malimg dataset: All models are tested with the Malimg dataset [4]. This dataset serves as a benchmark for malware classification and consists of 25 malware families with over 9,000 malware samples. For our experiments, we allocated 90% of the dataset for training and 10% for testing. The distribution of samples across each malware family can be found in Table II. Additionally, over 900 malware samples from various well-known malware families are collected for further evaluation. To assess the performance of our models in detecting unknown and new malware families, malware samples from VX-Heaven (http://vxheaven.org/) are also used, which are labeled differently from the samples in Malimg. The VX-Heaven dataset contains 52 malware families, with some families having only a few samples available. All models are configured and evaluated under the same experimental settings to ensure fair comparison.

CICMalDroid2020: In our study, we also incorporate the CICMalDroid2020 dataset [57] across all experiments to validate the robustness of our models with extensive coverage. This dataset encompasses 10,000 Android malware samples distributed across four categories: adware, banking malware, malicious SMS software, and riskware. Notably, the dataset is imbalanced, with only 870 samples in the FakeInst family. Consequently, as detailed in Table III, we selectively utilize 3,491 malware instances spanning 13 families in our experimental evaluations.

Four commonly used indices are utilized to evaluate a classification model: true positive ($TP$), true negative ($TN$), false positive ($FP$), and false negative ($FN$). $TP$ and $TN$ represent correctly predicted classifications for positive and negative samples, respectively. On the other hand, $FP$ and $FN$ represent incorrectly predicted classifications for negative and positive samples, respectively. Using these indices, all metrics are defined as follows, including accuracy, precision,


and recall

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{8}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{9}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}. \tag{10}$$

mAP: To evaluate the performance of deep hashing algorithms, all models are tested with mAP [2]. In particular, the performance of each model based on predefined values of the top K retrieved images is denoted as mAP@K. Additionally, the mAP@all metric indicates that all retrieved samples are used for evaluation. Given a database of $N$ samples and $M$ samples to be queried, mAP is calculated as

$$\mathrm{mAP}@K = \frac{1}{M} \sum_{q=1}^{M} \frac{1}{N} \sum_{j=1}^{N} P(x_q, x_j)\, l(c_q = c_j), \tag{11}$$

where $P(x_q, x_j)$ denotes the precision at the $j$-th retrieved image while querying the $q$-th image, and $l(\cdot)$ is the indicator function, where $l(\text{True}) = 1$ and $l(\text{False}) = 0$.

Furthermore, malware family classification is a typical multi-category classification problem. To evaluate the merits of classifiers under each class comprehensively, another metric, macro-averaging, is introduced. Macro-averaging is the arithmetic mean of each statistical metric value over all categories. Two commonly used macro-averaging metrics are macro-Precision (macro_P) and macro-Recall (macro_R).

macro_P: Macro-Precision measures the average precision value of the multi-category model and is defined as $\text{macro\_P} = \frac{1}{N} \sum_{i=1}^{N} P(c_i)$, where $P(c_i)$ denotes the precision value of all samples with the $i$-th label.

macro_R: Macro-Recall measures the average recall value of the multi-category model and is defined as $\text{macro\_R} = \frac{1}{N} \sum_{i=1}^{N} R(c_i)$, where $R(c_i)$ denotes the recall value of all samples with the $i$-th label.

TABLE II The distribution of malware families in Malimg.

Family            Number   Virus type
Allaple.L         1591     Worm
Allaple.A         2949     Worm
Yuner.A           800      Worm
Lolyda.AA 1       213      PWS
Lolyda.AA 2       184      PWS
Lolyda.AA 3       123      PWS
C2Lop.P           146      Trojan
C2Lop.gen!g       200      Trojan
Instantaccess     431      Dialer
Swizzor.gen!I     132      TDownloader
Swizzor.gen!E     128      TDownloader
VB.AT             408      Worm
Fakerean          381      Rogue
Alueron.gen!J     198      Trojan
Malex.gen!J       136      Trojan
Lolyda.AT         159      PWS
Adialer.C         125      Dialer
Dontovo.A         162      TDownloader
Obfuscator.AD     142      TDownloader
Wintrim.BX        97       TDownloader
Dialplatform.B    177      Dialer
Agent.FYI         116      Backdoor
Autorun.K         106      Worm.AutoIT
Rbot!gen          158      Backdoor
Skinrim.N         80       Trojan

TABLE III The distribution of Android malware families in CICMalDroid2020.

Family     Number   Virus type
Opfake     340      Trojan-SMS
Agent      468      Trojan-SMS
FakeInst   870      Trojan-SMS
Jifake     223      Trojan-SMS
Wapnor     147      Trojan-Dropper
Piom       178      Trojan
Boogr      89       Trojan
Apofer     237      AdWare
Kuguo      120      AdWare
Dowgin     103      AdWare
Feiad      359      AdWare
Dnotua     194      Riskware
Madad      163      AdWare

B. Results and Analysis

1) Performance for malware image retrieval: The performance for malware retrieval based on malware gray-scale images is first evaluated in mAP. The ResNet50 is trained and used as the foundation to build deep hashing. Both RNDPN and six state-of-the-art deep hashing algorithms are tested, including ADSH [22], HashNet [23], DBDH [33], CSQ [34], DFH [29], and DCH [28]. For a fair comparison, all models are tested under the same configurations with the hashing code length $L$ varied as 8 bits, 16 bits, 32 bits, 48 bits, and 64 bits, respectively. The results are shown in Table IV.

Based on the results shown in Table IV, the following conclusions are obtained:
• RNDPN demonstrates superior performance compared to state-of-the-art deep hashing methods, such as CSQ [34] and DBDH [33]. Notably, it achieves a maximum improvement of 15.4% and 16.1% on mAP@40 compared to DBDH [33] across two datasets. In comparison to the classical deep hashing model, RNDPN exhibits an average improvement of 11.77% and 16.20% over HashNet [23] on the Malimg and CICMalDroid2020 datasets, respectively.
• In general, the performance improves gradually when increasing the length of the hashing codes for all deep hashing models. A shorter hashing code, such as $L = 8$ bits, usually has insufficient representation ability and thus often performs poorly on image retrieval. The failure is mainly caused by the mapping from a large feature space of the input images to a small hashing vector space. However, RNDPN achieves satisfactory performance and is 0.84% and 2.56% better than CSQ [34] on average under Malimg and CICMalDroid2020, respectively. With a longer hashing code, a richer feature representation is possible. Therefore, the performance under each deep hashing model improves by increasing the hashing code length.
• When the hashing code length is set as $L = 48$ bits, most models, including RNDPN, achieve their best performance on image retrieval. As the hashing code length increases further, the performance converges quickly. It can be concluded that a longer hashing code tends to cause overfitting.


  Therefore, the best setting for the hashing code length is suggested in the range of [48, 64] bits for malware retrieval tasks.

TABLE IV The performance under different deep hashing models in mAP@40 on two malware datasets.

                 Malimg (mAP@40)                               CICMalDroid2020 (mAP@40)
Model            8 bits  16 bits  32 bits  48 bits  64 bits    8 bits  16 bits  32 bits  48 bits  64 bits
ADSH [22]        62.00   69.58    89.14    81.16    78.90      56.39   62.18    70.47    81.16    80.48
HashNet [23]     67.56   83.09    87.86    91.73    89.80      64.93   74.05    80.83    86.44    88.38
DBDH [33]        80.73   80.76    87.71    91.16    84.38      79.55   81.23    85.00    88.56    82.91
CSQ [34]         91.23   95.67    96.10    -        95.74      89.27   90.22    94.83    -        94.32
DFH [29]         51.09   64.74    67.71    72.08    76.00      50.24   60.93    65.67    71.36    74.28
DCH [28]         81.24   93.45    93.66    93.95    93.09      79.64   90.66    91.03    91.65    91.48
RNDPN (Ours)     92.95   96.16    96.19    97.54    96.03      90.79   95.13    96.73    96.74    96.23

In summary, the designed deep polarized network in this paper is suitable for malware gray-scale image retrieval. The DPN outperforms all other deep hashing models with different hashing code lengths. To achieve the best performance for malware retrieval, a reasonable hashing code length is set to enhance the representation capability of the deep hash. Furthermore, the quality of the baseline deep neural network is the key to the performance of deep hashing models. Although ResNet50 is used as the baseline model in our study, other DNNs are also possible. Therefore, the designed DPN under different baseline DNNs is further evaluated for malware image retrieval.

2) Performance under different baseline models: There are many deep neural networks for malware classification and detection. For malware grayscale images, the ResNet50 is trained and used as the baseline model to build the deep hashing algorithms. To test whether the ResNet50-based DPN is the best model for malware image retrieval, our study also builds DPN on other deep neural networks, including VGG16 [1] and AlexNet [58]. Three models are evaluated under the same hashing code length with $L = 48$ bits. The accuracy under three models is shown in Table V. Based on the results in Table V, ResNet50 achieves 5.01% and 2.5% higher accuracy than AlexNet and VGG16, respectively. It is concluded that ResNet50 is more powerful in extracting representative grayscale image features than the other DNNs.

TABLE V The accuracy of DPN under three baseline models, including VGG16, AlexNet, and ResNet50, with L = 48 bits.

Baseline networks (DPN)   Accuracy (mAP@all)
AlexNet                   90.36
VGG16                     92.87
ResNet50 (ours)           95.37

3) Performance under malware classification: Based on RNDPN for malware retrieval, its performance for malware classification is further evaluated. Two kinds of malware classification methods based on the Top K retrieved images $X' = \{x'_1, x'_2, \cdots, x'_K\}$ and their labels are proposed, including the majority voting and the Hamming distance-based voting. The classification accuracy is shown in Fig. 3. When evaluating the classification performance based on deep hashing algorithms for malware image retrieval, a good value of K is important. The classification accuracy is collected while changing the value of K in the range of [1, 200]. Based on the results in Fig. 3, the following observations are made:
• The Hamming distance-based voting only marginally outperforms the majority voting on the whole. The same results are also observed in deep hashing on large-scale images, especially when many retrieved samples share the same label.
• For a small value of K (K ≤ 5), the classification performance changes remarkably. The main reasons for the fluctuation of accuracy are the biases introduced among those retrieved samples and overfitting problems.
• When K ∈ [5, 50], both voting methods achieve stable accuracy around 96.0%. The best result with 96.5% classification accuracy is observed when K = 40.
• There is a sharp performance degradation when K ≥ 50. This is notable because image retrieval on common datasets, including MNIST and CIFAR10, achieves a stable accuracy when K ≤ 200. The major reason is that malware grayscale images are dominated by texture features, while images from other datasets are mainly dominated by semantic content.

All in all, the number of samples belonging to the same class is critical to the above observations in the malware image domain.

[Figure: accuracy (%), from 95.0 to 97.0, versus Top K, from 10 to 50, with one curve for Majority Voting and one for Hamming Voting.]
Fig. 3: The classification accuracy under the majority voting and the Hamming distance-based voting shown in Eq. (7) with different numbers of Top K-retrieved samples.
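The two voting rules compared in Fig. 3 can be illustrated with a small, self-contained sketch. It follows Eq. (5) for majority voting and Eqs. (6) and (7) for Hamming distance-weighted voting. The helper names, the toy 8-bit codes, and the small `eps` guard against a zero distance are assumptions made for illustration; the paper does not state how a zero distance is handled.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of positions at which two {-1, +1} codes differ."""
    return sum(x != y for x, y in zip(a, b))


def top_k(query_code, db_codes, db_labels, k):
    """Return the k nearest (distance, label) pairs, nearest first."""
    ranked = sorted((hamming(query_code, code), label)
                    for code, label in zip(db_codes, db_labels))
    return ranked[:k]


def majority_vote(neighbors):
    """Eq. (5): the label with the largest count among retrieved samples."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def hamming_weighted_vote(neighbors, eps=1e-9):
    """Eqs. (6)-(7): add w_i = 1 / D_h(b_q, b_i) per label, then argmax."""
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += 1.0 / (dist + eps)  # eps: assumed zero-distance guard
    return max(scores, key=scores.get)


# Toy database of 8-bit codes: one close sample of family A,
# two farther samples of family B, and one distant sample of family C.
query = [1, 1, 1, 1, 1, 1, 1, 1]
db_codes = [
    [-1, 1, 1, 1, 1, 1, 1, 1],          # distance 1
    [-1, -1, -1, -1, 1, 1, 1, 1],       # distance 4
    [1, 1, 1, 1, -1, -1, -1, -1],       # distance 4
    [-1, -1, -1, -1, -1, -1, -1, -1],   # distance 8
]
db_labels = ["A", "B", "B", "C"]

neighbors = top_k(query, db_codes, db_labels, k=3)
print(neighbors)
print("majority:", majority_vote(neighbors))
print("weighted:", hamming_weighted_vote(neighbors))
```

In this toy case the two rules disagree: family B wins on count, while family A wins on weight because its single sample is much closer to the query. This is the kind of situation in which distance weighting changes the decision.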


TABLE VI The performance for malware classification under conventional DNNs and deep hashing-based models on two malware datasets.

                          Malimg (mAP@40)                            CICMalDroid2020 (mAP@40)
Models                    Accuracy(%)  macro_P(%)  macro_R(%)        Accuracy(%)  macro_P(%)  macro_R(%)
ResNet50                  95.19        91.07       91.47             94.77        90.64       90.25
AlexNet                   91.99        90.74       88.23             91.37        90.63       90.78
VGG16                     92.51        90.23       90.12             91.86        90.73       90.53
RNDPN-Majority Voting     95.83        95.37       95.04             95.44        95.28       95.05
RNDPN-Hamming Voting      96.5         96.01       95.71             95.94        95.58       95.26
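The accuracy, macro_P, and macro_R columns of Table VI follow the definitions in Eqs. (8) to (10) and the macro-averaging rule. A minimal sketch of how such values could be computed from predicted and true labels is given below. The function name, the toy labels, and the convention of counting a class with no predictions as zero precision are assumptions, not details taken from the paper.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        # Assumed convention: an undefined ratio counts as 0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro_p = sum(precisions) / len(labels)
    macro_r = sum(recalls) / len(labels)
    return accuracy, macro_p, macro_r


# Toy example with three families.
y_true = ["worm", "worm", "worm", "trojan", "trojan", "dialer"]
y_pred = ["worm", "worm", "trojan", "trojan", "trojan", "worm"]
accuracy, macro_p, macro_r = macro_metrics(y_true, y_pred)
print(round(accuracy, 4), round(macro_p, 4), round(macro_r, 4))
```

Because macro-averaging gives every family the same weight, a small family that is always misclassified (the single dialer sample here) pulls macro_P and macro_R down much more than it affects accuracy, which is why the tables report all three.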

With Top K fixed at its best value, the voting-based deep hashing models and the benchmark DNNs are evaluated for malware classification. This study highlights malware family classification and new malware detection based on deep hashing techniques. Two voting-based DPN models are compared with three benchmark malware classification DNNs, including ResNet50, AlexNet, and VGG16. The results are shown in Table VI. All models are compared under three performance metrics, including accuracy, precision, and recall. Based on the above results, our key findings are as follows:
• The Hamming voting-based RNDPN achieves the best classification performance on all metrics with 96.5% accuracy, 96.01% precision, and 95.71% recall. The reasons for its success are twofold. First, the ResNet50 extracts more useful features for malware classification than VGG16 and AlexNet, as shown in Fig. 4. Second, Hamming-distance-based voting enhances the classification performance when compared with majority-voting-based DPN, as shown in Fig. 4.
• When compared with three DNNs, including ResNet50, VGG16, and AlexNet, the DPN-based hashing models are all effective in malware classification even with a small value of Top K (K = 40 in our experiments). More specifically, Top 10 retrieved samples of the Hamming-voting-based RNDPN are shown in Fig. 5. Therefore, we conclude that deep learning-based deep hashing models are more powerful in extracting representative and detailed features for malware grayscale images.
• Most DNNs for malware classification are supervised based on a set of labeled samples for both training and validation. On the contrary, the deep hashing-based models for malware classification are non-supervised. Therefore, our study further compares their performances in detecting new malware or zero-day malware.

[Figure: accuracy (%), from 70 to 100, versus epoch, from 0 to 200, with curves for Majority Voting, Hamming Voting, AlexNet, VGG16, and ResNet50.]
Fig. 4: Comparison of accuracy under different models for malware classification.

4) Performance for new malware or zero-day malware identification: To thoroughly assess the effectiveness of the voting-based RNDPN in classifying unknown or new malware samples, both majority voting as defined in Eq. (5) and Hamming distance-based voting as defined in Eq. (7) with RNDPN are compared against three benchmark DNNs. The performance of all five models is evaluated in terms of accuracy, macro-precision, and macro-recall using the VX-Heaven dataset. As depicted in the results presented in Table VII, deep hashing consistently outperforms conventional DNNs in malware detection.

TABLE VII The performance for malware classification under three state-of-the-art DNNs and deep hashing-based models to unknown or new samples.

Models                    Accuracy(%)  macro_P(%)  macro_R(%)
ResNet50                  65.31        67.42       65.87
AlexNet                   63.90        62.87       68.13
VGG16                     64.20        63.23       75.19
RNDPN-Majority Voting     78.91        79.83       78.83
RNDPN-Hamming Voting      85.7         86.8        85.3

• Based on the results shown in Table VI and Table VII, three state-of-the-art DNNs (including ResNet50, VGG16, and AlexNet) perform unsatisfactorily in detecting new or unknown malware. On known families (Table VI), ResNet50 is comparable to RNDPN in accuracy but weaker on macro_P and macro_R. It is, thus, challenging to detect unknown and new malware with supervised deep-learning models.
• Both the majority voting and the Hamming distance-based voting with RNDPN outperform three DNNs by a great margin. The Hamming distance-based voting achieves the best result on all metrics. It outperforms the majority voting with 6.79% accuracy improvement, 6.97% macro_P improvement, and 6.47% macro_R improvement. The Hamming distance-based voting is 21.23% better in accuracy on average than three DNNs, including ResNet50, VGG16, and AlexNet. The success of the Hamming distance-based voting is twofold. First, RNDPN extracts more intricate and fine-grained features for malware classification. Second, the Hamming distance-based voting and the majority voting


  achieve a good balance between the queried samples and their labels.
• Furthermore, the Hamming distance-based voting is better than the majority voting-based model. First, the Hamming distance-based voting measures the number of different bits between two binary codes. Consequently, it captures the subtle differences among samples with different labels. The majority voting, in contrast, only considers the category label of each sample and ignores the variability among samples belonging to the same category. Second, the Hamming distance-based voting with the RNDPN model captures the correlation among samples. Thus, it can improve the accuracy and robustness of classification by exploiting this correlation. Based on this feature, a secure and robust RNDPN for malware identification is feasible.
• For a new or unknown malware sample, the samples retrieved by RNDPN typically carry multiple labels and have a larger weighted Hamming distance, as shown in Fig. 6. The identification of new and unknown malware is valuable to the performance enhancement of the existing deep learning-based malware detection systems. This is beneficial in the context of zero-day sample detection, as it allows models to capture potential patterns and similarities among samples, even if they belong to previously unseen or unknown malware families. The results in Table VII extend the application of deep hashing models to not only large-scale malware retrieval but also the small-sample learning domain.

[Figure: three query images (Dialer.Adialer.C, Worm.VB.AT, and Trojan.Alueron.gen!J), each followed by its Top-10 retrieved images, which carry the same family label as the query.]
Fig. 5: The Top 10 queried samples with the Hamming distance-based voting on RNDPN for malware image retrieval.

In Fig. 6, three samples from the VX-Heaven dataset are tested for new malware identification. The Hamming distance-based voting on RNDPN returns the Top 10 queried samples. The red frame on each queried sample means that the label of the retrieved image differs from the label of the query sample $x_q$. In Fig. 6, querying with the first sample Hoax.Win32.Renos returns 10 samples belonging to the same family Worm.Allaple.A. By controlling the weighted distance with a minimum threshold (the minimum Hamming distance is 10 in our experiments), this sample is identified as New. Through detailed experiments, the minimum Hamming distance of 10 is the basis to identify new malware samples. When querying a typical worm Net-Worm.Win32.Kolabc in our second query, the retrieved samples belong to two families, including Worm.Allaple.L and Rogue.Fakerean. Although this query outputs two family labels, all samples returned are worms. Thus, the final prediction based on the Hamming distance-based voting is a worm. The third query with TrojanDownloader.Win32.Adload returns samples that all share one label. This sample is predicted as TDownloader, which belongs to the Trojan class. A TDownloader virus usually downloads and installs other malware or components onto an infected computer. Despite the different family labels, the weighted Hamming distance is smaller than 10 and the same parent class TDownloader is returned. Therefore, RNDPN with the Hamming distance-based voting can effectively identify new or unknown malware.

Because some malware families have limited samples, more comparisons are performed by returning more samples from the RNDPN model. In Fig. 7, three samples with 40 retrieved samples each are shown. Based on the results in Fig. 7, more than 4 malware labels are returned in all cases. More importantly, the Top 3 retrieved samples belong to different labels. Based on the weighted Hamming distance, the designed model successfully identifies all three samples as New instead of assigning any label of the retrieved samples. The above results consolidate our conclusion that the Hamming distance-based voting with RNDPN succeeds in identifying new malware.

Despite operating in a supervised manner, RNDPN remains versatile and applicable to unsupervised scenarios. This adaptability arises from the fact that malware identification relies on the Hamming distance rather than explicit label information. In unsupervised scenarios, unknown, new, or zero-day malware samples can be input into the trained deep hashing model. The generated hash codes typically exhibit closer similarities among retrieved samples. Additionally, with minor modifications to the hashing component in RNDPN, a transition to unsupervised hashing is possible, a feature that we plan to explore in our future work.
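The thresholding described above can be sketched as a small decision rule on the retrieved neighbors. The text reports a minimum Hamming distance of 10 but does not spell out exactly how the threshold is combined with the vote, so the rule below is an assumption: the query is flagged as New when even its nearest retrieved sample is at least `min_distance` bits away, and otherwise the Hamming distance-weighted vote of Eq. (7) decides. The function name and the toy neighbor lists are hypothetical.

```python
from collections import defaultdict

NEW_LABEL = "New"


def classify_or_flag_new(neighbors, min_distance=10):
    """Label a query from its retrieved (hamming_distance, label) pairs.

    Assumed rule: if no retrieved sample is closer than `min_distance`,
    the query is treated as a new or unknown family.
    """
    if not neighbors or min(d for d, _ in neighbors) >= min_distance:
        return NEW_LABEL
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += 1.0 / max(dist, 1e-9)  # weight w_i = 1 / D_h
    return max(scores, key=scores.get)


# A query whose neighbors are close: the usual weighted vote applies.
known = [(2, "Worm.Allaple.A"), (3, "Worm.Allaple.A"), (12, "Rogue.Fakerean")]
# A query whose neighbors all share one label but are far away.
far = [(14, "Worm.Allaple.A"), (15, "Worm.Allaple.A"), (16, "Worm.Allaple.A")]

print(classify_or_flag_new(known))
print(classify_or_flag_new(far))
```

The second case mirrors the first query in Fig. 6: every retrieved sample has the same family label, yet the query is still reported as New because none of the retrieved codes is close enough.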


Worm.Allaple.A

Hoax.Win32.Ren
os

Worm.Allaple.L Rouge.Fackrean

Net-Worm.Win32.
Kolabc

Tdownloader.Dontovo.A

TrojanDownload
er.Win32.Adload

Query Top-10 retrieved images

Fig. 6: The T op 10 queried samples with the Hamming distance-based voting on RNDPN for new/unknown malware sample retrieval.

[Figure 7. Query samples: Backdoor.Win32.Bifrose, Backdoor.Win32.VB, Exploit.Win32.Pidief. Labels shown on the top-40 retrieved images: Trojan.Malex.gen!J, Dialer.Instantaccess, Rogue.Fakerean, Trojan.Alueron.gen!J, Trojan.C2LOP.gen!g, Tdownloader, Swizzor.gen!I, Worm.Yuner.A, Trojan.C2LOP.P, PWS.Lolyda.AT, PWS.Lolyda.AA1, Backdoor.Agent.FYI, Backdoor.Rbot!gen.]

Fig. 7: The top 40 queried samples with the Hamming distance-based voting on RNDPN for new/unknown malware sample retrieval.
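To make the difference between the two voting schemes concrete, the following sketch contrasts plain majority voting with a distance-weighted vote over a list of retrieved (label, Hamming distance) pairs. The weighting used here, `code_length - distance`, is an assumption chosen for illustration, and the function names and the 32-bit toy example are hypothetical; the weighting actually used by the Hamming distance-based voting on RNDPN may differ.

```python
from collections import Counter, defaultdict

def majority_vote(neighbours):
    """Return the label that occurs most often among (label, distance) pairs."""
    counts = Counter(label for label, _ in neighbours)
    return counts.most_common(1)[0][0]

def hamming_weighted_vote(neighbours, code_length):
    """Assumed weighting: each neighbour votes with (code_length - distance)."""
    scores = defaultdict(float)
    for label, distance in neighbours:
        scores[label] += code_length - distance
    return max(scores, key=scores.get)

# Hypothetical retrieval result for a 32-bit code: two very close hits from one
# family and three distant hits from another.
neighbours = [
    ("Worm.Allaple.A", 1),
    ("Worm.Yuner.A", 20),
    ("Worm.Yuner.A", 22),
    ("Worm.Allaple.A", 2),
    ("Worm.Yuner.A", 25),
]
print(majority_vote(neighbours))              # Worm.Yuner.A (3 of 5 votes)
print(hamming_weighted_vote(neighbours, 32))  # Worm.Allaple.A (61 vs 29)
```

In this toy case the two schemes disagree: the majority label comes from distant neighbours, while the weighted vote favours the family whose codes are nearly identical to the query.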

In conclusion, the designed deep hashing-based models, specifically the Hamming distance-based voting on RNDPN model, demonstrate superior accuracy and robustness in identifying new, unknown, and zero-day malware compared to the state-of-the-art DNNs used for malware classification. The success of deep hashing-based malware models is intuitive and has a significant influence on the advancement of more powerful deep learning models for malware classification and detection. With careful design and optimization of the voting methods, an even more accurate and resilient deep learning-powered malware classification model can be devised.

C. Evaluation of Classification Speed

Through detailed experiments, deep hashing shows high efficiency for large-scale image retrieval. To evaluate the efficiency of the two designed models, this study further measures the training speed and retrieval speed under different models. When applying the trained models for malware classification and detection in a specific scenario, the training time and prediction time are critical. All five models are tested under two performance metrics, training time and classification time, as shown in Table VIII.

TABLE VIII: Comparison of the hashing models against the conventional DNNs under the training time and prediction time.

Models                   Training Time (hours)   Classification Time (seconds)
ResNet50                 2.23                    1.91
AlexNet                  1.2                     2.04
VGG16                    3.2                     1.89
RNDPN-Majority voting    3.3                     8.48
RNDPN-Hamming voting     3.3                     8.25

In Table VIII, the two voting models with RNDPN show higher time costs for both training and classification. The reasons are twofold. First, deep hashing is trained by extracting the outputs from the last convolutional layer of a benchmark DNN. Second, the computation of hashing codes usually involves pairwise similarity computation and, thus, is time-consuming.
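The size of this overhead can be read directly from Table VIII. The short sketch below computes how much slower one model is than another; the numbers are copied from the table, while the dictionary and the `slowdown` helper are illustrative names introduced here.

```python
# Values copied from Table VIII: (training time in hours, classification time in seconds).
TABLE_VIII = {
    "ResNet50": (2.23, 1.91),
    "AlexNet": (1.2, 2.04),
    "VGG16": (3.2, 1.89),
    "RNDPN-Majority voting": (3.3, 8.48),
    "RNDPN-Hamming voting": (3.3, 8.25),
}

def slowdown(model, baseline, table=TABLE_VIII):
    """Return (training ratio, classification ratio) of model relative to baseline."""
    train_m, cls_m = table[model]
    train_b, cls_b = table[baseline]
    return train_m / train_b, cls_m / cls_b

train_ratio, cls_ratio = slowdown("RNDPN-Hamming voting", "ResNet50")
print(f"training: {train_ratio:.2f}x, classification: {cls_ratio:.2f}x")
# training: 1.48x, classification: 4.32x
```

Relative to its ResNet50 backbone, the Hamming voting model therefore costs roughly 1.5 times the training time and a little over 4 times the classification time.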


However, the classification time of the two voting-based models is composed of two parts: hash generation and voting-based prediction. If we consider only the voting-based prediction under the two voting models with RNDPN, the classification time is greatly reduced, to a level comparable with conventional DNNs. All in all, deep hashing-based malware classification is more effective than the state-of-the-art DNNs, at the cost of a small increase in both model training time and prediction time. This small sacrifice is worthwhile because all deep hashing models win in malware classification under accuracy, precision, and recall.

D. Discussion

Our study extends the application of deep hashing from image retrieval to malware identification, highlighting the effectiveness of deep hashing in extracting intricate features for image retrieval compared to image classification. Through comparative tests with existing DNNs for malware classification, several key insights emerge: (1) Deep hashing exhibits a notable characteristic of producing high-quality hash codes that capture more nuanced similarities among samples than a broader classification into known malware categories (e.g., the nine classes in the Microsoft malware classification challenge³). (2) The supervised training of RNDPN with a set of labeled malware samples provides a foundation that could potentially be extended to unsupervised deep learning with minor adjustments. However, the voting methods designed in this study are not suitable for unsupervised learning. A potential avenue for future work is the development of a clustering algorithm based on RNDPN. (3) The effectiveness of deep hashing in enhancing malware identification relies on two crucial elements: high-quality features extracted by DNNs and an adept hash code generation technique. Optimizing both elements concurrently is essential for achieving optimal results in image retrieval. (4) The application of RNDPN in IoT is feasible by implementing a deep hashing-based IoT malware detection system as an MLaaS (machine learning as a service). Implementing deep hashing models on end devices is also possible by replacing ResNet with compressed DNNs or lightweight DNNs (e.g., ShuffleNet and UNet). (5) Our study does not account for adversarial malware samples. Adversarial perturbations are typically imperceptible, and the role of deep hashing in detecting such perturbations remains an open question. This aspect warrants exploration in future research.

³ https://www.kaggle.com/c/malware-classification/data

V. CONCLUSION

The research conducted in this paper on deep hashing-based malware classification and new malware identification has yielded fruitful results. By combining deep hashing with the Hamming distance-based voting method, the performance of malware detection and identification of new malware has been significantly enhanced. This study contributes to the existing research in multiple significant ways.

Firstly, deep hashing methods for image retrieval prove to be more potent in extracting representative features compared to conventional deep learning models. This highlights the effectiveness of deep hashing in capturing essential information for malware analysis. Secondly, a proper voting method that maximizes the dissimilarity of hashing codes among different families is crucial. This ensures that the resulting hashing codes provide meaningful discrimination between malware families, reinforcing the importance of an appropriate voting strategy. Our understanding of deep learning-powered malware classification is deepened by incorporating hashing codes as new features. This novel approach sheds light on the potential of using hashing techniques to improve malware classification accuracy and interoperability. Furthermore, the Hamming distance-based deep hashing technique successfully identifies new samples, including previously unseen malware instances. This effective identification of new and unknown malware contributes valuable insights into small sample learning within the domain of computer vision, opening avenues for further research in this area. Lastly, deep hashing exhibits promise as a solution for detecting more advanced types of malware, particularly IoT malware and artificial intelligence-based malware. This highlights the potential for utilizing deep hashing techniques to tackle emerging threats in the cybersecurity landscape.

Building on the findings presented in this paper, several promising avenues for future research emerge. Firstly, there is a need for a more extensive evaluation of deep hashing models across a diverse array of new malware families, including samples obtained from emerging applications and alternative datasets. This will help gauge the generalizability and overall effectiveness of these models. Moreover, a thorough exploration of the distribution and patterns of hashing codes originating from samples with the same origins could enhance our understanding and potentially lead to improvements in classification accuracy. Investigating the distinctions between adversarial malware generated through popular adversarial attacks and malware variants also presents a compelling area for further research. Finally, extending the application of deep hashing to other representations of malware features beyond grayscale images, such as API call sequences, byte sequences, and OpCode n-grams, opens up valuable directions for exploration and experimentation.


[6] A. Bozkir, E. Tahillioglu, M. Aydos, and I. Kara, “Catch them alive: [30] L. Jin, X. Shu, K. Li, Z. Li, G. J. Qi, and J. Tang, “Deep ordinal hashing
A malware detection approach through memory forensics, manifold with spatial attention,” IEEE Trans. Image Process., vol. 28, no. 5, pp.
learning and computer vision,” Comput. Secur., vol. 103, p. 102166, 2173–2186, May 2019.
Jan. 2021. [31] F. Shen, X. Gao, L. Liu, Y. Yang, and H. Shen, “Deep asymmetric
[7] S. Kumar, B. Janet, and S. Neelakantan, “Identification of malware pairwise hashing,” in Proceedings of the 25th ACM International Con-
families using stacking of textural features and machine learning,” ference on Multimedia (MM), Mountain View, CA, USA, Oct. 2017, pp.
Expert Syst. Appl., vol. 208, p. 118073, Dec. 2022. 1522–1530.
[8] D. Vasan, M. Alazab, S. Wassan, B. Safaei, and Q. Zheng, “Image-based [32] H. Zhai, S. Lai, H. Jin, X. Qian, and T. Mei, “Deep transfer hashing
malware classification using ensemble of CNN architectures (IMCEC),” for image retrieval,” IEEE Trans. Circuits Syst. Video Technol., vol. 31,
Comput. Secur., vol. 92, p. 101748, May 2020. no. 2, pp. 742–753, Feb. 2021.
[9] Z. Cui, F. Xue, X. Cai, Y. Cao, G. Wang, and J. Chen, “Detection of [33] X. Zheng, Y. Zhang, and X. Lu, “Deep balanced discrete hashing for
malicious code variants based on deep learning,” IEEE Trans. Industr. image retrieval,” Neurocomputing, vol. 403, no. 3, pp. 224–236, April
Inform., vol. 14, no. 7, pp. 3187–3196, Jul. 2018. 2020.
[10] F. Mercaldo and A. Santone, “Deep learning for image-based mobile [34] L. Yuan, T. Wang, X. Zhang, F. E. Tay, Z. Jie, W. Liu, and J. Feng,
malware detection,” J. Comput. Virol. Hacking Tech., vol. 16, no. 2, pp. “Central similarity quantization for efficient image and video retrieval,”
157–171, Jun. 2020. in Proceedings of IEEE/CVF Conference on Computer Vision and
[11] R. Chaganti, V. Ravi, and T. Pham, “Deep learning based cross archi- Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 3080–3089.
tecture internet of things malware detection and classification,” Comput. [35] G. Wu, M. Zhao, L. Wang, Q. Dai, T. Chai, and Y. Liu, “Light field
Secur., vol. 120, p. 102779, Sep. 2022. reconstruction using deep convolutional network on EPI,” in Proceedings
[12] Y. Zhang, F. Feng, Z. Liao, Z. Li, and S. Yao, “Universal backdoor of IEEE Conference on Computer Vision and Pattern Recognition
attack on deep neural networks for malware classification,” Appl. Soft (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 1638–1646.
Comput., vol. 143, p. 110389, May 2023. [36] K. Lin, J. Lu, C. S. Chen, J. Zhou, and M. Sun, “Unsupervised deep
[13] M. Conti, P. Vinod, and A. Vitella, “Obfuscation detection in android learning of compact binary descriptors,” IEEE Trans. Pattern Anal.
applications using deep learning,” J. Inf. Secur. Appl., vol. 70, p. 103311, Mach. Intell., vol. 41, no. 6, pp. 1501–1514, June 2019.
Nov. 2022. [37] S. Jin, H. Yao, X. Sun, and S. Zhou, “Unsupervised semantic deep
[14] Z. Wang, Q. Liu, Z. Wang, and Y. Chi, “Deep learning-based multi- hashing,” Neurocomputing, vol. 351, pp. 19–25, Jul. 2019.
classification for malware detection in IoT,” J. Circuitus Syst. Comput., [38] H. Cui, L. Zhu, J. Li, Y. Yang, and L. Nie, “Scalable deep hashing for
vol. 31, no. 17, 2250297, Jul. 2022. large-scale social image retrieval,” IEEE Trans. Image Process., vol. 29,
[15] Z. Boulkenafet, J. Komulainen, and A. Hadid, “Face spoofing detection pp. 1271–1284, 2020.
using colour texture analysis,” IEEE Trans. Inf. Forensics Secur., vol. 11, [39] K. G. Dizaji, F. Zheng, N. S. Nourabadi, Y. Yang, C. Deng, and
no. 8, pp. 1818–1830, Aug. 2016. H. Huang, “Unsupervised deep generative adversarial hashing network,”
[16] H. Venkateswara, J. Eusebio, S. Chakraborty, and S. Panchanathan, in Proceedings of IEEE/CVF Conference on Computer Vision and
“Deep hashing network for unsupervised domain adaptation,” in Pro- Pattern Recognition (CVPR), Salt Lake City, UT, USA, 2018, pp. 3664–
ceedings of IEEE Conference on Computer Vision and Pattern Recog- 3673.
nition (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 5385–5394. [40] C. Deng, E. Yang, T. Liu, J. Li, W. Liu, and D. Tao, “Unsupervised
[17] Q. Cui, Z. Chen, and O. Yoshie, “Delving into the representation learning semantic-preserving adversarial hashing for image search,” IEEE Trans.
of deep hashing,” Neurocomputing, vol. 494, pp. 67–78, 2022. Image Process., vol. 28, no. 8, pp. 4032–4044, Aug. 2019.
[18] H. Liu, R. Wang, S. Shan, and X. Chen, “Deep supervised hashing for [41] B. Zhang and J. Qian, “Autoencoder-based unsupervised clustering and
fast image retrieval,” Int. J. Comput. Vis., vol. 127, no. 9, pp. 1217–1234, hashing,” Appl. Intell., vol. 51, pp. 493–505, Jan. 2021.
Sep. 2019. [42] Y. Wang, J. Song, K. Zhou, and Y. Liu, “Unsupervised deep hashing
[19] A. Alotaibi, “Biserial miyaguchi-preneel blockchain-based ruzicka- with node representation for image retrieval,” Patt. Recog., vol. 112, p.
indexed deep perceptive learning for malware detection in IoMT,” 107785, Apr. 2021.
Sensors, vol. 21, no. 21, p. 7119, Oct. 2021. [43] J. Qiu, J. Zhang, W. Luo, L. Pan, S. Nepal, and Y. Xiang, “A survey
[20] F. Zhao, Y. Huang, L. Wang, and T. Tan, “Deep semantic ranking based of android malware detection with deep neural models,” ACM Comput.
hashing for multi-label image retrieval,” in Proceedings of IEEE Con- Surv., vol. 53, no. 6, pp. 1–36, Nov. 2021.
ference on Computer Vision and Pattern Recognition (CVPR), Boston, [44] E. Rodrı́guez, B. Otero, N. Gutiérrez, and R. Canal, “A survey of
MA, USA, 2015, pp. 1556–1564. deep learning techniques for cybersecurity in mobile networks,” IEEE
[21] J. Wang, S. Kumar, and S.-F. Chang, “Semi-supervised hashing for large- Commun. Surv. Tutorials, vol. 23, no. 3, pp. 1920–1955, Jun. 2021.
scale search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12, [45] L. Liu and B. Wang, “Malware classification using gray-scale images
pp. 2393–2406, Dec. 2012. and ensemble learning,” in Proceedings of the 3rd International Confer-
[22] Q.-Y. Jiang and W.-J. Li, “Asymmetric deep supervised hashing,” in ence on Systems and Informatics (ICSAI), Shanghai, China, Nov. 2016,
Proceedings of the AAAI Conference on Artificial Intelligence, New pp. 1018–1022.
Orleans, LA, USA, 2017, pp. 3342–3349. [46] F. Darus, N. Salleh, and A. M. Ariffin, “Android malware detection
[23] Z. Cao, M. Long, J. Wang, and P. S. Yu, “HashNet: Deep learning to using machine learning on image patterns,” in Proceedings of Cyber
hash by continuation,” in Proceedings of IEEE International Conference Resilience Conference (CRC), Putrajaya, Malaysia, Mar. 2018, pp. 1–4.
on Computer Vision (ICCV), Venice, Italy, 2017, pp. 5609–5618. [47] D. Gibert, C. Mateu, J. Planes, and et al., “Using convolutional neural
[24] H.-F. Yang, K. Lin, and C.-S. Chen, “Supervised learning of semantics- networks for classification of malware represented as images,” J. Com-
preserving hash via deep convolutional neural networks,” IEEE Trans. put. Virol. Hacking Tech., vol. 15, pp. 15–28, Mar. 2019.
Pattern Anal. Mach. Intell., vol. 40, no. 2, pp. 437–451, Feb. 2018. [48] S. Ni, Q. Qian, and R. Zhang, “Malware identification using visualiza-
[25] R. Zhang, L. Lin, R. Zhang, W. Zuo, and L. Zhang, “Bit-scalable tion images and deep learning,” Comput. Secur., vol. 77, pp. 871–885,
deep hashing with regularized similarity learning for image retrieval and Aug. 2018.
person re-identification,” IEEE Trans. Image Process., vol. 24, no. 12, [49] S. Wang, Q. Yan, Z. Chen, B. Yang, C. Zhao, and M. Conti, “Detecting
pp. 4766–4779, Dec. 2015. android malware leveraging text semantics of network flows,” IEEE
[26] X. Lu, Y. Chen, and X. Li, “Hierarchical recurrent neural hashing for Trans. Inf. Forensics Secur., vol. 13, no. 5, pp. 1096–1109, May 2018.
image retrieval with hierarchical convolutional features,” IEEE Trans. [50] E. Raff, W. Fleshman, R. Zak, H. S. Anderson, B. Filar, and M. Mclean,
Image Process., vol. 27, no. 1, pp. 106–120, Jan. 2018. “Classifying sequences of extreme length with constant memory applied
[27] H. Lu, M. Zhang, X. Xu, Y. Li, and H. T. Shen, “Deep fuzzy hashing to malware detection,” in Proceedings of the AAAI Conference on
network for efficient image retrieval,” IEEE Trans. Fuzzy Syst., vol. 29, Artificial Intelligence, Virtually, 2021, pp. 3547–3554.
no. 1, pp. 166–176, Jan. 2021. [51] M. Krčál, O. Švec, M. Bálek, and O. Jašek, “Deep convolutional
[28] Y. Cao, M. Long, B. Liu, and J. Wang, “Deep cauchy hashing for malware classifiers can learn from raw executables and labels only,”
hamming space retrieval,” in Proceedings of IEEE/CVF Conference on in Proceedings of the 6th International Conference on Learning Repre-
Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, sentations (ICLR), Vancouver, BC, Canada, 2018, pp. 1–4.
USA, 2018, pp. 1229–1237. [52] A. Sotgiu, A. Demontis, M. Melis, and et al., “Deep neural rejection
[29] Y. Li, W. Pei, Y. Zha, and J. Gemert, “Push for quantization: Deep fisher against adversarial examples,” EURASIP J. Info. Security, vol. 2020(1),
hashing,” arXiv, Aug. 2019, arXiv pre-print:1909.00206. pp. 1–10, Apr. 2020.


Yunchun Zhang (Member, IEEE) received the B.S. degree in Computer Science and Technology from Jilin University, Changchun, China, in 2004. He received the M.S. degree and the Ph.D. degree in Computer System Architecture from Jilin University (JLU), Changchun, China, in 2007 and 2011, respectively. Since 2011, he has been with the School of Software, Yunnan University, Kunming, China, where he is currently a senior Lecturer. His current research interests include artificial intelligence, secure machine learning and network security.

Zikun Liao received the B.S. degree in Information Security from Yunnan University (YNU), Kunming, China, in 2019. She is currently a candidate for a Master's degree in Software Engineering from the School of Software, Yunnan University. Her current research interests include artificial intelligence, secure machine learning and network security.

Ning Zhang received the B.E. degree from Zhongyuan University of Technology, Zhengzhou, China, in 2021. He is currently pursuing the M.S. degree in Software Engineering from the School of Software, Yunnan University, Kunming, China. His current research interests include artificial intelligence, secure machine learning and network security.

Shaohui Min received the B.E. degree from Yunnan University, Kunming, China, in 2023. He is currently pursuing the M.S. degree in Cyberspace Security from the School of Software, Yunnan University, Kunming, China. His current research interests include artificial intelligence, secure machine learning, reverse engineering and software security.

Qi Wang received the B.E. degree from Kunming University of Science and Technology (KUST), Kunming, China, in 2021. She is currently pursuing the M.S. degree in Cyberspace Security from the School of Software, Yunnan University, Kunming, China. Her current research interests include artificial intelligence, secure machine learning and natural language processing.

Tony Q.S. Quek (S'98-M'08-SM'12-F'18) received the B.E. and M.E. degrees in electrical and electronics engineering from the Tokyo Institute of Technology in 1998 and 2000, respectively, and the Ph.D. degree in electrical engineering and computer science from the Massachusetts Institute of Technology in 2008. Currently, he is the Cheng Tsang Man Chair Professor with Singapore University of Technology and Design (SUTD) and ST Engineering Distinguished Professor. He also serves as the Director of the Future Communications R&D Programme, the Head of ISTD Pillar, and the Deputy Director of the SUTD-ZJU IDEA. His current research topics include wireless communications and networking, network intelligence, non-terrestrial networks, open radio access network, and 6G.
Dr. Quek has been actively involved in organizing and chairing sessions, and has served as a member of the Technical Program Committee as well as symposium chairs in a number of international conferences. He is currently serving as an Area Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS.
Dr. Quek was honored with the 2008 Philip Yeo Prize for Outstanding Achievement in Research, the 2012 IEEE William R. Bennett Prize, the 2015 SUTD Outstanding Education Awards – Excellence in Research, the 2016 IEEE Signal Processing Society Young Author Best Paper Award, the 2017 CTTC Early Achievement Award, the 2017 IEEE ComSoc AP Outstanding Paper Award, the 2020 IEEE Communications Society Young Author Best Paper Award, the 2020 IEEE Stephen O. Rice Prize, the 2020 Nokia Visiting Professor, and the 2022 IEEE Signal Processing Society Best Paper Award. He is a Fellow of IEEE and a Fellow of the Academy of Engineering Singapore.

Mingxiong Zhao (Member, IEEE) received the B.S. degree in Electrical Engineering and the Ph.D. degree in Information and Communication Engineering from South China University of Technology (SCUT), Guangzhou, China, in 2011 and 2016, respectively. He was a visiting Ph.D. student at University of Minnesota (UMN), Twin Cities, MN, USA, from 2012 to 2013 and Singapore University of Technology and Design (SUTD), Singapore, from 2015 to 2016, respectively. Since 2016, he has been with the School of Software, Yunnan University, Kunming, China, where he is currently an Associate Professor and serves as the Director of Cybersecurity Department. His current research interests include network security, mobile edge computing, and edge AI techniques.
