
MIRAD: A Method for Interpretable Ransomware Attack Detection
Bartosz Marcinkowski1, Maja Goschorska2*, Natalia Wileńska2,4, Jakub Siuta1, Tomasz Kajdanowicz2,3
1 MIM Solutions, Bitwy Warszawskiej 1920 r. 7B, Warsaw, 02-034, Poland.
2 Sagenso, Pl. Jana Kilińskiego 2, Rzeszów, 35-005, Poland.
3 Wroclaw University of Science and Technology, Wyb. Wyspianskiego 27, Wroclaw, 50-370, Poland.
4 Lodz University of Technology, Zeromskiego 116, Lodz, 90-924, Poland.

*Corresponding author(s). E-mail(s): mgoschorska@sagenso.com;


Contributing authors: b.marcinkowski@leomail.pl;
nwilenska@sagenso.com; jakub.siuta@mim.ai;
tomasz.kajdanowicz@pwr.edu.pl;

Abstract
In the face of escalating crypto-ransomware attacks, which encrypt user data
for ransom, our study introduces a significant advancement in dynamic ran-
somware detection. We develop an innovative machine learning model capable of
identifying ransomware activity. This model is uniquely trained in a simulated
user environment, enhancing detection accuracy under realistic conditions and
addressing the imbalances typical of ransomware datasets.
A notable aspect of our approach is the emphasis on interpretability. We employ
a simplified version of Generalized Additive Models (GAMs), ensuring clarity in
how individual features influence predictions. This is crucial for minimizing false
positives, a common challenge in dynamic detection methods. Our contributions
to the field include a Python library for easy application of our detection method,
and a comprehensive, publicly available ransomware detection dataset. These
resources aim to facilitate broader research and implementation in ransomware
defense.

Keywords: Additive model, Binary classification, Ransomware detection, Explainable artificial intelligence

1 Introduction
Ransomware, the primary cyber threat in recent years, affects all sectors, from healthcare,
energy, and defense, through manufacturing and services, to finance. A recent
report by the European Union Agency for Cybersecurity (ENISA) noted a further
surge in ransomware attacks in 2023, which now account for 31% of all cyber threats
[1].
The report defines ransomware as “a type of attack where threat actors take control
of a target’s assets and demand a ransom in exchange for the return of the assets’
availability.” Ransomware can take over assets in several ways: by locking, deleting,
stealing, or encrypting them. Accordingly, lockware is a type of ransomware that locks its victims
out of the system, while scareware displays messages meant to scare the user. Both are
intimidation techniques designed to extort ransom without directly affecting user files.
A newer type, leakware, likewise does not alter user files, except in hybrid variants.
Instead, it steals user data and threatens to leak it if a ransom is not paid [2, 3].
The final, most common, and arguably most harmful type of ransomware is crypto-
ransomware [2, 4]. As the name implies, it encrypts user data, such as work-related
and private documents. Unlike with other types of ransomware, the user data remains
inaccessible until a ransom is paid. Hence, while other types of attacks lead to a loss of trust
in the affected company and harm its customers by leaking their private data, only
crypto-ransomware causes major disruptions in day-to-day operations, causing delays
even in critical services. From January 2016 to December 2021, among 374 investigated
ransomware attacks on the US healthcare system, 10.2% caused cancellations
of scheduled care and, alarmingly, 4.3% resulted in ambulance diversion [5]. Considering the
prevalence and danger of crypto-ransomware attacks, we chose it as the focus of our
study.
The increasing number of ransomware attacks in recent years has led to a cor-
responding rise in research dedicated to ransomware detection. The methods can be
broadly divided into static, which attempt to recognize ransomware without letting it
execute, and dynamic, which focus on monitoring the consequences of executing ran-
somware [2–4]. Both approaches increasingly rely on machine learning, and both work
well as complementary methods [6].
In this study, we focus on dynamic analysis only. Although more resource-intensive,
dynamic methods are better at detecting previously unknown crypto-ransomware, since even
unrelated ransomware families behave similarly at runtime [7, 8].
To detect the infection at the earliest possible stage, we continuously monitor calls
to the Windows API, as well as registry changes. API calls are perhaps the most common
source of features for dynamic analysis (Table 1), since they facilitate ransomware
operations at all stages of the infection [7, 9]. Similarly, monitoring registry keys provides
insight both into ongoing encryption and the early stages of the attack [10].
Other published approaches include monitoring file entropy [10], DLLs [11], and network
traffic [12], as well as monitoring decoys and traps placed in the file system
[4, 13]. While they too gave promising results, we chose not to include them to keep
our detection system as light as possible while maintaining high prediction accuracy.
Monitoring too many processes on a protected machine could slow it down beyond
what is acceptable to a user.

Previous studies avoided this issue by only running the ransomware in the safe
environment of a sandbox [6, 12, 14, 15]. This approach shares both the advantages
and disadvantages of static analysis. On the one hand, it identifies ransomware before
it has the chance to infect a machine. On the other, the varied and constantly changing
methods of ransomware delivery, ranging from social engineering to exploiting vulnerabilities
in network connections or user systems, ensure that a proportion of ransomware
will never be sent for testing [1, 2].
We chose an alternative approach that focuses on detecting ransomware markers
directly on a user’s computer [7]. We argue that, when used as a complement
to static and sandbox tests, it will allow the detection of ransomware that would
otherwise be missed.
When pursuing this approach, time is of the essence—the faster the detection,
the fewer user files will be lost. Optimally, ransomware would be detected before the
malware had a chance to commence encryption, i.e., either during staging, when it
embeds itself in the system and establishes communications with the outside world,
or during scanning, when it searches for user data [2, 4, 6]. However, if early detection
fails, there is value in recognizing the infection even during the encryption stage,
since this can limit the number of lost files. Additionally, a prompt warning sent to other
machines in the organization will curtail the spread of the infection.
Considering the above, we developed a machine-learning model that operates on a
user’s computer and is capable of recognizing markers of all stages of the infection. We
monitored API calls and registry entries both before and during infection, including
the encryption phase. By dividing the collected data into 10-second segments, and then
generating predictions for each segment, we trained our model to recognize patterns
characteristic of all stages of the attack.
Additionally, as our model operates outside the sterile sandbox environment, we
adjusted the training method. When gathering the training data, instead of executing
the ransomware as the only process, as in previous studies [6, 12, 14, 15], we created a complex
virtual environment that simulates typical user actions, e.g., file creation, modification,
copying, and deletion. Hence, we “contaminated” our data with processes that may
co-occur with an infection, training our model to detect ransomware under sub-optimal
but realistic conditions.
Unlike previous dynamic approaches, we also focus on model interpretability. This
allows us to address the primary weakness of dynamic models: a tendency to trigger
false alarms. Even the seemingly low false positive rate of about 0.0156 reported for
homogeneous decision trees [6] translates to over one in a hundred benign software
samples misclassified as ransomware. Considering the variety and turnover of tools
used by a company, and the drastic actions undertaken in case of infection (such as
a forced shutdown), this poses a serious problem. Worse, once a misclassification has
occurred, how can a client be assured that the same mistake will not happen again?
We propose to use a solution inspired by generalized additive models (GAMs), the
gold standard for interpretability [16]. GAMs take the form:

$$g(E(Y)) = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_m(x_m).$$

The contribution of individual features to the final prediction can be visualized by
plotting each univariate term $f_i(x_i)$ against $x_i$. Because $f_i(x_i)$ can be nonlinear, these
plots are useful for revealing complex relationships and detecting artifacts and outliers.
A clever implementation of GAM and GA2M, the latter of which contains bivariate terms
$f_{ij}(x_i, x_j)$ in addition to the univariate terms $f_i(x_i)$, successfully predicted pneumonia risk
and the likelihood of readmission in two separate healthcare studies. Moreover, they
uncovered surprising and worrying patterns in the data [17]. For example, patients
with asthma were unexpectedly classified as having a lower risk of dying due to pneu-
monia. This classification, although true to the data, resulted from a difference in
treatment approach rather than a natural resistance to the sickness. Therefore, any
model suggesting to a doctor that an asthmatic is a low-risk patient would be harmful.
While GAMs are not above learning such false leads, unlike black-box models, they
allow the errors to be quickly spotted and corrected.
Similar to the healthcare sector, the ability to detect and analyze anomalies would
be invaluable in cybersecurity. During the training stage, multiple rounds of data
collection, model training, and data analysis would allow optimizing the virtual envi-
ronment and generating an optimal training dataset. Apart from advantages during
the training stage, the relevant plots could be analyzed whenever a false positive pre-
diction was reported. Depending on the results, either the virtual environment used
to generate training data or the model itself could be adjusted.
GAMs offer an additional advantage. For making predictions, each function $f_i$
acts as a lookup table per feature, returning a term contribution. These terms are
then summed and passed through the inverse of the link function $g$ to obtain the final prediction.
This process is fast and resource-efficient, speeding up ransomware detection without
compromising the performance of a user’s machine.
Unfortunately, the Python library required for the above GA2M implementation,
although available [18], relies on numerous complex or custom dependencies. This, in
turn, poses problems with versioning and code maintenance, making it unsuitable
for use in production. Hence, we propose a simpler, lightweight implementation
that relies only on a few standard libraries while maintaining the main advantages of
GAMs: their interpretability and speed of prediction.

2 Contribution
This paper presents several significant contributions to the field, particularly in
ransomware prediction and detection. Our key contributions are as follows:
1. Interpretable ransomware prediction method. We introduce a novel ran-
somware prediction method that emphasizes interpretability. This method leverages
advanced analytical techniques to provide clear insights into the factors leading
to ransomware attacks, thereby enabling more effective preventive strategies. Our
approach represents a significant advancement over existing methods, which often
lack transparency in their decision-making processes.
2. Innovative Training Methodology. Our research proposes a training method-
ology designed for GAM and ransomware prediction. This methodology enhances
the model’s ability to learn from complex data patterns and improves its predictive
accuracy. It addresses the challenges commonly faced in training models on
imbalanced and evolving ransomware datasets.
3. Python library for easy implementation. To facilitate wider adoption of our
approach, we provide a Python library that encapsulates our prediction method
and training process. This library is designed for ease of use, allowing researchers
and practitioners to implement our method with minimal setup. It is a compre-
hensive tool that significantly reduces the barrier to entry for conducting advanced
ransomware prediction research.
4. Publicly available ransomware detection dataset. Recognizing the scarcity
of quality datasets in this domain, we contribute a new ransomware detection
dataset, which we make publicly available. This dataset is comprehensive and up-
to-date, representing a range of ransomware types. It serves as a valuable resource
for researchers and practitioners looking to test and improve ransomware detection
methods.
5. Lessons on synthetic data generation for ransomware detection. Our work
provides in-depth lessons on improving the synthetic data generation process for
ransomware detection. These lessons are drawn from our extensive experiments and
analyses. By sharing our insights on how to generate more realistic and represen-
tative synthetic data, we aim to advance the field’s understanding and capabilities
in developing robust ransomware detection models.
In summary, our contributions not only advance the state of ransomware prediction
but also provide practical tools and resources for the cybersecurity community. Our
work is a step towards more effective and accessible solutions in combating the ever-
evolving threat of ransomware.

3 Related work
Previous studies on dynamic ransomware detection tested a wide range of machine-
learning algorithms, and an overview of this research is presented in Table 1. The
table includes only studies relevant to protecting Windows computers and is limited
to dynamic approaches that do not rely on specific hardware, such as physical sensors.
The algorithms used for dynamic ransomware detection can be roughly divided into
transparent glass-box algorithms, allowing users to understand their internal work-
ings, and opaque black-box algorithms, which provide predictions without revealing
the underlying processes or decision-making rationale. Examples of glass-box algorithms
include Logistic Regression (LR), Naïve Bayes (NB), k-Nearest Neighbours
(KNN), and Decision Trees (DT). Support Vector Machines (SVM), Random Forests
(RFs), Multilayer Perceptrons (MLPs), Gradient Boosting Decision Trees (GBDT),
and Long Short-Term Memory (LSTM) networks are examples of black-box algorithms.
Among those, only the simplest of the glass-box algorithms, based on a handful of features, provide easily interpretable predictions. As the complexity of a model increases, the contribution of individual features to the final prediction becomes increasingly hard to unravel. A good example is a deep Decision Tree. Although each of its nodes makes decisions based on one feature, the same feature may be employed in multiple branches, intertwined with other features, obscuring its impact on the final prediction. Similarly, the interpretability of KNN results depends on the number and type of features, as well as the distance metric and the value of K. Generally, the more features there are, the harder it is to interpret how each one affects the final prediction, especially if they have different scales or units. This is known as the curse of dimensionality.

Table 1 Machine Learning in Dynamic Ransomware Detection Methods

| Study | Features Used | ML Algorithms Used |
|---|---|---|
| Sgandurra et al., 2016 [7] | API Calls, Registry Key Ops, File/Dir System | LR, NB, SVM |
| Chen et al., 2017 [19] | API Calls | RF, SVM, SL, NB |
| Hasan and Rahman, 2017 [20] | API Calls, Registry Key Ops, File Ops | SVM, NB, RF |
| Vinayakumar et al., 2017 [14] | API Calls | MLP, LR, NB, DT, SVM, RF, KNN |
| Takeuchi et al., 2018 [15] | API Calls | SVM |
| Mehnaz et al., 2018 [13] | IRP | NB, LR, DT, RF |
| Shaukat and Ribeiro, 2018 [21] | API Calls, File Entropy, Static analysis, Traps | LR, SVM, RF, GBDT, NN |
| Al-rimy et al., 2019 [22] | API Calls | Ensemble-based, RF, DT, AdaBoost, SVM, KNN, MLP, LP |
| Kok et al., 2019 [6] | API Calls | Ensemble DT |
| Agrawal et al., 2019 [23] | API Calls | LSTM |
| Kok et al., 2020 [24] | API Calls | RF |
| Ahmed et al., 2020 [25] | API Calls | KNN, LR, SVM, RF, DT |
| Alqahtani et al., 2020 [26] | API Calls, IRP | LSTM |
| Bae et al., 2020 [27] | API Calls | RL, LR, NB, SGD, KNN, SVM |
| Qin et al., 2020 [28] | API Calls | TextCNN |
| Hwang et al., 2020 [29] | API Calls, Registry Key Ops, File System, Dir Ops, Strings, File Extensions | RF |
| Homayoun et al., 2020 [30] | DLL, File System, Dir Activity | DT J48, RF, Bagging, MLP |
| Roy and Chen, 2021 [31] | Semantic Info from Logs | Bi-LSTM |
| Almousa et al., 2021 [32] | API Calls | RF, SVM, KNN |
| Al-rimy et al., 2021 [33] | API Calls | SVM, LR, DT, KNN, RF, AdaBoost |
| Nguyen et al., 2021 [34] | API Calls | GBDT |
| Hirano et al., 2022 [35] | I/O operation, LBA, Entropy | RF, SVM, KNN, CNN |
| Kok et al., 2022 [36] | API and System Calls | RF |
| Molina et al., 2022 [37] | API Calls | NN, LSTM, RF, NB, KNN |
| Singh et al., 2022 [38] | Memory Access Privileges | XGBoost, RF, Boosted RF, DT, Tree Ensemble, GBT, SVM, NB, NN |
| Herrera-Silva and Hernández-Álvarez, 2023 [11] | Registry Key Ops, API Calls, Procmemory, Network, Static analysis, other | NB, RF, GBRT, NN |
This problem is exacerbated as the complexity of a model increases, with black-box
models being considered uninterpretable. For such models, a plethora of methods have been
developed to explain a part or the entirety of the decision-making process. While many
of these methods find use in the broader field of cybersecurity [2], the few that have
been used to improve ransomware detection focus mainly on the Android operating
system and static detection methods [2, 3].
To our knowledge, only one study applied interpretable machine-learning models
for ransomware detection on a Windows computer [39]. It proposes an AI-powered
hybrid approach that detects crypto-ransomware using advanced static and
dynamic analysis. The method captures distinct features at Dynamic Link
Library (DLL), function call, and assembly levels and analyses the behavioral chains
of ransomware samples. The study employs association rule mining and natural lan-
guage processing techniques to generate rules and sentences that explain the rationale
behind ransomware detection. It also provides a prototype visualization tool, AIRaD,
that allows researchers and defenders to inspect the analysis results and the correla-
tion mapping among different levels of features. Unlike our study, it does not focus
on detecting complex quantitative relationships between individual features and the
likelihood of an attack but rather on the importance and relationships of individual
features.
The focus on feature importance over the feature-prediction relationship is common
in the field and is imposed by the limitations of the chosen algorithms. For instance,
to better understand the contribution of individual features to the final predictions
made by LightGBM, a Gradient Boosting Decision Trees black-box model, Nguyen
and colleagues [34] were limited to counting the number of times each feature was
used as a split point in all learned trees.

4 A Method for Interpretable Ransomware Attack Detection
We developed a method for training additive models for classification to provide the
interpretability required in ransomware detection.
In a regression problem $\{x_{i1}, \dots, x_{ik}, y_i\}_{i=1}^{n}$, where $x_{ij} \in \mathbb{R}$ and $y_i \in \mathbb{R}$, an additive
model's predictions have the form

$$\hat{y}_i = \beta_0 + \sum_{j=1}^{k} f_j(x_{ij}) \qquad (1)$$

where $\hat{y}_i \in \mathbb{R}$.
We adapted this formula to classification, where $y_i \in \{0, 1\}$, using the sigmoid
function

$$\hat{y}_i = \mathrm{sigmoid}\Big(\beta_0 + \sum_{j=1}^{k} f_j(x_{ij})\Big) \qquad (2)$$

where $\hat{y}_i \in (0, 1)$.
Similarly to the regression case, such models can be interpreted by plotting the $f_j$
functions.
In our method, each $f_j$ is a step function. The cutoffs (points of discontinuity)
of $f_j$ depend only on $\{x_{ij}\}_{i=1}^{n}$. The values of $f_j$ are optimized by gradient descent,
minimizing binary cross entropy (the classification task) and optionally additional
terms enhancing regularization of the step functions.

4.1 Pre-processing
Our pre-processing method determines cutoffs (points of discontinuity) of each $f_j$ in
the fitting phase (before training the model) and transforms feature values $x_{ij} \in \mathbb{R}$ to
discrete indices of segments between the cutoffs.

4.1.1 Fitting
For every $j = 1, \dots, k$ the fitting procedure determines up to $q - 1$ cutoffs

$$c_{j,1} < \dots < c_{j,s_j}, \qquad s_j \le q - 1,$$

where $q \in \mathbb{N}^{+}$ is a hyperparameter limiting the number of segments constituting
the step functions.
The pre-processing procedure also computes weights $w_{j,1}, \dots, w_{j,s_j+1}$ such that,
empirically,

$$P[c_{j,l} < x_{ij} \le c_{j,l+1}] = w_{j,l+1} \quad \text{for } l = 0, \dots, s_j,$$

assuming $c_{j,0} = -\infty$ and $c_{j,s_j+1} = \infty$.

4.1.2 Transforming
Having the values $c_{j,1}, \dots, c_{j,s_j}$, we define transformation $t_j$ by

$$c_{j,t_j(x_{ij})-1} < x_{ij} \le c_{j,t_j(x_{ij})}, \qquad t_j(x_{ij}) \in \{1, \dots, s_j + 1\},$$

and denote transformed values by $\bar{x}_{ij}$:

$$\bar{x}_{ij} = t_j(x_{ij}).$$
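The transformation can be illustrated with numpy.digitize, which Section 4.4 names as the building block for $t_j$. The following is a minimal sketch rather than the authors' code: the cutoffs and feature values are made up, and `right=True` is chosen so that each segment is closed on the right, as in the definition above.

```python
import numpy as np

# Hypothetical cutoffs c_{j,1} < c_{j,2} < c_{j,3} for one feature (s_j = 3).
cutoffs = np.array([1.0, 2.5, 4.0])

# Hypothetical raw feature values x_{ij}.
x = np.array([0.3, 1.0, 1.7, 4.0, 9.9])

# With right=True, np.digitize returns the 0-based i such that
# bins[i-1] < x <= bins[i]; adding 1 gives the 1-based index t_j(x_ij).
x_bar = np.digitize(x, cutoffs, right=True) + 1

print(x_bar)  # [1 1 2 3 4]
```

Values equal to a cutoff (1.0 and 4.0 here) fall into the segment to the left of that cutoff, and values above the last cutoff map to segment $s_j + 1$.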

4.2 Model training


The model has a trainable free term parameter $\beta_0$ and per-feature trainable parameters
$p_{j,1}, \dots, p_{j,s_j+1}$ defining the step functions' values

$$f_j(x_{ij}) = p_{j,t_j(x_{ij})} = p_{j,\bar{x}_{ij}}.$$

After transforming the data set to $\{\bar{x}_{i1}, \dots, \bar{x}_{ik}, y_i\}_{i=1}^{n}$ we can evaluate our additive
classification model as follows

$$\hat{y}_i = \mathrm{sigmoid}\Big(\beta_0 + \sum_{j=1}^{k} p_{j,\bar{x}_{ij}}\Big) \qquad (3)$$

Algorithm 1 Determine cutoffs $c$ and segment weights $w$ for a step function
Require: $x_{1j}, \dots, x_{nj} \in \mathbb{R}$, $q \in \mathbb{N}^{+}$
  $m \Leftarrow 1/(2q)$
  $s_j \Leftarrow q - 1$
  $c_{j,1}, \dots, c_{j,s_j} \Leftarrow$ empirical $q$-quantiles of $x_{1j}, \dots, x_{nj}$
  while true do
    $w_{j,1}, \dots, w_{j,s_j+1} \Leftarrow$ such that, empirically, $P[c_{j,l} < x_{ij} \le c_{j,l+1}] = w_{j,l+1}$
    if $\min(w_{j,1}, \dots, w_{j,s_j+1}) \ge m$ then
      return $c_{j,1}, \dots, c_{j,s_j}$ and $w_{j,1}, \dots, w_{j,s_j+1}$
    end if
    $d_w \Leftarrow \operatorname{argmin}(w_{j,1}, \dots, w_{j,s_j+1})$
    if $d_w = 1$ then
      $d_c \Leftarrow 1$
    else if $d_w = s_j + 1$ then
      $d_c \Leftarrow s_j$
    else if $w_{j,d_w-1} < w_{j,d_w+1}$ then
      $d_c \Leftarrow d_w - 1$
    else
      $d_c \Leftarrow d_w$
    end if
    $c_{j,1}, \dots, c_{j,s_j-1} \Leftarrow c_{j,1}, \dots, c_{j,d_c-1}, c_{j,d_c+1}, \dots, c_{j,s_j}$ (cutoff $c_{j,d_c}$ is removed)
    $s_j \Leftarrow s_j - 1$
  end while
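Algorithm 1 can be sketched in a few lines of NumPy. This is an illustrative reading of the pseudocode, not the code from the MIRAD repository: the function name `fit_cutoffs` is hypothetical, indices are 0-based, and "empirical $q$-quantiles" is taken to mean the $q - 1$ interior quantiles returned by numpy.quantile with its default interpolation.

```python
import numpy as np


def fit_cutoffs(x, q):
    """Return (cutoffs, weights) for one feature, following Algorithm 1."""
    x = np.asarray(x, dtype=float)
    m = 1.0 / (2 * q)  # minimum allowed segment weight
    # Initial cutoffs: the q - 1 interior empirical q-quantiles.
    cutoffs = np.quantile(x, np.arange(1, q) / q).tolist()
    while True:
        # 0-based segment of a sample = number of cutoffs strictly below it,
        # which realises c_l < x <= c_{l+1}.
        segment = np.searchsorted(cutoffs, x, side="left")
        weights = np.bincount(segment, minlength=len(cutoffs) + 1) / len(x)
        if weights.min() >= m:
            return np.array(cutoffs), weights
        d_w = int(np.argmin(weights))  # lightest segment (0-based)
        s = len(cutoffs)
        if d_w == 0:
            d_c = 0  # first segment: drop its right border
        elif d_w == s:
            d_c = s - 1  # last segment: drop its left border
        elif weights[d_w - 1] < weights[d_w + 1]:
            d_c = d_w - 1  # merge with the lighter left neighbour
        else:
            d_c = d_w  # merge with the right neighbour
        del cutoffs[d_c]


# Hypothetical feature with many repeated zeros, so that several
# initial quantiles coincide and have to be merged.
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(500), rng.exponential(size=500)])
cutoffs, weights = fit_cutoffs(x, q=20)
```

Duplicate quantiles produce zero-weight segments, which the loop removes first, so the returned cutoffs are strictly increasing and every segment holds at least a fraction $1/(2q)$ of the samples.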

Before training, the per-feature trainable parameters $p_{j,l}$ are initialized to zero, and
$\beta_0$ is initialized so that

$$\mathrm{sigmoid}(\beta_0) = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

We use gradient descent for optimization, with a three-component loss:
• mean binary cross entropy between $y_i$ and $\hat{y}_i$ (classification loss),
• step loss: mean squared difference between subsequent $p_{j,l}$ and $p_{j,l+1}$ values (regularization),
• sum loss: mean square of $\sum_{l=1}^{s_j+1} p_{j,l} w_{j,l}$ across features (regularization).
The regularizing components are weighted by hyperparameters.
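To make the initialization and the three loss components concrete, the sketch below evaluates them with NumPy for a toy model. It only computes loss values; training would additionally need gradients, for example from an automatic differentiation library. All names are hypothetical, segment indices are 0-based, and two details are assumptions on our part: the step loss pools squared differences over all features before averaging, and the default weights of 0.01 are taken from the "full regularization" variant.

```python
import numpy as np


def initial_beta0(y):
    """Free term such that sigmoid(beta0) equals the mean label."""
    y_mean = float(np.mean(y))
    return float(np.log(y_mean / (1.0 - y_mean)))


def loss_components(beta0, params, weights, x_bar, y):
    """Return (classification, step, sum) losses for 0-based indices x_bar."""
    k = x_bar.shape[1]
    logits = beta0 + sum(params[j][x_bar[:, j]] for j in range(k))
    y_hat = np.clip(1.0 / (1.0 + np.exp(-logits)), 1e-12, 1.0 - 1e-12)
    bce = -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
    # Step loss: squared differences of neighbouring parameters (pooled).
    diffs = np.concatenate([np.diff(p) for p in params])
    step = float(np.mean(diffs ** 2))
    # Sum loss: squared weighted parameter sum, averaged over features.
    sums = np.array([np.dot(p, w) for p, w in zip(params, weights)])
    return float(bce), step, float(np.mean(sums ** 2))


def total_loss(components, step_weight=0.01, sum_weight=0.01):
    bce, step, sum_loss = components
    return bce + step_weight * step + sum_weight * sum_loss


# Toy data: 4 samples, 2 features with 3 and 2 segments.
y = np.array([0.0, 1.0, 1.0, 1.0])
x_bar = np.array([[0, 1], [2, 0], [1, 1], [2, 1]])
weights = [np.array([0.5, 0.25, 0.25]), np.array([0.5, 0.5])]
params = [np.array([0.0, 1.0, 3.0]), np.array([2.0, 2.0])]

beta0 = initial_beta0(y)
components = loss_components(beta0, params, weights, x_bar, y)
loss = total_loss(components)
```

With all parameters at zero, both regularization terms vanish and the model predicts the mean label for every sample, which is why this starting point is convenient for imbalanced data.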

4.3 Time complexity


Evaluating the model on a single input vector consists of $k$ per-feature evaluations of
$t_j$ and constant-time per-feature operations such as look-up and addition. Each $t_j$ can
be evaluated in $O(\log q)$ time using binary search. Therefore, the time complexity of
evaluation is $O(k \log q)$.
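The evaluation cost can be seen in a small sketch that uses only the Python standard library. The function name and the toy cutoffs and parameters are hypothetical. `bisect_left` performs the binary search over at most $q - 1$ cutoffs, and the remaining work per feature is one list lookup and one addition.

```python
from bisect import bisect_left
from math import exp


def predict_one(x, cutoffs, params, beta0):
    """Probability for one input vector x of k raw feature values."""
    logit = beta0
    for x_j, c_j, p_j in zip(x, cutoffs, params):
        # Number of cutoffs strictly below x_j = 0-based segment index,
        # found in O(log q) by binary search.
        logit += p_j[bisect_left(c_j, x_j)]
    return 1.0 / (1.0 + exp(-logit))


# Toy model with k = 2 features.
cutoffs = [[1.0, 2.0], [0.5]]
params = [[-1.0, 0.0, 2.0], [0.0, 1.0]]

prob = predict_one((2.5, 0.5), cutoffs, params, beta0=0.0)
```

Here the first feature falls into its last segment (contribution 2.0) and the second into its first segment (contribution 0.0), so the result is sigmoid(2.0).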

4.4 Implementation
Key components of our method are provided by popular Python libraries, e.g.,
• initial $c_{j,l}$ values computation by numpy.quantile,
• $t_j$ pre-processing transformations by numpy.digitize,
• parameter lookup ($f_j(x_{ij}) = p_{j,t_j(x_{ij})}$) by torch.nn.Embedding,
• gradient descent by torch.optim.SGD.
Our implementation in Python, based on numpy and torch, is publicly available
as open source, under the permissive MIT license at https://github.com/Sagenso/MIRAD.
It consists of three components:
• Digitizer: pre-processing with numpy.digitize,
• EmbeddingSumModule: additive model based on torch.nn.Embedding,
• EmbeddingSumClassifier: classifier with the standard fit/transform interface,
using the previous two components.
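The following sketch shows one way torch.nn.Embedding and torch.optim.SGD can be combined into an additive step-function model. It is an illustration under our own assumptions and does not reproduce the classes in the MIRAD repository: the class name `AdditiveStepModel` is hypothetical, all features share one embedding table addressed through per-feature offsets, and only the classification loss is used.

```python
import torch
from torch import nn


class AdditiveStepModel(nn.Module):  # hypothetical name, not the MIRAD class
    def __init__(self, segments_per_feature, beta0=0.0):
        super().__init__()
        sizes = torch.tensor(segments_per_feature, dtype=torch.long)
        # Feature j owns rows offsets[j] .. offsets[j] + sizes[j] - 1.
        self.register_buffer("offsets", torch.cumsum(sizes, dim=0) - sizes)
        self.table = nn.Embedding(int(sizes.sum()), 1)
        nn.init.zeros_(self.table.weight)  # all p_{j,l} start at zero
        self.beta0 = nn.Parameter(torch.tensor(float(beta0)))

    def forward(self, x_bar):
        # x_bar: (n, k) LongTensor of 0-based segment indices.
        contributions = self.table(x_bar + self.offsets).squeeze(-1)
        return self.beta0 + contributions.sum(dim=1)  # logits


# Toy data: 4 samples, 2 features with 3 and 2 segments.
model = AdditiveStepModel([3, 2])
x_bar = torch.tensor([[0, 1], [2, 0], [1, 1], [2, 1]])
y = torch.tensor([0.0, 1.0, 1.0, 1.0])

optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid followed by binary cross entropy
first_loss = loss_fn(model(x_bar), y).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x_bar), y)
    loss.backward()
    optimizer.step()
last_loss = loss.item()
```

Each row of the embedding table holds one parameter $p_{j,l}$, so the forward pass is exactly the sum of lookups in equation (3) before the sigmoid is applied.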

4.5 Interpretation
We developed a visualization method that combines all key components (cutoffs c,
weights w and parameters p) on a single plot. The model can be interpreted by
analyzing such plots for all features.

Fig. 1 Model visualization - cutoffs, weights and parameters of a selected feature

For the $j$-th feature, the parameters are visualized as bars of height $p_{j,l}$ and width
$w_{j,l}$. The left y-axis is used for $p_{j,l}$. Each border between two consecutive bars $p_{j,l}$ and
$p_{j,l+1}$ corresponds to a cutoff. The cutoff value $c_{j,l}$ is marked on the border with a
dot. The right y-axis is used for $c_{j,l}$.
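A plot of this kind could be produced along the following lines. This is a sketch based only on the description above, not the plotting code from the repository: the function names are hypothetical, and matplotlib is imported inside `plot_feature` so that the geometry helper can be used without it.

```python
from itertools import accumulate


def bar_geometry(weights):
    """Left edges of the bars and x-positions of the borders between them."""
    right_edges = list(accumulate(weights))
    left_edges = [0.0] + right_edges[:-1]
    return left_edges, right_edges[:-1]


def plot_feature(cutoffs, weights, params):
    """Bars of height p and width w; cutoffs as dots on a second y-axis."""
    import matplotlib.pyplot as plt  # only needed for drawing

    left_edges, borders = bar_geometry(weights)
    fig, ax_p = plt.subplots()
    ax_p.bar(left_edges, params, width=weights, align="edge", edgecolor="black")
    ax_p.set_xlabel("cumulative segment weight")
    ax_p.set_ylabel("parameter p")  # left y-axis
    ax_c = ax_p.twinx()  # right y-axis for cutoff values
    ax_c.plot(borders, cutoffs, "o", color="tab:red")
    ax_c.set_ylabel("cutoff c")
    return fig


# Hypothetical feature with three segments and two cutoffs.
left_edges, borders = bar_geometry([0.5, 0.25, 0.25])
```

Because the bar widths are the segment weights, the x-axis spans the empirical distribution of the feature from 0 to 1, and wide bars correspond to value ranges that are common in the training data.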

5 Experimental setup
5.1 Data
5.1.1 Synthetic data preparation procedure
We created a system that models the use of a computer by an office worker. This
system included the following components:
• Virtual machines with Windows 10 and Windows 11 operating systems
• Benign software.
• Python scripts that populated each virtual machine with unique user folders and
generated files with typical extensions found in office settings (PDF, DOCX, XLSX,
PPTX).
• AutoIt scripts that simulated user’s behaviour, including actions such as creating,
renaming, modifying, deleting, and executing files.
• A monitoring component, based on EaseFilter and customized to monitor operations
on files and registry key changes. The collected data was continuously sent to a
server via FTP.
• A single ransomware sample, and a script that unzipped and executed it at a defined
point in time.
We obtained ransomware EXE files from MalwareBazaar (https://bazaar.abuse.ch).
Crypto-ransomware was then selected by the following method. A honeypot folder
with 1000 files in the formats usually encountered in user documents (PDF, DOCX,
XLSX, PPTX, GIF, JPG, MP3, MP4) was placed on the desktop. Hashes were
computed for each file before executing the ransomware, and again five minutes after
executing it. The 78 ransomware samples that encrypted more than 90% of the honeypot files on
both Windows 10 and Windows 11 were then used for further analysis.
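The selection step can be sketched with the Python standard library. The source does not specify the hash function or how renamed files were treated, so both are assumptions here: SHA-256 is used, and a file counts as encrypted if its original path is missing or its digest has changed. The function names and the demonstration files are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(folder):
    """Map each file's relative path to the SHA-256 digest of its content."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def encrypted_fraction(before, after):
    """Fraction of original files that changed or disappeared."""
    changed = sum(1 for name, digest in before.items() if after.get(name) != digest)
    return changed / len(before)


# Demonstration on a temporary folder standing in for the honeypot.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(20):
        Path(tmp, f"doc{i}.txt").write_text(f"content {i}")
    before = snapshot(tmp)
    for i in range(19):  # overwrite 19 files to mimic encryption
        Path(tmp, f"doc{i}.txt").write_bytes(b"\x00" * 16)
    after = snapshot(tmp)

fraction = encrypted_fraction(before, after)
selected = fraction > 0.9  # threshold from the selection criterion
```

In the demonstration 19 of 20 files change, so the sample would pass the 90% threshold.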

5.1.2 Feature extraction


The monitoring component collected raw data before, during, and at least 10 minutes
after each attack. Raw data in the form of event logs was then transformed
into signals by filtering and aggregation into time series with a constant sampling
period of 10 seconds. The filtering involved selecting a set of events for aggregation,
e.g., selecting only file creation out of all operations on files. Aggregation involved
counting events that matched the filter in the time buckets determined by sampling.
The signals were then converted into features using a moving average. Thus, with
a 30-sample moving average, each feature represented the average value of the signal
over the last 5 minutes (30 samples of 10 seconds each). Finally, all features were
transformed by applying the logarithm.
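The pipeline from event timestamps to a feature can be sketched with NumPy as follows. Two details are not stated in the text and are assumptions here: the moving average treats the time before the first sample as containing no events, and the logarithm is applied as log(1 + x) so that empty windows remain finite. The function name is hypothetical, and the input is assumed to be already filtered to one event kind.

```python
import numpy as np


def events_to_feature(timestamps, duration, bucket=10.0, window=30):
    """Turn event times (seconds) into a log moving-average feature."""
    n_buckets = int(np.ceil(duration / bucket))
    # Aggregation: count events per 10-second bucket.
    idx = (np.asarray(timestamps, dtype=float) // bucket).astype(int)
    signal = np.bincount(idx, minlength=n_buckets)[:n_buckets].astype(float)
    # Trailing moving average over the last `window` samples;
    # samples before the start of the recording count as zero (assumption).
    padded = np.concatenate([np.zeros(window - 1), signal])
    averaged = np.convolve(padded, np.ones(window) / window, mode="valid")
    # Logarithmic transform; log1p is an assumption to handle zeros.
    return np.log1p(averaged)


# Hypothetical timestamps of file-creation events in a 30-second recording,
# with a 2-sample window to keep the numbers easy to follow.
feature = events_to_feature([1, 2, 15, 15.5, 29], duration=30, window=2)
```

In this toy case the bucket counts are 2, 2, 1, the moving averages are 1, 2, 1.5, and the feature holds the log(1 + x) of those values.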

5.2 Experimental procedure
We compare multiple classifiers (the proposed model, standard interpretable models)
on a data set based on simulated ransomware attacks. The data set and experiment
procedure are publicly available, as parts of the open-source MIRAD repository.

5.2.1 Data set


The data set is based on multiple simulation sessions, each including simulated user
activity and the execution of a real ransomware sample on a virtual machine. The train/test split
was performed on sessions, based on the ransomware publication date: sessions with older
ransomware are in the train set.
Each session contributes multiple data points to the data set, taken at different time
points. For a given session and time point, the label indicates whether the time
point is before or after the start of ransomware execution. Each of the 13 features is based on
the number of system events of a specific kind in a preceding fixed-length time window.
The train and test sets have 39440 and 27987 data points, respectively.
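The split and labelling rules can be expressed compactly. The sketch below uses made-up session records and a made-up cutoff date, since the actual dates are not given in the text. Whether a time point exactly at the start of execution is labelled positive is also an assumption.

```python
from datetime import date


def split_sessions(sessions, cutoff):
    """Sessions with ransomware published before `cutoff` go to the train set."""
    train = [s for s in sessions if s["published"] < cutoff]
    test = [s for s in sessions if s["published"] >= cutoff]
    return train, test


def label_time_points(attack_start, time_points):
    """Label 1 for time points at or after the start of ransomware execution."""
    return [int(t >= attack_start) for t in time_points]


# Hypothetical sessions: sample publication date and attack start (seconds).
sessions = [
    {"id": "a", "published": date(2021, 5, 1), "attack_start": 600},
    {"id": "b", "published": date(2022, 2, 1), "attack_start": 900},
    {"id": "c", "published": date(2023, 1, 15), "attack_start": 300},
]

train, test = split_sessions(sessions, cutoff=date(2022, 6, 1))
labels = label_time_points(sessions[0]["attack_start"], [590, 600, 610])
```

Splitting whole sessions by publication date keeps all data points of one ransomware sample on the same side of the split, so the test set measures performance on samples newer than anything seen in training.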

5.2.2 Classifiers
We compare multiple variants of the proposed model MIRAD to standard interpretable
models.
The MIRAD variants only differ in regularization hyperparameters, as listed in
Table 2.

Table 2 MIRAD variants

| Variant | Step loss weight | Sum loss weight |
|---|---|---|
| no regularization | 0 | 0 |
| sum regularization | 0 | 0.01 |
| step regularization | 0.01 | 0 |
| full regularization | 0.01 | 0.01 |

The remaining hyperparameters are set as follows


• maximum number of segments of the step functions: 20
• gradient descent’s learning rate: 10
• gradient descent’s number of epochs: 1000
The standard interpretable models we compare against are
• Gaussian Naive Bayes
• Decision Tree
• Logistic Regression
Decision Tree’s depth is limited to 4 to maintain comparable interpretability. For
Logistic Regression, the features are normalized.

5.2.3 Metrics
We measure time of both training (on the train set) and evaluation (on the test set).
Prediction quality is quantified by AP (Average Precision) and ROC AUC (Area
Under the Receiver Operating Characteristic Curve).
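For reference, both metrics can be computed from labels and scores with a few lines of plain Python. This sketch follows the standard definitions (ROC AUC as the probability that a positive outranks a negative, with ties counted as one half, and AP as the sum of precision weighted by recall increments over distinct score thresholds). It is not the evaluation code used in the experiments, and the pairwise AUC loop is quadratic, so it is meant for illustration only.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Sum over thresholds of (recall increment) * precision."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    total_pos = sum(y_true)
    ap, tp, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(ranked):
        j = i
        # Process all samples tied at the same score as one threshold.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        recall = tp / total_pos
        ap += (recall - prev_recall) * (tp / j)
        prev_recall = recall
        i = j
    return ap


# Small hypothetical example.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
```

AP is sensitive to class imbalance because precision depends on the number of negatives ranked above positives, which makes it a useful complement to ROC AUC for attack detection.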

6 Results
6.1 Metrics
Experiment results are summarized in Table 3. All MIRAD variants stand out as
the slowest in terms of training time, although the values remain acceptable for our
application. More importantly, MIRAD’s evaluation time is acceptable, within the
range of standard models.
All MIRAD variants compare favorably in terms of prediction quality. A comparison
between the variants confirms the positive effect of regularization.

Table 3 Metrics

| Model | Training time | Evaluation time | AP | ROC AUC |
|---|---|---|---|---|
| Gaussian Naive Bayes | 0.014 s | 0.0072 s | 87.370% | 98.010% |
| Decision Tree | 0.120 s | 0.0027 s | 99.252% | 99.913% |
| Logistic Regression | 0.653 s | 0.0120 s | 99.411% | 99.670% |
| MIRAD: no regularization | 30.292 s | 0.0062 s | 99.632% | 99.937% |
| MIRAD: sum regularization | 29.156 s | 0.0065 s | 99.633% | 99.938% |
| MIRAD: step regularization | 28.954 s | 0.0063 s | 99.698% | 99.952% |
| MIRAD: full regularization | 29.007 s | 0.0070 s | 99.699% | 99.953% |

6.2 Model interpretation


We use models obtained in the experiment to compare MIRAD to logistic regression
in terms of interpretability. For each feature, Table 4 compares logistic regression’s
coefficient with linear correlation between the feature and the target.
Feature 4, which corresponds to File Event Count: NotifyFileWasWritten, is an
interesting case. Domain knowledge suggests that ransomware attacks should cause
very high measurements. Indeed, its target correlation of 0.58 is one of the highest among
all features. However, logistic regression assigned it a negative coefficient.
For comparison, Figure 2 visualizes values related to the same feature of the
MIRAD model trained with no regularization. The plot confirms that high measure-
ments indicate ransomware attacks. More importantly, the visualization hints at a
possible issue with our data generation method - it suggests that simulated user
activity that influences feature 4 is not diverse enough.
Finally, Figure 3 shows the effects of the proposed regularization. The bar plot forms
a smoother curve, which helps prevent overfitting, as confirmed by the metrics in Table 3.

Table 4 Logistic regression interpretation

| Feature | Target correlation | Logistic regression coefficient |
|---|---|---|
| feature 0 | 0.16 | -0.04 |
| feature 1 | 0.06 | -0.12 |
| feature 2 | 0.61 | 1.25 |
| feature 3 | 0.69 | 1.64 |
| feature 4 | 0.58 | -0.32 |
| feature 5 | 0.63 | 9.94 |
| feature 6 | 0.36 | -0.83 |
| feature 7 | 0.24 | -0.79 |
| feature 8 | 0.37 | -0.25 |
| feature 9 | 0.54 | 2.84 |
| feature 10 | 0.30 | -1.61 |
| feature 11 | 0.13 | -4.78 |
| feature 12 | 0.41 | 0.85 |

Fig. 2 MIRAD: no regularization

Fig. 3 MIRAD: full regularization

7 Conclusions
The study demonstrates the efficacy of the Method for Interpretable Ransomware
Attack Detection (MIRAD) in the dynamic detection of ransomware. In our
experiments, all MIRAD variants achieved higher Average Precision (AP) and ROC AUC
than the other tested models (Gaussian Naive Bayes, Decision Tree, and Logistic
Regression), which are lightweight models of the kind required by the problem setting.
Regularization further improved prediction quality, with full regularization yielding
the best results (AP of 99.699% and ROC AUC of 99.953%).
The interpretability of MIRAD, a key focus of our study, also stands out. Unlike
traditional models such as logistic regression, which sometimes yield counterintuitive
coefficients, MIRAD offers clearer insights into feature relationships, enhancing our
understanding of ransomware patterns. This was particularly evident in the analysis
of feature 4, where MIRAD’s interpretation aligns with domain expectations, in contrast
to logistic regression’s negative coefficient. Moreover, MIRAD’s interpretability
facilitated the identification of potential limitations in our data generation method,
suggesting a need for more diverse simulated user activity. The regularization techniques
used in MIRAD not only improved prediction quality but also smoothed the
feature contribution curve, which helps prevent overfitting.
In conclusion, MIRAD combines high prediction quality, fast evaluation, and
interpretability, and holds promise for practical applications in cybersecurity. The
insights gained from this study also pave the way for future research, particularly in
improving data simulation methods for training cybersecurity models.
Acknowledgments. The publication was partially supported by the National Centre
for Research and Development. The system is the subject of a patent application. This
article was completed while the third author (Natalia Wileńska) was a Doctoral
Candidate in the Interdisciplinary Doctoral School at the Lodz University of
Technology, Poland.

Declarations
Funding
This publication was partially supported by the National Centre for Research and
Development within the grant no. POIR.01.01.01-00-0228/22-01 titled "Development
and validation of an AI-based system for protection against ransomware attacks."

Conflict of Interest
The authors declare that there is no conflict of interest associated with this manuscript.

References
[1] Ardagna, C., Corbiaux, S., Impe, K.V., Ostadal, R.: ENISA Threat Landscape
2023 (2023). https://doi.org/10.2824/782573
https://www.enisa.europa.eu/publications/enisa-threat-landscape-2023

[2] Zhang, Z., Hamadi, H.A., Damiani, E., Yeun, C.Y., Taher, F.: Explainable artifi-
cial intelligence applications in cyber security: State-of-the-art in research. IEEE
Access 10, 93104–93139 (2022) https://doi.org/10.1109/ACCESS.2022.3204051

[3] Razaulla, S., Fachkha, C., Markarian, C., Gawanmeh, A., Mansoor, W., Fung,
B.C.M., Assi, C.: The age of ransomware: A survey on the evolution, taxonomy,
and research directions. IEEE Access (2023)
https://doi.org/10.1109/ACCESS.2023.3268535

[4] Hernández, J.A.G., Teodoro, P.G., Carrión, R.M., Gómez, R.R.: Crypto-
Ransomware: A Revision of the State of the Art, Advances and Challenges.
Multidisciplinary Digital Publishing Institute (MDPI) (2023).
https://doi.org/10.3390/electronics12214494

[5] Neprash, H.T., McGlave, C.C., Cross, D.A., Virnig, B.A., Puskarich, M.A., Huling,
J.D., Rozenshtein, A.Z., Nikpay, S.S.: Trends in ransomware attacks on US
hospitals, clinics, and other health care delivery organizations, 2016-2021. JAMA
Health Forum 3, 224873 (2022)
https://doi.org/10.1001/jamahealthforum.2022.4873

[6] Kok, S.H., Abdullah, A., Jhanjhi, N.Z., Supramaniam, M.: Prevention of crypto-
ransomware using a pre-encryption detection algorithm. Computers 8 (2019)
https://doi.org/10.3390/computers8040079

[7] Sgandurra, D., Muñoz-González, L., Mohsen, R., Lupu, E.C.: Automated
dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv
preprint arXiv:1609.03020 (2016)

[8] Kirda, E.: Unveil: A large-scale, automated approach to detecting ransomware
(keynote), pp. 1–1 (2017)

[9] Bajpai, P., Enbody, R.: An empirical study of api calls in ransomware, pp. 443–448
(2020)

[10] Jethva, B., Traoré, I., Ghaleb, A., Ganame, K., Ahmed, S.: Multilayer ransomware
detection using grouped registry key operations, file entropy and file signature
monitoring. Journal of Computer Security 28, 337–373 (2020)
https://doi.org/10.3233/JCS-191346

[11] Herrera-Silva, J.A., Hernández-Álvarez, M.: Dynamic feature dataset for
ransomware detection using machine learning algorithms. Sensors 23 (2023)
https://doi.org/10.3390/s23031053

[12] Moussaileb, R., Cuppens, N., Lanet, J.-L., Le Bouder, H.: Ransomware
network traffic analysis for pre-encryption alert. Foundations and Practice of
Security: 12th International Symposium, FPS 2019, Toulouse, France, November
5–7, 2019, Revised Selected Papers 12. Springer International Publishing
(2020) https://doi.org/10.1007/978-3-030-45371-8_2

[13] Mehnaz, S., Mudgerikar, A., Bertino, E.: Rwguard: A real-time detection system
against cryptographic ransomware, vol. 11050 LNCS, pp. 114–136. Springer
(2018). https://doi.org/10.1007/978-3-030-00470-5_6

[14] Vinayakumar, R., Soman, K.P., Velan, K.S., Ganorkar, S.: Evaluating shallow and
deep networks for ransomware detection and classification, pp. 259–265 (2017)

[15] Takeuchi, Y., Sakai, K., Fukumoto, S.: Detecting ransomware using support vector
machines. Association for Computing Machinery (2018).
https://doi.org/10.1145/3229710.3229726

[16] Lou, Y., Caruana, R., Gehrke, J., Hooker, G.: Accurate intelligible models with
pairwise interactions, pp. 623–631 (2013)

[17] Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., Elhadad, N.: Intelligible
models for healthcare: Predicting pneumonia risk and hospital 30-day
readmission, vol. 2015-August, pp. 1721–1730. Association for Computing Machinery
(2015). https://doi.org/10.1145/2783258.2788613

[18] Nori, H., Jenkins, S., Koch, P., Caruana, R.: Interpretml: A unified framework
for machine learning interpretability. arXiv preprint arXiv:1909.09223 (2019)

[19] Chen, Z.G., Kang, H.S., Yin, S.N., Kim, S.R.: Automatic ransomware detection
and analysis based on dynamic api calls flow graph, vol. 2017-January, pp.
196–201. Association for Computing Machinery, Inc (2017).
https://doi.org/10.1145/3129676.3129704

[20] Hasan, M., Rahman, M.: Ranshunt: A support vector machines based ransomware
analysis framework with integrated feature set, pp. 22–24 (2017)

[21] Shaukat, S.K., Ribeiro, V.J.: Ransomwall: A layered defense system against
cryptographic ransomware attacks using machine learning, pp. 356–363 (2018)

[22] Al-rimy, B.A.S., Maarof, M.A., Shaid, S.Z.M.: Crypto-ransomware early
detection model using novel incremental bagging with enhanced semi-random subspace
selection. Future Generation Computer Systems 101, 476–491 (2019)
https://doi.org/10.1016/j.future.2019.06.005

[23] Agrawal, R., Stokes, J.W., Selvaraj, K., Marinescu, M.: Attention in recurrent
neural networks for ransomware detection, pp. 3222–3226 (2019)

[24] Kok, S.H., Azween, A., Jhanjhi, N.Z.: Evaluation metric for crypto-ransomware
detection using machine learning. Journal of Information Security and
Applications 55 (2020) https://doi.org/10.1016/j.jisa.2020.102646

[25] Ahmed, Y.A., Koçer, B., Huda, S., Al-rimy, B.A.S., Hassan, M.M.: A system call
refinement-based enhanced minimum redundancy maximum relevance method for
ransomware early detection. Journal of Network and Computer Applications 167
(2020) https://doi.org/10.1016/j.jnca.2020.102753

[26] Alqahtani, A., Gazzan, M., Sheldon, F.T.: A proposed crypto-ransomware early
detection (CRED) model using an integrated deep learning and vector space model
approach, pp. 275–279. Institute of Electrical and Electronics Engineers Inc.
(2020). https://doi.org/10.1109/CCWC47524.2020.9031182

[27] Bae, S.I., Lee, G.B., Im, E.G.: Ransomware detection using machine learning
algorithms, vol. 32. John Wiley and Sons Ltd (2020).
https://doi.org/10.1002/cpe.5422

[28] Qin, B., Wang, Y., Ma, C.: Api call based ransomware dynamic detection
approach using textcnn, pp. 162–166. Institute of Electrical and Electronics
Engineers Inc. (2020). https://doi.org/10.1109/ICBAIE49996.2020.00041

[29] Hwang, J., Kim, J., Lee, S., Kim, K.: Two-stage ransomware detection using
dynamic analysis and machine learning techniques. Wireless Personal Communi-
cations 112, 2597–2609 (2020) https://doi.org/10.1007/s11277-020-07166-9

[30] Homayoun, S., Dehghantanha, A., Ahmadzadeh, M., Hashemi, S., Khayami, R.:
Know abnormal, find evil: Frequent pattern mining for ransomware threat hunting
and intelligence. IEEE Transactions on Emerging Topics in Computing 8, 341–351
(2020) https://doi.org/10.1109/TETC.2017.2756908

[31] Roy, K.C., Chen, Q.: Deepran: Attention-based bilstm and crf for ransomware
early detection and classification. Information Systems Frontiers 23, 299–315
(2021) https://doi.org/10.1007/s10796-020-10017-4

[32] Almousa, M., Basavaraju, S., Anwar, M.: Api-based ransomware detection using
machine learning-based threat detection models. Institute of Electrical and
Electronics Engineers Inc. (2021).
https://doi.org/10.1109/PST52912.2021.9647816

[33] Al-rimy, B.A.S., Maarof, M.A., Alazab, M., Shaid, S.Z.M., Ghaleb, F.A.,
Almalawi, A., Ali, A.M., Al-Hadhrami, T.: Redundancy coefficient gradual
up-weighting-based mutual information feature selection technique for crypto-
ransomware early detection. Future Generation Computer Systems 115, 641–658
(2021) https://doi.org/10.1016/j.future.2020.10.002

[34] Nguyen, D.T., Lee, S.: Lightgbm-based ransomware detection using api call
sequences. International Journal of Advanced Computer Science and Applications
12 (2021)

[35] Hirano, M., Hodota, R., Kobayashi, R.: Ransap: An open dataset of
ransomware storage access patterns for training machine learning models. Forensic
Science International: Digital Investigation 40 (2022)
https://doi.org/10.1016/j.fsidi.2021.301314

[36] Kok, S.H., Abdullah, A., Jhanjhi, N.Z.: Early detection of crypto-ransomware
using pre-encryption detection algorithm. Journal of King Saud University -
Computer and Information Sciences 34, 1984–1999 (2022)
https://doi.org/10.1016/j.jksuci.2020.06.012

[37] Molina, R.M.A., Torabi, S., Sarieddine, K., Bou-Harb, E., Bouguila, N., Assi,
C.: On ransomware family attribution using pre-attack paranoia activities. IEEE
Transactions on Network and Service Management 19, 19–36 (2022)
https://doi.org/10.1109/TNSM.2021.3112056

[38] Singh, A., Ikuesan, R.A., Venter, H.: Ransomware detection using process
memory, pp. 1–10 (2022)

[39] Poudyal, S., Dasgupta, D.: Analysis of Crypto-Ransomware Using ML-Based
Multi-Level Profiling. Institute of Electrical and Electronics Engineers Inc. (2021).
https://doi.org/10.1109/ACCESS.2021.3109260
