
Towards Pentesting Automation Using the Metasploit Framework

Ovidiu Valea∗ and Ciprian Oprişa∗†
∗Technical University of Cluj-Napoca
†Bitdefender

valea.ovidiu@gmail.com, coprisa@bitdefender.com

978-1-7281-9080-8/20/$31.00 ©2020 IEEE

Abstract—Penetration testing is a well-known methodology for assessing security vulnerabilities by executing the complex steps that form an attack. Professional pentesting is an expensive service that sometimes cannot fit in the budget of Small and Medium Enterprises. Automating this process means it can be executed even by inexperienced system administrators, while it saves time for professionals.
The difficulty of this problem lies in the heterogeneity of networks and systems, so the techniques need to be adapted each time. Our approach is based on identifying system characteristics, searching for existing vulnerabilities and applying machine learning to select the most appropriate exploit. The model was trained using data collected from exploited machines on the “Hack the Box” learning platform and delivers exploits from the Metasploit framework.
The evaluation shows that the proposed framework can exploit a fair number of systems and can be extended to support new classes of exploits and new pentesting methodologies.

I. INTRODUCTION

Cybercrime is a major concern for most companies, as it causes losses of 2.9 million USD every minute [1]. Companies invest millions of dollars in security programs to protect their critical infrastructures and to prevent security breaches. A study by Gartner [2] indicates that 124 billion dollars were spent in 2019 on information security products and services. Such data shows that cybersecurity is an important subject that cannot be treated with a lack of interest [3].

A good security strategy involves periodically testing the security of an organization's infrastructure. One of the most popular techniques for finding vulnerabilities is penetration testing (pentesting for short).

Matt [4] defines this technique as the process in which cyber security professionals, also called white hat hackers, run the same types of attacks as a real attacker, called a black hat hacker, would, at a company's or client's request. This test verifies that the target organization keeps its systems up to date with patches, monitors and responds appropriately to cyber threats, and follows the best practices for its systems and services.

Furthermore, the authors of [5] mention that pentesting is a unique and competitive line of work that implies thinking like a cyber criminal, using the attack tactics to your advantage and finding the weak link in a very complicated defense chain. The process of penetration testing implies ignoring organizations' perceptions of their own security and verifying the systems for weaknesses.

Data obtained from a successful pentest often reveals problems that vulnerability assessment cannot identify. In most cases, these data consist of passwords, links between networks and personally identifiable information (PII). Security engineers who run the pentest have access to the most sensitive resources of the company; they can access zones where a wrong action has heavy real-world consequences.

In penetration testing there is limited knowledge about procedures for automating the process. Cyber security researchers are noticing the lack of resources on automating security testing methodologies. Reference [6] emphasizes that a partial or even full automation of the security testing procedures is preferred.

The meaning of automated penetration testing is explained by Farah and Esraa [7], who state that this process is a combination of all the experiences of security experts. Therefore, even users with little knowledge in the field of security can replace the pentesting team to get an overview of the organization's security.

Having defined what pentesting is, we note that it usually involves a human factor, usually a professional who orchestrates and performs the process. Hiring such a professional might exceed the security budget of Small and Medium Enterprises. In this paper we propose a framework that automates the most common steps of the pentesting process, so it can be run even by an inexperienced system administrator and can also reduce the time needed by a professional.

The following section presents related work, while Section III presents the most common vulnerabilities exploited during a penetration test. Section IV describes the details of the proposed framework. The paper continues with experimental results in Section V and concludes with Section VI.

II. RELATED WORK

Penetration testing is an approach to security widely studied in the literature [8]. The majority of available work aims at discovering vulnerabilities at the network level and mostly relies on procedures executed manually by a security expert.


Authorized licensed use limited to: Yildirim Beyazit Univ. Downloaded on October 10,2021 at 17:15:18 UTC from IEEE Xplore. Restrictions apply.
As discussed in this paper, compared to existing research, our approach to penetration testing focuses on finding the best suited exploit and relies on an automated process.

Although it has peculiar characteristics, the contribution in the literature most similar to our proposal is [9], where automated penetration testing is performed on an application running in a cloud, not on a remote machine. Other works exploring the automation of the penetration testing process are [10] and [11]. In particular, the latter paper proposes a system called Nemesis that queries a database of known vulnerabilities and uses Metasploit to execute the tests.

Another paper proposing penetration test automation by scripting is [12]. The main difference between the platform implemented in that paper and our framework is that we do not use a user interface where the user selects the initial template which generates the testing plan; instead, our framework figures out automatically how to exploit the target.

An approach similar to ours, in that it is based on the use of rule trees, is suggested in [13]. The solution proposed in that research is based on previously known avenues of attack, whereas our solution is based on the services currently running on the target machine and their vulnerabilities.

III. COMMON VULNERABILITIES AND EXPLOITS

A. Dirty Cow (CVE-2016-5195)

Dirty Cow [14] is a very popular vulnerability and is actively exploited in cyber attacks. It is a race condition that was found in the way the Linux kernel's memory subsystem handled the copy-on-write (COW) breakage of private read-only memory mappings [15]. This is a 13-year-old critical vulnerability that has been discovered in virtually all versions of the Linux operating system.

It is basically a privilege escalation, but researchers are taking it extremely seriously because the Dirty Cow flaw exists in a section of the Linux kernel that is part of virtually every distribution of the open-source operating system, including RedHat, Debian, and Ubuntu.

Dirty COW potentially allows any installed malicious app to gain administrative (root-level) access to a device and completely hijack it within just 5 seconds. The Dirty COW vulnerability has been present in the Linux kernel since version 2.6.22 in 2007, and is also believed to be present in Android, which is powered by the Linux kernel [16]. More precisely, it affects Linux kernel 2.x through 4.x before 4.8.3 [17].

An unprivileged local user could use this flaw to gain write access to otherwise read-only memory mappings and thus increase their privileges on the system. This flaw allows an attacker with a local system account to modify on-disk binaries, bypassing the standard permission mechanisms that would prevent modification without an appropriate permission set.

B. EternalBlue (CVE-2017-0144)

This exploit was created by the NSA as a cyber attack tool and is officially named MS17-010 by Microsoft. It affects only Windows operating systems; anything that uses the SMBv1 (Server Message Block version 1) file-sharing protocol is technically at risk of being targeted for ransomware and other cyber attacks [18]. The NSA used EternalBlue for five years before alerting Microsoft of its existence.

The MS17-010 patch was designed to fix the SMBv1 software flaws for all supported Windows operating systems, including Windows XP, Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10, Windows Server 2008, Windows Server 2003, Windows Server 2012, and Windows Server 2016. Microsoft also disabled SMBv1 by default in the latest versions of Windows 10 and Windows Server 2012 and 2016.

The exploit makes use of the way Microsoft Windows handles, or rather mishandles, specially crafted packets from malicious attackers. All the attacker needs to do is send a maliciously crafted packet to the target server. EternalBlue has been famously used to spread the WannaCry [19] and Petya ransomware, but the exploit can be used to deploy any type of cyberattack, including cryptojacking and worm-like malware.

As of June 2020, Avast is still blocking around 20 million EternalBlue attack attempts every month. Microsoft's patch closes the security vulnerability completely, thus preventing attempts at deploying ransomware, malware, cryptojacking, or any other worm-like attempts at digital infiltration using the EternalBlue exploit. A key problem remains for many versions of Windows, however: the software update must be installed in order to provide protection.

Shadow Brokers, the group that leaked the NSA exploits, published more vulnerabilities together with EternalBlue, for example EternalRomance and EternalChampion, but the Microsoft MS17-010 patch resolves them all.

All of the exploits presented previously can also be pivoted to a Meterpreter session via the DoublePulsar implant. DoublePulsar is a backdoor implant tool that runs in kernel mode, which grants cybercriminals a high level of control over the computer system. Once installed, it uses three commands: ping, kill, and exec, the latter of which can be used to load malware onto the system [20].

C. BlueKeep (CVE-2019-0708)

BlueKeep is a software vulnerability affecting older versions of Microsoft Windows. Its risk is significant because it attacks an operating system's Remote Desktop Protocol (RDP), which connects to another computer over a network connection. This would allow a cyberthreat to spread very quickly [21].

Microsoft says that vulnerable in-support systems, which are those still supported by the company, include Windows 7, Windows Server 2008 R2, and Windows Server 2008. Out-of-support systems include Windows 2003 and Windows XP. Customers running Windows 8 and Windows 10 are not affected by the vulnerability.

A remote code execution vulnerability exists in Remote Desktop Services, formerly known as Terminal Services, when an unauthenticated attacker connects to the target system
using RDP and sends specially crafted requests, aka "Remote Desktop Services Remote Code Execution Vulnerability" [22].

It is marked Critical in the National Vulnerability Database with a score of 9.8. Microsoft has warned that the BlueKeep vulnerability could cause a wormable cybersecurity outbreak that could propagate from vulnerable computer to vulnerable computer in a similar way to how the WannaCry malware spread across the globe in 2017.

IV. FRAMEWORK DESCRIPTION

A. Framework Architecture

The goal of the proposed framework is to obtain a root shell inside a Meterpreter session on the given target. If this goal is achieved, the attacker has full control over the victim machine. The following macro-steps are performed when the framework is used to test the security of a machine:

• scan for open ports and enumerate the services
• search for vulnerabilities in the services that were found
• select an appropriate exploit that best fits the target characteristics
• run the attack from the Metasploit framework and search for open Meterpreter sessions

The pentesting process starts with scanning the system, extracting the important data and storing it. In this process we use the Nmap library available for Python [23], [24] together with two Nmap Scripting Engine scripts, namely vulscan [25], [26] and nmap-vulners [27].

For a better understanding, we present the basic concepts behind the Nmap Scripting Engine (NSE). NSE is one of the most powerful and flexible features of Nmap, but it is often disregarded by pentesters. It allows users to write simple scripts to automate a wide variety of networking tasks. These scripts are then executed in parallel with the speed and efficiency of Nmap. Users can rely on the growing and diverse set of scripts distributed with Nmap, or write their own to meet custom needs [28].

The scripts we used, nmap-vulners and vulscan, were designed to improve Nmap's version detection by finding relevant information about the CVEs of a service like SSH, RDP or SMB and the potential exploits for the CVEs found. Both of these NSE scripts do an excellent job of displaying useful information related to vulnerable services. nmap-vulners queries an online exploit database every time the NSE script is used. vulscan, on the other hand, queries a local database on our computer, which is preconfigured when vulscan is downloaded for the first time. Nmap finds the version of the services that were discovered and the NSE scripts use it to search for the known CVEs that can be used to exploit that service, which makes the vulnerability search easier.

CVE, short for Common Vulnerabilities and Exposures, is a method used by security researchers and by exploit databases to index individual vulnerabilities. ExploitDB [29] is the most popular database that contains public exploits. This database uses CVEs to index the exploits and vulnerabilities associated with a version of a service. CVEs from the nmap-vulners script or from well-known databases like VulDB [30] and NVD [17] are organized by severity, which can be Critical, High, Medium or Low, and have an associated criticality score between 0 and 10.

With the facts presented so far, we can design a simple methodology that finds the versions of the services or applications and the operating system of the target using the Nmap tool, also known as the “Swiss Army Knife” of network utilities. After this step, we search for potential vulnerabilities and exploits using the vulscan and nmap-vulners scripts. Finally, we use all the gathered data in the next step for predicting the best exploit. The data format was designed to fit our need to work with a decision tree, but it can be used in other Machine Learning applications.

The dataset was manually created by following the 143 writeups of the machines from the Hack the Box platform [31], together with the data extracted after the port scan made by the Nmap tool. It contains five fields, namely:

• Port - TCP or UDP port found open by Nmap
• Service - The service running on that port
• CVE - Vulnerabilities found by the vulners script for the service
• Exploit - The most used exploit for the found port, service, CVE and operating system, based on the boxes' writeups
• OS - The operating system on which the service is running

The dataset contains 20 entries, which represent the top 20 most commonly used exploits from Metasploit. We built our top by iterating through the machines' writeups and noting the exploit chosen by hackers while trying to attack a particular machine configuration.

B. Selecting the Best Exploit

This module has the role of finding the best suited exploit for a given service, the port on which the service is running, the operating system of the machine and the vulnerabilities found.

Given the low dimensionality and the small dataset, complex solutions based on deep learning would lead to overfitting. We have chosen instead to build a decision tree, as it resembles the decision process of a pentester when selecting an exploit.

A decision tree is a non-parametric supervised learning algorithm used for classification and regression problems [32], having a structure in which a node represents an attribute, the branches are decision rules and the leaves represent the result. The highest-level node is known as the root node, and it learns to split data based on the attribute value. Also, the decision tree is a white box Machine Learning algorithm, which exposes its internal decision-making logic, something that does not happen in the case of black box algorithms like Neural Networks. The training time is also shorter than that of a neural network. Furthermore, the complexity of the tree depends on the number of inputs and attributes in the dataset, and it can handle large datasets with good precision.

To get a better understanding of how a decision tree works and of the criteria by which it selects the best exploit, we have two
notions that stand at the base of a decision tree: information gain and Gini impurity.

To define Information Gain precisely, we begin by defining a measure commonly used in information theory called Entropy [33]. Entropy basically tells us how impure a collection of data is. The term impure here means non-homogeneous. In other words, we can say that Entropy is the measurement of homogeneity [34], and it can be computed using Equation 1.

$$H(x) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i). \tag{1}$$

Information gain is used to choose the testing attribute for the node division based on the maximum reduction of the entropy at that given node [35]. It computes the difference between the entropy before the split and the weighted average of the entropy after the dataset is split based on the values of a given attribute.

$$\mathrm{Gain}(S, A) \equiv \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \, \mathrm{Entropy}(S_v) \tag{2}$$

The attribute with the biggest information gain is chosen as the split attribute for a node. Therefore, for the current node split, the decision tree picks the attribute that has the lowest entropy, implying the maximum purity of the partitions and the biggest information gain.

The second notion that is important for a decision tree is Gini impurity. It is a criterion to minimize the probability of error in classification. The Gini index does a binary split for each attribute and we can compute a weighted sum of impurities for each partition.

$$GI = 1 - \sum_{i=1}^{k} p(i \mid n)^2. \tag{3}$$

The input of the decision tree needs to be preprocessed. As in any other Machine Learning model, input data cannot consist of strings and needs to be converted to numerical data. Real-world data is mostly made of strings, which are called categorical data because each value represents a different category. To transform categorical data into numerical data we have two possibilities, namely Label encoding and One-Hot encoding.

Label encoding converts each unique categorical value to a number. This process is reversible, and a natural ordering relationship exists between the numerical values. For categorical data that does not have this type of relationship, label encoding is not enough. In fact, using this encoding and leaving the model to infer a natural ordering by itself can cause poor performance and unexpected results.

In this case, one-hot encoding can be applied on top of the numerical data; in this process the column with the initial values is replaced by one binary column for each unique value. For example, the column OS will be replaced by the columns OS_Windows and OS_Linux, which take the value 0 or 1 depending on the initial value of the column.

After encoding the data from the CSV file, we need to divide our dataset into a training set and a testing set. These two sets help us in the validation process, which can be done with the help of two methods.

The first method is called Hold-Out Validation; it uses a given percentage of the dataset as the training set and the rest as the testing set. We tested many ways of splitting the dataset with this method and obtained the best prediction precision, 33%, with a split of 70% training data and 30% testing data.

Because the first method wastes 30% of the dataset, we chose the second validation method, which is called Cross Validation. Even though this method is more expensive computationally, it uses the whole dataset. For example, Cross Validation can split the dataset into five pieces, of which it uses four for training and the last one for validation. This process repeats until all five pieces have been used for validation.

In our framework we use a variation of this method called K-fold, which divides the data into K equal partitions. For each fold it uses K − 1 partitions for training and the remaining one for validation. The main advantage of this method is that all data is used for both training and validation. Each example is used for validation exactly once.

We decided to use K-Fold Cross Validation because we want our model to predict an exploit precisely and to avoid overfitting, given that we have a pretty small dataset.

An overfitted tree will manage to classify the training data perfectly, without any errors. The disadvantage of such a tree is that it is very sensitive: any small change in the training data causes the prediction to change drastically, which means that the variance of the model is very high. In this case the prediction of the model will be wrong for new data. To prevent overfitting, we have to set a stop condition.

A tree with a small depth cannot capture the non-linear boundary which separates the classes. By minimizing the tree depth we raise the bias, which means the classification error at training, but at the same time we minimize the variance. Balancing bias and variance means searching for a compromise between the two.

While training, the tree continues to grow until each region has exactly one training point, which means a training precision of 100%. This results in a complete classification tree which splits the training dataset until each leaf has a single element. In other words, the tree overfits the training set.

The depth of the tree can be determined by evaluating the tree on a dataset with the help of cross validation. By dividing the data into training and validation partitions, by learning trees of different sizes on the training partitions and by looking at the classification precision on the validation set, we can find the tree depth which gives the best balance between
bias and variance. A tree like this does not predict perfectly on the training set, but its performance will be approximately the same if we change the training set a bit. This means we will have an acceptable bias and variance.

Given that we have a small dataset, we chose to use K-Fold Cross Validation with two folds. The next step is to train the decision tree using the training set, after which we compute the prediction precision for a test example. This prediction is based on the test data.

Finally, to visualize the decision tree we used the Python library called graphviz.

C. Exploiting the Vulnerabilities

In the process of exploiting the vulnerabilities we use the Metasploit framework together with the exploit that the decision tree predicted to execute the steps needed for a penetration test.

An exploit is an instruction or data sequence that takes advantage of a vulnerability to trigger inappropriate behavior. It is worth noting that exploit and payload are two distinct concepts.

This step offers control over the vulnerable system or leads to compromising the organization. In this phase we have to choose the right exploit and payload based on the target's operating system and hardware, so that we can achieve our goal.

Usually, we try to exploit the most critical vulnerabilities first to obtain control over the vulnerable system. This is not achieved only by using a Remote Code Execution type vulnerability; the lateral movement strategy also helps.

The exploits are mostly found in the Metasploit framework or on ExploitDB, and their use is chosen based on the application, the specific conditions and the mitigations present on the vulnerable target, because each vulnerability can be exploited in its own way and two identical vulnerabilities can be exploited differently.

There are four main classifications of vulnerabilities:

• Binary - memory corruptions, race conditions, etc.
• Web - XSS, CSRF, RCS, SQL Injection, Local/Remote File Inclusion, etc.
• Generic - weak or default credentials, directory traversal, etc.
• Result based - RCE or Non-RCE

The payload represents the code which executes after the system is exploited. It is mainly a shellcode, which can be delivered in multiple stages if the shellcode is very big, or in a “drive-by-download” attack, which involves downloading the shellcode onto the vulnerable system and executing it as a binary. Metasploit has a varied list of payloads, each of which fits different types of exploits, operating systems and processor architectures.

Depending on the vulnerability and exploit, payloads need to be encoded to avoid unwanted characters and to avoid antivirus detection. This encoding applies to binary payloads and involves encrypting them. All vulnerabilities that do not directly give arbitrary code execution, like XSS, CSRF, SQL Injection, etc., can be considered Non-RCE vulnerabilities.

In our research we use an RPC client made available by Metasploit that allows us to run commands from our Python script directly in the framework. After the connection with this client is established via a specified port and a password, we can set the exploit generated by the decision tree. The next steps are setting the IP of the target machine and our own IP, and setting the Meterpreter payload based on the running operating system. After this, we execute the exploit.

Meterpreter is an advanced payload which offers a complete working environment on the compromised machine. It works on different operating systems like Android, Windows and Linux, and it exists in many forms, such as a binary for the x86 or x64 architecture, or a script written in PHP or Python. It is compatible with the majority of exploits; the disadvantage is that it can be detected by anti-virus software, IDS or IPS. This payload has various features: it can create a shell, run commands with tab completion, execute privilege escalation, run additional Metasploit modules, and upload or download files. Meterpreter is loaded only in memory and does not write anything to disk. Furthermore, it does not create new processes, as Meterpreter injects itself into the compromised process and can migrate into another open process very easily.

If the executed exploit works, the framework tells us that a shell was created that we can access. The first thing we check in the newly created shell is the privilege level of the user. If we do not obtain root or administrator privileges from the exploit execution, we start the privilege escalation stage. Usually, we can also find exploits in Metasploit for privilege escalation. If we have root privileges on a machine from the Hack the Box platform, we have to do one more step, namely finding the user and root flags. This task can be done easily with the help of the search command from the Meterpreter shell.

Metasploit is an open-source project that provides resources to develop, test and execute exploits. Furthermore, it can be used as a penetration testing framework and to create security testing tools or exploitation modules. This project is written in Ruby, but it allows bidirectional Remote Procedure Calls. The Python library called pymetasploit3 facilitates the interaction between Python and msgrpc from Metasploit.

The MsfRpcClient class provides the basic functionality to navigate through the Metasploit framework. Before starting the pentesting process from the framework, we have to start msfconsole and load the msgrpc service. This service needs to be protected by a password and it usually runs on port 55552. The interaction with msfrpc is very similar to that with msfconsole. The first step is to create an instance of the MsfRpcClient class, after which we log in to the msgrpc service with the previously set password and we get a virtual console. Using this virtual console we can access all the modules that are available in Metasploit, like exploits, payloads and auxiliaries. To activate one of these modules we have at
our disposal the method use.

The RPC API lets us use the Metasploit framework programmatically with the help of RPC services based on HTTP. An RPC service is a collection of message types and remote methods that provides a structured way for external applications to interact with web applications. We can use the RPC interface to execute local or remote commands, to run modules, to interact with the database and sessions, to export data and to generate reports.

Even though Metasploit is written in Ruby, we can communicate through the RPC API using any programming language that supports HTTPS and MessagePack, like Python, Java or C.

V. EXPERIMENTAL RESULTS

Hack The Box is an online platform allowing users to test their penetration testing skills and exchange ideas and methodologies with thousands of people in the security field [31].

This platform has 20 active machines and 148 retired ones that require paid access. Each box has various types of vulnerabilities, so that the user learns a variety of techniques while trying to find the user and root flags. Each week, a new virtual machine is released and a previously active machine retires. One important rule is that walkthroughs may be published only for retired machines, not for active ones.

In order to have a robust framework, we manually exploited 16 machines from this platform to discover the steps that are common to these boxes and that repeat, with the aim of automating these steps.

Of the six steps of pentesting we can skip the first one, information gathering, because HTB already gives us the target IP and we do not have to discover it. The steps that are left for us to implement in our framework are service enumeration, exploitation, persistence, post exploitation and cleanup.

The decision tree finds the right exploit with an accuracy of 33%, as seen in Figure 1. Given that the classification is not binary and the decision tree has to predict among various classes, we consider this a good accuracy.

cracking the login credentials, dirbuster to enumerate all the available URLs or msfvenom to generate a payload with. There are others that are exploitable by doing an SQL injection or by using smbmap, which enumerates the Samba shared drives across the entire domain.

We tested our framework on these 10 boxes and we managed to get the user and root flags from 7 boxes that were exploited successfully. The remaining 3 machines required a more in-depth lateral movement; we could not perform a privilege escalation from the first logged-in user.

In Figure 2 we can see the results from the generated decision tree for a box we exploited from Hack the Box. We also wanted to visualize which feature is the most important in helping the decision tree predict the best suited exploit for a particular machine, and we did this in Figure 3.

In the step that contains the initialization of the environment we initially tried using a tmux session to open both windows in the terminal for the VPN connection and the MSGRPC service. For security reasons we decided to abandon this idea and open two separate terminals that ask the user for the root password, as a confirmation that they really want to start our framework.

VI. CONCLUSION

Penetration testing is a very important process for a company's security and there are many attackers waiting for the next target. Pentesters also have to train so they can find all the open doors left for attackers.

We have proposed a framework that automates the pentesting steps and can be used by pentesters to see whether a machine can be easily exploited.

Our research can also be useful for anyone who wants to see which steps repeat in every pentest, and for those who are looking to solve a bug bounty task without losing time on targets that are vulnerable and have publicly known exploits.

Experimental results showed that the proposed framework is able to solve precisely the machines with common vulnerabilities and public exploits.

We showed that, compared to traditional approaches to penetration testing, our process can be much faster and automated and is able to cover a large set of vulnerabilities and exploits, saving precious time for a pentester.

R EFERENCES
[1] RiskIQ. (2019) In just one evil internet minute over two phish
Fig. 1. Predicted exploit are detected and $2.9 million is lost to cybercrime. [Online].
Available: https://www.riskiq.com/press-release/just-one-evil-internet-
minute-two-phish-detected-2-9-million-lost-cybercrime-reveals-riskiq/
We went through all the writeups available for the 168 boxes [2] S. Moore and E. Keen. (2019) Gartner forecasts worldwide
and we realized that only 10 machines could be exploited information security spending to exceed $124 billion in
using just Metasploit framework and no human interaction. 2019. [Online]. Available: https://www.gartner.com/en/newsroom/press-
releases/2018-08-15-gartner-forecasts- worldwide-information-security-
By human interaction we refer to applications, mostly web, spending-to-exceed-124-billion-in-2019/
that need clicks or navigation to login, etc. The majority of [3] R. Sobers. (2020) 110 must-know cybersecurity statistics for
the boxes require the use of more tools to get to the root shell. 2020. [Online]. Available: https://www.varonis.com/blog/cybersecurity-
statistics/
Many machines that run web applications can be exploited [4] M. Burrough, Pentesting Azure Applications: The Definitive Guide to
with the help of Burp Suite that can intercept request, John for Testing and Securing Deployments. No Starch Press, 2018.

Authorized licensed use limited to: Yildirim Beyazit Univ. Downloaded on October 10,2021 at 17:15:18 UTC from IEEE Xplore. Restrictions apply.
[Figure 2: decision tree; each split tests a binary feature. Branch labels (True/False) are shown only at the root. Nodes are listed below with their children indented.]

port_139 ≤ 0.5 (gini = 0.92, samples = 100.0%, class = multi/http/php_cgi_arg_injection)
  leaf (gini = 0.0, samples = 11.1%, class = multi/samba/usermap_script)
  service_http ≤ 0.5 (gini = 0.914, samples = 88.9%, class = multi/http/php_cgi_arg_injection)
    leaf (gini = 0.0, samples = 11.1%, class = multi/http/php_cgi_arg_injection)
    cve_CVE-2011-3556 ≤ 0.5 (gini = 0.908, samples = 77.8%, class = multi/misc/java_rmi_server)
      leaf (gini = 0.0, samples = 11.1%, class = multi/misc/java_rmi_server)
      cve_CVE-2011-2523 ≤ 0.5 (gini = 0.903, samples = 66.7%, class = unix/ftp/vsftpd_234_backdoor)
        leaf (gini = 0.0, samples = 11.1%, class = unix/ftp/vsftpd_234_backdoor)
        os_windows ≤ 0.5 (gini = 0.9, samples = 55.6%, class = unix/irc/unreal_ircd_3281_backdoor)
          port_6667 ≤ 0.5 (gini = 0.5, samples = 11.1%, class = unix/irc/unreal_ircd_3281_backdoor)
            leaf (gini = 0.0, samples = 5.6%, class = unix/misc/distcc_exec)
            leaf (gini = 0.0, samples = 5.6%, class = unix/irc/unreal_ircd_3281_backdoor)
          service_ms-wbt-server ≤ 0.5 (gini = 0.875, samples = 44.4%, class = windows/local/ms18_8120_win32k_privesc)
            leaf (gini = 0.0, samples = 5.6%, class = windows/rdp/cve_2019_0708_bluekeep_rce)
            port_* ≤ 0.5 (gini = 0.857, samples = 38.9%, class = windows/local/ms18_8120_win32k_privesc)
              leaf (gini = 0.0, samples = 5.6%, class = windows/local/ms18_8120_win32k_privesc)
              cve_CVE-2017-0143 ≤ 0.5 (gini = 0.833, samples = 33.3%, class = windows/mssql/ms09_004_sp_replwritetovarbin_sqli)
                (...) (...)

Fig. 2. Exploit prediction decision tree
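
The gini values reported in the nodes of Fig. 2 can be checked by hand. The following is a minimal Python sketch, assuming the standard Gini impurity definition (one minus the sum of squared class proportions). The class counts used here are an assumption: they are a hypothetical reading of the figure in which a 5.6% leaf corresponds to one sample out of 18. The function name `gini_impurity` is illustrative and does not come from the paper.

```python
def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_i^2) for a list of per-class sample counts."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("at least one sample is required")
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


# Assumed class counts (hypothetical reading of Fig. 2, 18 samples in total).
eight_singletons = [1] * 8   # a node holding 8 samples, each of a different class
one_of_each = [1, 1]         # a node holding 2 samples of 2 different classes
pure_leaf = [2]              # a leaf whose samples all share one class

print(round(gini_impurity(eight_singletons), 3))  # 0.875
print(round(gini_impurity(one_of_each), 3))       # 0.5
print(gini_impurity(pure_leaf))                   # 0.0
```

Under this assumed reading, the three values match the service_ms-wbt-server node, the port_6667 node and the pure leaves respectively. High impurity deep in the tree suggests that most exploit classes are represented by very few training samples.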

[5] D. Kennedy, J. O'Gorman, D. Kearns, and M. Aharoni, Metasploit: The Penetration Tester's Guide. No Starch Press, 2011.
[6] D. Xu, M. Tu, M. Sanford, L. Thomas, D. Woodraska, and W. Xu, “Automated security test generation with formal threat models,” IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 4, pp. 526–540, 2012.
[7] F. Abu-Dabaseh and E. Alshammari, “Automated penetration testing: An overview,” Computer Science & Information Technology, 2018.
[8] M. Bishop, “About penetration testing,” Computer, vol. 5, no. 6, pp. 84–87, 2007.
[9] V. Casola, A. de Benedictis, M. Rak, and U. Villano, “Towards automated penetration testing for cloud applications,” Computer, 2018.
[10] C. Sarraute, O. Buffet, and J. Hoffmann, “POMDPs make better hackers: Accounting for uncertainty in penetration testing,” Computer, 2012.
[11] P. Kamongi, M. Gomathisankaran, and K. Kavi, “Nemesis: automated architecture for threat modeling and risk assessment for cloud computing,” Computer, 2014.
[12] B. Duan, Y. Zhang, and D. Gu, “An easy-to-deploy penetration testing platform,” Computer, 2008.
[13] J. Zhao, W. Shang, M. Wan, and P. Zeng, “Penetration testing automation assessment method based on rule tree,” Computer, 2015.
[14] D. Alam, M. Zaman, T. Farah, R. Rahman, and M. S. Hosain, “Study of the dirty copy on write, a Linux kernel memory allocation vulnerability,” in 2017 International Conference on Consumer Electronics and Devices (ICCED). IEEE, 2017, pp. 40–45.
[15] Corey. (2019) Dirty COW vulnerability details. [Online]. Available: https://github.com/dirtycow/dirtycow.github.io/wiki/VulnerabilityDetails
[16] S. Khandelwal. (2016) Dirty COW critical Linux kernel flaw being exploited in the wild. [Online]. Available: https://thehackernews.com/2016/10/linux-kernel-exploit.html
[17] National Vulnerability Database. (2016) CVE-2016-5195 detail. [Online]. Available: https://nvd.nist.gov/vuln/detail/CVE-2016-5195
[18] C. Burdova. (2020) What is EternalBlue and why is the MS17-010 exploit still relevant? [Online]. Available: https://www.avast.com/c-eternalblue
[19] S. Mohurle and M. Patil, “A brief study of WannaCry threat: Ransomware attack 2017,” International Journal of Advanced Research in Computer Science, vol. 8, no. 5, 2017.
[20] (2020) DoublePulsar. [Online]. Available: https://en.wikipedia.org/wiki/DoublePulsar/
[21] J. Elder. (2019) The much-publicized BlueKeep threat has finally emerged: why should you care? [Online]. Available: https://blog.avast.com/what-is-bluekeep
[22] National Vulnerability Database. (2019) CVE-2019-0708 detail. [Online]. Available: https://nvd.nist.gov/vuln/detail/CVE-2019-0708
[23] A. Norman. (2016) python-nmap 0.6.1. [Online]. Available: https://pypi.org/project/python-nmap/
[24] T. O’Connor, Violent Python: a cookbook for hackers, forensic analysts, penetration testers and security engineers. Newnes, 2012.
[25] M. Ruef. (2020) vulscan.nse: vulnerability scanner with nmap. [Online]. Available: https://www.computec.ch/projekte/vulscan
[26] H.-C. Huang, Z.-K. Zhang, H.-W. Cheng, and S. W. Shieh, “Web application security: Threats, countermeasures, and pitfalls,” Computer, vol. 50, no. 6, pp. 81–85, 2017.
[27] S. Rahalkar, “Introduction to nmap,” in Quick Start Guide to Penetration Testing. Springer, 2019, pp. 1–45.
[28] P. C. Pale, Mastering the Nmap Scripting Engine. Packt Publishing Ltd, 2015.

[Figure 3: horizontal bar chart; x-axis: Importance, ranging from 0 to 0.2]

Feature                  Importance
service http             0.19
os Windows               0.1
port 1099                0.1
service UnrealIRCD       0.1
CVE-2011-4051            0.1
port 3632                0.1
service ms-wbt-server    0.1
CVE-2008-4250            0.1
CVE-2008-5416            0.1
port 445                 0

Fig. 3. Feature importances
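
One common way to obtain per-feature scores like those in Fig. 3 is impurity-based importance: each split contributes its sample-weighted reduction in Gini impurity to the feature it tests, and the totals are normalised to sum to 1. Whether Fig. 3 was produced exactly this way is an assumption, since the text does not say. The sketch below uses a hypothetical toy tree (6 hosts, 3 exploit classes), and the helper names `gini` and `feature_importances` are illustrative only.

```python
def gini(counts):
    """Gini impurity for a list of per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def feature_importances(splits):
    """Normalised impurity decrease per feature.

    `splits` is a list of (feature, parent_counts, left_counts, right_counts);
    the first entry is assumed to be the root, so its parent holds all samples.
    """
    n_total = sum(splits[0][1])
    raw = {}
    for feature, parent, left, right in splits:
        # Sample-weighted impurity of the parent minus that of both children.
        decrease = (
            sum(parent) * gini(parent)
            - sum(left) * gini(left)
            - sum(right) * gini(right)
        ) / n_total
        raw[feature] = raw.get(feature, 0.0) + decrease
    norm = sum(raw.values())
    return {feature: value / norm for feature, value in raw.items()}


# Hypothetical toy tree: 6 hosts, 3 exploit classes, two binary features.
toy_splits = [
    ("port_139", [2, 2, 2], [0, 2, 2], [2, 0, 0]),
    ("service_http", [0, 2, 2], [0, 0, 2], [0, 2, 0]),
]
importances = feature_importances(toy_splits)
print(importances)
```

In this toy tree both splits remove the same amount of weighted impurity, so the two features receive roughly equal importance. A feature that is never used in a split, like port 445 in Fig. 3, would receive 0.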

[29] Offensive Security. (2020) Exploit database. [Online]. Available: https://www.exploit-db.com/
[30] M. Ruef. (2020) Vuldb: the community-driven vulnerability database. [Online]. Available: https://vuldb.com/
[31] (2020) Hack the box. [Online]. Available: https://www.hackthebox.eu/
[32] R. Molala. (2019) Upside down trees that divide and conquer. [Online]. Available: https://blog.clairvoyantsoft.com/upside-down-trees-that-divide-and-conquer-e893c8f73ee8
[33] C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
[34] P. Badiuzzaman. (2020) Entropy calculation, information gain & decision tree learning. [Online]. Available: https://medium.com/analytics-vidhya/entropy-calculation-information-gain-decision-tree-learning-771325d16f
[35] M. Gorunescu. (2010) Arbori de clasificare si decizie. [Online]. Available: http://math.ucv.ro/~gorunescu/courses/DM/curs2.pdf
