HDROP: Detecting ROP Attacks Using Performance Monitoring Counters

HongWei Zhou1,2,3, Xin Wu1,2, WenChang Shi1,2, JinHui Yuan3, and Bin Liang1,2

1 Key Laboratory of Data Engineering and Knowledge Engineering,
  Ministry of Education, Beijing, China
2 School of Information, Renmin University of China, Beijing, China
3 Information Engineering University, Zhengzhou, China

Abstract. By combining short instruction sequences that originate only from
existing code, Return Oriented Programming (ROP) attacks can bypass
code-integrity protections. To defeat this kind of attack, current approaches
check every instruction executed on a processor, which results in heavy
performance overhead. In this paper, we propose an innovative approach, called
HDROP, to detect these attacks. It relies on the observation that ROP attacks
often cause the branch predictor in modern processors to fail to determine the
correct branch destination. With the support of PMC (Performance Monitoring
Counters), which can count performance events, we catch the abnormal increase
in branch mispredictions and thereby detect ROP attacks. In HDROP, each basic
unit being checked consists of hundreds of instructions rather than a single
one, which avoids significant performance overhead. The prototype system we
developed on commodity hardware shows that HDROP succeeds in detecting ROP
attacks, and the performance tests demonstrate that our approach has
acceptably low overhead.

Keywords: ROP, misprediction, branch, performance monitoring counters.

1 Introduction

ROP (Return Oriented Programming) is a code-reuse attack technique that
constructs exploits by combining short instruction sequences originating only
from existing binary code. Without injecting any new code, it can circumvent
the protection provided by current code-integrity efforts, including W⊕X,
NICKLE[1] and SecVisor[2]. Furthermore, the adversary is able to perform
Turing-complete computation with ROP, and ROP has become a practical attack
technique for subverting computer systems.
The first ROP attack was proposed in 2007[3]. It chains together gadgets,
which are short instruction sequences ending with a ret instruction. Gadgets
are the essential units of a ROP attack. Therefore, a feasible prevention idea
is to make it difficult to identify gadgets in the available code base. Following

X. Huang and J. Zhou (Eds.): ISPEC 2014, LNCS 8434, pp. 172–186, 2014.

© Springer International Publishing Switzerland 2014
this idea, an approach to building ret-less software was presented[4]. However,
to avoid the reliance on the ret instruction, JOP (Jump Oriented
Programming)[6,7] was proposed, which launches attacks using gadgets ending
with a jmp instruction instead of a ret instruction. Meanwhile, improved ROP
techniques that are able to construct ROP exploits automatically have been
presented[8,9].
ROP prevention solutions can be divided into two main categories. One is to
defeat ROP attacks by eliminating the available gadgets in the code base[4,5].
G-Free[5] is a typical approach, which focuses on removing gadgets from both
the intended and the unintended code. It eliminates unintended instructions by
aligning instructions with a code-rewriting technique, and protects the
existing ret and indirect jmp/call instructions with a separate mechanism.
Thus, the adversary fails to find usable gadgets in the new code base to launch
a ROP attack.
The other is to detect ROP attacks through the abnormity introduced by their
execution[13,14,15]. ROPdefender[13] checks the destination of every ret
instruction, because ROP misuses ret instructions to transfer control from one
gadget to the next. Other work follows a similar method. However, these
solutions are usually implemented on a binary instrumentation framework such
as Pin[19] or Valgrind[20]. With the support of binary instrumentation, they
check each executed instruction to detect the specific abnormity. As a
consequence, they all incur a heavy performance overhead, ranging from a 2x to
a 5.3x slowdown.
In this paper, we focus on how to detect ROP attacks, and propose a novel
solution called HDROP (Hardware-based solution to Detect ROP attacks), which is
capable of detecting ROP attacks without significant performance overhead.
Unlike existing solutions, HDROP takes hundreds of instructions as a basic
checking unit rather than a single one. Thus, execution does not need to trap
into the monitor frequently. Consequently, HDROP reduces the performance
overhead induced by context switches between the monitored objects and the
detecting mechanism.
Our approach is based on the following observation. It is well known that
modern processors use a branch predictor to improve performance. However, ROP
attacks often make it fail to predict the right branch target. The cause is
that ROP attacks break normal execution in order to transfer control from one
gadget to the next, which makes branch targets sharply different from the
original ones. Therefore, HDROP detects ROP attacks based on a new idea: if
there is an abnormal increase in mispredictions on a given execution path, a
ROP attack may be in progress.
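To make the observation concrete, the following minimal sketch models return prediction with a small stack of return addresses. The paper does not describe the predictor's internals, so this return-stack model, the function name count_ret_mispredictions and all addresses are assumptions chosen only for illustration.

```python
def count_ret_mispredictions(trace, depth=16):
    """Count mispredicted rets under a simplified return-stack predictor.

    trace is a list of ("call", return_address) or ("ret", actual_target).
    The predictor model is an assumption, not the real hardware design.
    """
    predictor = []  # hypothetical model of a hardware return stack
    mispredicted = 0
    for kind, addr in trace:
        if kind == "call":
            predictor.append(addr)
            if len(predictor) > depth:
                predictor.pop(0)  # the oldest entry is lost on overflow
        elif kind == "ret":
            predicted = predictor.pop() if predictor else None
            if predicted != addr:
                mispredicted += 1
    return mispredicted


# Normal execution: every ret goes back to the address pushed by its call.
normal = [("call", 0x100), ("call", 0x200), ("ret", 0x200), ("ret", 0x100)]

# ROP-style execution: six rets jump to gadget addresses with no matching call.
rop = [("ret", 0x9000 + 0x10 * i) for i in range(6)]

print(count_ret_mispredictions(normal))  # 0
print(count_ret_mispredictions(rop))     # 6
```

In this toy model, a chain of gadgets produces one misprediction per ret, which is the kind of increase that HDROP looks for.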
To catch mispredictions and other relevant processor events, HDROP uses the
capabilities of the hardware PMC (Performance Monitoring Counters), which are
available on Intel processors[16]. PMC provides hardware counters that count
processor performance events such as retired instructions, executed ret
instructions and so on. With the support of PMC, we build a misprediction
profile for the monitored execution path, and expose the abnormal increase in
mispredictions introduced by ROP attacks.
Our prototype system is developed on Fedora 5 with a 2.6.15-1 kernel. First,
HDROP collects the related data by inserting thousands of checkpoints into the
kernel using a compiler-based approach. Then, it catches the abnormal increase
in mispredictions and detects ROP attacks from the collected data. To validate
the effectiveness of HDROP, we have constructed a ROP rootkit with the
approaches introduced by [3] and [7]. Our experiments show that HDROP is
capable of detecting ROP attacks. Furthermore, we have run performance tests on
commodity hardware, and the results demonstrate that HDROP has acceptably low
performance overhead.
The rest of the paper is organized as follows. Section 2 and Section 3 present
the design and implementation of HDROP respectively, followed by the evaluation
of HDROP in Section 4. Section 5 discusses our solution. Section 6 surveys
related work and Section 7 concludes this paper.

2 Design

In our design, there are three main challenges to overcome. First, which
performance events are of interest? PMC is capable of monitoring a variety of
processor performance events, and we need to identify those that are most
helpful for detecting ROP attacks. Second, how do we collect the data from the
hardware counters (e.g., PMC) for further detection? Ideally, it should be
collected without heavy overhead. Third, how do we design the detecting
algorithm? In this paper, we construct the algorithm as a balance between
accuracy and performance overhead. The solutions are discussed in detail in the
following subsections.

2.1 Interesting Performance Event

BR_RET_MISSP_EXEC[16] is our first performance event of interest. More
specifically, with BR_RET_MISSP_EXEC the hardware counter records the number of
mispredicted executed ret instructions[16]. By catching it, we are able to
detect ROP attacks through an abnormal increase in mispredicted ret
instructions. Note that in this paper HDROP is designed to detect ROP attacks
that use gadgets ending with a ret instruction. To detect JOP attacks[6,7], we
would have to identify other processor performance events, and we consider this
future work.
There is a distinction between executed instructions and retired instructions.
An executed instruction may not be retired. In other words, more executed
instructions are counted than actually retired instructions for the same
execution. Ideally, BR_RET_MISSP_RETIRED would be our event of interest.
However, we failed to find such a performance event. To resolve this issue, we
use BR_RET_MISSP_EXEC as the alternative, which does not weaken the capability
of detecting ROP attacks.
The number of executed ret instructions is also of interest. It is necessary
data for accurately detecting ROP attacks. With different inputs, the same
monitored instructions follow different execution paths. To detect the abnormal
increase in mispredicted ret instructions for the given monitored instructions,
we have to generate a baseline for each path. In more serious cases, the
baseline may be submerged by “noise”. However, if we obtain the number of
executed ret instructions at the same time, it becomes easier to identify the
execution path. Furthermore, it is feasible to detect ROP attacks with several
baselines. Therefore, BR_RET_EXEC, which counts the number of executed ret
instructions[16], is another performance event of interest.
Finally, we pay close attention to the number of retired instructions. As
mentioned earlier, our solution takes hundreds of instructions as a basic
checking unit, so we want to know the number of retired instructions on the
checked execution path. With this number, we know the length of the monitored
execution path and how frequently HDROP traps in at checking time. The
performance event is denoted as INST_RETIRED.ANY_P[16].

2.2 Collecting Data


Figure 1 demonstrates our scheme for collecting data for monitored
instructions. To prepare the data for detection, we insert CPs (Checking
Points) into the software. These CPs are scattered throughout the software;
each reads the hardware counters to collect their current values and logs the
values for later detection. To this end, two CPs are located around the
monitored instructions. As shown in figure 1, CP1 reports reading1, and CP2
reports reading2. Thus, reading2 minus reading1 is the prepared data for
checking monitored instructions A.
In our design, the reading reported by a CP can be used more than once. As
shown in figure 1, monitored instructions A are adjacent to monitored
instructions B. Therefore, CP2 is not only the exit of monitored instructions
A, but also the entry of monitored instructions B. Thus, reading2 is used
twice: in reading2 minus reading1 and in reading3 minus reading2. Note that not
every reading is used more than once, because monitored instructions are not
always adjacent to each other.
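The subtraction of adjacent readings can be sketched as follows. This is only an illustration: the function name interval_deltas and the counter values are made up, and a real CP would read the hardware counters rather than take values from a list.

```python
def interval_deltas(readings):
    """Turn (cp_name, counter_value) pairs, in program order, into deltas.

    Each delta is the counter increase between two adjacent CPs, so the
    reading of a shared CP (such as CP2) is used twice.
    """
    deltas = {}
    for (entry_cp, before), (exit_cp, after) in zip(readings, readings[1:]):
        deltas[(entry_cp, exit_cp)] = after - before
    return deltas


# Made-up counter values reported by three CPs around instructions A and B.
readings = [("CP1", 120), ("CP2", 123), ("CP3", 131)]
print(interval_deltas(readings))
# {('CP1', 'CP2'): 3, ('CP2', 'CP3'): 8}
```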

Fig. 1. An example of collecting data

Ideally, every entry and exit of the monitored instructions should be
accompanied by one CP. In the above example, CP1 and CP2 should be located at
the entry and exit of monitored instructions A respectively. However, it is
impossible to deploy CPs exactly as we expect, because we often do not have
complete information about the CFG (Control Flow Graph). Therefore, some
entries and exits of monitored instructions may not be covered by CPs, and we
may fail to monitor some execution paths because no CPs collect the values of
the hardware counters. Shrinking the monitored instructions seems a feasible
way to address this problem. However, there is a trade-off between the
performance overhead and the length of the monitored instructions. Consider two
extreme cases. In the first, the monitored object holds only one or a few
instructions. The checker is then trapped frequently, which incurs a high
performance slowdown, as in existing solutions. In the other, only a few CPs
are placed in the entire software, which is not recommended either, because the
introduced abnormity is easily submerged by “noise”.
In our opinion, a function is the ideal basic monitoring unit. Suppose that we
already know the number of mispredicted ret instructions of every subfunction;
then an abnormity occurring in the parent function is easy to capture. Two
causes contribute to this. First, a function seldom has hundreds of
subfunctions. In other words, the total number of ret instructions is usually
no more than one hundred. Second, not every ret instruction triggers a
BR_RET_MISSP_EXEC event. Note that the proposed approach is recursive: we have
to monitor every subfunction before monitoring its parent. On the other hand,
it is possible to take several functions as one monitoring unit if these
functions incur few mispredictions. Thus, we can reduce the performance
overhead further.

2.3 Detecting Algorithm


The goal of the detecting algorithm is to distinguish the abnormality from
“noise”. To this end, the direct way is to use a classification algorithm. For
example, we could use an ANN (Artificial Neural Network) as the detecting
algorithm. First, we define the two categories to be classified, normality and
abnormity. Then the ANN is trained to recognize the two classes at training
time, and outputs the likelihood of a ROP attack at checking time. However, in
this paper we do not use it as our detecting algorithm because of its heavy
performance overhead.
Our detecting algorithm, illustrated in figure 2, is an effective one. As shown
in the figure, the numbers of mispredicted ret instructions and executed ret
instructions are denoted as missp_num and exec_num, and the prepared data are
plotted as points. After training, we build a shadowed section that holds all
legal points. At checking time, if there is a ROP attack, the point lies
outside the shadowed section. The algorithm is expressed by a simple formula:
a ∗ exec_num + b < missp_num < a ∗ exec_num + b + c, where a, b and c are the
computed parameters.

Fig. 2. An example of detecting algorithm

We first explain parameter c. Usually, a ROP attack needs about 5-10
gadgets[3,7,8]. Some existing detecting approaches consider 3-5 chained gadgets
to indicate a ROP attack[14,15]. In this paper, we take the number to be 5,
which is denoted as c in figure 2. It means that most ROP attacks increase the
number of mispredicted ret instructions by at least 5. For example, assuming
that the number of mispredicted ret instructions is 3 in normal execution,
there may be a ROP attack if the number is more than 8 at run time. In essence,
our detecting algorithm identifies a narrow region, whose width is less than
parameter c, that holds only legal points.
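The check itself is a direct transcription of the formula. In the sketch below, the function name is_legal and the parameter values a = 0.0 and b = 2.5 are hypothetical and chosen only to show the shape of the test; they are not trained values from our experiments.

```python
def is_legal(exec_num, missp_num, a, b, c=5):
    """Return True if the point lies strictly inside the band of width c."""
    lower = a * exec_num + b
    return lower < missp_num < lower + c


# Hypothetical parameters: a flat baseline (a = 0) with b = 2.5.
print(is_legal(28, 3, a=0.0, b=2.5))  # True: 2.5 < 3 < 7.5
print(is_legal(36, 9, a=0.0, b=2.5))  # False: 9 is above the band
```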
We compute parameters a and b from the training data. At training time, we
collect the numbers of mispredicted ret instructions and executed ret
instructions. In this way, we get the legal points shown in figure 2.
Meanwhile, since we know the legal points, we also know the possible illegal
points. From the training data, we compute parameters a and b to build the
expected section shown in figure 2. The section must hold all legal points, but
no illegal point. Note that such a section does not always exist. In that
scenario, the legal and illegal points are mixed, and no section whose width is
less than parameter c holds all legal points. To overcome this, we have to
adjust the location of the CPs and reduce the length of the monitored
instructions. In an extreme case, we narrow the monitored instructions down to
a function without any subfunctions, in which case the section can certainly be
obtained.
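One possible way to compute a and b is sketched below. The paper does not specify the fitting procedure, so the grid search over candidate slopes, the function name fit_band and the training points are all assumptions. The sketch returns None when no candidate slope yields a band narrower than c, which corresponds to the case where the CPs have to be relocated.

```python
def fit_band(points, c=5, slopes=None):
    """Fit (a, b) so that a*e + b < m < a*e + b + c for all legal (e, m).

    points holds (exec_num, missp_num) pairs from training. This is a
    simple grid search, not the procedure used in the prototype.
    """
    if slopes is None:
        slopes = [i / 100 for i in range(0, 101)]  # candidates 0.00 .. 1.00
    best = None
    for a in slopes:
        residuals = [m - a * e for e, m in points]
        spread = max(residuals) - min(residuals)
        if spread < c and (best is None or spread < best[0]):
            best = (spread, a, min(residuals))
    if best is None:
        return None  # legal points do not fit in any band narrower than c
    spread, a, lowest = best
    b = lowest - (c - spread) / 2  # centre the legal points inside the band
    return a, b


# Made-up legal training points (exec_num, missp_num).
training = [(28, 1), (30, 1), (36, 2), (40, 2)]
print(fit_band(training))

# Two points with the same exec_num but 6 mispredictions apart cannot fit.
print(fit_band([(10, 0), (10, 6)]))  # None
```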

3 Implementation
We have implemented a prototype of HDROP on Fedora 5 with a 2.6.15-1 kernel.
Though most ROP attacks are in user space, [4] and [8] demonstrate the
feasibility of developing a ROP rootkit in kernel space. Moreover, [4] proposes
a practicable defense technology to defeat ROP attacks. Like the above work, we
have developed HDROP to check for ROP attacks in kernel space in this paper.
However, we believe that HDROP can be easily ported to detect ROP attacks in
user space.
HDROP consists of a number of CPs and a DU (Decision Unit). To collect data, we
insert thousands of CPs into the kernel with a compiler-based approach, and
every CP sends the readings of the hardware counters to the DU. The DU is
developed as a loadable module, and it activates the CPs in the kernel
according to a CP-map at loading time. HDROP can customize the monitored
objects through the configured CP-map. At training time, the DU logs the
collected data to compute the parameters of the detecting algorithm. At run
time, it performs the final checking according to the detecting algorithm.
In our implementation, the main challenge is to insert thousands of CPs into
the kernel. Our solution is to develop a gcc plug-in that dynamically inserts
two CPs around each call instruction. More specifically, we rewrite the machine
description file that guides the generation of assembly code, and have gcc
insert new instructions before and after each call instruction. The inserted
instructions call the CP function to report the readings of the hardware
counters. Thus, we can monitor the execution of a function.
Figure 3 shows an example of our approach. Assume that we want to monitor
function F_B. We insert two additional instructions around the call instruction
in function F_A, which is shown as call F_B in figure 3. Each inserted
instruction is a five-byte call instruction, shown as a shadowed pane in the
figure. Thus, CPs collect the readings of the hardware counters before and
after the execution of function F_B. In this way, we redirect the kernel
control flow to our code to collect the readings of the hardware counters.

Fig. 3. An example of inserting CPs. The shadowed panes are the inserted
instructions, and function F_B is the monitored object. The dashed line
indicates the original execution path, while the solid line shows the
additional execution path after the CPs are inserted into the kernel.

Some readers may wonder why we place the CPs around each call instruction. Our
original intention is to build a function-granularity monitoring framework. To
this end, as shown in figure 3, one CP is inserted at the entry of the function
and the other at the exit. Before the CPs are inserted, the execution path is
the one shown by the dashed line in figure 3. After the CPs are installed, two
additional executions are introduced, indicated by the solid line in figure 3.
Moreover, with a configurable CP-map, it is flexible to monitor different
functions. In this way, HDROP is able to cover most kernel execution paths. Of
course, there are other feasible ways to insert CPs into the kernel with the
same goal. For example, we can insert the CPs at the beginning and end of every
function, which we consider an alternative scheme.

4 Evaluation
4.1 Effectiveness
To validate the effectiveness of our solution, we constructed a ROP rootkit
guided by the approaches introduced by [3] and [7]. The rootkit wove together
six gadgets, each ending with a ret instruction. It did not pursue any
malicious end, and only transferred control from one gadget to the next to
increment one word in memory. We launched the rootkit in two ways. First, the
kernel stack was overwritten to redirect the kernel control flow to the
rootkit. Second, kernel control data was modified to hijack the kernel control
flow.
We performed two experiments to show the effectiveness of HDROP. In the first
test, we built a test module that listed the running processes in the kernel,
and customized the CP-map to insert two CPs at the entry and exit of the
monitored function, which scanned the task_struct list to enumerate the running
processes in the kernel. At training time, we captured the data for computing
the parameters of the detecting algorithm. After that, we launched the ROP
rootkit, and HDROP was able to detect the attack through the abnormal increase
in the number of mispredicted ret instructions.

Fig. 4. The experiment monitoring a function in the module

Figure 4 shows the result of our first experiment. Every point in figure 4 is a
two-tuple (missp_num, exec_num), where missp_num is the number of mispredicted
ret instructions and exec_num is the number of executed ret instructions. For
example, (1,28) means that normal execution incurred 1 mispredicted ret
instruction and 28 executed ret instructions. When the ROP attack was launched,
the number of mispredicted ret instructions increased. As an example, the point
(8,36), denoted as a square in the figure, was abnormal. In the first test, the
monitored execution path was simple, and HDROP easily detected the ROP rootkit.
In the second experiment, HDROP placed two CPs around an indirect call
instruction in the kernel function real_lookup. HDROP recorded the data both
when the ROP rootkit was loaded into the kernel, by modifying the destination
of the indirect call instruction, and when it was not. The data, including the
numbers of mispredicted ret instructions and executed ret instructions, were
again plotted as points in figure 5. As in figure 4, a legal point is denoted
as a triangle and an illegal point as a square in figure 5. Moreover, figure 5
shows only part of the data obtained by HDROP, for better presentation. As
shown in figure 5, these points were mixed together, and we failed to identify
a narrow region that holds only the legal points. This meant that HDROP failed
to detect the ROP rootkit with the deployed CPs.
Fig. 5. The experiment monitoring a kernel function

Fortunately, HDROP was still capable of detecting the ROP rootkit. As mentioned
earlier, HDROP can overcome this challenge by adjusting the CP-map to narrow
the length of the monitored instructions. Thus another CP was placed before the
first call instruction in the function that is the target of the monitored call
instruction. In normal execution, the number of mispredicted ret instructions
was usually zero. However, after launching the ROP rootkit, the number
increased abnormally, and the increase was captured by the newly inserted CP.
The above tests demonstrate the effectiveness of HDROP, and indicate the
feasibility of detecting ROP attacks with PMC. First, the rootkit is developed
following [3],[4] and [8]. Second, the monitored objects include a kernel
function and a module function. Finally, we launch the ROP rootkit by modifying
kernel data, which is a main way to subvert the kernel. In the future, we will
perform more experiments to show its effectiveness. Since there are existing
ROP shellcodes in user space, we might extend HDROP to work in user space in
order to check its effectiveness further.

4.2 Performance

The second set of experiments measures the performance overhead of HDROP. The
benchmark program was UnixBench version 4.1.0[17], the tested OS was Fedora 5
with a 2.6.15-1 Linux kernel, and the hardware platform was Intel X200. We ran
our tests as follows. First, UnixBench ran with default settings on the clean
kernel, and we recorded the final score of UnixBench. Second, UnixBench ran
again while HDROP was installed in the kernel, and we recorded the final score.
Finally, we computed the performance slowdown of HDROP.

Fig. 6. Performance overhead of HDROP with 3000 CPs in the kernel

Figure 6 shows the performance overhead of HDROP with 3000 CPs inserted into
the kernel. To make our result precise, we repeated the test 5 times and took
the average as the score. The performance overheads of eleven UnixBench tasks
are shown in figure 6. The task called file read incurred the maximal
performance overhead, about 38%. On the other hand, the runtime overhead of
Dhrystone and Whetstone was almost zero. On the final score of UnixBench, the
average slowdown of HDROP was about 19%.
How many CPs should be inserted into the kernel? In our opinion, the number is
no more than ten thousand, under one assumption: we suppose that the running
kernel holds only a few modules. To cover dynamically loaded modules, it is
inevitable to place more CPs in the kernel. Moreover, the OS often loads
different modules at different times, so it is difficult to accurately estimate
the number of CPs if we want to cover all loadable modules. To keep the
discussion clear, we optimistically suppose that the kernel loads only a few
modules, without introducing additional CPs.
Under the above assumption, we performed the following experiments to show that
several thousand CPs are able to cover the kernel. First, we measured the
number of retired instructions while HDROP was detecting a ROP attack. We based
this test on the first experiment discussed in the above subsection. We reset
the hardware counter and made it count the number of retired instructions. Note
that this data was obtained after HDROP had detected the ROP rootkit, because
our tested processor had only two hardware counters. At detecting time, the two
counters were busy counting BR_RET_MISSP_EXEC and BR_RET_EXEC. Therefore, the
number of retired instructions was obtained by repeating the test without
counting BR_RET_MISSP_EXEC. In the tests, we observed that HDROP monitored
about four hundred instructions with only two CPs. In other words, HDROP is
able to take hundreds of instructions as the basic monitoring unit because
BR_RET_MISSP_EXEC does not occur frequently.
To further validate the above idea, we performed another experiment that
monitored the execution of system calls. We placed two CPs around the
instruction call *sys_call_table to collect the numbers of retired instructions
and executed mispredicted ret instructions. In the test, we captured 35802
items. Every item can be denoted as {a,b}, where a is the number of executed
mispredicted ret instructions and b is the number of retired instructions.
According to the ratio of a to b, these items were divided into three
categories, as shown in figure 7. The first category, (a/b)*1000>5, held 2113
items. The second, (a/b)*1000<1, held 532 items. The remaining items form the
third category. The test indicated that there were about 2.9 mispredicted ret
instructions per one thousand retired instructions. This means that it is
possible to monitor several million instructions using just ten thousand CPs.
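The grouping can be sketched as follows. The category labels "high", "low" and "middle", the function name categorize and the sample items are made up for illustration, and the sketch assumes that b is never zero.

```python
def categorize(items):
    """Group {a, b} items by mispredicted rets per 1000 retired instructions."""
    counts = {"high": 0, "low": 0, "middle": 0}
    for a, b in items:
        rate = a / b * 1000  # assumes b > 0
        if rate > 5:
            counts["high"] += 1
        elif rate < 1:
            counts["low"] += 1
        else:
            counts["middle"] += 1
    return counts


# Made-up samples: (mispredicted rets, retired instructions).
items = [(3, 400), (0, 950), (2, 700)]
print(categorize(items))
# {'high': 1, 'low': 1, 'middle': 1}
```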

Fig. 7. Three categories divided according to the ratio of a to b, where a is
the number of executed mispredicted ret instructions and b is the number of
retired instructions

Finally, we performed another test to measure the performance overhead of HDROP
when different numbers of CPs are inserted into the kernel. In our opinion, two
factors strongly affect the performance overhead of HDROP. The first is the
number of active CPs in the kernel, and the second is the location of the
active CPs. To make our result precise, we randomly built CP-maps, which
determine the number and location of active CPs, and repeated the test with
different numbers of CPs in the kernel. The number of inserted CPs ranged from
500 to 6000. As shown in figure 8, when 1000 CPs were inserted into the kernel,
HDROP incurred a 7% slowdown. With 6000 active CPs in the kernel, HDROP
introduced a 31% performance overhead. Compared to existing
solutions[13,14,15], the performance overhead of HDROP is acceptable.

5 Discussion

HDROP relies on some security assumptions. First, the integrity of the kernel
code, including HDROP, is preserved. Otherwise, attackers could circumvent
HDROP by tampering with the code. Since security mechanisms for this purpose
are available[1,2], we believe this assumption is reasonable. Second, PMC is
protected from malicious modification. It is possible to tamper with the
hardware counters to forge the readings. However, attackers would have to do
that with specially crafted gadgets, which makes suitable gadgets more
difficult to find. Therefore, we optimistically suppose that PMC is immune to
ROP attacks. In this paper, we suppose that an adversary is capable of
modifying kernel data, including return addresses, function pointers and so on,
to launch ROP attacks.

Fig. 8. Performance overhead of HDROP with different CPs in the kernel

Meanwhile, HDROP has some limitations. First, HDROP may raise false alarms. The
parameters of the detecting algorithm are computed from the training data, so a
legal execution may be taken as a ROP attack if the training coverage is
incomplete. Second, we are not yet able to automatically generate a CP-map that
covers the whole kernel. We are also seeking a more appropriate detecting
algorithm. Finally, HDROP is not capable of detecting JOP. We believe it is
easy to improve HDROP by identifying new performance events of interest.
Overcoming the above limitations is our future work.
A novel contribution of our work is to demonstrate the feasibility of detecting
ROP attacks with the support of PMC. More specifically, we not only propose a
novel practical solution for checking ROP attacks, but also present a new usage
of PMC. Furthermore, unlike existing software-based solutions, HDROP takes
hundreds of instructions as a basic checking unit rather than a single one.
Thus, HDROP does not incur the heavy 2x to 5.3x slowdown of those solutions.

6 Related Work
In 2007, Hovav Shacham presented the ROP attack[3], which was later generalized
to a variety of platforms[10,12]. Meanwhile, there have been many efforts to
defeat ROP attacks. As mentioned earlier, they fall into two categories, which
we call gadget-less solutions and abnormity-detecting solutions. They are
closely related to our work, and we introduce them as follows.

6.1 Gadget-Less Solution


ROP attacks depend on gadgets crafted from the available code base. Therefore,
some solutions have been proposed to eliminate the gadgets hidden in the code
base. A compiler-based way to build a ret-less kernel is presented in [4]. The
authors systematically replace ret instructions with other instructions when
the kernel is recompiled. Thus, gadgets are hard to find in the patched kernel.
G-Free[5] proposes a novel two-step approach to build gadget-free software. The
first step is to eliminate all unintended ret and indirect jmp instructions by
padding with several nop instructions to align them. The second step is to
protect the aligned instructions from misuse. Compared to the ret-less
kernel[4], G-Free is a generic way to defeat ROP attacks.
Similar to G-Free, Control-Flow Locking[18] divides the protected instructions
into intended instructions and unintended instructions. Control-Flow Locking
prevents the misuse of unintended instructions by imposing alignment
artificially. To protect intended instructions, it proposes an interesting
method. More specifically, it performs a lock operation before a dynamic
control transfer, and an unlock operation if the current transfer is legal.
Though it allows one violation of the CFG, it is still capable of defeating ROP
attacks because a single deviation cannot achieve the malicious end.
In our opinion, the above work can be called gadget-less solutions. Their main
goal is to generate a gadget-less code base to eliminate ROP attacks. Besides
that, they present different ways to protect control transfers to further
defeat ROP attacks. This work is of considerable interest, but our work is
completely different, because we detect ROP attacks through the abnormity
introduced by the attacks.

6.2 Abnormity-Detecting Solution


ROP attacks have some unique features. For example, ROP attacks often chain
several gadgets ending with a ret instruction[3]. Moreover, every gadget is a
short instruction sequence, usually ranging from two to five instructions.
Therefore, DynIMA[14] records the number of instructions executed between two
ret instructions. If the sequence is short, DynIMA considers it a “hit”, and it
reports a ROP attack when hits occur consecutively. DynIMA is partly
implemented with the support of PIN[19].
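
The heuristic described above can be sketched as follows. This is a minimal
illustration rather than DynIMA's actual code: the function name, the trace
format (a sequence of instruction mnemonics), and both thresholds
(`max_gadget_len`, `min_consecutive_hits`) are assumptions made for the
example.

```python
from typing import Iterable


def detect_short_ret_chains(trace: Iterable[str],
                            max_gadget_len: int = 5,
                            min_consecutive_hits: int = 3) -> bool:
    """Return True if the trace contains a run of short ret-terminated sequences.

    trace: executed instruction mnemonics in order (hypothetical format).
    A "hit" is a ret that ends a sequence of at most max_gadget_len
    instructions, counting the ret itself.
    """
    length = 0  # instructions executed since the previous ret
    hits = 0    # length of the current run of consecutive hits
    for mnemonic in trace:
        length += 1
        if mnemonic == "ret":
            if length <= max_gadget_len:
                hits += 1
                if hits >= min_consecutive_hits:
                    return True
            else:
                hits = 0  # a long sequence breaks the run
            length = 0
    return False
```

With these assumed thresholds, a chain such as `["pop", "ret"] * 3` is
flagged, while sequences longer than `max_gadget_len` are not, which is
exactly the evasion route open to an adversary who lengthens the gadgets.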
DROP[15] is a binary instrumentation tool for detecting ROP attacks. It shares
the same observation as DynIMA[14]. However, DROP comes with a prototype
system, and experiments show its effectiveness. But it incurs a heavy
performance overhead because it has to check every instruction to record the
number of instructions between two ret instructions. Moreover, the adversary
may enlarge the length of the gadgets in order to bypass DROP.
ROPDefender[13] detects ROP attacks based on a side effect of their
execution. As mentioned earlier, some ROP attacks weave together gadgets that
end with a ret instruction. Thus, call and ret instructions are not paired.
Based on this observation, ROPDefender maintains a shadow stack to check every
ret instruction. More specifically, it monitors every executed instruction, and
stores a copy of each return address in the shadow stack to identify misused
ret instructions. Similar to DynIMA[14], ROPDefender is implemented on the
binary instrumentation framework PIN[19].
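
The shadow-stack idea can be illustrated with a short model. It is a sketch
under simplifying assumptions, not ROPDefender's implementation: the class and
method names are hypothetical, addresses are plain integers, and special cases
such as setjmp/longjmp or exceptions are ignored.

```python
from typing import List


class ShadowStack:
    """Keeps a private copy of return addresses to validate each ret (toy model)."""

    def __init__(self) -> None:
        self._addresses: List[int] = []

    def on_call(self, return_address: int) -> None:
        """Record the return address pushed by a call instruction."""
        self._addresses.append(return_address)

    def on_ret(self, target: int) -> bool:
        """Return True if the ret goes back to the most recent unmatched call."""
        if self._addresses and self._addresses[-1] == target:
            self._addresses.pop()
            return True
        return False  # unpaired or redirected ret: possible ROP gadget chain
```

A ret issued by a gadget has no matching call, so `on_ret` finds either an
empty shadow stack or a different address on top and reports a mismatch.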
kBouncer[21] presents an efficient ROP mitigation technique. It only monitors
the last part of the control transfers that lead to system call execution. In
this way, it avoids monitoring all control transfers, which may introduce a high
performance overhead. It is based on the observation that most ROP attacks
eventually perform a system call. Moreover, it considers that the control transfer on
a ret instruction is abnormal if it has no paired call instruction. Furthermore,
it is a hardware-based approach that relies on LBR (Last Branch Recording)[16].
However, not all ROP attacks perform a system call. Therefore, kBouncer can be
circumvented.
HDROP is built on a novel observation: ROP attacks induce an abnormal increase
in the number of mispredicted ret instructions. Moreover, HDROP does not
monitor each executed instruction, so it induces less performance overhead.
Compared to some existing approaches, the performance overhead of HDROP is
acceptable.
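
The kind of check this observation suggests can be sketched as a comparison of
two counter readings taken at the entry and exit of a monitored unit. The
sketch is illustrative only: the function names, the threshold value, and the
48-bit counter width are assumptions, and the readings are passed in as plain
integers because reading real PMCs requires privileged, processor-specific
code that is not shown here.

```python
def mispredicted_delta(start: int, end: int, width_bits: int = 48) -> int:
    """Difference between two raw counter readings, tolerating one wrap-around.

    width_bits is an assumed counter width; adjust it for the target processor.
    """
    return (end - start) % (1 << width_bits)


def unit_is_suspicious(start: int, end: int,
                       threshold: int = 3, width_bits: int = 48) -> bool:
    """Flag a monitored unit whose mispredicted-ret count exceeds the threshold.

    threshold is a hypothetical value; in practice it would be calibrated
    against benign runs of the monitored code.
    """
    return mispredicted_delta(start, end, width_bits) > threshold
```

Because only two readings per unit are needed, the cost of the check does not
grow with the number of instructions inside the unit.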

7 Conclusion

The HDROP we propose in this paper is a low-cost hardware-based approach to
detecting ROP attacks. The observation behind our approach is straightforward
and effective: ROP attacks lead to an increase in branch mispredictions. Unlike
previous detection approaches, we take one or several functions as the basic
monitoring unit, not every instruction. Furthermore, HDROP utilizes PMC to
collect the data of interest while monitoring a large body of instructions.
Consequently, it greatly reduces performance overhead. We have developed a
prototype system on Fedora 5. Experiments show that our approach can
effectively detect ROP attacks with an acceptable performance overhead.

Acknowledgment. The authors would like to thank the anonymous reviewers for
their insightful comments that helped improve the presentation of this paper.
The work is supported in part by the National Natural Science Foundation of
China (61070192, 91018008, 61170240, 61303074), National 863 High-Tech
Research Development Program of China (2007AA01Z414), National Science
and Technology Major Project of China (2012ZX01039-004), CNITSEC Program
(CNITSEC-KY-2012-001/4), and Natural Science Foundation of Beijing
(4122041).

References
1. Riley, R., Jiang, X., Xu, D.: Guest-Transparent Prevention of Kernel Rootkits
with VMM-based Memory Shadowing. In: Proceedings of the 11th International
Symposium on Recent Advances in Intrusion Detection (2008)
2. Seshadri, A., Luk, M., Qu, N., et al.: SecVisor: A Tiny Hypervisor to Provide
Lifetime Kernel Code Integrity for Commodity OSes. In: Proceedings of the 21st
ACM Symposium on Operating Systems Principles (October 2007)
3. Shacham, H.: The geometry of innocent flesh on the bone: Return-into-libc with-
out function calls (on the x86). In: Proceedings of the 14th ACM Conference on
Computer and Communications Security (2007)
4. Li, J., Wang, Z., Jiang, X., Grace, M., Bahram, S.: Defeating return-oriented rootk-
its with return-less kernels. In: Proceedings of the 5th ACM SIGOPS EuroSys
Conference (2010)
5. Onarlioglu, K., Bilge, L., Lanzi, A., et al.: G-free: Defeating return-oriented pro-
gramming through gadget-less binaries. In: Proceedings of the 26th ACSAC (2010)
6. Checkoway, S., Davi, L., Dmitrienko, A., et al.: Return-oriented programming with-
out returns. In: Proceedings of the 17th CCS (2010)
7. Bletsch, T., Jiang, X., Freeh, V.W., et al.: Jump-oriented programming: A new class
of code-reuse attack. In: Proceedings of the 6th ACM Symposium on Information,
Computer and Communications Security (2011)
8. Hund, R., Holz, T., Freiling, F.: Return-oriented rootkits: Bypassing kernel code in-
tegrity protection mechanisms. In: Proceedings of USENIX Security 2009. USENIX
(August 2009)
9. Chen, P., Xing, X., Mao, B., et al.: Automatic construction of jump-oriented pro-
gramming shellcode (on the x86). In: Proceedings of 6th ASIACCS (2011)
10. Buchanan, E., Roemer, R., Shacham, H., et al.: When good instructions go bad:
Generalizing return-oriented programming to RISC. In: Proceedings of the 15th
ACM Conference on Computer and Communications Security (2008)
11. Checkoway, S., Feldman, A.J., Kantor, B., et al.: Can DREs provide long-lasting
security? the case of return-oriented programming and the AVC Advantage. In:
Proceedings of EVT/WOTE (2009)
12. Kornau, T.: Return oriented programming for the arm architecture. Technical re-
port (2010)
13. Davi, L., Sadeghi, A.-R., Winandy, M.: ROPdefender: A detection tool to defend
against return-oriented programming attacks. Technical Report HGI-TR-2010-001
(2010)
14. Davi, L., Sadeghi, A.-R., Winandy, M.: Dynamic integrity measurement and
attestation: Towards defense against return-oriented programming attacks. In:
Proceedings of 4th STC (2009)
15. Chen, P., Xiao, H., Shen, X., Yin, X., Mao, B., Xie, L.: DROP: Detecting return-
oriented programming malicious code. In: Prakash, A., Sen Gupta, I. (eds.) ICISS
2009. LNCS, vol. 5905, pp. 163–177. Springer, Heidelberg (2009)
16. Intel: Intel 64 and IA-32 Architectures Software Developer's Manual,
Volume 3B: System Programming Guide, Part 2
17. UnixBench (2012), http://ftp.tux.org/pub/benchmarks/system/unixbench
18. Bletsch, T., Jiang, X.: Mitigating Code-Reuse Attacks with Control-Flow Locking.
In: Proceedings of the 27th Annual Computer Security Applications Conference,
ACSAC (2011)
19. Luk, C.-K., Cohn, R., Muth, R., et al.: Pin: Building customized program analysis
tools with dynamic instrumentation. In: Sarkar, V., Hall, M.W. (eds.) Proceedings
of PLDI (2005)
20. Nethercote, N., Seward, J.: Valgrind: A framework for heavyweight dynamic binary
instrumentation. SIGPLAN Not. 42(6), 89–100 (2007)
21. Pappas, V.: kBouncer: Efficient and transparent ROP mitigation. Technical report,
Columbia University (2012)
