Abstract

The volume and the sophistication of malware are continuously increasing and evolving. Automated dynamic malware analysis is a widely-adopted approach for detecting malicious software. However, many recent malware samples try to evade detection by identifying the presence of the analysis environment itself, and refraining from performing malicious actions. Because of the sophistication of the techniques used by the malware authors, so far the analysis and detection of evasive malware has been largely a manual process. One approach to automatic detection of these evasive malware samples is to execute the same sample in multiple analysis environments, and then compare its behaviors, in the assumption that a deviation in the behavior is evidence of an attempt to evade one or more analysis systems. For this reason, it is important to provide a reference system (often called bare-metal) in which the malware is analyzed without the use of any detectable component.

In this paper, we present BareCloud, an automated evasive malware detection system based on bare-metal dynamic malware analysis. Our bare-metal analysis system does not introduce any in-guest monitoring component into the malware execution platform. This makes our approach more transparent and robust against sophisticated evasion techniques. We compare the malware behavior observed in the bare-metal system with other popular malware analysis systems. We introduce a novel approach of hierarchical similarity-based malware behavior comparison to analyze the behavior of a sample in the various analysis systems. Our experiments show that our approach produces better evasion detection results compared to previous methods. BareCloud was able to automatically detect 5,835 evasive malware out of 110,005 recent samples.

1 Introduction

The malware threat landscape is continuously evolving. Early detection of these threats is a top priority for enterprises, governments, and end users. The widely-deployed signature-based and static-analysis-based detection approaches can be easily evaded by techniques commonly seen in the wild, such as obfuscation, polymorphism, and encryption. Therefore, dynamic malware analysis tools have recently become more popular to automate the analysis and detection of these threats [1, 14, 35]. These systems execute the suspicious sample in a controlled environment and observe its behavior to detect malicious intent. While this dynamic analysis approach is more effective against common static analysis evasion techniques, it faces a different set of challenges. More specifically, a malware sample, when executed, can detect the analysis environment and refuse to perform any malicious activity, for example by simply terminating or stalling the execution.

Malware authors have developed several ways to detect the presence of malware analysis systems. The most common approach is based on the fingerprinting of the runtime environment of the analysis system. This includes checking for specific artifacts, such as some specific registry keys, background processes, function hooks, or IP addresses that are specific to a known analysis tool. These artifacts must be known to the malware authors in advance to develop the corresponding fingerprinting techniques. Another approach leverages the fact that most of the analysis systems use emulated or virtualized environments as their malware execution platform. Such execution platforms can be detected by checking the platform-specific characteristics that are different with respect to a baseline environment (i.e., an unmodified operating system installed on real hardware, often referred to as a "bare-metal" installation). Such characteristics can be the timing properties of the execution, or a small variation in the CPU execution semantics [31, 32].
Public-facing malware analysis systems are particularly vulnerable to the first approach to fingerprinting. This is because an attacker can submit malware samples specifically designed to extract the malware analysis environment artifacts to be then used in fingerprinting the analysis system. Private malware analysis systems are less prone to this type of fingerprinting. However, because of the internal sharing of malware samples among these private and public analysis systems, private systems may also be vulnerable to such fingerprinting [37].

One way to prevent the fingerprinting of the analysis environment is to construct a malware analysis system indistinguishable from a real host. Such systems are also known as transparent analysis systems. One of the first transparent analysis systems, called Cobra [34], tries to achieve this by developing a stealthy analysis environment using binary translation. However, this approach can only prevent known fingerprinting techniques. Ether [14] is a more robust transparent analysis system that leverages hardware virtualization to maintain the CPU execution semantics of a hardware CPU. However, the system introduces significant performance overhead when performing fine-grained monitoring, which is required to produce a comprehensive malware behavioral profile. With such performance overhead, it is fundamentally infeasible to make it transparent, especially if the malware execution has access to an external timing source [20].

Instead of preventing the fingerprinting of the analysis system, some of the recent works have focused on detecting a deviation of the malware behavior in different analysis environments [9, 13, 23, 24, 28]. The approach is to execute a malware sample in different analysis environments and compare their behavioral profiles to find a deviation. A behavioral profile is a higher-level abstraction of the activities performed by a malware sample when executed. The assumption is that the presence of such deviations is evidence of an attempt to fingerprint and evade one or more analysis systems. This is a generic and robust approach because it can detect evasion regardless of the knowledge of the techniques used by the malware sample in order to fingerprint and evade the analysis system. This approach assumes that the malware shows its malicious behavior in one of the analysis systems, also known as the reference system. However, all previous approaches have used emulated or virtualized environments for observing the deviation in the malware behavior, and such environments are known to be detectable. If all of the analysis systems are evaded by a malware sample, no significant deviation may be present in the execution traces. Moreover, some of the analysis systems use in-guest modules for behavior extraction, which further compromises the transparency of the analysis system.

A malware analysis system that is indistinguishable from a real host is a system that uses an unmodified operating system installation that runs on actual hardware (i.e., a bare-metal system). However, this approach faces several fundamental challenges. One of the important challenges is to efficiently restore the analysis system after every analysis run. Recently, a bare-metal-based malware analysis system, called BareBox [25], proposed an efficient system-restore technique. In this technique, the physical memory of the host is partitioned and only one partition is used for the analysis environment, while another partition is used for a snapshot of the system to be restored. Whenever needed, an external operating system located outside the physical memory of the analysis environment performs the restoration of the physical-memory snapshot, without the need for a reboot. However, a sophisticated malware can forcefully probe the physical memory and detect the presence of the BareBox system. Another bare-metal-based malware analysis framework is Nvmtrace [5]. This system leverages IPMI (Intelligent Platform Management Interface) technology to automate the power cycle of the bare-metal analysis system. However, a complete reboot of the system is required after every analysis run. Another challenge to the bare-metal-based malware analysis system is the extraction of the behavioral profile. To this end, no process-level behavior, such as process creation, termination, and hooking activities, can be extracted from a bare-metal analysis system without introducing some form of an in-guest analysis component. However, the presence of such components inside the system violates the transparency requirement and makes the system detectable. Because of this limitation, the observable malware behavior on a pure bare-metal system is limited to the disk-level and network-level activities. When only the disk-level and network-level behaviors are available, it may not be possible to perform an in-depth behavioral analysis, but these types of activity can be effectively used for detecting evasive behavior.

In this paper, we present BareCloud, a system for automatically detecting evasive malware. BareCloud detects evasive malware by executing them on a bare-metal system and comparing their behavior when executed on other emulation- and virtualization-based analysis systems. Our bare-metal system has no in-guest monitoring component. This approach provides a robust transparent environment for our reference system where both user-mode and kernel-mode malware can be analyzed. BareCloud transparently extracts the behavioral profile of the malware from its disk-level and network-level activity. The disk-level activity is extracted by comparing the system's state after each malware execution with the initial clean state. Using the understanding of the operating system of the analysis host, BareCloud also extracts operating-system-level changes, such as changes to specific registry keys and system files. Network-level activities are captured on the wire as a stream of network packets. This approach extracts malware behavior only from the persistent changes to the system. In principle, a malware sample could perform its activities without causing any persistent change, or could revert any changes after the activities are carried out. However, to perform any substantially malicious activity, a malware has to depend on some persistent change to the system, or it has to interact with external services, such as a C&C server. Both types of activities are transparently observable in our system.

When comparing the behavior of a malware sample on multiple platforms, previous works have considered the behavioral profiles purely as sets or bags of elements drawn from a flat domain, and computed their similarity using traditional set-intersection-based methods [8, 13, 28]. Set-intersection-based measures may not accurately capture similarity when data is sparse or when there are known relationships between elements within the sets [19]. For example, if two behavioral profiles under comparison contain a large number of similar file activities, but only one profile exhibits some network activities, set-intersection-based similarity measures, such as Jaccard similarity, produce a high similarity score, and fail to properly capture the lack of similarity among network activities. One may compute the similarity of the file activities and the network activities separately. However, similar problems exist; for example, two profiles may contain a large number of similar DNS activities, but only one profile contains an HTTP request. It is important to identify such small-yet-important differences while comparing behavioral profiles for detecting evasions.

When manually comparing behavioral profiles, we start from generic questions such as "Do both profiles contain network and file activities?" If they do, we move on to other questions such as "Do these activities correspond to the same network or file objects?" This way of reasoning indicates that the behavioral profiles have an inherent similarity hierarchy based on the level of abstraction of the activities. Therefore, our similarity measure is based on the notion of the similarity hierarchy. Such hierarchy-based similarity can compute similarity at different levels of abstraction and identify activities that share similar characteristics even if they are not exactly the same. We show that this approach performs better than the set-intersection-based measure while comparing behavioral profiles for detecting evasive malware.

We compare the malware behavioral profile extracted from the bare-metal system with three major malware analysis platforms that are based on emulation and different types of virtualization, and we detect evasive behavior by detecting the deviation in the behavioral profile. Note that, besides evasion, there can be other factors that may cause a deviation in the behavioral profile. Section 4 describes how we mitigate those factors.

Our work makes the following contributions:

• We present BareCloud, a system for automatically detecting evasive malware. Our system performs malware analysis on a transparent bare-metal system with no in-guest monitoring component and on emulation-based and virtualization-based analysis systems.

• We introduce a novel evasion detection approach that leverages hierarchical similarity-based behavioral profile comparison. We show that this approach produces better results compared to the previous set-intersection-based approaches.

• We evaluate our system on a large dataset of recent real-world malware samples. BareCloud was able to detect 5,835 evasive malware instances out of 110,005 samples.

2 System Overview

The goal of our system is to automatically detect evasive malware by performing automated analysis of a large number of samples on a bare-metal reference system and other dynamic analysis systems. The goal is to identify deviations in the dynamic behavior of a sample when executed on different analysis environments. BareCloud achieves this by a multi-step process as depicted in Figure 1. The large volume of input samples is first pre-screened using the Anubis malware analysis framework [1]. The purpose of the pre-screening process is to select more interesting samples that are likely to have environment-sensitive behavior. These pre-screened samples are then executed on the cluster of bare-metal analysis hosts and on three other malware analysis systems, namely, Ether [14], Anubis [1], and Cuckoo Sandbox [2]. Each analysis system consists of multiple analysis hosts. The execution of the same sample in different systems is synchronized by the Scheduler component. Analysis hosts (workers) can independently join, perform analysis, and leave the BareCloud system. BareCloud extracts behavioral profiles from each of these analysis runs, and, in the next step, it processes these profiles to detect evasive behavior.

3 Monitoring Environments

In this section, we describe the four malware analysis environments we use for monitoring the behavior of malware samples.
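The set-intersection shortcoming discussed in the introduction can be made concrete with a toy comparison. The profiles below are hypothetical (twenty shared file activities, one network activity present in only one profile) and use a plain Jaccard index, not BareCloud's actual similarity measure:

```python
# Toy illustration of the set-intersection pitfall: two hypothetical profiles
# share many file activities, but only one contains network activity. Plain
# Jaccard similarity stays high and hides the network-level difference that
# a per-category (and, in BareCloud, hierarchical) comparison exposes.
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

files = {("file", f"create:f{i}") for i in range(20)}
profile_a = files | {("network", "dns:evil.example")}
profile_b = set(files)  # evaded run: no network activity observed

overall = jaccard(profile_a, profile_b)  # about 0.95: looks "similar"
by_type = {t: jaccard({e for e in profile_a if e[0] == t},
                      {e for e in profile_b if e[0] == t})
           for t in ("file", "network")}  # file: 1.0, network: 0.0
print(round(overall, 2), by_type)
```

Here the single flat score hides exactly the small-yet-important difference (the missing network activity) that evasion detection depends on.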
[Figure 1: Overview of the BareCloud workflow: incoming samples are pre-filtered, and a scheduler drives their synchronized execution on the analysis systems.]
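The synchronized-execution step of Figure 1 could be sketched as follows. The backend names and the analyze() stub are illustrative placeholders rather than BareCloud's actual interfaces, and real workers run on separate analysis hosts rather than in local threads:

```python
# Sketch of the synchronized-execution step: one sample is dispatched to
# every analysis backend at the same time, and one behavioral profile per
# backend is collected for later comparison. analyze() is a placeholder.
from concurrent.futures import ThreadPoolExecutor

BACKENDS = ["bare-metal", "ether", "anubis", "cuckoo"]

def analyze(backend, sample):
    # Placeholder: a real worker would execute the sample in the given
    # environment and return its extracted behavioral profile.
    return {"backend": backend, "profile": set()}

def synchronized_run(sample):
    # Launching all backends concurrently keeps external factors (e.g.,
    # C&C availability) as similar as possible across environments.
    with ThreadPoolExecutor(max_workers=len(BACKENDS)) as pool:
        futures = [pool.submit(analyze, b, sample) for b in BACKENDS]
        return [f.result() for f in futures]

profiles = synchronized_run("sample.exe")
print(len(profiles))  # one result per analysis backend
```

Running every environment at once is a design choice motivated in Section 4.1: it minimizes spurious deviations caused by time-varying external services.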
sis system is the malware initiator, which is the component that starts the execution of the malware. Usually, this component is implemented as some form of in-guest agent that waits for a malware sample through a network service. If the analysis system reboots after each analysis run, another approach can be to install a start-up entry in the system configuration that executes an application from a specific path. A malware sample can then be updated at this specific path for each analysis run by directly modifying the disk image when the analysis system is offline. However, precise control over the malware execution duration is difficult when using this approach, as the overall execution time includes the system reboot time, which can vary among multiple reboots.

For a bare-metal analysis system, making its malware initiator component transparent is very important. This is because the malware can simply check for the presence of this component to fingerprint the environment. To this end, our system uses the network-based approach. The malware initiator removes itself and all of its artifacts after initiating the malware. This network-based approach also makes the malware execution duration more accurate, as it does not account for the reboot time.

3.5 User Environment

Apart from the stock operating system, the environment installed inside the malware analysis systems includes some data and components that are usually present on a real host, such as saved credentials for common social networks, browser history, user document files, and other customizations. With this setup, we can observe additional malware behavior that we could not have observed using a bare user environment.

4 Behavior Comparison

In this section, we discuss malware behavioral deviation and behavioral profile extraction, and formalize behavioral profile comparison.

4.1 Behavior deviation

There are many factors that may cause a malware sample to show deviations in the dynamic behavior associated with different analysis environments. Hereinafter, we discuss each of these factors in detail.

• Evasive behavior of the malware sample: Deviation in the behavior may be the result of a successful fingerprinting of the analysis environment. This deviation is observable due to the change in the activities performed by the malware after the detection. This is the type of deviation we are interested in.

• Intrinsic non-determinism: A malware may have intrinsic non-determinism embedded in the code. That is, malware behavior might depend on some random value that it reads at the time of execution. For example, a malware sample may create a file with a random name. Randomization in the behavior can also result from the use of certain system services and APIs. For example, a successful call to URLDownloadToFile creates a random temporary folder to download the web content.

• Internal environment: A difference in the software environment of the different analysis systems may trigger different dynamic behaviors of the malware sample. For example, some malware may depend on a .NET framework installed in the analysis system, or may depend on the availability of a specific version of a system DLL. If one of the malware analysis environments does not contain such software components, the resulting malware behavior may be different.

• External environment: Another critical factor that may cause a deviation in the malware behavior is the external environment with which a malware sample can interact. In the context of malware execution, this external environment largely comprises different network services, such as DNS and C&C servers. The non-deterministic nature of such network services may introduce deviations in the dynamic behavior of a malware sample. One simple way to minimize this factor is to completely disable access to external network environments. However, the network activity of a malware sample is one of the most important aspects to characterize the behavior of the sample. Hence, a successful behavior comparison of a malware sample requires the inclusion of its network activities.

Since our goal is to identify behavior deviations caused by the evasive technique employed by the malware sample, we need to minimize the effect of the three other factors that may cause a behavior deviation.

One approach to identifying intrinsic non-determinism is to execute the same sample in the same environment multiple times. By comparing the execution traces from
these different execution runs, non-deterministic behavior can be identified. Previous work [28] used this approach to filter out randomized behavior. However, this approach is resource- and time-expensive. Moreover, not all malware exhibit such randomized behavior.

In this work, we propose a more efficient hierarchical similarity-based approach to behavior comparison, described in Section 4.4. This approach is able to minimize the effect of intrinsic randomization without requiring multiple execution runs of the same sample in the same analysis environment. In order to address deviation caused by different internal environments, we must provide identical software environments to all analysis systems. Therefore, we prepared identical base software environments for all of our analysis systems.

Precisely controlling the behavior deviation introduced by the external environment is difficult. This is because these factors are not under our direct control. However, failure to minimize the impact of these factors may result in erroneous behavior deviations. This consideration is important because most malware communicates with the external environment to carry out its malicious activities. To minimize the effect of the external environment, we implemented the following strategies.

• Synchronized execution: We execute the same malware sample in all analysis environments at the same time. The scheduler component facilitates the synchronization among different analysis hosts. By doing this, we minimize the behavior deviation that may be introduced by the variation of the external factors over time. For example, a malware may try to connect to a fast-flux network. The availability and the returned response of the C&C server and the DNS server may vary over time. If the malware is executed in different environments at different times, such variations in the external environment may result in a spurious behavior deviation. Synchronized execution mitigates such differences.

• Identical local network: Malware can interact with the local network by different network-related activities, such as probing available local network services and accessing file shares. We expose all analysis systems to identical simulated local network environments.

• Network service filters: One approach to minimize the non-determinism introduced by different network services is to actively intercept network communications and maintain identical responses to identical queries among all instances of a malware running in different analysis environments. This requires an application-level understanding of the network services. To this end, we intercept all DNS and SMTP communications and respond with consistent replies in all analysis systems. For example, the system responds with identical IP information to identical DNS queries coming from different analysis environments. With this setup, we are also able to sinkhole non-existent-domain and SMTP communications to the local simulated network. This helps us observe more network behavior of a malware sample, which otherwise may not be observable.

4.2 Behavioral profile

After the execution of a malware sample in different analysis environments, we need to extract its behavioral profile for comparison. Usually, the behavioral profile is extracted from some form of dynamic execution trace, such as a system-call trace. Bayer et al. have introduced a comprehensive method of extracting a behavioral profile from an augmented system-call trace. The additional information provides taint tracking of input and output parameters of system calls that provides dependency information between different system calls [10]. This approach has been used to cluster a large number of malware, and to compare malware behaviors [10, 28]. Similar approaches can be used in three of our analysis environments, where system-call traces are available. However, this system-call-based approach is not directly applicable to our bare-metal malware analysis system, as we do not have access to the system-call trace.

Transient and resultant behavioral profile

A transient behavioral profile is a profile that represents all of the operations performed by a malware sample during its execution. The system-call-based behavioral profile discussed previously is a type of transient behavioral profile. This represents a more comprehensive view of how a malware performs its malicious activities. The resultant behavioral profile consists of the cumulative changes made by the malware from the beginning to the end of its execution. This includes those operations that make persistent changes to the system. Multiple similar operations to the same object are combined and represented as one operation to reflect the resulting effect of the operations. This represents a more summarized view of what a malware does to the system. A malware can obfuscate its transient behavior to evade transient-behavior-based similarity detection. However, similar malicious activities produce similar resulting behavioral profiles, even if the transient behavior is obfuscated or randomized. This makes the comparison of malware behavior based on the resultant behavioral profile more robust.
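The reduction from a transient trace to a resultant profile can be illustrated with a toy example. The (object, operation) record format and the rule that a later delete cancels an earlier create are simplifying assumptions made for illustration, not the paper's exact combination rules:

```python
# Toy reduction of a transient operation trace to a resultant profile:
# repeated operations on an object collapse into one entry, and a file
# created and later deleted leaves no persistent change. The record format
# is an illustrative simplification of the profiles in Section 4.2.
def resultant_profile(transient_trace):
    profile = {}
    for obj, op in transient_trace:
        ops = profile.setdefault(obj, [])
        if op == "delete" and "create" in ops:
            del profile[obj]      # create + delete: no persistent effect
        elif op not in ops:
            ops.append(op)        # duplicates of the same operation collapse
    return {obj: tuple(ops) for obj, ops in profile.items()}

trace = [
    (("file", "C:/tmp/a.dat"), "create"),
    (("file", "C:/tmp/a.dat"), "write"),
    (("file", "C:/tmp/a.dat"), "write"),    # repeated, collapsed
    (("file", "C:/tmp/drop.exe"), "create"),
    (("file", "C:/tmp/a.dat"), "delete"),   # cancels the earlier create
]
print(resultant_profile(trace))  # {('file', 'C:/tmp/drop.exe'): ('create',)}
```

Note how reordering or repeating the transient operations would not change the resultant profile, which is what makes it harder to evade through obfuscation or randomization.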
The transparency requirement of our bare-metal analysis system limits us to the extraction of only the resulting behavioral profile. That is, the transient behaviors of process activities and filesystem activities are not available. However, we can extract the resulting filesystem behavior by comparing the disk contents from before and after the malware execution. Extraction of network behavior is straightforward using an external traffic capture component.

With these constraints in hand, we model our behavioral profile based on the model introduced by Bayer et al. [10], such that only the objects and the operations are used. That is, we take into consideration the object upon which a malware performs an operation that causes a persistent change to the object. Formally, a behavioral profile Π is defined as a 3-tuple:

Π = (O, R, P)

where O is the set of all objects, R is the set of all operations that cause persistent changes, and P ⊆ (O × R) is a relation assigning one or more operations to each object. Unlike in the model proposed in [10], where the objects and the operations are conceptualized as OS objects and OS operations, we generalize the objects and operations to any environment entity with which a malware can interact. More details on objects and operations are provided hereinafter.

Objects

An object represents an entity, such as a file or a network endpoint, upon which a malware can perform some operation. It is a tuple of type and name, formally defined as follows:

O = (obj_type, obj_name)
obj_type ::= file | registry | sysconf | mbr | network

The file type represents filesystem-specific file objects of the disk, the registry type represents registry keys, the sysconf type represents OS-specific system configurations, such as the boot configuration, mbr represents the OS-independent Master Boot Record, and the network type represents network entities, such as a DNS server.

Operations

An operation generalizes the actions performed by a malware sample upon the above-described objects. An operation is formally defined as:

R ::= (op_name, op_attribute)

That is, an operation has a name and a corresponding attribute to provide additional information. As mentioned previously, only those operations that cause a persistent change to the system are included. For example, in case of a file type object, only the creation, deletion, and modification operations are included in the profile.

4.3 Behavior extraction

Our bare-metal system can only access the raw disk contents. We extract the filesystem behavior by comparing the filesystem state before and after the execution of a malware sample. A detailed understanding of the filesystem internal structures is required to extract such information. We leverage the functionalities provided by the SleuthKit framework [6] for extracting the file meta-data information from the raw disk image. By doing this, we are able to extract all file names in the disk, including some recently deleted files, along with their corresponding meta-data, such as size and modification date. We first build two sets representing the file object meta-data: the clean set and the dirty set, corresponding to the disk content before and after a malware execution. Extracting the deletion and creation operations of a file object requires simple set operations. That is, any file not present in the dirty set is considered as deleted, and any file only present in the dirty set is considered as created. If a file is present in both sets with different meta-data, it is considered as modified. However, if a malware writes to a disk sector (other than the MBR) that is invisible to the filesystem, or modifies an existing file without changing the size and file-date meta-data, the current approach will not detect such changes. The straightforward way of comparing all file contents between two disk states can be very inefficient. This limitation can be mitigated by first detecting such changes in the disk sectors from copy-on-write data or iSCSI communication, and mapping the dirty sectors to files. A similar approach has been previously proposed [29]. To this end, we leave this improvement as future work.

Registry behavior is extracted using a similar approach. We extract the meta-data of all the registry keys from the raw registry hive (registry database file) using the registryfs filesystem extension of the SleuthKit framework. Again, we build two sets representing the registry meta-data corresponding to the registry hive content before and after the malware execution. We perform set operations similar to the case of the filesystem to extract malware operations on the registry objects.

To extract the behavior of type sysconf, we process the filesystem and registry behavior to identify critical modifications to the system configuration. Some examples of the system configuration locations are listed in Table 1. For the three other analysis systems, we process system-call traces to extract behavior information.
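The clean-set/dirty-set comparison described above amounts to simple set operations over file metadata. In the sketch below, the name-to-(size, mtime) layout is an illustrative stand-in for the records extracted with SleuthKit; as noted, a content change that leaves the metadata untouched would go undetected by this comparison:

```python
# Sketch of the disk-state diff from Section 4.3: compare file metadata
# from the clean (pre-execution) and dirty (post-execution) states.
# The name -> (size, mtime) layout is an illustrative simplification;
# metadata-preserving content changes are invisible to this comparison.
def diff_states(clean, dirty):
    created  = {f: ("create",) for f in dirty.keys() - clean.keys()}
    deleted  = {f: ("delete",) for f in clean.keys() - dirty.keys()}
    modified = {f: ("modify",)
                for f in clean.keys() & dirty.keys() if clean[f] != dirty[f]}
    return {**created, **deleted, **modified}

clean = {"C:/WINDOWS/WIN.INI": (120, "2013-01-01"),
         "C:/doc/a.txt":       (10,  "2013-01-01")}
dirty = {"C:/WINDOWS/WIN.INI": (150, "2013-06-01"),   # metadata changed
         "C:/drop.exe":        (4096, "2013-06-01")}  # new file

print(diff_states(clean, dirty))
```

The same three set operations apply unchanged to registry-key metadata extracted from the raw hives.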
Table 1: Examples of the configuration locations tions, which is the user’s home directory. Some system
obj type obj name System path
APIs that create temporary files also generate different
sysconf
sysconf
startup
startup
HKLM/Software/Microsoft/Windows/CurrentVersion/Run
HKCU/Software/Microsoft/Windows/CurrentVersion/Run
file paths, which are semantically similar. Many such
sysconf
sysconf
startup
boot
HKLM/System/CurrentControlSet/Services
%SYSTEMROOT%/BOOT.INI
temporary path names have known root locations and
sysconf
sysconf
autoexec
sysini
%SYSTEMROOT%/AUTOEXEC.BAT
%SYSTEMROOT%/WINDOWS/SYSTEM.INI
can be identified by their naming structure. We replace
sysconf winini %SYSTEMROOT%/WINDOWS/WIN.INI such occurrences in the object names with corresponding
generic tokens.
4.3 Behavior normalization

The behavioral profile extracted from the difference of the initial and final disk states contains both the malware behavior and the background operating system behavior. We need to filter out the features of the behavioral profile that do not correspond to the malware execution. One way to filter such features is to use the modification timestamps of the file objects. That is, by selecting only those files that are created or modified during the time when the malware is executed, one can filter out unrelated file modifications that occur before and after the malware execution. However, some unrelated filesystem changes caused by the base operating system might still be present in the filtered profile. Moreover, many malware samples actively modify the system time, or tamper with the file metadata to revert the file's modification time. Although the simple file-timestamp-based filter is efficient, it fails in such situations.

Another approach to filtering the background behavior is to first learn the behavioral profile of the base operating system, and then remove this behavior from the profile generated by a malware execution. By doing this, we can overcome many of the shortcomings of the timestamp-based approach. This approach may exclude some malware operations that match operations performed by the base system. However, it is difficult to perform malicious actions using only operations that are also performed by the base operating system, and such operations are less important in defining the malicious behavior of the malware. We use this approach to filter our profiles. To extract the background behavior of the analysis system, we wrote a "void" program that does nothing other than stall infinitely. For each analysis environment, we extract the behavioral profile of the "void" program from all of its analysis hosts and combine them to build a generalized background profile. We use this profile to filter the behavioral profile of a malware execution.

Some objects used to describe the profile may be referenced using multiple names. For example, \\?\C:\Documents and Settings and C:/DOCUME~1/ correspond to the same file object. We convert such identifiable object names to the same format. Different usernames may also result in different physical names for semantically similar file locations. For example, the locations C:/DOCUME~1/USERA and C:/DOCUME~1/USERB are semantically similar locations within the user's home directory. Some system APIs that create temporary files also generate different file paths that are semantically similar. Many such temporary path names have known root locations and can be identified by their naming structure. We replace such occurrences in the object names with corresponding generic tokens.

4.4 Behavior comparison

Previous works have compared persistent-change-based behavioral profiles using set-intersection-based methods over the feature set [13, 28]. However, when comparing behavioral profiles that only consider persistent changes, one can expect a sparse feature set. Furthermore, features within the profile are highly related and can be categorized in groups and subgroups. When the features are sparse, or when there are known relationships between features within the set, set-intersection-based measures may not accurately capture the similarity [19].

Unlike previous works, we use a hierarchical similarity measure to overcome this problem. The hierarchy is associated with the different abstraction levels present in the behavioral profile. This approach makes our similarity measure less sensitive to randomization introduced by non-determinism in malware code, because the randomization is usually introduced in only one level of the hierarchy while keeping the other levels identical. For example, a malware sample may randomize the filename (obj_name) it creates, but perform the same create operation (op_name) on a file object (obj_type) with the same operation attribute (op_attribute).

4.5 Hierarchical similarity

The notion of hierarchical similarity is often used in text similarity, in mining association rules, and in various computer vision tasks for finding similar shapes [16, 17, 21]. We use a similar notion of hierarchical similarity to compare behavioral profiles. The similarity hierarchy of the behavioral profile is represented in Figure 2. As one can see, knowledge of the semantics and of the relationships between the objects is encoded in the representation. The leaves of the tree are the actual feature elements of the behavioral profile. The first level of the similarity hierarchy is obj_type. An obj_type may have one or multiple obj_name, and each such obj_name can be associated with one or more op_name corresponding to various operations. Each such operation has one leaf node corresponding to the associated attribute of the operation. The leaf nodes are the feature elements whose attributes are represented by their parent nodes.
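The profile-to-hierarchy mapping described above can be sketched with a nested dictionary. The data layout and the sample profile are our own illustration, not the paper's code:

```python
# Map a behavioral profile onto the similarity hierarchy: each feature
# element is a tuple (obj_type, obj_name, op_name, op_attribute), and each
# nesting level of the dict is one level of the tree.
def build_hierarchy(profile):
    root = {}
    for obj_type, obj_name, op_name, op_attr in profile:
        (root.setdefault(obj_type, {})
             .setdefault(obj_name, {})
             .setdefault(op_name, set())
             .add(op_attr))
    return root

profile = [
    ('file', 'C:/<USERDIR>/a.exe', 'create', 'success'),
    ('file', 'C:/<USERDIR>/a.exe', 'write', '4096'),
    ('sysconf', 'HKLM/Software/Microsoft/Windows/CurrentVersion/Run',
     'create', 'a.exe'),
]
tree = build_hierarchy(profile)
print(sorted(tree))                                 # ['file', 'sysconf']
print(sorted(tree['file']['C:/<USERDIR>/a.exe']))   # ['create', 'write']
```

Each path from the root of this structure down to an attribute corresponds to one feature element of the profile.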
[Figure 2: the similarity hierarchy — a tree with the root at the top and the feature elements p1, p2, p3, p4 as leaves.]
For example, in Figure 2, the element p1 is a feature element having the feature attributes (obj_type1, obj_name1, op_name1, op_attrb1).

We compute the similarity in a two-step process. First, we identify the matching nodes in the hierarchies of two behavioral profiles. We do this iteratively, starting from the first level (obj_type). For each of these matching nodes, we identify the matching nodes among their child nodes, i.e., the next level of the hierarchy. We compute a similarity measure at each hierarchy level. Finally, we aggregate the level similarity measures to compute the overall similarity.

The model

Let H be a rooted tree representing the similarity hierarchy, where all nodes have associated labels. For example, Figure 2 is an instance of H. Let L_H be all labels in H, and L(H, d) be the set of labels of the nodes of H at depth d. Let δ be the height of the tree, such that L(H, δ) is the set of all labels of the leaves of H. The set of labels L(H, δ) represents the feature elements p such that p ∈ P, where P represents the behavioral profile. In the example of Figure 2, the leaf nodes p1, p2, p3, and p4 represent the feature elements of the behavioral profile. L(H, δ) and P are equivalent, one represented as the leaves of the tree structure, the other represented as tuples of feature attributes. With this, any P can be mapped into H. There is a hierarchy in L(H, δ), and hence in P, superimposed by H.

Let P1 and P2 be two behavioral profiles of a malware sample m from analysis systems a1 and a2, and let these behavioral profiles be mapped into hierarchies H1 and H2, instances of the hierarchical model H. Let PL(H, l) be the label of the parent node of a node with label l, where l ∈ L_H, the set of all labels in H. Here, we want to find nodes with matching labels at each depth d whose parent nodes also have matching labels. We recursively define match and candidate for each level d as:

match_{H1,H2}(d) = { l ∈ L(H1, d) ∩ L(H2, d) | PL(H1, l) = PL(H2, l)
                     and PL(H1, l) ∈ match_{H1,H2}(d − 1) }    (1)

candidate_{H1,H2}(d) = { l ∈ L(H1, d) ∪ L(H2, d) |
                         PL(H1, l) ∪ PL(H2, l) ∈ match_{H1,H2}(d − 1) }    (2)

where

match_{H1,H2}(0) = root.    (3)

We define levelsim_{H1,H2}(d), the similarity of H1 and H2 at level d, as the Jaccard similarity coefficient. That is,

levelsim_{H1,H2}(d) = |match_{H1,H2}(d)| / |candidate_{H1,H2}(d)|.    (4)

We define the overall hierarchical similarity between behavioral profiles P1 and P2 as the arithmetic average of the similarity at each level:

Sim(P1, P2) = (1/δ) ∑_{d=1}^{δ−1} levelsim_{H1,H2}(d).    (5)

This definition is consistent, since the right side of this equation always lies between 0 and 1. Hence, the behavior distance between P1 and P2 can simply be defined as:
Dist(P1, P2) = 1 − Sim(P1, P2).    (6)
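A compact way to sketch Equations 1–6 is to encode each node label as the path of attributes from the root, so that the parent label PL is simply the path minus its last element. This encoding and the toy profiles below are our own assumptions, not the authors' implementation:

```python
# Hierarchical similarity over behavioral profiles (Eqs. 1-6). A node label
# at depth d is encoded as the d-prefix of a feature tuple, so PL(l) = l[:-1].
DELTA = 4  # tree height: obj_type, obj_name, op_name, op_attribute

def labels(profile, d):
    """L(H, d): the labels of the nodes at depth d."""
    return {p[:d] for p in profile}

def hierarchical_sim(p1, p2, delta=DELTA):
    sims = []
    match_prev = {()}          # Eq. 3: match(0) = root (the empty path)
    for d in range(1, delta):  # Eq. 5 sums over levels 1 .. delta-1
        l1, l2 = labels(p1, d), labels(p2, d)
        # Eq. 1: shared labels whose (equal) parent labels matched at d-1.
        match = {l for l in l1 & l2 if l[:-1] in match_prev}
        # Eq. 2: labels in either profile whose parent matched at d-1.
        candidate = {l for l in l1 | l2 if l[:-1] in match_prev}
        # Eq. 4: Jaccard-style level similarity.
        sims.append(len(match) / len(candidate) if candidate else 0.0)
        match_prev = match
    return sum(sims) / delta   # Eq. 5

def dist(p1, p2):
    return 1.0 - hierarchical_sim(p1, p2)  # Eq. 6

# A randomized file name changes obj_name, but the obj_type level still
# matches, so the profiles remain partially similar:
p1 = {('file', 'C:/<USERDIR>/abc123.exe', 'create', 'success')}
p2 = {('file', 'C:/<USERDIR>/zzz987.exe', 'create', 'success')}
print(hierarchical_sim(p1, p2))  # 0.25
```

With a set-intersection measure over whole feature tuples, these two profiles would have similarity 0; the hierarchical measure preserves the match at the obj_type level.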
This is possible because Sim(P1, P2) is derived from the Jaccard similarity coefficients.

Finally, we define the behavior deviation score of a malware sample D among different analysis systems a1, ..., an, with respect to the behavioral profile Pr extracted from the reference system ar, as the quadratic mean of the behavior distances:

Deviation(D) = √( (1/n) ∑_{i=1}^{n} Dist(Pr, Pi)² ),    (7)

where n is the number of analysis systems, and Pi is the behavioral profile extracted from the analysis system ai. This deviation score is in the interval [0, 1], where a value of 0 means no deviation and a value of 1 means maximum deviation. We define a deviation threshold t: if the deviation score exceeds this value, we consider the sample evasive.

[Figure 3: Precision-Recall analysis of the hierarchical similarity-based and the Jaccard similarity-based behavior comparison.]
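The deviation score of Equation 7 and the threshold test can be sketched as follows; the distance values are made up for illustration, and t = 0.84 is the deviation threshold mentioned in the evaluation:

```python
from math import sqrt

# Deviation score (Eq. 7): the quadratic mean of the behavior distances
# between the reference profile and each analysis system's profile.
def deviation(distances):
    n = len(distances)
    return sqrt(sum(d ** 2 for d in distances) / n)

# Hypothetical distances of one sample from the reference system on three
# analysis systems (e.g., Anubis, Ether, VirtualBox):
score = deviation([0.3, 0.4, 0.5])
print(round(score, 4))   # 0.4082
evasive = score > 0.84   # a deviation above the threshold t flags the sample
```

Because the quadratic mean weights large distances more heavily than the arithmetic mean, a strong behavioral deviation on even one analysis system pushes the score up noticeably.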
…events) and network activity in the Anubis analysis environment. We include this group to see if similar behavior can be observed in all other analysis environments.

[Figure: recall of evasion detection at different deviation thresholds; the selected threshold is t = 0.84.]

…system activity in the Anubis analysis environment.
Table 2: Evasion detection on different environments

Environment   Detection count   Percentage
Anubis        4,947             84.78
Ether         4,562             78.18
VirtualBox    3,576             61.28
All           2,530             43.35
Total         5,835

6 Limitations

In this section, we discuss the limitations of our approach.

One of the limitations inherent to all dynamic analysis systems is the inability to handle stalling code in a general way. A malware sample can wait for some time (e.g., tens of minutes) before performing any malicious activity. Kolbitsch et al. have proposed a technique to detect and mitigate malware stalling code [26]. However, the mitigation technique is only possible where fine-grained analysis is available, so it is not applicable in our bare-metal system. A more sophisticated stalling attack may encrypt the malicious payload with some weak encryption key, which the malware will brute-force during the execution. It is difficult to expedite such a process to reveal the malicious activities within a limited execution time. To this end, BareCloud will not be able to observe activity in malware with stalling code.

Malware may employ heuristic checks for anomalies in the internal or external environments that are specific to malware analysis systems. For example, it can wait for real user activity, or try to resolve a known non-existent domain using a public DNS server. If there is no user activity in the system, or if every non-existent domain is resolved, it is likely that the system is an analysis environment. One mitigation technique is to remove external anomalies by not actively intercepting and modifying the network communication. User activities can be injected into the system by using automation techniques. However, transparently performing such automation in a bare-metal system is challenging. To this end, such heuristic checks will be able to detect our bare-metal analysis system.

Although our similarity measure is less sensitive to randomization introduced by the non-deterministic code of a malware sample, high levels of randomization in the persistent changes will result in an erroneous detection of evasion. Other mitigation techniques, such as the multiple execution of the malware on the same analysis system, can be applied to mitigate this problem at the expense of more computational resources.

Both known and unknown fingerprinting techniques focused on detecting virtualized or emulated platforms will fail to detect BareCloud, because we are executing malware on bare-metal hardware. However, there is the possibility that our system can be fingerprinted by examining unique software/hardware environment features, such as the MAC address of the network device or the presence of the iSCSI driver. In the case of emulated/virtualized environments, it is trivial to randomize such information for each malware analysis run. Since our system uses real hardware, introducing this randomization while preserving transparency is difficult. The iSCSI driver detection can be mitigated by using a more expensive hardware iSCSI initiator instead of a software iSCSI initiator; a hardware iSCSI initiator is a host bus adapter (HBA) that appears to the OS as a hardware storage device. To this end, our system runs as a private malware analysis framework, and all outside network communications are blocked. Limited access to the Internet is provided through a proxy. As long as the unique environment variables are not leaked to the malware authors, the system can be kept undetectable. However, a dedicated attacker may detect any dynamic analysis system that allows external network communications, like ours, by using active reconnaissance-based attacks [37]. Malware writers can upload decoy samples to public malware analysis systems so that they are eventually picked up by private analysis systems, such as ours. Such samples can leak unique environment artifacts of these analysis systems using a "phoning home" technique, and these artifacts can then be used for active reconnaissance.

7 Related works

7.1 Dynamic analysis

Researchers have developed many dynamic analysis tools to analyze malware. These tools mostly focus on extracting system call or Windows API call traces. Many of these analysis systems are based on sandboxing techniques [1, 4, 14, 35]. A sandbox is an instrumented execution environment that executes the malware sample in a contained way. Some of these sandboxes leverage in-guest techniques to trace Windows API calls, such as CWSandbox [35] and Norman Sandbox [4]. Other sandbox systems are implemented using emulation or virtualization technologies. VMScope [22], TTAnalyze [11], and Panorama [36] are some examples of emulation-based malware analysis systems. All of them are based on Qemu [12] and implement whole-system emulation. Other tools, such as Ether [14] and HyperDBG [15], are based on hardware-supported virtualization technology. While most systems deal with user-land malware samples, some analysis systems are specifically targeted at analyzing kernel-mode malware [27, 30].
7.2 Transparent analysis

Many transparent malware analysis systems have been proposed to defeat evasive malware. Cobra [34] was the first analysis system specifically focused on defeating anti-debugging techniques. However, Cobra runs its tool at the same privilege level as the malware; in principle, this approach makes it impossible to provide absolute transparency.

Many of the malware analysis tools based on the out-of-VM approach are designed to provide better transparency [1, 14, 22], as the analysis system is completely external to the execution environment. However, detection techniques have been developed to detect these analysis systems as well. There are several techniques to detect VMWare [9, 18, 33], as well as Bochs and Qemu [9, 18, 31]. Pek et al. [32] have shown that the hardware virtualization-based Ether [14] can be detected using local timing attacks.

The most effective way to provide transparency is to run on real hardware, with an environment that has not been extended with analysis artifacts. BareBox [25] and Nvmtrace [5] both provide bare-metal environments for malware analysis.

[…]ent bare-metal environment with no monitoring component present in the hardware. Moreover, we propose an improved behavior comparison technique that captures the inherent similarity hierarchy of the behavior features, and does not require the resource-expensive execution of the same sample multiple times in the same analysis environment.

7.4 Hierarchical Similarity

Hierarchies are used to encode domain knowledge about different levels of abstraction in the type of events observed. They have been used in different fields of similarity detection, such as finding text similarity [16], detecting association rules using hierarchies of concepts [21], and finding similarity among deformable shapes [17]. Ganesan et al. [19] proposed a similarity measure that incorporates hierarchical domain structure. However, its similarity computation is focused on element-level similarity rather than profile-level similarity, and it uses a modified version of the cosine-similarity measure.

8 Conclusions
References

[1] Anubis. http://anubis.iseclab.org.

[2] Cuckoo Sandbox. http://www.cuckoosandbox.org.

[3] Intel Virtualization Technology. http://intel.com/technology/virtualization.

[4] Norman Sandbox. http://www.norman.com/.

[5] Nvmtrace. http://code.google.com/p/nvmtrace.

[6] SleuthKit. http://www.sleuthkit.org.

[7] VirtualBox. http://www.virtualbox.org.

[8] Bailey, M., Oberheide, J., Andersen, J., Mao, Z. M., Jahanian, F., and Nazario, J. Automated Classification and Analysis of Internet Malware. In Symposium on Recent Advances in Intrusion Detection (RAID) (2007).

[9] Balzarotti, D., Cova, M., Karlberger, C., Kirda, E., Kruegel, C., and Vigna, G. Efficient Detection of Split Personalities in Malware. In Symposium on Network and Distributed System Security (NDSS) (February 2010).

[10] Bayer, U., Comparetti, P. M., Hlauschek, C., Kruegel, C., and Kirda, E. Scalable, Behavior-Based Malware Clustering. In Symposium on Network and Distributed System Security (NDSS) (2009).

[11] Bayer, U., Kruegel, C., and Kirda, E. TTAnalyze: A Tool for Analyzing Malware. European Institute for Computer Antivirus Research (EICAR) (2006).

[12] Bellard, F. QEMU, a Fast and Portable Dynamic Translator. In USENIX Annual Technical Conference, FREENIX Track (2005).

[13] Chen, X., Andersen, J., Mao, Z. M., Bailey, M., and Nazario, J. Towards an Understanding of Anti-virtualization and Anti-debugging Behavior in Modern Malware. In IEEE Conference on Dependable Systems and Networks With FTCS and DCC (2008), IEEE.

[14] Dinaburg, A., Royal, P., Sharif, M., and Lee, W. Ether: Malware Analysis via Hardware Virtualization Extensions. In ACM Conference on Computer and Communications Security (CCS) (2008).

[15] Fattori, A., Paleari, R., Martignoni, L., and Monga, M. Dynamic and Transparent Analysis of Commodity Production Systems. In IEEE/ACM International Conference on Automated Software Engineering (ASE) (2010), ACM.

[16] Feldman, R., and Dagan, I. Knowledge Discovery in Textual Databases (KDT). In Conference on Knowledge Discovery and Data Mining (KDD) (1995).

[17] Felzenszwalb, P. F., and Schwartz, J. D. Hierarchical Matching of Deformable Shapes. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2007), IEEE.

[18] Ferrie, P. Attacks on more virtual machine emulators. Symantec Technology Exchange (2007).

[19] Ganesan, P., Garcia-Molina, H., and Widom, J. Exploiting Hierarchical Domain Structure to Compute Similarity. ACM Transactions on Information Systems (TOIS) (2003).

[20] Garfinkel, T., Adams, K., Warfield, A., and Franklin, J. Compatibility Is Not Transparency: VMM Detection Myths and Realities. In USENIX Workshop on Hot Topics in Operating Systems (HotOS) (2007).

[21] Han, J., and Fu, Y. Discovery of Multiple-level Association Rules from Large Databases. In Conference on Very Large Data Bases (VLDB) (1995).

[22] Jiang, X., and Wang, X. Out-of-the-Box Monitoring of VM-Based High-Interaction Honeypots. In Symposium on Recent Advances in Intrusion Detection (RAID) (2007).

[23] Johnson, N. M., Caballero, J., Chen, K. Z., McCamant, S., Poosankam, P., Reynaud, D., and Song, D. Differential Slicing: Identifying Causal Execution Differences for Security Applications. In IEEE Symposium on Security and Privacy (2011).

[24] Kang, M., Yin, H., and Hanna, S. Emulating Emulation-resistant Malware. In ACM Workshop on Virtual Machine Security (2009).

[25] Kirat, D., Vigna, G., and Kruegel, C. BareBox: Efficient Malware Analysis on Bare-Metal. In Annual Computer Security Applications Conference (ACSAC) (2011).

[26] Kolbitsch, C., Kirda, E., and Kruegel, C. The Power of Procrastination: Detection and Mitigation of Execution-Stalling Malicious Code. In ACM Conference on Computer and Communications Security (CCS) (2011).

[27] Lanzi, A., Sharif, M., and Lee, W. K-Tracer: A System for Extracting Kernel Malware Behavior. In Symposium on Network and Distributed System Security (NDSS) (2009).

[37] Yoshioka, K., Hosobuchi, Y., Orii, T., and Matsumoto, T. Your Sandbox is Blinded: Impact of Decoy Injection to Public Malware Analysis Systems. Journal of Information Processing (2011).