Automated Static Unpacking Using Speculative Decompression

Published by: Silvio Cesare on Jan 22, 2012
Copyright:Attribution Non-commercial

Automated Static Unpacking Using Speculative Decompression
Silvio Cesare
School of Information Technology, Deakin University, Burwood, Victoria 3125, Australia
<silvio.cesare@gmail.com>
ABSTRACT
Malware is a significant problem on the internet. Automated and manual analysis of malware is important in detection and remediation. However, malware authors understand this process and try to hinder static analysis by introducing a malware transformation that hides their code and intent. This process is known as malware packing and must be reversed before an analyst or automated system can understand the intent of the malicious software. Automated unpacking attempts to solve this problem on a large scale and has been partly successful, but there is still much to be done. In this work we propose a system for automatically and statically unpacking some forms of packed code. We identify the compression algorithm used to pack the malware and then decompress the high entropy, compressed binary blob within the sample. This is effective for a small minority of malware samples in the wild.
Keywords
 Malware, code packing, automated unpacking.
1. INTRODUCTION
The presence of malicious software is a problem that plagues internet and network connectivity. Malicious software, also known as malware, consists of programs characterised by their malicious intent. They are hostile, intrusive or annoying software programs. Examples of malware include trojan horses, worms, backdoors, dialers and spyware. Malware is a problem that is increasing at a significant rate. According to the Symantec Internet Threat Report [1], 499,811 new malware samples were received in the second half of 2007. F-Secure additionally reported, “As much malware [was] produced in 2007 as in the previous 20 years altogether” [2].

The modern purpose of malware is that of criminal enterprise for financial gain [3]. In 2008, “78 percent of confidential information threats exported user data” [3]. The stealing of banking information using malware known as spyware [4], which covertly logs and relays such private information, is a common example of modern malware.

The malware problem continues when malicious software remains undetected by users. The creation of criminal networks employing unauthorised use of users' computers is an example of a malicious botnet [5]. Botnets are illegally leased to criminal networks in order to create email spamming networks, and to extort money from commercial entities using the threat of distributed denial of service attacks. A user's inability to prevent or detect malware often makes them liable to become an additional node in the botnet's zombie network.

Detection of malicious software provides much benefit in fighting the threat that malware poses to users' security. Detecting malware before it is allowed to execute its intent allows such software to be effectively disabled. To identify a program as being malicious or benign, automated analysis is required. The analysis can employ either a static or dynamic approach. In the dynamic approach, the malware is executed, possibly in a sandbox, and its runtime behaviour examined. In the purely static approach, the malware is never executed.

Traditional Antivirus solutions to secure systems against malware have focused on static detection using static string signatures. Dynamic approaches [6], while having some benefits compared to static detection, also have disadvantages. The dynamic approach requires an execution environment in which to run, mandating that a virtual machine or sandbox is available. For cross platform systems, this may be an ineffective environment in which to operate. If a virtual environment is not provided, execution of the malware on the host is required, which may allow malware to execute its intent before being detected. Additionally, a dynamic analysis may fail to identify malicious software if the malicious behaviour is not triggered during the analysis.

Malware authors attempt to evade detection by creating polymorphic variants of their malware which are not detected by anti-malware systems. Polymorphism describes related, but different, instances of malware sharing a common history of code. Code sharing among variants can have many sources, whether derived from autonomously self mutating malware, or manually copied by the malware creator to reuse previously authored code.

Related to polymorphic malware is packed malware. Code packing is an obfuscation technique used to hide a malware's real content. A code packing tool is applied to a malware instance, as a post-processing binary rewriting stage, to produce a new packed version of the malware. It is often used to make manual analysis and automated analysis of the malware more difficult. Code packing is also used to evade signature detection by Antivirus software through the creation of malware variants which have no associated Antivirus signature.
1.1 Traditional Code Packing
The most common method of code packing transforms executable code into data and is applied as a post-processing stage in the malware development cycle, as shown in Figure 1. This transformation may perform compression or encryption, hindering an analyst's understanding of the malware using static analysis. At runtime, the data, or hidden code, is restored to its original executable form through dynamic code generation using an associated restoration routine [7]. Execution then resumes as normal at the original entry point. The original entry point marks the entry point of the original malware, before the code packing transformation is applied. Once the restoration routine is complete and control is transferred to the original entry point, the malware executes as if code packing and restoration had never been performed. A malware may have the code packing transformation applied more than once. After the restoration routine of one packing transformation has been applied, control may transfer to another packed layer. The original entry point is derived from the last such layer.
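The packing and restoration steps described above can be illustrated with a minimal Python sketch. It assumes that zlib compression stands in for the packing transformation f and zlib decompression for the restoration routine g. The function names and the placeholder byte string are hypothetical, and a real packer rewrites an executable rather than a plain byte string.

```python
import zlib


def pack(original_code: bytes) -> bytes:
    """Hidden code = f(original code). Here f is zlib compression (an assumption)."""
    return zlib.compress(original_code, 9)


def restore(hidden_code: bytes) -> bytes:
    """Original code = g(hidden code). Stands in for the restoration routine."""
    return zlib.decompress(hidden_code)


def pack_layers(original_code: bytes, layers: int) -> bytes:
    """Apply the packing transformation several times, one layer per pass."""
    data = original_code
    for _ in range(layers):
        data = pack(data)
    return data


def restore_layers(hidden_code: bytes, layers: int) -> bytes:
    """Undo each layer in turn; the last layer yields the original code."""
    data = hidden_code
    for _ in range(layers):
        data = restore(data)
    return data


# Placeholder bytes standing in for machine code (hypothetical sample).
original = b"\x55\x8b\xec\x83\xec\x10" * 100
hidden = pack_layers(original, 2)
recovered = restore_layers(hidden, 2)
```

In this sketch the packed bytes bear no visible resemblance to the original, which is what hinders signature matching, while restoring every layer in order recovers the input exactly.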
1.2 Innovation
Our work makes the following contributions:
• We propose a method for automatically unpacking malware using static decompression.
• We implement the system.
1.3 Structure of the Paper
The structure of the paper is as follows: Section 2 examines related work on malware unpacking. Section 3 outlines our proposed approach to unpacking. Section 4 discusses our system's design and implementation. Finally, Section 5 concludes the paper.
2. RELATED WORK
Automated unpacking employing whole system emulation was proposed in Renovo [7] and Pandora's Bochs [8]. Whole system emulation has been demonstrated to provide effective results against unknown malware samples, yet is not completely resistant to novel attacks. Renovo and Pandora's Bochs both detect execution of dynamically generated code to determine when unpacking is complete and the hidden code is revealed. An alternative algorithm for detecting when unpacking is complete was proposed using execution histograms in Hump-and-dump [9]. Hump-and-dump was proposed as potentially desirable for integration into an emulator. Polyunpack [10] proposed a combination of static and dynamic analysis to dynamically detect code at runtime which cannot be identified during an initial static analysis.

Dynamic Binary Instrumentation was proposed as an alternative to using an instrumented emulator [11] as employed by Renovo and Pandora's Bochs. Omnipack [12] and Saffron [11] proposed automated unpacking using native execution and hardware based memory protection features. This results in high performance in comparison to emulation based unpacking. The disadvantage of these approaches arises when the unpacking system is used on email gateways, which forces the provision of a virtual or emulated sandbox in which to run. A virtual machine approach to unpacking using x86 hardware extensions was proposed in Ether [13]. A drawback of such a virtual machine, and equally of a whole system emulator, is the requirement to install a license for each guest operating system. This restricts adoption on desktops, which typically have a single license. Virtual machines are also inhibited by slow start-up times, which again are problematic for desktop use. The use of a virtual machine also prevents the system being cross platform, as the guest and host CPUs must be the same.
3. OUR APPROACH
Our approach to unpacking traditionally packed malware is to speculate on which compression algorithm is used to pack the contents. We speculate based on identifying a fingerprint of the compression code. We then attempt to decompress the packed blob, which we identify by its high entropy. This process reveals the malware's hidden contents. Our work is limited to code packing using compression. Decryption and the recovery of keys are left for future work.
4. SYSTEM DESIGN AND IMPLEMENTATION
The system is implemented within our previously built system for malware unpacking and classification. We envision a deployment using an approach similar to Figure 2.
4.1 Packer Detection and Classification
Figure 1. Traditional code packing. (Diagram labels: Original Executable with Original Code; Packing; Packed Executable with Restoration Routine and Hidden Code = f(Original Code); Runtime; Memory Image at Runtime with Remnant Restoration Routine and Original Code = g(Hidden Code).)
In our system we first determine whether malware is packed using entropy analysis. In the next stage we have experimented with packer classification. Packer classification is an approach that can be used to identify the type of packer that is used and then deploy packer specific unpacking strategies. The concept is that some classes of packers can be unpacked using simpler and less computationally complex approaches. We apply malware classification techniques to the low entropy program code and data in the binary to determine the packer type. For packers that we don't handle, we can decide to immediately mark the potential malware as suspicious.
4.2 Application Level Emulation
Although application level emulation is foiled by anti-emulation techniques used in malware, it can still successfully unpack a large number of malware samples. Application level emulation is sometimes easier to develop and more flexible than manually written static unpackers. It can perform in real-time.
4.3 Static Unpacking Based on Packing Algorithm
One novel approach that we have experimented with avoids the problems of undecidable code analysis or imperfect code emulation. Our methodology is to use string matching to detect whether an off-the-shelf compression algorithm is used and, if so, which one, and then use standard decompression tools on the identified blob of packed content. The advantage of this approach is that the system is not prone to anti-emulation or anti static analysis techniques. With widely used open source packing tools, it is quite common for the tool to be modified by malware authors who add anti-analysis and anti-emulation code fragments. Our system can easily unpack these types of malware.
4.4 Detecting the Packing Algorithm
We detect packing algorithms by constants that are present in the code. Many compression algorithms make heavy use of constants in tables or magic bytes for headers, and these can be identified using string matching. Antivirus software is already very good at string matching, and our approach employs the same types of algorithms.
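A minimal Python sketch of constant-based detection follows. The signature set is an illustrative assumption and is not the set used by the system described here: it holds the well-known gzip, bzip2 and xz stream header bytes plus the first four entries of the standard CRC-32 lookup table as stored in little-endian order. The names SIGNATURES and find_signatures are hypothetical.

```python
import struct
from typing import Dict, List

# Illustrative signatures only (an assumption, not the paper's signature set).
SIGNATURES: Dict[str, bytes] = {
    "gzip stream header": b"\x1f\x8b\x08",
    "bzip2 stream header": b"BZh",
    "xz stream header": b"\xfd7zXZ\x00",
    # First four entries of the standard CRC-32 table, little-endian.
    "CRC-32 lookup table": struct.pack(
        "<4I", 0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA
    ),
}


def find_signatures(image: bytes) -> Dict[str, List[int]]:
    """Return every offset at which each known constant occurs in the image."""
    hits: Dict[str, List[int]] = {}
    for name, pattern in SIGNATURES.items():
        offsets = []
        pos = image.find(pattern)
        while pos != -1:
            offsets.append(pos)
            pos = image.find(pattern, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits


# Hypothetical image: filler bytes with two embedded header constants.
sample = b"\x00" * 10 + b"BZh" + b"\x00" * 5 + b"\x1f\x8b\x08"
found = find_signatures(sample)
```

Short patterns such as three-byte headers will match by chance in large binaries, so a hit is best treated as a hint about which decompressor to try rather than as proof.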
4.5 Detecting the Packed Blob
To detect the packed blob, we make use of entropy analysis. By calculating entropy over a sliding window across the binary image, we can clearly identify the region of the image that is compressed. However, the precision is limited to the window size. We cannot determine the precise beginning of the packed content, but have a good idea that the packed content begins in a particular window.
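The window-based entropy scan can be sketched in Python as below. This is a minimal sketch under stated assumptions: entropy is Shannon entropy over byte values (0 to 8 bits per byte), and the window size, step and threshold are hypothetical parameters that would need tuning on real samples.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def high_entropy_windows(
    image: bytes, window: int = 1024, step: int = 1024, threshold: float = 7.0
) -> List[int]:
    """Return the start offsets of windows whose entropy meets the threshold."""
    return [
        start
        for start in range(0, len(image), step)
        if shannon_entropy(image[start:start + window]) >= threshold
    ]


# Hypothetical image: low entropy padding around a uniformly distributed region.
uniform = bytes(range(256)) * 16          # 4096 bytes, entropy 8.0 per window
image = bytes(2048) + uniform + bytes(2048)
windows = high_entropy_windows(image)
first_packed_window = windows[0] if windows else None
```

The first offset returned marks the window in which the packed content is assumed to begin, which matches the precision limit noted above: the true start lies somewhere inside that window.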
4.6 Static Unpacking of Compressed Blobs
Given a window that the packed blob begins in, we attempt to decompress the data by brute force unpacking at every offset within that window. If we have chosen the correct offset, then the decompression or decryption routine will succeed and give us a significantly sized resultant image. If we have chosen the wrong offset, then the decompression should abort since the image will be detected as corrupt. At this point, we have unpacked the malware and can pass the result to a classification tool, or to a human analyst for further investigation.
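The brute force search over offsets can be sketched in Python as follows. It assumes the blob is a zlib stream, which is only one of the algorithms the detection stage might identify, and uses Python's zlib module as the standard decompression tool. The function name brute_force_unpack and the min_size cut-off for a "significantly sized" result are hypothetical.

```python
import zlib
from typing import Optional, Tuple


def brute_force_unpack(
    image: bytes, window_start: int, window_size: int, min_size: int = 64
) -> Optional[Tuple[int, bytes]]:
    """Try to decompress at every offset in the window; return (offset, data)."""
    end = min(window_start + window_size, len(image))
    for offset in range(window_start, end):
        decompressor = zlib.decompressobj()
        try:
            unpacked = decompressor.decompress(image[offset:])
        except zlib.error:
            continue  # Wrong offset: the stream was detected as corrupt.
        # Accept only a complete stream that yields a sizeable result.
        if decompressor.eof and len(unpacked) >= min_size:
            return offset, unpacked
    return None


# Hypothetical packed image: filler, a compressed payload, then trailing bytes.
payload = b"original code " * 50
packed_image = b"\x90" * 37 + zlib.compress(payload) + b"\x00" * 20
result = brute_force_unpack(packed_image, 0, 64)
```

Using a decompression object rather than a one-shot call lets bytes that follow the blob be ignored, and checking the end-of-stream flag rejects offsets where the data merely starts like a valid stream.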
5. CONCLUSION
Code packing is a tool used by malware authors that can hinder static analysis. If the goal of malware analysis is to have access to the malware's real code, then unpacking is necessary. This may be required if malware samples are being grouped into families. Sometimes legitimate commercial software is packed, which means that without analysis of the hidden code, Antivirus would incorrectly label it as malicious. Unpacking code can be a challenging problem, and non-traditional packing techniques such as instruction virtualization are increasingly used by malware authors. It seems that the only safe solution for Antivirus is to perform packer detection and flag all such occurrences as potential malware. For legitimate software, white listing and co-ordination with Antivirus vendors may be the only secure way forward.
Figure 2. The proposed system. (Diagram labels: Unknown Binary; Hash Database with outcomes White listed, Black listed and Unknown; Packer Detection with outcomes Packed and Not packed; Packer Classification; Application Level Emulation; Static Unpacker based on Packer Algorithm; Unpacked; Unknown or can't unpack; Suspicious; Malware Classification; Malicious; Benign.)
