Fast Automated Unpacking and Classification of Malware

Malware is a pervasive problem in distributed computer and network systems. Identification of malware variants provides great benefit in early detection. Control flow has been proposed as a characteristic that can be identified across variants, resulting in classification employing flowgraph based signatures. Static analysis is widely used to construct the signatures but can be ineffective if malware undergoes a code packing transformation to hide its real content. This thesis proposes a novel system, named Malwise, for malware classification using a fast application level emulator to reverse the code packing transformation, and two flowgraph matching algorithms to perform classification: exact flowgraph matching and approximate flowgraph matching. The exact flowgraph matching algorithm uses string based signatures of graph invariants, and is able to detect malware with near real-time performance. The approximate flowgraph matching algorithm is slower but more effective and uses the decompilation technique of structuring to generate string based signatures amenable to comparisons using the string edit distance. To demonstrate the effectiveness and efficiency of the automated unpacking and flowgraph based classification, we evaluate the system with synthetic malware and over 15,000 real samples. The evaluation shows our system is highly effective in terms of accuracy in revealing all a sample's hidden code, execution time for unpacking and classification, and accuracy in detection of malware variants.
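
As a rough illustration of how string based signatures can be compared using the string edit distance, the following minimal sketch computes the Levenshtein distance between two strings and normalises it into a similarity score. It is not the Malwise implementation: the function names, the normalisation by the longer string, and the example strings are assumptions made purely for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalisation: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty signatures are trivially identical
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # Placeholder strings standing in for two signatures.
    print(edit_distance("kitten", "sitting"))
    print(similarity("abcd", "abce"))
```

Under these assumptions, two signatures would be treated as similar when the score exceeds some chosen threshold; the choice of threshold is left open here.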
Fast Automated Unpacking and Classification of Malware
Silvio CESARE
Master of Informatics
School of Management and Information Systems
Faculty of Arts, Business, Informatics and Education
Central Queensland University
May 2010
Certificate of Authorship and Originality of thesis
The work contained in this thesis has not been previously submitted either in whole or in part for a degree at Central Queensland University or any other tertiary institution. To the best of my knowledge and belief, the material presented in this thesis is original except where due reference is made in text.
17 May 2010

