You are on page 1of 48

Light and Dark side of Code Instrumentation

Dmitriy D1g1 Evdokimov DSecRG, Security Researcher

#whoami
Security Researcher in DSecRG
RE Fuzzing Mobile security

Organizer: DCG #7812 Editor in XAKEP

www.dsecrg.com

CONFidence Krakow 2012

Agenda
1. 2. 3. 4. 5. 6. 7. Instrumentation . Instrumentation .. Instrumentation Instrumentation . Instrumentation .. Instrumentation Instrumentation .
CONFidence Krakow 2012 3

www.dsecrg.com

Intro

It has been proved by scientists that a new point of evolution, any technical progress appears when a Man makes up a new type of tool, but not a product.

www.dsecrg.com

CONFidence Krakow 2012

Instrumentation
Instrumentation is a technique adding extra code to an program/environment for monitoring/change some program behavior.
Environment Program Own extra code Program

Own extra code

www.dsecrg.com

CONFidence Krakow 2012

Why is it necessary?
Parallel optimization Simulation Virtualization Emulation Performance analysis Automated debugging Error detection Binary translation Optimization Memory leak detection Software profiling Testing Correctness checking Collecting code metrics Memory debugging

www.dsecrg.com

CONFidence Krakow 2012

Instrumentation in information security


Control flow analysis Unpack Virtual patching Privacy monitoring Sandboxing Data flow analysis Deobfuscation Reverse engineering Behavior based security Transparent debugging Data Structure Restoring Security enforcement Forensic Security test case generation Code coverage Vulnerability detection Antivirus technology

Malware analysis

Taint analysis Program shepherding

Shellcode detection

Fuzzing

www.dsecrg.com

CONFidence Krakow 2012

Analysis
Criterion Code vs. data Code coverage Information about values Self-modifying code Interaction with the environment Unused code JIT code Static analysis Problem Big (but not all) No information Problem No Analysis Problem Dynamic analysis No problem One way All information No problem Yes No analysis No problem

www.dsecrg.com

CONFidence Krakow 2012

Code Discovery
Memory After static analysis 0101010110101001010010 Instr 1 0101010101101010101010 Instr 2 Instr 3 1111010101110101000111 jump reg InstrDATA 4 Instr 1011100111001010101011 5 Instr 4 6 Instr Instr 7 5 jmp 0x0ABCD PADDING 0111010110100111100110 Instr 7 cont. Instr 8 1010101101110001001011 Instr 9 Instr 6 10 Instr After dynamic analysis

www.dsecrg.com

CONFidence Krakow 2012

The general scheme of code instrumentation


1. 2. 3. 4. 5. 6. 7. Find points of instrumentation; Insert instrumentation; Take control from program; Save context of the program; Execute own code; Restore context of the program; Return control to program.
CONFidence Krakow 2012 10

www.dsecrg.com

Source Data
Source data

Source code

Byte code

Binary code

www.dsecrg.com

CONFidence Krakow 2012

11

Classification of target instrumentation


Instrumentation With source code Source code Linker/Compiler Byte code Byte code Interpreter/VM Without source code Binary code Executable file Process Environment Hardware
12

Environment Dynamic binary instrumentation Static binary modification Byte code instrumentation Link-time/Compilation-time Source code instrumentation - Static instrumentation - Load-time - Dynamic www.dsecrg.com CONFidence Krakow 2012

Source code instrumentation


Source code*
Source code instrumentation
Manual skills Plugins for IDE

Link-time/Compilation-time instrumentation
Options of linker/compiler

Tools: Visual Studio Profiler, gcc, TAU, OPARI, Diablo, Phoenix, LLVM, Rational Purify, Valgrind
*Unreal condition for security specialist =)
www.dsecrg.com CONFidence Krakow 2012 13

Unmoral programming

www.dsecrg.com

CONFidence Krakow 2012

14

Byte code instrumentation


Byte code intermediate representation between source code and machine code.

Java VM Dalvik VM AVM/AVM2 CLR

www.dsecrg.com

CONFidence Krakow 2012

15

Instrumentation byte-code (I)


Source code Compilation Byte-code Load Loader Lib

Virtual machine

JIT Execute

Lib

Lib

Machine code
www.dsecrg.com CONFidence Krakow 2012 16

Instrumentation byte code (II)


Byte-code
Static instrumentation
Static byte code instrumentation

Load-time instrumentation
Custom byte code loader

Dynamic instrumentation
Dynamic byte-code instrumentation

www.dsecrg.com

CONFidence Krakow 2012

17

Instrumentation Java (I)


Mechanisms:
java.lang.instrument package; Java Platform Debugger Architecture (JPDA) .

www.dsecrg.com

CONFidence Krakow 2012

18

Instrumentation Java (II)


Static instrumentation
ClassFileLoadHook Custom ClassLoader

Load-time instrumentation Dynamic instrumentation

Modification *.class files

ClassFileLoadHook -> RetransformClasses

Tools: Javassist, ObjectWeb ASM, BCEL, JOIE, reJ JavaSnoop, Serp, JMangler
www.dsecrg.com CONFidence Krakow 2012 19

Instrumentation .NET
Static instrumentation
Modification DLL files

Load-time instrumentation
AppDomain.Load()/Assembly.Load() Joint redirection Via event handler

Tools: ReFrameworker, MBEL, RAIL, Cecil


www.dsecrg.com CONFidence Krakow 2012 20

Instrumentation ActionScript (I)


ActionScript2
AVM Tags that (can) contain bytecode:
DefineButton (7), DefineButton2 (34), DefineSprite (39), DoAction (12), DoInitAction (59), PlaceObject2 (26), PlaceObject3 (70).

ActionScript3
AVM2 Tags that (can) contain bytecode:
DoABC (82), RawABC (72).
www.dsecrg.com CONFidence Krakow 2012 21

AVM2 Architecture
AS3 .abc function (x:int):int { return x+10 } .abc parser .abc getlocal 1 Verifier Bytecode pushint 10 add returnvalue Interpreter MIR JIT Compiler @1 arg +8// argv MIR Code Generator @2 load [@1+4] @3 imm 10 @4 add (@2,@3) MD Code Generator @5 ret @4 // @4:eax (x86, PPC, ARM, etc.)

x86 Runtime System (Type System, Object Model) mov eax,(eap+8) mov eax,(eax+4) add eax,10 Memory Manager/Garbage Collector ret
www.dsecrg.com CONFidence Krakow 2012 22

Instrumentation ActionScript (I)


Original SWF file AVM tag

Header

Tags

Instrumenteted SWF file

www.dsecrg.com

CONFidence Krakow 2012

23

Instrumentation AVM (II)


Static instrumentation
Add :
trace() dump() debug() debugfile() debugline()

Modification:
Create own class + change class name = hook!
www.dsecrg.com CONFidence Krakow 2012 24

Instrumentation binary code


The executable file Process
Static code instrumentation Debuggers
IAT

Environment

Static binary instrumentation

Modifying call table

Modifying call table/other structure Dynamic code instrumentation

Debugging API

Modifying OS options
SHIM LD_PRELOAD AppInt_DLLs DLL injection

IDT, CPU MSRs, GDT, SSDT, IRP able

Hardware

Dynamic binary instrumentation

Hardware debug features


Debug registers Hardware debuggers

Reproduction of the environment


Emulation Virtualization

www.dsecrg.com

CONFidence Krakow 2012

25

Static Binary Instrumentation (I)


Static binary instrumentation/Physical code integration/Static binary code rewriting Realization:
With reallocation:
Level of segment; Level of function;
Header Segment of code Segment of data Edited Header Segment of code Segment of data Extra segment of code Extra segment of data
26

Without reallocation.
www.dsecrg.com CONFidence Krakow 2012

Static Binary Instrumentation (II)


Reallocation: 1) Function Displacement + Entry Point Linking; 2) Branch Conversion; 3) Instruction Padding; 4) Instrumentation.
Tools: DynInst, EEL, ATOM, PEBIL, ERESI, TAU, Vulcan, BIRD, Aslan(4514N)
www.dsecrg.com CONFidence Krakow 2012 27

Debuggers
Breakpoints:
Software Hardware
App OS Processor Debugger

Debugger + scripting:

WinDBG + pykd OllyDBG + python = Immunity Debuggers GDB + PythonGDB

Python library's*: Buggery, IDAPython, ImmLIB, lldb, PyDBG, PyDbgEng, pygdb , python-ptrace , vtrace, WinAppDbg,

*See Python Arsenal for Reverse Engineering


www.dsecrg.com CONFidence Krakow 2012 28

Dynamic Binary Instrumentation


Dynamic binary instrumentation/Virtual code integration/Dynamic binary rewriting
App1 DBI App2 OS Processor

Tools: PIN, DynamoRIO, DynInst, Valgrind, BAP, KEDR, Fit, ERESI, Detour, Vulcan, SpiderPig
www.dsecrg.com CONFidence Krakow 2012 29

Dynamic Binary Instrumentation


Dynamic Binary Instrumentation (DBI) is a process control and analysis technique that involves injecting instrumentation code into a running process. Dynamic binary analysis (DBA) tools such as profilers and checkers help programmers create better software. Dynamic binary instrumentation (DBI) frameworks make it easy to build new DBA tools. DBA tools consist:
instrumentation routines; analysis routines.

www.dsecrg.com

CONFidence Krakow 2012

30

Kinds of DBI
Mode:
user-mode; kernel-mode.

Mode of work:
- Start to finish; - Attach.
Functionality

Modes of execution:
Interpretation-mode; Probe-mode; JIT-mode.
www.dsecrg.com CONFidence Krakow 2012

JIT

Probe Performance
31

DBI Frameworks*
Frameworks PIN OS Linux, Windows, MacOS Linux, Windows Linux, FreeBSD, Windows Linux, MacOS Arch x86, x86-64, Itanium, ARM x86, x86-64 x86, x86-64, ppc32, ARM, ppc64 x86, x86-64, ppc32, ARM, ppc64 Modes JIT, Probe Features Attach mode

DynamoRIO DynInst

JIT, Probe Probe

Runtime optimization Static & Dynamic binary instrumentation IR VEX, Heavyweight DBA tools

Valgrind

JIT

*For more details see DBI:Intro presentation from ZeroNights conference


www.dsecrg.com CONFidence Krakow 2012 32

Start work with DBI

www.dsecrg.com

CONFidence Krakow 2012

33

Levels of granularity
Instruction; Basic Block*; Trace/Superblock; Function; Section; Events; Binary image.
CONFidence Krakow 2012 34

www.dsecrg.com

Self-modifying code & DBI


Detect:
Written-protecting code pages Checking store address Inserting extra code

www.dsecrg.com

CONFidence Krakow 2012

35

Overhead
O=X+Y Y = N*Z Z = K+L O Tool Overhead; X Instrumentation Routines Overhead; Y Analysis Routines Overhead; N Frequency of Calling Analysis Routine; Z Work Performed in the Analysis Routine; K Work Required to Transition to Analysis Routine; L Work Performed Inside the Analysis Routine.
www.dsecrg.com CONFidence Krakow 2012 36

Rewriting instructions
Platforms:
with fixed-length instruction; with variable-length instructions.

www.dsecrg.com

CONFidence Krakow 2012

37

Rewriting code (I)


Easy / simple / boring / regular example
Rewriting prolog function

www.dsecrg.com

CONFidence Krakow 2012

38

Rewriting code (II)


Hardcore example:
Mobile phone firmware rewriting
Bootloader reboot Flash

SHELLCODE 1

AMSS GSM

Malicious SMS

Baseband processor
www.dsecrg.com CONFidence Krakow 2012 39

Instrumentation in ARM
ARM modes:
ARM
Length(instr) = 4 byte

Thumb
Length(instr) = 2 byte

Thumb2
Length(instr) = 2/4 byte

Jazzle

For more detail see A Dynamic Binary Instrumentation Engine for the ARM Architecture presentation.
www.dsecrg.com CONFidence Krakow 2012 40

Emulation

App1 Emulator OS Processor

OS

www.dsecrg.com

CONFidence Krakow 2012

41

Instrumentation & Bochs


Bochs can be called with instrumentation support. C++ callbacks occur when certain events happen:
Poweron/Reset/Shutdown; Branch Taken/Not Taken/Unconditional; Opcode Decode (All relevant fields, lengths); Interrupt /Exception; Cache /TLB Flush/Prefetch; Memory Read/Write.

bochs-python-instrumentation patch by Ero Carrera


www.dsecrg.com CONFidence Krakow 2012 42

Virtualization
App1 App1 VMM Processor Native VMM OS VMM OS Processor Hosted VMM OS

*VMM - Virtual Machine Monitor


www.dsecrg.com CONFidence Krakow 2012 43

Instrumentation & virtualization


Stages: 1. Save the VM-exit reason information in the VMCS; 2. Save guest context information; 3. Load the host-state area; 4. Transfer control to the hypervisor; 5. Run own code.

*VMCS - Virtual Machine Control Structure

www.dsecrg.com

CONFidence Krakow 2012

44

Instrumentation in Mobile World

Mobile Platform Android iOS Windows Phone

Language Java Objective-C .NET

Executable file format Dex Mach-O PE

www.dsecrg.com

CONFidence Krakow 2012

45

Conclusion

One can implement instrumentation of everything!

www.dsecrg.com

CONFidence Krakow 2012

46

Contact

Twitter: @evdokimovds E-mail: d.evdokimov@dsecrg.com

www.dsecrg.com

CONFidence Krakow 2012

47

Windows 8
Apps:
C++ & DirectX C# & XAML HTML & JavaScript & CSS

www.dsecrg.com

CONFidence Krakow 2012

48

You might also like