Light and Dark side of Code Instrumentation

Dmitriy “D1g1” Evdokimov, DSecRG, Security Researcher

#whoami
• Security Researcher at DSecRG
– RE
– Fuzzing
– Mobile security

• Organizer: DCG #7812
• Editor at “XAKEP”

www.dsecrg.com

CONFidence Krakow 2012


Agenda
1. Instrumentation .
2. Instrumentation ..
3. Instrumentation …
4. Instrumentation ….
5. Instrumentation …..
6. Instrumentation ……
7. Instrumentation …….

Intro

“It has been proved by scientists that a new point of evolution, any technical progress appears when a Man makes up a new type of tool, but not a product.”


Instrumentation
Instrumentation is a technique that adds extra code to a program or its environment in order to monitor or change the program's behavior.

[Figure: two placements of own extra code: inside the program, or in the environment around the program]


Why is it necessary?
• Parallel optimization
• Simulation
• Virtualization
• Emulation
• Performance analysis
• Automated debugging
• Error detection
• Binary translation
• Optimization
• Memory leak detection
• Software profiling
• Testing
• Correctness checking
• Collecting code metrics
• Memory debugging


Instrumentation in information security
• Control flow analysis
• Unpacking
• Virtual patching
• Privacy monitoring
• Sandboxing
• Data flow analysis
• Deobfuscation
• Reverse engineering
• Behavior-based security
• Transparent debugging
• Data structure restoring
• Security enforcement
• Forensics
• Security test case generation
• Code coverage
• Vulnerability detection
• Antivirus technology
• Malware analysis
• Taint analysis
• Program shepherding
• Shellcode detection
• Fuzzing


Analysis
Criterion | Static analysis | Dynamic analysis
Code vs. data | Problem | No problem
Code coverage | Big (but not all) | One way
Information about values | No information | All information
Self-modifying code | Problem | No problem
Interaction with the environment | No | Yes
Unused code | Analysis | No analysis
JIT code | Problem | No problem


Code Discovery
[Figure: a memory region containing instructions (Instr 1 to Instr 10), data, padding, an indirect jump (jump reg) and a direct jump (jmp 0x0ABCD), shown as recovered after static analysis and after dynamic analysis]


The general scheme of code instrumentation
1. Find points of instrumentation;
2. Insert instrumentation;
3. Take control from program;
4. Save context of the program;
5. Execute own code;
6. Restore context of the program;
7. Return control to program.
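
The seven steps can be sketched in Python, assuming the instrumentation point is a function call and the "context" is just the call arguments. The names instrument, log_call and add are illustrative and do not come from the talk.

```python
import functools

def instrument(extra_code):
    """Wrap a function so that extra_code runs before the original body."""
    def decorator(func):                            # steps 1-2: pick the point, insert the hook
        @functools.wraps(func)
        def wrapper(*args, **kwargs):               # step 3: control arrives here first
            context = (args, dict(kwargs))          # step 4: save the caller's arguments
            extra_code(func.__name__, context)      # step 5: execute own code
            saved_args, saved_kwargs = context      # step 6: restore the saved context
            return func(*saved_args, **saved_kwargs)  # step 7: return control to the program
        return wrapper
    return decorator

calls = []

def log_call(name, context):
    # Own extra code: record which function was entered and with which arguments.
    calls.append((name, context[0]))

@instrument(log_call)
def add(a, b):
    return a + b

result = add(2, 3)
```

A real binary instrumenter saves registers and flags rather than Python arguments, but the order of the steps is the same.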

Source Data
Source data:
• Source code
• Byte code
• Binary code


Classification of target instrumentation
Instrumentation targets:
• With source code
– Source code: source code instrumentation
– Linker/Compiler: link-time/compilation-time instrumentation
• Byte code
– Byte code, Interpreter/VM: byte code instrumentation (static instrumentation, load-time, dynamic)
• Without source code (binary code)
– Executable file: static binary modification
– Process: dynamic binary instrumentation
– Environment, Hardware: environment

Source code instrumentation
• Source code*
– Source code instrumentation
• Manual skills
• Plugins for IDE

– Link-time/Compilation-time instrumentation
• Options of linker/compiler

• Tools: Visual Studio Profiler, gcc, TAU, OPARI, Diablo, Phoenix, LLVM, Rational Purify, Valgrind
*An unrealistic condition for a security specialist =)
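
When the source is available, instrumentation can be inserted before compilation. A minimal sketch using Python's standard ast module, assuming we want a hook at every function entry; SOURCE, EntryLogger and trace are illustrative names, not tools from the list above.

```python
import ast

SOURCE = """
def square(x):
    return x * x
"""

class EntryLogger(ast.NodeTransformer):
    """Insert `trace.append(<function name>)` as the first statement of every function."""
    def visit_FunctionDef(self, node):
        self.generic_visit(node)                     # also handle nested functions
        hook = ast.parse(f"trace.append({node.name!r})").body[0]
        node.body.insert(0, hook)                    # the inserted statement is the instrumentation
        return node

tree = EntryLogger().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)

trace = []
namespace = {"trace": trace}
exec(compile(tree, "<instrumented>", "exec"), namespace)  # compile the rewritten source
value = namespace["square"](4)
```

The original program text is never edited by hand; the rewrite happens on the syntax tree just before compilation.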

Immoral programming


Byte code instrumentation
Byte code – intermediate representation between source code and machine code.

Virtual machines: Java VM, Dalvik VM, AVM/AVM2, CLR, …


Instrumentation byte-code (I)
[Figure: source code is compiled to byte code; the loader loads the byte code and libraries (Lib) into the virtual machine, which executes it or JIT-compiles it to machine code]

Instrumentation byte code (II)
• Byte-code
– Static instrumentation
• Static byte code instrumentation

– Load-time instrumentation
• Custom byte code loader

– Dynamic instrumentation
• Dynamic byte-code instrumentation
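
As a rough analogue outside the JVM/CLR world, CPython also compiles functions to byte code objects that can be patched without touching the source. The sketch below assumes CPython 3.8 or later (for code.replace) and only rewrites the constant pool of a compiled function; greet is a hypothetical target.

```python
def greet(name):
    return "hello " + name

# The compiled code object holds the byte code and its constant pool.
original = greet.__code__

# Patch one constant in the pool; the instruction stream itself is left unchanged.
patched_consts = tuple(
    "[hooked] hello " if const == "hello " else const
    for const in original.co_consts
)

# Swap the function's code object for the patched copy.
greet.__code__ = original.replace(co_consts=patched_consts)

message = greet("bob")
```

Tools such as the Java and .NET libraries listed on the following slides do the same kind of rewriting on class files and assemblies, usually on the instructions as well as the constants.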


Instrumentation Java (I)
Mechanisms:
• java.lang.instrument package;
• Java Platform Debugger Architecture (JPDA).


Instrumentation Java (II)
• Static instrumentation
– Modification of *.class files

• Load-time instrumentation
– ClassFileLoadHook
– Custom ClassLoader

• Dynamic instrumentation
– ClassFileLoadHook -> RetransformClasses

Tools: Javassist, ObjectWeb ASM, BCEL, JOIE, reJ, JavaSnoop, Serp, JMangler

Instrumentation .NET
• Static instrumentation
– Modification of DLL files

• Load-time instrumentation
– AppDomain.Load()/Assembly.Load()
– Joint redirection
– Via event handler

Tools: ReFrameworker, MBEL, RAIL, Cecil

Instrumentation ActionScript (I)
• ActionScript2
– AVM
– Tags that (can) contain bytecode:
• DefineButton (7), DefineButton2 (34), DefineSprite (39), DoAction (12), DoInitAction (59), PlaceObject2 (26), PlaceObject3 (70).

• ActionScript3
– AVM2
– Tags that (can) contain bytecode:
• DoABC (82), RawABC (72).

AVM2 Architecture
[Figure: AVM2 pipeline]
AS3 source: function (x:int):int { return x+10 }
Pipeline: .abc parser, Verifier, then Interpreter or JIT Compiler (MIR Code Generator, MD Code Generator for x86, PPC, ARM, etc.)
Bytecode (.abc):
getlocal 1
pushint 10
add
returnvalue
MIR:
@1 arg +8 // argv
@2 load [@1+4]
@3 imm 10
@4 add (@2,@3)
@5 ret @4 // @4:eax
x86:
mov eax,(esp+8)
mov eax,(eax+4)
add eax,10
ret
Runtime: Runtime System (Type System, Object Model), Memory Manager/Garbage Collector

Instrumentation ActionScript (I)
[Figure: an original SWF file (Header followed by Tags, including an AVM tag) next to the instrumented SWF file]


Instrumentation AVM (II)
• Static instrumentation
– Add:
• trace()
• dump()
• debug()
• debugfile()
• debugline()

– Modification:
• Create own class + change class name = hook!

Instrumentation binary code
• The executable file
– Static code instrumentation
• Static binary instrumentation

• Process
– Debuggers
• Debugging API
– Modifying call table
• IAT
• …
– Dynamic code instrumentation
• Dynamic binary instrumentation

• Environment
– Modifying call table/other structure
• IDT, CPU MSRs, GDT, SSDT, IRP table
• …
– Modifying OS options
• SHIM
• LD_PRELOAD
• AppInit_DLLs
• DLL injection
• …
– Reproduction of the environment
• Emulation
• Virtualization

• Hardware
– Hardware debug features
• Debug registers
• Hardware debuggers
• …
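
Call-table modification (IAT, SSDT and similar) works because the program resolves the callee through a table at call time. A toy Python sketch of the idea, assuming a dict stands in for the table; import_table, program and hook are hypothetical names.

```python
import math

# Stand-in for an import table: symbolic name -> address of the implementation.
import_table = {"sqrt": math.sqrt}

def program(x):
    # The program never calls math.sqrt directly; it goes through the table.
    return import_table["sqrt"](x)

log = []

def hook(name):
    """Redirect one table entry to a detour that logs, then calls the original."""
    original = import_table[name]
    def detour(*args):
        log.append((name, args))      # own extra code
        return original(*args)        # forward to the real implementation
    import_table[name] = detour
    return original                   # keep it so the hook can be removed later

before = program(9.0)
hook("sqrt")
after = program(16.0)
```

The program code is untouched; only the table entry changes, which is why this kind of hook is easy to install and easy to detect.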


Static Binary Instrumentation (I)
Static binary instrumentation/Physical code integration/Static binary code rewriting
• Implementation:
– With reallocation:
• Level of segment;
• Level of function;
– Without reallocation.

[Figure: original file layout (Header, Segment of code, Segment of data) and instrumented layout (Edited Header, Segment of code, Segment of data, Extra segment of code, Extra segment of data)]

Static Binary Instrumentation (II)
Reallocation:
1) Function Displacement + Entry Point Linking;
2) Branch Conversion;
3) Instruction Padding;
4) Instrumentation.
Tools: DynInst, EEL, ATOM, PEBIL, ERESI, TAU, Vulcan, BIRD, Aslan(4514N)

Debuggers
• Breakpoints:
– Software
– Hardware

[Figure: App, OS, Processor, Debugger]

• Debugger + scripting:
– WinDBG + pykd
– OllyDBG + Python = Immunity Debugger
– GDB + PythonGDB

• Python libraries*: Buggery, IDAPython, ImmLIB, lldb, PyDBG, PyDbgEng, pygdb, python-ptrace, vtrace, WinAppDbg, …

*See “Python Arsenal for Reverse Engineering”
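
The debugger approach can be illustrated without any of the libraries above by using the tracing hook that Python itself exposes, sys.settrace. This is only an analogue of a breakpoint on function entry inside the Python interpreter, not a native debugger; tracer and target are illustrative names.

```python
import sys

hits = []

def tracer(frame, event, arg):
    # Called by the interpreter on every new Python frame while tracing is on.
    if event == "call" and frame.f_code.co_name == "target":
        hits.append(dict(frame.f_locals))   # "breakpoint hit": snapshot the arguments
    return None                             # no line-by-line tracing inside the frame

def target(a, b):
    return a * b

sys.settrace(tracer)        # attach
try:
    product = target(6, 7)
finally:
    sys.settrace(None)      # detach
```

As with a real debugger, the target runs unmodified and control is handed to the observer on selected events.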

Dynamic Binary Instrumentation
Dynamic binary instrumentation/Virtual code integration/Dynamic binary rewriting
[Figure: App1 on top of a DBI layer next to App2, above OS and Processor]

Tools: PIN, DynamoRIO, DynInst, Valgrind, BAP, KEDR, Fit, ERESI, Detour, Vulcan, SpiderPig

Dynamic Binary Instrumentation
• Dynamic Binary Instrumentation (DBI) is a process control and analysis technique that involves injecting instrumentation code into a running process.
• Dynamic binary analysis (DBA) tools such as profilers and checkers help programmers create better software.
• Dynamic binary instrumentation (DBI) frameworks make it easy to build new DBA tools.
• DBA tools consist of:
– instrumentation routines;
– analysis routines.
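
The split between instrumentation routines and analysis routines can be shown on a toy stack machine. This is a sketch under that assumption, not the API of any DBI framework; PROGRAM, instrument, count_opcode and run are hypothetical names.

```python
# Toy "binary": a list of (opcode, operand) pairs for a tiny stack machine.
PROGRAM = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]

counts = {}

def count_opcode(opcode):
    """Analysis routine: runs every time an instrumented instruction executes."""
    counts[opcode] = counts.get(opcode, 0) + 1

def instrument(program):
    """Instrumentation routine: runs once and decides where analysis calls go."""
    plan = []
    for opcode, operand in program:
        hooks = [count_opcode] if opcode in ("add", "mul") else []
        plan.append((hooks, opcode, operand))
    return plan

def run(plan):
    stack = []
    for hooks, opcode, operand in plan:
        for analysis in hooks:          # injected calls execute before the instruction
            analysis(opcode)
        if opcode == "push":
            stack.append(operand)
        elif opcode == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[-1]

result = run(instrument(PROGRAM))
```

Here only arithmetic instructions are instrumented, so push instructions run with no analysis cost at all.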


Kinds of DBI
Mode:
– user-mode;
– kernel-mode.

Mode of work:
– Start to finish;
– Attach.

Modes of execution:
– Interpretation-mode;
– Probe-mode;
– JIT-mode.

[Figure: JIT and Probe modes plotted on Functionality vs. Performance axes]

DBI Frameworks*
Framework | OS | Arch | Modes | Features
PIN | Linux, Windows, MacOS | x86, x86-64, Itanium, ARM | JIT, Probe | Attach mode
DynamoRIO | Linux, Windows | x86, x86-64 | JIT, Probe | Runtime optimization
DynInst | Linux, FreeBSD, Windows | x86, x86-64, ppc32, ARM, ppc64 | Probe | Static & Dynamic binary instrumentation
Valgrind | Linux, MacOS | x86, x86-64, ppc32, ARM, ppc64 | JIT | IR – VEX, Heavyweight DBA tools

*For more details see “DBI:Intro” presentation from ZeroNights conference

Getting started with DBI


Levels of granularity
• Instruction;
• Basic Block*;
• Trace/Superblock;
• Function;
• Section;
• Events;
• Binary image.

Self-modifying code & DBI
Detect:
– Write-protecting code pages
– Checking store addresses
– Inserting extra code
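
The "checking store addresses" option can be sketched as a code cache that drops any translated block a write lands in. A toy model, assuming translated blocks are tracked by address range; CodeCache and its methods are hypothetical and not taken from a real framework.

```python
class CodeCache:
    """Toy code cache: remembers which address ranges have been translated."""

    def __init__(self):
        self.translated = {}        # start address -> (end address, translation)
        self.invalidations = 0

    def add(self, start, end, translation):
        self.translated[start] = (end, translation)

    def on_store(self, address):
        """Analysis hook for every memory write: drop translations the write hits."""
        stale = [s for s, (e, _) in self.translated.items() if s <= address < e]
        for start in stale:
            del self.translated[start]      # must be re-translated on next execution
            self.invalidations += 1
        return bool(stale)

cache = CodeCache()
cache.add(0x1000, 0x1010, "block A")
cache.add(0x2000, 0x2020, "block B")

data_write = cache.on_store(0x3000)   # ordinary data write, nothing cached there
code_write = cache.on_store(0x1004)   # write into block A: self-modifying code
```

Checking every store is precise but costs an analysis call per write; write-protecting code pages moves that cost to the rare faulting write.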


Overhead
O = X + Y
Y = N * Z
Z = K + L

O – Tool Overhead;
X – Instrumentation Routines Overhead;
Y – Analysis Routines Overhead;
N – Frequency of Calling Analysis Routine;
Z – Work Performed in the Analysis Routine;
K – Work Required to Transition to Analysis Routine;
L – Work Performed Inside the Analysis Routine.
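
A worked example of the formula with made-up numbers (all values are hypothetical cycle counts, chosen only to show which term dominates):

```python
def tool_overhead(x, n, k, l):
    """O = X + Y, where Y = N * Z and Z = K + L."""
    z = k + l          # cost of one analysis call: transition plus the work inside
    y = n * z          # analysis routines overhead over the whole run
    return x + y       # plus the one-off instrumentation routines overhead

# One million analysis calls, 20 cycles to transition, 5 cycles of real work.
total = tool_overhead(x=50_000, n=1_000_000, k=20, l=5)

# Same tool if the transition cost drops to 2 cycles (for example by inlining).
inlined = tool_overhead(x=50_000, n=1_000_000, k=2, l=5)
```

With these numbers the analysis term N * Z dwarfs X, so reducing the call frequency N or the transition cost K matters far more than speeding up the instrumentation routines.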

Rewriting instructions
• Platforms:
– with fixed-length instructions;
– with variable-length instructions.


Rewriting code (I)
• Easy / simple / boring / regular example
– Rewriting a function prologue
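
A sketch of the prologue rewrite on 32-bit x86, assuming the first five bytes of the function are overwritten with a near jump (opcode E9, rel32) to a hook. The addresses 0x401000 and 0x402000 and the function bytes are made up; the buffer is only patched in memory here, never executed.

```python
import struct

def make_jmp_rel32(patch_address, hook_address):
    """Encode `jmp hook_address` (E9 rel32) for placement at patch_address."""
    # The displacement is relative to the end of the 5-byte jump instruction.
    displacement = hook_address - (patch_address + 5)
    return b"\xE9" + struct.pack("<i", displacement)

# mov edi,edi; push ebp; mov ebp,esp; xor eax,eax; pop ebp; ret
function_bytes = bytearray(b"\x8B\xFF\x55\x8B\xEC\x33\xC0\x5D\xC3")

# Save the overwritten prologue so a trampoline can execute it later.
saved_prologue = bytes(function_bytes[:5])

# Overwrite the prologue with the jump to the hook.
function_bytes[:5] = make_jmp_rel32(0x401000, 0x402000)
```

The example works out neatly because the three prologue instructions are exactly five bytes long; on variable-length instruction sets the patch must always end on an instruction boundary.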


Rewriting code (II)
• Hardcore example:
– Mobile phone firmware rewriting
[Figure: attack flow with the labels Malicious SMS, GSM, AMSS, SHELLCODE 1, Flash, reboot, Bootloader, Baseband processor]

Instrumentation in ARM
ARM modes:
– ARM
• Length(instr) = 4 bytes

– Thumb
• Length(instr) = 2 bytes

– Thumb2
• Length(instr) = 2/4 bytes

– Jazelle

For more details, see the “A Dynamic Binary Instrumentation Engine for the ARM Architecture” presentation.

Emulation

[Figure: stack of App1, OS, Emulator, OS, Processor]


Instrumentation & Bochs
• Bochs can be built with instrumentation support.
• C++ callbacks occur when certain events happen:
– Poweron/Reset/Shutdown;
– Branch Taken/Not Taken/Unconditional;
– Opcode Decode (All relevant fields, lengths);
– Interrupt/Exception;
– Cache/TLB Flush/Prefetch;
– Memory Read/Write.

• “bochs-python-instrumentation” patch by Ero Carrera

Virtualization
[Figure: native VMM (App1, OS, VMM, Processor) versus hosted VMM (App1, OS, VMM, OS, Processor)]

*VMM - Virtual Machine Monitor

Instrumentation & virtualization
Stages:
1. Save the VM-exit reason information in the VMCS;
2. Save guest context information;
3. Load the host-state area;
4. Transfer control to the hypervisor;
5. Run own code.

*VMCS - Virtual Machine Control Structure


Instrumentation in Mobile World

Mobile Platform | Language | Executable file format
Android | Java | Dex
iOS | Objective-C | Mach-O
Windows Phone | .NET | PE


Conclusion

One can implement instrumentation of everything!


Contact

Twitter: @evdokimovds
E-mail: d.evdokimov@dsecrg.com


Windows 8
• Apps:
– C++ & DirectX
– C# & XAML
– HTML & JavaScript & CSS

