
Shellcode Analysis

Chapter 19
What is Shellcode
 Shellcode
 A payload of raw executable code; the name comes from attackers using such code to obtain interactive shell access
 A binary chunk of data
 More generally, the term refers to any self-contained executable code
 IDA Pro can load a shellcode binary, but it performs no automatic analysis because there is no executable file format that describes the content
 What can an attacker do with shellcode?
 setuid(0): escalate to root
 Spawn a shell or run a program: execve("/bin/bash", NULL, NULL) on Linux, WinExec on Windows
 Open certain network ports
 Open a reverse shell that connects back to the attacker
Position-Independent Code
 Shellcode uses no hard-coded memory addresses
 Table 19-1, p. 408: call and jmp are position independent because they calculate target addresses by adding an offset
 A mov that accesses a global memory location is not position independent; a mov that accesses memory at an offset from a register is position independent
 All branches and jumps are relative
 Code can be placed anywhere in memory and still function as intended
 Essential for exploit code and shellcode injected from a remote location, since the addresses are not known in advance
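The offset arithmetic behind a relative call can be sketched in a few lines of Python. This is only an illustration, assuming the common x86 near call encoding (opcode E8 followed by a signed 32-bit displacement); the function name and the two load addresses are made up for the example.

```python
import struct


def call_target(code: bytes, load_address: int, offset: int = 0) -> int:
    """Return the absolute target of an E8 rel32 call found at code[offset]."""
    if code[offset] != 0xE8:
        raise ValueError("not a near relative call (opcode E8)")
    # The displacement is a signed, little-endian 32-bit value.
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    # It is added to the address of the instruction that follows the call.
    next_instruction = load_address + offset + 5  # 1 opcode byte + 4 displacement bytes
    return (next_instruction + rel32) & 0xFFFFFFFF


code = bytes.fromhex("e810000000")  # call with displacement +0x10

# Hypothetical load addresses: the same bytes placed in two different locations.
for base in (0x00401000, 0x7FFE0000):
    target = call_target(code, base)
    print(hex(base), "->", hex(target), "distance from base:", hex(target - base))
```

The absolute target changes with the load address, but the distance from the code to its target stays the same, which is why the bytes work wherever they are placed.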
Identifying Execution Location
 Shellcode may need to find out its own execution location so that it can dereference a base pointer to reach its data
 x86 does not provide EIP-relative access to embedded data as it does for control-flow instructions
 To obtain a base pointer, shellcode must load EIP into a general-purpose register
 Problem: EIP cannot be accessed directly, so an instruction such as "mov %eip, %eax" is not allowed
 Two methods: call/pop and fnstenv
 call/pop: call pushes the address of the next instruction onto the stack, and pop retrieves it (Listing 19-1, p. 410)
Example: JMP-CALL-POP
 jmp into the shellcode
 The memory address of "Hello World" is worked out dynamically; there is no hard-coded address
 When the call executes, the address of the next instruction is pushed onto the stack
 Inside the call target, this address is popped off the stack into EDI
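The call/pop idea can be modeled without any assembly. The sketch below is a toy simulation, not real shellcode: it assumes a 5-byte call instruction, and the offsets and base addresses are hypothetical values chosen for illustration.

```python
CALL_LENGTH = 5  # E8 opcode plus a 4-byte displacement


def simulate_call_pop(base: int, call_offset: int) -> int:
    """Model call/pop: return the value that the pop would load into a register."""
    stack = []
    call_address = base + call_offset
    # call pushes the address of whatever follows it (here, the embedded string).
    stack.append(call_address + CALL_LENGTH)
    # The first instruction at the call target pops that address.
    return stack.pop()


# Hypothetical layout: the call sits at offset 0x20 and the string follows it.
CALL_OFFSET = 0x20
STRING_OFFSET = CALL_OFFSET + CALL_LENGTH

for base in (0x0012FF00, 0x00A40000):
    edi = simulate_call_pop(base, CALL_OFFSET)
    assert edi == base + STRING_OFFSET
    print("loaded at", hex(base), "-> EDI =", hex(edi))
```

Whatever the load address, the popped value points at the data placed directly after the call, so no address has to be known in advance.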
Manual Symbol Resolution

 Shellcode needs to resolve external symbols
 Shellcode cannot rely on the Windows loader to ensure the required libraries are in process memory, so it must find the symbols itself
 It must dynamically locate functions such as LoadLibraryA and GetProcAddress (both located in kernel32.dll)
 Finding kernel32.dll in memory
 Requires traversing undocumented structures (Figure 19-1, Listing 19-4, pp. 414-415)
 From Windows 2000 through Vista, kernel32.dll follows ntdll.dll (second place in the InInitializationOrderLinks list)
 Windows 7 and later change this order, so the module must be confirmed by checking the UNICODE_STRING FullDllName field
Locate kernel32.dll

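Begin at the TEB, which is reached through the FS segment register -> offset 0x30 within the TEB points to the PEB -> offset 0xC within the PEB points to the loader data -> traverse the linked list of loaded modules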

Windows 2000 through Vista: kernel32.dll follows ntdll.dll. This changed starting with Windows 7.


Parsing PE Export Data
 After the base address of kernel32.dll is found, parse its PE export data to locate exported symbols
 The addresses of exported functions are reached through the PE header and stored as relative virtual addresses (RVAs) in the IMAGE_EXPORT_DIRECTORY
 AddressOfFunctions, AddressOfNames, and AddressOfNameOrdinals arrays (Figure 19-2, p. 417)
 To keep shellcode compact, hashes of function names are compared instead of the names themselves
 32-bit rotate-right-additive hash (Listing 19-5, 19-6, pp. 418-419): calculates a 32-bit hash value for each name
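For analysis work it helps to reproduce the hash and the three-array lookup outside the shellcode. The following is a minimal sketch, assuming the hash rotates right by 13 bits and then adds each name byte, with the terminating NULL excluded (a common variant, so check it against the sample at hand). The export table and base address are invented toy data, not values from kernel32.dll.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_add_hash(name: str) -> int:
    """Rotate-right-additive hash: rotate by 13, then add the next byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(base, names, name_ordinals, functions, wanted_hash):
    """Walk the name array; map a matching index to an ordinal, then to an RVA."""
    for i, name in enumerate(names):
        if ror13_add_hash(name) == wanted_hash:
            ordinal = name_ordinals[i]        # AddressOfNameOrdinals[i]
            return base + functions[ordinal]  # base + AddressOfFunctions[ordinal]
    return None


# Toy export data (hypothetical): names, name ordinals, and function RVAs.
names = ["GetProcAddress", "LoadLibraryA", "WinExec"]
name_ordinals = [1, 2, 0]
functions = [0x3000, 0x1000, 0x2000]
base = 0x10000000

wanted = ror13_add_hash("LoadLibraryA")
print(hex(wanted), hex(resolve_by_hash(base, names, name_ordinals, functions, wanted)))
```

An analyst can run every export name of a DLL through the same function to build a hash-to-name table and then look up the constants found in a sample.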
Shellcode Encoding
 Shellcode must be embedded in the target program before the exploit occurs, or be passed to it along with the exploit
 Exploits often target unsafe string functions such as strcpy and strcat, which do not enforce a maximum length (buffer overflow)
 Shellcode must look like valid data: strcpy and strcat stop copying at the first NULL byte, so a NULL byte in the middle of the shellcode ends the buffer overflow prematurely
 Encoding the payload lets it pass such filters (and makes analysis more difficult)
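The NULL-byte constraint and the simplest kind of encoding can be illustrated on harmless placeholder bytes. This is a sketch under stated assumptions: c_strcpy only imitates how strcpy stops copying, single-byte XOR stands in for whatever encoder a real sample uses, and all function names are hypothetical.

```python
def c_strcpy(src: bytes) -> bytes:
    """Return what strcpy would copy: everything before the first NULL byte."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the data."""
    return bytes(b ^ key for b in data)


def null_free_keys(data: bytes) -> list:
    """Keys whose encoded output has no NULL byte (a byte equal to the key becomes 0)."""
    present = set(data)
    return [k for k in range(1, 256) if k not in present]


# Placeholder data with an embedded NULL byte (not real shellcode).
payload = bytes([0x41, 0x00, 0x42, 0x43])
print(c_strcpy(payload))  # the copy stops at the NULL byte

key = null_free_keys(payload)[0]
encoded = xor_encode(payload, key)
assert 0 not in encoded
assert c_strcpy(encoded) == encoded          # the whole encoded buffer is copied
assert xor_encode(encoded, key) == payload   # decoding restores the original
```

When analyzing an encoded sample, the same XOR routine applied with the recovered key reverses the encoding.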
Buffer Overflow Attacks

 Return address stored on stack


 Attackers want to overwrite the return address with an address of their choosing, redirecting execution to the shellcode
 Attackers have to deal with two unknowns:
1. What is the distance between the overflowed buffer and the return address slot? Attackers have to guess the displacement.
2. What is the actual address of the shellcode? The shellcode is in the buffer, as part of the data.
 Attackers have to guess the shellcode address; NOP sleds increase the probability of a hit
NOP Sleds
 NOP: the no-operation instruction, which does nothing
 A NOP sled is a long sequence of NOPs preceding the shellcode
 It increases the likelihood of a hit by providing a whole range of addresses that result in the shellcode executing
 To avoid detection, a sled can instead be built from instructions with no net effect, such as repeatedly incrementing and decrementing registers
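From the defender's side, a plain sled is easy to spot because it is one long run of the same byte. The sketch below assumes the single-byte x86 NOP (0x90) and uses a made-up buffer; it would not catch a sled built from increment/decrement pairs as described above.

```python
def longest_nop_run(data: bytes, nop: int = 0x90):
    """Return (offset, length) of the longest run of the NOP byte in data."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, b in enumerate(data):
        if b == nop:
            if run_len == 0:
                run_start = i  # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # the run is broken
    return best_start, best_len


# Hypothetical captured buffer: filler, a 16-byte sled, then placeholder bytes.
buffer = b"AAAA" + b"\x90" * 16 + b"\xcc" * 4
offset, length = longest_nop_run(buffer)
print("longest NOP run at offset", offset, "with length", length)
```

A run length threshold (for example, flagging runs above a few dozen bytes) is a tuning choice left to the analyst.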
