You are on page 1of 61

Lecture 4-7 – Control Hijacking

and Defenses
Dr. Cong Wang
CS Department
City University of Hong Kong
Slides partially adapted from lecture notes by M. Goodrich&R. Tamassia,
W. Stallings&L. Brown, Dan Boneh, and Dawn Song.
CS4293 Topics on Cybersecurity 1
More Hijacking Opportunities

Double Free:
Can cause memory mgr to write data to specific location

CS4293 Topics on Cybersecurity 2


An Example on Double Free Coding Error

CS4293 Topics on Cybersecurity 3


/* doublefree.c */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {

char* ptr = (char*)malloc (8);

free(ptr);
free(ptr);

printf("Everything is fine.\n");

return 0;
}

CS4293 Topics on Cybersecurity 4


CS4293 Topics on Cybersecurity 5
More Hijacking Opportunities

Format String Bug

CS4293 Topics on Cybersecurity 6


close(fp);
} else {
printf("Open failed!\n");

What is a format string?


}
}

3 printf
Guidelines("The magic number is: %d\n", 1911);
The text to be printed is “The magic number is:”, followed by a format
3.1 What is a format string?
parameter ‘%d’, which is replaced with the parameter (1911) in the output.
Therefore the printf
output("The magic number is: %d\n", 1911);
looks like:
The text to be printed isThe magic
“The magic number
number is:
is:”, followed 1911.parameter ‘%d’, which is replaced
by a format
with the parameter (1911) in the output. Therefore the output looks like: The magic number is: 1911. In
addition to %d, there are several other format parameters, each having different meaning. The following
common format parameters
table summarizes these format parameters:

Parameter Meaning Passed as


-------------------------------------------------------------------
%d decimal (int) value
%u unsigned decimal (unsigned int) value
%x hexadecimal (unsigned int) value
%s string ((const) (unsigned) char *) reference
%n number of bytes written so far, (* int) reference

CS4293 Topics on Cybersecurity 7


he Stack and Format Strings
The Stack and Format Strings
avior of the format function is controlled by the format string. The function retrieves the parame
ed by the format string from the stack.
The behavior of the format function is controlled by the format string. The
tf ("a function retrieves%d,
has value the parameters
b has value requested
%d, byc the
is format string in stack.
at address: %08x\n",
a, b, &c);
printf ("a has value %d, b has value %d, c is at
address: %08x\n", a, b, &c);
Stack Stack grows in this direction

Format String
Adress of c

Value of b

Value of a

Address of

Moving in this direction

printf()’s internal
pointer
CS4293 Topics on Cybersecurity 8
ED Labs – Format String Vulnerability Lab 5

What if there is a miss-match


The Stack and Format Strings
Miss-match
e behavior between
of the format function the format
is controlled stringstring.
by the format andThe
thefunction
actual arguments
retrieves ?
the parameters
uested printf
by the format("a
stringhas
from value
the stack. %d, b has value %d, c is at
address: %08x\n", a, b);
intf ("a has value %d, b has value %d, c is at address: %08x\n",
a, b, &c);
In the above example, the format string asks for 3 arguments, but
the program actually provides only two (i.e. a and b).
Stack Stack grows in this direction

Format String
Adress of c

Value of b

Value of a

Address of

Moving in this direction

printf()’s internal
pointer
CS4293 Topics on Cybersecurity 9
What if there is a miss-match
printf ("a has value %d, b has value %d, c is at
address: %08x\n", a, b);

Can this program pass the compiler?


– The function printf()is defined as function with variable length of
arguments. By looking at the number of arguments, everything looks fine.

– To find the miss-match, compilers needs to understand how printf()


works and what the meaning of the format string is. However, compilers usually
do not do this kind of analysis.

– Sometimes, the format string is not a constant string, it is generated during the
execution of the program. Therefore, there is no way for the compiler to find the
miss-match in this case.

CS4293 Topics on Cybersecurity 10


What if there is a miss-match
printf ("a has value %d, b has value %d, c is at
address: %08x\n", a, b);

Can printf() detect the miss-match?

– The function printf()fetches the arguments from the stack. If the format
string needs 3 arguments, it will fetch 3 data items from the stack. Unless the
stack is marked with a boundary, printf()does not know that it runs out of
the arguments that are provided to it.

– Since there is no such a marking, printf()will continue fetching data from


the stack. In a miss-match case, it will fetch some data that do not belong to this
function call.

CS4293 Topics on Cybersecurity 11


/* guide.c */
#include <stdio.h>

void foo(int x) {
int a = 1, b = 2, c = 3;

//printf("a has value %d, b has value %d, c is at


address: %08x\n", a, b, &c); // correct

printf("a has value %d, b has value %d, c is at


address: %08x", a, b); // incorrect
}

int main(int argc, char* argv[]) {


foo(1);
return 0;
}

CS4293 Topics on Cybersecurity 12


/* guide.c */
correct output >>>
a has value 1, b has value 2, c is at address: bffff334

(gdb) x/12 $esp

0xbffff320: 0x08048510 ==> format string “a has …”


0x00000001 ==> value of a
0x00000002 ==> value of b
0xbffff334 ==> address of c
0xbffff330: 0xb7fc53e4
0xbffff334==> 0x00000003
0x00000001 ==> local variables of foo()
0x00000002
0xbffff340: 0xffffffff
0xb7e53196
0xbffff368
0x08048438
CS4293 Topics on Cybersecurity 13
/* guide.c */
incorrect output >>>
a has value 1, b has value 2, c is at address: 080482dd

(gdb) x/12 $esp

0xbffff320: 0x08048510 ==> format string “a has …”


0x00000001 ==> value of a
0x00000002 ==> value of b
0x080482dd ==> printed memory content
0xbffff330: 0xb7fc53e4
0x00000001
0x00000002 ==> local variables of foo()
0x00000003
0xbffff340: 0xffffffff
0xb7e53196
0xbffff368
0x08048431
CS4293 Topics on Cybersecurity 14
History
• First exploit discovered in June 2000.
• Examples:
– wu-ftpd 2.* : remote root
– Linux rpc.statd: remote root
– IRIX telnetd: remote root
– BSD chpass: local root
• Any function using a format string could be vulnerable.
Printing:
printf, fprintf, sprintf, …
vprintf, vfprintf, vsprintf, …
Logging:
syslog, err, warn
CS4293 Topics on Cybersecurity 15
Viewing Memory at Any Location
int func(char *user) {
fprintf( stderr, user);
}

Problem: what if *user = “%s%s%s%s…%s%s%s” ??


– Most likely program will crash: DoS.
• Segmentation fault
– If not, program may print memory contents. Privacy?
• construct the %s and %x format tokens, among others, to
print data from the stack or possibly other locations in
memory

Correct form: fprintf( stdout, “%s”, user);


CS4293 Topics on Cybersecurity 16
• We have to supply an address of the memory. However, we cannot change the co
supply the format string.
Exploit Details
• If we use printf(%s) without specifying a memory address, the target address
from
• Ifthewe
stack
useanyway by the printf()
printf(%s) withoutfunction. The afunction
specifying memory maintains
address,an init
so it knows the location of the parameters in the stack.
the target address will be obtained from the stack anyway by the
• Observation: the format stringThe
printf()function. function
is usually maintains
located an initial
on the stack. stack
If we can encode the
the format string,
pointer, so the target address
it knows will be in
the location ofthe
thestack. In the following
parameters in the example,
stack. the
stored in a buffer, which is located on the stack.
int main(int argc, char *argv[])
{
char user_input[100];
... ... /* other variable definitions and statements */

scanf("%s", user_input); /* getting a string from user */


printf(user_input); /* Vulnerable place */

return 0;
}
CS4293 Topics on Cybersecurity 17
Exploit Details
• If we use printf(%s) without specifying a memory address, the target addres
from the stack anyway by the printf() function. The function maintains an in
so it knows the location of the parameters in the stack.
• Observation: the format string is usually located on the stack. If
we can encode
• Observation: the formatthestring
target address
is usually in the
located onformat string,
the stack. If we the
can target
encode th
the address will the
format string, be target
in theaddress
stack.will
In the
be infollowing
the stack. example, the format
In the following example, t
stored in a is
string buffer,
storedwhich
in aisbuffer,
located which
on the stack.
is located on the stack...
int main(int argc, char *argv[])
{
char user_input[100];
... ... /* other variable definitions and statements */

scanf("%s", user_input); /* getting a string from user */


printf(user_input); /* Vulnerable place */

return 0;
}
• If we can force the printf to obtain the address from the format
string (also on the stack), we can control the address.
• If we can force the printf to obtain the address from the format string (also on t
printf("\x10\x01\x48\x08_%x_%x_%x_%x_%s");
control the address.
CS4293 Topics on Cybersecurity 18
Exploit Details
• Consequence: we use four %x to move the printf()’s pointer towards the
ED Labs – Format String Vulnerability Lab 7
address that we stored in the format string. Once we reach the destination, we
abs – Format String Vulnerability Lab 7
will give %s to printf(), causing it to print out the contents in the memory
address 0x10014808. The function printf()will treat the contents as a
Print out the contents at the address 0x10014808 using format-string vlunerability
Printstring,
out the and print
contents at out the string
the address until reaching
0x10014808 the endvlunerability
using format-string of the string (i.e. 0).
user_input [ ]
user_input [ ]

Address of user_input [ ]
Address of user_input [ ]
0x10014808
0x10014808
%s

%x

%x

%x

%x

...
%s

%x

%x

%x

%x

...

Print this Print this


Print this Print this
for the 4th %x for the 1st %x
for the 4th %x for the 1st %x
For %s: print out the contentsCS4293
pointed by on
Topics this address
Cybersecurity 19
For %s: print out the contents pointed by this address
Exploit Details
• Reasoning: The stack space between user_input[]and the address passed
to the printf()function is not for printf(). However, because of the
format-string vulnerability in the program, printf()considers them as the
arguments to match with the %x in the format string.
SEED Labs – Format String Vulnerability Lab 7
• TheLabs
SEED key–challenge
Format StringinVulnerability
this attack Labis to figure out the distance between the 7

user_input[]and the address passed to the printf()function. This


Print out the contents at the address 0x10014808 using format-string vlunerability
distance decides how many %x you need to insert into the format string,
Print out the contents at the address 0x10014808 using format-string vlunerability
before giving %s. user_input [ ]
user_input [ ]

Address of user_input [ ]
Address of user_input [ ]
0x10014808
0x10014808
%s

%x

%x

%x

%x
...
%s

%x

%x

%x

%x

...

Print this Print this


Print this Print this
for the 4th %x for the 1st %x
for the 4th %x for the 1st %x
For %s: print outCS4293 Topics pointed
the contents on Cybersecurity
by this address 20
For %s: print out the contents pointed by this address
Writing to Memory at Any Location
• %n: The number of characters written so far is stored into the
integer indicated by the corresponding argument.
int temp;
printf( “hello %n”, &temp).

• It causes printf() to write ‘6’ to temp

• Using the same approach as that for viewing memory at any


location, we can cause printf()to write an integer into any
location. Just replace the %s in the above example with %n, and
the contents at the address 0x10014808 will be overwritten.
CS4293 Topics on Cybersecurity 21
Writing to Memory at Any Location
• Using this attack, attackers can do the following:
- Overwrite important program flags that control access privileges
- Overwrite return addresses on the stack, function pointers, etc.

• The value written is determined by the # of chars printed %n is


reached. How to write arbitrary integer values?
- Use dummy output characters. To write a value of 1000, a simple padding of
1000 dummy characters would do.
- To avoid long format strings, we can use a width specification of the format
indicators.

printf(“%08x.%08x.%08x.%08x.%n”)

CS4293 Topics on Cybersecurity 22


Defenses

CS4293 Topics on Cybersecurity 23


Preventing hijacking attacks
1. Fix bugs:
– Audit software
• Automated tools: Coverity, Prefast/Prefix.
– Rewrite software in a type safe languange (Java, ML)
• Difficult for existing (legacy) code …

2. Concede overflow, but prevent code execution

3. Add runtime code to detect overflows exploits


– Halt process when overflow exploit detected
– StackGuard, LibSafe, …
CS4293 Topics on Cybersecurity 24
Defense I: Marking memory as non-execute (W^X)

Prevent attack code execution by marking stack and heap as


non-executable
• NX-bit on AMD Athlon 64, XD-bit on Intel P4 Prescott
– NX bit in every Page Table Entry (PTE)
• Deployment:
– Linux (via PaX project); OpenBSD
– Windows: since XP SP2 (DEP)
• Boot.ini : /noexecute=OptIn or AlwaysOn
• Visual Studio: /NXCompat[:NO]

• Limitations:
– Some apps need executable heap (e.g. JITs).
– Does not defend against `return-to-libc’ exploits
CS4293 Topics on Cybersecurity 25
Examples: DEP controls in Windows

DEP terminating a program

CS4293 Topics on Cybersecurity 26


Attack: return to libc (aka. Arc injection)
• Control hijacking without executing code

stack libc.so

args
ret-addr exec()
sfp printf()

local buf “/bin/sh”

Question: where is libc.so located in memory?


CS4293 Topics on Cybersecurity 27
e.g. libc

CS4293 Topics on Cybersecurity 28


Arc injection vs. Code injection

Control hijacking without Control hijacking with


injecting executing code injecting executing code

CS4293 Topics on Cybersecurity 29


Defense II : Address randomization
• ASLR: (Address Space Layout Randomization)
– Start both stack and heap at a random location
– Map shared libraries to rand location in process memory
Þ Attacker cannot jump directly to exec function
– Deployment: (/DynamicBase)
• Windows Vista: 8 bits of randomness for DLLs
– aligned to 64K page in a 16MB region Þ 256
choices
• Linux (via PaX): 16 bits of randomness for libraries

– More effective on 64-bit architectures

• Other randomization methods:


– Sys-call randomization: randomize sys-call id’s
– Instruction Set Randomization (ISR)
CS4293 Topics on Cybersecurity 30
ASLR Example

Booting twice loads libraries into different locations:

CS4293 Topics on Cybersecurity 31


Effectiveness and Limitations
• Limitations in NX*:
– some apps need executable heap (e.g. JITs)
– Does not defend against “return-to-libc” exploits (aka. Arc injection)

• Limitations in ASLR: Randomness can still be limited

*: when applicable
CS4293 Topics on Cybersecurity 32
Control Hijacking

Run-time Defenses

CS4293 Topics on Cybersecurity 33


Run time checking:
Defense III: StackGuard
• Many run-time checking techniques …
– we only discuss methods relevant to overflow protection

• Solution 1: StackGuard
– Run time tests for stack integrity.
– Embed “canaries” in stack frames and verify their integrity
prior to function return.

Frame 2 Frame 1
top
local canary sfp ret str local canary sfp ret str of
stack

CS4293 Topics on Cybersecurity 34


Canary Types
• Random canary:
– Random string chosen at program startup.
– Insert canary string into every stack frame.
– Verify canary before returning from function.
• Exit program if canary changed. Turns potential
exploit into DoS.
– To corrupt, attacker must learn current random
string.

• Terminator canary: Canary = {0, newline, linefeed,


EOF}
– String functions will not copy beyond
terminator.
– Attacker cannot use string functions to corrupt
stack.
CS4293 Topics on Cybersecurity 35
StackGuard (Cont.)
• StackGuard implemented as a GCC patch.
– Program must be recompiled.

• Minimal performance effects: 8% for Apache.

• Note: Canaries don’t provide full proof protection.


– Some stack smashing attacks leave canaries unchanged

• Heap protection: PointGuard.


– Protects function pointers and setjmp buffers by
encrypting them: e.g. XOR with random cookie
– Less effective, more noticeable performance effects
CS4293 Topics on Cybersecurity 36
StackGuard enhancements: ProPolice
• ProPolice (IBM) - gcc 3.4.1. (-fstack-protector)
– Rearrange stack layout to prevent ptr overflow.
args, ret addr, SFP, are the
String args sensitive components.
Growth CANARY is protecting them
ret addr Protects pointer args and local
pointers from a buffer overflow
SFP
CANARY

Stack local string buffers


Growth pointers, but no arrays
local non-buffer variables
copy of pointer args
CS4293 Topics on Cybersecurity 37
MS Visual Studio /GS [since 2003]

Compiler /GS option:


– Combination of ProPolice and Random canary.
– If cookie mismatch, default behavior is to call _exit(3)

Function prolog: Function epilog:


sub esp, 8 // allocate 8 bytes for cookie mov ecx, DWORD PTR [esp+8]
mov eax, DWORD PTR ___security_cookie xor ecx, esp
xor eax, esp // xor cookie with current esp call @__security_check_cookie@4
mov DWORD PTR [esp+8], eax // save in stack add esp, 8

Enhanced /GS in Visual Studio 2010:


– /GS protection added to all functions, unless can be proven unnecessary
CS4293 Topics on Cybersecurity 38
/GS stack frame

args
String
Growth ret addr
Canary protects ret-addr and
SFP exception handler frame

exception handlers

CANARY

Stack local string buffers


Growth pointers, but no arrays
local non-buffer variables
copy of pointer args
CS4293 Topics on Cybersecurity 39
Effectiveness and Limitations
• Limitations:
– Evasion with exception handler

*: when applicable
CS4293 Topics on Cybersecurity 40
=avoid
Evading /GS with exception handlers
• When exception is thrown, dispatcher walks up exception list
until handler is found (else use default handler)

After overflow: handler points to attacker’s code


exception triggered ⇒ control hijack

Main point: exception is triggered before canary is checked


0xffffffff
Structured Exception Handling SEH frame SEH frame

high
next handler buf next handler next handler mem

CS4293 Topics on Cybersecurity 41


Evading /GS with exception handlers
• When exception is thrown, dispatcher walks up exception list
until handler is found (else use default handler)

After overflow: handler points to attacker’s code


exception triggered ⇒ control hijack

Main point: exception is triggered before canary is checked


0xffffffff
Structured Exception Handling SEH frame SEH frame

high
ptr to mem
next handler buf next
next handler next handler
attack code
CS4293 Topics on Cybersecurity 42
Defense IV: SAFESEH and SEHOP
• /SAFESEH: linker flag
– Linker produces a binary with a table of safe exception handlers
– System will not jump to exception handler not on list

• /SEHOP: platform defense (since win vista SP1)


– Observation: SEH attacks typically corrupt the “next” entry in SEH list.
– SEHOP: add a dummy record at top of SEH list
– When exception occurs, dispatcher walks up list and verifies dummy
record is there. If not, terminates process.

CS4293 Topics on Cybersecurity 43


Effectiveness and Limitations
• Limitations:
– Requires recompilation

*: when applicable
CS4293 Topics on Cybersecurity 44
Summary: Canaries are not full proof
• Canaries are an important defense tool, but do not prevent all
control hijacking attacks:
– Heap-based attacks still possible
– Integer overflow attacks still possible
– /GS by itself does not prevent Exception Handling attacks
(also need SAFESEH and SEHOP)

CS4293 Topics on Cybersecurity 45


What if can’t recompile: Libsafe
• Solution 2: Libsafe (Avaya Labs)
– Dynamically loaded library (no need to recompile app.)
– Intercepts calls to strcpy (dest, src)
• Validates sufficient space in current stack frame:
|frame-pointer – dest| > strlen(src)
• If so, does strcpy. Otherwise, terminates application

top
sfp ret-addr dest src buf sfp ret-addr of
stack

Libsafe strcpy main


CS4293 Topics on Cybersecurity 46
How robust is Libsafe?

low high
memory sfp ret-addr dest src buf sfp ret-addr memory

Libsafe strcpy main

a function pointer might be in the overwritten buf

strcpy() can still overwrite a pointer between buf and sfp,


causing hijacking attacks …

CS4293 Topics on Cybersecurity 47


Effectiveness and Limitations
• Limitations:
– Limited protection.

CS4293 Topics on Cybersecurity 48


More methods …
Ø StackShield
§ At function prologue, copy return address RET and
SFP to “safe” location (beginning of data segment)
§ Upon return, check that RET and SFP is equal to
copy.
§ Implemented as assembler file processor (GCC)
Ø Control Flow Integrity (CFI)
§ A combination of static and dynamic checking
§ Statically determine program control flow
§ Dynamically enforce control
CS4293 Topics flow integrity
on Cybersecurity 49
Effectiveness and Limitations
• Many different kinds of attacks. Not one silver bullet
defense.

CS4293 Topics on Cybersecurity 50


Control Hijacking

Advanced
Hijacking Attacks

CS4293 Topics on Cybersecurity 51


Heap Spray Attacks

A reliable method for exploiting heap overflows

CS4293 Topics on Cybersecurity 52


Heap-based control hijacking
• Compiler generated function pointers (e.g. C++ code)

FP1
method #1

ptr FP2 method #2


FP3
method #3
data vtable

Object T

• Suppose vtable is on the heap next to a string object:

data
buf[256] vtable

ptr
CS4293 Topics on Cybersecurity 53
object T
Heap-based control hijacking
• Compiler generated function pointers (e.g. C++ code)

FP1
method #1

ptr FP2 method #2


FP3
method #3
data vtable

Object T
shell
code
• After overflow of buf we have:

data
buf[256] vtable

ptr
CS4293 Topics on Cybersecurity 54
object T
A reliable exploit?
<SCRIPT language="text/javascript">
shellcode = unescape("%u4343%u4343%...");
overflow-string = unescape(“%u2332%u4276%...”);
cause-overflow( overflow-string ); // overflow buf[ ]
</SCRIPT>

Problem: attacker does not know where browser


places shellcode on the heap

???

data
buf[256] vtable

ptr
shellcode
CS4293 Topics on Cybersecurity 55
Heap Spraying [SkyLined 2004]

Idea: 1. use Javascript to spray heap


with shellcode (and NOP slides)
2. then point vtable ptr anywhere in spray area

NOP slide shellcode

heap
vtable

heap spray area


CS4293 Topics on Cybersecurity 56
Javascript heap spraying
var nop = unescape(“%u9090%u9090”)
while (nop.length < 0x100000) nop += nop

var shellcode = unescape("%u4343%u4343%...");

var x = new Array ()


for (i=0; i<1000; i++) {
x[i] = nop + shellcode;
}

• Pointing func-ptr almost anywhere in heap will


cause shellcode to execute.
CS4293 Topics on Cybersecurity 57
Many heap spray exploits
[RLZ’08]

• Improvements: Heap Feng Shui [S’07]


– Reliable heap exploits on IE without spraying
– Gives attacker full control of IE heap from Javascript
CS4293 Topics on Cybersecurity 58
(partial) Defenses
• Protect heap function pointers (e.g. PointGuard)

• Better browser architecture:


– Store JavaScript strings in a separate heap from browser heap

• OpenBSD heap overflow protection:


prevents
cross-page
overflows

non-writable pages

• Nozzle [RLZ’08] : detect sprays by prevalence of code on heap


CS4293 Topics on Cybersecurity 59
References on heap spraying
[1] Heap Feng Shui in Javascript,
by A. Sotirov, Blackhat Europe 2007

[2] Engineering Heap Overflow Exploits with JavaScript


M. Daniel, J. Honoroff, and C. Miller, WooT 2008

[3] Nozzle: A Defense Against Heap-spraying Code Injection


Attacks,
by P. Ratanaworabhan, B. Livshits, and B. Zorn

[4] Interpreter Exploitation: Pointer inference and JiT spraying,


by Dion Blazakis
CS4293 Topics on Cybersecurity 60
End of Segment

CS4293 Topics on Cybersecurity 61

You might also like