
Lecture 5 – Control Hijacking and Defenses
CS Department
City University of Hong Kong

Slides partially adapted from lecture notes by M. Goodrich & R. Tamassia, W. Stallings & L. Brown, Dan Boneh, and Dawn Song.
An Information Security Short Course
1
(Summer 2020)
More Hijacking Opportunities

Double Free:
Can cause the memory manager to write data to a specific location

An Example on Double Free Coding Error

/* doublefree.c */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    char* ptr = (char*)malloc(8);

    free(ptr);
    free(ptr);   /* BUG: ptr has already been freed (double free) */

    printf("Everything is fine.\n");

    return 0;
}

More Hijacking Opportunities

Format String Bug

What is a format string?

printf("The magic number is: %d\n", 1911);

The text to be printed is “The magic number is:”, followed by a format parameter ‘%d’, which is replaced with the parameter (1911) in the output. Therefore the output looks like: The magic number is: 1911. In addition to %d, there are several other format parameters, each having a different meaning. The following table summarizes these common format parameters:

Parameter   Meaning                                   Passed as
-------------------------------------------------------------------
%d          decimal (int)                             value
%u          unsigned decimal (unsigned int)           value
%x          hexadecimal (unsigned int)                value
%s          string ((const) (unsigned) char *)        reference
%n          number of bytes written so far (int *)    reference

The Stack and Format Strings

The behavior of the format function is controlled by the format string. The function retrieves the parameters requested by the format string from the stack.

printf ("a has value %d, b has value %d, c is at address: %08x\n", a, b, &c);

[Figure: the stack at the call, in the order printf()’s internal pointer moves through it: address of the format string, value of a, value of b, address of c. The stack grows in the opposite direction.]
What if there is a mismatch

Mismatch between the format string and the actual arguments?

printf ("a has value %d, b has value %d, c is at address: %08x\n", a, b);

In the above example, the format string asks for 3 arguments, but the program actually provides only two (i.e. a and b).

[Figure: the stack at the call, in the order printf()’s internal pointer moves through it: address of the format string, value of a, value of b, address of c. The stack grows in the opposite direction.]
What if there is a mismatch
printf ("a has value %d, b has value %d, c is at address: %08x\n", a, b);

Can this program pass the compiler?

– The function printf() is defined as a function with a variable number of arguments. Judging by the number of arguments alone, everything looks fine.

– To find the mismatch, the compiler needs to understand how printf() works and what the format string means. Compilers are not required to do this kind of analysis (GCC and Clang can warn about literal format strings when -Wformat is enabled).

– Sometimes the format string is not a constant string; it is generated during the execution of the program. In that case there is no way for the compiler to find the mismatch.

What if there is a mismatch
printf ("a has value %d, b has value %d, c is at address: %08x\n", a, b);

Can printf() detect the mismatch?

– The function printf() fetches the arguments from the stack. If the format string needs 3 arguments, it will fetch 3 data items from the stack. Unless the stack is marked with a boundary, printf() does not know that it has run out of the arguments that were provided to it.

– Since there is no such marking, printf() will continue fetching data from the stack. In a mismatch case, it will fetch some data that does not belong to this function call.

/* guide.c */
#include <stdio.h>

void foo(int x) {
    int a = 1, b = 2, c = 3;

    /* correct: three conversions, three arguments */
    // printf("a has value %d, b has value %d, c is at address: %08x\n", a, b, &c);

    /* incorrect: three conversions, only two arguments */
    printf("a has value %d, b has value %d, c is at address: %08x", a, b);
}

int main(int argc, char* argv[]) {
    foo(1);
    return 0;
}

/* guide.c */
correct output >>>
a has value 1, b has value 2, c is at address: bffff334

(gdb) x/12 $esp
0xbffff320: 0x08048510 ==> format string “a has …”
            0x00000001 ==> value of a
            0x00000002 ==> value of b
            0xbffff334 ==> address of c
0xbffff330: 0xb7fc53e4
            0x00000003 <== at address 0xbffff334 (c)
            0x00000001 ==> local variables of foo()
            0x00000002
0xbffff340: 0xffffffff
            0xb7e53196
            0xbffff368
            0x08048438
/* guide.c */
incorrect output >>>
a has value 1, b has value 2, c is at address: 080482dd

(gdb) x/12 $esp
0xbffff320: 0x08048510 ==> format string “a has …”
            0x00000001 ==> value of a
            0x00000002 ==> value of b
            0x080482dd ==> printed memory content
0xbffff330: 0xb7fc53e4
            0x00000001
            0x00000002 ==> local variables of foo()
            0x00000003
0xbffff340: 0xffffffff
            0xb7e53196
            0xbffff368
            0x08048431
History
• First exploit discovered in June 2000.
• Examples:
– wu-ftpd 2.* : remote root
– Linux rpc.statd: remote root
– IRIX telnetd: remote root
– BSD chpass: local root
• Any function using a format string could be vulnerable.
Printing:
printf, fprintf, sprintf, …
vprintf, vfprintf, vsprintf, …
Logging:
syslog, err, warn
Viewing Memory at Any Location
int func(char *user) {
    fprintf(stderr, user);
}

Problem: what if *user = “%s%s%s%s…%s%s%s” ??

– Most likely the program will crash: DoS.
• Segmentation fault
– If not, the program may print memory contents. Privacy?
• An attacker can construct %s and %x format tokens, among others, to print data from the stack or possibly other locations in memory

Correct form: fprintf(stderr, “%s”, user);
Exploit Details
• If we use printf(%s) without specifying a memory address, the target address will be obtained from the stack anyway by the printf() function. The function maintains an initial stack pointer, so it knows the location of the parameters in the stack.

int main(int argc, char *argv[])
{
    char user_input[100];
    ... ... /* other variable definitions and statements */

    scanf("%s", user_input); /* getting a string from user */

    printf(user_input); /* Vulnerable place */

    return 0;
}
Exploit Details
• Observation: the format string is usually located on the stack. If we can encode the target address in the format string, the target address will be in the stack. In the following example, the format string is stored in a buffer, which is located on the stack.

int main(int argc, char *argv[])
{
    char user_input[100];
    ... ... /* other variable definitions and statements */

    scanf("%s", user_input); /* getting a string from user */

    printf(user_input); /* Vulnerable place */

    return 0;
}

• If we can force the printf to obtain the address from the format string (also on the stack), we can control the address.

printf("\x10\x01\x48\x08_%x_%x_%x_%x_%s");
Exploit Details
• Consequence: we use four %x to move the printf()’s pointer towards the address that we stored in the format string. Once we reach the destination, we will give %s to printf(), causing it to print out the contents in the memory address 0x10014808. The function printf() will treat the contents as a string, and print out the string until reaching the end of the string (i.e. 0).

[Figure: Print out the contents at the address 0x10014808 using the format-string vulnerability. user_input[] holds 0x10014808 followed by %x %x %x %x %s. The stack words between the address of user_input[] and the buffer itself are printed by the 1st through 4th %x. For %s: print out the contents pointed to by the address 0x10014808.]
Exploit Details
• Reasoning: The stack space between user_input[] and the address passed to the printf() function is not for printf(). However, because of the format-string vulnerability in the program, printf() considers them as the arguments to match with the %x in the format string.
• The key challenge in this attack is to figure out the distance between the user_input[] and the address passed to the printf() function. This distance decides how many %x you need to insert into the format string, before giving %s.

[Figure: Print out the contents at the address 0x10014808 using the format-string vulnerability. user_input[] holds 0x10014808 followed by %x %x %x %x %s. The stack words between the address of user_input[] and the buffer itself are printed by the 1st through 4th %x. For %s: print out the contents pointed to by the address 0x10014808.]
Writing to Memory at Any Location
• %n: The number of characters written so far is stored into the integer indicated by the corresponding argument.
int temp;
printf(“hello %n”, &temp);

• It causes printf() to write ‘6’ to temp.

• Using the same approach as that for viewing memory at any location, we can cause printf() to write an integer into any location. Just replace the %s in the above example with %n, and the contents at the address 0x10014808 will be overwritten.
Writing to Memory at Any Location
• Using this attack, attackers can do the following:
- Overwrite important program flags that control access privileges
- Overwrite return addresses on the stack, function pointers, etc.

• The value written is determined by the number of characters printed before %n is reached. How to write arbitrary integer values?
- Use dummy output characters. To write a value of 1000, a simple padding of 1000 dummy characters would do.
- To avoid long format strings, we can use a width specification of the format indicators.

printf(“%08x.%08x.%08x.%08x.%n”)
Defenses

Preventing hijacking attacks
1. Fix bugs:
– Audit software
• Automated tools: Coverity, Prefast/Prefix.
– Rewrite software in a type safe language (Java, ML)
• Difficult for existing (legacy) code …

2. Concede overflow, but prevent code execution

3. Add runtime code to detect overflow exploits
– Halt process when overflow exploit detected
– StackGuard, LibSafe, …
Defense I: Marking memory as non-executable (W^X)

Prevent attack code execution by marking stack and heap as non-executable
• NX-bit on AMD Athlon 64, XD-bit on Intel P4 Prescott
– NX bit in every Page Table Entry (PTE)
• Deployment:
– Linux (via PaX project); OpenBSD
– Windows: since XP SP2 (DEP)
• Boot.ini : /noexecute=OptIn or AlwaysOn
• Visual Studio: /NXCompat[:NO]

• Limitations:
– Some apps need executable heap (e.g. JITs).
– Does not defend against “return-to-libc” exploits
Examples: DEP controls in Windows

DEP terminating a program

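Attack: return to libc (a.k.a. arc injection)
• Control hijacking without executing injected code

[Figure: the stack (args, ret-addr, sfp, local buf) beside libc.so (exec(), printf(), “/bin/sh”). The overwritten ret-addr points into libc.so (e.g. exec()), and args supplies “/bin/sh”.]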

Question: where is libc.so located in memory?


e.g. libc

Arc injection vs. Code injection

Arc injection: control hijacking without injecting executable code
Code injection: control hijacking by injecting executable code

Defense II : Address randomization
• ASLR: (Address Space Layout Randomization)
– Start both stack and heap at a random location
– Map shared libraries to a random location in process memory
⇒ Attacker cannot jump directly to exec function
– Deployment: (/DynamicBase)
• Windows Vista: 8 bits of randomness for DLLs
– aligned to 64K page in a 16MB region ⇒ 256 choices
• Linux (via PaX): 16 bits of randomness for libraries
– More effective on 64-bit architectures

• Other randomization methods:
– Sys-call randomization: randomize sys-call id’s
– Instruction Set Randomization (ISR)
ASLR Example

Booting twice loads libraries into different locations:

Effectiveness and Limitations
• Limitations in NX*:
– Some apps need executable heap (e.g. JITs)
– Does not defend against “return-to-libc” exploits (a.k.a. arc injection)

• Limitations in ASLR: Randomness can still be limited

*: when applicable
Control Hijacking

Run-time Defenses

Run time checking:
Defense III: StackGuard
• Many run-time checking techniques …
– we only discuss methods relevant to overflow protection

• Solution 1: StackGuard
– Run time tests for stack integrity.
– Embed “canaries” in stack frames and verify their integrity
prior to function return.

|         Frame 2          |         Frame 1          |
| local canary sfp ret str | local canary sfp ret str |  top of stack
Canary Types
• Random canary:
– Random string chosen at program startup.
– Insert canary string into every stack frame.
– Verify canary before returning from function.
• Exit program if canary changed. Turns potential exploit into DoS.
– To corrupt, attacker must learn current random string.

• Terminator canary: Canary = {0, carriage return, line feed, EOF}
– String functions will not copy beyond terminator.
– Attacker cannot use string functions to corrupt stack.
StackGuard (Cont.)
• StackGuard implemented as a GCC patch.
– Program must be recompiled.

• Minimal performance effects: 8% for Apache.

• Note: Canaries don’t provide foolproof protection.
– Some stack smashing attacks leave canaries unchanged

• Heap protection: PointGuard.
– Protects function pointers and setjmp buffers by encrypting them: e.g. XOR with random cookie
– Less effective, more noticeable performance effects
StackGuard enhancements: ProPolice
• ProPolice (IBM) - gcc 3.4.1. (-fstack-protector)
– Rearrange stack layout to prevent ptr overflow.

Stack frame layout, from high to low addresses (strings grow upward in this list, the stack grows downward):
args
ret addr
SFP
CANARY
local string buffers
local non-buffer variables (pointers, but no arrays)
copy of pointer args

Protects pointer args and local pointers from a buffer overflow
MS Visual Studio /GS [since 2003]

Compiler /GS option:
– Combination of ProPolice and Random canary.
– If cookie mismatch, default behavior is to call _exit(3)

Function prolog:
sub esp, 8 // allocate 8 bytes for cookie
mov eax, DWORD PTR ___security_cookie
xor eax, esp // xor cookie with current esp
mov DWORD PTR [esp+8], eax // save in stack

Function epilog:
mov ecx, DWORD PTR [esp+8]
xor ecx, esp
call @__security_check_cookie@4
add esp, 8

Enhanced /GS in Visual Studio 2010:
– /GS protection added to all functions, unless can be proven unnecessary
/GS stack frame

Stack frame layout, from high to low addresses (strings grow upward in this list, the stack grows downward):
args
ret addr
SFP
exception handlers
CANARY
local string buffers
local non-buffer variables (pointers, but no arrays)
copy of pointer args

Canary protects ret-addr and exception handler frame
Effectiveness and Limitations
• Limitations:
– Evasion with exception handler

*: when applicable
Evading /GS with exception handlers
• When an exception is thrown, the dispatcher walks up the exception list until a handler is found (else it uses the default handler)

After overflow: handler points to attacker’s code
exception triggered ⇒ control hijack

Main point: exception is triggered before canary is checked

[Figure: Structured Exception Handling. Toward high memory, buf is followed by a chain of SEH frames, each holding a (next, handler) pair; the chain ends at 0xffffffff.]
Evading /GS with exception handlers (Cont.)

[Figure: the same SEH chain after buf has been overflowed. In the SEH frame next to buf, the handler field now holds a ptr to attack code; the remaining frames toward high memory keep their (next, handler) pairs and the chain still ends at 0xffffffff.]
Defense IV: SAFESEH and SEHOP
• /SAFESEH: linker flag
– Linker produces a binary with a table of safe exception handlers
– System will not jump to an exception handler that is not on the list

• /SEHOP: platform defense (since Windows Vista SP1)
– Observation: SEH attacks typically corrupt the “next” entry in SEH list.
– SEHOP: add a dummy record at top of SEH list
– When exception occurs, dispatcher walks up list and verifies dummy record is there. If not, terminates process.

Effectiveness and Limitations
• Limitations:
– Requires recompilation

*: when applicable
Summary: Canaries are not foolproof
• Canaries are an important defense tool, but do not prevent all
control hijacking attacks:
– Heap-based attacks still possible
– Integer overflow attacks still possible
– /GS by itself does not prevent Exception Handling attacks
(also need SAFESEH and SEHOP)

What if we can’t recompile: Libsafe
• Solution 2: Libsafe (Avaya Labs)
– Dynamically loaded library (no need to recompile app.)
– Intercepts calls to strcpy (dest, src)
• Validates sufficient space in current stack frame:
|frame-pointer – dest| > strlen(src)
• If so, does strcpy. Otherwise, terminates application

[ sfp | ret-addr | dest | src ] [ buf | sfp | ret-addr ]  top of stack
        Libsafe strcpy                   main


How robust is Libsafe?

low memory  [ sfp | ret-addr | dest | src ] [ buf | sfp | ret-addr ]  high memory
                    Libsafe strcpy                   main

A function pointer might be in the overwritten buf.

strcpy() can still overwrite a pointer between buf and sfp, causing hijacking attacks …

Effectiveness and Limitations
• Limitations:
– Limited protection.

More methods …
• StackShield
– At function prologue, copy return address RET and SFP to “safe” location (beginning of data segment)
– Upon return, check that RET and SFP are equal to the copies.
– Implemented as assembler file processor (GCC)
• Control Flow Integrity (CFI)
– A combination of static and dynamic checking
– Statically determine program control flow
– Dynamically enforce control flow integrity
Effectiveness and Limitations
• Many different kinds of attacks. Not one silver bullet defense.

End of Segment
