
UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science 2006

Exterminator: Automatically Correcting Memory Errors
Gene Novark, Emery Berger
UMass Amherst
Ben Zorn
Microsoft Research
Debugging Memory Errors
Billions of lines of deployed C/C++ code
Apps contain memory errors
Heap overflows
Dangling pointers
Notoriously hard to debug
Must reproduce bug, pinpoint cause
Average of 28 days between discovery of a remotely exploitable memory error and its patch [Symantec 2006]
Coping with memory errors
Unsound, may detect errors
Windows, GNU libc, Rx
Sound, always finds dynamic errors
CCured, CRED, SAFECode
Requires source modification
Valgrind, Purify
Order of magnitude slowdown
Probabilistically avoid errors
DieHard [Berger 2006]

Exterminator: automatically isolate and fix detected errors
DieHard Overview
Fully-randomized memory manager
Bitmap-based with random probing
Increases odds of benign memory errors
Different heap layouts across runs
Replication
Run multiple replicas simultaneously, vote on results
Increases reliability (hides bugs) by using more space
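
The voting step can be sketched as follows. This is a minimal illustration rather than DieHard's actual voter: the function name `vote` and the sample outputs are hypothetical, and a strict majority is assumed as the agreement rule.

```python
from collections import Counter

def vote(outputs):
    """Return the output produced by a strict majority of replicas, or None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Assumption: agreement means more than half of the replicas match.
    return value if count > len(outputs) / 2 else None

# One replica was corrupted by a memory error; the other two agree.
print(vote([b"result", b"garbage", b"result"]))
```

Because each replica has a different random heap layout, a memory error is unlikely to corrupt all replicas in the same way, which is what makes the vote informative.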
DieHard Heap Layout
Bitmap-based, segregated size classes
Each bit represents one object of a given size
i.e., one bit = $2^{i+3}$ bytes, etc.
malloc(): randomly probe bitmap for free space
free(): just reset bit
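
A minimal sketch of the bitmap scheme described above, assuming one bitmap per size class and a pseudo-random generator passed in by the caller. The names `SizeClassBitmap` and `size_class` are hypothetical and not taken from DieHard's source.

```python
import random

class SizeClassBitmap:
    """One size class: a bitmap in which each bit stands for one object slot."""

    def __init__(self, num_slots, rng):
        self.in_use = [False] * num_slots
        self.rng = rng

    def malloc(self):
        """Probe random slots until a free one is found; return its index."""
        if all(self.in_use):
            return None  # this size class is full
        while True:
            slot = self.rng.randrange(len(self.in_use))
            if not self.in_use[slot]:
                self.in_use[slot] = True
                return slot

    def free(self, slot):
        """Freeing only resets the bit; the memory itself is left untouched."""
        self.in_use[slot] = False

def size_class(nbytes):
    """Smallest class i such that 2**(i + 3) >= nbytes (class 0 holds 8 bytes)."""
    i = 0
    while (1 << (i + 3)) < nbytes:
        i += 1
    return i

heap = SizeClassBitmap(num_slots=16, rng=random.Random(1))
a, b = heap.malloc(), heap.malloc()
heap.free(a)
print(size_class(8), size_class(9), size_class(100))  # prints: 0 1 4
```

Random probing stays cheap only while the bitmap is sparsely occupied, which is why this approach trades extra space for reliability.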

[Figure: allocation bitmaps (e.g., 00000001 and 1010) for size classes $2^{i+3}$ and $2^{i+4}$, each mapped onto its region of the heap]

Exterminator Extensions
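[Figure: the DieHard heap (allocation bitmaps 00000001 and 1010 for size classes $2^{i+3}$ and $2^{i+4}$), extended by Exterminator with per-object metadata: object id (serial number), alloc site, dealloc site, and dealloc time. Example values shown: object ids 2, 1, 3; alloc sites A4, A8, A3; dealloc sites D6, D9; dealloc times 3, 2]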

The Exterminator System
[Figure: input is broadcast to three DieFast replicas (1, 2, 3), each with its own seed and its own correcting allocator; their outputs are combined by a vote; the error isolator produces runtime patches for the correcting allocators]
On failure, create heap images (core dump)
Isolator analyzes images, creates runtime patch
Correcting allocator corrects isolated errors:
pad allocations
extend object lifetimes
Exterminator Isolation Algorithm
Identify discrepancies
Compare valid object data
Find equivalent objects (same ID) with different contents
Find corrupted canaries (free space)
Check for possible buffer overflows
Check for dangling pointer error
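
The canary check on free space might look like the sketch below. It assumes the heap is modeled as a flat list of words and that the allocator records which free slots it filled with canaries. The constant `CANARY` and both function names are hypothetical.

```python
CANARY = 0xDEADBEEF  # hypothetical canary word; a random per-run value also works

def fill_with_canary(heap, slot, words_per_slot):
    """Overwrite one freed slot with the canary pattern."""
    start = slot * words_per_slot
    heap[start:start + words_per_slot] = [CANARY] * words_per_slot

def corrupted_canaries(heap, canaried_slots, words_per_slot):
    """Return the canaried slots whose contents no longer match the pattern."""
    bad = []
    for slot in sorted(canaried_slots):
        start = slot * words_per_slot
        if any(word != CANARY for word in heap[start:start + words_per_slot]):
            bad.append(slot)
    return bad

# Four 2-word slots; slots 1 and 3 are free and canaried.
heap = [0] * 8
for slot in (1, 3):
    fill_with_canary(heap, slot, words_per_slot=2)
heap[2] = 0x41414141  # an overflow from slot 0 spills into slot 1
print(corrupted_canaries(heap, {1, 3}, words_per_slot=2))  # prints [1]
```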
Comparing Object Data
Lots of valid reasons for data to differ
Pointers (random target locations)
File descriptors
Non-transparent use of pointers
e.g. Red-Black tree keyed on pointer value
Etc.
Exterminator identifies and ignores:
Values which differ across all replicas
Valid pointers referring to same target ID
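
The two filtering rules above can be sketched as follows. This is an illustrative simplification, not Exterminator's code: each object is modeled as a list of word values per replica, and `address_to_id` is a hypothetical per-replica map from addresses to object IDs.

```python
def suspicious_words(replica_words, address_to_id):
    """Return indices of words whose differences are not explained away.

    replica_words: one list of word values per replica, for the same object ID.
    address_to_id: one dict per replica mapping a valid address to an object ID.
    """
    suspects = []
    for index, values in enumerate(zip(*replica_words)):
        distinct = set(values)
        if len(distinct) == 1:
            continue  # identical in every replica: nothing to explain
        if len(distinct) == len(values):
            continue  # differs in all replicas: e.g., pointer or file descriptor
        targets = [address_to_id[r].get(v) for r, v in enumerate(values)]
        if targets[0] is not None and len(set(targets)) == 1:
            continue  # valid pointers referring to the same target ID
        suspects.append(index)  # some replicas agree, others do not
    return suspects

words = [
    [7, 0x1000, 42, 0x4000],  # replica 1
    [7, 0x2000, 42, 0x4000],  # replica 2
    [7, 0x3000, 99, 0x5000],  # replica 3: word 2 differs from the others
]
maps = [
    {0x1000: 5, 0x4000: 9},
    {0x2000: 5, 0x4000: 9},
    {0x3000: 5, 0x5000: 9},
]
print(suspicious_words(words, maps))  # prints [2]
```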
Error Isolation: Buffer Overflows
[Figure: heap layouts of three replicas. Replica 1 (2 4 5 3 1 6): malignant overflow. Replicas 2 & 3 (1 6 3 2 5 4): benign overflows]
1. Identify corrupt object
2. Search for source (δ = 1: no object; δ = 2: candidate!)
3. Compare data at same δ in the other replicas (match!)
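
The three steps can be sketched as a search over distances from the corrupted location. This is a simplified model under stated assumptions: each heap is a list of slots holding object IDs (None for free space), corruption is recorded per slot, only forward overflows are considered, and the function name and sample layouts are hypothetical.

```python
def find_overflow_source(layouts, corruption):
    """Return (source object ID, distance) or None.

    layouts[r][slot] is the object ID in that slot of replica r (None if free).
    corruption[r] maps a slot of replica r to the unexpected bytes found there.
    """
    # Step 1: take a corrupted location in the first replica.
    victim_slot, bad_bytes = next(iter(corruption[0].items()))
    # Step 2: walk backwards looking for an object that could have overflowed.
    for delta in range(1, victim_slot + 1):
        candidate = layouts[0][victim_slot - delta]
        if candidate is None:
            continue  # no object at this distance
        # Step 3: the same object must show the same corruption at the
        # same distance in every other replica.
        if all(
            corruption[r].get(layouts[r].index(candidate) + delta) == bad_bytes
            for r in range(1, len(layouts))
        ):
            return candidate, delta
    return None

layouts = [
    [2, None, 5, 3, 1, 6],  # replica 1: overflow from object 2 hits object 5
    [1, 6, 3, 2, 5, None],  # replica 2: same overflow lands in free space
    [6, 2, 3, None, 1, 5],  # replica 3: same overflow lands in free space
]
corruption = [{2: b"XX"}, {5: b"XX"}, {3: b"XX"}]
print(find_overflow_source(layouts, corruption))  # prints (2, 2)
```

Since layouts are randomized independently, an unrelated object is unlikely to sit at the same distance from identical corruption in every replica, so a match is strong evidence.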
Error Isolation: Dangling Pointer
[Figure: heap layouts of two replicas (2 4 5 3 1 6 and 1 6 3 2 5 4); object 5 is labeled "Freed, Canary value" and is the target of a dangled ptr]
Corrupted canary values for object 5
Same object, same corruption
Buffer overflow?
Source object would be at same distance δ in all replicas
Unlikely: probability about $(1/H)^k$
Assume dangling pointer
Extend lifetime of object
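
To give a feel for how quickly a quantity of the form (1/H)^k shrinks, here is a small worked computation. It assumes H counts the object slots in the heap and k the number of replicas; the value 1024 is purely illustrative and the function name is hypothetical.

```python
def same_distance_probability(heap_slots, replicas):
    """Chance that a 1-in-H event repeats independently in each of k replicas."""
    return (1.0 / heap_slots) ** replicas

for k in (2, 3, 4):
    print(k, same_distance_probability(1024, k))
```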

Error Isolation: Dangling ptr read
What if the program doesn't write through the dangled pointer?
DieFast overwrites freed objects
Canaries produce invalid reads, crashes
How to identify prematurely freed objects?
Common case 1: read something that was a pointer, dereference it
Common case 2: read numeric value, error propagates through computation
No information: previous contents destroyed!
Error Isolation: Dangling ptr read
Solution: Write canaries randomly (half the time)
Equivalent to extending object lifetime (until overwritten)
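[Figure: four replicas per case; in each, the freed object is either overwritten with canaries or left with its data intact]
Legal free: OK in every replica
Illegal free (later read + deref ptr): OK where the data is intact, CRASH! where canaries were written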
Error Isolation: Dangling ptr read
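Correct frees are uncorrelated with crashes:
$\Pr(\text{crash} \mid \text{canaried}[i]) = \Pr(\text{crash})$
$\Pr(\text{crash} \wedge \text{canaried}[i]) = 0.5 \cdot \Pr(\text{crash}) \le 0.5$
For each object i, compute an estimator P of $\Pr(\text{crash} \wedge \text{canaried}[i])$ across the replicas
P > 0.5: dangling pointer error
Create patch when confidence reaches threshold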
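
The intuition can be checked with a small simulation. The estimator used here, the fraction of crashing runs in which object i had been canaried, is an assumed stand-in chosen for illustration and is not necessarily the exact formula from the slide. The crash model and the function name are hypothetical as well.

```python
import random

def canaried_fraction_among_crashes(num_objects, dangling_id, runs, seed=0):
    """Per object: fraction of crashing runs in which it had been canaried."""
    rng = random.Random(seed)
    crashes = 0
    canaried_and_crashed = [0] * num_objects
    for _ in range(runs):
        # Each freed object is overwritten with canaries half the time.
        canaried = [rng.random() < 0.5 for _ in range(num_objects)]
        # Assumption: a run crashes exactly when the prematurely freed
        # object was canaried and is later read through the dangling pointer.
        if canaried[dangling_id]:
            crashes += 1
            for i, was_canaried in enumerate(canaried):
                canaried_and_crashed[i] += was_canaried
    if crashes == 0:
        return None
    return [count / crashes for count in canaried_and_crashed]

estimates = canaried_fraction_among_crashes(num_objects=5, dangling_id=3, runs=2000)
for i, p in enumerate(estimates):
    print(i, round(p, 2), "dangling pointer suspect" if p > 0.75 else "")
```

Correctly freed objects hover near 0.5 because their canarying is independent of the crash, while the prematurely freed object stands out. The 0.75 cutoff is an arbitrary illustration of a confidence threshold.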
Runtime Patches
Overflow patches
Allocation callsite
Overflow amount
Dangling pointer patches
Allocation & Deallocation callsites
Lifetime extension
Correcting Allocator
Extended DieHard allocator
Reads runtime patches
Stores pad table & deferral table
On free:
Check for life extension for current object
Place ptr, time on deferral priority queue
On allocation:
Check for overflow fix for current callsite
Check deferral queue for pending frees
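
A minimal sketch of how the pad table and deferral queue could interact. Several things here are assumptions made for illustration: time is measured in allocations, the underlying allocator is a stub that hands out integers, and the class, method, and callsite names (A4, A8, D9, ...) are hypothetical labels.

```python
import heapq

class CorrectingAllocator:
    def __init__(self, pad_table, deferral_table):
        self.pad_table = pad_table            # alloc site -> extra bytes
        self.deferral_table = deferral_table  # (alloc site, free site) -> delay
        self.clock = 0                        # assumed unit: allocations so far
        self.deferred = []                    # priority queue of (due time, ptr)
        self.alloc_site_of = {}
        self.next_ptr = 0
        self.released = []                    # stub: pointers actually freed

    def _really_malloc(self, size):
        self.next_ptr += 1
        return self.next_ptr - 1

    def _really_free(self, ptr):
        self.released.append(ptr)
        del self.alloc_site_of[ptr]

    def malloc(self, size, site):
        self.clock += 1
        # Perform deferred frees whose extended lifetime has expired.
        while self.deferred and self.deferred[0][0] <= self.clock:
            _, ptr = heapq.heappop(self.deferred)
            self._really_free(ptr)
        # Apply the overflow fix, if any, for this allocation callsite.
        padded = size + self.pad_table.get(site, 0)
        ptr = self._really_malloc(padded)
        self.alloc_site_of[ptr] = site
        return ptr, padded

    def free(self, ptr, site):
        delay = self.deferral_table.get((self.alloc_site_of[ptr], site), 0)
        if delay:
            heapq.heappush(self.deferred, (self.clock + delay, ptr))
        else:
            self._really_free(ptr)

alloc = CorrectingAllocator({"A4": 6}, {("A8", "D9"): 2})
p, padded = alloc.malloc(10, "A4")   # padded to 16 bytes
q, _ = alloc.malloc(32, "A8")
alloc.free(q, "D9")                  # deferred, not yet released
alloc.malloc(8, "A3")
alloc.malloc(8, "A3")                # q is released during this call
print(padded, alloc.released)        # prints: 16 [1]
```

A priority queue keyed on the due time keeps the check on each allocation cheap: only the head of the queue needs to be inspected.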
Results
Analytical results
Empirical results
Runtime overhead
Error detection
Injected faults
Real application (Squid)
Analytic results summary
Buffer overflows
False negative & positive rates decrease exponentially with # of replicas
Dangling pointers
Write: exponentially low false +/- rate
Read-only:
Confidence threshold controls false positive rate, # replicas needed to identify culprit

Empirical Results: Runtime
[Figure: Exterminator Overhead. Bar chart of Normalized Execution Time (y-axis 0 to 2.5), GNU libc vs. Exterminator. Allocation-intensive benchmarks: cfrac, espresso, lindsay, p2c, roboop. SPECint2000 benchmarks: 164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 253.perlbmk, 254.gap, 255.vortex, 256.bzip2, 300.twolf. Final bar: Geometric mean]
Empirical Results: Overflows
[Figure: Buffer Overflow Isolation. Bar chart of Images Required (%) (y-axis 0% to 100%) by Overflow Size (4, 8, 16); legend: 3 images, 4 images, 5 or more]
Empirical Results: Dangling Pointers
[Figure: Dangling Pointer Isolation. Bar chart (y-axis 0% to 100%) by Number of Images; categories: 3, 19, failed]
Empirical Results: Squid
Squid web cache heap overflow
Remotely exploitable
Crashes glibc 2.8.0 and BDW collector
DieFast detects error immediately
Corrupted canary past overflowed object
Exterminator's isolator generates an object pad of 6 bytes, fixing the overflow
Conclusion
Randomization + Replication = Information
Randomization → bugs have different effects
Exterminator exploits different effects across heaps to isolate cause
Low overhead
Automatically fix bugs in deployed programs
Breaks crash-debug-patch cycle
Create 0-day patches for 0-day bugs
Questions?
http://www.cs.umass.edu/~gnovark/