You are on page 1of 12

Bounds-Checking Entire Programs without Recompiling

Nicholas Nethercote

Jeremy Fitzhardinge

Computer Laboratory
University of Cambridge
United Kingdom

San Francisco
United States

jeremy@goop.org

njn25@cam.ac.uk

ABSTRACT
This paper presents a new te hnique for performing bounds he king. We use dynami binary instrumentation to modify
programs at run-time, tra k pointer bounds information and
he k all memory a esses. The te hnique is neither sound
nor omplete, however it has several very useful hara teristi s: it works with programs written in any programming
language, and it requires no ompiler support, no ode re ompilation, no sour e ode, and no spe ial treatment for
libraries. The te hnique performs best when debug information and symbol tables are present in the ompiled program,
but degrades gra efully when this information is missing|
fewer errors are found, but false positives do not in rease.
We des ribe our prototype implementation, and onsider
how it ould be improved by better intera tion with a ompiler.

Categories and Subject Descriptors


D.2.5 [Software Engineering: Testing and Debugging|
debugging aids, monitors, symboli exe ution ; D.2.4 [Software
Engineering: Software/Program Veri ation|assertion
he kers, reliability, validation ; D.3.2 [Language Classi ations: Multiparadigm languages|C, C++
General Terms
Reliability, Veri ation

Keywords
Bounds- he king, memory debuggers

1.

INTRODUCTION

Low-level programming languages like C and C++ provide raw memory pointers, permit pointer arithmeti , and
do not he k bounds when a essing arrays. This an result
in very e ient ode, but the unfortunate side-e e t is that
a identally a essing the wrong memory is a very ommon
programming error.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
SPACE 2004 Venice, Italy
Copyright 2003 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

The most obvious example in this lass of errors is ex eeding the bounds of an array. However the bounds of
non-array data obje ts an also be violated, su h as heap
blo ks, C stru ts, and sta k frames. We will des ribe as
a bounds error any memory a ess whi h falls outside the
intended memory range.
These errors are not di ult to introdu e, and they an
ause a huge range of bugs, some of whi h an be extremely
subtle and lurk undete ted for years. Be ause of this, tools
for preventing and identifying them are extremely useful.
Many su h tools are available, using a variety of te hniques.
None are ideal, and ea h one has a di erent set of hara teristi s, in luding:
 the kinds of bugs they an and annot spot;






whi h regions of memory (heap, stati , sta k) they


work with;
the number of false positives they produ e;
whether they are stati or dynami ;
whi h parts of a program they over, in parti ular how
they work with libraries;
what level of ompiler support is needed.

In this paper we des ribe a new te hnique for identifying bounds errors. The basi idea is that all data obje ts
that have bounds we wish to he k are tra ked, as are all
pointers; ea h pointer has a legitimate memory range it an
a ess, and a esses outside this range are agged as errors.
Our tool e e tively turns programs pointers into fat pointers on-the- y. However, the pointer ranges are maintained
separately from the pointers themselves, so pointer sizes do
not hange.
The te hnique spots many, but not all, bounds errors in
the heap, the sta k, and in stati memory; it gives few false
positives; it is dynami , and relies on dynami binary translation. These hara teristi s are not parti ularly ex iting.
The main ontribution of this paper is that our bounds he king te hnique has several unique hara teristi s: it does
not require any ompiler support; it works with programs
written in any language; and it he ks entire programs, without requiring any spe ial treatment for libraries. This last
hara teristi is parti ularly important|inadequate treatment of library ode is the single major short oming of many
previous bounds- he king te hniques.
Thus, our te hnique has its advantages and disadvantages,
and is unlikely to be a world-beater in its own right. However, it is a useful omplement to existing te hniques, and

an hopefully be part of a future hybrid te hnique that is a


world-beater.
This paper is stru tured as follows. Se tion 2 des ribes
the te hnique at a high-level, and Se tion 3 des ribes our
prototype implementation, and gives more low-level details.
Se tion 4 dis usses the short omings of the te hnique, and
ways in whi h a ompiler might o-operate with our tool so
that it an give better results. Se tion 5 des ribes related
work, and dis usses how our te hnique ompares to others.
Se tion 6 dis usses future work and on ludes.

2.

THEORY

This se tion provides a high-level des ription of our te hnique. Certain details are kept vague, and subsequently
eshed out in Se tion 3.

2.1 Overview
The basi idea is simple. Every pointer has a range of
addresses it an legitimately a ess. The range depends on
what the pointer originally pointed to. For example, the
legitimate range of a pointer returned by mallo () is the
bounds of the heap blo k pointed to by that pointer. We all
that range the pointer's segment. All memory a esses are
he ked to make sure that the memory a essed is within the
a essing pointer's segment. Any violations are reported.
Pointers are often used in operations other than memory
a esses. The obvious example is pointer arithmeti ; for
example, array elements are indexed using addition on the
array's base pointer. However, if two pointers are added,
the result should not be used to a ess memory; nor should
a non-pointer value be used to a ess memory. Thus we also
need to know whi h program values are non-pointers. The
result is that every value has a run-time type, and we need
a simple type system to determine the type of all values
produ ed by the program.
The following se tions des ribe aspe ts of the te hnique
in more detail.

2.2 Metadata

The te hnique requires maintenan e of metadata des ribing the run-time types of data obje ts. We say this metadata shadows the real values. This metadata has one of the
following four forms.






, a segment-type, des ribes a segment; this in ludes


its base address, size, lo ation (heap, sta k, or stati ),
and status (in-use or freed).

n, a non-pointer-type, des ribes a value whi h is known


to be a non-pointer.
p(X ), a pointer-type, des ribes a value whi h is known
to be a pointer to segment X ;

, an unknown-type, des ribes a value for whi h the


type is unknown.

The following data obje ts will have metadata asso iated


with them.

Segments: ea h new segment is given a new segmenttype X ;

Memory and registers: ea h value is shadowed by one


of the types n, u or p(X ).

In prin iple, every value produ ed, of any size, an be assigned a type. However, our most important he king takes
pla e when memory is a essed, whi h is always through
word-sized pointers. Therefore, the shadow value tra king
an be done at word-sized granularity.

2.3 Checking Accesses


Every load and store is he ked. When a essing a memory lo ation m through a value x we look at the type of x
and behave a ordingly.





n: Issue an error about a essing memory through a


non-pointer.
u

: Do nothing.

p(X ): If m is outside the range of X , issue an error


about a essing memory beyond the legitimate range
of x; if X is a freed segment, issue an error about
a essing memory through a dangling pointer.

Note that in all ases the memory a ess itself happens


unimpeded.

2.4 Allocating and Freeing Segments


There are four steps in dealing with ea h segment.
1. Identify when it is allo ated, in order to reate a segmenttype des ribing it.
2. Upon allo ation, set the metadata for the new words
in the segment.
3. Identify ea h pointer to the segment, in order to set
the pointer's shadow to the appropriate pointer-type.
4. Identify when it is freed, in order to hange the status
of its segment-type to \freed".
The way these aspe ts are handled di ers between the heap,
stati and sta k segments. In addition, the exa t meaning
of the term \segment", and thus the kinds of errors found,
di ers between the three memory areas. The following se tions dis uss these di eren es. Se tion 3.3.2 des ribes how
the implementation deallo ates segment types.

2.4.1 Heap Segments


Heap segments are the easiest to deal with. For the heap,
one heap segment represents one heap blo k. The four steps
for dealing with heap blo ks are as follows.
1. By inter epting mallo (), allo (), reallo (), new,
and new[, we know for every heap segment its address, size, and where it was allo ated. We an thus
easily reate the new segment-type, store it in a data
stru ture, and set the shadow value of the returned
pointer to point to it.
2. All the words within a newly allo ated heap blo k have
their shadows set to n, sin e they should de nitely not
be used to a ess memory, being either zeroed (for
allo ()) or uninitialised.
3. The pointer returned by the allo ation fun tion has its
shadow set to p(X ), where X was the segment-type
just reated. At this point, this is the only pointer to
the segment.

4. By inter epting free(), reallo (), delete, and delete[,


we know when heap segments are freed; we an look up
the segment-type in the data stru ture by its address
(from the pointer argument), and hange the segmenttype's status to \freed".1
Note that our te hnique as des ribed only dete ts overruns
past the edges of heap blo ks. For example, if the heap
blo k ontains a stru t whi h ontains two adja ent arrays,
overruns of the rst array into the se ond will not be aught.
This ould be improved by using debugging information,
whi h would tell us the type of the pointer. Then with
some e ort, elds a esses ould be identi ed by looking at
o sets from the blo k pointer. However, see Se tion 2.4.2 for
a dis ussion of some di ulties in identifying array overruns
in these ases.

2.4.2 Static Segments


Identifying stati segments is more di ult. First, identifying stati segments by looking only at the ompiled exe utable is lose to impossible. So we rely on debugging
information to do this; if this information is not present, all
pointers to stati data obje ts will have the type u, and no
a esses to stati memory will be he ked.
We an reate segment-types for ea h stati data obje t,
e.g. arrays, stru ts, integers; thus for stati memory, a segment represents an single data obje t. The four steps for
dealing with stati segments are as follows.
1. If debugging information is present, we identify stati
arrays in the main exe utable at startup, and for shared
obje ts when they are mapped into memory.
2. Newly loaded stati memory an ontain legitimate
pointers. Those we an identify, from debugging information, are set to point to stati segments. The
remaining words have their type set a ording to their
value. If a word is learly a non-pointer|e.g. a small
integer|we give it the type n, otherwise we give it the
type u.
3. Identifying pointers to stati segments is harder than
identifying pointers to heap segments. This is be ause
ompilers have a lot of ontrol over where stati data
obje ts are put, and so an hard-wire absolute addresses into the ompiled program. One possibility is
to rely on a simple assumption: any value that looks
like a pointer to an obje t, is a pointer to that obje t (even if it is not to the start of the obje t). For
example, if a stati integer array a[10 is lo ated at
address 0x8049400, we may see the following x86 instru tion snippets used to load elements of the array
into register %edx:
movl $0x8049400, %eax
movl (%eax), %edx

# load a[0

movl $0x8049400, %eax


movl 20(%eax), %edx

# load a[5

movl $0x8049414, %eax


movl (%eax), %edx

# load a[5

1
Se tion 3.9 explains how we an deal with programs that
use ustom allo ators.

movl $0x8049400, %eax


movl $5, %ebx
movl (%eax,%ebx,4), %edx

# load a[5

movl $5, %eax


movl 0x8049400(,%eax,4),%edx

# load a[5

movl $0x8049428, %eax


movl (%eax), %edx

# load a[10

This approa h works in all these ases ex ept the last


one; it is a bounds error, be ause the pointer value
used falls outside the range of a. However, dete ting
this error is not possible, given that another array b[
might lie dire tly after a[, and so an a ess to a[10
is indistinguishable from an a ess to b[0.
Unfortunately, this approa h sometimes ba k res. For
example, the program gap in the SPEC2000 ben hmarks initialises one stati pointer to a stati array as
p = &a[-1; p is then in remented before being used
to a ess memory. When ompiled with GCC, the address &a[-1 falls within the segment of another array
lying adja ent; thus the pointer is given the wrong
segment-type, and false positives o ur when p is used
to a ess memory. A similar problem o urs with the
ben hmark g , in whi h an array a ess of the form
a[var - CONST o urs, where var  CONST; the address a[-CONST is resolved by the ompiler, giving an
address in an adja ent array.
The alternative is to be more onservative in identifying stati pointers, by only identifying those that
point exa tly to the beginning of stati arrays. Unfortunately, this means some stati array a esses will
be missed, su h as in the third assembly ode snippet
above, and so some errors will be missed. In parti ular, array referen es like a[i-1 are quite ommon and
will not be handled if the ompiler omputes a[-1 at
ompile-time.
As well as appearing in the ode, stati pointers an
also o ur in stati data. These an be identi ed from
the debugging information.
4. We identify when stati arrays in shared obje ts are
unmapped from memory. With an appropriate data
stru ture, we an mark as freed all the segments within
the unmapped range. This is safe be ause stati memory for a shared obje t is always ontiguous, and never
overlaps with heap or sta k memory.
As with heap segments, arrays within stati stru ts ould
be identi ed with some e ort. However, as with top-level
arrays, dire t a esses that ex eed array bounds|like the
a[10 ase mentioned previously| annot be dete ted.

2.4.3 Stack Segments


The sta k is the nal area of memory to deal with. One
ould try to identify bounds errors on individual arrays on
the sta k, but a simpler goal is to identify when a sta k
frame is overrun/underrun. In this ase, one sta k segment
represents one sta k frame. The four steps in dealing with
sta k segments are as follows.

1. When a fun tion is entered, fun tion arguments are


above the sta k pointer,2 the return address is at the
sta k word pointed to by the sta k pointer, and lo al
variables used by the fun tion will be pla ed below the
sta k pointer. If we know the size and number of arguments, we an reate a segment-type X for the sta k
frame, whi h is bounded at one end at the address of
most extreme fun tion argument, and is unbounded at
the other end. This segment-type an be stored in a
sta k of segment-types.
2. The values within the segment should be shadowed
with n when the segment-type is reated, as they are
uninitialised and should not be used as pointers. The
ex eption is the fun tion arguments, whose shadow
values have already been initialised and should be left
untou hed.
3. The only pointer to a new sta k segment is the sta k
pointer; its shadow is set to p(X ), where X is the
segment-type just reated.
4. When the fun tion is exited, the segment-type an be
removed from the segment-type sta k, and marked as
deallo ated.3
These segments allow us to dete t two kinds of errors.
First, any a ess using the sta k pointer (or a register assigned the sta k pointer's value, su h as the frame pointer)
that ex eeds the end of the sta k frame will be dete ted.
Se ond, and more usefully, any dangling pointers to old
sta k frames will be aught. Su h dangling pointers an o ur if the address of a sta k variable is saved, or returned
from the fun tion. Even better, it will dete t any use of dangling pointers in multithreaded programs that have multiple
threads sharing sta ks. Su h bugs an be extremely di ult
to tra e down.
Knowing the size and number of fun tion arguments requires a ess to the debug information. If the debug information is missing, an approximation an be used: instead
of a segment that is bounded at one end, use an unbounded
segment. The rst error above, violations of the sta k's edge,
will not be dete ted. However, the se ond error, that of dangling pointer use, will be. This is be ause any a ess to a
freed segment triggers an error, regardless of whether that
a ess is in range. This approximation ould also be used
for fun tions like printf() that have a variable number of
arguments.
As with heap blo ks and stati memory, we ould do tra king of individual variables within sta k frames, but it would
be very ddly.

2.5 Operations
Every data-manipulating instru tion exe uted by the program must be shadowed by an operation to manipulate the
relevant metadata. These shadow operations are des ribed
in the following se tions.

2.5.1 Copying Values


2
We
3

assume the sta k is growing towards lower addresses.


This assumes fun tion entries and exits are paired ni ely,
and an be identi ed. Se tion 3.5 dis usses what happens
in pra ti e.

For ma hine instru tions that merely opy existing values


around (e.g. memory-register, register-memory, and registerregister opies) the metadata must be orrespondingly opied.
Note that p(X ) metavalues must be opied by referen e, so
that if the segment pointed to, X , is freed, all p(X ) types
that point to it see its hange in status.

2.5.2 New Static Values


Ma hine instru tions that mention onstants e e tively
introdu e new values. The type of ea h stati value is found
in the same way that the types of values in newly loaded
stati memory are given, as was des ribed in Se tion 2.4.2;
this will be n, u, or the appropriate pointer-type.
2.5.3 New Dynamic Values
Ea h ma hine operation that produ es a new value has a
orresponding operation to produ e the metadata for that
value. This in ludes arithmeti operations, logi operations,
and address omputations. Figure 1 shows the metadata
omputation for several ommon binary operations; the type
of the rst operand is shown in the leftmost olumn, and the
type of the se ond operand is shown in the top row.
n

p Y

p Y

p Y

p X

p X

p X

( )

( )

( )
( )

( )

(a) Add

( )
*

(b) Multiply
( )
( )

( )

&

p Y

p Y

p Y

p X

p X

p X

( )

( )

( ) Bitwise-and

( )

(d) Bitwise-xor

Figure 1: Basi operations


Let us start with addition, shown in Figure 1(a). Adding
two non-pointers results in a non-pointer. Adding a nonpointer to a pointer results in a pointer of the same segment; this is ru ial for handling pointer arithmeti and
array indexing. Adding two pointers together produ es a
non-pointer, and an error is issued; while not a tually in orre t, it is su h a dubious operation that we onsider it worth
agging. The `*' on that entry in the table indi ates an error message is issued. Finally, if either of the arguments are
unknown, the result is unknown. Thus unknown values tend
to \taint" known values, whi h ould lead to a large loss of
a ura y quite qui kly. However, before the metadata operation takes pla e, we perform the real operation and he k
its result. As we do for stati values, we use a range test,
and if the result is learly a non-pointer we give it the type
n. This is very important to prevent unknown-ness from
spreading too mu h.
Multipli ation, shown in Figure 1(b), is simpler; the result
is always a non-pointer, and an error is issued if two pointers are multiplied. At rst we issued errors if a pointer was
multiplied by a non-pointer, but in pra ti e this o asionally
happens legitimately. Similarly, division sometimes o urs

on pointers, e.g. when putting pointers through a hash fun tion. Several times we had to remove warnings on pointer
arithmeti operations we had assumed were ridi ulous, be ause real programs o asionally do them. Generally, this
should not be a problem be ause the result is always marked
as n, and any subsequent use of the result to a ess memory
will be agged as an error.
Bitwise-and, shown in Figure 1( ), is more subtle. If
a non-pointer is bitwise-and'd with a pointer, the result
an be a non-pointer or a pointer, depending on the nonpointer value. For example, if the non-pointer has value
0xfffffff0, the operation is probably nding some kind of
base value, and the result is a pointer. If the non-pointer has
value 0x000000ff, the operation is probably nding some
kind of o set, and the result is a non-pointer. We deal with
these possibilities by assuming the result is a pointer, but
also doing the range test on the result and onverting it to
n if ne essary. The resulting shadow operation is thus the
same as that for addition.
For bitwise-xor, shown in Figure 1(d), we do not try anything tri ky; we simply use a range test to hoose either
u or n.
This is be ause there are not any sensible ways
to transform a pointer with bitwise-xor. However, there
are two ases where bitwise-xor ould be used in a nontransformative way. First, the following C ode swaps two
pointers using bitwise-xor.
p1 ^= p2;
p2 ^= p1;
p1 ^= p2;

Se ond, in some implementations of doubly-linked lists, the


forward and ba kward pointers for a node are bitwise-xor'd
together, to save spa e.4 In both these ases, the pointer
information will be lost, and the re overed pointers will end
up with the type u, and thus not be he ked.
Most other operations are straightforward. In rement and
de rement are treated like addition of a non-ptr to a value,
ex ept no range test is performed. Address omputations
(on x86, using the lea instru tion) are treated like additions.
Shift/rotate operations give a n or u result, depending on
the result value|we do not simply set the result to n just
in ase a pointer is rotated one way, and then ba k to the
original value. Negation and bitwise-not give an n result.

2.5.4 Subtraction
We have not mentioned subtra tion yet. It is somewhat
similar to addition, and is shown in Figure 2. Subtra ting two non-pointers gives a non-pointer; subtra ting a nonpointer from a pointer gives a pointer; subtra ting a pointer
from a non-pointer is onsidered an error.
( )
*

p Y

p X

p X

n=

( )

( )

Figure 2: Subtra tion


4

The pointers an be re overed by using the bitwise-xor'd


values from the adja ent nodes; their pointers an be re overed from their adja ent nodes, and so on, all the way ba k
to the rst or last node, whi h holds one pointer bitwisexor'd with NULL.

The big ompli ation is that subtra ting one pointer from
another is legitimate, and the result is a non-pointer. If the
two pointers involved in the subtra tion point to the same
segment, there is no problem. However onsider this C ode:
har p1[10;
har p2[10;
int diff = p2 - p1;
p1[diff = 0;

This uses the pointer p1 to a ess the array pointed to by p2,


whi h is a di erent segment. ANSI-C a tually forbids su h
inter-array pointer subtra tion, but in pra ti e it o urs in
real C ode. Also, it may be valid in other programming languages, and if our te hnique is to be language-independent,
we must handle it.
The problem is that the addition of a non-pointer with
a pointer an result in a pointer to a di erent segment.
The most a urate way to handle this is to generalise the
type n to n(X; Y ), whi h is like n for all operations ex ept addition and subtra tion, in whi h ase we require that
p(X ) + n(X; Y ) gives p(Y ), p(X )
n(Y; X ) gives p(Y ), and
so on. Also, pointer di eren es should be added transitively,
so that n(X; Y ) + n(Y; Z ) gives n(X; Z ).
However, this is a omplex solution for a problem that
does not o ur very often. We des ribe how the implementation handles this ase more simply in Se tion 3.7.

2.5.5 Propagation of Unknown


Note that it is tempting to be less rigorous with preserving
u types. For example, one might try making the result of
u + p(X ) be p(X ). After all, if the u operand is really a
non-pointer, the result is appropriate, and if the u operand
is really a pointer, the result will undoubtedly be well outside
the range of p(X ), so any memory a esses through it will
be erroneous, and should be aught. By omparison, stri t
u-preservation will ause us to miss this error.
At rst, we tried doing this, being as aggressive as possible, and assuming unknowns are pointers when possible.
However, in pra ti e it auses false positives, usually from
obs ure sequen es of instru tions that would fool the shadow
operations into assigning the wrong segment-type to a pointer,
e.g. assigning a heap segment-type to a sta k pointer.
Ea h time we saw su h a false positive, we redu ed the
aggressiveness; after it happened several times, we de ided
that trying to se ond-guess like this was not the best way to
pro eed. Instead, we swit hed to being more onservative
with u values, and using range tests throughout.
There is no way of handling these shadow operations that
is learly the best. Instead, we just have to try alternatives
and nd one that strikes a good balan e between nding real
bounds-errors and avoiding false positives.
2.5.6 Other Operations
Ma hine instru tions that do not produ e or opy values,
su h as jumps, do not require any instrumentation.

2.6 Comments
The te hnique is very heavyweight, sin e it relies on (a)
instrumenting every instru tion in the program that moves
an existing value, or produ es a new value, and (b) shadowing every word in registers and memory with a shadow
value.

3.

PRACTICE

This se tion des ribes our prototype implementation of


our te hnique.

3.1 Implementation Framework


As mentioned in Se tion 2.6, the te hnique is very heavyweight. It was very mu h designed to be implemented using Valgrind [9, a system for building supervision tools on
x86/Linux, whi h supports exa tly this sort of pervasive instrumentation and value tra king. The tool, whi h we all
Annelid, is written as a plug-in for Valgrind, implemented
as about 3,500 lines of C (in luding blanks and omments).
The instrumentation is entirely dynami . The Valgrind
ore gains ontrol of the supervised program at start-up,
and none of the supervised program's original ode is run.
Instead, Valgrind translates the original ode on demand,
one basi blo k at a time, into a RISC-like intermediate
language alled UCode, whi h is expressed in virtual registers. The UCode is passed to the plug-in, whi h adds the
ne essary instrumentation, and then passes it ba k to the
Valgrind ore. The ore reallo ates registers, onverts the
UCode ba k to x86 ma hine ode, a hes it for future reuse, and runs it. Dynami ally loaded libraries are handled
exa tly like any other ode. For more details about how
Valgrind works, please see [9.

3.2 Metadata Representation


The four types of metadata are represented as follows.

distinguished shadow hunk (every word of whi h is set to

NONPTR) whi h non-buggy programs will never read from.

Ea h store is a ompanied by a shadow store; when a hunk


is written to for the rst time, a new shadow hunk is allo ated for it, and the appropriate pointer in the shadow
memory table is updated. Shadow memory thus slightly
more than doubles memory usage.
Sub-word writes to registers and memory will destroy any
pointer information in those words; the type for that word
be omes either n or u, depending on a range test. Thus,
any byte-by-byte opying of pointers to or from memory
will ause that information to be lost. (Fortunately, glib 's
implementation of mem py() does word-by-word opying.)
Similarly, metadata is not stored a ross word boundaries
in memory. With respe t to metadata, an unaligned wordsized write is handled as two sub-word writes; therefore any
pointer information will be lost, and the two aligned memory
words partially written to will be set to NONPTR or UNKNOWN,
depending on range tests. Fortunately, ompilers are loathe
to do unaligned writes, so this does not ome up very often.

3.3 Segment Management


3.3.1 Storing Segments
Segment stru tures are stored in a skip list [12, whi h
gives amortised log n insertion, lookup, and deletion, and is
mu h simpler to implement than a balan ed binary tree.

Ea h register and word of memory is shadowed,5 and holds


NONPTR, UNKNOWN, or a pointer to a segment stru ture. Memory shadowing is done in 64KB hunks. A shadow memory
table stores 65,536 pointers|one for every 64KB hunk of
memory. Initially every pointer in the table points to a

3.3.2 Freeing Segments


When a memory blo k, su h as a heap blo k, is freed, the
segment stru ture for that blo k is retained, but marked as
being freed. In this way, we an dete t any a esses to freed
blo ks via dangling pointers.
However, we need to eventually free segment stru tures
representing freed segments, otherwise our tool will have a
spa e leak. There are two ways to do this. The rst is to
use garbage olle tion to determine whi h segment stru tures are still live. Unfortunately, the rootset is extremely
large, onsisting of all the shadow registers and all of shadow
memory.6 Therefore, olle tion pauses ould be signi ant.
The se ond possibility is to use segment re y ling. We
would start by allo ating ea h new segment stru ture as
ne essary, and then pla ing it in a queue when its segment
is freed. On e the queue rea hes a ertain size (say 1000
segments), we an start re y ling the oldest stru ture segments. If the queue size drops to the threshold value, we
would re-start allo ating new segments until it grew bigger
again.
Thus we would maintain a window of preserved freedsegments. It is possible that a segment stru ture ould be
re y led, and then memory ould be a essed through a danging pointer that points to the freed segment. In this ase,
the error message produ ed will mention the wrong segment.
Or, if we are ex eedingly unlu ky, the pointer will be within
the range of the new segment and the error will be missed.
The han e of this last event happening an be redu ed by
ensuring old freed segments are only re y led into new segments with non-overlapping ranges.
In pra ti e, sin e a esses through dangling pointers are
not that ommon, with a suitably large threshold on the

5
Shadow type information for words in memory are stored
in an equally-sized pie e of shadow memory. Similarly, ea h
register has a orresponding shadow register.

6
This is a simpli ed garbage olle tion, as the traversals
will only be one level deep, sin e segment stru tures annot
annot point to other segment stru tures.

A segment-type X is represented by a dynami ally allo ated segment stru ture ontaining its base address,
size, a pointer to an \exe ution ontext" (a sta ktra e from the time the segment is allo ated, whi h
is updated again when the segment is freed), and a
tag indi ating whi h part of memory the segment is in
(heap, sta k, or stati ) and its status (in-use or freed).
Ea h stru ture is 16 bytes. Exe ution ontexts are
stored separately, in su h a way that no single ontext
is stored more than on e, be ause repeated ontexts
are very ommon. Ea h exe ution ontext ontains n
pointers, where n is the sta k-tra e depth hosen by
the user (default depth is four).

A non-pointer-type
stant NONPTR.

An unknown-type u is represented by a small onstant


UNKNOWN.

A pointer-type p(X ) is represented by a pointer to


the segment stru ture representing the segment-type
X . Pointer-types an be easily distinguished from nonpointer-types and unknown-types be ause the segment
stru tures never have addresses so small as NONPTR and
UNKNOWN.

is represented by a small on-

freed-queue this should happen extremely rarely. Also, re y led segments ould be marked, and error messages arising
from them ould in lude a dis laimer that there is a small
han e that the range given is wrong.
Alternatively, sin e the p(X ) representation is a pointer
to a segment stru ture, whi h is aligned, there are two bits
available in the pointer whi h ould be used to store a small
generation number. If the generation number in the pointer
doesn't mat h the generation number in the segment stru ture itself, a warning an be issued.
One other hara teristi of re y ling is worth mentioning.
If the program being he ked leaks heap blo ks, the orresponding segments will also be leaked and lost by our tool.
This would not happen with garbage olle tion.
So the trade-o between the two approa hes is basi ally
that garbage olle tion ould introdu e pauses, whereas re y ling has a small han e of ausing in orre t error messages. Currently our prototype uses re y ling, whi h was
easier to implement.

3.4 Static Segments


Se tion 2.4.2 des ribed how being too aggressive in identifying stati array pointers an lead to false positives, e.g. when
dealing with array a esses like a[i-1. In our implementation, by default we use aggressive stati pointer identi ation, but a ommand-line option an be used to revert to
onservative identi ation.

3.5 Stack Segments


Se tion 2.4.3 dis ussed how sta k segments ould be handled by our te hnique. There would be no problem if we
knew exa tly when sta k frames were built and torn down;
more pre isely, if we knew when fun tions were entered and
exited. Unfortunately, in pra ti e this is quite di ult. This
is be ause fun tions are not always entered using the all
instru tion, and they are not always exited using the ret
instru tion; some programs do unusual things with jumps
to a hieve the same e e ts. However, unusually enough,
tail-re ursion will not really ause problems, as long as the
fun tion returns normally on e it ompletes. The tail- alls
will not be identi ed, and it will be as if every re ursive
fun tion invo ation is part of a single fun tion invo ation.
It might not be a problem if some entry/exit pairs were
not dete ted; then multiple sta k frames ould be treated by
our tool as a single frame. This ould ause some errors to
be missed, but the basi te hnique would still work. However, even dete ting mat hing fun tion entries with exits is
di ult. The main reason is the use of longjmp(), whi h
allows a program to e e tively exit any number of fun tion
alls with a single jump. The obvious way to store sta k segments is in a sta k data stru ture. With longjmp(), it be omes ne essary to sometimes pop (and mark as freed) multiple segments from this segment sta k. If alls to longjmp()
ould be reliably spotted, this would not be a problem. However, GCC has a built-in non-fun tion version of longjmp()
that annot be reliably spotted. Sin e we annot even tell
when a longjmp() has o urred, we do not know when to
pop multiple frames. If we miss the destru tion of any sta k
frames, we will fail to mark their segment-types as freed,
and so our tool will not spot any erroneous a esses to them
via dangling pointers.
Sta k-swit hing also auses big problems. If a program
being he ked swit hes sta ks, our tool should swit h seg-

ment sta ks a ordingly. But dete ting sta k swit hes by


only looking at the dynami instru tion stream is di ult,
sin e it is often hard to distinguish a sta k swit h from a
large sta k allo ation.
We have tried various heuristi s in an attempt to over ome
these problems. For example, after a longjmp() o urs, the
sta k pointer will have passed several old frames in a single
step. This eviden e an be used to determine that these
frames are now dead. However, we have not managed to
nd any heuristi s robust enough to deal with all the di ulties. As a result, our tool urrently does not tra k sta k
segments at all. This is a shame, and we hope to resolve
these problems in the future.

3.6 Range Tests


As mentioned in Se tion 2.5, we an onvert many u result types to n if they are de nitely not pointers. In our
implementation, this test su eeds if a result value is less
than 0x01000000, or greater than 0xff000000.
This is slightly risky, as a (strange) program ould use
mmap() to map a le or reate a memory segment below
0x01000000. Sin e Valgrind has omplete ontrol over memory allo ation, we ould ensure this never happens. Alternatively, we ould tra k the lowest and highest addressable
addresses, and de lare any value outside this range as a nonpointer (with a suitable safety margin). In this ase, for most
programs on typi al x86/Linux systems, non-pointers would
be values below 0x8048000 (whi h is where the exe utable
is loaded) and above 0x 0000000 (whi h is reserved by the
kernel).

3.7 Pointer Differences


Se tion 2.5.4 suggested handling pointer di eren es between di erent segments pre isely by generalising the n type
to a n(X; Y ) type. We attempted this approa h, but it was
a pain. It made the addition and subtra tion operations
more omplex; also, pointer di eren es are often s aled, so
n(X; Y ) types would have to be propagated on multipli ation, division, and shifting. Finally, having to free n(X; Y )
stru tures ompli ated segment stru ture storage.
A mu h easier solution was to introdu e a new type b
(short for \bottom"). Any value of type b is not he ked
when used as a pointer for a load or store, and the result of
any type operations involving b as an operand is b. Values
of type b are never hanged to type n via a range test.
This is a blunt solution, but one that is not needed very
often|re all that intra-segment pointer di eren es, whi h
are mu h more ommon, are handled a urately.

3.8 System Calls


Valgrind does not tra e into the OS kernel. However,
sin e system alls are well de ned, it knows whi h arguments
are pointers, and what memory ranges will be read and/or
written by ea h system all. Valgrind tools thus know whi h
memory ranges will be written and/or read. Our prototype
does normal range he ks on ea h of these, and so an nd
bounds errors in system all arguments.
Most system alls return an integer error ode or zero; for
these we set the return type to n. Some system alls an
produ e pointers on su ess, notably mmap() and mremap().
Our urrent approa h is to give these results the value u.
We originally tried tra king segments returned by mmap()
like other segments, but abandoned this be ause they are

#in lude <stdlib.h>


int stati _array[10;
int main(void)
{
int i;
int* heap_array = mallo (10 * sizeof(int));

for (i = 0; i <= 10; i++) {


heap_array [i = 0;
// overrun: i==10
stati _array[i = 0;
// overrun: i==10
}
free(heap_array);
heap_array[0 = 0;
// blo k is freed

Figure 3: Sample Program:

bad.

information. If it had not been, the ode lo ations would


have been less pre ise, identifying only the ode addresses
and le names, not a tual line numbers. Also, the se ond
error involving the stati array would not have been found,
as we dis ussed in Se tion 2.4.2.

3.12 Performance
This se tion presents some basi performan e gures for
our prototype. All experiments were performed on an 1400
MHz AMD Athlon with 1GB of RAM, running Red Hat
Linux 9.0, kernel version 2.4.19. The test programs are a
subset of the SPEC2000 suite. All were tested with the
\test" (smallest) inputs.
Table 1 shows the performan e of our prototype. Column
1 gives the ben hmark name, olumn 2 gives its normal running time in se onds, and olumn 3 gives the slow-down fa tor. Programs above the line are integer programs, those
below are oating point programs.
Program
bzip2
rafty
gap
g
gzip
m f
parser
twolf
vortex
ammp
art
equake
mesa
median

di ult to deal with, sin e they are often resized, and an


be trun ated or split by other alls to mmap() that overlap
the range. We de ided this would not be a great loss, as
programs tend to use mmap() in straightforward ways, and
we suspe t overruns of mapped segments are extremely rare.

3.9 Custom Allocators


Our prototype handles the standard allo ators alled via

mallo (), new, new[, free(), delete, delete[. Custom

allo ators an be handled with a small amount of e ort,


by inserting lient requests into the program being he ked.
These are ma ros that pass information to our tool about
the size and lo ation of allo ated and deallo ated blo ks.

3.10 Leniency
Some ommon programming pra ti es ause bounds to
be ex eeded. Most notably, glib has heavily optimised
versions of fun tions like mem py(), whi h read arrays one
word at a time. On 32-bit x86 ma hines, these fun tions an
read up to three bytes past the end of an array. In pra ti e,
this does not ause problems. Therefore, by default we allow
aligned, word-sized reads to ex eed bounds by up to three
bytes, although there is a ommand-line option to turn on
stri ter he king that ags these as errors.

3.11 Examples
This se tion shows some example errors given by our prototype. Figure 3 shows a short C program, bad. . This
ontrived program shows three ommon errors: two array
overruns, and an a ess to a freed heap blo k.
Figure 4 shows the output produ ed by our prototype.
The error messages are sent to standard error by default,
but an be redire ted to any other le des riptor, le, or
so ket.
Ea h line is pre xed with the running program's pro ess
ID. Ea h error report onsists of a des ription of the error,
the lo ation of the error, a des ription of the segment(s) involved, and the lo ation where the segment was allo ated
or freed (whi hever happened most re ently). The fun tions mallo () and free() are identi ed as being in the
le vg_repla e_mallo . be ause that is the le that ontains our tool's implementations of these fun tions, whi h
override the standard ones.
The program was ompiled with -g to in lude debugging

Time (s) Slow-down


10.9
32.2
3.5
67.0
1.0
39.5
1.5
48.0
1.9
40.9
0.4
18.1
3.7
31.8
0.3
35.4
6.6
80.7
19.3
29.0
28.5
14.7
2.2
35.5
2.5
52.5
35.5

Table 1: Slow-down fa tor of prototype


As mentioned, this te hnique is heavyweight. Therefore,
the overhead is high and programs run mu h slower than
normal. However, the slow-down experien ed is not dissimilar to that with all thorough memory he king tools that
instrument a program at link-time or run-time. Also, we
have not yet tried optimising our prototype very mu h.

4. SHORTCOMINGS
Like all error- he king te hniques, ours is far from perfe t.
How well it does depends on the ir umstan es. Happily, it
exhibits \gra eful degradation"; as the situation be omes
less favourable, more and more p(X ) metavalues will be lost
and seen instead as u. Thus it will dete t fewer errors, but
will not give more false positives.

4.1 Optimal Case


In the best ase, the program will have all debug information and symbol tables present. In that ase, even if
implemented optimally, our te hnique will have problems in
the following ases.

Certain operations, su h as swapping pointers with the


bitwise-xor tri k, ause p(X ) metavalues to be \downgraded" to u; erroneous a esses using those pointers
will not be aught. One ould imagine bee ng up the

==16884== Invalid write of size 4


==16884==
at 0x8048398: main (bad. :11)
==16884== Address 0x40D1C040 is 0 bytes after the expe ted range,
==16884==
the 40-byte heap blo k allo ated
==16884==
at 0x400216E7: mallo (vg_repla e_mallo . :161)
==16884==
by 0x8048375: main (bad. :8)
==16884==
==16884== Invalid write of size 4
==16884==
at 0x80483A2: main (bad. :12)
==16884== Address 0x80495E8 is 0 bytes after the expe ted range,
==16884==
a 40-byte stati array in the main exe utable
==16884==
==16884== Invalid write of size 4
==16884==
at 0x80483C5: main (bad. :15)
==16884== Address 0x40D1C018 is 0 bytes inside the on e-expe ted range,
==16884==
the 40-byte heap blo k freed
==16884==
at 0x40021CE9: free (vg_repla e_mallo . :187)
==16884==
by 0x80483BE: main (bad. :14)

Figure 4: Sample Output

type system to handle su h ases, but the ost/bene t


ratio would be very high.

would have to fall ba k to using unlimited-size segments, as


dis ussed in Se tion 2.4.3.

Dire tly out-of-bounds a esses to stati and sta k data


obje ts (e.g. a essing a[10 in an array of ten elements) annot be dete ted if they happen to fall within
a nearby data obje t. Also, pointers not to the start of
arrays must either be ignored, missing potential errors,
or handled, ausing potential false positives.

4.4 No Symbols

4.2 Implementation
Our implementation su ers from a few more short omings, mostly be ause the optimal ase is too omplex to implement.




Pointer-types are lost if pointers are written unaligned


to memory.
Likewise, pointer-types are lost if pointers are opied
byte-by-byte between registers and/or memory words.
(Fortunately, as mentioned in Se tion 3.2, glib 's implementation of mem py() does word-by-word opying.
If it didn't, we ould just override it with our own that
did not use byte-by-byte opying.)
The use of b for inter-segment pointer di eren es will
ause some errors to be missed.
The implementation uses debugging information about
types in only simple ways. For example, it does not
try at all to break up heap blo ks into sub-segments.
Also, the only stati data obje ts it onstru ts segment
stru tures for are arrays (not for stru ts or basi data
types).
Sta k segments are urrently not handled at all, although we hope to re tify this soon.

4.3 No Debugging Information


If debugging information is not present, no stati he king
an be performed, as we annot re ognise pointers to stati
data obje ts.7 Also, if we were he king sta k frames, we
7
We ould he k the bounds of entire stati data segments,
sin e that information is known without debugging information.

If a program has had its symbol tables stripped, error


he king might degrade further. This is be ause, if our implementation did sta k he king, it would rely on symbols
being present for dete ting fun tion entry and exit points.8

4.5 Avoiding Shortcomings


A lot of the short omings arise be ause the information
available in a program binary at run-time is less than that
present in the original program. Debugging information and
symbol tables help represent some of this information, but
there are still some things that our tool would like to know
about.
The obvious way to improve the situation is to involve
the ompiler produ ing the programs; a lot of he king an
be done purely dynami ally, but some things learly require
stati help.
First, a ompiler ould promise, in ertain ir umstan es,
to avoid generating ode that auses problems for our tool.
For example, it ould ensure that all pointer reads and writes
are aligned; it ould ensure where possible that array a esses are done via the array's base pointer, rather than
pre- omputing o sets.
Se ond, a ompiler ould embed extra information into
programs for our tool to use. In our implementation, it
ould use lient requests (mentioned in Se tion 3.9 for handling ustom allo ators) to do this. This information might
be simple, for example, indi ating that a parti ular memory
a ess is intended to be to a parti ular segment; or signalling
when a longjmp() o urs. Or it ould be more omplex, for
example, indi ating that all memory a esses within a parti ular fun tion or module need not be he ked|this might
be suitable if our tool was used in onjun tion with a stati
8
The other obvious alternative|dete ting entry from the
dynami instru tion stream|is in pra ti e extremely di ult. Some ompilers an produ e \debugging grade" ode
whi h in ludes hooks that tell tools when fun tions are entered and exited. This would make things mu h simpler,
but no Linux ompilers we know of provide this.

bounds he ker (see Se tion 5.2 for a des ription of some).


Third, users ould embed information themselves manually, although this option is only really pra ti al for rare
events, su h as sta k swit hes.
Finally, some bounds errors that our tool annot nd
should arguably be found by a ompiler. In parti ular,
dire tly out-of-bounds a esses to stati and sta k arrays
(e.g. a essing element a[10 of a ten-element array) ould
easily be found by the ompiler.
Our te hnique has some very ni e hara teristi s, but it
learly also has some signi ant short omings. A hybrid
stati /dynami te hniques may be more e e tive; this is an
area worthy of future work.

5.

RELATED WORK

This se tion des ribes several tools that nd bounds errors for C and C++ programs, and ompares them to our
te hnique. No single te hnique is best; ea h has its strengths
and weaknesses, and they omplement ea h other.

5.1 Redzones
The most ommon kinds of bounds- he king tools dynami ally he k a esses to obje ts on the heap. This approa h
is ommon be ause heap bounds- he king is easy to do.
The simplest approa h is to repla e the standard versions
of mallo (), new, and new[ to produ e heap blo ks with
a few bytes of padding at their ends (redzones ). These redzones are lled with a distin tive values, and should never
be a essed by a program. When the heap blo k is freed
with free(), delete or delete[, the redzones are he ked,
and if they have been written to, a warning is issued. The
do umentation for mpatrol [13 lists many tools that use this
te hnique.
This te hnique is very simple, but it has many short omings.
1. It only dete ts small overruns/underruns, within the
redzones|larger overruns or ompletely wild a esses
ould a ess the middle of another heap blo k, or nonheap memory.
2. It only dete t writes that ex eed bounds, not reads.
3. It only reports errors when a heap blo k is freed, whi h
auses two problems: rst, not all heap blo ks will
ne essarily be freed, and se ond, this gives no information about where the overrun/underrun o urred.
Alternatively, alls to a heap- he king fun tion an be
inserted, but that requires sour e ode modi ation,
will ause pauses while the entire heap is he ked, and
still does not give pre ise information about when an
error o urs.
4. A esses to freed heap blo ks via dangling pointers are
not dete ted, unless they happen to hit another blo k's
redzone (even then, identifying the problem will be
di ult).
5. It does not work with heap blo ks allo ated with ustom allo ators (although the te hnique an be built
into ustom allo ators).
6. It only works with heap blo ks|sta k and stati blo ks
are pre-allo ated by the ompiler, and so redzones annot (without great di ulty) be used for them.

This te hnique has too many problems to be onsidered further. All these problems are avoided by te hniques that
tra k pointer bounds, su h as ours.
Some of these short omings are over ome by Ele tri Fen e
[11, another mallo () repla ement that uses entire virtual
pages as redzones. These pages are marked as ina essible, so that any overruns/underruns ause the program to
abort immediately, whereupon the o ending instru tion an
be found using a debugger. This avoids problems 2 and 3
above, and mitigates problem 1 (be ause the redzones are so
big). However, it in reases virtual memory usage massively,
making it impra ti al for use with large programs.
A better approa h is used by the Valgrind tool Mem he k [9, and Purify [5. They too repla e mallo () et
al with versions that produ e redzones, but they also maintain addressability metadata about ea h byte of memory,
and he k this metadata before all loads and stores. Be ause the redzones are marked as ina essible, all heap overruns/underruns within the redzones are spotted immediately, avoiding problems 2 and 3 above. If the freeing of
heap blo ks is delayed, this an mitigate problem 4. These
tools also provide hooks that a ustom allo ator an use to
tell them when new memory is allo ated, alleviating problem 5.
Purify is also apable of inserting redzones around stati
variables in a pre-link step, in ertain ir umstan es, as explained in the Purify manual:
Purify inserts guard zones into the data se tion
only if all data referen es are to known data variables. If Purify nds a data referen e that is relative to the start of the data se tion as opposed
to a known data variable, Purify is unable to
determine whi h variable the referen e involves.
In this ase, Purify inserts guard zones at the
beginning and end of the data se tion only, not
between data variables.
Similarly, the Solaris implementation of Purify an also
insert redzones at the base of ea h new sta k frame, and so
dete t overruns into the parent frame. We do not know the
details of how Purify does this, but we suspe t that the way
the SPARC sta k is handled makes it mu h easier to do than
on x86.
Redzones and addressability tra king works very well, whi h
a ounts for the widespread use of Mem he k and Purify.
However, the remaining short omings|1 and parti ularly 6
(even with Purify's partial solution)|are important enough
that tools tra king pointer bounds are worth having.

5.2 Fat Pointers

A di erent te hnique is to augment ea h normal pointer in


a program with bounds metadata|typi ally the minimum
and maximum address it an be used to a ess|giving a fat
pointer. All a esses through fat pointers are he ked, whi h
gives very thorough he king, and avoids all the problems
of the tools des ribed in Se tion 5.1. But there are other
signi ant disadvantages.
In earlier implementations, every pointer was repla ed
with a stru t ontaining the pointer, plus the bounds metadata (e.g. [1). This approa h has several major problems.
1. It requires ompiler support, or a pre- ompilation transformation step.

2. All ode must be re ompiled to use fat pointers, in luding libraries. In pra ti e, this an be an enormous
hassle.
Alternatively, parts of the program an be left unre ompiled, so long as interfa e ode is produ ed that
onverts fat pointers to normal pointers and vi e versa
when moving between the two kinds of ode. Produ ing this ode requires a lot of work, as there are
many libraries used by normal programs. If this work
is done, two kinds of errors an still be missed. First,
pointers produ ed by the library ode may la k the
bounds metadata and thus not be he ked when they
are used in the \fat" ode. Se ond, library ode will
not he k the bounds data of fat pointers when performing a esses.
3. Changing the size of a fundamental data type will
break any ode that relies on the size of pointers, for
example, ode that asts pointers to integers or vi e
versa, or C ode that does not have a urate fun tion
prototypes.
4. Support may be required in not only the ompiler, but
also the linker (some pointer bounds annot be known
by the ompiler), and possibly debuggers (if the fat
pointers are to be treated transparently).
Jones and Kelly des ribe a better implementation in [6.
Ea h pointer's metadata is stored separately from the pointer
itself, so it preserves ba kward ompatibility with existing
programs, avoiding problem 3, and redu ing problem 4.9
Patil and Fis her [10 also store metadata separately, in order to perform the he king operations on a se ond pro essor. We believe that if library ode was handled better, this
te hnique would be mu h more widely used.
Both these approa hes require ompiler support, and the
he king is only done within modules that have been re ompiled by the ompiler, and on pointers that were reated
within these re ompiled modules.
Our te hnique has the same basi idea of tra king a pointer's
bounds, but the implementation is entirely di erent, as it
works on already- ompiled ode, and naturally overs the
entire program, in luding libraries. It also works with programs written in any language; this is useful for systems
written in a mix of C or C++ and another language. Our
te hnique is less a urate, however, and the overhead is
mu h greater.

5.3 Static Checking


CCured [8 is a tool for making C programs type-safe,
implemented as a sour e ode transformer. It does a sophisti ated stati analysis of C sour e ode, then adds bounds
metadata, and inserts dynami bounds- he ks, for any pointers for whi h it annot prove orre tness stati ally. It suffers from problems 1{3 des ribed in Se tion 5.2. These additional he ks slow performan e; published gures range
from 10{150%. Also, on larger programs, one \has to hold
CCured's hand a bit" for it to work; getting su h programs
to work an take several days' work [4. Again, non- overage
of library ode is a problem.
Compuware's BoundsChe ker Professional tool [3 inserts
memory he ks into sour e ode at ompile-time. It seems to
9
In pra ti e, very small numbers of sour e ode hanges are
needed.

be similar to CCured, but without the lever stati analysis


to avoid unne essary he ks, and so an su er mu h larger
slow-downs.
By omparison, our te hnique is entirely dynami , and
does not require any re ompilation or sour e ode modi ation. It does nd fewer errors, though.

5.4 Runtime Type Checking


We know of two systems that perform run-time type he king. Hobbes [2 maintains run-time type metadata about
every value in a program. It gives warnings on run-time
type violations, e.g. if two pointers are added. It an also
dete t bounds errors if they lead to run-time type errors.
Implemented via an x86-binary interpreter, its slow-down
fa tor was in the range 50{175. RTC [7 is similar, but only
works for C programs as it uses a sour e-to-sour e transformation to add the he ks. Slow-downs when using it are
in the range 6{130. These tools use heap blo k padding
to nd some overrun errors, but as we saw in Se tion 5.1,
our te hnique is more pre ise for nding bounds errors as it
asso iates an expli it range with ea h pointer.

6. FUTURE WORK AND CONCLUSION


We have des ribed a te hnique for dynami ally he king
pointer usage, whi h nds many but not all bounds errors.
It e e tively onverts program pointers into fat pointers,
without requiring programs to be modi ed, re ompiled, or
relinked, and works with programs written in any language.
It also he ks library ode just as well as non-library ode,
without any extra e ort. We have a prototype implementation of the te hnique whi h performs many of the possible
he ks.
Further improvements to the prototype would be desirable, parti ular to make it faster, and to in lude bounds
he king on the sta k. In the long-term, we hope hybrid
stati /dynami te hniques may be able to produ e new, better bounds- he king tools that over ome the weaknesses of
our te hnique, and other existing ones.
We have also found that there is a tenden y for a large
number of a program's values to de ay to type u, even in
fairly simple programs. A lear dire tion for future work is
to identify why type information is being lost, and how to
avoid it.

7. ACKNOWLEDGMENTS
Many thanks to: Julian Seward, for reating Valgrind,
and for many useful dis ussions, and to Alan My roft and
the anonymous reviewers for their omments about this paper. The rst author gratefully a knowledges the nan ial
support of Trinity College, Cambridge.

8. REFERENCES
[1 T. M. Austin, S. E. Brea h, and G. S. Sohi. E ient
dete tion of all pointer and array a ess errors. In
Pro eedings of the ACM SIGPLAN Conferen e on
Programming Language Design and Implementation
(PLDI '94), pages 290{301, Orlando, Florida, USA,
June 1994.
[2 M. Burrows, S. N. Freund, and J. L. Wiener. Run-time
type he king for binary programs. In Pro eedings of
CC 2003, pages 90{105, Warsaw, Poland, Apr. 2003.

[3 Compuware Corporation. Bounds he ker.

http://www. ompuware. om/produ ts/devpartner/


bounds.htm.

[4 George Ne ula et al. CCured Do umentation, Sept.


2003. http://manju. s.berkeley.edu/ ured/.
[5 R. Hastings and B. Joy e. Purify: Fast dete tion of
memory leaks and a ess errors. In Pro eedings of the
Winter USENIX Conferen e, pages 125{136, San
Fran is o, California, USA, Jan. 1992.
[6 R. W. M. Jones and P. H. J. Kelly.
Ba kwards- ompatible bounds he king for arrays and
pointers in C programs. In Pro eedings of the Third
International Workshop on Automated Debugging
(AADEBUG'97), pages 13{26, Linkoping, Sweden,
May 1997.
[7 A. Loginov, S. H. Yong, S. Horwitz, and T. Reps.
Debugging via run-time type he king. In Pro eedings
of FASE 2001, Genoa, Italy, Apr. 2001.
[8 G. C. Ne ula, S. M Peak, and W. Weimer. C ured:
Type-safe retro tting of lega y ode. In Pro eedings of
POPL 2002, pages 128{139, London, United
Kingdom, Jan. 2002.
[9 N. Nether ote and J. Seward. Valgrind: A program
supervision framework. In Pro eedings of RV'03,
Boulder, Colorado, USA, July 2003.
[10 H. Patil and C. Fis her. Low- ost, on urrent he king
of pointer and array a esses in C programs.
Software|Pra ti e and Experien e, 27(1):87{110, Jan.
1997.
[11 B. Perens. Ele tri fen e.
ftp://ftp.perens. om/pub/Ele tri Fen e/.
[12 W. Pugh. Skip lists: A probabilisti alternative to
balan ed trees. Communi ations of the ACM,
33(6):668{676, June 1990.
[13 G. S. Roy. mpatrol: A Library for ontrolling and
tra ing dynami memory allo ations, Jan. 2002.
http://www. bmamiga.demon. o.uk/mpatrol/.