© ©
CS 536 Spring 2005 427 CS 536 Spring 2005 428
Mark-sweep garbage collection is illustrated below.

[Figure: heap containing Object 1, Object 3 and Object 5; labels: Global pointer (twice), Internal pointer]

Objects 1 and 3 are marked because they are pointed to by global
pointers. Object 5 is marked because it is pointed to by object 3,
which is marked. Shaded objects are not marked and will be added to
the free-space list.

In any mark-sweep collector, it is vital that we mark all accessible
heap objects. If we miss a pointer, we may fail to mark a live heap
object and later incorrectly free it. Finding all pointers is a bit
tricky in languages like Java, C and C++, that have pointers mixed
with other types within data structures, implicit pointers to
temporaries, and so forth. Considerable information about data
structures and frames must be available at run-time for this purpose.
In cases where we can’t be sure if a value is a pointer or not, we may
need to do conservative garbage collection.

In mark-sweep garbage collection all heap objects must be swept. This
is costly if most objects are dead. We’d prefer to examine only live
objects.
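The marking and sweeping described above can be sketched as follows. This is a minimal illustration, not a production collector: the names HeapObject, mark and sweep are hypothetical, and Objects 2 and 4 are assumed stand-ins for the shaded (unreachable) objects in the figure.

```python
from dataclasses import dataclass, field


@dataclass
class HeapObject:
    name: str
    pointers: list = field(default_factory=list)  # references to other HeapObjects
    marked: bool = False


def mark(roots):
    """Mark phase: flag every object reachable from the root pointers."""
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if not obj.marked:
            obj.marked = True
            worklist.extend(obj.pointers)


def sweep(heap):
    """Sweep phase: visit every heap object, live or dead.

    Unmarked objects go to the free-space list; marks on survivors are
    cleared so the next collection starts fresh.
    """
    survivors, free_list = [], []
    for obj in heap:
        if obj.marked:
            obj.marked = False
            survivors.append(obj)
        else:
            free_list.append(obj)
    return survivors, free_list


# Assumed heap: Objects 2 and 4 are hypothetical dead objects.
objs = {i: HeapObject(f"Object {i}") for i in range(1, 6)}
objs[3].pointers.append(objs[5])        # internal pointer
global_pointers = [objs[1], objs[3]]    # roots

mark(global_pointers)
live, free_list = sweep(list(objs.values()))
print([o.name for o in live])       # ['Object 1', 'Object 3', 'Object 5']
print([o.name for o in free_list])  # ['Object 2', 'Object 4']
```

Note that sweep walks all five objects even though only three are live, which is the cost the text points out.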
Because pointers are adjusted, compaction may not be suitable for
languages like C and C++, in which it is difficult to unambiguously
identify pointers.

Copying Collectors

Compaction provides many valuable benefits. Heap allocation is simple
and efficient. There is no fragmentation problem, and because live
objects are adjacent, paging and cache behavior is improved.
An entire family of garbage collection
techniques, called copying collectors,
is designed to integrate copying
with recognition of live heap objects.
Copying collectors are very popular
and are widely used.
Consider a simple copying collector
that uses semispaces. We start with
the heap divided into two halves—the
from and to spaces.
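
A minimal sketch of how a semispace collector might copy live objects from the from space into the to space. It assumes a breadth-first scan in the style of Cheney's algorithm, which is one common choice and not necessarily the scheme these notes have in mind. The names Obj, forward and collect are hypothetical, and Objects 2 and 4 are assumed dead objects added for illustration.

```python
class Obj:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields or []   # pointers to other Obj instances
        self.forward = None          # forwarding pointer, set once copied


def collect(roots):
    """Copy everything reachable from roots into a fresh to space.

    Returns (new_roots, to_space). Dead objects are never visited.
    """
    to_space = []

    def copy(obj):
        if obj.forward is None:                  # first visit: copy the object
            clone = Obj(obj.name, list(obj.fields))
            obj.forward = clone
            to_space.append(clone)
        return obj.forward                       # later visits reuse the copy

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):                  # scan pointer chases the copies
        obj = to_space[scan]
        obj.fields = [copy(f) for f in obj.fields]  # adjust pointers to the copies
        scan += 1
    return new_roots, to_space


# Assumed from space: Objects 2 and 4 are unreachable.
from_space = {i: Obj(f"Object {i}") for i in range(1, 6)}
from_space[3].fields.append(from_space[5])       # internal pointer
roots, to_space = collect([from_space[1], from_space[3]])

print([o.name for o in to_space])   # ['Object 1', 'Object 3', 'Object 5']
next_free = len(to_space)           # allocation would resume just past the last copy
```

After the copy, the two spaces would swap roles. The work done is proportional to the three live objects only; Objects 2 and 4 are never touched.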
and from spaces are interchanged, and heap allocation resumes just
beyond the last copied object. This is illustrated in Figure 0.2.

[Figure 0.2; labels: From Space, To Space, Object 1, Object 3, Object 5, Global pointer (twice), Internal pointer]

dead objects is essentially free. In fact, garbage collection can be
made, on average, as fast as you wish: simply make the heap bigger. As
the heap gets bigger, the time between collections increases, reducing
the number of times a live object must be copied. In the limit,
objects are never copied, so garbage collection becomes
will appear to be free, though longer-lived objects will still exact a
cost.

Aren’t copying collectors terribly wasteful of space? After all, at
most half of the heap space is actually used. The reason for this
apparent inefficiency is that any garbage collector that does
compaction must have an area to copy live objects to. Since in the
worst case all heap objects could be live, the target area must be as
large as the heap itself. To avoid copying objects more than once,
copying collectors reserve a to space as big as the from space. This
is essentially a space-time trade-off, making such collectors very
fast at the expense of possibly wasted space.

If we have reason to believe that the time between garbage collections
will be greater than the average lifetime of most heap objects, we can
improve our use of heap space. Assume that 50% or more of the heap
will be garbage when the collector is called. We can then divide the
heap into 3 segments, which we’ll call A, B and C. Initially, A and B
will be used as the from space, utilizing 2/3 of the heap. When we
copy live objects, we’ll copy them into segment C, which will be big
enough if half or more of the heap objects are garbage. Then we treat
C and A as the from space, using B as the to space for the next
collection. If we are unlucky and more than 1/2 the heap contains live
objects, we can still get by. Excess objects are copied onto an
auxiliary data space (perhaps the
stack), then copied into A after all live objects in A have been
moved. This slows collection down, but only rarely (if our estimate of
50% garbage per collection is sound). Of course, this idea generalizes
to more than 3 segments. Thus if 2/3 of the heap were garbage (on
average), we could use 3 of 4 segments as from space and the last
segment as to space.

Generational Techniques

The great strength of copying collectors is that they do no work for
objects that are born and die between collections. However, not all
heap objects are so short-lived. In fact, some heap objects are very
long-lived. For example, many programs create a dynamic data structure
at their start, and utilize that structure throughout the program.
Copying collectors handle long-lived objects poorly. They are
repeatedly traced and moved between semispaces without any real
benefit.

Generational garbage collection techniques [Unger 1984] were developed
to better handle objects with varying lifetimes. The heap is divided
into two or more generations, each with its own to and from space. New
objects are allocated in the youngest generation, which is collected
most frequently. If an object survives across one or more collections
of the youngest generation, it is “promoted” to the next older
generation, which is collected less often. Objects that
generations that might point to younger objects.

Experience shows that carefully designed generational garbage
collectors can be very effective. They focus on objects most likely to
become garbage, and spend little overhead on long-lived objects.
Generational garbage collectors are widely used in practice.

Conservative Garbage Collection

The garbage collection techniques we’ve studied all require that we
identify pointers to heap objects accurately. In strongly typed
languages like Java or ML, this can be done. We can table the
addresses of all global pointers. We can include a code value in a
frame (or use the return address stored in a frame) to determine the
routine a frame corresponds to. This allows us to then determine what
offsets in the frame contain pointers. When heap objects are
allocated, we can include a type code in the object’s header, again
allowing us to identify pointers internal to the object.

Languages like C and C++ are weakly typed, and this makes
identification of pointers much harder. Pointers may be type-cast into
integers and then back into pointers. Pointer arithmetic allows
pointers into the middle of an object. Pointers in frames and heap
objects need not be initialized, and may contain random values.
Pointers may overlay integers in unions,
prematurely freed, or perhaps never
freed. In fact, experiments have
shown [Zorn 93] that conservative
garbage collection is very competitive
in performance with application-
specific manual heap management.
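
To make the conservative idea concrete, here is a minimal sketch under stated assumptions: every word found in a frame or global area is treated as a possible pointer if its value falls anywhere inside an allocated heap object. The function name conservative_roots, the (start, size) object table, and the sample addresses are all hypothetical; real conservative collectors use more elaborate tests.

```python
import bisect


def conservative_roots(words, objects):
    """Return start addresses of heap objects that some word might point to.

    words   -- raw integer values found in frames, globals, or registers
    objects -- list of (start_address, size) pairs for allocated heap objects
    """
    objects = sorted(objects)
    starts = [start for start, _ in objects]
    hit = set()
    for w in words:
        i = bisect.bisect_right(starts, w) - 1   # last object starting at or before w
        if i >= 0:
            start, size = objects[i]
            if start <= w < start + size:        # accepts pointers into the middle
                hit.add(start)
    return sorted(hit)


# Assumed data: three heap objects and five words of unknown type.
heap_objects = [(1000, 16), (1016, 32), (1048, 8)]
stack_words = [7, 1020, 999, 1048, 123456]
print(conservative_roots(stack_words, heap_objects))  # [1016, 1048]
```

The word 1020 may be a genuine interior pointer or merely an integer that happens to look like one. A conservative collector cannot tell, so it keeps the object at 1016 alive either way, which is how an object can end up never being freed. For the same reason such a collector cannot safely move objects, since adjusting 1020 would corrupt it if it were really an integer.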