You are on page 1of 9

IMPROVED EFFECTIVENESS FROM A REAL TIME LISP GARBAGE COLLECTOR

Jeffrey L. Dawson

WANG Laboratories, Inc.


Lowell, Massachusetts 01851

INTRODUCTION

This paper describes a real-time garbage The collector which we w i l l d i s c u s s is


collection algorithm for list processing a copying collector. For classical copying
systems. We identify two efficiency garbage collection [Feniche169, Cheney70],
problems inherent to real-time garbage the cell storage is divided into two
collectors, and give some evidence that the semispaces whose roles are reversed with
proposed algorithm tends to reduce these each garbage collection cycle. When
problems. In a virtual memory collection begins, the semispace labeled
implementation, the algorithm restructures TOSPACE is empty, and all accessible and
the c e l l storage area more compactly, thus garbage cells are in the s e m i s p a c e labeled
reducing working sets. The algorithm also FROMSPACE. The collector copies the
may provide a more garbage-free storage accessible cells out of FROMSPACE into
area at the end of the collection cycle, TOSPACE. When this process is completed,
although this claim really must await FROMSPACE contains only inaccessible cells,
empirical verification. while TOSPACE contains all and only
accessible cells. Now computation can
Garbaq~ collection resume with the mutator allocating cells
from the garbage free TOSPACE until it is
List processing systems comprise two again necessary to c o l l e c t .
related processes. The mutator allocates
cells from free storage and links them Baker's real-time collector
together to form data structures
representing the computation underway. The classical copying collector is n o t
W h e n the m u t a t o r drops a data structure it well suited to real-time applications,
no longer needs, the underlying cells since the mutator is subject to
remain in the system but are no longer unpredictable garbage collection interludes
accessible. whose duration is proportional to the
number of accessible cells in t h e system.
In order to prevent the mutator from In order to meet real-time constraints,
running o u t of c e l l s to a l l o c a t e and thus [Baker78] distributes the copying function
blocking computation, a collector process among the mutator's primitives. For each
gathers the inaccessible or garbage cells cell the mutator allocates, it m u s t c o p y a
and returns t h e m to f r e e s t o r a g e for r e u s e number of c e l l s b o u n d e d by a c o n s t a n t , k
by the mutator. A number of collector By c a r e f u l l y choosing k, o n e c a n be a s s u r e d
algorithms have been proposed with a t h a t the m u t a t o r w i l l n o t r u n o u t of c e l l s
variety of performance objectives. to allocate. Moreover, the mutator's
{Cohen81] provides a recent survey of primitive operators have well defined
garbage collectors. b o u n d s on t h e i r e x e c u t i o n times.

The trick is in maintaining the


consistency of the data structures as the
collector copies them and the mutator
Permission to copy without fee all or part of this material is granted alters them. This is accomplished by
provided that the copies are not made or distributed for direct giving the mutator the impression that
commercial advantage, the ACM copyright notice and the title of the garbage collection has completed and that
publication and its date appear, and notice is given that copying is by it is working in the new semispace,
permission of the Association for Computing Machinery. To copy although copying is in fact still going
otherwise, or to republish, requires a fee and/or specific permission. on. The mutator's access to the data
structures is through a set of registers
holding pointers to the root cells of the
data structures. When collection begins,
the cells pointed to by t h e registers are
© 1982 ACMO-89791-082-6/82/O08/O159 $00.75 immediately copied into TOSPACE.

150
Thereafter, all n e w c e l l s w i l l be a l l o c a t e d to the n e x t p h y s i c a l cell in s t o r a g e . This
in TOSPACE. Also, the data access scheme c a n be e n h a n c e d through "CDR-coding"
primitives, CAR and CDR, are required to to yield even more compact storage
copy their returned values into TOSPACE. [Clark77].
Thus the mutator's illusion that garbage
collection is c o m p l e t e will be m a i n t a i n e d . An "asynchronous copy" occurs when
Meanwhile, the collector linearly scans cells must be copied as a result of the
TOSPACE following any pointer into mutator's action, thereby confounding the
FROMSPACE and copying the cell into collector's attempts at l i n e a r i z a t i o n . The
TOSPACE. When all TOSPACE has been collector could be l i n e a r l y copying a list
scanned, garbage collection is c o m p l e t e . when the mutator stuffs a few unrelated
cells in the m i d d l e ; or the c o l l e c t o r finds
that a few cells, which would fit nicely
REAL-TIME PROBLEMS into its c u r r e n t list, were already copied
elsewhere by the m u t a t o r .
Although Baker has cleverly overcome
the major problem of getting the two Baker's collector relieves this problem
processes to run in p a r a l l e l , one process somewhat by f o r c i n g the m u t a t o r to a l l o c a t e
may interfere with the other's most cells at a d i f f e r e n t point in c e l l storage
efficient operation. We identify two t h a n t h a t to w h i c h cells are b e i n g copied.
inefficiencies in real-time garbage However, asynchronous copying still occurs
collection which could be viewed as when CAR or CDR would return a cell in
penalties for concurrent operation. Our FROMSPACE, or when either of the pointers
collector, w h i c h we w i l l d e s c r i b e later, is within a newly allocated cell are in
intended to r e d u c e t h e s e i n e f f i c i e n c i e s . FROMSPACE.

OUR COLLECTOR
All garbage collectors have a
two-phase, cyclic action. They identify The differences between our collector
the inaccessible cells, then reclaim them. and B a k e r ' s are the f o l l o w i n g . We redirect
If the c o l l e c t o r and m u t a t o r are p r o c e s s i n g the registers into TOSPACE one at a t i m e ,
in parallel, then during each of the rather than immediately at the b e g i n n i n g of
collector cycles, the m u t a t o r continues to the collection cycle. The registers are
generate garbage cells. If these are not not forced into TOSPACE, but drift there
identified, they will remain in the s y s t e m with the tide of copied cells.
as garbage until discovery during some Asynchronous and linear copying are
subsequent cycle. We will call these performed at different locations in cell
lingering cells "floating garbage" storage. These simple changes appear to
Analysis of a "mark and sweep" garbage redound in improvements in the areas of
collection algorithm in [ W a d l e r 7 6 ] suggests floating garbage and linearization of
that the average performance is c l o s e to storage. In t h i s s e c t i o n , we w i l l d e s c r i b e
the worst case in which none of the the a l g o r i t h m . CDR-coding is not e m p l o y e d
floating garbage is reclaimed until the in the algorithm described, although it
next cycle. An effective collector would could be added with an accompanying
minimize the amount and longevity of increase in complexity. A pseudo-Algol
floating garbage. representation is l i s t e d in the A p p e n d i x .

The floating garbage in Baker's Cell storag_ ~


collector consists of those TOSPACE cells
which are inaccessible immediately at the The cell storage is again separated
end of the c o l l e c t i o n cycle. They are of into two semispaces whose roles are
two different origins. The first type are reversed just before each collection cycle
cells whose access was dropped after they begins. All accessible and inaccessible
were copied. The second type are cells cells a r e in F R O M S P A C E at the b e g i n n i n g of
allocated after the b e g i n n i n g of the c y c l e the collection cycle, while TOSPACE is
whose access was dropped before the e n d of empty. Five pointers into cell storage are
the c y c l e . maintained throughout the collection
process:
Asynchronous co~in~
F top free storage a r e a in F R O M S P A C E ,
Another important function of copying T top free storage a r e a in T O S P A C E ,
garbage collectors in virtual memory B bottom free storage area TOSPACE,
systems is the restructuring of the PB linearizing pointer b e l o w B,
accessible data for more compact storage SB scanning pointer b e l o w B,
and easier access. This has lead to the ST scanning pointer a b o v e T.
practice of "linearization" in an attempt
to k e e p the c e l l s of e a c h d a t a s t r u c t u r e on At the b e g i n n i n g of the c y c l e ; SB, PB, a n d
as few p a g e s as p o s s i b l e , thus reducing the B point to the bottom cell of TOSPACE,
average working set during program while T and ST p o i n t to the topmost cell.
execution. In CDR-linearizing, whenever F points to the cell that T pointed to
possible the C D R p o i n t e r of a c e l l points before the cycle. The nonfree cells of

160
TOSPACE are separated into two regions with try to a l l o c a t e cells in F R O M S P A C E in t h e
the free storage between. The cells below hope that they will become inaccessible
B have been copied linearly out of before the e n d of the c y c l e . If t h e y are
FROMSPACE. The cells above T have been put not dropped, they will have to be c o p i e d .
t h e r e as a r e s u l t of the m u t a t o r ' s action. The maintenance of the forwarding address
They were either newly allocated or scheme puts a limitation on t h i s a l l o c a t i o n
asynchronously copied out of FROMSPACE. policy. If the CAR pointer of a newly
The TOSPACE pointers converge on t h e m i d d l e created cell is d i r e c t e d into TOSPACE, the
of the TOSPACE while maintaining the c e l l m u s t be a l l o c a t e d in T O S P A C E to a v o i d
relationships SB < PB < B < T < ST the CAR pointer being mistaken for a
Collection is c o m p l e t e when SB = PB = B , forwarding a d d r e s s to a c o p i e d c e l l .
T = ST , and all registers point into
TOSPACE. The collection process

Figure i. Diagram of cell storage area. Always, the collector's first choice is
to continue the linearization at B when
(I//// indicates nonfree cells) possible. The linearization pointer PB is
either equal to B or points to the last
FROMSPACE TOSPACE cell copied if t h e C D R of t h a t c e l l h a s n o t
been copied. The collector keeps copying
I/Ill/l/Ill High /M/I/I/1// the C D R of PB at B as l o n g as it e x i s t s and
/////////// addresses /////M/III has not already been copied. When this
/////////// I/I/I/H/I/ f a i l s the c o l l e c t o r l o o k s for a s i n g l e c e l l
/////////// ST ---> III/IIIIIII to copy at B as a new seed to the
//////////I H/I/1/I/I/ linearization process.
/////////// Hl/I/I/Ill
//////////I /I/1/I/llL The collector attempts, in o r d e r , three
//////////I T "-> methods for o b t a i n i n g a new seed. It s c a n s
////I////11 the pointers below B, examines the
//////////I registers, and scans the p o i n t e r s above T.
/////I///// The order of these attempts has effects
<--F upon performance which we will examine
later. The scan pointer SB m o v e s toward B
u n t i l it f i n d s a C A R p o i n t e r to a d a t a c e l l
B"~ in FROMSPACE. (Because of the
~~~H/Ill~,
PB --> linearization in t h i s a r e a of c e l l s t o r a g e ,
//////////I /I/I/I/IIL the collector need not examine CDR
//////////I SB --> ~ 1 ~ l / I / I l L pointers -- they are all directed into
/////////// LOw ~~~Ill~IlL TOSPACE.) That cell is copied at B and
/////ll///l addresses IJllllllll~ linearization resumes. As SB scans it
rewrites pointers into FROMSPACE with
Forwardinq addresses forwarding addresses if p o s s i b l e . If SB is
equal to B, the registers are searched
W h e n a c e l l is c o p i e d o u t of F R O M S P A C E , sequentially for a pointer to a d a t a cell
its CAR pointer is redirected to the in F R O M S P A C E to s e r v e as t h e n e w s e e d . If
TOSPACE version of t h e c e l l , and serves as all registers are directed into TOSPACE,
the forwarding address. The test of a then the scan pointer ST is m o v e d toward T
FROMSPACE cell to d e t e r m i n e whether it is u n t i l a n e w s e e d is f o u n d . When all three
an authentic data cell or a forwarding methods fail to provide a new seed, the
address holder is to d e t e r m i n e into which collection c y c l e is f i n i s h e d .
semispace its C A R p o i n t e r is d i r e c t e d . If
it p o i n t s into TOSPACE, it is a f o r w a r d i n g Interaction of collector and mutator
address.
In order to meet the real-time
Cell allocation constraints, we adopt Baker's strategy of
setting a scanning quota Q. For each new
We assume that there is some free cell allocated, the collector examines a
storage remaining in FROMSPACE, and that t o t a l of e x a c t l y Q cell pointers during the
the m u t a t o r can allocate cells at e i t h e r F linearization process and the new seed
or T. T h i s is n o t a s e r i o u s restriction in search. The data access primitives CAR and
a virtual memory implementation where an CDR are asked to assist the collector by
additional page of free storage can be updating s o m e of t h e p o i n t e r s to f o r w a r d i n g
a d d e d at T or F as n e e d e d . A clairvoyant address cells. Any CDR pointer is u p d a t e d
mutator would allocate at F a l l the cells to p o i n t to a d a t a c e l l , w h i l e o n l y T O S P A C E
that it knows it will drop during the cells have their CAR pointers updated
course of g a r b a g e collection so that they because of t h e f o r w a r d i n g address test. By
will be left behind when FROMSPACE is convention the mutator primitives never
abandoned at t h e e n d of t h e c y c l e , in w h i c h return pointers to forwarding address
c a s e , t h e r e w o u l d be no f l o a t i n g garbage of cells. In t h i s w a y o n e is a s s u r e d that the
the s e c o n d t y p e . arguments to any succeeding primitive and
the registers will point to d a t a c e l l s and
In the absence of such clairvoyance, we n e e d n o t be t e s t e d .

161
In o r d e r to ensure correct concurrent PERFORMANCE ANALYSIS
operation of t h e m u t a t o r and collector, the
following conditions are to be maintained In this section, we will discuss the
throughout the collection cycle. effectiveness of our collector,
Condition i: Any pointer in TOSPACE particularly in c o m p a r i s o n to B a k e r ' s . Our
lying below SB or above ST points into algorithm is better at linearizing the
TOSPACE. copied data and may reduce the amount of
We must be certain to maintain the floating garbage, although the latter is
self-referential quality of t h a t p o r t i o n of n o t as c l e a r l y established.
TOSPACE which lies outside the scan
pointers; otherwise, we may end up with Linearization
invalid pointers into FROMSPACE after the
cycle. The only way a problem of this Whenever cells from two different data
n a t u r e c o u l d a r i s e is if o n e of t h e T O S P A C E structures are copied at B, t h e s t r u c t u r e s
cells below S had one of its pointers become intertwined as t h e y a r e c o p i e d , thus
redirected by the primitives, RPLACA or diluting one another's locality of
RPLACD, so s p e c i a l attention must be p a i d reference. Choosing one linearization seed
to these primitives. Whenever a scanned at a t i m e tends to reduce this mingling.
pointer is b e i n g r e d i r e c t e d into FROMSPACE, Choosing the new seed by scanning the
the intended pointee is c o p i e d into TOSPACE bottom of T O S P A C E , we t e n d to c o p y t h e C A R
at T b e f o r e t h e r e d i r e c t i o n . cells immediately after the CDR chain of
Condition 2: Any CAR pointer in the list. This also improves the locality
FROMSPACE pointing into TOSPACE is a of reference. In f a c t , t h i s a p p r o a c h would
forwarding address for a copied cell. improve the classical non-real-time copying
We must exercise caution in t h e a p p l i c a t i o n collectors.
of R P L A C A and CONS, lest we create false
forwarding addresses. We must not apply By performing asynchronous copying and
RPLACA to a F R O M S P A C E cell with a pointer cell allocation at T, we a v o i d breaking up
into TOSPACE. SO, RPLACA copies the CDR chains under construction and
FROMSPACE cell into TOSPACE at T before introducing unrelated seeds. There will be
performing the redirection. Whenever a very little copying done when the top of
cell is created whose CAR pointer is TOSPACE is b e i n g scanned -- mostly just
directed into TOSPACE, the cell is redirection of CDR pointers through
allocated at T in T O S P A C E . forwarding addresses. The cells copied
will be those whose sole connection with
Termination TOSPACE is t h r o u g h a cell copied at T as a
result of one of the replacement
If we v i e w t h e c o l l e c t o r ' s activity as primitives, RPLACA or R P L A C D . If o n e w e r e
tracing pointers and ensuring that the cell to scan the top of TOSPACE before the
pointed to is in T O S P A C E , the termination registers in s e a r c h of a new linearization
condition for t h e c o l l e c t i o n cycle is t h a t seed, t h e p a r t s of d a t a s t r u c t u r e s attacked
all traceable pointers have been traced. by RPLACA or RPLACD would be closer
If we choose the scanning quota Q to be together. Going even further the
larger t h a n 2, we c a n be c e r t a i n that the asynchronous copy resulting from these
collection cycle will indeed terminate -- primitives c o u l d be d o n e at B w i t h t h e s a m e
essentially, because the collector is a b l e effect. Moreover, scanning the top of
to trace the existing pointers faster than TOSPACE w o u l d be e a s i e r since we would only
the mutator can add new ones. Suppose need to examine the CDR pointers. Since
there are N accessible pointers at the the replacement primitives are relatively
beginning of the cycle, i.e., N/2 infrequent, the degradation of the
accessible cells. After N/Q cells have linearity s h o u l d be s m a l l .
been allocated, N pointers will have been
traced and there will be at m o s t 2(N/Q) Asynchronous c o £ ~
traceable pointers, as y e t u n t r a c e d . After
another 2N/Q 2 more cells have been There is a n o t h e r reason for trying to
allocated, there will be at most avoid asynchronous copying aside from not
N(2/Q) 2 untraced pointers. Iterating thwarting the collector's efforts at
this argument we see that the number of linearization. In an interactive
untraced pointers falls below one, and application, some or all of the linear
collection m u s t be c o m p l e t e after copying could be carried out during
keyboard waits with no p e r f o r m a n c e loss to
(N/2) [(2/Q) + (2/Q) 2 + ... ÷ (2/Q) s] the mutator. By reducing the asynchronous
copying we can increase the possibilities
N ![1 - (2/Q) s]! N - 1 for this "free" copying.

An asynchronous copy will occur in


cells have been allocated, where s is Baker's collector whenever an argument of
equal to lOgQ/2 N . CAR, CDR or CONS points to a cell in
FROMSPACE. An asynchronous copy occurs in
our collector only when the first argument
of CONS points into FROMSPACE and for
certain combinations of a r g u m e n t s in R P L A C A

162
and RPLACD -- approximately half the the a l l o c a t i o n semispace by t h e C A R p o i n t e r
occurrences of RPLACA and one-fourth the effects just such a policy. Cell
occurrences of RPLACD during the cycle. allocation will follow the migration of
Data from [Clark79, Table I] on the c e l l s o u t of F R O M S P A C E into TOSPACE because
frequencies of these primitives within of the growing preponderance of CAR
three large LISP programs suggests t h a t by pointers into TOSPACE. The effectiveness
shifting the forced copying from CAR and of t h i s p o l i c y w i l l d e p e n d on the l o n g e v i t y
C D R to t h e l e s s f r e q u e n t RPLACA and RPLACD, of n e w l y a l l o c a t e d cells in r e l a t i o n to t h e
the a m o u n t of t h e a s y n c h r o n o u s copying will l e n g t h of t h e g a r b a g e c o l l e c t i o n cycle.
be reduced substantially. Indeed, unless
the scanning quota can be made fairly Although we m a y d o b e t t e r than Baker in
large, most of the copying in Baker's the p r o p o r t i o n of g a r b a g e t h a t we l e a v e in
collector is a s y n c h r o n o u s . FROMSPACE, our collection cycle is
generally longer, since every cell which
was allocated in FROMSPACE and is
accessible at the e n d of the c y c l e had to
The problem caused by f l o a t i n g garbage be copied, rather than allocated directly
in a v i r t u a l memory implementation is t h a t into TOSPACE. Because of the lengthened
the number of pages required at the cycle, more garbage m a y be g e n e r a t e d . The
beginning of e a c h c y c l e w i l l be l a r g e r than combined effect of t h e s e two tendencies is
need be, since the floating garbage cells hard to judge. In the worst case, our
will dilute the accessible cells. cycle would last approximately
(N-I)/(Q-2) cell allocations; while the
With copying collectors, once the root worst case for Baker's collector is N/Q
c e l l of t h e d a t a s t r u c t u r e has been copied, cell allocations.
the entire structure will be c o p i e d , even
though access to the structure has been Conclusions
dropped as it is b e i n g c o p i e d or as a n o t h e r
structure is b e i n g c o p i e d . By s e e d i n g the In summary, our algorithm does quite
copying process o u t of o n l y o n e r e g i s t e r at w e l l in t h e l i n e a r i z a t i o n of t h e d a t a . The
a t i m e , we i n c r e a s e our chances of l e a v i n g only flaws come from RPLACA and RPLACD
the garbage in F R O M S P A C E . which would have destroyed the linearity of
storage even if collection had finished
Re~ister effects instantly. The issue of floating garbage
is n o t q u i t e as e a s y to d e c i d e , and would
We can further pursue this garbage require experimental data to c l a r i f y . In
abandonment policy through the order in an earlier version of this algorithm, we
which we scan the registers. Because of had used a "semispace bit" associated with
sharing of accessible cells among the each register. If this bit is on, the
registers, one would expect the number of register may point only into TOSPACE. As
cells copied per register to decrease as the data accessible out of e a c h r e g i s t e r is
each register is scanned. Moreover, in copied the semispace bit is turned on.
some LISP implementations, one of the This prevents any recidivist registers
registers holds the list of bound global creeping back into FROMSPACE; and the
variables and their values. If that collection cycle is s h o r t e r , although cell
register is scanned first, a substantial storage is n o t as w e l l linearized. It is
portion of the c o p y i n g will be d o n e while our feeling that the benefits of
the remaining registers are still freely linearization probably outweigh the
dumping garbage into FROMSPACE. inefficiencies of t h e floating garbage. A
successful resolution of this and other
In addition, one could write programs performance issues will have to wait for
in s u c h a w a y t h a t lists which are likely implementation. Neither our algorithm nor
to be s h o r t l i v e d are accessible through the Baker's has been implemented and analyzed
registers scanned later, while the more as y e t .
durable data is accessible through
registers scanned earlier in t h e c y c l e . Acknowledgment

In summary, one should scan the This algorithm was developed as p a r t of


registers pointing to the eternal data an e f f o r t to p u t t o g e t h e r a LISP system at
first and the ephemeral registers last. Wang Laboratories for in house use. I
Thus, l i s t s of a r g u m e n t s for f u n c t i o n calls would like to thank Dan Corwin for his
w o u l d g o in t h e l a s t s c a n n e d registers, and patience and perseverance in that effort.
globally bound variables would go in the I would a l s o l i k e to t h a n k H e n r y B a k e r for
first scanned registers. his thoughtful comments and encouragement.

It is f e l t t h a t t h e r e is a h i g h early
r a t e of d r o p p a g e among the newly allocated
cells. If c e l l s a r e a l l o c a t e d in F R O M S P A C E
early in the cycle, the shortlived cells
can be left behind. While if cells are
allocated in TOSPACE late in the cycle,
t h e y w i l l n o t h a v e to be c o p i e d . Choosing

163
REFERENCES

Baker77 Baker, H.G., Jr. and C. H e w i t t , Cohen81 Cohen, J., " G a r b a g e collection
"The incremental garbage of l i n k e d d a t a s t r u c t u r e s , " ACM
collection of p r o c e s s e s , " Memo C__qomput___~ing__Surveys, Vol. 13, No.
No. 454, Artificial 3, S e p t e m b e r 1981, pp. 3 4 1 - 3 6 7 .
Intelligence Laboratory, Feniche169 Fenichel, R.R. and J.C.
M.I.T., Cambridge, MA, D e c e m b e r Yochelson, "A LISP
1977. garbage-collector for
Baker78 Baker, H,G., "List processing virtual-memory computer
in real time on a serial systems," Communications o f the
computer," Communications of AC___MM, Vol. 12, No. ii, N o v e m b e r
the ACM, Vol. 21, No. 4, A p r i l 1969, pp. 6 1 1 - 1 2 .
1978, pp 2 8 0 - 2 9 4 . Hood81 Hood, R. and R. Melville,
Cheney70 Cheney, C.J., "A n o n r e c u r s i v e "Real-time queue operations in
list compacting algorithm,." pure LISP," Information
Communications of the ACM, Vol. Processin~ Letters, Vol. 13,
13, No. ii, N o v e m b e r 1970, pp. No. 2, N o v e m b e r 1981, pp. 50-54.
677-678. Steele75 Steele, G.L. Jr.,
Clark77 C l a r k , D.W. and C.C, G r e e n , "An "Multiprocessing compactifying
empirical study of list garbage collection,"
structure in LISP," Communications of the ACM, Vol.
Communications of the ACM, Vol. 18, NO. 9, S e p t e m b e r 1975, pp.
20, No. 2, F e b r u a r y 1977, pp. 495-508.
78-87. Wadler76 Wadler, P. L., " A n a l y s i s of an
Clark79 Clark, D.W., "Measurements of algorithm for r e a l - t i m e garbage
dynamic list s t r u c t u r e use in collection," Communications of
LISP," IEEE Transactions on the ACM, Vol. 19, No. 9,
Software Enqine£!!n~, Vol. N o v e m b e r 1976, pp. 4 9 1 - 5 0 0 .
SE-5, No. i, J a n u a r y 1979, pp.
51-59.

APPENDIX: The collector al@orithm

We present a pseudo-Algol implementation of our collector algorithm. We


have tried to keep the notation very close to that of [Baker78] to facilitate
comparison. This version of the algorithm does not use CDR-coding although
that could be arranged with the accompanying complexity.

Symbol Type Usage


B pointer Bottom of free storage in TOSPACE
PB pointer Linearization pointer
T pointer Top of free area in TOSPACE
F pointer Top of free area in FROMSPACE
SB pointer Scan pointer into bottom of TOSPACE
ST pointer Scan pointer into top of TOSPACE
Q integer Scanning quota
r integer Number of registers
R[l:r] array Registers

boolean procedure TOSPACE(x) :: % Returns true if x is in TOSPACE

boolean procedure FROMSPACE(x) :: % Returns true if x is in FROMSPACE

procedure FLIP( ) % Relabels the two semispaces and initializes the


pointers F, T, B, SB, PB, and ST. %

pointer procedure ALLOC(top,x,y) :: % Allocates a new cell in either semispace. %


begin % Allocate in either semispace %
top := top-2; % Decrement top of free space %
top[0] := x; % Write the cell %
top[l] := y~ % %
top; % Return pointer to new cell %
end

164
pointer procedure FORWARD(p) :: Follows forwarding addresses if necessary and
returns pointer to data cell %
if fromspace(p) and If p has been copied %
tospace(p[0])
then p[0] Point to copied version %
else p; Else return p %

pointer procedure COPY(p,d) :: % Copies the data cell pointed to by p at d %


begin
dr0] := p[0]; % Copy CAR pointer of p at d %
dr1] :: pill; % Copy C D R pointer of p at d %
p[0] := d; % Leave forwarding address in p %
end

pointer procedure CAR(x) :: % Redirect any CAR pointer in TOSPACE to a data cell
(qv. Condition 2). %
if tospace(x)
then x[0] := forward(x[0]) % Rewrite CAR of if needed
else forward(x[0]) % Follow any forwarding address

procedure RPLACA(x,y)
if fromspace(x) and % Check Condition 2
tospace(y) % (False forward address)
then begin
copy(x,T); % Copy x into TOSPACE at T %
T[0] := y; % Redirect copied cell %
T := T-2; % Point T to top free cell %
end
else if tospace(x) and % Check Condition 1 %
(x less than SB or % (Keep TOSPACE %
x greater than ST) and % self referential) %
fromspace(y)
then begin
x[0] := copy(y,T); % Copy pointee into TOSPACE %
T := T-2; % Point T to top free cell %
end
else x[0] := y; % Garden variety RPLACA %

pointer procedure CDR(x) :: % Redirect an~ CDR pointer to a data cell %


x[l] := forward(x[l]) % Rewrite CDR pointer if needed %

procedure RPLACD(x,y)
if tospace(x) and % Check Condition 1 %
(x less than SB or % (Keep TOSPACE %
x greater than ST) and % self referential) %
fromspace(y)
then begin
x[l] := copy(y,T); % Copy new pointee and redirect
T := T-2; % Point T to top free cell
end
else x{l] := y % Garden v a r i e t y RPLACD

pointer procedure CONS(x,y,d)


begin
collect( );
if tospace(x) or % Check Condition 2
then alloc(T,x,y) % Allocate cell in TOSPACE
else alloc(F,x,y); % Else in TOSPACE
end

165
@ r o c e d u r e COLLECT( )
for i = 1 to Q d o
if not linearize(i) then % Try to c o n t i n u e C D R - l i n e a r i z a t i o n %
if not bseed(i) then % Get l i n e a r i z a t i o n seed b e l o w B %
if not rseed(i) then % Get seed through a register %
if not tseed(i) then % Get seed from above T %
flip() % S t a r t next c o l l e c t i o n cycle %

boolean p r o c e d u r e LINEARIZE(i) :: % E x t e n d s l i n e a r i z a t i o n as long as possible. R e t u r n s


false if a new seed is needed, true if the scan
quota was met. %
if i greater than Q then true % Scanning q u o t a met %
else if PB = B then false % Can't extend l i n e a r i z a t i o n %
else if not
fromspace(PB[l] := forward(PB[l]) )
then begin % C D R of PB already in T O S P A C E %
PB := B; % Stop linearization %
i := i+l; % Increment scanning counter %
false; % Look for new seed %
end
else b e g i n
PB[I] := copy(PB[I],B); % Copy C D R of PB %
PB := B; % Move linearization pointer %
B := B+2; % Point to next free b o t t o m cell %
i := i+l; % Increment scanning counter %
linearize(i); % Try next l i n e a r i z a t i o n %
end

b o o l e a n p r o c e d u r e BSEED(i) :: % Try to get a new seed by scanning b e l o w B. R e t u r n s


false if all c e l l s have been scanned, true if
a seed was found or the scan quota met. %
if i greater than Q then true % Scan quota met %
else if SB = B then false % No linearization seeds b e l o w B %
else if
fromspace( SB[0] := forward(SB[0])
then begin % F o u n d a new seed %
B := copy(SB[0],B)+2; % Copy the new seed at B %
SB := SB+I; % Increment scan pointer %
i := i+l; % Increment scanning counter %
true; % Resume linearization %
end
else begin % Continue seed search %
SB := SB+I; % Increment scan pointer %
i := i+l; % Increment scanning counter %
bseed(i); % Scan next c e l l %
end

boolean p r o c e d u r e RSEED(i) :: % Try to o b t a i n a new seed from the lowest order


register. Returns false if all registers p o i n t
into TOSPACE, true if a new seed was found. %
for j=l to r do % Scan the registers %
if i greater than Q then % Scanning quota met %
begin
j := r; % Halt register check %
true; % Set up return to mutator %
end
else if fromspace(R[j]) then % Register pointing into F R O M S P A C E %
beg i n
i: = i+l; % Increment the scanning counter %
R[j] := copy(R[j],B); % C o p y the new seed %
j := r; % Halt register check %
true; % Set up new round of l i n e a r i z a t i o n %
end
else b e g i n
i := i+l; % Increment scanning counter %
false % C o n t i n u e search for seed %
end

166
boolean procedure TSEED(i) :: % Try to get a new seed by scanning above T. Returns
false if all cells have been scanned, true if
a seed was found or the scan quota met. %
if i greater than Q then true % Scanning quota met %
else if ST = T then false % No new seeds above T %
else if
fromspace( ST[0] := forward(ST[0]) )
then begin % Found a new seed %
B := c o p y ( S T [ 0 ] , B ) + 2 ; % Copy new seed at B %
ST := ST-I; % D e c r e m e n t top scan pointer %
i := i+l; % Increment scanning counter %
true; % Set up next round of l i n e a r i z a t i o n %
end
else begin % C o n t i n u e search for seed %
ST := ST-I; % D e c r e m e n t top scan pointer %
i := i+l; % Increment scanning counter %
tseed(i); % S c a n next cell %
end

167

You might also like