
Predictable Automatic Memory Management

for Embedded Systems

Roger Henriksson

LU-CS-TR:97-189
LUTEDX/(TECS-3077)/1-8/(1997)

Presented at:
OOPSLA’97 Workshop on Garbage Collection and Memory Management,
Atlanta, Georgia, USA, October 5, 1997.

An earlier version of the paper was published in:


Proceedings of SNART’97, Conference of the Swedish National Real-Time Association,
Lund, Sweden, August 21-22, 1997.

Department of Computer Science

Lund Institute of Technology


Lund University

Box 118, S-221 00 Lund, Sweden


Predictable Automatic Memory Management
for Embedded Systems

Roger Henriksson

Department of Computer Science, Lund University


Box 118, S-221 00 Lund, Sweden
e-mail: Roger.Henriksson@dna.lth.se

Abstract

The power of dynamic memory management can be used to produce more flexible control applications without compromising the robustness of the applications. It is demonstrated how automatic memory management, or garbage collection (GC), can be used in a system that has to comply with hard real-time demands while still preserving the predictability of the system. A suitable garbage collection algorithm is described together with a strategy for scheduling the work of the algorithm. A method for proving that a given set of processes will always meet their deadlines without interference from the garbage collector is given.

1 Introduction

Traditionally, embedded real-time systems have been implemented using static techniques to ensure predictability. Static task scheduling has been used together with static memory management. The demand for more flexible applications, together with improved implementation and analysis techniques, has brought an increased use of dynamic process scheduling. However, memory is still in most cases managed statically. Increased application complexity and demands on flexibility make dynamic memory management more and more desirable. Object-oriented techniques have recently begun making their way into the development of embedded systems, which further accentuates the need for dynamic memory management.

The most common technique for dynamic memory management is manual memory management. This implies that it is the responsibility of the application, or the application programmer, to decide which objects in memory must be retained (so-called live objects) and which can be deallocated and reused (dead objects). Manual memory management does enable a flexible use of the available memory, but it also introduces a set of new problems. In any but toy-size applications, the problem of keeping track of which objects are live and which are dead is very complex. Failure to manage the memory in a correct fashion usually manifests itself in one of two ways, namely memory leaks or dangling pointers. Memory leaks are caused by neglecting to deallocate dead objects. Eventually the application runs out of memory, which causes the application to crash. Dangling pointers are introduced by deallocating objects too early, while they are still live. When the application later tries to access one of the deallocated objects it often results in a crash.

Memory leaks and dangling pointers can be avoided if automatic memory management is introduced. Here, it is the responsibility of a special module of the runtime system, the garbage collector, to identify dead objects and reclaim the memory they occupy. This process is called garbage collection, or GC for short. A lot of error-prone code can thus be removed from the application. Another bonus that GC brings is that it can easily be combined with memory compaction, avoiding the memory fragmentation that otherwise is a problem associated with dynamic memory management.

Automatic memory management has a lot of attractive properties that make it desirable in embedded systems. Unfortunately, GC has been little used in systems with strict demands on response times (hard real-time systems). The major reason for this is that existing GC techniques tend to disrupt the application for too long periods of time.

This paper presents a novel strategy for scheduling the GC work in such a way that the interference with the execution of critical processes is minimized. The worst-case GC overhead for high-priority processes is predictable, strictly bounded, and small with respect to the response time demands of most control systems. This makes it possible to a priori guarantee that the application will meet all its deadlines. A variant of the strategy handling only a single critical process has previously been described in the form of a licentiate thesis [Hen96].
2 Control systems

In control systems, high-priority processes perform tasks such as sampling input from the controlled process, executing regulator algorithms (PID algorithms etc.), and adjusting output signals. These processes are executed periodically, often as frequently as 100 times or more per second. The scheduling demands on these processes are that they should both start with very little delay and be completed within a guaranteed short period of time, often in the order of 1 millisecond or less. Control theory needs the time constraints in order to guarantee stability [ÅW84]. To summarize:

• High-priority processes must start on time – they cannot afford to wait for extensive GC work to complete. The GC work must be performed in small chunks or be interruptible within a time frame that is shorter than the demanded activation time.

• High-priority processes must complete in time. When calculating the worst-case execution time, possible delays add up. It is therefore an advantage if the individual worst-case costs for memory management operations can be kept small enough that the cumulative cost does not add significantly to the worst-case execution time.

Low-priority processes are used to perform actions such as computation of reference values, supporting presentation and supervision of the state of the controlled process, changing parameters etc. These processes are also time critical, but the time constraints are much weaker, typically in the area of 100 milliseconds or less. Missing a deadline can often be tolerated provided it does not happen too often.

3 A real-time GC algorithm

Baker's algorithm [Bak78] is an incremental copying garbage collection algorithm. By incremental we mean that the garbage collector (collector for short) runs interleaved with the application program (also denoted the mutator since it modifies the object graph). Each time the collector is invoked, a small piece of GC work is performed. The heap is divided into two equally sized semispaces, fromspace and tospace. Live objects are copied, or evacuated, one at a time from an old area (fromspace) to a new area (tospace), which is also used for allocating new objects, see Figure 1. When tospace runs out of space, all live objects must have been evacuated from fromspace, so it can be reused. At this moment, a flip is performed, reversing the roles of tospace and fromspace and starting a new GC cycle. The GC work (the copying) must be scheduled often enough to guarantee that all live objects are evacuated before tospace is filled (but from an efficiency point of view preferably just in time). Baker's proposal is to let allocation of new objects trigger the GC work. When a new object is allocated, an amount of live objects relative to the size of the new object is evacuated. This scheme simplifies the argument that objects are indeed copied at a sufficient rate to guarantee that a flip can be made when tospace is filled. In order to update all pointers to the new location of a moved object, the original formulation uses a read barrier, that is, pointers are updated (using forwarding pointers in fromspace) when accessed after the object has been moved. A later improvement of the algorithm [Bro84] shows that a write barrier can be used instead, only intercepting pointer updates.

Figure 1. The heap structure of Baker's algorithm. Fromspace contains the old objects; tospace contains the evacuated objects and the newly allocated objects.

Fine-grained incremental algorithms, such as Baker's algorithm and its variants, constitute a large step towards making garbage collection work in real-time systems. Each memory management operation can potentially trigger GC work, but the amount of work performed at each invocation is small and bounded. The disturbance is small enough for use in low-priority processes due to their relaxed real-time demands and since the possible delay of starting a high-priority process is only one short GC invocation. For high-priority processes, on the other hand, the overhead can be too large. The worst-case execution time for a sequence of memory management operations quickly adds up. Some improvements of the algorithm are thus needed in order to make it useful in a system with hard real-time demands.
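To make the principle concrete, the following C fragment sketches how allocation can drive the evacuation work in a Baker-style semispace collector. It is only a minimal illustration of the idea described above, not the implementation evaluated later in this paper; the names (heap_t, evacuate_some, flip, evacuation_done) and the fixed evacuation ratio GC_RATIO are assumptions made for the example, and the evacuation machinery itself is elided.

    #include <stddef.h>
    #include <string.h>

    typedef struct {
        char  *fromspace;     /* the two equally sized semispaces        */
        char  *tospace;
        size_t semi_size;     /* size of one semispace in bytes          */
        size_t alloc;         /* allocation offset in tospace            */
    } heap_t;

    void evacuate_some(heap_t *h, size_t amount);  /* copy ~amount bytes of
                                                      live objects (elided) */
    int  evacuation_done(heap_t *h);               /* all live objects copied? */
    void flip(heap_t *h);                          /* swap semispace roles and
                                                      reset the allocation
                                                      pointer (elided)        */

    #define GC_RATIO 2   /* assumed: bytes evacuated per byte allocated */

    /* Baker-style allocation: each request performs an amount of GC work
       proportional to the size of the new object, so that evacuation is
       complete by the time tospace fills up. */
    void *allocate(heap_t *h, size_t size)
    {
        evacuate_some(h, GC_RATIO * size);
        if (h->alloc + size > h->semi_size) {
            if (!evacuation_done(h))
                return NULL;       /* evacuation did not keep up with
                                      allocation: the deadlock situation
                                      discussed under Reserved memory below */
            flip(h);               /* start a new GC cycle */
        }
        void *obj = h->tospace + h->alloc;
        h->alloc += size;
        memset(obj, 0, size);      /* initialize the new object */
        return obj;
    }

If the ratio is chosen high enough, the failing branch is never taken for well-behaved programs; guaranteeing this in the presence of high-priority processes is exactly what the reserved-memory discussion and the analysis in Section 5 are about.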
4 Scheduling principle

The fact that every operation related to memory management may trigger GC work to be performed makes it difficult to guarantee short enough response times for the high-priority processes. We therefore propose that these processes
are treated in a different way from a memory management point of view. GC work should be prohibited altogether while a high-priority process executes. This in turn means that the garbage collector will, temporarily, get behind with its work. The missing GC work is performed by a separate GC process as soon as no more high-priority processes are eligible for execution. Low-priority processes, on the other hand, use the traditional strategy to trigger GC work. It should be noted, however, that lengthy (from the perspective of a high-priority process) GC work triggered by a low-priority process must be interruptible in order not to delay an invocation of a high-priority process. We will thus have three main levels of priority:

1. High-priority processes
2. Garbage collection triggered by the high-priority processes
3. Low-priority processes interleaved with triggered GC work

Processes within priority levels 1 and 3 above may of course have different priorities amongst themselves.

The proposed scheduling strategy is quite general in the sense that it is applicable to most fine-grained incremental GC algorithms. We will, however, limit ourselves to studying what implications the strategy has on one such algorithm, namely Brooks' algorithm as formulated by Bengtsson [Ben90].

High-priority processes

Time-critical high-priority processes must be able to guarantee short response times. Therefore, memory management operations performed by such processes must be kept cheap. There are three kinds of memory management operations: pointer dereferencing, pointer assignment, and allocation.

If Brooks' algorithm is used, the cost of dereferencing a pointer is already low. Pointer dereferencing is performed using an indirection step (by following a forwarding pointer in the header of the object). This allows objects to be moved without immediately identifying and changing every pointer in the system referring to the object. The overhead for dereferencing a pointer thus consists of only one extra memory access.

In order to guarantee that an assignment to a pointer does not create pointers into fromspace that the garbage collector does not know about, Brooks proposes that a write barrier is used. All pointer assignments are watched and those that might jeopardize the integrity of the heap are caught. If the object referenced by the new value of the pointer is located in fromspace it is immediately evacuated and the pointer is updated. Thus, in the worst case, every pointer assignment would cause an object to be copied. The overhead will furthermore depend on the size of the evacuated objects. The worst-case execution time for pointer assignments adds up quickly, making it difficult to guarantee short response times. Therefore, we employ a delayed-evacuation strategy in which we delay the actual evacuation of objects until the high-priority processes have finished executing [Hen96]. The worst-case overhead for a pointer assignment can now be reduced to as little as 12 machine instructions on a processor such as a Motorola 680x0, independent of the object size.
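The following C fragment sketches the two mechanisms just described: dereferencing through a Brooks-style forwarding pointer, and a write barrier that merely records, rather than evacuates, fromspace objects while a high-priority process is running. It is an illustration under assumed names (object_t, in_fromspace, log_for_evacuation, high_priority_running), not the code measured in Section 6.

    #include <stddef.h>

    typedef struct object {
        struct object *forward;   /* forwarding pointer in the object header */
        /* ... user data follows ... */
    } object_t;

    /* Dereferencing always follows the forwarding pointer: one extra
       memory access, whether or not the object has been moved. */
    static inline object_t *deref(object_t *p)
    {
        return p->forward;
    }

    extern int  in_fromspace(const object_t *obj);    /* heap test (elided)       */
    extern int  high_priority_running(void);          /* scheduler query (elided) */
    extern object_t *evacuate(object_t *obj);         /* copy to tospace (elided) */
    extern void log_for_evacuation(object_t **slot);  /* remember slot (elided)   */

    /* Write barrier executed on every pointer assignment *slot = value. */
    void pointer_assign(object_t **slot, object_t *value)
    {
        if (value != NULL && in_fromspace(value)) {
            if (high_priority_running()) {
                /* Delayed evacuation: only record the slot. The object is
                   evacuated, and the slot updated, once no high-priority
                   process is eligible to run. Small, constant worst case. */
                log_for_evacuation(slot);
            } else {
                /* Brooks' original behaviour, used by low-priority
                   processes: evacuate immediately and store the new
                   address. The cost depends on the object size. */
                value = evacuate(value);
            }
        }
        *slot = value;
    }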
Memory allocation can be a very costly operation in the original version of the algorithm. Whenever an allocation request is made, the garbage collector is started, performing an amount of work. The amount of GC work, and thus the required time, depends on the size of the requested block of memory, the maximum amount of simultaneously live memory, the total heap size, and the maximum amount of GC work that may have to be performed during one GC cycle. The overhead for a memory allocation will consequently vary if the size of the heap changes or if the maximum amount of live memory changes (perhaps due to changes in unrelated parts of the program). In order to eliminate the high cost of memory allocation in high-priority processes, we delay the GC work until all high-priority processes have finished executing. A memory allocation will now become a cheap operation for a high-priority process, only involving modifying the allocation pointer and initializing the contents of the new object.
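A corresponding sketch of allocation on behalf of a high-priority process, reusing the heap_t layout assumed in the earlier example: no GC work is performed, only a pointer bump, initialization, and a note telling the GC process how much work to catch up on later. The helper record_pending_gc_work is, again, an assumption made for the example.

    extern void record_pending_gc_work(size_t bytes);  /* settled later by the
                                                           GC process (elided) */

    /* Allocation by a high-priority process: the worst-case cost is a
       small constant plus a term proportional to the object size (the
       initialization). The analysis in Section 5 reserves enough memory
       in tospace for these allocations, so no GC work is needed here. */
    void *allocate_high_priority(heap_t *h, size_t size)
    {
        void *obj = h->tospace + h->alloc;   /* bump the allocation pointer   */
        h->alloc += size;
        memset(obj, 0, size);                /* zero both scalars and pointers */
        record_pending_gc_work(size);
        return obj;
    }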
Reserved memory

The copying garbage collector algorithm we use to illustrate our scheduling strategy may experience a deadlock if the evacuation of live objects from fromspace does not keep up with the allocation of new objects. The garbage collector might end up in a situation where some objects remain to be evacuated but there is no space left in tospace to hold them. When low-priority processes allocate memory, they make sure that enough GC work has been performed before they actually allocate the new object, a strategy which guarantees that no such deadlock occurs. High-priority processes, on the other hand, allocate memory before the corresponding GC work has been performed. This is a potentially dangerous situation: if the high-priority processes are invoked shortly before a flip is due, there might not be enough memory left in tospace to hold both the new objects and the live objects that have not yet been evacuated from fromspace. The solution to this problem is to schedule the GC work, and the flip, in such a way that enough memory remains in tospace for
evacuation of live objects even if high-priority processes are invoked immediately before the flip. This can be viewed as reserving an amount of memory in tospace for allocation made by high-priority processes. We denote this amount M_HP.

5 Scheduling analysis

Before a safety-critical control program is used in a real control situation, it is important to convince oneself that the program will meet all its hard deadlines, i.e. that the process set is schedulable. Information about the process set, such as the worst-case execution times for the various processes, the worst-case GC work that has to be performed to clean up after the processes, and the worst-case allocation need for each process, is used as input to the analysis.

Verifying the schedulability of the GC work is a two-stage process. First, the high-priority processes are studied separately and it is determined whether they are schedulable or not. If not, the system is clearly not schedulable and we are finished. Otherwise, we continue by analyzing the GC work motivated by the actions of the high-priority processes. The findings from the analysis are also used to determine the amount of memory that must be reserved in tospace for the high-priority processes.

High-priority processes

The first part of the analysis, determining the schedulability of the high-priority processes, employs standard scheduling analysis techniques. For example, if rate monotonic scheduling [LL73] has been used to assign priorities to the high-priority processes, rate monotonic analysis can be used. This includes the scheduling test of Liu and Layland [LL73], which uses the processor utilization ratio, or the exact analysis originally presented by Joseph and Pandya [JP86] and later enhanced by others to include processes with deadlines shorter than the period of the process, blocking, release jitter etc. [SRL94].

GC interference with a high-priority process manifests itself in two ways: a slightly increased worst-case execution time for the high-priority processes and a slight release jitter in the invocation of the processes. Both types of interference can easily be handled by existing analysis theory.

Garbage collection work

Given that the high-priority processes of a system have been determined to be schedulable, we need to verify that the GC work motivated by the actions of the high-priority processes is schedulable as well. Consider the worst-case scheduling situation, in which all high-priority processes are released simultaneously. Each time a high-priority process, τ_i, with period T_i, is invoked, it executes for a duration equal to its worst-case execution need, C_i, and performs memory management related actions that require a worst-case garbage collection work of G_i to be performed. If we can show that this situation is schedulable, all other possible (less demanding) scheduling situations will also be schedulable.

Definition. We define the worst-case response time of the garbage collector, R_GC, as the time from when the high-priority processes are simultaneously released until no more garbage collection work is left to be performed.

Let C_GC denote the worst-case time required for GC work in any interval of time of length R_GC, t..t+R_GC. R_GC can then be calculated in a way similar to how response times are calculated in the exact rate-monotonic analysis. We assume that N high-priority processes, τ_1..τ_N, exist:¹

    R_{GC} = C_{GC} + \sum_{i=1}^{N} \lceil R_{GC}/T_i \rceil \, C_i        (1)

¹ We use \lceil x \rceil to denote the ceiling function, i.e. the smallest integer that is equal to, or larger than, the function argument.

The equation contains C_GC, which in our case depends on the actions of the high-priority processes. For each invocation of a high-priority process τ_i during R_GC, the required garbage collection work amounts to G_i. The total garbage collection work during R_GC will therefore be:

    C_{GC} = \sum_{i=1}^{N} \lceil R_{GC}/T_i \rceil \, G_i        (2)

Applying (2) to (1) yields:

    R_{GC} = \sum_{i=1}^{N} \lceil R_{GC}/T_i \rceil \, (C_i + G_i)        (3)

R_GC is found on both the left side and the right side of the equality. The smallest non-zero value of R_GC that satisfies (3) can be found using the recursive formula:

    R_{GC}^{0} = \sum_{i=1}^{N} C_i
                                                                           (4)
    R_{GC}^{n+1} = \sum_{i=1}^{N} \lceil R_{GC}^{n}/T_i \rceil \, (C_i + G_i)

It should be noted that we cannot use 0 as the first approximation of R_GC, as is usually done in
rate-monotonic analysis when calculating the worst-case response time of a process. This is because 0 is a trivial solution to (3), whereas the solution we want is the first positive, non-zero solution. Clearly, R_GC cannot be smaller than the sum of the worst-case execution times for the high-priority processes, since all processes are released simultaneously in the worst case and the garbage collector has lower priority than these processes. Therefore, the sum of the worst-case execution times for the high-priority processes is a suitable initial approximation of R_GC.

If the garbage collection work is schedulable, (4) will converge. If the garbage collection work is not schedulable, (4) will not converge since no solution exists. It is easy to detect that (4) has converged. This happens when two consecutive values of R_GC are found to be equal. But how do we detect that the formula does not converge? The answer is that it is possible to calculate an upper bound for R_GC. If one of the steps in the iterative process of calculating R_GC yields a value larger than the maximum possible response time, we can deduce that the iteration will not converge.

Theorem. The maximum possible response time for the garbage collector is the least common multiple of the periods of the high-priority processes, denoted lcm(T_1..T_N).

If we, for example, have a system with two high-priority processes with periods of 10 and 14 milliseconds respectively, the response time of the garbage collector must be less than or equal to lcm(10,14) = 70 milliseconds for the system to be schedulable.

Proof. Assume that all the high-priority processes are released simultaneously at time t. This is the worst-case scheduling situation. The processes will execute with different periods, forming a scheduling pattern. Sooner or later they will all again become ready to run simultaneously, after which the scheduling pattern will repeat itself. This happens at time t+lcm(T_1..T_N). Thus, if there was not enough time in the time slot t..t+lcm(T_1..T_N) to complete the garbage collection work in progress, there will not be enough time in the next time slot either, and so on. The amount of required garbage collection work will continue to accumulate. The response time of the garbage collector must therefore be less than, or equal to, lcm(T_1..T_N).
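The iteration and its divergence test can be captured in a few lines of C. This is a hedged sketch of the analysis procedure described above; the data layout, the lcm helpers and the choice of time unit are assumptions made for the example, not part of the paper's tool support.

    #include <stdint.h>

    typedef struct {
        uint64_t T;   /* period                                    */
        uint64_t C;   /* worst-case execution time                 */
        uint64_t G;   /* worst-case GC work caused by one release  */
    } hp_process;     /* all values in the same time unit          */

    static uint64_t gcd(uint64_t a, uint64_t b) { return b ? gcd(b, a % b) : a; }
    static uint64_t lcm(uint64_t a, uint64_t b) { return a / gcd(a, b) * b; }
    static uint64_t ceil_div(uint64_t a, uint64_t b) { return (a + b - 1) / b; }

    /* Solve equation (4): the smallest positive fixed point of (3).
       Returns 0 if an iterate exceeds lcm(T_1..T_N), i.e. the GC work
       is not schedulable. */
    uint64_t gc_response_time(const hp_process p[], int n)
    {
        uint64_t bound = 1, r = 0;
        for (int i = 0; i < n; i++) {
            bound = lcm(bound, p[i].T);  /* theorem: upper bound on R_GC */
            r += p[i].C;                 /* initial approximation R_GC^0 */
        }
        for (;;) {
            uint64_t next = 0;
            for (int i = 0; i < n; i++)
                next += ceil_div(r, p[i].T) * (p[i].C + p[i].G);
            if (next == r)   return r;   /* converged                    */
            if (next > bound) return 0;  /* diverges: not schedulable    */
            r = next;
        }
    }

For the two example processes above (periods 10 and 14 ms), the iteration would be abandoned as soon as an iterate exceeded 70 ms.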
whenever a process blocks because the semaphore
Reserved memory it attempts to lock is already locked by a process
How do we calculate the amount of memory that with a lower priority, the process currently pos-
has to be reserved in tospace for high-priority sessing the lock will inherit the priority of the
allocation, MHP? We can do this by assuming that blocked process. The priority of a process is thus
all high-priority processes are released immedi- raised if, and only if, it is blocking a higher prior-
ately before a flip is to be performed. We further- ity process.
more assume that the flip can not be performed Blocking a high-priority process and raising
within RGC time units after the invocation of the the priority of the process holding the semaphore
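Continuing the analysis sketch above, equation (5) is a direct summation. Here the per-invocation allocation needs A_i are passed as a separate array, which is an assumption of the example rather than a prescribed interface:

    /* Equation (5): memory to reserve in tospace for high-priority
       allocation, given the response time r_gc computed above. */
    uint64_t reserved_memory(const hp_process p[], const uint64_t A[],
                             int n, uint64_t r_gc)
    {
        uint64_t m_hp = 0;
        for (int i = 0; i < n; i++)
            m_hp += ceil_div(r_gc, p[i].T) * A[i];
        return m_hp;
    }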
Priority inheritance schemes

Priority inversion is a phenomenon where a higher priority process is blocked by a lower priority process holding a semaphore that the higher priority process is attempting to lock. The duration of the blocking is usually short since the semaphore is typically used to protect a short critical region. However, if a third, medium priority, process is released, it will preempt the lower priority process and prevent it from releasing the semaphore. This leads to arbitrary delays of the higher priority process, which is not acceptable in hard real-time systems. The problem of priority inversion was first described by Lampson and Redell [LR80].

To avoid blocking caused by priority inversion, priority inheritance protocols are often employed. All of these protocols involve temporarily raising the priority of a process that has locked a semaphore, which may cause a low-priority process to become a high-priority one until the semaphore is unlocked. This must be taken into consideration when analyzing the schedulability of a system of processes. We will look at how probably the most commonly used priority inheritance protocol, the basic inheritance protocol, can be incorporated in our scheduling analysis. Two other priority inheritance protocols are the priority ceiling protocol and the immediate inheritance protocol, both of which can be included in the analysis in similar ways.

The basic inheritance protocol states that whenever a process blocks because the semaphore it attempts to lock is already locked by a process with a lower priority, the process currently possessing the lock will inherit the priority of the blocked process. The priority of a process is thus raised if, and only if, it is blocking a higher priority process.

Blocking a high-priority process and raising the priority of the process holding the semaphore to the priority level of the blocked process can clearly be viewed as being equivalent to the high-priority process performing the work within the critical region of the process with lower priority. We can thus incorporate the basic inheritance protocol in our scheduling analysis by modifying (3) slightly:

    R_{GC} = \sum_{i=1}^{N} \lceil R_{GC}/T_i \rceil \, (C_i + d_i + G_i + g_i)        (6)

For each process, τ_i, we have to add the worst-case time it, as we see it, performs work on behalf of low-priority processes to the worst-case execution time of the process. We denote this additional time d_i. While performing the work of a low-priority process, τ_i might take actions that motivate additional GC work to be performed. The additional worst-case time for GC work is denoted g_i, and must also be taken into account when calculating the response time of the garbage collector. It is worth noticing that it is only low-priority processes that influence d_i and g_i. The execution time and GC need of high-priority processes are already taken into account, even if they do block other high-priority processes.

To analyze a system utilizing the basic inheritance protocol, we must be able to determine the values of d_i and g_i. Either of the following two observations can be used to find an upper bound [SRL90]: First, under the basic inheritance protocol, a process τ_i can be delayed at most once by each process with lower priority which shares a semaphore with τ_i. Second, if m semaphores exist which can cause τ_i to block, then τ_i can be blocked at most m times, i.e. once by each semaphore. By analyzing the worst-case execution times and allocation needs of the corresponding critical regions and adding them up, we can compute d_i and g_i.
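In terms of the analysis sketch from Section 5, incorporating equation (6) amounts to inflating the per-process parameters before running the same fixed-point iteration. The following hedged fragment assumes the gc_response_time function introduced earlier and, for brevity, modifies its input in place:

    /* Equation (6): account for blocking under the basic inheritance
       protocol. d[] and g[] are the upper bounds derived from the
       critical regions as described above. Note that p[] is modified. */
    uint64_t gc_response_time_pip(hp_process p[], const uint64_t d[],
                                  const uint64_t g[], int n)
    {
        for (int i = 0; i < n; i++) {
            p[i].C += d[i];   /* work done on behalf of low-priority processes */
            p[i].G += g[i];   /* GC work motivated by that work                */
        }
        return gc_response_time(p, n);   /* reuse the iteration for (3)/(4)    */
    }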
6 Implementation

A garbage collector based on Brooks' algorithm, as formulated by Bengtsson [Ben90], and scheduled according to the principles described in this paper has been implemented within an actual real-time kernel [AB91]. The kernel has been developed at the Department of Automatic Control, Lund Institute of Technology, and it is used in various hard real-time applications, including control of industrial robots. The garbage collector was implemented in C with some critical parts in assembly code.

A series of test programs were run on a VME control computer equipped with a 25 MHz Motorola 68040 processor. The test programs, the garbage collector, and the real-time kernel were augmented with code producing output on the digital I/O port of the VME computer. The performance of the garbage collector could then be monitored by connecting a logic analyzer to the I/O port. Several performance aspects were studied, such as the cost of individual memory management operations, the amount of work required for garbage collection, and the time a high-priority process can be delayed by garbage collection.

Pointer assignment

Each pointer assignment is guarded by a write barrier in order to catch assignments that might jeopardize the integrity of the heap. The worst-case time required for a pointer assignment was determined to be 10 µs for high-priority processes. The worst-case cost is independent of object size since we do not evacuate objects while high-priority processes are running. For low-priority processes, on the other hand, the worst-case time depends on object size. For example, a pointer assignment involving a pointer to a 36 byte object required 21 µs in the worst case. Larger objects result in higher worst-case delays.

Allocation

No garbage collection is performed in association with memory allocation requests made by high-priority processes, which makes them significantly cheaper than in the original version of Brooks' algorithm. Allocation involves only updating an allocation pointer and initializing the contents of the new object. All memory cells constituting the object are set to zero except for the GC information fields in the object header. This approach has been chosen for reasons of simplicity and safety. A more efficient approach, from a strict memory management point of view, would be to set only the pointers within the new object to zero, not the contents of the entire object. However, initializing scalar values as well as pointers is preferable in safety-critical applications since it reduces the risk of bugs related to the programmer forgetting to initialize scalar fields. The total cost for an allocation request consists of a constant time for administrating the request and an initialization time proportional to the size of the object to allocate. The following table presents some worst-case allocation times for a high-priority process.

    object size (bytes)    worst-case delay (µs)
    100                    22
    500                    37
    1000                   58
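As a rough check of the stated cost model (this is an interpretation of the measurements, not a figure from the paper), the values are consistent with a constant administrative cost of about 18 µs plus roughly 0.04 µs per byte of initialization, which reproduces approximately 22, 38 and 58 µs for objects of 100, 500 and 1000 bytes respectively.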
When a low-priority process requests memory, an amount of GC work corresponding to the size of the requested object is performed. The worst-case time delay will in this case therefore be significantly higher than for a high-priority process.

Garbage collection work

Whenever a high-priority process is suspended and no other high-priority process is ready to execute, the garbage collector checks if there is any GC work pending. If so, garbage collection is started. The amount of GC work that is performed depends on the amount of memory allocated by the high-priority processes, the heap size, the maximum amount of simultaneously live memory, and the maximum amount of total GC work that may be required during one GC cycle. The worst-case time required for garbage collection in the presence of a single high-priority process was studied. The high-priority process ran with a frequency of 1000 Hz and required 222 µs to execute in the worst case. The clock interrupt that triggered the process required 85 µs. Each semispace consisted of 50 kilobytes (total heap size 100 kilobytes) and the object size and the amount of simultaneously live memory were varied. The worst-case times for garbage collection are given in the following table.

    object size (bytes)    simultaneously live memory (bytes)    worst-case time required for GC (µs)
    100                    4000                                  221
    100                    20000                                 261
    500                    20000                                 470
    1000                   22000                                 503

We see that the garbage collector was able to keep up with the application even under heavy load (1 megabyte of allocated memory per second in a 100 kilobyte heap with a high ratio of live objects).

It should also be noted that the garbage collector is suspended if a high-priority process becomes ready to run while GC work is performed. High-priority processes are thus not blocked for the time periods given in the table above.
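The behaviour just described can be pictured as a GC process running at priority level 2 of the three levels introduced in Section 4. The rendering below is a simplified sketch with invented kernel API names, not code from the kernel used in the experiments; preemption by high-priority processes is handled by the scheduler, which is why the loop itself can stay trivial.

    extern int  gc_work_pending(void);
    extern void perform_gc_increment(void);   /* small, interruptible GC step */
    extern void wait_for_gc_work(void);       /* block until work arrives     */

    /* Runs below the high-priority processes and above the low-priority
       ones, so it executes only when no high-priority process is eligible
       and is preempted as soon as one becomes ready. */
    void gc_process(void)
    {
        for (;;) {
            if (gc_work_pending())
                perform_gc_increment();
            else
                wait_for_gc_work();
        }
    }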
Locking

Manipulating the pointer graph and copying objects cause the GC heap to be temporarily inconsistent. Such operations must therefore be performed atomically. This is achieved by switching off the processor interrupts before an atomic operation is commenced and switching them on again directly after completing the operation. No context switch is possible during this time. A high-priority process becoming ready to run might therefore be delayed until the atomic operation is finished. It is therefore important to keep atomic operations shorter than the maximum latency tolerated by the critical processes.

The first implementation of our garbage collector considered copying an object to be a single atomic operation. Initialization of an object in connection with an allocation request was considered an atomic operation as well. This proved to work well as long as only small objects, less than 100 bytes, resided on the heap, but it did not scale up as the maximum object size grew. For example, a 1000 byte object caused a 177 µs locking delay, which is rather long for systems with high-frequency control loops. For even larger object sizes, which are quite realistic in an embedded system, the problem grows even worse.

The increase of the worst-case locking time with growing object size can be avoided if object copying and object initialization are made interruptible. Then, the worst-case delay will be independent of the maximum object size. Object initialization can easily be made interruptible, but object copying presents some practical problems with process synchronization. If a context switch occurs when the garbage collector is copying an object, the process given control of the processor might attempt to modify the object being copied. In such a case it will modify the old copy, perhaps a part of the old copy that has already been copied. In order to avoid producing an inconsistent version of the object when the copying is resumed, we have chosen to back out of the copying and restart it from scratch. This does lead to a slightly increased worst-case overhead for GC, since every context switch could potentially cause an object copying to be aborted. By making object initialization and copying interruptible we have managed to limit the worst-case locking delay to 38 µs, independent of the maximum object size.
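A schematic view of the abort-and-restart copying is given below. The flag used to detect an intervening context switch is an assumption of the sketch (the paper does not specify the mechanism); in practice it would be set by the kernel's context-switch code, and the forwarding pointer of the copied object would be installed in the same atomic step as the final chunk, which is omitted here.

    #include <string.h>

    extern volatile int context_switch_occurred;  /* set by the kernel on every
                                                     context switch (assumed) */
    extern void disable_interrupts(void);
    extern void enable_interrupts(void);

    #define CHUNK 16   /* bytes copied per atomic step (assumed) */

    /* Copy an object to tospace in short atomic steps. If a context switch
       has occurred since the copy started, the old copy may have been
       modified, so the copy is abandoned and restarted from scratch. */
    void copy_object(char *to, const char *from, unsigned size)
    {
    restart:
        context_switch_occurred = 0;
        for (unsigned done = 0; done < size; ) {
            unsigned n = size - done < CHUNK ? size - done : CHUNK;
            disable_interrupts();            /* each step is short and atomic */
            if (context_switch_occurred) {
                enable_interrupts();
                goto restart;                /* back out and start over       */
            }
            memcpy(to + done, from + done, n);
            done += n;
            enable_interrupts();             /* preemption is possible here   */
        }
    }

The worst-case locking delay is thus bounded by one chunk plus the surrounding bookkeeping, independently of the object size, at the price of occasionally redoing a copy after a context switch.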
7 Conclusions

We have discussed embedded systems and how they can benefit from dynamic memory management. The drawbacks of manual memory management have been pointed out and we have established that automatic memory management, or garbage collection, is desirable in complex embedded systems. However, previous techniques for garbage collection do not provide sufficient support for hard real-time systems.

A novel strategy for scheduling the work of existing fine-granular incremental garbage collection algorithms was presented. High-priority processes which have to comply with hard real-time demands execute with very little interference from the garbage collector. The interference is small with respect to the response time demands of the high-priority processes and is negligible from a control theory point of view. The remaining processor time is divided between running the garbage collector and low-priority processes. The impact of garbage collection on the low-priority processes is small enough to satisfy soft real-time demands.

A method for analyzing the worst-case behavior of a system of processes and garbage collection was presented. The interference from the garbage collector is strictly bounded and predictable. Overhead for memory management operations, such as pointer assignments and allocation requests, is analyzed by including it in the worst-case execution time of the high-priority processes. Process release jitter caused by the garbage collector momentarily switching off the interrupts is also easily handled with standard schedulability analysis techniques. It is thus possible to a priori verify that a set of high-priority processes and the corresponding garbage collection work are schedulable without violating any hard deadlines. It was also demonstrated how the analysis works in the presence of priority inheritance protocols.

The described garbage collection strategy was implemented in an existing real-time kernel and used in an actual control application. Measurements demonstrate that automatic memory management is feasible in hard real-time systems, even in systems with sampling frequencies as high as 1000 Hz. The cost of individual memory management operations was observed to be low for high-priority processes.

Acknowledgments

There are several people who have contributed to the work presented in this paper and whom I would like to thank: Boris Magnusson for introducing me to the problem of scheduling garbage collection work and for his support throughout the work. Klas Nilsson and Anders Blomdell for valuable hardware and software support during the implementation of the garbage collector. Anders Ive for helping me to evaluate the performance of the garbage collector. In addition, Klas and Anders provided me with many valuable comments on the draft of this paper. Thanks to the Department of Automatic Control, Lund Institute of Technology, for providing the means to evaluate the ideas in an actual control environment.

This work has been supported by NUTEK, the Swedish National Board for Technical Development, within their Embedded Systems program.

References

[AB91]  L. Andersson, A. Blomdell. A Real-Time Programming Environment and a Real-Time Kernel. National Swedish Symposium on Real-Time Systems, 1991.

[Bak78] H. G. Baker. List Processing in Real Time on a Serial Computer. Communications of the ACM, April 1978.

[Ben90] M. Bengtsson. Real-Time Garbage Collection. Licentiate Thesis, Dept. of Computer Science, Lund University, 1990.

[Bro84] R. A. Brooks. Trading Data Space for Reduced Time and Code Space in Real-Time Garbage Collection on Stock Hardware. Proceedings of the 1984 ACM Symposium on Lisp and Functional Programming, August 1984.

[Hen96] R. Henriksson. Scheduling Real-Time Garbage Collection. Licentiate Thesis, Dept. of Computer Science, Lund University, 1996.

[JP86]  M. Joseph, P. Pandya. Finding Response Times in a Real-Time System. The Computer Journal, Vol. 29, No. 5, 1986.

[LL73]  C. L. Liu, J. W. Layland. Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment. Journal of the ACM, Vol. 20, No. 1, 1973.

[LR80]  B. W. Lampson, D. D. Redell. Experience with Processes and Monitors in Mesa. Communications of the ACM, Vol. 23, No. 2, 1980.

[SRL90] L. Sha, R. Rajkumar, J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. IEEE Transactions on Computers, 39(9), 1990.

[SRL94] L. Sha, R. Rajkumar, J. P. Lehoczky. Generalized Rate-Monotonic Scheduling Theory: A Framework for Developing Real-Time Systems. Proceedings of the IEEE, Vol. 82, No. 1, 1994.

[ÅW84]  K. J. Åström, B. Wittenmark. Computer Controlled Systems – Theory and Design. Prentice-Hall, Englewood Cliffs, New Jersey, 1984.
