Roger Henriksson
LU-CS-TR:97-189
LUTEDX/(TECS-3077)/1-8/(1997)
Presented at:
OOPSLA’97 Workshop on Garbage Collection and Memory Management,
Atlanta, Georgia, USA, October 5, 1997.
[...] processes are studied separately and it is determined whether they are schedulable or not. If not, the system is clearly not schedulable and we are finished. Otherwise, we continue by analyzing the GC work motivated by the actions of the high-priority processes. The findings from the analysis are also used to determine the amount of memory that must be reserved in tospace for the high-priority processes.

High-priority processes

The first part of the analysis, determining the schedulability of the high-priority processes, employs standard scheduling analysis techniques. For example, if rate monotonic scheduling [LL73] has been used to assign priorities to the high-priority processes, rate monotonic analysis can be used. This includes the scheduling test of Liu and Layland [LL73], which uses the processor utilization ratio, or the exact analysis originally presented by Joseph and Pandya [JP86] and later enhanced by others to include processes with deadlines shorter than the period of the process, blocking, release jitter etc. [SRL94].

GC interference with a high-priority process manifests itself in two ways: a slightly increased worst-case execution time for the high-priority processes and a slight release jitter in the invocation of the processes. Both types of interference can easily be handled by existing analysis theory.

Garbage collection work

Given that the high-priority processes of a system have been determined to be schedulable, we need to verify that the GC work motivated by the actions of the high-priority processes is schedulable as well. Consider the worst-case scheduling situation, in which all high-priority processes are released simultaneously. Each time a high-prior- [...]

$$R_{GC} = C_{GC} + \sum_{i=1}^{N} \left\lceil \frac{R_{GC}}{T_i} \right\rceil \cdot C_i \tag{1}$$

Footnote 1: We use ⌈ ⌉ to denote the ceiling function, i.e. the smallest integer that is equal to, or larger than, the function argument.

The equation contains C_GC, which in our case depends on the actions of the high-priority processes. For each invocation of a high-priority process τ_i during R_GC, the required garbage collection work amounts to G_i. The total garbage collection work during R_GC will therefore be:

$$C_{GC} = \sum_{i=1}^{N} \left\lceil \frac{R_{GC}}{T_i} \right\rceil \cdot G_i \tag{2}$$

Applying (2) to (1) yields:

$$R_{GC} = \sum_{i=1}^{N} \left\lceil \frac{R_{GC}}{T_i} \right\rceil \cdot (C_i + G_i) \tag{3}$$

R_GC is found on both the left side and the right side of the equality. The smallest non-zero value of R_GC that satisfies (3) can be found using the recursive formula:

$$
\begin{aligned}
R_{GC}^{0} &= \sum_{i=1}^{N} C_i \\
R_{GC}^{n+1} &= \sum_{i=1}^{N} \left\lceil \frac{R_{GC}^{n}}{T_i} \right\rceil \cdot (C_i + G_i)
\end{aligned}
\tag{4}
$$

It should be noted that we cannot use 0 as the first approximation of R_GC, as is usually done in
rate-monotonic analysis when calculating the worst-case response time of a process. This is because 0 is a trivial solution to (3), whereas the solution we want is the first positive solution. Clearly, R_GC cannot be smaller than the sum of the worst-case execution times for the high-priority processes, since all processes are released simultaneously in the worst case and the garbage collector has lower priority than these processes. Therefore, the sum of the worst-case execution times for the high-priority processes is a suitable initial approximation of R_GC.

If the garbage collection work is schedulable, (4) will converge. If the garbage collection work is not schedulable, (4) will not converge since no solution exists. It is easy to detect that (4) has converged. This happens when two consecutive values of R_GC are found to be equal. But how do we detect that the formula does not converge? The answer is that it is possible to calculate an upper bound for R_GC. If one of the steps in the iterative process of calculating R_GC yields a value larger than the maximum possible response time, we can deduce that the iteration will not converge.

Theorem. The maximum possible response time for the garbage collector is the least common multiple of the periods of the high-priority processes, denoted lcm(T_1..T_N).

If we, for example, have a system with two high-priority processes with periods of 10 and 14 milliseconds respectively, the response time of the garbage collector must be less than or equal to lcm(10,14) = 70 milliseconds for the system to be schedulable.

Proof. Assume that all the high-priority processes are released simultaneously at time t. This is the worst-case scheduling situation. The processes will execute with different periods forming a scheduling pattern. Sooner or later they will all again become ready to run simultaneously, after which the scheduling pattern will repeat itself. This happens at time t+lcm(T_1..T_N). Thus, if there was not enough time in the time slot t..t+lcm(T_1..T_N) to complete the garbage collection work in progress, there will not be enough time in the next time slot either, and so on. The amount of required garbage collection work will continue to accumulate. The response time of the garbage collector must therefore be less than, or equal to, lcm(T_1..T_N).

Reserved memory

How do we calculate the amount of memory that has to be reserved in tospace for high-priority allocation, M_HP? We can do this by assuming that all high-priority processes are released immediately before a flip is to be performed. We furthermore assume that the flip cannot be performed within R_GC time units after the invocation of the high-priority processes, i.e. not until the garbage collector has finished the current GC cycle. This is the worst possible situation.

Our assumptions lead to the conclusion that we must reserve enough memory in tospace to hold all objects allocated during R_GC time units. This amounts to the sum of the worst-case allocation needs for the high-priority processes during this time. The allocation need for a process can be calculated by multiplying the worst-case allocation need for one invocation, A_i, by the number of times the process might be invoked during a time span of R_GC. We get:

$$M_{HP} = \sum_{i=1}^{N} \left\lceil \frac{R_{GC}}{T_i} \right\rceil \cdot A_i \tag{5}$$

Priority inheritance schemes

Priority inversion is a phenomenon where a higher priority process is blocked by a lower priority process holding a semaphore that the higher priority process is attempting to lock. The duration of the blocking is usually short since the semaphore is typically used to protect a short critical region. However, if a third, medium priority, process is released, it will preempt the lower priority process and prevent it from releasing the semaphore. This leads to arbitrary delays of the higher priority process, which is not acceptable in hard real-time systems. The problem of priority inversion was first described by Lampson and Redell [LR80].

To avoid blocking caused by priority inversion, priority inheritance protocols are often employed. All of these protocols involve temporarily raising the priority of a process that has locked a semaphore, which may cause a low-priority process to become a high-priority one until the semaphore is unlocked. This must be taken into consideration when analyzing the schedulability of a system of processes. We will look at how what is probably the most widely used priority inheritance protocol, namely the basic inheritance protocol, can be incorporated in our scheduling analysis. Two other priority inheritance protocols are the priority ceiling protocol and the immediate inheritance protocol, both of which can be included in the analysis in similar ways.

The basic inheritance protocol states that whenever a process blocks because the semaphore it attempts to lock is already locked by a process with a lower priority, the process currently possessing the lock will inherit the priority of the blocked process. The priority of a process is thus raised if, and only if, it is blocking a higher priority process.

Blocking a high-priority process and raising the priority of the process holding the semaphore
to the priority level of the blocked process can clearly be viewed as being equivalent to the high-priority process performing the work within the critical region of the process with lower priority. We can thus incorporate the basic inheritance protocol in our scheduling analysis by modifying (3) slightly:

$$R_{GC} = \sum_{i=1}^{N} \left\lceil \frac{R_{GC}}{T_i} \right\rceil \cdot (C_i + d_i + G_i + g_i) \tag{6}$$

For each process, τ_i, we have to add the worst-case time during which it, as we see it, performs work on behalf of low-priority processes to the worst-case execution time of the process. We denote the additional time d_i. While performing the work of a low-priority process, τ_i might take actions that motivate additional GC work to be performed. The additional worst-case time for GC work is denoted g_i, and must also be taken into account when calculating the response time of the garbage collector. It is worth noticing that it is only low-priority processes that influence d_i and g_i. The execution time and GC need of high-priority processes are already taken into account, even if they do block other high-priority processes.

To analyze a system utilizing the basic inheritance protocol, we must be able to determine the values of d_i and g_i. Either of the following two observations can be used to find an upper bound [SRL90]: First, under the basic inheritance protocol, a process τ_i can be delayed at most once by each process with lower priority which shares a semaphore with τ_i. Second, if m semaphores exist which can cause τ_i to block, then τ_i can be blocked at most m times, i.e. once by each semaphore. By analyzing the worst-case execution times and allocation needs of the corresponding critical regions and adding them up, we can compute d_i and g_i.

6 Implementation

A garbage collector based on Brooks’ algorithm, as formulated by Bengtsson [Ben90], and scheduled according to the principles described in this paper has been implemented within an actual real-time kernel [AB91]. The kernel has been developed at the Department of Automatic Control, Lund Institute of Technology, and it is used in various hard real-time applications, including control of industrial robots. The garbage collector was implemented in C with some critical parts in assembly code.

A series of test programs were run on a VME control computer equipped with a 25 MHz Motorola 68040 processor. The test programs, the garbage collector, and the real-time kernel were augmented with code producing output on the digital I/O port of the VME computer. The performance of the garbage collector could then be monitored by connecting a logic analyzer to the I/O port. Several performance aspects were studied, such as the cost of individual memory management operations, the amount of work required for garbage collection, and the time a high-priority process can be delayed by garbage collection.

Pointer assignment

Each pointer assignment is guarded by a write barrier in order to catch assignments that might jeopardize the integrity of the heap. The worst-case time required for a pointer assignment was determined to be 10 µs for high-priority processes. The worst-case cost is independent of object size since we do not evacuate objects while high-priority processes are running. For low-priority processes, on the other hand, the worst-case time depends on object size. For example, a pointer assignment involving a pointer to a 36 byte object required 21 µs in the worst case. Larger objects result in higher worst-case delays.

Allocation

No garbage collection is performed in association with memory allocation requests made by high-priority processes, which makes them significantly cheaper than in the original version of Brooks’ algorithm. Allocation involves only updating an allocation pointer and initializing the contents of the new object. All memory cells constituting the object are set to zero except for the GC information fields in the object header. This approach has been chosen for reasons of simplicity and safety. A more efficient approach, from a strict memory management point of view, would be to just set the pointers within the new object to zero, not the contents of the entire object. However, initializing scalar values as well as pointers is preferable in safety-critical applications since it reduces the risk of bugs related to the programmer forgetting to initialize scalar fields. The total cost for an allocation request consists of a constant time for administering the request and an initialization time proportional to the size of the object to allocate. The following table presents some worst-case allocation times for a high-priority process.

object size (bytes)    worst-case delay (µs)
100                    22
500                    37
1000                   58
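As a rough consistency check on the cost model just described (a constant administration time plus an initialization time proportional to object size), one can draw a straight line through the smallest and largest measurements in the allocation table and compare it with the middle one. The symbols t_alloc, a and b are introduced here for illustration only and do not appear in the paper, and the fit is an approximation of three measured points under an assumed linear model, not a guaranteed upper bound.

```latex
\begin{aligned}
t_{alloc}(s) &\approx a + b \cdot s \\
b &= \frac{58 - 22}{1000 - 100} = 0.04\ \mu\text{s/byte} \\
a &= 22 - 0.04 \cdot 100 = 18\ \mu\text{s} \\
t_{alloc}(500) &\approx 18 + 0.04 \cdot 500 = 38\ \mu\text{s} \quad (\text{measured: } 37\ \mu\text{s})
\end{aligned}
```

Under this assumption the three measurements are consistent with a fixed overhead of roughly 18 µs and an initialization cost of roughly 0.04 µs per byte.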
When a low-priority process requests memory, an amount of GC work corresponding to the size of the requested object is performed. The worst-case time delay will in this case therefore be significantly higher than for a high-priority process.

Garbage collection work

Whenever a high-priority process is suspended and no other high-priority process is ready to execute, the garbage collector checks if there is any GC work pending. If so, garbage collection is started. The amount of GC work that is performed depends on the amount of memory allocated by the high-priority processes, the heap size, the maximum amount of simultaneously live memory, and the maximum amount of total GC work that may be required during one GC cycle. The worst-case time required for garbage collection in the presence of a single high-priority process was studied. The high-priority process ran with a frequency of 1000 Hz and required 222 µs to execute in the worst case. The clock interrupt that triggered the process required 85 µs. Each semispace consisted of 50 kilobytes (total heap size 100 kilobytes) and the object size and amount of simultaneously live memory were varied. The worst-case times for garbage collection are given in the following table.

object size    simultaneously live    worst-case time required
(bytes)        memory (bytes)         for GC (µs)
100            4000                   221
100            20000                  261
500            20000                  470
1000           22000                  503

We see that the garbage collector was able to keep up with the application even under heavy load (1 megabyte allocated memory per second in a 100 kilobyte heap with a high ratio of live objects).

It should also be noted that the garbage collector is suspended if a high-priority process becomes ready to run while GC work is performed. High-priority processes are thus not blocked for the time periods given in the table above.

Locking

Manipulating the pointer graph and copying objects cause the GC heap to be temporarily inconsistent. Such operations must therefore be performed atomically. This is achieved by switching off the processor interrupts before an atomic operation is commenced and switching them on again directly after completing the operation. No context switch is possible during this time. A high-priority process becoming ready to run might therefore be delayed until the atomic operation is finished. It is therefore important to keep atomic operations shorter than the maximum latency tolerated by the critical processes.

The first implementation of our garbage collector considered copying an object to be a single atomic operation. Initialization of an object in connection with an allocation request was considered an atomic operation as well. This proved to work well as long as only small objects, less than 100 bytes, resided on the heap, but it did not scale up as the maximum object size grew. For example, a 1000 byte object caused a locking delay of 177 µs, which is rather long for systems with high-frequency control loops. For even larger object sizes, which are quite realistic in an embedded system, the problem grows even worse.

The increase of the worst-case locking time with growing object size can be avoided if object copying and object initialization are made interruptible. Then, the worst-case delay will be independent of the maximum object size. Object initialization can easily be made interruptible, but object copying presents some practical problems with process synchronization. If a context switch occurs when the garbage collector is copying an object, the process given control of the processor might attempt to modify the object being copied. In such a case it will modify the old copy, perhaps a part of the old copy that has already been copied. In order to avoid producing an inconsistent version of the object when the copying is resumed, we have chosen to back out of the copying and restart it from scratch. This does lead to a slightly increased worst-case overhead for GC, since every context switch could potentially cause an object copying to be aborted. By making object initialization and copying interruptible we have managed to limit the worst-case locking delay to 38 µs, independent of the maximum object size.

7 Conclusions

We have discussed embedded systems and how they can benefit from dynamic memory management. The drawbacks of manual memory management have been pointed out and we have established that automatic memory management, or garbage collection, is desirable in complex embedded systems. However, previous techniques for garbage collection do not provide sufficient support for hard real-time systems.

A novel strategy for scheduling the work of existing fine-granular incremental garbage collection algorithms was presented. High-priority processes which have to comply with hard real-time demands execute with very little interference from the garbage collector. The interference is small with respect to the response time demands of the high-priority processes and is negligible from a control theory point of view. The remaining processor time is divided between running the garbage collector and low-priority processes. The impact of garbage collection on the low-priority processes is small enough to satisfy soft real-time demands.

A method for analyzing the worst-case behavior of a system of processes and garbage collection was presented. The interference from the garbage collector is strictly bounded and predictable. Overhead for memory management operations, such as pointer assignments and allocation requests, is analyzed by including it in the worst-case execution time of the high-priority processes. Process release jitter caused by the garbage collector momentarily switching off the interrupts is also easily handled with standard schedulability analysis techniques. It is thus possible to a priori verify that a set of high-priority processes and the corresponding garbage collection work are schedulable without violating any hard deadlines. It was also demonstrated how the analysis works in the presence of priority inheritance protocols.

The described garbage collection strategy was implemented in an existing real-time kernel and used in an actual control application. Measurements demonstrate that automatic memory management is feasible in hard real-time systems, even in systems with sampling frequencies as high as 1000 Hz. The cost of individual memory management operations was observed to be low for high-priority processes.

References

[AB91] L. Andersson, A. Blomdell. A Real-Time Programming Environment and a Real-Time Kernel. National Swedish Symposium on Real-Time Systems, 1991.
[Bak78] H. G. Baker. List Processing in Real Time on a Serial Computer. Communications of the ACM, April 1978.
[Ben90] M. Bengtsson. Real-Time Garbage Collection. Licentiate Thesis, Dept. of Computer Science, Lund University, 1990.
[Bro84] R. A. Brooks. Trading Data Space for Reduced Time and Code Space in Real-Time Garbage Collection on Stock Hardware. Proceedings of the 1984 ACM Symposium on Lisp and Functional Programming, August 1984.
[Hen96] R. Henriksson. Scheduling Real-Time Garbage Collection. Licentiate Thesis, Dept. of Computer Science, Lund University, 1996.
[JP86] M. Joseph, P. Pandya. Finding Response Times in a Real-Time System. The Computer Journal, Vol. 29, No. 5, 1986.
[LL73] C. L. Liu, J. W. Layland. Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment. Journal of the ACM, Vol. 20, No. 1, 1973.
[LR80] B. W. Lampson, D. D. Redell. Experience with Processes and Monitors in Mesa. Communications of the ACM, Vol. 23, No. 2, 1980.
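As a worked recap of the analysis, the recurrence (4), the lcm bound from the theorem, and the memory reservation (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names `gc_response_time` and `reserved_memory`, the tuple layouts, and all execution times, GC work figures and allocation sizes in the example are hypothetical. Only the periods 10 and 14 are taken from the example in the text. Times are assumed to be integers in a common unit so that the least common multiple is well defined.

```python
from math import lcm  # variadic math.lcm requires Python 3.9 or later


def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def gc_response_time(tasks):
    """Iterate recurrence (4) for the worst-case GC response time R_GC.

    tasks: list of (T, C, G) integer tuples, where T is the period, C the
    worst-case execution time (assumed > 0) and G the worst-case GC work
    motivated by one invocation of a high-priority process.
    Returns R_GC, or None if an iterate exceeds lcm(T_1..T_N), in which
    case the iteration cannot converge to a schedulable value.
    """
    bound = lcm(*(t for t, _, _ in tasks))
    r = sum(c for _, c, _ in tasks)  # R^0 = sum of C_i (0 would be a trivial solution)
    while r <= bound:
        nxt = sum(ceil_div(r, t) * (c + g) for t, c, g in tasks)
        if nxt == r:  # two consecutive values are equal: converged
            return r
        r = nxt
    return None


def reserved_memory(allocs, r_gc):
    """Equation (5): allocs is a list of (T, A) tuples, A in bytes per invocation."""
    return sum(ceil_div(r_gc, t) * a for t, a in allocs)


if __name__ == "__main__":
    # Hypothetical parameters; periods 10 and 14 as in the lcm example.
    r = gc_response_time([(10, 3, 2), (14, 4, 2)])
    print(r)                                          # 27
    print(reserved_memory([(10, 100), (14, 200)], r))  # 700
    print(gc_response_time([(10, 4, 2), (14, 5, 3)]))  # None (iterates exceed 70)
```

One way to apply (6) with the same function is to pass C_i + d_i and G_i + g_i in place of C_i and G_i, since the blocking terms enter the recurrence in exactly the same way.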