Professional Documents
Culture Documents
Mark Lewandowski, Mark J. Stanovich, Theodore P. Baker, Kartik Gopalan, An-I Andy Wang
Department of Computer Science
Florida State University
Tallahassee, FL 32306-4530
e-mail: [lewandow, stanovic, baker, awang]@cs.fsu.edu, kartik@cs.binghamton.edu
Abstract 1 Introduction
2
τk will complete within its deadline if 10
traditional demand bound
8 refined demand bound
demand bound
k−1
X linear demand bound
ek + demandmax
τi (dk ) ≤ dk (3) 6
i=1 4
3
The refined load-bound function on the right of (9) Thread context Thread context
4
The scheduling of the e1000 device-driven tasks can to model the aperiodic workloads of these tasks in a
be described as occurring at three levels. The schedul- way that supports schedulability analysis.
ing of the top two levels differs between the two Linux
kernel versions considered here (Figure 3), which are 4 Empirical Load Bound
the standard “vanilla” 2.6.16 kernel from kernel.org,
and Timesys Linux, a version of the 2.6.16 kernel
In this section we show how to model the workload
patched by Timesys Corporation to better support
of a device-driven task by an empirically derived load-
real-time applications.
bound function, which can then be used to estimate
Level 1. The hardware preempts the currently ex- the preemptive interference effects of the device driver
ecuting thread and transfers control to a generic in- on the other tasks in a system.
terrupt service routine (ISR) which saves the processor
For example, suppose one wants to estimate the to-
state and eventually calls a Level 2 ISR installed by the
tal worst-case device-driven processor load of a network
device driver. The Level 1 processing is always preemp-
device driver, viewed as a single conceptual task τD .
tively scheduled at the device priority. The only way
The first step is to experimentally estimate loadmax
τD (∆)
to control when such an ISR executes is to selectively
for enough values of ∆ to be able to produce a plot sim-
enable and disable the interrupt at the hardware level.
ilar to Figure 2 in Section 2. The value of loadmax
τD (∆)
Level 2. The driver’s ISR does the minimum for each value of ∆ is approximated by the maximum
amount of work necessary, and then requests that the observed value of demandτD (t − ∆, t)/∆ over a large
rest of the driver’s work be scheduled to execute at number of intervals [t − ∆, t).
Level 3 via the kernel’s “softirq” (software interrupt) One way to measure the processor demand of a
mechanism. In vanilla Linux this Level 2 processing is device-driven task in an interval is to modify the kernel,
called directly from the Level 1 handler, and so it is including the softirq and interrupt handlers, to keep
effectively scheduled at Level 1. In contrast, Timesys track of every time interval during which the task ex-
Linux defers the Level 2 processing to a scheduled ker- ecutes. We started with this approach, but were con-
nel thread, one thread per IRQ number on the x86 cerned about the complexity and the additional over-
architectures. head introduced by the fine-grained time accounting.
Level 3. The softirq handler does the rest Instead, we settled on the subtractive approach, in
of the driver’s work, including call-outs to perform which the CPU demand of a device driver task is in-
protocol-independent and protocol-specific processing. ferred by measuring the processor time that is left for
In vanilla Linux, the Level 3 processing is scheduled via other tasks.
a complicated mechanism with two sub-levels: A lim- To estimate the value of demandτD (t − ∆, t) for a
ited number of softirq calls are executed ahead of the network device driver we performed the following ex-
system scheduler, on exit from interrupt handlers, and periment, using two computers attached to a dedicated
at other system scheduling points. Repeated rounds network switch. Host A sends messages to host C at a
of a list of pending softirq handlers are made, allowing rate that maximizes the CPU time demand of C’s net-
each handler to execute to completion without preemp- work device driver. On system C, an application thread
tion, until either all have been cleared or a maximum τ2 attempts to run continuously at lower priority than
iteration count is reached. Any softirq’s that remain the device driver and monitors how much CPU time it
pending are served by a kernel thread. The reason for accumulates within a chosen-length interval. All other
this ad hoc approach is to achieve a balance between activity on C is either shut down or run at a priority
throughput and responsiveness. Using this mechanism lower than τ2 . If ∆ is the length of the interval, and
produces very unpredictable scheduling results, since τ2 is able to execute for x units of processor time in
the actual instant and priority at which a softirq han- the interval, then the CPU demand attributed to the
dler executes can be affected by any number of dy- network device is ∆ − x and the load is (∆ − x)/∆.
namic factors. In contrast, the Timesys kernel handles It is important to note that this approach only mea-
softirq’s entirely in threads; there are two such threads sures CPU interference. It will not address memory
for network devices, one for input processing and one cycle interference due to DMA operations. The reason
for output processing. is that most if not all of the code from τ2 will oper-
The arrival processes of the e1000 input and out- ate out of the processor’s cache and therefore virtually
put processing tasks generally need to be viewed as no utilization of the memory bus will result from τ2 .
aperiodic, although there may be cases where the net- This effect, known as cycle stealing, can slow down a
work traffic inherits periodic or sporadic characteristics memory intensive task. Measurement of memory cycle
from the tasks that generate it. The challenge is how interference is outside the scope of the present paper.
5
Each host had a Pentium D processor running in For this and the subsequent graphs, note that each
single-core mode at 3.0 GHz, with 2 GB memory and data point represents the maximum observed preemp-
an Intel Pro/1000 gigabit Ethernet adapter, and was tive interference over a series of trial intervals of a given
attached to a dedicated gigabit switch. Task τ2 was length. This is a hard lower bound, and it is also a sta-
run using the SCHED FIFO policy (strict preemptive tistical estimate of the experimental system’s worst-
priorities, with FIFO service among threads of equal case interference over all intervals of the given length.
priority) at a real-time priority just below that of the Assuming the interference and the choice of trial in-
network softirq server threads. All its memory was tervals are independent, the larger the number of trial
locked into physical memory, so there were no other intervals examined the closer the observed maximum
I/O activities (e.g. paging and swapping). should converge to the system’s worst-case interference.
The task τ2 estimated its own running time using The envelope of the data points should be approx-
a technique similar to the Hourglass benchmark sys- imately hyperbolic; that is, there should be an inter-
tem [12]. It estimated the times of preemption events val length below which the maximum interference is
experienced by a thread by reading the system clock as 100%, and there should be an average processor uti-
frequently as possible and looking for larger jumps than lization to which the interference converges for long
would occur if the thread were to run between clock intervals. There can be two valid reasons for deviation
read operations without preemption. It then added up from the hyperbola: (1) Periodic or nearly periodic de-
the lengths of all the time intervals where it was not mand, which results in a zig-zag shaped graph similar
preempted, plus the clock reading overhead for the in- to line labeled “refined load bound” in Figure 2 (see
tervals where it was preempted, to estimate amount of Section 2); (2) not having sampled enough intervals to
time that it was able to execute. encounter the system’s worst-case demand. The latter
The first experiment was to determine the base-line effects should diminish as more intervals are sampled,
preemptive interference experienced by τ2 when τD is but the former should persist.
idle, because no network traffic is directed at the sys- In the case of Figure 4 we believe that the tiny blips
tem. That is, we measured the maximum processor in the Timesys line around 1 and 2 msec are due to pro-
load that τ2 can place on the system when no de- cessing for the 1 msec timer interrupt. The data points
vice driver execution is required, and subtracted the for vanilla Linux exhibit a different pattern, aligning
value from one. This provided a basis for determin- along what appear to be multiple hyperbolae. In par-
ing the network device driver demand, by subtracting ticular, there is a set of high points that seems to form
the idle-network interference from the total interference one hyperbola, a layer of low points that closely follows
observed in later experiments when the network device the Timesys plot, and perhaps a middle layer of points
driver was active. that seems to fall on a third hyperbola. This appear-
ance is what one would expect if there were some rare
100
vanilla Linux events (or co-occurrences of events) that caused pre-
Timesys
emption for long blocks of time. When one of those
80 occurs it logically should contribute to the maximum
load for a range of interval lengths, up to the length
of the corresponding block of preemption, but it only
% interference
60
shows up in the one data point for the length of the
trial interval where it was observed. The three levels
40 of hyperbolae in the vanilla Linux graph suggest that
there are some events or combinations of events that
20
occur too rarely to show up in all the data points, but
that if the experiment were continued long enough data
points on the upper hyperbola would be found for all
0 interval lengths.
0 1000 2000 3000 4000 5000
interval length (µsec) Clearly the vanilla kernel is not as well behaved as
Timesys. The high variability of data points for the
Figure 4. Observed interference with no network vanilla kernel suggests that the true worst-case inter-
traffic. ference is much higher than the envelope suggested by
the data. That is, if more trials were performed for
Figure 4 shows the results of this experiment in each data point then higher levels of interference would
terms of the percent interference observed by task τ2 . be expected to occur throughout. By comparison, the
6
observed maximum interference for Timesys appears here is: (1) eliminate any upward jogs from the data by
to be bounded within a tight envelope over all inter- replacing each data value by the maximum of the values
val lengths. The difference is attributed to Timesys’ to the right of it, resulting in a downward staircase
patches to increase preemptability. function; (2) approximate the utilization by the value
The remaining experiments measured the behav- at the right most step; (3) choose the smallest period
ior of the network device driver task τD under a for which the resulting hyperbola intersects at least one
heavy load, consisting of ICMP “ping” packets every of the data points and is above all the rest.
10 µsec. ICMP “ping” packets were chosen because
they would execute entirely in the context of the de- 100
reply to ping flood
vice driver’s receive thread, from actually receiving the 90 ping flood, no reply
packet through replying to it (TCP and UDP split ex- no load
80
ecution between send and receive threads).
70
% interference
100 60
vanilla Linux
Timesys 50
hyperbolic bound
80 40
30
% interference
60 20
10
40 0
0 1000 2000 3000 4000 5000
interval length (µsec)
20
Figure 6. Observed interference with ping flooding,
with no reply.
0
0 1000 2000 3000 4000 5000
interval length (µsec) To carry the analysis further, an experiment was
done to separate the load bound for receive process-
Figure 5. Observed interference with ping flooding, ing from the load bound for transmit processing. The
including reply. normal system action for a ping message is to send a
reply message. The work of replying amounts to about
Figure 5 shows the observed combined interference half of the work of the network device driver tasks for
of the driver and base operating system under a net- ping messages. A more precise picture of the interfer-
work load of one ping every 10 µsec. The high variance ence caused by just the network receiving task can be
of data points observed for the vanilla kernel appears obtained by informing the kernel not to reply to ping
to extend to Timesys. This indicates a rarely occurring requests. The graph in Figure 6 juxtaposes the ob-
event or combination of events that occurs in connec- served interference due to the driver and base operating
tion with network processing and causes a long block system with ping-reply processing, without ping-reply
of preemption. We believe that this may be a “batch- processing, and without any network load. The fitted
ing” effect arising from the NAPI policy, which alter- hyperbolic load bound is also shown for each case. An
nates between polling and interrupt-triggered execu- interesting difference between the data for the “no re-
tion of the driver. A clear feature of the data is that ply” and the normal ping processing cases is the clear
the worst-case preemptive interference due to the net- alignment of the “no reply” data into just two distinct
work driver is higher with the Timesys kernel than the hyperbolae, as compared to the more complex pattern
vanilla kernel. We believe that this is the result of addi- for the normal case. The more complex pattern of vari-
tional time spent in scheduling and context-switching, ation in the data for the case with replies may be due to
because the network softirq handlers are executed in the summing of the interferences of these two threads,
scheduled threads rather than borrowed context. whose busy periods sometimes coincide. If this is true,
Given a set of data from experimental measurements it suggests a possible improvement in performance by
of interference, we can fit the hyperbolic bound through forcing separation of the execution of these two threads.
application of inequality (9) from Section 2. There are Note that understanding these phenomena is not
several ways to choose the utilization and period so necessary to apply the techniques presented here. In
that the hyperbolic bound is tight. The method used fact the ability to model device driver interference with-
7
out knowledge of the exact causes for the interference Server τ1 τ2 τD OS
is the chief reason for using these techniques. Traditional high med hybrid vanilla
Background high med low Timesys
5 Interference vs. I/O Service Quality Foreground med low high Timesys
Sporadic high low med (SS) Timesys
This section describes further experiments, involv- Table 1. Configurations for experiments.
ing the device driver with two sources of packets and
two hard-deadline periodic tasks. These were intended We tested the above task set in four scheduling con-
to explore how well empirical load bounds derived by figurations. The first was the vanilla Linux kernel. The
the technique in Section 4 work with analytical load other three used Timesys with some modifications of
bounds for periodic tasks for whole-system schedula- our own to add support for a Sporadic Server schedul-
bility analysis. We were also interested in compar- ing policy (SS). The SS policy was chosen because it is
ing the degree to which scheduling techniques that re- well known and is likely to be already implemented in
duce interference caused by the device-driver task for the application thread scheduler of any real-time op-
other tasks (e.g. lowering its priority or limiting its erating system, since it is the only aperiodic server
bandwidth through an aperiodic server scheduling al- scheduling policy included in the standard POSIX and
gorithm); would affect the quality of network input ser- Unix (TM) real-time API’s.
vice. The tasks were assigned relative priorities and
The experiments used three computers, referred to scheduling policies as shown in Table 1. The scheduling
as hosts A, B, and C. Host A sent host C a heartbeat policy was SCHED FIFO except where SS is indicated.
datagram once every 10 msec, host B sent a ping packet
to host C every 10µsec (without waiting for a reply),
and host C ran the following real-time tasks: 100
• τ2 is another periodic task, with the same period Figure 7. Percent missed deadlines of τ2 with inter-
and relative deadline as τ1 . Its execution time ference from τ1 (e1 = 2 and p1 = 10) and τD subject
was varied, and the number of deadline misses to one PING message every 10 µsec.
was counted at each CPU utilization level. The
number of missed deadlines reflects the effects of
Figures 7 and 8 show the percentage of deadlines
interference caused by the device driver task τD .
that task τ2 missed and the number of heartbeat pack-
All the memory of these tasks was locked into phys- ets that τ1 missed, for each of the experimental config-
ical memory, so there were no other activities. Their urations.
only competition for execution was from Level 1 and The Traditional Server experiments showed that the
Level 2 ISRs. The priority of the system thread that vanilla Linux two-level scheduling policy for softirq’s
executes the latter was set to the maximum real time causes τ2 to miss deadlines at lower utilization levels
priority, so that τD would always be queued to do work and causes a higher heartbeat packet loss rate for τ1
as soon as input arrived. than the other driver scheduling methods. Neverthe-
Tasks τ1 and τ2 were implemented by modifying the less, the vanilla Linux behavior does exhibit some desir-
Hourglass benchmark [12], to accommodate task τ1 ’s able properties. One is nearly constant packet loss rate,
nonblocking receive operations. independent of the load from τ1 and τ2 . That is due
8
100
3000
number of heartbeat packets received
2500 80
% interference
2000
60
e2 = 5 msec
e2 = 6 msec
1500 e2 = 7 msec
40
1000 vanilla Linux
Timesys, τD low
Timesys, τD high 20
500 Timesys, τD med SS
0 0
0 1 2 3 4 5 6 7 8 0 10 20 30 40
τ2 execution time (msec) interval length (msec)
Figure 8. Number of heartbeat packets received by Figure 9. Sum of load-bound functions for τ1 and
τ1 with interference from τ2 (e1 = 2 and p1 = 10) and τ2 , for three different values of the execution time e2 .
τD subject to one PING message every 10 µsec.
The Background Server experiments confirmed that Figure 10. Individual load-bound functions for τ1
assigning τD the lowest priority of the three tasks (the and τD , and their sum.
default for Timesys) succeeds in maximizing the proba-
bility of τ2 in meeting its deadlines, but it also gives the the graph at the deadline (10000 µsec), and allowing
worst packet loss behavior. Figure 9 shows the com- some margin for release-time jitter, overhead and mea-
bined load for τ1 and τ2 . The values near the deadline surement error, one would predict that τ2 should not
(10) suggest that if there is no interference from τD or miss any deadlines until its execution time exceeds 1.2
other system activity, τ2 should be able to complete msec. That appears to be consistent with the actual
within its deadline until e2 exceeds 7 msec. This is performance in Figure 7.
consistent with the data in Figure 7. The heartbeat
The Sporadic Server experiments represent an at-
packet receipt rate for τ1 starts out better than vanilla
tempt to achieve a compromise that balances missed
Linux, but degenerates for longer τ2 execution times.
heartbeat packets for τ1 against missed deadlines for τ2 ,
The Foreground Server experiments confirmed that by scheduling τD according to a bandwidth-budgeted
assigning the highest priority to τD causes the worst aperiodic server scheduling algorithm, running at a pri-
deadline-miss performance for τ2 , but also gives the ority between τ1 and τ2 . This has the effect of reserving
best heartbeat packet receipt rate for τ1 . The line la- a fixed amount of high priority execution time for τD ,
beled “τ1 + τD ” in Figure 10 shows the sum of the the- effectively lowering the load bound curves. This al-
oretical load bound for τ1 and the empirical hyperbolic lows it to preempt τ2 for the duration of the budget,
load bound for τD derived in Section 4. By examining but later reduces its priority to permit τ2 to execute,
9
thereby increasing the number of deadlines τ2 is able the latter case, RTLinux allows device driver code to
to meet. The Sporadic Server algorithm implemented run as a RTLinux thread (see below).
here uses the native (and rather coarse) time account- The second technique, followed in the current paper,
ing granularity of Linux, which is 1 msec. The server is to defer most interrupt-triggered work to scheduled
budget is 1 msec; the replenishment period is 10 msec; threads. Hardware interrupt handlers are kept as short
and the number of outstanding replenishments is lim- and simple as possible. They only serve to notify a
ited to two. It can be seen in figure 7 that running the scheduler that it should schedule the later execution of
experiments on the SS implementation produces data a thread to perform the rest of the interrupt-triggered
that closely resembles the behavior of the vanilla Linux work. There are variations to this approach, depend-
kernel. (This is consistent with our observations on the ing on whether the logical interrupt handler threads
similarity of these two algorithms in the comments on execute in borrowed (interrupt) context or in indepen-
the Traditional Server experiments above.) Under ideal dent contexts (e.g. normal application threads), and on
circumstances the SS implementation should not allow whether they have an independent lower-level scheduler
τ2 to miss a deadline until its execution time exceeds (e.g. RTLinux threads or vanilla Linux softirq han-
the sum of its own initial budget and the execution dlers) or are scheduled via the same scheduler as nor-
time of τ1 . In this experiment our implementation of mal application threads. The more general the thread
the SS fell short of this by 3 msec. In continuing re- scheduling mechanism, the more flexibility the system
search, we plan to narrow this gap by reducing the developer has in assigning an appropriate scheduling
accounting granularity of our implementation and in- policy and priority to the device-driven threads. The
creasing the number of pending replenishments, and job of bounding device driver interference then focuses
determine how much of the currently observed gap is on analyzing the workload and scheduling of these
due to the inevitable overhead for time accounting, con- threads. This technique has been the subject of sev-
text switches, and priority queue reordering. eral studies, including [7, 4, 17, 5], and is implemented
in Windows CE and real-time versions of the Linux
6 Related Work kernel.
Facchinetti et al. [6] recently proposed an instance of
Previous research has considered a variety of tech- the work deferral approach, in which a system executes
niques for dealing with interference between interrupt- all driver code as one logical thread, at the highest sys-
driven execution of device-driver code and the schedul- tem priority. The interrupt server has a CPU time
ing of application threads. We classify these techniques budget, which imposes a bound on interference from
into two broad groups, according to whether they apply the ISRs. They execute the ISRs in a non-preemptable
before or after the interrupt occurs. manner, in interrupt context, ahead of the applica-
The first technique is to “schedule” hardware in- tion thread scheduler. Their approach is similar to the
terrupts in a way that reduces interference, by reduc- softirq mechanism of the vanilla Linux system, in that
ing the number of interrupts, or makes it more pre- both schedule interrupt handlers to run at the highest
dictable, by limiting when they can occur. On some system priority, both execute in interrupt context, and
hardware platforms, including the Motorola 68xxx se- both have a mechanism that limits server bandwidth
ries of microprocessors, this can be done by assign- consumption. However, time budgets are enforced di-
ing different hardware priorities to different interrupts. rectly in [6].
The most basic approach to scheduling interrupts in- Zhang and West [19] recently proposed another vari-
volves enabling and disabling interrupts intelligently. ation of the work deferral approach, that attempts to
The Linux network device driver model called NAPI minimize the priority of the bottom halves of driver
applies this concept to reduce hardware interrupts dur- code across all current I/O consuming processes. The
ing periods of high network activity[11]. Regehr and algorithm predicts the priority of the process that is
Duongsaa [13] propose two other techniques for reduc- waiting on some queued I/O, and then executes the
ing interrupt overloads, one through special hardware bottom half in its own thread at the highest predicted
support and the other in software. RTLinux can be priority per interrupt. Then it charges the execution
viewed as also using this technique. That is, to re- time to the predicted process. This approach makes
duce interrupts on the host operating system RTLinux sense for device driver execution that can logically be
interposes itself between the hardware and the host op- charged to an application process.
erating system[18]. In this way it relegates all device The above two techniques partially address the
driver execution for the host to background priority, problem considered in this paper. That is, they re-
unless there is a need for better I/O performance. In structure the device-driven workload in ways that po-
10
tentially allow more of it to be executed below inter- read from the network device. This will raise the bot-
rupt priority, and schedule the execution according to tom half’s priority to that of the high priority process.
a policy that can be analyzed if the workload can be However, since the typical network device driver han-
modeled. However, they do not address the problem of dles packets in FIFO order, the bottom half is forced to
how to model the workload that has been moved out of work through the backlog of the low-priority process’s
the ISRs, or how to model the workload that remains input before it gets to the packet destined for the high
in the ISRs. priority process. This additional delay could be enough
A difference between the Facchinetti approach and to cause the high priority process to miss its deadline.
our use of aperiodic server scheduling is that we have That would not have happened if the low-priority pack-
multiple threads, at different priorities, executing in ets had been cleared out earlier, as if the device bottom
independent contexts and scheduled according to stan- half had been able to preempt the middle-priority task.
dard thread scheduling policies which are also available In contrast, with our approach the bottom half still
to application threads. We have observed that differ- handles incoming packets in FIFO order, but by exe-
ent devices (i.e. NIC, disk controller, etc) all gener- cuting the bottom half in a server with a budget of high
ate unique workloads, which we believe warrant dif- priority time we are able to empty the incoming DMA
ferent scheduling strategies and different time budgets. queue more frequently. This can prevent the scenario
In contrast, all devices in the Facchinetti system are above from occurring unless the input rate exceeds the
forced to share the same budget and share the same pri- bottom-half server’s budgeted bandwidth.
ority; the system is not able to distinguish between dif-
ferent priority levels of I/O, and is forced to handle all 7 Conclusion
I/O in FIFO order. Imagine a scenario where the real
time system is flooded with packets. In the Facchinetti We have described two ways to approach the prob-
system the NIC could exhaust the ISR server’s bud- lem of accounting for the preemptive interference ef-
get. If a high priority task requests disk I/O while the fects of device driver tasks in demand-based schedula-
ISR server’s budget is exhausted, the disk I/O will be bility analysis. One is to model the worst-case inter-
delayed until the ISR server budget is replenished, and ference of the device driver by a hyperbolic load-bound
the high priority task may not receive its disk service in function derived from empirical performance data. The
time to meet its deadline. This scenario is pessimistic, other approach is to schedule the device driver by an
but explains our motivation to move ISR execution into aperiodic server algorithm that budgets processor time
multiple fully-schedulable threads. consistent with the analytically derived load-bound
A difference between the Zhang and West approach function of a periodic task. We experimented with
and ours is that we focus on the case where there is no the application of both techniques to the Linux device
application process to which the device-driven activity driver for Intel Pro/1000 Ethernet adapters.
can logically be charged. Our experiments use ICMP The experimental data show hyperbolic load bounds
packets, which are typically processed in the context of can be derived for base system activity, network receive
the kernel and cannot logically be charged to a process. processing, and network transmit processing. Further,
Another difference is that our model is not subject the hyperbolic load bounds may be combined with an-
to a middle-priority process delaying the execution of alytically derived load bounds to predict the schedula-
a higher priority process, by causing a backlog in the bility of hard-deadline periodic or sporadic tasks. We
bottom-half processing of I/O for a device on which the believe this technique of using empirically derived hy-
high priority process depends. Consider a system with perbolic load-bound functions to model processor inter-
three real time processes, at three different priorities. ference may also have potential applications outside of
Suppose the low priority process initiates a request for device drivers, to aperiodic application tasks that are
a stream of data over the network device, and that be- too complex to apply any other load modeling tech-
tween packets received by the low priority process, the nique.
middle-priority process (which does not use the net- The data also show preliminary indications that
work device) wakes up and begins executing. Under aperiodic-server scheduling algorithms, such as Spo-
the Zhang and West scheme, the network device server radic Server, can be useful in balancing device driver
thread would have too low priority for the network de- interference and quality of I/O service. This provides
vice’s bottom half to preempt the middle-priority pro- an alternative in situations where neither of the two
cess, and so a backlog of received packets would build extremes otherwise available will do, i.e., where run-
up in the DMA buffers. Next, suppose the high priority ning the device driver at a fixed high priority causes
process wakes up and during its execution, attempts to unacceptable levels of interference with other tasks,
11
and running the device driver at a fixed lower prior- [8] J. P. Lehoczky, L. Sha, and J. K. Strosnider. En-
ity causes unacceptably low levels of I/O performance. hanced aperiodic responsiveness in a hard real-time
In future work, we plan to study other device types, environment. In Proc. 8th IEEE Real-Time Systems
and other types of aperiodic server scheduling algo- Symposium, pages 261–270, 1987.
rithms. We also plan to extend our study of empir- [9] C. L. Liu and J. W. Layland. Scheduling algorithms for
multiprogramming in a hard real-time environment.
ically derived interference bounds to include memory
Journal of the ACM, 20(1):46–61, Jan. 1973.
cycle interference. As mentioned in this paper, our [10] J. W. S. Liu. Real-Time Systems. Prentice-Hall, 2000.
load measuring task can execute out of cache, and so [11] J. Mogul and K. Ramakrishnan. Eliminating receive
does not experience the effects of memory cycle steal- livelock in an interrupt-driven kernel. ACM Transac-
ing due to DMA. Even where there is no CPU inter- tions on Computer Systems, 15(3):217–252, 1997.
ference, DMA memory cycle interference may increase [12] J. Regehr. Inferring scheduling behavior with Hour-
the time to complete a task past the anticipated worst- glass. In Proc. of the USENIX Annual Technical Conf.
case execution time, resulting in missed deadlines. We FREENIX Track, pages 143–156, Monterey, CA, June
plan to perform an analysis of DMA interference on 2002.
[13] J. Regehr and U. Duongsaa. Preventing interrupt
intensive tasks. By precisely modeling these effects,
overload. In Proc. 2006 ACM SIGPLAN/SIGBED
increases in the execution time due to cycle stealing
conference on languages, compilers, and tools for em-
will be known and worst-case execution times will be bedded systems, pages 50–58, Chicago, Illinois, June
more accurately predicted. Further, by coordinating 2005.
the DMA and memory intensive tasks, the contention [14] L. Sha, J. P. Lehoczky, and R. Rajkumar. Solutions
for accessing memory can be minimized. for some practical problems in prioritizing preemptive
scheduling. In Proc. 7th IEEE Real-Time Sytems Sym-
posium, 1986.
References [15] B. Sprunt, L. Sha, and L. Lehoczky. Aperiodic task
scheduling for hard real-time systems. Real-Time Sys-
[1] N. C. Audsley, A. Burns, M. Richardson, and A. J. tems, 1(1):27–60, 1989.
Wellings. Hard real-time scheduling: the deadline [16] J. Strosnider, J. P. Lehoczky, and L. Sha. The de-
monotonic approach. In Proc. 8th IEEE Workshop ferrable server algorithm for enhanced aperiodic re-
on Real-Time Operating Systems and Software, pages sponsiveness in real-time environments. IEEE Trans.
127–132, Atlanta, GA, USA, 1991. Computers, 44(1):73–91, Jan. 1995.
[2] T. P. Baker and S. K. Baruah. Schedulability analysis [17] C. A. Thekkath, T. D. Nguyen, E. Moy, and E. La-
of multiprocessor sporadic task systems. In I. Lee, zowska. Implementing network protocols at user level.
J. Y.-T. Leung, and S. Son, editors, Handbook of Real- IEEE Trans. Networking, 1(5):554–565, Oct. 1993.
time and Embedded Systems. CRC Press, 2007. (to [18] V. Yodaiken. The RTLinux manifesto. In Proc. 5th
appear). Linux Expo, Raleigh, NC, 1999.
[3] S. K. Baruah, A. K. Mok, and L. E. Rosier. Preemp- [19] Y. Zhang and R. West. Process-aware interrupt
tively scheduling hard-real-time sporadic tasks on one scheduling and accounting. In Proc. 27th Real Time
processor. Proc. 11th IEE Real-Time Systems Sympo- Systems Symposium, Rio de Janeiro, Brazil, Dec. 2006.
sium, pages 182–190, 1990.
[4] L. L. del Foyo, P. Meja-Alvarez, and D. de Niz. Pre-
dictable interrupt management for real time kernels Acknowledgment
over conventional PC hardware. In Proc. 12th IEEE
Real-Time and Embedded Technology and Applications The authors are grateful to Timesys Corporation for
Symposium (RTAS’06), pages 14–23, San Jose, CA, providing access to their distribution of the Linux ker-
Apr. 2006. nel at a reduced price. We are also thankful to the
[5] P. Druschel and G. Banga. Lazy receiver processing
anonymous members of the RTAS 2007 program com-
(LRP): a network subsystem architecture for server
mittee for their suggestions.
systems. In Proc. 2nd USENIX symposium on oper-
ating systems design and implementation, pages 261–
275, Oct. 1996.
[6] T. Facchinetti, G. Buttazzo, M. Marinoni, and
G. Guidi. Non-preemptive interrupt scheduling for
safe reuse of legacy drivers in real-time systems. In
Proc. 17th IEEE Euromicro Conference on Real-Time
Systems, Palma de Mallorca, July 2005.
[7] S. Kleiman and J. Eykholt. Interrupts as threads.
ACM SIGOPS Operating Systems Review, 29(2):21–
26, Apr. 1995.
12