Embedded Real-Time
System Considerations
Eric Durant
Electrical Engineering and Computer Science Department
Milwaukee School of Engineering
Milwaukee, WI, USA
<durante@msoe.edu>
Contents

Abstract
Introduction
Types of Scheduling
Conclusion
References
CS384 – Design of Operating Systems
Term Paper
Embedded Real-Time System Considerations
Abstract
This paper surveys special considerations in the design of embedded, real-time systems. Focus is placed on kernels for such systems. Formal development, tradeoffs of user versus kernel mode, security, multiprocessing, fault tolerance, and scheduling are discussed.
Introduction
Special demands are placed on embedded systems. Often, these systems must operate in real time: their nature dictates that their tasks be completed in a specific time window, whereas a general-purpose system may run ahead of or behind schedule with no adverse effects.1 When such hard real-time constraints exist, worst-case scheduling can be useful. Other embedded systems may require a high level of service quality without absolute guarantees.
1 For example, for general word processing tasks, assuming the normal behavior of the system is "fast enough" by the user's criteria, the effects on the user's productivity will be minimal if the program operates at double speed or half speed. However, if a respirator operates at either half or double the optimal speed, the effects could be catastrophic.
In other systems, the focus may be on optimizing fault tolerance and ensuring graceful degradation. Finally, when a system is exposed to outside users or networks, measures may be necessary to ensure the immediate and continued security of the system. This paper discusses all these topics in several broad sections. First, general kernel design questions are examined.
Two of the fundamental questions in any kernel design are "Will it work?" and "What is in it?" Fowler shows how the former may be answered for real-time systems using formal methods [Fow97]. The latter is highly debated in real-time systems practice and research, but some rules of thumb can be derived for analyzing the tradeoffs of user and kernel mode for many applications. Applications using each approach are reviewed for their merits [Sie97, Map97].
Formal Kernel Development
Formal proof that an operating system is valid, that is, that it meets its specification, is rarely undertaken. This may be due to the perceived complexity of a formal proof or the perceived stability of the kernel and embedded system based on empirical data. While both are valid reasons to not use formal proof in many situations, Fowler has shown that formal development of a real-time kernel is practical [Fow97]. A formal specification language and a temporal logic (a notation for expressing temporal behavior of a system) were the tools used to express the requirements of the system and to refine and expand the formal description.
Fowler took an iterative approach, starting with the most abstract concepts and refining and partitioning them in several stages. This allowed the validation of the design at each level of refinement, making it possible to reuse much of the work across different target hardware and applications. Fowler showed that with proper expertise, formal kernel development is feasible for many applications. In addition, it requires a highly modular and abstract design. In general system design, these attributes have been shown to make systems more robust, and they apply to kernels as well, giving benefits beyond the assurance of formally proven stability.
In-Kernel versus User Space Implementation
The profusion of terms indicating the depth and breadth of the functionality of a kernel (monolithic kernel, microkernel, etc.) complicates matters; these terms often have different meanings for RTOS designers, for the marketing department, and for researchers. The classic reason for including functionality in the kernel is efficiency, while the classic reasons for excluding it are protection and more malleable designs. So, in one sense, there can be no final answer to where a given service belongs. Certain privileged operations must be performed in kernel mode [Mit97], and arbitrary user code on a time-shared system must not be run in kernel mode, but most operating system services fall somewhere between these extremes.
A brief overview of two recent research projects, one implemented in kernel mode
and the other in user mode, will yield some general guidelines for making this
decision.
A demonstration scheduling module was implemented in kernel mode for two reasons [Sie97]. First, efficiency was important; second, and more important, additional complexity would have resulted from the scheduling module running outside the kernel.
The classic reasoning that kernel mode activities can be implemented more efficiently has led to transport protocols often being implemented in kernel mode. Mapp counters that a user-space implementation is easy to test and refine [Map97]. There is also the question of performance: a user-space transport protocol for ATM performed about 20% poorer than a highly optimized Linux kernel implementation, leading the authors to conclude:
"These results clearly show that a user-space protocol is able to achieve significant performance and should help to debunk the idea that transport protocols must be run in the kernel in order to achieve reasonable performance!"
User-space implementations, then, can be viable even at the transport layer. What conclusions can be drawn from these examples? It seems that the classical concerns deserve careful consideration, but are often overstated. That is, mode switches have a cost, but this cost is often small compared to other factors. In addition, using kernel mode means that exceptional care must be taken to not compromise the system and, more fundamentally, that the kernel-mode code must be fully trusted.

Security

When an embedded system is connected to the Internet or other public networks, system security becomes important. One layer is external security: accepting requests from, and providing data (and perhaps encrypting them) only to, trusted clients.
5
Another layer of security is internal system security. At this layer, it is desirable to prevent a process from performing any operation for which it does not have privileges. A return value may indicate the privilege failure to the offending process, the process may be terminated, or the system may inform a trusted node outside the local system of the privilege failure. This would allow a response to be coordinated beyond the compromised system itself.
Mitchem discusses an approach to security at the system call level that could be extended to implement the final option mentioned above [Mit97]. This method uses "kernel hypervisors," which are loadable kernel modules that intercept system calls and can be used in concert. They may implement a number of policies, including logging (for example, of attempted security violations) and filters both before and after the requested system call. The name recalls a classic hypervisor, a software layer that virtualizes the hardware interface for the next software layer. Thus, in an analogous position one layer up, kernel hypervisors run on top of the kernel and provide a virtual system call
2 However, depending on the system architecture, one might argue that a trusted system scheduler running in user mode could have the same detrimental effects as a kernel mode scheduler, for example.
interface. This virtualization layer enables the checks, secondary actions, and filtering described above. Mitchem also notes other mechanisms (such as existing operating system features) that have been used to virtualize or wrap the system call interface; these may suffice when only monitoring system calls or performing other actions in response to system calls is desired. The kernel hypervisor approach adds finer control (selectivity) and the stacking of hypervisor modules to achieve the desired result.
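The interception structure can be sketched in plain C as a chain of wrappers around a system-call entry point. Everything below is hypothetical, a model of the stacking idea only, not actual loadable-module code; the names and the stand-in "real" call are invented for illustration.

```c
/* Hypothetical model of stacked kernel hypervisors: each layer wraps
 * the layer below it with a pre-call filter and/or a secondary action.
 * Unstacking a module is just repointing the entry at the layer below. */

typedef int (*syscall_fn)(int arg);

/* Stand-in for the kernel's real system call implementation. */
int real_open(int arg) { return 100 + arg; }

int denied_count;   /* requests rejected by the filter layer */
int audit_count;    /* calls observed by the auditing layer */

/* Layer 1: reject requests that fail a privilege check. */
int filter_open(int arg) {
    if (arg < 0) {             /* pre-call filter: privilege violation */
        denied_count++;        /* could also notify a trusted monitor node */
        return -1;
    }
    return real_open(arg);     /* forward to the layer below */
}

/* Layer 2, stacked on top of layer 1: audit every call. */
int audit_open(int arg) {
    audit_count++;             /* secondary action: logging/auditing */
    return filter_open(arg);
}
```

A process's virtual system call interface would point at audit_open; removing a policy is a matter of repointing the entry at the next layer down.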
Multiprocessing in Embedded Operating Systems
In multiprocessor embedded systems, synchronization and mutual exclusion have the potential to incur much more overhead than in uniprocessor systems. Two common solutions beyond basic spin locks are often considered for this problem [Tak97]. These are preemption-safe locking and wait-free synchronization. Although Takada did not use this terminology, the system for waiting for a lock is essentially a condition variable system. Waits on a lock are queued, and cancelled upon preemption. When a process runs again, it must reinitiate the lock sequence, entering the lock queue at the rear.3 The cost of scheduling locks in this manner is at least O(n), where n is the number of processes contending for the lock.
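A minimal single-threaded model of this queueing behavior (a hypothetical sketch, not Takada's implementation) makes the cancellation and re-queueing pattern concrete, and shows where the linear cost comes from:

```c
#define MAXQ 16

/* Toy model of the queued lock described above: waiters join at the
 * rear; preemption cancels a pending wait (a linear-time removal, one
 * source of the O(n) cost); the head of the queue acquires the lock. */
typedef struct { int pid[MAXQ]; int n; } waitq;

void wq_init(waitq *w) { w->n = 0; }

void wq_enqueue(waitq *w, int pid) {        /* join at the rear */
    if (w->n < MAXQ) w->pid[w->n++] = pid;
}

int wq_cancel(waitq *w, int pid) {          /* preemption cancels the wait */
    for (int i = 0; i < w->n; i++) {
        if (w->pid[i] != pid) continue;
        for (int j = i; j < w->n - 1; j++)  /* O(n) shift to close the gap */
            w->pid[j] = w->pid[j + 1];
        w->n--;
        return 1;
    }
    return 0;
}

int wq_grant(waitq *w) {                    /* head of queue gets the lock */
    if (w->n == 0) return -1;
    int pid = w->pid[0];
    wq_cancel(w, pid);
    return pid;
}
```

A preempted process that runs again calls wq_enqueue anew, landing at the rear as described above.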
In wait-free synchronization, operations requiring mutual exclusion are planned with a priori knowledge of their durations. However, this approach does not scale to complex data structures. Takada's scheme addresses this case: when a process
3 In a separate work, not reviewed for this paper, the authors suggested a variant of this approach in which a preempted process resumes its place in the queue after reentering the run state.
must wait on a lock, the operation it plans to execute on the lock is stored in a shared queue for that lock. If process "A" is preempted (not spinning) when its turn to acquire the lock arrives, another process, "B," spinning on the same lock, will execute the critical section stored in the queue by "A" on behalf of "A." Thus, the critical section makes progress even while "A" is preempted.
When all tasks waiting on the lock are preempted, execution is suspended. The first process waiting to reenter the run state will process the pending wait queue. Thus, progress will be made more quickly than if each process had to enter the running state as the lock became available. Also, except for the overhead of handling the enhanced lock data structures, the usual cost of preempting the processes in the queue is avoided.
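The idea of executing a queued operation on another process's behalf can be sketched as follows. This is hypothetical illustration only; a real implementation must manage per-lock queues and their memory atomically, which this single-threaded model ignores.

```c
#include <stddef.h>

/* Sketch of "helping": a waiter that is preempted leaves the operation
 * it intended to run in a shared queue; whichever process holds (or
 * next acquires) the lock drains the queue, executing each pending
 * critical section on the original requester's behalf. */

typedef void (*cs_op)(int *shared, int arg);

typedef struct { cs_op op; int arg; } pending;

#define MAXP 8
pending queue[MAXP];
int nq;

/* A preempted waiter records the work it wanted to do under the lock. */
void submit_op(cs_op op, int arg) {
    if (nq < MAXP) { queue[nq].op = op; queue[nq].arg = arg; nq++; }
}

/* The lock holder runs all pending operations for the waiters. */
void drain_ops(int *shared) {
    for (int i = 0; i < nq; i++)
        queue[i].op(shared, queue[i].arg);
    nq = 0;
}

/* Example critical-section operation: add to a shared counter. */
void add_op(int *shared, int arg) { *shared += arg; }
```

Because the spinning process does the waiters' work, the shared state advances without any preempted process needing to be rescheduled first.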
Fault Tolerance

Fault tolerant, redundant systems are used when failure would be catastrophic, or when servicing a failed system is impractical. Common examples in which fault tolerant, redundant systems are used are military, avionics, and medical systems. With a focus on military applications subject to battle damage and long periods of operation under harsh conditions, Kim discusses various fault-tolerant modes and how a system may transition through these modes in different operating phases
[Kim97]. The basic principle is that tradeoffs can be made among timeliness, consistency of performance, and the amount of redundant hardware required. A supervisor node oversees all functions, but even this node is protected against failure: certain nodes are configured to monitor the consistency of the supervisor's behavior and to take over its role if it misbehaves. The supervisor assigns one or more nodes to each task group and sets recovery policies.
The most basic recovery policy is distributed recovery block, or DRB. Critical tasks are handled by both a primary node and a shadow node.4 The shadow node uses a different method for performing the task than the primary node. Each node performs an acceptance test (AT) on its result. If the primary's AT fails, it notifies the shadow, which takes over the primary role, assuming its own AT was successful. The shadow will also take over if it does not receive a state update from the primary by a certain deadline. State data is stored in a shared log cache so that either node can restore a known state and then synchronize with the other node.
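The DRB control flow reduces to a few lines. The methods and acceptance test below are toy stand-ins invented for illustration (the fault in primary_m is injected deliberately); they are not from Kim's system, and the deadline-based takeover and state synchronization are omitted.

```c
/* Sketch of the distributed recovery block (DRB) control flow: the
 * primary and shadow compute the result by *different* methods; each
 * result is checked by an acceptance test (AT); the shadow's result is
 * used only when the primary's AT fails. */

typedef int (*method_fn)(int input);
typedef int (*accept_fn)(int input, int result);

int drb_execute(int input, method_fn primary, method_fn shadow,
                accept_fn at, int *used_shadow) {
    int r = primary(input);
    if (at(input, r)) { *used_shadow = 0; return r; }
    r = shadow(input);              /* shadow takes over the primary role */
    *used_shadow = 1;
    return at(input, r) ? r : -1;   /* both ATs failed: escalate */
}

/* Toy task: double the input. The primary has an injected fault for
 * inputs above 10; the shadow uses a deliberately different method. */
int primary_m(int x) { return x > 10 ? -999 : x + x; }
int shadow_m(int x)  { int s = 0; for (int i = 0; i < x; i++) s += 2; return s; }
int at_double(int x, int r) { return r == x + x; }
```

The requirement that primary and shadow use different methods is what lets the scheme mask design faults, not just hardware failures.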
The adaptable DRB (ADRB) scheme extends DRB with recovery policies that require less redundant hardware, at the cost of taking more time to recover and providing less consistent performance during recovery. The first additional mode is sequential backward recovery. In this mode, a single node performs a task. If its AT fails, the supervisor node assigns a
4 A "node," in this context, may be either a hardware device or a software module.
standby node from a pool to take over its function. The standby node then loads its state from the shared log cache. If a complete node crash is detected, a similar reassignment is performed. The other ADRB recovery policy is sequential forward recovery mode. In this mode, an exception handler attempts to correct the error and continue. Such a handler is usually less efficient than assigning a standby node, but can be used in more situations. When a node crash occurs in sequential forward recovery mode, the same action is taken as in sequential backward recovery mode: the node is replaced by a standby node.
Systems designed with DRB and ADRB require high levels of modularization and careful state management. While the military was the original motivation for such applications, private sector applications demanding robustness and graceful degradation under the harshest of operating circumstances5 are prime areas for adoption of these techniques.
5 Such as medical devices and commercial aircraft.
Several classes of scheduling service can be distinguished; they are presented here as a contrast to Siewert's approach and as a complement to the scheduling methods discussed below [Sie97]. Best-effort systems work when it is known that there are sufficient resources to meet all possible requests. Their performance is undefined when the available resources are exceeded.
Guarantee-based systems determine the worst-case execution time of each task and implement scheduling based upon this. They are desirable when an absolute guarantee is required, but can let a large fraction of resources sit idle. A third class consists of tightly coupled and optimized systems that guarantee performance, but are not easily scaled.
All of the above solutions are in place in various production systems. However, for the vast majority of applications in which high quality (but not absolute) service is sufficient,6 Siewert's approach is attractive.7 Off-line performance data and real-time profiling are used to negotiate a high confidence of quality of service with user processes. Thus, deadline failures are rare and can be signaled to the affected applications.
6 For example, multimedia transmission, virtual reality, and even classic hard real-time domains such as telemetry and digital control.
7 Their system is also capable of providing hard real-time service based on worst-case analysis.
Types of Scheduling
Common scheduling types include:
- Cooperative: non-preemptive; each task must yield the processor voluntarily.
- Static priority-driven: the highest priority process in the ready state is always run next.
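Static priority-driven dispatch reduces to a scan of the ready set. The sketch below is a minimal illustration with hypothetical structures and names, not code from any particular RTOS; real dispatchers typically use per-priority ready queues or a priority bitmap instead of a linear scan.

```c
/* Static priority-driven dispatching: the highest-priority ready task
 * always runs next. Here a lower prio value means higher priority.
 * Returns the chosen pid, or -1 if no task is ready. */

typedef struct { int pid; int prio; int ready; } task;

int pick_next(const task *t, int n) {
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!t[i].ready) continue;                  /* skip blocked tasks */
        if (best < 0 || t[i].prio < t[best].prio)   /* keep best priority */
            best = i;
    }
    return best < 0 ? -1 : t[best].pid;
}
```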
Many RTOSs do not provide full real-time services to user applications [Maa97]. In this case, an RTOS with a modular, extensible kernel may be needed so that the required services can be added in kernel mode; Maaref demonstrates this with Microware's OS-9. The case study concerns MMS (Manufacturing Message Specification), a standard for communication among industrial devices. MMS specifies semaphores, journaling, a file system and event management, among other services. The services specified are similar to those
of many OSs, so the author argues that an MMS service is better implemented at the kernel level than layered above it. Since OS-9 is a highly modular operating system, Maaref found its kernel mode extension facilities well suited to the task. So, depending upon the type of real-time services a particular RTOS exposes to user applications, it may be necessary to implement services in kernel mode. When this is necessary, the architecture provided for extending the OS is crucial.
Conclusion

The breadth of current research into embedded real-time issues indicates that this is a rapidly growing field. Driven by decreasing hardware costs, embedded real-time systems will continue to expand into low end and distributed systems. Research into security and fault-tolerance, especially, will be key areas in the next few years. Most consumer systems will require fault tolerance for reasons quite different from the military applications for which these systems were first developed. That is, as the cost to service consumer devices continues to rise, the cost of the devices themselves continues to fall, favoring built-in fault tolerance over repair as the technology matures.
On the other hand, life- and safety-critical embedded systems are growing more complex and having more demanded of them. Hence, they will benefit from continued research in all of the areas discussed here. Clearly, there is much overlap between the high end (life- and safety-critical domains such as military, aviation and medical equipment) and the low end (consumer domains) of the market. Based on the breadth of current research and continuing growth at both ends, a strong future can be expected for embedded real-time technologies.
References

- [Fow97] Fowler, "… Kernel", Proc. IEEE 18th Real-Time Systems Symposium, Dec. 1997, San Francisco.
- [Kim97] Kim, K.H., et al., "The Adaptable Distributed Recovery Block …", pp. 131-138.
- [Liy97] Li, Y., Potkonjak, M. and Wolf, W., "Real-Time Operating Systems for …", Proc. Int. Conf. on Computer Design: VLSI in Computers and Processors, Oct. 1997, Austin, pp. 388-392.
- [Maa97] Maaref et al., "…", Proc. IEEE Int. Symposium on Industrial Electronics (ISIE), Part 1 (of 3), July 1997, Guimaraes, pp. 29-34.
- [Map97] Mapp, G., Pope, S. and Hopper, A., "The Design and …", 1962.
- [Mit97] Mitchem, T., Lu, R. and O'Brien, R., "Using Kernel Hypervisors to …".
- [Sie97] Siewert, S., Nutt, G. and Humphrey, M., "A Real-Time Execution …", Proc. IEEE 18th Real-Time Systems Symposium, Dec. 1997, San Francisco, pp. 134-143.