
seL4: Formal Verification of an OS Kernel

Gerwin Klein1,2 , Kevin Elphinstone1,2 , Gernot Heiser1,2,3
June Andronick1,2 , David Cock1 , Philip Derrin1∗, Dhammika Elkaduwe1,2‡, Kai Engelhardt1,2
Rafal Kolanski1,2 , Michael Norrish1,4 , Thomas Sewell1 , Harvey Tuch1,2†, Simon Winwood1,2
1 NICTA, 2 UNSW, 3 Open Kernel Labs, 4 ANU
ertos@nicta.com.au

ABSTRACT
Complete formal verification is the only known way to guarantee that a system is free of programming errors.

We present our experience in performing the formal, machine-checked verification of the seL4 microkernel from an abstract specification down to its C implementation. We assume correctness of compiler, assembly code, and hardware, and we used a unique design approach that fuses formal and operating systems techniques. To our knowledge, this is the first formal proof of functional correctness of a complete, general-purpose operating-system kernel. Functional correctness means here that the implementation always strictly follows our high-level abstract specification of kernel behaviour. This encompasses traditional design and implementation safety properties such as the kernel will never crash, and it will never perform an unsafe operation. It also proves much more: we can predict precisely how the kernel will behave in every possible situation.

seL4, a third-generation microkernel of L4 provenance, comprises 8,700 lines of C code and 600 lines of assembler. Its performance is comparable to other high-performance L4 kernels.

Categories and Subject Descriptors
D.4.5 [Operating Systems]: Reliability - Verification; D.2.4 [Software Engineering]: Software/Program Verification

General Terms
Verification, Design

Keywords
Isabelle/HOL, L4, microkernel, seL4

1. INTRODUCTION
The security and reliability of a computer system can only be as good as that of the underlying operating system (OS) kernel. The kernel, defined as the part of the system executing in the most privileged mode of the processor, has unlimited hardware access. Therefore, any fault in the kernel’s implementation has the potential to undermine the correct operation of the rest of the system.

General wisdom has it that bugs in any sizeable code base are inevitable. As a consequence, when security or reliability is paramount, the usual approach is to reduce the amount of privileged code, in order to minimise the exposure to bugs. This is a primary motivation behind security kernels and separation kernels [38, 54], the MILS approach [4], microkernels [1, 12, 35, 45, 57, 71] and isolation kernels [69], the use of small hypervisors as a minimal trust base [16, 26, 56, 59], as well as systems that require the use of type-safe languages for all code except some “dirty” core [7, 23]. Similarly, the Common Criteria [66] at the strictest evaluation level requires the system under evaluation to have a “simple” design.

With truly small kernels it becomes possible to take security and robustness further, to the point where it is possible to guarantee the absence of bugs [22, 36, 56, 64]. This can be achieved by formal, machine-checked verification, providing mathematical proof that the kernel implementation is consistent with its specification and free from programmer-induced implementation defects.

We present seL4, a member of the L4 [46] microkernel family, designed to provide this ultimate degree of assurance of functional correctness by machine-assisted and machine-checked formal proof. We have shown the correctness of a very detailed, low-level design of seL4 and we have formally verified its C implementation. We assume the correctness of the compiler, assembly code, boot code, management of caches, and the hardware; we prove everything else.

∗ Philip Derrin is now at Open Kernel Labs.
† Harvey Tuch is now at VMware.
‡ Dhammika Elkaduwe is now at University of Peradeniya.

To appear in the 22nd ACM Symposium on Principles of Operating Systems (SOSP), October 2009, Big Sky, MT, USA

Specifically, seL4 achieves the following:

• it is suitable for real-life use, and able to achieve performance that is comparable with the best-performing microkernels;
• its behaviour is precisely formally specified at an abstract level;
• its formal design is used to prove desirable properties, including termination and execution safety;
• its implementation is formally proven to satisfy the specification; and

• its access control mechanism is formally proven to provide strong security guarantees.

To our knowledge, seL4 is the first-ever general-purpose OS kernel that is fully formally verified for functional correctness. As such, it is a platform of unprecedented trustworthiness, which will allow the construction of highly secure and reliable systems on top.

The functional-correctness property we prove for seL4 is much stronger and more precise than what automated techniques like model checking, static analysis or kernel implementations in type-safe languages can achieve. We not only analyse specific aspects of the kernel, such as safe execution, but also provide a full specification and proof for the kernel’s precise behaviour.

We have created a methodology for rapid kernel design and implementation that is a fusion of traditional operating systems and formal methods techniques. We found that our verification focus improved the design and was surprisingly often not in conflict with achieving performance.

In this paper, we present the design of seL4, discuss the methodologies we used, and provide an overview of the approach used in the formal verification from high-level specification to the C implementation. We also discuss the lessons we have learnt from the project, and the implications for similar efforts.

The remainder of this paper is structured as follows. In Sect. 2, we give an overview of seL4, of the design approach, and of the verification approach. In Sect. 3, we describe how to design a kernel for formal verification. In Sect. 4, we describe precisely what we verified, and identify the assumptions we make. In Sect. 5, we describe important lessons we learnt in this project, and in Sect. 6, we contrast our effort with related work.

2. OVERVIEW

2.1 seL4 programming model
This paper is primarily about the formal verification of the seL4 kernel, not its API design. We therefore provide only a brief overview of its main characteristics.

seL4 [20], similarly to projects at Johns Hopkins (Coyotos) and Dresden (Nova), is a third-generation microkernel, and is broadly based on L4 [46] and influenced by EROS [58]. It features abstractions for virtual address spaces, threads, inter-process communication (IPC), and, unlike most L4 kernels, capabilities for authorisation. Virtual address spaces have no kernel-defined structure; page faults are propagated via IPC to pager threads, responsible for defining the address space by mapping frames into the virtual space. Exceptions and non-native system calls are also propagated via IPC to support virtualisation. IPC uses synchronous and asynchronous endpoints (port-like destinations without in-kernel buffering) for inter-thread communication, with RPC facilitated via reply capabilities. Capabilities are segregated and stored in capability address spaces composed of capability container objects called CNodes.

As in traditional L4 kernels, seL4 device drivers run as normal user-mode applications that have access to device registers and memory either by mapping the device into the virtual address space, or by controlled access to device ports on Intel x86 hardware. seL4 provides a mechanism to receive notification of interrupts (via IPC) and acknowledge their receipt.

Memory management in seL4 is explicit: both in-kernel objects and virtual address spaces are protected and managed via capabilities. Physical memory is initially represented by untyped capabilities, which can be subdivided or retyped into kernel objects such as page tables, thread control blocks, CNodes, endpoints, and frames (for mapping in virtual address spaces). The model guarantees all memory allocation in the kernel is explicit and authorised.

Initial development of seL4, and all verification work, was done for an ARMv6-based platform, with a subsequent port of the kernel (so far without proof) to x86.

2.2 Kernel design process
OS developers tend to take a bottom-up approach to kernel design. High performance is obtained by managing the hardware efficiently, which leads to designs motivated by low-level details. In contrast, formal methods practitioners tend toward top-down design, as proof tractability is determined by system complexity. This leads to designs based on simple models with a high degree of abstraction from hardware.

As a compromise that blends both views, we adopted an approach [19, 22] based around an intermediate target that is readily accessible by both OS developers and formal methods practitioners. It uses the functional programming language Haskell to provide a programming language for OS developers, while at the same time providing an artefact that can be automatically translated into the theorem proving tool and reasoned about.

Fig. 1 shows our approach in more detail. The square boxes are formal artefacts that have a direct role in the proof. The double arrows represent implementation or proof effort, the single arrows represent design/implementation influence of artefacts on other artefacts. The central artefact is the Haskell prototype of the kernel. The prototype requires the design and implementation of algorithms that manage the low-level hardware details. To execute the Haskell prototype in a near-to-realistic setting, we link it with software (derived from QEMU) that simulates the hardware platform. Normal user-level execution is enabled by the simulator, while traps are passed to the kernel model which computes the result of the trap. The prototype modifies the user-level state of the simulator to appear as if a real kernel had executed in privileged mode.

Figure 1: The seL4 design process

This arrangement provides a prototyping environment that enables low-level design evaluation from both the user and kernel perspective, including low-level physical and virtual
memory management. It also provides a realistic execution environment that is binary-compatible with the real kernel. For example, we ran a subset of the Iguana embedded OS [37] on the simulator-Haskell combination. The alternative of producing the executable specification directly in the theorem prover would have meant a steep learning curve for the design team and a much less sophisticated tool chain for execution and simulation.

We restrict ourselves to a subset of Haskell that can be automatically translated into the language of the theorem prover we use. For instance, we do not make any substantial use of laziness, make only restricted use of type classes, and we prove that all functions terminate. The details of this subset are described elsewhere [19, 41].

While the Haskell prototype is an executable model and implementation of the final design, it is not the final production kernel. We manually re-implement the model in the C programming language for several reasons. Firstly, the Haskell runtime is a significant body of code (much bigger than our kernel) which would be hard to verify for correctness. Secondly, the Haskell runtime relies on garbage collection which is unsuitable for real-time environments. Incidentally, the same arguments apply to other systems based on type-safe languages, such as SPIN [7] and Singularity [23]. Additionally, using C enables optimisation of the low-level implementation for performance. While an automated translation from Haskell to C would have simplified verification, we would have lost most opportunities to micro-optimise the kernel, which is required for adequate microkernel performance.

2.3 Formal verification
The technique we use for formal verification is interactive, machine-assisted and machine-checked proof. Specifically, we use the theorem prover Isabelle/HOL [50]. Interactive theorem proving requires human intervention and creativity to construct and guide the proof. However, it has the advantage that it is not constrained to specific properties or finite, feasible state spaces, unlike more automated methods of verification such as static analysis or model checking.

The property we are proving is functional correctness in the strongest sense. Formally, we are showing refinement [18]: A refinement proof establishes a correspondence between a high-level (abstract) and a low-level (concrete, or refined) representation of a system.

The correspondence established by the refinement proof ensures that all Hoare logic properties of the abstract model also hold for the refined model. This means that if a security property is proved in Hoare logic about the abstract model (not all security properties can be), refinement guarantees that the same property holds for the kernel source code. In this paper, we concentrate on the general functional correctness property. We have also modelled and proved the security of seL4’s access-control system in Isabelle/HOL on a high level. This is described elsewhere [11, 21], and we have not yet connected it to the proof presented here.

Fig. 2 shows the specification layers used in the verification of seL4; they are related by formal proof. Sect. 4 explains the proof and each of these layers in detail; here we give a short summary.

Figure 2: The refinement layers in the verification of seL4

The top-most layer in the picture is the abstract specification: an operational model that is the main, complete specification of system behaviour. The abstract level contains enough detail to specify the outer interface of the kernel, e.g., how system-call arguments are encoded in binary form, and it describes in abstract logical terms the effect of each system call or what happens when an interrupt or VM fault occurs. It does not describe in detail how these effects are implemented in the kernel.

The next layer down in Fig. 2 is the executable specification generated from Haskell into the theorem prover. The translation is not correctness-critical because we seek assurance about the generated definitions and ultimately C, not the Haskell source, which only serves as an intermediate prototype. The executable specification contains all data structure and implementation details we expect the final C implementation to have.

Finally, the bottom layer in this verification effort is the high-performance C implementation of seL4. For programs to be formally verified they must have formally-defined semantics. One of the achievements of this project is a very exact and faithful formal semantics for a large subset of the C programming language [62]. Even with a formal semantics of C in the logic of the theorem prover, we still have to read and translate the specific program into the prover. This is discussed in Sect. 4.3.

Verification can never be absolute; it always must make fundamental assumptions. In our work we stop at the source-code level, which implies that we assume at least the compiler and the hardware to be correct.

An often-raised concern is the question of proof correctness. More than 30 years of research in theorem proving has addressed this issue, and we can now achieve a degree of trustworthiness of formal, machine-checked proof that far surpasses the confidence levels we rely on in engineering or mathematics for our daily survival. We use two specific techniques: firstly, we work foundationally from first principles; mathematics, semantics, and Hoare logic are not axiomatised, but defined and proved. Secondly, the Isabelle theorem prover we are using can produce external proof representations that can be independently checked by a small, simple proof checker.

3. KERNEL DESIGN FOR VERIFICATION
The main body of the correctness proof can be thought of as showing Hoare triples on program statements and on functions in each of the specification levels. The proof in our refinement and Hoare logic framework decomposes along function boundaries. Each unit of proof has a set of preconditions that need to hold prior to execution, a statement or sequence of statements in a function that modify the system state, and the post-conditions that must hold afterwards. The degree of difficulty in showing that pre- and post-conditions hold is directly related to the complexity of the statement, the state the statement can modify, and the complexity of the properties the pre- and post-conditions express. Around 80% of the properties we show relate to preserving invariants.

To make verification of the kernel feasible, its design should minimise the complexity of these components. Ideally, kernel code (and associated proofs) would consist of simple statements that rely on explicit local state, with simple invariants. These smaller elements could then be composed abstractly into larger elements that avoid exposing underlying local elements. Unfortunately, OS kernels are not usually structured like this, and generally feature highly inter-dependent subsystems [10].

As a consequence of our design goal of suitability for real-life use, our kernel design attempts to minimise the proof complexity without compromising performance. In this light we will now examine typical properties of kernels and discuss their effect on verification, including presenting specific features of our kernel design.

3.1 Global variables and side effects
Programming with global variables and with side effects is common in operating systems kernels and our verification technique has no problem dealing with them. However, implicit state updates and complex use of the same global for different purposes can make verification harder than necessary.

Global variables usually require stating and proving invariant properties. For example, if global scheduler queues are implemented as doubly-linked lists, the corresponding invariant might state that all back links in the list point to the appropriate nodes and that all elements point to thread control blocks. Invariants are expensive because they need to be proved not only locally for the functions that directly manipulate the scheduler queue, but for the whole kernel: we have to show that no other pointer manipulation in the kernel accidentally destroys the list or its properties. This proof can be easy or hard, depending on how modularly the global variable is used.

A hypothetical, problematic example would be a complex, overloaded page-table data structure that can represent translation, validity of memory regions, copy-on-write, zero-on-demand memory, and the location of data in swap space, combined with a relationship to the frame table. This would create a large, complex invariant for each of the involved data structures, and each of the involved operations would have to preserve all of it.

The treatment of globals becomes especially difficult if invariants are temporarily violated. For example, adding a new node to a doubly-linked list temporarily violates invariants that the list is well formed. Larger execution blocks of unrelated code, as in preemption or interrupts, should be avoided during that violation. We address these issues by limiting preemption points, and by deriving the code from Haskell, thus making side effects explicit and bringing them to the attention of the design team.

3.2 Kernel memory management
The seL4 kernel uses a model of memory allocation that exports control of the in-kernel allocation to appropriately authorised applications [20]. While this model is mostly motivated by the need for precise guarantees of memory consumption, it also benefits verification. The model pushes the policy for allocation outside of the kernel, which means we only need to prove that the mechanism works, not that the user-level policy makes sense. Obviously, moving it into userland does not change the fact that the memory-allocation module is part of the trusted computing base. It does mean, however, that such a module can be verified separately, and can rely on verified kernel properties.

The correctness of the allocation algorithm involves checks that new objects are wholly contained within an untyped (free) memory region and that they do not overlap with any other objects allocated from the region. Our memory allocation model keeps track of capability derivations in a tree-like structure, whose nodes are the capabilities themselves.

Before re-using a block of memory, all references to this memory must be invalidated. This involves either finding all outstanding capabilities to the object, or returning the object to the memory pool only when the last capability is deleted. Our kernel uses both approaches.

In the first approach, the capability derivation tree is used to find and invalidate all capabilities referring to a memory region. In the second approach, the capability derivation tree is used to ensure, with a check that is local in scope, that there are no system-wide dangling references. This is possible because all other kernel objects have further invariants on their own internal references that relate back to the existence of capabilities in this derivation tree.

3.3 Concurrency and non-determinism
Concurrency is the execution of computation in parallel (in the case of multiple hardware processors), or by non-deterministic interleaving via a concurrency abstraction like threads. Proofs about concurrent programs are hard, much harder than proofs about sequential programs.

While we have some ideas on how to construct verifiable systems on multiprocessors, they are outside the scope of this paper and left for future work. In this paper we focus on uniprocessor support where the degree of interleaving of execution and non-determinism can be controlled. However, even on a uniprocessor there is some remaining concurrency resulting from asynchronous I/O devices. seL4 avoids much of the complications resulting from I/O by running device drivers at user level, but it must still address interrupts.

Consider the small code fragment A; X; B, where A must establish the state that X relies on, X must establish the state B relies on, and so on. Concurrency issues in the verification of this code arise from yielding, interrupts and exceptions.

Yielding at X results in the potential execution of any reachable activity in the system. This implies A must establish the preconditions required for all reachable activities, and all reachable activities on return must establish the preconditions of B. Yielding increases complexity significantly and makes verification harder. Preemption is a non-deterministically optional yield. Blocking kernel primitives, such as in lock acquisition and waiting on condition variables, are also a form of non-deterministic yield.

By design, we side-step addressing the verification complexity of yield by using an event-based kernel execution model,
with a single kernel stack, and a mostly atomic application programming interface [25].

Interrupt complexity has two forms: non-deterministic execution of the interrupt handlers, and interrupt handling resulting in preemption (as a result of timer ticks). Theoretically, this complexity can be avoided by disabling interrupts during kernel execution. However, this would be at the expense of large or unbounded interrupt latency, which we consider unacceptable.

Instead, we run the kernel with interrupts mostly disabled, except for a small number of carefully-placed interrupt points. If, in the above code fragment, X is the interrupt point, A must establish the state that all interrupt handlers rely on, and all reachable interrupt handlers must establish or preserve the properties B relies on.

We simplify the problem further by implementing interrupt points via polling, rather than temporary enabling of interrupts. On detection of a pending interrupt, we explicitly return through the function call stack to the kernel/user boundary. At the boundary we leave a (potentially modified) event stored in the saved user-level registers. The interrupt becomes a new kernel event (prefixed to the pending user-triggered event). After the in-kernel component of interrupt handling, the interrupted event is restarted. This effectively re-tries the (modified) operation, including re-establishing all the preconditions for execution. In this way we avoid the need for any interrupt-point specific post-conditions for interrupt handlers, but still achieve Fluke-like partial preemptability [25].

The use of interrupt points creates a trade-off, controlled by the kernel designer, between proof complexity and interrupt processing latency. Almost all of seL4’s operations have short and bounded latency, and can execute without any interrupt points at all. The exception is object destruction, whose cleanup operations are inherently unbounded, and are, of course, critical to kernel integrity.

We make these operations preemptable by storing the state of progress of destruction in the last capability referencing the object being destroyed; we refer to this as a zombie capability. This guarantees that the correctness of a restarted destroy is not dependent on user-accessible registers. Another advantage of this approach is that if another user thread attempts to destroy the zombie, it will simply continue where the first thread was preempted (a form of priority inheritance), instead of making the new thread dependent (blocked) on the completion of another.

Exceptions are similar to interrupts in their effect, but are synchronous in that they result directly from the code being executed and cannot be deferred. In the seL4 kernel we avoid exceptions completely and much of that avoidance is guaranteed as a side-effect of verification. Special care is required only for memory faults.

We avoid having to deal with virtual-memory exceptions in kernel code by mapping a fixed region of the virtual address space to physical memory, independent of whether it is actively used or not. The region contains all the memory the kernel can potentially use for its own internal data structures, and is guaranteed to never produce a fault. We prove that this region appears in every virtual address space.

Arguments passed to the kernel from user level are either transferred in registers or limited to preregistered physical frames accessed through the kernel region.

3.4 I/O
As described earlier we avoid most of the complexity of I/O by moving device drivers into protected user-mode components. When processing an interrupt event, our interrupt delivery mechanism determines the interrupt source, masks further interrupts from that specific source, notifies the registered user-level handler (device driver) of the interrupt, and unmasks the interrupt when the handler acknowledges the interrupt.

We coarsely model the hardware interrupt controller of the ARM platform to include interrupt support in the proof. The model includes existence of the controller, masking of interrupts, and that interrupts only occur if unmasked. This is sufficient to include interrupt controller access, and basic behaviour in the proof, without modelling correctness of the interrupt controller management in detail. The proof is set up such that it is easy to include more detail in the hardware model should it become necessary later to prove additional properties.

Our kernel contains a single device driver, the timer driver, which generates timer ticks for the scheduler. This is set up in the initialisation phase of the kernel as an automatically reloaded source of regular interrupts. It is not modified or accessed during the execution of the kernel. We did not need to model the timer explicitly in the proof; we just prove that system behaviour on each tick event is correct.

3.5 Observations
The requirements of verification force the designers to think of the simplest and cleanest way of achieving their goals. We found repeatedly that this leads to overall better design, which tends to reduce the likelihood of bugs.

In a number of cases there were significant other benefits. This is particularly true for the design decisions aimed at simplifying concurrency-related verification issues. Non-preemptable execution (except for a few interrupt-points) has traditionally been used in L4 kernels to maximise average-case performance. Recent L4 kernels aimed at embedded use [32] have adopted an event-based design to reduce the kernel’s memory footprint (due to the use of a single kernel stack rather than per-thread stacks).

4. seL4 VERIFICATION
This section describes each of the specification layers as well as the proof in more detail.

4.1 Abstract specification
The abstract level describes what the system does without saying how it is done. For all user-visible kernel operations it describes the functional behaviour that is expected from the system. All implementations that refine this specification will be binary compatible.

We precisely describe argument formats, encodings and error reporting, so for instance some of the C-level size restrictions become visible on this level. In order to express these, we rarely make use of infinite types like natural numbers. Instead, we use finite machine words, such as 32-bit integers. We model memory and typed pointers explicitly. Otherwise, the data structures used in this abstract specification are high-level: essentially sets, lists, trees, functions and records. We make use of non-determinism in order to leave implementation choices to lower levels: If there are multiple correct results for an operation, this abstract layer
would return all of them and make clear that there is a choice. The implementation is free to pick any one of them.

An example of this is scheduling. No scheduling policy is defined at the abstract level. Instead, the scheduler is modelled as a function picking any runnable thread that is active in the system or the idle thread. The Isabelle/HOL code for this is shown in Fig. 3. The function all_active_tcbs returns the abstract set of all runnable threads in the system. Its implementation (not shown) is an abstract logical predicate over the whole system. The select statement picks any element of the set. The OR makes a non-deterministic choice between the first block and switch_to_idle_thread. The executable specification makes this choice more specific.

    schedule ≡ do
        threads ← all_active_tcbs;
        thread ← select threads;
        switch_to_thread thread
      od OR switch_to_idle_thread

Figure 3: Isabelle/HOL code for scheduler at abstract level.

4.2 Executable specification

The purpose of the executable specification is to fill in the details left open at the abstract level and to specify how the kernel works (as opposed to what it does). While trying to avoid the messy specifics of how data structures and code are optimised in C, we reflect the fundamental restrictions in size and code structure that we expect from the hardware and the C implementation. For instance, we take care not to use more than 64 bits to represent capabilities, exploiting for instance known alignment of pointers. We do not specify in which way this limited information is laid out in C.

The executable specification is deterministic; the only non-determinism left is that of the underlying machine. All data structures are now explicit data types, records and lists with straightforward, efficient implementations in C. For example the capability derivation tree of seL4, modelled as a tree on the abstract level, is now modelled as a doubly linked list with limited level information. It is manipulated explicitly with pointer-update operations.

Fig. 4 shows part of the scheduler specification at this level. The additional complexity becomes apparent in the chooseThread function that is no longer merely a simple predicate, but rather an explicit search backed by data structures for priority queues. The specification fixes the behaviour of the scheduler to a simple priority-based round-robin algorithm. It mentions that threads have time slices and it clarifies when the idle thread will be scheduled. Note that priority queues duplicate information that is already available (in the form of thread states), in order to make it available efficiently. They make it easy to find a runnable thread of high priority. The optimisation will require us to prove that the duplicated information is consistent.

    schedule = do
        action <- getSchedulerAction
        case action of
            ChooseNewThread -> do
                chooseThread
                setSchedulerAction ResumeCurrentThread
            ...

    chooseThread = do
        r <- findM chooseThread' (reverse [minBound .. maxBound])
        when (r == Nothing) $ switchToIdleThread

    chooseThread' prio = do
        q <- getQueue prio
        liftM isJust $ findM chooseThread'' q

    chooseThread'' thread = do
        runnable <- isRunnable thread
        if not runnable then do
                tcbSchedDequeue thread
                return False
        else do
                switchToThread thread
                return True

Figure 4: Haskell code for schedule.

We have proved that the executable specification correctly implements the abstract specification. Because of its extreme level of detail, this proof alone already provides stronger design assurance than has been shown for any other general-purpose OS kernel.

4.3 C implementation

The most detailed layer in our verification is the C implementation. The translation from C into Isabelle is correctness-critical and we take great care to model the semantics of our C subset precisely and foundationally. Precisely means that we treat C semantics, types, and memory model as the standard prescribes, for instance with architecture-dependent word size, padding of structs, type-unsafe casting of pointers, and arithmetic on addresses. As kernel programmers do, we make assumptions about the compiler (GCC) that go beyond the standard, and about the architecture used (ARMv6). These are explicit in the model, and we can therefore detect violations. Foundationally means that we do not just axiomatise the behaviour of C on a high level, but we derive it from first principles as far as possible. For example, in our model of C, memory is a primitive function from addresses to bytes without type information or restrictions. On top of that, we specify how types like unsigned int are encoded, how structures are laid out, and how implicit and explicit type casts behave. We managed to lift this low-level memory model to a high-level calculus that allows efficient, abstract reasoning on the type-safe fragment of the kernel [62, 63, 65]. We generate proof obligations assuring the safety of each pointer access and write. They state that the pointer in question must be non-null and of the correct alignment. They are typically easy to discharge. We generate similar obligations for all restrictions the C99 standard demands.

We treat a very large, pragmatic subset of C99 in the verification. It is a compromise between verification convenience and the hoops the kernel programmers were willing to jump through in writing their source. The following paragraphs describe what is not in this subset.

We do not allow the address-of operator & on local variables, because, for better automation, we make the assumption that local variables are separate from the heap. This could be violated if their address was available to pass on. It is the most far-reaching restriction we implement, because it is common to use local variable references for return parameters of large types that we do not want to pass on the stack. We achieved compliance with this requirement by avoiding reference parameters as much as possible, and where they were needed, used pointers to global variables (which are not restricted).
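The restriction on the address-of operator for local variables can be illustrated with a small sketch. It is not taken from the seL4 sources: the type lookup_result_t and all function names are hypothetical, chosen only to show the two coding styles the text describes (returning a struct by value instead of through a reference parameter, and, where a pointer is unavoidable, pointing it at a global).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, large enough that one might be tempted to
 * return it through a pointer parameter. Not an seL4 type. */
typedef struct {
    uint32_t words[4];
    int      ok;
} lookup_result_t;

/* Helper that writes through a pointer. In the style described above it is
 * only ever handed the address of a global, never of a local variable. */
void fill_result(lookup_result_t *out, uint32_t key)
{
    assert(out != 0);
    for (uint32_t i = 0; i < 4; i++) {
        out->words[i] = key + i;
    }
    out->ok = 1;
}

/* Option 1: no reference parameter at all; the struct is returned by value. */
lookup_result_t lookup_by_value(uint32_t key)
{
    lookup_result_t r = { { key, key + 1, key + 2, key + 3 }, 1 };
    return r;
}

/* Option 2: the result lives in a global, whose address may be taken. */
lookup_result_t current_lookup;

void lookup_into_global(uint32_t key)
{
    fill_result(&current_lookup, key);
}
```

Neither option ever evaluates &x for a local x, so the assumption that locals are separate from the heap is not put at risk. A caller would write `lookup_result_t r = lookup_by_value(7);` or call `lookup_into_global(7)` and read `current_lookup` afterwards.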

One feature of C that is problematic for verification (and programmers) is the unspecified order of evaluation in expressions with side effects. To deal with this feature soundly, we limit how side effects can occur in expressions. If more than one function call occurs within an expression or the expression otherwise depends on global state, a proof obligation is generated to show that these functions are side-effect free. This proof obligation is discharged automatically by Isabelle.

We do not allow function calls through function pointers. (We do allow handing the address of a function to assembler code, e.g. for installing exception vector tables.) We also do not allow goto statements, or switch statements with fall-through cases. We support C99 compound literals, making it convenient to return structs from functions, and reducing the need for reference parameters. We do not allow compound literals to be lvalues. Some of these restrictions could be lifted easily, but the features were not required in seL4.

We did not use unions directly in seL4 and therefore do not support them in the verification (although that would be possible). Since the C implementation was derived from a functional program, all unions in seL4 are tagged, and many structs are packed bitfields. Like other kernel implementors, we do not trust GCC to compile and optimise bitfields predictably for kernel code. Instead, we wrote a small tool that takes a specification and generates C code with the necessary shifting and masking for such bitfields. The tool helps us to easily map structures to page table entries or other hardware-defined memory layouts. The generated code can be inlined and, after compilation on ARM, the result is more compact and faster than GCC's native bitfields. The tool not only generates the C code, it also automatically generates Isabelle/HOL specifications and proofs of correctness [13].

Fig. 5 shows part of the implementation of the scheduling functionality described in the previous sections. It is standard C99 code with pointers, arrays and structs. The thread_state functions used in Fig. 5 are examples of generated bitfield accessors.

    void setPriority(tcb_t *tptr, prio_t prio) {
        prio_t oldprio;
        if(thread_state_get_tcbQueued(tptr->tcbState)) {
            oldprio = tptr->tcbPriority;
            ksReadyQueues[oldprio] =
                tcbSchedDequeue(tptr, ksReadyQueues[oldprio]);
            if(isRunnable(tptr)) {
                ksReadyQueues[prio] =
                    tcbSchedEnqueue(tptr, ksReadyQueues[prio]);
            }
            else {
                thread_state_ptr_set_tcbQueued(&tptr->tcbState,
                                               false);
            }
        }
        tptr->tcbPriority = prio;
    }

Figure 5: C code for part of the scheduler.

4.4 Machine model

Programming in C is not sufficient for implementing a kernel. There are places where the programmer has to go outside the semantics of C to manipulate hardware directly. In the easiest case, this is achieved by writing to memory-mapped device registers, as for instance with a timer chip; in other cases one has to drop down to assembly to implement the required behaviour, as for instance with TLB flushes.

Presently, we do not model the effects of certain direct hardware instructions because they are too far below the abstraction layer of C. Of these, cache and TLB flushes are relevant for the correctness of the code, and we rely on traditional testing for this limited number of cases. Higher assurance can be obtained by adding more detail to the machine model: we have phrased the machine interface such that future proofs about the TLB and cache can be added with minimal changes. Additionally, required behaviour can be guaranteed by targeted assertions (e.g., that page-table updates always flush the TLB), which would result in further proof obligations.

The basis of this formal model of the machine is the internal state of the relevant devices, collected in one record machine_state. For devices that we model more closely, such as the interrupt controller, the relevant part in machine_state contains details such as which interrupts are currently masked. For the parts that we do not model, such as the TLB, we leave the corresponding type unspecified, so it can be replaced with more details later.

Fig. 6 shows our machine interface. The functions are all of type X machine_m which restricts any side effects to the machine_state component of the system. Most of the functions return nothing (type unit), but change the state of a device. In the abstract and executable specification, these functions are implemented with maximal non-determinism. This means that in the extreme case they may arbitrarily change their part of the machine state. Even for devices that we model, we are careful to leave as much behaviour as possible non-deterministic. The less behaviour we prescribe, the fewer assumptions the model makes about the hardware.

    configureTimer :: irq => unit machine_m
    resetTimer :: unit machine_m
    setCurrentPD :: paddr => unit machine_m
    setHardwareASID :: hw_asid => unit machine_m
    invalidateTLB :: unit machine_m
    invalidateHWASID :: hw_asid => unit machine_m
    invalidateMVA :: word => unit machine_m
    cleanCacheMVA :: word => unit machine_m
    cleanCacheRange :: word => word => unit machine_m
    cleanCache :: unit machine_m
    invalidateCacheRange :: word => word => unit machine_m
    getIFSR :: word machine_m
    getDFSR :: word machine_m
    getFAR :: word machine_m
    getActiveIRQ :: (irq option) machine_m
    maskInterrupt :: bool => irq => unit machine_m

Figure 6: Machine interface functions.

In the seL4 implementation, the functions in Fig. 6 are implemented in C where possible, and otherwise in assembly; we must check (but we do not prove) that the implementations match the assumptions we make in the levels above. An example is the function getIFSR, which on ARM returns the instruction fault status register after a page fault. For this function, which is basically a single assembly instruction, we only assume that it does not change the memory state of the machine, which is easy to check.

4.5 The proof

This section describes the main theorem we have shown and how its proof was constructed.

As mentioned, the main property we are interested in is functional correctness, which we prove by showing formal refinement. We have formalised this property for general state machines in Isabelle/HOL, and we instantiate each of the specifications in the previous sections into this state-machine framework.

We have also proved the well-known reduction of refinement to forward simulation, illustrated in Fig. 7: To show that a concrete state machine M2 refines an abstract one M1, it is sufficient to show that for each transition in M2 that may lead from an initial state s to a set of states s', there exists a corresponding transition on the abstract side from an abstract state σ to a set σ' (they are sets because the machines may be non-deterministic). The transitions correspond if there exists a relation R between the states s and σ such that for each concrete state in s' there is an abstract one in σ' that makes R hold between them again. This has to be shown for each transition with the same overall relation R. For each refinement layer in Fig. 2, we have strengthened and varied this proof technique slightly, but the general idea remains the same. Details are published elsewhere [14, 70].

Figure 7: Forward Simulation. (Diagram: the abstract machine M1 steps from σ to σ', the concrete machine M2 steps from s to s', and the relation R links s with σ and s' with σ'.)

We now describe the instantiation of this framework to the seL4 kernel. We have the following types of transition in our state machines: kernel transitions, user transitions, user events, idle transitions, and idle events. Kernel transitions are those that are described by each of the specification layers in increasing amount of detail. User transitions are specified as non-deterministically changing arbitrary user-accessible parts of the state space. User events model kernel entry (trap instructions, faults, interrupts). Idle transitions model the behaviour of the idle thread. Finally, idle events are interrupts occurring during idle time; other interrupts that occur during kernel execution are modelled explicitly and separately in each layer of Fig. 2.

The model of the machine and the model of user programs remain the same across all refinement layers; only the details of kernel behaviour and kernel data structures change. The fully non-deterministic model of the user means that our proof includes all possible user behaviours, be they benign, buggy, or malicious.

Let machine MA denote the system framework instantiated with the abstract specification of Sect. 4.1, let machine ME represent the framework instantiated with the executable specification of Sect. 4.2, and let machine MC stand for the framework instantiated with the C program read into the theorem prover. Then we prove the following two, very simple-looking theorems:

Theorem 1. ME refines MA.

Theorem 2. MC refines ME.

Therefore, because refinement is transitive, we have

Theorem 3. MC refines MA.

Assumptions. The assumptions we make are correctness of the C compiler, the assembly code level, and the hardware. We currently omit correctness of the boot/initialisation code which takes up about 1.2 kLOC of the kernel; the theorems above state correspondence between the kernel entry and exit points in each specification layer. We describe these assumptions in more detail below and discuss their implications.

For the C level, we assume that the GCC compiler correctly implements our subset according to the ISO/IEC C99 standard [39], that the formal model of our C subset accurately reflects this standard and that it makes the correct architecture-specific assumptions for the ARMv6 architecture on the Freescale i.MX31 platform.

The assumptions on the hardware and assembly level mean that we do not prove correctness of the register save/restore and the potential context switch on kernel exit. As described in Sect. 4.4, cache consistency, cache colouring, and TLB flushing requirements are part of the assembly-implemented machine interface. These machine interface functions are called from C, and we assume they do not have any effect on the memory state of the C program. This is only true under the assumption they are used correctly.

In-kernel memory and code access is translated by the TLB on ARM processors. For our C semantics, we assume a traditional, flat view of in-kernel memory that is consistent because all kernel reads and writes are performed through a constant one-to-one VM window which the kernel establishes in every address space. We make this consistency argument only informally; our model does not oblige us to prove it. We do however substantiate the model by manually stated properties and invariants. This means our treatment of in-kernel virtual memory is different to the high standards in the rest of our proof where we reason from first principles and the proof forces us to be complete.

These are not fundamental limitations of the approach, but a decision taken to achieve the maximum outcome with available resources. For instance, we have verified the executable design of the boot code in an earlier design version. For context switching, Ni et al. [49] report verification success, and the Verisoft project [3] showed how to verify assembly code and hardware interaction. Leroy verified an optimising C compiler [44] for the PowerPC architecture. We have also shown that kernel VM access and faults can be modelled foundationally from first principles [42].

Assurance. Having outlined the limitations of our verification, we now discuss the properties that are proved.

Overall, we show that the behaviour of the C implementation is fully captured by the abstract specification. This is a strong statement, as it allows us to conduct all further analysis of properties that can be expressed as Hoare triples on the massively simpler abstract specification instead of a complex C program. Coverage is complete. Any remaining implementation errors (deviations from the specification) can only occur below the level of C.
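The forward-simulation condition behind Theorems 1 and 2 can be made concrete on toy finite machines. The sketch below is an illustration only and has nothing to do with the Isabelle/HOL formalisation: the three-state "concrete" machine, the two-state "abstract" machine, the relation R, and all names are assumptions made up for the example. It checks just the per-transition condition from Sect. 4.5 (every concrete successor must be matched by some abstract successor that re-establishes R) and ignores initial states and the strengthened variants mentioned in the text.

```c
#include <assert.h>
#include <stdbool.h>

#define N_CONC 3  /* concrete states: 0 idle, 1 busy (phase 1), 2 busy (phase 2) */
#define N_ABS  2  /* abstract states: 0 idle, 1 busy */

/* A set of successor states as a bitmask: bit i set means state i is possible. */
typedef unsigned state_set;

/* True iff for every related pair (s, a), each concrete successor s2 of s
 * has at least one abstract successor a2 of a with rel[s2][a2]. */
bool forward_simulation(const state_set conc_step[N_CONC],
                        const state_set abs_step[N_ABS],
                        bool rel[N_CONC][N_ABS])
{
    for (int s = 0; s < N_CONC; s++) {
        for (int a = 0; a < N_ABS; a++) {
            if (!rel[s][a]) {
                continue;                     /* only related pairs matter */
            }
            for (int s2 = 0; s2 < N_CONC; s2++) {
                if (!((conc_step[s] >> s2) & 1u)) {
                    continue;                 /* s2 is not a successor of s */
                }
                bool matched = false;
                for (int a2 = 0; a2 < N_ABS; a2++) {
                    if (((abs_step[a] >> a2) & 1u) && rel[s2][a2]) {
                        matched = true;
                    }
                }
                if (!matched) {
                    return false;             /* concrete step with no abstract match */
                }
            }
        }
    }
    return true;
}

/* Hypothetical state relation: both concrete busy phases map to abstract busy. */
static bool R[N_CONC][N_ABS] = {
    { true,  false },
    { false, true  },
    { false, true  },
};

/* Deterministic concrete cycle 0 -> 1 -> 2 -> 0 against a permissive abstract machine. */
bool demo_refines(void)
{
    const state_set conc[N_CONC] = { 1u << 1, 1u << 2, 1u << 0 };
    const state_set abst[N_ABS]  = { (1u << 0) | (1u << 1), (1u << 0) | (1u << 1) };
    return forward_simulation(conc, abst, R);
}

/* Concrete idle may stay idle, but the abstract machine demands idle -> busy. */
bool demo_broken(void)
{
    const state_set conc[N_CONC] = { 1u << 0, 1u << 2, 1u << 0 };
    const state_set abst[N_ABS]  = { 1u << 1, (1u << 0) | (1u << 1) };
    return forward_simulation(conc, abst, R);
}
```

In the first demo the abstract machine is non-deterministic and the concrete machine picks one of the allowed behaviours, which mirrors how the executable specification resolves choices left open at the abstract level. In the second demo the concrete machine exhibits a step the abstract machine does not permit, so the check fails.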

A cynic might say that an implementation proof only shows that the implementation has precisely the same bugs that the specification contains. This is true: the proof does not guarantee that the specification describes the behaviour the user expects. The difference is the degree of abstraction and the absence of whole classes of bugs. In the same notation, the abstract specification is one third the size of the C code and works with concepts that are simpler and faster to reason about. The current level of abstraction is low enough to be precise for the operational behaviour of the kernel. To analyse specific properties of the system, one might also introduce another, even higher level of abstraction that contains only the aspects relevant for the property. An example is our access control model of seL4 [11, 21].

In addition to the implementation correctness statement, our strengthened proof technique for forward simulation [14] implies that both ME and MC never fail and always have defined behaviour. This means the kernel can never crash or otherwise behave unexpectedly as long as our assumptions hold. This includes that all assertions¹ in the kernel design are true on all code paths, and that the kernel never accesses a null pointer or a misaligned pointer.

¹ One might think that assertions are pointless in a verified kernel. In fact, they are not only a great help during development, they also convey a useful message from the kernel designer to the verifier about an important invariant in the code, and as such aid verification.

We proved that all kernel API calls terminate and return to user level. There is no possible situation in which the kernel can enter an infinite loop. Since the interface from user level to the abstract specification is binary compatible with the final implementation, our refinement theorem implies that the kernel does all argument checking correctly and that it cannot be subverted by buggy encodings, spurious calls, maliciously constructed arguments to system calls, buffer overflow attacks or other such vectors from user level. All these properties hold with the full assurance of machine-checked proof.

As part of the refinement proof between levels MA and ME, we had to show a large number of invariants. These invariants are not merely a proof device, but provide valuable information and assurance in themselves. In essence, they collect information about what we know to be true of each data structure in the kernel, before and after each system call, and also for large parts during kernel execution where some of these invariants may be temporarily violated and re-established later. The overall proof effort was clearly dominated by invariant proofs, with the actual refinement statements between abstract and executable specification accounting for at most 20 % of the total effort for that stage. There is not enough space in this paper to enumerate all the invariant statements we have proved, but we will attempt a rough categorisation, show a few representatives, and give a general flavour.

There are four main categories of invariants in our proof: low-level memory invariants, typing invariants, data structure invariants, and algorithmic invariants.

The first two categories could in part be covered by a type-safe language: low-level memory invariants include that there is no object at address 0, that kernel objects are aligned to their size, and that they do not overlap. The typing invariants say that each kernel object has a well-defined type and that its references in turn point to objects of the right type. An example would be a capability slot containing a reference to a thread control block (TCB). The invariant would state that the type of the first object is a capability-table entry and that its reference points to a valid object in memory with type TCB. Intuitively, this invariant implies that all reachable, potentially used references in the kernel (be it in capabilities, kernel objects or other data structures) always point to an object of the expected type. This is a necessary condition for safe execution: we need to know that pointers point to well-defined and well-structured data, not garbage. This is also a dynamic property, because objects can be deleted and memory can be re-typed at runtime. Note that the main invariant is about potentially used references. We do allow some references, such as in ARM page table objects, to be temporarily left dangling, as long as we can prove that these dangling references will never be touched. Our typing invariants are stronger than those one would expect from a standard programming language type system. They are context dependent and they include value ranges such as using only a certain number of bits for hardware address space identifiers (ASIDs). They also often exclude specific values such as -1 or 0 as valid values because these are used in C to indicate success or failure of the corresponding operations. Typing invariants are usually simple to state and for large parts of the code, their preservation can be proved automatically. There are only two operations where the proof is difficult: removing and retyping objects. Type preservation for these two operations is the main reason for a large number of other kernel invariants.

The third category of invariants are classical data structure invariants like correct back links in doubly-linked lists, a statement that there are no loops in specific pointer structures, that other lists are always terminated correctly with NULL, or that data structure layout assumptions are interpreted the same way everywhere in the code. These invariants are not especially hard to state, but they are frequently violated over short stretches of code and then re-established later, usually when lists are updated or elements are removed.

The fourth and last category of invariants that we identify in our proof are algorithmic invariants that are specific to how the seL4 kernel works. These are the most complex invariants in our proof and they are where most of the proof effort was spent. These invariants are either required to prove that specific optimisations are allowed (e.g. that a check can be left out because the condition can be shown to be always true), or they are required to show that an operation executes safely and does not violate other invariants, especially not the typing invariant. Examples of simple algorithmic invariants are that the idle thread is always in thread state idle, and that only the idle thread is in this state. Another one is that the global kernel memory containing kernel code and data is mapped in all address spaces. Slightly more involved are relationships between the existence of capabilities and thread states. For instance, if a Reply capability exists to a thread, this thread must always be waiting to receive a reply. This is a non-local property connecting the existence of an object somewhere in memory with a particular state of another object somewhere else. Other invariants formally describe a general symmetry principle that seL4 follows: if an object x has a reference to another object y, then there is a reference in object y that can be used to find object x directly or indirectly. This fact is exploited heavily in the delete operation to clean up all remaining references to an object before it is deleted.

The reason this delete operation is safe is complicated. Here is a simplified, high-level view of the chain of invariants that show an efficient local pointer test is enough to ensure that deletion is globally safe:

(1) If an object is live (contains references to other objects), there exists a capability to it somewhere in memory.
(2) If an untyped capability c1 covers a sub-region of another capability c2, then c1 must be a descendant of c2 according to the capability derivation tree (CDT).
(3) If a capability c1 points to a kernel object whose memory is covered by an untyped capability c2, then c1 must be a descendant of c2.

With these, we have: If an untyped capability has no children in the CDT (a simple pointer comparison according to additional data structure invariants), then all kernel objects in its region must be non-live (otherwise there would be capabilities to them, which in turn would have to be children of the untyped capability). If the objects are not live and no capabilities to them exist, there is no further reference in the whole system that could be made unsafe by the type change because otherwise the symmetry principle on references would be violated. Deleting the object will therefore preserve the basic typing and safety properties. Of course we also have to show that deleting the object preserves all the new invariants we just used as well.

We have proved over 150 invariants on the different specification levels; most are interrelated, many are complex. All these invariants are expressed as formulae on the kernel state and are proved to be preserved over all possible kernel executions.

5. EXPERIENCE AND LESSONS LEARNT

5.1 Performance

IPC performance is the most critical metric for evaluation in a microkernel in which all interaction occurs via IPC. We have evaluated the performance of seL4 by comparing IPC performance with L4, which has a long history of data points to draw upon [47].

Publicly available performance for the Intel XScale PXA 255 (ARMv5) is 151 in-kernel cycles for a one-way IPC [43], and our experiments with OKL4 2.1 [51] on the platform we are using (Freescale i.MX31 evaluation board based on a 532 MHz ARM1136JF-S which is an ARMv6 ISA) produced 206 cycles as a point for comparison; this number was produced using a hand-crafted assembly-language path. The non-optimised C version in the same kernel took 756 cycles. These are hot-cache measurements obtained using the processor cycle counter.

Our measurement for seL4 is 224 cycles for one-way IPC in an optimised C path, which is approaching the performance of optimised assembly-language IPC paths for other L4 kernels on ARM processors. This puts seL4 performance into the vicinity of the fastest L4 kernels.

At the time of writing, this optimised IPC C path is not yet in the verified code base, but it is within the verifiable fragment of C and is not fundamentally different from the rest of the code.

5.2 Verification effort

The project was conducted in three phases. First an initial kernel with limited functionality (no interrupts, single address space and generic linear page table) was designed and implemented in Haskell, while the verification team mostly worked on the verification framework and generic proof libraries. In a second phase, the verification team developed the abstract spec and performed the first refinement while the development team completed the design, Haskell prototype and C implementation. The third phase consisted of extending the first refinement step to the full kernel and performing the second refinement. The overall size of the proof, including framework, libraries, and generated proofs (not shown in the table) is 200,000 lines of Isabelle script.

              Haskell/C   Isabelle   Invariants   Proof
              LOC         LOC                     LOP
    abst.     -           4,900      ~75          110,000
    exec.     5,700       13,000     ~80          55,000
    impl.     8,700       15,000     0

Table 1: Code and proof statistics. (The proof sizes belong to the refinement step from each row to the next: 110,000 LOP from abstract to executable, 55,000 LOP from executable to implementation.)

The overall code statistics are presented in Table 1. The abstract spec took about 4 person months (pm) to develop. About 2 person years (py) went into the Haskell prototype (over all project phases), including design, documentation, coding, and testing. The executable spec only required setting up the translator; this took 3 pm. The initial C translation was done in 3 weeks; in total the C implementation took about 2 pm, for a total cost of 2.2 py including the Haskell effort.

This compares well with other efforts for developing a new microkernel from scratch: The Karlsruhe team reports that, on the back of their experience from building the earlier Hazelnut kernel, the development of the Pistachio kernel cost about 6 py [17]. SLOCCount [68] with the "embedded" profile estimates the total cost of seL4 at 4 py. Hence, there is strong evidence that the detour via Haskell did not increase the cost, but was in fact a significant net cost saver. This means that our development process can be highly recommended even for projects not considering formal verification.

The cost of the proof is higher, in total about 20 py. This includes significant research and about 9 py invested in formal language frameworks, proof tools, proof automation, theorem prover extensions and libraries. The total effort for the seL4-specific proof was 11 py.

We expect that re-doing a similar verification for a new kernel, using the same overall methodology, would reduce this figure to 6 py, for a total (kernel plus proof) of 8 py. This is only twice the SLOCCount estimate for a traditionally-engineered system with no assurance. It certainly compares favourably to industry rules-of-thumb of $10k/LOC for Common Criteria EAL6 certification, which would be $87M for seL4, yet provides far less assurance than formal verification.

The breakdown of effort between the two refinement stages is illuminating: The first refinement step (from abstract to executable spec) consumed 8 py, the second (to concrete spec) less than 3 py, almost a 3:1 breakdown. This is a reflection of the low-level nature of our Haskell implementation, which captures most of the properties of the final product. This is also reflected in the proof size: the first proof step contained most of the deep semantic content. 80 % of the effort in the first refinement went into establishing invariants, only 20 % into the actual correspondence proof. We consider this asymmetry a significant benefit, as the executable spec is more convenient and efficient to reason about than the C level.
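The effort figures quoted in Sect. 5.2 can be cross-checked with simple arithmetic. The block below only recombines numbers stated in the text and in Table 1 (8,700 LOC is the size of the C implementation); the labels on the left are ours, and the roundings follow the text.

```latex
\begin{align*}
\text{EAL6 rule of thumb:}\quad
  & 8{,}700~\text{LOC} \times \$10\text{k}/\text{LOC} = \$87\text{M} \\
\text{total proof cost:}\quad
  & 9~\text{py (frameworks, tools, libraries)} + 11~\text{py (seL4-specific)} = 20~\text{py} \\
\text{projected repeat:}\quad
  & 2.2~\text{py (kernel)} + 6~\text{py (proof)} \approx 8~\text{py}
    = 2 \times 4~\text{py (SLOCCount estimate)} \\
\text{refinement split:}\quad
  & 8~\text{py} : {<}\,3~\text{py} \approx 3 : 1
\end{align*}
```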

Our formal refinement framework for C made it possible to avoid proving any invariants on the C code, speeding up this stage of the proof significantly. We proved only few additional invariants on the executable spec layer to substantiate optimisations in C.

The first refinement step led to some 300 changes in the abstract spec and 200 in the executable spec. About 50 % of these changes relate to bugs in the associated algorithms or design; the rest were introduced for verification convenience. The ability to change and rearrange code in discussion with the design team (to predict performance impact) was an important factor in the verification team's productivity. It is a clear benefit of the approach described in Sect. 2.2 and was essential to complete the verification in the available time.

By the time the second refinement started, the kernel had been used by a number of internal student projects and the x86 port was underway. Those two activities uncovered 16 defects in the implementation before verification had started in earnest; the formal verification has uncovered another 144 defects and resulted in 54 further changes to the code to aid in the proof. None of the bugs found in the C verification stage were deep in the sense that the corresponding algorithm was flawed. This is because the C code was written according to a very precise, low-level specification which was already verified in the first refinement stage. Algorithmic bugs found in the first stage were mainly missing checks on user supplied input, subtle side effects in the middle of an operation breaking global invariants, or over-strong assumptions about what is true during execution. The bugs discovered in the second proof from executable spec to C were mainly typos, misreading the specification, or failing to update all relevant code parts for specification changes. Simple typos also made up a surprisingly large fraction of discovered bugs in the relatively well tested executable specification in the first refinement proof, which suggests that normal testing may not only miss hard and subtle bugs, but also a larger number of simple, obvious faults than one may expect. Even though their cause was often simple, understandable human error, their effect in many cases was sufficient to crash the kernel or create security vulnerabilities. Other more interesting bugs found during the C implementation proof were missing exception case checking, and different interpretations of default values in the code. For example, the interrupt controller on ARM returns 0xFF to signal that no interrupt is active which is used correctly in most parts of the code, but in one place the check was against NULL instead.

The C verification also led to changes in the executable and abstract specifications: 44 of these were to make the proof easier; 34 were implementation restrictions, such as the maximum size of virtual address space identifiers, which the specifications should make visible to the user.

5.3 The cost of change

An obvious issue of verification is the cost of proof maintenance: how much does it cost to re-verify after changes made to the kernel? It obviously depends on the nature of the change, specifically the amount of code it changes, the number of invariants it affects, and how localised it is. We are not able to quantify such costs, but our iterative approach to verification has provided us with some relevant experience.

The best case is local, low-level code changes, typically optimisations that do not affect the observable behaviour. We made such changes repeatedly, and found that the effort for re-verification was always low and roughly proportional to the size of the change.

Adding new, independent features, which do not interact in a complex way with existing features, usually has a moderate effect. For example, adding a new system call to the seL4 API that atomically batches a specific, short sequence of existing system calls took one day to design and implement. Adjusting the proof took less than 1 pw.

Adding new, large, cross-cutting features, such as adding a complex new data structure to the kernel supporting new API calls that interact with other parts of the kernel, is significantly more expensive. We experienced such a case when progressing from the first to the final implementation, adding interrupts, ARM page tables and address spaces. This change cost several pms to design and implement, and resulted in 1.5–2 py to re-verify. It modified about 12 % of existing Haskell code, added another 37 %, and re-verification cost about 32 % of the time previously invested in verification. The new features required only minor adjustments of existing invariants, but led to a considerable number of new invariants for the new code. These invariants have to be preserved over the whole kernel API, not just the new features.

Unsurprisingly, fundamental changes to existing features are bad news. We had one example of such a change when we added reply capabilities for efficient RPC as an API optimisation after the first refinement was completed. Reply capabilities are created on the fly in the receiver of an IPC and are treated in most cases like other capabilities. They are single-use, and thus deleted immediately after use. This fundamentally broke a number of properties and invariants on capabilities. Creation and deletion of capabilities require a large number of preconditions to execute safely. The operations were carefully constrained in the kernel. Doing them on the fly required complex preconditions to be proved for many new code paths. Some of these turned out not to be true, which required extra work on special-case proofs or changes to existing invariants (which then needed to be re-proved for the whole kernel). Even though the code size of this change was small (less than 5 % of the total code base), the comparative amount of conceptual cross-cutting was huge. It took about 1 py or 17 % of the original proof effort to re-verify.

There is one class of otherwise frequent code changes that does not occur after the kernel has been verified: implementation bug fixes.

6. RELATED WORK

We briefly summarise the literature on OS verification. Klein [40] provides a comprehensive overview.

The first serious attempts to verify an OS kernel were in the late 1970s UCLA Secure Unix [67] and the Provably Secure Operating System (PSOS) [24]. Our approach mirrors the UCLA effort in using refinement and defining functional correctness as the main property to prove. The UCLA project managed to finish 90 % of their specification and 20 % of their proofs in 5 py. They concluded that invariant reasoning dominated the proof effort, which we found confirmed in our project.

PSOS was mainly focussed on formal kernel design and never completed any substantial implementation proofs. Its design methodology was later used for the Kernelized Secure Operating System (KSOS) [52] by Ford Aerospace. The

The Secure Ada Target (SAT) [30] and the Logical Coprocessor Kernel (LOCK) [55] are also inspired by the PSOS design and methodology.

In the 1970s, machine support for theorem proving was rudimentary. Basic language concepts like pointers still posed large problems. The UCLA effort reports that the simplifications required to make verification feasible made the kernel an order of magnitude slower [67]. We have demonstrated that with modern tools and techniques, this is no longer the case.

The first real, completed implementation proofs, although for a highly idealised OS kernel, are reported for KIT, consisting of 320 lines of artificial, but realistic assembly instructions [8].

Bevier and Smith later produced a formalisation of the Mach microkernel [9] without implementation proofs. Other formal modelling and proofs for OS kernels that did not proceed to the implementation level include the EROS kernel [57], the high-level analysis of SELinux [5, 29] based on FLASK [60], and the MASK [48] project which was geared towards information-flow properties.

The VFiasco project [36] and later the Robin project [61] attempted to verify C++ kernel implementations. They managed to create a precise model of a large, relevant part of C++, but did not verify substantial parts of a kernel.

Heitmeyer et al. [33] report on the verification and Common Criteria certification of a "software-based embedded device" featuring a small (3,000 LOC) separation kernel. They show data separation only, not functional correctness. Although they seem to at least model the implementation level, they did not conduct a machine-checked proof directly on the C code.

Hardin et al. [31] formally verified information-flow properties of the AAMP7 microprocessor [53], which implements the functionality of a static separation kernel in hardware. The functionality provided is less complex than a general purpose microkernel: the processor does not support online reconfiguration of separation domains. The proof goes down to a low-level design that is in close correspondence to the micro code. This correspondence is not proven formally, but by manual inspection.

A similar property was recently shown for Green Hills' Integrity kernel [28] during a Common Criteria EAL6+ certification [27]. The Separation Kernel Protection Profile [38] of Common Criteria shows data separation only. It is a weaker property than full functional correctness.

A closely related contemporary project is Verisoft [2], which is attempting to verify not only the OS, but a whole software stack from verified hardware up to verified application programs. This includes a formally verified, non-optimising compiler for their own Pascal-like implementation language. Even if not all proofs are completed yet, the project has successfully demonstrated that such a verification stack for full functional correctness can be achieved. They have also shown that verification of assembly-level code is feasible. However, Verisoft accepts two orders of magnitude slow-down for their highly-simplified VAMOS kernel (e.g. only single-level page tables) and that their verified hardware platform VAMP is not widely deployed. We deal with real C and standard tool chains on ARMv6, and have aimed for a commercially deployable, realistic microkernel.

Other formal techniques for increasing the trustworthiness of operating systems include static analysis, model checking and shape analysis. Static analysis can in the best case only show the absence of certain classes of defects such as buffer overruns. Model checking in the OS space includes SLAM [6] and BLAST [34]. They can show specific safety properties of C programs automatically, such as correct API usage in device drivers. The terminator tool [15] increases reliability of device drivers by attempting to prove termination automatically. Full functional correctness of a realistic microkernel is still beyond the scope of these automatic techniques.

Implementations of kernels in type-safe languages such as SPIN [7] and Singularity [23] offer increased reliability, but they have to rely on traditional "dirty" code to implement their language runtime, which tends to be substantially bigger than the complete seL4 kernel. While type safety is a good property to have, it is not very strong. The kernel may still misbehave or attempt, for instance, a null pointer access. Instead of randomly crashing, it will report a controlled exception. In our proof, we show a variant of type safety for the seL4 code. Even though the kernel deliberately breaks the C type system, it only does so in a safe way. Additionally, we prove much more: that there will never be any such null pointer accesses, that the kernel will never crash and that the code always behaves in strict accordance with the abstract specification.

7. CONCLUSIONS

We have presented our experience in formally verifying seL4. We have shown that full, rigorous, formal verification is practically achievable for OS microkernels with very reasonable effort compared to traditional development methods.

Although we have not invested significant effort into optimisation, we have shown that optimisations are possible and that performance does not need to be sacrificed for verification. The seL4 kernel is practical, usable, and directly deployable, running on ARMv6 and x86.

Collateral benefits of the verification include our rapid prototyping methodology for kernel design. We observed a confluence of design principles from the formal methods and the OS side, leading to design decisions such as an event-based kernel that is mostly non-preemptable and uses interrupt polling. These decisions made the kernel design simpler and easier to verify without sacrificing performance. Evidence suggests that taking the detour via a Haskell prototype increased our productivity even without considering verification.

Future work in this project includes verification of the assembly parts of the kernel, a multi-core version of the kernel, as well as application verification. The latter now becomes much more meaningful than previously possible: application proofs can rely on the abstract, formal kernel specification that seL4 is proven to implement.

Compared to the state of the art in software certification, the ultimate degree of trustworthiness we have achieved is redefining the standard of highest assurance.

Acknowledgements

We thank Timothy Bourke, Timothy Roscoe, and Adam Wiggins for valued feedback on drafts of this article. We also would like to acknowledge the contribution of the former team members on this verification project: Jeremy Dawson, Jia Meng, Catherine Menon, and David Tsai.

NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.

8. REFERENCES

[1] M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young. Mach: A new kernel foundation for UNIX development. In 1986 Summer USENIX, pages 93–112, 1986.
[2] E. Alkassar, M. Hillebrand, D. Leinenbach, N. Schirmer, A. Starostin, and A. Tsyban. Balancing the load - leveraging a semantics stack for systems verification. JAR, 42(2–4), 2009.
[3] E. Alkassar, N. Schirmer, and A. Starostin. Formal pervasive verification of a paging mechanism. In C. R. Ramakrishnan and J. Rehof, editors, Tools and Alg. for the Construction and Analysis of Systems (TACAS), volume 4963 of LNCS, pages 109–123. Springer, 2008.
[4] J. Alves-Foss, P. W. Oman, C. Taylor, and S. Harrison. The MILS architecture for high-assurance embedded systems. Int. J. Emb. Syst., 2:239–247, 2006.
[5] M. Archer, E. Leonard, and M. Pradella. Analyzing security-enhanced Linux policy specifications. In POLICY '03: Proc. 4th IEEE Int. WS on Policies for Distributed Systems and Networks, pages 158–169. IEEE Computer Society, 2003.
[6] T. Ball and S. K. Rajamani. SLIC: A specification language for interface checking. Technical Report MSR-TR-2001-21, Microsoft Research, 2001.
[7] B. N. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. E. Fiuczynski, D. Becker, C. Chambers, and S. Eggers. Extensibility, safety and performance in the SPIN operating system. In 15th SOSP, Dec 1995.
[8] W. R. Bevier. Kit: A study in operating system verification. IEEE Transactions on Software Engineering, 15(11):1382–1396, 1989.
[9] W. R. Bevier and L. Smith. A mathematical model of the Mach kernel: Atomic actions and locks. Technical Report 89, Computational Logic Inc., Apr 1993.
[10] I. T. Bowman, R. C. Holt, and N. V. Brewster. Linux as a case study: its extracted software architecture. In ICSE '99: Proc. 21st Int. Conf. on Software Engineering, pages 555–563. ACM, 1999.
[11] A. Boyton. A verified shared capability model. In G. Klein, R. Huuck, and B. Schlich, editors, 4th WS Syst. Softw. Verification SSV'09, ENTCS, pages 99–116. Elsevier, Jun 2009.
[12] P. Brinch Hansen. The nucleus of a multiprogramming operating system. CACM, 13:238–250, 1970.
[13] D. Cock. Bitfields and tagged unions in C: Verification through automatic generation. In B. Beckert and G. Klein, editors, VERIFY'08, volume 372 of CEUR Workshop Proceedings, pages 44–55, Aug 2008.
[14] D. Cock, G. Klein, and T. Sewell. Secure microkernels, state monads and scalable refinement. In O. A. Mohamed, C. Muñoz, and S. Tahar, editors, 21st TPHOLs, volume 5170 of LNCS, pages 167–182. Springer, Aug 2008.
[15] B. Cook, A. Gotsman, A. Podelski, A. Rybalchenko, and M. Y. Vardi. Proving that programs eventually do something good. In 34th POPL. ACM, 2007.
[16] J. Criswell, A. Lenharth, D. Dhurjati, and V. Adve. Secure virtual architecture: A safe execution environment for commodity operating systems. In 16th SOSP, pages 351–366, Oct 2007.
[17] U. Dannowski. Personal communication.
[18] W.-P. de Roever and K. Engelhardt. Data Refinement: Model-Oriented Proof Methods and their Comparison. Number 47 in Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1998.
[19] P. Derrin, K. Elphinstone, G. Klein, D. Cock, and M. M. T. Chakravarty. Running the manual: An approach to high-assurance microkernel development. In ACM SIGPLAN Haskell WS, Sep 2006.
[20] D. Elkaduwe, P. Derrin, and K. Elphinstone. Kernel design for isolation and assurance of physical memory. In 1st IIES, pages 35–40. ACM SIGOPS, Apr 2008.
[21] D. Elkaduwe, G. Klein, and K. Elphinstone. Verified protection model of the seL4 microkernel. In J. Woodcock and N. Shankar, editors, VSTTE 2008 - Verified Softw.: Theories, Tools & Experiments, volume 5295 of LNCS, pages 99–114. Springer, Oct 2008.
[22] K. Elphinstone, G. Klein, P. Derrin, T. Roscoe, and G. Heiser. Towards a practical, verified kernel. In 11th HotOS, pages 117–122, May 2007.
[23] M. Fähndrich, M. Aiken, C. Hawblitzel, O. Hodson, G. C. Hunt, J. R. Larus, and S. Levi. Language support for fast and reliable message-based communication in Singularity OS. In 1st EuroSys Conf., pages 177–190, Apr 2006.
[24] R. J. Feiertag and P. G. Neumann. The foundations of a provably secure operating system (PSOS). In AFIPS Conf. Proc., 1979 National Comp. Conf., Jun 1979.
[25] B. Ford, M. Hibler, J. Lepreau, R. McGrath, and P. Tullmann. Interface and execution models in the Fluke kernel. In 3rd OSDI. USENIX, Feb 1999.
[26] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum, and D. Boneh. Terra: A virtual machine-based platform for trusted computing. In 19th SOSP, Oct 2003.
[27] Green Hills Software, Inc. INTEGRITY-178B separation kernel security target version 1.0. http://www.niap-ccevs.org/cc-scheme/st/st_vid10119-st.pdf, 2008.
[28] Greenhills Software, Inc. Integrity real-time operating system. http://www.ghs.com/products/rtos/integrity.html, 2008.
[29] J. D. Guttman, A. L. Herzog, J. D. Ramsdell, and C. W. Skorupka. Verifying information flow goals in security-enhanced Linux. Journal of Computer Security, 13(1):115–134, 2005.
[30] J. T. Haigh and W. D. Young. Extending the noninterference version of MLS for SAT. IEEE Trans. on Software Engineering, 13(2):141–150, 1987.
[31] D. S. Hardin, E. W. Smith, and W. D. Young. A robust machine code proof framework for highly secure applications. In ACL2'06: Proc. Int. WS on the ACL2 theorem prover and its applications. ACM, 2006.
[32] G. Heiser. Hypervisors for consumer electronics. In 6th IEEE CCNC, 2009.
[33] C. L. Heitmeyer, M. Archer, E. I. Leonard, and J. McLean. Formal specification and verification of data separation in a separation kernel for an embedded system. In CCS '06: Proc. 13th Conf. on Computer and Communications Security, pages 346–355. ACM, 2006.
[34] T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Software verification with Blast. In SPIN'03, Workshop on Model Checking Software, 2003.
[35] M. Hohmuth, M. Peter, H. Härtig, and J. S. Shapiro. Reducing TCB size by using untrusted components - small kernels versus virtual-machine monitors. In 11th SIGOPS Eur. WS, Sep 2004.
[36] M. Hohmuth and H. Tews. The VFiasco approach for a verified operating system. In 2nd PLOS, Jul 2005.
[37] Iguana. http://www.ertos.nicta.com.au/software/kenge/iguana-project/latest/.
[38] Information Assurance Directorate. U.S. Government Protection Profile for Separation Kernels in Environments Requiring High Robustness, Jun 2007. Version 1.03. http://www.niap-ccevs.org/cc-scheme/pp/pp.cfm/id/pp_skpp_hr_v1.03/.
[39] ISO/IEC. Programming languages - C. Technical Report 9899:TC2, ISO/IEC JTC1/SC22/WG14, May 2005.
[40] G. Klein. Operating system verification - an overview. Sādhanā, 34(1):27–69, Feb 2009.
[41] G. Klein, P. Derrin, and K. Elphinstone. Experience report: seL4 - formally verifying a high-performance microkernel. In 14th ICFP, Aug 2009.
[42] R. Kolanski and G. Klein. Types, maps and separation logic. In S. Berghofer, T. Nipkow, C. Urban, and M. Wenzel, editors, Proc. TPHOLs'09, volume 5674 of LNCS. Springer, 2009.
[43] L4HQ. http://l4hq.org/arch/arm/.
[44] X. Leroy. Formal certification of a compiler back-end, or: Programming a compiler with a proof assistant. In J. G. Morrisett and S. L. P. Jones, editors, 33rd POPL, pages 42–54. ACM, 2006.
[45] J. Liedtke. Improving IPC by kernel design. In 14th SOSP, pages 175–188, Dec 1993.
[46] J. Liedtke. Towards real microkernels. CACM, 39(9):70–77, Sep 1996.
[47] J. Liedtke, K. Elphinstone, S. Schönberg, H. Härtig, G. Heiser, N. Islam, and T. Jaeger. Achieved IPC performance (still the foundation for extensibility). In 6th HotOS, pages 28–31, May 1997.
[48] W. B. Martin, P. White, A. Goldberg, and F. S. Taylor. Formal construction of the mathematically analyzed separation kernel. In ASE '00: Proc. 15th IEEE Int. Conf. on Automated software engineering, pages 133–141. IEEE Computer Society, 2000.
[49] Z. Ni, D. Yu, and Z. Shao. Using XCAP to certify realistic system code: Machine context management. In Proc. TPHOLs'07, volume 4732 of LNCS, pages 189–206. Springer, Sep 2007.
[50] T. Nipkow, L. Paulson, and M. Wenzel. Isabelle/HOL - A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.
[51] OKL4 web site. http://okl4.org.
[52] T. Perrine, J. Codd, and B. Hardy. An overview of the kernelized secure operating system (KSOS). In Proceedings of the Seventh DoD/NBS Computer Security Initiative Conference, pages 146–160, Sep 1984.
[53] Rockwell Collins, Inc. AAMP7r1 Reference Manual, 2003.
[54] J. M. Rushby. Design and verification of secure systems. In 8th SOSP, pages 12–21, 1981.
[55] O. Saydjari, J. Beckman, and J. Leaman. Locking computers securely. In 10th National Computer Security Conference, pages 129–141, Sep 1987.
[56] A. Seshadri, M. Luk, N. Qu, and A. Perrig. SecVisor: A tiny hypervisor to provide lifetime kernel code integrity for commodity OSes. In 16th SOSP, pages 335–350, Oct 2007.
[57] J. S. Shapiro, D. J. Farber, and J. M. Smith. State caching in the EROS kernel: implementing efficient orthogonal persistence in a pure capability system. In 5th IWOOOS, pages 89–100, Nov 1996.
[58] J. S. Shapiro, J. M. Smith, and D. J. Farber. EROS: A fast capability system. In 17th SOSP, Dec 1999.
[59] L. Singaravelu, C. Pu, H. Härtig, and C. Helmuth. Reducing TCB complexity for security-sensitive applications: Three case studies. In 1st EuroSys Conf., pages 161–174, Apr 2006.
[60] R. Spencer, S. Smalley, P. Loscocco, M. Hibler, D. Andersen, and J. Lepreau. The Flask security architecture: System support for diverse security policies. In 8th USENIX Security Symp., Aug 1999.
[61] H. Tews, T. Weber, and M. Völp. A formal model of memory peculiarities for the verification of low-level operating-system code. In R. Huuck, G. Klein, and B. Schlich, editors, Proc. 3rd Int. WS on Systems Software Verification (SSV'08), volume 217 of ENTCS, pages 79–96. Elsevier, Feb 2008.
[62] H. Tuch. Formal Memory Models for Verifying C Systems Code. PhD thesis, UNSW, Aug 2008.
[63] H. Tuch. Formal verification of C systems code: Structured types, separation logic and theorem proving. JAR, 42(2–4):125–187, 2009.
[64] H. Tuch, G. Klein, and G. Heiser. OS verification - now! In 10th HotOS, pages 7–12. USENIX, Jun 2005.
[65] H. Tuch, G. Klein, and M. Norrish. Types, bytes, and separation logic. In M. Hofmann and M. Felleisen, editors, 34th POPL, pages 97–108, Jan 2007.
[66] US National Institute of Standards. Common Criteria for IT Security Evaluation, 1999. ISO Standard 15408. http://csrc.nist.gov/cc/.
[67] B. J. Walker, R. A. Kemmerer, and G. J. Popek. Specification and verification of the UCLA Unix security kernel. CACM, 23(2):118–131, 1980.
[68] D. A. Wheeler. SLOCCount. http://www.dwheeler.com/sloccount/, 2001.
[69] A. Whitaker, M. Shaw, and S. D. Gribble. Scale and performance in the Denali isolation kernel. In 5th OSDI, Dec 2002.
[70] S. Winwood, G. Klein, T. Sewell, J. Andronick, D. Cock, and M. Norrish. Mind the gap: A verification framework for low-level C. In S. Berghofer, T. Nipkow, C. Urban, and M. Wenzel, editors, Proc. TPHOLs'09, volume 5674 of LNCS. Springer, 2009.
[71] W. Wulf, E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, and F. Pollack. HYDRA: The kernel of a multiprocessor operating system. CACM, 17:337–345, 1974.