Welcome to Scribd. Sign in or start your free trial to enjoy unlimited e-books, audiobooks & documents.Find out more
Download
Standard view
Full view
of .
Look up keyword
Like this
0Activity
0 of .
Results for:
No results containing your search query
P. 1
c Formal Verification

c Formal Verification

Ratings: (0)|Views: 2|Likes:
Published by garoji

More info:

Published by: garoji on Dec 04, 2012
Copyright:Attribution Non-commercial

Availability:

Read on Scribd mobile: iPhone, iPad and Android.
download as PDF, TXT or read online from Scribd
See more
See less

12/04/2012

pdf

text

original

 
J Autom Reasoning (2009) 42:125–187DOI 10.1007/s10817-009-9120-2
Formal Verification of C Systems Code
Structured Types, Separation Logic and Theorem Proving
Harvey Tuch
Received: 26 February 2009 / Accepted: 26 February 2009 / Published online: 2 April 2009© Springer Science + Business Media B.V. 2009
Abstract
Systems code is almost universally written in the C programming languageor a variant. C has a very low level of type and memory abstraction and formalreasoning about C systems code requires a memory model that is able to capturethe semantics of C pointers and types. At the same time, proof-based verificationdemands abstraction, in particular from the aliasing and frame problems. In thispaper we present a study in the mechanisation of two proof abstractions for pointerprogram verification in the Isabelle/HOL theorem prover, based on a low-levelmemorymodelforC.Thelanguage’stypesystempresentschallengesforthemultipleindependent typed heaps (Burstall-Bornat) and separation logic proof techniques.In addition to issues arising from explicit value size/alignment, padding, type-unsafecasts and pointer address arithmetic, structured types such as C’s arrays and
struct
sare problematic due to the non-monotonic nature of pointer and lvalue validityin the presence of the unary &-operator. For example, type-safe updates throughpointers to fields of a
struct
break the independence of updates across typed heapsor
-conjuncts. We provide models and rules that are able to cope with theselanguage features and types, eschewing common over-simplifications and utilisingexpressive shallow embeddings in higher-order logic. Two case studies are providedthat demonstrate the applicability of the mechanised models to real-world systemscode; a working of the standard in-place list reversal example and an overview of theverification of the L4 microkernel’s memory allocator.
Keywords
Separation logic
·
C
·
Interactive theorem proving
H. Tuch (
B
)Sydney Research Lab., National ICT Australia, Eveleigh, Australiae-mail: htuch@cse.unsw.edu.au
 
126 H. Tuch
1 Introduction
Systems code constitutes the lowest layer of the software stack, typically an operatingsystem, hypervisor, language run-time, real-time executive or the like. The vast ma- jority of systems code today is implemented in the C[39] programming language or some variant such as C
++
and Objective-C. Higher-level languages, for exampleJava or ML, are incompatible with the goals of system implementation due to themany abstraction breaking requirements, for example zero-copy I/O, address trans-lation manipulation and control over data structure layout[43]. We would like to have high confidence in the correctness of the implementationof this foundational aspect of any system and a rigorous approach requires formalarguments to be made about the behaviour of low-level C source code. This entailsbeing able to precisely model the syntax of the language and specifications, as wellas the language semantics, and possessing sound proof rules to manipulate thesestatements. In addition, as a practical necessity we need to be able to produce a proof in a timely and economical manner.Unfortunately, C was not developed with formalisation in mind and it has a lowlevel of type and memory abstraction—details down to the bit-layout of values,references and data structures are not opaque and it is easy to violate the C typesystem by its cast mechanism and through address arithmetic. C does not have gar-bage collection and the programmer is responsible for allocation and deallocation of memory through library calls.Over the last decade, developments in interactive theorem proving technology[31,35] and studies in real-world language mechanisation[20,21,32,34] have paved the way to verification of actual source code, as the compiler sees it. Previously thishad been only a theoretical possibility and toy problems and simplified languageswerethenorm.InthispaperweaddressthechallengesanddetailsofbothformalisingC’s memory model and types in higher-order logic with the Isabelle theorem provingsystem, and present developments of two commonly used proof techniques forreasoning about pointer programs. We have chosen this emphasis here due to thecritical importance of reasoning about inductively-defined mutable data structures,such as linked lists and trees, in C systems code verification. For example, mostcontemporary operating systems feature thread/process control blocks, schedulerqueues, page tables and more than one kernel memory allocator—non-trivial pointerlinked data structures.Below we provide an introduction to the aliasing and frame problems that causemuch of the difficulty in reasoning about pointer programs, the Burstall-Bornat andseparation logic proof techniques that tame them and the additional complicationsthat the C language burdens the verification effort with. We then conclude the intro-duction with a summary of our contributions and an overview of the rest of the paper.1.1 C Pointer Program Verification
1.1.1 Aliasing and Frame Problems
For an example of the aliasing problem, consider a program with two pointervariables
int *
p
and
int *
q
and the following triple:
{|
True
|}
 p
=
 37 
;
q
=
42
;{|
 p
=
?
|}
 
Formal Verication of C Systems Code 127
We are unable to ascertain the value pointed to by
p
as it may refer to the samelocation as
q
. We need to state that
p
=
q
or
p
=
q
in the pre-condition to be able todetermine the value of 
 p
in the post-state. We refer to aliasing between pointers of the same type in this paper as
intra-type aliasing
.The aliasing problem is much worse for inductively-defined data structures, whereit is possible that structural invariants can be violated, and where we need moresophisticated recursive predicates to stipulate aliasing conditions. These predicatesappear in specifications, invariants and proofs, and their discovery is often a timeconsuming trial-and-error process.The aliasing situation becomes untenable when code is type-unsafe and we areforced to seek improved methods. If instead we had a variable
float *
p
:
{|
True
|}
 p
=
 3
.
14
;
q
=
42
;{|
 p
=
?
|}
then not only do we have to consider aliasing between pointers of different types, butalso the potential for
p
to be pointing inside the encoding of 
q
and vice versa. Wetalk about this phenomenon as
inter-type aliasing
.The frame problem is apparent in Hoare triples. While specifications may mentionsome state that is affected by the intended behaviour of a program, it is hard tocapture the state that is not changed. In the above example, a client verification thatalso dereferences a pointer
, not mentioned in the specification, has no informationon its value after execution of the code fragment. This limits reusability and hencescalability of verifications.
1.1.2 Burstall-Bornat and Separation Logic
We first consider a simplified semantic model for the heap where we want to be ableto describe the effects of memory accesses and updates through pointer expressions.A reasonable approach from a descriptive language semantics perspective is toregard memory simply as a function from some type
addr 
representing addressablelocationstosometype
value
,i.e.
addr 
value
.Thisworksfinefortypelesslanguages,while for type-safe languages we can make
value
a disjoint union of language types,e.g.:
datatype
value
=
Int
int 
|
Float
float 
|
IntPtr
addr 
|
...
The semantics of access and update dereferences are easy to express as they trans-late to function application and update. Address arithmetic can be modelled byhaving
addr 
be an integer type.It is straightforward to adapt the Hoare logic assignment rule:
{|
[
 x
/
v
]|}
 x
=
v
;{|
|}
where
[
 x
/
v
]
indicates that all occurrences of program variable
x
in assertion
arereplaced with
v
. To do so, we treat memory as a variable with a function type. If thisvariable was called
h
then the rule would be:
{|
h p
=
[
h
/
h
(
 p
v
)
]|}
 p
=
v
;{|
|}
Since this is the same view of memory as in the previous section, the aliasing andframe problems described above are present.

You're Reading a Free Preview

Download
scribd
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->