CakeML
A verified implementation of ML
Ramana Kumar, Magnus Myreen, Michael Norrish, Scott Owens
Background
From my PhD (2009):
Verified Lisp interpreter
in ARM, x86 and PowerPC machine code
proof-producing code generation from HOL
Michael Norrish (NICTA, ANU)
Magnus Myreen (Uni. Cambridge)
Overall aim
to make proof assistants into trustworthy and
practical program development platforms
POPL’14
CakeML: A Verified Implementation of ML
Ramana Kumar (1), Magnus O. Myreen (1), Michael Norrish (2), Scott Owens (3)
(1) Computer Laboratory, University of Cambridge, UK
(2) Canberra Research Lab, NICTA, Australia
(3) School of Computing, University of Kent, UK
[Image: first page of the POPL'14 paper, showing the abstract and the start of the introduction]
Dimensions of Compiler Verification
how far compiler goes:
source code
abstract syntax
intermediate language
bytecode
machine code
Our verification covers the full spectrum of both dimensions.
Proof by refinement:
Big-step semantics:
read-eval-print-loop
lexing, parsing
type inference
compilation
bytecode execution
lexing, parsing
Specification:
Context-free grammar (CFG) for significant subset of SML
Executable lexer.
Implementation:
Parsing-Expression-Grammar (PEG) Parser
‣ inductive evaluation relation
‣ executable interpreter for PEGs
Correctness:
Soundness and completeness
‣ induction on length of token list/parse tree and
non-terminal rank
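To make the idea of an executable interpreter for PEGs concrete, here is a minimal sketch in Python. It is not the verified HOL function: the constructors `Tok`, `Seq`, `Choice`, `NT`, the function `peg_exec`, and the toy grammar `E <- "n" "+" E / "n"` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Tok:
    value: str          # match exactly one token


@dataclass(frozen=True)
class Seq:
    first: "Peg"        # match first, then second
    second: "Peg"


@dataclass(frozen=True)
class Choice:
    first: "Peg"        # ordered choice: try first, fall back to second
    second: "Peg"


@dataclass(frozen=True)
class NT:
    name: str           # reference to a non-terminal in the rule table


Peg = Union[Tok, Seq, Choice, NT]


def peg_exec(rules, expr, tokens, pos=0):
    """Return (tree, next_pos) on success, or None on failure."""
    if isinstance(expr, Tok):
        if pos < len(tokens) and tokens[pos] == expr.value:
            return expr.value, pos + 1
        return None
    if isinstance(expr, Seq):
        left = peg_exec(rules, expr.first, tokens, pos)
        if left is None:
            return None
        right = peg_exec(rules, expr.second, tokens, left[1])
        if right is None:
            return None
        return (left[0], right[0]), right[1]
    if isinstance(expr, Choice):
        result = peg_exec(rules, expr.first, tokens, pos)
        if result is not None:
            return result
        return peg_exec(rules, expr.second, tokens, pos)
    if isinstance(expr, NT):
        inner = peg_exec(rules, rules[expr.name], tokens, pos)
        if inner is None:
            return None
        return (expr.name, inner[0]), inner[1]
    raise TypeError(f"unknown PEG expression: {expr!r}")


# Hypothetical toy grammar:  E <- "n" "+" E / "n"
RULES = {"E": Choice(Seq(Tok("n"), Seq(Tok("+"), NT("E"))), Tok("n"))}


def parse(tokens):
    """Parse a whole token list as E; None unless every token is consumed."""
    result = peg_exec(RULES, NT("E"), tokens)
    if result is not None and result[1] == len(tokens):
        return result[0]
    return None
```

Soundness and completeness would then relate the trees returned by `parse` to the derivations allowed by the CFG; this sketch only shows the executable side.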
type inference
Specification:
Declarative type system.
Implementation:
Correctness:
Proved sound w.r.t. declarative type system
Re-use of previous work on verified unification
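The unification step that type inference relies on can be sketched as follows. This is an illustrative Python version, not the verified HOL development: the term encoding (bare strings for type variables, tuples `(name, arg1, ...)` for type constructors) and the names `walk`, `occurs`, `unify`, `resolve` are assumptions.

```python
def walk(term, subst):
    """Follow variable bindings until reaching an unbound variable or a constructor."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occurs check: does var appear inside term under subst?"""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False


def unify(t1, t2, subst):
    """Return a substitution extending subst that unifies t1 and t2, or None."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str):                      # t1 is an unbound variable
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if isinstance(t2, str):                      # symmetric case
        return unify(t2, t1, subst)
    if t1[0] != t2[0] or len(t1) != len(t2):     # constructor clash
        return None
    for a, b in zip(t1[1:], t2[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst


def resolve(term, subst):
    """Apply subst to term all the way down."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return (term[0],) + tuple(resolve(arg, subst) for arg in term[1:])
    return term
```

For example, unifying `("fn", "a", ("int",))` with `("fn", ("bool",), "b")` binds `a` to `bool` and `b` to `int`, while unifying `"a"` with `("list", "a")` fails the occurs check.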
compilation
Purpose:
Translates (typechecked) CakeML into CakeML Bytecode.
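As a rough picture of what compiling to a stack bytecode means, here is a toy Python sketch. It is not CakeML Bytecode: the instruction names `PUSH`, `ADD`, `MUL`, the arithmetic-only source language, and the functions `eval_direct`, `compile_expr`, `run` are assumptions chosen for illustration.

```python
def eval_direct(expr):
    """Reference semantics: evaluate the expression tree directly."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    a, b = eval_direct(left), eval_direct(right)
    return a + b if op == "add" else a * b


def compile_expr(expr):
    """Compile an expression to a list of stack-machine instructions."""
    if isinstance(expr, int):
        return [("PUSH", expr)]
    op, left, right = expr
    last = ("ADD",) if op == "add" else ("MUL",)
    return compile_expr(left) + compile_expr(right) + [last]


def run(code):
    """Execute the instructions on an initially empty stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack[-1]


# Hypothetical example program: 1 + (2 * 3)
EXAMPLE = ("add", 1, ("mul", 2, 3))
```

A correctness statement for such a compiler says that `run(compile_expr(e))` agrees with `eval_direct(e)` for every `e`; here that can only be tested on examples, whereas the CakeML result is proved.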
Implementation:
Instructions:
Sample rules:
Correctness:
Big-step semantics:
• has an optional clock component
• clock ‘ticks’ (decrements) every time a function is applied
• once clock hits zero, execution stops with a TimeOut
Why do this?
• because now big-step semantics describes both
terminating and non-terminating evaluations
Result:
for every exp, env and clock there is either some result or a TimeOut
$\forall\, exp\ env\ clock.\ \exists\, res.\ (exp,\ env,\ \mathsf{Some}\ clock) \Downarrow_{ev} res$
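The clock idea can be tried out on a tiny lambda calculus. The sketch below is an assumption-laden Python illustration, not the CakeML semantics: the expression encoding, the name `eval_clocked`, and the choice to always run with a clock (the slide's clock is optional) are made up for this example.

```python
TIMEOUT = "TimeOut"


def eval_clocked(expr, env, clock):
    """Big-step evaluation with a clock.

    Returns (result, remaining_clock), where result is a value or TIMEOUT.
    The clock decrements once per function application.
    """
    kind = expr[0]
    if kind == "lit":                       # ("lit", n)
        return expr[1], clock
    if kind == "var":                       # ("var", name)
        return env[expr[1]], clock
    if kind == "fn":                        # ("fn", param, body)
        return ("closure", expr[1], expr[2], env), clock
    if kind == "app":                       # ("app", fn_expr, arg_expr)
        fn, clock = eval_clocked(expr[1], env, clock)
        if fn == TIMEOUT:
            return TIMEOUT, clock
        arg, clock = eval_clocked(expr[2], env, clock)
        if arg == TIMEOUT:
            return TIMEOUT, clock
        if clock == 0:                      # out of ticks: stop with TimeOut
            return TIMEOUT, 0
        _, param, body, closure_env = fn
        return eval_clocked(body, {**closure_env, param: arg}, clock - 1)
    raise ValueError(f"unknown expression: {expr!r}")


# (fn x => x) 42 terminates
ID_APP = ("app", ("fn", "x", ("var", "x")), ("lit", 42))

# (fn x => x x) (fn x => x x) diverges without a clock
SELF_APP = ("fn", "x", ("app", ("var", "x"), ("var", "x")))
OMEGA = ("app", SELF_APP, SELF_APP)
```

With any clock value, `eval_clocked` returns: the terminating program yields its value once the clock is large enough, and the diverging program yields `TimeOut` for every clock, which is how one relation covers both terminating and non-terminating evaluations.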
Effort:
~70k lines of proof script in HOL4
< 5 man-years, but builds on a lot of previous work
Size:
875,812 instructions of verified x86-64 machine code
implementation generates
more instructions at runtime
Current compiler:
module compilation
IL-2
closure compilation
pattern-match compilation …
Anthony Fox joins project and helps with final phases
asm.js
Verified examples on CakeML
Verification infrastructure:
• have: synthesis tool that maps HOL into CakeML [ICFP’12]
• future: integration with Arthur Charguéraud’s characteristic
formulae technology [ICFP’10, ICFP’11]
for developing cool verified examples.
Big example: verified HOL light
Contributions so far:
First(?) bootstrapping of a formally verified compiler.
New lightweight method for divergence preservation.
Current work:
Formally verified implementation of HOL light.
Verified I/O (foreign-function interface). seL4.
Compiler improvements (new ILs, opt, targets).
Long-term aim:
An ecosystem of tools and proofs around CakeML lang.
Questions? Suggestions?