
Engineering Large Projects in Haskell

A Decade of Functional Programming at Galois


Don Stewart | 2009-04-20 | London HUG

2008 Galois, Inc. All rights reserved.

This talk made possible by...

Aaron Tomb

Joe Hurd

Adam Wick

Joel Stanley

Andy Adams-Moran

John Launchbury

Andy Gill

John Matthews

David Burke

Laura McKinney

Dylan McNamee

Lee Pike

Eric Mertens

Levent Erkok

Iavor Diatchki

Louis Testa

Isaac Potoczny-Jones

Magnus Carlsson

Jef Bell

Paul Heinlein

Peter White

Sally Browning

Trevor Elliott

Thomas Nordin

Phil Weaver

Brett Letner

Jeff Lewis

and many others

What does Galois do?


Information assurance for critical systems
Building systems that are trustworthy and secure
Mixture of government and industry clients
R&D with our favorite tools:
Formal methods
Typed functional languages
Languages, compilers, DSLs
Systems components: kernels, file systems, network
stuff, analysis tools, user land apps, ...
Haskell for pretty much everything


Yes. Haskell can do that.

Many 20-200k LOC Haskell projects


Oldest projects approaching 10 years
Teams of 1-6 developers at a time
Much pair programming, whiteboards, code reviews
20-30 devs over longer project lifetime
Have built many tools and libraries to support
Haskell development on this scale

Haskell essential to keeping clients happy with:


Deadlines, performance(!), maintainability

Themes

Languages matter!
Writing correct software is difficult!
Programming languages vary wildly in how well they
support robust, secure, safe coding practices
Languages and tools can aid or hinder our efforts:
Type systems
Purity
Modularity / compositionality
Abstraction support
Tools: analyses, provers, model checking
Buggy implementations


Detect errors early!


Detecting problems before executing the program is
critical
Debugging is hard
Debugging low level systems is harder
Debugging low level critical systems is ...
Culture of error prevention
How could we rule out this class of errors?
How could we be more precise?


The toolchain matters!


Can't build anything without a good tool chain
Native code compiler
Libraries, libraries, libraries
Debugging, tracing
Profiling, inspection
Testing, analysis
Open, modifiable tools
Particularly when pushing the
boundaries

Community matters!
Soup of ideas in a large, open research community:
Rapid adoption of new ideas
Support, maintenance and help
Can't build everything we need in-house!
Give back via:
Workshops: CUFP, ICFP, Haskell Symposium
Hackathons
Industrial Haskell Group
Open source code and infrastructure
Teaching: papers, blogs, talks

How Galois uses Haskell

1. The Type System

Types make our lives easier


Cheap way to verify properties
Cheaper than theorem proving
More assurance than testing
Saves debugging in hostile environments

Typical conversation:
Engineer A: Spec says this must never
happen
Engineer B: Can we enforce that in the
type system?

Kinds of things types enforce


Simple things:

Correct arguments to a function


Function f does not touch the disk
No null pointers
Mixing up similar concepts:
Virtual / physical addresses

Serious things:
Information flow policies
Correct component wiring and integration
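
A minimal sketch of the "mixing up similar concepts" point, assuming hypothetical VirtAddr and PhysAddr types (not taken from any Galois code base). Both wrap the same machine word, but the compiler refuses to accept one where the other is expected.

```haskell
import Data.Word (Word64)

-- Hypothetical address types: same representation, distinct types.
newtype VirtAddr = VirtAddr Word64 deriving (Eq, Show)
newtype PhysAddr = PhysAddr Word64 deriving (Eq, Show)

-- Hypothetical translation: a fixed offset stands in for a real page table.
translate :: VirtAddr -> PhysAddr
translate (VirtAddr v) = PhysAddr (v + 0x1000)

-- Only physical addresses may be handed to the (pretend) memory bus.
busAddress :: PhysAddr -> Word64
busAddress (PhysAddr p) = p

-- busAddress (VirtAddr 0x10)   -- rejected at compile time: type mismatch
```

A newtype has no runtime cost, so the distinction exists only for the type checker.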

Recent experience
First demo of a big systems project
Six engineers
50k lines of code, in 5 components,
developed over a number of months
Integrated, tested, demo'd in only a week,
two months ahead of schedule, 2 rungs
above performance spec.
1 space leak, spotted and fixed on first
day of testing
2 bugs found (typos from spec)

Purity is fundamental
Difficult to show safety without purity
Code should be pure by default
Makes large systems easier to glue:
Pure code is safe by default to call

Effects are code smells, and have to be treated carefully
The world has too many impure
languages: don't add to that

Types aren't enough though

Still not expressive enough for a lot of the properties we want to enforce
We care a lot about sizes in types
Input must only be 128, 192 or 256 bits
Type T should be represented with 7 bits
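
One common workaround for the "128, 192 or 256 bits" constraint, sketched under the assumption of a hypothetical KeySize type. It does not put the size in the type of the key itself, which is the gap the slide describes. It only confines the check to a single validated constructor function.

```haskell
-- Hypothetical key-size type: only the three permitted sizes are inhabitants.
data KeySize = Bits128 | Bits192 | Bits256
  deriving (Eq, Show)

-- Validate an untrusted bit count once, at the boundary.
mkKeySize :: Int -> Maybe KeySize
mkKeySize 128 = Just Bits128
mkKeySize 192 = Just Bits192
mkKeySize 256 = Just Bits256
mkKeySize _   = Nothing

-- Total function: no case for an invalid size can exist.
keyBytes :: KeySize -> Int
keyBytes Bits128 = 16
keyBytes Bits192 = 24
keyBytes Bits256 = 32
```

Code downstream of mkKeySize never needs to re-check the size, but nothing here stops a buffer of the wrong length being paired with a KeySize.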

Other tools in the bag


Extended static analysis tools
Model checking
SAT, SMT,

Theorem proving
Isabelle, Coq

How much assurance do you need?



2. Abstractions

Monads
Constantly rolling new monads
Captures critical facts about the execution
environment in the type

Directly encodes semantics we care about


Computed keys are not visible outside the
M component
Function f has read-only access to
memory
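
A sketch of "function f has read-only access to memory", assuming a hypothetical ReadOnly monad and a toy Memory model. It is essentially a hand-rolled reader monad. The point is that the type offers no operation that writes.

```haskell
-- Hypothetical memory model: a list of cells indexed from 0.
type Memory = [Int]

-- A computation that can read Memory but has no way to write it.
-- In a real module the constructor would be hidden by the export list.
newtype ReadOnly a = ReadOnly { runReadOnly :: Memory -> a }

instance Functor ReadOnly where
  fmap f (ReadOnly g) = ReadOnly (f . g)

instance Applicative ReadOnly where
  pure x = ReadOnly (const x)
  ReadOnly f <*> ReadOnly g = ReadOnly (\m -> f m (g m))

instance Monad ReadOnly where
  ReadOnly g >>= k = ReadOnly (\m -> runReadOnly (k (g m)) m)

-- The only primitive: read one cell, Nothing if the index is out of range.
readCell :: Int -> ReadOnly (Maybe Int)
readCell i = ReadOnly $ \m ->
  if i >= 0 && i < length m then Just (m !! i) else Nothing

-- The type alone shows that sumFirstTwo cannot modify memory.
sumFirstTwo :: ReadOnly (Maybe Int)
sumFirstTwo = do
  a <- readCell 0
  b <- readCell 1
  pure ((+) <$> a <*> b)
```

In GHCi, `runReadOnly sumFirstTwo [3, 4, 5]` evaluates the computation against a concrete memory.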

Algebraic Data Types


Every system is either an interpreter or a
compiler
Abstract syntax trees are ubiquitous
Represent processes symbolically, via
ADTs, then evaluate them in a safe
(monadic) context
Precise, concise control over possible
values
But need precise representation control
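
A small illustration of "represent symbolically, then evaluate in a safe context", assuming a hypothetical three-constructor expression language. Either String plays the role of the monadic context, so a failure is an ordinary value.

```haskell
-- Hypothetical expression language, represented symbolically.
data Expr
  = Lit Int
  | Add Expr Expr
  | Div Expr Expr
  deriving Show

-- Evaluate inside Either, so failure is a value rather than a crash.
eval :: Expr -> Either String Int
eval (Lit n)   = Right n
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0 then Left "division by zero" else Right (x `div` y)
```

Because Expr is closed, a new constructor without a matching eval case is reported by the compiler when warnings are enabled.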

Laziness
Captures some concepts perfectly
A stream of 4k packets from the wire

Critical for control abstractions in DSLs


Useful for prototyping:
error "M.F.foo: not implemented"
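
Both uses of laziness above can be sketched in a few lines, assuming a hypothetical Packet type. The stream is unbounded, and the stub type checks but fails only if something forces it.

```haskell
-- Hypothetical packet type: a sequence number plus a payload size in bytes.
data Packet = Packet { seqNo :: Int, size :: Int } deriving (Eq, Show)

-- An unbounded stream of 4k packets; only the demanded prefix is ever built.
packets :: [Packet]
packets = [Packet n 4096 | n <- [0 ..]]

-- Prototyping stub: compiles today, raises an error only when evaluated.
checksum :: Packet -> Int
checksum = error "Stream.checksum: not implemented"

-- Consumes three packets and never calls checksum, so the stub stays dormant.
firstBytes :: Int
firstBytes = sum (map size (take 3 packets))
```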


Laziness
Makes time and space reasoning harder!
Mostly harmless in practice
Stress testing tends to reveal retainers
Graphical profiling knocks it dead
Must be able to precisely enable/disable
Be careful with exceptions and mutation
whnf/rnf/! are your friends
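
A sketch of enabling strictness precisely, using only base and a hypothetical Stats accumulator. foldl' forces the accumulator to weak head normal form at each step, and the `!` field annotations make that enough to force both counters. rnf, which forces a whole structure, lives in the deepseq package in current GHC distributions and is not used here.

```haskell
import Data.List (foldl')

-- Strict fields (the `!`): both components are forced when a Stats is built.
data Stats = Stats !Double !Int

-- One step of the running sum and count.
step :: Stats -> Double -> Stats
step (Stats s n) x = Stats (s + x) (n + 1)

-- foldl' evaluates each intermediate Stats to WHNF, so no thunk chain builds up.
mean :: [Double] -> Double
mean xs = case foldl' step (Stats 0 0) xs of
  Stats _ 0 -> 0
  Stats s n -> s / fromIntegral n
```

With lazy fields and a plain foldl, the same code would be a typical source of the retainers that stress testing reveals.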

Type classes
We use type classes
Well defined interfaces between large
components (sets of modules)
Natural code reuse
Capture general concepts in a natural way
Capture interface in a clear way
Kick butt EDSLs (see Lennart's blog)
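
A sketch of a type class used as the interface between components, assuming a hypothetical Store class and a toy list-backed instance. Client code depends only on the class.

```haskell
-- Hypothetical interface between a storage component and its clients.
class Store s where
  emptyStore :: s
  insert     :: String -> Int -> s -> s
  lookupKey  :: String -> s -> Maybe Int

-- One possible implementation: an association list.
newtype ListStore = ListStore [(String, Int)]

instance Store ListStore where
  emptyStore                  = ListStore []
  insert k v (ListStore kvs)  = ListStore ((k, v) : kvs)
  lookupKey k (ListStore kvs) = lookup k kvs

-- Client code is written against the class, not the implementation.
roundTrip :: Store s => s -> Maybe Int
roundTrip s = lookupKey "answer" (insert "answer" 42 s)
```

Swapping in a different backing structure means writing another instance; roundTrip does not change.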


Concurrency
forkIO rocks
Cheap, very fast, precise threads

MVars rock
STM rocks (safely composable locks!)
Result: not shy introducing concurrency
when appropriate
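
A minimal forkIO and MVar sketch, using only base; parallelSquares is a hypothetical name. STM is left out because it lives in the separate stm package.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork one lightweight thread per input; each reports through its own MVar.
parallelSquares :: [Int] -> IO [Int]
parallelSquares xs = do
  boxes <- forM xs $ \x -> do
    box <- newEmptyMVar
    -- ($!) makes the worker compute the square before handing it over.
    _ <- forkIO (putMVar box $! x * x)
    pure box
  -- takeMVar blocks until the matching worker has written its result.
  mapM takeMVar boxes
```

Results come back in input order because each MVar is read in the order it was created, whatever order the threads finish in.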

3. Foreign Function Interface

Foreign Function Interface


The world is a messy place
A good FFI means we can always call
someone else's code if necessary
Have to talk to weird bits of hardware and
weird proof systems
ForeignPtr is great abstraction tool
Must have clear API into the runtime
system (hot topic at the moment)
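
A sketch of calling someone else's code, here the C math library's cos, assuming the platform links libm as GHC normally does. The names c_cos and cosine are illustrative.

```haskell
import Foreign.C.Types (CDouble (..))

-- Bind C's cos. It has no side effects, so the import is given a pure type.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

-- Wrap the binding so callers see ordinary Haskell Doubles.
cosine :: Double -> Double
cosine = realToFrac . c_cos . realToFrac
```

Giving a foreign function a pure type is a promise the compiler cannot check, so it is only appropriate when the C code really is free of side effects.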

4. Meta programming

There's always boilerplate


Abstractions get rid of a lot of repetitive
code, but there's always something
that's not automated
We use a little Template Haskell
Other generics:
Hinze-style generics
SYB generics

Particularly useful for generating instance code for marshalling

5. Performance

Fast enough for majority of things


Vast majority of code is fast enough
GHC -O2 -funbox-strict-fields
Happy with 1-2x C for low level code

Last few drops get squeezed out:

Profiling
Low level Haskell
Cycle-level measurement
EDSLs to generate better code
Calling into C
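
What -funbox-strict-fields acts on can be shown with a hypothetical Vec2 type. The flag unpacks strict constructor fields; the UNPACK pragma asks for the same thing field by field when optimisation is on.

```haskell
import Data.List (foldl')

-- Strict Double fields, each marked for unpacking into the constructor.
data Vec2 = Vec2 {-# UNPACK #-} !Double {-# UNPACK #-} !Double
  deriving (Eq, Show)

-- Component-wise addition.
addV :: Vec2 -> Vec2 -> Vec2
addV (Vec2 a b) (Vec2 c d) = Vec2 (a + c) (b + d)

-- Strict left fold, so the running total never accumulates thunks.
sumV :: [Vec2] -> Vec2
sumV = foldl' addV (Vec2 0 0)
```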


Performance
Really precise performance requires
expertise
Libraries are helping reify oral traditions
about optimization
Still a lack of clarity about performance
techniques in the broader Haskell
community though

6. Debugging

There are still bugs!


Testing
QuickCheck!!!

Heap profiling
By type profiling of the heap

GHC -fhpc
Great for finding exceptions
Understanding what is executing

+RTS -sstderr
Explains what the GC, threads and memory are up to
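
The style of test that QuickCheck automates, sketched with base only so it runs without extra packages. The run-length functions and the fixed sample list are hypothetical stand-ins. With QuickCheck installed, `quickCheck prop_roundTrip` would generate random inputs instead.

```haskell
-- Hypothetical function under test: run-length encoding and its inverse.
encode :: String -> [(Char, Int)]
encode []       = []
encode (c : cs) = (c, 1 + length same) : encode rest
  where
    (same, rest) = span (== c) cs

decode :: [(Char, Int)] -> String
decode = concatMap (\(c, n) -> replicate n c)

-- The property: decoding an encoding gives back the input.
prop_roundTrip :: String -> Bool
prop_roundTrip s = decode (encode s) == s

-- Stand-in for a random generator: a few fixed inputs.
checkSamples :: Bool
checkSamples = all prop_roundTrip ["", "a", "aaabbc", "abab", "zzzz"]
```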


7. Documentation

Generating supporting artifacts


Haddock is great for reference material
Helps capture design in the source
Code + types becomes self documenting

Design documents can be partially extracted via:

The major data and type signatures


graphmod
cabalgraph
HPC analysis


8. Libraries

Hackage Changes Everything


There's a library for everything, and often
more than one...
Can sit back and let mtl / monadlib / haxml
/ hxt fight it out :)
Static linking: need BSD licensed code
if we want to ship
Haskell Platform to answer QA questions

9. Shipping code

Cabal
I don't know how Haskell was possible
before Cabal :)
Quickly adopted Cabal/cabal-install across
projects
cabal-install:
Simple, clean integration of internal and
external components into packageable
objects

10. Conventions

We try to ...

-Wall police
Consistent layout
No tabs
import qualified Control.Exception
{-# LANGUAGE #-}
Map exceptions into Either / Maybe
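
Two of these conventions together, as a sketch: a qualified import of Control.Exception, and an exception mapped into Either at the boundary. safeDiv is a hypothetical name.

```haskell
import qualified Control.Exception as E

-- Catch the arithmetic exception where it arises and return it as a value.
-- evaluate forces the division inside IO so that try can observe the failure.
safeDiv :: Int -> Int -> IO (Either E.ArithException Int)
safeDiv x y = E.try (E.evaluate (x `div` y))
```

Callers pattern match on Left and Right, so the failure case is visible in the type rather than in documentation.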


We try to ...
deriving Show
Line/column for errors if you must throw
No global mutable state
Put type sigs in when you're done with
the design
Use GHCi for rapid experimentation
Cabal by default.
Libraries by default


11. Things that we still need

More support for large scale programming
Enforcing conventions across the code
Data representation precision (emerging)
A serious refactoring tool
Vetted and audited libraries by experts
(Haskell Platform)
Idioms for mapping design onto
types/functions/classes/monads
Better capture your 100 module design!

