P. 1
Type-driven testing in Haskell slides

Type-driven testing in Haskell slides

4.5

|Views: 499|Likes:
Published by ShinNoNoir
Simon Peyton Jones talks about QuickCheck and SmallCheck.
Simon Peyton Jones talks about QuickCheck and SmallCheck.

More info:

Published by: ShinNoNoir on Dec 15, 2008
Copyright:Attribution Non-commercial

Availability:

Read on Scribd mobile: iPhone, iPad and Android.
download as PDF, TXT or read online from Scribd
See more
See less

05/09/2014

pdf

text

original

Purely testing

Simon Peyton Jones Microsoft Research 2008

Summary

c.f. static types 19952005

1. Over the next 10 years, the software battleground will be the control of effects

2. To succeed, we must shift programming perspective from imperative-by-default to functional-by-default
3. A concrete example: testing
o o Functional programs are far easier to test A functional language is a fantastic test generation tool

Any effect

Spectrum

Pure (no effects)

C, C++, Java, C#, VB X := In1 X := X*X X := X + In2*In2

Excel, Haskell

Commands, control flow

Expressions, data flow

 Do this, then do that  “X” is the name of a cell that has different values at different times

 No notion of sequence  “A2” is the name of a (single) value

A bigger example

A

50-shell of 100k-atom model of amorphous silicon, generated using F# Thanks: Jon Harrop

N-shell of atom A Atoms accessible in N hops (but no fewer) from A

A bigger example
1-shell of atom A

A

N-shell of atom A Atoms accessible in N hops (but no fewer) from A

A bigger example
2-shell of atom A

A

N-shell of atom A Atoms accessible in N hops (but no fewer) from A

A bigger example
To find the N-shell of A • Find the (N-1) shell of A • Union the 1-shells of each of those atoms • Delete the (N-2) shell and (N-1) shell of A
Suppose N=4 A‟s 3-shell

A bigger example
To find the N-shell of A • Find the (N-1) shell of A • Union the 1-shells of each of those atoms • Delete the (N-2) shell and (N-1) shell of A
Suppose N=4 A‟s 3-shell 1-shell of 3-shell atoms

A bigger example
To find the N-shell of A • Find the (N-1) shell of A • Union the 1-shells of each of those atoms • Delete the (N-2) shell and (N-1) shell of A
Suppose N=4

A‟s 2-shell and 3-shell A‟s 4-shell

A bigger example

(–) unitSet :: Set a SetSet a -> Set a :: a -> -> a mapUnion :: (a -> a -> b) -> SetSet aSet b (–) :: Set Set Set a -> a -> neighbours :: Graph -> Atom -> Set Atom neighbours :: Graph -> Atom -> Set Atom

To find the N-shell of A • Find the (N-1) shell of A • Find all the neighbours of those atoms • Delete the (N-2) shell and (N-1) shell of A

nShell :: Graph -> Int -> Atom -> Set Atom nShell g 0 a = unitSet a nShell g 1 a = neighbours g a nShell g n a = (mapUnion (neighbours g) s1) – s1 – s2 where s1 = nShell g (n-1) a s2 = nShell g (n-2) a

But...
nShell n needs • nShell (n-1) • nShell (n-2)

nShell :: Graph -> Int -> Atom -> Set Atom nShell g 0 a = unitSet a nShell g 1 a = neighbours g a nShell g n a = (mapUnion (neighbours g) s1) – s1 – s2 where s1 = nShell g (n-1) a s2 = nShell g (n-2) a

But...

nShell n needs • nShell (n-1) which needs • nShell (n-2) • nShell (n-3) • nShell (n-2) which needs • nShell (n-3) • nShell (n-4)

nShell :: Graph -> Int -> Atom -> Set Atom nShell g 0 a = unitSet a nShell g 1 a = neighbours g a nShell g n a = (mapUnion (neighbours g) s1) – s1 – s2 where s1 = nShell g (n-1) a s2 = nShell g (n-2) a

Duplicates!

But...

nShell :: Graph -> Int -> Atom -> Set Atom nShell g 0 a = unitSet a nShell g 1 a = neighbours g a nShell g n a = (mapUnion (neighbours g) s1) – s1 – s2 where s1 = nShell g (n-1) a s2 = nShell g (n-2) a

BUT, the two calls to (nShell g (n-2) a)

must yield the same result
And so we can safely share them • Memo function, or • Return a pair of results
nShell

Same inputs means same outputs

g

n

a

“Purity” “Referential transparency” “No side effects”

Purity pays: understanding
X1.insert( Y ) X2.delete( Y )
What does this program do?

 Would it matter if we swapped the order of these two calls?  What if X1=X2?  I wonder what else X1.insert does? Lots of heroic work on static analysis, but hampered by unnecessary effects

Purity pays: verification
Spec#

Pre-condition

void Insert( int index, object value ) requires (0 <= index && index <= Count) ensures Forall{ int i in 0:index; old(this[i]) == this[i] } { ... }

 The pre and post-conditions are written in... a functional language  Also: object invariants But: invariants temporarily broken Hence: “expose” statements

Post-condition

Purity pays: maintenance
 The type of a function tells you a LOT about it reverse :: [a] -> [a]  Large-scale data representation changes in a multi-100kloc code base can be done reliably:
o change the representation o compile until no type errors o works

Purity pays: performance
 Execution model is not so close to machine
o Hence, bigger job for compiler, execution may be slower

 But: algorithm is often more important than raw efficiency  And: purity supports radical optimisations
o nShell runs 100x faster in F# than C++ Why? More sharing of parts of sets. o SQL, XQuery query optimisers

 Real-life example: Smoke Vector Graphics library: 200kloc C++ became 50kloc OCaml, and ran 5x faster

Purity pays: parallelism
 Pure programs are “naturally parallel”  No mutable state B1 A1 * A3 means no locks, + B1 no race hazards * B2  Results totally unaffected by parallelism (1 processor or zillions)  Examples
o Google‟s map/reduce o SQL on clusters o PLINQ

Purity pays: parallelism
Can I run this LINQ query in parallel?
int index = 0; List<Customer> top10 = (from c in customers where index++ < 10 select c).ToList();

 Race hazard because of the side effect in the „where‟ clause  May be concealed inside calls  Parallel query is correct/reliable only if the expressions in the query are 100% pure

Purity pays: testing
 Testing is tremendously important in practice  Regression tests check for, well, regressions. Only catches 15% of bugs.  Desperately needed: semi-automatic test generators. Challenges:
o How do we say what to test? o How do we generate test data?

Purity pays: testing
In an imperative or OO language, you must
 set up the state of the object, and the external state it reads or writes  make the call(s)  inspect the state of the object, and the external state  perhaps copy part of the object or global state, so that you can use it in the postcondition

Purity pays: testing in Haskell
• How do we say what to test? Answer: write a Haskell function
prop_union :: Set a -> Bool prop_union s = union s s == s

No “old” s

• Ordinary Haskell (no new language) • Type-checked • May involve inter-relationships
prop_revapp xs ys = reverse (xs ++ ys) == (reverse xs) ++ (reverse ys)

Purity pays: testing in Haskell
• How do we generate test data? Answer: use the QuickCheck library
Main> quickCheck prop_union *** OK Passed 100 tests

• QuickCheck is just a Haskell library • No new tools to learn • Lightweight, so more likely to be used

Example
 SMS encoding  Pack 7-bit characters into 8-bit bytes, and unpack pack :: [Word8] -> [Word8]
unpack :: [Word8] -> [Word8]

 Pack and unpack should be inverses
prop_pack :: [Word8] -> Bool prop_pack s = unpack (pack s) == s

Demo

Filters
prop_pack s = length s == 8 ==> unpack (pack s) == s

 If too much data is discarded, QuickCheck warns you (e.g. False ==> condition should not just say “passed”!)
Prelude> quickCheck prop_ins *** Gave up! Passed only 53 tests:

Filters
Danger of skewed data distribution
insert :: Ord a => a -> [a] -> [a] ordered :: Ord a => [a] -> Bool prop_ins x xs = ordered xs ==> ordered (insert x xs)

Chances of a list being ordered decrease with size => test distribution will be skewed towards small lists

Filters
Show data distribution
prop_ins x xs = ordered xs ==> collect (length xs) (ordered (insert x xs))
Prelude> quickCheck prop_ins *** Gave up! Passed only 53 tests: 39% 1 22% 0 20% 2 15% 3 1% 6

Generators
Generator
prop_pack2 = forAll (vectorOf 8 arbitrary) prop_pack prop_pack s = unpack (pack s) == s
arbitrary :: Gen Word8 vectorOf :: Int -> Gen a -> Gen [a] forAll :: Gen a -> (a -> Bool) -> Property

 Generators allow you to control the shape and distribution of your data

Digression: how can this work?
prop_rev :: [Int] -> Bool prop_rev xs = xs == reverse (reverse xs)  What
prop_revapp :: [Int] -> [Int] -> Bool prop_revapp xs ys = xs++ys == reverse xs ++ reverse ys
Prelude> quickCheck prop_rev OK Prelude> quickCheck prop_revapp OK

What type does quickCheck have????
Prelude> :i quickCheck quickCheck :: Testable p => p -> IO ()

“for all types w that support the Eq operations”

Type classes

delete :: w. Eq w => [w] -> w -> [w]
 If a function works for every type that has particular properties, the type of the function says just that
sort :: Ord a => [a] -> [a] serialise :: Show a => a -> String square :: Num n => n -> n

 Otherwise, it must work for any type whatsoever
reverse :: [a] -> [a] filter :: (a -> Bool) -> [a] -> [a]

Works for any type „n‟ that supports the Num operations

Type classes
=> n -> n

FORGET all you know about OO classes!

square :: Num n square x = x*x

class Num a (+) :: (*) :: negate :: ...etc.. instance Num a + b = a * b = negate a = ...etc..

where a -> a -> a a -> a -> a a -> a

The class declaration says what the Num operations are An instance declaration for a type T says how the Num operations are implemented on T‟s
plusInt :: Int -> Int -> Int mulInt :: Int -> Int -> Int etc, defined as primitives

Int where plusInt a b mulInt a b negInt a

How type classes work
When you write this...
square :: Num n => n -> n square x = x*x

...the compiler generates this
square :: Num n -> n -> n square d x = (*) d x x

The “Num n =>” turns into an extra value argument to the function. It is a value of data type Num n

A value of type (Num T) is a vector of the Num operations for type T

How type classes work
When you write this...
square :: Num n => n -> n square x = x*x class Num a (+) :: (*) :: negate :: ...etc.. where a -> a -> a a -> a -> a a -> a

...the compiler generates this
square :: Num n -> n -> n square d x = (*) d x x data Num a = MkNum (a->a->a) (a->a->a) (a->a) ...etc... (*) :: Num a -> a -> a -> a (*) (MkNum _ m _ ...) = m
A value of type (Num T) is a vector of the Num operations for type T

The class decl translates to: • A data type decl for Num • A selector function for each class operation

How type classes work
When you write this...
square :: Num n => n -> n square x = x*x

...the compiler generates this
square :: Num n -> n -> n square d x = (*) d x x

instance Num a + b = a * b = negate a = ...etc..

Int where plusInt a b mulInt a b negInt a

dNumInt :: Num Int dNumInt = MkNum plusInt mulInt negInt ...

An instance decl for type T translates to a value declaration for the Num dictionary for T

A value of type (Num T) is a vector of the Num operations for type T

All this scales up nicely
 You can build big overloaded functions by calling smaller overloaded functions
sumSq :: Num n => n -> n -> n sumSq x y = square x + square y

sumSq :: Num n -> n -> n -> n sumSq d x y = (+) d (square d x) (square d y)

Extract addition operation from d

Pass on d to square

Example: complex numbers
class Num a where (+) :: a -> a -> a (-) :: a -> a -> a fromInteger :: Integer -> a ....
inc :: Num a => a -> a inc x = x + 1
Even literals are overloaded

“1” means “fromInteger 1”

data Cpx a = Cpx a a instance Num a => Num (Cpx a) where (Cpx r1 i1) + (Cpx r2 i2) = Cpx (r1+r2) (i1+i2) fromInteger n = Cpx (fromInteger n) 0

Properties can be overloaded too
prop_assoc :: Num a => a -> a -> a -> Bool prop_assoc x y z = (x+y)+z == x+(y+z)
Prelude> quickCheck (prop_assoc :: Int -> Int -> Int) OK Prelude> quickCheck (prop_assoc :: Flt -> Flt -> Flt) Fails

 The type signature tells quickCheck whether to generate Ints or Floats

Back to QuickCheck
quickCheck :: Testable a => a -> IO ()

class Testable a where test :: a -> RandSupply -> Bool
class Arbitrary a where arby :: RandSupply -> a instance Testable Bool where test b r = b instance (Arbitrary a, Testable b) => Testable (a->b) where test f r = test (f (arby r1)) r2 where (r1,r2) = split r split :: RandSupply -> (RandSupply, RandSupply)

A completely different example: Quickcheck
prop_rev:: [Int] -> Bool Using instance for (->) Using instance for Bool test prop_rev r = test (prop_rev (arby r1)) r2 where (r1,r2) = split r

= prop_rev (arby r1)

Generating arbitrary values
class Arbitrary a where arby :: RandSupply -> a instance Arbitrary Int where arby r = randInt r

instance Arbitrary a Generate Nil value => Arbitrary [a] where arby r | even r1 = [] | otherwise = arby r2 : arby r3 where (r1,r’) = split r Generate cons value (r2,r3) = split r’

split :: RandSupply -> (RandSupply, RandSupply) randInt :: RandSupply -> Int

Three take-away thoughts
1. Testing pure functions is a lot easier than testing stateful ones 2. To generate tests you need a “domain specific language”
o Higher order functional languages (higher order) are ideal for this purpose

3. You can use (2) without (1)

Testing imperative programs
Test generator

Script Imperative program (e.g. Web Service)

Partial model

Results

Pass/fail

John Hughes‟ s company, Quvik, does just this for telecoms software

Other testing tools for Haskell
 Hunit (unit testing)  Lazy Smallcheck (exhaustive testing)  Catch (static analysis for pattern match failures)  Haskell Program Coverage Tool (so you can see where your tests reach)  Time and space profiling
http://haskell.org

Standing back....
 Mainstream languages are hamstrung by gratuitous (ie unnecessary) effects: effects are part of the fabric of computation
T = 0; for (i=0; i<N; i++) { T = T + i }

 Future software will be effect-free by default,
o With controlled effects where necessary o Statically checked by the type system

And the future is here...
 Functional programming has fascinated academics for decades  But professional-developer interest in functional programming has sky-rocketed in the last 5 years.
Suddenly, FP is cool, not geeky.

Most research languages
Practitioners
1,000,000

10,000

100

Geeks

The quick death
1

1yr

5yr

10yr

15yr

Successful research languages
Practitioners
1,000,000

10,000

100

Geeks

The slow death

1

1yr

5yr

10yr

15yr

C++, Java, Perl, Ruby
Practitioners Threshold of immortality
1,000,000

10,000

100

The regrettable absence of death

Geeks

1

1yr

5yr

10yr

15yr

Haskell
Practitioners
1,000,000
“I'm already looking at coding problems and my mental perspective is now shifting back and forth between purely OO and more FP styled solutions” (blog Mar 2007)

“Learning Haskell is a great way of training yourself to think functionally so you are ready to take full advantage of C# 3.0 when it comes out” (blog Apr 2007)

10,000

100

Geeks

The second life?

1

1990

1995

2000

2005

2010

Lots of other great examples
 Erlang: widely respected and admired as a shining example of functional programming applied to an important domain  F#: now being commercialised by Microsoft  OCaml, Scala, Scheme: academic languages being widely used in industry  C#: explicitly adopting functional ideas (e.g. LINQ)

Sharply rising activity
GHC bug tracker 1999-2007

Haskell IRC channel 2001-2007

Jan 20 Feb 9 Feb 11 Feb 12 Feb 13 Feb 16 Feb 19 Feb 20

Austin Functional Programming FringeDC PDXFunc Fun in the afternoon BayFP St-Petersburg Haskell User Group NYFP Network Seattle FP Group

Austin Washington DC Portland London San Francisco Saint-Petersburg New York Seattle

CUFP
Commercial Users of Functional Programming 2004-2007
Speakers describing applications in: banking, smart cards, telecoms, data parallel, terrorism response training, machine learning, network services, hardware design, communications security, cross-domain security

CUFP 2008 is part of the a new

Functional Programming Developer Conference
(tutorials, tools, recruitment, etc) Victoria, British Columbia, Sept 2008 Same meeting: workshops on Erlang, ML, Haskell, Scheme.

Summary
 The languages and tools of functional programming are being used to make money fast  The ideas of functional programming are rapidly becoming mainstream  In particular, the Big Deal for programming in the next decade is the control of effects, and functional programming is the place to look for solutions.

Quotes from the front line

 

“Learning Haskell has completely reversed my feeling that static typing is an old outdated idea.” “Changing the type of a function in Python will lead to strange runtime errors. But when I modify a Haskell program, I already know it will work once it compiles.” “Our chat system was implemented by 3 other groups (two Java, one C++). Haskell implementation is more stable, provides more features, and has about 70% less code.” “I‟m no expert, but I got an order of magnitude improvement in code size and 2 orders of magnitude development improvement in development time” “My Python solution was 50 lines. My Haskell solution was 14 lines, and I was quite pleased. Your Haskell solution was 5.” "C isn't hard; programming in C is hard. On the other hand, Haskell is hard, but programming in Haskell is easy.”

You're Reading a Free Preview

Download
scribd
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->