Published by: Lucius Gregory Meredith on Nov 29, 2010
Copyright:Attribution Non-commercial

A POPL Pearl Submission
Clowns to the Left of me, Jokers to the Right
Dissecting Data Structures
Conor McBride
University of Nottingham
ctm@cs.nott.ac.uk
Abstract
This paper, submitted as a ‘pearl’, introduces a small but useful generalisation to the ‘derivative’ operation on datatypes underlying Huet’s notion of ‘zipper’ (Huet 1997; McBride 2001; Abbott et al. 2005b), giving a concrete representation to one-hole contexts in data which is in mid-transformation. This operator, ‘dissection’, turns a container-like functor into a bifunctor representing a one-hole context in which elements to the left of the hole are distinguished in type from elements to its right.

I present dissection for polynomial functors, although it is certainly more general, preferring to concentrate here on its diverse applications. For a start, map-like operations over the functor and fold-like operations over the recursive data structure it induces can be expressed by tail recursion alone. Moreover, the derivative is readily recovered from the dissection, along with Huet’s navigation operations. A further special case of dissection, ‘division’, captures the notion of leftmost hole, canonically distinguishing values with no elements from those with at least one. By way of a more practical example, division and dissection are exploited to give a relatively efficient generic algorithm for abstracting all occurrences of one term from another in a first-order syntax.

The source code for the paper is available online¹ and compiles with recent extensions to the Glasgow Haskell Compiler.
¹ http://www.cs.nott.ac.uk/~ctm/CloJo/CJ.lhs

1. Introduction
There’s an old Stealer’s Wheel song with the memorable chorus:

‘Clowns to the left of me, jokers to the right, Here I am, stuck in the middle with you.’
Joe Egan, Gerry Rafferty

In this paper, I examine what it’s like to be stuck in the middle of traversing and transforming a data structure. I’ll show both you and the Glasgow Haskell Compiler how to calculate the datatype of a ‘freezeframe’ in a map- or fold-like operation from the datatype being operated on. That is, I’ll explain how to compute a first-class data representation of the control structure underlying map and fold traversals, via an operator which I call dissection. Dissection turns out to generalise both the derivative operator underlying Huet’s ‘zippers’ (Huet 1997; McBride 2001) and the notion of division used to calculate the non-constant part of a polynomial. Let me take you on a journey into the algebra and differential calculus of data, in search of functionality from structure.

Here’s an example traversal, evaluating a very simple language of expressions:
    data Expr = Val Int | Add Expr Expr

    eval :: Expr -> Int
    eval (Val i)     = i
    eval (Add e1 e2) = eval e1 + eval e2
What happens if we freeze a traversal? Typically, we shall have one piece of data ‘in focus’, with unprocessed data ahead of us and processed data behind. We should expect something a bit like Huet’s ‘zipper’ representation of one-hole contexts (Huet 1997), but with different sorts of stuff either side of the hole.

In the case of our evaluator, suppose we proceed left-to-right. Whenever we face an Add, we must first go left into the first operand, recording the second Expr to process later; once we have finished with the former, we must go right into the second operand, recording the Int returned from the first; as soon as we have both values, we can add. Correspondingly, a Stack of these direction-with-cache choices completely determines where we are in the evaluation process. Let’s make this structure explicit:²

    type Stack = [Expr + Int]

Now we can implement an ‘eval machine’: a tail recursion, at each stage stuck in the middle with an expression to decompose, loading the stack by going left, or a value to use, unloading the stack and moving right.
    eval :: Expr -> Int
    eval e = load e []

    load :: Expr -> Stack -> Int
    load (Val i)     stk = unload i stk
    load (Add e1 e2) stk = load e1 (L e2 : stk)

    unload :: Int -> Stack -> Int
    unload v  []           = v
    unload v1 (L e2 : stk) = load e2 (R v1 : stk)
    unload v2 (R v1 : stk) = unload (v1 + v2) stk
Each layer of this Stack structure is a dissection of Expr’s recursion pattern. We have two ways to be stuck in the middle: we’re either L e2, on the left with an Expr waiting to the right of us, or R v1, on the right with an Int cached to the left of us. Let’s find out how to do this in general, calculating the ‘machine’ corresponding to any old fold over finite first-order data.
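As a concrete check on the machine just described, here is a minimal sketch in standard Haskell. It assumes plain Either, Left and Right in place of the abbreviations · + ·, L and R, and the name evalDirect is introduced here only so that the direct recursion and the machine can sit side by side.

```haskell
data Expr = Val Int | Add Expr Expr

-- Each stack layer is either an unprocessed right operand (Left)
-- or the cached value of a finished left operand (Right).
type Stack = [Either Expr Int]

-- Direct structural recursion, kept for comparison (hypothetical name).
evalDirect :: Expr -> Int
evalDirect (Val i)     = i
evalDirect (Add e1 e2) = evalDirect e1 + evalDirect e2

-- The tail-recursive machine: start with an empty stack.
eval :: Expr -> Int
eval e = load e []

-- Decompose an expression, pushing the pending right operand.
load :: Expr -> Stack -> Int
load (Val i)     stk = unload i stk
load (Add e1 e2) stk = load e1 (Left e2 : stk)

-- Use a value: finish, move right, or combine with the cached value.
unload :: Int -> Stack -> Int
unload v  []               = v
unload v1 (Left e2  : stk) = load e2 (Right v1 : stk)
unload v2 (Right v1 : stk) = unload (v1 + v2) stk
```

Every call in load and unload is a tail call, so the only growing structure is the explicit Stack.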
² For brevity, I write · + · for Either, L for Left and R for Right.
Clowns and Jokers
1
2007/7/17 
 
2. Polynomial Functors and Bifunctors
This section briefly recapitulates material which is quite standard. I hope to gain some generic leverage by exploiting the characterisation of recursive datatypes as fixpoints of polynomial functors. For more depth and detail, I refer the reader to the excellent ‘Algebra of Programming’ (Bird and de Moor 1997).

If we are to work in a generic way with data structures, we need to present them in a generic way. Rather than giving an individual data declaration for each type we want, let us see how to build them from a fixed repertoire of components. I’ll begin with the polynomial type constructors in one parameter. These are generated by constants, the identity, sum and product. I label them with a 1 subscript to distinguish them from their bifunctorial cousins.
    data K₁ a x     = K₁ a                  -- constant
    data Id x       = Id x                  -- element
    data (p +₁ q) x = L₁ (p x) | R₁ (q x)   -- choice
    data (p ×₁ q) x = (p x ,₁ q x)          -- pairing

Allow me to abbreviate one of my favourite constant functors, at the same time bringing it into line with our algebraic style.

    type 1₁ = K₁ ()
Some very basic ‘container’ type constructors can be expressed as polynomials, with the parameter giving the type of ‘elements’. For example, the Maybe type constructor gives a choice between ‘Nothing’, a constant, and ‘Just’, embedding an element.

    type Maybe = 1₁ +₁ Id

    Nothing = L₁ (K₁ ())
    Just x  = R₁ (Id x)
Whenever I reconstruct a datatype from this kit, I shall make a habit of ‘defining’ its constructors linearly in terms of the kit constructors. To aid clarity, I use these pattern synonyms on either side of a functional equation, so that the coded type acquires the same programming interface as the original. This is not standard Haskell, but these definitions may readily be expanded to code which is fully compliant, if less readable.

The ‘kit’ approach allows us to establish properties of whole classes of datatype at once. For example, the polynomials are all functorial: we can make the standard Functor class

    class Functor p where
      fmap :: (s -> t) -> p s -> p t

respect the polynomial constructs.
    instance Functor (K₁ a) where
      fmap f (K₁ a) = K₁ a

    instance Functor Id where
      fmap f (Id s) = Id (f s)

    instance (Functor p, Functor q) => Functor (p +₁ q) where
      fmap f (L₁ p) = L₁ (fmap f p)
      fmap f (R₁ q) = R₁ (fmap f q)

    instance (Functor p, Functor q) => Functor (p ×₁ q) where
      fmap f (p ,₁ q) = (fmap f p ,₁ fmap f q)

Our reconstructed Maybe is functorial without further ado.
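The kit can be tried out directly in GHC. The following is a sketch under some assumptions: the ASCII names Sum1, Prod1 and One1 stand in for the subscripted operators, the primed names Maybe', Nothing' and Just' avoid clashing with the Prelude, and GHC's PatternSynonyms extension plays the role of the informal pattern synonyms used in the text. The helper toMaybe is hypothetical and exists only to make results easy to inspect.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- ASCII stand-ins for the polynomial functor kit.
newtype K1 a x = K1 a                    -- constant
newtype Id x = Id x                      -- element
data Sum1 p q x = L1 (p x) | R1 (q x)    -- choice
data Prod1 p q x = Prod1 (p x) (q x)     -- pairing

instance Functor (K1 a) where
  fmap _ (K1 a) = K1 a

instance Functor Id where
  fmap f (Id s) = Id (f s)

instance (Functor p, Functor q) => Functor (Sum1 p q) where
  fmap f (L1 p) = L1 (fmap f p)
  fmap f (R1 q) = R1 (fmap f q)

instance (Functor p, Functor q) => Functor (Prod1 p q) where
  fmap f (Prod1 p q) = Prod1 (fmap f p) (fmap f q)

-- The unit constant functor and the reconstructed Maybe.
type One1 = K1 ()
type Maybe' = Sum1 One1 Id

pattern Nothing' :: Maybe' x
pattern Nothing' = L1 (K1 ())

pattern Just' :: x -> Maybe' x
pattern Just' x = R1 (Id x)

-- Convert to the Prelude's Maybe so results can be compared and shown.
toMaybe :: Maybe' x -> Maybe x
toMaybe Nothing'  = Nothing
toMaybe (Just' x) = Just x
```

No Functor instance is written for Maybe' itself: it is inherited from the instances for the kit components, which is the point of the reconstruction.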
2.1 Datatypes as Fixpoints of Polynomial Functors
The Expr type is not itself a polynomial, but its branching structure is readily described by a polynomial. Think of each node of an Expr as a container whose elements are the immediate sub-Exprs:

    type ExprP = K₁ Int +₁ Id ×₁ Id

    ValP i     = L₁ (K₁ i)
    AddP e1 e2 = R₁ (Id e1 ,₁ Id e2)

Correspondingly, we should hope to establish the isomorphism

    Expr ≅ ExprP Expr

but we cannot achieve this just by writing

    type Expr = ExprP Expr

for this creates an infinite type expression, rather than an infinite type. Rather, we must define a recursive datatype which ‘ties the knot’: µ p instantiates p’s element type with µ p itself.

    data µ p = In (p (µ p))

Now we may complete our reconstruction of Expr

    type Expr = µ ExprP

    Val i     = In (ValP i)
    Add e1 e2 = In (AddP e1 e2)
Now, the container-like quality of polynomials allows us to define a fold-like recursion operator for them, sometimes called the iterator or the catamorphism.³ How can we compute a v from a µ p? Well, we can expand a µ p tree as a p (µ p) container of subtrees, use p’s fmap to compute vs recursively for each subtree, then post-process the p v result container to produce a final result in v. The behaviour of the recursion is thus uniquely determined by the p-algebra φ :: p v -> v which does the post-processing.

    (|·|) :: Functor p => (p v -> v) -> µ p -> v
    (|φ|) (In p) = φ (fmap (|φ|) p)
For example, we can write our evaluator as a catamorphism, with an algebra which implements each construct of our language for values rather than expressions. The pattern synonyms for ExprP help us to see what is going on:

    eval :: µ ExprP -> Int
    eval = (|φ|) where
      φ (ValP i)     = i
      φ (AddP v1 v2) = v1 + v2

Catamorphism may appear to have a complex higher-order recursive structure, but we shall soon see how to turn it into a first-order tail-recursion whenever p is polynomial. We shall do this by dissecting p, distinguishing the ‘clown’ elements left of a chosen position from the ‘joker’ elements to the right.
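The fixpoint construction and the catamorphism can be written out as compilable Haskell. This is a sketch with assumed names: Mu stands for µ, cata for the banana brackets, phi for φ, and Sum1 and Prod1 for the subscripted sum and product. GHC's PatternSynonyms extension is used for ValP, AddP, Val and Add.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- The functor kit again, so that this block stands alone.
newtype K1 a x = K1 a
newtype Id x = Id x
data Sum1 p q x = L1 (p x) | R1 (q x)
data Prod1 p q x = Prod1 (p x) (q x)

instance Functor (K1 a) where
  fmap _ (K1 a) = K1 a
instance Functor Id where
  fmap f (Id s) = Id (f s)
instance (Functor p, Functor q) => Functor (Sum1 p q) where
  fmap f (L1 p) = L1 (fmap f p)
  fmap f (R1 q) = R1 (fmap f q)
instance (Functor p, Functor q) => Functor (Prod1 p q) where
  fmap f (Prod1 p q) = Prod1 (fmap f p) (fmap f q)

-- Tie the knot: the element type of p is Mu p itself.
newtype Mu p = In (p (Mu p))

-- Catamorphism: fold the subtrees, then apply the algebra.
cata :: Functor p => (p v -> v) -> Mu p -> v
cata phi (In p) = phi (fmap (cata phi) p)

-- One node of an expression, with x for the immediate subterms.
type ExprP = Sum1 (K1 Int) (Prod1 Id Id)

pattern ValP :: Int -> ExprP x
pattern ValP i = L1 (K1 i)

pattern AddP :: x -> x -> ExprP x
pattern AddP e1 e2 = R1 (Prod1 (Id e1) (Id e2))

type Expr = Mu ExprP

pattern Val :: Int -> Expr
pattern Val i = In (ValP i)

pattern Add :: Expr -> Expr -> Expr
pattern Add e1 e2 = In (AddP e1 e2)

-- The evaluator as a catamorphism over an ExprP-algebra on Int.
eval :: Expr -> Int
eval = cata phi
  where
    phi :: ExprP Int -> Int
    phi (ValP i)     = i
    phi (AddP v1 v2) = v1 + v2
```

The algebra phi never mentions recursion: cata supplies it, via the Functor instance derived from the kit.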
2.2 Polynomial Bifunctors
Before we can start dissecting, however, we shall need to be able to manage two sorts of element. To this end, we shall need to introduce the polynomial bifunctors, which are just like the functors, but with two parameters.

    data K₂ a x y     = K₂ a
    data Fst x y      = Fst x
    data Snd x y      = Snd y
    data (p +₂ q) x y = L₂ (p x y) | R₂ (q x y)
    data (p ×₂ q) x y = (p x y ,₂ q x y)

    type 1₂ = K₂ ()

We have the analogous notion of ‘mapping’, except that we must supply one function for each parameter.
³ Terminology is a minefield here: some people think of ‘fold’ as threading a binary operator through the elements of a container, others as replacing the constructors with an alternative algebra. The confusion arises because the two coincide for lists. There is no resolution in sight.
    class Bifunctor p where
      bimap :: (s1 -> t1) -> (s2 -> t2) -> p s1 s2 -> p t1 t2

    instance Bifunctor (K₂ a) where
      bimap f g (K₂ a) = K₂ a

    instance Bifunctor Fst where
      bimap f g (Fst x) = Fst (f x)

    instance Bifunctor Snd where
      bimap f g (Snd y) = Snd (g y)

    instance (Bifunctor p, Bifunctor q) => Bifunctor (p +₂ q) where
      bimap f g (L₂ p) = L₂ (bimap f g p)
      bimap f g (R₂ q) = R₂ (bimap f g q)

    instance (Bifunctor p, Bifunctor q) => Bifunctor (p ×₂ q) where
      bimap f g (p ,₂ q) = (bimap f g p ,₂ bimap f g q)

It’s certainly possible to take fixpoints of bifunctors to obtain recursively constructed container-like data: one parameter stands for elements, the other for recursive sub-containers. These structures support both fmap and a suitable notion of catamorphism. I can recommend (Gibbons 2007) as a useful tutorial for this ‘origami’ style of programming.
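The bifunctor kit admits the same treatment. The sketch below declares its own Bifunctor class, as the text does, instead of importing the one in base, and uses the assumed ASCII names Sum2 and Prod2. The value example and the function describe are hypothetical, included only to show bimap acting on both parameters at once.

```haskell
-- A local Bifunctor class, mirroring the one in the text.
class Bifunctor p where
  bimap :: (s1 -> t1) -> (s2 -> t2) -> p s1 s2 -> p t1 t2

-- ASCII stand-ins for the polynomial bifunctor kit.
newtype K2 a x y = K2 a
newtype Fst x y = Fst x
newtype Snd x y = Snd y
data Sum2 p q x y = L2 (p x y) | R2 (q x y)
data Prod2 p q x y = Prod2 (p x y) (q x y)

instance Bifunctor (K2 a) where
  bimap _ _ (K2 a) = K2 a

instance Bifunctor Fst where
  bimap f _ (Fst x) = Fst (f x)

instance Bifunctor Snd where
  bimap _ g (Snd y) = Snd (g y)

instance (Bifunctor p, Bifunctor q) => Bifunctor (Sum2 p q) where
  bimap f g (L2 p) = L2 (bimap f g p)
  bimap f g (R2 q) = R2 (bimap f g q)

instance (Bifunctor p, Bifunctor q) => Bifunctor (Prod2 p q) where
  bimap f g (Prod2 p q) = Prod2 (bimap f g p) (bimap f g q)

-- A small value: one left element paired with an optional right element.
example :: Prod2 Fst (Sum2 Snd (K2 ())) Int String
example = Prod2 (Fst 1) (L2 (Snd "joker"))

-- Read such a value back as an ordinary pair.
describe :: Prod2 Fst (Sum2 Snd (K2 ())) Int Int -> (Int, Maybe Int)
describe (Prod2 (Fst x) (L2 (Snd y)))  = (x, Just y)
describe (Prod2 (Fst x) (R2 (K2 ()))) = (x, Nothing)
```

For instance, describe (bimap (+ 1) length example) increments the left element and measures the right one in a single pass.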
2.3 Nothing is Missing
We are still short of one basic component: Nothing. We shall be constructing types which organise ‘the ways to split at a position’, but what if there are no ways to split at a position (because there are no positions)? We need a datatype to represent impossibility and here it is:

    data Zero

Elements of Zero are hard to come by: elements worth speaking of, that is. Correspondingly, if you have one, you can exchange it for anything you want.

    magic :: Zero -> a
    magic x = x `seq` error "we never get this far"
I have used Haskell’s seq operator to insist that magic evaluate its argument. This is necessarily ⊥, hence the error clause can never be executed. In effect magic refutes its input.

We can use p Zero to represent ‘ps with no elements’. For example, the only inhabitant of [Zero] mentionable in polite society is []. Zero gives us a convenient way to get our hands on exactly the constants, common to every instance of p. Accordingly, we should be able to embed these constants into any other instance:

    inflate :: Functor p => p Zero -> p x
    inflate = fmap magic
However, it’s rather a lot of work traversing a container just to transform all of its nonexistent elements. If we cheat a little, we can do nothing much more quickly, and just as safely!

    inflate :: Functor p => p Zero -> p x
    inflate = unsafeCoerce

This unsafeCoerce function behaves operationally like λx -> x, but its type, a -> b, allows the programmer to intimidate the typechecker into submission. It is usually present but well hidden in the libraries distributed with Haskell compilers, and its use requires extreme caution. Here we are sure that the only Zero computations mistaken for xs will fail to evaluate, so our optimisation is safe.

Now that we have Zero, allow me to abbreviate

    type 0₁ = K₁ Zero
    type 0₂ = K₂ Zero
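Both versions of inflate can be exercised on ordinary containers. In this sketch the name inflateFast is an assumption, introduced so that the two definitions can coexist in one module; unsafeCoerce is taken from Unsafe.Coerce in base.

```haskell
import Unsafe.Coerce (unsafeCoerce)

-- The empty type: no constructors, so no defined values.
data Zero

-- Forcing a Zero can only diverge, so the error call is unreachable.
magic :: Zero -> a
magic x = x `seq` error "we never get this far"

-- Embed an element-free container into any instance by traversal.
inflate :: Functor p => p Zero -> p x
inflate = fmap magic

-- The same embedding without the traversal (hypothetical name).
-- This relies on the container holding no defined Zero elements.
inflateFast :: Functor p => p Zero -> p x
inflateFast = unsafeCoerce

-- Two element-free containers to try the embeddings on.
noElements :: [Zero]
noElements = []

nothingHere :: Maybe Zero
nothingHere = Nothing
```

Here inflate noElements can be used at type [Int], and inflateFast nothingHere at type Maybe Bool. The base library offers the same ingredients as Void and absurd in Data.Void.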
3. Clowns, Jokers and Dissection
We shall need three operators which take polynomial functors to bifunctors. Let me illustrate them: consider functors parametrised by elements (depicted •) and bifunctors parametrised by clowns (◭) to the left and jokers (◮) to the right. I show a typical p x as a container of •s

    {−•−•−•−•−•−}

Firstly, ‘all clowns’ ◭p lifts p uniformly to the bifunctor which uses its left parameter for the elements of p.

    {−◭−◭−◭−◭−◭−}

We can define this uniformly:

    data ◭p c j = ◭(p c)

    instance Functor f => Bifunctor (◭f) where
      bimap f g (◭pc) = ◭(fmap f pc)
Note that ◭Id ≅ Fst.

Secondly, ‘all jokers’ ◮p is the analogue for the right parameter.

    {−◮−◮−◮−◮−◮−}

    data ◮p c j = ◮(p j)

    instance Functor f => Bifunctor (◮f) where
      bimap f g (◮pj) = ◮(fmap g pj)
Note that ◮Id ≅ Snd.

Thirdly, ‘dissected’ ∆p takes p to the bifunctor which chooses a position in a p and stores clowns to the left of it and jokers to the right.

    {−◭−◭−◦−◮−◮−}

We must clearly define this case by case. Let us work informally and think through what to do for each polynomial type constructor. Constants have no positions for elements,

    {}

so there is no way to dissect them:

    ∆(K₁ a) = 0₂
The Id functor has just one position, so there is just one way to dissect it, and no room for clowns or jokers, left or right.

    {−•−} −→ {−◦−}

    ∆Id = 1₂

Dissecting a p +₁ q, we get either a dissected p or a dissected q.

    L₁ {−•−•−•−} −→ L₂ {−◭−◦−◮−}
    R₁ {−•−•−•−} −→ R₂ {−◭−◦−◮−}

    ∆(p +₁ q) = ∆p +₂ ∆q
So far, these have just followed Leibniz’s rules for the derivative, but for pairs p ×₁ q we see the new twist. When dissecting a pair, we choose to dissect either the left component (in which case the right component is all jokers) or the right component (in which case the left component is all clowns).

    ({−•−•−•−} ,₁ {−•−•−•−}) −→ L₂ ({−◭−◦−◮−} ,₂ {−◮−◮−◮−})
    ({−•−•−•−} ,₁ {−•−•−•−}) −→ R₂ ({−◭−◭−◭−} ,₂ {−◭−◦−◮−})

    ∆(p ×₁ q) = ∆p ×₂ ◮q +₂ ◭p ×₂ ∆q
Now, in Haskell, this kind of type-directed definition can be done with type-class programming (Hallgren 2001; McBride 2002). Allow me to abuse notation very slightly, giving dissection constraints a slightly more functional notation, after the manner of (Neubauer et al. 2001):
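Independently of the class-based encoding, the four dissection equations above can be sketched with a closed type family. This swaps in a different GHC feature and is not the paper's formulation. The names Diss, Clowns, Jokers, Sum1, Prod1, Sum2, Prod2, Mu and toLayer are all assumptions made for this illustration, and Void from Data.Void plays the role of Zero.

```haskell
{-# LANGUAGE TypeFamilies, KindSignatures #-}

import Data.Kind (Type)
import Data.Void (Void, absurd)

-- Functor kit (ASCII stand-ins).
newtype K1 a x = K1 a
newtype Id x = Id x
data Sum1 p q x = L1 (p x) | R1 (q x)
data Prod1 p q x = Prod1 (p x) (q x)

-- Bifunctor kit, with c for clowns and j for jokers.
newtype K2 a c j = K2 a
data Sum2 p q c j = L2 (p c j) | R2 (q c j)
data Prod2 p q c j = Prod2 (p c j) (q c j)

-- 'All clowns' and 'all jokers'.
newtype Clowns p c j = Clowns (p c)
newtype Jokers p c j = Jokers (p j)

-- The four dissection equations, read as type-level computation.
type family Diss (p :: Type -> Type) :: Type -> Type -> Type where
  Diss (K1 a)      = K2 Void
  Diss Id          = K2 ()
  Diss (Sum1 p q)  = Sum2 (Diss p) (Diss q)
  Diss (Prod1 p q) = Sum2 (Prod2 (Diss p) (Jokers q))
                          (Prod2 (Clowns p) (Diss q))

-- The expression functor and its fixpoint.
newtype Mu p = In (p (Mu p))
type ExprP = Sum1 (K1 Int) (Prod1 Id Id)
type Expr = Mu ExprP

-- A dissected ExprP node, with Int clowns and Expr jokers, carries
-- the same information as one layer of the introduction's Stack.
toLayer :: Diss ExprP Int Expr -> Either Expr Int
toLayer (L2 (K2 v))                                = absurd v
toLayer (R2 (L2 (Prod2 (K2 ()) (Jokers (Id e2))))) = Left e2
toLayer (R2 (R2 (Prod2 (Clowns (Id v1)) (K2 ())))) = Right v1
```

The first clause of toLayer is the Val case: a constant has no positions, so its dissection is empty and absurd discharges it. The other two clauses are exactly the L e2 and R v1 layers of the eval machine.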