
What is a “closure”?

Occasionally, one bumps into the term "closure", or "block closure", in computer science writings.
Frequently, someone will be saying that a computer language feature under discussion isn't a "full
closure" or a "proper closure". You may have encountered such opinions, for example, when
researching "inners" (inner classes), "Runnables" or "Callables" in Java; or when looking at
"functors" (function objects) in C++.

Meaning

A closure is a block of code that isn't executed there and then. However, a closure is more than just
a fancy term for a function, routine or method. A closure can be passed to another block of code, for
the receiving block to execute when it sees fit. That's why the term starts creeping into
discussions of function pointers and functors in C and C++: they come close to enabling closures.
But, as well as being a code block that can be passed around and executed later, a closure
also carries the context in which it was defined, and thus can reference variables, etc. from that
context. So if you're a Java programmer, maybe you think of inners; but they're not full closures,
because of those pesky final restrictions.
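
The two properties described above (a block handed to other code for later execution, and a block that still sees the context it was defined in) can be sketched in Java roughly as follows. This is only an illustration: it assumes Java 8 or later for the lambda syntax, and the names DeferredBlockDemo, repeat and collectGreetings are made up for the example. Note that Java only lets the block capture locals that are never reassigned, which is the restriction complained about above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from a real library.
class DeferredBlockDemo {

    // The receiving code: it decides when, and how often, the block runs.
    static void repeat(int times, Runnable block) {
        for (int i = 0; i < times; i++) {
            block.run();
        }
    }

    static List<String> collectGreetings() {
        // Context in which the block is defined.
        String greeting = "hello";            // never reassigned, so Java allows the capture
        List<String> log = new ArrayList<>(); // the reference is fixed; the list itself is mutated

        // Defined here, but not executed here.
        Runnable block = () -> log.add(greeting);

        // Executed later, by other code, yet it still sees 'greeting' and 'log'.
        repeat(3, block);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(collectGreetings()); // prints [hello, hello, hello]
    }
}
```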

Imagine you wanted to give a Runnable object to a Java thread (this is how a thread is told what it's
to start doing in Java), and that you wanted to implement this Runnable object with an anonymous
local inner class. All well and good; you've been able to do that since Java 1.1. Furthermore, imagine
that the code block in which all this is going on has an important local variable that you would like
the anonymous local class code to refer to. Your initial thought might be that it would be impossible
for an object in object memory (the "heap") to reference a local variable that is (or was) on the
run-time stack. Well, it can be done. But in order to get it working, Java insists that the locals in
question (and any arguments of the enclosing method that the inner object uses) must be final
(constant); then it can safely copy each value into the inner object.
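
A minimal sketch of the situation just described might look like this. The class and method names (InnerClassCaptureDemo, runInThread) are invented for the illustration. The local variables are declared final explicitly, as older Java versions required; since Java 8 the keyword may be omitted as long as the variable is never reassigned.

```java
// Hypothetical demo class; the names are illustrative only.
class InnerClassCaptureDemo {

    static String runInThread(String text) {
        // Locals of the enclosing method. They must be final (or never
        // reassigned) for the anonymous inner class below to use them.
        final String message = "worker saw: " + text;
        final StringBuilder result = new StringBuilder();

        // The Runnable is an anonymous local inner class; the thread is
        // told what to do by being handed this object.
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                // Uses the inner object's copies of 'message' and 'result'.
                result.append(message);
            }
        });

        worker.start();
        try {
            worker.join(); // wait for the worker so its result is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(runInThread("42")); // prints worker saw: 42
    }
}
```

Trying to assign a new value to message after the inner class has been declared would be rejected by the compiler, which is exactly the restriction that stops inners from being full closures.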

Origin

"Closure" is an obscure term, albeit one that began, long ago, with a precise definition. It is also
a dusty old term, and a fairly meaningless choice of word as far as programming is
concerned. Allen Wirfs-Brock, who was involved in the design of several Smalltalks and was close to
the original leakage of the term into the programming community, mentions he thinks it was a mistake
to start the Smalltalk community using the term, thereby causing countless delvers into the deeper
depths of object-oriented programming to suddenly encounter the term and rightly say "Eh?".

The term rang distant bells when I first bumped into it, probably only because I had done some Lisp
and Smalltalk during the 1980s.

The term "closure" originates with the last aspect of closures mentioned in the opening paragraph—
the aspect that is most difficult to simulate in languages like Java and C++—that the code block be
associated with, and have access to, the scope in which it was created.

Long ago, in the 1970s, Smalltalk was sorting out how methods and activation records would be
implemented. A method is probably something you are already familiar with: an object-oriented
function that is messaged for rather than called, and that is run by an object instance using its own
state as well as any argument values and local variables. An activation record is the working area
created for a single method invocation. Each individual method invocation needs its own
working area; otherwise one can't have methods that message themselves (recursion).

Most block-structured languages push method activation records onto a run-time stack. It's very
efficient. Scope, and nested scope handling, automatically tumble straight out of classic stack
behaviour. Some Smalltalks, however, used heap-based activation records to make it easier to
achieve proper closure. They were trying to get round the kinds of problems illustrated by the
discussion of Java local objects in the last paragraph of the previous section.
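
Java keeps its activation records on the stack, so the nearest a Java programmer can get to a heap-based record is to move the variable onto the heap by hand. The sketch below does this with a one-element array standing in for the local variable. It is a workaround offered as an illustration only, not a description of how any Smalltalk implements its activation records; the names HeapCellCounterDemo and makeCounter are made up, and Java 8 or later is assumed for the lambda.

```java
import java.util.function.IntSupplier;

// Hypothetical demo class; the names are illustrative only.
class HeapCellCounterDemo {

    static IntSupplier makeCounter() {
        // A one-element array is a heap-allocated cell. The reference to it
        // is final, but its contents can change, so the "local variable"
        // survives after makeCounter's stack frame has gone.
        final int[] count = {0};
        return () -> ++count[0];
    }

    public static void main(String[] args) {
        IntSupplier first = makeCounter();
        IntSupplier second = makeCounter();

        first.getAsInt();
        first.getAsInt();
        System.out.println(first.getAsInt());  // prints 3
        System.out.println(second.getAsInt()); // prints 1 (each counter has its own cell)
    }
}
```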

It was in Lisp, and possibly in the lambda calculus that influenced the design of many functional
languages, that the term "closure" arose. (If you haven't heard of the lambda calculus then
perhaps you've heard of the Turing machine. Back in the 1930s, Alan Turing and Alonzo Church (with
S. C. Kleene) were both working on the nature of computability. Today we still refer to the fruits of
their independent labours as the Church-Turing thesis. Turing used his eponymous machine and
Church used the lambda calculus.)

Lisp says that when a "block" is associated with a particular environment the block is "closed" (think
verb rather than adjective) with respect to that environment, and that any "open" (or "free")
variables in the block are bound to their corresponding entries in the environment. Hence the
term "closure". (You may not be familiar with "free variables". That may be because they are part of
why you can't easily do full closures in languages like C++ and Java. Think of the variable
identifiers you encounter in the code block of a Java or C++ method (or member function). These
identifiers will bind to argument values, local variables or instance variables (or data
members). A free variable, however, would not be defined in the code block's scope, but in the
environment in which the (code block) closure was defined.)
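
To make the distinction between bound and free variables concrete, here is a small sketch using a Java lambda (Java 8 or later assumed). The names FreeVariableDemo and makeAdder are invented for the example. Inside the lambda, x is bound because it is the lambda's own parameter, whereas offset is free: it is defined not in the lambda but in the environment where the lambda was created, and the lambda is "closed" over that environment.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical demo class; the names are illustrative only.
class FreeVariableDemo {

    static IntUnaryOperator makeAdder(int offset) {
        // 'x' is bound: it is the lambda's own parameter.
        // 'offset' is free: it is resolved in the defining environment
        // (this particular call of makeAdder).
        return x -> x + offset;
    }

    public static void main(String[] args) {
        IntUnaryOperator addTen = makeAdder(10);
        IntUnaryOperator addTwo = makeAdder(2);

        // Same code block, closed over two different environments.
        System.out.println(addTen.applyAsInt(5)); // prints 15
        System.out.println(addTwo.applyAsInt(5)); // prints 7
    }
}
```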

Reason

You might be interested in why Smalltalk was going to all this trouble. It's because just about
everything in Smalltalk is an object. That's why the syntax of Smalltalk doesn't require hundreds of
pages to describe; only half a page. Smalltalk is object-oriented with a vengeance. (It doesn't mean
that Smalltalk is an order of magnitude easier to learn, however; just a factor or two easier to learn.
A novice Smalltalk programmer must know a significant proportion of the library, whereas a novice
Java programmer could get by with knowing only a tiny bit about just one small library package.)

In particular, methods are implemented as objects in Smalltalk; and selection (if-then-else) and
iteration (loops) are implemented not as syntax, but as methods of library objects. For example, an
"if-then-else" is done by passing two block objects to the true or false object that
results from some boolean message. These two block objects are sent to this boolean object with the
message "if you're the 'true' object then execute this block object for me; on the other hand, if
you're the 'false' object then execute this other block for me". This way of doing things takes a bit of
getting used to, but it's elegant and consistent, and it works fine. You would expect the 'true' block
and the 'false' block to be able to refer to variables from their context. Their context, in
block-structured languages, is the enclosing scope(s). If they are block objects being passed around,
however, you can see that they have to undergo closure with respect to their context.
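
The Smalltalk approach to selection can be imitated in Java, which may help if the description sounds abstract. In the sketch below, the interface Bool, the objects TRUE and FALSE, and the method ifTrueIfFalse are all invented for the illustration (Smalltalk's own message is ifTrue:ifFalse:), and Java 8 or later is assumed. Each of the two blocks refers to variables from the enclosing method, which is why they need closure over their context.

```java
import java.util.function.Supplier;

// Hypothetical demo class; every name here is made up for the illustration.
class SmalltalkStyleBooleanDemo {

    // A boolean is an object that is handed two blocks and runs one of them.
    interface Bool {
        <T> T ifTrueIfFalse(Supplier<T> trueBlock, Supplier<T> falseBlock);
    }

    // The 'true' object always runs the first block.
    static final Bool TRUE = new Bool() {
        @Override
        public <T> T ifTrueIfFalse(Supplier<T> trueBlock, Supplier<T> falseBlock) {
            return trueBlock.get();
        }
    };

    // The 'false' object always runs the second block.
    static final Bool FALSE = new Bool() {
        @Override
        public <T> T ifTrueIfFalse(Supplier<T> trueBlock, Supplier<T> falseBlock) {
            return falseBlock.get();
        }
    };

    // A "boolean message": Java's built-in conditional is used only here,
    // to pick which object to answer with.
    static Bool lessThan(int a, int b) {
        return a < b ? TRUE : FALSE;
    }

    static String describe(int temperature) {
        int limit = 30; // part of the context that both blocks refer to
        return lessThan(temperature, limit).ifTrueIfFalse(
                () -> temperature + " is below " + limit,
                () -> temperature + " is not below " + limit);
    }

    public static void main(String[] args) {
        System.out.println(describe(20)); // prints 20 is below 30
        System.out.println(describe(35)); // prints 35 is not below 30
    }
}
```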
