/  15
 
Literate Programming
Donald E. Knuth
Computer Science Department, Stanford University, Stanford, CA 94305, USA
The author and his associates have been experimenting for the past several years with a program-ming language and documentation system called
WEB
. This paper presents
WEB
by example, anddiscusses why the new system appears to be an improvement over previous ones.
A. INTRODUCTION
The past ten years have witnessed substantial improve-ments in programming methodology. This advance,carried out under the banner of “structured program-ming,” has led to programs that are more reliable andeasier to comprehend; yet the results are not entirelysatisfactory. My purpose in the present paper is topropose another motto that may be appropriate for thenext decade, as we attempt to make further progressin the state of the art. I believe that the time is ripefor significantly better documentation of programs, andthat we can best achieve this by considering programsto be
works of literature
. Hence, my title: “LiterateProgramming.”Let us change our traditional attitude to the con-struction of programs: Instead of imagining that ourmain task is to instruct a
computer 
what to do, let usconcentrate rather on explaining to
human beings
whatwe want a computer to do.The practitioner of literate programming can be re-garded as an essayist, whose main concern is with ex-position and excellence of style. Such an author, withthesaurus in hand, chooses the names of variables care-fully and explains what each variable means. He or shestrives for a program that is comprehensible because itsconcepts have been introduced in an order that is bestfor human understanding, using a mixture of formaland informal methods that re¨ınforce each other.I dare to suggest that such advances in documenta-tion are possible because of the experiences I’ve hadduring the past several years while working intensivelyon software development. By making use of severalideas that have existed for a long time, and by applyingthem systematically in a slightly new way, I’ve stumbledacross a method of composing programs that excites mevery much. In fact, my enthusiasm is so great that Imust warn the reader to discount much of what I shallsay as the ravings of a fanatic who thinks he has justseen a great light.Programming is a very personal activity, so I can’tbe certain that what has worked for me will work foreverybody. Yet the impact of this new approach on myown style has been profound, and my excitement hascontinued unabated for more than two years. I enjoythe new methodology so much that it is hard for me torefrain from going back to every program that I’ve everwritten and recasting it in “literate” form. I find myself unable to resist working on programming tasks thatI would ordinarily have assigned to student researchassistants; and why? Because it seems to me that at lastI’m able to write programs as they should be written.My programs are not only explained better than everbefore; they also are better programs, because the newmethodology encourages me to do a better job. Forthese reasons I am compelled to write this paper, inhopes that my experiences will prove to be relevant toothers.I must confess that there may also be a bit of mal-ice in my choice of a title. During the 1970s I wascoerced like everybody else into adopting the ideas of structured programming, because I couldn’t bear to befound guilty of writing
unstructured 
programs. Now Ihave a chance to get even. By coining the phrase “liter-ate programming,” I am imposing a moral commitmenton everyone who hears the term; surely nobody wantsto admit writing an
illiterate
program.
B. THE
WEB
SYSTEM
I hope, however, to demonstrate in this paper that thetitle is not merely wordplay. The ideas of literate pro-gramming have been embodied in a language and asuite of computer programs that have been developedat Stanford University during the past few years as partof my research on algorithms and on digital typography.This language and its associated programs have cometo be known as the
WEB
system. My goal in what fol-lows is to describe the philosophy that underlies
WEB
,to present examples of programs in the
WEB
language,and to discuss what may be the future implications of this work.I chose the name
WEB
partly because it was one of the few three-letter words of English that hadn’t al-ready been applied to computers. But as time went on,I’ve become extremely pleased with the name, becauseI think that a complex piece of software is, indeed, bestregarded as a
web
that has been delicately pieced to-gether from simple materials. We understand a compli-cated system by understanding its simple parts, and byunderstanding the simple relations between those partsand their immediate neighbors. If we express a pro-gram as a web of ideas, we can emphasize its structuralproperties in a natural and satisfying way.
WEB
itself is chiefly a combination of two other lan-guages: (1) a document formatting language and (2) aprogramming language. My prototype
WEB
system uses
submitted to THE COMPUTER JOURNAL
1
 
D. E. KNUTH
TEX as the document formatting language and
PAS-CAL
as the programming language, but the same prin-ciples would apply equally well if other languages weresubstituted. Instead of TEX, one could use a languagelike Scribe or Troff; instead of 
PASCAL
, one could use
ADA
,
ALGOL
,
LISP
,
COBOL
,
FORTRAN
,
APL
,
C
, etc.,or even assembly language. The main point is that
WEB
is inherently bilingual, and that such a combination of languages proves to be much more powerful than eithersingle language by itself.
WEB
does not make the otherlanguages obsolete; on the contrary, it enhances them.I naturally chose TEX to be the document formattinglanguage, in the first
WEB
system, because TEX is myown creation;
1
I wanted to acquire a lot of experiencein harnessing TEX to a variety of different tasks. I chose
PASCAL
as the programming language because it hasreceived such widespread support from educational in-stitutions all over the world; it is not my favorite lan-guage for system programming, but it has become a“second language” for so many programmers that itprovides an exceptionally effective medium of commu-nication. Furthermore
WEB
itself has a macro-processingability that makes
PASCAL
’s limitations largely irrele-vant.Document formatting languages are newcomers tothe computing scene, but their use is spreading rapidly.Therefore I’m confident that we will be able to expecteach member of the next generation of programmers tobe familiar with a document language as well as a pro-gramming language, as part of their basic education.Once a person knows both of the underlying languages,there’s no trick at all to learning
WEB
, because the
WEB
user’s manual is fewer than ten pages long.A
WEB
user writes a program that serves as the sourcelanguage for two different system routines. (See Fig-ure 1.) One line of processing is called
weaving 
theweb; it produces a document that describes the pro-gram clearly and that facilitates program maintenance.The other line of processing is called
tangling 
the web;it produces a machine-executable program. The pro-gram and its documentation are both generated fromthe same source, so they are consistent with each other.TEX
TEX
  
−−−−→
DVI
  
WEAVE
WEB
  
TANGLE
PAS
  
−−−−→
REL
  
PASCAL
Figure 1.
Dual usage of a
WEB
file.
Let’s look at this process in slightly more detail. Sup-pose you have written a
WEB
program and put it intoa computer text file called
COB.WEB
(say). To gener-ate hardcopy documentation for your program, you canrun the
WEAVE
processor; this is a system program thattakes the file
COB.WEB
as input and produces another file
COB.TEX
as output. Then you run the TEX processor,which takes
COB.TEX
as input and produces
COB.DVI
asoutput. The latter file,
COB.DVI
, is a “device-independent”binary description of how to typeset the documenta-tion, so you can get printed output by applying onemore system routine to this file.You can also follow the other branch of Figure 1, byrunning the
TANGLE
processor; this is a system programthat takes the file
COB.WEB
as input and produces a newfile
COB.PAS
as output. Then you run the
PASCAL
com-piler, which converts
COB.PAS
to a binary file
COB.REL
(say). Finally, you can run your program by loadingand executing
COB.REL
. The process of “compile, load,and go” has been slightly lengthened to “tangle, com-pile, load, and go.”
C. A COMPLETE EXAMPLE
Now it’s time for me to stop presenting general plat-itudes and to move on to something tangible. Let uslook at a real program that has been written in
WEB
.The numbered paragraphs that follow are the actualoutput of a
WEB
file that has been “woven” into a doc-ument; a computer has also generated the indexes thatappear at the program’s end. If my claims for the ad-vantages of literate programming have any merit, youshould be able to understand the following descriptionmore easily than you could have understood the sameprogram when presented in a more conventional way.However, I am trying here to explain the format of 
WEB
documentation at the same time as I am discussing thedetails of a nontrivial algorithm, so the description be-low is slightly longer than it would be if it were writtenfor people who already have been introduced to
WEB
.Here, then, is the computer-generated output:Printing primes: An example of 
WEB
..............
§
1Plan of the program
..............................
§
3The output phase
................................
§
5Generating the primes
..........................
§
11The inner loop
..................................
§
22Index
...........................................
§
27
1. Printing primes: An example of 
WEB
.
Thefollowing program is essentially the same as EdsgerDijkstra’s “first example of step-wise program composi-tion,” found on pages 26–39 of his
Notes on StructuredProgramming 
,
2
but it has been translated into the
WEB
language.[[Double brackets will be used in what follows to en-close comments relating to
WEB
itself, because the chief purpose of this program is to introduce the reader tothe
WEB
style of documentation.
WEB
programs are al-ways broken into small sections, each of which has aserial number; the present section is number 1.]]Dijkstra’s program prints a table of the first thou-sand prime numbers. We shall begin as he did, by re-ducing the entire program to its top-level description.[[Every section in a
WEB
program begins with optional
commentary 
about that section, and ends with optional
program text 
for the section. For example, you are nowreading part of the commentary in
§
1, and the programtext for
§
1 immediately follows the present paragraph.Program texts are specifications of 
PASCAL
programs;they either use
PASCAL
language directly, or they useangle brackets to represent
PASCAL
code that appears
2
submitted to THE COMPUTER JOURNAL
 
LITERATE PROGRAMMING
in other sections. For example, the angle-bracket nota-tion ‘
Program to print
...
numbers
2
is
WEB
’s way of saying the following: “The
PASCAL
text to be insertedhere is called ‘Program to print
...
numbers’, and youcan find out all about it by looking at section 2.” Oneof the main characteristics of 
WEB
is that different partsof the program are usually abbreviated, by giving themsuch an informal top-level description.]]
Program to print the first thousand primenumbers
2
2.
This program has no input, because we want tokeep it rather simple. The result of the program will beto produce a list of the first thousand prime numbers,and this list will appear on the
output 
file.Since there is no input, we declare the value
m
=1000 as a compile-time constant. The program itself iscapable of generating the first
m
prime numbers for anypositive
m
, as long as the computer’s finite limitationsare not exceeded.[[The program text below specifies the “expanded mean-ing” of ‘
Program to print
...
numbers
2
’; notice thatit involves the top-level descriptions of three other sec-tions. When those top-level descriptions are replacedby their expanded meanings, a syntactically correct
PAS-CAL
program will be obtained.]]
Program to print the first thousand primenumbers
2
program
print primes
(
output 
);
const
m
= 1000;
Other constants of the program
5
var
Variables of the program
4
begin
Print the first
m
prime numbers
3
;
end
.
This code is used in section 1.
3. Plan of the program.
We shall proceed to fillout the rest of the program by making whatever deci-sions seem easiest at each step; the idea will be to strivefor simplicity first and efficiency later, in order to seewhere this leads us. The final program may not be op-timum, but we want it to be reliable, well motivated,and reasonably fast.Let us decide at this point to maintain a table thatincludes all of the prime numbers that will be gener-ated, and to separate the generation problem from theprinting problem.[[The
WEB
description you are reading once again fol-lows a pattern that will soon be familiar: A typicalsection begins with comments and ends with programtext. The comments motivate and explain noteworthyfeatures of the program text.]]
Print the first
m
prime numbers
3
Fill table
p
with the first
m
prime numbers
11
;
Print table
p
8
This code is used in section 2.
4.
How should table
p
be represented? Two possi-bilities suggest themselves: We could construct a suffi-ciently large array of boolean values in which the
k
thentry is
true
if and only if the number
k
is prime; orwe could build an array of integers in which the
k
thentry is the
k
th prime number. Let us choose the lat-ter alternative, by introducing an integer array called
 p
[1
.. m
].In the documentation below, the notation ‘
 p
[
k
]’ willrefer to the
k
th element of array
p
, while ‘
 p
k
’ will referto the
k
th prime number. If the program is correct,
p
[
k
]will either be equal to
p
k
or it will not yet have beenassigned any value.[[Incidentally, our program will eventually make use of several more variables as we refine the data structures.All of the sections where variables are declared willbe called ‘
Variables of the program
4
’; the number
4
’ in this name refers to the present section, which isthe first section to specify the expanded meaning of 
Variables of the program
’. The note ‘
See also
...
refers to all of the other sections that have the same top-level description. The expanded meaning of ‘
Variablesof the program
4
’ consists of all the program texts forthis name, not just the text found in
§
4.]]
Variables of the program
4
 p
:
array
[1
.. m
]
of 
integer 
;
{
the first
m
primenumbers, in increasing order
}
See also sections 7, 12, 15, 17, 23, and 24.This code is used in section 2.
5. The output phase.
Let’s work on the secondpart of the program first. It’s not as interesting as theproblem of computing prime numbers; but the job of printing must be done sooner or later, and we might aswell do it sooner, since it will be good to have it done.[[And it is easier to learn
WEB
when reading a programthat has comparatively few distracting complications.]]Since
p
is simply an array of integers, there is littledifficulty in printing the output, except that we need todecide upon a suitable output format. Let us print thetable on separate pages, with
rr 
rows and
cc
columnsper page, where every column is
ww 
character positionswide. In this case we shall choose
rr 
= 50,
cc
= 4, and
ww 
= 10, so that the first 1000 primes will appear onfive pages. The program will not assume that
m
is anexact multiple of 
rr 
·
cc
.
Other constants of the program
5
rr 
= 50;
{
this many rows will be on each page inthe output
}
cc
= 4;
{
this many columns will be on each pagein the output
}
ww 
= 10;
{
this many character positions will beused in each column
}
See also section 19.This code is used in section 2.
6.
In order to keep this program reasonably free of no-tations that are uniquely
PASCAL
esque, [[and in orderto illustrate more of the facilities of 
WEB
,]] a few macrodefinitions for low-level output instructions are intro-duced here. All of the output-oriented commands inthe remainder of the program will be stated in terms of five simple primitives called
print string 
,
print integer 
,
print entry 
,
new line
, and
new page
.[[Sections of a
WEB
program are allowed to contain
macro definitions
between the opening comments andthe closing program text. The general format for eachsection is actually tripartite: commentary, then defini-tions, then program. Any of the three parts may beabsent; for example, the present section contains noprogram text.]]
submitted to THE COMPUTER JOURNAL
3

Share & Embed

Add a Comment

Characters: ...

This document has made it onto the Rising list!