Empirical laws for mathematical notations

In the study of ordinary natural language there are various empirical historical laws that have been discovered. An example is Grimm's Law, which describes general historical shifts in consonants in Indo-European languages. I have been curious whether empirical historical laws can be found for mathematical notation.

Dana Scott suggested one possibility: a trend towards the removal of explicit parameters. As one example, in the 1860s it was still typical for each component of a vector to be a separately-named variable. But then components started getting labelled with subscripts, as in a_i. And soon thereafter--particularly through the work of Gibbs--vectors began to be treated as single objects, denoted say by a single letter a, often with an arrow on top or in boldface.

With tensors things are not so straightforward. Notation that avoids explicit subscripts is usually called "coordinate free." Such notation is common in pure mathematics. But in physics it is still often considered excessively abstract, and explicit subscripts are used instead.

With functions, there have also been some trends to reduce the mention of explicit parameters. In pure mathematics, when functions are viewed as mappings, they are often referred to just by function names like f, without explicitly mentioning any parameters. But this tends to work well only when functions have just one parameter. With more than one parameter, it is usually not clear how the flow of data associated with each parameter works.

However, as early as the 1920s, it was pointed out that one could use so-called combinators to specify such data flow, without ever explicitly having to name parameters. Combinators have not been used in mainstream mathematics, but at various times they have been somewhat popular in the theory of computation, although their popularity has been reduced through being largely incompatible with the idea of data types. Combinators are particularly easy to set up in Mathematica--essentially by building functions with composite heads.
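To make the parameter-free data flow concrete, here is a sketch in Python (chosen only so the reductions can be run anywhere; the helper names church and decode are hypothetical, not from the talk) of the same s and k combinators that are defined in Mathematica below:

```python
# S and K combinators as curried Python functions, mirroring the
# Mathematica definitions  k[x_][y_] := x  and  s[x_][y_][z_] := x[z][y[z]].

def k(x):
    return lambda y: x

def s(x):
    return lambda y: lambda z: x(z)(y(z))

succ = s(s(k(s))(k))        # the successor combinator s[s[k[s]][k]]
zero = k(s(k)(k))           # Church zero, k[s[k][k]]

def church(n):
    """Build the unary (Church) numeral n by nesting succ, like Nest[succ, zero, n]."""
    c = zero
    for _ in range(n):
        c = succ(c)
    return c

def decode(c):
    """Read a Church numeral back as a Python int."""
    return c(lambda m: m + 1)(0)

plus  = s(k(s))(s(k(s(k(s))))(s(k(k))))   # addition:        s[k[s]][s[k[s[k[s]]]][s[k[k]]]]
times = s(k(s))(k)                        # multiplication:  s[k[s]][k]
power = s(k(s(s(k)(k))))(k)              # power:           s[k[s[s[k][k]]]][k]

print(decode(plus(church(2))(church(3))))    # 5
print(decode(times(church(2))(church(3))))   # 6
print(decode(power(church(2))(church(3))))   # 8
```

Running the reductions by hand shows why the notation is obscure: plus m n f x grinds down to m f (n f x) only after half a dozen S and K steps, with no named variable ever marking which argument flows where.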
Here's how the standard combinators can be defined:

k[x_][y_] := x
s[x_][y_][z_] := x[z][y[z]]

If one defines the integer n--effectively in unary--by Nest[s[s[k[s]][k]], k[s[k][k]], n] then addition is s[k[s]][s[k[s[k[s]]]][s[k[k]]]], multiplication is s[k[s]][k], and power is s[k[s[s[k][k]]]][k]. No variables are required. The problem is that the actual expressions one gets are almost irreducibly obscure. I have tried to find clear ways to represent them and their evaluation. I have made a little progress, but have certainly not been fully successful.

[back to top]

Printed vs. on-screen notation

Some people asked about differences between what is possible in printed and on-screen notation. Notation needs to be familiar to be understood, so the differences cannot be too sudden or dramatic. But there are some obvious possibilities.

First, on screen one can routinely use color. One might imagine that it would somehow be useful to distinguish variables by color. In my experience it is fine to do this in annotating a formula. But it becomes totally confusing if, for example, a red x and a green x are supposed to be distinct variables.

Another possibility is to have animated elements in a formula. I suspect that these will be as annoying as flashing text, and not immediately useful. A better idea may be to have the capability of opening and closing sections of an expression--like cell groups in a Mathematica notebook. Then one has the possibility of getting an overall view of an expression, but being able to click to see more and more details if one is interested.

[back to top]

Graphical notation

Several people thought I had been too hard on graphical notations in my talk. I should have made it clearer that the area where I have found graphical notations difficult to handle is in representing traditional mathematical actions and operations. In my new science I use graphics all the time, and I cannot imagine doing what I do any other way. And in traditional science and mathematics there are certainly graphical notations that work just fine, though typically for fairly static constructs. Graph theory is an obvious place where graphical representations are used. Related to this are, for example, chemical structure diagrams in chemistry and Feynman diagrams in physics. In mathematics, there are methods for doing group-theoretical computations--particularly due to Predrag Cvitanovic--that are based on graphical notation. And then in linguistics, for example, it is common to "diagram" a sentence, showing the tree of derivations that can be used to build up the sentence.
All of these notations, however, become quite obscure if one has to use them in cases that are too big. Indeed, in Feynman diagrams, two loops are the most that are routinely considered, and five loops is the maximum for which explicit general computations have ever been done.

[back to top]

Fonts and characters

I had meant to say something in my talk about characters and fonts. In Mathematica 3 we went to a lot of trouble to develop fonts for over 1100 characters of relevance to mathematical and technical notation.

Getting exactly the right forms--even for things like Greek letters--was often fairly difficult. We wanted to maintain some semblance of "classical correctness," but we also wanted to be sure that Greek letters were as distinct as possible from English letters and other characters. In the end, I actually drew sketches for most of the characters. Here's what we ended up with for the Greek letters. We made both a Times-like font and a monospaced Courier-like font. (We're currently also developing a sans serif font.) The Courier font was particularly challenging. It required, for example, working out how to stretch an iota so that it could sensibly fill a complete character slot.

[image: alphabet]

Other challenges included script and Gothic (Fraktur) fonts. Often such fonts end up having letters that are so different in form from ordinary English letters that they become completely unreadable. We wanted to have letters that somehow communicated the appropriate script or Gothic theme, but nevertheless had the same overall forms as ordinary English letters. Here's what we ended up with:

[back to top]

Searching mathematical formulas

Various people asked about searching mathematical formulas. It's obviously easy to specify what one means by searching plain text. The only issue usually is whether one considers upper- and lowercase letters to be equivalent. For mathematical formulas things are more complicated, since there are many more forms that are straightforwardly equivalent. If one asks about all possible equivalences, things become impossibly difficult for basic mathematical reasons. But if one asks about equivalences that more or less just involve substituting one variable for another, then one can always tell whether two expressions are equivalent. However, it pretty much takes something with the power of the Mathematica pattern matcher to do this.
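The substituting-one-variable-for-another kind of equivalence can be sketched quite simply: rename variables to canonical names in order of first appearance, then compare. Here is a small Python illustration (the tuple representation and both function names are hypothetical; this is not the actual search engine):

```python
# Equivalence of formulas up to renaming of variables, for formula search.
# A formula is a nested tuple (head, arg1, arg2, ...); bare strings are variables.

def canonical(expr, names=None):
    """Rename variables to v0, v1, ... in order of first appearance."""
    if names is None:
        names = {}
    if isinstance(expr, tuple):
        # Compound expression: keep the head, canonicalize the arguments.
        return (expr[0],) + tuple(canonical(a, names) for a in expr[1:])
    if expr not in names:
        names[expr] = f"v{len(names)}"   # first occurrence gets the next name
    return names[expr]

def alpha_equivalent(e1, e2):
    """True if e2 is e1 with variables consistently renamed."""
    return canonical(e1) == canonical(e2)

# a + a^n matches x + x^m (consistent renaming), but a + a does not match a + b.
print(alpha_equivalent(("Plus", "a", ("Power", "a", "n")),
                       ("Plus", "x", ("Power", "x", "m"))))   # True
print(alpha_equivalent(("Plus", "a", "a"),
                       ("Plus", "a", "b")))                   # False
```

This handles only consistent one-for-one renaming; recognizing deeper equivalences (commutativity, algebraic identities) is where the undecidability mentioned above sets in, and where real pattern-matching machinery is needed.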
We're planning formula searching capabilities for our new website functions.wolfram.com, though as of right now this has not actually been implemented there.

[back to top]

Non-visual notation

Someone asked about non-visual notation. My first response was that human vision tends to be a lot more sensitive than, say, human hearing. After all, we have a million nerve fibers connected to our eyes, and only 50,000 connected to our ears. Mathematica has had audio generation capabilities since version 2 in 1991. And there are some times when I've found this useful for understanding data.

But I at least have never found it at all useful for anything analogous to notation.

[back to top]

Proofs

Someone asked about presentations of proofs. The biggest challenge comes in presenting long proofs that were found automatically by computer. A fair amount of work has been done on presenting proofs in Mathematica. An example is the Theorema project, www.theorema.org.

The most challenging proofs to present are probably ones--say in logic--that just involve a sequence of transformations on equations. Here's an example of such a proof. Given the Sheffer axioms of logic (f is the Nand operation):

{f[f[a,a],f[a,a]] == a,
 f[a,f[b,f[b,b]]] == f[a,a],
 f[f[a,f[b,c]],f[a,f[b,c]]] == f[f[f[b,b],a],f[f[c,c],a]]}

prove commutativity, i.e. f[a,b] == f[b,a]. Note: (a b) is equivalent to Nand[a,b]. In this proof, L==Lemma, A==Axiom, and T==Theorem.

[back to top]

Character selection

I had meant to say something about selecting characters to use in mathematical notation. There are about 2500 commonly-used symbols that do not appear in ordinary text. Some are definitely too pictorial: a fragile sign, for example. Some are too ornate. Some have too much solid black in them, so they'd jump out too much on a page. (Think of a radioactive sign, for example.) But a lot might be acceptable. If one looks at history, it is fairly often the case that particular symbols get progressively simplified over the course of time.

A specific challenge I had recently was to come up with a good symbol for the logic operations Nand, Nor, and Xor. In the literature of logic, Nand has been variously denoted:

[image: Nand notations]

I was not keen on any of these. They mostly look too fragile and not blobby enough to be binary operators. But they do provide relevant reminders.

What I have ended up doing is to build a notation for Nand that is based on one of the standard ones, but is "interpreted" so as to have a better visual form. Here's the current version of what I came up with:

[image: Nand symbol]

[back to top]

Frequency distribution of symbols

In the talk I showed the frequency distribution for Greek letters in MathWorld. To complement this, I also counted the number of different objects named by each letter, as they appear in the Dictionary of Physics and Mathematics Abbreviations. Here are the results.

In early mathematical notation--say in the 1600s--quite a few ordinary words were mixed in with symbols. But increasingly in fields like mathematics and physics, no words have been included in notation, and variables have been named with just one or perhaps two letters. In some areas of engineering and social science, where the use of mathematics is fairly recent and typically not too abstract, ordinary words are much more common as names of variables. This follows modern conventions in programming. And it works quite well when formulas are very simple. But if they get complicated, it typically throws off the visual balance of the formulas, and makes their overall structure hard to see.

[back to top]

Parts of speech in mathematical notation

In talking about the correspondence between mathematical language and ordinary language, I was going to mention the question of parts of speech. So far as I know, all ordinary languages have verbs and nouns, and most have adjectives, adverbs, etc. In mathematical notation, one can think of variables as nouns and operators as verbs. What about other parts of speech? Things like ∧ (And) sometimes play the role of conjunctions, just as they do in ordinary language. (Notably, all ordinary human languages seem to have single words for And and Or, but none has a single word for Nand.) And perhaps ± (PlusMinus) as a prefix operator can be viewed as an adjective.
But it is not clear to what extent the kinds of linguistic structure associated with parts of speech in ordinary language are mirrored in mathematical notation.
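As a footnote to the Proofs section above: while the symbolic proof of commutativity from the Sheffer axioms is long, the fact that the axioms and the theorem hold for the ordinary boolean Nand can be checked by brute force. A minimal Python sketch:

```python
# Check, over the two boolean values, that the three Sheffer axioms hold
# for Nand, and that Nand is commutative (the theorem proved symbolically).
from itertools import product

def f(x, y):
    """The Nand operation."""
    return not (x and y)

for a, b, c in product([False, True], repeat=3):
    assert f(f(a, a), f(a, a)) == a                      # axiom 1
    assert f(a, f(b, f(b, b))) == f(a, a)                # axiom 2
    assert (f(f(a, f(b, c)), f(a, f(b, c)))
            == f(f(f(b, b), a), f(f(c, c), a)))          # axiom 3
    assert f(a, b) == f(b, a)                            # commutativity

print("all Sheffer axioms and commutativity verified")
```

Of course, checking a model is far weaker than the symbolic proof itself: the point of the proof is that commutativity follows from the axioms alone, for any operation satisfying them, not just for the boolean Nand.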