TRAC, A Text-Handling Language

by C. N. Mooers and L. P. Deutsch (1965)

Paper presented at the 20th National Conference of the Association for Computing Machinery, Cleveland, Ohio, August 1965.

The TRAC system for Text Reckoning And Compiling was developed as a software package and user language to go with the reactive typewriter. Design goals included the attainment of a concise and efficient input language, a straightforward philosophy and a high order of logical versatility. The external and internal forms of the TRAC language are the same. TRAC can accept, name, store, operate upon in any way, and emit any string of characters that can be produced on a teletypewriter keyboard. Any string can be treated at any time as text, name, or program. This paper describes the design decisions that went into the construction of the TRAC language and system.

The acronym TRAC stands for "text reckoning and compiling" (1). The TRAC system had its genesis in the need for a general tool for dealing with text. In its later stages, TRAC developed in parallel with the evolution of the reactive typewriter concept (2). TRAC is now running in a time-shared environment, and is currently undergoing testing and operational refinement (3). In preliminary assessment, TRAC appears to exceed the design targets set for it.

By text is meant any string of characters (alphabetic, numeric, punctuation, other signs, format control and signal control characters) which can be generated from a teletypewriter keyboard, or any other keyboard, or which can control the printing action of a teleprinter machine, or can control the action of any other device. By dealing with text is meant any effectively definable operation upon characters, strings of characters, or their code representation.
Such operations may include making corrections and insertions in text, storage of text for later use, retrieval of text, re-formatting of text and page make-up, data formatting for numerical computation, simple numerical computation, character recoding, and many other possibilities. By reactive typewriter is meant a teletypewriter, or typewriter with teletypewriter capabilities, connected by telegraph line to a multiple access, time-shared computer with large storage capabilities. Low cost, full upper- and lower-case character capabilities, and convenient immediate responsiveness are important elements of the concept of the reactive typewriter (2). The TRAC system can be thought of as the "genie at the other side of the keyboard" for a person using the reactive typewriter. Such a person may be a scientist doing a computation, a stenographer correcting a legal brief, or a librarian composing a catalog card. Each user should have a set of procedures that can be called, used, and even redefined at his keyboard. These procedures should be callable in the TRAC language directly, or in English or "plain language" with intervening TRAC control. A subset of the TRAC language should be so simple in concept and use that a stenographer or librarian can use it constructively, while the full language should be of sufficient power so that it can be used to formulate any well-defined procedure. TRAC should secure these ends through a common-sense conceptual approach, as opposed to an approach requiring a high order of indoctrination in some system of symbolic logic, or an extensive background in computer programming notions and methods. Since the procedures written in the TRAC language are carried out interpretively, the versatility of TRAC occurs at the cost of lower absolute efficiency than for the same procedures written in assembly language programming. Therefore it is not intended that TRAC be used in large batch production jobs

such as partial differential equation solution, matrix inversion, sorting, large file management, language translation, character recoding, and the like. On the other hand, TRAC would be economical for on-line attended operation in letter writing, editing, bookkeeping, data formatting, computer program writing, logical experimentation, and other tasks where continuous human intervention is important.

In the design of TRAC, there were the following broad technical goals. As already stated, TRAC should be hospitable to, and should be able to accept, name, manipulate, store, and emit, any teletypewriter character or string of characters. This ability should include any control or formatting characters of any kind, or in any combination, i.e., to such characters as carriage return, space, and, in certain instances, to the specific lack of a character -- null in the most precise sense. Through suitable strings produced at the keyboard, it should be possible to specify any well-defined procedural operation upon strings of characters. It should do this through combinations of primitive functions which are used to make up such procedures. It should be able to perform integer arithmetic and to manipulate bit strings. It should be able to treat any character string as text, procedure, or name. It should be able to treat a string at one time in one way, and later in another way. Therefore it should have the ability to treat the statement of a procedure temporarily as text, and, as such, to modify it, and thereafter to execute the corrected statement as a new procedure. TRAC should permit the keyboard user to move any named string off onto a mass storage device (disc, drum, or tape), to retrieve it at will, and to control the organization of the strings within the storage device. It should also permit the user to output strings onto external media such as paper tape, and to read them back in again. By extension, it should permit the receipt and transmission of strings over common carrier communication channels, including transmission into, and recording in, other TRAC-controlled systems. In its time-shared environment, TRAC is a "user program", and thus is one of the many user programs which may be in simultaneous operation. 
In some cases, it may be desirable to have a version of TRAC as a component of an executive program serving many users -- a matter which has yet to be investigated. It was a goal that TRAC should be a "keyboard" language easily usable on line. Thus it should not be upset by the usual typing mistakes, and the user should find it easy to recover from such mistakes. No inadvertent combination of characters should cause TRAC to "blow up". It should accept nonsense gracefully, with at most a polite diagnostic response of "program error" or the like. At any time, it should be possible to stop extended loops and "sorcerer’s apprentice" actions, merely with a touch of the break key. This should result automatically in the re-initialization of the state of the TRAC processor, though without loss of any work in storage. A goal of the TRAC language was to produce a simple and explicit syntax which was independent of a line format on a page, and which was not a descendant of a syntax based upon the tabulating card. Another goal was to avoid a "crypto-syntax" such as one based on nonprinting characters, or on a confusing hierarchy of rules; or a "hopefully simplified" syntax based on English and built around some specific job. It was desired that the syntax of TRAC be based upon as few control (i.e., syntactic or meta) characters as possible, since the more such characters there are, the greater the possibility of getting into unanticipated trouble. When not acting in their syntactic role, these characters, as much as

possible, should behave as ordinary characters. There should be the possibility of changing at least certain of the control characters while under TRAC control. For some purposes, it may be desirable to be able to change all of them. The language should be based, as is most natural, upon strings of arbitrary length marked by some easily understood and used mode of punctuation and delimitation. Particularly to be avoided is a dependence upon atomic symbol strings which do not allow splitting or joining of character substrings within the ordinary scope of the language. It should be possible to divide any string, or to put together any two strings at any time to make an integral (non punctuated) new string. For the most natural usage, strings should stand for themselves. In other words, "a character string is a character string", and only if the TRAC syntax requires a string to be interpreted as a name is it treated as a name. On the other hand, naming and indirect addressing through names should be possible to any necessary depth. In the language, one should be able to nest any primitive function or procedural statement inside any other, to any depth, and without confusion. One should be able to handle iterations (repetitions of the same step) and recursions (deferrals of a partially complete step with a call upon itself or another step) to any depth within the limits of storage space in the host machine. Procedures should be able to operate upon themselves, or to refer to themselves, without limitation. Procedures should be able to create new definitions of procedures, and then to execute such procedures. It should not be necessary to keep track of "level", and the statement of procedures should be independent of level. The set of primitive functions available in TRAC should be carefully selected. 
There should be enough to provide convenience in the kinds of operations most used in practice, yet not so many that the functions become individually detailed and overly specialized. It might be an interesting logical exercise to try to develop another logically minimal, non-redundant set of functions. However, experience (such as with the logically elegant five-function set of primitives in LISP) shows that a practical operating system must have conveniently available to it a substantially larger set of user functions. Thus the goal should be to find some set of primitive user functions which have the required versatility and generality, yet which are still manageable in number and intellectual complexity. TRAC should be capable of performing decisions during the execution of procedures, and the procedures should be able to change and evolve through redefinition in consequence of data which is brought to them. Procedures should be capable of defining and storing new matter at any time, either text or new procedures. Procedures should be capable of creating new unique names for such matter, or for other purposes, as may be required. One of the main design goals was that the input script of TRAC (what is typed in by the user) should be identical to the text which guides the internal action of the TRAC processor. In other words, TRAC procedures should be stored in memory as a string of characters exactly as the user typed them at the keyboard. If the TRAC procedures themselves evolve new procedures, these new procedures should also be stated in the same script. The TRAC processor in its action interprets this script as its program. In other words, the TRAC translator program (the processor) effectively converts the computer into a new computer with a new program language -- the TRAC language. At any time, it should be possible to display program or procedural information in the same form as the TRAC processor will act upon it during its execution. 
It is desirable that the internal character code representation be identical to, or very similar to, the external code representation. In the present TRAC implementation, the internal character representation is based upon ASCII (American Standard Code for Information Interchange). Because TRAC procedures and text have the same representation inside and outside the processor, the term homoiconic is applicable, from homo meaning the same, and icon meaning representation (4).

Many of these stated goals were set early in the development. However, some of the goals were added more recently, as the power and the scope of the present TRAC programming emerged.

TRAC had a number of sources of inspiration; COMIT and LISP provided the main challenge to devise a system which could do at least as much as they could, yet would not have their limitations. A major limitation of COMIT was clearly the need for compilation, resulting in rigid procedures which could not be fundamentally modified at the keyboard during run time. Other shortcomings were in its line-by-line format, and the inability of one procedure to operate upon another procedural statement and then to execute the result. LISP, although elegant in concept, becomes inelegant in practice. It even "cheats" in the frequent use of machine-language procedures. It is still not an easily used keyboard language. More unfortunately, its documentation seems to require a "mystique" of a sort that not all the reactive typewriter users will have. Its bifurcated list structure basis significantly limits the generality of the data structures. Finally, LISP is troubled with a dual language problem: an M-language, which is easy to read, and is used externally, and an S-language, with which the LISP processor operates, and which is usable externally only by the hardened initiates. It should be noted here that were the S-language the only LISP language, LISP would be close to being homo-iconic (excluding the machine-language functions). IPL-V was also a challenge. However, it seemed that IPL-V was closer to a stylized computer language than it was to a user language as conceived here. Certainly for the reactive typewriter purposes, it was altogether too much like conventional computer programming to be usable by the expected user groups. The main inspiration for TRAC came from three papers by McIlroy and colleagues discussing their work with a macro assembly system (5,6,7). 
These papers pointed out that a macro assembly system had unexpected power as a symbol manipulator, and that if the system could store definition schemata, could create definitions, could execute such definitions, and could determine its next actions from a decision capability, then the system was indeed a Turing machine and was of complete logical generality. Unfortunately, the two most interesting papers by McIlroy (6,7) were never published in a journal. In discussing the implications of such an extended macro assembler system, McIlroy (5) had this to say about what a generalized compiler-processor should have: Definitions should be able to contain macro calls. Parenthetical notation should be available for nesting and compounding calls. Conditional statements should be operative at any time. Created symbols should be available for such use as in names. Definitions should themselves be able to contain schemata for creating additional definitions. Repetitions over a list should be permitted. (These are paraphrased slightly for clarity.) All of these capabilities in some form are available in the present TRAC system. TRAC, upon close inspection, shows itself as a system built around the notion of the macro as it is used in macro assembly languages. It is designed to be strong at putting things together and for carrying out tasks. Less attention was paid to providing extensive analytical ability. Thus TRAC may be clumsy (though not impossibly so) if one uses it to imitate a compiler’s scanner and parser. A goal for later work is to strengthen TRAC’s analytical ability.

McIlroy’s system was cast in the format of the symbolic assembly language of the IBM 704 computer. In other words, it was based on a tabulating card format. Like LISP, it dealt with atoms (symbolic addresses and symbolic operation codes). Because it was written in the usual columnar programmer’s format, it was clearly unsuitable. The problem was: how to modify it, how to generalize it, and how to extend it to give the results desired. This process of generalization turned out to be similar to the problem facing the inventor of some game using a playing board, pieces and rules. In the development of TRAC, the general nature of the end goal was intuitively visible from the beginning. Yet it was not evident what the end result would look like. The development process was one of tentatively setting forth a complete set of "rules of the game" and then trying to "play the game". In the early stages, there was no need to try to implement the ideas by programming on a computer. If the statement of the rules and of the moves would not permit performing certain useful procedures, that was the end of it. It was then necessary to go back and try to devise some new set of rules that would. In a highly interlocked and interdependent system such as TRAC, very small changes in a rule may cause major consequences in a number of unanticipated places.

A critical decision in TRAC, on which almost everything else rested, was the form of, and the manner of operation of, an elemental "statement". The hope was for a kind of statement having minimal complexity (visual, logical, and typographical), using the simplest explicit syntax giving the utmost logical versatility, all being done with the fewest possible syntactic or meta characters. In the end, as one might expect, it was necessary to accept some compromises; nevertheless, the overall result has proved to be quite satisfactory.

The manner in which the scope of an elemental statement is indicated may appear to be a simple matter, yet it is far from trivial. Its consequences provide a good illustration of the high dependence of a language system upon the adopted script. Decisions on script were especially important in this case, since it was intended that TRAC be homo-iconic. Thus the script should not present severe problems of use either to the human or to the machine. It might seem that the choice of the script was a mere matter of decision based upon human convenience and scannability for the machine processor. This is not so. The consequences of the script decisions go a great deal deeper, as was apparent only after the script was implemented and tried on the computer. Some of the most valuable features of the script were not suspected when the script was first under consideration.

The modes for delimitation of a statement seem to be restricted to the following three: 1) The use of word delimiting as in ALGOL, such as "begin A ... end A" where A is the name of a procedure; 2) The use of a numerical projection, such as: "take the next N characters" or the next N units of text; and 3) The use, in some manner, of paired punctuation marks, such as parentheses. At first, "word delimiting" seemed attractive.
Upon study, it was found that because the operands of procedures were going to be text strings, usually in a page format, there was a likelihood of collision between the uses of the words "begin" and "end" as delimiters and as elements of text. Another consideration was that word delimiting requires a substantial number of characters, and therefore with multiple nesting, it would be confusing and far from compact. It was also clear that the points of delimitation would not be readily visible when immersed in a stream of text. Therefore this approach was discarded primarily because of its clumsiness.

The "numerical projection" method depends upon a prediction of how many characters or units of text should be taken after some point of projection. The methods of the "Polish" prefix notation are of this character, in that each function implicitly calls for a very specific number of arguments to follow, usually two arguments. This is satisfactory if one can predict the situation. However, it often happens that one does not know, or there is no way of knowing, how many units of text or how many characters will eventually come within the scope of a statement. TRAC, as currently used, often encounters this situation in important applications. From the user’s standpoint, the prefix notation is almost unreadable if there is any depth of functional complexity. It is purported that machine processing is easier with prefix notation. However, in comparison with other easily-parsed notations, this advantage is probably more superficial than of any real consequence. The prefix style of notation and methods based on numerical projection were therefore discarded because of their inability to handle strings of unpredictable length, and because of unreadability.

There remained the method of paired punctuation signs, best exemplified by parentheses. Parentheses have the advantage of being familiar from the script of algebra, and they are effective and compact when used to delimit text. Their major difficulty is demonstrated in an acute form by the parenthesis problems of LISP, where expressions can have a dozen or more parentheses at the end. However, it appears that the primitive functions chosen in LISP tend to aggravate the parenthesis problem. A typical compound expression in LISP will tend to be of the form (symbol(expression)) with all the parentheses piling up at the end. In TRAC, as will be seen, compound expressions tend to have the form (symbol(expression)symbol) and thus there does not tend to be such a concentrated pile-up at the end.
There will always be the need for careful parenthesis counts to make sure that delimiting will occur as intended. On balance, parentheses were chosen as the means of statement delimiting in TRAC.
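The parenthesis count mentioned above is easy to mechanize. The following is a minimal Python sketch, not part of the TRAC system; the function name paren_balance is hypothetical and chosen only for illustration. It reports whether every opening parenthesis in a string has a matching close, and how many remain open.

```python
def paren_balance(text):
    """Return (is_balanced, depth) for the parentheses in text.

    depth is the number of parentheses still open at the end of the string,
    or a negative number if a ")" appeared with nothing left to close.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a close with no matching open
                return False, depth
    return depth == 0, depth


print(paren_balance("(symbol(expression)symbol)"))  # (True, 0)
print(paren_balance("(symbol(expression)"))         # (False, 1)
```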

The parentheses as text delimiters in TRAC have the purpose of indicating that something is to be done to the enclosed text. There are two main cases: 1) to indicate that the text is the argument of a function, and 2) to indicate that something is not to be done with the enclosed text, i.e., to protect functions within the text from immediate execution. In the first case, TRAC uses the format #(text string) to indicate a function. Here the "sharp sign" or "number sign" is a character having a special meaning when it immediately precedes a parenthesis. Its meaning can be roughly stated as "evaluate what is enclosed". In other words, it signals that the delimited statement should be treated as a function, if that is possible. Later we will consider a modified function represented by ##(text string). In TRAC, if the number sign does not occur in these two very specific ways: either as #( or as ##(, it is invariably treated as an ordinary alphabetic text character. In practice, the function indicator sign #, as well as the other TRAC control or meta characters, should be chosen in accordance with their ease in typing on the particular keyboard that will be used. When working with the Teletype Model 33 keyboard, the "up arrow" character is particularly easy to use in sequence with the parenthesis. Because most functions require several arguments, it is necessary to have a set of rules for dividing the text string inside a function into the various substrings which will then be individual arguments. Commas fulfill this role of marking the points of division. If the text string inside the function-delimiting parentheses consists only of commas and alphabetic characters, division is done simply at the locations of the commas.
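The simple case just described, in which the text inside the function-delimiting parentheses holds only commas and ordinary characters, can be sketched in a few lines of Python. This is an illustration under that stated assumption only; the function name split_simple_call and the sample strings are hypothetical, and nested functions and protective parentheses are deliberately not handled.

```python
def split_simple_call(text):
    """Split a simple '#(...)' statement into its argument substrings.

    Assumes the body contains only commas and ordinary characters,
    so division is done simply at the locations of the commas.
    """
    if not (text.startswith("#(") and text.endswith(")")):
        raise ValueError("not a simple #( ... ) statement")
    body = text[2:-1]          # drop the leading '#(' and the final ')'
    return body.split(",")     # each comma marks a point of division


print(split_simple_call("#(ab,one,two,three)"))  # ['ab', 'one', 'two', 'three']
```

Note that a body with no commas yields a single argument, and two adjacent commas yield an empty argument between them.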

More generally, the text string of a function will be more complicated. It may contain the statement of another function. It may have pairs of protective parentheses. There may be other variations. It is precisely at this point that TRAC epitomizes several unique contributions to the development of a language system. Before getting into these details, first consider some of the superficial aspects of TRAC functions, while omitting the problems of nesting or the other tricky aspects of argument strings. For simplicity, imagine the argument strings to be composed merely of commas and alphabetic characters, the latter having no syntactic significance. A typical TRAC function will have this form:

#(ab,S1,S2, ... ,Sn)

where the "a" and "b" are individual alphabetic characters, and the "Si" are alphabetic strings not containing commas. The two initial characters distinguish the particular TRAC function represented; the particular characters chosen for each function are chosen for their memory-aiding or mnemonic ability. TRAC has about 35 primitive functions, and thus a single character would not suffice.

A variety of different ways of representing a function was considered during the design of TRAC. In the beginning it was hoped that there would be only five or six basic functions, and that some quite abbreviated expression, perhaps involving punctuation, could be developed. Convenience of use soon forced the decision to provide a larger number of primitive functions. That being the case, the indication by the two-letter mnemonic seemed to be the best solution. Consideration was also given during design to the alternative functional forms:

#ab(S1, ... ,Sn)
(#ab,S1, ... ,Sn)

The latter form was the strongest alternative to the form finally adopted. The adopted form #(ab,S1, ... ,Sn) was finally chosen, partly because of its slightly better readability, but more particularly because it would allow the two-letter mnemonic to be treated consistently as one of the argument substrings.

Let us now consider two of the most basic of the TRAC primitive functions. The first is the define string function which allows us to name and to store a piece of text. It is illustrated by:

#(ds,Abc,The apple is red.)

This function causes the text "The apple is red." to be placed in storage, and the name "Abc" to be set up in a table of contents with a pointer to the location of the stored text. Note that alphabetic case shifts, if they are applicable, are retained in TRAC. The counterpart function is call, illustrated by:

#(cl,Abc)

The result of this call is to bring the named text out of storage (without erasure) and to insert the text in place of the statement of the call function, i.e., in the place of the "#(cl,Abc)". Specifically, #(cl,Abc) is deleted from the string in which it is found, the following text (if any) is pushed ahead, and the text

The apple is red. is put in the place formerly occupied by the call. Note that the call function leaves a value behind in its place. Thus the call function is a representative of the class of TRAC functions which have a value. On the other hand, the define string function, although producing a useful action, leaves nothing behind at the location where the define string statement occurred. Thus the define string function is an example of a TRAC function with a null value. If such a function occurs in a string of text, after the performance of the function, the enclosing text is closed up, leaving no trace of the function behind. Note especially that in TRAC, "null value" means specifically "no character of any kind", not even the space character or "blank" of the tab card. At this date, with TRAC running, the present form of the define string function and the call function seem very simple and obvious. Neither started this way. Initially, the analog of the define string function in TRAC did more than merely name and record a string. Following McIlroy, it created an entire macro expression, including the points of substitution. Due to its complexity, it required a special syntax and functional form which was different from the rest of the functions. The call function was in even a worse state of affairs. Initially, the call was an operator that moved a string from the memory into some central "accumulator" or scratch pad, where further operations might be performed on the string. Essentially the call function, in its philosophy, was then closely related to the "single address" mode of operation now almost universal in computer programming. The difficulty with this was that text, unlike a number, is not a simple entity; it has extension, and has many separate parts. Therefore, once the text from a call was in the accumulator, perhaps with segments of other text, it was difficult to specify the ways in which the various elements were to be combined. 
There were so many possibilities -- such as concatenate, interchange, substitute, segment, take a character -- that it seemed unlikely that a useful set of operations could be kept either small in number or of sufficient generality. An associated difficulty was the apparent need for a great variety of text markers to control the action of the various functions, i.e., to mark the beginning points, end points, insertion points, and the like. Also, the design of useful functions continued to give rise to instances where a special case with a special rule was needed for a proposed new function. This was intolerable. No good solution to these problems was found during the study of the single address mode of operation.

The breakthrough in dealing with text and calls (and therefore with all the functions) occurred as a consequence of a suggestion in the handling of input and output controls made by Deutsch. Up to the time of this suggestion, input and output was messy and generally non-uniform with the rest of the evolving system. As in most programming systems, it didn’t fit in. The suggestion was that we have two input-output functions, print string and read string, and that for basic input and output control these two functions be nested. Print string is represented by #(ps,X) where X represents any string of alphabetic characters. This function causes the printing out of the string represented by X. Read string is represented by #(rs), and causes the processor to "listen" to the typewriter. The value of the read string function is the string of characters typed in at the keyboard. For input and output control, these functions are nested into the expression called the idling program:

#(ps,#(rs))

As will be explained in greater detail later, such compound expressions are evaluated by working from the inside out, evaluating all functions. Thus the read string function is the first one to be evaluated. The idling program is the initial state of the processor. Then when a define string expression such as

#(ds,aa,The rat ate the cheese.)

is typed in, this expression replaces the #(rs) to give:

#(ps,#(ds,aa,The rat ate the cheese.))

The define string expression is next executed and causes the naming and recording of the text. At the end of the operation, only #(ps,) is left, because of the null value of the define string function, and nothing is left to type out. The processor is again restored to the initial state by automatic reloading of the idling program. Next, by typing in

#(cl,aa)

we will, step by step, have the following actions:

#(ps,#(cl,aa))
#(ps,The rat ate the cheese.)

and finally a typeout of the text:

The rat ate the cheese.

From this initial suggestion, it was possible to evolve a comprehensive philosophy of nested functions, and to recast all of the ideas of TRAC which had been developed to that date. The first thing was to throw out completely the notions of a "single-address-accumulator" form of organization. In its place, a thoroughgoing adoption of the notion of nested functional expressions was taken. All functional expressions have either a null value, or, if they have a value, their value is inserted in the place of the functional expression. Accordingly, there is no need for a temporary limbo with the name of "accumulator" to hold text in process. Perhaps more important, each functional expression serves as a syntactic marker for showing where its text value is to be inserted within some other expression. This is an exceedingly powerful and versatile device.
One should not be surprised at the power of using a functional expression as the marker for the location of insertion of a value, since this is exactly the technique developed over the centuries in mathematics for dealing with the most general kinds of combination of abstract and literal quantities. Also, since mathematical script is nothing but a standardized and printed version of handwritten signs, evolved and modified to suit the convenience of generations of mathematicians, again we should not be surprised that the technique makes for a convenient user’s script. The comprehensive treatment of nested functions in TRAC permits the building of a TRAC processor which has no "special cases" for the performance of its actions. At the beginning of any processing cycle, as the initialization step, the idling expression #(ps,#(rs)) is automatically loaded into the computational or scratch pad area of the processor. Text read in from the typewriter keyboard then appears in place of the #(rs) in the argument string of the print string function. If this text read in includes any functional statements, these are performed. At the end, if there is a value left behind from the functional statements, it is printed out by the print string function. The print string function itself has a null value, so the scratch pad is finally left empty. At this point, the processor re-initializes (resetting push down lists and registers, reloading the idling expression, but it does not erase the strings in storage or their names). The TRAC processor is now ready for further input.
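The processing cycle described above can be imitated with a toy model. The Python sketch below is an assumption-laden illustration, not the TRAC processor: the class name MiniTrac and its methods are hypothetical, only the define string (ds) and call (cl) functions are modeled, each typed input may contain at most one un-nested function, and unknown functions are assumed to have a null value.

```python
class MiniTrac:
    """Toy model of the idling cycle; one un-nested function per input."""

    def __init__(self):
        self.storage = {}   # name -> stored text (survives re-initialization)
        self.printed = []   # stands in for the typewriter output

    def evaluate_simple(self, text):
        # Plain text is its own value; only a single '#(...)' is handled.
        if not (text.startswith("#(") and text.endswith(")")):
            return text
        args = text[2:-1].split(",") + ["", ""]   # pad missing arguments
        if args[0] == "ds":        # define string: store text, null value
            self.storage[args[1]] = args[2]
            return ""
        if args[0] == "cl":        # call: the value is the stored text
            return self.storage.get(args[1], "")
        return ""                  # unknown function: null value (assumption)

    def idle_cycle(self, typed):
        # Conceptually #(ps,#(rs)) is loaded and `typed` replaces #(rs).
        value = self.evaluate_simple(typed)
        if value:                  # ps prints whatever value is left behind
            self.printed.append(value)
        # ps itself has a null value, so the scratch pad ends up empty.


trac = MiniTrac()
trac.idle_cycle("#(ds,aa,The rat ate the cheese.)")   # prints nothing
trac.idle_cycle("#(cl,aa)")
print(trac.printed)   # ['The rat ate the cheese.']
```

The two calls to idle_cycle correspond to two passes through the idling expression; the stored string persists between them, as the text requires.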

Implicit in the above statement is the important fact that all the computation in TRAC (i.e., all the evaluation of functions, all manipulation of strings) logically occurs within the argument string of the print string function. Since functions can be nested to any depth, by induction all computation in TRAC occurs within some argument string in the nest. To understand the operation of TRAC, one must therefore always think in terms of what is happening in an argument string. Moreover, since functions are embedded in argument strings, and since these functions themselves have argument strings, we have an unlimited ability to deal with higher and higher levels of discourse. If the problems of relating these levels of discourse can be completely solved, then we should presumably have a most powerful processor. It appears that in TRAC we have a simple and closed solution to these matters.

The evaluation of TRAC functions, and the relation between the different levels of the argument strings, is completely determined by the scanning and evaluation algorithm of TRAC. To a first approximation, the scan process operates from left to right, from the outside in, starting with the print string function of the idling expression. Before any function can be evaluated, its entire argument string must be converted from an "active form", containing possible unevaluated functions, to a "neutral form" in which all the evaluations permitted by the TRAC scanning algorithm have been completed. Thus, in an expression of nested functions, the first function actually to be evaluated will be the first inner function which contains no other function in its argument string.

Sometimes the value of a function will be a string again containing one or more functions. If so, there are two ways in which the scanning is resumed. These possibilities are determined by whether the function producing the new value was of type #( ) or of type ##( ). If the value string was produced by a #( ) function, the new value string must in turn be scanned and its functions evaluated, and so on, so long as #( ) functions continue to appear and to produce values. However, if the value was produced by a ##( ) function, the new value string -- whatever it may contain -- is treated thereafter as a neutral string, i.e., any functions it may contain (of any sort) are not evaluated.

As the active argument string is scanned, its characters are transferred one by one from the active to the neutral string. Thus an active string is depleted from its left-hand end, and the characters obtained are added to the right-hand end of the neutral string being built up. If a comma is encountered during the scan of an active string, it is deleted, and its location marks the end of one argument substring and the beginning of the next.
Thus if an argument string consists of only alphabetic characters and commas, the commas will completely determine the divisions of the text string into the various arguments of the function. By the scanning process, if a pair of closed parentheses not indicating a function is encountered in an argument string, the parentheses themselves are deleted, and the entire content between the matching pair of parentheses is transferred to the neutral string. This transfer includes any function expression, any commas, any number of closed paired parentheses to any degree of nesting, as well as any format characters such as carriage return. The processor keeps a strict parenthesis count to determine the location of the matching right-hand parenthesis. This means that any attempt to insert unpaired parentheses explicitly in an argument string will generally cause the scope of some argument to be in error, and will usually cause a serious error in the execution of a procedural statement. The inability explicitly to handle unpaired parentheses is not an absolute limitation in TRAC, since unpaired parentheses can be dealt with implicitly by use of the neutral function form of the call ##(cl, ). Since unpaired parentheses can in this manner be kept out of the active argument string of any function, they do not cause trouble.

We should note the close parallel between the single set of protecting parentheses ( ) and the ##( ) functions. The ( ) inside a function causes a transfer of the entire expression contained within to the neutral string of that function. The ##( ) causes a transfer of the entire string produced by the function to the neutral string of an enclosing function. TRAC appears to be the only processor which permits use of both the #( ) and the ##( ) kind of function evaluations. The string processor of McIlroy based upon the macro assembly system uses functions solely of the #( ) type. This kind of function, called the active function, is the one that seems to be the most used in writing TRAC procedures. The neutral function epitomized by ##( ) is used somewhat less frequently, mainly because it does not lead to an automatically iterated re-evaluation of the strings produced. LISP, for example, appears to be based exclusively upon the neutral function.

There is a final point about the TRAC scanning algorithm, which might at first seem to be trivial, but which is of considerable practical importance. The characters for "carriage return", "line feed", and "tabulate" (if available to the keyboard) are automatically deleted by the scan algorithm where they occur in an argument string, unless they occur within some protecting pair of parentheses. The reason for this is that a TRAC procedure may contain a very complex sequence of functions. It is then helpful for easy readability to write these functions in a column, rather than in a solid line across a page. By automatic deletion of the unprotected format characters, it is possible to format such a sequence of functions without having the format characters themselves turn up later (as values) in an unwanted fashion at the end of the evaluation.
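The scanning rules described above can be made concrete with a small model. The following Python sketch is an illustration only, not the PDP-1 implementation: the names `scan`, `call_primitive` and `matching_paren` are hypothetical, only the ds, cl and ps functions are modeled, and the stack of frames is just one convenient way to hold the neutral arguments of the functions currently being built up. It shows protecting parentheses, comma splitting, deletion of format characters, and the difference between #( ) and ##( ) values.

```python
FORMAT_CHARS = "\r\n\t"  # deleted by the scan unless protected by parentheses


def matching_paren(text):
    """Index of the ')' that pairs with the '(' at text[0]."""
    depth = 0
    for index, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("unpaired parenthesis in active string")


def call_primitive(args, forms, printed):
    """Evaluate one completed function; only ds, cl and ps are sketched."""
    name, rest = args[0], args[1:] + ["", ""]  # missing arguments read as null
    if name == "ds":
        forms[rest[0]] = rest[1]
        return ""
    if name == "cl":
        return forms.get(rest[0], "")
    if name == "ps":
        printed.append(rest[0])
        return ""
    return ""  # assumption: unknown functions are given a null value


def scan(active, forms, printed):
    """Scan `active` left to right; return the neutral text left at top level."""
    frames = [("top", [""])]  # each frame: (kind, list of neutral arguments)
    while active:
        if active.startswith("##("):
            frames.append(("neutral", [""]))
            active = active[3:]
        elif active.startswith("#("):
            frames.append(("active", [""]))
            active = active[2:]
        elif active[0] == "(":
            end = matching_paren(active)
            frames[-1][1][-1] += active[1:end]  # protected text, parentheses dropped
            active = active[end + 1:]
        elif active[0] in FORMAT_CHARS:
            active = active[1:]
        elif active[0] == "," and len(frames) > 1:
            frames[-1][1].append("")  # comma deleted; the next argument begins
            active = active[1:]
        elif active[0] == ")" and len(frames) > 1:
            kind, args = frames.pop()
            value = call_primitive(args, forms, printed)
            active = active[1:]
            if kind == "active":
                active = value + active  # #( ): the value is scanned again
            else:
                frames[-1][1][-1] += value  # ##( ): the value stays neutral
        else:
            frames[-1][1][-1] += active[0]
            active = active[1:]
    return frames[0][1][0]
```

In this model, after `scan("#(ds,bb,(#(cl,aa)))", forms, printed)` the form bb holds the unevaluated text `#(cl,aa)`. Calling it with `#(cl,bb)` rescans that text and yields the content of aa, while `##(cl,bb)` yields the literal string `#(cl,aa)`.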

We now resume our consideration of the TRAC primitive functions, with a few examples of their effect. It should be stressed that the major contribution of TRAC does not lie in the particular set of primitive functions that were chosen. While they are a very useful and convenient set for the work contemplated, for other work another set might be more appropriate. The important contribution of TRAC lies in the thoroughgoing viewpoint of functions nested within argument strings, the distinctions between the manner of use of #( ) and ##( ), and the scheme by which this is all tied together. In other words, the major contribution lies in the area of the TRAC scanning algorithm. Given the TRAC algorithm as a substrate, it is possible to consider the employment of a variety of useful primitive functions. As two examples, functions might be used which were more appropriate to a dichotomous list-structured data base, or functions might be used to implement the COMIT or SNOBOL type of scanning and replacement philosophy. In systematic review of the TRAC primitive function set, we begin with the print string function (8). Print string #(ps,X) is a primitive function of two arguments, namely the two-letter mnemonic ps and the single neutral string represented by X. If the scope of any print string function has additional neutral strings, such as in #(ps,X1,X2,X3), only the first neutral string X will be used, and the rest will be ignored and will disappear from the computation. It has been found, in this and other functions, that this feature provides a very handy way to get rid of things. The principle operates for all functions which are defined for a specified number of argument strings. The print string function has a null value. Note that the #( ) versus ##( ) distinction makes no difference in the effect of the print string function, since print string has no value.

The read string function #(rs) has the value consisting of the string of characters typed at the keyboard up to the first occurrence of a specified meta character. Here we will use the apostrophe ’ as the meta character. This character has a meaning that can be read "end of text". This meta character can be readily changed, as will be explained later. The meta character is removed from the input string for all read-in done by #(rs). A variation of the read string function is the read character function #(rc) which will read in one character. It will accept and transmit any character, including the end-of-text meta character, or even a single, unpaired parenthesis.

Note that by use of ## or # it is possible to control the disposition, at the next level, of the characters read by either the read string or the read character function. Consider the two TRAC expressions: #(ds,A,#(rc))’X and #(ds,B,##(rc))’X These will cause the recording of various things, depending upon what is typed in for the symbol X. The second expression will cause whatever character was typed in to be stored, whether the character was an alphabetic character, an apostrophe, a comma, a right or left parenthesis, or a number sign. However, because a #( ) function permits a re-scan of the value emitted, the first expression will not permit storage in certain of these cases. In particular, the form A will be null (empty) when X has the value of comma, or either of the parentheses, but it will store any other character.

Consider the case of the read string function. The TRAC expression #(ds,C,#(rs))’X in which X may contain a function, or a procedure of several functions, will cause the processor to try to execute immediately any functions which were read in, before the text is sent off to storage.
On the other hand, the expression #(ds,D,##(rs))’X will read in and record "verbatim" absolutely any string of characters (including unpaired parentheses, commas, functions, or anything else) up to the point at which the first apostrophe is typed in (which is deleted). No evaluation is performed before recording. The end-of-read-in meta character can be changed with the function change meta #(cm,X). Execution of this function will cause the first character of the neutral string X to become the new meta character. By use of this function, we can sidestep any limitation due to a fixed end-of-text meta character.

TRAC must have the ability to locate characters and substrings within strings, and to insert new text in such locations. At an early design stage, a primitive insert string function was considered for performing such an operation. Such a function would have the form #(is,N,X,Y) where N stands for the name of the form to be acted upon, X represents the literal value of the substring to be located in the string with name N, and Y represents the value of the string to be inserted as a replacement.

TRAC performs this task in another way, gaining some generality by separating the operation into two steps. It uses the segment string operation represented by #(ss,N,X1,X2, ... ,Xn) where N represents the name of the text which is acted upon, and the Xi represent neutral strings. In operation, the string named N is taken from the store. It is first scanned from left to right in search of a match with string X1. Wherever a match with X1 is found, the matching characters are deleted from the string, and the place is marked with a segment gap indicator of ordinal value one. This marker could be a pointer. However, in the current TRAC implementation, the marker is a special character outside the range of the input codes. The scanner now resumes with X1, finding and marking other matching portions of text, deleting and marking these locations in turn with the segment gap indicator of ordinal value one. When the matching operation with string X1 is completed, the processor scans the modified text (with deletions and markers) and looks for a match with the string X2. Where it finds a match, it makes a deletion and inserts markers of ordinal value two. And so on, until the string named N has been scanned for matches with all the strings enumerated in the segment string function. The marked string is now replaced in the TRAC memory under the name N. Such a marked string in storage is called a form, and in general any text in storage is called a form. The segment string function has null value.

A segmented form now can be brought forth, with insertions of new text in the segment gaps, by means of the call function, which in its full form is represented by #(cl,N,Y1,Y2, ... ,Yn) where N again stands for the name of the form called forth, and the neutral strings Y1, Y2, ... ,Yn are the new strings to be inserted in the locations marked by the corresponding segment gaps.
Observe that the define string and the segment string function together effectively create a macro definition, in which the Xi’s are the dummy variables in the macro definition. The macro call is then performed by the TRAC call, with the parameters of the call being the Yi’s. An important point to observe in connection with the TRAC segment string function is that, in comparison, McIlroy’s macro assembler and LISP can act only upon atoms (in the terminology of LISP). By this is meant that each dummy variable Xi has to be an indivisible symbol (like the symbols in assembly language programming) and each symbol must match exactly some other symbol set off by commas, i.e., another atom. Matches are not permitted in the middle of the text or in arbitrary character strings. Thus both these systems are quite limited in their ability to deal with text in general. Quite strictly, they are symbol manipulators, rather than being text manipulators, as is TRAC. They cannot concatenate two atoms, with the removal of commas between; neither can they divide atoms. Since atoms don’t exist for TRAC, both of these are trivial operations for it.

The main structure of TRAC -- epitomized by the six primitive functions ps, rs, rc, ds, ss, and cl -- has now been covered. Examples of the use of these functions will be found in the appendix. The remaining TRAC primitive functions will now be reviewed.

Branching is performed in a number of different ways. The most obvious is according to string equality. This is done by the equals function which is represented by #(eq,X,Y,T,F) where each of X, Y, T, and F stands for a string. If the string X is the same as string Y, the value of the function is the string T; otherwise its value is the string F. Note that strings T and F may contain either text or function; thus it is possible to branch to a new subroutine. Numerical comparisons are made by the greater than function #(gr,X,Y,T,F). This has the value T or F according to whether X is numerically equal to or greater than Y, or not.
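The two-step segment string and call mechanism described earlier in this section can be modeled in a few lines. The Python sketch below is an illustration under stated assumptions: the function names `segment_string` and `call` are hypothetical, the gap markers are represented by Unicode private-use characters (standing in for the "special character outside the range of the input codes"), and a gap with no corresponding filler is assumed to receive a null string.

```python
GAP_BASE = 0xE000  # assumption: private-use code points serve as gap markers
MAX_GAPS = 255     # assumption: an arbitrary limit for this sketch


def gap(ordinal):
    """The marker character for the segment gap of the given ordinal value."""
    return chr(GAP_BASE + ordinal)


def segment_string(forms, name, *patterns):
    """Model of #(ss,N,X1,...,Xn): replace matches of Xi by gap marker i."""
    text = forms.get(name, "")
    for ordinal, pattern in enumerate(patterns, start=1):
        if pattern:  # a null pattern is skipped in this sketch
            # str.replace scans left to right and never matches across a
            # marker, because the marker is not an ordinary input character.
            text = text.replace(pattern, gap(ordinal))
    forms[name] = text
    return ""  # segment string has null value


def call(forms, name, *fillers):
    """Model of #(cl,N,Y1,...,Yn): fill each gap i with the string Yi."""
    pieces = []
    for ch in forms.get(name, ""):
        ordinal = ord(ch) - GAP_BASE
        if 1 <= ordinal <= MAX_GAPS:
            pieces.append(fillers[ordinal - 1] if ordinal <= len(fillers) else "")
        else:
            pieces.append(ch)
    return "".join(pieces)
```

Used as a macro, a form holding "The rat ate the cheese." can be segmented on "rat" and "cheese" and then called with "cat" and "fish" to give "The cat ate the fish."; the stored form itself is unchanged by the call.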

Parts of forms can be read from storage, with branching if the form is empty. Each application of the call character function #(cc,N,F) results in reading out the next character from the form whose name is represented by the symbol N. As each character is read out, a pointer is moved ahead one place. The character read out is the value of this function. When the pointer has reached the right end of the form, the value of the function is F. Again, the string F may contain a call for a new procedure or subroutine. The closely related call segment function #(cs,N,F) calls out the text segments, one by one, which were formed by the operation of the segment string function. Again when the segments are exhausted, the function has the value of the string F. The function call n characters #(cn,N,n,F) has as its value the next n characters of the form named N (segment gaps being deleted during the call), and has the value F when the form is empty.

The function initial #(in,N,X,F) looks for the first occurrence of the neutral string X in the form with name N. The value of this function is then the head end, or the initial part, of the form N, up to the point where the string X finds a match. The matching characters of the form N are now skipped over, and the read pointer is now moved up to the first character of the remaining form. When this function is applied again to the same form, it will get another initial string from the remaining part of the form, and so on. When the form is empty, the function has the value F.

The calls with mnemonics cc, cs, in, and cn all modify the same pointer, which is attached to the recorded form. When the location of the pointer in a form has been changed by cc, cs, in, or cn, a subsequent call by cl will have a value which is the text that follows the pointer. The pointer can be restored to the head of the form by the call restore function #(cr,N) which has null value.
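The shared read pointer can be pictured with a small model. The Python sketch below is illustrative only; the class `Form` and its method names are hypothetical, a single gap marker is used for simplicity, and several details that the text leaves open are filled by assumption and marked as such in the comments (for example, the value of initial when no match is found).

```python
GAP = "\ue001"  # assumption: one private-use character stands for any segment gap


class Form:
    """A stored string together with the read pointer attached to it."""

    def __init__(self, text):
        self.text = text
        self.pointer = 0

    def _empty(self):
        return self.pointer >= len(self.text)

    def call_character(self, default):  # model of #(cc,N,F)
        if self._empty():
            return default
        ch = self.text[self.pointer]
        self.pointer += 1
        return ch

    def call_segment(self, default):  # model of #(cs,N,F)
        if self._empty():
            return default
        end = self.text.find(GAP, self.pointer)
        if end == -1:  # last segment: read to the end of the form
            segment, self.pointer = self.text[self.pointer:], len(self.text)
        else:          # read up to the gap and step over the marker
            segment, self.pointer = self.text[self.pointer:end], end + 1
        return segment

    def call_n(self, n, default):  # model of #(cn,N,n,F)
        if self._empty():
            return default
        chars = []
        while not self._empty() and len(chars) < n:
            ch = self.text[self.pointer]
            self.pointer += 1
            if ch != GAP:  # segment gaps are deleted during the call
                chars.append(ch)
        return "".join(chars)

    def initial(self, pattern, default):  # model of #(in,N,X,F)
        if self._empty():
            return default
        hit = self.text.find(pattern, self.pointer)
        if hit == -1:
            return default  # assumption: no match is treated like an empty form
        head = self.text[self.pointer:hit]
        self.pointer = hit + len(pattern)  # skip over the matching characters
        return head

    def call(self):  # model of #(cl,N) with no fillers: text after the pointer
        return self.text[self.pointer:]

    def restore(self):  # model of #(cr,N)
        self.pointer = 0
        return ""
```

Repeated application of `initial(" ", ...)` to a sentence reads it out word by word, which is the kind of loop the F argument is meant to terminate.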
Another technical point, but one of considerable practical importance, is that the functions cc, cs, in, and cn have their values as read out from the called form controlled by the #, ## distinction, while the function value for the empty form (i.e., from the string F) is always treated according to the # rule. This permits one to read out strings into the neutral argument string but, when the form is empty, to execute a call buried in F and so transfer to a new procedure.

It was not intended that TRAC have any great power in numerical computation, but an ability with integer arithmetic was considered essential in order to handle arrays. Thus TRAC was given a capability for integer arithmetic, with the operations addition, subtraction, multiplication, and division, having the respective mnemonics ad, su, ml, and dv. The addition function is typical of these functions: #(ad,D1,D2). The arithmetic functions take decimal arguments, and the value is the decimal result. The TRAC memory stores the numbers in their decimal ASCII representation, but the computer arithmetic in the present implementation is in binary.

TRAC can deal with patterns of zeroes and ones, i.e., with Boolean vectors. The patterns are represented by a sequence of octal digits. The functions available are Boolean union, intersection, complement, shift, and rotate. Shift and rotate are to the left with the number of binary places being given by a decimal number. These functions are represented, respectively, by #(bu,O1,O2), #(bi,O1,O2), #(bc,O1), #(bs,D,O1), and #(br,D,O1) where the symbols O1 and O2 stand for an octal value, and D stands for a decimal number.

TRAC would be of little practical interest if it could not be used to manage the transfer of content to and from some large-scale back-up storage. In the present time-shared environment, the back-up store is a UNIVAC Fastrand drum. The TRAC function store block #(sb,M,N1,N2, ... ,Nn) causes the forms whose names are represented by N1, N2, etc., to be taken from the TRAC processor storage and to be written onto the drum. These forms and their names are deleted from the TRAC storage and table of contents. At the same time, a new form is created, with the name represented by M and with the content being the drum address where the sequence of forms is stored. The length of the stored sequence is of no concern, since the drum is organized in chained 50-word blocks by the executive program of the time-shared system. The converse function is fetch block #(fb,M) which causes retrieval without erasing of the stored drum forms. With the fetch order, the forms and their names are again set up in the TRAC processor memory. It should be noted that pointer values to the text of a form are preserved in the drum store and fetch operations. In fetch, the form M is not erased. Drum space can be regained, if this is a problem, by the function erase block #(eb,M), which erases the drum content and also the TRAC form named M.

Diagnostic capabilities are very important for a processor dealing with complex procedures. The list names function #(ln,X) has as its value the names of all the active forms in the TRAC memory, with the string X (usually taken to be carriage return and line feed) inserted ahead of each of the names. Forms -- complete with an indication of the segment gaps -- can be inspected by the function print form #(pf,N) where N stands for the name of the form. Print form has null value. A step by step trace mode is activated by the function trace on #(tn). This has the result that as each complete set of neutral strings for a function is built up, it is typed out for inspection. Then a touch to the carriage return key causes the function to be evaluated. The neutral strings of the next function are then found and typed out, and so on. This action is halted by the trace off function #(tf).

Control of additional peripheral units is gained by the select device function #(sd,X) in which X stands for the mnemonic of the device selected. The select device function modifies the next-following action of the read string function, the read character function, or the print string function, so that these functions perform input-output on the named peripheral. For example, #(sd,p)#(rs) causes operation of the paper tape reader.
Because many operations have null value, with no type out, it is desirable to indicate to the user that an action has been completed, for example that the paper tape punch has finished, or that a very complex transfer has been completed and that the processor is ready to receive more input. This is accomplished by having the reinitialization procedure type out a carriage return and a line feed just before the idling program is called into action.
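Returning briefly to the Boolean vector functions bu, bi, bc, bs, and br reviewed earlier, their effect on octal strings can be sketched as follows. This Python model rests on assumptions the paper does not settle: the width of a vector is taken to be three bits per octal digit of the (longer) operand, bits shifted out on the left are dropped, and the result is written with the same number of octal digits. The function names simply reuse the TRAC mnemonics.

```python
def _to_octal(value, digits):
    """Write value as exactly `digits` octal digits, discarding higher bits."""
    mask = (1 << (3 * digits)) - 1
    return format(value & mask, "o").zfill(digits)


def bu(o1, o2):
    """Boolean union of two octal strings."""
    return _to_octal(int(o1, 8) | int(o2, 8), max(len(o1), len(o2)))


def bi(o1, o2):
    """Boolean intersection of two octal strings."""
    return _to_octal(int(o1, 8) & int(o2, 8), max(len(o1), len(o2)))


def bc(o1):
    """Boolean complement within the width of the operand."""
    return _to_octal(~int(o1, 8), len(o1))


def bs(d, o1):
    """Shift left by d binary places; bits leaving the left end are lost."""
    return _to_octal(int(o1, 8) << int(d), len(o1))


def br(d, o1):
    """Rotate left by d binary places within the width of the operand."""
    width = 3 * len(o1)
    places = int(d) % width
    value = int(o1, 8)
    return _to_octal((value << places) | (value >> (width - places)), len(o1))
```

Under these assumptions the complement of 17 is 60, and rotating 40 left by one place gives 01, since the leading one bit wraps around to the right-hand end.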

This paper has endeavored to furnish an insight into the reasons behind a number of design considerations of TRAC. To do so, it was necessary to describe briefly the TRAC language itself. However, a complete technical specification of the TRAC language, its algorithm, and its functions will be found in another paper to which the reader is referred. TRAC is currently implemented on the PDP-1 computer in the Hospital Computer Project at Bolt Beranek and Newman Inc. in Cambridge, Mass. This computer furnishes to each time-sharing user a memory store of 4,000 words of 18 bits each. The backup store is the Fastrand drum. There are separate swapping drums for the user’s programs. The present TRAC translator program, with about 35 primitive functions, occupies only slightly more than half of the 4,000-word memory. The scratch pad can hold about 3,700 characters. The internal storage of TRAC is not list-structured in the manner of LISP and related processors. Instead the content is stored in a linear fashion, with garbage collecting and rewriting to reform and compact the lists whenever more space is needed. Despite the speed penalties due to a highly compressed program and to a time-shared environment, TRAC is rapidly responsive. The usual responses occur within 1/3 to 3 seconds, and about one second when a Fastrand action is called for. Response times of this order will be required for fully satisfactory use of the reactive typewriter.

Although the action of the TRAC processor, in the evaluation of the nested functions, is highly recursive, the TRAC computer program is not -- in the sense that a machine language subroutine is never entered twice. This is a consequence of using the TRAC scanning algorithm first, and storing the results of the scanning operation in a push-down list prior to the evaluation of any specific nested function. The TRAC language has gone through some four major stages of evolution since the beginning of the project in 1960. It was first programmed in a list-structured version for the PDP-1 in the summer of 1964 after Deutsch joined the project. During the resulting extended period of development and testing, TRAC has shown a gratifying convergence to a stable system.

For their interest and encouragement during early development of TRAC, special thanks are due to Dr. Max A. Woodbury of the New York University Medical Center, and to Dr. Harold A. Wooster of the Information Sciences Branch of the Air Force Office of Scientific Research. Appreciation is also expressed to Dr. Jordan Baruch of Bolt Beranek and Newman Inc., Cambridge, Mass., for cooperation and permission to use their PDP-1 computer complex.

The bold text is typed by the user; all other text is the response given by the TRAC system.

#(ds,Store,( #(ds,#(ps, ( Give Name--))#(rs),#(ps,( Give Text--))##(rs) )))’
#(cl,Store)’
Give Name--Right Parenthesis’
Give Text--)’
#(cl,Right Parenthesis)’)
#(cl,Store)’
Give Name--’ (Null is here used as a name.)
Give Text--ABC’
#(cl,)’ABC
#(cl,Store)’
Give Name--1’
Give Text--(((A)))’
#(ss,1,##(cl,Right Parenthesis))’
##(cl,1,Q)’(((AQQQ
#(pf,1)’(((A1@1@1@ (1@ is the segment gap indicator.)
#(ln,*)’*Store*Right Parenthesis**1 (Note that one name is null.)

Project supported in part by the following grants and contracts: AF-AFOSR 376, AF-AFOSR 377, AF-AFOSR 461-64 from the Information Sciences Branch of the Air Force Office of Scientific Research; and GM 10416 from the Division of General Medicine of the National Institutes of Health, U.S. Public Health Service.

1. "Reckon," Oxford Universal Dictionary on Historical Principles, third ed., Clarendon Press, Oxford; 1955, offers the following definitions: "to enumerate serially or separately; to go over or through a series," "to count," "to ascertain the number or amount of," "to count up, also to sum up, to estimate the character of (a person)," "to include in a reckoning; hence to place or class," "to estimate, value . . take into consideration," "to consider, judge, or estimate by."
2. Mooers, C. N., "A Progress Report, The Reactive Typewriter Program," Comm. ACM, p. 48; Jan., 1963.
3. Advanced Research Projects Agency, Information Processing Techniques Div., Contract No. SD-295.
4. Following suggestion of McCullough, W. S., based upon terminology due to Peirce, C. S.
5. McIlroy, M. D., "Macro Instruction Extensions of Compiler Languages," Comm. ACM, pp. 214-220; April, 1960.
6. Eastwood, D. E., and McIlroy, M. D., "Macro Compiler Modification of SAP," unpublished memorandum, Bell Telephone Laboratories, Computation Laboratory; Sept. 3, 1959.
7. McIlroy, M. D., "Using SAP Macro Instructions to Manipulate Symbolic Expressions," unpublished memorandum, Bell Telephone Laboratories, Computation Laboratory; early 1960.
8. Mooers, C. N., "TRAC, A Procedure Defining and Executing System," circulated in preliminary form as Rockford Research Memo No. V-157; June, 1964; in revision for publication.

© Copyright 2000 by the TRAC Foundation, Inc.