February 15, 1988 {This document may be duplicated and distributed to others except as noted.

To contribute a document and/or to obtain copies of other ANSI X3V1.8M Music Information Processing Standards Committee documents, contact: X3V1.8M Secretariat, c/o Craig R. Harris, The Computer Music Association, P.O. Box 1634, San Francisco, California 94101-1634 USA.} X3V1.8M/SD-6 Journal of Development, Part One: Standard Music Description Language -- Objectives and Methodology -Editors: Charles F. Goldfarb IBM Research Steven R. Newcomb Florida State University 0. Introduction NOTE -- The Journal of Development is maintained in two parts only to facilitate maintenance by separate individuals; the two parts should always be read as a single document. There is much in Part Two, for example, that may seem confusing or contentious if it is not read in the context established by Part One. NOTE -- This introduction appears in both parts of the Journal of Development. 0.1. Purpose of the Document The Journal of Development describes the status of the Standard Music Description Language (SMDL) being developed by ANSI X3V1.8M, the Music Information Processing Standards (MIPS) Committee. NOTE -- General information about the MIPS committee, including a guide to participation, can be found in committee document X3V1.8M/SD-0. The Journal is in two parts: Part One describes the objectives of the project and the development methodology employed. Part Two describes the language design itself.

- 2 0.2. Development Methodology Both parts are revised by their respective editors after each meeting of the committee. As a result, the documents never represent text that has been agreed in detail by the committee, but only the editors' best efforts to express the committee's ideas. Moreover, the ideas in the journal are subject to further study and revision and do not represent a final design. Eventually, the design work will reach a point where all aspects of the language have been addressed, although not necessarily finalized. At that point, the Journal of Development will cease to be the vehicle that expresses the current language design. Instead, the committee will produce one or more successive "working drafts," consisting of text which represents the consensus of the committee. During the Journal of Development and working draft stages, public comment is sought and considered, but the process is informal. Eventually, when the committee is satisfied with a working draft, it will recommend that X3V1.8 process the document as a "draft proposal American National Standard." There will then commence a formal public review and ballot, during which the contributor of each comment will receive a written reply. 0.3. Editorial Conventions Formal standards can be complex documents in which every word has both legal and technical significance. Standards documents may also need to be translated into other languages. For these reasons, editorial conventions have been established to assure precision, accuracy, and clarity (albeit often at the expense of readability by the general public). The key principles are: (1) (2) Precise and consistent definitions of terms. Distinguishing real requirements from mere commentary, explanations, and examples -- and from definitions. Avoidance of redundancy. (Repetition of a requirement is normally a comment, to avoid the question of which text governs if the "repetition" is imperfect.)


Part Two of the Journal of Development observes some of the editorial conventions of a formal standard, but not

yet with the strictness and consistency that will be required in the final document. (See annex B of Part Two for details.)

- 3 1. Requirements for a Standard Music Description Language (SMDL). The SMDL is being developed to meet the requirements described in this clause. 1.1 General Needs 1.1.1 Book Publishing Publishers need a way of representing musical examples within a document (e.g. a music textbook), so that no additional typesetting or formatting cost is incurred, and no paste-ups need be done when either the text or music portions of the document are edited. 1.1.2 Business Presentations Makers of computer-mediated business presentations need to integrate music into their productions, and their productions need to be portable. Those who create business presentations, especially those who create business presentations of the kind that are now commonly done with a PC and a video projector, want to incorporate music in such presentations, and they want to be in a position to have the music reformatted (i.e., rearranged) for different performing resources "on the fly." The business of business presentations is a large one and it can be expected to generate considerable demand for computer music products, and, of course, for music itself. 1.1.3 Computer-assisted Instruction Computer assisted instruction which employs music as a reinforcer, or which actually teaches music, needs to be portable in order to maximize its marketability. The people who create the instruction need to be able to call upon databases of music written by other people who wrote or transcribed the music using perhaps incompatible hardware and/or software systems. 1.1.4 Electronic Information Distribution Electronic distributors of information (via videotex, etc.) need to be able to include music as part of their product mix. 1.1.5 Music Creation and Distribution

Composers, performers, and arrangers would be better able to exploit the market for their creativity, and their market would be better served and have a wider variety of product to choose from, given the existence of a lingua franca for music--a single representation which is able to encompass the kind of information which is available from printed music, as well as the kind of information (gesture, nuance) which performers add in any given performance. 1.1.6 Information Retrieval

- 4 Librarians and information retrieval specialists need a standard representation of music data bases, including the ability to identify musical works by themes as well as conventional bibliographic data. 1.1.7 Musical Analysis and Criticism Musicologists, reviewers, editors, and critics require the ability to annotate and analyze musical works, and to record their analyses in a manner that provides complete flexibility in their choice of analytical technique, as well as precision in indicating musical passages and phenomena. 1.2 Specific Assumptions Within the above broad categorization of application needs, specific requirements have been identified. 1.2.1 User Interface It is expected that the primary means of creating and revising SMDL documents will be with specialized music editors. However, it is also expected that direct access with "dumb" text editors will also be necessary, for example: 1. 2. By programmers who are developing or maintaining the specialized music editors. By users who have incorporated SMDL into larger documents for publication, and who must modify them in an environment where a music editor is not available.

1.2.2 Unique Representation The representation of a musical work must contain a "core" of information that can be encoded in a canoni-

cal form such that unambiguous comparisons can be made between works. In other words, there must be a defined portion of the representation that serves to distinguish a work from all other works. ***** section 1 TO BE COMPLETED: ***** ***** Contributions are solicited! ***** 2. The Role of SGML in the SMDL NOTE -- The SMDL is intended to be an SGML representation of music information. The nature of SGML is such that this objective does not restrict the SMDL design in any practical way. The purpose of this clause is to explain why that is so. 2.1 Background

- 5 The Standard Generalized Markup Language (SGML) is an internationally standardized language for document description, published as International Standard ISO 8879 : 1986. SGML has been adopted by a broad variety of organizations for a diverse range of applications. -The Association of American Publishers has adopted SGML for use by authors submitting manuscripts to publishers, and it has published applications of the language for journals, books, articles, mathematical formulae, and complex tables. The U.S. Government, which is the world's largest publisher, is a major user of SGML, and it is in the process of formally adopting it as a Federal Information Processing Standard. Agencies using SGML range from the Internal Revenue Service, which uses it for tax form preparation, training manuals, and other publications, to the Defense Department. The latter has adopted SGML as a standard for documentation in its Computer Assisted Logistical Support program, a project that could see the expenditure of over a billion dollars on SGML-based documentation support in the next three years alone.



The IBM Corporation, on whose Generalized Markup Language (GML) the SGML is based, is the world's second largest publisher. It uses GML for over 90% of its publications. The Oxford University Press is using SGML to create an immense data base of the contents of the Oxford English Dictionary and its many supplements. It will be the base for the publication of the New Oxford English Dictionary and many specialized dictionaries, and it will eventually be available for online information retrieval.


Implementations of SGML for IBM and Macintosh personal computers, DEC minicomputers, and IBM mainframes among others, are already available, and more are under development. 2.2 Document Representation with SGML 2.2.1 Structure SGML allows a document to be described as a hierarchy of logical elements. For example, a "book" may be described as a sequence of "chapter" elements, each of which contains a "title" element followed by one or more "paragraph" elements.

- 6 The title of a chapter might appear as: <title>How Dorothy Returned to Oz</title> and the first paragraph might appear as: <p>When Dorothy returned to her room, there was a tiny cameo lying on her dresser. She picked it up, and it began to glow, while the tiny face on it seemed to come to life.</p> While this example may seem trivial, it illustrates the beauty of SGML: an SGML document need not contain any formatting instructions, but all the information about the document needed to format it automatically (by means of an application designed to do that) can be placed within the document itself. Having created a document expressed in SGML, the author or editor can instruct a formatting program that, for example, all

chapter titles are to be centered on new pages, one third of a page down, followed by a specified amount of blank space. Thus, if the document is reprinted in a journal or anthology with different formatting conventions, no one needs to edit the document itself, because a formatter can reformat it according to the desired publishing style. SGML documents can contain normal text characters, graphics, images, mathematical formulae, and other specialized notations. In the above example, the structure of this instance of a book (a very short one!) was: <book> <chapter> <title> <para> data ... data ... where "data" was those characters other than the markup tags -- the "real" text of the document. 2.2.2 Data Content Notations In our book, the data was a string of normal English characters interpreted in the usual way. But consider the following data: three over four This example could also be normal English text, but in a different context it could be interpreted as the equivalent of:

- 7 3/4 The interpretation of data characters in a specialized manner like that described is called a "data content notation." Data content notations are frequently used in SGML documents to describe the content of elements such as mathematical formulas and pictures. However, the example could also have been represented as an SGML structure that did not require a data content notation:

<fraction><numer>3<denom>4</fraction> Here, "fraction" is an element (like "title" or "p"); it contains a numerator ("numer") and a denominator ("denom"). In other words, the structure is: <fraction> <numer> <denom>

data ... data ...

And, just as in our book, the data need no special interpretation -- the formatting of "fraction" is what specifies that the "numer" should be displayed above the "denom". The coding schemes conventionally used for musical scores are essentially data content notations. In them, for example, the letters "a" through "g" might stand for notes of a particular pitch, while the digits "4" and "8" might represent durations. By using such a notation, an SGML document like our book could contain a "song" element within (say) a chapter: <book><chapter><title>Some Obscure Songs</title> <para>Here is a really obscure song:</para> <song notation="DARMTRAN">4EDCDEEER</song> Here, the "notation" attribute on the tag that introduces the "song" element tells us that the data content of the element is interpretable by the "DARMTRAN" notation. The formatting program could call a "DARMTRAN" processor for that element in order to obtain the typeset music. 2.2.3 Cross-references Elements can have other attributes besides "notation."

- 8 One such attribute, called an "ID", allows a name to be assigned to an individual element. Relationships between other elements and that one can be expressed

with an "IDREF" (ID reference) attribute whose value is the ID of that one. In the following example, a paragraph contains a "songref" (song reference) element that refers to the song whose ID is "song1": <para>I am referring to <songref idref="song1"></songref> when I speak of true obscurity.</para> <song id="song1" notation="DARMTRAN" >T"Obscure Song"CDEFEDC</song> The "songref" element has no data of its own; presumably the formatting application will generate an automatic reference based on material that is decoded by the data content notation processor. (There seems to be a title hidden in there, but only a DARMTRAN processor would know for sure!) An alternative technique might be to define "song" elements as containing a "title" and a "body", with only the body being in the data content notation: <song id="song1"><title>Obscure Song</title> <body notation="DARMTRAN">CDEFEDC</body></song> Now the formatting application can be smart enough on its own (without the DARMTRAN processor) to extract the content of the "title" element of the song and use it to generate data for the "songref" element! 2.2.4 Data Content Notation Encoding The data content notations in our examples were both character coded. An advantage of a character coded notation is that it can be typed at a non-graphics terminal and printed in the form of a listing by nongraphics printers. This can be particularly helpful to programmers who write the friendly "front-ends" and applications that create and process SMDL documents. However, SGML documents also have the ability to contain binary data content notations, e.g., photographs. To a software developer, this may appear, at first blush, to be the most appropriate for music representation. However, the distinction between the SMDL and a representation that is internal to an application should always be kept in mind. The SMDL representation will be for the purpose of allowing applications with DISSIMILAR internal representations to communicate with one another. A binary encoded notation will not necessarily be more convenient for a given application to handle than a character coded one. 2.3 An SGML Design Tradeoff

- 9 There is obviously a tradeoff that must be made here when designing an SGML application. A data content notation, because it is designed specifically to describe a certain kind of information, will likely require fewer characters to express the same thing as a general-purpose description language like SGML. (To avoid any misapprehension that SGML is unacceptably verbose, it should be noted that SGML has "markup minimization" features that, if used, would have substantially reduced the amount of markup in the previous examples.) On the other hand, the more detail about an element that is exposed at the SGML level, the greater the possibility of interaction with other parts of the document (such as cross-references). Another benefit of maximizing the use of SGML structure is that any textual information in the music could be handled in the same way as any other text, and there would be the least likelihood of conflict between the formatting conventions for the text outside the music portion of a document and the formatting conventions for the text and music inside the music portion. A document expressed in SGML may be visualized as a tree, in which only the leaves contain data. The flatter the tree structure, the more likely that a notation will be required to interpret that data. But whether the tree has one level or one hundred, it is still an SGML document. In creating SMDL as an application of SGML, therefore, a range of possibilities present themselves: 1. The bare minimum of SGML structure could be used: <symphony notation="SMDL">GGGEb</symphony> 2. SGML structure could be used for some levels, but not for all of them. For example, SGML could be used for the few highest level elements of the tree where it might be useful to have crossreferences, etc., and where there is little efficiency to be lost because there are so few instances of them: <symphony> <movement id="first" notation="SMDL">GG</movement> <movement id="second" notation="SMDL">GEb</movement> </symphony> 3. SGML structure could be used right down to the


- 10 The choice between the above alternatives cannot be made with certainty until the full set of information to be represented in SMDL has been identified. The final language will almost certainly be some mix of SGML and a data content notation, but some difficult design work will be needed to implement it. For one thing, we will have to design (or adapt) a data content notation. In the interim, we can easily express the set of information to be represented by using alternative #3, as it does not require us to invent a notation at this time. Once the full set of information is defined, we can devise a coding scheme (data content notation) for the leaves and appropriate levels of node, and leave the remainder to be represented with SGML markup. 3. Design Philosophy This clause describes the principles that are observed in formulating the SMDL design. 3.1 Role of a Description Language A description language (such as SMDL) is a method of expressing certain material that falls within a range that the language designers specify. A description language does not make any demands on the material other than that it be within its range, nor is there any dynamic aspect to a description language. English can be used to illustrate the concept of the range of a language. English is a language that is ideally suited for writing material such as this document. English also lends itself beautifully to poetry. Mathematics, on the other hand, can only poorly be expressed in English (calculus and algebra are far better), and music cannot usefully be represented at all. Clearly, some material is within the range addressed by English, and some is not. English imposes a certain structure (grammar, vocabulary, spelling, etc.) on its range of subject material, but does not restrain the content. 3.2 Terminology The terms for SMDL constructs are chosen with care, but some may be different from conventional music terminol-

ogy, in the following ways: 1. The term may be used in a more restricted (or more general) sense in the Standard than in common music parlance. The term may refer to an SMDL language construct that corresponds to, but is not identical to, a construct in music.


- 11 3. The term may refer to a construct from another discipline that is here being applied to music. The term "thread," for example, refers to a concept which does not have a counterpart in conventional music terminology, but it is a metaphor like the one used when speaking of the "thread" of a story line or argument.

The terminology, like everything else in SMDL, is subject to review and revision, but for now we need "handles" for various concepts, and these are workable. 3.3 Efficiency It is intended that SMDL be able to express the bulk of the material it is intended to represent in an elegant and straightforward manner, with some thought given to efficiency as well. However, efficiency with respect to potential modification is much less a concern, since any given instance of a musical piece is static. The only criterion is whether the versions existing both before and after the change can be expressed correctly. Dynamic efficiency is more the concern of designers of the internal representations used by application software that will read and create SMDL documents. (It is especially easy for those of us who are software developers to fall into the trap of thinking like programmers rather than language designers.) 3.4 Redundancy and Consistency Our general approach is to avoid the possibility of conflicting information in an SMDL document, which is tantamount to avoiding redundancy. While it is recognized that internal redundancy can serve as a vehicle for error-checking, our belief is that it is the

responsibility of the originator of an SMDL document to assure that it is error-free and conforms to the standard. A non-redundant design assures that the document is internally consistent, and therefore processable, even if it does not correctly express the intention of the originator. 3.5 Economy of Constructs We intend that the final language design be elegant, in the sense of having no more constructs than are needed to describe the full range of subject material. During the design process, however, we prefer to err on the side of defining too many constructs, rather than too few, so that distinctions can easily be made as we gain better understanding of the differences between apparently similar things. We will, of course, remove any duplications when finalizing the design, but

- 12 premature optimization might cause us to overlook important differences. #