Robert D. Cameron
School of Computing Science
Simon Fraser University
Contents
Abstract
I. Introduction
II. Requirements
Requirement #1: Unambiguous Article Identification
Requirement #2: Canonical USINs
Requirement #3: Identification of Secondary Serial Components
Requirement #4: Scholar-Friendliness
Requirement #4.1: No Required Redundancy
Requirement #4.2: Standard Mnemonics
Requirement #4.3: Publication Numbering
Requirement #4.4: Standard Numbering Syntax
Requirement #4.5: Brevity of Article Identification
Requirement #4.6: Ease of Construction and Analysis
Requirement #4.7: Media Independent Specification
Requirement #4.8: Embedding USINs in Context
Requirement #5: Permanence of USIN Designation
Requirement #6: Accommodating Serial Evolution
III. Global Naming of Serial Publications
Hierarchical Naming Using the DNS Model
Three Initial Domains
Evolution of the USIN System: Towards Scholar-Friendly Names
IV. Hierarchical Identification of Serial Items
Example: Journal Article Citation
Multiple Articles Per Page
Unpaginated E-Journals
A General Model for Identification by Hierarchical Numbering
Scope
Scope-Dependent Numbering
Syntactic Representation
Parallel Numbering Hierarchies
Chronology
Further Work: Hierarchical Numbering Theory
Additional Design Ideas for Hierarchical Numbering
Syntax for Holdings Description
Secondary Component Notation
The Reference Notation
Hyphenation Notation
V. USIN Support Technology
USIN Global Registry
SDL - Serials Definition Language
UPP: USIN Publication Protocol
SRP: Serial Registration Protocol
PDP: Publication Domain Protocol
USIN Global Database System
UIP - USIN Inquiry Protocol
Bibliographic Retrieval and Formatting
USINs, the USIN Global Database and Literature Research
VI. Conclusion
References
Abstract
The Universal Serial Item Name (USIN) scheme is proposed as a framework for a
single global namespace of articles and other contributions published in organized
serial collections. Requirements for USINs are analyzed with an emphasis on the
use of USINs in scholarly communication. A uniform naming model is described
based on the hierarchical naming of serial publications and the hierarchical
numbering of serial items. A number of concrete design ideas for USIN syntax are
presented. A USIN Global Registry and a USIN Global Database are proposed and
analyzed in terms of specific architectural features that interact to meet the
requirements of publishers, librarians and scholars. Applications of the USIN
concept to literature research, document retrieval, bibliography preparation and
addressing the "broken links" problem of the World-Wide Web are considered.
I. Introduction
The Universal Serial Item Name (USIN) scheme is proposed as a framework for a
single global namespace of articles and other contributions published in organized
serial collections. Although the initial focus is scholarly literature published in
journals, conference proceedings, technical reports and books, the scheme is
intended to accommodate extensions to include other types of serialized contributions
such as magazine articles, bills of a legislature, decisions of a court or minutes of
university committee meetings. The USIN is intended as a vehicle for
interoperability between various bibliographic citation applications, including
finding citations (literature research), retrieving citations (from on-line sources,
libraries or document delivery services), citation indexing, and citation formatting
(bibliography preparation). The USIN is also intended as one possible mechanism
for migrating the World-Wide Web away from dependence on Uniform Resource
Locators (URLs) [4] to a system meeting the requirements for Uniform Resource
Names (URNs) [23].
The USIN concept is related to the Serial Item and Contribution Identifier (SICI)
[19], the Publisher Item Identifier (PII) [1], and the Digital Object Identifier (DOI)
[11] schemes. However, the USIN approach is primarily concerned with the task of
document identification in human communication, particularly scholarly, technical
and legal communication, whereas the other schemes are more concerned with
document delivery, library processing and publisher perspectives. In particular, the
USIN should use mnemonic coding and be reproducible by ordinarily literate people
(authors, students, librarians, law clerks, and so on) without the need for specialized
coding knowledge and check-sum algorithms. The USIN system is also intended for
serialized material that is not or cannot be registered with an International Standard
Serial Number (ISSN); both SICI and PII rely on ISSNs for serial item
identification. Philosophically, the USIN concept is most closely related to the SICI
scheme in that they each identify documents with their publication in a particular
organized series. The PII and DOI schemes identify documents as items owned by
publishers, with numbers possibly assignable in advance of publication and
independent of publication numbering. Green and Bide [14] and Paskin [21]
provide good overviews of the various current approaches to identification of
published articles or other items.
In application to the World-Wide Web, the USIN concept has considerable promise
as a potential partial solution to the problem of "broken links" [5, 13]. In short, the
URLs that are presently used for hypertext links on the World-Wide Web are based
on "locations" that specify documents in terms of access protocols, port numbers,
directory paths, and filenames. For various reasons, all of these attributes of
document location are subject to change and web links frequently become broken as
a result. Many proposals to resolve this problem through the creation of some form
of Uniform Resource Name have been put forward, but none seem to have
progressed beyond the experimental stage [8, 9].
Applications of the USIN scheme to other areas such as legal citation and legal
research are also envisaged. However, these are at present beyond the scope of this
paper and are left as an area for future consideration.
In order to ensure that every article is denotable, a logical first step is to ensure that
every serial is denotable. Unfortunately, the existing international standard in serial
identification, the ISSN, has an insufficiently large denotation space. The ISSN
system is based on an eight-digit identifier with seven working digits and a check
digit. The upper limit on the number of serials that can be accommodated is
therefore 10 million. When contemplating a universal designation scheme for serial
items as fine-grained as the minutes of curriculum committee meetings of a
particular university department, it should become clear that the ISSN system as
presently constituted will not suffice.
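The capacity argument can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard ISSN modulus-11 check character with weights 8 down to 2; the function and constant names are illustrative and not part of any USIN proposal. Because the eighth character is fully determined by the seven working digits, it adds no capacity.

```python
ISSN_WORKING_DIGITS = 7
ISSN_CAPACITY = 10 ** ISSN_WORKING_DIGITS  # at most 10,000,000 distinct serials


def issn_check_digit(working_digits):
    """Return the check character implied by seven ISSN working digits."""
    if (len(working_digits) != ISSN_WORKING_DIGITS
            or not (working_digits.isascii() and working_digits.isdigit())):
        raise ValueError("expected exactly seven decimal digits")
    # Weights run 8, 7, ..., 2 from the leftmost working digit.
    total = sum(int(d) * w for d, w in zip(working_digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


print(ISSN_CAPACITY)                # 10000000
print(issn_check_digit("0378595"))  # 5, i.e. the full identifier 0378-5955
```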
Nevertheless, of the set of USINs that may legally denote an article, exactly one of
them should be specified as the canonical or preferred form. One use for canonical
forms is to make it easy to determine whether two different USINs denote the same
article: convert them both to canonical form and see if they are the same. For
example, if a user searches two distinct databases for articles of interest on a
particular topic and both databases return USINs in canonical form, then it is an
easy matter to filter out duplicate references to the same article because they are
represented by exactly the same string. A second important role for canonical forms
is to support indexing of information by USIN. By always associating information
with the canonical form of a USIN, it will be possible to retrieve that information
given any legal USIN form by first converting to the canonical form.
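The duplicate-filtering use of canonical forms might be sketched as follows. This is an illustration only: the alias table stands in for whatever conversion rules a real USIN system would define, and the names ALIASES, to_canonical and merge_results are hypothetical. The USIN strings use the illustrative syntax shown elsewhere in this paper.

```python
# Hypothetical alias table: alternative form -> canonical form.
ALIASES = {
    "S.ACM/TOPLAS:16(6)@1811": "S.ACM/TOPLAS:16@1811",
}


def to_canonical(usin):
    """Map a legal USIN to its canonical form (toy lookup, not a real rule)."""
    return ALIASES.get(usin, usin)


def merge_results(*result_sets):
    """Merge search results, keeping one entry per canonical USIN."""
    merged = {}
    for results in result_sets:
        for usin in results:
            # Identical canonical strings denote the same article.
            merged.setdefault(to_canonical(usin), None)
    return list(merged)


database_a = ["S.ACM/TOPLAS:16(6)@1811", "S.ACM.SIGPLAN/Notices:32(1)@66"]
database_b = ["S.ACM/TOPLAS:16@1811"]
print(merge_results(database_a, database_b))
```

The same dictionary keyed by canonical form could serve as the index described above: information is stored once under the canonical string and found from any legal form by converting first.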
It is neither possible nor desirable to define a priori the specific set of secondary serial
components that are identifiable in the USIN syntax. Instead the requirement
presented here is that the USIN scheme should accommodate specification of these
elements through an extensible syntax that can be coupled with a specification of
what elements exist on a serial-by-serial basis.
One might consider that there is a middle ground between accommodating the needs
of scholars and the needs of computer systems. However, the goal of establishing
USINs as names that will serve to denote published items over the long term should
be considered. From this viewpoint, apparent requirements that might derive from
the limitations of present-day computer systems (e.g., fixed-length fields, limited
storage capacity, etc.) should be avoided. There is little doubt that the processing
and storage capabilities of the computer systems that will be available in coming
decades will be vastly superior to those of their present-day counterparts.
Scholars will often need to write down USINs of interest or type them into their
computers. To minimize the tedium and the chance of error in these manual
processes, USINs should be designed to include only that information necessary to
clearly identify the cited work. Redundant forms that include additional information
may be allowed but must not be required. For example, for a journal that is
paginated by volume and that follows the convention of beginning each article on a
new page, it is sufficient to specify the journal, volume number and initial page
number to uniquely identify an article. In this case, a USIN specification must not
require the inclusion of additional information such as issue number, date or
complete page range.
One counterargument is that redundant information helps prevent errors, but one can
in turn counter that this approach to error control is obsolescent and inferior.
Historically, the requirement for redundant information at data-entry time is
designed to allow error detection at some future processing step. This is the basis for
three forms of redundancy in the existing SICI scheme for article identification:
chronology (date of publication), title codes and check digits. However, these
devices provide error detection without error correction. When an error is
encountered, there may be a considerable delay (e.g., days in interlibrary loan
applications) before the error can be corrected and processing resumed. Consider
instead an interactive process supported by a global network. When a scholar enters
a USIN, interactive software could immediately consult the global USIN database to
verify its correctness and to allow any necessary corrections or resolutions of
ambiguity. One existing model for this is the immediate feedback one receives when
entering an incorrect URL on the World-Wide Web (Web). In this way, an
interactive data entry process can both avoid the tedium of redundant data entry and
support a process of immediate error correction as well as detection. Construction of
such a global USIN system is probably feasible using the present-day technology of
Internet-connected computers; if not, it will certainly become feasible within a small
number of years.
The requirement for the use of publication numbering rules out the article
identification mechanisms contemplated by the PII and DOI schemes as a basis for
canonical USINs. Both of those schemes emphasize publisher-generated numbers
that may be different from the actual numbering on the published serial. This
requirement also rules out other reasonable schemes for unambiguous article
identification. For example, a scheme based on volume number and sequential
article number would be widely applicable as an unambiguous numbering scheme
for many journals. But scholars may be unable to easily determine the sequential
article number from either a printed copy of the article or a conventional
bibliographic citation. If publication numbering exists, it should be used.
One might also prefer to use publication chronology (e.g., dates, month-year
combinations) instead of publication numbering. In fact, chronology is a form of
numbering that happens also to be correlated with the passage of time. For some
types of publication, chronology may be the only numbering that exists and hence
must be used. In other cases, acceptable alternative USIN forms may be defined
based on chronology. However, chronology is generally more complex and involves
more identification pitfalls. For example, if (volume, page) identification generally
suffices for article identification in a particular journal, it may be the case that (year,
page) identification is inadequate for at least two reasons. First, the journal may
publish multiple volumes per year. Second, even if volumes are annual, they may
not correspond to calendar years; articles with the same starting page number in two
consecutive volumes could still end up being published in the same year. In other
cases, serial items may have duplicated and hence ambiguous chronology, for
example, when two technical reports are issued on the same date. There are also a
number of annoying coding problems for chronology. If numeric codes are used for
months, how do you code for month combinations or seasons? If nonnumeric
coding is used, should it be in English or the original language, and should
abbreviations be used? For all these reasons of potential ambiguity and complexity,
identification by simple publication numbering should be used in preference to
chronology.
From the scholar's point of view, the primary role and need for USINs is in
identification of articles. Identifying secondary serial components (volumes, issues,
special sections, abstracts, etc.) is a secondary issue of considerably less importance.
The requirement for scholar-friendliness then is that the syntax for article
identification not be complicated by codes to distinguish articles from other types of
component. Instead, where necessary, the syntax for secondary components should
include additional coding to indicate that a secondary component is being identified;
the absence of such coding should be taken to indicate an article identification.
It should be easy for scholars to construct and analyze USINs manually. Checksums
and other calculations should be avoided. Appropriate punctuation should be used to
avoid running numeric items together. For example, the code 20000229 used as the
SICI specification for February 29, 2000 violates this requirement. Arcane numeric
codes should also be avoided. Although numeric month codes 1 through 12 are
arguably acceptable, the SICI code 23 meaning "Fall" is not.
It is not uncommon to find a particular serial published in two or more formats, for
example, in HTML format on the Web and on paper. From the scholar's viewpoint,
it is usually the case that it is the content of the article, not the form of its
presentation, that matters. When there is no difference in content, the USIN
specification for articles should be fundamentally independent of publication
medium. This requirement does not preclude media specification from inclusion as
an optional element in a USIN syntax. However, the SICI convention of including
the medium format identifier (MFI) as standard practice would not satisfy the
requirement for USINs.
It may be the case that a publisher creates separate designations for different formats
of a serial, particularly when there may be significant differences in content. In this
case, the publication medium or format may be implicitly identified by the choice of
publication series designation. However, this does not represent a violation of
format independence of the USIN syntax itself.
The syntax of USINs should be designed to avoid confusions that can be created by
common notational features that may be expected in typical embeddings. In
particular, both formal and informal settings may embed USINs as notational
elements within structures delimited by parentheses, braces or similar bracketing
structures. To avoid confusion, USIN syntax should be constrained to allow
bracketing symbols only if they occur in matched pairs. For example, if a USIN X
is to be acceptable as a parameter in a BibTeX citation tag of the form \cite{X},
then any unmatched braces within X would surely cause confusion. It may be
worthwhile to avoid braces altogether because of their use in the TeX family of
document languages and similarly to avoid angle brackets ("<" and ">") because of
their use in HTML and SGML.
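The matched-pairs constraint is easy to check mechanically. A minimal sketch follows, assuming the four bracket families named above; the function name is hypothetical.

```python
CLOSERS = {")": "(", "]": "[", "}": "{", ">": "<"}
OPENERS = set(CLOSERS.values())


def brackets_balanced(text):
    """Return True if every bracket in text is properly matched and nested."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in CLOSERS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != CLOSERS[ch]:
                return False
    return not stack


print(brackets_balanced("S.ACM/TOPLAS:16(6)@1811"))  # True
print(brackets_balanced("S.ACM/TOPLAS:16(6@1811"))   # False
```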
When USINs are used as elements in ordinary discourse, they may often occur at
the end of a sentence or phrase. Punctuation (periods, commas, semicolons and so
on) added at this point should not be a source of confusion. The presence or absence
of whitespace (blanks, tabs or line breaks) after such a punctuation symbol may be
used to discriminate. That is, a period, comma or other punctuation may be used
within the USIN syntax only if it is immediately followed by a nonblank character.
Any of these punctuation marks followed by whitespace should always denote the
end of a sentence or phrase.
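Under this rule, separating a USIN from the sentence punctuation that follows it requires no knowledge of the USIN grammar. The sketch below assumes a particular set of punctuation characters and does not handle closing brackets or quotation marks; both choices are assumptions made for illustration.

```python
SENTENCE_PUNCTUATION = ".,;:!?"


def tokens_without_trailing_punctuation(text):
    """Split prose on whitespace and drop punctuation that ends each token.

    Punctuation followed by a nonblank character stays inside the token,
    so periods and colons within a USIN are preserved.
    """
    return [token.rstrip(SENTENCE_PUNCTUATION) for token in text.split()]


sentence = "The result appears in S.ACM/TOPLAS:16@1811. It was later extended."
print(tokens_without_trailing_punctuation(sentence)[4])  # S.ACM/TOPLAS:16@1811
```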
However, it need not be required that canonical USINs always remain canonical, at
least in the initial development of the USIN system. Initially, the canonical USIN
forms for many serials will include serial designation by ISSN. As globally unique
mnemonic designations for these serials are gradually registered and accepted, those
forms may become canonical. It may also be the case that changes in the canonical
form of serial numbering become desirable, particularly for those aspects of
numbering that are not directly reflected in publication numbering (for example,
position of an article on a page).
Two issues involving particular forms of change deserve special attention in the
development of USIN syntax. The first is that title changes should not necessarily
require changes in the USIN code for a serial. This is at odds with the ISSN
convention, which requires new ISSNs to be issued when there is any significant
change in title. However, in considering mnemonic abbreviations of serial titles,
various changes in title may be accommodated with the same mnemonic. If the
publisher and readers of a journal wish to retain a particular mnemonic by which
the journal is known, the USIN system should respect this. The second issue is that
the syntax for identifying components of a particular serial should be flexible and
changeable. For example, if a serial starts out with sequentially numbered issues, its
USIN syntax should nevertheless accommodate a later reorganization to number the
publication by volume. Similarly, if a traditional print journal identifies articles by
volume and page number, the USIN syntax should accommodate a later change to
an electronic format in which articles are identified by volume and article number.
The USIN scheme generates distinct identifiers for each separately published
version of an article. One possible view of this is that each of these identifiers is in
fact an alternative identifier of the same article, with one of them (presumably the
most recent) being the canonical form. However, this approach has several serious
problems. The first is that there is no good basis for saying when two versions of an
article should be treated as the same. How many insertions and/or deletions of text
may be accommodated? What about changes in title or authorship? It is difficult to
imagine any set of rules that could provide a satisfactory and implementable
decision procedure. It is also difficult to imagine any mechanism that could ensure
that publishers actually identify these equivalent versions so that the correct
mappings to canonical form can be made automatically. Beyond these concerns,
there is also a problem with such equivalences automatically being applied to
citations: changes in the content of an article between versions may render a citation
apparently irrelevant or incorrect. This should not be considered a failure on the part
of the citing author. In essence, it is a misrepresentation to map the author's citation
of a particular version to any other version than the author intended.
Philosophically, then, USINs are names for particular versions of articles, not names
for the more abstract notion of an article that maintains its identity through various
versions over time. Systems to support this more abstract notion, at least at the
coarse-grain level of publication versioning, might well be built on top of a USIN
system, using USINs to identify particular published versions of articles. Finer-
grained versioning concepts, such as those of Augment/NLS [12] or Xanadu [20],
might also make use of USINs to interoperate with conventional bibliographic
databases.
The sharp reader may notice an apparent contradiction between the USIN
requirements with respect to changes to serials and changes to articles. The USIN
requirement for serial codes does represent the more abstract notion of a serial
publication as it goes through various changes rather than the serial as it exists at a
single point in time. However, this distinction between the treatment of serial and
article identifications reflects a fundamental philosophical view. In this view, serials
are like timelines and articles are like points on those lines. The timeline may go
through the twists and turns of changes in publisher, title or numbering scheme and
still retain its identity. Each point on each line is a separate entity with a separate
identity. There may be relationships between points such as "version-of" and
"cites", but the separate identities of the points should be maintained in the USIN
approach.
Consider a USIN scheme that adopts the hierarchical naming idea of DNS, but with
a focus on naming serial publications and publishing organizations, not computer
resources. The distinction between naming publications and naming computer
resources is critical; the failure to make it may be one of the underlying problems of
the URN concept. Notations such as the following may be contemplated:
These examples are for illustrative purposes only; the actual development of a
domain structure and names for serials and their publishers requires a process of
international consultation and consensus.
In the USIN scheme, then, serial publications are given identifiers which must be
unique in the context of a particular publication domain. Thus d1.d2.d3 is
interpreted to specify a subdomain d3 within domain d1.d2, which is itself
hierarchically specified as a subdomain d2 within the global domain d1. In general,
domains will denote publishing organizations, administrative divisions of such
organizations or collectives for identifying organizations or publications.
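The left-to-right reading of d1.d2.d3 can be made concrete by listing the chain of enclosing domains. This is a sketch for unquoted components only; the function name is hypothetical.

```python
def enclosing_domains(domain):
    """List each domain on the path from the global domain down to domain."""
    parts = domain.split(".")
    if not all(parts):
        raise ValueError("empty domain component in %r" % domain)
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


print(enclosing_domains("d1.d2.d3"))  # ['d1', 'd1.d2', 'd1.d2.d3']
```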
The USIN syntax shown in this paper is intended to be illustrative rather than
prescriptive of the final form of USINs. Thus the choice of periods and slash marks
as separators is somewhat arbitrary. One could also argue that the distinction
between slash marks and periods is artificial, i.e., that S.ACM.TOPLAS would do as
well as S.ACM/TOPLAS. However, distinguished punctuation allows us to infer
directly from the form of a specification that S.ACM/TOPLAS is a serial publication
of the ACM, while S.ACM.SIGPLAN is an administrative division thereof. One could
also question the decision to reverse the right-to-left structuring of domains under
DNS; the reason for this is to use a consistent left-to-right hierarchical structuring
within all levels of the USIN notation. Lastly, the final syntax of domain,
subdomain and series identifiers is left as an area for further work. However,
allowance for case-sensitivity in such identifiers seems reasonable, e.g., CaS and
CAS could denote separate items.
Names assigned under the Internet's Domain Name System are the basis for the
third leg of the initial tripod supporting the USIN scheme. Whenever a DNS domain
name or host name is clearly associated with a particular publishing organization, it
may be used as a component of the RDNS (restricted DNS) domain of the USIN
scheme. For example, acm.org is a DNS domain identified with the Association for
Computing Machinery, so RDNS."acm.org"/TOPLAS denotes ACM TOPLAS.
Similarly, sfu.ca is a DNS domain for Simon Fraser University, so
RDNS."sfu.ca".CMPT/TR denotes the Technical Report series of the School of
Computing Science at SFU. In this last example, one might consider instead basing
the USIN specification on the cs.sfu.ca domain, that is, RDNS."cs.sfu.ca"/TR.
This form might be allowed, but the form based on the CMPT designation may be
preferred (canonical), because that designation has been specifically chosen by SFU
in a system of unambiguous codes for its departments.
The syntactic convention of enclosing a DNS name in double quotes when used as
an RDNS domain serves two purposes. First, it emphasizes that the hierarchical
structure of the DNS name plays no role in the interpretation of that name as an
RDNS subdomain. In essence, DNS names are being cited as atomic identifiers for
publishing organizations. Second, the quote marks delimit the scope of a DNS
name, within which the "." separator is understood not as a part of the USIN
syntax, but simply as a character in a quoted DNS name.
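A reader of the illustrative syntax therefore has to treat a quoted DNS name as a single component. One way to do so is sketched below; the regular expression and function names are assumptions for this example, not a proposed grammar.

```python
import re

# A component is either a quoted DNS name or a run of ordinary characters.
_COMPONENT = re.compile(r'"([^"]+)"|([^."/]+)')


def split_domain(domain):
    """Split a domain into components, keeping quoted DNS names atomic."""
    components = []
    pos = 0
    while True:
        match = _COMPONENT.match(domain, pos)
        if match is None:
            raise ValueError("malformed domain: %r" % domain)
        quoted, plain = match.groups()
        components.append(quoted if quoted is not None else plain)
        pos = match.end()
        if pos == len(domain):
            return components
        if domain[pos] != ".":
            raise ValueError("expected '.' at position %d" % pos)
        pos += 1


def split_designation(designation):
    """Return (domain components, series identifier or None)."""
    domain, slash, series = designation.rpartition("/")
    if not slash:
        return split_domain(designation), None
    return split_domain(domain), series


print(split_designation('RDNS."sfu.ca".CMPT/TR'))
```

Here the period inside "sfu.ca" never reaches the USIN-level splitting logic, which is the second purpose of the quoting convention.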
Unfortunately, there is no constraint within the DNS system that DNS domains are
permanently unique designations of organizations or their successors. Under DNS,
the essential requirement is that domains are unique at any particular point in time,
but it is quite conceivable that a naming authority at some level may reuse or
reassign a name. Furthermore, the association between DNS names and
organizations breaks down as one descends into the hierarchy of subdomains,
subsubdomains and so on. To avoid these problems, the USIN standardization
process could include the publication of a list of acceptable DNS names and their
associated organizations for use within the RDNS domain of the USIN scheme.
These designations should be permanent; the interpretation of a designation within
the RDNS domain should be derived from this list, even if that designation is later
reassigned to some other purpose within DNS itself. The intention of the list should
be to identify all and only those DNS domains that may be clearly identified with
publishing organizations.
The astute reader will note that designations such as RDNS."acm.org"/TOPLAS and
RDNS."sfu.ca".CMPT/TR seem unnecessarily awkward compared to the earlier
examples S.ACM/TOPLAS and CA.SFU.CMPT/TR. We should hope that forms such as
the latter ultimately become canonical under the USIN system. One might ask, then,
why not just skip the RDNS prefix, reverse the order of DNS domain names and
use those reversed names directly at the top-level of the USIN hierarchy in the
initial instance? The answer is that the top-level domain structure of the USIN
system should not be prematurely constrained. Once established for a particular use,
USIN designations are intended to be reserved permanently for that use. The RDNS
prefix allows existing DNS names to be used as a way of initializing the USIN
system, giving time for an orderly process of developing an internationally
acceptable top-level domain structure.
Within the RDNS domain for a particular publishing organization, the identification
of administrative divisions and publication series should use codes specified by that
organization. In many cases, clear coding schemes are already in place now. In the
important case of universities, a system of unambiguous mnemonic codes for the
academic departments is typically available in the university calendar. Codes to
denote a publication series of a university department (e.g., TR for Technical Report,
TN for Technical Note and so on) are often included on publication lists produced by
the department or may be found on the documents themselves. Wherever possible,
the use of existing naming schemes should be accommodated in this way, in order
to maximize the scholar-friendliness of USIN designations.
In order to ensure the robustness and permanence of USIN designations, one should
expect that certain adaptations and accommodations of historical naming schemes
will be required. Thus, the USIN system must include a method for describing
naming schemes and rules for maintaining consistency. In order to make the greatest
use of historical naming schemes, the rules should be designed to accommodate a
great deal of variability. Nevertheless, some modifications of historical naming
schemes should be expected in order to comply with USIN requirements.
The three initial domains ISSN, ISBN and RDNS provide a plausible initial basis
for unified, permanent and globally-unique designations of archivable serial, book
and institutional publications. There are undoubtedly many cases in which the
coding of USIN specifications will initially be unclear, especially in the case of
institutional publications. However, it is certainly a common practice for the serial
publications of an institution to be identified using a numbering scheme that serves
to unambiguously denote those publications in the local context of an institution. It
is certainly also the case that the vast majority of publishing institutions in the
industrialized world can now be identified by an appropriate DNS domain. These
conditions suggest that it is presently feasible to initiate a USIN system.
S.ACM/TOPLAS:16@1811
Assuming that S.ACM does become the code for the Association for
Computing Machinery in the global domain for scholarly societies, this is the
canonical USIN in the proposed syntax for the article "A Behavioral Notion
of Subtyping" by Barbara H. Liskov and Jeannette M. Wing appearing in
ACM Transactions on Programming Languages and Systems, volume 16,
number 6 (November 1994), pages 1811-1841.
S.ACM/TOPLAS:16(6)@1811
This is an acceptable alternative USIN for the same journal article, specifying
the issue number.
S.ACM.SIGPLAN/Notices:32(1)@66
This denotes the position paper entitled "Global Computation" by Luca
Cardelli, published in ACM SIGPLAN Notices, Volume 32, Number 1,
January 1997, pp. 66-68. In this case the issue number is required, because
pages are renumbered from 1 with each issue of SIGPLAN Notices.
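The three examples share the shape serial:volume(issue)@page, with the issue part optional. A reader for that shape might be sketched as follows; the pattern covers only these illustrative forms (numeric volume, issue and page) and is an assumption, not a definition of USIN syntax.

```python
import re
from typing import NamedTuple, Optional


class ArticleRef(NamedTuple):
    serial: str
    volume: int
    issue: Optional[int]
    page: int


_USIN = re.compile(
    r"(?P<serial>[^:]+):(?P<volume>\d+)(?:\((?P<issue>\d+)\))?@(?P<page>\d+)"
)


def parse_usin(text):
    """Parse the illustrative serial:volume(issue)@page form."""
    match = _USIN.fullmatch(text)
    if match is None:
        raise ValueError("not in the illustrative USIN form: %r" % text)
    issue = match.group("issue")
    return ArticleRef(
        serial=match.group("serial"),
        volume=int(match.group("volume")),
        issue=int(issue) if issue is not None else None,
        page=int(match.group("page")),
    )


print(parse_usin("S.ACM/TOPLAS:16@1811"))
print(parse_usin("S.ACM.SIGPLAN/Notices:32(1)@66"))
```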
Occasionally, one may find journals with more than one article starting on a
particular page. For example, these might be items of technical correspondence.
One solution to this problem of starting page ambiguity is to use sequential
denotations with lower case letters. For example, S.ACM/CACM:38(1)@43a and
S.ACM/CACM:38(1)@43b could respectively denote the two short articles "Women
and Computing in the UK" by Alison Adam and "Announcing a New Resource:
The WCAR List" by Laura L. Downey, both appearing on page 43 of
Communications of the ACM, volume 38, number 1 (January 1995).
There are three small problems with this scheme that may be quite rare but are
theoretically possible and should be addressed. The first is that there may potentially
be more than 26 articles on a page. However, the scheme easily extends so that
designations such as aa for the 27th article and aaa for the 703rd article may be
used. Second, there may be an ambiguity in determining the ordering of articles;
pages are two-dimensional while orderings are one-dimensional. The most scholar-
friendly way to resolve this is to follow the natural text ordering. For publications in
English and similar languages, this is column-major numbering: articles in column 1
always precede articles in column 2 and so on, while articles within columns are
numbered top to bottom. Finally, note that page numbers themselves might in some
cases include lower case letters. An example is preface material in a journal volume
numbered using lower case roman numerals. To handle this case, the USIN scheme
might specify that the underscore ("_") character can be used as a separator
between the page number and the item letter.
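The extension beyond 26 articles can be sketched by assuming the sequence runs a to z, then aa to zz, then aaa and so on (the ordering used for spreadsheet column labels). Under that assumed reading, aa is the 27th designation and aaa is the 703rd, since 26 single letters and 676 pairs precede it. The function names are hypothetical.

```python
def item_suffix(n):
    """Return the lower case designation for the n-th article on a page (n >= 1)."""
    if n < 1:
        raise ValueError("article positions start at 1")
    letters = []
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("a") + remainder))
    return "".join(reversed(letters))


def item_position(suffix):
    """Inverse of item_suffix: return the position denoted by a designation."""
    if not suffix:
        raise ValueError("empty designation")
    position = 0
    for ch in suffix:
        if not "a" <= ch <= "z":
            raise ValueError("designations use lower case letters only")
        position = position * 26 + (ord(ch) - ord("a") + 1)
    return position


print(item_suffix(27), item_suffix(703))  # aa aaa
```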
In practice, scholars will not want to learn the details of how to distinguish multiple
articles on a page until it becomes a problem. They may not even be aware of the
problem if they are entering a citation from its written form in a reference list. In
such a case, the user will likely omit the required lower case code when entering the
citation. Interactive USIN processing software should notify the user of the
ambiguity and query him or her for its resolution. Batch-oriented software could
return the set of all articles on the page and issue a warning report through an
appropriate message or log file.
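The two behaviours described above can be sketched as follows. The registry here is a hypothetical in-memory dictionary mapping a page-level USIN to the canonical USINs of the articles that start on that page; the function names and the choose callback are likewise assumptions made for illustration.

```python
import logging

logger = logging.getLogger("usin")


def articles_on_page(usin, registry):
    """Return the canonical USINs of every article starting on the cited page."""
    return list(registry.get(usin, []))


def resolve_batch(usin, registry):
    """Batch mode: return all candidates and log a warning if ambiguous."""
    matches = articles_on_page(usin, registry)
    if len(matches) > 1:
        logger.warning("%s is ambiguous: %d articles start on this page",
                       usin, len(matches))
    return matches


def resolve_interactive(usin, registry, choose):
    """Interactive mode: ask the user (via `choose`) only when needed."""
    matches = articles_on_page(usin, registry)
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]
    return choose(matches)
```

For the CACM example, a registry entry for S.ACM/CACM:38(1)@43 listing the codes ending in 43a and 43b would cause the batch resolver to return both and the interactive resolver to ask the user which was intended.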
Unpaginated E-Journals
When a journal is not printed on pages, one might expect that article identification
by page number is no longer appropriate. Although many electronic journals have in
fact retained page-oriented formatting and numbering, many others have chosen not
to do so. In particular, there is a growing trend to use the logical document markup
capabilities of SGML [7] and HTML in electronic journals. One advantage is that
formatting may be left to the reader's software; articles can be viewed and printed
in a variety of different formats (with a variety of different paginations) depending
on hardware capability and reader preference. In view of this, it seems reasonable to
expect that the trend towards unpaginated e-journals will continue.
Scope
One defining characteristic of the USIN hierarchical numbering model is that every
counter within every numbering tuple has a scope that defines the context of its
numbering. Issues of a journal are typically numbered from 1 within each volume;
they are said to have volume scope. Page numbers may have volume scope or issue
scope, depending on the particular serial. An "item-count" for distinguishing
multiple articles per page has page scope. The first, or principal, numbering
component of a serial is said to have global scope; it is numbered consecutively in
perpetuity.
Numbering scope is correlated with, but not synonymous with, hierarchical level.
For example, volume scope for page numbers is often used even when volumes are
divided into issues. Similarly, although issues are usually given volume scope when
volumes exist, they may sometimes be given global scope.
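One way to picture scope is as a reset rule: when a counter advances, every counter scoped to it starts over. The sketch below assumes a simple representation (a Counter record naming its scope, and a dictionary of current values) that is not drawn from the USIN proposal itself; a value of 0 means that nothing has yet been numbered in the new scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counter:
    name: str
    scope: str  # name of the enclosing counter, or "global"


def advance(values, counters, name):
    """Increment `name` and reset every counter scoped to it, transitively."""
    new = dict(values)
    new[name] = new.get(name, 0) + 1
    pending = [name]
    while pending:
        parent = pending.pop()
        for counter in counters:
            if counter.scope == parent:
                new[counter.name] = 0
                pending.append(counter.name)
    return new


# A journal whose pages have volume scope: a new issue does not reset pages.
volume_paged = [
    Counter("volume", "global"),
    Counter("issue", "volume"),
    Counter("page", "volume"),
]
```

With page numbers given issue scope instead (Counter("page", "issue")), advancing the issue counter would reset the page counter as well, which is the behaviour of a serial that renumbers pages from 1 in each issue.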
Scope-Dependent Numbering
Syntactic Representation
In general, the numbering scheme for every serial has a syntactic representation that
may be generated by mapping rules from the abstract representation as a
hierarchical numbering tuple. In the suggested standard journal article syntax, the
(volume, page, item-number) tuple of (12, 135, 2) maps to the syntactic
representation 12@135b. In general, each number in a hierarchical numbering tuple
is first mapped to a numeral in some encoding system, such as arabic numerals,
roman numerals or "alphabetic numerals" (a, b, c, ..., aa, ab, ...). Then a syntactic
string for the entire structure may be constructed by concatenation with appropriate
mnemonic operator symbols as punctuation. An essential goal of this process is that
the syntactic encoding be uniquely decodable. Operator symbols must be carefully
chosen both to have mnemonic value and to ensure unambiguous interpretation of
the syntactic forms. In principle, the order of appearance of numbering elements
may also be considered a design choice, but for simplicity and to avoid confusion it
may be desirable to enforce a strict left-to-right ordering of elements according to
the numbering hierarchy.
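The mapping between a (volume, page, item-number) tuple and its syntactic form can be sketched as a pair of inverse functions. This is only an illustration under simplifying assumptions: pages are arabic numerals, item numbers are limited to 1 through 26 (a single letter), and the names encode and decode are hypothetical.

```python
import re
from string import ascii_lowercase

_ARTICLE = re.compile(r"^(\d+)@(\d+)([a-z]?)$")


def encode(volume, page, item=None):
    """Map a (volume, page, item) tuple to a string such as '12@135b'."""
    text = f"{volume}@{page}"
    if item is not None:
        if not 1 <= item <= 26:
            raise ValueError("this sketch handles item numbers 1..26 only")
        text += ascii_lowercase[item - 1]
    return text


def decode(text):
    """Recover the tuple; the item is None when no letter is present."""
    match = _ARTICLE.match(text)
    if match is None:
        raise ValueError(f"not a volume@page form: {text!r}")
    volume, page, letter = match.groups()
    item = ascii_lowercase.index(letter) + 1 if letter else None
    return int(volume), int(page), item
```

The round trip decode(encode(12, 135, 2)) returns (12, 135, 2), which is the unique decodability property the text asks for. A fuller treatment would also need to handle roman-numeral pages and the underscore separator.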
Parallel Numbering Hierarchies
A fourth aspect of the hierarchical numbering model is that a serial may have
parallel numbering hierarchies for different purposes. In general, these hierarchies
have a common numbering prefix consisting of one or more of their uppermost
numbering levels, with divergence of numbering below these level(s). The simplest
example is that of the article-identification and issue-identification hierarchies of
journals that are paginated with volume scope. In this case, the (volume, page) and
(volume, issue) hierarchies may be considered parallel. In general, syntactic devices
are necessary to distinguish which hierarchy is intended in any particular coding; the
(volume, page) and (volume, issue) hierarchies are distinguished by the @ and ()
syntax notations given previously. Other examples of parallel numbering are given
in the later subsection on secondary component notation.
Chronology
Finally, chronology is the fifth general property associated with the hierarchical
numbering model for serials. Chronology is the association of a date and/or time of
publication with a particular serial numbering component. In general, chronology is
a fundamental aspect of serial publication and should be defined for all hierarchical
numbering components down to some level at which all further structure is
considered simultaneously published. For example, traditional print journals have
chronology specified to the issue level, while electronic journals may have
chronology specified to the article level. In general, chronology is scope-dependent;
for example, when a quarterly journal changes to a monthly one, the chronology
associated with issue 3 in each volume may change from "Fall" to "March".
Chronology may also be irregular and possibly out-of-sequence, that is, with
publication numbers assigned out of order of actual publication dates. Chronology
itself is also an instance of hierarchical numbering, for example, using (year, month,
day) 3-tuples or (year, season) 2-tuples.
Syntax for Holdings Description
Beyond article identification, the next most important application area for USINs
may be in the description of library holdings or document delivery service
coverage. A single volume or issue of a journal is simple to identify by including
numbering only to the desired level. For example, S.ACM/TOPLAS:16 denotes
volume 16 of TOPLAS, while S.ACM/TOPLAS:16(6) denotes issue 6 thereof. But
holdings are more often described as volume ranges. In cases where issues are
missing, subscriptions are cancelled and then reinstated, or miscellaneous holdings
have been received by donation, the holdings may be broken up into a list of
individually held items or ranges. To accommodate these requirements, it seems
reasonable to reserve the comma (",") to separate elements of a holdings list and the
double hyphen "--" to serve as a range operator.
Here, the serial code is specified only once. Commas separate individually held
items or ranges. The start and end of a range are indicated by enumeration to the
required level of specificity. An end range of "ff" indicates a continuing
subscription. As a syntactic constraint to aid in error detection, holdings should be
listed in strictly ascending order.
Only positive holdings data is shown, following the principle adopted by ANSI
Serials Holding Statements [2]. Determination of missing items can be made by
reference to either the USIN global database or an appropriate serial "definition"
(see the subsection on Serials Definition Language in the following section). For
example, using the knowledge that TOPLAS was quarterly during volume 10 tells us
that 10(2) and 10(4) are missing for these holdings while 10(5) is not (because it
does not exist).
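The holdings conventions lend themselves to mechanical checking. The sketch below parses a holdings statement for a serial numbered by volume and issue, treats "ff" as an open-ended range, and rejects lists that are not in strictly ascending order. The function names and the tuple representation are assumptions made for illustration, and the sketch assumes that the numbering part of the statement contains no colon.

```python
import re

_ITEM = re.compile(r"^(\d+)(?:\((\d+)\))?$")


def parse_item(text):
    """Parse '15' or '15(3)' into a (volume,) or (volume, issue) tuple."""
    match = _ITEM.match(text)
    if match is None:
        raise ValueError(f"bad holdings item: {text!r}")
    return tuple(int(g) for g in match.groups() if g is not None)


def not_after(start, prev_end):
    """True if `start` does not come strictly after `prev_end`."""
    n = min(len(start), len(prev_end))
    return start[:n] <= prev_end[:n]


def parse_holdings(spec):
    """Return (serial code, list of (start, end)); end is None for 'ff'."""
    serial, _, body = spec.rpartition(":")
    ranges = []
    for element in body.split(","):
        first, sep, last = element.partition("--")
        start = parse_item(first)
        if not sep:
            end = start            # a single held item
        elif last == "ff":
            end = None             # continuing subscription
        else:
            end = parse_item(last)
        if ranges:
            prev_end = ranges[-1][1]
            if prev_end is None or not_after(start, prev_end):
                raise ValueError(f"not in ascending order at {element!r}")
        ranges.append((start, end))
    return serial, ranges
```

Applied to S.ACM/TOPLAS:2--15(1),15(3),15(5)--17,20--ff, the parser yields four elements: a range from volume 2 to 15(1), the single issue 15(3), a range from 15(5) to volume 17, and an open range starting at volume 20.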
The conventions for serials holdings are intended to apply to serials with any form
of hierarchical numbering and to any level of specificity. One implication is that the
syntax of USINs generally must be structured to avoid conflicts with the "," and
"--" symbols of the holdings notation. Another implication is that coverage can be
specified to a finer level of detail. For example, a document delivery service may
wish to identify "scanned holdings" to the article level, that is the articles that have
already been scanned or digitized and are hence available for short-turnaround
delivery.
S.ACM/TOPLAS:16|index
The index of volume 16 of TOPLAS (found at the end of
S.ACM/TOPLAS:16(6)).
S.ACM/TOPLAS:16(6)|contents
The table of contents of volume 16, issue 6 of TOPLAS.
S.ACM/TOPLAS:16@1811|abstract
The abstract of an example TOPLAS article.
S.ACM/TOPLAS:16@1811|sec(4.1)
Subsection 4.1 in the example article, entitled "Type Specifications".
S.ACM/TOPLAS:16@1811|fig(3)
Figure 3 in the example article, captioned "Stack Type".
The last two examples illustrate parallel (volume, page, section, subsection) and
(volume, page, figure) numbering hierarchies respectively for sections and figures
within articles.
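As an illustration of how such codes might be analysed, the following sketch splits a USIN at the "|" operator and reads the component kind together with any dotted numbering. The function name and the returned tuple shape are assumptions; only the "|" operator and the component forms listed above come from the examples.

```python
import re

_COMPONENT = re.compile(r"^([a-z]+)(?:\((\d+(?:\.\d+)*)\))?$")


def parse_component(usin):
    """Split 'base|kind(numbers)' into (base, kind, tuple of numbers)."""
    base, sep, tail = usin.partition("|")
    if not sep:
        return base, None, ()
    match = _COMPONENT.match(tail)
    if match is None:
        raise ValueError(f"bad secondary component: {tail!r}")
    kind, numbers = match.groups()
    levels = tuple(int(n) for n in numbers.split(".")) if numbers else ()
    return base, kind, levels
```

For S.ACM/TOPLAS:16@1811|sec(4.1) this returns the article USIN, the kind "sec" and the numbering (4, 1), which corresponds to the (section, subsection) levels appended to the article's own (volume, page) hierarchy.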
One may question the need for fine-grained identification of article components.
Indeed it is reasonable to consider deployment of an initial USIN system that
focusses on article identification. Nevertheless, for a scheme that is designed to
serve for article identification and related purposes in perpetuity, it would seem
foolhardy not to allow the extension of the scheme using a notation such as the
secondary component notation presented here.
The Reference Notation
One use of the reference notation is to guarantee that you can quickly generate an
acceptable USIN for every reference in an article, providing that you can generate a
USIN for the article itself. During creation of citation databases, it may be desirable
to produce a full set of USINs for the reference lists of articles in a fairly
expeditious fashion. If the resolution of some references to their direct USIN form is
proving problematic, they may be left in indirect form during initial data entry. At a
later time, the resolutions of indirect references may be entered either manually or
by acquisition of an independently developed citation set for the same article.
Another use of the reference notation is to serve as a unique canonical form for
personal communications, unpublished works and other otherwise undenotable
items. In this way, there would be no need to create a classification or coding
scheme for such references. Furthermore, each such item would be automatically
given a permanent and unique code. For example, if two authors each write articles
citing "Famous Person, personal communication", those citations would be given
distinct canonical identifiers. This would prevent false positives when doing
coreference searches (finding papers that have 2 or more references in common).
Hyphenation Notation
In some cases it may be desirable to break a long USIN over multiple lines. This
can be accommodated by the following hyphenation convention. A line break may
be inserted after any hyphen appearing in a USIN, without changing its meaning.
Furthermore, any nonhyphenated USIN operator can be converted into a hyphenated
equivalent of that operator by adding a hyphen to the end. Thus, the hyphenated
equivalents of "." and "/" and "--" are respectively ".-" and "/-" and "--" (no
change). The following examples illustrate this convention in use.
RDNS."sfu.ca".CMPT/-
TR:97-16|ref(SICI)
S.ACM/TOPLAS:2--15(1),-
15(3),15(5)--17,20--ff
S.ACM/TOPLAS:2--15(1),15(3),15(5)--
17,20--ff
RDNS."sfu.ca".CMPT/-TR:97-16|ref(SICI)
The last example illustrates that a newline character is not strictly required after a
hyphenated operator. This accommodates reformatting operations that might
eliminate an inserted newline character but leave a vestigial hyphen in place.
Conversion to canonical form eliminates any hyphenated operators and embedded
newlines. USIN processing software should fully recognize the hyphenation
convention in the event that a multi-line USIN is entered using a cut-and-paste
operation.
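Conversion to canonical form can be sketched in a few lines. The sketch assumes that the operators which may carry a trailing hyphen are ".", "/", ",", ":", "@" and "|" (the text does not enumerate them), and it makes no special provision for quoted strings; the function name is likewise hypothetical.

```python
import re

# An operator followed by a single hyphen; "--" (the range operator) is
# left alone because the lookahead refuses a second hyphen.
_HYPHENATED_OP = re.compile(r"([./,:@|])-(?!-)")


def canonicalize(usin):
    """Remove embedded newlines and hyphenated operators from a USIN."""
    joined = "".join(line.strip() for line in usin.splitlines())
    return _HYPHENATED_OP.sub(r"\1", joined)
```

Under these assumptions, each of the four examples above reduces to a single-line form: the hyphen in 97-16 is preserved because it does not follow an operator, and the "--" range operator survives both a following line break and a preceding operator.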
V. USIN Support Technology
In discussing these technologies, the goal is to present a vision of how USINs may
be generated, verified and used in the day-to-day work of publishers, librarians and
scholars. At this point in the development of the USIN concept, the focus should be
more on the analysis of overall system requirements than on the implementation
details of underlying mechanisms. Nevertheless, a number of design ideas are
included to help give a more concrete picture of the possible operation of an
integrated global USIN system.
USIN Global Registry
These are the technologies that publishers and librarians could use on a daily basis
in the assignment of USINs to serially published items.
SDL - Serials Definition Language
Fundamental to the USIN concept is the use of serial designations and numbering
schemes for identification of articles and other serial components. In order to
formally specify these schemes, consider the creation of a Serials Definition
Language (SDL). Each SDL specification would define one serial, establishing its
basic identity and publication scheme. In particular, this would include formal
specification of the hierarchical numbering scheme of the serial including its
abstract structure, scope-dependencies, chronology, and syntactic identification
schemes for articles and other serial components. It would also include the
specification of the canonical and allowable alternative forms for USIN
designations.
In addition to its formal role in the USIN scheme, SDL should also be designed to
serve a variety of related purposes. From a serials check-in and claiming
perspective, the enumeration and chronology specifications of an SDL definition
should also have predictive value as contemplated, for example, by the serial pattern
scheme of McNellis [16]. The SDL definition of a serial should also provide a basis
for evaluating and interpreting USIN holdings specifications and possibly
converting them to MARC Holdings Format. Similarly, from a bibliographic
database perspective, it should be possible to verify the enumeration and chronology
recorded in a database entry against that specified in an SDL definition. It should
also be possible to determine the comprehensiveness of database coverage: are there
any issues or articles published that are not in the database, or is the database
complete?
UPP: USIN Publication Protocol
A major requirement for UPP is to ensure the integrity of assigned USINs from the
standpoint of global uniqueness and consistency with the current SDL definitions of
serials in question. One approach to this is to maintain within the USIN Global
Registry a current publication state for each serial and to define acceptable UPP
actions in terms of this state. In essence, the publication state identifies the last
issued USIN for the serial, plus a specification of which numbering levels in the
hierarchical numbering scheme are currently open. This gives a basis for predicting
the counter and date values for upcoming UPP requests.
For example, consider the publication state that might exist after registering the
article "Collecting Interpretations of Expressions" by Paul Hudak and Jonathan
Young appearing in ACM TOPLAS, Volume 13, Number 2, April 1991, pages
269-290, with the USIN S.ACM/TOPLAS:13@269. The state may include volume and issue
counters that are currently open with values 13 and 2, respectively. A page counter
may be closed at page 290 (nothing more will appear on page 290). At this point,
there may be two legal UPP actions: add another article in this issue or close it. As it
happens, there is one more article in the issue. Based on the current publication
state, an expectation may be generated that the next article will have USIN
S.ACM/TOPLAS:13@291. If the publisher indeed submits that USIN with the next
UPP request, it can be accepted, otherwise an error can be reported.
After a "close issue" request has been made, the SDL definition and publication
state can be used to predict the next publication action and expected date. In the
example, this is an "open new issue" request for issue 3 of volume 13, July 1991.
These may be verified when the actual request is made. When issue 4 of this
volume is closed, the SDL definition should tell us that there are no more expected
issues in this volume. The expected sequence of following UPP requests is then a
"close volume" request, followed by an "open volume" request for volume 14, 1992,
an "open issue" request for issue 1 in January 1992 and an article publication
request with USIN S.ACM/TOPLAS:14@1. Each of these expectations may be in turn
verified against the actual UPP requests made.
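The expectation checking in this example can be sketched with a small state record. This is a deliberately simplified model, not a definition of UPP: it assumes volume-scoped pagination, assumes each article starts on a fresh page, and folds the "close volume", "open volume" and "open issue" requests into one step. All names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PublicationState:
    serial: str
    volume: int
    issue: int
    last_page: int  # last page already closed in the current volume


def expected_next_article(state):
    """Predict the USIN of the next article from the current state."""
    return f"{state.serial}:{state.volume}@{state.last_page + 1}"


def register_article(state, usin, end_page):
    """Accept the request only if it matches the prediction."""
    expected = expected_next_article(state)
    if usin != expected:
        raise ValueError(f"expected {expected}, got {usin}")
    return replace(state, last_page=end_page)


def open_next_volume(state):
    """Close the volume and open issue 1 of the next one; pages restart."""
    return replace(state, volume=state.volume + 1, issue=1, last_page=0)
```

Starting from PublicationState("S.ACM/TOPLAS", 13, 2, 290), the predicted next USIN is S.ACM/TOPLAS:13@291, and after open_next_volume the prediction becomes S.ACM/TOPLAS:14@1, matching the sequence described in the text.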
SRP: Serial Registration Protocol
Serial Registration Protocol (SRP) is the proposed service for registering a serial code and
its accompanying SDL definition and tracking changes thereto over time. This
includes registering changes in publication numbering or chronology, changes in
publisher or publication domain, addition of alternative USIN codings, changes to
the canonical USIN form and/or deactivations and reactivations. In general, SRP
requests would be made with respect to a particular publication-domain/serial-code
combination.
Perhaps the most critical function under SRP is the creation of a new serial code
within an existing publication domain. The code may be the initial code for a new
or previously unregistered serial publication or it may be an alternative code for an
existing publication. In either event, creation of a serial code should always be
considered with care, because it creates, in the context of the given publication
domain, a permanent USIN binding between that code and the serial in question.
From this perspective, it is worth considering appropriate verification actions for
creation of a new serial code. Of course, verification that the code is previously
unassigned is an automatic function that should be implemented by the appropriate
query to the USIN Global Registry. Beyond this, there should also be some manual
verification to ensure that the code assignment is reasonably consistent with the
USIN concept. One option is to use national serial registration centres analogous to
those of the current international ISSN network. However, such a system is likely to
be too cumbersome for the management of publications at the fine-grained level of,
say, minutes of committee meetings of particular university departments. It also
does not account for an institutional role in approving the serial codes chosen by
administrative divisions within the institution.
An alternative for verifying serial code assignments that overcomes these problems
is the following. SRP requests for new serial code creation must be approved by a
USIN-certified cataloguing librarian. Certifications are awarded by an appropriate
international standards body. Each authority for a publication domain may designate
a certified librarian for that domain. When an SRP request to create a new serial
code is issued, it is handled by the librarian registered for that domain, if such a
librarian exists. Otherwise, verification of the creation request is attempted in the
immediately superior publication domain, and so on. For example, a university may
designate a single USIN-certified librarian to handle all institutional requests for
new serial codes. Regardless of how deeply structured the administrative hierarchy
within the university is, all serial code creation requests within the university are
passed up the domain hierarchy to be handled by this individual.
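The delegation rule amounts to a walk up the domain hierarchy until a registered librarian is found. In the sketch below, a domain is represented as a tuple of labels from the top level downward and the registry of librarians as a dictionary; both representations, and the function name, are assumptions made for illustration.

```python
def responsible_librarian(domain, librarians):
    """Return the librarian for `domain` or its nearest superior domain.

    `domain` is a tuple of labels, top level first; `librarians` maps
    such tuples to the certified librarian registered for that domain.
    """
    for depth in range(len(domain), 0, -1):
        librarian = librarians.get(domain[:depth])
        if librarian is not None:
            return librarian
    return None  # no librarian registered anywhere up the hierarchy
```

If only the university-level domain has a registered librarian, a request from any department or committee beneath it resolves to that same individual, however many levels intervene.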
The second major function of the SRP protocol is to register the publication pattern
of a serial and changes to that pattern as required from time to time. As described
above, these publication patterns are specified as part of the serial's SDL definition.
UPP can be used to check the consistency of the publication patterns against future
publication attempts. That is, each time a USIN is specified in a future UPP request,
it serves to check that the SDL definition is correctly predicting the actual
publication numbering and chronology.
Whenever the publication pattern of a serial is changed, the SDL definition must be
modified to account for both future and past publications. The checking of future
publications is done by UPP. SRP is responsible for checking that the revised SDL
definition correctly accounts for the USINs assigned to past publications. This
checking may be done by formally re-evaluating the revised definition against the
entire history of actual publication as recorded in the global registry. The checking
should satisfy two conditions: (1) every USIN previously registered should be
accounted for by the new SDL definition, and (2) the new SDL definition should
not "predict" any past publication that does not, in fact, exist. Exhaustive checking
or a provably equivalent alternative method should be used. That is, a reduced form
of checking that puts at risk the consistency of the USIN system should not be
justified on the basis of minor concerns of computer processing efficiency.
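Stated in terms of sets, the two conditions require that the USINs predicted by the revised definition for the past be exactly the USINs already registered. The sketch below assumes that some SDL evaluator (not shown, and not specified here) can enumerate the predicted past USINs; the function name is hypothetical.

```python
def check_revision(registered, predicted):
    """Compare registered past USINs with those a revised definition predicts.

    Returns (unaccounted, phantom):
      unaccounted - registered USINs the new definition fails to predict
      phantom     - predicted past USINs that were never registered
    The revision is consistent only when both sets are empty.
    """
    registered = set(registered)
    predicted = set(predicted)
    return registered - predicted, predicted - registered
```

A revision would be accepted only when both returned sets are empty, which corresponds to conditions (1) and (2) respectively.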
The third major function of SRP is to register canonical and alternative forms of
USIN for a serial. When a serial is registered for the first time, the
publication-domain/serial-code combination under which it is first registered is the
canonical form of USIN. Subsequently, SRP may be used to create alternative USIN forms.
When such an attempt is made, the SRP request must specify both the
publication-domain/serial-code combination for the current canonical USIN and the new
alternative publication-domain/serial-code combination. It may be reasonable to
require that permission from the domain authority of both domains be obtained.
Any number of alternative forms for a serial may be created in this way.
The SRP request to change the canonical form of a serial must specify the
publication-domain/serial-code combination of both the current and proposed new
canonical forms. The request is made by the authority for the new publication
domain and must be verified by the authority for the currently canonical publication
domain. If approved, the change will be scheduled to occur at the next scheduled
global synchronization time for changes to USIN canonical forms, or to a later
synchronization time specified in the change request. Once the change becomes
effective, the canonical form is switched, but both forms remain acceptable.
PDP: Publication Domain Protocol
Publication Domain Protocol is the final proposed service of the USIN Global
Registry. This protocol is used to create and register new publication domains,
transfer authority for domains, register the USIN-certified librarians for a domain
and other related functions. In general, these actions will refer to subdomains of
some existing publication domain; even top-level USIN domains such as ISSN and
RDNS may be considered as subdomains of a global USIN publication domain.
Creation of a code for a new publication domain under PDP parallels the creation of
a new serial code under SRP. In both cases, the proposed code must be checked to
verify that it is previously unused in the context of the parent publication domain.
Furthermore, the manual review of serial codes by a USIN-certified librarian should
also occur for new publication domains. Ideally, this manual review should verify
that the publication domain corresponds to an actual publishing institution,
organization or administrative division thereof and is a scholar-friendly mnemonic
designation of that unit consistent with historical practice wherever possible.
Alternatively, the publication domain may represent a newly-formed collective or
coalition expressly formed for the purpose of organizing the upper levels of the
USIN domain structure.
PDP should also support the registration of alternative USINs and changes in
canonical USIN for the publishing entities denoted by publishing domains. The
registration of alternative USINs under PDP could parallel SRP in a straightforward
fashion. However, registration of a new canonical USIN for a publishing domain is
complicated by the implications for serials and subdomains within that domain.
Consider a proposed change from RDNS."acm.org" to S.ACM as the canonical USIN
for the Association for Computing Machinery. Normally, this should imply
corresponding changes for all subordinate serials and subdomains recursively. Thus,
changes in canonical USIN from RDNS."acm.org"/CACM to S.ACM/CACM, from
RDNS."acm.org".SIGPLAN to S.ACM.SIGPLAN and from
RDNS."acm.org".SIGPLAN/Notices to S.ACM.SIGPLAN/Notices should all be
expected in the example. However, it may be unwise to automatically make such
changes without review in every instance. Thus, under PDP, a change in canonical
form for a publishing domain should be carried out by first registering all the
appropriate changes for subordinate serials and subdomains. This may be enforced
under PDP by permitting a registration of a new canonical form for a publication
domain only when alternative canonical forms for all active subdomains and serials
therein have been registered.
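The precondition can be sketched as a prefix rewrite followed by a membership check. The sketch assumes that subordinate codes are formed by appending "." or "/" to the domain code, as in the ACM examples above, and that registered alternatives are available as a set of strings; the function names are hypothetical.

```python
def rewrite_prefix(usin, old, new):
    """Replace the domain prefix `old` with `new`, keeping the remainder."""
    if usin == old:
        return new
    if usin.startswith(old) and usin[len(old)] in "./":
        return new + usin[len(old):]
    raise ValueError(f"{usin!r} is not within domain {old!r}")


def missing_alternatives(old, new, active_members, registered):
    """List rewritten forms of active members that are not yet registered."""
    rewritten = [rewrite_prefix(member, old, new) for member in active_members]
    return [code for code in rewritten if code not in registered]
```

Under this reading, a change of canonical form from RDNS."acm.org" to S.ACM would be permitted only once missing_alternatives returns an empty list for all active serials and subdomains of the ACM domain.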
Finally, PDP should also provide for the deactivation and possible reactivation of
domains. Deactivation of a publication domain implies that no further publication
activity is contemplated within that domain or its subdomains. Hence deactivation of
a domain should only be permitted when all subordinate serials and subdomains
have themselves been deactivated. Reactivation of a publication domain may
occasionally be contemplated. However, to ensure the permanence of identification
of USINs issued in the subdomain prior to its earlier deactivation, a reactivation
request should not be automatically granted. Instead, a "contract" may be first
returned identifying previous use of the domain, assigned subdomains and serials
and the requirement that new use will respect these. The proposed new domain
authority should agree to these terms before the domain can be reactivated.
USIN Global Database System
Now consider how the day-to-day needs of scholars can be directly supported by a
USIN Global Database System. Three basic needs can be identified: (a) the need to
inquire about the article or other item denoted by a given USIN, (b) the need of
authors to cite articles by USIN, and (c) the need to use USINs in literature
research, both to denote search keys (citation indexing) and search results. USIN
Inquiry Protocol is the first proposed technology to assist users in this regard; it
provides for both the interactive inquiry about USINs and for hypertext citation of
USINs in World-Wide Web documents. To support citation by USIN in other types
of document formatting software, a Bibliographic Retrieval Protocol is proposed
coupled with bibliographic formatting "plug-ins" for standard word processing
packages. The final subsection discusses the role of the USIN Global Database and
USINs generally in literature research.
UIP - USIN Inquiry Protocol
One of the primary motivations underlying the USIN concept is to address the
"broken links" problem on the World-Wide Web: citation of works by Uniform
Resource Locator (URL) is prone to failure when the cited item is moved or
removed. To solve this problem, it has long been suggested that names of resources
rather than their locations should be the basis of citation, but none of the proposals
for Uniform Resource Names (URNs) has yet succeeded. A more successful
approach may be to concentrate on an important subset of the general problem:
links to serially-published documents. For this subset, consider the direct use of
USINs as permanent, "unbreakable" links and the development of USIN Inquiry
Protocol (UIP) to enable this use. For example, a hypertext reference to a sample
TOPLAS article could be coded using the following HTML markup.
Note that a hyperlink formed in this way makes no reference to any particular
computer system. Thus, the requirements of URNs are satisfied; the target of a link
is designated by naming what it is instead of where it is located.
Apart from this use in Web-based documents, UIP also supports direct inquiries
about a particular USIN. All that the scholar need do is to type
uip:S.ACM/TOPLAS:16@1811 directly into the "location" field of a Web
browser (assuming that the browser has been updated to include the UIP client-side
software).
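As a purely illustrative sketch, a UIP reference and a corresponding hypertext anchor might be generated as follows. Only the "uip:" prefix followed by the USIN is taken from the inquiry example above; the use of an ordinary anchor element, the HTML escaping of the USIN and the function names are assumptions.

```python
from html import escape


def uip_reference(usin):
    """Form the reference a user would type into the location field."""
    return "uip:" + usin


def uip_anchor(usin, link_text):
    """Wrap a UIP reference in an HTML anchor (assumed markup form)."""
    href = escape(uip_reference(usin), quote=True)
    return f'<a href="{href}">{escape(link_text)}</a>'
```

Escaping matters for USINs that contain quotation marks, such as those in the RDNS domain, since an unescaped quotation mark would terminate the attribute value prematurely.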
Ignoring for the moment how it works, the critical issue from a user perspective is
what you get when you make a UIP/USIN inquiry, either directly or by activating a
hyperlink. One answer is that you retrieve a metadata page, that is, an information
page about a document, but not the document itself. In general, the direct retrieval
of documents cannot be guaranteed because many of them may not be electronically
available. On the other hand, if a document is available on-line, it may be available
from a variety of different sources with a variety of different formats and/or pricing
structures. The purpose of a metadata page, then, is to provide a full bibliographic
description of the article or other item denoted by the target USIN, and a set of links
for making further inquiries about the article and/or retrieving a copy of it.
In general, one may consider an ambitious design goal for metadata pages: to
provide a comprehensive information resource with respect to the cited items. In
addition to basic bibliographic information and links for acquiring copies of articles,
a number of other items could be provided. Each article metadata page could
include direct links to information about the serial and its publisher. Using the
USIN notation it should also be easy to include links for retrieval of contents pages
for sibling articles in the same journal issue or volume. Links for exploring other
publications by the authors of the article might be included. In particular, links for
locating subsequently published corrigenda would be worth highlighting.
Information on review articles that discuss the document of interest may be
included. In conjunction with a citation database, links for retrieving the sets of
articles that are respectively cited by and cite this article could also be considered.
Finally, it may be reasonable to consider including links to search services that can
locate similar articles by full-text searching using a document surrogate (keywords
and other metadata that describe the current document).
It may be the case that the coded USIN in a UIP hyperreference does not refer to a
single article, but instead denotes some other serial component or is ambiguous or
erroneous. In each of these cases, the page returned through UIP should also strive
to provide comprehensive information to the user. For example, in the case of a
USIN reference by page number where more than one article starts on the specified
page, a menu showing each possible article could be returned together with their
correct canonical USINs.
These ambitious goals for the metadata pages returned by UIP servers need not
represent an obstacle to server development. The initial implementations of UIP
servers may focus on basic capabilities, allowing additional functionality to be
added over time. In addition, many of the capabilities could be implemented in a
fairly modular fashion. For example, if a particular document delivery service
supports web-based document ordering by USIN, then generating the appropriate
document ordering link is a simple matter.
Returning to the issue of how UIP may be implemented, note that the syntax for
UIP/USIN citations does not specify the actual server to be consulted in resolving
the UIP request. Rather it is reasonable to expect that the server would be specified
by an appropriate client-side mechanism, such as a UIPSERVER browser parameter or
environment variable. Typically, users might choose to set their UIPSERVER to
specify a server operated by a major local research library or library consortium. In
this way, the metadata pages returned can be formatted to emphasize local holdings
of cited documents, even when the citing document is remotely located.
Bibliographic Retrieval and Formatting
A key goal of the USIN scheme is to support authors of scholarly works in the
preparation of bibliographic references. This may be achieved by bibliographic
processing "plug-ins" or "add-ons" to standard word processing software that will
allow authors to cite works by merely entering USINs at the appropriate citation
points. The bibliographic processing modules could then take care of all the
remaining details for resolving and formatting the citations: retrieving the actual full
bibliographic citations, assigning appropriate in-text reference numbers or labels,
formatting the citations according to a chosen style guideline, sorting them
according to a user- or style-specified ordering, and incorporating the citations into
the document as a reference list at the back or sequentially in footnotes. As well as
removing a considerable source of tedium in the preparation of scholarly works, the
use of USINs in this way should also improve the accuracy and quality of citations
by eliminating manual errors and inconsistencies. Finally, a serendipitous benefit of
having the citations in a paper represented as USINs is that the citation set can then
be made available as data; citation databases can thus be supported by citation data
provision at the source [6].
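The processing steps listed above can be sketched in a few lines. The sketch assumes that resolution is a simple lookup: the RECORDS dictionary stands in for whatever service returns the full bibliographic data for a USIN, and its field names, the helper functions number_citations and reference_list, and the bracketed-number citation style are hypothetical choices for the example rather than part of the proposal. The two sample records are taken from the reference list of this paper.

```python
# Stand-in for a USIN resolution service; field names are assumptions.
RECORDS = {
    "ISBN/0-89347-055-4": {
        "authors": "Theodor Holm Nelson",
        "title": "Literary Machines",
        "year": 1987,
    },
    "ISBN/0-521-56413-1": {
        "authors": "Nachum Dershowitz and Edward M. Reingold",
        "title": "Calendrical Calculations",
        "year": 1997,
    },
}

def number_citations(cited_usins):
    """Assign reference numbers in order of first citation."""
    numbers = {}
    for usin in cited_usins:
        if usin not in numbers:
            numbers[usin] = len(numbers) + 1
    return numbers

def reference_list(cited_usins, records=RECORDS):
    """Resolve each cited USIN and format a numbered reference list."""
    numbers = number_citations(cited_usins)
    lines = []
    for usin, n in sorted(numbers.items(), key=lambda item: item[1]):
        rec = records[usin]
        lines.append("[%d] %s. %s, %d. USIN: %s"
                     % (n, rec["authors"], rec["title"], rec["year"], usin))
    return lines
```

A work cited several times in the text keeps the number it received at its first citation, so the author only ever types the USIN.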
USINs, the USIN Global Database and Literature Research
One possible approach is to expand the requirements for the USIN Global Database
to also provide comprehensive support for literature research activities. After all, the
USIN Global Database is intended to be comprehensive in its coverage of the
citable works and must provide the basic bibliographic data (author, title, serial
name, serial enumeration, publication date) for each archived item. With the
extension of the database to include abstracts, keywords and classification data for
each item, it is possible to contemplate comprehensive support for literature
research.
From the standpoint of good modular system design, one can also argue that the
USIN Global Database should deal only with the basic bibliographic data that
derives from the publication process. Classification, evaluation and review materials
should be considered third-party metadata that may come from a variety of sources.
Without any agreed-upon method for standardizing what types of metadata should
be provided and who should provide them, it would be a poor choice to impose de facto
standardization by incorporating a particular third-party metadata scheme into the
USIN Global Database.
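The modular alternative can be pictured as two separate kinds of record that share nothing but the USIN. The following is a sketch under stated assumptions: the class names CoreRecord and ThirdPartyMetadata, the provider and kind fields, and the metadata_for helper are all hypothetical. Only the five basic bibliographic fields are drawn from the description of the USIN Global Database above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CoreRecord:
    """Basic bibliographic data that derives from the publication process."""
    usin: str
    author: str
    title: str
    serial_name: str
    serial_enumeration: str
    publication_date: str

@dataclass(frozen=True)
class ThirdPartyMetadata:
    """Metadata supplied by an outside source, linked to an item by USIN only."""
    usin: str
    provider: str   # who supplied it (hypothetical field)
    kind: str       # e.g. "keywords" or "review" (hypothetical field)
    value: str

def metadata_for(usin, stores):
    """Collect third-party metadata for a USIN from any number of independent stores."""
    return [m for store in stores for m in store if m.usin == usin]
```

Under this arrangement new metadata providers can appear or disappear without any change to the core records.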
VI. Conclusion
The USIN scheme is a proposed system for the global and persistent identification
of the publications in organized serial collections. Ultimately some global
identification scheme is likely to be developed for interoperation of various article
citation applications. Scholars should seize the opportunity that now exists to ensure
that the scheme that succeeds is the one that is designed primarily to meet the long-
term needs of people (authors and readers), not the short-term needs of particular
present-day computer systems belonging to vendors, libraries or document delivery
services.
There is a great deal more work required to fully realize the USIN concept. The
author would be most appreciative of your help.
Acknowledgements
Andrew Walenstein has helped greatly by providing valuable feedback on several
drafts of this paper. Jim Cole, while still questioning some issues from a serials
cataloguing perspective, has been a source of considerable encouragement. I am also
grateful to the anonymous referees for many constructive criticisms and helpful
suggestions.
References
[1]
American Chemical Society, American Institute of Physics, American
Mathematical Society, American Physical Society, Elsevier Science, IEEE,
"Publisher Item Identifier as a means of document identification", updated
October 9, 1997. Archived publication unknown. Available at
http://www.elsevier.nl/inca/homepage/about/pii/.
With no other formal denotation known for this work, it might only be
denotable by reference to this paper. Possible eventual USIN:
S.BCS/JoDI:1(3)$1|ref(1). This assumes that BCS becomes assigned to the
British Computer Society in the international domain of scholarly societies,
and that JoDI is reserved by BCS to denote the Journal of Digital
Information.
[2]
American National Standards Committee on Library and Information
Sciences and Related Publishing Practices, Z39, Subcommittee E: Serials
Holding Statements. American National Standard for Information Sciences -
Serial Holdings Statements. ANSI Z39.44-1986. Approved August 14, 1985.
American National Standards Institute, New York, 1986.
[3]
T. Berners-Lee. "Universal Resource Identifiers in WWW: A Unifying Syntax
for the Expression of Names and Addresses of Objects on the Network as
used in the World-Wide Web", RFC 1630, RFC Editor, Internet Society, June
1994. Available at URL: http://ds.internic.net/rfc/rfc1630.txt.
[4]
T. Berners-Lee, L. Masinter, M. McCahill (Eds.), "Uniform Resource
Locators", RFC 1738, RFC Editor, Internet Society, December 1994.
Available at URL: http://ds.internic.net/rfc/rfc1738.txt.
[5]
Robert D. Cameron. "To Link or To Copy?-Four Principles for Materials
Acquisition in Internet Electronic Libraries", Technical Report TR 94-08,
School of Computing Science, Simon Fraser University, December 1994.
Available at http://elib.cs.sfu.ca/project/papers/e-lib-links.html.
[6]
Robert D. Cameron. "A Universal Citation Database as a Catalyst for Reform
in Scholarly Communication", First Monday 2(4), April 1997. Available at
URL: http://www.firstmonday.dk/issues/issue2_4/cameron/index.html
[7]
James H. Coombs, Allen H. Renear, and Steven J. DeRose. "Markup Systems
and the Future of Scholarly Text Processing." Communications of the ACM,
30(11), Nov. 1987, pages 933-947. Available at URL:
http://www.sil.org/sgml/coombs.html.
[8]
R. Daniel. "A Trivial Convention for using HTTP in URN Resolution", RFC
2169, RFC Editor, Internet Society, June 1997. Available at URL:
http://ds.internic.net/rfc/rfc2169.txt.
[9]
R. Daniel and M. Mealling. "Resolution of Uniform Resource Identifiers
using the Domain Name System", RFC 2168, RFC Editor, Internet Society,
June 1997. Available at URL: http://ds.internic.net/rfc/rfc2168.txt.
[10]
Nachum Dershowitz and Edward M. Reingold. Calendrical Calculations,
Cambridge University Press, Cambridge, UK, 1997. Suggested USINs:
ISBN/0-521-56413-1 and ISBN/0-521-56474-3. These codes use ISBNs for
the hardback and paperback versions, respectively. Choosing the code for the
hardback version as canonical may be appropriate.
[11]
DOI Foundation, "A Guide to Using Digital Object Identifiers", October 10,
1997. Archived publication unknown. Available at
http://www.doi.org/guidebook/guidebook.html.
[12]
Douglas C. Engelbart, "Authorship Provisions in AUGMENT", Digest of
Papers - Compcon Spring 84 - Twenty-Eighth IEEE Computer Society
International Conference, San Francisco, February 27-March 1, 1984, pp.
465-472.
Initial USINs: ISBN/0-8186-0525-1@465 (paper), ISBN/0-8186-4525-3@465
(microfiche), ISBN/0-8186-8525-5@465 (casebound). Possible eventual form
I.IEEE/Compcon:28@465.
[13]
Roy T. Fielding. "Maintaining Distributed Hypertext Infostructures: Welcome
to MOMspider's Web", Computer Networks and ISDN Systems 27(2),
November 1994, Special Issue Selected Papers of the First World-Wide Web
Conference, pp. 193-204. On-line paper and software distribution available at
http://www.ics.uci.edu/WebSoft/MOMspider/.
[14]
Brian Green and Mark Bide. "Unique Identifiers: A Brief Introduction", Book
Industry Communication, London, 1997. Archived publication unknown.
Available at URL http://www.bic.org.uk/bic/uniquid.
[15]
Frank Halasz and Mayer Schwartz. "The Dexter Hypertext Reference Model",
Communications of the ACM 37(2), February 1994, pp. 30-39.
[16]
Claudia Houk McNellis. "A Serial Pattern Scheme for a Value-Based
Predictive Check-in System", Serials Review, Vol 22, No. 4, Winter 1996,
pages 1-11.
[17]
R. Moats. "URN Syntax", RFC 2141, RFC Editor, Internet Society, May
1997. Available at URL: http://ds.internic.net/rfc/rfc2141.txt.
[18]
P. Mockapetris, "Domain Names: Concepts and Facilities", RFC 1034, RFC
Editor, Internet Society, November, 1987. Available at URL:
http://ds.internic.net/rfc/rfc1034.txt.
[19]
National Information Standards Organization. Serial Item and Contribution
Identifier (SICI): An American National Standard Developed by the National
Information Standards Organization: Approved August 14, 1996 by the
American National Standards Institute. National Information Standards series
ANSI/NISO Z39.56-1996 (Version 2). NISO Press, Bethesda, Maryland,
1997. Available at URL: http://sunsite.Berkeley.EDU/SICI/.
[20]
Theodor Holm Nelson. Literary Machines, Edition 87.1, 1987. Initial USIN:
ISBN/0-89347-055-4.
[21]
Norman Paskin. "Information Identifiers", Learned Publishing, Vol 10, No. 2,
April 1997, pages 135-156. Available at URL
http://www.elsevier.com/inca/homepage/about/infoident/Menu.shtml.
[22]
Fritz Schwarz and Cindy Hepfer. "Changes to the Serial Item and
Contribution Identifier and the Effects of Those on Publishers and Libraries",
The Serials Librarian 28(3/4), 1996, pp. 367-70.
[23]
K. Sollins and L. Masinter. "Functional Requirements for Uniform Resource
Names", RFC 1737, RFC Editor, Internet Society, December 1994. Available
at URL: http://ds.internic.net/rfc/rfc1737.txt.
[24]
Jennifer Wheary and Bernard F. Schutz, "Living Reviews in Relativity:
Making an Electronic Journal Live", The Journal of Electronic Publishing.
Available at URL: http://www.press.umich.edu:80/jep/03-01/LR.html.