Professors Hendersons
LIS 437 Technical Services Functions
9 May 2004
In my view, a university-based institutional repository is a
set of services that a university offers to the members of its
community for the management and dissemination of digital
materials created by the institution and its community
members. It is most essentially an organizational
commitment to the stewardship of these digital materials,
including long-term preservation where appropriate, as well
as organization and access or distribution.3
As such, the point of building and maintaining institutional
repositories would be to preserve and provide access to a breadth of
resources produced by institutionally-sanctioned members. This seems,
to me, to be a succinct and useful way to summarize the point of
building and maintaining traditional libraries as well, with one
caveat: traditional libraries do not limit their collections to
resources produced exclusively by their home institution, in this case
the university to which any given academic library is subordinated.
But if one looks at the published authors in any given academic
library, especially the authors of those resources most particular and
most important to academic libraries, one could say that the majority
of these authors of monographs and journal articles are, in fact,
institutionally-sanctioned scholars, researchers, teachers, and
administrators by virtue of the organization of university faculty and its
corresponding publishing regime across the various disciplines. In
short, providing stewardship over intellectual resources and
managing the mechanisms for distributing such resources are not new
to libraries and librarians.
The “DSpace Internal Reference Specification—Functionality”
manual indicates that this particular institutional repository software
includes mechanisms for browsing and searching author, subject, and
title indexes, which of course suggests that a catalog record, or a
catalog-like record has been somehow associated with each particular
digital resource.4 I conclude that catalog records, or catalog-like
records, exist because of how indexes are built: indexes are
essentially lists, generally alphabetically arranged. The machine
indexes associated with web search engines, for example, which point
toward the full text (retrieval based on keyword searching), first
arrange the words found in web resources into an alphabetical listing;
each entry in that listing then points toward the web resources that
contain the particular word searched. These indexes could be built on
the fly and discarded, or stored in index files, which seems more
likely. But as this paper is not a paper
on the simplicities of natural language and full text search and retrieval,
I should say that based on this rudimentary functioning of indexes, I
can conclude two things. One, isolated record fields such as “author,”
“title,” and “subject” must exist somewhere in DSpace because DSpace
is able to create indexes to such fields. Two, such isolated record fields
must exist within a catalog record, or catalog-like record, somewhere
in the institutional repository.
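
The reasoning above can be sketched in a few lines of Python. This is only an illustration of how an index depends on isolated record fields, not a description of how DSpace builds its indexes; the records, field names, and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog-like records; the field names are illustrative only.
records = [
    {"id": 1, "author": "Author, First",
     "title": "Institutional Repositories Overview",
     "subject": ["repositories", "scholarship"]},
    {"id": 2, "author": "Author, Second",
     "title": "Cataloging Digital Resources",
     "subject": ["cataloging", "repositories"]},
]

def build_field_index(records, field):
    """Map each value of one record field to the ids of the records holding it."""
    index = defaultdict(list)
    for record in records:
        values = record[field]
        if isinstance(values, str):
            values = [values]  # treat a single value like a one-item list
        for value in values:
            index[value.lower()].append(record["id"])
    # Return the entries in alphabetical order, like a printed index.
    return dict(sorted(index.items()))

def build_keyword_index(records, field="title"):
    """Map each word of a text field to record ids (a simple inverted index)."""
    index = defaultdict(set)
    for record in records:
        for word in record[field].lower().split():
            index[word].add(record["id"])
    return {word: sorted(ids) for word, ids in sorted(index.items())}

author_index = build_field_index(records, "author")
subject_index = build_field_index(records, "subject")
keyword_index = build_keyword_index(records)
```

Neither function can run unless each record already exposes a field named "author," "title," or "subject," which is the point of the two conclusions drawn above.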
As it turns out, the catalog-like records associated with each
digital resource stored in DSpace do, indeed, use a data structure
encoding scheme whose components resemble those of the MARC format
encoding scheme used to structure traditional library catalog
records: “The baseline metadata requested for each submitted item
[digital resource] is based upon the qualified Dublin Core Metadata
Scheme, adapted to DSpace requirements by MIT Libraries.”5 I would
emphasize that these components are only similar to those of
traditional library catalog records, for, as we know, the Dublin Core
Metadata Scheme is not an equivalent replacement for the MARC format.
Rather, the two encoding formats share some fields: fields such as
“author,” “title,” “alternative titles,” “date of issue [or
‘publication’],” “ISBN, ISSN, if applicable,” “subject words,” and
further description of the entity [digital resource or monograph].6
Likewise, software similar to DSpace for building digital collections,
or digital libraries, such as Greenstone, generally provides
mechanisms for establishing catalog-like records, and oftentimes for
creating them using the Dublin Core Metadata Scheme and its elements,
or designated fields, if you will. In other words, institutional
repository software seems generally to include an encoding structure
for the creation of catalog-like records, and in particular for the
creation of catalog-like records that
5 Ibid.
6 Ibid.
contain elements similar to the eight ISBD elements for descriptive
cataloging.
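
To make the overlap between the two encoding formats concrete, here is a rough sketch of a crosswalk from qualified Dublin Core elements to MARC tags. The mapping is my own simplified assumption for illustration, not an official crosswalk and not part of DSpace; the sample record and the function name are hypothetical.

```python
# Simplified, assumed crosswalk from qualified Dublin Core elements to MARC tags.
DC_TO_MARC = {
    "contributor.author": "100",  # main entry, personal name
    "title": "245",               # title statement
    "title.alternative": "246",   # varying form of title
    "date.issued": "260",         # publication information
    "identifier.isbn": "020",     # ISBN
    "identifier.issn": "022",     # ISSN
    "subject": "653",             # index term, uncontrolled
    "description": "520",         # summary note
}

def to_marc_like(dc_record):
    """Group Dublin Core values under the MARC tag each element maps to."""
    marc = {}
    for element, values in dc_record.items():
        tag = DC_TO_MARC.get(element)
        if tag is None:
            continue  # no shared field in this sketch, so the value is dropped
        marc.setdefault(tag, []).extend(values)
    return dict(sorted(marc.items()))

# Hypothetical Dublin Core record for a submitted item.
sample = {
    "contributor.author": ["Author, Sample"],
    "title": ["A Sample Technical Report"],
    "date.issued": ["2004"],
    "subject": ["institutional repositories", "metadata"],
    "rights": ["an element with no counterpart in this sketch"],
}

marc_like = to_marc_like(sample)
```

Subject words land in an uncontrolled index term field rather than a controlled subject heading field, which is one reason to call the two formats similar rather than equivalent.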
7 Ibid.
required record field; moreover, there is “currently no thesauri or
authority control for subject keywords,” much less grand controlled
vocabulary schemas such as LCSH or Sears.8 It would be interesting to
see logs on precision and recall in DSpace; in other words, it will
prove interesting when researchers begin to study the
relevance9 of resources retrieved, absent what appears to me to be any
sound sort of indexing capability (a controlled- and authority-based
indexing capability).
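
For readers unfamiliar with the two measures, the following sketch shows how precision and recall would be computed for a single search. The record ids are invented for illustration and do not come from any DSpace log.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: four records come back, and the searcher judges
# three records in the whole repository to be relevant.
retrieved_ids = [1, 2, 3, 4]
relevant_ids = [2, 4, 5]

precision, recall = precision_recall(retrieved_ids, relevant_ids)
# Two of the four retrieved records are relevant (precision 2/4), and two
# of the three relevant records were found (recall 2/3).
```

The set of relevant records has to be supplied by a human judgment, which is why relevance as the searcher defines it matters to any such study.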
Related to the lack of classification schemes and of any capability for
controlled- and authority-based indexing is the second broad way that
the features of traditional libraries differ from those of
institutional repositories such as DSpace. That is, the responsibility
for the acquisition and collection management of digital resources in
DSpace differs dramatically from the acquisition and collection
management functions as performed in libraries. In libraries,
librarians and library workers are responsible for coordinating the
acquisition and management of the library collections, with, of
course, input from university faculty. In DSpace, and in institutional
repositories in general, the responsibility for acquiring and
collecting resources (or “capturing” resources) resides with
university faculty and university personnel outside of the library.
That is to say, the university-community-centered aspect of
institutional repositories places the acquisition and collection
management of digital resources in the hands of the authors
themselves. Lynch writes,
a faculty member,
must exercise stewardship over the actual content and its
metadata: migrating the content to new formats as they
evolve over time, creating metadata describing the content,
and ensuring the metadata is available in the appropriate
schemas and formats and through appropriate protocol
interfaces such as open archives metadata harvesting.10
When one considers where the acquisition and collection management
functions are placed with respect to developing the institutional
8 Ibid.
9 And by relevance, I mean to suggest what Don Swanson and Patrick Wilson might suggest
relevance is: that is, subject-based relevance, or knowledge-based relevance as the
searcher defines it.
10 Lynch, “Institutional. . .,” pg. 330.
repository, the very flexibility of the Dublin Core Metadata Scheme and
its lack of controlled vocabularies, authority files, and classification
schemes becomes somewhat more understandable.
Academic libraries tend toward providing open access across all
library collections and holdings for authorized patrons. I am not sure
what percentage of academic libraries have partially closed stacks areas
such as UIUC’s Main Bookstacks, but even given the “closed stacks”
that may be present, one can, I think, safely assume that access for
authorized patrons (the university community of faculty, staff, and
students) is somehow provided across all library collections and
holdings. Institutional repositories such as DSpace differ
fundamentally with respect to access, and in particular to access
across all collections within an institution’s repository. That is,
DSpace places the control over access within the domain of whoever is
authorized to establish a community of collections and authorized
users, and particular collections within particular communities. In
other words, as with course management tools such as WebCT and
Blackboard, a faculty member can authorize the list of persons who
have exclusive access to the resources collected and managed. This
implies something very different from the library providing access
across the whole of its collections. It suggests that DSpace can be
thought of as a space that is in fact multiple spaces, or many locked
rooms with a limited number of keys to any given room, dispensed as
the faculty member charged with managing the room determines.
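
The "locked rooms" picture can be sketched as a small data model. This is a hypothetical illustration of per-collection access lists, not DSpace's actual authorization system; the class names, user names, and collection names are all invented.

```python
from dataclasses import dataclass, field

@dataclass
class Collection:
    """One 'locked room': a collection with a manager who hands out keys."""
    name: str
    manager: str
    authorized: set = field(default_factory=set)

    def grant(self, granted_by, user):
        # Only the manager of this collection may add a user to its list.
        if granted_by != self.manager:
            raise PermissionError(f"{granted_by} does not manage {self.name}")
        self.authorized.add(user)

    def can_read(self, user):
        return user == self.manager or user in self.authorized

@dataclass
class Community:
    """A group of collections, each with its own access list."""
    name: str
    collections: dict = field(default_factory=dict)

    def add_collection(self, collection):
        self.collections[collection.name] = collection

    def readable_by(self, user):
        """List the collections in this community that the user may open."""
        return sorted(
            name for name, coll in self.collections.items() if coll.can_read(user)
        )

# Hypothetical community with two rooms managed by different faculty members.
readings = Collection(name="Seminar Readings", manager="prof_a")
drafts = Collection(name="Working Drafts", manager="prof_b")
readings.grant("prof_a", "student_1")

community = Community(name="Hypothetical Department")
community.add_collection(readings)
community.add_collection(drafts)

student_rooms = community.readable_by("student_1")
```

In this sketch no single list grants access to the whole community; a user sees only the rooms whose managers have handed over a key, which contrasts with a library that opens all of its holdings to every authorized patron.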
Finally, the last major difference between traditional libraries and
institutional repositories concerns top managerial responsibility for
each type of space. That is, nowhere in the literature have I found that
a University Librarian sits atop the managerial chain of an institutional
repository. Indeed, as Lynch points out,
While operational responsibility for these services [an
institutional repository] may reasonably be situated in
different organizational units at different universities, an
effective institutional repository of necessity represents a
collaboration among librarians, information technologists,
archives and records managers, faculty, and university
administration and policymakers.11
Lest librarians and future librarians be alarmed by this, I should say that
any look at the literature on the kinds of resources that could
potentially be stored within an institutional repository such as DSpace
indicates that a vast number of them are kinds of
resources that libraries currently do not manage: administrative records
such as payroll and student transcripts; teaching resources such as
syllabi, handouts, assignment descriptions, and tests; pre-publication
drafts of scholarly research; databases of primary research; department
communications about curriculum, course offerings, committee
meetings, and teaching symposiums; student work including papers,
lab reports, and artistic creations; documents and schedules pertaining
to the coordination of K-12 pre-service teacher training with local K-12
schools; personal collections of papers, emails, letters, proposals, and
what-not by individual faculty members; and so on.
DSpace offers history functionality to provide an audit trail of
the administration of the archive, to provide data supporting
root-cause analysis, and to support human-moderated
rollbacks.12
12 “DSpace Internal. . .”
education moving more broadly into our society. Public
libraries might join forces with local government, local
historical societies, local museums and archives, and
members of their local communities to establish community
repositories. Public broadcasting might also have a role
here.13
References
____, Interview with Clifford Lynch, Ubiquity, vol. 4, no. 23 (July 30 - August
5, 2003)
Read, Brock, “New Digital Library Offers Alternative to Slides,” Chronicle of
Higher Education, (16 April 2004): A34.
Unsworth, John M., “The Next Wave: Liberation Technology,” The Chronicle
Review, (23 January 2004): B16-B20.
Vest, Charles M., “Why MIT Decided to Give Away All Its Course Materials
via the Internet,” The Chronicle Review, (23 January 2004): B20-B21.
Young, Jeffrey R., “Google Tests Search Engine for Colleges’ Scholarly
Materials,” The Chronicle of Higher Education, vol. L, no. 33 (23 April
2004): A36.
____, “Will Colleges Miss the Next Big Thing?,” The Chronicle of Higher
Education, vol. L, no. 33 (23 April 2004): A35-A36.