
Gwen Williams

Professors Henderson
LIS 437 Technical Services Functions
9 May 2004

University Institutional Repositories and a Library’s Mission

The opening statement on MIT’s DSpace home page reads:


DSpace is a groundbreaking digital library system to
capture, store, index, preserve, and redistribute the
intellectual output of a university’s research faculty in
digital formats. Developed jointly by MIT Libraries and
Hewlett-Packard (HP), DSpace is now freely available to
research institutions world-wide as an open source system
that can be customized and extended.1
At first glance, DSpace could appear to be a traditional library that
has been scanned into digital format. As such,
we might expect defining features of the traditional library, such as
storage and preservation of published (for the most part) items;
cataloging records that contain descriptive information and subject
access points; classification schemes that bring the entire library into an
ensemble of works through the subject-approach to publications, such
as Library of Congress Classification scheme, or SuDocs Classification
scheme, or some combination of a few classification schemes; indexes
that subordinate and collocate like items for patrons (by author,
by uniform title, or by subject heading, for example); open stacks
arrangements for authorized patrons, which tend to span the entire
ensemble of works in a library’s holdings (with the possible exception of rare
books and special collections); and mechanisms for circulating items to
authorized patrons, the holders of library cards. Based on the brief
description of DSpace provided by its creators, it would appear that
DSpace is much like a traditional library. Indeed, some of the aspects
we typically associate with traditional libraries are present in DSpace, a
university institutional repository. We can, for example, provisionally
conclude that DSpace—and by extension, institutional repositories in
general—does include some form of cataloging records, as it indicates

1 DSpace Federation Home, online 9 May 2004, http://dspace.org/index.html


the presence of indexes; it does include some mechanisms for
circulating, or redistributing, resources; and it does include some
mechanisms for storing and preserving digital resources. Moreover,
DSpace was a joint project between MIT Libraries and Hewlett-
Packard, obviously suggesting that to think about DSpace as a
traditional library scanned into digital format is not too far from the
mark. But is this really the case?
Clifford Lynch advises that the institutional repository is not “a
collection of journals, and should not be managed like one. . . That’s
not the point of an institutional repository.”2 So if DSpace is not to be
managed like a library manages a collection of published resources,
such as journals, or monographs, for that matter, then what exactly is
an institutional repository such as DSpace? And what roles will
libraries and librarians play in the management of institutional
repositories? I aim to explore these questions in this essay. I do,
however, believe that it is fruitful to understand these questions about
libraries, roles for librarians, institutional repositories, and the
management of these newly emerging digital spaces by comparing the
institutional repository with the traditional library. The differences that
such a comparison would make evident could prove useful for
addressing the most important question for any librarian or future
librarian to think about: what roles will libraries and librarians play in
the management of institutional repositories?

Features of institutional repositories that are library-like features


There seem to be three main features of DSpace that resemble those of a
traditional library: cataloging records of some kind,
associated with specific digital resources, that enable indexing; some
mechanisms of circulating, or redistributing, resources; and some
mechanisms for storing and preserving digital resources.
With respect to the mechanisms for circulating, or redistributing,
resources, and the mechanisms for storing and preserving digital
resources, Lynch suggests that these two traditional library concerns are
the very reasons for the creation of institutional repository software,
such as DSpace. He writes,

2 Lynch, Clifford, “Institutional Repositories: Essential Infrastructure for Scholarship in the
Digital Age,” portal: Libraries and the Academy, vol. 3, no. 2 (2003): 333.

In my view, a university-based institutional repository is a
set of services that a university offers to the members of its
community for the management and dissemination of digital
materials created by the institution and its community
members. It is most essentially an organizational
commitment to the stewardship of these digital materials,
including long-term preservation where appropriate, as well
as organization and access or distribution.3
As such, the very point of building and maintaining institutional
repositories would be to preserve and provide access to a breadth of
resources produced by institutionally-sanctioned members. This seems,
to me, to be a succinct and useful way to summarize the very point of
building and maintaining traditional libraries, with one caveat:
traditional libraries do not limit their collections to those
resources produced exclusively by their home institution, in this case,
the university to which any given academic library is subordinated.
But if one looks at the published authors in any given academic
library—especially the authors of those resources most particular and
most important to academic libraries—one could say that the majority
of these authors of monographs and journal articles are, in fact,
institutionally-sanctioned scholars, researchers, teachers, and
administrators by virtue of the organization of university faculty and its
corresponding publishing regime across the various disciplines. In
short, providing stewardship over intellectual resources and
managing the mechanisms for distributing such resources are not new
to libraries and librarians.
The “DSpace Internal Reference Specification—Functionality”
manual indicates that this particular institutional repository software
includes mechanisms for browsing and searching author, subject, and
title indexes, which of course suggests that a catalog record, or a
catalog-like record has been somehow associated with each particular
digital resource.4 I conclude that catalog records, or catalog-like
records, exist because of how indexes are built, that is to say, indexes
are essentially lists, generally alphabetically arranged. Machine-

3 Ibid., pg. 328.


4 “DSpace Internal Reference Specification—Functionality,” version 2002-03-01, online 9
May 2004, http://dspace.org/technology/features.html

indexes associated with web search engines, for example, that point
toward the full text (retrieval based on keyword searching) are built by
first arranging words into alphabetical listings, which
point toward web resources that contain the particular words searched.
These indexes could be built on the fly and discarded, or stored in
index files, which seems more likely. But as this paper is not a paper
on the simplicities of natural language and full text search and retrieval,
I should say that based on this rudimentary functioning of indexes, I
can conclude two things. One, isolated record fields such as “author,”
“title,” and “subject” must exist somewhere in DSpace because DSpace
is able to create indexes to such fields. Two, such isolated record fields
must exist within a catalog record, or catalog-like record, somewhere
in the institutional repository.
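The inference above can be made concrete with a minimal sketch in Python. The records, the field names, and the build_index function are hypothetical illustrations, not DSpace's actual data structures; the sketch shows only that an alphabetical browse index on a field presupposes records in which that field has been isolated.

```python
from collections import defaultdict

# Hypothetical catalog-like records (invented sample data, not from DSpace).
RECORDS = [
    {"id": 1, "author": "Baker, Ben", "title": "Zebra Studies",
     "subject": ["zoology"]},
    {"id": 2, "author": "Adams, Ann", "title": "Archive Basics",
     "subject": ["archives", "zoology"]},
    {"id": 3, "title": "Untitled Draft"},  # author and subject were not supplied
]


def build_index(records, field):
    """Return an alphabetical list of (heading, [record ids]) for one field."""
    index = defaultdict(list)
    for record in records:
        values = record.get(field)
        if values is None:
            continue  # a record without the field cannot be listed under it
        if isinstance(values, str):
            values = [values]  # treat a single value like a one-item list
        for value in values:
            index[value].append(record["id"])
    # Sort headings alphabetically, ignoring case.
    return sorted(index.items(), key=lambda item: item[0].lower())


for heading, ids in build_index(RECORDS, "subject"):
    print(heading, ids)
```

A record that lacks a given field simply never appears in the index built on that field.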
As it turns out, the catalog-like records associated with each
particular DSpace-stored digital resource do, indeed, use a data
structure encoding scheme that has components similar to those of the MARC
format encoding scheme used to structure traditional library catalog
records: “The baseline metadata requested for each submitted item
[digital resource] is based upon the qualified Dublin Core Metadata
Scheme, adapted to DSpace requirements by MIT Libraries.”5 I would
emphasize that these components are only similar to those of traditional
library catalog records, for, as we know, the Dublin Core Metadata Scheme
is not an equivalent replacement for the MARC format. Rather, there exist some shared
fields between the two encoding formats: fields such as “author,” “title,”
“alternative titles,” “date of issue [or ‘publication’],” “ISBN, ISSN, if
applicable,” “subject words,” and further description of the entity
[digital resource or monograph].6 Likewise, software similar to DSpace
for building digital collections, or digital libraries, such as
Greenstone, generally provides mechanisms for the
establishment of catalog-like records, and oftentimes provides
mechanisms for the creation of catalog-like records using the Dublin
Core Metadata Scheme and its elements, or designated fields, if you
will. In other words, institutional repository software seems to
generally include an encoding structure for the creation of catalog-like
records—in particular, for the creation of catalog-like records that
5 Ibid.
6 Ibid.

contain elements similar to the eight ISBD elements for descriptive
cataloging.
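As a rough illustration of the overlap described above, the following Python sketch pairs the shared field labels named in this section with MARC 21 tags that commonly carry the same kind of data. The pairing is an assumption made for illustration only; it is not an official crosswalk and not part of DSpace, and the helper marc_tag_for is hypothetical.

```python
# Field labels come from the essay; the MARC 21 tags are a rough,
# illustrative pairing and NOT an official Dublin Core to MARC crosswalk.
SHARED_FIELDS = {
    "author": "100",              # main entry, personal name
    "title": "245",               # title statement
    "alternative titles": "246",  # varying form of title
    "date of issue": "260$c",     # date of publication
    "ISBN": "020",
    "ISSN": "022",
    "subject words": "653",       # uncontrolled index term (650 is used for controlled topical headings)
    "description": "520",         # summary note
}


def marc_tag_for(label):
    """Return the illustrative MARC tag for a shared field label, or None."""
    return SHARED_FIELDS.get(label)


print(marc_tag_for("title"))
print(marc_tag_for("provenance"))  # no counterpart listed in this sketch
```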

Features of libraries that are not integral to institutional repositories


A perusal of DSpace and its functionality manual, as well as a
perusal of a university-wide forum, such as The Chronicle of Higher
Education, suggests that there are several features of traditional libraries
that seem not to be integral to the establishment and management
of institutional repositories like MIT’s DSpace. Broadly speaking, these
library features can be considered in four categories: one, key aspects
of the library catalog record and library catalog; two, responsibility for
acquisition and collection management of resources; three, open
access across all library collections and holdings for patrons; and four,
top managerial responsibility. I will consider these four categories in
order.
Fundamental aspects of the library catalog record and library
catalog are not part of MIT’s DSpace. For example, DSpace does not
utilize classification schemes to provide browsing structures for patrons
vis-à-vis a subject approach to resources. In fact, the functionality
manual does not even address the issue of whether or not resources
should be classified in some manner—although classification is implied
through its functional features of the creation of communities,
collections, and authorized users (more on this in the next section).
Another key feature of the library catalog record and library catalog
that does not appear in DSpace is what I would call a controlled- and
authority-based indexing capability. Let me explain. Above I indicated
that DSpace has the capability to build indexes on record fields such
as “author,” “title,” and “subject”: This is true. However, the Dublin
Core Metadata Scheme adapted by DSpace is missing two crucial
aspects for enabling a controlled- and authority-based indexing
capability. That is, while author is a possible DSpace record field, it is
not a required record field; moreover, there is “currently no authority
control for authors (i.e. DSpace does not currently know that “Samuel
Clemens” and “Mark Twain” are the same author, nor does it
distinguish well between two authors that share the same name).”7 In
addition, while subject is a possible DSpace record field, it is not a

7 Ibid.

required record field; moreover, there is “currently no thesauri or
authority control for subject keywords,” much less grand controlled
vocabulary schemas such as LCSH or Sears.8 It would be interesting to
see logs on precision and recall in DSpace—in other words, it will
prove interesting when persons start contemplating and researching the
relevance9 of resources retrieved, absent what appears to me to be any
sound sort of indexing capability (a controlled- and authority-based
indexing capability).
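The effect of missing authority control on an author index can be sketched in a few lines of Python. The sample records, the simplified headings, and the AUTHORITY mapping are hypothetical and stand in for a real name-authority file; nothing here reflects how DSpace is implemented.

```python
# Hypothetical (record id, author as entered by the submitter) pairs.
SUBMISSIONS = [
    (1, "Clemens, Samuel"),
    (2, "Twain, Mark"),
    (3, "Twain, Mark"),
]

# Hypothetical, simplified authority file: variant name -> authorized heading.
AUTHORITY = {
    "Clemens, Samuel": "Twain, Mark",
    "Twain, Mark": "Twain, Mark",
}


def author_index(submissions, authority=None):
    """Group record ids under author headings, optionally normalizing names."""
    index = {}
    for record_id, author in submissions:
        # Without an authority file, the name is indexed exactly as entered.
        heading = authority.get(author, author) if authority else author
        index.setdefault(heading, []).append(record_id)
    return dict(sorted(index.items()))


print(author_index(SUBMISSIONS))             # two headings for one person
print(author_index(SUBMISSIONS, AUTHORITY))  # works collocated under one heading
```

A search under either heading in the first index misses part of the author's work, which is the kind of recall loss that logs on precision and recall would make visible.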
Related to the lack of classification schemes and capability for
controlled- and authority-based indexing is the second broad way that
traditional library features differ from the features of institutional
repositories such as DSpace. That is, the responsibility for the acquisition
and collection management of digital resources in DSpace differs
dramatically from the acquisition and collection management functions
as performed in libraries. In libraries, librarians and library workers are
responsible for coordinating the acquisition and management of the library
collections, with input, of course, from university faculty. In DSpace,
and in institutional repositories in general, the responsibility for
acquiring and collecting resources—or “capturing” resources—resides
with university faculty and university personnel outside of the library.
That is to say, the university-community-centered aspect to institutional
repositories places the acquisition and collection management of
digital resources in the hands of the authors themselves. Lynch writes that
a faculty member
must exercise stewardship over the actual content and its
metadata: migrating the content to new formats as they
evolve over time, creating metadata describing the content,
and ensuring the metadata is available in the appropriate
schemas and formats and through appropriate protocol
interfaces such as open archives metadata harvesting.10
When one considers where the acquisition and collection management
functions are placed with respect to developing the institutional

8 Ibid.
9 And by relevance, I mean to suggest what Don Swanson and Patrick Wilson might suggest
relevance is: that is, subject-based relevance, or knowledge-based relevance as the
searcher defines it.
10 Lynch, “Institutional. . .,” pg. 330.

repository, the very flexibility of the Dublin Core Metadata Scheme and
its lack of controlled vocabularies, authority files, and classification
schemes becomes somewhat more understandable.
Academic libraries tend toward providing open access across all
library collections and holdings, for authorized patrons. I am not sure
what percentage of academic libraries have partially closed stacks areas
such as UIUC’s Main Bookstacks, but even given the “closed stacks”
that may be present, one can, I think, safely assume that access for
authorized patrons (university community of faculty, staff, and students)
is somehow provided across all library collections and holdings.
Institutional repositories such as DSpace operate on a very fundamental
difference with respect to access—and to access across all collections
within an institution’s repository. That is, DSpace places the control
over access within the domain of whoever is authorized to establish a
community of collections and authorized users; and particular
collections within particular communities. In other words, similar to
the course management tools such as WebCT and Blackboard, a faculty
member can authorize the list of persons that have exclusive access to
the resources collected and managed. This implies something very
different than the library providing access across the whole library’s
collections. This suggests that DSpace can be thought to be a space
which is in fact multiple spaces—or many locked rooms with a limited
number of keys to any given room, dispensed as the faculty member
charged with managing the room determines.
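The "locked rooms" picture can be expressed as a small Python sketch. The Community and Collection classes, their method names, and the user names are hypothetical and are not drawn from DSpace's code; the sketch assumes only what is described above, namely that a community's administrator decides who holds a key to each collection.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    authorized_users: set = field(default_factory=set)


@dataclass
class Community:
    name: str
    administrator: str  # e.g. the faculty member managing this "room"
    collections: dict = field(default_factory=dict)

    def add_collection(self, name):
        self.collections[name] = Collection(name)
        return self.collections[name]

    def grant(self, acting_user, collection_name, user):
        """Hand out a key; only the community administrator may do so."""
        if acting_user != self.administrator:
            raise PermissionError(f"{acting_user} does not manage {self.name}")
        self.collections[collection_name].authorized_users.add(user)

    def can_access(self, collection_name, user):
        if user == self.administrator:
            return True
        return user in self.collections[collection_name].authorized_users


lab = Community("Physics Lab", administrator="prof_a")
lab.add_collection("Preprints")
lab.grant("prof_a", "Preprints", "student_b")
print(lab.can_access("Preprints", "student_b"))  # key was granted
print(lab.can_access("Preprints", "student_c"))  # no key for this room
```

Nothing in this model gives a library, or anyone else, access across every community by default, which is the contrast with library-wide access drawn above.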
Finally, the last major difference between traditional libraries and
institutional repositories concerns top managerial responsibility for
each type of space. That is, nowhere in the literature have I found that
a University Librarian sits atop the managerial chain of an institutional
repository. Indeed, as Lynch points out,
While operational responsibility for these services [an
institutional repository] may reasonably be situated in
different organizational units at different universities, an
effective institutional repository of necessity represents a
collaboration among librarians, information technologists,
archives and records managers, faculty, and university
administration and policymakers.11

11 Ibid., pg. 328.

Lest librarians and future librarians be alarmed by this, I should say that
any look at the literature that discusses what kinds of resources could
potentially be stored within an institutional repository such as DSpace
should indicate that a vast number of potential resources are kinds of
resources that libraries currently do not manage: administrative records
such as payroll and student transcripts; teaching resources such as
syllabi, handouts, assignment descriptions, and tests; pre-publication
drafts of scholarly research; databases of primary research; department
communications about curriculum, course offerings, committee
meetings, and teaching symposiums; student work including papers,
lab reports, and artistic creations; documents and schedules pertaining
to the coordination of K-12 pre-service teacher training with local K-12
schools; personal collections of papers, emails, letters, proposals, and
what-not by individual faculty members; and so on.

Features of institutional repositories that are beyond the traditional library

In addition to the vast number of potential institutional repository
resources mentioned above, there are other key features of institutional
repositories that are beyond the managerial domain of the traditional
library. Two of the most crucial differences are the use of archiving
techniques that most librarians do not use in their daily work; and the
community-centered, or user-centered if you like, control over defining
and managing communities of collections and users.
The Dublin Core Metadata Scheme for DSpace includes the
element called “Series Name and Report Number.” Moreover, as the
functionality manual indicates, searchers of institutional repository
collections can access the resources by date the items were placed
within the repository: this seems to me to be a searching and browsing
capability that DSpace enables community members to perform that
traditional library catalogs and tools do not. Also, the submission of
resources to the institutional repository includes archival concepts for
organizing archival resources: the inclusion of provenance information
and serialization of resources. These seem fundamental and important
components of an institutional repository of unpublished resources
generated by various university members. For these archival concepts
and functional features enable something that the traditional organization
of resources in libraries does not:

DSpace offers history functionality to provide an audit trail of
the administration of the archive, to provide data supporting
root-cause analysis, and to support human-moderated
rollbacks.12

Secondly, to reiterate somewhat what I have previously stated,
institutional repositories and their collections are determined by
university community members beyond the library. In other words,
institutional repository software such as DSpace does not require, nor
was it intended to require, the mediation of librarians with respect to
defining and managing the collections of resources. Communities of
university-sanctioned users are responsible for determining and
managing their particular slices, or rooms and keys, of DSpace.
Subordinate to every community would be the collections associated
with each community. Authorization to access any given collection or
any given community of collections resides with each community
administrator—in theory, this could mean every member of the
teaching and research faculty.

An institutional repository is not a traditional library manifested in the digital

I believe that I have sufficiently demonstrated that an institutional
repository, which can be built from software such as MIT’s DSpace, is not
a traditional library that has been simply digitized. An institutional
repository is something much larger than a traditional library that is
manifested in the digital. While it is useful to compare features of the
institutional repository to features of the traditional library, it does not
seem wise to conceptually equate the two. In other words, it does not
appear wise to conceive of the institutional repository as simply the
traditional print-bound library gone digital. But Lynch tells us as much
when he suggests that institutional repositories developed in institutions
of higher education will probably move into the broader social realm:
University institutional repositories have some very
interesting and unexplored extensions to what we might
think of as community or public repositories; this may in fact
be another case of a concept developed within higher

12 “DSpace Internal. . .”

education moving more broadly into our society. Public
libraries might join forces with local government, local
historical societies, local museums and archives, and
members of their local communities to establish community
repositories. Public broadcasting might also have a role
here.13

Implications for libraries and librarians: a question of university-wide organization and the mission of libraries

So what roles will libraries and librarians play in the management
of institutional repositories? I believe, and I have written extensively on
this issue in my LIS 437 Think Piece assignment, that it comes down to
considering the library’s mission as being characterized mostly by
stewardship, which includes the providing of service. That is, it seems
rather clear to me that the mission of university libraries is subordinated
to the larger university-wide organization and its mission. In other words,
libraries play a stewardship role for universities, providing service and
access to resources for university members. The conception and
functional features of institutional repositories, such as DSpace, seem to
emphasize this subordinated, but of course, quite crucial mission of
libraries and librarians.
And of course, there are opportunities for librarians to collaborate
with university faculty in devising library-like features that could
improve any community’s organization of collections, such as various
thesauri, classification schemes of some kind, and name-authority files
of some fashion. Librarians could even play a role in facilitating
searching and browsing across communities and collections, should
community managers so desire. Moreover, I think it would be very
librarian-like to do so.

13 Lynch, “Institutional. . .,” pg. 336.

References

Carlson, Scott, “Cornell Tries a New Publishing Model,” Chronicle of Higher
Education, (5 March 2004): A29.

____, “Penn State Program to Allow Sharing of Course Materials and
Research Data,” Chronicle of Higher Education, (312 October 2003): A32.

____, “The Uncertain Fate of Scholarly Artifacts in a Digital Age,” Chronicle
of Higher Education, (30 January 2004): A25-A27.

Carnevale, Dan, “Colleges are Relieved as PeopleSoft Rejects Latest Oracle
Takeover,” Chronicle of Higher Education, (20 February 2004): A30.

____, “A New Technology Lets Colleges Spread Information to People Who
Want It,” Chronicle of Higher Education, (13 February 2004): A31-A32.

“DSpace Internal Reference Specification—Functionality,” version
2002-03-01, online 9 May 2004, http://dspace.org/technology/features.html

DSpace Federation Home, online 9 May 2004, http://dspace.org/index.html

Kiernan, Vincent, “Company to Track Citations of Online Scholarship,”
Chronicle of Higher Education, (19 March 2004): A31.

____, “Killing Bytes, Not Trees,” Chronicle of Higher Education, (9 April
2004): A31-A33.

Lynch, Clifford, “Institutional Repositories: Essential Infrastructure for
Scholarship in the Digital Age,” portal: Libraries and the Academy, vol. 3,
no. 2 (2003): 327-336.

____, Interview with Clifford Lynch, Ubiquity, vol. 4, no. 23 (July 30 - August
5, 2003).

Marcum, Deanna, “Requirements for the Future Digital Library,” Address to
the Elsevier Digital Libraries Symposium, Pennsylvania: 25 January 2003.

Milstead, Jessica and Susan Feldman, “Metadata: Cataloging by Any Other
Name,” Online, January 1999.

Read, Brock, “New Digital Library Offers Alternative to Slides,” Chronicle of
Higher Education, (16 April 2004): A34.

____, “Planning With Pixels, Not Pencils,” Chronicle of Higher Education,
(14 November 2003): A29.

____, “Science Library Stages Avant-Garde Plays, One Viewer at a Time,”
Chronicle of Higher Education, (28 November 2003): A35.

Short, Edmund C., “Knowledge and the Educational Purposes of Higher
Education: Implications for the Design of a Classification Scheme,”
Cataloging and Classification Quarterly, vol. 19, no. 3/4 (1995): 59-66.

Unsworth, John M., “The Next Wave: Liberation Technology,” The Chronicle
Review, (23 January 2004): B16-B20.

Vest, Charles M., “Why MIT Decided to Give Away All Its Course Materials
via the Internet,” The Chronicle Review, (23 January 2004): B20-B21.

Young, Jeffrey R., “Google Tests Search Engine for Colleges’ Scholarly
Materials,” The Chronicle of Higher Education, vol. L., no. 33 (23 April
2004): A36.

____, “Will Colleges Miss the Next Big Thing?,” The Chronicle of Higher
Education, vol. L., no. 33 (23 April 2004): A35-A36.
