

COMPARATIVE STUDY OF AN OPEN SOURCE DIGITAL LIBRARY SOFTWARE:


DSPACE, GREENSTONE AND EPRINT

Conference Paper · October 2015



All content following this page was uploaded by Ramani Ranjan Sahu on 24 October 2015.



COMPARATIVE STUDY OF AN OPEN SOURCE DIGITAL LIBRARY
SOFTWARE: DSPACE, GREENSTONE AND EPRINT

By

Ramani Ranjan Sahu
Library Assistant
Pandit Deendayal Petroleum University
Sahu.ramaniranjan0@gmail.com
Mob: 9998191907

Alekha Karadia
Asst. Librarian
Bhima Bhoi College, Rairakhol
Alekh98@gmail.com
Mob: 9124037352

Abstract:-

Organizations have a new option for acquiring and implementing systems, plus new opportunities for
participating in open source software projects. Library professionals should be involved in the
development of such software, and open source software is preferable for building a digital library
under economic constraints. A highly competitive environment and the demand for zero deficiency
and enhanced productivity oblige an organization to choose carefully if it wants to create a parallel
digital library with features not found in a traditional library. Those responsible should have a basic
idea of selection, installation and maintenance. This paper compares the open source software
DSpace, Greenstone and EPrints from various points of view and discusses how to select open
source software for a digital library.

Keywords: Open source software, Digital library, DSpace, Greenstone, EPrints

1-INTRODUCTION:

Information and communication technology and digital libraries have changed the way all
stakeholders access and retrieve key knowledge and relevant information. A digital library involves
the creation, organization, maintenance, management, access, sharing and preservation of digital
document collections. Open source digital library software provides a system for constructing
and presenting information collections. It helps in building collections with searching and
metadata-based browsing facilities. Open source digital library management software provides
extensible features to administrators and allows an organization to showcase its digital archive
to a worldwide audience. With full rights to the software available under an open source licence
such as the GPL, and the source code provided with the software, organizations can extend its
functionality as required for a particular operation. Due to shrinking budgets and the increasing
prices of journals, librarians have to look for new alternatives by which they can collect, store,
arrange and disseminate information to users. The concepts of open access and the institutional
repository (IR) have evolved as solutions. In building an IR, academic libraries can take the help
of OSS (Meitei, L.S. & Devi, P. 2009). An organization therefore needs to evaluate and compare
popular open source digital library software from various points of view before choosing one for
creating an institutional repository.

Open source software

Open source software is computer software whose source code is available under a license
that permits users to study, change, and improve the software, and to redistribute it in modified
or unmodified form. It is often developed in a public, collaborative manner. It is the most
prominent example of open source development and is often compared to user-generated content.

Reasons to Use Open Source Software

• It promotes creative development;
• Those who can't afford proprietary software can download open source programs for free;
• Money saved can be used to purchase other needed materials;
• You can easily modify the software to suit your patrons' needs and your own;
• Little to no upgrade costs;
• No more struggling with software that doesn't meet your standards: create it yourself, based on a close pre-existing piece of software;
• The price (free) makes it easier to change your mind when the software doesn't live up to expectations;
• Little to no viruses;
• Open source software can be incorporated into libraries.

Open Source Software for Libraries:-

Library automation: Koha, NewGenLib, Evergreen
Digital library software: DSpace, Fedora, Greenstone, Keystone, EPrints, OPUS, etc.
Web publishing: Joomla, Drupal, WordPress
Other computer programs: Ubuntu, Firefox, PDF Creator, Thunderbird, OpenOffice, GIMPshop, Nvu

Digital Library:-

A digital library is a collection of digital documents or objects. Smith (2001) defined a
digital library as an organized and focused collection of digital objects, including
text, images, video and audio, with methods of access and retrieval and for the selection,
creation, organization, maintenance and sharing of the collection. A digital library focuses on
digital collections in order to preserve their documents.

2-SELECTION CRITERIA FOR OPEN SOURCE DIGITAL LIBRARY SOFTWARE

Evaluating open source software differs from evaluating proprietary programs. A key difference
is that the information available for open source programs is usually different from that
available for proprietary programs: source code, analysis by others of the program design,
discussion between users and developers on how well it is working, and so on. In our
view, the selection criteria are: open source licence, functional modules, stable releases,
developer and user community, user interface, and documentation.

Characteristic
    Point of selection criteria: Evaluation

Reliability
    Maturity: Is the software new in the market?
    Popularity: Does this software have numerous users?
    Availability: Does this software frequently release new versions?

Usability
    Learnability: How easy is it to learn or understand the software without using the user manual?
    Operability: Is this software easy to operate?
    Accessibility: Is this software easy to access without other third-party software or plug-ins?
    User interface aesthetics: Is the user interface suitable for the software's functionality?

Performance efficiency
    Time behaviour: Is this software easy to install/configure and operate within a short time?
    Resource utilisation: Does this software use minimal/limited resources, or can it be used with existing resources (e.g. server, operating system)?

Functionality
    Functional completeness: Does the software meet the user's expectations and requirements?
    Functional correctness: Does the software provide correct output as the user expects?
    Functional appropriateness: Does the software function appropriately?

Maintainability
    Modularity: Is the code structured and readable? How well is the software designed?
    Modifiability: How easily can the system be customized to meet the user's requirements?
    Reusability: How easy is it to reuse or extend the code for further extension or integration?
    Testability: Is the software error-free?

Security
    Confidentiality: How secure are the data and the software? How confident can one be that the software is free from vulnerabilities?
    Integrity: Does the software have any control mechanism to ensure system integrity?
    Authenticity: Does the software provide levels of user authentication?

Tangible
    Support: Is there any community or commercial support provided?
    Documentation: Is complete documentation provided? Both technical and user manuals?

Reliability
    Version: Are software versions released at the targeted or expected time, mainly with new functionality?

Responsiveness
    Community: How active is the community for the software?

Assurance
    Competence: Does the community possess the required skills and knowledge?
    Credibility: Do the development team and community have a good track record? How many bugs were fixed in the last 6 months?

Empathy
    Communication: Does the community acknowledge your problems and help in solving them?

Competence
    Skill: How many internal technical staff are skilled with the tools and language used by this software?

(Chamili, K. 2012)
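
The checklist above can be turned into a simple scoring sheet. The following is a minimal sketch in Python, assuming each candidate is rated from 0 to 5 per characteristic and that the institution assigns its own weights. The weights, the function names (`weighted_score`, `rank_candidates`) and any ratings fed into them are hypothetical placeholders, not results from this study.

```python
from typing import Dict, List, Tuple

# Hypothetical weights per characteristic; an institution would choose its own.
WEIGHTS: Dict[str, float] = {
    "Reliability": 2.0,
    "Usability": 1.5,
    "Functionality": 2.0,
    "Security": 1.0,
}


def weighted_score(ratings: Dict[str, int], weights: Dict[str, float]) -> float:
    """Return the weighted average of 0-5 ratings; a missing rating counts as 0."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(w * ratings.get(name, 0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank_candidates(
    candidates: Dict[str, Dict[str, int]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Sort candidate packages by weighted score, best first."""
    scored = [(name, weighted_score(r, weights)) for name, r in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder ratings for two unnamed candidates, for illustration only.
    placeholder = {
        "Candidate A": {"Reliability": 4, "Usability": 3, "Functionality": 5, "Security": 3},
        "Candidate B": {"Reliability": 3, "Usability": 5, "Functionality": 3, "Security": 4},
    }
    for name, score in rank_candidates(placeholder, WEIGHTS):
        print(f"{name}: {score:.2f}")
```

A weighted average keeps the result on the same 0 to 5 scale as the individual ratings, which makes scores comparable even if the set of characteristics is later extended.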

3-COMPARISON OF OPEN SOURCE DIGITAL LIBRARY SOFTWARE

In the following, the open source digital library software packages are compared
based on the characteristics identified in the previous section. The level of support for each
characteristic and specific considerations for each DL system are discussed.
Object model
DSpace: The basic entity in DSpace is the item, which contains both metadata and digital content.
Qualified Dublin Core (DC) [8] metadata fields are stored in the item, while other metadata
sets and digital content are defined as bitstreams and categorized as bundles of the item. The
internal structure of an item is expressed by structural metadata, which define the relationships
between the constituent parts of an item. DSpace uses globally unique identifiers for items,
based on the CNRI Handle System. Persistent identifiers are also used for the bitstreams of every
item.
Greenstone: The basic entity in Greenstone is the document, which is expressed in XML format.
Documents are linked with one or more resources that represent the digital content of the
object. Each document contains a unique document identifier, but there is no support for
persistent identifiers of the resources.
EPrints: The basic entity in EPrints is the data object, which is a record containing metadata. One
or more documents (files) can be linked with the data object. Each data object has a unique
identifier.
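
The DSpace object model described above (an item holding qualified DC metadata plus bundles of bitstreams) can be pictured with a few Python dataclasses. This is a minimal sketch for illustration only; the class and field names, the example handle and the file names are assumptions and do not reproduce DSpace's actual Java classes or database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    name: str
    format: str      # bitstream format, written here as a MIME type
    identifier: str  # persistent identifier of this bitstream (placeholder value)


@dataclass
class Bundle:
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    handle: str  # globally unique identifier in the style of the Handle System
    dublin_core: Dict[str, List[str]] = field(default_factory=dict)  # qualified DC fields
    bundles: List[Bundle] = field(default_factory=list)

    def all_bitstreams(self) -> List[Bitstream]:
        """Flatten the bundles into a single list of bitstreams."""
        return [b for bundle in self.bundles for b in bundle.bitstreams]


def example_item() -> Item:
    """Build a made-up item; every value here is a placeholder."""
    return Item(
        handle="123456789/1",
        dublin_core={"title": ["An example thesis"], "date.issued": ["2015"]},
        bundles=[
            Bundle("ORIGINAL", [Bitstream("thesis.pdf", "application/pdf", "123456789/1/1")]),
            Bundle("LICENSE", [Bitstream("license.txt", "text/plain", "123456789/1/2")]),
        ],
    )


if __name__ == "__main__":
    for bitstream in example_item().all_bitstreams():
        print(bitstream.name, bitstream.format)
```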
Collections and relations support
DSpace: Supports collections of items and communities that hold one or more collections. An
item belongs to one or more collections, but has only one owner collection. It is feasible to
define default values for the metadata fields in a collection. The descriptive metadata defined
for a collection are the title and description. There is no support for relations between different
items.
Greenstone: A collection in Greenstone defines a set of characteristics that describe its
functionality. These characteristics are: indexing, searching and browsing capabilities, file
formats, conversion plugins and entry points for the digital content import. There are also some
characteristics for the presentation of the collection. The representation of hierarchical structure
in text documents is supported for chapters, sections and paragraphs. The definition of specific
sections in a text document is implemented through special XML tags. XLinks in a document
can be used to relate it to other documents or resources.
EPrints: There is no concept of collections in EPrints. Data objects are grouped
depending on specific fields (subject, year, title, etc.). There is no definition of relations
between documents, except by using URLs in specific metadata fields.

Metadata and digital content storage


DSpace: DSpace stores qualified DC metadata in a relational database (PostgreSQL or Oracle).
Other metadata sets and digital content are represented as bitstreams and are stored on the
filesystem. Each bitstream is associated with a specific bitstream format. A support level is
defined for every bitstream format, indicating the level of preservation for the specified file
format.
Greenstone: Both documents and resources are stored on the filesystem. Metadata are
user-defined and are stored in documents using an internal XML format.
EPrints: Metadata fields in EPrints are user-defined. The data object, containing metadata, is
stored in a MySQL database and the documents (digital content) are stored on the filesystem.
Search and browse
DSpace: Provides indexing for the basic metadata set (qualified DC) by default, using the
relational database. Indexing of other defined metadata sets is also provided using the Jakarta
Lucene API. Lucene supports fielded search, stemming and stop-word removal. Searching can
be constrained to a collection or community. Browsing is also offered by default on the title,
author and date fields.
Greenstone: Indexing is offered for text documents and specific metadata fields. Searching
capabilities are provided for defined sections in a document (title, chapter, paragraph) or for the
whole document. Stemming and case-sensitive searching are also available. The Managing
Gigabytes (MG) open-source application is used to support indexing and searching. Browsing
catalogs can be defined for specific fields using a hierarchical structure.
EPrints: Indexing is supported for every metadata field, using the MySQL database. Full-text
indexing is supported for selected fields. Combined fielded search and free-text search are
provided to the end-user. Browsing is provided using specified fields (e.g. title, author,
subject).
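
The search features named above (fielded search, stemming and stop-word removal) can be illustrated with a toy inverted index. This is a minimal sketch in plain Python, not the Lucene or MG implementation; the stop-word list, the deliberately naive suffix-stripping stemmer and the class name `FieldedIndex` are assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# A tiny illustrative stop-word list; real systems ship much longer ones.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def naive_stem(token: str) -> str:
    """Strip a few English suffixes; real stemmers (e.g. Porter) are far more careful."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> List[str]:
    """Lowercase, tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


class FieldedIndex:
    """Inverted index keyed by (field name, term), so queries can target one field."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, fields: Dict[str, str]) -> None:
        for field_name, text in fields.items():
            for term in analyze(text):
                self._postings[(field_name, term)].add(doc_id)

    def search(self, field_name: str, query: str) -> Set[str]:
        """Return ids of documents whose field contains every query term."""
        terms = analyze(query)
        if not terms:
            return set()
        postings = [self._postings.get((field_name, t), set()) for t in terms]
        return set.intersection(*postings)


if __name__ == "__main__":
    index = FieldedIndex()
    index.add("d1", {"title": "Searching Digital Collections", "author": "Witten"})
    index.add("d2", {"title": "The Digital Collection of Maps"})
    print(sorted(index.search("title", "digital collection")))
```

Because the same `analyze` function is applied at indexing time and at query time, "collections" and "collection" reduce to the same term and match each other.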
Object management
DSpace: Items in DSpace are created using the web submission user interface or the batch item
importer, which ingests XML metadata documents and the constituent content files. In both
cases a workflow process may be initiated, depending on the collection configuration. The
workflow can be configured to contain from one to three steps, in which different users or groups
may intervene in the item submission. Collections and communities are created using the web
user interface.
Greenstone: New collections and the documents they contain are built using the Greenstone
Librarian Interface or the command-line building program.
EPrints: A default web user interface is provided for the creation and editing of objects.
Authority records can be used to help complete specific fields (e.g. authors, title).
Objects can also be imported from text files in multiple formats (METS, DC, MODS,
BibTeX, EndNote).
User interfaces
DSpace: A default web user interface is provided for the end-user to browse a
collection, view the qualified DC metadata of an item and navigate to its bitstreams. Navigation
within an item is supported through the structural metadata, which may determine the ordering of
complex content (like book pages or web pages). A search interface is provided by default
that allows the user to search using keywords.
Greenstone: The default web user interface provides browsing and searching within collections,
and navigation within hierarchical objects (like books) using a table of contents. The presentation
of documents or search results may differ depending on the specified XSLTs.
EPrints: The web user interface provides browsing by selected metadata fields (usually
subject, title or date). Browsing can be hierarchical for subject fields. The search environment
allows the user to restrict the search query using multiple fields and to select values from lists.
Access control
DSpace: It supports users (e-people) and groups that hold different rights. Authentication is
provided through user passwords, X509 certificates or LDAP. Access control rights are kept
for each item and define the actions that a user is able to perform. These actions are: read/write
the bitstreams of an item, add/remove the bundles of an item, read/write an item, add/remove
an item in a collection. Rights are based on a default-deny policy.
Greenstone: A user in Greenstone belongs to one of two predefined user groups:
administrators or collection builders. The first group has the right to create and delete
users, while the second builds and updates collections. End-users have access to all the
collections and the documents.
EPrints: Registered users in EPrints are able to create and edit objects. Users log in
using their username and password pair.
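
A default-deny policy, as described above for DSpace, means that an action is refused unless an explicit grant exists for one of the user's groups. The following is a minimal sketch of that idea in Python; the `PolicyStore` class, its method names and the example users and groups are hypothetical and are not DSpace's authorization API.

```python
from enum import Enum
from typing import Dict, Set, Tuple


class Action(Enum):
    READ = "read"
    WRITE = "write"
    ADD = "add"
    REMOVE = "remove"


class PolicyStore:
    """Default-deny store: allowed only if a grant exists for one of the user's groups."""

    def __init__(self) -> None:
        self._memberships: Dict[str, Set[str]] = {}          # user -> groups
        self._grants: Set[Tuple[str, str, Action]] = set()   # (group, resource id, action)

    def add_member(self, user: str, group: str) -> None:
        self._memberships.setdefault(user, set()).add(group)

    def grant(self, group: str, resource_id: str, action: Action) -> None:
        self._grants.add((group, resource_id, action))

    def is_allowed(self, user: str, resource_id: str, action: Action) -> bool:
        # No matching grant (or an unknown user) falls through to a denial.
        groups = self._memberships.get(user, set())
        return any((g, resource_id, action) in self._grants for g in groups)


if __name__ == "__main__":
    store = PolicyStore()
    store.add_member("alice", "submitters")
    store.grant("submitters", "item-1", Action.READ)
    print(store.is_allowed("alice", "item-1", Action.READ))   # granted explicitly
    print(store.is_allowed("alice", "item-1", Action.WRITE))  # never granted
```

There is no "deny" rule in this design: the absence of a grant is the denial, so a newly created item is inaccessible until rights are added.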
Multiple languages support
All the DL systems use Unicode character encoding, so different languages can
be supported. Every system can use multiple languages in the metadata fields and digital
content. Keystone and EPrints provide an XML attribute on metadata fields to define the
language used for the field value. Greenstone provides ready-to-use multilingual interfaces
already translated into many languages.
Interoperability features
All the DL systems support OAI-PMH in order to share the metadata of the DL with other
repositories. Greenstone and Keystone also support the Z39.50 protocol for answering queries on
specific metadata sets. Fedora and DSpace are able to export digital objects as METS XML
files. Both systems also use persistent URIs to access the digital content, providing a unified
access mechanism to external services. DSpace also supports the OpenURL protocol, providing
links for every item page. EPrints exports data objects in METS and MPEG-21 Digital Item
Declaration Language (DIDL) format.
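
Metadata sharing over OAI-PMH, mentioned above, works through plain HTTP requests carrying a `verb` parameter. The sketch below builds a `ListRecords` request URL and extracts Dublin Core titles from a response. It is a minimal illustration using only the Python standard library; the base URL and the sample XML are made-up placeholders, and no network request is issued.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and by simple Dublin Core.
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(
    base_url: str, metadata_prefix: str = "oai_dc", set_spec: Optional[str] = None
) -> str:
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text: str) -> List[str]:
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text.strip())
    return [el.text or "" for el in root.findall(".//dc:title", NAMESPACES)]


# Made-up response fragment, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    # Hypothetical endpoint; a real repository publishes its own base URL.
    print(list_records_url("https://repository.example.org/oai/request"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A harvester would fetch the generated URL, parse each response as shown, and follow the protocol's resumption tokens for large result sets, which this sketch leaves out.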
Level of customization
DSpace: Although DSpace has a flexible object model, it is not so open to constructing very
different objects with independent metadata sets, because of its database-oriented architecture.
The user interface is fixed and provides only minor presentation interventions. Another
disadvantage is that only specific file formats are fully supported as digital content.
Greenstone: It provides customization for the presentation of a collection based on XSLTs and
agents that control specific actions of the DL. The Greenstone architecture provides (i) a back end
that contains the collections and the documents as well as services to manage them and (ii) a
web-based front end that is responsible for the presentation of collections, documents and their
searching environment.
EPrints: The data objects in EPrints contain user-defined metadata. Plug-ins can be written in
order to export the data objects in different text formats. A core API in Perl is provided for
developers who prefer to access basic DL functionality.
Findings
This comparison of major open source software packages, based on the parameters described
above, has resulted in the following findings.
• DSpace is the most popular of the digital library solutions available in the open source
domain. It is functionally richer and supports a wide range of object types, including
text, sound, images and video. It provides detailed implementation guidelines.
• GSDL and EPrints are also widely used and offer a low-cost option for repositories primarily
aimed at open access to article pre-prints and post-prints, including digital theses. A range of
object types can be uploaded, including video, audio, images and zip files. Educational
institutions dominate in the use of these packages.
• Institutions for which EPrints is not quite suitable may find that DSpace and Greenstone more
closely meet their needs, without being unnecessarily complex. India is benefiting well from
the open source movement.
• DSpace supports community-based content policies and submission processes and
accommodates various kinds of digital document formats.
• EPrints is a useful digital library system with a large user community. But where
technical support and training in using the software are needed, DSpace was found
suitable.
• Though many libraries use Greenstone, EPrints, Fedora and Keystone, the majority
of libraries prefer DSpace, as it has several advantages and can support numerous forms
and formats. It was also noted that using DSpace makes it possible to interact with
other libraries in the city for technical support. Moreover, it is open source software and can be
customized as per institutional requirements.
Conclusion:
Digital library management software (DLMS) presents an easy-to-use, customizable
architecture for creating online digital libraries. With it, institutions and organizations can
disseminate their research work, manuscripts or any other digital media for preservation and
worldwide dissemination of digital items. The software packages discussed above present different
services and architectures. It is difficult to propose one specific DLMS as the most
suitable for all cases. This comparative study of open source digital library software can be used
as a reference guide by any organization or institute to decide which package will be ideal for
creating and showcasing its digital collection. The choice usually depends on the type and format
of material, the distribution of material, the software platform, the time frame for setting up a
digital library, and so on.
(International Journal of Computer Applications (0975-8887), Volume 59, No. 16, December
2012)

Reference:

1. C. Lagoze and H. Van de Sompel. The Open Archives Initiative: Building a low-barrier
interoperability framework. In Proceedings of the Joint Conference on Digital Libraries (JCDL
'01), 2001.

2. Chamili, Khadijah (2012). Selection Criteria for Open Source Software Adoption in
Malaysia. Asian Transactions on Basic and Applied Sciences (ATBAS ISSN: 2221-4291),
Volume 02, Issue 02. Retrieved from
http://www.asian-transactions.org/Journals/Vol02Issue02/ATBAS/ATBAS-60212027.pdf

3. Goh, D., Razikin, K., Chua, Alton Y.K., Lee, Chei Sian and Foo, Schubert (2009). On the
Effectiveness of Social Tagging for Resource Discovery. Handbook of Research on Digital
Libraries: Design, Development, and Impact. pp. 251-260.
www.irma-international.org/chapter/effectiveness-social-tagging-resource-discovery/19888/

4. Ibrahim, Ushaman Alhaji. Digitization of Library Resources and Formation of Digital Libraries:
A Practical Approach. pp. 2. http://www.library.up.ac.za/digi/docs/alhaji_paper.pdf

5. Kinoshenko, D., Mashtalir, V., Shlyakhov, V. and Yegorova, E. (2012). Nested Partitions
Properties for Spatial Content Image Retrieval. Multimedia Storage and Retrieval Innovations
for Digital Library Systems. pp. 240-269.
www.irma-international.org/chapter/nested-partitions-properties-spatial-content/64471/

6. Kovacevic, A. and Devedzic, V. (2009). Duplicate Journal Title Detection in References.
Handbook of Research on Digital Libraries: Design, Development, and Impact. pp. 235-242.
www.irma-international.org/chapter/duplicate-journal-title-detection-references/19886/

7. Meitei, L.S. & Devi, P. (2009). Open source initiatives in digital preservations: The need for an
open source digital repository and preservation system. In CALIBER 2009.
http://hdl.handle.net/1944/996

8. R. Kahn and R. Wilensky. A Framework for Distributed Digital Object Services. Corporation
for National Research Initiatives, Reston, USA, 1995. Available at
http://www.cnri.reston.va.us/k-w.html

9. Shaoqun Wu and Ian H. Witten (2010). First Person Singular: A Digital Library Collection that
Helps Second Language Learners Express Themselves. International Journal of Digital Library
Systems. pp. 24-43. http://dblp.uni-trier.de/pers/hd/w/Wu:Shaoqun

10. Tyagi, Sunil (2013). The Concept of Metadata for Digital Information Resources with Special
Reference to Dublin Core (DC). Design, Development, and Management of Resources for
Digital Library Services. pp. 160-170.
www.irma-international.org/chapter/concept-metadata-digital-information-resources/72455/

11. DCMI Metadata Terms. Dublin Core Metadata Initiative. Available at
http://www.dublincore.org/documents/dcmi-terms/

12. DSpace Federation. Available at http://www.dspace.org/

13. EPrints for Digital Repositories. Available at http://www.eprints.org/

14. Fedora Project. Available at http://www.fedora.info/

15. Greenstone Digital Library Software. Available at http://www.greenstone.org/

16. Keystone DLS. Available at http://www.indexdata.dk/keystone/

17. METS: An Overview & Tutorial. Library of Congress. Available at
http://www.loc.gov/standards/mets/METSOverview.v2.html
