BUILDING A DIGITAL LIBRARY

WITH J-ISIS

BY FATUMA NAKATE
WITH ASSISTANCE
FROM PROF. EGBERT DE SMET

THE LIB@WEB END OF TRAINING PROJECT REPORT

DECEMBER 2013

Table of Contents
1.0. Introduction
1.0.1. Digital Library
1.0.2. Functional Components of Digital Library
1.0.3. Why digital collection
1.0.4. J-ISIS Software
1.1. Problem statement
1.2. Objective
1.2.1. Specific objectives
1.3. Methodology
1.3.1. Installation
1.3.2. Starting J-ISIS
1.3.3. Creating a new database for the Digital library
1.3.4. Data Entry
1.3.5. Created display format using Pft manager
1.3.6. J-ISIS Searching
1.3.7. Web J-ISIS Installation
1.3.8. Web J-ISIS
1.4. J-ISIS Strength
1.5. Weaknesses
1.6. A few Observations
1.7. A brief performance comparison between J-ISIS and Greenstone
1.8. Implementation
1.9. Recommendations to UNESCO
Conclusion
References

1.0. Introduction
Digital libraries are being created today for diverse communities and in different fields, e.g. education,
science, culture, development, health and governance. With several free digital library software packages
now available, creating and sharing information through digital library collections has become an
attractive and feasible proposition for library and information professionals around the world.
1.0.1. Digital Library
A digital library is a collection of digital documents or objects. This is how most people understand the
term today. Smith (2001), however, defined a digital library more precisely as an organized and focused
collection of digital objects, including text, images, video and audio, together with the methods for access
and retrieval and for the selection, creation, organization, maintenance and sharing of the collection.
1.0.2. Functional Components of Digital Library
Most digital libraries share common functional components. These include:
Selection and acquisition: The typical processes covered by this component include the selection of
documents to be added, subscription to databases, and the digitization or conversion of documents to an
appropriate digital form.
Organization: The key process in this component is the assignment of metadata (bibliographic
information) to each document being added to the collection.
Indexing and storage: This component carries out the indexing and storage of documents and metadata
for efficient search and retrieval.
Search and retrieval: This is the digital library interface used by end users to browse, search, retrieve
and view the contents of the digital library. It is typically presented to users as a HyperText Markup
Language (HTML) page. Advantages of the Web for end users include easy access, user friendliness,
no need to install additional software, and faster response to information requests.
1.0.3. Why digital collection
According to Tenopir (2003), the main reasons why libraries prefer digital collections are the following:
digital journals can be linked from and to indexing and abstracting databases;
access can be from the user's home, office or dormitory, whether or not the physical library is open;
the library can get usage statistics that are not available for print collections;
digital collections save space and are relatively easy to maintain; and
when total processing and space costs are taken into account, electronic collections may also result in
some overall reduction in library costs.

1.0.4. J-ISIS Software


J-ISIS is UNESCO's new information storage and retrieval database software and one of the open source
packages that can be used for digital library applications. It is a general purpose open source database
system written entirely in Java and based on existing, solid FOSS packages such as Oracle Berkeley DB
and Apache Lucene. J-ISIS uses Berkeley DB to provide unlimited storage capability and Lucene for
searching and retrieval of documentary resources. This is the software used in my project to build a
digital library. J-ISIS is still under development.
1.1. Problem statement
A great deal of free electronic information is available in the world today, for example articles, lecture
notes, books, research papers and tutorials, and it comes in various formats such as PDF, DOCX, RTF,
EPUB, XML, MP3, AVI and MPEG. Digital libraries need to be developed to organize this information
so that library users can access it easily. We are in the digital era, so digital libraries are a necessity,
especially in academic institutions: they supplement the print collection, they make information
accessible even when the library is closed, and they save physical space.
1.2. Objective
To develop a digital library using J-ISIS
1.2.1. Specific objectives
To test the J-ISIS technology for a digital library application
To identify the strengths and weaknesses of J-ISIS
To make recommendations to UNESCO
1.3. Methodology
1.3.1. Installation
Downloaded J-ISIS from
https://kenai.com/projects/j-isis/downloads/download/jisis_suite%2025%20October%202013.zip

Extracted jisis_suite to local disk (C:)

Installed the 32-bit Java Development Kit (JDK)
Installed Tomcat 7.0 in order to view the Web J-ISIS pages

1.3.2. Starting J-ISIS


Open local disk (C:), open the jisis_suite folder, open the bin folder, then start jisis_suite for 32-bit or 64-bit (32-bit in my case).

J-ISIS opening

Opened J-ISIS Window

1.3.3. Creating a new database for the Digital library


Click Database on the J-ISIS menu bar, choose Open connection, then click Finish.
The password for Web J-ISIS can also be changed here, but the user name is fixed and
cannot yet be changed in the current version.

After opening the connection, click Database on the J-ISIS menu bar, then click New
database. Give the database a name, in this case Demo-DL.

This is the Field Definition Table (FDT).
For a digital library, the publication or document field has to have the Type D…
Fields like Creator/Author, Publisher, etc. can also be added for the metadata.
Under Type, such fields are described as Alphanumeric, Numeric or Alphabetic.
Author/Creator can be repetitive in case of more than 1.
To create a data entry worksheet, drop all the fields into the worksheet definition using the
double drop-down arrows (Add all FDT fields).

Indexing techniques are applied here. For example, indexing technique 4 is applied to the
publication field, which means that each word in the text is indexed, since it is a full-text
document. The document can then be searched by a single word or by combinations of words.
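The idea behind word-by-word indexing can be illustrated with a minimal sketch. This is not how J-ISIS or Lucene implement it internally; it is a plain Python inverted index, and the function names and sample records are hypothetical. It only shows why a record becomes findable by any single word, or by a combination of words, once every word has been indexed.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each word to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(record_id)
    return index


def search(index, query):
    """Return ids of records containing every word in the query (AND)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample records; the texts are made up for illustration.
documents = {
    1: "Auditing standards for public libraries",
    2: "Digital library software and full-text indexing",
    3: "Auditing digital collections",
}
index = build_index(documents)
print(search(index, "auditing"))          # records 1 and 3
print(search(index, "auditing digital"))  # record 3 only
```

A real full-text engine adds stemming, stop words, ranking and on-disk storage, none of which this sketch attempts.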

Clicking Finish creates the database.

1.3.4. Data Entry


The next step is to enter data into the new database.
Clicked Database in the J-ISIS suite, selected Demo-DL, then clicked Finish.

After selecting the newly created database and clicking Finish, the database window appeared.
Clicked Edit, then selected Data entry.

This is the data entry worksheet.
Used this to upload documents from my computer.
J-ISIS loads the text extracted by Apache Tika (http://tika.apache.org/1.0/formats.html).
Tika supports the following document formats:
HyperText Markup Language,
XML and derived formats,
Microsoft Office document formats,
OpenDocument Format,
Portable Document Format,
Electronic Publication Format,
Rich Text Format,
Compression and packaging formats,
Text formats,
Audio formats,
Image & Video formats,
Java class files,
Archives; and
the mbox format

Clicking on the icon popped up a file selection dialogue.
Different document formats can be uploaded, e.g. PDF, DOCX, EPUB, PPT, XLS, RTF,
MP3, XML, AVI etc.

After loading one of the PDF files (not yet saved).
J-ISIS is really fast in uploading and extracting text from files.

After saving, a link was created for the uploaded document.
When saving, Lucene applies full-text indexing because the FST entry for this field is 'full-text'.

Record in Data Viewer.
After saving, went to the Data Viewer and the record is there.

At the bottom of the document, there is a link to the original document.

1.3.5. Created display format using Pft manager


Clicked Tools and selected PFT manager. Clicked New and named the new format metadata.
Clicked Generate HTML. Cut the second line (the one starting with if p(v1)) and pasted it
below line 4 (if p(v4) then..).
Edited the pasted line: in place of publication I put link, replaced (1) with (2) after </I>,
and put [2] after v1.
This created a link to the second occurrence of the field, which is the occurrence added
when the document is uploaded and saved.
Clicked Save.
The created metadata display format: below the metadata is the link to the original document,
which is the second occurrence for the uploaded publication or document.

1.3.6. J-ISIS Searching


Any word in the document can be searched, and the searched words are highlighted in the
result. In this case the searched word is Auditing.

1.3.7. Web J-ISIS Installation


Installed Tomcat 7
Copied web-jisis3.war into the webapps folder of Tomcat, as seen in the screenshots below
After Tomcat is started, the WAR is deployed automatically (extracted into a subfolder of webapps)
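The copy step above can also be scripted. The following is a minimal sketch, assuming a standard Tomcat layout with a webapps folder under the Tomcat home directory; the function name deploy_war and both example paths are hypothetical and must be adapted to the local installation.

```python
import shutil
from pathlib import Path


def deploy_war(war_file, tomcat_home):
    """Copy a WAR file into Tomcat's webapps folder and return the new path."""
    war_file = Path(war_file)
    webapps = Path(tomcat_home) / "webapps"
    if not war_file.is_file():
        raise FileNotFoundError(f"WAR file not found: {war_file}")
    if not webapps.is_dir():
        raise NotADirectoryError(f"No webapps folder under: {tomcat_home}")
    # copy2 keeps the file's timestamps and returns the destination path.
    return Path(shutil.copy2(war_file, webapps))


if __name__ == "__main__":
    # Both paths are assumptions; adjust them to the local installation.
    war = Path("C:/jisis_suite/web-jisis3.war")
    tomcat = Path("C:/Program Files/Apache Software Foundation/Tomcat 7.0")
    if war.is_file() and (tomcat / "webapps").is_dir():
        print("Deployed to", deploy_war(war, tomcat))
    else:
        print("Adjust the paths above before running.")
```

The script only copies the file; extracting the WAR is left to Tomcat itself when it starts.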

1.3.8. Web J-ISIS


This is the web interface for J-ISIS. At the moment there is only one user profile, which is Admin. Tomcat
must be started before the web interface can be opened. Web J-ISIS has the following functions: Login,
Database Selection, Browse, Edit and Search.
Web-J-ISIS Login Page
Can be accessed via http://127.0.0.1:8080/Web-JISIS3/

This is the login screen. Default user name: Admin; the password is the one defined in the J-ISIS client.

After successfully logging in, this screen appears. Clicked Continue.

Web-JISIS list of databases


This is the list of databases that were created. To view one, click Select next to the
database. It is then possible to search, browse and edit.

Web-JISIS Searching
Search interface after selecting Demo-DL.
While typing, a menu of matching words appears. This is mainly because of indexing
technique 4 (every word is indexed) and the high degree of interactivity made possible
by the Java servlet server.
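The word menu can be thought of as a prefix lookup over the indexed words. The sketch below is only an illustration of that idea, not the Web-JISIS implementation; the function name suggest and the sample terms are hypothetical.

```python
from bisect import bisect_left


def suggest(sorted_terms, prefix, limit=5):
    """Return up to `limit` index terms that start with `prefix`."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # Binary search for the first term that is >= the prefix.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


# Hypothetical index terms, kept sorted so that bisect can be used.
terms = sorted(["audit", "auditing", "auditor", "author", "digital", "library"])
print(suggest(terms, "aud"))  # ['audit', 'auditing', 'auditor']
print(suggest(terms, "au"))
```

Because every word of the full text is in the index, any word the user starts typing can be offered as a suggestion.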

1.4. J-ISIS Strength


J-ISIS is very fast in processing documents
Fast, full-text indexing using Lucene
Databases holding different kinds of content, such as video, audio, images and text, can be created
with the application, although in the current version some document formats do not open within
the client database; they do open in the web interface.
Metadata independence, i.e. there is no fixed metadata structure. It is possible to create as many or
as few metadata formats as needed, which are easily understood by the users.
Supports most electronic document formats, e.g. PDF, EPUB, DOCX, RTF, PPT, XLS, MP3,
FLV etc.

1.5. Weaknesses
No easy way to customize the web interface
There are too many search windows in the web interface
Display formats for the data are not very attractive, especially the RAW PFT, although more attractive
formats can be created using the J-ISIS client PFT wizards and the rich ISIS Formatting Language
Links are not visible as links in the web interface (this depends on how the style sheet shows them;
they could be displayed in the standard blue-underlined way, but this has not been done)
No user interface
1.6. A few Observations

Videos do not play within the J-ISIS database but do play in Web-JISIS, although the video cannot be
fast-forwarded.

Some file formats, e.g. PPT, RTF and EPUB, also do not open within the J-ISIS database, but they do
open in Web J-ISIS.

It is not possible to download videos or audio in the web interface.

The user comments worksheet does not work well.

1.7. A brief performance comparison between J-ISIS and Greenstone


I used the same collection (14 documents of different formats, i.e. PDF, EPUB, PPT, DOCX, XLS, XML, RTF
and DOC) for both J-ISIS and Greenstone (GSDL), and made the following observations:
J-ISIS is faster than Greenstone in collecting, indexing and building the collection.
GSDL allows folders of (multiple) documents to be added to the collection, whereas J-ISIS uses a
one-by-one approach, but J-ISIS is still (much) faster than GSDL.
J-ISIS processed all 14 documents, while GSDL only managed to process 5; 4 were unrecognized
and 5 were rejected.
J-ISIS allows documents to be edited in the web interface (although uploads are not possible in
the current version), whereas the GSDL web interface does not allow editing.
The display format of documents in the GSDL web interface is better than that of Web-JISIS.

1.8. Implementation
I work in a small institution with around 1500 students and around 100 employees. J-ISIS could be a very
good software package for managing our digital library, but in order to implement it I will need to join
the user community so as to gain a very good understanding of J-ISIS and have my questions answered,
such as how to improve the user interface, how to create user profiles and how to make a better search
interface. I hope this forum can help: https://kenai.com/projects/j-isis/forums/forum .
Greenstone could be another option for a digital library at my workplace, together with
ABCD site building for my library page.

1.9. Recommendations to UNESCO

Allow interactivity, i.e. provide a user comment box for each document viewed. This should be easy to
do as a new repeatable field in a dedicated data entry worksheet.

Provide a way of sorting the display formats so that the best PFT is on top, or provide one easy
and understandable PFT format for users.

Provide a way of creating user profiles.

Reduce the search interface to only 2 search possibilities, i.e. basic and advanced.

Include an upload button under Edit (data entry) in the web interface so that documents can also
be uploaded from the web interface.

Allow customization of the web interface so as to have an easy and user-friendly interface; the
current web interface is not attractive.

More promotion should be done and more time invested in J-ISIS so that it becomes as popular as
Greenstone, because J-ISIS could be the next big digital library software with its upload speed
and storage capability.

Conclusion
Building collections with the J-ISIS digital library feature is fast and easy. Its tested features include
structural independence, the capacity to deal with any document size and format, support for various
languages, and Lucene-based full-text indexing. J-ISIS could be another powerful digital library solution
if more time and money are invested in the project so that it becomes fully functional and gains an active
user community.

References
Berhe, Helen Hagos (2012). Extension with digital library technology of UNESCO's J-ISIS
database software. MSc thesis.
De Smet, E. (2012b). Digital libraries with J-ISIS: a preliminary account of possibilities and
performance, by H.H. Berhe, E. de Smet. Library Hi Tech News, Vol. 29, Iss. 2, pp. 7-10.
Smith, Abbey (2001). Strategies for Building Digitized Collection. Washington, D.C.: Digital Library
Federation, Council on Library and Information Resources. Available at http://www.clir.org.
Tenopir, C. (2003). Use and Users of Electronic Library Resources: An Overview and Analysis of Recent
Research Studies (with the assistance of Brenda Hitchcock and Ashley Pillow). Council on Library and
Information Resources, Washington, D.C.
The ISIS3 Conference (2010). New Challenges for a new future of ISIS.
