
I. NATURE AND DEVELOPMENT OF INFORMATION SYSTEM

Nature of Information

 Value of information. Unlike tangible resources, information is not readily
quantifiable - that is, it is impossible to predict the ultimate value of information to its
users. Also, there is no predictable change in the value of information over time.
 Multiplicative quality of information. The results produced by the use of
information differ greatly from those produced by the use of other resources - for
instance, information is not lost when given to others, and does not decrease when
'consumed': sharing information will almost always cause it to increase - that is,
information has a self-multiplicative quality.
 Dynamics of information. Information cannot be regarded as a static resource to be
accumulated and stored within the confines of a static system. It is a dynamic force
for change to the system within which it operates.
 Life cycle of information. Information seems to have an unpredictable life cycle.
Ideas come into, go out of, and finally come back into, fashion.
 Individuality of information. Information comes in many different forms, and is
expressed in many different ways. Information can take on any value in the context of
an individual situation. This suggests that, as a resource, information is different from
most other resources.

Source:
Meyer, H.W.J. (2005). "The nature of information, and the effective use of
information in rural development" Information Research, 10(2) paper 214
(Available at http://InformationR.net/ir/10-2/paper214.html)

Definition of Terms:

Information System
- The acquisition, processing, storage and dissemination of vocal, pictorial, textual and
numerical information by a microelectronics-based combination of computing and
telecommunications
- The collection, storage, processing, dissemination and use of information. It is not
confined to hardware and software, but acknowledges the importance of man and the
goals he sets for his technology, the values involved in making choices, and the assessment
criteria used to decide whether he is controlling and being enriched by it.

Information Retrieval
- Process of recovering or retrieving documents from a given collection, which are
relevant to a request.
- Implies document retrieval which will contain information relevant to the request.
- IR differs from data retrieval because the latter implies satisfaction of a request for
information by providing the information as a direct answer to the question.
- Information retrieval is the process of searching within a document collection for a
particular information need (called a query)

Information Retrieval System

An Information Retrieval System is a software programme that stores and manages
information on documents, often textual documents, but possibly multimedia. The system
assists users in finding the information they need. It does not explicitly return information or
answer questions. Instead, it informs on the existence and location of documents that might
contain the desired information. Some suggested documents will, hopefully, satisfy the user’s
information need. These documents are called relevant documents. A perfect retrieval system
would retrieve only the relevant documents and no irrelevant documents. However, perfect
retrieval systems do not exist and will not exist, because search statements are necessarily
incomplete and relevance depends on the subjective opinion of the user.
An Information Retrieval System is a system that is capable of storage, retrieval, and
maintenance of information. Information in this context can be composed of text (including
numeric and date data), images, audio, video and other multi-media objects. Although the
form of an object in an Information Retrieval System is diverse, the text aspect has been the
only data type that lent itself to full functional processing. The other data types have been
treated as highly informative sources, but are primarily linked for retrieval based upon search
of the text.

An information retrieval system (IRS) is a mechanism for carrying out the
information retrieval process, involving the following functions:
1.) The information is created and acquired for the system
2.) Knowledge records are analyzed and tagged by sets of index terms
3.) The knowledge records are stored physically and the index terms are stored into a
structured file, either manual or computerized
4.) The user’s query is tagged with sets of index terms and then is matched against the
tagged records
5.) Matched documents are retrieved for review
6.) Feedback may lead to several reiterations of requests

An information retrieval system merely informs the user of the existence (or
non-existence) and whereabouts of documents relating to a request.

3 Processes of Information Retrieval System

1. The representation of the content of the documents

 Usually called the indexing process. The indexing process results in a representation
of the document. The indexing process may include the actual storage of the
document in the system, but often documents are only stored partly, for instance only
the title and the abstract, plus information about the actual location of the document.

2. The representation of the user’s information need

 Users do not search just for fun; they have a need for information. The process of
representing the user’s information need is often referred to as the query
formulation process. The resulting representation is the query. Query formulation
might denote the complete interactive dialogue between system and user, leading not
only to a suitable query, but possibly also to the user better understanding his/her
information need.

3. The comparison of the two representations.

 The comparison of the query against the document representations is called the
matching process. The matching process usually results in a ranked list of documents.
Users will walk down this document list in search of the information they need.
Ranked retrieval will hopefully put the relevant documents towards the top of the
ranked list.
Source:

Goker, A., Davies, J., Graham, M. (2009). Information Retrieval : Searching in the
21st Century. Hoboken, NJ : Wiley, 2009. Retrieved January 26, 2012 from
http://site.ebrary.com/lib/uniofmindanao/Doc?id=10358784&ppg=28

Langville, A.N. & Meyer, C.D. (2006). Google’s PageRank and beyond: the science of
search engine rankings. Princeton : Princeton University Press.
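
The three processes described above (indexing, query formulation, and matching) can be illustrated with a minimal sketch. The toy collection, the function names `represent` and `match`, and the simple term-overlap score are assumptions made for illustration only; they are not taken from the cited sources.

```python
from collections import Counter

# Hypothetical toy collection: only a title-like string is stored per document,
# in the spirit of partial storage (title and abstract only).
DOCUMENTS = {
    "d1": "information retrieval systems and indexing",
    "d2": "library cataloging and classification",
    "d3": "indexing and abstracting for information retrieval",
}


def represent(text):
    """Indexing or query formulation: reduce text to a bag of lowercase terms."""
    return Counter(text.lower().split())


def match(query, documents):
    """Matching: score each document by how many query terms it shares."""
    query_terms = represent(query)
    scores = {}
    for doc_id, text in documents.items():
        doc_terms = represent(text)
        score = sum(min(count, doc_terms[term]) for term, count in query_terms.items())
        if score > 0:
            scores[doc_id] = score
    # Ranked list: highest score first, ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


ranked = match("indexing information", DOCUMENTS)
print(ranked)
```

The system returns document identifiers rather than answers, which mirrors the point that an IR system only informs the user of the existence and location of possibly relevant documents.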

Purposes of Information Retrieval System

1. Designed to retrieve the documents or information required by the user


2. To collect and organize information in one or more subject areas in order to provide it
to the user as soon as asked for.
3. Serves as a bridge between the world of creators or generators of information and the
users of that information.

Functions of Information Retrieval System

1. To identify the information (sources) relevant to the areas of interest of the target
users’ community
2. To analyze the contents of the sources (documents)
3. To represent the contents of the analyzed sources in a way that will be suitable for
matching users’ queries
4. To analyze users’ queries and to represent them in a form that will be suitable for
matching with the database
5. To match the search statement with the stored database
6. To retrieve the information that is relevant
7. To make necessary adjustments in the system based on feedback from the users.

Broad Outline of Information Retrieval

1. Subject/Content Analysis
 Includes the tasks related to the analysis, organization and storage of
information
 Designing methods for identification and representation of the various
bibliographic elements essential for documents, automatic content analysis,
and text processing.

2. Search and Retrieval Process


 Includes the tasks of analyzing users’ queries, creation of a search formula, the
actual searching, and retrieval of information.
 Develop searching techniques, user interfaces, and various techniques for
producing output for local as well as remote users

Kinds of Information Retrieval Systems

1. In-house information retrieval systems


 Are set up by a particular library or information center to serve mainly the
users within the organization.
e.g. OPAC – provides facilities for library users to carry out online catalog
searches, and then to check the availability of the item required.

2. Online Information Retrieval System


 Are those that have been designed to provide access to remote databases to a variety
of users. Such services are available mostly on a commercial basis, and there are a
number of vendors that handle this sort of service

Source:
Chowdhury, G. G. (2004). Introduction to modern information retrieval. 2nd ed.
London : Facet Pub. pp.2-4

II. MODELS OF INFORMATION RETRIEVAL SYSTEMS

1. The Boolean Search Model

 George Boole (1815-1864) devised a system of symbolic logic in which he used
three operators (+, x, and -) to combine statements in symbolic form.
 John Venn later expressed Boolean logic relationships through what are known as
Venn diagrams. The three operators of Boolean logic are the logical sum (+), logical
product (x), and logical difference (-). Information retrieval systems allow users to
express their queries by using these operators.
o Logical product or AND logic allows the searcher to specify the coincidence
of two or more concepts
Example:

In order to ask for information on “computers and information retrieval” the
user may formulate the search statement as:

COMPUTERS AND INFORMATION RETRIEVAL

SOCIAL AND ECONOMIC - (will produce the set of documents that
are indexed both with the term social and the term economic)

o Logical sum or OR logic allows the searcher to specify alternatives among
search terms (concepts). The searcher indicates that items on either of these
two topics, or both, will serve the purpose.
Example:

COMPUTERS OR INFORMATION RETRIEVAL

SOCIAL OR POLITICAL – (will produce the set of documents that
are indexed with either the term social or the term political, or both)
o Logical difference or NOT logic provides facilities to exclude items from a
set.

Example:

INFORMATION RETRIEVAL AND NOT DBMS

Note: Retrieved sets are visualized by the shaded areas

Source of Figure:
Goker, A., Davies, J., Graham, M. (2009). Information Retrieval : Searching in the 21st
Century. Hoboken, NJ : Wiley, 2009. Retrieved January 26, 2012 from
http://site.ebrary.com/lib/uniofmindanao/Doc?id=10358784&ppg=4
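
The three Boolean operators map directly onto set operations, as the following minimal sketch suggests. The document identifiers and the index terms assigned to them are hypothetical, and `posting` is an illustrative helper name.

```python
# Hypothetical index terms assigned to four documents.
INDEX = {
    "d1": {"computers", "information retrieval"},
    "d2": {"information retrieval", "dbms"},
    "d3": {"computers"},
    "d4": {"social", "economic"},
}


def posting(term):
    """Return the set of documents indexed with the given term."""
    return {doc_id for doc_id, terms in INDEX.items() if term in terms}


# AND (logical product): documents indexed with both terms.
and_result = posting("computers") & posting("information retrieval")

# OR (logical sum): documents indexed with either term, or both.
or_result = posting("computers") | posting("information retrieval")

# AND NOT (logical difference): exclude documents indexed with the second term.
not_result = posting("information retrieval") - posting("dbms")

print(sorted(and_result), sorted(or_result), sorted(not_result))
```

Each result corresponds to one shaded region of a Venn diagram: the overlap for AND, the whole of both circles for OR, and one circle minus the overlap for NOT.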

2. Probabilistic Retrieval Model

 Probability theory has been used as principal means for modeling the retrieval process
in mathematical terms.
 Retrieval models based on probabilistic approach:
o Maron and Kuhns (1960) – They advocated that the probability that a given
document would be relevant to a user can be assessed by a calculation of the
probability, for each document in the collection, that a user submitting a
particular query would judge that document relevant. Thus, for a query
consisting of only one term, the probability that a particular document will be
judged relevant is the ratio of the number of users who submit the query term and
consider the document relevant to the total number of users who submit the
query term. Adopting this approach, one has to employ historical information
to calculate the probability of relevance: the number of times that users who
submitted a particular query term judged a document relevant, compared with
the total number of users who submitted that particular query term.
o Robertson and Sparck Jones – The essence of this approach is that the
probability of relevance can be calculated not for a set of users employing a
particular query term in relation to a given document, but for a set of
documents having a particular property in relation to a given user.
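
The single-term estimate described for Maron and Kuhns can be written out as a short calculation. This is only a sketch of the ratio stated above; the session history, the document identifiers, and the function name `probability_of_relevance` are hypothetical.

```python
# Hypothetical history: one record per user, giving the query term submitted
# and the set of documents that user judged relevant.
SESSIONS = [
    ("indexing", {"d1", "d2"}),
    ("indexing", {"d1"}),
    ("indexing", set()),
    ("indexing", {"d1"}),
    ("thesaurus", {"d3"}),
]


def probability_of_relevance(term, doc_id, sessions):
    """Users who submitted the term and judged doc_id relevant,
    divided by all users who submitted the term."""
    submitted = [relevant for t, relevant in sessions if t == term]
    if not submitted:
        return None  # no historical information for this term
    judged_relevant = sum(1 for relevant in submitted if doc_id in relevant)
    return judged_relevant / len(submitted)


print(probability_of_relevance("indexing", "d1", SESSIONS))
print(probability_of_relevance("indexing", "d2", SESSIONS))
```

The `None` result for an unseen term shows the practical limitation noted in the text: the estimate depends entirely on historical relevance judgments.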

3. The Vector Processing Model

 The vector processing model assumes that an available set of terms, represented as
term vectors, is used for both the stored records and information requests. Collectively, the terms
assigned to a given text are used to represent text content.
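
As a minimal sketch of this idea, a stored record and a request can both be turned into vectors over the same term set and then compared. The vocabulary `TERM_SET`, the sample texts, and the use of raw term counts with an inner product are assumptions chosen for illustration.

```python
# Hypothetical shared term set used for both documents and queries.
TERM_SET = ["abstracting", "indexing", "retrieval", "thesaurus"]


def to_vector(text):
    """Represent a text as counts of each term in TERM_SET."""
    words = text.lower().split()
    return [words.count(term) for term in TERM_SET]


def inner_product(u, v):
    """Sum of pairwise products; larger values mean more shared terms."""
    return sum(a * b for a, b in zip(u, v))


doc_vector = to_vector("indexing and abstracting support retrieval")
query_vector = to_vector("indexing retrieval")

print(doc_vector, query_vector, inner_product(doc_vector, query_vector))
```

Words outside the term set (here "and" and "support") simply do not appear in the vector, which is why the choice of term set shapes what can be retrieved.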

4. Best match searching and relevance feedback model

 Best match searching is designed to produce ranked output. It therefore requires a
method to measure the relative importance of the retrieved items, which again
requires some method of weighting the search terms.
 A similarity measure comprises two major components: 1.) A term weighting scheme
that reflects the importance of a term by allocating numerical values to each index
term in a query or document and 2.) A similarity coefficient which uses these weights
to calculate the similarity between a document and a query
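
The two components of a similarity measure can be sketched as follows. The text names no particular scheme, so this example assumes TF-IDF as the term weighting scheme and the cosine coefficient as the similarity coefficient; both are common choices, not ones prescribed by the source. The collection `DOCS` and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical collection of already-tokenised index terms.
DOCS = {
    "d1": "indexing indexing thesaurus",
    "d2": "abstracting retrieval",
    "d3": "indexing retrieval retrieval",
}


def idf(term, docs):
    """Inverse document frequency: rarer terms receive higher values."""
    df = sum(1 for text in docs.values() if term in text.split())
    return math.log(len(docs) / df) if df else 0.0


def weights(text, docs):
    """Component 1, term weighting: term frequency multiplied by IDF."""
    counts = Counter(text.split())
    return {term: tf * idf(term, docs) for term, tf in counts.items()}


def cosine(u, v):
    """Component 2, similarity coefficient: cosine of two weighted vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(query, docs):
    """Return document ids ranked by decreasing similarity to the query."""
    q = weights(query, docs)
    scored = [(cosine(q, weights(text, docs)), doc_id) for doc_id, text in docs.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


ranking = best_match("indexing thesaurus", DOCS)
print(ranking)
```

Unlike a Boolean search, the output is an ordered list: d1 shares both query terms and ranks first, d3 shares only one and ranks second, and d2 shares none and is dropped.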

5. Natural language processing model

 Process and match query and document sentences, keeping in view the context or the
domain, resulting in more relevant information retrieval.
 Involves three levels of processing:
o Syntactic analysis, which is required to understand the structure of a given
sentence. It generally relies on a lexicon containing words with associated
information.
o Semantic analysis, which deals with the meaning of the words and the sentence.
Semantic knowledge is usually stored in a knowledge base. It is used to derive
meaning, and to resolve ambiguities that cannot be resolved by structural
considerations alone.
o Pragmatic analysis that takes into consideration the specific domain and the
context. Pragmatic knowledge, i.e. the knowledge about a specific situation,
allows the system to eliminate the ambiguities and complete the semantic
interpretations.

6. Hypertext Model

 Is an interactive navigational structure that allows users to browse text
non-sequentially; it consists basically of nodes which are connected by directed links in a
graph structure.
 Allows users to navigate within the different parts of a text, and among the different
texts in a collection.
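
A hypertext of this kind can be sketched as a small graph of nodes and outgoing links. The node names, their text, and the helper functions `follow` and `reachable` are hypothetical and serve only to illustrate non-sequential browsing.

```python
# Hypothetical hypertext: each node holds some text and a list of links
# pointing to other nodes.
NODES = {
    "intro": {"text": "What is an index?", "links": ["types", "history"]},
    "types": {"text": "Types of indexes", "links": ["history"]},
    "history": {"text": "Development of indexing", "links": ["intro"]},
}


def follow(node, link_number):
    """Move from a node to the target of one of its outgoing links."""
    return NODES[node]["links"][link_number]


def reachable(start):
    """Return every node a user could reach by following links from start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(NODES[node]["links"])
    return seen


# A user browses non-sequentially by choosing links, not by reading in order.
path = ["intro"]
path.append(follow(path[-1], 0))  # intro -> types
path.append(follow(path[-1], 0))  # types -> history
print(path, sorted(reachable("types")))
```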

III. ROLE OF INDEXING, ABSTRACTING AND THESAURUS IN INFORMATION
RETRIEVAL

Access to information is determined largely by successful information organization. A
successful information organization involves the following:
a.) Classification theory and techniques
b.) Principles of vocabulary control
c.) the relationship of information organization to users and their searching culture.

Indexing is a fundamental concept in information retrieval, since it determines the
effectiveness and efficiency of the retrieval process and results.

In the context of information retrieval, an index is a tool which serves to indicate to
the researcher the information and/or documents which are potentially relevant to his or her request.

A library, no matter how large its collection, is of little value if it is unable to
retrieve the right documents required by the users. In other words, a large collection of
documents is of little value in itself unless documents can be recovered when needed.

Librarians form a vital part in the process of information retrieval through bibliographic
control, which is clearly manifested in the following activities:

1. Cataloging and classification
2. Indexing
3. Abstracting

Among the library materials that require bibliographic control, so that the information
they contain can easily be retrieved, are the periodicals. Periodical indexing is the best
way of informing library users of the available information found in journals, magazines
and newspapers. Through indexes, periodical collections can easily be retrieved and made
available.

Introduction to Information Analysis

 Information analysis makes a significant contribution to communication and
information flow

Knowledge and Information

 Knowledge that is in the possession of individuals is now being stored in the
library
 Knowledge = familiarity gained by experience
= person’s range of information

 Information = informing or telling things to

Libraries and Information


 Libraries contain information in many different physical forms
 The materials that contain the information are arranged in the library in a way
they can be easily located.
 As materials grow, the problem of locating them easily soon arises; thus
librarians devise ways of countering this problem by providing substitutes for
the physical forms, called RECORDS
 These small records can be gathered together in one place, like a tray of cards
 Books are cataloged & classified so that they can be easily located.
 However, a library which contains many periodicals will rely on indexes and
abstracts.

Steps in Information Analysis

1. Examination of the document


2. Identification of indexable concepts
3. Translation of the concepts into the indexing language of the system

Information Analysis Tools

1. Abstracts
2. Indexes

Both are vital components in the communication link between the originator of
information and the receiver.

Role of Libraries

 Collect
 Store
 Organize
 Make information available to users

IV. INDEX AND INDEXING

Definition of Terms:
INDEX
 came from the Latin word “indicare”, which means to point out
 A tool which indicates to a user the information or source of information that one
needs.
 A systematic guide to items contained in, or concepts derived from a collection.
These items or derived concepts are represented by entries arranged in a known
or stated searchable order such as alphabetical, chronological, or numerical.
 The term COLLECTION is used to denote a body of materials indexed – a single
or composite text (e.g. Treatise, Anthology, Encyclopedia, Periodicals); a group
of such texts; or a set of representations.
 The term ITEM means any book, article, or report available in the collection
 An ENTRY is the basic unit of an index. It not only identifies the item or
concept but also guides the user to its location. It is a record of an item in a catalog.
 ELEMENT is a distinct unit of an area of description.
 An INDEX therefore is an identifier of content and location. It is an operational
tool, a means to an end, and not the end itself. It provides the required
communication link between the sources and seekers of information. An index is
a gate to, and not a surrogate of, the original document. It is also called a source locator
because it tells the exact location of sources of information.

INDEXING
 The process of analyzing the informational content of records of knowledge and
expressing the informational content in the language of the indexing system. It
involves:
 Selecting indexable concepts in a document.
 Expressing these concepts in the language of the indexing system as
index entries.
 The process of identifying and assigning index terms to a document either to
describe its physical characteristics, give facts about its creation or distribution,
or describe its content so that its contents are made known and the index created
can help in retrieving specific items of information.

INDEX TERM
 Is the word, phrase or symbol assigned by the indexer to the subject content or
concept of a document he or she is indexing.

INDEXING SYSTEM
 A system of prescribed procedures (manual and/or machine) for organizing the
contents of records of knowledge for purposes of retrieval and dissemination

INDEXER
 A person whose profession is the preparation of indexes. In his professional job
he is to perform two important functions – analysis and translation.
Analysis – analyze the documents to identify important concepts

Translation – identified concepts are translated into words or phrases.

 There are two types of indexer: Author Indexer and Professional Indexer.
There is controversy about the quality of indexes produced by these two groups.
According to one opinion, the author knows the subject better and can do more
justice to indexing of his creations. The other group holds the view that the
author is more concerned with the ideas and the indexer is more concerned with
the clienteles of different levels with different approaches. Indexing can be
carried out more effectively by professional indexers.

4.1 Development of indexes and indexing

3RD CENTURY B.C


 Callimachus made a list called Pinakes that served as a guide to information in the
thousands of papyrus rolls in the Alexandrian library. His work is the oldest known
manuscript catalog.
 Practice of abstracting the plots of plays and inserting them before the script was
developed and was known as hypotheses.
 Abstracts for business records were also prepared
 Indexes are probably as old as published writing and exist in virtually every
language.
 The early indexes were limited to personal names.
 Word indexes were used with religious writings.
 Topical (Subject) indexes were found frequently, although the order of entries in the
indexes remained unsystematic for a long time.

L.W. Daly, in his work Contributions to a History of Alphabetization in Antiquity and the
Middle Ages, notes that the use of symbols in textual criticism and hermeneutics was associated with
the efforts to bring out pertinent information rapidly from documents.
Aristophanes of Byzantium and Aristarchus from Alexandria are said to have invented
critical symbols.

Cassiodorus also worked out an elaborate system of symbols to be used in biblical
commentaries so that students could readily find required information on a particular passage.

The idea of an alphabetic index came into reality with the general adoption of the codex
(manuscript) form of the book. A large number of incunabula (books printed before 1501)
contain alphabetic indexes. However, as explained by Wheatley (in his book ‘What is an Index?’),
it is nominative rather than accusative and generally means “Table Of Contents” or “Literary
Guide.”

12TH CENTURY
 Alphabetical indexing emerged when debate was developed as a technique for
intellectual discourse in the universities of Europe.

14TH CENTURY
 Index which was taken verbatim from the text was placed in front of the document.
Often times, proper keywords were not used.
 Annotation of manuscripts, library catalogs and bibliographies were introduced.

18th CENTURY
 Witnessed the advent of the professional indexers.
 Alexander Cruden – prepared the first complete Concordance of the Bible in 1737.
 Johnson’s famous Dictionary of the English Language was published in 1755. In his
book, he employed 6 professional indexers to assist him.

19th CENTURY - indexing improved both in quantity and quality


 William Frederick Poole introduced the idea of one index to many periodicals.
Poole’s Index to Periodical Literature (1882) created subject entries from keywords
in the titles of the articles indexed.
 John Shaw Billings prepared the first index for medical literature in 1880.
 Charles Ammi Cutter codified subject cataloging principles through his Rules for a
Printed Dictionary Catalog (1876)
 Andrea Crestadoro introduced KWIC indexing under the name “Keyword in Title”
in 1856
 Publication of the first separate Index Volume of Encyclopedia Britannica in 1874

20th CENTURY - marks the great age of indexing. Authors, publishers, the reading public
and literary critics have become more and more conscious of indexes. A whole new discipline
was created to study indexing techniques and theory and to develop criteria for assessing the
effectiveness of indexes.

 H.W. Wilson published his Readers’ Guide to Periodical Literature (Subject index)
(1901)
 Hans Peter Luhn introduced the mechanized form of derived title indexing known
as KWIC (Keyword in Context)
 Calvin Mooers developed an indexing system known as Zatacoding
 M.M. Kessler developed an indexing technique called Bibliographic Coupling.
Published an excellent list of Reference Books for indexers.
 Mortimer Taube developed the “uniterm system” or one concept term. He used cards
with headings displayed at the top.
 Free indexing language – an indexing language that uses any word or term
that suits the subject as an indexing term.
 Timothy C. Craven – he introduced the Nested Phrase Indexing Systems
(NEPHIS)
 C.W. Cleverdon – his Cranfield Project is a landmark in evaluating the performance
of indexing languages.
 Derek Austin – designed and developed the Preserved Context Index System
(PRECIS)
 S.R. Ranganathan - introduced Chain indexing.

21st CENTURY – emergence of machine-aided indexing/automatic indexing.


 The computer extracts words and/or phrases from documents
 The use of MARC 21 formats
 The largest medical library in the United States is the National Library of Medicine,
located in the State of Maryland. It published Index Medicus, a subject/author guide
to medical articles that is more than a century old.
 At present, the largest online bibliography database in the world is the OCLC Online
Union Catalog
 ISBD (International Standard Bibliographic Description) is an international format
standard for representing descriptive information in bibliographic records. At
present, ISBD covers 8 types of record formats.

4.2 Purpose and uses of indexes

General Purposes of Index

1. To construct representations of documents in a form that is suitable for users to
browse
2. To minimize the time and effort in finding information – Give users systematic and
effective shortcuts to the information they need.
3. To maximize the searching success of the users – provide a system of accurate and
almost complete cross-references to related information to ensure satisfaction of
information need.

Uses of Index

1. Facilitate reference to the specific material or to locate wanted information


2. Serve as filter to withhold irrelevant materials
3. Make the information storage and retrieval system useful to individual
4. Disclose related information by means of see also references.
5. Direct users seeking information under terms not chosen as index headings to
headings that have been chosen by means of see references.
6. Provide a comprehensive overview of a subject field.
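
The see and see also references in items 4 and 5 can be sketched as two small lookup tables placed in front of the index entries. The headings, page locators, and the function name `look_up` are hypothetical and chosen only to show the mechanism.

```python
# Hypothetical index: chosen headings with page locators.
ENTRIES = {"WOMEN": [12, 48], "INFERTILITY": [77], "HEALTH SERVICES": [5, 30]}

# "See" references: a term not chosen as a heading -> the chosen heading.
SEE = {"FEMALE": "WOMEN"}

# "See also" references: a chosen heading -> related headings.
SEE_ALSO = {"WOMEN": ["INFERTILITY"]}


def look_up(term):
    """Resolve a user's term to a heading, its locators, and related headings."""
    heading = SEE.get(term.upper(), term.upper())
    return {
        "heading": heading,
        "locators": ENTRIES.get(heading, []),
        "see_also": SEE_ALSO.get(heading, []),
    }


result = look_up("Female")
print(result)
```

A user who searches under a term that was not chosen is redirected to the chosen heading, and is also told which related headings may hold further information.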

4.3 Types of Indexes


1. Author Indexes – entry points are names of persons, organizations, government
agencies, names of educational institutions, etc.
2. Alphabetical Subject Indexes – covers a number of different kinds of indexes. The
arrangement is in alphabetical order and follows a familiar pattern.
3. Classified Indexes – entry points are arranged in a hierarchy of related topics starting
with generic or broad topics and working down to the specific ones.
Examples:

- Index Medicus – classified index in the field of medicine and related
disciplines
- Engineering Index – classified index in the field of engineering and related
disciplines
4. Word and Name Indexes – indexes to individual names and words that the author
used.

5. Book Index/Back-of-the-Book Index – a list of words or groups of words, generally
alphabetical, at the back of the book giving a page location of the subject or name
associated with each word or group of words.

6. Periodical Indexes/Newspaper Indexes – based on the same principles and has the
same general objectives as a book index but its scope is broader. Periodical indexes
are open-ended projects usually performed by a group of people. Each issue of a
periodical may deal with unrelated topics by several authors, written in different styles
and aimed at different users.
7. Computer-Based Indexes – necessitate the use of computing machines to generate
index entries. There are two methods employed by computer-based indexes:
- automatic indexing wherein one has to rely on the computer to construct
indexes
- computer-assisted indexing wherein the machines do the routine work while
a human performs the intellectual task of indexing.
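
The two methods of computer-based indexing can be contrasted in a minimal sketch. It assumes a very simple automatic technique (word frequency after removing stop words), which is only one possible approach; the stop list, the sample sentence, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical stop list of words that are never used as index terms.
STOPWORDS = {"a", "an", "and", "of", "the", "to", "in", "is", "for"}


def automatic_index(text, max_terms=3):
    """Automatic indexing: the computer alone selects the most frequent words."""
    words = [w.strip(".,;:").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    # Most frequent first; ties broken alphabetically for a stable result.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:max_terms]


def computer_assisted_index(text, accept):
    """Computer-assisted indexing: the machine proposes, a human decides."""
    candidates = automatic_index(text, max_terms=10)
    return [term for term in candidates if accept(term)]


SAMPLE = "Indexing of periodicals: the indexing of journals and the indexing of newspapers."
terms = automatic_index(SAMPLE)
approved = computer_assisted_index(SAMPLE, accept=lambda t: t != "journals")
print(terms, approved)
```

Here the `accept` callback stands in for the human indexer's intellectual judgment, while counting and sorting are the routine work left to the machine.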

Differences Between Book and Periodical Indexing

Book Index                             | Periodical Index

compiled only once, within a           | a continuous process, more often
relatively short time, and usually     | performed by a team of indexers,
performed by a single person           | and lasts for an extended period

deals with a more or less well-defined | deals with a great variety of topics
central topic                          |

indexing terms are almost always       | terminology must be consistent and
derived from the text                  | is usually derived from a controlled
                                       | vocabulary

specificity is largely governed by     | terms are prescribed by a controlled
the text itself                        | vocabulary and their level of
                                       | specificity may be lower than that
                                       | of a book index

every single page of a book must be    | articles are scanned for indexable
read                                   | items and indexing may rely on the
                                       | abstract or summary compiled

virtually the entire text is subject   | a periodical index will depend on a
to indexing                            | number of policy decisions

always bound with the indexed text     | compiled separately

4.4 Principles and concepts of indexing


- the effectiveness of an indexing system is controlled by 4 parameters:
1. Exhaustivity
 Refers to the extent to which concepts or topics are made retrievable by means
of index terms. It may imply giving only the overall theme of the item. It may
also mean giving the main theme as well as the subordinate themes, to cover the
subject matter of the document rather completely.
 The extent to which a document is analyzed, either with the use of numerous
terms or use of a few terms to cover subject content
 Where indexing goes through the entire text almost sentence by sentence
 This level is primarily used for documents which are consulted in great detail
e.g. Court decisions
Degrees of exhaustivity:
- refers to the degree to which the subject matter of a given document has been
reflected through the index entries
Summarization – identifies only a dominant, over-all subject of the item
recognizing only concepts embodied in the main theme. This is usually
observed in cataloging subject analysis.
Depth indexing – aims to extract all the main concepts dealt with in a
document, recognizing many sub-themes and sub-topics. This is usually
practiced in the subject analysis of parts of items (journal articles, chapters in
books, etc.)

2. Selective indexing
 The use of the few terms to cover only the main or major theme of a
document.
 Only the information of interest to users has been selected

Example:

Title of the Article: Court orders arrest of PCIJ writer

Indexing (Exhaustive):
- Libel and Slander
- Philippine Center for Investigative Journalism
- Garcia, Winston
- Samonte-Pesayco, Sheila
- Government Service Insurance System
- Codilla, Ramon, Jr.
- Regional Trial Court
- Cebu City
- Freedom of Speech
- Coronel, Sheila
- Graft and corruption
- Department of Justice

Indexing (Selective):
- Libel and Slander
- Philippine Center for Investigative Journalism
- Garcia, Winston
- Samonte-Pesayco, Sheila
- Government Service Insurance System

3. Specificity
 Refers to the extent to which a concept or topic in a document is identified by
a precise term in the hierarchy of its genus-species relationship. If the
descriptors used are parallel to the concepts contained in the item and
represent these concepts correctly, then the specificity level of indexing is
high.
Examples:
- An article about musicians should be entered under Musicians not under
Performing Arts.
- An article on the cultivation of oranges, indexed under Oranges rather than
Citrus Fruits or Fruits
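
The genus-species idea behind specificity can be sketched by recording each term's broader term and preferring the candidate that sits deepest in the hierarchy. The `BROADER` table and the function names are hypothetical; the terms themselves come from the examples above.

```python
# Hypothetical genus-species hierarchy: each term points to its broader term.
BROADER = {
    "Oranges": "Citrus Fruits",
    "Citrus Fruits": "Fruits",
    "Musicians": "Performing Arts",
}


def depth(term):
    """Count the broader terms above a term; deeper means more specific."""
    level = 0
    while term in BROADER:
        term = BROADER[term]
        level += 1
    return level


def most_specific(candidates):
    """Choose the candidate heading that lies deepest in the hierarchy."""
    return max(candidates, key=depth)


print(most_specific(["Fruits", "Citrus Fruits", "Oranges"]))
print(most_specific(["Performing Arts", "Musicians"]))
```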

4. Consistency
 Refers to the extent to which agreement exists on the terms to be used to index
some documents. It requires that items on the same subject be conceptually
analyzed and translated in the same way.

Two types of consistency:

Inter-indexer consistency – refers to the agreement between or among
indexers
Intra-indexer consistency – refers to the extent to which one indexer
is consistent with himself or herself.

V. INDEXING LANGUAGES
At home, you use a common language to communicate clearly with one another, such as
Tagalog if you live in Manila or Cebuano if you live in Cebu. In the indexing process, the
indexer is likewise obliged to use a language. As an indexer, you assign terms, labels or
names to a document to depict its subject content in a way that can be understood by the
people who will use the index. To avoid confusion, you do not use terms that users will not
understand. For instance, if you are an Ilocano, you cannot talk to a Cebuano using your own
terms. You must use terms that are common and understandable to both of you, which could
be English or Tagalog terms, so that you as an indexer and the people whom you expect to use
the index can be assured of better communication.
The names or labels that you assign to the document in the indexing process
are known as index terms. An index term may be a word, a phrase, or a code consisting of
numbers or letters of the alphabet, or a combination of these. The complete set of these
terms is known as the INDEXING LANGUAGE.

Indexing language – is a system for naming or identifying subjects contained in a document.


- Refers to the languages used in an index to represent the subject content or
other aspects of information found in documents. It is a list of terms or
notations that might be used as access points in an index.

5.1 Uses of Indexing Language in the Indexing Process


Lancaster (1991) enumerated three purposes of indexing language:

1. To allow the indexer to represent the subject matter of the documents in a consistent
way. The index language provides careful term definition or scope notes for related
terms and generous cross-references which will bring to the attention of the indexer
the most appropriate term, general or specific to represent the topic.
Example:

COMMUNITY HEALTH SERVICES


SN Various services within the community directed towards the
promotion of the mental and physical well-being of the community
BT HEALTH SERVICES
NT CHILD HEALTH SERVICES
CLINIC VISITS
HEALTH EDUCATION
MATERNAL CHILD HEALTH SERVICES
MATERNAL HEALTH SERVICES

(Where SN refers to Scope Note, BT refers to Broader term and NT refers to Narrower
Term)

2. To bring the vocabulary used by the searcher into coincidence with the vocabulary
used by the indexer. The index language should prescribe the language that the
searcher must use by directing the searcher from a non-searchable term to a searchable term.
Example:

Female
Use WOMEN
Female Infertility
Use INFERTILITY
Emotionally unstable
Use INSECURITY
Disabilities
Use HANDICAPS
Games
Use SPORTS

3. To provide means whereby a searcher can modulate a search strategy in order to
achieve a high recall or high precision as varying circumstances demand.
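
The USE references under purpose 2 work like a lookup table. The sketch below loads only the five pairs from the example above. The function name `preferred_term` is hypothetical, and the code assumes that a term without a USE reference is already a valid index term.

```python
# Lead-in (non-searchable) terms mapped to the searchable index term,
# copied from the USE example above.
USE_REFERENCES = {
    "female": "WOMEN",
    "female infertility": "INFERTILITY",
    "emotionally unstable": "INSECURITY",
    "disabilities": "HANDICAPS",
    "games": "SPORTS",
}


def preferred_term(search_term):
    """Return the index term a searcher should use for the given word or phrase."""
    key = search_term.strip().lower()
    # Assumption: a term with no USE reference is itself a valid index term.
    return USE_REFERENCES.get(key, search_term.strip().upper())


print(preferred_term("Games"))
print(preferred_term("Female Infertility"))
```

A searcher who types "Games" is directed to SPORTS, so the searcher's vocabulary and the indexer's vocabulary meet on the same term.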

5.2 Features/characteristics of Indexing Language

1. Vocabulary – composed of terms loosely called index terms. It employs certain
classes of words, adjectives, participles and gerunds, few prepositions and
conjunctions, almost no adverbs, pronouns or verbs and no interjections.

Two types:
a. Index vocabulary – consists of index terms which are assigned to express the
concepts of the author. These are also called descriptors.
Example:
Deacidification
Dead titles
Decay
Depository collection
Depreciation
Depreciation scale
Deselection
Deselection policies

b. Approach vocabulary – consists of terms which are used as lead-in terms to
the index term.
Example:
Able Students
use. ACADEMICALLY GIFTED
Academic Advisement
use. EDUCATIONAL COUNSELING
Activity Learning
use. EXPERIENTIAL LEARNING

2. Syntax – refers to the arrangement and relative positions and mutual relationships of
words in the sentence or statement as required by established usage and grammatical
rules of the language being used. It is concerned with clarity of expression and with
efficient and unambiguous communication, and it is language dependent.
- A complete indexing language includes certain devices which are used to
achieve either high recall or high precision in both indexing and searching
operations.

Indexing Language Devices


a. Recall Devices – these are indexing language devices that group terms
together into classes of one type or another. Such devices improve recall
in search operations and help the indexer be consistent
in assigning index terms that represent the subject contents of documents.
Examples of such devices are:
1.) Synonyms (syn.), near synonyms (ns) and quasi-synonyms (qs;
opposites)
For example:

DISASTER
Syn. Calamity, Catastrophe, Misadventure, Tragedy, Woe
Ns. Accident, Casualty, Fatality, Mishap
Qs. Fortune, Luck

2.) Control of word form endings, i.e. using the root only as index terms
For example:

TENURE
Use JOB TENURE

EMOTIONAL SECURITY
Use INSECURITY

KEYBOARD IDIOPHONES
Use PERCUSSION INSTRUMENTS

3.) Hierarchical relationships, which make it possible to find systematic
headings from general to more specific, or from a Broader Term (BT) to
the Narrower Term (NT)
For example:

INFANT NUTRITION
SN Nutrition of children from birth to 2 years of age
BT NUTRITION
NT BOTTLE FEEDING
BREAST FEEDING
b. Precision Devices – these are indexing language devices that, when used in
association with terms, increase the shades of meaning of
the terms. Hence, such devices improve precision in both indexing and
search operations.
The most common types of such devices are:
1.) Term coordination or combination of two or more different meanings to
come up with a distinct index term with specific meaning. These are called
adjectival headings, phrase headings, and compound headings

For example:
Adjectival Headings    FRUIT WINE
                       OFFICE MANAGEMENT
                       COMPUTER PROGRAMMING
                       FATTY ACIDS
                       ENERGY CONSERVATION
                       FATTY ISSUES

Phrase Headings        STUDY, METHOD OF
                       WOMEN AS AUTHORS
                       GARDEN ORNAMENTS AND FURNITURE
                       ENGLISH AS A SECOND LANGUAGE

Compound Headings      SCIENCE AND RELIGION
                       BANKS AND BANKING
                       HEROES AND HEROINES
                       PUBLISHERS AND PUBLISHING

2.) Subheadings or Subdivision. These are terms or phrases that are used
under main headings or index terms to subdivide certain subjects into more
specific topics or show a particular aspect of a given subject or index term.
For example:
EDUCATION—FINANCE
POETRY—COLLECTIONS
POPES—VOYAGES AND TRAVELS
PLANETS—EXPLORATION
PESTS—BIOLOGICAL CONTROL

3. Semantics – the study of meaning as expressed in communication such as words. In
indexing, semantics indicates class relations among index terms.
Categorized into:

a. Equivalent relationship – implies that there will be more than one term
denoting the same concept. Below are some terms denoting this type of
relationship
Synonyms (e.g. feminism ; Women’s Liberation Movement)
Quasi-synonym (e.g. economics ; cost and financing)
Preferred spelling (e.g. program ; programme)
Acronyms, abbreviations (e.g. ALA ; American Library Association)
Current and established terms (e.g. developing countries ; Third
World ; Underdeveloped areas ; less developed countries)
Translation (e.g. Manila hemp ; Abaca)
b. Hierarchical relationship
Genus/species (represents class inclusion)
Agro Industry
    Food Industry
        Meat Industry

Whole/part relationships
Foot
    Toes

c. Affinitive/Associative Relationships – displayed with the use of
the related term
Examples:

Men-Women
Education-Teaching
Maintenance-Repairing
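
Hierarchical relationships act as a recall device because a search on a broader term can be widened to include its narrower terms. The sketch below holds only the NUTRITION and INFANT NUTRITION links from the examples above. The dictionary layout and the function name `expand_with_narrower` are assumptions made for illustration.

```python
# Narrower-term (NT) links taken from the INFANT NUTRITION example above.
# A real thesaurus would hold many more entries.
NARROWER = {
    "NUTRITION": ["INFANT NUTRITION"],
    "INFANT NUTRITION": ["BOTTLE FEEDING", "BREAST FEEDING"],
}


def expand_with_narrower(term):
    """Return the term followed by every narrower term beneath it, broadest first."""
    expanded, queue = [], [term]
    while queue:
        current = queue.pop(0)
        if current not in expanded:
            expanded.append(current)
            queue.extend(NARROWER.get(current, []))
    return expanded


print(expand_with_narrower("NUTRITION"))
```

Searching with all four terms instead of NUTRITION alone retrieves documents that were indexed only under the more specific headings.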

5.3 Types of Indexing Language


1. Natural Language
 Uses index terms/words occurring in the printed text as index entries.
 If we use terms as they appear in documents without modification, we are
using natural language, or a derivative system. This approach is also called
indexing by extraction.
 Its characteristics are:
 Tends to improve recall because it provides more access points but
reduces precision
 Redundancy is greater
 Uses more current terms
 Tends to be favored by subject specialists or the end-user

2. Controlled Vocabulary or Artificial Language


 an authority list that enables an indexer to establish a standard description
for each concept and use that description each time it is needed.
 In general, an indexer can only assign to a document terms that appear on
the list adopted by the library

Functions:
 To control synonyms by choosing one form as the standard term
 To make distinctions among homographs
e.g. Security (Law) ; Security (Psychology)

 To bring or link together those terms whose meanings are closely related
e.g. Cereals and wheat

 Establishes its size or scope
e.g. whether the word baseball would include softball

 Usually records its hierarchical and affinitive/associative relations

 Syndetic devices:
 USE and UF for synonymy
USE indicates that another term is to be used in preference
UF (Used For) indicates that the term is used in place of another
 BT, NT, RT reference for differing levels of specificity and certain
near synonyms and antonyms
 Parenthetical qualifiers to resolve semantic ambiguity
e.g. Mercury (Planet) Mercury (Metal)

Types of Controlled vocabulary


 Subject Heading List – follows an alphabetical arrangement of
terms and it covers a broad area of knowledge. It is used primarily to
index textual, book length documents, with one or two terms that
capture what the document is all about.
Examples:
Library of Congress Subject Headings (LCSH)
Sears List of Subject Headings (SLSH)
Medical Subject Headings (MeSH)

 Subject Thesaurus – alphabetical listing of terms providing structured and relational
information about the concepts.
- Synonymous, hierarchical, associative and homographic relationships
among terms are clearly displayed. It has three types of relationship:
BT (Broader Term), NT (Narrower Term) and RT (Related Term)

 Stop List – List of terms prepared in order to avoid using words that
are not keywords as access points.
- it is an in-house listing of index terms not found in the subject
heading and thesaurus.

3. Free Language
 The free-text language does not consist of a list of terms distinct from those
used to describe concepts in a subject area.
 Indexing is free in the sense that there are no constraints on the terms that
can be used in the indexing process.
 Common in a computer-indexing environment.

VI. INDEXING SYSTEMS

1. Coordinate Indexing – created by combining two or more single index terms to
create a new class.
Example:
Training + Employees = Training of Employees
Public + School + Librarians = Public School Librarians
Drugs + Pneumonia + Cats = Drugs that will cure pneumonia in cats
Migration + Philippines = Migration in the Philippines
Types:
Pre-coordinate indexing
 Are non-manipulative indexes where manipulation is done at the
indexing stage. This type of indexing is applied in traditional
printed indexes to books and the conventional card catalog.
 Coordination index where terms are combined before the time of
searching. It is prepared by indexers at the time of indexing.
Post-coordinate indexing
 coordination is done by the user at the searching stage and not by
the indexer at the indexing stage.
 A search strategy is formed by combining the terms with Boolean
operators (and, or, not) to express the searcher's information need. Online
retrieval systems are based on this type of index.
 Formulated by the searcher at the time of searching
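
Post-coordinate searching can be sketched with Python sets. Each single index term points to the documents it was assigned to, and the searcher combines terms with AND, OR and NOT at search time. The document numbers and function names below are invented for illustration.

```python
# Inverted index: single index term -> set of document numbers (invented data).
INDEX = {
    "training": {1, 2, 5},
    "employees": {2, 3, 5},
    "librarians": {3, 4},
}


def search_and(*terms):
    """Documents indexed under every one of the terms (narrows the search)."""
    sets = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(*terms):
    """Documents indexed under at least one of the terms (widens the search)."""
    return set().union(*(INDEX.get(term, set()) for term in terms))


def search_not(include, exclude):
    """Documents indexed under one term but not under the other."""
    return INDEX.get(include, set()) - INDEX.get(exclude, set())


print(search_and("training", "employees"))   # Training AND Employees
print(search_or("training", "librarians"))   # Training OR Librarians
print(search_not("employees", "librarians")) # Employees NOT Librarians
```

In a pre-coordinate index the combination "Training of Employees" would already exist as a heading. Here it only comes into being when the searcher asks for it.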

2. Classified Indexing – starts with a predetermined scheme in which subjects are
already ordered in a specified system of relationships. Each subject has an
identifying symbol or notation, and the notation itself is the mark by which the
entry receives its appropriate place in the file.
Types:
Faceted Indexes
 is a type of synthetic classification and is often called analytico-
synthetic system
 a facet analysis is a tightly controlled process by which simple
concepts are organized into carefully defined categories by
connecting class numbers of the basic concepts.

 is pre-coordinated at the time of indexing and is arranged in
classification order rather than a straight alphabetical order.
 S.R. Ranganathan introduced the faceted classification system by
publishing his basic works on the system during the 1930s.

Enumerative indexes
 Enumerative classifications aim to enumerate or list all subjects
present in the literature that the scheme is intended to classify.
 the enumeration is normally achieved by first identifying the main
disciplines to be covered by the scheme, either on a philosophical
or pragmatic basis, and allocating each a main class status.
3. Chain Indexes
 Chain indexing is simply a technique for constructing an organized set of
entries for an alphabetical subject index of a classified catalog
 Chain indexes provide that every concept becomes linked, or chained, to its
directly related concept in the hierarchy system
 Introduced by S. R. Ranganathan as part of his Colon classification, the
system uses “synthesis” or “number building.” The number that represents
some complex subject is arrived at by joining the notational elements that
represent more elemental subjects.

Example:
Topic: Victorian period English poetry (821.8)
Hierarchy:
8 Literature
2 English
1 Poetry
8 Victorian period

 Index entries
Victorian period: Poetry: English: Literature 821.8
Poetry: English: Literature 821
English: Literature 820
Literature 800
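
The chain procedure in the example above is mechanical: start from the most specific link and drop one link at a time. The sketch below assumes the chain is given as (notation, concept) pairs from broadest to most specific. The function name `chain_index` is hypothetical.

```python
def chain_index(chain):
    """Build chain index entries.

    chain: list of (notation, concept) pairs, broadest first.
    Returns (entry text, notation) pairs, most specific entry first.
    """
    entries = []
    for depth in range(len(chain), 0, -1):
        notation = chain[depth - 1][0]
        # Lead with the concept at this link, followed by its broader concepts.
        concepts = [concept for _, concept in reversed(chain[:depth])]
        entries.append((": ".join(concepts), notation))
    return entries


victorian_poetry = [
    ("800", "Literature"),
    ("820", "English"),
    ("821", "Poetry"),
    ("821.8", "Victorian period"),
]
for text, notation in chain_index(victorian_poetry):
    print(text, notation)
```

Each printed line corresponds to one of the four index entries listed in the example.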

4. Permuted Title Indexing
 Indexes that are created by systematically rotating information-conveying
words in the title as subject entry points into the index.
 Advantages:
Indexing can be done easily with minimum cost
Does not need the expertise of a professional indexer because
it is entirely done by a computer.
 Disadvantages:
the titles may not accurately reflect content
the limited number of terms restricts complete subject
indication
most title indexes are unappealing to the eye and are
difficult to scan
lack of vocabulary control can increase the retrieval of
irrelevant documents. These indexes usually employ stop-lists
(lists of words that are unsuitable as subject indicators)
scattering of synonyms and generic terms usually causes user
frustration and missed entries.
Types:
a. KWIC (Keyword in Context)
 introduced by Hans Peter Luhn in 1959.
 It is a rotated index most commonly derived from the titles of
documents. Each keyword appearing in a title becomes an entry point
and is highlighted in some way by setting it off at the center of the page.
It is based on three principles:
1. Titles are generally informative
2. The words extracted from the title can be used effectively to guide
the user to an article or a paper likely to contain desired information.
3. Although the meaning of an individual word viewed in isolation may
be ambiguous or too general, the context surrounding the word helps
to define and explain its meaning.

Structure of KWIC
1. The Keyword (which is the heading) arranged in alphabetical order
2. The Context (which functions as a modification)
3. The Identification Code (which is the Reference) – it indicates the location
of the document.

Example:
Blue-eyed cats in Texas
The cat and the Fiddle
Dogs and cats and their Diseases
The Cat and the economy

KWIC index construction

           in Texas,  Blue-eyed Cats ................. 23
                 The  Cat and the Economy ............ 12
                 The  Cat and the Fiddle ............. 17
            Dogs and  Cats and Their Diseases ........  3
           Blue-eyed  Cats in Texas .................. 23
           and Their  Diseases, Dogs and Cats ........  3
      Their Diseases  Dogs and Cats and ..............  3
             and The  Economy, The Cat ............... 12
             and The  Fiddle, The Cat ................ 17
                  in  Texas, Blue-eyed Cats .......... 23
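
A KWIC index can be produced with a few lines of code, which is why it needs no professional indexer. The sketch below is a simplified variant: it keeps the words before the keyword on the left and does not wrap the start of the title around to the right as the printed example does. The stop-list and the function names are assumptions.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "their"}  # assumed stop-list


def kwic_rows(titles):
    """titles: dict mapping title -> document number.

    Returns (left context, keyword plus right context, document number)
    tuples, sorted by keyword.
    """
    rows = []
    for title, doc_no in titles.items():
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # stop words never become entry points
            rows.append((" ".join(words[:i]), " ".join(words[i:]), doc_no))
    # Sort on the keyword-led right-hand part, ignoring case.
    return sorted(rows, key=lambda row: row[1].lower())


def format_kwic(rows, width=24):
    """Right-align the left context so every keyword starts in the same column."""
    return [f"{left:>{width}}  {right}  {doc_no}" for left, right, doc_no in rows]


titles = {
    "Blue-eyed Cats in Texas": 23,
    "The Cat and the Fiddle": 17,
    "Dogs and Cats and Their Diseases": 3,
    "The Cat and the Economy": 12,
}
for line in format_kwic(kwic_rows(titles)):
    print(line)
```

The four titles yield ten entries, one for each word that is not on the stop-list.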

b. KWOC (Keyword out of Context)

- Does not rotate the title. Instead, each keyword that serves as an access point is lifted
out of the title and set off in the left-hand margin, or listed separately to the side.
KWOC index construction

Blue-eyed   Blue-eyed Cats in Texas ............ 23
Cat         The Cat and the Economy ............ 12
Cat         The Cat and the Fiddle ............. 17
Cats        Dogs and Cats and Their Diseases ...  3
Cats        Blue-eyed Cats in Texas ............ 23
Diseases    Dogs and Cats and Their Diseases ...  3
Dogs        Dogs and Cats and Their Diseases ...  3

c. KWAC (Keyword and Context)

- A keyword used as an entry point in a KWAC index is not usually rotated but is
replaced by an asterisk (*) or some other symbol.

KWAC index construction

Blue-eyed   * Cats in Texas .................... 23
Cat         The * and the Economy .............. 12
Cat         The * and the Fiddle ............... 17
Cats        Dogs and * and Their Diseases ......  3
Cats        Blue-eyed * in Texas ............... 23
Diseases    Dogs and Cats and Their * ..........  3
Dogs        * and Cats and Their Diseases ......  3
Economy     The Cat and the * .................. 12
Fiddle      The Cat and the * .................. 17
Texas       Blue-eyed Cats in * ................ 23
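
KWOC and KWAC entries differ only in how the title is displayed beside the keyword, so one routine can feed both. The sketch below assumes the same stop-list idea as a KWIC index. The function names are hypothetical, and each keyword is assumed to occur once per title.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "their"}  # assumed stop-list


def kwoc_rows(titles):
    """KWOC: keyword lifted out to the left; the title itself is left intact."""
    rows = [
        (word, title, doc_no)
        for title, doc_no in titles.items()
        for word in title.split()
        if word.lower() not in STOP_WORDS
    ]
    return sorted(rows, key=lambda row: (row[0].lower(), row[1].lower()))


def kwac_rows(titles):
    """KWAC as described above: the keyword inside the title becomes an asterisk."""
    return [
        (keyword, " ".join("*" if w == keyword else w for w in title.split()), doc_no)
        for keyword, title, doc_no in kwoc_rows(titles)
    ]


titles = {
    "Blue-eyed Cats in Texas": 23,
    "The Cat and the Fiddle": 17,
    "Dogs and Cats and Their Diseases": 3,
    "The Cat and the Economy": 12,
}
for keyword, text, doc_no in kwac_rows(titles):
    print(f"{keyword:<10} {text}  {doc_no}")
```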

5. Citation Indexing
 Consists of a list of articles with a sub-list under each article of subsequently
published papers which cite that article.
 The primary advantage of a citation index is that it leads the user to the
latest articles
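
A citation index is built by turning reference lists around: instead of asking what a paper cites, it asks which later papers cite it. The sketch below assumes each paper's reference list is already available. The paper labels are placeholders, not real publications.

```python
from collections import defaultdict


def citation_index(references):
    """references: dict mapping a citing paper -> list of the papers it cites.

    Returns a dict mapping each cited paper -> sorted list of papers citing it.
    """
    cited_by = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].append(citing)
    return {cited: sorted(citing) for cited, citing in cited_by.items()}


# Placeholder papers for illustration only.
references = {
    "Paper C (2005)": ["Paper A (1990)", "Paper B (1998)"],
    "Paper D (2010)": ["Paper A (1990)"],
}
for cited, citing in citation_index(references).items():
    print(cited, "<-", ", ".join(citing))
```

Looking up Paper A leads the user forward in time to Papers C and D, which is the advantage noted above.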

6. String Indexing
 is a word-based system in which the indexer analyzes the various aspects of a
complex subject treated in a document and records the aspects as words along
with “role operators” that is, instructions to the computer. The computer
programs combine these words into a string of terms that represents a brief
summary of the document’s content. Then the program provides index entries
by automatically recasting the string under every significant term that forms
part of the string.
 Timothy C. Craven cited two main characteristics of a string index:
Each index item normally has a number of index entries containing at
least some of the same terms
Computer software (index string generator) generates the description
part (index string) of each index entry according to regular and explicit
syntactical rules.

Types:
PRECIS (Preserved Context Index System)
 A method of subject indexing developed by Derek Austin for the
British National Bibliography (1971-1973) in order to produce
printed alphabetical subject entries.
 It involves:
Determining the subject content of the document
Analyzing the subject statement to determine the role of
each significant term (action term, location term, an agent
or object of the action)
The computer manipulates the coded string to produce index
entries.
Determining the relationship of a term to others in the
database and how all these terms should be linked.
Example of a PRECIS index entry:

Document on: Education of Librarians in the Philippines

Entries:

Education. Librarians. Philippines

Librarians. Philippines

Education

Philippines

Librarians. Education
Three Techniques of PRECIS
Cycled or Cyclic Indexing
 Involves the movement of the first lead term to the last position and this
process is continued until each element or concept has occupied the lead
position once.
Example:
A B C Education. Librarians. Philippines
B C A Librarians. Philippines. Education
C A B Philippines. Education. Librarians

Rotated Indexing
 Involves each element becoming the main heading under which an entry is to
be filed, but there is no change in the citation order. The entry element is
highlighted, shown here in capital letters.
Example:

A B C   EDUCATION. Librarians. Philippines
A B C   Education. LIBRARIANS. Philippines
A B C   Education. Librarians. PHILIPPINES

SLIC Indexing (Selective Listing In Combination)


 Involves the combination of elements but in one direction only.
Example:

A B C Education. Librarians. Philippines


A C Education. Philippines
B C Librarians. Philippines
C Philippines
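
The three techniques can be compared side by side in code. The sketch below treats the string as a list of unique terms. For SLIC it assumes the usual rule that a combination is dropped when it is merely the beginning of a longer one, which reproduces the four entries in the example above. The function names are hypothetical.

```python
from itertools import combinations


def cycled(terms):
    """Move the lead term to the end until every term has led once."""
    return [terms[i:] + terms[:i] for i in range(len(terms))]


def rotated(terms):
    """Citation order never changes; only the highlighted entry element does."""
    return [(lead, terms) for lead in terms]


def slic(terms):
    """Combinations in one direction only, minus those that begin a longer entry."""
    combos = [list(c) for n in range(1, len(terms) + 1)
              for c in combinations(terms, n)]
    kept = [c for c in combos
            if not any(len(o) > len(c) and o[:len(c)] == c for o in combos)]
    # Arrange by the position of each term in the original string.
    return sorted(kept, key=lambda c: [terms.index(t) for t in c])


string = ["Education", "Librarians", "Philippines"]
for entry in cycled(string):
    print("Cycled: ", ". ".join(entry))
for lead, entry in rotated(string):
    print("Rotated:", ". ".join(entry), f"(filed under {lead})")
for entry in slic(string):
    print("SLIC:   ", ". ".join(entry))
```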

POPSI (Postulate-based Permuted Subject Indexing)


 Developed at the Documentation Research and Training Center
(India), follows the classification ideas of S. R. Ranganathan.
 This index type form of non-print/non-book formats
 Designed by Ganesh Bhattacharyya

NEPHIS (Nested Phrase Indexing System)


 Developed by Timothy C. Craven, the input string was
designed to be a phrase in ordinary language.

CIFT (Contextual Indexing and Faceted Taxonomic Access System)
 Developed for the Modern Language Association (MLA),
alphabetical subject entries are created from strings provided by
indexers who assign facets derived from literature and
linguistics. It is published with MLA International
Bibliography.
VII. MEASURES OF THE EFFECTIVENESS OF INDEXING

2 basic effectiveness measures

Recall = is the ratio of the number of relevant records retrieved to the number of relevant
records in the database.
= refers to the proportion of relevant materials retrieved by a system

Precision = is the ratio of the number of relevant records retrieved to the total number of
documents retrieved
= refers to the proportion of retrieved documents that are relevant

Precision = (# of relevant documents retrieved) / (Total # of documents retrieved)

Recall = (# of relevant documents retrieved) / (# of relevant documents in the collection)

In other terms:
                 Relevant          Not relevant
Retrieved        true positive     false positive
Not retrieved    false negative    true negative
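
Both measures can be computed directly from two sets: the documents retrieved and the documents that are relevant. The sketch below uses invented document numbers. Returning 0.0 when a set is empty is an assumption of the sketch, not a rule from these notes.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)  # relevant AND retrieved
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented search: 10 relevant documents exist in the collection;
# 8 documents were retrieved, and 6 of those are relevant.
relevant = set(range(1, 11))
retrieved = {1, 2, 3, 4, 5, 6, 21, 22}
precision, recall = precision_recall(retrieved, relevant)
print(f"Precision = {precision:.2f}, Recall = {recall:.2f}")
```

In this search, documents 21 and 22 are the false positives and documents 7 to 10 are the false negatives.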

Recall and precision are inter-dependent measures:
◮ precision usually decreases as the number of retrieved documents increases
◮ recall increases as the number of retrieved documents increases

 By making the search terms more exhaustive, we tend to get a higher recall
e.g.
search term : internet
other related terms: net, world wide web

 We can say that a higher level of exhaustivity of indexing tends to ensure High Recall
However, by increasing the level of exhaustivity, we tend to decrease the level of precision
Thus, an increase in indexing exhaustivity tends to increase recall but reduce precision

 A high level of term specificity tends to ensure high precision
e.g. Using a broader term instead ends up retrieving a large number of documents,
many of which are not relevant.
VIII. SUBJECT INDEXING PROCESS
 Process by which the subject matter of documents is represented in an index, be it
in printed or machine-readable form
 It involves three steps:
1. Content Analysis - Determining the “aboutness” or subject contents of a
document
 Decide which topics in the item are relevant to the potential user of the
document
 Decide which topics truly capture the content of the document
 Determine terms that come as close as possible to the terminology used
in the document
 Decide on index terms and the specificity of these terms.
2. Subject or conceptual analysis – is required to decide which of an item’s
aspects should be represented in the bibliographic record.
 Analyzing the ideas/concepts dealt with in a document into
constituent elements
 Assembling the constituent elements in a preferred and helpful
sequence, keeping in mind who the readers are and what they will be
seeking.
 Naming the constituent elements using natural language
3. Translation – the process of converting concepts derived from the document
into a particular set of index terms usually derived from a controlled
vocabulary
 Group references to information that is scattered in the text of the
document
 Combine heading and subheadings into related multilevel headings
 Direct the user seeking information under terms not used to those that
are being used by means of see references and to related terms with
see also references
 Arrange the index into a systematic presentation

IX. INDEXING POLICIES, PROCEDURES AND GUIDELINES

The Indexing Plan

 Consists of a set of guidelines representing overall decisions pertaining to the
indexing project
 An indexing plan is needed to:
Ensure completeness and consistency of access to information contained in
documents
Ensure consistency in decision making
Keep the index within reasonable limits
Ensure that the structure of the index has been thoroughly followed by the
indexers

Factors Considered in Preparing the Indexing Plan


1. Users of the Index
 Actual and potential users
 Information-seeking behavior of users
2. Documents to be indexed
Documents with permanent reference value – these materials
are not just to be read and forgotten, but to be reread and used.
a. Periodicals (professional journals, magazines, newspaper articles)
b. Government documents (memoranda, circulars, department orders)
c. Pamphlets/Leaflets with information that has permanent reference and
instructional value
d. Abstracts from theses, dissertations, project studies, research papers
e. Monographs/Books
3. Parts of the documents to be indexed
4. Concepts to be indexed
 Places, names of persons, important events and topics
5. Exhaustivity of the index
 Limitations on the amount of space for the index
 Staff, time and money constraints
6. Indexing language to be used
 Natural language
 Controlled vocabulary
7. Factors in selecting journal titles to be indexed
a. Usefulness
b. Curricular offerings
c. Subject coverage or content
d. Class and range of its readership
e. Regularity of issue
f. Availability in most libraries
g. Being indexed in other indexing services

Indexing Methods

a. Derived Indexing – words and phrases are directly extracted from the contents of the
document to represent its subject content.

b. Assigned Indexing – involves assigning index terms from a source other than the
document itself.
Flowchart of Indexing a Document

STEP 1  Recording Bibliographic Data
        Choose an article and make a proper bibliographic entry.
        Read and understand the article.

STEP 2  Identify and list down significant words.
        Note key ideas from:
        - title
        - abstract
        - text proper
        - references
        - indexer may add words

STEP 3  Translate the words into the controlled vocabulary and make a
        professional decision on which terms reflect the content of the article.
        Consult:
        - Thesaurus
        - Subject Heading Lists (SLSH, LCSH)

STEP 4  Establish cross-references if necessary (see and see also).

STEP 5  Write the subject heading/s on the previously prepared
        bibliographic record to complete the index entry.

STEP 6  INDEX ENTRY
        Ready for filing/encoding

Indexing Procedures for Books

1. Examine the text carefully
2. Read the text several times, page by page, to analyze the contents and determine
indexable topics
3. Select the topics to be indexed taking into considerations their significance to the
central theme of the book.
 Name the topics that were chosen to be indexed
 Mark up page proofs all at once before any cards are prepared or
entries keyboarded
 For each chosen heading, supply a modification, a word or phrase
that narrows the application of the heading
 If a text discussion extends over more than one page, the beginning and
ending page references have to be given.
 After the entries have been typed and checked, read
quickly through the pages again to determine if anything indexable
has been omitted.
4. Alphabetize the entries
 All entries are arranged in alphabetical groups by initial letter
 Entries within each letter group are arranged alphabetically following either
the word-by-word or letter-by-letter mode.
Example:

Letter-by-letter              Word-by-word
Weatherproofing, 212          We Five, 101
Weather Underground, 143      Weather Underground, 143
Weavebird, 119                Weatherproofing, 212
Weaver, James Baird, 47       Weavebird, 119
We Five, 101                  Weaver, James Baird, 47
Weft knitting, 68             Weft knitting, 68
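
The two filing modes differ only in the sort key. The sketch below assumes that filing stops at the first comma, so page numbers and inverted forenames are ignored, and that case is ignored. The function names are hypothetical.

```python
import re


def _filing_part(entry):
    """Text up to the first comma, lower-cased."""
    return entry.split(",")[0].lower()


def letter_by_letter(entries):
    """Ignore spaces and punctuation; compare the heading as one run of letters."""
    return sorted(entries, key=lambda e: re.sub(r"[^a-z0-9]", "", _filing_part(e)))


def word_by_word(entries):
    """Compare the first word, then the second, so a short word files first."""
    return sorted(entries, key=lambda e: _filing_part(e).split())


entries = [
    "Weft knitting, 68",
    "Weaver, James Baird, 47",
    "Weavebird, 119",
    "Weatherproofing, 212",
    "Weather Underground, 143",
    "We Five, 101",
]
print("Letter-by-letter:", letter_by_letter(entries))
print("Word-by-word:    ", word_by_word(entries))
```

"We Five" moves from fifth place to first because in word-by-word filing the space after "We" ends the comparison before any longer word beginning with "Wea" is considered.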

5. Edit the entries


 Decide which entries should be the main headings and which should be the
subheadings
 Decide whether certain entities will be treated as main entries or subentries
Painting                     handicrafts: painting;
Pottery making       or      pottery making;
Weaving                      weaving; wood carving
Wood carving

 Main entries unmodified by subentries should not be followed by long rows of
page numbers. Provide at least one subentry for a heading that has more than five
references.
 Subentries must be concise and informative and begin with a keyword or phrase
 Make a final choice among synonymous terms (being, life or existence)
 Provide adequate but not excessive cross-referencing
Examples:

Cars, see also trucks        Trucks
  Chevrolet, 224               Dodge Ram, 219
  Mazda, 146                   GMC (Jimmy), 143
  Volkswagen                   Mercedes-Benz, 144
                               See also cars

 Punctuation
 The inversion of a phrase used as the heading in a main entry is punctuated
by a comma
 If the heading is followed immediately by page references, a comma is
used between the heading and the first numeral and between subsequent
numerals
 If the heading is followed immediately by run-in subentries, a colon
precedes the first subheading. All subsequent subentries are preceded by
semicolons.
Example:

Payments, balance of: definition of, 16;
importance of, 19
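
The punctuation rules above are regular enough to be applied by a small formatter. The sketch below covers only the two cases stated: a heading followed directly by page references, and a heading followed by run-in subentries. The function name and argument layout are assumptions.

```python
def format_entry(heading, pages=(), subentries=()):
    """Punctuate one run-in index entry.

    pages: page numbers that follow the heading directly.
    subentries: (subheading, [page numbers]) pairs, used instead of pages.
    """
    if subentries:
        # A colon precedes the first subentry; semicolons separate the rest.
        parts = [", ".join([sub] + [str(p) for p in sub_pages])
                 for sub, sub_pages in subentries]
        return heading + ": " + "; ".join(parts)
    # Commas go between the heading and the first numeral and between numerals.
    return ", ".join([heading] + [str(p) for p in pages])


print(format_entry("Payments, balance of",
                   subentries=[("definition of", [16]), ("importance of", [19])]))
print(format_entry("Weft knitting", pages=[68, 71]))
```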

6. Determine the design of the index after the compilation of the entries
 Decide whether subentries will follow an indented or run-in style
 Index should be well balanced and should not be overloaded with too many
subentries
 Typography should be used to differentiate between types of heading and to
distinguish them from numerals indicating volumes, parts and pages.

7. Typing, proofing and the final review
 After typing, proofread the typescript against the cards, check the alphabetical
order of all entries and conduct a final review.
Indexing Procedures for Periodical Articles

What are Periodicals?

 Periodicals are publications that are among the most numerous and most widely
available, and they constitute a large family of serials.
 A periodical is a publication in any medium issued in successive parts bearing numerical
or chronological designation and intended to be continued indefinitely.
 The definition covers not only journals, magazines, bulletins, and newsletters
but also publications issued only once a year, or even less frequently (such as
yearbooks, almanacs, annual reports, and proceedings of annually held
meetings and conferences).

Why Index Periodicals?

 The presence of printed indexes in the library does not exempt librarians from
indexing their own library collection, for the following reasons:
1. Some of the titles of periodicals in your library may not be
included in the printed indexes
2. The journals included in the printed indexes may not be found
in your library
3. Some articles needed by the library users may not be indexed.
4. Not all libraries can afford to subscribe to the printed indexes.

Periodical indexes – are guides to the content of periodicals.

2 types:
1. Individual indexes to individual journals
2. Bound indexes to a group of journals
Example:
IPP (Index to Philippine Periodicals)
RGPL (Readers' Guide to Periodical Literature)
DIP (DACUN Index to Periodicals)

The Index Entry


2 Parts:
1. Heading – identifies the subject content of the periodical article or news item.
2. Modification – consists of the identification and location or citation.

Indexing Style and Format

 There is no definite style and format in indexing, since studies have shown that
subject literatures have different structures and users have different needs.
 Specific subjects and various physical forms of literature need individual
consideration when being indexed.
 Indexing service agencies usually prescribe their own set of policies and guidelines
for indexing. Likewise, printed periodical indexes may also serve as useful sources for
determining the physical format of an index.
 However, most modern indexes make use of the hanging indention style.

Indexing Guidelines

1. Before indexing, scan through the contents of the documents. Determine which types of
articles are worth indexing based on the following criteria:
a. Index articles of permanent reference and instructional value.
b. Exclude brief items of temporary interest.
c. Index materials that have relevance
d. Index all signed articles

2. Read and understand the article


a. Record the bibliographic data
 The title of the article and/or the news
 The author of the article
 The title and/or the name of the journal, magazines or newspaper
 The volume, the issue number, the page and name of publication
(Journals and magazines)
 The page on which the news item appears, the section and the column
 Analyze the contents of the documents through its title, abstract, list of
contents, text, reference section.
b. Determine and/or identify the subjects, concepts, or ideas covered by the
document.
 Be consistent in assigning subject headings. Consult the terms against
the list of subject headings or from the controlled vocabulary of the
indexing language
c. After the article has been indexed place “IND” next to the article to indicate
that the entry has been made for the article

Preparation of Index Entries from Magazines and Journals

1. Record the author in inverted order, followed by a full stop (.)


Example:

Scheff, Joanne.

2. Record the complete title of the article or news item, capitalizing only the first letter of
the first word of the title and the first letter of all succeeding proper names, followed by
a full stop (.)
Example:

Scheff, Joanne. How the arts can prosper through strategic collaborations.

3. Write the complete title of the periodical and underscore it.


Example:

Scheff, Joanne. How the arts can prosper through strategic collaborations.
Harvard Business Review.

4. Record the volume and issue number, followed by the page, the month and year.

 Abbreviations:
a. For purposes of brevity and economy, months may be abbreviated, following
a standard list of abbreviations.
January – Ja      July – Jy
February – Fe     August – Ag
March – Mr        September – S
April – Ap        October – O
May – My          November – N
June – Je         December – D

 Paging:
 Enter the page where the article starts, followed by a hyphen and the last inclusive page
where the article appears. If the article is continued after intervening pages,
indicate a plus (+) sign after the inclusive pages.

EXAMPLES:

Journal Article

Scheff, Joanne. How the arts can prosper through strategic collaborations.
Harvard Business Review. v74 n1 : 52-53, 58-62. Ja-Fe 1996.

1. Business collaborations. I. Title.

Fig. 1. Main index entry

BUSINESS COLLABORATIONS
Scheff, Joanne. How the arts can prosper through strategic collaborations.
Harvard Business Review. v74 n1 : 52-53, 58-62. Ja-Fe 1996.

Fig. 2. Subject index entry
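
The entry format shown in Figs. 1 and 2 can be sketched in code. The following is only an illustrative sketch, not a prescribed tool: the class name PeriodicalArticle and the functions main_entry and subject_entry are hypothetical names chosen for this example, and the month table follows the list of abbreviations in these notes, with "D" assumed for December.

```python
from dataclasses import dataclass

# Month abbreviations following the list in these notes ("D" assumed for December).
MONTH_ABBREVIATIONS = {
    "January": "Ja", "February": "Fe", "March": "Mr", "April": "Ap",
    "May": "My", "June": "Je", "July": "Jy", "August": "Ag",
    "September": "S", "October": "O", "November": "N", "December": "D",
}


@dataclass
class PeriodicalArticle:
    """Bibliographic data recorded for one article (hypothetical structure)."""
    surname: str
    given_name: str
    title: str            # first word and proper names capitalized only
    periodical: str
    volume: int
    issue: int
    pages: str            # inclusive pages, e.g. "52-53, 58-62" or "14-15+"
    months: list[str]     # full month names, e.g. ["January", "February"]
    year: int


def main_entry(article: PeriodicalArticle) -> str:
    """Build the main entry: author (inverted), title, periodical, location."""
    author = f"{article.surname}, {article.given_name}."
    title = article.title.rstrip(".") + "."
    months = "-".join(MONTH_ABBREVIATIONS[m] for m in article.months)
    location = (
        f"v{article.volume} n{article.issue} : {article.pages}. "
        f"{months} {article.year}."
    )
    return f"{author} {title} {article.periodical}. {location}"


def subject_entry(heading: str, article: PeriodicalArticle) -> str:
    """Build a subject entry: heading in capitals above the main entry."""
    return f"{heading.upper()}\n{main_entry(article)}"


example = PeriodicalArticle(
    surname="Scheff",
    given_name="Joanne",
    title="How the arts can prosper through strategic collaborations",
    periodical="Harvard Business Review",
    volume=74,
    issue=1,
    pages="52-53, 58-62",
    months=["January", "February"],
    year=1996,
)
print(subject_entry("Business collaborations", example))
```

The page string is passed in as already-prepared text, so a continued article can be entered with the plus sign (for example "14-15+") without any change to the functions.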

Analytical Indexing
X. PRODUCTION OF INDEX ENTRIES

10.1 Book indexing


a. Group the class into two groups
b. Prepare a back-of-the-book index and author index

10.2 Analytical indexing


Submit 20 analytical indexes (Filipiniana books)

10.3 Periodical indexing


Submit 20 periodical indexes

10.4 Computerized periodical indexes


XI. ABSTRACT AND ABSTRACTING
11.1 Development of abstracts and abstracting

ANTIQUITIES
 Historical research by Francis J. Witty discloses that, when writing was still
done on clay tablets, a device similar in function to an abstract was first used.
It appeared on some of the clay envelopes enclosing Mesopotamian documents; the
envelopes were made to protect the materials from the wear and tear of opening
and closing before and after use.
 Greek and Roman literature of around 2000-1000 B.C., such as the plays of the
great dramatists, was abstracted (the abstract was called a hypothesis in Greek)
to provide concise information about the original document and to facilitate the
search for and recall of specific information. The purpose of an abstract is
still very much the same today.

MIDDLE AGES
 When monks transcribed manuscripts, they would frequently make marginalia
that summarized the page’s content; royal secretaries used to prepare abstracts
of the reports of ambassadors to the kings, or to the Popes in the Papal court; early
scientists gave reports of their work to fellow scientists or friends in the form of
abstracts.

RENAISSANCE
 In the Elizabethan period, scientists made frequent use of abstracts in sending
reports of their studies to their friends and colleagues. Abstracts were their form
of private communication.

17TH CENTURY
 Abstracts became a system of public information dissemination, ushered in by
the formation of the French Academy of Science by Cardinal Richelieu.
 Le Journal des Sçavans was the first abstracting periodical for public
information dissemination, published in Paris.

11.2 Abstracts and the various types of document surrogates

ABSTRACT
 An abbreviated, accurate representation of the significant contents of a
document.
 It is usually accompanied by an adequate bibliographic description to enable
the original document to be traced
 It is also called a document surrogate or substitute of the original document

Other Types of Document Surrogates:

ANNOTATION
 This is a note added below the bibliographic reference or title of a document
by way of comment or brief description of what the document is about. It
usually appears in one or two sentences only.
 A one sentence description or explanation of a document

EXTRACT
 An abbreviated version of a document that is produced by drawing out
sentences from the document itself.

SUMMARY
 A restatement of the document’s salient findings and conclusions that is
intended to complete the orientation of a reader who has read the preceding
text. It is usually found at the end of the text.

ABRIDGEMENT
 A reduction in terms of length of the original document that aims to present
only the major points. Non-major points are omitted.

SYNOPSIS
 This is similar to summary. Example – Short resume at the back of a
pocketbook.

TERSE LITERATURE
 A condensation of the original. This is done by using statements that are
highly abbreviated to encapsulate the major points. Example – Short articles
found in the Reader’s Digest.

11.3 Characteristics of abstracts

a. ACCURACY
 An abstract is error free
 Efforts are exerted by the abstractor to prevent the occurrence of error in the
presentation of the document surrogate to the reader.

b. BREVITY
 An abstract is brief, shorter than the original document.
 Can be achieved by removing redundancy in the language used in writing the
abstract.
 It saves space and the customer’s reading time.

c. CLARITY
 An abstract is clearly written in a style that is easily read.
 An abstract must be written in complete sentences and should use the author’s
own words; however, abstractors can paraphrase the author’s work, because
paraphrasing also interprets the ideas encoded in the document and
enhances the literary quality of the abstract.

d. SELF-SUFFICIENCY
 Complete in itself and fully understandable to the reader without reference to
the original

11.4 Purposes and uses of an abstract


1. Abstracts facilitate selection – abstracts help the user decide whether a particular
document is likely to be of interest to him/her.
2. Abstracts save the reading time of the reader – abstracts are smaller in size
than the original document but provide as much significant information
to the user.

3. Abstracts facilitate literature searches – it would be impossible to search the
huge volume of literature without the indexed abstracts.

4. Abstracts promote current awareness – abstracts re-package the information into
a condensed form, thereby making the information easier and less time-consuming to
read.

5. Abstracts help overcome the language barrier – abstracts help the user find out
what studies and research have been conducted and published in language/s he/she
cannot read.

6. Abstracts improve indexing efficiency – indexing from abstracts is much more rapid and
less costly than indexing from original documents, without sacrificing quality.

7. Abstracts aid in the compilation and provision of other tools such as indexes,
bibliographies, and reviews.

11.5 Types of abstracts

By Type of Information/Internal Purpose


1. Indicative Abstracts or Descriptive Abstracts
 Abstracts that only describe briefly what will be found if you read the original
document
 This abstract does not contain much data and most often cannot be used in place
of the original.
 It merely indicates the content of an article and contains general statements about
it.
 It abounds in phrases such as is discussed, is described, is enumerated, has been
investigated etc. yet does not record the outcome of the investigations.
 30-50 words can make up an indicative abstract, thus it can be written quickly and
economically by an abstractor.
 Shorter than an informative abstract

Documents that are abstracted descriptively:


- review articles
- books
- conference proceedings
- reports without conclusions
- essays and bibliographies

2. Informative Abstracts
 Present qualitative and quantitative information contained in a document.
 The objectives are:
- to help in assessing the relevance of a document to enable the customer to decide
whether to consult or not to consult the document
- to serve as substitute for the original document especially if the knowledge
contained in a document satisfies the information needs of the customer.
 Written longer than other abstracts
 Usually, technical reports, conference papers, and journal articles are abstracted in 100
to 250 words; for theses, dissertations, and technical reports, 500 words may be
appropriate.
 No specific length need be prescribed for an informative abstract; normally it can be
one-tenth or one-twentieth of the original length of the document.
 Written by an abstractor who is a subject expert and well-trained in abstracting to
maintain the qualitative presentation of information.

3. Critical Abstracts
 It is more of a review of the document than a true indicator of document
content.
 It is really a condensed critical review that when applied to reports, journal
articles, and other relatively brief items, serve much the same purpose as a critical
book review.
 Is subjective and evaluative, i.e. the abstractor expresses views on the quality of
work of the author and contrasts it with the work of others.

4. Indicative-Informative Abstracts
 This is a combination of indicative and informative abstracts
 Parts of the abstract are written in informative or indicative style.
 Major aspects of the documents are written in informative way, while aspects
which are of minor importance are written indicatively.
 This mixed style uses neither too many nor too few words, just
enough to transmit information effectively.

By External Purpose

1. Discipline-Oriented Abstracts
 An abstract written for an abstracting service dealing with a branch of knowledge.
 This abstract aims to serve the needs of a particular subject or discipline.

2. Mission-Oriented Abstracts
 An abstract written for an abstracting service dealing with the applications of a branch
of knowledge
 It aims to serve the information needs of a particular industry or group of
individuals

3. Slanted Abstracts
 Published as in-house abstracting bulletins
 This is chiefly used for domestic needs of an organization

By Whom Written

1. Author-prepared abstracts
 Prepared by authors of the documents for publication together with the document.
This is submitted on time since it generally accompanies the article for
publication. However, authors do not necessarily write the best abstracts since
they lack training and experience in abstracting as well as abstracting rules.
2. Subject expert-prepared abstracts
 Abstracts prepared by professionals in the subject. May be an excellent, high-
quality abstract if the expert is trained and experienced in the procedures and
methods of abstracting.
 In general, subject experts volunteer as abstractors and may not submit their
abstracts on time. They are given a modest honorarium or none at all, if they
volunteered.

3. Professional abstractor-prepared abstracts


 Prepared by a professional abstractor, a person who has been trained in the
procedures and methods of abstracting
 One who has attained experience in abstracting, has foreign language expertise,
and can cover areas in which subject experts cannot be found.

By Form

1. Statistical or Tabular Abstract


 A summary of data presented in tabular form.
 This is used for certain specialized subjects, such as thermophysical properties,
where the emphasis is exclusively tabular and statistical.
 Examples of this may be found in the Statistical Abstract of the United States.

2. Modular abstract
 Combination of the different types of abstract in one presentation
 Consists of five parts:
- Citation
- Annotation
- Indicative abstract
- Informative abstract
- Critical Abstract

3. Structural Abstract
 Refers to an abstract in non-narrative form wherein the abstractor lists the items in
the worksheet or template as these are found in the documents.
 This kind of abstract works well only for a subject area in which the essential
elements/items are more or less the same from one study to another.
 Another type of structural abstract is one with subheadings such as background,
aim, methods, results and conclusions, to facilitate scanning. Commonly used in
medical journals.
Type of Irrigation | Soil Type | Crops | Climate Conditions | Place | Results
_______________________________________________________________________
_______________________________________________________________________

Worksheet for a Structural Abstract

Source: Lancaster, F.W. Indexing and Abstracting in theory and practice. 2nd ed. London:
Library Association, 1998.

4. Mini Abstract
 Known as Machine-readable index abstract
 As used by Lunin (1967), the term refers to a highly structured abstract
designed primarily for searching by computer.
 The terms that are used are drawn from a controlled vocabulary and are arranged
in a specified sequence nearly approximating that of a sentence structure.
 Example:
METHOD/DETERMS/STRONTIUM/HUMAN/RADIOACTIVATION/ANALYSIS

A method is described for the determination of strontium and
barium in human bone by radioactivation analysis.
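
As a rough illustration of how a mini abstract might be assembled, the sketch below checks each term against a controlled vocabulary and joins the terms with slashes in the order given. The vocabulary set and the function name mini_abstract are hypothetical and were made up for this example; they are not taken from Lunin's system.

```python
# Hypothetical controlled vocabulary, assumed only for this illustration.
CONTROLLED_VOCABULARY = {
    "METHOD", "DETERMS", "STRONTIUM", "BARIUM",
    "HUMAN", "BONE", "RADIOACTIVATION", "ANALYSIS",
}


def mini_abstract(terms: list[str]) -> str:
    """Join controlled-vocabulary terms with slashes, keeping the given order."""
    normalized = [term.strip().upper() for term in terms]
    unknown = [term for term in normalized if term not in CONTROLLED_VOCABULARY]
    if unknown:
        # Terms outside the controlled vocabulary are rejected, not guessed at.
        raise ValueError(f"Terms not in controlled vocabulary: {unknown}")
    return "/".join(normalized)


print(mini_abstract(
    ["method", "determs", "strontium", "human", "radioactivation", "analysis"]
))
```

Because the sequence of terms carries meaning in a mini abstract, the function does not sort or deduplicate the input.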

5. Telegraphic Abstract
 It is written in a telegram style and is therefore imprecise.
 It is written in incomplete sentences and really resembles a telegram
 It contains a string of keywords which serves as crude indicator of the subject
scope of the document.
 This type of abstract is computer-produced based on word counts, i.e. the higher
the frequency with which a word appears in the document, the higher the
possibility that the word will be part of the string. It is just a string of keywords
without syntax.
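
The word-count idea behind a telegraphic abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular abstracting service: the stop list is a small made-up sample, and the function name telegraphic_abstract and the max_terms parameter are hypothetical.

```python
import re
from collections import Counter

# A small illustrative stop list (assumption); a real service would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "in", "is", "are", "for", "to",
    "by", "on", "with", "that", "this", "was", "were",
}


def telegraphic_abstract(text: str, max_terms: int = 5) -> str:
    """Return the most frequent non-stop words as an unordered keyword string."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    # most_common() ranks by frequency; equal counts keep first-seen order.
    top_terms = [word for word, _ in counts.most_common(max_terms)]
    return " ".join(top_terms).upper()


sample_text = (
    "Strontium in bone was measured. Strontium and barium levels in bone "
    "were compared. Barium uptake in human bone is described."
)
print(telegraphic_abstract(sample_text, max_terms=3))
```

The output is only a string of frequent words with no syntax, which is why this form of abstract serves as a crude indicator of subject scope.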

11.6 Parts of an abstract


Abstracts have three major parts:
1. Bibliographic Reference Portion
 Contains the complete bibliographical citation of the original document. It
directs the customers to the original document.
 All abstracts must be accompanied by an adequate bibliographic reference
in any publication or other applications where they appear remote from the
parent document.
 The data must be accurate, complete and adhere to some set of rules or
standards.
 The precise contents of bibliographic reference and its format are
dependent on the practice or standards being adopted by the abstracting unit
of the library or information center.

Parts of the bibliographic reference portion


a. Document identification number
 This number is an accession number that is sequentially assigned to a
document as it arrives and is processed
 The number serves to identify the particular abstract within the abstracting
periodical for easy retrieval
b. Author/s
 Prominence is given to the author’s name if it is placed before the title of
the document
 Usually author’s name is written in the inverted order, surname, first name,
middle initial.
 Some abstracting services give all names when there are three authors,
while other services write only the name of the first author and use the
abbreviation [et al.] as a substitute for the names of the others.
Example:
Buenrostro, Juan C. Jr.

c. Author affiliation
 The author’s affiliation is given in parenthesis following the name.
 If there are two or more authors working for different organizations,
their respective affiliations are written after each name.
 This portion helps the customer identify the institution with which the author is
connected and where the document originated.
 It also helps the customer know where and how to contact the author for
future consultation.
Example:
(Institute of Library Science, U.P. Diliman, Q.C.)

d. Title
 Serves as the guide to the subject content of the document.
 For the purpose of accuracy, the actual title is normally lifted and written
verbatim in the bibliographic reference portion of the abstract
 For titles of foreign language documents, these are cited in both the original
language and the translated language.
Example:
“Librarianship and the New Professional in the 21st Century”
(Ang Librarianship at ang Bagong Propesyonal sa ika-21
Siglo)

“An Evaluation of Graduate Library Education Programs in
Institutions of Higher Learning”

e. Source of the Document


 This portion is very important because it enables the customer to locate the
original document
 The source of the document is described in the bibliographic reference
portion this way:

Periodical/ journal title, volume and or issue number, date of issue, and
pagination.

 The periodical title often appears in an abbreviated form


Example:
J. for Journal
Soc. for Society
Lib. for Library
Lit. for Literature

 As for standards which recommend periodical title abbreviations, ISO 4-1986,
Documentation: Rules for the Abbreviation of Title Words and Titles of
Publications, presents an international code for the titles of periodicals.
 On the other hand, ISO 832-1975 contributes specifically by listing
abbreviations of typical words in bibliographic references.
 After the journal/periodical title come the volume and/or issue number, the date of
publication in parentheses, and then the pagination.

For example:

Educ Qrtly. 37: 1 (Mar 1990): 74-90
J. of Phil. Librarianship. 15: 1&2 (Mar. & Sept. 1992): 1-5.
J. of Phil. Librarianship. 16 (1993): 19-30

f. Original language
 If the article for which the abstract is being prepared is in a language other than
English, this should be stated after the source, e.g. (Text in Filipino)

Samples of bibliographic reference portion

Buenrostro, Juan C. Jr. (Inst. of Library Science, U.P. Diliman, Q.C.)
“An Evaluation of Graduate Library Education Programs in Institutions of
Higher Learning” J. of Phil. Librarianship. 16 (1993): 19-30.

Buenrostro, Juan C. Jr. (Inst. of Library Science, U.P. Diliman, Q.C.)
“Librarianship and the New Professional in the 21st Century” (Ang Librarianship at ang
Bagong Propesyonal sa ika-21 Siglo) J. of Phil. Librarianship. 15: 1&2 (Mar. & Sept. 1992):
1-5. (Text in Filipino)

Source: Buenrostro, Juan C. Abstracting and indexing made easy. Quezon City : Great
Books Trading, 2002.
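
The order of elements in the bibliographic reference portion (author, affiliation, title, source, original language) can be illustrated with a short sketch. The function name bibliographic_reference and its parameters are hypothetical, chosen only for this example; the punctuation follows the samples above, with straight quotation marks used for simplicity and the pagination 19-30 taken from the source-format example given earlier.

```python
from typing import Optional


def bibliographic_reference(
    author: str,
    affiliation: str,
    title: str,
    journal_abbrev: str,
    volume: int,
    pages: str,
    year: int,
    issue: Optional[str] = None,
    months: Optional[str] = None,
    translated_title: Optional[str] = None,
    text_language: Optional[str] = None,
) -> str:
    """Assemble author, affiliation, title, source and language in that order."""
    parts = [f"{author} ({affiliation})", f'"{title}"']
    if translated_title:
        parts.append(f"({translated_title})")
    # Source: journal title, volume and/or issue, date in parentheses, pagination.
    volume_part = f"{volume}: {issue}" if issue else str(volume)
    date_part = f"{months} {year}" if months else str(year)
    parts.append(f"{journal_abbrev} {volume_part} ({date_part}): {pages}.")
    if text_language:
        # Stated only when the original is not in English.
        parts.append(f"(Text in {text_language})")
    return " ".join(parts)


print(bibliographic_reference(
    author="Buenrostro, Juan C. Jr.",
    affiliation="Inst. of Library Science, U.P. Diliman, Q.C.",
    title="Librarianship and the New Professional in the 21st Century",
    translated_title="Ang Librarianship at ang Bagong Propesyonal sa ika-21 Siglo",
    journal_abbrev="J. of Phil. Librarianship.",
    volume=15,
    issue="1&2",
    months="Mar. & Sept.",
    year=1992,
    pages="1-5",
    text_language="Filipino",
))
```

The optional parameters mirror the optional elements in the notes: an issue number, a month, a translated title, and a language statement appear only when they apply.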

2. The Body or Abstract Proper


 It contains the complete ideas or relevant data from the original document
 The abstract proper contains the following parts:
a. Purpose
b. Methodology
c. Results and Conclusion

3. Signature Section of the Abstract


 Indicates the abstractor’s name to give him/her credit, to place responsibility upon
him/her, and also to indicate authority
 Signature may be full names or initials only. Full names are written as means of
positively identifying the abstractor and also as means of rewarding him/her.
4. Descriptors portion
 The descriptors portion is optional. It may or may not be included in the abstracts.
 This portion is done by putting a string of subject headings assigned by the
abstractor to represent the subject matter of the contents of the document.

11.7 Principles and concepts of abstracting

Types of Materials Abstracted

1. Journals – main sources of primary literature. Abstracts are made for all
papers containing significant materials in the journal. Among these are:
a. theoretical papers
b. research papers
c. technical papers
d. speculative essays
e. review articles
f. letters to the editor
g. editorials

2. Reports – primarily reports from recipients of federal grants and of other foreign research

3. Theses and Dissertations – are important sources of original documents
e.g. Dissertation Abstracts International

4. Books and Monographs

5. Patent specifications
6. Conference and symposium proceedings

Limitations of an Abstract

1. Abstracts vary in quality from worthless to superb (excellent, high quality)


- can be affected by factors like errors, omission, abstractor’s bias/es or may simply
be poorly written

2. NOT all users are equally proficient in using abstracts

The Abstracting Process

1. Record the reference fully and accurately


- the order of presentation may depend on the choice of the abstractor or the abstracting
agency, but the elements included are fairly the same:
(a) Title – if the title is vague or misleading, the abstractor should take corrective measures
by adding or modifying words and enclosing them in brackets.
As a general rule, however, titles should be retained as they are published except for a few
problematic cases.
(b) Author – may or may not come before the body
(c) Author’s affiliation – makes it easier for the reader to contact the author in case he/she
wants reprints of the original documents.
(d) Funding agency – a form of acknowledgement of support or grant
(e) Publication source – the key unit in the reference, which provides the location of the paper. It
should be accurate and consistent and should follow standard conventions for citing.
2. Content Analysis
(a) Reading-Understanding – the initial step, wherein the introductory paragraphs
and text are scanned for key information. This concludes with comprehension or textual
meaning interpretation.
(b) Selection – a process of purposeful elimination carried out by means of contraction, reduction
and condensation strategies. Here, the abstractor may mark the important phrases and
passages and jot down marginal notes.
(c) Interpretation – using reasoning and inference, the abstractor makes a second
interpretation. Here, he/she starts organizing the phrases and passages previously marked as well
as the marginal notes jotted down. Then, a rough draft of the abstract is produced.
(d) Synthesis/Analytical description – the desired type of abstract is carefully considered in
writing the final draft. Information must be organized so that the abstract contains the
following:
1. Objective/Purpose – should be stated unless this is already clear from the title of
the document or can be derived from the remainder of the abstract
2. Methodology – Techniques/approaches should be described only for purposes of
comprehension. New techniques should be identified clearly
3. Results and conclusions should be clearly presented.

In the presentation of data, main findings should be highlighted. Collateral information and
additional information may be added.

Collateral Information – findings or information incidental to the main purpose of
the study, including (a) modifications of new methods; (b) new instruments; (c) newly
discovered documents; (d) newly discovered data sources.

Additional Information – includes tables and illustrations. These may be indicated in
abbreviated form within parentheses at the end of the abstract.
e.g. (4 tab., 5 fig.)

3. Writing the Narrative in Natural Language


The result of the content analysis must now be written. An outline is a useful device.
The first sentence of the abstract should be a topic sentence that tells the readers what the
paper is all about, or will allow the readers to decide if they should continue reading or not.

Things to Remember when writing an abstract

1. The first sentence should not repeat words in the title
e.g.
Title: The history of cats in Bontoc, Mountain Province

The abstract should not begin with: “This paper is about the history of cats in
Bontoc, Mountain Province.”

2. Build upon the information in the title; don’t duplicate it.
3. The abstract should not be written while reading the content.
4. Try to avoid vague expressions and long, rambling sentences and redundant phrases
 Avoid words that can have different meanings depending on the context in which they
are used. Simple and short sentences should be used
5. Omit what the user likely knows or may not be interested in (e.g. background,
historical info.)
6. Stress should be on what the author did, and not on what he tried to do or intends to
do next.
7. Abbreviations may be used as long as they are known or familiar to the users. If not,
define them first before abbreviating.
8. Avoid jargon (the vocabulary of a specialized field), because these words are not
commonly understood.
9. Critical abstracts should not take sides on controversial issues and should not distort
what the author is really saying.

4. Writing of the Signature (Abstractor’s full name or initials)

WRITING THE ABSTRACT PROPER

STEP 1 READ THE DOCUMENT


 To gain an understanding of its content and an appreciation of its scope
 The introductory paragraph of the document should be carefully read because the
introduction usually depicts the objective of the author in writing the paper. The
summary and conclusions at the end of the document should be noted because they
reveal the author’s findings which form part of the abstract.

STEP 2 NOTE DOWN KEY INFORMATION


 Note down the answers to the following questions:

1. What did the author hope to accomplish, or why was the study conducted?
(These are the purposes or objectives of the study.)
2. How did the author/investigator achieve what he wanted to accomplish?
(Describe the methodology and techniques of the study; type and number of
respondents, tests applied, and measurements used.)
3. What did the author find and conclude? (Highlight the main findings and clearly
state the conclusions of the study. Describe the findings as concisely and
informatively as possible.)

STEP 3 ORGANIZE THE KEY INFORMATION


 Organize the key information by making a draft of an abstract from notes recorded in
STEP 2, using a standard format and in keeping with the sequencing of the
components and word length of original draft.

STEP 4 FOLLOW STANDARD ABSTRACT FORMAT


 Abstracts have three major parts, namely: the reference, body and the signature.
 The reference portion directs the customers to the original document, hence it should
be accurate and complete. The body contains the abstract itself, the signature indicates
the abstractor (either the name or initials may be given) and usually comes at the end
of the abstract proper.

STEP 5 CHECK THE DRAFT ABSTRACT


 Check the punctuation, spelling, accuracy, omissions, and conciseness. Accuracy is
particularly essential. Apart from the errors due to carelessness, proper names and
chemical and mathematical formulas are particularly susceptible to mistakes.

STEP 6 EDIT AND POLISH THE DRAFT


 When all the necessary amendments have been spotted, edit the draft abstract and
make any improvements to the style that are possible.

STEP 7 WRITE THE FINAL ABSTRACT

Five Components of the Body of the Abstract

1. Scope = 3%
 States the what of the study and its boundaries and limitations; for example, 18 words

2. Objectives = 7%
 States the why of the study; for example, 42 words

3. Methodology = 15%
 States the techniques used, apparatus, equipment, tools, materials, respondents
studied, and tests and measurements employed, e.g. chi-square, t-test, etc.; for
example, 90 words

4. Findings = 70%
 This portion concisely presents the results obtained in the study, for example, 420
words

5. Conclusion = 5%
 States the conclusion and suggested courses of action to be taken, for example, 30
words
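
The example word counts above are consistent with an abstract of 600 words (3% of 600 is 18 words, 70% of 600 is 420 words, and so on). The sketch below applies the same percentages to any target length; the function name word_budget is hypothetical, and rounding to whole words is an assumption made for this example.

```python
# Percentage share of each component of the body, as listed in the notes.
COMPONENT_SHARES = {
    "Scope": 3,
    "Objectives": 7,
    "Methodology": 15,
    "Findings": 70,
    "Conclusion": 5,
}


def word_budget(total_words: int) -> dict[str, int]:
    """Split a target abstract length into per-component word counts."""
    if sum(COMPONENT_SHARES.values()) != 100:
        raise ValueError("Component shares must add up to 100 percent")
    # Rounded to whole words; for some totals the parts may not sum exactly.
    return {
        name: round(total_words * share / 100)
        for name, share in COMPONENT_SHARES.items()
    }


for component, words in word_budget(600).items():
    print(f"{component}: {words} words")
```

For a shorter abstract, such as 250 words, the same call gives only a rough guide, since each share is rounded independently.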

STYLE
Generally accepted rules for good writing are also applicable to the writing of
abstracts. Clarity and concise expression characterize a good abstract.

XII. APPLICATIONS OF ABSTRACTING


12.1 Primary publications
12.2 Indexing and abstracting journals and bulletins
12.3 Database products
12.4 Current awareness services
