well known and familiar expressions. Scientists have permitted rather different senses for old
terms with the emergence of new theoretical concepts. In science, theories are continuously
changing, but the change does not cause a waterfall of new words. The new phenomena are
interpreted through the old familiar ones, through old words whose meaning is slightly,
continuously changed. For instance, everybody knows the common meaning of the word 'isolate'.
In Colon Classification its meaning is completely changed (that which in isolation is not fit
to form the name of a subject and has to have a Basic Subject component along with it to
represent a subject). In the language of science, new concepts emerge, and old notions are
often assigned new meanings. Because of their continuously changing nature, specific languages
of science are accessible only to those working in the field and thus constantly interacting
with the informational flow in science. It can be said that the emergence of a new independent
discipline must be accompanied by the emergence of a new specific language or a dialect of the
discipline. "In the realm of professional activities too there evolves a restricted and
accommodated language, a so called professional language. A professional language is developed
and used by people for some professional purpose, that is, performing professional tasks in a
specific field and communicating about them. A professional language consists of vocabulary
(terminology and concepts) and types of communicative acts (including typical intentions). A
professional language is not only used for talking about an activity field (a universe of
discourse); it is also a part constituting it" [9]. Of course, the Library and Information
Processing profession has its own professional language and dialects too.

BIBLIOGRAPHIC INFORMATION SYSTEM

The functions of a Bibliographic Information System are: collecting, organising, storing,
citing and disseminating of documents or the information/ideas recorded in the documents. Such
information processing activity can be generally divided into the following stages:

1. Collection of information/documents embodying information with the purpose of supplying
most fully the necessary information/documents in accordance with the request of the users;

2. Information analysis (analytic and synthetic processing of information/document
surrogates/documents themselves), categorisation and systematisation of the incoming flow of
information; and

3. Storage and retrieval of information/documents and the development and use of information
retrieval procedures with the use of modern means for achieving the desired results.

Of the above mentioned three stages, the second one, namely information analysis or document
analysis or subject analysis or content analysis, decides the efficiency of the system. It
deals with systematising or organising the information/documents accompanied by classification
and indexing of the information content of the documents; the creation of new secondary
documents or document surrogates on the basis of certain rules and procedures depending on
specific tasks of information practice to form the index file or enquiry file; and the
retrieval procedures constituting the interface in the information retrieval process. A model
Bibliographic Information System is given in Fig. 1. Documents are selected, received and
their information content analysed to determine the subject categories into which they are to
be classified as well as the index terms associated with the documents, according to a
specified language having grammar, called information/documentation/indexing language. After
the completion of this analysis, there is a branching which allows for more than one method of
'organising' the 'files'. The documents themselves are stored physically in a pre-defined
sequence constituting the 'storage file'. Since only one physical arrangement of documents is
possible without expensive duplication of the documents, alternative paths of access to them
are provided through the 'index file' which contains the index terms determined at the
analysis step.

The user requests information. She/he interacts with the index and/or storage file. To do so,
she/he must convert his/her request for information into a well defined search question to
which the system can respond. The formulation of the search question again has to be according
to or in the information/indexing language. When the user is satisfied with the documents
which have been located in response to this
request, she/he reads them to obtain new facts and ideas. Presumably, she/he will now become
a generator of new documents which will in turn find their way into the system.

[Fig. 1: Model Bibliographic Information System, with blocks SELECTION, ANALYSIS, STORAGE and INDEX]

The nucleus of the Bibliographic Information System is the information/indexing language.
According to it the documents are analysed. The creation and organisation of the 'index file'
and the setting up of the retrieval procedures are all based on the analysis of the documents
according to the prescriptions of the information/indexing language.

Natural language is a tool for thinking and a means of communication. In such a language there
are synonymous correspondences between words and meanings. The meaning of words, sentences and
even speech may change with time. Hence, the use of a natural language as such in information
retrieval systems for the description of semantic contents of documents is associated with
difficult problems. In order to overcome these problems, special artificial languages called
information languages/indexing languages/documentation languages are used. These artificial
languages are formalised auxiliary languages, created by information professionals for use in
bibliographic information storage and retrieval systems. Because of their formalised structure
they are regarded as metalanguages for information organisation and retrieval. They are used in

1 analysing the information content of documents;

2 formulating names for the information content or names for the subject of the documents;

3 verifying the correctness and completeness of the statements of the names of subjects of
documents; and

4 constructing other information handling tools.

INFORMATION/INDEXING LANGUAGE AS METALANGUAGE

In information/indexing languages, the meaning extracted from the text of a document (denoting
the information content) is designated by concepts and their inter-relations (which are not
necessarily found in the text) by 'ad hoc' symbols (descriptors, role indicators, facet
indicators, links etc.), in a formalised way. Because
these sets of symbols are external to the object-language in which the documents under analysis
are written and intended to facilitate the manipulation of these documents in various ways and
for various purposes, they can be called 'metalanguages' [4]. They are not only used in
analysing 'what the information content of documents are' but also in constructing statements
of the names of subjects of documents and also for verifying the correctness and completeness
of these statements. In the formalised statements of these metalanguages "some of the 'logical
semantic' relations, specifically those of implication are specified, but not in the
surface-structure of the object-language, that is, natural language" [10]. Moreover, these
information/indexing languages, though developed under independent conditions for the analysis
of documents of many different kinds - not only scientific texts, but also sociological
records, scripture, folklore etc. - have striking similarities [4]. An account of these
metalanguages has been given by Gardin [11].

FEATURES OF INFORMATION/SUBJECT AND LINGUISTIC ANALYSIS

Information/document analysis in the context of a particular science obviously implies some
knowledge 'about' the universe of discourse. There is a need to have a knowledge to write
summarising statements about the content of the document/information being analysed. It is
left to be decided whether language analysis in a more general, less specialised field of
speech or writing does not also imply some knowledge about that field.

Information/indexing languages do not feel it an obligation to consider the natural language
sentence as the proper unit of analysis, as a large number of linguists would have it. For one
thing, the definitions of what a sentence is, in any given natural language, are not so stable
as to provide a firm analytical framework for the kind of semantic analysis required in
information/indexing languages. Conversely, infra-sentential units often provide a more
convenient basis on which to conduct the derivation of information/indexing language units,
which are later chained to one another through procedures that again overstep the boundaries
of the natural language sentence.

In language analysis the priority is given to syntactical analysis, which is considered as the
necessary step by which to initiate the description or generation of natural language text and
to the understanding of language. The standard unit of analysis is the sentence, as it is more
or less intuitively defined in traditional grammar; any larger unit is felt to exceed the scope
of syntactical analysis. The basic language units for syntactical analysis are grammatical
categories (nouns, verbs, adjectives etc.) and grammatical functions (subject, object etc.) of
former times. But they are being further refined. In accordance with the priority given to
syntactical analysis, semantic analysis is confined to a subsidiary position as illustrated by
Katz and Fodor [12], which still seems to form the basis of the idea that the properties of
'surface structures' (in general terms) play a primary role in the determination of meaning
[13].

But linguists themselves have felt the need to broaden the basis of language analysis. Grimes
[14] has pointed out that it is unwise to go on ignoring the findings of other disciplines also
concerned with the analysis of text, such as rhetoric, criticism etc., and information science
itself, especially since some of the better defined procedures recently set forth in these
areas occasionally unveil facts that linguists would be ill-advised to neglect. A number of
linguists have proposed the broader concept of 'paraphrase' as a basis upon which to decide the
proper level where sentences should be assigned the same 'deep structures' [15]. Also a notion
of syntactic analysis as a less discriminating form of semantic analysis [16], and the notion
that "syntactic structures are derived from semantic graphs" [17] point to the fact that syntax
is no more considered to be the first step of language analysis. The derivation of syntactic
structures on 'semantic graphs' is exactly the strategy adopted in information/indexing
languages in connection with subject/document analysis. An investigation of 'prelexical'
structures, equivalent to representations in information/indexing languages, prior to the
development of grammatical transformations has also been advocated by linguists [18]. Again
the adoption of "semantic categories (that) can guide the interpretation of sentences,
independently of and in parallel with perceptual processing of the syntactic structure" [19]
has also been advocated. Though some linguists have favoured units smaller than sentence
[16, p. 101-102], others on the contrary, have demonstrated that larger units are necessary to
provide a proper
entity) [10, p. 210], to express semantic as well as syntactic representations with:

1 the need to categorise the symbols of the vocabulary (words, descriptors) in such a way that
formation rules equivalent to the phrase-structure rules of grammar can be stated adequately
with no regard to the grammatical status customarily assigned to the words concerned; and

2 the need to account for the derivation of propositions from one another, as a necessary
component, in the understanding of language behaviour.

COMPONENTS OF INFORMATION/INDEXING LANGUAGE

In documentation/information/indexing languages, there is a focus on the concept of 'summary'
or 'aboutness' which may not be considered important in 'other' languages. Information/indexing
languages are used to formulate statements as answers to questions of the type 'what the
information content of particular documents are about'. These statements stand as names of the
topics or subjects of the documents. In short they are languages for naming subjects for the
special purpose of storage and retrieval. The minimum essential components of any such
metalanguage are [4, p. 147]:

1 The 'lexicon', a list of content terms, either extracted as they occur in a given natural
language (keywords) or redefined for the purpose of the analysis (descriptors). If the list of
content terms carries no relational information of any kind, then the metalanguage is said to
be 'unorganised' (uniterm lists).

2 A set of relational data provided 'a priori' with the lexicon, irrespective of the way in
which they are expressed (cross references, hierarchy, tree, factoring etc.), and whether they
be regarded as semantic data as in taxonomies or as syntactic templates as in some faceted
classification schemes. This forms the paradigmatic structure of the metalanguage.

3 To take into account the immensely diverse relational data observed in documents, a set of
rules of syntax, constituting the syntagmatic structure of the metalanguage, which is
contrasted to the paradigmatic structure, not in essence but in use.

POSTULATE BASED PERMUTED SUBJECT INDEXING LANGUAGE

The Postulate-based Permuted Subject Indexing (POPSI) language of Bhattacharyya [36-42] has all
the essential features mentioned above. It is based on

1 a set of postulated Elementary Categories of the elements fit to form components of names of
subjects;

2 a set of syntax rules with reference to the Elementary Categories for formulating admissible
names of subjects; and

3 a faceted hierarchic scheme of terms for vocabulary control called Classaurus.

Each of the terms occurring in POPSI language statements has its appropriate broader terms
prefixed to it. Hence the POPSI language has been shown as a source language for producing and
organising associated indexes [43]. It has also been shown as a metalanguage for computer aided
generation of information retrieval thesaurus [44], as well as Classaurus [45, 46]. POPSI
language is amenable to computerisation of index generation, especially in producing different
types of indexes including chain index and PRECIS-format index entries [47, 48]. It is also
possible to use POPSI language in computer based online information retrieval. In such an
online information retrieval system the searcher need not know any query language and the
system would have a built-in vocabulary control mechanism [48, 49].

CONCLUSIONS

A study of the salient features of language and language analysis is necessary for the design
of information retrieval systems in general and indexing systems in particular [2]. It would
help in incorporating the required features, though not exactly but at least analogically, in
indexing systems. POPSI has incorporated all the essential
features necessary for making it a universal indexing system. Its resilience and amenability
to computerisation have paved the way for research in this direction, the ultimate aim of
which is to develop an online information retrieval system which would be user friendly.

BIBLIOGRAPHICAL REFERENCES

1. Beling, G. The use of EDP in terminological work. In Overcoming the language barrier: Proceedings of the Third European Congress on Information Systems and Networks. Luxembourg, 3-6 May, 1977. K.G. Saur, London. 1979. v.1, p 102-103.

2. Mitchell, Gillian. The natural language foundations of indexing language relations. Can Jl of Inf Sc 1979, 4, 99-104.

3. Nalimov, V.V. In the labyrinths of language: A mathematician's journey. ISI Press, Philadelphia. 1981. p 3.

10. Montgomery, C.A. Linguistics and information science. Jl of Amer Soc for Inf Sc 1972, 23(3), 214-215.

11. Gardin, J C. Semantic analysis procedures in the science of man. Soc Sc Inf. 1969, 9, 1342.

12. Katz, J and Fodor, J.A. The structure of semantic theory. Language 1963, 39, 170-210.

13. Chomsky, N. Deep structure, surface structure and semantic interpretation. In Steinberg, K and Jakobovits, L., Ed. Semantics: An interdisciplinary reader in philosophy, linguistics and psychology. Cambridge University Press, Cambridge. 1971. p 214.

14. Grimes, J E. The thread of discourse. Cornell University, Dept. of Modern Languages and Linguistics, Ithaca, New York. (Technical Report No.1). (NSF Grant. GS-3180). 1972. p 13-34.

15. Partee, B H. On the requirement that transformations preserve meaning. In Fillmore, C.J. and Langendoen, D.T., Ed. Studies in linguistic semantics. Holt, Rinehart and Winston, New York. 1971, p 3.

23. Fillmore, C J. Verbs of judging: An exercise in semantic description. In above cited ref. 14; p 273-89.

24. Fillmore, C J and Langendoen, D.T., Ed. Studies in linguistic semantics. Holt, Rinehart and Winston, New York. 1971.

25. Fillmore, C J. The case for case. In Bach, E and Harms, R. Ed. Universals in linguistic theory. Holt, Rinehart and Winston, New York. 1968, p 1090.

26. Vermier, D. Semantic hierarchies and abstractions in conceptual schemata. Inf Systems 1983, 8(2), 117-24.

27. Ranganathan, S R. Hidden roots of classification. Inf Stor and Retr 1967, 3, 407-9.

28. Neelameghan, A. Absolute syntax and structure of an indexing and switching language. In Ordering systems for global information networks: Proceedings of the Third International Study Conference on Classification Research. Bombay, 6-11 Jan, 1975. (FID/CR Pub No. 553). DRTC, Bangalore, 1979, p 170.

29. Bach, E. Nouns and noun phrases. In above cited ref. 24; p 91.

30. Sager, N. Natural language information formatting: The automatic conversion of texts to a structured data base. Advances in Computers 1979, 17, 89-162.

31. Schank, R C. Inference and paraphrase by computer. Jl of Asso for Comput Machinery 1975, 22, 309-28.

32. Schank, R C and Abelson, R P. Scripts, plans and knowledge. In Proceedings of the Fourth International Joint Conference on Artificial Intelligence. Tbilisi. 1975, p 151-157.

35. Lehnert, W A. The process of question answering. Dept of Computer Science, Yale University. Research Report. 88. 1977.

36. Bhattacharyya, G and Neelameghan, A. Postulate-based subject headings for dictionary catalogue system. (DRTC Annual Seminar. 7; 1969; paper CA).

37. Bhattacharyya, G. A general theory of SIL, POPSI and Classaurus: Results of current classification research in India. (Paper presented at the International Classification Research Forum, organised by SIG/CR of ASIS, Minneapolis, Oct 1979).

38. Bhattacharyya, G. POPSI: Its fundamentals and procedure based on a general theory of subject indexing languages. Lib Sc with a slant to Doc 1979, 16(1), 142.

39. Bhattacharyya, G. Some significant results of current classification research in India. Intl Forum on Inf and Doc 1981, 6(1), 11-21.

40. Bhattacharyya, G. Subject indexing language: Its theory and practice. (DRTC Refresher Seminar, 13; 1981; paper BA).

41. Bhattacharyya, G. Elements of POPSI. In Rajan, T.N., Ed. Indexing systems: Concepts, models and techniques. IASLIC, Calcutta. 1981. p 82.

42. Bhattacharyya, G. Classaurus: Its fundamentals, design and use. In Dahlberg, I., Ed. Proceedings of the 4th International Study Conference on Classification Research. Augsburg, 28 June-2 July, 1982. Indeks Verlag, Frankfurt. v 1. 1982. p 13948.

43. Bhattacharyya, G. POPSI: A source language for organising and associative classifications. Lib Sc with a slant to Doc 1982, 19, 243-52.

46. Devadason, F J. Online construction of alphabetic classaurus: A vocabulary control and indexing tool. Inf Process and Mgt 1985, 21(1), 14.

47. Devadason, F J. Computerisation of deep structure based indexes. Intl Classif 1985, 12(2), p 87.

(Guide: M.R. Kumbhar). Karnatak University, Dharwad. 1985.