Russian Experience in Hypertext: Automatic Compiling of Coherent Texts

R. S. Gilyarevskii
All-Russian Institute for Scientific and Technical Information, Usievicha 20a, Moscow 125219, Russia
M. M. Subbotin
State Scientific Technical Center of Hypertext Information Technologies, Zemlyanoi Val 52/16, Moscow, Russia
Russian hypertext research emphasizes algorithmic navigation. Navigation rules are based on features of hypertext nodes formulated in terms of graph theory. The trail built in this navigation can be perceived as nonformal reasoning or as a coherent text. Creating hypertext systems raises specific problems of logic and structural analysis, which were first advanced by Russian researchers. The Russian hypertext systems HYPERLOG, HYPERNET, BAHYS, and SEMPRO are described.

Received February 15, 1991; revised June 16, 1991, March 16, 1992; accepted October 21, 1992.
© 1993 John Wiley & Sons, Inc.
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 44(4):185-193, 1993 CCC 0002-8231/93/040185-09

Introduction: The Concept of Hypertext

The readers of JASIS are familiar with the basic concept of hypertext (Lunin & Rada, 1989), which is why we shall dwell only upon some features of this concept that are significant from our point of view.

Hypertext is interpreted as a nonlinear, network form of arranging textual material.

Thus, textual material consists of separate fragments (“nodes”) with indicated possible transitions (“links”) between them. There are various ways of establishing these links, but for us the important issue is the semantic proximity of the linked fragments.

As a rule, every fragment is connected with several others by links, which gives the material a network form. The process of reading the fragments that form a hypertext by following the links is called navigation.

A hypertext network gives a reader the possibility of navigating along different routes, i.e., reading the material in different orders, and not in only one, as in the case of ordinary linear texts. In contrast to the network forms of representing knowledge used in artificial intelligence systems, the hypertext network is intended not for acquiring derivative knowledge (a conclusion, appraisal, diagnosis, etc.) but for reading the textual material sequentially; the goal here is the same as in reading a linear text: to master the knowledge given in the text.

When knowledge is represented in the form of hypertext, the reader can acquire it more actively: he/she alone can select the initial point and order of reading in accordance with his/her intellectual interests and knowledge.

These advantages of hypertext over linear text become apparent only if the reader establishes proper navigational routes in the hypertext network, i.e., if the consecutively read fragments form a joint, coherent content.

The problem of finding suitable intelligent navigation routes is acute when large and complex hypertext networks are under discussion.

Large and Complex Hypertext Networks

Very large hypertext networks usually appear when we speak about a dynamic hypertext (Carmel, McHenry, & Cohen, 1989) that expands permanently with newly inserted information. Complex networks emerge in a hypertext that is not originally built hierarchically (new information is not expected to be inserted into rubrics set beforehand), and especially when the links of a new node are set by estimating its semantic proximity to each available node. As a rule, many cycles appear in texts built in this way. It should also be mentioned that a link which reflects semantic proximity may go in two directions, so a graph built on these links is, in the general case, not directed. The more links a node has in a network, the more difficult it is for the user to select the next node during navigation. The well-known problem of “getting lost” in hypertext emerges (Conklin, 1987).
Directed links, a hierarchically organized structure, composite nodes, and, finally, routes paved beforehand for the reader all reduce the difficulty of choosing during navigation. But this is achieved simply by reducing the possible choice. All these means of facilitating navigation weaken the most interesting property of hypertext: the variety of implicit routes (linear texts) and the possibility of revealing unexpected but quite intelligent routes in the hypertext.

From our point of view, it is very important that the hypertext include all the variety and fullness of possible semantic links between cognitive elements, and thus a great variety of potential navigational routes.

This means that facilitating navigation and the perception of the material read during navigation must be achieved not by directly reducing the number of possible routes, but in another way.

We believe it can be achieved by using certain rules and criteria in selecting each next node from the possible variety.

Navigation Which Provides Coherence of the Read Material

If links in hypertext reflect a semantic connection between the contents of nodes, then the sequence of text fragments determined by a navigational route always has some level of semantic organization. But the presence of semantic links between neighboring fragments alone does not guarantee that the joint content of the nodes will be coherent for the user and thus can be mastered effectively. To meet these requirements during navigation, one must take into consideration not only semantic proximity to the immediately preceding node, but also semantic and logical relations with all preceding nodes of the navigational route.

Under ordinary “manual” navigation in a large and complex hypertext network it is very difficult to select a next node that meets these requirements. We believe a sound, effective choice of the next node can be made on the basis of an estimate of its place in the topological structure of the hypertext network.

Correspondence between the Hypertext Topological Structure and the System of Semantic Links of Knowledge Units

There are direct and mediated links between the elements of knowledge in any subject domain. A direct link exists if one unit of knowledge confirms another, makes it more concrete, generalizes it, or presents it as a cause, goal, etc.

The topological structure of the hypertext network can quite adequately reflect a system of such semantic links under two conditions: if nodes are elementary monosemantic units of knowledge (statements, ideas, facts, etc.), and if links are established in the hypertext in all cases when a direct semantic link is present.

The correspondence between the hypertext topology and the system of links of the knowledge represented in the hypertext makes it possible to give a semantic interpretation to the graph characteristics of the nodes. In particular, a correct semantic connection of the nodes in a navigation route can be described in graph theory terms (e.g., as a requirement that every next node have links not only with the previous node, but also with some other preceding nodes of the navigational route). Such criteria for selecting nodes in a navigational route are heuristic. Over the last 20 years we have identified and checked a series of requirements on the graph characteristics of the node being chosen. Keeping to these requirements provides quite high logical and semantic quality of the textual material built in the process of navigation.

Possibility of Navigation under Rules (Algorithmic Navigation)

Given criteria for selecting nodes according to structural, graph-theoretic features, navigation can be carried out under rules. Using these rules does not turn the hypertext into an expert system. These rules do not require an already checked model of a subject domain; they are, in general, independent of the concrete material. That is why they can also be applied in dynamic hypertext, which is permanently expanding and changing with new information.

In the Russian systems HYPERLOG, SEMPRO, and BAHYS, these rules are implemented in corresponding algorithms and programs and provide algorithmic navigation through hypertext.

Illustrating the Algorithmic Navigation in Dynamic Hypertext

Let us assume that we accumulate knowledge on the problem “navigation in hypertext,” finding statements in the literature on this problem and writing them down in the order we came across them. Let us assume that we now have the 14 statements listed in Table 1.

Let us establish direct semantic links in our collection of statements and record them (Table 2). The left column of Table 2 contains the statement numbers in the order they were initially set down, and each row contains the numbers of the statements semantically adjacent to that one.

The hypertext network reflected in this table can easily be presented graphically (Fig. 1).

A link in this hypertext indicates that the corresponding statements are semantically adjacent; therefore, navigation here provides a certain level of coherence of the sequence of statements.

However, ordinary manual navigation does not, as a rule, provide a very high level of coherence. The reader can try to move across the links of this hypertext and, in most cases, will notice the typical defects inherent in this type of text building. On one hand, they are characterized by a violation of logic (e.g., later statements evidently introduce ideas,

TABLE 1. Collection of statements.
1. Hypertext, or nonlinear text consists of separate text fragments which are called “nodes.”
2. While in process of navigation, user has access to information on the content of neighboring nodes and, on this basis, selects
his/her route through the network.
3. A necessity arises to have something like a compass for hypertext navigation.
4. Russian research indicates that rules of algorithmic navigation may be based on structural characteristics of the hypertext network.
5. Possibility to move from one node to another is called “link.”
6. Following the hypertext network’s links is called “navigation.”
7. In complex networks, where each node has many links, selection for the next navigational step becomes very difficult.
8. Structural characteristics of the hypertext network, used for setting rules of algorithmic navigation, can be described in terms of graph
theory.
9. Following links, it is possible to traverse a hypertext network in various directions and by different routes.
10. With the rules, navigation may be carried out algorithmically.
11. Since every node may have many links, the so-called “hypertext network” emerges. A hypertext network may be very large and of high
complexity.
12. Each node has a prescribed set of links.
13. Rules determining direction of navigation in accordance with the user’s subject matter should act as a compass.
14. Many researchers point out that, in large hypertext networks, it is easy for the user to become disoriented or to get lost.
which were used in previous statements); on the other, texts built by manual navigation, as a rule, include far from all of the statements. Often, when the navigational route reaches some node, it turns out that all the neighboring nodes have already been used in building the linear text; having returned, one can move to new statements, but, as we know, it is very easy to get lost in a large hypertext (the reader keeps returning to routes he has already passed).

Our system, BAHYS, has automatically created a linear text from these 14 statements, built by one of the algorithms that implement our heuristic rules (see Table 3).

One can see that this linear text is logically consistent and sets out the material quite systematically.

The algorithm used in this case selects the nodes beginning from the third; the first two nodes are set by the user. The nodes are selected on purely structural indications. The following two criteria were used for selection: the current node should have no fewer than two direct links with previous nodes of the route (this provides quite a high level of general coherence of the linear text being built), and there must be a direct link with the immediately preceding node (this provides continuity of content).

For navigation in large hypertexts other criteria are also used. An important one is the following: the total number of links of the next node must not be more than that of the previous one (as numerous observations show, this gives the exposition a deductive character, from general to particular) (Subbotin, 1986).

Semantical Gaps

Of course, the possibility of building a logically coherent exposition of a theme depends not only on the criteria for selecting the next node, but also on the accumulated material itself, on the actual topology of the hypertext network. That is why the navigational algorithm must be adapted to the network structure. If it is impossible to select the next node by strong criteria (such as the one mentioned above), weaker criteria are used, or the entire set of criteria is not taken into account. Systems carrying out algorithmic
TABLE 2. Direct semantic links in the collection of statements.
Statement   Linked statements
1           5 12
2           6 7 9
3           7 10 13 14
4           8 10 13
5           1 9 11 12
6           2 7 9 11
7           2 3 6 11 13 14
8           4 10
9           2 5 6 11
10          3 4 8 13
11          5 6 7 12 14
12          1 5 11
13          3 4 7 10
14          3 7 11
FIG. 1. Arrows indicate the “good” path, forming good linear text.
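The two selection criteria described for the Table 3 example (a direct link with the immediately preceding node, and at least two direct links with nodes already on the route) can be sketched as follows. This is a minimal illustration under stated assumptions, not the BAHYS implementation: the names `TABLE_2`, `symmetrize`, and `navigate` are hypothetical; links are treated as undirected (Table 2 as printed lists 11 under statement 9 but not 9 under statement 11, so the sketch mirrors every link); and the tie-break between equally valid candidates (prefer a node with no more links than the current one, then the lowest number) is an assumption borrowed from the degree criterion mentioned in the text.

```python
# Table 2 as printed: statement number -> numbers of linked statements.
TABLE_2 = {
    1: [5, 12],
    2: [6, 7, 9],
    3: [7, 10, 13, 14],
    4: [8, 10, 13],
    5: [1, 9, 11, 12],
    6: [2, 7, 9, 11],
    7: [2, 3, 6, 11, 13, 14],
    8: [4, 10],
    9: [2, 5, 6, 11],
    10: [3, 4, 8, 13],
    11: [5, 6, 7, 12, 14],
    12: [1, 5, 11],
    13: [3, 4, 7, 10],
    14: [3, 7, 11],
}


def symmetrize(table):
    """Build an undirected graph: every listed link is mirrored (assumption)."""
    graph = {node: set(neighbors) for node, neighbors in table.items()}
    for node, neighbors in table.items():
        for neighbor in neighbors:
            graph.setdefault(neighbor, set()).add(node)
    return graph


def navigate(graph, first, second, min_back_links=2):
    """Extend a route whose first two nodes are chosen by the user."""
    route = [first, second]
    visited = {first, second}
    while True:
        current = route[-1]
        # Candidates are unvisited neighbors of the current node that have
        # at least `min_back_links` links with nodes already on the route.
        candidates = [
            node
            for node in graph[current]
            if node not in visited
            and len(graph[node] & visited) >= min_back_links
        ]
        if not candidates:
            return route
        # Assumed tie-break: prefer nodes with no more links than the
        # current node, then the lowest statement number.
        candidates.sort(
            key=lambda node: (len(graph[node]) > len(graph[current]), node)
        )
        route.append(candidates[0])
        visited.add(candidates[0])


route = navigate(symmetrize(TABLE_2), 1, 5)
print(route)  # same statement order as Table 3
```

Starting from statements 1 and 5, this sketch visits the statements in the order 1, 5, 12, 11, 9, 6, 2, 7, 14, 3, 13, 10, 4, 8, which is the order in which they appear in Table 3. The tie-break matters only once in this example (after statement 6, where both 2 and 7 satisfy the two criteria).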

TABLE 3. Automatically constructed text.
Hypertext, or nonlinear text consists of separate text fragments which are called “nodes” (Lunin & Rada, 1989).
Possibility to move from one node to another is called “link” (Gilyarevskii & Kaloshin, 1988).
Each node has a prescribed set of links (Luhn, 1958).
Since every node may have many links, the so-called “hypertext network” emerges. A hypertext network may be very large and of high
complexity (Nelson, 1966).
Following links, it is possible to traverse a hypertext network in various directions and by different routes (Bush, 1945).
Following the hypertext network’s links is called “navigation” (Chelnokov, 1985).
While in process of navigation, the user has access to information on the content of neighboring nodes and, on this basis, selects his/her
route through the network (Carmel, McHenry, & Cohen, 1989).
In complex networks, where each node has many links, selection for the next navigational step becomes very difficult (Otlet, 1975).
Many researchers point out that, in large hypertext networks, it is easy for the user to become disoriented or to get lost (Tseitin, 1961).
A necessity arises to have something like a compass for hypertext navigation (Conklin, 1987).
Rules determining direction of navigation in accordance with user’s subject matter should act as a compass (Garfield, 1955).
With the rules, navigation may be carried out algorithmically (Engelbart & English, 1968).
Russian research indicates that rules of algorithmic navigation may be based on structural characteristics of hypertext network (Subbotin,
1986).
Structural characteristics of hypertext network, used for setting rules of algorithmic navigation, can be described in terms of graph theory
(Otlet, 1975).
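The idea of falling back to weaker criteria and marking the resulting semantical gaps can be illustrated with a small sketch. The article does not say which weaker criteria the Russian systems use, so both fallback levels below are assumptions, as are the function name `navigate_with_gaps` and the six-node example graph.

```python
def navigate_with_gaps(graph, first, second):
    """Return (route, gaps); gaps lists route positions filled by a weaker rule."""
    route = [first, second]
    visited = {first, second}
    gaps = []
    while len(visited) < len(graph):
        current = route[-1]
        unvisited = sorted(node for node in graph if node not in visited)
        # Strong rule: linked to the current node and to at least two
        # nodes already on the route.
        strong = [
            node
            for node in unvisited
            if current in graph[node] and len(graph[node] & visited) >= 2
        ]
        # Weaker rule 1 (assumed): linked to the current node only.
        adjacent = [node for node in unvisited if current in graph[node]]
        # Weaker rule 2 (assumed): linked to any node already on the route.
        reachable = [node for node in unvisited if graph[node] & visited]
        for level, pool in enumerate((strong, adjacent, reachable)):
            if pool:
                chosen = pool[0]
                break
        else:
            break  # the remaining nodes have no link with the route
        if level > 0:
            gaps.append(len(route))  # position the chosen node will occupy
        route.append(chosen)
        visited.add(chosen)
    return route, gaps


# Hypothetical network: two triangles (1-2-3 and 4-5-6) joined by the link 3-4.
example = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5, 6},
    5: {4, 6},
    6: {4, 5},
}
route, gaps = navigate_with_gaps(example, 1, 2)
print(route, gaps)
```

In this example the route is 1, 2, 3, 4, 5, 6, and positions 3 and 4 (zero-based, i.e., nodes 4 and 5) are flagged: each is attached to the preceding text by a single link only, which is where an author might look for additional connecting statements.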
navigation indicate places in the built linear text where the next node was selected by weaker criteria and where, therefore, logical correctness and coherence are weakened (i.e., semantical gaps).

As a rule, a gap is overcome by inserting additional information, that is, by adding new cognitive elements able to establish a link between fragments of the linear text. Indication of the gaps stimulates a purposeful search for new information and ideas.

Construction of Hypertext and Algorithmic Navigation as an Instrument of Logical Arranging of the Accumulated Material

The example given shows that construction of the hypertext and algorithmic navigation through it can be considered an instrument for systematizing accumulated knowledge.

If, at input, we have a free, unorganized collection of statements and at output we want a well-organized sequence (from the point of view of logical and semantic criteria), we must convert the statements into nodes, set direct semantic links between them, and then carry out navigation under the structural criteria.

If the user had ordered and structured the accumulated statements according to a previously fixed structure of the given subject domain, he would actually have forced new information into a strict, a priori scheme. If the collection of statements is left unstructured, then the possibility of comprehending and acquiring the stored knowledge is lost. Most desirable would be a structuring (hierarchization, systematization) of the accumulated material that proceeds from the semantic and logical links contained in the collection of statements itself.

Of course, this kind of systematization of knowledge raises the problem of laboriousness. First, the hypertext is supposed to be built on the principle of setting links in all cases of close semantic contiguity between statements. Though not complicated intellectually, the procedure of establishing semantic contiguity (a direct link) turns out to be laborious when every newly inserted statement must be related to every statement already in the hypertext.

In practice, the presence or absence of semantic contiguity is ascertained not for every pair of statements, but only for those that, by some indications, most probably have this contiguity. The corresponding pairs of statements are selected on the basis of key words, which can be singled out automatically by morphological and syntactic analysis of the text. In many practical cases, a satisfactory final result can be obtained with fully automatic establishment of links between statements.

In our case, it should also be noted that it is not necessary to indicate the type of relation when establishing semantic contiguity (correspondingly, there is no need to classify these relations); the fact that some relation is present is itself important. The explanation is that the final result, a good linear text, presumes high coherence of every next statement with the previous fragments of the exposed material, independently of what concrete relations this statement has with the previous ones.

Some Other Directions of Algorithmic Structural Analysis of the Hypertext Network

Until now, we have spoken about structural analysis of the hypertext network in the sense of the linear ordering of its nodes. However, other types of analysis are also of interest, in particular, singling out by structural indications subgraphs that reflect the content of the given hypertext as a whole. As in the case of algorithmic navigation, we speak about structural indications that have a semantic interpretation because of the correspondence between the topological structure of the network and the system of semantic links of cognitive elements.

Thus, the more links the node has in the hypertext network, obviously the higher and more considerable the

role of the corresponding cognitive element in the total knowledge represented in the hypertext.

Such a graph characteristic as the node’s centrality (the sum of distances from other nodes) is also evidence of the importance of the corresponding cognitive element in the system of the presented knowledge.

Using such structural indications one can single out the most essential cognitive elements, those that best characterize the content of the hypertext. For example, the subgraph that includes only nodes with a number of links higher than a certain limit (and the links of these nodes with each other) is a representative part of the whole hypertext.

An Example: Investigation of Soviet Dissertations

The changes in the subjects of Soviet dissertations had to be analyzed to reveal the substantive trends in the development of information science in 1965-69, 1970-74, and 1975-79 (Gilyarevskii & Kaloshin, 1988). Dissertation subjects formulated on the basis of titles and abstracts (a total of 3,500 subjects in some 700 dissertations) served as hypertext elements. For each subject, its logical relations with other subjects in the file were established. Building such a logical-semantic information collection is equivalent to forming an undirected graph of conceptual relations for the subject field.

The graph was constructed and analyzed automatically by a set of algorithms and programs that works in five steps: (1) statements are ranked according to the number of links; (2) a minimal number of links is defined that makes a statement eligible for the aggregated scheme; (3) statements that pass the threshold of detailing are singled out; (4) for statements that pass the detailing threshold, their links with each other are specified; and (5) the data are used for automatic construction of a graph.

This graph-building scheme makes it possible to determine the key problems of Soviet information science and its groups of problems represented by connected subgraphs (Figs. 2-4). For estimating the significance of problems, we ranked subjects by the number of links with other problems [although other criteria are possible, such as ranking topics by the sum of distances, an algorithm developed and programmed by Chelnokov (1985)]. The substantive analysis of the resulting graphs indicated that, during the 15 years, Soviet information science contained at least three problem areas which were not strongly connected with other fields:

• means for automation of information retrieval;
• linguistic aspects of text processing; and
• information needs: analysis and services.

The first of these problem areas was undoubtedly the most important in the initial five-year period; in the last period, however, it was superseded by theoretical and methodological problems of information science. The second of the fields, which initially appeared isolated and underdeveloped, had a large number of nodes in the last period, with strong connections to the core of the subgraph. The third problem area remained, throughout the 15 years, one of the main though isolated regions, with a large number of links. Certain areas selected for dissertations were remarkably stable. Throughout the 15 years, improvements in scientific information work based on automation of information processes remained in a key position.

We can now compare these results with the analysis of 120 Soviet dissertations defended in 1989-1990 (this analysis was made by O. V. Stolyarova in her master’s thesis). The dissertations graph consists of two isolated subgraphs (see Fig. 5). They demonstrate that two problems, “Development of Library/Bibliographic/Reference and Information Services” and “Semantic Processing in Information Systems,” were most important in Soviet information science during the last years of the USSR’s existence.

Approach to the Knowledge as a Network of Cognitive Units

The method under consideration is theoretically based on ideas developed not only by the researchers of hypertext, but also by their predecessors. One of the basic principles of our method is based on the assumption that
FIG. 2. Graph reflecting dissertations prepared for 1965-1969. Autom., automatic (ed); Bibl., bibliography(ic); DB, data base(s); ES, expert system(s); Inf., information; IR, information retrieval; IS, information system(s); Lang., language; Lib., library(ies); Ref., reference; S&T, science and technology; SDI, selective dissemination of information; Transl., translation.
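The five-step graph-building scheme used in the dissertation study (rank by number of links, set a threshold, select the statements that pass it, keep the links among them, build the graph) can be sketched as below. The function names `aggregate` and `components` and the lettered example data are hypothetical; the sketch is not the original set of programs, and the threshold is simply passed in by the analyst.

```python
def aggregate(links, min_links):
    """Return (core, edges) for subjects with at least `min_links` links."""
    # Step 1: rank subjects by number of links (ties broken by name).
    ranked = sorted(links, key=lambda subject: (-len(links[subject]), subject))
    # Steps 2-3: apply the threshold chosen by the analyst.
    core = [subject for subject in ranked if len(links[subject]) >= min_links]
    core_set = set(core)
    # Step 4: keep only the links that join two selected subjects.
    edges = sorted(
        {
            tuple(sorted((subject, other)))
            for subject in core
            for other in links[subject]
            if other in core_set
        }
    )
    # Step 5: the pair (core, edges) is the aggregated graph.
    return core, edges


def components(nodes, edges):
    """Group nodes into connected subgraphs (the 'groups of problems')."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node] - seen:
                seen.add(neighbor)
                stack.append(neighbor)
        groups.append(sorted(group))
    return groups


# Invented example: subjects A-G with undirected links.
links = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"F", "G"},
    "F": {"E", "G"},
    "G": {"E", "F"},
}
core, edges = aggregate(links, min_links=2)
groups = components(core, edges)
print(core)
print(groups)
```

With a threshold of two links, subject D drops out and the remaining subjects fall into two isolated groups, A-B-C and E-F-G, which is the kind of structure the figures display as separate connected subgraphs.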

FIG. 3. Graph reflecting dissertations prepared for 1970-1974. See legend to Figure 2 for abbreviations.
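Ranking by the sum of distances, mentioned in the text as an alternative significance criterion, can be illustrated with a breadth-first search. This is a generic sketch and not the algorithm programmed by Chelnokov (1985); the function names and the five-node chain are assumptions made for the example, and the sums are only comparable when the graph is connected.

```python
from collections import deque


def distance_sum(graph, source):
    """Sum of shortest-path lengths from `source` to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return sum(distance.values())


def rank_by_centrality(graph):
    """Most central first: a smaller sum of distances means a more central node."""
    return sorted(graph, key=lambda node: (distance_sum(graph, node), node))


# Hypothetical chain of five statements: 1 - 2 - 3 - 4 - 5.
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(rank_by_centrality(chain))
```

For the chain, the middle node 3 has the smallest sum of distances (6), nodes 2 and 4 follow (7 each), and the end nodes come last (10 each).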
any subject area (ultimately human knowledge) is a single system of links among the cognitive elements forming it. It is this particular understanding that has been forming in information science for a long time. Pioneers of this science advanced similar ideas as far back as the pre-hypertext period. Now it is becoming clear that successes in the development of computer facilities and programming have made it possible to realize ideas which had been developed within library and information sciences long before. Bibliographers, for hundreds of years, appreciated the difficulty of organizing and presenting information. Possibilities and trends in the development of information technology in this field were guessed and correctly predicted by the pioneers of informatics.

P. Otlet is known to most specialists only in connection with the Universal Decimal Classification (UDC) he created in 1905. As far back as 1905 he realized the need to order the world system of scientific communication. In his report at the International Congress on Bibliography and Documentation (Brussels, 1908) he expressed an idea which contained the kernel of hypertext technology:

The medium of the organization of scientific work is the book, above all in its latest form, the periodical . . . . The only conception which corresponds to reality is to consider all books, all periodical articles, all the official reports as volumes, chapters, paragraphs in one great book, the Universal Book, a colossal encyclopedia framed from all that has been published . . . (Otlet, 1975a)

It should certainly be taken into account that this idea was expressed at the beginning of the 20th century and was oriented toward the technical possibilities of that time. Although these were very limited by present-day standards, Otlet foresaw modern achievements, even systems of remote access to data banks. In 1934, he wrote:

Any one from afar would be able to read the passage, expanded or limited to the desired subject, projected on his individual screen. Thus, in his armchair, any one would be able to contemplate the whole of creation or certain of its parts. (Otlet, 1975b)

From the time of the pioneers of hypertext (Bush, 1945; Engelbart & English, 1968; Nelson, 1966) to the mid-1980s the hypertext idea experienced an incubation period, when numerous projects that developed certain aspects of this idea were carried out in isolation within the framework of various scientific directions. As a rule, the designers, far from using the term “hypertext,” often did not realize their link with the works of Engelbart and Nelson.

Only now is it possible to identify the range of works which in fact developed certain essential elements of the hypertext concept. Although the task is far from simple, its solution would facilitate the use of the potential of former projects to ensure further development of hypertext and
FIG. 4. Graph reflecting dissertations prepared for 1975-1979. See legend to Figure 2 for abbreviations.

FIG. 5. Graph reflecting dissertations prepared. See legend to Figure 2 for abbreviations.
improvement of its new forms and tools. If the problem of hypertext is viewed from broad logical-linguistic positions, one can see that it arose from attempts to overcome the narrowness of the traditional tools of information retrieval: hierarchical classification schemes and the descriptor languages of coordinate indexing. Similar attempts in earlier years led to Luhn’s (1958) ideas of automatic abstracting and Garfield’s (1955) citation networks. The former line proved unpromising, whereas the latter was brilliantly developed in science citation indexes and in maps and atlases of scientific fields.

Soviet Investigations in the 1960s and 1970s

One of the important lines leading to modern linear text generation is based on the logical-linguistic research of the 1960s. In the 1960s the USSR conducted major logical-linguistic and logical-mathematical projects aimed at establishing formalized models of natural language with a view to automating information processes. Of the many works of this cycle, mention should be made of the reports by Tseitin (1961) and Ivanov (1961) at the Conference on Information Processing, Machine Translation and Automatic Reading of Text held in Moscow in 1961.

In the Tseitin report, the main task in constructing a model of text was to identify a host of grammatically correct phrases and to indicate which were equivalent to each other in meaning. The task was to develop an algorithm generating a multitude of such equivalent pairs. The Ivanov report ended with a prophetic idea to the effect that the deliberate operations over linguistic systems required for machine translation and information retrieval may be linked by feedback with the development of these systems themselves.

Of special interest are works dealing with the distributive analysis of text and developing the methods of American descriptive linguistics (Harris, 1951). In this connection we would like to mention the book by Andreev (1967) and a doctoral thesis by Shaikevich (1982). The dissertation

defines this analysis as a formal description of the structure
of collections of written texts subdivided into segments
(words and utterances) and forming the so-called semantic
graph. The description takes the form of a classification of
source linguistic elements, which in the course of analysis
may be linked into classes that are perceived as clusters
on the graph. We can say that in this period the tradition
of presenting texts as networks of semantic links appeared.

In the early 1970s, Ivanov and Subbotin (1978) began
to link knowledge units (concepts and assertions) instead
of text segments for describing any subject field. Hence,
“logical-semantic models” emerged. In these models, those
cognitive units are linked that are related in meaning. The
criterion for linking a pair of units is the possibility of
combining their concepts or assertions by expressions such
as “is,” “is the cause of,” “is the end of,” “therefore,”
and so on. Networks built on that principle served for
describing and investigating any subject field. They came
right up to hypertext. It is important that links were
fixed for all pairs of concepts or assertions that can
be combined by a copula or other correlate expressions.
This, of course, caused technological problems in large
networks in searching for candidates for linking. But it is the
principle of completeness of links that paved the way to the
investigation of topological characteristics of the hypertext
network.

The most interesting direction of these investigations
proved to be algorithmic navigation in the network. This
navigation is based on heuristic rules reflecting intuitive
ideas of good connectivity. The sequence of nodes built in
accordance with the rules of good connectivity was called
“nonformal reasoning” (Subbotin, 1986). The range of criteria
of good connectivity has expanded gradually.

At the end of the 1970s and the beginning of the 1980s,
large hypertext networks (up to 3500 nodes, as in the case
of the investigation of dissertations) consisting of statements on
different subject domains were built and analyzed experimentally
on large mainframes.

Navigation rules add to hypertext a shade of logicalization.
But this is not the logic of formal inference
used in AI systems. It is quite a different logic, which
Subbotin proposed to call connectivity logic. Its rules can be
advanced proceeding from intuitive ideas and from logical and
philosophical theories, and then tested on many hypertext
networks representing various subject fields.

Russian Hypertext Systems with
Algorithmic Navigation

Today, four hypertext systems with algorithmic navigation,
implemented on IBM PC XT/AT computers, have been developed:
HYPERLOG, BAHYS, HYPERNET, and SEMPRO.

The HYPERLOG system has been functioning since
1988 (Zefirova, 1990). It served initially for research in
algorithmic navigation criteria. For researchers a sheet was
built that contained the graph characteristics of each node
regarded as a candidate for the continuation of the navigation
path. The experienced user was enabled to influence output
text features by ranking connectivity criteria in some priority
order. The recent version of HYPERLOG is currently
used for interactive navigation, offering the user some local
filters to select nodes. HYPERLOG was developed in the
State Scientific Technical Center of Hypertext Information
Technologies (Moscow).

The most recent product of this center is BAHYS (Basic
Hypertext System). BAHYS can be considered an intelligent
text processor. It provides, besides various common
procedures of text processing, transformation of elementary
text units (“statements”) into hypertext nodes and also
supports arrangement of these statements into new coherent
texts. Thus, BAHYS supports the elaboration of text
documents on the semantic level (see Figs. 6 and 7).

BAHYS lacks a graphical browser or tools to visualize
hypertext. In this regard it is supplemented by the HYPERNET
system intended, above all, for visual representation of
hypertext nets with different degrees of detail. It also allows
one to effect transitions (navigation) between different
hypertexts (designers call this feature “hyperlink”).
Visualization of a hypertext net combined with various images
and pictographs is a good tool for intensifying the user’s
thinking in solving complex problems. HYPERNET is
also a product of the State Scientific Technical Center of
Hypertext Information Technologies (Moscow).

FIG. 6. Semantical area of a node in BAHYS (adjacent nodes represented by first lines).

FIG. 7. Chain of statements as a result.

SEMPRO (Semantic Processor) is a hypertext system
of a similar type designed and disseminated by the
Soviet-Finnish-Bulgarian joint venture NOVINTEKH. It
takes into account peculiarities of users with a humanistic
thinking process and is specially adapted to the main
stages of the authorship process: accumulation of information
(source texts), creation of a variety of notes and fragments for
the future text, and compilation of variants of the exposition.

In all the described systems with algorithmic navigation,
the same rules of compiling coherent texts
from small fragments are implemented. All of them inform the
user of the semantic gaps that cannot be suppressed in the
constructed texts. Each of these systems supports the creation and
updating of hyperbases using keywords for searching
“candidates for linking” [this method is similar to that used
by designers of the Arizona Analyst Information System;
see Carmel, McHenry, & Cohen (1989)].

References

Andreev, N. D. (1967). Statistical-combinatoric methods in theoretical
and applied linguistics. Leningrad (in Russian).
Bush, V. (1945). As we may think. Atlantic Monthly, 176, 101-108.
Carmel, E., McHenry, W. K., & Cohen, J. (1989). Building large,
dynamic hypertext: How do we link intelligently? Journal of
Management Information Systems, 6, 33-50.
Chelnokov, V. M. (1985). Making the concept of integrity operational
in the representation of knowledge. In System research:
Methodological problems. A yearbook (pp. 103-112). Moscow (in
Russian).
Conklin, J. (1987). Hypertext: An introduction and survey. Computer,
17-24.
Engelbart, D., & English, W. (1968). A research center for augmenting
human intellect. AFIPS Conference Proceedings, 33, 1.
Garfield, E. (1955). Citation indexes for science. Science, 122,
108-111.
Gilyarevskii, R. S., & Kaloshin, V. V. (1988). Development trends
of informatics (based on Soviet dissertations from 1965 to 1980).
Automatic Documentation & Mathematical Linguistics, 22, 56-68.
Harris, Z. (1951). Methods in structural linguistics. Chicago.
Ivanov, V. V. (1961). On constructing of information language for
texts on descriptive linguistics. Conference on Information Processing,
Machine Translation and Automatic Reading of Text, Moscow,
p. 15 (in Russian).
Ivanov, V. G., & Subbotin, M. M. (1978). Analysis and updating of
comprehensive solutions with the use of computers on the basis of
the method of logical-semantic modeling. Moscow (in Russian).
Luhn, H. P. (1958). The automatic creation of literature abstracts
(autoabstracts). IBM Journal of Research and Development, 2,
159-165.
Lunin, L. N., & Rada, R. (Eds.) (1989). Perspectives on hypertext.
Articles about hypertext. Journal of the American Society for
Information Science, 40, 158-220.
Nelson, T. (1966, May). The information systems in the future.
Information retrieval: A critical view. In G. Schecter (Ed.), Third
Annual Colloquium on Information Retrieval, Philadelphia.
Otlet, P. (1975a). La documentation en matiere administrative. In
Actes de la Conference internationale de bibliographie et de
documentation (pp. 147-154). Bruxelles (W. Boyd Rayward, trans.)
The Universe of Information (p. 16). Moscow, FID 520 (Original
work published in 1908).
Otlet, P. (1975b). Traite de documentation. Bruxelles (W. Boyd
Rayward, trans.) The Universe of Information (p. 354). Moscow,
FID 520 (Original work published in 1934).
Shaikevich, A. Y. (1982). Distributive-statistical analysis of texts.
Doctoral thesis, Moscow (in Russian).
Subbotin, M. M. (1986). Computer applications and the construction
of chains of reasoning. Automatic Documentation & Mathematical
Linguistics, 20, 1-10.
Tseitin, G. S. (1961). On constructing of mathematical models of
language. Conference on Information Processing, Machine Translation
and Automatic Reading of Text, Moscow, p. 11 (in Russian).
Zefirova, V. L. (1990). The research hypertext system HYPERLOG.
NOVINTEKH: International Computer Journal, 1, 29-30.


186 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE-May 1993

TABLE 2. Direct semantic links in the collection of statements.
