QUALITATIVE MODELS,
QUANTITATIVE TOOLS AND NETWORK ANALYSIS
J. P. COURTIAL
Ecole des Mines de Paris, Centre de Sociologie, 62 bd. Saint-Michel, 75006 Paris (France)
One model for knowledge development is the network interaction model. Insofar as
socio-technical networks may have some structural properties, does knowledge development
reflect this? The hypothesis that it does may enable us to make some forecasts of science
development from a description of the state of a field. One condition necessary for testing
this hypothesis is that of adopting a model for these networks. Co-word analysis is such a
tool. It gives us key-word networks derived from scientific and technical texts. The author
checks for network properties in the area of knowledge development through a case study
of Polymer Science and Technology from 1973 to 1978.
Introduction
[...]elling that could be produced by both co-citation and co-word mapping techniques.
With respect to specifically scientific networks, some authors have proposed
properties like self organization 8 or "boot-strap/band wagon" effects. 9 The latter
describe scientific development in terms of the transfer of "packages" associating
theoretical assessment and technological tools in order to allow for new "do-able"
experiments in a variety of fields in a kind of snow-ball process.
In this paper we will report on a case study designed to test for the existence
of possible dynamic structural properties of key-word networks.
Thus if a given research field could be described as a network of related research
areas or topics (each topic being related to one or more others and being more or
less developed inside the field), change or development inside the field could be
described as interaction with other research (or applied research) fields as follows:
(a) introduction, for instance, of a theoretical model or concept that may be
central (although not developed) by linking together many topics, and therefore,
able to develop inside the field,
(b) development, as a kind of reverse move, of a central and developed theoretical
model (or element of know-how) inside the field, now available for other research
fields, thus progressively becoming a "specialty" no longer central within the field
[although perhaps central with respect to other fields, as with (a)].
[Schema: topics within the research field; a new model enters as input into the research
field; a central and developed model becomes available for other research or applied
research fields.]
How can we find the kinds of network properties indicated above in the rela-
tionships between key-words?
Given a set of documents indexed by key-words, co-word analysis consists of calculating
a proximity coefficient "e" between key-words. The value of this coefficient is 1 when
the two words always occur together and 0 if they never co-occur (or co-occur below a
given co-occurrence threshold). If I is the frequency of the first key-word, J the
frequency of the second one, and K their co-occurrence frequency, e = (K/I) x (K/J).
It is thus possible to calculate a general network of associated words.
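As an illustration, the coefficient e = (K/I) x (K/J) can be computed as in the following minimal sketch. The function name `equivalence_coefficients`, the `min_cooccurrence` parameter and the sample key-words are assumptions introduced here for illustration; they are not taken from the original study.

```python
from collections import Counter
from itertools import combinations


def equivalence_coefficients(documents, min_cooccurrence=1):
    """Return {(word_a, word_b): e} with e = (K/I) * (K/J).

    documents: iterable of key-word lists, one list per indexed document.
    Pairs co-occurring fewer than min_cooccurrence times are dropped
    (the threshold mentioned in the text; its value here is hypothetical).
    """
    freq = Counter()   # I and J: number of documents containing each key-word
    cooc = Counter()   # K: number of documents containing both key-words
    for keywords in documents:
        unique = sorted(set(keywords))
        freq.update(unique)
        cooc.update(combinations(unique, 2))
    return {
        pair: (k / freq[pair[0]]) * (k / freq[pair[1]])
        for pair, k in cooc.items()
        if k >= min_cooccurrence
    }


# Hypothetical toy corpus of four indexed documents.
docs = [
    ["polymer", "crystallization", "spherulite"],
    ["polymer", "crystallization"],
    ["polymer", "polystyrene"],
    ["crystallization", "spherulite"],
]
links = equivalence_coefficients(docs)
```

In this toy corpus "crystallization" occurs three times, "spherulite" twice, and they co-occur twice, so their coefficient is (2/3) x (2/2) = 2/3.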
Networks are analysed through co-word tools in terms of clusters of about 10 key-
words of decreasing intensity of links, with each cluster having a centrality index
corresponding to its location (central/peripheral) inside the network, and a density
index corresponding to the local development of links within the cluster. 10 Centrality
corresponds to the weight of the external links of the cluster, whereas density corresponds
to the internal links of the cluster.
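The two indices can be sketched as follows. The text does not give exact formulas, so this sketch assumes that centrality is the sum of the weights of links leaving the cluster and that density is the mean weight of links inside it; both choices, like the function name `centrality_and_density`, are assumptions made for illustration.

```python
def centrality_and_density(clusters, links):
    """Compute one (centrality, density) pair per cluster.

    clusters: {cluster_name: set of key-words}
    links:    {(word_a, word_b): link weight}
    Assumed definitions (not confirmed by the source):
      centrality = sum of weights of links with exactly one end in the cluster
      density    = mean weight of links with both ends in the cluster
    """
    scores = {}
    for name, words in clusters.items():
        internal, external = [], []
        for (a, b), weight in links.items():
            ends_inside = (a in words) + (b in words)
            if ends_inside == 2:
                internal.append(weight)
            elif ends_inside == 1:
                external.append(weight)
        density = sum(internal) / len(internal) if internal else 0.0
        scores[name] = {"centrality": sum(external), "density": density}
    return scores


# Hypothetical two-cluster network.
clusters = {"A": {"a1", "a2"}, "B": {"b1", "b2"}}
links = {("a1", "a2"): 0.8, ("b1", "b2"): 0.2, ("a1", "b1"): 0.5}
scores = centrality_and_density(clusters, links)
```

Here both clusters share the single external link (centrality 0.5), while cluster A is denser (0.8) than cluster B (0.2).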
Among clusters, we distinguish between independent and dependent clusters. Dependent
clusters are clusters whose mutual strong links (above the value of the internal links
of the less-linked cluster) contribute to the meaning or complete comprehension of
each one. Independent clusters are clusters whose 10 key-words entirely describe
their content (although they could of course be linked, through purely associational
links, to other clusters). Dependent clusters, in that their content is spread over more
than one cluster, are good indicators of structural change within the leading topics.
We thus decided to focus on the study of dependent clusters at the crossroads between
several clusters.
What we call a "strategic diagram" 11 displays clusters on a two-dimensional map
according to their centrality (x-axis) and to their density (y-axis). Instead of using
exact values, we used rank values in order to study the pattern of association between
clusters.
The strategic diagram, divided into four quadrants, indicates:
1. clusters that are tightly connected both internally and externally and that could
thus be regarded as the central topics of the document set analysed (quadrant one,
upper right);
2. clusters that are central to the network but weakly developed internally, which
refer to topics important in other networks (that is to say, in another set or in the
same set at a different period of time) and that may correspond to newly appearing
research themes (quadrant two, lower right);
3. clusters that are strongly developed internally but rather peripheral to the net-
work, which generally correspond to sub-sets of the current set (quadrant three, upper
left); and
4. clusters that are both weakly developed internally and peripheral to the network
which often indicate topics at the boundary of the current set (quadrant four, lower
left).
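The four-quadrant reading can be sketched in code. The text says rank values are used but does not say where the quadrant boundaries fall, so this sketch assumes a split at the middle rank on each axis; that split and the names `rank_values` and `strategic_quadrants` are assumptions.

```python
def rank_values(scores):
    """Map each cluster to its rank (1 = lowest score); ties broken by name."""
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def strategic_quadrants(centrality, density):
    """Assign each cluster to quadrant 1-4 of the strategic diagram.

    Assumption: a cluster counts as "central" (or "dense") when its rank
    lies above the middle rank on that axis.
    """
    c_rank = rank_values(centrality)
    d_rank = rank_values(density)
    midpoint = (len(centrality) + 1) / 2
    quadrants = {}
    for name in centrality:
        central = c_rank[name] > midpoint
        dense = d_rank[name] > midpoint
        if central and dense:
            quadrants[name] = 1   # upper right: central and developed
        elif central:
            quadrants[name] = 2   # lower right: central, weakly developed
        elif dense:
            quadrants[name] = 3   # upper left: developed but peripheral
        else:
            quadrants[name] = 4   # lower left: weak and peripheral
    return quadrants


# Hypothetical scores for four clusters.
centrality = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.2}
density = {"A": 0.7, "B": 0.1, "C": 0.6, "D": 0.2}
quadrants = strategic_quadrants(centrality, density)
```

With these hypothetical scores, each of the four clusters lands in a different quadrant.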
It can be noted that these locations fit in more or less with the areas mentioned on
the schema in the introduction. We have already carried out studies of the strategic
use of this diagram for science policy: the first quadrant indicates what is central
inside a field whereas other quadrants depict areas that are instrumental with regard
to the topics of the first quadrant. 12
However, given that the clusters in quadrant 2 are central to the network but internally
underdeveloped, it might be of particular interest to study them diachronically
rather than synchronically. One might then be able to show that a certain number of
them eventually move into quadrant 1, thus becoming a part of "mainstream" research
interests in the network. At the same time, however, other clusters in quadrant
2 might initially have been located in quadrant 1 but, having undergone internal
disintegration, now attract declining interest.
In order to test these hypotheses, three measurable consequences for key-word
networks will be studied:
1. Using the centrality and density rank values, we will compare cluster positions
on the strategic diagram at different points in time. Do changes in location correspond
to specific patterns expressing the network's structural properties or not?
2. Given the centrality and density values of each cluster in different networks
corresponding to two different time periods, do some low density but high centrality
clusters increase in density over time? If so, this would indicate network reorganization
around an emerging topic.
3. Given a list of central but low-density clusters, not connected to one another,
do changes over time tend to introduce relations among them? This question suggests
that topics located in the same quadrant of the strategic diagram could be cognitively
linked to one another despite the fact that, at any given moment in time, these links
might not yet be identified. This rather strong hypothesis could lead to qualitative
predictions.
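The first two checks amount to comparing two snapshots of the same clusters. A minimal sketch, assuming each period is summarised as a mapping from cluster name to quadrant or to density value (the function names and the sample values are hypothetical):

```python
def quadrant_moves(before, after):
    """Return {cluster: (old_quadrant, new_quadrant)} for clusters that moved."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


def density_increases(density_before, density_after, watched):
    """Return the watched clusters whose density value rose between periods."""
    return {
        name: density_after[name] - density_before[name]
        for name in watched
        if density_after[name] > density_before[name]
    }


# Hypothetical snapshots for three clusters over two periods.
quadrant_p1 = {"A": 2, "B": 2, "C": 1}
quadrant_p2 = {"A": 1, "B": 2, "C": 1}
density_p1 = {"A": 0.10, "B": 0.12, "C": 0.40}
density_p2 = {"A": 0.30, "B": 0.11, "C": 0.41}

moves = quadrant_moves(quadrant_p1, quadrant_p2)
low_density_central = [n for n, q in quadrant_p1.items() if q == 2]
rising = density_increases(density_p1, density_p2, low_density_central)
```

In this invented example only cluster A moves (from quadrant 2 to quadrant 1), and it is also the only quadrant-2 cluster whose density rises.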
If these effects really do exist, we would have obtained both a kind of qualitative
and quantitative model for the description of science in the making. This model would
consist of a list of sub-networks composed of topics interacting through a dynamic
process. Relations between elements in the model could better be described through
fuzzy logic than through mechanical models. The model would contain some quanti-
tative and qualitative prediction properties.
In order to check these hypotheses, we examined some results of a study on Polymers
sponsored by the National Science Foundation, 1~ covering the period 1965 to 1985.
We studied two files of articles corresponding to two different periods of time. The
articles were taken from the three "Journal of Polymer Science" editions (Chemistry,
Physics, Journal of Applied Polymer Science) and from Die Makromolekulare Chemie.
These four journals are located in the same place on journal co-citation maps in the
Polymer Field, calculated by Zeldenrust, 13 and they constitute what could be called
"Academic Polymer Science". The first file is made up of 3447 articles from 1973 to
1975, the second is made up of 3106 articles from 1976 to 1978.
Results
Table 1
Changes in location of clusters on the strategic diagram
from 1973-75 to 1976-78
[Diagram layout not preserved; cluster labels shown: "caractérisation polystyrène",
"cristallisation sphérulite", "détermination structure cristalline", "propriété mécanique".]
and occupy the same place on the Strategic Diagram. This seems to be a good test
for the validity of the strategic diagram given the general variability of scientific texts.
"Plastique renforcé" (reinforced polymer) seems to be, insofar as it is an old research
area, a model whose study appears to be more or less complete within the framework
of our file, and which is now available to other theoreticians and practitioners.
The two moves we can see are from the lower right to the upper right quadrant
for "caractérisation polystyrène" and "cristallisation sphérulite". This would seem to
suggest that changes in the position of clusters on the strategic diagram follow a
specific pattern.
What about the other two clusters in quadrant 2, "détermination structure cristalline"
and "propriété mécanique"? Their locations on the strategic diagram remain the same
over time. We would suggest, however, that these themes will develop later or, perhaps,
that they mobilized a great deal of interest in an earlier period of polymer science
research and that their weak internal structure is an indication of decreasing interest.
However, this is a hypothesis which cannot be checked with our data. A longer time
frame than that used in this case study would be required to test it.
2. Table 2 indicates the variation in density from the first period of time to the
second one for the four clusters located in the lower right quadrant of the Strategic
Diagram.
As we can see, the two clusters moving from quadrant 2 to quadrant 1 increase
in density.
Table 2
Increase in density for low-density and high-centrality clusters
from 1973-75 to 1976-78

Node        Density 73-75        Density 76-78        Increase
[table rows not preserved]
Table 3
Changes in links between clusters in quadrant 2
from 1973-1975 to 1976-1978

                                   1973-1975         1976-1978
Cluster                      No.   1    2    3       1    2    3
Dét. struct. cristalline      1
Cristal. sphérulite           2    90                80
Pr. mécanique                 3
Caract. polystyrène           4    .    .    .       .    10
These two clusters not only increase in density relative to the others (which could
simply reflect a decrease in the overall density level of the others) but also increase
in their numerical density values. This constitutes the main change in the academic
polymer science file.
3. We calculated the links between the four clusters located in the lower right
quadrant for both periods. Articles are automatically assigned to a cluster if they
are indexed by at least one word of the cluster and another word associated with
it. In this way, articles are re-indexed by the name of the cluster to which they
belong. 15 It is then possible to calculate associations among clusters in the same way
as associations are calculated among key-words in the initial stage of co-word
analysis.
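The re-indexing step can be sketched as follows. The assignment rule is simplified here: an article is assigned to a cluster when it carries at least two of the cluster's key-words, which is only one possible reading of "one word of the cluster and another word associated with it". That simplification, the function names and the sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def assign_clusters(article_keywords, clusters, min_words=2):
    """Return the names of the clusters an article is assigned to.

    Simplifying assumption: at least min_words key-words of the cluster
    must index the article.
    """
    words = set(article_keywords)
    return {
        name for name, members in clusters.items()
        if len(words & members) >= min_words
    }


def cluster_links(articles, clusters):
    """Re-index articles by cluster name, then apply e = (K/I) * (K/J)."""
    freq, cooc = Counter(), Counter()
    for keywords in articles:
        labels = sorted(assign_clusters(keywords, clusters))
        freq.update(labels)
        cooc.update(combinations(labels, 2))
    return {
        pair: (k / freq[pair[0]]) * (k / freq[pair[1]])
        for pair, k in cooc.items()
    }


# Hypothetical clusters and articles.
clusters = {
    "crystallization": {"crystallization", "spherulite", "melting"},
    "characterization": {"polystyrene", "characterization", "atactic"},
}
articles = [
    ["crystallization", "spherulite", "polystyrene", "atactic"],  # both clusters
    ["crystallization", "melting"],                               # first only
    ["polystyrene", "characterization"],                          # second only
    ["spherulite"],                                               # neither
]
inter_cluster = cluster_links(articles, clusters)
```

Each cluster labels two articles and they share one, so the link between them is (1/2) x (1/2) = 0.25.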
Table 3 indicates changes in cluster links. It can be seen that a new link appears
between the two developing clusters "cristallisation sphérulite" and "caractérisation
polystyrène".
We also observe the decrease of the link between the second of these developing
clusters ("cristallisation sphérulite") and the first of the two previously developed
clusters ("détermination structure cristalline").
The existence of a link between the two developing clusters of quadrant 2
suggests that they are cognitively linked to one another despite the fact that in the
first period this link was not detected. When we went back to our data to check
word association patterns in the first period, we found that a pathway between
"cristallisation sphérulite" and "caractérisation polystyrène" already existed. This
pathway started with "styrène polymère" and ended with "cristallisation". It included
"polymère atactique" and "polymère isotactique". The existence of such a pathway
indicates that a qualitative analysis of the data could have suggested a possible
statistical link at the next period of time.
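A search for such a pathway can be sketched as a breadth-first search over the key-word association network. The text does not say how the pathway was found, so the search method is an assumption, as are the link weights and the order of the two intermediate key-words in the example.

```python
from collections import deque


def association_pathway(links, start, goal, threshold=0.0):
    """Return the shortest chain of associated key-words, or None.

    links: {(word_a, word_b): weight}; only links with weight above
    threshold are followed. Breadth-first search is an assumed method.
    """
    graph = {}
    for (a, b), weight in links.items():
        if weight > threshold:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Key-words named in the text; weights and ordering are invented.
links = {
    ("styrène polymère", "polymère atactique"): 0.3,
    ("polymère atactique", "polymère isotactique"): 0.4,
    ("polymère isotactique", "cristallisation"): 0.2,
}
path = association_pathway(links, "styrène polymère", "cristallisation")
```

Raising `threshold` above the weakest link on the chain makes the function return `None`, which mirrors the case where no pathway of sufficiently strong associations exists.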
This qualitative analysis also showed that only a part of cluster "cristallisation
sphérulite" joins the first quadrant. The remaining parts split off to form a new
cluster which is still present in quadrant 2. Thus, changes in networks can imply
the splitting of a cluster into one central and several "contextual" ones. In the
present case, "sphérulite" belongs to the central one, "fusion" to a contextual one.
The case study has shown some network properties that can be observed in scien-
tific development processes. Our goal was to explore hypotheses that might be useful
in improving our knowledge of the "life cycle" of themes corresponding to clusters.
We feel that this knowledge will allow us to study networks diachronically in order
to make predictions with respect to potential directions of scientific and technological
change.
Thus, it is possible to distinguish within a list of themes of a research area between
(a) themes which are internally developing within the research area (quadrant 1), (b)
themes which express input from other research areas - in this case the crystal model
applied to polymers - that may develop in the near future (quadrant 2), (c) themes
which express lines of research coming to an end (quadrant 3).
These results have consequences for funding policy. For instance, they can help to
determine the best period of time for funding, the kind of scientific partners at work
in the process of knowledge development, or the actors concerned by the implementation
of research output. Herein lies the goal of our study on Polymers funded by
the NSF.
However, we tested these properties on only two time periods and within a single
field. More time periods are no doubt required in order to explore the idea that
We should like to thank G. Bowker, Loet Leydesdorff, H. Rothman and B. Turner for their
comments on a former version of this text. All the programming was carried out on a NAS/9180,
IBM 3090/200 computer at the Centre Inter-Régional de Calcul Electronique, Bâtiment 506,
91405 Campus d'Orsay.