Productivity indicators
Rémi Barré
Science and Public Policy, August 2001 (ISSN 0302-3427), © Beech Tree Publishing 2001, page 259
Sense and nonsense of S&T productivity indicators
are useful, but only as entry points into the discussion, considering that their raison d'être is to be criticised in terms of their (limited) relevance and (limited) comparability. I argue that such informed criticism provides an excellent way to deepen the understanding of the mechanisms at play in a comparative way.

The conclusion is what I consider to be a practical and methodologically sound position regarding benchmarking the productivity of research systems, and a proposal for an exercise of benchmarking S&T productivity which would make sense.

Nonsensical measurement of productivity

In an article published by Science, Sir Robert May (1998), at the time Chief Scientist of the British Government, proposed an "indicator of efficiency of public research spending", constructed in the following way:

    number of scientific publications¹
    ──────────────────────────────────
      spending on basic research

In 1996, the value of the indicator, as computed by Sir Robert May, was 2.13 times higher for the United Kingdom than for France.² His conclusion was that public research spending was 2.13 times more efficient in the UK than in France. The case included all the ingredients of an apparently perfect case of benchmarking scientific productivity: an explicitly defined indicator of productivity; an international comparison; and a policy context at the highest level. This section aims to illustrate that the conclusion drawn is in essence illegitimate: the ratio of 'number of publications' to 'spending' cannot be interpreted as representing the efficiency of public research spending.

What is 'spending on basic research'?

Sir Robert defines as 'basic research spending' the total spending of university research, government-funded laboratories and non-profit research activities. It means that he measures in reality the spending of all publicly funded research, which includes academic, but also military and applied research. Clearly, these latter kinds of research hardly produce articles to be found in the Science Citation Index (SCI), if at all.

We argue that the least that can be demanded for a productivity indicator is that the input (resources used), as denominator of the ratio, is effectively used to produce the output placed as numerator! It is not the case here, since part of the resources considered as input (military and applied research) have nothing to do with what is considered to be their output (SCI publications). We must therefore restrict the measurement of resources spent (the denominator) to the spending for research which effectively produces SCI-type publications.

Correction for factor of bias 1: Compute basic research spending as academic³ research spending, plus 10% of all other public research spending, considering that only a fraction of the applied and military public research can generate SCI-type publications (see Table 1).

It appears that spending on university research is similar in France and the UK (about 4.5 bn Euro) and that funding of other public research institutions is quite different between the two countries, France spending 50% more than the UK (6.25 against 4.19 bn Euro). Basic research spending, as computed in the Science article, was 28% higher in France than in the UK. Following the correction, a more realistic 12% difference emerges.

Measuring scientific publications

Sir Robert considers the number of publications appearing in the SCI for each country as a comparable measurement of their scientific production, but this would hold true only if the world's average researcher in each discipline were to publish the same number of articles, that is, if all disciplines had the same tendency to publish in SCI journals. This is not the case. Clinical medicine, for example, represents 30% of the number of articles in the SCI database, for an estimated 10% of the spending in basic research. This means that the tendency to publish in that discipline is three times the average of the others.

This would not result in a bias in the comparison between France and the UK if the two countries had the same level of activity in clinical medicine within their disciplinary structure. This is not the case.
Country   Academic research   Other public research        Basic research spending as     Corrected basic research
          spending (a)        institutions spending (b)    computed by May (c = a + b)    spending (d = a + 0.1b)
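The bias-1 correction of Table 1 can be sketched numerically. The figures below are the approximate ones quoted in the text (academic spending taken as 4.5 bn Euro for both countries), so the gaps printed here (roughly 24% before and 4% after the correction) only approximate the 28% and 12% that the article computes from exact OST data; the helper name is mine.

```python
# Factor-of-bias-1 correction: only ~10% of non-academic public research
# spending is assumed to feed SCI-type publications.
def basic_research_spending(academic, other_public, corrected=True):
    """Basic research spending (bn Euro).

    corrected=False gives May's definition, c = a + b;
    corrected=True gives the corrected one, d = a + 0.1b.
    """
    return academic + (0.1 * other_public if corrected else other_public)

# Approximate 1996 figures quoted in the text (bn Euro).
fr = {"academic": 4.5, "other": 6.25}
uk = {"academic": 4.5, "other": 4.19}

c_fr = basic_research_spending(fr["academic"], fr["other"], corrected=False)
c_uk = basic_research_spending(uk["academic"], uk["other"], corrected=False)
d_fr = basic_research_spending(fr["academic"], fr["other"])
d_uk = basic_research_spending(uk["academic"], uk["other"])

print(f"May's denominator:     FR {c_fr:.2f} vs UK {c_uk:.2f} bn Euro")
print(f"Corrected denominator: FR {d_fr:.2f} vs UK {d_uk:.2f} bn Euro")
```

With these rounded inputs the French excess over the UK shrinks from about 24% to about 4%, illustrating the direction, though not the exact size, of the published correction.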
Table 2. World share and absolute number of articles for clinical medicine, other disciplines and all disciplines, for France and the UK (1996)

Table 3. Number of articles for clinical medicine, other disciplines and all disciplines, for France and the UK with correcting factor for clinical medicine (1996)
The UK is strongly specialised in clinical medicine, with a share of 12.0% of world articles, which is 41% above its average (all disciplines). On the other hand, the world share of articles in clinical medicine for France is 5% below its average. The high tendency of publication in clinical medicine and the high specialisation of the UK in that discipline result in a structural effect which constitutes a bias in the comparison of the scientific outputs between France and the UK.

How can this bias be controlled for? I separate clinical medicine and the 'other disciplines' which, together, make for 'all disciplines'. This provides three categories of discipline. Let us first consider the world share of articles for both countries in each of these categories (Table 2, columns 1 to 3). Then, knowing that clinical medicine makes for 30% of the SCI database, I compute the absolute number of articles from each country for each category, applying a coefficient of 30% to the category clinical medicine, 70% to other disciplines and 100% to the category all disciplines (Table 2, columns 4 to 6). By construction, the numbers for 'all disciplines' are the same in columns 3 and 6.

At this stage, I apply a coefficient of one-third to the absolute number of clinical medicine articles, to account for the fact that the tendency of publication in this discipline is three times higher than in the others (Table 3, column 1). The numbers for 'other disciplines' are kept (column 2), which gives a new total for each country, with a world total of 80. Finally, I adjust the total of each country to get the figures for a total of 100, which gives their world share, after correction for the structural effect of clinical medicine (column 4).

Correction for factor of bias 2: To eliminate the bias introduced in the measurement of scientific output by the high tendency of publication of clinical medicine and of the high specialisation of the UK, I introduce a weighting factor of one-third for the articles of that discipline (Table 4).

Table 4. Volume of scientific publications — initial and corrected values (1996)

Country   SCI publications as % of   SCI publications as % of world total
          world total (e)            after correction for clinical medicine (f)
France    5.1                        5.2
UK        8.5                        7.6

Sources: OST S&T Indicators Report 1998; from ISI–SCI data

The scientific output of the UK was 67% higher than that of France with the initial numbers; it is 46% higher after introduction of the correction needed to take into account the clinical medicine bias.

First round correction of productivity indicator

In Table 5, the initial and corrected values for the productivity indicator are computed. The initial value computed with OST (Observatoire des Sciences et Techniques) data fits very well with the one used by Sir Robert, since the indices are almost equal (213 and 214). Using now the corrected values for the denominator and the numerator, following the first round corrections, we get a corrected value for the productivity indicator: 164.
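The reweighting procedure for factor of bias 2 can be reproduced from the shares quoted in the text: clinical medicine makes up 30% of the SCI, its articles are weighted by one-third, and country totals are rescaled from the new world total of 80 back to 100. The function name is mine; the arithmetic recovers the corrected values of Table 4.

```python
# Factor-of-bias-2 correction: clinical medicine is 30% of the SCI and its
# tendency to publish is three times the average, so its articles are
# weighted by one-third; country totals are then rescaled from the new
# world total of 80 back to 100.
def corrected_world_share(all_share, clinical_share):
    """World share (%) of SCI articles after down-weighting clinical medicine.

    all_share      -- country's share of world articles, all disciplines
    clinical_share -- country's share of world clinical-medicine articles
    """
    clinical_articles = 0.30 * clinical_share         # per 100 world articles
    other_articles = all_share - clinical_articles    # the 70% block
    weighted_total = clinical_articles / 3 + other_articles
    world_total = 0.30 / 3 * 100 + 70                 # = 80
    return weighted_total / world_total * 100

# Shares quoted in the text: UK clinical-medicine share 12.0%; France's is
# 5% below its all-disciplines average of 5.1%.
print(round(corrected_world_share(8.5, 12.0), 1))        # 7.6 (UK)
print(round(corrected_world_share(5.1, 0.95 * 5.1), 1))  # 5.2 (France)

# The output gap narrows accordingly:
print(round(100 * (8.5 / 5.1 - 1)))  # 67: UK output 67% higher initially
print(round(100 * (7.6 / 5.2 - 1)))  # 46: 46% higher after the correction
```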
Country   Indicator directly derived from      Indicator computed by OSTᵇ        Indicator computed with corrected values
          those published by Mayᵃ (g = e/c)    keeping the definitions of May    eliminating factors of bias 1 & 2 (i = f/d)
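The construction of the Table 5 ratios, g = e/c and i = f/d, can be sketched with the rounded figures quoted so far. Because the article used exact OST data, these inputs yield indices of about 206 (initial) and 152 (corrected) rather than the published 214 and 164; the point is the construction of the ratios, not the precise values.

```python
# Productivity indices g = e/c (initial) and i = f/d (first-round corrected),
# expressed with France = 100. Inputs are the rounded figures quoted in the
# text, not the exact OST data behind Table 5.
e = {"FR": 5.1, "UK": 8.5}                   # SCI share of world total (%)
f = {"FR": 5.2, "UK": 7.6}                   # after clinical-medicine correction
c = {"FR": 4.5 + 6.25, "UK": 4.5 + 4.19}     # May's denominator (bn Euro)
d = {"FR": 4.5 + 0.625, "UK": 4.5 + 0.419}   # corrected denominator

initial = 100 * (e["UK"] / c["UK"]) / (e["FR"] / c["FR"])
corrected = 100 * (f["UK"] / d["UK"]) / (f["FR"] / d["FR"])
print(round(initial), round(corrected))  # 206 152 with these rounded inputs
```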
In Sir Robert's terms, this corrected value of 164 would mean that UK public activities in science and technology would be 64% more productive than their French counterparts. Is it certain that this residual 64% represents a difference in productivity?

Second and third round of corrections

In the first round of corrections I took into account two factors of bias. There are, however, a number of other factors that introduce a difference between France and the UK in the numerical value of the productivity indicator without having anything to do with the productivity of research at all. I now consider another four such factors (numbers 3 to 6) in a second round of corrections, to eliminate the biases they introduce. I do this through making explicit simplifying assumptions, modifying directly the ratio, without detailed presentation of the new values for the numerator and/or the denominator (Table 6).

Factor of bias 3: unit cost for salary and social benefits⁴: The majority of researchers in UK universities work under short-term contracts ('post-docs') with rather limited social benefits. The situation is quite different in France, where almost all academic researchers are civil servants, with significant social benefits.

Factor of bias 4: weight of social sciences and humanities: The SCI does not include the social sciences and humanities. Therefore, we must discount from the spending (the denominator) the resources devoted to this field. They represent a share of the spending on basic research in France which is probably higher than in the UK.

Factor of bias 5: linguistic and cultural: The hypothesis here is that, all things being equal, it is easier for an English-language speaker to publish in an English-language journal, which almost all SCI-covered journals are. Of course, this is not so in the most internationalised disciplines, but some effects can be reasonably expected on the whole, which introduces a bias.

Factor of bias 6: publications resulting from non-public spending: So far, it has been assumed that all scientific publications of a country are attributable to public spending. This is not so, and we should discount the publications linked to research financed by non-public resources. It appears that more than 10% of UK publications are produced by firms. Furthermore, it is well known that non-profit organisations play a major role in financing basic research in the UK, and thus in producing publications.⁵ The situation is quite different in France, where non-public-related publications are significantly fewer. When subtracting such publications from the national totals, the UK would lose more than France.

Assigning a differential impact of 10% on the value of the productivity indicator for factors 3 and 4, and an impact of 15% for factors 5 and 6, I find that the value of the productivity indicator is 96 for the UK, as compared to 100 for France, which means a differential of 4% in favour of France.⁶

Has it been shown that France is more productive in science than the UK? Certainly not. Beyond the need to ground more firmly the corrections for biases 3 to 6, there is the need for a third round of corrections to control for some additional factors of bias, which may push the productivity differential in either direction (factors 7 to 9):
Table 6. Effect of four factors of bias on the productivity indicator value (second round of corrections)

Country   Corrected indicator   Factor of bias 3:      Factor of bias 4:   Factor of bias 5:   Factor of bias 6:        Value of indicator
          eliminating factors   Unit cost for salary   Weight of social    Linguistic and      Publications resulting   eliminating factors
          of bias 1 & 2         and social security    sciences and        cultural bias       from non-public          of bias 1 to 6
                                                       humanities                              spending
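The text does not spell out the second-round arithmetic. One reading, which is an assumption on my part, is that each factor of bias discounts the UK index multiplicatively: 10% each for factors 3 and 4, 15% each for factors 5 and 6. Applied to the first-round value of 164, this reproduces the published value of 96.

```python
# Second-round correction, read (assumed interpretation) as multiplicative
# discounts on the UK index: 10% each for factors of bias 3 and 4,
# 15% each for factors of bias 5 and 6.
uk_index = 164.0                 # first-round corrected value (France = 100)
for discount in (0.10, 0.10, 0.15, 0.15):
    uk_index *= 1 - discount
print(round(uk_index))  # 96, i.e. a 4% differential in favour of France
```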
To arrive at a practical and methodologically sound position, I suggest that the role of quantification should be reconsidered, and the idea of a direct linkage between indicators and productivity assessment should be abandoned. Instead, I introduce the concept of an informed and pluralistic debate in a wider network, involving scientists and decision-makers. I suggest that quantitative indicators are the starting point for the discussion, with their raison d'être being to be criticised in terms of their (limited) relevance and (limited) comparability.

Indicators are considered thus, not as a final result to be accepted, but as an entry point for debate. This is an excellent way to enter an exercise of learning-by-comparing, which is what benchmarking is about. Criticism must be careful and positive, because the purpose is not to dismiss the legitimacy of the exercise, but to help dig further for a better understanding of the situation. Indicators will provide the cornerstone of the debates. There will be a sequence of quantitative and qualitative analysis phases, each one feeding from the previous and into the next, in a virtuous circle of comparative understanding.

Benchmarking as joint strategic monitoring exercise

To be relevant to public policy, benchmarking must be considered as a process involving some aspects of both extremes presented above. From the first, we should keep the idea of analysis with sufficient depth and due consideration of the socio-political and institutional context, and from the second we should keep the idea of indicators useful for comparisons. Thus, benchmarking must be viewed as a strategic monitoring exercise resulting from the interaction and debate among experts from two countries based on quantitative and qualitative evidence.

In this interpretation, benchmarking necessarily involves both analytical work and direct interaction among people. The experts are given the task of criticising and interpreting jointly a set of indicators, explicitly based on a common conceptual framework and computed in a transparent way. The debate among them is about the assumptions, the meaning of the categories chosen to compute the indicators, and the bias as a result of specific features or mechanisms overlooked, very much in the spirit of the discussion above.

In the course of such exchanges, deeper understanding of one's own system emerges at the same time as an understanding of the other systems. This results in an assessment of threats, opportunities, strengths and weaknesses, resulting in lessons and experiences learned and possibly recommendations. It also results in insights into alternative institutional arrangements and incentive structures, ideas about more adequate tools and procedures, a need for a more appropriate description of the system, with more appropriate concepts, and a need for better and more relevant indicators.

Those results and needs should then feed into a further stage of the benchmarking, which itself is a multi-stage process. In this sense, benchmarking as proposed here has much to do with an extended strategic monitoring process. It is an exercise in joint strategic evaluation, joint assessment and a joint learning process. Benchmarking thus provides an opportunity for policy analysts, administrators and decision-makers to develop international relations aimed at meaningful comparison and understanding through substantial interactions. To sum up, the following steps are suggested for implementing this process:

Step 1: Development of a conceptual framework to set up the categories, parameters and classifications needed to define a model of how things work.

Step 2: Selection and computation of internationally comparable quantitative input, process and output indicators.

Step 3: Benchmarking, with the help of national experts, which involves:

• characterising the structure of national S&T production, based on the conceptual framework, that is, relating the conceptual framework's categories to the 'real' national innovation system, its institutions and actors;
• interpreting and criticising the quantitative indicators, accounting for national specificity of various kinds;
• complementing, when necessary, the quantitative indicators by qualitative information.

Step 4: Identification of the key determinants of S&T production effectiveness, as well as the strengths/weaknesses of each national research and innovation system.

Step 5: Relating these determinants (strengths/weaknesses) to policy options in key areas of government policy (such as financing of public research, prioritisation and evaluation of public research, degree of autonomy of, and competition between, universities, forms of incentive programmes, and elements of the incentive structure), with a view to making country-specific policy recommendations.

[…] research system involves, may lead to the total dismissal of benchmarking as a tool for public policy. There is room for benchmarking as a public policy tool, provided that indicators are considered not as results, but as entry points for debate, and that benchmarking is understood as a multi-stage exercise involving analysts, research actors and policy-makers. To make sense, it should be seen as a joint strategic evaluation, joint assessment and joint learning process.

Benchmarking thus provides an opportunity for policy analysts, administrators and decision-makers to develop international relations aimed at meaningful comparison and understanding through substantive interactions. In that sense, benchmarking can indeed be a no-nonsense policy instrument.
Special Issue on Proceedings of the Conference Science and the Academic System in Transition — an International Expert Meeting on Evaluation, Vienna, 3–5 July, Scientometrics, 45, pages 523–530.
M Gibbons, C Limoges, H Nowotny, S Schwartzman, P Scott and M Trow (1995), The New Production of Knowledge (Sage Publications, London).
K Guy (1998), "Conceptual issues in RTD impact assessment", DGXII workshop on measurement of RTD results/impacts, Brussels, 28–29 May.
D Hicks and S Katz (1997), "The changing shape of British industrial research", STEEP special report no 6, SPRU, University of Sussex.
R M May (1998), "The scientific investments of nations", Science, 281, 3 July, pages 49–51.
P Roqueplo (1997), Entre savoir et décision, l'expertise scientifique, collection 'sciences en questions' (INRA éditions, Paris).