You are on page 1of 22

Research Policy 29 Ž2000.

657–678
www.elsevier.nlrlocatereconbase

Evaluating technology programs: tools and methods


Luke Georghiou a , David Roessner b,c,)

a
UniÕersity of Manchester, Oxford Rd., Manchester M13 9PL, UK
b
Department of Public Policy, Georgia Institute of Technology, Atlanta, GA 30332-0345, USA
c
Science and Technology Policy Program, SRI International, 1611 N. Kent St., Arlington, VA 22209, USA

Abstract

This article reviews the analytical tools, methods, and designs being used to evaluate public programs intended to
stimulate technological advance. The review is organized around broad policy categories rather than particular types of
policy intervention, because methods used are rarely chosen independent of context. The categories addressed include
publicly-supported research conducted in universities and public sector research organizations; evaluations of linkages,
especially those programs seeking to promote academic-industrial and public-private partnerships; and the evaluation of
diffusion and industrial extension programs. The internal evaluation procedures of science such as peer review and
bibliometrics are not covered, nor are methods used to projects and individuals ex ante. Among the conclusion is the
observation that evaluation work has had less of an impact in the literature that it deserves, in part because much of the most
detailed and valuable work is not easily obtainable. A second conclusion is that program evaluations and performance
reviews, which have distinctive objectives, measures, and tools, are becoming entangled, with the lines between the two
becoming blurred. Finally, new approaches to measuring the payoffs from research that focus on linkages between
knowledge producers and users, and on the characteristics of research networks, appear promising as the limitations of the
production function and related methods have become apparent. q 2000 Elsevier Science B.V. All rights reserved.

Keywords: Technology programs; Tools; Methods

1. Introduction the mid-1980s, the first OECD review in this field


noted the convergence of the internal evaluation
Demand for evaluation has been fueled by the
traditions of science with a growing demand for
desire to understand the effects of technology poli-
evaluation of public policy in general ŽGibbons and
cies and programs, to learn from the past and, more
Georghiou, 1987.. The latter trend would later be
instrumentally, to justify the continuation of those
characterized as a feature of ‘‘new public manage-
policies to a sometimes skeptical audience. The rela-
ment’’ and reach its apotheosis in the current re-
tion between the evolution of evaluation methods
quirement for programmatic and institutional perfor-
and approaches and that of technology policies, we
mance indicators. A third influence was the growing
shall argue, is complex. The development of evalua-
tendency to associate science with competitive per-
tion approaches has responded to varied stimuli. In
formance and the search for more effective ways to
achieve that linkage. The growth of activity was
)
Corresponding author. Department of Public Policy, Georgia catalogued in other reviews at that time ŽRoessner,
Institute of Technology, Atlanta, GA 30332-0345, USA 1989; Meyer-Krahmer and Montigny, 1990. that ex-

0048-7333r00r$ - see front matter q 2000 Elsevier Science B.V. All rights reserved.
PII: S 0 0 4 8 - 7 3 3 3 Ž 9 9 . 0 0 0 9 4 - 3
658 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

tended the analysis to innovation programs from the tion and to the ways in which information inputs and
US and European perspectives, respectively. Since reporting are organized. In general, the organiza-
that time, as policies have increasingly focused upon tional and political dimensions of evaluation, though
fostering linkages of various types within innovation important, are not covered in this review.
systems, evaluation methods have been developed In the second part of this section, the scope is
which aim to characterise and measure those link- broadened to include evaluations that focus upon
ages. linkages, including those of programs seeking to
In this paper, we have chosen to focus on technol- promote academic–industrial and public–private
ogy policy rather than the broader scope of innova- partnerships. The centrality of these interfaces to
tion policy. The evaluation literature is now so ex- technology policy has stimulated a large body of
tensive that it has been necessary to be selective research, much of which has an evaluative character.
even within this constraint. Evaluation methods tend The selection made here aims to give a flavor of the
to cluster around particular types of policy interven- range of evaluation methods used rather than to
tion. Hence, we have found it more useful to orga- review the topic as a whole. These interfaces are also
nize the review by broad policy category than by an present in the next focus, collaborative R & D pro-
attempt to group methods independently of their grams, though industrial collaboration is usually their
context. Following this approach, three related focal main aim. This policy instrument has had a special
points for evaluation are used to structure what relationship with the development of evaluation, with
follows. These appear to follow a sequential model the two growing together during the 1980s. It contin-
of innovation from basic research, through aca- ues to attract a large amount of evaluation activity.
demic–industrial linkages to industrial collaborative Finally, experience in the evaluation of diffusion
R & D and finally diffusion and extension. However, and extension programs is discussed. In methodolog-
it will emerge from what follows that similar ques- ical terms, this involves a transition from the rather
tions Žfor example those concerning economic re- lumpy and unpredictable distribution and attribution
turns. may be posed at any stage of this sequence. of benefits that characterize research, to a domain
Furthermore, evaluations have been instrumental in where large numbers of client firms are in receipt of
exposing the many feedback loops and unexpected less visible assistance. Evaluation here is thus char-
consequences that have modified the way in which acterized by treatment of large data sets, though we
the innovation process is understood. will argue that these conceal a wide range of services
The first of our focal points concerns evaluation and clients.
of publicly supported research carried out in univer- Two other points need to be made concerning the
sities and public-sector research organizations. One scope of this review. The first is that evaluation is a
reason this is included is the growing significance social process, which means that methods cannot be
attached to the economic and social rationales for equated with techniques for collection of data or
public support of research. Hence, the relevant sec- their subsequent analysis. The choice of what is
tion addresses means by which the economic and significant to measure, how and when to measure it,
social value of science is being assessed. The inter- and how to interpret the results are dependent upon
nal evaluation procedures of science, principally peer the underlying model of innovation that the evaluator
review with some contribution from its offshoots in is using, implicitly or explicitly. Much of the data
scientometrics, are not covered here. Modified peer collected by evaluators are themselves conditioned
review, or merit review as it is sometimes called, by the positioning of the evaluation and those who
extends the terms of reference of a peer review panel execute it. In consequence, it is usually necessary to
to cover the broader socio-economic issues that are understand the setting of the evaluation and the
discussed below. Often, the outcomes of such panel discourse in which its results are located before the
reviews are highly influential because of the status of choice of approach can be fully appreciated.
their members. However, from a methodological per- The second qualification about our scope ad-
spective, interest in such approaches is limited to the dresses the level of evaluation covered. The central
variations in structure and composition of consulta- unit of analysis is the program evaluation. In many
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 659

ways, this is the easiest territory for evaluators in framework of general approaches to measuring the
that a program almost by definition has boundaries economic returns to investments in research. He
in space and time and certainly should have objec- classified these approaches into three distinct types
tives. However, when moving across national and ŽHertzfeld, 1992, p. 153.:
cultural borders this criterion for inclusion may be 1. The adaptation of macroeconomic production
over-restrictive. Evaluation in France, in particular, function models to estimate the effects of techno-
is mainly at the level of institutions, even if many of logical change or technical knowledge that can be
the same issues are addressed ŽLaredo ´ and Mustar, attributed to R & D spending on GNP and other
1995.. aggregate measures of economic impact.
Aspects that remain excluded from this review are 2. Microeconomic models that evaluate the returns
ex ante evaluation, evaluation of projects and of to the economy of specific technologies by esti-
individuals Žthough each of these may well be linked mating the consumer and producer surpluses these
to the evaluations we discuss.. The last consideration technologies create.
in terms of scope is that of the most aggregated 3. Measuring the direct outputs of research activity
evaluations, which address whole sectors, policies, through reported or survey data on patents, li-
and national or international systems. Many of these censes, contractor reports of inventions, value of
tend to intrude into the general realm of technology royalties or sales, and so forth.
policy studies, but some are more firmly rooted in Most recently, the Critical Technologies Institute
the tradition of evaluation. We will argue in the of the RAND Corporation Žnow renamed the Science
concluding section of this paper that understanding a and Technology Policy Institute. published a report
program may well require more work on the sys- prepared for the White House Office of Science and
temic context than is presently common practice. Technology Policy that reviewed methods available
for evaluating fundamental science ŽCozzens et al.,
1994.. The RAND study classified evaluation meth-
2. Evaluation of the socio-economic impacts of ods into three types:
research in universities and public laboratories 1. Retrospective, historical tracing of the knowledge
2.1. Measuring the returns inputs that resulted in specific innovations.
2. Measuring research outputs in aggregate from
Within the past 15 years, there have been several particular sets of activities Že.g., programs, pro-
comprehensive assessments of methods for measur- jects, fields, institutions. using bibliometrics, cita-
ing the returns, benefits, andror impacts of public tion counts, patent counts, compilations of ac-
support for research. The Office of Technology As- complishments, and so forth.
sessment’s 1986 Technical Memorandum was stimu- 3. Economic theoryreconometric methods employ-
lated by Congressional interest in improving research ing, as measures of performance, productivity
funding decisions through the use of quantitative growth, increase in national income, or improve-
techniques associated with the concept of investment ments in social welfare as measured by changes
ŽU.S. Congress, Office of Technology Assessment, in consumer andror producer surpluses ŽCozzens
1986.. The OTA review covered economic and out- et al., 1994, pp. 41–43..
put-based, quantitative measures used to evaluate What is striking about these critical assessments is
R & D funding. Economic methods included macroe- that, despite their nearly 10-year span, there is virtu-
conomic production functions, investment analysis ally complete agreement among all three concerning
and consumer and producer surplus techniques. Out- the strengths and weaknesses of the various ap-
put methods included bibliometric, patent count, proaches to determining the benefits or payoffs from
converging partial indicators, and science indicators investments in research generally, and fundamental
approaches. science in particular. Moreover, improvements over
Hertzfeld Ž1992. provided a probing critique of the period in the several techniques described have
methods employed to measure the returns to space been modest and incremental. There have been nu-
research and development, but did so within a larger merous examples of the application of these tech-
660 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

niques to measure the impacts of research that span program managers seeking to maximize the benefits
the past 30 years. There is considerable agreement that result from their programs of research. As the
on which studies represent the best of each genre, RANDrCTI study concluded,
and on the limitations of each approach.
Historical trace studies such as Hindsight ŽSher- Most direct indicators of R & D outcomes such as
win and Isenson, 1967., TRACES ŽIllinois Institute citation and other bibliometric indicators are only
of Technology Research Institute, 1968., and indirectly Žand loosely. related to those aspects of
‘‘TRACES II’’ ŽBattelle, 1973. offer the possibility economic and other goals of public improvement
of detailed information about the relative contribu- which are of greatest interest ŽCozzens et al.,
tion of basic vs. applied research, institutional con- 1994, p. 44..
texts, and sources of research support. But the method
requires subjective judgments about the significance As noted above, economic assessments of re-
of particular research events, and suffers from lack search fall into two basic categories: production
of generalizability because of non-random innova- function analyses and studies seeking social rates of
tion selection processes. It also traces single products return. Production function studies assume that a
of research backwards in time to identify the full formal relationship exists between R & D expendi-
range of contributions, rather than tracing forward tures and productivity. While there is ample evidence
from a single research activity Že.g., a project. to that such a relationship exists, there are numerous
identify multiple impacts or consequences. Further, it problems with this approach. Prominent among the
is an extremely costly method; given the complexity studies that employ this approach are those by Deni-
of knowledge flow processes and variations across son Ž1979., Link Ž1982., Levy and Terleckyj Ž1984.,
fields of science and technology, the number of cases Jaffe Ž1989. and Adams Ž1990.. Regardless of the
necessary for generalization would be prohibitively assumptions underlying these studies, all conclude
expensive. Finally, historical trace studies fail to that the relationship between private R & D expendi-
account for the indirect effects of research, including tures and various measures of economic growth and
dead ends Žfrom which substantial learning takes productivity are positive and substantial. But there
place., spillovers, and synergistic effects. are fundamental problems with this approach that are
Other, more aggregate, approaches seek to over- exacerbated when applied to public R & D expendi-
come the limitations of historical trace studies. Mea- tures. First, the ‘‘technical information’’ term in the
sures of basic research outputs have been refined to production function is only approximated by R & D
the point where reasonably valid and reliable data on expenditure data, and in any event its effect on the
the quantity and quality of research outputs can be larger production system is not well understood.
obtained. It has been argued that while any single Second, the approach does not account adequately
output measure is insufficient, the relative effective- for externalities that result from R & D activity. Third,
ness and efficiency of research programs, organiza- if the focus is on publicly supported R & D, addi-
tional units, and institutions can be measured using tional problems arise because the intent of most
combinations of measures Že.g., Martin and Irvine, public R & D is not to stimulate economic growth,
1983.. The problem for such indicators generally is but to achieve Žpublic. agency missions. Any contri-
one of linking outputs to impacts, especially impacts bution to economic growth is thus due to indirect
that are valued in markets or the political arena. For knowledge transfers. Fourth, the contributions of
some policy purposes, such as resource allocation basic and applied research are difficult to distinguish,
decisions among several institutions conducting ba- yet these are important for many policy purposes.
sic research, these kinds of studies can be of consid- Finally, this approach is designed to address eco-
erable value. But they are inadequate for questions nomic benefits that result from incremental changes
concerning the impact of basic research on larger in production efficiencies, but cannot adequately ac-
societal goals, especially when measures of value are count for the effect of new products or radical
sought. In addition, except in the most general sense, changes in production efficiency ŽCozzens et al.,
these approaches offer little guidance to research 1994, p. 48..
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 661

Aggregate social rate of return studies attempt to process are brought to bear, obscuring the relation-
estimate the social benefits that accrue from changes ship under study.
in technology and relate the value of these benefits Hertzfeld concludes his assessment of the various
to the cost of the investments in research that pro- techniques with a recommendation:
duced the changes of interest. These studies employ,
by analogy, what amount to the internal rate of
return calculations often used by private firms. So- . . . the most promising type of economic study for
cial benefits are measured as the sum of consumer measuring returns to space R & D is the documen-
and producer surpluses. Prominent examples of this tation of actual cases based on direct survey
approach are studies by Mansfield Ž1980; 1991., information. Case studies provide relatively clear
Mansfield et al. Ž1977., Tewksbury et al. Ž1980. and examples of the returns, both theoretical and ac-
Link Ž1981.. At a more disaggregate level, the con- tual, to space R & D investments. A well-struc-
sumer surplus approach has been used to estimate tured series of case studies coupled with good
the returns from public investments in technology theoretical modeling aimed at integrating the data
development programs as well. We discuss this, and and incorporating those data with more general
the use of an alternative technique, briefly in a economic measures of benefits may well be the
subsequent section of this article. way to establish reliable and ‘believable’ eco-
As in the case of production function studies, nomic measures of the returns to space R & D Žp.
aggregate consumer surplus studies tend to show that 168..
the aggregate rate of return to research is positive,
and that the social rate of return exceeds, on average,
the private rate of return. Also, as in the case of Hall Ž1995., in a review of the private and social
production function studies, this approach has signif- returns to R & D supports the employment of a
icant drawbacks. First, the causal mechanism that broader range of approaches, cautions that case study
links R & D investment to social or private returns is evidence may be hampered by focus on ‘winners’,
imputed but not direct. Second, because calculations and needs to be supplemented by a computation of
of consumers’ and producers’ surplus depend on the returns.
existence of supply and demand curves, they are
inappropriate for goods and services that are radical
departures from their predecessors Žwhich is just the
case for the most interesting products of basic re- 3. Evaluating linkages
search.. Third, because they rely on data from indi-
vidual cases, like historical trace studies they are Since the RANDrCTI assessment was published,
difficult to generalize from and expensive to con- a number of papers have appeared that are critical of
duct. Fourth, they rely on an adaptation of an invest- the entire conceptual framework economists have
ment model intended for short-term investment cal- been using to analyze scientific activity. This ‘‘new
culations, not the long-term benefits that might ac- economics of science,’’ perhaps best formulated by
crue from basic research. Thus, the discounting re- Dasgupta and David Ž1994., is critical of the ‘‘old
quirements of the models may severely underesti- economics of science’’ based on production func-
mate the contribution of basic research. Finally, as tion-based approaches to assessing the payoffs from
with other approaches, spillovers are difficult to science Žand its linkages to technology., and intro-
identify and account for ŽCozzens et al., 1994, p. duces broader perspectives that may lead to new
55.. approaches for evaluating basic science. By incorpo-
In summary, the dilemma the limitations of these rating the insights gained from sociological studies
approaches pose for the research evaluator lies in the of science, economists’ perspectives are expanding
question of attribution, this being that to realise to incorporate the characteristics of research institu-
economic effects of research a range of complemen- tions and the professional networks that bind knowl-
tary factors from within and beyond the innovation edge producers and users. One of Dasgupta and
662 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

David’s propositions emerging from their work reads struments, their institutions, and their social net-
as follows: works of communication and collaboration. Inno-
vation gives rise to diverse scientific and techno-
The organization of research under the distinct logical outcomes including, among others, scien-
rules and reward systems governing university tific knowledge, new technological devices, craft
scientists, on the one hand, and industry scientists and technique, measures and instruments, and
and engineers, on the other, historically has per- commercial products. But it is the collective ar-
mitted the evolution of symbiotic relationships rangements and their attendant dynamics, not the
between those engaged in advancing science and accumulated innovation products, that should more
those engaged in advancing technology. In the properly be considered the main asset when as-
modern era, each community of knowledge seek- sessing the value of research ŽBozeman and
ers, and society at large, has benefited enor- Rogers, 1999..
mously thereby ŽDasgupta and David, 1994, p.
518..
Thus, for Bozeman and Rogers, the appropriate
Although much of the ‘‘new economics’’ is de- evaluation unit of analysis should be the set of
voted to the implications of this broader thinking to individuals connected by their uses of a particular
resource allocation efficiencies, its inclusion of the body of information for a particular type of applica-
processes of knowledge transfer and use and of the tion — the creation of scientific or technical knowl-
networks by which such transfers take place offers edge. They term this set of individuals the ‘‘knowl-
opportunities for new ways of thinking about how to edge value collective.’’ In this scheme, science can
value science beyond attempts to monetize the value be valued by its capacity to generate uses for scien-
of the knowledge produced andror to infer such tific and technical information. That capacity is cap-
value from the applications that enter economic mar- tured by the dynamics of the knowledge value col-
kets. lectives associated with use, as measured by their
Bozeman et al., as part of an ongoing evaluation growth or decline, their productivity or barrenness,
of the US Department of Energy’s Office of Basic and their ability to develop Žor deflect. human capi-
Energy Sciences, have recognized this opportunity tal.
and are currently developing new theory about In many ways, this work builds upon the perspec-
knowledge and innovation that, in our view, shows tive developed by the Centre for Sociology of Inno-
promise for methodological advances in evaluating vation ŽCSI. in France, who have focused their
basic research ŽBozeman and Rogers, 1999; Boze- evaluation activities upon the emergence of net-
man et al., 1998; Rogers and Bozeman, 1998.. Boze- works. One approach, applied mainly at the institu-
man and Rogers propose a ‘‘pre-economic’’ ap- tional level, uses the concept of techno-economic
proach to evaluating scientific knowledge, one that networks to characterise in a dynamic manner the
does not purport to supplant economic evaluations nature, types and durability of alliances between
but to complement them. Their approach is based on public laboratories and companies, with particular
the range and repetition of uses of knowledge both emphasis on the ‘‘intermediaries’’, meaning docu-
by scientific colleagues and by technologists. One of ments, embodied knowledge, artifacts, economic
the several advantages of this approach is that, unlike transactions and informal exchanges ŽCallon et al.,
most evaluations that use the program or project as 1997.. The emphasis in the CSI approach moves
the unit of analysis, it incorporates the role of re- away from economic appraisal, focusing instead upon
search collectives and networks engaged in knowl- how networks are assembled to realize innovations
edge production and use. In their view, in collective goods. This brief treatment cannot do
justice to the developing work on collectives, but
innovation and knowledge flows cannot be as- suffice it to say that it appears to offer considerable
sessed independently of the collective arrange- promise for capturing many of the payoffs from
ment of skilled people, their laboratories and in- basic research, such as the development of human
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 663

capital, that thus far have been systematically under- methodological practice and the importance of the
estimated or ignored. specific and individual context of each evaluation.
Citations to scientific papers made in patents have All were subject to a mix of approaches Žthe normal
been used as an indicator of growing linkage be- case for evaluations. and demonstrate the strengths
tween academic science and industry ŽNarin and and limitations of each method employed in meeting
Norma, 1985; Narin et al., 1997.. While supporting the needs of different clients and stakeholders.
this indicator, Pavitt Ž1998, p. 109. has argued that
patenting by universities gives ‘‘a restricted and
distorted picture of the range of the contributions 3.1. EÕaluation of the Center for AdÕanced Technol-
that university research makes to practical applica- ogy DeÕelopment (CATD)
tions.’’ The explanation of this view lies in the
complex nature of the relationship between universi- The Center for Advanced Technology Develop-
ties and the corporate sector, involving flows of ment ŽCATD. was established in 1987 at Iowa State
ideas and people. University ŽISU. with funds from the US Depart-
Roessner and Coward Ž1999, forthcoming. are ment of Commerce. CATD seeks to bridge two gaps
conducting an analytical review and synthesis of the in the innovation process:
research literature on cooperative research relation- Ø between the research results of the university and
ships between US academic institutions and industry, the commercial market; and
and US federal laboratories and industry. The scope Ø between a company’s problems and the expertise
of the review and synthesis was restricted to: resident at the university.
Ø empirical studies based on original or archival To address the first gap, CATD funds research
data only rather than theoretical, normative, or intended to demonstrate proof-of-concept and de-
policy analyses; velop advanced prototypes. To address the second,
Ø studies of research cooperation among US institu- CATD can match funds provided by industry to
tions; address industrial problems.
Ø published or at least readily available reports and An extensive evaluation of CATD was conducted
studies Ži.e., conference papers, dissertations, the- in 1995–1996 ŽRoessner et al., 1996.. The evalua-
ses, contractor reports, and unpublished papers tion is significant for several reasons: there was
generally were excluded.; sufficient financial support to conduct a relatively
Ø published or printed since 1986. thorough study Ža situation uncharacteristic of most
As of this writing, 33 studies have been reviewed single, university-based technology transfer organiza-
of the approximately 50 to be examined. While only tions., there were multiple clients for the evaluation,
a handful of these studies were intended as formal and the evaluation design combined multiple strate-
program evaluations, most were intended to yield gies and types of data. There were three primary
policy-relevant results. Surveys, case studies, and clients for the evaluation: the National Institute of
personal interviews accounted for most of the meth- Standards and Technology ŽNIST. of the US Depart-
ods employed in these studies Ž22.; two conducted ment of Commerce, which paid for the evaluation,
quantitative analyses of archival data, two employed the Iowa State Legislature, and CATD staff. NIST
modeling, and the remainder were literature reviews was interested in knowing whether it should support
or essays or used a combination of methods. As a long- or short-term projects. For the state, CATD
general observation, the preponderant use of surveys, needed to validate its successes and show evidence
case studies, and interviews is an indication of both of success, as developed by an independent evalua-
the recent emergence of cooperative R & D relation- tor, to its constituencies. Finally, CATD manage-
ships as a subject of study and the complex, dynamic ment wanted to know which of its activities gener-
characteristics of these relationships. ated the greatest payoff, how its activities were
The following three brief case studies of evalua- perceived by its industrial clients, and what changes
tions of programs to stimulate academic–industry were needed to increase the payoffs from its activi-
linkages demonstrate both general lessons for ties.
664 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

A project with multiple clients, each asking dif- what project selection criteria to emphasize. The fact
ferent questions, called for a design employing mul- that quite different evaluation methods produced
tiple evaluation strategies. In addition, multiple similar overall Žpositive. results strengthened the ac-
strategies Žand multiple measures of outcomes and ceptability of each type of result.
impacts. would yield greater confidence in the re-
sults if the several strategies reached similar conclu-
sions. The research team employed three research 3.2. Impact on industry of participation in the engi-
strategies: neering research centers program
Ø a survey of CATD client firms and a formal but
simplified benefitrcost analysis, which would ad- The National Science Foundation’s Engineering
dress payoff from investment questions; Research Center ŽERC. program, initiated in 1985,
Ø a series of detailed case studies, which would was intended to stimulate US industrial competitive-
address the client perception and management ness by encouraging a particular brand of industry–
questions; university research collaboration Žthe university-
Ø a ‘‘benchmarking’’ analysis involving similar based industrial research consortium., emphasizing
programs, which would also address the payoff interdisciplinary research, and fostering a team-based,
questions using a different method. systems approach to engineering education.
Of these, the benchmarking exercise was the most Evaluating such a program presents formidable
novel approach. Benchmarking programs such as challenges, especially if, as was the case here, some
CATD presents formidable challenges to analysts clients of the evaluation Že.g., Congress. wished to
and evaluators, most of which have to do with the obtain information on benefits to companies in terms
lack of data on programs that are truly comparable. of dollar payoff of their participation in ERCs. The
For these reasons, the Georgia Tech team used three overall objectives of the evaluation were to examine
different approaches to benchmarking CATD outputs patterns of interaction between ERCs and participat-
and impacts: ing companies, and to identify results of that interac-
Ø data from the FY 1993 survey of members of the tion in terms of types of impact and value of impacts
Association of University Technology Managers; Žbenefits to firms.. These deceptively simple objec-
Ø four selected state cooperative technology pro- tives posed several challenges because, among other
grams; and reasons, the consequences of industry–university in-
Ø results of a survey of companies conducting co- teraction take multiple forms over time, flow via
operative research with federal laboratories multiple paths within the firm, and are likely to have
ŽBozeman et al., 1995.. intermediate effects not valued in monetary terms —
Methodologically, the case studies offered the but set in motion Žor prevent. activities that do have
best fit between client questions and the evaluation such value.
results. The least successful effort was benchmark- In an effort to deal with these challenges in the
ing, primarily because it proved so difficult to iden- initial research design, SRI International, the evalua-
tify comparable programs that had been evaluated at tor, first conducted a series of case studies within a
a comparable level of detail. The benefitrcost small number of firms to provide details on intra-firm
framework was required to address the questions activities related to ERC participation ŽFeller and
posed by decisionmakers concerned about the future Roessner, 1995; Ailes et al., 1997.. Next, drawing
of the program itself, but the results offered no more upon the case studies, a focus group consisting of
than a modest justification for the public expendi- representatives from companies that participate in
tures and little for purposes of research management. ERCs was held. Third, the research team designed a
The case studies yielded three models of university– survey instrument based on the results of the case
industry collaboration that could be associated with studies, focus group, and model, and surveyed the
measures of impact. Managers could then decide, more than 500 firms that were participating. Follow-
based on their preferred outcomes Že.g., jobs vs. ing the survey, SRI conducted telephone interviews
increased private investment vs. startup companies., with about 20 survey respondents who reported par-
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 665

ticularly detailed or varied benefits from their partic- to separate ‘‘results’’ or ‘‘impacts’’ from ‘‘benefits,’’
ipation. because different respondents within the same unit
Results of the case studies of ERC-company pairs may agree on results or impacts but impute quite
and the focus group required that ‘‘typical’’ ap- different levels of benefit to them.
proaches to survey design Že.g., how respondents are Companies surveyed derived a multiplicity of
identified; the unit of analysis. be rethought, and that benefits from their participation in ERCs, but few
prevailing ideas about how firms value their partici- made a systematic effort to monetize these benefits
pation in ERCs Žas well as other investments in or develop formal measures of payoff or cost-ef-
external R & D. are incomplete and simplistic. To be fectiveness in order to justify the cost of member-
specific, the preliminary investigations suggested ship. Instead, they evaluated their participation in
that: ERCs as a dynamic process rather than expecting a
Ø The ‘‘known’’ direct consequences Žobservable. fixed set of benefits that can be assessed by present
to a firm of its participation in an ERC are or retrospective rating of outcomes. Occasionally,
extremely limited, often restricted to a handful of specific outcomes were realized and an economic
people in the firm. value estimated, but this was rare. Telephone inter-
Ø The business unit that pays the fee to participate views revealed that firms concluded that efforts to
Žand is identified as the ‘‘member’’. may not be measure benefits in dollars would cost more than the
the business unit that receives the Žobservable. company’s investment in the ERC. Only a small
benefits from participation. proportion of companies developed a new product or
Ø Companies regard benefits received from partici- process or commercialized a new product or process
pation in an ERC to be a function of their efforts obtained as a result of ERC interaction, but those
to make use of ERC results and contacts, rather that did valued these benefits particularly highly.
than of the amount of their membership fee. Thus, it proved to be extremely important to distin-
Ø Firms rarely attempt to estimate the precise dollar guish between whether a benefit was experienced
benefits gained from their participation in an ERC. and the value placed on that benefit.
These results had a number of very important
implications for any effort to employ surveys to 3.3. Teaching company scheme
measure impacts or benefits from industry participa-
tion in collaborative R & D relationships with univer- An example of a European study in this domain
sities or with other R & D suppliers. First, the busi- saw evaluations addressing what is widely perceived
ness unit, not the firm or even the division, is the as a successful initiative, the UK’s Teaching Com-
active response category. Thus, the survey instru- pany Scheme ŽTCS. ŽSenker and Senker, 1997.. The
ment must be directed to the unit that is the primary TCS is a long-running initiative that provides access
direct beneficiary of the company’s participation in to technology for firms and facilitates academic–in-
the ERC. Second, surveys must distinguish carefully dustrial technology transfer by employing graduates
between at least two key roles within the unit: the to work in a partnering company on a project of
‘‘champion’’ and the decision maker who approves relevance to the business. However, the study found
the budget for ERC membership. Data must be little evidence of a positive association between par-
obtained from both. The SRI team addressed this ticipation in the Scheme and the performance of
problem by identifying these roles as Ž1. the academic departments. In fact, there was a higher
‘‘champion,’’ the person who interacts most inten- positive association for departments involved in other
sively with the ERC, and Ž2. the ‘‘approver,’’ the forms of industrial linkage. From the perspective of
person who reviews or approves the budget for the evaluation methodology, this example illustrates the
unit’s participation in the ERC. Third, the survey difficulty of isolating the effects of a single stimulus
must establish the nature of each respondent’s in- from a wide range of variables and indeed of finding
volvement with the ERC. Valid analysis of any suitable proxies for collective academic performance
impact or benefit data must control for the nature of when linkages are still primarily at the level of
each respondent’s interaction. Finally, it is important individuals. In another evaluation study, the TCS
666 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

scored highest on a cost-effectiveness index devised absence of an intervention. For the ATP, a further
by the UK National Audit Office to compare a range hurdle has been the need to demonstrate to political
of programs which supported innovation. This index critics that the program is satisfying the theoretical
was based upon nineteen performance indicators criteria for intervention by operating in the margin
which were intended to reflect a combination of the where private returns alone do not justify the R & D
efficiency of delivering assistance and the measur- investment without the additional consideration of
able effect on innovation ŽNational Audit Office, social returns. In Europe, questions of rationale have
1995.. While the attempt to compare across widely generally been dealt with at the ex ante stage. In
differing schemes is laudable, the report serves most cases, it has been sufficient to demonstrate that
mainly to illustrate the pitfalls involved in trying to economic and social benefits have been achieved. As
capture complex effects with quantitative indicators, in the previous section, methodological issues are
and the importance of implicit or explicit weighting discussed in terms of a series of case studies, se-
of criteria. TCS did well because it involved low lected both for the prominence of the programs
administrative overheads by comparison with involved and for the evaluation issues they raised.
schemes offering grants to firms. However, granting
schemes have different objectives that may not be 4.1. The AlÕey Programme
achievable through low overhead policy instruments.
From the earliest days of evaluation of collabora-
tive programs, it was clear that a broad range of
4. Evaluation of collaborative R & D effects needed to be considered. The evaluation of
the UK’s Alvey Programme for Advanced Informa-
For Europe, the emergence of publicly supported tion Technology ŽGuy and Georghiou, 1992. set the
collaborative R & D between firms acting in research style for most subsequent UK national evaluations of
consortia Žand usually with academic partners. was a collaborative R & D, as well as forming an input to
distinctive feature of the 1980s, despite long an- broader European practice. That particular exercise
tecedents in industrial research associations. The US was a real-time evaluation, tracking the program
was initially inhibited in this area by anti-trust legis- from shortly after its inception, with periodic topic-
lation ŽGuy and Arnold, 1986., but private initiatives oriented reports fed back to the management, many
such as the Microelectronics and Computer Technol- of which addressed the complex process issues aris-
ogy were followed by the Advanced Technology ing from the program, for example the problems with
Program ŽATP. in 1990. The latter was much more intellectual property conditions ŽCameron and
similar to its European counterparts in structure and Georghiou, 1995.. The final report, delivered 2 years
aims, though somewhat smaller. It also differed in after the initiative ended, found that Alvey had suc-
offering single company support targeted at small ceeded in meeting its technological objectives as
firms in addition to support for joint ventures. well as its main structural objective of fostering a
From the perspective of this review, an important collaborative culture, particularly in the academic–
feature of such programs is the role they have played industrial dimension. However, it had manifestly not
in the stimulation of research evaluation method and succeeded in its commercial objective of revitalising
practice ŽGeorghiou, 1998.. The novelty of this pol- the UK IT industry. The problem, the evaluators
icy instrument and the expectation that new types of found, was in the objective itself. What was, despite
effects would be significant provided a stimulus to its size, still an R & D program had been sold as a
program managers to seek a deeper understanding of substitute for an integrated industrial policy with
their actions. Added to this has been an ongoing goals to match. For reasons that went well beyond
desire to demonstrate to policy-makers and other technological capability, the market position of UK
stakeholders that, on the one hand the programs are IT firms declined substantially during that period.
contributing to the competitiveness of firms, but that This was perhaps the first example of a long-running
nonetheless, those firms have been induced to under- evaluation problem in this sphere: such programs
take R & D that would not have taken place in the have multiple and sometimes conflicting goals that
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 667

are often not articulated in a format amenable to view of effects and revealed a number of unexpected
investigation through an evaluation. Hence, it was results, for example the significance of European
necessary to reconstruct the objectives in an evalu- projects in providing a basis for doctoral training. By
able form. working at the level of the organisation rather than
Since that time, the UK has developed the so- the project, some idea of the aggregate and interac-
called ROAME Žor ROAMEF. system, whereby tive effects of participation could be gained.
prior-approval for a program requires a statement Also from France came the best known European
detailing the rationale, objectives Žverifiable if possi- attempt to calculate the economic effects of pro-
ble., and plans for appraisal of projects, monitoring grams upon participating organizations. Known as
and evaluation Žplus feedback arrangements. ŽHills the ‘‘BETA method’’ after the laboratory at Stras-
and Dale, 1995.. This clearly simplifies the situation bourg University where it originated, this approach is
for evaluators but also emphasizes the need for based upon in-depth interviews during which man-
evaluators to engage with policy rationales, a topic agers in the organizations are asked to identify ‘‘ad-
returned to below in the discussion on the evaluation ded value’’ from sales and cost-savings, and to
of the ATP. attribute a proportion of this to their participation in
the project in question. The most innovative aspect
4.2. The EU Framework Programmes of the model is a formula for calculating indirect
benefits that arise from transfer of technology from
For most Europeans, the development of evalua- the project to other activities of the participant, from
tion has gone hand-in-hand with the history of pan- business obtained through new contacts or reputation
European collaborative initiatives, notably the Euro- enhancement gained via the project, from organiza-
pean Commission’s Framework Programmes and the tional or management improvements, or from the
inter-governmental EUREKA Initiative. The Frame- impact on human capital or knowledge in the organi-
work Programmes have been evaluated in many zation. The principal difficulty surrounding this ap-
different ways at the behest of a variety of stakehold- proach is that of attributing particular economic ef-
ers ŽOlds, 1992; Georghiou, 1995; Guzzetti, 1995; fects to a single project, when in many cases, the
Peterson and Sharp, 1998.. The legally based evalua- innovation may have drawn upon multiple sources
tion process has always been based upon convening from inside and outside the company concerned. The
a panel of independent experts for each sub-program method produces ratios attractive to research spon-
and asking them to report upon its scientific, man- sors because in a well publicized example ŽBach et
agerial and socio-economic aspects. Reports focused al., 1995. they show direct effects some 14 times
heavily on the first two criteria, confining themselves greater than the initial public investment. The figure
to generalized remarks on the third. Since 1995, this cannot, of course, be taken as a rate of return as it
approach has been elaborated, following pressure does not include in the calculation any other invest-
from Member States, and now consists of continuous ments necessary to realize the returns.
monitoring, reporting annually, and 5-year assess- Future directions for evaluation of the Framework
ments, carried out midway through program imple- Programme are discussed in a recent report from the
mentation but including within the frame of analysis European Technology Assessment Network ŽAiraghi
the preceding program ŽFayl et al., 1998.. et al., 1999.. A new emphasis upon broader social
However, from a methodological perspective, the objectives in the current Fifth Framework Pro-
most interesting developments have taken place at gramme implies the need to deal with a broader
the margins of this system or outside it altogether. range of stakeholders and to enter the difficult terri-
The first significant development came from a study tory of measuring the effects of R & D on employ-
carried out in France that aimed to look at the impact ment, health, quality of life and the environment, all
of all EU R & D funding upon the national research areas mainly driven by other factors. A second chal-
´ et al., 1990.. This was the prototype
system ŽLaredo lenge is the need to deal with the sometimes elusive
of a family of ‘‘national impact’’ studies which, concept of ‘‘European value-added’’, in other words,
using a survey and interviews, took a cross-cutting to assess whether the R & D objectives were most
668 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

efficiently pursued at the European Community level, involves the automatic despatch of a questionnaire
as opposed to national, regional, or indeed global on outputs and effects upon completion of the pro-
levels. ject Žthis functions as a Final Report.. Shorter fol-
low-up questionnaires are despatched after one, three
4.3. EUREKA and Žin the future. 5 years to participants indicating
commercial effects. In addition a sample of some
Evaluation of the EUREKA Initiative has evolved 20% of participants who completed 3 years before is
largely independently from that of the Framework interviewed. An annual report is prepared by an
Programme. EUREKA is normally described as a independent expert panel on the basis of these find-
‘‘bottom-up’’ initiative in which firms may seek ings. While simple in concept, this is a rare example
entry for any projects that meet the broad criterion of of an evaluation that systematically follows-up ef-
developing advanced technology with a market ori- fects during the market phase. A further innovation
entation. There is no central budget and only a very by EUREKA in late 1999 was the convening of key
small central secretariat coordinating decisions taken participants in all past and present evaluations to
by representatives of the Member States who choose review their collective findings and the degree to
whether or not, and by how much, to fund their own which these had been adopted. One key recommen-
nationals’ participation in collaborative projects. For dation Žthe need for post-project support for small
the first decade of EUREKA’s existence, evaluation firms. had recurred in each evaluation. Its lack of
took place optionally at the behest Žand expense. of adoption was seen as a measure both of its impor-
the Member State holding the Chair for that year. tance and of the lack of an adequate system to
After an administratively oriented panel exercise follow-up and learn from evaluation findings.
in 1991, the largest evaluation of EUREKA, and
indeed of any European initiative to date, took place
during 1992r1993. This involved teams from 14 4.4. SEMATECH
countries working together, with a survey of all
participants and interviews with about three partici- SEMATECH, the US government industry re-
pants from each of a selection of projects. Findings search consortium founded in 1987, has been the
are in Ormala et al. Ž1993., and a description of the subject of several academic studies, each seeking
process appears in Dale and Barker Ž1994.. This different measures of its effectiveness and cost-ef-
evaluation stood at a crossroads in terms of Euro- fectiveness. SEMATECH originally consisted of a
pean approaches as it involved an explicit combina- consortium consisting of 14 US firms in the semi-
tion of tools developed during the Alvey and French conductor industry Žtogether accounting for 80% of
impact studies, with further inputs from the Nordic US semiconductor component manufacturing. and
and German evaluation traditions Ždescribed respec- the Defense Advanced Research Projects Agency
tively in Ormala, 1989 and Kuhlmann, 1995.. Not ŽDARPA. of the US Department of Defense. Private
surprisingly, it has provided a template for many investment in the consortium has typically totalled
exercises that followed, within and outside EU- US$100 million annually, matched by government
REKA. In particular, the questions developed for the funds. As of 1998, SEMATECH is totally privately
survey, on topics such as the additionality of collabo- supported, and the number of member firms has
rative research and on the relation of R & D to firm declined.
strategy, have been extensively reproduced and The broadest assessment of SEMATECH was
adapted. conducted by Grindley, Mowery, and Silverman,
Since 1996, EUREKA has adopted a new ap- who published their results in 1994. The authors
proach which, though conventional in the tools it drew upon the available literature, SEMATECH
uses Žadaptations of the survey instruments from the archives, and interviews with SEMATECH managers
1993 evaluation., has been innovative in its structure and members of the technical advisory board. They
ŽSand and Nedeva, 1998; Georghiou, 1999.. Known discuss evaluation criteria, observing that these are
as the Continuous and Systematic Evaluation, this particularly problematic because of several features
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 669

of research consortia generally and SEMATECH in obtain benefit estimates in quantified form from
particular: members of research consortia but, consistent with
Ø agreement among the several stakeholders on similar results from a variety of other studies of
goals was lacking; research collaboration and technology transfer, these
Ø similarly, agreement on appropriate evaluation estimates fail to capture the bulk of benefits associa-
criteria was lacking; tion with consortium membership ŽRoessner and
Ø the relevant time horizon for achievement of SE- Bean, 1994; Feller and Roessner, 1995; Roessner et
MATECH’s broadest goals extends beyond the al., 1998..
time of its existence Ž5 years at the time of this The third evaluation, carried out by Irwin and
evaluation.; Klenow and published in 1996, used similarly nar-
Ø it is extremely difficult to establish a counterfac- row evaluation criteria: basically several measures of
tual against which to assess the impact of what is member firm performance as compared with non-
basically a unique arrangement; member firms. Although the major strength of this
Ø SEMATECH has been highly flexible, changing evaluation design is, in principle, the use of an
its goals as conditions changed ŽGrindley et al., implicit control group Žvia regression analyses using
1994, p. 736.. dummy variables to represent memberrnon-member
They conclude that flexibility in goals was crucial status., in reality the fact that SEMATECH members
for SEMATECH’s survival; that allowing for flexi- are the dominant firms in the industry weakens the
bility, its goals have been achieved; and that the value of this usually desirable feature. ŽSee the dis-
political goals of achieving ‘‘world leadership for cussion of the Irwin and Klenow evaluation in the
the US semiconductor industry’’ and ‘‘industry com- chapter by Klette, Moen, and Griliches in this issue..
petitiveness’’ constituted unrealistic criteria against
which the program could be judged.
Link et al. Ž1996. evaluated SEMATECH with 4.5. ATP
considerably more restrictive criteria in mind. They
sought to measure the returns to member companies The US Department of Commerce’s ATP is draw-
from their investments in SEMATECH, restricting ing substantial critical attention from policy-makers,
the returns to those accruing from research alone, and partly for this reason is also the supporter —
excluding benefits from improvements in research and subject — of numerous formal evaluation stud-
management, research integration, and spillovers ies. These studies are worth describing briefly here,
ŽLink et al., 1996, p. 739.. They drew a representa- less because any of them individually represents a
tive sample of eleven projects and then interviewed methodological advance, but because of the sheer
the people in member companies who were most size and variety of the evaluation effort. In the words
knowledgeable about each project. Many companies of the Director of ATP’s Economic Assessment Of-
either estimated a range of benefits or said that they fice, which supports and conducts many of these
experienced significant benefits but could not esti- evaluations, ‘‘It wATPx is probably the most highly
mate them accurately. Respondents also reported that scrutinized program relative to its budget size of any
intangible benefits related to research management, government program to date’’ ŽRuegg, 1998a,b, p.
integration, and spillovers were more important than 3.. The range of research topics being supported by
benefits related to research. The authors proceeded to ATP as of mid-1998 suggests a rich future source of
use these benefit estimates and project cost data Žlargely economic. analyses and evaluations of virtu-
obtained from SEMATECH accounting records to ally every aspect of the program ŽRuegg, 1998b, p.
calculate an internal rate of return for member com- 8..
panies. The ‘‘best’’ estimate of IRR was 63%, a Papers prepared for a recent ATP-sponsored sym-
figure that included both public and private funds posium on evaluation reflect the range of studies
invested in each project, burdened with an appropri- being conducted and a number of issues related to
ate overhead figure, and with projects weighted by evaluation methods. The paper by Jaffe Ž1998. re-
size. This study demonstrates that it is possible to views the extensive past efforts to measure the social
670 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

rates of return from R & D and compare them with estimate the impacts on the US economy of the
private returns. In the case of ATP, of course, mea- increased demand due to quality improvements. The
suring the average rates of net social return to a results showed cumulative effects over 5 years of
portfolio of projects may serve to justify Žor under- US$8.7 billion in increased total industrial output
mine. the rationale for the program, but in the ab- and an increase of more than 160,000 jobs ŽCON-
sence of predictive models directed at the individual SAD Research, 1998..
project level, project selection decisions remain unin- The Republican-led US Congress has been highly
formed about possible spillover effects from individ- critical of ATP, and consequently has initiated a
ual projects. Identifying technological and market number of inquiries concerning its cost-effectiveness
factors that tend to be systematically related to large and political justification. Prominent among the for-
positive differences between the social and private mal evaluation efforts launched by the Congress is a
returns at the project level continues to be a form- large-scale, survey-based evaluation of ATP con-
idable task facing economists. ducted by the General Accounting Office ŽGAO.. An
One of the ATP program’s objectives is to accel- unusual aspect of the evaluation was the use of a
erate the development and commercialization of new comparison group, quasi-experimental design. In ad-
technologies. Laidlaw Ž1998. surveyed 28 projects dition to 89 successful applicants for ATP awards,
funded in 1991 to obtain estimates of the amount 39 ‘‘near winners’’ who had been scored as having
that ATP funding reduced development cycle time, very high scientific and technical merit, had satisfied
and estimates of the economics of reducing cycle the program’s requirements, and had strong technical
time by 1 year. Estimates of cost reductions or and business merit but had not received ATP funding
savings resulting from reduced development cycle were surveyed. The 128 winners and near winners
time ranged from one million dollars to ‘‘billions,’’ were surveyed by telephone using a 76-item, closed-
suggesting that such estimates may be highly unreli- ended instrument. Data were obtained on applicants’
able. In another project reported in the symposium, efforts to obtain funding before applying to ATP,
Link Ž1998. sought estimates of the reduction in and if they intended to pursue their projects whether
research costs attributable to ATP funding, realized or not they received ATP funding. Near winners
by seven companies participating in the Printed were asked about the fate of their projects after
Wiring Board Research Joint Venture. Companies failing to obtain ATP support. The results were used
estimated research cost savings due to workyears by both critics and supporters of ATP to support
saved and testing and machine time saved, and pro- their views ŽU.S. General Accounting Office, 1996..
duction cost savings due to productivity improve- If nothing else, the GAO evaluation demonstrates
ments. Total estimates are reported 2 years into the the difficulties associated with evaluating public pro-
project and at the project’s end. Link provides no grams intended to support pre-competitive technol-
discussion of the reliability of these estimates, how- ogy development in private firms. While relevant
ever. and, apparently in the case of the GAO evaluation,
Researchers at CONSAD Research attempted to reliable data on key questions about pre- and post-
estimate the preliminary impacts on the national award project histories can be obtained from award
economy of an ATP-sponsored project whose objec- winning companies and from a comparison group of
tive was to control dimensional variation in automo- companies, drawing conclusions about whether the
bile manufacturing processes. At the plant level, program is addressing a market failure or not is a
experts were asked to estimate the direct impacts of political rather than an analytical exercise.
the newly developed technologies on the production The GAO experience has been echoed in similar
processes across different assembly plants; other ex- European work on the issue of additionality — what
perts estimated the rate of adoption of the technolo- difference did the intervention make. Traditional
gies by automobile manufacturers and the magnitude treatments have focused on input additionality —
of the impact of the resulting increase in product that is whether the incremental spending by an as-
quality on the sales of automobiles. A macroeco- sisted firm was greater than or equal to the amount
nomic inter-industry model was then employed to of subsidy. Papaconstantinou and Polt Ž1997, p. 11.,
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 671

summarizing an OECD conference, concluded that approach would yield an estimate of whether the
several participants saw: average benefitrcost ratio summed over all projects
is greater than one and, if so, by how much.
a focus on additionality Žthe changes in behaviour The counterfactual approach has been applied to
and performance that would not have occurred several technologies supported by NIST Že.g., Link,
without the programme. as a criterion for success 1996., but not, as far as we know, to overall pro-
is simply a reflection of the difficulty of accu- grams such as ATP and SEMATECH that support
rately measuring spillovers or externalities and multiple technologies. Both the GrilichesrMansfield
thus the net social benefits of programmes. and the LinkrScott approaches rest on industry-re-
ported estimates of private investments in technology
They present a view more consistent with the development. In addition, the consumerrproducer
network models described earlier in this article: that surplus approach requires knowledge of the supply–
the concept of ‘behavioral additionality’ Žinduced demand parameters for process technologies, and
and persistent changes in the innovative behavior of must make fairly heroic assumptions to obtain equiv-
firms. provides a sounder measure of the effects of alent estimates for new products. The counterfactual
intervention ŽBuisseret et al., 1995. in keeping with approach requires estimates by industry of what their
policy rationales founded in the notion of systemic costs would have been in the absence of government
rather than market failure. support Žthus the label, counterfactual.. Both ap-
proaches provide useful insights into the issue of net
4.6. Economic estimates of the net benefits of public returns from public investments in support of new
support of technology deÕelopment technologies, but since both rely, in part, on ex-
pressed preferences rather than revealed preference
Although our focus in this article is on methods they embody a subjective element and must be used
for evaluating technology programs, in contrast to and interpreted with that subjectivity in mind.
either more aggregate levels of analysis Že.g., poli-
cies. or less aggregate levels Že.g., individual pro-
jects or technologies., we would be remiss if we 5. Evaluation of diffusion and extension pro-
omitted reference to methods that economists have grams 1
developed to measure the net social benefits of pub-
lic support for technology development. The con- For several decades, industrialized nations have
sumerrproducer surplus method, pioneered by initiated a variety of programs intended to promote
Griliches Ž1958. and Mansfield et al. Ž1977., uses the diffusion of industrial technology within their
estimates of producer and consumer surplus to mea- borders. The primary purpose of most of these pro-
sure the social and private rates of return to invest- grams has been to enhance the competitiveness of
ment in particular technologies. Link and Scott Ž1998. target firms, which in turn is expected to increase the
have used what they term a ‘‘counterfactual’’ model competitiveness — and, hence, the economic growth
to determine the relative efficiency of public vs. — of the nation or region in which the target firms
private investment in technologies with public good operate. Some of these programs single out new,
characteristics. In principle, these two approaches state-of-the-art technologies for diffusion; others em-
could be applied to the individual projects or tech- phasize best practice technologies and techniques;
nologies that comprise the portfolio of technologies still others focus as much or more on business
in a publicly supported program of technology R & D,
and the results aggregated to evaluate the entire
program. The consumerrproducer surplus approach 1
would yield an estimate of whether, and by how Although purists may argue that the terms signify differences
among program goals or practices, for our purposes, the programs
much, the average social rate of return over all whose evaluation we discuss in this section can be labelled
projects or technologies in the portfolio exceeds the interchangeably as industrial modernization, industrial extension,
average private rate of return. The counterfactual or manufacturing extension.
672 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

practices, training, and marketing as on technology to repeat the litany of formidable barriers has served
itself. In most instances, target firms are small- and as a convenient rationale for not conducting evalua-
medium-sized manufacturing enterprises ŽSMEs., tions or as an umbrella justification for limited ex-
typically defined as firms with fewer than 500 em- pectations or egregious shortcomings.’’ In the re-
ployees. Program staff seek to deliver a range of mainder of this section, we focus on these few
services whose number and mix vary widely across exceptions and on the progress that has been made in
programs in the general terrain of training, advice to the last 4 years.
firms and promotion of linkages. In their 1996 review of methods used to evaluate
Each of the 50 states in the United States now has industrial modernization programs, Shapira et al.
at least one industrial modernizationrextension pro- Ž1996. concluded that ‘‘most of the evaluation meth-
gram. States have been the initiators of these types of ods used by industrial modernization programs are
programs, with significant industrial extension pro- either indirect Žprogram monitoring or external re-
grams in several states dating back to the 1960s. Not views. or implemented with one data point after
until very recently did the US federal government service completion’’ Žp. 202.. Cost and the lack of
begin to support a nationwide program of such pro- demand for more sophisticated evaluation strategies
grams under the Manufacturing Extension Program largely account for the situation; as Shapira and
ŽMEP. within the Commerce Department’s NIST. Youtie Ž1998. observed recently, there appears to be
The MEP has an active evaluation program, and no direct correlation between the usefulness of an
most larger evaluative efforts to date and the attempt evaluation method with that method’s degree of so-
to create a community of evaluation researchers fo- phistication or use of controls. The state of the
cused on MEP-supported extension centers have been evaluation art is thus defined by a small number of
supported or encouraged by this group. Although studies that employ relatively rigorous designs: the
state-supported programs have existed for decades, identification and use of comparison groups and the
and the national effort is now ten years old, the state use of statistical controls to achieve internal validity
of the art in evaluating industrial modernization pro- of results.
grams Žat least in the US. is just beginning to evolve Identification of appropriate comparison groups 2
from a fairly primitive state. In a 1996 special issue is a daunting task because, among other reasons,
of this journal devoted to the evaluation of industrial firms receiving services from industrial modernatiza-
modernization programs, Shapira and Roessner tion programs Žclients. are self-selected, automati-
Ž1996. observed that at the time there was consider- cally distinguishing them in unknown ways from
able experimentation, innovation, and mutual learn- non-client firms. At least two prominent evaluations
ing, but ‘‘to date there have been comparatively few have used different approaches to identifying a com-
systematic evaluation efforts in the industrial mod- parison group of non-client firms. The Michigan
ernization field’’ Žp. 182.. They cited the newness of Manufacturing Technology Center ŽMMTC., one of
the field, the small amount of resources devoted to the first NIST manufacturing centers, has evaluated
evaluation, and lack of agreement on designs, mea- the impact of 5 years of service delivery to SMEs by
sures, and who should manage the evaluations. comparing a set of performance measures collected
Feller et al. Ž1996. offer a harsh assessment of the from client firms to performance measures on the
reasons for the relatively primitive state of the evalu- same variables from firms participating in the Perfor-
ation art as of 1996. They discount the usual reasons mance Benchmarking Service ŽPBS., a separate ser-
for such situations, namely inherent conceptual prob-
lems or difficulties in obtaining data, claiming that in
other program areas such as manpower training,
2
medicine, and education similar issues have not fore- In this article, we use the term ‘‘comparison’’ groups rather
stalled the emergence of a substantial evaluation than ‘‘control’’ groups to distinguish the use of quasi-experimen-
tal designs employing the logic of comparing firms that received
research literature, tradition, and practitioner com- services from similar firms that did not receive services, from true
munity Žp. 313.. With few exceptions, they say, experimental designs in which treatment and control groups are
referring to the state of US evaluation, ‘‘the ability selected at random from a population of potential target firms.
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 673

vice operating under the auspices of the MMTC. 3 ufacturers and a 1-year follow-up of GMEA cus-
PBS currently has about 3000 firms in its data set. tomers; and
Shapira and Youtie’s evaluation of the Georgia Man- Ø a series of case studies to provide in-depth infor-
ufacturing Extension Alliance ŽGMEA. uses a differ- mation about the effects of GMEA services on
ent method of identifying a comparison group firm operations and profitability.
ŽShapira and Youtie, 1998.. The GMEA evaluation In addition to the obvious analyses of these data,
sent a ‘‘control’’ survey to all manufacturers in the Shapira and Youtie combined data on the private and
state of Georgia with more than ten employees. The public returns to GMEA-related investments and di-
1000 responses received were weighted to reflect the vided this figure by private plus public investment to
actual distribution of manufacturers by industry and obtain a benefitrcost ratio. Private returns included
employment size. Then GMEA client performance, increases in sales, savings in labor, materials, energy,
measured by value added per employee, was com- or other costs, reductions in the amount of inventory,
pared with the weighted sample of all Georgia manu- and avoidance of capital spending. Private invest-
facturers. ment included estimates of the value of customer
Jarmin Ž1999., in an interesting effort to measure staff time commitment, increased capital spending,
the performance of MEP client firms against non-cli- and fees. Public investment included federal, state
ent manufacturers, used the US Census Bureau’s and local program expenditures. Public returns were
Longitudinal Research Database ŽLRD. to create a measured by federal, state, and local taxes paid by
nonclient comparison group. Jarmin’s evaluation used companies and their employees, estimated from sales
data from eight MEP centers in two states. He linked increases or job creationrretention. The analysis ac-
client data obtained from these eight centers to the counted for zero sum effects Ži.e., sales increases
LRD by using the Standard Statistical Establishment may be at the expense of other firms in the state. by
List ŽSSEL., which uses the same identifiers as the adjusting sales to about 30% of the reported sales in
LRD. Then he linked clients to the SSEL by creating the benefitrcost model.
matching variables such as firm name and zip code The results of these evaluations raise two kinds of
from the several fields that are common between the issues. First, there are considerable methodological
two data sets. Jarmin controlled for selection bias, a issues. Most are related to the complexity of the
serious problem in all such comparison group analy- programs themselves and to the inability of evalua-
ses, by estimating a Heckman-style two-stage model tors to identify a ‘‘true’’ control group of non-client
ŽJarmin, 1999, p. 103.. firms. Second, there are political issues related to
The GMEA evaluation employed a very wide differences between what the evaluation community
range of evaluation techniques to gather and analyze and the practitionerrpolicy community regard as the
data, thus exemplifying perhaps the most compre- most useful and credible attributes of evaluations.
hensive state-of-the-art in industrial modernization With regard to the first issue, it has proven highly
evaluation. In particular, they developed: problematic to sort out the effects of widely varying
Ø a customer profile assembled by program person- lengths and types of services delivered to client
nel at the point of initial contact with a customer; firms, which are accompanied by widely varying
Ø activity reports that track field agent activities and levels of resource commitment by the clients them-
customer interactions; selves. Aggregating across any significant number of
Ø client valuation surveys administered by mail to clients masks enormous variations in these two key
each customer when all major GMEA services variables. Further complicating this aspect of evalua-
had been completed; tions is the difficulty that firms have estimating the
Ø progress tracking via benchmarking and non- dollar value of industrial modernization program ser-
customer controls via the state survey of all man- vices for their operations.
The clients for evaluations of industrial modern-
ization programs are as widely varied as the pro-
3
This description of the MMTC evaluation is based on Dziczek grams themselves. Federal officials, state-level
et al. Ž1998.. politicians, and practitioners all seek and value dif-
674 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

ferent kinds of information, and have different crite- policy studies are related — in other words, that
ria of ‘‘ value.’’ As Shapira and Youtie Ž1998. point broad social and political trends influence the prac-
out, US federal sponsors have little interest in mea- tice of technology program evaluation. At certain
sures of customer satisfaction, but instead seek mea- times, evaluation studies have been at the leading
sures of economic impacts. Information about the edge of technology policy studies, for example in
relative impact of different program services is of eliciting an understanding of collaborative R & D and
vital interest to program managers and field staff, but of networking more generally. In other respects, they
has little value for those charged with justifying the have tended to lag behind for methodological and
program’s existence. Testimonials — basically su- political reasons. The demand for evidence of direct
perficial case studies — prove useful for program economic effects has left many evaluation studies
justification, but more formal cases, conducted using coupled to a more linear view of the world than is
rigorous social science methods, have no greater common in the mainstream of technology policy
impact for justification purposes. They can be highly studies. In general, though, evaluation may be seen
valued, however, by program managers and field as complementary to work carried out in other
staff because they reveal the paths by which services branches of the field, for example, more aggregated
are translated into changes in client operations and, studies of national performance or more detailed
in turn, into changes in client productivity. work at the level of the organization. Evaluation
European experiences of evaluating diffusion and work has probably had less of an impact in the
extension programs are most extensive in Germany, literature than it deserves, in part because much of
as a consequence of that country’s technology policy the most detailed and valuable work is not easily
traditions. However, an assessment equally as harsh obtainable. There is a disturbing tendency for evalua-
as those in the US has been made by Kuhlmann and tion data that could form a valuable reference point
Meyer-Krahmer Ž1995. about the state-of-the-art: for future studies to be lost in the grey literature,
especially for those in countries other than the one in
The practice of evaluation of technology policy in which the study was performed. Evaluators should
Germany, when critically surveyed, does not nur- make greater efforts to ensure that their main find-
¨ illusions about satisfying impact
ture any naıve ings are also published in more conventional media.
control. So far, evaluations have scarcely done This problem also has inhibited the effectiveness of
more than provide evidence that technology pol- evaluation itself. Despite their limitations, most stud-
icy interventions correlate with trends in techno- ies would benefit from a greater use of benchmark-
logical and industrial development. ing and comparison group approaches. The use of
archival data and multivariate methods should con-
They conclude that progress is dependent, among
tinue to be explored as one means of constructing
other things, upon more objective data Žless reliant
analytical controls.
on perceptions of participants., integration of evalua-
Looking at the broader role of evaluation in pol-
tion with prospective methods for technology assess-
icy, some interesting changes are on the horizon. 4
ment in order to assess longer term unintended im-
In the US, passage of the Government Performance
pacts of public intervention in technology. In re-
and Results Act ŽGPRA. in 1993 has galvanized the
sponse, a network of European evaluators has been
attention of the R & D community. GPRA calls for
attempting to unify evaluation with both TA and
annual ‘‘performance reviews’’ at relatively high
foresight approaches under the umbrella of a Euro-
levels of agency activity, generally well above the
pean Strategic Intelligence System ŽKuhlmann et al.,
1999..

4
For a comparison of research evaluation, practices in two
6. Conclusions different political settings, the US and Canada, and of the differ-
ences between GPRA and Canada’s earlier, formal requirement
In Section 1, we observed that evaluation methods that program evaluation be an integral part of the budget process,
and the methods used for more general technology see Roessner and Melkers Ž1997..
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 675

program level. It also contains an ‘‘escape clause’’ and measures of value, for the products of research.
that enables agencies supporting fundamental re- Methods that emphasize the dimensions of human
search to utilize qualitative measures of performance capital and institutional development offer fertile
rather than the quantitative ones favored in the legis- ground for development.
lation. Program evaluation activity has increased in Despite their widespread use as techniques for
response to GPRA, under the apparent assumption obtaining estimates of the benefits of research, and
that these evaluations will help support agency GPRA especially of linkages between research institutions,
requirements. Appropriately or not, program evalua- surveys of the intended beneficiaries are problematic.
tions and performance reviews are becoming entan- The studies we reviewed in this article document
gled, and it appears to us that, as pressure to develop some of the difficulties: within the ‘‘user’’ organiza-
performance measures and justify program budgets tion, those benefiting most from the interaction be-
increases as a result of GPRA, the lines between the tween external sources of knowledge and technology
two will become blurred. A similar tendency has also may not be the same as those who interact directly
emerged in Europe, with monitoring and perfor- with such sources and know the most about the
mance indicator approaches intersecting with evalua- interaction; the greatest benefits are longer term and
tions, and in some cases, competing. This raises qualitative rather than short-term and quantitative,
some important issues, not the least of which echoes and are thus difficult to estimate in monetized form;
our earlier observation that valid program evalua- efforts to obtain and validate such estimates conflict
tions must, increasingly, account for the systemic with firms’ proprietary concerns and with their prior-
context in which they are performed. Technology ities for allocating staff time Ži.e., responding to
programs do not exist in a vacuum, either politically surveys.; the inherently risky and long-term nature
or theoretically. The short-term yet aggregate per- of the innovation process itself often precludes reli-
spective of GPRA’s performance reviews conflicts able estimates of the payoffs from research. Where
with the longer-term, program-level, yet context-de- monetized estimates are obtained it must be recog-
pendent perspective of technology program evalua- nised that these embody the Žunknown. model of
tions. The fundamental requirement for the design of innovation and consequent attribution of benefits that
a performance indicator regime is a clear understand- the respondent is applying. The tendency to broaden
ing of context, goals and the relationships which link the range of acknowledged stakeholders in R & D
goals to effects. Whether this important distinction programs that has emerged recently in Europe pro-
will be recognized and dealt with by government vides another problem for survey approaches as many
officials and the evaluation community remains to be of these ‘‘social’’ groups are too distant from the
seen. research to provide detailed and structured responses.
Limitations of the production function approach We are not advocating that surveys be shelved as
to measuring the payoffs from research are now appropriate evaluation tools, but we do urge that the
apparent, but promising new directions are just subtleties now appearing in reported research be
emerging. Return on Investment and Internal Rate of anticipated and incorporated in future evaluation de-
Return measures, despite their quantitative appeal, signs. We also encourage, as mentioned above, the
have limited value for program justification and even creative construction of comparison groups, as was
less for R & D program management decisions. done in the GAO evaluation of the ATP program,
Methods are needed that capture more fully the and the use of case studies as the preferred method
noneconomic benefits from research — or at least for understanding what actually transpires in the
the benefits not easily translated into monetized form complex process of technological innovation.
by those who receive them directly Že.g., private We made the point that progress in technology
firms. or who seek to develop valid metrics Že.g., program evaluation methods has been incremental
professional evaluators.. We look forward to the over the past 15 years. Evaluation, as the applied end
outcome of efforts that focus on the characteristics of of technology policy studies, will continue to grapple
networks and linkages between knowledge producers with real-world problems such as lack of clarity in
and knowledge users as possible units of analysis, objectives and the need to meet multiple and often
676 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

conflicting stakeholder needs. On the positive side does it make? Additionality in the public support of R&D in
evaluation is an adaptable beast and through a cre- large firms. International Journal of Technology Management
10 Ž4–6., 587–600.
ative combination of methods has often managed to
´
Callon, M., Laredo, P., Mustar, P., 1997. Technico-economic
provide illumination and rational analysis in the most networks and the analysis of structural effects. In: Callon, M.,
difficult of circumstances. It also continues to have ´ P., Mustar, P. ŽEds.., The Strategic Management of
Laredo,
much to offer to technology policy studies as a Research and Technology — Evaluation of Programmes. Eco-
whole. There is a trade-off in which evaluators have nomica International, Paris.
Cameron, H., Georghiou, L., 1995. Managerial performance —
to work within more restrictive terms of reference
´ P., Mustar, P.
the process evaluation. In: Callon, M., Laredo,
than other researchers, but in return gain access to a ŽEds.., The Strategic Management of Research and Technol-
depth of data that would otherwise not be possible. ogy — Evaluation of Programmes. Economica International,
They also provide a conduit by means of which the Paris.
broader concepts of technology policy studies are CONSAD Research, 1998. Estimating economic impacts of new
dimensional control technology applied to automobile body
carried into the policy arena, often at the highest manufacturing. Journal of Technology Transfer 23 Ž2., 53–60.
levels. Our review suggests to us that there are a Cozzens, S., Popper, S., Bonomo, J., Koizumi, K., Flanagan, A.,
number of promising directions that evaluation meth- 1994. Methods for Evaluating Fundamental Science.
ods might take that could lead to significant ad- RANDrCTI DRU-875r2-CTI, Washington, DC.
vances. We hope we are right, and that agencies Dale, A., Barker, K., 1994. The evaluation of EUREKA: a
pan-European collaborative evaluation of a pan-European col-
supporting such work recognize the potential and act laborative technology programme. Research Evaluation 4 Ž2.,
upon it. 66–74.
Dasgupta, P., David, P.A., 1994. Toward a new economics of
science. Research Policy 23, 487–521.
References Denison, E.F., 1979. Accounting for Slower Economic Growth:
The United States in the 1970s. Brookings, Washington, DC.
Ailes, C.P., Roessner, D., Feller, I., 1997. The Impact on Industry Dziczek, K., Luria, D., Wiarda, E., 1998. Assessing the impact of
of Interaction with Engineering Research Centers. SRI Interna- a manufacturing extension center. Journal of Technology
tional, Arlington, VA, January 1997. Final Report prepared for Transfer 23 Ž1., 29–35.
the National Science Foundation, Engineering Education and Fayl, G., Dumont, Y., Durieux, L., Karatzas, I., O’Sullivan, L.,
Centers Division. 1998. Evaluation of research and technological development
Airaghi, A., Busch, N.E., Georghiou, L., Kuhlmann, S., Ledoux, programmes: a tool for policy design. Research Evaluation 7
M.J., van Raan, A.F.J., Viana Baptista, J., 1999. Options and Ž2..
Limits for Assessing the Socio-Economic Impact of European Feller, I., Roessner, D., 1995. What does industry expect from
RTD Programmes. ETAN, Commission of the European Com- university partnerships? Issues in Science and Technology,
munities, January 1999. Fall 1995, 80–84.
Bach, L., Conde-Molist, N., Ledoux, M.J., Matt, M., Schaeffer, Feller, I., Glasmeier, A., Mark, M., 1996. Issues and perspectives
V., 1995. Evaluation of the economic effects of BRITE- on evaluating manufacturing modernization programs. Re-
EURAM programmes on the European industry. Scientomet- search Policy 25 Ž2., 309–319.
rics 34 Ž3., 325–349. Georghiou, L., 1995. Assessing the framework programmes — a
Battelle Columbus Laboratories, 1973. Interactions of Science and meta-evaluation. Evaluation 1 Ž2., 171–188.
Technology in the Innovative Process: Some Case Studies. Georghiou, L., 1998. Issues in the evaluation of innovation and
Battelle Columbus Laboratories, Columbus. technology policy. Evaluation 4 Ž1., 37–51.
Bozeman, B., Rogers, J., 1999. ‘‘Use-and-Transformation’’: An Georghiou, L., 1999. Socio-economic effects of collaborative
Elementary Theory of Knowledge and Innovation. School of R&D — European experiences. Journal of Technology Trans-
Public Policy, Georgia Institute of Technology, Atlanta, GA fer 24, 69–79.
Ždraft paper.. Gibbons, M., Georghiou, L., 1987. Evaluation of Research — A
Bozeman, B., Papadakis, M., Coker, K., 1995. Industry Perspec- Selection of Current Practices. OECD, Paris.
tives on Commercial Interactions with Federal Laboratories. Griliches, Z., 1958. Research costs and social returns: hybrid corn
Georgia Institute of Technology, School of Public Policy, and related innovations. Journal of Political Economy.
Atlanta, GA, January 1995. Grindley, P., Mowery, D.C., Silverman, B., 1994. SEMATECH
Bozeman, B., Rogers, J., Roessner, D., Klein, H., Park, J., 1998. and collaborative research: lessons in the design of high-tech-
The R&D Value Mapping Project: Final Report. Report to the nology consortia. Journal of Policy Analysis and Management
Department of Energy, Office of Basic Energy Sciences. 13 Ž4., 723–758.
Georgia Institute of Technology, Atlanta, GA. Guy, K., Arnold, E., 1986. Parallel Convergence. Frances Pinter,
Buisseret, T., Cameron, H., Georghiou, L., 1995. What difference London, pp. 32–67.
L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678 677

Guzzetti, L., 1995. A brief history of European Union research Link, A.N., 1996. Evaluating Public Sector Research and Devel-
policy, Commission of the European Communities, October opment. Praeger, New York.
1995, pp. 76–83. Link, A.N., 1998. Case study of R&D efficiency in an ATP joint
Hall, B.H., 1995. The private and social returns to research and venture. Journal of Technology Transfer 23 Ž2., 43–51.
development: what have we learned? Paper presented at AEI- Link, A.N., Scott, J.T., 1998. Public Accountability: Evaluating
Brookings Conference on The Contributions of Research to Technology-Based Institutions. Kluwer, Boston.
the Economy and Society, Washington, DC, October 3 1994, Link, A.N., Teece, D.J., Finan, W.F., 1996. Estimating the bene-
Revised 1995. fits from collaboration: the case of SEMATECH. Review of
Hertzfeld, H., 1992. Measuring the returns to space research and Industrial Organization 11 Ž5., 737–751.
development. In: Hertzfeld, H., Greenberg, J. ŽEds.., Space Mansfield, E., 1980. Basic research and productivity increase in
Economics. American Institute of Astronautics, Washington. manufacturing. American Economic Review 70.
Hills, P.V., Dale, A., 1995. Research Evaluation 5 Ž1., 35–44. Mansfield, E., 1991. Academic research and industrial innovation.
Illinois Institute of Technology Research Institute, 1968. Technol- Research Policy 20, 1–12.
ogy in Retrospect and Critical Events in Science. National Mansfield, E., Rapoport, J., Romeo, A., Wagner, S., Beardsley,
Science Foundation, Washington. G., 1977. Social and private rates of return from industrial
Irwin, D., Klenow, P., High-tech R&D subsidies — estimating innovations. Quarterly Journal of Economics 91 Ž2., 221–240.
the effects of SEMATECH. Journal of International Eco- Martin, B., Irvine, J., 1983. Assessing basic research: some partial
nomics, 40, 323–44. indicators of scientific progress in radio astronomy. Research
Jaffe, A., 1989. Real effects of academic research. American Policy, North-Holland.
Economic Review 79 Ž5., 957–970. Meyer-Krahmer, F., Montigny, P., 1990. Evaluations of innova-
Jaffe, A.B., 1998. The importance of ‘spillovers’ in the policy tion programs in selected European countries. Research Policy
mission of the advanced technology program. Journal of Tech- 18 Ž6., 313–332.
nology Transfer 23 Ž2., 11–19. Narin, F., Hamilton, K.S., Olivastro, D., 1997. The increasing
Jarmin, R.S., 1999. Evaluating the impact of manufacturing exten- linkage between US technology and public science. Research
sion on productivity growth. Journal of Policy Analysis and Policy 26 Ž3., 317–330.
Management 18 Ž1., 99–119. National Audit Office, 1995. The Department of Trade and Indus-
Kuhlmann, S., 1995. German government departments’ experi- try’s Support for Innovation, HMSO HC715, London.
ence of RT and D programme evaluation and methodology. Olds, B.J., 1992. Technological Eur-phoria? An Analysis of Euro-
Scientometrics 34 Ž3., 461–471. pean Community Science and Technology Programme Evalua-
Kuhlmann, S., Meyer-Krahmer, F., 1995. Practice of technology tion Reports. Beliedsstudies Technologie Economie, Rotter-
policy evaluation in Germany: introduction and overview. In: dam.
Becher, G., Kuhlmann, S. ŽEds.., Evaluation of Technology Ormala, E., 1989. Nordic experiences of the evaluation of techno-
Policy Programmes in Germany. Kluwer, Dordrecht. logical research and development. Research Policy 18 Ž6.,
Kuhlmann et al., 1999. Final Report of the Advanced Science and 343–359.
Technology Policy Planning Network ŽASTPP. A Thematic Ormala, E., et al., 1993. Evaluation of EUREKA Industrial and
Network of the European Targeted Socio-Economic Research Economic Effects. EUREKA Secretariat, Brussels.
Programme ŽTSER.: Report to Commission of the European Papaconstantinou, G., Polt, W., 1997. Policy evaluation in innova-
Communities Contract No. SOE1-CT96-1013. tion and technology: an overview. In: OECD Proceedings,
Laidlaw, F.J., 1998. ATP’s impact on accelerating development Policy Evaluation in Innovation and Technology — Towards
and commercialization of advanced technology. Journal of best practices, OECD, Paris.
Technology Transfer 23 Ž2., 33–41. Pavitt, K., 1998. Do patents reflect the useful research output of
´ P., Mustar, P., 1995. France, the guarantor model and the
Laredo, universities? Research Evaluation 7 Ž2., 105–111.
institutionalisation of evaluation. Research Evaluation 5 Ž1., Peterson, J., Sharp, M., 1998. Technology Policy in the European
1995. Union. Macmillan, Basingstoke, p. 210.
´
Laredo, P., Callon, M., et al., 1990. L’impact de programmes Roessner, D., 1989. Evaluating government innovation pro-
communitaires sur le tissu scientifique et technique français. grammes: lessons from the US experience. Research Policy 18
La Documentation Française, Paris. Ž6..
Levy, D.M., Terleckyj, N., 1984. Effects of government R&D on Roessner, D., Bean, A.S., 1994. Payoffs from industry interaction
private R&D investment and productivity. Bell Journal of with federal laboratories. Journal of Technology Transfer 20
Economics 14, 551–561. Ž4..
Link, A., 1981. Basic research and productivity increase in manu- Roessner, D., Coward, H.R., 1999. An Analytical Synthesis of
facturing: additional evidence. American Economic Review 71 Empirical Research on U.S. University–Industry and Federal
Ž5., 1111–1112. Laboratory–Industry Research Cooperation. SRI International,
Link, A., 1982. The impact of federal research and development Washington, DC. Final Report to the National Science Foun-
spending on productivity. IEEE Transactions on Engineering dation, Žforthcoming..
Management EMr29 Ž4., 166–169. Roessner, D., Melkers, J., 1997. Evaluation of national research
678 L. Georghiou, D. Roessnerr Research Policy 29 (2000) 657–678

and technology policy programs in the United States and ships for universities: a case study of the UK teaching com-
Canada. Journal of Evaluation and Program Planning 20 Ž1.. pany scheme. Science and Public Policy 24 Ž3..
Roessner, D., Ailes, C., Feller, I., Parker, L., 1998. Impact on Shapira, P., Roessner, J.D., 1996. Evaluating industrial modern-
industry of participation in NSF’s engineering research cen- ization: introduction to the theme issue. Research Policy 25
ters. Research-Technology Management 41 Ž5., 40–44. Ž2., 181–183.
Rogers, J., Bozeman, B., 1998. Creation of Knowledge Value: A Shapira, P., Youtie, J., 1998. Evaluating industrial modernizatin:
Taxonomy of Knowledge Value Alliances. Paper prepared for methods, results and insights from the georgia manufacturing
presentation at Workshop on Research Evaluation, Ecoles des extension alliance. Journal of Technology Transfer 23 Ž1.,
Mines, Paris, France, June 30–July 2. 17–27.
Ruegg, R.T., 1998a. Symposium overview. Journal of Technology Shapira, P., Youtie, J., Roessner, J.D., 1996. Current practices in
Transfer 23 Ž2., 3–4. the evaluation of US industrial modernization programs. Re-
Ruegg, R.T., 1998b. The Advanced Technology Program, its search Policy 25 Ž2., 185–214.
evaluation plan and progress in implementation. Journal of Sherwin, C.W., Isenson, R.S., 1967. Project hindsight: defense
Technology Transfer 23 Ž2., 5–9. department study of the utility of research. Science 156,
Sand, F., Nedeva, M., 1998. The EUREKA Continuous and 1571–1577.
Systematic Evaluation procedure: an assessment of the socio- Tewksbury, J.G., Crandall, M.S., Crane, W.E., 1980. Measuring
economic impact of the international support given by the the societal benefits of innovation. Science 209, 658–662.
EUREKA Initiative to industrial R&D co-operation. Proceed- US Congress, General Accounting Office. Measuring Perfor-
ings of the APEC Symposium on the Evaluation of S&T mance: The Advanced Technology Program and Private-Sec-
Programmes among APEC Member Economies, 2–4 Decem- tor Funding. GAOrRCED-96-47, January 1996.
ber, Wellington, New Zealand, National Center for Science U.S. Congress, Office of Technology Assessment, 1986. Research
and Technology Evaluation, Ministry of Science and Technol- Funding as an Investment: Can We Measure the Returns?
ogy of China. Office of Technology Assessment, Washington, DC.
Senker, J., Senker, P., 1997. Implications of industrial relation-

You might also like