
Preface

This manuscript is intended to become a textbook for students interested in the economic analysis
of technological change. The manuscript is written with the intention to make it attractive to both
economics and engineering students. This intention has led to two special considerations. First,
the text does not assume a deep knowledge of economic theory. Where concepts from economic
theory are used, they are always explained in the text, in an appendix (although at this stage these
appendices are not yet available), or in an envisaged study guide for engineering students. Second,
the chapters presented here use a fair amount of examples from the history of technology. Some
economics students might therefore find the text rather unusual, while engineering students will
be acquainted with most of the examples used. It is hoped that both students will learn something
from the examples. Economics students may learn how their standard toolbox can be applied to
issues they may have considered to be outside their normal domain of study, and engineering
students may learn that it is not only engineering that makes technology. Economic factors are
at least as important for the successful use of technology in society as are sound engineering
principles.

One important issue concerns the level of mathematics used in this text. Engineering students may
find the mathematical arguments somewhat easier to swallow than the average economics student.
Although some of the formulas in the text may look rather nasty, the level of mathematics used
here is not very advanced, and anybody with a fair level of calculus should be able to understand
the formalities used in this text.

This version of the book is incomplete and in draft form. I therefore welcome comments of any kind.
These may be directed to my address below. I have already benefitted from such comments given
by Marjolein Caniëls, Gerald Silverberg, Eddy Szirmai, the students at the Oslo Summerschool
in Comparative Social Sciences 2000, students at the ECIS PhD course, and undergraduate
students at the TEMA programme at TU/e. Their help has been very valuable. Nevertheless, I
alone remain responsible for errors or the views expressed.

Bart Verspagen

ECIS - Eindhoven University of Technology


PO Box 513
5600 MB Eindhoven
Netherlands

b.verspagen@tm.tue.nl
Chapter 1.
An Economic View on Technological Change and
Innovation
1.1. Invention, Innovation and Diffusion

Every reader of this book has experienced the impact of technological change and innovation in everyday life. You may have downloaded the book from the Internet as a pdf document, or bought it through a local bookstore in hardcopy format. The hardcopy is the traditional format that you have been familiar with ever since you started to look at books in your childhood. The pdf Internet version is something quite new relative to the traditional format. But both technologies would have been unimaginable to the 14th-century monk who spent his days copying manuscripts with a feather and inkpot.

When you think about it, such amazing successions of technologies occur in almost every aspect
of our daily lives and the economy in which we consume and produce. New technologies may
thrive, but most of them (although not all) will at some point become obsolete. Consider, for example, the innovations in Table 1. These are taken from a list of so-called basic innovations that
was compiled by various scholars interested in the impact of technology on economic growth. All
of these innovations had a major impact on the economy at the time immediately following their
introduction (that is why they were selected in the list). But most of them are no longer used
today for commercial purposes.

Figure 1.1. Jan Pieter Minckelers’ statue in Maastricht

Besides economic fortune, a successful major innovation may bring its inventor fame. Consider,
for example, the case of street lighting by means of gas. One important figure in the history of this
major innovation was Jan Pieter Minckelers, who was born in Maastricht (in what later became
the Netherlands) in 1748. Minckelers is now honoured in Maastricht with a statue on the market square (see Figure 1.1). In 1781, he invented what was called ‘inflammable air’ by crushing charcoal. This
gas was used in 1783 to fly a balloon over a distance of about 25 kilometres. Later on, Minckelers
started to use the gas to lighten the lecture hall in which he gave his lectures as a university
professor in Leuven.

Table 1. Major innovations, 1764-1835


Innovation year
Spinning machine 1764
Steam engine 1775
Automatic band loom 1780
Sliding carriage 1794
Blast furnace 1796
Steam ship 1809
Whitney's method 1810
Crucible steel 1811
Street lighting (gas) 1814
Mechanical printing press 1814
Lead chamber process 1819
Quinine 1820
Isolated conduction 1820
Rolled wire 1820
Cartwright's loom 1820
Steam locomotive 1824
Cement 1824
Puddling furnace 1824
Pharma fabrication 1827
Calcium chlorate 1831
Telegraphy 1833
Urban gas 1833
Rolled rails 1835

But it is often less vividly remembered that, besides major technological breakthroughs, relatively ‘minor inventions’ can also have a tremendous economic impact. These are often linked to
the further refinement and development of the basic innovations in Table 1. Consider, for
example, the steam engine, dated in the table at 1775. The particular steam engine referred to in
the table is the engine invented by James Watt, who at the time was an instrument maker at
Glasgow University. What James Watt did was in fact to improve greatly upon a design of a
steam engine that was made earlier by Newcomen (1712). The Newcomen engine was useful for
pumping water out of mines, but due to certain flaws in its design, it could not be employed as
a source of power in other industries.

Watt’s engine greatly improved the possibilities for application of steam as a universal power
source in the emerging manufacturing industry (although during the period that is now known as
the Industrial Revolution, water wheels remained the dominant source of power), and hence his
invention is classified as a major innovation in the table. But the standard set by Watt’s engine was
hardly the best attainable in steam power. Alessandro Nuvolari, following on work by Donald
Cardwell, has assembled evidence on this for the region of Cornwall during the early 19th century.
He shows how in a period of approximately 30 years, the performance of steam engines
introduced in this region increased two- to threefold, without the basic design of the type of
engine undergoing a major change.1 In other words, it was relatively minor, or incremental, changes to the engines’ crucial parts, such as boilers and cylinders, that were responsible for the
rapid increase in performance, rather than a revolutionary change of the underlying technology.

Interestingly, two other forms of the steam engine appear in the table: the steam ship (1809) and
the steam locomotive (1824). This sequence from Newcomen’s engine to the steam locomotive
underlines the notion of technology as a system of inventions that logically follow from each
other. I will come back to the implications of this in a later part of this chapter.

The history of technological change abounds with such examples of initial radical breakthroughs
followed by incremental improvements. These improvements take place during the process of
diffusion of the innovation. The Austrian economist Schumpeter, whose work will be reviewed
in chapter 3 below, developed the terminology that is still used nowadays to delineate such stages
in the lifetime of an innovation. He used the terms invention - innovation - diffusion.

Invention refers purely to the technological domain. It denotes the process during which a new
technological artifact is constructed, usually with some idea of its usefulness in the economy or
society. The invention turns into an innovation when it is put onto the market by an entrepreneur.
This can either take the form of a new product that is being sold by the innovator, or a new
process that is being used to produce existing products. Commercial aspects, including the
marketing of the new products, obviously play a large role at this stage. Finally, diffusion refers
to the process that spreads the innovation through the economy at large.

Nowadays, a large part of the activities during the invention - innovation stages of this process
are concentrated in private firms, although universities and (semi-)public research institutes also
play an important role. Diffusion occurs when the innovation leaves the firm in which it was
conceived, and gets adopted by customers as well as imitated by competitors.

The invention - innovation - diffusion distinction has sometimes been taken too literally as a
sequential process. As we have already seen, diffusion of a major innovation (e.g., the steam
engine) is often associated with incremental innovations of the basic design, and these are often
put on the market by firms that compete with (try to imitate) the original innovating firm. Thus
innovation, often incremental innovation, is an essential part of diffusion.

This is why Dosi and others have coined the terms ‘technological paradigms’ and ‘technological
trajectories’. By a technological paradigm, Dosi refers to a “model and pattern of solution of
selected technological problems, based on selected principles from the natural science and on
selected material technologies”. The term is borrowed from Kuhn’s philosophy of science, which
posits that the normal development path of scientific knowledge is heavily selective in terms of
a dominant framework jointly adhered to by the leading scientists in the field. From all the possible
directions that scientific (or, in Dosi’s notion, technological) development may take, only a small portion
(defined by the paradigm) gets realized.

1 In Chapter 4, we will come back to the case of steam engines in Cornwall.

In Dosi’s interpretation, a small number of basic innovations sets out a technological paradigm
that may dominate techno-economic developments for a long time. Along the paradigm, the basic
design of the innovation is constantly altered by incremental innovations, but the basic directions
in which technology develops have already been limited by the choice of paradigm. Still, there is
some room for choices along the paradigm, and these choices are governed by the specific
circumstances in which the technology develops. This development is termed a ‘technological
trajectory’ by Dosi. In the example of the steam engines in Cornwall, the engines were employed
in copper and tin mines. This means that the coal needed to operate the engines needed to be
brought into the mines (in fact, into the Cornish area), and this made it relatively expensive to
operate an engine. Hence, engineers in the business of designing engines for Cornish miners (we will meet a few of them in Chapter 3) set out to get as much power as they could per bushel of coal, and this goal dominated their design efforts. Under different circumstances, for example steam engines operating in locomotives used for transport, designers (such as George and Robert Stephenson) had to work with a completely different aim, namely to get as much power as possible while keeping the engine small enough to fit on wheels and be moved. One may imagine that again a completely different set of engineering aims may apply to
the case of a steam ship.

Thus, a basic innovation can be thought of as setting out developments in the techno-economic
domain for a number of years to come, but the success of the paradigm, and hence of the basic
innovation, depends crucially on how well incremental innovation is able to adapt the paradigm
to local circumstances. This includes the skills and capabilities of the workforce that has to work
with new machinery, as well as even broader factors such as certain cultural aspects of the society
in which the paradigm develops.

Just as a scientific paradigm in the Kuhnian philosophy of science, a technological paradigm will
ultimately most likely break down. A new paradigm might emerge and start to compete with the
first paradigm (chapter 5 will explore these competition processes in more detail). Such a
competition process will be decided both on the grounds of the pure technological merit of the two technologies and on the question of which of the two paradigms is better adapted to local
circumstances. In many cases, a newer paradigm might be better from a (technological)
performance point of view (the first factor). However, because of experience built up along the
technological trajectory of the old paradigm, the new paradigm might be at a disadvantage as
regards the adaptation to local circumstances.

These two factors may give rise to interesting dynamics, for example in the case of the steam ship
(see Table 1). As described by Rosenberg, the introduction of the steam ship in fact led to a whole
series of incremental innovations in sailing ship technology, introduced by shipbuilders who
wanted to keep up with the competition from the new source of power. Certainly, the expertise of the sailors employed for years on sailing ships also argued in favour of the
old technology. But ultimately, the technological superiority of a power source that was
independent of the winds of the oceans could not be stopped by the vested interests in sailing
ships. This effect of incremental innovation in an old paradigm induced by competition from a new
paradigm, has been observed in other cases as well, and has been termed the ‘sailing ship effect’.

Another way in which a paradigm might break down is through the exhaustion of technological opportunities along the technological trajectories of the paradigm. Ultimately, the leading goals of engineers working along the trajectory (e.g., increasing the duty of an engine, or miniaturizing an electronic chip) will become harder and harder to achieve, because the possibilities for progress along the trajectory have been used up. Such decreasing returns to
innovative efforts have been termed ‘Wolff’s law’, after a German economist from the beginning
of the 20th century. Obviously, a paradigm is most vulnerable to competition when it operates at
such decreasing returns, which is why the two factors mentioned here often take place jointly.

1.2. Interaction of Technology and the Economy: Demand Pull or Technology Push?

The examples used above, as well as many others you may think about, clearly underline that
economic development and economic growth depend on technological change. Similarly,
technological development depends on entrepreneurs who are willing to make efforts to develop
the technology. The willingness to do this clearly depends on how much profit can be expected
from an innovation. This two-way interaction raises the question as to what determines what:
does technology determine economic development or vice versa?

Two arguments have been made in the literature about this interaction: the technology push
hypothesis that states that technology comes first and determines economics, and the demand pull
hypothesis that argues that economics (demand) comes first, and technological development
reacts to this. The debate between the two camps advocating these hypotheses can be stylized as
the question as to whether the innovative process starts in the marketing department or the R&D
department of the firm. We will now discuss these two hypotheses in turn.

The demand pull hypothesis starts from the notion that the innovating firm gets the idea for an
innovation from the market. This may either happen through market research by the marketeers
employed by the firm, or it may be a more informal process in which the management of the firm
sees or feels some profit opportunity in the market in which they are active. Either way, the firm
recognizes a specific problem that consumers face, and sets out to develop a product that solves
this problem. In the field of computers, one may think of a firm like Microsoft recognizing that
users of its old MSDOS operating system are limited in their efficient use of computers by the low
level of user-friendliness of this product.2 After market research, the firm finds out that what
consumers really want is a graphical interface that is analogous to the office environment they are
used to when working without a computer. The marketing department briefs the R&D
department, which comes up with a graphical user interface that resembles a desktop, with items
such as a notepad, a trash can, paper and a pen (Windows and all its sequels).

The main evidence in favour of the demand pull hypothesis was collected by Jacob Schmookler
in the 1950s and 1960s. His work was mainly based on patent statistics, and mostly concerned
much less modern products than computer operating systems. One of the case studies he
undertook was the innovation in horseshoes in the United States. The horse was obviously the
major form of transportation in the early days of the United States, and Schmookler clearly shows how the increased use of horses in transportation (among other things for the westward expansion of the United States) led to a wave of (incremental) patented inventions in the design and manufacturing of horseshoes. A more recent data set he used concerned patented innovations in investment goods (mainly machinery), which he compared in a statistical way to investment flows in the national accounts of the United States. Again, he found a clear pattern in which investment demand rises first, and the number of patented inventions in investment goods follows after a while.

2 The Microsoft example given in this paragraph is not based on facts, and is used only to illustrate the general idea of the demand pull hypothesis.

Although there are some caveats with regard to the statistical work by Schmookler (or those who
followed him in advocating the demand pull hypothesis), there is obviously some value in the idea
that market information may help to make an innovation successful. Marketeers may often have
a good feel for what the market wants, and for which products are best suited to fulfil the market’s
wishes. Whether or not the marketing department and the R&D department of a firm are able to
work together in a constructive way is often a crucial factor in the success of an innovation. Quite
often too, the cultural differences between engineers and marketeers lead to a clash of visions, and this may result in a low willingness to listen to each other’s insights into what could be a successful
product. In the demand pull perspective, such a clash could, for example, happen when the
marketing department demands a product with specifications that cannot be delivered in the eyes
of the engineer, or which, to the contrary, does not provide enough of a challenge to the
sophistication of the engineers.

Perhaps the strongest point of critique against the (pure) demand pull model concerns the difference between needs and demand. Obviously, any successful innovation can, with
hindsight, be shown to fulfil some need, otherwise it would not have been successful. This,
however, does not establish that at the time of invention (or even innovation), this need was more
than a latent need, or, in other words, whether demand was present at the time the innovation was
developed. Many needs are only recognized once it becomes possible to fulfil them, i.e., once the
innovation starts to diffuse through the market. Obviously, such a latent need does not provide
a strong case for the demand pull hypothesis.

The technology push hypothesis starts at the other end of the marketing-R&D interaction. Here
it is the engineer who recognizes that a specific piece of new technological knowledge may result
in a product. It is then up to the marketing department to try to find a market for such a product.
Now one may imagine the engineer coming up with (in her eyes) a sophisticated piece of new
technological ingenuity for which the marketing department does not see any market. Quite often,
the British-French aircraft Concorde is mentioned as an example of such a product. The Concorde
is obviously a beautiful piece of aircraft engineering, and it is still the quickest passenger airplane
in the world. But despite its technological sophistication, a large market for Concorde flights
never emerged (due to the high price of tickets as a result of fuel inefficiency).

Taking the technology push hypothesis in a broad perspective, one may include (basic) science
on the push side of the technology-economy relation. Basic science, as contrasted to applied
science, is defined as being carried out without a specific use in mind. It originates purely from the
curiosity of the scientist undertaking the research. Because there is little or no direct use in basic
science, it is viewed as something that is not very attractive to firms, and hence mainly the
province of universities and governmental research laboratories. Using the same definitions, firms
‘specialize’ in applied research, which is aimed at finding applications for scientific knowledge,
or even so-called development work, which consists of designing, prototyping and tuning up of
specific products.

Taking such a detailed view of the R&D process, the technology push hypothesis may be
interpreted as saying that technological developments originate with basic science, hence (often)
in university or government research. The knowledge then passes through a process of applied
R&D, during which an application is envisaged, until it gets into the hands of a firm that develops
a product or process with the knowledge, and puts it on the market. The final stage of the process
is diffusion of the product (or process) through the market, when more and more consumers (or
firms) adopt it. This view is often called the linear innovation process, because it assumes that all
innovations will always pass through the various stages in the same sequence.

The technology push hypothesis, or in its strongest form the linear innovation process, may be
confronted with historical reality. One then enters the debate on the relationship between science
and technology. Does science have a strong impact on technology, and how has the relationship
been changing over time? Mowery and Rosenberg have used the history of steelmaking to
illustrate that in the 19th century, the relationship was often quite loose. Steel is an alloy of iron
and carbon, and is much harder than pig iron3. Its uses therefore include a much larger range of
products and processes than pig iron. For example, the art of bridge building received a strong
impetus from the development of steel, and this, in turn, greatly facilitated the operation of
railroads. Other industries that benefited greatly from steel were food processing (tin cans),
weaponry and machinery.

Figure 1.2. The impact of steel on bridge building

The Bessemer process (developed in 1855 and named after its inventor) was the first process that
could be used to manufacture steel at a large scale. It operated by blowing air through molten pig
iron in order to increase the heat and thereby facilitate the forming of the carbon-iron alloy. What
Bessemer did not realize when he made his invention, was that his method was only useful when
using pig iron manufactured with ores that are low in sulphur and phosphorus. In actual practice, such high-quality ores were not available in many environments, and in such cases the Bessemer process did not result in a usable product. For a long time, this put the United Kingdom, where sufficient ore of the right quality was available, at a large advantage compared to other countries.

3 See Chapter 3 for a more detailed overview of iron- and steelmaking.

Twenty years later, Thomas Gilchrist solved the problem with the Bessemer process when, with help
from his cousin Percy Gilchrist, he developed the so-called Gilchrist converter, which adds lime
to the process. The open hearth process (developed independently by Siemens and Martin in,
respectively, the United Kingdom and France) solved the problem completely and consisted of
a more flexible method because it used iron ore rather than pig iron as an input.

According to the historian of science Bernal, none of the ideas used by either Bessemer, the
cousins Gilchrist or Siemens or Martin were based on very advanced science. Bernal asserts that
the most recent scientific insights used in early steel-making originated around 1790, i.e., were
more than half a century old when applied by the inventors of industrial steel-making. Mowery
and Rosenberg add that none of the inventors in this case, with the exception of Siemens, were
trained in university. Thomas Gilchrist, for example, was a desk clerk in a police station, but his
cousin Percy was employed as a chemist, and hence had at least some basic understanding of
chemical reactions. One may thus conclude that if scientific knowledge played any role in steel
making, it was ‘old science’ rather than state-of-the-art knowledge that made the difference. Thus,
what later became the science of metallurgy largely developed from practical experiments without
a firm base in the scientific state-of-the-art.

Naturally, one example does not constitute a complete proof, but Mowery and Rosenberg, as well
as many other writers, conclude that during the 20th century, technology became much more
dependent on science than was the case during the days of Bessemer and the cousins Gilchrist.
For example, in the introductory article to a special issue of Scientific American on the 20th
century, Jonathan Piel concluded as follows:

“The 20th century is the first in which inquiry into nature has become the source
of technologies that amplify the power of the human mind and may even render
human intelligence redundant. (…) The 20th century has experienced a sea change
in the relationship between science and technology. In the 18th century the
members of Birmingham’s Lunar Society learned more about physics from their
mill machinery than physics contributed to productivity. The study of illness has
traditionally illuminated biological processes; now medicine takes lessons from
molecular biology. Finally, science has become a global profession with its own
rules and culture. It is a profession institutionalized in university, government and
industrial research laboratories. Political leaders and entrepreneurs have begun to
listen to what science has to say.”

Taking into account the conclusion by Mowery and Rosenberg on steel making, one might say
that technology has caught up with scientific development, and has used up ‘old’ scientific knowledge, so that it must now rely on close interaction with the scientific frontier. Recent
developments in high-tech electronics (e.g., chip making), pharmaceuticals or bio-engineering
seem to underline such a conclusion. Based on data on scientific publications originating from the
research laboratories of firms, some scholars have even concluded that nowadays firms play a
significant role in the development of science.

Where does this discussion lead in terms of conclusions on the demand pull versus technology push debate? Probably not to any conclusion that favours one of the two models over the other.
Instead, Kline and Rosenberg have proposed a model that can be interpreted as a combination of
the two ideas. Such a general and flexible model, which they term the chain-linked model, may
put the emphasis on the demand side (market, i.e., demand pull) in some specific case, and on the
supply side (R&D, hence technology push) in a different case. In addition, the chain-linked model
views the innovation process as a non-linear process, as opposed to the linear innovation model
that was encountered above.

A somewhat simplified scheme of the chain-linked model is displayed in Figure 1.3. Note that the
‘central chain of innovation’, which is indicated by the thick arrow, runs left to right in the figure,
and starts from the potential market. Thus, the chain-linked model borrows from the demand pull
approach in this respect. However, in the chain-linked model, there are a large number of
feedbacks, indicated by the three types of small arrows in the diagram. In fact the whole process
might start at one of the small arrows, so that a pure demand pull model is not strictly adhered
to.

Figure 1.3. A simplified scheme of the chain-linked model

Research plays a quite different role in the chain-linked model as compared to either a pure
demand pull or technology push model. In the chain-linked model, research is seen as playing a
role along the whole track of the innovation, partly in the form of interaction between different
stages in the innovative process. These interactions are indicated by the arrows in the top part of
the figure. The specific role and character of research will generally change when moving left to
right. Typically, on the left side of the figure, when R&D workers are pondering the (im)possibility of solving certain technical problems and invention is going on, research is very much based in pure science of the academic type. When these problems get solved and it comes to developing a functional design, the research takes on a more ‘engineering’ nature, trying to establish
a system in which the components interact well.

Closely linked to the central chain are the feedbacks between the directly connected processes in
the chain. This indicates the iterative process that each of the boxes represents. For example,
establishing a first design after the basic inventive work has been finished is not an activity that succeeds at once. The engineers working on this design will frequently go back to the results that
were obtained during the inventive stage, using these to obtain guidance on how a good design
looks. Their experience in putting a design to work may, on the other hand, also give rise to new inventive work, for example a deeper investigation of the performance of a certain material. This is why
the arrows that are linked to the central chain run both ways.

The final type of feedback is right-to-left, and runs from the market to each of the other
processes. This typically indicates incremental innovations that result from experience with putting
the product in the market. Such experience may give rise to ideas on improvements of the design,
or, more radically, for additional functionality that cannot be implemented without further
research into certain aspects of the innovation.

1.3. Technology in Economic Models

So far, we have looked only at descriptive models of the relation between technology and the
economy. But economics is very much a science that tries to analyze its topic by means of
mathematical models. Therefore it is relevant to ask how technology can be
incorporated into these models, and what the models have to say on the relationship between
technology and the economy. The simplest way in which technology can be incorporated in
economic models is as an exogenous production factor.

A production factor is a good that can be used to produce other goods. Traditionally, economics
has considered two production factors: labour and capital. Each one of these two can be further
decomposed into several distinct categories. Labour, for example, may be decomposed into skilled
and unskilled labour. Different forms of capital are buildings and machinery. All these different
production factors can be employed to generate economic goods, including services. Labour and
capital are both so-called endogenous production factors, which means that economic theory aims
at explaining how much of these factors will be used in the production process. For example, the
microeconomic theory of production will derive a demand curve for labour, while setting up a
model of the decision of an individual on how to divide time between work and free time will yield
a supply curve for the production factor labour. From these two curves, the amount of labour
used in the production process can be derived.

In the simple view of technology as an exogenous production factor, such economic theory on
how much knowledge will be produced and used does not exist. To the contrary, until some time
in the 1950s and 1960s, economists usually considered technological knowledge as a shift factor
in the production function. For example, a popular view of the production process is the Cobb-
Douglas production function, which states that the level of production (for example in a plant)
is determined by the levels of capital and labour employed in the following way:
Q = A K^α L^(1-α),    (1.1)

where Q is the level of production, K is the amount of capital (machines, buildings, etc.) used, and L is the amount of labour used. α is a parameter that is usually in the range 0.2 - 0.4. A is the shift factor that represents technological knowledge. The formula says that raising the level of technological knowledge implies that more is produced with an equal amount of capital and labour.
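
To make the role of the shift factor concrete, the following short Python sketch (a purely illustrative example with invented numbers, not part of the economic argument) evaluates equation (1.1) for two levels of A while holding capital and labour fixed:

    # Cobb-Douglas production function of equation (1.1): Q = A * K**alpha * L**(1 - alpha).
    # All numbers are invented and serve only to illustrate the shift factor A.
    def cobb_douglas(A, K, L, alpha=0.3):
        """Return output Q for technology level A, capital K and labour L."""
        return A * K ** alpha * L ** (1 - alpha)

    K, L = 100.0, 200.0
    print(cobb_douglas(A=1.0, K=K, L=L))  # baseline output
    print(cobb_douglas(A=1.2, K=K, L=L))  # 20% higher A: 20% more output with the same K and L

Because A enters multiplicatively, any increase in A raises output proportionally for every combination of K and L; this is exactly the ‘shift’ of the production function referred to above.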

In this simple model, A is determined by factors that are completely outside the economy, as if
technology is “given by God and the engineers” (Joan Robinson). This is what is meant by the
term exogenous. Obviously, such a view is far from reality. In practice, firms deliberately devote
resources (money, time) to the development of new technologies, and hence it is clear that
economic motives play a crucial role in the development of new technological knowledge. In
other words, technology is an endogenous factor just as much as labour and capital are.
Technological change is as much a topic of interest to economists as to engineers.

Besides the issue of exogenous versus endogenous production factors, there is another problem
with the production function point of view on technology in equation (1.1). This relates to the fact that technological change often takes the form of new goods, rather than new processes. Note that in equation (1.1), the only change due to an innovation is that more of the same good gets produced by an equal amount of labour and capital. There is no change in the nature of the product that comes out of the production process in equation (1.1).

It is thus useful to make a distinction between process innovations (which lower the costs of production) and product innovations (which yield previously unknown or improved
products). Whether an innovation must be classified as a product or a process innovation depends,
of course, on the perspective one takes. A firm that delivers goods to other firms, which use these
goods in their production process, may come up with a product innovation, but this product
innovation may in fact be a process innovation from the point of view of the buyer. An example
of such a case would be a computer-controlled lathe, which can be used to produce the exact
same metal products as can be made with a traditional lathe. The firm that sells the lathe will look
upon it as a product innovation, while the firm that uses it will classify it as a process innovation.
In actual practice, an innovation may often have characteristics of both product and process
innovations. This would be the case if the computer-controlled lathe not only worked more cheaply, but also enabled a machine manufacturer to produce according to more precise
measurements.

Such confusion over what is a process innovation or a product innovation is not possible in the
case of innovations that are aimed at the consumer market. Thus, although one might argue that
for capital goods the production function point of view on technological change is valid as long
as one takes an aggregate point of view, this no longer holds in the case of consumer goods. In
addition to endogenizing technological change, we will thus at some stage have to pay some
attention to product innovations as well as process innovations.

The task of endogenizing technological change from an economic point of view is not an easy
one. The reason for this is that technological knowledge differs greatly from other topics studied
by economists. These differences consist of two major parts: technology has certain
characteristics of a public good, and technology is subject to great uncertainty. The remainder of
this chapter will be devoted to discussing these two issues in greater detail. The remainder of the
book will then be devoted to exploring the consequences of these differences, or, in other words,
exploring the question of what is special about technology from the point of view of economics.

1.4. Technology as a Public Good

The IBM vs Apple case

The predecessors of the modern personal computer emerged during the Second World War, when
Germany, the United Kingdom and the United States were simultaneously developing electro-
mechanical computing facilities. The machines were used for the war effort. In Germany, an engineer
named Zuse developed the very first computer, but his design was never commercialized. In the
United Kingdom, Alan Turing and collaborators were working, quite successfully, on decryption
with the help of a computer. But it was in the United States that the first firm building a successful
computer emerged, namely Remington Rand, with its UNIVAC machine.

The earliest prototype that preceded the UNIVAC was the ENIAC computer, which was developed during the period 1942 - 1946 at the University of Pennsylvania by a team of scientists under the leadership of two engineers, Mauchly and Eckert. The ENIAC was part of the US war
effort, and an early version was used to calculate ballistic curves. Later on, a version was also
used in the Manhattan project that led to the first atomic bomb. Shortly after the war, the full-
fledged ENIAC was presented by Mauchly and Eckert, and they decided to start a company to
market the machine. They were quickly bought out by Remington Rand and started to work for
this company on an improved version of the ENIAC, which became the UNIVAC.

A new industry was born, in which IBM rapidly became the leader. This company had produced
punchcard-based tabulating machines for activities such as statistical surveys since the early 20th
century. But with the introduction of computers, IBM’s main business line was radically changed.
The firm quickly adopted the new technology pioneered by Mauchly and Eckert and started to
manufacture computers on a large scale. An important factor in the jump to leadership by IBM
was its use of a cross-license of the Mauchly and Eckert basic patent, owned by Remington Rand.
Later on, this patent was revoked on the ground that the application was filed too late, i.e., more
than a year after the first public exhibition of the machine.

Figure 1.4. J.W. Mauchly at work with the ENIAC

The computers that were produced from the 1950s onwards were large and expensive machines,
that would only fit into a large room, and had to be operated by specialists. The underlying
technology used for the main building blocks of the machines was that of vacuum tubes. Vacuum tubes are, compared to the semiconductor devices that replaced them, large and slow. Their use
resulted in large machines that operated at a speed which was only a very small fraction of what
we are used to today. Only large firms, such as large insurance companies and banks, were able
to apply these expensive machines in a profitable way. Also, a main market for the so-called
mainframe computers was scientific work performed in universities and (semi-)public research
institutes.

A number of rapid technological advances in the electronics industry led to drastic miniaturization
of computers. First, the transistor, invented by William Shockley and others at Bell Labs in 1947,
opened up the possibility to make the basic components of a computer much smaller, cheaper and
more reliable. The invention of the integrated circuit (IC), independently in 1957 by Jack Kilby
(of Texas Instruments) and in 1958 by Jean Hoerni and Robert Noyce (of Fairchild
Semiconductors) put a whole series of these small building blocks into a single standardized
module. This greatly reduced the efforts of engineers who were used to designing and building
circuits for almost all purposes, rather than buying these as a ‘black box’. Finally, putting more
and more functionality into these ICs led to the invention of the micro-processor in 1971 by a
team of scientists at Intel. The microprocessor essentially put all basic functionality of a computer
onto a small piece of silicon, the so-called single ‘chip’. In the same year, Intel invented the
memory IC, another important building block for the modern computer.

In a process that became known as ‘Moore’s law’, the capacity of the microprocessor (or memory chip), measured as the number of transistors on a single chip, seemed to expand in an exponential fashion (see Figure 1.5). By the end of the 1970s, developments had come to a stage where a computer would actually fit onto a desk, and was thus ready to invade everyday life in
office and home environments. The first model that was a big success was the Apple II machine,
put into the market in 1977 by Apple Computer, a company owned by Steve Jobs and Steve
Wozniak. The popularity of the machine made Apple grow tremendously.
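
As a rough numerical illustration of this exponential pattern (see Figure 1.5 below), the following Python sketch assumes the commonly cited doubling of transistor counts roughly every two years, anchored, purely for illustration, on the roughly 2,300 transistors of Intel’s first microprocessor of 1971:

    # Illustrative sketch of Moore's law: transistor counts doubling about every two years.
    # The anchor values are approximate and are used only to show the shape of the curve.
    def transistors(year, base_year=1971, base_count=2300, doubling_years=2.0):
        """Projected transistors per chip under a constant doubling period."""
        return base_count * 2 ** ((year - base_year) / doubling_years)

    for year in (1971, 1981, 1991, 2001):
        print(year, round(transistors(year)))

Under these assumptions the transistor count grows by a factor of about 32 every decade, which is the kind of relentless growth depicted in Figure 1.5.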

Figure 1.5. Moore’s law (source: Intel Corporation)

Around this time, i.e., the late 1970s, the market for mainframe computers was still large, although so-called mini-computers (which were hardly mini by today’s standards) had also taken a significant part of total sales. But it was obvious that micro-computers were rapidly becoming important, and one might even envisage that these small machines would one day take over at least partly from their larger counterparts. Hence, the dominating firm in the mainframe computer market, IBM, decided it wanted to try to get a share of the emerging market for small computers. Its problem, however, was that it knew how to build large-scale mainframes, but
did not have any experience with building a smaller machine. IBM’s engineers decided on a design
for which they outsourced all of the major components to other, specialized companies. The resulting machine was called the Personal Computer (PC), and was launched in 1981.

Figure 1.6. Market shares on the Personal Computer market (source: Harvard Business School Apple case studies, 1992 and 1998)

Although the PC was clearly inspired by the success of the Apple II, the two machines differed
widely in their specifications. Most importantly, the software that was available for the Apple II
would not run on the new IBM PC (or vice versa), which implied that Apple users faced
significant costs if they wanted to switch to the IBM platform. Still, the IBM PC became a big
success. In the first few years after the introduction, the market share of IBM increased rapidly
(see Figure 1.6). However, a couple of years after the introduction of the PC, other firms, such as
Compaq, started to imitate the product. They sold machines that were functionally fully equivalent
to the IBM PC, and could actually run the software that was originally designed for the IBM PC.
Typically, the models that Compaq and other imitators sold were priced much below the original
IBM product.

In 1984, Apple launched a new model, the Macintosh, which, like the Apple II, was not compatible
with the PCs produced by IBM and other companies (nor was the Mac, as the Apple machine
became known, compatible with its older brother the Apple II). In the battle between Mac and
PC that emerged, the PC platform clearly became the dominant party in terms of users. One of
the reasons for this was that Apple did not allow other firms to imitate its model, and hence it
remained the only supplier for the Macintosh model. Competition in the market for IBM-
compatible computers was therefore much higher, and this, among other things, led to much
lower prices. This greatly helped the IBM-compatible machines to stay dominant.

However, as Figure 1.6 shows, IBM, the inventor of the PC model, was not able to stay the leader
in the market. After 1984, its market share started to wane, largely because the firms who imitated
the PC took over the largest share. Initially, Compaq and other early imitators benefited from this,
but later even these firms started to lose market share. The PC became a standardized
commodity that could even be built by very small local computer shops.

Public goods characteristics and economic theory

The IBM-Apple case brings out very clearly a number of basic principles that apply to the
economics of technological change. In chapter 5, we will be referring to the notion of network
externalities, which means that the value of a piece of equipment to the customer depends on the
total number users of the equipment. In the present case, this meant that the more users chose the
IBM platform, the more value this model gives to its users. The reason for this is that users of the
PC can exchange documents, programs and experiences with other PC users, but not with
Macintosh users (because of the incompatibility). It is easy to see that once the PC got a critical
mass of users, this gave it a great advantage over the Macintosh, which stayed a machine for a
niche of the market. In other words, initial success led to further attractiveness and hence sales
of the PC model, while it worked against the Macintosh. Such self-reinforcing mechanisms have
important consequences for market structure.

In this chapter, however, we will concern ourselves with a different (although related) aspect of
the IBM-Apple case: the appropriability of technological knowledge. When designing the PC,
IBM did not try to appropriate the knowledge needed to build this machine. Instead, other firms,
such as Compaq, were allowed to imitate its PC.4 The reason for this was that IBM used standard
components supplied by other firms to build the PC. Hence, there was not a single invention in
the PC that could be ascribed to IBM engineers, and under patent law the ‘packaging’ of these
components into a PC was hard to claim as appropriable knowledge. Hence the decision of IBM not to patent the PC was to a very large extent forced upon it. But it does enable us to
illustrate a crucial aspect of technological knowledge: its public goods nature.

A public good, in the terminology of economists, is a good that has two characteristics. First, a
public good is characterized by non-rivalry. This means that if one person uses the good, this
does not prevent other people from using it again. A classic example of such a good is clean air.
If your neighbour enjoys clean air, you will do the same (there is more clean air than you can
jointly breathe in). Contrast this with a standard economic good that is characterized by rivalry,
such as a candy bar. Once your neighbour consumes the candy bar, you will not be able to
consume that same one again.

The second characteristic of a public good is non-excludability. This means that the party that puts
the good on a market, or into the public domain, has no way to control the use of this good by
other parties interested in it. The candy bar and the clean air examples again serve as a good
illustration of this principle. The shop that sells candy bars requires you to pay for it, and if you
are not willing to do that, the shopkeeper will prevent you from taking the bar from his shop, if
necessary by (legal) force. In the case of fresh air, however, it would be impossible to grant your
neighbour the right to enjoy it, while excluding you from doing so.

The combination of non-rivalry and non-excludability is very powerful in generating public interest in a certain good, but, at the same time, it is lethal for any firm that would consider
putting such a good on the market with the aim of making a profit. This is why, so far, no firms have been successful in selling clean air. The basic problem is that non-excludability implies that such a firm has no power to charge consumers a price for the product, and hence to generate revenue. In the field of public economics, this has led to the conclusion that governments may take an active role in supplying public goods, such as clean air (or a healthy environment in general) and defence.

4 In fact, there was one piece of software in the IBM PC, the so-called BIOS, which took some time to be imitated. The first ‘IBM clones’ suffered from slightly incompatible BIOS chips. This problem was, however, solved quickly.

Technology also has certain public goods characteristics. Think of the knowledge needed to build
the IBM PC. Once this knowledge had been generated by the IBM engineers, it could be used as
many times as necessary by other firms such as Compaq. In other words, this particular piece of
technological knowledge is characterized by non-rivalry. Also, IBM did not have any way to
exclude other firms from using this knowledge. The competing firms could simply obtain the
knowledge by disassembling a single IBM PC that could be bought on the market (so-called
reverse engineering). The component suppliers that originally served IBM now also sold to firms
such as Compaq.

Whether or not putting the PC on the market under these circumstances was a sensible thing to
do was hard to judge in advance. Whether or not IBM was able to recover its development costs
depended on how quickly other firms would be able to jump on the bandwagon and imitate IBM’s
technology. In other words, the lead time that IBM had by putting its invention on the market
first, was an important source of revenue. This can be seen in Figure 1.6 as the initial (1981-84)
period of rapid growth of IBM market share. However, being able to maintain its leading market
position for at least a little bit longer surely must have been an interesting prospect for IBM at the
outset of what later became the PC revolution. We will come back to possibilities for doing this
in chapter 2.

From the point of view of the consumer, competition in the PC market was a positive
development. It meant that prices were falling, and that quality went up, because firms were
making efforts to increase their competitiveness on both accounts. Also, a firm like Compaq
benefited greatly from the original IBM invention. In other words, IBM’s actions (research,
marketing, production) provided value for a large range of firms and individuals. However, often
(think of Compaq) these firms or individuals did not pay IBM anything for this additional value.

Now consider the case of an IBM manager having to make the decision of whether or not to
undertake the PC project. Which benefits would such a manager take into account? From a profit-
maximizing point of view, she should take into account only the benefits of the project for IBM,
and leave out the benefits for others (consumers or competitors) to the extent that these other
parties will not be generating revenue for IBM. Suppose the manager values the financial benefits that will accrue to IBM as a result of the PC project at $X. The costs are expected to be $C. For IBM, it obviously only makes sense to invest in the project if X is larger than C.
Suppose, for the sake of the argument, that this is not the case (C>X), and hence IBM decides
not to undertake the project.

But we have seen that, due to the public good characteristics, the benefits to society as a whole that result from the innovation project that IBM is considering could well be larger than X. Let’s say the total benefits can be quantified as $X+Y. Now consider the paradoxical situation that results if C>X, but Y>C-X. In words, the public interest in the innovation is larger than the private excess costs that are faced by the company that is considering investing in the project. This means that although, from the point of view of the economy as a whole, benefits are larger than costs, the individual firm will decide against introduction. This is obviously an inefficient situation.
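
A small hypothetical calculation in Python may help to fix ideas; the figures are invented and only serve to show how the private and the social decision rules can point in opposite directions:

    # Hypothetical illustration of the incentive problem: C is the cost of the project,
    # X the benefit captured by the innovating firm, Y the benefit spilling over to others.
    C = 100.0   # development cost
    X = 80.0    # private benefit to the innovator
    Y = 50.0    # benefit to consumers and imitators, not paid to the innovator

    firm_invests = X > C                 # the firm's profit-maximizing decision rule
    socially_worthwhile = (X + Y) > C    # the decision rule for the economy as a whole

    print(firm_invests)          # False: C > X, so the firm declines the project
    print(socially_worthwhile)   # True: X + Y > C, so society would have wanted it

In this example the firm loses $20 by investing, but society as a whole would gain $30, which is precisely the market failure discussed in the next paragraph.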

The described situation is indeed a classic case in the theory of public goods. It points to so-called
market failure, or, the inability of the free market to produce an ‘optimal’ result. As in the case
of pure public goods such as fresh air and defense, such a situation calls for government
intervention. This intervention may either consist of supplying of the good by the government, or
the application of more market-based instruments such as subsidies and taxes, or laws for
intellectual property rights.

These options will be described in other parts of this book. For now, it is enough to take note of
the fact that the public goods aspects of technological knowledge give rise to an incentive
problem in a free market economy. In general, compared to the public interest in new
technologies, private firms will lack the incentive to produce enough innovations.

You may have noticed that at several points in the above discussion, I spoke about ‘pure’ public
goods, and implied that technological knowledge, although it does have public good
characteristics, is not such a pure case. The remainder of this section will be devoted to discussing
why this is the case, and why this matters.

Think again about the example of clean air. The consumption of this public good does not require
any special effort or special skill on the side of the consumer, since breathing is the most natural
thing a human being can imagine. This is quite different, however, for technological knowledge.
Using technological knowledge, even if it stems from the public domain, requires considerable
skills and efforts on the side of the receiver of this knowledge. The reason for this is that
knowledge has a strongly cumulative nature. Every piece of new knowledge builds to a large
extent on previous knowledge, and to apply knowledge requires that one has command over the
older knowledge on which the new knowledge builds.

Not all of this previous knowledge can be codified into textbooks and manuals. It resides to a very
large extent in the minds of practitioners, such as scientists and engineers, but also users of
equipment and procedures. The knowledge that is locked up in people in this way is often called
tacit knowledge. The most important transfer mechanism for such tacit knowledge is face-to-face
interaction between co-workers inside the same firm, or within networks of firms. Obviously,
face-to-face transfer of knowledge takes time and effort, and hence it has important limitations
in terms of its reach. This means that the tacit parts of knowledge will
not completely diffuse through the economy, and hence that there are important aspects of
knowledge that can hardly be considered as a public good.

This has resulted in the emergence of so-called knowledge bases, which are in many cases very
specific to an industry and can to a large extent be appropriated by the incumbent firms in the
industry. The knowledge workers inside the firm will give that firm a so-called intangible asset,
which in certain senses is comparable to tangible assets such as machines and buildings. The
quality of the intangible asset embodied in the minds of its (R&D) employees is an important
competitive factor for a firm.

In summary, although knowledge has important aspects of a public good, it cannot be considered as a pure public good. To master and control technology is what a firm will strive for in its search
for profits, and this will lead to a considerable part of technology being locked up in firms. But
the exploitation of this knowledge in terms of spillovers is also an important aspect of the overall
benefits of knowledge for the aggregate economy. This opposition between spillovers and private
knowledge will be a recurring theme of the analysis in this book.

1.5. Technology and Uncertainty

The Case of the Comet

Technology is all about uncertainty. Engineers in the R&D department of a firm do not know
exactly what is possible from a technical point of view. Marketeers and managers do not know
what consumers like, and how they will respond to the introduction of a new or modified product.
Uncertainty takes a number of different forms. For example, we may distinguish between
technological uncertainty and commercial uncertainty. Even within these categories, a more
detailed distinction is possible. For example, technological uncertainty may refer to the feasibility of certain technical options, or to the costs involved in a specific procedure for
producing a technical result. Commercial uncertainty may refer to the price consumers are willing
to pay for a new product, or to the variety of that product they prefer.

In order to see how uncertainty may work in practice, we will now review the case of the Comet
airplane. The case illustrates a rather extreme and sad form of technological uncertainty, as well
as a form of commercial uncertainty that took a more positive turn. The story starts in the 1930s,
when a new type of airplane engine was developed. At the time, airplanes were typically
equipped with internal combustion engines driving a propeller. The power and fuel
consumption of this type of engine severely limited its mode of operation, and intercontinental
flights were something quite extraordinary at the time. Also, planes were relatively small as
compared to what we are used to now.

In the 1930s, groups of researchers in Germany and the United Kingdom started working on a
different type of engine, which had a gas turbine as its underlying working principle. In the jet
engine, as the new engine was called, kerosene and air are blended and burned. The exhaust
gases that result from this are used as the propelling force of the engine. Two basic designs of the
engine exist. In the turbojet, air is taken in at the front of the engine, compressed by a
compressor and then blended with the kerosene. The mixture is burned and the exhaust
gases are driven through a gas turbine. The gases leaving the turbine propel the
airplane, and the turbine itself drives the compressor inside the engine. The other basic
form of the jet engine, the turboprop engine, uses the turbine to drive a propeller. The propeller,
and not the exhaust gases from the turbine, propels the aircraft.

The jet engine was originally used for military aircraft. The first plane operating with a jet engine
was the German fighter Messerschmitt Me 262, which first flew in 1942. Britain followed with the
Gloster Meteor, which entered service in 1944. The Me 262 was used very effectively to take down
allied bombers, but the Germans had too few of them to make a real difference. Quite soon after
the conception of the jet engine, non-military uses came to mind. One important question related
to this was which of the two basic designs of the jet engine was best suited to carrying freight or
passengers.

The British called into existence a committee under the leadership of Lord Brabazon of Tara in
1942, and gave it the task of investigating the possibilities for non-military air-transportation after
the war. The committee reported with a recommendation to build five types of aircraft. With
hindsight, the types 1 and 4 are most notable. Type 1 was a large aircraft for long distances,
including transatlantic flights. Type 4 was a smaller airplane designed to carry mail. The
committee proposed to equip the type 1 plane with turboprop engines and the type 4 with turbo
jet engines. This recommendation reflected the mainstream opinion of aircraft engineers in the
1940s. They preferred the turboprop engine because its fuel efficiency was much higher than that
of the turbo jet.

De Havilland, a British firm with its own experience in building jet aircraft, went against the recommendation of
the Brabazon committee, and, as we will see shortly, was going to prove the committee wrong.
De Havilland took the basic idea of the type 4 plane proposed by the committee, and turned it into
a plane that could carry 36 passengers. The aircraft was called the Comet, and was equipped with
four Ghost engines. These were built into the wings, rather than attached under them, as we now
mostly see with civil airplanes. As the picture shows, this gave the Comet a futuristic look. The
Comet had a cruise speed of 500 miles per hour, and a range of 1750 miles.

The British airline BOAC ordered 9 planes, and the first one made regular flights from 1952
onwards. The planes carried out services from London to among other places Johannesburg,
Colombo and Tokyo. De Havilland quickly developed a follow-up version, called the Comet 1A,
with room for 44 passengers. This plane was sold in advance to a number of foreign airlines, and
nothing seemed to stand in the way of commercial success. What could explain the success of this
strange plane equipped with turbo jet engines, where common wisdom would prescribe the
turboprop engine?

Figure 1.7. The first non-prototype Comet, flying for the British airline BOAC under
designation G-ALYP

The fuel efficiency of a jet engine depends on the altitude at which the engine is operated, as well
as on the design of the engine. In practice, a major problem in developing an efficient turboprop
design proved to be the size of the propellor blades. Years of experimentation would converge
on the so-called turbo-fan engine as one highly efficient design. A turbo-fan engine is essentially
a turboprop engine with a high number of small propellor blades. One major advantage of this
type of engine is the low noise. The turbo-fan is now the standard type of engine on most large
aircraft, but just after the war it appeared hard to design it efficiently. In those early days,
contrary to the common wisdom, practice pointed to the pure turbojet as the best alternative.

An important reason for this was the fact that when flying in higher parts of the atmosphere, the
pure turbojet has high fuel efficiency. Flying at such high altitudes has the additional advantage
that turbulence is much lower, as most people that have been on turboprop and turbojet driven
airplanes can confirm. Flying high is more comfortable than flying low. Thus, especially for
carrying passengers, and especially for longer distances, when there is enough time to climb to
high altitudes, the turbojet is not such a bad alternative to the turboprop or the turbo-fan.
However, because in the 1940s, there was no previous experience with flying at such high
altitudes, this option was difficult to recognize for the engineers working in this field.

With the information on the preference for flying at high altitudes and the difficulties in obtaining
an efficient turboprop (i.e., turbo-fan) design, the view of the Brabazon committee appeared to
be completely wrong. In contrast to what the committee had recommended, following the dominant
design paradigm of the time, the turbojet was especially suited to flying passengers over large
distances, while the turboprop was more suited to shorter distances. Note, however, that both pieces of
information that led to this result were impossible to obtain for the committee at the
time. De Havilland showed a truly innovative spirit by building the Comet against the common
wisdom, and risked a great deal in doing so. For a while, the future for the company and the
airplane looked bright.

However, the good fortune of the Comet came to a sudden end on 10 January 1954. On this day,
the first Comet obtained by BOAC (the one on the picture) took off from Rome, and crashed into
the sea near Elba shortly after take-off. There were no survivors. In fact, the Comets had already
been involved in four accidents during take-off. In the third of these four accidents, 43 people
died. After the Elba crash, the Comets were grounded for two months, during which a search was
undertaken for design flaws. Nothing surfaced, and the cause of the Rome crash remained a
mystery. Finally, the Comets were allowed to fly again. Two weeks after this decision, a new
Comet took off from Rome, and again it crashed into the sea.

Now the investigators went to much greater lengths, and the planes were grounded for a
much longer period. The answer to the mystery finally came when a Comet was subjected to
extended tests in a large water tank, aimed at simulating the repeated pressure put on the fuselage
of the plane during long flights at high altitude. During these tests, it appeared that metal fatigue
was capable of causing small fractures in the hull of the aircraft. At high altitude these could cause
the plane to explode in mid-air, and this was what had happened after the two fatal take-offs in
Rome. The wreck of the last crashed plane was recovered from the sea, and an investigation
revealed that fractures had formed in the corners of the windows. These were designed as
rectangles, while a rounder shape would probably not have caused any problems.

The notion of metal fatigue was not very well known before the crash of the Comet, and this
unfortunate event caused a lot of research into this phenomenon, to the benefit of many other
applications in a whole range of other industries as well as the airplane industry itself. It was too
late for the Comet, however. Although a new version, the Comet 4, was launched in 1958, trust
in the basic design had declined to such an extent that airlines chose a competing plane, the 707
manufactured by the American company Boeing, which had been introduced in 1957. The 707
obtained the success for which the Comet had been destined, and Boeing continued to design
turbo jet aircraft, such as the successful jumbo 747. De Havilland never recovered from the
commercial set-back experienced with the Comet, and was eventually absorbed into what became British Aerospace. The
Comets are now museum pieces, although a few machines (the safe Comet 4 types) are still flying
in parts of the world in which air travel is not so dense.

Uncertainty and economic theory

Obviously, uncertainty comes in very different degrees. Consider the difference between the first
conception of a computer in the days of Mauchly and Eckert and their ENIAC machine, and the
introduction of the Pentium chip by Intel in 1993. According to a history of the early days of
computers in the United States by Katz and Philips, the leading business men of the day saw no
commercial possibilities for the computer. They quote Thomas J. Watson Senior, CEO of IBM,
as having expressed the feeling that “the one SSEC machine which was on exhibition in IBM’s
New York offices could solve all the scientific problems in the world involving scientific
calculations”. According to Katz and Philips, this was exemplary for the dominant view of
industry leaders of the day. The same T.J. Watson, by the way, quickly led IBM into leadership
in the global computer industry in the 1950s.

The clearly (with hindsight) much too pessimistic view on the commercial potential of computers
becomes understandable if one realizes that businessmen like Thomas J. Watson Senior had never
experienced a computer as we know it in our days. Under such circumstances, how was it
possible to appreciate the many new uses that have been found for computers after they started
their conquest of the modern office and factories? And without any knowledge of semiconductors,
integrated circuits and microchips, how was it possible to think that a computer the size of a large
room would ever fit on a desktop? The problem with recognizing the commercial opportunities
of a truly major technological breakthrough is that there is no frame of reference for judging how
useful the innovation will be.

How different this was for the introduction of the Pentium chip by Intel in 1993. By that time,
Intel and a whole range of other firms had knowledge about some of the exciting applications that
had been found for computers and the devices attached to them. Intel also knew that its products
were a major input in small computers that were bought by a large population of consumers and
firms. From its experience with microprocessors since the 8086, it had achieved a state of
knowledge close to certainty that a new and faster microchip would make its way to the heart of
the majority of computers sold. Thus one may certainly say that as compared to the early days of
the computer, Intel was facing a lot less uncertainty with regard to the commercial success of its
Pentium processor.

However, this does not mean that uncertainty was completely eliminated. Some degree of
technological uncertainty remained, because of the complex nature of the new design. And in fact,
it turned out that the Intel engineers had made a small mistake which could result, although in a
very limited number of cases, in wrong (floating point) calculations being made by the Pentium
processor. The mistake was reported on the Internet by a mathematics professor in the United
States and the news quickly made its way into the traditional press. Although Intel initially played
down the importance of the so-called Pentium Bug, it quickly changed its course and decided to
take back all faulty chips and supply everyone who had one installed in her machine with a new
version, along with operating instructions on how to replace the chip. This quickly reinstated trust
in the Intel brand, and the company’s reputation was not harmed in the end. The example shows,
however, that no matter how mature a technology is, there is always some degree of uncertainty,
in this case of a technological nature.

How can economists cope with this uncertainty in their models? One way is to use stochastic
mathematics. Using this toolbox, the outcome of some event that is determined by chance (a
stochastic or random event) can be evaluated by weighting all the possible outcomes by their
probabilities. For example, if there is a 50% probability for outcome A, a 50% probability for
outcome B, and the two outcomes are valued at 100 and 200, respectively, the expected value
of the stochastic process is equal to 0.5×100+0.5×200= 150. In this way, if all the possible
outcomes of the stochastic process and their probabilities are known, uncertain outcomes involve
more calculations than certain outcomes, but there is no fundamental difficulty in making these
calculations. We refer to such a situation as a case of weak uncertainty.
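
As a minimal illustration, the calculation can be sketched in a few lines of Python. The numbers are the ones from the example above; the helper function itself is only illustrative:

```python
# Expected value under weak uncertainty: a probability-weighted sum of known outcomes.
def expected_value(outcomes, probabilities):
    assert abs(sum(probabilities) - 1.0) < 1e-9, "probabilities must sum to one"
    return sum(p * x for p, x in zip(probabilities, outcomes))

# The example from the text: outcomes valued at 100 and 200, each with probability 0.5.
print(expected_value([100, 200], [0.5, 0.5]))  # prints 150.0
```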

The situation changes, however, when the possible outcomes of an uncertain process are not
known in advance. Arguably, this is a better description of the situation facing Thomas J. Watson
Senior judging the potential of computers in the 1940s than the previous description of weak
uncertainty. We will refer to the situation in which the possible outcomes of an uncertain process
are not known in advance as strong uncertainty. Under strong uncertainty, the elegant calculations
using probability weighted outcomes to calculate the expected value of a stochastic process
obviously no longer apply. What alternative exists for economists and business men in the real
world?

An answer to this question has been proposed by the field of evolutionary economics, for example
in the work by Nelson and Winter. They base their theory of economic development on the notion
of bounded rationality, which had earlier been introduced by Herbert Simon. Simon argued that
the real world is much too complex for firms to process in a fully rational way all necessary
information and derive a fully rational decision from this. There are at least three reasons for this.
First, obtaining and processing information is a costly process (both in money terms and in terms
of time). Second, the interaction between variables in the real world is often much too complex
to be grasped completely in a model that the firm can use for decision-making. Finally, strong
uncertainty limits the firm’s predictive capabilities.

Instead of a fully rational decision making process, Simon argues that the firm will use a
conceptual model (this need not be an explicit model in the mathematical sense) that encompasses
only the most important variables in its environment. The (basic) relationships between those
variables are described by the model in a stylized way, and the firm then collects only a subset of
all relevant information to feed into this model. What results is a crude decision-making tool that,
if the firm does a good job, will on average at least go in a correct direction. An important feature
of such a boundedly rational model is that it is constantly updated and altered, both in terms of
the actual variables that are taken into account, and in terms of the relationships between them.
Decision making then becomes a trial-and-error process.

Nelson and Winter and other evolutionary economists have proposed that decision making under
bounded rationality may take the form of routines, or rules of thumb. These are relatively simple
rules of behaviour that guide the strategic decisions of firms. An example could be that a firm
decides it will spend 5½ % of its sales on R&D. Such a rule will certainly not have the
mathematical precision of the optimal outcomes that we can derive using stylized economic models
based on a stochastic representation of (weak) uncertainty. But it might work in practice for a
firm operating in the real world under bounded rationality, and if it doesn’t work, the firm may
always adapt the rule of thumb (i.e., scale up or down the percentage of sales on R&D).
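
A stylized sketch of such a routine is given below. The adjustment rule and the numbers are purely hypothetical and only meant to convey the flavour of trial-and-error decision making; they do not correspond to any model used later in this book:

```python
def update_rd_share(share, profit_growth, step=0.005):
    # A hypothetical routine: raise the R&D-to-sales share a little after a good year,
    # lower it a little after a bad year, and otherwise leave it alone.
    if profit_growth < 0:
        return max(0.0, share - step)
    if profit_growth > 0.02:
        return share + step
    return share

share = 0.055  # start from the 5.5% of sales mentioned in the text
for growth in (0.01, -0.02, -0.01, 0.03):   # a hypothetical sequence of outcomes
    share = update_rd_share(share, growth)
    print(round(share, 3))
```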

Bounded rationality is often used as a description of firm behaviour in stylized evolutionary
models of the relationship between technology and the economy. We will discuss examples of
such models in several chapters of this book (chapter 5, ...). As follows from the above
discussion of the different degrees of uncertainty, such models are particularly relevant when
strong uncertainty prevails. One interpretation of this is that evolutionary models are strong
in the analysis of the long-run changes associated with major or basic innovations,
or in fact with shifts in technological paradigms, while the standard (so-called neo-classical)
economic models using stochastic descriptions of uncertainty are better suited for incremental
innovations, because the latter can be predicted relatively well.

Uncertainty, even in weak form, also has a different consequence. Firms and consumers generally
value uncertainty in a negative way. To see why, imagine which one of the following two
alternatives you would prefer: to receive 100 guilders with a probability of 1, or to receive 100
guilders with probability 0.25. Certainly everybody would pick the first alternative. The decision
becomes harder when the choice is between receiving 100 guilders with probability 1 and
receiving 200 guilders with probability 0.25. In such circumstances, some people would actually
be willing to take the gamble, and choose the second option. But note that the expected value of
the second option is equal to 0.25×200=50 guilders, which is still less than the 100 guilders from
the first option. A person who prefers the second option can be called a risk lover. A risk neutral
person would then be indifferent between a 0.25 probability for 400 guilders and a probability of
1 for 100 guilders, while a risk averse person would only be willing to take the 0.25 probability
option if the pay-off for success in that case would be higher than 400.
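
The same comparison can be sketched numerically. The square-root utility function used below is purely an illustrative assumption (any concave function would represent risk aversion equally well); it is not part of the argument in the text:

```python
import math

def expected_value(payoff, prob):
    # Expected money value of a simple gamble: 'payoff' with probability 'prob', zero otherwise.
    return prob * payoff

def expected_utility(payoff, prob, utility):
    # Expected utility of the same gamble (the utility of receiving nothing is zero here).
    return prob * utility(payoff)

certain_amount = 100   # 100 guilders with probability 1
gamble_payoff = 400    # the risk-neutral benchmark from the text
gamble_prob = 0.25

# Equal expected values: 100 guilders in both cases.
print(expected_value(gamble_payoff, gamble_prob))                 # 100.0
# A risk-averse person, represented here by a concave (square-root) utility function,
# nevertheless prefers the certain option:
print(math.sqrt(certain_amount))                                  # 10.0
print(expected_utility(gamble_payoff, gamble_prob, math.sqrt))    # 0.25 * 20 = 5.0
```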

Note that independent of whether a person is risk loving, risk averse or risk neutral, a so-called
risk premium is always necessary for an option with uncertainty to be more attractive than an
option with the same pay-off under full certainty. This is the obvious lesson from the first example
of a choice between two options given above. For the case of R&D or other investment in
technological change, we can therefore expect that any firm that is considering to undertake such
investment would only be willing to do so if it can expect a higher pay-off than is associated with
a similar investment in an activity with lower risk.

Risk and uncertainty are obviously more of an impediment for investment in R&D at the level of
an individual firm than from the point of view of the aggregate economy, at least with respect to
the case of weak uncertainty. The reason for this is that the average outcome of a stochastic process, if repeated often
enough, will converge to its mean. While for an individual firm a few unlucky draws from a
random distribution can mean bankruptcy, in the aggregate economy this is much more unlikely
for the basic reason that for a higher number of draws, a spell of only unlucky outcomes becomes
less and less probable. This is the same principle that underlies the concept of insurance.
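
A small Monte Carlo sketch can make this principle concrete. The pay-off distribution below (400 guilders with probability 0.25, nothing otherwise) is an illustrative assumption chosen to match the earlier example:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def project_payoff():
    # One risky project: it pays 400 with probability 0.25 and nothing otherwise,
    # so its expected value is 100 (illustrative numbers only).
    return 400 if random.random() < 0.25 else 0

def average_payoff(n_projects):
    # Average pay-off over a portfolio of independent projects.
    return sum(project_payoff() for _ in range(n_projects)) / n_projects

for n in (1, 10, 100, 10000):
    print(n, average_payoff(n))
# As n grows, the average settles close to the expected value of 100: a single firm
# faces a lottery, while a large portfolio (or the aggregate economy) faces almost a sure thing.
```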

Because risk is less of a negative factor for the economy at large than for an individual firm,
leaving risky R&D investment to firms will tend to lead to lower amounts of money spent on
R&D than would be optimal from a macroeconomic point of view. This conclusion was reached
by the Nobel prize winning economist Kenneth Arrow, in an analysis of the (weak) uncertainty
associated with R&D and technology in general. Note that this is the same conclusion that was
derived from the notion of technology as a partly public good. We thus have two good reasons
to expect that a free market economy will invest too little in R&D.

Arrow also discussed several mechanisms that can be envisaged to reduce the negative impact of
risk. These range from the stock market to insurance schemes, and to futures markets (i.e.,
contracts to buy or sell a certain good in the future at a price agreed on in the present). To see
how, for example, a stock market would reduce the amount of risk taking associated with R&D
investment, we have to realize that what a stock market does is transfer the risk associated with
a firm’s strategy from the managers to the investors. The manager of a firm becomes an employee
who is paid a salary, and the investors (holders of shares) run the risk of soaring profits. Because
investors can spread their money over a wide portfolio of firms, possibly through investment
schemes such as mutual funds, they can reduce the risks associated with stock market investment.
Thus, we see that the stock market in fact shifts the burden of risk from a single firm to a large
group of investors, which, according to the discussion above, makes the risk more bearable.

However, as Arrow points out, most of these ‘solutions’ to the problem of uncertainty also have
their drawbacks. Either the solutions are not practicable, such as creating futures markets for all
commodities associated with uncertainty, or they create additional problems in terms of
weakening incentives. The latter tendency is associated with the problem of moral hazard, or the
agency problem. This problem arises when, for example, an insurance policy creates the wrong
incentives by ‘rewarding’ inattentive behaviour or even the deliberate causing of damage. This is the
case with arson intended to collect insurance money. A different form of the agency problem
arises in the stock market solution to uncertainty, when salaried managers are not as motivated
to seek profits for the firm as they would be if their (full) income depended on this profit stream.

These and other examples show that sharing the burden of risk will often lead to weakened
incentives, and therefore that the economy might be facing a Catch-22 situation when trying to
stimulate R&D investment by sharing risk. Without such risk-sharing schemes, investment in
R&D might be too low from a macroeconomic point of view, while introducing risk sharing might
reduce incentives and thereby worsen the very problem it is trying to solve.

1.5. Motivation and plan of the book

The above discussion has highlighted three main points. First, that there is a need to endogenize
technological change in economic models. Second, that technological knowledge is a peculiar
economic factor that cannot be treated like a normal economic good. In particular, there are at
least two reasons (uncertainty and public goods aspects) why one might expect that a free market
economy will produce less technological knowledge than is socially optimal. This in itself would
be a good reason to present a book on the economics of technological change. But it is also a
reason that appeals mainly, if not exclusively, to economists. In order to underline the value added
of the analysis in this book to non-economists, and in particular to engineers, we need to turn
to the third point that emerged in the discussion.

This point refers to the interaction between the economy and technological change. Technological
development is a process that is always characterized by joint causality between economic and
other social factors (in short, the ‘human factor’) on the one hand, and technical and engineering
knowledge on the other hand. Thus, for the economist who is interested in long-run development
and growth, technological change is a crucial factor. But it also follows from this that for the
engineer or natural scientist who is interested in developing new technologies, disappointment
might result if no account is taken of the economic and social context in which technology
develops.

To give just a few examples of the questions that arise in such a context of interacting forces
between technology and the economy, let us refer to a few of the issues raised in the chapters to
follow, after which a full plan of the book will be presented. Any engineer working in an R&D
department of a firm will ultimately have to deal with the issue of patents. While it will be fairly
obvious why a firm will want to apply for a patent (especially after the above discussion of the
IBM-Apple case), there are certainly some aspects of the patenting process that remain obscure
without some light from economic theory. For example, why does the law only grant patent
protection for a limited amount of time? Or why, despite a patent system, do we sometimes see
that competition, for example by ‘inventing-around’, is possible, and why would such a situation,
although a nuisance for an individual firm, be desirable from a macroeconomic point of view?

Other questions relate to the process of competition. As an engineer, one might feel that going
for the best technical solution attainable is desirable. But as will be shown in chapter 3, the ‘size’
of an invention will depend, among other factors, on the size and market position of a firm in
which the R&D is undertaken. In ‘selling’ a technical solution for a certain problem to the
management of a firm, it will be useful and insightful to have knowledge about why a manager
will prefer, under certain circumstances, a ‘small innovation’, while in a different context a ‘large
innovation’ will be preferred.

[other questions from chapters to come]


[plan of the book]

References to the original works described in this chapter

Arrow, K. J. (1962). “Economic Welfare and the Allocation of Resources for Invention”. In: The
Rate and Direction of Inventive Activity: Economic and Social Factors. New York,
National Bureau of Economic Research: 609-625.
Cardwell, D. S. L. (1971). From Watt to Clausius. The Rise of Thermodynamics in the Early
Industrial Age. London, Heinemann.
Freeman, C. and L. Soete (1997). The Economics of Industrial Innovation. 3rd Edition. London
and Washington, Pinter.
Kline, S. J. and N. Rosenberg (1986). “An overview of innovation”. In: The positive sum
strategy: harnessing technology for economic growth, edited by R. Landau and N.
Rosenberg. Washington DC, National Academy Press.
Mowery, D. C. and N. Rosenberg (1989). Technology and the Pursuit of Economic Growth.
Cambridge, Cambridge University Press.
Schmookler, J. (1966). Invention and Economic Growth. Cambridge MA, Harvard University
Press.

Chapter 2.
The Economics of Patents
2.1. Introduction

We have seen in the previous chapter that there might be an incentive problem with regard to the
production of technological knowledge by commercial firms. The reasons for this are twofold:
first, that knowledge has certain characteristics of a public good, and second that knowledge
generation is subject to uncertainty. The public good character of knowledge implies that once
knowledge has been produced for the first time, it may be reproduced at relatively low costs, and
there is no natural barrier that prevents competitors from reproducing the knowledge. This may
lead to the situation that a firm that produces a piece of useful knowledge is beaten in the market
by a fast second mover that imitates the knowledge cheaply. A forward-looking firm that faces
such a prospect may thus decide not to invest in developing the knowledge.

The patent system is one of several instruments that can be used to remedy such a situation of
under-investment in R&D due to the public good nature of technology. It works by creating a
(legal) barrier to copying the knowledge created by an inventor, by granting the inventor a
monopoly on the use of the knowledge in the market. Patent law must define a number of aspects of this
monopoly. These aspects, and their economic consequences, are the main topic of this chapter.

Patents are a temporary monopoly right to an invention. The institution originated in the 14th
century, when patents were used to stimulate foreign artisans to immigrate. One of the
main issues in patent design is the lifetime of a patent (patent length). How long should a patent
last? In England, it was set to 14 years, which was twice the period of an apprenticeship. Hence,
a foreign artisan setting up business in England would be able to train at least two generations of
apprentices before these would be able to compete with the master. An extension period of
another 7 years was possible.

Today, patent lifetimes are usually fixed to a period around 20 years. In the United States, it is
17 years from the date of grant, with an extension possible only for pharmaceutical inventions.
In most European countries, the duration of a patent is 20 years, but often this is counted from
the application date. The duration of a European patent is also 20 years.

Besides the length of a patent, the scope is important. By this, we refer to the extent of protection
offered by the patent. Patent scope can be split up into three dimensions: breadth, width and
height. Breadth usually refers to the number of varieties of the basic form of the invention that are
protected by the patent. For example, if a firm patents a chemical that contains a certain volume
percentage of a working component, how large a variation of that percentage will be protected
by the patent? Other definitions of breadth have been proposed in the economic literature, but we
will confine ourselves to the varieties definition, because it corresponds most closely to a technical
interpretation of the concept of breadth.

Patent width refers to the number of (economically) different markets in which an invention can
be applied. Below, we will review in some detail the invention of a microwave radiation source
that could be applied in a broad range of industries, from beer can making to automobiles. When
the basic form of the invention has to be modified to be useful in a different market than the one
originally envisaged, patent width becomes an issue: do the modifications fall under the original
patent or not? This is obviously related to the discussion of the systems nature of technical change
in the previous chapter, and hence also to strong uncertainty that results from this systems
characteristic. This is why patent width will be discussed below in relation to strong uncertainty.

The final dimension of patent scope is height. This refers to the minimal inventive step that a
patent must comprise, or in other words, the size of an invention. The inventive step is an
important factor in the examination of a patent by the patent office, and often patent applications
get rejected because they do not meet the minimum inventive step.

Together, these four factors define the extent of patent protection offered to an inventor. Patent
law is aimed at providing sufficient protection by defining these four characteristics. Still, firms
do not always use patents to protect their innovation. In a survey among US firms, Cohen, Nelson
and Walsh asked firms what percentage of their product innovations could effectively be protected
by means of a patent.5 The mean of the answers to this question was around one third. Other
means of protecting innovations that achieved high scores were keeping the knowledge secret,
building up a lead time, and complementary sales, service or manufacturing capabilities (all
scoring around half). The main exception to this result was the pharmaceuticals industry, which
relies heavily on patents. The survey also asked for the reasons firms would not want to patent
their inventions. Three answers stood out. First, firms indicated that they often were not able to
demonstrate the novelty requirements that the patent office demands. In other words, in this case
the invention does not pass the required patent height. Second, firms did not want to disclose
information about their invention, and consider secrecy a better alternative. Third, firms were
afraid of ‘inventing around’ (i.e., imitation by slightly altering the patented invention so that the
result does not fall under the patent) by competitors.

The results found by Cohen, Nelson and Walsh seem to point out that patents are not very
important, but this is not the correct conclusion. The results rather point to some well-considered
limitations in patent law, as indicated by the reasons that firms give for not relying on the patent
system. The first and the third reason given (height and fear of inventing around) point out that
the firms consider patents to be too high and/or too narrow. The second reason points out that
firms are not prepared to disclose information with regard to their invention.

Both of these two characteristics of patent law have good reasons behind them. With regard to
obligatory information disclosure, the patent system requires full publication of all technical details
once the patent has been applied for (in Europe) or has been granted (in the United States). This
enables other inventors to learn about the knowledge described in the patent, and use this in their
own R&D process. In other words, while the patent system aims to protect a certain pay-off to
the inventor by granting a monopoly right, patent law is not interested in ruling out all spillovers.
Typically, using the knowledge in a patent as an input for ideas for completely new (relative to
the patent) products is allowed, and this is considered beneficial from the point of view of the
total economy. Remember that in the case study of the personal computer in the previous chapter,
imitation led to quick diffusion and low prices, and was thus considered to be a good thing from
the point of view of consumers. When we review economic growth theory in chapter .., this dual
nature of the patent system (protection - spillovers) will be one of the central themes.

5
W.M. Cohen, R.R. Nelson and J.P. Walsh, ‘Protecting Their Intellectual Assets: Appropriability
Conditions and Why U.S. Manufacturing Firms Patent (or not)’, National Bureau of Economic Research
Working Paper no. 7552.

This leaves us with the question of why it is beneficial to have relatively high novelty
requirements, or narrow patent breadth. The answer to this question is one of the main themes
of this chapter. The most important factor in the answer lies in the primary disadvantage that is
associated with all monopolies: they lead to welfare loss, for example on the part of consumers
that buy the patented products. This welfare loss may take the form of high prices, or a small
variety of products being offered on the market. The models that we will review in this chapter
will point out how exactly this welfare loss, which we usually term dead-weight loss, comes
about, and how it can be minimized by choosing a particular patent design.

If patents have certain disadvantages, for example in the form of welfare loss, why are they then
still used as a tool for stimulating inventions?6 The reason is that other possible instruments also
have disadvantages, and in certain situations, these disadvantages weigh more heavily than the
disadvantages of the patent system. Paul David has made a distinction between three mechanisms
of invention-stimulation, and dubbed them the three P’s: Patronage, Procurement and Patents.

Patronage refers to the situation in which government finances research without specific demand
with regard to its outcomes. This is, for example, the case for university research in most
European countries as well as some US universities, or in the case of public research laboratories.
These institutions generate knowledge (for example in the form of basic research) that is often not
immediately useful, but is expected to have benefits in the long run. Firms will often find this
research too risky to spend large amounts of money on.

Procurement is different from patronage because it refers to specific pieces of research that are
contracted out. This is a model that is often used in the weapons industry. In this case, the buyer
(government) has in mind a specific piece of knowledge (or a product based on it) that it wants
to buy, and it draws up a contract with a private firm that can supply this knowledge from its
R&D process. Both patronage and procurement have the disadvantage that they do not make extensive use of the market
process. Government decides which institutions or firms to fund, and which knowledge to buy.
Thus, the market is not used as a means of selecting useful knowledge.

Patronage and procurement seem useful procedures for stimulating knowledge development in
specific situations, such as basic research and weapons research. For innovations that are largely
aimed at the consumer market or the market for capital goods, the patent system has the
advantage that it involves the market to a larger extent than either procurement or patronage. This
has led economists to pay much more attention to patents than to university research or
government procured research. Because the theory for the latter two systems has only recently
started to develop in full, this book will choose not to develop this topic further, but instead focus
on the patent system.

6
Chapter 4 will explain a further disadvantage of the patent system: in cases where the inventive
process can be characterized as a ‘common pool process’, it may lead to over-investment in R&D.

As in all economic analyses of innovation, the patent system is affected to a large extent by
uncertainty. At the stage when a firm is applying for a patent, it is often not clear how much
economic value (i.e., profits) the invention is going to yield. Quite a few patents are never put to
any economic use, i.e., they remain expectations of fortune. As was argued in the previous
chapter, innovations may either be subject to strong or weak uncertainty. The mathematical
models that we use in this chapter to illustrate the working of the patent system all start from the
assumption that there is no strong uncertainty in innovation. In fact, they will typically not even
take into account weak uncertainty in an explicit sense, but this must be interpreted as a
simplification. As explained in the previous chapter, weak uncertainty could be taken into account
by modelling the ‘certain’ pay-offs to R&D that we will use below as a stochastic process of
which the mean (and variance) is known, and this would not change the conclusions drastically.

Still, not taking into account strong uncertainty is a major caveat of the models considered here.
But, as will be argued in section 2.4 below in detail, strong uncertainty can be accommodated by
reasoning from the models that are presented here. Section 2.4 will indeed argue that taking into
account strong uncertainty in a patent system may lead to undesirable outcomes (so-called
prospecting behaviour), and that thinking about innovation in terms of weak uncertainty tends to
lead to patent systems that are ‘on the safe side’ with regard to the definition of patent scope.

This leads us to the outline of the chapter. In the next section, the issue of optimal patent length
will be considered. Section 2.3 will consider optimal patent breadth. Both of these sections will
use a mathematical model to outline the main issues. Section 2.4 will explain the issues involved
with patent width and the relation of patents to strong uncertainty. Section 2.5 will present a
model of patent height, and section 2.6 will summarize the theory of patent design. Section 2.7
will introduce the issue of patent strategies by firms, and show that quite often, patents are used
for purposes that go beyond mere protection of an innovation.

2.2. Patent Length

In 1988, Sir James W. Black (born in 1924) received the Nobel Prize for Physiology or Medicine,
together with George H. Hitchings and Gertrude B. Elion. The prize was awarded to Black for
developing two important drugs, propranolol and cimetidine, the latter in the early 1970s in the laboratories
of the British pharmaceuticals firm SmithKline & French (now SmithKline-Beecham). Cimetidine
belongs to the class of H2 blockers. The drug is applied against ulcers, and works to engage the
histamine receptor sites within a cell. It does this, however, without invoking the normal response
of acid production by the receptors. Such acid production during times when the stomach is
empty is one of the main problems associated with ulcers. Administering H2 blockers to patients
with an ulcer prevents this problem because the drug engages the histamine receptor sites before
they can be engaged by the normal bodily processes. Propranolol, the other medicine for which
Black was awarded the Nobel Prize, is part of the class of drugs known as Beta blockers, which
work on similar principles (engaging receptor sites) in relation to cardiovascular diseases.

SmithKline & French introduced cimetidine on the UK market in 1976, and on the US market in
1977, under the name Tagamet. In 1983, the firm Glaxo (now Glaxo-Wellcome, and about to
merge with SmithKline-Beecham) introduced another H2 blocker on the market, ranitidine (traded
as Zantac). Zantac had several advantages over Tagamet, among others fewer side-effects.
Nevertheless, both products managed to take a significant part of the market for ulcer drugs.

Both Zantac and Tagamet were protected by patents. The Tagamet patent ran out in 1994, after
it had earned SmithKline more than 14 billion US$, and had become the first drug ever to sell
more than 1 billion US$ per year. After the patents ran out, however, generic products, which
offer the same working component, i.e., cimetidine, took over the market quickly by offering the
product at price discounts of 30-60%. The consultancy firm IMS Data reported that 8 weeks after
the patent on cimetidine expired, 55% of the market had been captured by generic brands.

SmithKline-Beecham and Glaxo (which was facing patent expiry for Zantac soon as well) reacted
by putting on the market so-called ‘over-the-counter’ varieties of their products, which had
smaller doses of the working component (cimetidine or ranitidine), but were available without
prescription. In the over-the-counter market, price competition is much stronger than in the
prescription market. Thus, the combined effect of the competition of generic brands, and the
migration to the over-the-counter market led to a strong fall in prices for anti-ulcer drugs. And
the start of this strong fall was the expiration of the Tagamet patent.

Would it not have been nice for all those ulcer patients (or their health insurance companies) if
the patent on Tagamet had run out earlier? The answer is most certainly yes. Why then does the
law in the United States set the patent lifetime to 17 years, or in most European countries to 20
years? This is the question that we will investigate in this section. In order to do so, we will
employ a simple model of a process invention that was originally put forward by William
Nordhaus.

The model assumes a market for a homogeneous good facing a downward sloping linear demand
curve.7 Initially, there are many firms competing in this market, and the price is driven down to
the marginal costs of production, which we denote by c̄ and assume to be constant (not dependent
on production). Then along comes a firm that makes a process invention that lowers the level of
marginal costs in the industry to c < c̄. The new marginal cost level is also assumed to be
constant. This firm will be granted a patent to its invention.

The first stage in the analysis concerns the price setting by the firm that holds the patent. This firm
has two choices: either it prices at the old price c̄, which will enable it to capture a margin
of c̄ − c on all products it sells, or it prices at a price lower than c̄, which will enable it to sell
more relative to the first option. Note that pricing at a price higher than c̄ does not make sense,
because then the other firms will still be able to offer the good at c̄.

The problem can be tackled by solving the profit maximization problem of the patent holder. This
firm maximizes the profit function
$$\Pi = q\,(p - c), \qquad (2.1)$$

where Π denotes profits, p is the price charged, and q is the quantity sold. The demand curve is
linear and can be denoted:

$$q = A - Bp, \qquad (2.2)$$

where A and B are parameters.

7
The model uses a number of standard assumptions from the economic model of perfect competition.
If you are not familiar with this model, please consult the appendix on economic models.


Now substitute (2) into (1) to arrive at
$$\Pi = p(A + Bc) - Bp^2 - Ac. \qquad (2.3)$$

The price that maximizes profits can be found by differentiating this equation with regard to p and
setting the result to zero, which yields
$$p^* = \frac{A}{2B} + \frac{1}{2}c, \qquad (2.4)$$

where the asterisk indicates profit maximization. To this we have to apply the restriction that p
cannot exceed c̄. In other words, if p* > c̄, the monopolist’s profits will be limited by the
potential competition of the firms using the old technology. Note that this restriction for the
monopolist is more likely to hold for innovations where the fall in the level of marginal costs is
low. To be precise, the monopolist will be able to choose its optimal price p* only if
2c̄ − c > A/B, i.e., for the case of ‘large’ reductions in costs. We will call this case drastic
innovation, while the reverse (when the monopolist is restricted by the price set by firms operating
with the old technology) will be called minor innovations.

For minor innovations, the monopolist can outprice the other firms by setting a price that is just a
tiny fraction lower than c̄, e.g. c̄ − ε, where ε is a very small positive number. At this price, all
firms using the old technology will leave the market, and the patent holder for the new technology
will be the only firm that produces. Because ε is infinitely small, we can approximate the price
that the monopolist charges by c̄ in the case of minor innovations. With a drastic innovation, the
price charged by the monopolist is equal to p* = A/(2B) + c/2, as shown in equation (4) above.
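
The pricing rule can be summarized in a small sketch. The parameter values below are illustrative only (they are not taken from the text); the function simply implements equation (4) together with the drastic/minor distinction:

```python
def patent_holder_price(A, B, c_old, c_new):
    # Unconstrained profit-maximizing price from equation (4): p* = A/(2B) + c/2.
    p_star = A / (2 * B) + c_new / 2
    # The innovation is 'drastic' if p* undercuts the old price (equivalently 2*c_old - c_new > A/B);
    # otherwise the patent holder is held down to (just below) the old price c_old.
    if p_star < c_old:
        return p_star, "drastic"
    return c_old, "minor"

# Illustrative parameter values (not taken from the text):
A, B = 100.0, 2.0
print(patent_holder_price(A, B, c_old=30.0, c_new=28.0))  # (30.0, 'minor')
print(patent_holder_price(A, B, c_old=30.0, c_new=5.0))   # (27.5, 'drastic')
```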

Social costs and benefits

Let us consider the case of minor innovations. Before the innovation is patented, there are a number
of competitive firms selling the product at price c̄. After the innovation is patented, these
competitive firms are replaced by the monopolist patent holder that sets (almost) the same price.
Thus, for consumers, there is no actual change between the situations before and after the
innovation. However, because the monopolist has lower costs, it is now able to capture ‘above
normal’ profits.8 The amount of these profits can be found by substituting c for p in equation (3),
yielding

8
Remember that ‘normal profits’ are included in marginal production costs, which means that firms
operating in a fully competitive market have zero ‘above normal profits’. See the appendix on economic
models for details.

33
Œminor (A Bc) (c c), (2.5)

where the superscript ‘minor’ indicates the monopolist’s optimal strategy for minor innovations.

The amount of profits can also be illustrated using Figure 1. In this figure, the downward sloping
line is the demand curve, and point X indicates the market equilibrium. The monopolist charges
p = c̄, while the unit production costs are equal to c. The quantity sold is equal to qx, so that total
‘above normal’ profits are equal to the darkly shaded rectangle Π.

The demand curve is downward sloping because of decreasing marginal returns to consumption
facing consumers. This means that the utility that is derived from an additional unit of the good
falls when more units get consumed. Think of this as the effect that arises when you keep on
eating ice cream: eating the first portion will be a highly rewarding activity, but if you keep on
eating there comes a point you can take no more.

Figure 2.1. Market equilibrium following a minor process innovation

The phenomenon of decreasing marginal returns to consumption has an important consequence.
Imagine a situation starting from the point where the demand curve crosses the vertical axis, i.e.,
where consumption is equal to zero. Consumers are now willing to pay price A/B (or actually just
a tiny bit less). But at the market equilibrium X they only pay c̄ (which is lower than A/B). This
means that there is a consumer surplus: the marginal value derived from all units of the good
consumed before the qx-th unit is in fact higher than the price that is paid (c̄).

Because of decreasing marginal returns to consumption, the value of the consumer surplus for the
i-th unit consumed falls when i increases (on the horizontal axis), until consumer surplus is zero
at i = qx. For each value of i, consumer surplus is equal to the distance between the horizontal line through c̄ and X and
the demand curve. Thus, total consumer surplus is equal to the lightly shaded triangle S. The value
of S can also be found by integrating this distance over the interval zero to qx:

$$S = \int_0^{q_x} \left( \frac{A - K}{B} - \bar{c} \right) dK. \qquad (2.6)$$

Note that consumer surplus S is not a result of the innovation. It existed before the innovation, and
in the case of a minor innovation, total consumer surplus S does not change as long as the patent
is in effect. Thus, in this case, the total societal benefits are equal to the extra profits that the
monopolist earns. However, when the patent runs out, other firms will be able to legally imitate
the innovation, and hence the market becomes fully competitive again. This implies that the price
is driven down to c, and the market equilibrium shifts to X’. The monopolist’s ‘above normal’
profits vanish, as all firms operate on marginal costs c again. Consumer surplus is now equal to
the total shaded area, i.e., S + Π + Δ. The increase in consumer surplus relative to the period when
the patent was in effect consists of two elements. Firstly, what used to be profits for the patent
holder (darkly shaded area Π) now becomes consumer surplus. From an aggregate societal point
of view, this does not constitute a gain, but rather a transfer from the monopolist to the consumer.

The second effect is an actual gain, and this is the triangle Δ. This part of the societal benefits is
unlocked only after the patent expires. The fact that it does not unlock right after the innovation
is made (the patent granted), is due to the monopoly that is granted to the inventor in the form
of a patent. Thus, the area Δ in Figure 1 constitutes welfare that must be sacrificed in order to
provide the inventor with enough incentives to undertake the innovation. Economists call it the
dead-weight loss associated with a monopoly (in this case, the patent). The fact that this welfare
loss is temporary, i.e., until the patent runs out, makes it bearable from the economist’s point of
view. It is said that static efficiency (maximizing the size of the consumer surplus triangle by
bringing down the market price to marginal costs at a given point in time) is sacrificed to obtain
dynamic efficiency (trying to decrease marginal costs by inventing).
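
The sizes of the three areas in Figure 1 are easy to compute for given parameter values. The sketch below does this for purely illustrative numbers (not taken from the text), using equation (6) for S, equation (5) for Π, and the usual half-base-times-height formula for the triangle Δ:

```python
def welfare_areas(A, B, c_old, c_new):
    # Areas in Figure 1 for a minor process innovation (the price stays at the old level c_old).
    q_x = A - B * c_old                           # quantity sold at price c_old
    S = 0.5 * (A / B - c_old) * q_x               # consumer surplus triangle, cf. equation (6)
    profit = (A - B * c_old) * (c_old - c_new)    # patent holder's profits, equation (5)
    deadweight = 0.5 * B * (c_old - c_new) ** 2   # triangle unlocked only after the patent expires
    return S, profit, deadweight

# Illustrative parameter values (not taken from the text):
print(welfare_areas(A=100.0, B=2.0, c_old=30.0, c_new=28.0))  # (400.0, 80.0, 4.0)
```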

Let us now look briefly at the case of drastic innovation. Recall that in this case, the inventive step
(cost reduction) is so large that the monopolist’s optimal price undercuts the old price (c̄). It can
safely be assumed that the optimal price is larger than the new level of marginal costs (c),
because otherwise the monopolist would choose not to produce at all. Figure 2 illustrates this
situation. What changes relative to the situation of minor innovation (Figure 1), is that the market
price falls (from c̄ to p*) after the patent has been granted. This means that it is now not solely
the monopolist that captures the rents of the innovation. Consumers now also capture a part of
the rents immediately after the patent has been granted, because the value of consumer surplus
increases with the fall of the price. This is the part of the lightly shaded triangle S that lies below
the level c̄. Hence, it can be concluded that only inventors benefit initially from minor (process)
innovations, while for drastic innovations, consumers also take part in the immediate benefits.
However, this does not change the fact that also in the case of drastic innovations, a static
efficiency loss (Δ) occurs as a result of the monopoly granted by the patent.

Figure 2.2. Market equilibrium following a drastic
process innovation

The costs of invention: R&D

So far, we have not yet talked about the R&D costs for the inventor. It seems reasonable to
assume that these vary with the extent of the innovation (size of the cost reduction). Large
innovations are more costly than small innovations. This can be represented by assuming the
following ‘invention possibility function’:
$$\frac{\bar{c} - c}{\bar{c}} = \beta R^{\alpha}, \qquad (2.7)$$

where R are the R&D outlays chosen by the firm, and α and β are parameters. We assume α < 1
to reflect the conventional assumption of decreasing marginal returns (to R&D). Let us further
assume that R&D costs are paid fully in the present (i.e., we abstract from the idea that R&D
takes time).

Then the firm that considers whether or not to invest in R&D faces the trade-off between the
costs of innovation (equation 7) and the profits that arise from innovation. If we limit attention
to minor innovations for now (the analysis is not much more complicated for drastic innovations),
we can use equation (5) to quantify profits for the inventor firm every period. Profits consist of
a stream over the period while the patent is in effect (the patent life time, which is denoted by T).
In order to value future profits correctly, the firm discounts future profit streams.9 The discount
rate is denoted by ρ. Then net profits (V) for the firm consist of the net present value of future
profit streams minus the R&D costs:

$$V = \int_0^T \Pi^{minor} e^{-\rho\tau}\, d\tau - R. \qquad (2.8)$$

9
See the appendix on economic modelling for details about the ideas behind discounting.

Note that by substituting equation (7) into equation (5), we can express profits in each period as
a function of the (present) R&D outlays:
$$\Pi^{minor} = \bar{c}\beta R^{\alpha}(A - B\bar{c}). \qquad (2.9)$$

Substituting this into equation (8) gives

$$V = \int_0^T \bar{c}\beta R^{\alpha}(A - B\bar{c})\, e^{-\rho\tau}\, d\tau - R = \bar{c}\beta R^{\alpha}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho} - R. \qquad (2.10)$$

The inventor firm will choose the value of R that maximizes this expression. This value can be
found by differentiating with respect to R and setting the result to zero10:

$$\frac{\partial V}{\partial R} = \alpha\beta\bar{c}\, R^{-(1-\alpha)}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho} - 1 = 0 \;\Leftrightarrow\; R = \left[\alpha\beta\bar{c}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho}\right]^{\frac{1}{1-\alpha}}. \qquad (2.11)$$

Equation (11) gives the firm’s optimal level of R&D that belongs to a specific patent life time T.
This outcome tells us one important thing: that the optimal amount of R&D performed by a firm
depends positively on patent life T. Longer patents induce higher R&D outlays, and hence larger
innovations. In other words, patent length is indeed an instrument that can be used to stimulate
innovation.
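
To see the relation between patent length and the induced R&D, equation (11) can be evaluated directly. The sketch below does this for purely illustrative parameter values (they are assumptions, not taken from the text):

```python
import math

def optimal_rd(A, B, c_old, alpha, beta, rho, T):
    # Equation (11): R = [alpha*beta*c_old*(A - B*c_old)*(1 - exp(-rho*T))/rho]^(1/(1-alpha)).
    discount = (1 - math.exp(-rho * T)) / rho
    return (alpha * beta * c_old * (A - B * c_old) * discount) ** (1 / (1 - alpha))

# Illustrative parameter values (not taken from the text):
params = dict(A=100.0, B=2.0, c_old=30.0, alpha=0.3, beta=0.04, rho=0.1)
for T in (5, 10, 20, 40):
    print(T, round(optimal_rd(T=T, **params), 1))
# R&D outlays rise with the patent lifetime T, as argued in the text.
```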

Next, we look at the amount of profits that are generated from the innovation. These can be
obtained by substituting equation (11) into equation (10):
$$V = \bar{c}\beta\left[\alpha\beta\bar{c}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho}\right]^{\frac{\alpha}{1-\alpha}}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho} - \left[\alpha\beta\bar{c}(A - B\bar{c})\,\frac{1 - e^{-\rho T}}{\rho}\right]^{\frac{1}{1-\alpha}}. \qquad (2.12)$$

10
Formally, we also have to check whether the extreme value of the function V(R) that is derived in
this way is in fact a maximum (instead of a minimum). This can be done by checking the sign of the second
derivative, which must be negative. This is known as the second-order condition. We do not derive this here,
but the mathematically skilled reader may want to verify that the second-order condition holds.

Figure 2.3. Net present value of patent holder’s
profits from a minor innovation for varying patent
duration

Figure 3 displays equation (12). It is shown that profits follow a sigmoid shape with increasing
patent life time. This corresponds to the intuition from our above case that longer patents are
better for the firm. However, when patent life time becomes very long, it does not lead to further
increasing profits.

Optimal patent duration

We are now finally ready to take the last step in the analysis: to pick a value for T that maximizes
the (net) benefits of the innovation for society as a whole. We still focus on the case of a minor
innovation. As we saw above, there are two components in the social benefits of the innovation:
profits for the monopolist and consumer surplus for the consumers. The former have already been
analyzed above (equations 8-11), so we have to focus now on the benefits for the consumer. As
before, future benefits have to be discounted, and we assume that the discount rate for the
consumer is equal to that of the firm (!).

Because we focus on minor innovations, the consumer surplus only becomes available after the
patent runs out, i.e., at time T. This surplus consists of two parts: what used to be profits for the
monopolist, derived in equation (12), and the new (dead-weight loss) triangle in Figure 1. Let us denote the sum
of these two parts of the additional consumer surplus after the patent runs out by Δ. Total net
benefits can thus be written as follows:

W = V + \int_T^{\infty} \Delta\, e^{-\rho\tau}\, d\tau = V + \Delta\, \frac{e^{-\rho T}}{\rho}. \qquad (2.13)

Δ can be calculated analogously to equation (6). The limits on the integral now become the market
equilibrium output levels at c and c′, respectively. Together with equation (5) this yields:

\Delta = \Pi_{minor} + \int_{x(c)}^{x(c')} \left( \frac{A - q}{B} - c' \right) dq = \frac{1}{2}\, B\, (c - c')^2 + (A - Bc)(c - c'), \qquad (2.14)

and, using equations (7) and (11):


\Delta\, \frac{e^{-\rho T}}{\rho} = \frac{1}{2}\, B\, c^{2} \beta^{2} \left[ \alpha\beta c(A - Bc)\, \frac{1 - e^{-\rho T}}{\rho} \right]^{\frac{2\alpha}{1-\alpha}} \frac{e^{-\rho T}}{\rho} + c\beta (A - Bc) \left[ \alpha\beta c(A - Bc)\, \frac{1 - e^{-\rho T}}{\rho} \right]^{\frac{\alpha}{1-\alpha}} \frac{e^{-\rho T}}{\rho}. \qquad (2.15)

Figure 2.4. Net present value of consumer surplus following from a minor innovation for varying patent duration

This equation gives the net present value of the consumer surplus that arises after the patent runs
out, or, in other words, the second term on the right hand side of equation (13). Figure 4 shows
how this value varies with the patent life time. Starting from a zero patent life, consumers benefit
from an increase of T, because this leads to R&D investment and hence innovation. However,
there is also a different effect of the increasing patent length: consumer surplus gets shifted into
the future (because it only starts occurring after the patent expires). Near the origin, this effect
cannot dominate the effect of increased R&D, but at a certain value of T, the resulting net
present value of the consumer surplus peaks and falls afterwards. This is caused by the fact that
at this point, the second effect starts to dominate.

Figure 2.5. Total societal benefits from a minor innovation

The total societal value W of a patent of length T can now be found by simply adding up the two
curves in Figures 3 and 4, as is shown by equation (13). This is displayed by the lowest of the
three curves in Figure 5. It is readily seen that although the firm would prefer to have a patent that
lasts forever (this yields maximum profits), the interests of the consumer prohibit this. A social
planner (government) would pick the value of the patent life time that corresponds to the peak of
the curve in Figure 5.
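
To illustrate how such a peak can be located, the sketch below evaluates W(T) from equations (11)-(15) on a grid of patent life times and picks the maximum. It uses the same arbitrary parameter values as the sketch above; they are chosen only so that the peak is clearly interior, and carry no empirical meaning.

import math

def welfare(T, alpha=0.5, beta=0.0112, rho=0.05, A=35.0, B=1.0, c=20.0):
    """Total societal benefits W(T) of equation (2.13), evaluated at the firm's optimal R&D."""
    x0 = A - B * c                                     # output at the old cost level, x(c)
    disc = (1.0 - math.exp(-rho * T)) / rho            # discounted value of one unit per period over [0, T]
    rd = (alpha * beta * c * x0 * disc) ** (1.0 / (1.0 - alpha))   # equation (2.11)
    dc = beta * c * rd ** alpha                        # cost reduction c - c', from equation (2.7)
    profit = dc * x0                                   # per-period profit, equation (2.9)
    v = profit * disc - rd                             # patent holder's net present value, equation (2.10)
    delta = profit + 0.5 * B * dc ** 2                 # consumer surplus per period after expiry, equation (2.14)
    return v + delta * math.exp(-rho * T) / rho        # equation (2.13)

grid = [T / 10.0 for T in range(1, 1001)]              # patent lives from 0.1 to 100 years
T_best = max(grid, key=welfare)
print(round(T_best, 1), round(welfare(T_best), 1))

With these arbitrary numbers the welfare-maximizing patent life comes out at roughly forty years. The number itself is meaningless; the point is the shape traced out in Figure 5, with W first rising and then falling as T increases.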

We end our analysis by investigating how this ‘optimal patent length’ varies with some of the
parameters in the model, specifically the parameters B and β. The reason why this exercise is
interesting is that in economic reality, circumstances will differ between industries. Because the
exact value of the optimal patent length depends on the parameters in the model, one might expect
that optimal patent length will differ between sectors, if parameters such as B or β differ between
sectors.

B represents the sensitivity of product demand to changes in the price of the product (‘price
elasticity’). Note from Figure 1 that a demand curve that falls at a less steep angle, i.e., a large B,
leads to more consumer surplus. This can also be seen from equations (14) and (15). However,
for a given size of the innovation, increasing the value of B also decreases the value of profits per
period while the patent is in effect, as can be seen from equation (5). Thus, for larger B, an innovation
will yield more consumer surplus and lower profits. Combining both effects means that the share
of consumers in the total benefits increases, and hence the social planner can achieve a better
result by taking a measure that benefits the consumer, i.e., decrease the patent life. This is
depicted by the middle curve in Figure 5.

Parameter β represents the ease of innovation (higher values imply that less R&D is required to
deliver an innovation of given size). Equation (15) shows that a higher β leads to more consumer
surplus. The same holds for profits (equation 12). Hence, in order to generate an innovation of
given size, less incentive needs to be provided for the innovator in the form of a long patent. Thus,
higher β leads to shorter optimal patent duration.
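
The same grid search can be used to illustrate both comparative statics. The self-contained sketch below, which keeps the illustrative parameter values used earlier, recomputes the welfare-maximizing patent life for a higher B and for a higher β; in both cases the optimal duration becomes shorter, in line with the argument above.

import math

def best_patent_life(B, beta, alpha=0.5, rho=0.05, A=35.0, c=20.0):
    """Grid search for the welfare-maximizing patent life T (illustrative parameters)."""
    x0 = A - B * c                                     # output at the old cost level
    def welfare(T):
        disc = (1.0 - math.exp(-rho * T)) / rho
        rd = (alpha * beta * c * x0 * disc) ** (1.0 / (1.0 - alpha))    # equation (2.11)
        dc = beta * c * rd ** alpha                                     # cost reduction
        delta = dc * x0 + 0.5 * B * dc ** 2                             # equation (2.14)
        return dc * x0 * disc - rd + delta * math.exp(-rho * T) / rho   # equations (2.10) and (2.13)
    return max((T / 10.0 for T in range(1, 1001)), key=welfare)

print(best_patent_life(B=1.0, beta=0.0112))    # baseline
print(best_patent_life(B=1.5, beta=0.0112))    # more price-sensitive demand (larger B): shorter patent
print(best_patent_life(B=1.0, beta=0.0140))    # easier innovation (larger beta): shorter patent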

Concluding, there are three important lessons from Nordhaus’ model of optimal patent length.
First, the model shows that a patent, like all monopolies, involves welfare loss in the form of
consumer surplus that cannot be realized during the life of the patent (dead-weight loss). Second,
the model allows us, at least in theory, to calculate the optimal patent length, i.e., the patent
duration that maximizes total economic welfare. Even if we might not be able to calculate the
exact value of the optimal patent length in practice (because we don’t know the values of all the
parameters involved), the model teaches us that patents can be too long (or too short), and that
this may lead to welfare losses. Third, the model shows that the optimal patent length is not equal
between sectors. Sectors with high elasticity of demand or high innovation opportunities have
shorter optimal patent length.

2.3. Patent breadth

A main disadvantage of the instrument of patent length is the fact that it cannot be used to
influence the amount of dead-weight loss in each separate period. In Figures 1 and 2, the area
that represents the dead-weight loss is fixed in every period, and cannot be changed by changing
the patent life T. In this section, we will review a simple model of patent breadth, which shows
that breadth is an instrument that can be used to influence the relative (to profits) amount of dead-
weight loss. The model that we will discuss was originally proposed by Paul Klemperer, and later
refined by Theon van Dijk.

What do we mean by patent breadth? Most products come in different varieties. For example,
think about fruits. This product comes in many different varieties: apples, oranges, grapes, etc.
Patent breadth then refers to how many of those varieties of the basic product fall under the
protection of the patent. A broad patent would mean that somebody who patents an apple, would
also hold the patent to an orange.

From an economic point of view, what is crucial in the definition of patent breadth, is that each
one of the different varieties of the innovation that are captured by breadth is not inherently better
than the other varieties. An orange is not a better quality of fruit than an apple, although some
people prefer oranges to apples. This is what an economist calls horizontal product differentiation.

Of course, one can also imagine varieties of a product that differ in quality, such that it is possible
to rank them in terms of performance. This is, for example, the case with microprocessors. The
successive Intel microprocessors that we saw in chapter 1 when we reviewed Moore’s law, can
clearly be ranked in terms of performance. The economic terminology for such a case is vertical
product differentiation. We will discuss the case of vertical product differentiation in the next
section when we look at patent height.

Horizontal product differentiation can be represented by an address model. Each variety of the
product is denoted by an index w which lies somewhere on a line [0,1]. Note that we have a very
large (infinite) number of varieties on this line. Each of these varieties is preferred by a number
of people, and we make the assumption that the number of people preferring a single variety is
equal for all varieties. In technical terms, this means we assume a uniform distribution of
individual tastes for a variety w. The total number of consumers (the density of the distribution)
is scaled to 1. The most preferred variety is denoted by w*. Note that we also assume that all
consumers agree on the ranking of the varieties, i.e., two varieties that are next to each other are
considered as close substitutes by all consumers.

Each consumer can be represented by the following utility function:

U = \begin{cases} v - p^{\theta} - t\,d & \text{if the consumer buys} \\ 0 & \text{otherwise} \end{cases} \qquad (2.16)

U is utility derived from buying the good, v is the gross surplus that the consumer enjoys when
she consumes her most preferred variety of the good, p is the price paid, and θ is the price
elasticity of the good. The term td represents the loss in utility if the consumer consumes a variety
other than the most preferred one: d is the distance between the variety that is consumed and the
most preferred variety, measured as d = |w* - w|, and t is a parameter representing a utility penalty.
We also assume that a consumer buys one unit of the product at most.

Figure 6 displays this situation. On the horizontal axis are the different varieties w, ranging from
0 to 1. On the vertical axis, we measure utility derived from consuming. Note that the maximum
utility possible is equal to v, which occurs when the price for the good is 0, and the consumer buys
her most preferred variety.

Patent breadth becomes relevant to this situation if we assume that the good in question is an
innovation. We assume that the innovator (patent holder) invents variety 0, and only produces this
variety. The patent protects all varieties of the product in the range [0, b), i.e., no competition is
allowed in this interval. The parameter b represents patent breadth.

Figure 2.6. Welfare analysis for patent breadth

Outside the patented interval, i.e., on the range [b,1], it is assumed that perfect competition holds,
i.e., firms sell the product at marginal costs. The final assumption that is necessary before we can
analyze the impact of patent breadth on utility, concerns production costs. We assume that
marginal costs of production are equal for all varieties of the product, including variety 0 that is
produced by the patent holder. In order to keep the mathematics as simple as possible, we further
assume the marginal production costs are 0, and thus also that the price of products in the range
[b,1] is 0. Note that although this is a somewhat awkward assumption to make, it does not change
any of the basic conclusions of the analysis.11

In Figure 6, all varieties except the ones in the range [0, b) are supplied on the market. Thus, every
consumer will pick the most preferred variety, except the ones in this particular interval. These
unlucky consumers will have to choose between buying no variety at all, buying variety 0, or
buying variety b. Let us first investigate which consumer is indifferent between varieties 0 and b.
We do this by setting the utility derived from each of these two varieties equal to each other (note
that the price of variety b is equal to its marginal production costs, which are 0). We denote the
preferred variety of this consumer by w′:

v - p^{\theta} - t\,|w' - 0| = v - t\,|w' - b| \;\Rightarrow\; w' = \frac{b}{2} - \frac{p^{\theta}}{2t}. \qquad (2.17)

Every consumer to the right of w’ will prefer to buy variety b instead of variety 0, whereas
consumers to the left of w’ will prefer variety 0. Let us further assume that v is large enough to
induce every consumer to buy. Thus, for lower prices and for larger patent breadth, the
monopolist patent holder will sell more products. In fact, equation (17) is the demand function
faced by the monopolist. Hence, the profits for the monopolist are equal to the number of sold
products in equation (17) multiplied by p:
\Pi = p\, w' = \frac{p\, b}{2} - \frac{p^{\theta + 1}}{2t}. \qquad (2.18)

The monopolist’s optimal price can be found by differentiating the profit function with respect to
p and setting the result to 0:

0Πb (Q  1)p Q 0 < p b
1
Q, (2.19)
0p 2 2 Q1

where p* is the optimal price. Substituting the optimal price into the expression for w’ and into
the profit function yields the following:

w' = \frac{b\,\theta}{2(\theta + 1)}, \qquad \Pi = p^{*} w' = \left( \frac{t\, b}{\theta + 1} \right)^{\frac{1}{\theta}} \frac{b\,\theta}{2(\theta + 1)}. \qquad (2.20)

11
The mathematically skilled reader may want to check this by repeating the analysis below with
marginal production costs set to a fixed value c.

Just as in the case of the Nordhaus model of patent length, there is dead-weight welfare loss
associated with the monopoly patent. In this case, however, not all consumers face welfare loss.
Only the consumers in the interval [0, b) are affected. We calculate the welfare loss separately for
the consumers to the left and right of w’. We start with the consumers to the right.

In this case, the dead-weight welfare loss concerns only the costs td associated with consuming
a less preferred variety. Note that the closer the most preferred variety of these consumers is to
b, the lower the welfare loss is. In Figure 6, this is the triangle Λ1. The value of Λ1 can be
calculated as follows:

\Lambda_1 = \int_{w'}^{b} t\,(b - \omega)\, d\omega = \frac{t\, b^{2}\, (\theta + 2)^{2}}{8\,(\theta + 1)^{2}}. \qquad (2.21)

Consumers to the left of w′ also face costs td, but on top of this, these consumers also have a
welfare loss p^θ as compared to a perfectly competitive market (where price would be equal to
marginal costs, i.e., 0). The welfare loss of these consumers can thus be calculated as follows:

\int_0^{w'} \left( p^{*\theta} + t\,\omega \right) d\omega = \frac{t\, b^{2}\, \theta\, (4 + \theta)}{8\,(\theta + 1)^{2}}. \qquad (2.22)

Note that a part of the welfare loss for consumers to the left of w′ is passed onto the monopolist
patent holder in the form of above normal profits. This was also the case in the Nordhaus model
of patent length in the previous section. Total welfare loss for the consumers buying from the
patent holder can thus be represented in Figure 6 by the sum of the areas Λ2 + Π. Profits Π,
however, also need to be counted as a welfare gain, just as we did in the Nordhaus model above.
Thus, total welfare loss is equal to Λ1 + Λ2.
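
The welfare accounting can be made concrete with a small self-contained sketch. The values of t, θ and b below are arbitrary choices of ours, and the brute-force sums are only a numerical cross-check of the closed forms in equations (19)-(22).

t, theta, b = 1.0, 1.0, 0.4         # utility penalty, price exponent, patent breadth (all arbitrary)

p = (t * b / (theta + 1.0)) ** (1.0 / theta)        # monopolist's optimal price, equation (2.19)
w = b * theta / (2.0 * (theta + 1.0))               # indifferent consumer w', equation (2.20)
profit = p * w                                      # patent holder's profit, equation (2.20)

lam1 = t * b ** 2 * (theta + 2.0) ** 2 / (8.0 * (theta + 1.0) ** 2)          # equation (2.21)
loss_left = t * b ** 2 * theta * (4.0 + theta) / (8.0 * (theta + 1.0) ** 2)  # equation (2.22)
dwl = lam1 + (loss_left - profit)                   # total dead-weight loss (cf. equation (23) below)

# brute-force check of the two loss integrals behind equations (2.21) and (2.22)
n = 20000
right = sum(t * (b - (w + (i + 0.5) * (b - w) / n)) for i in range(n)) * (b - w) / n
left = sum(p ** theta + t * (i + 0.5) * w / n for i in range(n)) * w / n
print(round(dwl, 4), round(right + left - profit, 4), round(dwl / profit, 2))

For these values the closed forms and the direct sums coincide, and the ratio of total dead-weight loss to profits equals 2½, a number we will meet again below.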

As an intermezzo, let us briefly come back to the issue of procurement of invention by the
government. Suppose the government knows that the innovation that we have considered in our
model of patent breadth can be developed, and it also knows how much profits the innovation can
yield to a commercial firm (V) (note that we argued in the introduction that generally, the
government does not have this kind of information, which is why economists prefer a patent
system; but let us make the assumption anyway for the sake of the argument). In this case, the
government could pay a firm the amount V and procure the innovation. It could then put the
technological knowledge in the public domain, so that competitive firms would produce all
varieties in the interval [0, b]. Then the whole market would be served by competitive firms, and
there would be no welfare loss due to consuming a less preferred variety.

Still, the government would have to finance its outlays to procure the innovation (V). This could
be done through a lump-sum tax. A lump-sum tax is a tax that does not depend on any economic
characteristics of the agents (firms, consumers) that pay the tax. In this case, it would simply
divide V by the number of consumers, and have each consumer pay that amount in the form of
a lump-sum tax. In terms of Figure 6, total (static) welfare loss would then be Π, which is
considerably less than in the case of a patent system. This is an example of the magic of lump-sum
taxes, which are a much preferred way to think about taxation in microeconomic theory. A lump-
sum tax does not distort the (optimal) allocation of welfare to consumers, and is therefore an
efficient form of taxation.

Let us now go back to the start of the section, where it was argued that patent breadth can be
used as an instrument to influence the dead-weight loss during the period in which the patent is
in effect. To see how this is possible, we calculate the total welfare loss. Note that the result of
equation (22) is equal to Λ2 + Π. Then, using equations (20) - (22), we can derive:

\Lambda_1 + \Lambda_2 = \frac{t\, b^{2}\, (\theta+2)^{2}}{8\,(\theta+1)^{2}} + \frac{t\, b^{2}\, \theta\,(4+\theta)}{8\,(\theta+1)^{2}} - \left( \frac{t\, b}{\theta+1} \right)^{\frac{1}{\theta}} \frac{b\,\theta}{2\,(\theta+1)}. \qquad (2.23)

Now we can think about the problem of an optimal patent design in the following way. The goal
of a patent is to provide an incentive to the innovator in the form of a profit stream. But the bad
thing about the patent is that it leads to dead-weight welfare loss (Λ1 + Λ2). Thus, it would be nice
if we could find a patent design that gives us much value (profit stream) at little cost (dead-weight
loss). How does patent breadth help us to achieve this? Let us look at the ratio of dead-weight
loss to profits in order to answer this question. Use equations (20) and (23) to obtain the
following:

\frac{\Lambda_1 + \Lambda_2}{\Pi} = \left[ \frac{(\theta+2)^{2}}{4\,\theta} + \frac{4+\theta}{4} \right] \left( \frac{t\, b}{\theta+1} \right)^{\frac{\theta-1}{\theta}} - 1. \qquad (2.24)

This expression looks rather complicated, but its qualitative behaviour reduces nicely to three
cases, which are depicted in Figure 7. For θ=1, the ratio of dead-weight loss to profits is constant
(equal to 2½), which means that for this special case, profits and dead-weight loss are equally
affected by a change in patent breadth. For θ>1, the ratio of dead-weight loss to profits increases
with b, and for θ<1, this ratio decreases with b.
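
These three cases are easy to verify numerically from equation (24). In the sketch below, the value of t and the grids of b and θ are arbitrary, and the function name is ours; the printed ratios are constant at 2½ for θ = 1, rise with b for θ > 1, and fall with b for θ < 1.

def dwl_to_profit_ratio(b, theta, t=1.0):
    """Dead-weight loss over profits from equation (2.24) for breadth b (illustrative parameters)."""
    coef = (theta + 2.0) ** 2 / (4.0 * theta) + (4.0 + theta) / 4.0
    return coef * (t * b / (theta + 1.0)) ** ((theta - 1.0) / theta) - 1.0

for theta in (0.5, 1.0, 2.0):
    print(theta, [round(dwl_to_profit_ratio(b, theta), 2) for b in (0.3, 0.6, 0.9)])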

Figure 2.7. The ratio of dead-weight loss to profits for varying patent breadth

Thus, again, we find that sectoral parameters, such as the elasticity of demand, have an impact
on optimal patent design. For goods where the elasticity of demand (3) is high, broad patents are
relatively bad, because they sacrifice a large amount of consumer welfare in order to obtain a
given incentive in the form of profits. For goods for which the elasticity of demand is low, the
same holds for narrow patents.

2.4. Patent width

So far, we have only looked at single inventions, and left so-called follow-up inventions out of
the picture. Because these follow-up inventions are an important aspect of technological change,
we will now shift our attention to a dynamic perspective, where innovations will follow on each
other. In this section, focus will be on patent width, while the next section will take into account
patent height in relation to a follow-up invention.

Patent width refers to the range of applications for which the patent holds. Many inventions may
be applied in different markets, but it is often not very clear at the outset which markets these are.
Because of the systems nature of many innovations, applications of inventions often come as a
surprise to the original inventor. If we were to leave aside, for a moment, the strong uncertainty
associated with this, it would be possible to think about patent width in terms of several
markets, each of which can be represented by a model of patent breadth or patent length as
reviewed above. In this case, there would not be much to add to the analysis of patent width from
an economic point of view.

However, if we do take into account strong uncertainty, the picture changes. Strong uncertainty
means that the innovator does not have a clear idea of how exactly the innovation is going to be
applied, and, hence, neither does she have a good idea of the expected pay-offs. At the very best,
the inventors may have some sense of a large fortune that may be earned with the knowledge they
possess, but how exactly this fortune can be made remains obscure.

As an example, consider the recent developments in describing the human genome. The main issue
in this project is to describe the sequence of genetic code in human cells. It is believed that such
a description may open the way to identifying the function of various genes, and hence to discover
their possible role in certain human diseases. Once such knowledge exists, it would be possible
to find cures for these diseases based on the genetic information (e.g., by eliminating the genetic
‘deficits’ that lead to the disease).

The main technical problem that exists in mapping genetic code, is that reading of the code can
only be done for small pieces of code (up to around 500 characters). There are two approaches
that are used for mapping the human genetic code. The traditional one is a lengthy process that
proceeds by cutting the long string of genetic code into ever-smaller parts, until, ultimately, a
piece results that can be read. The main disadvantage of this method is that it takes a long time
to cut the genetic string at the right places. The combined forces of several of the most prestigious
academic institutions in the world set out to undertake this project around 1990 and expected
to finish by 2005.

An alternative approach has been developed by Craig Venter and his commercial firm Celera.
Venter has proceeded by cutting a number of copies of an identical genetic string randomly into
small parts. After the cutting has taken place, it is not exactly known where the pieces fit into the
large string. However, by letting a computer determine the overlap of the pieces from the ten
distinct copies of the string, one can get a good idea of how the pieces of the puzzle fit together.
This approach had successfully been used on genetic code of simple organisms, such as bacteria.

When Celera proceeded to use the technique on the human genome, it was met with skepticism.
Still, the firm largely succeeded in speeding up the date at which the human genome will be
mapped completely (now set at 2003 instead of 2005).

The entrance of a private firm in the human genome scene has spurred public anxiety about the
patentability of human genes. A large range of arguments has been used against it, using ethical
(is it reasonable to patent the basics of human life?), legal (is it possible to patent a discovery of
something that has been around for a long time?) as well as economic motives. From an economic
point of view, the main question is about patent width. One extreme view would be to allow a
patent on a (human) gene, irrespective of the use of the genetic knowledge. This would
appropriate all possible future applications that explicitly use the genetic knowledge, such as a
wide range of different cures for diseases that are in some way related to the described gene.

Such a patent policy would open up the way for so-called prospecting behaviour, for example in
the form of firms racing to map the human genome and patent every little piece of genetic
information that is unveiled. This is called prospecting because the firm does not have a clear
expectation of the applications, but does have the feeling that it is sitting on a goldmine, and
wants to cash in on this by patenting the information it discovers.

Allowing such prospecting behaviour runs a large risk of allowing the firm the legal monopoly to
an application that could turn out to be extremely important and valuable in the future. Do the
research efforts of that firm warrant such a strong claim, even if its exact nature is unclear at this
point in time? In the specific example probably not, because the disadvantages of such a monopoly
are likely to outweigh the cost of having the world wait a few more years for the public human
genome project to finish and provide the same information.

In other words, in the case where the pay-offs of research exhibit strong uncertainty, the
instruments of patronage (university or public research funding) and procurement (pay a
consortium of universities and research institutes to map the human genome) seem reasonable
alternatives because they allow for greater spillovers than the patent system. Once a firm finds a
specific application for a piece of genetic information, the patent system can be used to allow a
monopoly for that specific application.

In summary, the notion of strong uncertainty seems to lead to the conclusion that patents should
not be too wide, unless the patent applicant can describe the specific applications in a detailed
way. This is exactly the point of view that the European Union took with regard to the
patentability of genetic information. The European law specifies that genetic information can only
be patented as a part of a wider description of an application of that genetic information.
Moreover, it is only this application that is patentable, and other, future uses of the same genetic
information are excluded from the patent.

2.5. Patent height: imitation and ‘inventing around’

The final dimension of patent protection that we distinguish is patent height, i.e., the minimum
inventive step that makes an invention eligible for patenting. In order to assess the impact of
patent height, we will modify slightly the Klemperer-van Dijk model of patent breadth. The main
modification lies in the utility function, which was equation (16) above. In the case of patent
breadth, we were using a model of horizontal product differentiation, which means that there are
multiple varieties of a product, but no variety is inherently better than other varieties. Instead,
some consumers prefer one variety, while others prefer another variety.

The model that we use to analyze patent height is one of vertical product differentiation, which
means that there is some quality ranking of products. Thus, instead of varieties of a product, we
will speak about qualities of a product. All consumers would prefer to have a high quality product
instead of a low quality. Consumers do differ, however, with respect to the degree they are willing
to pay for quality. This can be represented by the following utility function:
U = \begin{cases} v + m f - p & \text{if the consumer buys} \\ 0 & \text{otherwise} \end{cases} \qquad (2.25)

This utility function differs from equation (16) in several respects. First, the parameter θ is set to
1, which is done for mathematical convenience. Next, the scenario of a most preferred variety and
a utility penalty for consuming other varieties (v − td) has been replaced by a different scenario
(v + mf). The parameter v still represents (autonomous) utility associated with consuming any
quality of the product. The variable f represents product quality: higher quality goods have a
higher value for f. The parameter m, which is different for each consumer, represents the
willingness-to-pay for quality of each consumer. Just as we assumed a distribution of consumers
over most preferred variety w* in the case of patent breadth, we now assume a distribution of
consumers over different values of m. In fact, we use the same uniform distribution as in section
2.3, which means that the number of consumers with a specific value of m is equal for all such
values. The density of this uniform distribution is once again scaled to 1, and it spans the interval
[0,1].

In order to address the issue of ‘inventing around’, we will take a leap into the theme of chapter
4: competition. The situation which we will analyze is one in which one firm (‘the innovator’) has
developed an innovation. The innovation is denoted by f = 0, and the innovator has the choice
whether or not to undertake further R&D in order to improve the quality level of the product
(achieve higher f). However, the innovator is challenged in the marketplace by an imitator, which
quickly adopts the patented innovation, and is also able to improve the basic product by
performing R&D. We will denote the (ultimate) quality of the innovator by f1, and assume that
the imitator will put on the market a version of the product with better quality (f2 > f1). However,
the patent law does not allow close imitation, but instead requires that the imitator makes a certain
minimal inventive step as compared to the innovator. Patent height (the minimal inventive step)
then refers to the parameter h, which defines the interval [f1, f1 + h), in which f2 is not allowed.

The question we pose is how patent height (h) will affect the behaviour of the original innovator
and the imitator. Because the behaviour of the two innovators will influence each other, we will
have to use game theory to analyze the situation. Let us start by defining the consumer that is
indifferent between the two qualities f1 and f2. We denote this consumer by m’:
m' f_1 - p_1 = m' f_2 - p_2 \;\Rightarrow\; m' = \frac{p_2 - p_1}{f_2 - f_1}, \qquad (2.26)

where we use the subscripts 1 and 2 on p in the same manner as previously defined for f. All
consumers with m < m’ will opt for the original product, and hence buy from the innovator. All
other consumers will buy from the imitator. This gives the following demand functions (remember
the total number of consumers is scaled to 1):
x_1 = \frac{p_2 - p_1}{f_2 - f_1}, \qquad x_2 = 1 - \frac{p_2 - p_1}{f_2 - f_1}. \qquad (2.27)

Once again, we apply the assumption of zero marginal production costs, so that the profit
functions can be obtained by multiplying the demand functions with the respective prices:
\Pi_1 = p_1\, \frac{p_2 - p_1}{f_2 - f_1}, \qquad \Pi_2 = p_2 - p_2\, \frac{p_2 - p_1}{f_2 - f_1}. \qquad (2.28)

Differentiating these with respect to p and setting the result to 0 gives us the optimal prices for
the two firms:
\frac{\partial \Pi_1}{\partial p_1} = \frac{p_2 - 2 p_1}{f_2 - f_1} = 0 \;\Rightarrow\; p_1 = \frac{1}{2}\, p_2, \qquad \frac{\partial \Pi_2}{\partial p_2} = 1 - \frac{2 p_2 - p_1}{f_2 - f_1} = 0 \;\Rightarrow\; p_2 = \frac{1}{2}\,(p_1 + f_2 - f_1). \qquad (2.29)

These equations show that the optimal price of each of the two firms depends on the price of the
other firm. In order to solve for the two prices, we apply the game theoretical concept of Nash
equilibrium. A Nash equilibrium is a situation in which neither of the two participants in the
‘game’ (the two firms) has an incentive to change its strategy (in this case, the price).

Figure 8 shows the graphical derivation of the Nash equilibrium. The two curves in the figure are
the so-called reaction curves. These show how one firm chooses its price level as a function of the
price level of the other firm. Thus, the reaction curves are simply the equations (29). The Nash
equilibrium occurs where the two curves intersect. To see why this is the case, imagine what
happens if we start from a price p1′ that lies to the right of the intersection point. This price will
invoke a price p2′ from the imitator, which will invoke p1″, which will invoke p2″, etc., until we
reach the equilibrium and no price reactions occur anymore.12

The Nash equilibrium can be found analytically by solving the two equations (29) for the two
prices, which yields:

12
Note that the particular Nash equilibrium in this case is stable. It would not be hard to imagine a
situation with different slopes of the two reaction curves that would lead to price oscillations away from the
equilibrium. This would be an unstable Nash equilibrium.

p_1 = \frac{f_2 - f_1}{3}, \qquad p_2 = 2\, \frac{f_2 - f_1}{3}. \qquad (2.30)

Figure 2.8. Nash equilibrium for the pricing game of the innovator (firm 1) and imitator (firm 2)
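
The convergence argument can be mimicked in a few lines of Python. Starting from arbitrary prices and applying the two reaction functions of equation (29) in turn drives the prices to the values of equation (30); the quality gap f2 − f1 and the starting prices below are arbitrary choices of ours.

delta_f = 0.3            # assumed quality gap f2 - f1 (arbitrary)
p1, p2 = 0.9, 0.1        # arbitrary starting prices
for _ in range(50):      # apply the two reaction functions of equation (2.29) in turn
    p1 = p2 / 2.0
    p2 = (p1 + delta_f) / 2.0
print(round(p1, 4), round(p2, 4))                              # the prices settle at the Nash equilibrium
print(round(delta_f / 3.0, 4), round(2.0 * delta_f / 3.0, 4))  # closed-form solution of equation (2.30)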

In order to examine the full effect of patent height h, we need a final step in the analysis that takes
into account R&D costs. For this, we go back to equation (7) in the Nordhaus model. We will
adopt a modified form to take into account product innovation instead of process innovation.
Moreover, for mathematical convenience, we will set α=1/2. Then (note that we will write R&D
costs as a function of product quality, rather than the other way around):

R(f) = \frac{f^{2}}{\beta}. \qquad (2.31)

In order to determine the product quality levels f1 and f2, we will first analyze a situation that does
not take account of patent height h, and then check afterwards how patent height influences
competition between the two firms. We start by writing the net profit functions of the two firms.
These can be obtained by subtracting R&D costs from gross profits resulting from the sales of the
products:
V_1 = \frac{f_2 - f_1}{9} - \frac{f_1^{2}}{\beta}, \qquad V_2 = 4\, \frac{f_2 - f_1}{9} - \frac{f_2^{2}}{\beta}, \qquad (2.32)

where V is net profits. Each firm maximizes net profits by differentiating V with respect to f and
setting the result to zero, under the restriction that f ≥ 0. This yields:

\frac{\partial V_1}{\partial f_1} = -\frac{1}{9} - \frac{2 f_1}{\beta} < 0 \;\Rightarrow\; f_1 = 0, \qquad \frac{\partial V_2}{\partial f_2} = \frac{4}{9} - \frac{2 f_2}{\beta} = 0 \;\Rightarrow\; f_2 = \frac{2\beta}{9}. \qquad (2.33)

Note that the solution for f1 is a corner solution. Thus, the patent holder chooses quality level 0,
which does not require it to perform any additional R&D, and the imitator improves this,
depending on the ease-of-innovation parameter β.

Now we are ready to evaluate the impact of patent height. There are three situations that can
arise. The first is when h ≤ 2β/9. In this case, patent height does not influence the natural choices
of either the innovator or the imitator. Patent height is so low in this case, that it does not become
restrictive. The protection height does not have an impact on the market power of the patent
holder.

The second situation is when patent height h > 4β/9. In this case, V2 becomes negative, as can
easily be seen from equation (32) (remember f1 = 0). The imitator decides not to enter the market,
and the patent holder becomes a monopolist in the market. However, its behaviour is still
influenced by the threat of entry of the imitator.

The final situation is when patent height is intermediate, i.e., 2β/9 < h < 4β/9. In this case, the
first-best choice of the imitating firm (in equation 33) is ruled out because the patent awarded to
the innovator is too high. The best strategy for the imitator is then to pick f2 = h. Obviously, this
leads to smaller profits than the first case (where patents were very low). Thus, it can be
concluded that high patents make imitation relatively costly.
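
A short sketch makes the three regimes explicit. The value of β and the three patent heights below are arbitrary; with β = 0.9 the two thresholds 2β/9 and 4β/9 equal 0.2 and 0.4, and the code reports the imitator's quality choice and net profit in each regime (None meaning that the imitator stays out of the market).

beta = 0.9                                  # ease-of-innovation parameter (arbitrary value)
f1 = 0.0                                    # innovator's corner solution, equation (2.33)
f2_preferred = 2.0 * beta / 9.0             # imitator's unconstrained choice, equation (2.33)

def imitator(h):
    """Imitator's quality and net profit when the patent covers the interval [f1, f1 + h)."""
    f2 = max(f2_preferred, f1 + h)          # the preferred quality may be blocked by the patent
    v2 = 4.0 * (f2 - f1) / 9.0 - f2 ** 2 / beta    # equation (2.32)
    return (None, 0.0) if v2 < 0 else (round(f2, 3), round(v2, 4))

for h in (0.1, 0.3, 0.5):                   # below 2*beta/9, in between, above 4*beta/9
    print(h, imitator(h))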

2.6. Patent design: conclusions

We have distinguished four aspects of patent design: patent length (time period during which the
patent lasts), patent breadth (the number of horizontally differentiated product varieties that fall
under the patent), patent width (the number of industries to which the patent applies) and patent
height (the minimal inventive step necessary for a patent). All patents, irrespective of their design,
lead to a certain degree of welfare loss, but by choosing a good patent design, this welfare loss
can be minimized.

A single dimension of patent design can be used for this purpose. For example, the Nordhaus
model shows us that a patent should not be too long, otherwise it will lead to large (static) welfare
losses. Similarly, broad patents lead to more welfare loss than narrow patents. The two
instruments may also be used in combination. For example, when demand is relatively price
elastic, narrow patents will be more efficient than broad patents. The reason for this is that in this
case, the welfare loss per period under a broad patent is large as compared to the profits for the
patent holder. Thus, a narrow patent may be combined with a relatively long duration. Conversely,
when price elasticity is low, a broad and short patent may be the better solution.

In practice, however, patent systems do not allow for much variation in breadth or length. Patent
length is typically fixed by law to a period around 20 years. In some countries, such as the United
States, longer periods may apply for pharmaceuticals, but for other products, patent length is
fixed. Patent breadth is often a matter of jurisprudence, because it is hard to lay down breadth in
written rules given the large technological diversity across fields.

2.7. Patenting strategies: beyond mere protection

So far, we have presented the patent system mainly as an instrument that can be used by firms to
protect the R&D-investment that was made to develop a novel piece of knowledge. In this
section, we will review some of the strategies applied by firms that use the patent system in a
much more pro-active way than just seeking ex post protection of R&D efforts. For example, in
the model of patent height in section 2.5, the innovator firm, when given sufficient latitude in
terms of patent height, was able to keep out a competing firm completely from the industry, or
at least make it much less attractive for that firm to enter, even if the imitator firm was willing to
undertake R&D and improve upon the first innovation. A firm may thus attempt to block certain
fields for other firms, and in the process, the direction of its own R&D may be influenced by such
strategic considerations.

Before reviewing the theory on patent strategies, let us illustrate this phenomenon with an
example. The case is the one of an American company named Fusion, which produces and sells
high-intensity ultraviolet lamps.13 The technology Fusion employs goes back to the early 1970s,
when the five founders of the company obtained a breakthrough in gaseous lamps. The working
principle of most gaseous lamps is to feed electricity to the gas by means of metal electrodes. An
experimental alternative technology in the early 1970s was to use microwave energy to feed the
electricity to the gas. The state-of-the-art in the field was to use low power (1-5 watts) to do this.
Fusion’s innovation was to use high power (3000 watts). This, in contrast to the low power
microwave lamps, led to a commercially viable product, and Fusion was able to launch its product
successfully in the market. The first application was in the field of beer. Fusion’s lamps were sold
to the Coors brewing company, which used the lamps to dry the ink used for decoration on its
beer cans. From there, Fusion’s lamps were used in the automotive industry, silicon wafers
production (for chips), solar film, medical equipment, printing plates and printed circuits boards.

In 1975, Fusion entered the Japanese market by selling through Japanese distributors. For this
purpose, it filed patents in Japan, as it had done in other countries before. In 1977, the large
Japanese conglomerate firm Mitsubishi bought a single Fusion lamp, and shipped it to its R&D
laboratory. In the same year, Mitsubishi filed nearly 300 patents on high-intensity microwave
lamps. According to the CEO of Fusion, Donald M. Spero, the Mitsubishi patents were in fact
“scores of unworthy patents surrounding the core technology of [Fusion]” (p. 60). He argues that
there were three categories of patents filed by Mitsubishi: patents that copied Fusion’s
technology, patents that consisted of knowledge that Fusion considered to be in the public
domain, and, finally, patents that were small variations on Fusion’s technology.

13 The description of the case is based on a paper by the CEO of Fusion in the Harvard Business
Review (D. M. Spero, ‘Patent protection or Piracy - A CEO views Japan’, September-October 1990, pp. 58-67).
Because the main adversary of Fusion, Mitsubishi, has not been consulted, the case description may be biased.

After 1983, when Fusion had had time to evaluate the Mitsubishi patents, the company realized
that it would face serious trouble selling its products in the Japanese market if the Mitsubishi
patents were to be granted. It was as if the large number of patents by Mitsubishi had surrounded
Fusion’s technology, and the company could not produce without infringing parts of the
Mitsubishi patents. Thus, the company entered into negotiations with Mitsubishi. The main
advantage Mitsubishi had in this process was the small size of its adversary. For a relatively small
company such as Fusion, it would be very costly to fight the Mitsubishi patents in court. The
prospect of fighting a long legal battle in a foreign country whose legal system differs significantly
from that of the United States was so unattractive to Fusion that it was willing to make
concessions to Mitsubishi.

In the negotiations, Mitsubishi was steering towards a cross-license of patents, thus using its
patent flood to obtain the core technology of its competitor. Fusion, however, did not want to
give in, and kept the option of fighting the Mitsubishi patents in court alive during the whole
process. In addition, it engaged the American government, which at the time was trying to open
up the Japanese market for US firms. With diplomatic power used on both sides, the parties came
close to an agreement several times. However, they went into the 1990s without a clear resolution
of the battle, and both Fusion and Mitsubishi kept on selling their products on the Japanese
market.

We can conclude from this example that patents may be used for purposes that go well beyond
mere protection of a firm’s own technology. In the gaseous lamps case, Mitsubishi used the legal
power of a large number of patents to try to obtain access to the core technology of a competitor.
When doing this, it blocked the R&D process of that competitor by its own patents. The
Mitsubishi strategy is just one example of a number of strategies that may be used when filing
patents. What all these strategies have in common is that they are not primarily aimed at
protecting an innovation that comes out of an independent R&D process, but rather at influencing
the market or technological power of a competitor.

A theory of firms’ strategic patenting behaviour was developed by Ove Granstrand. He makes a
distinction between five major patenting strategies, all of which take into account technological
capabilities of the main competitors of the firm, and all of which will lead to changes with regard
to the R&D path taken by the firm. The strategies are all conceptualized in an abstract technology
space, which can be thought of as a three dimensional landscape, in which the differences in height
correspond to difficulties in the R&D process. High peaks correspond to high-cost and -effort
R&D places, valleys represent pieces of knowledge that are easily obtained. The two dimensions
in the plane correspond to some abstract technological phenomenon, indicating that there are
different routes (straight line, or detours) towards a single piece of knowledge as indicated by a
spot on the map. Figure 9 gives a graphical impression of the different strategies, to which we will
now briefly turn.

The most basic strategy is simple blocking. In this case, one firm takes out a patent that obstructs
the natural way in which another company was moving with its R&D. This strategy often comes
about in an ad hoc way, when the blocking company realizes that by marginally changing its own
R&D process, it can block efforts by a competitor. Often, however, the competitor will be able
to move around the blocking patent (‘inventing around’) without much additional cost.

Figure 2.9. Patenting strategies in technology space (adapted from
Granstrand)

A stronger strategy results when inventing around a blocking patent is very costly, as is the case
with strategic patenting. In this case, the blocking patent is carefully situated in a ‘valley’ of low
R&D costs surrounded by trajectories that require large investment to invent around. The holder
of such a strategic patent will have a very strong position when it comes to licensing, and it will
often be able to dictate its terms (either in terms of fees or cross-licensing) to companies that are
interested in obtaining the blocking patent.

A special case of strategic patent arises when patents are incorporated into a technical standard.
This is often the case in telecommunications. The successful mobile telecommunications standard
GSM, for example, contains around 140 essential patents. Any company that wants to produce
equipment for the GSM standard must use the knowledge described in these essential patents,
because there is no way to engineer things in a different way and still adhere to the standard.
Naturally, such essential patents are the nightmare of standardization bodies such as the ETSI
(which administers the GSM standard). Therefore, companies contributing to the standard are
usually required to license out their essential patents on ‘reasonable and fair terms’. Still, in the
case of GSM, the Motorola company was able to obtain a large number of essential patents and
as a result largely dictate its licensing terms to other companies. This, among other things, led by
and large to the exclusion of Japanese companies from the GSM market.

The third strategy is (partly) what was applied by Mitsubishi against the Fusion company. This
is the blanketing or flooding strategy. In this case, a large number of patents is used to turn an area of
technology into a minefield. This strategy is often used in emerging technical fields, where uncertainty about future
directions of R&D is large. By taking out a large number of patents, the company aims for a lucky
shot of at least one or a couple of patents. Often, the patents in such a blanket or flood contain
inventions that are minor from a technical point of view, and engineers refer to them as ‘petty
patents’, ‘junk patents’ or ‘nuisance patents’ (recall that the Fusion CEO also talked about the
Mitsubishi patents in this way). This, however, does not preclude that these minor patents may
be important from an economic point of view.

The fourth strategy is called fencing. This is the strategy of carefully choosing a range of patents
that together block a specific line of R&D. Such a strategy is especially useful when one may
reach a similar result in many different ways, such as in the case of a chemical compound that
allows relatively large variations in terms of exact molecular design.

Surrounding is the final strategy. Mitsubishi seemed to be using this strategy in combination with
flooding against Fusion. With surrounding, the company tries to take out patents that are all very
close to the core technology of a competitor. If the competitor has not taken care to seal off all
technology it needs in its own patents, it will often find that it cannot produce without infringing
a number of the surrounding patents. Such a strategy can often be used successfully in the case
of a basic invention that can be applied in many different fields. The company that patents the
basic invention will probably have paid more attention to being first to patent, than to describing
and developing all possible applications. This obviously leaves space for a fast second mover to
apply patent surrounding.

References to the original works described in this chapter

Granstrand, O. (1999). The economics and management of intellectual property. Towards intellectual
capitalism. Aldershot, Edward Elgar Publishing.
Klemperer, P. (1990). “How broad should the scope of patent protection be?” RAND Journal of
Economics 21: 113-130.
Nordhaus, W. D. (1969). Invention, Growth and Welfare. A Theoretical Treatment of Technological
Change. Cambridge MA, MIT Press.
Van Dijk, T. (1994). The Limits of Patent Protection. Essays on the Economics of Intellectual Property
Rights. Faculty of Economics. Maastricht, University of Limburg.

Chapter 3.
Technological Revolutions and Economic Development
1. Introduction

We have now seen a number of the basic mechanisms of interaction between technological change
and the economy (in chapter 1), and a few of the (microeconomic) mechanisms that can be
identified when dealing with the peculiarities of technological change as an economic factor
(chapter 2). But so far we have not been able to provide more than the illustration of a few
general issues by pointing to a number of scattered examples. A comprehensive and systematic
picture of the different aspects of the long-run impact of technological change on the economy is still
largely lacking. In order to provide such a picture, we need to take a broad perspective and indeed
go into some of the historical detail, both from the point of view of economic history and the
history of technological change.

The ideal theoretical framework in which to undertake such a venture is the Schumpeterian theory
of economic growth. This theory is named after the Austrian economist Joseph Schumpeter (1883
- 1950). Schumpeter was born in the same year as Karl Marx (another great theorist of technical
change) died, and in many ways he was indeed the counterpart of the revolutionary writer of the
19th century. Schumpeter was a right-wing politician as well as a scientist, and he served as a
minister of finance in the Austrian government. Although his (political) views were in many
respects contrary to those of Marx, he had some sympathy for the Marxian way of analysis. The
way in which he views the long-run process of economic growth indeed has some parallels to the
way in which Marx describes this process.

Schumpeter provides a view on the relationship between technological change and economic
growth that spans decades, and hence his theory is the perfect vehicle for the systematic and
comprehensive picture that we want to obtain in this chapter. The Schumpeterian tradition has
been followed up by a number of economists after Schumpeter’s death, and they have greatly
elaborated the original theory. We will review Schumpeter’s original theory and some extensions
made to this by followers in the next section. In section 3, we will apply this theory by means of
a summary of the historical analysis provided by Christopher Freeman and collaborators. This
section will cover two and a half centuries of technological developments and the economic
changes they brought about. The final section of this chapter will summarize some of the
implications of the Schumpeterian body of theory for our understanding of long-run economic
growth and technological change.

2. Technological Revolutions: Schumpeterian Theory

Schumpeter’s theory of economic growth and technology is all about fluctuations in the rate of
economic growth. The theory says that these fluctuations will span a very long time period, i.e.,
up to 60 years. Such fluctuations are known in the economics literature as long waves, or
Kondratiev waves, after the Russian economist who worked on long-run fluctuations in prices in
the 1920s. The occurrence of fluctuations in the rate of economic growth means that periods of
relatively high growth rates will alternate with periods of low growth rates. In a very stylized way,
one may depict this process by the succession of four phases, as in Figure 1.

Figure 3.1. Long waves of economic growth

The first phase is called the upswing, and is one of increasing growth rates. At a certain point in
time, the increasing growth rates will settle down at a high level. This phase is called the
prosperity phase. When growth rates start to slow down again, the process enters the recession
phase, until finally the depression period sets in (when growth settles at a relatively low level).
Such a cyclical pattern may be superimposed on a long-run trend of constant growth rates, as in
the bottom part of the figure. In the case of long waves, the whole process from upswing to
depression spans a period of 40-60 years. However, fluctuations of shorter duration are also
found in the data on economic growth. So-called Kuznets cycles, for example, span a period of
roughly 20 years, while Juglar cycles take about a decade and Kitchin cycles only a few years.

The existence of long waves is a heavily debated issue in the economics literature, and many
economists are in fact very skeptical about the issue. Schumpeter, as well as his more recent
followers, strongly believed in the Kondratiev cycles. A major difficulty in finding evidence for
the hypothesis of long waves lies in the fact that most time series for economic growth date no
further back than the 1870s, i.e., span two long waves at most. With such a limited number of
observations, it is hard to make a very strong case that long waves are in fact a regularly re-
occurring phenomenon.

What is more important for the present purpose than the issue of whether or not long waves exist,
is the question what may drive them, and what role technology may play in the long waves.
Schumpeter’s theory was about the role of major, or basic, innovations, such as those described
in Table 1 of chapter 1, in driving long waves. Even if one is not willing to believe in a strictly
regular cyclical pattern of long waves, Schumpeter’s theory provides a useful starting point for
the analysis of the economic impact of such basic innovations.

The starting point of the Schumpeterian wave is the occurrence of one or a number of
(interrelated) basic innovation(s). Often, such a set of basic innovations will in one or another
way be related to each other, and then the notion of a technological paradigm, as discussed in
chapter 1, becomes a possible interpretation of Schumpeter’s ideas. These basic innovations
provide the opportunity for increasing growth rates, i.e., for the upswing of a new long wave to set in.

In Schumpeter’s original point of view, the basic innovations were introduced by a special class
of business men he called entrepreneurs. The entrepreneur is an especially visionary business man,
who recognizes the commercial opportunities of the basic innovations at a time when other
business men, or possible consumers of the products associated to the basic innovations, are still
in the dark with respect to the new possibilities. The entrepreneur is also especially skilled in terms
of running the type of business that is needed to make the basic innovations into a success, or in
the art of invention, or both. One may also think of partnerships of business men (managers) and
inventors, jointly representing the entrepreneur.

Later in his life, Schumpeter started to put less emphasis on such personal characteristics of the
entrepreneur and their importance for basic innovations. As will be explained in more detail
below, and in the next chapter, this was the result of changes going on in the economy during the
period in which Schumpeter was most active in terms of his professional activities. The role of
the entrepreneur in the innovation process slowly started to be taken over by large firms, which
were much less dependent on the personalities of their managers than had been the case in the
past. However, no matter whether the source of basic innovations is a single entrepreneur or a
large firm, the role of basic innovations for the process of economic growth remains largely the
same.

The commercial opportunities of the basic innovations unleash great commercial potential, and
hence attract a swarm of imitators. This is why such radical innovations “are not evenly distributed
in time, but that on the contrary they tend to cluster, to come about in bunches, simply because
first some, and then most firms follow in the wake of successful innovation” (Schumpeter, 1939,
p. 75). Such an imitation and diffusion process does not consist, however, of mere copying of
the original innovation, but rather takes the form of ever more incremental improvements, as
described in chapter 1. There is a bandwagon of such imitations taking place during the early
phase after the immediate introduction of the innovation. The bandwagon of imitations leads to
higher growth rates, i.e., takes the economy into the upswing, because the imitators are able to
expand their activities while slowly pushing the radically new technologies into the economy. A
multiplier process sets in because this expansion requires investment in capital and workers.

The upswing is not only a process of expansion, however, because productive capital that is
specific to the old technology can no longer be used. This includes machinery and equipment
installed in factories, skills and experience locked up in human capital, or infrastructural capital
used for transportation or energy distribution. Firms that try to hold on completely to the old
technology will have increasing problems in surviving in the market, and may eventually be forced
to choose between adopting the new technology and going out of business. Schumpeter uses the term
creative destruction to illustrate the dual nature of expansion and substitution that takes place
during the upswing phase.

The bandwagon of imitation makes the upswing happen but also implies that profit rates of the
firms pushing the new technology will gradually be eroded. Initially, when there is only one or a
few firms that use the new innovations, profit rates are high due to the large technological
opportunities and absence of competition. But when the bandwagon grows, technological
opportunities gradually become smaller (when the easiest incremental improvements have already
been applied and only the harder-to-achieve improvements remain). The entrance of more and
more firms on the bandwagon at the same time increases the level of competition, and this drives
down the profit rate. Eventually, this will lead to profit rates and overall growth rates settling
down at the high level of the prosperity phase.

Competition and the further erosion of technological opportunities do not stop in the prosperity
phase, however. The continuation of these processes eventually leads to a decline in the growth
rates. The recession sets in. During the recession, the technology can be considered as mature,
and competition between firms mainly takes the form of (intense) price competition. The diffusion
of the basic innovations gets saturated at this stage, when all potential users have adopted the
technology. Hence, markets are no longer expanding and depend to a large extent on replacement
of old and defective products.

The recession turns into a depression when saturation gets almost complete, and the intensity of
price competition reaches a peak. Technological opportunities for further improvements of the
technological paradigm have dried up completely at this stage. The economy approaches a zero
profit level, and the need for a new set of basic innovations becomes very high. This is when the
process starts all over again with the next wave of basic innovations that may lead the economy
into a new upswing.

This description of the rise and fall of a set of basic innovations is called the primary cycle by
Schumpeter. The primary cycle may also be re-enforced by a secondary cycle that is to a large
extent driven by investment in financial assets. During the early upswing, (stock market) investors
get optimistic about the new technology and are willing to take more risk when investing in the
new companies. Such investment facilitates the expansion of the new technology and the
companies that have adopted it. However, the large degree of uncertainty associated with the new technology may also lead to failures of firms, so that investment in the stock market becomes highly risky. This risk is not perceived by all stock market speculators, because of the generally optimistic mood of the booming times. In the same way that such speculative bubbles may
facilitate the upswing, they may aggravate recessions and depressions, when sentiments become
over-pessimistic. As will be described below, history actually provides a number of examples of
such mechanisms driving a secondary wave.

One basic question that can be asked with regard to the original Schumpeterian theory of basic innovations and long waves concerns the timing of the swarms of innovations. If these swarms were spread out evenly over time with short intervals between them, a smoother
pattern of economic growth might easily result because periods of saturation of one innovation
would be offset by periods of expansion of others. This was pointed out by Kuznets in a book
review of Schumpeter’s two volume work called Business Cycles. In Kuznets’ view,
Schumpeter’s theory needed an answer to the question as to why the entrepreneurs that were to
introduce new radical technologies would get tired every 50 years. Schumpeter did not elaborate
on the reasons why swarms of innovations would be spread unevenly over time, although he did
point to the pervasive nature of some innovations, i.e., that they affect a large number of sectors
in the economy at the same time. This obviously reduces the probability of swarms of innovations
in different phases of their life cycle more or less offsetting each other.

While Schumpeter wrote up his theory of long waves and basic innovation during the first half of
the 20th century, the role of basic innovations in the economy gained new interest in the 1970s,
when the economy in the Western world was slowing down. According to some (Schumpeterian)
economists, the depression phase of the long wave had set in, and they were putting their hopes
for an upswing in the coming decades on the new technological paradigm of computers and
related electronics technology. With such a re-birth of Schumpeter’s theory, the need for a more
elaborate theory for the timing of swarms of basic innovations became all the more apparent.

It was the German economist Mensch who put forward a new theory of the relation between long
waves and basic innovation. Mensch argued that basic innovations cluster during the depression
phase of the long wave, contrary to Schumpeter's original view that swarms of innovations would occur during the early upswing due to imitation. The theoretical explanation for such Mensch-type clusters of basic innovations was offered in the form of the depression trigger hypothesis.
This hypothesis starts from the assumption of bounded rationality, which, as we have seen in
chapter 1, is particularly appropriate in the case of basic innovations or a radical change of
technological paradigm. Mensch argued that firms under bounded rationality would display so-
called satisficing behaviour, which means that they will strive to obtain a certain minimal level of
profits by trying out new combinations (innovating). After they have reached this minimum level,
the firms will focus more on maintaining this level, i.e., exploiting the opportunities that they have
found, than on trying to increase the profit level even further by searching for new innovations.

This explains why firms will not be actively searching for new basic innovations during the
upswing or prosperity phase of a long wave. In these phases, the increasing or high profit rates
will lead to a focus of attention on the existing technological paradigm. Only when profit rates start to deteriorate will the interest in new basic innovations surface again. This will occur during the late
stage of the recession and the depression phase. Some time after this, firms start actively searching
for basic innovations, and after a while their search will become successful. This explains why
basic innovations will cluster in the depression phase.
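
To make the logic of the depression trigger hypothesis a little more tangible, the following toy calculation sketches a satisficing firm of the kind Mensch has in mind. It is purely illustrative and not part of Mensch's own work: the starting profit rate, its rate of erosion and the minimum acceptable profit level are all made-up numbers.

# Toy illustration of the depression trigger hypothesis (not Mensch's own model).
# A satisficing firm keeps exploiting the current paradigm as long as profits stay
# above its minimum acceptable level, and only starts searching for basic
# innovations once profits have eroded below that level. All numbers are made up.

profit_target = 0.05   # minimum acceptable profit rate (satisficing level)
profit = 0.20          # profit rate at the start of the upswing
erosion = 0.85         # share of the profit rate that survives each year

for year in range(1, 31):
    profit *= erosion  # competition and exhausted opportunities erode profits
    if profit < profit_target:
        print(f"year {year:2d}: profit {profit:.3f} below target -> search for basic innovations")
        break
    print(f"year {year:2d}: profit {profit:.3f} -> keep exploiting the existing paradigm")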

Such a theoretical explanation still leaves open the question as to whether or not clustering is
observed in actual practice. Here the problems for statistical analysis of the available data are even
larger than for the analysis trying to prove the existence of long waves in economic growth rates.
The main difficulty lies in determining which innovations are basic innovations, and in picking the
date at which these were introduced. Hence the debate has not been decided in favour of either of the two positions (clustering or not), and all theorizing about the issue remains somewhat speculative. Figure 2.2 provides an impression of the number of basic innovations per year, and the
reader may judge for herself whether or not the time series displays clustering.

[Graph: annual counts of basic innovations, ranging from 0 to 8 per year, plotted over the period 1750-2000]

Figure 2.2. The number of basic innovations since the Industrial Revolution
(source: calculations on various time series found in the literature)
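
For readers who prefer a rough numerical check to a purely visual one, a simple diagnostic is the variance-to-mean (dispersion) ratio of the annual counts: for a Poisson-like process without clustering the ratio is close to one, while pronounced clustering pushes it well above one. The sketch below illustrates the calculation on a short series of made-up counts; it is not the actual series behind Figure 2.2, and the dispersion ratio is only an informal indication, not a full statistical test.

# Variance-to-mean (dispersion) check for clustering in annual counts of basic
# innovations. The counts below are illustrative placeholders only.
from statistics import mean, pvariance

annual_counts = [0, 0, 1, 0, 3, 4, 2, 0, 0, 1, 0, 0, 5, 6, 3, 1, 0, 0, 0, 2]

m = mean(annual_counts)
v = pvariance(annual_counts)
print(f"mean = {m:.2f}, variance = {v:.2f}, dispersion = {v / m:.2f}")
# A dispersion ratio well above 1 suggests that the innovations arrive in clusters.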

2.3. Technological Revolutions: Freeman's Interpretation of History

The difficulties associated with counting the number of basic innovations and dating them can
surely be avoided to some extent by an in-depth historical analysis of the life cycle of such
innovations, and the changes they brought to the economy. Such historical research has also been
part of the long waves tradition, ever since the publication of Schumpeter’s main work Business
Cycles. In the two volumes of this work, he analyzed in great historical detail the successive
technological revolutions that had occurred since the Industrial Revolution.

Of the scholars in the Schumpeterian legacy of the second half of the 20th century, Christopher
Freeman has no doubt provided the most inspiring and broad-sweeping historical contribution.
His work, partly undertaken in collaboration with colleagues such as Luc Soete, gives a most
vivid interpretation of the Schumpeterian theory of technological revolutions and long waves
outlined in the previous section, and we will summarize his writings here in order to relate the
Schumpeterian theory to the economic history of the Western world. In doing so, we will see
many of the Schumpeterian themes ‘in action’, but we will also see some new interpretations that
are typical for the Freeman-view of technology and economic growth.

Freeman makes a distinction between five technological revolutions in the history of modern
capitalism. These are outlined in Table 1, along with some characterizing features for each of
them. The history starts with the Industrial Revolution, which can be dated around 1780 (all the
dates are necessarily crude). Since then, five major technological revolutions have succeeded one another, the last of which, the ICT revolution, is still under way. Each of these revolutions is characterized by a small number of carrier basic innovations, which together constitute a technological paradigm. Two particularly important features of these technological paradigms are the transport and
communication system, and the energy system in operation, which will guide the discussion below
to some extent.

Table 1. Technological Revolutions


Timing (approximate) | Name | Transport & communication systems | Energy systems
1780-1840 | Industrial Revolution: mechanization of textiles | Canals, carriages | Water power
1840-1890 | Age of steam power and railways | Railways, telegraph | Steam power
1890-1940 | Age of electricity and steel | Railways, telephone | Electricity
1940-1990 | Age of mass production | Motor vehicles, radio and TV, airlines | Fossil fuels
1990 - ? | Information Age | Digital networks (Internet) | Fossil fuels
Source: adapted from Freeman and Soete

The Industrial Revolution

The process starts with the Industrial Revolution in Britain. This period represented a shift from
a small-scale production system based largely on human and animal power to a mechanized way
of production in factories. The main field of application of this new way of organization was the
production of textiles and iron in Britain. In the textiles industry, more specifically the cotton
textiles industry, three major innovations were responsible for a 50-fold increase in the
productivity in spinning, over a period of roughly 15 years. The ‘spinning jenny’, introduced by
Hargreaves in 1764 was a manually operated device that could spin multiple threads instead of
the single thread that was spun by the traditional spinning wheel (as known from many fairytales).
Five years later, in 1769, Arkwright's water frame introduced multiple pairs of rollers, thereby greatly increasing both the speed of the operation and the quality of the resulting yarn. As the name suggests, Arkwright's frame was driven by water power, thereby making the technology independent of human power. Finally, in 1779, Crompton's mule, named in this way because it was a cross between the jenny and the water frame, made large scale production of yarn possible.

Freeman and Soete estimate that Crompton’s mule required 2,000 hours of work to process 100
lbs of cotton, whereas the same amount could be spun by Indian hand spinners, considered to be
the most productive of the age, in about 50,000 hours. Hence the Crompton mule, as the most
advanced of the three innovations, represented a 25-fold increase in productivity, as measured in
the amount of time required for spinning. A 100-spindle mule, operational around 1790, would
reduce the time to 1000 hours, whereas in 1795 the time would be around 300 hours. The truly
impressive nature of the achievements of the three innovations occurring in the last quarter of the
18th century becomes clear if we note that around 1990, the time required to spin 100 lbs of
cotton had been reduced to 40 hours, 'only' a 7½-fold improvement compared to 1795. Weaving also
underwent a process of significant increase in productivity, although this was perhaps less
spectacular than in spinning. The major source of progress in weaving was the power loom. This
machine was based on an early (1785) design by Cartwright, which, however, proved inadequate
until the effort of ‘dressing’ the loom could be automated by a so called dressing machine.
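
The spinning figures just quoted can be summarized as simple productivity ratios. The short calculation below merely restates the hours-per-100-lbs estimates attributed to Freeman and Soete in the text and computes the implied improvement factors.

# Hours needed to spin 100 lbs of cotton, as quoted in the text.
hours = {
    "Indian hand spinners":      50_000,
    "Crompton's mule (1779)":     2_000,
    "100-spindle mule (c.1790)":  1_000,
    "mule of 1795":                 300,
    "machinery of c.1990":           40,
}

baseline = hours["Indian hand spinners"]
for technique, h in hours.items():
    print(f"{technique:28s}: {h:6d} hours  ({baseline / h:7.1f} x hand spinning)")

# The gain from 1795 to c.1990 is 'only' 300 / 40 = 7.5-fold.
print("1795 -> 1990 improvement factor:", hours["mule of 1795"] / hours["machinery of c.1990"])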

The rapid expansion of production that was the result of these increases in productivity had to be
met by an improved system for the transportation of the products. This took the form of rapid
investment in transport infrastructure, which, in terms of the technology of those days, meant
canals and roads for horse-pulled boats and carriages. This investment made it possible to ship
products rapidly to their destination, which could be overseas. The British ports and large fleet
naturally played a decisive role in shipping cotton products to other parts of the world.

Iron-making was another expanding activity during the Industrial Revolution, although its
development is judged as much less dynamic than that in the textiles industry by, e.g., Landes.
Iron-making had of course been known since prehistoric times, but until 1500 the process was only
applied at a small scale in so-called bloomeries. From around 1500 onwards, the blast furnace
diffused in Western Europe, which enabled larger quantities of iron to be produced. Iron is made
by heating iron ore in contact with carbon. Because iron ore is an oxide, this causes the carbon and the oxygen to combine into a gas that is burned off, leaving the pure metallic form of iron behind.
A constant flow of air is necessary to keep the heating process inside the furnace going.

The blast furnace produces a product called cast iron or pig iron (the name pig iron was adopted
because the iron was cast in a form of a main channel with several branches, which did not look
unlike a litter of pigs lying alongside a sow). The chemical difference between pig iron and
wrought iron is that pig iron absorbs a larger amount of carbon. Pig iron is not very strong in
tension, unlike wrought iron, which used to be the product coming out of the bloomery. Also,
while wrought iron can be shaped by continuous hammering, pig iron will break during such a
process. Thus, an additional furnace called the finery had to be used to convert pig iron into
wrought iron, which remained the main product in demand.

A major obstacle to the expansion of the iron industry just before the Industrial Revolution was
the supply of wood that was needed to produce charcoal applied in the operation of the blast
furnace. The solution lay in the use of coke, which is produced from coal. This was first applied
by Darby in the early 18th century, but not used on a large scale until the last part of that century.
But charcoal was still necessary to produce wrought iron from pig iron. This problem was solved
by the puddling process, invented by Cort in 1784, which keeps the coal and the iron in separate parts of the furnace. The term 'puddling' refers to the activity of separating the large
ball of wrought iron into four or five different parts, using a long iron bar.

We thus see that technological change solved some of the bottlenecks in the production system
(shortage of charcoal). But other bottlenecks remained during the days of the Industrial
Revolution. An important example of this was the power source, which had implications for the
location of industry. Contrary to many popular accounts of the Industrial Revolution, steam
engines did not play a major role in powering the emerging factories in Britain during this period.
Although, as we have already seen in chapter 1, steam power had been known since Newcomen's
engine (1712), and James Watt patented his engine in 1769, the new power source did not gain a major share until considerably later. The Industrial Revolution was thus powered mainly by
water wheels, and this meant that factories or blast furnaces had to be located close to a river.

The increased productivity due to technological innovations in the textiles industry did not only
lead to a quantitative increase in the scale of production. Most importantly, it led to a major change in the way in which production was organized. Before the Industrial
Revolution, spinning used to be organized according to a so-called merchant system, in which raw
material would be delivered to the spinners at their cottages. After processing, the yarn was
collected again and shipped out. This system depended on small scale equipment that could be
operated by a single worker without special infrastructure. With large scale machines operated
by other means than human power, the need arose to concentrate production in a central place,
where many workers would be present at the same time, and a power source (water) to operate
the machines was available. This represented the birth of the modern factory and the Capitalist
mode of production that Marx described so vividly in Das Kapital I. Not only did this system
imply a physical re-location of the workers, but in the long run it represented the emergence of
the proletarian class, whose interests were going to oppose those of the capitalists so strongly.

The factory system, or put more broadly, the Capitalist system, was to a very large extent
dependent on efforts by single individuals, both in terms of inventive activity, and the financing
of start-up firms needed to exploit the inventions. Although a capital market on which significant
sums could be borrowed was in existence, many of the early industrialists, including Boulton and
Watt when they founded their business producing and selling steam engines, had to turn to friends
and family in order to get enough capital. According to Freeman and Soete, Crompton, on the
other hand, resorted at one stage to playing the violin in a local church in order to earn enough
money.

These and similar stories of the lives of successful entrepreneurs mark the early history of the
industrial society with a certain romantic appeal. This must clearly have inspired Schumpeter
when he put so much emphasis on the role of entrepreneurship in industrial revolutions. On the
other hand there are also stories of men who, despite considerable inventive creativity, did not manage to gain a fortune by putting their innovations into production. Such a contrast is, for example, provided by Arkwright, who made a fortune from his water frame and died in 1792 as
a rich man, and Crompton, whose mule could not prevent the firm he was leading from going into
bankruptcy. One of the main problems that the entrepreneurs of the Industrial Revolution would
face in addition to finding sufficient start-up capital, was to unite the different skills needed for
invention and management into a single firm. This problem was in many cases overcome by
partnerships, such as between Boulton and Watt, or between Arkwright and Strutt (who was a
stockings manufacturer when he entered into the partnership).

This broad account of the Industrial Revolution brings out a number of preliminary conclusions that may serve as guideposts for the coming narrative of subsequent technological revolutions.
First, technological innovations may lead to extremely rapid increases in productivity, but it will
generally take some time for these innovations to realize their full impact on the economy. Basic
innovations can follow up on each other rapidly, but it usually takes a significantly longer amount
of time before the capital equipment needed to operate these innovations becomes available on
a large enough scale to have an economy wide impact. In the case of the Industrial Revolution,
it took approximately 60 years before the paradigms of textiles production and iron production had worked their way fully through the British economy. This is obviously a feature
that links up quite closely to the Schumpeterian idea of long waves that was explored above.

Second, and much related to the first conclusion, technological change does not operate in a
vacuum when it brings economic prosperity. Technology is accompanied by and conditioned on
re-organization of the system of production. Without such organizational change, radically new
technologies cannot develop. Such organizational changes may also take place more or less independently of technological change, thereby facilitating and inducing technological change. In
fact, technological and organizational change often go hand in hand, connected by a process of
joint causation. Such a process obviously greatly shakes up and transforms the daily lives of
people living and working in the system.

The Age of Steam Power and Railways

Around 1830 - 1840, the British economy was ready for another major leap forward. Whereas
the period prior to 1840 (the Industrial Revolution) had consisted mainly of the mechanization of
the textiles branch, the second wave of major innovations was related to revolutionary
developments in transportation and the powering of factories. The main underlying technological
development was steam power. Although Boulton and Watt were quite successful with their steam engine business, it was not until more than 60 years after the invention of the Watt engine that the economy was more or less fully operated by steam.

Since the very beginning of the 19th century, work had been going on to apply steam power to transportation. Horse-drawn tramways had become somewhat common, and in 1804, Richard Trevithick built the first steam locomotive in an attempt to increase the capacity of such a
system. Success for his idea came with the efforts of George Stephenson in the 1820s, who
successfully applied a steam powered locomotive (the 'Locomotion') on the first railway in the world (the Stockton and Darlington Railway). George Stephenson and his son Robert were to
become another example of successful entrepreneurs in the romantic Schumpeterian fashion. Both
Stephensons were men of many crafts, as building a railway required not only ingenuity in the design of locomotives and other rolling stock, but also bridge building and civil engineering for the laying of the track, as well as (financial) management.

As we saw in chapter 1, the steam engine was initially used mainly for pumping water out of
mines, and this remained the major application for which new types of engines were being
developed in the early 19th century (we will see some details about this inventive process in the
next chapter). The application of steam power in factories (for example in the cotton industry)
obviously had large advantages, the most important of which was that factories could now be built in places where no water power was available. As was already pointed out above, during the Industrial
Revolution, factories had to be close to water for their power, but steam provided more flexibility
in terms of choosing a location.

The application of steam in this way became possible when, as the result of the incremental
innovations that were described in chapter 1, steam engines required less and less coal per
horsepower delivered, while at the same time the total horsepower a single machine could deliver
went up significantly. Moreover, coal (the main fuel used for operating steam engines) became
available more cheaply as a result of the application of steam to railways. Thus, the application
and spread of steam power is a good example of both the importance of incremental innovation
for the diffusion of a technological paradigm, and of the systems nature of such paradigms.

The new applications of steam, as well as other technological developments going on, also
enabled countries other than the United Kingdom to leapfrog into industrialization. Thus, the age
of steam power and railways (1840 - 1890) saw the spread of industrialization to the European
continent (Belgium, Germany) and the United States. These countries took over part of the British technologies, and in many cases greatly improved on them (e.g., steam engines in the United States), thus becoming an integral part of the development of the new technologies as well as the use of
them.

At the same time that steam power, which as we have seen was really an innovation that emerged
during the previous long wave, was taking control of the economy, the technological paradigm
that was going to replace steam power, i.e., electricity, was slowly being invented. In 1800,
Alessandro Volta invented (a primitive form of) the battery, which enabled electricity to be stored
and used outside scientific laboratories. Michael Faraday established the principle of the electric
motor in 1821, but this did not lead to a working prototype until the 1840s. Faraday also managed
to generate electricity by magnetism in 1831. The main application of electricity during the age
of steam and railways was confined to low-power devices, mainly the telegraph. Together with
railways, the telegraph greatly increased communication between different parts of a country. At
the same time, innovations in synthetic dyestuffs and gas were taking place, thus laying the
foundation for the chemical industry that was also going to rise during the next waves.

Just as had been the case during the first wave, the second technological revolution also saw some
important institutional changes which contributed to growth. One of these was the rise of the
joint-stock company and the corresponding stock market, described in Emile Zola’s masterpiece
L'Argent. This greatly facilitated the raising of capital for the establishment of new firms, and decreased the dependence on personal capital that had been so great during the Industrial Revolution. It also led to the type of speculative behaviour by investors with high expectations of firms applying the new technologies that was identified by Schumpeter as a main driving force
of the so-called secondary wave.

An example of the speculative bubbles that could emerge in the new system of the stock market
was the so-called 'railway mania' in the United Kingdom in 1845. On 30 November of that year, the deadline for filing proposals for new railway projects with Parliament would expire, and the rival parties that were putting forward different proposals crowded the access roads to London. One of them managed to smuggle in their plans in a coffin transported on a train run by a rival railway. The general public had great trust in almost all the proposals, and the stocks of the companies that were filing the plans skyrocketed. Many plans, however, did not lead to successful businesses, and most of the invested money was lost when the stocks came crashing down. Among the victims of the crash, although on a modest scale, were Emily and Anne Brontë, who had each invested the sum of one pound sterling. The third Brontë sister, Charlotte, did not take part in the venture, and advised her two sisters to sell while they were still in the black. Her advice was sadly ignored, and Emily and Anne never saw their investment returned.
One is tempted to draw an analogy to the valuation of the so-called new economy stocks in the
late 1990s.

The Age of Electricity and Steel

A new 'break' occurs around 1890, with the emergence of the so-called age of electricity and steel (1890-1940). Both steel and electricity had been invented decades before, but it was only towards the 20th century that they began to make their full impact. Both of the new paradigms were most
rapidly growing and spreading in the United States. In the case of electricity, the American
inventor Edison was the most important source of innovations using the new power source. He
invented, among other things, the incandescent lamp, the phonograph, the carbon-button
transmitter used in the telephone and microphone, and an experimental electric railroad.

The start of Edison’s career as an inventor was related to the specific circumstance that he had
had hearing problems since his childhood. This handicapped his activities as a telegrapher for the
American railroads when the telegraph system switched to a sounding key. Edison set out to
invent various devices that were initially at least partly aimed at overcoming his handicap. After
having invented a printing telegraph, he became a full-time inventor, working initially mainly for
the large telegraph companies. Edison managed to attract various associates and ultimately set
up the world's first R&D laboratory, in which he employed many inventors. He held a record number of 1,093 patents.

In order for Edison’s (and other) inventions, rather than the low-power devices from the previous
era, to be useful, a way to generate electricity on a large scale, as well as to make it available in
a universally spatial manner, had to be found. The generators developed from Faraday’s invention
were not sufficient for this purpose, and it was not until the self-exciting generator, or dynamo,
was developed by several inventors in the 1860s that this problem was solved. The dynamo uses
its own output to supply current to an electromagnet used for generating electricity. The Belgian
inventor Gramme first developed a commercial dynamo in 1870, based on the principle of ring-
winding.

Using these and subsequent advances in the generation of electric power, Edison started to
electrify large cities, such as New York and London in 1881, using the so called direct current
(DC) technology that he would promote during his whole life as an inventor in electricity. The
main disadvantage of the DC technology was that because of voltage drop, the generating station
could be no further than about a mile from the place of use of the power. The competing
technology of alternating current (AC) did not suffer from this problem. AC enabled current to
be distributed at high voltage over large distances, and then stepped down to lower voltages by
means of the transformer for local distribution and use. A series of inventions gradually led to AC
as the dominant paradigm. Of these inventions, the rotary converter (1892), which enabled DC power generating stations to be integrated into an AC electricity network, was one of the most important.

Large electricity systems (based on the AC technology) offered lighting as the major application
for households, and electric drive for factories. As is shown by data collected by Robert Ayres,
around 1900 some 5% of households in the United States were connected to the electricity grid,
and the same number applied to the share of firms using electrical power. From this point
onwards, diffusion took off rapidly and the 50% threshold was reached just before 1920 (for
firms) or 1930 (households). By the 1940s, the technology had diffused (almost) completely.
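
The diffusion figures reported by Ayres trace out the familiar S-shaped pattern. As a rough illustration, the sketch below fits a logistic curve through two of the data points for firms, taking roughly 5% in 1900 and 50% around 1920 as anchors; the logistic functional form and the rounded anchor years are assumptions made here for illustration, not Ayres's own estimates.

# Logistic diffusion curve anchored on two data points for electric power in US
# factories: roughly 5% of firms in 1900 and 50% around 1920 (figures as quoted
# from Ayres in the text; anchor years rounded for illustration).
import math

T_MID = 1920                                  # year at which the share reaches 50%
r = math.log(0.95 / 0.05) / (T_MID - 1900)    # growth rate implied by 5% in 1900

def share(year):
    """Fitted share of firms using electric power in a given year."""
    return 1.0 / (1.0 + math.exp(-r * (year - T_MID)))

for year in (1900, 1910, 1920, 1930, 1940):
    print(year, f"{share(year):6.1%}")
# The fitted curve reaches about 95% by 1940, consistent with near-complete diffusion.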

For factories as well as households, electricity represented a power source that was even more
mobile and flexible than steam (remember that a main advantage of steam engines over water power was that they did not depend on location). Electric motors were small, much smaller than a
steam engine, and one could actually equip each single machine with its own electric motor.
Contrary to this, in the case of a steam engine, one or a few engines would run a complete
factory, through a complicated system of conveyor belts and shafts.

Such a system had large disadvantages in a number of respects. Firstly, it greatly restricted the
lay-out of the factory, as all machines had to be lined up with the belts and shafts system. With
electric motors, the factory lay-out was far more flexible, and large amounts of space could be
saved. Secondly, whether a single machine or all machines in the factory were running at the same
time, the central power source (steam engine) that was used to drive the factory had to be online
and running. With electricity and each machine equipped with its own motor, the workman could
actually turn on and off the power source at his own will. Finally, the system with central power
was highly sensitive to a breakdown of the driving engine. Although a system based on electricity has a similar dependence on power failures (brown-outs), the dependence on failures of individual motors is greatly reduced, because a single breakdown does not cause the whole factory to go down.

Although it is relatively easy to identify these advantages of electricity over steam from a
backward looking position, it was not so obvious in the early days of the new technology. It has
been argued by Freeman and Soete that the first factories that ran on electricity indeed depended
on the old system of a central motor attached to the factory’s machinery through conveyor belts
and shafts. It was not until the first years of the 20th century that factory owners started to
realize that there was a much more efficient way of using the electric power, i.e., the system of
decentralized motors. It was the habit of doing things the old way as well as the costs of capital
locked up in equipment that suited the centralized system that caused this inertia. This case is a good illustration of the general principle that the diffusion path that takes a technological paradigm to its full (productivity) potential depends not only on incremental technical innovation, but also on organizational innovation.

It is obvious that electricity, or steam in the previous period, by their nature as power suppliers,
were technologies of a genuinely pervasive nature. They were eventually, after a long period of
diffusion and improvement, used in all sectors of the economy, and greatly spurred productivity
and growth in these sectors, and, hence, at the macroeconomic level. The same notion of
pervasiveness also holds for the other paradigm of the period 1890 - 1940, i.e., steel. The
construction industry, food packaging, machinery, transport equipment and weaponry industries
were among the industries that benefitted most. The development in these industries led to such
spectacular phenomena as the large steel bridges (e.g., the Brooklyn Bridge in New York City)
and the first skyscrapers (which had steel frames), but also to much more minor, but perhaps
equally influential innovations such as the tin can.

American industry proved to be vigorous in the application of the material brought onto the
market by the European inventors Bessemer, Gilchrist Thomas, Siemens and Martin (see chapter 1). The main advantage of steel was that it was much harder than pig iron or wrought iron, which were the main materials used in the first and second industrial waves. The leading firm in this respect (later part of U.S. Steel) was led by Andrew Carnegie, who started his career in railroads.
During the American Civil War, Carnegie started to produce and sell iron rails, locomotives and
bridges to the American railways. After the war, he became familiar with the Bessemer process,
and started his first steel mill, which became a big commercial success.

The (incremental) innovations introduced by Carnegie and other American companies led to an increase in productivity and, as a mirror image, a fall in the price of raw steel that was comparable only to the same phenomenon in spinning during the Industrial Revolution in Britain. Prices fell from $107 per ton (for steel rails) in 1870 to $29 in 1885, while the consumer price index only fell from 38 to 27 during the same period. During the period 1880 - 1913, steel
production in the United States rose from 1.2 million tons to 31.3 million tons. In 1929, the level
of production would be 57.3 million tons.
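
Part of the fall in steel prices reflects the general deflation of the period, and deflating by the consumer price index separates the two effects. The short calculation below simply makes this step explicit, using the figures quoted in the text.

# Nominal versus real (CPI-deflated) decline in the price of steel rails, 1870-1885,
# using the figures quoted in the text.
price_1870, price_1885 = 107.0, 29.0   # dollars per ton of steel rails
cpi_1870, cpi_1885 = 38.0, 27.0        # consumer price index

nominal_factor = price_1870 / price_1885                         # about 3.7-fold
real_factor = (price_1870 / cpi_1870) / (price_1885 / cpi_1885)  # about 2.6-fold

print(f"nominal decline: {nominal_factor:.1f}-fold")
print(f"real decline:    {real_factor:.1f}-fold")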

The strong performance of American companies in the electricity and steel paradigms led to a
gradual take-over of economic and technological leadership from Britain by the United States. Three
underlying factors turned out to be decisive for this process. First, the United States possessed
a large and homogeneous domestic market (larger than that of any single European country). Because the new technologies were capital intensive and displayed important scale effects,
this led to production costs falling rapidly when a larger market was being served. Second, the
United States was relatively abundant in natural resources, including land and minerals. This gave
the American firms an important cost advantage and helped them to keep prices low and hence
serve a large market and reap the scale economies of the new paradigm. Third, the liberal
‘American spirit’ was aimed at entrepreneurship and capitalist investment, and was also very much
open for change and the introduction of new technologies. This followed from the early
immigrants who were by and large forced to be self-made men and women, largely dependent on
the application of new ideas and technology. Also, the victory of the North over the South in the Civil War, and the abolition of slavery, reinforced the open attitude that Americans had towards entrepreneurship and new technologies.

The importance of scale economies also led to changes in the type of firms that were best suited
to develop the new technological paradigms during the age of electricity and steel. As has been
documented extensively by Alfred D. Chandler Jr., the American take-over of leadership was
mainly implemented by a new type of firm: the giant management-led firm.14 The first examples
of such managerial capitalism were the leading firms in the development of the electricity
paradigm, such as General Electric and Westinghouse. Besides the United States, this type of firm also emerged, though on a somewhat smaller scale, in Germany (e.g., Siemens, AEG).

The new type of firm was characterized by several innovations in the management sphere. The
first and most obvious one was the rise of the professional manager himself: a person that was
especially trained to lead a company or one of its divisions, rather than someone who had earned his position through his achievements in either technology (such as Edison, Watt or Arkwright) or the building up and management of a small private firm. A second innovation was growth by means of mergers and take-overs, thus taking the joint-stock model of the previous wave one step
further. Both General Electric and Westinghouse achieved their rapid growth to a large extent
from mergers and take-overs.

14. The D. in Alfred D. Chandler stands for DuPont, which is the same DuPont as the large chemicals
firm. Hence the leading historian in the description of American giant firms actually had first-hand experience
with the phenomenon he was describing and analyzing.

The third management innovation became known as Taylorism. It was named after Frederick
Taylor, who developed a method that took to its extreme the principle of the division of labour
that Adam Smith had stressed so much. Taylor broke down the production process into simple tasks,
each of which could be standardized and timed. One of the tasks he analyzed extensively was the
(dis)placement of light bulbs, which led to a large number of so called light bulb jokes.15 In this
way, the whole production process could be standardized and the amount of work to be
performed by a single worker optimized.

The main reason for the emergence of the large managerial companies was that they were well
suited to take advantage of economies of scale and scope. The first of these two notions (scale
economies) refers to the decreasing unit costs that result from an increase in the scale of
production. As was seen already, this was an important characteristic of the American jump to
leadership. Economies of scope refer to the idea that knowledge (about markets, technology or
distribution) may be used for several product families. Hence, a strong position in one product
may lead to important advantages in other products, so that a firm that is large enough to serve both markets will be at an advantage compared to a smaller firm. This is in fact the same
phenomenon that was encountered in chapter 1 when we discussed the advantages of large firms
in a demand pull setting.
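
The notion of scale economies can be made concrete with a stylized cost function: with a large fixed cost on top of a constant cost per unit, unit costs fall steadily as output expands. The numbers in the sketch below are arbitrary and purely illustrative.

# Stylized scale economies: unit cost = (fixed cost + marginal cost * q) / q.
# The parameter values are arbitrary illustrative numbers.
FIXED_COST = 1_000_000    # plant, distribution network, R&D: independent of output
MARGINAL_COST = 5         # extra cost of producing one more unit

def unit_cost(q):
    return (FIXED_COST + MARGINAL_COST * q) / q

for q in (10_000, 100_000, 1_000_000):
    print(f"output {q:>9,}: unit cost {unit_cost(q):8.2f}")
# A firm serving a market ten times larger than its rival's produces at a much
# lower unit cost, which is the advantage behind the giant management-led firm.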

The rise of the giant management-led firm also led to a shift away from the individual inventor in
the technological process. This was caused by the gradual introduction of R&D departments
inside large firms. The romantic notion of an individual inventor that could change the course of
history (James Watt, Thomas Edison) was slowly coming to an end, while at the same time the
trend for R&D to be concentrated in large laboratories led to an ever closer relationship between
science and technology, as described in chapter 1. Edison himself was an important figure in the
transition from private inventor to the R&D laboratory, because, as we have seen, he combined
great personal inventive skills with the management of a firm that had as its sole aim to invent,
and hence employed quite a few people as inventors.

The first R&D laboratories emerged in Germany in the chemical industry from 1870 onwards. In
the United States, it was especially the emerging automobile industry and related industries (such as tires) that started up R&D departments, while in Germany the large chemical firms remained the most active
in that respect. Often, R&D departments emerged from facilities aimed largely at testing (e.g.,
materials). With the gradual increase of the use of scientific knowledge in the innovation process,
these departments often expanded the range of their activities, and thus slowly transformed into
what later became known as R&D laboratories.

Mowery and Rosenberg have presented data that show that the foundation of R&D departments
peaked in the automobiles cluster during the period 1919 - 1928, and during the period 1937 -
1946 in industries such as instruments, electrical machinery and transport equipment. As a result,
the number of engineers employed in R&D in the United States grew spectacularly from 3,000
in 1921 to 46,000 in 1946. In 1999, the total number was close to one million. Obviously, this strong increase in formal R&D was entirely in line with the emergence of managerial capitalism and the increased scale of companies that was associated with this.

15. How many signal processing engineers does it take to change a light bulb? Three. One to Fourier transform the light bulb, one to apply a complex exponential rotational shifting operator, and one to inverse transform the removed light bulb. How many economists does it take to change a light bulb? None; they're all waiting for the unseen hand of the market to correct the lighting disequilibrium.
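
The growth of the R&D workforce quoted in the Mowery and Rosenberg figures can be summarized as average annual growth rates. The short calculation below uses the numbers from the text, treating the 1999 figure as roughly one million.

# Implied average annual growth rates of R&D engineers in the United States,
# based on the figures quoted in the text (the 1999 value is taken as ~1 million).
def annual_growth(start, end, years):
    return (end / start) ** (1.0 / years) - 1.0

print(f"1921-1946: {annual_growth(3_000, 46_000, 25):.1%} per year")      # roughly 11.5%
print(f"1946-1999: {annual_growth(46_000, 1_000_000, 53):.1%} per year")  # roughly 6%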

The Age of Mass Production

The development towards production based on scale that emerged in the first half of the 20th
century in the United States culminated in the post-World War II period. Besides the more extensive reliance on scale economies and mass-production in the United States, this period also
saw the spread of mass-production based technology to Western European countries and parts
of Asia (first Japan, later also the so-called Newly Industrializing Countries). In these countries,
the adoption of mass-production methods led to a significant rise of (labour) productivity levels
relative to the United States. This is why Abramovitz has also called this period the ‘catch-up
boom’.

Like in the previous waves, the expansion of mass-production depended on the application and
spread of a number of key technologies which, again, had been around for some time. These include the automobile and the internal combustion engine on which it is based, and (petro-)chemical technology, with plastics as an important output that can be used as a raw material in a large number of other sectors.

The internal combustion engine had already been developed in 1859 by the Frenchman Lenoir, and was
first commercialized by the German Otto in 1878. These inventors regarded their engines as an
alternative for the steam engine, just as the electric motors which were emerging at the time were
supposed to be. As was already seen above, electric motors started to be used as a power source
in factories as well as on railway locomotives from the early 20th century onwards, thus indeed
rapidly replacing the steam engine. However, history had a completely different role in mind for
the internal combustion engine.

It was the technical director of Otto's firm, Daimler, who started the application of the internal combustion engine to bicycles, boats and carriages. Daimler set up his own firm, together with
Maybach, thus starting what is now known as the Daimler-Benz corporation. Daimler’s efforts
led eventually to the concept of the automobile, which was quickly adopted by other firms,
initially mainly in Germany and France. In 1896, Henry Ford built his first ‘horseless carriage’ in
the United States. In 1903, he set up a firm called the Ford Motor Company, after he had earlier
been engaged in the Henry Ford Company (which later became Cadillac). The Ford Motor
Company got caught up in a patent dispute, because Ford was accused by the Association of Licensed Automobile Manufacturers of violating a patent they controlled. The lawsuit dragged on until 1911, when Ford won the case on appeal. In the meantime, the company had introduced the Model T in 1908, of which more than 15 million units would be sold during its 18 years of existence.

Henry Ford's firm was firmly rooted in the now established mode of managerial capitalism, and he
introduced the assembly line in the production process around 1913. This was a process invention
that was characteristic of the Taylorist way of producing. The assembly line broke down the task
of assembling an automobile into small parts and gave workers a standard (and small) amount of
time to carry out this task. The high productivity (and consequently, relatively low prices of the
final product) that was associated with this, led to rapid growth of the sales of the Model-T Ford,
leaving other automobile brands (including versions based on different engines, such as electric
motors or steam engines) far behind.

Ford’s assembly line appeared to be useful in other industries as well, and in the age of mass-
production, the system became the norm in many factories producing in mass volumes. From the
point of view of the worker, the system had important disadvantages, because it greatly decreased
the quality of work. The boredom associated with the repetitive tasks and the mental stress
resulting from the steadily moving belt were aptly characterized in Charlie Chaplin’s movie
‘Modern Times’ (1936), in which an assembly line worker gets literally caught up in the
machinery that controls his life.

This type of degradation of the quality of work was naturally opposed by the labour unions, which
had emerged since socialism slowly started to emancipate the labour class in the late 19th century.
The organization of workers into labour unions (traditionally stronger in Europe than in the
United States) led to a special mode of ‘regulation’ of labour relations, based on (often
centralized) negotiations between employers and unions. The unions sought to acquire a part of
the increases in profits that was associated with the increased productivity, and this led to a strong
increase in wage levels in the industrialized world. This, in turn, led to increased demand (workers
spending their wages) for consumption goods, and reinforced the virtuous circle of scale economies
and mass-production. The term ‘Fordism’ has been used for this mode of regulation in the labour
relations and the general socio-economic changes associated with it.

The age of mass-production also provides a splendid example of the systems nature of
technological change, by means of the pervasive nature of mineral oil. Obviously, the success of
the automobile driven by an internal combustion engine crucially depended on gasoline, a product
derived from mineral oil. In the United States, oil had been in demand as a fuel from the mid 19th
century onwards, but it was only in the 1910s that the oil industry received an important stimulus
with the application of the first cracking process. The term cracking refers to a chemical reaction
in which the heavy hydrocarbon molecules in petroleum are broken up into lighter ones by means of heat, pressure and possibly catalysts. The end product consists of a mixture of light oils, heavy oils and a number of gases.

Cracking was introduced commercially by Standard Oil in 1913. By that time, the large Standard
Oil company had been divided up into smaller parts, of which the Indiana part received a plant
that employed William Burton. Burton had been working on the cracking process in an R&D
laboratory, and had built a pilot plant in 1910. Although the experiments had been highly
successful, the Standard Oil company had refused to build a commercial plant applying the new
technology, because of fears of explosions. The new management, however, decided to build the
plant in 1913, and the process became highly successful, leading to further development of the
technology involving also other oil companies.

During the 1920s, new types of cracking were discovered, including thermal cracking and
catalytic cracking. These new processes were applied as flow processes, whereas the Burton
process is a batch process. In a batch process, the costs of initiating and handling the reactions
(e.g., reaching the desired temperatures, etc.) are much higher than in a flow process, where a
constant stream of unrefined oil is transformed continuously. These new methods of cracking
greatly increased the productivity of oil refining relative to the original Burton process, with
savings in terms of raw material (petroleum), man hours, capital investment and energy (all per
gallon of gasoline produced).

The pervasive nature of mineral oil becomes clear when we realize that besides providing a major
new source of fuel, the cracking processes also proved to be an important stimulus for the
development of synthetic materials. Before cracking, known ‘plastic’ materials were confined to
a small number of variations. The first plastic material was developed in Britain by A. Parkes and
called Parkesine. However, apart from winning a bronze medal at the International Exhibition in
1862 in London, the product never caught on. Celluloid, invented in 1869 by J.W. Hyatt in the
United States, did become a commercial success, but was to a large extent superseded by
Bakelite. This substance was invented in the United States in 1909 by the Belgian L.H. Baekeland.

The large scale rise of plastics became possible after it was realized that the (by-)products of the
cracking processes developed in the 1920s could be used to produce certain polymers
(macromolecules made up of chains of simpler molecules). Polymers, in turn, are the basic
input for most plastic materials used nowadays. Thus, oil became a main source for both energy
and material input in the age of mass production. In accordance with what we saw in the case of,
e.g., electricity and steam, however, the inventive origins of the widespread use of oil in this way lie relatively far back in history, i.e., in the 1910s and 1920s. The widespread diffusion of the
technological paradigm based on oil took place only in the period after the second world war. The
same holds for the process innovation part of mass production, i.e., the assembly line and its
further development.

Let us now turn again to the organizational changes associated with the technological changes
that have been discussed so far. The increased scale of production led to further growth of the
size of companies. The large conglomerate firms emerging under managerial capitalism in the
previous period, now generally extended their activities beyond the borders of the countries in
which they were founded. This led to the (further) rise of the multinational firm, which has offices
and branches in many different countries. The rise of the multinational company was greatly facilitated by the increased interaction between countries due to air travel (see the example of the Comet in chapter 1) and telecommunications (telephone as well as mass-communication media such as television and radio).

International trade (imports and exports) and Foreign Direct Investment (FDI) are the two main
modes through which the multinational corporation works. Trade flows often take the form of
intra-firm flows (exports from one part of the firm to a part located in a different country), when
the foreign affiliate of the firm is primarily aimed at selling and marketing products in a local
market. FDI takes place when the firm physically locates its production in a different country, for
example because it wants to make use of low wages, or because it wants production to be carried out close to the final market (so that the production process keeps close contact with the local
market and the product can be tailored to local preferences).

The Information Age

The success of the mass production paradigm came to an end in the 1970s, when economic growth rates slowed down significantly in the Western world. One immediate factor behind this slowdown was the two oil crises at the beginning and end of the decade, which led to sharp increases in the price of one of the major inputs associated with the technological paradigm. But it soon became obvious that
the technological opportunities for further increase in the efficiency of mass production systems
were coming to an end. Economists invented the term ‘productivity slowdown’ for the process
in which the rates of productivity increase dropped to levels that were very much lower than what
had been experienced in the 1950s and 1960s.

In the late 1980s, the productivity slowdown turned into a ‘productivity paradox’, when it was
realized that a potential new technological revolution, associated to computers and electronics,
was not yet leading to an upswing in the rates of productivity growth.16 This ‘paradox’ does not
surprise us so much, however, because we have by now become familiar with technological paradigms taking a long period to reach their full potential. The early history of the computer has
already briefly been covered in chapter 1. As in all the cases looked at in this chapter, there was a long lag between the conception of the technology (the 1940s) and its development to the scale at which it could become the source of a new upswing of the long wave.

During this process, computers did not remain stand-alone machines. With the gradual reduction in the size
of the machines and their growing number, a need arose to connect computers to each other. This
was made possible on an ever larger scale by advances in the telecommunications sector, such as
the application of fiber optics and satellites. The convergence of the computer industry and the
telecommunications industry led to the term Information and Communication Technologies
(ICTs). With multimedia and the Internet arising in the 1990s, a third stream (‘content’) has been
added to this convergence process. At the same time, computers, in the form of microprocessors,
have made their way into a whole range of other products, such as machinery of many kinds
(among other things, so called Computer Integrated Manufacturing, CIM), and consumer
products of all types and sizes (hifi equipment, cars, etc.). This technology of integration of digital
information processing into devices of all sorts is expected to be applied at a more advanced and
larger scale by so called embedded systems, leading to more intelligent devices.

With these developments, the ICTs reached a level of diffusion and application that gives reason
to speak of a new technological revolution beginning to shape up some time during the late 1980s
or early 1990s. For the time being, the exact potential of this new technological revolution
remains somewhat obscured, but it is clear that stock market players expect just as much from ICT as the Brontë sisters and their fellow-speculators expected from railways in the
19th century.

In the meantime, several changes in the organization of the economy have been ascribed to ICTs. The most important of these is the growing importance of networks as a way to organize the economy. Networking between firms is seen at all levels of firm size (from multinational firms to
small and medium sized enterprises) and applied to all sorts of activities (marketing, R&D,
production). An early form of networking was applied in Japanese firms, which developed the
system of ‘lean production’ from the late 1940s onwards. Lean production was developed in the
Japanese firm Toyota, on the initiative of the chief engineer Ohno. The system was aimed at
reducing inefficiencies in the assembly line process. The latter resulted in a rather high rate of
defective products, mainly because of the inflexibility embodied in the system where each worker had to perform a single task in a small amount of time, and in the uneconomical use of resources that this system implied. Before a mistake could be corrected, the belt had moved further, and defective products had to be repaired by a special team after they had finished their way through the factory, or be discarded completely.

16. Nobel prize winner in economics Robert Solow put it as follows: "we see computers everywhere, except in the productivity statistics".

Ohno’s system consisted of adding flexibility to the process (hence the system is also called
‘flexible production’), among other things by relying to a large extent on the skills of the workers,
who were trained to perform a large variety of tasks rather than a single task. Another important
element of the system was to use a large number of suppliers for specialized parts. Relationships
with these suppliers were long-term, and strong ties (e.g., exchange of personnel) emerged
between the buying firm and its sub-contractors. This system of depending on specialized
suppliers instead of making parts in-house became a model for the networking firm of the
information age.

Obviously, the possibilities for networking between firms (or between offices of the same firm,
possibly in different countries) have been facilitated by the advent of ICTs. As was argued above,
an important part of the technological development in the ICTs (convergence of
telecommunications with computer technology) was aimed exactly at the physical infrastructure
of the networks needed to exchange information between the parties in the ‘virtual’ network of
firms, as well as to codify and (automatically) process such information.

Synthesis

This brief economic and technological history of almost two and a half centuries, which is
summarized in a stylized way in Figure 3, brings out a number of general conclusions that can be
related to Schumpeter’s theory of long waves and basic innovations. First, in the long run,
technological change and economic development take the character of basic innovations overtaking each
other. At various points in history, completely new technical developments have provided
important openings in the way contemporary members of society perceived the world around
them. Surely, the people witnessing the first steam trains and the world they opened up for them
must have been as flabbergasted as we (at least those of us beyond our teens) are by the
possibilities offered by the Internet and technologies such as virtual reality. Economic history is
largely a story of how these technological paradigms made their way through society, only to
ultimately become museum pieces, superseded by more modern technology.

Given the long time spans covered by these episodes of rise and decline, the long-run, wave-like
perspective offered by Schumpeter’s theory indeed seems to be the appropriate one. Figure 3
illustrates this general phenomenon by means of the basic innovations listed in its bottom part.
Obviously, the innovations at the beginning of the time period covered now appear ancient, and
none of them is still used in practice. However, as the analysis by Freeman brings out very clearly,
it is not so much the innovation dates of these major technological breakthroughs that matter for
the growth patterns experienced in the economy. By
their fundamental nature, these innovations require long periods of learning in firms and by
consumers, as well as supporting (infrastructural) investment. Hence it takes a long time before
their full impact becomes realized.

Figure 2.3. A Schematic overview of Technological Revolutions (based on Freeman’s interpretation)

Second, it is the diffusion of the technological paradigms to which the basic innovations give rise that
matters for economic growth, rather than the innovation itself. Thus, we observe that the major
technological characteristics of the subsequent revolutions that we have reviewed are associated
with technical breakthroughs that occurred in a previous wave rather than at the beginning of the
current wave. This finding is an important qualification made by Freeman and others to the
original long wave theory proposed by Schumpeter. The resulting picture is thus one in which
radical changes of technology are slowly introduced in the economy. The diffusion of basic
innovations is a gradual process of incremental change, but one with a tremendous long-run
impact. At the same time, this diffusion process is one in which opportunities from various ‘basic
innovations’ are combined in new ways, rather than the spread of a single innovation in isolation.

Third, basic innovations, or the technological paradigms that are associated with them, indeed have
the capacity to change the economy over long periods. Whether or not this long-run pattern is
adequately portrayed as a cyclical pattern, with periods of rapid and slow growth alternating,
remains an open issue for many economists. But what is perhaps even more important than the
question of the existence of such long waves of economic growth, is the notion that basic
innovations transform the society in a way that goes much beyond the one-dimensional view of
the economy from the point of view of growth rates.

The introduction of basic innovations and their development along technological paradigms and
trajectories leaves a deep impact on the structure of the economy in the widest interpretation of
this notion. Basic innovations change the sectoral composition of the economy (a narrow
interpretation of the notion of economic structure). The history that was reviewed above shows,
for example, the rise and (relative) decline of the textiles sector, the machinery sector, the electric
machinery sector, the chemical sector, and the electronics sector. Key factors associated with
these new industries and the new technological paradigms they embody slowly rise while the
paradigm develops, and such key factors associated to older paradigms see their influence decline
at the same time. Thus we see the rise and relative decline of iron, steel, plastic and information,
or, in the sphere of energy systems, water power, steam power, electricity and fossil fuels. Such
key factors represent pervasive factors in the economy, i.e., at the time of their rise they find their
way through the large majority of economic activities and all actors in the economy have to deal
with these key factors in some way or another.

Structural change can also be viewed in terms of the organization of the economy. Perhaps one
of the most crucial findings of the long-run analysis of the impact of (basic) innovations on the
economy is that they go hand-in-hand with profound changes in the organization of private firms.
These changes are not only caused directly by the changes in technology, but they also shape the
technological trajectory. In other words, the organization of firms and technological development
are two factors between which joint causality exists, and that can therefore hardly be disentangled.
As an example of this general principle, we have seen the gradual evolution towards larger (and
eventually, multinational) companies.

This trend was both the result of and a facilitator for the increased importance of scale and scope
economies associated with the technological paradigms developing during subsequent technological
revolutions. At the same time, there were important technologies facilitating larger firms, such
as increased opportunities for long-distance communication and travel. These facilitated internal
coordination within the large firms, and also enabled important institutional innovations, such as
the stock market, that supported the development of large firms. More recently, ICTs provided
a stimulus for a trend towards an increased importance of firm networks, in which small and
medium sized firms tend to play an important role alongside large (multinational) firms. This may
provide a break with the trend of ever increasing firm size.

Finally, the changes associated with the technological revolutions covered in this chapter, go
much beyond the pure economic realm. The increased possibilities for travel, for example, mean
much more in terms of personal freedom or happiness than can ever be captured in an index of gross
domestic product. A similar argument can be made with respect to almost all of the major
technological breakthroughs that were reviewed in this chapter.

References to the original works described in this chapter

Freeman, C., J. Clark and L. Soete (1982). Unemployment and Technical Innovation. London,
Pinter.
Freeman, C. and L. Soete (1990). "Fast Structural Change and Slow Productivity Change: Some
Paradoxes in the Economics of Information Technology." Structural Change and
Economic Dynamics 1: 225-242.
Freeman, C. and L. Soete (1997). The Economics of Industrial Innovation. 3rd Edition. London
and Washington, Pinter.
Landes, D. (1969). The Unbound Prometheus: Technological Change 1750 to the Present.
Cambridge, Cambridge University Press.
Schumpeter, J. A. (1934). The Theory of Economic Development. Cambridge MA, Harvard
University Press.
Schumpeter, J. A. (1939). Business Cycles: A theoretical, historical and statistical analysis of
the capitalist process. New York, McGraw-Hill (page numbers quoted in the text refer
to the abridged version reprinted in 1989 by Porcupine Press, Philadelphia).

Chapter 4.
Competition and Innovation
4.1. Introduction

One of the conclusions from chapter 2 was that monopoly power may increase the incentive for
R&D spending. In the models considered in that chapter, monopoly power was provided by
means of a patent, i.e., in a legal way. However, monopoly power may also result from economic
or technological sources. For example, when entry barriers in a particular sector are high because
of high capital intensity of the production process, a monopolistic market structure is likely to
result. This suggests that such differences in market structure may lead to differences in R&D
intensity (i.e., R&D expenditures as a fraction of sales, or as a fraction of the capital stock)
between sectors.

The relationship between market structure and R&D intensity is the topic of a literature on the
so-called Schumpeterian hypotheses. This literature draws on the view of Schumpeter that
innovation and R&D are mainly taking place in large monopolized firms. As was seen in the
previous chapter, this view must be attributed to Schumpeter’s later work, because in his earlier
work, he stressed the role of entrepreneurs and start-ups for innovation.

According to the Schumpeterian hypotheses, R&D intensity is higher for large firms, and for firms
with a high degree of monopoly power (firm size and monopoly power will tend to go hand in
hand). A number of factors have been suggested as an explanation for this phenomenon. In terms
of the discussion between demand pull and technology push from chapter 1, both these views of
the innovation process would suggest that large monopolized firms are in an advantageous
position with regard to innovation. From the demand pull perspective, one may argue that large
firms tend to supply to a larger part of the market and hence possess superior knowledge about
buyers’ preferences. They may also be better able to finance market research. The technology
push view suggests a high importance of basic R&D. To the extent that a large firm can employ more
scientists and engineers than a small firm, this also gives it an advantage in this department. A
further factor enhancing the positive relation between monopoly power and innovativeness is the
fact that large firms have more means to finance R&D from internal sources.

The reason why the Schumpeterian hypotheses are of interest is the implications they
have for the difference between static and dynamic efficiency of an economy. The notion of static
efficiency applies when we consider the economy in isolation from technological change, i.e., with
fixed technology. In this case, monopoly may lead to welfare losses because of the high prices that
are being charged to consumers. In line with the analysis in chapter 2, we may measure this static
welfare loss by the loss of consumer surplus, i.e., the triangle below the demand curve.
Because such welfare losses are minimized under a fully competitive market structure, monopolies
are considered to lead to static welfare losses, and hence to be inefficient from a static point of
view.

When we relax the assumption of fixed technologies, and instead assume that new technologies
are generated endogenously within the economy, dynamic welfare changes also become a relevant
concept. As in chapter 2, this refers to the notion that we should take into account welfare gains
from innovations. As was shown in that chapter, an innovation will result in more consumer
surplus if it (eventually) leads to lower prices, or higher product quality. If monopoly power leads
to a higher rate of innovation, the situation becomes paradoxical: on the one hand monopoly
power leads to static welfare losses, while on the other hand it leads to dynamic welfare gains.

In such a case, anti-trust laws would have to strike a balance between the two effects. Arguments
in favour of competitive markets would then lose at least a part of their attraction, at least when
considered in the light of the rate of innovation in the economy. With the current drive towards
more markets in most Western countries, this issue becomes highly relevant to present-day
political debates.

However, the trade-off between static and dynamic efficiency is only relevant when there is indeed
a positive relation between monopoly power and the rate of innovation. The arguments that we
have given above in favour of such a relationship were hardly the final verdict on this issue, and
one would have to put these to further empirical and theoretical scrutiny. For example, the
arguments that we have given in support of the Schumpeterian hypotheses are all related to the
resource side of the innovation process. The literature on the Schumpeterian hypotheses tends to
neglect the impact of incentives. As will be shown in the next section, a monopolist may well have
less of an incentive to innovate than a firm operating in a competitive market. The aim of this
chapter will therefore be to investigate in some theoretical depth the relationship between market
structure and innovation.

The role of incentives has been investigated in the literature on patent races. This is a group of
models that depend heavily on mathematical methods based on the assumption of weak
uncertainty. Sections 3 and 4 of this chapter will review the two most basic models from the
patent race literature in order to illustrate how the incentives factor may work against the simple
view of a positive relationship between monopoly power and innovativeness expressed in the
Schumpeterian hypotheses. It must be noted, however, that the patent race models that we will
consider do not take into account possible effects related to the impact of the availability of
resources on R&D expenditures (as is the main topic of the Schumpeterian hypotheses). Although
more complicated patent race models that take this into account exist, these are, by their relatively
complicated mathematical nature, outside the scope of this book.

Instead, we will consider in section 5 a model dealing with the relationship between market
structure and R&D in the tradition of evolutionary economics. In line with the discussion in
chapter 1, the main starting point of this model is the assumption that due to strong uncertainty
with respect to technology, firms operate under bounded rationality rather than the full rationality
from the patent race models. The evolutionary model deals with the differences between so-called
technological regimes, which are defined in terms of certain characteristics of the knowledge base
underlying an industry. As we will see, such differences may indeed lead to more sophisticated
conclusions about the relation between market structure and R&D expenditures or innovativeness
than the simple Schumpeterian hypotheses.

4.2. Incentives to innovate

In order to gain some understanding about the role of incentives in the innovation process, let us
go back to the 18th century, to be more precise the year 1769.17 In this year, James Watt obtained
an English patent ‘for a method of lessening the consumption of steam and fuel in fire engines’.
The patent was initially granted for a period of 14 years. With hindsight, scholars have judged
the Watt patent to be extremely broad in nature, so that during its lifetime, it could effectively
block all engines that used steam as the ‘working substance’ and made use of a separate
condenser.

James Watt went into a partnership with Matthew Boulton, and the Boulton & Watt Company
became highly successful in selling their steam engines. The patent was prolonged until 1800 by
an Act of Parliament in 1775, giving Boulton & Watt an effective 31 years of almost perfect
patent protection. Their usual way of doing business was to supply a buyer with drawings and
some crucial parts (some of the valves), but leave the actual work and all of the costs of erecting
the engine to the buyer. The Boulton & Watt Company would send one of its workmen to
supervise the procedure.

After the engine had been erected, its owner would have to pay a fixed fee to the Boulton & Watt
Company. In the case of reciprocating engines, which were mainly applied for pumping water
from flooded mines or in iron works or breweries, the fee was based on the savings in terms of
coal consumption incurred by the Watt engine, as compared to the Newcomen engine, which was
the state-of-the-art before James Watt. The owner of such a reciprocating engine would have to
pay one third of the value of the coal savings to Boulton & Watt, for as long as the engine was
in operation. In the case of rotary engines, there was no benchmark in the form of a Newcomen
engine, because one of the innovations by James Watt was exactly to transform the movement of
the steam engine to rotary movement. In this case, the fee was £5 per horsepower per year (£6
in the London area).

The fuel costs of a steam engine in £ can be calculated as

(P/240) × (C/2240) × HP × H,

where P is the price of coal (in pence per ton; because there are 240 pence in a £, we divide by
240), C is the coal consumption of the engine per horsepower (in lbs. of coal per horsepower
delivered per hour; because there are 2240 lbs. in a ton, we divide by 2240), HP is the number of
horsepower delivered by the engine, and H is the number of hours during which the engine is
operated. The performance of a steam engine was usually measured by its duty, which is defined
as millions of lbs. of water raised one foot per bushel of coal consumed. This can be transformed
into lbs. of coal per horsepower per hour (our unit of C in the above formula) by the calculation
C = 170 / duty (with duty measured in millions).
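
As a quick illustration of this formula, the following small sketch (in Python) implements the duty
conversion and the fuel-cost calculation; the constants 240, 2240 and 170 are the ones given above,
and any engine figures fed into it are purely illustrative.

    def coal_per_hp_hour(duty):
        # duty: millions of lbs. of water raised one foot per bushel of coal
        return 170.0 / duty

    def fuel_cost_pounds(P, C, HP, H):
        # P: coal price in pence per ton (240 pence = 1 pound)
        # C: lbs. of coal per horsepower per hour (2240 lbs. = 1 ton)
        # HP: horsepower delivered; H: hours of operation
        return (P / 240.0) * (C / 2240.0) * HP * H

    print(round(coal_per_hp_hour(7), 1))   # about 24.3, the Newcomen benchmark used below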

One of the main areas in which Boulton & Watt sold their machinery was Cornwall, which at the
time was a booming mining area. The engines sold there were all reciprocating, and hence the
Newcomen-based fee applied. The Newcomen engine, on which Boulton & Watt based their fee,
had a duty approximately equal to 7, or C ≈ 24.3. A typical Boulton & Watt engine in the 1790s
would have a duty between 18 and 20. Their best-practice technology was a duty of 27.5. The
latter corresponds to C ≈ 6.1.

17 The material for the case described below has kindly been provided by Alessandro Nuvolari. The
data is partly drawn from the work of Nick Von Tunzelmann.

Von Tunzelmann has collected data on the market circumstances in Cornwall. According to his
estimations, which refer to the year 1800, the price of coal was 266 pence per ton. The horsepower
installed in reciprocating engines in Cornwall was equal to 3250, and those engines worked for
approximately 4000 hours per year. Applying the above formula, we find that the savings incurred
by a (best-practice) Watt engine as compared to a Newcomen engine were

(266/240) × ((24.3 − 6.1)/2240) × 3250 × 4000 ≈ 117068£.

Boulton & Watt commissioned one third of this amount, i.e., 39023£ per year. Thus, on the
Cornish market, the Boulton & Watt company could earn approximately 40000£ per year (gross).

The Cornish mines were also a fruitful area for inventors of steam engines with far better
performance than the Watt engine, as we have already seen briefly in chapter 1. In particular
Richard Threvithick and Arthur Woolf were active in the conception of high-pressure steam
engines, which were considered to be too dangerous by James Watt (whose engine was a low-
pressure engine). In 1781, Jonathan Hornblower had invented the compound engine, which was
called this way because it used the steam twice or more (‘compounding’) at descending pressure.
Hornblower’s engine was prohibited under the Watt patent, and this was also the reason why
Threvithick and Woolf never manufactured any engines until well into the 19th century.
(Threvithick did get around to building the first steam locomotive in 1803).

In 1804, Woolf patented his compound engine, but only in 1812 did he actually build one. It
delivered a duty of 34, but after a construction error had been corrected in 1815, it delivered 52.2.
Threvithick erected an engine in 1812 that delivered a duty of 26.7 (thus comparable to the best-
practice Watt engine), but managed to raise the duty to 40 in 1816.

According to common wisdom, Hornblower’s 1781 invention opened up the way to high-pressure
steam and the improvements over the Watt engine that were realized in the beginning of the 19th
century by Threvithick and Woolf. James Watt, obviously one of the most renowned experts on
steam engines of the time, certainly would have been able to work on the basis of the Hornblower
design. Let us conduct a thought experiment and assume that Watt, Woolf and Threvithick were
all working on a better steam engine during the 1790s. Let us also assume that it would only be
pecuniary motivations that drove their inventive minds. Finally, let us assume that the fees that
Boulton and Watt were able to obtain were a reasonable reflection of the market power of a
monopolist supplying steam engines to the Cornish miners.

Suppose Richard Threvithick would have put his engine with duty equal to 40 on the market in
1800. The total cost savings of the engine compared to the Newcomen engine would have been
equal to

(266/240) × ((24.3 − 4.3)/2240) × 3250 × 4000 ≈ 128646£,

or an annual fee equal to one third of this, which is almost 43000£. Arthur Woolf would have made
an even better calculation, because his engine performed at a duty equal to 52.2, which, in 1800,
would have earned him one third of

(266/240) × ((24.3 − 3.3)/2240) × 3250 × 4000 ≈ 135078£,

i.e., approximately 45000£ per year.

Thus, Woolf and Threvithick would indeed have had a large incentive to build their machine if the
patent law had allowed it. Their invention would have paid them a sum of approximately
43000 (Threvithick) or 45000 (Woolf) per year. As economists, we call these amounts the
incentives for the inventors, against which they would have had to compare the development costs
of their engines.

Now let us compare these amounts to the incentive that James Watt had. Remember he and
Matthew Boulton were already earning a neat 40000£ per year on their current machine. If Watt
had invented Woolf’s engine, this would have eliminated the demand for his earlier
machine and hence have earned him only an extra 5000£ (45000£ − 40000£). With Threvithick’s
engine, Watt would have earned an extra 3000£ per year. This is clearly a much lower incentive
than that of either Threvithick or Woolf, and it is therefore likely that James Watt would have considered
the costs of developing a high-pressure steam engine excessive as compared to the gains he could
expect from the effort.18
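
The arithmetic of this thought experiment can be reproduced in a few lines. The sketch below uses
only the figures quoted above (the coal price, installed horsepower, operating hours, the C-values
and the one-third fee rule); the 40000£ figure for Watt’s current income is the rounded number
used in the text.

    coal_price = 266.0    # pence per ton (Cornwall, 1800)
    horsepower = 3250.0   # reciprocating horsepower installed in Cornwall
    hours = 4000.0        # hours of operation per year
    c_newcomen = 24.3     # lbs. of coal per horsepower-hour of the Newcomen benchmark

    def annual_fee(c_engine):
        # one third of the coal savings relative to the Newcomen engine
        savings = (coal_price / 240.0) * ((c_newcomen - c_engine) / 2240.0) * horsepower * hours
        return savings / 3.0

    fee_watt = annual_fee(6.1)          # best-practice Watt engine: about 39000 pounds
    fee_threvithick = annual_fee(4.3)   # duty 40: about 43000 pounds
    fee_woolf = annual_fee(3.3)         # duty 52.2: about 45000 pounds

    watt_current = 40000.0              # what Boulton & Watt already earn (rounded, as in the text)
    print(round(fee_threvithick), round(fee_woolf))                        # the outsiders' incentives
    print(round(fee_threvithick - watt_current), round(fee_woolf - watt_current))  # Watt's much smaller incentives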

This case provides an example of how the current market position of a firm may have an impact
on its willingness to innovate. As will be explained below, the different incentives between Woolf
and Threvithick on the one hand and James Watt on the other hand are known in the economic
literature as the replacement effect (because an innovation replaces the incumbent technology).
This effect acts as a retarding factor on the innovative efforts of a monopolist, and tends to favour
entrants in a market. However, there are also other factors in the incentive structure of a firm with
regard to innovation.

Theory

As illustrated by the thought experiment on steam engines in the 18th century, the economist views
profits as the prime motivation for a firm to innovate. And as the case showed, how much profits
a firm can expect from an innovation partly depends on its market position. To illustrate this point
theoretically, let us go back to the setting of the Nordhaus model that was discussed in chapter
2. We consider a process innovation that lowers the (constant) marginal production costs from c̄
to c̲ (the pre- and post-innovation levels, respectively). As in chapter 2, the demand curve is linear
and can be written as

q = A − Bp,   (3.1)

where all symbols have the same meaning as in chapter 2. The question we pose is how much a
firm would be willing to pay per period for an exclusive license of the patent on the innovation.

First, let us consider a monopolist. We derived the monopolist’s optimal price in chapter 2, and
this can be substituted in the demand and profit functions to yield
p* = A/(2B) + c/2,   q* = (A − Bc)/2,   Π = (A − Bc)^2/(4B),   (3.2)

18 Some readers may argue that James Watt would also have to take into account the possible loss of
his 40000£ fee if one of the other two inventors were to patent their engine. This is indeed true, and will be
incorporated into the theory in section 3 below.

where we write the general cost level c instead of either the pre- or post-innovation marginal
costs. By substituting both c̄ and c̲ in the profit function, we can derive the extra profits that
the monopolist will draw from the innovation:

Π(c̲) − Π(c̄) = ∫_c̲^c̄ q(p*(c)) dc = (1/2) A(c̄ − c̲) − (1/4) B(c̄^2 − c̲^2).   (3.3)

This is the maximum price a monopolist would be willing to pay per period for a license on the
patented innovation.

Let us now consider a firm that operates in a fully competitive market. By buying the license, the
firm will become a monopolist in the market. Whether or not it will be able to charge the
monopoly price given by equation (2) depends on whether the innovation is drastic or minor. As
defined in chapter 2, the innovation is minor if the optimal monopoly price is higher than c̄. Then,
as explained in chapter 2, the new monopolist will charge a price equal to c̄, and the extra profits
derived from licensing the innovation are

Π_minor = ∫_c̲^c̄ q(c̄) dc = (A − Bc̄)(c̄ − c̲).   (3.4)

Note that the only difference between equation (3) and (4) lies in the point at which the demand
function is evaluated: in the case of the monopolist (equation 3), it is evaluated at the
monopolist’s optimal price, while in the case of the competitive firm (equation 4), it is evaluated
at the pre-innovation marginal production costs. Because we are considering a minor innovation,
we know c̄ < p*(c̲), and hence (because the demand curve is downward sloping) q(c̄) > q(p*(c̲)).

From this it follows directly that in case of a minor innovation, the extra profits from the
innovation for a competitive firm (described by equation 4) are larger than those for a monopolist
(equation 3).

Now consider a drastic innovation. In this case, a monopolist and a competitive firm will earn
identical profits after the innovation, because the competitive firm becomes a de facto monopolist
after adopting the innovation. The competitive firm earned zero profits before the innovation,
while the monopolist earned positive profits. From this it again follows that the extra profits from
adopting the innovation for the competitive firm are larger than for the monopolist.

Thus, in both cases of drastic and minor innovation, the incentive of a competitive firm to
introduce the innovation is larger than the incentive of a monopolist. This result was first derived
by Nobel Prize winner Kenneth Arrow, and has become known in the literature as the replacement
effect. The intuition behind it is simple. A competitive firm has nothing to lose and hence only
to gain from introducing an innovation, while a monopolist is replacing herself, and hence has
current profits to lose. It is the formal theoretical counterpart of the steam engine example.

We may also compare the incentives of both the monopolist and the competitive firm to the
incentive of a social planner. In chapter 2, we staged the social planner as a benevolent agent that
tries to maximize total welfare in the economy. We showed that besides the profits of the
innovating firms, the social planner was also concerned with consumer surplus. Thus the total
welfare gain from the innovation can be calculated as

∫_c̲^c̄ q(c) dc.   (3.5)

This expression is clearly larger than equation (4), which is intuitively correct because the social
planner includes consumer surplus in her considerations. Hence, a social planner has a larger
incentive to adopt the innovation than a competitive firm, which in turn has a larger incentive than
a monopolist.
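
The ranking can also be verified numerically. The sketch below evaluates equations (3), (4) and (5)
for an illustrative linear demand curve; the values of A, B and the two cost levels are hypothetical
and chosen so that the innovation is minor.

    A, B = 100.0, 1.0           # demand curve q = A - B*p
    c_old, c_new = 60.0, 50.0   # marginal costs before (c-bar) and after (c-underline) the innovation

    p_star = lambda c: A / (2 * B) + c / 2      # monopoly price from equation (2)
    assert p_star(c_new) > c_old                # the innovation is minor for these numbers

    gain_monopolist = 0.5 * A * (c_old - c_new) - 0.25 * B * (c_old**2 - c_new**2)   # equation (3)
    gain_competitive = (A - B * c_old) * (c_old - c_new)                             # equation (4)
    gain_social = A * (c_old - c_new) - 0.5 * B * (c_old**2 - c_new**2)              # equation (5)

    print(gain_monopolist, gain_competitive, gain_social)   # 225.0 < 400.0 < 450.0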

4.3. Incentives and R&D spending: patent races

The impact of incentives on the decision on how much to invest in R&D is the topic of the
literature dealing with patent races. We will outline two of the most simple models from this
literature in order to evaluate the role of incentives in the relationship between market structure
and R&D expenditures. The general setting of these models is one in which a number of firms is
simultaneously trying to develop the same innovation. By raising its R&D expenditures, a firm
may increase its chance of making the actual invention sooner. However, only the firm that makes
the invention first will receive the patent. At such a moment, all other firms decide to stop their
R&D immediately and take their losses (their R&D does not yield any economic pay-off).
Obviously, in order to increase its probability of being the first to make the invention, and hence
to receive the patent, a firm may raise its R&D expenditures, but this is only feasible if the rewards
outweigh the expected R&D costs.

We consider two different setups of the model, relating to differences with regard to the market
structure already in existence. In the first case, so-called threatened monopoly, the situation is one
in which an existing monopolist controls the industry. This monopolist races with one firm that
is currently not operating in the industry for an innovation that would make the current
technology obsolete. In terms of the example of the previous section, this would be a situation
in which the Boulton & Watt company would race with either Threvithick or Woolf for the next
generation of steam engines. The main question we ask here is which of the two firms has a higher
incentive to innovate. The second model considers a more symmetric situation. In this case there
are n firms, none of which holds a more advantageous position than any of the others. Here we
analyze the question of how R&D expenditures are affected if the number of firms (n) grows.

Threatened monopoly

Discovery of the innovation is modeled as a stochastic process, where spending more on R&D
increases the probability of being first. The Poisson distribution is used to represent this process.
Assume that the probability of making the discovery during a (short) time interval of duration Δt is equal
to h(R)Δt, with h(R) = R^α (0 < α < 1), where R is, as before, the amount of R&D spending. Note that
this formulation is similar to the familiar R&D pay-off function of chapter 2, this time setting its
constant equal to 1 for analytical convenience.
We will also assume that all participants in the patent race are equally well placed to do R&D.

The first patent race that we will describe is one between a monopolist in an industry and an
outside firm. The race is once again for our familiar process innovation that lowers constant
marginal costs from c̄ to c̲. The monopolist is currently earning profits that we will denote by
Π^m(c̄). If it makes the discovery first, the other firm will stay out of the industry, and the
monopolist remains a monopolist, earning Π^m(c̲) > Π^m(c̄). If, on the other hand, the other firm
makes the innovation first, it will enter the industry. In case of a minor innovation, the monopolist
will also stay in the market, and a duopoly results. In this case, the entrant will be the
technological leader, and the pre-innovation monopolist will continue to use the old technology.

In the duopoly, both firms will earn ‘above normal’ profits, but the profits of the technology
leader will be larger than those of the laggard. We denote the profits of the technology leader (i.e.,
the entrant) by Π^d(c̲, c̄), and those of the laggard by Π^d(c̄, c̲) (thus taking the first argument
in the function Π^d as the firm’s own costs, and the second argument as the costs of its competitor).

We will not specify the exact shape of the profit functions Π^m and Π^d, but instead make two general
observations about how the profit levels compare to each other. The first observation relates to
the replacement effect, which was considered in the previous section. The replacement effect
states that a monopolist will have less incentive to innovate than a potential entrant, because the
monopolist is replacing herself, thus destroying current monopoly rents, while the potential entrant
has nothing to lose. As we will show below, this effect tends to lead to the potential entrant
winning the patent race.

The second effect goes contrary to the replacement effect, and is termed the efficiency effect. This
can be stated formally as follows:
Π^m(c̲) ≥ Π^d(c̲, c̄) + Π^d(c̄, c̲).   (3.6)

The efficiency effect states that a monopolist can always earn at least as much profits as two
duopolists together would be able to earn. Although it would be possible to illustrate this
argument by modeling in detail the competitive process between the two duopolists, we will
suffice with a general intuition of the argument. First, note that joint profits of two duopolists are
probably largest if they collude in some way, for example by drawing up some agreement over
prices. A monopolist would naturally be able to duplicate such a situation, and hence do at least
as well as the two colluding duopolists. However, collusion between two duopolists will not
always take place, for example because the anti-trust law prohibits such behaviour, or because of
a lack of trust between the two parties. In this case, the two duopolists will generally earn joint
profits that are lower than what they could earn with collusion. Hence, their joint profits will also
be lower than what a monopolist could earn.

To see the impact of the efficiency effect on the patent race, we must rewrite equation (6) slightly
to obtain
Π^m(c̲) − Π^d(c̄, c̲) ≥ Π^d(c̲, c̄).   (3.7)

Note that the lefthand side of this equation is the profit differential that the monopolist gets when
she succeeds in preempting the entrant. The right hand side of the equation is what the entrant
earns by innovating and entering. Hence, equation (7) states that, through the efficiency effect, the
monopolist has at least as large an incentive as the potential entrant. We thus conclude that the two effects, the replacement effect and the
efficiency effect work against each other, and that it depends on the specific situation which one
of the two will dominate over the other.

We will now derive the expected profits for each of the two firms. The current monopolist will
be indicated with a subscript 1, and the subscript 2 will be used for the potential entrant. Let us
first write down the expected profits V1 for the current monopolist. As long as no innovation takes
place, this firm will earn its current profits Π^m(c̄) minus the R&D outlays R_1. Using the definition
of the Poisson distribution, we can denote the probability of no innovation up to time t as
e^{−[h(R_1)+h(R_2)]t}, and hence the value of this option is equal to

e^{−[h(R_1)+h(R_2)]t} [Π^m(c̄) − R_1].   (3.8)

(Remember from chapter 1 that we value an option for the future by multiplying the probability
of that option by the pay-off it generates).

A next possibility is that the monopolist makes the discovery at time t. For this to happen, it must
first be the case that none of the two firms has made the discovery up to that point, and second
that the innovation is indeed made by the monopolist, and not by the potential entrant. The
probability of the first is (again) e^{−[h(R_1)+h(R_2)]t}, while the (instantaneous) probability of the
second is h(R_1) = R_1^α. The discounted value of the profit stream that is derived from this is equal
to Π^m(c̲)/ρ, where ρ as before is the discount rate. The value of this option is thus equal to

e^{−[h(R_1)+h(R_2)]t} R_1^α Π^m(c̲)/ρ.   (3.9)

(Note that because we have already subtracted R&D costs in option 1, we do not have to do that
again, all options after equation 8 are relative to the first option).

The third and last option is similar to the second, only that now the entrant innovates (with
probability R_2^α), and hence the profits earned are now duopoly profits made with the high cost
level:

e^{−[h(R_1)+h(R_2)]t} R_2^α Π^d(c̄, c̲)/ρ.   (3.10)

87
Now we are ready to add up the value of the three options, sum over time and discount to obtain
the net profits from the innovation. For simplicity we will assume that the patent that can be won
lasts for ever. Using equations (8) - (10), we get:19
 . . Πm
. Π(c, c)
d

Pe
(c)
V1 !2
e
(R1  R2 )2
Π(c) R1 
m .
R1  R2
0
! !
(3.11)
Œm(c) R1  R1 Œm(c) / !  R2 Œd(c, c) / !
. .
.
!  R1.  R2.

The expression for the potential entrant is somewhat simpler, because there is only one option that
yields non-zero profits. This is the case when the entrant makes the innovation. Thus, we can
build an expression similar to equation (9) and (10), but must also subtract R&D costs. This
yields:

V_2 = ∫_0^∞ e^{−ρt} e^{−(R_1^α + R_2^α)t} [R_2^α Π^d(c̲, c̄)/ρ − R_2] dt = [R_2^α Π^d(c̲, c̄)/ρ − R_2] / (ρ + R_1^α + R_2^α).   (3.12)
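
Readers who want to verify the integration step can do so numerically. The fragment below
approximates the integral in equation (11) on a fine time grid and compares it with the closed
form; all parameter values are hypothetical, and equation (12) can be checked in the same way.

    import math

    rho, alpha = 0.5, 0.5
    R1, R2 = 2.0, 3.0
    prof_m_old, prof_m_new, prof_d_lag = 8.0, 10.0, 1.0   # illustrative profit flows

    h1, h2 = R1**alpha, R2**alpha
    payoff = prof_m_old - R1 + h1 * prof_m_new / rho + h2 * prof_d_lag / rho

    dt = 0.001
    V1_numeric = sum(math.exp(-(rho + h1 + h2) * i * dt) * payoff * dt for i in range(200000))
    V1_closed = payoff / (rho + h1 + h2)

    print(V1_numeric, V1_closed)   # the two values coincide up to the discretisation error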

In order to obtain the firms’ optimal R&D expenditures, we must differentiate equations (11) and
(12) with respect to R1 and R2, respectively, and set the result to zero:
∂V_1/∂R_1 = { αR_1^{α−1} [Π^m(c̲) − Π^m(c̄) + R_2^α (Π^m(c̲) − Π^d(c̄, c̲))/ρ + R_1] − (ρ + R_1^α + R_2^α) } / (ρ + R_1^α + R_2^α)^2 = 0,

∂V_2/∂R_2 = { αR_2^{α−1} [Π^d(c̲, c̄) + R_1^α Π^d(c̲, c̄)/ρ + R_2] − (ρ + R_1^α + R_2^α) } / (ρ + R_1^α + R_2^α)^2 = 0.   (3.13)

Note that the two equations in (13) each contain the two R&D spending variables R1 and R2,
which indicates that optimal R&D spending of one firm depends on R&D spending of the other
firm. Hence we must once again apply game theory and solve for the Nash equilibrium (see the
model of patent height in chapter 2 for a short introduction to the concept). The Nash equilibrium
requires that both equations in (13) hold simultaneously, so that each firm’s R&D level is a best
response to that of its rival. We will show that in case of a drastic innovation, it necessarily follows
from this that the potential entrant spends more on R&D than the incumbent monopolist.

Let us start with the intuition behind this result. From the definition of a drastic innovation, we
get Π^d(c̲, c̄) = Π^m(c̲) and Π^d(c̄, c̲) = 0 (remember that with a drastic innovation, the firm that
does not have the state-of-the-art technology leaves the industry because it cannot earn positive
profits). By substituting these two expressions in equation (6) or (7), we find that these hold with
equality. This means that there is no efficiency effect, and hence we would expect the replacement
effect to dominate the patent race.

19 Note that ∫_t^∞ e^{−ρτ} X dτ = X e^{−ρt}/ρ. Hence, division by ρ in equations (9) and (10) was
necessary even though we discount again in equation (11).

To prove this result, first substitute Π^d(c̲, c̄) = Π^m(c̲) and Π^d(c̄, c̲) = 0 into equations (13).
Both first-order conditions then state that an expression of the form αR_i^{α−1}[ · ] must equal
ρ + R_1^α + R_2^α; in the Nash equilibrium both conditions hold, so these two expressions must be
equal to each other. This yields:

αR_1^{α−1} [Π^m(c̲) − Π^m(c̄) + R_2^α Π^m(c̲)/ρ + R_1] = αR_2^{α−1} [Π^m(c̲) + R_1^α Π^m(c̲)/ρ + R_2].   (3.14)

Now evaluate both sides of equation (14) at R_1 = R_2 = R > 0. Because R^{α−1} > 0 and because
monopoly profits with the innovation are larger than without the innovation (Π^m(c̲) > Π^m(c̄) > 0),
the lefthand side then falls short of the righthand side by exactly αR^{α−1} Π^m(c̄) > 0. Thus
equation (14) cannot be satisfied with R_1 = R_2: at any symmetric point the potential entrant’s
marginal return to R&D exceeds that of the monopolist, so the monopolist wants to spend less than
the entrant. Hence there is no Nash equilibrium in which the monopolist invests as much as (or
more than) the potential entrant; the Nash equilibrium must have R_1 < R_2.

Figure 1 shows the two reaction curves and the corresponding Nash equilibrium for the case of
a drastic innovation. Note that the Nash equilibrium lies above the 45 degrees line, which means
that the potential entrant spends more on R&D than the incumbent monopolist.

Figure 2.1. Nash equilibrium in the ‘threatened entry’ patent race with drastic innovation

We thus conclude that in case of a drastic innovation, the efficiency effect is not present, and the
replacement effect dominates the patent race. This means that a monopolist will invest less in
R&D than a potential entrant. Note that this does not necessarily mean that the potential entrant
is the first to innovate, because the R&D process is stochastic. But the result does imply that on
average, drastic innovations in a specific market will be made more often by outsider firms than
by firms that have large market power in that market. Similarly, the model can be used to show
that under a minor innovation, the monopolist may have a higher incentive to innovate, and hence
spend more on R&D.
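
For readers who like to see the result emerge from the equations, the sketch below computes the
Nash equilibrium of the drastic-innovation race by iterating best responses on equations (11) and
(12). All parameter values are hypothetical illustrations.

    rho, alpha = 1.0, 0.5             # discount rate and hazard exponent (hypothetical)
    prof_old, prof_new = 15.0, 20.0   # monopoly profit flows before and after the innovation

    def V1(R1, R2):
        # incumbent's expected value, equation (11) with the drastic-innovation substitutions
        h1, h2 = R1**alpha, R2**alpha
        return (prof_old - R1 + h1 * prof_new / rho) / (rho + h1 + h2)

    def V2(R1, R2):
        # potential entrant's expected value, equation (12)
        h1, h2 = R1**alpha, R2**alpha
        return (h2 * prof_new / rho - R2) / (rho + h1 + h2)

    grid = [0.01 * i for i in range(1, 20001)]   # candidate R&D levels between 0.01 and 200

    R1, R2 = 1.0, 1.0
    for _ in range(30):                          # best-response iteration converges quickly here
        R1 = max(grid, key=lambda r: V1(r, R2))
        R2 = max(grid, key=lambda r: V2(R1, r))

    print(R1, R2)   # R2 ends up above R1: the potential entrant spends more on R&D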

Symmetric patent races

There is one other important aspect of patent races: from a social point of view, they lead to
excessive R&D expenditures because firms tend to duplicate each other’s efforts. We illustrate
this phenomenon by describing a symmetric patent race. The race is called symmetric because all
firms are assumed to be equal: they are all potential entrants in an industry that does not yet exist,
and hence they have no current profits. Compared to the previous model (which was asymmetric
because one firm has a different profit perspective than the other firm), this simplifies things
greatly because it eliminates the different types of profits. We denote the profit stream that the
firm can expect from innovation by Π. Otherwise, we will not change anything compared to the
asymmetric patent race.

Let us consider one firm that is racing with n-1 other firms, so that there are n firms in total. Let
us denote the R&D expenditures of this firm by R1. Because all other firms are equal, our firm
assumes that all its rivals behave alike, which means that we can denote their individual R&D
expenditures by R. We can set up the expected intertemporal profits V for the firm in the same
way as we did before.

V = ∫_0^∞ e^{−ρt} e^{−[R_1^α + (n−1)R^α]t} [R_1^α Π/ρ − R_1] dt = [R_1^α Π/ρ − R_1] / (ρ + R_1^α + (n−1)R^α).   (3.15)

Differentiating equation (15) and setting the result to zero, in order to obtain the firm’s reaction
curve, yields
[(n−1)R^α + ρ] [αR_1^{α−1} Π/ρ − 1] − R_1^α (1 − α) = 0.   (3.16)

If we differentiate the lefthand side of this equation with respect to n or with respect to R (not R_1!),
we obtain a positive result. Given that the second-order condition for a maximum holds, it can then be
concluded that an increase in the number of rivals (n) or in the amount of R&D that each of these
rivals spends (R) will raise R&D spending by the firm.

Because all firms are equal, the Nash equilibrium will be a symmetric one, i.e., all firms spend the
same amount on R&D in the equilibrium. To solve for the Nash equilibrium, we thus set R1 = R
and substitute this into equation (16). This yields:
[(n−1)R_1^α + ρ] [αR_1^{α−1} Π/ρ − 1] − R_1^α (1 − α) = 0.   (3.17)

This equation indeed has one solution, as is depicted in Figure 2. The curves in the figure display
the lefthand side of equation (17). The points where the N curves intersect with the horizontal axis
correspond to the Nash equilibrium. As can be derived from equation (17), increasing the value
of n will shift the Nash equilibrium to the right, i.e., increase the equilibrium value of R&D
spending of the firms participating in the race. Hence the curve N1 corresponds to a higher value
of n than the curve N0.

The vertical lines V indicate the (private) value of the innovation. From equation (15), it is clear
that this value decreases with the number of firms, so that the V line will move to the left for
increasing n. This is caused by the fact that the probability for each firm to be the one that
discovers the innovation first declines with the number of competitors.

The V1 (V0) curve is drawn for the same number of firms as the N1 ( N0) curve. The tendencies
of the Nash equilibrium shifting right and the V curve shifting left for increasing n together will
lead to the outcome that for a certain value of n, R&D is no longer profitable. This occurs where
the vertical V line intersects with the horizontal axis to the left of the point where the N curve
intersects with the horizontal axis. In the figure, this is the case for N1 / V1. At this point, it is no
longer profitable to undertake any R&D. In a dynamic context, where entry into the research
sector is free, we may imagine that entry keeps occurring until the point is reached where the V
and N curve intersect, and ‘extra profits’ from the innovation are zero. We will use this property
of the patent race model later on in chapter .. when we look at a model of macroeconomic
growth.

Figure 2.2. Nash equilibrium in a symmetric patent race

So far, we have shown that when the number of firms in the industry increases, R&D spending
per firm goes up, until the point is reached where R&D is no longer profitable. We will now show
that this process of rivalry leads to wasteful R&D spending. To show this, we compare the
outcome of the symmetric patent race to a hypothetical situation in which a single monopolist firm
would undertake n parallel research projects, each of which receives the same amount of
spending. We will denote this level of spending by R_m. In order to obtain the expected profits
for this monopolist firm, we substitute R_m for R_1 and R in equation (15), and multiply the result
by n (equation (15) gives the expected pay-off for one project). This yields:

V = n ∫_0^∞ e^{−ρt} e^{−nR_m^α t} [R_m^α Π/ρ − R_m] dt = n [R_m^α Π/ρ − R_m] / (ρ + nR_m^α).   (3.18)

In order to obtain the optimal level of R&D spending, we differentiate this with respect to Rm and
set the result to zero to obtain:
αR_m^{α−1} Π/ρ − 1 = nR_m^α (1 − α)/ρ.   (3.19)

Let us call this the ‘centralized’ solution to the optimization problem. The decentralized solution
is the one we obtained earlier in equation (17). From this equation, we may write:
αR_1^{α−1} Π/ρ − 1 = R_1^α (1 − α) / [(n−1)R_1^α + ρ].   (3.20)

Now let us assume that R_m > R_1 and investigate the implications of this assumption. Firstly, we
note that the lefthand side of equation (20) is then larger than the lefthand side of equation (19)
(remember that α < 1!). Secondly, the denominator on the righthand side of equation (20) is larger
than the denominator on the righthand side of equation (19), because n > 1. Thirdly, under the
same assumption and again because n > 1, the numerator on the righthand side of equation (19)
is larger than the numerator on the righthand side of equation (20). The last two of these
conclusions together imply that the righthand side of equation (19) is larger than the righthand
side of equation (20). But since both equations are equalities, this is at odds with the first
conclusion. We must therefore conclude that the assumption R_m > R_1 is not consistent with the
equilibrium conditions of the model. Hence, we conclude that R_m < R_1 must hold.
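
The same conclusion can be checked numerically by solving equations (17) and (19) with a simple
bisection routine; the parameter values below are hypothetical.

    alpha, rho, Pi, n = 0.5, 1.0, 20.0, 4   # hypothetical parameter values

    def decentralized(R):
        # lefthand side of equation (17)
        return ((n - 1) * R**alpha + rho) * (alpha * R**(alpha - 1) * Pi / rho - 1.0) - R**alpha * (1.0 - alpha)

    def centralized(R):
        # equation (19), written as lefthand side minus righthand side
        return alpha * R**(alpha - 1) * Pi / rho - 1.0 - n * R**alpha * (1.0 - alpha) / rho

    def bisect(f, lo, hi, tol=1e-9):
        # simple bisection; f must change sign between lo and hi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)

    R_1 = bisect(decentralized, 1e-6, 1000.0)   # R&D per firm in the symmetric race
    R_m = bisect(centralized, 1e-6, 1000.0)     # R&D per project chosen by a single monopolist
    print(R_m, R_1)   # R_m comes out far below R_1 (about 4 versus about 74 here)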

Thus we have shown that compared to a single monopolist, any number of decentralized firms
together will spend more on R&D, or, in other words, these decentralized firms spend more on
R&D than is warranted by the joint value of the innovation. Obviously, this
phenomenon results from rivalry and duplication in the R&D process. But the derivations so far
have not given us a clear intuition behind this important result. We will therefore illustrate the
intuition behind it with an example in the next section.

4.4. Patent races as a common pool problem

The winner takes all in the patent race, and the losers get less than nothing (they lose the
money they invested in R&D). This is the point of view of the individual firm taking part in the patent
race. But does this also hold if we consider the impact of the innovation on the economy as a
whole? Obviously not. Consumers do not care about which firm wins the race, because any
winner is as good as any other winner. They will all supply the same product.

Thus, an individual firm has a clear incentive to invest more in R&D as the number of rivals goes
up. This will raise its probability of winning the race compared to the situation of not increasing
R&D spending, and might thus make the difference between all (winning the race) and less than
nothing (losing the race). From the aggregate point of view, however, the benefits of increased
nothing (losing the race). From the aggregate point of view, however, the benefits of increased
spending are not so obvious. Increased spending shortens the expected innovation date, and this
raises the total pay-off of the innovation because it reduces the negative impact of discounting.
But this is a small increase as compared to the ‘less-than-nothing’ to ‘all’ increase that is relevant
for the firm. Thus, from the aggregate point of view, the increased spending that occurs as a
reaction of an increased number of participants is clearly wasteful.

This is an example of the general class of problems known in economics as the ‘common pool
problem’. This problem arises with resources that are in some way common property, and to
which access is unrestricted. The most commonly used example to illustrate the problem is a
fishery. Consider a lake with fish in it. The lake and its population of fish have a fixed capacity.
The more fishermen that ply their trade on the lake, the less each of them will catch. More
specifically, assume that the relation between the number of fishermen and their catch is as
described in Table 1.

Number of fishermen   Catch per fisherman   Total catch   Marginal increase in total catch
        0                      -                  0                     -
        1                    100                100                   100
        2                     90                180                    80
        3                     80                240                    60
        4                     70                280                    40
        5                     60                300                    20
        6                     50                300                     0
        7                     40                280                   -20

Table 1. The common-pool problem

Now consider that instead of becoming a fisherman, one may also become a computer
manufacturer. The pay-off to becoming a computer manufacturer (for the same period as the
catches in the table refer to) is equal to the equivalent of 60 fish. How many fishermen will we
have in the example of Table 1? The answer is five. To see why this is the case, imagine what
would happen if there were six fishermen. Each of them catches 50 fish, which means each one of
them would be better off manufacturing computers. As soon as one of the fishermen actually
enters the computer business, the other five no longer have a reason to stop fishing, because they
will catch 60 fish, the equivalent of what they can earn in the computer business.

Now imagine there are only three fishermen. A fourth one could earn 70 fish, and this is obviously
more profitable than making computers. A fifth fisherman would earn exactly as much in fishing
as in computers, and if we assume that there is at least one person who likes fish better than
computers (and hence prefers fishing to computer manufacturing when the pay-offs to both
activities are equal), a fifth person will enter the fishery. A sixth one will not enter, however, for the
reasons described above.

Thus, five fishermen is the equilibrium that will arise with open access to the common resource
of the lake. But is this an optimal solution from the aggregate point of view? Obviously not. As
is clear from the last column (which is simply the first difference between rows in the third
column), the first fisherman that sailed the lake increased total social benefits by 100, the second
by 80, the third by 60, the fourth by 40, etc. Notice that already with the fourth fisherman, total
welfare in society was negatively affected. This fisherman brought in an extra 40 fish to the
community, while making computers would have brought the equivalent of 60 fish. But this fourth
fisherman still makes 70 fish, so for him as an individual it makes sense to fish instead of making
computers.
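
The numbers in Table 1 can be turned into a small computation that makes the wedge between
the private and the social entry decision explicit.

    catch = {1: 100, 2: 90, 3: 80, 4: 70, 5: 60, 6: 50, 7: 40}   # catch per fisherman (Table 1)
    outside = 60   # pay-off of becoming a computer manufacturer, expressed in fish

    # Free entry: fishermen keep entering as long as the individual catch is at least
    # the outside pay-off (with the tie broken in favour of fishing).
    equilibrium = max(n for n in catch if catch[n] >= outside)

    # Social optimum: an extra fisherman is worthwhile only if the marginal increase in
    # the total catch is at least the outside pay-off.
    total = {0: 0}
    for n in sorted(catch):
        total[n] = n * catch[n]
    optimum = max(n for n in catch if total[n] - total[n - 1] >= outside)

    print(equilibrium, optimum)   # five fishermen enter, although three would be optimal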

What causes this paradoxical situation is the negative externality that the fishermen have on each
other. For each one of them, entering brings net benefits to themselves as long as they are one of
the first five to enter, but it also affects the earnings of the others negatively. The aggregate effect
is the sum of the individual effect and the externality. Note that such a negative externality is the
counterpart of positive externalities that were considered in the discussion in chapter 1 on public
goods. Imitation of technological knowledge can be considered as a positive externality.

How does this relate to the patent race? The feature that is shared between the fishery example
and the (symmetric) patent race is that in both cases, entry of one firm has a negative impact
(externality) on the other firms in the race. As described above, entry of an additional firm in the
patent race lowers each firm’s probability of winning the race, and it also drives up the costs. Thus,
the patent race is very much like a fishery in the common pool of undiscovered knowledge.
Because access to the pool is open, it is very likely to be overused.

4.5. Technological regimes: Schumpeter revisited

The models that have been considered so far assume a large degree of rationality and availability
of information on the part of firms making investments in R&D. For example, it has been assumed
that firms know exactly the relationship between their R&D investment and the (expected) cost
reductions resulting from innovation. Also, firms have been assumed to know the actions of their
competitors, and to include this in their own decisions. The only uncertainty that we have allowed
for is in the relationship between R&D expenditures and innovative success. It was assumed that
firms know the average outcome of this relationship. In other words, we have been operating
under the assumption of weak uncertainty so far.

Given the analysis of technological change as a phenomenon to which, in many cases, strong
uncertainty applies rather than weak uncertainty, it is a relevant question to ask how our findings
with regard to the relationship between competition and R&D investment hold under a range of
models that do not rely on the assumption of weak uncertainty. One well-known model from the
field of evolutionary economics that deals with this question is the model that was presented by
Nelson and Winter (1982), and which was later refined by Winter (1984). The model asks the
question how market structure and R&D-intensity (R&D expenditures as a fraction of sales) co-
evolve under different so-called technological regimes.

The term technological regime relates to the possible sources of knowledge that a firm may use
to develop an innovation. These can be distinguished into internal and external sources. As we
saw in chapter 1, an important internal source of knowledge is the R&D department (if present),
but also other departments, such as production or marketing, may be viewed as internal
knowledge sources. External knowledge sources include competitors (which might be imitated),
firms in other lines of business with related technologies, and public and semi-public
knowledge institutes (such as universities). A technological regime can be defined as the set of
conditions that determine how important these different sources of knowledge are relative to each
other, and the ease with which they can be tapped into.

The traditional distinction is between two regimes that can be traced back to Schumpeter’s theory
of basic innovations and where they stem from. In a stylized way, the image of Schumpeter’s
entrepreneur has become the model for a technological regime in which entrants into an industry
hold a relatively favourable position relative to incumbent firms. This is often called the
Schumpeter Mark I regime, or the entrepreneurial regime. We will adopt the second term here.

The Schumpeter Mark II regime is based on Schumpeter’s vision that large established firms are
the main source of (basic) innovations. As was explained in chapter 3 and the introduction to this
chapter, this vision emerged in Schumpeter’s writings while he was witnessing the rise of the large
managerial firm in the United States in the 1920s and 1930s. In the Schumpeter Mark II, or
routinized (which is the term we will adopt), regime, incumbent firms have a relative advantage
over outsiders with regard to developing innovations.

The model that we will consider assumes that by their nature, various fields of technological
knowledge can be classified as corresponding to either the routinized or entrepreneurial regime.
The question the model asks is how market structure and R&D intensity co-evolve under these
two regimes. The appendix to this chapter gives the exact equations of the model. Here, we will
only discuss the general structure of the model, as well as some results that may be obtained from
simulation studies using the model.

The setting of the model is one of process innovation. Firms sell a homogeneous product, and
capital is the only production factor. Process innovations make capital more productive.
Production costs further depend on a fixed rate of variable costs. Firms perform two distinct kinds
of R&D: innovative R&D, which yields completely new production techniques, and imitative
R&D, which is aimed at copying techniques already in use by other firms. A higher ratio of R&D
expenditures (of either type) per unit of capital already in use yields a higher probability for an
innovation or imitation (depending on the type of R&D).

Firms choose an R&D policy for both types of R&D. Such a policy is defined as the ratio of R&D
expenditures (of a certain type) to capital already in use. Note that increasing this ratio raises the
probability of finding a new production technique (innovation or imitation), but also decreases
current profits, because R&D expenditures are deducted from gross profits (sales minus
production costs) to yield net profits. Firms are assumed to act under bounded rationality. More
specifically, they are ‘satisficers’: as long as they earn a profit rate (net profits per unit of capital)
that is higher than the industry’s average rate, R&D policies are not changed. However, when
profits fall below the industry’s average, the firm will usually change its R&D policy. In this case,
the firm’s policy is adjusted to the industry’s average policy. The speed of this adjustment process
can be varied.

Firms produce at their full production capacity (determined by their capital stock and
technology level), and sell all products in the same period as they are produced. The market price
is determined according to a demand function that is constant over time (demand is equal to
supply and hence market price results from the demand function). Firms make investment plans
by taking into account a desired markup rate which maximizes their profits under the assumption
that other firms’ output remains constant. This desired markup depends positively on the market
share of the firm. Investment plans are subject to a financing constraint: more profitable firms are
able to invest more than less profitable firms.

Entry of new firms and exit of existing firms are endogenous processes. Exit occurs if a firm falls
below a certain threshold size (of the capital stock), or if a firm’s performance (profits) falls below
a (different) threshold. Entry is assumed to depend on technological opportunities. To this end,
it is assumed that ‘external’ R&D takes place (e.g., by private inventors, firms outside the industry
or by researchers at universities). Such external R&D yields innovation and imitation draws
according to essentially the same process as R&D by incumbent firms does.

The entrepreneurial and routinized regimes differ in two respects. The first is that the efficiency of
innovative R&D differs. For equal levels of innovative R&D spending (per unit of capital), the
entrepreneurial regime yields a lower probability of an innovation. This also implies that for
given levels of external R&D, the probability of innovative entry would be lower under the
entrepreneurial regime. But this tendency is offset by assuming a higher amount of external R&D
under the entrepreneurial regime as compared to the routinized regime, such that the resulting
probabilities for an innovative draw as a result of external R&D are exactly equal between the two
regimes.

The second difference between the two regimes lies in the way in which innovative draws
(productivity levels corresponding to innovations) are made. Under the entrepreneurial regime,
innovation draws are made from a random distribution with steadily growing (over time) mean
(called latent productivity). Entrants draw their innovations from the same distribution as
incumbent firms. The routinized regime uses a similar distribution, but here the mean of the
distribution is equal to the average of the firm’s current productivity and latent productivity
(which grows at the same rate as under the entrepreneurial regime). In this regime, the
distribution’s mean for entrants’ innovative draws is equal to the average of latent productivity
and a base level productivity parameter that remains constant over time. This implies that under
the routinized regime, innovation depends not only on external science, but also on the firm’s own
experience. As a result, entry by innovation (new production techniques) will become much more
difficult when the industry grows older in the routinized case. Eventually, such entry will cease
completely, although entry by means of imitation remains possible.

Note that these differences between the two regimes imply by assumption that the rates of
(innovative) entry will differ between the routinized and entrepreneurial regimes. Thus, possible
differences between the two regimes in terms of entry and market structure will at least partly be
a direct consequence of the assumptions made. This does not hold, however, for R&D intensity
(R&D expenditures as a fraction of the capital stock), or the profit rate. These variables are
completely endogenous.

The model can be explored using a computer program for Windows PCs available from the
author’s website. The pictures below are taken from the computer program. There are two modes
in the simulation program: single run mode and batch run mode. The first of these enables the user
to investigate in a detailed way the simulated history of the industry. Figure 3 displays two such
histories, for both regimes using the default parameter set.

For each of the two runs, there are four panels. The two panels at the top display variables that
are related to the market structure. On the left, the number of innovations and imitations (both
total and only referring to entrants) is displayed. A comparison between the two regimes for this
panel reveals one immediate difference between the regimes. Under the routinized regime, both
the number of imitations and the number of innovations are higher than under the entrepreneurial regime.
Moreover, under the entrepreneurial regime, the number of innovations stagnates relative to the
number of imitations, whereas under the routinized regime, these two variables grow more or less
along the same path. Under both regimes, entry stagnates in the long run, but under the
entrepreneurial regime, imitative entry is clearly higher than innovative entry. For the routinized
regime, the levels of both types of entry are more or less the same.

The right-top panel displays measures of market structure: the number of firms, the Herfindahl
N and the C4 measure. The Herfindahl N indicator is defined as the inverse of the sum of squared
market shares of all firms. This indicator is high when the market is competitive (many relatively
small firms), and low when the market is concentrated (monopolistic). The C4 indicator measures
more or less the same phenomenon, and is defined as the sum of the market share of the four
largest firms. This indicator is high (low) for monopolistic (competitive) markets. The main
difference between the regimes for these indicators relates to the evolution over time. After an
initial phase of adjustment, the market structure indicators for the entrepreneurial regime do not
show a clear trend, but fluctuate around a more or less stable level. In case of the routinized
regime, however, there is a clear tendency for the market to become more monopolized over time,
although the number of firms does not show a clear (downward) trend.
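To make these two definitions concrete, the following short sketch computes both indicators for a hypothetical industry. Python is used here purely for illustration (it is not the language of the simulation program), and the market shares are invented numbers.

```python
def herfindahl_n(shares):
    """Inverse of the sum of squared market shares ('numbers equivalent')."""
    return 1.0 / sum(s ** 2 for s in shares)

def c4(shares):
    """Sum of the market shares of the four largest firms."""
    return sum(sorted(shares, reverse=True)[:4])

# A hypothetical industry with six firms (shares sum to one).
shares = [0.35, 0.25, 0.15, 0.10, 0.10, 0.05]
print(herfindahl_n(shares))  # about 4.3 'equivalent' equally sized firms
print(c4(shares))            # 0.85: the four largest firms hold 85% of the market
```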

The two bottom panels show indicators related to technology. The left-bottom panel refers to
productivity. Due to the large number of innovations and imitations under the routinized regime,
the indicators for best-practice productivity and average productivity are rather smooth in this
case. For the entrepreneurial regime, best-practice productivity improves in more occasional steps, while
average productivity catches up slowly to the best-practice level.

Finally, the right-bottom panel shows the average R&D expenditures (as a fraction of capital) and
the profit rate. By far the largest fraction of total R&D goes to innovative R&D, under both
regimes. At the end of the run, the routinized regime shows a clearly higher rate of R&D
investment than the entrepreneurial regime, but this is not the case in the early stages of the runs.

However, such observed differences may well be the result of random flukes occurring in either
of the two simulation runs. In order to get a more reliable picture of the differences between the
regimes, we need a larger number of runs (with different realizations of random events) over
which we can take means. These means can then be compared using statistical methods, after
which a more final answer can be given to the question as to how the regimes differ. This is what
is done in batch mode of the program. Figure 4 displays the results of a batch run for the default
parameter set.
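The logic of such a batch comparison can be illustrated with a small sketch (again Python, with invented numbers rather than actual output of the simulation program): collect the value of a variable of interest at the end of many independent runs under each regime, compare the two means, and test whether the difference is statistically significant, for instance with a t-test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(123456789)

# Hypothetical end-of-run R&D intensities from 50 independent runs per regime.
rnd_routinized = rng.normal(loc=0.012, scale=0.003, size=50)
rnd_entrepreneurial = rng.normal(loc=0.009, scale=0.003, size=50)

ratio = rnd_routinized.mean() / rnd_entrepreneurial.mean()
t_stat, p_value = stats.ttest_ind(rnd_routinized, rnd_entrepreneurial)

print(f"ratio routinized/entrepreneurial: {ratio:.2f}")
print(f"difference significant at 5%: {p_value < 0.05}")
```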

Figure 3. Simulation results for single runs, default parameter set

The vertical axis measures the ratio for a particular variable between the routinized regime and
the entrepreneurial regime. A value larger (smaller) than one indicates a higher (lower) value
under the routinized regime (note that the axis has a logarithmic scale). The color of the bar
indicates the statistical significance of the difference: dark colors indicate significantly different
values, while light colors indicate differences that are not significant. For each variable, there are
four bars, which correspond to different phases of the simulation period. The first bar indicates
the first 12 ½ periods, the second bar the second 12 ½ periods, etc. Note that over the simulation
period, both the value of the difference between the regimes and the statistical significance of this
difference may change.

Figure 4. Simulation results for batch runs, default parameter set

For the default parameter set, most of the differences are statistically significant. Only for the first
12 ½ periods of the simulation are there some values that are not significantly different between
the runs (notably the number of imitative entrants, the number of firms, the Herfindahl and C4
indicators, innovative R&D and total R&D). In general, the results in Figure 4 confirm the
impressions from Figure 3. The routinized regime shows a significantly more monopolistic market
structure towards the second half of the simulation period, but the reverse is true for the period
just after the initialization (second bar). The average age of firms is higher for the routinized regime.
The difference in the profit rate between the two regimes also reverses sign over the simulation
period. Initially, it is higher for the entrepreneurial regime, but for the last three quarters of the
runs it is higher for the routinized regime. Total R&D expenditures are higher under the
routinized regime, but this is the result of higher innovative R&D expenditures. Imitative R&D
expenditures are lower for the routinized regime.

Thus, the results for the innovation regimes model tend to confirm the Schumpeterian hypotheses
discussed in the introduction: R&D intensity is higher in industries that are monopolistic. But this
finding is not the result of a one-way causal interaction from market structure to R&D
expenditures as in the simple Schumpeterian literature. Instead, market structure and R&D
intensity co-evolve under the influence of (exogenous) differences in the knowledge base. There
is also an important qualification to the simple Schumpeterian result: while innovative R&D is
higher for market structures that are relatively monopolistic, the reverse holds for imitative R&D.

Obviously, varying some of the parameters will change the degree and nature of the differences
between the regimes. The reader is encouraged to experiment with the computer program in this
way.20 Two particularly interesting situations emerge by setting the exogenous innovation step
(l) to a lower value (0.01 is suggested), or by setting the degree to which firms are focused on the
short run when revising their R&D policies (ω) to a higher value (0.6 is suggested). The first
change will greatly reduce the differences between the regimes in terms of R&D intensity, as a
result of the fact that R&D becomes much less important for firm performance. The second
change will mainly lower the differences in terms of imitative R&D, but leave innovative R&D
relatively unaffected, and also increase differences in terms of profitability between the two
regimes.

4.6. Conclusions

We have now put a variety of theoretical approaches to work in investigating the relationship
between market structure and R&D expenditures. These approaches are complementary to each
other, and consider the problem at hand from different angles. The patent race models that have
been considered tend to stress the role of incentives on the decision on how much to invest in
R&D. Two cases were considered: a patent race between an incumbent monopolist and a
challenging potential entrant, and a patent race between a number n of equal rivals.

The point of reference in our discussion has been the hypothesis that large firms with a high
degree of monopoly power tend to spend more on R&D than small firms that operate on a
competitive basis. This hypothesis has been put forward in the literature on the so-called
Schumpeterian hypotheses. The models that we have reviewed all present important qualifications
to this simple point of view.

In the first case, we concluded that the impact of monopoly power on R&D expenditures really
depends on the size of the expected innovation. In the case of drastic innovations, the potential
entrant will tend to spend more on R&D than the incumbent monopolist, while the reverse is true
for minor innovations. The reason is that in the case of drastic innovations, the losses for the
monopolist due to the so-called replacement effect are relatively large. This model therefore
presents our first important qualification to the hypothesis, namely that the impact of monopoly
power on R&D expenditures depends on factors such as the size of the innovation, and no wide-
reaching general conclusions can be reached.

20 Please take care that the resulting parameter set ‘makes sense'. Simulation is not a blind tool to
investigate nonsensical situations, and the program is certainly not ‘fool proof’. For example, it is relatively
easy to generate a nonsensical result as well as a runtime error by setting l = 0.0, which implies no technical
change takes place.

The second patent race model, in which a symmetric race is analyzed, underlines the possibility
that due to competition between firms, R&D expenditures of each individual firm may rise. This
is obviously a tendency that goes against the simple Schumpeterian hypothesis of a positive
relation between size and monopoly power on the one hand and R&D expenditures on the other
hand. However, the spiral of higher R&D expenditures due to stronger competition, in addition
to leading to wasteful duplication of efforts, must break down at the point where the higher R&D
costs reduce overall profitability so much that R&D projects cease to be undertaken. In this
situation then, competition again has a negative effect on R&D spending. But this points to a non-
monotonic relationship between market structure and R&D, rather than the simple monotonic
relationship from the Schumpeterian hypotheses.

Finally, the evolutionary model dealing with the issue of technology regimes concludes that
differences in the knowledge base underlying different industries may lead to both differences in
market structure and R&D spending. Whether or not monopolistic market structures tend to go
with higher R&D expenditures depends on whether one considers innovative R&D or imitative
R&D. The latter tends to be higher in competitive markets, the former in monopolistic markets.
This conclusion that the relationship between market structure and R&D expenditures may differ
with the type of innovations is obviously in broad accordance with the result from the asymmetric
patent race model.

Overall, we are thus left with the picture that there is no obvious relation between market
structure and innovation. The causality of this relationship goes both ways: market structure
determines R&D expenditures and vice versa. Which combinations of market structure
(competitive or monopolistic) and R&D intensity (high or low) will result depends on
circumstances such as the size of the innovations or the characteristics of the knowledge base. In
terms of implications for the trade-off between static and dynamic efficiency, this means that there
is no simple and generally valid answer to the question whether such a trade-off is relevant.

Depending on specific circumstances, one may imagine cases where competition will enhance the
rate of innovation. In such cases, the trade-off is not relevant, and the argument in favour of
competition is stronger than ever. In other cases, however, monopoly power may enhance
innovation. In this case, policies solely aimed at increasing the degree of competition are
obviously not very efficient, and some degree of monopoly would be desirable. One may imagine
that in such a case, regulation can be used to restrict static welfare loss from monopoly. The
situation on the telecommunication markets in many Western-European countries is a case in
point. Here, regulatory boards are usually installed to monitor the behaviour of the firms that control
the networks.

References to the original works described in this chapter

Arrow, K. J. (1962). Economic Welfare and the Allocation of Resources for Invention. in: The
Rate and Direction of Inventive Activity: Economic and Social Factors. New York,
National Bureau of Economic Research: 609-625.
Lee, T. and L. L. Wilde (1980). "Market structure and innovation: a reformulation." Quarterly
Journal of Economics 94: 429-436.
Nelson, R. R. and S. G. Winter (1982). An Evolutionary Theory of Economic Change.
Cambridge, MA, Harvard University Press.
Reinganum, J. (1983). "Uncertain innovation and the persistence of monopoly." American
Economic Review 73: 741-748.
Von Tunzelmann, N. (1978). Steam Power and British Industrialization to 1860. Oxford,
Clarendon Press.
Winter, S. G. (1984). "Schumpeterian competition in alternative technological regimes." Journal
of Economic Behavior and Organization 5: 287-320.

Appendix - The equations of the simulation model (Winter, 1984)

This appendix documents the complete structure of the simulation model on technology regimes
presented by Winter, and discussed in the text above. The model is implemented as a computer
simulation, and can be obtained from http://www.tm.tue.nl/ecis/bart. Users of the computer
program may change parameter settings in the program. Because the program does not have
online help, this appendix is the only source of information on what the parameters mean. Note
that in the computer program, subscripts and superscripts as used in this appendix are not
implemented (yet). They appear as regular-sized lower-case characters.

1) Production Structure

The following equations determine the production module of the model:


$$Q_{it} = A_{it}\, K_{it},$$
$$\pi_{it} = P_t A_{it} - c - r^{n}_{it} - r^{m}_{it},$$
$$P_t = D\!\left(\sum_i Q_{it}\right),$$
$$X_{it} = \omega\, X_{i,t-1} + (1 - \omega)\, \pi_{it},$$
$$K_{i,t+1} = I\!\left(\frac{P_t A_{i,t+1}}{c},\; \frac{Q_{it}}{\sum_j Q_{jt}},\; \pi_{it},\; \delta\right) K_{it} + (1 - \delta)\, K_{it},$$

where Q is production, A is capital productivity, K is the capital stock (number of machines), δ
is the rate of depreciation of capital, π is the profit rate (profit per unit of capital), P is the price
of the produced good, c denotes production costs per unit of capital and can further be specified
as c = δ + ρ + v (with v denoting variable production costs and ρ denoting ‘normal' profits), r^n
represents R&D expenditures aimed at innovation per unit of capital, r^m is a similar variable for
imitative R&D, X measures the long-run performance of the firm, ω is a parameter (0 < ω < 1), I is the
investment function, the subscript i (j) indicates a firm, and the subscript t (t+1) represents a time
period.

The demand function specifies completely elastic demand up to a certain threshold price level, and
unit elasticity beyond this level:
$$P_t = P^{*} \quad \text{for } Q_t < Q^{*}, \qquad P_t = \frac{V}{Q_t} \quad \text{for } Q_t > Q^{*},$$

where Q_t = Σ_i Q_it denotes total industry output.

We want this demand function to approach the same value as Q_t approaches Q* from the left and from the right, and
this implies that we can set only two of the three values Q*, P* and V independently (continuity requires Q* = V/P*). The simulation
program allows the user to set P* and V, and it calculates the corresponding value of Q*.
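In code, this kinked demand function takes only a few lines. The sketch below is for illustration only (Python, not the language of the simulation program); P* and V take their default values from the parameter table in section 7, and Q* is computed from them.

```python
def market_price(total_output, p_star=1.2, v=64.0):
    """Kinked demand: price is flat at P* until total output reaches Q* = V / P*,
    and follows unit-elastic demand P = V / Q beyond that point."""
    q_star = v / p_star
    if total_output < q_star:
        return p_star
    return v / total_output

print(market_price(10.0))   # below Q* (about 53.3): price stays at 1.2
print(market_price(100.0))  # above Q*: price falls to V/Q = 0.64
```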

The investment function looks as follows:

$$m_{it} = \min\!\left[\frac{\eta + (1 - s_{it})\,\varphi}{\eta + (1 - s_{it})\,\varphi - s_{it}},\;\; 0.999\,\frac{(P^{*} - P_t)\,A_{i,t+1}}{2c}\right],$$
$$I^{D}_{it} = \delta + 1 - \frac{m_{it}\, c}{P_t\, A_{i,t+1}},$$
$$I^{F}_{it} = \delta + \pi_{it} \quad \text{for } \pi_{it} \leq 0, \qquad I^{F}_{it} = \delta + 2\pi_{it} \quad \text{for } \pi_{it} > 0,$$
$$I_{it} = \max\!\left[0,\; \min\!\left(I^{D}_{it},\; I^{F}_{it}\right)\right],$$

where m is the desired profit markup over production costs, η is price elasticity (this is set to
(approximately) one in all simulations, as implied in the demand equation), φ is a parameter, s is the market share of a firm,
I^D is desired investment and I^F is financeable investment.

2) R&D Policies and technological change

Innovation is a stochastic process:


$$p^{n}_{it} = a^{n}\, r^{n}_{it}\, K_{it}, \qquad p^{m}_{it} = a^{m}\, r^{m}_{it}\, K_{it},$$

where p^n and p^m are, respectively, the probabilities for an innovation or an imitation to occur, and
a^n and a^m are parameters that differ between the two regimes. If, under these probabilities, an
innovation or imitation takes place, an innovation or imitation draw is made, yielding the
following values:

$$A^{n}_{it} \sim LN(m_{LN}, s_{LN}), \qquad A^{m}_{it} = A_{pt},$$

where A^n and A^m are the results of innovative and imitative draws, respectively, LN represents the
lognormal distribution with mean m_LN and standard deviation s_LN, and the subscript p represents
the firm that is imitated. When a firm invents a new technique (innovation), this technique is
initially invisible for imitation. However, in each period, the probability that the technique is visible
increases by a value θ until it reaches 1. The default value of θ is 0.125, which implies that it takes
8 periods before the technique is visible for imitation with absolute certainty. When a firm
imitates, the firm to be imitated (p) is chosen with a probability that is proportional to that firm's
share in the total industry capital stock.

103
Adopting the convention that A^n and A^m are zero if no innovation or imitation is realized, capital
productivity evolves as follows:

$$A_{i,t+1} = \max\!\left(A_{it},\; A^{n}_{it},\; A^{m}_{it}\right).$$
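As an illustration of how this stochastic R&D process can be implemented, consider the following sketch of one period for a single firm. Python is used for illustration only; the capping of the probabilities at one and the reading of m_LN as the median of the lognormal draw are assumptions made for the sketch, not details taken from the model description.

```python
import numpy as np

rng = np.random.default_rng(123456789)

def rd_draws(A, K, r_n, r_m, a_n, a_m, m_ln, s_ln, rivals):
    """One period of innovative and imitative R&D for a single firm.

    A, K       : current capital productivity and capital stock
    r_n, r_m   : innovative and imitative R&D per unit of capital
    a_n, a_m   : regime-specific opportunity parameters
    m_ln, s_ln : location and spread of the innovation draw
    rivals     : list of (productivity, capital stock) of visible rival techniques
    """
    A_new = A
    # Innovation: with probability a_n * r_n * K (capped at 1), draw a new
    # productivity level from a lognormal distribution.  Numpy parameterizes
    # the lognormal by the mean and st.dev. of log(X); m_ln is treated here
    # as the median of the draw, which is one possible reading of the text.
    if rng.random() < min(1.0, a_n * r_n * K):
        A_new = max(A_new, rng.lognormal(np.log(m_ln), s_ln))
    # Imitation: with probability a_m * r_m * K (capped at 1), copy a rival
    # chosen with probability proportional to its share in industry capital.
    if rivals and rng.random() < min(1.0, a_m * r_m * K):
        caps = np.array([k for _, k in rivals], dtype=float)
        picked = rng.choice(len(rivals), p=caps / caps.sum())
        A_new = max(A_new, rivals[picked][0])
    return A_new

# Example: a firm with A = 0.15 and K = 25, facing one visible rival technique.
print(rd_draws(0.15, 25, 0.002, 0.005, 0.025, 2.5, 0.135, 0.12, [(0.16, 40.0)]))
```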

Management is assumed to feel an incentive to change R&D policies if the firm performs relatively
badly compared to the industry average. If long-run firm performance is at or above the industry's
average rate of return, R&D policies remain unchanged:

$$r^{n}_{i,t+1} = r^{n}_{it}, \qquad r^{m}_{i,t+1} = r^{m}_{it} \qquad \text{for } X_{it} \geq \bar{\pi}_t,$$

where the bar above a variable indicates a (capital-weighted) average. If, on the other hand,
X_it < π̄_t, i.e., long-run firm performance is ‘unsatisfactory', there is a probability h < 1 that the
firm will change its policy according to the following rule:

$$r^{n}_{i,t+1} = (1 - \lambda)\, r^{n}_{it} + \lambda\, \bar{r}^{n}_{t} + u^{n}, \quad u^{n} \sim N(0, s^{n}),$$
$$r^{m}_{i,t+1} = (1 - \lambda)\, r^{m}_{it} + \lambda\, \bar{r}^{m}_{t} + u^{m}, \quad u^{m} \sim N(0, s^{m}),$$

where λ is a parameter (0 < λ < 1), N represents the normal distribution (with mean zero), and the
standard deviations of the draws for innovative and imitative R&D policy changes are represented
by s^n and s^m, respectively. Once a firm changes its policy, it will allow a certain period to ‘prove'
the new policy. This is modeled by adding a value û to X.
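The satisficing policy rule can be summarized in a few lines of code (again only a sketch of the logic just described, in Python, and not the program's actual implementation; the parameter values are the defaults from the table in section 7):

```python
import numpy as np

rng = np.random.default_rng(7)

def update_policy(r_own, r_avg, X, pi_avg, lam=0.165, h=0.5, s=0.002):
    """Adjust one R&D policy (innovative or imitative) for one firm.

    The policy is left unchanged when long-run performance X is at least the
    industry-average profit rate; otherwise, with probability h, it moves
    towards the industry-average policy r_avg with weight lam, plus noise.
    """
    if X >= pi_avg:
        return r_own
    if rng.random() < h:
        return (1.0 - lam) * r_own + lam * r_avg + rng.normal(0.0, s)
    return r_own

# A firm doing badly (X below the average profit rate) will usually drift
# towards the industry-average innovative R&D policy of, say, 0.004.
print(update_policy(r_own=0.002, r_avg=0.004, X=-0.01, pi_avg=0.02))
```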

3) Entry and exit

A firm will exit when its capital stock falls below a certain level, or when its long-run
performance falls below a certain level:

$$K_{it} < K^{\min}, \qquad X_{it} < X^{\min},$$

where the superscript min indicates the critical levels for exit.

Entrants are assumed to appear according to a Poisson process with arrival rates equal to

$$N_t = a^{n} E^{n}, \qquad M_t = a^{m} E^{m},$$

where N and M are, respectively, the arrival rates of ‘innovative entrants' (i.e., firms entering with
a completely new technique) and imitative entrants (new firms imitating the technique of an
existing firm), and E^n and E^m are, respectively, external R&D expenditures for innovative and
imitative R&D (these are exogenous variables). Each of the entrants resulting from this process
makes a productivity draw denoted A^e, which yields a rate of return P_t A^e − c. The entrant will
actually enter if this rate of return is higher than a stochastic threshold value, hence entry will
occur if

$$P_t A^{e} - c > r^{e} + u^{e}, \qquad u^{e} \sim N(0, s^{e}).$$

The capital stock of an entrant is a stochastic variable drawn from the distribution

$$N(K^{*}, s^{k}),$$

truncated below at K^min. The entrant's R&D policies are chosen using the equations that govern
an existing firm's R&D policy change, with λ = 1. If these policy levels yield a negative rate of
return, the policies are set at the break-even level. X is initialized at û for an entrant.
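A sketch of the entry step might look as follows (Python, for illustration only; flooring the capital draw at K^min is a crude way of implementing the truncation, and with the default parameters the arrival rate is low, so the returned list will often be empty):

```python
import numpy as np

rng = np.random.default_rng(99)

def innovative_entrants(P, c, a_n, E_n, r_e, s_e, K_star, s_k, K_min, m_ln, s_ln):
    """Generate this period's innovative entrants.

    The number of candidate entrants is Poisson with arrival rate a_n * E_n;
    each candidate draws a productivity level and enters only if its rate of
    return P*A - c exceeds the stochastic threshold r_e + u_e.
    """
    entrants = []
    for _ in range(rng.poisson(a_n * E_n)):
        A_e = rng.lognormal(np.log(m_ln), s_ln)       # productivity draw (see earlier sketch)
        if P * A_e - c > r_e + rng.normal(0.0, s_e):  # entry test
            K_e = max(K_min, rng.normal(K_star, s_k)) # capital stock, floored at K_min
            entrants.append((A_e, K_e))
    return entrants

print(innovative_entrants(P=1.2, c=0.16, a_n=0.025, E_n=2.0, r_e=0.007,
                          s_e=0.014, K_star=25.0, s_k=7.5, K_min=10.0,
                          m_ln=0.135, s_ln=0.12))
```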

4) Differences between the two regimes

The two regimes are characterized by differences in the parameters a^n, E^n, m_LN and s_LN. With
regard to the parameters a^n and E^n, these always produce the same product, i.e., a^n E^n does not
differ between the two regimes. In the simulation program, this is implemented by fixing a value
for a^n and E^n under the entrepreneurial regime, as well as setting a factor Z that yields values for
a^n and E^n under the routinized regime according to:

$$a^{nR} = Z\, a^{nE}, \qquad E^{nR} = \frac{E^{nE}}{Z},$$

where the superscripts E and R denote the entrepreneurial and routinized regime, respectively.

Furthermore, under the entrepreneurial regime,

$$m_{LN} = L\,(1 + l)^{t},$$

where L is the initial level of ‘latent productivity' and l is its growth rate. Under the
entrepreneurial regime, this value is also used to draw innovative entrants.

Under the routinized regime,

$$m_{LN} = \frac{L\,(1 + l)^{t} + A_{it}}{2}.$$

In this regime, innovative draws for entrants are made by substituting a fixed base-level
productivity B (which is a parameter) for A_it in this equation.

Finally, s_LN is twice as high under the entrepreneurial regime as under the routinized regime (the
simulation program allows the user to set this value for the entrepreneurial regime).

5) Industry founders

The industry is started with one firm, which is attributed the following values: K = K^F, A = A^F,
r^n = R^{nF} and r^m = R^{mF}.

6) Program operation

When the program is started, parameter values can be set by altering the default values in the cells
on the program’s parameter form. The user may return to this form by pressing the PARAMETERS
button. On the parameter form, parameter sets can be saved to and loaded from file by pressing
the LOAD FILE and SAVE FILE buttons (respectively). The user may also select the regime to be
used by choosing the appropriate radio button at the bottom of the form.

Once the desired parameters have been set, the user may choose one of two simulation modes:
SINGLE or BATCH. The first launches a single run and displays ‘live’ graphs documenting the
history of the industry. These graphs are the same ones that have been used in the text above
when discussing the sample run. The BATCH simulation mode repeats a large number of runs for
both regimes and compares the outcomes using statistical tests. The graph that is produced in this
simulation mode is also discussed in the text.

The BITMAP button may be used to copy the program's window to the Windows clipboard. From
there, it may be pasted into a wide variety of other programs, such as word processors or drawing
programs. The EXIT button closes the program.

The simulations in the program will make use of so-called pseudo-random numbers. These are
the computer representation of stochastic variables, and are hence used to represent the various
stochastic variables described above. The realization of these random variables may be controlled
by the RandSeed variable. Setting this variable (on the parameter form) to a specific value will
lead to the same sequence of random numbers being drawn. This is similar to a device that would
enable a gambler to throw the same sequence of dice values over and over again. In practical terms,
it means that by setting the same value for RandSeed, the user will be able to reproduce the
(batch of) same simulation run(s) over and over again. Changing the value of RandSeed without
changing the value of other parameters will change the outcome of the simulation as a result of
random (and hence non-systematic) factors only, i.e., will leave the ‘environment’ identical.
Changing other parameters than RandSeed will, naturally, also change the simulation outcome,
but in this case by means of changes in the environment.

7) Summary of parameters of the model

Parameter | Program notation | Explanation | Default value | Useful interval
Random Seed | RandSeed | determines stochastic draws | 123456789 | any positive integer
ω | ω | impact of the short run on performance | 0.25 | [0..1]
V | V | total expenditures in the market | 64 | [50..100]
P* | P* | threshold price level for elasticity | 1.2 | [1..2]
r^e | re | entry barrier | 0.007 | [0..0.02]
η | η | price elasticity | 1.001 | [½..4], but not exactly equal to 1
φ | φ | sensitivity of investment | 2 | [1.5..2.5]
δ | δ | capital depreciation rate | 0.03 | [0..0.1]
ρ | ρ | normal profit rate | 0.015 | [0.005..0.05]
v | v | variable production costs | 0.115 | [0.08..0.16]
K^min | Kmin | exit level of the capital stock | 10 | [5..15]
X^min | Xmin | exit level of performance | -0.025 | [-0.05..0.025]
λ | λ | persistence of the old R&D policy | 0.165 | [0..1]
s^m | sm | standard deviation of random policy change, imitative R&D | 0.0004 | [0..0.02]
s^n | sn | standard deviation of random policy change, innovative R&D | 0.002 | [0..0.02]
h | h | probability of an R&D policy change | 0.5 | [0..1]
û | û | initialization of long-run performance after a policy change | 0.025 | [0..0.05]
a^m | am | innovation opportunity, imitative R&D (entrepreneurial regime) | 2.5 | [1.5..3.5]
a^n | an | innovation opportunity, innovative R&D (entrepreneurial regime) | 0.025 | [0.015..0.035]
E^m | Em | external imitative R&D (entrepreneurial regime) | 0.2 | [0.1..0.3]
E^n | En | external innovative R&D (entrepreneurial regime) | 2 | [1..3]
θ | θ | vulnerability to imitation | 0.125 | [0..1]
L | L | start level of latent productivity | 0.135 | [0.05..0.2]
l | l | growth rate of latent productivity | 0.04 | [0..0.1]
s_LN | sLN | standard deviation of innovative draws (entrepreneurial regime) | 0.12 | [0..0.25]
s^e | se | standard deviation of the entry disturbance term | 0.014 | [0..0.025]
K* | K* | mean capital stock of entrants | 25 | [15..35]
s^k | sk | standard deviation of the capital stock of entrants | 7.5 | [0..15]
B | B | base level productivity (routinized regime) | 0.1333 | [0.1..0.3]
K^F | KF | capital stock of the founder | 25 | [15..35]
A^F | AF | productivity of the founder | 0.15 | [0.01..0.2]
R^nF | RnF | innovative R&D policy of the founder | 0.002 | [0..0.01]
R^mF | RmF | imitative R&D policy of the founder | 0.005 | [0..0.01]
Z | Z | scale factor between regimes | 10 | [1..20]

Chapter 5.
Diffusion, Competition between Technologies, and
Networks
5.1. Introduction

The economic models of technological change that have been considered so far (chapters 2 and
4) have been concerned with the generation of innovations. However, as was seen in chapters 1
and 3, only when a technology spreads through the economy (the diffusion phase) can its full
impact be felt. In chapter 1, it was argued that the spread of technologies cannot be seen as a
process in which the original innovation remains unchanged. Instead, incremental innovations
greatly improve the potential for an innovation to spread. The conclusion was that incremental
innovation and diffusion are hard to separate.

In this chapter we will focus on the explanation of diffusion patterns of technologies. In many
cases we will, for analytical convenience, assume that the underlying technology for which the
diffusion process is modeled does not change. This assumption is obviously at odds with the
above conclusion, but we will see that it greatly facilitates the mathematical analysis. As a
consequence, however, the models presented here cannot be taken too literally. In all cases, we will
have to ask the question how the conclusions may be affected by a more realistic assumption
about the role of incremental innovations.

What are the phenomena that are relevant in the study of the diffusion of an innovation? In
chapter 3, we have seen many cases of diffusion processes that take a long time. The basic
innovations that were considered in the case of the Schumpeterian long wave theory typically take
off slowly, but once they pass a certain threshold level, the diffusion process is greatly speeded
up. It was argued that this phase of rapid diffusion of a cluster of basic innovations is associated
with high economic growth (the upswing of the long wave). However, after a while the diffusion
process slows down, mainly as a result of consumer satiation and a depletion of technological
opportunities of the basic underlying technology.

In fact, such a pattern of slow initial growth, then rapid diffusion and eventually saturation, is
common for most historical cases of diffusion. Although not all diffusion processes take as long
as the basic innovations that underlie a long wave, one may speak of a typical shape for diffusion
processes that looks like the historical cases from chapter 3. The case of the railroad will be used
as an example here. The basic system of railroads was first used in 16th century European mining,
but it was not until the early 19th century that they were put to wider use. The same Richard
Trevithick whom we met in chapter 4 when we considered the development of the high pressure
steam engine, built a steam locomotive in 1803. This locomotive was put to work in an iron
works. Although Trevithick continued to develop his locomotive, he was never able to put it to
work in a public railroad, mainly because the brittle cast-iron rails available to him were unable
to carry the weight of the carriages.

In 1825, George Stephenson successfully solved this problem, and built the Stockton and
Darlington Railroad, which was the first in the world to carry both passengers and freight. From
that point onwards, railroads began to spread throughout the British economy, and later the same
process took place in other parts of the world.

Figure 5.1 describes how this diffusion pattern was taking place in the United Kingdom and the
United States. The indicator given is the length of the railway network in both countries. Both
data series have been standardized by dividing all observations by the maximum observed value
(original data are in kilometers or miles)21. Although one may imagine more sophisticated
indicators (such as passenger kilometers or the weight of goods carried), the data in the figure
provide a good illustration of the diffusion of railways.

Figure 5.1. The diffusion of railways in the United Kingdom and the United States (length of railway network, standardized). Source: Gruebler.

The points in the figure represent the actual data, the lines represent curves estimated by statistical
techniques. The pattern found for both series is the typical S-shaped or sigmoid pattern that is
found in virtually all diffusion studies. Diffusion takes off slowly, but grows exponentially for a
while. Then an inflection point is met, and diffusion levels off towards the saturation level.

Obviously, the estimated lines do not match the data exactly, but the fit is close enough to warrant
the study of diffusion as a sigmoid pattern. Note also that although the general shape is the same
for the two countries, there are important differences between the two curves. For example, the
United Kingdom starts somewhat earlier than the United States, and also has a somewhat more
rapid take-off. As a result, the estimated S-curve has a somewhat steeper slope. When we study
the diffusion of individual technologies below, our aim will be to explain such differences in the
diffusion process between countries, sectors, or technologies. The models that aim to explain
these differences for the case of a single technology are the subject of section 2 of this chapter.

21 The data are taken from the book by Gruebler referenced at the end of this chapter.

In many cases, however, the diffusion of a specific technology does not take place in isolation of
other, competing or complementary, technologies. In the case of the first railways, there was
severe competition with other means of transport, e.g. boats or stage coaches. Nowadays,
railways have to compete with cars and airplanes. Such competition (or complementarity) from
other technologies may obviously have an effect on the speed of diffusion. We will consider
another case from the historical description of chapter 3: process innovations in steel-making.

As was seen in chapter 1, in the middle of the 19th century, two processes were invented that
made steel-making a lot cheaper and more efficient than the common method of crucible melting.
These were the open hearth process and the Bessemer process. In the early 20th century, the
availability of electric power led to the process of making steel by the electric-arc method. This
process uses electricity to melt steel scrap (whereas the previous two processes would use iron
ore and carbon as the primary inputs) and produce liquid steel. Finally, after the second World
War, the oxygen process was invented in Austria. This was essentially a further development from
the Bessemer process, and worked by blowing oxygen into a vessel similar to the Bessemer
converter. The oxygen process in its modern form takes a mixture of 75% liquid iron and 25%
steel scrap as its primary input.

Figure 5.2 shows the historical development of these processes in the United States. The Figure
displays on the vertical axis a measure for the relative importance of the technology in total
production (measured in tons of steel). The indicator starts from the variable f, which is the share
of a specific technology in total production. Then for each technology i, we calculate the variable
fi/(1-fi), which is the share of technology i relative to the share of all other technologies. This
variable, which is displayed on a log scale, will be larger than one for technologies that dominate
total production absolutely (i.e., have a share larger than ½).

The figure clearly shows that together, the open hearth process and the Bessemer process took
over from crucible melting after 1850. Of these two new processes, the open hearth process
achieved the largest share. The Bessemer process reached its peak around 1880, and then declined
over the period of almost a full century. The open hearth process continued to increase its share
well into the 20th century, however. Similarly, the electric-arc and oxygen processes increased
their shares after they had been introduced, at the expense of the existing technologies. The speed
at which this substitution occurred is relatively slow, however.

It has been shown by Nakicenovic, Gruebler and Marchetti that the shape found in Figure 5.2. is
typical for many processes of technological substitution. What is typical are the almost straight
segments of the lines when the technology expands or declines. Note that the log scale on the
vertical axis implies that such straight lines in fact correspond to phases of exponential increase
or decrease, and hence bear similarity to the single technology diffusion curves in Figure 5.1. In
their initial phase, these also show exponential growth, while the phase following the inflection
points in Figure 5.1. corresponds to the turn of the curves in Figure 5.2. Thus, for the case of
multiple competing or complementary technologies, the S-curve remains relevant, although the
observed patterns are somewhat more complex. The study of the diffusion pattern of competing
technologies will be the subject of section 3 of this chapter.

Figure 5.2. Diffusion of competing technologies, the case of steelmaking in the United States (the indicator f/(1−f) for each process, plotted on a logarithmic scale). Source: Nakicenovic.
steelmaking in the United States. Source: Nakicenovic.

A specific case of the diffusion of competing technologies that is different from the general model
considered in section 3 is laid out in section 4, which considers technologies characterized by so-
called network externalities. Network externalities have been known at least since the first
railways and telephones, but have gained importance in the Information Age. How network
externalities may be defined, how they affect the diffusion of competing technologies and how
they enhance the role of standards will be considered in sections 4 and 5.

5.2. Diffusion models for one innovation

If we think about diffusion in the context of the strong rationality that economists often attribute
to firms or consumers (profit or utility maximization), the fact that diffusion is not immediate
raises questions. If all firms are perfectly rational, why would one firm wait to introduce an
innovation, while a different firm adopts immediately? Seen in this way, the relevant question then
becomes one of modeling the adoption process of an individual firm.

The answer to the above question may take two different forms. The first is that information is
incomplete, and reaches some firms or consumers earlier than others. This is the so-called
epidemic diffusion model. The second answer is that firms or consumers differ with regard to
certain characteristics (e.g., firm size or consumer income). It may then make sense for firms
or consumers with certain characteristics (e.g., large firms or rich consumers) to adopt early,
while at the same time it makes sense to wait a bit longer for others. Such explanations are the
subject of analysis of the probit diffusion model. The remainder of this section will consider these
two models in detail.

5.2.1. The epidemic model

The epidemic model is concerned with the spread of information and the role of incomplete
information in the diffusion process. The model, first proposed by Edwin Mansfield, is called the
epidemic model because its main structure is taken from the study of the spread of an infectious
disease (epidemic). The diffusion of an innovation is seen as analogous to the spread of such a
disease. The main idea is that adoption of an innovation by a firm or a consumer is immediate
once the firm or consumer learns about the innovation, but cannot take place without such
knowledge. Learning about an innovation takes place through contact with others who have
already heard about the innovation.

Specifically, the model assumes a constant population of potential adopters of the technology.
Denote the number of individuals in this population by N*. Denote also the number of individuals
who have already adopted the technology at time t by Nt. Hence we can denote the number of
individuals who have not yet adopted the technology at time t by N* − N_t. Also define n_t ≡ N_t/N*.

Now we assume that the information about the innovation spreads like a contagious disease.
Hence for any individual who has not yet ‘caught the disease’ (adopted the innovation), the
probability of doing so at time t is equal to the probability of meeting an individual who has
already adopted the innovation (caught the disease) multiplied by the probability that the
information (disease) is transferred during such a contact. We assume that given a contact with
a different individual, the probability that the ‘other’ individual has already adopted the innovation
is equal to the share of adopters in the total population, i.e., nt. This means that contacts are
drawn from a representative sample of the current population, and are not biased against either
adopters or non-adopters. We also specify a parameter g which measures the frequency of
contacts between individuals in the population and the probability that the information is spread
during a contact (the latter is analogous to the infectiousness of the disease).

Thus, for any individual who has not yet adopted the innovation, the probability of doing so at
time t is equal to g n_t. Because there are N* − N_t individuals who have not yet adopted the
innovation, we may write the (expected) number of new adopters at time t as

$$\dot{N}_t = g\, n_t\, (N^{*} - N_t). \qquad (4.1)$$

where the dot above the variable denotes a time derivative (i.e., $\dot{N} \equiv dN/dt$). Because N* is
constant over time, we may write $\dot{n} = d(N/N^{*})/dt = \dot{N}/N^{*}$, and substituting this into
equation (1), we obtain

$$\dot{n}_t = g\, n_t\, (1 - n_t). \qquad (4.2)$$

This differential equation may be solved to obtain the diffusion path. In the case when g is a
constant, i.e., the frequency of contacts and the infectiousness are independent of time, the
solution is

$$n_t = \frac{1}{1 + e^{-(c + g\,t)}}, \qquad (4.3)$$

where c is a constant of integration. Equation (3) describes a sigmoid curve with saturation level
1 (when all potential adopters have adopted). Two parameters, c and g determine the shape of
the curve: c positions the curve on the time axis (and hence fixes the point at which t=0), while
g determines the speed of diffusion (steepness of the curve). The curve is symmetric around the
point of inflection, which can be calculated to occur at t = -c/g. The inflection point thus
corresponds to the point in time when half the population has adopted the innovation.
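The logistic curve of equation (3) is easy to compute and inspect. The following sketch (Python, with arbitrary illustrative values for c and g) traces the curve and verifies numerically that the fastest diffusion occurs around t = −c/g, where half the population has adopted:

```python
import numpy as np

def logistic(t, c, g):
    """Epidemic diffusion path of equation (3): share of adopters at time t."""
    return 1.0 / (1.0 + np.exp(-(c + g * t)))

c, g = -6.0, 0.5                      # arbitrary illustrative values
t = np.linspace(0.0, 25.0, 251)
n = logistic(t, c, g)

growth = np.diff(n)                   # per-period increase in the adopter share
t_inflection = t[np.argmax(growth)]   # fastest diffusion ...
print(t_inflection, -c / g)           # ... occurs close to t = -c/g = 12
print(logistic(-c / g, c, g))         # exactly half the population has adopted there: 0.5
```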

One may also consider a slightly more general case of the epidemic model, where g depends on
certain variables that may differ between sectors, countries or innovations. As long as g remains
independent of time (which is a basic assumption of the model), we may still solve the model in
the same way, but then the speed of the diffusion process becomes a function of sectoral or
country-circumstances, or differs per innovation. It has been suggested, for example, that the
overall profitability of the innovation, the capital requirement to introduce the innovation and the
perceived risk associated with the introduction of the innovation may all affect the rate of diffusion
(the first factor assumedly in a positive way, the last two in a negative way). One may also
imagine that a variable such as population density would affect the speed of diffusion positively.
In this way, the epidemic model offers an explanation for the observed differences between
diffusion processes for individual cases, such as the example in Figure 5.1.

Concluding, we may say that the epidemic model clearly predicts an S-shaped diffusion pattern,
and is able to account for observed differences in the exact shape of the curve. Hence it seems to
be able to explain at least the basic empirical facts about the diffusion of innovations. However,
economists are not very happy with this simple model, for the reason that it does not explain in
any economic sense the decision to adopt. Rather than assuming that adoption is something that
is a chance variable resulting from contacts between ‘infected’ and ‘non-infected’ individuals (as
the epidemic model does), economic theory wants to explain adoption as a decision that results
from a process of profit- or utility maximization. In this way, one may bring in economic factors
at the heart of the model, rather than in the form of an exogenous parameter such as g. The probit
model is an example of such a more sophisticated theoretical framework.

5.2.2. The probit model

The central idea behind the probit model, which was first proposed by Paul David, is that the
decision to adopt depends on some kind of stimulus variable exceeding a critical threshold value.
For example, adoption of an innovation may only be profitable for firms that exceed a certain
threshold size. If firms do not have equal sizes, and the threshold value for adoption varies over
time, diffusion of the innovation will eventually become complete, but not be instantaneous.
However, in order for the diffusion pattern to be sigmoid, both the movement of the stimulus
threshold over time and the distribution of the stimulus variable over potential adopters must
adhere to certain restrictions. This will become clear when we set out the model in graphical
terms.

Suppose X represents the stimulus variable, f(X) is the relative density function of X over potential
adopters, and X*_t is the critical value of X at time t (the value that has to be exceeded for adoption).
individuals (frequency) for each possible value of X. The relative density function f is obtained by
dividing each frequency by the total number of members in the population (in mathematical terms,
we scale the surface below the density curve to 1). The top panel of Figure 5.3 shows the relative
density function. On the vertical axis, we find the (relative) number of individuals in the
population with corresponding values of the stimulus variable on the horizontal axis. The figure
thus shows that there are relatively few individuals with small values of the stimulus variable,
relatively few with large values, and a relatively high number with intermediate values. The resulting bell shape is
typical of the distribution we need in order to arrive at a sigmoid diffusion curve.

Figure 5.3. The probit diffusion model

Now suppose that initially, the critical value of the stimulus variable is at X*_1. Remember that an individual's
value of the stimulus variable needs to exceed this critical value for adoption to take place. This implies that all individuals with
values of the stimulus variable larger than X*_1 will have adopted at t=1. The (relative) number of such
individuals is measured by the surface under the curve to the right of X*_1. We may display this
value in the bottom panel, which gives the number of adopters as a function of time. Now consider
the cases for X*_2, X*_3 and X*_4. Moving from X*_1 to X*_2 (i.e., from t=1 to t=2), the number of
new adopters is measured by the surface below the curve between X*_1 and X*_2. This is again
plotted in the bottom panel. This procedure is repeated for t=3 and t=4. Note that by the nature
of the bell-shaped distribution curve, the number of new adopters rises until t=3, but falls after
the peak of the bell is passed. This causes the diffusion curve to level off eventually (the inflection
point of the S-curve corresponds to the peak of the bell).

How can this be represented in terms of equations? It is obvious from the figure that the diffusion
curve can be obtained by integrating the relative density function from the critical value of the
stimulus variable upwards. Thus, the relative number of adopters at time t (as before denoted by n) will be
equal to

$$n_t = \int_{X^{*}_{t}}^{\infty} f(X_i)\, dX_i. \qquad (4.4)$$

Suppose X measures the production level of a firm that is considering buying a process innovation
embodied in a machine. The process innovation will reduce the labour input used for a single unit of
output by l. A single machine can produce all the output of a single firm, and costs C (this includes
all costs and is discounted in the usual way). The wage rate is w. The firm will save money on
labour inputs equal to Xwl, and total benefits from the innovation are equal to Xwl-C. Hence firms
with size X<C/(wl) will not adopt the innovation, while the reverse is true for firms with X>C/(wl).
It is clear that the production scale can be seen as the stimulus variable to which the firm’s
decision to adopt the innovation or not will respond.

Further assume that due to incremental improvement of the basic innovation design, the labour saving
resulting from the innovation will grow at a fixed rate, i.e., $\dot{l} = \lambda l$. Then, applying the above
logic, the critical value of firm size for which the innovation will be adopted will evolve as

$$X^{*}_{t} = \frac{C}{w\, l_{t}}, \qquad \dot{X}^{*}_{t} = -\lambda X^{*}_{t}. \qquad (4.5)$$

From equation (4), we may readily derive by differentiation:22

$$\frac{dn_t}{dX^{*}_{t}} = -f(X^{*}_{t}). \qquad (4.6)$$

22 To obtain this result, first write out the integral in equation (4). This gives F(∞) − F(X*_t), where
F is the cumulative density function. Note that by the nature of the probability density function, F(∞) = 1.
Differentiating then obtains equation (6).

Combining this with equation (5), we obtain

$$\dot{n}_t = \frac{dn_t}{dX^{*}_{t}}\, \dot{X}^{*}_{t} = \lambda\, X^{*}_{t}\, f(X^{*}_{t}). \qquad (4.7)$$

Using the substitution method for integration, we notice that equation (7) may be solved for the
number of adopters up to time T as follows:

$$n_T = \int_{0}^{T} \dot{n}_t\, dt = -F(X^{*}_{T}) + c_0, \qquad (4.8)$$

where c0 is a constant of integration, and F(X) is the cumulative density function. The cumulative
density function is a primitive function of the relative density function, and gives the relative
number of individuals with values between 0 and X. (Note that, by definition, F(0) = 0, something
we have used in the derivation of equation 8.) Note also that from equation (5), we may easily
obtain

$$X^{*}_{t} = e^{-\lambda t + c_1}, \qquad (4.9)$$

where c1 is a (different) constant of integration. Let us set these constants to zero without loss
of generality. We will now have to make an assumption about the specific mathematical form of
the distribution functions f and F. Suppose these are the lognormal distributions, which means
that ln(X) is distributed normally. The lognormal distribution has a shape comparable to the top
panel of Figure 5.3. Since ln(X*_t) = −λt declines linearly with time, F(X*_T) is a (reversed) cumulative
normal in T, and the diffusion path given by equation (8) therefore corresponds to the cumulative normal
distribution function of time, which is indeed a sigmoid curve, just as we derived in the bottom panel of Figure
5.3. A higher growth rate λ of the labour saving generated by the innovation will speed up the diffusion
process.
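To see these mechanics at work, the diffusion path implied by the probit model can be traced numerically: take a lognormal distribution of the stimulus variable, let the critical value fall at the rate λ, and record in each period the share of the population above the critical value. The sketch below (Python; all parameter values are arbitrary illustrations) produces exactly the S-shaped path derived above.

```python
import numpy as np
from scipy.stats import lognorm

# Lognormal distribution of the stimulus variable over the population:
# ln(X) is normal with mean mu and standard deviation sigma.
mu, sigma = 0.0, 1.0
population = lognorm(s=sigma, scale=np.exp(mu))

lam = 0.1                                # growth rate of the labour saving
t = np.arange(0, 81)
x_crit = 20.0 * np.exp(-lam * t)         # falling critical value X*_t (the factor 20
                                         # plays the role of exp(c1) in equation (9))

n = population.sf(x_crit)                # share with X > X*_t, i.e. 1 - F(X*_t)

print(np.round(n[::20], 3))              # rises from almost 0 to almost 1: an S-shaped path
```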

The conclusion from the probit model is thus that this model is able to generate sigmoid diffusion
curves, and that the speed of diffusion can be made dependent on economic factors. We have used
the rate of incremental improvement of the innovation as one example of such a factor. Obviously,
this was done with the aim of applying the probit model to our earlier argument about the
interrelationship between incremental innovation and diffusion. Hence we were able to show that
the probit model is a natural candidate for explaining this interrelationship. The version of the
model that we have used here is still rather crude, because it assumes that labour savings from the

innovation grow at a constant exogenous rate. One may easily imagine a version of the model,
however, in which labour savings are a function of learning, and could be approximated by the
cumulated level of production in the industry. In this way, incremental innovation would become
an endogenous factor in the model. Because of the mathematical complexities involved with this,
we will not pursue this any further here.

Finally, it must be noted that the probit model is quite flexible in terms of the factors that it
assumes are relevant for the diffusion process. In the example that we have given here, the reader
may easily verify that the same result could have been derived by assuming that the wage rate
grows at a fixed rate (instead of labour savings growing as we have assumed). A variety of
different settings (also for consumer markets) may also be imagined under which the general set
up proposed here can also be interpreted. What is crucial, however, is that in order to arrive at
a sigmoid diffusion curve, we need to assume a bell-shaped distribution of the stimulus variable
over the population, as well as some kind of steady movement over time of the critical value of the
stimulus variable. Without these assumptions, sigmoid diffusion curves will not result.23

5.3. Diffusion of competing technologies

The diffusion models that we have considered so far have not considered competition between
multiple (new) technologies. It has already been argued in chapter 1 that the succession of
technological paradigms may be seen as an example of such a competitive process. However, if
the model that we are going to present has to bear any relevance to the case of technological
paradigms, it will have to be able to deal with strong uncertainty and bounded rationality. Thus,
rather than continuing to work in the tradition of perfect rationality and no, or only weak,
uncertainty, we will once again resort to evolutionary economic models. As in the Nelson and
Winter model considered in the previous chapter, we will let the process of market selection do
the work.

The specific model that will be considered was developed originally by Gerald Silverberg. The
model assumes that a technology is characterized by two parameters: its labour productivity
(denoted by a) and the capital output ratio (denoted by c). Using Q for output, L for labour input
and K for the capital stock, we write
$$a = \frac{Q}{L}, \qquad c = \frac{K}{Q}. \qquad (4.10)$$

We further assume that total labour supply (denoted by N)24 grows at a fixed rate equal to β:

$$N_t = N_0\, e^{\beta t}. \qquad (4.11)$$

23 As an exercise, it might be useful for the reader to derive Figure 5.3 again using a non-bell-shaped
distribution in the top panel. A uniform distribution (i.e., a horizontal line in the top panel) is suggested for
this exercise.

24 Although it is denoted by the same symbol, this variable obviously bears no relation at all to the
variable N in the previous section.

The economy always produces at the capacity allowed by the capital stock, and labour demand
is adjusted to this level. This means that in the case when only one technology is present we may write

$$Q = \frac{K}{c}, \qquad L = \frac{Q}{a} = \frac{K}{ac}. \qquad (4.12)$$

Now define the rate of labour employed in the economy (the employment rate) as:

$$v = \frac{L}{N} = \frac{K}{N\, a\, c}. \qquad (4.13)$$

Labour is paid a real wage rate equal to w, and we assume that the change of this rate depends
on the aggregate employment rate in the economy:
$$\frac{\dot{w}}{w} = -m + n\, v. \qquad (4.14)$$

The dot above the variable indicates a time derivative, so the lefthand side of this equation gives
the proportionate growth of the real wage rate. m and n are parameters. The equation thus says
that when unemployment is low (i.e., v > m/n), wages will rise, while the reverse holds if
unemployment is high.

Finally, we assume that firms re-invest all their profits into capital, and that capital depreciates at
a fixed rate δ (the latter does not differ between technologies). Using the second part of equation
(12), we may write the following equation for the growth rate of the capital stock:

$$\frac{\dot{K}}{K} = \frac{Q - wL}{K} - \delta = \frac{1}{c}\left(1 - \frac{w}{a}\right) - \delta. \qquad (4.15)$$

Now differentiate equation (13) with respect to time and combine the result with equation (15) to obtain the growth rate of the employment rate:

\frac{\dot{v}}{v} = \frac{\dot{K}}{K} - \lambda = \frac{1}{c}\left(1 - \frac{w}{a}\right) - (\lambda + \delta).    (4.16)

Equations (14) and (16) form a coupled system of differential equations describing the time path
of the real wage rate and the employment rate for an economy employing only one technology.
Such a system is known as a Lotka-Volterra or predator-prey system of equations.25 Applying the definitions A ≡ 1/c - (λ + δ) and B ≡ 1/(ac), we may write the system as follows:

25. The system was originally proposed to model populations of predator and prey animals. In the economic analogy used here, wages play the role of the predator, employment that of the prey.

\frac{\dot{w}}{w} = -m + n v, \qquad \frac{\dot{v}}{v} = A - B w.    (4.17)

The solution to the system, i.e., the resulting time path, is represented by a so-called closed orbit
around the stationary values. The latter are defined as the values at which there is no growth of
the two variables (real wage rate and employment rate). This point can be found as follows:26
\frac{\dot{w}}{w} = 0 \;\Leftrightarrow\; v^{*} = \frac{m}{n}, \qquad \frac{\dot{v}}{v} = 0 \;\Leftrightarrow\; w^{*} = a - (\lambda + \delta)\, a c.    (4.18)
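The second part of (18) follows directly from the definitions of A and B introduced above; written out:

w^{*} = \frac{A}{B} = \left( \frac{1}{c} - (\lambda + \delta) \right) a c = a - (\lambda + \delta)\, a c .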

Figure 5.4. Phase diagram of the Silverberg model; the model fluctuates in a closed orbit (the circle) around the stationary point

If the system starts exactly at this point and there is no external perturbation, it will stay there; but when the system is at any point other than the stationary values, it will keep moving along a closed orbit around this equilibrium. Such a closed orbit results in cycles for the real wage and employment rates, and is depicted in Figure 5.4. The circle describes the movement of real wages and the employment rate over time. The middle of the circle corresponds to the stationary point. Note that if we had started with wage and employment rates closer to the stationary point, the circle would have had a smaller diameter.
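For readers who want to see the orbit emerge numerically, the following sketch integrates system (17) with a simple Euler scheme; the parameter values are made up for illustration and are not taken from the text:

# Euler integration of the one-technology system (17).
# All parameter values are illustrative.
m, n = 0.05, 0.1            # wage equation parameters
a, c = 1.0, 2.0             # labour productivity, capital-output ratio
lam, delta = 0.01, 0.05     # population growth, depreciation

A = 1.0 / c - (lam + delta)
B = 1.0 / (a * c)
w_star, v_star = A / B, m / n          # stationary point, cf. (18)

w, v = 0.9 * w_star, 0.9 * v_star      # start away from the stationary point
dt = 0.01
for step in range(5001):
    if step % 500 == 0:
        print(f"t={step * dt:5.1f}  w={w:.3f}  v={v:.3f}")
    w, v = w + dt * w * (-m + n * v), v + dt * v * (A - B * w)

Plotting the printed (w, v) pairs against each other reproduces the closed orbit of Figure 5.4; starting closer to the stationary point shrinks the orbit.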

With this simple model, we are able to address the question of how the economy will change when a new technology is introduced that has different values for the parameters a and c. We denote the old technology by a subscript 1 and the new technology by a subscript 2, and define vi = Li / N, with i = 1, 2. The wage rate is the same for both technologies, but we assume that the profits resulting from a technology are re-invested only in that same technology. It may then easily be shown that instead of the system of equations (17), we may now write the new system

26. The point w = v = 0 is also a stationary point, but this represents a trivial case.

\frac{\dot{w}}{w} = -m + n (v_1 + v_2), \qquad \frac{\dot{v}_1}{v_1} = A_1 - B_1 w, \qquad \frac{\dot{v}_2}{v_2} = A_2 - B_2 w.    (4.19)

A1, A2, B1 and B2 are defined analogously to the previous definitions of A and B, using the parameters of the respective technology. We now ask what happens if we start from a point near the stationary state of the economy with only the old technology, and introduce a (very) small amount of capital employing the new technology. Under which circumstances will the new technology diffuse through the economy, and at what level will it eventually saturate? This is essentially a question about the process of competition between the new and the old technologies.

Silverberg was able to show that the answers to these questions depend on some of the parameter
values. More specifically, he showed that if the following condition is satisfied, the new
technology will take off and eventually take over completely from the old technology:27
\frac{A_2}{B_2} > \frac{A_1}{B_1}.    (4.20)

Recalling the definitions of A and B, one may define T ≡ A/B = a(1 - c(λ + δ)). It is clear from inequality (20) that in order for the new technology to be viable, the ratio T must increase from the old to the new technology. Under what circumstances will this be true?

If the new technology uses less labour (i.e., has a higher value of a) and uses less capital (i.e., has
a lower value of c), it is straightforward to show that inequality (20) will always hold. This is the
case where the new technology is superior to the old one in all respects. Hence it is only natural
that there are no obstacles to its long-run diffusion through the economy.

A more interesting situation emerges when the new technology has superior labour productivity,
but this comes at the price of higher capital intensity, or when it has superior capital productivity,
but this comes at the price of lower labour productivity. In either one of these cases, the outcome
of the battle between the two technologies will depend on the two parameters λ and δ, i.e., on population growth and capital depreciation.

To see this, consider the definition of T, which shows that a higher value of either λ or δ lowers the value of T. Now suppose a2 > a1 but also c2 > c1.

27. For the mathematically interested reader: this result may be obtained by calculating the Jacobian matrix of the system (19) and evaluating its eigenvalues at the point v1 = m/n, v2 = 0, w = A1/B1. If at least one eigenvalue with a positive real part is found, the original equilibrium of the system (the one with only technology 1 present) is unstable, and hence technology 2 will take off once endowed with a positive amount of capital.

This implies that there is a range of relatively small values of c2 for which a take-off of technology 2 will occur (given values for c1, a1 and a2), but there is also a range of higher values of c2 for which this will not occur (T falls). The range for which a take-off occurs is smaller for higher λ and/or δ. In other words, a higher rate of population growth or capital depreciation (both of which are assumed to be independent of technological change) will have a retarding effect on the probability of such a take-off. A similar effect may be shown for the opposite case, c2 < c1 but also a2 < a1.
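A small numerical check (with purely illustrative parameter values) makes this concrete for the case a2 > a1 and c2 > c1: the same pair of technologies satisfies the take-off condition at a low depreciation rate and violates it at a high one.

# Checking the take-off condition (20) via T = a * (1 - c * (lam + delta)).
# Technology 2 is 15% more labour productive but has a higher capital-output
# ratio. All parameter values are illustrative.
def T(a, c, lam, delta):
    return a * (1.0 - c * (lam + delta))

a1, c1 = 1.00, 2.0
a2, c2 = 1.15, 3.0

for lam, delta in [(0.01, 0.03), (0.01, 0.10)]:
    T1, T2 = T(a1, c1, lam, delta), T(a2, c2, lam, delta)
    verdict = "take-off" if T2 > T1 else "no take-off"
    print(f"lam={lam}, delta={delta}:  T1={T1:.3f}  T2={T2:.3f}  -> {verdict}")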

The economic rationale behind this finding is quite intuitive. What a higher population growth rate does is to ease pressure on the labour market, and hence to bring down the rate of growth of real wages. If the new technology is labour saving (but capital deepening), it will be more attractive in situations where labour is expensive than where it is cheap. Hence a slower growth rate of real wages (due to faster population growth) will tend to favour the old technology. On the other hand, what a higher rate of depreciation does is to make capital more expensive (one needs more investment to maintain a certain productive capacity in the long run). Given that the new technology is capital deepening, a higher rate of depreciation makes the new technology less attractive.

The simple conclusion from this model of competition between two technologies is that the take-off and speed of diffusion of a new technology depend not only on the technological characteristics of the new and old technologies, but also on economy-wide or societal factors which are not
directly connected to technology itself. The model shows this for the rate of population growth
and the rate of capital depreciation, but in fact one may think of a whole range of other factors
for which similar conclusions would hold. Chapter 3 provided many historical examples of cases
where such factors could retard the diffusion of new technologies. The present model can be seen
as a metaphor for such cases.

So far, the model has been analyzed for the case of two technologies only. However, it can also
be analyzed for more than two technologies, or in fact for a situation in which new technologies
arrive at certain intervals, and hence there is a process of technological renewal that is constantly
ongoing. These more complicated models work in the same way as the model analyzed above,
but are harder to solve analytically. However, by numerically solving the model on a computer,
one may derive the time paths for the diffusion of new technologies in such a case. Figure 5.5
gives an example.

The dark lines in the figure correspond to individual technologies, which are introduced every 50
periods. It has been assumed that a new technology has labour productivity 15% higher than the
previous one, while capital intensity remains unchanged. Applying the same line of reasoning as
in the case of two technologies, this ensures that new technologies take off and spread through
the economy. The simulation is started with only one technology present. The diagram displays
on the left axis the variable f/(1-f), defined in the discussion of Figure 5.2, for each of the
technologies. In this case, we have used fi ≡ vi / Σj vj , where the subscript i indicates a technology.

Figure 5.5. Time series from the Silverberg model with introduction of a 15% more labour productive technology every 50 periods

After a period in which the initial technology is dominant, we observe that the model evolves into
a stage of regular substitution patterns between the old and new technologies. During this phase, the observed substitution curves obviously bear great similarity to those in Figure 5.2. Also note
that it will take approximately 100 periods (i.e., two technology generations) before a technology
becomes the dominant one in terms of employment. It takes another 100 periods before the
technology vanishes from the scene. If we are willing to interpret the periods as years, this
suggests a parallel to one of the conclusions of chapter 3, namely that technological paradigms
that dominate a certain era often stem (in terms of invention) from a period long in the past. This
specific result will be applied in a later chapter when we look at an evolutionary model of
economic growth in the long run. Finally, note that the real wage rate rises steadily as new and
more productive technologies diffuse through the economy.

In order for the reader to experiment with the Silverberg diffusion model, a simulation model is
available from the author’s website (http://www.tm.tue.nl/ecis/bart/silmod.zip). The model runs
under Windows, and may be used to generate simulations in which a new and old technology
compete, or one in which new technologies get introduced continuously (as in Figure 5.5).
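The downloadable program is the authoritative implementation. Purely to illustrate the mechanics, a rough sketch of the continuous-introduction set-up might look as follows (simple Euler steps, made-up parameter values, and a 15% productivity jump every 50 periods as in Figure 5.5):

# Rough sketch of the Silverberg set-up with a new technology arriving every
# 50 periods, 15% more labour productive, same capital intensity.
# This is NOT the downloadable model; all parameter values are illustrative.
m, n = 0.05, 0.1
c = 2.0
lam, delta = 0.01, 0.03
dt = 0.05

a_list = [1.0]                        # labour productivities of the technologies
v = [m / n]                           # employment shares, start near equilibrium
w = 0.95 * a_list[0] * (1 - c * (lam + delta))

steps_per_cycle = int(50 / dt)
for step in range(8 * steps_per_cycle):
    if step > 0 and step % steps_per_cycle == 0:   # introduce a new technology
        a_list.append(a_list[-1] * 1.15)
        v.append(1e-4)                             # tiny initial employment share
    total_v = sum(v)
    w += dt * w * (-m + n * total_v)
    for i, a in enumerate(a_list):
        A, B = 1 / c - (lam + delta), 1 / (a * c)
        v[i] += dt * v[i] * (A - B * w)
    if step % steps_per_cycle == 0:
        shares = [round(x / sum(v), 3) for x in v]
        print(f"t={step * dt:5.0f}  w={w:.2f}  employment shares={shares}")

Over a long enough horizon the printed shares show new technologies entering with a negligible share and gradually displacing older ones, while the wage rate drifts upward, qualitatively matching Figure 5.5.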

A final note on the Silverberg model concerns the role it reserves for incremental innovation. If
we take, as in Figure 5.5, the innovations as somehow representing technological paradigms, there
is no mechanism at all that corresponds to endogenous changes in the technology in the form of
incremental innovation. The productivity of each new innovation is given exogenously, and
remains constant over time. It is not so difficult, however, to incorporate some form of
incremental change of the new technology, depending, for example, on cumulated experience with
the new technology. A more sophisticated way of dealing with this would be to make incremental innovation in the old technology a function of the degree of competition from the new technology.
This corresponds to the example of the competition between the sailing ship and the steamer,
briefly mentioned in chapter 1. To set up such a model would obviously greatly enhance the level
of realism of our analysis, but is beyond the scope of this book.

5.4. Networks and competition between standards

The case of competition between technologies becomes rather peculiar when so-called network
externalities are present. Network externalities result when the utility that a user derives from a
product depends on the number of other users of the product. A case in point is the telephone.
If there were only one user of this product, this user would not derive much utility from the
telephone set, for the simple reason that there would be nobody else to call. When other users
start adopting a telephone, the utility for the first user obviously grows, as the opportunity to talk
to others grows.

In this example, the utility for the user depends on the size of the communications network, hence
the term network externalities. This phenomenon may be observed in the case of other networked
products as well, e.g., railroads (the size of the network determines the number of places a train
passenger may travel to). The term networks must obviously be taken broadly here, and not be
restricted to the case of physical networks, such as telephone cables. Also more virtual networks,
such as airlines, must be included in the definition.

One may easily imagine that there are also limits to network externalities. In the example of the
telephone, the value of the product for an individual user will increase as long as people from this person’s circle of acquaintances adopt the telephone, but once everybody in this limited group of people has bought a phone, a further increase in the number of users (i.e., people not acquainted with our individual) will not increase this value any further. Hence network externalities may cease beyond a certain threshold size of the network.

The importance of networked products has greatly increased in the Information Age. The
Internet, or computer networks in general, are a case in point where the size of the physical
network (cables) determines the value of the product to the user. But there are also many
examples of virtual networks, as is the case for users of compatible computers (such as the Wintel
PC platform or the Apple Macintosh discussed in chapter 1). Although these users need not be
connected by a physical computer network, they may exchange data and software within the
group of users of their computer type by means of floppy disks or other media. They may also
exchange information and experience about the use of the hard- and software. The value of both types of exchange is obviously larger, the larger the number of users (the virtual network).

The Wintel and Apple Macintosh example also brings out clearly the fact that networks may
compete with each other. Macintosh users may exchange software, data and information with
each other, but in general such exchange will be impossible (or at best a great deal more difficult)
with Wintel users. In other words, network externalities are largely limited to the network of the
technological variant that has been adopted, and do not extend to other networks, even if they
represent products with similar functions.

One of the functions of a technological standard is to enhance the extent of network externalities
by increasing compatibility between products that adhere to the standard. Quite naturally, this will
also imply that compatibility with other standards is limited, and hence that network externalities
are confined to exchange within standards and not between standards. This is why we will
represent the battle between two competing technologies that are both characterized by network
externalities as a battle between two standards. The model that we will deploy was, in its basic
form, proposed by Brian Arthur.28 It assumes there are two alternative technologies aimed at the
consumer market. Each technology corresponds to a technological standard, which specifies a
number of compatible products. Network externalities take place within each of the two
standards, but not between them. We also assume that there is no limit to the network
externalities. As was argued above, this may in fact be too optimistic an assumption about the importance of network externalities, and we should therefore draw our conclusions with some care.

We start the model by looking at the consumer side. Consumers in the market have the following
utility function:
U_{ij} = \bar{u}_{ij} + \theta X_j - p_j,    (4.21)

where i indicates a consumer and j a standard. U is utility, ū is the utility derived from purchasing technology j when no other consumers buy that technology, Xj is the number of users of standard j, θ > 0 is a parameter that governs the strength of the network externalities, and pj is the price.
Suppose there are N consumers in the market and each consumer buys one unit of the good. Let
us suppose that the market for goods according to each of the two standards is characterized by
perfect competition, and that the marginal costs of producing the goods are equal between the
standards. As usual, we denote the marginal cost level by c and we have p = c.

The utility functions can be drawn as in Figure 5.6. The horizontal axis displays the number of
consumers using the standard denoted as 1. Note that it does not matter which of the two
technologies we denote on this axis. If all consumers buy, X2 = N - X1, and this means that the
curve for the technology with X on the horizontal axis is upward sloping, and the other curve is
downward sloping.

An individual consumer will pick the technology that has the highest value for her. Consider the situation described by the curves U1 and U2 in Figure 5.6. Where the two curves intersect, the consumer is indifferent between the two standards. Because the curves have been drawn under the assumption ū1 = ū2, this is at the point N/2 on the horizontal axis. To the right of this point, technology 1 is preferred, to the left technology 2 is preferred. In other words, which of the two technologies is preferred by the consumer depends on which one has the highest market share. Note that by increasing the value of ū1, the curve U1 shifts upward to U1'. This moves the point where the consumer is indifferent to the left, and hence technology 1 is preferred over a broader range of its market share (X1).

Formally, this process can be described by finding the value of X1 for which the consumer is
indifferent, i.e., where Ui1 = Ui2. Using the utility function, this yields:

X_{i1}^{*} = \frac{\bar{u}_{i2} - \bar{u}_{i1}}{2\theta} + \frac{N}{2},    (4.22)

28. The presentation here and in the next section draws on Liebowitz and Margolis.

where the asterisk indicates the indifference level. This equation clearly shows that under the assumption ūi1 = ūi2, X*i1 = N/2, as was already evident from the graph.
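The step from the indifference condition to (22) is short; written out, using X2 = N - X1 and equal prices p under perfect competition:

\bar{u}_{i1} + \theta X_1 - p = \bar{u}_{i2} + \theta (N - X_1) - p
\;\Longrightarrow\;
2\theta X_1 = \bar{u}_{i2} - \bar{u}_{i1} + \theta N
\;\Longrightarrow\;
X_{i1}^{*} = \frac{\bar{u}_{i2} - \bar{u}_{i1}}{2\theta} + \frac{N}{2}.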

We have assumed that the level of utility derived from a standard when no other consumers buy that standard (ū) differs across consumers. As equation (22) shows, this also implies that the level at which a consumer is indifferent between the two standards (X*i1) will generally differ between consumers. Let us assume that the different values of X*i1 can be described by a uniform distribution.

Figure 5.6. Utility functions in the market with network externalities

Let us now suppose that a fraction η of the N consumers buys in each period, so that it will take a period of time equal to 1/η before the market is fully served. In making a decision on which of the two standards to buy, the consumers take the share in current sales as their expectation of the ultimate shares of each of the two standards. With this information, the consumer is able to determine which standard is more beneficial, by comparing this market share with her value of X*i1.

In order to see what will happen to the shares of the two standards in current sales, we must make use of the cumulative distribution function of the consumers’ values of X*i1. As in the probit model analyzed above, the cumulative distribution gives the number of consumers with X*i1 ≤ X, for different values of X. If, as we have assumed, the distribution of the values of X*i1 is uniform over some range [xl, xh], the cumulative distribution rises linearly with slope D, reaching the total number of consumers, the product of (xh - xl) and D, at xh; here D is the density, i.e., the number of consumers at each specific preference level.

Figure 5.7. Equilibrium analysis for competition between standards with network externalities present

This is described in Figure 5.7. The top figure of panel (a) in Figure 5.7 shows a uniform distribution over the range [0, N]. This is the distribution of consumer tastes (X*i1) in the economy. Because we have assumed that there are N consumers in total, and the range of the distribution also has length N, the density is D = 1, so that N × D = N. D is the height of the distribution in the figure. The cumulative distribution function is given in the bottom part of panel (a). This is obtained in a similar way to Figure 5.3 above. Because of the horizontal line in the top part of panel (a), the cumulative distribution is linear and increases at a slope equal to D.

The final assumption that we need in order to determine the market equilibrium in the standards battle is that the composition of the group of consumers buying at each moment does not change over time. This means that the group of consumers buying at any particular moment is a representative sample of the total consumer population.

To see how we can determine the equilibrium market share, consider panel (b) of Figure 5.7. Here
we assume that all consumers are identical and have X*i1 = N/2. Besides the cumulative
distribution, the figure also shows the 45 degrees line in the bottom part. This is the line at which
the values on the horizontal axis are equal to those on the vertical axis. Remember that the
cumulative distribution describes the number of ‘new’ consumers that will buy technology 1 for
each value of the market share of technology 1 in current sales. The latter is given as a result of
the choice of previous generations of consumers, so we can read off the share of new consumers
buying technology 1 from the cumulative distribution function. To do this, we start at the share
of technology 1 in current sales on the horizontal axis, then move up to the cumulative
distribution curve, and finally move left to the vertical axis.

Whenever the cumulative distribution function is below the 45 degrees line, technology 1 has a lower share among new consumers than among current (old) consumers, which means that the market share of technology 1 will be falling. Conversely, when the cumulative distribution function is above the 45 degrees line, the share of technology 1 among new consumers exceeds its current share, and its market share will be rising.

Consider what happens if we start with a market share of technology 1 to the left of point E1 (still
in panel b). Obviously, the cumulative distribution function is below the 45 degrees line, and hence
the market share will shrink. If we repeat the analysis for the next period, we will thus have to
start from a value (on the horizontal axis) that is left of the previous starting point. Because we
are still below the 45 degrees line, this process will go on. Only when we reach a market share
of technology 1 equal to zero (this happens in the limit) will we stop moving left on the horizontal
axis. Thus, the point E3 is obviously an ‘attractor’ of the system. Applying similar reasoning to
a starting point to the right of E1, we arrive at the conclusion that the system will evolve towards
point E2. Hence, both E3 and E2 are equilibrium points, and which one will be reached depends
on where we start.

Note that if we start exactly at point E1, the process will stay there, and the same holds for points
E2 and E3. The difference between E1 on the one hand and E2 and E3 on the other hand, however,
is that the latter two points are stable equilibria, while E1 is unstable. This means that a slight disturbance away from E1 will drive the market further away from that equilibrium, while the opposite holds for E3 and E2. For these two points, a small disturbance will eventually lead the system back to the original equilibrium.
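The adjustment process just described can be mimicked by iterating a simple map: the share of technology 1 among new buyers equals the cumulative threshold distribution evaluated at its current share. The sketch below (illustrative numbers only) uses thresholds distributed uniformly on [0.25, 0.75] in share terms, roughly the situation of panel (c):

# Iterating the adoption map: next period's share of technology 1 equals the
# cumulative threshold distribution evaluated at its current share.
# Thresholds are uniform on [0.25, 0.75] in share terms; illustrative only.
def new_share(s):
    return min(max((s - 0.25) / 0.5, 0.0), 1.0)

for s0 in (0.35, 0.50, 0.65):
    s, path = s0, [s0]
    for _ in range(8):
        s = new_share(s)
        path.append(round(s, 3))
    print(f"start at {s0}: {path}")

Starting below one half, the share collapses to zero (the analogue of E3); starting above one half, it climbs to one (E2); starting exactly at one half, it stays put at the unstable interior point E1.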

The mathematically skilled reader will have noticed that the lower parts of the different panels in
Figure 5.7 are also known as phase diagrams, used in the analysis of differential equations (Figure
5.4 is also a phase diagram). The stability of the equilibrium points, which correspond to
intersections between the cumulative distribution and the 45 degrees line, depends on the slope
of the cumulative distribution at the intersection point. When the slope is larger than one, the
equilibrium is unstable, while it is stable if the slope is smaller than one. We will encounter this
when looking at the further panels of Figure 5.7.

Panel (c) shows a less extreme case than panel (b), but one that has the same qualitative result.
In this case, not all consumers are equal, but instead we have a uniform distribution of preferences
over a wide range. The only restriction we have made is that there are no consumers who will
‘go against’ an overwhelming majority of users of either one technology. This is represented by
the top part of the panel, which shows that the frequency of consumers with very low or very high
threshold levels is zero. As the reader may easily verify, points E2 and E3 are equilibrium states
of the system and, as before, which one will be reached depends on where we start.

Panel (d) is different from the previous two cases. Here it is assumed that each of the two
alternatives has a group of strong supporters, e.g., because that particular system is tailored well
to the needs of the group. As an example, one may think of the professional group of graphical
designers having a strong preference for the Apple Macintosh because of superior graphical
qualities of this computer. At the same time, ‘gamers’ might have a preference for a Wintel system
(the competing technology) because a superior supply of games is available for this platform. The
majority of users, however, is not willing to commit so strongly to either one technology, but
instead tends to ‘go with the flow’. Our familiar way of reasoning will show that now point E1 is
a stable equilibrium, and the system will tend towards this point no matter where we start
(provided it is not exactly in one of the corners, which are now unstable equilibria).

In the case of panel (e), everyone prefers technology 1 beyond a relatively low threshold level of
users. Only for very low numbers of users of technology 1 are there some supporters of
technology 2. As the cumulative distribution function shows, under identical circumstances (i.e., equal market shares), technology 1 is preferred by (almost) all consumers. This means we may interpret this case
as one where technology 1 is superior from a technological point of view (and everyone knows
this). Now there is no interior equilibrium. Point E2 is a stable equilibrium and E3 is unstable. This
implies that the system will tend towards dominance of technology 1 (the superior technology).

The final case, panel (f) is similar to the previous one, but now there is nobody who is willing to
use technology 1 if the number of supporters for that technology is close to zero. This is the case
where technology 1 is considered somehow superior, but superiority is not so large as to offset
all possible disadvantages from a small network. Now an interior equilibrium point re-emerges,
but it is an unstable one. The existence of the equilibrium does imply, however, that there is now
a range of initial values (to the left of the interior equilibrium E1) for which the system tends to
dominance of technology 2.

The central lesson from this analysis is that there are circumstances in which the outcome of the
competition process between two standards depends on the initial state of the system, i.e., on the
decisions of early adopters. We call this property of the system ‘path dependence’, because the
final state of the system depends on the path it takes. This is a ‘dangerous’ property under some
circumstances. Consider, for example, the case of Table 5.1. This table gives the pay-off in terms
of consumer utility as a function of the number of consumers for two competing technologies.
Both technologies are characterized by network externalities (utility rises when the number of
users grows), but the rate at which this occurs differs. Moreover, the level of utility when the
technology is used in autarky differs between the technologies.

Number of users         0    10    20    30    40    50    60    70
Utility technology 1   10    11    12    13    14    15    16    17
Utility technology 2    1     5    10    20    30    40    60    60

Table 5.1. Lock-in to a bad technology

Let us assume that the benefits of both technologies at various user levels are not known in
advance, i.e., that the data in Table 5.1 must be discovered by experience. In this case, the first
generation of consumers, who are comparing the utility of both technologies in autarky or at low
levels of fellow-users, will opt for technology 1. The next generation of consumers will then base
their choice on the current market share of the technologies, and hence also choose technology
1. Adoption of technology 1 becomes a self-fulfilling prophecy, and the system gets ‘locked in’
to technology 1. With the hindsight knowledge in Table 5.1, however, it is clear that this choice
is inefficient whenever the total number of users becomes 30 or more in the long run. This is the
situation of lock-in to an inferior technology. Getting out of the locked-in situation would be possible if all consumers switched to technology 2 at the same time. But this is unlikely to occur because such a common switch would require too great a degree of co-ordination. At an
individual level, switching costs are obviously too high to provide an incentive for individual
consumers to move to technology 2 as long as other consumers cannot be expected to move.
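A small sketch reproduces this dynamic with the numbers of Table 5.1. Successive cohorts of ten users choose myopically on the basis of current user numbers, with intermediate pay-offs interpolated linearly; the cohort size and the myopic decision rule are illustrative assumptions, not part of the table itself:

# Myopic sequential adoption with the pay-offs of Table 5.1.
# Each cohort of 10 users picks the technology with the higher utility at the
# current number of users (linear interpolation between tabulated points).
import numpy as np

users = [0, 10, 20, 30, 40, 50, 60, 70]
u1 = [10, 11, 12, 13, 14, 15, 16, 17]
u2 = [1, 5, 10, 20, 30, 40, 60, 60]

n1 = n2 = 0
for cohort in range(1, 8):
    if np.interp(n1, users, u1) >= np.interp(n2, users, u2):
        n1 += 10
    else:
        n2 += 10
    print(f"after cohort {cohort}: technology 1 has {n1} users, technology 2 has {n2}")

# With hindsight, 70 users on technology 2 would each enjoy utility 60 instead
# of 17 on technology 1: the market has locked in to the inferior technology.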

Obviously, the example in Table 5.1 is a rather peculiar one. It shows that an economy may theoretically get locked in to an inferior technology, which is obviously an undesirable property.
But this theoretical possibility does not prove that such situations occur often in reality. It has
been argued for a number of historical cases, however, that lock-in to a bad technology did take
place. In the Apple vs Wintel battle, it is often argued that the Apple Macintosh has superior user
value because of its high quality user interface (compared to Windows). Nevertheless, most users
have settled for a Wintel system, as the case in chapter 1 showed. Similarly, it has been argued
that the battle between the VHS and Betamax video recorder systems has resulted in a lock-in to
inferior technology (VHS). Another famous case is the keyboard layout used in most computers
(the QWERTY standard). A different layout, the so-called Dvorak system, would supposedly
result in higher typing speed. Network externalities in the latter case consist of training institutions
(schools offering typing courses) all having settled on the QWERTY system, and hence an
individual user who would like to adopt the Dvorak system would face high learning costs.

Whether the eventually dominant technology is indeed inferior to the technology that has been locked out, or the reverse is true, is hard to prove in any of the above cases. Discussions on which system is superior are often blurred by ideological arguments from the supporters of each side. Moreover, an objective evaluation of both systems is hard because counterfactual analysis is impossible in economic reality except in very specific cases. Thus we must treat our conclusion on the possibility of lock-in to an inferior technology with a certain degree of caution.

However, even when there is no inherently superior technology (as in panel b of Figure 5.7), the
notion of lock-in still bears practical relevance. In such a case, the consumer does not really care which of the two technologies will be selected, but instead simply jumps on the bandwagon
that gets started first. From the point of view of the firm, however, the possibility of lock-in
creates a good business opportunity. By building a large initial group of supporters, a firm’s
standard may get enough momentum to become locked-in and drive out competing standards.
This is why firms are eager to have their technology incorporated in a (successful) standard. Also,
a firm may be willing to incur initial losses in order to build a high enough user base and hence
‘tip’ the market in its own favour. This can be done by offering a product for a very low (or even
zero) price during the market introduction phase. If this attracts enough users to create a lock-in,
prices can be raised so that profits emerge. Such strategies, however, are obviously limited by the fact that consumers may recognize the possibility of lock-in. Because they fear getting stranded in a standard that ultimately proves unsuccessful, or fear higher prices after lock-in has occurred, they may opt for a different product that is not based on a standard, or one based on an ‘open standard’ (available to a large group of suppliers).

5.5. Network externalities as an inverse common pool

The case of a market with network externalities can be analyzed in terms of the common pool
problem that was explored in the previous chapter. The difference between the case there (patent races) and the case of network externalities is that in the present situation, the externality of ‘fishing’ in the common pool is positive rather than negative. Remember that in chapter 4, we interpreted a patent race as a fishing trip to a pond full of inventive ideas. The more fishermen (inventors) there are, the smaller the probability for any one of them to come up with the big catch. In other words, in a patent race each participant imposes a negative externality on the other participants. In the case of network externalities, the reverse is obviously true. Each network participant raises the value of the network for the others, and hence imposes a positive
externality. We will see that this has a profound impact on the nature of the common pool
problem.

The central idea is that in a competitive market, i.e., a market where no firm ‘owns’ the network
and many competitors serve the market for the networked product, the positive externality cannot
be internalized by any market party. If, on the other hand, the network is owned by a monopolist
supplier of the product, this monopolist may internalize the externality. This puts the welfare
properties of a monopoly in a different perspective. This argument is illustrated in a graphical way
in Figure 5.8.

The left panel of the figure considers the traditional situation. P is price and Q quantity. The
market is characterized by a downward-sloping demand curve D. Marginal costs are assumed to
be upward-sloping (a common assumption, but different from the one we have made with regard
to marginal costs in chapters 2 and 4), and represented by the curve mc. The curve MR represents
marginal revenue, and may be obtained by differentiating the total revenue function (which is equal to P times Q) with respect to Q. A firm operating in a competitive market will have a supply curve equal to the marginal
cost curve. Hence the market equilibrium for the competitive market will be Ec. A monopolist,
however, will use its market power to bring up the price by means of a reduction of output. In this
case, the profit maximum for the monopolist is found at the point where marginal revenue equals
marginal costs.29 The equilibrium quantity may thus be obtained directly from the intersection of
these two curves. The corresponding market price is obtained by moving upward to the demand
curve, to the point Em. Hence we see that compared to a competitive market, the monopolist will sell fewer products, but at a higher price (indicated by the arrows along the axes). This leads to a dead-weight welfare loss in the form of lost consumer surplus (as in the Nordhaus model of chapter 2) equal to the dark triangle in the figure.

29. This may easily be verified by setting up and solving the monopolist’s profit maximization problem in a similar way as we have done in chapters 2 and 4.

Figure 5.8. Markets with network externalities as an inverse common pool problem

Now suppose that due to network externalities, the demand curve is upward sloping, as in the
right panel of the figure. This means that a larger network (corresponding to higher Q) will lead
to extra utility and hence a higher willingness to pay for the product. This is indeed a strange
situation from the point of view of economic theory. It can only result if network externalities are
so strong that they completely offset the traditional tendency of decreasing marginal utility (and
hence willingness to pay). We must therefore note that an upward-sloping demand curve as in the
right panel of Figure 5.8 is not necessarily found in all markets with network externalities. We will
pursue the case mainly for the sake of the argument.

The competitive market equilibrium remains at the intersection of the marginal costs curve and
the demand curve, i.e., at Ec. Note that now that the demand curve is upward-sloping, the
marginal revenue curve will also be upward sloping, and lies above the demand curve. This is the
result of the positive externality of network size: with a larger network size, the marginal
consumer not only raises her own utility, but also that of all other users. The monopolist still maximizes profits at the intersection of marginal revenue and marginal cost. This is due to the fact that when marginal revenues go up as a result of network externalities, the monopolist can charge its users extra for this benefit. When the network is not ‘owned’, but instead served by a large
number of firms, this extra charge is not possible due to price competition.

We derive the monopoly equilibrium Em in the same way as before. As before, the monopoly price will go up. But now this is accompanied by a rise in output rather than a fall as in the traditional case. Associated with the rise in output is the internalization of the positive externality by the monopolist (in the form of extra profits). Rather than a welfare loss due to a loss of consumer surplus, we now have an increase in total welfare due to this internalization. Note,
however, that consumers have not internalized the externality. The conclusion is therefore that
the traditional argument against monopoly (it leads to welfare losses) loses some of its power in
markets where network externalities are extremely strong. In such markets, a network owner
(monopolist) may internalize the positive network externalities that exist.
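A tiny numeric example, with linear curves and made-up coefficients chosen so that the monopolist’s second-order condition holds, illustrates the comparison in the right panel:

# Right panel of Figure 5.8 with a linear, upward-sloping inverse demand curve.
# P(Q) = 2 + 0.5*Q (demand), hence MR(Q) = 2 + 1.0*Q; MC(Q) = 1 + 2.0*Q.
# Coefficients are illustrative only.
Qc = (2 - 1) / (2.0 - 0.5)   # competition: P(Q) = MC(Q)
Qm = (2 - 1) / (2.0 - 1.0)   # monopoly:    MR(Q) = MC(Q)
Pc, Pm = 2 + 0.5 * Qc, 2 + 0.5 * Qm
print(f"competition: Q = {Qc:.2f}, P = {Pc:.2f}")
print(f"monopoly:    Q = {Qm:.2f}, P = {Pm:.2f}")
# The monopolist sells more (Q = 1.00 vs 0.67) at a higher price (2.50 vs 2.33),
# internalizing the positive network externality.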

5.6. Conclusions

The diffusion of innovations takes place according to an S-shaped pattern over time. Economic
models explaining this empirical finding can be separated into two categories. The first is a class
of models that considers the spread of innovations in isolation of competing or complementary
innovations. The second class considers competition between technologies.

In the class of models that look at a single innovation, the epidemic model stands out for its
simplicity. This model is based on an analogy with the spread of an infectious disease, and
concludes that there are several economic factors that may speed up or slow down the diffusion process. These factors include, for example, the profitability and capital requirements of the underlying innovation.

However, the epidemic model is not very popular, because of its lack of microeconomic
foundation. The probit model fills this gap. In this model, the individual decision whether or not
to adopt is central. The speed of diffusion depends on factors such as the rate of incremental
innovation of the technology, or the growth rate of the wage rate.

A similar result is found for the case of competing technologies. Here we consider the question
under what circumstances a technology will take off if it has to compete with an older technology.
It was seen that if the new technology is not superior in all respects (e.g., it saves labour but
requires more capital relative to the old technology), its diffusion may be hampered by factors
such as population growth. Note that these factors are in no way related to technology itself, so
that it must be concluded that the diffusion potential of a technology is not just a matter of
‘getting the engineering right’.

A special case emerges when technologies are characterized by network externalities, i.e., if the
value for the individual user depends on the number of users of the technology. Many so-called information goods, which are typical of recent developments in the information and communications sector, are characterized by this phenomenon. In this case, the decision of which of two competing technologies (standards) to adopt depends on which technology has been able to achieve the highest market share in the past. Such a mechanism may lead to self-fulfilling prophecies and lock-in to a standard. The danger even exists that an economy gets locked in to an inferior
technology. The existence of network externalities may also lead to a different view on the
welfare properties of monopolies. When network externalities are strong, ownership of the
network (by a monopolist) may lead to internalization of externalities that would be lost under
free competition.

References to the original works described in this chapter

Arthur, W.B. 1988. ‘Competing Technologies: An Overview’. In Technical Change and Economic Theory, edited by G. Dosi, C. Freeman, R.R. Nelson, G. Silverberg and L. Soete. London: Pinter.
David, P. 1975. Technical Choice, Innovation and Economic Growth. Cambridge: Cambridge
University Press.
Gruebler, A. 1990. The Rise and Fall of Infrastructures: Dynamics of Evolution and Technological Change in Transport. Heidelberg: Physica-Verlag.

Liebowitz, S.J., and S.E. Margolis. 1999. Winners, Losers & Microsoft. Competition and
Antitrust in High Technology. Oakland: The Independent Institute.
Mansfield, E. 1968. Industrial Research and Technological Innovation. New York: W.W.
Norton.
Nakicenovic, N. 1984. Growth to Limits, Long Waves and the Dynamics of Technology. Ph.D. thesis, Wirtschaftswissenschaftlichen Fakultät, Universität Wien, Vienna.
Silverberg, G. 1984. ‘Embodied Technical Progress in a Dynamic Economic Model: the Self-
Organization Paradigm’. In Nonlinear Models of Fluctuating Growth, edited by R.
Goodwin, M. Krüger and A. Vercelli. Berlin: Springer Verlag.
