Introduction To Economic Growth
451 Introduction to
Economic Growth
Daron Acemoglu
MIT Department of Economics
January 2006
Contents
I Introduction
1.2 Interpretation
2.1.2 Endowments
2.1.4 Definition of Equilibrium
Balanced Growth
II
4.2 Hypotheses
Neoclassical Growth
7.1.4 Generalizations
7.1.5 Limitations
8.8.2 Extensions
9.2.3 Equilibrium
III Endogenous Growth
IV
Part I
Introduction
Chapter 1
Stylized Facts of Economic Growth
and Development
1.1
There are very large differences in income per capita or output per worker across countries
today. Countries at the top of the world income distribution are thirty times as rich as
countries at the bottom in PPP-adjusted dollars. For example, in 2000, GDP per capita
in the United States was $32,500 (valued at 1995 prices). In contrast, income per capita
is much lower in many other countries: $9,000 in Mexico, $4,000 in China, $2,500 in India,
$1,000 in Nigeria, and much lower still in some other sub-Saharan African countries such
as Chad, Ethiopia and Mali (all figures adjusted for purchasing power parity). The gap is larger
when there is no PPP adjustment. The next figure shows a cross-sectional look at these
income-level differences in the year 2000.
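As a quick arithmetic check on the figures quoted above, the following sketch computes the ratio of US income per capita to that of each other country listed. The dictionary name and layout are illustrative choices; the numbers are those given in the text (GDP per capita in 2000, PPP-adjusted, at 1995 prices).

```python
# GDP per capita in 2000 (PPP-adjusted, 1995 prices), as quoted in the text.
income = {
    "United States": 32500,
    "Mexico": 9000,
    "China": 4000,
    "India": 2500,
    "Nigeria": 1000,
}

us = income["United States"]
# Ratio of US income per capita to each country's income per capita.
ratios = {country: us / value for country, value in income.items()}

for country, ratio in ratios.items():
    print(f"US income per capita is {ratio:.1f} times that of {country}")
```

The US-Nigeria ratio of 32.5 is in line with the thirty-fold gap between the top and the bottom of the distribution.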
Should we care about cross-country income differences? The answer is a big yes. High
income levels reflect high standards of living. It is true that together with economic growth,
pollution increases and individual aspirations may also increase so that the same bundle
of consumption may no longer make an individual as happy. But at the end of the day,
when one compares an advanced, rich country with a less-developed one, there are striking
differences in the quality of life, standards of living and health. In fact, it is even difficult for
us to imagine the burden of poverty at the levels experienced by countries in sub-Saharan
Africa. There is little doubt that the consumption level, living standards and health level of
richer countries are appreciably higher than those of countries with lower income per capita. These gaps
represent big welfare differences.
Understanding how some countries can be so rich while some others are so poor is one
of the most important, perhaps the most important, challenges facing social science.
This picture shows how East Asian tigers have grown at much higher rates than the rest
of the world over the past 40 years, while a number of countries in sub-Saharan Africa and
Central America have experienced negative growth.
For example, despite some big growth successes and disasters, countries that were rich
in 1960 are very likely to be rich today. A regression of log income per worker in 1990
on log income per worker in 1960 gives the following relationship:
\[ \ln y_{1990} = \underset{(0.48)}{0.56} + \underset{(0.06)}{1.00}\, \ln y_{1960} \tag{1.1} \]
If we look at output or income per worker, the overall shape of the world income distribution has been relatively stable in the postwar period. There is certainly no narrowing of
income gaps. Instead, there is a small but notable increase in the dispersion of incomes. This
is shown in the next figure which depicts the standard deviation of log income per capita in
the world and the ratio of the income of the five richest to the five poorest countries in the
world.
Moreover, there is also a pattern of stratification, whereby some of the middle-income countries of the 1960s appear to have joined either the low-income or the high-income club. This
is shown in the next figure:
The above statements refer to the unconditional distribution; that is, they refer to
whether the income gap between two countries increases or decreases irrespective of these
countries' characteristics. Alternatively, we can look at the conditional distribution (e.g.,
Barro and Sala-i-Martin, 1992). Here the picture is one of conditional convergence: in
the postwar period, the income gap between countries that share the same characteristics
typically closes over time (though it does so quite slowly).
How do we capture conditional convergence? Consider a typical Barro growth regression:
\[ g_{t,t-1} = \beta \ln y_{t-1} + X_{t-1}' \alpha + \varepsilon_t \tag{1.2} \]
where $g_{t,t-1}$ is the annual growth rate between dates $t-1$ and $t$, $y_{t-1}$ is per capita income
at date $t-1$ and $X$ is a set of variables that the regression is conditioning on (in theory, the
determinants of steady state income and/or growth). When no covariates are included, this
If we go even further back, the pattern may be one of reversal: Acemoglu, Johnson and
Robinson (2002) show that in 1500, among the societies that were later to be colonized by
European powers, those that were relatively prosperous are today relatively poor.
How do we measure/proxy economic prosperity in 1500? It turns out that urbanization
rates and population density are good proxies for prosperity during preindustrial periods
(and urbanization rates are also good proxies even today).
A variety of evidence shows that in 1500 the Mughals, Aztecs and Incas were much
more urbanized and densely settled than the civilizations in North America, New Zealand
and Australia. Today the U.S., Canada, New Zealand and Australia are orders of magnitude richer than the countries now occupying the territories of the Mughal, Aztec and Inca
Empires, such as India, Ecuador or Peru. Therefore, among this set of countries there was
a pattern of reversal, whereby those that were relatively prosperous in 1500 have become
relatively poor today. The reversal is not confined to this set of countries, and is more widespread
[Figure: scatter plots of countries, labeled by country code, against urbanization in 1500 and against log population density in 1500.]
When did this reversal take place? Consistent with the discussion from Pritchett's paper
above, the evidence suggests that the reversal among the former European colonies took
place during the 19th century as well. Up to the late 18th century, previously prosperous
places continued to be somewhat more prosperous. It was the age of industrialization, the
19th century, when previously less-prosperous former colonies became rapidly urbanized,
industrialized and increased their GDP per capita. The next two pictures give a sense of
these processes:
[Figure: time series from 800 to 1920.]
[Figure: time series from 1750 to 1953 for the US, Australia, Canada, New Zealand, Brazil, Mexico and India.]
1.2
Interpretation
This discussion points to the following set of facts and questions that are central to an
investigation of the determinants of long-run differences in income levels and growth:
1. The major pattern to be explained is why there are such large differences in income per
capita and worker productivity across countries. This immediately takes us to questions
of why some countries grow (or have grown) while other countries have failed to grow
and stagnated.
2. The relative stability of the postwar income distribution has suggested to many economists that we should look for differences across countries leading to very large permanent differences in income, but not necessarily large permanent differences in
1.3
The Agenda
In the rest of the class, we will look at models that can help us understand the mechanics
of economic growth. This means understanding a variety of models that underpin the way
economists think about the process of capital accumulation, technological progress, and
productivity growth. Only by understanding these mechanics can we have a framework for
thinking about the causes of why some countries are growing and some others are not, and
why some countries are rich and some others are not.
Therefore, the approach will be two pronged: on the one hand, we want to understand
the mathematical structure of these models as well as possible; on the other, we want to
understand what these models and others have to say about which key parameters or key
economic processes are different across countries and why.
Chapter 2
The Solow Growth Model
2.1
2.1.1
We start with the simplest growth model, sometimes referred to as the Solow-Swan model
after two economists who developed versions of it, or simply as the Solow growth model after
our own Bob Solow, who was awarded the Nobel prize for his contributions to growth theory.
This is a closed economy, with a unique final good. The economy is in discrete time
running to infinite horizon, so that time is indexed by t = 0, 1, 2, .... Time periods here can
correspond to days, weeks, or years. So far we do not need to take a position on this.
The economy is inhabited by a large number of households, and for now we are going
to make relatively few assumptions on the households because in this baseline model, they
will not be optimizing. To fix ideas, you may want to assume that all households are
identical, so that the economy admits a representative consumer. We return to what this
assumption of the representative consumer involves below. As an aside, you should know
\[ Y(t) = F[K(t), L(t), A(t)], \tag{2.1} \]
where Y (t) is the total amount of production of the final good, K (t) is the capital stock,
L (t) is total employment and A (t) is technology. The capital stock here denotes the quantity
of machines used in production. Both the capital stock and technology are taken to be
single indices, and at some level, they are treated as black boxes; we will later discuss how
such models can be extended to think of multiple types of technologies and capital goods.
For now, the important assumption is that technology is free: it is publicly available as a
non-excludable, non-rival good. Thus the firm does not have to pay for it.
\[ F_L(K, L, A) \equiv \frac{\partial F(K, L, A)}{\partial L} > 0, \qquad F_{LL}(K, L, A) \equiv \frac{\partial^2 F(K, L, A)}{\partial L^2} < 0. \]
(2.2)
2.1.2
Endowments
Let us imagine that all factors of production are owned by households. In particular, households own all of the labor, which they supply inelastically. If there is population growth, this
can be thought of as existing households becoming larger, or new households being born.
For our purposes here this does not matter. The households also own the capital stock of
the economy, and we take their initial holdings of capital, K (0), as given (as part of the description of the environment), and this will determine the initial condition of the dynamical
system we will be analyzing. For now how this initial capital stock is distributed among the
L(t),K(t)
(2.3)
(2.4)
(2.5)
and
\[ \lim_{L \to 0} F_L(K, L, A) = \infty \quad \text{and} \quad \lim_{L \to \infty} F_L(K, L, A) = 0 \quad \text{for all } K > 0 \text{ and all } A. \]
2.1.3
Finally, we can write the law of motion of the capital stock of the economy. Recall that K
depreciates exponentially at the rate $\delta$, so that the law of motion of the capital stock is given
by
\[ K(t+1) = (1-\delta) K(t) + I(t), \tag{2.6} \]
where I (t) is investment at time t. From national income accounting for a closed economy,
we have
\[ Y(t) = C(t) + I(t) + G(t), \tag{2.7} \]
where C (t) is consumption and G (t) is government spending. For now, we take $G(t) = 0$,
so that national income is divided between consumption and investment. Therefore, using
(2.1), (2.6) and (2.7), feasible dynamic allocations in this economy would have to satisfy
\[ K(t+1) \le F[K(t), L(t), A(t)] + (1-\delta) K(t) - C(t). \]
The question is to determine the equilibrium dynamic allocation among the set of feasible
dynamic allocations. Here the behavioral rule of the constant savings rate simplifies the
structure of equilibrium considerably. It is important that the constant savings rate is a
(2.8)
(2.9)
Thus combining (2.1), (2.6) and (2.8), we have the key dynamic (difference) equation
of the Solow growth model:
\[ K(t+1) = sF[K(t), L(t), A(t)] + (1-\delta) K(t). \tag{2.10} \]
In the Solow growth model, the equilibrium is essentially described by this equation
together with laws of motion for L (t) and A (t).
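To see how the key difference equation generates an entire path, the following is a minimal sketch that iterates it forward. It rests on assumptions that are not in the text: a Cobb-Douglas form F(K, L, A) = A K^alpha L^(1-alpha), constant L and A, and purely illustrative parameter values. The function name simulate_solow is hypothetical.

```python
def simulate_solow(K0, s, delta, A=1.0, L=1.0, alpha=1/3, periods=200):
    """Iterate K(t+1) = s*F[K(t), L, A] + (1 - delta)*K(t).

    Assumption (not from the text): F(K, L, A) = A * K**alpha * L**(1 - alpha),
    with L and A held constant over time.
    """
    K = K0
    path = []
    for t in range(periods + 1):
        Y = A * K**alpha * L**(1 - alpha)   # output
        I = s * Y                            # investment equals savings, s*Y
        C = Y - I                            # consumption is the remainder
        path.append({"t": t, "K": K, "Y": Y, "C": C, "I": I})
        K = I + (1 - delta) * K              # law of motion for capital
    return path


path = simulate_solow(K0=0.5, s=0.2, delta=0.1)
print(path[0]["K"], path[-1]["K"])
```

With these illustrative numbers the capital stock rises monotonically from 0.5 and settles near (sA/delta)^(1/(1-alpha)), the point at which investment just offsets depreciation.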
2.1.4
Definition of Equilibrium
The Solow model is a mixture of an old-style Keynesian model and a modern dynamic macroeconomic model. Households do not optimize when it comes to their savings/consumption
decisions. Instead, their behavior is captured by a behavioral rule. But firms maximize and
2.1.5
We can make more progress by exploiting the constant returns to scale nature of the production function. To do this, let us make some further assumptions:
1. Let us assume that population is constant and individuals supply labor inelastically,
so that L (t) = L.
2. Let us also assume that there is no technological progress, so that A (t) = A.
We will relax these assumptions later. For now, let us define the capital-labor ratio of
the economy as
\[ k(t) \equiv \frac{K(t)}{L}. \]
Then using the constant returns to scale assumption we have that income per capita, $y(t) \equiv Y(t)/L$, is given by
\[ y(t) = F\left[\frac{K(t)}{L}, 1, A\right] \equiv f(k(t)). \]
(2.11)
(2.12)
The fact that both of these factor prices are positive follows from Assumption 1, which
imposed that the first derivatives of F with respect to capital and labor are always positive
(with more general production functions, zero factor prices are possible over certain ranges).
Given this, we can divide both sides of (2.10) by L and obtain a simpler difference
equation
\[ k(t+1) = s f(k(t)) + (1-\delta) k(t). \tag{2.13} \]
Since this difference equation is derived from (2.10), it also can be referred to as the equilibrium difference equation of the Solow model, in that it describes the equilibrium behavior
of the key object of the model, the capital-labor ratio, and the other equilibrium quantities
can be obtained from the capital-labor ratio k (t).
At this point, we can also define a steady-state equilibrium for this model without technological progress and population growth.
Definition 3 A steady-state equilibrium without technological progress and population growth
is an equilibrium path in which $k(t) = k^*$ for all t.
In other words, in the steady-state equilibrium the capital-labor ratio remains constant.
Most of the models we will analyze in this course will admit a steady state equilibrium, and
typically the economy will tend to this steady state equilibrium over time (but often never
reach it in finite time). This is also the case for this simple model.
\[ \frac{f(k^*)}{k^*} = \frac{\delta}{s}. \tag{2.14} \]
An alternative visual representation of the steady state is to view it as the intersection between a ray through the origin with slope $\delta$ (representing the function $\delta k$) and the function
$s f(k)$. The next figure shows this picture, which is also useful in seeing the level of consumption and investment in a single figure.
This establishes:
Proposition 2 Consider the basic Solow growth model and suppose that Assumptions 1 and
2 hold. Then there exists a unique steady state where the capital-labor ratio is equal to
(2.15)
\[ c^* = (1-s) f(k^*). \tag{2.16} \]
Proof.
The preceding argument establishes that (2.14) is a steady state, i.e., a zero
of the difference equation (2.13). To establish existence, note that from Assumption 2,
$\lim_{k \to 0} f(k)/k = \infty$ and $\lim_{k \to \infty} f(k)/k = 0$. Moreover, $f(k)/k$ is continuous from Assumption 1, so there exists $k^*$ such that (2.14) is satisfied. To see uniqueness, differentiate
$f(k)/k$ with respect to $k$, which gives
\[ \frac{\partial [f(k)/k]}{\partial k} = \frac{f'(k) k - f(k)}{k^2} = -\frac{w}{k^2} < 0, \tag{2.17} \]
where the last equality uses (2.12). Since $f(k)/k$ is everywhere decreasing, there can only
exist a unique value $k^*$ that satisfies (2.14).
Equations (2.15) and (2.16) then follow by definition.
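The existence and uniqueness argument also suggests a simple way to compute the steady state numerically: since f(k)/k is continuous and strictly decreasing, bisection on the condition f(k)/k = delta/s finds the unique root. The sketch below is illustrative only. The function name, the bracket [lo, hi], and the Cobb-Douglas example f(k) = k^alpha are assumptions, and the formula y* = f(k*) is the natural reading of the steady-state output level rather than a quotation from the text.

```python
def steady_state(f, s, delta, lo=1e-9, hi=1e9, tol=1e-12):
    """Solve f(k)/k = delta/s by bisection.

    Relies on f(k)/k being continuous and strictly decreasing, so that the
    bracket [lo, hi] (an assumed, very wide interval) contains exactly one root.
    """
    target = delta / s
    for _ in range(200):
        mid = (lo * hi) ** 0.5       # geometric midpoint: k spans many magnitudes
        if f(mid) / mid > target:
            lo = mid                 # average product too high: k* lies to the right
        else:
            hi = mid                 # average product too low: k* lies to the left
        if hi - lo < tol * hi:
            break
    return 0.5 * (lo + hi)


alpha, s, delta = 1/3, 0.2, 0.1          # illustrative values only
k_star = steady_state(lambda k: k**alpha, s, delta)
y_star = k_star**alpha                   # steady-state output per capita, f(k*)
c_star = (1 - s) * y_star                # steady-state consumption, (1 - s) f(k*)
print(k_star, y_star, c_star)
```

For the Cobb-Douglas example the result can be checked against the closed form k* = (s/delta)^(1/(1-alpha)).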
So far the model is very parsimonious, and does not have many parameters. But what we
are most interested in is to understand how cross-country differences in certain parameters
translate into differences in growth rates or income levels. This will be done in the next
proposition. But before doing so, let us generalize the production function in one simple
way, and assume that
\[ f(k) = a \tilde{f}(k), \]
so that a is a shift parameter, with greater values corresponding to greater productivity of
factors. This type of productivity is referred to as Hicks-neutral as we will see below, but
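\[ \frac{\partial y^*(a, s, \delta)}{\partial a} > 0, \quad \frac{\partial y^*(a, s, \delta)}{\partial s} > 0 \quad \text{and} \quad \frac{\partial y^*(a, s, \delta)}{\partial \delta} < 0. \]
\[ \frac{\tilde{f}(k^*)}{k^*} = \frac{\delta}{as}, \]
which holds for an open set of values of $k^*$. Now apply the implicit function theorem to
obtain the results. For example,
\[ \frac{\partial k^*}{\partial s} = \frac{\delta (k^*)^2}{s^2 w^*} > 0, \]
where $w^* = f(k^*) - k^* f'(k^*) > 0$. The other results follow similarly.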
Therefore, countries with higher savings rates and better technologies will have higher
capital-labor ratios and will be richer. Those with greater (technological) depreciation, will
tend to have lower capital-labor ratios and will be poorer. All of the results in Proposition 3
are intuitive, and start giving us a sense of some important determinants of the capital-labor
ratios and income levels across countries.
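These comparative statics can be checked numerically. The sketch below assumes, purely for illustration, a Cobb-Douglas form f(k) = a k^alpha, for which the steady state has the closed form k* = (a s/delta)^(1/(1-alpha)). It compares the implicit-function-theorem expression delta (k*)^2 / (s^2 w*) for the derivative of k* with respect to s against a central finite difference. The function name and parameter values are hypothetical.

```python
def k_steady(a, s, delta, alpha=1/3):
    """Steady-state capital-labor ratio under the assumed f(k) = a * k**alpha."""
    return (a * s / delta) ** (1 / (1 - alpha))


a, s, delta, alpha = 1.0, 0.2, 0.1, 1/3      # illustrative values only
k = k_steady(a, s, delta, alpha)

# Steady-state wage w* = f(k*) - k* f'(k*), which equals (1 - alpha) * a * k**alpha here.
w = a * k**alpha - k * (alpha * a * k**(alpha - 1))

# Derivative of k* with respect to s from the implicit function theorem.
analytic = delta * k**2 / (s**2 * w)

# Central finite-difference approximation of the same derivative.
h = 1e-6
numeric = (k_steady(a, s + h, delta, alpha) - k_steady(a, s - h, delta, alpha)) / (2 * h)

print(analytic, numeric)
```

The same finite-difference approach can be applied to a and delta to confirm the signs of the remaining derivatives.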
The same comparative statics with respect to $a$ and $\delta$ immediately apply to $c^*$ as well.
However, it is straightforward to see that $c^*$ will not be monotonic in the savings rate (think,
for $s_{gold}$, with the corresponding steady state capital level $k^*_{gold}$
such that
\[ f'(k^*_{gold}) = \delta. \]
In other words, there exists a unique savings rate and the corresponding capital-labor
ratio which will maximize steady-state consumption. This is shown in the next figure with
Below this savings rate, the society has too low a capital-labor ratio to maximize consumption, and above this rate, the capital-labor ratio is too high, i.e., individuals are investing too much and not consuming enough. This is the essence of what people refer to as
dynamic inefficiency, which we will encounter in greater detail in models of overlapping generations. However, recall that there is no explicit utility function here, so statements about
inefficiency have to be considered with caution and skepticism. In fact, the reason why
such dynamic inefficiency will not arise once we endogenize consumption-saving decisions of
individuals will be apparent to many of you already.
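The consumption-maximizing savings rate can be located with a simple grid search. The sketch below assumes f(k) = k^alpha (a Cobb-Douglas form, not imposed in the text), so that the steady state is k*(s) = (s/delta)^(1/(1-alpha)) and steady-state consumption is (1 - s) f(k*(s)). The grid resolution and parameter values are arbitrary illustrative choices.

```python
alpha, delta = 1/3, 0.1                     # illustrative values only


def steady_state_consumption(s):
    """c*(s) = (1 - s) * f(k*(s)) under the assumed f(k) = k**alpha."""
    k = (s / delta) ** (1 / (1 - alpha))
    return (1 - s) * k**alpha


# Search savings rates 0.001, 0.002, ..., 0.999 for the consumption maximizer.
grid = [i / 1000 for i in range(1, 1000)]
s_gold = max(grid, key=steady_state_consumption)

k_gold = (s_gold / delta) ** (1 / (1 - alpha))
marginal_product = alpha * k_gold ** (alpha - 1)   # f'(k_gold)

print(s_gold, k_gold, marginal_product)
```

Under this functional form the maximizer coincides (up to the grid step) with alpha, and the marginal product of capital at the corresponding capital-labor ratio is approximately delta, as the golden rule condition requires.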
2.1.6
Proposition 2 establishes a unique steady state equilibrium. Recall, however, that an equilibrium path does not refer simply to the steady state but to the entire path of capital stock,
(2.18)
have $x(t) \to x^*$. Moreover, $x^*$ is globally asymptotically stable if for all $x(0) \in \mathbb{R}^n$, for
(2.19)
solution $\{x(t)\}_{t=0}^{\infty}$ satisfies $x(t) \to x^*$ where $x^*$ is the steady state (zero) of the difference
equation given by $Ax^* = x^*$.
The proof of this theorem can be found in any textbook on dynamical systems, for example, David Luenberger, Introduction to Dynamic Systems: Theory, Models, and Applications,
John Wiley & Sons, 1979, and a version of it for differential equations is in Carl Simon and
Lawrence Blume, Mathematics for Economists, Norton, 1994.
Next let us return to the nonlinear autonomous system (2.18). Unfortunately, much
less can be said about nonlinear systems, but the following is a standard local stability result.
Theorem 3 Consider the following nonlinear autonomous system
\[ x(t+1) = F[x(t)] \tag{2.20} \]
where $F: \mathbb{R}^n \to \mathbb{R}^n$ and suppose that $F$ is continuously differentiable, with initial value $x(0)$.
Let $x^*$ be a zero of this system, i.e., $F(x^*) = x^*$. Define
\[ A \equiv \nabla F(x^*), \]
and suppose that all of the eigenvalues of $A$ are strictly inside the unit circle. Then the
difference equation (2.20) is locally asymptotically stable, in the sense that there exists an
open neighborhood of $x^*$, $B(x^*) \subset \mathbb{R}^n$, such that starting from any $x(0) \in B(x^*)$, we have
$x(t) \to x^*$.
(2.21)
with a unique zero at $k^*$. Now recall that $f(\cdot)$ is concave from Assumption 1 and satisfies
$f(0) = 0$ from Assumption 2. For any strictly concave function, we have that
\[ f(k) > f(0) + k f'(k) = k f'(k), \tag{2.22} \]
where the equality uses the fact that $f(0) = 0$. Now linearizing (2.21) around $k^*$, we have
\[ k(t+1) - k^* \simeq [s f'(k^*) + (1-\delta)]\,(k(t) - k^*). \]
Since from (2.14), $\delta k^* = s f(k^*)$, (2.22) implies that $\delta = s f(k^*)/k^* > s f'(k^*)$, and thus
$[s f'(k^*) + (1-\delta)] \in (0, 1)$, establishing local asymptotic stability for the Solow model from
Corollary 1.
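The local stability condition is easy to verify numerically. The sketch below assumes f(k) = k^alpha and illustrative parameter values (neither is from the text). It computes the linearization coefficient s f'(k*) + (1 - delta) and compares it with the actual one-period contraction of a small deviation from the steady state.

```python
alpha, s, delta = 1/3, 0.2, 0.1             # illustrative values only

# Steady state of k(t+1) = s*f(k(t)) + (1 - delta)*k(t) under the assumed f(k) = k**alpha.
k_star = (s / delta) ** (1 / (1 - alpha))

# Linearization coefficient s*f'(k*) + (1 - delta).
slope = s * alpha * k_star ** (alpha - 1) + (1 - delta)


def step(k):
    """One period of the per-capita difference equation."""
    return s * k**alpha + (1 - delta) * k


# Start 1% above the steady state and measure how much of the gap survives one period.
k0 = 1.01 * k_star
ratio = (step(k0) - k_star) / (k0 - k_star)

print(slope, ratio)
```

Both numbers lie strictly between 0 and 1 and nearly coincide, so deviations from the steady state shrink geometrically at roughly the rate given by the linearization.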
$\{k(t)\}_{t=0}^{\infty}$ always approaches $k^*$, thus must be globally stable.
This stability result is easier to see diagrammatically, which is shown in the next figure.
The following corollary is then immediate:
Corollary 2 Suppose that Assumptions 1 and 2 hold, and $k(0) < k^*$, then $\{w(t)\}_{t=0}^{\infty}$ is
results apply.
Intuitively, if the economy starts with too little capital relative to its labor supply, there
will be capital deepening (capital accumulation relative to labor), and as a result the marginal product of capital will fall given the diminishing returns to capital feature embedded in
Assumption 1, and the wage rate will increase. Conversely, if it starts with too much capital,
it will decumulate capital, and in the process the wage rate will decline and the rate of return
to capital will increase. The next figure shows this process diagrammatically, emphasizing
that the trade-off is between the replacement of the capital stock per effective labor due to
depreciation (and perhaps population growth and technological change) and the capital to
effective labor ratio:
Therefore, the Solow growth model has a number of nice properties: a unique steady state,
asymptotic stability, and simple and intuitive comparative statics.
So far, however, it has no growth. The steady state is the point at which there is no
growth in the capital-labor ratio, no more capital deepening, and no growth in income per
capita. The Solow model typically incorporates economic growth by allowing technological
change. Before doing this, however, it is useful to look at the mapping between discrete time
and continuous time.
2.2
2.2.1
Recall from the discussion above that the time periods could refer to days, weeks, months or
years. In some sense, the time unit is not important. This suggests that perhaps it may be
more convenient to look at dynamics by making the time unit as small as possible, i.e., by
going to continuous time. The continuous time setup in general has a number of advantages,
since some pathological results of discrete time disappear in continuous time (see Problem
Set 1). Moreover, especially in the presence of uncertainty, continuous time models have
more flexibility both in doing dynamics and for providing explicit form solutions. For us,
they are useful particularly because a lot of growth theory is cast in continuous time.
Let us start with a simple difference equation
\[ x(t+1) - x(t) = g(x(t)). \tag{2.23} \]
This equation states that between time $t$ and $t+1$, the absolute growth in $x$ is given by
$g(x(t))$. Let us now consider the following approximation
\[ x(t + \Delta t) - x(t) \simeq \Delta t \cdot g(x(t)), \]
for any $\Delta t \in [0, 1]$. When $\Delta t = 0$, this equation is just an identity. When $\Delta t = 1$, it gives
(2.23). In between it is a linear approximation, which should not be too bad if the distance
between $t$ and $t+1$ is not very large, so that $g(x) \simeq g(x(t))$ for all $x \in [x(t), x(t+1)]$
(however, you should also convince yourself that this approximation could in fact be quite
bad if you take a very nonlinear function $g$, for which the behavior changes significantly
between $x(t)$ and $x(t+1)$). Now divide both sides of this equation by $\Delta t$, and take limits to obtain
\[ \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t} = \dot{x}(t) \simeq g(x(t)) \]
as a differential equation representing the same dynamics as the difference equation (2.23)
for the case in which the distance between $t$ and $t+1$ is small. Recall that here $\dot{x}(t)$ denotes
the time derivative $\partial x(t)/\partial t$.
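A short numerical experiment illustrates how the step-by-step approximation approaches the continuous-time solution as the time unit shrinks. The example uses the linear function g(x) = 0.05 x, chosen only because the differential equation then has the known solution x(t) = x(0) exp(0.05 t). The function name advance and all numbers are illustrative assumptions.

```python
import math


def advance(x0, g, dt, horizon=1.0):
    """Apply x(t + dt) = x(t) + dt * g(x(t)) repeatedly over [0, horizon]."""
    steps = round(horizon / dt)
    x = x0
    for _ in range(steps):
        x += dt * g(x)
    return x


def g(x):
    """Illustrative growth function: proportional growth at rate 0.05."""
    return 0.05 * x


exact = math.exp(0.05)                      # continuous-time value of x(1) when x(0) = 1
for dt in (1.0, 0.1, 0.001):
    approx = advance(1.0, g, dt)
    print(f"dt = {dt}: x(1) = {approx:.6f}, gap to continuous time = {exact - approx:.2e}")
```

With a time step of 1 this is exactly the difference equation, and the gap to the continuous-time value shrinks as the step gets smaller.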
2.2.2
We can now repeat all of the analysis so far using the continuous time representation. Nothing
has changed on the production side, so we continue to have (2.4) and (2.5) as the factor prices,
but now these refer to instantaneous rental rates (i.e., w (t) is the flow of wages that the
worker receives for an instant etc.).
Savings are again given by
S (t) = sY (t) ,
while consumption is given by (2.9) above.
Also, let us now introduce population growth into this model, and assume that the labor
force L (t) grows proportionally, i.e.,
L (t) = exp (nt) L (0) .
(2.24)
The purpose of doing so is that in many of the classical analyses of economic growth, population growth plays an important role, so it is useful to see how it affects things here. We
are not introducing technological progress yet, which will be done below.
\[ k(t) \equiv \frac{K(t)}{L(t)}, \]
(2.25)
Therefore we have:
Definition 5 In the basic Solow model in continuous time with population growth at the
rate n, no technological progress and an initial capital stock K (0), an equilibrium path
is a sequence of capital stocks, labor, output levels, consumption levels, wages and rental
rates $[K(t), L(t), Y(t), C(t), w(t), R(t)]_{t=0}^{\infty}$ such that $K(t)$ satisfies (2.25), $L(t)$ satisfies
(2.24), Y (t) is given by (2.1), C (t) is given by (2.9), and w (t) and R (t) are given by (2.4)
and (2.5).
As before, a steady-state equilibrium involves $k(t)$ remaining constant, and we
will refer to the steady-state equilibrium capital-labor ratio as $k^*$.
(2.26)
In other words, going from discrete to continuous time has not changed any of the basic
economic features of the model, and again the steady state can be plotted in the familiar
figure used above (now with the population growth rate featuring in there as well):
We immediately obtain:
Proposition 6 Consider the basic Solow growth model in continuous time and suppose that
Assumptions 1 and 2 hold. Then there exists a unique steady state equilibrium where the
capital-labor ratio is equal to $k^* \in (0, \infty)$ and is given by (2.26), per capita output is given
\[ f(k) = a \tilde{f}(k). \]
Then we have
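Proposition 7 Suppose Assumptions 1 and 2 hold and $f(k) = a\tilde{f}(k)$. Denote the steady-state equilibrium level of the capital-labor ratio by $k^*(a, s, \delta, n)$ and the steady-state level of
output by $y^*(a, s, \delta, n)$ when the underlying parameters are given by $a$, $s$, $\delta$ and $n$. Then we
have
\[ \frac{\partial k^*(a, s, \delta, n)}{\partial a} > 0, \quad \frac{\partial k^*(a, s, \delta, n)}{\partial s} > 0, \quad \frac{\partial k^*(a, s, \delta, n)}{\partial \delta} < 0 \quad \text{and} \quad \frac{\partial k^*(a, s, \delta, n)}{\partial n} < 0, \]
\[ \frac{\partial y^*(a, s, \delta, n)}{\partial a} > 0, \quad \frac{\partial y^*(a, s, \delta, n)}{\partial s} > 0, \quad \frac{\partial y^*(a, s, \delta, n)}{\partial \delta} < 0 \quad \text{and} \quad \frac{\partial y^*(a, s, \delta, n)}{\partial n} < 0. \]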
The new result relative to the earlier comparative static proposition is that now a higher
population growth rate, n, also reduces the capital-labor ratio and income per capita. The
reason for this is simple. A higher population growth rate means there is more labor to
use the existing amount of capital, which only accumulates slowly, and consequently the
equilibrium capital-labor ratio ends up lower. This result implies that countries with higher
population growth rates will have lower incomes per person (or per worker).
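The effect of population growth can be illustrated with a few numbers. The sketch below assumes that the steady-state condition takes the form s f(k*) = (n + delta) k*, together with a Cobb-Douglas form f(k) = a k^alpha, which is consistent with the Cobb-Douglas example discussed later in this section. The function name and all parameter values are hypothetical.

```python
def steady_state(a, s, delta, n, alpha=1/3):
    """Return (k*, y*) assuming s * a * k**alpha = (n + delta) * k."""
    k = (a * s / (n + delta)) ** (1 / (1 - alpha))
    return k, a * k**alpha


# Vary only the population growth rate, holding the other parameters fixed.
results = {}
for n in (0.00, 0.01, 0.02, 0.03):
    results[n] = steady_state(a=1.0, s=0.2, delta=0.05, n=n)
    k, y = results[n]
    print(f"n = {n:.2f}: k* = {k:.3f}, y* = {y:.3f}")
```

Both the capital-labor ratio and output per capita fall as n rises, in line with the comparative statics above.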
The stability analysis is also unchanged. To do this in detail, we simply need to remember
the equivalents of the above theorems for differential equations. In particular we have:
(2.27)
with initial value $x(0)$, where $x(t) \in \mathbb{R}^n$ for all $t$ and $A$ is an $n \times n$ matrix. Suppose that
all of the eigenvalues of $A$ have negative real parts. Then the differential equation (2.27) is
asymptotically stable, in the sense that starting from any $x(0) \in \mathbb{R}^n$, $x(t) \to x^*$ where $x^*$ is
the steady state (zero) of the system given by $Ax^* = 0$.
Theorem 5 Consider the following nonlinear autonomous differential equation
\[ \dot{x}(t) = F[x(t)] \tag{2.28} \]
K (t) =
\[ k^* = \left( \frac{sA}{n+\delta} \right)^{\frac{1}{1-\alpha}}, \]
which is a very nice and simple interpretable form for the steady-state capital-labor ratio.
Transitional dynamics are also straightforward in this case. In particular, we have:
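\[ \dot{k}(t) = sA[k(t)]^{\alpha} - (n+\delta) k(t) \]
with initial condition $k(0)$. To solve this equation, let $x(t) \equiv k(t)^{1-\alpha}$, so the equilibrium law
of motion of the capital-labor ratio can be written in terms of $x(t)$ as
\[ \dot{x}(t) = (1-\alpha) sA - (1-\alpha)(n+\delta) x(t), \]
which has the solution
\[ x(t) = \frac{sA}{n+\delta} + \left[ x(0) - \frac{sA}{n+\delta} \right] \exp(-(1-\alpha)(n+\delta) t) \]
or in terms of the capital-labor ratio
\[ k(t) = \left\{ \frac{sA}{n+\delta} + \left[ [k(0)]^{1-\alpha} - \frac{sA}{n+\delta} \right] \exp(-(1-\alpha)(n+\delta) t) \right\}^{\frac{1}{1-\alpha}}. \]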
This solution illustrates that starting from any $k(0)$, the equilibrium $k(t) \to k^* = (sA/(n+\delta))^{1/(1-\alpha)}$,
and in fact, the rate of adjustment is related to $(1-\alpha)(n+\delta)$. This is intuitive: a higher $\alpha$
implies less diminishing returns to capital, which slows down dynamics. Similarly a smaller $\delta$
means less replacement of depreciated capital and a smaller $n$ means slower population
growth, both of those slowing down the adjustment of capital per worker and thus transitional dynamics.
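The closed-form path can be cross-checked against a direct numerical integration of the law of motion for k(t). The sketch below uses a simple Euler scheme with a small step. The function names, the step size, and the parameter values are illustrative assumptions, not part of the original analysis.

```python
import math


def k_closed_form(t, k0, s, A, alpha, n, delta):
    """Closed-form Cobb-Douglas solution, obtained via x(t) = k(t)**(1 - alpha)."""
    x_ss = s * A / (n + delta)
    x_t = x_ss + (k0 ** (1 - alpha) - x_ss) * math.exp(-(1 - alpha) * (n + delta) * t)
    return x_t ** (1 / (1 - alpha))


def k_euler(t, k0, s, A, alpha, n, delta, dt=1e-3):
    """Euler integration of dk/dt = s*A*k**alpha - (n + delta)*k up to time t."""
    k = k0
    for _ in range(round(t / dt)):
        k += dt * (s * A * k**alpha - (n + delta) * k)
    return k


params = dict(k0=1.0, s=0.2, A=1.0, alpha=1/3, n=0.01, delta=0.05)  # illustrative
k_star = (params["s"] * params["A"] / (params["n"] + params["delta"])) ** (1 / (1 - params["alpha"]))

for t in (10, 50):
    print(t, k_closed_form(t, **params), k_euler(t, **params), k_star)
```

The two paths agree closely, and both approach k* at a speed governed by (1 - alpha)(n + delta).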
2.2.3
Before discussing technological progress, it is useful to see how the model we have developed
so far can generate sustained growth (without technological progress). The Cobb-Douglas
example above already shows that when $\alpha$ is close to 1, adjustment of the capital-labor
ratio back to its steady-state level can be very slow. A very slow adjustment towards a
steady state has the flavor of sustained growth rather than the system settling down to a
stationary point quickly.
In fact, the simplest model of sustained growth essentially takes $\alpha = 1$ in terms of the
Cobb-Douglas production function above. To do this, let us relax Assumptions 1 and 2
(which do not allow $\alpha = 1$), and suppose that
\[ F[K(t), L(t), A(t)] = AK(t), \]
(2.29)
(2.30)
but it is simpler to illustrate the main insights with (2.29), leaving the analysis of the richer
production function (2.30) to Problem Set 1.
With this production function, the fundamental law of motion of the capital stock is
given by (again with population growth given by (2.24)):
\[ \frac{\dot{k}(t)}{k(t)} = sA - \delta - n. \]
Therefore, if $sA - \delta - n > 0$, there is sustained growth in the capital-labor ratio, and given
(2.29), there is sustained growth in income per capita. This immediately establishes the
following proposition:
Proposition 9 Consider the Solow growth model with the production function (2.29) and
suppose that $sA - \delta - n > 0$. Then in equilibrium, there is sustained growth of income per
capita at the rate $sA - \delta - n$. In particular, starting with a capital-labor ratio $k(0) > 0$, the
economy has
\[ k(t) = \exp((sA - \delta - n) t)\, k(0) \]
and
\[ y(t) = \exp((sA - \delta - n) t)\, A k(0). \]
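A brief numerical illustration of this result follows. It takes the growth rate to be sA - delta - n, evaluates the closed-form path at several dates, and confirms that the growth rate of income per capita is the same at every date. The function name and parameter values are hypothetical.

```python
import math


def ak_path(t, k0, s, A, delta, n):
    """Return (k(t), y(t)) for the AK economy with growth rate s*A - delta - n."""
    growth = s * A - delta - n
    k = math.exp(growth * t) * k0
    return k, A * k


s, A, delta, n, k0 = 0.2, 0.5, 0.05, 0.01, 1.0   # illustrative values only
growth = s * A - delta - n

# Log difference of income per capita over one unit of time, at various dates.
rates = []
for t in (0, 10, 100):
    _, y_now = ak_path(t, k0, s, A, delta, n)
    _, y_next = ak_path(t + 1, k0, s, A, delta, n)
    rates.append(math.log(y_next / y_now))

print(growth, rates)
```

The measured growth rate equals sA - delta - n regardless of the date or the level of capital.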
This proposition not only establishes the possibility of endogenous growth, but also shows
that in this simplest form, there are no transitional dynamics. The economy always grows
2.3
2.3.1
The models analyzed so far did not feature technological progress. We now introduce changes
in A (t) to capture improvements in the technological know-how of the economy. There is
little doubt that what human societies know how to produce, and how efficiently they can produce
it, has progressed tremendously over the past 200 years, and even more tremendously
over the past 1000 or 10,000 years. An attractive way of introducing economic growth is
to allow technological progress. The question is how to do this. At some level we will see
that the production function F [K (t) , L (t) , A (t)] is too general to achieve our objective. In
[Figure: labor and capital shares of national income, 1929 to 1994.]
Despite fairly large fluctuations, there is no trend. This and the relative constancy of
capital-output ratios until the 1970s have made many economists prefer models with balanced
growth to those without. (Since the 1970s capital-output ratios may or may not be constant
depending on how you measure them). Also for future reference, note that the capital share
in national income is about 1/3, while the labor share is about 2/3. We are ignoring the share
of land here as we did in the analysis so far: land is not a major factor of production. This
2.3.2
What are some convenient special forms of the general production function F [K (t) , L (t) , A (t)]?
First we could have
F [K (t) , L (t) , A (t)] = A (t) F [K (t) , L (t)] ,
so that technological progress simply multiplies output. This is known as Hicks-neutral
technological progress. Intuitively, in this case if we think of the isoquants in the L-K
space, technological progress simply corresponds to a relabeling of the isoquants (without
any change in their shape).
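To illustrate the relabeling of isoquants numerically, the sketch below assumes a Cobb-Douglas inner function (the notes leave it general) and checks that the slope of an isoquant, the marginal rate of substitution between labor and capital, does not depend on $A$. The function names and numbers are hypothetical.

```python
def output(K, L, A, alpha=1/3):
    """Hicks-neutral technology: A multiplies an (assumed) Cobb-Douglas F(K, L)."""
    return A * K**alpha * L**(1 - alpha)

def mrs(K, L, A, h=1e-6):
    """Numerical marginal rate of substitution F_L / F_K via central differences."""
    F_K = (output(K + h, L, A) - output(K - h, L, A)) / (2 * h)
    F_L = (output(K, L + h, A) - output(K, L - h, A)) / (2 * h)
    return F_L / F_K

# Doubling A changes the output level attached to each isoquant,
# but leaves its slope at any given (L, K) point unchanged.
print(mrs(2.0, 3.0, A=1.0), mrs(2.0, 3.0, A=2.0))
```

For this assumed functional form the slope is $\frac{1-\alpha}{\alpha}\frac{K}{L}$, which contains no $A$ term.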
It turns out that, although all of these forms of technological progress look equally
plausible ex ante, balanced growth forces us to one of these types of neutral technological
progress. In particular, balanced growth necessitates that all technological progress be labor
augmenting or Harrod-neutral. This is a very surprising result, and it is also somewhat
troubling, since we have no idea why technological progress should take this form. We now
state and prove the relevant theorem here.
2.3.3
A version of the following theorem was first proved by Uzawa in 1961. For simplicity and
without loss of any generality, let us focus on continuous time models. The key elements of
balanced growth, as suggested by the discussion above, are the constancy of factor shares
and the constancy of the capital-output ratio, K (t) /Y (t). Since there is only labor and
capital in this model, by factor shares, we mean
\[ \alpha_L(t) \equiv \frac{w(t) L(t)}{Y(t)} \quad \text{and} \quad \alpha_K(t) \equiv \frac{R(t) K(t)}{Y(t)}. \]
log y (t)
log (k (t) /y (t))
1
log K(t)
log Y (t)
=
=
=
log F [K(t),L(t),A(t)]
log K(t)
FK [K(t),L(t),A(t)]K(t)
F [K(t),L(t),A(t)]
K (t)
,
1 K (t)
(x ) dx
from the inverse function theorem, (x ) is invertible in the neighborhood of x , with inverse
denoted by 1 (y/A)
56
y (t) 1 y (t)
k (t)
=
A (t)
A (t)
A (t)
y (t)
= f 1
A (t)
or
\[ \frac{y(t)}{A(t)} = f\left(\frac{k(t)}{A(t)}\right), \]
and thus
\[ Y(t) = A(t) L(t)\, f\left(\frac{K(t)}{A(t) L(t)}\right), \]
AL (t) L (t)
Y (t)
= exp ((gH + gK ) t) F 1,
K (t)
AK (t) K (t)
L (t)
.
exp ((gH + gK ) t) f exp ((gL gK ) t)
K (t)
Now we also have
\[ \frac{\dot{K}(t)}{K(t)} = s\,\frac{Y(t)}{K(t)} - \delta, \]
and in steady state, according to the hypotheses of the theorem, we have $Y(t)/K(t)$ constant, so $\dot{K}(t)/K(t) = g$, i.e., capital grows at the same rate as total output. Combined
2.3.4
Now we are ready to analyze the Solow growth model with technological progress. I will
only present the analysis for continuous time (the discrete time case is equivalent). From
Theorem 7, we know that the production function must take the form
F [K (t) , A (t) L (t)] ,
with purely labor-augmenting technological progress asymptotically. For simplicity, let us
assume that it takes this form throughout. Moreover, suppose that there is technological
progress at the rate g, i.e.,
\[ \frac{\dot{A}(t)}{A(t)} = g, \]
(2.31)
(2.32)
The simplest way of analyzing this economy is again to express everything in terms of
a normalized variable. Since effective units of labor are given by $A(t) L(t)$, and $F$ exhibits
constant returns to scale in its two arguments (by virtue of exhibiting constant returns to
scale in capital and labor), we can define
\[ k(t) \equiv \frac{K(t)}{A(t) L(t)}. \tag{2.33} \]
\[ \hat{y}(t) = F\left[\frac{K(t)}{A(t) L(t)}, 1\right] \equiv f(k(t)). \tag{2.34} \]
(2.35)
which is very similar to the law of motion of the capital-labor ratio in the continuous time
model, (2.25).
An equilibrium in this model is defined similarly to before. Consequently, we have:
Proposition 10 Consider the basic Solow growth model in continuous time, with Harrod-neutral technological progress at the rate $g$ and population growth at the rate $n$. Suppose that
Assumptions 1 and 2 hold, and define the effective capital-labor ratio as in (2.33). Then
there exists a unique steady state equilibrium where the effective capital-labor ratio is equal
to $k^* \in (0, \infty)$ and is given by
\[ \frac{f(k^*)}{k^*} = \frac{\delta + g + n}{s}. \]
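and also
\[ \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial A(0)} > 0, \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial s} > 0, \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial n} < 0 \quad \text{and} \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial \delta} < 0, \]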
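The steady-state condition $f(k^*)/k^* = (\delta + g + n)/s$ and the signs of these derivatives can be checked numerically. The sketch below assumes the Cobb-Douglas case $f(k) = k^{\alpha}$, which the proposition does not require, and treats steady-state income per capita as $A(0)\exp(gt) f(k^*)$. Function names and parameter values are hypothetical.

```python
import math

def steady_state_k(s, delta, g, n, alpha=1/3):
    """Effective capital-labor ratio solving f(k)/k = (delta + g + n)/s
    for the assumed Cobb-Douglas case f(k) = k**alpha."""
    return (s / (delta + g + n)) ** (1 / (1 - alpha))

def income_per_capita(A0, s, delta, g, n, t, alpha=1/3):
    """Steady-state path of income per capita: A(0) exp(g t) f(k*)."""
    k_star = steady_state_k(s, delta, g, n, alpha)
    return A0 * math.exp(g * t) * k_star**alpha

# Hypothetical baseline parameters, for illustration only.
base = dict(A0=1.0, s=0.2, delta=0.05, g=0.02, n=0.01, t=10.0)
y_base = income_per_capita(**base)

# Raise one argument at a time and compare with the baseline.
for name, bump in [("A0", 1.1), ("s", 0.25), ("n", 0.02), ("delta", 0.06)]:
    y_new = income_per_capita(**{**base, name: bump})
    print(name, "raises income" if y_new > y_base else "lowers income")
```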
Chapter 3
The Solow Model and the Data
One of the important uses of the aggregate production function approach and the basic
Solow model is that they provide us with a simple vehicle to look at the data, both at
growth over time and income-level differences (and growth rate differences) across countries.
I start here with over-time changes, i.e., growth accounting, and then will move to the more
important application for the purposes of this course, which involves looking at cross-country
differences.
3.1
Growth Accounting
Let us go back to the most general form of the aggregate production function given by (2.1),
whereby
Y (t) = F [K (t) , L (t) , A (t)] .
and
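Recalling the definition of factor shares above, and denoting $g \equiv \dot{Y}/Y$, $g_K \equiv \dot{K}/K$
\[ \frac{F_A A}{Y} \frac{\dot{A}}{A} \]
\[ \bar{\alpha}_{K,t,t+1} = \frac{\alpha_{K,t} + \alpha_{K,t+1}}{2} \]
and
$\bar{\alpha}_{L,t,t+1}$ is defined similarly.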
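A minimal sketch of the discrete-time calculation follows. It assumes the residual takes the form $x = g - \bar{\alpha}_K g_K - \bar{\alpha}_L g_L$, the same form as the levels-accounting equation used later in this chapter, with factor shares averaged over the two dates. The function names and the numbers are hypothetical, not data from Solow's study.

```python
def average_share(share_t, share_t1):
    """Average of a factor share at dates t and t+1."""
    return 0.5 * (share_t + share_t1)

def tfp_growth(g_Y, g_K, g_L, aK_t, aK_t1, aL_t, aL_t1):
    """Growth accounting residual: output growth minus share-weighted input growth."""
    a_K = average_share(aK_t, aK_t1)
    a_L = average_share(aL_t, aL_t1)
    return g_Y - a_K * g_K - a_L * g_L

# Hypothetical growth rates and factor shares, for illustration only.
x = tfp_growth(g_Y=0.03, g_K=0.03, g_L=0.01,
               aK_t=0.33, aK_t1=0.35, aL_t=0.67, aL_t1=0.65)
print(x)  # part of output growth not accounted for by capital and labor
```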
Applying this method, Solow found that much of economic growth over the 20th century
was due to technological progress. This has been a landmark finding, focusing the attention
of economists on sources of technology differences over time, across nations, across industries
and across firms.
Since then, many economists, most notably Dale Jorgenson, have attempted to reduce
the amount due to the residual technology by adjusting for the quality of labor and capital
inputs. This is still an active research area, partly because there are conceptual issues about
how far one should go in adjusting the quality of inputs. For example, better computers
can translate into more capital, reducing the TFP residual, but at the end of the day better
computers are a result of better technology. We will return to these issues again below.
3.2
We are now in a position to take the basic Solow model to the data. The simplest way
of doing this is to follow the approach of Mankiw, Romer and Weil (1992). These authors
basically estimated a cross country regression inspired by the above model. However, a
basic estimation which does not take human capital into account proved to be inadequate.
Therefore, Mankiw, Romer and Weil (1992) used an augmented Solow model that also incorporates
human capital. I first develop this model briefly, and then look at the empirical evidence.
Since our purpose here is to look at cross-country income differences, from the beginning, I
present the model for a cross-section of countries.
Here already there is a major (and at some level very problematic) assumption, adopted
by many authors, among them Mankiw, Romer and Weil (1992), Barro (1991) and much of
Barro and Sala-i-Martin (2004), which is that the world consists of a cross-section of countries
which do not interact. In other words, these countries do not trade financial assets or goods,
and there is no slow diffusion of technology across them. These countries inhabit the
world, but they are all islands unto themselves. I start with this case of no interdependence,
but interdependences arising from technology flows and international trade will be discussed
below.
3.2.1
(3.1)
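\[ k_j \equiv \frac{K_j}{A_j L_j} \quad \text{and} \quad h_j \equiv \frac{H_j}{A_j L_j}, \tag{3.2} \]
\[ \dot{k}_j = s_{kj} \frac{y_j}{A_j} - (n_j + g + \delta_k)\, k_j \tag{3.3} \]
\[ \dot{h}_j = s_{hj} \frac{y_j}{A_j} - (n_j + g + \delta_h)\, h_j. \tag{3.4} \]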
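As in our baseline models, in steady state, both $k_j$ and $h_j$ have to be constant. Thus
setting $\dot{k}_j = 0$ and $\dot{h}_j = 0$ in (3.3) and (3.4) and solving yields the following steady-state
values of physical capital and human capital ratios to effective labor:
\[ k_j^* = \left[ \left(\frac{s_{kj}}{n_j + g + \delta_k}\right)^{1-\beta} \left(\frac{s_{hj}}{n_j + g + \delta_h}\right)^{\beta} \right]^{\frac{1}{1-\alpha-\beta}} \]
\[ h_j^* = \left[ \left(\frac{s_{kj}}{n_j + g + \delta_k}\right)^{\alpha} \left(\frac{s_{hj}}{n_j + g + \delta_h}\right)^{1-\alpha} \right]^{\frac{1}{1-\alpha-\beta}}. \]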
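These steady-state expressions can be verified numerically. The sketch below assumes the standard Mankiw-Romer-Weil technology, under which output per effective unit of labor is $k^{\alpha} h^{\beta}$ (an assumption here, since the production function itself is not reproduced above), and checks that investment equals effective depreciation for both stocks. Names and parameter values are hypothetical.

```python
def steady_state(s_k, s_h, n, g, delta_k, delta_h, alpha=1/3, beta=1/3):
    """Steady-state physical and human capital per effective unit of labor."""
    a = s_k / (n + g + delta_k)
    b = s_h / (n + g + delta_h)
    power = 1 / (1 - alpha - beta)
    k_star = (a**(1 - beta) * b**beta) ** power
    h_star = (a**alpha * b**(1 - alpha)) ** power
    return k_star, h_star

# Hypothetical parameters, for illustration only.
s_k, s_h, n, g, d_k, d_h = 0.2, 0.1, 0.01, 0.02, 0.05, 0.05
k_star, h_star = steady_state(s_k, s_h, n, g, d_k, d_h)

# Assumed output per effective unit of labor: k**alpha * h**beta.
y_eff = k_star**(1/3) * h_star**(1/3)

# Both laws of motion should be (numerically) zero at the steady state.
k_dot = s_k * y_eff - (n + g + d_k) * k_star
h_dot = s_h * y_eff - (n + g + d_h) * h_star
print(k_dot, h_dot)
```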
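\[ \ln y_j^* = \ln A_j + gt + \frac{\alpha}{1-\alpha-\beta} \ln\left(\frac{s_{kj}}{n_j + g + \delta_k}\right) + \frac{\beta}{1-\alpha-\beta} \ln\left(\frac{s_{hj}}{n_j + g + \delta_h}\right) \tag{3.5} \]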
This is an equation which can be estimated using cross-country data if we have measures of
shj . In addition, we can use investment rates (investments/GDP) for skj , population growth
However, with all of these assumptions, equation (3.5) can still not be estimated, because
the term ln Aj is unobserved to the econometrician, and could be correlated with all of
the other right hand side variables. Therefore implicitly, Mankiw, Romer and Weil make
another crucial assumption, considerably stronger than the common technology advances
assumption:
With these assumptions, Mankiw, Romer and Weil estimate equation (3.5). The estimation is a success for the augmented-Solow model. If human capital is not included, the
fit is not very good and the estimates are not reasonable. This is shown in the next table.
Without human capital, the coefficient in front of the investment/GDP ratio should be
$\alpha/(1-\alpha)$, thus the estimate suggests $\alpha \approx 0.6$, which is far too high bearing in mind that
given the factor distribution of income we expect the exponent of capital in the production
function to be closer to 1/3.
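The mapping between the regression coefficient and the implied capital exponent is simple enough to spell out. In the sketch below, the coefficient value 1.5 is not taken from the table; it is just the value that corresponds to $\alpha = 0.6$ under the formula $\alpha/(1-\alpha)$. The function names are hypothetical.

```python
def implied_alpha(coef):
    """Invert coef = alpha / (1 - alpha) to recover the capital exponent."""
    return coef / (1 + coef)

def implied_coef(alpha):
    """Coefficient on the log investment rate predicted by a given alpha."""
    return alpha / (1 - alpha)

print(implied_alpha(1.5))   # alpha implied by a coefficient of 1.5
print(implied_coef(1 / 3))  # coefficient predicted by a capital share of 1/3
```

A capital share of one-third would therefore predict a coefficient near 0.5, about a third of the value that corresponds to $\alpha = 0.6$.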
But for the augmented model with human capital, the fit is very good as shown in the
next table. Now the parameter estimates imply $\alpha \approx 1/3$, $\beta \approx 1/3$ and $R^2 \approx 0.78$.
At face value, these results provide strong support for the augmented Solow model. The
estimate of $\alpha$ is consistent with a capital share of one-third in national income, and the $R^2$
implies that almost 80 percent of the differences in income per capita can be explained by
investment decisions (human and physical capital differences).
3.2.2
But there are two major (and related) problems with this approach:
1. The orthogonal technology assumption is too strong. When Aj varies across countries,
= 0.313
(0.027)
In other words, the correlation between income and schooling is too strong relative to
what we should expect on the basis of micro evidence. In particular, the effect of schooling
on income is much larger than the 6-10 percent difference expected.
This result is not simply explained by the fact that interest rates vary across countries.
Notice that we can write $r = (1-\alpha)\, Y/K$, so including the (log) capital-output ratio would
be one way to control for interest rate differences. In this regression, the log capital-output ratio
should have a coefficient of $(1-\alpha)/\alpha$, approximately 0.5 taking $\alpha$ as 2/3. Running this
regression with 1985 data, we obtain
log Y
= 0.266
+ 0.408
(0.033)
(0.178)
log
K
Y
So, there is still a very large effect of education on income, and the quantitative effect of
capital (as a proxy for interest rates) is plausible.
This relationship between education and income may reflect human capital externalities.
For example, we might have the productivity term, $A$, as a function of average human capital
in the economy. In this case, the rate of return to human capital in the Mincer regressions
would only reflect the private return, that is, the increase in the individual's wage as a
3.2.3
A related approach is to use calibration/levels accounting rather than regression analysis and
make use of the findings of Mincer (micro wage) regressions. This is the approach first taken
by Bils and Klenow, and then by Klenow and Rodriguez-Clare and Hall and Jones. The advantage
of the calibration approach is that the omitted variable bias underlying the estimates of
Mankiw, Romer and Weil will be less important (since microlevel evidence is being used to
anchor the contribution of human capital). The disadvantage is that certain assumptions on
functional forms have to be taken much more seriously, and we explicitly have to assume no
human capital externalities.
Here let me follow Hall and Jones. Consider the following production function
\[ Y_j = K_j^{1-\alpha} (A_j H_j)^{\alpha} \tag{3.6} \]
with $H_j$ interpreted as efficiency units of labor. Assume the following Mincer-type relation
X
E
where $\phi(E)$ is the rate of return to $E$ years of schooling and $L_j(E)$ is the number of individuals in country $j$ with $E$ years of schooling. We can use different values for $\phi(E)$ and
construct alternative estimates of $H_j$. Hall and Jones (1999) use a piecewise linear specification for $\phi(E)$ based on work by Psacharopoulos from less developed countries (showing
returns to earlier years of schooling that are greater than to higher education). Once we have
a series for $H_j$ and one for $K_j$, which can be constructed using standard perpetual inventory
methods, we can construct predicted incomes, for example, as
\[ \hat{Y}_j = K_j^{1/3} \left( A_t^{US} H_j \right)^{2/3} \]
and compare these predicted incomes with actual incomes.
Alternatively, we could back out country-specific technology terms (relative to the U.S.)
as
\[ \frac{A_t^j}{A_t^{US}} = \left( \frac{Y_t^j}{Y_t^{US}} \right)^{3/2} \left( \frac{K_t^{US}}{K_t^j} \right)^{1/2} \frac{H_t^{US}}{H_t^j} \]
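The following sketch shows how this ratio could be computed, and checks that it recovers a known technology gap when output is generated from the production function with a capital exponent of 1/3. All numbers are hypothetical and are not Hall and Jones's data; the function names are assumptions.

```python
def production(K, A, H):
    """Y = K**(1/3) * (A*H)**(2/3), the calibrated form used in the text."""
    return K**(1 / 3) * (A * H)**(2 / 3)

def relative_technology(Y_j, K_j, H_j, Y_us, K_us, H_us):
    """Back out A_j / A_US from output, capital and human capital."""
    return (Y_j / Y_us)**1.5 * (K_us / K_j)**0.5 * (H_us / H_j)

# Hypothetical economies: the "US" with A = 1 and country j with A = 0.5.
K_us, H_us, A_us = 8.0, 1.0, 1.0
K_j, H_j, A_j = 1.0, 0.5, 0.5
Y_us = production(K_us, A_us, H_us)
Y_j = production(K_j, A_j, H_j)

print(relative_technology(Y_j, K_j, H_j, Y_us, K_us, H_us))  # recovers A_j / A_US
```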
Hall and Jones perform this exercise using output per worker rather than income per
capita. They find:
1. Differences in physical and human capital still matter a lot, accounting for as much as
50 percent of the actual differences in output per worker.
2. But there are also significant productivity differences.
The conclusion of this calibration exercise is therefore very similar to the one that followed
from the regression analysis presented in the previous section.
Naturally, some of the assumptions of this calibration exercise can be relaxed. For
example, instead of assuming a Cobb-Douglas production function, one could do levels
accounting. Essentially, rank the countries according to their capital-labor ratio (or
capital-output ratio), and then use the equivalent of the growth accounting equation above.
In particular, we can write
\[ x_{j,j+1} = g_{j,j+1} - \bar{\alpha}_{K,j,j+1}\, g_{K,j,j+1} - \bar{\alpha}_{L,j,j+1}\, g_{L,j,j+1}, \]
3.3
\[ X_j^f = \pi_j^f V_j^f - s_j \sum_{i=1}^{N} \pi_i^f V_i^f \tag{3.7} \]
where $V_j^f$ is the endowment of factor $f$ in country $j$, $\pi_j^f$ is the factor productivity of factor $f$
in country $j$, and $s_j$ is the share of country $j$ in world consumption (this uses the assumption
that all countries have the same homothetic preferences). $N$ is the total number of countries.
Given estimates of the net export of factor contents, the $X_j^f$'s, equation (3.7) solves for
a unique sequence of $\pi_j^f$'s taking one of the countries as the base. So from this equation we
can obtain an estimate of the differences in factor productivities. At this level, this may be
viewed simply as an untested strong hypothesis.
The major contribution of Trefler's paper is to note that if there is factor price equalization, we should also have
\[ \frac{w_j^f}{\pi_j^f} = \frac{w_{j'}^f}{\pi_{j'}^f} \tag{3.8} \]
for any pair of countries, $j$ and $j'$, where $w_j^f$ is the price of factor $f$ in country $j$. With data
on factor prices, we can therefore construct alternative series for the $\pi_j^f$'s. It turns out that the
series implied by (3.7) and (3.8) are very similar, so there appears to be some validity to this
approach. The following figure shows his estimates:
Given this validation, we can presume that there is some information in the numbers
that Trefler obtains. These numbers imply that there are very large differences in labor
productivity, and some substantial, but much smaller differences in capital productivity. For
example, labor in Pakistan is 1/25th as productive as labor in the United States. In contrast,
capital productivity differences are much more limited than labor productivity differences.
For example, capital in Pakistan is only half as productive as capital in the United States.
Chapter 4
Fundamental Determinants of
Differences in Income
4.1
The use of the Solow model and the production function approach illustrated how cross-country
income differences can be understood as resulting from physical capital differences,
human capital differences and technology differences. These technology differences, themselves, may represent actual differences in the technologies used by countries, or other efficiency differences in the use of the factors. At this level, the framework we have does a very
good job of helping us understand the proximate causes of income differences. The same
procedure also helps us understand the proximate causes of the process of economic growth.
However, the observation that a country is poorer than another because it has worse
technology, less physical capital and less human capital immediately poses the next question:
why does it have worse technology, less physical capital, less human capital? This question
4.2
Hypotheses
Why do some countries invest more in physical and human capital and possess better technologies? There are four sets of broad hypotheses:
1. Luck: some countries just turned out to be lucky. It is difficult to operationalize this
approach, and at some level, it is quite similar to the other hypotheses, but less specific
(one way of operationalizing it may be by using the multiple equilibrium models we
will discuss below).
A version of this hypothesis where such differences are transitory is clearly not supported by the evidence presented so far, which points to very persistent differences
over long periods.
A version of this hypothesis where a small difference caused by luck may lead to large
persistent differences is also difficult to reconcile with the data given the reversal documented above. So I will place less emphasis on the importance of luck. Nevertheless,
some of the theories presented below will show how small differences in initial conditions
2. Geography: This view has become very popular recently. It claims that differences
in economic performance reflect, to a large extent, differences in geographic, climatic
and ecological characteristics across countries. The most common is the view that
climate has a direct effect on income through its influence on work effort. This idea
dates back to Machiavelli and Montesquieu. Alfred Marshall (1890) similarly wrote:
"vigor depends partly on race qualities: but these, so far as they can be explained
at all, seem to be chiefly due to climate." Gunnar Myrdal (1968): "climate exerts
everywhere a powerful influence on all forms of life, and that serious study of the
problems of underdevelopment... should take into account the climate and its impacts
on soil, vegetation, animals, humans and physical assets, in short, on living conditions
in economic development."
The recent bestseller by Jared Diamond, Guns, Germs and Steel, suggests that the
timing of the Neolithic revolution has had a long lasting effect by determining which
societies were the first ones to develop strong armies and technology. For example,
he states that: "...proximate factors behind Europe's conquest of the Americas were
the differences in all aspects of technology. These differences stemmed ultimately from
Eurasia's much longer history of densely populated... societies dependent on food production" (1997, p. 358). Diamond argues that differences in the nature and history of
food production, in turn, are due to the types of crops, domesticated animals, and the
axis of agricultural technology diffusion in different continents, all of which are geographically determined characteristics. In economics circles, Jeff Sachs has been
pushing for this view. He argues that "Certain parts of the world are geographically
Can we say anything about the relative importance of geography, institutions and culture? Measures of each are strongly correlated with income per capita or other determinants
of income. This is borne out both by growth regressions, and level regressions.
For example, returning to growth regressions of the type (1.2), the variables in X that
enter significantly can be interpreted as determinants of cross-country differences in growth.
There is a very large literature on regressions of this sort. These regression analyses find a
variety of variables to be important in explaining growth. First, investment rates in physical
and human capital are found to be important. But this does not inform us much about
the ultimate sources of differences in economic performance, since differences in physical and
4.3
As discussed above, in Acemoglu, Johnson and Robinson (2002), we looked at the horserace
between geography and institutions. The geography explanation predicts persistence in
income, since the geographic, ecological and climatic factors that should matter are changing
only little over periods as long as 500 years. Although the institutions view also suggests
persistence, a major shock could disrupt persistence, or even create a reversal.
In this context, the expansion of European overseas empire provides a natural experiment. Europeans affected the institutions of many societies through their colonization.
Based on these three premises, we use the mortality rates expected by the first European
settlers in the colonies as an instrument for current institutions in these countries.
Summarizing this schematically:
\[ \text{(potential) settler mortality} \Rightarrow \text{settlements} \Rightarrow \text{early institutions} \Rightarrow \text{current institutions} \Rightarrow \text{current performance} \]
The results show a large effect of institutions on income, and generate no evidence that
geography matters. The following two figures summarize most of the findings. The first
shows the cross-sectional relationship between income per capita and a measure of economic
institutions, protection against expropriation risk. This is one of many potential variables
capturing the institutional features of a country that can be used. Its advantage is that it is
directly about protection of property rights, thus intimately related to economic incentives
that are highlighted by the institutions approach.
[Figure: income per capita plotted against average protection against expropriation risk, 1985-95, by country.]
The second shows the first-stage relationship between log (potential) settler mortality
and protection against expropriation risk (so that higher scores correspond to better protection against expropriation by government or elites, or generally to better property rights
protection), and the third shows the reduced form between income per capita and settler
mortality. The latter two figures together give the two-stage least squares estimate of the
effect of broad economic institutions on long-run income per capita differences.
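With a single instrument, the two-stage least squares estimate equals the reduced-form slope divided by the first-stage slope. The sketch below illustrates this arithmetic on a handful of made-up numbers; they are purely hypothetical and are not the Acemoglu, Johnson and Robinson data. The function names are assumptions.

```python
def ols_slope(x, y):
    """Slope from a bivariate least squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

def iv_estimate(z, x, y):
    """Single-instrument IV: reduced-form slope over first-stage slope."""
    first_stage = ols_slope(z, x)    # institutions on log settler mortality
    reduced_form = ols_slope(z, y)   # log income on log settler mortality
    return reduced_form / first_stage

# Hypothetical values: z = log settler mortality, x = institutions score,
# y = log income per capita.
z = [2.0, 3.0, 4.0, 5.0, 6.0]
x = [9.0, 8.0, 6.5, 6.0, 4.5]
y = [10.0, 9.2, 8.1, 7.9, 6.8]

print(ols_slope(z, x), ols_slope(z, y), iv_estimate(z, x, y))
```

In this made-up sample both slopes are negative, so their ratio, the implied effect of institutions on income, is positive.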
[Figure: average protection against expropriation risk plotted against log of settler mortality, by country.]
[Figure: income per capita plotted against log of settler mortality, by country.]
Acemoglu, Johnson and Robinson (2001) conduct a variety of checks to show that this
relationship is robust, and likely due to the institutional channel (but like all instrumental
variable strategies, there is always the possibility that the instrument is not excludable).
Part II
Neoclassical Growth
Chapter 5
Towards Neoclassical Growth
At this point, let us take a step back. The entire Solow growth model was predicated on a
constant savings rate. Instead, it would be much more satisfactory to specify the preference
orderings of individuals as in standard general equilibrium theory and go from there. To
prepare for this, let us consider an economy consisting of a unit measure of infinitely-lived
households. These households can be truly infinitely lived, or could consist of overlapping
generations with full (or partial) altruism linking generations within the household. Then
the problem would be one in which each household i has an instantaneous utility function
given by
ui (ci (t))
where $u_i : \mathbb{R} \to \mathbb{R}$ is increasing and concave and $c_i(t)$ is the consumption of household $i$;
this means that the individual does not derive any utility from the consumption of other
households, so consumption externalities are ruled out. Throughout, we will assume that
individuals discount the future proportionally (also referred to as exponentially), so that
\[ \sum_{t=0}^{\infty} \beta_i^t\, u_i(c_i(t)), \]
where $\beta_i \in (0,1)$ is the discount factor of household $i$. In addition, we can have differences in
households' income processes, for example, for each household we could have effective labor
5.1
Representative Consumer
Instead of the more general framework mentioned above, we will look at economies that admit
a representative consumer. What this means is that we will think that the preference side of
the economy can be represented as if there were a single consumer making the consumption
and saving decisions (and labor supply decisions when these are endogenized).
One way of having a representative consumer is to assume that each household has the
same utility function
u (ci (t))
where $u : \mathbb{R} \to \mathbb{R}$ is increasing and concave and $c_i(t)$ is the consumption of household $i$,
and also the same discount factor $\beta$, and the same sequence of effective labor endowments
$\{h(t)\}_{t=0}^{\infty}$. The advantage of this approach is that the economy indeed has a representative
\[ \sum_{t=0}^{\infty} \beta^t\, u(c(t)), \]
where $\beta \in (0,1)$ is the common discount factor of all the households, and $c(t)$ is the consumption level of the representative household.
This is an extremely convenient assumption, though as the next theorem shows, most
models do not admit representative consumers:
Theorem 8 (Debreu-Mantel-Sonnenschein) Consider an exchange economy with a finite number $N < \infty$ of commodities and $H < \infty$ households, each with potentially different
preferences. Let $p$ be the vector of prices and $x(p)$ be the vector of aggregate excess demands
for these commodities at the price vector $p$. For $\varepsilon > 0$, let $P_{\varepsilon} = \{ p \in \mathbb{R}^N_+ : p_j/p_{j'} \geq \varepsilon \text{ for all } j \text{ and } j' \}$.
Then for any $\varepsilon > 0$, any continuous function $x : P_{\varepsilon} \to \mathbb{R}^N_+$ that satisfies Walras' Law and homogeneity of degree 0 can be an aggregate excess demand function.
Proof. See Debreu (1974) or Mas-Colell, Whinston and Green (1995), Proposition 17.E.3.
\[ \sum_{i=1}^{H} a_i(p) + b(p)\, y, \]
where $y \equiv \sum_{i=1}^{H} y_i$ is aggregate income.
Proof. The proof follows from basic micro theory, and is left to you as an exercise.
Therefore, when there is a special form of quasi-linearity in the preferences, aggregating
them to have representation for a representative consumer is possible.
In this context, it is interesting to consider the CRRA (constant relative risk aversion)
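\[ U = \begin{cases} \sum_{t=0}^{\infty} \beta^t\, \dfrac{c(t)^{1-\theta} - 1}{1-\theta} & \text{if } \theta \neq 1 \text{ and } \theta \geq 0 \\[2ex] \sum_{t=0}^{\infty} \beta^t \ln c(t) & \text{if } \theta = 1 \end{cases}, \]
where $\theta$ is the coefficient of relative risk aversion and also the inverse of the intertemporal
elasticity of substitution, which regulates how willing individuals are to substitute consumption over time.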
This class of utility functions satisfies the conditions of Theorem 9. We will see below that
CRRA preferences have a special role in models of economic growth, because they are the
unique class of utility functions that are consistent with balanced growth. Therefore, if we
wish to impose balanced growth, the assumption that the economy admits a representative
consumer is not as restrictive as in models in which we wish to analyze growth without
making the balanced growth assumption.
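The CRRA family above is easy to evaluate numerically, and doing so shows why the logarithmic case is treated as its limit when the risk aversion coefficient approaches one. The sketch below writes that coefficient as `theta`; the function names and the consumption path are hypothetical.

```python
import math

def crra(c, theta):
    """Per-period CRRA utility; the log case is the limit as theta -> 1."""
    if theta == 1:
        return math.log(c)
    return (c**(1 - theta) - 1) / (1 - theta)

def lifetime_utility(consumption, beta, theta):
    """Discounted sum of CRRA utilities over a finite consumption path."""
    return sum(beta**t * crra(c, theta) for t, c in enumerate(consumption))

# As theta approaches 1, the general formula approaches ln(c).
print(crra(2.0, 1.000001), math.log(2.0))

# Hypothetical three-period consumption path.
print(lifetime_utility([1.0, 1.1, 1.2], beta=0.95, theta=2.0))
```

Subtracting 1 in the numerator is what makes the limit well defined; it does not change the ranking of consumption paths.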
5.2
Problem Formulation
Let us now make the representative consumer assumption. Suppose that each household's
utility function in discrete time starting at time $t = 0$ is (ignoring uncertainty)
\[ \sum_{t=0}^{\infty} \beta^t\, u(c(t)), \tag{5.1} \]
(5.2)
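\[ v(T) \equiv \exp\left[ \lim_{\Delta t \to 0} \ln (1 + \Delta t \cdot r)^{-T/\Delta t} \right] = \exp\left[ \lim_{\Delta t \to 0} \frac{-T \ln (1 + \Delta t \cdot r)}{\Delta t} \right] \]
However, the term in square brackets has a limit of the form 0/0. Let us next write this as
\[ \lim_{\Delta t \to 0} \frac{-\ln (1 + \Delta t \cdot r)}{\Delta t / T} = \lim_{\Delta t \to 0} \frac{-r / (1 + \Delta t \cdot r)}{1/T} = -rT \]
where I used l'Hopital's rule to obtain the first equality, and then took the limits in the
numerator and denominator to obtain the second equality. Therefore,
\[ v(T) = \exp(-rT). \]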
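The limit can also be seen numerically: discounting over ever shorter periods of length $\Delta t$ converges to exponential discounting. The sketch below is only an illustration with hypothetical values of $r$ and $T$; the function name is an assumption.

```python
import math

def discrete_discount(r, T, dt):
    """Value at time 0 of one unit at time T, discounting T/dt periods at rate dt*r."""
    return (1 + dt * r) ** (-T / dt)

r, T = 0.05, 10.0
for dt in (1.0, 0.1, 0.01, 0.0001):
    # The gap to exp(-rT) shrinks as the period length goes to zero.
    print(dt, discrete_discount(r, T, dt), math.exp(-r * T))
```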
5.3
Welfare Theorems
Suppose to obtain a contradiction that there exists $(p, q, x)$ which Pareto dominates
$(p^*, q^*, x^*)$. Then it must be the case that for all households $i \in I$, $x_i$ is weakly preferred to
$x_i^*$, i.e.,
\[ x_i \succsim_i x_i^* \]
and for at least one $i' \in I$, the new allocation is strictly preferred to $x_{i'}^*$, i.e.,
\[ x_{i'} \succ_{i'} x_{i'}^*. \]
Since $(p^*, q^*, x^*)$ is a competitive equilibrium, it must be the case that for all $i \in I$,
\[ p^* \cdot x_i \geq y_i(p^*) \tag{5.3} \]
where $y_i(p^*)$ is the income of household $i$ at price vector $p^*$ defined above. Suppose not.
We know that by non-satiation $p^* \cdot x_i^* = y_i(p^*)$, then if $p^* \cdot x_i < y_i(p^*)$, household $i$ could
choose more of each commodity, i.e., $x_i + \varepsilon$ for $\varepsilon$ small enough, and again by non-satiation
reach higher utility than that given by $x_i^*$. This would contradict the hypothesis that $x_i^*$ is
utility maximizing at the price vector $p^*$.
(5.4)
p xi >
yi (p ) =
iI
iI
iI
since
p i +
p i +
X
f
X
f
if p qf
p qf
we have that
X
f
p xi >
p qf
X
iI
X
f
p qf
p i +
X
f
p qf
(5.5)
!
X
X
X
i +
xi =
qf ,
iI
iI
5.4
\[ \max_{\{c(t),k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t\, u(c(t)) \]
subject to
\[ k(t+1) = f(k(t)) - c(t) + (1-\delta)\, k(t), \tag{5.6} \]
\[ \max_{\{c(t),k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t\, u(c(t)) \]
subject to
\[ a(t+1) = r(t)\, a(t) - c(t) + w(t), \tag{5.7} \]
given $a(0)$, where $a(t)$ denotes the assets of the household at time $t$ and $r(t)$ is the rate
of return on assets and $w(t)$ is wage income. The constraint (5.7) is the flow budget
constraint, meaning that it links tomorrow's assets to today's assets. Here we need an
additional condition so that this flow budget constraint eventually converges (i.e., so that
$a(t)$ should not go to negative infinity). This can be ensured by imposing a lifetime budget
constraint, but the flow budget constraint is often more convenient to work with, so we need
to augment it with another condition as we will see later.
5.5
The formulation of the optimal growth problem in continuous time is very similar. In particular, we have
max
[c(t),k(t)]t=0
subject to
\[ \dot{k}(t) = f(k(t)) - c(t) - \delta k(t) \tag{5.8} \]
$k(t) \geq 0$ and given $k(0)$. Once again, this problem lacks one boundary condition which will
come from the transversality condition.
The most convenient way of characterizing the solution to this problem is via optimal
control.
We next discuss dynamic programming and optimal control briefly.
Chapter 6
Here I provide a very brief overview of infinite horizon optimization in discrete time, in
particular of stationary dynamic programming. I also include some technical details, which
are not essential for the purposes of this course, but may be useful for those of you who want
to understand some of the tools better.
6.1
Using abstract but simple notation, the canonical dynamic optimization program in discrete
time can be written as
Problem A1
\[ v(x_0) = \sup_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, x_{t+1}) \]
subject to
\[ x_{t+1} \in \Gamma(x_t), \quad \text{for all } t \geq 0 \]
\[ x_0 \text{ given.} \]
where $x_t \in X \subset \mathbb{R}^K$ for some $K \geq 1$. In many economic applications, we will have $K = 1$,
so that $x_t \in \mathbb{R}$. Here I used sup rather than max, since there is no guarantee that the
maximal value is attained by any feasible plan.
Here $F$ is the payoff function, depending on $x_t$, which is the state variable, and $x_{t+1}$,
which corresponds to the control variable. In this simple formulation, $x_{t+1}$ will also directly
become the state variable in the next time period.
The constraint on the problem is written as
\[ x_{t+1} \in \Gamma(x_t) \]
where
\[ \Gamma : X \rightrightarrows X \]
is a correspondence determining what type of $x_{t+1}$ is allowed given the state variable $x_t$.
Notice that this problem is stationary in the sense that the payoff function $F$ is not
time-dependent. It only depends on $x_t$ and $x_{t+1}$.
F (x0 , x1 , ...),
then because there is no discounted structure, dynamic programming could not be used (at
least in its simplest form). Moreover, it can be noted that problems that do not have an
exponential discounted structure pose another problem for us: they are not time-consistent,
in the sense that the original plan that maximizes the initial objective function is not necessarily what an individual would like to stick to if he or she is carrying out the optimization
period by period. Time consistency is both a very natural property and one that makes
the mathematical analysis much simpler. In many ways, it is also the essence of dynamic
programming.
For concreteness, let us recall the optimal growth problem from above:
\[ \max_{\{c(t),k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t\, u(c(t)) \]
subject to
\[ k(t+1) = f(k(t)) - c(t) + (1-\delta)\, k(t), \]
$k(t) \geq 0$ and given $k(0)$. To map this problem into the form here, let $x_t = k(t)$ and
$x_{t+1} = k(t+1)$. Then use the constraint to write:
\[ c(t) = f(k(t)) - k(t+1) + (1-\delta)\, k(t), \]
\[ \max_{\{c(t),k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t\, u\big(f(k(t)) - k(t+1) + (1-\delta)\, k(t)\big) \]
sequence $\{x_t\}_{t=0}^{\infty}$ from some (vector) space of infinite sequences (for example, $\{x_t\}_{t=0}^{\infty} \in L^{\infty}$,
where $L^{\infty}$ is the vector space of infinite sequences that are bounded with the $\|\cdot\|_{\infty}$ norm,
which I will denote throughout by the simpler notation $\|\cdot\|$). Such problems sometimes have
nice features, but often are difficult to characterize both analytically and numerically.
The basic idea of dynamic programming is to turn the sequence problem into a functional
equation, i.e., one of finding a function rather than a sequence. This often gives better
economic insights, similar to the logic of comparing today to tomorrow. It is also often easier
to characterize analytically or numerically. In this particular case, the relevant functional
equation can be written as
Problem A2
$$v(x) = \sup_{y \in \Gamma(x)} \left[ F(x, y) + \beta v(y) \right], \quad \text{for all } x \in X. \tag{6.1}$$
In fact, this form of the problem suggests itself naturally from the formulation of Problem
A1. Suppose Problem A1 has a maximum with optimal sequence denoted by $\{x_t^*\}_{t=0}^{\infty}$ starting
with $x_0$. Then by definition,
$$\begin{aligned}
v^*(x_0) &= \sum_{t=0}^{\infty} \beta^t F(x_t^*, x_{t+1}^*) \\
&= F(x_0, x_1^*) + \beta \sum_{j=0}^{\infty} \beta^j F(x_{j+1}^*, x_{j+2}^*) \\
&= F(x_0, x_1^*) + \beta v^*(x_1^*).
\end{aligned}$$
(6.1). Second, because the function v is defined recursively, in the sense that it is on the
right hand side of (6.1) as well, this is often referred to as the recursive formulation.
What makes this formulation useful is that the solution will often be a time-invariant
policy function, $g : X \to X$, determining what value of $x_{t+1}$ to choose for a given value of
the state variable $x_t$. [In general, there are two complications: first, a control reaching the
6.2
6.2.1
We say that $(S, \rho)$ is a metric space, if $S$ is a space and $\rho$ is a metric defined over this space
with the usual properties (loosely corresponding to distance between elements of $S$).
Definition 6 Let $(S, \rho)$ be a metric space and $T : S \to S$ be an operator mapping $S$ into
$$\frac{|Tx - Ty|}{|x - y|} \le \beta < 1, \quad \text{all } x, y \in S \text{ with } x \ne y.$$
n = 1, 2, ...
(6.2)
m1 + ... + n+1 + n ( 1 , 0 )
= n mn1 + ... + + 1 ( 1 , 0 )
n
( 1 , 0 ),
1
where the first line uses the triangle inequality (which is true by definition for any metric),
and the second line uses (6.2).
The last line implies that as n, m , m and n are getting closer, so { n }
n=0 is a
Cauchy sequence. Since S is complete, this establishes that
n S.
Now note that for any 0 S and any n N, we have
(T , ) (T , T n 0 ) + (T n 0 , )
(
, T n1 0 ) + (T n 0 , ),
where the first line again uses the triangle inequality, and the second line the definition of
the contraction. The above argument shows that both of the terms on the right tend to zero
as n , which implies that (T , ) = 0, establishing that T = , thus a fixed point
exists.
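The logic of the proof can be checked numerically on a toy example. The sketch below uses an assumed map $T(z) = 0.5z + 1$ on the real line with the absolute-value distance (neither is from the text). It iterates $T$ from an arbitrary starting point and compares the error after $n$ steps with the geometric bound $\beta^n/(1-\beta)$ times the distance between the first two iterates.

```python
BETA = 0.5  # modulus of the illustrative contraction

def T(z):
    """|T(a) - T(b)| = BETA * |a - b|, so T is a contraction on the real line."""
    return BETA * z + 1.0

def iterate_to_fixed_point(operator, z0, tol=1e-12, max_iter=10_000):
    """Apply `operator` repeatedly until successive iterates differ by less than tol."""
    z = z0
    for n in range(1, max_iter + 1):
        z_next = operator(z)
        if abs(z_next - z) < tol:
            return z_next, n
        z = z_next
    raise RuntimeError("no convergence within max_iter")

def nth_iterate(operator, z0, n):
    """Return the n-th iterate of `operator` starting from z0."""
    z = z0
    for _ in range(n):
        z = operator(z)
    return z

def error_bound(n, z0):
    """Geometric bound on the distance between the n-th iterate and the fixed point."""
    return BETA ** n / (1.0 - BETA) * abs(T(z0) - z0)

Z0 = 10.0
fixed_point, n_iter = iterate_to_fixed_point(T, Z0)
```

Because the bound shrinks geometrically, the number of iterations needed grows only logarithmically in the required accuracy.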
6.2.2
Let us now apply the above tools to the problem of dynamic programming, outlined at the
beginning. Consider a sequence $\{x_{t+1}^*\}_{t=0}^{\infty}$ which attains the supremum of Problem A1. We
will now show that this sequence satisfies the recursive equation of dynamic programming
$$v(x_t^*) = F(x_t^*, x_{t+1}^*) + \beta v(x_{t+1}^*), \quad \text{for all } t = 0, 1, 2, \ldots, \tag{6.3}$$
and moreover, under some boundedness conditions, any sequence that is a solution to (6.3)
is a solution to Problem A1, in the sense that it attains its supremum. In other words, we
will establish some equivalence results between the solutions to Problem A1 and Problem
A2.
To prepare for these results, let us define the set of feasible sequences or plans starting
with initial value $x_0$:
$$\Pi(x_0) = \{\{x_{t+1}\}_{t=0}^{\infty} : x_{t+1} \in \Gamma(x_t),\ t = 0, 1, \ldots\}.$$
Let us denote a typical element of the set by $\mathbf{x} = (x_0, x_1, \ldots) \in \Pi(x_0)$, and assume:
Assumption 3 $\Gamma(x)$ is nonempty for all $x \in X$; and for all $x_0 \in X$ and $\mathbf{x} \in \Pi(x_0)$,
$\lim_{n \to \infty} \sum_{t=0}^{n} \beta^t F(x_t, x_{t+1})$ exists.
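Thus $v^*(x_0)$ is the supremum in Problem A1 (i.e., the value of the program in Problem A1).
Note that it follows by definition that $v^*$ is the unique function satisfying the following three
conditions for Problem A1, or the sequence problem, SP:
1. if $|v^*(x_0)| < \infty$, then
$$v^*(x_0) \ge u(\mathbf{x}), \quad \text{all } \mathbf{x} \in \Pi(x_0); \tag{6.4}$$
and for any $\varepsilon > 0$,
$$v^*(x_0) \le u(\mathbf{x}) + \varepsilon, \quad \text{some } \mathbf{x} \in \Pi(x_0); \tag{6.5}$$
2. if $v^*(x_0) = +\infty$, then there exists a sequence $\{\mathbf{x}^k\}$ in $\Pi(x_0)$ such that
$\lim_{k \to \infty} u(\mathbf{x}^k) = +\infty$; and
3. if $v^*(x_0) = -\infty$, then $u(\mathbf{x}) = -\infty$, for all $\mathbf{x} \in \Pi(x_0)$.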
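Conversely, we will say that $v$ is a solution to Problem A2 (and thus satisfies the
functional equation (6.3)), if the following three conditions for FE hold:
1. If $|v(x_0)| < \infty$, then
$$v(x_0) \ge F(x_0, y) + \beta v(y), \quad \text{all } y \in \Gamma(x_0), \tag{6.6}$$
and for any $\varepsilon > 0$,
$$v(x_0) \le F(x_0, y) + \beta v(y) + \varepsilon, \quad \text{some } y \in \Gamma(x_0); \tag{6.7}$$
2. if $v(x_0) = +\infty$, then there exists a sequence $\{y^k\}$ in $\Gamma(x_0)$ such that
$$\lim_{k \to \infty} \left[ F(x_0, y^k) + \beta v(y^k) \right] = +\infty; \tag{6.8}$$
3. if $v(x_0) = -\infty$, then
$$F(x_0, y) + \beta v(y) = -\infty, \quad \text{all } y \in \Gamma(x_0). \tag{6.9}$$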
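$$\begin{aligned}
u(\mathbf{x}) &= \lim_{n \to \infty} \sum_{t=0}^{n} \beta^t F(x_t, x_{t+1}) \\
&= F(x_0, x_1) + \beta \lim_{n \to \infty} \sum_{t=0}^{n} \beta^t F(x_{t+1}, x_{t+2}) \\
&= F(x_0, x_1) + \beta u(\mathbf{x}'),
\end{aligned}$$
with $\mathbf{x}' = (x_1, x_2, \ldots)$ denoting the continuation plan.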
This lemma basically says that the utility from any feasible plan can be decomposed into
two parts, the current return and continuation value. It therefore formalizes the principle of
optimality introduced more informally above.
Theorem 15 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Then the function $v^*$ is a solution
to Problem A2.
Since
$$x_1^k \in \Gamma(x_0), \quad \text{all } k,$$
it follows that FE (6.8) holds for the sequence $\{y^k = x_1^k\}$ in $\Gamma(x_0)$. If $v^*(x_0) = -\infty$, then
$$u(\mathbf{x}) = F(x_0, x_1) + \beta u(\mathbf{x}') = -\infty,$$
Hence $v^*(x_1) = -\infty$, all $x_1 \in \Gamma(x_0)$. Since $F$ is real-valued and $\beta > 0$, (6.9) follows immediately.
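Under an additional boundedness condition, we have the following converse to this
theorem:
Theorem 16 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. If $v$ is a solution to Problem A2 and
satisfies
$$\lim_{n \to \infty} \beta^n v(x_n) = 0 \quad \text{for all } \mathbf{x} \in \Pi(x_0) \text{ and all } x_0 \in X, \tag{6.10}$$
then $v = v^*$.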
Proof. (sketch) Condition (6.10) implies that $v$ cannot take on the values $+\infty$ or $-\infty$.
Hence $v$ satisfies (6.6) and (6.7), and it is sufficient to show that this implies $v$ satisfies (6.4)
and (6.5).
Since $v$ is the solution to Problem A2, (6.6) implies that for all $x_0 \in X$ and $\mathbf{x} \in \Pi(x_0)$
$$\begin{aligned}
v(x_0) &\ge F(x_0, x_1) + \beta v(x_1) \\
&\ge F(x_0, x_1) + \beta F(x_1, x_2) + \beta^2 v(x_2) \\
&\ \ \vdots \\
&\ge u_n(\mathbf{x}) + \beta^{n+1} v(x_{n+1}).
\end{aligned}$$
Now taking the limit as $n \to \infty$ and using the convergence property from (6.10), we obtain
(6.4) for any $\mathbf{x} \in \Pi(x_0)$.
n = 1, 2, ...
Since (6.10) implies that for $n$ sufficiently large the second term is also less than $\varepsilon/2$, it
follows that as $n \to \infty$,
$$v(x_0) \le u(\mathbf{x}) + \varepsilon,$$
completing the proof.
An important implication is that although Problem A2 may have many solutions, only
one of those will satisfy the convergence condition (6.10). In general, we can make a lot of
progress by studying solutions to Problem A2, but sometimes we need to impose (6.10) in
order to pick the right solution (this is similar to sometimes working with necessary conditions
for optimization, though of course then we need to impose the sufficiency conditions).
Naturally, our interest is mainly with optimal plans. For this we have:
t = 0, 1, 2, ...
(6.11)
all x (x0 ).
(6.12)
Now choose $x_1 = x_1^*$; (6.12) still holds. Since $(x_1^*, x_2, x_3, \ldots) \in \Pi(x_1^*)$ implies that
$(x_0, x_1^*, x_2, x_3, \ldots) \in \Pi(x_0)$, it follows that
$$u(\mathbf{x}^{*\prime}) \ge u(\mathbf{x}'), \quad \text{all } \mathbf{x}' \in \Pi(x_1^*).$$
Therefore $u(\mathbf{x}^{*\prime}) = v^*(x_1^*)$. Substituting this into (6.12) yields (6.11) for $t = 0$. Continuing
by induction establishes (6.11) for all $t$.
Finally, the converse to this theorem is:
Theorem 18 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Let $\mathbf{x}^* \in \Pi(x_0)$ be a feasible plan
from $x_0$ satisfying (6.11), and with
$$\limsup_{t \to \infty} \beta^t v^*(x_t^*) \le 0. \tag{6.13}$$
n = 1, 2, ...
Then using (6.13), we find that $v^*(x_0) \le u(\mathbf{x}^*)$. Since $\mathbf{x}^* \in \Pi(x_0)$, the reverse inequality
holds, establishing the result.
The above theorems are useful in showing the equivalence of Problem A1 and Problem
A2. Now the usefulness of the dynamic programming formulation in Problem A2, and hence
(6.14)
where $\beta < 1$. As before, $X$ is the possible set of values for the state variable and $\Gamma : X \rightrightarrows X$
is the correspondence describing the constraints on the problem. We now make an additional
assumption, which is not necessary, but greatly simplifies the analysis.
Assumption 4 $X$ is a compact subset of $\mathbb{R}^K$, $\Gamma$ is nonempty, compact-valued and continuous.
Moreover, let $A = \{(x, y) \in X \times X : y \in \Gamma(x)\}$ and $F : A \to \mathbb{R}$ be bounded and
continuous.
The importance of Assumption 4 is that it will allow us to focus on the space of bounded
functions. Most importantly, since $F$ is bounded over its effective domain, there exists
some $B < \infty$, such that $|F(x, y)| < B$ for all $(x, y) \in A$. This immediately implies that
$|v^*(x)| \le B/(1 - \beta)$, all $x \in X$. Consequently, we can focus our attention on value functions
in the space $C(X)$ of continuous bounded functions defined on $X$, with the natural norm
on this space, the sup norm, $\|f\| = \sup_{x \in X} |f(x)|$.
In particular, to see the usefulness of the contraction mapping theorem, now define the
operator $T$ such that
$$(Tf)(x) = \max_{y \in \Gamma(x)} \left[ F(x, y) + \beta f(y) \right]. \tag{6.15}$$
A fixed point of this operator, $v = Tv$, will be a solution to (6.14), establishing the desired
results. Then we can derive the policy functions from the value function.
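As a rough numerical illustration of the operator $T$, the sketch below applies it repeatedly on a finite grid. The payoff $F(x, y) = \log(x^{\alpha} - y)$, the grid, and the parameter values are assumptions chosen for illustration only. The grid stands in for the compact set $X$, and $y$ is feasible at $x$ whenever $x^{\alpha} - y > 0$. Successive iterates should move closer in the sup norm by a factor of at most $\beta$.

```python
import math

ALPHA, BETA = 0.3, 0.95   # illustrative technology and discount parameters
GRID = [0.05 + 0.45 * i / 39 for i in range(40)]   # grid standing in for the compact set X

# PAYOFF[i][j] = F(x_i, y_j) = log(x_i**ALPHA - y_j), or None if y_j is infeasible at x_i
PAYOFF = [[math.log(x ** ALPHA - y) if x ** ALPHA - y > 0 else None for y in GRID]
          for x in GRID]

def T(f):
    """Apply the operator in (6.15) once to a function f stored as values on GRID."""
    return [max(p + BETA * fy for p, fy in zip(row, f) if p is not None)
            for row in PAYOFF]

def sup_distance(f, g):
    """Sup-norm distance between two functions stored on GRID."""
    return max(abs(a - b) for a, b in zip(f, g))

v = [0.0] * len(GRID)      # any bounded starting guess works
distances = []             # sup-norm distance between successive iterates
for _ in range(200):
    v_next = T(v)
    distances.append(sup_distance(v_next, v))
    v = v_next
```

The list `distances` records how fast the iterates settle down. Restricting choices to a grid is a simplification, so the resulting `v` only approximates the fixed point of the continuous problem.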
(6.16)
be the policy function (correspondence). Under the assumptions of Theorem 19, $G$ is compact-valued
and upper hemi-continuous.
Proof. This follows immediately from Berge's maximum theorem.
We can next see how Theorem 13 enables us to establish more properties of the value
function and the policy correspondence. In particular, let us assume
Assumption 5 For each $y$, $F(\cdot, y)$ is strictly increasing in each of its first $K$ arguments,
and $\Gamma$ is monotone in the sense that $x \le x'$ implies $\Gamma(x) \subseteq \Gamma(x')$.
Theorem 20 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4 and 5, and let $v$ be the unique
solution to (6.14). Then $v$ is strictly increasing.
Since $C'(X)$ is a
closed subset of the complete metric space $C(X)$, by Theorem 13, it is sufficient to show
that $T[C'(X)] \subseteq C''(X)$. Assumption 5 immediately implies that for any nondecreasing $f$,
$Tf$ is strictly increasing, establishing the result.
Furthermore, let us impose
Assumption 6 $F$ is strictly concave, i.e.,
$$F[\theta (x, y) + (1 - \theta)(x', y')] \ge \theta F(x, y) + (1 - \theta) F(x', y'),$$
all $(x, y), (x', y') \in A$,
function.
Proof. The proof again follows from Theorem 13. Let $C'(X) \subset C(X)$ be the set of bounded,
continuous, (weakly) concave functions on $X$, and let $C''(X) \subset C'(X)$ be the set of strictly
(0, 1),
and x = x0 + (1 )x1 .
for all x D
(6.17)
with equality at $x_0$. Now, we show that (6.17) implies that $v(\cdot)$ is differentiable. For this
note that $v(\cdot)$ is concave, thus $-v(\cdot)$ is convex, and by a standard result in convex analysis,
it possesses subgradients. Moreover, any subgradient $p$ of $-v$ at $x_0$ must satisfy
for all $x \in D$,
where the first inequality uses the definition of a subgradient and the second uses the fact
that $W(x) \le v(x)$, with equality at $x_0$ as established in (6.17). Since $W$ is differentiable at
$x_0$, $p$ is unique, and again by a standard result in convex analysis, any convex function with
a unique subgradient at an interior point $x_0$ is differentiable at $x_0$. This establishes that
$-v(\cdot)$, thus $v(\cdot)$, is differentiable as desired.
6.3
6.3.1
Basic Equations
(6.18)
We know that the solution to our problem has to satisfy this functional equation. Moreover,
let us assume (as proved under some conditions above) that the value function $v$ is differentiable
(we take the payoff function $F$ to be differentiable everywhere). Moreover, consider
$y^* \in \text{Int}\, \Gamma(x)$; in other words, the constraints on the problem are not binding. Then we can
write a convenient Euler equation for this problem (again using $*$s to denote optimal values)
as
$$\nabla_y F(x^*, y^*) + \beta \nabla_y v(y^*) = 0.$$
Let us first focus on the case where both $x$ and $y$ are real numbers. Then, we have the
simpler condition:
$$\frac{\partial F(x^*, y^*)}{\partial y} + \beta v'(y^*) = 0. \tag{6.19}$$
This is very intuitive; it requires the sum of the marginal gain today from increasing $y$
and the discounted marginal gain from increasing $y$ on the value of all future returns to be
equal to zero. For example, we can think of $F$ as being decreasing in $y$ and increasing in $x$
(recall for example the representation of the basic growth model with $F(x, y)$ corresponding
to $u(f(x) - y + (1 - \delta)x)$ or $u(f(k(t)) - k(t+1) + (1 - \delta)k(t))$). In this case, equation
(6.19) requires the current cost of increasing $y$ to be compensated by higher values tomorrow.
$$v'(x) = \frac{\partial F(x, y^*)}{\partial x}. \tag{6.20}$$
These equations follow from the fact that $x$ does not appear directly anywhere else (and its
effects through $y$, i.e., $\nabla_x y$ or $\partial y / \partial x$, can be ignored, given the optimality condition (6.19)).
Now in the one-dimensional case, combining (6.20) together with (6.19), we have the
following very useful condition:
$$\frac{\partial F(x^*, y^*)}{\partial y} + \beta \frac{\partial F(y^*, g(y^*))}{\partial x} = 0$$
where $\partial x$ denotes the derivative with respect to the first argument and $\partial y$ with respect to
the second argument, and $g(x)$ is the optimal policy given state variable $x$.
Alternatively, we could write this with the time subscripts as
$$\frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0. \tag{6.21}$$
However, this Euler equation is not sufficient for optimality. In addition we need the
transversality condition. In the more general case this is equivalent to:
$$\lim_{t \to \infty} \beta^t \nabla_{x_t} F(x_t^*, x_{t+1}^*) \cdot x_t^* = 0,$$
and in the one-dimensional case to:
$$\lim_{t \to \infty} \beta^t \frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_t} x_t^* = 0. \tag{6.22}$$
In words, this condition requires that the product of the marginal return from the state
variable $x$ times the value of this state variable does not increase asymptotically at a rate
faster than $1/\beta$.
We will see why this transversality condition makes sense shortly. But for now, we can
note the following theorem:
Theorem 23 Let $X \subset \mathbb{R}^K_+$, and suppose that $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4, 5, 6 and
7. Then the sequence $\{x_{t+1}^*\}_{t=0}^{\infty}$, with $x_{t+1}^* \in \text{Int}\, \Gamma(x_t^*)$, $t = 0, 1, \ldots$, is optimal for Problem
A1 given $x_0$, if it satisfies (6.21) and (6.22).
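Proof. Let $x_0$ be given; let $\{x_t^*\}$ be a feasible (nonnegative) sequence satisfying (6.21) and
(6.22) and $\{x_t\}$ another feasible (nonnegative) sequence. Assumptions 4, 6 and 7 imply that
$F$ is continuous, concave, and differentiable, so let us define
$$\Delta \equiv \lim_{T \to \infty} \sum_{t=0}^{T} \beta^t \left[ F(x_t^*, x_{t+1}^*) - F(x_t, x_{t+1}) \right]$$
as the difference of the objective function between the feasible sequences $\{x_t^*\}$ and $\{x_t\}$. If
we establish that $\Delta$ is nonnegative for any feasible nonnegative sequence $\{x_t\}$, then we will
have established that $\{x_t^*\}$ yields no lower utility than any feasible $\{x_t\}$, thus it must be optimal.
Now by definition of a concave function, we have
$$\begin{aligned}
\Delta &\ge \lim_{T \to \infty} \sum_{t=0}^{T} \beta^t \left[ F_x(x_t^*, x_{t+1}^*) \cdot (x_t^* - x_t) + F_y(x_t^*, x_{t+1}^*) \cdot (x_{t+1}^* - x_{t+1}) \right] \\
&= \lim_{T \to \infty} \Big\{ \sum_{t=0}^{T-1} \beta^t \left[ F_y(x_t^*, x_{t+1}^*) + \beta F_x(x_{t+1}^*, x_{t+2}^*) \right] \cdot (x_{t+1}^* - x_{t+1}) \\
&\qquad\qquad + \beta^T F_y(x_T^*, x_{T+1}^*) \cdot (x_{T+1}^* - x_{T+1}) \Big\},
\end{aligned}$$
where the second line uses $x_0^* = x_0$.
Since $\{x_t^*\}$ satisfies (6.21), the terms in the summation are all zero. Therefore, substituting
from (6.21) into the last term and then using (6.22) gives
$$\begin{aligned}
\Delta &\ge -\lim_{T \to \infty} \beta^T F_x(x_T^*, x_{T+1}^*) \cdot (x_T^* - x_T) \\
&\ge -\lim_{T \to \infty} \beta^T F_x(x_T^*, x_{T+1}^*) \cdot x_T^* = 0,
\end{aligned}$$
where the last line uses the fact that from Assumption 5, $F$ is increasing in $x$, i.e., $F_x \ge 0$,
and $x_t \ge 0$, all $t$. This establishes the result.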
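The two conditions of the theorem can be checked numerically on a case with a known closed form. The sketch below assumes the payoff $F(x, y) = \log(x^{\alpha} - y)$ (log utility, Cobb-Douglas technology, full depreciation) and the candidate policy $g(x) = \alpha \beta x^{\alpha}$, which is the standard textbook solution for that special case. These functional forms and parameter values are illustrative assumptions, not part of the theorem.

```python
ALPHA, BETA = 0.3, 0.95   # illustrative parameters

def g(x):
    """Candidate policy x_{t+1} = g(x_t) for F(x, y) = log(x**ALPHA - y)."""
    return ALPHA * BETA * x ** ALPHA

def F_x(x, y):
    """Derivative of F with respect to its first argument."""
    return ALPHA * x ** (ALPHA - 1) / (x ** ALPHA - y)

def F_y(x, y):
    """Derivative of F with respect to its second argument."""
    return -1.0 / (x ** ALPHA - y)

path = [0.1]                       # x_0 given
for _ in range(61):
    path.append(g(path[-1]))

# Euler residuals F_y(x_t, x_{t+1}) + BETA * F_x(x_{t+1}, x_{t+2}), as in (6.21)
euler_residuals = [F_y(path[t], path[t + 1]) + BETA * F_x(path[t + 1], path[t + 2])
                   for t in range(60)]

# Transversality terms BETA**t * F_x(x_t, x_{t+1}) * x_t, as in (6.22)
tvc_terms = [BETA ** t * F_x(path[t], path[t + 1]) * path[t] for t in range(60)]
```

Along this path the Euler residuals are zero up to rounding, and the transversality terms equal $\beta^t \alpha / (1 - \alpha\beta)$, which shrinks geometrically.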
6.3.2
To get more insight into dynamic programming, let us return to the sequence problem.
Also, let us suppose that $x_t$ is one-dimensional and that there is a finite horizon $T$. Then
the problem becomes
$$\max_{\{x_{t+1}\}_{t=0}^{T}} \sum_{t=0}^{T} \beta^t F(x_t, x_{t+1})$$
subject to $x_{t+1} \ge 0$ with $x_0$ as given. Moreover, let $F(x_T, x_{T+1})$ be the last period's utility,
with $x_{T+1}$ as the state variable left after the last period (this utility could be thought of as
the salvage value, for example), since the world ends after date $T$.
In this case, we have a finite-dimensional optimization problem and we can simply look
at first-order conditions. Moreover, let us again assume that the optimal solution lies in
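$$\beta^t \frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta^{t+1} \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0, \quad \text{for any } 0 \le t \le T - 1,$$
or
$$\frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0,$$
which are identical to the Euler equations for the infinite-horizon case. In addition, for $x_{T+1}$,
we have the following boundary condition
$$x_{T+1}^* \ge 0, \quad \text{and} \quad \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}} x_{T+1}^* = 0. \tag{6.23}$$
Intuitively, this boundary condition requires that $x_{T+1}^*$ should be positive only if an interior
value of it maximizes the salvage value at the end.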
Again, returning to the growth example for a second, recall that
$$F(x, y) = u(f(x) + (1 - \delta)x - y),$$
with the mapping $x = k_t$ and $y = k_{t+1}$.
Now in this case at the last date $T$, we have
$$\frac{\partial F(x_T, x_{T+1})}{\partial x_{T+1}} = -u'(c_T) < 0.$$
Therefore, we must have $k_{T+1} = 0$, i.e., there will be no capital left at the end of the world.
This is very intuitive. If any of it were left, utility could be improved by consuming that
capital either at the last date or at some earlier date.
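A small finite-horizon example makes the boundary condition visible. The sketch below assumes log utility, $f(k) = k^{\alpha}$ and full depreciation, for which the finite-horizon savings rule has a known closed form. These functional forms, the horizon and the parameter values are illustrative assumptions. It simulates the path, checks the Euler equations at dates $0, \ldots, T-1$, and shows that no capital is carried past the final date.

```python
ALPHA, BETA, T_END = 0.3, 0.95, 10   # illustrative parameters and horizon

def savings_rate(t):
    """Share of output k_t**ALPHA carried into t + 1 when the world ends after T_END."""
    a, n = ALPHA * BETA, T_END - t
    return a * (1 - a ** n) / (1 - a ** (n + 1))

k = [0.2]                                    # k_0 given
for t in range(T_END + 1):
    k.append(savings_rate(t) * k[t] ** ALPHA)

# Consumption c_t = k_t**ALPHA - k_{t+1} for t = 0, ..., T_END
c = [k[t] ** ALPHA - k[t + 1] for t in range(T_END + 1)]

# Euler residuals -1/c_t + BETA * ALPHA * k_{t+1}**(ALPHA - 1) / c_{t+1}, t = 0, ..., T_END - 1
residuals = [-1 / c[t] + BETA * ALPHA * k[t + 1] ** (ALPHA - 1) / c[t + 1]
             for t in range(T_END)]
```

The savings rate falls toward zero as the final date approaches, so the last entry of `k`, which plays the role of $k_{T+1}$, is zero.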
$$-\lim_{T \to \infty} \beta^{T+1} \frac{\partial F(x_{T+1}^*, x_{T+2}^*)}{\partial x_{T+1}} x_{T+1}^* = 0,$$
or canceling the negative sign, and without loss of any generality, changing the timing:
$$\lim_{T \to \infty} \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_T} x_T^* = 0,$$
which is exactly the transversality condition (6.22). This derivation also emphasizes that
alternatively we could have had the transversality condition as
$$\lim_{T \to \infty} \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}} x_{T+1}^* = 0,$$
which emphasizes that there is no unique transversality condition, but we generally need a
boundary condition at infinity, which would be one of multiple potential conditions. This
issue will return when we look at optimal control in continuous time.
Therefore, a slightly different (and more heuristic) way of obtaining Theorem 23 is to
consider the above sequence problem with $T \to \infty$, i.e.,
$$\max_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, x_{t+1}).$$
6.4
We are now in a position to apply the methods developed so far to the problem of optimal
growth. In this section, I will limit myself to optimal growth.
Recall the optimal growth problem as
$$\max_{\{c(t),k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(c(t)) \tag{6.24}$$
subject to
$$k(t+1) = f(k(t)) + (1 - \delta)k(t) - c(t) \quad \text{and} \quad k(t) \ge 0, \tag{6.25}$$
(6.26)
with $\Gamma(k)$ given by the interval $[0, f(k) + (1 - \delta)k]$ given the nonnegativity of the capital
stock.
Given the above theorems, in particular Theorems 15-22, the following proposition immediately follows:
Proposition 13 Given Assumptions 1, 2 and 8, the optimal growth model as specified in
(6.24) and (6.25) has a stationary solution characterized by the value function $V(k)$ and
consumption function $c(k)$. The amount $s(k)$ is the capital stock of the next period, where
$s(k) = f(k) + (1 - \delta)k - c(k)$. Moreover, $V(k)$ is strictly increasing and concave in $k$ and
$s(k)$ is nondecreasing.
Proof. Optimality of the solution to the value function (6.26) for the problem (6.24) and
(6.25) follows from Theorems 15-18. That $V(k)$ exists follows from Theorem 19, and the
fact that it is increasing and strictly concave, with the policy correspondence being a policy
function, follows from Theorem 21.
Thus we only have to show that $s(k)$ is nondecreasing. This can be proved by contradiction.
Suppose, to arrive at a contradiction, that $s(k)$ is decreasing, i.e., there exist $k$
and $k' > k$ such that $s(k) > s(k')$. Since $k' > k$, $s(k)$ is feasible when the capital stock is
$k'$. Moreover, since, by hypothesis, $s(k) > s(k')$, $s(k')$ is feasible at capital stock $k$.
(6.27)
But clearly,
$$(z - x') - (z - x) = (z' - x') - (z' - x),$$
which combined with the fact that $z' > z$ and that $u$ is strictly concave and increasing implies
that
$$u(z - x') - u(z - x) > u(z' - x') - u(z' - x),$$
contradicting (6.27). This establishes that $s(k)$ must be nondecreasing everywhere.
In addition, Assumption 2 (the Inada conditions) implies that savings and consumption
levels have to be interior, thus Theorem 22 applies and immediately establishes:
(6.28)
which is a remarkable result, because it shows that the steady state capital-labor ratio does
not depend on preferences, but simply on technology, depreciation and the discount factor.
We will obtain an analogue of this result in the continuous-time neoclassical model as well.
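As a numerical illustration, the sketch below assumes that the steady-state condition takes the standard form $\beta [f'(k^*) + 1 - \delta] = 1$, which is what the Euler equation implies when consumption is constant over time. The Cobb-Douglas technology $f(k) = k^{\alpha}$ and the parameter values are further assumptions. Note that the utility function never enters the calculation.

```python
ALPHA, BETA, DELTA = 0.3, 0.95, 0.1   # illustrative parameters

def f_prime(k):
    """Marginal product of capital for the assumed technology f(k) = k**ALPHA."""
    return ALPHA * k ** (ALPHA - 1)

def steady_state_gap(k):
    """BETA * [f'(k) + 1 - DELTA] - 1, which is zero at the steady state."""
    return BETA * (f_prime(k) + 1 - DELTA) - 1

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) > 0 > func(hi) and func is decreasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k_star_numeric = bisect(steady_state_gap, 1e-6, 100.0)
k_star_closed = (ALPHA / (1 / BETA - 1 + DELTA)) ** (1 / (1 - ALPHA))
```

Bisection works here because $f'$ is strictly decreasing, which is the same property that delivers uniqueness of $k^*$.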
Moreover, since $f(\cdot)$ is strictly concave, $k^*$ is uniquely defined. Thus we have
6.5
Showing that the Pareto optimal growth allocation can be decentralized is straightforward.
Suppose that all households are identical, with utility function given by $u(c)$ as
above, and normalize their measure to 1. Suppose they all start with capital stock $k_0$. The
other side of the economy consists of competitive firms. Households rent their capital to firms. It
is straightforward to see that households will receive a rental price of $R_t = f'(k_t)$ because of
competitive market prices. They will therefore face a gross rate of return equal to
$$r_t = [f'(k_t) + (1 - \delta)] \tag{6.29}$$
for renting one unit of capital at time $t$ in terms of date $t + 1$ goods. In addition, they will
receive the wage rate of $w_t = f(k_t) - k_t f'(k_t)$.
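The rental rate and wage expressions can be sanity-checked by confirming that factor payments exhaust output per worker. The sketch below assumes a Cobb-Douglas technology $f(k) = k^{\alpha}$ with an illustrative value of $\alpha$. Neither is imposed by the text.

```python
ALPHA = 0.3   # illustrative Cobb-Douglas exponent

def f(k):
    """Assumed per-worker production function."""
    return k ** ALPHA

def f_prime(k):
    """Marginal product of capital."""
    return ALPHA * k ** (ALPHA - 1)

def rental_rate(k):
    """Competitive rental price of capital, R = f'(k)."""
    return f_prime(k)

def wage(k):
    """Competitive wage, w = f(k) - k f'(k)."""
    return f(k) - k * f_prime(k)

k_values = [0.5, 1.0, 2.0, 4.0]
# Payments to capital and labor minus output per worker; should be zero
gaps = [rental_rate(k) * k + wage(k) - f(k) for k in k_values]
# Share of output paid to labor; equals 1 - ALPHA under Cobb-Douglas
labor_shares = [wage(k) / f(k) for k in k_values]
```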
$$\max_{\{c_t, a_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(c_t)$$
Chapter 7
The continuous time problem brings a number of new issues. The main reason is that even
with a finite horizon, the maximization is with respect to an infinite-dimensional object (in
fact an entire function, $y : [t_0, t_1] \to \mathbb{R}$). This requires us to review some basic ideas from
the calculus of variations and from optimal control, but most of the tools and ideas that are
necessary for this course are very straightforward.
I will start with the finite-horizon problem and the simplest treatment (which is much
more similar to the calculus of variations than to optimal control), to give you the basic idea, and
then provide the more powerful theorems from optimal control.
7.1
7.1.1
(7.1)
subject to
$$\dot{x}(t) = g(t, x(t), y(t)) \tag{7.2}$$
(7.3)
and
Here $x(t) \in \mathbb{R}$ is the state variable, whose behavior is governed by the differential equation
(7.2). $y(t) \in Y(t) \subset \mathbb{R}$ is the control variable. In addition, we assume that $f$ and $g$ are
continuously differentiable functions.
This is the simplest optimal control problem because it has boundary conditions that
regulate when the planning horizon ends (more generally, $t_1$ can be a choice variable as well,
or it could extend to infinity as we will see later).
The difficulty of this problem arises from two features:
1. We are choosing a function $y : [0, t_1] \to Y$ rather than a vector or a finite-dimensional
object.
2. The constraint takes an unusual form of a differential equation.
These features make it difficult for us to know what type of optimal policy to look for.
For example, $y$ may be a very discontinuous function. It may often hit the boundary of the
feasible set, etc.
7.1.2
Variational Arguments
Before going into greater detail, let us try to understand the essence of the problem, which
can be done by using the variational principle of the calculus of variations.
For this purpose, let us suppose that there exists
a continuous function $\hat{y}(\cdot)$ defined over $[0, t_1]$ with $\hat{y}(t) \in \text{Int}\, Y(t)$
which achieves the optimum in this problem. Therefore, we are ruling out both boundary
solutions and discontinuities.
Now consider the following variation
$$y(t, \varepsilon) \equiv \hat{y}(t) + \varepsilon \eta(t),$$
where $\eta(t)$ is an arbitrary fixed continuous function. We refer to this as a variation, because
given $\eta(t)$, by varying $\varepsilon$, we obtain different sequences of controls. The problem, of course,
is that some of these may be infeasible, i.e., $y(t, \varepsilon) \notin Y(t)$ for some $t$. However, since
$\hat{y}(t) \in \text{Int}\, Y(t)$, and a continuous function over a compact set $[0, t_1]$ is bounded, for any
$\eta(\cdot)$ function we can always find $\bar{\varepsilon} > 0$ such that
$$\hat{y}(t) + \varepsilon \eta(t) \in \text{Int}\, Y(t)$$
for all $|\varepsilon| < \bar{\varepsilon}$. Thus we can conduct variational arguments for small $\varepsilon$s. But, in analogy
with regular calculus, the argument that there is no gain from a variation for small $\varepsilon$s is
essentially what we need.
To prepare for these arguments, let us fix an arbitrary $\eta(\cdot)$, and define $x(t, \varepsilon)$ as the
path of the state variable corresponding to the path of control variable $y(t, \varepsilon)$. This implies
that $x(t, \varepsilon)$ is given by:
$$\dot{x}(t, \varepsilon) = g(t, x(t, \varepsilon), y(t, \varepsilon)) \text{ for all } t \in [0, t_1], \text{ with } x(0, \varepsilon) = x_0. \tag{7.4}$$
Define
$$\mathcal{W}(\varepsilon) \equiv \int_0^{t_1} f(t, x(t, \varepsilon), y(t, \varepsilon))\, dt. \tag{7.5}$$
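By the fact that $\hat{y}(t)$ is optimal, and that for $|\varepsilon| < \bar{\varepsilon}$, $y(t, \varepsilon)$ (and thus $x(t, \varepsilon)$) is feasible,
we have
$$\mathcal{W}(\varepsilon) \le \mathcal{W}(0) \text{ for all } |\varepsilon| < \bar{\varepsilon}.$$
Next, rewrite the equation (7.4), so for all $t \in [0, t_1]$:
$$g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon) \equiv 0.$$
Now for any continuously differentiable function $\lambda : [0, t_1] \to \mathbb{R}$, it must be the case that
$$\int_0^{t_1} \lambda(t) \left[ g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon) \right] dt = 0. \tag{7.6}$$
The function $\lambda(\cdot)$, chosen suitably, will be the costate variable, with a similar interpretation
to the Lagrange multipliers in regular (constrained) optimization. Now add (7.6) to (7.5) to
obtain: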
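$$\mathcal{W}(\varepsilon) \equiv \int_0^{t_1} \left[ f(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t) \left[ g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon) \right] \right] dt.$$
Integrating the term $\int_0^{t_1} \lambda(t) \dot{x}(t, \varepsilon)\, dt$ by parts gives
$$\mathcal{W}(\varepsilon) \equiv \int_0^{t_1} \left[ f(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t) g(t, x(t, \varepsilon), y(t, \varepsilon)) + \dot{\lambda}(t) x(t, \varepsilon) \right] dt - \lambda(t_1) x(t_1, \varepsilon) + \lambda(0) x_0.$$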
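$$\begin{aligned}
\mathcal{W}'(\varepsilon) \equiv{} & \int_0^{t_1} \left[ f_x(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t) g_x(t, x(t, \varepsilon), y(t, \varepsilon)) + \dot{\lambda}(t) \right] x_{\varepsilon}(t, \varepsilon)\, dt \\
& + \int_0^{t_1} \left[ f_y(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t) g_y(t, x(t, \varepsilon), y(t, \varepsilon)) \right] \eta(t)\, dt \\
& - \lambda(t_1) x_{\varepsilon}(t_1, \varepsilon).
\end{aligned}$$
Now evaluating this expression at $\varepsilon = 0$, we have
$$\begin{aligned}
\mathcal{W}'(0) \equiv{} & \int_0^{t_1} \left[ f_x(t, \hat{x}(t), \hat{y}(t)) + \lambda(t) g_x(t, \hat{x}(t), \hat{y}(t)) + \dot{\lambda}(t) \right] x_{\varepsilon}(t, 0)\, dt \\
& + \int_0^{t_1} \left[ f_y(t, \hat{x}(t), \hat{y}(t)) + \lambda(t) g_y(t, \hat{x}(t), \hat{y}(t)) \right] \eta(t)\, dt \\
& - \lambda(t_1) x_{\varepsilon}(t_1, 0),
\end{aligned}$$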
where $\hat{x}(t)$ denotes the path of the state variable corresponding to the optimal plan, $\hat{y}(t)$.
As with standard finite-dimensional optimization, if there exists some function $\eta(t)$ for
which $\mathcal{W}'(0) \ne 0$, this means that the value of the program can be improved. Therefore, we
need to have
$$\mathcal{W}'(0) \equiv 0 \text{ for all } \eta(t).$$
This can only be possible if the second integral is equal to zero for all $\eta(t)$, i.e., only if
$$\int_0^{t_1} \left[ f_y(t, \hat{x}(t), \hat{y}(t)) + \lambda(t) g_y(t, \hat{x}(t), \hat{y}(t)) \right] \eta(t)\, dt = 0 \text{ for all } \eta(t), \tag{7.7}$$
(7.8)
7.1.3
The conditions (7.7) and (7.8) should remind you of a Lagrangian maximization. By analogy
with the Lagrangian, a much more economical way of expressing Theorem 24 is to construct
the equivalent of the Lagrangian in this case, the Hamiltonian:
$$H(t, x, y, \lambda) \equiv f(t, x(t), y(t)) + \lambda(t) g(t, x(t), y(t)). \tag{7.9}$$
Then we have
Theorem 25 (Simplified Maximum Principle) Suppose that the problem of maximizing
(7.1) subject to (7.2) and (7.3), with $f$ and $g$ continuously differentiable, has an interior
solution $\hat{y}(t) \in \text{Int}\, Y(t)$ with corresponding path of state variable $\hat{x}(t)$. Let $H(t, x, y, \lambda)$ be
(7.10)
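$$\dot{\lambda}(t) = -H_x(t, \hat{x}(t), \hat{y}(t), \lambda(t)) \text{ for all } t \in [0, t_1], \text{ and } \lambda(t_1) = 0. \tag{7.11}$$
$$\dot{x}(t) = H_{\lambda}(t, \hat{x}(t), \hat{y}(t), \lambda(t)) \text{ for all } t \in [0, t_1], \text{ and } x(0) = x_0. \tag{7.12}$$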
7.1.4
Generalizations
The above theorems can be immediately generalized to the case in which the state variable
and the controls are vectors rather than scalars, and also to the case in which there are constraints.
The constrained case requires constraint qualification conditions as in the standard
finite-dimensional optimization case. These are slightly more messy to express, and since we
will make no use of the constrained maximization problems, I will not state these theorems.
The vector-valued theorems are direct generalizations of the ones presented above, and
are useful in growth models with multiple capital goods. In particular, let
$$\max_{x(t),\, y(t),\, x_1} J(x(t), y(t))$$
t1
(7.13)
(7.14)
(7.15)
and
Here $x(t) \in \mathbb{R}^K$ for some $K \ge 1$ is the state variable and again $y(t) \in Y(t) \subset \mathbb{R}^N$ for some
$N \ge 1$ is the control variable. In addition, we again assume that $f$ and $g$ are continuously
differentiable functions. We then have:
Theorem 28 (Maximum Principle) Suppose that the problem of maximizing (7.13) subject
to (7.14) and (7.15), with $f$ and $g$ continuously differentiable, has an interior solution
$\hat{y}(t) \in \text{Int}\, Y(t)$ with corresponding path of state variable $\hat{x}(t)$. Let $H(t, x, y, \lambda)$ be given by
$$H(t, x, y, \lambda) \equiv f(t, x(t), y(t)) + \lambda(t) \cdot g(t, x(t), y(t)), \tag{7.16}$$
(7.17)
(7.18)
(7.19)
7.1.5
Limitations
The limitations of what we have done so far are obvious. First, we have assumed that a
continuous and interior solution to the optimal control problem exists. This is in general
a very strong assumption. Second, and equally important for our purposes, we have so far
looked at the finite horizon case, whereas analysis of growth models requires us to solve
infinite horizon problems. To deal with both of these issues, we need to look at the more
modern theory of optimal control. This is done in the next section.
7.2
7.2.1
x(t),y(t)
(7.20)
subject to
$$\dot{x}(t) = g(t, x(t), y(t)), \tag{7.21}$$
(7.22)
and
t
The main difference is that now time runs to infinity, and there is no choice of endpoint
$x_1$. In addition, I have simplified the problem by removing the feasibility set on the control
$y(t)$, simply requiring this function to be real-valued.
For this problem, we call a pair $(x(t), y(t))$ admissible if $y(t)$ is a piecewise continuous
function of time and $x(t)$ is a piecewise smooth function of time satisfying (7.21) given $y(t)$
(since $x(t)$ is given by a continuous differential equation, the piecewise continuity of $y(t)$
ensures the piecewise smoothness of $x(t)$). Notice that this is a significant generalization of
the above approach, since discontinuous controls are allowed as long as they are piecewise
continuous.
There are a number of technical difficulties when dealing with the infinite-horizon case,
which are similar to those in the discrete time analysis. Primary among those is the fact
(7.23)
(7.24)
$$\dot{x}(t) = H_{\lambda}(t, \hat{x}(t), \hat{y}(t), \lambda(t)) \text{ for all } t \in \mathbb{R}_+, \ x(0) = x_0 \text{ and } \lim_{t \to \infty} x(t) \ge x_1. \tag{7.25}$$
Notice an important difference between Theorem 25 and the current theorem. There is no
boundary condition in Theorem 31 corresponding to $\lambda(t_1) = 0$ of Theorem 25. Consequently,
the necessary conditions in Theorem 31 will not uniquely pin down a solution path. To do
this we need an infinite-horizon version of the transversality condition. One might be tempted
to impose a condition of the form
$$\lim_{t \to \infty} \lambda(t) = 0$$
as the transversality condition, but this is not in general the case. We will see an example
where this does not apply soon. A milder transversality condition of the form
$$\lim_{t \to \infty} H(t, x, y, \lambda) = 0$$
Theorem 33 (Arrow Sufficient Conditions for Infinite Horizon) Consider the problem
of maximizing (7.20) subject to (7.21) and (7.22), with $f$ and $g$ continuously differentiable.
Define $H(t, x, y, \lambda)$ as in (7.9), and suppose that a solution $\hat{y}(t)$ and the corresponding
path of state variable $\hat{x}(t)$ satisfy (7.23)-(7.25). Given the resulting costate variable $\lambda(t)$,
define $M(t, x, \lambda) \equiv H(t, x, \hat{y}(t), \lambda)$. If $M(t, x, \lambda)$ is concave in $x$ and
$\lim_{t \to \infty} \lambda(t)(\hat{x}(t) - x(t)) \le 0$ for all $x(t)$ implied by an admissible control path $y(t)$,
then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.20).
Notice that both of these sufficiency theorems have the difficult-to-check condition
that $\lim_{t \to \infty} \lambda(t)(\hat{x}(t) - x(t)) \le 0$ for all $x(t)$ implied by an admissible control path $y(t)$.
This condition will disappear when we can impose a proper transversality condition.
7.2.2
The following example, which is very close to the original Ramsey model, illustrates that
the finite-horizon transversality condition does not in general carry over to the infinite-horizon case.
Example 2 Consider the following problem:
$$\max \int_0^{\infty} \left[ \log(c(t)) - \log c^* \right] dt$$
subject to
$$\dot{k}(t) = [k(t)]^{\alpha} - c(t) - \delta k(t)$$
$$k(0) = 1$$
and
$$\lim_{t \to \infty} k(t) \ge 0$$
= k 1 = .
Hc =
Hk
$$\lim_{t \to \infty} \lambda(t) = \frac{1}{c^*} > 0 \text{ and } \lim_{t \to \infty} k(t) = k^*.$$
Therefore, the equivalent of the standard finite-horizon transversality condition does not hold.
It can be verified, however, that along the optimal path
$$\lim_{t \to \infty} H(k(t), c(t), \lambda(t)) = 0,$$
7.2.3
Part of the difficulty, especially regarding the absence of a transversality condition, comes
from the fact that we did not impose enough structure on the functions $f$ and $g$. As discussed
above, our interest is with the growth models where the utility is discounted exponentially.
Then the problem is a more special one, taking the form:
x(t),y(t)
(7.26)
subject to
$$\dot{x}(t) = g(t, x(t), y(t)), \tag{7.27}$$
(7.28)
and
t
The special feature of this problem is that the payoff function, the equivalent of $f$,
depends on time only through exponential discounting. The Hamiltonian in this case would
(7.29)
(7.30)
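$$\rho \mu(t) - \dot{\mu}(t) = \hat{H}_x(\hat{x}(t), \hat{y}(t), \mu(t)) \text{ for all } t \in \mathbb{R}_+ \text{ and } \lim_{t \to \infty} \left[ \exp(-\rho t)\, \hat{x}(t) \mu(t) \right] = 0. \tag{7.31}$$
$$\dot{x}(t) = \hat{H}_{\mu}(\hat{x}(t), \hat{y}(t), \mu(t)) \text{ for all } t \in \mathbb{R}_+, \ x(0) = x_0 \text{ and } \lim_{t \to \infty} x(t) \ge x_1. \tag{7.32}$$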
is necessary. Notice that compared to the transversality condition before, there is the additional
term $\exp(-\rho t)$. This is because the transversality condition applies to the original
costate variable $\lambda(t)$, i.e., $\lim_{t \to \infty} [x(t) \lambda(t)] = 0$, and as shown above the current-value
costate variable $\mu(t)$ is given by $\mu(t) = \exp(\rho t) \lambda(t)$.
The sufficiency theorems can also be strengthened now by incorporating the transversality
condition and expressing the conditions in terms of the current-value Hamiltonian:
Theorem 35 (Mangasarian Sucient Conditions for Discounted Infinite-Horizon
Problems) Consider the problem of maximizing (7.26) subject to (7.27) and (7.28), with u
(x, y, ) as the current-value Hamiltonian as in
and g continuously dierentiable. Define H
(7.29), and suppose that a solution y (t) and the corresponding path of state variable x (t)
satisfy (7.30)-(7.32). Suppose also that for the resulting current-value costate variable (t),
(x, y, ) is jointly concave in (x, y) for all t R+ , then y (t) and the corresponding x (t)
H
achieve the unique global maximum of (7.26).
Theorem 36 (Arrow Sufficient Conditions for Discounted Infinite-Horizon Problems) Consider the problem of maximizing (7.26) subject to (7.27) and (7.28), with $u$ and $g$ continuously differentiable. Define $\hat{H}(x, y, \mu)$ as the current-value Hamiltonian as in (7.29), and suppose that a solution $y(t)$ and the corresponding path of state variable $x(t)$ satisfy (7.30)-(7.32). Given the resulting current-value costate variable $\mu(t)$, define $M(t, x, \mu) \equiv \hat{H}(x, \hat{y}, \mu)$, the current-value Hamiltonian evaluated at the maximizing control $\hat{y}$. If $M(t, x, \mu)$ is concave in $x$, then $y(t)$ and the corresponding $x(t)$ achieve the unique global maximum of (7.26).
Chapter 8
The Neoclassical Growth Model
We are now ready to start our analysis of the standard neoclassical growth model (also known
as the Ramsey, or Cass-Koopmans model). This model differs from the Solow model only
in explicitly modeling the consumer side and endogenizing savings (i.e., allowing consumer
optimization). Beyond its use as a basic growth model, this model has become a workhorse
for many areas of macroeconomics, including the analysis of fiscal policy, taxation, business
cycles, and even monetary policy.
8.1
(8.1)
c0
(8.2)
(8.3)
where $c(t)$ is consumption per capita at time $t$, $\rho$ is the subjective discount rate, and the effective discount rate is $\rho - n$, since it is assumed that the household derives utility from the consumption of its additional members in the future as well. We assume throughout that
Assumption 10
$\rho > n$.
This assumption ensures that there is in fact discounting of future utility streams. Otherwise, (8.3) would have infinite value, and standard optimization techniques would not be
useful in determining what an optimal plan is (we would need to use over-taking type criteria
etc.). More generally, there is something somewhat strange about models in which utility is
$$y(t) \equiv \frac{Y(t)}{L(t)} = F\left[\frac{K(t)}{L(t)}, 1\right] \equiv f(k(t)),$$
where, as before,
$$k(t) \equiv \frac{K(t)}{L(t)}. \tag{8.4}$$
Competitive factor markets then imply that, at all points in time, the rental rate of
capital and the wage rate are given by:
$$R(t) = F_K[K(t), L(t)] = f'(k(t)) \tag{8.5}$$
and
$$w(t) = F_L[K(t), L(t)] = f(k(t)) - k(t) f'(k(t)). \tag{8.6}$$
$$a(t) \equiv \frac{A(t)}{L(t)},$$
we obtain:
$$\dot{a}(t) = (r(t) - n)\, a(t) + w(t) - c(t). \tag{8.7}$$
In practice, household assets can consist of capital stock, K (t), which they rent to
firms and government bonds, B (t). In models with uncertainty, households would have a
portfolio choice between the capital stock of the corporate sector and riskless bonds. Government bonds play an important role in models with uncertainty and heterogeneity, allowing
households to smooth idiosyncratic shocks. But in representative household models without government, their only use is in pricing assets (for example riskless bonds versus equity
etc.), since they have to be in zero net supply, i.e., total supply of bonds has to be B (t) = 0.
Consequently, we will have that assets per capita are equal to the capital stock per capita
(or the capital-labor ratio in the economy), i.e.:
a (t) = k (t) .
(8.8)
The equation (8.7) is only a flow constraint, and it is not sufficient to act as a proper budget constraint on the individual. To see this, consider a finite-horizon economy, ending at the time $T$. In this case, we could express the entire set of constraints on the household as a single budget constraint of the form:
$$\int_0^T c(t) L(t) \exp\left(\int_t^T r(s)\, ds\right) dt + A(T) = \int_0^T w(t) L(t) \exp\left(\int_t^T r(s)\, ds\right) dt + A(0) \exp\left(\int_0^T r(s)\, ds\right), \tag{8.9}$$
which requires the household's discounted budget constraint to hold at time $T$ (hence all income and expenditures are carried forward to date $T$ units). Clearly, differentiating this expression with respect to $T$ and expanding $L(t)$ gives (8.7). And yet (8.7) by itself does not guarantee that the level of $A(T)$ is such that this lifetime budget constraint holds. Therefore, in the finite-horizon case, we would simply impose this lifetime budget constraint as a boundary condition.
In the infinite-horizon case, we need a similar boundary condition. This is generally referred to as the no-Ponzi-game condition, and takes the form
$$\lim_{t\to\infty} a(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) \geq 0. \tag{8.10}$$
This condition is stated as an inequality, to ensure that the individual does not asymptotically tend to a negative wealth. But we will see from the transversality condition of the individual problem that the individual would never want to have positive wealth asymptotically, so the no-Ponzi-game condition can be alternatively stated as:
$$\lim_{t\to\infty} a(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) = 0. \tag{8.11}$$
Multiply both sides of (8.9) by $\exp\left(-\int_0^T r(s)\, ds\right)$ to obtain
$$\int_0^T c(t) L(t) \exp\left(-\int_0^t r(s)\, ds\right) dt + \exp\left(-\int_0^T r(s)\, ds\right) A(T) = \int_0^T w(t) L(t) \exp\left(-\int_0^t r(s)\, ds\right) dt + A(0),$$
then divide everything by $L(0)$ and note that $L(t)$ grows at the rate $n$, to obtain
$$\int_0^T c(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) dt + \exp\left(-\int_0^T (r(s) - n)\, ds\right) a(T) = \int_0^T w(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) dt + a(0).$$
Now take the limit as $T \to \infty$ and use the no-Ponzi-game condition (8.11) to obtain
$$\int_0^\infty c(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) dt = a(0) + \int_0^\infty w(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right) dt,$$
which essentially requires the discounted sum of expenditures to be equal to initial assets plus the discounted sum of labor income. Therefore this equation is a direct extension of (8.9) to infinite horizon. This derivation makes it clear that the no-Ponzi-game condition (8.11) essentially ensures that the individual's lifetime budget constraint holds in infinite horizon.
8.2
Characterization of Equilibrium
8.2.1
Definition of Equilibrium
We are now in a position to define an equilibrium in this dynamic economy. I will provide
two definitions, the first somewhat less formal, and the second more useful in characterizing the
equilibrium below.
A competitive equilibrium of the Ramsey economy consists of paths of consumption, capital stock, wage rates and rental rates of capital, $[C(t), K(t), w(t), R(t)]_{t=0}^{\infty}$, such that the representative household maximizes its utility given initial capital stock $K(0)$ and the time path of prices, and given the time path of capital stock and labor $[K(t), L(t)]_{t=0}^{\infty}$ all markets clear.
Notice that in equilibrium we need to determine the entire time path of real quantities
and the associated prices. This is a very important point. In dynamic models whenever we
talk of equilibrium, this refers to the entire path of quantities and prices. In some models,
we will focus on the steady-state equilibrium, but equilibrium always refers to the entire
path.
Since everything can be equivalently defined in terms of per capita variables, let me
state the alternative definition in terms of those:
8.2.2
Let us start with the problem of the representative consumer. From the definition of equilibrium we know that this is to maximize (8.3) subject to (8.7) and (8.11). Let us ignore (8.11) first, and set up the current-value Hamiltonian:
$$\hat{H}(a, c, \mu) = u(c(t)) + \mu(t)\left[w(t) + (r(t) - n)\, a(t) - c(t)\right],$$
with state variable $a$, control variable $c$ and current-value costate variable $\mu$.
From Theorem 34, the following are necessary conditions:
$$\hat{H}_c(a, c, \mu) = u'(c(t)) - \mu(t) = 0,$$
$$\hat{H}_a(a, c, \mu) = \mu(t)(r(t) - n) = -\dot{\mu}(t) + (\rho - n)\mu(t),$$
$$\lim_{t\to\infty} \left[\exp(-(\rho - n)t)\, \mu(t)\, a(t)\right] = 0.$$
Rearranging the second condition gives
$$\frac{\dot{\mu}(t)}{\mu(t)} = -(r(t) - \rho), \tag{8.12}$$
which states that the multiplier changes depending on whether the rate of return on assets is currently greater than or less than the discount rate of the household.
The first condition, on the other hand, implies
$$u'(c(t)) = \mu(t).$$
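Differentiating this with respect to time and dividing by $\mu(t)$ gives $\frac{u''(c(t))\, c(t)}{u'(c(t))} \frac{\dot{c}(t)}{c(t)} = \frac{\dot{\mu}(t)}{\mu(t)}$. Substituting this into (8.12), we obtain
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}(r(t) - \rho), \tag{8.13}$$
where
$$\varepsilon_u(c(t)) \equiv -\frac{u''(c(t))\, c(t)}{u'(c(t))} \tag{8.14}$$
is the elasticity of the marginal utility $u'(c(t))$. More importantly, $\varepsilon_u(c(t))$ is also the inverse of the intertemporal elasticity of substitution, which plays a crucial role in most macro models. The intertemporal elasticity of substitution regulates the willingness of individuals to substitute consumption (or labor or any other attribute that yields utility) over time. This elasticity for dates $t$ and $s > t$ is defined as
$$\sigma_u(t, s) = -\frac{d \log(c(s)/c(t))}{d \log(u'(c(s))/u'(c(t)))}.$$
As $s \downarrow t$, we have
$$\sigma_u(t, s) \to \sigma_u(t) = -\frac{u'(c(t))}{u''(c(t))\, c(t)} = \frac{1}{\varepsilon_u(c(t))}.$$
This is not surprising, since the concavity of the utility function $u(\cdot)$, and thus the elasticity of marginal utility, determines how willing individuals are to substitute consumption over time.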
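The link between the elasticity of marginal utility and the intertemporal elasticity of substitution can be checked numerically. The following is a minimal sketch, assuming purely for illustration an iso-elastic utility function with parameter `theta`; the function names, the finite-difference step `h` and the chosen numbers are hypothetical and not part of the notes.

```python
import math

def crra_utility(c, theta):
    """Iso-elastic utility (an illustrative assumption); theta >= 0."""
    if theta == 1.0:
        return math.log(c)
    return (c ** (1.0 - theta) - 1.0) / (1.0 - theta)

def elasticity_of_marginal_utility(u, c, h=1e-4):
    """Approximate eps_u(c) = -u''(c) * c / u'(c) with central differences."""
    u_prime = (u(c + h) - u(c - h)) / (2.0 * h)
    u_second = (u(c + h) - 2.0 * u(c) + u(c - h)) / (h * h)
    return -u_second * c / u_prime

theta = 2.0
eps_u = elasticity_of_marginal_utility(lambda c: crra_utility(c, theta), c=1.5)
ies = 1.0 / eps_u  # intertemporal elasticity of substitution
```

Under this assumption the elasticity of marginal utility is approximately `theta` at every consumption level, so the intertemporal elasticity of substitution is approximately `1 / theta`.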
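Next, note also that integrating (8.12), we have
$$\mu(t) = \mu(0) \exp\left(-\int_0^t (r(s) - \rho)\, ds\right) = u'(c(0)) \exp\left(-\int_0^t (r(s) - \rho)\, ds\right).$$
Substituting this into the transversality condition yields
$$\lim_{t\to\infty} \left[\exp(-(\rho - n)t)\, a(t)\, u'(c(0)) \exp\left(-\int_0^t (r(s) - \rho)\, ds\right)\right] = 0,$$
$$\lim_{t\to\infty} \left[a(t) \exp\left(-\int_0^t (r(s) - n)\, ds\right)\right] = 0,$$
which implies that the strict no-Ponzi condition, (8.11), has to hold.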
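We can derive further results on the consumption behavior of households. In particular, notice that the term $\exp\left(-\int_0^t r(s)\, ds\right)$ is a present-value factor that converts a unit of income at time $t$ to a unit of income at time 0. In the special case where $r(s) = r$, this factor would be exactly equal to $\exp(-rt)$. But more generally, we can define an average interest rate between dates 0 and $t$ as
$$\bar{r}(t) = \frac{1}{t} \int_0^t r(s)\, ds.$$
In that case, we can express the conversion factor between dates 0 and $t$ as
$$\exp(-\bar{r}(t)\, t).$$
Now recalling that the solution to the differential equation
$$\dot{y}(t) = b(t)\, y(t)$$
is
$$y(t) = y(0) \exp\left(\int_0^t b(s)\, ds\right),$$
we can integrate (8.13) to obtain
$$c(t) = c(0) \exp\left(\int_0^t \frac{r(s) - \rho}{\varepsilon_u(c(s))}\, ds\right).$$
In the special case where $\varepsilon_u(c(s))$ is constant and equal to $\theta$, this becomes
$$c(t) = c(0) \exp\left(\frac{\bar{r}(t) - \rho}{\theta}\, t\right).$$
Writing the lifetime budget constraint as
$$\int_0^\infty c(t) \exp(-(\bar{r}(t) - n)\, t)\, dt = a(0) + \int_0^\infty w(t) \exp(-(\bar{r}(t) - n)\, t)\, dt,$$
and substituting for $c(t)$ into this lifetime budget constraint in this iso-elastic case, we obtain
$$c(0) = \left[\int_0^\infty \exp\left(\left(\frac{(1 - \theta)\bar{r}(t)}{\theta} - \frac{\rho}{\theta} + n\right) t\right) dt\right]^{-1} \left[a(0) + \int_0^\infty w(t) \exp(-(\bar{r}(t) - n)\, t)\, dt\right]. \tag{8.15}$$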
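To see how the iso-elastic consumption function works, the sketch below evaluates it under the additional assumption, made only for this illustration, that the interest rate and the wage are constant over time. In that case both integrals have closed forms. All function names and parameter values are hypothetical.

```python
import math

def initial_consumption(a0, w, r, rho, n, theta):
    """c(0) in the iso-elastic case, assuming constant r and w (illustrative)."""
    # Rate at which discounted consumption decays; must be positive.
    decay = rho / theta - (1.0 - theta) * r / theta - n
    if decay <= 0.0 or r <= n:
        raise ValueError("the integrals diverge for these parameters")
    human_wealth = w / (r - n)  # integral of w * exp(-(r - n) * t) over [0, inf)
    return decay * (a0 + human_wealth)

def present_value_of_consumption(c0, r, rho, n, theta,
                                 horizon=2000.0, steps=200000):
    """Trapezoid-rule value of the discounted consumption stream."""
    growth = (r - rho) / theta  # consumption growth rate from the Euler equation
    dt = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * c0 * math.exp((growth - (r - n)) * t) * dt
    return total

a0, w, r, rho, n, theta = 1.0, 1.0, 0.05, 0.03, 0.01, 2.0
c0 = initial_consumption(a0, w, r, rho, n, theta)
lifetime_wealth = a0 + w / (r - n)
pv_consumption = present_value_of_consumption(c0, r, rho, n, theta)
```

With these numbers the discounted value of the consumption path coincides, up to integration error, with initial assets plus discounted labor income, which is the lifetime budget constraint.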
8.2.3
Equilibrium Prices
Equilibrium prices are straightforward and are given by (8.5) and (8.6). This implies that the market rate of return for consumers, $r(t)$, is given by (8.8), i.e.,
$$r(t) = f'(k(t)) - \delta.$$
Substituting this into the consumer's problem, we have
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right) \tag{8.16}$$
as the equilibrium version of the consumption growth equation, (8.13). Equation (8.15) in the iso-elastic utility case also similarly generalizes.
8.3
Optimal Growth
Before characterizing the equilibrium further, it is useful to look at the optimal growth problem, defined as the capital and consumption path chosen by a benevolent social planner trying to achieve a Pareto optimal outcome. In particular, suppose that the social planner gives exactly the same weights to people in different generations, so that it solves the problem
$$\max_{[k(t), c(t)]_{t=0}^{\infty}} \int_0^\infty \exp(-(\rho - n)t)\, u(c(t))\, dt$$
subject to
$$\dot{k}(t) = f(k(t)) - (n + \delta)\, k(t) - c(t)$$
and $k(0) > 0$. To solve this problem, once again set up the current-value Hamiltonian, which in this case takes the form
$$\hat{H}(k, c, \mu) = u(c(t)) + \mu(t)\left[f(k(t)) - (n + \delta)\, k(t) - c(t)\right],$$
with state variable $k$, control variable $c$ and current-value costate variable $\mu$.
From Theorem 34, the following are necessary conditions:
$$\hat{H}_c(k, c, \mu) = u'(c(t)) - \mu(t) = 0,$$
$$\hat{H}_k(k, c, \mu) = \mu(t)\left(f'(k(t)) - \delta - n\right) = -\dot{\mu}(t) + (\rho - n)\mu(t),$$
$$\lim_{t\to\infty} \left[\exp(-(\rho - n)t)\, \mu(t)\, k(t)\right] = 0.$$
Going exactly through the same steps as before, it is straightforward to see that these optimality conditions imply
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right),$$
and that the transversality condition becomes
$$\lim_{t\to\infty} \left[k(t) \exp\left(-\int_0^t \left(f'(k(s)) - \delta - n\right) ds\right)\right] = 0.$$
8.4
Steady-State Equilibrium
Now let us characterize the steady-state equilibrium (or equivalently the steady-state optimal allocation). In steady state, consumption per capita will be constant, thus
$$\dot{c}(t) = 0.$$
From (8.16), this implies that irrespective of the exact utility function, we must have a capital-labor ratio $k^*$ such that
$$f'(k^*) = \rho + \delta, \tag{8.17}$$
which is the equivalent of the steady-state relationship in the discrete-time optimal growth model, and as is the case there, it pins down the steady-state capital-labor ratio only as a function of the production function, the discount rate and the depreciation rate. This also corresponds to the modified golden rule, rather than the golden rule we saw in the
Steady-state consumption per capita is then given by
$$c^* = f(k^*) - (n + \delta)\, k^*, \tag{8.18}$$
which is similar to the consumption level in the basic Solow model, but the steady-state capital-labor ratio is determined differently. Moreover, given Assumption 10, a steady state where the capital-labor ratio and thus output are constant necessarily satisfies the transversality condition.
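As an illustration of the difference between the modified golden rule and the golden rule, the sketch below computes both capital-labor ratios under the assumption of a Cobb-Douglas technology `f(k) = k**alpha`. The functional form, the function names and the parameter values are assumptions made for the example only.

```python
def modified_golden_rule_k(alpha, rho, delta):
    """k* solving f'(k*) = rho + delta when f(k) = k**alpha (assumed form)."""
    return (alpha / (rho + delta)) ** (1.0 / (1.0 - alpha))

def golden_rule_k(alpha, n, delta):
    """k_gold solving f'(k) = n + delta, which maximizes steady-state consumption."""
    return (alpha / (n + delta)) ** (1.0 / (1.0 - alpha))

def steady_state_consumption(k, alpha, n, delta):
    """c = f(k) - (n + delta) * k along a path with constant k."""
    return k ** alpha - (n + delta) * k

alpha, rho, delta, n = 1.0 / 3.0, 0.03, 0.05, 0.01
k_star = modified_golden_rule_k(alpha, rho, delta)
k_gold = golden_rule_k(alpha, n, delta)
c_star = steady_state_consumption(k_star, alpha, n, delta)
c_gold = steady_state_consumption(k_gold, alpha, n, delta)
```

Because Assumption 10 imposes that the discount rate exceeds the population growth rate, the modified golden rule capital-labor ratio lies below the golden rule level, and steady-state consumption is correspondingly lower than its maximum.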
This analysis therefore establishes:
8.5
Transitional Dynamics
Next, we can determine the transitional dynamics of this model. Recall that transitional dynamics in the basic Solow model were given by a single differential equation with an initial condition. This is no longer the case, since the equilibrium is determined by two differential equations, one for $\dot{k}(t)$ and one for $\dot{c}(t)/c(t)$, together with an initial condition $k(0) > 0$ and a boundary condition at infinity, the transversality condition
$$\lim_{t\to\infty} \left[k(t) \exp\left(-\int_0^t \left(f'(k(s)) - \delta - n\right) ds\right)\right] = 0.$$
This combination of an initial condition and a transversality condition is quite typical for optimal control problems where we are trying to pin down the behavior of both state and control variables. This means that the notion of stability has to be different from that in Theorems 4, 5 and 6. In particular, the consumption level (or equivalently the costate variable $\mu$) is the control variable, and its initial value $c(0)$ (or equivalently $\mu(0)$) is free. It has to adjust in a way to satisfy the transversality condition at infinity. Therefore, rather than requiring all eigenvalues of the linear system or the linearized system to be negative, what we want is saddle-path stability, which requires the number of negative eigenvalues to be the same as the number of state variables. In particular, we have the following straightforward generalizations of Theorems 4 and 5:
Theorem 37 Consider the following linear differential equation system
$$\dot{x}(t) = A x(t) \tag{8.19}$$
with initial value $x(0)$, where $x(t) \in \mathbb{R}^n$ for all $t$ and $A$ is an $n \times n$ matrix. Suppose that $m \leq n$ of the eigenvalues of $A$ have negative real parts. Then there exists an m-dimensional
$$\dot{x}(t) = F[x(t)] \tag{8.20}$$
and suppose that $m \leq n$ of the eigenvalues of $A$ have negative real parts and the rest have positive real parts. Then there exists an open neighborhood of $x^*$, $B(x^*) \subset \mathbb{R}^n$, and an m-dimensional manifold $M \subset B(x^*)$ such that starting from any $x(0) \in M$, the differential equation (8.20) has a unique solution with $x(t) \to x^*$.
Put differently, these two theorems state that only a lower-dimensional subset of the
original space leads to stable solutions. However, in this context this is exactly what we
require, since c (0) will adjust in order to place us on exactly such a lower-dimensional
subset of the original space.
There are two ways of seeing this. The first one is simply by analyzing the above system
diagrammatically. This is done in the next picture:
The inverse U-shaped curve is the locus of points where $\dot{k} = 0$. The vertical line, on the other hand, is the locus of points where $\dot{c} = 0$. The shape of the first one can be understood by analogy to the diagram where we saw the golden rule. If the capital stock is too low, steady-state consumption is low, and if the capital stock is too high, then steady-state consumption is again low. There exists a unique level, $k_{gold}$, which maximizes steady-state consumption per capita. The reason why the $\dot{c} = 0$ locus is just a vertical line simply follows from the fact that only the unique level of $k^*$ given by (8.17) can keep per capita consumption constant. Once these two loci are drawn, the rest of the diagram can be completed by looking at the direction of motion according to the differential equations. Given this direction of movements, it is clear that there exists a unique stable arm, the lower-dimensional manifold
An alternative way of establishing the same result is by linearizing the set of differential equations, and looking at their eigenvalues. Recall the two differential equations determining the equilibrium path:
$$\dot{k}(t) = f(k(t)) - (n + \delta)\, k(t) - c(t)$$
and
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right).$$
Linearizing these equations around the steady state $(k^*, c^*)$ and using $f'(k^*) - \delta = \rho$ from (8.17), the eigenvalues $\xi$ of the linearized system solve
$$\det \begin{pmatrix} \rho - n - \xi & -1 \\ \dfrac{c^* f''(k^*)}{\varepsilon_u(c^*)} & 0 - \xi \end{pmatrix} = 0.$$
It is straightforward to verify that, since $c^* f''(k^*)/\varepsilon_u(c^*) < 0$, there are two real eigenvalues, one negative and one positive. This implies that there exists a one-dimensional stable manifold converging to the steady state, exactly as the stable arm in the above figure. Therefore, the local analysis also leads to the same conclusion. However, the local analysis can only establish local stability, whereas the above analysis established global stability.
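The eigenvalue argument can be illustrated numerically. The sketch below assumes a Cobb-Douglas technology `f(k) = k**alpha` and a constant elasticity of marginal utility `theta`; both are assumptions made for the example, as are the function name and the parameter values. The characteristic equation of the linearized system is a quadratic whose coefficients are the trace and the determinant of the Jacobian at the steady state.

```python
import math

def saddle_path_eigenvalues(alpha, theta, rho, delta, n):
    """Eigenvalues of the system linearized at the steady state (illustrative)."""
    # Steady state from f'(k*) = rho + delta and c* = f(k*) - (n + delta) k*.
    k_star = (alpha / (rho + delta)) ** (1.0 / (1.0 - alpha))
    c_star = k_star ** alpha - (n + delta) * k_star
    f_second = alpha * (alpha - 1.0) * k_star ** (alpha - 2.0)  # f''(k*) < 0
    # Jacobian rows: [rho - n, -1] and [c* f''(k*) / theta, 0].
    trace = rho - n
    determinant = c_star * f_second / theta
    root = math.sqrt(trace * trace - 4.0 * determinant)
    return (trace - root) / 2.0, (trace + root) / 2.0

alpha, theta, rho, delta, n = 1.0 / 3.0, 2.0, 0.03, 0.05, 0.01
xi_negative, xi_positive = saddle_path_eigenvalues(alpha, theta, rho, delta, n)
```

Since the determinant is negative, the discriminant exceeds the squared trace, so the two roots are real and of opposite sign, which is the saddle-path configuration with one state variable.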
8.6
The above analysis was for the neoclassical growth model without any technological change. Let us now extend the production function to:
$$Y(t) = F[K(t), A(t) L(t)],$$
where
$$A(t) = \exp(gt)\, A(0). \tag{8.21}$$
Output per unit of effective labor is then
$$\tilde{y}(t) \equiv \frac{Y(t)}{A(t) L(t)} = F\left[\frac{K(t)}{A(t) L(t)}, 1\right] \equiv f(\tilde{k}(t)),$$
where now
$$\tilde{k}(t) \equiv \frac{K(t)}{A(t) L(t)} \tag{8.22}$$
is the capital to effective labor ratio, taking into account that effective labor is increasing because of labor-augmenting technological change.
In addition to the assumption on technology, we also need to impose a further assumption on preferences in order to ensure balanced growth. We define balanced growth as growth consistent with the Kaldor facts of constant capital-output ratio and capital share in national income. These two observations together also imply that the rental rate of return on capital, $R(t)$, has to be constant, which, from (8.8), implies that $r(t)$ has to be constant. In addition, balanced growth requires that consumption and output grow at a constant rate. The Euler
Therefore, balanced growth is only consistent with utility functions that have a constant elasticity of marginal utility of consumption, such as
$$u(c(t)) = \begin{cases} \dfrac{c(t)^{1-\theta} - 1}{1 - \theta} & \text{if } \theta \neq 1 \text{ and } \theta \geq 0 \\ \ln c(t) & \text{if } \theta = 1 \end{cases},$$
where the elasticity of marginal utility of consumption, $\varepsilon_u$, is given by the constant $\theta$. When $\theta = 0$, these represent linear (risk-neutral) preferences, whereas when $\theta = 1$, we have log preferences. As $\theta \to \infty$, these preferences become infinitely risk-averse, and infinitely unwilling to substitute consumption over time.
More specifically, we now assume that the economy admits a representative consumer with CRRA preferences
$$\int_0^\infty \exp(-(\rho - n)t)\, \frac{c(t)^{1-\theta} - 1}{1 - \theta}\, dt. \tag{8.23}$$
I refer to this model, with labor-augmenting technological change and CRRA preferences as given by (8.23), as the canonical model, since it is the model used in almost all applications with steady growth (unless non-balanced growth is the purpose, as will be discussed in some of the structural change models below). Clearly, the Euler equation in this case takes the simpler form:
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\theta}(r(t) - \rho). \tag{8.24}$$
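Let us define normalized consumption as
$$\tilde{c}(t) \equiv \frac{C(t)}{A(t) L(t)} \equiv \frac{c(t)}{A(t)}.$$
We will see that this normalized consumption level will remain constant along the BGP. In particular, we have
$$\frac{\dot{\tilde{c}}(t)}{\tilde{c}(t)} = \frac{\dot{c}(t)}{c(t)} - g = \frac{1}{\theta}(r(t) - \rho - \theta g).$$
The transversality condition, in turn, can be expressed as
$$\lim_{t\to\infty} \left\{\tilde{k}(t) \exp\left(-\int_0^t \left[f'(\tilde{k}(s)) - g - \delta - n\right] ds\right)\right\} = 0,$$
and the equilibrium interest rate is
$$r(t) = f'(\tilde{k}(t)) - \delta.$$
Since in steady state $\tilde{c}(t)$ must remain constant, therefore
$$r(t) = \rho + \theta g \tag{8.25}$$
or
$$f'(\tilde{k}^*) = \rho + \delta + \theta g, \tag{8.26}$$
which pins down the steady-state value of the normalized capital ratio $\tilde{k}^*$ uniquely, in a way similar to the model without technological progress. The level of normalized consumption is then given by
$$\tilde{c}^* = f(\tilde{k}^*) - (n + g + \delta)\, \tilde{k}^*. \tag{8.27}$$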
which can only be the case if the integral within the exponent goes to zero, i.e., if
(1 ) g n > 0, or alternatively if the following assumption is satisfied:
Assumption 11
n > (1 ) g.
Note that this assumption strengthens Assumption 10 when < 1. Alternatively, recall
that in steady state we have r = + g and the growth rate of output is g + n. Therefore,
Assumption 11 is equivalent to requiring that r > g + n. We will encounter conditions like
this all throughout, and they will also be related to issues of dynamic eciency as we will
see below.
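The equivalence between Assumption 11 and the requirement that the steady-state interest rate exceed the growth rate of output can be verified on a small grid of parameter values. This is only an illustrative sketch; the function names and the grid are hypothetical.

```python
def assumption_11_holds(rho, n, g, theta):
    """rho - n > (1 - theta) * g."""
    return rho - n > (1.0 - theta) * g

def steady_state_interest_rate(rho, g, theta):
    """r = rho + theta * g along the balanced growth path."""
    return rho + theta * g

n = 0.01
checks = []
for rho in (0.025, 0.05):
    for g in (0.0, 0.02, 0.04):
        for theta in (0.5, 1.0, 3.0):
            by_assumption = assumption_11_holds(rho, n, g, theta)
            by_interest_rate = steady_state_interest_rate(rho, g, theta) > g + n
            checks.append((by_assumption, by_interest_rate))

all_agree = all(a == b for a, b in checks)
some_violation = any(not a for a, _ in checks)
```

The grid includes a case with a low elasticity of marginal utility and fast growth in which the assumption fails, which shows that when `theta < 1` the condition is more demanding than Assumption 10.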
For now, we have the following immediate generalization of Proposition 18:
Interestingly, the result that the steady-state capital-labor ratio was independent of preferences no longer holds, since now $\tilde{k}^*$ given by (8.26) depends on the elasticity of marginal utility (or the inverse of the intertemporal elasticity of substitution), $\theta$. The reason for this is that there is now growth, so the willingness of individuals to substitute consumption today for consumption tomorrow determines how much they will accumulate and thus the equilibrium capital to effective labor ratio.
A similar analysis to before also leads to an immediate generalization of Proposition 19, which is stated here. The proof is left as a homework exercise, but the next figure already gives the sketch.
Proposition 21 Consider the neoclassical growth model with labor-augmenting technological progress at the rate $g$ and preferences given by (8.23). Suppose that Assumptions 1, 2, 9 and 11 hold. Then there exists a unique equilibrium path of normalized capital and consumption, $(\tilde{k}(t), \tilde{c}(t))$, converging to the unique steady state $(\tilde{k}^*, \tilde{c}^*)$ with $\tilde{k}^*$ given by (8.26). Moreover, if $\tilde{k}(0) < \tilde{k}^*$, then $\tilde{k}(t) \uparrow \tilde{k}^*$ and $\tilde{c}(t) \uparrow \tilde{c}^*$, whereas if $\tilde{k}(0) > \tilde{k}^*$, then $\tilde{k}(t) \downarrow \tilde{k}^*$ and $\tilde{c}(t) \downarrow \tilde{c}^*$.
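Consider the Cobb-Douglas case
$$f(\tilde{k}) = \tilde{k}^{\alpha},$$
so that the Euler equation and the capital accumulation equation become
$$\frac{\dot{\tilde{c}}}{\tilde{c}} = \frac{1}{\theta}\left(\alpha \tilde{k}^{\alpha - 1} - \delta - \rho - \theta g\right),$$
$$\frac{\dot{\tilde{k}}}{\tilde{k}} = \tilde{k}^{\alpha - 1} - \delta - g - n - \frac{\tilde{c}}{\tilde{k}}.$$
Now define $z \equiv \tilde{c}/\tilde{k}$ and $x \equiv \tilde{k}^{\alpha - 1}$, which implies that $\dot{x}/x = (\alpha - 1)\, \dot{\tilde{k}}/\tilde{k}$. Therefore, these two equations can be written as
$$\frac{\dot{x}}{x} = -(1 - \alpha)(x - \delta - g - n - z) \tag{8.28}$$
and
$$\frac{\dot{z}}{z} = \frac{\dot{\tilde{c}}}{\tilde{c}} - \frac{\dot{\tilde{k}}}{\tilde{k}},$$
thus
$$\frac{\dot{z}}{z} = \frac{1}{\theta}(\alpha x - \delta - \rho - \theta g) - x + \delta + g + n + z = \frac{1}{\theta}\left((\alpha - \theta) x - (1 - \theta)\delta - \rho + \theta n\right) + z. \tag{8.29}$$
The two differential equations (8.28) and (8.29) together with the initial condition $x(0)$ and the transversality condition completely determine the dynamics of the system. In Problem Set 4, you will be asked to complete this example for the special case in which $\theta \to 1$ (i.e., log preferences).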
8.7
In the above model, the rates of growth of per capita consumption and output are determined exogenously, by the growth rate of labor-augmenting technological progress. The level of income, on the other hand, depends on preferences, in particular, on the intertemporal elasticity of substitution, $1/\theta$, the discount rate, $\rho$, the depreciation rate, $\delta$, the population growth rate, $n$, and naturally the form of the production function $f(\cdot)$.
If we were to go back to the proximate causes of differences in income per capita or growth across countries, this model would give us a way of understanding those differences only in terms of preference and technology parameters.
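$$r(t) = (1 - \tau)\left(f'(\tilde{k}(t)) - \delta\right).$$
This implies
$$\frac{\dot{\tilde{c}}(t)}{\tilde{c}(t)} = \frac{\dot{c}(t)}{c(t)} - g = \frac{1}{\theta}(r(t) - \rho - \theta g) = \frac{1}{\theta}\left((1 - \tau)\left(f'(\tilde{k}(t)) - \delta\right) - \rho - \theta g\right),$$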
8.8
Quantitative Evaluations
8.8.1
Policy Differences
For a quantitative evaluation of the effect of policy differences, let us follow Jones (1995) and Chari, Kehoe and McGrattan (1997). Imagine that the main policy difference across countries is in terms of a tax structure that affects the relative price of capital goods. Chad Jones uses data from the Summers-Heston data set on the price of investment goods relative to consumption goods and shows that there are large differences in the relative price of capital goods (compared to consumption goods); he also shows that a high relative price of capital goods is associated with low growth over the postwar period. This has led a number of economists, for example, Chari, Kehoe and McGrattan (1997) or Parente and Prescott (1994), to argue that a major difference across countries is the extent of distortions arising from taxes, corruption or other policy differences, which affect the relative price of capital. Although this is a plausible starting point, once you look at the data, the differences in the relative price come not from the fact that investment goods are much more expensive in some countries, but from the fact that consumption goods are cheaper. We will discuss this later below, but for now let us stick with the traditional approach and think of differential policies affecting the relative price of capital.
Suppose that all countries admit a representative consumer with identical preferences,
$$\int_0^\infty \exp(-\rho t)\, \frac{C_j^{1-\theta} - 1}{1 - \theta}\, dt, \tag{8.30}$$
and that the production function is
$$Y_j = K_j^{1-\alpha} (A H_j)^{\alpha}, \tag{8.31}$$
with $H_j$ representing the exogenously given stock of effective labor (human capital). The accumulation equation is
$$\dot{K}_j = I_j - \delta K_j.$$
The only difference across countries is in the budget constraint for the representative consumer, which takes the form
$$(1 + \tau_j)\, I_j + C_j \leq Y_j, \tag{8.32}$$
where $\tau_j$ is the tax on investment. This tax varies across countries, for example because of policies or differences in institutions/property rights enforcement. Notice that $1 + \tau_j$ is also the relative price of investment goods (relative to consumption goods): one unit of consumption goods can only be transformed into $1/(1 + \tau_j)$ units of investment goods.
Note that the right-hand side variable of (8.32) is still $Y_j$, which implicitly assumes that $\tau_j I_j$ is wasted, rather than simply redistributed to some other agents in the economy. This is without any major consequence, since, as noted in Theorem 9 above, CRRA preferences as in (8.30) have the nice feature that they can be exactly aggregated across individuals, so we do not have to worry about the distribution of income in the economy.
The Euler equation of the representative consumer in country $j$ is
$$\frac{\dot{C}_j}{C_j} = \frac{1}{\theta}\left(\frac{1 - \alpha}{1 + \tau_j}\left(\frac{A H_j}{K_j}\right)^{\alpha} - \delta - \rho\right).$$
Consider the steady state. Because $A$ is assumed to be constant, the steady state corresponds to $\dot{C}_j/C_j = 0$. (Alternatively, we could have $A$ growing at a constant rate and $\dot{C}_j/C_j$ equal to the growth rate of $A$.) This immediately implies that
$$K_j = \frac{(1 - \alpha)^{1/\alpha} A H_j}{\left[(1 + \tau_j)(\rho + \delta)\right]^{1/\alpha}}.$$
So countries with higher taxes on investment will have a lower capital stock in steady state. Equivalently, they will also have lower capital per worker, or a lower capital-output ratio (using (8.31) the capital-output ratio is simply $K/Y = (K/AH)^{\alpha}$).
Now substituting this into (8.31), and comparing two countries with different taxes (but the same human capital), we obtain the relative incomes as
$$\frac{Y(\tau)}{Y(\tau')} = \left(\frac{1 + \tau'}{1 + \tau}\right)^{\frac{1 - \alpha}{\alpha}}. \tag{8.33}$$
So countries that tax investment, either directly or indirectly, at a higher rate will be poorer. The advantage of using the neoclassical growth model for quantitative evaluation relative to the Solow growth model is that the extent to which different types of distortions (here captured by the tax rates on investment) will affect income and capital accumulation is determined endogenously. In contrast, in the Solow growth model, what matters is the savings rate, so we would need other evidence to link taxes or distortions to savings (or to other determinants of income per capita such as technology).
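To get a feel for the magnitudes implied by the relative income formula, the sketch below evaluates the ratio of steady-state incomes between a country with no investment tax and a country in which the relative price of investment goods is eight times higher. The eightfold gap and the two values of `alpha` are assumptions chosen for illustration; the function name is hypothetical.

```python
def relative_income(tau, tau_prime, alpha):
    """Y(tau) / Y(tau') when Y = K**(1 - alpha) * (A * H)**alpha."""
    return ((1.0 + tau_prime) / (1.0 + tau)) ** ((1.0 - alpha) / alpha)

# Capital share of 1/3 (alpha = 2/3): the exponent is 1/2.
gap_small_capital_share = relative_income(0.0, 7.0, alpha=2.0 / 3.0)

# Share of accumulable factors of 2/3 (alpha = 1/3): the exponent is 2.
gap_large_capital_share = relative_income(0.0, 7.0, alpha=1.0 / 3.0)
```

The same eightfold distortion translates into an income gap of under threefold when the capital share is one third, and into a 64-fold gap when accumulable factors receive two thirds of income, so the quantitative answer hinges on the exponent `(1 - alpha) / alpha`.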
8.8.2
Extensions
Basically, these authors start from a one-sector model, and try to generate large responses
to distortions. But there is a constraint in this exercise: the share of capital in GDP is
$$(1 + \tau_j)(I_j + X_j) + C_j \leq Y_j$$
and
$$\dot{H}_j = X_j - \delta H_j,$$
where $X$ denotes investment in human capital. With this reasoning, and using the numbers implied by Mankiw, Romer and Weil's regression analysis discussed above, they take $\alpha = 1/3$ (or they take the share of accumulable factors in GDP to be 2/3). In this case, (8.33) implies income differences as large as 64-fold. They therefore conclude that the augmented Solow model is capable of explaining income differences across countries quantitatively based on distortions on investment.
However, this conclusion is subject to exactly the same caveats as Mankiw, Romer and Weil's analysis. A share of 2/3 for the accumulable factors in GDP is too high, and implies implausibly large effects of education on income, as pointed out above.
8.9
Parente-Prescott argue that the simple neoclassical model is not sufficient to account for the large differences in income per capita across countries. Consistent with the evidence presented above, they suggest that we have to take differences in technology into account. Their approach for technology differences, however, is very similar to the tax-type distortions affecting physical capital (and human capital) decisions in the neoclassical model. In particular, they argue that technology differences arise because there are barriers to technology adoption, inducing economies with worse distortions not to adopt superior technologies. Essentially, this explanation turns technology into an accumulable factor, with a neoclassical production function. Consequently, even though Parente-Prescott argue that their model is different from the neoclassical model, it is really a variant of it, and I will treat it that way here.
Therefore, this explanation circumvents the problems of the neoclassical models without being forced to increase the share of capital in national income. Moreover, the Parente-Prescott formulation does this while keeping exogenous growth. However, there is a sense in which what is being done here is to add a degree of freedom, and to interpret technology as part of a broader concept of capital.
Here is a very simple version of their model. Suppose that output is given by
Yt = At Nt
where $A_t$ is technology/knowledge which will be accumulated endogenously. Each firm can at most employ $\bar{N}$ workers (so there will be $N_t/\bar{N}$ firms in this economy). This limit on firm-level employment is imposed because the production technology exhibits increasing returns to scale, and otherwise all workers would be employed in one firm.
Xt =
At+1
At
S
Tt
dS
Intuitively, each incremental improvement between At and At+1 costs an amount that depends on the distance of this improvement to the frontier technology, and also a shift parameter . As before, a high level of corresponds to better technology for absorbing
world knowledge, and a low level of corresponds to significant distortions in the process of
technology adoption.
Solving this integral, we obtain
1/
Xt =
1/
At+1 At
(1)/
(1/) Tt
Zt
At
(1)/
Tt
as the effective knowledge stock. Then, we have the law of motion of this knowledge stock
1/
(1)/
as Zt+1 At+1 / (1 + g) Tt
. So
Zt+1 = 0 Zt + Xt
as the law of motion, where 0 is a constant. This equation makes it clear that the modeling
ytj
Ztj
.
0 =
0
ytj
Ztj
Will differences in the barriers to technology adoption now lead to large output differences? The answer depends on the share parameter on knowledge in production. If it is small, as in the neoclassical model with a small share of capital in national product, there will only be quantitatively small differences in output per capita across countries. However, if it is large, for example 0.7, this economy will behave similarly to the neoclassical model with a capital share equal to 2/3.
Therefore the basic difference between this model and the standard neoclassical model is that here instead of capital, we have knowledge being accumulated, and its share parameter is assumed to be large so that knowledge is taken to be very important in production (e.g., corresponding to a large share of payments to technology in GDP, if everything was priced according to marginal product, but because of the increasing returns to scale this is not the case).
In other words, by introducing the knowledge stock and increasing returns to scale, Parente and Prescott take us back to a production function of the form $Y = K^{0.7} L^{0.3}$ but with $K$ replaced by $Z$. As a result, they obtain significantly larger effects of distortions on income than implied by the neoclassical production function with $Y = K^{0.3} L^{0.7}$.
Although the explanation is plausible, the model does not generate further insights than the statement that distortions lead to lower input use, and knowledge is just another input. As we will see below, many models of endogenous technological progress will have a similar flavor of technology being accumulated by purposeful investments, but they will allow for
Chapter 9
Growth with Overlapping Generations
The models analyzed so far were assumed to admit a representative household or consumer. These models are useful in providing us a tractable framework for the analysis of capital accumulation and neoclassical growth. Moreover, they had the nice feature that the competitive equilibrium coincided with the natural Pareto optimal allocation. In many situations, however, the assumption of a representative household is not appropriate. This was already discussed above. But one specific set of circumstances where we have to depart from this assumption is the one where we look at an economy in which new households (individuals) arrive over time. These models, first analyzed by Paul Samuelson and then later Peter Diamond, are referred to as overlapping generations models, since different generations are born at different points in time.
For economic growth, these models are useful, first, because they provide a tractable alternative to the infinite-horizon representative agent models, second, because they have some very different implications, and third, because they allow a discussion of national debt and Social Security type issues.
9.1
Problems of Infinity
9.2
9.2.1
In this economy, time is discrete and runs to infinity. Each individual lives for two periods.
For example, all individuals born at time t live for dates t and t + 1. For now let us assume
$$U(t) = u(c_1(t)) + \beta u(c_2(t+1)), \tag{9.1}$$
where $u(\cdot)$ satisfies the conditions in Assumption 9, $c_1(t)$ denotes the consumption of the individual born at time $t$ during the first period of his life (which is at date $t$), and $c_2(t+1)$ is the consumption during the second period of his life (at date $t+1$). Also $\beta \in (0, 1)$ is the discount factor.
Individuals can only work in the first period of their lives, and supply one unit of labor
inelastically, earning the equilibrium wage rate w (t).
Let us also assume that there is population growth, so that total population is
L (t) = (1 + n)t L (0) .
The production side of the economy is the same as before, characterized by a set of
competitive firms, and represented by the standard constant returns to scale aggregate production function, satisfying Assumptions 1 and 2. Also assume that capital fully depreciates
after being used. As a result, we have that the rate of return to saving equals the rental rate
of capital, i.e.,
$$r(t) = R(t) = f'(k(t)), \tag{9.2}$$
where $f(k)$ is the standard per capita production function described above, and the wage
rate is
$$w(t) = f(k(t)) - k(t) f'(k(t)). \tag{9.3}$$
9.2.2
Consumption Decisions
Let us start with the individual consumption decisions. Denoting savings by $s(t)$, these are a
solution to the following maximization problem:
$$\max_{c_1(t),\,c_2(t+1),\,s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$
subject to
$$c_1(t) + s(t) \leq w(t)$$
and
$$c_2(t+1) \leq R(t+1)\,s(t),$$
where we are using the convention that old individuals rent their savings of time $t$ as capital
to firms at time $t+1$, and so receive the gross rate of return $R(t+1)$. The second constraint
incorporates the notion that individuals will only spend money on their own end-of-life
consumption (there is no consumption term for descendants etc.). I have not imposed the
constraint that $s(t) \geq 0$, since with negative savings, individuals would violate their second-period budget constraint (given non-negativity of consumption).
It is clear that both constraints will hold as equalities given that $u(\cdot)$ is strictly increasing.
Then the first-order condition for a maximum implies:
$$u'(c_1(t)) = \beta R(t+1)\, u'(c_2(t+1)). \tag{9.4}$$
Solving these equations for consumption and thus for savings, we also obtain the following
implicit solution for savings:
$$s(t) = s(w(t), R(t+1)). \tag{9.5}$$
The function $s(\cdot,\cdot)$ is increasing in its first argument, and may be increasing or decreasing
in its second argument.
9.2.3
Equilibrium
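Since capital depreciates fully, the capital stock at $t+1$ equals the total savings of the young at $t$, that is, $K(t+1) = L(t)\,s(w(t), R(t+1))$. Dividing by $L(t+1) = (1+n)L(t)$ gives
$$k(t+1) = \frac{s(w(t), R(t+1))}{1+n},$$
or substituting for $R(t+1)$ and $w(t)$ from (9.2) and (9.3), we obtain
$$k(t+1) = \frac{s\big(f(k(t)) - k(t) f'(k(t)),\; f'(k(t+1))\big)}{1+n} \tag{9.6}$$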
as the fundamental law of motion of the overlapping generations economy. A steady state is
given by a solution to this equation such that $k(t+1) = k(t) = k^*$, i.e.,
$$k^* = \frac{s\big(f(k^*) - k^* f'(k^*),\; f'(k^*)\big)}{1+n}. \tag{9.7}$$
Since the savings function $s(\cdot,\cdot)$ can take essentially any form, the difference equation
(9.6) can lead to quite complicated dynamics, and multiple steady states are possible. The
next figure shows some potential plots of equation (9.6), which can lead to a unique
stable equilibrium, to multiple equilibria, or to an equilibrium with zero capital stock.
9.2.4
To get more insights, let us now specialize the above setup by assuming CRRA utility
functions, in particular:
$$U(t) = \frac{c_1(t)^{1-\theta} - 1}{1-\theta} + \beta\, \frac{c_2(t+1)^{1-\theta} - 1}{1-\theta}, \tag{9.8}$$
where $\theta > 0$ and $\beta \in (0,1)$. Furthermore, assume that technology is Cobb-Douglas, so that
$$f(k) = k^{\alpha}.$$
Everything else is the same as above. This simplifies the first-order condition for consumer
optimization and implies
$$\frac{c_2(t+1)}{c_1(t)} = (\beta R(t+1))^{1/\theta}. \tag{9.9}$$
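Combining (9.9) with the two budget constraints, which hold as equalities, gives the saving function
$$s(t) = \frac{w(t)}{\psi(t+1)}, \tag{9.10}$$
where
$$\psi(t+1) \equiv \left[1 + \beta^{-1/\theta} R(t+1)^{-(1-\theta)/\theta}\right] > 1,$$
ensuring that savings are always less than earnings. The relationship between savings
and factor prices is given by
$$s_w \equiv \frac{\partial s(t)}{\partial w(t)} = \frac{1}{\psi(t+1)},$$
$$s_r \equiv \frac{\partial s(t)}{\partial R(t+1)} = \left(\frac{1-\theta}{\theta}\right) (\beta R(t+1))^{-1/\theta}\, \frac{s(t)}{\psi(t+1)}.$$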
Note that $0 < s_w < 1$. Moreover, $s_r > 0$ if $\theta < 1$, but $s_r < 0$ if $\theta > 1$, and $s_r = 0$ if $\theta = 1$.
The relationship between the rate of return on savings and the level of savings reflects the
counteracting influences of income and substitution effects you are familiar with from basic
micro. The case of $\theta = 1$, i.e., log preferences, is of special importance and is often used in
applied models. With log preferences, income and substitution effects exactly cancel
each other, and thus changes in the interest rate (and therefore changes in the capital-labor
ratio of the economy) have no effect on the savings rate.
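The comparative statics of the saving rate can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the CRRA saving function with the bracketed term written as `psi`, and the parameter values are purely illustrative.

```python
def saving_rate(R, beta, theta):
    """Share of the wage saved under CRRA preferences: s / w = 1 / psi."""
    # psi = 1 + beta^(-1/theta) * R^(-(1-theta)/theta), which always exceeds 1.
    psi = 1.0 + beta ** (-1.0 / theta) * R ** (-(1.0 - theta) / theta)
    return 1.0 / psi


beta = 0.5  # illustrative discount factor (an assumption, not from the notes)
for theta in (0.5, 1.0, 2.0):
    low_R = saving_rate(1.0, beta, theta)
    high_R = saving_rate(2.0, beta, theta)
    print(f"theta={theta}: s/w at R=1 is {low_R:.4f}, at R=2 is {high_R:.4f}")
```

With these illustrative numbers the saving rate rises with R when theta is below one, falls when theta is above one, and is unchanged in the log case.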
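Now equation (9.6) implies
$$k(t+1) = \frac{s(t)}{1+n} = \frac{w(t)}{(1+n)\,\psi(t+1)}, \tag{9.11}$$
or more explicitly,
$$k(t+1) = \frac{f(k(t)) - k(t) f'(k(t))}{(1+n)\left[1 + \beta^{-1/\theta} f'(k(t+1))^{-(1-\theta)/\theta}\right]}. \tag{9.12}$$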
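The steady state then involves a solution to the following implicit equation:
$$k^* = \frac{f(k^*) - k^* f'(k^*)}{(1+n)\left[1 + \beta^{-1/\theta} f'(k^*)^{-(1-\theta)/\theta}\right]}. \tag{9.13}$$
Now using the Cobb-Douglas formula, we have that the steady state is the solution to the
equation
$$(1+n)\left[1 + \beta^{-1/\theta}\left(\alpha (k^*)^{\alpha-1}\right)^{(\theta-1)/\theta}\right] = (1-\alpha)(k^*)^{\alpha-1}. \tag{9.14}$$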
The steady-state value of $R$, and thus of $k^*$, can now be determined from equation (9.14), which
always has a unique solution. We can next investigate the stability of this steady state. To
do this, substitute for the Cobb-Douglas production function in (9.12):
$$k(t+1) = \frac{(1-\alpha)\,k(t)^{\alpha}}{(1+n)\left[1 + \beta^{-1/\theta}\left(\alpha\, k(t+1)^{\alpha-1}\right)^{-(1-\theta)/\theta}\right]}. \tag{9.15}$$
Now using (9.15), the following proposition can be proved (proof left for Problem Set 5):
Proposition 25 In the overlapping-generations model with two-period lived agents, Cobb-Douglas technology and CRRA preferences, there exists a unique steady-state equilibrium with
the capital-labor ratio $k^*$ given by (9.13) and, as long as $\theta \geq 1$, this steady-state equilibrium
is globally stable for all $k(0) > 0$.
The next figure shows the dynamics diagrammatically in this particular (well-behaved)
case, which look very similar to the dynamics of the basic Solow model:
Figure 9.1:
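The convergence described in Proposition 25 can be illustrated numerically. The sketch below is not from the notes. It assumes the Cobb-Douglas, CRRA law of motion, solves the implicit equation for $k(t+1)$ by bisection, and uses illustrative parameter values; the function names are hypothetical.

```python
def next_k(k, alpha, beta, theta, n):
    """Solve (1+n) * x * psi(x) = (1-alpha) * k**alpha for x = k(t+1)."""
    wage = (1.0 - alpha) * k ** alpha

    def excess(x):
        # psi(x) uses the rental rate f'(x) = alpha * x**(alpha - 1).
        rental = alpha * x ** (alpha - 1.0)
        psi = 1.0 + beta ** (-1.0 / theta) * rental ** ((theta - 1.0) / theta)
        return (1.0 + n) * x * psi - wage

    # excess is increasing in x and positive at hi because psi > 1.
    lo, hi = 0.0, wage / (1.0 + n)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def simulate(k0, periods, alpha=0.3, beta=0.6, theta=1.0, n=0.01):
    """Iterate the law of motion from k0 and return the whole path."""
    path = [k0]
    for _ in range(periods):
        path.append(next_k(path[-1], alpha, beta, theta, n))
    return path


if __name__ == "__main__":
    for k0 in (0.01, 1.0):
        print(k0, simulate(k0, 50)[-1])
```

Starting from a low and a high initial capital-labor ratio, both paths settle at the same value, which in the log case can be compared with the closed form $[\beta(1-\alpha)/((1+n)(1+\beta))]^{1/(1-\alpha)}$.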
9.2.5
Pareto Optimality
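Let us now return to the general problem, and compare the overlapping-generations equilibrium to the choice of a social planner wishing to maximize a weighted average of all
generations' utilities. In particular, the social planner in question maximizes
$$\sum_{t=0}^{\infty} \beta_S^t\, U(t),$$
where $\beta_S$ is the discount factor of the social planner. Substituting from (9.1), this implies:
$$\sum_{t=0}^{\infty} \beta_S^t \left( u(c_1(t)) + \beta u(c_2(t+1)) \right)$$
subject to the resource constraint
$$f(k(t)) = (1+n)k(t+1) + c_1(t) + \frac{c_2(t)}{1+n}.$$
The first-order conditions of this problem imply that the steady-state capital-labor ratio chosen by the planner, $k^S$, satisfies
$$f'(k^S) = \frac{1+n}{\beta_S},$$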
which is similar to the modified golden rule we saw in the context of the Ramsey growth
model. In particular, it does not depend on preferences (the utility function $u(\cdot)$) and
does not even depend on the individual rate of time preference, $\beta$. Clearly, $k^S$ is typically
different from the steady-state value of the competitive economy, $k^*$, given by (9.7), which
is not surprising given the different preferences that are being maximized.
More interesting is the question of whether the competitive equilibrium is Pareto optimal.
The example from Shell in the previous section suggests that it may not be. In particular,
exactly as in Shell's example, we cannot use the First Welfare Theorem (Theorem 10)
because of the infinite number of commodities.
In fact, the competitive equilibrium is not in general Pareto optimal. The simplest way
of seeing this is that the steady-state level of the capital stock, $k^*$, given by (9.7), can be so high
that it is in fact greater than $k_{gold}$; that is, the economy is to the right of the golden rule,
so that by reducing savings, consumption can increase for every generation.
More specifically, note that in steady state we have
$$f(k^*) - (1+n)k^* = c_1^* + (1+n)^{-1} c_2^* \equiv c^*,$$
where the first equality follows by the accounting identity, and the second defines $c^*$ as the total
steady-state consumption. Therefore
$$\frac{\partial c^*}{\partial k^*} = f'(k^*) - (1+n).$$
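A small numerical check makes the possibility of overaccumulation concrete. This is a sketch under assumptions that go beyond the general case: log preferences and Cobb-Douglas technology with full depreciation, for which the steady state has the closed form used below, and the golden rule level solves $f'(k) = 1+n$. Function names and parameter values are illustrative.

```python
def steady_state_log(alpha, beta, n):
    """Steady-state k* with log utility and f(k) = k**alpha."""
    return (beta * (1.0 - alpha) / ((1.0 + n) * (1.0 + beta))) ** (1.0 / (1.0 - alpha))


def golden_rule(alpha, n):
    """Capital-labor ratio at which f'(k) = 1 + n."""
    return (alpha / (1.0 + n)) ** (1.0 / (1.0 - alpha))


def steady_consumption(k, alpha, n):
    """Total steady-state consumption c* = f(k) - (1+n) k."""
    return k ** alpha - (1.0 + n) * k


def overaccumulates(alpha, beta, n):
    """True when the competitive steady state lies to the right of the golden rule."""
    return steady_state_log(alpha, beta, n) > golden_rule(alpha, n)


if __name__ == "__main__":
    print(overaccumulates(alpha=0.2, beta=0.9, n=0.0))
    print(overaccumulates(alpha=1.0 / 3.0, beta=0.5, n=0.0))
```

In this special case overaccumulation occurs exactly when $\beta(1-\alpha)/(1+\beta) > \alpha$, so a low capital share combined with patient households pushes the economy past the golden rule.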
9.3
We now briefly discuss how Social Security can be introduced as a way of dealing with
overaccumulation in the overlapping-generations model. We will first consider a
fully-funded system, in which the young make contributions to Social Security and their
contributions are paid back to them in their old age. The alternative is an unfunded system,
or a pay-as-you-go Social Security system, where transfers from the young go directly to the
current old.
9.3.1
In a fully funded system, the government at date $t$ raises some amount $d(t)$ from the young
(by compulsory contributions to their Social Security accounts etc.), invests this in
the only productive asset of the economy, the capital stock, and pays the workers $R(t+1)\,d(t)$ when they
are old. The individual maximization problem then becomes
$$\max_{c_1(t),\,c_2(t+1),\,s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$
subject to
$$c_1(t) + s(t) + d(t) \leq w(t)$$
and
$$c_2(t+1) \leq R(t+1)\,(s(t) + d(t)),$$
for a given choice of d (t) by the government. Notice that now the total amount invested in
capital accumulation is s (t) + d (t) = (1 + n) k (t + 1).
It is also no longer the case that individuals will always choose $s(t) > 0$, since they
have the income from Social Security. Therefore this economy can be analyzed under two
alternative assumptions, with the constraint that $s(t) \geq 0$ and without.
It is clear that as long as $s(t)$ is free, whatever the sequence of Social Security payments
$\{d(t)\}_{t=0}^{\infty}$ (as long as it is feasible), the competitive equilibrium applies. When $s(t) \geq 0$
is imposed as a constraint, the competitive equilibrium applies if, given the sequence
$\{d(t)\}_{t=0}^{\infty}$, the privately optimal saving sequence $\{s(t)\}_{t=0}^{\infty}$ is such that $s(t) > 0$ for all $t$.
s (t) > 0 for all t, then the set of competitive equilibria without Social Security are the
set of competitive equilibria with Social Security.
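The neutrality of fully funded Social Security can be illustrated in a special case. The following sketch assumes log utility (an assumption made here for tractability, not imposed in the general discussion), in which case the desired total of private saving plus contributions is $\beta w/(1+\beta)$ regardless of the rate of return. The function names are hypothetical.

```python
def private_saving(w, d, beta, allow_negative=False):
    """Optimal s(t) with log utility when the young also contribute d."""
    target_total = beta * w / (1.0 + beta)  # desired s + d
    s = target_total - d
    if not allow_negative:
        s = max(0.0, s)  # the constraint s(t) >= 0 may bind
    return s


def total_investment(w, d, beta, allow_negative=False):
    """Total amount invested in capital per young worker: s + d."""
    return private_saving(w, d, beta, allow_negative) + d


if __name__ == "__main__":
    w, beta = 1.0, 0.8
    for d in (0.0, 0.2, 0.6):
        print(d, total_investment(w, d, beta))
```

As long as the contribution is below the desired total, private saving falls one for one and total investment is unchanged. Only when the non-negativity constraint binds does the funded system raise capital accumulation.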
9.3.2
The situation is very different with unfunded Social Security. Now the government collects $d(t)$ from the young at time $t$ and distributes this to the current old with
per capita transfer $b(t) = (1+n)\,d(t)$ (which takes into account that there are more young
than old because of population growth). Therefore, the individual maximization problem
becomes
$$\max_{c_1(t),\,c_2(t+1),\,s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$
subject to
$$c_1(t) + s(t) + d(t) \leq w(t)$$
and
$$c_2(t+1) \leq R(t+1)\,s(t) + (1+n)\,d(t+1),$$
for a given feasible sequence of Social Security payment levels $\{d(t)\}_{t=0}^{\infty}$.
What this implies is that the rate of return on Social Security payments is 1 + n rather
than R (t + 1), because unfunded Social Security is a pure transfer system. Only s (t) goes
into capital accumulation. Therefore, intuitively we expect unfunded Social Security to
Chapter 10
Recitation Material: Stochastic
Growth
10.1
The classic analysis of economic growth with stochastic shocks was undertaken by Brock
and Mirman in their 1972 paper. This was done in the context of optimal growth. However,
if the economy admits a representative household, it turns out that despite the stochastic
shocks, the First and Second Welfare Theorems still hold, so equilibrium growth is the
same as optimal growth. In fact, the Brock-Mirman model is the starting point of the Real
Business Cycle models you will study later. For now, it suffices to note that this model, for
all practical purposes, is identical to the non-stochastic model, except that we have to think
of expectations. In particular, it is a solution to the following program:
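$$\max_{\{c(t),\,k(t)\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t u(c(t)) \tag{10.1}$$
subject to
$$k(t+1) = A(t) f(k(t)) + (1-\delta)k(t) - c(t), \tag{10.2}$$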
with given $k(0)$. Here $E_0$ is the expectations operator conditional on information available
at time $t = 0$. The budget constraint with the production function substituted in, equation
(10.2), requires some care in interpretation. $A(t)$ is now introduced as a stochastic productivity term. The expectations are taken because the time path of the sequence $\{A(t)\}_{t=0}^{\infty}$ is
not known in advance. This implies that strategies have to satisfy the proper measurability
conditions. In general we can do this by assuming that information at time $t$ is
represented by a partition $\mathcal{F}_t$, so that $E_t[x] = E[x \mid \mathcal{F}_t]$, and variables chosen at time $t$ have
to be measurable with respect to $\mathcal{F}_t$. This simply means that they cannot be conditioned
on realizations of future-dated stochastic variables.
The above model can be enriched by assuming that there are stochastic preference shocks,
for example by augmenting the utility function u (c (t)) by a shock b (t), so that u (c (t) | b (t))
is also a random function dependent on the realization of b (t).
In addition, analysis of growth under uncertainty makes the standard assumptions on
the production function as in Assumptions 1 and 2 above, and the standard assumption on
preferences as in Assumption 8.
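To fix ideas, the stochastic growth program can be simulated in the one special case with a well-known closed-form solution. The sketch below is not part of the notes: it assumes log utility, $f(k) = k^{\alpha}$, full depreciation, and i.i.d. lognormal productivity shocks, in which case the optimal rule is to invest the constant share $\alpha\beta$ of output. All names and parameter values are illustrative.

```python
import random


def policy(k, A, alpha, beta):
    """Closed-form rule for log utility, f(k) = k**alpha and full depreciation."""
    output = A * k ** alpha
    k_next = alpha * beta * output       # invest the share alpha * beta
    return output - k_next, k_next       # (consumption, next-period capital)


def simulate(k0, periods, alpha=0.3, beta=0.95, sigma=0.1, seed=0):
    """Draw productivity shocks and apply the policy rule period by period."""
    rng = random.Random(seed)
    k, path = k0, []
    for _ in range(periods):
        A = rng.lognormvariate(0.0, sigma)  # productivity shock, known at time t
        c, k_next = policy(k, A, alpha, beta)
        path.append((k, A, c))
        k = k_next
    return path


if __name__ == "__main__":
    for k, A, c in simulate(0.1, 5):
        print(f"k={k:.4f}  A={A:.4f}  c={c:.4f}")
```

Note that consumption at time $t$ depends only on $k(t)$ and the current shock, never on future shocks, which is exactly the measurability requirement discussed above.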
Given this setup, the problem can again be written as a dynamic programming problem,
but now it is a stochastic dynamic programming problem. In particular, it takes the form
$$V(k) = \max_{c} \left\{ u(c) + \beta E\, V\!\left[A f(k) + (1-\delta)k - c\right] \right\}, \tag{10.3}$$
where the expectation is included because there is uncertainty about future values of the
stochastic variable A. The rest of the analysis is very similar to the non-stochastic case,
except that the Euler equations also include expectations. For example, assuming that A (t)
10.2
I now present the model from Acemoglu-Zilibotti (JPE 1997) aimed at capturing the interaction between diversification of risks and capital accumulation, and emphasizing the
endogenous generation of risks in the growth process. This model will give an example of
stochastic growth and also illustrate how the productivity of capital can change endogenously over the development process, and differ across countries. Finally, this model will
also introduce some tools that are useful for analyses of dynamic stochastic economies.
10.2.1
The Environment
Consider the following model. There is a continuum of equally likely states represented by
the unit interval. Agents have to invest their savings in intermediate sectors, which will then
pay off in the form of capital in the next period. Intermediate sector $j \in [0,1]$ pays a positive
return only in state $j$ and nothing in any other state.
This formulation implies that investing in a sector is equivalent to buying a Basic Arrow
Security that only pays in one state of nature. More formally, an investment of $F^j$ in sector $j$
generates capital of the amount $RF^j$ if state $j$ occurs and $F^j \geq M_j$, and nothing otherwise.
There is also a safe project, which transforms one unit of savings into $r < R$ units of capital.
The requirement $F^j \geq M_j$ implies that all intermediate sectors have linear technologies
but some require a certain minimum size, $M_j$, before being productive. The distribution of
minimum sizes is
$$M_j = \max\left\{0,\; \frac{D}{1-\gamma}\,(j - \gamma)\right\}.$$
Sectors $j \leq \gamma$ have no minimum size requirement, and for the rest of the sectors the minimum
size requirement increases linearly. The next figure shows the minimum size requirements
diagrammatically, and will be used for determining the equilibrium as well once demand for
assets is introduced:
proportional investment $F$ in all projects $j \in \bar{J} \subseteq [0,1]$, and the measure of the set
$\bar{J}$ is $p$, then the portfolio pays the return $RF$ with probability $p$, and nothing with
probability $1 - p$.
$$\log(c_t) + \beta \int_0^1 \log(c^j_{t+1})\, dj, \tag{10.4}$$
which again ensure a constant savings rate. Note that integration over [0,1] is over the states
of nature. The individual life cycle and decisions are summarized in the next figure:
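The capital stock in state $j$ is
$$K^j_{t+1} = \int_{\Omega_t} \left(r\phi_{h,t} + R F^j_{h,t}\right) dh, \tag{10.5}$$
where $F^j_{h,t}$ is the amount invested by agent $h \in \Omega_t$ in sector $j$, $\phi_{h,t}$ is the amount invested in the safe asset, and $\Omega_t$ is
the set of young agents at time $t$. Since both labor and capital trade in competitive markets,
equilibrium factor prices in state $j$ are given as:
$$W^j_{t+1} = (1-\alpha) A \left(K^j_{t+1}\right)^{\alpha} = (1-\alpha) A \left(\int_{\Omega_t} \left(r\phi_{h,t} + R F^j_{h,t}\right) dh\right)^{\alpha}, \tag{10.6}$$
$$\rho^j_{t+1} = \alpha A \left(K^j_{t+1}\right)^{\alpha-1} = \alpha A \left(\int_{\Omega_t} \left(r\phi_{h,t} + R F^j_{h,t}\right) dh\right)^{\alpha-1}. \tag{10.7}$$
10.2.2
Equilibrium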
Now consider the portfolio decisions of households. Each household takes the set of traded
securities as given, and maximizes its utility by allocating its savings across different assets.
Securities are labeled by the indices of the project to which they are attached. Therefore,
one unit of security $j$ entitles its holder to $R$ units of $t+1$ capital in state of nature $j$.
Denote the unit price of security $j$ (in terms of savings of time $t$) by $P_{j,t}$. Assume that the
intermediates are supplied by financial intermediaries. Since 1 unit of savings invested in
a project that is open yields one unit of capital, competition among financial intermediaries
ensures that in equilibrium $P_{j,t} = 1$; that is, all projects will be offered to households at
marginal cost.
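Therefore, denoting the set of open projects at time $t$ by $J_t$, optimal consumption, savings
and portfolio decisions can be characterized by:
$$\max_{s_t,\,\phi_t,\,\{F_t^j\}_{0 \le j \le 1}} \log(c_t) + \beta \int_0^1 \log(c^j_{t+1})\, dj, \tag{10.8}$$
subject to:
$$\phi_t + \int_0^1 F_t^j\, dj = s_t, \tag{10.9}$$
$$c^j_{t+1} = \rho^j_{t+1}\left(r\phi_t + R F_t^j\right), \tag{10.10}$$
$$F_t^j = 0, \quad \forall j \notin J_t, \tag{10.11}$$
$$c_t + s_t \leq w_t. \tag{10.12}$$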
It is important that these agents take as given not only $w_t$ and $\rho^j_{t+1}$, but also the set of risky assets $J_t$.
A static equilibrium given wage earnings of young agents, $W_t$ (or given $K_t$), is a solution
to the maximization problem (10.8) subject to (10.9)-(10.12), such that $F_t^j \geq M_j$ for all
open sectors. A dynamic equilibrium is a sequence of static equilibria linked to each other
through (10.6).
Because preferences are logarithmic, the following saving rule is obtained irrespective of
the risk-return trade-off:
$$s_t \equiv s(w_t) = \frac{\beta}{1+\beta}\, w_t. \tag{10.13}$$
Given this result, a household's optimization problem can be broken into two parts: first,
the amount of savings is determined, and then an optimal portfolio is chosen.
Next observe that in equilibrium we will have
1. $F_t^j = F_t^{j'}$ $\forall j, j'$ that are open (i.e. $j, j' \in J_t$). Since each individual is facing the
same price for all of the traded symmetric Arrow securities, he would want to purchase
an equal amount of each, i.e., a balanced portfolio.
2. The set of open projects will be $J_t = [0, n_t]$ for some $n_t \in [0,1]$. This states that when
only a subset of projects can be opened in equilibrium, small projects are opened
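$$\max_{\phi_t,\,F_t}\; n_t \log\left[\rho^{(q_G)}_{t+1}\left(R F_t + r\phi_t\right)\right] + (1-n_t)\log\left[\rho^{(q_B)}_{t+1}\, r\phi_t\right] \tag{10.14}$$
subject to:
$$\phi_t + n_t F_t = s_t, \tag{10.15}$$
where $n_t$ and the $\rho^j_{t+1}$'s are taken as parametric by the agent, and $s_t$ is given by (10.13). $\rho^{(q_B)}_{t+1} = \alpha (r\phi_t)^{\alpha-1}$ is the marginal product of capital in the bad state, when the realized state
is $j > n_t$ and no risky investment pays off. $\rho^{(q_G)}_{t+1} = \alpha (R F_t + r\phi_t)^{\alpha-1}$ applies in the good state.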
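The solution to this problem is
$$\phi^*_t = \frac{(1-n_t)R}{R - r n_t}\, s^*_t, \tag{10.16}$$
$$F^{j*}_t = \begin{cases} F^*_t \equiv \dfrac{R-r}{R - r n_t}\, s^*_t, & j \leq n_t \\ 0, & j > n_t. \end{cases} \tag{10.17}$$
Then let:
$$n^*_t(K_t) = \begin{cases} \dfrac{(R+\gamma r) - \left\{(R+\gamma r)^2 - 4r\left[(R-r)(1-\gamma)\frac{\Gamma}{D} K_t^{\alpha} + \gamma R\right]\right\}^{1/2}}{2r} & \text{if } K_t \leq D^{1/\alpha}\,\Gamma^{-1/\alpha} \\ 1 & \text{if } K_t > D^{1/\alpha}\,\Gamma^{-1/\alpha}, \end{cases} \tag{10.18}$$
where $\Gamma \equiv \frac{\beta}{1+\beta}(1-\alpha)A$. In equilibrium, $s^*_t = \frac{\beta}{1+\beta}(1-\alpha)A K_t^{\alpha}$, and $\phi^*_t$, $F^{j*}_t$ are given by (10.16) and (10.17) with $n_t = n^*_t(K_t)$.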
This equilibrium can be expressed as the intersection of the aggregate demand of each
risky asset, F (nt ), with the thick curve that traces minimum size requirements in the figure.
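The intersection that pins down the equilibrium number of open sectors can be computed directly. The sketch below is an illustration under stated assumptions rather than the notes' own derivation: it takes the balanced-portfolio demand $F(n) = \frac{R-r}{R-rn}s$, a minimum size schedule that is zero up to a threshold called `gamma` here and then rises linearly to `D`, and savings $s = \Gamma K^{\alpha}$, where `Gamma` is simply a label for the coefficient implied by the log saving rule. Parameter values are illustrative.

```python
import math


def demand(n, s, R, r):
    """Balanced-portfolio demand for each open risky asset."""
    return (R - r) / (R - r * n) * s


def min_size(j, D, gamma):
    """Minimum size requirement of sector j (zero for j <= gamma)."""
    return max(0.0, D * (j - gamma) / (1.0 - gamma))


def n_star(K, R, r, gamma, D, Gamma, alpha):
    """Measure of open sectors: smaller root of demand(n) = min_size(n)."""
    s = Gamma * K ** alpha
    if s >= D:
        return 1.0  # savings cover the largest minimum size, all sectors open
    b = R + gamma * r
    c = (R - r) * (1.0 - gamma) * s / D + gamma * R
    return (b - math.sqrt(b * b - 4.0 * r * c)) / (2.0 * r)


if __name__ == "__main__":
    params = dict(R=2.0, r=1.0, gamma=0.25, D=1.0, Gamma=0.3, alpha=0.35)
    for K in (0.1, 1.0, 10.0, 100.0):
        print(K, round(n_star(K, **params), 4))
```

Richer economies (higher $K$) save more, can cover larger minimum sizes, and therefore open more sectors, until all sectors are open.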
10.2.3
Dynamics
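Next, it is straightforward to characterize the full stochastic equilibrium process. Writing $\Gamma = \frac{\beta}{1+\beta}(1-\alpha)A$ for the coefficient in the saving rule $s_t = \Gamma K_t^{\alpha}$ implied by (10.13), the equilibrium law of motion of $K_t$ is:
$$K_{t+1} = \begin{cases} \dfrac{r(1-n^*_t)}{R - r n^*_t}\, R\,\Gamma K_t^{\alpha} & \text{with prob. } 1-n^*_t \\ R\,\Gamma K_t^{\alpha} & \text{with prob. } n^*_t. \end{cases} \tag{10.19}$$
The expected amount of capital obtained per unit of savings is
$$\sigma^e(n^*(K_t)) = (1-n^*)\,\frac{r(1-n^*)}{R - r n^*}\, R + n^* R. \tag{10.20}$$
The bad and good quasi steady states are
$$K^{QSSB} = \left[\frac{r\left(1 - n^*(K^{QSSB})\right)}{R - r\, n^*(K^{QSSB})}\, R\,\Gamma\right]^{\frac{1}{1-\alpha}}$$
and
$$K^{QSSG} = (R\,\Gamma)^{\frac{1}{1-\alpha}}. \tag{10.21}$$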
If uncertainty could be completely removed, that is $n^*(K^{QSSG}) = 1$, then there would
never be bad news, and the good quasi steady state would be a real steady state: a point from which,
if reached, the economy would never depart.
From equations (10.18) and (10.21), the condition for this steady state to exist is that
the saving level corresponding to $K^{QSSG}$ be sufficient to ensure a balanced portfolio of
investments, of at least $D$, in all the intermediate sectors. Thus, if:
$$D < \Gamma^{\frac{1}{1-\alpha}} R^{\frac{\alpha}{1-\alpha}}, \tag{10.22}$$
At very low levels of capital, the Inada conditions of the production function guarantee
positive growth even conditional on bad news (both curves lie above the 45° line). Then,
there is a range (region II) in which growth only occurs conditional on good draws (the bad
draws curve is below the 45° line).
Regions I and II are separated by $K^{QSSB}$. When they are below this level, all economies
will grow towards it. When they are above this level, their output will fall in case they receive
bad shocks, and the probability of bad news is very high when the economy has a level of
capital stock just above $K^{QSSB}$.
As good news is received, the capital stock will grow and the probability of a further
lucky draw will increase. Note that even when it grows, the economy is still exposed to large
undiversified risks, and will typically experience some set-backs.
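These dynamics can be simulated. The following sketch makes the same assumptions as the equilibrium characterization (balanced portfolios, log saving rule, linear minimum size schedule); the names `GAMMA_MIN` and `GAMMA_S` are labels chosen here for the threshold sector and the saving coefficient, and all parameter values are illustrative. With probability $n^*$ the draw is good and next-period capital is $R$ times savings; otherwise only the safe asset pays.

```python
import math
import random

R, SAFE_R = 2.0, 1.0          # returns on risky and safe projects
GAMMA_MIN, D = 0.25, 0.15     # minimum size schedule parameters
GAMMA_S, ALPHA = 0.3, 0.35    # saving coefficient and capital share


def n_star(K):
    """Measure of open sectors at capital stock K."""
    s = GAMMA_S * K ** ALPHA
    if s >= D:
        return 1.0
    b = R + GAMMA_MIN * SAFE_R
    c = (R - SAFE_R) * (1.0 - GAMMA_MIN) * s / D + GAMMA_MIN * R
    return (b - math.sqrt(b * b - 4.0 * SAFE_R * c)) / (2.0 * SAFE_R)


def step(K, rng):
    """One draw from the stochastic law of motion of capital."""
    s = GAMMA_S * K ** ALPHA
    n = n_star(K)
    if rng.random() < n:                                   # good draw
        return R * s
    return SAFE_R * (1.0 - n) / (R - SAFE_R * n) * R * s   # bad draw


def simulate(K0, periods, seed=0):
    rng = random.Random(seed)
    path = [K0]
    for _ in range(periods):
        path.append(step(path[-1], rng))
    return path


if __name__ == "__main__":
    path = simulate(0.05, 40)
    print([round(K, 3) for K in path[::5]])
```

Because `D` is chosen small enough here for all sectors to open eventually, simulated paths typically show early setbacks followed by convergence to the good quasi steady state.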
Finally, provided (10.22) is satisfied, the economy will eventually enter region III where
10.2.4
Efficiency
Since all agents are price takers, it may be conjectured that the decentralized equilibrium
here is efficient. This turns out not to be the case. To illustrate this, consider the portfolio
allocation that a social planner maximizing the welfare of the current generation of savers
would choose, taking the amount of savings as given.
The difference between the social planner's allocation and a decentralized equilibrium is
that the social planner explicitly chooses $F_t^j$ and the set of open sectors, $J_t$. It is
straightforward to see that the subset of projects in which the planner will invest will be of
the form $J^{FB} = [0, n^{FB}]$, similar to the decentralized equilibrium.
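Therefore, subject to feasibility, the planner will solve
$$\max_{n_t,\,\phi_t,\,\{F_t^j\}_{0 \le j \le n_t}} \int_0^{n_t} \log\left(R F_t^j + r\phi_t\right) dj + (1-n_t)\log(r\phi_t). \tag{10.23}$$
The planner's portfolio takes the form
$$F_t^{j,FB} = \begin{cases} M_{j^*_t} > M_j & \text{if } j < j^*_t \\ M_j & \text{if } j^*_t \leq j \leq n^{FB}(K_t) \\ 0 & \text{if } j > n^{FB}(K_t). \end{cases} \tag{10.24}$$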
10.2.5
Implications
10.2.6
Would the market failure in portfolio choices be overcome if some financial institution could
coordinate households' investment decisions? Imagine that rather than all agents acting
in isolation and ignoring their impact on each other's decisions, funds are intermediated
through a financial coalition-intermediary. This intermediary can collect all the savings and
offer to each saver a complex security (as different from a Basic Arrow Security) that pays
$R F_t^{j,FB} + r\phi_t^{FB}$ in each state $j$, where $F_t^{j,FB}$ and $\phi_t^{FB}$ are as in the optimal portfolio. Holding
this security would make each consumer better off compared to the equilibrium.
Although from this discussion it may appear that the inefficiency we identified may
will not act as an intermediary, then $Z_h^{(1)} = \emptyset$. Among the possible non-null announcements,
there is autarky, i.e. $Z_h^{(1)} = (0, \{h\})$, which means that $h$ will only intermediate (at most)
his own savings. Finally, we denote the set of first-stage announcements of all agents by
Z (1) : R+ .
In the second stage, each agent h can announce his plan to run at most one project
and sell the corresponding Basic Arrow Security, i.e. h announces a pair (j, Pj,h ), as in
the game discussed in Section 3. But, now, securities are sold to financial intermediaries
rather than directly to households. Formally, the second-stage announcement for agent $h$
is $Z_h^{(2)} = (j, P_{j,h}) \in [0,1] \times \mathbb{R}_+$, and $Z^{(2)} : \Omega_t \to [0,1] \times \mathbb{R}_+$ is the set of all second-period
announcements. We will also denote the set of minimum security prices announced in the
second stage of the game by $P = \{P^j\}_{j \in J}$.
In the third stage, each household takes the set of prior announcements, $Z^{(1)}$ and $Z^{(2)}$,
announced his name. Note that although the set Mh Z (1) could be empty, this will never be
the case in equilibrium, since any agent can costlessly make the autarky announcement in the
first stage. Finally, after all agents announce which coalition-intermediary they will belong
to, each intermediary makes the optimal investment decision. We still use the notation $\phi_h$,
$F_h^j$ to denote the investment of an agent (through a coalition) in the safe and risky assets.
More precisely, if a coalition invests $F^j$ in project $j$, then $F_h^j$ will be the share of agent $h$
in this coalition times $F^j$.
Definition 9 A (perfect) equilibrium is a set of announcements $Z = (Z^{(1)}, Z^{(2)}, Z^{(3)})$
at each stage of the game, a price function $P(Z)$ for all Basic Arrow Securities, a saving
decision $s_h(Z)$, and induced holdings of the safe asset $\phi_h(Z)$ and securities $F_h^j(Z)$ for all
agents, and factor payments $W$ and $\rho$ such that, given the announcements of the previous
stage(s) and the announcements of all other agents in the current stage, every household
chooses $Z_h^{(i)}$ that maximizes its utility as given by (5), and factor returns are determined by
(10.6) and (10.7).
Note that the definition of equilibrium used so far was also subgame perfect. Here we
emphasize perfection in order to reiterate the importance of Assumption 3 in our analysis.
The first observation is that free entry will drive profits (commissions) to zero in both
the first and second stages. This is established by the following lemma (proof omitted).
Lemma 3 In equilibrium, (i) P j, (Z ) = 1, j; (ii) h = 0, h .
With this remark, it is now possible to establish the following proposition:
(2)
= or Zh
= (j, 1) where
= (j, 1).
3. In the third stage $a_h \neq \emptyset$ will choose a portfolio which induces $\phi^*_t$ and $F_t^j = F^*_t$ as
given by equations (10.16) and (10.17).
This result implies that even with unrestricted coalitions, the inefficiency cannot be
prevented. The key feature is that each agent would be creating a positive externality by
holding a non-balanced portfolio like the one necessary for efficiency, and agents will typically
find a way of moving towards a balanced portfolio, undermining efforts to sustain the efficient
allocation.
Part III
Endogenous Growth
Chapter 11
First-Generation Models of
Endogenous Growth
The first-generation models of endogenous growth made a big advance relative to the neoclassical growth model in generating sustained growth. Two approaches are noteworthy here.
The first one basically keeps the essence of the neoclassical approach, with competitive markets and no externalities. The second makes the first attempt at endogenizing technology
by introducing externalities and knowledge spillovers (flows) across firms.
11.1
AK Model Revisited
Let us start with the simplest neoclassical model of sustained growth, which we already
encountered in the context of the Solow growth model. This is the so-called AK model, where
the production technology is linear in capital. We will also see that in fact what matters is
that the accumulation technology is linear, not necessarily the production technology. But
11.1.1
Since there will be growth and we are, at least at first, interested in balanced growth, we
are forced to use preferences that are asymptotically consistent with balanced growth. We
may as well assume these preferences from the beginning, thus choose the standard CRRA
preferences of the canonical model.
More specifically, let us assume that the economy admits an infinitely-lived representative
household with utility given by
$$U = \int_0^{\infty} \exp(-(\rho - n)t)\, \frac{c^{1-\theta} - 1}{1-\theta}\, dt, \tag{11.1}$$
and a flow budget constraint
$$\dot{a} = (r - n)a + w - c, \tag{11.2}$$
where $a$ is assets per person, $r$ is the interest rate, $w$ is the wage rate, and $n$ is the growth
rate of population. I have now suppressed time dependence to simplify notation.
We again impose the no-Ponzi game constraint:
$$\lim_{t\to\infty} a(t)\exp\left(-\int_0^t [r(s) - n]\,ds\right) \geq 0. \tag{11.3}$$
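The Euler equation for the representative household is the same as before and gives:
$$\frac{\dot{c}}{c} = \frac{1}{\theta}(r - \rho), \tag{11.4}$$
together with the transversality condition
$$\lim_{t\to\infty} a(t)\exp\left(-\int_0^t [r(s) - n]\,ds\right) = 0. \tag{11.5}$$
The production function is
$$Y(t) = A K(t), \tag{11.6}$$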
with $A > 0$ being a constant. Equation (11.6) has a number of notable differences from
our standard production function satisfying Assumptions 1 and 2. First, output is only a
function of capital, and there are no diminishing returns (i.e., it is no longer the case that
$f''(\cdot) < 0$). More important is the fact that the Inada conditions embedded in Assumption
2 are no longer satisfied. In particular,
$$\lim_{k \to \infty} f'(k) = A > 0. \tag{11.7}$$
Since the marginal product of labor is zero, the wage rate, w, is zero. This is a somewhat
extreme result, and again it can be relaxed as we will see below. Alternatively, in this model
we can think of k as a combination of physical and human capital, in which case there will
be labor income coming from human capital, which will be accumulating in the same way
as physical capital (in particular linearly).
11.1.2
Equilibrium
To characterize the equilibrium, which is defined in exactly the same way as in the basic
neoclassical model, we again use $a = k$, $r = A - \delta$, and $w = 0$, and substitute these into
equations (11.2), (11.4), and (11.5), to obtain:
$$\dot{k} = (A - \delta - n)k - c, \tag{11.8}$$
$$\frac{\dot{c}}{c} = \frac{1}{\theta}(A - \delta - \rho), \tag{11.9}$$
$$\lim_{t\to\infty} k(t)\, e^{-(A - \delta - n)t} = 0. \tag{11.10}$$
The important result immediately follows from equation (11.9). There is a constant
rate of consumption growth (as long as $A - \delta - \rho > 0$), and this is entirely independent of
the level of capital stock per person, $k$. This will also imply that there are no transitional
dynamics in this model. Starting from any $k(0)$, the economy will immediately start growing
at a constant rate. To see this, integrate equation (11.9) starting from some initial level of
consumption $c(0)$ [still to be determined]. This gives
$$c(t) = c(0)\exp\left(\frac{1}{\theta}(A - \delta - \rho)\,t\right). \tag{11.11}$$
Since there is growth in this economy, we have to ensure that the transversality condition
is satisfied (i.e., that lifetime utility is bounded away from infinity), and also we want to
ensure positive growth. Therefore we impose:
Assumption 12
$$A > \rho + \delta > (1-\theta)(A - \delta) + \theta n + \delta.$$
11.1.3
Transitional Dynamics
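We now more explicitly show there are no transitional dynamics, that is, not only the growth
rate of consumption, but also the growth rates of capital and output are constant at all points
in time, and equal the growth rate of consumption given in equation (11.9).
To do this, let us substitute for $c(t)$ from equation (11.11) into equation (11.8), which
yields
$$\dot{k} = (A - \delta - n)k - c(0)\exp\left(\frac{1}{\theta}(A - \delta - \rho)\,t\right), \tag{11.12}$$
a first-order linear differential equation in $k$. Recall that the solution to $\dot{z}(t) = a z(t) + b(t)$ is
$$z(t) = z_0 \exp(at) + \exp(at)\int_0^t \exp(-as)\, b(s)\, ds,$$
for some constant $z_0$ chosen to satisfy the boundary conditions. Therefore, equation (11.12)
solves for:
$$k(t) = \kappa \exp((A - \delta - n)t) + \left[(A-\delta)(\theta-1)/\theta + \rho/\theta - n\right]^{-1}\left[c(0)\exp\left((1/\theta)(A - \delta - \rho)t\right)\right], \tag{11.13}$$
where $\kappa$ is a constant to be determined.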
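Substituting (11.13) into the transversality condition (11.10), where $\kappa$ denotes the constant multiplying $\exp((A - \delta - n)t)$ in (11.13), gives
$$\lim_{t\to\infty}\left\{\kappa + \left[(A-\delta)(\theta-1)/\theta + \rho/\theta - n\right]^{-1} c(0)\exp\left(-\left[(A-\delta)(\theta-1)/\theta + \rho/\theta - n\right]t\right)\right\} = 0.$$
Note that $(A-\delta)(\theta-1)/\theta + \rho/\theta - n > 0$ by Assumption 12, so the second term in this expression converges to zero as $t \to \infty$. But the first term is a constant. Thus the transversality condition
can only be satisfied if $\kappa = 0$. Therefore we have from (11.13) that:
$$k(t) = \left[(A-\delta)(\theta-1)/\theta + \rho/\theta - n\right]^{-1}\left[c(0)\exp\left((1/\theta)(A - \delta - \rho)t\right)\right] \tag{11.14}$$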
(11.15)
It is also interesting to note that in this simple AK model, growth is not only endogenous in the sense of being sustained, but it is also endogenous in the sense of being
affected by underlying parameters. For example, consider an increase in the rate of discount,
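The equilibrium saving rate, defined as total investment (the increase in capital plus replacement investment) divided by output, is
$$s = \frac{\dot{K} + \delta K}{Y} = \frac{\dot{k}/k + n + \delta}{A} = \frac{A - \rho + \theta n + (\theta - 1)\delta}{\theta A}. \tag{11.16}$$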
11.1.4
It is straightforward to incorporate policy into this framework. The simplest and arguably
one of the most relevant classes of policies is, as also discussed above, that which affects the
rate of return to accumulation. In particular, suppose that there is an effective tax rate of
$\tau$ on the rate of return from capital income. Repeating the analysis above immediately implies
that this will adversely affect the growth rate of the economy, which will now become:
$$g = \frac{(1-\tau)(A - \delta) - \rho}{\theta}.$$
Moreover, the saving rate will now be
$$s = \frac{(1-\tau)A - \rho + \theta n - (1 - \tau - \theta)\delta}{\theta A},$$
which is a decreasing function of $\tau$ if $A - \delta > 0$. Therefore, in this model, the savings rate
is constant in equilibrium as in the basic Solow model, but in contrast to that model, it
responds endogenously to policy.
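The policy experiment can be reproduced with a few lines of code. This is a minimal sketch, assuming the AK growth rate with a tax `tau` on capital income and the saving rate defined as gross investment over output; the function names and parameter values are illustrative and not taken from the notes.

```python
def ak_growth(A, delta, rho, theta, tau=0.0):
    """Balanced growth rate when the return to capital is taxed at rate tau."""
    return ((1.0 - tau) * (A - delta) - rho) / theta


def ak_saving_rate(A, delta, rho, theta, n, tau=0.0):
    """Gross investment over output: (growth of k + n + delta) / A."""
    return (ak_growth(A, delta, rho, theta, tau) + n + delta) / A


def satisfies_assumption_12(A, delta, rho, theta, n):
    """Positive growth together with a bounded transversality condition."""
    return A > rho + delta > (1.0 - theta) * (A - delta) + theta * n + delta


if __name__ == "__main__":
    base = dict(A=0.15, delta=0.05, rho=0.04, theta=2.0)
    print(satisfies_assumption_12(n=0.01, **base))
    for tau in (0.0, 0.2, 0.4):
        g = ak_growth(tau=tau, **base)
        s = ak_saving_rate(n=0.01, tau=tau, **base)
        print(f"tau={tau}: growth={g:.4f}, saving rate={s:.4f}")
```

Both the growth rate and the saving rate fall as the tax rate rises, which is the sense in which the saving rate here responds endogenously to policy.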
11.2
The model studied in the previous section is attractive in many respects. It generates
sustained growth, which responds to policy, to underlying preferences and to technology.
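The consumption good is produced with the technology
$$C(t) = B K_C(t)^{\alpha} L_C(t)^{1-\alpha}, \tag{11.17}$$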
where the subscript C denotes that these are capital and labor used in the consumption
sector, which has a Cobb-Douglas technology. In fact, the Cobb-Douglas assumption here
is quite important in ensuring that the share of capital in national income is constant [can
The investment good is produced with the technology
$$I(t) = A K_I(t). \tag{11.18}$$
The distinctive feature of the technology for the investment goods sector, (11.18), is that
it is linear in the capital stock and does not feature labor. This is an extreme version of
an assumption often made in two-sector models that the investment-good sector is more
capital-intensive than the consumption-good sector. In the data, there seems to be some
support for this, though the capital intensities of many sectors have been changing over time
as the nature of consumption and investment goods has changed.
Market clearing implies:
$$K_C(t) + K_I(t) \leq K(t),$$
for capital, and
$$L_C(t) \leq L,$$
for labor (since labor is only used in the consumption sector).
An equilibrium in this economy is defined similarly to that in the neoclassical economy,
but also features an allocation decision of capital between the two sectors. Moreover, since
the two sectors are producing two different goods, consumption and investment goods, there
will be a relative price between the two sectors which will adjust endogenously.
Since both market clearing conditions will hold as equalities (the marginal product of
both factors is always positive), we can simplify notation by letting $\kappa(t)$ denote the share
of capital used in the investment sector.
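Profit maximization requires the value of the marginal product of capital to be equal in the two sectors, where $\kappa(t)$ is the share of capital used in the investment sector:
$$p_I(t)\, A = p_C(t)\, \alpha B \left(\frac{L}{(1-\kappa(t))K(t)}\right)^{1-\alpha}. \tag{11.19}$$
With $\kappa(t)$ constant in steady state and the consumption good as numeraire, differentiating (11.19) with respect to time gives
$$\frac{\dot{p}_I(t)}{p_I(t)} = -(1-\alpha)\, g_K, \tag{11.20}$$
where $g_K$ is the steady-state growth rate of $K$. The rate of return in terms of consumption goods is
$$r_C(t) = \frac{r_I(t)}{p_I(t)} + \frac{\dot{p}_I(t)}{p_I(t)} - \frac{\dot{p}_C(t)}{p_C(t)}.$$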
In our setting, given our choice of numeraire, we have $\dot{p}_C(t)/p_C(t) = 0$. Moreover, $\dot{p}_I(t)/p_I(t)$
is given by (11.20). Finally,
$$\frac{r_I(t)}{p_I(t)} = A - \delta$$
given the linear technology in (11.18). Therefore, we have
$$r_C(t) = A - \delta + \frac{\dot{p}_I(t)}{p_I(t)},$$
and in steady state, from (11.20), the steady-state consumption-denominated rate of return
is:
$$r_C = A - \delta - (1-\alpha)\, g_K.$$
From (11.4), this implies a consumption growth rate of
$$g_C \equiv \frac{\dot{C}(t)}{C(t)} = \frac{1}{\theta}\left(A - \delta - (1-\alpha)\, g_K - \rho\right). \tag{11.21}$$
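Finally, differentiate (11.17) and use the fact that labor is always constant to obtain
$$\frac{\dot{C}(t)}{C(t)} = \alpha\, \frac{\dot{K}_C(t)}{K_C(t)},$$
which, from the constancy of $\kappa(t)$ in steady state, implies the following steady-state relationship:
$$g_C = \alpha\, g_K.$$
Substituting this into (11.21), we have
$$g_K = \frac{A - \delta - \rho}{1 - \alpha(1-\theta)} \tag{11.22}$$
and
$$g_C = \alpha\, \frac{A - \delta - \rho}{1 - \alpha(1-\theta)}. \tag{11.23}$$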
What about wages? Now since labor is being used in the consumption good sector, there
will be positive wages. Since labor markets are competitive, the wage rate at time $t$ is given
by
$$w(t) = (1-\alpha)\, p_C(t)\, B \left(\frac{(1-\kappa(t))K(t)}{L}\right)^{\alpha}.$$
Therefore, in steady state,
$$\frac{\dot{w}(t)}{w(t)} = \alpha\, g_K,$$
which implies that wages also grow at the same rate as consumption.
Moreover, with exactly the same arguments as in the previous section, it can be established that there are no transitional dynamics in this economy. This establishes the following
result:
Proposition 31 In the above-described extended AK economy, starting from any K (0) > 0,
consumption and labor income grow at the constant rate given by (11.23), while the capital
stock grows at the constant rate (11.22).
It is straightforward to conduct policy analysis in this model, and as in the basic AK
model, taxes on investment income will depress growth. Similarly, a lower discount rate will
increase the equilibrium growth rate of the economy.
One important implication of this model, different from the neoclassical growth model, is
that there is continuous capital deepening. Capital grows at a faster rate than consumption
11.3
The model that started much of the interest in endogenous growth is Romer (1986). Romer
wanted to explicitly model the process of knowledge accumulation, but realized that this
would be difficult in the context of a competitive economy. His initial solution (later updated
and improved in his and others' work during the 1990s) was to consider knowledge as a
byproduct of production that accumulates by itself. I now present this model.
11.3.1
Consider an economy without any population growth (we will see why this is important)
and a production function with labor-augmenting knowledge (technology) that satisfies the
standard assumptions, Assumptions 1 and 2. For reasons that will become clear, instead
of working with the aggregate production function, let us look at the production function
of each firm $i$:
\[ Y_i(t) = F(K_i(t), A(t) L_i(t)), \tag{11.24} \]
where $K_i(t)$ and $L_i(t)$ are capital and labor rented by firm $i$. Notice that $A(t)$ is not
indexed by $i$, since it is technology common to all firms. Let us normalize the measure of
final good producers to 1, so that we have the following market clearing conditions:
\[ \int_0^1 K_i(t)\, di = K(t) \]
and
\[ \int_0^1 L_i(t)\, di = L, \]
where $L$ is the constant level of labor (supplied inelastically) in this economy. Firms are
competitive in all markets, which implies that they will all hire the same capital to effective
labor ratio, and moreover, factor prices will be given by their marginal products, thus
\[ w(t) = \frac{\partial F(K(t), A(t) L)}{\partial L}, \]
\[ R(t) = \frac{\partial F(K(t), A(t) L)}{\partial K(t)}. \]
The key assumption of Romer (1986) is that although firms take $A(t)$ as given, this stock
of technology (knowledge) advances endogenously for the economy as a whole. In particular,
Romer assumes that this takes place because of spillovers across firms, and attributes
spillovers to physical capital. Lucas (1988) develops a similar model in which the structure is
identical, but spillovers work through human capital (i.e., while Romer has physical capital
externalities, Lucas has human capital externalities).
The idea of externalities is not uncommon to economists, but both Romer and Lucas
make an extreme assumption of sufficiently strong externalities such that $A(t)$ can grow
continuously at the economy level. In particular,
\[ A(t) = B K(t), \tag{11.25} \]
i.e., the knowledge stock of the economy is proportional to the capital stock of the economy.
This can be motivated by learning-by-doing, whereby greater investment in certain sectors
increases the experience (of firms, workers, managers) in the production process, making the
production process itself more productive. Alternatively, the knowledge stock of the economy
could be a function of the cumulative output that the economy has produced up to now,
thus giving it more of a flavor of learning-by-doing. The reason why the externalities work
through capital might be justified along the lines of the structural change model we will
discuss below, where it is assumed that the manufacturing sector, which is more
capital-intensive, is more important for generating externalities (whether this is so is not very
clear, and in any case, there is no compelling evidence that such externalities are very large).
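In any case, substituting (11.25) into (11.24) and using the fact that all firms are
functioning at the same capital-effective labor ratio, we obtain the production function of
the representative firm as
\[ Y(t) = F(K(t), B K(t) L). \]
Since $F$ is homogeneous of degree one, $Y(t)/K(t) = F(1, BL) \equiv f(L)$, and output per
capita is $y(t) \equiv Y(t)/L = k(t) f(L)$, where $k(t) \equiv K(t)/L$. Marginal products and factor
prices can then be expressed as
\[ w(t) = K(t) f'(L) \tag{11.26} \]
and
\[ R(t) = R = f(L) - L f'(L), \tag{11.27} \]
which is constant.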
11.3.2
Equilibrium
An equilibrium is defined similarly to the neoclassical growth model, as a path of
consumption and capital stock for the economy, $[C(t), K(t)]_{t=0}^{\infty}$, that maximizes the utility
of the representative household, and wage and rental rates $[w(t), R(t)]_{t=0}^{\infty}$ that clear
markets. The important feature is that because the knowledge spillovers, as specified in (11.25),
are external to the firm, factor prices are given by (11.26) and (11.27); that is, they do not
price the role of the capital stock in increasing future productivity.
Since the market rate of return is $r(t) = R(t) - \delta$, it is also constant. This immediately
implies that consumption in this economy, given by the usual Euler equation, grows at the
constant rate
\[ g_C = \frac{1}{\theta} \left( f(L) - L f'(L) - \delta - \rho \right). \tag{11.28} \]
It is also clear that capital grows at exactly the same rate as consumption, so the growth
rates of capital, output and consumption are all equal to $g_C$ as given by (11.28).
Let us assume that
\[ f(L) - L f'(L) - \delta - \rho > 0, \tag{11.29} \]
so that there is positive growth, but also that growth is not fast enough to violate the
transversality condition, in particular,
\[ f(L) - L f'(L) < \frac{\rho}{1-\theta} + \delta. \tag{11.30} \]
It is also straightforward to verify that as in the AK model above, there are no transitional
dynamics in this model. This establishes:
Proposition 32 In the above-described Romer model with physical capital externalities, as
long as conditions (11.29) and (11.30) are satisfied, there exists a unique equilibrium path
where starting with any level of capital stock K (0) > 0, capital, output and consumption
grow at the constant rate (11.28).
You can also see now why population was assumed constant in this model. To do this,
first note that there is a scale effect here: when population (labor force) $L$ is higher,
since $f(L) - L f'(L)$ is always increasing in $L$ (by Assumption 1), the growth rate of the
economy will increase. Moreover, if population is growing constantly, the economy will not
admit a steady state and the growth rate of the economy will increase over time (output
reaching infinity in finite time and violating the transversality condition).
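To make the scale effect concrete, the following sketch evaluates the decentralized growth rate $g_C = \theta^{-1}(f(L) - L f'(L) - \delta - \rho)$ for two population levels. It is an illustration under stated assumptions: the Cobb-Douglas form $F(K, AL) = K^{a}(AL)^{1-a}$, so that $f(L) = F(1, BL) = (BL)^{1-a}$, and all parameter values are hypothetical choices rather than anything taken from the text.

```python
def decentralized_growth(L, B=1.0, a=0.3, delta=0.05, rho=0.02, theta=2.0):
    """Equilibrium growth rate in the Romer model with capital externalities.

    Assumes F(K, AL) = K**a * (A*L)**(1 - a) and A = B*K, so f(L) = (B*L)**(1 - a).
    Returns (growth rate, net interest rate r = R - delta).
    """
    f = (B * L) ** (1.0 - a)
    f_prime = (1.0 - a) * B ** (1.0 - a) * L ** (-a)
    private_return = f - L * f_prime          # rental rate R faced by firms
    r = private_return - delta                # market rate of return
    g = (r - rho) / theta                     # Euler equation
    return g, r


g1, r1 = decentralized_growth(L=1.0)
g2, r2 = decentralized_growth(L=2.0)

print(f"L = 1: g = {g1:.4f}, r = {r1:.4f}")
print(f"L = 2: g = {g2:.4f}, r = {r2:.4f}")
# Positive growth and growth below the interest rate (finite lifetime utility).
print("conditions hold at L = 1:", g1 > 0 and g1 < r1)
```

Doubling $L$ raises the private return to capital and hence the growth rate, which is why a growing population is incompatible with a steady state in this model.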
11.3.3
Given the presence of externalities, it is not surprising that the decentralized equilibrium
characterized in Proposition 32 is not Pareto optimal. To characterize the allocation that
maximizes the utility of the representative household, let us again set up the current-value
Hamiltonian, noting that the per capita accumulation equation for this economy can
be written as
\[ \dot k = f(L) k - c - \delta k. \]
The current-value Hamiltonian is
\[ \hat H(k, c, \mu) = \frac{c^{1-\theta} - 1}{1-\theta} + \mu \left[ f(L) k - c - \delta k \right], \]
and has the necessary conditions:
\[ \hat H_c(k, c, \mu) = 0 = c^{-\theta} - \mu, \]
\[ \hat H_k(k, c, \mu) = -\dot\mu + \rho\mu = \mu \left[ f(L) - \delta \right]. \]
These equations imply that the social planner's allocation will also have a constant growth
rate for consumption (and output) given by
\[ g_C^S = \frac{1}{\theta} \left( f(L) - \delta - \rho \right), \]
which is always greater than $g_C$ as given by (11.28), since $f(L) > f(L) - L f'(L)$. Essentially,
the social planner takes into account that by accumulating more capital, she is
improving productivity in the future. Since this effect is external to the firms, the
decentralized economy fails to internalize this externality. Therefore we have:
Chapter 12
Multiple Equilibria and the Process of
Development
The models discussed so far generated sustained economic growth, which is important both
for understanding why some countries are much richer today than others, and for understanding
the historical process of economic growth leading to the modern world. However, the process of
economic development is not simply a linear sustained growth process. The process of
development, as emphasized by Simon Kuznets, is also one of the transformation of the economy.
Agriculture becomes less important, manufacturing becomes more important (and then later
services become more important). Urbanization increases. Simultaneously, there is a process of
coordination, or perhaps cumulative causation (where an economic process becomes
self-sustaining once underway) going on, in which the increase in demand for certain goods and
services (especially coming from cities) fuels further growth. Many social and
economic institutions also change in the process. To do justice to these topics, we need to
delve much deeper into issues of development economics and political economy, which are
12.1
Let us start with a very simple model of multiple equilibria arising from aggregate demand
externalities. Below in discussing models of endogenous technological change, monopolistic
competition will play a crucial role, since firms that discover new machines will become the
monopolistic suppliers of these machines or of goods produced with these machines. However,
the focus there will not be on multiple equilibria. Here we start with a simple two-period
model of an economy with monopolistic competition, which will lead to multiple equilibria.
The model is a version of Murphy, Shleifer and Vishny's (1989) "Big Push" paper. As the
name of the paper suggests, the idea is to think of the development process as a move from
one equilibrium to another, likely due to a coordinated move, a "big push".
12.1.1
Consider the following two-period economy. All agents have preferences given by
\[ U = \frac{C_1^{1-\theta} - 1}{1-\theta} + \beta \frac{C_2^{1-\theta} - 1}{1-\theta}, \]
where $C_1$ and $C_2$ denote consumption at the two dates. $\theta$ plays a similar role to before,
with $1/\theta$ being the intertemporal elasticity of substitution, regulating how willing
individuals are to substitute consumption between the two dates. The budget constraint is
\[ C_1 + \frac{C_2}{R} \le w_1 + \pi_1 + \frac{w_2 + \pi_2}{R}, \]
where $\pi_t$ denotes the profits accruing to the representative consumer, and $w_t$ is the wage
rate at time $t$. $R$ is the gross interest rate. Although individuals can borrow and lend, in
the aggregate, the resource constraints have to hold, so $R$ will be determined in equilibrium
to ensure this.
The new feature in this model is that output is an aggregate of intermediates. In particular,
there is a continuum of differentiated intermediate goods, with their total measure
normalized to 1, and the aggregate production at time $t$ is given by:
\[ Y_t = \left[ \int_0^1 y_t(i)^{\frac{\varepsilon-1}{\varepsilon}} di \right]^{\frac{\varepsilon}{\varepsilon-1}}, \]
where $y_t(i)$ is the output level of intermediate $i$ at date $t$. This production function has
the standard love-for-variety feature first introduced by Dixit and Stiglitz. This functional
form can be used either for aggregating intermediates or directly as a utility function. Its
advantage is that it provides an extremely tractable model of substitution between different
goods. Each intermediate is produced from labor according to
\[ y_t(i) = \begin{cases} l_t(i) & \text{with the old technology} \\ \alpha\, l_t(i) & \text{with the new technology} \end{cases} \tag{12.1} \]
where $\alpha > 1$ and $l_t(i)$ denotes labor devoted to the production of intermediate good $i$ at
time $t$. Labor market clearing, naturally, requires
\[ \int_0^1 l_t(i)\, di \le L. \tag{12.2} \]
At date 1, there is a designated producer for each intermediate, but a competitive fringe
can also enter and produce each good as productively as the designated producer. At date
1, the designated producer can also invest in the new technology, which costs $F$ per firm.
If this investment is undertaken, this producer's productivity at date 2 will be higher by a
factor $\alpha$, as indicated by equation (12.1). In contrast, the fringe will not benefit from this
technological improvement, thus the designated producer will have some degree of monopoly
power.
All firms are assumed to be owned equally by all the consumers. They will maximize
profits taking the market prices (especially the market interest rate) as given.
12.1.2
Equilibrium
Since this is a two-period economy, we will be looking for a subgame perfect equilibrium.
Moreover, to simplify the discussion, let us focus on symmetric subgame perfect equilibria,
SSPE. An SSPE consists of an allocation of labor across firms, investment decisions for firms,
wages for both periods and an interest rate linking consumption between the two periods.
First, since all goods are symmetric, the first-period labor market clearing is
straightforward and we will have
\[ l_1(i) = L \quad \text{for all } i \in [0, 1] \]
(recall that the measure of sectors and firms is normalized to 1). This implies that
\[ Y_1 = L. \]
At date 2, the equilibrium will depend on how many firms have adopted the new technology.
Since we are looking at the symmetric equilibrium (SSPE), we only consider the two extremes
where all firms adopt and no firm adopts. In either case, the marginal productivity of
labor is again the same in all sectors, so labor will be allocated equally, i.e.,
\[ l_2(i) = L \quad \text{for all } i \in [0, 1]. \]
Consequently, when the technology is not adopted, we have
\[ Y_2 = L \]
and when the technology is adopted by all the firms, we have
\[ Y_2 = \alpha L. \]
We now turn to the pricing decisions. At the first date, the designated producers have no
monopoly power because of the competitive fringe, thus they charge price equal to marginal
cost. When there is no investment, consumption is the same at both dates, and the gross
interest rate is
\[ \hat R = \beta^{-1}. \tag{12.3} \]
To see this more formally, recall that the standard Euler equation in this case is
\[ C_1^{-\theta} = \beta R\, C_2^{-\theta}, \tag{12.4} \]
which can only be satisfied with $C_1 = C_2$ if the gross interest rate is $\hat R$ as given in (12.3).
Next consider the situation in which the designated producers have invested in the
advanced technology. Now they can produce $\alpha$ units of output with one unit of labor, while
the fringe of competitive firms still produces one unit of output with one unit of labor.
This implies that the designated producers have some monopoly power. The extent of this
monopoly power depends on the comparison of $\alpha$ and $\varepsilon$.
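Let us first find the demand facing each producer, which is given as a solution to the
following program of profit maximization for the final goods sector:
\[ \max_{[y_2(i)]_{i \in [0,1]}} \left[ \int_0^1 y_2(i)^{\frac{\varepsilon-1}{\varepsilon}} di \right]^{\frac{\varepsilon}{\varepsilon-1}} - \int_0^1 p_2(i)\, y_2(i)\, di. \]
The first-order condition with respect to $y_2(i)$ gives
\[ y_2(i)^{-1/\varepsilon}\, Y_2^{1/\varepsilon} = p_2(i), \]
or
\[ y_2(i) = p_2(i)^{-\varepsilon}\, Y_2. \tag{12.5} \]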
This expression is useful in laying the foundations for the aggregate demand externalities,
which we will discuss soon; the demand for good i depends on the total amount of production,
Y2 . [However, you should ask yourself why this actually causes an externality; even with
perfectly competitive markets, the demand for my goods may depend on the supply of other
goods in the economy. So why is there an externality here?]
A nice feature of the demand curve implied by equation (12.5) is that it is iso-elastic
(i.e., the demand elasticity is constant). This will be a very convenient feature in many of
the models using this class of utility or production functions below.
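To make more progress, first imagine the situation in which there is no fringe of
competitive producers. In that case, each designated producer will act as an unconstrained
monopolist and maximize its profits given by price minus marginal cost times quantity, i.e.,
\[ \pi_2(i) = \left( p_2(i) - \frac{w_2}{\alpha} \right) y_2(i). \]
Using (12.5), the monopolist's problem is
\[ \max_{p_2(i)} \pi_2(i) = \left( p_2(i) - \frac{w_2}{\alpha} \right) p_2(i)^{-\varepsilon}\, Y_2, \]
with first-order condition
\[ p_2(i)^{-\varepsilon}\, Y_2 - \varepsilon \left( p_2(i) - \frac{w_2}{\alpha} \right) p_2(i)^{-\varepsilon-1}\, Y_2 = 0, \]
which gives
\[ p_2(i) = \frac{\varepsilon}{\varepsilon-1} \frac{w_2}{\alpha}. \]
This is the standard monopoly price formula of a markup related to demand elasticity over
the marginal cost, $w_2/\alpha$. Here the markup is constant because the demand elasticity is
constant.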
However, the monopolist can only charge this price if the competitive fringe cannot
enter and make profits by stealing the entire market at this price. Since the competitive fringe
can produce one unit using one unit of labor, the monopolist can only charge this price if
\[ \frac{\varepsilon}{\varepsilon-1} \frac{1}{\alpha} \le 1. \]
Otherwise, the price would be too high and the competitive fringe would enter. Let us
assume that $\alpha$ is not so high as to make the monopolist unconstrained. In other words, let
us impose
Assumption 13
\[ \frac{\varepsilon}{\varepsilon-1} \frac{1}{\alpha} > 1. \]
Under this assumption, the monopolist will be forced to charge a limit price. It is
straightforward to see that this equilibrium limit price would be
\[ p_2 = w_2. \]
If it were any higher, the competitive fringe would enter, steal the whole market and make
positive profits. If it were any lower, the monopolist could increase its price without losing
the market, and thus increase its profits. This implies that under Assumption 13, each
monopolist would make per unit profits equal to
\[ w_2 - \frac{w_2}{\alpha} = \frac{\alpha-1}{\alpha} w_2, \]
and total profits of
\[ \pi_2 = \frac{\alpha-1}{\alpha} w_2 Y_2. \tag{12.6} \]
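The wage rate can be determined from income accounting. Total production will be
equal to $Y_2 = \alpha L$, and this has to be distributed between profits and wages, thus
\[ \frac{\alpha-1}{\alpha} w_2\, \alpha L + w_2 L = \alpha L, \]
which implies $w_2 = 1$. With consumption equal to $C_1 = L - F$ at date 1 and $C_2 = \alpha L$ at
date 2, the Euler equation (12.4) gives the gross interest rate as
\[ \hat R^I = \beta^{-1} \left( \frac{\alpha L}{L - F} \right)^{\theta} > \hat R. \tag{12.7} \]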
Consequently, the interest rate in this case is higher than the one in which there is no
investment. This is natural, since investment implies that individuals are being asked to
forgo date 1 consumption for date 2 consumption. Note also that the greater is , the higher
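\[ \pi_2^N = \frac{\alpha-1}{\alpha} L, \]
where the superscript $N$ denotes that no other firm is undertaking the investment. Therefore,
the net discounted profits at date 1 for the firm in question are
\[ \pi^N = -F + \frac{1}{\hat R} \frac{\alpha-1}{\alpha} L = -F + \beta \frac{\alpha-1}{\alpha} L. \]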
Next consider the case in which all other firms are undertaking the investment. In this
case, profits at date 2 are
\[ \pi_2^I = (\alpha - 1) L, \]
where the superscript $I$ designates that all other firms are undertaking the investment.
Consequently, the profit gain from investing at date 1 is
\[ \pi^I = -F + \frac{1}{\hat R^I} (\alpha - 1) L = -F + \beta \left( \frac{\alpha L}{L - F} \right)^{-\theta} (\alpha - 1) L. \]
Multiple equilibria exist when
\[ \pi^N < 0 \quad \text{and} \quad \pi^I > 0, \tag{12.8} \]
that is, when nobody else invests, investment is not profitable, and when all other firms
invest, investment is profitable. This is clearly possible because of the aggregate demand
externality, the fact that $\pi_2^I > \pi_2^N$; when other firms invest, they produce more, there is
more aggregate demand, and therefore profits from having invested in the new technology
are higher. Counteracting this effect is the fact that the interest rate is also higher when all
firms invest. Therefore, the existence of multiple equilibria requires the interest rate effect
not to be too strong. For example, in the extreme case where preferences are linear, i.e.,
$\theta = 0$, we have that
\[ \pi^I = -F + \beta (\alpha - 1) L > \pi^N = -F + \beta \frac{\alpha-1}{\alpha} L, \]
so (12.8) is certainly possible. More generally, the condition for the existence of multiple
equilibria is that:
\[ \beta \left( \frac{\alpha L}{L - F} \right)^{-\theta} (\alpha - 1) L > F > \beta \frac{\alpha-1}{\alpha} L. \tag{12.9} \]
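The condition for multiple equilibria can be explored numerically. The sketch below is an illustration only: it assumes the notation $\alpha$ (productivity gain), $\beta$ (discount factor), $\theta$ (inverse elasticity of intertemporal substitution), $F$ (investment cost) and $L$ (labor), and the parameter values are hypothetical. It computes the net discounted profit from investing when no other firm invests and when all other firms invest, and checks whether (12.9) holds.

```python
def investment_profits(L, F, alpha, beta, theta):
    """Net discounted profits from investing (requires L > F).

    pi_N: no other firm invests, so Y2 = L and the interest rate is 1/beta.
    pi_I: all other firms invest, so Y2 = alpha*L and the interest rate is
          (1/beta) * (alpha*L / (L - F))**theta.
    """
    pi_N = -F + beta * (alpha - 1.0) / alpha * L
    discount_I = beta * (alpha * L / (L - F)) ** (-theta)
    pi_I = -F + discount_I * (alpha - 1.0) * L
    return pi_N, pi_I


def has_multiple_equilibria(L, F, alpha, beta, theta):
    """True when not investing and investing are both equilibria."""
    pi_N, pi_I = investment_profits(L, F, alpha, beta, theta)
    return pi_N < 0.0 < pi_I


# Hypothetical parameters: linear preferences versus strong curvature.
linear = has_multiple_equilibria(L=1.0, F=0.35, alpha=1.5, beta=0.9, theta=0.0)
curved = has_multiple_equilibria(L=1.0, F=0.35, alpha=1.5, beta=0.9, theta=2.0)
print("multiple equilibria with theta = 0:", linear)
print("multiple equilibria with theta = 2:", curved)
```

With linear preferences the interest rate does not respond to investment, so the demand externality dominates; with strong curvature the higher interest rate can eliminate the investment equilibrium.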
It is also straightforward to see that whenever both equilibria exist, the equilibrium with
investment Pareto dominates the one without investment, since condition (12.9) implies that
all households are better off with the upward-sloping consumption profile giving them higher
consumption at date 2.
12.2
The previous section illustrated the potential for development traps because of aggregate
demand externalities. Investment by different firms may require coordination, leading to
multiple equilibria. Underdevelopment may be thought to correspond to a situation in
which the coordination is on the bad equilibrium, and the development process starts with
the big push, changing the coordination to the high-investment equilibrium.
Similar issues arise, in a more dynamic way, when the economy is subject to credit market
problems. Moreover, credit market problems will illustrate how the distribution of income
(and the incidence of poverty) in a society might affect economic growth and the process of
economic development. I will illustrate these issues in the simplest possible way, looking at
the effect of credit market problems on human capital investments.
12.2.1
When credit markets are imperfect, a major determinant of human capital investments will
be the distribution of income (as well as the degree of imperfection in the credit markets).
I start with a discussion of the simplest case with no borrowing (extreme credit market
problems) to illustrate how the distribution of income will matter, and may also
self-perpetuate.
Consider an economy with a continuum of measure 1 of dynasties. Each individual lives for two
periods, childhood and adulthood, and has an offspring in adulthood. There is consumption
only at the end of adulthood.
Preferences are given by
\[ (1-\delta) \log c_t^i + \delta \log e_{t+1}^i, \]
where $c$ is consumption at the end of the individual's life, and $e$ is the educational spending
on the offspring of this individual. The budget constraint is
\[ c_t^i + e_{t+1}^i \le w_t^i, \]
where $w$ is the wage income of the individual.
There are a number of important features embedded in this utility function:
1. Even though it is a very similar utility function to the one we worked with in the
overlapping generations model, now the utility function refers to the utility that an individual
obtains from his consumption and the indirect utility he obtains from leaving something to
his offspring. In other words, this utility function features impure altruism
(sometimes referred to as "warm glow" preferences): parents do not care about the utility
of their offspring, but simply about what they bequeath to them, here education.
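\[ h_{t+1}^i = \begin{cases} (e_t^i)^{\gamma} & \text{if } e_t^i \ge 1 \\ \bar h & \text{if } e_t^i < 1 \end{cases}, \]
will choose the spending on education that maximizes its own utility. This immediately
implies the following savings rate:
\[ e_t^i = \delta w_t^i = \delta A h_t^i. \tag{12.10} \]
(12.11)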
Now, let us look at the dynamics of human capital for a particular dynasty $i$. If at time
0 we have $h_0^i < (\delta A)^{-1}$, then (12.10) implies that $e_t^i < 1$, so the offspring will have
$h_1^i = \bar h$. Given (12.11), we have $h_1^i = \bar h < (\delta A)^{-1}$, and repeating this argument, we
have $h_t^i < (\delta A)^{-1}$ for all $t$. Therefore, a dynasty that starts with $h_0^i < (\delta A)^{-1}$ will
never reach a human capital
Next consider a dynasty with $h_0^i > (\delta A)^{-1}$. Then from (12.11), we have
$h_1^i = (\delta A h_0^i)^{\gamma} > 1$, so this dynasty will accumulate human capital and reach the steady
state given by $h^* = (\delta A h^*)^{\gamma}$, or
\[ h^* = (\delta A)^{\frac{\gamma}{1-\gamma}} > 1 \]
(as long as $h_0^i < h^*$; otherwise, the dynasty would have started with too much human capital
and would decumulate human capital).
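The two cases above can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: it takes the accumulation rule $h_{t+1} = (\delta A h_t)^{\gamma}$ when $\delta A h_t \ge 1$ and $h_{t+1} = \bar h$ otherwise, with hypothetical parameter values ($\delta A = 2$, $\gamma = 0.5$, $\bar h = 0.3$) chosen so that the threshold is $(\delta A)^{-1} = 0.5$ and the high steady state is $h^* = 2$.

```python
def next_human_capital(h, A=4.0, delta=0.5, gamma=0.5, h_bar=0.3):
    """One-generation update of a dynasty's human capital (no borrowing)."""
    e = delta * A * h                 # education spending, e = delta * w = delta * A * h
    return e ** gamma if e >= 1.0 else h_bar


def simulate_dynasty(h0, generations=60):
    """Human capital of a dynasty after the given number of generations."""
    h = h0
    for _ in range(generations):
        h = next_human_capital(h)
    return h


poor_dynasty = simulate_dynasty(0.4)   # starts below the threshold 0.5
rich_dynasty = simulate_dynasty(0.6)   # starts above the threshold 0.5
print(f"starting at 0.4 -> {poor_dynasty:.4f}")
print(f"starting at 0.6 -> {rich_dynasty:.4f}")
```

Two dynasties that start close to each other, on either side of the threshold, end up at very different long-run levels of human capital.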
The most important result is that this simple model features poverty traps due to the
nonconvexities created by the credit market problems.
It is interesting to contrast two economies subject to the credit market problems, but
with different distributions of income. For example, imagine an economy with two groups
starting at income levels $h_1$ and $h_2 > h_1$ such that $(\delta A)^{-1} < h_2$. Now if inequality
(poverty) is high so that $h_1 < (\delta A)^{-1}$, a significant fraction of the population will never
accumulate
12.2.2
Now let us allow borrowing in the model above. Each individual still lives for two periods.
In his youth, he can either work or acquire education.
The utility function of each individual is
\[ (1-\delta) \log c_t^i + \delta \log b_t^i, \]
where again $c$ denotes consumption at the end of the life of the individual. The budget
constraint is
\[ c_t^i + b_t^i \le m \]
(12.12)
which implies that investment in human capital is profitable when financed at the lending
rate $r$.
Let us now consider an individual with wealth $x$. If $x \ge h$, assumption (12.12) implies
that the individual will invest in education. If $x < h$, then whether it is profitable to invest in
education or not will depend on the wealth of the individual and the borrowing interest rate, $i$.
Let us now write the utility of this agent (with $x < h$) in the two scenarios, and also the
An agent with $x < h$ invests in education if and only if $x \ge f$, where
\[ f \equiv \frac{(2 + r) w_u + (1 + i) h - w_s}{i - r}. \]
The dynamics of the system can then be obtained simply by using the bequests of
unconstrained, constrained-investing and constrained-non-investing agents.
More specifically, the equilibrium correspondence describing equilibrium dynamics is
\[ x_{t+1} = \begin{cases} b_n(x_t) = \delta \left( w_s + (1 + r)(x_t - h) \right) & \text{if } x_t \ge h \\ b_s(x_t) = \delta \left( w_s + (1 + i)(x_t - h) \right) & \text{if } h > x_t \ge f \\ b_u(x_t) = \delta \left( (1 + r)(w_u + x_t) + w_u \right) & \text{if } x_t < f \end{cases} \tag{12.13} \]
Equilibrium dynamics can now be analyzed diagrammatically by looking at the graph of
(12.13).
Note an important feature here. The correspondence (12.13) describes the behavior of
the wealth of each individual. However, the whole wealth distribution can also be studied
from (12.13). This is because dynamics in this economy are Markovian: they are described
simply by the Markov process without any general equilibrium interactions.
All individuals with $x_t < g$ converge to the wealth level $x_U$, while all those with $x_t > g$
converge to the greater wealth level $x_S$. As in the example without credit markets, there is a
poverty trap which attracts agents with low initial wealth. The distribution of income again
has a potentially first-order effect on the income level of the economy. If the majority of the
individuals start with $x_t < g$, the economy will have low productivity, low human capital
and low wealth.
It is also clear that financial development should matter for human capital investments.
In an economy with better financial institutions, we may expect the wedge between the
borrowing rate and the lending rate to be smaller, i.e., $i$ to be smaller given $r$. With a
smaller $i$, more agents will escape the poverty trap, and in fact the poverty trap may not
exist (there may not be an intersection between (12.13) and the 45 degree line where (12.13)
is steeper).
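The wealth dynamics can be traced out by iterating on the bequest functions. The sketch below is only an illustration under stated assumptions: it uses a three-branch map in which agents with $x \ge h$ invest and lend at rate $r$, agents with $f \le x < h$ invest and borrow at rate $i$, and agents with $x < f$ do not invest, with bequest share $\delta$. The functional form of the unconstrained branch and all parameter values are hypothetical, chosen so that an unstable threshold exists at $g = 1.5$.

```python
def investment_threshold(w_u, w_s, h, r, i):
    """Wealth level f above which a credit-constrained agent invests."""
    return ((2.0 + r) * w_u + (1.0 + i) * h - w_s) / (i - r)


def next_wealth(x, delta, w_u, w_s, h, r, i):
    """Bequest received by the offspring of an agent with wealth x."""
    f = investment_threshold(w_u, w_s, h, r, i)
    if x >= h:      # unconstrained: invests and lends the remainder at rate r
        return delta * (w_s + (1.0 + r) * (x - h))
    if x >= f:      # constrained but investing: borrows h - x at rate i
        return delta * (w_s + (1.0 + i) * (x - h))
    # not investing: works in both periods as unskilled and saves at rate r
    return delta * ((1.0 + r) * (w_u + x) + w_u)


def long_run_wealth(x0, periods=200, **params):
    """Iterate the wealth map from initial wealth x0."""
    x = x0
    for _ in range(periods):
        x = next_wealth(x, **params)
    return x


PARAMS = dict(delta=0.3, w_u=0.5, w_s=7.0, h=2.0, r=0.1, i=3.0)
poor = long_run_wealth(1.0, **PARAMS)   # starts below g = 1.5
rich = long_run_wealth(1.8, **PARAMS)   # starts above g = 1.5
print(f"x0 = 1.0 converges to {poor:.4f}")
print(f"x0 = 1.8 converges to {rich:.4f}")
```

Lowering the hypothetical borrowing rate `i` flattens the middle branch, which is one way to see how financial development can remove the trap.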
12.3
As mentioned above, an important element of the process of economic development,
especially starting from the early stages of development, is that of structural change. Almost
all societies started as agricultural economies, and have grown together with a
transformation of the economy, with the share of output of manufacturing (and services)
increasing. The most standard reason for this is thought to be Engel's law, which is the name
given to the feature that the budget share of food declines as individuals become richer.
Here I will outline a model by Matsuyama (1992), which incorporates both this feature
and the possibility of learning-by-doing as an important factor in economic growth.
12.3.1
Consider the following continuous time economy, consisting of two sectors: manufacturing
and agriculture. Both sectors produce using only labor. Population is constant and equal
to L = 1, and labor is supplied inelastically.
Technologies in the two sectors are given by the following diminishing returns production
functions
\[ X^M(t) = M(t) F(n(t)), \quad F(0) = 0,\ F' > 0,\ F'' < 0, \tag{12.14} \]
\[ X^A(t) = A\, G(1 - n(t)), \quad G(0) = 0,\ G' > 0,\ G'' < 0, \tag{12.15} \]
where $n(t)$ is the fraction of labor employed in manufacturing as of time $t$. This way of
writing the two production functions already imposes market clearing in the labor market.
Notice that agricultural productivity, $A$, is not indexed by time, hence it is constant.
(12.16)
\[ W = \int_0^{\infty} \left[ \beta \log(c^A(t) - \gamma) + \log(c^M(t)) \right] \exp(-\rho t)\, dt, \tag{12.17} \]
(12.18)
with $\beta$, $\gamma$ and $\rho > 0$, and $c^A(t)$ denoting the consumption of the agricultural good and
$c^M(t)$ denoting the consumption of the manufacturing good at time $t$. The parameter $\rho$ is the
discount rate, and $\beta$ designates the importance of agricultural goods versus manufacturing
(12.19)
The first inequality states that the economy's agricultural sector is productive enough to
provide the subsistence level of food to all consumers; otherwise individuals would receive
negative infinite utility.
The budget constraint of consumers in each period is
\[ c^A(t) + p(t)\, c^M(t) \le w(t) + \pi(t), \]
where $\pi(t)$ is the profits per representative household.
12.3.2
Equilibrium
(12.20)
(12.21)
where
\[ \phi(n) \equiv G(1 - n) - \beta\, G'(1 - n) F(n) / F'(n). \]
Moreover, we have
\[ \phi(0) = G(1), \quad \phi(1) < 0 \quad \text{and} \quad \phi'(\cdot) < 0. \]
The function $\phi(n)$ can be interpreted as the excess demand for manufacturing over
agriculture. An equilibrium has to satisfy (12.21). From Assumption (12.19) it is clear that
the equilibrium condition (12.21) has a unique interior solution in which
\[ n(t) \in (0, 1). \]
Since the right-hand side of (12.21) is decreasing in $A$, this solution can be written as a
function of agricultural productivity, $A$:
\[ n(t) = \nu(A), \]
(12.22)
F (v(A))
.
F 0 (v(A))
which is also increasing in A; this implies that higher agricultural productivity also increases
agricultural consumption. Therefore, this discussion leads to the following simple result:
Chapter 13
Interdependence and Growth in the
Open Economy
The analysis so far treated each country as a closed island, not interacting with the rest
of the countries in the world. This is clearly not the correct way to view the world. In
this chapter, we have a first look at some models of interdependences. First, I begin with
a model of technology transfer from an exogenously advancing world technology frontier.
Then, I discuss a model of technology transfer and trade. Finally, I look at how international
trade influences the process of economic growth, creating interdependences across growing
countries.
13.1
The Nelson-Phelps model is the simplest model of technology diffusion across countries, and
has proved a useful reduced-form model for many applications. In addition to its growth
(13.1)
\[ A_j(t) = \frac{\phi(h_j)}{g + \phi(h_j)}\, T(t). \tag{13.2} \]
Suppose now that output in each country is proportional to $A_j(t)$. Equation (13.2) then
implies that countries with low human capital will be poor, because they will absorb less of
the frontier technology.
This effect is in addition to the direct productive contribution of human capital to
output, and suggests that human capital differences across countries can be more important
in causing income differences than calculations based on private returns to schooling might
suggest.
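The steady-state relationship in (13.2) can be illustrated by simulating catch-up to the frontier. The sketch below rests on an assumption, since the law of motion is not shown above: it takes the standard Nelson-Phelps form $\dot A_j(t) = \phi(h_j)(T(t) - A_j(t))$ with the frontier growing at rate $g$, so that the ratio $a \equiv A_j/T$ obeys $\dot a = \phi(1 - a) - g a$. The values of $\phi$ and $g$ are hypothetical.

```python
def simulate_relative_technology(phi, g, a0=0.1, dt=0.01, horizon=300.0):
    """Euler-integrate a' = phi * (1 - a) - g * a, where a = A_j / T.

    phi is the absorption rate phi(h_j) and g is frontier growth; both are
    taken as given numbers for this illustration.
    """
    a = a0
    for _ in range(int(horizon / dt)):
        a += dt * (phi * (1.0 - a) - g * a)
    return a


g = 0.02
low_h = simulate_relative_technology(phi=0.02, g=g)    # low human capital
high_h = simulate_relative_technology(phi=0.05, g=g)   # high human capital
print(f"low  absorption: A/T -> {low_h:.4f}  (formula {0.02 / (g + 0.02):.4f})")
print(f"high absorption: A/T -> {high_h:.4f}  (formula {0.05 / (g + 0.05):.4f})")
```

Both economies grow at the frontier rate in the long run; human capital determines how far behind the frontier they settle.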
13.2
A more subtle and in many ways more useful model of technology transfer is that of Krugman
(1979), which is also useful for our purposes because it combines interdependences due to
technology transfer with those arising from international trade.
13.2.1
Consider two sets of economies, North and South. All individuals in all countries have the
same Dixit-Stiglitz preferences with love for variety given by
\[ C = \left[ \int_0^M c(i)^{\frac{\varepsilon-1}{\varepsilon}} di \right]^{\frac{\varepsilon}{\varepsilon-1}}, \]
where $c(i)$ is the consumption of the $i$th good, $M$ is the total number of goods, which will be
determined endogenously, and $\varepsilon > 1$ is the elasticity of substitution between these goods.
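\[ \frac{c_N}{c_S} = \left( \frac{p_N}{p_S} \right)^{-\varepsilon} = \left( \frac{w_N}{w_S} \right)^{-\varepsilon}, \tag{13.3} \]
\[ c_N = \frac{L_N}{M_N} \quad \text{and} \quad c_S = \frac{L_S}{M_S}, \]
where $L_N$ is the total labor force in the North, and $L_S$ is the total labor force in the South.
$M_N$ is the total number of new goods (produced in the North) and $M_S$ is the total number of
old goods. Combining this with (13.3) we obtain
\[ \frac{w_N}{w_S} = \left( \frac{L_N M_S}{L_S M_N} \right)^{-1/\varepsilon}. \]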
For this type of equilibrium to exist, we need $w_N/w_S > 1$, and this situation is also drawn
diagrammatically in the next figure as the intersection of the relative demand curve for
Northern labor with the relative supply curve at $L_N/L_S$. Note that $w_N/w_S > 1$ corresponds
to an intersection where the relative demand curve is downward sloping. Instead, the flat
portion of the relative demand curve corresponds to the case where there is no full
specialization, and hence some of the old goods are produced in the North (and $w_N/w_S = 1$).
So if there is a sufficiently large technology gap between the North and South, Northern
wages and incomes will be higher.
What determines the number of new and old goods? Krugman developed a model to
analyze this, formalizing an idea due to Vernon on the product cycle across countries.
In particular, suppose that new goods are created according to the following Poisson
process
\[ \dot M = i M \]
and these goods are imitated by the South slowly, according to the Poisson process
\[ \dot M_S = t M_N, \]
and recall that
\[ M = M_S + M_N. \]
In steady state, we need the number of new and old goods to grow at the same rate, i.e.,
\[ \dot M_S / M_S = \dot M / M. \]
Then
\[ \frac{M_S}{M_N} = \frac{t}{i}. \]
Relative wages can be obtained as:
\[ \frac{w_N}{w_S} = \left( \frac{L_N\, t}{L_S\, i} \right)^{-1/\varepsilon}. \]
In this economy, relative utility and relative incomes per capita are simply proportional to
relative wages.
It is straightforward to check that as $i$, the rate of creation of new technologies, increases,
wages (and incomes) in the North relative to the South will increase. As the rate of imitation,
$t$, increases, the North becomes relatively poorer.
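These comparative statics are easy to verify numerically. The sketch below is only an illustration: it assumes the steady-state formula $w_N/w_S = (L_N t / (L_S i))^{-1/\varepsilon}$ with $\varepsilon$ denoting the elasticity of substitution, and the parameter values are hypothetical.

```python
def relative_wage(L_N, L_S, i, t, eps):
    """Steady-state Northern wage relative to the Southern wage.

    i is the innovation rate, t the imitation rate and eps the elasticity of
    substitution; full specialization requires the result to exceed 1.
    """
    goods_ratio = t / i                      # steady-state M_S / M_N
    return (L_N / L_S * goods_ratio) ** (-1.0 / eps)


base = relative_wage(L_N=1.0, L_S=1.0, i=0.04, t=0.02, eps=2.0)
faster_innovation = relative_wage(L_N=1.0, L_S=1.0, i=0.08, t=0.02, eps=2.0)
faster_imitation = relative_wage(L_N=1.0, L_S=1.0, i=0.04, t=0.03, eps=2.0)

print(f"baseline          w_N/w_S = {base:.4f}")
print(f"faster innovation w_N/w_S = {faster_innovation:.4f}")
print(f"faster imitation  w_N/w_S = {faster_imitation:.4f}")
```

Faster innovation raises the Northern relative wage, while faster imitation lowers it, in line with the discussion above.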
13.2.2
Next, consider a variation on this model without international trade. In this case, the
number of goods produced and consumed in each country will differ. Standard arguments
give incomes in the North and the South as
\[ w_N^0 = M^{\frac{1}{\varepsilon-1}} \quad \text{and} \quad w_S^0 = M_S^{\frac{1}{\varepsilon-1}}, \]
so that
\[ \frac{w_N^0}{w_S^0} = \left( 1 + \frac{i}{t} \right)^{\frac{1}{\varepsilon-1}}. \]
The relative income differences are typically larger now. For example, to illustrate this
point, consider the case in which
\[ \frac{L_N\, t}{L_S\, i} = 1. \]
Then the South and the North will have the same level of income when there is international
trade
13.3
Perhaps the most important source of interaction between countries is international
trade. A number of papers investigate how international trade affects the process of economic
growth and creates interdependences across countries. One example is Acemoglu and Ventura
(2002), who develop a tractable framework for analyzing cross-country income differences
that incorporates international trade. Here I outline a version of that model. An additional
lesson from this model is that the stability of the world income distribution and findings
of conditional convergence do not necessarily rule out endogenous growth (recall that these
patterns were used as evidence against endogenous growth models).
13.3.1
The Model
Consider a world economy consisting of a continuum of small countries with mass 1. There
is a continuum of intermediate products indexed by $z \in [0, M]$, and two final products that
are used for consumption and investment. There is free trade in intermediate goods and no
(13.4)
where $c(t)$ is consumption at date $t$ in the $(\mu, \rho, \phi)$-country (this is the same as the CRRA
preferences we have used so far with $\theta \to 1$).
The budget constraint of the representative consumer is
\[ p_I \dot{k} + p_C c = y \equiv rk + w, \tag{13.5} \]
where $p_I$ and $p_C$ are the prices of the investment and consumption goods, $k$ is capital stock,
$r$ is the rental rate, and $w$ is the wage rate, and also total wage income, since population
in each country is normalized to 1. There is no depreciation of capital. Since there is no
international trade in assets, income, $y$, must equal consumption, $p_C c$, plus investment,
$p_I \dot{k}$.
Specialization is introduced as follows: $\mu$ is assumed to be the number of intermediates
produced by the $(\mu, \rho, \phi)$-country, with
\[ \int \mu(j)\, dG(j) = M, \]
where I have explicitly introduced the $j$ to emphasize that these refer to country $j$, but I
will drop this notation below and talk of a representative country.
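\[ B_C(w, r, p(z)) = w^{(1-\gamma)(1-\tau)}\, r^{\gamma(1-\tau)} \left( \int_0^M p(z)^{1-\varepsilon}\, dz \right)^{\frac{\tau}{1-\varepsilon}}, \tag{13.6} \]
\[ B_I(r, p(z)) = \phi^{-1} r^{1-\tau} \left( \int_0^M p(z)^{1-\varepsilon}\, dz \right)^{\frac{\tau}{1-\varepsilon}}. \tag{13.7} \]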
There are a number of noteworthy features introduced with these unit cost functions:
1. Labor is only used in the production of consumption goods. This is a convenient way of
introducing endogenous growth following Rebelo (1991): the accumulation equation
is linear.
2. I have written the unit cost functions for convenience. The underlying production
functions are quite similar. For example, the investment good would be produced as
follows
M
Z
1
I = BKI1 xI (z) dz
where B is a normalizing constant, KI is capital used in the production of the investment good, and xI (z) is the quantity of the zth intermediate good used in the
production of the investment good.
13.3.2
Equilibrium
Consumer maximization of (13.4) subject to (13.5) yields the following first-order condition
\[ \frac{r(t) + \dot{p}_I(t)}{p_I(t)} - \frac{\dot{p}_C(t)}{p_C(t)} = \rho + \frac{\dot{c}(t)}{c(t)}, \tag{13.8} \]
(13.9)
Equation (13.8) is the standard Euler equation and requires the rate of return to capital,
$\frac{r + \dot{p}_I}{p_I} - \frac{\dot{p}_C}{p_C}$, to equal the rate of time preference plus the slope of the consumption path.
The only difference from the familiar version of the Euler equation is that, as in the two-sector
extended AK economy discussed above, now the rate of return to savings includes the
relative change in the price of investment goods compared to consumption goods, since by
Z
0
t
Z
r (s) + pI (s)
w (v) exp
ds dv .
pI (s)
(13.10)
Next consider firm maximization. The price of any variety of intermediate produced in
the $(\mu, \rho, \phi)$-country is equal to:
\[ p(t) = r(t). \tag{13.11} \]
Choose the ideal price index for intermediates as the numeraire, i.e.,
\[ \int_0^M p(z)^{1-\varepsilon}\, dz = \int \mu\, p^{1-\varepsilon}\, dG = 1. \tag{13.12} \]
Since all countries export practically all of their production of intermediates and import the
ideal basket of intermediates, this choice of numeraire implies that p is also the terms of
trade of the country, i.e. the price of exports relative to imports.
The conditions for price to equal marginal cost for the consumption and investment
sectors imply:
\[ p_C = w^{(1-\gamma)(1-\tau)}\, r^{\gamma(1-\tau)}, \tag{13.13} \]
\[ p_I = \phi^{-1} r^{1-\tau}. \tag{13.14} \]
(13.15)
output, $y$, and exports $\mu p^{1-\varepsilon} Y$. Equation (13.15) implies that when the number of varieties,
$\mu$, is larger, a given level of income $y$ is associated with better terms of trade, $p$, and a higher
rental rate of capital, since
\[ r = p. \]
Intuitively, a greater $\mu$ implies that for a given level of aggregate capital stock, there will be
less capital allocated to each variety of intermediate, so each will command a higher price in
the world market. Conversely, for a given $\mu$, a greater relative income $y/Y$ translates into
lower terms of trade, $p$, and a lower rental rate, $r$.
Market clearing for labor is also straightforward. Labor is demanded only in
the consumption goods sector, and given the Cobb-Douglas assumption, this demand is
$(1-\gamma)(1-\tau)$ times consumption expenditure, $p_C c$, divided by the wage rate, $w$. So the
market clearing condition for labor is:
\[ 1 = (1-\gamma)(1-\tau)\, \frac{p_C c}{w}. \tag{13.16} \]
\[ p_C c = \frac{\rho}{1 - (1-\gamma)(1-\tau)}\, p_I k. \tag{13.17} \]
\[ \frac{\dot{k}}{k} = \phi r^{\tau} - \rho \tag{13.18} \]
(this law of motion simply follows from the budget constraint of the representative consumer,
(13.5), combined with the equilibrium conditions in (13.17)).
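In addition, the market clearing conditions also imply that for each country:
\[ rk + w = \mu r^{1-\varepsilon} \int (rk + w)\, dG, \tag{13.19} \]
\[ \frac{w}{rk + w} = \frac{(1-\gamma)(1-\tau)\rho}{[\gamma + (1-\gamma)\tau]\, \phi r^{\tau} + (1-\gamma)(1-\tau)\rho}. \tag{13.20} \]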
For a given cross-section of rental rates, the set of equations in (13.18) determine the
evolution of the distribution of capital stocks. For a given distribution of capital stocks, the
set of equations in (13.19) and (13.20) determine the cross-section of rental rates.
It can now be shown that the world economy has a unique and stable steady state in
which all countries grow at the same rate.
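Define the world growth rate as $x \equiv \dot{Y}/Y$, and the relative income of a $(\mu, \rho, \phi)$-country
as $y_R \equiv y/Y$. Then, setting the same growth rate for all countries, i.e., $\dot{k}/k = \dot{y}/y = x^*$, the
steady-state cross-section of rental rates is:
\[ r^* = \left( \frac{\rho + x^*}{\phi} \right)^{1/\tau} \tag{13.21} \]
Moreover:
\[ y_R^* = \mu \left( \frac{\phi}{\rho + x^*} \right)^{(\varepsilon - 1)/\tau} \tag{13.22} \]
\[ \int \mu \left( \frac{\phi}{\rho + x^*} \right)^{(\varepsilon - 1)/\tau} dG = 1. \tag{13.23} \]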
Equation (13.22) describes the steady-state world income distribution and states that
rich countries are those which are patient (low $\rho$), create incentives to invest (high $\phi$), and
have access to better technologies (high $\mu$). Equation (13.23) implicitly defines the steady-state world growth rate.
This discussion establishes:
Proposition 35 In the above-described world economy, there exists a unique steady state
equilibrium in which all countries grow at the same rate $x^*$ defined by (13.23), but have
unequal levels of income, terms of trade and rates of return on capital. The terms of trade
and the rental return on capital for each economy are given by (13.21) and the relative position
of each country in the world income distribution is given by (13.22).
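To make the steady state concrete, here is a minimal sketch that solves the world growth condition (13.23) by bisection for a small, made-up set of countries and then evaluates the rental rates and relative incomes. It assumes the reading of (13.21)-(13.23) in which a country is described by technology $\mu$, discount rate $\rho$ and investment incentives $\phi$, with $\varepsilon > 1$; all parameter values and names are hypothetical.

```python
def steady_state(countries, eps=1.6, tau=0.3, lo=0.0, hi=1.0, iters=200):
    """Solve (13.23) for the world growth rate x*, then evaluate (13.21)-(13.22).

    countries: list of (weight, mu, rho, phi); the weights stand in for dG.
    """
    power = (eps - 1.0) / tau

    def excess(x):
        # Left-hand side of (13.23) minus one.
        return sum(g * mu * (phi / (rho + x)) ** power
                   for g, mu, rho, phi in countries) - 1.0

    # With eps > 1 the excess is decreasing in x, so bisection applies
    # as long as excess(lo) > 0 > excess(hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    rental = [((rho + x) / phi) ** (1.0 / tau) for _, _, rho, phi in countries]
    rel_income = [mu * (phi / (rho + x)) ** power for _, mu, rho, phi in countries]
    return x, rental, rel_income


# Three equally weighted hypothetical countries: (weight, mu, rho, phi).
world = [(1 / 3, 1.0, 0.02, 0.10), (1 / 3, 2.0, 0.03, 0.12), (1 / 3, 1.5, 0.04, 0.08)]
x_star, r_star, y_rel = steady_state(world)
print(x_star, r_star, y_rel)
```

All countries share the growth rate `x_star`, while `r_star` and `y_rel` differ across countries, which is the content of the proposition.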
The implications of this model and this proposition are described next.
13.3.3
Implications
13.4
The above model incorporated trade between countries together with terms of trade effects.
An alternative would be to incorporate trade assuming that each country is a small open
economy. This is done in Ventura (1997). If each country is within the cone of diversification,
this means there is factor price equalization, and thus each country takes factor prices as
given.
Imagine the world rate of return to capital is equal to $r^*$; there is no trade in financial
assets (only in goods, which equalizes factor prices), and each country has identical preferences given by our standard CRRA formula. This implies that consumption growth in all
However, now imagine countries differ according to their patience, i.e., discount rate, $\rho_j$, as
we allowed in the previous model. Then the above equation becomes
\[ \frac{\dot{c}_j}{c_j} = \frac{1}{\theta} \left( r^* - \rho_j \right). \]
In this case, more patient countries will have lower initial consumption but higher consumption growth, and therefore they will accumulate more capital and invest in their own country.
Ultimately, the more patient countries will become much richer. This process will end either
when the world moves out of the cone of diversification, or one country produces almost all
of the output of the world economy.
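The divergence result can be illustrated with a short calculation. This is a sketch under the assumption that the Euler equation takes the form $\dot{c}_j/c_j = (r - \rho_j)/\theta$ at a common, constant world return; the numbers and function names are illustrative only.

```python
import math


def consumption_growth(r_world, rho, theta):
    """Consumption growth rate (r - rho_j) / theta at a given world return."""
    return (r_world - rho) / theta


def relative_consumption(t, rho_patient, rho_impatient, theta, ratio0=1.0):
    """Ratio c_patient / c_impatient at date t; the common return r cancels."""
    return ratio0 * math.exp((rho_impatient - rho_patient) * t / theta)


g_patient = consumption_growth(0.05, 0.02, 2.0)
g_impatient = consumption_growth(0.05, 0.04, 2.0)
gap_after_century = relative_consumption(100.0, 0.02, 0.04, 2.0)
print(g_patient, g_impatient, gap_after_century)
```

Because the growth gap is constant, the consumption ratio grows exponentially without bound, which is why the process can only stop when factor price equalization breaks down.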
In fact, this feature that with given prices, the more patient country will ultimately
become much richer than the rest of the world is more general than the open economy model
outlined here. In a closed economy with individuals that have different discount rates, those
with smaller discount rates (greater patience) will ultimately become much richer than the
rest. In general, we tend to assume that all individuals have the same discount rates in order
to ensure a stable income distribution within a country.
Part IV
Endogenous Technological Change
Chapter 14
Expanding Variety Models
The simplest models of endogenous technological change are those in which the variety of
inputs used by firms increases (expands) over time as a result of R&D undertaken by research
firms. The key is that the R&D is purposeful, undertaken for profits, and it leads to an output
that increases the productivity of existing factors.
Two versions of essentially the same model could be used. In the first, research leads
to the invention of new goods, and individuals have love-for-variety, so they derive greater
utility when they have more goods available, so real income increases. In the second, which
is the one I will use here, it is the variety of machines that expands (because of the invention
of new varieties), and a greater variety of machines leads to greater division of labor,
increasing the productivity of final good firms.
In all of these models, and also in the models of quality competition we will see below,
we will use the Dixit-Stiglitz constant elasticity structure.
14.1
We start with a particular version of the growth model with expanding varieties of inputs
and an R&D technology such that only output is used in order to undertake research. This
is sometimes referred to as the lab equipment model, since all that is required for research
is additional investment in more equipment in labs etc.
14.1.1
\[ \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta} \exp(-\rho t)\, dt. \tag{14.1} \]
Throughout I suppress time dependence when this causes no confusion. There is no population growth.
The unique consumption good of the economy is produced with the following aggregate
production function:
\[ Y = \frac{1}{1-\beta} \left[ \int_0^N k(v)^{1-\beta}\, dv \right] L^{\beta} \tag{14.2} \]
where $L$ is the aggregate labor input, $N$ denotes the number of different varieties of capital
inputs, and $k(v)$ is the total amount of capital (machine) of input type $v$. The term $(1-\beta)$
in the denominator is included for notational simplicity. Notice that for given $N$, which final
good producers take as given, equation (14.2) exhibits constant returns to scale. Therefore,
final good producers are competitive and subject to constant returns to scale, justifying our
use of the aggregate production function to represent their production possibilities set.
\[ C + I + X \le Y, \tag{14.3} \]
where I is investment and X is expenditure on R&D, which is for now assumed to come out
of the total supply of the final good. (Other models of R&D will be discussed below).
Assume that the creation of new inputs takes place as follows:
\[ \dot{N} = \eta X, \tag{14.4} \]
and the economy starts with some initial technology stock $N(0) > 0$.
This implies that greater spending on R&D leads to the invention of new inputs. There
is no uncertainty in this process, at least at the aggregate level. One may want to think that
there is uncertainty at the individual level, but with many different research labs undertaking
such expenditure, at the aggregate level, equation (14.4) holds deterministically.
The important point is that R&D expenditure expands the potential set of capital/machine
varieties.
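A firm that invents a new capital variety is the sole supplier of that type of machine,
and sets its price $\chi(v)$ to maximize profits. The demand for capital of type $v$ is obtained
by maximizing (14.2). Namely, simply considering the aggregate production function, the
maximization problem for inputs is:
\[ \max_{[k(v)]_{v \in [0,N]},\, L} \ \frac{1}{1-\beta} \left[ \int_0^N k(v)^{1-\beta}\, dv \right] L^{\beta} - \int_0^N \chi(v) k(v)\, dv - wL. \tag{14.5} \]
The first-order condition with respect to $k(v)$ gives the demand for machines:
\[ k(v) = \frac{L}{\chi(v)^{1/\beta}}. \tag{14.6} \]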
Assume also that, once the blueprint of a particular input is invented, the research firm
can create one unit of that machine at marginal cost equal to $\psi$ units of the final good.
Now consider the monopolist owning a machine of type $v$ invented at time $t$. This
monopolist chooses an investment plan and a sequence of capital stocks so as to maximize
the present discounted value of profits starting from time $t$, as given by
\[ V(v, t) = \int_t^\infty \exp\left( -\int_t^s r(\tau)\, d\tau \right) \left[ \chi(v, s) k(v, s) - \psi k(v, s) \right] ds \tag{14.7} \]
where $r(t)$ is the market interest rate at time $t$. Alternatively, assuming that the value
function is differentiable in time, this could be written as a dynamic programming equation
of the form
\[ r(t) V(v, t) - \dot{V}(v, t) = \chi(v, t) k(v, t) - \psi k(v, t). \tag{14.8} \]
14.1.2
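To see why (14.8) follows from (14.7), you should think of the principle of optimality again
(now in continuous time rather than discrete time). In particular, rewrite (14.7) at time $t$
as:
\[ V(v, t) = \int_t^{t+\Delta t} \exp\left( -\int_t^s r(\tau)\, d\tau \right) \left( \chi(v, s) - \psi \right) k(v, s)\, ds + \int_{t+\Delta t}^\infty \exp\left( -\int_t^s r(\tau)\, d\tau \right) \left[ \chi(v, s) k(v, s) - \psi k(v, s) \right] ds. \]
The second term is equal to $\exp\left( -\int_t^{t+\Delta t} r(\tau)\, d\tau \right) V(v, t + \Delta t)$. Differentiating both sides with respect to $\Delta t$ and evaluating at $\Delta t = 0$,
thus applying the chain rule, gives
\[ 0 = \left( \chi(v, t) - \psi \right) k(v, t) - r(t) V(v, t) + \dot{V}(v, t), \]
which is (14.8).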
14.1.3
Characterization of Equilibrium
Since (14.6) defines isoelastic demands, the solution to the maximization problem of the
monopolist involves setting the same price in every period,
\[ \chi(v, t) = \frac{\psi}{1-\beta}, \]
that is, all monopolists charge a constant rental rate, equal to a mark-up over the marginal
cost. Without loss of generality, normalize the marginal cost of machine production to
$\psi \equiv (1-\beta)$, so that
\[ \chi(v, t) = \chi = 1 \]
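With this price, (14.6) gives
\[ k(v, t) = L \tag{14.9} \]
and flow profits
\[ \pi(v, t) = \beta L, \tag{14.10} \]
implying that all monopolists sell exactly the same amount, charge the same price and make
the same amount of profits.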
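The markup rule can be verified numerically. The following is a minimal sketch, assuming the isoelastic demand $k = L\chi^{-1/\beta}$ and marginal cost $\psi = 1 - \beta$; it uses a plain grid search rather than calculus, and the grid bounds and parameter values are arbitrary illustrative choices.

```python
def monopoly_price_by_grid(beta=0.4, psi=0.6, labor=1.0, upper=5.0, points=200001):
    """Grid-search the price that maximizes (chi - psi) * k(chi)."""
    best_price, best_profit = psi, 0.0
    for n in range(1, points):
        chi = psi + (upper - psi) * n / (points - 1)
        demand = labor * chi ** (-1.0 / beta)  # isoelastic machine demand
        profit = (chi - psi) * demand
        if profit > best_profit:
            best_price, best_profit = chi, profit
    return best_price, best_profit


price, profit = monopoly_price_by_grid()
print(price, profit)  # compare with psi / (1 - beta) and beta * labor
```

With $\beta = 0.4$ and $\psi = 0.6$ the search lands close to a price of one and profits of $\beta L$, matching the closed-form solution.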
Substituting (14.6) and the machine prices into (14.2), we obtain
\[ Y(t) = \frac{1}{1-\beta} N(t) L. \tag{14.11} \]
This is the major equation of the expanding product or input variety models. It shows
that even though the aggregate production function is constant returns to scale from the
viewpoint of final good firms which take $N$ as given, for the overall economy, there are
increasing returns to scale, and increases in the variety of machines, $N$, raise the productivity of labor. In particular, (14.11) makes it clear that if $N$ increases at a constant
rate, so will output per capita.
Similarly, the labor decision of the final good sector, from the first-order condition of
maximizing (14.5) with respect to $L$, implies the following equilibrium condition
\[ w(t) = \frac{\beta}{1-\beta} N(t). \tag{14.12} \]
Finally, there is free entry into research. This implies that at all points in time we must
have
\[ \eta V(v, t) = 1, \tag{14.13} \]
14.1.4
Definition of Equilibrium
An equilibrium consists of time paths $[C(t), X(t)]_{t=0}^{\infty}$ such that given the price path $[r(t), w(t)]_{t=0}^{\infty}$, the representative household
is maximizing its utility given by (14.1), capital demands by the final goods sector satisfy
(14.9), the wage rate is given by (14.12), and the value of each monopolist, $V(v, t)$, satisfies
(14.7) and (14.13).
14.1.5
Steady State
Let us start with the steady state. In the steady state, the value of an invention will be
constant, thus $\dot{V} = 0$, and also the interest rate will be constant, i.e., $r(t) = r^*$ (where I
again use stars to denote BGP/steady-state values). Substituting this in either (14.7) or
(14.8), we obtain
\[ V^* = \frac{\pi}{r^*}, \]
where $\pi$ is the (constant) flow of net profits per period, given by (14.10) above.
(14.14)
\[ \frac{\dot{C}}{C} = \frac{1}{\theta} (r - \rho) \tag{14.15} \]
and in steady state, the rate of growth of the economy is the same as the rate of growth of
consumption, so we have that the whole economy grows at the rate $g^* = g_c^*$.
Therefore, given the steady-state interest rate we can simply determine the long-run
growth rate of the economy as:
\[ g^* = \frac{1}{\theta} (\eta \beta L - \rho) \tag{14.16} \]
Since this is a growing economy, we need to ensure that the transversality condition is
satisfied in equilibrium. As usual, this requires $r^* > g^*$ (since there is no population growth),
i.e.,
\[ (1-\theta)\, \eta \beta L < \rho, \tag{14.17} \]
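The balanced growth path can be summarized in a few lines of code. This is a sketch assuming the steady-state interest rate is $r^* = \eta\beta L$ (free entry combined with flow profits $\beta L$), the growth rate follows (14.16), and the transversality condition is (14.17); the function name and the parameter values are hypothetical.

```python
def lab_equipment_bgp(eta, beta, labor, rho, theta):
    """Return (r*, g*, transversality_ok) for the lab equipment model."""
    r_star = eta * beta * labor                # interest rate pinned down by free entry
    g_star = (r_star - rho) / theta            # growth rate from the Euler equation
    tvc_ok = (1.0 - theta) * r_star < rho      # condition (14.17), i.e. r* > g*
    return r_star, g_star, tvc_ok


r_star, g_star, tvc_ok = lab_equipment_bgp(eta=0.1, beta=0.5, labor=1.0,
                                           rho=0.02, theta=2.0)
print(r_star, g_star, tvc_ok)
```

Doubling `labor` in this call doubles the interest rate and raises the growth rate, which is the scale effect discussed later in the chapter.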
14.1.6
Transitional Dynamics
It is also straightforward to see that there are no transitional dynamics in this model. To
see this, let us go back to the value function for each monopolist. Substituting for profits,
this gives
\[ r(t) V(v, t) - \dot{V}(v, t) = \beta L. \]
Free entry gives
\[ \eta V(v, t) = 1. \]
Differentiating this with respect to time immediately implies $\dot{V}(v, t) = 0$, which is only
consistent with $r(t) = r^*$ for all $t$, thus
\[ r(t) = \eta \beta L \text{ for all } t. \]
This establishes:
Proposition 37 In the above-described expanding input-variety model of endogenous technological change, with initial technology stock $N(0) > 0$, there is a unique equilibrium path
in which technology, output and consumption always grow at the rate $g^*$ as in (14.16).
14.1.7
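\[ \max_{[k(v)]_{v \in [0,N]},\, L} \ \frac{1}{1-\beta} \left[ \int_0^N k(v)^{1-\beta}\, dv \right] L^{\beta} - \int_0^N \psi k(v)\, dv - wL, \]
which only differs from the private maximization problem because the marginal cost of
machine creation, $\psi$, is used instead of the monopoly price. Recalling that $\psi \equiv 1-\beta$, this implies
\[ k_s(v) = \frac{L}{(1-\beta)^{1/\beta}} \]
thus
\[ Y(t) = \frac{(1-\beta)^{-(1-\beta)/\beta}}{1-\beta} N(t) L = (1-\beta)^{-1/\beta} N(t) L. \]
Output net of spending on machines is then
\[ Y(t) - \int_0^{N(t)} \psi k_s(v)\, dv = (1-\beta)^{-1/\beta} N(t) L - (1-\beta)^{-(1-\beta)/\beta} N(t) L = (1-\beta)^{-1/\beta} \beta N(t) L. \]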
Now, given this and (14.4), the maximization problem of the social planner can be written
as
\[ \max \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta} \exp(-\rho t)\, dt \]
subject to
\[ \dot{N}(t) = \eta (1-\beta)^{-1/\beta} \beta N(t) L - \eta C(t). \]
In this problem, $N(t)$ is the state variable, and $C(t)$ is the control variable.
Let us set up the current-value Hamiltonian
\[ \hat{H}(N, C, \mu) = \frac{C(t)^{1-\theta} - 1}{1-\theta} + \mu(t) \left[ \eta (1-\beta)^{-1/\beta} \beta N(t) L - \eta C(t) \right]. \]
\[ \frac{\dot{C}}{C} = \frac{1}{\theta} \left( \eta (1-\beta)^{-1/\beta} \beta L - \rho \right), \tag{14.18} \]
which can be directly compared to the growth rate in the decentralized equilibrium, (14.16).
The comparison boils down to that of
\[ (1-\beta)^{-1/\beta} \beta \ \text{ to } \ \beta, \]
and it is straightforward to see that the former is always greater since $(1-\beta)^{-1/\beta} > 1$ by
virtue of the fact that $\beta \in (0, 1)$. This implies that the socially-planned economy will always
grow faster than the decentralized economy. Intuitively, the social planner values innovation
more, because it will be able to use the machines more intensively after innovation, since
the monopoly markup reducing the demand for machines is absent in the social planner's
allocation.
This establishes:
Proposition 38 In the above-described expanding input-variety model, the decentralized
equilibrium is not Pareto optimal, and always grows more slowly than the allocation that would maximize utility of the representative household.
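A quick numerical comparison of the two growth rates is sketched below. It assumes the decentralized rate is $(\eta\beta L - \rho)/\theta$ as in (14.16) and the planner's rate is $(\eta(1-\beta)^{-1/\beta}\beta L - \rho)/\theta$ as in (14.18); the function names and parameter values are illustrative only.

```python
def decentralized_growth(eta, beta, labor, rho, theta):
    """Equilibrium growth rate, in the form of (14.16)."""
    return (eta * beta * labor - rho) / theta


def planner_growth(eta, beta, labor, rho, theta):
    """Socially planned growth rate, in the form of (14.18)."""
    return (eta * (1.0 - beta) ** (-1.0 / beta) * beta * labor - rho) / theta


params = dict(eta=0.1, beta=0.5, labor=1.0, rho=0.02, theta=2.0)
g_eq = decentralized_growth(**params)
g_sp = planner_growth(**params)
print(g_eq, g_sp)
```

The gap between the two rates comes entirely from the factor $(1-\beta)^{-1/\beta}$, which exceeds one for any $\beta$ in $(0, 1)$.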
14.1.8
The divergence between the decentralized equilibrium and the socially planned allocation
introduces the possibility that there might be Pareto-improving interventions. The most
natural alternatives to consider in this model are two:
2. Subsidies to Capital Inputs: the problem also arises from the fact that the decentralized
economy is not using as many units of the machines/capital inputs (because of the
monopoly markup); so subsidies to capital inputs given to final good producers would
also be useful in increasing the growth rate.
14.2
In the model of the previous section, growth resulted from the use of final output for R&D.
This is similar, in some ways, to the endogenous growth model of Rebelo (1991), since the
accumulation equation is linear in accumulable factors. As a result, we saw that, in equilibrium, output took a linear form in the stock of knowledge (new machines), thus an AN form
instead of Rebelo's AK form.
An alternative is to have scarce factors used in R&D. In other words, instead of the
lab equipment, we now have scientists as the key creators of R&D. In this case, there will not
be endogenous growth, unless there are knowledge spillovers from past R&D. In other words,
now current researchers need to "stand on the shoulders of past giants". In fact, the original
formulation by Romer (1990) was exactly of this knowledge-spillovers form, imposing
"standing on the shoulders of giants" as part of the technological possibilities frontier of the
economy.
A typical formulation in this case is
\[ \dot{N} = \eta N L_R \tag{14.19} \]
where $L_R$ is labor allocated to R&D. The term $N$ on the right-hand side captures spillovers
from the stock of existing ideas. The greater is $N$, the more productive is an R&D worker.
$L_R$ could be skilled workers as in Romer (1990), or scientists or regular workers. In the
latter case, there will be competition between the production sector and the R&D sector for
workers, and the marginal cost of research would be given by the wage rate in the
production sector. In particular, the free entry condition is now
\[ \eta N(t) V(v, t) = w(t), \]
where, as before,
\[ w(t) = \frac{\beta}{1-\beta} N(t). \]
So the steady-state free-entry condition, with a constant steady-state (balanced growth path)
interest rate, $r^*$, becomes
\[ \eta N(t) \frac{\beta L}{r^*} = \frac{\beta}{1-\beta} N(t), \]
so that $r^* = (1-\beta)\eta L$ and, from the Euler equation (14.15),
\[ \frac{\dot{C}}{C} = \frac{1}{\theta} \left( (1-\beta)\, \eta L - \rho \right). \tag{14.20} \]
The rest of the analysis is unchanged. In particular, the growth rates of technology and
output are also given by (14.20). Also, there are again no transitional dynamics, and we can
also compare the decentralized equilibrium to the Pareto optimal allocation. It is also useful
to note that there is again a scale effect here: greater $L$ increases the interest rate and the
growth rate in the economy.
This discussion immediately establishes:
Proposition 39 In the above-described expanding input-variety model with knowledge spillovers,
there exists a unique balanced growth path equilibrium in which, technology, output and con325
14.2.1
Since we now have a model with monopolistic competition, we can also relate the results
to standard issues in industrial organization, such as competition policy, anti-trust, patents,
etc. For example, in this model we can introduce a fringe of competitive firms which could
limit the markup that each monopolist can charge. Recall that the optimal
markup that the monopolist charges is
\[ \chi = \frac{\psi}{1-\beta}. \]
Imagine, instead, that a fringe of competitive firms can copy the innovation of any monopolist, but they will not be able to produce at the same level of costs (because the inventor
has more know-how). In particular, suppose that instead of a marginal cost $\psi$, they will
have marginal cost of $\gamma\psi$ with $\gamma > 1$. If $\gamma > 1/(1-\beta)$, this fringe is not a threat to the
monopolist, since the monopolist could set its ideal, profit maximizing, markup and the
fringe would not be able to enter without making losses. However, if $\gamma < 1/(1-\beta)$, the
fringe would prevent the monopolist from setting its ideal monopoly price. In particular, in
this case the monopolist would be forced to set a limit price, exactly equal to
\[ \chi = \gamma\psi. \tag{14.21} \]
This price formula follows immediately by noting that, if the price of the monopolist were
higher than this, the fringe could undercut and make profits, since their marginal cost is
equal to $\gamma\psi$. If it were below this, the monopolist could further increase its price without
\[ g = \frac{1}{\theta} \left( \eta\, \gamma^{-1/\beta} (\gamma - 1) (1-\beta)^{-(1-\beta)/\beta} L - \rho \right), \]
which is less than (14.16). Therefore, in this model, somewhat counter-intuitively, greater
competition, which reduces markups (and thus static distortions), also reduces long-run
growth. This is because profits are important in this model to encourage innovation by new
research firms. If these profits are cut, incentives for research are also reduced. Of course,
welfare is not the same as growth, and some degree of competition reducing prices below the
unconstrained monopolistic level might be useful for welfare depending on the discount rate
of the representative household. Essentially, with a lower markup, households are happier
in the present, but suffer slower consumption growth. The exact tradeoff between these two
opposing effects depends on the discount rate of the representative household.
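The effect of the competitive fringe on growth can be tabulated with a small sketch. It assumes the normalization $\psi = 1 - \beta$, a limit price $\gamma\psi$ whenever $\gamma < 1/(1-\beta)$, and the isoelastic machine demand used earlier, so that flow profits are $\gamma^{-1/\beta}(\gamma - 1)(1-\beta)^{-(1-\beta)/\beta} L$; function names and numbers are hypothetical.

```python
def flow_profit(gamma, beta, labor=1.0):
    """Per-machine flow profit when fringe firms have marginal cost gamma * psi."""
    unconstrained_markup = 1.0 / (1.0 - beta)
    if gamma >= unconstrained_markup:
        return beta * labor  # the fringe does not bind: unconstrained monopoly profit
    return (gamma ** (-1.0 / beta) * (gamma - 1.0)
            * (1.0 - beta) ** (-(1.0 - beta) / beta) * labor)


def growth_with_fringe(gamma, eta, beta, labor, rho, theta):
    """Growth rate when free entry equates eta * profit / r to one."""
    return (eta * flow_profit(gamma, beta, labor) - rho) / theta


g_tough = growth_with_fringe(1.5, eta=0.1, beta=0.5, labor=1.0, rho=0.02, theta=2.0)
g_lax = growth_with_fringe(3.0, eta=0.1, beta=0.5, labor=1.0, rho=0.02, theta=2.0)
print(g_tough, g_lax)
```

As $\gamma$ rises towards $1/(1-\beta)$ the limit-pricing profit converges to the unconstrained level $\beta L$, so the growth rate is continuous in the degree of competition.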
Another similar application is to that of patent policy. In practice, patents are for
limited durations. In the baseline model, we assumed that patents are perpetual; once a
firm invents a new good, it has a patent forever and it becomes the monopolist for that good
14.3
As we have seen, the models used so far feature a scale effect in the sense that a larger
population, $L$, translates into a higher interest rate and a higher growth rate. This is
problematic for three reasons, as argued in a series of papers by Chad Jones:
1. Larger countries do not necessarily grow faster (though the larger market of the United
States or European economies may have been an advantage during the early phases of
the industrialization process).
2. The population in general is not constant, but growing. If we have constant population
growth as in the standard neoclassical growth model, e.g., L (t) = exp (nt) L (0), these
models would not feature a balanced growth path. Instead, growth would become faster
and faster over time, eventually leading to an infinite output in finite time, violating
\[ \int_0^\infty \exp(-\rho t)\, \frac{C^{1-\theta} - 1}{1-\theta}\, dt, \tag{14.22} \]
where $C$ is consumption defined over the final good of the economy. This good is produced as
before, more specifically, with the production function (14.2), and all the other assumptions
are the same as before.
New goods are produced by allocating workers to the R&D process as in the knowledge-spillovers model studied in the previous section. However, now there are limited knowledge
spillovers, in particular,
\[ \dot{N}(t) = \eta N(t)^{\phi} L_R(t) \tag{14.23} \]
where $\phi < 1$ and $L_R$ is labor allocated to R&D. So labor market clearing requires
\[ L_E(t) + L_R(t) = L, \tag{14.24} \]
where $L_E(t)$ is the level of employment in the production sector. The fact that not all
workers are in the production sector implies that the aggregate output of the economy (by
an argument similar to before) is given by
\[ Y(t) = \frac{1}{1-\beta} N(t) L_E(t), \]
LE (t)
N (t) ,
= w (t) =
r
1
(1 ) LE (t)
= 1.
r
=
.
gN
N (t)
1
(14.25)
From equation (14.11), this implies that total output grows at the rate $g_N + n$. But now there
is population growth, so consumption per capita grows at the rate
\[ g_c = g_N, \]
that is,
\[ g_c = \frac{n}{1-\phi}. \tag{14.26} \]
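The absence of a scale effect in this version can be seen in a short simulation. This is a sketch assuming the R&D technology $\dot{N} = \eta N^{\phi} L_R$ with $\phi < 1$ and R&D employment growing at the population growth rate $n$ (a constant share of a growing labor force); the Euler discretization in logs and all parameter values are illustrative assumptions.

```python
import math


def simulate_idea_growth(eta=1.0, phi=0.5, n=0.02, n0=1.0, lr0=0.1,
                         dt=0.05, horizon=2000.0):
    """Return the growth rate of N after a long horizon."""
    log_n, log_lr = math.log(n0), math.log(lr0)
    g = 0.0
    for _ in range(int(horizon / dt)):
        # (dN/dt) / N = eta * N**(phi - 1) * L_R
        g = eta * math.exp((phi - 1.0) * log_n + log_lr)
        log_n += g * dt
        log_lr += n * dt  # R&D employment grows with population
    return g


g_small = simulate_idea_growth(lr0=0.1)
g_large = simulate_idea_growth(lr0=10.0)
print(g_small, g_large, 0.02 / (1.0 - 0.5))
```

Both runs converge to the same rate $n/(1-\phi)$ even though the second economy starts with a research workforce one hundred times larger: the level of the population affects the level of technology, not its long-run growth rate.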
Chapter 15
Models of Quality Competition
15.1
Baseline Model
In the model of expanding machine variety, different machines were complements in production. However, in practice when a better computer comes to the market, it replaces previous
models. This is captured in the models of vertical quality competition or quality improvement, such as the models in Aghion and Howitt, or Grossman and Helpman. Population
and labor supply are again constant at $L$. The major difference from the previous setup is
that the production function is now
\[ Y(t) = \frac{1}{1-\beta} \left[ \int_0^1 q(v, t)\, k(v, t)^{1-\beta}\, dv \right] L^{\beta} \tag{15.1} \]
where $q(v, t)$ is the quality of machine $v$ at time $t$ and because now the number of varieties
is constant, I have normalized it to 1. Consequently, while in the previous section growth
took place because the variety of inputs expanded, here it takes place because existing inputs
become more productive. In many ways, this seems to describe the growth process better,
(15.2)
Let us normalize = 1 , so the monopolist sets the price (v, t) = q(v, t), and sells
k (v) = L. This generates profits
(v, t) =
1
Lq(v, t)
(15.3)
1
Q (t) L
1
Q (t) =
where
q(v, t)dv
The value of being the inventor is different now, because this position will not last forever.
More formally, the standard dynamic programming equation now becomes:
\[ r(t) V(v, t) - \dot{V}(v, t) = \pi(v, t) - x(v, t) V(v, t) \]
(15.4)
Otherwise, there will be entry into or exit from research, since one more unit of the final
good provides a flow rate of obtaining V .
In steady state, $\dot{V}(v, t) = 0$. So, dropping time and sector dependence and using stars
again to denote BGP values, we have
V
=
r + x
r + g /( 1)
q( 1)2 L
=
= 1 q.
[( 1)r + g ]
=
where the penultimate equality follows from substituting for profits from (15.3), and the last
equality follows from free entry condition (15.4).
1
( 1)2 L ( 1) .
(( 1) + 1)
1
g =
( + 1/ ( 1))
( 1)
L .
(15.5)
This establishes:
Proposition 41 In the above-described quality-improvement model, there exists a unique
balanced growth path equilibrium in which output and consumption grow at the same rate
given by (15.5). The rate of innovation is g / ( 1).
15.2
Pareto Optimality
This equilibrium, like that of the endogenous technology model with expanding input varieties, is not, generally, Pareto optimal. But in fact, this can be because there is too little
or too much innovation. The reason why there may be too little innovation is the same as in the
model in the previous section: a monopolist does not sell as many units of the new machines
as the social planner would like, and does not fully internalize the benefits accruing to final
good producers (and the economy) from further innovation. However, counteracting this
there is the business stealing effect coming from the Schumpeterian nature of the model; a
new innovation steals the profits of the existing monopolist. This tends to induce entrants
L
= 1/ L,
1/
given the assumption that in this case = 1 . This implies that total output, under the
socially-planned economy, is equal to
Y (t) =
(1)/
Q (t) L.
(1 )
=
=
q (v, t) ks (v, t) dv
(1)/
Q (t) L (1)/ Q (t) L
(1 )
(1)/
Q (t) L.
1
(15.6)
Finally, note that given the assumptions above, the social planner faces an aggregate
technology frontier of the form
Q (t) = ( 1) X (t) ,
Z
0
C (t)1 1
exp (t) dt
1
subject to
(1)/
Q (t) = ( 1)
Q (t) L ( 1) C (t) ,
where the constraint uses net output, (15.6), and the budget constraint.
In this problem, Q (t) is the state variable, and C (t) is the control variable.
Let us again set up the current-value Hamiltonian
"
#
1
(1)/
C
(t)
(Q, C, ) =
H
+ (t) ( 1)
Q (t) L ( 1) C (t) .
1
1
The necessary conditions are
C (N, C, ) = 0 = C (t) = ( 1) (t)
H
(1)/
Combining these conditions, we obtain the following growth rate for consumption in the
social planner's allocation:
1
C
=
C
!
(1)/
( 1)
L .
1
(15.7)
Chapter 16
Directed Technical Change
The framework analyzed so far assumed technical change to be neutral towards different factors, and in fact, in most applications, we limited ourselves to the Cobb-Douglas production
function.
Technical change is often not neutral towards different factors of production, and the
elasticity of substitution between different factors is often found not to be equal to 1.
So it is important to consider the implications of more general production functions,
and think of endogenizing technology and technological differences within this more general
framework. There are, however, reasons for economists' focus on the Cobb-Douglas production
function. The most important one is that a general production function, associated with
arbitrary technological progress, does not generate balanced growth. Instead, with a non-Cobb-Douglas production function, balanced growth requires all technical change to be labor-augmenting. Therefore, once we abandon the Cobb-Douglas production function, we need to
develop a theory of why technical change is purely labor-augmenting, and more generally
think about various biases in the nature of technical change.
1. The price effect: there will be stronger incentives to develop technologies when the
goods produced by these technologies command higher prices.
2. The market size effect: it is more profitable to develop technologies that have a larger
market. The importance of market size in innovation was much emphasized by the
famous scholar of innovation, Jacob Schmookler (1966), who, for example, argued:
"invention is largely an economic activity which, like other economic activities, is
pursued for gain;... expected gain varies with expected sales of goods embodying the
invention."
16.1
16.1.1
Definitions
First consider what factor-augmenting and factor-biased technical change correspond to.
For this purpose, take the standard constant elasticity of substitution (CES) production
function
\[ y = \left[ \gamma (A_L L)^{\frac{\sigma-1}{\sigma}} + (1-\gamma) (A_Z Z)^{\frac{\sigma-1}{\sigma}} \right]^{\frac{\sigma}{\sigma-1}}, \]
where $L$ is labor, and $Z$ denotes another factor of production, which could be capital or
skilled labor.
Here $\sigma \in (0, \infty)$ is the elasticity of substitution between the two factors.
$A_L$ is labor-augmenting (labor-complementary) and $A_Z$ is $Z$-complementary. The relative
marginal product of the two factors is:
\[ \frac{MP_Z}{MP_L} = \frac{1-\gamma}{\gamma} \left( \frac{A_Z}{A_L} \right)^{\frac{\sigma-1}{\sigma}} \left( \frac{Z}{L} \right)^{-\frac{1}{\sigma}}. \tag{16.1} \]
This implies that when $\sigma > 1$, i.e., when the two factors are gross substitutes, $A_L$ is labor-biased and $A_Z$ is $Z$-biased. In contrast, when $\sigma < 1$, i.e., when the two factors are gross
complements, $A_Z$ is labor-biased and $A_L$ is $Z$-biased.
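The distinction between factor-augmenting and factor-biased technical change can be checked numerically. The following is a minimal sketch, assuming the CES form with share parameter $\gamma$ and elasticity $\sigma$; it compares the closed-form relative marginal product with a finite-difference calculation and then shows how the bias of $A_Z$ flips with $\sigma$. All names and parameter values are illustrative.

```python
def ces_output(a_l, a_z, labor, z, sigma, gamma=0.5):
    """CES aggregate of effective labor A_L * L and effective Z input A_Z * Z."""
    r = (sigma - 1.0) / sigma
    return (gamma * (a_l * labor) ** r + (1.0 - gamma) * (a_z * z) ** r) ** (1.0 / r)


def relative_marginal_product(a_l, a_z, labor, z, sigma, gamma=0.5):
    """Closed form for MP_Z / MP_L, in the form of (16.1)."""
    return (((1.0 - gamma) / gamma)
            * (a_z / a_l) ** ((sigma - 1.0) / sigma)
            * (z / labor) ** (-1.0 / sigma))


def numerical_ratio(a_l, a_z, labor, z, sigma, gamma=0.5, h=1e-6):
    """MP_Z / MP_L from central finite differences of the CES function."""
    mp_z = (ces_output(a_l, a_z, labor, z + h, sigma, gamma)
            - ces_output(a_l, a_z, labor, z - h, sigma, gamma)) / (2.0 * h)
    mp_l = (ces_output(a_l, a_z, labor + h, z, sigma, gamma)
            - ces_output(a_l, a_z, labor - h, z, sigma, gamma)) / (2.0 * h)
    return mp_z / mp_l


for sigma in (0.5, 2.0):
    before = relative_marginal_product(1.0, 1.0, 1.5, 1.0, sigma)
    after = relative_marginal_product(1.0, 2.0, 1.5, 1.0, sigma)
    print(sigma, before, after)  # effect of doubling A_Z on MP_Z / MP_L
```

With $\sigma = 2$ a higher $A_Z$ raises the relative marginal product of $Z$, while with $\sigma = 0.5$ it lowers it, so the same factor-augmenting change is biased towards different factors depending on the elasticity of substitution.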
16.1.2
Basic Model
Now we are in a position to consider a simple model of directed technical change. Assume
that preferences are again given by the CRRA function
\[ \int_0^\infty \frac{C^{1-\theta} - 1}{1-\theta} \exp(-\rho t)\, dt. \tag{16.2} \]
\[ C + I + X \le Y \equiv \left[ \gamma Y_L^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma) Y_Z^{\frac{\varepsilon-1}{\varepsilon}} \right]^{\frac{\varepsilon}{\varepsilon-1}} \tag{16.3} \]
In words, the output aggregate is produced from two other (intermediate) goods, $Y_L$ and
$Y_Z$, with elasticity of substitution $\varepsilon$. Here $Y$ can either be interpreted as the final good
aggregated from the two intermediates, $Y_L$ and $Y_Z$, or $Y$ could be an index of utility defined
over the two final goods, $Y_L$ and $Y_Z$. Total output is again distributed between consumption,
$C$, spending on machines, $I$, and spending on R&D, $X$.
The fact that there is R&D spending signifies that I will use the lab-equipment model
to expose the basic ideas, but exactly the same results apply with the knowledge-spillovers
model.
Intermediate good production functions are:
\[ Y_L(t) = \frac{1}{1-\beta} \left( \int_0^{N_L} x_L(j, t)^{1-\beta}\, dj \right) L^{\beta}, \tag{16.4} \]
and
\[ Y_Z(t) = \frac{1}{1-\beta} \left( \int_0^{N_Z} x_Z(j, t)^{1-\beta}\, dj \right) Z^{\beta}. \tag{16.5} \]
Note here that the ranges of machines used in the two sectors are different (there are
two disjoint sets of machines, though we use the index $j$ to denote either for notational
simplicity).
Assume that machines for both sectors are supplied by technology monopolists. This is
a straightforward generalization of the endogenous technical change model of product variety
discussed above.
Each monopolist sets a rental price $\chi_L(j,t)$ or $\chi_Z(j,t)$ for the machine it supplies to the
market. These prices are potentially time-varying, but we will see that they will be constant
in equilibrium.
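Profit maximization by the producers of the labor-intensive good gives the demand for machines
$$x_L(j,t) = \left[\frac{p_L(t)}{\chi_L(j,t)}\right]^{1/\beta} L. \tag{16.7}$$
Similarly
$$x_Z(j,t) = \left[\frac{p_Z(t)}{\chi_Z(j,t)}\right]^{1/\beta} Z, \tag{16.8}$$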
Since the demand curve for machines facing the monopolist, (16.7), is iso-elastic, the
profit-maximizing price will be a constant markup over marginal cost. In particular, all
machine prices will be given by
$$\chi_L(j,t) = \chi_Z(j,t) = 1 \text{ for all } j \text{ and } t.$$
These imply that
$$x_L(j,t) = [p_L(t)]^{1/\beta} L \text{ for all } j,$$
and
$$x_Z(j,t) = [p_Z(t)]^{1/\beta} Z \text{ for all } j.$$
Substituting these into (16.4) and (16.5), we obtain
$$Y_L(t) = \frac{1}{1-\beta}[p_L(t)]^{\frac{1-\beta}{\beta}} N_L(t) L \quad\text{and}\quad Y_Z(t) = \frac{1}{1-\beta}[p_Z(t)]^{\frac{1-\beta}{\beta}} N_Z(t) Z. \tag{16.9}$$
Let $V_Z$ and $V_L$ be the net present discounted values of new innovations. Then in steady
state, we have that (dropping time dependence):
$$V_L = \frac{\beta p_L^{1/\beta} L}{r} \quad\text{and}\quad V_Z = \frac{\beta p_Z^{1/\beta} Z}{r}. \tag{16.10}$$
The comparison of these two values is of crucial importance. The greater is VZ relative
to VL , the greater are the incentives to develop Z-complementary machines, NZ , rather than
NL .
This highlights the two effects on the direction of technical change that I mentioned
above:
1. The price effect: a greater incentive to invent technologies producing more expensive
goods.
2. The market size effect: a larger market for the technology leads to more innovation.
The market size effect encourages innovation for the more abundant factor.
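It is straightforward from the final good production function given in (16.3) that the
relative price of good Z to good L will be given by
$$p \equiv \frac{p_Z}{p_L} = \frac{1-\gamma}{\gamma}\left(\frac{Y_Z}{Y_L}\right)^{-\frac{1}{\varepsilon}} = \frac{1-\gamma}{\gamma}\left(p^{\frac{1-\beta}{\beta}}\frac{N_Z Z}{N_L L}\right)^{-\frac{1}{\varepsilon}}.$$
Solving this for $p$ and substituting into (16.10), the relative profitability of the two types of innovation is
$$\frac{V_Z}{V_L} = \left(\frac{1-\gamma}{\gamma}\right)^{\frac{\varepsilon}{\sigma}}\left(\frac{N_Z}{N_L}\right)^{-\frac{1}{\sigma}}\left(\frac{Z}{L}\right)^{\frac{\sigma-1}{\sigma}}, \tag{16.11}$$
where
$$\sigma \equiv \varepsilon - (\varepsilon-1)(1-\beta)$$
is the (derived) elasticity of substitution between the two factors. An increase in the relative
factor supply, $Z/L$, will increase $V_Z/V_L$ as long as $\sigma > 1$ and it will reduce it if $\sigma < 1$.
Therefore, the elasticity of substitution regulates whether the price effect dominates the
market size effect.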
Note also that we have
$$\sigma \gtrless 1 \iff \varepsilon \gtrless 1.$$
So the two factors will be gross substitutes when the two goods in the utility function (or the
two intermediates in the production of the final good) are gross substitutes.
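The interplay between the price effect and the market size effect can be illustrated numerically. This is a sketch under the assumption that relative profitability takes the form $V_Z/V_L = ((1-\gamma)/\gamma)^{\varepsilon/\sigma}(N_Z/N_L)^{-1/\sigma}(Z/L)^{(\sigma-1)/\sigma}$ with $\sigma = \varepsilon - (\varepsilon-1)(1-\beta)$; the function names and parameter values are hypothetical.

```python
def derived_sigma(eps, beta):
    """Derived elasticity of substitution between the two factors."""
    return eps - (eps - 1) * (1 - beta)


def relative_value(nz_nl, z_l, eps, beta, gamma=0.5):
    """Relative profitability V_Z / V_L of the two types of innovation."""
    sigma = derived_sigma(eps, beta)
    return (
        ((1 - gamma) / gamma) ** (eps / sigma)
        * nz_nl ** (-1 / sigma)
        * z_l ** ((sigma - 1) / sigma)
    )


if __name__ == "__main__":
    # Illustrative parameter values only.
    for eps in (3.0, 0.5):
        sigma = derived_sigma(eps, beta=0.5)
        low = relative_value(1.0, 1.0, eps, beta=0.5)
        high = relative_value(1.0, 2.0, eps, beta=0.5)
        effect = "market size" if high > low else "price"
        print(f"eps = {eps}, sigma = {sigma}: the {effect} effect dominates")
```

Because $\sigma - 1 = \beta(\varepsilon - 1)$, the market size effect dominates exactly when the two goods are gross substitutes.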
We have so far characterized the demand for new technologies. Next we have to determine
the supply of all new technologies, which will be, in part, regulated by the technological
possibilities for generating new machine varieties. Suppose as in the analysis above that new
machines in the two sectors are produced by investing in lab equipment:
$$\dot N_L = \eta_L X_L \quad\text{and}\quad \dot N_Z = \eta_Z X_Z, \tag{16.12}$$
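In steady state both types of innovation must be equally profitable, which requires
$$\eta_L V_L = \eta_Z V_Z. \tag{16.13}$$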
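Defining $\eta \equiv \eta_Z/\eta_L$, the steady-state relative technology is
$$\left(\frac{N_Z}{N_L}\right)^{*} = \eta^{\sigma}\left(\frac{1-\gamma}{\gamma}\right)^{\varepsilon}\left(\frac{Z}{L}\right)^{\sigma-1}, \tag{16.14}$$
where the *s denote that this expression refers to the steady-state value.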
Before going further, using the same type of analysis as before, we can characterize the
equilibrium in this economy. Because there are two state variables now, the economy features
transitional dynamics, but still has a unique balanced growth path. These are stated in the
next proposition (and left for you to prove):
Proposition 43 In the directed technical change model described here, there exists a unique
balanced growth path equilibrium in which the relative technologies are given by (16.14), and
consumption and output grow at the rate
$$g = \frac{1}{\theta}\left(\beta\left[(1-\gamma)^{\varepsilon}(\eta_Z Z)^{\sigma-1} + \gamma^{\varepsilon}(\eta_L L)^{\sigma-1}\right]^{\frac{1}{\sigma-1}} - \rho\right).$$
Starting from any $N_L(0) > 0$ and $N_Z(0) > 0$, the economy converges to this balanced growth
path.
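As a rough numerical companion to Proposition 43, the sketch below evaluates the steady-state relative technology and the balanced growth rate. It assumes the functional forms $(N_Z/N_L)^* = \eta^{\sigma}((1-\gamma)/\gamma)^{\varepsilon}(Z/L)^{\sigma-1}$ and $g = \theta^{-1}(\beta[(1-\gamma)^{\varepsilon}(\eta_Z Z)^{\sigma-1} + \gamma^{\varepsilon}(\eta_L L)^{\sigma-1}]^{1/(\sigma-1)} - \rho)$; all function names and parameter values are illustrative assumptions.

```python
def derived_sigma(eps, beta):
    """Derived elasticity of substitution between the two factors."""
    return eps - (eps - 1) * (1 - beta)


def steady_state_ratio(z_l, eta, eps, beta, gamma):
    """Steady-state relative technology (N_Z / N_L)*, with eta = eta_Z / eta_L."""
    sigma = derived_sigma(eps, beta)
    return eta ** sigma * ((1 - gamma) / gamma) ** eps * z_l ** (sigma - 1)


def bgp_growth(z, l, eta_z, eta_l, eps, beta, gamma, theta, rho):
    """Balanced growth rate of consumption and output (requires sigma != 1)."""
    sigma = derived_sigma(eps, beta)
    if abs(sigma - 1.0) < 1e-12:
        raise ValueError("the formula is not defined for sigma = 1")
    aggregate = (
        (1 - gamma) ** eps * (eta_z * z) ** (sigma - 1)
        + gamma ** eps * (eta_l * l) ** (sigma - 1)
    ) ** (1 / (sigma - 1))
    return (beta * aggregate - rho) / theta


if __name__ == "__main__":
    # Hypothetical calibration: eps = 3 and beta = 0.5 give sigma = 2.
    print(steady_state_ratio(z_l=2.0, eta=1.0, eps=3.0, beta=0.5, gamma=0.5))
    print(bgp_growth(1.0, 1.0, 1.0, 1.0, 3.0, 0.5, 0.5, theta=2.0, rho=0.02))
```

With $\sigma = 2$, doubling $Z/L$ doubles the steady-state $N_Z/N_L$, so technology shifts toward the factor that has become more abundant.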
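More interesting than the aggregate growth rate of the economy in this case is how the
direction of technical change affects relative factor prices and how it responds to changes in
relative supplies. To study this issue, recall that relative factor prices are given by
$$\frac{w_Z}{w_L} = p^{1/\beta}\frac{N_Z}{N_L} = \left(\frac{1-\gamma}{\gamma}\right)^{\frac{\varepsilon}{\sigma}}\left(\frac{N_Z}{N_L}\right)^{\frac{\sigma-1}{\sigma}}\left(\frac{Z}{L}\right)^{-\frac{1}{\sigma}}. \tag{16.15}$$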
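First, the relative factor reward, $w_Z/w_L$, is decreasing in the relative factor supply, $Z/L$.
Second, the same combination of parameters, $(\sigma-1)/\sigma$, which determines whether innovation
for more abundant factors is more profitable also determines whether a greater $N_Z/N_L$, i.e.,
a greater relative physical productivity of factor $Z$, increases $w_Z/w_L$.
Substituting the steady-state relative technology (16.14) into (16.15) gives the long-run relative factor reward
$$\left(\frac{w_Z}{w_L}\right)^{*} = \eta^{\sigma-1}\left(\frac{1-\gamma}{\gamma}\right)^{\varepsilon}\left(\frac{Z}{L}\right)^{\sigma-2}. \tag{16.16}$$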
Comparing this equation to the relative demand for a given technology, we see that the
response of relative factor rewards to changes in relative supply is always more elastic in
(16.16) than in (16.15) as implied by Proposition 44.
This is simply an application of the LeChatelier principle, which states that demand
curves become more elastic when other factors adjust, but with a new interpretation: that
is, the relative demand curves become flatter when technology adjusts.
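The comparison between the two relative demand curves can be made concrete. The sketch below assumes that the elasticity of $w_Z/w_L$ with respect to $Z/L$ is $-1/\sigma$ for given technology, as in (16.15), and $\sigma - 2$ once technology adjusts, as in (16.16); the function names are illustrative.

```python
def short_run_elasticity(sigma):
    """Elasticity of w_Z / w_L with respect to Z / L for given N_Z / N_L."""
    return -1.0 / sigma


def long_run_elasticity(sigma):
    """Elasticity of w_Z / w_L with respect to Z / L once N_Z / N_L adjusts."""
    return sigma - 2.0


def elasticity_gap(sigma):
    """Long-run minus short-run elasticity; algebraically (sigma - 1)^2 / sigma."""
    return long_run_elasticity(sigma) - short_run_elasticity(sigma)


if __name__ == "__main__":
    # Illustrative values of sigma only.
    for sigma in (0.5, 1.5, 2.5):
        print(sigma, short_run_elasticity(sigma), long_run_elasticity(sigma))
```

Since the gap equals $(\sigma-1)^2/\sigma \ge 0$, the long-run curve is never steeper than the short-run one, and it slopes upward only when $\sigma > 2$.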
The more important and surprising result here is that if $\sigma$ is sufficiently large, in particular if $\sigma > 2$, the relationship between relative factor supplies and relative factor rewards
can be upward sloping. Let us refer to a situation in which an increase in the relative supply
of a factor changes technology so much that the relative price of the factor becoming more
16.1.3
Implications
Let us now consider the implications of this simple model of directed technical change, and in
particular of Propositions 44 and 45. One of the most interesting applications is to changes
in the skill premium. For this application, imagine that Z = H stands for skilled workers,
for example, college-educated workers. In the United States labor market, the skill premium
has shown no tendency to decline despite a very large increase in the supply of college-educated workers. On the contrary, following a brief period of decline during the 1970s in the
face of the very large increase in the supply of college-educated workers, the skill (college)
premium has increased very sharply throughout the 1980s and 1990s, to reach a level not
experienced in the postwar era. The following figure shows the general patterns by plotting
the college premium and the relative supply of college graduate workers in the United States
since WWII.
[Figure: the college wage premium and the relative supply of college graduate workers in the United States, 1939-1996.]
In the labor and macro literature, the most popular explanation for these patterns is skill-biased technological change. For example, computers or the new IT technologies are
argued to favor skilled workers relative to unskilled workers. But why should the economy
adopt and develop more skill-biased technologies throughout the past 20 years, or more
generally throughout the entire 20th century? This question becomes more relevant once we
remember that during the 19th century many of the technologies that were fueling economic
growth, such as the factory system and the major spinning and weaving innovations, were
skill-replacing rather than skill-complementary.
Thus, in summary, we have the following stylized facts:
1. Secular skill-biased technical change increasing the demand for skills throughout the 20th
century.
[Figure: relative wage against the relative supply of skills, showing the long-run relative demand for skills, the short-run response, and an exogenous shift in relative supply.]
If on the other hand we have $\sigma < 2$, the long-run relative demand curve will be downward sloping, though again it will be shallower than the short-run relative demand curve.
Then following the increase in the relative supply of skills there will be an initial decline in
the skill premium (college premium), and as technology starts adjusting the skill premium
will increase. But it will end up below its initial level. To explain the larger increase in the
[Figure: relative wage and the long-run relative demand for skills.]
16.2
The above model derived the relative bias results by assuming a constant elasticity of substitution production function. In fact, the spirit of the results is much more general. The
following proposition generalizes these results:
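Suppose that factor supplies are given by $(\bar Z, \bar L)$ and denote equilibrium technologies by $(A_Z^*, A_L^*)$,
and equilibrium factor prices by $w_Z(\bar Z, \bar L, A_Z^*, A_L^*)$ and $w_L(\bar Z, \bar L, A_Z^*, A_L^*)$. Then we have that
for all $(\bar Z, \bar L)$:
$$\frac{\partial \ln (A_Z/A_L)}{\partial \ln (Z/L)} = \frac{\sigma-1}{1+\sigma\delta} \tag{16.17}$$
and
$$\frac{\partial \ln \left[w_Z(\bar Z, \bar L, A_Z^*, A_L^*)/w_L(\bar Z, \bar L, A_Z^*, A_L^*)\right]}{\partial \ln (A_Z/A_L)}\,\frac{\partial \ln (A_Z^*/A_L^*)}{\partial \ln (Z/L)} \ge 0, \tag{16.18}$$
$$\frac{d \ln \left[w_Z(\bar Z, \bar L, A_Z^*, A_L^*)/w_L(\bar Z, \bar L, A_Z^*, A_L^*)\right]}{d \ln (Z/L)} = \frac{\sigma-2-\delta}{1+\sigma\delta}, \tag{16.19}$$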
16.3
One of the advantages of the models of directed technical change is that they allow us to
investigate why technological change might be purely labor-augmenting as required for balanced growth. Here I outline a model which generates this result (though under somewhat
more restrictive assumptions than the directed technical change results we have seen so far).
16.3.1
Consider an economy consisting of L unskilled workers who work in the production sector,
and S scientists who perform R&D. The distinction between unskilled workers and scientists is adopted to ensure that the production and R&D sectors do not compete for workers.
The economy again admits a representative consumer with the usual constant relative risk
aversion (CRRA) preferences:
$$\int_0^\infty \frac{C(t)^{1-\theta}-1}{1-\theta}\exp(-\rho t)\,dt \tag{16.20}$$
where $C(t)$ is consumption at time $t$ and $\theta \ge 0$ is the elasticity of marginal utility.
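The budget constraint of the representative consumer is
$$C + I \le wL + rK + \omega_S S + \Pi, \tag{16.21}$$
where $I$ denotes investment, $w$ is the wage rate of labor, $r$ is the interest rate, $K$ denotes the
capital stock, $\omega_S$ is the wage rate for scientists, and $\Pi$ is total profit income. The resource
constraint of the economy implies that
$$wL + rK + \omega_S S + \Pi = Y = \left[\gamma Y_L^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma)Y_K^{\frac{\varepsilon-1}{\varepsilon}}\right]^{\frac{\varepsilon}{\varepsilon-1}}, \tag{16.22}$$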
(16.23)
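Let us also use this opportunity to develop a variant of the models studied above. In
particular, let us assume that the labor-intensive and capital-intensive goods are produced
competitively from constant elasticity of substitution (CES) production functions of labor-intensive and capital-intensive intermediates, with elasticity $\nu \equiv 1/(1-\beta)$:
$$Y_L = \left(\int_0^n y_l(i)^{\beta}\,di\right)^{1/\beta} \quad\text{and}\quad Y_K = \left(\int_0^m y_k(i)^{\beta}\,di\right)^{1/\beta}. \tag{16.24}$$
Intermediates are in turn produced one-for-one from labor and capital:
$$y_l(i) = l(i) \quad\text{and}\quad y_k(i) = k(i), \tag{16.25}$$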
where $l(i)$ and $k(i)$ are labor and capital used in the production of good $i$. Market clearing
for labor and capital then requires:
$$\int_0^n l(i)\,di = L \quad\text{and}\quad \int_0^m k(i)\,di = K. \tag{16.26}$$
To close the model, we need to specify the innovation possibilities frontier, that is,
the technological possibilities for transforming resources into blueprints for new varieties of
capital-intensive and labor-intensive intermediates.
Let us assume that these blueprints are created by the R&D efforts of scientists, who
are, in turn, employed by R&D firms. There is free entry into the R&D sector. Once an
R&D firm invents a new intermediate, it receives a perfectly enforced patent and becomes
the perpetual monopolist of that intermediate. R&D firms have access to the following
technologies for invention:
$$\frac{\dot n}{n} = b_l\,\phi(S_l)\,S_l - \delta \quad\text{and}\quad \frac{\dot m}{m} = b_k\,\phi(S_k)\,S_k - \delta, \tag{16.27}$$
Market clearing for scientists requires
$$S_l + S_k \le S. \tag{16.28}$$
I also assume that the economy starts at t = 0 with n (0) > 0 and m (0) > 0.
Equation (16.27) implies a number of important features:
1. Technical change is directed, in the sense that the society (researchers) can generate
faster improvements in one type of intermediates than the other. This feature will
enable the analysis of whether equilibrium technical change will be labor- or capital-augmenting.
2. The fact that $\phi(\cdot)$ is decreasing means that there are intra-temporal decreasing returns
to R&D effort; when more scientists are allocated to the invention of labor-intensive
intermediates, the productivity of each declines. This might be, for example, because
scientists crowd each other out in competing for the invention of similar intermediates.
This decreasing returns assumption is adopted to simplify the analysis of transitional
dynamics: when $\phi(\cdot)$ is constant, the behavior of $S_l$ and $S_k$ is discontinuous.
3. Research effort devoted to the invention of labor-intensive intermediates, $\phi(S_l)S_l$,
leads to a proportional increase in the supply of these intermediates at the rate $b_l$,
while the same effort devoted to the discovery of capital-using intermediates leads to a
proportional increase at the rate $b_k$. The parameters $b_l$ and $b_k$ potentially differ since
the discovery of one type of new intermediate may be technically more difficult than
16.3.2
An equilibrium in this economy is given by time paths of factor, intermediate and good prices,
$w$, $r$, $\omega_S$, $[p_l(i)]_{i=0}^{n}$, $[p_k(i)]_{i=0}^{m}$, $p_L$ and $p_K$, employment, consumption and saving decisions,
$[l(i)]_{i=0}^{n}$, $[k(i)]_{i=0}^{m}$, $[y_l(i)]_{i=0}^{n}$, $[y_k(i)]_{i=0}^{m}$, $C$ and $I$, and the allocation of scientists between the
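Consumer optimization implies the Euler equation
$$\frac{\dot C}{C} = \frac{1}{\theta}(r - \rho), \tag{16.29}$$
where recall that $r$ is the rate of interest. The consumption sequence $[C(t)]_0^\infty$ also satisfies
the lifetime budget constraint of the representative agent (the no Ponzi game constraint):
$$\lim_{t\to\infty} K(t)\exp\left(-\int_0^t r(v)\,dv\right) = 0. \tag{16.30}$$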
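Consumer maximization gives the relative price of the capital-intensive good as:
$$p \equiv \frac{p_K}{p_L} = \frac{1-\gamma}{\gamma}\left(\frac{Y_K}{Y_L}\right)^{-\frac{1}{\varepsilon}}. \tag{16.31}$$
Taking the final good as the numeraire, the price levels of the two goods are
$$p_K = \left[\gamma^{\varepsilon}p^{\varepsilon-1} + (1-\gamma)^{\varepsilon}\right]^{\frac{1}{\varepsilon-1}} \quad\text{and}\quad p_L = \left[\gamma^{\varepsilon} + (1-\gamma)^{\varepsilon}p^{1-\varepsilon}\right]^{\frac{1}{\varepsilon-1}}. \tag{16.32}$$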
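Next, consumer maximization and the CES functions in (16.24) yield the following iso-elastic demand curves for intermediates:
$$\frac{p_l(i)}{p_L} = \left(\frac{y_l(i)}{Y_L}\right)^{\beta-1} \quad\text{and}\quad \frac{p_k(i)}{p_K} = \left(\frac{y_k(i)}{Y_K}\right)^{\beta-1}. \tag{16.33}$$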
Given these iso-elastic demands, profit maximization by the monopolists implies that prices
will be set as a constant markup over marginal cost (which is $w$ for the labor-intensive
intermediates and $r$ for the capital-intensive intermediates):
$$p_l(i) = \left(1-\frac{1}{\nu}\right)^{-1} w = \frac{w}{\beta} \quad\text{and}\quad p_k(i) = \left(1-\frac{1}{\nu}\right)^{-1} r = \frac{r}{\beta}. \tag{16.34}$$
Since, from (16.34), all labor-intensive intermediates sell at the same price, equation (16.33)
implies that $y_l(i) = y_l$ for all $i$, and since all capital-intensive intermediates also sell at the
same price, $y_k(i) = y_k$ for all $i$ as well. Then from the market clearing equation (16.26), we
obtain
$$y_l(i) = l(i) = \frac{L}{n} \quad\text{and}\quad y_k(i) = k(i) = \frac{K}{m}. \tag{16.35}$$
Substituting (16.35) into (16.24) and integrating gives the total supply of labor- and
capital-intensive goods as:
$$Y_L = n^{\frac{1-\beta}{\beta}} L \quad\text{and}\quad Y_K = m^{\frac{1-\beta}{\beta}} K. \tag{16.36}$$
These equations reiterate that $n$ and $m$ correspond to labor- and capital-augmenting technologies. Greater $n$ enables the production of a greater level of $Y_L$ for a given quantity of
labor, and similarly an increase in $m$ raises the productivity of capital.
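The aggregation step behind (16.36) is easy to verify numerically. The sketch below assumes the CES aggregator $Y_L = (\int_0^n y_l(i)^{\beta} di)^{1/\beta}$ with the symmetric allocation $y_l(i) = L/n$; the function names and numbers are illustrative.

```python
def symmetric_aggregate(total_input, n_varieties, beta):
    """CES aggregate when total_input is split equally across n_varieties."""
    per_variety = total_input / n_varieties
    # With identical varieties the integral is n * (L / n) ** beta.
    return (n_varieties * per_variety ** beta) ** (1 / beta)


def closed_form(total_input, n_varieties, beta):
    """Closed form n ** ((1 - beta) / beta) * L, as in (16.36)."""
    return n_varieties ** ((1 - beta) / beta) * total_input


if __name__ == "__main__":
    # Illustrative numbers: more varieties raise output for given labor.
    for n in (4, 8):
        print(n, symmetric_aggregate(10.0, n, 0.5), closed_form(10.0, n, 0.5))
```

Doubling the number of varieties scales output by $2^{(1-\beta)/\beta}$ with the factor input held fixed, which is why $n$ and $m$ act as factor-augmenting technologies.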
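Combining (16.33)-(16.36), factor prices can be written as
$$w = \beta n^{\frac{1-\beta}{\beta}} p_L \quad\text{and}\quad r = \beta m^{\frac{1-\beta}{\beta}} p_K. \tag{16.37}$$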
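Finally, using (16.31) and (16.36), the relative price of the capital-intensive good is
$$p \equiv \frac{p_K}{p_L} = \frac{1-\gamma}{\gamma}\left(\frac{m^{\frac{1-\beta}{\beta}} K}{n^{\frac{1-\beta}{\beta}} L}\right)^{-\frac{1}{\varepsilon}}. \tag{16.38}$$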
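The value of a monopolist producing an intermediate of type $f = l, k$ is
$$V_f(t) = \int_t^\infty \exp\left[-\int_t^v (r(\omega)+\delta)\,d\omega\right]\pi_f(v)\,dv, \tag{16.39}$$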
where $r(t)$ is the interest rate at date $t$, $\delta$ is the depreciation (obsolescence) rate of existing
intermediates, and
$$\pi_l = \frac{1-\beta}{\beta}\frac{wL}{n} \quad\text{and}\quad \pi_k = \frac{1-\beta}{\beta}\frac{rK}{m} \tag{16.40}$$
are the flow profits from the sale of labor- and capital-intensive intermediate goods.
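The markup logic behind the pricing rule and the flow profits can be checked with a small grid search. This is a sketch under the assumption that each monopolist faces iso-elastic demand with elasticity $\nu = 1/(1-\beta)$ and constant marginal cost; the function names, the demand scale and the grid are hypothetical choices.

```python
def monopoly_price(marginal_cost, beta):
    """Constant-markup price: marginal cost divided by beta."""
    return marginal_cost / beta


def profit(price, marginal_cost, beta, scale=1.0):
    """Profit under iso-elastic demand y = scale * price ** (-nu)."""
    nu = 1.0 / (1.0 - beta)
    quantity = scale * price ** (-nu)
    return (price - marginal_cost) * quantity


def best_price_on_grid(marginal_cost, beta, steps=20000):
    """Brute-force search over prices between marginal cost and five times it."""
    candidates = [marginal_cost * (1 + 4 * i / steps) for i in range(1, steps + 1)]
    return max(candidates, key=lambda p: profit(p, marginal_cost, beta))


if __name__ == "__main__":
    # Illustrative values of beta with marginal cost normalized to one.
    for beta in (0.5, 0.4):
        print(beta, monopoly_price(1.0, beta), best_price_on_grid(1.0, beta))
```

At the optimal price the margin per unit is $(1-\beta)/\beta$ times marginal cost, which is the factor multiplying $wL/n$ and $rK/m$ in the flow profits.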
Scientists are paid a wage $\omega_S$, and competition between the two sectors and free entry
ensure that this wage is equal to the maximum of their contribution to the value of monopolists in the two sectors. Recall that R&D firms do not internalize the crowding effects, so
the marginal value of allocating one more scientist to the invention of labor-intensive intermediates is $b_l\phi(S_l)nV_l$, and for capital-intensive intermediates, it is $b_k\phi(S_k)mV_k$, where $V_l$
and $V_k$ are given by (16.39). Therefore, free entry requires:
$$\omega_S = \max\{b_l\phi(S_l)nV_l,\; b_k\phi(S_k)mV_k\}. \tag{16.41}$$
Equation (16.41) implies zero expected profits for all firms at all points in time, so $\Pi = 0$ in
(16.21).
16.3.3
Let us define an asymptotic path (AP) as an equilibrium path that the economy tends to as
$t \to \infty$, and does not include limit cycles. In an AP, we can have either $\lim_{t\to\infty}\dot C(t)/C(t) = \infty$, i.e., consumption grows more than exponentially (explodes), or $\lim_{t\to\infty}\dot C(t)/C(t) = g_c$,
i.e., the rate of consumption growth tends to a constant, possibly 0 (including the case
where $\lim_{t\to\infty}C(t) = 0$ as a special case). A balanced growth path (BGP) is defined as an
AP where output, consumption and the capital stock grow at the same finite constant rate,
i.e., $\lim_{t\to\infty}\dot C(t)/C(t) = \lim_{t\to\infty}\dot Y(t)/Y(t) = \lim_{t\to\infty}\dot K(t)/K(t) = g$.
This subsection will show that with $\varepsilon < 1$, only a BGP can be an AP, so if the economy
is going to tend to a non-cycling path, this has to be a BGP. In contrast, with $\varepsilon \ge 1$, there
may exist asymptotic paths where consumption grows more than exponentially or grows at
a different rate than capital, but these are less interesting for us given our focus on $\varepsilon < 1$.
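To facilitate the analysis, let us adopt the notation:
$$N \equiv n^{\frac{1-\beta}{\beta}} \quad\text{and}\quad M \equiv m^{\frac{1-\beta}{\beta}}.$$
This notation, together with (16.36), allows me to write output in a more compact way:
$$Y = \left[\gamma (NL)^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma)(MK)^{\frac{\varepsilon-1}{\varepsilon}}\right]^{\frac{\varepsilon}{\varepsilon-1}}. \tag{16.42}$$
Also define the normalized capital stock
$$k \equiv \frac{MK}{NL}, \tag{16.43}$$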
which is a direct generalization of the normalized capital stock defined in the neoclassical
growth model as capital stock divided by the effective units of labor. Here the numerator
contains the effective units of capital as well, since there can be capital-augmenting technical change. Then, using (16.32), (16.37), (16.38) and (16.43), we can write the interest
rate as:
$$r = R(M,k) \equiv \beta(1-\gamma)M\left[\gamma k^{\frac{1-\varepsilon}{\varepsilon}} + (1-\gamma)\right]^{\frac{1}{\varepsilon-1}}. \tag{16.44}$$
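Similarly, the relative share of capital is
$$s_K \equiv \frac{rK}{wL} = p\,k = \frac{1-\gamma}{\gamma}k^{\frac{\varepsilon-1}{\varepsilon}}. \tag{16.45}$$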
The relationship between the relative share of capital and the normalized capital stock
depends on $\varepsilon$, which is the elasticity of substitution between capital-intensive and labor-intensive goods. Equation (16.45) shows that $\varepsilon$ is also the elasticity of substitution between
capital and labor in this economy. In response to an increase in $k$, $s_K$ will also increase if
$\varepsilon > 1$, and will decrease if $\varepsilon < 1$.
This analysis leads to the following crucial result:
Proposition 47 With $\varepsilon < 1$, all APs are BGPs and feature purely labor-augmenting technical change, i.e., they have $\lim_{t\to\infty}\dot M(t)/M(t) = 0$.
This result demonstrates that with $\varepsilon < 1$, i.e., with labor and capital as gross complements, the only asymptotic (non-cycling) paths will feature purely labor-augmenting technical change. There will be research effort devoted to the invention of capital-intensive intermediates, but this is only to keep the state of technology in that sector at a constant level.
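The comparative statics that drive this result can be illustrated numerically. The sketch below assumes the relative capital share $s_K = ((1-\gamma)/\gamma)k^{(\varepsilon-1)/\varepsilon}$ and the interest rate $R(M,k) = \beta(1-\gamma)M[\gamma k^{(1-\varepsilon)/\varepsilon} + (1-\gamma)]^{1/(\varepsilon-1)}$; the function names and parameter values are illustrative assumptions.

```python
def relative_capital_share(k, eps, gamma=0.5):
    """Relative share of capital rK / wL as a function of normalized capital k."""
    return ((1 - gamma) / gamma) * k ** ((eps - 1) / eps)


def interest_rate(m, k, eps, beta, gamma=0.5):
    """Interest rate R(M, k) for capital-augmenting technology m (requires eps != 1)."""
    bracket = gamma * k ** ((1 - eps) / eps) + (1 - gamma)
    return beta * (1 - gamma) * m * bracket ** (1 / (eps - 1))


if __name__ == "__main__":
    # Illustrative values: gross complements, then gross substitutes.
    for eps in (0.5, 2.0):
        shares = [relative_capital_share(k, eps) for k in (1.0, 2.0)]
        rates = [interest_rate(1.0, k, eps, beta=0.5) for k in (1.0, 2.0)]
        print(f"eps = {eps}: shares {shares}, interest rates {rates}")
```

With $\varepsilon < 1$ a higher normalized capital stock lowers the relative share of capital, while the interest rate falls in $k$ and rises proportionally with $M$ for either value of $\varepsilon$.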
16.3.4
We saw above that with $\varepsilon < 1$, only a BGP with purely labor-augmenting technical change
can be an AP. Now I show that there in fact exists a unique BGP as long as $\delta > 0$, and
characterize the properties of this equilibrium path.
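First note that from the Euler equation, (16.29), the BGP rate of interest has to be
$$r^* = \theta g^* + \rho,$$
where $g^*$ is the BGP growth rate of consumption and output. With a constant interest rate $r$ and growth rate $g$, the values of the two types of monopolists are
$$V_l = \frac{1-\beta}{\beta}\,\frac{wL/n}{r+\delta-(1-2\beta)g/(1-\beta)} \quad\text{and}\quad V_k = \frac{1-\beta}{\beta}\,\frac{rK/m}{r+\delta-g}. \tag{16.46}$$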
Notice that these values also grow at a constant rate along the BGP because $w$, $K$ and $n$
are growing. The denominator for $V_l$ is different from that of $V_k$ because its BGP growth
rate is lower than that of $V_k$: $n$, which is in the denominator of $\pi_l$, grows along the balanced
growth path, while $m$ remains constant.
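The BGP growth rate is pinned down by the rate of labor-augmenting technical change:
$$g^* = \frac{1-\beta}{\beta}\,\frac{\dot n}{n} = \frac{1-\beta}{\beta}\left[b_l\,\phi(S-S_k^*)(S-S_k^*) - \delta\right]. \tag{16.47}$$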
Moreover, at $\dot M/M = 0$, R&D firms are indifferent between capital- and labor-augmenting technical
change, i.e., $b_l\phi(S-S_k^*)nV_l = b_k\phi(S_k^*)mV_k$, or from equation (16.46),
$$\frac{b_l\,\phi(S-S_k^*)\,wL}{r^*+\delta-(1-2\beta)g^*/(1-\beta)} = \frac{b_k\,\phi(S_k^*)\,r^*K}{r^*+\delta-g^*}. \tag{16.48}$$
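Using $r^* = \theta g^* + \rho$ and (16.45), this pins down the BGP relative share of capital:
$$s_K^* = \frac{b_l\,\phi(S-S_k^*)\,(1-\beta)\,(\rho+\delta+(\theta-1)g^*)}{b_k\,\phi(S_k^*)\,\left((1-\beta)(\rho+\delta)+((1-\beta)(\theta-1)+\beta)\,g^*\right)}, \tag{16.49}$$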
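The algebra linking the indifference condition to the closed-form capital share can be verified numerically. The sketch below assumes the BGP interest rate $r^* = \theta g^* + \rho$, the discount terms $r^* + \delta - (1-2\beta)g^*/(1-\beta)$ and $r^* + \delta - g^*$ for the two types of monopolists, and treats the ratio $b_l\phi(S-S_k^*)/(b_k\phi(S_k^*))$ as a single number `b_ratio`; names and parameter values are hypothetical.

```python
def share_from_indifference(b_ratio, g, beta, theta, rho, delta):
    """Relative capital share rK / wL implied by the R&D indifference condition."""
    r = theta * g + rho  # BGP interest rate from the Euler equation
    denom_l = r + delta - (1 - 2 * beta) * g / (1 - beta)
    denom_k = r + delta - g
    return b_ratio * denom_k / denom_l


def share_closed_form(b_ratio, g, beta, theta, rho, delta):
    """The same share after substituting r = theta * g + rho and simplifying."""
    numerator = (1 - beta) * (rho + delta + (theta - 1) * g)
    denominator = (1 - beta) * (rho + delta) + ((1 - beta) * (theta - 1) + beta) * g
    return b_ratio * numerator / denominator


if __name__ == "__main__":
    # Illustrative parameter values only.
    args = dict(b_ratio=1.0, g=0.02, beta=0.4, theta=2.0, rho=0.02, delta=0.05)
    print(share_from_indifference(**args), share_closed_form(**args))
```

The two functions agree for any admissible parameter values, which confirms that the closed form is a rearrangement of the indifference condition rather than an additional restriction.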
k=k
b
1
K = b .
(16.50)
Finally, let $M^*$ be such that $k^* \equiv G(M^*)$, i.e., $M^*$ is the level of capital-augmenting technology that is consistent with the equilibrium interest rate taking its BGP value when $k = k^*$.
As a result, when $k = k^*$ and $M = M^*$, the interest rate will be equal to $r^*$ and the relative
share of capital will be $s_K^*$.
In BGP, $\dot M/M = 0$, while $\dot N/N > 0$. Because of the depreciation of technologies, there
must be both research to invent new labor-intensive and capital-intensive intermediates: if
16.3.5
Transitional Dynamics
Finally, we would like to know whether the economy will tend to the balanced growth path
with purely labor-augmenting technical change. Here, the feature that $\varepsilon < 1$ ensures this. In
particular, we have the following result (which is proved in Acemoglu, 2003):
Proposition 50 Suppose $\delta > 0$ and that $\varepsilon < 1$, then the BGP characterized above is locally
saddle-path stable.
Therefore, this model provides a framework in which technological change can be capital-augmenting in the short run or in the medium run, but in the long run it will be endogenously
labor-augmenting, ensuring a balanced growth path equilibrium as in the standard neoclassical growth model.
16.3.6
Policy Implications
Despite the similarity of this model to the neoclassical one, the implications are actually
quite different. Let us consider one example here. Suppose that there is taxation of capital
income, so that the budget constraint of the representative household becomes:
$$C + I \le wL + (1-\tau)rK + \omega_S S + \Pi + T.$$
It can be verified that in the standard neoclassical growth model with exogenously labor-augmenting technological change, an increase in $\tau$ will affect the capital to effective labor
ratio and the share of capital in national income. In contrast, here we have:
Chapter 17
Recitation Material: Appropriate
Technology
Thinking of the composition of technology also opens the way for us to consider issues of
appropriate technologies. Recall that in previous models technological differences were often
explained by assuming that technologies did not freely flow from advanced countries to less
advanced ones. Why should ideas not flow and machines not be exported to poor countries?
Perhaps distortions as in the previous model, but even when ideas could flow at no cost,
productivity differences may stay.
Why? Many technologies used by LDCs are inappropriate because they are designed
to make optimal use of the prevailing factors and conditions in DCs, where most technologies
are developed.
There is a mismatch between technologies developed in the North and LDCs' weather
conditions, labor force skills, etc.
Most technologies are developed in the North. For example, over 90% of the world R&D
17.1
Atkinson and Stiglitz suggested the following idea: new technologies are specific to a given
capital-labor ratio. When used with different capital-labor ratios, they are less productive.
For example, suppose that the production technology is
$$Y = A(k \mid k')\,K^{1-\alpha}L^{\alpha} = A(k \mid k')\,k^{1-\alpha}L$$
where $k = K/L$ is the capital-labor ratio, and $A(k \mid k')$ is the productivity of technology
designed to be used with capital-labor ratio $k'$ when used instead with capital-labor ratio $k$.
For example, suppose that
$$A(k \mid k') = A\min\left\{1, \left(\frac{k}{k'}\right)^{\gamma}\right\}$$
for $\gamma \in (0,1)$. That is, when a technology designed for the capital-labor ratio $k'$ is used with
a lower capital-labor ratio, there is a loss in efficiency.
Now suppose that new technologies are developed in richer economies, which have greater
capital-labor ratios. Then output in a less developed country with the capital-labor
ratio $k < k'$ will be
$$Y = A(k \mid k')\,k^{1-\alpha}L = A\,k^{1-\alpha+\gamma}(k')^{-\gamma}L.$$
So less developed countries will produce with worse technologies. Moreover, this technological
disadvantage will be larger when the gap in the capital intensity of production between these
countries and the technologically advanced economies is greater.
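The Atkinson and Stiglitz idea can be sketched in a few lines. The code assumes the productivity function $A(k \mid k') = A\min\{1, (k/k')^{\gamma}\}$ with $\gamma \in (0,1)$ and output per worker $A(k \mid k')k^{1-\alpha}$; the function names and the values of `alpha`, `gamma` and the capital-labor ratios are illustrative.

```python
def productivity(k, k_design, a=1.0, gamma=0.5):
    """Productivity of a technology designed for k_design when used at ratio k."""
    return a * min(1.0, (k / k_design) ** gamma)


def output_per_worker(k, k_design, alpha=2 / 3, a=1.0, gamma=0.5):
    """Output per worker: productivity times k ** (1 - alpha)."""
    return productivity(k, k_design, a, gamma) * k ** (1 - alpha)


if __name__ == "__main__":
    # Illustrative numbers: the technology is designed for k' = 4.
    for k in (1.0, 2.0, 4.0, 8.0):
        print(k, productivity(k, 4.0), output_per_worker(k, 4.0))
```

The efficiency loss appears only below the design ratio and grows with the gap between $k$ and $k'$; at or above $k'$ productivity equals its frontier value $A$.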
17.2
17.2.1
A Model
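Preferences:
$$\int_t^\infty \frac{C(\tau)^{1-\theta}-1}{1-\theta}\exp(-\rho(\tau-t))\,d\tau,$$
Macroeconomic equilibrium:
$$C + I + X \le Y \equiv \exp\left[\int_0^1 \ln y(i)\,di\right], \tag{17.1}$$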
Here i denotes either a task that needs to be performed for production, or an industry
that will contribute to final output.
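Technology:
$$y(i) = \left[\int_0^{N_L} k_L(i,v)^{1-\beta}\,dv\right]\left[(1-i)\,l(i)\right]^{\beta} + \left[\int_0^{N_H} k_H(i,v)^{1-\beta}\,dv\right]\left[i\,Z\,h(i)\right]^{\beta}, \tag{17.2}$$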
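Solve for equilibrium $k_z(v)$, $\chi_z(v)$ and replace $k_z(v)$ in production functions:
$$k_L(i,v) = \left[(1-\beta)\,p(i)\,((1-i)\,l(i))^{\beta}/\chi_L(v)\right]^{1/\beta}, \tag{17.3}$$
$$k_H(i,v) = \left[(1-\beta)\,p(i)\,(i\,Z\,h(i))^{\beta}/\chi_H(v)\right]^{1/\beta}. \tag{17.4}$$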
Given these iso-elastic demands for machines, the optimal rental rates for the technology
monopolists will again be a constant markup over marginal cost.
Substituting machine prices into (17.3), and then using the resulting expressions with
(17.2), we obtain output in sector $i$ as
$$y(i) = p(i)^{(1-\beta)/\beta}\left[N_L(1-i)\,l(i) + N_H\,i\,Z\,h(i)\right].$$
Technical progress: increases in NL and NH (as in the baseline directed technical change
model discussed above).
NL and NH are the only state variables in the model.
Now taking $N_L$ and $N_H$ as given, the equilibrium is straightforward to characterize.
The equilibrium will take a similar form both in the North and in the South: there exists a threshold
$J \in [0,1]$ such that skilled workers will be used only in sectors $i > J$. More explicitly,
for all $i < J$, $h(i) = 0$, and
for all $i > J$, $l(i) = 0$.
In equilibrium:
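where $P_L$ and $P_H$ are price indices for goods produced intensively using unskilled and
skilled workers, respectively.
Relative price of skill-intensive goods:
$$\frac{P_H}{P_L} = \left(\frac{N_H ZH}{N_L L}\right)^{-\beta/2},$$
The equilibrium threshold will be given by
$$\frac{J}{1-J} = \left(\frac{N_H ZH}{N_L L}\right)^{-1/2},$$
Total output:
$$Y = \exp(-\beta)\left[(N_L L)^{1/2} + (N_H ZH)^{1/2}\right]^{2},$$
Wage premium:
$$\frac{w_H}{w_L} = Z\left(\frac{N_H}{N_L}\right)^{1/2}\left(\frac{ZH}{L}\right)^{-1/2} \tag{17.5}$$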
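In the North, where technologies are developed, directed technical change implies
$$\frac{N_H}{N_L} = \frac{ZH^n}{L^n}, \tag{17.6}$$
so that the Northern skill premium is
$$w_H^n/w_L^n = Z,$$
independent of factor endowment in the North (this is the effect of directed technical change,
but also the special case corresponding to $\sigma = 2$ in terms of the directed technical change
model above).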
Next, assume that Southern producers take NL and NH from the North, and maximize
profits. This captures the notion that the North is the technologically advanced economy,
and the South is the follower.
A monopolist in each Southern country copies each new machine and sells it to the
producers in its country.
In equilibrium:
$$J^s > J^n$$
In other words, certain tasks that are performed by skilled workers in the North will be
performed by unskilled workers in the South: this is simply an implication of the greater
skill abundance in the North.
Technology levels $N_H$ and $N_L$ are determined in the North, and $Y^s$ grows at the same
rate $g$ as in the North.
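The North-South comparison can be sketched numerically. The code assumes the threshold condition $J/(1-J) = (N_H ZH/(N_L L))^{-1/2}$, the wage premium $w_H/w_L = Z(N_H/N_L)^{1/2}(ZH/L)^{-1/2}$, and technologies set in the North so that $N_H/N_L = ZH^n/L^n$; the function names and the endowments used for the two regions are hypothetical.

```python
def threshold(n_h, n_l, z, h, l):
    """Threshold sector J: skilled workers are used only in sectors above J."""
    odds = (n_h * z * h / (n_l * l)) ** (-0.5)  # this is J / (1 - J)
    return odds / (1 + odds)


def wage_premium(n_h, n_l, z, h, l):
    """Skill premium w_H / w_L for given technologies and endowments."""
    return z * (n_h / n_l) ** 0.5 * (z * h / l) ** (-0.5)


if __name__ == "__main__":
    z = 2.0
    north = dict(h=1.0, l=1.0)   # illustrative Northern endowments
    south = dict(h=0.2, l=1.8)   # illustrative, less skill-abundant South
    n_l = 1.0
    n_h = n_l * z * north["h"] / north["l"]  # technologies directed at the North
    for name, region in (("North", north), ("South", south)):
        print(name, threshold(n_h, n_l, z, **region), wage_premium(n_h, n_l, z, **region))
```

In this example the Northern premium equals $Z$, while the skill-scarce South has both a higher threshold and a higher premium when it uses the same technologies.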
17.2.2
Implications
What are the productivity implications of directed technical change in the North, and the
South importing technologies developed in the North?
Define
$$A \equiv \frac{Y}{L+ZH} \quad\text{and}\quad y \equiv \frac{Y}{L+H}.$$
Both output per effective unit of labor and output per capita are greater in the North
than the South, even if both countries have the same cost of capital. This is true a fortiori,
if cost of capital is higher in the South.
Intuition: TFP is maximized in the North.
Why? The world technologies are designed to make best use of factor abundance/scarcity
in the North. For example, there are many more skilled workers in the North, so technologies
developed in the North are more skill-biased than what is required in the South. Since
$J^s > J^n$, these skill-biased technologies will be less useful in the South than in the North,
leading to endogenous productivity differences between these countries.
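The productivity gap can be illustrated with the same kind of numbers. The sketch assumes total output $Y = \exp(-\beta)[(N_L L)^{1/2} + (N_H ZH)^{1/2}]^2$ and technologies chosen so that $N_H/N_L = ZH^n/L^n$; the function names and the endowments are hypothetical. A short Cauchy-Schwarz argument gives $Y/(L+ZH) \le \exp(-\beta)(N_L + N_H)$, with equality exactly when $N_H/N_L = ZH/L$, which is one way to see why the North attains the maximum under these assumptions.

```python
import math


def total_output(n_h, n_l, z, h, l, beta):
    """Total output for given technologies (n_h, n_l) and endowments (h, l)."""
    return math.exp(-beta) * (math.sqrt(n_l * l) + math.sqrt(n_h * z * h)) ** 2


def output_per_effective_unit(n_h, n_l, z, h, l, beta):
    """A = Y / (L + Z H)."""
    return total_output(n_h, n_l, z, h, l, beta) / (l + z * h)


def output_per_capita(n_h, n_l, z, h, l, beta):
    """y = Y / (L + H)."""
    return total_output(n_h, n_l, z, h, l, beta) / (l + h)


if __name__ == "__main__":
    z, beta = 2.0, 0.5
    n_l, n_h = 1.0, 2.0  # directed at a North with H = L = 1 (illustrative)
    for name, h, l in (("North", 1.0, 1.0), ("South", 0.2, 1.8)):
        a = output_per_effective_unit(n_h, n_l, z, h, l, beta)
        y = output_per_capita(n_h, n_l, z, h, l, beta)
        print(name, a, y)
```

Both regions have the same population in this example, yet the South produces less per effective unit of labor and per capita because the imported technologies are tailored to Northern factor proportions.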
17.2.3
Calibration
Can this mechanism lead to sizable effects? Can we generate output per worker differences
which resemble those in the data? Can we improve on the neoclassical model?
Neoclassical model:
$$Y_{NC}^c = A\,(K^c)^{1-\beta}(L^c + ZH^c)^{\beta},$$
$A$ (identical for all countries) chosen so as to normalize $y_{NC}^{USA} = 1$.
Our model:
$$Y_{AZ}^c = N_L\,(K^c)^{1-\beta}\left[(L^c)^{1/2} + \left(\frac{N_H}{N_L}ZH^c\right)^{1/2}\right]^{2\beta},$$
$N_H/N_L = ZH^n/L^n$,
North = US,
$N_L$ is chosen so as to normalize $y_{AZ}^{USA} = 1$.
H/L         Z     y_NC^LDC  y_NC^5th  R^2_NC    y_AZ^LDC  y_AZ^5th  R^2_AZ
                                                (Our model)
Primary     1.8   0.45      0.16      0.651     0.39      0.09      0.728
Sec. att.   1.8   0.39      0.15      0.816     0.26      0.05      0.937
                            0.15      0.808     0.28      0.07      0.934
Higher      1.8   0.43      0.18      0.718     0.37      0.13      0.843
Primary     1.5   0.46      0.17      0.625     0.40      0.09      0.723
Sec. att.   1.5   0.41      0.16      0.757     0.28      0.05      0.931
                            0.17      0.745     0.31      0.08      0.918
Higher      1.5   0.45      0.19      0.666     0.39      0.14      0.803
Primary     1.0   0.49      0.21      0.540     0.41      0.10      0.707
Sec. att.   1.0   0.49      0.21      0.540     0.31      0.07      0.901
                            0.21      0.540     0.36      0.11      0.840
Higher      1.0   0.49      0.21      0.540     0.44      0.17      0.689
Chapter 18
Epilogue: Political Economy of
Growth
This course so far has been about understanding the mechanics of economic growth. The
models we have seen are very useful for understanding how individuals accumulate capital, how physical and human capital affect economic growth and income levels, and how
technology endogenously changes and is transferred from one country to another.
However, the major question motivating much of the analysis of economic growth is to
understand why some countries are rich while some others are poor, or why some countries
grow faster while others stagnate.
At some level, what we have focused on are the proximate causes of this process. Exactly
as in the empirical analysis of decomposing cross-country income differences into differences
in physical capital, human capital and technology, we have learned how to construct micro-founded models which help us in thinking about the process of economic growth in a careful
and rigorous way.
18.1
As discussed above, institutions (and related policy differences originating from institutional
differences) have become popular recently in thinking of fundamental causes of differences
in income per capita and growth performance of countries. In this context, institutions
contrast with other potential fundamental causes such as geographical differences or cultural
factors. While geographic characteristics of countries and regions may lead to differences
in the technology available to individuals or make their investments in physical and human
capital more difficult, institutional differences, associated with differences in the organization
of society, shape economic and political incentives and affect the nature of equilibria via these
18.1.1
Douglass North (1990, p. 3) offers the following definition: "Institutions are the rules of the
game in a society or, more formally, are the humanly devised constraints that shape human
interaction." Three important features of institutions are apparent in this definition: (1)
that they are "humanly devised", which contrasts with other potential fundamental causes,
like geographic factors, which are outside human control; (2) that they are "the rules of the
game" setting "constraints" on human behavior; (3) that their major effect will be through
incentives (see also North, 1981).
There are tremendous cross-country differences in the way that economic and political life
is organized. A voluminous literature documents large cross-country differences in economic
institutions, and a strong correlation between these institutions and economic performance,
and we have seen some of those in the early lectures of this course.
Knack and Keefer (1995), for instance, look at measures of property rights enforcement
compiled by international business organizations, Mauro's (1995) study looks at measures
of corruption, and work by Djankov, La Porta, Lopez-De-Silanes and Shleifer compiles measures of entry barriers across countries, while many studies look at variation in educational
institutions and the corresponding differences in human capital. All of these authors find
substantial differences in these measures of economic institutions, and significant correlation between these measures and various indicators of economic performance. For example,
Djankov et al. find that, while the total cost of opening a medium-size business in the United
States is less than 0.02 percent of GDP per capita in 1999, the same cost is 2.7 percent of
GDP per capita in Nigeria, 1.16 percent in Kenya, 0.91 percent in Ecuador and 4.95 percent
18.1.2
As a first step in modeling institutions, let us consider the relationship between three institutional characteristics: (1) economic institutions; (2) political power; (3) political institutions.
As already mentioned above, economic institutions matter for economic growth because
they shape the incentives of key economic actors in society, in particular, they influence
investments in physical and human capital and technology, and the organization of production. Economic institutions not only determine the aggregate economic growth potential of
the economy, but also the distribution of resources in the society, and herein lies part of the
problem: dierent institutions will not only be associated with dierent degrees of eciency
and potential for economic growth, but also with dierent distribution of the gains across
dierent individuals and social groups.
How are economic institutions determined? Although various factors play a role here, including history and chance, at the end of the day, economic institutions are collective choices
of the society. And because of their influence on the distribution of economic gains, not all
individuals and groups typically prefer the same set of economic institutions. This leads
to a conflict of interest among various groups and individuals over the choice of economic
institutions, and the political power of the dierent groups will be the deciding factor.
The distribution of political power in society is also endogenous. To make more progress
here, let us distinguish between two components of political power; de jure (formal) and de
facto political power (see Acemoglu and Robinson, 2005). De jure political power refers to
power that originates from the political institutions in society. Political institutions, similar
to economic institutions, determine the constraints on and the incentives of the key actors,
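[Diagram]
political institutions_t => de jure political power_t
distribution of resources_t => de facto political power_t
de jure political power_t & de facto political power_t => economic institutions_t => economic performance_t & distribution of resources_{t+1}
de jure political power_t & de facto political power_t => political institutions_{t+1}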
This diagram illustrates both the effect of economic institutions on economic performance
and the distribution of resources in a society, and the role of the combination of de jure and
de facto political power in shaping both economic and political institutions.
18.1.3
Institutions in Action
As a brief example, consider the development of property rights in Europe during the Middle Ages. Lack of property rights for landowners, merchants and proto-industrialists was
detrimental to economic growth during this epoch. Since political institutions at the time
placed political power in the hands of kings and various types of hereditary monarchies, such
rights were largely decided by these monarchs. The monarchs often used their powers to
expropriate producers, impose arbitrary taxation, renege on their debts, and allocate the
18.2
Now I present a simple model of the determination of institutions in the context of investigating their impact on economic growth. The basic setup is one in which an existing elite
is in control of political power, and uses its monopoly of political power for its own
interests even when this is costly for the society at large. The model highlights
various sources of inefficiencies in policies, which in turn
translate into inefficient (non-growth enhancing) institutions. It should be noted at this
point, however, that the concept of inefficiency here is not that of Pareto inefficiency, since
when distributional issues are important, Pareto efficiency is not a strong enough concept.
An economy in which all of the resources are allocated to a single individual who has no
investment opportunities, so that growth is stifled, may nevertheless be Pareto efficient. Thus
the concept of inefficiency here is being used in the sense of non-growth enhancing or
non-surplus maximizing.
The various sources of inefficiencies in policies are
1. Revenue extraction: the group in power (the elite) will set high taxes on middle
class producers in order to extract resources from them. These taxes are distortionary. This
source of inefficiency results from the absence of non-distortionary taxes, which implies that
the distribution of resources cannot be decoupled from efficient production.
2. Factor price manipulation: the group in power may want to tax middle class producers in order to reduce the prices of the factors they use in production. This inefficiency
arises because the elite and middle class producers compete for factors (here labor). By
taxing middle class producers, the elite ensure lower factor prices and thus higher profits for
18.2.1
Baseline Model
$$= E_0 \sum_{t=0}^{\infty} \beta^t c_t^j, \tag{18.1}$$
$$y_t^j = \frac{1}{1-\alpha}(A_t^j)^{\alpha}(k_t^j)^{1-\alpha}(l_t^j)^{\alpha}, \tag{18.2}$$
where $k$ denotes capital and $l$ labor. Capital is assumed to depreciate fully after use. The
Cobb-Douglas form is adopted for simplicity.
The key difference between the two groups is in their productivity. To start with, let
us assume that the productivity of each elite agent is $A^e$ in each period, and that of each
middle class agent is $A^m$. Productivity of the two groups differs, for example, because they
are engaged in different economic activities (e.g., agriculture versus manufacturing, old versus
new industries, etc.), or because they have different human capital or talent.
On the policy side, there are activity-specific tax rates on production, $\tau^e$ and $\tau^m$, which
are constrained to be nonnegative, i.e., $\tau^e \ge 0$ and $\tau^m \ge 0$. There are no other fiscal
instruments (in particular, no lump-sum non-distortionary taxes). In addition there is a
total income (rent) of $R$ from natural resources. The proceeds of taxes and revenues from
natural resources can be redistributed as nonnegative lump-sum transfers targeted towards
$$T_t^w + \theta^m T_t^m + \theta^e T_t^e \le \phi \int_{j \in S^e \cup S^m} \tau_t^j y_t^j \, dj + R. \tag{18.3}$$
Let us also assume that there is a maximum scale for each firm, so that $l_t^j \le \lambda$ for all $j$
and $t$. This prevents the most productive agents in the economy from employing the entire
labor force. Since only workers can be employed, the labor market clearing condition is
$$\int_{j \in S^e \cup S^m} l_t^j \, dj \le 1, \tag{18.4}$$
with equality corresponding to full employment. Since $l_t^j \le \lambda$, (18.4) implies that if
$$\theta^e + \theta^m \le \frac{1}{\lambda}, \tag{ES}$$
there can never be full employment. Consequently, depending on whether Condition (ES)
holds, there will be excess demand or excess supply of labor in this economy. Throughout,
I assume that
Assumption 15
$$\theta^e \le \frac{1}{\lambda} \quad \text{and} \quad \theta^m \le \frac{1}{\lambda},$$
18.2.2
Economic Equilibrium
choose their investment and employment optimally and the labor market clears.
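Each producer (firm) takes wages, denoted by $w_t$, as given. Finally, given the absence of
adjustment costs and full depreciation of capital, firms simply maximize current net profits.
Consequently, the optimization problem of each firm can be written as
$$\max_{k_t^j,\, l_t^j} \ \frac{1-\tau_t^j}{1-\alpha}(A^j)^{\alpha}(k_t^j)^{1-\alpha}(l_t^j)^{\alpha} - w_t l_t^j - k_t^j,$$
which gives
$$k_t^j = (1-\tau_t^j)^{1/\alpha} A^j l_t^j, \tag{18.5}$$
$$l_t^j \begin{cases} = 0 & \text{if } w_t > \frac{\alpha}{1-\alpha}(1-\tau_t^j)^{1/\alpha} A^j \\ \in [0, \lambda] & \text{if } w_t = \frac{\alpha}{1-\alpha}(1-\tau_t^j)^{1/\alpha} A^j \\ = \lambda & \text{if } w_t < \frac{\alpha}{1-\alpha}(1-\tau_t^j)^{1/\alpha} A^j. \end{cases} \tag{18.6}$$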
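A number of points are worth noting. First, in equation (18.6), the expression $\alpha(1-\tau_t^j)^{1/\alpha}A^j/(1-\alpha)$ is the net marginal product of a worker employed by a producer of group
$j$. If the wage is above this amount, this producer would not employ any workers, and if it is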
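$$w_t = \min\left\{ \frac{\alpha}{1-\alpha}(1-\tau_t^e)^{1/\alpha} A^e, \ \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m \right\}. \tag{18.7}$$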
The form of the equilibrium wage is intuitive. Labor demand comes from two groups, the
elite and middle class producers, and when Condition (ES) does not hold, their total labor
demand exceeds available labor supply, so the market clearing wage will be the minimum of
their net marginal products.
One interesting feature, which will be used below, is that when Condition (ES) does
not hold, the equilibrium wage is equal to the net productivity of one of the two groups of
producers, so either the elite or the middle class will make zero profits in equilibrium.
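The wage rule in (18.6) and (18.7) can be illustrated with a minimal Python sketch. The function names and all parameter values are illustrative assumptions, not part of the text; the sketch also assumes that Condition (ES) reads $\theta^e + \theta^m \le 1/\lambda$ and that the wage is zero whenever it holds.

```python
def net_marginal_product(tau, A, alpha):
    """Wage at which a producer with productivity A and tax rate tau
    is indifferent about hiring: alpha/(1-alpha) * (1-tau)^(1/alpha) * A."""
    return alpha / (1.0 - alpha) * (1.0 - tau) ** (1.0 / alpha) * A


def equilibrium_wage(tau_e, tau_m, A_e, A_m, theta_e, theta_m, lam, alpha):
    """Equilibrium wage: zero under excess labor supply, otherwise the
    minimum of the two groups' net marginal products, as in (18.7)."""
    # Condition (ES), assumed form: producers cannot absorb the whole labor force.
    if (theta_e + theta_m) * lam <= 1.0:
        return 0.0
    return min(net_marginal_product(tau_e, A_e, alpha),
               net_marginal_product(tau_m, A_m, alpha))


# Hypothetical numbers: taxing the middle class lowers the wage from 1.0 to 0.5.
wage_no_tax = equilibrium_wage(0.0, 0.0, 1.0, 2.0, 0.3, 0.3, 2.0, 0.5)
wage_taxed = equilibrium_wage(0.0, 0.5, 1.0, 2.0, 0.3, 0.3, 2.0, 0.5)
```

In the taxed case the middle class becomes the marginal group, which is the channel the factor price manipulation mechanism exploits.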
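Finally, the equilibrium level of aggregate output is
$$Y_t = \frac{1}{1-\alpha}(1-\tau_t^e)^{(1-\alpha)/\alpha} A^e \int_{j \in S^e} l_t^j \, dj + \frac{1}{1-\alpha}(1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \int_{j \in S^m} l_t^j \, dj + R. \tag{18.8}$$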
18.2.3
Inefficient Policies
Now I use the above economic environment to illustrate a number of distinct sources of
inefficient policies. In this section, political institutions correspond to the dictatorship of
the elite in the sense that they allow the elite to decide the policies, so the focus will be on
the elite's desired policies. The main (potentially inefficient) policy will be a tax on middle
class producers, though more generally, this could correspond to expropriation, corruption
or entry barriers. As discussed in the introduction, there will be three mechanisms leading to
inefficient policies: (1) Revenue Extraction; (2) Factor Price Manipulation; and (3) Political
Consolidation.
To illustrate each mechanism in the simplest possible way, I will focus on a subset of the
parameter space and abstract from other interactions. Throughout, I assume that there is
an upper bound on taxation, so that $\tau_t^m \le \bar\tau$ and $\tau_t^e \le \bar\tau$, where $\bar\tau \le 1$. This limit can be
institutional, or may arise because of the ability of producers to hide their output or shift
into informal production.
The timing of events within each period is as follows: first, taxes are set; then, investments are made. This removes an additional source of inefficiency related to the holdup
problem, whereby groups in power may seize all of the output of other agents in the economy
once it has been produced. Holdup will be discussed below.
To start with, I focus on Markov Perfect Equilibria (MPE) of this economy, where
strategies are only dependent on payoff-relevant variables. In this context, this means that
(18.3)) which maximizes the elite's utility, taking the economic equilibrium as a function of
the sequence of policies as given.
More specifically, substituting (18.5) into (18.2), we obtain elite consumption as
$$c_t^e = \left[ \frac{\alpha}{1-\alpha}(1-\tau_t^e)^{1/\alpha} A^e - w_t \right] l_t^e + T_t^e, \tag{18.9}$$
with $w_t$ given by (18.7). This expression follows immediately by recalling that the first term
in square brackets is the after-tax profits per worker, while the second term is the equilibrium
wage. Total consumption per elite agent is given by their profits plus the lump-sum transfer they
receive. Then the political equilibrium, starting at time $t = 0$, is simply given by a sequence
of $\{\tau_t^e, \tau_t^m, T_t^w, T_t^m, T_t^e\}_{t=0,1,\ldots,\infty}$ that satisfies (18.3) and maximizes the discounted utility of
the elite, $\sum_{t=0}^{\infty} \beta^t c_t^e$.
The determination of the political equilibrium is simplified further by the fact that in the
MPE with full capital depreciation, this problem is simply equivalent to maximizing (18.9).
We now characterize this political equilibrium under a number of different scenarios.
18.2.4
Revenue Extraction
To highlight this mechanism, suppose that Condition (ES) holds, so wages are constant at
zero. This removes any effect of taxation on factor prices. In this case, from (18.6), we also
have $l_t^j = \lambda$ for all producers. Also assume that $\phi > 0$ (for example, $\phi = 1$).
It is straightforward to see that the elite will never tax themselves, so $\tau_t^e = 0$, and will
redistribute all of the government revenues to themselves, so $T_t^w = T_t^m = 0$. Consequently
$$\frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \lambda \theta^m + R \tag{18.10}$$
at time $t$, where the first term is obtained by substituting for $l_t^m = \lambda$ and for (18.5)
into (18.2) and multiplying it by $\tau_t^m$, and taking into account that there are $\theta^m$ middle class
producers and a fraction $\phi$ of tax revenues can be redistributed. The second term is simply
the revenues from natural resources. It is clear that tax revenues are maximized by $\tau_t^m = \alpha$.
In other words, this is the tax rate that puts the elite at the peak of their Laffer curve. In
contrast, output maximization would require $\tau_t^m = 0$. However, the output-maximizing tax
rate is not an equilibrium because, despite the distortions, the elite would prefer a higher
tax rate to increase their own consumption.
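As a numerical check on the Laffer-curve peak, the following Python sketch evaluates the tax-revenue term of (18.10) on a grid and picks the maximizer. The function names, the grid search, and the default parameter values are illustrative assumptions; the claim being checked is only that the maximizer equals $\alpha$, which also follows from setting the derivative of $\tau(1-\tau)^{(1-\alpha)/\alpha}$ to zero.

```python
def tax_revenue(tau, alpha, phi=1.0, A_m=1.0, lam=1.0, theta_m=1.0):
    """Tax-revenue term of (18.10), excluding the resource rent R."""
    return (phi / (1.0 - alpha) * tau
            * (1.0 - tau) ** ((1.0 - alpha) / alpha) * A_m * lam * theta_m)


def revenue_maximizing_tax(alpha, grid_size=10001):
    """Grid search over tau in [0, 1] for the peak of the Laffer curve."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda tau: tax_revenue(tau, alpha))


# With alpha = 0.3 (a hypothetical value) the peak should sit at about 0.3.
peak = revenue_maximizing_tax(0.3)
```

The scale parameters only multiply revenue, so they do not move the peak; this is why the preferred tax depends on the technology parameter $\alpha$ alone.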
At the root of this inefficiency is a limit on the tax instruments available to the elite.
If they could impose lump-sum taxes that would not distort investment, these would be
preferable. Inefficient policies here result from the redistributive desires of the elite coupled
with the absence of lump-sum taxes.
It is also interesting to note that as $\alpha$ increases, the extent of distortions is reduced,
since there are greater diminishing returns to capital and investment will not decline much
in response to taxes.
Even though $\tau_t^m = \alpha$ is the most preferred tax for the elite, the exogenous limit on
taxation may become binding, so the equilibrium tax is
$$\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar\tau\} \tag{18.11}$$
for all $t$. In this case, equilibrium taxes depend only on the production technology (in
particular, how distortionary taxes are) and on the exogenous limit on taxation. For example,
18.2.5
I now investigate how inefficient policies can arise in order to manipulate factor prices. To
highlight this mechanism in the simplest possible way, let us first assume that $\phi = 0$ so that
there are no direct benefits from taxation for the elite. There are indirect benefits, however,
because of the effect of taxes on factor prices, which will be present as long as the equilibrium
wage is positive. For this reason, I now suppose that Condition (ES) does not hold, so that
the equilibrium wage is given by (18.7).
Inspection of (18.7) and (18.9) then immediately reveals that the elite prefer high taxes
in order to reduce the labor demand from the middle class, and thus wages, as much as
possible. The desired tax rate for the elite is thus $\tau_t^m = 1$. Given constraints on taxation,
the equilibrium tax is $\tau_t^m = \tau^{FPM} \equiv \bar\tau$ for all $t$. We therefore have:
Proposition 54 Suppose Assumption 15 holds, Condition (ES) does not hold, and $\phi = 0$,
then the unique political equilibrium features $\tau_t^m = \tau^{FPM} \equiv \bar\tau$ for all $t$.
This result suggests that the factor price manipulation mechanism generally leads to
higher taxes than the pure revenue extraction mechanism. This is because, with the factor
price manipulation mechanism, the objective of the elite is to reduce the profitability of the
18.2.6
I now combine the two effects isolated in the previous two subsections. By itself the factor
price manipulation effect led to the extreme result that the tax on the middle class should
be as high as possible. Revenue extraction, though typically another motive for imposing
taxes on the middle class, will serve to reduce the power of the factor price manipulation
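$$\max_{\tau_t^m} \ \left[ \frac{\alpha}{1-\alpha} A^e - w_t \right] l_t^e + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l_t^m \theta^m + R \right], \tag{18.12}$$
subject to (18.7) and
$$\theta^e l_t^e + \theta^m l_t^m = 1, \tag{18.13}$$
$$l_t^m = \lambda \ \text{ if } \ (1-\tau_t^m)^{1/\alpha} A^m \ge A^e. \tag{18.14}$$
The first term in (18.12) is the elite's net revenues and the second term is the transfer they
receive. Equation (18.13) is the market clearing constraint, while (18.14) ensures that middle
class producers employ as much labor as they wish provided that their net productivity is
greater than that of elite producers.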
The solution to this problem can take two different forms depending on whether (18.14)
holds in the solution. If it does, then $w = \alpha A^e / (1-\alpha)$, and elite producers make zero
profits and their only income is derived from transfers. Intuitively, this corresponds to the
case where the elite prefer to let the middle class producers undertake all of the profitable
Assumption 16
$$A^e \ge \phi (1-\alpha)^{(1-\alpha)/\alpha} A^m \frac{\theta^m}{\theta^e}.$$
This assumption ensures that the solution will always take the latter form (i.e., (18.14)
does not hold). Intuitively, this condition makes sure that the productivity gap between the
middle class and elite producers is not so large as to make it attractive for the elite to make
zero profits themselves (recall that $\phi(1-\alpha)^{(1-\alpha)/\alpha} < 1$, so if $\theta^e = \theta^m$ and $A^e = A^m$, this
condition is always satisfied).
Consequently, when Assumption 16 holds, we have $w_t = \alpha(1-\tau_t^m)^{1/\alpha} A^m / (1-\alpha)$, so the elite choose $\tau_t^m$ to maximize
$$\frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l^m \theta^m + R \right] - \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m \lambda, \tag{18.15}$$
where I have used the fact that all elite producers will employ $\lambda$ employees, and from (18.13),
$l^m = (1 - \theta^e \lambda) / \theta^m$.
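The maximization of (18.15) gives
$$\frac{\tau_t^m}{1-\tau_t^m} = \kappa(\lambda, \theta^e, \alpha, \phi) \equiv \frac{\alpha}{1-\alpha} \left( 1 + \frac{\theta^e \lambda}{(1 - \theta^e \lambda)\phi} \right).$$
The first interesting feature is that $\kappa(\lambda, \theta^e, \alpha, \phi)$ is always less than $\infty$. This implies that
$\tau_t^m$ is always less than 1, which is the desired tax rate in the case of pure factor price
manipulation. Moreover, $\kappa(\lambda, \theta^e, \alpha, \phi)$ is strictly greater than $\alpha / (1-\alpha)$, so that $\tau_t^m$ is
$$\tau^{COM} \equiv \min\left\{ \frac{\kappa(\lambda, \theta^e, \alpha, \phi)}{1 + \kappa(\lambda, \theta^e, \alpha, \phi)}, \ \bar\tau \right\}. \tag{18.16}$$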
It is also interesting to look at the comparative statics of this tax rate. First, as $\phi$
increases, taxation becomes more beneficial (generates greater revenues), but $\tau^{COM}$ declines.
This might at first appear paradoxical, since one may have expected that as taxation becomes
less costly, taxes should increase. Intuition for this result follows from the observation that an
increase in $\phi$ raises the importance of revenue extraction, and as commented above, in this
case, revenue extraction is a force towards lower taxes (it makes it more costly for the elite
to move beyond the peak of the Laffer curve). Since the parameter $\phi$ is related, among other
things, to state capacity, this comparative static result suggests that higher state capacity
will translate into lower taxes, because greater state capacity enables the elite to extract
revenues from the middle class through taxation, without directly impoverishing them. In
other words, greater state capacity enables more efficient forms of resource extraction by the
groups holding political power.
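The combined tax rate and its comparative statics can be explored with a short Python sketch. It assumes the first-order condition takes the form $\tau/(1-\tau) = \kappa$ with $\kappa = \frac{\alpha}{1-\alpha}\left(1 + \frac{\theta^e\lambda}{(1-\theta^e\lambda)\phi}\right)$, and that $\theta^e\lambda < 1$ and $\phi > 0$; the function names and parameter values are hypothetical.

```python
def kappa(lam, theta_e, alpha, phi):
    """Right-hand side of the first-order condition tau/(1-tau) = kappa."""
    return alpha / (1.0 - alpha) * (
        1.0 + theta_e * lam / ((1.0 - theta_e * lam) * phi))


def tau_com(lam, theta_e, alpha, phi, tau_bar=1.0):
    """Combined tax rate: the interior solution kappa/(1+kappa), capped at tau_bar."""
    k = kappa(lam, theta_e, alpha, phi)
    return min(k / (1.0 + k), tau_bar)


# Hypothetical parameters: lam = 2, theta_e = 0.3, alpha = 0.3.
low_capacity = tau_com(2.0, 0.3, 0.3, phi=0.5)    # roughly 0.63
high_capacity = tau_com(2.0, 0.3, 0.3, phi=1.0)   # roughly 0.52
more_elite = tau_com(2.0, 0.4, 0.3, phi=0.5)      # roughly 0.79
```

In this example the tax rate stays above $\alpha$, falls when $\phi$ rises, and rises with the number of elite producers, in line with the discussion in the text.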
Second, as $\theta^e$ increases and the number of elite producers increases, taxes also increase.
The reason for this effect is again the interplay between the revenue extraction and factor
price manipulation mechanisms. When there are more elite producers, reducing factor prices
becomes more important relative to gathering tax revenue. One interesting implication of
this discussion is that when the factor price manipulation effect is more important, there will
18.2.7
Political Consolidation
I now discuss another reason for inefficient taxation, the desire of the elite to preserve their
political power. This mechanism has been absent so far, since the elite were assumed to
always remain in power. To illustrate it, the model needs to be modified to allow for endogenous switches of power. Institutional change will be discussed in greater detail later. For
now, let us assume that there is a probability $p_t$ in period $t$ that political power permanently
shifts from the elite to the middle class. Once they come to power, the middle class will
pursue a policy that maximizes their own utility. When this probability is exogenous, the
previous analysis still applies. Interesting economic interactions arise when this probability
is endogenous. Here I will use a simple (reduced-form) model to illustrate the trade-offs and
assume that this probability is a function of the income level of the middle class agents, in
particular
$$p_t = p(\theta^m c_t^m) \in [0, 1], \tag{18.17}$$
where I have used the fact that income is equal to consumption. Let us assume that $p$ is
continuous and differentiable with $p' > 0$, which captures the fact that when the middle
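$$V^e(E) = \max_{\tau_t^m} \left\{ \left[ \frac{\alpha}{1-\alpha} A^e - w_t \right] l_t^e + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l_t^m \theta^m + R \right] + \beta \left[ (1-p_t) V^e(E) + p_t V^e(M) \right] \right\},$$
with
$$p_t = p\left( \left[ \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m l_t^m - w_t l_t^m \right] \theta^m \right).$$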
I wrote $V^e(E)$ and $V^e(M)$ not as functions of time, since the structure of the problem makes
it clear that these values will be constant in equilibrium.
The first observation is that if the solution to the static problem involves $c_t^m = 0$, then
the same fiscal policy is optimal despite the risk of losing power. This implies that, as
long as Condition (ES) does not hold and Assumption 16 holds, the political consolidation
mechanism does not add an additional motive for inefficient taxation.
To see the role of the political consolidation mechanism, suppose instead that Condition
(ES) holds. In this case, $w_t = 0$ and the optimal static policy is $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar\tau\}$, as
discussed above, which implies positive profits and consumption for middle class agents. The
dynamic maximization problem then becomes
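$$V^e(E) = \max_{\tau_t^m} \left\{ \frac{\alpha}{1-\alpha} A^e \lambda + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \lambda \theta^m + R \right] + \beta V^e(E) - \beta p\left( \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m \lambda \theta^m \right) \left( V^e(E) - V^e(M) \right) \right\}. \tag{18.18}$$
The first-order condition for an interior solution is
$$\phi \left( 1 - \frac{1-\alpha}{\alpha} \frac{\tau_t^m}{1-\tau_t^m} \right) + \beta \theta^e p'\left( \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m \lambda \theta^m \right) \left( V^e(E) - V^e(M) \right) = 0.$$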
It is clear that when $p'(\cdot) = 0$, we obtain $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar\tau\}$ as above. However,
when $p'(\cdot) > 0$, $\tau_t^m = \tau^{PC} > \tau^{RE} \equiv \min\{\alpha, \bar\tau\}$ as long as $V^e(E) - V^e(M) > 0$. That
$V^e(E) - V^e(M) > 0$ holds is immediate, since when the middle class are in power, they
get to tax the elite and receive all of the transfers.
Intuitively, as with the factor price manipulation mechanism, the elite tax beyond the
peak of the Laffer curve, yet now not to increase their revenues, but to consolidate their
political power. These high taxes reduce the income of the middle class and their political
power. Consequently, there is a higher probability that the elite remain in power in the
future, enjoying the benefits of controlling the fiscal policy.
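A closed-form illustration of this mechanism is possible under two simplifying assumptions that are not in the text: the replacement probability has a constant slope $p'$, and the value gap $V^e(E) - V^e(M)$ is treated as a given positive number rather than solved for. Differentiating (18.18) with respect to $\tau_t^m$ then gives $\tau/(1-\tau) = \frac{\alpha}{1-\alpha}\left(1 + \beta\theta^e p' \,(V^e(E) - V^e(M))/\phi\right)$ for an interior solution. The Python sketch below, with hypothetical names and numbers, evaluates this expression.

```python
def tau_pc(alpha, phi, beta, theta_e, p_slope, value_gap, tau_bar=1.0):
    """Tax rate under political consolidation, assuming a constant slope
    p_slope for p(.) and an exogenously given value_gap = V^e(E) - V^e(M)."""
    ratio = alpha / (1.0 - alpha) * (
        1.0 + beta * theta_e * p_slope * value_gap / phi)
    return min(ratio / (1.0 + ratio), tau_bar)


# With p_slope = 0 the rule collapses to the revenue-maximizing rate alpha.
static_rate = tau_pc(0.3, 1.0, 0.9, 0.2, p_slope=0.0, value_gap=5.0)
# With p_slope > 0 and a positive value gap the elite tax above alpha.
consolidation_rate = tau_pc(0.3, 1.0, 0.9, 0.2, p_slope=0.5, value_gap=5.0)
```

Because the value gap is itself an equilibrium object, this is only a partial-equilibrium sketch of the first-order condition, not a solution of the full dynamic problem.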
An interesting comparative static is that as $R$ increases, the gap between $V^e(E)$ and
$V^e(M)$ increases, and the tax that the elite sets increases as well. Intuitively, the party
in power receives the revenues from natural resources, $R$. When $R$ increases, the elite
become more willing to sacrifice tax revenue (by overtaxing the middle class) in order to
increase the probability of remaining in power, because remaining in power has now become
more valuable. This contrasts with the results so far where $R$ had no effect on taxes. More
interestingly, a higher $\phi$, i.e., greater state capacity, also increases the gap between $V^e(E)$ and
$V^e(M)$ (because this enables the group in power to raise more tax revenues) and thus implies
a higher tax rate on the middle class. Intuitively, when there is no political competition,
greater state capacity, by allowing more efficient forms of transfers, improves the allocation
of resources. But in the presence of political competition, by increasing the political stakes,
it leads to greater conflict and more distortionary policies.
Proposition 56 Consider the economy with political replacement. Suppose also that Assumption 15 and Condition (ES) hold and $\phi > 0$, then the political equilibrium features
$\tau_t^m = \tau^{PC} > \tau^{RE}$ for all $t$. This tax rate is increasing in $R$ and $\phi$.
18.2.8
I have so far focused on Markov perfect equilibria (MPE). In general, such a focus can be
restrictive. In this case, however, it can be proved that subgame perfect equilibria (SPE)
coincide with the MPE. This will not be true in the next subsection, so it is useful to briefly
discuss why it is the case here.
MPE are a subset of the SPE. Loosely speaking, SPEs that are not Markovian will be
supported by some type of history-dependent punishment strategies. If there is no room
for such history dependence, SPEs will coincide with the MPEs.
In the models analyzed so far, such punishment strategies are not possible even in the
SPE. Intuitively, each individual is infinitesimal and makes its economic decisions to maximize profits. Therefore, (18.5) and (18.6) determine the factor demands uniquely in any
equilibrium. Given the factor demands, the payoffs from various policy sequences are also
uniquely pinned down. This means that the returns to various strategies for the elite are
independent of history. Consequently, there cannot be any SPEs other than the MPE characterized above. Therefore, we have:
Proposition 57 The MPEs characterized in Propositions 53-56 are the unique SPEs.
18.2.9
Lack of Commitment: Holdup
The models discussed so far featured full commitment to taxes by the elites. Using a term
from organizational economics, this corresponds to the situation without any holdup.
Holdup (lack of commitment to taxes or policies) changes the qualitative implications of
the model; if expropriation (or taxation) happens after investments, revenues generated by
investments can be ex post captured by others. These types of holdup problems are likely
to arise when the key investments are long-term, so that various policies will be determined
and implemented after these investments are made (and sunk).
The problem with holdup is that the elite will be unable to commit to a particular
tax rate before middle class producers undertake their investments (taxes will be set after
investments). This lack of commitment will generally increase the amount of taxation and
inefficiency. To illustrate this possibility, I consider the same model as above, but change the
timing of events such that first individual producers undertake their investments and then
the elite set taxes. The economic equilibrium is unchanged, and in particular, (18.5) and
(18.6) still determine factor demands, with the only difference that $\tau^m$ and $\tau^e$ now refer to
expected taxes. Naturally, in equilibrium expected and actual taxes coincide.
What is different is the calculus of the elite in setting taxes. Previously, they took
into account that higher taxes would discourage investment. Since taxes are now set after
investment decisions, this effect is absent. As a result, in the MPE, the elite will always want
to tax at the maximum rate, so in all cases, there is a unique MPE where $\tau_t^m = \tau^{HP} \equiv \bar\tau$ for
all $t$.
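$$\frac{\phi \alpha}{(1-\beta)(1-\alpha)} (1-\alpha)^{(1-\alpha)/\alpha} A^m \lambda \theta^m. \tag{18.19}$$
If, in contrast, they deviate at any point, the most profitable deviation for them is to set
$\tau^m = 1$, and they will raise
$$\frac{\phi}{1-\alpha} (1-\alpha)^{(1-\alpha)/\alpha} A^m \lambda \theta^m. \tag{18.20}$$
The trigger-strategy profile will be an equilibrium as long as (18.19) is greater than or equal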
18.2.10
Suppose now that taxes are set before investments, so the source of holdup in the previous
subsection is absent. Instead, suppose that at time $t = 0$ before any economic decisions or
policy choices are made, middle class agents can invest to increase their productivity. In
particular, suppose that there is a cost $\Gamma(A^m)$ of investing in productivity $A^m$. The function
$\Gamma$ is non-negative, continuously differentiable and convex. This investment is made once and
the resulting productivity $A^m$ applies forever after.
Once investments in technology are made, the game proceeds as before. Since investments in technology are sunk after date $t = 0$, the equilibrium allocations are the same as in
the results presented above. Another interesting question is whether, if they could, the elite
would prefer to commit to a tax rate sequence at time $t = 0$.
The analysis of this case follows closely that of the baseline model, and I simply state
the results (without proofs to save space):
That this is the unique MPE is quite straightforward. It is also intuitive that it is the
unique SPE. In fact, the elite would choose exactly this tax rate even if they could commit at
time $t = 0$. The reason is as follows: in the case of pure factor price manipulation, the only
objective of the elite is to reduce the middle class labor demand, so they have no interest
in increasing the productivity of middle class producers.
For contrast, let us next consider the pure revenue extraction case with Condition (ES)
satisfied. Once again, the MPE is identical to before. As a result, the first-order condition
for an interior solution to the middle class producers' technology choice is:
$$\Gamma'(A^m) = \frac{1}{1-\beta} \frac{\alpha}{1-\alpha} (1-\tau^m)^{1/\alpha} \lambda, \tag{18.21}$$
where $\tau^m$ is the constant tax rate that they will face in all future periods. In the pure
revenue extraction case, recall that the equilibrium is $\tau^m = \tau^{RE} \equiv \min\{\alpha, \bar\tau\}$. With the
same arguments as before, this is also the unique SPE. Once the middle class producers
have made their technology decisions, there is no history-dependent action left, and it is
impossible to create history-dependent punishment strategies to support a tax rate different
from the static optimum for the elite. Nevertheless, this is not necessarily the allocation that
the elite prefer. If the elite could commit to a tax rate sequence at time $t = 0$, they would
choose lower taxes. To illustrate this, suppose that they can commit to a constant tax rate (it
is straightforward to show that they will in fact choose a constant tax rate even without this
restriction, but this restriction saves on notation). Therefore, the optimization problem of
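$$A^m - \frac{1-\alpha}{\alpha} \frac{\tau^m}{1-\tau^m} A^m + \tau^m \frac{dA^m}{d\tau^m} = 0,$$
where $dA^m / d\tau^m$ takes into account the effect of future taxes on technology choice at time
$t = 0$. This expression can be obtained from (18.21) as:
$$\frac{dA^m}{d\tau^m} = -\frac{1}{1-\beta} \frac{1}{1-\alpha} \frac{(1-\tau^m)^{(1-\alpha)/\alpha} \lambda}{\Gamma''(A^m)} < 0.$$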
This implies that the solution to this maximization problem satisfies $\tau^m = \tau^{TA} < \tau^{RE} \equiv
\min\{\alpha, \bar\tau\}$. If they could, the elite would like to commit to a lower tax rate in the future
in order to encourage the middle class producers to undertake technological improvements.
Their inability to commit to such a tax policy leads to greater inefficiency than in the case
without technology adoption. Summarizing this discussion:
Proposition 61 Consider the game with technology adoption, and suppose that Assumption
15 and Condition (ES) hold and $\phi > 0$, then the unique political equilibrium features $\tau_t^m =
\tau^{RE} \equiv \min\{\alpha, \bar\tau\}$ for all $t$. If the elite could commit to a tax policy at time $t = 0$, they would
prefer to commit to $\tau^{TA} < \tau^{RE}$.
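The gap between the committed and equilibrium tax rates can be made concrete under one extra assumption that is not in the text: a quadratic adoption cost $\Gamma(A) = cA^2/2$, so that (18.21) gives $A^m$ proportional to $(1-\tau^m)^{1/\alpha}$. Per-period revenue is then proportional to $\tau(1-\tau)^{(2-\alpha)/\alpha}$, which peaks at $\tau = \alpha/2$. The Python sketch below checks this numerically; all names, the cost function, and the parameter values are illustrative.

```python
def chosen_productivity(tau, alpha, beta, lam, cost_scale):
    """Solve Gamma'(A) = cost_scale * A from (18.21), assuming a quadratic
    adoption cost Gamma(A) = cost_scale * A**2 / 2 (an assumption)."""
    return (alpha / ((1.0 - beta) * (1.0 - alpha))
            * (1.0 - tau) ** (1.0 / alpha) * lam / cost_scale)


def revenue_with_commitment(tau, alpha, beta, lam, cost_scale,
                            phi=1.0, theta_m=1.0):
    """Per-period tax revenue when productivity responds to the committed tax."""
    A_m = chosen_productivity(tau, alpha, beta, lam, cost_scale)
    return (phi / (1.0 - alpha) * tau
            * (1.0 - tau) ** ((1.0 - alpha) / alpha) * A_m * lam * theta_m)


def committed_tax(alpha, beta=0.9, lam=1.0, cost_scale=1.0, grid_size=10001):
    """Grid search for the tax rate the elite would commit to at t = 0."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda tau: revenue_with_commitment(
        tau, alpha, beta, lam, cost_scale))


# With alpha = 0.4 the committed rate is about 0.2, below tau^RE = 0.4.
tau_ta = committed_tax(0.4)
```

With a constant committed tax the discounted revenue stream is proportional to per-period revenue, so maximizing the latter is enough for this illustration.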
An important feature is that in contrast to the pure holdup problem where SPE could
prevent the additional inefficiency (when $\beta \ge 1 - \alpha$, recall Proposition 59), with the technology adoption game, the inefficiency survives the SPE. The reason is that, since middle
18.2.11
The previous analysis shows how inefficient policies emerge out of the desire of the elite,
which possesses political power, to redistribute resources towards themselves. I now discuss
the implications of these mechanisms for inefficient institutions. Since the elite prefer to
implement inefficient policies to transfer resources from the rest of the society (the middle
class and the workers) to themselves, they will also prefer inefficient economic institutions
that enable and support these inefficient policies.
To illustrate the main economic interactions, I consider two prototypical economic institutions: (1) Security of property rights; there may be constitutional or other limits on the
extent of redistributive taxation and/or other policies that reduce profitability of producers'
investments. In terms of the model above, we can think of this as determining the level
of $\bar\tau$. (2) Regulation of technology, which concerns direct or indirect factors affecting the
productivity of producers, in particular middle class producers.
As pointed out in the introduction, the main role of institutions is to provide the framework for the determination of policies, and consequently, preferences over institutions are
derived from preferences over policies and economic allocations. Bearing this in mind, let
18.3
The above analysis characterized the equilibrium under the dictatorship of the elite, a set
of political institutions that gave all political power to the elite producers. An alternative is
to have the dictatorship of the middle class, i.e., a system in which the middle class makes
the key policy decisions (this could also be a democratic regime with the middle class as
the decisive voters). Finally, another possibility is democracy in which there is voting over
different policy combinations. If $\theta^e + \theta^m < 1$, then the majority are the workers, and they
will pursue policies to maximize their own income.
I now briefly discuss the possibility of a switch from the dictatorship of the elite to one
of these two alternative regimes. It is clear that whether the dictatorship of the elite or that
of the middle class is more efficient depends on the relative numbers and productivities of the
two groups, and whether elite control or democracy is more efficient depends on policies in
democracy. Hence, this section will first characterize the equilibrium under these alternative
political institutions. Moreover, for part of the analysis in this subsection, I simplify the
discussion by imposing the following assumption:
18.3.1
With the dictatorship of the middle class, the political equilibrium is identical to the dictatorship of the elite, with the roles reversed. To avoid repetition, I will not provide a full
analysis. Instead, let me focus on the case combining revenue extraction and factor price
manipulation. The analog of Assumption 16 in this case is:
Assumption 18
$$A^m \ge \phi (1-\alpha)^{(1-\alpha)/\alpha} A^e \frac{\theta^e}{\theta^m}.$$
Given this assumption, a similar proposition to that above immediately follows; the
middle class will tax the elite and will redistribute the proceeds to themselves, i.e., $T_t^w =
T_t^e = 0$, and moreover, the same analysis as above gives their most preferred tax rate as
$$\tau_t^e = \tau^{COM} \equiv \min\left\{ \frac{\kappa(\lambda, \theta^m, \alpha, \phi)}{1 + \kappa(\lambda, \theta^m, \alpha, \phi)}, \ \bar\tau \right\}. \tag{18.22}$$
Proposition 68 Suppose Assumptions 15 and 17 hold, Condition (ES) does not hold, and
$\phi > 0$, then the unique political equilibrium with middle class control features $\tau_t^e = \tau^{COM}$ as
given by (18.22) for all $t$.
Comparing this equilibrium to the equilibrium under the dictatorship of the elite, it is
apparent that the elite equilibrium will be more efficient when $A^e$ and $\theta^e$ are large relative
18.3.2
Democracy
Under Assumption (A4), workers are in the majority in democracy, and have the power
to tax the elite and the middle class to redistribute to themselves. More specifically, each
worker's consumption is $c_t^w = w_t + T_t^w$, with $w_t$ given by (18.7), so that workers care
about equilibrium wages and transfers. Workers will then choose the sequence of policies
$\{\tau_t^e, \tau_t^m, T_t^w, T_t^m, T_t^e\}_{t=0,1,\ldots,\infty}$ that satisfies (18.3) to maximize $\sum_{t=0}^{\infty} \beta^t c_t^w$.
It is straightforward to see that the workers will always set $T_t^m = T_t^e = 0$. Substituting
for the transfers from (18.3), we obtain that democracy will solve the following maximization
problem to determine policies:
$$\max_{\tau_t^e, \tau_t^m} \ w_t + \frac{\phi}{1-\alpha} \left[ \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l_t^m \theta^m + \tau_t^e (1-\tau_t^e)^{(1-\alpha)/\alpha} A^e l_t^e \theta^e \right] + R.$$
As before, when Condition (ES) holds, taxes have no effect on wages, so the workers will
tax at the revenue-maximizing rate, similar to the case of revenue extraction for the elite
above. This result is stated in the next proposition (proof omitted):
Therefore, in this case democracy is more inefficient than both middle class and elite
control, since it imposes taxes on both groups. The same is not the case, however, when
Condition (ES) does not hold and wages are positive. In this case, workers realize that by
taxing the marginal group they are reducing their own wages. In fact, taxes always reduce
wages more than the revenue they generate because of their distortionary effects. As a result,
workers will only tax the group with the higher marginal productivity. More specifically, for
example, if $A^m > A^e$, we will have $\tau_t^e = 0$, and $\tau_t^m$ will be such that $(1-\tau_t^m)^{1/\alpha} A^m = A^e$, or
$\tau_t^m = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^m \geq A^e$. Therefore, we have:
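Proposition 71 Suppose Assumptions 15 and 18 hold and Condition (ES) does not hold.
Then in the unique political equilibrium with democracy, if $A^m > A^e$, we will have $\tau_t^e = 0$,
and $\tau_t^m = \tau^{Dm}$ will be such that $(1-\tau^{Dm})^{1/\alpha} A^m = A^e$, or $\tau^{Dm} = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^m \geq A^e$.
If $A^m < A^e$, we will have $\tau_t^m = 0$, and $\tau_t^e = \tau^{De}$ will be such that $(1-\tau^{De})^{1/\alpha} A^e = A^m$, or
$\tau^{De} = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^e \geq A^m$.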
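The tax rule in this proposition can be sketched in a few lines of Python. Reading the condition as $(1-\tau)^{1/\alpha} A^{high} = A^{low}$, the interior solution is $\tau = 1 - (A^{low}/A^{high})^{\alpha}$, and the upper bound $\bar{\tau}$ applies whenever this interior rate would exceed it. The function name `democracy_taxes` and the numerical values are hypothetical.

```python
def democracy_taxes(a_m: float, a_e: float, alpha: float, tau_bar: float):
    """Return (tau_m, tau_e) under democracy when Condition (ES) does not hold.

    Only the more productive group is taxed. Its rate equalizes after-tax
    productivities, (1 - tau)^(1/alpha) * A_high = A_low, unless the upper
    bound tau_bar binds first. The case a_m == a_e is not covered by the
    proposition and is rejected here.
    """
    if a_m <= 0.0 or a_e <= 0.0 or a_m == a_e:
        raise ValueError("need positive, distinct productivities")
    a_high, a_low = max(a_m, a_e), min(a_m, a_e)
    # Interior solution of (1 - tau)^(1/alpha) * a_high = a_low.
    tau_interior = 1.0 - (a_low / a_high) ** alpha
    tau = min(tau_interior, tau_bar)
    # The less productive group is left untaxed.
    return (tau, 0.0) if a_m > a_e else (0.0, tau)
```

When the bound binds, the taxed group remains strictly more productive after taxes, which matches the inequality stated in the proposition.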
The most interesting implication of this proposition comes from the comparison of the
case with and without excess supply. While in the presence of excess labor supply, democracy
taxes both groups of producers and consequently generates more inefficiency than the
dictatorship of the elite or the middle class, when there is no excess supply, it is in general
less distortionary than the dictatorship of the middle class or the elite. The intuition is that
when Condition (ES) does not hold, workers understand that high taxes will depress wages
and are therefore less willing to use distortionary taxes.
18.3.3
Consider a society where Assumption 18 is satisfied and $A^e < A^m$ so that middle class
control is more productive (i.e., generates greater output). Despite this, the elite will have
no incentive, without some type of compensation, to relinquish their power to the middle
class. In this case, political institutions that lead to more inefficient policies will persist even
though alternative political institutions leading to better outcomes exist.
One possibility is a Coasian deal between the elite and the middle class. For example,
perhaps the elite can relinquish political power and get compensated in return. However,
such deals are in general not possible. To discuss why (and why not), let us distinguish
between two alternative approaches.
First, the elite may relinquish power in return for a promise of future transfers. This
type of solution will run into two difficulties. (i) Such promises will not be credible: once
they have political power, the middle class will have no incentive to keep making such
transfers. (ii) Since there are no other, less distortionary, fiscal instruments with which to
compensate the elite, the middle class will have to impose similar taxes on itself, so that the
alternative political institutions will not be as efficient in the first place.
Second, the elite may relinquish power in return for a lump-sum transfer from the middle
class. Such a solution is also not possible in general, since the net present value of the benefit
of holding political power often exceeds any transfer that can be made. Consequently, the
desire of the elite to implement inefficient policies also implies that they support political
institutions that enable them to pursue these policies. Thus, in the same way as preferences
over inefficient policies translate into preferences over inefficient economic institutions, they
18.3.4
To develop a better understanding of why inefficient institutions emerge and persist, we
need an equilibrium model of institutional change. I now briefly discuss such a model.
It is first useful to draw a distinction between de jure and de facto political power. De
jure political power is determined by political institutions. In the baseline model, de jure
political power is in the hands of the elite, since the political institutions give them the right
to set taxes and determine the economic institutions. De facto political power, which comes
from other sources, did not feature so far in the model (except in the discussion of political
consolidation). The simplest example of de facto political power is when a group manages
to organize itself and poses a military challenge to an existing regime or threatens it with
a revolution. I will conceptualize institutional change as resulting from the interplay of de
jure and de facto political power.
m (0) + T e RE + R /m
,
(M) =
1
(18.23)
where $\tau^{RE}$ is given by (18.11). The first term in the numerator is their own revenues,
Am / (1 ), and the second is the distribution from the revenue obtained by taxing the
elite and from natural resources. The term $1-\beta$ provides the net present discounted value
of this stream of revenues. Similarly, the value of an elite producer in this case is
e RE
e
.
V (M) =
1
(18.24)
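The discounting step behind these values can be checked numerically. The minimal Python sketch below assumes that the denominators in (18.23) and (18.24) are $1-\beta$, with $\beta \in (0,1)$ the discount factor, so that a constant per-period flow $x$ is worth $x/(1-\beta)$. The function names and numbers are hypothetical.

```python
def present_value(flow: float, beta: float) -> float:
    """Value of receiving `flow` every period forever, discounted at beta."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("this sketch assumes 0 <= beta < 1")
    return flow / (1.0 - beta)


def truncated_sum(flow: float, beta: float, horizon: int) -> float:
    """Direct sum of flow * beta**t for t = 0, ..., horizon - 1."""
    return sum(flow * beta ** t for t in range(horizon))
```

For a long enough horizon, `truncated_sum` approaches `present_value`, which is the geometric-series identity used when a regime is absorbing and policies are constant over time.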
What about the dictatorship of the elite? Let us write this value recursively starting in
the no threat state:
(18.25)
This expression incorporates the fact that, in the MPE, during periods of low threat, the
elite will follow their most preferred policy, $\tau^m = \tau^{RE}$ and $T^m = 0$. The low threat state
recurs with probability $1-q$. What happens when $s_t = H$? As noted above, there are
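The recursive structure of these value functions can be sketched numerically. The Python snippet below is an illustration under stated assumptions, not the text's own derivation: the threat state is taken to be drawn independently each period (high with probability $q$, low with probability $1-q$), and the flow payoffs `u_low` and `u_high` are hypothetical given numbers rather than the model's expressions. Under these assumptions the high-state value has the closed form $[(1-\beta(1-q))\,u_H + \beta(1-q)\,u_L]/(1-\beta)$.

```python
def solve_values(u_low, u_high, beta, q, tol=1e-12, max_iter=100_000):
    """Iterate V(s) = u_s + beta * [(1 - q) * V(L) + q * V(H)] to a fixed point.

    Returns (V_low, V_high). The mapping is a contraction for 0 <= beta < 1.
    """
    v_low = v_high = 0.0
    for _ in range(max_iter):
        # Expected continuation value is the same in both states (i.i.d. threat).
        continuation = beta * ((1.0 - q) * v_low + q * v_high)
        new_low, new_high = u_low + continuation, u_high + continuation
        if max(abs(new_low - v_low), abs(new_high - v_high)) < tol:
            return new_low, new_high
        v_low, v_high = new_low, new_high
    raise RuntimeError("value iteration did not converge")


def closed_form_high(u_low, u_high, beta, q):
    """Closed-form value in the high-threat state under the same assumptions."""
    weight_high = 1.0 - beta * (1.0 - q)
    weight_low = beta * (1.0 - q)
    return (weight_high * u_high + weight_low * u_low) / (1.0 - beta)
```

The weights $1-\beta(1-q)$ and $\beta(1-q)$ arise because, starting from the high state, the current payoff is $u_H$ for sure while future periods revert to the low-state payoff with probability $1-q$.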
(18.26)
where recall that is the cost of regime change for the middle class. When this constraint
holds, the elite could make sufficient concessions to keep the middle class happy within the
existing regime.
Therefore, to determine whether concessions within the dictatorship of the elite will be
sufficient to satisfy the middle class, we simply need to calculate $V^m(E, H)$. Note that the
best concession that the elite can make is to adopt a policy that is most favorable for the middle
where V m (E, L) is given by expression (18.25), with V m (E, H) replacing V m (E, H) on the
+
(1
(1
q))
+
R
/
(1
q)
m
(18.28)
V (E, H) =
(1 )
This is the maximum credible utility that the elite can promise the middle class within
the existing regime. They cannot give them greater utility because of
commitment problems. As (18.28) makes clear, the elite transfer resources to the middle
class only in the state $s_t = H$. Even if they promise to make further transfers or not tax
Instead, the elite will choose a policy combination m , e , Tm , Te such that V m (E, H) =
V m (M) , i.e., they will make the middle class just indifferent between overthrowing the
regime and accepting the concessions. The value of such concessions to the elite is, by similar
arguments, given by:
V e (E, H) =
i
h
m RE
e
e
e
e
e
)+T
+ R / + (1 (1 q)) (
(1 q) (0) + T
(1 )
(18.29)
Whether the elite will make these concessions or not then depends on the values of other
options available to them. Another alternative is the use of repression whenever there is a
threat from the middle class. Such repression is always effective, so the only cost of this
strategy for the elite is the cost they incur in the use of repression, . Denote by $V^e(O, s_t)$ the
value function of the elite when they use repression and the state is $s_t$. By standard arguments,
we can obtain this value by writing the following standard recursive formulae: $V^e(O, H) =$
the fact that, when using the repression strategy, the elite will always choose their most for
+
R
/ (1 (1 q))
V e (O, H) =
.
1
(18.30)
Consequently, for the elite to prefer concessions, it needs to be the case that $V^e(E, H) \geq V^e(O, H)$.
Finally, the third alternative for the elite is to allow regime change, and obtain $V^e(M)$ as
given by (18.24). Evidently, $V^e(M)$ is less than $V^e(E, H)$, since in the latter case they only
make concessions (in fact limited concessions) with probability $q$. Therefore, regime change
will only happen when (18.26) does not hold. In addition, for similar reasons, for regime
change to take place, we need $V^e(M) \geq V^e(O, H)$. Note that all of the values here are
simple functions of parameters, so comparing these values essentially amounts to comparing
nonlinear functions of the underlying parameters.
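The comparison logic described above can be summarized in a short Python sketch. It takes the three values to the elite as given numbers, together with a flag for whether constraint (18.26) holds, and applies the tie-breaking convention that the elite opt against repression when indifferent. The function name, argument names, and string labels are hypothetical, and the weak inequalities are an assumed reading consistent with that tie-breaking convention.

```python
def elite_choice(concessions_feasible: bool,
                 v_concede: float,
                 v_repress: float,
                 v_regime_change: float) -> str:
    """Elite's response to a high threat from the middle class.

    concessions_feasible : True if constraint (18.26) holds, i.e. concessions
                           within the existing regime can satisfy the middle class
    v_concede            : value to the elite of making concessions, V^e(E, H)
    v_repress            : value to the elite of repression, V^e(O, H)
    v_regime_change      : value to the elite after regime change, V^e(M)
    """
    if concessions_feasible:
        # Concessions dominate regime change for the elite, so the only
        # relevant comparison is with repression; ties go against repression.
        return "concessions" if v_concede >= v_repress else "repression"
    # Concessions cannot prevent a challenge, so the elite either repress
    # or allow regime change; ties again go against repression.
    return "regime change" if v_regime_change >= v_repress else "repression"
```

In an application, the three values would themselves be computed from the underlying parameters, so this function only encodes the final comparison step.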
Putting all these pieces together and assuming for convenience that when indifferent the
elite opt against repression, we obtain the following proposition:
Proposition 72 Consider the above environment with potential regime change and suppose
that Condition (ES) holds. Then there are three different types of political equilibria:
1. If (18.26) holds and $V^e(E, H) \geq V^e(O, H)$, in the unique equilibrium the regime
always remains the dictatorship of the elite. When $s_t = L$, the elite set their most preferred
policy of $\tau^m = \tau^{RE}$, $\tau^e = 0$ and $T^m = 0$, and when $s_t = H$, the elite make concessions
m
e m e
m
m
$V^e(O, H)$, then the regime always remains the dictatorship of the elite. The elite always set
+
T
+
R
/ (1 (1 q))
V e (O, H) V e (M) =
1
is increasing in R and . This implies that when R is high, so that there are greater rents from