Intermediate Microeconomic Theory
Tools and Step-by-Step Examples
Ana Espinola-Arredondo and Felix Muñoz-Garcia
The MIT Press

Cambridge, Massachusetts
London, England

© 2020 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means
(including photocopying, recording, or information storage and retrieval) without permission in writing from the
publisher.
This book was set in Times New Roman by Westchester Publishing Services.
Library of Congress Cataloging-in-Publication Data
Names: Espinola-Arredondo, Ana, author. | Muñoz-Garcia, Felix, author.

Title: Intermediate microeconomic theory : tools and step-by-step examples /
Ana Espinola-Arredondo and Felix Muñoz-Garcia.
Description: Cambridge, Massachusetts : MIT Press, [2020] | Includes
bibliographical references and index.
Identifiers: LCCN 2019053969 | ISBN 9780262044233 (hardcover)
Subjects: LCSH: Microeconomics.
Classification: LCC HB172 .E855 2020 | DDC 338.5--dc23
LC record available at https://lccn.loc.gov/2019053969
Contents
Chapter Examples xiii

Preface xix
Organization of the Book xx
How to Use This Textbook xxi
Ancillary Materials xxii
Acknowledgments xxii
1 Introduction 1
1.1 What Is Microeconomics? 1
1.2 Comparative Statics 2
1.3 Overview of the Book 2
1.3.1 Consumer Theory 2
1.3.2 Production Theory 4
1.3.3 Markets—Putting Consumers and Producers Together 4
1.3.4 Strategy—Let’s Play Games! 5
1.3.5 Putting Game Theory to Work 5
1.3.6 More Market Failures—When Markets Work Well
and When They Don’t 6
2 Consumer Preferences and Utility 7

2.1 Introduction 7
2.2 Bundles 7
2.3 Preferences for Bundles 8
2.3.1 Ranking Bundles with More Units 10
2.3.2 Satiation and Bliss Points 12
2.4 Utility Functions 14
2.5 Marginal Utility 18
2.5.1 Diminishing Marginal Utility 19
2.6 Indifference Curves 20
2.6.1 Properties of Indifference Curves 22
2.7 Marginal Rate of Substitution 25
2.7.1 Diminishing MRS 26
2.8 Special Types of Utility Functions 28
2.8.1 Perfect Substitutes 29
2.8.2 Perfect Complements 30
2.8.3 Cobb-Douglas 32
2.8.4 Quasilinear 34
2.8.5 Stone-Geary 35
2.9 A Look at Behavioral Economics—Social Preferences 36
2.9.1 Fehr-Schmidt Social Preferences 36
2.9.2 Bolton and Ockenfels Social Preferences 37
Appendix. Finding the Marginal Rate of Substitution 37
Exercises 38
3 Consumer Choice 45
3.1 Introduction 45
3.2 Budget Constraint 45
3.3 Utility Maximization Problem 49
3.4 Utility Maximization Problem in Extreme Scenarios 55
3.5 Revealed Preference 57
3.6 Kinked Budget Lines 60
3.6.1 Quantity Discounts 60
3.6.2 Introducing Coupons 62
Appendix A. Applying the Lagrange Method to Solve the Utility Maximization
Problem 64
Appendix B. Expenditure Minimization Problem 65
Relationship between the Utility Maximization Problem and the Expenditure
Minimization Problem 68
Exercises 70
4 Substitution and Income Effects 75

4.1 Introduction 75
4.2 Income Changes 75
4.2.1 Using the Derivative of Demand 76
4.2.2 Using Income Elasticity 77
4.2.3 Using the Income-Consumption Curve 79
4.2.4 Using the Engel Curve 80
4.3 Price Changes 82
4.3.1 Using the Derivative of Demand 83
4.3.2 Using the Price-Elasticity of Demand 83
4.3.3 Using Price-Consumption Curves 85
4.4 Income and Substitution Effects 87
4.5 Putting Income and Substitution Effects Together 88
4.5.1 Income and Substitution Effects on the Labor Market 94
Appendix A. Not All Goods Can Be Inferior 97
Appendix B. An Alternative Representation of Income and Substitution Effects 98
Using Elasticities to Represent the Slutsky Equation 101
Exercises 102
5 Measuring Welfare Changes 107

5.1 Introduction 107
5.2 Consumer Surplus 107
5.3 Compensating Variation 110

5.4 Equivalent Variation 114
5.5 Measuring Welfare Changes with No Income Effects 116
Appendix. An Alternative Representation of the Compensating and Equivalent
Variations 120
A.1 Compensating Variation 120
A.2 Equivalent Variation 122
Exercises 124
6 Choice under Uncertainty 127

6.2 Lotteries 128
6.3 Expected Value 128
6.4 Variance 129
6.5 Expected Utility 131
6.6 Risk Attitudes 132
6.6.1 Risk Aversion 132
6.6.2 Risk Loving 134
6.6.3 Risk Neutrality 136
6.7 Measuring Risk 138
6.7.1 Risk Premium 138
6.7.2 Certainty Equivalent 139
6.7.3 Arrow-Pratt Coefficient of Absolute Risk Aversion 140
6.8 A Look at Behavioral Economics—Nonexpected Utility 142
6.8.1 Weighted Utility 144
6.8.2 Prospect Theory 145
Exercises 148
7 Production Functions 155

7.2 Production Function 156
7.3 Marginal and Average Product 157
7.4 Relationship between APL and MPL 161
7.5 Isoquants 163
7.6 Marginal Rate of Technical Substitution 165
7.7 Special Types of Production Functions 168
7.7.1 Linear Production Function 168
7.7.2 Fixed-Proportions Production Function 169
7.7.3 Cobb-Douglas Production Function 170
7.7.4 Constant Elasticity of Substitution Production Function 171
7.8 Returns to Scale 171
7.9 Technological Progress 173
7.9.1 Types of Technological Progress 174
Appendix A. MRTS as the Ratio of Marginal Products 175
Appendix B. Elasticity of Substitution 176
Exercises 180
8 Cost Minimization 183

8.2 Isocost Lines 183
8.3 Cost-Minimization Problem 185
8.4 Input Demands 189
8.4.1 Input Demand—Responses 192
8.5 Cost Functions 193
8.6 Types of Costs 195
8.7 Average and Marginal Cost 198
8.7.1 Output Elasticity to Total Cost 199
8.8 Economies of Scale, Scope, and Experience 201
8.8.1 Economies of Scale 201
8.8.2 Economies of Scope 203
8.8.3 Economies of Experience 205
Appendix. Cost-Minimization Problem—A Lagrangian Analysis 206
Exercises 208
9 Partial and General Equilibrium 213

9.2 Features of Perfectly Competitive Markets 214
9.3 Profit Maximization Problem 214
9.4 Supply Curves 217
9.4.1 Individual Firm Supply 217
9.4.2 Market Supply 220
9.5 Short-Run Supply Curve 221
9.6 Market Equilibrium 224
9.6.1 Short-Run Equilibrium 224
9.6.2 Long-Run Equilibrium 225
9.7 Producer Surplus 226
9.8 General Equilibrium 228
9.8.1 Equilibrium Prices 230
9.8.2 Efficient Allocations 233
9.8.3 Equilibrium versus Efficiency 234
9.8.4 Adding Production to the Economy 239
9.9 A Look at Behavioral Economics—Market Experiments 240
Appendix. Efficient Allocations and Marginal Rate of Substitution 241
Exercises 242
10 Monopoly 247
10.2 Why Do Monopolies Exist? 247
10.3 The Monopolist’s Profit Maximization Problem 249
10.3.1 A Closer Look at Marginal Revenue 250
10.3.2 Solving the Monopolist’s Problem 253
10.4 Common Misunderstandings of Monopoly Markets 255
10.5 The Lerner Index and Inverse Elasticity Pricing Rule 257
10.6 Multiplant Monopoly 260
10.7 Welfare Analysis under Monopoly 263

10.8 Advertising in Monopoly 266
10.9 Monopsony 268
Exercises 271
11 Price Discrimination and Bundling 277

11.2 Price Discrimination 278
11.2.1 First-Degree Price Discrimination 279
11.2.2 Second-Degree Price Discrimination 281
11.2.3 Third-Degree Price Discrimination 284
11.3 Bundling 286
Exercises 291
12 Simultaneous-Move Games 297

12.2 What Is a Game? 298
12.3 Strategic Dominance 300
12.4 Nash Equilibrium 306
12.5 Common Games 310
12.6 Mixed-Strategy Nash Equilibrium 316
12.6.1 Graphical Representation of Best Responses 321
Exercises 323
13 Sequential and Repeated Games 329

13.2 Game Trees 330
13.3 Why Don’t We Just Find the Nash Equilibrium of the Game Tree? 332
13.4 Subgame-Perfect Equilibrium 334
13.4.1 Subgame Perfect Equilibrium in More Involved Games 335
13.5 Repeated Games 340
13.5.1 Finite Repetitions 340
13.5.2 Infinite Repetitions 341
13.6 A Look at Behavioral Economics—Cooperation in the Experimental Lab? 346
Exercises 347
14 Imperfect Competition 355

14.2 Measuring Market Power 356
14.3 Models of Imperfect Competition 357
14.3.1 Cournot Model—Simultaneous Quantity Competition 358
14.3.2 Bertrand Model—Simultaneous Price Competition 365
14.3.3 Cartels and Collusion 369
14.4 Stackelberg Model—Sequential Quantity Competition 373
14.5 Product Differentiation 377
Appendix. Cournot Model with N Firms 380
Exercises 383
15 Games of Incomplete Information and Auctions 391

15.2 Extending Nash Equilibria to Games of Incomplete Information 392
15.3 Auctions 396
15.3.1 Auctions as Allocation Mechanisms 396
15.4 Second-Price Auctions 397
15.5 First-Price Auctions 400
15.5.1 Privately Observed Valuations 400
15.5.2 Equilibrium Bidding in First-Price Auctions 401
15.5.3 Extending the First-Price Auction to N Bidders 405
15.5.4 First-Price Auctions with Risk-Averse Bidders 407
15.6 Efficiency in Auctions 409
15.7 Common-Value Auctions 410
15.8 A Look at Behavioral Economics—Experiments with Auctions 411
Appendix. First-Price Auctions in More General Settings 412
Exercises 414
16 Contract Theory 419

16.2 Moral Hazard 421
16.2.1 Contracts When Effort Is Observable 422
16.2.2 Contracts When Effort Is Unobservable 424
16.2.3 Preventing Moral Hazard 428
16.3 Adverse Selection 428
16.3.1 Market for Lemons 428
16.3.2 Market for Lemons—Symmetric Information 429
16.3.3 Market for Lemons—Asymmetric Information 429
16.3.4 Principal-Agent Model 431
16.3.5 Principal-Agent Model—Symmetric Information 431
16.3.6 Principal-Agent Model—Asymmetric Information 433
16.3.7 Principal-Agent Model—Comparing Information Settings 436
16.3.8 Preventing Adverse Selection 438
Appendix. Showing That PCH and ICL Hold with Equality 439
Exercises 440
17 Externalities and Public Goods 445

17.2 Externalities 445
17.2.1 Unregulated Equilibrium 446
17.2.2 Social Optimum 448
17.3 Restoring the Social Optimum 451
17.3.1 Bargaining between the Affected Parties 451
17.3.2 Government Intervention 453
17.4 Public Goods 455
17.4.1 A Look at Behavioral Economics—Public-Good Experiments 459
17.5 Common-Pool Resources 459

17.5.1 Finding Equilibrium Appropriation 460
17.5.2 Common-Pool Resources—Joint Profit Maximization 462
Exercises 464
References 469
Index 471
Chapter Examples
Chapter 2: Consumer Preferences and Utility
Example 2.1: Monotonic and strictly monotonic preferences. 11

Example 2.2: Nonsatiated preferences. 13
Example 2.3: Utility ranking and increasing transformations of the utility function. 14
Example 2.4: Testing properties of preference relations. 15
Example 2.5: Finding marginal utility, MU. 18
Example 2.6: Diminishing marginal utility. 19
Example 2.7: Finding ICs for two utility functions. 21
Example 2.8: Finding MRS. 27
Chapter 3: Consumer Choice
Example 3.1: UMP with interior solutions–I. 51

Example 3.2: UMP with interior solutions–II. 52
Example 3.3: UMP with corner solutions. 53
Example 3.4: Testing for WARP. 58
Example 3.5: Quantity discounts. 62
Example 3.6: Coupons. 63
Example 3.7: EMP with a Cobb-Douglas utility function. 66
Example 3.8: EMP with a quasilinear utility. 67
Example 3.9: Dual problems. 69
Chapter 4: Substitution and Income Effects
Example 4.1: Increasing income in a Cobb-Douglas utility function. 77

Example 4.2: Finding income elasticity in the Cobb-Douglas scenario. 78
Example 4.3: Finding income-consumption curves. 80
Example 4.4: Finding Engel curves. 81
Example 4.5: Demand and price changes. 83
Example 4.6: Price elasticity and demand. 84
Example 4.7: Finding price-consumption curves. 87
Example 4.8: Finding IE and SE with a Cobb-Douglas utility function. 91

Example 4.9: Finding IE and SE with a quasilinear utility. 93
Example 4.10: Applying the Slutsky equation to the Cobb-Douglas case. 100
Chapter 5: Measuring Welfare Changes
Example 5.1: Finding CS with linear demand. 108

Example 5.2: Finding CS with nonlinear demand. 109
Example 5.3: Finding the CV of a price decrease. 112
Example 5.4: Finding the EV of a price decrease. 115
Example 5.5: CS, CV, and EV with a quasilinear utility function. 117
Example 5.6: An alternative representation of CV. 121
Example 5.7: An alternative representation of EV. 123
Chapter 6: Choice under Uncertainty
Example 6.1: Finding the EV of a lottery. 129

Example 6.2: Finding the variance of a lottery. 130
Example 6.3: Finding the EU of a lottery. 132
Example 6.4: Finding the EU of a lottery under risk-loving preferences. 134
Example 6.5: Finding the EU of a lottery under risk-neutral preferences. 136
Example 6.6: Finding the RP of a lottery. 138
Example 6.7: Measuring RP and CE with other risk attitudes. 140
Example 6.8: Finding the AP coefficient. 141
Example 6.9: The certainty effect. 143
Example 6.10: Weighted utility. 144
Example 6.11: Using WU to explain the certainty effect. 145
Example 6.12: Prospect theory. 147
Example 6.13: Using prospect theory to explain the certainty effect. 148
Chapter 7: Production Functions
Example 7.1: Examples of production functions. 156

Example 7.2: Finding average product. 158
Example 7.3: Finding marginal product. 160
Example 7.4: Relationship between APL and MPL . 162
Example 7.5: Finding isoquant curves for a Cobb-Douglas production function. 165
Example 7.6: Finding the MRTS of a Cobb-Douglas production function. 166
Example 7.7: Finding the MRTS of a linear production function. 167
Example 7.8: Testing for returns to scale. 172
Example 7.9: Testing for technological progress. 174
Example 7.10: Identifying the type of technological progress. 175
Chapter 8: Cost Minimization
Example 8.1: A particular isocost. 185

Example 8.2: Cost minimization with Cobb-Douglas production functions. 188
Example 8.3: Cost minimization with linear production functions. 189
Example 8.4: Finding input demands with a Cobb-Douglas production function. 190
Example 8.5: Finding input demands with a linear production function. 191
Example 8.6: Finding total cost in the Cobb-Douglas case. 194
Example 8.7: Finding total costs in the linear production case. 194
Example 8.8: Comparing long- and short-run costs. 196
Example 8.9: Finding average and marginal cost. 199
Example 8.10: Output elasticity in the Cobb-Douglas case. 200
Example 8.11: Testing for economies of scale. 202
Example 8.12: Economies of scope. 204
Example 8.13: Slope of the experience curve. 205
Chapter 9: Partial and General Equilibrium
Example 9.1: PMP in the Cobb-Douglas case. 216

Example 9.2: Finding the long-run supply curve. 219
Example 9.3: Finding market supply. 220
Example 9.4: Finding the short-run supply curve. 223
Example 9.5: Finding short-run equilibrium output and price. 224
Example 9.6: Finding long-run equilibrium output and price. 225
Example 9.7: Finding producer surplus. 227
Example 9.8: Finding an equilibrium allocation and price. 230
Example 9.9: Finding efficient allocations. 234
Example 9.10: Testing the First Welfare Theorem. 235
Example 9.11: Testing the Second Welfare Theorem. 237
Chapter 10: Monopoly
Example 10.1: Positive and negative effects of selling more units. 251
Example 10.2: Finding marginal revenue with linear demand. 251
Example 10.3: Finding monopoly output with linear demand. 253
Example 10.4: Price elasticity of output qM under a linear demand. 256
Example 10.5: Lerner index with a linear demand. 258
Example 10.6: Lerner index with constant elasticity demand. 259
Example 10.7: Multiplant monopoly. 261
Example 10.8: Finding the deadweight loss of a monopoly. 264
Example 10.9: Finding the monopolist’s optimal advertising ratio. 267
Example 10.10: Finding optimal L in monopsony. 269
Chapter 11: Price Discrimination and Bundling
Example 11.1: First-degree price discrimination. 279

Example 11.2: Second-degree price discrimination. 282
Example 11.3: Third-degree price discrimination. 284
Example 11.4: Bundling. 286
Chapter 12: Simultaneous-Move Games
Example 12.1: Finding strictly dominant strategies. 301

Example 12.2: When IDSDS does not provide a unique equilibrium. 304
Example 12.3: When IDSDS does not have a bite. 305
Example 12.4: Finding best responses and NEs. 308
Example 12.5: Prisoner’s Dilemma game. 310
Example 12.6: Battle of the Sexes game. 312
Example 12.7: Coordination game. 314
Example 12.8: Anticoordination game. 315
Example 12.9: Penalty kicks in soccer. 317
Chapter 13: Sequential and Repeated Games
Example 13.1: Applying NE to the Entry game. 332

Example 13.2: Backward induction in the Entry game. 334
Example 13.3: Applying backward induction in more involved game trees. 336
Example 13.4: Sustaining cooperation with a Grim-Trigger Strategy. 342
Chapter 14: Imperfect Competition
Example 14.1: Cournot model with symmetric costs. 362

Example 14.2: Cournot model with asymmetric costs. 364
Example 14.3: Bertrand model. 368
Example 14.4: Collusion when firms compete in quantities. 369
Example 14.5: Sustaining cooperation within the cartel. 371
Example 14.6: Stackelberg model. 376
Example 14.7: Output competition with product differentiation. 379
Chapter 15: Games of Incomplete Information and Auctions
Example 15.1: Cournot competition, with asymmetric information about costs. 393
Chapter 16: Contract Theory
Example 16.1: Finding optimal contracts when effort is observable. 422

Example 16.2: Finding optimal contracts when effort is unobservable. 425
Example 16.3: Principal-agent problem under symmetric information. 432
Example 16.4: Principal-agent problem under asymmetric information. 436
Chapter 17: Externalities and Public Goods
Example 17.1: Unregulated equilibrium. 446

Example 17.2: Finding the social optimum. 448
Example 17.3: Prohibiting pollution. 450
Example 17.4: Finding optimal emission fees. 453
Example 17.5: Free-riding of public goods. 456
Preface
This textbook offers an introduction to intermediate microeconomic theory for undergraduate
students. Our presentation differs from current intermediate microeconomics
textbooks, such as Besanko and Braeutigam (2013), Varian (2014), Goolsbee, Levitt, and
Syverson (2015), and Perloff (2016), along several dimensions:
• Length. The book is significantly shorter than most current books on this topic, which
often exceed 830 pages. Most current textbooks include lengthy presentations, such as
45-page-long chapters. By providing shorter chapters, we seek to make the material more
attractive to students, who can read each chapter (the material corresponding to approximately
a week of the course) in less than one hour.
• Worked-out examples. Every chapter provides the basic theoretical elements, reducing
them to their main ingredients, and includes several detailed examples and applications.
The chapters also present the intuition behind each mathematical assumption and result.
• Tools. We provide step-by-step tools on how to solve standard exercises, so students can
apply a common approach to solve similar exercises.
• Algebra support and step-by-step calculations. We assume readers have little mathemat-
ical background in algebra and calculus, so we walk them through each algebra step and
simplification, helping them reproduce all the results on their own. From our recent expe-
rience, students’ calculus for this course is appropriate, but their algebra is often rusty.
Hence, we give algebra steps and simplifications, making sure that students can more
easily follow every step and recall basic algebra properties.
• Self-assessment exercises. The book includes 140 self-assessment exercises, which give
readers the opportunity to review concepts from previous examples. These questions
encourage readers to repeat the step-by-step approach presented in each example, consid-
ering slightly new scenarios to gain extra practice. Students can then check their answers
with the Practice Exercises for Intermediate Microeconomic Theory book.
• Practice Exercises for Intermediate Microeconomic Theory. This accompanying book
provides detailed answer keys to all the self-assessment exercises, as well as the
173 odd-numbered end-of-chapter exercises. In addition, it offers step-by-step explanations,
promoting understanding about how students can approach similar exercises on their
own, emphasizing the economic intuition behind the mathematical results. This is, then,
radically different from solution manuals, which rarely provide detailed explanations, are
difficult to read on their own, and are distributed only to instructors. The combination of
both textbooks seeks to help undergraduate students improve both their theoretical and
practical preparation in intermediate microeconomics.
Therefore, this book is especially attractive for students in programs in economics, busi-
ness administration, finance, or related fields in social sciences. Given its step-by-step
approach to examples and intuition, it should be appropriate for Intermediate Micro-
economics courses, with or without calculus, as we kept the amount of calculus to a
minimum.
Organization of the Book
Chapter 1 defines microeconomics, explains how it is used to examine different real-world
problems, and provides an outline of the book. Chapters 2 and 3 are dedicated to consumer
theory, first describing preference relations and utility functions (chapter 2), followed by
a presentation of how individuals choose optimal bundles (chapter 3). We then take a
more applied approach by using the tools presented in previous chapters to examine how
income or price changes affect consumer purchases (chapter 4), and how to evaluate the
welfare gain/loss that consumers experience from a price change (chapter 5). Chapter 6
then investigates how to represent individual attitudes toward risk and uncertainty, and
how to measure different risky situations. Chapters 7 and 8 switch the focus toward the
analysis of firms, first analyzing their production decisions, inputs, and technology
(chapter 7), and then how to represent the firm’s costs and how to minimize them to find the
optimal combination of inputs (chapter 8). Chapters 9 and 10 study two extreme types of markets:
perfectly competitive markets in partial and general equilibrium (chapter 9) and monopolies
where a single firm operates (chapter 10). Chapter 11 expands on the monopolist’s analysis
by considering forms of price discrimination that the monopolist can practice, as well as
bundling.
We then explain some basic game theory in chapter 12 (simultaneous-move games) and
chapter 13 (sequential and repeated games). These two chapters serve as the building blocks
for most of the subsequent chapters, starting with chapter 14, which examines markets with
few firms, either competing in quantities or prices, choosing their actions simultaneously
or sequentially, and selling products that are regarded as homogeneous or heterogeneous
by their customers. Chapter 15 extends the analysis of games to contexts in which one
(or all) players cannot observe some relevant piece of information (games of incomplete
information). Auctions are an interesting application of this type of game, because every
bidder observes her valuation for the object being sold, but cannot observe her rivals’
valuations for the object before submitting her bid.
Chapter 16 examines contract theory and incentives, which are natural applications of
game-theoretic tools as well. Unfortunately, the presentation of this topic in most Inter-
mediate Microeconomics textbooks is either too verbal, and thus does not provide precise
equilibrium results, or too formal and difficult for the average student to grasp. We hope
that this chapter strikes a balance between rigor and intuition. Finally, chapter 17 also uses
the game theory tools practiced in previous chapters, applying them now to the study of
externalities, public goods, and common-pool resources.
How to Use This Textbook
The writing style of the textbook, as well as the possibility of combining it with the Practice
Exercises book, allows for flexible uses by instructors of the following courses:
• Intermediate Microeconomics with Calculus. This book probably fits best with this type
of course. Instructors can recommend most chapters in the book as the main reading ref-
erence for students. In addition, instructors could assign the reading of specific exercises
in the Practice Exercises book, which should help students better understand the appli-
cation of the theoretical foundations, ultimately allowing them to become better prepared
for homework assignments and exams.
• Intermediate Microeconomics without Calculus, or Managerial Economics. The
book is also appropriate for instructors in this type of course because it includes little
calculus, mostly contained in a few worked-out examples and end-of-chapter appendices.
The instructor can also recommend some exercises from the Practice Exercises book, as
some of these exercises do not require the use of calculus tools.
• Introduction to Microeconomics (for students in Honors programs, or with some
mathematical background). The book can also be used in Introductory Microeconomics
courses in programs with students with an algebra background. We believe that students
should be comfortable with the style of this book after just one class in Mathematical
Economics, which they often take in their freshman year of college. Otherwise, students
who took at least one algebra and one calculus course in high school should also be rela-
tively comfortable given our step-by-step approach in all the calculations and worked-out
examples.
• Managerial Economics (Master’s level). For instructors teaching Managerial Economics
courses in Master of Science (MS) programs in finance, business administration,
or related fields, the book can also serve as a direct, easy-to-follow reading reference.
Some of the numerical exercises from the Practice Exercises book can also support this
pedagogical strategy.
The length of this book can be attractive to instructors of the abovementioned courses who
use a “flipped classroom” approach, as students can easily read on their own the main theory
and examples before class, moving activities such as working on exercises and homework to
during class time. It can also be appealing to instructors using “case-based” teaching, who
cover real-life case studies from companies in class, leaving the learning of the main theory
and applications to students at home.
Ancillary Materials
• Practice Exercises for Intermediate Microeconomic Theory. This book includes step-by-
step answer keys with intuitive explanations to all self-assessment exercises and all odd-
numbered exercises at the end of every chapter (313 exercises in total). This can be useful
for students to practice with more exercises, seeing the common approach that we follow
to solve them here.
• Solutions Manual for Intermediate Microeconomic Theory (available only to instructors).
It includes step-by-step answer keys to the 140 self-assessment exercises and 341 end-of-
chapter exercises (481 exercises in total). These exercises are ranked in order of difficulty,
with A next to the title for the easiest exercises (often involving no calculus), B for the
intermediate-level exercises, and C for the most difficult exercises, which require some
calculus or several algebra steps.
• Microsoft PowerPoint slides (available only to instructors). They cover the main topics
in every chapter of the textbook. The slides include all definitions, equations, short
explanations, and figures, thus facilitating class preparation. Slides can also be dis-
tributed to students as a first set of lecture notes that they can complement with in-class
explanations.
Acknowledgments
We would first like to thank several colleagues who encouraged us in the preparation of
this manuscript: Alan Love, Ron Mittelhammer, Jill McCluskey, and Antonio Manresa. We
are, of course, especially grateful to our teachers and advisors at the University of Pitts-
burgh and at the University of Barcelona for instilling a passion for research and teaching
in microeconomics. We are thankful to Emily Taber, Laura Keeler, Melody Negron, and all
the publishing team at the MIT Press, for their constant encouragement and support; and to
several teaching assistants at Washington State University, who helped us with this project
over several years; and to a number of students who provided feedback on earlier versions of
the manuscript: Eric Dunaway, John Strandholm, Pak-Sing Choi, Kiriti Kanjilal, Samantha
Johnson, Hyoenjun Hwang, Loles Garrido, Mursaleen Shiraj, Chelsea Pardini, and Casey
Bolt. Last, but not least, we would like to thank our family and friends for encouraging us
during the preparation of the manuscript.
Ana Espinola-Arredondo and Felix Muñoz-Garcia

School of Economic Sciences, Washington State University, Pullman, WA
1 Introduction
1.1 What Is Microeconomics?
What is microeconomics, anyway? This is often the first question that many nonecono-
mists ask you when you tell them that you are taking a microeconomics course or doing
research in microeconomics. One of our friends humorously says that we study “small
economies,” or goods and services that you can buy with a few cents. But, seriously, what
is microeconomics, and how can we use it?
Microeconomics seeks to understand individual behavior, where we understand the
“individual” to mean a consumer, a firm, a voter, a group of friends, a public official, or a
regulator. This behavior is mainly in economic contexts, but also in other social situations.
Here are some examples:
• Consumers. We seek to study consumer purchasing decisions. If you find out that your
favorite singer will be in town, and that tickets are sold for $45, will you buy one? What
about buying one more ticket so you can invite a friend? The answers to these questions
depend not only on how much you like this singer (a lot!) but also on your budget (whether
you can afford to spend money on one or two tickets this month).
• Firms. We also analyze firms’ input decisions (e.g., how many workers to hire and com-
puters to purchase); how firms use these inputs to produce units of output with different
technologies; how many units each firm chooses to produce; and at what price each firm
sells these units. The answers to these questions will change, of course, depending on how
many other firms compete in that market and whether consumers regard the firm’s goods as
relatively similar to those of its competitors.
• Regulators. Public officials can anticipate how firms and consumers behave in different
markets when there is no regulation. The officials then ask whether policy tools, such as taxes or
quotas on consumers or firms, can be beneficial. We return to these questions at different
points throughout the book.
We investigate the behavior of these agents under the assumption of rationality: each
agent seeks to maximize her payoff (i.e., utility for the consumer, or profits for the firm)
given her resources, and given the information to which she has access. Importantly, this
understanding of rationality is sufficiently broad, so we can consider situations where an
agent seeks to maximize her own material payoff, as well as other contexts where she maxi-
mizes a combination of her own and other agents’ payoffs. In other words, rationality implies
only that the agent seeks to maximize some kind of payoff mix, allowing her to be selfish or
altruistic.
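As a rough illustration of this broad notion of rationality, the sketch below lets an agent maximize a weighted mix of her own and another agent's payoff. The linear weighting, the function names, and the two options with their payoffs are all hypothetical choices made for this example; they are not a model taken from later chapters.

```python
def payoff_mix(own, others, weight_on_others=0.0):
    """Hypothetical payoff mix: a weighted average of own and others' payoffs.

    weight_on_others = 0 describes a purely selfish agent; larger weights
    describe an agent who also cares about the other agent's payoff.
    """
    return (1 - weight_on_others) * own + weight_on_others * others


# Two made-up options, each listed as (own payoff, other agent's payoff).
options = {"keep": (10, 0), "share": (6, 6)}


def best_option(weight_on_others):
    """Return the option that maximizes the agent's payoff mix."""
    return max(
        options,
        key=lambda name: payoff_mix(*options[name], weight_on_others),
    )


selfish_choice = best_option(0.0)     # the agent weighs only her own payoff
altruistic_choice = best_option(0.5)  # the agent weighs both payoffs equally
print(selfish_choice, altruistic_choice)
```

In both cases the agent is rational in the sense used here: she picks the option with the highest payoff mix. Only the weight she places on the other agent's payoff differs.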
We consider consumers with different motivations (often referred to as “other-regarding
motivations” and “biases”) in several sections of the book with the title “A Look at Behavioral
Economics,” where we mainly present alternative theories of consumer behavior that the
literature has proposed in recent decades; see the end of chapters 2, 6, 9, 13, 15, and 17.
1.2 Comparative Statics
Besides analyzing individual behavior under given conditions (e.g., a specific ticket price
of $45), we seek to predict how this behavior varies when some of these conditions change.
Generally, in economics we use the term “comparative statics” for the analysis of how an
individual’s behavior changes when we vary one, and only one, variable (such as the price of an
item). In the above example about concert tickets, if your answer was to purchase two tickets
(one for you and another for a friend) when the price is $45, would you make a different
choice if the ticket price increased to $55?¹
For an example concerning firms, imagine that a technology is discovered that decreases
ice cream production costs. We then seek to understand how much ice cream sellers respond
by lowering their prices, and how the answer to this question changes depending on the
competitive pressures that they face in the industry.
1.3 Overview of the Book
1.3.1 Consumer Theory

Chapters 2–6 examine a consumer’s preferences for different goods, her choices about how
many units of each good to purchase, and how these choices vary when the consumer operates
under uncertainty (e.g., not knowing the exact return she will receive on her investment
portfolio).
1. Alternatively, think about ice cream purchases. If your favorite ice cream brand becomes cheaper, by how much
would you increase your purchases of that brand? What about your purchases of other ice cream brands which
become, in relative terms, more expensive?
In chapter 2, you will find the simplified model that economists use to represent you as
a consumer. Using the notion of a consumption bundle (a list of goods and services), we
analyze how a consumer’s preferences rank different bundles of goods and how to represent
these preferences in a utility function, which helps us measure the consumer’s well-being
from each bundle. We then discuss the various properties that utility functions can satisfy.
Chapter 3 studies the consumer’s optimal purchasing decision. In that regard, we start by
making an obvious, yet important, point: individuals may like more units of all (or at least
some) goods, but they cannot afford all of them! We then describe the budget constraint that
we all face as consumers, which is essentially dictated by prices of goods and our available
income. Given this budget constraint, the consumer’s purchasing decision can be informally
understood as follows:
Buy the bundle that increases my utility as much as possible but…
without breaking the bank!
For compactness, we refer to this purchasing decision as the consumer’s “demand” for
the good.2
In chapter 4, we essentially look at our results from the consumer’s problem (her demand
for the good) and check how it varies when we increase her income by a small amount.
After winning the lottery, you may increase your purchases for most goods (such as a new
house or a nicer car), and yet you may decrease your purchases of some goods, such as fast
food. We then spend some time evaluating how purchases of a good change when its price
experiences a small increase.
Chapter 5 follows up on the discussion in chapter 4, evaluating now the welfare loss that
the consumer suffers once the price of a good increases. For completeness, we present three
common measures that economists use to evaluate this welfare loss: the change in consumer
surplus, the compensating variation, and the equivalent variation. We analyze their similar-
ities and differences, as well as providing several numerical examples to illustrate how you
can apply them to other contexts.
2. We also consider an alternative way to approach consumer purchasing decisions in which, rather than buying
the bundle that maximizes her utility given a budget constraint, the consumer chooses the bundle that minimizes
her budget (her expenditure) while reaching a minimum utility level. As you probably suspect, both approaches to
the consumer’s problem usually yield the same results (i.e., the same optimal bundle).
Previous chapters of this book considered, for simplicity, that the consumer operates
under certainty. Coming back to the ice cream example, the analysis assumed that you
know each flavor. However, what if you are buying an ice cream cone at an unfamiliar
place? Chapter 6 focuses on situations where the consumer faces uncertainty about some
element that affects her utility, such as the ice cream flavor she buys. Another example is
that of accepting a job paying a salary of $60,000 a year with certainty, or working for
a start-up company that will pay you $95,000 if the company makes it to the New York
Stock Exchange (which, according to your information, will happen with a probability of
30 percent) or $15,000 if the company does not (which would occur with the remaining
probability, 70 percent). In this chapter, we ask a simple question: Which job would you
choose? To answer this question, we introduce the “expected utility” that a worker can obtain
from each job. Using the tools that we learned in this chapter, we present different risk
attitudes that an individual may have towards a risky investment (or a risky job at a
start-up company), and finish this chapter by introducing several measures of risk aversion that
economists often use, such as the risk premium of an investment, or its certainty equivalent.
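As a rough numerical sketch of the job comparison above, the following Python snippet computes
the expected salary of the start-up offer and the expected utility of each job. The square-root
utility function is only an assumption made for illustration (the text does not specify one
here), and the function names are hypothetical.

```python
import math

# Start-up job from the text: $95,000 with probability 0.3, $15,000 with probability 0.7.
startup_lottery = [(0.3, 95_000), (0.7, 15_000)]
safe_salary = 60_000  # paid with certainty

def expected_value(lottery):
    """Probability-weighted average of the payoffs."""
    return sum(p * payoff for p, payoff in lottery)

def expected_utility(lottery, utility):
    """Probability-weighted average of the utility of each payoff."""
    return sum(p * utility(payoff) for p, payoff in lottery)

# Assumption: u(w) = sqrt(w), a standard illustration of a risk-averse worker.
u = math.sqrt

ev_startup = expected_value(startup_lottery)       # 0.3 * 95,000 + 0.7 * 15,000
eu_startup = expected_utility(startup_lottery, u)
eu_safe = u(safe_salary)

print(f"Expected salary at the start-up: ${ev_startup:,.0f}")
print(f"Expected utility: start-up {eu_startup:.2f}, safe job {eu_safe:.2f}")
```

With these numbers, the expected salary at the start-up is $39,000, below the $60,000 paid with
certainty, so under the assumed utility function the safe job also yields the higher expected
utility.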
1.3.2 Production Theory

Chapters 7 and 8 focus on firms, rather than consumers. We first analyze a firm’s production
decision, such as its use of inputs (how many workers to hire or machines to purchase) and
its production level (how many units of output to produce). For simplicity, our presentation
is as similar as possible to consumer theory, which should help readers understand that
agents (whether consumers or firms) face relatively similar problems from a mathematical
standpoint: agents seek to maximize their payoff (either utility for the consumer, or profits
for the firm) and face constraints (the budget constraint for the consumer, reflecting all the
bundles she can afford, or technological constraints for the firm, indicating those output
levels the firm can produce given its technology).
While chapter 7 helps us find the production decision that maximizes a firm’s profit, chap-
ter 8 evaluates the cost that the firm incurs from this output decision. We find the units of
each input that the firm hires; its average cost (i.e., cost per unit of output); its marginal cost
(i.e., increase in cost when the firm increases its output by one unit); and how the firm’s
average cost is affected when its scale expands (economies of scale) or when it offers more
product lines (economies of scope).
1.3.3 Markets—Putting Consumers and Producers Together

Chapters 9–11 combine the results from previous chapters about which bundles of goods
consumers purchase and which bundles firms produce, placing these agents into two types
of markets: perfectly competitive markets (chapter 9) and monopolies (chapters 10 and 11).
These are, of course, extreme market structures. Indeed, as covered in chapter 9, perfectly
competitive markets encompass many firms, each producing a small share of industry out-
put. Therefore, when choosing to produce a larger output, every firm can anticipate that
its decision will not affect market prices. In other words, firms are “price takers” as they
take prices as given. In monopoly markets, in contrast, a single firm operates, choosing
the output that maximizes its profits. While firms are price takers in competitive industries,
the monopolist is a “price setter” because its output decision uniquely determines market
price.
We start our analysis of monopolies in chapter 10 with a natural question: why do monop-
olies exist? Using this question as a motivation, we examine the optimal output decision for
the monopolist. We then apply our analysis to multiplant monopolies, in which a firm is
the only seller of a product, which is made at two or more plants. Finally, we evaluate the
welfare loss that society experiences when an industry is monopolized rather than operated
under perfect competition.
Chapter 11 expands our analysis of monopolistic firms by asking a provocative question:
how can monopolies further increase their profits? As we discuss, this firm can practice three
forms of price discrimination, which informally can be understood as charging different
prices to different types of customers. We also explore other tools the monopolist can use to
increase its profits: advertising, which makes its product known to a larger pool of customers
(chapter 10); and bundling, where the firm offers customers a “bundled product” (such as a
PC tower and monitor).
1.3.4 Strategy—Let’s Play Games!

Previous chapters have considered extreme types of industries, such as perfectly competi-
tive markets, where the output decision of an individual firm does not affect market prices,
and thus does not affect the profits of other firms in the same industry; and monopolies,
where a single firm sells its product, and thus is not affected by competition. In subsequent
chapters of the book, we analyze other, less extreme industries, where a few firms compete
against each other. Importantly, their competition gives rise to strategic effects. Consider,
for instance, an industry with two firms. If one of them increases its output, it can sell more
units, but now the market becomes a bit more flooded with products, decreasing market
prices and ultimately reducing its rival’s profit.
Generally, every industry having more than one firm (but not an infinite number of firms)
will experience these strategic effects from firms’ interactions: the actions of one firm affect
its rivals’ profits! Before exploring these types of markets, we must equip ourselves with
the tools to better understand strategic interactions among firms.
The branch of economics that studies strategic behaviors is known as “game theory”
because it examines the interactions among “players” (such as firms, consumers, or gov-
ernments) in situations where the action of one player affects the payoffs of other players.
Chapters 12 and 13 present us with these tools, starting with games in which all play-
ers (e.g., firms) choose their actions (e.g., output levels) simultaneously (chapter 12); and
continuing with games in which players act sequentially (chapter 13). For all the games
we analyze, we seek to predict how players behave so that we can anticipate which is the
“equilibrium behavior” in each game.
1.3.5 Putting Game Theory to Work

Chapter 14 uses game theory tools from chapters 12 and 13 to study industries with a limited
number of firms (i.e., imperfectly competitive markets). We start analyzing markets in which
firms simultaneously choose their actions, either competing in price or in quantity. We then
move on to examine industries where firms act sequentially, and extend our results to settings
where firms sell products that consumers regard as close (but not perfect) substitutes.
Chapter 15 extends many of the game theory tools from previous chapters to situations
with incomplete information (i.e., contexts where one player has more information than its
rivals). A common example is that of a firm that observes its own production cost while its
rivals cannot perfectly observe it. Another typical example is an auction where, as a bidder, you
know how much you are willing to pay for the object on sale (e.g., a Picasso painting), but
you do not get to know how much other bidders are willing to pay for it. We dedicate special
attention to different auction formats, such as first-price auctions (where the winning bidder
pays the highest bid) and second-price auctions (where the winning bidder only pays the
second highest bid). We then analyze what the optimal bidding strategy is for each one (i.e.,
how much money you should bid if you participate in one of these auctions).
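The payment rules of the two auction formats described above can be sketched in a few lines of
Python. This is only an illustration: the bidder names, the bids, and the tie-breaking rule
(the first bidder listed wins a tie) are assumptions, and the snippet says nothing about how
much each bidder should bid.

```python
def auction_outcome(bids, auction_format):
    """Return (winner, price) for a sealed-bid auction.

    bids: dict mapping bidder name -> bid.
    auction_format: "first-price" or "second-price".
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Rank bidders from highest to lowest bid (ties keep their listed order).
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, highest_bid = ranked[0]
    second_highest_bid = ranked[1][1]
    if auction_format == "first-price":
        price = highest_bid          # winner pays her own (highest) bid
    elif auction_format == "second-price":
        price = second_highest_bid   # winner pays the second-highest bid
    else:
        raise ValueError(f"unknown format: {auction_format}")
    return winner, price

bids = {"Ann": 120, "Bob": 100, "Cat": 80}  # hypothetical bids
print(auction_outcome(bids, "first-price"))   # ('Ann', 120)
print(auction_outcome(bids, "second-price"))  # ('Ann', 100)
```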
1.3.6 More Market Failures—When Markets Work Well and When They Don’t
In our analysis of monopolies, we highlight the fact that they reduce social welfare relative
to what arises under a competitive industry. However, is this the only case of a market fail-
ure (i.e., a market producing less-than-optimal outcomes)? In the final chapters of the book,
we provide a negative answer to that question, as we identify several other contexts where
market failures exist. Specifically, we use game theory again to analyze two other mar-
ket failures: those emerging in contracts where one party is better informed than the other
(chapter 16), and those arising in situations where the actions of one agent produce external
effects on another agent’s well-being (chapter 17). In this final chapter, we also examine
sustainability issues in common-pool resources, such as a fishing ground or a forest. In a
short-run analysis, where agents ignore the long-term effects of their behavior, they may
choose to exploit the resource intensively. However, when agents consider these long-run
effects, their optimal behavior dictates a less intense exploitation.
2 Consumer Preferences and Utility
2.1 Introduction
In this chapter, we explore how to represent consumer preferences for different goods. We
start by discussing properties of consumer preferences, such as wanting more units of some
goods, and then mathematically describe how to measure the satisfaction that a person
enjoys with a utility function. Next, we analyze this function in detail, discussing how to
represent preferences over various types of goods, such as goods regarded as substitutes
or complements by the consumer. We also discuss other types of utility functions often
used in economic applications, such as the Cobb-Douglas utility function, the Stone-Geary
utility function, and quasilinear utility functions. We also discuss how to depict these utility
functions graphically. Finally, we describe utility functions representing “social” rather than
“selfish” preferences.
2.2 Bundles
In this chapter, we describe how to represent preferences over bundles of goods and ser-
vices that the individual considers consuming. First, we need to define what we mean by a
“bundle.”
Bundle A list of goods and services.
For instance, if an individual consumes only two goods (apples and oranges, represented
by goods x and y, respectively) a bundle could be A = (40, 30), indicating that the individual
consumes x = 40 apples and y = 30 oranges. Figure 2.1 represents apples on the horizontal
axis and oranges on the vertical axis, implying that bundles can be depicted by a point in
the positive quadrant. Bundle A = (40, 30) would then have a length of 40 on the x-axis, and
a height of 30 on the y-axis.
Figure 2.1
Bundle A = (40, 30), plotted with apples (x) on the horizontal axis and oranges (y) on the vertical axis.
2.3 Preferences for Bundles
We can now start our analysis of consumer preferences over bundles, which will help us
understand how a consumer ranks different bundles. For instance, a consumer might prefer
bundles with more units of all goods. However, she may dislike other goods (i.e., “bads”),
such as garbage or pollution, thus preferring bundles with the smallest possible amount of
those goods. Most of the examples in this book will nonetheless consider goods rather than
bads, unless otherwise stated.
We next provide a list of properties satisfied by most of the preference relations we examine
in this and future chapters. We refer to two bundles, A and B, each with units of goods
x and y, A = (xA, yA) and B = (xB, yB). Each bundle could then be depicted as a point in
figure 2.1. In addition, our explanations use the following notation:
• A ≻ B denotes that the individual “strictly prefers” bundle A to B (so “strictly” rules out
the possibility that she is indifferent between the two bundles).
• A ∼ B means that the individual is indifferent between bundles A and B (i.e., she is equally
happy with either of them).
• A ≿ B denotes that the individual “weakly prefers” bundle A to B, which allows her to be
indifferent between the two bundles or to strictly prefer A to B.
Next, we describe our first property on preferences.
Completeness A preference relation is complete if the consumer has the ability to
compare every two bundles A and B. Formally, the consumer strictly prefers bundle
A to B (represented as A ≻ B), strictly prefers bundle B to A (denoted as B ≻ A), or is
indifferent between these two bundles (represented as A ∼ B).
That is, we do not allow the consumer to respond, “I don’t know how to compare these two
bundles!” While we have all found ourselves in situations where comparing two completely
new options was rather difficult (think about the last time you ordered food in an ethnic
restaurant with which you were unfamiliar), completeness implies that the consumer has
enough time to be able to compare and rank the two bundles. In other words, completeness
requires only that the individual is capable of ranking bundles (allowing her to prefer one
over the other, or be indifferent between them), and does not allow her to be unable to
compare both bundles.
Transitivity For every three bundles A, B, and C, if the consumer prefers A to B
(A ≿ B), and B to C (B ≿ C), then she must also prefer A to C (A ≿ C).
Intuitively, if the consumer prefers a first option to a second option, and the second option
to a third, it must be that she prefers the first to the third. A consumer with intransitive
preferences would have A ≿ B and B ≿ C (the same premise as in the previous property), but
state that C ≻ A (the opposite conclusion to that of the previous definition). Hence, her
preferences would exhibit a cycle, which becomes evident when we summarize all the previous
information as follows:
A ≿ B ≿ C ≻ A,
as she both starts and finishes at bundle A. Importantly, individuals with intransitive
preferences would be subject to exploitation, as we discuss next.
Exploitation of intransitive individuals Consider three goods, an orange, an apple, and a
banana, and a consumer with the following preferences:
Orange ≻ Banana and Banana ≻ Apple, but Apple ≻ Orange,
where her preference Apple ≻ Orange violates transitivity. For her preferences to satisfy
transitivity, we would need her to prefer the opposite, Orange ≻ Apple.
To illustrate this point, assume that the individual initially owns an orange, and she plays a
game with a fruit seller. If the fruit seller gives the individual her preferred fruit, she pays $1.
In this scenario, a seller could offer her an apple for $1, which the individual would accept
because she strictly prefers the apple to the orange, Apple ≻ Orange. Once the consumer has
the apple, the seller could approach her again, offering a banana for $1, which she would
also accept because she strictly prefers a banana to an apple, Banana ≻ Apple. Now that the
consumer owns a banana, the seller could approach her again, offering an orange for $1,
which she would accept, given that she reported Orange ≻ Banana. At this point, she has
the same fruit as at the beginning of the exchange (orange), but has lost $3 in the process
due to her intransitive preferences.
Of course, the seller can start the process all over again and continue it ad infinitum, taking
all the money from the consumer. As a result, individuals with intransitive preferences would
be subject to exploitation by sellers (or heartless microeconomics students!), and ultimately
be eliminated from the marketplace. Given this rationale, transitivity does not seem to be a
very demanding property for preferences to satisfy.
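The exploitation argument above can be traced step by step with a short Python sketch. The
function name and the $1 fee per trade follow the story in the text; everything else (how
preferences are stored, the order of the offers) is an illustrative assumption.

```python
# Strict preferences from the example, stored as (preferred, less_preferred) pairs.
# Together they form a cycle, which is what violates transitivity.
strictly_prefers = {("Orange", "Banana"), ("Banana", "Apple"), ("Apple", "Orange")}

def money_pump(start_fruit, offers, fee=1):
    """Offer fruits in sequence; the consumer pays `fee` whenever she strictly
    prefers the offered fruit to the one she currently holds."""
    holding, paid = start_fruit, 0
    for offered in offers:
        if (offered, holding) in strictly_prefers:
            holding, paid = offered, paid + fee
    return holding, paid

# One round of the seller's offers, as in the text.
fruit, total_paid = money_pump("Orange", ["Apple", "Banana", "Orange"])
print(fruit, total_paid)  # Orange 3

# Repeating the same round ten times returns her to the orange, $30 poorer.
print(money_pump("Orange", ["Apple", "Banana", "Orange"] * 10))  # ('Orange', 30)
```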
2.3.1 Ranking Bundles with More Units

The next two properties (strict monotonicity and monotonicity) describe how an individual
ranks a bundle that contains more units of one good, or more units of all goods, than another
bundle.
Strict monotonicity Consider an initial bundle A and a new bundle B, where bundle
B has the same amount of good x as bundle A (xB = xA), but it contains more units of
good y (yB > yA). We say that a consumer’s preferences satisfy strict monotonicity if
she strictly prefers B to A (B ≻ A).
Therefore, increasing the units of even a single good, as we do with good y in bundle B,
produces a new bundle that is strictly preferred to the original bundle A. Informally, strict
monotonicity can be understood as “more is strictly better” (or “more of anything is strictly
preferred”) because the consumer prefers bundles containing more units of any good.
We next explore a weaker version of strict monotonicity, which allows the consumer to
be indifferent between the new bundle B and the original bundle A.
Monotonicity Consider an initial bundle A and new bundles B and C, where bundle
B has the same amount of good x as bundle A (xB = xA), but it contains more units of
good y (yB > yA), whereas bundle C has more units of both goods than bundle A does
(i.e., xC > xA and yC > yA). We say that a consumer’s preferences satisfy monotonicity
if she weakly prefers B to A (B ≿ A), but strictly prefers C to A (C ≻ A).
Intuitively, with monotonicity, the consumer can be indifferent between bundles A and B,
despite B containing more units of good y than bundle A does. With strict monotonicity,
however, increasing the number of units of any good was strictly preferred. Informally,
monotonicity can be interpreted as “more is weakly better” because the consumer is either
indifferent about receiving a bundle that contains more units of at least one good, or strictly
better off (but never worse off!). Furthermore, monotonicity states that, if the amounts of all
goods are higher, as in bundle C, then the consumer is strictly better off. Monotonicity then
says, informally, that “more of everything is strictly preferred,” whereas strict monotonicity
says that “more of anything is strictly preferred.”
Example 2.1: Monotonic and strictly monotonic preferences Consider the
following scenario. We present two bundles A = (2, 3) and B = (2, 4) to Eric, an
undergraduate student in your Microeconomics class. While the amount of good x is the
same in both bundles, there is more of good y in bundle B than in A. We then ask him
which bundle he prefers. He responds that he strictly prefers bundle B to A, which we
write as B ≻ A. If this ranking holds for any two bundles we present to him, where
only one of the two goods is increased, we can say that his preferences are strictly
monotonic. Relative to bundle A, bundle B only increased the amount of good y, and
that was enough for Eric to strictly prefer B to A.
We then present the same two bundles to Chelsea, a classmate of Eric, who reports
being indifferent between bundles B and A (i.e., B ∼ A). However, if we replace bundle
B with bundle C = (3, 4), Chelsea tells us that she strictly prefers C to A (i.e., C ≻ A).
If this ranking holds for any two bundles in which one has more units of all goods
than the other, then we can say that her preferences satisfy monotonicity.
Intuitively, bundle C has more units of both goods x and y than A does, leading
Chelsea to say that C ≻ A. In contrast, increasing the amount of only good y (as we
did in bundle B) made her indifferent between the two bundles (B ∼ A).
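As a small sketch of the rankings in Example 2.1, the Python snippet below uses utility
functions (a tool introduced later in this chapter) to reproduce Eric’s and Chelsea’s answers.
The two functional forms are assumptions chosen only because they generate the rankings
reported in the example; the text does not say which functions represent their preferences.

```python
def u_eric(x, y):
    # Assumption: utility is strictly increasing in each good separately.
    return x + y

def u_chelsea(x, y):
    # Assumption: utility rises only when both goods increase together.
    return min(x, y)

A, B, C = (2, 3), (2, 4), (3, 4)

def compare(u, first, second):
    """Describe how `first` ranks relative to `second` under utility function u."""
    u1, u2 = u(*first), u(*second)
    if u1 > u2:
        return "strictly preferred to"
    if u1 == u2:
        return "indifferent to"
    return "strictly worse than"

print("Eric:    B is", compare(u_eric, B, A), "A")
print("Chelsea: B is", compare(u_chelsea, B, A), "A")
print("Chelsea: C is", compare(u_chelsea, C, A), "A")
```

Checking three bundles does not prove either property, since monotonicity and strict
monotonicity must hold for every pair of bundles of the relevant type; the snippet only
confirms that the assumed functions match the answers reported in the example.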
We next present the first of the book’s “self-assessments,” short questions that check
your understanding by changing one of the features of a previous example. We strongly
encourage you to work on these questions. You can check your answers with the Practice
Exercises in Intermediate Microeconomic Theory book, which includes detailed answer keys
to all these questions, along with the odd-numbered exercises at the end of every chapter.
Self-assessment 2.1 Consider now that Eric prefers bundle A = (1, 1) to B = (2, 1).
Assume that this ranking holds for any two bundles we present to him, where only the
amount of good x increases from bundle A to B. Are Eric’s preferences monotonic?
Are they strictly monotonic? What if he prefers bundle A = (3, 3) to B = (2, 2), and a
similar ranking holds for any two bundles where the amount of both goods decreases
from A to B?
Strict monotonicity implies monotonicity. From the previous discussion, if a consumer
becomes strictly better off when we increase any one of the goods she consumes, then she is
not worse off when one good increases and is strictly better off when all goods increase,
which is what we need for her preferences to satisfy monotonicity. That is, if a consumer’s
preferences satisfy strict monotonicity, then they also satisfy monotonicity:
Strict monotonicity ⟹ Monotonicity

In addition, monotonicity and strict monotonicity require that the consumer regards all
items in her bundle as goods rather than bads (such as pollution or garbage). To see this,
recall that if some good were a bad, increasing the number of units in the initial bundle A
would produce a new bundle B that would be less preferred than the original bundle A, thus
violating the definitions of monotonicity and strict monotonicity.
To better understand this type of preferences, the next property allows for bads.
2.3.2 Satiation and Bliss Points
Nonsatiation Preferences satisfy nonsatiation if, for every bundle A, we can find
another bundle B for which the consumer is strictly better off. Formally, for every
bundle A, there is another bundle B for which B ≻ A.
Intuitively, nonsatiation means that there is no “bliss bundle” where the consumer cannot
be made any happier by consuming an alternative bundle.1
An alternative explanation of nonsatiation is the following: For every bundle A, we can
always find another bundle B that is strictly preferred to A. Importantly, this definition allows
us to search for the “more preferred” bundle B anywhere we need to.2 Therefore, nonsatia-
tion allows the consumer to regard some goods as “bads,” as opposed to monotonicity (strict
monotonicity), where more units of all goods (at least one good) are desirable. Starting from
an initial bundle A, the consumer can identify other bundles preferred to A, such as B, that
contain more units of one of the goods (e.g., food) but fewer units of the other good (e.g.,
pollution or garbage). Nonsatiation only requires the consumer to, essentially, always find
more preferred bundles.
Lastly, note that if a consumer’s preferences satisfy monotonicity, they also satisfy
nonsatiation:
Monotonicity ⟹ Nonsatiation.
Recall that, by definition, monotonicity requires that, starting from any bundle A, we can
increase the amount of all goods (creating a new bundle B) and make the individual better
off. Therefore, the consumer is not satiated at bundle A, because we can still find other
bundles, such as B, which make her better off.
1. As a consequence, the utility function representing these preferences (a topic we discuss later in the chapter)
cannot have a maximum, as that would be a satiation (i.e., bliss) point. In most applications, we guarantee this
requirement by asking the consumer to choose a bundle from an affordable (or feasible) set of bundles where no
bliss point exists.
2. That is, we can search for the “more preferred” bundle B to the northeast of A in figure 2.1 (i.e., bundles with
more units of both goods than bundle A); to the southeast of A (i.e., bundles with more units of good x, but fewer
units of y); and to the northwest of A (i.e., bundles with more units of good y, but fewer units of x).
The opposite relationship, however, does not necessarily hold; that is,
Monotonicity ⇍ Nonsatiation.
We can find consumers whose preferences satisfy nonsatiation but violate monotonicity, as
the following example illustrates.
Example 2.2: Nonsatiated preferences Consider again your classmate Eric. We
present bundles A = (2, 3) and D = (2, 1) to him, as depicted in figure 2.2.
After asking which bundle he prefers, Eric responds that he strictly prefers bundle
D to A, which we write as D ≻ A. In addition, he says that, for any bundle we show him
(including D), he can name another bundle that would make him even happier. We can
then conclude that his preferences satisfy nonsatiation, but violate monotonicity. Why?
First, note that, relative to bundle A, bundle D
decreased the amount of good y, keeping good x unaffected. If Eric strictly prefers
bundle D, it must be that he regards good y as a bad, seeking to reduce the amount of
it that he consumes.
Figure 2.2
Bundles A = (2, 3) and D = (2, 1), plotted with good x on the horizontal axis and good y on the vertical axis.
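To make Example 2.2 concrete, the following Python sketch uses a utility function in which
good y enters as a bad. The functional form u(x, y) = x − y is an assumption chosen for
illustration; the text only tells us that Eric strictly prefers D to A.

```python
def u(x, y):
    # Assumption: good x is desirable and good y is a "bad".
    return x - y

A, D = (2, 3), (2, 1)
print(u(*D) > u(*A))  # True: D is strictly preferred to A

def better_bundle(bundle):
    """A bundle that is strictly preferred to `bundle`: one more unit of x."""
    x, y = bundle
    return (x + 1, y)

def more_y_hurts(bundle):
    """True if adding one unit of y alone makes the consumer strictly worse off,
    which is incompatible with monotonicity."""
    x, y = bundle
    return u(x, y + 1) < u(x, y)

# For each sampled bundle we can find a strictly better one (nonsatiation),
# yet more of good y alone lowers utility (monotonicity fails).
for bundle in [A, D, (0, 0), (10, 5)]:
    print(bundle, u(*better_bundle(bundle)) > u(*bundle), more_y_hurts(bundle))
```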
Self-assessment 2.2 Assume that Eric prefers bundles with more units of good x
but fewer units of y. If he cannot consume negative amounts of either good, can you
find a bliss point (i.e., a bundle where he is satiated)? Given your answer, do Eric’s
preferences satisfy nonsatiation? What about monotonicity?
In the next section, we describe how to represent consumer preferences using utility func-
tions, and then provide examples of utility functions where some of (or all) these proper-
ties hold.
2.4 Utility Functions
We use utility functions to mathematically represent consumer preferences, as defined next.
Utility function The level of satisfaction that an individual enjoys from consuming
a bundle of goods.
For instance, if the individual consumes bundle A = (40, 30) and her utility function is
u(x, y) = 3x + 5y, we can evaluate this utility function at bundle A to obtain a utility level of
u(40, 30) = (3 × 40) + (5 × 30) = 270.
Importantly, the utility level that we obtain from bundle A (270 in the previous example)
is not as important as the ranking of utilities across bundles. In other words,
only the utility ranking matters, a property often known as “ordinality” because it focuses on
how the consumer orders bundles. In contrast, the specific utility level that the consumer
reaches with each bundle does not matter; treating these levels as meaningful is referred to
as “cardinality.” The following example illustrates this point.
Example 2.3: Utility ranking and increasing transformations of the utility function
Consider utility function u(x, y) = xy. Bundle A = (40, 30) produces in this
context a utility level of u(40, 30) = 1,200, while a new bundle B = (20, 30) generates
a lower utility level of u(20, 30) = 600, implying that the individual prefers bundle A
to B (A ≻ B). Imagine now that, rather than using utility function u(x, y) = xy to
represent the preferences of this consumer, we use v(x, y) = 3xy + 8, which is just an
increasing transformation of the original utility function u(x, y).3
In this situation, bundle A yields a utility level of v(40, 30) = 3,608, whereas
bundle B still generates a lower utility level of v(20, 30) = 1,808, entailing that the
individual still prefers bundle A to B (A ≻ B). In summary, a consumer’s preference
over bundles A and B is unaffected (i.e., her ranking of A and B does not change)
if we apply an increasing transformation to her initial utility function. These
increasing transformations are also known as “monotonic” transformations because
v(x, y) is monotonically increasing in u(x, y), so the ordering of utility levels across
bundles is preserved.
3. Graphically, you can interpret function v(x, y) = 3u(x, y) + 8 as an upward shift of the initial function u(x, y)
originating at 8 and increasing its slope by a factor of 3. Other increasing transformations include
v(x, y) = au(x, y) + b, where a and b are positive constants; v(x, y) = u(x, y)^2 (when utility levels are
nonnegative, as with u(x, y) = xy); and, more generally, functions where v(x, y) is increasing in u(x, y).
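The calculations in Example 2.3 can be reproduced with the short Python sketch below, which
also checks, over a small grid of bundles, that u and v order every pair of bundles in the same
way. The grid and the helper function same_ranking are hypothetical choices made for
illustration, as is the function w, which is included only as a contrast.

```python
def u(x, y):
    return x * y

def v(x, y):
    # Increasing transformation from Example 2.3: v = 3u + 8.
    return 3 * u(x, y) + 8

A, B = (40, 30), (20, 30)
print(u(*A), u(*B))  # 1200 600
print(v(*A), v(*B))  # 3608 1808

def same_ranking(f, g, bundles):
    """True if f and g order every pair of bundles in the same way."""
    def sign(t):
        return (t > 0) - (t < 0)
    return all(sign(f(*a) - f(*b)) == sign(g(*a) - g(*b))
               for a in bundles for b in bundles)

grid = [(x, y) for x in range(6) for y in range(6)]  # hypothetical test grid
print(same_ranking(u, v, grid))  # True

def w(x, y):
    # Not an increasing transformation of u: it reverses the ranking.
    return -u(x, y)

print(same_ranking(u, w, grid))  # False
```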
Self-assessment 2.3 Consider again bundles A = (40, 30) and B = (20, 30) from
Example 2.3, and assume that Chelsea’s utility function is u(x, y) = 2x + 3y. Does she
prefer bundle A, or B, or is she indifferent between them? What if her preferences are
represented with utility function v(x, y) = 4x + 6y? What if they are represented with
v′(x, y) = 4x − 6y? Hint: Utility function v′(x, y) is not an increasing transformation
of Chelsea’s original utility function u(x, y).
Next, for practice, we consider a specific utility function and test the above properties of
preference relations.
Example 2.4: Testing properties of preference relations Consider again the utility
function u(x, y) = xy from Example 2.3. Let us check if the preference relation
that this utility function represents satisfies (a) completeness, (b) transitivity, (c) strict
monotonicity, (d) monotonicity, and (e) nonsatiation.
Completeness. For every two bundles A = (xA, yA) and B = (xB, yB), completeness
holds when either u(xA, yA) ≥ u(xB, yB), u(xB, yB) ≥ u(xA, yA), or both (thus implying
u(xA, yA) = u(xB, yB)). This indeed holds because the utility level that we obtain from
bundle A, u(xA, yA), is a real number (e.g., 1,200 as in Example 2.3), and so is the utility
level that we obtain from bundle B, u(xB, yB). We can then verify this by comparing
these numbers and showing that either u(xA, yA) ≥ u(xB, yB), u(xB, yB) ≥ u(xA, yA), or
u(xA, yA) = u(xB, yB).4
Transitivity. For every three bundles A, B, and C, where u(xA, yA) ≥ u(xB, yB) and
u(xB, yB) ≥ u(xC, yC), transitivity holds when u(xA, yA) ≥ u(xC, yC). This follows the
same argument as with completeness: since utility levels are real numbers,
u(xA, yA) ≥ u(xB, yB) and u(xB, yB) ≥ u(xC, yC) imply that u(xA, yA) ≥ u(xC, yC) must also
hold. For example, if u(xA, yA) = 1,200, u(xB, yB) = 600, and u(xC, yC) = 300, we
know that 1,200 ≥ 600, 600 ≥ 300, and 1,200 ≥ 300, thus implying that transitivity
is satisfied.
4. For the bundles in Example 2.3, it is easy to check that u(xA , yA ) ≥ u(xB , yB ) because 1, 200 > 600, but a similar
argument applies to any pair of bundles A and B.
Strict monotonicity. For this property to hold, we need utility function u(x, y) = xy
to be strictly increasing in both goods. (Recall that consumers with strictly monotonic
preferences prefer bundles with more units of any good.) We can formally check
this by differentiating the utility function with respect to x, with respect to y, and
confirming that these derivatives are positive. That is, for this example, we need:
$$\frac{\partial u(x, y)}{\partial x} = y \quad \text{and} \quad \frac{\partial u(x, y)}{\partial y} = x.$$
Therefore, we can say that increasing the units of good x produces a strict increase
in the consumer’s utility level, so long as she consumes positive units of good y (i.e., if
y > 0). If, instead, she does not consume good y at all, y = 0, increasing good x does
not alter the individual’s utility level. Therefore, strict monotonicity does not hold
because an increase in good x does not necessarily increase the consumer’s utility.⁵
Monotonicity. For this property to hold, we need the utility function u(x, y) = xy to
be weakly increasing in both goods. That is, separately increasing the amount of either
good does not hurt. From the previous discussion, we know that an increase in good x
either produces a strict increase in the consumer’s utility (which occurs when y > 0)
or does not affect her utility (when y = 0), but it never reduces her utility. A similar
argument applies for an increase in good y, which never produces a decrease in her
utility level. Lastly, an increase in both x and y produces a new bundle that generates
a strictly greater utility. To see this, consider that good x is increased by a > 0 units
(from x to x + a units) and good y is increased by b > 0 units (from y to y + b units).
This yields a utility level of
u(x + a, y + b) = (x + a)(y + b),

which strictly exceeds the utility of the original bundle u(x, y) = xy for any positive
increments a and b.
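To make this comparison explicit, the product can be expanded term by term. The following is a short worked sketch, assuming nonnegative amounts x, y ≥ 0 and positive increments a, b > 0, as in the discussion above:

```latex
\begin{align*}
u(x + a, y + b) - u(x, y) &= (x + a)(y + b) - xy \\
                          &= bx + ay + ab \\
                          &\geq ab > 0 .
\end{align*}
```

Because the term ab is strictly positive even when x = 0 or y = 0, the new bundle yields a strictly higher utility than the original one.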
Nonsatiation. This property holds by monotonicity. Indeed, we found that increas-
ing the amounts of both goods produces a new bundle (x + a, y + b) that is strictly
preferred to the original bundle (x, y); this result holds for any original bundle (x, y)
that we analyze. In other words, starting from any bundle (x, y), we can always find
another bundle, such as (x + a, y + b), for which the consumer is better off. As a con-
sequence, the consumer is never satiated (i.e., she does not reach a bliss point), as
required by nonsatiation.
5. A similar argument applies to the effect of increasing y, which strictly increases the consumer’s utility level
(does not modify her utility level) if she consumes a positive amount of good x (no units of good x at all).
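Table 2.1
Utility functions and their properties.

| Utility Function          | Completeness | Transitivity | Strict Monotonicity | Monotonicity | Nonsatiation |
|---------------------------|--------------|--------------|---------------------|--------------|--------------|
| u(x, y) = by              | √            | √            | X                   | √            | √            |
| u(x, y) = ax              | √            | √            | X                   | √            | √            |
| u(x, y) = ax − by         | √            | √            | X                   | X            | √            |
| u(x, y) = ax + by         | √            | √            | √                   | √            | √            |
| u(x, y) = A min{ax, by}   | √            | √            | X                   | √            | √            |
| u(x, y) = Ax^α y^β        | √            | √            | X                   | √            | √            |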
Table 2.1 shows which of these five properties hold for common utility functions, where
we assume that parameters a, b, A, α, and β are all strictly positive. You can replicate the
steps in example 2.4 for each utility function to identify which properties hold, and con-
firm that you obtain the same results as in table 2.1. Let us briefly describe the intuition
behind each utility function and the goods that each of them normally represents. First, util-
ity function u(x, y) = by considers an individual who regards good x as irrelevant, and thus
increasing x has no effect on her utility; u(x, y) = ax exhibits a similar pattern, but instead
good y is now irrelevant. In utility function u(x, y) = ax − by, good y is a bad because increas-
ing y decreases her utility. The other utility functions represent goods that the consumer
regards as perfect substitutes, u(x, y) = ax + by; complements, u(x, y) = A min{ax, by}; or
neither perfect substitutes nor complements, $u(x, y) = Ax^{\alpha} y^{\beta}$. These are explained in detail
in the last sections of this chapter.
Following the discussion in section 2.2, if a preference relation satisfies strict monotonicity,
it then satisfies monotonicity and nonsatiation, such as in utility function u(x, y) = ax + by
(the only function in table 2.1 for which all five properties hold). Similarly, if a utility function
satisfies monotonicity, then it must satisfy nonsatiation, which is the case for the other monotonic
utility functions in the table. The converse of these relationships is
not necessarily true, however. In other words, a preference relation can satisfy monotonicity,
but not necessarily strict monotonicity, such as in $u(x, y) = A \min\{ax, by\}$.
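The monotonicity columns of table 2.1 can be explored numerically. The sketch below is only a grid check, not a proof: it assumes illustrative parameter values (a = 1, b = 2, A = 1, α = β = 1/2) and tests whether adding one unit of a single good never lowers utility (monotonicity) or always raises it (strict monotonicity) on a small grid of bundles. All names are hypothetical.

```python
from itertools import product

# Illustrative parameter values (assumptions, not taken from the text).
a, b, A, alpha, beta = 1, 2, 1, 0.5, 0.5

utilities = {
    "by": lambda x, y: b * y,
    "ax": lambda x, y: a * x,
    "ax - by": lambda x, y: a * x - b * y,
    "ax + by": lambda x, y: a * x + b * y,
    "A min{ax, by}": lambda x, y: A * min(a * x, b * y),
    "A x^alpha y^beta": lambda x, y: A * x**alpha * y**beta,
}


def check(u, grid=range(0, 6), step=1):
    """Return (monotonic, strictly_monotonic) on a small grid of bundles.

    Monotonic: adding `step` units of either good never lowers utility.
    Strictly monotonic: adding `step` units of either good always raises it.
    """
    weak, strict = True, True
    for x, y in product(grid, repeat=2):
        for nx, ny in ((x + step, y), (x, y + step)):
            weak = weak and u(nx, ny) >= u(x, y)
            strict = strict and u(nx, ny) > u(x, y)
    return weak, strict


results = {name: check(u) for name, u in utilities.items()}
for name, (weak, strict) in results.items():
    print(f"u = {name:18s} monotonic: {weak!s:5s} strictly monotonic: {strict}")
```

The grid deliberately includes bundles with zero units of a good, since that is where the Cobb-Douglas and perfect-complements functions fail strict monotonicity.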
Self-assessment 2.4 Consider that Eric’s utility function is u(x, y) = 2x + 3y,

which is just an example of u(x, y) = ax + by, where a = 2 and b = 3. Following the
steps in example 2.4, show that this utility function satisfies completeness, transitivity,
monotonicity, strict monotonicity, and nonsatiation, as summarized in table 2.1. Then
consider one of Eric’s friends, John, who has a utility function u(x, y) = min{2x, 3y}.
Following the steps in example 2.4, show that the utility function satisfies all the
properties in table 2.1, except for strict monotonicity.
2.5 Marginal Utility
We next describe how an individual’s utility increases as we increase the amount of one, and
only one, of the goods she consumes.
Marginal utility of a good The rate at which utility changes as consumption of a

good increases.
Intuitively, marginal utility answers the question: how much better off do you become
by consuming 1 more unit of good x? Mathematically, we measure the marginal utility of
good x, MUx , by partially differentiating the utility function u(x, y) with respect to x:
$$MU_x = \frac{\partial u(x, y)}{\partial x},$$
and similarly for the marginal utility of good y, $MU_y = \frac{\partial u(x, y)}{\partial y}$. Graphically, we measure the
slope (rate of change) of the utility function as we increase the amount of good x, holding
the amount of other goods constant. Recall that, when differentiating with respect to good x,
we keep the amount of good y constant, so we only change one thing at a time (that is, we
only increase the amount of good x). Similarly, when we differentiate with respect to good y,
we only increase the amount of good y, while leaving good x unchanged.
The next example illustrates how to find marginal utility in a common utility function.
Example 2.5: Finding marginal utility, MU Consider the utility function $u(x, y) = x^{1/2} y^{1/2}$.
The marginal utility of good x is
$$MU_x = \frac{1}{2} x^{\frac{1}{2} - 1} y^{\frac{1}{2}} = \frac{1}{2} x^{-\frac{1}{2}} y^{\frac{1}{2}},$$
or, after rearranging, $MU_x = \frac{1}{2} \frac{y^{1/2}}{x^{1/2}}$. (Recall that, for any exponent α > 0, expression
$x^{-\alpha}$ can alternatively be written as a ratio, as follows: $\frac{1}{x^{\alpha}}$.) This $MU_x$ is positive when
the individual consumes positive amounts of goods x and y, thus indicating that 1 more
unit of good x raises her utility.
Similarly, the marginal utility of good y is
$$MU_y = \frac{1}{2} x^{\frac{1}{2}} y^{\frac{1}{2} - 1} = \frac{1}{2} x^{\frac{1}{2}} y^{-\frac{1}{2}},$$
or $MU_y = \frac{1}{2} \frac{x^{1/2}}{y^{1/2}}$. As for good x, we find here that when the individual consumes
positive amounts of goods x and y, $MU_y$ is positive.
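These closed-form expressions can be cross-checked against a numerical derivative. The following is a minimal sketch, assuming a hypothetical bundle (x, y) = (4, 9) and a central-difference approximation; the helper names are illustrative and not part of the text.

```python
import math


def u(x, y):
    """Utility function u(x, y) = x^(1/2) * y^(1/2) from example 2.5."""
    return math.sqrt(x) * math.sqrt(y)


def mu_x(x, y):
    """Closed-form marginal utility of x: (1/2) * y^(1/2) / x^(1/2)."""
    return 0.5 * math.sqrt(y) / math.sqrt(x)


def mu_y(x, y):
    """Closed-form marginal utility of y: (1/2) * x^(1/2) / y^(1/2)."""
    return 0.5 * math.sqrt(x) / math.sqrt(y)


def numerical_partial(f, x, y, wrt, h=1e-6):
    """Central-difference approximation of a partial derivative of f."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


x0, y0 = 4.0, 9.0  # hypothetical bundle chosen for illustration
print(mu_x(x0, y0), numerical_partial(u, x0, y0, "x"))
print(mu_y(x0, y0), numerical_partial(u, x0, y0, "y"))
```

In each printed pair, the closed-form value and the numerical approximation should agree to several decimal places.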
Self-assessment 2.5 Chelsea’s utility function is u(x, y) = 5x + 2y. Find the

marginal utility for goods x and y. Repeat your analysis, assuming that her utility
function is u(x, y) = 5x − 2y. Interpret.
2.5.1 Diminishing Marginal Utility

An interesting property of many utility functions is that their marginal utilities are decreas-
ing in the amount of the good that the individual consumes. That is,
$$MU_x \text{ decreases in } x, \text{ or } \frac{\partial MU_x}{\partial x} \leq 0;$$
and the same applies for good y, where MUy decreases in y. Intuitively, this entails that,
while more units of good x increase the individual’s utility level, further increments in x
produce smaller utility gains. In other words, when the consumer has few units of a good
(such as food), providing her with 1 more unit increases her utility by a great deal. When
she already has large amounts of the food, however, giving her 1 more unit of food produces
a small utility gain (or no gain at all!). Example 2.6 illustrates this property.
Example 2.6: Diminishing marginal utility Consider the individual in example
2.5, where we found a marginal utility from good x of $MU_x = \frac{1}{2} \frac{y^{1/2}}{x^{1/2}}$. Because good x
only shows up in the denominator of $MU_x$, this marginal utility is decreasing in the
amount that the consumer enjoys of good x. More formally, we can differentiate $MU_x$
with respect to x, obtaining
$$\frac{\partial MU_x}{\partial x} = -\frac{y^{1/2}}{4x^{3/2}},$$
which is negative for any positive amounts of x and y. Similarly, her marginal utility
from good y, $MU_y = \frac{1}{2} \frac{x^{1/2}}{y^{1/2}}$, is decreasing in good y because $\frac{\partial MU_y}{\partial y} = -\frac{x^{1/2}}{4y^{3/2}} < 0$ for
all positive values of x and y.
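The intermediate step behind the derivative of the marginal utility of good x is a direct application of the power rule. Writing the marginal utility from example 2.5 with a negative exponent (a worked sketch, assuming x, y > 0):

```latex
\begin{align*}
\frac{\partial MU_x}{\partial x}
  &= \frac{\partial}{\partial x}\left( \frac{1}{2}\, x^{-1/2} y^{1/2} \right)
   = \frac{1}{2} \cdot \left(-\frac{1}{2}\right) x^{-3/2} y^{1/2}
   = -\frac{y^{1/2}}{4 x^{3/2}} .
\end{align*}
```

For instance, holding y = 9 fixed, the marginal utility of good x equals 1.5 at x = 1, 0.75 at x = 4, and 0.5 at x = 9, so each additional unit of x adds less utility than the previous one.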
Self-assessment 2.6 Are the expressions of MUx and MUy that you found in self-
assessment 2.5 decreasing or increasing?
2.6 Indifference Curves
Figure 2.3a depicts the utility function analyzed in example 2.5, $u(x, y) = x^{1/2} y^{1/2}$. Graphically,
it resembles a mountain where the height is the utility that the individual achieves by
consuming a specific amount of x and y. For instance, with a bundle such as (x, y) = (4, 9),
the consumer reaches a utility level of $u(4, 9) = 4^{1/2} 9^{1/2} = 6$; but she can also obtain this
utility level at other bundles, such as (x, y) = (6, 6), which yields a utility of $u(6, 6) = 6^{1/2} 6^{1/2} = 6$,
or at (x, y) = (9, 4), which also generates a utility of $u(9, 4) = 9^{1/2} 4^{1/2} = 6$.
Figure 2.3b depicts a “slice” of the utility mountain in Figure 2.3a at a height of u = 6.
As discussed previously, we can reach a height of u = 6 at bundles such as (4, 9), (6, 6),
and (9, 4). Figure 2.3b connects these bundles with a curve and indicates that the consumer
obtains the same utility u = 6 in all of them. That is, she is indifferent between consuming
any of these bundles, as she achieves the same utility with each of them. For this reason, this
curve is referred to as an indifference curve, as defined next. For practice, you can repeat
this process for a different utility level, such as u = 14.
Figure 2.3
(a) Utility function. (b) Indifference curve for u = 6, passing through bundles (4, 9), (6, 6), and (9, 4), with good x on the horizontal axis and good y on the vertical axis.
Indifference curve (IC) A curve connecting consumption bundles that yield the
same utility level.
Figure 2.3b illustrates the indifference curve for utility function $u(x, y) = x^{1/2} y^{1/2}$, evaluated
at a utility level of u = 6. Because we represent good y on the vertical axis, we only
need to solve for y to obtain the expression of an indifference curve. That is, rearranging
$x^{1/2} y^{1/2} = 6$, we find $y^{1/2} = \frac{6}{x^{1/2}}$ which, after squaring both sides, yields the equation of the
indifference curve $y = \frac{36}{x}$. The next example describes how to find ICs in other common
utility functions.
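As a quick check on the indifference curve for u = 6, the sketch below evaluates y = 36/x at the amounts of good x in the three bundles discussed above and confirms that each resulting bundle yields a utility of 6. The helper names are illustrative, not part of the text.

```python
import math


def u(x, y):
    """Utility function u(x, y) = x^(1/2) * y^(1/2)."""
    return math.sqrt(x) * math.sqrt(y)


def indifference_y(x, level):
    """Amount of y that keeps utility at `level`, from y = level^2 / x."""
    return level**2 / x


level = 6
bundles = [(x, indifference_y(x, level)) for x in (4, 6, 9)]
for x, y in bundles:
    print(f"x = {x}, y = {y}, u = {u(x, y):.4f}")
```

The same helper can be reused for other utility levels (for example, `level = 14`) to trace additional indifference curves.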
Example 2.7: Finding ICs for two utility functions Consider again utility
function $u(x, y) = x^{1/2} y^{1/2}$. Let us obtain the expression for the indifference curve
when the consumer reaches utility level u = 10. This indifference curve entails that
$x^{1/2} y^{1/2} = 10$. We now seek to solve for y. First, we find that $y^{1/2} = \frac{10}{x^{1/2}}$, and then we
square both sides to obtain our indifference curve:
$$y = \frac{100}{x}.$$
To identify a few bundles on the indifference curve, we can plug in several values
for good x, such as x = 4, which produces $y = \frac{100}{4} = 25$; x = 8, which yields
$y = \frac{100}{8} = 12.5$; or x = 10, which entails $y = \frac{100}{10} = 10$. Finally, we can plot the three
bundles that we obtained, (4, 25), (8, 12.5), and (10, 10), as points on the positive
quadrant, and connect these points to form the indifference curve for u = 10.
We can follow a similar approach for a different utility function, such as u(x, y) =
5x + 3y, and the utility level of u = 9. Solving for y in 5x + 3y = 9, we obtain
$$y = 3 - \frac{5}{3} x.$$
This indifference curve, hence, originates at y = 3 and decreases at a rate of 5/3,
crossing the horizontal axis at 9/5. Recall that, to obtain the horizontal intercept of
a function, we just need to set the function equal to zero (as its height becomes zero
when crossing the horizontal axis) and solve for x. In this example, we set the function
of the indifference curve equal to zero, $3 - \frac{5}{3} x = 0$. Rearranging, we obtain 9 = 5x, and
solving for x we find a horizontal intercept of x = 9/5.
We can now evaluate the indifference curve $y = 3 - \frac{5}{3} x$ at several values for good x
(of course, x needs to be smaller than the horizontal intercept 9/5 = 1.8, since otherwise
we would obtain points on the negative quadrant). For instance, if x = 1/4, we
find that $y = 3 - \frac{5}{3} \times \frac{1}{4} \approx 2.58$; similarly, for x = 1/2, we obtain $y \approx 2.17$; and
for x = 1, we have $y \approx 1.33$.
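The linear indifference curve in this example can also be evaluated with exact fractions, which avoids rounding in the intermediate steps. The following is a minimal sketch using the example's values (u = 9, with coefficients 5 and 3); the function name and default arguments are illustrative.

```python
from fractions import Fraction


def linear_ic(x, level=9, a=5, b=3):
    """Solve a*x + b*y = level for y, i.e. y = level/b - (a/b) * x."""
    return Fraction(level, b) - Fraction(a, b) * x


# Horizontal intercept: the value of x at which y = 0, i.e. x = level / a.
horizontal_intercept = Fraction(9, 5)

points = {x: linear_ic(x) for x in (Fraction(1, 4), Fraction(1, 2), Fraction(1))}
for x, y in points.items():
    print(f"x = {x}, y = {y} (about {float(y):.2f})")
```

Evaluating `linear_ic(horizontal_intercept)` returns zero, confirming that the curve crosses the horizontal axis at x = 9/5.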
Self-assessment 2.7 Repeat example 2.7, but using utility function
$u(x, y) = x^{1/3} y^{2/3}$ and utility level u = 16.
2.6.1 Properties of Indifference Curves

ICs are negatively sloped. In our previous examples, all indifference curves were negatively
sloped. This is a property that follows from monotonicity. To see this, consider a bundle
$A = (x_A, y_A)$, such as that depicted in figure 2.4. The indifference curve passing through
bundle A cannot go through region I because bundles in this region contain strictly more
units of both goods x and y than bundle A does. Hence, the individual is not indifferent
between bundles in region I and bundle A; instead, she strictly prefers the former to the
latter. A similar argument applies to bundles in region II, which contain strictly fewer units
of both goods x and y than in bundle A. As a consequence, the consumer is not indifferent
between bundles in region II and bundle A, but she strictly prefers the latter to the former.
The only two regions where the indifference curve passing through A can lie are regions III
(where the consumer gets more units of x, but fewer of y) and IV (where she has more units
of y, but fewer of x). As an exercise, note that the indifference curves we found in example 2.7
were both strictly decreasing in x because both originated from a monotonic utility function.⁶
Figure 2.4
A positively sloped IC. (Bundle $A = (x_A, y_A)$, with region I to its northeast, region II to its southwest, region III to its southeast, and region IV to its northwest.)
Figure 2.5
Two ICs intersecting.
We often refer to preferences whose indifference curves are bowed in toward the origin (or are
straight lines) as “convex” preferences. Mathematically, this means that, for any two bundles A and B
on the curve, such as (4, 9) and (9, 4) in figure 2.3b, a
straight line connecting them lies: (1) strictly above the curve, thus yielding a higher utility
level than bundles A or B, which occurs when the indifference curve is strictly bowed in toward
the origin, such as that in figure 2.3b; or (2) on the indifference curve, yielding the same utility level
as either of the two points we connected, which happens when the indifference curve is a
straight line. We return to this property, and its economic interpretation, in section 2.7.
Self-assessment 2.8 Consider a consumer with utility function u(x, y) = 3y + 2x

who seeks to reach a utility level u = 20. Solve for y to find her indifference curve. Is
it increasing or decreasing? What if her utility function is u(x, y) = 3y − 2x?
ICs cannot intersect. This property also follows from monotonicity. To illustrate why,
figure 2.5 depicts a situation where two indifference curves intersect at bundle A. Let’s examine
why these indifference curves violate monotonicity. First, bundle B lies to the northeast
of bundle C, implying that the former contains larger amounts of both goods than bundle C
does. With monotonicity, the consumer prefers the bundle with more units of both goods, B,
entailing that $u_B > u_C$.⁷ However, bundle D lies northeast of E. With monotonicity, the utility
from consuming D is larger than that of E, so $u_D > u_E$. Finally, because bundles C and D
lie on the same indifference curve, we must have that $u_C = u_D$. Similarly, bundles B and E
lie on the same indifference curve, implying that $u_B = u_E$. Combining these equalities with
inequality $u_B > u_C$, we obtain that
$$u_E = u_B > u_C = u_D, \text{ which implies } u_E > u_D,$$
contradicting the result found about bundles E and D ($u_D > u_E$). Therefore, monotonicity
implies that indifference curves cannot intersect.
6. Utility functions like u(x, y) = by − ax, where a, b > 0, have a positively sloped IC. To see this, consider a utility
level of u = 10, and solve for good y, to obtain an indifference curve $y = \frac{10 + ax}{b} = \frac{10}{b} + \frac{a}{b} x$, which increases in x
as in figure 2.4.
Figure 2.6
Thick ICs violate monotonicity.
As a corollary of this property, note that every consumption bundle lies on one, and only
one, indifference curve. If, instead, a bundle could lie on two indifference curves (as bundle A
does in figure 2.5), we would experience contradictions like the one just discussed.
ICs are not thick. This property also follows from monotonicity. To see this, figure 2.6
depicts a thick indifference curve. Starting from any bundle, such as A in the figure, we could
find other bundles to the northeast of A, such as B. This bundle contains more units of both
goods than A does, implying that, by monotonicity, the consumer reaches a higher utility
level at B than at A. As a consequence, the consumer is not indifferent between bundles A
and B; instead, she strictly prefers B to A. Therefore, monotonicity implies that indifference
curves cannot be thick.⁸
7. The consumer is not indifferent between bundles B and C, which could occur only if both bundles yield the
same utility level. If these two bundles lie on different indifference curves, they must yield strictly different utility
levels.
8. When indifference curves are nonthick (such as that depicted in figure 2.3b), we cannot reproduce this argument.
Starting from a bundle such as A, we cannot find other bundles to the northeast of A that lie on the same indifference
curve. Instead, bundles to the northeast of A lie on indifference curves associated with higher utility levels.
Figure 2.7
Interpreting $MRS_{x,y}$. (Moving along the IC from bundle A to bundle B, the consumer gives up 3 units of y for 1 more unit of x.)
2.7 Marginal Rate of Substitution
As noted in the previous discussion, indifference curves are negatively sloped when mono-
tonicity holds. We next present a more formal expression of the slope of the indifference
curve.
Marginal rate of substitution (MRS) The rate at which a consumer is willing to

give up units of good y as she receives an additional unit of good x, in order to keep
her utility level constant. Formally, the MRS of good x for y is given by the ratio of
marginal utilities:
$$MRS_{x,y} = \frac{MU_x}{MU_y}.$$
Intuitively, we start at a bundle on an indifference curve, such as bundle A of figure 2.7,

and ask the consumer:
If you could receive 1 more unit of good x, how many units of good y
would you be willing to give up to keep your utility level unaffected?
As depicted in figure 2.7, this question means that we move along the indifference curve,
from a bundle like A, to a new bundle B, where the consumer has 1 more unit of good x
but gives up 3 units of y to maintain her utility level. In other words, $MRS_{x,y}$ measures how
much disutility from consuming fewer units of good y the individual is willing to suffer, as
captured by $MU_y < 0$, to receive 1 more unit of good x, as represented by $MU_x > 0$. When
$MU_x > 0$ and $MU_y < 0$, the $MRS_{x,y}$ becomes negative, $MRS_{x,y} = \frac{(+)}{(-)} = (-)$, as depicted by
the slope of the indifference curve in figure 2.7.⁹ (The appendix at the end of this chapter
mathematically proves why the $MRS_{x,y}$ coincides with the ratio of marginal utilities.)
Figure 2.8
Diminishing $MRS_{x,y}$. First interpretation: Preference for variety.
2.7.1 Diminishing MRS

One interesting property of common utility functions is that they exhibit a diminishing
MRS. Because the MRS graphically represents the slope of the indifference curve, this
property implies that the indifference curve is relatively steep for small amounts of good x
(on the left side of figure 2.7), but becomes flatter as we move rightward toward greater
amounts of good x. This property can be interpreted according to two economic intuitions
we discuss next.
1. First interpretation: Preference for variety. A diminishing MRS implies that indiffer-
ence curves are bowed in toward the origin. In this context, the consumer is indifferent
between two extreme bundles, such as A in figure 2.8 (which contains many units of y,
but few of x) and B (which has many units of x, but few of y), both bundles yielding the
same utility level u1 . The consumer, however, prefers more balanced bundles, such as
C, which yields a higher utility level u2 , where she consumes an intermediate amount of
both goods x and y.
2. Second interpretation: Decreasing willingness to substitute. Starting at a bundle like A in
figure 2.9, the individual is willing to give up several units of good y in order to obtain 1
more unit of x, because she has several units of y, but few of x. Her willingness to give
up units of good y, however, decreases once she has more units of good x and few units
of y (as in bundle C). That is, the consumer is willing to give up several units of the good
that is relatively more abundant to obtain 1 unit of the good she lacks (this is the case of
good x in bundle A). However, she becomes less willing to give up units of good y once
she has few units of this good (as in bundle C).
This interpretation can be seen in the MRS definition, $MRS_{x,y} = \frac{MU_x}{MU_y}$. At a point like A
in figure 2.9, the marginal utility from additional units of x is relatively high (because
this good is scarce), while the marginal disutility from giving up y is relatively low (as
the good is abundant), yielding a large ratio $\frac{MU_x}{MU_y}$, which entails a large MRS and a steep
indifference curve. At point C, in contrast, the marginal utility from additional units of
good x is now low (because this good became more abundant than at point A) and the
marginal disutility from giving up y becomes high (as the good is now relatively scarce),
yielding a small ratio $\frac{MU_x}{MU_y}$, which entails a low MRS and an almost flat indifference curve.
Figure 2.9
Diminishing $MRS_{x,y}$. Second interpretation: Decreasing willingness to substitute. (Starting from A, the consumer gives up 3 units of y for 1 more unit of x; moving from C to D, she gives up only 1 unit of y for 1 more unit of x.)
9. If, upon receiving 1 more unit of good x, the consumer did not give up units of y, her utility level would not
necessarily be the same as at bundle A. Her utility level would be strictly higher when her preferences satisfy strict
monotonicity, because she has 1 more unit of good x and the same number of units of good y. Her utility level,
however, could coincide with that at bundle A if the individual’s preferences satisfy monotonicity, and if she is
indifferent between A and a bundle containing more units of x than A does.
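The decreasing willingness to substitute described in the second interpretation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the utility function from example 2.5 and the indifference curve y = 100/x for u = 10 from example 2.7 (figure 2.9 itself does not specify a utility function), and computes how many units of good y the consumer gives up for each additional unit of good x.

```python
def ic_y(x, level=10):
    """Indifference curve y = level^2 / x for u(x, y) = x^(1/2) * y^(1/2)."""
    return level**2 / x


def y_given_up(x, level=10):
    """Units of y given up when x rises by 1 unit along the curve."""
    return ic_y(x, level) - ic_y(x + 1, level)


sacrifices = [(x, y_given_up(x)) for x in (4, 8, 10, 20)]
for x, dy in sacrifices:
    print(f"from x = {x} to x = {x + 1}: gives up {dy:.2f} units of y")
```

The amount of y given up shrinks as x grows, which is the numerical counterpart of an indifference curve that becomes flatter as we move rightward.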
Example 2.8: Finding MRS In this example, we examine three utility functions,
where MRS is decreasing, constant, or increasing in good x, respectively. First, consider
utility function $u(x, y) = x^{1/2} y^{1/2}$ from example 2.5 again, where we found the
marginal utilities for goods x and y, $MU_x$ and $MU_y$. These can now be used to obtain:
$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{\frac{1}{2} x^{-\frac{1}{2}} y^{\frac{1}{2}}}{\frac{1}{2} x^{\frac{1}{2}} y^{-\frac{1}{2}}} = \frac{y^{\frac{1}{2} - \left(-\frac{1}{2}\right)}}{x^{\frac{1}{2} - \left(-\frac{1}{2}\right)}} = \frac{y}{x},$$
where we canceled the 1/2 on the numerator and denominator, and used the property
that $\frac{x^a}{x^b} = x^{a-b}$ for exponents a and b. Therefore, we found that $MRS_{x,y} = \frac{y}{x}$, which is
decreasing in good x, yielding indifference curves that are bowed in toward the origin,
such as those in figures 2.7–2.9.
Consider now the linear utility function u(x, y) = ax + by, where a and b are positive
parameters. In this situation, marginal utilities are MUx = a and MUy = b, yielding
$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{a}{b},$$
which is constant in x. For instance, if a = 10 and b = 4, then $MRS_{x,y} = 2.5$, indicating
that the slope of the indifference curve is −2.5 along all its points (i.e., a straight
line).¹⁰
Lastly, consider a consumer with utility function $u(x, y) = ax^2 + by^3$. In this
context, marginal utilities are $MU_x = 2ax$ and $MU_y = 3by^2$, yielding
$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{2ax}{3by^2},$$
which is increasing in x.¹¹ Therefore, the indifference curve is relatively flat for low
values of x, but becomes steeper as we move rightward along the x-axis, eventually
becoming almost vertical. Graphically, indifference curves are bowed away from the
origin.
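The three cases in this example can be compared side by side with a numerical approximation of the ratio of marginal utilities. The following is a sketch under stated assumptions: it uses the parameter values mentioned in the example and its footnotes (a = 10 and b = 4 for the linear case, a = 10 and b = 5 for the last case), holds good y fixed at a hypothetical value of 2, and relies on central differences rather than the closed-form expressions. All function names are illustrative.

```python
def mrs(u, x, y, h=1e-6):
    """Approximate MRS = MU_x / MU_y using central differences."""
    mu_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    mu_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return mu_x / mu_y


def cobb_douglas(x, y):
    """u = x^(1/2) * y^(1/2); closed-form MRS is y / x."""
    return x**0.5 * y**0.5


def linear(x, y):
    """u = 10x + 4y; closed-form MRS is 10 / 4 = 2.5."""
    return 10 * x + 4 * y


def quadratic_cubic(x, y):
    """u = 10x^2 + 5y^3; closed-form MRS is 4x / (3y^2)."""
    return 10 * x**2 + 5 * y**3


y_fixed = 2.0  # hypothetical amount of good y, held fixed for illustration
table = {
    u.__name__: [mrs(u, x, y_fixed) for x in (1.0, 2.0, 4.0)]
    for u in (cobb_douglas, linear, quadratic_cubic)
}
for name, values in table.items():
    print(name, [round(value, 4) for value in values])
```

Reading each row from left to right shows an MRS that falls, stays constant, or rises as x increases, matching the three cases in the example.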
Self-assessment 2.9 Eric’s utility function is $u(x, y) = x^{1/3} y^{2/3}$. Find his MRS and
show whether it is increasing or decreasing in x. Repeat your analysis for Pam’s utility
function, u(x, y) = 3x + 2y, and for Maria’s utility function, u(x, y) = 3x − 2y.
2.8 Special Types of Utility Functions
This section presents, in detail, five types of utility functions often used in economic
applications, each capturing a consumer’s preference for different classes of goods.
10. As an exercise, we can find the equation of the indifference curve in this utility function. For a given utility
level u, we have u = 10x + 4y. Rearranging, we obtain u − 10x = 4y, which, solving for good y, yields $y = \frac{u}{4} - \frac{10}{4} x$.
Graphically, this means that the indifference curve originates at a vertical intercept of $\frac{u}{4}$ (e.g., $\frac{20}{4} = 5$ if u = 20) and
decreases with a slope of $-\frac{10}{4} = -2.5$, confirming what we found with the MRS. Of course, this slope is constant,
as it does not depend on the amount of good x that the individual consumes.
11. For instance, if a = 10 and b = 5, then the MRS becomes $MRS_{x,y} = \frac{4x}{3y^2}$.
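Figure 2.10
Perfect substitutes in consumption. (Indifference curves for u = 1 and u = 2, each with slope $-\frac{a}{b}$; vertical intercepts $\frac{1}{b}$ and $\frac{2}{b}$, horizontal intercepts $\frac{1}{a}$ and $\frac{2}{a}$.)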
2.8.1 Perfect Substitutes

The consumer may regard two goods as close substitutes, such as two brands of unflavored
mineral water (e.g., Aquafina versus Dasani), or two universal serial bus (USB) memory
sticks (e.g., Sandisk and Kingston), as the consumer can use either good without signifi-
cantly affecting her utility. Other goods that are often regarded as substitutes are coffee and
black tea, or butter and margarine. For these goods, the consumer’s utility function takes
the form
u(x, y) = ax + by,
where a and b are positive parameters. (For instance, if a = 4 and b = 2, one unit of x gives
the consumer twice as much utility as one unit of y, because $MU_x = 4$ and $MU_y = 2$, so that
$MRS_{x,y} = 4/2 = 2$.) This utility is linear in both good x, because its marginal utility $MU_x = a$
is constant, and good y, because MUy = b is also a constant. As discussed in example 2.8,
the MRS in linear utility functions is
$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{a}{b}.$$
Graphically, a constant MRS results in indifference curves that are represented by a
straight line with slope $-\frac{a}{b}$. To see this point, solve for y in utility function u(x, y) = ax + by,
which yields the equation of the indifference curve, $y = \frac{u}{b} - \frac{a}{b} x$, originating at $\frac{u}{b}$, decreasing
at a rate of $\frac{a}{b}$, and crossing the horizontal axis at $\frac{u}{a}$. Figure 2.10 illustrates two indifference
curves: one evaluated at u = 1, and another at u = 2.¹²
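The intercepts and slope of a linear indifference curve follow directly from the parameters. The sketch below assumes the values used in example 2.8 and its footnote (a = 10, b = 4, and utility level u = 20); the function name and dictionary keys are illustrative.

```python
def linear_ic_summary(level, a, b):
    """Intercepts and slope of the indifference curve a*x + b*y = level."""
    return {
        "vertical_intercept": level / b,    # value of y when x = 0
        "horizontal_intercept": level / a,  # value of x when y = 0
        "slope": -a / b,                    # constant along the whole line
    }


summary = linear_ic_summary(level=20, a=10, b=4)
print(summary)
```

Changing `level` shifts both intercepts proportionally while leaving the slope unchanged, which is why indifference curves for perfect substitutes are parallel straight lines.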
12. In some settings, the utility function takes the form u(x, y) = ax + ay = a (x + y), indicating that parameters a
and b coincide. In this situation, the slope of the indifference curve simplifies to $-\frac{a}{a} = -1$.
Recall that, intuitively, MRS measures the consumer’s willingness to give up units of
good y to obtain 1 more unit of x, while keeping her utility level unaffected. Therefore,
a constant MRS (i.e., a number) implies that the consumer’s willingness to substitute y for
additional units of x is, in plain terms, “always the same,” that is to say, it remains unaffected
by the relative scarcity of each good. In contrast, when MRS is decreasing, the consumer is
willing to give up more units of good y when x becomes relatively scarce (in other words,
good x becomes more attractive, in relative terms, as compared to the abundant good y).
Self-assessment 2.10 Chelsea’s utility function is u(x, y) = 3x + 2y. Graph her

indifference curve for utility levels u = 10 and u = 20.
2.8.2 Perfect Complements

The individual in this case must consume goods in fixed proportions, such as cars and gaso-
line, left and right shoes, or peanut butter and jelly sandwiches. In particular, her utility
function takes the form
u(x, y) = A min {ax, by} ,
where A, a, and b are positive parameters.¹³ For example, if A = 1 and a = b = 2, the utility
function reduces to
u(x, y) = min {2x, 2y} = 2 min{x, y}.
One interesting property of this utility function is that, if the consumer increases the
amount of good x by one unit without increasing the amount of y, her utility does not necessarily
increase. Specifically, when the consumer has at least as many units of x as of y (x ≥ y), an
increase in x does not increase her utility at all. However, when she has more units of y
(y > x), an increase in x (the least abundant good) does increase her utility level.
To illustrate this point, consider that the consumer has 10 units of each good, yielding a
utility level of
u(10, 10) = min {2 × 10, 2 × 10} = min{20, 20} = 20.
If good x is now increased from 10 to 11 units, but good y is unaffected, her utility remains
at the same level because
u(11, 10) = min {2 × 11, 2 × 10} = min{22, 20} = 20.
13. This utility function is often referred to as “Leontief,” after the economist who first conceptualized it, Wassily
Leontief.
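Figure 2.11
Perfect complements in consumption. (L-shaped indifference curves for $u_1 = Aa$ and $u_2 = 2Aa$, with kinks along the ray $y = \frac{a}{b} x$; bundles C, D, and E lie on the curve for $u_1$.)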
In other words, increasing the amount of one of the goods alone does not yield utility gains,
as this consumer needs to enjoy both goods in fixed proportions. (Think about having more
fuel-powered cars without having more gasoline!) Formally, this means that preferences for
complementary goods violate the strict monotonicity property, because giving the consumer
more units of only one good does not necessarily increase her utility.
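The calculations above, and the failure of strict monotonicity they illustrate, can be reproduced in a few lines. This is a minimal sketch using the example's parameter values (A = 1 and a = b = 2); the function and variable names are illustrative.

```python
def leontief(x, y, A=1, a=2, b=2):
    """Perfect-complements utility u(x, y) = A * min{a*x, b*y}."""
    return A * min(a * x, b * y)


base = leontief(10, 10)       # both goods at 10 units
more_x = leontief(11, 10)     # one more unit of x only
more_y = leontief(10, 11)     # one more unit of y only
more_both = leontief(11, 11)  # one more unit of each good

print(base, more_x, more_y, more_both)
```

Utility rises only in the last case, when both goods increase together, which is the sense in which the goods must be consumed in fixed proportions.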
Graphically, the indifference curves associated with this utility function have an L-shape,
as figure 2.11 illustrates. Starting from the bundle at the kink, like C, an increase in good x
alone (moving rightward) does not increase the consumer’s utility, as depicted by bundle E,
which lies on the same indifference curve as bundle C. A similar argument applies if we
increase good y alone (moving upward from C) as depicted by bundle D, which also lies on
the same indifference curve as bundle C. The kink occurs at points where the two arguments
inside min{ax, by} coincide, that is, at ax = by. Solving for y, we find $y = \frac{a}{b} x$, as depicted in
the ray in figure 2.11 that crosses all indifference curves at their kinks.¹⁴
While the slope of this indifference curve is zero in its flat segment (see the portion on
the right side of bundle C), and −∞ in the vertical segment (above C), it is undefined
at the kink. Graphically, we could depict infinitely many tangent lines at the kink to define
the slope of the indifference curve, so we could not identify a unique and precise slope at
this kink.

14. In addition, at bundle $C = \left(1, \frac{a}{b}\right)$ of figure 2.11, the consumer’s utility becomes
$u\left(1, \frac{a}{b}\right) = A \min\left\{a \times 1, b \frac{a}{b}\right\} = A \min\{a, a\} = Aa$, as illustrated in utility level $u_1 = Aa$ in the figure. A similar argument applies
to bundle $E = \left(2, \frac{a}{b}\right)$, where her utility is $u\left(2, \frac{a}{b}\right) = A \min\left\{a \times 2, b \frac{a}{b}\right\} = A \min\{2a, a\}$. Because a < 2a, we find
that her utility is $A \min\{2a, a\} = Aa$, which coincides with her utility in bundle C. As a practice exercise, find her
utility at bundle D, confirming that it is also Aa.
Self-assessment 2.11 John’s utility function is given by u(x, y) = 3 min{x, 2y}.

Graph his indifference curve for utility levels u = 10 and u = 20.
2.8.3 Cobb-Douglas
The Cobb-Douglas utility function15 is an intermediate case lying between the utility func-
tions in subsections 2.8.1 and 2.8.2, because the consumer regards goods as neither perfectly
substitutable nor complementary. We have encountered this utility function in previous
examples throughout this chapter, but we present a more general form here:
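\[u(x, y) = A x^{\alpha} y^{\beta},\]
where A, α, and β are positive parameters. As described in previous examples, \(MU_x = A\alpha x^{\alpha-1} y^{\beta}\) and \(MU_y = A\beta x^{\alpha} y^{\beta-1}\), and they yield
\[MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{A\alpha x^{\alpha-1} y^{\beta}}{A\beta x^{\alpha} y^{\beta-1}} = \frac{\alpha y^{\beta-(\beta-1)}}{\beta x^{\alpha-(\alpha-1)}} = \frac{\alpha y}{\beta x}.\]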
Its MRS is then decreasing in x (i.e., as x increases, the ratio goes down), thus pro-
ducing indifference curves that are bowed in toward the origin. Graphically, indifference
curves become flatter as we move rightward to higher values of x. This type of utility func-
tion embodies the following functions as special cases, which are often used in economic
analysis:
1. A = α = β = 1, which reduces the utility function to u(x, y) = xy. In this case, the MRS
simplifies to \(MRS_{x,y} = \frac{y}{x}\) because α = β.
2. A = 1 and α = β, which simplifies the function to \(u(x, y) = x^{\alpha} y^{\alpha} = (xy)^{\alpha}\). In this situation,
the MRS also reduces to \(MRS_{x,y} = \frac{y}{x}\).
3. A = 1 and β = 1 − α, which yields a utility function of \(u(x, y) = x^{\alpha} y^{1-\alpha}\). The MRS now
simplifies to \(MRS_{x,y} = \frac{\alpha}{1-\alpha}\,\frac{y}{x}\).
Common examples of case (2) are \(u(x, y) = (xy)^{1/2}\) and \(u(x, y) = (xy)^{1/3}\), and examples
of case (3) include \(u(x, y) = x^{1/3} y^{2/3}\) and \(u(x, y) = x^{1/4} y^{3/4}\).
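The closed-form Cobb-Douglas MRS, αy/(βx), can be compared against a ratio of numerically approximated marginal utilities. The sketch below is only illustrative: the function names, the parameter values (A = 2, α = 1/3, β = 2/3), and the evaluation points are assumptions, and the finite-difference step is an arbitrary small number.

```python
def cobb_douglas(x, y, A, alpha, beta):
    """Cobb-Douglas utility u(x, y) = A * x**alpha * y**beta."""
    return A * x**alpha * y**beta

def mrs_closed_form(x, y, alpha, beta):
    """MRS = (alpha * y) / (beta * x); note that A cancels out."""
    return (alpha * y) / (beta * x)

def mrs_numeric(x, y, A, alpha, beta, h=1e-6):
    """Ratio of marginal utilities, each approximated by a central difference."""
    mu_x = (cobb_douglas(x + h, y, A, alpha, beta)
            - cobb_douglas(x - h, y, A, alpha, beta)) / (2 * h)
    mu_y = (cobb_douglas(x, y + h, A, alpha, beta)
            - cobb_douglas(x, y - h, A, alpha, beta)) / (2 * h)
    return mu_x / mu_y

# Assumed parameter values, chosen only for illustration.
A, alpha, beta = 2.0, 1 / 3, 2 / 3

# Holding y fixed at 8, the MRS falls as x rises.
for x in (1.0, 2.0, 4.0):
    print(x, mrs_closed_form(x, 8.0, alpha, beta), mrs_numeric(x, 8.0, A, alpha, beta))
```

Both columns should agree up to numerical error, and the values shrink as x grows, which is the diminishing MRS described above.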
Self-assessment 2.12 Maria’s utility function is \(u(x, y) = 5x^{1/2} y^{1/4}\). Graph her
indifference curve for utility levels u = 10 and u = 20.
15. This function was developed in 1927 by Paul Douglas (an economist and U.S. senator) and Charles Cobb
(mathematician and economist).
Lastly, note that the exponents in the Cobb-Douglas utility function can be interpreted as
elasticities. Before we show this result, we define “utility elasticity.”
Utility elasticity of good x, εu,x The percentage increase in utility (if εu,x > 0)
or percentage decrease in utility (if εu,x < 0) that the consumer experiences after
increasing the amount of good x she consumes by 1 percent. More formally,
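\[\varepsilon_{u,x} = \frac{\%\Delta u(x, y)}{\%\Delta x}.\]
Rearranging this expression, we obtain
\[\varepsilon_{u,x} = \frac{\%\Delta u(x, y)}{\%\Delta x} = \frac{\;\frac{\Delta u(x,y)}{u(x,y)}\;}{\frac{\Delta x}{x}} = \frac{\Delta u(x, y)}{\Delta x}\,\frac{x}{u(x, y)}.\]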
When the increase in the amount of good x is marginally small, we can rewrite this
expression as follows
\[\varepsilon_{u,x} = \frac{\partial u(x, y)}{\partial x}\,\frac{x}{u(x, y)},\]
where the first term simply represents the marginal utility of good x and the second term
is a ratio with the amount of good x that the individual consumes in the numerator and her
utility function in the denominator. Equipped with the definition of utility elasticity εu,x , we
can apply it to the Cobb-Douglas utility function to find that
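\[\varepsilon_{u,x} = \frac{\partial u(x, y)}{\partial x}\,\frac{x}{u(x, y)} = A\alpha x^{\alpha-1} y^{\beta}\,\frac{x}{A x^{\alpha} y^{\beta}},\]
because \(\frac{\partial u(x,y)}{\partial x} = A\alpha x^{\alpha-1} y^{\beta}\). Simplifying this expression yields
\[\varepsilon_{u,x} = \frac{A\alpha x^{\alpha-1+1} y^{\beta}}{A x^{\alpha} y^{\beta}} = \frac{A\alpha x^{\alpha} y^{\beta}}{A x^{\alpha} y^{\beta}} = \alpha.\]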
Hence, when facing a utility function like u(x, y) = Axα yβ , we can claim that the expo-
nent in good x, α, represents the utility elasticity of a marginal increase in x. Intuitively, a
1 percent increase in the amount of good x increases utility by α percent. A similar argument
applies to good y, whose utility elasticity is β.
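The elasticity result can be illustrated by applying the percentage-change definition directly with a very small percentage increase. This is a minimal sketch under assumed values (A = 2, α = 0.3, β = 0.6, and the bundle (4, 9)); the function names are hypothetical.

```python
def cobb_douglas(x, y, A=2.0, alpha=0.3, beta=0.6):
    """Cobb-Douglas utility with assumed, purely illustrative parameters."""
    return A * x**alpha * y**beta

def elasticity_x(u, x, y, pct=1e-4):
    """(% change in u) / (% change in x), for a small proportional increase in x."""
    pct_change_u = (u(x * (1 + pct), y) - u(x, y)) / u(x, y)
    return pct_change_u / pct

def elasticity_y(u, x, y, pct=1e-4):
    """(% change in u) / (% change in y), for a small proportional increase in y."""
    pct_change_u = (u(x, y * (1 + pct)) - u(x, y)) / u(x, y)
    return pct_change_u / pct

print(elasticity_x(cobb_douglas, 4.0, 9.0))  # close to alpha = 0.3
print(elasticity_y(cobb_douglas, 4.0, 9.0))  # close to beta = 0.6
```

The approximation does not depend on the bundle chosen: evaluating it at other positive values of x and y should return nearly the same numbers.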
Self-assessment 2.13 Consider Maria’s utility function again, \(u(x, y) = 5x^{1/2} y^{1/4}\).
What is the utility elasticity of good x? And of good y? Interpret.
2.8.4 Quasilinear
The quasilinear utility function is often used in economic applications analyzing consumers
who use all their additional income on one good alone (e.g., good y, or video games). Alter-
natively, additional income is never spent on good x. This occurs for goods such as garlic
and toothpaste, whose consumption is relatively unaffected by an individual’s income.16
Generally, the quasilinear utility function has the form
u(x, y) = v(x) + by,
where b is a positive constant, and v(x) is a nonlinear function in x, such as \(v(x) = x^{1/2}\)
or \(v(x) = \ln x\).17 Other commonly used nonlinear functions in x include v(x) = axy because
its derivative is \(v'(x) = ay\), which is not a constant, or generally, any function v(x) whose
derivative with respect to x, \(v'(x)\), is not a constant, but instead depends on the units of
good x, good y, or both.
In this context, marginal utilities are \(MU_x = v'(x)\) and \(MU_y = b\), which yield
\[MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{v'(x)}{b}.\]
Hence, for a given value of x, the MRS is constant because it does not depend on the amount
of good y.
To illustrate this result, consider a consumer with quasilinear utility function \(u(x, y) = x^{1/2} + 3y\), so that \(v(x) = x^{1/2}\) and b = 3. Hence, \(v'(x) = \frac{1}{2}x^{-1/2}\), and the MRS becomes
\[MRS_{x,y} = \frac{\frac{1}{2}x^{-1/2}}{3} = \frac{1}{6\sqrt{x}}.\]
Therefore, for a given value of x, such as x = 16, we obtain \(MRS_{x,y} = \frac{1}{6\sqrt{16}} = \frac{1}{24}\), which is
constant in y (i.e., it does not depend on y). Graphically, this result says that if we fix the
value of good x (e.g., x = 16 units, as depicted in figure 2.12), the slope of the indifference
curve (MRSx,y ) is unaffected by the amount of good y. In other words, if we extend a vertical
line at a given value of x, such as x = 16, the slope of the indifference curve is the same at
all indifference curves being crossed by this vertical line.
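To see numerically that the slope at x = 16 is the same on every indifference curve, one can solve u = x^(1/2) + 3y for y and approximate the slope on several curves. The sketch below is an illustration only; the utility levels 10, 20, and 30 and the function names are assumptions.

```python
import math

def mrs_quasilinear(x, b=3.0):
    """MRS = v'(x) / b for u(x, y) = sqrt(x) + b*y, where v'(x) = 0.5 * x**-0.5."""
    return (0.5 * x ** -0.5) / b

def y_on_curve(x, level, b=3.0):
    """Solve level = sqrt(x) + b*y for y."""
    return (level - math.sqrt(x)) / b

def slope_on_curve(x, level, b=3.0, h=1e-6):
    """-dy/dx along the indifference curve, by a central difference."""
    return -(y_on_curve(x + h, level, b) - y_on_curve(x - h, level, b)) / (2 * h)

# Assumed utility levels; the height y differs across curves, the slope does not.
for level in (10.0, 20.0, 30.0):
    print(level, y_on_curve(16.0, level), slope_on_curve(16.0, level))

print(mrs_quasilinear(16.0))  # analytic value, 1/24
```

Each curve crosses the vertical line x = 16 at a different height, but the approximated slope is the same in every case, matching the analytic value of 1/24.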
Graphically, this implies that indifference curves are parallel shifts of each other (in this
case, vertical shifts), indicating that the consumer will use additional income to buy good y
16. Other examples may include fads, such as hula hoops or pet rocks, for which all additional income is spent
on the fad item. A similar argument applies for recent fads, such as Star Wars figures or Pokemon Go. While the
Pokemon Go app is free, players need PokeCoins to buy useful items and for inventory upgrades, generating a
revenue of $1.8 billion in the two years after its launch.
17. Recall that when we say that function v(x) is nonlinear, we mean that its derivative with respect to x, \(v'(x)\), is not
a constant (i.e., it depends on the amount of good x or y). In the previous examples, this derivative is \(v'(x) = \frac{1}{2}x^{-1/2}\)
and \(v'(x) = \frac{1}{x}\), respectively; both are functions of x, as required for v(x) to be nonlinear.
Figure 2.12
Quasilinear utility: parallel ICs (u1, u2, and u3), with a vertical line at x = 16.
alone (the good that entered linearly in her utility function), but additional income is not
used to purchase more units of good x (the good that entered nonlinearly).
Self-assessment 2.14 Eric’s utility function is \(u(x, y) = x^{1/3} + \frac{1}{4}y\). Find his MRS
and depict his indifference curve for utility levels u = 10 and u = 20.
2.8.5 Stone-Geary
The Stone-Geary utility function18 takes a Cobb-Douglas shape, but requires that individu-
als have a minimum amount of each good they require to live, such as half a gallon of water
or 2,200 food calories per day. Using \(\bar{x}\) and \(\bar{y}\) to represent the minimal amounts of goods x
and y that the individual needs, the utility function is written as
\[u(x, y) = A (x - \bar{x})^{\alpha} (y - \bar{y})^{\beta},\]
where A, α, and β are positive constants. Intuitively, the individual obtains a positive utility
from good x only after exceeding her minimal consumption \(\bar{x}\) (i.e., when \(x > \bar{x}\)); and the same
applies for the utility from good y, when \(y > \bar{y}\). Otherwise, her utility from good x, good y, or
both is negative. Note that when the minimal amounts of x and y are both zero (\(\bar{x} = \bar{y} = 0\)),
this utility function reduces to \(u(x, y) = A x^{\alpha} y^{\beta}\), thus coinciding with the standard Cobb-Douglas
expression discussed previously. In this situation, the marginal utilities become
\[MU_x = A\alpha (x - \bar{x})^{\alpha-1} (y - \bar{y})^{\beta} \quad \text{and} \quad MU_y = A\beta (x - \bar{x})^{\alpha} (y - \bar{y})^{\beta-1},\]
18. This utility function was first derived by Roy C. Geary, and later, it was empirically estimated by Richard Stone.
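thus implying that the MRS is
\[MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{A\alpha (x - \bar{x})^{\alpha-1} (y - \bar{y})^{\beta}}{A\beta (x - \bar{x})^{\alpha} (y - \bar{y})^{\beta-1}} = \frac{\alpha (y - \bar{y})^{\beta-(\beta-1)}}{\beta (x - \bar{x})^{\alpha-(\alpha-1)}} = \frac{\alpha (y - \bar{y})}{\beta (x - \bar{x})}.\]
Interestingly, when the minimal amounts of x and y that the individual must consume are
zero (\(\bar{x} = \bar{y} = 0\)), this MRS collapses to
\[MRS_{x,y} = \frac{\alpha (y - 0)}{\beta (x - 0)} = \frac{\alpha y}{\beta x},\]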
which coincides with the MRS found in the Cobb-Douglas case. As in that case, indifference
curves are here bowed-in toward the origin.
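The Stone-Geary MRS and its Cobb-Douglas special case can be compared side by side. The following sketch uses assumed values (minimal amounts of 1 and 2 units, α = β = 0.5, and the bundle (5, 4)); the names x_min and y_min are hypothetical stand-ins for the minimal consumption levels.

```python
def mrs_stone_geary(x, y, x_min, y_min, alpha, beta):
    """MRS = alpha * (y - y_min) / (beta * (x - x_min)).

    Only meaningful for bundles above the minimal amounts (x > x_min, y > y_min).
    """
    return alpha * (y - y_min) / (beta * (x - x_min))

def mrs_cobb_douglas(x, y, alpha, beta):
    """Standard Cobb-Douglas MRS = alpha * y / (beta * x)."""
    return alpha * y / (beta * x)

alpha, beta = 0.5, 0.5  # assumed exponents
x, y = 5.0, 4.0         # assumed bundle, above both minimal amounts

print(mrs_stone_geary(x, y, 1.0, 2.0, alpha, beta))  # with positive minimal amounts
print(mrs_stone_geary(x, y, 0.0, 0.0, alpha, beta))  # minimal amounts set to zero
print(mrs_cobb_douglas(x, y, alpha, beta))           # coincides with the line above
```

With zero minimal amounts the two functions return the same number, while positive minimal amounts change the MRS at the same bundle.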
Self-assessment 2.15 Ana’s utility function is \(u(x, y) = 5 (x - 2)^{1/2} (y - 1)^{1/3}\).
Find her marginal utilities and her MRS, and check if it is decreasing in x.
2.9 A Look at Behavioral Economics—Social Preferences
The utility functions considered in the previous sections assume that the individual cares
about the bundle she receives but ignores the bundle (or money) that other individuals enjoy.
However, we can imagine many scenarios where we care about the well-being of family
members, friends, or even strangers we just watched on television. In this section, we explore
two of the utility functions suggested by the literature to account for this behavioral pattern,
where individuals exhibit social, rather than selfish, preferences.
Generally, the field of behavioral economics relaxes standard assumptions in economics,
such as selfishness, unbounded rationality, or the individual’s unlimited willpower to resist
temptations.19 We examine a few topics in behavioral economics in chapters 6, 9, 13, 15,
and 17, providing reading recommendations.
2.9.1 Fehr-Schmidt Social Preferences

Fehr and Schmidt (1999) suggested a utility function that captures social preferences. For
simplicity, let us consider a context with two individuals, 1 and 2, and let x1 and x2 rep-
resent their respective incomes. When individual 2 is richer than 1, x2 > x1 , the utility
19. For an introduction to behavioral economics, see Just (2013) and Angner (2016), or the more advanced
presentation in Camerer (2003).
of individual 1 becomes
\[x_1 - \underbrace{\alpha (x_2 - x_1)}_{\text{Disutility from envy}},\]
where parameter α ≥ 0 denotes the disutility that individual 1 suffers from envy because
she is poorer than individual 2. In contrast, when individual 2 is poorer than 1, x2 < x1 (so
individual 1 is richer), the utility of individual 1 is
\[x_1 - \underbrace{\beta (x_1 - x_2)}_{\text{Disutility from guilt}},\]
where parameter β ≥ 0 reflects the disutility that individual 1 suffers from guilt because she
is richer than individual 2.20 As a special case, note that when all parameters are zero, α =
β = 0, this utility function reduces to x1, both when x2 > x1 and otherwise, which reflects
standard (selfish) preferences as the individual only cares about her income, x1, and does
not suffer from envy or guilt.
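The two branches of the Fehr-Schmidt utility function can be written as a single piecewise function. This is a minimal sketch for the two-individual case described above; the parameter values α = 0.5 and β = 0.25 and the income levels are assumptions chosen for illustration.

```python
def fehr_schmidt_utility(x1, x2, alpha, beta):
    """Utility of individual 1 given incomes x1 (own) and x2 (other)."""
    if x2 > x1:
        # Individual 1 is poorer: subtract the disutility from envy.
        return x1 - alpha * (x2 - x1)
    # Individual 1 is richer (or equally rich): subtract the disutility from guilt.
    return x1 - beta * (x1 - x2)

alpha, beta = 0.5, 0.25  # assumed values with alpha >= beta

print(fehr_schmidt_utility(10, 14, alpha, beta))  # poorer than individual 2: 8.0
print(fehr_schmidt_utility(10, 6, alpha, beta))   # richer than individual 2: 9.0
print(fehr_schmidt_utility(10, 14, 0, 0))         # selfish preferences: 10
```

With these assumed parameters, a $4 income gap lowers utility by more when individual 1 is the poorer party than when she is the richer one.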
2.9.2 Bolton and Ockenfels Social Preferences

Bolton and Ockenfels (2000) proposed a relatively more general utility function than did
Fehr and Schmidt (1999). For the case of two individuals, 1 and 2, the utility function of
individual 1 can be expressed as
\[u_1\left(x_1, \frac{x_1}{x_1 + x_2}\right),\]
where the first argument in the parentheses is interpreted as the selfish component because
individual 1 considers only her own wealth x1. The second argument measures the share that
individual 1’s wealth represents of the total wealth in the group, \(\frac{x_1}{x_1 + x_2}\), or her situation relative
to the group, thus capturing social preferences. An example of this utility function can be
\[u_1\left(x_1, \frac{x_1}{x_1 + x_2}\right) = x_1 + \alpha \left(\frac{x_1}{x_1 + x_2}\right)^{1/2}.\]
If parameter α ≥ 0, individual 1 enjoys a utility from owning a larger share of total wealth.
If, instead, α < 0, individual 1 suffers from owning a larger share of wealth.
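The role of the sign of α in the example utility function can be illustrated by holding individual 1's wealth fixed and varying her share of total wealth. The sketch below is illustrative only; the wealth levels and the values α = 2 and α = −2 are assumptions.

```python
import math

def bolton_ockenfels_utility(x1, x2, alpha):
    """Example utility: own wealth plus alpha times the square root of the wealth share."""
    share = x1 / (x1 + x2)
    return x1 + alpha * math.sqrt(share)

# Individual 1 holds 9 units in both cases; only her share of total wealth changes.
small_share = (9.0, 27.0)  # share = 9/36 = 0.25
large_share = (9.0, 7.0)   # share = 9/16 = 0.5625

for alpha in (2.0, -2.0):
    print(alpha,
          bolton_ockenfels_utility(*small_share, alpha),
          bolton_ockenfels_utility(*large_share, alpha))
```

For the positive value of α, utility is higher at the larger share; for the negative value, the ranking reverses even though own wealth is unchanged.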
Appendix. Finding the Marginal Rate of Substitution
We increase good x by 1 unit and seek to measure how many units of good y the con-
sumer must give up to preserve her current utility level. Because we simultaneously alter
20. Fehr and Schmidt (1999) assumed that individuals suffer more envy than guilt, which means that parameters
α and β satisfy α ≥ β.
the amount of x and y, we totally differentiate the utility function u(x, y) to obtain
\[du = \frac{\partial u(x, y)}{\partial x}\,dx + \frac{\partial u(x, y)}{\partial y}\,dy.\]
Because the consumer is moving along an indifference curve, her utility level does not vary,
implying that du = 0. Plugging this result into the left side, and using \(MU_x = \frac{\partial u(x,y)}{\partial x}\) and
\(MU_y = \frac{\partial u(x,y)}{\partial y}\), we obtain
\[\underbrace{0}_{du=0} = MU_x\,dx + MU_y\,dy,\]
or, rearranging, \(-MU_y\,dy = MU_x\,dx\). Lastly, because we are interested in the rate at which y
changes for a 1-unit increase in good x, we solve for \(\frac{dy}{dx}\), to obtain21
\[-\frac{dy}{dx} = \frac{MU_x}{MU_y},\]
as required. Therefore, the slope of the indifference curve, \(-\frac{dy}{dx}\), coincides with the ratio
of marginal utilities \(\frac{MU_x}{MU_y}\). For compactness, this ratio is referred to as the marginal rate of
substitution between goods x and y, or \(MRS_{x,y}\).
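The result of this appendix can be checked numerically for a specific utility function. The sketch below assumes u(x, y) = xy, a utility level of 20, and the point x = 4; it approximates the slope along the indifference curve and, separately, the ratio of marginal utilities. All names and values are illustrative.

```python
def utility(x, y):
    """Assumed utility function u(x, y) = x * y."""
    return x * y

def y_on_indifference_curve(x, level):
    """Solve level = x * y for y."""
    return level / x

def slope_along_curve(x, level, h=1e-6):
    """-dy/dx along the indifference curve, by a central difference."""
    dy = y_on_indifference_curve(x + h, level) - y_on_indifference_curve(x - h, level)
    return -dy / (2 * h)

def ratio_of_marginal_utilities(x, y, h=1e-6):
    """MU_x / MU_y, each marginal utility approximated by a central difference."""
    mu_x = (utility(x + h, y) - utility(x - h, y)) / (2 * h)
    mu_y = (utility(x, y + h) - utility(x, y - h)) / (2 * h)
    return mu_x / mu_y

level, x = 20.0, 4.0
y = y_on_indifference_curve(x, level)  # y = 5 on this curve

print(slope_along_curve(x, level))        # approximately 1.25
print(ratio_of_marginal_utilities(x, y))  # approximately 1.25
```

The two numbers agree up to numerical error, as the derivation predicts.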
Exercises
1. Indifference Curves.B Answer the following questions for each of the utility functions in table 2.1.
(a) Find the marginal utility for good x and y, MUx and MUy .
(b) Are these marginal utilities positive? Are they strictly positive? Connect your results with the
properties of monotonicity and strict monotonicity.
(c) Find \(MRS = \frac{MU_x}{MU_y}\). Does MRS increase in the amount of good x?
(d) Depict an indifference curve reaching a utility level of u = 10, and another indifference curve
of u = 20. Do the indifference curves cross either axis?
(e) Provide an example of goods that you think can be represented with each utility function in
table 2.1.
2. Plotting Curves.A Find the indifference curve for each of the utility functions in table 2.1, evalu-
ating all of them at a utility level of u = 10 units. (Hint: You just need to plug in u = 10 and solve
for y.) Are these indifference curves negatively sloped? Plot each indifference curve by considering
three values for good x, such as x = 1, 2, and 4, and finding the corresponding value of good y.

21. Starting from \(-MU_y\,dy = MU_x\,dx\), we divide both sides by dx, which yields \(MU_y\left(-\frac{dy}{dx}\right) = MU_x\); and then,
dividing both sides by \(MU_y\), we find \(-\frac{dy}{dx} = \frac{MU_x}{MU_y}\).
For all utility functions in table 2.1, assume that the parameters take the values A = 1, a = 2, b = 3,
α = 0.5, and β = 0.5.
3. One Way or Another.B Consider an individual with utility function
u(x, y) = min{x + 2y, 2x + y}.
Plot her indifference curve at a utility level of u = 10 units. Interpret.

4. Perfect Complements.A Consider a consumer with utility function u(x, y) = min{3x, 4y}.
(a) Depict her indifference curve at a utility level of u = 20.
(b) Depict bundles A = (10, 10), B = (14, 10), and C = (10, 14).
(c) Find the utility levels that the consumer reaches at bundles A, B, and C.
5. Cobb-Douglas.A Consider an individual with the Cobb-Douglas utility function
\[u(x, y) = \sqrt{x}\,\sqrt{y}.\]
Assume that her income is I = $120, the price of good x is px = $4, and the price of good y is
py = $10.
(a) Find the marginal utility of good x, MUx , and that of good y, MUy .
(b) Given the results in part (a), does this utility function satisfy monotonicity? What about strict
monotonicity?
(c) Using the marginal utilities you found in part (a), find the MRS.
6. Marginal Rate of Substitution–I.A Find the MRS for each of the utility functions in table 2.1.
Are the MRS that you found diminishing? Provide an economic interpretation for each MRS.
7. Perfect Substitutes.A Consider a consumer with utility function u(x, y) = ax + by.
(a) For a given utility level u(x, y) = 10, find the equation of the indifference curve. (Hint: Set
u = 10 and solve for y.)
(b) Find the marginal utilities MUx and MUy .
(c) Find MRS. Does it increase in the amount of good x?
(d) Does this utility function satisfy strict monotonicity? What about monotonicity? And local
nonsatiation?
8. Examples of Goods Fitting Each Utility Function.A Consider a scenario with only two goods,
x and y. For each of the following utility functions, provide two examples (other than those given
in this book), justifying why each utility function represents preferences for that type of good.
(a) Perfect substitutes.
(b) Perfect complements.
(c) Cobb-Douglas.
(d) Stone-Geary.
9. Increasing Transformations.B Chelsea has the following Cobb-Douglas utility function:
u(x, y) = xy. Assume that we apply any of the following transformations. Show that when we
consider increasing transformations, Chelsea’s ordering of bundles A = (1, 2) and B = (3, 8) is

unaffected (i.e., she still prefers B to A). When we consider decreasing transformations, show that
Chelsea’s ordering of bundles A and B may be affected.
(a) \(v(x, y) = [u(x, y)]^2\)
(b) v(x, y) = ln[u(x, y)]
(c) v(x, y) = 5[u(x, y)]
(d) \(v(x, y) = \frac{1}{u(x, y)}\)
(e) v(x, y) = 7[u(x, y)] − 2
10. Finding Properties–I.A Eric’s preferences for books, x, and computers, y, can be represented
with the following Cobb-Douglas utility function: \(u(x, y) = x^3 y^2\).
(a) Find Eric’s marginal utility for books, MUx , and for computers, MUy .
(b) Are his preferences monotonic (i.e., weakly increasing in both goods)?
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference
curve.
(d) Find Eric’s MRS between x and y . Interpret your results.
(e) Are his preferences convex (i.e., bowed in toward the origin)?
(f) Consider a given utility level of 10 utils. Plot his indifference curve in this case.
11. Finding Properties–II.C Repeat exercise 10, but assume now that Eric’s preferences
are
represented with the following (Stone-Geary) utility function: u(x, y) = 2 x3 − 1 y2 − 2 .
curve.
(d) Find Eric’s MRS between x and y. Interpret your results.
12. Finding Properties–III.B Repeat exercise 10, but assume now that Eric’s preferences are
represented with the following (linear) utility function: u(x, y) = 3x + 4y.
curve.
13. Finding Properties–IV.B Repeat the previous exercise, but assume now that Eric’s preferences
are represented with the following utility function for two goods regarded as complements in
consumption: u(x, y) = min {3x, 4y}.
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference curve.
14. Envious Preferences.B Peter’s preferences are represented by the utility function
u(x, y) = 4x + 2(x − y),
where x denotes the amount of books he has, while y represents the amount of books his friend
has. Intuitively, when x > y, he owns more books than his friend, and his utility increases. When
x < y, he owns fewer books than his friend and his utility decreases (he suffers from envy).
(a) Find Peter’s marginal utility for the books he owns, MUx , and for his friend’s, MUy .
(c) For a given utility level u, solve the utility function for y to obtain Peter’s indifference curve.
(d) Find Peter’s MRS between x and y. Interpret your results.
15. Guilty Preferences.B Repeat exercise 14, but assume that Peter’s utility function is now
u(x, y) = 4x − (x − y);
that is, he suffers from guilt when he owns more books than his friend (x > y). Relative to envy
aversion in exercise 14, guilt aversion reduces Peter’s utility less dramatically. Stated in words,
Peter cares more about feeling envy than about feeling guilt.
(a) Find Peter’s marginal utility for the books he owns, MUx , and for his friend’s, MUy .
(c) For a given utility level u, solve the utility function for y to obtain Peter’s indifference curve.
(d) Find Peter’s MRS between x and y. Interpret your results.
16. Toddler Rationality.A Eric’s daughter is 4 years old. He has spent a long time trying to figure out
her favorite animal. Today he is asking her to pick which animal she prefers among two animals
at a time. He learned the following strict preferences:
Animal 1    Animal 2    Preferred Animal
Unicorn     Dog         Unicorn
Rabbit      Pig         Rabbit
Cat         Pig         Pig
Rabbit      Unicorn     Rabbit
Duck        Cat         Duck
Pony        Cat         Cat
Rabbit      Duck        Duck
Pig         Dog         Pig
Pony        Rabbit      Pony
Unicorn     Pony        Pony
Dog         Rabbit      Dog
In this table, the first row indicates that when presented with a unicorn or a dog, his daughter would
strictly prefer a unicorn. Assume that Eric’s daughter could compare any two animals (i.e., that
she has complete preferences). Is Eric’s daughter rational? Explain why or why not. (Note: Some
of these relations are redundant.)
17. Protein Preferences.A John is out to dinner and is presented a menu of items that contains a beef
dish and a chicken dish, and John chooses the beef dish. Before the server walks away with John’s
order, she remembers that the menu is actually out of date and a fish dish is available, as well.
(a) Suppose that John remains with his original order of the beef dish. What does this imply
about John’s preferences for all three dishes?
(b) Suppose that John switches his order to the fish dish. What does this imply about John’s
preferences for all three dishes?
(c) Suppose that John switches his order to the chicken dish. What does this imply about John’s
preferences for all three dishes?
18. Inverse Utility.C Consider a consumer with the utility function:
\[u(x, y) = \frac{1}{xy}.\]
(a) Does this utility function satisfy monotonicity?
(b) Does this utility function satisfy local nonsatiation?
19. Eating Pizza.A While out for dinner one night, Peter orders a large pepperoni pizza for himself.
After eating the first slice, he remarks that the pizza is delicious, and he’ll have another slice.
Slices two and three continue to receive accolades, but less so, with Peter expressing that he is
starting to feel full after the third slice. Peter decides to have a fourth slice, after which he decides
that he is full and prefers to eat no more pizza.
(a) What is happening to Peter regarding his utility?
(b) Suppose that Peter was dared to eat a fifth slice of pizza and accepted. Afterward, he com-
plains that he feels ill and leaves for the bathroom (rather hurriedly). What has happened to
Peter’s utility?
20. Discrete Marginal Rate of Substitution.A On the weekend, Eric traveled to a local barter market
to exchange some of his apples for cheese. Bringing 20 apples with him, he offers 5 apples to the
first cheese merchant he sees, in exchange for a wedge of cheese. Eric offers only 4 of his apples
to another cheese merchant for his second wedge of cheese, and only 2 apples to a third merchant
for his third wedge of cheese.
(a) What is Eric’s MRS as he moves from 0 to 1, then 1 to 2, and finally 2 to 3 wedges of cheese?
(b) Why does Eric offer fewer apples for each additional wedge of cheese he obtains?
3 Consumer Choice
3.1 Introduction
In chapter 2, we learned about consumer preferences over different bundles, and how they
help us represent an individual’s ranking over different alternatives. In addition, we discussed
how to represent these preferences with a utility function to measure how much utility a
consumer derives from different bundles. However, we were silent about which bundles are
affordable for the consumer to buy (as if she had unlimited resources!). To determine how
a consumer chooses amongst different bundles, we need to consider not only her prefer-
ences, but also her budget. In this chapter, we first describe how to represent such a budget
constraint, and then combine the consumer’s utility function and her budget constraint to
identify her optimal consumption choices.
3.2 Budget Constraint
Budget constraint The set of bundles that a consumer can afford, given the price of
each good and her income.
For instance, the budget set for two goods (units of food, x, and units of clothing, y) is
\[p_x x + p_y y \le I,\]
where px represents the price of each unit of food, py denotes the price of each unit of
clothing, and I represents the consumer’s available income to spend on food and clothing.
Intuitively, the budget set says that the total dollar amount that the consumer spends on food,
px x, plus the total dollar amount she spends on clothing, py y, cannot exceed her available
income, I. For example, if the price of good x is px = $10, that of good y is py = $20, and
the consumer has an income of I = $400 to spend on either good, her budget constraint is
\[10x + 20y \le 400.\]
Figure 3.1
A budget line, with vertical intercept \(\frac{I}{p_y}\), horizontal intercept \(\frac{I}{p_x}\), and slope \(-\frac{p_x}{p_y}\).
Bundles (x, y) that satisfy this budget constraint strictly, px x + py y < I, imply that the
consumer does not use all her income, whereas bundles for which the constraint holds with
equality, px x + py y = I, mean that the consumer spends all of her income. We often refer to
this last equation, px x + py y = I, as the budget line (see figure 3.1). Because good y is on the
vertical axis, we can rearrange budget line px x + py y = I to py y = I − px x, and then solve for
y, to obtain
\[y = \frac{I}{p_y} - \frac{p_x}{p_y}\,x,\]
where \(\frac{I}{p_y}\) represents the vertical intercept, and \(-\frac{p_x}{p_y}\) is the slope of the budget line. We can
also find the horizontal intercept in figure 3.1 by setting y = 0 in the equation for the budget
line \(p_x x + p_y y = I\) (because at this point, its height is zero). This yields \(p_x x + p_y 0 = I\),
or \(p_x x = I\). After solving for x, we find \(x = \frac{I}{p_x}\), as depicted in the horizontal axis of
figure 3.1.
At the vertical (horizontal) intercept, the consumer spends all her income on good y
(good x), so she can afford \(\frac{I}{p_y}\) units (\(\frac{I}{p_x}\) units) of this good. At all other points along the
budget line, however, she purchases positive units of both goods. The slope of the budget
line, \(-\frac{p_x}{p_y}\), tells us how many units of the good on the y-axis the consumer must give up
to buy 1 more unit of the good on the x-axis, as we move from left to right on figure 3.1.
Continuing with our previous example, if px = $10 and py = $20, the slope of the budget
line is \(-\frac{p_x}{p_y} = -\frac{10}{20}\), or \(-\frac{1}{2}\). In this case, the slope tells us that the consumer must give up
1/2 units of good y to acquire 1 more unit of good x, because good y is twice as expensive
as good x. Alternatively, she must give up 1 unit of good y to purchase 2 more units of
good x.
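The intercepts and slope of the budget line in the running example (px = $10, py = $20, I = $400) can be computed directly. The following is a minimal sketch; the function names and the two test bundles are hypothetical.

```python
def budget_line(px, py, income):
    """Return the intercepts and slope of the budget line px*x + py*y = income."""
    return {
        "vertical_intercept": income / py,    # all income spent on good y
        "horizontal_intercept": income / px,  # all income spent on good x
        "slope": -px / py,                    # units of y given up per extra unit of x
    }

def is_affordable(x, y, px, py, income):
    """Check whether bundle (x, y) satisfies the budget constraint."""
    return px * x + py * y <= income

line = budget_line(px=10, py=20, income=400)
print(line)  # intercepts of 20 (vertical) and 40 (horizontal), slope of -0.5

print(is_affordable(20, 10, 10, 20, 400))  # on the budget line: True
print(is_affordable(30, 10, 10, 20, 400))  # costs $500: False
```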
Figure 3.2
Budget line (BL) after an increase in income.
Self-assessment 3.1 Eric faces prices px = $13 and py = $18, and income I =
$250. Plot his budget line, finding the vertical and horizontal intercepts, and its slope.
Interpret.
Changes in income. An increase in income, I, shifts the budget line outward in a parallel
fashion. To see this effect, notice that when income increases from I to I′, where I′ > I, the
horizontal intercept increases from \(\frac{I}{p_x}\) to \(\frac{I'}{p_x}\), and so does the vertical intercept from \(\frac{I}{p_y}\) to \(\frac{I'}{p_y}\),
as depicted in figure 3.2. Graphically, the increase in income moves the vertical intercept
upward and the horizontal intercept rightward. In addition, this shift is parallel to the initial
budget line because its slope, \(-\frac{p_x}{p_y}\), is unaffected by a change in income (i.e., the slope is not
a function of the individual’s income I).
Intuitively, as her income increases (holding prices constant), the consumer can afford
a larger set of bundles. That is, she can afford more units of good x (because the horizon-
tal intercept moves rightward), more units of good y (because the vertical intercept moves
upward), or more units of both goods.1 A decrease in income would have the opposite
effect on the budget line, of course, shifting it inward (closer to the origin) in a parallel
fashion.
1. To see this last point graphically, you can depict a 45-degree line (upward diagonal) on figure 3.2. This 45-
degree line crosses both the original and the new budget lines at a point where the consumer purchases the same
amount of both goods (i.e., x = y). However, at the crossing point with the new budget line, the individual can
purchase more units than at the crossing point with the original budget line.
Figure 3.3
(a) Increase in price px . (b) Increase in price py .
Self-assessment 3.2 Consider self-assessment 3.1 again. If Eric’s income
increases to I′ = $540, find his budget line. What are the new vertical and horizontal
intercepts? Does the slope of the budget line change?
Changes in prices. An increase in the price of one good, such as px, pivots the budget
line inward, as illustrated in figure 3.3a. In particular, the vertical intercept \(\frac{I}{p_y}\) is
unaffected by changes in px, whereas the horizontal intercept \(\frac{I}{p_x}\) moves leftward when
px increases. Indeed, when the price of good x increases from \(p_x\) to \(p'_x\), the horizontal
intercept decreases from \(\frac{I}{p_x}\) to \(\frac{I}{p'_x}\). To interpret the economic intuition behind this result,
recall that the horizontal intercept measures the amount of good x that the individual can
afford when spending all her income I on good x alone. As x becomes more expensive,
she cannot afford as many units of x. Intuitively, the individual now faces a more expensive
good x, thus shrinking the set of bundles that she can afford.2 A similar argument
applies if the price of good y increases (decreases), as depicted in figure 3.3b. In this
case, the horizontal intercept \(\frac{I}{p_x}\) remains unaffected, but the vertical intercept \(\frac{I}{p_y}\) moves
down (up).
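The parallel shift from an income change and the pivot from a price change can be contrasted by recomputing the intercepts and slope. The sketch below starts from the earlier example (px = $10, py = $20, I = $400); the new income of $600 and the new prices are assumptions for illustration.

```python
def intercepts_and_slope(px, py, income):
    """Return (horizontal intercept, vertical intercept, slope) of the budget line."""
    return income / px, income / py, -px / py

base = intercepts_and_slope(10, 20, 400)
higher_income = intercepts_and_slope(10, 20, 600)  # assumed income increase
higher_px = intercepts_and_slope(20, 20, 400)      # assumed increase in px
higher_py = intercepts_and_slope(10, 40, 400)      # assumed increase in py

print("base:         ", base)           # (40.0, 20.0, -0.5)
print("higher income:", higher_income)  # both intercepts rise, same slope
print("higher px:    ", higher_px)      # horizontal intercept falls, vertical unchanged
print("higher py:    ", higher_py)      # vertical intercept falls, horizontal unchanged
```

Only the income change leaves the slope untouched; each price change moves one intercept and changes the slope.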
2. The opposite argument applies if good x becomes cheaper, where the horizontal intercept would move rightward,
thus expanding the set of bundles that she can afford.
Self-assessment 3.3 Consider Eric’s situation in self-assessment 3.1 again. If the
price of good x doubles from \(p_x\) = $13 to \(p'_x\) = $26, while py and I are unaffected, what
is the new position of Eric’s budget line? What if, instead, the price of good y doubles
from \(p_y\) = $18 to \(p'_y\) = $36, while px and I are unchanged?
Query. What would happen if both income and the price of all goods were doubled? This
is a tricky question. If all prices and income change by the same proportion, the budget line is unaffected. Indeed,
you can confirm this result by noticing that (1) the vertical intercept of the budget line, \(\frac{I}{p_y}\),
would now become \(\frac{2I}{2p_y}\), which simplifies to \(\frac{I}{p_y}\), thus indicating no change in its position;
(2) the horizontal intercept \(\frac{I}{p_x}\) is now \(\frac{2I}{2p_x}\), which reduces to \(\frac{I}{p_x}\), also reflecting no change in
its position; and (3) the slope \(-\frac{p_x}{p_y}\) is now \(-\frac{2p_x}{2p_y}\), which simplifies to \(-\frac{p_x}{p_y}\), implying that the
slope of the budget line does not change either. As a consequence, no term in the budget line
is affected by a simultaneous proportional increase in all prices and income. Note that this argument not
only applies to a doubling of all prices and income, but also extends to any common increase
in all prices and income (i.e., multiplying px, py, and I by a common factor α > 1, such as
α = 3), and to any common decrease in all prices and income (i.e., multiplying px, py, and
I by a common factor α, where 0 < α < 1, such as α = 1/2).
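This invariance can be confirmed with exact arithmetic. The sketch below uses Python's standard `fractions` module to avoid rounding; the starting prices and income are taken from the earlier example, while the scaling factors 2, 3, and 1/2 follow the cases mentioned in the text. The function name is hypothetical.

```python
from fractions import Fraction

def budget_line(px, py, income):
    """Exact intercepts and slope of the budget line px*x + py*y = income."""
    px, py, income = Fraction(px), Fraction(py), Fraction(income)
    return {
        "vertical_intercept": income / py,
        "horizontal_intercept": income / px,
        "slope": -px / py,
    }

original = budget_line(10, 20, 400)

# Multiply both prices and income by a common factor and compare.
for factor in (2, 3, Fraction(1, 2)):
    scaled = budget_line(10 * factor, 20 * factor, 400 * factor)
    print(factor, scaled == original)
```

Each comparison prints True: the common factor cancels in both intercepts and in the slope.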
3.3 Utility Maximization Problem
After using the budget line to describe which bundles the consumer can afford, we are ready
to present the process by which the consumer chooses utility-maximizing bundles. In par-
ticular, the consumer chooses bundles that maximize her utility among all those that she can
afford. Figure 3.4 illustrates this idea by superimposing indifference curves, which represent
the utility that the consumer obtains from different bundles, on top of her budget line, which
depicts her affordable bundles.
We can test whether points A–D in this graph are utility-maximizing. First, point C cannot
be optimal because, although the consumer reaches utility level u1 , and exhausts her income
because px x + py y = I, she could find other bundles, such as A, where she still spends her
income and obtains a higher utility u2 , where u2 > u1 . Bundles like B cannot be optimal
either because, despite spending all her income, the consumer reaches a lower utility level
than at bundle A. Finally, bundles such as D, lying strictly above the budget line, cannot
be optimal either. Despite yielding a higher utility level than A, they are unaffordable and
thus violate the budget constraint. As a consequence, only bundles such as A, where the
budget line and the indifference curves are tangent to each other, can be optimal for the
consumer.
Figure 3.4
Utility maximization problem.
Same “bang for the buck.” This tangency condition requires that the slope of the budget
line at bundle A, \(\frac{p_x}{p_y}\), is equal to the slope of the indifference curve, \(MRS = \frac{MU_x}{MU_y}\).3 That is,
utility-maximizing bundles must satisfy
\[\frac{MU_x}{MU_y} = \frac{p_x}{p_y}, \quad \text{or after rearranging} \quad \frac{MU_x}{p_x} = \frac{MU_y}{p_y}.\]
Intuitively, condition \(\frac{MU_x}{p_x} = \frac{MU_y}{p_y}\) states that the marginal utility per dollar spent on the
last unit of good x must be equal to that of good y; or more informally, the bang for the
buck must coincide across all goods. (Appendix A at the end of this chapter proves this
result.)
MUy
Otherwise, if in a bundle we have MUpx > py , the consumer would obtain a larger bang
x
for the buck from x than from y, ultimately providing her with incentives to spend more dol-
MUy
lars in x and fewer in y. Hence, the initial bundle for which MU
px > py cannot be optimal
x
because the consumer has incentives to readjust her consumption bundle. This occurs, how-
ever, at corner solutions where the consumer spends all her income purchasing one good
alone (as discussed in example 3.3 later in this chapter).
We next present a general procedure on how to solve the utility maximization problem
(UMP), and afterward illustrate the use of the procedure with three step-by-step
examples. This procedure applies to relatively general situations, but it does not apply
to utility functions with one or more goods having a negative marginal utility, such as
\(u(x, y) = ax^2 - by\), where \(a, b > 0\), \(MU_x = 2ax\), and \(MU_y = -b\). In this type of utility
function, the consumer reduces her purchases of the good with a negative marginal utility to
zero (in the previous example, \(y = 0\)) and buys as many units as possible of the other good
(i.e., \(x = I/p_x\)).
3. Recall from chapter 2 (section 2.5) that \(MU_x\) denotes the marginal utility of good x, \(MU_x = \partial u(x,y)/\partial x\), while \(MU_y\)
represents the marginal utility of good y, \(MU_y = \partial u(x,y)/\partial y\).
Tool 3.1. Procedure to solve the UMP

1. Set the tangency condition as \(MU_x/MU_y = p_x/p_y\). Cross-multiply and simplify.
2. If the expression found from the tangency condition:
a. Contains both unknowns (good x and y), then solve for x, and insert the resulting
expression into the budget line \(p_x x + p_y y = I\).
b. Contains only one unknown (good x or y, but not both), then solve for that unknown.
Afterward, insert your result into the budget line \(p_x x + p_y y = I\) to obtain the remaining
unknown.
c. Contains no good x or y, then compare \(MU_x/p_x\) against \(MU_y/p_y\). If \(MU_x/p_x > MU_y/p_y\), then set
good y = 0 in the budget line and solve for good x. (You found a corner solution
where the consumer purchases only good x.) If, instead, \(MU_x/p_x < MU_y/p_y\), then set x = 0 in
the budget line and solve for good y. (In this case, you found a corner solution where
she purchases only good y.)
3. If, in step 2, you find that one of the goods is consumed in negative amounts (e.g.,
x = −2), then set the amount of this good equal to zero on the budget line (e.g.,
\(p_x \cdot 0 + p_y y = I\)), and solve for the remaining good.
4. If you haven’t found the values for all the unknowns (goods x and y) yet, use the tangency
condition from step 1 to find the remaining unknown.
Example 3.1: UMP with interior solutions–I Consider an individual with a Cobb-
Douglas utility function u(x, y) = xy, facing market prices px = $20 and py = $40, and
income I = $800.
Step 1. Let us use the tangency condition to find the optimal consumption bundle
of this consumer. In this case, \(MU_x/MU_y = y/x\), which implies that the tangency condition
\(MU_x/MU_y = p_x/p_y\) becomes \(y/x = 20/40\). Simplifying, we find \(y/x = 1/2\), or \(2y = x\). This result contains
both x and y, so we can now move on to step 2a, ignoring steps 2b and 2c.
Step 2a. From the budget line, we have that \(20x + 40y = 800\) (see footnote 4). Inserting \(2y = x\) into
the budget line, we obtain
\[ 20\underbrace{(2y)}_{x} + 40y = 800, \]
or, rearranging, \(80y = 800\), which yields \(y = 800/80 = 10\) units. Because we found that
the consumer purchases 10 units of good y, we can move on to step 4. (Recall that we
need to stop at step 3 only if, at the end of step 2, we find that x or y is negative.)
Step 4. Lastly, to find the optimal consumption of good x, we use the tangency
condition \(x = 2y = 2 \times 10 = 20\) units.
Summary. The optimal consumption bundle is (20, 10). As confirmation, note that
at bundle (20, 10), the slope of the indifference curve, \(y/x = 10/20\), coincides with that of
the budget line, \(p_x/p_y = 1/2\), because \(10/20 = 1/2\).
4. Mathematically, we have a system of two equations, \(2y = x\) and \(20x + 40y = 800\), and two unknowns, x and y.
Because this example illustrates Tool 3.1, we continue applying step 2a. However, you could directly solve for x
and y in these two equations.
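The arithmetic in example 3.1 can be double-checked with a short Python sketch. The helper name `solve_example_3_1` and the grid check are illustrative assumptions rather than part of the text; the sketch only covers the utility function u(x, y) = xy, for which the tangency condition reduces to px·x = py·y.

```python
from fractions import Fraction

def solve_example_3_1(px, py, income):
    """Interior optimum for u(x, y) = x*y (hypothetical helper, not from the text)."""
    # Step 1: tangency y/x = px/py, so x = (py/px) * y.
    # Step 2a: px*x + py*y = income then becomes 2*py*y = income.
    y = Fraction(income, 2 * py)
    # Step 4: recover x from the tangency condition.
    x = Fraction(py, px) * y
    return x, y

x_star, y_star = solve_example_3_1(20, 40, 800)
print(x_star, y_star)  # 20 10

# Grid check: among whole units of x on the budget line 20x + 40y = 800,
# utility x*y is highest at x = 20.
best_x = max(range(0, 41), key=lambda x: x * Fraction(800 - 20 * x, 40))
```

Exact fractions are used so that the tangency and budget conditions can be compared without rounding error.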
Example 3.2: UMP with interior solutions–II Consider a variation of example 3.1
in which the individual now has a Cobb-Douglas utility function \(u(x, y) = x^{1/3} y^{2/3}\),
facing market prices \(p_x = \$10\) and \(p_y = \$20\), and income \(I = \$100\). We seek to use the
tangency condition \(MU_x/MU_y = p_x/p_y\), but first we must find \(MU_x/MU_y\), as follows:
\[ \frac{MU_x}{MU_y} = \frac{\frac{1}{3} x^{\frac{1}{3}-1} y^{\frac{2}{3}}}{\frac{2}{3} x^{\frac{1}{3}} y^{\frac{2}{3}-1}} = \frac{\frac{1}{3} x^{-\frac{2}{3}} y^{\frac{2}{3}}}{\frac{2}{3} x^{\frac{1}{3}} y^{-\frac{1}{3}}} = \frac{y^{\frac{2}{3}+\frac{1}{3}}}{2x^{\frac{1}{3}+\frac{2}{3}}} = \frac{y}{2x}. \]
Step 1. Plugging this result into the tangency condition \(MU_x/MU_y = p_x/p_y\) yields \(y/(2x) = 10/20\), or
rearranging \(y = x\). This result contains both x and y, so we can move on to step 2a.
Step 2a. Inserting \(y = x\) into the budget line, \(10x + 20y = 100\), we obtain
\[ 10\underbrace{(y)}_{x} + 20y = 100, \]
or, rearranging, \(30y = 100\), which yields \(y = 100/30 \approx 3.33\) units. Because we found that
the consumer purchases a positive amount of good y, we can move on to step 4.
Step 4. The optimal consumption of good x can be found by using the tangency
condition \(x = y \approx 3.33\) units.
Summary. Therefore, the optimal consumption bundle is (3.33, 3.33), where the
consumer purchases the same amount of each good (see footnote 5).
5. As confirmation, note that, at this optimal bundle (3.33, 3.33), the slope of the IC, \(y/(2x) = 3.33/(2 \times 3.33) = 1/2\), coincides
with that of the budget line, \(p_x/p_y = 10/20\), because \(10/20 = 1/2\).
In example 3.2, we can find the budget shares of each good; that is, the percentage
of income that the consumer spends on good x and on good y. In particular, the budget
share of good x is
\[ \frac{p_x x}{I} = \frac{10 \times 3.33}{100} = \frac{1}{3}, \]
and it is similar for the budget share of good y, where
\[ \frac{p_y y}{I} = \frac{20 \times 3.33}{100} = \frac{2}{3}. \]
These shares coincide with the exponent of each good in the consumer’s Cobb-Douglas
utility function, \(u(x, y) = x^{1/3} y^{2/3}\). This is a useful result that generalizes to all
Cobb-Douglas utility functions, \(u(x, y) = Ax^{\alpha} y^{\beta}\), where \(A, \alpha, \beta > 0\): the budget share of
good x is \(\alpha/(\alpha+\beta)\), while that of good y is \(\beta/(\alpha+\beta)\). When the exponents add up to 1
(\(\alpha + \beta = 1\)), as in this example, the budget shares are simply \(\alpha\) and \(\beta\), so we can infer them
immediately just by looking at the exponents of the utility function.
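The budget-share property can be turned into a direct formula for Cobb-Douglas demand. The sketch below is illustrative only: the function name `cobb_douglas_demand` is an assumption, and it relies on the standard result that, for u(x, y) = A·x^α·y^β, the share of income spent on x is α/(α+β) and the share spent on y is β/(α+β) (these equal α and β whenever the exponents sum to 1).

```python
from fractions import Fraction

def cobb_douglas_demand(alpha, beta, px, py, income):
    """Interior optimum for u = A * x**alpha * y**beta (hypothetical helper)."""
    share_x = alpha / (alpha + beta)   # fraction of income spent on good x
    share_y = beta / (alpha + beta)    # fraction of income spent on good y
    return share_x * income / px, share_y * income / py

# Example 3.2: exponents 1/3 and 2/3, px = 10, py = 20, I = 100.
x2, y2 = cobb_douglas_demand(Fraction(1, 3), Fraction(2, 3), 10, 20, 100)

# Example 3.1: u = x*y, so both exponents equal 1 and each share is 1/2.
x1, y1 = cobb_douglas_demand(Fraction(1), Fraction(1), 20, 40, 800)
```

Both calls reproduce the bundles found step by step above: (10/3, 10/3), which is approximately (3.33, 3.33), and (20, 10).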
Self-assessment 3.4 Chelsea has utility function \(u(x, y) = x^{1/2} y^{1/4}\), facing prices
\(p_x = \$3\) and \(p_y = \$2\), and income \(I = \$16\). Using the same steps as in example 3.2,
find Chelsea’s optimal consumption of goods x and y.
Examples 3.1 and 3.2 examined a scenario in which the consumer purchases positive
amounts of all goods (e.g., 20 units of x and 10 of y in example 3.1). However, we can
encounter scenarios in which consumers prefer to consume zero units of one of the goods.
We now explore such a situation.
Example 3.3: UMP with corner solutions Assume that a consumer has the utility
function u(x, y) = xy + 7x, faces market prices px = $1 and py = $2, and an income
I = $10.
Step 1. Using the tangency condition \(MU_x/MU_y = p_x/p_y\), we find that \((y+7)/x = 1/2\), which
collapses to \(2y + 14 = x\). This result contains both x and y, so we can now move on to
step 2a.
Step 2a. From the budget line, we have that \(x + 2y = 10\). Plugging \(2y + 14 = x\) into
the budget line, we obtain
\[ \underbrace{(2y + 14)}_{x} + 2y = 10, \]
or, rearranging, \(4y = -4\), which yields \(y = -1\) units. Using Tool 3.1, we now move
on to step 3.
Step 3. Because the amounts of goods x and y cannot be negative, this result entails
that the individual that we consider would like to reduce her consumption of good y
as much as possible (i.e., \(y = 0\)). We can insert this result into the budget line to obtain
\(x + (2 \times 0) = 10\), or \(x = 10\) units.
Summary. We have thus found a corner solution, where the consumer in this case
uses all her income to purchase good x alone (i.e., \(x = I/p_x = 10/1 = 10\) units). Graphically,
her optimal bundle \((x, y) = (10, 0)\) is located on the horizontal intercept of her
budget line.
Finally, note that at the corner solution, the "bang for the buck" condition
\[ \frac{MU_x}{p_x} = \frac{MU_y}{p_y}, \quad \text{or in this case} \quad \frac{y+7}{1} = \frac{x}{2}, \]
does not hold. Evaluating both sides at the corner solution \((x, y) = (10, 0)\), we obtain \((0+7)/1 > 10/2\)
(we know that the inequality holds with sign > because it simplifies to 7 > 5). As
expected from these results, the marginal utility per dollar spent on good x is larger
than that on good y, thus inducing the individual to increase her consumption of good
x and decrease that of y. Figure 3.5 depicts this result (see footnote 6).
Intuitively, she would like to further decrease her consumption of good y and use the
money saved to buy more units of x, but she can no longer decrease her consumption
of y once she reaches \(y = 0\).
Figure 3.5. Optimal bundle with quasilinear utility. (Budget line BL; indifference curves u1, u2, and u3; optimal bundle (10, 0) on the horizontal axis.)
6. As confirmation, note that, at this optimal consumption bundle (10, 0), the slope of the IC, \(MU_x/MU_y = (y+7)/x\), becomes
\((0+7)/10 = 0.7\), thus being larger than the slope of the budget line, \(p_x/p_y = 1/2 = 0.5\). In other words, the IC passing through bundle
(10, 0) is steeper than the budget line, as depicted in figure 3.5.
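The way Tool 3.1 switches from the tangency condition to a corner solution in example 3.3 can be sketched in Python. The helper `solve_example_3_3` is hypothetical and handles only the utility function u(x, y) = xy + 7x; the second call uses a made-up income of $30, chosen only to show that the same steps return an interior bundle when the tangency solution is nonnegative.

```python
from fractions import Fraction

def solve_example_3_3(px, py, income):
    """UMP for u(x, y) = x*y + 7*x (hypothetical helper following Tool 3.1)."""
    # Step 1: tangency (y + 7)/x = px/py, so px*x = py*(y + 7).
    # Step 2a: substituting into px*x + py*y = income gives 2*py*y = income - 7*py.
    y = Fraction(income - 7 * py, 2 * py)
    if y < 0:
        # Step 3: a negative amount is infeasible, so set y = 0 on the budget line.
        return Fraction(income, px), Fraction(0)
    # Step 4: recover x from the tangency condition.
    x = Fraction(py, px) * (y + 7)
    return x, y

corner = solve_example_3_3(1, 2, 10)     # prices and income from example 3.3
interior = solve_example_3_3(1, 2, 30)   # hypothetical higher income
```

With the numbers of example 3.3, the function returns the corner bundle (10, 0); with the hypothetical income of $30, it returns the interior bundle (22, 4), which exhausts income because 22 + 2 × 4 = 30.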
Self-assessment 3.5 Repeat the analysis of example 3.3, but assume now that
prices change to px = $2 and py = $1. How are the results affected?
3.4 Utility Maximization Problem in Extreme Scenarios
Goods are regarded as perfect substitutes. Corner solutions commonly arise when the
consumer regards goods as perfect substitutes in consumption, such as two brands of
unflavored mineral water (e.g., Dasani and Aquafina). As described in chapter 2 (section 2.8.1),
this utility function has the form \(u(x, y) = ax + by\), where a and b are positive parameters.
In this case, \(MU_x/MU_y = a/b\). Hence, one of the following three cases emerges:
• If \(a/b > p_x/p_y\), the indifference curve is steeper than the budget line, thus producing a corner
solution in which the consumer purchases only units of good x, as in example 3.3. This
can be seen more easily if we represent the condition using the “bang for the
buck” approach,
\[ \frac{a}{p_x} > \frac{b}{p_y}, \]
which indicates that the bang for the buck from good x is larger than that of y, thus implying
that the consumer would like to increase her consumption of good x while decreasing
that of y. (An example of this would be when \(a = b = 1\) and prices are \(p_x = \$1\) and
\(p_y = \$3\).)
• The opposite argument applies if \(a/b < p_x/p_y\), where a corner solution exists in which the
consumer now spends all her income on good y. In this case, the optimal consumption
bundle lies on the vertical intercept of the budget line. (This occurs, for instance, if
\(a = b = 1\) and prices are \(p_x = \$3\) and \(p_y = \$2\), where the inequality becomes \(1/1 = 1 < 1.5 = 3/2\).)
• Lastly, if \(a/b = p_x/p_y\), the slope of the indifference curves and that of the budget line coincide,
yielding a complete overlap between an indifference curve and the budget line. In this
case, tangency occurs at all the points of the budget line, implying that all the points are
optimal consumption bundles. Formally, in this case we say that a continuum of solutions
exists, because any bundle (x, y) satisfying \(p_x x + p_y y = I\) is utility maximizing.
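The three cases for perfect substitutes can be summarized in a small Python sketch that compares the bang for the buck of each good. The function name is an assumption, and the income of $12 in the usage lines is hypothetical, since the text gives only the parameters a and b and the prices for these cases.

```python
from fractions import Fraction

def perfect_substitutes_demand(a, b, px, py, income):
    """Optimal bundle for u(x, y) = a*x + b*y (hypothetical helper).

    Returns (x, y) at a corner solution, or None when every bundle on the
    budget line is optimal (a continuum of solutions).
    """
    bang_x = Fraction(a, px)   # marginal utility per dollar spent on x
    bang_y = Fraction(b, py)   # marginal utility per dollar spent on y
    if bang_x > bang_y:
        return Fraction(income, px), Fraction(0)   # buy only good x
    if bang_x < bang_y:
        return Fraction(0), Fraction(income, py)   # buy only good y
    return None                                    # a/b = px/py

only_x = perfect_substitutes_demand(1, 1, 1, 3, 12)     # a/b = 1 > 1/3
only_y = perfect_substitutes_demand(1, 1, 3, 2, 12)     # a/b = 1 < 3/2
continuum = perfect_substitutes_demand(1, 2, 1, 2, 12)  # a/b = 1/2 = px/py
```

Returning `None` in the third case is a design choice of this sketch: no single bundle is singled out, because any point on the budget line is utility maximizing.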
Self-assessment 3.6 Eric’s utility function is \(u(x, y) = 3x + 4y\), and he faces prices
\(p_x = \$1\) and \(p_y = \$2.5\) and income \(I = \$23\). Comparing his \(MRS_{x,y}\) and the price ratio,
find his optimal consumption of goods x and y.
Goods are regarded as perfect complements. When, in contrast, the consumer regards
goods as perfect complements, such as cars and gasoline, her utility function takes the form
u(x, y) = A min{ax, by}, where A, a, and b are all positive parameters. Chapter 2 (section
2.8.2) presented this utility function, discussing that its indifference curves are L-shaped
and have a kink at a ray from the origin with slope a/b. In addition, we described that the
MRS of this function is undefined, because the kink would admit any slope. As a consequence,
we cannot use the tangency condition \(MRS = p_x/p_y\), given that we cannot guarantee
that the MRS takes a specific number.
Optimal bundles in this context, therefore, require us to identify bundles for which
we cannot increase the consumer’s utility level, given her budget constraint. This occurs,
in particular, when she consumes the bundle at the kink of her indifference curve
where it intersects her budget line. Mathematically, that requires \(ax = by\) for the bundle
to be at the kink or, after rearranging, \(y = \frac{a}{b}x\); and \(p_x x + p_y y = I\) for the bundle to
be on the budget line. Hence, we have a system of two equations, \(y = \frac{a}{b}x\) and
\(p_x x + p_y y = I\), and two unknowns, goods x and y. Inserting \(y = \frac{a}{b}x\) into the budget line, we
obtain
\[ p_x x + p_y \underbrace{\frac{a}{b}x}_{y} = I, \]
which, solving for x, yields \(x = \frac{I}{p_x + p_y \frac{a}{b}} = \frac{bI}{bp_x + ap_y}\). Therefore, the optimal amount of y
becomes
\[ y = \frac{a}{b}\,\underbrace{\frac{bI}{bp_x + ap_y}}_{x} = \frac{aI}{bp_x + ap_y}. \]
For instance, if \(a = b = 2\) (which occurs when the individual needs to consume the same
amount of each good), prices are \(p_x = \$10\) and \(p_y = \$20\), and her income is \(I = \$100\), the
optimal consumption of good x is
\[ x = \frac{bI}{bp_x + ap_y} = \frac{2 \times 100}{(2 \times 10) + (2 \times 20)} = \frac{10}{3} \text{ units}, \]
and it is similar for good y, where \(y = \frac{aI}{bp_x + ap_y} = \frac{2 \times 100}{(2 \times 10) + (2 \times 20)} = \frac{10}{3}\) units.
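The closed-form solution for perfect complements is easy to evaluate and verify numerically. The following Python sketch uses a hypothetical helper name and simply codes the two expressions x = bI/(b·px + a·py) and y = aI/(b·px + a·py) derived above, then checks that the bundle lies at the kink and on the budget line.

```python
from fractions import Fraction

def perfect_complements_demand(a, b, px, py, income):
    """Optimal bundle for u(x, y) = A*min(a*x, b*y) (hypothetical helper).

    The scale parameter A > 0 does not affect the optimal bundle, so it is omitted.
    """
    denom = b * px + a * py
    x = Fraction(b * income, denom)
    y = Fraction(a * income, denom)
    return x, y

# Numerical illustration from the text: a = b = 2, px = 10, py = 20, I = 100.
xc, yc = perfect_complements_demand(2, 2, 10, 20, 100)
at_kink = (2 * xc == 2 * yc)                # a*x = b*y
on_budget = (10 * xc + 20 * yc == 100)      # px*x + py*y = I
```

Both checks evaluate to `True` at the bundle (10/3, 10/3).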
Self-assessment 3.7 John’s utility function is u(x, y) = 5 min{2x, 3y} and he faces
prices px = $1 and py = $2 and income I = $100. Using the previous argument, find
his optimal consumption of goods x and y.
3.5 Revealed Preference
In previous sections, we analyzed how to find optimal consumption bundles, assuming that we
could observe the consumer’s preferences represented with her utility function. But what if we
only know which choices she made when facing different combinations of prices and income?
Can we still say whether an individual made optimal consumption choices? The answer to this
question is yes, thanks to the Weak Axiom of Revealed Preference (WARP). Before we state
this axiom, let bundle \(A = (x_A, y_A)\) be the optimal consumption bundle that the individual
selects when facing initial prices and income \((p_x, p_y, I)\) and, similarly, let bundle \(B = (x_B, y_B)\)
be her optimal consumption bundle when facing final prices and income \((p'_x, p'_y, I')\).
Weak Axiom of Revealed Preference (WARP) If optimal consumption bundles A
and B are both affordable under initial prices and income \((p_x, p_y, I)\), then bundle A
cannot be affordable under final prices and income \((p'_x, p'_y, I')\). That is,
\[ \text{if } p_x x_A + p_y y_A \le I \text{ and } p_x x_B + p_y y_B \le I, \text{ then } p'_x x_A + p'_y y_A > I'. \]
Intuitively, if both bundles A and B are initially affordable, and the consumer selects A
as optimal, she is “revealing” a preference for bundle A over B. WARP then requires that
bundle A is not affordable under final prices and income \((p'_x, p'_y, I')\); otherwise, the consumer
should still select the original bundle A rather than B.
Hence, WARP can be interpreted as a consistency requirement in an individual’s choices
when facing different prices and incomes: if she chooses bundle A when other bundles are
affordable, she should keep choosing bundle A if it is still affordable. If, instead, she chooses
bundle B when facing new prices and income, it must be that the original bundle A is no
longer affordable.
We next provide a tool to test for WARP.
Tool 3.2. Checking for WARP. Let us follow this two-step procedure:
1. Checking the premise. Check if bundles A and B lie on or below the initial budget line
\(BL\), which represents initial prices and income \((p_x, p_y, I)\). That is, make sure that both
bundles are initially affordable.
1a. If step 1 holds, move to step 2.
1b. If step 1 does not hold, then stop. We can only claim that the individual choices do
not violate WARP (see footnote 7).
2. Checking the conclusion. Check if bundle A lies strictly above the final budget line \(BL'\),
which represents final prices and income \((p'_x, p'_y, I')\). That is, check that bundle A is no
longer affordable.
2a. If step 2 holds, then WARP is satisfied.
2b. If step 2 does not hold, then WARP is violated.
7. In this case, the premise of WARP does not hold, which means that we cannot claim that WARP is satisfied
or violated. We can only claim that WARP is satisfied if steps 1 and 2 hold, and we can only claim that WARP is
violated if step 1 holds but 2 does not.
Hence, if step 1 holds, the premise of WARP is satisfied, and we can move on to check
its conclusion, as stated in step 2. In summary, WARP is either: (i) satisfied if steps 1 and
2 hold; (ii) violated if step 1 holds but 2 does not; or (iii) not violated if step 1 does not
hold. Example 3.4 illustrates several consumer choices: some satisfying and some violating
WARP.
Example 3.4: Testing for WARP Figures 3.6a–d represent the same change in the
budget line, from \(BL\) to \(BL'\). This change may be due to a simultaneous decrease in
\(p_x\) (so \(BL'\) becomes flatter than \(BL\)) and an income reduction (shifting the budget line
closer to the origin). For instance, the initial budget line \(BL\) could depict a situation
where \(p_x = p_y = \$2\) and \(I = \$10\), whereas the final budget line \(BL'\) illustrates the case
where the price of good x decreases to \(p'_x = \$1\) and income decreases to \(I' = \$7\), leaving \(p_y\) unchanged.
In this context, the vertical intercept of the budget line decreases from \(I/p_y = 10/2 = 5\)
units to \(I'/p_y = 7/2 = 3.5\) units, and the horizontal intercept increases from \(I/p_x = 10/2 = 5\)
units to \(I'/p'_x = 7/1 = 7\) units.
Figure 3.6a depicts a scenario where WARP is satisfied. To see this, we start by noting
that step 1 holds because bundle A lies on the initial budget line \(BL\), while bundle
B lies strictly below \(BL\), thus implying that both bundles are affordable under initial
prices and income. We can then move on to step 2, and notice that bundle A lies
strictly above the final budget line \(BL'\), making this bundle unaffordable under the
final prices and income. As a consequence, WARP is satisfied. Figure 3.6b, however,
depicts choices that violate WARP. To see this, first note that the premise of WARP, as
stated in step 1, holds because bundle A lies on the initial budget line \(BL\) and bundle B
lies strictly below \(BL\). However, step 2 does not hold because bundle A lies below the
final budget line \(BL'\), making A affordable under final prices and income. Therefore,
the consumer is not consistent in her choices given that both bundles A and B are
affordable under \(BL\) and \(BL'\), but she changes her choices in each budget set. In this
scenario, WARP is violated.
Figure 3.6c illustrates a situation in which WARP is not violated. Indeed, step 1 in
the procedure to test WARP does not hold because, while bundle A lies on the initial
budget line \(BL\), bundle B lies strictly above \(BL\), making the latter unaffordable under
the initial prices and income. Because step 1 does not hold, the premise of WARP
does not hold either, implying that WARP is not violated. A similar argument applies
to figure 3.6d.
Figure 3.6. (a) WARP holds. (b) WARP is violated. (c) WARP is not violated. (d) WARP is not violated.
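The two-step procedure of Tool 3.2 can be sketched as a short Python function. The function name and the bundle coordinates below are hypothetical (the figures do not report numerical bundles); only the prices and incomes, (px, py, I) = (2, 2, 10) initially and (1, 2, 7) in the final situation, are taken from example 3.4.

```python
def check_warp(bundle_a, bundle_b, initial, final):
    """Apply Tool 3.2. `initial` and `final` are (px, py, income) tuples.

    Returns "satisfied", "violated", or "not violated" (premise fails).
    """
    def cost(bundle, prices):
        return prices[0] * bundle[0] + prices[1] * bundle[1]

    # Step 1: are A and B both affordable under initial prices and income?
    premise = cost(bundle_a, initial) <= initial[2] and cost(bundle_b, initial) <= initial[2]
    if not premise:
        return "not violated"
    # Step 2: is A strictly unaffordable under final prices and income?
    if cost(bundle_a, final) > final[2]:
        return "satisfied"
    return "violated"

initial, final = (2, 2, 10), (1, 2, 7)
case_a = check_warp((1, 4), (3, 1), initial, final)  # A costs 9 > 7 under final prices
case_b = check_warp((4, 1), (1, 2), initial, final)  # A costs 6 <= 7 under final prices
case_c = check_warp((4, 1), (5, 1), initial, final)  # B costs 12 > 10 under initial prices
```

These three hypothetical cases mirror the logic of figures 3.6a, 3.6b, and 3.6c, returning "satisfied", "violated", and "not violated", respectively.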
Self-assessment 3.8 Consider figures 3.6a–3.6d again. Assume for each figure
that bundle A lies at the crossing point between budget lines \(BL\) and \(BL'\). How are the
results of example 3.4 affected? What if B is the bundle lying at the crossing point
between \(BL\) and \(BL'\)?
Exercise 3.14, at the end of the chapter, provides a numerical example where you can
apply the above procedure to check for WARP.
3.6 Kinked Budget Lines
In previous sections of this chapter, we considered a linear budget line, which assumes
that consumers face a constant price for goods x and y, regardless of how many units they
purchase. We now examine budget lines that counter this assumption.
3.6.1 Quantity Discounts

Sellers often offer quantity discounts that make the first few units (such as the first 2 units)
more expensive than each unit afterwards. Formally, this means that the consumer faces a
price \(p_x\) for all units of good x between 0 and \(\bar{x}\) (i.e., for all \(x \le \bar{x}\)), but a lower price \(p'_x\),
where \(p'_x < p_x\), for each subsequent unit (i.e., for all \(x > \bar{x}\)). Figure 3.7 depicts this budget
line, which originates at \(I/p_y\) and decreases at a price ratio of \(p_x/p_y\) for all \(x \le \bar{x}\), as in the standard
budget lines in this chapter. However, when the consumer purchases more than \(\bar{x}\) units, she
benefits from the quantity discount, lowering the price of x to \(p'_x\), which decreases the price
ratio from \(p_x/p_y\) to \(p'_x/p_y\). Graphically, this lower price ratio entails a flatter budget line for all units
to the right side of \(\bar{x}\).
Mathematically, the equation of the budget line in this scenario is
\[ y = \underbrace{\frac{I}{p_y}}_{\text{Vertical intercept}} - \underbrace{\frac{p_x}{p_y}}_{\text{Slope}}\, x \quad \text{for all } x \le \bar{x}, \]
which coincides with the budget line in section 3.2 for all units of x to the left of the kink in
figure 3.7, \(x \le \bar{x}\), originating at \(I/p_y\), having a slope of \(-p_x/p_y\), and crossing the horizontal axis at
\(I/p_x\) if extended beyond the kink. However, for all units to the right side of the kink, \(x > \bar{x}\), the equation of the budget line is
\[ y = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\,\bar{x} - \frac{p'_x}{p_y}\, x \quad \text{for all } x > \bar{x}. \]
Figure 3.7. Budget line with quantity discounts. (Solid kinked line BL with slope \(-p_x/p_y\) to the left of \(\bar{x}\) and slope \(-p'_x/p_y\) to its right.)
Relative to the equation with no price discounts, this expression differs in two ways. First,
it is flatter, because its slope is \(p'_x/p_y\) rather than \(p_x/p_y\), where \(p'_x/p_y < p_x/p_y\), as depicted in figure 3.7.
Second, it originates at a lower vertical intercept, \(\frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\bar{x}\) rather than \(\frac{I}{p_y}\) (see the vertical
intercepts in figure 3.7) (see footnote 8).
Figure 3.7 also helps us understand the effect of a large or small price discount. A large
price discount makes the difference \(p_x - p'_x\) larger, shifting the vertical intercept downward
and flattening the right segment of the budget line. In contrast, a small price discount produces
a small difference \(p_x - p'_x\), pushing the vertical intercept upward (closer to that of the
original budget line with no discounts, \(I/p_y\)) and steepening the right segment of the budget
line (so that it is almost as steep as the left segment) (see footnote 9).
8. The horizontal intercept of this line can be found by setting \(y = 0\), which yields \(0 = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\bar{x} - \frac{p'_x}{p_y}x\)
or, after rearranging, \(\frac{p'_x}{p_y}x = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\bar{x}\). Solving for x, we obtain a horizontal intercept of \(x = \frac{I}{p'_x} - \frac{p_x - p'_x}{p'_x}\bar{x}\), as
depicted in figure 3.7.
9. As a remark, note that the equations of both segments of the budget line (to the left and the right
of \(\bar{x}\)) coincide if the seller offers no price discount (that is, \(p'_x = p_x\)). In this case, equation
\(y = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\bar{x} - \frac{p'_x}{p_y}x\) becomes
\(y = \frac{I}{p_y} - \frac{p_x - p_x}{p_y}\bar{x} - \frac{p_x}{p_y}x\), which simplifies to \(y = \frac{I}{p_y} - \frac{p_x}{p_y}x\).
Example 3.5: Quantity discounts Eric has an income of $100 to purchase video
games (good x) and food (good y). The price of food is py = $5, regardless of how
many units he buys, while that of video games is px = $4 for the first two units, but
\(p'_x = \$1\) for the third unit and beyond. Because the cutoff here is at \(\bar{x} = 2\) units, Eric’s
budget line is then
\[ y = \frac{100}{5} - \frac{4}{5}x = 20 - \frac{4}{5}x \quad \text{for all } x \le 2, \text{ and} \]
\[ y = \frac{100}{5} - \frac{4-1}{5}\,2 - \frac{1}{5}x = 20 - \frac{3}{5}\,2 - \frac{1}{5}x = \frac{94}{5} - \frac{1}{5}x \quad \text{for all } x > 2. \]
Graphically, Eric’s budget line originates at \(I/p_y = 100/5 = 20\) units on the vertical axis
and decreases at a rate of \(-p_x/p_y = -4/5 = -0.8\) for the first two units. For all units \(x > 2\),
however, his budget line originates at \(y = 94/5 = 18.8\) units, has a slope of \(-p'_x/p_y = -1/5 = -0.2\),
thus becoming flatter, and crosses the horizontal axis at
\(x = \frac{I}{p'_x} - \frac{p_x - p'_x}{p'_x}\bar{x} = \frac{100}{1} - \frac{(4-1)}{1} \times 2 = 100 - 6 = 94\) units.
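The kinked budget line with a quantity discount can be written as a piecewise function. The Python sketch below is a hypothetical helper (the name and argument order are assumptions) that evaluates both segments of the budget line and checks Eric's numbers from example 3.5.

```python
from fractions import Fraction

def budget_line_with_discount(x, income, px, px_discount, py, cutoff):
    """Units of y affordable at x when units beyond `cutoff` cost px_discount < px."""
    if x <= cutoff:
        # Left segment: y = I/py - (px/py) * x
        return Fraction(income - px * x, py)
    # Right segment: y = I/py - ((px - px')/py) * cutoff - (px'/py) * x
    return Fraction(income - (px - px_discount) * cutoff - px_discount * x, py)

# Example 3.5: I = 100, px = 4 for the first two units, px' = 1 afterward, py = 5.
y_at_0 = budget_line_with_discount(0, 100, 4, 1, 5, 2)    # vertical intercept
y_at_2 = budget_line_with_discount(2, 100, 4, 1, 5, 2)    # height at the kink
y_at_3 = budget_line_with_discount(3, 100, 4, 1, 5, 2)    # first discounted unit
y_at_94 = budget_line_with_discount(94, 100, 4, 1, 5, 2)  # horizontal intercept
```

The values are 20, 92/5, 91/5, and 0, respectively, so the line drops by 4/5 per unit before the kink and by 1/5 per unit after it.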
3.6.2 Introducing Coupons

Consider a market where the government offers coupons that let consumers purchase the
first \(\bar{x}\) units of good x for free. Figure 3.8 depicts the budget line in this situation, \(BL_C\),
where subscript C denotes “coupons.” The budget line is flat for all units between 0 and
\(\bar{x}\) and then decreases at the usual price ratio \(p_x/p_y\), having a kink at exactly the number
of units of good x that the consumer enjoys for free, \(\bar{x}\). For comparison purposes, the
figure also includes a budget line without coupons, \(BL_{NC}\), where the subscript NC denotes
“no coupons.” Intuitively, the coupons expand the set of bundles that the consumer can
afford.
Mathematically, the kinked budget line \(BL_C\) in figure 3.8 can be expressed as
\[ BL_C = \begin{cases} p_y y = I & \text{for all } x < \bar{x}, \text{ and} \\ p_x (x - \bar{x}) + p_y y = I & \text{for all } x \ge \bar{x}. \end{cases} \]
This condition says that the equation of \(BL_C\) is \(p_y y = I\) to the left of the kink, \(x < \bar{x}\),
because, in this case, the consumer effectively faces a price of zero for good x, \(p_x = \$0\),
thanks to the coupons. For bundles to the right of the kink, \(x \ge \bar{x}\), the budget line becomes
\(p_x (x - \bar{x}) + p_y y = I\), because the consumer exhausted all coupons at that point and faces
market prices \(p_x\) and \(p_y\). Solving for y, we can also represent budget line \(BL_C\) as \(y = I/p_y\) for
all \(x < \bar{x}\), and as \(y = \frac{I}{p_y} - \frac{p_x}{p_y}(x - \bar{x})\) for all \(x \ge \bar{x}\), which can be written alternatively as
\[ y = \left( \frac{I}{p_y} + \frac{p_x}{p_y}\bar{x} \right) - \frac{p_x}{p_y}x. \]
Figure 3.8. Budget lines with and without coupons. (Solid kinked line \(BL_C\) and line \(BL_{NC}\), both with slope \(-p_x/p_y\) on their downward-sloping segments.)
Graphically, the term in parentheses represents the vertical intercept of \(BL_C\), as depicted
in figure 3.8 (see dashed line). We can then use this expression to find the horizontal intercept
of \(BL_C\). Setting \(y = 0\) yields \(0 = \frac{I}{p_y} + \frac{p_x}{p_y}\bar{x} - \frac{p_x}{p_y}x\), which simplifies to \(\frac{p_x}{p_y}x = \frac{I}{p_y} + \frac{p_x}{p_y}\bar{x}\) and,
solving for x, we find the horizontal intercept \(x = \frac{I}{p_x} + \bar{x}\), as figure 3.8 indicates.
Example 3.6: Coupons John’s income is $100, the price of electricity is px = $1,
and that of bikes is py = $4. Assume that a government agency distributes coupons
for the first 200 kWh per month, making them free. Because \(\bar{x} = 200\), John’s budget
line \(BL_C\) is \(y = I/p_y\) for all \(x < 200\), which in this scenario becomes \(y = 100/4 = 25\) units.
This is graphically represented by a horizontal line at a height of \(y = 25\) from \(x = 0\) to
\(x = 200\). For units of x beyond \(x = 200\), however, John’s budget line is
\[ y = \left( \frac{I}{p_y} + \frac{p_x}{p_y}\bar{x} \right) - \frac{p_x}{p_y}x = \left( \frac{100}{4} + \frac{1}{4}\,200 \right) - \frac{1}{4}x = 75 - \frac{1}{4}x. \]
Graphically, this means that the dashed segment in figure 3.8 originates at \(y = 75\),
decreases at a rate of \(1/4\), and hits the horizontal axis at \(x = \frac{I}{p_x} + \bar{x} = \frac{100}{1} + 200 = 300\) units.
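As with quantity discounts, the budget line with coupons is a piecewise function of x. The following Python sketch (a hypothetical helper, with names chosen for illustration) reproduces John's budget line from example 3.6.

```python
from fractions import Fraction

def budget_line_with_coupons(x, income, px, py, free_units):
    """Units of y affordable at x when the first `free_units` of x are free."""
    if x < free_units:
        # Flat segment: good x costs nothing, so all income goes to good y.
        return Fraction(income, py)
    # Downward-sloping segment: only units beyond the coupons are paid for.
    return Fraction(income - px * (x - free_units), py)

# Example 3.6: I = 100, px = 1, py = 4, coupons cover the first 200 kWh.
y_flat = budget_line_with_coupons(100, 100, 1, 4, 200)   # on the flat segment
y_kink = budget_line_with_coupons(200, 100, 1, 4, 200)   # at the kink
y_240 = budget_line_with_coupons(240, 100, 1, 4, 200)    # 75 - 240/4
y_300 = budget_line_with_coupons(300, 100, 1, 4, 200)    # horizontal intercept
```

The values are 25, 25, 15, and 0, which match y = 25 on the flat segment and y = 75 − x/4 to the right of the kink.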
Appendix A. Applying the Lagrange Method to Solve the Utility Maximization Problem
In the presentation of the UMP in this chapter, we used the tangency condition \(MU_x/MU_y = p_x/p_y\) to
find optimal consumption bundles. However, we never formally showed that this condition
must be satisfied at the optimum of the UMP. We now demonstrate this result. First, note
that the UMP can be expressed as
\[ \max_{x,y} \; u(x, y) \quad \text{subject to } p_x x + p_y y = I. \]
Therefore, the consumer chooses the bundle (x, y) that maximizes her utility function
u(x, y) subject to the budget line. As described previously, we use the budget line
\(p_x x + p_y y = I\), rather than the budget constraint \(p_x x + p_y y \le I\), because the consumer will
always spend all of her available income (see footnote 10). Hence, the consumer faces a “constrained
maximization problem,” in which her constraint is \(p_x x + p_y y = I\) (i.e., choosing a point along the
budget line), and the objective function that she seeks to maximize is her utility function.
Constrained maximization problems are often solved by setting up a Lagrangian function,
which in this UMP is
\[ \mathcal{L}(x, y; \lambda) = u(x, y) + \lambda \left( I - p_x x - p_y y \right), \]
where \(\lambda\) represents the Lagrange multiplier, which multiplies the budget line. To solve this
problem, we take first-order conditions with respect to x, y, and \(\lambda\), which yields
\[ \frac{\partial \mathcal{L}}{\partial x} = MU_x - \lambda p_x = 0, \]
\[ \frac{\partial \mathcal{L}}{\partial y} = MU_y - \lambda p_y = 0, \text{ and} \]
\[ \frac{\partial \mathcal{L}}{\partial \lambda} = I - p_x x - p_y y = 0. \]
The first condition can be rearranged to \(MU_x/p_x = \lambda\); and, similarly, the second condition can
be expressed as \(MU_y/p_y = \lambda\). Because both ratios are equal to \(\lambda\), we obtain
\[ \frac{MU_x}{p_x} = \lambda = \frac{MU_y}{p_y}. \]
This is the “bang for the buck” coinciding across goods, as described in section 3.3.
Alternatively, this condition can be expressed as \(MU_x/MU_y = p_x/p_y\), which coincides with the
tangency condition used in the previous analysis (i.e., at the optimum, the slope of the
indifference curve coincides with the slope of the budget line).
10. This happens even if some of the goods are regarded as bads for the consumer. In that case, she would spend all
her income I on the other good, still leaving no money unspent. In other words, even if a corner solution emerges
as the solution of the UMP, the consumer spends all her money. This was the case, for instance, in example 3.3,
where the consumer’s utility function was quasilinear and a corner solution emerged.
Figure 3.9. The EMP. (Budget lines BL1, BL2, and BL3; indifference curves \(\bar{u}\) and u1, where u1 > \(\bar{u}\); bundles A, B, C, and D; an arrow indicates the direction of cheaper budgets.)
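As a numerical illustration of the first-order conditions, the Python sketch below evaluates them at the bundle found in example 3.1, where u(x, y) = xy, so that MUx = y and MUy = x. The helper name is hypothetical, and the multiplier value λ = 1/2 is obtained here from MUx/px = 10/20 rather than stated in the text.

```python
from fractions import Fraction

def lagrangian_focs(x, y, lam, px, py, income):
    """First-order conditions of L = x*y + lam*(I - px*x - py*y) (hypothetical helper)."""
    mu_x, mu_y = y, x                     # marginal utilities of u(x, y) = x*y
    return (mu_x - lam * px,              # dL/dx
            mu_y - lam * py,              # dL/dy
            income - px * x - py * y)     # dL/dlambda (the budget line)

# Bundle (20, 10) from example 3.1, with lambda = MUx/px = 10/20 = 1/2.
focs = lagrangian_focs(Fraction(20), Fraction(10), Fraction(1, 2), 20, 40, 800)
```

All three conditions evaluate to zero, and the same multiplier also equals MUy/py = 20/40, so the bang for the buck coincides across goods at this bundle.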
Appendix B. Expenditure Minimization Problem
The UMP considers a fixed budget and finds which bundle provides the consumer with
the highest utility level. Alternatively, one could approach the consumer’s problem as if she
sought to reach a minimal utility level (i.e., a target utility), but wanted to do that by spending
the least amount of money. In other words, the consumer alternatively could minimize her
expenditure while reaching a fixed utility level. This is the approach that the expenditure
minimization problem (EMP) follows, which we describe next.
As figure 3.9 depicts, the EMP can be graphically understood as the consumer seeking to
reach an indifference curve with a target utility level \(\bar{u}\), but shifting her budget line as close
to the origin as possible (because lower income levels shift the budget line downward).
Bundles like B or C cannot be optimal because the consumer, despite reaching the target
utility level \(\bar{u}\), spends more income than at other bundles, such as A, given that budget line
BL2 lies below BL3. Bundle A, in contrast, must be optimal (i.e., expenditure minimizing)
because we cannot find other bundles for which the consumer still reaches the “target” utility
level \(\bar{u}\) at a lower expenditure than BL2. As figure 3.9 illustrates, at the optimal bundle A,
the indifference curve and the budget line are tangent to each other (i.e., their slopes coincide),
thus providing us with the same equilibrium condition, \(MU_x/MU_y = p_x/p_y\), as when solving the
UMP. Finally, note that the consumer’s constraint in this setting becomes \(u(x, y) = \bar{u}\), rather
than \(u(x, y) \ge \bar{u}\), because she would never choose bundles satisfying \(u(x, y) > \bar{u}\). Intuitively,
bundles providing the consumer with more than the minimal target utility \(\bar{u}\), such as D in
figure 3.9, cannot be optimal, since the consumer can find cheaper bundles that reach the
target utility level \(\bar{u}\). These bundles still satisfy the constraint and can be purchased at a
lower cost.
As we did for the UMP, we next offer a step-by-step procedure to find the optimal con-
sumption bundle that solves the EMP, and subsequently illustrate the application of the
procedure with two numerical examples.
Tool 3.3. Procedure to solve the EMP
1. Set the tangency condition \(MU_x/MU_y = p_x/p_y\). Cross-multiply and simplify.
2. If the expression found from the tangency condition:
a. Contains both unknowns (good x and y), then solve for y, and insert the resulting
expression into the utility constraint \(u(x, y) = \bar{u}\).
b. Contains only one unknown (good x or y, but not both), then solve for that unknown.
Afterward, insert your result into the utility constraint \(u(x, y) = \bar{u}\) to obtain the
remaining unknown.
c. Contains no good x or y, then compare \(MU_x/p_x\) against \(MU_y/p_y\). If \(MU_x/p_x > MU_y/p_y\), set good y = 0
in the utility constraint \(u(x, y) = \bar{u}\), and then solve for good x. If, instead, \(MU_x/p_x < MU_y/p_y\),
set x = 0 in the utility constraint \(u(x, y) = \bar{u}\) and then solve for good y.
3. If in step 2 you find that one of the goods is consumed in a negative amount (e.g., x = −2),
then set the amount of this good equal to zero on the utility constraint (e.g., \(u(0, y) = \bar{u}\)),
and solve for the remaining good.
4. If you have not found the values for all the unknowns (goods x and y) yet, use the tangency
condition from step 1 to find the remaining unknown.
The optimal consumption bundle that we find after applying Tool 3.3 (i.e., after solv-
ing the EMP) is usually referred to as “compensated demand”11 to differentiate it from
the consumption bundles we obtain from solving the UMP, known as “uncompensated
demand.”12 Example 3.7 applies Tool 3.3 to a Cobb-Douglas utility function.
11. This demand is also known as “Hicksian demand,” after British economist Sir John Richard Hicks.
12. This demand is also known as “Marshallian demand,” after economist Alfred Marshall, or “Walrasian demand,”
after economist and mathematician Léon Walras.
Example 3.7: EMP with a Cobb-Douglas utility function Consider a Cobb-
Douglas utility function u(x, y) = x^(1/3) y^(2/3), where the consumer faces prices px = $10
and py = $20, and a utility target of u. To find the consumption bundle that solves
the EMP, we apply the tangency condition MUx/MUy = px/py. However, we first need to find
MUx/MUy, as follows:
MUx/MUy = [(1/3) x^(−2/3) y^(2/3)] / [(2/3) x^(1/3) y^(−1/3)] = y/(2x).
We apply the steps in Tool 3.3 next.
Step 1. Tangency condition MUx/MUy = px/py reduces to y/(2x) = 10/20, or y = x. This result
contains both x and y, so we can move on to step 2a.
Step 2a. The constraint u(x, y) = u in this context becomes x^(1/3) y^(2/3) = u. Inserting our
result from step 1, y = x, in the constraint yields
x^(1/3) (x)^(2/3) = u,
where the second factor uses y = x, and, rearranging, x = u. For instance, if the target
utility level is u = 5, the optimal amount of good x becomes x = 5 units.
Because we found a positive amount of good x, we can move on to step 4. (We need
to stop at step 3 only if we obtain negative amounts of either good.)
Step 4. Finally, using the tangency condition, y = x, we find y = u.
Summary. The optimal consumption bundle is x = y = u, consuming the same
amount of each good. For instance, if the consumer seeks to reach a utility level u = 5,
the optimal bundle is (5, 5).
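The result of example 3.7 can be double-checked numerically. The Python sketch below is only an illustration and not part of Tool 3.3: the function names (required_y, expenditure, cheapest_bundle) and the grid search are assumptions introduced here. For each candidate amount of good x, it computes the amount of good y needed to reach the utility target and keeps the cheapest combination.

```python
# Numerical check of example 3.7 (illustrative sketch; all names are hypothetical).
# Utility: u(x, y) = x^(1/3) * y^(2/3); prices px = 10 and py = 20.


def required_y(x, u_target):
    """Amount of y that reaches u_target given x, from x^(1/3) * y^(2/3) = u."""
    return (u_target / x ** (1 / 3)) ** 1.5


def expenditure(x, y, px=10.0, py=20.0):
    """Cost of bundle (x, y) at the given prices."""
    return px * x + py * y


def cheapest_bundle(u_target, px=10.0, py=20.0, step=0.001, x_max=50.0):
    """Grid search over x for the cheapest bundle on the target indifference curve."""
    best = None
    n_points = int(x_max / step)
    for i in range(1, n_points + 1):
        x = i * step
        y = required_y(x, u_target)
        cost = expenditure(x, y, px, py)
        if best is None or cost < best[2]:
            best = (x, y, cost)
    return best


if __name__ == "__main__":
    x, y, cost = cheapest_bundle(5.0)
    print(round(x, 2), round(y, 2), round(cost, 2))
```

For a target of u = 5, the search should land close to x = 5 and y = 5, in line with the tangency result x = y = u. A grid search is used only because it needs no external libraries; it is not an efficient way to solve the EMP in general.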
Self-assessment 3.9 Eric’s utility function is u(x, y) = x^(1/2) y^(1/2), and he faces prices
px = $1 and py = $3 and has a utility target of u = 20. Using the same steps as in
example 3.7, solve Eric’s EMP, finding his optimal purchases of goods x and y.
Example 3.8: EMP with a quasilinear utility Consider the same utility
function from example 3.3, u(x, y) = xy + 7x, facing market prices px = $1 and py =
$2. In addition, assume that the consumer targets a utility level of u = 70 (which is the
utility that the individual achieves when consuming her optimal consumption bundle
at the UMP).
We now apply the steps in Tool 3.3.
Step 1. In this scenario, the tangency condition MUx/MUy = px/py becomes (y + 7)/x = 1/2, which
collapses to 2y + 14 = x. Because this result contains x and y, we move on to step 2a.
Step 2a. Inserting the result from the tangency condition, 2y + 14 = x, into the utility
target xy + 7x = 70, we obtain
(2y + 14)y + 7(2y + 14) = 70,
which simplifies to 2(7 + y)^2 = 70, or (7 + y)^2 = 35. Taking the square roots of both
sides yields 7 + y = √35, and solving for good y, we obtain y = −1.08 units. Because
we found negative units of at least one good, we need to apply step 3 next.
Step 3. As in the UMP, these results indicate that the individual consumes a zero
amount of good y and dedicates all her income to buy only units of good x. As
described in example 3.3, the marginal utility per dollar (its bang for the buck) is
larger from good x than from y, regardless of the amount consumed, which drives her
to purchase only good x. Because y = 0, her utility constraint becomes u(x, 0) = 70,
or x · 0 + 7x = 70, which, solving for x, yields x = 10 units.
Summary. The optimal consumption bundle is x = 10 and y = 0 when the individual
targets a utility level of u = 70. This result, of course, is consistent
with that of example 3.3, where we solved the UMP of this consumer finding the
same optimal bundle.
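The steps of example 3.8 can be traced in a few lines of Python. This is a minimal sketch under the assumptions of the example (utility xy + 7x); the function names utility and solve_emp are hypothetical and introduced here only for illustration.

```python
import math

# Illustrative sketch of steps 1-3 of Tool 3.3 for u(x, y) = xy + 7x.
# Function names are hypothetical; they do not come from the text.


def utility(x, y):
    """Utility function from example 3.8."""
    return x * y + 7 * x


def solve_emp(u_target, px, py):
    """Return the expenditure-minimizing bundle (x, y).

    Step 1: the tangency condition (y + 7)/x = px/py gives x = (py/px) * (y + 7).
    Step 2a: inserting it into xy + 7x = u gives (py/px) * (y + 7)^2 = u.
    Step 3: if y comes out negative, set y = 0 and solve 7x = u instead.
    """
    y = math.sqrt(u_target * px / py) - 7
    if y < 0:
        return u_target / 7, 0.0
    return (py / px) * (y + 7), y


if __name__ == "__main__":
    # Interior candidate from step 2a, before the non-negativity check.
    print(round(math.sqrt(70 * 1 / 2) - 7, 2))
    # Bundle after applying step 3.
    print(solve_emp(70, 1, 2))
```

With u = 70, px = $1, and py = $2, the interior candidate for y is negative, so the function falls back to the corner with y = 0, as in step 3.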
Self-assessment 3.10 John’s utility function is u(x, y) = 4x^(1/2) + 2y, he faces prices
px = $2 and py = $3, and has a utility target of u = 10. Using the same steps as in
example 3.8, solve John’s EMP, finding his optimal purchases of goods x and y.
Relationship between the Utility Maximization Problem and the Expenditure Minimization Problem
After reading example 3.8, you probably found some similarities and differences between
the UMP and the EMP. In both approaches, we start by writing down the tangency condition
MUx/MUy = px/py, because in both the UMP and the EMP we require that the consumer chooses a
bundle where her budget line is tangent to her indifference curve. However, the UMP takes
the result from the tangency condition and inserts it in the constraint of the UMP, the budget
line px x + py y = I, whereas the EMP takes that result and plugs it into its corresponding
constraint, the utility target u(x, y) = u.13 Figure 3.10 depicts the similarities and differences
of these two approaches.
Figure 3.10
Comparing UMP and EMP. (Diagram: in both problems, the tangency condition MRS = px/py is
inserted into the constraint, which is the budget constraint px x + py y = I in the UMP and the
utility constraint u(x, y) = u in the EMP.)
This description of the EMP, as well as examples 3.7–3.8, probably made you interpret
the EMP as the mirror image of the UMP, as both approaches lead us to the same optimal
consumption bundle. (Formally, we say that the EMP is the dual representation of the UMP.)
Indeed, consider a consumer who solves her UMP and finds bundle (x^U, y^U) to be optimal.
In this situation, the utility that she can reach when purchasing bundle (x^U, y^U) is u(x^U, y^U).
In this context, if we ask the consumer to solve her EMP and we require her to reach a target
utility level of exactly u = u(x^U, y^U), the bundle that solves her EMP will coincide with that
of solving her UMP.
We can draw the opposite relationship, but starting now from the EMP. Specifically, let
(x^E, y^E) be the bundle that solves the EMP, and let I^E be the income that the consumer needs
to purchase this optimal bundle (i.e., px x^E + py y^E = I^E). Then, if we ask the consumer
to solve her UMP, giving her an income of I = I^E, the optimal bundle solving her UMP,
(x^U, y^U), coincides with that solving her EMP, (x^E, y^E).
13. For convenience, we solve for y in the tangency condition and insert our result into the y term of the budget
line when we solve the UMP, or in the y term of the utility function when solving the EMP. Alternatively, you can
solve for good x and obtain similar results.
Example 3.9: Dual problems We now return to examples 3.2 and 3.7 to illustrate
these equivalences between the UMP and EMP.
From UMP to EMP. First, recall that solving the UMP in example 3.2, we found
(x^U, y^U) = (3.33, 3.33), which yields a utility level of u = 3.33. If we go to the
EMP in example 3.7 (where the consumer still faces the same utility function and
prices), and ask her to target a utility level of u = 3.33, then her optimal bundle
becomes (x^E, y^E) = (3.33, 3.33) because in example 3.7 we found that x = y = u.
Hence, optimal bundles in the UMP and EMP coincide.
From EMP to UMP. Similarly, consider that we approach the consumer again, giving
her the income that she would need to purchase the optimal bundle found in the EMP
of example 3.7, px x^E + py y^E = $100, as described previously. Solving her UMP, she
obtains (x^U, y^U) = (3.33, 3.33), which coincides with the optimal bundle solving the
EMP.
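The two directions of this duality can be reproduced with a short Python sketch. It assumes the Cobb-Douglas utility u(x, y) = x^(1/3) y^(2/3) and the prices px = $10 and py = $20 of example 3.7, together with an income of $100. The function names ump, emp, and utility are hypothetical, and the closed-form demands inside them are derived here from the tangency condition y/(2x) = px/py rather than quoted from the text.

```python
# Duality check for u(x, y) = x^(1/3) * y^(2/3) (illustrative sketch; names are hypothetical).


def utility(x, y):
    """Cobb-Douglas utility used in example 3.7."""
    return x ** (1 / 3) * y ** (2 / 3)


def ump(income, px, py):
    """UMP demands: tangency gives py*y = 2*px*x, inserted into px*x + py*y = income."""
    return income / (3 * px), 2 * income / (3 * py)


def emp(u_target, px, py):
    """EMP demands: tangency gives y = k*x with k = 2*px/py, inserted into u(x, y) = u_target."""
    k = 2 * px / py
    x = u_target / k ** (2 / 3)
    return x, k * x


if __name__ == "__main__":
    px, py, income = 10.0, 20.0, 100.0

    # From UMP to EMP: target the utility reached at the UMP bundle.
    x_u, y_u = ump(income, px, py)
    x_e, y_e = emp(utility(x_u, y_u), px, py)
    print(round(x_u, 2), round(y_u, 2), round(x_e, 2), round(y_e, 2))

    # From EMP to UMP: the EMP bundle costs exactly the original income.
    print(round(px * x_e + py * y_e, 2))
```

Both problems should return the same bundle, and the expenditure needed in the EMP should equal the income given in the UMP.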
Exercises
1. Budget Line.A Peter has an income of I = $100, which he dedicates to purchasing soda and pizza.
The price of soda is $1 per can, while that of pizza is $2 per slice.
(a) Find the equation of his budget line, and represent it graphically.
(b) How does Peter’s budget set change when his income increases to I = $150?
(c) Consider that the university that Peter attends subsidizes pizza, decreasing its price by $1.
What is Peter’s budget set now?
(d) What if, instead, the university gives Peter 25 vouchers that he can use to get 25 free slices of
pizza?
2. Choosing the Best Deal.B Peter has a monthly income of I = $1,000 to spend on pizza (good x,
with a price of $20 per unit), or other goods (composite good y, whose price is $1). The pizza place
where he always goes announced an attractive offer: “Pay $800 and you will get 30 pizzas for an
entire month (1 pizza per day).”
(a) Assuming that Peter’s preferences are represented with a Cobb-Douglas utility function
u(x, y) = x^(1/3) y^(2/3), will Peter accept the offer?
(b) What if, instead, Peter’s preferences are represented with a linear utility function u(x, y) =
2x + y? Will he accept the offer then?
3. Composite Good.A Consider an individual with utility function u(x, y) = (x + 3)y, and income
I = $30. The price of good x is px = $2, while that of good y is normalized to py = $1 (that is, good
y represents the money left for purchasing all other goods but x, which we refer to as the “numeraire”).
(a) Find the optimal consumption bundle of this individual. Evaluate his utility function at this
optimal bundle.
(b) Assume now that his income was increased by $10 (for a total of I = $40). What is his new
optimal consumption bundle? What is the new utility level that he can reach?
(c) Assume now that the price of good x decreases by $1, so its new price is px = $1. What is his
new optimal consumption bundle? What is the new utility level that he can reach?
(d) Assume that he receives a coupon allowing him to consume 4 units of good x for free. What
is his new optimal consumption bundle? What is the new utility level that he can reach?
(e) In which version of parts b–d is the consumer better off? That is, describe whether the consumer
prefers the change in income from part (b), the change in prices from part (c), or the coupon
from part (d).
4. Checking Statements.A One of your classmates approaches you before an exam, saying that he
figured out how to tackle several questions in microeconomics. For each of the following sentences,
argue if your classmate is wrong and give the reason behind your answer.
(a) Perfect substitutes. “If a consumer regards two goods as substitutes in consumption, he will
always choose the cheaper good.”
(b) Perfect complements. “If a consumer regards two goods as perfect complements in consump-
tion, his compensated demand must be flat.”
(c) Demand and compensated demand. “Demand and compensated demand curves are different
for all types of goods.”
5. Expenditure Function.A Consider that a consumer’s expenditure function is given by
e(px, py, u) = 3 (u px py^2 / 4)^(1/3).
Find the demand for good y, y(px , py , u). [Hint: Because px hx (px , py , u) + py hy (px , py , u) =
e(px , py , u), you may differentiate e(px , py , u) with respect to py to find hy (px , py , u).] The demand
function you find is the one that solves the EMP. In this scenario, however, we find it without need-
ing to solve the EMP. As we have information about expenditure function e(px , py , u), we can find
this demand function more directly by just differentiating e(px , py , u) with respect to py .
6. Quasilinear Utility–I.B John has a monthly income of I = $700 and a quasilinear utility function
of the type u(x, y) = x^(1/2) + 7y. The price of good x is $2, while the price of good y is $3.
(a) Find John’s tangency condition, following step 1 of the utility maximization procedure.
(b) Find John’s equilibrium quantities for goods x and y.
7. Cobb-Douglas–I.A Eric has a monthly income of I = $500 and a Cobb-Douglas utility function
of the type u(x, y) = x^(1/2) y^(1/2). The price for good x is $1, and the price for good y is $3.
(a) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
(b) Find Eric’s equilibrium quantities for goods x and y.
8. Cobb-Douglas–II.A Chelsea has a monthly income of I = $1,000 and a Cobb-Douglas utility
function of the type u(x, y) = x^(0.6) y^(0.4). The price for good x is $3, and the price for good y is
$1.
(a) Which good does Chelsea prefer, x or y? How do you know this?
(b) Find Chelsea’s tangency condition, following step 1 of the utility maximization procedure.
(c) Find Chelsea’s equilibrium quantities for goods x and y.
(d) Compare your results from parts (a) and (c). Are they in line with Chelsea’s preferences? Why
or why not?
9. Cobb-Douglas–III.C Eric has a Cobb-Douglas utility function of the type u(x, y) = x^(1/2) y^(1/2).
Suppose that Eric has a general value for his income, I, and general values for the prices of goods
x and y (px and py , respectively).
(a) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
(b) Find Eric’s equilibrium quantities for goods x and y as a function of I, px , and py .
(c) Compare your results with those in exercise 7 by setting I = $500, px = $1, and py = $3. Are
they identical?
10. Kinks in the Curve.B John has a weekly income of I = $50, which he dedicates to purchasing
soda and other goods (a composite good, whose price is $1). The price of soda is $2 per can.
(a) Find the equation of his budget line, and represent it graphically.
(b) Suppose that the price of soda changes to $2/can for the first 10 cans, and then $1 for each
additional can. Depict this new budget line graphically.
(c) Before the price change in part (b), John purchased 20 cans of soda and 10 units of the
composite good. Describe generally how John’s consumption of soda and the composite good
changes as the price changes.
11. Different Relationships.C Eric has a weekly income of I = $50 and a utility function of the type
u(x, y) = x^(1/2)/(y + 1) = x^(1/2) (y + 1)^(−1). The price for good x is $1, and the price for good y is also $1.
(a) Are x and y both goods? If not, which is a “bad”?
(b) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
(c) Find Eric’s equilibrium quantities for goods x and y.
12. Perfect Substitutes–I.A Chelsea has a monthly income of I = $800 and a utility function of the
type u(x, y) = 3x + 4y. The price for good x is $1, and the price for good y is $2.
(a) Find Chelsea’s tangency condition, following step 1 of the utility maximization procedure.
(b) Find Chelsea’s equilibrium quantities for goods x and y.
(c) Suppose that the price of good y falls to $1. What happens to Chelsea’s equilibrium quantities?
13. Perfect Complements–I.A John has a monthly income of I = $400 and a utility function of the
type u(x, y) = min{2x, y}. The price of good x is $3, while the price of good y is $4.
(a) What is John’s most preferred ratio of consuming good x to good y?
14. WARP.B Eric has a weekly income of I = $40, which he allocates between purchasing goods x
and y. When the price of good x is $4 and the price of good y is also $4, Eric purchases 3 units of
good x and 7 units of good y in equilibrium. Now suppose that the price of good x falls to $2.
(a) Find the equations of his original and new budget lines and represent them graphically.
(b) Suppose that Eric’s new equilibrium bundle is 5 units of good x and 5 units of good y. Does
this new bundle violate WARP? Explain why or why not.
(c) Suppose that Eric’s new equilibrium bundle contains 4 units of good y. How many units of
good x must be consumed such that our equilibrium allocation does not violate WARP?
15. Expenditure Minimization Problem.A Peter wishes to reach a utility level of U = 50 and has a
Cobb-Douglas utility function of the type u(x, y) = x^(0.4) y^(0.6). The price for good x is $1, and the
price for good y is $4.
(a) Find Peter’s tangency condition, following step 1 of the expenditure minimization procedure.
(b) Find Peter’s equilibrium quantities for goods x and y.
(c) How much income does Peter require to reach his target utility level?
16. Perfect Substitutes–II.B Chelsea wishes to reach a utility level of U = 100 and has a utility
function of the type u(x, y) = 5x + 2y. The price for good x is $3, and the price for good y is $1.
(a) Find Chelsea’s tangency condition, following step 1 of the expenditure minimization
procedure.
(b) Find Chelsea’s equilibrium quantities for goods x and y.
(c) How much income does Chelsea require to reach her target utility level?
17. Perfect Complements–II.B John wishes to reach a utility level of U = 75 and has a utility
function of the type u(x, y) = min{3x, 2y}. The price of good x is $2, while the price of good
y is $3.
(a) What is John’s most preferred ratio of consuming good x to good y?
(c) How much income does John require to reach his target utility level?
18. Quasilinear Utility–II.B Eric wishes to reach a utility level of U = 50 and has a quasilinear utility
function of the type u(x, y) = 4x + y^(1/2). The price of good x is $2, while the price of good y is
also $2.
(a) Find Eric’s tangency condition, following step 1 of the expenditure minimization procedure.
(b) Find Eric’s equilibrium quantities for goods x and y.
(c) How much income does Eric require to reach his target utility level?
19. Utility Maximization.B Suppose that you are in a situation where you can afford purchasing 3
units of good x and 4 units of good y. You discover that another affordable bundle of 5 units of
good x and 2 units of good y causes you to reach the same level of utility. Assume that your utility
function and budget line are both well behaved.
(a) Are you maximizing your utility by consuming your original bundle? Why or why not?
(b) Propose an alternative, affordable bundle that would yield a higher utility level than either of
the original bundles.
20. Goods and Bads.A Consider a situation where there are two goods in the world: food and garbage.
Naturally, garbage in this context would be considered a “bad,” while food is considered a good.
Suppose that the price for both types of goods is positive.
(a) Would a rational person consume any garbage in equilibrium? Why or why not?
(b) Under what condition or conditions would a consumer be willing to consume a positive
amount of garbage?
4 Substitution and Income Effects
4.1 Introduction
In this chapter, we use the solution to the consumer’s utility maximization problem from
chapter 3 (the optimal consumption bundle) to analyze how it changes with the individual’s
income, as well as with the price of each good.
We start by examining how consumption bundles are affected by income. When the consumer
becomes richer, her purchases of most goods will likely increase, but consumption
of some goods, such as fast food, may fall. In addition, the consumer could use part of
her larger purchasing power to buy other goods. For instance, individuals may need less
of basic staples, such as bread or rice, as their income increases, but update their smart-
phones more frequently. This pattern has been well documented in countries moving from
underdeveloped to developed status.
Furthermore, we examine how consumers respond to less expensive goods by increasing
their consumption of that good, and how they may respond by decreasing their purchases
of other goods, which now become relatively more expensive. We then analyze how this
increased consumption can be disaggregated (separated) into a substitution and an income
effect. The substitution effect measures how the consumer shifts her purchases toward
the good that became relatively less expensive. The income effect reflects the consumer’s
increased purchases due to her larger purchasing power.
4.2 Income Changes
In this section, we take the demand for a good (the optimal consumption bundle described in
chapter 3), and analyze how it changes as the consumer’s income increases. For most goods,
we expect such demand to increase as the individual becomes richer (i.e., the number of units
she demands at a given price increases as her income grows). We next present four ways to
measure such a change in demand.
4.2.1 Using the Derivative of Demand

Recall that, formally, we use x(px , py , I) to represent a consumer’s demand for good x, which
is a function of the price of goods x and y, and the consumer’s income. Next, we describe
normal and inferior goods.
Normal goods A consumer’s demand for good x, x(px, py, I), is normal if its
derivative with respect to income is positive; that is,
∂x(px, py, I)/∂I > 0.
Intuitively, she demands more units of good x as her income, I, increases. Most goods sat-
isfy this property, and thus they are normal goods. You can think about how your purchases
of holiday packages, restaurant meals, and cars would increase if you won the lottery. How-
ever, not all goods are normal, because some products see their demand fall as individuals
become richer, as we define next.
Inferior goods A consumer’s demand for good x, x(px, py, I), is inferior if its
derivative with respect to income is negative; that is,
∂x(px, py, I)/∂I < 0.
Intuitively, consumers cut their consumption of inferior goods as soon as they can afford
to do so. Examples include basic food staples, such as canned meats or rice, and intercity bus
service. If your income were to double (or you won the lottery), wouldn’t you reduce your
purchases of Spam (canned, precooked pork and ham) and of bus tickets to travel to
another city (rather than taking a plane)?
Recall that when we say that a derivative is negative, the terms in the numerator and
denominator move in opposite directions. In this context, the negative derivative implies that
when income increases, quantity demanded decreases; and vice versa.1 This concept indicates
that inferior goods, such as basic staples, experience demand increases if individuals
(or entire countries) become poorer, such as after an economic crisis. This could have been
the case, for instance, in Mexico after the 1994 currency crisis (when individuals started consuming
more tortillas than in previous years, cutting their purchases of high-quality meats),
or the US after the 2008 crisis (when Walmart saw its sales increase, whereas high-end
supermarkets saw their sales decrease).2
1. This argument also applies to the situation where such a derivative is positive. In that context, an increase in
income raises the quantity demanded, and a decrease in income entails a reduction in the quantity demanded. That
is, income and demand move in the same direction, either both going up, or both going down.
Example 4.1: Increasing income in a Cobb-Douglas utility function Consider
an individual with a Cobb-Douglas utility function u(x, y) = xy, who faces prices px,
py, and income I. As shown in previous chapters, her optimal consumption for good
x (i.e., her demand) is3
x = I/(2px).
This demand is increasing in income because I shows up in the numerator (more
formally, you can check that, indeed, ∂x/∂I = 1/(2px) > 0). Hence, good x is normal in
consumption because its demand increases in income. A similar argument applies to the
demand for good y, y = I/(2py), which is also increasing in income.
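The sign test in example 4.1 can also be run numerically. The sketch below approximates ∂x/∂I with a central difference and labels the good accordingly. All function names are hypothetical, and hypothetical_inferior_demand is a made-up demand function included only to show the opposite sign; it does not come from the text.

```python
# Sign of the income derivative of demand (illustrative sketch; names are hypothetical).


def cobb_douglas_demand(px, py, income):
    """Demand from example 4.1: x = I / (2 * px)."""
    return income / (2 * px)


def hypothetical_inferior_demand(px, py, income):
    """Made-up demand that falls as income rises; used only for contrast."""
    return 100 / (px * income)


def income_derivative(demand, px, py, income, h=1e-4):
    """Central-difference approximation of the derivative of demand with respect to income."""
    return (demand(px, py, income + h) - demand(px, py, income - h)) / (2 * h)


def good_type(demand, px, py, income):
    """Label the good as normal or inferior at the given prices and income."""
    slope = income_derivative(demand, px, py, income)
    if slope > 0:
        return "normal"
    if slope < 0:
        return "inferior"
    return "income-insensitive"


if __name__ == "__main__":
    print(good_type(cobb_douglas_demand, 4.0, 2.0, 100.0))
    print(good_type(hypothetical_inferior_demand, 4.0, 2.0, 100.0))
```

For the Cobb-Douglas demand, the numerical slope should be close to 1/(2px), which is positive at any price, so the good is labeled normal.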
Self-assessment 4.1 Assume that Eric’s demand function for good x is x =

√ 5I ∂x
px −3py . Find the derivative ∂I and its sign, and interpret your results.
4.2.2 Using Income Elasticity

An alternative way to represent the relationship between income and demand is found by
inserting the derivative examined in the previous section, ∂x(px, py, I)/∂I, in the formula of income
elasticity, as follows:
εx,I = [∂x(px, py, I)/∂I] · [I / x(px, py, I)],
which measures the percentage change in quantity demanded per 1 percent change in
income. In other words, if we increase income by 1 percent, quantity demanded changes
by εx,I percent.
Because the elements in the second ratio, I and x(px, py, I), are both positive, the sign
of the income elasticity ultimately depends on the sign of the derivative in the first term.
That is, income elasticity is positive, εx,I > 0, when the good is normal, ∂x(px, py, I)/∂I > 0; but
it becomes negative, εx,I < 0, when the good is inferior, ∂x(px, py, I)/∂I < 0.
In addition, when income elasticity is positive and larger than 1 (i.e., εx,I > 1), we say
that the good is regarded as a luxury by consumers. Indeed, an income elasticity of εx,I > 1
(e.g., εx,I = 2) indicates that the consumer responds to a 1 percent increase in her income by
increasing her consumption of the good by more than 1 percent (e.g., 2 percent). In other
words, an increase in income produces a more-than-proportional increase in the quantity
demanded of the good, which occurs with goods such as housing (particularly second residences),
electronic gadgets, or yachts. Instead, a good whose income elasticity is less than
1 (0 < εx,I < 1), is regarded as a necessity, as a 1 percent increase in income yields a
less-than-proportional increase in demand (e.g., by 0.5 percent). As an extreme example, when
εx,I = 0, the consumer is insensitive to changes in her income, purchasing the same amount
of the good (water or natural gas) regardless of her income. Table 4.1 summarizes the three
types of goods.
Table 4.1
Types of goods according to their income elasticity.
Income Elasticity, εx,I    Type of Good    Example
εx,I < 0                   Inferior        Canned food
0 < εx,I < 1               Necessity       Water
εx,I > 1                   Luxury          Yachts
2. See McKenzie (2002).
3. If you do not remember this result, now is a good time to practice. First, set the tangency condition in this
setting, MUx/MUy = y/x = px/py, or y = (px/py) x. Second, insert this result into the budget line, px x + py y = I, to obtain
px x + py (px/py) x = I, which simplifies to 2px x = I. Finally, solving for x yields the demand function for x, x = I/(2px), as
required.
Example 4.2: Finding income elasticity in the Cobb-Douglas scenario From
example 4.1, the demand for good x is x = I/(2px), and its derivative with respect to
income is ∂x/∂I = 1/(2px). We can use these results to evaluate the income elasticity of good
x, as follows:
εx,I = [∂x(px, py, I)/∂I] · [I / x(px, py, I)]
= [1/(2px)] · [I / (I/(2px))] = [1/(2px)] · 2px = 1,
thus indicating that the good is normal, because εx,I > 0, but it is neither a luxury
(which requires εx,I > 1) nor a necessity (which needs that εx,I < 1).
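The elasticity formula and the classification in table 4.1 translate directly into code. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the derivative is approximated with a central difference, and a small tolerance (tol) is introduced here so that values numerically equal to 0 or 1 are reported as boundary cases rather than forced into a category.

```python
# Income elasticity and the categories of table 4.1 (illustrative sketch; names are hypothetical).


def cobb_douglas_demand(px, py, income):
    """Demand from example 4.1: x = I / (2 * px)."""
    return income / (2 * px)


def income_elasticity(demand, px, py, income, h=1e-4):
    """Elasticity = (derivative of demand with respect to income) * income / demand."""
    slope = (demand(px, py, income + h) - demand(px, py, income - h)) / (2 * h)
    return slope * income / demand(px, py, income)


def classify(elasticity, tol=1e-6):
    """Map an income elasticity to the type of good in table 4.1."""
    if elasticity < -tol:
        return "inferior"
    if elasticity > 1 + tol:
        return "luxury"
    if tol < elasticity < 1 - tol:
        return "necessity"
    return "boundary case (elasticity of 0 or 1)"


if __name__ == "__main__":
    eps = income_elasticity(cobb_douglas_demand, 4.0, 2.0, 100.0)
    print(round(eps, 4), classify(eps))
```

For the Cobb-Douglas demand, the computed elasticity should be approximately 1 at any price and income, which matches the conclusion that the good is normal but neither a luxury nor a necessity.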
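Figure 4.1
Income-consumption curve. (Diagram: good x on the horizontal axis and good y on the vertical
axis; budget lines BL1, BL2, and BL3; indifference curves IC1, IC2, and IC3; optimal bundles
A, B, and C connected by the income-consumption curve.)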
Self-assessment 4.2 Consider again Eric’s demand function x = √px5I−3py . Find his
income elasticity εx,I , its sign, and interpret your results.
4.2.3 Using the Income-Consumption Curve

To find the income-consumption curve, we first depict the optimal consumption bundle at an
initial income level I1 . As illustrated in figure 4.1, bundle A is where the indifference curve
IC1 becomes tangent to the budget line BL1 . The individual’s income is then increased,
which shifts her budget line upward, to BL2 . Then she selects a new consumption bundle
that maximizes her utility, as depicted by point B, where indifference curve IC2 becomes
tangent to her new budget line BL2 . We repeat the process again, increasing her income,
which produces a new budget line BL3 , leading the consumer to choose optimal consump-
tion bundle C. Lastly, we connect all her optimal consumption bundles with a curve, which
we refer to as the “income-consumption curve.”
When the income-consumption curve has a positive slope, as in the segment between
bundles A and B, it means that the individual increases her purchases of both goods x and y as
she becomes richer. As a consequence, we interpret positively sloped income-consumption
curves as being characteristic of normal goods. However, when these curves have a negative
slope, as in the segment between bundles B and C in figure 4.1, the consumer decreases
her purchases of good x (graphically, we move leftward as we jump from B to C), but she
increases her purchases of good y (graphically, we move upward when jumping from B
to C). Therefore, negatively sloped income-consumption curves indicate that one of the
goods must be inferior.
Example 4.3: Finding income-consumption curves From example 4.1, the
demand for good x is x = I/(2px), and the demand for good y is y = I/(2py). Hence, the ratio
of these demands is
y/x = [I/(2py)] / [I/(2px)] = px/py,
which is the slope of the ray connecting the origin (0, 0) with any optimal consumption
bundle. For instance, when px = $4 and py = $2, this ratio becomes y/x = 4/2 = 2,
thus indicating that the optimal consumption of goods y and x maintains a two-to-one
relationship. Graphically, this result implies that the income-consumption curve is
a straight line from the origin, with a constant slope of 2. After a given increase in
income, the consumer responds by increasing her demand for good y more significantly
than for good x, which comes as no surprise because x is twice as expensive
as y.4
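The income-consumption curve of example 4.3 can be traced by computing the optimal bundle at several income levels. This Python sketch uses the demands x = I/(2px) and y = I/(2py) from example 4.1; the function names and the income levels 40, 80, and 120 are assumptions chosen for illustration.

```python
# Points on the income-consumption curve (illustrative sketch; names are hypothetical).


def demands(px, py, income):
    """Optimal bundle from example 4.1: x = I / (2 * px), y = I / (2 * py)."""
    return income / (2 * px), income / (2 * py)


def income_consumption_points(px, py, incomes):
    """Optimal bundle at each income level, holding prices fixed."""
    return [demands(px, py, income) for income in incomes]


if __name__ == "__main__":
    points = income_consumption_points(4.0, 2.0, [40, 80, 120])
    for x, y in points:
        # Each bundle lies on the ray y = (px / py) * x = 2x.
        print(x, y, y / x)
```

Every bundle has the same ratio y/x = px/py, so the points line up on a ray from the origin with a slope of 2 at these prices.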
Self-assessment 4.3 Consider Maria’s demand function x = √5Ipx and assume

that her demand for good y is symmetric, so that y = √5Ipy . Find Maria’s income-
consumption curve, and evaluate it at prices px = $4 and py = $9.
4.2.4 Using the Engel Curve

An alternative approach to represent how income affects the demand for a good plots the
demand for good x, x(px , py , I), on the vertical axis and income I on the horizontal axis;
obtaining the so-called Engel curve of the good. The left panel of figure 4.2 depicts a pos-
itively sloped Engel curve, which implies that the good is normal, as the number of units
purchased increases with income (i.e., purchases of x increase as we move rightward on
the graph). The right panel depicts an Engel curve that has a positive slope for low-income
levels (implying that the good is normal when the individual is not very rich), but eventu-
ally becomes negatively sloped (indicating that the individual starts regarding the good as
inferior once she is sufficiently rich).
4. Alternatively, you can see this result by going back to the demand function of good x, x = I/(2px), and evaluating it at
the price considered in the current example, px = $4, to obtain x = I/8. Similarly, for the demand of good y, y = I/(2py),
we find y = I/4. We can now see that a given increase in income induces an increase of ∂x/∂I = 1/8 in the demand of
good x and a larger increase of ∂y/∂I = 1/4 in the demand of good y.
Figure 4.2
Two Engel curves. (Diagram: panels (a) and (b), each with axes labeled x and Income.)
The left panel depicts the Engel curve for products such as real estate, whose demand
keeps increasing regardless of the individual’s income (e.g., second and third residences). In
contrast, the Engel curve on the right panel may illustrate canned food or public transporta-
tion (which increases for a while as the individual’s income grows, but eventually decreases
when her income is sufficiently high).
Example 4.4: Finding Engel curves Recall from example 4.1 that the demand for
good x is x = I/(2px). To plot this relationship with units of x on the horizontal axis and
income I on the vertical axis, we solve for I in x = I/(2px), obtaining an Engel
curve of
I = (2px) x.
That is, the Engel curve for this good originates at zero, and has a slope of 2px (e.g.,
a slope of 6 if px = $3). In addition, such a slope is positive and constant in x, thus
indicating that the consumer regards good x as normal (demand increases in income)
for all income levels.
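A short Python sketch makes the Engel curve of example 4.4 concrete. It tabulates I = (2px) x at px = $3 and checks that inverting the curve recovers the demand function. The function names are hypothetical and used only for this illustration.

```python
# Engel curve I = (2 * px) * x from example 4.4 (illustrative sketch; names are hypothetical).


def demand_x(px, income):
    """Demand from example 4.1: x = I / (2 * px)."""
    return income / (2 * px)


def engel_income(x, px):
    """Income at which the consumer demands x units: I = 2 * px * x."""
    return 2 * px * x


if __name__ == "__main__":
    px = 3.0
    for x in range(6):
        income = engel_income(x, px)
        # Inverting the Engel curve returns the original quantity.
        print(x, income, demand_x(px, income))
```

Each additional unit of x requires 2px = 6 more dollars of income at this price, which is the constant slope described in the example.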
Self-assessment 4.4 Consider again Eric’s demand function, x = √px5I−3py . Solve

for income I to find his Engel curve. Is its slope positive or negative? What does your
result mean in terms of good x being normal or inferior?
Figure 4.3
Not all goods can be inferior. (Diagram: good y on the vertical axis; budget lines BL1 and
BL2; initial bundle A; candidate locations of the new bundle B in regions a, b, and c.)
Remark—Not All Goods Can Be Inferior—While our previous discussion allows some goods
to be inferior, it is important to note that not every good can be inferior. Figure 4.3 depicts an
individual facing income I1 at budget line BL1 , who chooses an optimal consumption bundle
A. When her income increases to I2 , her budget line shifts upward to BL2 . Which bundle B
does the consumer select at this new income level? Because the individual must exhaust all
her income, her new bundle B must lie along BL2 , thus allowing three possibilities:
• If bundle B lies in region a, to the northeast of bundle A, the individual increases her
consumption of both goods x and y.
• If bundle B lies in region b, northwest of A, the consumer purchases more units of y but
fewer of x, thus regarding good y as normal and x as inferior, respectively.
• If bundle B lies in region c, southeast of A, the individual buys fewer units of y but more
of x, indicating that good y is inferior, whereas x is normal.
Hence, either both goods are normal, or only one of them is inferior. For both to be
inferior, the new bundle B would have to lie in the shaded region to the southwest of the initial
bundle A. In this region, however, the consumer does not spend all her income. As described
in chapter 3, bundles in the shaded region are not optimal because the consumer can still
afford bundles that increase her utility. (Appendix A, at the end of this chapter, provides a
more formal proof of this result using income elasticities.)
4.3 Price Changes
In this section, we analyze how demand changes as the price of one good changes. Similar
to the previous section, we can use different measures, as described next.
4.3.1 Using the Derivative of Demand

Because \(x(p_x, p_y, I)\) represents a consumer’s demand for good x, we say that her demand
curve for the good is negatively sloped if its derivative with respect to its own price \(p_x\) is
negative:
\[ \frac{\partial x(p_x, p_y, I)}{\partial p_x} < 0. \]
In this case, the consumer purchases fewer units as the good becomes more expensive,
where we keep her income and the price of all other goods constant. If, instead, the opposite
relationship holds, \(\frac{\partial x(p_x, p_y, I)}{\partial p_x} > 0\), the demand function has a positive slope. This indicates
that the quantity demanded and price move in the same direction: an increase
(decrease) in price leads the consumer to increase (decrease) her purchases of this good.
We refer to goods of this type as “Giffen goods.” (This may sound crazy, but we return to
Giffen goods later in this chapter.)
Example 4.5: Demand and price changes Consider the demand function from
example 4.1, \(x = \frac{I}{2p_x}\). If the price of good x increases by a small amount, the
consumer’s purchases respond as follows:
\[ \frac{\partial x(p_x, p_y, I)}{\partial p_x} = -\frac{I}{2p_x^2}, \]
which is negative, given that I and \(p_x\) are both positive by assumption. Hence,
demand function \(x = \frac{I}{2p_x}\) decreases in price \(p_x\). Graphically, this demand function has
a negative slope.
Self-assessment 4.5 Consider again Eric’s demand function, x = √px5I−3py . Find

the derivative \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\) and its sign, and interpret.
4.3.2 Using the Price-Elasticity of Demand

An alternative way to represent the relationship between the price of good x and its demand
is by inserting the derivative described in section 4.3.1, \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\), in the formula of
price-elasticity, as follows:
\[ \varepsilon_{x,p_x} = \frac{\partial x(p_x, p_y, I)}{\partial p_x}\,\frac{p_x}{x(p_x, p_y, I)}. \]
Intuitively, if we increase price \(p_x\) by 1 percent, quantity demanded changes by \(\varepsilon_{x,p_x}\) percent.
As for income elasticity, the elements in the second ratio are all positive, implying
that the sign of \(\varepsilon_{x,p_x}\) depends only on the sign of the derivative \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\) (i.e., whether the
demand function has a positive or negative slope). For most goods, the demand function has
a negative slope (i.e., the quantity demanded and price move in opposite directions), entailing
that price-elasticity must also be negative. Example 4.6 evaluates the price-elasticity of
the demand function we found in example 4.1.
Example 4.6: Price elasticity and demand Consider again the demand function
from example 4.1, \(x = \frac{I}{2p_x}\). Using the formula for price elasticity, and recalling from
example 4.5 that \(\frac{\partial x(p_x, p_y, I)}{\partial p_x} = -\frac{I}{2p_x^2}\), we obtain
\[ \varepsilon_{x,p_x} = \frac{\partial x(p_x, p_y, I)}{\partial p_x}\,\frac{p_x}{x(p_x, p_y, I)} = -\frac{I}{2p_x^2}\,\frac{p_x}{\frac{I}{2p_x}} = -1. \]
Intuitively, a 1 percent increase in price \(p_x\) produces a proportional reduction in its
own demand (i.e., purchases of good x decrease by exactly 1 percent).
Self-assessment 4.6 Consider again Eric’s demand function, x = √px5I−3py . Find its
price elasticity εx,px , and interpret your result.
Cross-price elasticity The “cross-price elasticity” of the demand for good y to \(p_x\) is
\[ \varepsilon_{y,p_x} = \frac{\partial y(p_x, p_y, I)}{\partial p_x}\,\frac{p_x}{y(p_x, p_y, I)}. \]
This expression says that, if we increase the price of good x by 1 percent, the quantity
demanded of good y changes by \(\varepsilon_{y,p_x}\) percent. In example 4.1, the demand for good y is
\(y = \frac{I}{2p_y}\), implying that quantity demanded of y is independent of \(p_x\). Therefore, the first
term in the cross-price elasticity formula becomes \(\frac{\partial y(p_x, p_y, I)}{\partial p_x} = 0\), entailing that \(\varepsilon_{y,p_x} = 0\).
Intuitively, a 1 percent increase in the price of good x does not affect the demand for good y
at all.
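The own-price and cross-price elasticities of examples 4.5 and 4.6 can also be approximated numerically. The sketch below assumes the demands of example 4.1 and uses a central finite difference in place of the analytical derivative; the helper `price_elasticity` and the chosen prices and income are illustrative assumptions.

```python
def demand_x(p_x: float, p_y: float, income: float) -> float:
    """Demand for good x from example 4.1: x = I / (2 * p_x)."""
    return income / (2 * p_x)


def demand_y(p_x: float, p_y: float, income: float) -> float:
    """Demand for good y from example 4.1: y = I / (2 * p_y)."""
    return income / (2 * p_y)


def price_elasticity(demand, p_x, p_y, income, h=1e-6):
    """Elasticity of `demand` with respect to p_x.

    The derivative is approximated with a central difference, so the
    result is an approximation rather than an exact value.
    """
    slope = (demand(p_x + h, p_y, income) - demand(p_x - h, p_y, income)) / (2 * h)
    return slope * p_x / demand(p_x, p_y, income)


own = price_elasticity(demand_x, 3.0, 1.0, 100.0)
cross = price_elasticity(demand_y, 3.0, 1.0, 100.0)
print(f"own-price elasticity of x:   {own:.4f}")
print(f"cross-price elasticity of y: {cross:.4f}")
```

The own-price elasticity comes out close to -1 at any positive price, and the cross-price elasticity is zero because the demand for y does not depend on the price of x.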
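Figure 4.4a
Price-consumption curve–I. [Top panel: budget lines BL1, BL2, and BL3, optimal bundles A, B, and C, and the price-consumption curve connecting them, with good x on the horizontal axis and good y on the vertical axis. Bottom panel: the demand curve for good x, with px on the vertical axis, passing through A (x = 1, px = 2), B (x = 2, px = 1), and C (x = 4, px = 0.5).]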
4.3.3 Using Price-Consumption Curves

Figure 4.4a illustrates a decrease in the price of good x, px. Starting from px = $2 in budget
line BL1, a decrease to px = $1 makes the budget line BL2 flatter, implying that the individual
can now afford larger amounts of good x. A similar argument applies when this price
further decreases to px = $0.50 in budget line BL3 (see footnote 5). Figure 4.4a also depicts the optimal
consumption bundles that the consumer selects at each price px: bundle A at px = $2 (where
she consumes only 1 unit of good x), bundle B at px = $1 (where she now consumes 2 units
of x), and bundle C at px = $0.50 (where she consumes 4 units of x). The graph connects
optimal bundles A–C with a curve, often referred to as the “price-consumption curve.” In
this example, as good x becomes cheaper, the individual increases her purchases of both
goods x and y.
5. Recall from chapter 3 that the horizontal intercept of the budget line is \(\frac{I}{p_x}\). A decrease in \(p_x\) then produces an
increase in ratio \(\frac{I}{p_x}\), ultimately shifting this horizontal intercept rightward. In contrast, the vertical intercept, \(\frac{I}{p_y}\),
is unaffected by a cheaper good x, because it is independent of \(p_x\).
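Figure 4.4b
Price-consumption curve–II. [Budget lines BL1, BL2, and BL3, optimal bundles A, B, and C, and the price-consumption curve connecting them, with good x on the horizontal axis.]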
The top panel of figure 4.4a shows that the price-consumption curve has a positive slope at
all levels of price \(p_x\). This could occur, for instance, if good x represents housing: as houses
and apartments become cheaper (per square foot), you can probably afford a larger house.
The lower panel of figure 4.4a summarizes these results. For clarity, it depicts good x on the
horizontal axis (as in the top panel) but has the price of this good, \(p_x\), on the vertical axis
(rather than the units of y). This helps us focus on good x, by directly seeing
how the purchases of good x are affected by changes in its own price \(p_x\). The lower panel
represents, of course, the demand curve for x. (For instance, the demand function found
in example 4.1, \(x = \frac{I}{2p_x}\), would have a graphical representation similar to that in the lower
panel of figure 4.4a, because \(x = \frac{I}{2p_x}\) decreases in \(p_x\) and at a decreasing rate; see footnote 6.)
Figure 4.4b illustrates a situation in which the individual also increases her consumption
of good x, the good that became relatively cheaper. As for good y, however, the consumer
in this case decreases her purchases when moving from bundle A to B (when px decreases
from $2 to $1), but increases her purchases of y afterwards (when px further decreases to
$0.50); see footnote 7.
6. Assume that income is I = $10, and depict the demand function \(x = \frac{10}{2p_x} = \frac{5}{p_x}\). You can then plot this expression
in a graphing calculator. Alternatively, note that demand decreases in \(p_x\) because \(\frac{\partial x}{\partial p_x} = -\frac{5}{(p_x)^2}\) is negative, given
that \(p_x > 0\) by assumption, thus indicating that demand \(x = \frac{5}{p_x}\) has a negative slope. In addition, such a negative
slope increases (becoming closer to zero) as \(p_x\) increases (i.e., the second derivative is \(\frac{\partial^2 x}{\partial p_x^2} = \frac{10}{(p_x)^3}\), which is
positive). To graphically interpret these two results, first note that we are differentiating x (on the horizontal axis)
with respect to \(p_x\) (on the vertical axis). We can then check the slope of the figure after rotating it 90 degrees
counterclockwise, so x is now on the vertical axis and \(p_x\) is on the horizontal axis. At this point, it is clear that
the demand has negative slope (smaller purchases of x as \(p_x\) increases), and becomes flatter as we further increase
prices.
Example 4.7: Finding price-consumption curves Recall from example 4.3 that
the demand for good x is \(x = \frac{I}{2p_x}\), and that of good y is \(y = \frac{I}{2p_y}\). Hence, the ratio of
these demands is
\[ \frac{y}{x} = \frac{\frac{I}{2p_y}}{\frac{I}{2p_x}} = \frac{p_x}{p_y}, \]
which gives the slope of the ray connecting the origin (0, 0) with any optimal consumption
bundle. Therefore, an increase in the price of good x, \(p_x\), increases the
value of the above ratio \(\frac{y}{x} = \frac{p_x}{p_y}\), showing that the consumer moves to optimal bundles
that contain more units of good y (the good that, relative to x, became cheaper) per unit
of good x (the good that became more expensive). In this example, the adjustment comes
entirely from fewer units of good x, because \(y = \frac{I}{2p_y}\) does not depend on \(p_x\). Graphically, this
entails that the ray connecting the origin with the optimal bundle becomes steeper as \(p_x\) increases. The opposite
argument applies if \(p_y\) increases, where now ratio \(\frac{y}{x} = \frac{p_x}{p_y}\) goes down, illustrating
that the consumer purchases fewer units of good y per unit of good x.
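To see the ratio in example 4.7 at work, the following minimal sketch traces optimal bundles as the price of good x rises. It assumes the demands of example 4.1; the income of $100, the price of good y of $1, and the function name `optimal_bundle` are illustrative choices, not values from this example.

```python
def optimal_bundle(p_x: float, p_y: float, income: float) -> tuple[float, float]:
    """Optimal bundle under the demands x = I / (2 p_x) and y = I / (2 p_y)."""
    return income / (2 * p_x), income / (2 * p_y)


income, p_y = 100.0, 1.0  # illustrative values
ratios = []
for p_x in (1.0, 2.0, 4.0):
    x, y = optimal_bundle(p_x, p_y, income)
    ratios.append(y / x)
    # y / x is the slope of the ray from the origin to the optimal bundle.
    print(f"p_x = {p_x}: x = {x}, y = {y}, y/x = {y / x}")
```

In this sketch, the ratio y/x equals the price ratio at every price, while y stays at 50 units and x falls from 50 to 12.5 units as its price rises from $1 to $4.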
Self-assessment 4.7 Maria’s demand function for good x is x = √5Ipx , and her
demand for good y is y = √5Ipy . Find her price-consumption curve and interpret your
results.
4.4 Income and Substitution Effects
The previous discussion uses the demand curve (and the price-consumption curve) to measure
how much the individual increases her quantity demanded of a good after a price decrease.
Such an increase in the quantity demanded, however, includes two simultaneous effects,
which we examine separately next.
Income effect The change in the quantity demanded due to an increase in purchasing
power, with the price of the item held constant.
7. Graphically, the price-consumption curve unambiguously moves rightward, indicating an increase in the pur-
chases of good x, but it has a negative slope followed by a positive slope, reflecting that the individual initially
decreases her consumption of y, but subsequently increases it.
A cheaper good x allows the individual to afford more units of all goods (i.e., larger
purchasing power; see footnote 8). The increase in the quantity demanded of good x due to greater
purchasing power is referred to as the “income effect.” As described at the beginning of this
chapter, an increase in income can induce the consumer to increase (decrease) her purchases
of a good if she regards it as normal (inferior), thus allowing for positive (negative) income
effects.
The substitution effect seeks to measure how much quantity demanded changes due to the
change in the price ratio (making one good relatively cheaper than the other), while fixing
the utility of the individual at the level reached before the price change.
Substitution effect The change in quantity demanded due to a change in its price,
holding the utility level constant.
In contrast to the income effect, we now keep the utility constant at the level reached
before the price change. That is, we adjust the individual’s income so she can reach the
same utility level as before the price change, and then she can choose an optimal bundle
at the new price ratio. After a price decrease (increase), the substitution effect always leads
to an increase (decrease) in the quantity demanded of the good that became relatively less
(more) expensive.
This quantity change is, importantly, not due to an increase in the consumer’s purchasing
power. Instead, it is due to a change in the relative price of the two goods, inducing the
consumer to increase (decrease) her purchases of the good that became relatively cheaper
(more expensive).
4.5 Putting Income and Substitution Effects Together
Figure 4.5 depicts the income and substitution effects of a price decrease for normal goods.
First, when facing budget line BL1, the consumer selects an optimal consumption bundle A,
where she reaches an indifference curve IC1, and she purchases xA units of good x. When the
price of x decreases, the budget line pivots upward to BL2, and in this situation the consumer
chooses an optimal bundle C, where she purchases xC units of good x. The difference xC − xA
represents the total effect (or total change in consumption due to the cheaper price of x). To
separate the total effect into the substitution and income effects, we first need to shift BL2
downward (so we reduce the individual’s income) to make her reach the same utility level as
before the price change. The resulting budget line, BLd, is parallel to BL2, thus having the
final price ratio, and is tangent to indifference curve IC1 at bundle B, where the consumer
purchases xB units of good x. We are done! At this point, we only need to summarize our
results:
1. The increase in consumption from xA to xB reflects the substitution effect (buying more
units of x solely due to the fact that good x became relatively cheaper).
2. The increase in consumption from xB to xC represents the income effect (buying more
units of x because the consumer’s income has increased, but fixing prices at the final price
ratio).
3. The total effect is, of course, the sum of the substitution and income effects.
Figure 4.5
Income and substitution effects for normal goods. [Budget lines BL1, BLd, and BL2; bundles A, B, and C; on the horizontal axis, xA < xB < xC, with SE = xB − xA, IE = xC − xB, and TE = xC − xA.]
8. For example, imagine that housing prices were to fall by 50 percent tomorrow. You could probably afford a
larger apartment (or even a house!), more units of other goods, or both.
Figure 4.6a illustrates the income and substitution effects when the consumer regards good
x as inferior. In this scenario, the income effect is negative, but it only partially offsets the
substitution effect, thus producing a total effect that is positive overall.
Figure 4.6b depicts a Giffen good, which also exhibits a negative income effect, but one
large enough to more than offset the substitution effect and generate an overall negative total effect.
Recall that the total effect measures how the quantity demanded responds to a change in
price (in these figures, a decrease in price of x). Hence, normal and inferior goods
entail that the consumer responds to a cheaper good x by increasing her purchases (i.e.,
positive total effect), whereas a Giffen good means that the consumer responds by
reducing her purchases of good x (i.e., negative total effect). That is, demand curves are
negatively sloped for normal and inferior goods, but become positively sloped for Giffen
goods (see footnote 9).
Figure 4.6a
Income and substitution effects for inferior goods. [Budget lines BL1, BLd, and BL2; on the horizontal axis, xA < xC < xB, so SE = xB − xA is positive, IE = xC − xB is negative, and TE = xC − xA is positive.]
Figure 4.6b
Income and substitution effects for Giffen goods. [Budget lines BL1, BLd, and BL2; bundles A, B, and C; on the horizontal axis, xC < xA < xB, so SE = xB − xA is positive, IE = xC − xB is negative, and TE = xC − xA is negative.]
Table 4.2
Substitution and income effects.

                Price decrease      Price increase
Type of good    SE    IE    TE      SE    IE    TE
Normal          +     +     +       −     −     −
Inferior        +     −     +       −     +     −
Giffen          +     −     −       −     +     +
In the case of a price decrease, the left panel of table 4.2 summarizes the signs of the
substitution effect (SE), income effect (IE), and total effect (TE) for normal, inferior, and
Giffen goods. A positive sign in the table indicates that, after a price decrease (such as
those in figures 4.5–4.6), the individual responds by increasing her purchases of the good,
while a negative sign represents a decrease in her purchases of that good. The right side of
table 4.2 analyzes a price increase, showing that all signs are reversed. For instance, inferior
goods exhibit a negative substitution effect (fewer purchases), a positive income effect (more
purchases), and, overall, a negative total effect.
9. Alternatively, a Giffen good requires its sales to increase as the good becomes more expensive. Empirical
evidence for Giffen goods has proven elusive, but some references exist for basic staples, such as tortillas in Mexico
(McKenzie, 2002) and rice in China (Jensen and Miller, 2007). Intuitively, a Giffen good must be clearly inferior,
in the sense of reducing its quantity demanded significantly after small increases in income. In this case, the
good would have a sufficiently negative income effect in order to offset the positive substitution effect, leading to a
negative total effect.
Example 4.8: Finding IE and SE with a Cobb-Douglas utility function Consider
a utility function u(x, y) = xy, income I = $100, and a price for good y of py = $1. The
price of good x decreases from px = $3 to px = $2. Let us find the total effect of the
price change, and then decompose it into the income and substitution effects.
Finding initial bundle A. Under the initial price, px = $3, the tangency condition
\(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) is \(\frac{y}{x} = \frac{3}{1}\) or, after rearranging, y = 3x. Inserting this expression in the budget
line, 3x + y = 100, we obtain 3x + 3x = 100, or \(x = \frac{100}{6}\) units, which then yields
\(y = 3 \times \frac{100}{6} = 50\) units. Hence, with the initial price of x, the optimal bundle is \(A = \left(\frac{100}{6}, 50\right)\).
Finding final bundle C. Similarly, under the final price, px = $2, the tangency condition
\(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) is \(\frac{y}{x} = \frac{2}{1}\) or, after rearranging, y = 2x. Inserting this result into the
new budget line, 2x + y = 100, we obtain 2x + 2x = 100, or \(x = \frac{100}{4} = 25\) units, which
yields y = 2 × 25 = 50 units. Hence, under the final price of good x, the optimal bundle
is C = (25, 50). The total effect of the decrease in px is, therefore, an increase of
\[ TE = x_C - x_A = 25 - \frac{100}{6} \simeq 8.3 \text{ units}. \]
Finding decomposition bundle B. We can now decompose this total effect into the
substitution and income effects. First, we need to find the decomposition bundle B.
From the previous discussion, this bundle satisfies two conditions:
1. First, at bundle B, the consumer reaches the same utility level as at the initial
bundle A. Because bundle A produces a utility of \(u\left(\frac{100}{6}, 50\right) = \frac{100}{6} \times 50 \simeq 833.3\),
bundle B must satisfy xy = 833.3.
2. Second, at bundle B, the decomposition budget line BLd (which has the same slope
as BL2, \(\frac{p_x}{p_y} = \frac{2}{1}\)) is tangent to the consumer’s indifference curve. That is, we need
the slope of the indifference curve, \(\frac{MU_x}{MU_y} = \frac{y}{x}\), to coincide with the slope of budget
line BLd, \(\frac{p_x}{p_y} = \frac{2}{1}\), which implies \(\frac{y}{x} = \frac{2}{1}\), or y = 2x.
In summary, these two conditions state that
\[ xy = 833.3 \quad \text{and} \quad y = 2x. \]
Inserting one equation into the other, we obtain x(2x) = 833.3. After rearranging, we
find \(x^2 = 416.6\); applying square roots to both sides yields \(x \simeq 20.4\) units. Hence, y =
2 × 20.4 = 40.8 units, which entails that bundle B is B = (20.4, 40.8).
In summary, the substitution effect of the decrease in px is given by the rightward
move from bundle A to B; that is,
\[ SE = x_B - x_A = 20.4 - \frac{100}{6} \simeq 3.74 \text{ units}, \]
whereas the income effect of this price decrease is captured by the move from bundle
B to C; that is,
\[ IE = x_C - x_B = 25 - 20.4 = 4.6 \text{ units}. \]
In particular, the substitution effect indicates that the individual increases her pur-
chases of good x by 3.74 units only due to the lower price of this good (but still reaching
the same utility level as before the price change). The income effect reflects that, for
a given price ratio, the individual increases her consumption of good x by 4.6 units
because a cheaper good x increases her purchasing power.
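The three steps of example 4.8 can be reproduced numerically. The following is a minimal sketch for the utility function u(x, y) = xy only; the function names `marshallian` and `decomposition_bundle` are illustrative, and the closed-form solutions inside them follow from the tangency conditions used in the example.

```python
import math


def marshallian(p_x: float, p_y: float, income: float) -> tuple[float, float]:
    """Optimal bundle for u(x, y) = x * y: tangency y / x = p_x / p_y on the budget line."""
    return income / (2 * p_x), income / (2 * p_y)


def decomposition_bundle(p_x: float, p_y: float, utility: float) -> tuple[float, float]:
    """Bundle on the indifference curve x * y = utility where y / x = p_x / p_y."""
    x = math.sqrt(utility * p_y / p_x)
    return x, utility / x


income, p_y = 100.0, 1.0
x_a, y_a = marshallian(3.0, p_y, income)                # initial bundle A
x_c, y_c = marshallian(2.0, p_y, income)                # final bundle C
x_b, y_b = decomposition_bundle(2.0, p_y, x_a * y_a)    # decomposition bundle B

substitution = x_b - x_a
income_effect = x_c - x_b
total = x_c - x_a
print(f"A = ({x_a:.2f}, {y_a:.2f}), B = ({x_b:.2f}, {y_b:.2f}), C = ({x_c:.2f}, {y_c:.2f})")
print(f"SE = {substitution:.2f}, IE = {income_effect:.2f}, TE = {total:.2f}")
```

Without intermediate rounding, the sketch gives a substitution effect of about 3.75 units and an income effect of about 4.59 units; the values 3.74 and 4.6 in the example arise from rounding xB to 20.4 before subtracting.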
Self-assessment 4.8 John’s utility function is \(u(x, y) = x^{1/3} y^{2/3}\), his income is I =
$150, and the price of good y is py = $1. The price of good x decreases from px = $3
to px = $1. Using the steps in example 4.8, find the substitution and income effects.
Next, we alter example 4.8 by considering a quasilinear utility function. As we show,
the income effect of a price decrease is zero for the good that enters nonlinearly (good x in
example 4.9), implying that the substitution and total effects coincide.
Example 4.9: Finding IE and SE with a quasilinear utility Consider now a utility
function \(u(x, y) = 2x^{1/2} + y\), and the same income and price changes as in example 4.8.
In this example, the marginal utilities are \(MU_x = 2 \cdot \frac{1}{2} x^{-1/2} = \frac{1}{x^{1/2}}\) and \(MU_y = 1\).
Finding initial bundle A. At initial price px = $3, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\)
becomes
\[ \frac{1/x^{1/2}}{1} = \frac{3}{1}, \]
or \(\frac{1}{x^{1/2}} = 3\). After rearranging, we find \(\frac{1}{3} = x^{1/2}\). Squaring both sides yields \(x = \frac{1}{9} \simeq 0.11\)
units. Inserting this expression in the budget line, 3x + y = 100, we obtain (3 ×
0.11) + y = 100, or \(y = 100 - 0.33 \simeq 99.67\) units. Hence, under the initial price of x,
the optimal bundle is A = (0.11, 99.67).
Finding final bundle C. Similarly, at the final price, px = $2, the tangency condition
\(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) yields \(\frac{1/x^{1/2}}{1} = \frac{2}{1}\) or, after rearranging, \(\frac{1}{x^{1/2}} = 2\). Squaring both sides yields
\(x = \frac{1}{4} = 0.25\) units. Inserting this result into the new budget line, 2x + y = 100, yields
(2 × 0.25) + y = 100. After solving for y, we find y = 100 − 0.5 = 99.5 units. Hence,
under the final price of good x, the optimal bundle is C = (0.25, 99.5). The total effect
of the decrease in px is, therefore, an increase of
\[ TE = x_C - x_A = 0.25 - 0.11 = 0.14 \text{ units}. \]
Finding decomposition bundle B. We can now break down this total effect into the
substitution and income effects. To do that, we first need to find the decomposition
bundle B. From our discussion, this bundle satisfies two conditions:
1. First, at bundle B, the consumer reaches the same utility level as at the initial bundle
A. Because bundle A produces a utility of
\[ u(0.11, 99.67) = 2 \times 0.11^{1/2} + 99.67 \simeq 100.33, \]
bundle B must satisfy \(2x^{1/2} + y = 100.33\).
2. Second, at bundle B, the decomposition budget line BLd (which has the same slope
as BL2, \(\frac{p_x}{p_y} = \frac{2}{1}\)) is tangent to the consumer’s indifference curve. That is, the slope
of the indifference curve, \(\frac{MU_x}{MU_y} = \frac{1/x^{1/2}}{1}\), must coincide with that of budget line BLd,
\(\frac{p_x}{p_y} = \frac{2}{1}\), which simplifies to \(\frac{1}{x^{1/2}} = 2\) or, after rearranging, x = 0.25 units.
We now insert x = 0.25 into the utility target that bundle B must reach, \(2x^{1/2} + y = 100.33\),
to obtain \(2 \times 0.25^{1/2} + y = 100.33\). After solving for y, we find y =
100.33 − 1 = 99.33 units.
In summary, the substitution effect of the decrease in px is given by the rightward
move from bundle A to B; that is,
SE = xB − xA = 0.25 − 0.11 = 0.14 units,
whereas the income effect of this price decrease is captured by the move from bundle
B to C; that is,
IE = xC − xB = 0.25 − 0.25 = 0 units.
Therefore, the income effect is zero in this example (IE = 0), implying that the
substitution effect is equal to the total effect, SE = TE. This means that the consumer,
after experiencing a cheaper good x, uses her increased purchasing power to buy more
units of good y alone, rather than increasing her purchases of good x. Nonetheless,
she understands that good x became relatively cheaper than y, inducing her to increase
her purchases of x by 0.14 units, as reflected by the substitution effect.
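The zero income effect in example 4.9 can be confirmed with a short computation. This is a minimal sketch that assumes an interior solution (income is large enough for the consumer to buy positive amounts of good y); the function names are illustrative.

```python
import math


def interior_x(p_x: float, p_y: float) -> float:
    """Tangency for u = 2 * sqrt(x) + y: 1 / sqrt(x) = p_x / p_y, so x = (p_y / p_x) ** 2."""
    return (p_y / p_x) ** 2


def utility(x: float, y: float) -> float:
    return 2 * math.sqrt(x) + y


income, p_y = 100.0, 1.0

# Initial bundle A at p_x = 3 and final bundle C at p_x = 2.
x_a = interior_x(3.0, p_y)
y_a = (income - 3.0 * x_a) / p_y
x_c = interior_x(2.0, p_y)
y_c = (income - 2.0 * x_c) / p_y

# Decomposition bundle B: tangency at the final prices, utility held at A's level.
x_b = interior_x(2.0, p_y)
y_b = utility(x_a, y_a) - 2 * math.sqrt(x_b)

substitution = x_b - x_a
income_effect = x_c - x_b
print(f"SE = {substitution:.2f}, IE = {income_effect:.2f}, TE = {x_c - x_a:.2f}")
```

The tangency condition pins down x without reference to income, so bundles B and C contain the same amount of good x and the whole response to the price change is a substitution effect.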
Self-assessment 4.9 Chelsea’s utility function is \(u(x, y) = 3x + 4y^{1/2}\), her income
is I = $220, and py = $1. The price of good x decreases from px = $3 to px = $2. Using
the steps in example 4.9, find the substitution and income effects.
4.5.1 Income and Substitution Effects on the Labor Market

The previous analysis of income and substitution effects can be readily applied to any good
or service, such as the number of hours of leisure that an individual enjoys, L. Because the
day has only 24 hours, the analysis of leisure choices allows us to examine its counterpart,
working hours, H. That is, given that L + H = 24, then H = 24 − L. Figure 4.7a represents
an individual facing a salary of w per working hour. Her budget line BL1 originates at the
horizontal intercept, representing 24 hours of leisure per day (or, equivalently, zero hours
of work). At this point, her total income is zero, so she cannot purchase units of good y on
the vertical axis (which can be understood as a composite good, aggregating all goods and
services, as opposed to leisure).
Figure 4.7
(a) Labor decisions when facing wage w. (b) Labor decisions when facing a higher wage w′. [In both panels, leisure L (from 0 to 24) is on the horizontal axis and good y on the vertical axis. Panel (a): budget line BL1 with vertical intercept 24w, tangent to indifference curve u1 at bundle A = (LA, yA). Panel (b): budget lines BL1 and BL2, the latter with vertical intercept 24w′, bundle A on u1, and bundle C on u2, with LC < LA.]
If she chooses to work 1 hour, moving leftward, her income increases to w; and if she
works 2 hours (moving farther to the left) her income increases to 2w. If she works all day
(24 hours), her income becomes 24w, as depicted on the vertical intercept of her budget
line, whereby she does not enjoy any leisure but can purchase the largest amount of goods
(represented on the vertical axis). Finally, note that indifference curves move northeast,
which indicates that her utility increases as she enjoys more hours of leisure (moving to the
right) and more units of goods (moving upward). At an hourly wage of w, she chooses an
optimal consumption bundle, in which her budget line BL1 is tangent to her indifference
curve u1, which occurs at bundle A, where she enjoys LA hours of leisure and yA goods.
Figure 4.8
Income and substitution effects in the labor market. [Leisure L (from 0 to 24) on the horizontal axis; budget lines BL1, BLd, and BL2, the latter with vertical intercept 24w′; bundles A and B on u1 and bundle C on u2, with LB < LC < LA; SE runs from LA to LB, IE from LB to LC, and TE from LA to LC.]
Figure 4.7b depicts an increase in the worker’s hourly salary to w′, where w′ > w. As
a consequence, the new budget line BL2 becomes steeper than BL1 because, starting at
24 hours of leisure, every hour of work (moving leftward) entails a larger salary, which
allows the worker to afford more units of good y. If she were to work all day (24 hours),
her income would be 24w′, as depicted on the vertical intercept of BL2, which lies above
that of BL1 because 24w′ > 24w. Facing this more generous salary, the worker chooses an
optimal bundle C, where she enjoys LC hours of leisure. The decrease in leisure, LC − LA, is,
therefore, the total effect that arises from the salary increase. This total effect can be broken
into a substitution and an income effect, as we did in similar applications in this chapter.
To examine the substitution effect, we shift the final budget line BL2 downward so it
becomes tangent to the initial indifference curve, u1 . This downward parallel shift gives us
the so-called decomposition budget line BLd , which is tangent to u1 at bundle B, where the
worker enjoys LB hours of leisure. Figure 4.8 superimposes this effect on figure 4.7b.
The substitution effect in this case is given by the change in leisure from the initial bundle
A to the decomposition bundle B, LB − LA, whereas the income effect is represented by the
change in leisure from the decomposition bundle B to the final bundle C, LC − LB. The sum
of the substitution and income effects coincides with the total effect; that is, (LB − LA) +
(LC − LB) = LC − LA. In figure 4.8, the income effect of the salary increase moves in the
opposite direction of the substitution effect. Intuitively, a more generous salary per hour
induces the worker to work more hours (the substitution effect moves her choice leftward,
toward more work and less leisure). The opportunity cost of leisure increases along with the
wage, because this opportunity cost is captured by the higher hourly wage that the worker
forgoes if she does not work 1 more hour. In other words, every hour of leisure becomes
more expensive.
However, the income effect reflects the fact that, as the worker becomes richer, she can
afford to work fewer hours and enjoy more leisure. If the income effect is sufficiently large,
it more than offsets the substitution effect, resulting in an overall positive total effect
on leisure. In this case, a more generous salary increases the hours of leisure that the worker
enjoys, thus reducing the number of hours she chooses to work (see footnote 10).
10. If we plot the number of working hours H on the horizontal axis, and the salary per hour w on the vertical axis,
we can visually understand the relationship between H and w with the labor supply curve of the worker. When the
total effect of an increase in w on working hours is positive, working hours increase in w, thus producing a positively sloped labor
supply curve. In contrast, when this total effect is negative, working hours decrease in w, entailing a negatively
sloped labor supply.
Appendix A. Not All Goods Can Be Inferior
In this appendix, we use income elasticities to prove that not all goods can be inferior. Let
us begin by writing a property that holds in all the previous analysis: when the consumer
chooses her optimal bundle, this bundle must lie on the budget line. Formally, we say that,
at optimal consumption bundles \(x(p_x, p_y, I)\) and \(y(p_x, p_y, I)\), the individual must exhaust her
income, or
\[ p_x\, x(p_x, p_y, I) + p_y\, y(p_x, p_y, I) = I. \]
Because we seek to analyze the effect of an income change, let us differentiate this
expression with respect to I:
\[ p_x \frac{\partial x(p_x, p_y, I)}{\partial I} + p_y \frac{\partial y(p_x, p_y, I)}{\partial I} = 1. \]
To obtain the expression of income elasticity, \(\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I}\frac{I}{x(p_x, p_y, I)}\), in each term, we
multiply the first term by \(\frac{x(p_x, p_y, I) \times I}{x(p_x, p_y, I) \times I}\), and multiply the second term by \(\frac{y(p_x, p_y, I) \times I}{y(p_x, p_y, I) \times I}\). This
multiplication yields
\[ p_x \frac{\partial x(p_x, p_y, I)}{\partial I}\,\frac{x(p_x, p_y, I)}{x(p_x, p_y, I)}\,\frac{I}{I} + p_y \frac{\partial y(p_x, p_y, I)}{\partial I}\,\frac{y(p_x, p_y, I)}{y(p_x, p_y, I)}\,\frac{I}{I} = 1. \]
Rearranging, we obtain
\[ \underbrace{\frac{p_x\, x(p_x, p_y, I)}{I}}_{\theta_x}\;\underbrace{\frac{\partial x(p_x, p_y, I)}{\partial I}\frac{I}{x(p_x, p_y, I)}}_{\varepsilon_{x,I}} + \underbrace{\frac{p_y\, y(p_x, p_y, I)}{I}}_{\theta_y}\;\underbrace{\frac{\partial y(p_x, p_y, I)}{\partial I}\frac{I}{y(p_x, p_y, I)}}_{\varepsilon_{y,I}} = 1, \]
or, more compactly,
\[ \theta_x \varepsilon_{x,I} + \theta_y \varepsilon_{y,I} = 1, \]
where \(\theta_x = \frac{p_x\, x(p_x, p_y, I)}{I}\) represents the budget share that the individual spends on good
x (which entails that \(\theta_x\) is a percentage, such that \(0 \le \theta_x \le 1\)). In addition, \(\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I}\frac{I}{x(p_x, p_y, I)}\)
is the definition of income elasticity discussed in previous sections of this
chapter. An analogous interpretation applies to the budget share of good y, \(\theta_y\), as well as to
its income elasticity, \(\varepsilon_{y,I}\).
At this point, we are ready to tackle our initial question: Can all goods be inferior? For that
to occur, we would need their income elasticities to be negative (i.e., \(\varepsilon_{x,I} < 0\) and \(\varepsilon_{y,I} < 0\)).
However, that would require the left side of the previous expression, \(\theta_x \varepsilon_{x,I} + \theta_y \varepsilon_{y,I}\), to be
negative, given that budget shares are both positive or zero (see footnote 11), whereas the right side
equals 1. Hence, this equality could not
hold if both goods were inferior. As a consequence, one or both goods must be normal, but
both cannot be inferior.
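The identity used in this appendix can be checked numerically for specific demand functions. The sketch below assumes the demands implied by the utility functions of examples 4.8 and 4.9 (with an interior solution in the quasilinear case) and approximates income elasticities with a central finite difference; all function names are illustrative.

```python
def cobb_douglas(p_x: float, p_y: float, income: float) -> tuple[float, float]:
    """Demands for u(x, y) = x * y."""
    return income / (2 * p_x), income / (2 * p_y)


def quasilinear(p_x: float, p_y: float, income: float) -> tuple[float, float]:
    """Interior demands for u(x, y) = 2 * sqrt(x) + y."""
    x = (p_y / p_x) ** 2
    return x, (income - p_x * x) / p_y


def weighted_elasticity_sum(demand, p_x, p_y, income, h=1e-4):
    """Return theta_x * eps_x + theta_y * eps_y, with numerical income derivatives."""
    x, y = demand(p_x, p_y, income)
    x_hi, y_hi = demand(p_x, p_y, income + h)
    x_lo, y_lo = demand(p_x, p_y, income - h)
    eps_x = (x_hi - x_lo) / (2 * h) * income / x  # income elasticity of x
    eps_y = (y_hi - y_lo) / (2 * h) * income / y  # income elasticity of y
    theta_x = p_x * x / income                    # budget share of x
    theta_y = p_y * y / income                    # budget share of y
    return theta_x * eps_x + theta_y * eps_y


for name, demand in (("Cobb-Douglas", cobb_douglas), ("quasilinear", quasilinear)):
    total = weighted_elasticity_sum(demand, 3.0, 1.0, 100.0)
    print(f"{name}: weighted sum of income elasticities = {total:.6f}")
```

In the quasilinear case the income elasticity of good x is zero, so the whole sum is carried by good y; in both cases the weighted sum is approximately 1, as the identity requires.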
11. Recall that budget shares are percentages, \(\theta_x \in [0, 1]\) and \(\theta_y \in [0, 1]\), implying that \(\theta_x\) and \(\theta_y\) are either positive
numbers, or zero.
Appendix B. An Alternative Representation of Income and Substitution Effects
In previous sections of this chapter, we analyzed the increase in demand coming from a
price decrease, and how to break down this increase (total effect of the price decrease) into
the income and substitution effects. We next present a more compact approach to express
these two effects. First, from the discussion of the utility maximization problem (UMP) and
the expenditure minimization problem (EMP) in chapter 3 (see Appendix B), we take the
demand function that results from the UMP, \(x^U(p_x, p_y, I)\), and evaluate it at the income level
\(I = e(p_x, p_y, u)\). Recall that this is the necessary income to purchase the optimal
bundle that solves the EMP. Therefore, we obtain
\[ x^U\big(p_x, p_y, e(p_x, p_y, u)\big) = x^E(p_x, p_y, u), \tag{4.1} \]
where \(e(p_x, p_y, u) = p_x\, x^E(p_x, p_y, u) + p_y\, y^E(p_x, p_y, u)\). That is, the optimal bundle that
solves the UMP (left side of 4.1) coincides with the bundle solving the EMP (right side),
where the solution to the UMP is evaluated at income level \(I = e(p_x, p_y, u)\).
Because the income and substitution effects measure how purchases of good x are affected
by a change in its price, \(p_x\), we next differentiate both sides of equation (4.1) with respect to
price \(p_x\), to obtain
\[ \frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial e}\frac{\partial e}{\partial p_x} = \frac{\partial x^E}{\partial p_x}. \tag{4.2} \]
To understand the left side of equation(4.2), recall that price px shows up in the first and
third arguments of xU px , py , e px , py , u , implying
that
we need to differentiate separately
in each of them. In addition, px is inside e px , py , u , meaning that we need to apply the
chain rule.12
e px , py , u = px x px , py , u + py y px , py , u Eby definition;
E E
Note that that is,
we use
E p ,p ,u .
e px , py , u to refer to the income that we need
to buy x px , py , u and y x y
Differentiating $e(p_x, p_y, u)$ with respect to $p_x$ yields
$$\frac{\partial e}{\partial p_x} = x^E(p_x, p_y, u),$$
and $x^U\big(p_x, p_y, e(p_x, p_y, u)\big) = x^E(p_x, p_y, u)$. We can insert this result at the end of the left
side of equation (4.2), to obtain

$$\frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial e}\, x^U = \frac{\partial x^E}{\partial p_x}. \tag{4.3}$$

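Finally, because $e(p_x, p_y, u) = I$, we can ultimately express equation (4.3) as
$$\frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial I}\, x^U = \frac{\partial x^E}{\partial p_x}.$$
Rearranging yields the so-called Slutsky equation:
$$\underbrace{\frac{\partial x^U}{\partial p_x}}_{TE} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{SE} \; \underbrace{-\, \frac{\partial x^U}{\partial I}\, x^U}_{IE},$$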
indicating that the total effect of a decrease in px (as measured by the effect on the demand
function found after solving the UMP) is given by the substitution effect (as captured by the
change in the demand found after solving the EMP) and the income effect.
Let us briefly explain why the substitution effect is measured by $\frac{\partial x^E}{\partial p_x}$. Upon a decrease in
$p_x$, the budget line pivots upward, but the EMP requires that the individual still reaches the
same utility target u. Graphically, the consumer must then return to the same indifference
curve she reached before the price change, but with a flatter budget line (as the price ratio
$\frac{p_x}{p_y}$ decreased). As a consequence, she moves to a bundle located on the same indifference
curve, but to the southeast of the initial consumption bundle, thus leading her to purchase
more units of good x. Hence, a decrease in $p_x$ increases her purchases of x, implying that
$\frac{\partial x^E}{\partial p_x} < 0$ (i.e., $p_x$ and purchases of x move in opposite directions). Importantly, this result
12. Recall the chain rule when dealing with composite functions. If we have a differentiable function $y = f(x)$
where $x = g(z)$ is another differentiable function, then the derivative of y with respect to z is equal to the derivative
of y with respect to x, times the derivative of x with respect to z. More compactly, $\frac{dy}{dz} = \frac{df(x)}{dx}\frac{dx(z)}{dz} = f'(x)\,g'(z)$.
Intuitively, a marginal increase in z produces an increase in x (as captured by $g'(z)$) and, because x increases y, we
also experience an increase in y (as measured by $f'(x)$ in the last expression).
does not rely on goods being normal or inferior but rather applies to all types of goods. For
the income effect, however, its sign depends on whether goods are normal (a larger income
results in increased purchases of x, implying that $\frac{\partial x^U}{\partial I} > 0$) or inferior (a larger income results
in decreased purchases of x, $\frac{\partial x^U}{\partial I} < 0$). Specifically, when goods are normal, $\frac{\partial x^U}{\partial I} > 0$, we
obtain
$$\underbrace{\frac{\partial x^U}{\partial p_x}}_{TE \text{ is } -} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{SE \text{ is } -} \; \underbrace{-\, \underbrace{\frac{\partial x^U}{\partial I}\, x^U}_{+}}_{IE \text{ is } -}$$
entailing that income and substitution effects are both negative, thus reinforcing each other.
In contrast, when goods are inferior, $\frac{\partial x^U}{\partial I} < 0$, we have
$$\underbrace{\frac{\partial x^U}{\partial p_x}}_{TE\ ?} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{SE \text{ is } -} \; \underbrace{-\, \underbrace{\frac{\partial x^U}{\partial I}\, x^U}_{-}}_{IE \text{ is } +}$$
which implies that, while the substitution effect is negative (recall that it is always negative),
the income effect is now positive; and thus the sign of the total effect is ambiguous. If the
substitution effect dominates the income effect,13 the total effect is negative (as with inferior
goods); whereas when the income effect dominates, the total effect becomes positive (as
with Giffen goods).
13. This occurs if the absolute value of the substitution effect, $\left|\frac{\partial x^E}{\partial p_x}\right|$, is greater than the absolute value of the
income effect, $\left|\frac{\partial x^U}{\partial I}\, x^U\right|$.
Example 4.10: Applying the Slutsky equation to the Cobb-Douglas case Consider
the Cobb-Douglas utility function from example 4.1. After solving the UMP,
we found that the demand for good x was $x^U(p_x, p_y, I) = \frac{I}{2p_x}$. In that situation,
$\frac{\partial x^U}{\partial p_x} = -\frac{I}{2(p_x)^2}$, whereas $\frac{\partial x^U}{\partial I} = \frac{1}{2p_x}$. Applying the Slutsky equation, we obtain
$$\underbrace{-\frac{I}{2(p_x)^2}}_{TE} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{SE} \; \underbrace{-\, \frac{1}{2p_x}\,\frac{I}{2p_x}}_{IE},$$
thus implying that the substitution effect is $\frac{\partial x^E}{\partial p_x} = -\frac{I}{4(p_x)^2}$. For instance, if $p_x = \$3$
and $I = \$100$, the substitution effect becomes $\frac{\partial x^E}{\partial p_x} = -\frac{25}{9}$, the income effect is also
$-\frac{1}{2p_x}\,\frac{I}{2p_x} = -\frac{25}{9}$, and thus the two effects reinforce each other, ultimately producing a
total effect of $-\frac{I}{2(p_x)^2} = -\frac{50}{9} \simeq -5.56$ units. Intuitively, a marginal increase in the price
of good x decreases the quantity demanded by 5.56 units, where half of this decrease
can be attributed to the substitution effect alone (change in price ratio). The remaining
half is explained by the smaller purchasing power that the individual experiences when
facing a more expensive good (income effect).
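The decomposition in example 4.10 can also be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the utility function to be u(x, y) = xy (which is consistent with the demand I/(2px) but is not restated in this appendix), sets py = $1, and approximates each derivative with a finite difference. All function names are hypothetical.

```python
import math

def marshallian_x(px, income):
    """Uncompensated demand x^U = I / (2 px), as stated in example 4.10."""
    return income / (2 * px)

def hicksian_x(px, py, utility):
    """Compensated demand x^E = sqrt(u * py / px); assumes u(x, y) = x * y."""
    return math.sqrt(utility * py / px)

def derivative(f, at, h=1e-6):
    """Central finite-difference approximation of f'(at)."""
    return (f(at + h) - f(at - h)) / (2 * h)

px, py, income = 3.0, 1.0, 100.0          # py = 1 is an assumption
x0 = marshallian_x(px, income)
u0 = x0 * (income / (2 * py))             # utility reached at the initial bundle

total_effect = derivative(lambda p: marshallian_x(p, income), px)
substitution_effect = derivative(lambda p: hicksian_x(p, py, u0), px)
income_effect = -derivative(lambda m: marshallian_x(px, m), income) * x0

print(round(total_effect, 3), round(substitution_effect, 3), round(income_effect, 3))
```

Up to numerical error, the substitution and income effects each equal -25/9 and add up to the total effect of -50/9, as in the example.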
Using Elasticities to Represent the Slutsky Equation

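We can also represent the Slutsky equation in a more compact way by using elasticities.
First, let us multiply the left and right sides by $\frac{p_x}{x^U}$ to obtain
$$\frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^U} - \frac{\partial x^U}{\partial I}\, x^U\, \frac{p_x}{x^U}. \tag{4.4}$$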
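We now multiply the second term in the right side of equation (4.4) by $\frac{I}{I} = 1$, and note that
$x^U\big(p_x, p_y, e(p_x, p_y, u)\big) = x^E(p_x, p_y, u)$ in the first term on the right side. That is, equation
(4.4) becomes
$$\frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^E} - \frac{\partial x^U}{\partial I}\, x^U\, \frac{p_x}{x^U}\frac{I}{I}, \tag{4.5}$$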
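where the left side coincides with our definition of price elasticity, $\varepsilon_{x,p_x} = \frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U}$; and the
first term on the right side is $\varepsilon^E_{x,p_x} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^E}$, where $x^E$ denotes the demand function we found
after solving the EMP. Furthermore, the second term on the right side can be rearranged as
follows:
$$\frac{\partial x^U}{\partial I}\, x^U\, \frac{p_x}{x^U}\frac{I}{I} = \underbrace{\frac{\partial x^U}{\partial I}\frac{I}{x^U}}_{\varepsilon_{x,I}} \; \underbrace{\frac{p_x x^U}{I}}_{\theta_x},$$
where $\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I}\frac{I}{x(p_x, p_y, I)}$ represents the income-elasticity of demand, and $\theta_x = \frac{p_x x^U}{I}$
denotes the budget share that the individual spends on good x. As a consequence, equation
(4.5) can be rewritten as
$$\varepsilon_{x,p_x} = \varepsilon^E_{x,p_x} - \theta_x\, \varepsilon_{x,I}. \tag{4.6}$$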
From an applied perspective, this expression can be more attractive than the Slutsky equation,
because we can often find estimates for elasticities $\varepsilon_{x,p_x}$, $\varepsilon_{x,I}$, and budget share $\theta_x$, thus
allowing us to infer elasticity $\varepsilon^E_{x,p_x}$. To illustrate this point, consider two extreme examples.
First, if you analyze the demand for garlic, you will likely find that the budget share of most
consumers is negligible (i.e., $\theta_x \simeq 0$), implying that equation (4.6) reduces to $\varepsilon_{x,p_x} \simeq \varepsilon^E_{x,p_x}$.
In other words, the income effect is close to zero, and hence, the substitution effect coincides
with the total effect. Second, consider a good such as housing, with a much larger
budget share (e.g., $\theta_x = 0.3$). If we have estimates of its price elasticity being $\varepsilon_{x,p_x} = -0.6$,
and its income elasticity being $\varepsilon_{x,I} = 1.3$, then we can find $\varepsilon^E_{x,p_x}$ by using equation (4.6) as
follows:
$$-0.6 = \varepsilon^E_{x,p_x} - (0.3 \times 1.3),$$
which, solving for $\varepsilon^E_{x,p_x}$, yields $\varepsilon^E_{x,p_x} = -0.21$. Intuitively, a 1 percent increase in the price of
housing reduces demand by 0.6 percent if wealth is left unaffected. However, if the consumer
receives additional wealth to guarantee that she can still reach the same utility level as before
the price change, her demand for housing would be reduced by only 0.21 percent.
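The housing calculation amounts to rearranging equation (4.6) for the compensated elasticity. A minimal sketch, in which the function name is hypothetical and the inputs are the estimates quoted in the text:

```python
def compensated_price_elasticity(price_elasticity, income_elasticity, budget_share):
    """Rearranges equation (4.6): eps_E = eps + theta * eps_I."""
    return price_elasticity + budget_share * income_elasticity

# Housing figures quoted in the text: eps = -0.6, eps_I = 1.3, theta = 0.3.
housing = compensated_price_elasticity(-0.6, 1.3, 0.3)
print(round(housing, 2))
```

With a budget share close to zero, as in the garlic case, the same function returns a value close to the uncompensated price elasticity.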
Exercises
1. Deriving Functions–I.A Consider an individual with a Cobb-Douglas utility function $u(x_1, x_2) = x_1 x_2$,
facing an income $I = 100$ and prices $p_1$ and $p_2$ for goods 1 and 2, respectively.
(a) Find the demand function for each good.
(b) Assume that the price of both goods increases by 10 percent. Find the new demand functions
for each good.
(c) Find the price-consumption curve of each good. Interpret.
(d) Find the Engel curve of each good. Interpret.
2. Calculating Effects–I.A Consider the scenario in exercise 1, but now assume specific prices
p1 = $10 and p2 = $5. In this context, only the price of good 1 decreases to p1 = $4. Answer the
following questions:
(a) Find the demand for each good, both before and after the price change. This represents the
“total effect” of the price change.
(b) Identify which part of the total effect originates from the substitution and the income effects.
3. Quasilinear Utility–I.B Repeat your analysis in exercise 1, but assuming a quasilinear utility
function $u(x_1, x_2) = \ln x_1 + x_2$.
(a) Find the demand function for each good.
(b) Assume that the price of both goods increases by 10 percent. Find the new demand functions
for each good.
(c) Find the price-consumption curve of each good. Interpret.
(d) Find the Engel curve of each good. Interpret.
4. Calculating Effects–II.A Consider the situation in exercise 3, but now assume specific prices
p1 = $10 and p2 = $5. In this context, only the price of good 1 decreases to p1 = $4. Answer the
following questions:
(a) Find the demand for each good, both before and after the price change. This represents the
total effect of the price change.
5. Decomposition Bundles.B Consider an individual with a utility function $u(x, y) = x^2 y$, and facing
prices $p_x = \$2$ and $p_y = \$4$.
(a) Assuming that his income is I = $800, find the optimal consumption of goods x and y that
maximizes his utility. That is, solve his UMP.
(b) Consider now that the price of good y decreases from py = $4 to py = $3. Find this consumer’s
new optimal consumption bundle. Then, identify the total effect of the price change, and
decompose it into the substitution and income effects.
(c) Considering that the price of good y remains at py = $4, assume that the consumer seeks to
reach the same utility level as in part (a). Find the optimal consumption of goods x and y that
minimizes his expenditure. That is, solve his EMP.
(d) As in part (b), assume that the price of good y decreases from py = $4 to py = $3. Find this
consumer’s new optimal consumption bundle. Comparing your results from parts (c) and (d),
argue that the total effect that we find when using the compensated demand (the result of the
EMP) measures the substitution effect alone. Interpret.
6. Perfect Substitutes.B Peter’s preferences for tea and coffee are given by u(x, y) = 2x + y, where
x denotes the units of tea and y the units of coffee. His income is $500, and the initial prices are
px = $36 and py = $22.
(a) Find the utility-maximizing pair of tea and coffee. [Hint: Peter regards tea and coffee as
perfectly substitutable, so you should anticipate that he consumes only one of the two goods.]
(b) Assume now that the price of tea increases, to px = $40. Is his consumption of tea and cof-
fee affected by the price change? Your answer defines the total effect of the price change.
Decompose it into substitution and income effects.
(c) What if the price of tea further increases to px = $83? What is the total effect of the price
change? What are the substitution and income effects?
7. Income Effects.A Peter informs us that his demand for housing decreases when his income
decreases. Can we infer from that information that, after an increase in the price of housing, Peter’s
demand will decrease?
8. Quasilinear Utility–II.B Chelsea’s utility function is $u(x, y) = 3x + 4y^{1/2}$, her income is $I = \$220$,
and py = $1. The price of good x decreases from px = $3 to px = $2. Using the steps in example
4.9, find the substitution and income effect from this price change.
9. Linear Demand.A Suppose that the demand for cookies (good x) was expressed as x = 250 − 3px ,
where px is the price of cookies.
(a) Calculate the price elasticity of demand.
(b) For what prices is the demand for cookies elastic?

(c) For what prices is the demand for cookies inelastic?
10. Point Elasticity.A Consider the market for football tickets. It faces the following supply and
demand functions:
qS = −2 + 2p
qD = 8 − 3p + 2I + pB
where p is the price for football tickets, I is average income in units of $10,000, and pB is the
price of basketball tickets.
(a) Let I = 4 and pB = 2. Calculate the equilibrium price and quantity.
(b) Calculate the price elasticity of demand, income elasticity, and cross-price elasticity at the
equilibrium price and quantity.
11. Income Elasticity–I.B Suppose that the demand for beef (good x) can be expressed as $x = \frac{2I - I^2}{p_x}$,
where I is the consumer’s income, measured in units of \$100,000.
(a) Calculate the income elasticity for beef.
(b) Provide an interpretation for the income elasticity for beef. For what values of I is beef a
normal good?
12. Engel Curve.C Suppose that the demand for good x was $x = 9 - \frac{(I-3)^2}{p_x}$.
(a) Calculate the income elasticity for good x.
(b) Derive and plot the Engel curve for good x.
(c) For what income levels is good x normal? Label this range on your plot in part (b).
13. Perfect Complements.B Consider an individual with utility function u(x, y) = min{2x, 3y}, facing
an income I = 250 and prices px and py for goods x and y, respectively.
(a) Find the demand function for each good.
(b) Calculate the price elasticity of demand and income elasticities for both goods. Interpret.
14. Calculating Effects–III.A Consider the scenario in exercise 13, but now assume that the initial
prices are px = $10 and py = $8. In this context, only the price of good y increases, to py = $12.
Answer the following questions:
(a) Find the demand for each good, both before and after the price change. This represents the
“total effect” of the price change.
15. Deriving Functions–II.A Consider an individual with the Cobb-Douglas utility function $u(x, y) = x^{0.4} y^{0.6}$,
facing an income I and prices $p_x$ and $p_y$ for goods x and y, respectively.
(a) Find the demand function for each good.
(b) Find the price-consumption curve of each good. Interpret.
(c) Find the Engel curve of each good. Interpret.
16. Calculating Effects–IV.A Consider the situation in exercise 15, but now assume specific prices
px = $3 and py = $2 and income of I = $100. In this context, only the price of good x decreases,
to px = $2. Answer the following questions:
(a) Find the demand for each good, both before and after the price change. This represents the
total effect of the price change.
(b) Find the decomposition bundle.
(c) Identify which part of the total effect originates from the substitution and the income effects.
17. Income Elasticity-II.A Suppose that the demand for good x can be expressed as x = 2Ip 1 , where
x
px is the price of good x and I is the consumer’s income.
(a) Calculate and interpret the income elasticity for good x.
(b) Derive and plot the Engel curve for good x.
18. Magnitudes of SE and IE.A After calculating the effects of a price increase for good x, you find
that the substitution effect in this situation is equal to SE = −8.
(a) Suppose that the income effect is equal to IE = −4. Calculate the total effect. How does the
consumer regard good x with respect to their income?
(b) Suppose instead that the income effect is equal to IE = 3. Calculate the total effect. How does
the consumer regard good x with respect to their income?
(c) Suppose now that the income effect is equal to IE = 12. Calculate the total effect. How does
the consumer regard good x with respect to their income?
19. Giffen Good?B Suppose that you work in a hardware store in a community that is expecting a
major hurricane in the next few days. To ration your plywood, you start to increase its price, but
you find that with each price increase, more people seem to purchase your plywood. You begin to
expect that your plywood may be a Giffen good.
(a) Provide an argument why plywood is a Giffen good under these circumstances.
(b) Provide an alternative explanation for the increase in plywood sales.
20. Cross-Price Elasticity.B Suppose that the demand for good x is $x = \frac{I\sqrt{p_y}}{2p_x}$, where $p_x$ is the price
of good x, $p_y$ is the price of good y, and I is the consumer’s income.
(a) Calculate and interpret the cross-price elasticity of good x with respect to good y.
(b) Suppose instead that the demand for good x is $x = \frac{I}{2p_x\sqrt{p_y}}$. Calculate and interpret the cross-price
elasticity of good x with respect to good y.
5 Measuring Welfare Changes
5.1 Introduction
In this chapter, we evaluate the welfare gain that individuals enjoy when facing cheaper
prices and, similarly, the welfare loss they suffer when facing more expensive goods. Inter-
estingly, this can occur when demand changes, but also when a new sales tax is enacted
that affects the selling price. Our analysis helps explain consumer welfare losses that are a
consequence of more expensive goods or more stringent taxes.
Here, we discuss three measures of welfare change: (1) consumer surplus, (2) compensating
variation, and (3) equivalent variation. We evaluate them in applied settings, and
discuss contexts under which all three measures produce the same welfare change (i.e.,
the same number). This coincidence, however, does not necessarily happen in all situa-
tions, so we also examine applications where each welfare measure yields a different welfare
change.
5.2 Consumer Surplus
As discussed in previous chapters, the demand curve identifies how many units of a good x
an individual is willing to purchase at price px and income $I. As such, the demand func-
tion can be interpreted as representing the maximum number of units that the individual
consumes at each price px .1 Alternatively, it can be understood as measuring, for a given
number of units of good x, how many dollars the individual is willing to pay for these units.
In short, the demand function represents the maximum willingness-to-pay for the good.2
Hence, if we compare this maximum willingness-to-pay against the price that the consumer
1. Graphically, for a given horizontal line corresponding to a price px , the crossing point with the demand curve
measures the maximum number of units she is willing to buy at that price px .
2. Graphically, for a given vertical line corresponding to x units, the height of the demand function measures her
willingness-to-pay.
actually pays for the good, we find a measure of the utility gain that she makes when buying
the good.
Consumer surplus (CS) The area below the demand curve and above the price that
consumers pay for the good.
We now present examples on how to find such CS in two situations: one with a linear
demand, and another with a nonlinear demand.
Example 5.1: Finding CS with linear demand Figure 5.1 depicts a demand curve
$p(q) = 10 - 2q$, and a market price of $p = \$4$. The area below the demand curve and
above the current price of $p = \$4$ measures the CS, which is given by triangle A,
with area
$$CS = \frac{1}{2}(10 - 4)\,3 = 9,$$
where the height of the triangle is 10 − 4 = 6, as depicted on the vertical axis, whereas
its base is given by the output level that solves 4 = 10 − 2q, yielding q = 3 units.
(Graphically, this is the output level for which the demand function reaches a height
of exactly p = $4.)
[Figure 5.1: CS with linear demand. The demand curve $p(q) = 10 - 2q$ has a vertical intercept of \$10; prices \$4 and \$3 correspond to quantities $q = 3$ and $q = 3.5$; areas B and C lie between the two price lines.]
In addition, if the price were to fall to $p = \$3$, the quantity demanded would increase to the level
that solves $3 = 10 - 2q$, that is to say, $q = 3.5$ units. As a consequence, CS increases by the size of areas B and
C in the graph. We then represent the increase in CS as
$$\Delta CS = B + C = (4 - 3)\,3 + \frac{1}{2}(4 - 3)(3.5 - 3) = 3 + 0.25 = 3.25.$$
Therefore, the increase in CS is 3.25, which produces a new CS of 9 + 3.25 = 12.25.
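The triangle areas in example 5.1 can be reproduced with a few lines of code. This is a minimal sketch for an inverse demand of the form p(q) = a − bq; the function and variable names are illustrative, not taken from the text.

```python
def consumer_surplus_linear(intercept, slope, price):
    """CS under inverse demand p(q) = intercept - slope * q at a given price."""
    quantity = (intercept - price) / slope           # units bought at this price
    return 0.5 * (intercept - price) * quantity      # area of the triangle

cs_before = consumer_surplus_linear(10, 2, 4)   # price of $4
cs_after = consumer_surplus_linear(10, 2, 3)    # price of $3
delta_cs = cs_after - cs_before                 # areas B + C

print(cs_before, cs_after, delta_cs)
```

Computing the change as a difference of two triangles gives the same 3.25 as adding rectangle B and triangle C.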
Self-assessment 5.1 Repeat the analysis in example 5.1, but considering now a
demand function $p(q) = 11 - \frac{1}{3}q$. What is the change in CS when the price of good x
decreases from \$4 to \$3?
Example 5.2: Finding CS with nonlinear demand Consider the nonlinear demand
in example 4.1, $x = \frac{I}{2p_x}$, arising from a Cobb-Douglas utility function. If the consumer
faces a price of $p_x = \$4$ and an income level of $I = \$100$, she purchases $x = \frac{100}{2 \times 4} = 12.5$
units. If the price decreases to $p_x = \$3$, she increases her purchases to $x = \frac{100}{2 \times 3} \simeq 16.67$,
as depicted in figure 5.2. In this case, however, to find the gain in CS, we must use the
integral of demand function $x = \frac{100}{2p_x}$ between prices $p_x = \$4$ and $p_x = \$3$ because the
demand function is not linear (i.e., it is not a straight line).
In particular, the increase in consumer surplus is
$$\Delta CS = \int_3^4 \frac{100}{2p_x}\,dp_x = 50\int_3^4 \frac{1}{p_x}\,dp_x = 50\,[\ln p_x]_3^4 = 50\,[\ln 4 - \ln 3] \simeq 14.38.$$
What would happen if we tried to approximate the change in CS using the rectangle
and triangle below the demand curve (as if it was linear)? In that case, we would find
that the approximated increase in CS is
$$\Delta CS(\text{approx.}) = \underbrace{[12.5 \times (4 - 3)]}_{\text{Area of rectangle B}} + \underbrace{\frac{1}{2}(16.67 - 12.5) \times (4 - 3)}_{\text{Area of triangle C}} \simeq 14.59,$$
thus implying an overestimation of the true change in CS, because 14.59 > 14.38.
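The comparison in example 5.2 can be sketched numerically. The code below integrates the demand function between the two prices with a simple midpoint rule (a choice made here for illustration, not a method from the text) and compares the result with the closed form and with the rectangle-plus-triangle approximation. Names are hypothetical.

```python
import math

def demand_x(px, income=100.0):
    """Demand x = I / (2 px) from example 5.2."""
    return income / (2 * px)

def integrate(f, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of f on [lower, upper]."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

exact = 50 * math.log(4 / 3)                      # closed-form integral
numeric = integrate(demand_x, 3.0, 4.0)           # numerical integral
linear_approx = demand_x(4.0) * (4 - 3) + 0.5 * (demand_x(3.0) - demand_x(4.0)) * (4 - 3)

print(round(exact, 2), round(numeric, 2), round(linear_approx, 2))
```

The linear approximation comes out slightly below the 14.59 reported in the example because the code does not round 16.67 before computing the triangle; it still exceeds the exact value, so the overestimation remains.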
[Figure 5.2: Change in CS with nonlinear demand. The area labeled ΔCS lies between prices 4 and 3 on the vertical axis.]
Self-assessment 5.2 Repeat the analysis in example 5.2, but assume a demand
function $x = \frac{3I}{2\sqrt{p_x}}$. Still assuming that the price of good x decreases from \$4 to \$3,
what is the change in CS, $\Delta CS$?
5.3 Compensating Variation
In this section, we present the compensating variation (CV) as an alternative measure of
welfare change. For simplicity, we consider that the price of good y is normalized to \$1,
which means that we divide all prices by $p_y$. For instance, if prices are $p_x = \$4$ and $p_y = \$3$,
we divide all prices by $p_y = \$3$ to obtain the normalized prices $p_x = \$\frac{4}{3}$ and $p_y = \$\frac{3}{3} = \$1$.
An advantage of this normalization is that the vertical intercept of the budget line, $y = \frac{I}{p_y}$,
is now $y = \frac{I}{p_y} = \frac{I}{1} = I$. We can hence interpret the vertical intercept of the budget line as
the consumer’s income, and make income comparisons by just looking at the height of this
intercept.
Without further ado, we present the CV, which measures the welfare change that an
individual experiences from a price change.
Compensating variation (CV) How much money one needs to take away
from (give to) a consumer after a price decrease (increase) such that she is as well off
as before the price change.
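[Figure 5.3: Finding the CV. Good x is on the horizontal axis; the vertical intercepts $I$ and $I_B$ are marked, and their difference is the CV. The figure shows budget lines $BL_1$, $BL_B$, and $BL_2$, bundles A, B, and C, and indifference curves $u_1$ and $u_2$.]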
Intuitively, the consumer is better off after a price decrease, as her budget line pivots
upward, allowing her to achieve a higher utility level. The CV then asks:
How much do we need to reduce the consumer’s income to make her as well off as she was before
the price change?
Graphically, we shift her new budget line downwards in a parallel fashion, thus exhibiting
the final price ratio, until the shifted budget line becomes tangent to the initial indiffer-
ence curve that the consumer reached before the price change. This implies that she obtains
the same utility level as before the decrease in prices. The opposite argument applies if
prices increase, where the consumer now reaches a lower utility level than before the
price change. Hence, the CV measures how much money we need to provide to the con-
sumer to compensate her for the price increase. Graphically, that means shifting her budget
line upward so she can reach the same indifference curve (i.e., utility level) as before the
price change. Regardless of whether we analyze a price decrease or increase, however,
the CV focuses on final prices, and evaluates how much money we need to take away
from the consumer in the case of a price decrease (give to the consumer in the case of a
price increase) to guarantee that she can reach the same utility level as before the price
change.
Figure 5.3 depicts the CV. At the initial prices, the consumer faces budget line BL1 , pur-
chases an optimal consumption bundle A, and reaches a utility level u1 . As mentioned in
the previous discussion about normalizing prices, the vertical intercept of BL1 measures the
consumer’s income I. When the price of good x decreases, her new budget line is BL2 , but
her income remains I (i.e., BL1 and BL2 have the same vertical intercept). At new prices,
the consumer chooses bundle C. However, the CV asks: “How much money would we need
to take from the consumer’s income I to make her new budget line BL2 tangent to her initial
indifference curve u1 ?” To answer this question, we make a parallel shift of BL2 downwards
until we find another budget line, BLB , tangent to u1 , where the consumer purchases bundle
B.3 The vertical intercept of budget line BLB , IB in the graph, is the income that the con-
sumer would need to purchase bundle B. Hence, the difference CV = I − IB measures the
CV, namely, the amount of money we need to subtract from the consumer’s initial income I
to make her as well off as before the price change.
Example 5.3: Finding the CV of a price decrease Consider a consumer with the
Cobb-Douglas utility function $u(x, y) = xy$, an income of $I = \$100$, and a normalized
price of good y at $p_y = \$1$. We first seek to find demand functions for goods x and y,
which will help us in the analysis of the bundles at the initial and final prices. From the
tangency condition, $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$, we obtain $\frac{y}{x} = \frac{p_x}{p_y}$, or after rearranging, $y = p_x x$ because
$p_y = \$1$. Inserting this result into the budget line $p_x x + p_y y = I$, which in this context
becomes $p_x x + y = 100$ because $p_y = \$1$, we find
$$p_x x + \underbrace{p_x x}_{\text{because } y = p_x x} = 100 \;\Rightarrow\; 2p_x x = 100.$$
Solving for good x, we obtain the demand for this good, $x = \frac{100}{2p_x} = \frac{50}{p_x}$. We can then
insert this result into the expression obtained from the tangency condition, $y = p_x x$, to
find the demand for good y (i.e., $y = p_x \frac{50}{p_x} = 50$ units).
Let us consider now that the price of good x decreases from $p_x = \$3$ to $p_x = \$2$,
while the price of good y remains fixed at $p_y = \$1$.
1. Finding the initial bundle A. At the initial price $p_x = \$3$, the demand for good x
simplifies to $x_A = \frac{50}{3} \simeq 16.67$ units.
2. Finding the final bundle C. At the final price $p_x = \$2$, the demand for good x
increases to $x_C = \frac{50}{2} = 25$ units.
3. Finding the decomposition bundle B. At the decomposition bundle, we must ensure
the following occurs:
a. The consumer must reach the same utility level as with the initial bundle A.
Because we have found that bundle $A = (16.67, 50)$, this bundle yields a utility
level of
$$u_A = \frac{50}{3}\,(50) = \frac{2500}{3} \simeq 833.33.$$
3. This is, of course, the decomposition bundle B that we found in our analysis of the substitution and income
effects in chapter 4. In that analysis, we also shifted the final budget line downward until the consumer reached the
same utility level as before the price change.
Therefore, the amount of goods x and y consumed at the decomposition
bundle B, $(x_B, y_B)$, must also yield a utility level of 833.33, which we can
mathematically express as follows:
$$x_B\, y_B = 833.33.$$
b. The consumer’s indifference curve must be tangent to the budget line (i.e.,
$\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$), which in this context entails $y = p_x x$. Because $p_x = 2$, the tangency
condition can be written as $y = 2x$. Substituting this condition in the above
equation, $x_B\, y_B = 833.33$, we find
$$u_B = x_B\, y_B = x_B \underbrace{(2x_B)}_{y_B = 2x_B} = 833.33$$
or, after rearranging, $2(x_B)^2 = 833.33$, which further simplifies to $(x_B)^2 = 416.67$.
Applying square roots on both sides, we find that $x_B \simeq 20.41$. We can
now insert this result into the tangency condition, $y = 2x$, to find the amount
of good y that the consumer has at bundle B, obtaining $y_B = 2 \times 20.41 = 40.82$
units.
4. Evaluating the CV. The CV is given by $CV = I - I_B$, where $I = \$100$ is the consumer’s
income and $I_B$ represents the income that the individual needs to purchase
the decomposition bundle $B = (20.41, 40.82)$ found in point 3 at the final prices
(i.e., $p_x = \$2$ and $p_y = \$1$). Specifically,
$$I_B = (\$2 \times 20.41) + (\$1 \times 40.82) = \$81.64.$$
Thus, the CV is
$$CV = I - I_B = \$100 - \$81.64 = \$18.36.$$
Expressed in words, if, after experiencing the price decrease, we reduce the con-
sumer’s income by $18.36, her utility level coincides with that before the price
decrease.
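The four steps of example 5.3 can be traced in code. The sketch below assumes the same setting as the example (u(x, y) = xy, I = $100, py = $1, px falling from $3 to $2); the helper names are hypothetical.

```python
import math

def cobb_douglas_demands(px, py, income):
    """Demands for u(x, y) = x * y: half of income is spent on each good."""
    return income / (2 * px), income / (2 * py)

def decomposition_bundle(px, py, utility):
    """Bundle on the curve x * y = utility where the tangency y / x = px / py holds."""
    x = math.sqrt(utility * py / px)
    return x, utility / x

income, py = 100.0, 1.0
px_initial, px_final = 3.0, 2.0

x_a, y_a = cobb_douglas_demands(px_initial, py, income)   # step 1: bundle A
u_a = x_a * y_a                                           # initial utility level
x_b, y_b = decomposition_bundle(px_final, py, u_a)        # step 3: bundle B
income_b = px_final * x_b + py * y_b                      # cost of B at final prices
cv = income - income_b                                    # step 4: CV

print(round(x_b, 2), round(y_b, 2), round(income_b, 2), round(cv, 2))
```

Because the code carries unrounded values through every step, it reports a CV of about $18.35, one cent below the $18.36 obtained in the example from rounded intermediate figures.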
Self-assessment 5.3 Repeat the analysis in example 5.3, assuming the same utility
function and py = $1, but consider that income is I = $125 and the price of good x
decreases from px = $2 to px = $1.
5.4 Equivalent Variation
The equivalent variation focuses on the day before the price change, as opposed to the CV,
which focuses on the day after the price change. Similar to the CV, the EV is defined as
follows.
Equivalent variation (EV) How much money one needs to give to (take away from)
a consumer before a price decrease (increase) such that she is as well off as after the
price change.
Using a similar interpretation as for the CV, note that a price decrease will make the
consumer better off. Hence, the EV asks:
How much money do we need to offer the consumer today
(before she enjoys the price decrease)
to make her as well off as after the price decrease?
Graphically, a price decrease pivots the consumer’s budget line outward, leading her
to reach a higher utility level. If we provide the consumer with more income today, her
initial budget line shifts outwards until it becomes tangent to her final utility level. The
opposite argument applies if she experiences a price increase, which pivots her initial bud-
get line inward, driving her to achieve a lower utility level. In this case, the EV would
measure how much money we need to take away from a consumer today (before she suffers
the price increase) to make her as badly off as she will be once she suffers such an increase in
prices.
Figure 5.4 depicts the EV when the consumer enjoys a price decrease. First, the individual
faces a budget line BL1 and purchases bundle A, reaching utility level u1 . Second, the price
of good x decreases, pivoting the individual’s budget line from BL1 to BL2 , which leads her
to purchase bundle C and reach a higher utility level u2 . The EV focuses on the “before-
the-price-change” scenario, and asks how much money we need to give to the consumer (on
top of her initial income I) to make her as well off as she will be once she enjoys such a
price decrease. Graphically, we then need to make a parallel shift of her initial budget line
BL1 outward until it becomes tangent to her final indifference curve, and thus reaches utility
level u2 . As depicted in the figure, this occurs at bundle E, which entails a budget line BLE .
Because the vertical intercept of the budget lines indicates the individual’s income along all
points of that line, IE reflects the total income that the consumer needs to reach utility level
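$u_2$ at the initial prices.4 Hence, the additional income that we need to give the consumer to
reach utility level $u_2$ is $EV = I_E - I$, as measured by the height between points $I_E$ and $I$ on
the vertical axis of the figure.
4. We say “at the initial prices” because budget lines $BL_1$ and $BL_E$ are parallel, thus indicating that they both face
the initial price ratio.
[Figure 5.4: Finding the EV. Good x is on the horizontal axis; the vertical intercept $I_E$ is marked, and the EV is the distance between $I_E$ and $I$. The figure shows budget lines $BL_1$, $BL_E$, and $BL_2$, bundles A and E, and indifference curves $u_1$ and $u_2$.]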
Example 5.4: Finding the EV of a price decrease Following the scenario in example 4.8,
consider a consumer with the Cobb-Douglas utility function $u(x, y) = xy$,
income $I = \$100$, and a price for good y of $p_y = \$1$. The price of good x decreases
from $p_x = \$3$ to $p_x = \$2$. From example 4.8, we know that the initial bundle A is
$A = \left(\frac{50}{3}, 50\right)$, the final bundle is $C = (25, 50)$, and the decomposition bundle is $B = (20.4, 40.8)$.
The EV of this price decrease is given by $EV = I_E - I$, where $I = \$100$
denotes the individual’s income and $I_E$ represents the income she needs to purchase
bundle E. But, where is bundle E? To find this bundle, recall the conditions we
discussed in figure 5.4:
1. Bundle E must reach the same utility level as the final bundle C. Because
$C = (25, 50)$, its utility level is $u_C = 25 \times 50 = 1{,}250$, implying that bundle $E = (x_E, y_E)$
must also yield this utility level; that is,
$$x_E\, y_E = 1{,}250.$$
2. Bundle E must be a tangency point; that is, the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$ must
hold, which in this case entails $\frac{y}{x} = \frac{3}{1}$ (recall that the slopes of $BL_1$ and $BL_E$ coincide,
as depicted in figure 5.4). This tangency condition simplifies to $y = 3x$. Plugging
this result into $x_E\, y_E = 1{,}250$, we obtain
$$x_E\,(3x_E) = 1{,}250,$$
which collapses to $3(x_E)^2 = 1{,}250$, or to $(x_E)^2 = \frac{1{,}250}{3} \simeq 416.67$. Applying square
roots on both sides, we find $x_E = \sqrt{416.67} \simeq 20.41$ units. We can then use the
tangency condition, $y = 3x$, to find the amount of good y that the individual
consumes in bundle E (i.e., $y_E = 3 \times 20.41 \simeq 61.24$ units, using the unrounded value of $x_E$).
Therefore, the income that the individual spends to purchase bundle $E = (20.41, 61.24)$
at the initial prices ($p_x = \$3$ and $p_y = \$1$) is
$$I_E = (\$3 \times 20.41) + (\$1 \times 61.24) = \$122.47,$$
implying that the EV is
$$EV = I_E - I = \$122.47 - \$100 = \$22.47.$$
Expressed in words, if, before enjoying the price decrease, we increase the consumer’s
income by \$22.47, we help her reach the same utility level that she will enjoy
after the price decrease.
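An alternative route to the same EV uses the expenditure function. For u(x, y) = xy, the EMP yields e(px, py, u) = 2·sqrt(u·px·py); this closed form is derived here for illustration and does not appear in the example itself. The sketch below evaluates it at the initial prices and the final utility level; names are hypothetical.

```python
import math

def indirect_utility(px, py, income):
    """Utility reached at the UMP solution when u(x, y) = x * y."""
    return (income / (2 * px)) * (income / (2 * py))

def expenditure(px, py, utility):
    """Minimum spending needed to reach `utility` when u(x, y) = x * y."""
    return 2 * math.sqrt(utility * px * py)

income, py = 100.0, 1.0
px_initial, px_final = 3.0, 2.0

u_final = indirect_utility(px_final, py, income)       # utility at bundle C
income_e = expenditure(px_initial, py, u_final)        # I_E at the initial prices
ev = income_e - income

print(u_final, round(income_e, 2), round(ev, 2))
```

The same two functions also give the CV of example 5.3 as income minus the expenditure needed to reach the initial utility level at the final prices.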
Self-assessment 5.4 Repeat the analysis in example 5.4, assuming the same utility
function and py = $1, but consider that income is I = $125, and the price of good x
decreases from px = $2 to px = $1.
5.5 Measuring Welfare Changes with No Income Effects
The previous discussion considered three approaches to measuring the welfare change that
consumers experience after a price change: (i) the change in CS, (ii) the CV, and (iii) the
EV. While these welfare measures generally differ (as illustrated in examples 5.2–5.4 for the
Cobb-Douglas utility function), they produce exactly the same number if income effects are
absent. As we know from chapter 4, income effects are zero when, for instance, the consumer
has quasilinear preferences. This type of preference, as example 5.5 shows, produces the
triple coincidence CS = CV = EV. Appendix A at the end of chapter 4 demonstrated that
income effects are negligible when the budget share that the consumer spends on the good we
analyze (relative to her entire income) is small and/or when the income elasticity of the
good is small. In these cases, the three measures of welfare change, CS, CV, and EV,
approximately coincide.
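A brief sketch of why the coincidence arises may help. It assumes (these are labeling choices, not the text's notation) a quasilinear utility of the form u(x, y) = v(x) + y with v increasing and strictly concave, py = $1, a consumer who buys positive amounts of both goods, and a price of good x that falls from an initial level \(p_x\) to a final level \(p'_x\). It also uses the integral representations of the CV and EV derived in the appendix to this chapter.

```latex
% Tangency: MU_x / MU_y = v'(x) / 1 = p_x, so the demand for x depends on p_x only.
\[
v'(x) = p_x \quad\Longrightarrow\quad x(p_x) = (v')^{-1}(p_x).
\]
% Neither income I nor the utility target appears, so the demands for x
% from the UMP and from the EMP (at utility u or u') are the same curve:
\[
x(p_x, I) = x^{E}(p_x, u) = x^{E}(p_x, u') = x(p_x).
\]
% Integrating that single curve between the final and the initial price:
\[
\Delta CS = CV = EV = \int_{p'_x}^{p_x} x(p_x)\, dp_x .
\]
```

With \(v(x) = 2\sqrt{x}\), the tangency condition reads \(1/\sqrt{x} = p_x\), so \(x(p_x) = 1/p_x^2\) regardless of income.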
Example 5.5: CS, CV, and EV with a quasilinear utility function Consider a
consumer with a quasilinear utility function \(u(x, y) = 2\sqrt{x} + y\), an income level of
I = $100, and a price of good y at py = $1. We start by finding the demand function
for goods x and y. In this context, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) becomes
\[ \frac{1/\sqrt{x}}{1} = \frac{p_x}{1}, \]
which simplifies to \(x = \frac{1}{p_x^2}\), providing us with the demand function for good x
(see footnote 5). We can now find the demand function for good y. The budget line
\(p_x x + p_y y = I\) becomes \(p_x x + y = 100\) in this example. Inserting the previous
result \(x = \frac{1}{p_x^2}\) into the budget line, we obtain
\[ p_x \underbrace{\frac{1}{p_x^2}}_{\text{Demand for } x} + y = 100, \]
which simplifies to \(\frac{1}{p_x} + y = 100\), ultimately yielding the demand function for good y,
\[ y = 100 - \frac{1}{p_x}. \]
Consider now that the price of good x decreases from $4 to $3, and let
us find the increase in consumer welfare measured through the three tools learned in
this chapter: the CS, CV, and EV discussed next.
Finding the CS. To obtain the welfare change by using the CS, we simply need to
integrate the demand curve of good x between prices $3 and $4, as follows:
\[ CS = \int_3^4 \frac{1}{p_x^2}\, dp_x = \left[ -\frac{1}{p_x} \right]_3^4 = -\frac{1}{4} - \left( -\frac{1}{3} \right) = \frac{1}{12} \approx 0.08. \]
Finding the CV. We now find the change in consumer welfare measured through
the \(CV = I - I_B\). Let’s start by finding the income that the consumer needs to purchase
bundle B, \(I_B\). To do that, we first obtain bundles A, C, and B, as follows:
1. Finding the initial bundle A. At the initial price px = $4, the demands for goods
x and y simplify to \(x_A = \frac{1}{4^2} = \frac{1}{16}\) and \(y_A = 100 - \frac{1}{4} = \frac{399}{4}\), and thus
bundle A is \(A = \left( \frac{1}{16}, \frac{399}{4} \right)\).
2. Finding the final bundle C. At the final price px = $3, the demands for goods x and
y change to \(x_C = \frac{1}{3^2} = \frac{1}{9}\) and \(y_C = 100 - \frac{1}{3} = \frac{299}{3}\), implying that
\(C = \left( \frac{1}{9}, \frac{299}{3} \right)\).
3. Finding the decomposition bundle B. At the decomposition bundle, the following
must occur:
a. The consumer must reach the same utility level as with the initial bundle A.
Because we found that bundle A is \(A = \left( \frac{1}{16}, \frac{399}{4} \right)\), this bundle yields a utility
level of
\[ u_A = 2\sqrt{\frac{1}{16}} + \frac{399}{4} = 100.25. \]
Therefore, the decomposition bundle B, \((x_B, y_B)\), must also yield a utility level
of 100.25, which mathematically can be expressed as follows:
\[ u_B = 2\sqrt{x_B} + y_B = 100.25. \tag{5.1} \]
b. The consumer’s indifference curve must be tangent to the budget line at the final
prices, \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\), which in this example means
\[ \frac{1/\sqrt{x_B}}{1} = \frac{3}{1}, \]
or \(\frac{1}{\sqrt{x_B}} = 3\). Squaring both sides, we obtain \(\frac{1}{x_B} = 9\), or \(x_B = \frac{1}{9} \approx 0.11\).
Substituting this result in equation (5.1), \(2\sqrt{x_B} + y_B = 100.25\), gives us
\[ 2\sqrt{\frac{1}{9}} + y_B = 100.25, \]
which simplifies to \(\frac{2}{3} + y_B = 100.25\), ultimately yielding \(y_B \approx 99.58\) units
(more precisely, 99.5833). Therefore, the income that the consumer needs to purchase
the decomposition bundle \(B = \left( \frac{1}{9}, 99.58 \right)\) is
\[ I_B = 3 \times \frac{1}{9} + 1 \times 99.5833 \approx \$99.92. \]
4. Evaluating the CV. The CV is then given by
\[ CV = I - I_B = 100 - 99.92 \approx 0.08, \]
which coincides with the CS we found previously because the consumer exhibits
a quasilinear utility function.
Footnote 5. As discussed in chapter 3, when dealing with quasilinear utility functions, the tangency condition provides the
expression of the demand function for good x without the need to insert the results into the budget line. The budget
line only plays a role in determining the demand for good y, which is given by the income left after purchasing
good x.
Finding the EV. We now find the change in consumer welfare measured through the
\(EV = I_E - I\). We start by finding the income that the consumer needs to purchase
bundle E, \(I_E\).
1. Bundle E must reach the same utility level as the final bundle C. In our search for
the CV, we already found that bundle \(C = \left( \frac{1}{9}, \frac{299}{3} \right)\), which yields a utility level of
\[ u_C = 2\sqrt{\frac{1}{9}} + \frac{299}{3} \approx 100.33. \]
Therefore, bundle \(E = (x_E, y_E)\) must also yield a utility level of 100.33, which
mathematically can be written as
\[ u_E = 2\sqrt{x_E} + y_E = 100.33. \tag{5.2} \]
2. The consumer’s indifference curve must be tangent to the budget line at the initial
prices, \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\), which in this example means
\[ \frac{1/\sqrt{x_E}}{1} = \frac{4}{1}, \]
or \(\frac{1}{\sqrt{x_E}} = 4\). Squaring both sides, and rearranging, we obtain \(x_E = \frac{1}{16} = 0.0625\).
Thus, substituting this result in equation (5.2), \(2\sqrt{x_E} + y_E = 100.33\), gives us
\[ 2\sqrt{\frac{1}{16}} + y_E = 100.33, \]
which simplifies to \(\frac{2}{4} + y_E = 100.33\), ultimately yielding \(y_E = 99.83\) units. Thus,
the income that the consumer needs to purchase bundle E = (0.0625, 99.83) is
\[ I_E = 4 \times 0.0625 + 1 \times 99.83 = \$100.08. \]
3. Evaluating the EV. The EV is then given by
\[ EV = I_E - I = 100.08 - 100 = 0.08. \]
This result coincides with those we found for the CS and CV previously because
the consumer exhibits a quasilinear utility function.
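The triple coincidence in example 5.5 can be verified numerically. The sketch below assumes the setting of the example (u(x, y) = 2√x + y, I = $100, py = $1, px falling from $4 to $3) and an interior solution in which the consumer buys positive amounts of both goods; the helper names and the midpoint-rule integrator are illustrative choices, not part of the text.

```python
import math


def demand_x(px):
    """Demand for good x from the tangency condition 1 / sqrt(x) = px (with py = 1)."""
    return 1.0 / px**2


def demand_y(px, income):
    """Income left for good y after buying x (py = 1)."""
    return income - px * demand_x(px)


def utility(x, y):
    return 2.0 * math.sqrt(x) + y


def min_expenditure(px, target_utility):
    """Cost of the cheapest bundle reaching target_utility at price px (py = 1)."""
    x = demand_x(px)                          # tangency pins down x
    y = target_utility - 2.0 * math.sqrt(x)   # the rest of the utility comes from y
    return px * x + y


def midpoint_integral(f, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))


income = 100.0
px_initial, px_final = 4.0, 3.0

cs = midpoint_integral(demand_x, px_final, px_initial)             # area under demand
u_a = utility(demand_x(px_initial), demand_y(px_initial, income))  # utility at bundle A
u_c = utility(demand_x(px_final), demand_y(px_final, income))      # utility at bundle C
cv = income - min_expenditure(px_final, u_a)                       # CV = I - I_B
ev = min_expenditure(px_initial, u_c) - income                     # EV = I_E - I

print(f"CS = {cs:.4f}, CV = {cv:.4f}, EV = {ev:.4f}")
```

All three measures come out at 1/12, or about 0.083, which matches the 0.08 reported in the example once rounded to two decimals.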
Self-assessment: Repeat the analysis in example 5.5 with the same utility
function and py = $1, but consider that income is I = $125, and that the price of good
x decreases from px = $2 to px = $1.
Appendix. An Alternative Representation of the Compensating and Equivalent Variations
We next present an alternative approach to measuring the CV and EV, which uses the
expenditure function, found in chapter 3, where we study the consumer’s expenditure
minimization problem (EMP).
A.1 Compensating Variation

Consider an individual facing prices \(p_x\) and \(p_y\), and seeking to reach a utility target of \(u\) in
her EMP. From the discussion in appendix B in chapter 3, we know that she would set the
tangency condition \(MRS_{x,y} = \frac{p_x}{p_y}\), and then insert the result into her constraint of reaching
utility target \(u\) (i.e., \(u(x, y) = u\)). Following this procedure, the individual obtains a demand
for good x of \(x^E(p_x, p_y, u)\), and a demand for good y of \(y^E(p_x, p_y, u)\), where superscript E
denotes that we obtained this expression after solving the EMP. Intuitively, these demands
help the individual minimize her expenditure while reaching utility target \(u\). We can then
find the cost of buying these demands, as follows:
\[ e(p_x, p_y, u) = p_x x^E(p_x, p_y, u) + p_y y^E(p_x, p_y, u), \tag{5.3} \]
which we refer to as the “expenditure function,” because it represents the minimal
expenditure that the individual needs to incur to reach utility level \(u\) at current prices. We can
then repeat the process when the price of good x decreases from \(p_x\) to \(p'_x\), and the individual
can thus reach a higher utility \(u'\), where \(u' > u\). In this setting, we would obtain demands
\(x^E(p'_x, p_y, u')\) and \(y^E(p'_x, p_y, u')\), and an expenditure function of
\[ e(p'_x, p_y, u') = p'_x x^E(p'_x, p_y, u') + p_y y^E(p'_x, p_y, u'). \]
We now repeat our analysis, decreasing prices from \(p_x\) to \(p'_x\), but we require that the
individual reaches the same utility level as before the price change, \(u\). In this context, we
would find demands \(x^E(p'_x, p_y, u)\) and \(y^E(p'_x, p_y, u)\), and an expenditure function of
\[ e(p'_x, p_y, u) = p'_x x^E(p'_x, p_y, u) + p_y y^E(p'_x, p_y, u). \]
We can now use the monetary amounts found in \(e(p'_x, p_y, u')\) and \(e(p'_x, p_y, u)\) to express
the CV. In particular, recall that the CV takes an “after-the-price-change” perspective, thus
implying that we must focus on the final price, \(p'_x\). In addition, recall that the CV measures
the amount of money the consumer is willing to give up after the price decrease (after her
utility level improves from \(u\) to \(u'\)) to be just as well off as before the price decrease (where
she only reached utility level \(u\)). Formally, the CV can then be written as
\[ CV = e(p'_x, p_y, u') - e(p'_x, p_y, u). \]

Yet another representation of CV. While the equation given here is a convenient
expression of the CV, we can alternatively write it using \(x^E(p_x, p_y, u)\) alone. In particular, note
that the individual’s expenditure must satisfy \(e(p_x, p_y, u) = I\) when the utility target \(u\)
coincides with the maximal utility that the individual reaches when solving her UMP. Similarly,
\(e(p'_x, p_y, u') = I\), which allows us to rewrite the CV as
\[ CV = I - e(p'_x, p_y, u), \]
or, using \(e(p_x, p_y, u) = I\), we can express CV as
\[ CV = e(p_x, p_y, u) - e(p'_x, p_y, u). \]
CV then represents the decrease in the consumer’s minimal expenditure of reaching utility
target \(u\) when prices decrease from \(p_x\) to \(p'_x\). Hence, the CV can be rewritten as the
integral of the change in \(e(p_x, p_y, u)\), that is, of its derivative with respect to \(p_x\), between the
final price \(p'_x\) and the initial price \(p_x\), as follows:
\[ CV = \int_{p'_x}^{p_x} \frac{\partial e(p_x, p_y, u)}{\partial p_x}\, dp_x. \]
Lastly, to simplify this expression, recall that from the previous description of \(e(p_x, p_y, u)\)
in equation (5.3), we know that its derivative with respect to \(p_x\) is
\(\frac{\partial e(p_x, p_y, u)}{\partial p_x} = x^E(p_x, p_y, u)\), which reduces the CV to
\[ CV = \int_{p'_x}^{p_x} x^E(p_x, p_y, u)\, dp_x. \]
Graphically, the CV then becomes the area below the demand curve for good x that we found
from the EMP, \(x^E(p_x, p_y, u)\), between prices \(p_x\) and \(p'_x\).
Example 5.6: An alternative representation of CV Consider an individual with
a Cobb-Douglas utility function \(u(x, y) = xy\). We can solve the EMP to show that the
demand for good x is \(x^E(p_x, p_y, u) = \sqrt{u \frac{p_y}{p_x}}\). (You can take this opportunity to practice
with the EMP, showing that you can find the same demand function; see footnote 6.) Consider that
the price of good x decreases from \(p_x = \$3\) to \(p'_x = \$2\), the price for good y is held
constant at py = $1, and the consumer seeks to reach a utility target of
\(u = xy = \frac{50}{3} \times 50 = 833.33\). This is the utility level that the consumer reaches at
bundle \(A = \left( \frac{50}{3}, 50 \right)\); see example 5.4 for more details. In this case, the demand
function we found from the EMP, \(x^E(p_x, p_y, u) = \sqrt{u \frac{p_y}{p_x}}\), simplifies to
\(\sqrt{833.33 \frac{1}{p_x}} = 28.87 \sqrt{\frac{1}{p_x}}\). Therefore, the CV becomes the integral of demand
function \(28.87 \sqrt{\frac{1}{p_x}}\) between prices \(p_x = \$3\) and \(p'_x = \$2\); that is,
\[ CV = \int_{p'_x}^{p_x} x^E(p_x, p_y, u)\, dp_x = \int_2^3 28.87 \sqrt{\frac{1}{p_x}}\, dp_x = 28.87 \int_2^3 \frac{1}{\sqrt{p_x}}\, dp_x \]
\[ = 28.87 \left[ 2\sqrt{p_x} \right]_2^3 = 28.87 \times 2 \left( \sqrt{3} - \sqrt{2} \right) \approx \$18.35. \]
Due to approximations while solving, there is a small difference ($0.01) in the CV
in this example and that in example 5.3.
Footnote 6. Recall that to solve the EMP, we need to satisfy two conditions. The first is the tangency condition
\(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\), which reduces to \(\frac{y}{x} = \frac{p_x}{p_y}\) in this example, where \(u(x, y) = xy\). Solving for good y
yields \(y = x \frac{p_x}{p_y}\). Second, from the utility target condition, we know that the consumer must reach a
utility level \(u\) so that \(xy = u\). Inserting the expression obtained from the tangency condition,
\(y = x \frac{p_x}{p_y}\), yields \(x \left( x \frac{p_x}{p_y} \right) = u\), which simplifies to \(x^2 = u \frac{p_y}{p_x}\).
Taking the square root of both sides, we obtain the demand for good x, \(x^E(p_x, p_y, u) = \sqrt{u \frac{p_y}{p_x}}\). Intuitively, the
consumer’s demand for good x increases in the utility that she seeks to reach, u (as she needs more units of x to
reach this utility), in the price of good y (i.e., as this good becomes more expensive, the consumer demands more
units of good x because it became cheaper in relative terms), but decreases in the price of good x.
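The area under the EMP demand curve can also be checked against the direct difference in minimal expenditures. The sketch below assumes the setting of example 5.6 (u(x, y) = xy, py = $1, px falling from $3 to $2, utility target 833.33); the function names and the midpoint-rule integrator are hypothetical helpers introduced for illustration.

```python
import math


def hicksian_x(px, py, target_utility):
    """EMP demand for good x when u(x, y) = x * y."""
    return math.sqrt(target_utility * py / px)


def expenditure(px, py, target_utility):
    """Expenditure function px * x^E + py * y^E, where y^E = x^E * px / py."""
    x = hicksian_x(px, py, target_utility)
    y = x * px / py
    return px * x + py * y


def midpoint_integral(f, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))


py = 1.0
px_initial, px_final = 3.0, 2.0
u_initial = (50.0 / 3.0) * 50.0   # utility at the initial bundle A

# CV as the area under the EMP demand curve between the two prices.
cv_area = midpoint_integral(lambda p: hicksian_x(p, py, u_initial), px_final, px_initial)

# CV as the drop in minimal expenditure at the initial utility level.
cv_difference = expenditure(px_initial, py, u_initial) - expenditure(px_final, py, u_initial)

print(f"CV (area) = {cv_area:.2f}, CV (expenditure difference) = {cv_difference:.2f}")
```

Both routes give approximately $18.35, which illustrates that integrating \(x^E\) over the price change and differencing the expenditure function are the same calculation.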
Self-assessment: Repeat the analysis in example 5.6 with the same utility
function and py = $1, but consider that the price of good x decreases from px = $2 to
px = $0.5.
A.2 Equivalent Variation

We can follow a similar approach to write the EV using the expenditure function. In
particular, recall that the EV takes a “before-the-price-change” perspective, thus implying that
we must focus on the initial price, \(p_x\). Hence, the EV can be expressed as
\[ EV = e(p_x, p_y, u') - e(p_x, p_y, u), \]
which measures the amount of money that the consumer needs to receive before the price
decrease (when her utility level is still \(u\)) to be just as well off as after the price decrease
(when she reaches a higher utility level, \(u'\)). We can also follow a similar approach as in
the CV to obtain yet one more expression for the EV. First, note that the consumer’s minimal
expenditure \(e(p_x, p_y, u)\) satisfies \(e(p_x, p_y, u) = I\), which helps us to rewrite the above EV as
\[ EV = e(p_x, p_y, u') - I; \]
and because \(e(p'_x, p_y, u') = I\) holds as well, we can express the EV as
\[ EV = e(p_x, p_y, u') - e(p'_x, p_y, u'). \]
Intuitively, the EV measures the change in the consumer’s minimal expenditure of reaching
utility level \(u'\) when the price of good x decreases from \(p_x\) to \(p'_x\). Therefore, the EV can
be rewritten as the integral of the change in \(e(p_x, p_y, u')\), that is, of its derivative with
respect to \(p_x\), between the final price \(p'_x\) and the initial price \(p_x\), as follows:
\[ EV = \int_{p'_x}^{p_x} \frac{\partial e(p_x, p_y, u')}{\partial p_x}\, dp_x. \]
Finally, to simplify this expression, recall that the derivative of \(e(p_x, p_y, u')\) with respect to
\(p_x\) is \(\frac{\partial e(p_x, p_y, u')}{\partial p_x} = x^E(p_x, p_y, u')\), which helps us reduce the EV to
\[ EV = \int_{p'_x}^{p_x} x^E(p_x, p_y, u')\, dp_x. \]
Graphically, the EV is the area below the demand curve for good x that we found from the
EMP, \(x^E(p_x, p_y, u')\), between prices \(p_x\) and \(p'_x\). Comparing this expression with that of the
CV, they are symmetric except for the fact that the EV is evaluated at utility level \(u'\) whereas
the CV is evaluated at utility level \(u\).
Example 5.7: An alternative representation of EV Following example 5.6,
consider an individual with demand for good x as \(x^E(p_x, p_y, u) = \sqrt{u \frac{p_y}{p_x}}\). As in that
example, consider that the price of good x decreases from \(p_x = \$3\) to \(p'_x = \$2\). The
utility level that the consumer reaches at final bundle C (after the price change) is
\(u' = 1{,}250\) (revisit example 5.4 for more details). Inserting utility level \(u' = 1{,}250\)
and price py = $1 into the demand function yields \(\sqrt{1{,}250 \frac{1}{p_x}} = 25 \sqrt{\frac{2}{p_x}}\). Therefore,
the EV becomes
\[ EV = \int_{p'_x}^{p_x} x^E(p_x, p_y, u')\, dp_x = \int_2^3 25 \sqrt{\frac{2}{p_x}}\, dp_x = 25\sqrt{2} \left[ 2\sqrt{p_x} \right]_2^3 = 50\sqrt{2} \left( \sqrt{3} - \sqrt{2} \right) \approx \$22.47, \tag{5.4} \]
which coincides (up to about $0.04, due to rounding) with the answer obtained in
example 5.4 using an alternative approach.
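For this Cobb-Douglas case, the two appendix formulas have simple closed forms, which makes the relation between the CV and the EV explicit. The derivation below is a sketch under the assumptions of examples 5.6 and 5.7 (u(x, y) = xy, initial price \(p_x\), final price \(p'_x\), initial utility \(u\), final utility \(u'\)); it uses only the EMP demand given in those examples.

```latex
% Expenditure function for u(x, y) = xy, using x^E = \sqrt{u p_y / p_x} and y^E = x^E p_x / p_y:
\[
e(p_x, p_y, u) = p_x x^E + p_y y^E = 2\sqrt{u\, p_x p_y}.
\]
% The CV holds the initial utility u fixed; the EV holds the final utility u' fixed:
\[
CV = e(p_x, p_y, u) - e(p'_x, p_y, u) = 2\sqrt{u\, p_y}\left(\sqrt{p_x} - \sqrt{p'_x}\right),
\qquad
EV = e(p_x, p_y, u') - e(p'_x, p_y, u') = 2\sqrt{u'\, p_y}\left(\sqrt{p_x} - \sqrt{p'_x}\right).
\]
% Hence the two measures differ only through the utility level at which they are evaluated:
\[
\frac{EV}{CV} = \sqrt{\frac{u'}{u}} = \sqrt{\frac{1{,}250}{833.33}} \approx 1.22,
\qquad 18.35 \times 1.2247 \approx 22.47 .
\]
```

Because the price decrease raises utility (\(u' > u\)), the EV exceeds the CV here, consistent with the $22.47 and $18.35 found in the two examples.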
Self-assessment: Repeat the analysis in example 5.7 with the same utility
function and py = $1, but consider that the price of good x decreases from px = $2 to
px = $0.5.
Exercises
1. Changes in CS.A Patricia wants to measure the change in CS when the market price for doughnuts
increases to $14.5 (per box of a dozen doughnuts). She needs your help! Consider that the demand for
doughnuts is \(p(q) = 15 - \frac{1}{2} q\), and the supply is \(p = 7q\).
2. CS and a tax.A We can also use CS to measure the impact of taxes and subsidies. Assume that the
inverse demand for a pack of cigarettes is p(q) = 25 − 5q.
(a) Find the CS when the market price is p = $4.00.
(b) Find the change in CS if a $0.50 tax is added to the price of a pack of cigarettes.
3. CS and changing prices.A Every spring, John goes to a local co-op to buy seeds to plant in his
field. He has been keeping track of prices of seeds and the number of tons of seeds that he buys
each year and estimates his demand curve to be p(q) = 300 − 10q. Last year, John paid $150 per
ton of seeds. This year, he noticed that the price went down to $100. Unfortunately, John didn’t take
any economics courses in college, so he doesn’t know how to quantify his welfare improvement.
Help John find his CS from this price decrease.
4. CS and nonlinear demand.B Assume that a consumer has a demand for good x of \(x = \frac{50}{2\sqrt{p_x}}\). What
is the change in CS if the price of x increases from \(p_x\) to \(p'_x\)? For simplicity, you can assume that
\(p_x = \$1\).
5. CS and nonlinear demand-I.A Jean has a Cobb-Douglas utility function that yields a demand for
jeans of \(j = \frac{I}{2 p_j}\). Jean has an income of $100, and jeans have a price of $25.
(a) What is the change in CS if the price of jeans increases from $25 to $30?
(b) What is the change in CS if the price of jeans decreases from $25 to $20?
6. CS and nonlinear demand-II.A A different variation of the Cobb-Douglas utility function
(\(u = x^{0.2} y^{0.8}\)) will yield a demand for x of \(x = \frac{0.2 I}{p_x}\).
(a) If I = 100 and px changes from $1 to $2, what is the change in CS?
(b) If I = 100 and px changes from $5 to $6, what is the change in CS? How does this differ from
(a)?
(c) If I = 200 and px changes from $1 to $2, what is the change in CS? How does this differ from
(a)?
7. Calculating CV.B Redo the analysis from example 5.3, but now assume that py changes from
py = $1 to py = $2, while px = $1.
8. CV with different income.B Redo the analysis from example 5.3, but now assume that I = $200.
How does the increase in income affect CV?
9. CV with Cobb-Douglas utility function.B Chris has a demand for books (b) and other goods
(y) that follows the Cobb-Douglas utility function \(u(b, y) = y\sqrt{b}\), and an income of I = $50. Find
Chris’s CV if the price of books decreases from \(p_b = \$2\) to \(p'_b = \$1\).
10. CV and EV.A In words, describe the difference between the CV and EV.
11. CV with general price change.C Let’s investigate the impact of a generic change in price.
Consider a consumer with the Cobb-Douglas utility u(x, y) = xy, an income of I = 100, and a
normalized price of good y at py = $1. What is the CV of a change in the price of x from \(p_x\) to \(p'_x\)?
For simplicity, you can assume that \(p_x = 1\). Dividing both prices by \(p_x\), we can more compactly
express the initial price as \(\frac{p_x}{p_x} = \$1\), and the final price as \(\frac{p'_x}{p_x} = p\). Intuitively, when p > 1, we have
that \(p'_x > p_x\), so good x becomes more expensive, and when 0 < p < 1, we have that \(p'_x < p_x\) and
good x is cheaper.
12. EV with Cobb-Douglas demand–I.B Again, consider Chris’s demand for books (b) and other
goods (y) that follows the Cobb-Douglas utility function \(u(b, y) = y\sqrt{b}\), and an income of I = $50.
Find Chris’s EV if the price of books decreases from \(p_b = \$2\) to \(p'_b = \$1\).
13. EV with Cobb-Douglas Demand–II.B Consider a consumer with the Cobb-Douglas utility function
\(u(x, y) = x\sqrt{y}\), with income I = $100, and the price of good y is normalized at py = $1. Calculate
the EV of the change in price of good x from \(p_x = \$5\) to \(p'_x = \$10\).
14. EV with general price change.C Repeat the analysis from example 5.4, but assume a generic
increase in the price of good x from \(p_x\) to \(p'_x\). For simplicity, you can assume that \(p_x = 1\). Dividing
both prices by \(p_x\), we can more compactly express the initial price as \(\frac{p_x}{p_x} = \$1\), and the final price
as \(\frac{p'_x}{p_x} = p\). Intuitively, when p > 1, we have that \(p'_x > p_x\), so good x becomes more expensive, and
when 0 < p < 1, we have that \(p'_x < p_x\) and good x is cheaper.
15. EV with different income.A Redo the analysis from example 5.4, but now assume that I = $200.
How does the increase in income affect EV?
16. CV and EV with quasilinear utility.B Samantha often consumes two goods during exam times
at school to relax, chocolate (c) and music (m). Her utility from consuming these two goods is
represented by the following quasilinear utility function, \(u(c, m) = c + 2m^{1/3}\). Her income level
during exam week is I = $120, and the price of a bar of chocolate is pc = $4. Identify the CV and
EV when the price for downloading music increases from \(p_m = \$2\) to \(p'_m = \$3\).
17. CS, CV, and EV with no income effects–I.B Repeat the analysis from example 5.5, but now
assume a generic increase in the price of good x, from \(p_x\) to \(p'_x\). For simplicity, you can assume
that \(p_x = 1\). Dividing both prices by \(p_x\), we can more compactly express the initial price as \(\frac{p_x}{p_x} = \$1\),
and the final price as \(\frac{p'_x}{p_x} = p\). Intuitively, when p > 1, we have that \(p'_x > p_x\), so good x becomes
more expensive, and when 0 < p < 1, we have that \(p'_x < p_x\) and good x is cheaper.
18. CS, CV, and EV with no income effects–II.C Explain the intuition behind section 5.5 of this
chapter. That is, why do CS, CV, and EV coincide when there is no income effect?
19. Alternative representation of CV.B Consider a consumer with utility \(u(x, y) = x^{0.75} y^{0.25}\).
(a) Find the demand for goods x and y by solving the consumer’s EMP.
(b) Calculate the CV for a price increase from \(p_x = \$1\) to \(p'_x = \$2\), where u = 10 and py = $1.
(c) Calculate the CV of the price change for good x, but use the demand function of good y to
see how the consumer’s welfare in her purchases of good y is affected by a more expensive
good x.
20. Alternative representation of CV – quasilinear utility.B Consider a consumer with the
quasilinear utility function \(u(x, y) = 2x^{1/3} + y\). The demand for good x from the EMP yields
\(x^E(p_x, p_y, u) = \left( \frac{2 p_y}{3 p_x} \right)^{3/2}\), as solved for in exercise 16 (where goods m and c are relabeled as
x = m and y = c).
(a) Find demand for good y from the consumer’s EMP.
(b) Calculate the CV for a price increase from \(p_x = \$5\) to \(p'_x = \$10\), where u = 30 and py = $1.
(c) Calculate the CV of the price change for good x, but use the demand function of good y to
see how the consumer’s welfare in her purchases of good y is affected by a more expensive
good x.
21. Alternative representation of EV.B Consider a consumer with a utility function
\(u(x, y) = x^{0.75} y^{0.25}\).
(a) Find the demand for goods x and y by solving the consumer’s EMP.
(b) Calculate the EV for a price increase from \(p_x = \$1\) to \(p'_x = \$2\), where the new utility is u = 5
and py = $1.
(c) Calculate the EV of the price change in good x, but use the demand function of good y to
see how the consumer’s welfare in her purchases of good y is affected by a more expensive
good x.
22. Alternative representation of EV – quasilinear utility.B Consider a consumer with a quasilinear
utility \(u(x, y) = 2x^{1/3} + y\). The demand function from the EMP yields \(x^E(p_x, p_y, u) = \left( \frac{6}{p_x} \right)^{3/2}\), as
solved for in exercise 16 (where we relabeled goods m and c as x = m and y = c).
(a) Find demand for good y from the consumer’s EMP.
(b) Calculate the EV for a price increase from \(p_x = \$5\) to \(p'_x = \$10\), where u = 20 and py = $1.
(c) Calculate the EV of the price change in good x, but use the demand function of good y to see
how the consumer’s welfare in her purchases of good y is affected by a more expensive good
x at the new utility, u = 20.
6 Choice under Uncertainty
6.1 Introduction
In this chapter, we analyze situations where individuals or firms make choices under
uncertainty, such as playing roulette in a casino or buying company stocks, as each out-
come is not certain but has a probability associated with it. Another example is weather
predictions, which nowadays are reported with a probability associated with rain, cloud
cover, or sunny days.
We start the chapter by describing what we mean by a lottery: an uncertain event with
associated probabilities. We then explain how to find a lottery’s expected value, its variance,
its standard deviation, and the expected utility that an individual obtains from participat-
ing in a lottery. While the expected value and variance of a lottery are both objective
measures that we all agree on, its expected utility can be different depending on each
individual’s degree of risk aversion. We define risk aversion in section 6.6, along with
risk loving and risk neutrality. To start thinking about risk aversion, consider two job
offers, both of them entailing the same expected dollar amount, $50,000. However, offer
A gives you $50,000 with certainty (no risk), while offer B involves risk (e.g., it promises
$60,000 with probability 0.8 and $10,000 with probability 0.2). Which one would you
choose? As we discuss in section 6.6, a risk-averse individual may prefer offer A, a risk
lover may prefer offer B, and a risk-neutral person would be indifferent between the two
offers.
We then discuss different measures of risk: (1) the risk premium that a risk-averse indi-
vidual is willing to pay to avoid risk (i.e., to obtain a certain amount rather than participating
in a lottery); (2) the certainty equivalent of a lottery; and (3) the Arrow-Pratt coefficient of
absolute risk aversion. We finish the chapter by presenting other approaches to decision-
making under uncertainty from the behavioral economics literature, such as the certainty
effect, prospect theory, and weighted utility.
Figure 6.1
Probability lottery portrayed as a histogram. (Vertical axis: probability p; bars: outcome A at 10%, outcome B at 60%, and outcome C at 30%.)
6.2 Lotteries
Lottery An uncertain event with N potential outcomes, where each outcome i occurs
with an associated probability pi ∈ [0, 1], and the sum of these probabilities satisfies
p1 + p2 + … + pN = 1.
The act of flipping a coin, for instance, can be regarded as a lottery, with two potential out-
comes (Heads or Tails), each being equally likely (with probability 1/2). Similarly, weather
conditions tomorrow can be understood as a lottery, where each outcome would be a dif-
ferent weather condition associated with a specified probability. Other common examples
are stock returns, the outcome of a race, the score of a soccer match, and the probabi-
lity of lightning striking you while you read this. Therefore, lotteries can be understood as
probability distributions over outcomes, such as the one depicted in figure 6.1, where out-
come A occurs with probability 10 percent, B with probability 60 percent, and C with the
remaining probability 30 percent. These probabilities can be understood as the frequency
with which we observe a certain outcome, such as A, occurring. For instance, if A refers to
good-quality cars, probability 10 percent indicates the proportion of good-quality cars in a
certain region.
6.3 Expected Value
In this section, we describe how to evaluate the average payoff of a lottery (e.g., the return of
a stock at the New York Stock Exchange) by computing its expected value.
Expected value (EV) The average payoff of a lottery, where each payoff is weighted
by its associated probability.
The EV, therefore, computes the average payoff of a lottery by multiplying each possible
payoff with its associated probability of occurring. As a consequence, the EV assigns a
larger weight to those payoffs that are relatively more likely to occur, and a smaller weight
to those that are less likely. Example 6.1 puts this definition to work.
Example 6.1: Finding the EV of a lottery Consider the following probability dis-
tribution: outcome A ($90) occurs with probability 10 percent, outcome B ($20) with
probability 60 percent, and outcome C ($60) with probability 30 percent. The EV of
the lottery is given by the weighted average
EV = (0.1 × $90) + (0.6 × $20) + (0.3 × $60)

= 9 + 12 + 18 = $39.
As discussed previously, the EV assigns the largest weight to the most likely outcome
B (its associated probability is 0.6), a smaller weight to outcome C (its probability is
0.3), and the smallest weight to the most unlikely outcome A (because its probability
is only 0.1).
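The weighted average in example 6.1 is easy to reproduce in code. This is a minimal sketch; representing a lottery as a list of (probability, payoff) pairs and the function name `expected_value` are choices made here for illustration, not notation from the text.

```python
def expected_value(lottery):
    """Probability-weighted average payoff of a lottery.

    `lottery` is a list of (probability, payoff) pairs.
    """
    total_probability = sum(p for p, _ in lottery)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in lottery)


# Outcomes A, B, and C from example 6.1.
risky_lottery = [(0.1, 90.0), (0.6, 20.0), (0.3, 60.0)]
ev_risky = expected_value(risky_lottery)
print(f"EV = ${ev_risky:.2f}")
```

The check on the probabilities mirrors the definition of a lottery in section 6.2, which requires them to sum to 1.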
Self-assessment 6.1 Consider the lottery in example 6.1, but assume now that
outcome A provides you with a payoff of $800, while outcome C only gives you $12.
How is the EV of the lottery affected? Interpret.
6.4 Variance
While the EV informs about the expected payoff of a lottery, it does not provide us with a
measure of how risky the lottery is. We can find lotteries yielding the same EV as that in
example 6.1 (EV = $39), yet far less risky than that lottery. For instance, a lottery with two
equally likely outcomes a ($30) and b ($48) also generates an EV of
EV = (0.5 × $30) + (0.5 × $48) = $39.
Intuitively, while the lottery in example 6.1 has a large payoff variability (with payoffs
ranging from $20 to $90), the lottery we just presented fluctuates close to its EV of $39
($9 down in outcome a, or $9 up in outcome b). One measure of the riskiness of a lottery
is its variance, which we define next.
Variance (Var) The average squared deviation of a lottery from its EV, weighting
each squared deviation by the associated probability of that outcome.
You can think about variance sequentially, as follows:
1. For each possible outcome in the lottery, x, we compute how far away this outcome is
relative to the EV (i.e., \(x - EV\)). This difference can be positive, if payoff x satisfies
\(x > EV\); negative, if \(x < EV\); or zero, if the outcome’s payoff coincides with the EV.
2. Square this payoff difference, \((x - EV)^2\), so all differences are positive (both if payoffs
are above or below EV).
3. Lastly, multiply this squared deviation \((x - EV)^2\) by the probability of the outcome, as
in the EV calculation. This helps us weight each outcome with its associated likelihood
of occurring.
4. If we repeat these three steps for all possible outcomes, and sum them up, we obtain the
variance.
Therefore, the variance measures the dispersion of a data set relative to its mean (i.e., it
increases as some payoffs become further away from the EV of the lottery). For instance,
a volatile stock has a high variance. The variance also increases as outcomes with a large
squared deviation become more likely (i.e., their probability weight increases).
We next provide a numerical example to illustrate how to find the variance of a lottery.
Example 6.2: Finding the variance of a lottery Let us first calculate the variance
of the (risky) lottery in example 6.1:
\[ Var_{Risky} = 0.1 \times (\$90 - \$39)^2 + 0.6 \times (\$20 - \$39)^2 + 0.3 \times (\$60 - \$39)^2 = \$609. \]
Intuitively, while the squared deviation of outcome A, \((\$90 - \$39)^2\), is large, its
probability weight is the lowest (0.1), helping reduce the variance due to outcome A.
In contrast, the squared deviation of outcome B is the smallest, \((\$20 - \$39)^2\), as $20
is close to the EV of the lottery. Next, we can practice finding the variance of the
relatively safe lottery presented at the beginning of this section:
\[ Var_{Safe} = 0.5 \times (\$30 - \$39)^2 + 0.5 \times (\$48 - \$39)^2 = \$81, \]
which is, of course, much smaller than that of the previous lottery because the squared
deviations are low.
Self-assessment 6.2 Consider the risky lottery in example 6.2. If outcome A yields
$800 rather than $90, how is the variance of the lottery affected? Interpret.
While variance helps us measure the volatility of a data set, it cannot be interpreted as a
dollar amount, as payoff deviations from the mean have been squared. The standard devi-
ation, defined next, helps us understand the dispersion of a data set in dollars or, more
generally, in the original units of our payoffs.
Standard deviation (SD) The square root of the variance, or \(SD = \sqrt{Var}\).
For the variances found in example 6.2, we have that \(SD = \sqrt{609} \approx \$24.68\) for the
riskier lottery, and only \(SD = \sqrt{81} = \$9\) for the less risky lottery. Needless to say, if lottery 1
has a larger variance than lottery 2, then it must also have a larger standard deviation because
SD is increasing in Var.
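The sequential recipe for the variance, and the standard deviation built on it, can be sketched as follows. The lotteries are those of example 6.2; the (probability, payoff) representation and the function names are assumptions made for this illustration.

```python
import math


def expected_value(lottery):
    """Probability-weighted average payoff; `lottery` holds (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in lottery)


def variance(lottery):
    """Probability-weighted average of squared deviations from the expected value."""
    mean = expected_value(lottery)
    return sum(p * (payoff - mean) ** 2 for p, payoff in lottery)


def standard_deviation(lottery):
    """Square root of the variance, expressed in the payoffs' original units."""
    return math.sqrt(variance(lottery))


risky_lottery = [(0.1, 90.0), (0.6, 20.0), (0.3, 60.0)]
safe_lottery = [(0.5, 30.0), (0.5, 48.0)]

for name, lottery in [("risky", risky_lottery), ("safe", safe_lottery)]:
    print(f"{name}: Var = {variance(lottery):.2f}, SD = {standard_deviation(lottery):.2f}")
```

Both lotteries share the same expected value of $39, so the difference in their variances (609 versus 81) isolates the difference in risk.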
6.5 Expected Utility
While previous sections analyze how to evaluate the expected monetary value of a lottery,
and its riskiness, we do not yet have a tool to determine which specific lottery a deci-
sion maker selects when facing several available lotteries. To understand the value that she
assigns to each lottery, we must first measure the expected utility she obtains from each
lottery.
Expected utility (EU) The average utility of a lottery, weighting each utility with
the associated probability of that outcome.
We find the utility that the individual obtains from the payoff in one outcome, multi-
ply this utility by the probability of that outcome occurring, and then repeat the process
for all other outcomes. As a consequence, the definition of EU is similar to that of EV, as
both approaches weight payoffs according to their probability, assigning a larger weight
to more likely outcomes. However, EU plugs each payoff into the individual’s utility
function to better assess how important that payoff is for her, while EV considers only
payoffs, without evaluating their utility for the individual.1 Example 6.3 illustrates this
definition.
Example 6.3: Finding the EU of a lottery Consider an individual with utility
function \(u(I) = \sqrt{I}\), where \(I \geq 0\) denotes the income that the individual receives in each
outcome. Let us first calculate the EU of the lottery in example 6.1:
\[ EU_{Risky} = 0.1 \times \sqrt{\$90} + 0.6 \times \sqrt{\$20} + 0.3 \times \sqrt{\$60} \approx 5.96, \]
while that of the second (less risky) lottery is
\[ EU_{Safe} = 0.5 \times \sqrt{\$30} + 0.5 \times \sqrt{\$48} \approx 6.20. \]
This result indicates that the individual obtains a higher EU from the second lottery.
While both lotteries generate the same EV, the safer lottery yields a higher EU for this
individual.
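Expected utility differs from expected value only in that each payoff is passed through the utility function before weighting. A minimal sketch, assuming the square-root utility of example 6.3 and the (probability, payoff) representation used here for illustration:

```python
import math


def expected_utility(lottery, utility):
    """Probability-weighted average of utility(payoff) over (probability, payoff) pairs."""
    return sum(p * utility(payoff) for p, payoff in lottery)


risky_lottery = [(0.1, 90.0), (0.6, 20.0), (0.3, 60.0)]
safe_lottery = [(0.5, 30.0), (0.5, 48.0)]

eu_risky = expected_utility(risky_lottery, math.sqrt)
eu_safe = expected_utility(safe_lottery, math.sqrt)
print(f"EU risky = {eu_risky:.2f}, EU safe = {eu_safe:.2f}")
```

Passing the utility function as an argument makes it straightforward to try other functions, such as the cube-root utility in the self-assessment, by supplying `lambda income: income ** (1 / 3)` instead of `math.sqrt`.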
Self-assessment 6.3 Consider the scenario in example 6.3, but assume that the
individual’s utility function changes to \(u(I) = I^{1/3}\). What is her EU from the risky
lottery? What about from the safe lottery?
6.6 Risk Attitudes
6.6.1 Risk Aversion

Figure 6.2 depicts the EU from the less risky lottery. We can understand the construction of
this figure sequentially as follows:
1. We plot the utility function \(u(I) = \sqrt{I}\), which is increasing and concave in income.2
Intuitively, more income increases the individual’s utility, but at a decreasing rate:
additional amounts of income are more beneficial when she has only $1 than when she
has $1 million!
2. We place payoff $30 on the horizontal axis (recall that the individual obtains this payoff
when outcome A occurs).
3. We extend a vertical line from this point until we hit the utility function, at a height of
\(\sqrt{\$30} \cong 5.47\) at point A.
4. We then repeat steps 2–3 for the other outcome in this lottery, $48, first placing it on the
horizontal axis and then extending a vertical line that hits the utility function at
\(\sqrt{\$48} \cong 6.93\) at point B.
5. Finally, we connect points A and B with a line and, because the lottery assigns the same
probability to both outcomes A and B (they are equally likely), we find the midpoint of
the line (see point C). The height of this point represents the EU of the lottery which, as
described in example 6.3, is EU = 6.20. Intuitively, this height represents the utility that
the individual obtains from playing the lottery, while facing uncertainty about
which outcome will arise.
[Figure 6.2. EV and EU from a lottery: risk averse. Horizontal axis: income I, with $30,
EV = $39, and $48 marked. Vertical axis: u(I), with point A at 5.47, point C at EU = 6.20,
point D at u(EV) = 6.24, and point B at 6.93.]
1. A utility function with the EU form is also referred to as a “von Neumann-Morgenstern EU function.”
2. For an increasing utility function u(I), we say that it is concave if it increases at a decreasing rate. Mathemati-
cally, this means that its second-order derivative with respect to income I is negative or zero, \(u''(I) \leq 0\), but never
positive.
Note that if we face a more volatile lottery (with higher variance), the line connecting
points A and B becomes longer. For instance, if payoff $30 decreases to $16 in this lottery,
while keeping all other elements of the lottery unchanged, point A’s height would be only
\(u(16) = \sqrt{16} = 4\).
In contrast, if we seek to depict the utility of the EV of the lottery, we only need to
place the payoff corresponding to the EV, $39, on the horizontal axis, and then extend
a vertical line until it hits the utility function, at a height of \(\sqrt{\$39} \cong 6.24\), as illustrated
at point D. Intuitively, the utility of the EV (or, more compactly, \(u(EV) = \sqrt{EV}\) in this
example) represents the utility that the individual would obtain if she received the EV with cer-
tainty, without having to face the risk of playing the lottery. As figure 6.2 indicates, point D
lies above point C, thus indicating that
\[ u(EV) > EU. \]
In short, this says that the individual is “risk averse” because she prefers to receive the
EV of the lottery with certainty, where she obtains u(EV ), rather than having to face the risk
of playing the lottery, which yields EU.3
Intuitively, the reduction in utility that she suffers from the downside of the lottery (6.24 −
5.47 = 0.77) is larger than the increase in utility from the upside of the lottery (6.93 − 6.24 =
0.69). For this type of individual, we can anticipate that, if facing two lotteries with the same
EV (such as the risky and safe lotteries in example 6.3), she will always prefer the safer
lottery. Put differently, if two lotteries have the same EV, a risk-averse individual
prefers the lottery with the lower variance because it yields a higher EU.
Concave utility. Risk aversion arises every time an individual’s utility function is concave,
such as \(u(I) = \sqrt{I}\), depicted in figure 6.2. Utility functions with the form \(u(I) = a + bI^{\gamma}\) are
concave if constants a and b are positive and exponent γ in the individual’s income satisfies
\(\gamma \in (0, 1)\). In the previous example, a = 0, b = 1 and γ = 1/2, but other utility functions
like \(u(I) = 5 + 4I^{1/3}\) or \(u(I) = 2 + 8I^{2/5}\) would also yield increasing and concave utility
functions.4
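To see why the exponent governs curvature, the general form can be differentiated twice. The following is a short derivation, assuming I > 0 and b > 0:

```latex
\[
u(I) = a + bI^{\gamma}, \qquad
u'(I) = b\gamma I^{\gamma - 1}, \qquad
u''(I) = b\gamma(\gamma - 1) I^{\gamma - 2}.
\]
```

With b > 0 and I > 0, the first derivative is positive whenever γ > 0, and the second derivative takes the sign of (γ − 1), so it is negative for any γ between 0 and 1. The constant a drops out of both derivatives, so it shifts the level of utility without changing its curvature.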
6.6.2 Risk Loving

Not all individuals are risk averse. Instead, some individuals are “risk lovers” because they
enjoy facing situations where risk is involved, as example 6.4 illustrates.
Example 6.4: Finding the EU of a lottery under risk-loving preferences Con-
sider an individual with utility function \(u(I) = I^2\). Let us now find the EU of the two
lotteries considered in example 6.3, but now evaluated at this utility function:
\[ EU_{Risky} = 0.1 \times \$90^2 + 0.6 \times \$20^2 + 0.3 \times \$60^2 = 2,130 \]
and
\[ EU_{Safe} = 0.5 \times \$30^2 + 0.5 \times \$48^2 = 1,602. \]
This indicates that the individual obtains a higher EU from the first (risky) lottery
than from the second (safe) lottery.
[Figure 6.3. EV and EU from a lottery: risk lover. Horizontal axis: income I, with $30,
EV = $39, and $48 marked. Vertical axis: u(I), with point A at 900, point D at u(EV) = 1,521,
point C at EU = 1,602, and point B at 2,304.]
3. This result is a direct application of Jensen’s inequality. This inequality states that if f(x) is a strictly concave
function where x denotes a real number, such as money in utility function u(I) in this discussion, then
\(f(E[x]) > E[f(x)]\).
4. As an exercise, note that by differentiating the utility function \(u(I) = 5 + 4I^{1/3}\) with respect to income I, we
obtain \(u'(I) = 4 \cdot \frac{1}{3} I^{\frac{1}{3} - 1} = \frac{4}{3} I^{-2/3}\), which is positive for all income levels I > 0, indicating that the utility increases
in income. Differentiating \(u'(I)\), we find \(u''(I) = \frac{4}{3}\left(-\frac{2}{3}\right) I^{-\frac{2}{3} - 1} = -\frac{8}{9} I^{-5/3}\), which is negative for all income
levels, thus reflecting that the utility increases, but at a decreasing rate (i.e., it is concave in income).
Self-assessment 6.4 Consider the scenario in example 6.4, but assume that the
individual’s utility function is \(u(I) = 5I^3\). What is her EU from the risky lottery? What
about from the safe lottery? Which lottery yields the highest EU? Interpret.
Figure 6.3 plots utility function \(u(I) = I^2\). An immediate difference with \(u(I) = \sqrt{I}\)
depicted in figure 6.2 is that \(u(I) = I^2\) is convex (i.e., it increases in income at an increas-
ing rate).5 Intuitively, this says that the individual enjoys additional income more when she
owns $1 million than when she owns only $1. We can then follow a similar approach as
in the previous section to depict the EU of the safe lottery: first, we place the payoffs that
can arise from the lottery on the horizontal axis ($30 and $48); second, we extend a ver-
tical line upward until we hit the utility function (at a height of 900 for point A and 2,304
for point B); third, we connect points A and B with a straight line; and, finally, we find
the midpoint of this straight line at point C, which represents the EU of the lottery, where
EU = 1,602.
5. For an increasing utility function u(I), we say that it is convex if it increases at an increasing rate. Mathemati-
cally, this means that its second-order derivative with respect to income I is positive or zero, \(u''(I) \geq 0\), but never
negative.
In contrast, the utility of the EV is found by simply extending a vertical line upward from
the EV = $39 until we hit the utility function at point D; that is, \(u(\$39) = 39^2 = 1,521\). As
expected, in this case, point C lies above point D, indicating that
\[ u(EV) < EU. \]

Intuitively, this says that the individual is a “risk lover” because she prefers to play the
lottery and face risk (obtaining EU) to receiving the EV of the lottery with certainty, where
she obtains u(EV ).
Convex utility. Risk-loving attitudes emerge when an individual’s utility function is con-
vex, as \(u(I) = I^2\), depicted in figure 6.3. (Recall that we assume positive or zero income
levels throughout the chapter, \(I \geq 0\).) Generally, utility functions with the form \(u(I) =
a + bI^{\gamma}\) are convex if constants a and b are positive, while exponent γ now satisfies γ > 1.
In example 6.4, parameters a and b took values a = 0 and b = 1, while exponent γ was
γ = 2; but other utility functions like \(u(I) = 5 + 7I^3\) or \(u(I) = 8 + 2I^5\) are also convex.6
6.6.3 Risk Neutrality

Finally, some individuals may not be risk averse or risk loving, but instead are “risk neutral.”
Example 6.5 calculates the EU again, but now under risk-neutral preferences.
Example 6.5: Finding the EU of a lottery under risk-neutral preferences Con-
sider an individual with utility function u(I) = I. The EU from the risky and safe
lotteries for this individual are
\[ EU_{Risky} = (0.1 \times \$90) + (0.6 \times \$20) + (0.3 \times \$60) = 39 \]
and
\[ EU_{Safe} = (0.5 \times \$30) + (0.5 \times \$48) = 39, \]
meaning that the individual experiences the same EU from the risky and safe
lotteries.
[Figure 6.4. EV and EU from a lottery: risk neutral. Horizontal axis: income I, with $30,
EV = $39, and $48 marked. Vertical axis: u(I), with point A at 30, points C and D coinciding
at EU = u(EV) = 39, and point B at 48.]
6. As an exercise, note that differentiating utility function \(u(I) = 5 + 7I^3\) with respect to income I, we obtain
\(u'(I) = 7 \times 3I^{3-1} = 21I^2\), which is positive, implying that utility increases in income. Differentiating \(u'(I)\), we
find \(u''(I) = 2 \times 21I^{2-1} = 42I\), which is also positive, thus indicating that utility increases at an increasing rate
(i.e., it is convex in income).
Self-assessment 6.5 Consider the scenario in example 6.5, but assuming that the
individual’s utility function is u(I) = 2 + 5I. What is her EU from the risky lottery?
What about from the safe lottery? Which lottery yields the highest EU? Interpret.
Figure 6.4 plots utility function u(I) = I, which is linear in income (i.e., a straight line).
That is, it increases in income at a constant rate (in this case, 1) rather than
at a decreasing rate (as with concave utility functions) or at an increasing rate (as with convex
utility functions).
This figure follows the same approach to depict the EU of the safe lottery as in previous
sections. We can immediately see that the height of point C, which represents the EU of the
lottery, coincides with that of point D, which identifies the utility of the EV of the lottery.
As a consequence,
u(EV ) = EU.
This result indicates that the individual is “risk neutral” because she obtains the same
utility from receiving the EV of the lottery with certainty, which yields u(EV ), and from
playing the lottery, where she obtains the EU.
Linear utility. Risk neutrality arises when an individual’s utility function is linear, thus
exhibiting the form u(I) = a + bI, where a and b are positive constants. In example 6.5,
a = 0 and b = 1; but other utility functions like u(I) = 3 + 8I are also linear.7
7. Generally, the linear utility function u(I) = a + bI is increasing in income, because \(u'(I) = b\) is positive, and it
does so at a constant rate, given that \(u''(I) = 0\). We can confirm this property in this example, u(I) = 3 + 8I,
because \(u'(I) = 8\) is positive and \(u''(I) = 0\), as required.
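The comparison between u(EV) and EU used throughout section 6.6 can be checked numerically. The sketch below evaluates the safe lottery under the three utility functions from examples 6.3 to 6.5; the helper name `risk_attitude` is hypothetical, and the label it returns describes only the lottery it is given, not the individual's attitude in general.

```python
from math import sqrt, isclose

def risk_attitude(lottery, u):
    """Compare u(EV) with EU for one lottery of (probability, payoff) pairs."""
    ev = sum(p * x for p, x in lottery)
    eu = sum(p * u(x) for p, x in lottery)
    u_ev = u(ev)
    if isclose(u_ev, eu, rel_tol=1e-9):
        return "risk neutral"   # u(EV) = EU
    if u_ev > eu:
        return "risk averse"    # u(EV) > EU
    return "risk lover"         # u(EV) < EU

# Safe lottery from example 6.3: $30 or $48, each with probability 1/2.
safe = [(0.5, 30), (0.5, 48)]

attitudes = {
    "sqrt(I)": risk_attitude(safe, sqrt),              # concave utility
    "I^2": risk_attitude(safe, lambda i: i ** 2),      # convex utility
    "I": risk_attitude(safe, lambda i: i),             # linear utility
}

for name, label in attitudes.items():
    print(f"u(I) = {name}: {label}")
```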
6.7 Measuring Risk
The discussion has established that an individual with a concave utility function is risk
averse. A natural question is: how averse? Or, more generally, how can we measure risk? In
this section, we seek to measure the amount of money that a risk-averse individual is willing
to pay to avoid the risk of playing the lottery.
6.7.1 Risk Premium
Risk premium (RP) The amount of money that we need to subtract from the EV in
order to make the decision maker indifferent between playing the lottery and accepting
the EV from the lottery. That is, the RP solves
u(EV − RP) = EU.
To understand the RP, think about the following scenario. Assume that you are the risk-
averse individual of example 6.3, and we approach you with the relatively safe lottery. As
we know, the EU from playing that lottery is EU = 6.2. If, instead, we offer you the EV of
the lottery with certainty, $39, your utility is larger because \(u(EV) = \sqrt{39} = 6.24\), and you
would prefer the EV paid with certainty. Knowing that, we cut the EV that we offer you by $1.
Would you still prefer EV − $1 with certainty to playing the lottery? You may say yes if
u(EV − $1) > EU. What if we cut the EV by $2? You may still accept it if u(EV − $2) > EU.
The RP then measures how much we need to cut the EV offered to you with certainty to make
you indifferent between accepting the reduced amount and playing the lottery, that is,
u(EV − RP) = EU. Example 6.6 finds the exact RP for our safe lottery.
Example 6.6: Finding the RP of a lottery Considering the safe lottery of example
6.3, and recalling that EV = $39 and EU = 6.2, the RP solves
\[ u(39 - RP) = 6.2 \]
or \(\sqrt{39 - RP} = 6.2\). Squaring both sides yields \(39 - RP = 6.2^2\), and solving for RP, we
obtain RP = $0.56. Intuitively, we would need to cut the EV of the lottery by $0.56 for
the individual to be indifferent between playing the lottery and receiving that (dimin-
ished) EV with certainty. If we cut the EV = $39 by more than $0.56, the individual
would prefer playing the lottery rather than the (highly discounted) EV.
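[Figure 6.5. Finding the RP and CE of a lottery. Horizontal axis: income I, with $30, CE,
EV = $39, and $48 marked, and the RP labeled. Vertical axis: u(I), with point A at 5.47,
point C at EU = 6.20, point D at u(EV) = 6.24, and point B at 6.93.]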
Self-assessment 6.6 Consider the scenario in example 6.6, but assume now that
EV = $42 and EU = 6. Find the RP and interpret your result.
Figure 6.5 illustrates the RP. Starting from the EV, the RP decreases the certain amount
that we offer the individual. Graphically, we shift the EV leftward until its utility (height of
point D) decreases enough to coincide with the EU of the lottery (height of point C). The
individual is now indifferent between playing the lottery and receiving that (diminished) EV
with certainty.
The diminished EV, after subtracting RP, EV − RP, thus makes the individual indifferent
between receiving that amount with certainty and playing the lottery. This diminished EV
is also known as the “certainty equivalent,” as we define next.
6.7.2 Certainty Equivalent
Certainty equivalent (CE) The amount of money that, if given to the individual
with certainty, makes her indifferent between receiving such a certain amount and
playing the lottery. That is,
CE = EV − RP.
In example 6.6, CE = EV − RP = 39 − 0.56 = $38.44. Therefore, if we offer $38.44 to the

risk-averse individual, she would be indifferent between receiving this amount and playing
the lottery.
Example 6.7: Measuring RP and CE with other risk attitudes Consider the risk-
loving individual from example 6.4. Because EV = $39 and EU = 1,602, the RP
solves u(39 − RP) = 1,602, which in this case entails \((39 - RP)^2 = 1,602\). Applying
the square root to both sides of the equality yields
\[ 39 - RP = \sqrt{1,602}, \]
or 39 − RP = 40.02. Solving for RP, we obtain RP = −1.02, which is negative! This
indicates that, for the individual to be indifferent between playing the lottery and
receiving a monetary amount with certainty, we would need to offer her more than
the EV (rather than less, as in example 6.6). She loves risk, so she would actually
need to be compensated to stop playing the lottery. Therefore, the CE becomes
CE = EV − RP = 40.02. As suggested previously, RP < 0, thus augmenting the EV
to induce the individual to stop playing the lottery. As a consequence, CE > EV when
the individual is a risk lover.
Following a similar approach with the risk-neutral individual from example 6.5, we
obtain that the RP solves u(39 − RP) = 39, which in this case entails
39 − RP = 39,
ultimately yielding RP = $0. Intuitively, the individual is indifferent between receiving
EV with certainty and playing the lottery, so we don’t have to decrease EV (as with
risk-averse individuals), nor increase the EV (as with risk lovers). As a result, the CE
becomes CE = EV − RP = EV .
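Examples 6.6 and 6.7 follow the same recipe: invert the utility function at EU to obtain the CE, and subtract the CE from the EV to obtain the RP. A minimal Python sketch of that recipe, taking the (rounded) EV and EU figures quoted in the examples as inputs; the function name and the explicit inverse-utility arguments are illustrative assumptions.

```python
from math import sqrt

def risk_premium(ev, eu, u_inverse):
    """Return (RP, CE), where CE solves u(CE) = EU and RP = EV - CE."""
    ce = u_inverse(eu)
    return ev - ce, ce

# Risk averse, u(I) = sqrt(I): the inverse utility is v -> v**2.
rp_averse, ce_averse = risk_premium(39, 6.2, lambda v: v ** 2)

# Risk lover, u(I) = I**2: the inverse utility is v -> sqrt(v).
rp_lover, ce_lover = risk_premium(39, 1602, sqrt)

# Risk neutral, u(I) = I: the inverse utility is the identity.
rp_neutral, ce_neutral = risk_premium(39, 39, lambda v: v)

print(f"risk averse:  RP = {rp_averse:.2f}, CE = {ce_averse:.2f}")
print(f"risk lover:   RP = {rp_lover:.2f}, CE = {ce_lover:.2f}")
print(f"risk neutral: RP = {rp_neutral:.2f}, CE = {ce_neutral:.2f}")
```

The risk-averse case uses the rounded EU = 6.2 from example 6.6; feeding in the unrounded EU of the safe lottery (about 6.2027) gives a slightly smaller RP of roughly $0.53.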
Self-assessment 6.7 Consider an individual with the utility function \(u(I) = I^2\). We
find that EV = $42 and EU = 1,822. What is her RP from the lottery? What is her CE?
Compare the CE and RP and interpret your result.
6.7.3 Arrow-Pratt Coefficient of Absolute Risk Aversion

From figures 6.2–6.4, you probably noticed that risk aversion requires the utility function to
be concave, meaning that it is increasing in the individual’s income but at a decreasing rate.
The Arrow-Pratt coefficient of absolute risk aversion (AP for short) uses the concavity of
the utility function to measure risk aversion, as described next.
Arrow-Pratt coefficient of absolute risk aversion (AP) This coefficient is given by
\[ AP \equiv -\frac{u''}{u'}, \]
where the denominator, \(u'\), represents the first-order derivative of the individual’s
utility function u(I) with respect to income I, while the numerator, \(u''\), denotes the
second-order derivative.
Recall that the denominator \(u'\) is always positive because the individual’s utility
increases when her income increases. The numerator, however, can be (1) negative when the
individual’s utility function is concave, \(u'' < 0\) (entailing a positive AP coefficient given
the negative sign in the AP definition); (2) positive when her utility function is convex,
\(u'' > 0\) (which yields a negative AP); or (3) zero when her utility function is linear, \(u'' = 0\)
(providing a zero AP coefficient).
Example 6.8 illustrates how to find the AP for the risk-averse individual we considered
in previous examples.
Example 6.8: Finding the AP coefficient Consider the risk-averse individual from
example 6.3 with utility function \(u(I) = \sqrt{I}\). The first-order derivative is \(u' = \frac{1}{2} I^{-1/2}\),
and the second-order derivative is then
\[ u'' = \frac{1}{2}\left(-\frac{1}{2}\right) I^{-\frac{1}{2} - 1} = -\frac{1}{4} I^{-3/2}, \]
yielding an AP coefficient of
\[ AP = -\frac{u''}{u'} = -\frac{-\frac{1}{4} I^{-3/2}}{\frac{1}{2} I^{-1/2}} = \frac{\frac{1}{2} I^{-3/2}}{I^{-1/2}}
= \frac{1}{2} I^{-\frac{3}{2} - \left(-\frac{1}{2}\right)} = \frac{1}{2} I^{-1} = \frac{1}{2I}, \]
which is positive, thus indicating a positive risk aversion. In contrast, the risk-loving
individual from example 6.4, with utility function \(u(I) = I^2\), has a first-order derivative
of \(u' = 2I\) and a second-order derivative of \(u'' = 2\). Therefore, her AP coefficient is
\[ AP = -\frac{u''}{u'} = -\frac{2}{2I} = -\frac{1}{I}, \]
which reflects a negative risk aversion (because she is a risk lover). As an exercise,
you can check that the risk-neutral individual of example 6.5, with utility function
u(I) = a + bI, has an AP coefficient of zero.8
8. Indeed, the first-order derivative is \(u' = b\), whereas the second-order derivative is \(u'' = 0\), which yields
\(AP = -\frac{0}{b} = 0\).
Table 6.1
Summary of risk aversion measures.
                                   Risk Averse        Risk Lover        Risk Neutral
Utility function                   Concave            Convex            Linear
u(EV) vs. EU                       u(EV) > EU         u(EV) < EU        u(EV) = EU
Risk Premium, RP                   +                  −                 0
Certainty Equivalent, CE           CE < EV            CE > EV           CE = EV
Arrow-Pratt coefficient, AP        AP > 0             AP < 0            AP = 0
Exponent γ in u(I) = a + bI^γ      Between 0 and 1    Larger than 1     1
Self-assessment 6.8 Consider an individual with utility function \(u(I) = 2I^{1/3}\).
Using the same steps as in example 6.8, find her AP coefficient. Interpret.
Table 6.1 summarizes some of the results for a risk-averse, risk-loving, and risk-neutral
individual. When the individual is risk averse, in the first column: (1) her utility function
u(I) is concave in income; (2) her utility from the EV of the lottery is larger than the EU
from playing the lottery, u(EV ) > EU; (3) she would pay a positive amount to receive the
EV of the lottery with certainty rather than playing the lottery; (4) the CE she needs to
receive to avoid playing the lottery is lower than the EV of the lottery; and (5) the Arrow-
Pratt coefficient of risk aversion is positive, AP > 0. Finally, if her utility function has the
form \(u(I) = a + bI^{\gamma}\), exponent γ must be a number between 0 and 1.
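The AP coefficients derived in example 6.8 can also be approximated numerically, which is handy for utility functions whose derivatives are tedious to obtain by hand. The sketch below uses central finite differences; the helper name, the step size `h`, and the evaluation point I = 39 are assumptions chosen for illustration.

```python
from math import sqrt

def arrow_pratt(u, income, h=1e-2):
    """Approximate AP = -u''(I) / u'(I) with central finite differences."""
    first = (u(income + h) - u(income - h)) / (2 * h)
    second = (u(income + h) - 2 * u(income) + u(income - h)) / h ** 2
    return -second / first

income = 39.0  # evaluation point (assumed for illustration)

ap_averse = arrow_pratt(sqrt, income)                    # analytic value: 1 / (2I)
ap_lover = arrow_pratt(lambda i: i ** 2, income)         # analytic value: -1 / I
ap_neutral = arrow_pratt(lambda i: 3 + 8 * i, income)    # analytic value: 0

print(f"sqrt(I): AP ~ {ap_averse:.5f}  (1/(2I) = {1 / (2 * income):.5f})")
print(f"I^2:     AP ~ {ap_lover:.5f}  (-1/I   = {-1 / income:.5f})")
print(f"3 + 8I:  AP ~ {ap_neutral:.5f}")
```

The signs match the last-but-one row of table 6.1: positive for the concave function, negative for the convex one, and approximately zero for the linear one.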
6.8 A Look at Behavioral Economics—Nonexpected Utility
The EU measure is tractable and intuitive, inducing many researchers to test it experimen-
tally in the last decades. When we say that a theory has been “experimentally tested,” we
mean that researchers set up an experiment where several individuals (often college stu-
dents) are asked to sit at computer terminals and then presented with relatively simple
lotteries to choose among. To help every participant think hard about her best choice,
experiments provide monetary incentives, such as informing participants that they can take
home $1 of every $5 they earn in the experiment.
What were the main findings of these experiments? Participants sometimes behaved dif-
ferently from what EU would have predicted, leading researchers in the field of behavioral
economics to propose alternative theories of decision-making under uncertainty that seek
to account for these experimental anomalies. We present some of them next.9
9. For more references, see the book by Daniel Kahneman and Amos Tversky (2000) or the more accessible one
published by Kahneman (2013).
Example 6.9: The certainty effect Kahneman and Tversky (1979) asked experi-
mental participants to consider what decisions they would make in the following two
choices:
1. Choice 1:
(a) Lottery A: Receive $3,000 with certainty.
(b) Lottery B: Receive $4,000 with probability 0.8 and receive $0 with probability
0.2.
2. Choice 2:
(c) Lottery C: Receive $3,000 with probability 0.25 and receive $0 with probabi-
lity 0.75.
(d) Lottery D: Receive $4,000 with probability 0.20 and receive $0 with probabi-
lity 0.80.
Kahneman and Tversky (1979) found that most participants prefer lottery A over B
in Choice 1 and lottery D over C in Choice 2. However, these preferences are incon-
sistent with EU theory, as we show next. In particular, an individual that evaluates
lotteries according to her EUs prefers lottery A over B in Choice 1 if and only if
\[ u(\$3,000) > 0.8\,u(\$4,000) + 0.2\,u(\$0) \]
because in lottery A, she is receiving $3,000 with certainty (see the left side of the
inequality), while in lottery B, she receives $4,000 with probability 0.8 and $0 other-
wise (see right side). Individuals also expressed a preference for lottery D over C in
Choice 2, which means
\[ 0.2\,u(\$4,000) + 0.8\,u(\$0) > 0.25\,u(\$3,000) + 0.75\,u(\$0), \]
where the left side of the inequality represents the EU of lottery D, while the right
side measures the EU of lottery C. Subtracting 0.75u($0) from both sides and dividing
by 0.25 yields
\[ 0.8\,u(\$4,000) + 0.2\,u(\$0) > u(\$3,000), \]
but this inequality is exactly the opposite of the inequality for Choice 1. This result is
problematic because we did not assume any risk attitude for the individual (she could
be risk averse, risk loving, or neutral). In other words, we cannot rationalize these
choices using EU, regardless of the utility function of this individual, u(·). The alter-
native theories of decision-making under uncertainty that we present next, however,
can help explain these preferences for lotteries.
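The incompatibility shown in example 6.9 does not depend on the shape of u(·), which a brute-force check can illustrate. The sketch below searches a grid of candidate values for u($0), u($3,000), and u($4,000) and looks for any combination satisfying both stated preferences. The grid and the function names are arbitrary illustrative choices, and exact fractions are used so that rounding cannot create a false positive.

```python
from fractions import Fraction
from itertools import product

def prefers_a_over_b(u0, u3, u4):
    """Choice 1: u($3,000) > 0.8 u($4,000) + 0.2 u($0)."""
    return u3 > Fraction(8, 10) * u4 + Fraction(2, 10) * u0

def prefers_d_over_c(u0, u3, u4):
    """Choice 2: 0.2 u($4,000) + 0.8 u($0) > 0.25 u($3,000) + 0.75 u($0)."""
    eu_d = Fraction(2, 10) * u4 + Fraction(8, 10) * u0
    eu_c = Fraction(25, 100) * u3 + Fraction(75, 100) * u0
    return eu_d > eu_c

# Candidate utility levels 0, 0.25, ..., 10 for each of the three payoffs.
grid = [Fraction(k, 4) for k in range(41)]

# Collect every (u0, u3, u4) triple that rationalizes both observed choices.
rationalizing = [
    (u0, u3, u4)
    for u0, u3, u4 in product(grid, repeat=3)
    if prefers_a_over_b(u0, u3, u4) and prefers_d_over_c(u0, u3, u4)
]

print(f"triples checked: {len(grid) ** 3}")
print(f"triples consistent with both choices: {len(rationalizing)}")
```

A grid search is only an illustration, of course; the algebra in the example is what establishes the result for every utility function.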
6.8.1 Weighted Utility

An individual with weighted utility (WU) assigns to each payoff x in the lottery a weight
g(x), which may differ from the weight that she assigns to payoff y, that is to say, \(g(x) \neq
g(y)\). To illustrate this assumption, consider a lottery between two payoffs x and y, with
probabilities p and 1 − p, respectively. According to EU theory (section 6.5), the EU of this
lottery is
\[ EU = p\,u(x) + (1 - p)\,u(y), \]
where u(x) denotes the utility that the individual enjoys from payoff x, while u(y) represents
her utility from payoff y. Examples of these utilities include \(u(x) = x^{1/2}\) and \(u(y) = y^{1/2}\),
as explored in previous examples in this chapter. Intuitively, p operates as the probability
weight on payoff x, whereas (1 − p) is the probability weight on payoff y. WU only changes
these probability weights, p and 1 − p, as follows:
\[ WU = \underbrace{\frac{g(x)p}{g(x)p + g(y)(1 - p)}}_{\text{Prob. weight on payoff } x} u(x)
+ \underbrace{\frac{g(y)(1 - p)}{g(x)p + g(y)(1 - p)}}_{\text{Prob. weight on payoff } y} u(y). \]
In the special case in which the individual assigns the same weight to both payoffs, g(x) =
g(y), this WU simplifies to10
\[ WU = p\,u(x) + (1 - p)\,u(y) = EU. \]
Therefore, WU theory in this case is equivalent to the EU theory presented in this chapter.
In contrast, when payoff weights do not coincide, \(g(x) \neq g(y)\), the two approaches yield
different results. Intuitively, when the individual assigns a greater weight to
the upward outcome of the lottery, g(y) > g(x) with payoffs satisfying y > x, the WU assigns a
larger importance to the upward outcome, ultimately yielding a WU that exceeds EU. In
this context, the individual is more willing to participate in the lottery when she evaluates it
according to the lottery’s WU than its EU.
10. Indeed, when g(x) = g(y), we first obtain
\[ WU = \frac{g(x)p}{g(x)p + g(x)(1 - p)} u(x) + \frac{g(x)(1 - p)}{g(x)p + g(x)(1 - p)} u(y), \]
which simplifies to \(\frac{g(x)p\,u(x)}{g(x)} + \frac{g(x)(1 - p)}{g(x)} u(y)\), and then ultimately reduces to \(p\,u(x) + (1 - p)\,u(y)\).
Example 6.10: Weighted utility Consider the safe lottery in example 6.3, which
yields payoffs x = $30 and y = $48, both occurring with probability 1/2. The
individual’s utility function was \(u(x) = \sqrt{x}\), implying that the safe lottery generated
EU = 6.20. If, instead, the individual evaluates this lottery according to WU and
g(x) = 2, while g(y) = 3, her WU becomes
\[ WU = \frac{2 \cdot \frac{1}{2}}{2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2}} \sqrt{\$30}
+ \frac{3 \cdot \frac{1}{2}}{2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2}} \sqrt{\$48}
= \frac{2}{5} \times 5.47 + \frac{3}{5} \times 6.93 = 6.35, \]
which is larger than the EU. Intuitively, because the individual assigns a larger weight
to the upward outcome of the lottery (payoff y), she finds the lottery more attractive
when evaluating it according to WU than according to EU.
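The WU formula generalizes directly to code. The sketch below is one possible implementation, assuming a lottery is a list of (probability, payoff) pairs and that the weights g are supplied as a function of the payoff; the names are illustrative only.

```python
from math import sqrt, isclose

def weighted_utility(lottery, u, g):
    """WU: each probability p is replaced by g(x) * p / sum of all g(x) * p."""
    total = sum(g(x) * p for p, x in lottery)
    return sum(g(x) * p / total * u(x) for p, x in lottery)

# Safe lottery from example 6.3 and the payoff weights from example 6.10.
safe = [(0.5, 30), (0.5, 48)]
payoff_weights = {30: 2, 48: 3}

wu = weighted_utility(safe, sqrt, lambda x: payoff_weights[x])

# With identical weights on every payoff, WU collapses to EU.
eu = weighted_utility(safe, sqrt, lambda x: 1)

print(f"WU = {wu:.4f}")
print(f"EU = {eu:.4f}")
```

The printed WU agrees with the figure reported in example 6.10 up to rounding of the intermediate square roots, and it exceeds the EU because the larger weight falls on the higher payoff.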
WU can help explain the preferences for lotteries described in example 6.9.
Example 6.11: Using WU to explain the certainty effect Consider again the indi-
vidual in example 6.10 and check whether her preferences can explain the certainty
effect presented in example 6.9. Lottery A is preferred to B in Choice 1 if and only if
\[ \sqrt{\$3,000} > \frac{2 \times 0.2}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{\$0}
+ \frac{3 \times 0.8}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{\$4,000}, \]
which simplifies to 54.77 > 54.21. In addition, lottery D is preferred to C in Choice
2 if and only if
\[ \frac{2 \times 0.2}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{\$0}
+ \frac{3 \times 0.8}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{\$4,000}
> \frac{3 \times 0.75}{(3 \times 0.75) + (2 \times 0.25)} \sqrt{\$3,000}
+ \frac{2 \times 0.75}{(3 \times 0.75) + (2 \times 0.25)} \sqrt{\$0}, \]
which collapses to 54.21 > 44.81. Therefore, the experimental observations in Kahne-
man and Tversky (1979) can be explained by WU theory.
6.8.2 Prospect Theory

Tversky and Kahneman (1986) proposed that the value that an individual obtains from a
lottery can be different from the EU. In particular, considering the same lottery as in the
previous section (two payoffs x and y, with probabilities p and 1 − p, respectively), the value
of the lottery is
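\[ V = w(p)\,v(x, x_0) + w(1 - p)\,v(y, x_0). \]
[Figure 6.6. Utility function in prospect theory. Horizontal axis: payoff x, with the
reference point \(x_0\) marked. Vertical axis: \(v(x, x_0)\). The function is concave in gains
(payoffs above \(x_0\)), convex in losses (payoffs below \(x_0\)), and has a kink at the reference
point \(x = x_0\).]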
This value of the lottery differs from the EU in three dimensions:

• Probability weights. Like WU in the previous section, probability is also weighted with the
probability weighting function w(p), rather than considering p directly. When w(p) > p,
we say that the individual overestimates the likelihood of outcome x, and when w(p) < p,
she underestimates it. If w(p) = p, the individual is not overestimating or underestimating
the probability of outcome x, thus assigning the same probability weights as when she
uses EU to evaluate the lottery.11
• The use of reference points. Every payoff x is evaluated against a reference point \(x_0\), such
as the status quo, so the individual’s utility from payoff x is \(v(x, x_0)\), and that of payoff y is
\(v(y, x_0)\). Utility \(v(x, x_0)\) is increasing in x and, importantly, it is concave for all payoffs that
lie above the reference point, \(x > x_0\), indicating that the individual is risk averse toward
gains (relative to the reference point). Figure 6.6 illustrates this utility function with a solid
line. However, \(v(x, x_0)\) is convex for all payoffs that lie below the reference point (\(x < x_0\)),
suggesting that the individual is risk loving toward losses, as depicted by the dashed line
of the utility function.
• Loss aversion. Utility \(v(x, x_0)\) has a kink at reference point \(x_0\), rather than a smooth
transition. Intuitively, this means that the individual suffers a larger disutility when she
loses $1 relative to the reference point \(x_0\) than the utility she gains from an extra dollar
above \(x_0\), which is often referred to as the individual exhibiting loss aversion.
11. A similar argument applies to outcome y, with weighted probability w(1 − p), which can overestimate the
likelihood of outcome y if w(1 − p) > 1 − p; or underestimate it if w(1 − p) < 1 − p. In addition, the probability
weighting function assumes that overestimation or underestimation does not happen with outcomes occurring with
certainty; that is, when p = 1, we find that w(1) = 1 and, similarly, when p = 0, we find that w(0) = 0.
Example 6.12: Prospect theory As an example of probability weighting function
w(p), consider
\[ w(p) = \frac{p^{1/2}}{p^{1/2} + (1 - p)^{1/2}}. \]
Figure 6.7 depicts w(p) on the vertical axis and p on the horizontal axis, and
includes the 45-degree line where w(p) = p. For relatively low probabilities (on the
left side of the figure), w(p) lies above the 45-degree line, so w(p) > p, indicating
that the individual overestimates outcomes that occur with low probability. In con-
trast, for relatively high probabilities (on the right side of the figure), w(p) lies below
the 45-degree line, implying that w(p) < p. In this case, the individual underestimates
outcomes that happen with high probability.
Regarding utility function \(v(x, x_0)\), we could consider a reference point \(x_0 = \$0\), for
simplicity, so that
\[ v(x, 0) = \begin{cases} x^{1/2} & \text{for all } x \geq 0, \text{ and} \\ -3(-x)^{1/2} & \text{for all } x < 0. \end{cases} \]
Intuitively, the individual has a concave utility function \(x^{1/2}\) for all positive payoffs,
and a convex utility function \(-3(-x)^{1/2}\) for all negative payoffs, producing a graph
similar to that shown in figure 6.6, but with the kink happening at the origin, x = 0.
Exponent 1/2 in \(x^{1/2}\) captures her concavity in gains, exponent 1/2 in \(-3(-x)^{1/2}\)
measures her convexity in losses, and −3 in \(-3(-x)^{1/2}\) represents her loss aversion.
To understand this point, note that if the utility for losses was \(-(-x)^{1/2}\), gains and losses
would produce the same effect on the individual’s utility, leading to no kink in the
utility function in figure 6.6.
[Figure 6.7. Probability weighting function. Horizontal axis: p, from 0 to 1. Vertical axis:
w(p), from 0 to 1. The curve w(p) is plotted against the 45-degree line where w(p) = p and
crosses it at p = 1/2, with overestimation (w(p) > p) to the left of that point and
underestimation (w(p) < p) to the right.]
We finish this chapter by showing that prospect theory can help explain the certainty effect
discussed in example 6.9.
Example 6.13: Using prospect theory to explain the certainty effect Let us
consider again the preference for lotteries described in example 6.9. In particular,
we assume the probability weighting function w(p) in example 6.12 and the utility
function \(v(x, x_0)\) from that example, where reference point \(x_0 = \$0\). First, Kahne-
man and Tversky (1979) found that individuals typically prefer lottery A over B,
entailing
\[ \$3,000^{1/2} > \frac{0.8^{1/2}}{0.8^{1/2} + (1 - 0.8)^{1/2}} \$4,000^{1/2}
+ \left(1 - \frac{0.8^{1/2}}{0.8^{1/2} + (1 - 0.8)^{1/2}}\right) \$0^{1/2}, \]
which simplifies to 54.77 > 42.16. Similarly, individuals prefer lottery D over C;
that is,
\[ \frac{0.2^{1/2}}{0.2^{1/2} + (1 - 0.2)^{1/2}} \$4,000^{1/2}
+ \left(1 - \frac{0.2^{1/2}}{0.2^{1/2} + (1 - 0.2)^{1/2}}\right) \$0^{1/2}
> \frac{0.25^{1/2}}{0.25^{1/2} + (1 - 0.25)^{1/2}} \$3,000^{1/2}
+ \left(1 - \frac{0.25^{1/2}}{0.25^{1/2} + (1 - 0.25)^{1/2}}\right) \$0^{1/2}, \]
which collapses to 21.08 > 20.05. Therefore, individuals preferring lottery A over
B in Choice 1, but lottery D over C in Choice 2, is consistent with prospect theory.
For other examples of experiments testing behavior in contexts with uncertainty, see
Tversky and Kahneman (1992), and for a readable introduction to the general topic,
see Kahneman and Tversky (2000).
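The prospect-theory values in example 6.13 can be recomputed with a few lines of Python. This is a sketch under two simplifying assumptions: the reference point is $0, and only nonnegative payoffs are evaluated, so the loss branch of v is deliberately left out. The helper names are illustrative.

```python
from math import sqrt

def w(p):
    """Probability weighting function from example 6.12."""
    return sqrt(p) / (sqrt(p) + sqrt(1 - p))

def v(x):
    """Gains branch of v(x, 0); losses are outside the scope of this sketch."""
    if x < 0:
        raise ValueError("only nonnegative payoffs are handled here")
    return sqrt(x)

def lottery_value(p, x):
    """V of a lottery paying x with probability p and $0 otherwise."""
    return w(p) * v(x) + w(1 - p) * v(0)

value_a = lottery_value(1.0, 3000)   # $3,000 with certainty
value_b = lottery_value(0.8, 4000)
value_c = lottery_value(0.25, 3000)
value_d = lottery_value(0.2, 4000)

print(f"Choice 1: V(A) = {value_a:.2f}, V(B) = {value_b:.2f}")
print(f"Choice 2: V(C) = {value_c:.2f}, V(D) = {value_d:.2f}")
```

For this particular weighting function, w(1 − p) equals 1 − w(p), so weighting the $0 outcome by either expression gives the same result; that outcome contributes nothing in any case because v(0) = 0.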
Exercises
1. Expected utility.A Scientists are evaluating the impact of climate change on the production of
apples in the Yakima region in Washington State. After analyzing the data on temperature during
the last fifty years, they have identified three cases:
(a) low impact, which can occur with probability 5 percent.
(b) medium impact, with probability 45 percent.

(c) high impact, with probability 50 percent.
A low-impact scenario implies profits for the agriculture industry of π = $85 million, the
medium-impact yields profits of π = $5 million, and the high-impact scenario implies negative
profits of π = $900 million. Consider a farmer who is risk averse and whose utility function is concave
and equal to
\[ u(\pi) = 10 + 3 \times \pi^{1/3}. \]
Calculate the EU for this farmer and discuss whether she should support measures to deal with
climate change.
2. EV and variance–I.A You are looking at two firms as an investment opportunity:
• For the first firm, you know that with probability 0.7, your investment will mature to a profit of
$60 million, and with probability 0.3, your investment will mature to a loss of $40 million.
• For the second firm, you know that with probability 0.9, your investment will mature to a profit
of $40 million, and with probability 0.1, your investment will mature to a loss of $10 million.
(a) Calculate the EV of each investment.
(b) Calculate the variance of each investment.
(c) If you had the opportunity to invest in only one of these firms, which would you pick, and why?
3. EV and variance–II.B You are looking at two firms as an investment opportunity.
• For the first firm, you know that with probability 0.7, your investment will mature to a profit
of $45 million, and with probability 0.3, your investment will mature to a loss of $30 million.
• For the second firm, you know that with probability 0.8, your investment will mature to a profit
of $30 million, and with probability 0.2, your investment will mature to a loss of $7.5 million.
(a) Calculate the EV of each investment.
(b) Calculate the variance of each investment.
(c) If you had the opportunity to invest in only one of these firms, which would you pick and why?
4. Expected utility–I.A Consider the situation in exercise 2, but suppose now that your utility function
is
\[
u(\pi) = \sqrt{50 + \pi},
\]
where π is the profit from your investments.

(a) Calculate the EU of each investment.
(b) Based on your utility level, if you had the opportunity to invest in only one of these firms,
which would you pick, and why?
5. Expected utility–II.A Consider the situation in exercise 2, but suppose now that your utility
function is
\[
u(\pi) = (50 + \pi)^2,
\]
where π is the profit from your investments.

(a) Calculate the EU of each investment.
(b) Based on your utility level, if you had the opportunity to invest in only one of these firms,
which would you pick, and why?
6. Risk attitudes.A After looking at the details of a lottery, you calculate that your EU from that
lottery is EU = 150.
(a) After performing some additional calculations, you find that the utility that you would obtain
if you instead received your EV is u(EV ) = 175. What is your attitude toward risk?
(b) What if, instead, you calculated the utility of your EV as u(EV ) = 140. What is your attitude
toward risk?
(c) What if, instead, you calculated the utility of your EV as u(EV ) = 150. What is your attitude
toward risk?
7. Risk aversion.B Suppose that you took part in a lottery that had a chance to increase, decrease, or
have no effect on your level of income. With probability 0.5, your income remains at its original
level, $500. With probability 0.2, your income increases to $700, and with probability 0.3, your
income decreases to $400. Your utility function is
\[
u(I) = I^{0.7},
\]
where I denotes your income level.

(a) Using only the utility function, show that your risk preferences are risk averse.
(b) Calculate both your EU and the utility equivalent of the EV of your income.
(c) Using the results from part (b), show that your risk preferences are risk averse.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your initial
$500. Should you accept the lottery? Why or why not?
8. Risk premium–I.A Consider the situation in exercise 7.
(a) Calculate your CE.
(b) Calculate and interpret your risk premium. Is it consistent with risk aversion?
9. Risk loving.B Suppose that you took part in a lottery that had a chance to increase, decrease, or
have no effect on your level of income. With probability 0.3, your income remains at its original
\[
u(I) = I^{2.5},
\]

(a) Using only the utility function, show that your risk preferences are risk loving.
(c) Using the results from part (b), show that your risk preferences are risk loving.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your initial
$200. Should you accept the lottery? Why or why not?
10. Risk Premium–II.A Consider the situation in exercise 9.

(b) Calculate and interpret your risk premium. Is it consistent with risk loving?
11. Risk neutrality.B Suppose that you took part in a lottery that had a chance to increase, decrease,
or have no effect on your level of income. With probability 0.4, your income remains at its original
u(I) = 125 + 3I,

(a) Using only the utility function, show that your risk preferences are risk neutral.
(c) Using the results from part (b), show that your risk preferences are risk neutral.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your
initial $400. Should you accept the lottery? Why or why not?
12. Risk Premium–III.A Consider the situation in exercise 11.
(b) Calculate and interpret your risk premium. Is it consistent with risk neutrality?
13. Contracting an illness.B Consider a situation where you are faced with a risky situation. You
currently have $100,000 available for consumption, and with a 90 percent probability, you would
suffer no illness. You have a 9 percent chance, however, of contracting a case of influenza, leading
to the loss of $10,000 in consumption. In addition, there is a 1 percent chance that this is a severe
illness, leading to the loss of $50,000 in consumption. Your utility from consumption is
\[
U(C) = C^{0.4},
\]
where C is your consumption level.

(a) What is your attitude toward risk? How do you know this?
(b) Suppose that you could purchase insurance against influenza. What is your CE?
(c) What is the maximum premium that you are willing to pay for insurance against influenza?
(d) What is your risk premium? How does this compare with your risk premium if you were risk
neutral?
14. Purchasing full insurance.B Suppose that Adam has an initial wealth of $100 and has the utility
function \(u(I) = \sqrt{I}\), where I > 0 denotes his income. Assume that he faces a 10 percent chance of
suffering a car accident where he would lose $25. He considers purchasing insurance to protect
against his potential loss. He can buy a units of insurance for $0.10 per unit, which pays $1 per
unit a that is purchased.
(a) What is Adam’s EU from buying a units of insurance?
(b) How many units of insurance, a, does Adam purchase?
15. Not purchasing full insurance.B Consider Adam’s situation in exercise 14, except now each unit
of insurance costs $0.11.
(a) What is Adam’s expected utility from buying a units of insurance?
(b) How many units of insurance, a, does Adam purchase in this scenario?
16. Arrow-Pratt coefficient–I.B Consider an individual with the utility function u(I) = log(I), where
the log function refers to the natural logarithm and I is an individual’s income level.
(a) Calculate the Arrow-Pratt coefficient of risk aversion.
(b) Based on your results from part (a), what is this individual’s attitude towards risk?
17. Arrow-Pratt coefficient–II.B Consider an individual with the utility function \(u(I) = \exp(I)\), where
exp denotes the exponential function and I is an individual’s income level.
(a) Calculate the Arrow-Pratt coefficient of risk aversion.
(b) Based on your results from part (a), what is this individual’s attitude towards risk?
18. Weighted utility.B Repeat exercise 13, but suppose that now you place additional weight on
the chance of contracting a severe illness. Let x denote the outcome where you do not contract
influenza, y denote the outcome where you contract a standard case of influenza, and z denote
the outcome where you contract a severe case of influenza. Suppose that g(x) = 3, g(y) = 3, and
g(z) = 4. How does your risk premium change as you weigh the worst outcome more heavily?
19. Prospect theory–I.B Suppose that your initial wealth is W = $50 and your utility function is
\(u(W) = \sqrt{W}\). While out on a walk one day, you notice a $5 bill on the ground and pick it up.
(a) By how much does your utility increase?
(b) Suppose now that while walking home with your $55, you are stopped by a police officer
for jaywalking and fined $5. By how much does your utility decrease? How does your utility
now compare with your original utility?
(c) Repeat part (a), but suppose now that your utility function is \(u(W, W_0) = \sqrt{W - W_0}\) when
you increase your wealth, where \(W_0\) represents your wealth before the event occurs (your
reference point), and \(u(W, W_0) = -(W - W_0)^2\) when you decrease your wealth.
(d) Repeat part (b) with the utility function in part (c).
(e) Compare the results of parts (b) and (d). Under which situation are you worse off? Why?
20. Prospect theory–II.C You are considering whether to invest in your brother’s business. He
informs you that if you invest $10,000, you have a 75 percent chance of doubling your money
after a year. When you ask him about the other 25 percent chance, he mumbles something about
losing all your money.
(a) What is your EV for this investment?
(b) Suppose that you are risk averse, with a utility function of \(U(I) = \sqrt{I}\), where I represents the
value of your investment. What is your EU of this investment?
(c) You remain skeptical of your brother’s abilities, so instead you decide to utilize a value func-
tion to calculate your most likely outcome. From prospect theory, you decide to weight the
probability p = 0.75 that your brother is successful by using the weighting function
\[
w(p) = \frac{p^{1/3}}{p^{1/3} + (1-p)^{2/3}}.
\]
In addition, you decide to use your initial investment as a reference point, and calculate
your utility as \(U(I, 10{,}000) = \sqrt{I + 10{,}000}\) for a successful investment, and
\(U(I, 10{,}000) = -(I - 10{,}000)^2\) for an unsuccessful investment. What is your value of this investment?
(d) Compare the results from parts (b) and (c). Are you more likely to invest in your brother’s
business while using prospect theory? Explain.
21. Gambling.A Suppose that you have a situation where your grandfather enjoys spending all his
free time (and money) playing the slot machines at his local casino. When you confront him, he
explains to you that he is risk loving, and he can’t give up the thrill of taking a gamble. With what
you know about risk premiums, how could you persuade your grandfather to curtail his gambling
habits?
22. Exam pressure.A Suppose that you arrive at your final exam in this class to find that the professor
has reduced it to a single question. The question states that each student, in turn, will approach
him and roll a fair 20-sided die. With a roll of 1, the student receives 0 points on the final, but on
a roll of any other number, the student receives 100 points on the final.
(a) What is your expected score on the final?
(b) Suppose now that your professor offered to sell you some grade insurance. You could offer
him some of your final exam points in exchange for avoiding the risk. What percentage of
your points would you offer him?
- Note 1: There is no wrong answer to this question.
- Note 2: Your professor would never accept an offer less than 5 of your points, but he might
not be fair, so don’t offer him too little!
(c) What does your offer in part (b) reveal about your attitude toward risk?
7 Production Functions
7.1 Introduction
After describing consumer decisions in previous chapters, this chapter and chapter 8 focus
on firm decisions, such as how many units of output to produce and how many inputs to
invest in. “Inputs” are factors of production that the firm can transform into units of output,
such as labor (secretaries, chief executive officers, gardeners, or software engineers); capital
(buildings, desks, computers, and software packages); and land. We start by measuring the
average product that the firm obtains per unit of input, the additional product that the firm
gains when adding 1 more unit of input (the marginal product), and the relationship between
the two.
As in consumer theory, where we used indifference curves to illustrate combinations of
two goods (x and y) that yield the same utility level for the consumer, we now seek to depict
combinations of labor and capital that produce the same amount of output. We refer to
these labor-capital combinations as the firm’s “isoquant.” Following our approach in con-
sumer theory, we measure the slope of the firm’s isoquant, because it helps us understand
the firm’s ability to substitute one unit of labor for one unit of capital while maintaining its
output level unchanged. We then examine various production functions, which exhibit sim-
ilar mathematical properties as the utility functions previously explored in consumer theory
(chapter 2), such as the Cobb-Douglas production function, the linear production function,
and the fixed-proportions production function.
We conclude the chapter with two applications. First, we measure returns to scale in the
production process. If all inputs increase by the same proportion, returns to scale evalu-
ate how much the firm’s output increases. Second, we test for technological progress. As
the term indicates, this progress allows the firm to increase its output while using the same
amount of inputs.
7.2 Production Function
In this section, we discuss how to represent the production of a firm as a function of its
inputs, such as labor, capital, and land, and generally, any other element that the firm can
transform into units of output.
Production function A function representing how a certain amount of inputs is
transformed into an amount of output q.
For example, q = f (K, L) is a production function as it describes how specific amounts

of labor L and capital K are transformed into an amount of output q. We next elaborate on
examples of common production functions.
Example 7.1: Examples of production functions The Cobb-Douglas function
\[
q = A K^{\alpha} L^{\beta}
\]
is relatively common, where parameter A is positive and parameters α and β satisfy

α, β ∈ (0, 1). Let these parameters take the values A = 3, and α = β = 1/2, and con-
sider that the firm uses K = 4 machines and L = 9 workers. In this case, the maximum
output that the firm can generate, given its Cobb-Douglas technology, is
\[
q = 3 \times 4^{1/2} \times 9^{1/2} = 18 \text{ units.}
\]

If, instead, we observe that the firm produces only 14 units of output using the
previous combination of inputs (K = 4 and L = 9), this indicates that the firm is not
efficiently managing its available inputs, because it is not reaching the maximum pos-
sible output (q = 18 units), given its current technology and input usage. We can then
measure a firm’s efficiency as the ratio of observed output that the firm produces to
the potential output identified by the production function. In this example, efficiency
would be 14/18 ≈ 0.78. Alternatively, the firm would have an inefficiency level of
1 − 0.78 = 0.22.
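The efficiency calculation can be reproduced in a few lines. The following is a minimal Python sketch using the parameter values of this example (A = 3, α = β = 1/2, K = 4, L = 9, and an observed output of 14 units); the function and variable names are illustrative assumptions, not notation from the text.

```python
def cobb_douglas(K, L, A=3.0, alpha=0.5, beta=0.5):
    """Maximum output under the Cobb-Douglas technology q = A * K^alpha * L^beta."""
    return A * K**alpha * L**beta


potential = cobb_douglas(K=4, L=9)   # maximum output with K = 4 and L = 9
observed = 14                        # output that the firm actually produces
efficiency = observed / potential    # ratio of observed to potential output
inefficiency = 1 - efficiency

print(potential, round(efficiency, 2), round(inefficiency, 2))
```

The ratio 14/18 is approximately 0.78, so the remaining shortfall relative to potential output is approximately 0.22.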
Section 7.7, later in this chapter, elaborates on other types of production functions,
but we briefly list them here. They include: (1) production function \(q = aK + bL\),
where a and b are both nonnegative and capital and labor enter linearly; (2) production
function \(q = A \min\{aK, bL\}\), where parameters A, a, and b are all positive and
capital and labor must be used in a certain proportion; and (3) production function
\(q = AK^{\alpha} + bL\), where parameters A, α, and b are all positive, and where one
input (in this case, labor) enters linearly and the other input enters nonlinearly.1 The
mathematical representation of these production functions is analogous to that of util-
ity functions in consumer theory (chapter 2). Indeed, we have only changed labels:
good x is now units of capital K, whereas good y is now units of labor L.
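To make the three functional forms listed above concrete, the following minimal Python sketch evaluates each of them at K = 4 and L = 9. The parameter values (a = 2, b = 1, and so on) are hypothetical and chosen only for illustration, and the function names, including "quasilinear" for the third form, are labels introduced here rather than terms taken from this example.

```python
def linear(K, L, a, b):
    """q = aK + bL: capital and labor both enter linearly."""
    return a * K + b * L


def fixed_proportions(K, L, A, a, b):
    """q = A * min{aK, bL}: inputs must be used in a certain proportion."""
    return A * min(a * K, b * L)


def quasilinear(K, L, A, alpha, b):
    """q = A*K^alpha + bL: labor enters linearly, capital nonlinearly."""
    return A * K**alpha + b * L


K, L = 4, 9  # same input combination as in example 7.1

q_linear = linear(K, L, a=2, b=1)                  # hypothetical a, b
q_fixed = fixed_proportions(K, L, A=1, a=2, b=1)   # hypothetical A, a, b
q_quasi = quasilinear(K, L, A=2, alpha=0.5, b=1)   # hypothetical A, alpha, b

print(q_linear, q_fixed, q_quasi)
```

With these hypothetical parameters, the fixed-proportions output is limited by capital (2 × 4 = 8 is smaller than 1 × 9 = 9), so the ninth worker adds nothing unless more capital is acquired.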
Self-assessment 7.1 Consider the firm discussed in example 7.1, but assume that
its production function is \(q = 5K^{1/3}L^{2/3}\). What is the largest amount of output q that
the firm can produce using L = 9 and K = 4 inputs? What if the production function
changes to \(q = 7K + 4L\)? What if it changes to \(q = 5\min\{2K, 3L\}\)? What if it changes
to \(q = 4K^{1/2} + 3L\)?
7.3 Marginal and Average Product
Here, we define two common measures to evaluate the productivity of a firm’s inputs:
average product and marginal product.
Average product The total units of output per unit of input. Hence, the average
product of labor is \(AP_L = \frac{q}{L}\), whereas that of capital is \(AP_K = \frac{q}{K}\).
As an example, if a firm produces 100 units of output, and hires L = 4 workers, its average
product per worker is \(AP_L = \frac{100}{4} = 25\) units. In other words, every worker produces on
average 25 units of output. This is frequently referred to in the media as “labor productiv-
ity,” in articles that report the growth of productivity over time. For instance, you might read
that “U.S. labor productivity grew by only 1 percent last year,” indicating that the average
production per worker increased by 1 percent. When labor productivity in a country grows,
the ratio \(AP_L = \frac{q}{L}\) must go up, and this can occur for a number of reasons: (1) total output
increases, while the number of employed workers remains constant, as a result of better
technology and education; (2) total output remains unaffected, but workers are fired after
firms automate the production process (replacing workers with machines, such as robots);
1. Recall from chapter 2 that, when we say “a function is linear in a good” (an input, in this chapter), we just mean
that the derivative of the function with respect to that input yields a constant (a number). That is, the derivative
is no longer a function of the units of labor L or capital K that the firm uses. If, instead, we say “a production
function is non-linear in an input” (or, alternatively, “the input enters nonlinearly”), the derivative of the function
with respect to that input yields an expression that still contains L, K, or both.
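Figure 7.1
Production function \(q = 100\sqrt{L}\), with labor L on the horizontal axis and output q on the vertical axis. Point A lies at L = 4, q = 200, and point B at L = 16, q = 400.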
and (3) total output increases and more workers are hired, but the former grows faster than
the latter, thus increasing the ratio \(\frac{q}{L}\).
Figure 7.1 depicts the production function \(q = 100\sqrt{L}\), with units of labor L on the
horizontal axis and total output on the vertical axis. (For simplicity, the figure includes a
production function with only one input.) At point A, with \(L_A = 4\) workers, the total product
becomes \(q_A = 100\sqrt{4} = 200\) units, entailing that the average product of this firm is given by
\(\frac{q_A}{L_A} = \frac{200}{4} = 50\) units. Graphically, the average product at A coincides with the slope of the ray
connecting the origin to point A. Similarly, at point B, where \(L_B = 16\) workers, the total product
becomes \(q_B = 100\sqrt{16} = 400\) units, implying that the average product is \(\frac{q_B}{L_B} = \frac{400}{16} = 25\)
units. Graphically, the ray connecting the origin to point B now becomes flatter than that
connecting the origin to A, which should come as no surprise because the average product
was cut in half.
Example 7.2: Finding average product Consider a production function
\(q = 5L^{1/2} + 3L - 6\). The average product of labor is found by dividing total output by
units of labor, L, as follows:
\[
AP_L = \frac{q}{L} = \frac{5L^{1/2} + 3L - 6}{L} = 5L^{1/2-1} + 3 - \frac{6}{L} = \frac{5}{L^{1/2}} + 3 - \frac{6}{L}.
\]
We can now analyze the \(AP_L\) expression. First, as L increases, \(AP_L\) increases if the
derivative
\[
\frac{\partial AP_L}{\partial L} = -\frac{5}{2L^{3/2}} + \frac{6}{L^2}
\]
is positive, which occurs when \(\frac{6}{L^2} > \frac{5}{2L^{3/2}}\). After rearranging, this condition simplifies
to \(L^{3/2-2} > \frac{5}{12}\), and to \(\frac{12}{5} > L^{1/2}\). Squaring both sides, we find \(L < \frac{144}{25} = 5.76\) workers.
In other words, \(AP_L\) increases in L for all \(L < 5.76\) workers, but it decreases in L
beyond that point. Therefore, \(AP_L\) reaches its maximum when \(\frac{\partial AP_L}{\partial L} = 0\), which occurs
at L = 5.76 workers.
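The location of the peak can be double-checked numerically. The sketch below is a minimal Python illustration for the production function of this example; the function names are assumptions made here, and the labor levels L = 5 and L = 7 are arbitrary test points on either side of the candidate maximum.

```python
def average_product(L):
    """AP_L = q / L for q = 5*L^(1/2) + 3*L - 6."""
    return (5 * L**0.5 + 3 * L - 6) / L


def d_average_product(L):
    """Derivative of AP_L with respect to L: -5/(2*L^(3/2)) + 6/L^2."""
    return -5 / (2 * L**1.5) + 6 / L**2


L_star = (12 / 5) ** 2  # candidate maximum, 144/25 = 5.76

for L in (5, L_star, 7):
    print(round(L, 2), round(average_product(L), 4), round(d_average_product(L), 4))
```

The derivative is positive at L = 5, essentially zero at L = 5.76, and negative at L = 7, which is the pattern expected around a maximum of the average product.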
Self-assessment 7.2 Consider the firm in example 7.2, but assume now that its
production function changes to \(q = 7L^{1/3} + 4L - 2\). Find the average product, \(AP_L\),
and the labor at which \(AP_L\) reaches its maximum.
We next analyze how the total output of the firm increases as it utilizes an additional unit
of input.
Marginal product The rate at which total output increases as the firm uses an additional
unit of either input. The marginal product of labor is \(MP_L = \frac{\Delta q}{\Delta L}\) when labor
is discrete, or \(\frac{\partial q}{\partial L}\) when it is continuous; whereas the marginal product of capital is
\(MP_K = \frac{\Delta q}{\Delta K}\) when capital is discrete, or \(\frac{\partial q}{\partial K}\) when it is continuous.
Hence, we can find the marginal product of an input by differentiating the production
function q = f (L, K) with respect to that input. Because the derivative of a function at a
point coincides with the slope of its tangent line at that point, we can graphically interpret
the marginal product of an input (e.g., labor) as the slope of the function when we
marginally increase the amount of that input. Figure 7.2a depicts the production function
\(q = 100\sqrt{L}\) again. The marginal product of labor (e.g., hiring more workers) is measured by
the derivative of this function with respect to L; that is,
\[
MP_L = \frac{\partial q}{\partial L} = 100 \cdot \frac{1}{2} L^{1/2-1} = \frac{50}{L^{1/2}}
\]
because \(L^{1/2-1} = L^{-1/2}\), and \(L^{-1/2}\) can also be expressed as \(\frac{1}{L^{1/2}}\). Figure 7.2b depicts
marginal product \(MP_L = \frac{50}{L^{1/2}}\) as a function of the number of workers that the firm hires, L.
For instance, at point A, where the firm hires \(L_A = 4\) workers, the marginal product of hiring
more workers becomes \(MP_L = \frac{50}{4^{1/2}} = \frac{50}{2} = 25\) units. That is, when the firm hires only
4 workers, it could increase its total output by approximately 25 units if it hired an additional worker. In
Figure 7.2
(a) Depicting MPL as the slope of the production function \(q = 100\sqrt{L}\), with point A at L = 4, q = 200, and point B at L = 16, q = 400. (b) Representing MPL directly, with point A at L = 4, MPL = 25, and point B at L = 16, MPL = 12.5.
contrast, at point B, where the firm already employs \(L_B = 16\) workers, \(MP_L\) becomes
\(MP_L = \frac{50}{16^{1/2}} = \frac{50}{4} = 12.5\) units. Intuitively, the additional output that each extra worker brings is positive,
but it decreases as the firm hires more and more workers. In short, MPL is diminishing. This
could happen, for instance, if there are “too many cooks in the kitchen.”
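The values at points A and B can be reproduced, and compared with the discrete gain from hiring exactly one more worker, in a short Python sketch. It assumes the production function q = 100√L from figure 7.2; the function names are illustrative.

```python
def output(L):
    """Total output q = 100 * sqrt(L)."""
    return 100 * L**0.5


def marginal_product(L):
    """MP_L = dq/dL = 50 / L^(1/2) when labor is continuous."""
    return 50 / L**0.5


for L in (4, 16):
    # Discrete counterpart: extra output from hiring exactly one more worker.
    extra_worker_gain = output(L + 1) - output(L)
    print(L, marginal_product(L), round(extra_worker_gain, 2))
```

The derivative gives 25 units at L = 4 and 12.5 units at L = 16, while the discrete one-worker gains are slightly smaller (roughly 23.6 and 12.3 units), because the marginal product keeps falling as the extra worker is added.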
Example 7.3: Finding marginal product Consider the same production function
as in example 7.2, \(q = 5L^{1/2} + 3L - 6\). Next, we find its marginal product of labor,
showing that it is decreasing in L, as in our above discussion. First, let us find \(MP_L\) by
differentiating the production function with respect to L:
\[
MP_L = \frac{\partial q}{\partial L} = 5 \cdot \frac{1}{2} L^{1/2-1} + 3 = \frac{5}{2L^{1/2}} + 3.
\]
To check if \(MP_L\) decreases in L, we can calculate its derivative as follows:
\[
\frac{\partial MP_L}{\partial L} = -\frac{5}{4L^{3/2}}
\]
which is negative because L > 0. Hence, as L increases, the marginal product of labor,
MPL , decreases. Essentially, additional workers bring more production to the firm,
but at a decreasing rate.
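As a quick check of this example, the sketch below compares the analytical marginal product with a numerical (central-difference) derivative of the production function and lists its value at a few labor levels. It is a minimal Python illustration; the helper `numerical_derivative`, the step size, and the chosen labor levels are assumptions made here.

```python
def output(L):
    """Production function q = 5*L^(1/2) + 3*L - 6."""
    return 5 * L**0.5 + 3 * L - 6


def marginal_product(L):
    """Analytical MP_L = 5 / (2*L^(1/2)) + 3."""
    return 5 / (2 * L**0.5) + 3


def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


labor_levels = [1, 4, 9, 16]
mp_values = [marginal_product(L) for L in labor_levels]

for L, mp in zip(labor_levels, mp_values):
    print(L, round(mp, 4), round(numerical_derivative(output, L), 4))
```

The marginal product stays above 3 (the linear term) but falls from 5.5 at L = 1 toward 3.625 at L = 16, consistent with a positive but diminishing marginal product.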
Self-assessment 7.3 Consider the firm in example 7.3, but assume that the firm’s
production function changes to \(q = 7L^{1/3} + 4L - 2\). Find the marginal product, \(MP_L\),
and check if it increases or decreases in labor.
7.4 Relationship between APL and MPL
The average and marginal products exhibit some interesting relationships:
1. When the APL curve is increasing, MPL lies above APL ;

2. When the APL curve is decreasing, MPL lies below APL ; and
3. When the APL curve is flat (at its highest point), MPL curve crosses APL .
To understand these relationships, let us consider grades in a class rather than output.
Assume that you take a midterm exam in one of your classes, and, a few days later, the
instructor shows up with a stack of graded exams, letting you know that your average grade
in the class will go up as a consequence of the midterm. Great news! What does that say
about your performance on the midterm exam? Of course, it is saying that your grade on
the midterm exam must be better than your previous average (a result you would know even
without taking intermediate micro!). In other words, for your average grade in the class to
increase, it must be the case that the grade in your midterm exam is higher than your previous
average, or alternatively, that the marginal effect of the last grade is higher than your average.
This is analogous to what happens with output: for the average product of labor to increase
in L (as the firm hires one more worker), it must be that the newly hired worker produces
more total output than previous workers did on average. In short, MPL > APL , as in the
midterm grade example.
The opposite applies when the midterm exam decreases your average grade in the class.
In that case, the midterm grade must be lower than your previous average. In the context
of output, the product of the newly hired worker is lower than that of previous workers on
average, or MPL < APL .
Lastly, if the instructor of the class informs you that your grade is unaffected by your
midterm score, it means that your score exactly coincides with your previous average. In
the context of the firm, the newly hired worker is as productive as previous workers are on
average.
An immediate consequence of this argument is that the MPL curve crosses the APL at
the maximum point (the peak) of the APL curve. We can easily show this result for any
production function q = f (L). First, note that the average product per worker is
\(AP_L = \frac{q}{L} = \frac{f(L)}{L}\). To find the number of workers, L, at which \(AP_L\) reaches its maximum, we differentiate
\(AP_L\) with respect to L and set our result equal to zero, as follows:
\[
\frac{\partial AP_L}{\partial L} = \frac{f'(L)L - 1 \cdot f(L)}{L^2} = 0,
\]
where, because \(AP_L = \frac{f(L)}{L}\), we find the derivative using the quotient rule.2 Note that, for
compactness, we use \(f'(L)\) to denote the derivative of total output \(f(L)\) with respect to L. As
this derivative is the marginal product \(MP_L = \frac{\partial q}{\partial L}\), we can replace \(f'(L)\) with \(MP_L\), as follows:
\[
\frac{MP_L L - f(L)}{L^2} = \frac{MP_L}{L} - \frac{f(L)}{L^2} = 0.
\]
Multiplying both sides by L, we obtain
\[
MP_L - \frac{f(L)}{L} = 0.
\]
Finally, note that the second term is the average product per worker, \(AP_L = \frac{f(L)}{L}\), which
allows us to write this expression more compactly as
\[
MP_L = AP_L.
\]
This equation tells us that, at the maximum of the APL curve, the MPL curve crosses the
APL curve. Figure 7.3 illustrates this discussion: when the APL curve is increasing, MPL
lies above APL ; when the APL curve is decreasing, MPL lies below APL ; and when the APL
curve is flat (at its peak) the height of MPL and APL coincide.
Example 7.4: Relationship between APL and MPL Consider the production
function analyzed in examples 7.2 and 7.3. As shown in example 7.2,
\(AP_L = \frac{5}{L^{1/2}} + 3 - \frac{6}{L}\) reaches its maximum at \(L = \frac{144}{25} = 5.76\), where its height becomes
2. Recall the quotient rule. For a function \(f(x) = \frac{g(x)}{h(x)}\), where both the numerator, g(x), and denominator, h(x), are
functions of x, the quotient rule says that the derivative of f(x) is \(f'(x) = \frac{g'(x)h(x) - g(x)h'(x)}{(h(x))^2}\). In our scenario, the
quotient is given by \(AP_L = \frac{f(L)}{L}\), so the numerator f(L) plays the role of g(x) in the quotient rule, while L plays the
role of h(x).
Figure 7.3
The APL and MPL curves.
\(AP_L = \frac{5}{(5.76)^{1/2}} + 3 - \frac{6}{5.76} \simeq 4.04\). If we evaluate the \(MP_L = \frac{5}{2L^{1/2}} + 3\) curve found in
example 7.3 at exactly the same L = 5.76, we obtain that the height of the MPL
curve is,
\[
MP_L = \frac{5}{2(5.76)^{1/2}} + 3 \simeq 4.04,
\]
thus confirming that the MPL crosses the APL at its maximum point.
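The crossing can also be verified numerically, together with the ordering of the two curves on either side of the peak. The following minimal Python sketch uses the APL and MPL expressions from examples 7.2 and 7.3; the function names and the comparison points L = 4 and L = 9 are illustrative choices made here.

```python
def average_product(L):
    """AP_L = 5 / L^(1/2) + 3 - 6 / L."""
    return 5 / L**0.5 + 3 - 6 / L


def marginal_product(L):
    """MP_L = 5 / (2*L^(1/2)) + 3."""
    return 5 / (2 * L**0.5) + 3


L_peak = 144 / 25  # L = 5.76, where AP_L reaches its maximum

for L in (4, L_peak, 9):
    print(round(L, 2), round(average_product(L), 4), round(marginal_product(L), 4))
```

At L = 4 the marginal product (4.25) lies above the average product (4.00), at L = 5.76 both are approximately 4.04, and at L = 9 the marginal product (about 3.83) lies below the average product (4.00), in line with the three relationships listed at the start of this section.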
Self-assessment 7.4 Consider the firm in self-assessments 7.2 and 7.3. Using the
same steps as in example 7.4, find the point at which MPL crosses the APL curve.
7.5 Isoquants
In this section, we examine a firm’s ability to substitute one input for another while main-
taining the same level of output. For instance, a firm may consider acquiring a packaging
machine that does the job of three packaging workers. To evaluate that ability to substitute
between inputs, we first present a definition of how to measure input combinations that yield
the same output level.
Isoquant curve It represents combinations of labor and capital that yield the same
amount of output.
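Figure 7.4
Isoquants: nonprofitable input combinations. Labor L is on the horizontal axis and capital K on the vertical axis; the figure shows isoquants for q = 100 and q = 200 and marks points A1, A2, A3, and B.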
The isoquant is, then, analogous to the indifference curve in consumer theory. As
discussed in chapter 2, the indifference curve of a consumer represents combinations of
goods x and y for which she obtains the same utility level. In the context of production,
an isoquant similarly reflects combinations of labor and capital that generate the same total
output.3 Figure 7.4 depicts an example of an isoquant, where at point A, the firm uses an
input combination that is relatively capital intensive (i.e., many units of capital and few
of labor), whereas at point B, the firm uses a labor-intensive input combination. Yet, at both
points, the firm produces the same total output (100 units). At point C, however, the firm
reaches a higher total output (200 units).
Figure 7.4 continues the isoquant curve upward and rightward (in the dark shaded areas),
to illustrate that these shaded areas are regarded as unprofitable for the firm, and thus are
never chosen by a rational manager. To see this, note that at points A1 and A2 , the firm
produces the same units of output (100 units) because they both lie on the same isoquant.
However, point A2 uses more units of labor than point A1 (note that A2 is to the right side of
A1 ), and the same amount of capital (both points have the same height). As a consequence,
the input combination at A2 must be more expensive to purchase than that at A1 , and yet it
produces the same total output! No rational firm would then choose an input combination
such as A2 , an argument that applies to the upward-bending portion of the isoquant in the
shaded area on the right. Similarly, points A1 and A3 generate the same units of output
because they lie on the same isoquant, but A3 requires more capital than A1 , thus implying
that it is more expensive to purchase. As a result, the firm would never choose an input
3. Graphically, if a production function q = f (L, K) is represented by a “production mountain” in three dimensions

(3D), a horizontal slice of the production mountain at a specific height (at a particular output level) entails combi-
nations of labor and capital that yield the same output (i.e., reach the same height on the mountain). The 3D figure
of the production function could look like the figure discussed for utility functions in chapter 2 (figure 2.3a) while
its slice at a given height would resemble that of the indifference curve in figure 2.3b.
combination such as A3 , an argument that extends to all points on the backward-bending

portion of the isoquant in the shaded area at the top of the graph.
The equation of the isoquant is found using the same approach followed for indifference
curves in consumer theory (section 2.6 in chapter 2). As we illustrate in example 7.5, to find
the isoquant curve for a specific output level, such as q = 100 units, we only need to solve
for the variable on the vertical axis (often, capital K).
Example 7.5: Finding isoquant curves for a Cobb-Douglas production function
Consider a Cobb-Douglas production function \(q = 5L^{1/2}K^{1/2}\), and let us find
the isoquant corresponding to output level q = 100 units. Inserting this output level
into the production function (left side), we obtain \(100 = 5L^{1/2}K^{1/2}\), or \(20 = L^{1/2}K^{1/2}\).
To solve for capital K, we first square both sides, which yields \(20^2 = LK\), or 400 = LK.
Lastly, we can solve for K to obtain the isoquant \(K = \frac{400}{L}\). Graphically, this isoquant
is a curve that approaches the vertical axis when L is close to zero, but it never crosses
that axis; and that approaches the horizontal axis when L is large, without ever crossing
that axis. (As an exercise, consider a linear production function q = 5L + 7K,
and find the isoquant corresponding to the output level q = 100. You should obtain
a straight line originating at \(K = \frac{100}{7}\), crossing the horizontal axis at \(L = \frac{100}{5}\), with a
slope of \(-\frac{5}{7}\).)
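The isoquant found in this example can be checked by plugging points on it back into the production function. The sketch below is a minimal Python illustration; the function names and the labor levels used (4, 16, and 100) are assumptions chosen for convenience.

```python
def production(L, K):
    """Cobb-Douglas production function q = 5 * L^(1/2) * K^(1/2)."""
    return 5 * L**0.5 * K**0.5


def isoquant_capital(L, q=100):
    """Capital needed to produce q units with L workers: K = (q/5)^2 / L."""
    return (q / 5) ** 2 / L


for L in (4, 16, 100):
    K = isoquant_capital(L)
    # Every (L, K) pair on the isoquant should return q = 100.
    print(L, K, round(production(L, K), 6))
```

Each pair, (4, 100), (16, 25), and (100, 4), yields 100 units of output, and capital falls toward zero without reaching it as labor grows, as described above.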
Self-assessment 7.5 Consider the firm in example 7.5, but assume that it seeks
to produce q = 200 units. What is the firm’s isoquant now? What if its production
function changes to \(q = L^{1/3}K^{2/3}\)? Interpret your results.
7.6 Marginal Rate of Technical Substitution
We next find the slope of the isoquant. If the firm were to add one extra worker, it would
increase its total output. Then, we ask:
How many units of capital must the firm give up to maintain its output level unaffected after hiring
an extra worker?
The slope of the isoquant answers this question.
Marginal rate of technical substitution (MRTS) After increasing the quantity of

labor by 1 unit, the MRTS measures the amount by which capital must be reduced so
that output remains constant.
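Figure 7.5
Diminishing MRTS. [Isoquant for q = 100 units, with labor L on the horizontal axis and capital K on the vertical axis, passing through points A (L = 4, K = 80), B (L = 5, K = 60), and C (L = 6, K = 50). One more worker requires a large decrease in K at A and a small decrease in K at B.]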
Figure 7.5 illustrates the MRTS for the same isoquant depicted in figure 7.4. At point A,
the firm uses a large amount of capital and few workers to produce q = 100 units of output.
If it were to hire 1 more worker (moving rightward from L = 4 to L = 5), the firm would
need to reduce its capital usage significantly (moving downward from K = 80 to K = 60) to
maintain its current output level, thus moving along the isoquant from the initial point A to B.
Importantly, we must move along the isoquant because we seek to keep output unchanged.4
Let us repeat this process at point B, where the firm employs more workers but fewer units
of capital than at A (L = 5 and K = 60). If the firm hired 1 more worker, it would be willing
to give up only a few units of capital to keep its output level unaffected. Intuitively, when
capital is abundant and labor scarce (as in point A), the firm is willing to give up many units
of capital to hire one more worker. However, as capital becomes more scarce (at point B),
the firm is less willing to replace it with workers. (For completeness, appendix A, at the
end of the chapter, shows that the slope of the isoquant is measured by the ratio of marginal
products.)
4. Intuitively, a high MRTS indicates that, at point A, the firm hired so few workers that hiring one worker would
increase output significantly, thus requiring a large capital reduction to keep output unchanged.
Example 7.6: Finding the MRTS of a Cobb-Douglas production function
Consider a firm with production function $q = 8L^{1/2}K^{1/2}$. Its marginal product of labor is
$MP_L = 8 \cdot \frac{1}{2} L^{1/2-1}K^{1/2} = 4L^{-1/2}K^{1/2}$, and that for capital is
$MP_K = 8 \cdot \frac{1}{2} L^{1/2}K^{1/2-1} = 4L^{1/2}K^{-1/2}$. Hence, the MRTS in this scenario is
$$MRTS = \frac{MP_L}{MP_K} = \frac{4L^{-1/2}K^{1/2}}{4L^{1/2}K^{-1/2}} = \frac{K}{L},$$
which is decreasing in the units of labor (as L shows up only in the denominator).5
Graphically, this result entails that the slope of the isoquant (the MRTS) falls as we
move rightward toward more units of L. That is, the isoquant becomes flatter as we
move rightward or, alternatively, the firm's isoquant is bowed in from the origin, as in
figure 7.5.
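As a rough numerical check of example 7.6, the sketch below approximates the marginal products with central finite differences and compares their ratio with K/L. The helper names, the step size h, and the choice of the isoquant q = 160 (so that LK = 400) are assumptions made for illustration.

```python
def production(labor, capital):
    """Production function from example 7.6: q = 8 * L**0.5 * K**0.5."""
    return 8 * labor ** 0.5 * capital ** 0.5


def mrts(f, labor, capital, h=1e-6):
    """Ratio MP_L / MP_K, with both marginal products approximated
    by central finite differences of step h."""
    mpl = (f(labor + h, capital) - f(labor - h, capital)) / (2 * h)
    mpk = (f(labor, capital + h) - f(labor, capital - h)) / (2 * h)
    return mpl / mpk


# Move rightward along the isoquant q = 160, where K = 400 / L.
for labor in (4.0, 5.0, 8.0):
    capital = 400 / labor
    print(labor, capital, mrts(production, labor, capital), capital / labor)
```

The numerical MRTS should track K/L closely and fall as L rises, in line with the diminishing MRTS described in the example.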
Self-assessment 7.6 Consider a firm with production function $q = 4L^{1/3}K^{2/3}$.
Using the same steps as in example 7.6, find this firm's MRTS. Interpret your results.
Example 7.7: Finding the MRTS of a linear production function
Consider a firm with a linear production function $q = aL + bK$, where $a, b > 0$. Its marginal product
of labor is $MP_L = a$, while that for capital is $MP_K = b$. Hence, the MRTS in this
scenario is
$$MRTS = \frac{MP_L}{MP_K} = \frac{a}{b}.$$
In this context, the MRTS is not a function of the units of labor or capital that the firm
uses. Indeed, the MRTS is just a constant (a number $\frac{a}{b}$). For instance, if parameters a
and b took the values a = 6 and b = 3, then the MRTS would become $\frac{6}{3} = 2$, implying
that the slope of the isoquant would be −2 in all its points. Graphically, the isoquant
would then be a straight line (because its slope is constant).6
Self-assessment 7.7 Consider a firm with production function q = 5L + 2K. Using

the same steps as in example 7.7, find this firm’s MRTS. Assume now that its produc-
tion function changes to q = 5L + 4K. What is the firm’s MRTS now? How does it
compare to the initial MRTS? Interpret your results in terms of capital productivity.
5. Formally, the derivative of the MRTS with respect to the units of labor, L, is $\frac{\partial MRTS}{\partial L} = -\frac{K}{L^2}$, which is negative
for any number of units of labor, L, and capital, K, that the firm uses.
6. As described in the previous discussion of isoquants, to find the isoquant of production function $q = aL + bK$,
we can find $K = \frac{q}{b} - \frac{a}{b}L$, where $\frac{q}{b}$ represents the vertical intercept of the isoquant in the K-axis, while $-\frac{a}{b}$ is the
slope of the isoquant, thus confirming this result.
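Figure 7.6
Linear production function. [Straight-line isoquant with labor L on the horizontal axis; vertical intercept $\frac{q}{b}$, horizontal intercept $\frac{q}{a}$, and slope $-\frac{a}{b}$.]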
7.7 Special Types of Production Functions
7.7.1 Linear Production Function

As described in previous sections, the linear production function takes the form
$$q = aL + bK,$$
where both a and b are positive parameters (numbers), such as in q = 7L + 5K. The isoquant
of this production function is a straight line. To see this, solve for K in q = aL + bK, to
obtain
$$K = \frac{q}{b} - \frac{a}{b}L,$$
where $\frac{q}{b}$ is the vertical intercept of the isoquant, while $\frac{a}{b}$ denotes its negative slope, as
depicted in figure 7.6. For instance, consider a firm with production function q = 7L + 5K,
and let us depict the isoquant of q = 100 units of output. The vertical intercept is
$K = \frac{q}{b} = \frac{100}{5} = 20$, whereas the negative slope is $\frac{a}{b} = \frac{7}{5}$. The slope is constant along all points of the
isoquant because $\frac{a}{b}$ is not a function of L or K. As a consequence, the MRTS is also constant
given that the latter represents the slope of the isoquant. Intuitively, the firm can substitute
units of capital and labor at the same rate, regardless of the number of each input that it
employs. Hence, linear production functions can help us represent firms capable of easily
substituting between inputs, such as two types of fuel (oil and natural gas), or two types of
computers; that is, relative input usage does not change the firm's ability to substitute one
input for another.
7. Recall that, to find the horizontal intercept of this isoquant, we just need to set capital equal to zero (K = 0),
and solve for L. That is, $q = aL + b \cdot 0$, which yields the horizontal intercept $L = \frac{q}{a}$. For instance, in the production
function q = 7L + 5K, if we depict the isoquant for q = 100 units of output, the horizontal intercept is
$L = \frac{q}{a} = \frac{100}{7} \approx 14.29$ workers.
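The intercepts and slope of a linear isoquant follow directly from q, a, and b. A minimal sketch, assuming a hypothetical helper that returns the three quantities in a dictionary:

```python
def linear_isoquant(q, a, b):
    """Intercepts and slope of the isoquant of q = a*L + b*K
    (illustrative helper; the dictionary keys are our own labels)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return {
        "vertical_intercept": q / b,    # K when L = 0
        "horizontal_intercept": q / a,  # L when K = 0
        "slope": -a / b,                # constant along the isoquant
    }


# Isoquant of q = 7L + 5K at q = 100 units of output.
print(linear_isoquant(100, a=7, b=5))
```

Because the slope does not depend on L or K, the same number describes every point of the isoquant, which is what a constant MRTS means.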
Self-assessment 7.8 Consider a firm with production function q = 5L + 2K. If

the firm seeks to produce q = 230 units, find the isoquant, its vertical and horizontal
intercept, and its slope.
7.7.2 Fixed-Proportions Production Function

This type of production function is the polar opposite of the linear production function
described previously. In this case, the firm cannot substitute between inputs and still main-
tain the same output level. Instead, the firm must use inputs in a fixed proportion to increase
output. As described earlier in this chapter, this production function takes the form
q = A min{aL, bK},
where parameters A, a, and b are all positive. An example of this production function would
be q = min{2L, 3K}. Importantly, an increase in one input without a proportional increase
in the other input will not result in an increase in production.
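Figure 7.7
Fixed-proportions production function. [L-shaped isoquant with its kink on the ray $K = \frac{a}{b}L$, where aL = bK; the vertical segment lies at $L = \frac{q}{Aa}$ and the horizontal segment at $K = \frac{q}{Ab}$.]
Figure 7.7 illustrates that, depending on the amounts of labor and capital used, the firm
faces one of the following three cases: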
• If min{aL, bK} = aL, which occurs because aL < bK, the output level becomes q = AaL.
Solving for L in q = AaL, we obtain that $L = \frac{q}{Aa}$. Graphically, this is a vertical line at a
labor level of $L = \frac{q}{Aa}$ (note that it is a straight vertical line because ratio $\frac{q}{Aa}$ is not a function
of K). In this example, q = min{2L, 3K}, if the firm produces q = 100 units, the vertical
segment of the isoquant happens when 2L < 3K or, solving for K, $\frac{2}{3}L < K$, where the
vertical line lies at $L = \frac{q}{Aa} = \frac{100}{2}$ workers.
• If min{aL, bK} = bK, which occurs because aL > bK, the output level becomes q = AbK.
Solving for K in q = AbK, we obtain that $K = \frac{q}{Ab}$. Graphically, this is a horizontal line at
a capital level of $K = \frac{q}{Ab}$ (note that it is a straight horizontal line since ratio $\frac{q}{Ab}$ is not a
function of L). For instance, in the example where q = min{2L, 3K}, if the firm seeks to
produce q = 100 units, the horizontal segment of the isoquant occurs when 2L > 3K or,
solving for L, $L > \frac{3}{2}K$, where the horizontal line lies at $K = \frac{q}{Ab} = \frac{100}{3}$ units of capital.
• If aL = bK, then, min{aL, bK} is either aL or bK, since both are the same number. In this
case, the output level becomes q = AaL = AbK because aL = bK. This occurs at the kink
of the isoquant, where aL = bK, which, solving for K, yields a kink at $K = \frac{a}{b}L$. In this
example, q = min{2L, 3K}, the kink happens at $K = \frac{2}{3}L$. Graphically, this result means
that the kinks of all isoquants are crossed by a ray from the origin with slope $\frac{2}{3}$.
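The three cases can be checked with a few lines of code. This is only a sketch: the function names are hypothetical, and the default parameters reproduce the running example q = min{2L, 3K}.

```python
def fixed_proportions_output(labor, capital, scale=1.0, a=2.0, b=3.0):
    """Output of q = scale * min(a*L, b*K)."""
    return scale * min(a * labor, b * capital)


def isoquant_kink(q, scale=1.0, a=2.0, b=3.0):
    """Input pair (L, K) at the kink of the isoquant for output q,
    where a*L = b*K."""
    return q / (scale * a), q / (scale * b)


labor_kink, capital_kink = isoquant_kink(100)
print(labor_kink, capital_kink, capital_kink / labor_kink)  # ratio K/L = a/b

# Adding only one input beyond the kink leaves output unchanged.
print(fixed_proportions_output(labor_kink, capital_kink))
print(fixed_proportions_output(labor_kink + 10, capital_kink))   # horizontal segment
print(fixed_proportions_output(labor_kink, capital_kink + 10))   # vertical segment
```

All three output values should coincide (up to floating-point rounding), illustrating that extra units of a single input do not raise production.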
Note that the MRTS of the fixed proportion production function is not well defined,
because we can find infinitely many slopes for the isoquant at its kink. We can nonetheless
say that the slope of the isoquant is infinite in its vertical segment, and zero in its horizontal
segment. This type of production function is common in firms that cannot easily substitute
across inputs without altering their total output, such as firms in the chemical or pharma-
ceutical industry or firms with a highly automated production process. In the first type of
firm, a switch from one chemical to another might alter the final product; and in the second
type of firm, a reduction in the number of workers cannot be easily compensated for by an
increase in machines without serious adjustments to the production process.
Self-assessment 7.9 Consider a firm with production function q = 5 min{3L, K}.

If the firm seeks to produce q = 200 units of output, find and depict the isoquant.
7.7.3 Cobb-Douglas Production Function

This production function takes the mathematical form
$$q = AL^{\alpha}K^{\beta},$$
where the parameters A, α, and β are all positive. For instance, if A = 1 and α = β = 1/2, the
production function becomes $q = L^{1/2}K^{1/2}$, which we frequently encountered throughout
this chapter. In this scenario, the isoquant is found by squaring both sides of $q = L^{1/2}K^{1/2}$,
which yields $q^2 = LK$, and solving for K (the input on the vertical axis), we obtain $K = \frac{q^2}{L}$,
as described in example 7.5. In addition, the slope of the isoquant (MRTS) becomes
$$MRTS = \frac{MP_L}{MP_K} = \frac{\frac{1}{2}L^{-1/2}K^{1/2}}{\frac{1}{2}L^{1/2}K^{-1/2}} = \frac{K^{1/2+1/2}}{L^{1/2+1/2}} = \frac{K}{L},$$
as discussed in example 7.6.
Self-assessment 7.10 Consider a firm with production function $q = 5L^{1/3}K^{2/3}$.
Find the firm's MRTS. Then, assuming that the firm seeks to produce q = 220 units
of output, find and depict the isoquant.
7.7.4 Constant Elasticity of Substitution Production Function

Lastly, we present a production function that exhibits a constant elasticity of substitution
(CES), σ . (For more details about this elasticity, see appendix B at the end of this chapter.)
This function has the following form:
$$q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}},$$
where the term σ in the exponents represents the elasticity of substitution. An interesting
property of this production function is that it embodies all previously discussed production
functions as special cases. In particular, when the elasticity of substitution, σ , is σ = +∞,
we have that the production function coincides with the linear production function where the
firm can easily substitute between inputs; when σ = 0, the production function converges
to the fixed-proportions production function; and when σ = 1, the production function
coincides with the Cobb-Douglas production function.8
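The three limiting cases can be explored numerically. The sketch below is an illustration under stated assumptions: the weights are set to a = b = 0.5 (the Cobb-Douglas comparison relies on the weights summing to one), σ = 1 itself is avoided because the exponent is undefined there, and the fixed-proportions limit is compared with min{L, K}.

```python
def ces_output(labor, capital, sigma, a=0.5, b=0.5):
    """CES output (a*L**rho + b*K**rho)**(1/rho), with rho = (sigma - 1) / sigma.
    Requires sigma > 0 and sigma != 1."""
    if sigma <= 0 or sigma == 1:
        raise ValueError("sigma must be positive and different from 1")
    rho = (sigma - 1.0) / sigma
    return (a * labor ** rho + b * capital ** rho) ** (1.0 / rho)


labor, capital = 4.0, 9.0

# Very large sigma: close to the linear function 0.5*L + 0.5*K.
print(ces_output(labor, capital, sigma=1e6), 0.5 * labor + 0.5 * capital)

# Sigma near 1: close to the Cobb-Douglas function L**0.5 * K**0.5.
print(ces_output(labor, capital, sigma=1.0001), labor ** 0.5 * capital ** 0.5)

# Sigma near 0: close to the fixed-proportions function min(L, K).
print(ces_output(labor, capital, sigma=0.01), min(labor, capital))
```

Each pair of printed values should be close but not identical, since the special cases hold only in the limit.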
7.8 Returns to Scale
In this section, we analyze how a firm’s output responds to a common increase in all of
its inputs, which, for compactness, we refer to as “returns to scale.” Intuitively, the firm
increases its scale because it expands the number of all inputs being hired, but the question
is whether its output level increases more or less than proportionally to the increase in input
usage.
8. Proving this result is not straightforward. For a step-by-step proof, see Muñoz-Garcia (2017), chapter 2, pages
49–51.
Returns to scale Consider that all inputs are increased by a common factor, λ > 1.
Hence, L is increased to λL, and K is increased to λK. If the firm's output increases
by a factor $\lambda^a$ that satisfies:
• $\lambda^a > \lambda$ (which occurs when a > 1), we say that the firm exhibits increasing returns
to scale.
• $\lambda^a < \lambda$ (which happens when a < 1), we say that the firm exhibits decreasing returns
to scale.
• $\lambda^a = \lambda$ (which occurs when a = 1), we say that the firm exhibits constant returns to
scale.
As an illustration, consider a firm doubling the units of all inputs (λ = 2). If output
increases more than proportionally (i.e., output more than doubles), the firm's production
function exhibits increasing returns to scale. If output increases less than proportionally
(it falls short of doubling), we say that the firm's production function exhibits decreasing
returns to scale. Finally, if output increases proportionally (exactly doubling), the firm's
production function satisfies constant returns to scale.
Example 7.8 illustrates how we can test for the existence of any of these three types of
returns. First, we increase all inputs by the same factor λ (multiplying all inputs by λ), and
then simplify our results to obtain how much total output increases.
Example 7.8: Testing for returns to scale
Consider a Cobb-Douglas production function $q = AL^{\alpha}K^{\beta}$. If we increase all inputs by λ, labor becomes λL, while capital
is λK, implying that total output is now
$$A(\lambda L)^{\alpha}(\lambda K)^{\beta} = A\lambda^{\alpha}L^{\alpha}\lambda^{\beta}K^{\beta},$$
which simplifies to
$$\lambda^{\alpha+\beta}\underbrace{AL^{\alpha}K^{\beta}}_{q} = \lambda^{\alpha+\beta}q,$$
where we factored λ out on the left side, and use the definition $q = AL^{\alpha}K^{\beta}$. (As
described previously, our goal when rearranging the previous expression was to factor
the production function $q = AL^{\alpha}K^{\beta}$ out, so we could have q and an additional
term, in this case $\lambda^{\alpha+\beta}$, which will help us identify whether increasing, constant,
or decreasing returns to scale are present.)
Hence, output increased by $\lambda^{\alpha+\beta}$, giving rise to three possible cases:
• If its exponent satisfies α + β > 1, increasing returns to scale exist. This is the case,
for instance, if $\alpha = \frac{2}{3}$ and $\beta = \frac{1}{2}$.
• If the exponent satisfies α + β < 1, decreasing returns to scale are present; for
example, if $\alpha = \frac{1}{3}$ and $\beta = \frac{1}{2}$.
• Finally, if the exponent is exactly equal to 1 (α + β = 1), then constant returns to
scale exist. This happens if, for example, $\alpha = \frac{1}{2}$ and $\beta = \frac{1}{2}$.
Consider now a linear production function q = aL + bK. If we increase all inputs
by λ, we find
$$a(\lambda L) + b(\lambda K) = \lambda\underbrace{(aL + bK)}_{q} = \lambda q,$$
implying that output increased proportionally to inputs, and thus the firm's production
process exhibits constant returns to scale.
Lastly, consider a fixed-proportions production function, q = A min{aL, bK}. If we
increase all inputs by a common factor λ, we obtain
$$A\min\{a\lambda L, b\lambda K\} = \lambda\underbrace{A\min\{aL, bK\}}_{q} = \lambda q,$$
which also means that output responds proportionally to a given increase in inputs.
That is, this production function exhibits constant returns to scale.
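The test in example 7.8 can also be run numerically: scale both inputs by λ and compare the resulting output ratio with λ. The sketch below uses hypothetical names and evaluates the ratio at a single input pair, which is conclusive for homogeneous functions such as the three in this example but only suggestive otherwise.

```python
import math


def returns_to_scale(f, labor, capital, lam=2.0):
    """Classify returns to scale of f at (labor, capital) by comparing
    f(lam*L, lam*K) / f(L, K) with lam."""
    ratio = f(lam * labor, lam * capital) / f(labor, capital)
    if math.isclose(ratio, lam, rel_tol=1e-9):
        return "constant"
    return "increasing" if ratio > lam else "decreasing"


def cobb_douglas(alpha, beta, scale=1.0):
    """Build q = scale * L**alpha * K**beta as a function of (L, K)."""
    return lambda labor, capital: scale * labor ** alpha * capital ** beta


print(returns_to_scale(cobb_douglas(2 / 3, 1 / 2), 4.0, 9.0))   # alpha + beta > 1
print(returns_to_scale(cobb_douglas(1 / 3, 1 / 2), 4.0, 9.0))   # alpha + beta < 1
print(returns_to_scale(cobb_douglas(1 / 2, 1 / 2), 4.0, 9.0))   # alpha + beta = 1
print(returns_to_scale(lambda l, k: 5 * l + 7 * k, 4.0, 9.0))   # linear
print(returns_to_scale(lambda l, k: min(2 * l, 3 * k), 4.0, 9.0))  # fixed proportions
```

A relative tolerance is used for the "constant" case because floating-point powers rarely reproduce λ exactly.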
Self-assessment 7.11 Consider a firm with the production function $q = 5L^{1/3}K^{2/3}$.
Use the same steps as in example 7.8 to find if this production function
exhibits increasing, decreasing, or constant returns to scale. What if the firm's production
function changes to q = 7L + 8K? What if it changes to q = 3 min{L, 2K}?
7.9 Technological Progress
In previous sections, we assumed that the firm's production function was unaffected by technological
progress; however, firms often benefit from technological advances that allow
them to produce a larger output using the same amount of inputs.9 Mathematically, for a
given pair of inputs L and K, the firm can produce $q_1$ units of output before the technological
progress occurred, but it can produce $q_2$ units of output afterward (where $q_2 > q_1$). For
instance, if the firm's production function is $q = A_1L^{\alpha}K^{\beta}$ before the technological change,
it could be $q = A_2L^{\alpha}K^{\beta}$ afterward, with $A_2 > A_1$; $q = A_1L^{\alpha+\gamma}K^{\beta}$, where the exponent on labor increased
by γ > 0; or $q = A_1L^{\alpha}K^{\beta+\gamma}$, where now the exponent on capital increased by γ > 0.
9. Alternatively, technological progress lets firms produce the same number of units of output while using fewer
units of inputs.
Example 7.9: Testing for technological progress
Consider a firm with a production function that changes from $q = L^{\alpha}K^{\beta}$ to $q = L^{2\alpha}K^{\beta}$. The firm benefits from
technological progress because
$$L^{\alpha}K^{\beta} < L^{2\alpha}K^{\beta},$$
which simplifies to $L < L^2$, a condition that holds true for any number of workers
L > 1. Therefore, for a given pair of inputs L and K, the firm can produce more units
of output, confirming that technological progress exists.
Self-assessment 7.12 Consider a firm with the production function q = 7L + 2K,

which changes to q = 7L + 5K. Is the firm experiencing technological progress?
7.9.1 Types of Technological Progress

Technological progress can be one of three types: labor enhancing, capital enhancing, or
neutral, as we discuss next.
Labor-enhancing technological progress. We say that technological progress is labor
enhancing if it increases the marginal product of labor more than that of capital (i.e.,
$\Delta MP_L > \Delta MP_K$). This can occur if higher education allows workers to be more productive,
while capital does not increase its marginal product as quickly. As a result, $MRTS = \frac{MP_L}{MP_K}$
increases because its numerator grows faster than its denominator. Graphically, this implies
that the firm's isoquant becomes steeper after the technological progress, thus indicating
that the firm is willing to give up more units of capital to hire one more worker.
This type of technological progress is also referred to as "capital saving," because the
firm can get rid of machines and hire more labor, which became more productive in relative
terms.
Capital-enhancing technological progress. In contrast, capital-enhancing technological
progress means that capital increases its marginal product faster than labor does, $\Delta MP_L < \Delta MP_K$.
This can happen if new robots or software programs make each unit of capital
more productive, while labor does not increase its marginal product as much. In this
case, $MRTS = \frac{MP_L}{MP_K}$ decreases, as its numerator grows more slowly than its denominator.
Graphically, this implies that the firm's isoquant becomes flatter after the technological
progress occurred, thus suggesting that the firm is willing to give up fewer units of the
more productive capital to hire one more worker.
This type of progress is also labeled as "labor saving," because the firm can fire some
workers and purchase more capital, which became relatively more productive.
Neutral technological progress. Finally, technological progress is regarded as neutral if the
marginal product of labor and capital increase by the same amount (i.e., $\Delta MP_L = \Delta MP_K$).
In this scenario, $MRTS = \frac{MP_L}{MP_K}$ is unaffected, because its numerator grows by the same
amount as its denominator. Graphically, this implies that the firm's isoquant does not change
its slope after the technological progress occurred, implying that the firm is willing to give up
as many units of capital as before to hire one more worker.
Example 7.10: Identifying the type of technological progress
Consider the firm in example 7.9. Before the technological change, $MRTS_{Pre} = \frac{\alpha}{\beta}\frac{K}{L}$, while afterward it
becomes $MRTS_{Post} = \frac{2\alpha}{\beta}\frac{K}{L}$. Comparing them, we find that MRTS increased after the
technological progress:
$$MRTS_{Pre} < MRTS_{Post},$$
indicating that the technological progress is labor enhancing. This is also known as
capital-saving progress because it provides incentives to the firm to replace capital by
hiring more workers.
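The comparison in example 7.10 can be sketched in code by evaluating the Cobb-Douglas MRTS before and after the change at the same input pair. The function names, the tolerance, and the numerical values α = β = 0.5, L = 4, K = 9 are assumptions chosen only for illustration.

```python
def mrts_cobb_douglas(labor, capital, alpha, beta):
    """MRTS of q = A * L**alpha * K**beta, which equals (alpha/beta) * (K/L);
    the scale parameter A cancels out."""
    return (alpha / beta) * (capital / labor)


def classify_progress(mrts_pre, mrts_post, tol=1e-12):
    """Label technological progress by how the MRTS changes at given inputs."""
    if mrts_post > mrts_pre + tol:
        return "labor enhancing (capital saving)"
    if mrts_post < mrts_pre - tol:
        return "capital enhancing (labor saving)"
    return "neutral"


labor, capital, alpha, beta = 4.0, 9.0, 0.5, 0.5
pre = mrts_cobb_douglas(labor, capital, alpha, beta)

# Exponent on labor doubles, as in examples 7.9 and 7.10.
print(classify_progress(pre, mrts_cobb_douglas(labor, capital, 2 * alpha, beta)))
# Exponent on capital doubles instead.
print(classify_progress(pre, mrts_cobb_douglas(labor, capital, alpha, 2 * beta)))
# Only the scale parameter A changes, so the MRTS is unaffected.
print(classify_progress(pre, pre))
```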
Self-assessment 7.13 Consider the firm in self-assessment 7.12. Find the MRTS
before and after the technological change. Is this change labor saving, capital saving,
or neutral?
Appendix A. MRTS as the Ratio of Marginal Products
In previous sections of this chapter, we discussed that the slope of the isoquant is the ratio
of marginal products, $\frac{MP_L}{MP_K}$, which we labeled as the firm's MRTS. This appendix formally
shows that, indeed, the slope of the firm's isoquant is measured by $\frac{MP_L}{MP_K}$. We follow an
analogous proof to the one in chapter 2, where we showed that $MRS = \frac{MU_x}{MU_y}$ represents the
slope of a consumer's indifference curve (see appendix A in that chapter).
Consider a firm with production function q = f(L, K), using L units of labor and K units
of capital to produce q units of output. To evaluate the slope of the firm's isoquant, we
simultaneously increase labor (for instance, by 1 unit) and decrease capital.10 Hence, we
totally differentiate the production function q = f(L, K) with respect to L and K to obtain
$$dq = \frac{\partial f(L,K)}{\partial L}dL + \frac{\partial f(L,K)}{\partial K}dK.$$
Note that $\frac{\partial f(L,K)}{\partial L} = MP_L$ represents the marginal product of additional units of labor and,
similarly, $\frac{\partial f(L,K)}{\partial K} = MP_K$ reflects the marginal product of additional units of capital. Hence,
this expression becomes
$$dq = MP_L\,dL + MP_K\,dK.$$
Now, recall that, because we are moving along different points of the firm's isoquants, the
output level is the same across all these points. Therefore, output does not change, entailing
that dq = 0. Plugging this result into the left side of this expression, we obtain
$$0 = MP_L\,dL + MP_K\,dK$$
and, rearranging, $MP_K\,dK = -MP_L\,dL$. Lastly, because we are interested in finding the slope
of the isoquant, we solve for $-\frac{dK}{dL}$. This reflects the rate at which the firm needs to decrease
K if L increases by 1 unit. Solving for $-\frac{dK}{dL}$ yields (see footnote 11)
$$-\frac{dK}{dL} = \frac{MP_L}{MP_K}.$$
Hence, the slope of the firm's isoquant, $-\frac{dK}{dL}$, coincides with the ratio of marginal products,
$\frac{MP_L}{MP_K}$, which we refer to as the MRTS.
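The result of this appendix can be checked numerically for a specific technology. The sketch below assumes the production function of example 7.5, $q = 5L^{1/2}K^{1/2}$, with its isoquant K = 400/L for q = 100, and approximates all derivatives with central finite differences; the function names and step sizes are illustrative choices.

```python
def production(labor, capital):
    """Production function from example 7.5: q = 5 * L**0.5 * K**0.5."""
    return 5 * labor ** 0.5 * capital ** 0.5


def isoquant_capital(labor, q=100.0):
    """Capital on the isoquant of `production` for output q: K = (q/5)**2 / L."""
    return (q / 5) ** 2 / labor


def isoquant_slope(labor, h=1e-5):
    """Numerical dK/dL along the isoquant (a negative number)."""
    return (isoquant_capital(labor + h) - isoquant_capital(labor - h)) / (2 * h)


def marginal_product_ratio(labor, capital, h=1e-6):
    """Numerical MP_L / MP_K at the input pair (labor, capital)."""
    mpl = (production(labor + h, capital) - production(labor - h, capital)) / (2 * h)
    mpk = (production(labor, capital + h) - production(labor, capital - h)) / (2 * h)
    return mpl / mpk


labor = 5.0
capital = isoquant_capital(labor)
# Both numbers should be approximately equal: -dK/dL and MP_L / MP_K.
print(-isoquant_slope(labor), marginal_product_ratio(labor, capital))
```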
Appendix B. Elasticity of Substitution
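A common measure of how easy it is for a firm to substitute labor for capital is the "elasticity
of substitution," presented as follows:
$$\sigma = \frac{\%\Delta\frac{K}{L}}{\%\Delta MRTS} = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{\dfrac{MRTS_B - MRTS_A}{MRTS_A}}.$$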
10. We could also consider the opposite change in inputs, where labor decreases and capital increases. Generally,
we only need to change the amount of labor and capital simultaneously.
11. Starting from $MP_K\,dK = -MP_L\,dL$, we divide both sides by $-dL$, which produces $MP_K\left(-\frac{dK}{dL}\right) = MP_L$, and
then divide both sides by $MP_K$ to obtain $-\frac{dK}{dL} = \frac{MP_L}{MP_K}$.
Figure 7.8
Finding MRTS at two points to obtain σ. [Isoquant with labor L on the horizontal axis, passing through point A ($L_A = 4$, $K_A = 20$) and point B ($L_B = 10$, $K_B = 8$).]
The elasticity of substitution tells us that, if the MRTS increases by 1 percent, the capital-labor
ratio that the firm uses, $\frac{K}{L}$, increases by σ percent.
Figure 7.8 depicts a firm's isoquant to illustrate this elasticity. Starting at point A, the firm
uses a capital-labor ratio $\frac{K_A}{L_A} = \frac{20}{4} = 5$, and the isoquant has a slope of $MRTS_A = 6$. At point
B, however, the capital-labor ratio decreases to $\frac{K_B}{L_B} = \frac{8}{10} = 0.8$ (because the firm uses less
capital and more labor), and the isoquant has a flatter slope of $MRTS_B = 2$.
Hence, the elasticity of substitution is
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{\dfrac{MRTS_B - MRTS_A}{MRTS_A}} = \frac{\frac{0.8 - 5}{5}}{\frac{2 - 6}{6}} = \frac{-0.84}{-\frac{2}{3}} = 1.26,$$
which means that if the MRTS decreases by 2/3 (about 67 percent), then the capital-labor
ratio K/L decreases more than proportionally, by 84 percent. We can better understand the
elasticity of substitution by examining the extreme cases in which the firm can very easily
substitute inputs, and the case in which the firm cannot, as we discuss next.
Linear production function. If the firm has a linear production function q = aL + bK, its
isoquants are straight lines, as described in section 7.7.1. In this scenario, the MRTS is constant
along all the points of the isoquant (recall that the MRTS is equal to $\frac{a}{b}$), implying that
the denominator of the elasticity of substitution formula, σ, is zero (i.e., there is no change
in the MRTS). Therefore, regardless of the percentage change in the capital-labor ratio in
the numerator of σ, the elasticity of substitution is infinite because
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{0} = +\infty.$$
Intuitively, the firm can substitute labor for capital very easily without altering its output
level.
Fixed-proportions production function. Let us now consider the opposite case, where the
firm faces a production function q = A min{aL, bK}. In this setting, the MRTS changes
drastically as we move rightward, from the vertical to the horizontal segment of the
L-shaped isoquant as shown previously in figure 7.7. Therefore, there is a large change
in the denominator of σ, producing a ratio in the formula of σ that converges to zero.
That is,
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{+\infty} = 0.$$
In this case, the firm cannot easily substitute units of labor for capital without affecting
its output level.
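Cobb-Douglas production function. Next, we show that the Cobb-Douglas production
function $q = AL^{\alpha}K^{\beta}$ has an elasticity of substitution exactly equal to 1. First, we rewrite
the definition of the elasticity of substitution, as follows:
$$\sigma = \frac{\%\Delta\frac{K}{L}}{\%\Delta MRTS} = \frac{\dfrac{\Delta\frac{K}{L}}{\frac{K}{L}}}{\dfrac{\Delta MRTS}{MRTS}} = \frac{\Delta\frac{K}{L}}{\Delta MRTS}\cdot\frac{MRTS}{\frac{K}{L}}.$$
Second, we find each of the four terms (two for the first ratio, and two for the second). Let
us start by finding the MRTS in the Cobb-Douglas production function:
$$MRTS = \frac{MP_L}{MP_K} = \frac{\alpha AL^{\alpha-1}K^{\beta}}{\beta AL^{\alpha}K^{\beta-1}} = \frac{\alpha}{\beta}\frac{K}{L}.$$
Rearranging the expression of the MRTS we just found, $MRTS = \frac{\alpha}{\beta}\frac{K}{L}$, yields the capital-labor
ratio $\frac{K}{L}$, as follows:
$$\frac{\beta}{\alpha}MRTS = \frac{K}{L}.$$
Applying increments on both sides, we obtain
$$\frac{\beta}{\alpha}\Delta MRTS = \Delta\frac{K}{L},$$
or, rearranging,
$$\frac{\Delta\frac{K}{L}}{\Delta MRTS} = \frac{\beta}{\alpha}. \quad (7.1)$$
From the expression of the MRTS, $MRTS = \frac{\alpha}{\beta}\frac{K}{L}$, we also know that
$$\frac{MRTS}{\frac{K}{L}} = \frac{\alpha}{\beta}. \quad (7.2)$$
Hence, inserting equations (7.1) and (7.2) into the definition of the elasticity of substitution,
we obtain
$$\sigma = \underbrace{\frac{\Delta\frac{K}{L}}{\Delta MRTS}}_{\text{From (7.1)}}\cdot\underbrace{\frac{MRTS}{\frac{K}{L}}}_{\text{From (7.2)}} = \frac{\beta}{\alpha}\cdot\frac{\alpha}{\beta} = 1.$$
Therefore, the Cobb-Douglas production function $q = AL^{\alpha}K^{\beta}$ has an elasticity of substitution
σ = 1, regardless of the specific value of parameters A, α, and β.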
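The elasticity formula is easy to evaluate for any two points on an isoquant. The sketch below uses a hypothetical helper that takes the capital-labor ratio and the MRTS at points A and B. It is applied to the numbers of figure 7.8, to a Cobb-Douglas function with α = β = 0.5 (an assumed parameterization, for which MRTS = K/L), and to a linear technology with a constant MRTS.

```python
import math


def elasticity_of_substitution(ratio_a, mrts_a, ratio_b, mrts_b):
    """Percentage change in K/L divided by percentage change in MRTS,
    moving from point A to point B."""
    pct_ratio = (ratio_b - ratio_a) / ratio_a
    pct_mrts = (mrts_b - mrts_a) / mrts_a
    if pct_mrts == 0:
        # Constant MRTS (linear isoquant): the elasticity is unbounded.
        return math.inf
    return pct_ratio / pct_mrts


# Figure 7.8: K/L falls from 5 to 0.8 while the MRTS falls from 6 to 2.
print(elasticity_of_substitution(5.0, 6.0, 0.8, 2.0))

# Cobb-Douglas with alpha = beta = 0.5: MRTS equals K/L at both points.
print(elasticity_of_substitution(5.0, 5.0, 0.8, 0.8))

# Linear technology q = 7L + 5K: MRTS stays at 7/5 for any K/L.
print(elasticity_of_substitution(5.0, 1.4, 0.8, 1.4))
```

The first call should reproduce the value of about 1.26 computed from figure 7.8, the second should return 1, and the third an infinite elasticity.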
CES production function. Lastly, we find the elasticity of substitution for the production
function presented in section 7.7.4, reproduced here:
$$q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}.$$
To find its elasticity of substitution, we first obtain its MRTS, as follows:
$$MRTS = \frac{MP_L}{MP_K} = \frac{aL^{-\frac{1}{\sigma}}}{bK^{-\frac{1}{\sigma}}} = \frac{a}{b}\left(\frac{K}{L}\right)^{\frac{1}{\sigma}}.$$
Applying logs on both sides, we find
$$\ln(MRTS) = \ln\left(\frac{a}{b}\right) + \frac{1}{\sigma}\ln\left(\frac{K}{L}\right),$$
which, after rearranging, simplifies to
$$\frac{1}{\sigma}\ln\left(\frac{K}{L}\right) = \ln(MRTS) - \ln\left(\frac{a}{b}\right).$$
Multiplying both sides by σ yields
$$\ln\left(\frac{K}{L}\right) = \sigma\ln(MRTS) - \sigma\ln\left(\frac{a}{b}\right).$$
Therefore, the elasticity of substitution between labor and capital is the derivative of this
expression with respect to ln(MRTS); that is,
$$\frac{\partial\ln\left(\frac{K}{L}\right)}{\partial\ln(MRTS)} = \sigma,$$
coinciding with term σ in the exponent of the CES production function.
Exercises
1. Properties of production functions.B A firm uses only one input, labor (L), to produce output
with production function $q(L) = 3L^2 + 0.5L - 0.6L^3$.
(a) Total product. For which values of L does the total product curve q(L) increase or decrease?
For which values is it concave or convex in labor?
(b) Marginal product. For which values of L does the marginal product curve $\frac{\partial q(L)}{\partial L}$ increase or
decrease? For which values is it concave or convex in labor?
(c) Average product. For which values of L does the average product curve $\frac{q(L)}{L}$ increase or
decrease? For which values is it concave or convex in labor?
(d) Find the value of L where the marginal product curve crosses the total product curve, and
where it crosses the average product curve.
2. MP and AP curves.A Sarah is looking into producing her homemade dog treats on a larger
scale and is contemplating two different kitchen sizes (K). Her production of dog treats follows
$q = 200KL + K^2L^3$. What are the marginal and average product curves for labor when
K = 5? What happens to the marginal and average product curves when her kitchen doubles to
K = 10?
3. Where the MP and AP curves cross.A A firm has the production function $q = 50L - 2L^2 - 10$.
At what level of labor do the marginal product and average product curves cross?
4. Isoquants.B Graph the isoquants for the following production functions for an output q = 100
units:
(a) $q = 10K + 5L$.
(b) $q = K^{0.75}L^{0.25}$.
(c) $q = 5\min\{10K, 5L\}$.
5. MP and changing capital.B A firm produces stickers, s, using capital and labor in its production,
with the production function $s = 10K + 20L^2 - K^3L^3$. Does the marginal product of labor increase
or decrease as capital increases?
6. Cobb-Douglas and MRTS.A Consider the Cobb-Douglas production function
$q(K, L) = 5L^{0.75}K^{0.25}$. Find the marginal products of labor and capital, and the MRTS.
7. Linear production and MRTS.A Jack produces water bottles and wants to know more about his
production function. He finds that it follows the function $q(K, L) = 20K + 30L + 10KL$. Find the
marginal products of labor and capital, as well as the MRTS.
8. Nonlinear production and MRTS.B Jack (from exercise 7) is at the end of the lease in his current
factory and is considering moving his production to a new space. With this new space comes a new
production function: $q(K, L) = 0.5L + L^{0.6}K^{0.4}$. Find the marginal products of labor and capital,
as well as the MRTS.
9. Returns to scale–I.A Do the following production functions exhibit increasing, decreasing, or
constant returns to scale?
(a) $q = 5K^{0.7}L^{0.3}$.
(b) $q = K^{0.5}L^{0.6}$.
(c) $q = 2K + 4L$.
10. Returns to scale–II.C Do the following production functions exhibit increasing, decreasing, or
constant returns to scale?
(a) $q = 2K + 4L + 5$.
(b) $q = 3K^{0.5}L^{0.5} - 2$.
(c) $q = 3K^{0.75}L^{0.25} + \sqrt{L}$.
11. Decreasing marginal returns.A A bakery that specializes in cupcakes has the production func-
tion for cupcakes of c = 10L − 0.5L2 in their current space. What is the firm’s marginal product?
At what amount of labor does the firm’s output begin to decrease (i.e., when does there start to be
“too many cooks in the kitchen”)?
12. Technological change.A A firm has a production function that changes from q = 7L + 2K to
q = 10L + 5K. Is the firm experiencing technological progress? Find the MRTS before and after
the technological change. Is this change labor saving, capital saving, or neutral?
13. Choosing production.A Eric is a manager of a firm that produces playing cards. He is investing in
a new technology and has two options with the resulting production functions: (a) $q = 10L^{0.5}K^{0.5}$,
and (b) $q = 10(L^{0.5} + K^{0.5})$. If the firm has 100 units of capital, when would the firm prefer
technology (a) over technology (b)?
14. Technology change with a Cobb-Douglas production function.A Julie's Candy Factory produces
candy bars with a production process that follows a Cobb-Douglas production function.
She invests in a new technology that changes the production function from $q = L^{0.25}K^{0.5}$ to
$q = L^{0.25}K^{0.75}$. Is Julie experiencing technological progress, or was her investment a waste of
time and money? Find the MRTS before and after the technological change. Is this change labor
saving, capital saving, or neutral?
15. CES production function and marginal products.B Find the marginal products of labor and
capital for the CES production function $q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$.
16. CES production and returns to scale.B Find the returns to scale for the CES production function
$q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$.
17. Choosing factors of production.B Ashley is a producer of water purifiers for use in remote
African villages and relies heavily on donations and volunteers in her production of water puri-
fiers. In 1 hour, a worker can make 5 purifiers with 10 units of capital, or 2 workers can make
10 purifiers with 15 units of capital. As of Thursday evening, Ashley has 20 committed volun-
teers (for 5 hours on Saturday) and 100 units of capital. Ashley has 24 hours to round up more
volunteers, more capital, or both. What should she do?
18. Three inputs to production.C Many firms have more than two inputs to their production func-
tions. An example of this might be a microchip manufacturer with the production function
q = A0.58 L0.19 K 0.23 , where A is the materials and energy, and L and K are labor and capital,
respectively.
182 Chapter 7
(a) What type of returns to scale does this production function exhibit?
(b) Find the marginal products of materials \(MP_A\), labor \(MP_L\), and capital \(MP_K\).
(c) Find the MRTS between capital and material \(MRTS_{A,K}\), and between capital and labor \(MRTS_{L,K}\). Explain what the difference means.
19. Comparing MRTS.C Tony and Chris are studying for exams to be certified public accountants (CPAs). Tony prefers reading laws and regulations (R), while Chris prefers practicing audits (A). Tony’s score follows the function \(S_T = 10A^{0.65}R^{0.35}\), while Chris’s score follows \(S_C = 10A^{0.4}R^{0.6}\).
(a) At what point will their MRTS between practicing audits and reading regulations be equal?
(b) Tony and Chris went to college together and know that each of them is studying for the CPA exam, and they plan to study together. Their score functions change by adding a variable representing the time they spend studying together, T, such that \(S_T = 10A^{0.65}R^{0.35} + T^{0.5}\) and \(S_C = 10A^{0.4}R^{0.6} + T^{0.5}\). Find the MRTS between the time spent studying together and their preferred method of studying. Can we tell who is willing to give up more of his preferred method of studying to study together? If so, who?
20. Comparing Cobb-Douglas production functions.C Is it possible for two firms with different Cobb-Douglas production functions, \(q_1 = AL^{\alpha}K^{\beta}\) and \(q_2 = L^{\alpha}K^{\beta}\), to have the same marginal products at particular levels of capital and labor? What about each firm’s MRTS?
8 Cost Minimization
8.1 Introduction
The isoquant curves analyzed in chapter 7 describe the input combinations with which the firm reaches a specific output level. However, isoquants alone do not tell us which exact input combination the firm chooses to minimize its costs. In this chapter, we combine isoquants, which depict input combinations that reach a given output level, with isocost lines, which depict input combinations entailing the same cost, in order to identify the optimal amounts of labor and capital that the firm hires.
The chapter begins by examining the isocost line and then combining it with isoquants
to find the cost-minimizing amount of labor and capital that the firm acquires. For com-
pactness, we refer to this input pair as “input demand.” We then investigate whether a firm’s
input demand (e.g., the number of software developers that the firm hires) is decreasing in
that input’s price (e.g., the salary of software developers) but increasing in the price of other
inputs (e.g., salaries of other types of employees, and the cost of capital). We also evaluate the
firm’s cost at the optimal units of labor and capital to obtain its “cost function.”
Finally, we decompose the firm’s cost function into various types of costs (such as sunk,
nonsunk, fixed, and variable), and analyze how to measure average and marginal costs. We
conclude the chapter with a discussion of how costs vary when the firm increases in scale
and when it adds more product lines.
8.2 Isocost Lines
In this section, we describe how to represent input combinations that result in the same total
cost for the firm.
Isocost line The set of input combinations (i.e., pairs of labor and capital amounts) that yield the same total cost for the firm. That is, the combinations of L and K for which
TC = wL + rK,
where w > 0 denotes the price of every unit of labor (wage per hour), r > 0 represents the cost of each unit of capital (interest rate), and TC is a given total cost that the firm incurs.
[Figure 8.1a. Isocost line \(K = \frac{TC}{r} - \frac{w}{r}L\), with L on the horizontal axis and K on the vertical axis; the vertical intercept is \(\frac{TC}{r}\), the horizontal intercept is \(\frac{TC}{w}\), and the slope is \(-\frac{w}{r}\).]
Figure 8.1a depicts the isocost line TC = wL + rK with units of L on the horizontal axis
and units of K on the vertical axis. As usual, we solve for the variable on the vertical axis (K),
to obtain
\[ K = \frac{TC}{r} - \frac{w}{r}L. \]
Hence, \(\frac{TC}{r}\) represents the vertical intercept of the isocost line, whereas \(\frac{w}{r}\) denotes its negative
slope.1 The firm faces a linear isocost line like that in figure 8.1a, regardless of its production
function q = f (L, K), because the isocost line is just an accounting sum of costs (i.e., the
costs that the firm incurs when hiring L workers plus those from acquiring K units of capital).
An increase in the total cost that the firm incurs, TC, increases both the vertical intercept of the isocost line, \(\frac{TC}{r}\), and its horizontal intercept, \(\frac{TC}{w}\), without altering its slope, \(-\frac{w}{r}\); thus producing a parallel upward shift in the isocost line (moving northeast). Intuitively, as the firm can incur a larger cost, it can choose among higher input combinations.
If wages increase, the vertical intercept of the isocost, \(\frac{TC}{r}\), is unaffected, but its horizontal intercept, \(\frac{TC}{w}\), decreases, thus making the isocost steeper.2 In other words, the firm can afford to hire fewer workers as their wages increase. If the interest rate r increases, the vertical intercept of the isocost, \(\frac{TC}{r}\), decreases, flattening the isocost as a result. Therefore, the firm can afford fewer units of capital as its price increases.
1. To find the horizontal intercept, recall that we only need to set the variable on the vertical axis (K) equal to zero in the equation of the isocost to obtain \(TC = wL + r \cdot 0\). Solving for L, we find a horizontal intercept of \(L = \frac{TC}{w}\).
2. Alternatively, you can see this result by noticing that an increase in w decreases the ratio \(\frac{TC}{w}\), thus shifting the horizontal intercept of the isocost line leftward, without changing the vertical intercept \(\frac{TC}{r}\).
Example 8.1: A particular isocost
Consider a firm facing a wage of w = $10, a price for capital of r = $15, and incurring a total cost of TC = $200. Its isocost line would be 200 = 10L + 15K. Solving for K yields \(K = \frac{200}{15} - \frac{10}{15}L\), as depicted in figure 8.1b.
The first term in the equation of the isocost line, \(\frac{200}{15} \simeq 13.3\), is the vertical intercept; the second term, \(\frac{10}{15} = \frac{2}{3}\), represents its negative slope; and \(\frac{200}{10} = 20\) is its horizontal intercept.
[Figure 8.1b. An example of an isocost line: \(K = \frac{200}{15} - \frac{10}{15}L\), with vertical intercept \(\frac{200}{15} \simeq 13.3\), horizontal intercept \(\frac{200}{10} = 20\), and slope \(-\frac{2}{3}\).]
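The intercepts and slope of an isocost line can also be computed with a few lines of code. The Python sketch below is only an illustration: the function name isocost and its argument names are our own assumptions, not notation used elsewhere in the chapter. It evaluates the isocost of example 8.1.

```python
def isocost(total_cost, wage, rental_rate):
    """Return (vertical intercept, horizontal intercept, slope) of TC = wL + rK.

    Labor L is measured on the horizontal axis and capital K on the vertical axis.
    """
    vertical_intercept = total_cost / rental_rate   # K when L = 0, i.e., TC/r
    horizontal_intercept = total_cost / wage        # L when K = 0, i.e., TC/w
    slope = -wage / rental_rate                     # dK/dL along the line, i.e., -w/r
    return vertical_intercept, horizontal_intercept, slope


# Values from example 8.1: TC = 200, w = 10, r = 15.
k_max, l_max, slope = isocost(total_cost=200, wage=10, rental_rate=15)
print(round(k_max, 1), l_max, round(slope, 3))  # prints: 13.3 20.0 -0.667
```

Changing one argument at a time shows how the line shifts (a change in TC) or pivots (a change in w or r).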
Self-assessment 8.1 Consider the firm in example 8.1, but assume that wages
double to w = $20. Find the firm’s isocost in this scenario, and interpret how it changed
relative to that in figure 8.1b.
8.3 Cost-Minimization Problem
In this section, we combine the isoquant learned in chapter 7 with the isocost discussed
in section 8.2 to determine how many units of labor and capital the firm optimally hires.
Figure 8.2 depicts an isoquant where the firm produces 100 units of output, with a set of
isocost lines superimposed, each one associated with a different total cost, TC.
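[Figure 8.2. Cost-minimization problem (CMP). Isoquant for q = 100 units with two isocost lines superimposed, one for total cost TC (intercepts \(\frac{TC}{r}\) and \(\frac{TC}{w}\)) and one for a higher total cost TC′ (intercepts \(\frac{TC'}{r}\) and \(\frac{TC'}{w}\)); L is on the horizontal axis and K on the vertical axis. Points A, B, C, and D are marked.]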
The cost-minimization problem (CMP) can be represented as follows:
\[ \min_{L,\,K} \; TC = wL + rK \quad \text{subject to} \quad 100 = f(L, K). \]
Intuitively, this problem asks the firm:
Choose the input combination that minimizes your total cost TC, reaching an output level of q =
100 units.
As illustrated in figure 8.2, the CMP entails pushing the isocost line inward, because isocost lines closer to the origin are associated with lower total costs, for as long as the isocost still reaches the isoquant where the firm produces q = 100 units.3 Points like B or C in figure 8.2 cannot be cost minimizing because, while the firm reaches the isoquant of q = 100 units, it does so at a cost that could still be reduced by moving to points closer to A. At point A, the firm minimizes its total cost, TC, and reaches the isoquant q = 100. Cheaper input combinations, such as that at point D, do not reach the target isoquant, q = 100, and thus fail to solve this CMP.
3. Recall that a decrease in TC shifts the isocost inward and in a parallel fashion, because TC is in the numerator of both the vertical and horizontal intercepts of the isocost line.
As a result, combinations of labor and capital minimizing the firm’s cost require that the firm’s isoquant is tangent to its isocost, which implies that the slope of the isoquant (\(MRTS = \frac{MP_L}{MP_K}\)) and that of the isocost (\(\frac{w}{r}\)) coincide; that is,
\[ \frac{MP_L}{MP_K} = \frac{w}{r}. \]
We can cross-multiply this expression to rewrite it as
\[ \frac{MP_L}{w} = \frac{MP_K}{r}. \]
Therefore, when minimizing its total cost, the firm rearranges its inputs until the point where the marginal product per dollar spent on additional units of labor (e.g., hiring one more worker), \(\frac{MP_L}{w}\), coincides with the marginal product per dollar spent on additional units of capital, \(\frac{MP_K}{r}\). Informally, the bang for the buck must be the same across all inputs.4
If this condition does not hold, the firm still has incentives to readjust its input combination. For instance, if \(\frac{MP_L}{w} > \frac{MP_K}{r}\), the firm could decrease its total costs (and still reach the target production level q = 100) by acquiring fewer units of capital, and using the savings to hire more workers, given that they provide a larger marginal product per dollar to the firm. This is what happens at point B, where the isoquant is steeper than the isocost, entailing that \(\frac{MP_L}{MP_K} > \frac{w}{r}\), or rearranging, \(\frac{MP_L}{w} > \frac{MP_K}{r}\), ultimately providing the firm with incentives to move southeast, towards point A, where it uses fewer units of capital and more units of labor.5 (For completeness, the appendix at the end of the chapter shows that solving the firm’s CMP yields the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\) or, in its bang for the buck version, \(\frac{MP_L}{w} = \frac{MP_K}{r}\).)
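The bang-for-the-buck comparison can be illustrated numerically. The Python sketch below assumes, purely for illustration, the Cobb-Douglas technology \(q = L^{1/2}K^{1/2}\) and the input prices w = $40 and r = $10 used in the chapter’s examples; the function names and the three test bundles are hypothetical choices of ours. All three bundles lie on the isoquant q = 100.

```python
def marginal_products(labor, capital):
    """MP_L and MP_K for the assumed technology q = L^(1/2) * K^(1/2)."""
    mp_l = 0.5 * (capital / labor) ** 0.5
    mp_k = 0.5 * (labor / capital) ** 0.5
    return mp_l, mp_k


def suggested_adjustment(labor, capital, wage, rental_rate, tol=1e-9):
    """Compare MP_L/w with MP_K/r and report the cost-reducing direction."""
    mp_l, mp_k = marginal_products(labor, capital)
    gap = mp_l / wage - mp_k / rental_rate
    if gap > tol:
        return "more labor, less capital"   # labor has the larger bang for the buck
    if gap < -tol:
        return "less labor, more capital"   # capital has the larger bang for the buck
    return "tangency holds"


for bundle in [(25, 400), (50, 200), (100, 100)]:
    cost = 40 * bundle[0] + 10 * bundle[1]
    print(bundle, cost, suggested_adjustment(*bundle, wage=40, rental_rate=10))
```

The first bundle behaves like point B in the discussion above (too much capital), the third like point C (too much labor), and only the middle bundle equalizes the marginal product per dollar across inputs; it is also the cheapest of the three.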
As in consumer theory, we present a four-step procedure to solve the firm’s CMP in tool 8.1, and after that, illustrate it with two numerical examples.6
Tool 8.1. Procedure to solve the CMP:
1. Set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\). Cross-multiply and simplify.
2. If the tangency condition:
a. Contains both unknowns L and K, then solve for K, and insert the resulting expression into the firm’s output target q = f (L, K).
b. Contains only one unknown (input L or input K, but not both), then solve for that unknown. Afterwards, insert your result into the firm’s output target q = f (L, K).
c. Contains no input L or K, then compare \(\frac{MP_L}{w}\) against \(\frac{MP_K}{r}\). If \(\frac{MP_L}{w} > \frac{MP_K}{r}\), then set K = 0 in the firm’s output target, and solve for L. If, instead, \(\frac{MP_L}{w} < \frac{MP_K}{r}\), then set L = 0 in the firm’s output target, and solve for K.
3. If in step 2 you find that one of the inputs is negative (e.g., L = −2), then set the amount of that input equal to zero on the firm’s output target (e.g., \(q = a \cdot 0 + bK\)), and solve for the remaining input.
4. If you haven’t found the values for all the unknowns L and K yet, then use the tangency condition from step 1 to find the remaining unknown.

4. This result is analogous to what we found in consumer theory, where the individual rearranged her purchases of goods x and y until the point where the marginal utility per dollar spent on an additional unit of good x coincided with that of good y.
5. The opposite argument applies to point C, where the isoquant is flatter than the isocost, implying that \(\frac{MP_L}{MP_K} < \frac{w}{r}\) or, after rearranging, \(\frac{MP_L}{w} < \frac{MP_K}{r}\). In this case, the firm has incentives to hire fewer workers and acquire more units of capital (moving northwest in the figure towards point A).
6. As with tool 3.1 in chapter 3, which solves the utility maximization problem (UMP) for the consumer, tool 8.1 applies when no input has a negative marginal product. If it does, the firm would hire zero units of that input, and use all its resources to hire units of the other input. For instance, if \(MP_K < 0\), the firm hires K = 0 units of capital and \(L = \frac{TC}{w}\) workers.
Example 8.2: Cost minimization with Cobb-Douglas production functions
Consider a firm with the Cobb-Douglas production function \(q = L^{1/2}K^{1/2}\), seeking to reach an output level of q = 100 units, and facing input prices w = $40 and r = $10.
Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields
\[ \frac{\frac{1}{2}L^{-1/2}K^{1/2}}{\frac{1}{2}L^{1/2}K^{-1/2}} = \frac{40}{10}, \]
or rearranging, \(\frac{K}{L} = 4\), which, solving for K, simplifies to K = 4L. Because this result contains both inputs, K and L, we now move on to step 2a.
Step 2a. Inserting the result from step 1, K = 4L, into the output target of the firm, q = 100, \(100 = L^{1/2}K^{1/2}\), we obtain
\[ 100 = L^{1/2}\big(\underbrace{4L}_{K}\big)^{1/2}, \]
where K = 4L from the tangency condition. Rearranging this, we obtain \(100 = (4)^{1/2}L\) or, solving for L,
\[ L = \frac{100}{(4)^{1/2}} = \frac{100}{2} = 50 \text{ workers}. \]
Because the firm hires a positive number of workers, we can now move on to step 4.
(Recall that we need to go on to step 3 only if we find a negative amount of either
input.)
Step 4. Lastly, we can plug this result into the tangency condition K = 4L, to find that
K = 4 × 50 = 200 units of capital.
Summary. The cost-minimizing input combination is then (L, K) = (50, 200). Intu-
itively, the firm uses more capital than labor because labor is four times as expensive
as capital, while their marginal productivities are symmetric.
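The steps of example 8.2 can be sketched in code for a general Cobb-Douglas technology \(q = L^{\alpha}K^{\beta}\). This is a minimal illustration under our own assumptions: the function name cobb_douglas_cmp and the parameters alpha and beta are hypothetical, and the formula presumes an interior solution with positive exponents.

```python
def cobb_douglas_cmp(q, wage, rental_rate, alpha=0.5, beta=0.5):
    """Cost-minimizing (L, K) for q = L**alpha * K**beta, following tool 8.1."""
    # Step 1 (tangency): (alpha/beta) * (K/L) = w/r, so K = ratio * L.
    ratio = (beta * wage) / (alpha * rental_rate)
    # Step 2a (output target): q = L**alpha * (ratio * L)**beta; solve for L.
    labor = (q / ratio ** beta) ** (1.0 / (alpha + beta))
    # Step 4: plug L back into the tangency condition to recover K.
    capital = ratio * labor
    return labor, capital


# Example 8.2: q = 100, w = 40, r = 10, with alpha = beta = 1/2.
labor, capital = cobb_douglas_cmp(q=100, wage=40, rental_rate=10)
print(labor, capital)  # prints: 50.0 200.0
```

Calling the function with wage=20 instead gives one way to check self-assessment 8.2.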
Self-assessment 8.2 Repeat the analysis in example 8.2, but assume that wages
decrease to w = $20. How are the results in example 8.2 affected?
Example 8.3: Cost minimization with linear production functions
Consider the scenario of example 8.2, but assume that the firm’s production function is linear, q = 2L + 8K.
Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), yielding \(\frac{2}{8} = \frac{40}{10}\), which cannot hold because each side represents a different number! Our result from the tangency condition contains neither input, K or L, so we move on to step 2c.
Step 2c. We obtained that \(\frac{2}{8} < \frac{40}{10}\), which entails that \(\frac{MP_L}{MP_K} < \frac{w}{r}\) or, after cross-multiplying, \(\frac{MP_L}{w} < \frac{MP_K}{r}\). As a consequence, the firm increases its purchases of capital as much as possible, leading to a corner solution where the firm only purchases capital but no labor (L = 0).
Step 4. Inserting L = 0 into the output target of the firm, 100 = 2L + 8K (recall that the firm seeks to produce q = 100 units of output), yields 100 = (2 × 0) + 8K. Solving for K, we obtain \(K = \frac{100}{8} = 12.5\) units of capital.
Summary. The cost-minimizing input combination is (L, K) = (0, 12.5).
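Step 2c of tool 8.1 is easy to express in code for a linear technology q = aL + bK. The Python sketch below is illustrative only: the function name linear_cmp and the parameters a and b are our own assumptions, and returning None in the tie case is a design choice that signals that every bundle on the isoquant is optimal.

```python
from math import isclose


def linear_cmp(q, wage, rental_rate, a=2.0, b=8.0):
    """Cost-minimizing (L, K) for q = a*L + b*K, or None if any bundle on the isoquant works."""
    bang_labor = a / wage              # MP_L / w
    bang_capital = b / rental_rate     # MP_K / r
    if isclose(bang_labor, bang_capital):
        return None                    # isocost and isoquant overlap
    if bang_labor > bang_capital:
        return q / a, 0.0              # labor only: set K = 0 and solve q = a*L
    return 0.0, q / b                  # capital only: set L = 0 and solve q = b*K


# Example 8.3: q = 100, w = 40, r = 10.
print(linear_cmp(100, 40, 10))  # prints: (0.0, 12.5)
```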
Self-assessment 8.3 Repeat the analysis in example 8.3, but assuming that wages
decrease to w = $20. How are the results in example 8.3 affected?
8.4 Input Demands
Examples 8.2–8.3 identified a specific number of units of labor and capital being employed
by the firm to minimize its costs, namely (L, K) = (50, 200) in example 8.2 and (L, K) =
(0, 12.5) in example 8.3.
We can now use that analysis in a more general setting, where input prices (w and r) are not specific numbers, and similarly, where the output q that the firm seeks to reach is not a concrete number of units. For illustration purposes, we reproduce example 8.2 in example 8.4, where the firm faced a Cobb-Douglas production function \(q = L^{1/2}K^{1/2}\), without assigning specific values to input prices w and r, or to output level q.
Example 8.4: Finding input demands with a Cobb-Douglas production function
Consider the firm in example 8.2 again. We follow a similar procedure as in that example.
Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields
\[ \frac{\frac{1}{2}L^{-1/2}K^{1/2}}{\frac{1}{2}L^{1/2}K^{-1/2}} = \frac{w}{r} \]
or, rearranging, \(\frac{K}{L} = \frac{w}{r}\), which solving for K simplifies to \(K = \frac{w}{r}L\). Because this result contains both inputs, K and L, we now move on to step 2a.
Step 2a. We now insert the result from step 1, \(K = \frac{w}{r}L\), into the output target of the firm, \(q = L^{1/2}K^{1/2}\), to obtain
\[ q = L^{1/2}\Big(\underbrace{\frac{w}{r}L}_{K}\Big)^{1/2}. \]
Rearranging, we obtain \(q = \left(\frac{w}{r}\right)^{1/2}L\) and, solving for L, we find the firm’s labor demand:
\[ L = \frac{q}{\left(\frac{w}{r}\right)^{1/2}} = \frac{q\sqrt{r}}{\sqrt{w}}. \]
Step 4. Finally, we plug labor demand \(L = \frac{q\sqrt{r}}{\sqrt{w}}\) into the tangency condition, \(K = \frac{w}{r}L\), to find that capital demand is
\[ K = \frac{w}{r}\underbrace{\frac{q\sqrt{r}}{\sqrt{w}}}_{L} = \frac{q\sqrt{w}}{\sqrt{r}}. \]
If we evaluate these input demands at the parameter values we considered in example 8.2, q = 100 units, w = $40, and r = $10, we obtain the same results as in that example; that is, \(L = \frac{100\sqrt{10}}{\sqrt{40}} = 50\) workers, and \(K = \frac{q\sqrt{w}}{\sqrt{r}} = \frac{100\sqrt{40}}{\sqrt{10}} = 200\) units of capital.
We can now do comparative statics of the labor and capital demands in example 8.4. Starting with labor demand \(L = \frac{q\sqrt{r}}{\sqrt{w}}\), we find that it is increasing in the number of units that the firm seeks to produce, q; decreasing in the salary that the firm pays to workers, w; and increasing in the price of capital, r. In other words, as the firm seeks to produce more units, it needs to hire more workers; as it faces higher salaries, it responds by hiring fewer workers; and as capital becomes more expensive, labor becomes relatively more attractive, and thus the firm responds by hiring more workers. Similar results apply to the demand for capital, \(K = \frac{q\sqrt{w}}{\sqrt{r}}\), which is also increasing in the number of units the firm produces, q; decreasing in the price of capital, r; but increasing in the price of labor, w.
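These comparative statics can be checked numerically. The Python sketch below encodes the input demands of example 8.4 and raises q, w, and r by 10 percent, one at a time; the function names, the base values, and the 10 percent bump are illustrative assumptions of ours.

```python
from math import sqrt


def labor_demand(q, w, r):
    """Labor demand from example 8.4: L = q * sqrt(r) / sqrt(w)."""
    return q * sqrt(r) / sqrt(w)


def capital_demand(q, w, r):
    """Capital demand from example 8.4: K = q * sqrt(w) / sqrt(r)."""
    return q * sqrt(w) / sqrt(r)


base = dict(q=100, w=40, r=10)
print(round(labor_demand(**base), 6), round(capital_demand(**base), 6))  # prints: 50.0 200.0

# Raise one parameter at a time by 10 percent and report the direction of change.
for name in ("q", "w", "r"):
    bumped = dict(base, **{name: base[name] * 1.1})
    dl = labor_demand(**bumped) - labor_demand(**base)
    dk = capital_demand(**bumped) - capital_demand(**base)
    print(name, "L up" if dl > 0 else "L down", "K up" if dk > 0 else "K down")
```

The loop should report that both demands rise with q, that a higher w lowers L and raises K, and that a higher r raises L and lowers K.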
Self-assessment 8.4 Repeat the analysis in example 8.4, but assume that the firm’s production function changes to \(q = 4L^{1/3}K^{1/2}\). How are the results in example 8.4 affected? Interpret the results.
Example 8.5: Finding input demands with a linear production function
Consider the scenario of example 8.3, where the production function was q = 2L + 8K.
Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields \(\frac{2}{8} = \frac{w}{r}\). Because this result does not contain either input, K or L, we now move on to step 2c.
Step 2c. Comparing the marginal product per dollar across inputs, we obtain that
\[ \frac{MP_L}{w} < \frac{MP_K}{r} \quad \text{if} \quad \frac{2}{w} < \frac{8}{r}, \]
simplifying to \(\frac{1}{4} < \frac{w}{r}\), which induces the firm to hire no workers (L = 0). Otherwise, the marginal product per dollar spent on labor is higher than that on capital, entailing that the firm hires no capital (K = 0).
Step 4. If \(\frac{1}{4} < \frac{w}{r}\), the firm hires no workers (L = 0), which we can insert into the output target q = 2L + 8K (which is now evaluated at a generic output level q), yielding q = (2 × 0) + 8K. Solving for K, we obtain a demand for capital of \(K = \frac{q}{8}\), which is increasing in output. If, instead, \(\frac{1}{4} > \frac{w}{r}\), the firm hires no capital, K = 0, and its demand for labor is found by inserting K = 0 into the output target, which yields q = 2L + (8 × 0), or \(L = \frac{q}{2}\), which is also increasing in output.7
In terms of comparative statics, note that labor and capital are increasing in the output q that the firm seeks to produce, as in example 8.4. However, an increase in salary w does not generally affect the firm’s demand for labor and capital. There is one scenario in which a higher salary w produces a change in input demands: when \(\frac{1}{4} > \frac{w}{r}\), the firm produces using labor alone, \((L, K) = \left(\frac{q}{2}, 0\right)\); but if w increases enough to yield \(\frac{1}{4} < \frac{w}{r}\), the firm changes its input usage completely to \((L, K) = \left(0, \frac{q}{8}\right)\), thus using capital alone.
Self-assessment 8.5 Repeat the analysis in example 8.5, but assume that the firm’s production function changes to q = 3L + 8K. How are the results in example 8.5 affected? Interpret.
7. When the marginal product per dollar coincides across inputs, \(\frac{1}{4} = \frac{w}{r}\), the isocost and isoquant overlap, which gives rise to a continuum of solutions (i.e., all points along the isoquant are optimal). That is, any point satisfying the equation of the isoquant, q = 2L + 8K, is optimal.
[Figure 8.3. (a) Labor demand, \(L = \frac{316.2}{\sqrt{w}}\), with w on the horizontal axis and L on the vertical axis. (b) Capital demand, \(K = \frac{632.4}{\sqrt{r}}\), with r on the horizontal axis and K on the vertical axis.]
8.4.1 Input Demand—Responses

Response to changes in its own price. The demand for an input decreases as its price increases, implying that the input demand has a negative slope. In example 8.4, for instance, the demand for labor decreases in w and the demand for capital decreases in r. Assuming q = 100 units and r = $10, the demand for labor in that example, \(L = \frac{q\sqrt{r}}{\sqrt{w}}\), becomes \(L = \frac{100\sqrt{10}}{\sqrt{w}} \simeq \frac{316.2}{\sqrt{w}}\), which is clearly decreasing in w, as depicted in figure 8.3a. Similarly, assuming q = 100 units and w = $40, the demand for capital \(K = \frac{q\sqrt{w}}{\sqrt{r}}\) simplifies to \(K = \frac{100\sqrt{40}}{\sqrt{r}} \simeq \frac{632.4}{\sqrt{r}}\), which is also decreasing in its own price r, as plotted in figure 8.3b.
r
The sensitivity of input demand to variations in its price is often measured using price elasticity. For the case of labor demand, its price elasticity is
\[ \varepsilon_{L,w} = \frac{\%\Delta L}{\%\Delta w} = \frac{\frac{\Delta L}{L}}{\frac{\Delta w}{w}} = \frac{\Delta L}{\Delta w}\frac{w}{L}, \]
or, if the change in salary w is infinitely small, \(\varepsilon_{L,w} = \frac{\partial L}{\partial w}\frac{w}{L}\), where \(\frac{\partial L}{\partial w}\) represents the slope of the labor demand curve. Intuitively, if salary w increases by 1 percent, the firm would reduce the number of workers it hires by \(\varepsilon_{L,w}\) percent. A similar expression applies to the elasticity of capital with respect to its price r, \(\varepsilon_{K,r} = \frac{\partial K}{\partial r}\frac{r}{K}\).
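As a quick numerical check of this definition, the Python sketch below approximates the own-price elasticity of the Cobb-Douglas labor demand from example 8.4. The central-difference approximation of the slope, the step size, and the function names are our own illustrative assumptions. For this demand, \(L = q\sqrt{r}\,w^{-1/2}\), the elasticity should be close to −1/2 at any wage, meaning that a 1 percent increase in w reduces labor demand by about 0.5 percent.

```python
from math import sqrt


def labor_demand(q, w, r):
    """Labor demand from example 8.4: L = q * sqrt(r) / sqrt(w)."""
    return q * sqrt(r) / sqrt(w)


def own_price_elasticity(q, w, r, step=1e-6):
    """Approximate (dL/dw) * (w/L) using a central difference for dL/dw."""
    slope = (labor_demand(q, w + step, r) - labor_demand(q, w - step, r)) / (2 * step)
    return slope * w / labor_demand(q, w, r)


print(round(own_price_elasticity(q=100, w=40, r=10), 4))  # prints: -0.5
```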
To better understand this elasticity, let us consider some of the common production functions examined in previous sections of the chapter. In the case of the fixed-proportion production function, the firm does not change its input combination when input prices change, so input demand is vertical when drawn with the input’s price on the vertical axis. Hence, labor demand does not respond to the wage, \(\frac{\partial L}{\partial w} = 0\), yielding zero elasticity, \(\varepsilon_{L,w} = 0\). In contrast, when the firm faces a linear production function, its input demand is flat (again with the price on the vertical axis).8 In this case, a minor increase in the wage at the price ratio where the firm switches inputs leads it to stop hiring labor altogether, so \(\frac{\partial L}{\partial w} = -\infty\), yielding \(\varepsilon_{L,w} = -\infty\). (Similar arguments apply to the slope of the demand for capital and its elasticity.)
Response to changes in the price of the other input. The demand for an input may increase as we increase the price of the other input, thus shifting upward. For instance, as labor becomes more expensive (higher wages), capital becomes more attractive. In example 8.4, for instance, the demand for capital \(K = \frac{q\sqrt{w}}{\sqrt{r}}\) increases in salaries, w; and, similarly, the demand for labor \(L = \frac{q\sqrt{r}}{\sqrt{w}}\) increases in the price of capital, r. Graphically, the demand function for labor (capital) in figure 8.3a (figure 8.3b) would shift outwards as the other input, capital (labor), becomes more expensive.
Response to changes in output. When the firm increases its demand for input as it seeks
to produce more units of output q, we say that such input is regarded as normal, whereas
when its demand decreases in q, we say that the input is inferior. For aggregate categories
of inputs, such as labor or capital, we rarely see firms using fewer of them as they seek to
produce more output. However, if we disaggregate an input into more specific categories, we
can quickly recognize some inputs as decreasing in output (inferior). For instance, consider
a firm with different types of labor, such as chief executive officers, midlevel managers,
sellers, accountants, secretaries, information technology personnel, and janitors. While the
firm may initially hire more workers in all categories as it increases its output, it may be
possible for the firm to sign software contracts when its output is large enough, and as a
result fire some accountants or secretaries, who would become inferior inputs.
8.5 Cost Functions
Total cost The expenditures that a firm incurs when hiring the optimal amounts of
labor and capital identified by its labor and capital demand; that is,
\[ TC = wL^{*} + rK^{*}. \]
For illustration purposes, we identify the total cost that emerges from the labor and capital
demands found in example 8.4 for a Cobb-Douglas production function.
8. Recall that a flat demand curve for an input entails that a minor change in its price leads the firm to stop using this input, and switch to use only the other input. Intuitively, because both inputs can be easily substituted without altering the number of units produced, q, the firm uses the input with the highest marginal product per dollar: only labor if \(\frac{MP_L}{w} > \frac{MP_K}{r}\), or only capital otherwise.
Example 8.6: Finding total cost in the Cobb-Douglas case
Recall that the labor demand found in example 8.4 was \(L = \frac{q\sqrt{r}}{\sqrt{w}}\), while the demand for capital was \(K = \frac{q\sqrt{w}}{\sqrt{r}}\). Hence, the total cost is
\[ TC = w\underbrace{\frac{q\sqrt{r}}{\sqrt{w}}}_{L} + r\underbrace{\frac{q\sqrt{w}}{\sqrt{r}}}_{K} = qw^{1/2}r^{1/2} + qr^{1/2}w^{1/2} = 2q\sqrt{rw}, \]
which increases as the firm produces more units of output, q, and as inputs become more expensive (higher r and/or w). If input prices take values w = $40 and r = $10, total cost simplifies to \(TC = 2q\sqrt{10 \times 40} = 40q\), which is a straight line with positive slope of 40. Considering the same output target as in example 8.4, q = 100 units, this total cost becomes TC = $4,000.
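The algebra in example 8.6 can be verified by computing the total cost in two ways: by pricing the input demands, and by evaluating the closed-form expression \(2q\sqrt{rw}\). The Python sketch below is an illustration under our own naming assumptions.

```python
from math import sqrt


def total_cost(q, w, r):
    """TC = w*L + r*K, evaluated at the input demands of example 8.4."""
    labor = q * sqrt(r) / sqrt(w)
    capital = q * sqrt(w) / sqrt(r)
    return w * labor + r * capital


def total_cost_closed_form(q, w, r):
    """Closed-form cost function from example 8.6: TC = 2 * q * sqrt(r * w)."""
    return 2 * q * sqrt(r * w)


# Input prices and output target used in the example: w = 40, r = 10, q = 100.
print(round(total_cost(100, 40, 10), 6), total_cost_closed_form(100, 40, 10))  # prints: 4000.0 4000.0
```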
Self-assessment 8.6 Consider the labor and capital demands found in self-
assessment 8.4. Find the total cost function TC, and then evaluate it at input prices
w = $40 and r = $10. Compare it against TC = 40q we found in example 8.6. Interpret.
Example 8.7: Finding total costs in the linear production case
The input demands found in example 8.5 were L = 0 and \(K = \frac{q}{8}\) when input prices satisfy \(\frac{1}{4} < \frac{w}{r}\) (i.e., when r < 4w), but become \(L = \frac{q}{2}\) and K = 0 when input prices satisfy r > 4w. Hence, when r < 4w, the total cost is
\[ TC = w \cdot 0 + r\frac{q}{8} = r\frac{q}{8}, \]
which increases in the output that the firm seeks to produce, q, and in the price of capital, r, but it is independent of the price of labor w because the firm does not use labor in this case. If, instead, input prices satisfy r > 4w, the firm only uses labor, and its total cost becomes
\[ TC = w\frac{q}{2} + r \cdot 0 = w\frac{q}{2}, \]
thus increasing in output q, and in wages w, but unaffected by the price of capital, r. Lastly, note that if w increases enough, the condition on input prices r > 4w can revert to r < 4w, leading the firm to one of the cases studied in example 8.5 where it uses only capital, but no labor (i.e., L = 0 and \(K = \frac{q}{8}\)).
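With a linear technology, the two cases of this cost function collapse into a single expression: the firm pays the cheaper of the two single-input ways of producing q. The Python sketch below illustrates this for q = aL + bK; the function name and the parameters a and b are hypothetical. With a = 2 and b = 8, labor alone is the cheaper option whenever w/2 < r/8.

```python
def linear_total_cost(q, w, r, a=2.0, b=8.0):
    """Total cost of producing q with q = a*L + b*K at input prices w and r."""
    cost_with_labor_only = w * q / a      # L = q/a, K = 0
    cost_with_capital_only = r * q / b    # L = 0,   K = q/b
    return min(cost_with_labor_only, cost_with_capital_only)


print(linear_total_cost(100, w=40, r=10))  # capital only, prints: 125.0
print(linear_total_cost(100, w=2, r=10))   # labor only, prints: 100.0
```

The first call matches example 8.3, where the firm buys K = 12.5 units of capital at r = $10 each.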
Self-assessment 8.7 Consider the labor and capital demands found in self-assessment 8.5. Find the total cost function TC in this scenario, and compare it against \(TC = r\frac{q}{8}\), found in example 8.7. Interpret.
8.6 Types of Costs
Explicit versus implicit costs. “Explicit costs” are those involving a direct outlay, and are
thus included in balance sheets. “Implicit costs,” however, do not necessarily involve direct
outlays, but they reflect the opportunity cost of an input. Therefore, implicit costs consider
the best alternative use of the input that the firm forgoes when dedicating that input to its
production process. A common example of opportunity cost is that of pursuing an undergraduate degree: your monetary outlays (either in cash or in debt) are the explicit cost of your education, whereas the forgone salary that you could earn in the years you dedicate to your degree is the opportunity cost (or implicit cost) of your degree. Another highly cited example is that of Kaiser Aluminum. This firm initially signed a long-term electricity contract at a price of $23/MWh, which guaranteed Kaiser this purchasing price for several years. In 2001, a few months after the contract was signed, however, the price of electricity skyrocketed to $1,000/MWh. While the explicit cost of using a megawatt-hour of electricity was still $23, Kaiser’s implicit cost (the opportunity cost of using electricity in aluminum production rather than selling it) was $1,000. Kaiser understood this difference and shut down its smelters for a few days to sell the electricity on the open market, which the contract allowed Kaiser to do.
Sunk versus nonsunk costs. Sunk costs cannot be recovered, even if the firm chooses to shut down its operations. For instance, if the firm rents the building where it operates, and the lease contract prohibits the firm from subletting the building to another party, rental payments can be considered as unrecoverable, and thus sunk.9 If, instead, the lease does not prohibit the firm from subletting the building, then the manager could sublet it, and recover a significant portion of the rental cost, thus making it a nonsunk cost. The cost of most raw materials and inputs is also nonsunk, because the firm can sell them back to its providers if it were to shut down its operations (recovering a portion of the cost).
9. Another example of sunk cost is that of specific investments, such as developing new tools and machines to be used in the production process, which only benefits the firm that develops them. If the firm were to shut down its operations, it would face many difficulties in finding other companies interested in these (completely new) tools, as they could be useless for firms in other industries and only of little use for other firms in the same industry.
Long-run versus short-run costs. In the long run, the firm is assumed to have enough time
to vary the amount of all inputs as much as necessary. In the short run, however, the amount
of at least one input is considered to be fixed, as its amount cannot be easily changed in a
matter of only days or weeks. Most applications assume that in the short run capital is fixed,
as varying the building size (or some machines) usually requires several weeks or months.
Nonetheless, some industries might find that, in contrast, capital is relatively easy to vary,
but labor is fixed in the short run. A typical example is faculty positions at universities (or top programmers in the technology industry). Acquiring a new computer or software (both being forms of capital) can be done in a matter of hours, whereas hiring a professor with long experience in a specific field requires posting ads, interviewing candidates at professional meetings and conferences, inviting a small group of potential candidates to visit the hiring institution, making an offer to the selected candidate, and perhaps a subsequent negotiation about the details of the contract. This process can take four to five months, if not longer.
Because in the short run the firm can vary the amount of fewer inputs, it has less freedom
to minimize its costs, which ultimately entails that short-run costs are higher (or equal, but
never lower) than long-run costs. For illustration purposes, example 8.8 revisits the firm in
examples 8.4 and 8.5, holding capital fixed in the short run.
Example 8.8: Comparing long- and short-run costs
Consider a firm with Cobb-Douglas production function \(q = L^{1/2}K^{1/2}\), but assume that capital cannot be varied in the short run, being fixed at K = 150 units. To find the cost-minimizing units of labor (i.e., the firm’s demand for labor in the short run), we only need to insert K = 150 into the firm’s production function, \(q = L^{1/2}150^{1/2}\), and solve for L. Squaring both sides yields \(q^2 = 150L\), and solving for L, we obtain the short-run demand for labor:
\[ L = \frac{q^2}{150}, \]
which increases in the output that the firm seeks to produce, q. In this context, the short-run total cost becomes
\[ STC = wL^{*} + rK = w\frac{q^2}{150} + r \cdot 150. \]
Considering the same input prices as in example 8.4, w = $40 and r = $10, this short-run total cost simplifies to \(STC = 1{,}500 + \frac{4}{15}q^2\) (in dollars), which lies above the long-run total cost found in example 8.6, TC = 40q, as depicted in figure 8.4.10 If we assume an output target of q = 150 units, this short-run total cost becomes STC = $7,500, which is higher than the long-run total cost from example 8.6 evaluated at the same output, TC = 40 × 150 = $6,000, as illustrated in figure 8.4.
[Figure 8.4. Long-run versus short-run total costs. Output q is on the horizontal axis and cost (in dollars) on the vertical axis; STC(q) starts at $1,500 and lies above TC(q), touching it at q = 75.]
When the firm seeks to produce q = 75 units of output, the long- and short-run costs coincide; see the point of tangency on the graph where TC(75) = STC(75). Intuitively, this result indicates that to produce this volume of output, the firm has a capital demand of \(K = \frac{q\sqrt{w}}{\sqrt{r}} = \frac{75\sqrt{40}}{\sqrt{10}} = 150\) units, which coincides with the fixed amount of capital that is given in the short run, K = 150. In other words, if in the short run the firm could choose the amount of capital it uses, it would choose exactly K = 150. For all other output levels, \(q \neq 75\) units, the fixed amount of capital K = 150 yields a larger cost than in the long run, STC(q) > TC(q), as depicted in the figure.
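The comparison between short- and long-run costs can be tabulated with a short script. The Python sketch below uses the cost functions derived in examples 8.6 and 8.8 with w = $40, r = $10, and capital fixed at 150 units; the function names and the grid of output levels are our own illustrative choices.

```python
from math import sqrt


def long_run_cost(q, w=40.0, r=10.0):
    """Long-run cost from example 8.6: TC = 2 * q * sqrt(r * w)."""
    return 2 * q * sqrt(r * w)


def short_run_cost(q, w=40.0, r=10.0, fixed_capital=150.0):
    """Short-run cost from example 8.8, with capital held at fixed_capital."""
    labor = q ** 2 / fixed_capital          # solves q = L^(1/2) * K^(1/2) for L
    return w * labor + r * fixed_capital


for q in (0, 50, 75, 100, 150):
    stc, tc = short_run_cost(q), long_run_cost(q)
    print(q, round(stc, 2), round(tc, 2), round(stc - tc, 2))
```

The last column (the gap STC − TC) should be zero only at q = 75 and positive at every other output level in the grid.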
Self-assessment 8.8 Repeat the analysis in example 8.8, but assume that capital is
fixed at K = 50 units. Find the firm’s demand for labor, and its short-run cost function
STC. Assume the same input prices as in example 8.8, w = $40 and r = $10, and
compare STC against the firm’s long-run total cost TC = 40q.
10. Graphically, the short-run cost originates at \(\frac{4 \times 75^2}{15} = 1{,}500\), which coincides with the fixed cost that the firm needs to incur even if it produces zero units of output. (Indeed, when q = 0, the firm is “stuck” with K = 150 units of capital that it cannot vary. Because the price of each unit is r = $10, its cost from capital alone is 150 × 10 = $1,500.) The short-run cost then increases at a rate of \(\frac{\partial STC}{\partial q} = \frac{8q}{15}\), which is itself increasing in q, thus exhibiting a convex shape.
Cheat sheet of short-run costs. As a trick to identify whether a particular cost is variable
or fixed, and whether it is sunk or nonsunk, you can ask the following questions:
1. Does the cost increase when the firm increases its production?
(a) If the answer is yes, then the cost is variable.
(b) If the answer is no, the cost is fixed.
2. Would the firm still incur the cost if it were to shut down its operations (setting output
equal to zero)?
(a) If the answer is yes, the cost is sunk because the firm cannot recover the cost.
(b) If the answer is no, the cost is nonsunk because the firm can recover such
a cost.
For examples of a variable and nonsunk cost, think about labor and raw materials; for
examples of fixed but nonsunk costs, think about the heating that can be avoided if the firm
shuts down its operations and turns off the thermostat; and for examples of fixed and sunk
costs, think about the rental cost of a property that cannot be sublet.
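The two questions of the cheat sheet can be written as a small decision rule. This is only an illustrative sketch: the function name and the yes/no answers assigned to each example are our own reading of the paragraph above.

```python
def classify_cost(rises_with_output, incurred_if_shut_down):
    """Apply the two cheat-sheet questions to a single cost item."""
    variability = "variable" if rises_with_output else "fixed"
    recoverability = "sunk" if incurred_if_shut_down else "nonsunk"
    return f"{variability} and {recoverability}"


# Answers to (question 1, question 2) for the examples mentioned in the text.
examples = {
    "labor and raw materials": (True, False),
    "heating that can be turned off": (False, False),
    "rent on a property that cannot be sublet": (False, True),
}

for name, answers in examples.items():
    print(f"{name}: {classify_cost(*answers)}")
```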
8.7 Average and Marginal Cost
Average cost (AC) The total cost that the firm incurs per unit of output; that is,
\(AC = \frac{TC}{q}\).
For instance, if the firm incurs $1,000 in total costs to produce 20 computer monitors, its
average cost is \(\frac{1{,}000}{20}\) = $50 per monitor.
Marginal cost (MC) The rate at which total costs increase as the firm produces 1
more unit; that is, \(MC = \frac{\partial TC}{\partial q}\).
Graphically, MC measures the slope of the total cost curve: when TC increases, its slope
must be positive, implying that MC is also positive (i.e., producing additional units raises the
firm’s total cost), while when TC decreases, the opposite argument applies.
The AC and MC curves exhibit a relationship similar to the one described for average
and marginal product, AP and MP, where the MP curve crossed the AP curve at its
maximum (see section 7.4 in chapter 7). In the current scenario, the MC curve crosses
the AC curve at its minimum. The intuition based on average and marginal scores in a test
is the same as that for average and marginal products, so we leave that as an exercise
for you.
Example 8.9: Finding average and marginal cost Consider the firm with the
Cobb-Douglas production function analyzed in example 8.4, where total cost is
TC = 40q. This firm’s average cost is then \(AC = \frac{40q}{q} = 40\), and its marginal cost is
\(MC = \frac{\partial (40q)}{\partial q} = 40\). Hence, the average and marginal cost curves are both constant and
coincide, being graphically depicted in figure 8.5 by a horizontal line at a height of
$40. (This is a common feature of firms with a Cobb-Douglas production function
like \(q = AL^{\alpha}K^{\beta}\) that exhibits constant returns to scale, α + β = 1; such firms have constant
average and marginal cost curves.)
In the case of the linear production function from example 8.7, we found that the
total cost is \(TC = r\frac{q}{8}\) when input prices satisfy 4r < w (i.e., labor is expensive relative
to capital), but changes to \(TC = w\frac{q}{2}\) when input prices satisfy 4r > w (labor is relatively
cheap). In this context, average cost becomes \(AC = \frac{r q/8}{q} = \frac{r}{8}\) when 4r < w, but
\(AC = \frac{w q/2}{q} = \frac{w}{2}\) when 4r > w. Similarly, marginal cost is \(MC = \frac{\partial (r q/8)}{\partial q} = \frac{r}{8}\) when 4r < w, but
\(MC = \frac{\partial (w q/2)}{\partial q} = \frac{w}{2}\) when 4r > w. Hence, both AC and MC are constant in output, q, and
coincide for a given pair of input prices, w and r. Graphically, AC and MC would
overlap, both being depicted by a horizontal, flat line.
Figure 8.5
AC and MC with a Cobb-Douglas production function. (Horizontal line at AC(q) = MC(q) = $40.)
Self-assessment 8.9 Assume a firm with total cost \(TC = 30q^2\). Find its average
and marginal cost functions, and depict them against q.
8.7.1 Output Elasticity to Total Cost

As discussed previously, the marginal cost \(MC = \frac{\partial TC}{\partial q}\) measures how much total cost
increases if the firm increases its output by 1 unit. While this measure is useful, it is not
unit-free. To illustrate this point, consider a firm producing computer monitors in the US,
and another firm producing cars in Germany. The MC of the first firm would be in
dollars/monitor, whereas that of the second firm would be in euros/car, which are not easily
comparable. As in previous chapters, we can apply the definition of elasticity in this context
to obtain a unit-free measure of how total cost changes in output (output elasticity to total
cost), as follows:
\[ \varepsilon_{TC,q} = \frac{\%\Delta TC}{\%\Delta q} = \frac{\Delta TC / TC}{\Delta q / q} = \frac{\Delta TC}{\Delta q}\,\frac{q}{TC}, \]
or \(\varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\,\frac{q}{TC}\) when the change in output q is small, so that
\(\frac{\Delta TC}{\Delta q} \approx \frac{\partial TC}{\partial q}\). In addition, because \(MC = \frac{\partial TC}{\partial q}\), we can rewrite this elasticity as
\(\varepsilon_{TC,q} = MC\,\frac{q}{TC}\). Lastly, as the average cost is \(AC = \frac{TC}{q}\), its inverse is
\(\frac{1}{AC} = \frac{q}{TC}\), which allows us to express this elasticity as \(\varepsilon_{TC,q} = MC\,\frac{1}{AC}\),
or more compactly, as
\[ \varepsilon_{TC,q} = \frac{MC}{AC}, \]
which is a function of MC and AC alone.
When MC > AC, elasticity \(\varepsilon_{TC,q} = \frac{MC}{AC}\) satisfies \(\varepsilon_{TC,q} > 1\), indicating that total costs
increase more than proportionally to a 1 percent increase in output. In contrast, when
MC < AC, elasticity \(\varepsilon_{TC,q} = \frac{MC}{AC}\) must be \(\varepsilon_{TC,q} < 1\), thus implying that total costs increase
less than proportionally to a 1 percent increase in output. Lastly, if MC = AC, elasticity
equals 1, \(\varepsilon_{TC,q} = 1\), which reflects that total costs respond proportionally to a 1 percent
increase in output. This is the case with the total cost described in example 8.9, where
MC = AC = 40, thus yielding an elasticity of \(\varepsilon_{TC,q} = 1\).¹¹
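The ratio MC/AC is easy to evaluate numerically. The sketch below approximates MC with a central finite difference; the linear cost TC = 40q is the one from example 8.9, whereas the convex cost TC = 100 + 2q² is a hypothetical function chosen only to show elasticities below and above 1.

```python
def marginal_cost(tc, q, h=1e-4):
    """Approximate MC = dTC/dq with a central finite difference."""
    return (tc(q + h) - tc(q - h)) / (2 * h)


def average_cost(tc, q):
    """AC = TC / q."""
    return tc(q) / q


def output_elasticity(tc, q):
    """Output elasticity to total cost, computed as MC / AC."""
    return marginal_cost(tc, q) / average_cost(tc, q)


def linear_cost(q):
    return 40 * q            # total cost from example 8.9


def convex_cost(q):
    return 100 + 2 * q**2    # hypothetical cost, not taken from the text


for q in (5, 10):
    print(q, round(output_elasticity(linear_cost, q), 3),
          round(output_elasticity(convex_cost, q), 3))
```

For the hypothetical convex cost, MC = 4q and AC = 100/q + 2q, so the elasticity is below 1 at q = 5 (where MC < AC) and above 1 at q = 10 (where MC > AC).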
11. Empirically, the output elasticity to total cost is often lower than 1 (\(\varepsilon_{TC,q} < 1\)), with industries such as
producers of computer units and accessories exhibiting elasticities around 0.6 to 0.7, while utilities (e.g., water and gas
distribution) exhibit elasticities around 0.99. Expressed in words, a 1 percent increase in output generates almost
a 1 percent increase in total costs for utilities but a less-than-proportional increase in total costs for computer
producers.
Example 8.10: Output elasticity in the Cobb-Douglas case Consider again the
firm with the Cobb-Douglas production function from example 8.6, where total cost
was found to be TC = 40q. The output elasticity in this case is
\[ \varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\,\frac{q}{TC} = 40\,\frac{q}{40q} = 1. \]
That is, if the firm increases its output by 1 percent, it will see its total costs also
increase by 1 percent. If the firm has a linear production function, such as in example 8.7, total
cost was \(TC = r\frac{q}{8}\) when input prices satisfy 4r < w, but changes to \(TC = w\frac{q}{2}\) when
input prices satisfy 4r > w. Therefore, when 4r < w, output elasticity becomes
\[ \varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\,\frac{q}{TC} = \frac{r}{8}\,\frac{q}{r q/8} = 1, \]
which is constant in output, in line with the earlier finding that MC = AC for this cost
function. Essentially, if the firm seeks to produce 1 percent more units of output, its total
costs also increase by 1 percent, regardless of its scale. (A similar result occurs if input
prices satisfy 4r > w, where \(TC = w\frac{q}{2}\) and output elasticity is again \(\varepsilon_{TC,q} = 1\), which we leave for
the reader as an exercise.)
Self-assessment 8.10 Consider a firm with total cost \(TC = 5 + 30q^2\). Find its
output elasticity, and interpret your results.
8.8 Economies of Scale, Scope, and Experience
8.8.1 Economies of Scale
Economies of scale A firm experiences economies of scale when its average cost,
AC, decreases in output q.
This property often arises from specialization. When a new firm starts its operations, a
worker might be doing a variety of tasks, such as producing goods, selling them to cus-
tomers, preparing balance sheets, and even cleaning up the store. As the firm expands its
output, more workers are hired and more specific tasks can be assigned to each worker,
allowing each one to learn how to do her task in more effective ways. An additional reason
for economies of scale to exist is the presence of large capital investments, which are spread
over large output levels. For instance, if an automaker requires an expensive robot to pro-
duce cars, but costs from labor and materials are relatively low, the average cost of the first
car will be extremely high, but it will decrease for subsequent cars.
Firms may also experience an increase in their average cost when they increase output,
as we describe next.
Diseconomies of scale A firm suffers from diseconomies of scale when its average
cost, AC, increases in output q.
For example, this can arise from managerial diseconomies. To understand this case, con-
sider a firm that produces a new technology, which increased its output substantially in the
last few years, and is still managed by its initial founder. Seeking to serve markets in other
countries, the founder realizes that the firm will need to hire more managers. These man-
agers, however, might not know the details of the firm’s product as closely as the current
manager, potentially making mistakes (at least while they learn the product details) and,
ultimately, increasing the firm’s operating cost.
Example 8.11: Testing for economies of scale Consider a firm with total cost
\(TC = a + bq + cq^2\), where \(a, b, c \geq 0\).¹² Let us first find the average cost, as follows:
\[ AC = \frac{TC}{q} = \frac{a + bq + cq^2}{q} = \frac{a}{q} + b + cq. \]
Figure 8.6 depicts this average cost. This expression reaches its minimum at the
point where its derivative with respect to q is zero (i.e., \(\frac{\partial AC}{\partial q} = 0\)), or
\[ \frac{\partial AC}{\partial q} = -\frac{a}{q^2} + c = 0, \]
which, after solving for q, yields \(q = \left(\frac{a}{c}\right)^{1/2}\). Hence, for all output levels \(q < \left(\frac{a}{c}\right)^{1/2}\),
the AC curve is decreasing, thus exhibiting economies of scale (see the left side of
figure 8.6), whereas for all output levels \(q > \left(\frac{a}{c}\right)^{1/2}\), the AC curve increases in output,
and the firm suffers from diseconomies of scale (right side).
As a confirmation, we next show that the minimum of the AC curve we just found,
\(q = \left(\frac{a}{c}\right)^{1/2}\), could alternatively be found by using the property that the MC and AC
curves cross each other at the minimum of the AC curve. First, we obtain the marginal
cost, \(MC = \frac{\partial TC}{\partial q} = b + 2cq\). Second, the MC and AC curves cross each other where
MC = AC, which implies
\[ \frac{a}{q} + b + cq = b + 2cq. \]
Rearranging, we obtain \(\frac{a}{q} = cq\) and, solving for output q, we have that \(q = \left(\frac{a}{c}\right)^{1/2}\). For
instance, if the firm’s total cost function is \(TC = 10 + 2q + q^2\), implying that a = 10,
b = 2, and c = 1, the AC curve becomes \(\frac{10}{q} + 2 + q\), which reaches its minimum at
\(q = \left(\frac{10}{1}\right)^{1/2} \approx 3.16\) units of output. For all q < 3.16, the firm’s AC curve decreases in
output, while for all q > 3.16, it increases in output.
Figure 8.6
Testing for economies of scale. (Horizontal axis: output q. The curve \(AC(q) = \frac{a}{q} + b + cq\) reaches its minimum at \(q = \left(\frac{a}{c}\right)^{1/2}\).)
12. Note that this allows for different types of costs: (1) if b = c = 0, total cost simplifies to TC = a, which is
constant in the units of output that the firm produces, q; (2) if c = 0, then total cost reduces to TC = a + bq, thus
being linear in output; and (3) if b = 0, total cost becomes \(TC = a + cq^2\), thus being a standard convex expression
that originates at a height of a, and increases in q, at an increasing rate.
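The numerical case in example 8.11 can be double-checked with a few lines of code. The sketch below uses the parameter values from the example (a = 10, b = 2, c = 1); the function names are illustrative.

```python
import math


def average_cost(q, a, b, c):
    """AC(q) = a/q + b + cq for the total cost TC = a + bq + cq^2."""
    return a / q + b + c * q


def marginal_cost(q, b, c):
    """MC(q) = b + 2cq for the same total cost."""
    return b + 2 * c * q


a, b, c = 10, 2, 1
q_min = math.sqrt(a / c)  # candidate minimum of the AC curve, (a/c)^(1/2)

print(f"q_min = {q_min:.2f}")
print(f"AC(q_min) = {average_cost(q_min, a, b, c):.4f}")
print(f"MC(q_min) = {marginal_cost(q_min, b, c):.4f}")
```

At q_min the two printed values coincide, which illustrates that the MC curve crosses the AC curve at the minimum of AC. Evaluating average_cost slightly to the left or right of q_min returns a higher average cost.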
Self-assessment 8.11 Consider a firm with total cost \(TC = 5 + 2q + q^3\). Find the
average cost curve AC(q) and its minimum point. Interpret your results in terms of
economies of scale.
8.8.2 Economies of Scope
Economies of scope The situation where a firm incurs a lower total cost producing
two different products than the total cost that two firms would incur producing each
good separately; that is
\[ TC(q_1, q_2) < TC(q_1, 0) + TC(0, q_2). \tag{8.1} \]
This inequality says that the total cost from producing both \(q_1\) units of good 1 and \(q_2\) units of
good 2, \(TC(q_1, q_2)\), is lower than the cost of producing \(q_1\) alone by one firm and \(q_2\) alone by
another firm, \(TC(q_1, 0) + TC(0, q_2)\). Because the total cost of producing none of the goods
is often zero,¹³ we can write TC(0, 0) = 0, which allows us to rewrite equation (8.1) as
\[ TC(q_1, q_2) < TC(q_1, 0) + TC(0, q_2) - \underbrace{TC(0, 0)}_{\text{zero}}, \]
or, after rearranging, as follows:
\[ TC(q_1, q_2) - TC(q_1, 0) < TC(0, q_2) - TC(0, 0). \tag{8.2} \]
Intuitively, the right side of equation (8.2) represents the additional cost (increase in total
costs) that the firm experiences when it starts producing good 2. The left side reflects the
increase in total cost that the firm experiences if, when producing good 1, it starts to produce
good 2 as well. In summary, equation (8.2) describes that the increase in cost from starting
to produce one good alone is larger than the additional costs of adding one more good to the
firm’s product line. Common examples of economies of scope are television channels in a
satellite network. Once the network launches a service offering 80 channels, the additional
cost of offering 1 more channel is relatively low.¹⁴ Example 8.12 studies economies of scope
in a cola company.
Example 8.12: Economies of scope Consider a soda company producing two
types of cola, with and without sugar (e.g., regular and Diet Coke). When the firm
only produces regular cola (good 1), its total cost function is \(TC(q_1, 0) = 3q_1 + 10\),
where 10 indicates a fixed cost. When the firm only produces diet cola (good 2),
its total cost function is \(TC(0, q_2) = 4q_2 + 10\). However, when the firm simultaneously
produces both types of colas, its total cost function becomes
\[ TC(q_1, q_2) = (3 - \alpha)q_1 + (4 - \alpha)q_2 + (10 + \beta), \]
where parameter α > 0 indicates the cost savings
effect that producing related products has on the unit cost of both regular and diet cola.
Parameter β > 0, instead, represents the increase in fixed costs that the firm experiences
when producing two types of colas rather than one. Therefore, the firm exhibits
economies of scope if
\[ (3 - \alpha)q_1 + (4 - \alpha)q_2 + (10 + \beta) < [3q_1 + 10] + [4q_2 + 10], \]
which simplifies to \(\beta < 10 + \alpha(q_1 + q_2)\). This condition says that the firm benefits
from economies of scope if the increase in fixed costs that it experiences when producing
both goods (as captured by parameter β) is relatively lower than the cost-saving
effect from producing both goods (as measured by parameter α).
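The condition in example 8.12 can be evaluated directly. The sketch below uses the cost functions of the example; the parameter values (α = 0.5, β equal to 15 or 30, and 10 units of each cola) are hypothetical and chosen only for illustration.

```python
def joint_cost(q1, q2, alpha, beta):
    """Total cost when one firm produces both colas (example 8.12)."""
    return (3 - alpha) * q1 + (4 - alpha) * q2 + (10 + beta)


def separate_cost(q1, q2):
    """Sum of the stand-alone costs TC(q1, 0) + TC(0, q2)."""
    return (3 * q1 + 10) + (4 * q2 + 10)


def has_economies_of_scope(q1, q2, alpha, beta):
    """True when joint production is strictly cheaper than separate production."""
    return joint_cost(q1, q2, alpha, beta) < separate_cost(q1, q2)


# Hypothetical values: the threshold 10 + alpha * (q1 + q2) equals 20 here.
for beta in (15, 30):
    print(beta, has_economies_of_scope(10, 10, alpha=0.5, beta=beta))
```

With these illustrative numbers, the simplified condition β < 10 + α(q1 + q2) holds for β = 15 but fails for β = 30, and the direct cost comparison agrees with it.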
13. Recall from the discussion about types of costs in section 8.6 that the total cost of producing zero units of
output is zero when fixed costs are nonsunk. If, instead, fixed costs are sunk, the firm cannot recover them, even
if it shuts down its operations.
14. Another recurrent example of economies of scope is that of soda varieties. If a soda company offering four
or five types of soda chooses to offer one more variety (e.g., cherry cola), its additional costs from doing so are
relatively low, and definitely lower than those of a new soda firm planning to offer its first soda product.
Self-assessment 8.12 Consider the scenario in example 8.12. Does the soda com-
pany benefit from economies of scope when α = 2, β = 3, and q1 = q2 = 4 units?
8.8.3 Economies of Experience
Economies of experience The average variable cost decreases during the firm’s
production history.
Intuitively, economies of experience often emerge because workers in all tasks learn from
previous periods to avoid product defects, or because managers arrange workstations to
improve worker productivity or achieve higher material yield. In particular, economies of
experience are commonly expressed as follows:
\[ AVC(E) = \frac{A}{E^{\varepsilon}}, \]
where A > 0 denotes the AVC from the first unit,¹⁵ \(E = q_{t-1} + q_{t-2} + \dots\) measures experience
from production in previous periods, and ε represents the experience elasticity, where
ε ∈ (0, 1). To see why ε measures the experience elasticity, note that such elasticity is
\[ \varepsilon_{AVC,E} = \frac{\%\Delta AVC}{\%\Delta E} = \frac{\Delta AVC / AVC}{\Delta E / E} = \frac{\Delta AVC}{\Delta E}\,\frac{E}{AVC}, \]
or \(\varepsilon_{AVC,E} = \frac{\partial AVC}{\partial E}\,\frac{E}{AVC}\) when the change in E is small. Because \(AVC(E) = \frac{A}{E^{\varepsilon}}\), its derivative
with respect to E is \(\frac{\partial AVC}{\partial E} = -A\varepsilon E^{-(1+\varepsilon)}\), which entails that experience elasticity becomes
\[ \varepsilon_{AVC,E} = \frac{\partial AVC}{\partial E}\,\frac{E}{AVC} = -A\varepsilon E^{-(1+\varepsilon)}\,\frac{E}{A / E^{\varepsilon}} = -\varepsilon. \]
That is, the experience elasticity equals ε in absolute value. Essentially, a 1 percent increase
in the firm’s production experience, E, decreases its average variable costs by ε percent.
15. Indeed, if the firm produced only one unit during its history, E = 1, the average variable cost simplifies to
\(AVC = \frac{A}{1^{\varepsilon}} = A\).
Example 8.13: Slope of the experience curve An arguably more compact
method of analyzing the responsiveness of a firm’s average variable costs (AVC) to its
production experience (E) focuses on the slope of the experience curve, as follows:
\[ \text{Slope of experience curve} = \frac{AVC(2E)}{AVC(E)} = \frac{A / (2E)^{\varepsilon}}{A / E^{\varepsilon}} = \frac{E^{\varepsilon}}{2^{\varepsilon} E^{\varepsilon}} = \frac{1}{2^{\varepsilon}}. \]
This slope measures the fraction of its previous level to which the average variable cost
falls when cumulative output (E) doubles. Because the experience elasticity parameter ε satisfies ε ∈
(0, 1), an increase in the experience elasticity entails a smaller slope of the experience
curve (i.e., a larger cost reduction each time cumulative output doubles).¹⁶
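A short numerical sketch may help to see how the slope of the experience curve relates to AVC(E) = A/E^ε. The elasticities 0.15 and 0.5 are those used in footnote 16, whereas the values A = 50 and E = 40 are hypothetical and serve only to confirm that the ratio AVC(2E)/AVC(E) does not depend on them.

```python
def average_variable_cost(experience, first_unit_cost, elasticity):
    """AVC(E) = A / E**epsilon."""
    return first_unit_cost / experience**elasticity


def experience_slope(elasticity):
    """Slope of the experience curve, AVC(2E) / AVC(E) = 1 / 2**epsilon."""
    return 1 / 2**elasticity


A, E = 50, 40  # hypothetical first-unit cost and cumulative output

for eps in (0.15, 0.5):
    ratio = average_variable_cost(2 * E, A, eps) / average_variable_cost(E, A, eps)
    print(f"epsilon={eps}: slope={experience_slope(eps):.3f}, AVC(2E)/AVC(E)={ratio:.3f}")
```

The two printed columns coincide for each elasticity, and the slope is lower for the higher elasticity, so doubling cumulative output cuts average variable cost by a larger fraction.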
Self-assessment 8.13 Consider a firm with \(AVC(E) = \frac{10}{E^{1/2}}\). Find the experience
elasticity of the firm, and the slope of its experience curve. Interpret.
Remark—Economies of scale and economies of experience are often confused, but as the
previous discussion highlights, each applies to a different phenomenon. Mature industries,
such as cement or aluminum, often benefit from economies of scale, as increasing their
output allows them to further reduce their average costs. However, they rarely benefit from
economies of experience because their products and technology are relatively well known.
Economies of experience are more common in new products and start-ups, which see their
average variable costs decrease after learning from their own mistakes and experience.¹⁷
Appendix. Cost-Minimization Problem—A Lagrangian Analysis
In previous sections, we graphically showed that, at the input pair that minimizes the firm’s
costs, the isoquant and isocost must be tangent to each other, entailing \(\frac{MP_L}{MP_K} = \frac{w}{r}\). This
appendix formally shows the origin of this result by solving the cost-minimization problem.
Let us start by writing down the firm’s problem:
16. For a numerical example, consider an elasticity of ε = 0.15, which yields a slope of \(\frac{1}{2^{0.15}} \approx 0.9\). If, instead,
ε = 0.5, the slope decreases to \(\frac{1}{2^{0.5}} \approx 0.7\).
17. An example of an industry where economies of experience are present but economies of scale are rare is that
of handmade pianos or handmade watches. In these activities, experience helps the firm avoid defects, and so it
arranges its workstations to improve worker productivity. However, output scale would not significantly decrease
average costs, given that approximately the same number of working hours and materials are needed in each
handmade piece.
\[ \min_{L \geq 0,\; K \geq 0} \; TC = wL + rK \quad \text{subject to } q = f(L, K). \]
Intuitively, this problem says that the firm seeks to minimize its total cost TC = wL + rK,
while reaching an output level q (e.g., q = 100 units). This is a constrained minimization
problem, in which the constraint is given by the output target, q = f (L, K), that the firm
seeks to reach. This problem has the following Lagrangian function:
\[ \mathcal{L} = wL + rK + \lambda\,[q - f(L, K)], \]
where λ denotes the Lagrange multiplier associated with the constraint. We next need to
differentiate with respect to the variables that the firm can alter in order to solve this problem
(units of L and K) and with respect to the Lagrange multiplier λ. First, differentiating with
respect to L, we obtain
\[ w + \lambda\left(-\frac{\partial f(L, K)}{\partial L}\right) = 0, \quad \text{or} \quad \frac{w}{MP_L} = \lambda, \]
because \(MP_L = \frac{\partial f(L, K)}{\partial L}\) represents the marginal product of labor. Differentiating with respect
to K, we similarly find
\[ r + \lambda\left(-\frac{\partial f(L, K)}{\partial K}\right) = 0, \quad \text{or} \quad \frac{r}{MP_K} = \lambda, \]
given that \(MP_K = \frac{\partial f(L, K)}{\partial K}\) denotes the marginal product of capital. Lastly, we differentiate
with respect to the Lagrange multiplier λ, obtaining
\[ q - f(L, K) = 0, \quad \text{or} \quad q = f(L, K), \]
which coincides with the constraint (the firm must reach a production level of q units of
output). Because the first two results are both equal to λ, we can set them equal to each
other to obtain \(\frac{w}{MP_L} = \frac{r}{MP_K}\) or, after cross-multiplying,
\[ \frac{MP_L}{w} = \frac{MP_K}{r}. \]
When minimizing cost, the firm adjusts its inputs until it gets the same bang for the buck
across all inputs, as described in previous sections of this chapter. We can also rewrite this
result as \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which says that the firm hires inputs until the point at which the
isoquant is tangent to the isocost (i.e., the slope of the isoquant, \(\frac{MP_L}{MP_K}\), coincides with that of the
isocost, \(\frac{w}{r}\)).
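The tangency condition can be illustrated with a brute-force search over input bundles. The sketch below assumes the production function q = L^(1/2) K^(1/2); the appendix does not state this function, but it is consistent with the total cost TC = 40q and the capital demand used earlier in the chapter at prices w = $40 and r = $10. All function names are illustrative.

```python
def required_capital(q, labor):
    """Capital needed to reach output q given labor, assuming q = L**0.5 * K**0.5."""
    return q**2 / labor


def cheapest_bundle(q, w, r, labor_grid):
    """Search the labor grid for the bundle on the isoquant with the lowest cost."""
    best = None
    for labor in labor_grid:
        capital = required_capital(q, labor)
        cost = w * labor + r * capital
        if best is None or cost < best[2]:
            best = (labor, capital, cost)
    return best


w, r, q = 40, 10, 150
labor_grid = [0.5 * i for i in range(1, 1201)]  # candidate labor levels: 0.5, 1.0, ..., 600.0

labor, capital, cost = cheapest_bundle(q, w, r, labor_grid)
mrts = capital / labor  # for the assumed function, MP_L / MP_K = K / L

print(f"L = {labor}, K = {capital}, cost = {cost}")
print(f"MP_L/MP_K = {mrts}, w/r = {w / r}")
```

At the cheapest bundle found by the search, the ratio of marginal products equals the input price ratio, which is the tangency condition derived above. A grid search is used here only for transparency; solving the first-order conditions analytically gives the same bundle.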
Exercises
1. Cost minimization for Cobb-Douglas.B Consider a firm with the Cobb-Douglas production
function \(f(K, L) = 4K^{1/2}L^{1/3}\), where K denotes units of capital and L represents units of labor.
Assume that the firm faces input prices of r = $10 per unit of capital, and w = $7 per unit of
labor.¹⁸
(a) Solve the firm’s cost-minimization problem, to obtain the combination of inputs (labor and
capital) that minimizes the firm’s cost of producing a given amount of output, q.
(b) Use your results from part (a) to find the firm’s cost function. This is its long-run total cost,
as all inputs can be altered.
(c) Find the firm’s marginal cost function, and its average cost function. Interpret.
(d) Assume now that the amount of capital is held fixed at K = 3 units. Solve the firm’s cost-
minimization problem again to find the amount of labor that minimizes the firm’s cost.
(e) Use your results from part (d) to find the firm’s short-run cost function (because in the short
run, the firm can alter the amount of labor, but without changing the units of capital).
2. CMP for linear production.B Repeat the analysis in the previous exercise, but assume now that
the firm faces a production function f (K, L) = 4K + L, thus treating capital and labor as substitutes
in the production process.
3. Properties of a cost function.B A firm has the following cost function:
\[ TC(q) = 2q^3 - \frac{1}{3}q^2 + \frac{1}{2}q + \frac{9}{10}, \]
where q denotes units of output. Intuitively, the first three terms on the right side capture the
firm’s variable cost, because they depend on the output the firm produces, whereas the last term
represents its fixed cost, as it is not a function of output q.
(a) Total cost. For which output q does the total cost curve TC(q) increase or decrease? For which
values is it concave or convex in output?
(b) Marginal cost. For which output q does the marginal cost curve \(\frac{\partial TC(q)}{\partial q}\) increase or decrease?
For which values is it concave or convex in output?
(c) Average cost. For which output q does the average cost curve \(AC(q) = \frac{TC(q)}{q}\) increase or
decrease? For which values is it concave or convex in output?
(d) Average variable cost. For which output q does the average variable cost curve AVC(q) increase
or decrease? For which values is it concave or convex in output?
(e) Find the value of q where the marginal cost curve crosses the total cost curve, where it crosses
the average cost curve, and where it crosses the average variable cost curve.
18. The firm sells every unit of output at a price p > 0.
4. Properties of factor demand and cost functions.B Most managers must submit detailed reports
about the firm’s performance. However, the manager we consider in this exercise is rather sloppy,
because he often has typos in his reports! For each of the following functions, argue what (if
anything) looks wrong and why.
3
(a) Capital demand of K(q, w, r) = 18 wq r .
(b) Labor demand of L(q, w, r) = 37 rp2 .
qw
(c) Cost function \(C(q, w, r) = q^{3/7} w^{1/3} r^2\). [Hint: Check for homogeneity in input prices.]
5. CMP for Cobb-Douglas production–I.C Consider a Cobb-Douglas production function
\(q = AL^{\alpha}K^{\beta}\), where A, α, β > 0. Assume that input prices are w for labor and r for capital.
(a) Find labor demand L(q, w, r).
(b) Find capital demand K(q, w, r).
(c) Find total cost TC(q).
(d) Find average cost and marginal cost, AC(q) and MC(q). Show under which conditions on α
and β these costs are constant in q and coincide between them.
6. Elasticity with labor demand.B Using the labor demand you found for the Cobb-Douglas
production function in Exercise 8.5, do the following:
(a) Find the elasticity of labor demand with respect to wage, εL,w .
(b) Find the elasticity of labor demand with respect to the rental rate of capital, εL,r .
(c) Interpret the elasticities found in parts (a) and (b).
7. Explicit and implicit costs.A Calculate the explicit and implicit costs of finishing your education.
(Hint: Your university’s financial aid page should have estimates on tuition and cost of living, but
the opportunity cost of your degree may be harder to estimate.)
8. Finding an isocost line.A Katie, a recent college graduate, is looking to start a new enterprise
producing and selling T-shirts for local businesses. An hour of labor costs $15, and the rental rate
on capital is $10. If she limits herself to only spending $200 on her first order (remember, she is
a recent college grad), how much labor and capital can she hire? In other words, find and graph
the isocost line.
9. CMP with Cobb-Douglas production–II.B Jared is a manager for a local coffee roaster. He hires
labor at a rate of $20 per hour, and capital costs $15 per hour. His production function of pounds
of coffee beans follows \(q = 2K^{0.25}L^{0.75}\).
(a) If Jared wants to produce 10 pounds of coffee beans per hour, how much labor and capital
should he employ?
(b) What if the price of capital increases to $20?
10. Choosing between substitutes in production.B Lydia can use low-skilled or high-skilled workers
to help run her accounting department. The low-skilled workers, which we denote as \(L_l\), can
review 10 accounts per hour, and the high-skilled workers, denoted as \(L_h\), can review 15 accounts
per hour.
(a) What is Lydia’s production function for number of accounts reviewed per hour?
(b) If low-skilled workers cost $15 per hour and high-skilled workers cost $25 per hour, what is
her optimal use of the two types of labor?
(c) At what prices of labor is Lydia indifferent between hiring each type of labor?
11. Cobb-Douglas production and input demand.B Let’s revisit our local coffee roaster Jared, with
production function \(q = 2K^{0.25}L^{0.75}\). Find Jared’s input demand for labor and capital without
assuming specific values for the price of labor and capital, w and r, respectively.
12. Fixed-proportion input and total cost.B Suppose that a sandwich-maker has a production func-
tion for sandwiches that is q = min{2B, M}, where B is slices of bread (which cost b per slice) and
M is slices of meat (which cost m per slice). What is the firm’s total, average, and marginal cost?
13. Short-run Cobb-Douglas costs.B Jenny produces first-aid kits using labor and capital with the
production function \(q = 6L^{0.8}K^{0.2}\), where the wage is $5 and rental rate is $3.
(a) Find her total cost function TC(q) if her capital is fixed at 50 units (this is her short run cost
curve).
(b) If her capital is fixed at 50 units, what is the total cost of 10, 25, 50, and 100 first-aid kits?
Graph this short-run total cost curve.
(c) Graph the short-run average and marginal cost curves (at K = 50). At what point do these
curves cross?
14. The cost function.A Explain why we need more than just the input prices to derive a cost function.
15. Substitutes in production.B Hard red winter wheat is planted in the fall in order to be harvested
in the spring. Suppose that wheat production uses acres of land A and labor L in its production
as follows: q = αA + Lβ , where q is in thousands of bushels. Calculate the total cost function for
wheat.
16. Finding MC, AC, and output elasticity.B A publisher for textbooks has a total cost of
\(TC(q) = 25{,}000 - 50q + 15q^2\).
(a) Find the publisher’s marginal cost, average cost, average variable cost, and average fixed cost.
(b) Find the value of q where the marginal cost curve crosses the average cost curve and
average variable cost curve.
(c) Find the output elasticity εTC,q .
17. Economies of scale–I.A Suppose that a firm has the cost function \(TC(q) = 5q^3 (wr)^{0.5}\). What are
the marginal and average costs? Does this firm exhibit economies of scale?
18. Economies of scale–II.B A manufacturer of computer monitors has the following total cost
function, \(TC(q) = 10{,}000 - 25q + 2q^2\). Characterize this firm’s economies of scale.
19. Economies of scope.A A frozen pizza manufacturer separately produces pepperoni pizzas (\(p_p\)) at
a total cost of \(TC(p_p) = 100 + p_p\), and sausage pizzas (\(p_s\)) at a total cost of \(TC(p_s) = 100 + 1.5p_s\).
Workers have told management there might be some cost savings if the pizzas were produced
simultaneously. One worker estimates the total cost of joint production (if the firm produces both
types of pizzas in the same plant) as
\[ TC(p_p, p_s) = p_p + (1.5 - \alpha)p_s + (100 + \beta). \]
(a) When would the firm prefer to produce both types of pizzas in the same plant?
(b) If α = 0.5 and the firm wants to produce 150 pepperoni pizzas and 70 sausage pizzas, at what
level of β does the manufacturer benefit from economies of scope?
20. Experience elasticity–I.A Suppose that a firm exhibits economies of experience with \(AVC(E) = \frac{A}{E^{\varepsilon}}\)
and an experience elasticity ε = 1/3.
(a) Find the slope of the firm’s experience curve.
(b) Suppose that the average cost of the first unit is A = 100. What is the firm’s average variable
cost if it produced 10 units in its history? What about 20 units? 100 units?
21. Experience elasticity–II.A Professor Smith has been teaching for a very long time, and therefore,
he has graded many term papers. He has estimated the average time in minutes that he takes to
grade a paper is equal to AT(Y ) = 2P ln Y , where Y is the number of years he has been teaching
and P is how long the paper is in pages. Find Professor Smith’s experience elasticity and the slope
of his experience curve.
22. PMP and CMP for general Cobb-Douglas.C Consider a firm with the Cobb-Douglas production
function \(q(K, L) = K^{\alpha}L^{\beta}\), where the exponents satisfy α, β > 0.
(a) Profit maximization problem (PMP). Let us first focus on the firm’s PMP, for a given output
price p. Write the firm’s PMP, differentiate with respect to capital and labor, and show that
the firm’s input demands are
\[ K(p, w, r) = \left(\frac{p\alpha}{r}\right)^{\frac{1-\beta}{1-\alpha-\beta}} \left(\frac{p\beta}{w}\right)^{\frac{\beta}{1-\alpha-\beta}} \quad \text{and} \]
\[ L(p, w, r) = \left(\frac{p\beta}{w}\right)^{\frac{1-\alpha}{1-\alpha-\beta}} \left(\frac{p\alpha}{r}\right)^{\frac{\alpha}{1-\alpha-\beta}}. \]
(b) Is labor demand L(p, w, r) increasing in the output price p? Is it increasing in input prices w
and r? [Hint: Your answer depends on whether the firm exhibits increasing, decreasing, or
constant returns to scale.]
(c) Use your results from part (a) to obtain the firm’s supply function, q(p, w, r), showing that it
takes the following form:
\[ q(p, w, r) = \left(\frac{p\alpha}{r}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(\frac{p\beta}{w}\right)^{\frac{\beta}{1-\alpha-\beta}}. \]
(d) Is supply function q(p, w, r) increasing in the output price p? Is it increasing in input prices
w and r? [Hint: Your answer depends on whether the firm exhibits increasing, decreasing, or
constant returns to scale.]
(e) Cost-minimization problem (CMP). Let us now focus on the firm’s CMP. Use the tangency
condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\) to find the firm’s compensated input demands, K(q, w, r) and L(q, w, r),
showing that they take the form
\[ K = q^{\frac{1}{\alpha+\beta}} \left(\frac{\alpha w}{\beta r}\right)^{\frac{\beta}{\alpha+\beta}} \quad \text{and} \quad L = q^{\frac{1}{\alpha+\beta}} \left(\frac{\beta r}{\alpha w}\right)^{\frac{\alpha}{\alpha+\beta}}. \]
(f) Is the compensated labor demand L(q, w, r) increasing in the output level that the firm seeks
to reach q? Is it increasing in input prices w and r?
(g) Use your results from part (e) to obtain the firm’s cost function \(C(q, w, r) = wL^{*} + rK^{*}\),
showing that it is
\[ C(q, w, r) = q^{\frac{1}{\alpha+\beta}}\, r^{\frac{\alpha}{\alpha+\beta}}\, w^{\frac{\beta}{\alpha+\beta}}\, \theta, \]
where \(\theta = \left(\frac{\beta}{\alpha}\right)^{\frac{\alpha}{\alpha+\beta}} + \left(\frac{\alpha}{\beta}\right)^{\frac{\beta}{\alpha+\beta}}\).
(h) Is the cost function C(q, w, r) increasing in the output level that the firm seeks to reach q? Is
it increasing in input prices w and r?
(i) Find the firm’s average cost function. Is it increasing in the output level that the firm seeks
to reach q? [Hint: Use \(T = r^{\frac{\alpha}{\alpha+\beta}} w^{\frac{\beta}{\alpha+\beta}} \theta\) to gather all the elements of cost function C(q, w, r)
that are not a function of output q.]
9 Partial and General Equilibrium
9.1 Introduction
In this chapter, we start to combine our findings from previous chapters about consumer
theory, where we learned how to find an individual’s demand function, and production the-
ory, where we identified the firm’s cost function. In this and subsequent chapters, we place
consumers and producers in different markets to better understand their behavior.
From our previous discussion of the firm’s cost function, you may remember that it reflects
the minimal costs that the firm incurs to produce a given output level q. The cost function,
however, did not tell us how many units of output the firm produces to maximize its profits.
In this chapter, we explore the firm’s problem in order to understand its incentives and the
firm’s optimal production (supply).
For simplicity, this chapter considers a perfectly competitive market, where the firm takes
output prices as given. In future chapters, we relax this assumption by analyzing industries
with fewer firms, such as a monopoly (only one firm) or an oligopoly (a few firms), where
prices are affected by the units that each firm brings to the market.
We start the chapter by describing the main features that differentiate a perfectly com-
petitive market from other types of markets, and we then describe its two main ingredients:
consumers’ aggregate demand for a good and firms’ aggregate supply of this good. We then
analyze how to find the equilibrium output and price in this market. This type of equilibrium
is often referred to as “partial equilibrium” because it focuses on a specific good, as opposed
to “general equilibrium,” which considers equilibrium outputs and prices for several goods
simultaneously. We also discuss general equilibrium in this chapter, comparing equilibrium
outcomes with those maximizing social welfare (i.e., socially optimal outcomes).
Finally, we analyze in which cases the equilibrium that naturally emerges in a perfectly
competitive market is socially optimal (the First Welfare Theorem). In this scenario, a social
planner cannot increase overall welfare by rearranging the way in which consumers purchase
goods or firms use inputs to produce output. We also explore the converse result, in
which a socially optimal outcome can be reached when consumers and firms freely
interact in a perfectly competitive market (the Second Welfare Theorem). To that end,
we seek to identify redistribution schemes (such as an income tax) that a government agency
can offer before consumers make their purchasing decisions and firms make their production
decisions. Allowing agents to solve their individual decision problems afterward can yield
an equilibrium outcome that is socially optimal.
9.2 Features of Perfectly Competitive Markets
Perfectly competitive markets satisfy the following properties:

• Fragmented: There are many small firms, each with a negligible market share. As a
consequence, an increase in the production of any single firm does not alter market prices.
• Undifferentiated products: Consumers regard the products of all firms in the industry as
identical (i.e., undifferentiated).
• Perfect pricing information: Consumers can easily compare different sellers’ prices, at no
cost to themselves.
• Free entry and exit: In the long run, firms have the ability to enter the market if positive
economic profits can be earned, or to exit the industry if they incur losses.
Examples of these types of markets include agricultural products, such as common vari-
eties of wheat and rice. They indeed have several producers, each of them accounting for a
relatively small market share; the product is homogeneous when we focus on the market of
a specific variety; consumers can easily compare prices; and producers have access to rel-
atively similar technologies if the seeds of the variety have been available and well known
for decades, along with fertilizers and harvesting machinery.
9.3 Profit Maximization Problem
These features ultimately entail that all firms in the industry are price-takers—they take
the market price p as given—because individual production decisions do not alter market
prices. There is free entry and exit because technology and inputs are available to all firms.
Therefore, every firm’s profit-maximization problem (PMP) is
\[\max_{q}\ \pi = TR(q) - TC(q) = \underbrace{pq}_{TR(q)} - TC(q). \tag{9.1}\]
The firm chooses its output level q to maximize its profit π , which is equal to the differ-
ence between the firm’s total revenue, TR(q) = pq, and its total cost, TC(q). The total cost
is, of course, the expression found in chapter 8 after solving the firm’s cost-minimization
problem (CMP). Hence, the PMP can be understood as a three-step procedure that unfolds
as follows:
1. We find input demands for L and K that minimize the firm’s cost subject to reaching a
generic production level q (i.e., the input combination that solves the CMP described in
chapter 8).
2. We insert input demands into the firm’s costs to obtain its total cost TC(q) = wL + rK,
which remains a function of the output q.
3. We insert the total cost TC(q) found in step 2 into the firm’s profit in equation (9.1).
The profit function π = pq − TC(q) from equation (9.1) implies that we have completed
steps 1 to 3. Hence, we need to differentiate the profit function only with respect to output
q, as follows:
\[p - \frac{\partial TC}{\partial q} = 0, \quad\text{or}\quad p = MC(q),\]
where \(MC(q) = \frac{\partial TC}{\partial q}\) denotes the marginal cost of output. Intuitively, this result says that,
to maximize its profits, the firm increases its output q until the point where the price from
selling an additional unit coincides with the additional cost that the firm incurs to produce
such extra unit.
This result should come as no surprise: if, instead, the firm chooses a volume of out-
put q for which p > MC(q), it could still increase its profits by producing more units,
given that the price that the firm receives per unit exceeds the marginal cost of produc-
ing such unit; and if the firm chooses q where p < MC(q), it could increase its profits by
producing fewer units because the price it receives per unit is less than the marginal cost of
producing it.
We can also check that p = MC(q) is a condition for the firm to maximize (rather than
minimize) its profits. In particular, this is done by finding the second-order conditions, which
are obtained by differentiating this result p − MC(q) = 0 with respect to output q again, and
checking that the result is negative. Indeed, differentiating p − MC(q) with respect to q
yields
\[0 - \frac{\partial MC}{\partial q},\]
which is negative (or zero) so long as \(\frac{\partial MC}{\partial q} \ge 0\). Essentially, if the firm’s marginal costs
are increasing (or constant) in output, condition p = MC(q) guarantees that the firm is
maximizing its profits.1
1. Formally, we say that condition p = MC(q) is not only a necessary condition (obtained from the first differen-
tiation of the firm’s profits with respect to q), but also a sufficient condition (because the second differentiation
with respect to q produced an expression that is negative). In other words, second-order conditions check that the
second derivative of the firm’s profit function is negative in its output q, which graphically indicates that profits are
concave in output.
Example 9.1: PMP in the Cobb-Douglas case Consider the firm of example 8.6 in
chapter 8, with Cobb-Douglas production function \(q = L^{1/2} K^{1/2}\). As found in example
8.6, its total cost function is TC(q) = 40q. Inserting this total cost into the firm’s PMP,
we obtain
\[\max_{q}\ \pi = pq - 40q.\]
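Differentiating with respect to output q yields p − 40 = 0, or p = $40. This result indicates
that, at a price of p ≥ $40, the firm produces as much as possible (at exactly p = $40, it earns
zero profit on every unit and is indifferent among all output levels). If, instead,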
the price is below that threshold (p < $40), the firm finds it optimal not to supply any
units whatsoever, producing q = 0 units. Figure 9.1a depicts this supply curve.
Figure 9.1
(a) Supply curve with a linear cost function (flat at p = $40). (b) Supply curve with a convex cost function (p = 80q, or q = (1/80)p). Both panels plot price p on the vertical axis and output q on the horizontal axis.
Assume now that the firm’s cost function was \(TC(q) = 40q^2\), which is convex in
output q.2 In this scenario, condition p = MC(q) becomes p = 80q. Solving for output
q, we can find the firm’s supply, \(q = \frac{p}{80}\). Because \(\frac{p}{80}\) is increasing in price p, the firm
supplies more units as the price increases. Figure 9.1b plots this supply curve, which
originates at zero, and grows in p.
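As a quick numerical check of the convex-cost case, the following Python sketch compares the closed-form supply q = p/80 with a brute-force search over output levels. The function names, the grid step, and the price of $80 are illustrative assumptions, not part of the example.

```python
def profit(q, p):
    # Profit with the convex cost function TC(q) = 40 q^2.
    return p * q - 40 * q**2


def supply(p):
    # Closed-form supply from the condition p = MC(q) = 80 q.
    return p / 80


def grid_argmax(p, step=0.001, q_max=5.0):
    # Brute-force search over a grid of output levels (illustrative only).
    n = int(q_max / step)
    candidates = [i * step for i in range(n + 1)]
    return max(candidates, key=lambda q: profit(q, p))


price = 80.0  # hypothetical market price
q_closed = supply(price)
q_grid = grid_argmax(price)
print(q_closed, q_grid)
```

Both approaches should agree up to the grid step, which illustrates that p = MC(q) identifies a profit maximum when marginal cost is increasing in output.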
Self-assessment 9.1 Repeat the analysis in example 9.1, but assuming that the
firm’s total cost function is now \(TC(q) = 5 + 40q^2\). Find and depict its supply curve.
9.4 Supply Curves
9.4.1 Individual Firm Supply

In this section, we use the result from the firm’s PMP, p = MC(q), to obtain the firm’s supply
curve. One can immediately guess that we only need to plot the firm’s marginal cost (MC)
curve MC(q), as in figure 9.2. In particular, for each price p on the vertical axis, we can
move rightward along the dotted lines toward the MC(q) curve, mapping the curve on the
horizontal axis, as illustrated by the arrows on the graph. Intuitively, this mapping from the
vertical to the horizontal axis says, for each price p, how many units the firm produces to
maximize its profits.3
In the long run, the amounts of all inputs can be varied or, in other words, there are no
fixed costs. Hence, the average cost curve AC(q) only includes variable costs (because both
labor and capital can be altered). Figure 9.3 superimposes the firm’s AC(q) curve on top of
the MC(q) curve depicted in figure 9.2.
This analysis of the firm’s supply curve assumes that the firm would supply units of output
even when the market price p falls below the firm’s average cost, AC(q). Doing so, however,
would result in losses, as depicted in figure 9.3. Hence, this production strategy would never
be chosen by the firm. Informally, it prefers to shut down its operations rather than making a
loss on every unit. As a consequence, the firm’s supply curve can be given by the relationship
found above, p = MC(q), but only on the segment of the MC curve, MC(q), that lies above
2. Indeed, the marginal cost is \(MC(q) = \frac{\partial TC(q)}{\partial q} = 80q\), and its derivative is \(\frac{\partial MC(q)}{\partial q} = 80\), which is positive for all
output levels q. Graphically, the total cost increases in q, and at an increasing rate.
3. Mathematically, this mapping from the vertical to the horizontal axis is just the inverse of MC(q). That is, if the
firm’s profit-maximizing condition is p = MC(q), this mapping would be equivalent to solving for q in p = MC(q),
as in example 9.1.
Figure 9.2
Supply curve and MC(q). (Along the MC(q) curve, a price of $5 maps to 10 units and a price of $7 maps to 12 units.)
Figure 9.3
Average and marginal costs. (The supply curve follows MC(q) above the price p = min AC(q).)
the AC(q). Prices above the AC(q) curve help the firm make a positive profit margin per unit,
implying that the firm prefers staying active (producing a positive output level) to shutting
down. For prices below the AC(q) curve, the firm prefers to shut down, indicated by the
vertical spike along the vertical axis where q = 0.
Recall that in this section, we are considering a long-run approach where the firm can
alter the units of all inputs. In contrast, section 9.5 will examine a short-run scenario where
the firm cannot alter all inputs (some inputs are fixed), where we show that the firm can stay
active so long as price p covers its average variable cost AVC(q).
Example 9.2: Finding the long-run supply curve Consider a firm with total cost
curve given by \(TC(q) = -5q + 2q^2\). We can find its MC curve by differentiating TC(q)
with respect to output q, obtaining \(MC(q) = \frac{\partial TC(q)}{\partial q} = -5 + 4q\). Figure 9.4 depicts this
marginal cost curve, which originates at −5 (in the negative quadrant) and increases in q
at a rate of 4. Setting p = MC(q), we obtain p = −5 + 4q which, solving for output q,
yields
\[q = \frac{p+5}{4} = \frac{p}{4} + \frac{5}{4}.\]
This curve, however, is not necessarily the firm’s supply curve. To find that curve,
we first need to find the firm’s average cost \(AC(q) = \frac{TC(q)}{q} = -5 + 2q\), and compare it
with the MC(q) found previously.4 To obtain the point where the MC(q) and AC(q)
curves cross, which constitutes the firm’s “shutdown price,” we set the two curves
equal to each other:
\[-5 + 4q = -5 + 2q,\]
which simplifies to q = 0. At this output level, the firm’s marginal cost is MC(0) =
−5 + (4 × 0) = −5. Hence, for any positive market price p, the firm produces a pos-
itive output level, according to \(q(p) = \frac{p}{4} + \frac{5}{4}\); but shuts down, producing zero output
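Figure 9.4
Finding a supply curve. (Curves shown: MC(q) = −5 + 4q and AC(q) = −5 + 2q, with price p on the vertical axis and output q on the horizontal axis; AC(q) crosses the horizontal axis at q = 2.5 units, and the firm supplies q = 3.75 units at p = $10.)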
4. The AC(q) curve just obtained originates at a height of −5 when q = 0, and increases at a rate of 2, crossing the
horizontal axis when −5 + 2q = 0, or \(q = \frac{5}{2} = 2.5\) units of output.
q(p) = 0, when p = $0. We can summarize our results with the following supply
function:
\[q(p) = \begin{cases} \frac{p}{4} + \frac{5}{4} & \text{if } p > 0 \\ 0 & \text{otherwise,} \end{cases}\]
as depicted in the thick segment of figure 9.4.

For instance, if the market price is p = $10, the firm supplies \(q = \frac{10}{4} + \frac{5}{4} = \frac{15}{4} = 3.75\)
units, as depicted in figure 9.4, whereas if the price the firm faces is p = $16, the
firm increases its production to \(q = \frac{16}{4} + \frac{5}{4} = \frac{21}{4} = 5.25\) units.
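The piecewise supply function of example 9.2 can be written as a short Python sketch. The function name long_run_supply is a hypothetical label chosen for illustration.

```python
def long_run_supply(p):
    """Supply of the firm in example 9.2: q = p/4 + 5/4 if p > 0, and 0 otherwise."""
    if p > 0:
        return p / 4 + 5 / 4
    return 0.0


# Evaluate the supply function at the prices discussed in the example.
for price in (0, 10, 16):
    print(price, long_run_supply(price))
```

Evaluating it at p = $10 and p = $16 reproduces the 3.75 and 5.25 units computed in the example.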
Self-assessment 9.2 Repeat the analysis in example 9.2, but assuming that the firm’s
TC curve is \(TC(q) = -5q + 8q^2\). Find MC(q), AC(q), and the firm’s supply curve.
9.4.2 Market Supply

After finding the individual supply curve of each firm, we can easily aggregate them in
order to obtain the market (or aggregate) supply. This can be found by horizontally summing
across all individual supplies in the industry. Example 9.3 examines market supply when
the number of firms in the industry, N, is given (i.e., no entry or exit occurs).
Example 9.3: Finding market supply Consider N firms, each with the individual
supply curve we found in example 9.2:
\[q(p) = \begin{cases} \frac{p}{4} + \frac{5}{4} & \text{if } p > 0 \\ 0 & \text{otherwise.} \end{cases}\]
The market supply is then \(q^S(p) = N \times q(p)\); that is,
\[q^S(p) = \begin{cases} N\left(\frac{p}{4} + \frac{5}{4}\right) & \text{if } p > 0 \\ 0 & \text{otherwise.} \end{cases}\]
For instance, at a price p = $10, every firm supplies \(q = \frac{10}{4} + \frac{5}{4} = 3.75\) units, which
entails an aggregate supply of N × 3.75 units of output. (If, for example, there are
N = 200 firms in the industry, aggregate supply becomes 200 × 3.75 = 750 units.)
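The horizontal summation in example 9.3 amounts to multiplying the individual supply by the number of firms. A minimal Python sketch follows, with hypothetical function names and under the example's assumption that all N firms are identical.

```python
def firm_supply(p):
    # Individual supply from example 9.2.
    return p / 4 + 5 / 4 if p > 0 else 0.0


def market_supply(p, n_firms):
    # With identical firms, horizontal summation reduces to N times q(p).
    return n_firms * firm_supply(p)


print(market_supply(10, 200))
```

If firms had different cost functions, the sum would instead run over each firm's own supply function at the common price p.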
Self-assessment 9.3 Consider the supply curve found in self-assessment 9.2. Find
the market supply curve, and evaluate it at N = 100 firms to obtain the aggregate
supply.
9.5 Short-Run Supply Curve
The analysis thus far assumes that the amount of all inputs could be altered (the long-run
approach). In the short run, however, the amount of at least one input is considered fixed,
such as capital. In this section, we analyze how the firm’s supply curve is affected if the
manager operates in a short-run scenario. For tractability, consider a TC function
\[TC(q) = \underbrace{a}_{FC} + \underbrace{bq + cq^2}_{VC(q)},\]
where parameter \(a \ge 0\) captures the part of total costs that is unaffected by changes in output
(fixed cost, FC). In contrast, the last two terms, \(bq + cq^2\), depend on q and thus measure the
firm’s variable costs, VC. In this scenario, the average cost becomes
\[AC(q) = \frac{TC(q)}{q} = \underbrace{\frac{a}{q}}_{AFC} + \underbrace{b + cq}_{AVC(q)},\]
where \(\frac{a}{q}\) is the average fixed cost AFC(q), because a represents the fixed cost, whereas
b + cq denotes the average variable cost AVC(q), because \(bq + cq^2\) reflects the variable
cost. Because AC(q) = AFC + AVC(q), the difference between the AC(q) and AVC(q) is, of
course, the average fixed cost (AFC). In other words, AC(q) lies above AVC(q), as depicted
in figure 9.5.
For generality, we can allow a share of fixed cost, a, to be sunk (unrecoverable) or non-sunk
(recoverable). In particular, let \(a = a_S + a_{NS}\), where \(a_S\) denotes the sunk fixed cost, while
\(a_{NS}\) represents the non-sunk fixed cost. In this context, the firm’s average fixed cost is
\[AFC(q) = \frac{a}{q} = \frac{a_S + a_{NS}}{q}.\]
Figure 9.6 depicts the average non-sunk cost, \(ANSC(q) = \frac{a_{NS}}{q} + b + cq\), which lies
below the AC(q) curve, because \(AC(q) = \frac{a}{q} + b + cq\) and \(a_{NS} \le a\), but above the AVC(q) curve,
because AVC(q) = b + cq.
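The short-run average cost measures defined in this section can be tabulated with a small Python sketch. The function name and the parameter values (a sunk cost of 5, a non-sunk cost of 5, b = 1, c = 2) are hypothetical and serve only to illustrate how the curves rank at a given output level.

```python
def cost_curves(q, a_sunk, a_nonsunk, b, c):
    """Return AFC, AVC, ANSC, and AC at output q > 0 for TC(q) = a + b*q + c*q**2."""
    a = a_sunk + a_nonsunk           # total fixed cost
    avc = b + c * q                  # average variable cost
    afc = a / q                      # average fixed cost
    ansc = a_nonsunk / q + avc       # average non-sunk cost
    ac = afc + avc                   # average (total) cost
    return {"AFC": afc, "AVC": avc, "ANSC": ansc, "AC": ac}


# Hypothetical parameter values, chosen only for illustration.
params = dict(a_sunk=5.0, a_nonsunk=5.0, b=1.0, c=2.0)
for q in (0.5, 1.0, 2.0, 4.0):
    print(q, cost_curves(q, **params))
```

At every positive output level, the average non-sunk cost falls between the average variable cost and the average cost, because it adds only the recoverable part of the fixed cost to AVC(q).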
Figure 9.5
AC and MC curves. (Curves shown: MC(q), AC(q), and AVC(q), with the prices p = min AC(q) and p = min AVC(q) marked.)
Figure 9.6
AC functions when a share of fixed costs are sunk. (Curves shown: MC(q), AC(q), ANSC(q), and AVC(q), with the prices p = min AC(q) and p = min AVC(q) marked.)
The question remains: What is the firm’s supply curve in a short-run context? To answer
this question, we need to recognize that, by altering its output decision (including, if nec-
essary, shutting down its operations), the firm can avoid its AVC(q) and its non-sunk costs,
as the latter are recoverable; but it cannot recover its sunk costs. The firm will then produce
positive amounts so long as the market price exceeds its ANSC because its non-sunk costs
aggregate all those cost categories that can be avoided or recovered. Hence, the shutdown
price in a short-run scenario lies at exactly the minimum of the ANSC or, alternatively, at
the point where ANSC crosses MC(q).
Example 9.4: Finding the short-run supply curve Consider a firm with the same
TC function as in example 9.2, but with $10 in fixed costs (i.e., \(TC(q) = 10 - 5q + 2q^2\)),
and assume that this fixed cost is evenly distributed into sunk costs, $5, and
non-sunk costs, $5. In that example, we showed that MC(q) = −5 + 4q. We can now
find the expression of the non-sunk costs, \(NSC(q) = 5 - 5q + 2q^2\), which implies that
ANSC is
\[ANSC(q) = \frac{NSC(q)}{q} = \frac{5}{q} - 5 + 2q.\]
We can then set the MC(q) and ANSC(q) equal to each other, in order to find their
crossing point; that is,
\[-5 + 4q = \frac{5}{q} - 5 + 2q,\]
which simplifies to \(2q = \frac{5}{q}\). Solving for output q, we obtain \(q = \sqrt{2.5} \cong 1.58\) units.
Inserting this output level into the MC(q) curve, we find the shutdown price in this
short-run scenario, p = −5 + (4 × 1.58) = $1.32. In summary, the firm’s short-run
supply curve is
\[q(p) = \begin{cases} \frac{p}{4} + \frac{5}{4} & \text{if } p > \$1.32 \\ 0 & \text{otherwise.} \end{cases}\]
Comparing the short-run supply we just found against the long-run supply identified
in example 9.2, we can see that they are very similar, as the firm still produces along its
MC curve; but now the firm needs a higher price to start producing positive amounts,
$1.32, than in the long-run scenario, $0. Intuitively, the firm did not face any fixed
costs in the long run, as all input usage could be recovered. In the short run, some
costs are fixed and, more importantly, sunk (unrecoverable), inducing the firm to start
producing only when it faces a sufficiently high price.
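The shutdown price of example 9.4 can be verified numerically. The Python sketch below uses hypothetical function names and the closed-form crossing point q = √2.5 derived in the example.

```python
import math


def marginal_cost(q):
    # MC(q) = -5 + 4q, from TC(q) = 10 - 5q + 2q^2.
    return -5 + 4 * q


def avg_nonsunk_cost(q):
    # ANSC(q) = 5/q - 5 + 2q, with $5 of non-sunk fixed cost.
    return 5 / q - 5 + 2 * q


q_shutdown = math.sqrt(5 / 2)            # solves 2q = 5/q
p_shutdown = marginal_cost(q_shutdown)   # price at which MC crosses ANSC


def short_run_supply(p):
    # The firm produces along its MC curve only above the shutdown price.
    return p / 4 + 5 / 4 if p > p_shutdown else 0.0


print(round(q_shutdown, 2), round(p_shutdown, 2))
```

Because the code keeps unrounded values, the shutdown price may differ from the example's figure in the third decimal place.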
Self-assessment 9.4 Repeat the analysis in Example 9.4, but assuming that the
firm’s fixed costs are distributed differently: only $2 are sunk while $8 are non-sunk.
Find the firm’s short-run supply curve and compare your results against those in
Example 9.4.
9.6 Market Equilibrium
9.6.1 Short-Run Equilibrium

In the short run, one can assume that the number of firms in the industry, N, is given. That
is, the time span is sufficiently short to prevent firms from entering or exiting the industry
(e.g., a day or a week). In that scenario, we can use the market demand (aggregate demand
after summing all individual demands, as described in chapter 3) and the market supply
(after summing all individual supplies, as found in section 9.4) to analyze the equilibrium
output and price in that market.
Example 9.5: Finding short-run equilibrium output and price Consider a market
demand \(q^D(p) = 100 - 2p\), and the aggregate supply curve of example 9.3. These
cross each other when \(q^D(p) = q^S(p)\), or
\[100 - 2p = N\left(\frac{p}{4} + \frac{5}{4}\right),\]
which simplifies to 8p + N(5 + p) = 400. Solving for price p, we find an equilibrium
price of
\[p = \frac{5(80 - N)}{8 + N},\]
which is decreasing in the number of firms N.5 In addition, this price crosses the shut-
down price of p = $1.32 at \(N \cong 61.62\) firms.6 Hence, when there are fewer than 62
firms in the industry, the equilibrium price is \(p = \frac{5(80-N)}{8+N}\), which entails an aggregate
output of \(q = 100 - 2\,\frac{5(80-N)}{8+N} = \frac{110N}{8+N}\). (For instance, when N = 10 firms compete in
the industry, equilibrium price is \(p = \frac{5(80-10)}{8+10} = \frac{350}{18} \cong \$19.44\), while aggregate out-
put becomes \(q = \frac{110 \times 10}{8+10} = \frac{1{,}100}{18} \cong 61.1\) units.) In contrast, when there are 62 firms
or more, such as when N = 90, every firm sets its individual production at zero,
q = 0, implying that aggregate output is also zero in equilibrium.
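The closed-form expressions of example 9.5 can be evaluated for any number of firms with the following Python sketch. The function names are hypothetical, and the default shutdown price of 1.32 is the rounded value carried over from example 9.4.

```python
def equilibrium_price(n_firms):
    # p = 5(80 - N) / (8 + N), from 100 - 2p = N (p/4 + 5/4).
    return 5 * (80 - n_firms) / (8 + n_firms)


def aggregate_output(n_firms):
    # q = 110 N / (8 + N), obtained by inserting the price into demand.
    return 110 * n_firms / (8 + n_firms)


def firms_at_shutdown(p_shutdown=1.32):
    # Solves 5(80 - N) = p_shutdown (8 + N) for N.
    return (400 - 8 * p_shutdown) / (5 + p_shutdown)


print(round(equilibrium_price(10), 2), round(aggregate_output(10), 2))
print(round(firms_at_shutdown(), 2))
```

Evaluating at N = 10 reproduces the price and output reported in the example, and the last line recovers the number of firms at which the equilibrium price reaches the shutdown price.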
5. As an exercise, check that if we differentiate price \(p = \frac{5(80-N)}{8+N}\) with respect to N, we obtain \(\frac{\partial p}{\partial N} = -\frac{440}{(8+N)^2}\),
which is clearly negative. Expressed in words, this says that the equilibrium price decreases as more firms enter
the industry.
6. To see this point, set the price found previously, \(p = \frac{5(80-N)}{8+N}\), equal to the shutdown price of $1.32, so
that \(\frac{5(80-N)}{8+N} = 1.32\). Rearranging, we obtain 5(80 − N) = 1.32(8 + N). Solving for N, we find N = 61.62
firms.
Self-assessment 9.5 Repeat the analysis in example 9.5, but assume that the
demand function is now qD (p) = 350 − 2p. How do equilibrium price and quantity
change relative to those we found in example 9.5?
9.6.2 Long-Run Equilibrium

In a perfectly competitive market, firm entry can occur if potential entrants can make more
profits in this market than in other industries; and firm exit can happen if incumbent firms
incur losses (or earn lower profits than in other easily accessible industries). The market reaches an
equilibrium (i.e., a stable situation) when no more firms have an incentive to enter or exit the
industry. For that to occur, it must be that firms make no economic profits (i.e., they make
the same profits as in other competitive markets). As a consequence, two conditions must
hold: (1) profits for every firm are zero, which entails that p = min AC(q); and (2) aggre-
gate demand and supply cross each other, \(q^D(p) = q^S(p)\). We apply these two conditions to
example 9.6.
Example 9.6: Finding long-run equilibrium output and price Consider the same
market demand as in example 9.5, \(q^D(p) = 100 - 2p\), and an AC curve \(AC(q) = \frac{10}{q} - 5 + 2q\)
(so that \(TC(q) = 10 - 5q + 2q^2\) and MC(q) = −5 + 4q). Because in the long-run equilibrium,
the production of every firm, q, must
satisfy p = MC(q) = AC(q), we must have that MC(q) = AC(q). Setting MC(q) equal
to AC(q), we find that
\[-5 + 4q = \frac{10}{q} - 5 + 2q.\]
Rearranging, we obtain \(2q = \frac{10}{q}\), or \(q^2 = 5\). Taking the square root of both
sides yields an individual output of \(q = \sqrt{5} \cong 2.24\) units. This is the output level where
curve MC(q) crosses AC(q) at its minimum. In short, all firms produce an output of
q = 2.24 units, at an equilibrium price of \(p = MC(\sqrt{5}) = -5 + 4\sqrt{5} \cong \$3.94\),
which is the shutdown price in this scenario.
We now need to use only the second condition (aggregate demand equals aggregate supply) to obtain
the last unknown: the number of firms operating in the industry in equilibrium, \(N^*\).
To find \(N^*\), we set aggregate demand equal to aggregate supply, as follows:
\[100 - 2p = N\left(\frac{p}{4} + \frac{5}{4}\right).\]
Because we already found the equilibrium price, p = $3.94, we can insert it into
this expression to obtain
\[100 - (2 \times 3.94) = N\left(\frac{3.94}{4} + \frac{5}{4}\right),\]
which, solving for N, yields an equilibrium number of firms of \(N^* = \frac{92.12}{2.235} \cong 41.2\)
(i.e., 41 firms are active in the industry, as no fractional firms can enter). Interestingly,
as demand increases, the equilibrium number of firms \(N^*\) grows as well. For instance,
if demand increases from that in the current example, \(q^D(p) = 100 - 2p\), to \(q^D(p) = 4{,}000 - 2p\),
the equilibrium number of firms grows to roughly \(N^* = 1{,}786\) firms. Essentially,
because all firms produce the same output, q = 2.24 units, an increase in demand
attracts more firms to the industry.7
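The two long-run conditions of example 9.6 can be chained in a short Python sketch. The names are hypothetical, and the code assumes a linear demand of the form intercept − 2p, as in the example.

```python
import math

# Condition 1: zero profits, so every firm produces where MC(q) = AC(q), i.e., q^2 = 5.
q_firm = math.sqrt(5)
price = -5 + 4 * q_firm  # p = MC(q) at that output


def equilibrium_firms(demand_intercept):
    # Condition 2: demand (intercept - 2p) equals N times the individual supply.
    quantity_demanded = demand_intercept - 2 * price
    return quantity_demanded / q_firm


n_star = equilibrium_firms(100)
print(round(q_firm, 2), round(price, 2), int(n_star))
```

The sketch keeps unrounded intermediate values, so its decimals can differ slightly from those in the example, which rounds the price and the individual output before dividing. The difference is larger for the high-demand case, where the example's rounded figures give about 1,786 firms and the unrounded values give about 1,785.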
Self-assessment 9.6 Repeat the analysis in example 9.6, but assume that the
demand function is now \(q^D(p) = 350 - 2p\). How do equilibrium price, quantity, and
number of firms in the industry change relative to those we found in example 9.6?
9.7 Producer Surplus
As described in chapter 5, the consumer surplus represents the difference between the con-
sumer’s maximum willingness-to-pay for an object and the price that she actually pays, p.
Graphically, the consumer surplus was given by the area below the demand curve (as that
captures the consumer’s maximum willingness-to-pay for each unit) and above market price
p. A similar argument applies to the analysis of producer surplus, as described next.
Producer surplus The difference between the price that the producer receives for
its product, p, and its marginal cost from producing that unit.
7. In the short run, where the number of firms is fixed, an increase in demand would lead to a short-lived increase
in prices above the shutdown price of $3.94. Graphically, the supply curve would be unaffected, but the demand
curve would shift rightward, thus producing a crossing point of demand and supply to the northeast of the initial
crossing point. Because the initial equilibrium price coincided with the shutdown price, the new (higher) price entails p >
min AC(q), thus allowing every firm to earn economic profits. As information about prices is common knowledge,
however, firms in other markets would be attracted to this industry in the long run, ultimately pushing equilibrium
prices down to p = min AC(q), with zero economic profit.
Graphically, the producer surplus (PS) is given by the region below the prevailing market
price p and above the firm’s supply curve because the latter is found by setting p = MC(q) and
solving for q. Importantly, recall that the firm’s marginal cost comes from the derivative of
the total cost, \(MC(q) = \frac{\partial TC(q)}{\partial q}\); and, in turn, the total cost TC(q) was the result of minimizing
the firm’s costs (i.e., choosing input combinations that minimize costs). As a consequence,
PS measures the profit margin that the firm makes by comparing the price that it receives
from each unit against the minimal cost that the firm incurs when producing 1 extra unit, as
captured by MC(q).
Example 9.7: Finding producer surplus Consider the supply function found in
example 9.6, p = −5 + 4q, or \(q = \frac{p}{4} + \frac{5}{4}\), as depicted in figure 9.7. In that scenario,
we found the shutdown price to be \(p^{ShutDown} = \$3.94\). Let us evaluate PS when market
price is p = $15.
Because this supply curve is linear, we can find PS by calculating the area of
rectangle A plus triangle B in figure 9.7, as follows:
\[PS = \underbrace{\left(p - p^{ShutDown}\right) q^{ShutDown}}_{\text{Area A}} + \underbrace{\frac{1}{2}\left(p - p^{ShutDown}\right)\left(q - q^{ShutDown}\right)}_{\text{Area B}},\]
where p = $15 denotes the price we consider, \(p^{ShutDown} = \$3.94\) expresses the shut-
down price found in example 9.6, \(q^{ShutDown} = 2.24\) units is the output every firm
produces at the shutdown price, and q denotes the units sold at price p = $15, which
are \(q = \frac{15}{4} + \frac{5}{4} = \frac{20}{4} = 5\) units. Using this information, PS in this example becomes
Figure 9.7
Producer surplus, PS = A + B. (Areas A and B lie between the price p = $15 and the supply curve along MC(q) = −5 + 4q, starting at min AC = $3.94, where q = 2.24 units, and ending at q = 5 units.)
\[PS = \underbrace{(15 - 3.94)\,2.24}_{\text{Area A}} + \underbrace{\frac{1}{2}(15 - 3.94)(5 - 2.24)}_{\text{Area B}} = 24.77 + 15.26,\]
which simplifies to PS = $40.03.
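The area calculation in example 9.7 can be reproduced with a brief Python sketch. The function name is hypothetical, and the supply curve q = p/4 + 5/4 is the one used in the example.

```python
import math


def producer_surplus(p, p_shutdown, q_shutdown):
    # Units sold at price p along the supply curve q = p/4 + 5/4.
    q = p / 4 + 5 / 4
    area_a = (p - p_shutdown) * q_shutdown            # rectangle A
    area_b = 0.5 * (p - p_shutdown) * (q - q_shutdown)  # triangle B
    return area_a, area_b, area_a + area_b


# Rounded inputs, as in the example.
rounded = producer_surplus(15, 3.94, 2.24)
# Unrounded inputs: q_shutdown = sqrt(5) and p_shutdown = -5 + 4*sqrt(5).
exact = producer_surplus(15, -5 + 4 * math.sqrt(5), math.sqrt(5))
print([round(v, 2) for v in rounded])
print([round(v, 2) for v in exact])
```

With the rounded shutdown price and output, the total is a few cents above $40, in line with the example. With unrounded values, the same formula gives a total of $40, so the small excess is a rounding effect.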
Self-assessment 9.7 Repeat the analysis in example 9.7, but assume that equili-
brium price is p = $13.
9.8 General Equilibrium
In this section, we extend this analysis to markets with more than one good. In particular,
we seek to find equilibrium prices for which the demand and supply for every good are
compatible with one another.
For presentation purposes, we consider markets with two goods, 1 and 2, but the analysis
extends to larger markets. In this scenario, an endowment
\[e \equiv \left(e^A_1, e^A_2;\ e^B_1, e^B_2\right)\]
denotes the amount of goods 1 and 2 that consumers A and B enjoy when they do not trade.
Consumer A’s endowment of good 1 is then \(e^A_1\), while that of good 2 is \(e^A_2\). Similarly, con-
sumer B’s endowment of good 1 is \(e^B_1\), and that of good 2 is \(e^B_2\). For instance, an endowment
could be e = (4, 1; 2, 3), indicating that consumer A initially has 4 units of good 1 and only
1 unit of good 2. Consumer B, however, has 2 units of good 1 and 3 of good 2.
Figure 9.8 depicts this endowment e using the so-called Edgeworth box. Graphically, the
origin of consumer A is in the southwest corner, so units of good 1 are illustrated on the
horizontal axis and units of good 2 on the vertical axis. The origin of consumer B is in
the northeast corner of the graph. To understand that consumer B is endowed with 2 units
of good 1 and 3 units of good 2, you may want to rotate the book 180 degrees. The length
of the horizontal axis represents the total endowment of good 1, \(e^A_1 + e^B_1\) (which is equal to
6 units in this example), while the length of the vertical axis measures the total endowment
of good 2, \(e^A_2 + e^B_2\) (4 units in the ongoing example).
Using a similar notation, we denote an allocation as
\[x \equiv \left(x^A_1, x^A_2;\ x^B_1, x^B_2\right),\]
which lists the amount of goods 1 and 2 that consumers A and B enjoy. Allocation x can
differ from the initial endowment e if individuals trade among themselves. For instance, an
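Figure 9.8
Example of an endowment. (Edgeworth box with the origin for consumer A in the southwest corner and the origin for consumer B in the northeast corner; endowment e gives consumer A 4 units of good 1 and 1 unit of good 2, and consumer B 2 units of good 1 and 3 units of good 2.)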
allocation x = (2, 3; 4, 1) indicates that consumer A enjoys 2 units of good 1 and 3 of good
2, while consumer B has 4 units of good 1 and only 1 unit of good 2. As an exercise, you
can depict this allocation in figure 9.8.
In the subsequent discussion, we focus on feasible allocations, as defined next.
Feasible allocation An allocation x is feasible if x ≤ e.
Condition x ≤ e says that the aggregate amount that all individuals consume, x, does not
exceed the aggregate amount they initially owned, e. In the context of two consumers and
two goods, this condition can be expressed as
\[x^A_1 + x^B_1 \le e^A_1 + e^B_1 \ \text{ for good 1, and}\]
\[x^A_2 + x^B_2 \le e^A_2 + e^B_2 \ \text{ for good 2.}\]
In figure 9.8, feasibility says that the allocation that consumers A and B enjoy cannot lie
outside the box because, as discussed previously, the total endowment of good 1, \(e^A_1 + e^B_1\),
is measured by the length of the horizontal axis in the Edgeworth box, while the total
endowment of good 2, \(e^A_2 + e^B_2\), is given by the length of the vertical axis.
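The feasibility condition can be checked good by good, as in the following Python sketch. The function name and the tuple layout ((A's good 1, A's good 2), (B's good 1, B's good 2)) are assumptions made for illustration.

```python
def is_feasible(allocation, endowment):
    """Check that total consumption of each good does not exceed its total endowment."""
    n_goods = len(endowment[0])
    for k in range(n_goods):
        total_consumed = sum(consumer[k] for consumer in allocation)
        total_owned = sum(consumer[k] for consumer in endowment)
        if total_consumed > total_owned:
            return False
    return True


e = ((4, 1), (2, 3))   # endowment from the running example
x = ((2, 3), (4, 1))   # allocation discussed in the text
print(is_feasible(x, e))
```

The allocation x = (2, 3; 4, 1) uses exactly 6 units of good 1 and 4 units of good 2, so it lies inside the Edgeworth box, whereas an allocation that assigns more than 6 units of good 1 in total would be rejected.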
We next examine the equilibrium allocations that emerge when individuals trade between
themselves, then define what we mean by efficient allocations in this context, and
finally compare efficient and equilibrium allocations.
9.8.1 Equilibrium Prices
Equilibrium price A price vector (p1 , p2 ) is in equilibrium if it clears the markets

for both good 1 and good 2.
In other words, the demand for every good k = {1, 2} from all individuals in the economy
(which we refer to as “aggregate demand”) coincides with the supply of that good in the
economy (“aggregate supply”).
We can understand this definition by considering what would happen otherwise: If aggre-
gate demand exceeds aggregate supply, agents could charge more for the items they sell,
implying that the initial price was not in equilibrium. Likewise, if aggregate supply exceeds
aggregate demand, buyers won’t be willing to pay so much for the product, forcing suppliers
to reduce prices. When aggregate demand and supply coincide, suppliers and buyers have
no incentive to increase or decrease prices.
How can we put this definition to use? We can do this by finding the demand of every
consumer and every good k because the sum of these expressions constitutes the aggregate
demand for good k. Aggregate supply similarly can be found by summing across the individ-
ual supply of every firm. For simplicity, we first consider an economy without production,
where individuals exchange the endowments they do not plan to consume (i.e., a “barter
economy”). In that scenario, supply of good k is just given by the total amount of good k in
the endowments of consumers A and B.
The demand from consumer A is obtained from solving her utility maximization problem
(UMP), which yields tangency condition \(MRS^A_{1,2} = \frac{p_1}{p_2}\); see chapter 3 for details. Similarly,
the demand from consumer B is found by solving her tangency condition \(MRS^B_{1,2} = \frac{p_1}{p_2}\).
Hence, the demands from these two individuals are compatible if the price ratio \(\frac{p_1}{p_2}\) satisfies
\[MRS^A_{1,2} = MRS^B_{1,2} = \frac{p_1}{p_2}.\]
Intuitively, this condition says that the market is in equilibrium when the indifference
curves of consumers A and B are tangent to one another. That is, their slopes, captured by
the marginal rate of substitution (MRS), coincide. Example 9.8 illustrates how to use this
condition to find equilibrium prices that clear all markets.
Example 9.8: Finding an equilibrium allocation and price Consider two con-
sumers with the Cobb-Douglas utility function \(u^i(x^i_1, x^i_2) = x^i_1 x^i_2\) for every consumer
i, and endowments \((e^A_1, e^A_2) = (100, 350)\) for consumer A and \((e^B_1, e^B_2) = (100, 50)\) for
consumer B. Figure 9.9 illustrates this initial endowment, with a total of 200 units of
good 1 and 400 units of good 2, which explains why the vertical axis in this example is
Figure 9.9
The endowment in example 9.8. (Edgeworth box with endowment e: consumer A holds 100 units of good 1 and 350 units of good 2; consumer B holds 100 units of good 1 and 50 units of good 2.)
longer than the horizontal axis. Intuitively, consumers exhibit symmetric preferences
for goods 1 and 2 but start with asymmetric endowments of good 2 because consumer
A owns 350 units, while B only owns 50 units.
In this scenario, it is straightforward to find consumer A’s demand.
Consumer A. Using her tangency condition, \(MRS^A_{1,2} = \frac{p_1}{p_2}\) yields \(\frac{x^A_2}{x^A_1} = \frac{p_1}{p_2}\) or, after
rearranging, \(p_2 x^A_2 = p_1 x^A_1\). Inserting this result into consumer A’s budget constraint,
\(p_1 x^A_1 + p_2 x^A_2 = p_1 100 + p_2 350\), we obtain \(p_1 x^A_1 + p_1 x^A_1 = p_1 100 + p_2 350\), which sim-
plifies into consumer A’s demand for good 1:
\[x^A_1 = 50 + 175\frac{p_2}{p_1}.\]
Plugging this expression back into the tangency condition \(p_2 x^A_2 = p_1 x^A_1\), we obtain
\[p_1 \underbrace{\left(50 + 175\frac{p_2}{p_1}\right)}_{x^A_1} = p_2 x^A_2,\]
and, solving for \(x^A_2\), yields consumer A’s demand for good 2, \(x^A_2 = 175 + 50\frac{p_1}{p_2}\).
Consumer B. We can follow a similar approach to find consumer B’s demand for
goods 1 and 2, by using her tangency condition \(MRS^B_{1,2} = \frac{p_1}{p_2}\), which yields \(\frac{x^B_2}{x^B_1} = \frac{p_1}{p_2}\)
or, after rearranging, \(p_2 x^B_2 = p_1 x^B_1\). We leave these calculations for the reader as an
exercise. Following the same steps as for consumer A, you should obtain that consumer
B’s demands are
\[ x^B_1 = 50 + 25 \frac{p_2}{p_1} \quad \text{and} \quad x^B_2 = 25 + 50 \frac{p_1}{p_2}. \]
Lastly, we only need to find equilibrium prices. Inserting the demands for good 1 from
consumers A and B into the feasibility condition, \(x^A_1 + x^B_1 = 100 + 100\), we obtain
\[ \underbrace{50 + 175 \frac{p_2}{p_1}}_{x^A_1} + \underbrace{50 + 25 \frac{p_2}{p_1}}_{x^B_1} = 200, \]
which simplifies to \(100 + 200 \frac{p_2}{p_1} = 200\). Solving for \(\frac{p_2}{p_1}\), we find an equilibrium price
ratio of
\[ \frac{p_2}{p_1} = \frac{1}{2}. \]
Figure 9.10
The equilibrium allocation in example 9.8. (Edgeworth box showing endowment e and the equilibrium
allocation x*, where indifference curves IC^A and IC^B are tangent: consumer A holds 137.5 units of
good 1 and 275 units of good 2, while consumer B holds 62.5 units of good 1 and 125 units of good 2.)
Plugging this price ratio into these demands yields an equilibrium allocation of
\(x^A_1 = 137.5\) and \(x^A_2 = 275\) units for consumer A, and \(x^B_1 = 62.5\) and \(x^B_2 = 125\) units for
consumer B. Figure 9.10 superimposes this equilibrium allocation onto figure 9.9.
Relative to the initial endowment, consumer A gives up 350 − 275 = 75 units of
good 2 to gain 137.5 − 100 = 37.5 units of good 1; and consumer B gains 75 units
of good 2 and gives up 37.5 units of good 1.
Finally, we can show that both consumers are made better off by trading. Consumer
A’s utility with her initial endowment is 350 × 100 = 35,000 but at the equilibrium
allocation her utility increases to 275 × 137.5 = 37,812.5. A similar argument
applies to consumer B, who sees her utility increase from 50 × 100 = 5,000 with her
endowment to 62.5 × 125 = 7,812.5.
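The steps of example 9.8 can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the function names `demand` and `equilibrium_price_ratio` are ours (not the book's), and the closed-form demand relies on the utility function u = x1 · x2 from the example, for which each consumer spends half of the value of her endowment on each good.

```python
from fractions import Fraction

def demand(endowment, p1, p2):
    """Demand for u = x1 * x2: half of the endowment's value goes to each good."""
    e1, e2 = endowment
    income = p1 * e1 + p2 * e2          # market value of the endowment
    return income / (2 * p1), income / (2 * p2)

def equilibrium_price_ratio(e_a, e_b):
    """Price ratio p2/p1 that clears the market for good 1 (hypothetical helper).

    With p1 normalized to 1, each consumer demands x1 = (e1 + p2 * e2) / 2,
    so x1A + x1B = e1A + e1B implies p2 = (e1A + e1B) / (e2A + e2B).
    """
    return Fraction(e_a[0] + e_b[0], e_a[1] + e_b[1])

e_a, e_b = (100, 350), (100, 50)         # endowments from example 9.8
ratio = equilibrium_price_ratio(e_a, e_b)
p1, p2 = Fraction(1), ratio              # normalize p1 = 1
x_a = demand(e_a, p1, p2)
x_b = demand(e_b, p1, p2)

print("p2/p1 =", ratio)
print("A:", [float(v) for v in x_a], "B:", [float(v) for v in x_b])
```

Using exact fractions avoids rounding noise, so the printed allocation can be compared directly with the values reported in the example (137.5 and 275 units for consumer A, 62.5 and 125 units for consumer B).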
Self-assessment 9.8 Repeat the analysis in example 9.8, but assume that the
endowment of consumer B changes to \((e^B_1, e^B_2) = (50, 100)\). How are equilibrium
allocations and price affected by this endowment change?
9.8.2 Efficient Allocations

In this section, we examine whether equilibrium allocations are efficient. This is an inter-
esting question because, if the allocation that emerges in equilibrium when individuals
exchange goods is efficient (in the sense we define here), no government intervention is
needed. We consider two assumptions in the rest of this chapter: (1) every consumer’s util-
ity function is strictly increasing in the goods she enjoys, being unaffected by the amount of
goods the other individual consumes; and (2) markets for goods 1 and 2 exist with prices p1
and p2 , which all consumers take as given.
Efficient allocation A feasible allocation x is efficient if we cannot find another

feasible allocation y that strictly increases the utility of at least one individual without
reducing the utility of any other individual.
In other words, we cannot rearrange the bundles that each individual consumes to make
at least one of them strictly better off than she is with x, without making another individual
worse off. The appendix in this chapter shows that efficiency entails that the indifference
curves of consumers A and B must be tangent to one another, thus having the same slope.
Because the slope of an indifference curve is measured with the MRS, we can say that an
efficient allocation x requires
\[ MRS^A_{1,2} = MRS^B_{1,2}. \]
9.8.3 Equilibrium versus Efficiency

From the previous section, we have learned that an equilibrium allocation occurs when the
indifference curves of consumers A and B are tangent to one another and have slopes equal
to the price ratio (i.e., \(MRS^A_{1,2} = MRS^B_{1,2} = \frac{p_1}{p_2}\)). As a consequence, an equilibrium allocation
also satisfies the efficiency condition \(MRS^A_{1,2} = MRS^B_{1,2}\). In other words, equilibrium
allocations are efficient. This result is often known as the “First Welfare Theorem” and, given its
importance, we include it next.
First Welfare Theorem Every equilibrium allocation is efficient.
Therefore, the equilibrium allocation that emerges when individuals are allowed to trade
among themselves cannot be improved upon by a benevolent social planner (e.g., a gov-
ernment official) who reassigns goods between consumers. That is, the planner will not be
able to increase the utility of at least one individual without decreasing the utility of another
individual.
Importantly, this result cannot be interpreted as saying that markets are always efficient.
Instead, it means that markets are efficient so long as assumptions (1) and (2) hold, but
may be inefficient when these assumptions are violated. Examples where assumption (1)
does not hold include consumers caring about the amount of goods that other individuals
enjoy (exhibiting envy or guilt). Similarly, instances where assumption (2) is not satisfied
include those in which the market for one of the goods does not exist (such as for bads like
pollution), or if it does exist, consumers have market power, and thus fail to take prices as
given. We explore scenarios where agents sustain market power in chapters 10, 11, and 14,
and contexts where markets may fail to exist in chapter 16.
Example 9.9: Finding efficient allocations Consider the consumers of example
9.8. The tangency condition \(MRS^A_{1,2} = MRS^B_{1,2}\) yields \(\frac{x^A_2}{x^A_1} = \frac{x^B_2}{x^B_1}\) or, after rearranging,
\(x^A_2 x^B_1 = x^B_2 x^A_1\). The feasibility requirement for good 1 says that \(x^A_1 + x^B_1 = 100 + 100\),
or \(x^B_1 = 200 - x^A_1\). Similarly, the feasibility requirement for good 2 says that \(x^A_2 + x^B_2 =
350 + 50\), or \(x^B_2 = 400 - x^A_2\). Inserting these feasibility equations into the tangency
condition, \(x^A_2 x^B_1 = x^B_2 x^A_1\), yields
\[ x^A_2 \underbrace{(200 - x^A_1)}_{x^B_1} = \underbrace{(400 - x^A_2)}_{x^B_2} x^A_1, \]
which simplifies to \(x^A_2 = 2x^A_1\). Essentially, for an allocation to be efficient, consumer A
must enjoy twice as many units of good 2 as of good 1. Figure 9.11 illustrates this
line, which starts at the origin of consumer A and grows with a slope of 2.
Consumer B must then enjoy the remaining \(x^B_1 = 200 - x^A_1\) units of good 1 and \(x^B_2 =
400 - x^A_2\) of good 2.
Figure 9.11
Efficient allocations. (Edgeworth box in which the efficient allocations form a straight line from
consumer A’s origin to consumer B’s origin.)
Self-assessment 9.9 Consider the scenario in example 9.9, but assume that the
endowment of individual B changes to \((e^B_1, e^B_2) = (50, 100)\). Find the set of efficient
allocations. Is it affected by the change in individual B’s endowment?
Example 9.10: Testing the First Welfare Theorem Is the equilibrium allocation
found in example 9.8 efficient? For it to be efficient, the condition that we found
in example 9.9, \(x^A_2 = 2x^A_1\), must hold. It is indeed satisfied because in example 9.8,
we found that \(x^A_1 = 137.5\) and \(x^A_2 = 275\) for consumer A, where \(x^A_2 = 2 \times 137.5 = 275\)
units.
Figure 9.12
Efficient and equilibrium allocations. (Edgeworth box in which the equilibrium allocation x*, where
IC^A and IC^B are tangent, lies on the line of efficient allocations: 137.5 units of good 1 and 275
units of good 2 for consumer A, and 62.5 units of good 1 and 125 units of good 2 for consumer B.)
Figure 9.12 superimposes the efficiency condition (from figure 9.11) and the equilibrium
allocation (from figure 9.10), showing that, indeed, the equilibrium allocation
lies on the line of efficient allocations.
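The check in example 9.10 can also be run in a few lines. This is a minimal sketch, assuming the utility function u = x1 · x2 of example 9.8, so that the MRS is the ratio of marginal utilities x2/x1; the helper name `mrs` is ours.

```python
def mrs(x1, x2):
    """MRS for u = x1 * x2: marginal utility of good 1 over that of good 2."""
    return x2 / x1

x_a = (137.5, 275.0)     # consumer A's equilibrium bundle (example 9.8)
x_b = (62.5, 125.0)      # consumer B's equilibrium bundle (example 9.8)
price_ratio = 2.0        # p1 / p2, because p2 / p1 = 1/2

# Efficiency condition from example 9.9: consumer A's bundle satisfies x2 = 2 * x1.
on_efficiency_line = x_a[1] == 2 * x_a[0]

print(mrs(*x_a), mrs(*x_b), price_ratio, on_efficiency_line)
```

Both marginal rates of substitution equal the price ratio, so the equilibrium condition and the efficiency condition hold at the same allocation.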
Self-assessment 9.10 Consider the equilibrium allocation you found in
self-assessment 9.8. Is it efficient? Hint: It must satisfy the efficiency condition you found
in self-assessment 9.9.
The First Welfare Theorem, informally, says that if we let market forces work, a social
planner (even if she is perfectly informed about all individuals’ preferences and their endow-
ments) won’t be able to improve welfare. In other words, the theorem provides an argument
against market intervention. (In future chapters, we explore under which conditions market
failures exist and this theorem does not hold, and thus government intervention might be
necessary.)
A natural question at this point is whether the converse relationship of that in the First
Welfare Theorem also holds. That is, can every efficient allocation emerge as an equilibrium
outcome?
Second Welfare Theorem Consider an efficient allocation x, and a redistribution
of the initial endowment, from e to \(\bar{e}\), which satisfies \(p\bar{e}^i = px^i\) for every
individual i = {A, B}. Then, every efficient allocation can be supported as an equilibrium
allocation given the new endowment \(\bar{e}\).
To better understand this theorem, consider a scenario where society prefers, among all
efficient allocations, a specific allocation x. The theorem says that this allocation can emerge
in equilibrium if we redistribute the initial endowments and then let the market work by
allowing individuals to trade among themselves. In other words, for every efficient allocation
we seek to implement, we can find the appropriate redistribution of endowments that will
ultimately move the equilibrium toward that efficient allocation.8
One way to redistribute endowments across consumers is by taxing some consumers and
distributing tax collection among other consumers as a subsidy. We explore this type of
redistribution scheme in example 9.11.
8. Like the First Welfare Theorem, this theorem holds if assumptions (1) and (2) are satisfied, but may not hold
otherwise. In other words, we may not be able to implement an efficient allocation via redistribution if consumers
care about the amount other individuals enjoy (violating assumption 1), if the market for some good does not exist
(violating assumption 2) or if consumers do not take prices as given (violating assumption 2).
Example 9.11: Testing the Second Welfare Theorem The efficient allocations
found in example 9.9 satisfy \(x^A_2 = 2x^A_1\), where \(x^A_1 \in [0, 200]\). One specific allocation
satisfying this condition is \(x^A_1 = 100\) and \(x^A_2 = 200\) units for consumer A, which leaves
\(x^B_1 = 200 - x^A_1 = 200 - 100 = 100\) units of good 1 and \(x^B_2 = 400 - x^A_2 = 400 - 200 =
200\) units of good 2 for consumer B. Which redistribution of the initial endowment can
lead to such an allocation emerging in equilibrium? From the equilibrium allocation
in example 9.8, we know that
\[ MRS^A_{1,2} = \frac{x^A_2}{x^A_1} = \frac{p_1}{p_2} \implies p_1 x^A_1 = p_2 x^A_2 \tag{9.2} \]
from consumer A, and
\[ MRS^B_{1,2} = \frac{x^B_2}{x^B_1} = \frac{p_1}{p_2} \implies p_1 x^B_1 = p_2 x^B_2 \tag{9.3} \]
from consumer B. We next show that this efficient allocation can be implemented if
the social planner (e.g., government) sets a transfer \(t_i\) for each individual that is added
to the value of her endowment: a subsidy when \(t_i > 0\) and a tax when \(t_i < 0\). Consumer A,
whose endowment is worth more than the bundle she receives in this allocation, pays a tax
(\(t_A < 0\)), with the amount collected going to consumer B as a subsidy (\(t_B > 0\)).9
Consumer A. We can express consumer A’s budget constraint in terms of the transfer
\(t_A\) she receives (which is negative if she pays a tax), as follows:10
\[ p_1 x^A_1 + p_2 x^A_2 = \underbrace{p_1 e^A_1 + p_2 e^A_2}_{\text{Value of initial endowment}} + \underbrace{t_A}_{\text{Tax/Subsidy}}. \]
After substituting equation (9.2) on the left side and her initial endowment \((e^A_1, e^A_2) =
(100, 350)\) on the right side, we find
\[ 2p_1 x^A_1 = 100p_1 + 350p_2 + t_A. \]
Solving for \(x^A_1\), we obtain
\[ x^A_1 = 50 + 175 \frac{p_2}{p_1} + \frac{t_A}{2p_1}. \]
We now take \(x^A_1\) for consumer A and do the following: (1) insert the specific efficient
allocation that we seek to implement, \((x^A_1, x^A_2, x^B_1, x^B_2) = (100, 200, 100, 200)\); (2) insert
the equilibrium price ratio \(\frac{p_2}{p_1} = \frac{1}{2}\) (which is not affected relative to example 9.8)11; and
(3) normalize the price of good 2, so that p2 = $1 and p1 = $2 in equilibrium. Doing
these three steps, we obtain
\[ 100 = 50 + 175 \cdot \frac{1}{2} + \frac{t_A}{2 \times 2}, \tag{9.4} \]
which, solving for \(t_A\), yields tA = −$150. Because \(t_A\) is added to the value of her
endowment, this negative value means that consumer A pays a tax of $150: at these prices her
endowment is worth 2 × 100 + 1 × 350 = $550, while the bundle (100, 200) costs only
2 × 100 + 1 × 200 = $400. We would obtain the same result if, in this analysis, we start by
solving for \(x^A_2\), yielding \(x^A_2 = 50 \frac{p_1}{p_2} + 175 + \frac{t_A}{2p_2}\). After applying steps (1) to (3),
this expression collapses to \(200 = (50 \times 2) + 175 + \frac{t_A}{2 \times 1}\), which also yields
tA = −$150 for consumer A.
9. This means that tax collection is subject to tA + tB = 0, which is often referred to as being “revenue neutral.”
Intuitively, tax collected cannot be used in a part of the economy not explicitly considered in our model.
10. The expression below does not assume that consumer A is subject to a tax or subsidy. We just write \(t_A\), and
then we will find whether our result yields a subsidy (if \(t_A > 0\), which raises her income) or a tax (if \(t_A < 0\),
which lowers it).
11. Intuitively, the utility functions of each individual did not change, nor did the total endowment of each good.
Relative to example 9.8, we are only taxing one individual, and then giving the tax collected to the other individual.
Consumer B. We can apply a similar argument to consumer B, so we can express
her budget constraint as a function of the transfer \(t_B\) she receives, as follows:12
\[ p_1 x^B_1 + p_2 x^B_2 = \underbrace{p_1 e^B_1 + p_2 e^B_2}_{\text{Value of endowment}} + \underbrace{t_B}_{\text{Tax/Subsidy}}. \]
After substituting equation (9.3) on the left side and her initial endowment \((e^B_1, e^B_2) =
(100, 50)\) on the right side, we find
\[ 2p_1 x^B_1 = 100p_1 + 50p_2 + t_B. \]
Solving for \(x^B_1\), we obtain
\[ x^B_1 = 50 + 25 \frac{p_2}{p_1} + \frac{t_B}{2p_1}. \]
After applying steps (1) to (3), \(x^B_1\) becomes
\[ 100 = 50 + 25 \cdot \frac{1}{2} + \frac{t_B}{2 \times 2}, \]
which, after solving for \(t_B\), yields tB = $150. Because \(t_B\) is added to the value of her
endowment, this positive value means that consumer B receives a subsidy of $150.13 Finally,
we confirm that the subsidy provided to consumer B, tB = $150, coincides with the tax
collected from consumer A, tA = −$150, so the redistribution scheme is revenue
neutral.
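The transfers in example 9.11 can be cross-checked directly from the budget constraints. The sketch below is illustrative only: the function name `transfer` is ours, and it assumes the same sign convention as the budget constraints above, where t is added to the value of the endowment (so a negative t lowers the consumer's income and a positive t raises it).

```python
def transfer(target, endowment, p1, p2):
    """Transfer t that makes `target` exactly affordable: p.x = p.e + t."""
    value_target = p1 * target[0] + p2 * target[1]
    value_endowment = p1 * endowment[0] + p2 * endowment[1]
    return value_target - value_endowment

p1, p2 = 2, 1                      # normalized prices from step (3)
target = (100, 200)                # bundle each consumer receives in the target allocation

t_a = transfer(target, (100, 350), p1, p2)
t_b = transfer(target, (100, 50), p1, p2)

# With u = x1 * x2, each consumer spends half of her post-transfer income on each good,
# so the target bundle is what she demands at these prices.
income_a = p1 * 100 + p2 * 350 + t_a
demand_a = (income_a / (2 * p1), income_a / (2 * p2))

print(t_a, t_b, t_a + t_b, demand_a)
```

The two transfers sum to zero, which is the revenue-neutrality requirement mentioned in the example, and consumer A's demand at her post-transfer income coincides with the target bundle.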
9.8.4 Adding Production to the Economy

Previous sections of this chapter considered exchange economies, in the sense that supply
of goods comes from the initial endowments, but no production occurs (as in a barter eco-
nomy). When we allow for firms, this analysis still holds, but a few new elements emerge,
as we discuss next.
12. A similar argument as for consumer A applies here. The expression here does not assume that consumer B
is subject to a tax or subsidy. We will find out shortly whether our result yields a subsidy (if \(t_B > 0\), which raises
her income) or a tax (if \(t_B < 0\), which lowers it).
13. A similar result arises if, in this analysis, we start by solving for good 2, \(x^B_2\), yielding
\(x^B_2 = 50 \frac{p_1}{p_2} + 25 + \frac{t_B}{2p_2}\). After applying steps (1) to (3), this expression simplifies to
\(200 = (50 \times 2) + 25 + \frac{t_B}{2}\), which also yields tB = $150 for consumer B.
Equilibrium allocations.
In equilibrium allocations, we still need every consumer i to solve
her UMP (where we found the tangency condition \(MRS^i_{1,2} = \frac{p_1}{p_2}\)), but we also require every
firm to solve its PMP (which yields an additional tangency condition \(MRT^j_{1,2} = \frac{p_1}{p_2}\)). \(MRT^j_{1,2}\)
denotes the marginal rate of transformation of input j (such as labor), which is defined as
\[ MRT^j_{1,2} = \frac{MP^j_1}{MP^j_2}, \]
where \(MP^j_1\) is the marginal product of input j in the production of good 1 (i.e., how much the output of good 1
increases as the firm uses 1 more unit of input j) and, similarly, \(MP^j_2\) represents the marginal
product of input j in the production of good 2. Intuitively, the tangency condition \(MRT^j_{1,2} = \frac{p_1}{p_2}\) says that the
firm rearranges the use of every input j between the production of goods 1 and 2 until their
relative productivity, \(\frac{MP^j_1}{MP^j_2}\), coincides with these goods’ price ratio, \(\frac{p_1}{p_2}\).
In summary, an equilibrium allocation with production requires \(MRS^i_{1,2} = \frac{p_1}{p_2}\) and
\(MRT^j_{1,2} = \frac{p_1}{p_2}\), which we can compactly express as
\[ MRS^i_{1,2} = MRT^j_{1,2} = \frac{p_1}{p_2}, \]
holding for every individual i and every input j.
Efficient allocations. Regarding efficiency, the previous definition still applies; that is, an
allocation is efficient if we cannot find another feasible allocation that makes at least one
consumer strictly better off and no consumers worse off. Mathematically, efficiency with
production requires \(MRS^i_{1,2} = MRT^j_{1,2}\), thus entailing that the rate at which consumers are
willing to trade goods coincides with the rate at which firms are capable of transforming one
good into another in their production process. Furthermore, the First and Second Welfare
Theorems also hold in economies with production under relatively general conditions.
9.9 A Look at Behavioral Economics—Market Experiments
Several controlled experiments have tested the sharp results discussed in this chapter—
namely, that an equilibrium price ratio helps clear markets. Experimenters often construct
a “double auction,” in which every buyer is informed of her value for the object, while
every seller is informed that her reservation price reflects the cost of producing the good.
Every seller is then asked to announce a price for the good, and simultaneously, every buyer
announces the price that she is willing to pay for the good.
The experimenter then aggregates the willingness-to-pay from all buyers to depict an
approximated demand curve and, similarly, aggregates the prices from all sellers to construct
an approximated supply curve. Finding the point where demand and supply curves
cross each other, the experimenter determines the market price and quantity, and then compares
the observed results against the theoretical prediction. Surprisingly, experimental results
converge relatively fast to the theoretical prediction. As Smith (1991) put it, “I am still recovering
from the shock of the experimental results. The outcome was unbelievably consistent
with competitive price theory.” For a more detailed introduction to market experiments, see
Just (2013) and Angner (2016) and references therein.
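The aggregation step that the experimenter performs can be sketched in code. This is a simplified illustration, not the protocol of any particular experiment: the function name `clearing`, the single-unit assumption (each buyer wants one unit, each seller offers one unit), and the bid and ask values are all made up for the example.

```python
def clearing(bids, asks):
    """Return (quantity, (low, high)): units traded and the range of clearing prices."""
    demand = sorted(bids, reverse=True)   # approximated demand: highest bid first
    supply = sorted(asks)                 # approximated supply: lowest ask first

    # Count units for which the buyer's bid still covers the seller's ask.
    quantity = 0
    for bid, ask in zip(demand, supply):
        if bid < ask:
            break
        quantity += 1
    if quantity == 0:
        return 0, None

    # A clearing price keeps the last trading pair in and the next pair out.
    low, high = supply[quantity - 1], demand[quantity - 1]
    if quantity < len(demand):
        low = max(low, demand[quantity])
    if quantity < len(supply):
        high = min(high, supply[quantity])
    return quantity, (low, high)

bids = [10, 9, 7, 5, 3]    # hypothetical willingness-to-pay announced by buyers
asks = [2, 4, 6, 8, 11]    # hypothetical prices announced by sellers
print(clearing(bids, asks))
```

With these made-up numbers, three units trade and any price between 6 and 7 clears the market; the experimenter would compare observed prices and quantities against such a prediction.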
Appendix. Efficient Allocations and Marginal Rate of Substitution
In this appendix, we show that an efficient allocation in an economy with two consumers
and two goods must satisfy \(MRS^A_{1,2} = MRS^B_{1,2}\).
We start by recalling that, for an allocation x to be efficient, it must solve
\[ \max_{x} \; u^A(x) \]
subject to
\[ u^B(x) \geq \bar{u}^B \]
and
\[ x^A_1 + x^B_1 \leq e^A_1 + e^B_1 \quad \text{(Feasibility of good 1)} \]
\[ x^A_2 + x^B_2 \leq e^A_2 + e^B_2 \quad \text{(Feasibility of good 2)} \]
Essentially, an allocation is efficient if it maximizes consumer A’s utility without reducing
the utility of consumer B below a certain cutoff level \(\bar{u}^B\), while satisfying the feasibility
condition for each good (which says that, in aggregate, individuals do not consume more
units of each good than those in the initial endowments). The Lagrangian function associated
with this maximization problem is
\[ L = u^A(x) + \lambda \left[ u^B(x) - \bar{u}^B \right] + \mu_1 \left[ e^A_1 + e^B_1 - x^A_1 - x^B_1 \right] + \mu_2 \left[ e^A_2 + e^B_2 - x^A_2 - x^B_2 \right], \]
where λ denotes the Lagrange multiplier associated with the first constraint (the utility of
consumer B cannot decrease below \(\bar{u}^B\)); μ1 represents the Lagrange multiplier associated
with the second constraint (i.e., the feasibility constraint for good 1); and μ2 is the Lagrange
multiplier associated with the third constraint (i.e., the feasibility constraint for good 2).
Differentiating with respect to the units of goods 1 and 2 for consumer A, \(x^A_1\) and \(x^A_2\), we
obtain
\[ \frac{\partial u^A(x)}{\partial x^A_1} - \mu_1 = 0 \quad \text{and} \quad \frac{\partial u^A(x)}{\partial x^A_2} - \mu_2 = 0 \tag{9.5} \]
in the case of interior solutions. Similarly, differentiating with respect to the units of goods
1 and 2 for consumer B, \(x^B_1\) and \(x^B_2\), we find
\[ \lambda \frac{\partial u^B(x)}{\partial x^B_1} - \mu_1 = 0 \quad \text{and} \quad \lambda \frac{\partial u^B(x)}{\partial x^B_2} - \mu_2 = 0. \tag{9.6} \]
Dividing the two equations that make up (9.5) yields
\[ \frac{\partial u^A(x) / \partial x^A_1}{\partial u^A(x) / \partial x^A_2} = \frac{\mu_1}{\mu_2} \tag{9.7} \]
and dividing the two equations in (9.6), we obtain
\[ \frac{\partial u^B(x) / \partial x^B_1}{\partial u^B(x) / \partial x^B_2} = \frac{\mu_1}{\mu_2}, \tag{9.8} \]
because λ cancels out. As equations (9.7) and (9.8) are both equal to \(\frac{\mu_1}{\mu_2}\), we can set them
equal to each other, yielding
\[ \frac{\partial u^A(x) / \partial x^A_1}{\partial u^A(x) / \partial x^A_2} = \frac{\partial u^B(x) / \partial x^B_1}{\partial u^B(x) / \partial x^B_2}, \quad \text{or} \quad MRS^A_{1,2} = MRS^B_{1,2}. \]
Therefore, in efficient allocations, the MRS of consumers A and B must coincide.
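The result of this appendix can be illustrated numerically with the economy of example 9.8. The sketch below is only a rough check under stated assumptions: both consumers have u = x1 · x2, total endowments are 200 and 400 units, the cutoff utility for consumer B (20,000) is a value we picked, and, because utility is strictly increasing, all three constraints are taken to hold with equality. The function name `best_for_a` and the integer grid are ours.

```python
TOTAL_1, TOTAL_2 = 200, 400     # total endowments of goods 1 and 2 (example 9.8)
U_B_BAR = 20_000                # assumed cutoff utility level for consumer B

def best_for_a(u_b_bar):
    """Grid search over x1A for the bundle that maximizes uA given uB = u_b_bar."""
    best = None
    for x1a in range(1, TOTAL_1):
        x1b = TOTAL_1 - x1a                 # feasibility of good 1, binding
        x2b = u_b_bar / x1b                 # from uB = x1B * x2B = u_b_bar
        x2a = TOTAL_2 - x2b                 # feasibility of good 2, binding
        if x2a <= 0:
            continue
        u_a = x1a * x2a
        if best is None or u_a > best[0]:
            best = (u_a, x1a, x2a)
    return best

u_a, x1a, x2a = best_for_a(U_B_BAR)
mrs_a = x2a / x1a                                   # MRS of consumer A
mrs_b = (TOTAL_2 - x2a) / (TOTAL_1 - x1a)           # MRS of consumer B
print(x1a, x2a, mrs_a, mrs_b)
```

At the maximizer found by the search, the two marginal rates of substitution coincide, as the first-order conditions (9.5) to (9.8) require.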
Exercises
1. Identifying perfectly competitive markets.A Are the following markets an example of perfect
competition? If not, explain.
(a) The soybean market
(b) The market for a cable television provider
(c) The market for a popular item on the Internet
(d) The market for professional basketball players
(e) The new car market
2. Short-run equilibrium.B Consider a perfectly competitive market with aggregate demand given
by \(q^D(p) = 10 - p\). Assume that only two firms operate in this industry. The cost function of firm
1 is \(C_1(q_1) = 3q_1^2 - 7q_1\), whereas that of firm 2 is \(C_2(q_2) = 4q_2^2\).
(a) Find the supply function of each firm.

(b) If no more firms can enter the industry, find the aggregate supply. Then, identify the equili-
brium price and output.
3. Long-run equilibrium and subsidies.B Consider a perfectly competitive market with aggregate
demand given by \(Q(p) = 330 - p\). All firms face the same cost function \(C(q_i) = 2q_i^2 - 4q_i + 20\),
where \(q_i\) denotes the output of firm i.
(a) Assuming that there is free entry and exit in the industry, find the long-run equilibrium: number
of firms operating, equilibrium price, and output.
(b) The government considers two policies to induce the entry of more firms: (1) a subsidy of
s > 0 per unit of output sold; or (2) a subsidy of c > 0 per unit of output consumed. Compare
both policy tools. Which one is more effective at increasing the number of firms in the market
(assume s = c)?
4. Shutdown price.A Consider a firm that washes cars and has a cost function
\[ C(q) = \frac{1}{2} q^2 + 7q + 10, \]
where q denotes the number of cars washed. Intuitively, the first two terms capture the firm’s vari-
able cost, because they depend on the output the firm produces, whereas the last term represents
its fixed cost, because it is not a function of output q.
(a) Find the firm’s marginal cost curve, its average cost curve, its average variable cost curve, and
its average fixed cost curve.
(b) Assume that the firm operates in a perfectly competitive industry, taking a price of p = $18 as
given. Which output level does the firm choose in this scenario?
(c) What if the price of output decreases to p = $11?
(d) For which price will the firm choose to shut down its operations?
5. Perfectly competitive equilibrium–II.A Consider the perfectly competitive wheat market with
aggregate demand given by Q(p) = 25 − 2p, where Q is in thousands of pounds of wheat and p is
in thousands of dollars.
(a) If the marginal cost of wheat is $2,000, what is the market equilibrium price and quantity sold?
(b) One of the more certain things in policy is the so-called farm bill, which subsidizes many agri-
cultural goods. How would a subsidy of $1 per pound of wheat affect the market equilibrium?
6. Supply functions and equilibrium.B Sarah and Linda are owners of competing bakeries that operate
under perfect competition with aggregate demand \(Q(p) = 500 - 10p\). Sarah’s bakery produces
cakes with costs \(C_s(q_s) = 5 - 10q_s + 3q_s^2\) and Linda’s cost is \(C_l(q_l) = 10 + 2q_l^2\).
(a) Find each firm’s supply function, and, given that no other firms are entering the industry, find
the aggregate supply.
(b) Identify the equilibrium price and quantity.
(c) The local chamber of commerce held a bake-off between the two that gave the winner (Sarah)
free rent for the next year (assume that her entire fixed cost is her rent). Does this prize affect
the equilibrium price and quantity of Sarah’s cakes?
(d) Suppose that the contest also gave a bump to the overall demand for cakes, changing aggregate
demand to Q(p) = 600 − 10p. What is the new equilibrium price and quantity?
7. Shutdown price and short-run supply.A Ben is the owner of an on-demand, pre-made food service.
His total cost function is \(TC(q) = 50 - 6q + 2q^2\), where his entire fixed cost is sunk (his fixed
costs are the kitchen space he leases yearly). Find Ben’s shutdown price and short-run supply curve.
8. U-shaped MC.B Consider a firm with MC curve of \(MC(q) = (q - 3)^2 + 1\) that faces a price of $6.
This MC curve crosses the price at two points. Which is the profit-maximizing quantity? Can you
generalize this result to any U-shaped MC curve that crosses the demand curve in two spots?
9. Finding aggregate supply and equilibrium.A The market for gasoline currently has N firms,
each of which faces the cost function \(C(q) = 5 + 0.75q^2\), and current market demand is \(Q(p) =
500 - 0.1p\).
(a) Find the aggregate supply curve.
(b) What is the equilibrium price, quantity, and number of firms that will operate in the gasoline
market in the long run?
10. Crop choice and long-run equilibrium.B Farmers make a choice each growing season about
what crop to plant in each field. Explain the reasons why a farmer might choose to plant one crop
over another (such as soybeans rather than cotton). How does this choice affect the equilibrium
price and quantity for each crop (the crop they plant versus the one they don’t)? Are these markets
ever in a long-run equilibrium?
11. Calculating producer surplus.A Consider the market for dog treats, which has the aggregate
supply of
\[ q^S(p) = \begin{cases} \frac{1}{2} p + 5 & \text{if } p > \$5.50 \\ 0 & \text{otherwise.} \end{cases} \]
Aggregate demand in this market is \(q^D(p) = 120 - 6p\).

(a) Find the market equilibrium price and quantity.
(b) Identify the producer surplus, PS.
(c) A recent report found that dog treats were making dogs overweight, and regulators propose a
tax of $2 per unit to decrease the purchase of dog treats. Will the tax have the intended con-
sequences? Find the new producer surplus, PS .
12. Impacts of a tax on PS and CS.A Many cities have banned the use of plastic grocery bags, while
some have implemented a tax on their use. Discuss the implications of these bans on producer
and consumer surpluses in the grocery market in the case of a ban, and then in the case of a tax.
13. Profit versus producer surplus.B Is the following statement true or false? Profit is the same as
producer surplus. Explain your answer.
14. General linear producer surplus and consumer surplus.A Consider a market with aggregate
supply QS (P) = d + cP, and aggregate demand QD (P) = a − bP where a > d.
(a) Find the equilibrium price and quantity.
(b) Find the producer and consumer surpluses.
(c) Why must it be the case that a > d?
15. General equilibrium–I.B Consider two consumers with utility functions \(u^i(x^i_1, x^i_2) = \ln x^i_1 + x^i_2\),
and endowments \((e^A_1, e^A_2) = (50, 100)\) for consumer A and \((e^B_1, e^B_2) = (25, 125)\) for consumer B.
What is the equilibrium allocation and price?
16. General equilibrium–II.B Two roommates, Eric and Chris, have been buying their own gro-
ceries; and currently Eric has 25 slices of bread (bE ) and 8 eggs (eE ), while Chris has 12 slices
of bread (bC ) and 16 eggs (eC ). Eric’s utility function is uE (bE , eE ) = ln eE + bE , while Chris’s
utility is uC (bC , eC ) = eC bC . What is the equilibrium allocation and price?
17. Efficiency in general equilibrium.B Consider two neighbors that trade food from their gardens
(f ) and groceries (g). Neighbor A has utility uA (f A , gA ) = ln f A + 2 ln gA and neighbor B has
utility uB (f B , gB ) = 2 ln f B + ln gB . What is the equilibrium allocation and price if each neighbor
has 40 units of food and 25 units of groceries? Is this equilibrium efficient?
18. General equilibrium–III.A Consider two consumers with utility functions \(u^i(x^i_1, x^i_2) = x^i_1 x^i_2\), and
endowments \((e^A_1, e^A_2) = (500, 100)\) for consumer A and \((e^B_1, e^B_2) = (100, 350)\) for consumer B.
What is the equilibrium allocation and price?
19. Second Welfare Theorem.C Consider the two consumers in exercise 18. Propose a second allo-
cation (i.e., not the equilibrium that you found), that satisfies the Second Welfare Theorem. How
could this allocation be implemented by a social planner?
20. Gains from trade.A If we have two identical consumers (with the same utility function), can they
gain from trade?
21. First Welfare Theorem with external effects.C Consider two individuals with utility functions
\(u^A = x^A y^A\) and \(u^B = x^B y^B - 0.5x^A\), and endowments \(e^A = (e^A_x, e^A_y) = (15, 5)\) and
\(e^B = (e^B_x, e^B_y) = (10, 15)\).
(a) Is individual A’s utility affected by individual B’s consumption? Is individual B’s utility
affected by A’s consumption? Interpret.
(b) Find the equilibrium allocation.
(c) Show that the equilibrium allocation is not socially efficient (Hint: Refer to the appendix in
this chapter.)
(d) Does the First Welfare Theorem hold? Interpret your results.
22. First Welfare Theorem with restricted trade.C Consider two individuals with utility functions
\(u^A = x^A y^A\) and \(u^B = x^B y^B\), and endowments \(e^A = (e^A_x, e^A_y) = (20, 5)\) and
\(e^B = (e^B_x, e^B_y) = (5, 15)\).
(a) Find the equilibrium allocation when only good x can be traded.
(b) Find the efficient allocation.
(c) Show that the equilibrium and socially efficient allocations do not coincide. Does the First
Welfare Theorem hold? Interpret your results.
10 Monopoly
10.1 Introduction
Chapter 9 analyzed equilibrium output and price in perfectly competitive markets where a
large number of firms (each with a small market share) compete selling the same product.
As we discussed, firms’ intense competition leads them to undercut each other’s price until
it coincides with their common marginal cost of production. As a consequence, firms earn
no profits in equilibrium.
In this chapter, we examine the opposite type of market structure, where only one firm
operates in an industry, thus having the ability to set output and price without competing with
other firms. We start by discussing the barriers that prevent firms from entering a monopo-
lized market, allowing the monopolist to charge high prices without the threat of entry. We
then examine the monopolist’s profit-maximization problem (PMP), and how it differs from
a competitive market. We also discuss the Lerner index of market power, which essentially
measures a firm’s ability to set a price markup over marginal cost.
In the last few sections of the chapter, we apply our analysis of monopoly to study three
extensions. First, we consider multiplant monopolies, where a single firm produces in differ-
ent plants, each with potentially distinct costs. Second, we analyze the welfare that results
from a monopoly, and how it is less than that in a perfectly competitive industry. Finally, we
examine markets with a single buyer and several sellers (monopsonies), showing that our
mathematical approach to monopoly extends to this type of market as well.
10.2 Why Do Monopolies Exist?
In this chapter, we show that monopolies set prices above those in a perfectly competitive
industry, thus reducing consumer welfare. Before starting our analysis, a natural question
is, “Why do monopolies exist in the first place if they are bad for society?” The following
discussion highlights some of the barriers to entry that protect a firm from competitors
joining its industry.
Structural barriers. Incumbent firms may have a cost advantage (e.g., superior technology)
or a demand advantage (a large group of loyal customers). These advantages cause poten-
tial entrants to find it relatively unattractive to join the industry. A common example of cost
advantage is water and natural gas distribution companies, which incur massive fixed costs
to start their operation, but then face a relatively low cost of servicing each additional cus-
tomer. In this scenario, average cost (i.e., cost per unit of output) is decreasing in output q,
implying that the average cost of a single firm producing q units is lower than the aggregate
average cost of two firms that together produce q units; that is, AC(q) < AC(q1 ) + AC(q2 ),
where q1 + q2 = q.1 This type of industry is often referred to as a “natural monopoly”
because it is natural to find only one firm in this market, given that it benefits from decreasing
average costs (economies of scale, as discussed in section 8.8 of chapter 8).
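The comparison behind a natural monopoly can be checked with a few lines of code. This sketch uses the illustrative cost function TC(q) = 100 + 2q from footnote 1; the function names and default parameters are ours, and the total-cost comparison at the end is an extra check that is not in the original text.

```python
def total_cost(q, fixed=100, marginal=2):
    """TC(q) = fixed + marginal * q (illustrative cost function from footnote 1)."""
    return fixed + marginal * q

def average_cost(q, fixed=100, marginal=2):
    """AC(q) = TC(q) / q, which falls as q grows because the fixed cost is spread out."""
    return total_cost(q, fixed, marginal) / q

single_firm_ac = average_cost(10)                    # one firm produces all 10 units
two_firms_ac = average_cost(5) + average_cost(5)     # two firms produce 5 units each

single_firm_tc = total_cost(10)
two_firms_tc = total_cost(5) + total_cost(5)         # the fixed cost is paid twice

print(single_firm_ac, two_firms_ac, single_firm_tc, two_firms_tc)
```

Splitting output across two firms duplicates the fixed cost, which is why both the summed average costs and the total cost are higher than with a single producer.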
A common example of demand advantage is that of online sellers, such as Amazon or
eBay. Many customers visit their websites as their first option when they plan to buy a new
item. Even large firms, such as WalMart, have struggled to increase their online sales due
to the tendency of most online buyers to use Amazon by default.2
Legal barriers. In some countries, both developed and underdeveloped, monopolies are
sometimes legally protected. An extreme example of this would be a country not allowing
new telephone companies to operate, as was the case in several European countries until the
1980s. A less extreme case, still observed in most countries nowadays, is protection from
patents. A firm can, after years of research and development, file for the patent of a product
or process that it discovered. Such a patent prohibits other firms or individuals from using
the patented product (or process). A common example is that of a pharmaceutical firm,
patenting a new drug it discovered, which allows the firm to sell the drug as a monopolist
(i.e., no other firm can sell the same drug) for twenty years. You might have heard of some
prescription drugs with patents sold at astronomical prices, such as Glybera, with an annual
cost of $1.2 million (yes, million!); Soliris, with an annual cost of $440,000; and Elaprase,
$375,000, to name but a few. Even a 5 percent copay looks scary!3
Strategic barriers. Even in the absence of legal or structural barriers, an incumbent firm can
take actions to deter entry, such as starting a price war against every newcomer. By doing
so, the incumbent builds a reputation of being a tough competitor, thus deterring potential
entrants from joining the industry in the future. This was the case, for instance, with the
price war between United Airlines and Frontier Airlines on the Billings, Montana to Denver,
Colorado route in 1994.4 After a year of sustaining the war, Frontier Airlines (the smaller
company of the two) withdrew from the route, leaving United as the only airline offering the
flight between these two cities. As you might suspect, United increased its prices for this
route immediately after.
1. For instance, if TC(q) = 100 + 2q, then its average cost is AC(q) = 100/q + 2, which is decreasing in q. In this
scenario, the average cost of producing q = 10 units by a single firm would be AC(10) = $12, whereas the aggregate
average cost of two firms producing 5 units each is AC(5) + AC(5) = 22 + 22 = $44. A similar argument applies
to firms with total cost (TC) function of the form TC(q) = a + bq, where a, b > 0, yielding AC(q) = a/q + b.
2. Amazon net sales in 2017 were around $178 billion, while WalMart’s online sales were only $11 billion
(a generous sum, but small in comparison).
3. Patents, and the monopoly profits that the firm can obtain, can provide incentives for firms to invest in research
and development. However, many experts point out that the length of such patents (e.g., twenty years in the case of
drugs) is probably excessive. Some economists have even proposed the complete elimination of patents, because
the firm discovering a new product would still make some monopoly profits from its discovery while other firms
spend time understanding the details of the product before they can copy it to produce it in large volumes.
10.3 The Monopolist’s Profit Maximization Problem
Some preliminaries. To better understand a monopoly, let us recall the polar opposite:
perfectly competitive markets, which we examined in chapter 9. In perfectly competitive
industries, the market share of every firm is so small that each individual production deci-
sion has no effect on market prices. For instance, a firm’s decision to produce 100 more
computers does not significantly affect aggregate supply (which can be hundreds of mil-
lions of units every year). Because market price is a function of aggregate supply, market
price is, therefore, unaffected by a negligible change in aggregate supply.5 Informally, the
increased production of firm i is like adding a drop of water to the sea.
In contrast, in a monopolized industry, a single firm decides the output level, which
implies that individual and aggregate outputs coincide in this scenario (i.e., q = Q), as there
are no other firms in the market. As a result, a change in output q does affect market prices,
as measured by the inverse demand function p(q), which decreases in output q. A common
example is the linear inverse demand p(q) = a − bq, where a, b > 0. Graphically, this inverse
demand function originates at a, and decreases at a rate of b, reaching the horizontal axis
at a/b. Intuitively, when the monopolist sells few units (i.e., low values of q), consumers are
willing to pay a relatively high price for the scarce good, but as the firm offers more units
(larger values of q), consumers are willing to pay less for the relatively abundant good.
Writing the monopolist’s problem. We can express the monopolist’s PMP as follows:
\[ \max_{q}\ \pi = TR(q) - TC(q) = p(q)q - TC(q). \tag{10.1} \]
Essentially, the problem asks the monopolist to choose its output q to maximize its profits
π, as measured by the difference between total revenue and total cost. This problem is
analogous to that presented in chapter 9 for firms operating in a perfectly competitive industry,
with only one difference: the price was assumed to be a constant p in that scenario (unaffected
by every firm’s output decision, given its negligible market share), whereas now it is a function
of the monopolist’s output q, so we write its price as p(q). This will lead to different results,
as we examine next.
4. This price war led both firms to reduce their prices by half! What a bargain!
5. Formally, we say that the individual production of firm i is represented by \(q_i\), which implies that, in an industry
with N firms, aggregate supply is the sum of individual supplies across all N firms (i.e., \(Q = \sum_{i=1}^{N} q_i\)). In a market
where the number of firms, N, is sufficiently large, an increase in the supply of one firm i, \(q_i\), does not significantly
affect the aggregate supply Q, as the latter includes output from many other firms.
Solving the monopolist’s problem. Differentiating the profit in equation (10.1) with respect
to the monopolist’s output q, we obtain
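\[ p(q) + \frac{\partial p(q)}{\partial q} q - \frac{\partial TC(q)}{\partial q} = 0, \]
or, rearranging,
\[ \underbrace{p(q) + \frac{\partial p(q)}{\partial q} q}_{\text{Marginal revenue, } MR(q)} = \underbrace{\frac{\partial TC(q)}{\partial q}}_{\text{Marginal cost, } MC(q)}. \]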
Therefore, to maximize its profits, the monopolist increases its output q until the marginal
revenue obtained from selling an additional unit coincides with the marginal cost (i.e., extra
cost) from producing such unit. If, instead, MR(q) > MC(q), the monopolist would still
have incentives to increase output q because its revenues increase more than its cost. The
opposite argument applies if MR(q) < MC(q), where the monopolist would have incentives
to decrease its output q.
10.3.1 A Closer Look at Marginal Revenue

From these results, marginal revenue is given by
\[ MR(q) = \underbrace{p(q)}_{\text{Positive effect}} + \underbrace{\frac{\partial p(q)}{\partial q} q}_{\text{Negative effect}}. \]
To understand this expression, consider that the monopolist increases its output by 1 unit.
This additional unit produces two effects on the firm’s revenue (one positive and one
negative), as indicated in the expression for MR(q). We next discuss each of these effects:
• Positive effect on MR(q). If the firm sells 1 more unit, it would earn a price p(q) from that
unit, as captured by the first (positive) term in MR(q), reflecting that the firm’s revenue
increases.
• Negative effect on MR(q). When offering 1 more unit, however, the firm needs to decrease
the price of all units sold, as captured by the second term in MR(q), which is negative
because \(\frac{\partial p(q)}{\partial q} < 0\).6 Intuitively, the second effect emerges because the market becomes
flooded, thus forcing the monopolist to sell the new unit at a lower price. Because the
firm charges the same price for all its units (i.e., uniform price), the price reduction on
the new unit, as measured by \(\frac{\partial p(q)}{\partial q}\), must be applied to all units, q, ultimately changing the
monopolist’s revenue by the negative amount \(\frac{\partial p(q)}{\partial q} q\).
6. That is, the inverse demand decreases in q. For example, consider the inverse demand p(q) = a − bq, where
\(\frac{\partial p(q)}{\partial q} = -b < 0\).
In summary, increasing output entails a positive and a negative effect on the firm’s additional
revenue, whose total effect must exactly offset the additional costs that producing 1
more unit generates for the monopolist; that is,
MR(q) = MC(q).
Example 10.1: Positive and negative effects of selling more units
Consider a monopoly facing an inverse demand function p(q) = 10 − 3q. If the firm were to
marginally increase its output, its marginal revenue becomes
\[ MR(q) = (10 - 3q) + (-3)q = 10 - 6q, \]
where p(q) = 10 − 3q and \(\frac{\partial p(q)}{\partial q} = -3\). If the firm sells q = 2 units, its total revenue is
\[ TR(2) = p(2) \times 2 = (10 - 3 \times 2) \times 2 = \$8. \]
Evaluating this marginal revenue at q = 2 units yields
\[ MR(2) = p(2) + (-3) \times 2 = 4 - 6 = -\$2, \]
because, given the inverse demand function p(q) = 10 − 3q, the price when the monopolist
sells q = 2 units is p(2) = 10 − (3 × 2) = $4. Intuitively, MR(2) = 4 − 6 means
that the monopolist’s total revenue experiences a positive effect of $4, because the
firm now sells 1 more unit at a price of $4; but it also experiences a negative effect of $6,
because selling 1 more unit entails applying a price discount of $3 on each of the 2 previous
units. Overall, these two effects generate a total (net) decrease in its revenue of $2.
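The arithmetic of example 10.1 can be reproduced in a few lines. This is only an illustrative sketch for the inverse demand p(q) = 10 − 3q of the example; the function and variable names are hypothetical labels chosen here.

```python
def price(q):
    """Inverse demand from example 10.1: p(q) = 10 - 3q."""
    return 10 - 3 * q


def total_revenue(q):
    """Total revenue TR(q) = p(q) * q."""
    return price(q) * q


def marginal_revenue(q):
    """MR(q) = p(q) + p'(q) * q, where the slope of inverse demand is p'(q) = -3."""
    return price(q) + (-3) * q


q = 2
positive_effect = price(q)   # price earned on the additional unit
negative_effect = -3 * q     # price cut of $3 applied to each of the q units

print("TR(2) =", total_revenue(q))
print("MR(2) =", marginal_revenue(q))
print("positive effect =", positive_effect, "negative effect =", negative_effect)
```

The two effects add up to the marginal revenue, matching the decomposition discussed in the example.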
Example 10.2: Finding marginal revenue with linear demand
Consider a monopoly facing an inverse demand p(q) = a − bq. In this scenario, marginal revenue is
\[ MR(q) = p(q) + \frac{\partial p(q)}{\partial q} q = (a - bq) + (-b)q = a - 2bq. \]
Figure 10.1 depicts marginal revenue, MR(q) = a − 2bq, which originates at a price
of a, and decreases in output q at a rate of 2b.7 As suggested previously, when the
monopolist sells few units, its decision to sell 1 more unit brings a large additional
revenue (i.e., high MR(q) on the left side of figure 10.1), because the monopolist applies
a price discount to only a few units. However, when it sells a large volume, selling
1 more unit brings a small increase in its revenue (i.e., low MR(q) on the right side of
figure 10.1), because the monopolist is forced to apply price discounts to many units.
As a result, the marginal revenue is decreasing in sales or, in other words, the
additional revenue that the monopolist earns from selling further units is decreasing.
Figure 10.1
Marginal revenue curve with linear demand. (The figure plots the demand curve p(q) = a − bq and
MR(q) = a − 2bq against output q. Both curves start at a on the vertical axis; MR(q) has slope −2b
and reaches the horizontal axis at a/(2b), while the demand curve reaches it at a/b.)
7. Indeed, evaluating the marginal revenue MR(q) = a − 2bq at an output level of zero (q = 0) yields MR(0) = a.
In addition, the derivative of MR(q) = a − 2bq with respect to output q is −2b.
Self-assessment 10.1 Consider a monopolist facing an inverse demand p(q) = 10 − 4q.
Find the marginal revenue curve, its vertical intercept, horizontal intercept,
and slope.
Figure 10.1 illustrates two interesting features of the monopolist’s marginal revenue
curve: (1) MR(q) lies below the inverse demand curve p(q); and (2) MR(q) and p(q)
originate at the same point on the vertical axis. These are properties that not only hold
for the linear demand function of example 10.2, but also for all downward-sloping demand
curves, as we show next:
• MR(q) lies below the demand curve. For the marginal revenue curve to lie below the
inverse demand curve p(q), we need that \(MR(q) \leq p(q)\). That is, \(p(q) + \frac{\partial p(q)}{\partial q} q \leq p(q)\),
which simplifies to \(\frac{\partial p(q)}{\partial q} q \leq 0\). Because demand curve p(q) decreases in output q,
\(\frac{\partial p(q)}{\partial q} \leq 0\), condition \(\frac{\partial p(q)}{\partial q} q \leq 0\) must hold, implying that \(MR(q) \leq p(q)\), as required.
• MR(q) and the demand curve originate at the same height. The demand curve evaluated at
q = 0 is p(0), whereas the marginal revenue curve is \(p(0) + \frac{\partial p(q)}{\partial q} \times 0 = p(0)\) as well. Graphically,
both curves originate at p(0). For instance, if the monopolist’s demand curve is p(q) =
a − bq, both marginal revenue and demand originate at p(0) = a, where q = 0.
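Both properties can be checked numerically for specific demand curves. The sketch below is an illustration under stated assumptions: it uses the linear demand p(q) = 10 − 3q from example 10.1 and a hypothetical downward-sloping curve p(q) = 20/(1 + q) that does not appear in the text, and it approximates the slope of inverse demand with a central finite difference.

```python
def marginal_revenue(p, q, h=1e-6):
    """Approximate MR(q) = p(q) + p'(q) * q using a central finite difference for p'(q)."""
    slope = (p(q + h) - p(q - h)) / (2 * h)
    return p(q) + slope * q


def linear_demand(q):
    """Inverse demand from example 10.1."""
    return 10 - 3 * q


def curved_demand(q):
    """Hypothetical downward-sloping inverse demand (an assumption for illustration)."""
    return 20 / (1 + q)


grid = [0.0, 0.5, 1.0, 2.0, 3.0]
demands = (linear_demand, curved_demand)

# Property 1: MR(q) never lies above the inverse demand curve.
mr_below_demand = all(
    marginal_revenue(p, q) <= p(q) + 1e-9 for p in demands for q in grid
)

# Property 2: both curves start at the same height, p(0).
same_intercept = all(
    abs(marginal_revenue(p, 0.0) - p(0.0)) < 1e-9 for p in demands
)

print(mr_below_demand, same_intercept)
```

A grid check of this kind illustrates the result for two curves only; the general argument is the one given in the bullets above.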
10.3.2 Solving the Monopolist’s Problem

After our detour explaining the monopolist’s marginal revenue and its properties, we can
now return to this firm’s PMP in equation (10.1), which yields MR(q) = MC(q).
Example 10.3: Finding monopoly output with linear demand
Consider again the monopoly of example 10.2, and assume a total cost of TC(q) = cq, where c > 0.
The monopolist maximizes its profits by solving
\[ \max_{q}\ \pi = TR(q) - TC(q) = \underbrace{(a - bq)q}_{TR} - \underbrace{cq}_{TC}. \]
Differentiating with respect to output q yields
\[ a - 2bq - c = 0, \]
or, rearranging,
\[ \underbrace{a - 2bq}_{MR(q)} = \underbrace{c}_{MC(q)}. \]
Figure 10.2 separately depicts the marginal revenue and cost found here, as a func-
tion of q. As discussed in the previous section, MR(q) is decreasing in q, whereas
MC(q) = c is constant in this example, and thus is depicted by a horizontal line.
Rearranging this result a − 2bq = c, we find a − c = 2bq. Solving for output q, we
obtain the profit-maximizing output for the monopolist:
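\[ q^M = \frac{a-c}{2b}. \]
We can now find the monopoly price by inserting this monopoly output into the inverse
demand, as follows:
\[ p(q^M) = a - bq^M = a - b\underbrace{\frac{a-c}{2b}}_{q^M} = \frac{2ab - b(a-c)}{2b} = \frac{a+c}{2}, \]
and monopoly profits are
\[ \pi^M = p(q^M)q^M - cq^M = \frac{a+c}{2}\,\frac{a-c}{2b} - c\,\frac{a-c}{2b} = \left(\frac{a+c}{2} - c\right)\frac{a-c}{2b} = \frac{a-c}{2}\,\frac{a-c}{2b} = \frac{(a-c)^2}{4b}. \]
Figure 10.2
Output and price in a monopoly with linear demand. (The figure plots p(q) = a − bq, MR(q) = a − 2bq,
and the horizontal line MC(q) = c against output q. MR(q) crosses MC(q) at \(q^M = \frac{a-c}{2b}\), where the
demand curve gives the price \(p^M = \frac{a+c}{2}\); MR(q) and demand reach the horizontal axis at a/(2b) and
a/b, respectively.)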
Lastly, we can evaluate the consumer surplus under this monopoly, as follows:

\[ CS^M = \frac{1}{2}\underbrace{\left(a - \frac{a+c}{2}\right)}_{\text{height}}\underbrace{\frac{a-c}{2b}}_{\text{base}} = \frac{1}{2}\,\frac{2a-a-c}{2}\,\frac{a-c}{2b} = \frac{(a-c)^2}{8b}. \]
For instance, if the inverse demand function is p(q) = 10 − q (i.e., parameters a and b
take values a = 10 and b = 1); and TC(q) = 4q, entailing c = 4, output, price, profits
and consumer surplus under monopoly become: \(q^M = \frac{10-4}{2} = 3\) units, \(p^M = \frac{10+4}{2} = \$7\),
\(\pi^M = \frac{(10-4)^2}{4} = \frac{36}{4} = \$9\), and \(CS^M = \frac{(a-c)^2}{8b} = \frac{(10-4)^2}{8} = \frac{36}{8} = \$4.5\).
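The closed-form results of example 10.3 translate directly into a small function. The following is a minimal sketch, assuming the linear inverse demand p(q) = a − bq and total cost TC(q) = cq of the example; the name `monopoly_linear` is a hypothetical label.

```python
def monopoly_linear(a, b, c):
    """Monopoly outcome for p(q) = a - b*q and TC(q) = c*q (setup of example 10.3).

    Returns (output, price, profit, consumer_surplus).
    """
    if not (b > 0 and a > c >= 0):
        raise ValueError("requires b > 0 and a > c >= 0")
    q = (a - c) / (2 * b)               # from MR(q) = MC(q): a - 2bq = c
    p = a - b * q                       # price read off the inverse demand
    profit = (p - c) * q                # p*q - c*q
    consumer_surplus = 0.5 * (a - p) * q  # triangle: height (a - p), base q
    return q, p, profit, consumer_surplus


q_m, p_m, profit_m, cs_m = monopoly_linear(a=10, b=1, c=4)
print(q_m, p_m, profit_m, cs_m)
```

Computing price from the inverse demand, rather than from the formula (a + c)/2, keeps the code close to the order of steps in the example; both routes give the same value.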
Self-assessment 10.2 Repeat the analysis in example 10.3, but assume a TC
function \(TC(q) = cq + \alpha q^2\). Find the monopolist output, price, profits, and consumer
surplus. (Hint: Marginal cost is now increasing in output, rather than being flat.)
10.4 Common Misunderstandings of Monopoly Markets
In this section, we discuss three common misunderstandings related to monopoly markets:
(1) while the monopolist does not face competition, it does not have any incentive to set
infinitely high prices; (2) as opposed to firms operating in perfectly competitive industries,
the monopolist does not have a supply curve; and (3) the monopolist chooses its output level
in the elastic portion of the demand curve.
There are no infinitely high prices. While the monopolist is the only firm in its industry,
it faces a demand curve p(q), such as p(q) = a − bq in example 10.3. As a consequence,
while setting higher prices might be attractive, it surely would lead to fewer sales. Hence,
the monopolist must balance the increase in total revenue brought by a higher price per unit
against the fewer sales that such higher prices entail. As discussed in the previous section,
this trade-off implies that the monopolist does not set too high a price, and certainly not an
infinitely high price because that would imply no sales at all. In example 10.3, for instance,
any price above p = $a (e.g., $10 if a = 10) entails no sales for the monopolist.
The monopolist does not have a supply curve. A common misunderstanding is to con-
sider that the optimal output, where MR(q) = MC(q), constitutes the monopolist’s supply
curve. In perfectly competitive markets, we found that every firm observes the given market
price, and responds by offering the output that satisfies p = MC(q). As a consequence, we
obtained a supply function q(p) which, for every price p, indicated how many units the firm
supplies to maximize its profits. With a monopoly scenario, however, this does not occur
because the monopolist determines output and price simultaneously. In other words, when
the monopolist chooses to produce qM = 3 units (as in example 10.3), it simultaneously
determines the market price of pM = 10 − 3 = $7, not allowing the firm to choose different
output levels for a given market price of pM = 7. Graphically, when the monopolist chooses
to produce a specific output level qM , it extends a dotted line from the horizontal axis (repre-
senting quantities) that hits the demand curve, determining the price at which every unit will
be sold.
The monopolist produces in the elastic portion of the demand curve. In previous chapters,
we learned that goods with few (or no) close substitutes tend to have a relatively inelastic
demand curve. Monopolies often produce goods with no close substitutes; otherwise,
consumers would not need to purchase such an expensive product. Many students
reading about monopolies for the first time then conclude that the monopolist must be
producing in the inelastic portion of the demand curve. This line of reasoning is, however,
incorrect. To see this, consider again the formula of price elasticity of demand:
\[ \varepsilon_{q,p} = \frac{\%\Delta q}{\%\Delta p}. \]
If the monopolist were producing in the inelastic portion of the demand curve, \(\varepsilon_{q,p}\) would
satisfy \(|\varepsilon_{q,p}| < 1\). Essentially, an increase in price by 1 percent would entail a reduction in
sales of less than 1 percent (i.e., a less-than-proportional decrease in q). However, if that were
the case, the monopolist would have clear incentives to increase its price: sales would fall less
than proportionally, so total revenue would rise while total cost falls with the smaller output.
In other words, the monopolist does not set a price in the inelastic portion of the demand curve,
as that would not be profit maximizing. If, instead, the monopolist
produces in the elastic segment of the demand curve, \(|\varepsilon_{q,p}| > 1\), an increase in its price p by 1
percent entails a reduction in sales of more than 1 percent, thus leaving the firm with no
incentive to further adjust its price. Example 10.4 evaluates the price elasticity of demand
in the profit-maximizing output \(q^M\) found in example 10.3, confirming that \(|\varepsilon_{q,p}| > 1\).
Example 10.4: Price elasticity of output \(q^M\) under a linear demand
Consider the monopolist of example 10.3, where the inverse demand function was given by p(q) =
10 − q. We found that the profit-maximizing output was \(q^M = 3\) units, entailing an
optimal price of \(p^M = \$7\). In this scenario, we seek to find price elasticity, as follows:
\[ \varepsilon_{q,p} = \frac{\%\Delta q}{\%\Delta p} = \frac{\Delta q}{\Delta p}\,\frac{p}{q}, \]
or, if the change in price is small, \(\varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\frac{p}{q}\). Before finding the first term, \(\frac{\partial q(p)}{\partial p}\),
we need to obtain the direct demand function q(p) from inverse demand function
p(q) = 10 − q. Solving for q, we find q(p) = 10 − p. Hence, we obtain that \(\frac{\partial q(p)}{\partial p} = -1\),
ultimately yielding a price elasticity of
\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p^M}{q^M} = -1 \times \frac{7}{3} \simeq -2.33, \]
because \(q^M = 3\) units and \(p^M = \$7\). Intuitively, if the monopolist increases prices by
1 percent, its sales decrease by 2.33 percent. Therefore, \(|\varepsilon_{q,p}| = 2.33 > 1\), which
illustrates the previous discussion: the monopolist sets a price \(p^M\) that lies in the elastic
portion of its demand curve. Lastly, note that this result also applies to the more
general linear inverse demand function p(q) = a − bq considered in example 10.2.
In particular, we first solve for q in p(q) to obtain the direct demand \(q(p) = \frac{a}{b} - \frac{1}{b}p\).
Hence, \(\frac{dq(p)}{dp} = -\frac{1}{b}\), yielding a price elasticity of
\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p^M}{q^M} = -\frac{1}{b}\,\frac{\frac{a+c}{2}}{\frac{a-c}{2b}} = -\frac{1}{b}\,\frac{2b}{2}\,\frac{a+c}{a-c} = -\frac{a+c}{a-c}, \]
where \(p^M = \frac{a+c}{2}\) and \(q^M = \frac{a-c}{2b}\), as described in example 10.3. Hence, we found that
\(\varepsilon_{q,p} = -\frac{a+c}{a-c}\), where ratio \(\frac{a+c}{a-c}\) is larger than 1 given that a + c > a − c. As a
consequence, the monopolist sets its profit-maximizing price \(p^M = \frac{a+c}{2}\) in the elastic
segment of the demand curve.8
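The elasticity at the monopoly outcome can be computed exactly with rational arithmetic. This is an illustrative sketch, assuming the linear setting of example 10.3 (inverse demand p(q) = a − bq and constant marginal cost c); the function name is hypothetical.

```python
from fractions import Fraction


def elasticity_at_monopoly(a, b, c):
    """Price elasticity of demand evaluated at the monopoly output and price.

    Assumes p(q) = a - b*q, so that q(p) = a/b - p/b and dq/dp = -1/b.
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    q_m = (a - c) / (2 * b)   # monopoly output
    p_m = (a + c) / 2         # monopoly price
    dq_dp = -1 / b            # slope of the direct demand
    return dq_dp * p_m / q_m


eps = elasticity_at_monopoly(10, 1, 4)
print(eps, float(eps))                # exact fraction and its decimal value
print(abs(eps) > 1)                   # True means the elastic portion of demand
```

Using `Fraction` avoids rounding, so the result can be compared directly with the closed form −(a + c)/(a − c).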
Self-assessment 10.3 Consider again the monopolist in self-assessment 10.2, but
assuming that the monopolist faces a TC function \(TC(q) = 4q^2\). Use your findings
from self-assessment 10.2 to evaluate the monopolist price-elasticity at \(q^M\).
10.5 The Lerner Index and Inverse Elasticity Pricing Rule
While the monopolist produces in the elastic portion of the demand curve, it can charge a
larger margin when facing a relatively inelastic demand curve (e.g., when consumers have no
close substitutes of the product) than when facing a relatively elastic demand curve (when
close substitutes exist). To show a relationship between margin, measured by the differ-
ence, p − MC(q), and price elasticity, εq,p , let us start by reproducing the profit-maximizing
condition for the monopolist found previously, MR(q) = MC(q) or, alternatively,
\[ p(q) + \frac{\partial p(q)}{\partial q} q = MC(q). \]
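The marginal revenue MR(q) can be rearranged as \(MR(q) = p\left(1 + \frac{\partial p(q)}{\partial q}\frac{q}{p}\right)\), where we factor
price p out. We can further rewrite the marginal revenue as
\[ MR(q) = p\left(1 + \frac{1}{\frac{\partial q(p)}{\partial p}\frac{p}{q}}\right) = p\left(1 + \frac{1}{\varepsilon_{q,p}}\right), \]
because the price elasticity of demand is given by \(\varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\frac{p}{q}\). We can now use this
expression of MR(q) in the monopolist’s profit-maximizing condition MR(q) = MC(q),
as follows:
\[ p\left(1 + \frac{1}{\varepsilon_{q,p}}\right) = MC(q). \]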
8. For instance, if the parameters a and c take the values 10 and 3, respectively, we obtain a price elasticity of
\(\varepsilon_{q,p} = -\frac{a+c}{a-c} = -\frac{10+3}{10-3} = -\frac{13}{7} \simeq -1.86\).
Rearranging, we obtain \(p + p\frac{1}{\varepsilon_{q,p}} = MC(q)\), or \(p - MC(q) = -p\frac{1}{\varepsilon_{q,p}}\). Dividing both sides by
p yields
\[ \frac{p - MC(q)}{p} = -\frac{1}{\varepsilon_{q,p}}. \]
This is the “Lerner index,” which says that a monopolist’s ability to set a price above
marginal cost, \(\frac{p - MC(q)}{p}\), is inversely related to the price elasticity of demand. (This index
is also known as the “markup index” because it measures the price markup over marginal
cost.) Intuitively, as demand becomes relatively elastic (i.e., a more negative number
\(\varepsilon_{q,p}\), such as −4), the ratio on the right side of the Lerner index, \(-\frac{1}{\varepsilon_{q,p}}\), decreases (e.g.,
if \(\varepsilon_{q,p} = -4\), we obtain \(-\frac{1}{\varepsilon_{q,p}} = -\frac{1}{-4} = 0.25\)), which implies that the left side must also
decrease. Therefore, the price markup over marginal cost decreases, for instance, to only
25 percent when price elasticity is \(\varepsilon_{q,p} = -4\). If, in contrast, demand is relatively
inelastic, \(\varepsilon_{q,p} = -0.5\), we find that \(-\frac{1}{\varepsilon_{q,p}} = -\frac{1}{-0.5} = 2\), thus yielding a higher price markup of
200 percent.
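The two sides of the Lerner index can be written as two small functions and compared. This is a sketch for illustration only; the function names are hypothetical, and the price and marginal cost used in the comparison (p = 8, MC = 4, elasticity −2) are chosen here so that both sides coincide.

```python
def lerner_from_elasticity(elasticity):
    """Right side of the Lerner index: -1 / elasticity (elasticity must be negative)."""
    if elasticity >= 0:
        raise ValueError("price elasticity of demand must be negative")
    return -1.0 / elasticity


def lerner_from_prices(p, mc):
    """Left side of the Lerner index: (p - MC) / p."""
    if p <= 0:
        raise ValueError("price must be positive")
    return (p - mc) / p


# More elastic demand (a more negative elasticity) implies a smaller markup.
for eps in (-2.0, -4.0, -8.0):
    print(eps, lerner_from_elasticity(eps))

# At a profit-maximizing price both sides coincide.
print(lerner_from_prices(8.0, 4.0), lerner_from_elasticity(-2.0))
```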
Example 10.5: Lerner index with a linear demand
Consider again the linear demand of example 10.3, where p(q) = 10 − q. After solving for q,
we obtain the direct demand q(p) = 10 − p, which yields an elasticity of
\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p}{q} = -1 \times \frac{p}{10-p}. \]
In this scenario, marginal costs were MC(q) = 4. Hence, the Lerner index, \(\frac{p-MC(q)}{p} = -\frac{1}{\varepsilon_{q,p}}\), becomes
\[ \frac{p-4}{p} = -\frac{1}{-1 \times \frac{p}{10-p}}. \]
After rearranging, we obtain
\[ \frac{p-4}{p} = \frac{10-p}{p}, \]
which simplifies to p − 4 = 10 − p or, after solving for price p, p = $7. This result, of
course, coincides with that in example 10.3.9
9. Generally, if we do not have precise information about marginal costs, the Lerner index becomes \(\frac{p-MC(q)}{p} = \frac{10-p}{p}\),
which reduces to p − MC(q) = 10 − p or \(p = \frac{10+MC(q)}{2}\). Expressed in words, the monopolist sets a price
equal to its marginal cost plus $10, and then divides the result by two.
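When the elasticity depends on price, as with the linear demand q(p) = 10 − p, the Lerner condition can also be solved numerically. The sketch below uses bisection, which is a technique chosen here for illustration and not the method of the text; the function names and the tolerance are assumptions.

```python
def lerner_gap(p, mc, a=10.0):
    """Left side minus right side of the Lerner condition for q(p) = a - p."""
    elasticity = -1.0 * p / (a - p)          # dq/dp = -1 for q(p) = a - p
    return (p - mc) / p - (-1.0 / elasticity)


def solve_price(mc, a=10.0, tol=1e-10):
    """Find the price where the Lerner condition holds, searching between mc and a."""
    lo, hi = mc + 1e-9, a - 1e-9             # the gap is negative near mc, positive near a
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lerner_gap(mid, mc, a) > 0:
            hi = mid                          # markup too large: lower the price
        else:
            lo = mid                          # markup too small: raise the price
    return (lo + hi) / 2


p_star = solve_price(mc=4.0)
print(round(p_star, 6))
```

The search relies on the gap increasing in p over the interval, which holds for this linear demand; other demand curves may need a different bracket.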
Self-assessment 10.4 Consider a monopolist facing an inverse demand p(q) = 10 − 4q.
Following the same steps as in example 10.5, use the Lerner index to find
the monopolist’s profit-maximizing price.
Example 10.6: Lerner index with constant elasticity demand
Consider now a monopolist facing demand curve \(q(p) = 5p^{-\varepsilon}\). This demand function is referred to as
“constant elasticity” because the price elasticity is exactly equal to the exponent of the
demand curve, \(-\varepsilon\), regardless of the price and quantity at which we evaluate the price
elasticity.10 Let us now apply the Lerner index to this demand function, assuming a
marginal cost of MC(q) = $4:
\[ \frac{p-4}{p} = -\frac{1}{\varepsilon_{q,p}}. \]
For instance, if the demand curve is \(q(p) = 5p^{-2}\) (i.e., \(\varepsilon = 2\), so that the price elasticity is
\(\varepsilon_{q,p} = -2\)), this expression becomes
\[ \frac{p-4}{p} = -\frac{1}{-2}, \]
which simplifies to 2p − 8 = p, or p = $8. As an exercise, note that if the demand
function changes to \(q(p) = 5p^{-5}\), the monopolist’s price decreases to \(p = \frac{20}{4} = \$5\).
Intuitively, as demand becomes more elastic, price decreases.
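The constant-elasticity property can be illustrated numerically by evaluating the elasticity at several prices. This is a minimal sketch for the demand q(p) = 5p^(−2) of the example, using a central finite difference for the slope; the function names, the chosen prices, and the step size are assumptions.

```python
def demand(p, scale=5.0, exponent=2.0):
    """Constant-elasticity demand q(p) = scale * p**(-exponent)."""
    return scale * p ** (-exponent)


def elasticity(p, scale=5.0, exponent=2.0, rel_step=1e-6):
    """Approximate (dq/dp) * (p / q) with a central finite difference."""
    h = rel_step * p
    slope = (demand(p + h, scale, exponent) - demand(p - h, scale, exponent)) / (2 * h)
    return slope * p / demand(p, scale, exponent)


prices = (1.0, 4.0, 8.0, 20.0)
values = [elasticity(p) for p in prices]
for p, e in zip(prices, values):
    print(f"p = {p:5.1f}  elasticity = {e:.6f}")
```

Each printed value should be close to −2 regardless of the price, up to the small error of the numerical derivative.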
Self-assessment 10.5 Consider a monopolist facing the demand curve \(q(p) = 10p^{-\varepsilon}\).
Following the same steps as in example 10.6, use the Lerner index to find
the monopolist’s profit-maximizing price.
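10. Indeed, for any demand function of the form \(q(p) = Ap^{-\varepsilon}\), where A > 0 and \(\varepsilon > 0\), we obtain a price elasticity
of \(\varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\frac{p}{q} = -\varepsilon A p^{-\varepsilon-1}\frac{p}{Ap^{-\varepsilon}}\), which simplifies to \(\varepsilon_{q,p} = -\varepsilon\, Ap^{-\varepsilon}\frac{1}{p}\frac{p}{Ap^{-\varepsilon}} = -\varepsilon\). As a consequence,
price elasticity \(\varepsilon_{q,p}\) is a constant, thus being independent of q and p.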
Inverse elasticity pricing rule (IEPR). We can use the Lerner index and solve for price p in
order to find an expression of the monopolist’s profit-maximizing price as a function of its
marginal cost MC(q) and the price elasticity of demand \(\varepsilon_{q,p}\), as follows:
\[ p = \frac{MC(q)}{1 + \frac{1}{\varepsilon_{q,p}}}, \]
which is known as the “inverse elasticity pricing rule (IEPR).”11 For instance, if the
monopolist faces a marginal cost of $4 and a price elasticity of \(\varepsilon_{q,p} = -2\), the IEPR provides an
optimal price of \(p = \frac{4}{1 + \frac{1}{-2}} = \frac{4}{\frac{1}{2}} = \$8\).
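The IEPR is straightforward to apply in code. The sketch below is illustrative: the function name `iepr_price` is hypothetical, and the guard that demand be elastic (elasticity below −1) is an assumption added here so that the formula returns a positive, finite price.

```python
def iepr_price(mc, elasticity):
    """Inverse elasticity pricing rule: p = MC / (1 + 1 / elasticity)."""
    if elasticity >= -1:
        raise ValueError("the rule needs elastic demand (elasticity < -1)")
    return mc / (1 + 1 / elasticity)


# Marginal cost of $4 with two different elasticities.
price_eps_2 = iepr_price(4.0, -2.0)
price_eps_5 = iepr_price(4.0, -5.0)
print(price_eps_2, price_eps_5)
```

With the same marginal cost, the more elastic demand yields the lower price, in line with the intuition given for the Lerner index.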
Self-assessment 10.6 Consider a monopolist facing a marginal cost of $3 and
a price-elasticity of \(\varepsilon_{q,p} = -1.5\). Use the IEPR to find the monopolist’s
profit-maximizing price.
10.6 Multiplant Monopoly
Our previous analysis considered a monopoly producing in a single plant (factory); but what
if the firm has plants at different locations (e.g., countries), each with distinct costs? This
could occur if, for instance, wages differ across countries, even if the monopolist uses the
same technology and management everywhere. Would the monopolist produce all its output
in the plant with the lowest marginal cost? Not necessarily, as we illustrate next.
For simplicity, we consider only two plants, 1 and 2, where q1 denotes the output produced
in plant 1, q2 that in plant 2, and Q = q1 + q2 represents the total output across all plants.
Our analysis can be extended to monopolies with more than two plants. In this context, the
monopolist maximizes the joint profits from both plants, as follows:
\[ \max_{q_1, q_2}\ \pi = \pi_1 + \pi_2 = \underbrace{TR_1(q_1, q_2) - TC_1(q_1)}_{\pi_1} + \underbrace{TR_2(q_1, q_2) - TC_2(q_2)}_{\pi_2}, \]
where \(TR_1(q_1, q_2) = p(q_1, q_2) \times q_1\) denotes the total revenue from selling \(q_1\) units;
\(TR_2(q_1, q_2) = p(q_1, q_2) \times q_2\) represents the total revenue from selling \(q_2\) units; \(TC_1(q_1)\)
measures the total cost of producing \(q_1\) units; and \(TC_2(q_2)\) is the total cost of producing
\(q_2\) units. While price \(p(q_1, q_2)\) is affected by the units produced in each plant (e.g.,
\(p(q_1, q_2) = 300 - q_1 - q_2\)), the total cost in each plant depends only on the units produced
in that plant; that is, \(TC_1(q_1)\) is unaffected by \(q_2\). We can alternatively express this
maximization problem as
\[ \max_{q_1, q_2}\ \pi = [p(q_1, q_2) \times q_1 - TC_1(q_1)] + [p(q_1, q_2) \times q_2 - TC_2(q_2)] = p(q_1, q_2) \times (q_1 + q_2) - TC_1(q_1) - TC_2(q_2). \]
11. To see that the IEPR originates from the Lerner index, rearrange the index as follows: \(p - MC(q) = -\frac{p}{\varepsilon_{q,p}}\), or
\(p + \frac{p}{\varepsilon_{q,p}} = MC(q)\). Factoring price p out on the left side, we obtain \(p\left(1 + \frac{1}{\varepsilon_{q,p}}\right) = MC(q)\). Finally, solving for p,
we find the expression of the IEPR, \(p = \frac{MC(q)}{1 + \frac{1}{\varepsilon_{q,p}}}\).
Differentiating with respect to \(q_1\) yields
\[ \underbrace{p(q_1, q_2) + \frac{\partial p(q_1, q_2)}{\partial q_1}(q_1 + q_2)}_{MR_1} = \underbrace{\frac{\partial TC_1(q_1)}{\partial q_1}}_{MC_1}, \]
where the left side captures the marginal revenue that the multiplant monopolist earns after
increasing the production of plant 1 by 1 unit, whereas the right side indicates the marginal
cost from this additional production. Differentiating with respect to \(q_2\), we obtain a similar
expression:
\[ \underbrace{p(q_1, q_2) + \frac{\partial p(q_1, q_2)}{\partial q_2}(q_1 + q_2)}_{MR_2} = \underbrace{\frac{\partial TC_2(q_2)}{\partial q_2}}_{MC_2}. \]
In the special case \(\frac{\partial p(q_1, q_2)}{\partial q_1} = \frac{\partial p(q_1, q_2)}{\partial q_2}\), marginal revenues from each plant coincide
(\(MR_1 = MR_2 = MR\)), implying that the multiplant monopoly maximizes its joint profits at
the point where
\[ MR = MC_1 = MC_2. \]
Intuitively, this occurs when prices are affected to the same extent when either plant increases
its production, such as with inverse demand function \(p(q_1, q_2) = 300 - q_1 - q_2\). In this
scenario, the multiplant monopoly only needs to equate marginal costs across plants; otherwise,
the manager still has the incentive to shift production from the plant with the
highest marginal cost to that with the lowest marginal cost. However, when
\(\frac{\partial p(q_1, q_2)}{\partial q_1} \neq \frac{\partial p(q_1, q_2)}{\partial q_2}\), marginal revenues from each plant do not coincide, which may occur if inverse
demand function is \(p(q_1, q_2) = 300 - q_1 - 0.5q_2\). In this context, the multiplant monopoly
maximizes joint profits when these first-order conditions, \(MR_1 = MC_1\) and \(MR_2 = MC_2\), hold.
Example 10.7: Multiplant monopoly
Consider a monopolist facing inverse demand function p(Q) = 100 − Q, where Q denotes aggregate
output. In addition, assume that the monopolist operates two plants, one in the US with total cost
\(TC_1(q_1) = 5 + 12q_1 + 6(q_1)^2\), and another in Chile, with total cost \(TC_2(q_2) = 2 + 18q_2 + 3(q_2)^2\).
The monopolist maximizes the joint profits from both plants as follows:
\[ \max_{q_1 \geq 0,\ q_2 \geq 0}\ \pi = \pi_1 + \pi_2 = \underbrace{(100 - q_1 - q_2)q_1 - TC_1(q_1)}_{\pi_1} + \underbrace{(100 - q_1 - q_2)q_2 - TC_2(q_2)}_{\pi_2}. \]
Therefore, the monopolist chooses its output in the US plant, \(q_1\), and in the Chilean
plant, \(q_2\), to maximize its total profits \(\pi = \pi_1 + \pi_2\), where the latter are given by the
sum of revenues across both plants minus the total costs in each plant. Differentiating
with respect to output \(q_1\), we obtain
\[ 100 - 2q_1 - q_2 - 12 - 12q_1 - q_2 = 0, \]
which collapses to
\[ 88 - 14q_1 - 2q_2 = 0, \]
or, after solving for \(q_1\), \(q_1 = \frac{44 - q_2}{7}\). Similarly, differentiating these total profits with
respect to output \(q_2\) yields
\[ 100 - q_1 - 2q_2 - 18 - 6q_2 - q_1 = 0, \]
which collapses to
\[ 82 - 2q_1 - 8q_2 = 0, \]
and, after solving for \(q_2\), entails \(q_2 = \frac{41 - q_1}{4}\). Inserting this result into \(q_1 = \frac{44 - q_2}{7}\),
we obtain \(q_1 = \frac{44 - \frac{41 - q_1}{4}}{7}\), which simplifies to \(7q_1 = \frac{135 + q_1}{4}\), yielding an optimal
production in the US plant of \(q_1 = 5\) units. Therefore, the optimal production in
the Chilean plant is \(q_2 = \frac{41 - 5}{4} = 9\) units, entailing an aggregate output of
\(Q = q_1 + q_2 = 5 + 9 = 14\) units. In summary, the multiplant monopoly produces a share
of \(\frac{q_1}{Q} = \frac{5}{14} \cong 0.36\) in the US plant, and the remaining \(\frac{q_2}{Q} = \frac{9}{14} \cong 0.64\) in the Chilean
plant.12
12. As an exercise, note that if both plants were symmetrical in costs (e.g., both faced a total cost of \(TC(q_i) =
5 + 12q_i + 6(q_i)^2\) for every plant i), the optimal output levels would coincide across all plants. In particular, the
differentiation with respect to output \(q_i\) would yield \(88 - 14q_i - 2q_j = 0\) for every plant \(i \neq j\). Simultaneously solving
for \(q_i\) and \(q_j\) yields \(q_i = q_j = 5.5\) units.
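The two first-order conditions of example 10.7 form a linear system that can be solved exactly. The following is a minimal sketch, assuming inverse demand p(Q) = A − BQ and quadratic costs TC_i(q_i) = F_i + c_i q_i + d_i q_i^2 as in the example; the function name and parameter names are hypothetical, and the code assumes an interior solution (it does not enforce nonnegative outputs).

```python
from fractions import Fraction


def multiplant_outputs(A, B, c1, d1, c2, d2):
    """Solve the two first-order conditions by Cramer's rule.

    FOC for plant i: A - 2B(q1 + q2) - c_i - 2 d_i q_i = 0, which rearranges to
      (2B + 2d1) q1 + 2B q2        = A - c1
      2B q1        + (2B + 2d2) q2 = A - c2
    Fixed costs F_i drop out of the first-order conditions.
    """
    A, B, c1, d1, c2, d2 = map(Fraction, (A, B, c1, d1, c2, d2))
    a11, a12, r1 = 2 * B + 2 * d1, 2 * B, A - c1
    a21, a22, r2 = 2 * B, 2 * B + 2 * d2, A - c2
    det = a11 * a22 - a12 * a21
    q1 = (r1 * a22 - a12 * r2) / det
    q2 = (a11 * r2 - a21 * r1) / det
    return q1, q2


# Example 10.7: p(Q) = 100 - Q, TC1 = 5 + 12 q1 + 6 q1^2, TC2 = 2 + 18 q2 + 3 q2^2.
q1, q2 = multiplant_outputs(100, 1, 12, 6, 18, 3)
total = q1 + q2
mc1 = 12 + 12 * q1          # marginal cost in the US plant
mc2 = 18 + 6 * q2           # marginal cost in the Chilean plant
mr = 100 - 2 * total        # common marginal revenue
print(q1, q2, total, mc1, mc2, mr)
print(float(q1 / total), float(q2 / total))   # production shares
```

Checking that `mc1`, `mc2`, and `mr` coincide is a quick way to confirm that the solution satisfies MR = MC1 = MC2.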
Self-assessment 10.7 Consider the multiplant monopolist in example 10.7, but
assume that the inverse demand function changes to \(p(Q) = 300 - \frac{1}{2}Q\). Follow the
steps in example 10.7 to find the optimal output in each plant. How are these results
affected by the demand increase?
Cartels. Our analysis about how the multiplant monopolist determines its aggregate pro-
duction Q, and how it distributes such production among its plants (how much is being
produced in plant 1, and how much in plant 2) is analogous to a “cartel” problem. A cartel
is a group of firms coordinating their production decisions to increase their joint profits,
such as the Organization of the Petroleum-Exporting Countries (OPEC), the diamond car-
tel, and the lysine cartel (as portrayed in the movie The Informant, featuring Matt Damon).
Therefore, the cartel is, as a group, equivalent to a monopolist with different plants, where
each plant is one of the firms participating in the cartel. For instance, in the OPEC cartel,
some countries have a lower marginal cost of production (i.e., a lower cost of extracting an
additional barrel of oil), such as Saudi Arabia, while others have a higher marginal cost,
such as Angola or Venezuela. As a consequence, they coordinate their total production and
distribute it among the cartel participants. We return to the analysis of cartels in chapter
14, where we examine imperfectly competitive markets, and firms’ incentives to collude to
further increase their profits.
10.7 Welfare Analysis under Monopoly
As suggested in previous sections, output is lower under monopoly than it is under per-
fectly competitive industries, entailing a higher price. As figure 10.3 illustrates, this implies
that consumer surplus is much smaller than under perfect competition because customers
pay more per unit and buy fewer units. In contrast, profits are larger. A natural question
is whether the firm’s profit gain offsets the consumers’ loss, giving rise to an overall increase
in social welfare. As the graph depicts, the firm’s profit gain does not compensate for the
loss in consumer surplus, ultimately yielding a net loss in social welfare.
In particular, consumer surplus decreases from A + B + C to A, entailing a loss of B + C,
as summarized in table 10.1. In contrast, profits increase from D + E + F to D + F + B,
implying a net gain of B − E.
As a consequence, a part of consumer welfare (region B) is transferred to the monopolist
in the form of larger profits. However, a portion of total welfare under a perfectly competitive
market is not transferred to another agent, but lost (see the region C + E). This net loss in
social welfare, C + E, is often referred to as the “deadweight loss” of monopoly, which is
explored in example 10.8.
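Figure 10.3
Welfare changes from a monopolistic market.
[Figure: price p on the vertical axis and output q on the horizontal axis; curves MC(q), MR(q), and demand p(q); prices p^M and p^PC; outputs q^M and q^PC; welfare regions A through F.]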
Table 10.1
Welfare changes from monopoly.

                    Perfect Competition      Monopoly        Difference
Consumer Surplus    A + B + C                A               −B − C
Profits             D + E + F                D + F + B       B − E
Welfare             A + B + C + D + E + F    A + D + F + B   −C − E
Example 10.8: Finding the deadweight loss of a monopoly Consider the
monopolist in example 10.3, where $p(q) = 10 - q$ and $MC(q) = 4$. In that scenario, we
found that monopoly output was $q^M = 3$ units, entailing a monopoly price of $p^M = \$7$,
which generates a consumer surplus of $CS^M = \frac{1}{2}(10 - 7)3 = \$4.50$ and profits of
$\pi^M = (7 \times 3) - (4 \times 3) = \$9$, for a total welfare of
$$W^M = CS^M + \pi^M = 4.50 + 9 = \$13.50.$$
Under perfect competition, output is found at the point where demand crosses supply
(the marginal cost (MC) curve), $10 - q = 4$, yielding $q^{PC} = 6$ units, and a price of
$p^{PC} = \$4$. Figure 10.4 depicts $q^{PC}$ and $p^{PC}$ under perfect competition, comparing
them against $q^M$ and $p^M$ under monopoly. Therefore, consumer surplus is
$CS^{PC} = \frac{1}{2}(10 - 4)6 = \$18$, and profits are $\pi^{PC} = (4 \times 6) - (4 \times 6) = \$0$, which generates a
total welfare of
$$W^{PC} = CS^{PC} + \pi^{PC} = 18 + 0 = \$18.$$
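Figure 10.4
Monopoly versus perfect competition: an example.
[Figure: demand $p(q) = 10 - q$, marginal revenue $MR(q) = 10 - 2q$, and marginal cost $MC(q) = 4$; monopoly outcome $q^M = 3$, $p^M = \$7$; competitive outcome $q^{PC} = 6$, $p^{PC} = \$4$; the triangle between them is labeled DWL.]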
The difference between the welfare levels across market structures,
$W^{PC} - W^M = 18 - 13.50 = \$4.50$, represents the deadweight loss of monopoly.
Alternatively, we could find such deadweight loss by measuring the area of triangle
DWL in figure 10.4, as follows (see footnote 13):
$$DWL = \frac{1}{2}\,\underbrace{\left[p^M - MC(q^M)\right]}_{\text{height}}\,\underbrace{\left(q^{PC} - q^M\right)}_{\text{base}} = \frac{1}{2}(\$7 - \$4)(6 - 3) = \frac{9}{2} = \$4.5.$$
Intuitively, society loses $4.50 from having a monopoly, rather than a perfectly
competitive industry. As discussed previously, this is a net loss, not simply a welfare
transfer from consumers to the monopolist.
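The welfare comparison in this example can be reproduced numerically. The Python sketch below assumes a linear inverse demand p(q) = a − bq and a constant marginal cost c, as in the example; the function and variable names are illustrative and do not come from the text.

```python
def linear_monopoly_welfare(a, b, c):
    """Compare monopoly and perfect competition for p(q) = a - b*q, MC(q) = c."""
    # Monopoly: MR = MC, that is, a - 2*b*q = c.
    q_m = (a - c) / (2 * b)
    p_m = a - b * q_m
    # Perfect competition: p = MC, that is, a - b*q = c.
    q_pc = (a - c) / b
    p_pc = c

    def consumer_surplus(price, quantity):
        # Triangle below the demand curve and above the price paid.
        return 0.5 * (a - price) * quantity

    welfare_m = consumer_surplus(p_m, q_m) + (p_m - c) * q_m
    welfare_pc = consumer_surplus(p_pc, q_pc) + (p_pc - c) * q_pc
    # Deadweight loss, first as a welfare difference, then as a triangle area.
    dwl_difference = welfare_pc - welfare_m
    dwl_triangle = 0.5 * (p_m - c) * (q_pc - q_m)
    return welfare_m, welfare_pc, dwl_difference, dwl_triangle


result = linear_monopoly_welfare(a=10, b=1, c=4)
print(result)  # (13.5, 18.0, 4.5, 4.5)
```

Both ways of computing the deadweight loss return $4.50, confirming that the welfare difference and the triangle area coincide in this setting.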
Self-assessment 10.8 Consider the monopolist in example 10.8, but assume
now that its total cost is $TC(q) = 4q^2$. Repeat the steps in example 10.8 to find the
consumer surplus, profits, and welfare under monopoly, and then under perfect competition,
and ultimately, find the welfare difference measuring the deadweight loss
from monopoly.
13. Note that MC(qM ) = $4 because MC(q) = 4 is a flat horizontal line, as depicted in figure 10.4.
10.8 Advertising in Monopoly
When investing in advertising campaigns, the monopolist must balance the additional
demand that advertising entails and its associated costs. In other words, the monopolist
faces a trade-off: advertising increases demand, but it is costly. To find the profit-maximizing
amount of advertising, A, let us write the monopolist problem as follows:
$$\max_{A}\ \pi = TR - TC - A,$$
where the last term, $A$, denotes the cost of advertising. Because total revenue is $TR = p \times q$,
and total cost is $TC(q)$, we can rewrite this problem as follows:
$$\max_{A}\ \pi = (p \times q) - TC(q) - A = [p \times q(p, A)] - TC[q(p, A)] - A,$$
where q = q(p, A) represents the demand function (sales), which decreases in price p, but
increases in the amount of advertising A.
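We can now differentiate with respect to the amount of advertising $A$, to obtain (see footnote 14)
$$p\,\frac{\partial q(p, A)}{\partial A} - \underbrace{\frac{\partial TC}{\partial q}}_{MC}\,\frac{\partial q(p, A)}{\partial A} - 1 = 0,$$
or, rearranging,
$$(p - MC)\,\frac{\partial q(p, A)}{\partial A} = 1. \tag{10.2}$$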
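To express this result more compactly, let us define the advertising elasticity of demand,
$\varepsilon_{q,A}$, as follows:
$$\varepsilon_{q,A} = \frac{\%\ \text{increase in } q}{\%\ \text{increase in } A} = \frac{\Delta q / q}{\Delta A / A} = \frac{\Delta q}{\Delta A}\,\frac{A}{q}.$$
In the case of a small change in $A$, $\varepsilon_{q,A}$ can be rewritten as $\varepsilon_{q,A} = \frac{\partial q(p,A)}{\partial A}\,\frac{A}{q}$. After
rearranging this expression, we find
$$\varepsilon_{q,A}\,\frac{q}{A} = \frac{\partial q(p, A)}{\partial A}.$$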
14. In the first term of the monopolist’s profit, $p \times q(p, A)$, advertising affects only the second component, $q(p, A)$,
so the derivative of $p \times q(p, A)$ is $p\,\frac{\partial q(p,A)}{\partial A}$. In the second component of profits, $TC(q(p, A))$, differentiating with
respect to $A$ requires the application of the chain rule, yielding $\frac{\partial TC}{\partial q}\,\frac{\partial q(p,A)}{\partial A}$, because $TC(q)$ is a function of $q$,
and $q$ is a function of $A$. Finally, the third term in the monopolist’s profits, $A$, is linear in advertising, producing a
derivative of 1.
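Therefore, we can rewrite equation (10.2) as
$$(p - MC)\,\underbrace{\varepsilon_{q,A}\,\frac{q}{A}}_{\frac{\partial q(p,A)}{\partial A}} = 1.$$
Dividing both sides by $\varepsilon_{q,A}$ and rearranging yields $p - MC = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{q}$. In addition, dividing
both sides by $p$, we find
$$\frac{p - MC}{p} = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{pq}. \tag{10.3}$$
From the IEPR, we know that $\frac{p - MC}{p} = -\frac{1}{\varepsilon_{q,p}}$. Hence, the left side of equation (10.3)
becomes
$$-\frac{1}{\varepsilon_{q,p}} = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{pq}.$$
And rearranging, we get
$$-\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}} = \frac{A}{pq}.$$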
The right side represents the advertising-to-sales ratio. Therefore, for two markets with the
same price elasticity of demand, $\varepsilon_{q,p}$, the advertising-to-sales ratio $\frac{A}{pq}$ must be larger in the
market where demand is more sensitive to advertising (higher $\varepsilon_{q,A}$).
Example 10.9: Finding the monopolist’s optimal advertising ratio Consider a
monopolist with a price elasticity of demand of $\varepsilon_{q,p} = -1.5$ and an advertising elasticity
of $\varepsilon_{q,A} = 0.1$. In this scenario, the advertising-to-sales ratio should be $\frac{A}{pq} = -\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}}$,
which entails
$$-\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}} = -\frac{0.1}{-1.5} = 0.067.$$
Therefore, advertising should account for 6.7 percent of this monopolist’s total
revenue.
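The ratio in this example is a one-line calculation, sketched below in Python. The function name and the input check are illustrative assumptions; the formula itself is the advertising-to-sales condition derived above.

```python
def advertising_to_sales_ratio(price_elasticity, advertising_elasticity):
    """Return A/(p*q) = -(advertising elasticity) / (price elasticity)."""
    if price_elasticity >= 0:
        # The condition presumes a downward-sloping demand curve.
        raise ValueError("price elasticity of demand must be negative")
    return -advertising_elasticity / price_elasticity


ratio = advertising_to_sales_ratio(price_elasticity=-1.5, advertising_elasticity=0.1)
print(round(ratio, 3))  # 0.067
```

Holding the price elasticity fixed, a larger advertising elasticity raises the ratio proportionally, which is the comparison highlighted before the example.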
Self-assessment 10.9 Consider the monopolist in example 10.9, but assume now
that advertising elasticity increases to $\varepsilon_{q,A} = 0.3$. Find the advertising-to-sales ratio
$\frac{A}{pq}$, and compare it to that in example 10.9. Interpret your results.
10.9 Monopsony
A “monopsony” can be understood as a direct application of monopoly where, rather than
a single seller offering its good to several buyers, there is now only one buyer in the market
and several sellers. Examples of monopsonies are often found in small labor markets, such
as mining jobs in a small town with only one mine (employer) and many workers willing
to work (employees), or Walmart in a small town where it is the main employer.15 In this
scenario, the buyer (employer) will be able to pay less for each hour of labor (lower wages)
than if it had to compete against other employers to attract employees, as in a perfectly
competitive market.
To show this result, and to explicitly find how the monopsony chooses its output and
price (wage), consider a firm (e.g., a coal mine) with production function $q = f(L)$ which,
as in chapter 7, increases in the number of workers hired, $f'(L) > 0$, but at a decreasing rate,
$f''(L) < 0$. The profits of the coal mine are then given by
$$\pi = TR - TC = pq - w(L)L.$$
Intuitively, the firm extracts q units of coal, each sold at a price p in the international
market, yielding a total revenue of TR = pq. (For simplicity, we assume that such a
price is given, implying that the international market for coal is perfectly compet-
itive, so a larger/smaller production by the mine does not alter the international
price p.)
Regarding total costs, the firm hires $L$ workers, paying each of them a wage of $w(L)$.
Importantly, such a wage $w(L)$ is a function of $L$, specifically increasing in the number of
workers hired $L$. Intuitively, as the firm hires more workers, labor becomes more scarce,
and a more generous wage must be offered to attract new workers; that is, $w(L)$ is increasing,
$w'(L) > 0$.
We are now ready to write the monopsonist’s PMP. Because the production function
is given by q = f (L), we can rewrite the profit as π = pf (L) − w(L)L, which is only
a function of L, the number of workers hired by the mine. Hence, the monopsonist
solves
$$\max_{L \geq 0}\ \pi = p f(L) - w(L)L.$$
Intuitively, this problem says: “Choose the number of workers you plan to hire, L, so as
to maximize your profits.”
15. Other examples include technology companies, such as Cisco and Oracle, which have recently started contracts
that prohibit their employees from working for a competing firm for a period of time after leaving their current job
(such as months or years). This contract requirement essentially keeps a worker from using competing offers as a
negotiation tool with her current employer.
Differentiating with respect to $L$, we obtain (see footnote 16)
$$p f'(L) - \left[w(L) + w'(L)L\right] = 0,$$
or, rearranging,
$$\underbrace{p f'(L)}_{MRP_L} = \underbrace{w(L) + w'(L)L}_{ME_L}.$$
Let us interpret this condition. The left side captures that, after hiring 1 more
worker (increase in $L$), the firm produces $f'(L)$ more units of output (e.g., coal). Because
these additional units are sold at a price $p$, the left side, $p f'(L)$, measures the marginal
revenue product of hiring 1 more worker, $MRP_L$. In other words, it denotes the market value
of the additional output that the firm can produce when hiring 1 more worker.
In contrast, the right side measures the increase in cost $w(L)L$ that the firm experiences
when hiring 1 more worker. On the one hand, this worker must be paid $w(L)$, as represented
by the first term in $ME_L$. On the other hand, the additional worker is only attracted to the job
if the firm offers her a higher salary because labor becomes scarcer as the firm hires each
additional worker. Such a wage increase, $w'(L)$, must be passed on to all existing workers,
which entails a cost increase of $w'(L)L$ for the firm. Overall, the firm’s total expenditure on
labor increases by $ME_L = w(L) + w'(L)L$ (see footnote 17).
In summary, the monopsonist optimality condition says that
MRPL = MEL ,
implying that the monopsonist hires workers until the point where the additional market
value of the output produced by the new worker, MRPL , coincides with the additional cost
that the firm incurs when hiring such a worker, MEL .
Example 10.10: Finding optimal L in monopsony Consider a coal company that
is the only employer in a small town. The mine has production function $q = 100 \times \ln L$
and faces an international perfectly competitive price of coal, given by $p = \$8$. In
addition, assume that the supply curve for labor is $w(L) = 3 + \frac{1}{2}L$. In this scenario,
the marginal revenue product of labor is
16. The second term in the firm’s profit, $w(L)L$, is a product with both elements depending on $L$. Therefore, when
we differentiate $w(L)L$ with respect to $L$, we apply the product rule, obtaining $w(L)\frac{\partial L}{\partial L} + \frac{\partial w(L)}{\partial L}L = w(L) + \frac{\partial w(L)}{\partial L}L$,
because $\frac{\partial L}{\partial L} = 1$. Using $w'(L) = \frac{\partial w(L)}{\partial L}$ to denote the derivative of $w(L)$ with respect to $L$, we can express this
derivative more compactly as $w(L) + w'(L)L$.
17. Note that this increase in costs is analogous to the change in total revenue that the monopolist experiences
if it chooses to sell 1 more unit of output. The extra unit is sold at a price $p(q)$, but that unit is sold only if the
monopolist charges a lower price, $p'(q) < 0$, where the price discount must be applied to all units, entailing a loss in
revenue of $p'(q)q$.
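Figure 10.5
Labor and wages under monopsony.
[Figure: labor L on the horizontal axis; curves $ME_L = 3 + L$, labor supply $w(L) = 3 + \frac{1}{2}L$, and $MRP_L = \frac{800}{L}$; monopsony outcome $L^M = 26$, $w^M = \$16$; competitive outcome $L^{PC} = 37$, $w^{PC} = \$21.50$.]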
$$MRP_L = p f'(L) = \underbrace{8}_{p} \times \underbrace{100\,\frac{1}{L}}_{f'(L)} = \frac{800}{L}.$$
Figure 10.5 depicts the $MRP_L = \frac{800}{L}$ curve, which decreases in labor $L$, becoming
flatter as $L$ increases (see footnote 18).
In addition, we can find the marginal expenditure that the firm incurs from hiring
1 more worker, $ME_L$, as follows:
$$ME_L = w(L) + w'(L)L = \left(3 + \frac{1}{2}L\right) + \frac{1}{2}L = 3 + L,$$
which, as depicted in figure 10.5, originates at a height of 3, the same vertical intercept as the
labor supply curve $w(L)$. However, $ME_L$ is twice as steep as $w(L)$
(i.e., the slope of $ME_L$ is 1, while that of $w(L)$ is only 1/2). Setting $MRP_L = ME_L$, we
find their crossing point, as follows:
$$\frac{800}{L} = 3 + L,$$
18. Indeed, note that the first-order derivative of $MRP_L$ with respect to $L$, $\frac{\partial MRP_L}{\partial L} = -\frac{800}{L^2}$, is negative for all
positive values of $L$; and its second-order derivative, $\frac{\partial^2 MRP_L}{\partial L^2} = \frac{1600}{L^3}$, is positive for all positive values of $L$.
which, multiplying both sides by $L$, yields $800 = 3L + L^2$, or $L^2 + 3L - 800 = 0$. Solving for $L$ in this
equation, we find two roots (see footnote 19), $L = -29.82$ and $L = 26.82$. Because the firm must hire
a positive number of workers (or zero), only the positive root is admissible; in integer amounts, we find
that $L^M = 26$ workers is optimal. At $L^M = 26$, wages become $w(26) = 3 + \frac{1}{2} \times 26 = \$16$.
Under a perfectly competitive labor market, the number of workers is determined
by the point where $MRP_L$ crosses the labor supply curve $w(L)$, $MRP_L = w(L)$, which
in this example implies
$$\frac{800}{L} = 3 + \frac{1}{2}L;$$
after expanding, this expression yields $800 = 3L + \frac{1}{2}L^2$. Solving for $L$, we also obtain
two roots, $L = -43.11$ and $L = 37.11$, with the latter being the optimal number of
workers hired under perfect competition in the labor market, $L^{PC} = 37$ (in integer
amounts, as depicted in figure 10.5). In this context, wages become
$w(37) = 3 + \frac{1}{2} \times 37 = \$21.50$. Hence, when the labor market is competitive, workers receive a
higher wage and more workers are willing to work at that wage; whereas under a
monopsony, the single employer takes advantage of her purchasing power by offering
a lower wage, which entails that fewer workers are willing to work.
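The two hiring conditions in this example reduce to quadratic equations, so they are easy to verify numerically. The Python sketch below assumes the example's functional forms (MRP_L = 800/L and w(L) = 3 + L/2); the helper names are illustrative, and rounding down to whole workers follows the integer convention used in the example.

```python
import math


def positive_root(a, b, c):
    """Return the larger root of a*x**2 + b*x + c = 0 (assumes a > 0 and c < 0)."""
    discriminant = b * b - 4 * a * c
    return (-b + math.sqrt(discriminant)) / (2 * a)


def wage(workers):
    """Labor supply curve from the example: w(L) = 3 + L/2."""
    return 3 + 0.5 * workers


# Monopsony: MRP_L = ME_L, i.e., 800/L = 3 + L, or L**2 + 3L - 800 = 0.
L_m = positive_root(1, 3, -800)
# Perfect competition: MRP_L = w(L), i.e., 800/L = 3 + L/2, or 0.5*L**2 + 3L - 800 = 0.
L_pc = positive_root(0.5, 3, -800)

print(round(L_m, 2), round(L_pc, 2))        # 26.82 37.11
print(wage(math.floor(L_m)), wage(math.floor(L_pc)))  # 16.0 21.5
```

The output reproduces the employment levels and wages in the example: fewer workers and a lower wage under monopsony than under a competitive labor market.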
Self-assessment 10.10 Consider the coal company in example 10.10, but assume
now that the price of coal p increases from $8 to $10. Use the same steps as in
example 10.10 to find the number of workers hired under monopsony, under perfect
competition, and also find the corresponding salaries.
Exercises
1. Monopoly equilibrium-linear costs.B Consider a drug company holding the patent of a new drug
for a rare disease (monopoly rights). The firm faces inverse demand function p(q) = 100 − 0.1q,
and a cost function C(q) = 4q.
(a) Find the monopolist profit-maximizing output, its price, and its profits.
19. Expression $L^2 + 3L - 800 = 0$ is a quadratic equation, which has the general form $ax^2 + bx + c = 0$. We can
use the quadratic formula to find the two roots for $x$, as follows: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. In this context, the quadratic
formula entails $L = \frac{-3 \pm \sqrt{3^2 + (4 \times 1 \times 800)}}{2 \times 1}$, which simplifies to $\frac{-3 \pm \sqrt{3{,}209}}{2} = \frac{-3 \pm 56.64}{2}$, which in turn produces two
roots, $L = \frac{-3 - 56.64}{2} = -29.82$ and $L = \frac{-3 + 56.64}{2} = 26.82$.
(b) Assume now that the government seeks the monopolist to produce the competitive equilibrium
output (i.e., where demand crosses the MC function). Find the competitive equilibrium output
in this context.
(c) Find the subsidy per unit of output that the government needs to offer the monopolist to induce
the latter to produce the competitive equilibrium output you identified in part (b).
(d) What is the total cost that the government incurs with the subsidy? How are profits affected by
the subsidy (i.e., the change in profits from parts a to c)?
2. Monopoly equilibrium-convex costs.B Consider a drug company holding the patent of a new drug
for a rare disease (monopoly rights). The firm faces demand function p(q) = 100 − 3q, and a cost
function $C(q) = 5q^2$.
(a) Find the monopolist profit-maximizing output, its price, and its profits. Find the deadweight
loss from the monopoly.
(b) Assume that the government seeks to collect $100 by imposing a tax on the monopolist. (For
simplicity, let us assume that the tax is revenue neutral, so upon collecting it from the firm, it
is transferred to consumers.) Consider that the government sets a tax t per unit of output. Find
the optimal tax that helps the government collect $100. Then identify the resulting equilibrium
output, price, profits, and deadweight loss.
(c) Consider now that the government sets a lump-sum tax T on profits. Find the optimal tax
that helps the government collect $100. Then identify the resulting equilibrium output, price,
profits, and deadweight loss.
3. Maximizing revenue versus profit.A Consider a monopolist facing linear inverse demand function
p(q) = 20 − 2q, and constant marginal cost MC(q) = 1.
(a) Assume that the monopolist seeks to maximize total revenue rather than profits. Which output
does the monopolist choose? What are the equilibrium price and profits?
(b) Assume now that the monopolist seeks to maximize profits. Show that its optimal output
decreases relative to that maximizing total revenue in part (a), that price increases, and that
profits increase.
4. Taxing a monopoly.B A local cable provider faces demand $q(p) = 100p^{-2}$, and cost function
$C(q) = q^{3/2}$. Assume that this provider is the only firm offering cable in this town.
(a) Find the equilibrium price, quantity, and profit for the cable provider.
(b) Find the equilibrium price, quantity, and profit if the monopolist were to produce at the
perfectly competitive equilibrium.
(c) Can the local regulator impose a lump-sum tax on the cable provider to produce at the
competitive equilibrium? Why or why not? If so, find the value of the tax T.
(d) Can the local regulator impose a per-unit tax on the cable provider to produce at the competitive
equilibrium? Why or why not? If so, find the value of the per-unit tax t.
5. Regulating a natural monopoly.B Duchess Energy, an electric utility company, provides electric-
ity to Spartanburg. The demand for electricity is p(q) = 10 − 0.1q, and this company’s costs are
C(q) = 1 + 0.5q.
(a) Does Duchess Energy exhibit the properties to be a “natural monopoly”?

(b) Find the unregulated monopolist’s profit-maximizing price, output, and profit.
(c) The Spartanburg city government passes a law that requires utility and other electricity
providers to practice MC pricing (i.e., p(qR ) = MC(qR )). What is the regulated monopolist’s
output, price, and profit?
(d) What is the lump-sum subsidy that the regulator must provide the electric utility company to
practice MC pricing without operating at a loss?
(e) Compute the consumer surplus from the pricing strategies in parts (b) and (c).
(f) Discuss the pros and cons of MC pricing in natural monopolies.
6. Advantages to a monopoly.A We posed the question in section 10.2, “Why do monopolies exist
in the first place if they are bad for society?” After reading the chapter, we know that a monopoly
results in lower welfare than in a perfectly competitive industry, but are there social benefits to a
monopoly?
7. Two parts of marginal revenue.A There are two parts to a monopolist’s marginal revenue
function. Identify the two parts in each of the following demand functions:
(a) p(q) = 25 − 1.5q.
(b) $p(q) = \frac{12}{q} + 50$.
(c) $p(q) = e^{-q}$.
8. Monopoly and changing condition.B Some small towns may have only one restaurant, making
it a monopoly in that town. Consider Rosie’s Diner, in a small mountain town. Rosie’s inverse
demand is p(q) = 20 − 0.4q, where q represents meals per week, and costs are C(q) = 5q.
(a) Find Rosie’s profit-maximizing price, quantity, and profits.
(b) The road into the town has become considerably harder to traverse since a recent mudslide,
and Rosie’s suppliers have increased their delivery price. This has increased costs to C(q) =
8q + 10. How do her equilibrium prices, quantity, and profits change?
(c) After the mudslide, fewer visitors have been hiking the trails around town, which has
decreased demand to p(q) = 15 − 4q. Does Rosie stay in business?
9. Factors of a high monopoly price.B A monopolist does not charge an infinitely high price, but
certain market conditions can lead to a situation where prices may seem infinitely high. Discuss
what the “perfect storm” of market conditions might be.
10. Monopolies produce on the elastic part of demand.B Show that a monopolist facing inverse
demand $p(q) = q^{-2} + 50$ with constant marginal cost MC = 5 will produce on the elastic segment
of the demand curve.
11. Monopoly with general linear demand.C Consider a monopolist with general inverse demand
p(q) = a − bq and constant marginal cost c. How does the monopolist’s optimal quantity, price,
profit, and consumer surplus change as each of the parameters a, b, and c increase?
12. Using the IEPR.A One advantage of using the Lerner index is that it uses elasticity, which can be
easily estimated. Use the IEPR to solve for the optimal price for the following situations:
(a) εq,p = −2, MC = $2.

(b) εq,p = −3, MC = $2.
(c) εq,p = −4, MC = $2.
(d) εq,p = −2, MC = $3.
(e) εq,p = −2, MC = $4.
(f) How does the optimal price change as demand becomes more elastic? How does the optimal
price change as marginal cost increases?
13. Multiplant monopoly–I.B Consider a firm that holds a patent on technology that makes the
production of concrete less harmful to the environment (resulting in a monopoly on this tech-
nology). The firm has two plants: one domestic (D) and one located in Australia (A). Demand
for their technology is $p(Q) = 250 - 10Q$, where $Q = q_D + q_A$ is aggregate output. The domestic
plant has total cost $TC_D(q_D) = 5 + 10q_D + 4(q_D)^2$, and the Australian plant has total cost
$TC_A(q_A) = 15 + 4q_A + 5(q_A)^2$. Find the optimal output at each plant, and the price it will charge.
14. Multiplant monopoly–II.B Consider a monopolist that is considering outsourcing some of its
production to a plant overseas. The firm currently faces demand p(Q) = 75 − 0.5Q, and its sole
factory has total cost $TC(q_1) = 10 + 2q_1 + (q_1)^2$. If it invests in the overseas plant, it estimates
that the plant will have total cost $TC(q_2) = 5 + 25q_2 + 5(q_2)^2$. In that case, the monopolist can
produce in either plant or both plants. Should the firm invest in the new plant?
15. Multiproduct monopoly.C Consider a pharmaceutical company with a patent on two different
prescription drugs, granting them a monopoly in each market. Both drugs (x1 and x2 ) are made
in similar ways, with total cost of
$$TC(x_1, x_2) = 50 + 2(x_1 + x_2) + 0.5(x_1 + x_2)^2.$$
Drug $x_1$ has demand $p_1(x_1) = 500 - x_1$, and drug $x_2$ has demand $p_2(x_2) = 1{,}000 - x_2$. Find the
monopoly output, price, and profit for each drug.
16. Multiperiod monopoly.C Many new inventions rely on crowdfunding campaigns to finance
the development of the resulting products. Many of these products also benefit from network
externalities—the idea that the more of their products they sell today, the more valuable they
will be to consumers tomorrow as more consumers are involved. Consider such a product, which
would result in a monopoly over two periods: (1) the crowdfunding period with demand p(q) =
100 − 2q; and (2) the post-crowdfunding period, where demand increases to p(q) = 150 − 2q if
the firm sells at least 20 units and remains unchanged if it does not sell 20 units. Assume that the
firm has marginal cost MC = $40.
(a) No network effects. Consider that the monopolist ignores network effects, assuming that its
demand function is p(q) = 100 − 2q in both periods. Find the equilibrium output and prices
in both periods.
(b) Network effects, second period. Assume now that the monopolist recognizes the presence of
network effects. Starting at the second period, what are its profit-maximizing output and price
when the firm sold fewer than 20 units in the first period? What if the firm sold more than 20
units in the first period?
(c) Network effects, first period. Still with the situation set out in part (b), let us move on to the
first period. Find the monopolist output and price in this period. (Hint: The firm anticipates
its second-period profits if it sells more than 20 units today, and if it doesn’t. The monopolist
then chooses the first-period output that yields the largest overall profit.)
17. Advertising-to-sales ratio.A Consider a monopolist with a price elasticity of demand of εq,p =
−2.5 and an advertising elasticity of εq,A = 0.5. What is the advertising-to-sales ratio? Comment
on how price elasticity of demand affects the advertising-to-sales ratio.
18. Optimal advertising.B Annie’s Apples company is the only local producer of caramel apples in
Appleville, making her a monopoly. Her demand for caramel apples is
$$p(q, A) = 100 - q + \sqrt{A},$$
where q is the number of apples and A is advertising expenditure. If she has a constant marginal
cost of $2 per caramel apple, what is Annie’s equilibrium number of caramel apples sold and
advertising expenditure?
19. Identify a monopsony.A Outside of employers in small towns, describe an example of a real-life
monopsony. Be specific about what the good traded is and who the buyers and sellers are.
20. Monopsony–one input.B Consider a family business that produces shoes with a son (Edgar)
eager to join the workforce. The family business is the only potential employer for Edgar, as
he will be kicked out of the family if he works at a different store (and Edgar would rather not
work at all than get kicked out of the family). The business sells shoes for $100 a pair, with
production function q = 10 ln L. The supply curve for the number of hours that Edgar will work
is w(L) = 5 + L. How many hours will Edgar work, and what will his wage be?
21. Monopsony–two inputs.B In many rural towns, there may be only one employer. An example of
this may be a large, corporation-owned farm. This farm recently bought out many smaller farms in
the area, and there is a large surplus of both high- and low-skilled labor ($L_h$ and $L_l$, respectively).
The production function for the farm is
$$q = 10 \ln L_h + 4 \ln L_l.$$
The supply curves for labor are $w_h(L_h) = 5 + 4L_h$ and $w_l(L_l) = 2 + 2L_l$, and the farm’s output
sells for $10 per unit. How much of each type of labor will the farm hire, and at what wages?
How much output will the farm sell? What is the farm’s total profit?
11 Price Discrimination and Bundling
11.1 Introduction
This chapter analyzes firms’ strategies to increase profits by charging different prices to
different types of consumers, such as those with distinct demands for the good, or those
purchasing it at different time periods or locations, or in different quantities. In particular,
we discuss three forms of price discrimination. Under first-degree (or perfect) price dis-
crimination, the firm has access to enough information about consumer demand that it can,
essentially, charge a “personalized price” to each consumer, which coincides with her maxi-
mum willingness-to-pay (WTP) for the good. In second-degree price discrimination, the
firm offers quantity discounts to customers. Lastly, in third-degree price discrimination, the
firm charges different prices to groups of consumers with different characteristics and needs,
such as students and nonstudents at the movies.
While all forms of price discrimination allow the seller to increase profits, first-degree
price discrimination increases its profits by the largest amount because consumers have no
consumer surplus, transferring it entirely to the firm. This form of price discrimination is
difficult to implement, however, as it requires extremely detailed information about each
consumer’s WTP for the good, whereas second- and third-degree price discrimination do
not assume such rich access to information.
We finish the chapter by analyzing bundling strategies. You have probably encountered
this before when purchasing a computer, facing a price for the monitor, another price for
the central processing unit (CPU), and another for the whole computer (bundling the monitor and CPU).1 In most
cases, prices are set so that purchasing the bundle sounds like a better deal than buying
all the parts separately. We analyze when the seller finds it profitable to offer bundles, the
1. Actually, if you purchase the CPU alone, you are being offered a bundle as well, because the package often
includes the keyboard and mouse. Other common examples of bundle pricing are tickets to water and amusement
parks, where you can purchase (1) a ticket giving you access to the park without access to rides (you can then buy
each ride separately inside the park) or (2) a ticket allowing you unlimited access to the park and rides everywhere.
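Figure 11.1
Room for larger profits in monopoly.
[Figure: output q on the horizontal axis; curves MC(q), MR(q), and demand; at the monopoly output qM and price pM, the demand curve separates customers with WTP above pM from customers with WTP below pM but above MC(qM).]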
optimal price that the seller should charge for the bundle, and the optimal price for each
item sold separately.
11.2 Price Discrimination
Monopolists, as well as firms with market power, can earn large profits. A natural ques-
tion, however, is whether firms can do even better. As we discuss in this chapter, the answer
is “Yes.” To understand this point, consider the monopolist’s decision again, as shown in
figure 11.1. By choosing an output level where its marginal revenue coincides with its
marginal cost, MR = MC, it charges a unique price pM to all its customers, giving up two
business opportunities:
• Price pM attracts buyers who would have been willing to pay a higher price, as depicted in
the segment of the demand curve to the left of qM , where p > pM . Hence, the monopolist
would like to charge a higher price to these customers in order to earn a larger profit margin
from them.
• Price pM does not attract buyers who are not willing to pay pM , but are willing to pay more
than the cost that the monopolist incurs to produce the good. That is, the monopolist could
charge a price p in the segment of the demand curve below pM and above MC(qM ). This
means that the monopolist can make an additional profit p − MC(qM ) per unit by selling
its product to these customers.
These points highlight that the monopolist could increase its profits if it could charge
different prices to specific customers (i.e., if the monopolist could “price discriminate”).
In this section, we explore three types of price discrimination: first-degree, where the
monopolist sets a different price for each customer that coincides with her willingness-to-pay
for the good; second-degree (or “nonlinear pricing”), where the monopolist offers a quan-
tity discount to buyers purchasing large amounts of product; and third-degree, where the
monopolist charges different prices to different groups of customers, each having a different
demand curve.
Conditions for price discrimination. Before we start analyzing each type of price discrimination,
it is important that we understand under which conditions the monopolist can
price-discriminate:
• No arbitrage. The good cannot be resold from one customer to another (i.e., no arbitrage
can occur); otherwise, individuals with a low WTP would purchase the good at a low price
and resell it to individuals with a high WTP (but charge them less than the monopolist
would).
• Information about WTP. The monopolist needs some information about customers’ WTP
for its good. While firms rarely observe extremely detailed information about such
WTP for each potential customer, they at least gather information for various groups of
customers.
11.2.1 First-Degree Price Discrimination

In this scenario, the monopolist charges to every consumer i a price that coincides with her
maximum willingness-to-pay (i.e., a personalized price). For instance, if the monopolist
faces an inverse demand p(q) = a − bq, it charges price p = a to the individual with the
highest WTP, then a price p = a − $0.01 to the individual with the second-highest WTP,
and similarly for all subsequent buyers. The monopolist stops this pricing strategy when
p = MC(q) because customers with WTP below MC(q) would entail a per-unit loss.
As a consequence, the firm extracts all the surplus from every consumer, generating a
total profit that coincides with the area below the demand curve and above its marginal
cost (MC) function MC(q), as depicted in figure 11.2. In addition, the output that the
monopolist produces under first-degree price discrimination, \(q^{FD}\) (where the superscript
FD denotes first-degree price discrimination), coincides with that under a perfectly
competitive market, \(q^{PC}\), because at \(q^{PC}\) the demand curve crosses the firm’s marginal cost
curve; that is, p(q) = MC(q).
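The area argument above can be written compactly as an integral. The following is only a sketch, under the assumptions that the firm has no fixed costs and that \(q^{FD}\) denotes the output at which the demand curve crosses the marginal cost curve:

```latex
\[
\pi^{FD} = \int_{0}^{q^{FD}} \bigl[\, p(q) - MC(q) \,\bigr]\, dq ,
\qquad \text{where } p(q^{FD}) = MC(q^{FD}).
\]
```

With a linear demand p(q) = a − bq and a constant marginal cost c, this integral is the area of a triangle with height a − c and base (a − c)/b.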
Example 11.1: First-degree price discrimination. Consider a monopoly facing an
inverse demand curve p(q) = a − bq, where a, b > 0; and a total cost (TC) function
TC(q) = cq, where c > 0.
Uniform pricing. If the monopolist sets a uniform price for all its customers
(as described in chapter 10), it produces the output at which MR(q) = MC(q); that is, a − 2bq = c.
After solving for output \(q^M\), we find \(q^M = \frac{a-c}{2b}\), which entails a monopoly price of
\[
p^M = a - b\,\frac{a-c}{2b} = \frac{a+c}{2},
\]
with profits of \(\pi^M = \frac{(a-c)^2}{4b}\). (These results coincide with those in example 10.3 in
chapter 10.)
Figure 11.2
First-degree price discrimination. (The figure plots the demand curve and MC(q) against output q, marks \(q^{FD} = q^{PC}\), and labels the area between the two curves as profits under FD.)
First-degree price discrimination. If, instead, the monopolist practices first-degree
price discrimination, it produces an output level where the demand curve crosses
the marginal cost (i.e., a − bq = c) or, after solving for q, we find \(q^{FD} = \frac{a-c}{b}\). As
figure 11.2 depicts, the monopolist’s profits coincide with the area of the triangle below
the demand curve p(q) = a − bq, and above the marginal cost c; that is,
\[
\pi^{FD} = \frac{1}{2}\,\underbrace{(a-c)}_{\text{Height}}\,\underbrace{\left(\frac{a-c}{b} - 0\right)}_{\text{Base}} = \frac{(a-c)^2}{2b},
\]
which exceeds those under a uniform (unique) price, \(\pi^M = \frac{(a-c)^2}{4b}\), because
\(\frac{(a-c)^2}{2b} > \frac{(a-c)^2}{4b}\) simplifies to \(\frac{1}{2b} > \frac{1}{4b}\), or 4b > 2b. For instance, if the monopolist faces a
demand function p(q) = 10 − q (i.e., a = 10 and b = 1), and c = 2, the profit from
setting a uniform price is \(\pi^M = \frac{(a-c)^2}{4b} = \frac{(10-2)^2}{4}\) = $16, but it doubles with first-degree
price discrimination because \(\pi^{FD} = \frac{(a-c)^2}{2b} = \frac{(10-2)^2}{2}\) = $32.
Self-assessment 11.1. Consider the monopolist in example 11.1, but assume that
the inverse demand changes to p(q) = 16 − q and the marginal cost c = 3. Follow the
steps in example 11.1 to find the monopolist’s profit if it sets a uniform price, \(\pi^M\), and
if it practices first-degree price discrimination, \(\pi^{FD}\).
First-degree price discrimination is ideal for the monopolist, of course, as it extracts all
possible surplus from consumers. However, the monopolist needs a massive amount of infor-
mation to practice this type of price discrimination; namely, it needs to know the maximum
willingness-to-pay of every buyer, making this type of practice relatively uncommon, at
least in its purest form. One of the closest examples of this pricing strategy is the Free
Application for Federal Student Aid (FAFSA) forms that students applying for federal finan-
cial aid must submit to the college or university they attend. This form includes relatively
detailed information about the student’s income, as well as her family’s, which is highly
correlated with their willingness-to-pay for education—information that the student’s insti-
tution can use to better assess how much the student (and her family) are willing to pay in
tuition.2
11.2.2 Second-Degree Price Discrimination

With second-degree price discrimination, the monopolist offers a quantity discount to indi-
viduals willing to purchase several units, such as discounts for buying in bulk. That is, the
monopolist charges at least two prices: one for each of the first q1 units, and another for each
unit beyond q1 . For instance, the monopolist sets a price p1 = $4 for the first 3 units, and a
lower price p2 = $2 for all units thereafter.3 This is a common pricing strategy in utilities,
such as electricity and water, and in mass transit systems, where one may benefit from a
discount after purchasing a large number of units.
As you probably noticed from this discussion, this type of price discrimination gives rise
to three unknowns that the firm needs to determine:
• Where should the monopolist set the boundary q1 , so customers can start benefiting from
a quantity discount?
• Which price should the monopolist set for each unit in the first block, p1 ?
• Which price should it set for each unit in the second block, p2 ?
2. Besides detailed information about WTP, the second condition for price discrimination to be successful (namely,
no possibility of arbitrage) also holds in this scenario: degrees are nominative, so the student cannot resell her
education to another student.
3. Hence, the monopolist charges the same price for all the units in the first block (e.g., before reaching 3 units),
but a lower price for all the units in the second block (e.g., beyond 3 units). This explains why this pricing strategy
is also known as “block pricing.”
To find these three unknowns, we only need to solve the following monopolist’s problem:
\[
\max_{q_1, q_2} \ \underbrace{p_1 q_1}_{TR_1} + \underbrace{p_2 (q_2 - q_1)}_{TR_2} - TC(q_2),
\]
where TR1 = p1 q1 denotes the total revenue from the q1 units in the first block (i.e., units
from q = 0 to q = q1 ), and TR2 = p2 (q2 − q1 ) represents the total revenue from the units in
the second block (i.e., those from q1 to q2 ). Total cost TC (q2 ) is evaluated at q2 units
of output because the firm produces a total of q2 units. Intuitively, this problem asks the
monopolist:
Choose the number of units in the first block, q1 , and in the second block, q2 − q1 , to maximize
the profits from both blocks.
Example 11.2 illustrates this pricing strategy.
Example 11.2: Second-degree price discrimination. Consider a monopolist facing
the inverse demand function p(q) = 10 − q. The firm’s total cost
(TC) function is TC(q) = cq, where c > 0. We first need to write down the monopolist’s
profit maximization problem (PMP) as follows:
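\[
\max_{q_1, q_2} \ \pi = \overbrace{\underbrace{(10 - q_1)}_{p_1}\, q_1}^{TR_1} + \overbrace{\underbrace{(10 - q_2)}_{p_2}\, (q_2 - q_1)}^{TR_2} - \overbrace{c\, q_2}^{TC(q_2)}.
\]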
Differentiating with respect to q1, we obtain4
\[
\frac{\partial \pi}{\partial q_1} = 10 - 2q_1 - (10 - q_2) = 0,
\]
which simplifies to −2q1 + q2 = 0 or, after solving for q1, we find that \(q_1 = \frac{q_2}{2}\).
Differentiating now with respect to q2, we obtain
\[
\frac{\partial \pi}{\partial q_2} = 10 - 2q_2 + q_1 - c = 0,
\]
which leads to \(q_2 = \frac{10 + q_1 - c}{2}\). Inserting \(q_1 = \frac{q_2}{2}\) into \(q_2 = \frac{10 + q_1 - c}{2}\), we find
\[
q_2 = \frac{10 + \overbrace{\tfrac{q_2}{2}}^{q_1} - c}{2},
\]
or 3q2 + 2c = 20, which, solving for q2, gives \(q_2 = \frac{2(10-c)}{3}\). Inserting this result into
\(q_1 = \frac{q_2}{2}\), we find \(q_1 = \frac{10-c}{3}\). The first block is then \(q_1 = \frac{10-c}{3}\) units, while the second
block is
\[
q_2 - q_1 = \frac{2(10-c)}{3} - \frac{10-c}{3} = \frac{10-c}{3} \ \text{units.}
\]
4. Note that, to facilitate your differentiation, you can first expand the firm’s profit, obtaining
\(\pi = 10q_2 + q_1 q_2 - (q_1)^2 - (q_2)^2 - cq_2\).
We can now find the optimal prices for each block by plugging these output levels into
the inverse demand function as follows:
\[
p(q_1) = 10 - \frac{10-c}{3} = \frac{20+c}{3}, \quad \text{and}
\]
\[
p(q_2) = 10 - \frac{2(10-c)}{3} = \frac{2(5+c)}{3}.
\]
Numerical example. For instance, if the marginal cost is c = $4, the monopolist sells
\(q_1 = \frac{10-4}{3} = 2\) units in the first block at a price of \(p_1 = \frac{20+4}{3}\) = $8 per unit. In addition,
this firm sells \(q_2 = \frac{2(10-4)}{3} = 4\) units in total, implying q2 − q1 = 4 − 2 = 2 units in the
second block, each of them at a price of \(p_2 = \frac{2(5+4)}{3}\) = $6 per unit, thus offering a price
discount once the buyer purchases more than 2 units. These prices and output levels
generate profits of
π = (8 × 2) + (6 × 2) − (4 × 4) = $12.
If, instead, the monopolist charged a uniform price to all its customers (i.e., not
practicing price discrimination), its output \(q^M\) would solve 10 − 2q = 4, or
\(q^M = 3\) units, at a monopoly price of \(p^M\) = 10 − 3 = $7, yielding a profit of
only \(\pi^M\) = (7 × 3) − (4 × 3) = $9. As expected, the monopolist increases its
profits by price discriminating. Exercise 13 at the end of this chapter shows that
profits can be further increased if the monopolist offers three blocks, rather
than two.
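The block sizes found in example 11.2 can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the original example: the function names and the coarse grid search are our own assumptions, and the demand intercept is fixed at 10 as in the example.

```python
def profit(q1, q2, c):
    """Two-block profit for inverse demand p(q) = 10 - q and TC(q) = c*q."""
    p1 = 10 - q1                   # price of each unit in the first block
    p2 = 10 - q2                   # price of each unit in the second block
    return p1 * q1 + p2 * (q2 - q1) - c * q2


def two_block_solution(c):
    """Closed-form solution of the two first-order conditions in the example."""
    q2 = 2 * (10 - c) / 3
    q1 = q2 / 2
    return q1, q2


def grid_search(c, step=0.25):
    """Brute-force check on a coarse grid with 0 <= q1 <= q2 <= 10."""
    n = int(10 / step)
    candidates = [(i * step, j * step)
                  for i in range(n + 1) for j in range(i, n + 1)]
    return max(candidates, key=lambda pair: profit(pair[0], pair[1], c))


if __name__ == "__main__":
    q1, q2 = two_block_solution(4)
    print(q1, q2, profit(q1, q2, 4))   # 2.0 4.0 12.0
    print(grid_search(4))              # (2.0, 4.0)
```

Because the profit function is strictly concave in (q1, q2), the grid search and the first-order conditions point to the same pair of quantities when c = 4.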
Self-assessment 11.2. Consider the monopolist in example 11.2, but assume
that the inverse demand curve changes to p(q) = 16 − q, and its marginal cost is
c = 4. Follow the steps in example 11.2 to find the units that the monopolist sells
to each block, its corresponding prices, and the overall profits from doing so. Also,
find the profit that the monopolist obtains from setting a uniform price for all
customers, π M .
Non-linear pricing. Uniform pricing is also known as “linear pricing” because the monop-
olist charges the same price per unit, regardless of how many units the buyer purchases. In
contrast, second-degree price discrimination is known as “non-linear pricing” because the
price per unit is not constant in output. As example 11.2 illustrates, the monopolist sets a
relatively high price of $8 for the first block of units, but it offers a price discount ($6) for
all subsequent units. Hence, if the monopolist offers at least one price discount, the price
per unit is nonconstant (i.e., non-linear).
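To see why the price per unit is not constant under block pricing, the following Python sketch computes a buyer's total payment and average price per unit. The function names are hypothetical, and the default block size and prices are those of example 11.2 with c = 4 (2 units at $8, every further unit at $6).

```python
def total_payment(q, q1=2, p1=8, p2=6):
    """Total payment for q units under two-block pricing.

    The first q1 units cost p1 each; every unit beyond q1 costs p2.
    """
    if q <= q1:
        return p1 * q
    return p1 * q1 + p2 * (q - q1)


def average_price(q, **kwargs):
    """Average price per unit, which falls once the discount applies."""
    return total_payment(q, **kwargs) / q


if __name__ == "__main__":
    for q in (1, 2, 3, 4):
        print(q, total_payment(q), average_price(q))
```

Under uniform (linear) pricing, the average price would be the same for every q; here it stays at $8 up to 2 units and falls to $7 at 4 units.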
11.2.3 Third-Degree Price Discrimination

In third-degree price discrimination, the monopolist charges different prices to customers
with different demand curves. This entails that the monopolist, upon observing a potential
customer, can easily identify which group she belongs to. Mathematically, the monopolist
treats each group of customers as a separate monopoly, because they cannot resell the good
to customers in another group (i.e., the no-arbitrage condition holds). As a consequence, the
monopolist starts by finding the marginal revenue curve for each demand function, and then
it sets each of them equal to the firm’s marginal cost. Example 11.3 illustrates this pricing
strategy.
Example 11.3: Third-degree price discrimination. Consider a small town with
only one movie theater. As a monopolist, the movie theater faces two groups of
customers, students and non-students, which it can easily distinguish by checking whether
they have a student ID. Students have a lower willingness-to-pay for movies, captured
by inverse demand p1 (q) = 10 − q, whereas non-students have a higher willingness-
to-pay, measured by p2 (q) = 25 − q. The marginal cost of a ticket is the same for both
types of customers, MC = $3. In this scenario, the monopolist seeks to maximize its
profits from both groups, π1 + π2 , as follows:
\[
\max_{q_1, q_2} \ \pi = \pi_1 + \pi_2 = \underbrace{(10 - q_1)q_1 - 3q_1}_{\pi_1} + \underbrace{(25 - q_2)q_2 - 3q_2}_{\pi_2}.
\]
Differentiating with respect to q1 , we obtain 10 − 2q1 = 3 which, solving for q1 ,

yields q1 = 3.5 tickets. Differentiating now with respect to q2 , we find 25 − 2q2 = 3,
yielding q2 = 11 tickets.
Alternatively, this problem can be solved by noticing that profits from students, π1 ,
only depend on the number of tickets sold to this group, q1 ; and, similarly, profits
from non-students, π2 , only depend on the number of tickets sold to this group, q2 .
We can then write this maximization problem as two separate problems:
\[
\max_{q_1} \ \pi_1 = (10 - q_1)q_1 - 3q_1, \ \text{and} \qquad \text{(Students)}
\]
\[
\max_{q_2} \ \pi_2 = (25 - q_2)q_2 - 3q_2 \qquad \text{(Non-students)}
\]
That is, the firm treats each group of customers as a separate monopoly, setting the
monopoly rule MR1 = MC on students and MR2 = MC on non-students. In particular,
to maximize profits from students, the monopolist sets MR1 = MC, or 10 − 2q1 = 3,
which yields an output level of q1 = 3.5 units, selling each of them at a price of
p1 = 10 − 3.5 = $6.5. Similarly, to maximize profits from the non-student group, the
monopolist sets MR2 = MC, or 25 − 2q2 = 3, which yields an output level of q2 = 11
tickets, each sold at a price of p2 = 25 − 11 = $14. As a result, profits from both
groups become:
π1 + π2 = [(6.5 × 3.5) − (3 × 3.5)] + [(14 × 11) − (3 × 11)]

= 12.25 + 121,
implying that total profits are π = $133.25.
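Because each group is treated as a separate monopoly, the calculations in example 11.3 reduce to applying the same rule twice. The sketch below illustrates this in Python; the function name is ours, and it assumes a linear inverse demand p(q) = a − bq in each segment and a common constant marginal cost.

```python
def segment_monopoly(a, b, mc):
    """Monopoly outcome in one segment with inverse demand p(q) = a - b*q."""
    q = (a - mc) / (2 * b)         # MR = MC: a - 2bq = mc
    p = a - b * q                  # price charged to this segment
    return q, p, (p - mc) * q      # quantity, price, segment profit


if __name__ == "__main__":
    students = segment_monopoly(10, 1, 3)
    non_students = segment_monopoly(25, 1, 3)
    print(students)                        # (3.5, 6.5, 12.25)
    print(non_students)                    # (11.0, 14.0, 121.0)
    print(students[2] + non_students[2])   # 133.25
```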
Self-assessment 11.3. Consider the monopolist in example 11.3, but assume that
students’ inverse demand changes to p(q) = 16 − q. Follow the steps in example 11.3
to find the monopolist’s sales to each market segment, the corresponding prices, and
the profits. Compare your results against those in example 11.3.
Screening. In example 11.3, students pay much less than non-students at the movies,
reflecting their different demands ($6.50 versus $14). Customers might, however, try to pose
as part of the low-demand group to buy an item at a lower price. What can the monopolist
do to avoid such a strategy? The firm can rely on screening devices, such as student IDs,
to sort customers. That is, the firm cannot directly observe the customer’s demand for the
good and, if asked, the customer would have an incentive to lie to buy the good at a cheaper
price. The firm can, however, use screening to infer the customer’s unobserved demand. As
a consequence, screening must satisfy two key properties to work: (1) it must be perfectly
observable for the firm, such as a customer’s age, student status, or residence; and (2) it
must be strongly correlated with the customer’s WTP for the good. In the example of the
movie theater, a student ID can be observed by an employee at the ticket counter, and it is
negatively correlated with the customer’s WTP (because students’ budgets are often more
constrained than those of working adults).5
11.3 Bundling
Common examples of bundling are found in electronics, where you can buy a desktop com-
puter as a whole (with a monitor, CPU, keyboard, and mouse), or buy each of its units
separately. Similarly, in water parks, you can purchase an entry ticket with access to all
rides, or an entry ticket without access to rides (so you pay for each ride individually). As a
consequence, we can consider three forms of bundling:
• No bundling, where the firm does not bundle any good, allowing the buyer to purchase
each item individually (e.g., each part of the computer is sold separately).
• Pure bundling, where the firm allows the buyer to purchase either the bundle (e.g., the
whole computer) or no good at all.
• Mixed bundling, where the firm sets prices for each individual item and for the bundle,
allowing the buyer to choose whether to purchase individual items or the bundle.
Example 11.4 illustrates that the monopolist can increase its profits by offering
pure bundling, so long as the customer’s demand for the different items is negatively
correlated.
Example 11.4: Bundling. Consider a monopolist selling computers. Table 11.1
reports the WTP for the CPU alone, the monitor alone, or the bundle, for each
customer.
5. Common screening methods used by airlines (or traveling websites such as Orbitz or Kayak) include the number
of days in advance that the customer books her ticket because business travelers with a higher WTP often book their
tickets just a few days in advance. Other methods include whether she stays at her destination over Saturday night
(business travelers rarely do), the Internet Protocol (IP) address of the computer, tablet or smartphone on which
that search was done, and whether it was a repeated search from the same device. Other screening methods you
might have encountered are the number of days since the release of a new book or electronic gadget, where firms
charge a higher price during the first days after the release because customers with a high WTP rush to purchase
the item, but drop its price in a matter of days to target the general public, who are willing to wait a few weeks
before purchasing the item. (Firms often learn about cheaper ways to produce goods as time progresses; however,
a significant decrease in the item’s price a month after its release cannot be explained by lower costs of production,
but price discrimination could be the reason.)
Table 11.1
WTP for the CPU, the monitor, and the bundle for a computer purchase.

                CPU      Monitor   Both Items (Computer)
Consumer 1      500      100β      500 + 100β
Consumer 2      500α     100       500α + 100
Average cost    400      80        480
Starting with the first column, which describes WTP for the CPU, consumer 1 has
a WTP of $500, while consumer 2’s WTP is a share of that, 500α, where α ∈ (0, 1).
Similarly, the second column indicates that consumer 2’s WTP for the monitor is $100,
whereas that of consumer 1 is lower, 100β, where β ∈ (0, 1). Therefore, consumer 1
has the higher WTP for the CPU, but the lower for the monitor; in contrast, consumer 2
has the higher WTP for the monitor but the lower for the CPU.6 The last column sums,
for every consumer, the WTP across all items, in order to find her total WTP for the
bundle. For simplicity, assume that consumer 1’s WTP is larger than that of consumer
2 (500 + 100β > 500α + 100).7 The last row represents the average cost (i.e., cost per
unit) that the firm incurs. We next separately analyze the profits from not practicing
bundling and from pure bundling, examining which pricing strategy generates the
higher profit.
No bundling. In this case, the firm sells the CPU at either $500 or $500α. If the
firm sells the CPU at the lower of these two prices, $500α, it entices both types of
consumers to buy the CPU, earning profits of
(2 × 500α) − (2 × 400) = 1,000α − 800.
If the firm, instead, chooses to set the price equal to consumer 1’s WTP for the
CPU, $500, its profits are only 500 − 400 = 100. As a consequence, the firm will
choose to entice both consumers only if 1,000α − 800 > 100 or, solving for α, if
α > 0.9. Intuitively, the firm entices both types of consumers when consumer 2’s
WTP for the CPU is relatively close to that of consumer 1 (i.e., parameter α is
close to 1). Otherwise, selling to the buyer with the higher WTP (consumer 1) is
more attractive.
6. Note that the negative correlation in WTP holds because α, β ∈ (0, 1). If β > 1, however, then consumer 1 would
have the higher WTP for the CPU and the monitor, while consumer 2 would exhibit the lower WTP for both items.
In other words, WTP would now be positively correlated. The end-of-chapter exercises explore this possibility,
showing that the seller no longer has incentives to offer pure bundling.
7. If we solve for parameter α in this inequality, we obtain that consumer 1’s WTP for both items is larger than
that of consumer 2 if \(\alpha < \frac{4+\beta}{5}\). That is, consumer 2 cannot have a WTP for the CPU close to that of consumer 1;
otherwise, her WTP for the sum of both items would be larger.
A similar argument applies to the pricing of the monitor. The firm can choose to set
the monitor’s price at the lowest WTP, $100β, and entice both customers, generating
a profit of
(2 × 100β) − (2 × 80) = 200β − 160.
Alternatively, the firm can choose to price at consumer 2’s WTP, $100, attracting only
this consumer to buy the monitor. This would give the firm a profit of 100 − 80 = $20.
As a result, the firm would choose to sell to both consumers only if 200β − 160 > 20
or, after solving for β, if \(\beta > \frac{180}{200} = 0.9\). A similar intuition applies to the monitor:
the firm entices both types of customers, so long as consumer 1’s WTP is sufficiently
close to that of consumer 2 (i.e., parameter β is close to 1). Otherwise, selling to the
buyer with the higher WTP (consumer 2) is more attractive.
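The pricing decision for each separate item follows the same logic: price at the lower WTP and sell two units, or price at the higher WTP and sell one. The Python sketch below illustrates it; the function name and the tuple it returns are our own conventions, and the WTP values are passed as numbers (for instance, 475 stands for 500α with α = 0.95).

```python
def best_single_item_price(high_wtp, low_wtp, unit_cost):
    """Pick the more profitable of two prices for one item and two consumers.

    Returns (price, units_sold, profit). On a tie the firm sells to one
    consumer only, matching the strict inequality used in the text.
    """
    sell_to_one = (high_wtp, 1, high_wtp - unit_cost)
    sell_to_both = (low_wtp, 2, 2 * (low_wtp - unit_cost))
    # max() keeps the first argument on ties, so sell_to_one wins ties.
    return max(sell_to_one, sell_to_both, key=lambda option: option[2])


if __name__ == "__main__":
    print(best_single_item_price(500, 475, 400))   # CPU, alpha = 0.95
    print(best_single_item_price(500, 400, 400))   # CPU, alpha = 0.80
    print(best_single_item_price(100, 50, 80))     # monitor, beta = 0.50
```

In the first call the firm sells the CPU to both consumers (α exceeds 0.9); in the other two it serves only the consumer with the higher WTP.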
Bundling. With pure bundling, the firm sets a single price for the combination of
CPU and monitor (i.e., the whole computer). As in the previous discussion
about the individual items, the firm has two pricing options. First, it can set a
price equal to consumer 1’s WTP, 500 + 100β, and only entice her, which generates a
profit of
(500 + 100β) − 480 = 20 + 100β.
Second, the firm can set a price equal to consumer 2’s WTP (the lower WTP for the
computer), 500α + 100, inducing both consumers to purchase the computer, yielding
a profit of
2 × (500α + 100) − (2 × 480) = 1,000α − 760.
Therefore, the firm entices both consumers if 1,000α − 760 > 20 + 100β or, after
solving for parameter α, if α > 0.78 + 0.1β. Figure 11.3 depicts line α = 0.78 + 0.1β,
which originates at 0.78 and increases in β at a rate of 0.1, along with the two cutoffs
found in the previous discussion (namely, a horizontal line α = 0.9 and a vertical line
β = 0.9). Figure 11.3 illustrates the six regions that the previous discussion generated,
which we describe separately next. (For compactness, let ᾱ denote 0.78 + 0.1β, so we
can write α > ᾱ.)
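The two bundle-pricing options can also be compared numerically. The following Python sketch is illustrative only: the function names are hypothetical, the numbers come from table 11.1, and the parameter pairs are chosen so that consumer 1 has the larger WTP for the bundle, as assumed in the example.

```python
def bundle_profits(alpha, beta):
    """Profits from the two pure-bundling price options in example 11.4."""
    high_price = 500 + 100 * beta          # consumer 1's WTP for the bundle
    low_price = 500 * alpha + 100          # consumer 2's WTP for the bundle
    sell_to_one = high_price - 480         # only consumer 1 buys
    sell_to_both = 2 * (low_price - 480)   # both consumers buy
    return sell_to_one, sell_to_both


def alpha_cutoff(beta):
    """Cutoff 0.78 + 0.1*beta above which selling to both is more profitable."""
    return 0.78 + 0.1 * beta


if __name__ == "__main__":
    for alpha, beta in [(0.88, 0.5), (0.80, 0.5)]:
        one, both = bundle_profits(alpha, beta)
        # Both comparisons agree: True True, then False False.
        print(alpha, beta, both > one, alpha > alpha_cutoff(beta))
```

Comparing the two profits directly gives the same answer as comparing α against the cutoff 0.78 + 0.1β, which is the condition derived in the text.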
Region I. If α > 0.9 and β > 0.9, condition α > ᾱ holds.8 In this scenario, the firm
prefers to sell the CPU, the monitor, and the bundle to both customers. It prefers to
sell the bundle rather than the separate items (practicing no bundling) because
\[
\underbrace{1{,}000\alpha - 760}_{\text{Profits from the bundle}} > \underbrace{(1{,}000\alpha - 800)}_{\text{Profits from the CPU}} + \underbrace{(200\beta - 160)}_{\text{Profits from the monitor}}
\]
simplifies to −760 > 200β − 960, or β < 1, which holds by assumption (negatively
correlated demands).
8. To see this point, note that cutoff ᾱ reaches its highest point at β = 1, where ᾱ = 0.78 + 0.1 = 0.88, lying below
the horizontal line α = 0.9. Then, condition α > ᾱ holds for all values of β if α > 0.9.
Figure 11.3
Bundling incentives as functions of α and β. (Horizontal axis: β; vertical axis: α. The lines α = 0.9, β = 0.9, and α = 0.78 + 0.1β delimit regions I to VI.)
Region II. If α > 0.9 but β < 0.9, condition α > ᾱ still holds. However, the firm now
sells the CPU and the bundle to both customers, and the monitor to customer 2 alone.
In this context, the firm offers bundling given that
\[
1{,}000\alpha - 760 > (1{,}000\alpha - 800) + 20
\]
collapses to 780 > 760.
Region III. If α < 0.9, β > 0.9, and α > ᾱ, the firm sells the monitor and bundle to
both customers, but the CPU to customer 1 alone. Therefore, the firm offers bundling
if
\[
1{,}000\alpha - 760 > 100 + (200\beta - 160),
\]
which yields 1,000α > 700 + 200β, or α > 0.7 + 0.2β. Figure 11.4 depicts the line
α = 0.7 + 0.2β on the regions identified in figure 11.3. This dashed line originates
at 0.7 and reaches a height of 0.9 when β = 1, and it crosses cutoff ᾱ at β = 0.8,9
thus dividing region III into two areas: one where α > 0.7 + 0.2β holds (above the
dashed line at the top of region III) and the firm prefers to bundle; and another where
this condition is violated (at the bottom of region III), and the firm sells each item
separately.
Figure 11.4
Bundling incentives in region III. (Horizontal axis: β. Lines shown: α = 0.9, α = 0.78 + 0.1β, and the dashed line α = 0.7 + 0.2β, with β = 0.8 and β = 0.9 marked.)
Region IV. If α < 0.9, β < 0.9, and α > ᾱ, the firm sells the bundle to both customers,
the CPU to customer 1 alone and the monitor to customer 2 alone. It offers bundling
in this region as well, given that
\[
1{,}000\alpha - 760 > 100 + 20,
\]
which yields 1,000α > 880, or α > 0.88. Because condition α > ᾱ is satisfied in this
region, and cutoff ᾱ reaches its highest point at 0.88, condition α > 0.88 holds for all
points in region IV.
Region V. If α < 0.9, β > 0.9, and α < ᾱ, the firm sells the monitor to both
customers, the CPU to customer 1 alone and the bundle to customer 1 alone. In this
scenario, the firm does not offer bundling because
\[
20 + 100\beta < 100 + (200\beta - 160),
\]
which simplifies to 80 < 100β, or 0.8 < β. This condition on β holds because β > 0.9
is satisfied by all the points in region V. Therefore, the firm does not offer bundling in
this region.
Region VI. If α < 0.9, β < 0.9, and α < ᾱ, the firm sells the CPU to customer 1
alone, the monitor to customer 2 alone, and the bundle to customer 1 alone. In this
context, offering bundling is unprofitable, given that
\[
20 + 100\beta < 100 + 20,
\]
which collapses to 100β < 100, or β < 1, which holds by assumption (negatively
correlated demands).
In summary, the firm finds bundling profitable in regions I, II, and IV, which can be
defined by condition α > ᾱ in figure 11.3, and in the top area of region III, defined by
α > 0.7 + 0.2β. Otherwise, the firm sells each item separately. Intuitively, conditions
α > ᾱ and α > 0.7 + 0.2β indicate that the WTP of customers 1 and 2 for the CPU
are relatively similar (indeed, they are identical when α = 1). In contrast, when α < ᾱ,
their demands are so different that the firm prefers to sell each item separately (no
bundling).
9. To understand this, set the equations of both lines equal to each other (0.78 + 0.1β = 0.7 + 0.2β). Rearranging,
we obtain 0.08 = 0.1β, which, solving for β, yields β = 0.8. You can also find the height that both lines reach at
their crossing point by inserting β = 0.8 into the equation of either line (i.e., α = 0.78 + (0.1 × 0.8) = 0.86), thus
lying below the horizontal line depicting α = 0.9 in figure 11.4.
Self-assessment 11.4. Consider the bundling table in example 11.4. Assume now
that the average cost of the CPU decreases to $300. Follow the steps in example 11.4 to
find under which conditions the firm chooses to sell different items to each consumer.
Compare your results against those in example 11.4.
Exercises
1. Price discrimination with different demands.B Consider a monopolist selling to two markets.
Every market faces a different demand function, which the monopolist can observe, and the
monopolist can charge different prices in each market, thus practicing third-degree price discrimination.
For simplicity, assume that marginal cost is c > 0 in both markets.
(a) Linear demand. Consider that the inverse demand in each market i is given by pi (qi ) = ai −
bi qi , where i = {1, 2}. Find the profit-maximizing price and quantity in each market. Under
which conditions on the parameters (ai , bi , and c) does the monopolist charge the same price
in both markets?
(b) Constant elasticity of substitution (CES) demand. Consider now that the direct demand in each
market i is given by \(q_i(p_i) = A_i p_i^{-b_i}\), where i = {1, 2}. (Recall that the exponent −bi indicates
the elasticity of substitution, which is just a number, and thus is constant in qi.) Find the
profit-maximizing price and quantity in each market. Under which conditions on parameters (Ai, bi,
and c) does the monopolist charge the same price in both markets?
2. Comparing discrimination profits.B Consider a cooperative of wheat producers in Washington
State, who sell their products to two types of customers: households, with demand q1 = 100 − 3p1 ;
and firms, with demand q2 = 100 − 12 p2 . The cooperative operates as a monopolist in the area, and
its cost function is C(q) = 1,300 + 4q, where q = q1 + q2 denotes aggregate output. There is no
possibility of arbitrage between the two groups.
(a) Third-degree price discrimination. Set up the PMP for the cooperative. Find the optimal output
levels and prices for each group of customers.
(b) First-degree price discrimination. Assume now that the cooperative can practice first-degree
price discrimination. Find the optimal output levels and prices for each group of customers.
(c) Compare the total profits of the cooperative when practicing each type of price discrimination.
3. Implementing price discrimination.A Describe how the following firms could implement price
discrimination. Be specific about the degree, which markets/consumers are charged higher or lower
prices, and what barriers they may face.
(a) Restaurants
(b) Airlines
(c) Cable providers
(d) Wheat growers
4. Third-degree price discrimination.A A local car wash faces two types of customers: older, more
traditional customers, who like their cars to sparkle all the time, with demand q1 = 75 − 2p1 ; and
the younger generation, who do not care as much about having a clean car, with demand q2 =
25 − 4p2 . Each car costs $2 to wash, and the fixed costs of the car wash are $50. Find the price
and number of cars washed for each group if the car wash were to price-discriminate. What is its
total profit?
5. When to price-discriminate.B Consider a monopolist selling to two markets, each with a different
demand: (high) pH = aH − bH qH , and (low) pL = aL − bL qL , where aH > aL and bH ≤ bL , so that
demand is greater in the high market for any price. The firm practices third-degree price discrimi-
nation and has a constant marginal cost MC = c > 0. Is there a level of cost where the monopolist
chooses not to sell to the low market?
6. Third-degree price discrimination with elasticity.A A microchip manufacturer produces
microchips used in cell phones and sells them in two countries: the United States (US) and
Japan (J ). The price elasticity in the United States is εUS = −1.5; and in Japan, it is εJ = −2.5.
If the monopolist practices third-degree price discrimination, the marginal cost of producing the
chips is $50, and the cost of shipping the chips to Japan is $5 per chip, what price does it set in
each country?
7. Willingness to price discriminate.B An airline has been collecting data to estimate demand
for flights between Greenville, SC and Seattle, for which it would be the only provider. It has
estimated this demand to be p = 1,000 − 2q. The total cost (in dollars) of this flight is TC(q) =
50,000 + 20q.
(a) Uniform pricing. If the airline cannot discriminate, what price does it charge, how many
tickets does it sell, and what is its profit?
(b) First-degree price discrimination. If the airline can do first-degree price discrimination (based
on information it receives through its partners during online booking), how many tickets does
it sell, and what is its profit?
(c) Information acquisition. If the airline has to pay for the information on prices through its
partner in order to practice first-degree price discrimination, how much is it willing to
pay for that service?
8. Ineffective price discrimination.B A local movie theater has been worried about customers pos-
ing as students to purchase movie tickets at a lower price. To combat this, the owner of the
theater is interested in purchasing a student ID scanner from the local university, at a cost of $75.
The movie theater faces inverse demand of pS = 25 − 0.1qS for students, and inverse demand
pO = 30 − 0.1qO for all other customers. The theater faces a marginal cost of $2 per customer.
(a) If the theater price-discriminates, what does it charge each group, and what is its total profit?
(b) If every non-student can pass as a student and get in at student prices, how many non-students
will go to the theater? (Hint: Plug the student price into the non-student demand.) What is
the theater’s profit?
(c) What is the theater’s profit if it purchases and uses the student ID scanner? Should it purchase
the scanner?
9. Deciding when to withdraw from a market.B Clarke’s Crisp Croissants has a monopoly on
the local market for breakfast pastries, which it makes at zero marginal cost. Demand in the
local market is qL = 10 − pL . The firm also sells croissants in a neighboring town with demand
qN = 5 − pN , where transportation costs are zero.
(a) Uniform price. If Clarke chooses to set a uniform price (i.e., the same price in all markets),
what is the profit-maximizing price, quantity in each market, and total profit?
(b) Third-degree price discrimination. If Clarke employs third-degree price discrimination, what
price does it set in each market? How much does the firm sell in each market? What is Clarke’s
total profit? Is it profitable for it to price-discriminate?
(c) Demand change. If the neighboring town’s demand falls to qN = 2.5 − pN , should the
monopolist set a uniform price that ignores the neighboring market?
10. Quantity discounts.B Phil’s Paper Supply is a monopoly in a small Midwestern town; it sells
paper and faces the inverse demand function p(q) = 25 − 0.01q and has total cost of TC(q) =
10 − 5q + 2q².
(a) Uniform pricing. If Phil charges a single price for his paper, what does he charge, how many
units does he sell, and what is his profit?
(b) Offering quantity discounts. Phil wants to reward his major customers by offering a quantity
discount. How much does Phil charge for the first q1 units of paper, and how many units does
he sell? How much does Phil charge for the next block of paper q2 − q1 , and how many units
does he sell in that block?
(c) Calculate Phil’s profit if he decides to offer the quantity discount found in part (b). Should he
implement this pricing strategy?
11. Second-degree price discrimination.A Some local water companies offer a discount to cus-
tomers who use large quantities of water. Consider a local water utility that faces the inverse
demand p(q) = 100 − 10q. If the water utility has a marginal cost of $0.5 per unit of water, find
the price and quantity the water utility sells if it practices price discrimination with two blocks,
q1 and q2 − q1 .
12. Second-degree price discrimination with more general demand.B Consider a monopolist
facing the inverse demand function p(q) = a − bq, with total cost of TC(q) = cq, where a > c > 0.
(a) Write down the firm’s profit maximization problem if it were to practice second-degree price
discrimination to two blocks, q1 and q2 − q1 .
(b) Find the price that it would charge in each block, and how many units are in each block.
(c) Show that the price that it charges in the first block q1 is greater than the price in the second
block q2 − q1 .
13. Second-degree price discrimination with three blocks.C Consider the demand function from
example 11.2, p(q) = 10 − q, and marginal cost c = 4. Consider now that the monopolist wants
to add a third block of discounts, q3 − q2 . Set up and solve the monopolist’s problem in this case.
Compare your answer to the results found in example 11.2, with two blocks.
14. Price discrimination and advertising.C Annie’s Apples is a monopoly selling caramel apples
with demand p(q, A) = 100 − q + √A, where q is the number of apples and A is the advertising
expenditure. Annie’s marginal cost is $2 per apple.
(a) If Annie wishes to offer a quantity discount, how much does she charge for each of the two
blocks of consumers? How much does Annie spend on advertising?
(b) If advertising is banned by law, so A = 0, what is the firm’s optimal block pricing strategy?
How are your results in part (a) affected? Interpret.
15. Bundling prices.A A fast food restaurant faces two types of consumers and is deciding on a
bundling strategy. The table here reports each consumer’s WTP for hamburgers, french fries, and
the bundle of both items, as well as the average cost of each good.
                Hamburger    French Fries    Both items (bundle)
Consumer 1      $3           $4              $7
Consumer 2      $5           $2.5            $7.5
Average cost    $2           $2              $4
(a) What prices should the restaurant set for each food item if it sells the items separately? What
is its profit in each scenario?
(b) What prices should the restaurant set if it sells the bundle “meal” of a hamburger and french
fries?
(c) How do your answers change if the restaurant’s cost of beef increases so that the marginal
cost of hamburgers increases to $4?
16. Bundling–I.B Consider the scenario in example 11.4, but assume that individuals’ WTP for both
goods are positively correlated (i.e., parameter β = 1.4). For simplicity, assume that α = 0.92, as
in the following table:
                CPU      Monitor    Both Items (computer)
Consumer 1      $500     $140       $640
Consumer 2      $460     $100       $560
Average cost    $400     $80        $480
Show that the firm has no incentive to bundle in this case of positively correlated demands.
17. Bundling–II.B TV-Net, a local cable TV and internet provider, is deciding on a bundling strategy.
This table reports the WTP for TV alone, internet alone, and the bundle for each customer, as
well as the average cost.
                TV      Internet    Both Items (bundle)
Consumer 1      60      50β         60 + 50β
Consumer 2      60α     50          60α + 50
where α, β ∈ (0, 1). For simplicity, assume that customer 1 has the highest WTP for the bundle
of both goods. Repeat the analysis from example 11.4 to show when the firm should prefer to
bundle, sell the items separately, or choose a mixed-bundling strategy.
18. Why bundle?A Many TV providers offer cable TV, internet, and telephone services, which are
often bundled in different ways. Explain the intuition behind offering each separately, bundling
two out of three services, and bundling all three services together.
19. Bundling–III.B TV-Net is deciding on a bundling strategy. This table reports the WTP for TV
alone, internet alone, and the bundle for each customer, as well as the average cost.
                TV      Internet    Both items (bundle)
Consumer 1      60      40          100
Consumer 2      40      50          90
Consumer 3      25      60          85
(a) If TV-Net only faces consumers 1 and 2, does it prefer to bundle, sell the items separately, or
choose a mixed-bundling strategy?
(b) If TV-Net faces all three consumers, does it prefer to bundle, sell the items separately, or
choose a mixed-bundling strategy?
20. Two-part tariff.B A two-part tariff is another price-discrimination method where the producer of
a good is able to capture the entire consumer surplus. An example of this might be an amusement
park that charges a fee for entry (the tariff), and then charges the customer for each ride (by buying
tickets). Let’s investigate how a firm sets the optimal two-part tariff by assuming that we have 100
consumers, each with demand for rides of p = 9 − q, and the costs of running the amusement park
are C(q) = 100 + q.
(a) Uniform pricing. If the firm acts as a monopoly, setting a single price, what is its profit-
maximizing price, quantity of rides (per person and aggregate), and profit?
(b) Marginal cost pricing. If the firm sets its price per ride equal to marginal cost, what is the
number of rides it will sell (per person and aggregate) and consumer surplus?
(c) Two-part tariff. If the amusement park uses a two-part tariff, setting its entrance fee equal to
consumer surplus while charging a price per ride equal to its marginal cost, what is its total
profit?
12 Simultaneous-Move Games
12.1 Introduction
This chapter is the first one to analyze game theory tools in economics, a discussion that
we expand upon in chapter 13. Many of the tools we present are then applied to analyze
imperfectly competitive markets (chapter 14), contract theory (chapter 16), and externalities
and public goods (chapter 17).
We start by describing the contexts in which economists and other social scientists refer
to a situation where agents interact as a “game,” and how to represent these scenarios graph-
ically using either matrices or game trees. The remainder of the chapter focuses on how to
predict players’ behavior in various games, which we do by deploying various “solution
concepts” that help us identify equilibrium strategies where, intuitively, no player has any
incentive to change her strategy.
We start with a simple solution concept known as “strategic dominance.” Rather than
seeking to find which strategy provides the highest payoff to a given player, dominance
looks at which strategies a rational player would never use because other strategies give
her a strictly higher payoff, regardless of what her opponents do. Essentially, a dominated
strategy provides a player with an unambiguously lower payoff than other strategies and, as
a result, we delete it from her set of available strategies. Deleting all strategies that players
regard as dominated is a straightforward tool for some games, which can help us provide
relatively precise equilibrium predictions about how players behave.
However, the application of strategic dominance may not delete many strategies (or have
no effect at all) in some games. In these cases, we need to rely on solution concepts that
help us predict players’ behavior more precisely. The concept of a player’s “best response”
can help us in this regard, as it identifies which strategy (or strategies) provides a player
with the highest possible payoff against each of the strategies that her opponents select. The
Nash equilibrium (NE), defined later in this chapter, then uses the notion of best response
by searching for a scenario (a profile of strategies, one for each player) where every player
plays a best response to her opponents’ strategies (i.e., mutual best response). We discuss
standard games in economics, such as the Prisoner’s Dilemma game, the Battle of the Sexes
game, and Coordination and Anticoordination games, and how they can apply to related
disciplines like business, finance, or political science.
For several games (most of those covered in Intermediate Microeconomics courses), the
notion of the NE helps us predict more precisely how players behave in equilibrium. Yet,
the NE does not offer a precise equilibrium prediction for some games if we assume that
players choose a specific strategy with 100 percent probability. In these games, we show
that allowing players to randomize across some (or all) of their available strategies allows
a precise NE prediction. The NE in which players randomize their strategies is known as
a “mixed-strategy Nash equilibrium.” We illustrate its application with an example of the
penalty kicks round in the 1999 Women’s World Cup final between the US and China. We
then describe how to depict the best response of each player in this type of game.
12.2 What Is a Game?
In economics and business, we often refer to a “game” every time we consider scenarios in
which one agent’s actions affect other agents’ well-being. For instance, when a firm increases
its output, it may lower market prices, which in turn decreases the profits of other firms in the
same industry. Similarly, when a country sets a higher tariff on imports, it may decrease the
volume of imports, at the expense of another country’s exports and welfare. As you probably
noticed, day-to-day life is packed with strategic contexts in which our actions affect the
well-being of other agents (either individuals, firms, or governments) and, as a consequence,
most contexts can be modeled as games.
We next describe the main ingredients of a game. Whether analyzing firm competition,
donations to a charity, or government subsidies, all strategic scenarios include the following
elements:
• Players. The set of individuals, firms, or countries, that interact with one another. When
we examine competition between two sellers, we say that there are only two players (i.e.,
two firms), whereas when analyzing the incentives to donate to a charity we may have
more than a million players (e.g., individuals receiving a phone call to contribute to a
charity).1
• Strategy. A complete plan describing which actions a player chooses in each possible sit-
uation (contingency). A strategy can be informally understood as an instruction manual:
a player opens the manual, looks for the page describing the contingency she is fac-
ing in the game (e.g., the actions other players chose, and what stage of the game the
player is at), and reads the action that the instruction manual tells her to choose in such a
situation.
1. Needless to say, we do not consider “games with one player,” as there is no one else to be affected by the player’s
actions.
                        Player 2
                    Left        Right
Player 1   Up       −4, −4      0, −7
           Down     −7, 0       −1, −1
Matrix 12.1
Example of a two-player game.
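[Figure 12.1: game tree. Player 1 (first mover) chooses A or B. Choosing B ends the game with
payoffs (0, 10). After A, player 2 (second mover) chooses a, yielding payoffs (4, 4), or b,
yielding payoffs (−2, −2). In each pair, the first number is player 1’s payoff and the second
is player 2’s.]
Figure 12.1
Example of a game tree.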
• Payoffs. A game must also list the payoff that every player obtains under each possible
strategy path. For example, if player 1 chooses A and players 2 and 3 choose B, the vector
of payoffs is ($5, $8, $7), where the first component of the triplet lists the payoff going to
player 1, $5, while $8 is accrued to player 2, and $7 to player 3.
Throughout the analysis, we assume that all players are rational. In a strategic scenario,
this requires that every player knows the rules of the game (i.e., who the players are, what
their available strategies are in each contingency, and their resulting payoffs in each case). In
addition, it requires that every player knows that every player knows the rules of the game,
and every player knows that every player knows… ad infinitum. To better understand this
assumption, consider that Ana and Felix are about to play checkers. “Rationality,” in this
context, means that they both know the rules of the game, that Ana (Felix) knows that Felix
(Ana) knows the rules of the game, that Felix (Ana) knows that Ana (Felix) knows that Felix
(Ana) knows the rules of the game, ad infinitum. In short, this assumption is often referred
to as “common knowledge of rationality,” and informally guarantees that every player can
put herself in the shoes of her opponent at any stage of the game to anticipate her moves.
Two graphical approaches. We will encounter two approaches to graphically represent
games: matrices and trees, as matrix 12.1 and figure 12.1 illustrate. In the case of matri-
ces, player 1 is typically located on the left side of the matrix, as she chooses rows (and she
is often referred to as the “row player”), whereas player 2 is placed at the top of the matrix
because she selects columns (and hence is called the “column player”). In matrix 12.1, for
instance, if player 1 chooses Up while player 2 picks Right, their payoff becomes (0, −7),
indicating that player 1’s payoff is zero, while player 2’s is −7. Matrices are often used to
represent games in which players choose their actions simultaneously.
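As a minimal sketch of the matrix representation, matrix 12.1 can be stored as a lookup table
from a pair of strategies to a pair of payoffs. The dictionary layout and the names matrix_12_1
and payoffs are illustrative assumptions, not notation used in this book.

```python
# Matrix 12.1 stored as a dictionary keyed by (row strategy, column strategy).
# Each value is (payoff to player 1, payoff to player 2).
matrix_12_1 = {
    ("Up", "Left"): (-4, -4),
    ("Up", "Right"): (0, -7),
    ("Down", "Left"): (-7, 0),
    ("Down", "Right"): (-1, -1),
}


def payoffs(game, row_strategy, column_strategy):
    """Return the payoff pair in the cell chosen by the two players."""
    return game[(row_strategy, column_strategy)]


# Player 1 chooses Up while player 2 picks Right.
u1, u2 = payoffs(matrix_12_1, "Up", "Right")
print(u1, u2)  # 0 -7
```

Keying each cell by the pair of strategies mirrors how we read a matrix: pick a row, pick a
column, and read the two payoffs in that cell.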
In the case of game trees, such as that shown in figure 12.1, players act sequentially,
with player 1 acting first (e.g., the leader) and player 2 responding to player 1’s action as a
follower. In figure 12.1, player 1 can choose between A and B. If player 1 chooses B, the
game is over and payoffs are distributed, whereas if she chooses A, player 2 is called on to
move (responding with either action a or b). For instance, if player 1 chooses A and player 2
responds with b, we say that (A, b) is the “strategy profile” (i.e., the list of player 1’s and
player 2’s strategies, or how the game was played).2
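The game tree in figure 12.1 can be sketched in a similar way, as nested decision nodes. The
encoding below (a dictionary for each decision node, a tuple of payoffs for each terminal node)
and the names tree_12_1 and play are hypothetical choices made only for illustration.

```python
# Figure 12.1 as a nested dictionary (an illustrative encoding).
# A decision node is {"player": ..., "moves": {action: subtree}};
# a terminal node is a tuple (payoff to player 1, payoff to player 2).
tree_12_1 = {
    "player": 1,
    "moves": {
        "A": {"player": 2, "moves": {"a": (4, 4), "b": (-2, -2)}},
        "B": (0, 10),
    },
}


def play(node, actions):
    """Follow a list of actions from the root and return the payoffs reached."""
    for action in actions:
        node = node["moves"][action]
    if not isinstance(node, tuple):
        raise ValueError("the action list stops before the game ends")
    return node


print(play(tree_12_1, ["A", "b"]))  # (-2, -2)
print(play(tree_12_1, ["B"]))       # (0, 10)
```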
Now that we know how to describe a strategic scenario between players (a game), we
turn to the main question of the chapter: How do we predict the way in which a game will
be played? In other words, how can we forecast players’ behavior in a competitive con-
text? In that regard, we seek to identify scenarios in which no player has any incentive
to alter her strategy choice, given the strategy of her opponents. In short, these scenar-
ios are called “equilibria” because players have no incentive to deviate from their strategy
choices.
12.3 Strategic Dominance
In this section, we analyze the first solution concept: strategic dominance. We define the
types of dominance next, and then we apply them to a standard game.
Strict dominance Player i finds that strategy $s_i$ strictly dominates another strategy $s_i'$
if choosing $s_i$ provides her with a strictly higher payoff than selecting $s_i'$, regardless of
her rivals’ strategies.
When strategy $s_i$ strictly dominates every other strategy $s_i'$ available to player i, we say
that $s_i$ is a “strictly dominant strategy.” As a consequence, a player wants to choose a strictly
dominant strategy because it provides her with an unambiguously higher payoff than any other
available strategy (i.e., regardless of the strategy her opponents select). In other words, one
specific strategy (her strictly dominant strategy) yields a higher payoff, regardless of the beliefs
that she holds about her rivals’ choices.3
2. If, when player 2 responds with b, player 1 is again called on to move in the third stage of the game, choosing
between two new actions C and D, then an example of a strategy profile would be (AC, b), where player 1 chooses
A, player 2 responds with b, and player 1 ultimately responds with C in the last stage of the game. Note that, to
illustrate that actions A and C are selected by player 1 (although one is in stage 1 and the other is in stage 3), we
list both of them together in the first element of her strategy pair.
In contrast, in this definition, we say that strategy $s_i'$ is “strictly dominated” by strategy $s_i$.
Intuitively, a strictly dominated strategy gives player i a strictly lower payoff, regardless of
her rivals’ choice. We then expect a rational player to never choose such a strategy.
Tool 12.1 provides a step-by-step road map on finding dominant strategies, while
example 12.1 puts the tool to work.
Tool 12.1. How to find a strictly dominant strategy:
1. Focus on the row player by fixing your attention on one strategy of the column player
(i.e., one specific column).
(a) Cover with your hand all columns you aren’t considering.
(b) Find the highest payoff for the row player by comparing, across rows, the first com-
ponent of every pair.
(c) For future reference, underline this payoff.
2. Repeat step 1, but now fix your attention on a different column.
3. If, after repeating step 1 enough times, you find that the highest payoff for the row player
always occurs at the same row (your underlined payoffs are all on the same row), this row
becomes her dominant strategy. Otherwise, she does not have a dominant strategy.
4. For the column player, the method is analogous, but now fix your attention on one strategy
of the row player (one specific row), covering with your hand all other rows you aren’t
considering, and comparing the payoffs of the column player (second component of every
pair) across columns.
Example 12.1: Finding strictly dominant strategies Matrix 12.2a considers two
firms simultaneously and independently choosing a technology, either A or B for
firm 1, and a or b for firm 2. (All payoffs in these examples are in the millions of
dollars.) We can easily show that technology A is strictly dominant for firm 1 because
it yields a higher payoff than B, both when firm 2 chooses a in the left column (because
5 > 3) and when it selects b in the right column (given that 2 > 1).4
3. For a more formal (and shorter!) definition, we say that, in a game with two players i and j, player i finds strategy
$s_i$ to strictly dominate another strategy $s_i'$ if $u_i(s_i, s_j) > u_i(s_i', s_j)$ for every strategy $s_j$ of her rival.
That is, $s_i$ yields a higher payoff than $s_i'$ regardless of the strategy her rival (player j) picks.
4. When comparing the payoffs of the row player (firm 1 in this example), we focus on the first number in every
pair, such as 2 in the cell corresponding to (A, b) and 1 in the cell corresponding to (B, b).
                      Firm 2
                  Tech a    Tech b
Firm 1   Tech A   5, 5      2, 0
         Tech B   3, 2      1, 1
Matrix 12.2a
Technology choice game–I.
A similar argument applies to technology a, which is strictly dominant for firm 2
because it provides this firm with a higher payoff than b, both when firm 1 chooses
technology A (in the top row, where 5 > 0) and when it selects B (in the bottom row,
where 2 > 1).5 As a result, we can expect firm 1 to choose A and firm 2 to select a in
matrix 12.2a, yielding (A, a) as the equilibrium of this game.
The definition of strict dominance does not allow for ties in the payoffs that player i earns.
The next term, “weak dominance,” allows ties to occur.
Weak dominance Player i finds that strategy $s_i$ weakly dominates another strategy $s_i'$
if choosing $s_i$ provides her with a strictly higher payoff than selecting $s_i'$ for at least one
of her rivals’ strategies, and at least the same payoff as $s_i'$ for the remaining strategies
of her rivals.
Therefore, a weakly dominant strategy yields at least the same payoff as the other available
strategies against every strategy of the player’s rivals, and a strictly higher payoff against at
least one of them.6 In matrix 12.2b, firm 1 finds that technology A weakly dominates B because A
yields a higher payoff than B against a (when firm 2 chooses the left column, where 5 > 3), but
provides firm 1 with exactly the same payoff as B, $2, against b (when firm 2 selects the right column).
A similar argument applies to firm 2, which finds that technology a weakly dominates b.
Indeed, technology a yields a higher payoff than b when firm 1 selects A (5 > 0, on the top
row of the matrix), but generates the same payoff as b, $1, when firm 1 chooses B at the
bottom row.
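The comparison just described can be sketched as a small check. The function below follows the
formal definition in footnote 6 (a weakly higher payoff against every rival strategy, strictly
higher against at least one), so a strategy that strictly dominates another also passes the test.
The name weakly_dominates and the dictionary encoding of matrix 12.2b are assumptions made for
illustration.

```python
def weakly_dominates(game, rows, cols, player, s, s_prime):
    """True if strategy s weakly dominates s_prime (player 0 = row, 1 = column)."""
    rivals = cols if player == 0 else rows

    def payoff(own, rival):
        cell = (own, rival) if player == 0 else (rival, own)
        return game[cell][player]

    # Payoff advantage of s over s_prime against each rival strategy.
    gaps = [payoff(s, r) - payoff(s_prime, r) for r in rivals]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)


# Matrix 12.2b: firm 1 picks the row (A or B), firm 2 picks the column (a or b).
matrix_12_2b = {
    ("A", "a"): (5, 5), ("A", "b"): (2, 0),
    ("B", "a"): (3, 1), ("B", "b"): (2, 1),
}
rows, cols = ["A", "B"], ["a", "b"]
print(weakly_dominates(matrix_12_2b, rows, cols, 0, "A", "B"))  # True
print(weakly_dominates(matrix_12_2b, rows, cols, 1, "a", "b"))  # True
print(weakly_dominates(matrix_12_2b, rows, cols, 0, "B", "A"))  # False
```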
5. When comparing the payoffs of the column player (firm 2 in this example), we focus on the second number in
every pair, such as 2 in the cell corresponding to (B, a) and 1 in the cell corresponding to (B, b).
6. More formally, a player i finds strategy $s_i$ to weakly dominate another strategy $s_i'$ if $u_i(s_i, s_j) \geq u_i(s_i', s_j)$ for
every strategy $s_j$ of her rival, holding strictly for at least one strategy $s_j$. Note that the last part of the definition
(“… holding strictly for at least one strategy $s_j$”) is required to avoid a complete tie, where player i earns the same
payoff when choosing strategy $s_i$ and $s_i'$ against every strategy of her opponent, $s_j$.
                      Firm 2
                  Tech a    Tech b
Firm 1   Tech A   5, 5      2, 0
         Tech B   3, 1      2, 1
Matrix 12.2b
Technology choice game–II.
Self-assessment 12.1 Consider matrix 12.2a again, but assume that the payoff
when firms choose technology (B, b), in the lower-right cell of the matrix, is (3, 3),
indicating that both firms receive a payoff of $3, rather than the payoff of $1 obtained
in matrix 12.2a. Follow the steps in example 12.1 to find if either firm has a dominant
strategy. Interpret.
In matrices with more than two rows and/or columns, finding which strategies are strictly
dominated can prove particularly helpful. From the previous discussion, we know that a
rational player should not use a strictly dominated strategy because it yields a lower payoff
than other available strategies, regardless of the strategy her rivals pick. As a consequence,
we can delete from a matrix those strategies (rows or columns) that are strictly dominated
for one player because she would not choose them in any case.7 Once we have deleted
these dominated strategies for one player, we can move on to another player, and delete the
strategies she considers strictly dominated, and subsequently move on to another player;
this process is known as Iterative Deletion of Strictly Dominated Strategies (IDSDS). Once
we cannot find any more strictly dominated strategies for either player, we are left with the
equilibrium prediction according to IDSDS. We formally say that those strategy profiles
(i.e., cells) survive IDSDS.
In matrix 12.2a, this solution concept yields a precise equilibrium prediction: we can
delete technology B for firm 1 because it is strictly dominated by A, and b for firm 2 because
it is strictly dominated by a. Once we delete the bottom row corresponding to technology B
and the right column associated with technology b, we are left with a unique cell surviving
the application of IDSDS, corresponding to strategy profile (A, a), which predicts that firm 1
chooses A, while 2 selects a.
While IDSDS offers precise equilibrium predictions in some games, it provides imprecise
predictions in other cases, yielding multiple equilibria (i.e., several cells surviving IDSDS).
Example 12.2 illustrates this possibility.
7. More formally, we say that player i does not choose a strategy she finds to be strictly dominated, regardless of
the beliefs she sustains about the strategy that her rivals will pick. In plain English, player i could say: “I don’t care
which strategy my rivals pick—choosing a strictly dominated strategy makes me worse off!”
Example 12.2: When IDSDS does not provide a unique equilibrium Consider
matrix 12.3 representing the pricing decision of two firms. Each firm simultaneously
chooses whether to set high, medium, or low prices. In this context, let us apply IDSDS,
starting with firm 1. High is strictly dominated by Low because it yields a lower payoff
than that from Low, regardless of the price chosen by firm 2 (i.e., independent of the
column that firm 2 selects).8
                             Firm 2
                    High      Medium    Low
         High       2, 3      1, 4      3, 2
Firm 1   Medium     5, 1      2, 3      1, 2
         Low        3, 7      4, 6      5, 4
Matrix 12.3
When IDSDS yields more than one equilibrium–I.
                             Firm 2
                    High      Medium    Low
Firm 1   Medium     5, 1      2, 3      1, 2
         Low        3, 7      4, 6      5, 4
Matrix 12.4
When IDSDS yields more than one equilibrium–II.
After deleting the strictly dominated strategy High from firm 1’s rows in matrix
12.3, we are left with the reduced matrix 12.4, which has only two rows. We can put
ourselves in the shoes of firm 2, to see if we can find a strictly dominated strategy for
it. Specifically, Low is strictly dominated by Medium because Low yields a strictly
lower payoff than Medium, regardless of the row that firm 1 selects.9
After deleting the Low column from firm 2’s strategies, we are left with a further
reduced matrix (see matrix 12.5). We can now move again to analyze firm 1. At this
point, however, we cannot identify any more strictly dominated strategies for this firm
because there is no strategy (no row in matrix 12.5) yielding a lower payoff, regardless
of the column that firm 2 plays. Indeed, firm 1 prefers Medium to Low if firm 2
chooses High (in the left column) because 5 > 3; but it prefers Low to Medium if firm 2
chooses Medium (in the right column) given that 4 > 2. A similar argument applies
to firm 2 because there is no strategy (column) yielding a lower payoff, regardless of
the row that firm 1 selects: firm 2 prefers Medium to High if firm 1 chooses Medium
(in the top row) because 3 > 1, but it prefers High to Medium if firm 1 chooses Low
(in the bottom row) given that 7 > 6.
                         Firm 2
                    High      Medium
Firm 1   Medium     5, 1      2, 3
         Low        3, 7      4, 6
Matrix 12.5
When IDSDS yields more than one equilibrium–III.
8. To understand this, note that when firm 2 chooses High (in the left column of matrix 12.3), firm 1 obtains a
higher payoff with Low ($3) than with High ($2). Similarly, when firm 2 selects Medium (in the center column),
firm 1 receives a higher payoff from playing Low ($4) than from High ($1), and the same holds when firm 2 chooses
Low (in the right column), where firm 1’s payoff is $5 from Low, and only $3 from High. Therefore, we can claim
that High is strictly dominated by Low.
9. Indeed, when firm 1 chooses Medium (in the top row of matrix 12.4), firm 2 obtains a higher payoff by selecting
Medium (3) than from choosing Low (2). Similarly, when firm 1 selects Low (in the bottom row), firm 2 receives a
higher payoff from playing Medium (6) than from Low (4). Hence, we can claim that firm 2 finds Low to be strictly
dominated by Medium.
Therefore, the remaining four cells in matrix 12.5, (Medium, High), (Medium,
Medium), (Low, High), and (Low, Medium), constitute our most precise equilibrium
prediction after applying IDSDS. This is actually one of the disadvantages of IDSDS,
as well as a motivation to consider other solution concepts to predict equilibrium
behavior in games, as we discuss in the next section. The NE solution concept will
help us provide more precise predictions (or at least the same) about how players
behave in equilibrium.
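The deletion process in example 12.2 can be sketched in code. The routine below is only one
possible implementation of IDSDS: it considers dominance by pure strategies only, deletes one
strategy at a time, and checks the row player before the column player. The function name idsds
and the dictionary encoding of matrix 12.3 are illustrative assumptions.

```python
def idsds(game, rows, cols):
    """Iteratively delete strictly dominated pure strategies.

    Returns the surviving rows, the surviving columns, and the order of deletion.
    """
    rows, cols = list(rows), list(cols)
    deleted = []
    while True:
        # Row player: r is dominated if another row pays strictly more in every column.
        dominated_row = next(
            (r for r in rows
             if any(all(game[(alt, c)][0] > game[(r, c)][0] for c in cols)
                    for alt in rows if alt != r)),
            None,
        )
        if dominated_row is not None:
            rows.remove(dominated_row)
            deleted.append(("row", dominated_row))
            continue
        # Column player: c is dominated if another column pays strictly more in every row.
        dominated_col = next(
            (c for c in cols
             if any(all(game[(r, alt)][1] > game[(r, c)][1] for r in rows)
                    for alt in cols if alt != c)),
            None,
        )
        if dominated_col is not None:
            cols.remove(dominated_col)
            deleted.append(("column", dominated_col))
            continue
        return rows, cols, deleted


# Matrix 12.3: firm 1 picks the row, firm 2 picks the column.
prices = ["High", "Medium", "Low"]
matrix_12_3 = {
    ("High", "High"): (2, 3), ("High", "Medium"): (1, 4), ("High", "Low"): (3, 2),
    ("Medium", "High"): (5, 1), ("Medium", "Medium"): (2, 3), ("Medium", "Low"): (1, 2),
    ("Low", "High"): (3, 7), ("Low", "Medium"): (4, 6), ("Low", "Low"): (5, 4),
}
surviving_rows, surviving_cols, order = idsds(matrix_12_3, prices, prices)
print(order)           # [('row', 'High'), ('column', 'Low')]
print(surviving_rows)  # ['Medium', 'Low']
print(surviving_cols)  # ['High', 'Medium']
```

The two surviving rows and two surviving columns correspond to the four cells of matrix 12.5.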
Self-assessment 12.2 Consider matrix 12.3 again, but assume that the payoff that
firms obtain from (High, Medium) is (3, 4) rather than (1, 4) in the top row of the
matrix. Which strategy profiles survive IDSDS? Compare your results against those
in example 12.2.
The application of IDSDS in example 12.2 left us with several equilibria. IDSDS nonetheless
helped us delete one strategy for each player because that strategy is strictly dominated; that
is, another strategy provides the player with a strictly higher payoff, regardless of her opponent’s
strategy. In some games, such as that in example 12.3, IDSDS does not even allow us to
delete a strategy for any player. In those cases, we say that IDSDS “doesn’t have a bite”
because IDSDS does not help us reduce the set of strategies that a rational player would
choose in equilibrium.
Example 12.3: When IDSDS does not have a bite Matrix 12.6 represents the
Matching Pennies game, which you may have played in your childhood. Players 1
and 2 each hold a penny in one hand, but don’t show it to each other. Both players must
then simultaneously show their coins, with the following payoffs: If both players show
Heads or they both show Tails, player 1 gets player 2’s penny, for a gain of 1, while
player 2 loses his penny, for a loss of 1. This is illustrated in the matrix by the cells
along the main diagonal, with payoffs (1, −1). However, if the players show different
sides of their coins, player 1 must give her penny to player 2, so player 2’s payoff is 1,
while player 1’s is −1, with payoffs (−1, 1).
                        Player 2
                    Heads       Tails
Player 1   Heads    1, −1       −1, 1
           Tails    −1, 1       1, −1
Matrix 12.6
Matching Pennies game.
To see that IDSDS has no bite, consider one player at a time. Player 1 does not
find any of her strategies strictly dominated: she prefers Heads when player 2 chooses
Heads (left column), but Tails when player 2 chooses Tails (right column). In short,
player 1 seeks to select the same strategy as player 2 because by doing so, she wins
a penny. As a consequence, we cannot find a strategy that player 1 does not use
regardless of player 2’s strategy (i.e., regardless of the column that player 2 chooses).
A similar argument applies to player 2. He prefers to choose the opposite strategy to
player 1 (i.e., choosing Tails when player 1 selects Heads (top row), but Heads when
she chooses Tails (bottom row)). In summary, no player has strictly dominated strategies,
implying that we cannot delete any row or column from the matrix. Therefore,
the application of IDSDS left us with the original matrix! In these cases, we say that
IDSDS has “no bite.”
Self-assessment 12.3 Consider matrix 12.6 again, but assume that the payoff from
both players choosing the same action (i.e., their payoff from (Heads, Heads) or their
payoff from (Tails, Tails)), is (0, 0) rather than (1, −1). Which strategy profiles survive
IDSDS? Compare your results against those in example 12.3.
12.4 Nash Equilibrium
From examples 12.1–12.3, we learned that applying IDSDS helps us delete all but one cell
from the matrix in some games (and thus predict a unique equilibrium in the game). For
other games, however, IDSDS deleted only a few strategies for each player, leaving several
surviving cells, thus providing a relatively imprecise equilibrium prediction (e.g., four strat-
egy profiles could emerge as equilibria of the game). And for some games, such as Matching
Pennies, applying IDSDS did not delete any strategy for any player, so we say that IDSDS
does not have a bite.
In this section, we examine a different solution concept which has “more bite” than
IDSDS, and thus offers either the same or more precise equilibrium predictions (i.e., fewer
strategy profiles can emerge as equilibria of the game). This solution concept, known as the
“Nash equilibrium” after Nash (1950), builds upon the notion that every player finds her
best response to each of her rivals’ strategies, and hence we start with the definition of “best
response.”10
Best response Player i regards strategy $s_i$ as a best response to her rival’s strategy $s_j$
if $s_i$ yields a weakly higher payoff than any other available strategy $s_i'$ against $s_j$.
Tool 12.2. How to find best responses in matrix games:
1. Focus on the row player by fixing your attention on one strategy of the column player
(i.e., one specific column).
(a) Cover with your hand all columns that you are not considering.
(b) Find the highest payoff for the row player by comparing the first component of every
pair.
(c) For future reference, underline this payoff. This is the row player’s best response to
the column that you considered from the column player.
2. Repeat step 1, but now fix your attention on a different column.
3. For the column player, the method is analogous, but now direct your attention on one
strategy of the row player (i.e., one specific row), cover with your hand all other rows
you are not considering, and compare the payoffs of the column player (i.e., second
component of every pair).
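The steps in tool 12.2 can be sketched as follows. The function name best_responses and the
dictionary encoding of the matrix are assumptions made for illustration; each best response is
returned as a list because ties are possible. The payoffs are those of matrix 12.2a.

```python
def best_responses(game, rows, cols):
    """Return the best responses of the row player and of the column player.

    br_row[c] lists the rows with the highest first payoff in column c;
    br_col[r] lists the columns with the highest second payoff in row r.
    """
    br_row = {
        c: [r for r in rows
            if game[(r, c)][0] == max(game[(x, c)][0] for x in rows)]
        for c in cols
    }
    br_col = {
        r: [c for c in cols
            if game[(r, c)][1] == max(game[(r, y)][1] for y in cols)]
        for r in rows
    }
    return br_row, br_col


# Matrix 12.2a: firm 1 picks the row (A or B), firm 2 picks the column (a or b).
matrix_12_2a = {
    ("A", "a"): (5, 5), ("A", "b"): (2, 0),
    ("B", "a"): (3, 2), ("B", "b"): (1, 1),
}
br_firm1, br_firm2 = best_responses(matrix_12_2a, ["A", "B"], ["a", "b"])
print(br_firm1)  # {'a': ['A'], 'b': ['A']}
print(br_firm2)  # {'A': ['a'], 'B': ['a']}
```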
We next use the concept of best response to define a NE as a scenario in which every player
chooses her best strategy, given the strategies chosen by her rivals. In such a scenario, no
player has unilateral incentives to deviate from her equilibrium strategy. Indeed, because she
is choosing a best response to her rivals’ strategies, deviating would only lower her payoff
(or leave it unaffected).
10. Formally, in a game with two players, strategy $s_i$ is a best response against player j’s strategy $s_j$ if and only if
$u_i(s_i, s_j) \geq u_i(s_i', s_j)$ for every strategy $s_i'$ that is different from $s_i$. That is, there is no other strategy $s_i'$ that provides
player i with a strictly higher payoff than $s_i$ against her opponent’s strategy $s_j$.

Nash equilibrium (NE) A strategy profile $(s_i^*, s_j^*)$ is a NE if every player chooses
a best response to her rivals’ strategies.
In other words, a strategy profile is a NE if it is a mutual best response: the strategy that
player i chooses is a best response to that selected by player j, and vice versa. As a result,
no player has incentives to deviate because doing so would either lower her payoff, or keep
it unchanged.11 Tool 12.3 describes how to find NEs. Example 12.4 illustrates how to find
best responses and then use these responses in our search of NEs.
Tool 12.3. How to find Nash equilibria:
1. Find the best responses of all players (see tool 12.2 for details).
2. Identify which cell or cells in the matrix have all payoffs underlined, meaning that all
players have a best response payoff. These cells are the NEs of the game.
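As a rough illustration of tool 12.3, the sketch below "underlines" best response payoffs by collecting the corresponding cells in two sets, one per player, and then keeps the cells that appear in both. The function name and data layout are our own assumptions; the payoffs are those of the Technology game used in example 12.4.

```python
def pure_nash_equilibria(payoffs):
    """Cells (row, column) in which both players' payoffs are underlined."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    underlined_row = set()  # cells where the row player's payoff is underlined
    underlined_col = set()  # cells where the column player's payoff is underlined
    # Step 1a: fix each column and underline the row player's highest payoff.
    for c in range(n_cols):
        best = max(payoffs[r][c][0] for r in range(n_rows))
        underlined_row |= {(r, c) for r in range(n_rows) if payoffs[r][c][0] == best}
    # Step 1b: fix each row and underline the column player's highest payoff.
    for r in range(n_rows):
        best = max(payoffs[r][c][1] for c in range(n_cols))
        underlined_col |= {(r, c) for c in range(n_cols) if payoffs[r][c][1] == best}
    # Step 2: a NE is a cell where every payoff is underlined.
    return sorted(underlined_row & underlined_col)

technology_game = [
    [(5, 5), (2, 0)],  # firm 1 plays A
    [(3, 2), (1, 1)],  # firm 1 plays B
]
print(pure_nash_equilibria(technology_game))  # [(0, 0)], that is, (A, a)
```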
Example 12.4: Finding best responses and NEs Using the left matrix from
example 12.1 again, let us first identify the best responses of each firm.
Firm 1’s best responses. When firm 2 chooses a in the left column, firm 1’s best
response is A (on the top row) because it yields a higher payoff than B (bottom row). To
see this, a common visual guide many students use is to cover with one hand (or a piece
of paper) the column that firm 2 is not choosing (b in this case, on the right side of the
matrix), leaving strategy a uncovered. Once you focus on the column corresponding
to a, it is obvious that firm 1’s best response is A, on the top row, because 5 > 3.
Following a similar approach, when firm 2 chooses b in the right column, firm 1’s
best response is … (cover with your hand the unchosen column a on the left side of
the matrix!) technology A, given that 2 > 1. Summarizing, firm 1’s best responses are
BR1 (a) = A when firm 2 chooses a and BR1 (b) = A when firm 2 selects b.12
11. Another way to describe a NE is by focusing on every player i’s beliefs about how her rivals will behave.
Therefore, player i’s beliefs assign a probability to each of her opponents’ strategies. Using that approach, we can
say that a NE is a system of beliefs (that is, a list of beliefs for each player) and a list of actions that satisfy two
properties: (1) every player uses a best response to her beliefs about how her rivals behave; and (2) the beliefs that
players sustain are, in equilibrium, correct. For simplicity, however, we focus on the definition given here.
12. In other words, firm 1 responds with technology A regardless of firm 2’s choice. Indeed, as shown in example
12.1, firm 1 finds strategy A to strictly dominate B; that is, strategy A yields a higher payoff than B independent of
the strategy chosen by firm 2 (i.e., regardless of the column firm 2 chooses).
Firm 2’s best responses. We can now follow the same approach to figure out the best
responses of firm 2 to each strategy chosen by firm 1. Let us first analyze the case
in which firm 1 selects technology A (in the top row). To focus your attention on the
strategy that firm 1 selects, cover with your hand the bottom row (corresponding with
the strategy firm 1 does not select). We can easily see that in this context firm 2’s best
response is BR2 (A) = a because 5 > 0. Similarly, when firm 1 chooses B (bottom row),
we cover the top row with a hand and find that firm 2’s best response is BR2 (B) = a
because 2 > 1. Like firm 1, firm 2 chooses a, regardless of the strategy chosen by its
rival (firm 1). Therefore, strategy profile (A, a) constitutes a mutual best response, and
thus the NE of the game. In particular, firm 1 does not have an incentive to deviate
from A when its rival chooses a, nor does firm 2 have the incentive to deviate from a
when its rival chooses A.
                      Firm 2
                  Tech a      Tech b
Firm 1   Tech A   5*, 5*      2*, 0
         Tech B   3, 2*       1, 1
Matrix 12.7
Finding best responses and NEs in Technology game–I (an asterisk marks an underlined best response payoff).
Faster tool: underlining BR payoffs. A common tool used to rapidly find NEs in
games is to underline best response payoffs (i.e., the payoff that a player obtains from
playing her best response to each of her opponents’ strategies). In the game we analyze
here, matrix 12.7 underlines the payoff that firm 1 obtains from choosing A against a
($5 in the left column) and b ($2 in the right column), and the payoff that firm 2 accrues
from selecting a against A ($5 in the top row) and B ($2 in the bottom row). Once
we are done underlining best response payoffs (see matrix 12.7), the cells where the
payoffs from all players are underlined must constitute a NE of the game because play-
ers are playing best responses against their rivals’ strategies (mutual best responses).
In this matrix, the NE solution concept provides the same equilibrium prediction as
IDSDS did in example 12.1, (A, a). We next examine matrix 12.1b from example 12.1,
showing that NE yields more precise predictions than IDSDS.
Matrix 12.8 reproduces matrix 12.1b for easier reference. It is easy to identify that
firm 1’s best responses are BR1 (a) = A when firm 2 chooses a (in the left column),
and BR1 (b) = {A, B} when firm 2 selects b (in the right column), where the latter indi-
cates that firm 1 is indifferent between responding with A or B when firm 2 chooses b
(in the right column) as both yield a payoff of 2. Similarly, firm 2’s best responses
are BR2 (A) = a when firm 1 chooses A (in the top row), and BR2 (B) = {a, b} when
firm 1 selects B (in the bottom row). We can follow this approach of underlining best
response payoffs, obtaining matrix 12.8.
                      Firm 2
                  Tech a      Tech b
Firm 1   Tech A   5*, 5*      2*, 0
         Tech B   3, 1*       2*, 1*
Matrix 12.8
Finding best responses and NE in Technology game–II (an asterisk marks an underlined best response payoff).
Two strategy profiles have the payoffs from all players underlined, (A, a) and (B, b),
which constitute the two NEs of the game. Recall that the application of IDSDS to
this game did not have a bite, as we could not delete any strategy as being strictly
dominated for either firm 1 or 2. As a consequence, we were left with all four cells
(four strategy profiles) as the most precise equilibrium prediction according to IDSDS.
We have now shown that the NE solution concept yields two NEs, thus providing a
more precise prediction than IDSDS.
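The contrast between IDSDS and NE in example 12.4 can be checked with a short sketch. The helper below is our own illustrative function (not a standard library routine); it only tests dominance by another pure strategy, which is all that is needed for these two matrices.

```python
matrix_12_7 = [[(5, 5), (2, 0)], [(3, 2), (1, 1)]]
matrix_12_8 = [[(5, 5), (2, 0)], [(3, 1), (2, 1)]]

def strictly_dominated(payoffs, player):
    """Indices of the pure strategies of `player` (0 = row, 1 = column)
    that are strictly dominated by another pure strategy."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    own = n_rows if player == 0 else n_cols
    rival = n_cols if player == 0 else n_rows

    def payoff(s, t):
        # Payoff to `player` from own strategy s against rival strategy t.
        return payoffs[s][t][0] if player == 0 else payoffs[t][s][1]

    return [s for s in range(own)
            if any(all(payoff(other, t) > payoff(s, t) for t in range(rival))
                   for other in range(own) if other != s)]

# Matrix 12.7: B (row 1) and b (column 1) are strictly dominated.
print(strictly_dominated(matrix_12_7, 0), strictly_dominated(matrix_12_7, 1))
# Matrix 12.8: the payoff ties mean no strategy is strictly dominated.
print(strictly_dominated(matrix_12_8, 0), strictly_dominated(matrix_12_8, 1))
```

In matrix 12.8 both lists come back empty, so IDSDS cannot delete anything, whereas underlining best response payoffs still singles out (A, a) and (B, b).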
Self-assessment 12.4 Consider matrix 12.8 again, but assume that the payoff play-
ers obtain from choosing technology (B, b), in the lower right side of the matrix, is
(3, 3). Intuitively, coordinating on the superior technology (A, a) is still preferable,
yielding a payoff (5, 5), but the payoff difference to (B, b) is now smaller than in
matrix 12.8. Find the NE of the game, and compare your results against those in
example 12.4. Interpret.
12.5 Common Games
In this section, we apply the NE solution concept to four common games in economics
and other social sciences: the Prisoner’s Dilemma game, the Battle of the Sexes game, the
Coordination game, and the Anticoordination game.
Example 12.5: Prisoner’s Dilemma game Consider the following scenario. Two
people have been arrested by the police, and they are placed in different cells so they
cannot communicate with each other (cell phones were left in custody too!). The
police have only minor evidence against them, which would lead to a minor sentence
(a year in jail). However, the police suspect that these two individuals committed a
specific crime, and separately offer each of them the following deal:
If you confess to the crime and your partner doesn’t, we will let you go home, while your partner
will serve 10 years in jail. If instead you don’t confess but your partner does, she will go home
and you will serve 10 years in jail. If both you and your partner confess, both of you will serve
5 years in jail. Finally, if neither of you confess, both of you will only serve one year in jail.
Matrix 12.9a describes this game, where all amounts are negative to represent that
years in jail generate a disutility. As in the previous discussion, we first identify best
responses for each player.
                           Player 2
                       Confess     Not confess
Player 1  Confess      −5, −5      0, −10
          Not confess  −10, 0      −1, −1
Matrix 12.9a
The Prisoner’s Dilemma game.
Player 1’s best responses. For player 1, we first fix player 2’s strategy at Confess (left
column), yielding BR1 (C) = C because −5 > −10. Similarly, fixing player 2’s strat-
egy at Not confess (right column), yields BR1 (NC) = C because 0 > −1. Therefore,
player 1 responds with Confess regardless of player 2’s strategy (both when player 2
confesses and when she does not).
Player 2’s best responses. Player 2’s best responses are symmetric because her
payoffs are symmetric to those of player 1, but we include the analysis here as further
practice. We first fix player 1’s strategy at Confess (top row), obtaining BR2 (C) = C
because −5 > −10; second, we fix player 1’s strategy at Not confess (bottom row),
yielding BR2 (NC) = C because 0 > −1. Overall, each player chooses Confess regardless
of her opponent’s strategy, thus indicating that Confess is a strictly dominant
strategy for both players. Underlining best response payoffs, we obtain matrix 12.9b.
                           Player 2
                       Confess     Not confess
Player 1  Confess      −5*, −5*    0*, −10
          Not confess  −10, 0*     −1, −1
Matrix 12.9b
The Prisoner’s Dilemma game – Underlining best response payoffs (an asterisk marks an underlined payoff).
As a result, (Confess, Confess) is the unique NE of the game, because in that strategy
profile, both players choose mutual best responses.
Self-assessment 12.5 Consider the Prisoner’s Dilemma game from matrix 12.9a
again. However, let us now assume that, when a player confesses while her part-
ner does not, police do not offer any deal to the confessing player. As a conse-
quence, payoff (−10, 0) becomes (−10, −1); and similarly, payoff (0, −10) becomes
(−1, −10). All other payoffs are unaffected. Find the NE of the game, and compare
your results against those in example 12.5. Interpret.
The NE in the Prisoner’s Dilemma game is rather somber: every player, seeking to maxi-
mize her own payoff, confesses, which entails that they both serve 5 years in jail. If, instead,
players could coordinate their actions and not confess, they would only serve 1 year in jail.
However, the individual incentives of each prisoner lead her to confess, both when her oppo-
nent confesses (as her sentence is reduced from 10 to 5 years) and when her opponent does
not confess (as her sentence is reduced from 1 year to zero). This game, hence, illustrates
strategic scenarios in which there is tension between the individual incentives of each player
and the collective interests of the group. In contexts that can be modeled like a Prisoner’s
Dilemma game, players’ equilibrium behavior does not result in the socially optimal out-
come (e.g., in example 12.5, the group’s payoff is maximized when no player confesses, and
every player only serves 1 year in jail). Scenarios in which similar conflicts arise between
individual and social incentives are common in economics, such as price wars between firms
(both firms would be better off by setting high prices, but each firm has individual incen-
tives to lower its own price to capture a larger market share); tariff wars between countries
(where both countries would be better off by setting low tariffs, but each country has individ-
ual incentives to raise its own tariff to protect its domestic industry); or the use of negative
campaigning in politics (where all candidates would be better off by not spending money on
negative campaigning, but each candidate has incentives to spend some money on it to win
the election).
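The tension between individual and collective incentives can be made concrete with a small sketch. We assume here, as a simplification of our own, that the group's payoff is measured by the sum of both players' payoffs; the function and variable names are also illustrative.

```python
labels = ["Confess", "Not confess"]
payoffs = [
    [(-5, -5), (0, -10)],   # player 1 confesses
    [(-10, 0), (-1, -1)],   # player 1 does not confess
]

def is_nash(payoffs, r, c):
    """True if no player gains by deviating unilaterally from cell (r, c)."""
    row_ok = all(payoffs[r][c][0] >= payoffs[k][c][0] for k in range(len(payoffs)))
    col_ok = all(payoffs[r][c][1] >= payoffs[r][k][1] for k in range(len(payoffs[0])))
    return row_ok and col_ok

cells = [(r, c) for r in range(2) for c in range(2)]
nash = [cell for cell in cells if is_nash(payoffs, *cell)]
# Assumption: the group's payoff is the sum of individual payoffs.
joint = {(r, c): sum(payoffs[r][c]) for r, c in cells}
social_optimum = max(cells, key=lambda cell: joint[cell])

print("NE:", [(labels[r], labels[c]) for r, c in nash], "joint payoff", joint[nash[0]])
print("Social optimum:", (labels[social_optimum[0]], labels[social_optimum[1]]),
      "joint payoff", joint[social_optimum])
```

The equilibrium cell yields a joint payoff of −10 (five years each), while the cell that maximizes the joint payoff, in which nobody confesses, yields −2.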
Example 12.6: Battle of the Sexes game Consider the following scenario: Ana
and Felix are incommunicado in separate areas of the city. In the morning, they talked
about where to go after work, the football game or the opera, but they never agreed on
where to go. Every individual must simultaneously and independently choose whether
to attend the football game (F) or the opera (O). As illustrated in matrix 12.10a, Felix
prefers to attend the football game if Ana is also there, followed by attending the opera
with Ana, followed by being at the football game without her, and finally by attending
the opera without her. Ana’s payoffs are symmetric: her most preferred event is being
at the opera with Felix, followed by being at the football game with him, followed by
being at the opera without him, and finally by being at the football game without
him.13
                      Ana
                  Football    Opera
Felix  Football   5, 4        3, 3
       Opera      2, 2        4, 5
Matrix 12.10a
The Battle of the Sexes game.
                      Ana
                  Football    Opera
Felix  Football   5*, 4*      3, 3
       Opera      2, 2        4*, 5*
Matrix 12.10b
The Battle of the Sexes game–underlining best response payoffs (an asterisk marks an underlined payoff).
Felix’s best responses. Following the previous approach to identifying best respon-
ses, we can find that Felix’s best responses are BRFelix (F) = F when Ana goes to the
football game because 5 > 2 (see left column), and BRFelix (O) = O when Ana goes
to the opera because 4 > 3 (see right column). Intuitively, Felix seeks to attend the
same event as Ana does.
Ana’s best responses. Similarly, Ana’s best responses are BRAna (F) = F when Felix
goes to the football game because 4 > 3 (see top row), and BRAna (O) = O when Felix
goes to the opera because 5 > 2 (see bottom row). Hence, they both prefer being
together to being separated, but each has a more preferred event.
Matrix 12.10a becomes matrix 12.10b (with the best response payoffs underlined).
We therefore find two cells in which both players’ payoffs are underlined.
These two cells constitute the two NEs in this game: (Football, Football) and (Opera,
Opera).
13. Alternatively, we can understand matrix 12.10a by looking at each strategy profile. When both players attend
the football game, Felix’s payoff is 5, while Ana’s is 4. When they both attend the opera, these payoffs are switched—
now Ana receives a payoff of 5 and Felix a payoff of 4. When players miscoordinate, however, their payoffs are
lower: both receive a payoff of 3 when Felix goes to the football game while Ana is at the opera (each at their
preferred event), but each receives a payoff of 2 when Felix goes to the opera while Ana is at the football game
(each goes to their least preferred event).
Self-assessment 12.6 Consider again the Battle of the Sexes game from mat-
rix 12.10a. However, let us now assume that Felix started to appreciate opera, changing
the payoffs in the second row of matrix 12.10a to (3, 2) and (5, 5)—that is, only Felix’s
payoffs from going to the opera changed. Find the NE of the game, and compare your
results against those in example 12.6. Interpret.
Example 12.7: Coordination game Consider the game in matrix 12.11a, illustrat-
ing a “bank run” between depositors 1 and 2, with payoffs in thousands of dollars.
News suggests that the bank where depositors 1 and 2 have their savings accounts
could be in trouble, and each depositor must decide simultaneously and independently
whether to withdraw all the money in her account or wait. If both wait, they both main-
tain their funds ($150); if both withdraw, the bank can offer cash for only a portion
of their savings ($50); and if one depositor withdraws while the other waits, the bank
can provide the former with funds for most of her savings ($100), while the waiting
depositor is left with no money.
Depositor 1’s best responses. In this scenario, depositor 1’s best responses are
BR1 (W ) = W when depositor 2 withdraws because 50 > 0 (in the left column), and
BR1 (NW ) = NW when depositor 2 does not withdraw because 150 > 100 (in the
right column). Similar to the Battle of the Sexes game, depositor 1 chooses the same
strategy as her opponent.
Depositor 2’s best responses. Because depositors’ payoffs are symmetric, depositor
2’s best responses are also BR2 (W ) = W when depositor 1 withdraws because 50 > 0
(in the top row), and BR2 (NW ) = NW when depositor 1 does not withdraw, given that
150 > 100 (in the bottom row).
Matrix 12.11a becomes matrix 12.11b (with the best response payoffs underlined).
As a result, the two NEs in this game are (Withdraw, Withdraw) and (Not withdraw,
Not withdraw). Because players seek to coordinate by choosing the same strategy
(either both withdrawing or both not doing so), this type of game is commonly known
as a “Coordination game.”
                                Depositor 2
                           Withdraw    Not withdraw
Depositor 1  Withdraw      50, 50      100, 0
             Not withdraw  0, 100      150, 150
Matrix 12.11a
Coordination game.
                                Depositor 2
                           Withdraw    Not withdraw
Depositor 1  Withdraw      50*, 50*    100, 0
             Not withdraw  0, 100      150*, 150*
Matrix 12.11b
The Coordination game–underlining best response payoffs (an asterisk marks an underlined payoff).
Self-assessment 12.7 Consider again the Coordination game from matrix 12.11a.
Let us now assume that the payoff that depositors obtain when they both withdraw
their funds is lower: only (10, 10). Check if the two NEs found in example 12.7 still
emerge in this scenario.
Example 12.8: Anticoordination game Matrix 12.12a presents a game with the
opposite strategic incentives to those of the Coordination game in example 12.7. In particular,
the matrix illustrates the Game of Chicken, as seen in movies like Rebel without a
Cause and Footloose,14 where two teenagers in cars drive toward each other (or toward
a cliff). If both swerve, they avoid the accident, but both are regarded as “chicken”
by their friends, yielding a negative payoff of −1 to each player; if only one player
swerves, he is declared the chicken, obtaining a negative payoff of −10, while his
friend (who stayed) is the top dog, getting a payoff of 10; finally, if both players stay,
they crash in a serious car accident, yielding a payoff of −20 for both of them (they
almost die!).
                      Player 2
                  Swerve      Stay
Player 1  Swerve  −1, −1      −10, 10
          Stay    10, −10     −20, −20
Matrix 12.12a
Anticoordination game.
As usual, let us start by finding best responses in this setting.
14. In the new version of Footloose, released in 2011, two teenagers drive school buses until one of them gives up
and a similar strategic scenario to the one we consider here arises.
                      Player 2
                  Swerve       Stay
Player 1  Swerve  −1, −1       −10*, 10*
          Stay    10*, −10*    −20, −20
Matrix 12.12b
Anticoordination game–underlining best response payoffs (an asterisk marks an underlined payoff).
Player 1’s best responses. Player 1 has BR1 (Swerve) = Stay when player 2 Swerves
(in the left column), because 10 > −1; and BR1 (Stay) = Swerve when player 2 Stays
(in the right column), because −10 > −20. Intuitively, player 1 chooses the opposite
strategy to his opponent’s: when his opponent Swerves, player 1 becomes the top dog by
Staying, whereas when his opponent Stays, player 1 avoids the accident by Swerving.
Player 2’s best responses. Because players’ payoffs are symmetric, player 2’s best
responses are also BR2 (Swerve) = Stay when player 1 Swerves (in the top row); and
BR2 (Stay) = Swerve when player 1 Stays (in the bottom row).
Underlining best response payoffs, matrix 12.12a becomes matrix 12.12b. As a
result, the two NEs in this game are (Swerve, Stay) and (Stay, Swerve). Since every
player seeks to anticoordinate by choosing the opposite strategy to her opponent’s, this
type of game is known as an “Anticoordination game.”
Self-assessment 12.8 Consider again the Anticoordination game from matrix
12.12a, but assume that all payoffs are doubled. Show that the two NEs found in
example 12.8 still emerge in this scenario.
For more examples of games and details on how to predict equilibrium behavior, see
Harrington (2014) and Muñoz-Garcia and Toro-Gonzalez (2019).
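As a compact recap of this section, the sketch below encodes matrices 12.9a, 12.10a, 12.11a, and 12.12a and searches each one for cells that are mutual best responses. The dictionary layout and the function name are our own illustrative choices.

```python
# name -> (row labels, column labels, payoff matrix of (row, column) payoffs)
games = {
    "Prisoner's Dilemma": (["Confess", "Not confess"], ["Confess", "Not confess"],
                           [[(-5, -5), (0, -10)], [(-10, 0), (-1, -1)]]),
    "Battle of the Sexes": (["Football", "Opera"], ["Football", "Opera"],
                            [[(5, 4), (3, 3)], [(2, 2), (4, 5)]]),
    "Coordination": (["Withdraw", "Not withdraw"], ["Withdraw", "Not withdraw"],
                     [[(50, 50), (100, 0)], [(0, 100), (150, 150)]]),
    "Anticoordination": (["Swerve", "Stay"], ["Swerve", "Stay"],
                         [[(-1, -1), (-10, 10)], [(10, -10), (-20, -20)]]),
}

def pure_nash(payoffs):
    """Cells where each player's payoff is the highest she can get,
    holding the other player's strategy fixed."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    return [(r, c) for r in range(n_rows) for c in range(n_cols)
            if payoffs[r][c][0] == max(payoffs[k][c][0] for k in range(n_rows))
            and payoffs[r][c][1] == max(payoffs[r][k][1] for k in range(n_cols))]

equilibria = {name: [(rows[r], cols[c]) for r, c in pure_nash(matrix)]
              for name, (rows, cols, matrix) in games.items()}
for name, profiles in equilibria.items():
    print(name, "->", profiles)
```

The Prisoner's Dilemma returns a single cell, the Battle of the Sexes and Coordination games return the two cells on the main diagonal, and the Anticoordination game returns the two off-diagonal cells.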
12.6 Mixed-Strategy Nash Equilibrium
All games we have analyzed thus far in this chapter have had at least one NE (one NE in
the Prisoner’s Dilemma game and two NEs in the Battle of the Sexes game, the Coordina-
tion game, and the Anticoordination game). A natural question at this point is whether all
games have a NE. The answer is Yes, under relatively general conditions. However, some
games may not have a NE if we restrict players to choose a specific strategy 100 percent of
the time, rather than allowing them to randomize across some of their available strategies.
Example 12.9 illustrates such a possibility with penalty kicks in soccer, and then we examine
how to find a NE when we allow players to randomize.
Example 12.9: Penalty kicks in soccer Consider matrix 12.13a, representing a
penalty kick in soccer. If the kicker aims left (right) and the goalie dives to the
left (right, respectively), the kicker does not score, and both players’ payoffs are
zero.15 However, when the kicker aims left (right) and the goalie dives in the opposite
direction (right and left, respectively), the kicker scores, yielding a negative payoff of
−5 to the goalie and a positive payoff of 8 for the kicker.16
                         Kicker
                    Aim Left    Aim Right
Goalie  Dive Left   0, 0        −5, 8
        Dive Right  −5, 8       0, 0
Matrix 12.13a
Anticoordination game.
                         Kicker
                    Aim Left    Aim Right
Goalie  Dive Left   0*, 0       −5, 8*
        Dive Right  −5, 8*      0*, 0
Matrix 12.13b
Anticoordination game–underlining best response payoffs (an asterisk marks an underlined payoff).
No pure-strategy NE. Finding best responses for the goalie, we obtain that BRG (L) =
L when the kicker aims left (in the left column) because 0 > −5, and BRG (R) = R
when the kicker aims right (in the right column) because 0 > −5. Intuitively, the goalie
tries to move in the same direction as the kicker, aiming to prevent the latter from
scoring. In contrast, the kicker’s best responses are BRK (L) = R when the goalie dives
left (in the top row) because 8 > 0, and BRK (R) = L when the goalie dives right (in
the bottom row) because 8 > 0. Intuitively, the kicker seeks to aim to the opposite
location of the goalie to score a goal. Hence, underlining best response payoffs, we
obtain matrix 12.13b.
15. For simplicity, we say that the goalie dives to the left, meaning to the same left as the kicker (rather than to the
goalie’s left). A similar argument applies when the goalie dives to the right. Otherwise, it would be a bit harder to
keep track of the fact that the goalie’s left corresponds to the kicker’s right, and vice versa.
16. Similar payoffs would still produce our desired result of no NE when players are restricted to using a specific
strategy with 100 percent probability. You can make small changes on the payoffs, and then find the best responses
of each player again to confirm this point.
                                      Kicker
                                 Prob. q     Prob. 1 − q
                                 Aim Left    Aim Right
Goalie  Prob. p      Dive Left   0, 0        −5, 8
        Prob. 1 − p  Dive Right  −5, 8       0, 0
Matrix 12.13c
Anticoordination game–including probabilities.
In summary, there is no cell where the payoffs for all players have been underlined,
indicating that there is no mutual best response. As a consequence, there is no NE
when we restrict players to use a specific strategy (either left or right) with 100 percent
probability. If, instead, we allow players to randomize (e.g., playing left with some
probability, such as 1/3, and right with the remaining probability of 2/3) we can find
the NE of the game. Because players in such a scenario mix their strategies, this type of
NE is known as a “mixed-strategy NE,” whereas those in which players use a specific
strategy with 100 percent probability are referred to as a “pure-strategy NE.”
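Another way to see that no cell of matrix 12.13a is a pure-strategy NE is to list, for each cell, which player would gain from a unilateral deviation. The sketch below does so; the function name is an illustrative assumption of ours.

```python
# Rows: goalie dives left / right. Columns: kicker aims left / right.
payoffs = [
    [(0, 0), (-5, 8)],
    [(-5, 8), (0, 0)],
]

def profitable_deviators(payoffs, r, c):
    """Players who strictly gain by deviating unilaterally from cell (r, c)."""
    deviators = []
    # The goalie can only change the row, keeping the kicker's column fixed.
    if any(payoffs[k][c][0] > payoffs[r][c][0] for k in range(len(payoffs))):
        deviators.append("goalie")
    # The kicker can only change the column, keeping the goalie's row fixed.
    if any(payoffs[r][k][1] > payoffs[r][c][1] for k in range(len(payoffs[0]))):
        deviators.append("kicker")
    return deviators

deviators = {(r, c): profitable_deviators(payoffs, r, c)
             for r in range(2) for c in range(2)}
for cell, names in deviators.items():
    print(cell, "->", names)
```

Every cell has at least one player who wants to deviate (the kicker when the goalie guesses correctly, the goalie otherwise), so no cell can be a mutual best response.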
Allowing for randomization. Let us next consider that the goalie dives left, with pro-
bability p, and right, with the remaining probability 1 − p. (For easier reference,
matrix 12.13c includes these probabilities next to the corresponding row for the
goalie.) If p = 1, the goalie would be diving left with 100 percent probability. Simi-
larly if p = 0, she dives right with 100 percent probability, whereas when p satisfies
0 < p < 1, the goalie randomizes her diving decision. Graphically, this randomization
can be understood as choosing the top row of the matrix with probability p and the
bottom row with probability 1 − p. Following a similar approach, let the kicker assign
a probability q to aiming left (in the left column) and the remaining probability 1 − q
to her aiming right (in the right column). Matrix 12.13c also includes these probabilities
on the top of each column for the kicker.
Goalie (row player). Once we have assigned probabilities to each row and column,
we can make an important point about mixed strategies: if the goalie does not select a
particular action with 100 percent probability, it must be that she is indifferent between
her two options: dive left and dive right. That is, her expected utility from both options
must coincide. To represent this conclusion mathematically, let’s first find the goalie’s
expected utility from diving left:
$$EU_{Goalie}(\text{Left}) = \underbrace{q \cdot 0}_{\text{kicker aims left}} + \underbrace{(1 - q)(-5)}_{\text{kicker aims right}}.$$
This expression can be understood as follows: when the goalie dives left (in the top
row), she does not know whether the kicker will aim left or right. If the kicker aims
left, which occurs with probability q, she does not score, yielding a payoff of 0 for the
goalie (see the first zero in the left column of the top row). If, instead, the kicker aims
right (which happens with probability 1 − q), she scores, which produces a payoff
of −5 for the goalie (see the right column in the top row). Simplifying this expres-
sion, we obtain that the goalie’s expected payoff from diving left is EUGoalie (Left) =
−5 + 5q.
Similarly, the goalie’s expected utility from diving right is
$$EU_{Goalie}(\text{Right}) = \underbrace{q(-5)}_{\text{kicker aims left}} + \underbrace{(1 - q) \cdot 0}_{\text{kicker aims right}} = -5q.$$
In this case, the goalie dives right (in the bottom row of the matrix), and she either
obtains a payoff of −5, which occurs when the kicker aims left and thus scores, or a
payoff of 0, which happens when the kicker aims right and does not score. As discussed
previously, if the goalie is not playing a pure strategy (i.e., either choosing to dive left
or right 100 percent of the time), she must be indifferent between diving left and right.
We can express this indifference as follows:
EUGoalie (Left) = EUGoalie (Right)
Using these results, this expression is equivalent to
−5 + 5q = −5q,
which simplifies to 10q = 5, and solving for q yields q = 5/10 = 1/2. Therefore, the
goalie is indifferent between diving left and right when the kicker aims left 50 percent
of the time (because we found that q = 1/2).
Kicker (column player). Following a similar approach for the kicker, we first find
her expected utility from aiming left:
$$EU_{Kicker}(\text{Left}) = \underbrace{p \cdot 0}_{\text{goalie dives left}} + \underbrace{(1 - p) \cdot 8}_{\text{goalie dives right}}.$$
In terms of the matrix, we fix our attention on the left column because the kicker aims
left. Recall that the kicker’s payoff is uncertain because she does not know if the goalie
will dive left, preventing the kicker from scoring (which yields a payoff of 0 for the
kicker, in the top row of the matrix), or if the goalie will dive right, which entails a
score and a payoff of 8 (in the bottom row of the matrix). Simplifying this expected
utility, we find EUKicker (Left) = 8 − 8p. Turning now to the case in which the kicker
aims right, we obtain an expected utility of
$$EU_{Kicker}(\text{Right}) = \underbrace{p \cdot 8}_{\text{goalie dives left}} + \underbrace{(1 - p) \cdot 0}_{\text{goalie dives right}} = 8p,$$
which reverses the payoffs from the previous case because the kicker now scores only when
the goalie dives left, which occurs with probability p (the probability with which the
goalie plays the top row). As discussed previously, if the kicker randomizes, it must
be that she is indifferent between aiming left and right, or more compactly,
EUKicker (Left) = EUKicker (Right),
which, using these results, entails
8 − 8p = 8p,
which simplifies to 8 = 16p. Solving for p, we obtain p = 8/16 = 1/2. Therefore, the kicker
is indifferent between aiming left and right when the goalie dives left with 50 percent
probability (because we found that p = 1/2).
We can then summarize that the only NE of this game has both players randomizing
between right and left with 50 percent probability. That is, the mixed-strategy NE
(msNE) is p = q = 1/2. Note that players do not need to randomize with the same probability.
They only did it in this situation because payoffs are symmetric in matrix 12.13b.
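The two indifference conditions can be solved in closed form for any 2×2 game in which both players mix. The sketch below is a minimal illustration under that assumption: the function name is ours, p denotes the probability of the top row, q the probability of the left column, and the second matrix is a purely hypothetical variation (not taken from the text) in which the kicker values a goal on one side less than on the other.

```python
from fractions import Fraction

def mixed_nash_2x2(payoffs):
    """Return (p, q) making both players indifferent, or None if no such
    pair exists with both probabilities strictly between 0 and 1."""
    (a00, b00), (a01, b01) = payoffs[0]   # a = row player's payoff, b = column player's
    (a10, b10), (a11, b11) = payoffs[1]
    # Row player's indifference, q*a00 + (1-q)*a01 = q*a10 + (1-q)*a11, pins down q.
    denom_q = a00 - a01 - a10 + a11
    # Column player's indifference, p*b00 + (1-p)*b10 = p*b01 + (1-p)*b11, pins down p.
    denom_p = b00 - b01 - b10 + b11
    if denom_q == 0 or denom_p == 0:
        return None
    q = Fraction(a11 - a01, denom_q)
    p = Fraction(b11 - b10, denom_p)
    return (p, q) if 0 < p < 1 and 0 < q < 1 else None

penalty_kicks = [[(0, 0), (-5, 8)], [(-5, 8), (0, 0)]]
hypothetical = [[(0, 0), (-5, 8)], [(-5, 4), (0, 0)]]   # illustrative payoffs only

print(mixed_nash_2x2(penalty_kicks))  # (Fraction(1, 2), Fraction(1, 2))
print(mixed_nash_2x2(hypothetical))   # (Fraction(1, 3), Fraction(1, 2))
```

In the hypothetical matrix the goalie dives left with probability 1/3 while the kicker still aims left with probability 1/2, which illustrates that the two mixing probabilities need not coincide once payoffs are no longer symmetric.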
Self-assessment 12.9 Consider again the penalty-kicks scenario from matrix
12.13a. Let us now assume, however, that the payoff that players obtain when the
kicker scores a goal is (−2, 30) rather than (−5, 8). Intuitively, the kicker is really
happy about winning the game, while the goalie is just a bit unhappy. All other payoffs
are unaffected. Show that this game does not have a pure-strategy NE (psNE) either,
and find the msNE. Compare the mixing probabilities that you find against p = q = 1/2
in example 12.9. Interpret your results.
Do all games have an msNE? Not necessarily. The Prisoner’s Dilemma game, for instance,
has a psNE in which all players choose to confess. Because players find confessing
to be a strictly dominant strategy, they have no incentive to randomize their decision. In other
games, such as the Battle of the Sexes game or the Coordination game, players do not have
a strictly dominant strategy. In these cases, we found two psNEs; as an exercise,
check that each game also has one msNE when we allow players to randomize. Lastly, note that
the penalty kicks example illustrates a general result: all games must have at least one NE, either in pure
or mixed strategies (i.e., either a psNE or an msNE).17
12.6.1 Graphical Representation of Best Responses

In this section, we describe how to graphically represent the best response of each player
and its interpretation. For presentation purposes, consider the goalie and kicker in example
12.9. We separately examine the goalie and kicker’s best responses next.
Goalie. From the previous analysis, the goalie chooses to dive left if her expected utility
from diving left is higher than from diving right; that is,
EUGoalie (Left) > EUGoalie (Right),
which can be expressed as −5 + 5q > −5q, ultimately simplifying to q > 1/2. Intuitively,
when the probability that the kicker aims left (as measured by q) is sufficiently high (in this
case, q > 1/2), the goalie responds by diving left, so she increases her chances of blocking
the ball. Mathematically, this means that, for all q > 1/2, the goalie chooses to dive left (i.e.,
p = 1). In contrast, for all q < 1/2, the goalie responds by diving right (i.e., p = 0). Figure 12.2a
depicts this best response function.18
Kicker. A similar argument applies to the kicker. From our analysis in example 12.9, we
know that she aims left if
EUKicker (Left) > EUKicker (Right),
which entails 8 − 8p > 8p, or, after simplifying, p < 1/2. Intuitively, when the goalie is likely
to dive right (as captured by p < 1/2), the kicker aims left, increasing her chances of scoring.
Mathematically, we can write this result by saying that, for all p < 1/2, the kicker aims left
(i.e., q = 1), while for all p > 1/2, the kicker aims right (i.e., q = 0). Figure 12.2b illustrates
the kicker’s best response function.19
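The two best response functions just described can be sketched as Python functions that compare expected utilities. As a convention of our own, each function returns the interval (lowest, highest) of optimal probabilities, so that (1, 1) means "play left for sure," (0, 0) means "play right for sure," and (0, 1) means the player is indifferent and any mixing probability is optimal.

```python
from fractions import Fraction

def goalie_best_response(q):
    """Optimal values of p (probability of diving left) given the kicker's q."""
    eu_left = q * 0 + (1 - q) * (-5)
    eu_right = q * (-5) + (1 - q) * 0
    if eu_left > eu_right:
        return (1, 1)      # dive left for sure
    if eu_left < eu_right:
        return (0, 0)      # dive right for sure
    return (0, 1)          # indifferent: any p is a best response

def kicker_best_response(p):
    """Optimal values of q (probability of aiming left) given the goalie's p."""
    eu_left = p * 0 + (1 - p) * 8
    eu_right = p * 8 + (1 - p) * 0
    if eu_left > eu_right:
        return (1, 1)      # aim left for sure
    if eu_left < eu_right:
        return (0, 0)      # aim right for sure
    return (0, 1)          # indifferent: any q is a best response

for value in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(value, goalie_best_response(value), kicker_best_response(value))
```

The goalie's best response jumps from p = 0 to p = 1 as q crosses 1/2, while the kicker's jumps from q = 1 to q = 0 as p crosses 1/2, which matches the shapes described for figures 12.2a and 12.2b.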
17. This result holds so long as a game is not extremely “strange,” in the sense that players’ payoffs are discontin-
uous at several points. All games considered in this book, even the most complicated you may envision, have at
least one NE.
18. Graphically, condition “for all q > 1/2” means that we are on the right side of figure 12.2a. For these points, the
goalie’s best response says that p = 1 at the top horizontal line of the graph. Similarly, condition “for all q < 1/2”
indicates that we look at the left side of the graph. For these points, the goalie’s best response of setting p = 0 is at
the bottom horizontal line on the left of the graph (the overlapping part of the horizontal axis).
19. Graphically, condition “for all p < 1/2” means that we are on the bottom half of figure 12.2b. For these points,
the kicker’s best response says that q = 1 at the vertical line on the right side of the graph. Similarly, condition “for
all p > 1/2” indicates that we look at the top half of the graph. For these points, the kicker’s best response is to set
q = 0, at the vertical line on the left of the graph (on top of the vertical axis).
Figure 12.2a
Goalie’s best responses (vertical axis: p; horizontal axis: q). For all q > 1/2, the goalie chooses p = 1 (dive left); for all q < 1/2, the goalie chooses p = 0 (dive right).
Figure 12.2b
Kicker’s best responses (vertical axis: p; horizontal axis: q). For all p > 1/2, the kicker chooses q = 0 (aim right); for all p < 1/2, the kicker chooses q = 1 (aim left).
Putting together goalie’s and kicker’s responses. Figure 12.3 superimposes the goalie’s and
the kicker’s best response functions, which we can do because we used the same axes in
figures 12.2a and 12.2b. The players’ best responses cross each other at only one point in
the graph, where p = q = 1/2, as predicted in example 12.9. Graphically, the fact that both
players’ best responses cross means that both are using their best responses or, in other
words, that the strategy profile is a mutual best response, as required by the definition of NE.
For this example, the crossing point p = q = 1/2 is the only NE of the game, an msNE, which
has both players mixing. If the game had one or more psNEs, the best response functions
should cross at some point on the vertices of the unit square of figure 12.3. Generally, if
the game you analyze has more than one NE, the best responses you depict should cross at
more than one point; namely, one point in the (p, q)–quadrant of figure 12.3 for each psNE
that you find and, similarly, one for each msNE that you obtain.
Figure 12.3
Both players’ best responses. [Graph with q on the horizontal axis and p on the vertical axis,
superimposing the goalie’s and the kicker’s best responses. They cross at the msNE, where
p = 1/2 and q = 1/2.]
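The crossing point of two best response functions can also be checked numerically. The following
is a minimal sketch in Python, not the book's own method: the payoffs are hypothetical (they are
not taken from example 12.9) and are chosen only so that the goalie wants to match the kicker's
side while the kicker wants to avoid it. The function names are illustrative as well.

```python
from fractions import Fraction

# Hypothetical payoffs (NOT taken from example 12.9): the goalie earns 1 when
# she dives to the side the kicker aims at and -1 otherwise; the kicker earns
# the opposite. Keys are (goalie action, kicker action).
goalie = {("L", "L"): 1, ("L", "R"): -1, ("R", "L"): -1, ("R", "R"): 1}
kicker = {cell: -value for cell, value in goalie.items()}


def indifference_prob(u, own_first):
    """Probability that the OPPONENT plays "L" that makes this player indifferent."""
    def pay(own, other):
        return u[(own, other)] if own_first else u[(other, own)]
    # Solve x*pay(L,L) + (1-x)*pay(L,R) = x*pay(R,L) + (1-x)*pay(R,R) for x.
    num = pay("R", "R") - pay("L", "R")
    den = pay("L", "L") - pay("L", "R") - pay("R", "L") + pay("R", "R")
    return Fraction(num, den)


def goalie_best_response(q):
    """Goalie's best p (prob. of diving left) when the kicker aims left w.p. q."""
    left = q * goalie[("L", "L")] + (1 - q) * goalie[("L", "R")]
    right = q * goalie[("R", "L")] + (1 - q) * goalie[("R", "R")]
    if left > right:
        return 1
    if left < right:
        return 0
    return None  # indifferent: any p in [0, 1] is a best response


q_star = indifference_prob(goalie, own_first=True)   # kicker's mix, q
p_star = indifference_prob(kicker, own_first=False)  # goalie's mix, p
print(p_star, q_star)
```

Under these assumed payoffs, the goalie's best response jumps from p = 0 to p = 1 as q passes the
indifference value, which is the step shape described for figure 12.2a. The sketch fails with a
division by zero when a player is never indifferent, in which case the game has no interior msNE.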
Self-assessment 12.10 Consider again the Anticoordination game in matrix
12.12a. While we found two psNEs in that game, we can still find one msNE. Repeat
the analysis in example 12.9 to find the msNE of the Anticoordination game, and
depict the best responses for each player. Show that the best responses cross at three
points: (1) at (p, q) = (0, 1) at the bottom-right corner of the graph, which corresponds to the psNE
(Stay, Swerve); (2) at (p, q) = (1, 0) at the top-left corner of the graph, corresponding
to the psNE (Swerve, Stay); and (3) at an interior point where both p and q are strictly
between 0 and 1, illustrating the msNE of the game.
Exercises
1. Strict dominance.A Apply IDSDS to the game shown here:
                Player 2
                L        C        R
Player 1   U    1, 1     3, 2     5, 3
           M    2, 3     4, 4     6, 5
           D    3, 5     5, 6     7, 5
2. IDSDS and deletion order.A Consider the game in exercise 12.1.

(a) Start deleting strictly dominated strategies for player 1. What is the equilibrium that you find
after applying IDSDS?
(b) Start deleting strictly dominated strategies for player 2. What is the equilibrium you find after
applying IDSDS?
(c) Compare the results of parts (a) and (b). What do they imply about the ordering of IDSDS?
3. Strict dominance—some bite.B Apply IDSDS to the game shown here:
                Player 2
                W        X        Y        Z
Player 1   A    1, 1     3, 2     5, 3     3, 3
           B    2, 3     4, 4     6, 5     1, 3
           C    3, 5     5, 6     7, 5     4, 3
           D    2, 6     6, 4     4, 2     3, 2
4. Strict dominance—no bite.B Apply IDSDS to the game shown here:
                Player 2
                L        R
Player 1   U    1, 6     5, 7
           D    4, 1     5, 1
5. Prisoner’s Dilemma.B In the Prisoner’s Dilemma, why did each player elect to confess when each
player would be better off had they both remained silent? Do we see similar behavior in
the real world? Explain.
6. Mixed dominance.C Consider the following normal-form game:
                Player 2
                L        R
Player 1   U    2, 1     2, 3
           M    1, 3     4, 2
           D    4, 1     1, 4
(a) Apply IDSDS to this game.

(b) Suppose that player 1 chooses to randomize between selecting M and D with probability of
0.5 each. Show that player 1’s expected payoff from this randomization is strictly higher than
that of strategy U.
7. Weak dominance.A Consider the game presented in exercise 12.4.
(a) Identify any weakly dominated strategies.
(b) Does IDWDS provide a unique solution to this game?
8. Weak dominance—deletion order matters.B Consider the following payoff matrix:
                Player 2
                L        R
Player 1   U    1, 1     1, 4
           D    3, 2     1, 2
Let us show that the equilibrium prediction from IDWDS can depend on which player’s
strategies we delete first.
(a) Start deleting weakly dominated strategies for player 1, and then player 2, and so on. Find
the equilibrium prediction after applying IDWDS.
(b) Start deleting weakly dominated strategies for player 2, and then player 1, and so on. Find
the equilibrium prediction after applying IDWDS.
(c) Do your equilibrium predictions in parts (a) and (b) coincide?
9. Weak dominance–III.B In several game shows across the US and UK, two players work together
to build up a cash prize for the end of the show of size M. At the end of the show, each player
must simultaneously choose whether to “Split” or to “Steal” the cash prize.
• If both players choose “Split,” each player leaves the show with half the cash prize, M/2.
• If one player chooses “Split” while the other player chooses “Steal,” the player who chooses
“Split” receives none of the cash prize, while the player who chooses “Steal” receives the whole
cash prize.
• If both players choose “Steal,” they both receive none of the cash prize.
(a) Depict the normal-form representation of this game.
(b) Identify any strictly or weakly dominated strategies in this game.
(c) Which is your equilibrium prediction after applying IDSDS?
(d) Which is your equilibrium prediction after applying IDWDS?
10. Nash equilibrium in a 2x2 matrix.A Find all psNEs in the game in the following payoff matrix:
                Player 2
                L        R
Player 1   U    6, 4     −2, 1
           D    5, 2     2, 3
11. Nash equilibrium in a 3x3 matrix.B Find all psNEs in the normal-form representation of the
game shown here:
                Player 2
                L        C          R
Player 1   U    5, 0     1, 9       5, 8
           M    0, 4     −9, −4     8, 0
           D    1, 6     8, 9       1, 6
12. Nash equilibrium—several equilibria.B Find all psNEs in the game shown here:
Player 2
L R
U 6, 4 2, 3
Player 1
D 6, 4 2, 3
13. Nash equilibrium in the Split–Steal game.B Find all psNEs in the game presented in exercise
12.9. Is there ever a reason for a player to choose “Split”?
14. Three–player games.B Find all psNEs in the normal-form representation of the game shown
here. Player 3 acts as the matrix player.
Player 3: A
                Player 2
                L            R
Player 1   U    4, 2, 6      4, 6, 9
           D    3, 10, 6     5, 5, 3
Player 3: B
                Player 2
                L            R
Player 1   U    3, 5, 7      5, 1, 6
           D    7, 3, 10     8, 6, 6
15. Rock, Paper, Scissors–I.A Suppose that you and a friend engaged in the classic game of
rock, paper, scissors. In this game, both players simultaneously choose among “Rock,” “Paper,”
and “Scissors,” where “Paper” defeats “Rock,” “Scissors” defeats “Paper,” and “Rock” defeats
“Scissors.” Suppose that whoever wins this game receives $1 from the loser.
(b) Describe each player’s best response.
(c) Is there a psNE of this game? If so, what is it? If not, why not?
16. Competition.B Suppose that two firms are determining how to price their products against one
another. They each simultaneously choose whether to price high or low with the following results:
• If they both price high, each firm brings in a profit level of B.
• If one firm prices high while the other prices low, the high pricing firm receives a profit level
of D, while the low pricing firm receives a profit level of A.
• If they both price low, each firm brings in a profit level of C.
Assume that A > B > C > D.
(b) Describe each player’s best response.
17. Charitable contributions.B Suppose that two wealthy donors are considering making a charitable
contribution to a public project. This project is costly, and it only reaps benefits if both donors
contribute. To contribute (C), a donor must pay a cost of $1, and the project is worth $3 to each
donor if both donors contribute, and 0 otherwise. Not contributing (NC) costs nothing.
(b) Describe each donor’s best response.
18. Finding msNE–Battle of the Sexes.B Find the msNE in the Battle of the Sexes game described
in this chapter.
19. Finding msNE–Coordination game.B Find the msNE in the Coordination game described in
this chapter.
20. Rock, Paper, Scissors–II.B Find the msNE in the Rock, Paper, Scissors game given in
exercise 12.15.
21. Penalty kicks.A Consider the penalty kicks scenario in example 12.9. Suppose that the goalie
received information with certainty that the kicker was going to aim right for this shot.
(a) Should he continue to randomly dive? Why or why not?
(b) Suppose now that the kicker knew that the goalie had this information. What should the
kicker do?
22. Scaling payoffs.B Consider the results of exercise 12.10. Suppose now that all the payoffs for
both players were doubled. Would that change the results for either the pure or mixed-strategy
NE? Explain why or why not.
13 Sequential and Repeated Games
13.1 Introduction
Chapter 12 analyzed strategic scenarios in which all players act at the same time
(simultaneous-move games). In many contexts, however, one player has the ability to select her
action first (i.e., she is the first mover), while other agents respond to her moves. For instance, an
industry leader may choose the price of its product (or its production level) before other firms
get to choose theirs. We refer to these scenarios as “sequential-move games” or “sequential
games.”
While we can predict how players behave by deploying the notion of Nash equilibrium
(NE) learned in the previous chapter, we show that this solution concept provides us with
too many equilibria when the game we analyze is sequential; and some NEs can be based
on incredible beliefs about players’ future moves.
Given these problems, we present a more common tool used to understand players’ equi-
librium behavior in sequential games, “subgame perfect equilibrium (SPE)” or “rollback
equilibrium,” which essentially starts by analyzing the optimal action by the last mover in
the game. Once we know how the last mover will behave, we move to the previous-to-
last mover, who anticipates how the last mover will behave. Intuitively, the previous-to-last
mover puts herself in the shoes of the last mover, understands her motives, and then fore-
casts how the latter will behave. Understanding this optimal response by the last mover, the
previous-to-last mover can maximize her payoff fully by anticipating how the game ensues
after each of her own moves. We can then repeat a similar process, moving one more step
closer to the first mover, again and again until we finally reach her.
In the second part of the chapter, we apply SPE to repeated games (i.e., scenarios
in which players interact with one another several times). For presentation purposes, we
return to the standard Prisoner’s Dilemma game discussed in the previous chapter, where
we know that players do not cooperate when playing the game only once. We then ask
whether repeating the game twice can help players cooperate with one another. Our answer,
however, is No. To understand this result, consider the twice-repeated Prisoner’s Dilemma
game. In round 2, players can anticipate that they will both defect. Importantly, this behavior
would be unaffected by their previous behavior during the first round of play (i.e., players
will defect in round 2 regardless of whether they both cooperated or defected in round 1).
Expecting this behavior, the game that players face in round 1 becomes independent from
that in round 2, leading players to behave as they do in the unrepeated version of the game
explored in chapter 12.
A similar argument applies when we repeat the game three times or, more generally, a
finite number of times, such as 5 or 200 times, because players can anticipate their mutual
defection in the last round of play, regardless of previous history, and roll back from that
until their first round of interaction. You may wonder: “Wow, this is grim news because we
don’t seem to achieve cooperation even if we repeat the game hundreds of times!” Well, we
have some good news for you: If we repeat the game an infinite number of times, cooperation
can be supported as the SPE of the game if players care enough about their future payoffs.
We discuss how to identify the conditions under which this type of cooperation emerges in
equilibrium, and then we provide several examples that illustrate how to approach similar
problems.
13.2 Game Trees
The games analyzed so far in this book have assumed that players chose their strategies
simultaneously or, alternatively, that the time difference between one player’s choices and her
opponent’s is small enough to be modeled as if players acted at the same time.1 In some real-
world scenarios, however, players may act sequentially, with one player choosing her strat-
egy first and another player responding with his strategy choice days or even months later.
The game tree in figure 13.1a provides an example, where a potential entrant first chooses
whether to enter an industry in which an incumbent firm operates as a monopolist. (Because
the entrant is the first mover, its choice is labeled at the “root” of the tree on the left side of
the game.) If it does not enter, the game is over, yielding a payoff of zero for the entrant, but
a monopoly payoff of 10 for the incumbent. If the entrant enters, however, the incumbent
is called on to move, as indicated by the node labeled “Incumbent” at the top of the game
tree. At this point, the incumbent chooses whether to accommodate the entry, which entails
a payoff of 4 for each firm, or start a price war, which generates a payoff of −2 (i.e., loss)
for both firms.
Figure 13.1a
Entry game. [Game tree with payoffs listed as (entrant, incumbent). The potential entrant (first
mover) chooses In or Out. Out ends the game with payoffs (0, 10). After In, the incumbent (second
mover) chooses Accommodate Entry, with payoffs (4, 4), or Price War, with payoffs (−2, −2).]
Figure 13.1b offers an example of a game where one of the players (firm 2) does not
observe the moves of its opponent in previous stages. In particular, firm 1 chooses to invest
or not in the first period. If it does not invest, the game is over, but if it invests, firm 1
chooses whether to set a high or a low price. Firm 2, without observing whether firm 1 chose
a high or a low price (but knowing that it invested), responds with a high price (H’) or a
low price (L’). The dotted line connecting firm 2’s two nodes in the upper part of the tree is
referred to as an “information set” because this player does not know at which node it gets to
play. Firm 2’s available actions at the information set have the same labels. Indeed, because
firm 2 cannot condition its response on firm 1’s price, it must choose between the same two
actions, a high price (H’) or a low price (L’), at both nodes.2
Figure 13.1b
Sequential-move game with an information set. [Game tree. Firm 1 chooses Invest or Does not
invest. Does not invest ends the game with payoffs (2, 6). After Invest, firm 1 chooses High price
or Low price; firm 2, whose two nodes are joined by its information set, then chooses H’ or L’.
Payoffs: High price and H’, (4, 4); High price and L’, (1, 6); Low price and H’, (4, 1); Low price
and L’, (2, 2).]
1. This principle may apply to games such as the Rock-Paper-Scissors game, or penalty kicks between a kicker and
a goalie. Alternatively, simultaneous-move games can be used to model scenarios where two players act sequentially,
but the follower does not observe the leader’s actions before choosing her own. For instance, a firm may
choose its technology after the industry leader, but without observing the leader’s decision before selecting its
own technology.
2. If, instead, the labels of firm 2’s actions were H_H and L_H in the top node (after firm 1 chooses a high price) and
H_L and L_L in the bottom node (after firm 1 selects a low price), firm 2 would know at which node it is called on to
move just by looking at its available actions (either H_H and L_H, or H_L and L_L), implying that firm 2 would not be
uninformed about which action firm 1 chose in previous stages.
13.3 Why Don’t We Just Find the Nash Equilibrium of the Game Tree?
A natural question at this point is: “Well, we learned how to find equilibria in simultaneous-
move games by using the NE solution concept. Why not apply NE to sequential-move
games?”
NE can indeed help us in identifying equilibrium behavior in a game tree that depicts
players’ sequential moves but, as we next illustrate, the NE provides us with several equi-
libria. Most important, some of them may be illogical in a context where players act
sequentially.
Example 13.1: Applying NE to the Entry game Let us consider the Entry game
in figure 13.1a again. To find the NEs in this game tree, we first need to represent
the game in its matrix form (see matrix 13.1). The potential entrant has only two
available strategies, In and Out, and thus the matrix has two columns. Similarly, the
incumbent has two strategies at its disposal: Accommodate or Price war. (All payoffs
are in millions of dollars.)
We can now underline the best response payoffs, as we did in the previous chapter,
to label the NEs of the game.
Incumbent’s best responses. For the incumbent, we find that its best response to In
(in the left column of matrix 13.1) is BRinc (In) = Acc because 4 > −2, while its best
response to Out is BRinc (Out) = {Acc, War} because both yield a profit of 10. To
illustrate our results, matrix 13.2 reproduces matrix 13.1, with the best response
payoffs underlined.
                          Potential entrant
                          In          Out
Incumbent   Accommodate   4, 4        10, 0
            Price war     −2, −2      10, 0
Matrix 13.1
The Entry game in matrix form.
                          Potential entrant
                          In          Out
Incumbent   Accommodate   _4_, _4_    _10_, 0
            Price war     −2, −2      _10_, _0_
Matrix 13.2
Finding NEs in the Entry game. [Underlined best response payoffs are marked here with underscores.]
The entrant’s best responses. For the potential entrant (the column player), we find
that its best responses to each of the incumbent’s strategies (i.e., for each row of the
matrix) are BRent (Acc) = In, because 4 > 0, and BRent (War) = Out, because 0 > −2.
Intuitively, if the entrant believes that the incumbent will choose to accommodate after
its entry, then it should enter, obtaining a profit of 4, rather than staying out, with a
payoff of 0. In contrast, if the entrant believes that the incumbent will respond with a
price war to its entry, it should remain outside the industry; otherwise, its profit from
entering and facing a price war is negative!
Overall, this analysis found two cells in which we underlined both players’ payoffs
as being best responses. That is, we found two strategy profiles where players choose
mutual best responses to each other’s strategies (two NEs of the game):
(Acc, In) and (War, Out).

As discussed previously, in the first NE, entry occurs and the incumbent follows by
accommodating, whereas in the second NE, the entrant does not enter because it antic-
ipates a price war. Do you notice something fishy about the second NE—something
that doesn’t sound right? You should! In the strategy profile (War, Out), the incum-
bent is located in the upper part of the game tree. In this position, the incumbent must
take entry as given and, taking that into account, choose the response that maximizes
its profit. In particular, its payoff from accommodating entry (4) is larger than that
from starting a price war (−2). In other words, once the entrant is in, the best option
that the incumbent has is to accommodate entry. As a result, the entrant’s belief that
a price war will ensue upon entry is not sequentially rational, in the sense that the
entrant should put itself in the incumbent’s shoes to better anticipate its subsequent
moves if the entrant were to join the industry. Alternatively, the incumbent’s threat to
start a price war upon entry in strategy profile (War, Out) is noncredible, because the
entrant can anticipate that, upon entry, the incumbent prefers to avoid a price war.
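The underlining procedure of example 13.1 can be sketched in a few lines of Python. This is only an
illustration under our own naming (the function pure_nash and the dictionary layout are
assumptions, not part of the book): a cell is a psNE when the row player's payoff is the highest in
its column and the column player's payoff is the highest in its row.

```python
# Matrix 13.1: rows are the incumbent's strategies, columns the entrant's.
# Each cell stores (incumbent payoff, entrant payoff).
rows = ["Acc", "War"]
cols = ["In", "Out"]
payoffs = {
    ("Acc", "In"): (4, 4), ("Acc", "Out"): (10, 0),
    ("War", "In"): (-2, -2), ("War", "Out"): (10, 0),
}


def pure_nash(rows, cols, payoffs):
    """Return the cells where both payoffs are best responses (the psNEs)."""
    equilibria = []
    for r in rows:
        for c in cols:
            row_pay, col_pay = payoffs[(r, c)]
            # Row player's payoff is "underlined" if it is the highest in column c.
            row_best = row_pay == max(payoffs[(r2, c)][0] for r2 in rows)
            # Column player's payoff is "underlined" if it is the highest in row r.
            col_best = col_pay == max(payoffs[(r, c2)][1] for c2 in cols)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


entry_nes = pure_nash(rows, cols, payoffs)
print(entry_nes)
```

The check returns both (Acc, In) and (War, Out). It treats the game purely as a matrix, so it
cannot tell that the second profile relies on a noncredible threat; that is the gap that
sequential rationality is meant to close.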
Self-assessment 13.1 Repeat the analysis in example 13.1, but assume that when
the potential entrant joins the industry and the incumbent responds by accommodat-
ing, both firms earn a payoff of only 1. Are the results in example 13.1 affected?
Interpret.
In the following section, we present a new solution concept, subgame-perfect equilibrium
(SPE), which identifies only those NEs that are sequentially rational (i.e., those that are not
based on incredible beliefs).
13.4 Subgame-Perfect Equilibrium
To predict how players behave in these sequential contexts, we apply the solution concept
of backward induction, illustrated in tool 13.1.
Tool 13.1. Applying backward induction
1. Go to the farthest right side of the game tree (where the game ends), and focus on the
last mover.
2. Find the strategy that yields the highest payoff for the last mover.
3. Shade the branch that you found to yield the highest payoff for the last mover.
4. Go to the next-to-last mover and, following the response of the last mover that you found
in step 3, find the strategy that maximizes her payoff.
5. Shade the branch that you found to yield the highest payoff for the next-to-last mover.
6. Repeat steps 4–5 for the player acting before the previous-to-the-last mover, and then for
each player acting before her, until you reach the first mover at the root of the game.
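The steps in tool 13.1 can be sketched as a short recursive procedure. The Python code below is a
minimal illustration for game trees without information sets; the tree encoding and the function
names are our own assumptions. It uses the Entry game of figure 13.1a, with payoffs listed as
(entrant, incumbent), and breaks ties in favor of the first action listed.

```python
# A decision node is (player_index, {action: subtree}); a terminal node is a
# tuple of payoffs, one entry per player.
ENTRANT, INCUMBENT = 0, 1
entry_game = (ENTRANT, {
    "In": (INCUMBENT, {"Accommodate": (4, 4), "Price war": (-2, -2)}),
    "Out": (0, 10),
})


def is_terminal(node):
    # Decision nodes carry a dict of branches in position 1; payoffs do not.
    return not isinstance(node[1], dict)


def backward_induction(node, path=()):
    """Return (payoffs, plan). The plan maps each decision node, identified by
    the sequence of actions leading to it, to the action chosen there."""
    if is_terminal(node):
        return node, {}
    player, branches = node
    plan, best_action, best_payoffs = {}, None, None
    for action, child in branches.items():
        # Solve what follows each action first (steps 1-3 of the tool) ...
        payoffs, sub_plan = backward_induction(child, path + (action,))
        plan.update(sub_plan)
        # ... then keep the action that maximizes the mover's own payoff.
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_action, best_payoffs = action, payoffs
    plan[path] = best_action
    return best_payoffs, plan


spe_payoffs, spe_plan = backward_induction(entry_game)
print(spe_payoffs, spe_plan)
```

The plan records an action at every decision node, including nodes that are not reached in
equilibrium, which is what distinguishes a full strategy profile from a mere path of play.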
Example 13.2: Backward induction in the Entry game To apply backward induction,
we first focus on the last mover, the incumbent. Comparing its payoff from
accommodating entry, 4, and starting a price war, −2, we find that its best response
to entry is to accommodate. Following the steps in tool 13.1, we shade the branch
corresponding to Accommodate in figure 13.2.
Figure 13.2
Applying backward induction in the Entry game. [Game tree of the Entry game: the potential
entrant chooses In or Out; Out yields (0, 10); after In, the incumbent chooses Accommodate,
yielding (4, 4), or Price War, yielding (−2, −2).]
We now move to the player acting before the incumbent, which in this example is
the first mover. The entrant can anticipate the incumbent’s subsequent choices if it
were to enter. As a consequence, the entrant can expect that, if it chooses to enter, the
incumbent will respond with accommodation because this strategy yields a higher
payoff for the incumbent than the price war. Graphically, the entrant can understand
that, if it enters, the game will proceed through the shaded branch of accommodation,
ultimately yielding a payoff of 4 from entering. If, instead, the entrant stays out, its
payoff is only 0 and, therefore, the optimal strategy for the entrant is to enter. We then
say that the SPE after applying backward induction is
{Enter, Accommodate},
which indicates that the first mover (entrant) chooses to enter, and the second mover
(incumbent) responds by accommodating, entailing equilibrium payoffs of (4, 4).
Self-assessment 13.2 Repeat the analysis in example 13.2, but assume that when
the potential entrant joins the industry and the incumbent responds by accommodating,
both firms earn a payoff of only 1. Are the results in example 13.2 affected?
Interpret.
The equilibrium that results from applying backward induction is also referred to as “roll-
back equilibrium” because the backward induction procedure looks like rolling back the
game tree from its branches on the right side to the game’s root on the left side; or SPE,
because backward induction helps us find the equilibrium strategy of each player when she
is called on to move at any point along the tree.
13.4.1 Subgame Perfect Equilibrium in More Involved Games

In this section, we explore how to apply backward induction, and thus find SPEs, in games
where at least one player faces an information set, meaning that she does not observe the
moves from a previous player before she is called on to move. To apply backward induction
in this scenario, we first need to define what we mean by a “subgame” in a game tree.
Subgame A portion of the game tree that can be circled around without breaking
any information set.
Intuitively, the circle of a subgame indicates a portion of the game in which a player (e.g.,
the second mover) is called on to move, and it takes into account the information that this
player observes at that specific point of the game tree. For instance, in the Entry game of
figure 13.2, there are only two subgames: one in which we circle the part of the game tree
starting at the incumbent’s node and ending at the end of the game, and another in which we
circle the game as a whole. Other games with more involved trees may have more or fewer
subgames, as example 13.3 illustrates.
Example 13.3: Applying backward induction in more involved game trees

Consider the game tree in figure 13.3, where firm 1 acts as the first mover, choos-
ing either Up or Down. If firm 1 selects Down, the game is over, with firm 1 obtaining
a payoff of 2, while firm 2 earns a payoff of 6. However, if firm 1 chooses Up, this
firm gets to play again, choosing between A and B. Firm 2 is then asked to respond,
but without seeing whether firm 1 chose A or B. Firm 2’s uncertainty is graphically
represented by the dotted line connecting the end of the branches that it doesn’t distin-
guish, A and B. This dotted line is formally known as an “information set” for firm 2,
because this firm doesn’t know which of these two actions was chosen by firm 1.3
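Figure 13.3
A more involved game tree. [Game tree. Firm 1 chooses Up or Down. Down ends the game with
payoffs (2, 6). After Up, firm 1 chooses A or B; firm 2, whose two nodes are joined by its
information set, then chooses X or Y. Payoffs: A and X, (3, 4); A and Y, (1, 4); B and X, (2, 1);
B and Y, (2, 0).]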
Before applying backward induction to this game, a usual trick is to find all the
subgames (i.e., circling the portions of the tree that do not break any information set).
Starting from the last mover (firm 2), the smallest subgame that we can circle is one
initiated after firm 1 chooses Up, which is labeled “Subgame 1” in figure 13.4a. If we
now move to the lower part of the game tree, note that we cannot circle any part of the
tree without breaking firm 2’s information set. Circles that break firm 2’s information
set are included in figure 13.4b as a reference. As a result, the only subgame that
we can identify in this tree, besides subgame 1, is the game as a whole. You may be
wondering: “These are nice circles, but why should we care about the subgames in
a game tree?” The answer is simple: we can next apply backward induction by just
focusing on the two subgames we found.
Figure 13.4
(a) Proper subgames. (b) Not proper subgames. [Both panels repeat the game tree of figure 13.3.
Panel (a) circles Subgame 1, which starts after firm 1 chooses Up, and the game as a whole.
Panel (b) circles two portions of the tree that break firm 2’s information set, each labeled
“Not a subgame!”]
3. Firm 2 has the same available strategies when firm 1 chooses A and when it chooses B, i.e., firm 2 must select
either X or Y in both cases. If, instead, firm 2 had to choose between X and Y when firm 1 chooses A, but between
a different pair of actions, X’ and Y’, when firm 1 chooses B, firm 2 would be able to infer which action firm 1
selected by just observing its own available actions.
Subgame 1. Let us start by analyzing subgame 1. In this subgame, firm 2 does not
observe which action firm 1 chose (either A or B).4 Therefore, subgame 1 can be
represented using matrix 13.3, with firm 1 in rows and 2 in columns.
              Firm 2
              X        Y
Firm 1   A    3, 4     1, 4
         B    2, 1     2, 0
Matrix 13.3
Representing subgame 1 in matrix form.
We can now find the NE of subgame 1 by underlining best response payoffs, as
discussed in tool 12.3 and section 12.4 of the previous chapter. Matrix 13.4 reproduces
matrix 13.3, but it includes underlined best response payoffs.5 As discussed in chapter
12, the cell in which both firms’ payoffs are underlined constitutes the NE of subgame
1, (A, X), with corresponding payoffs (3, 4).
              Firm 2
              X           Y
Firm 1   A    _3_, _4_    1, _4_
         B    2, _1_      _2_, 0
Matrix 13.4
Finding the NE of subgame 1. [Underlined best response payoffs are marked here with underscores.]
4. Even if firm 2 acts a few hours (or days) after firm 1 chooses between A and B, firm 2 cannot condition its
response (i.e., whether to respond with X or Y ) on the specific action selected by firm 1. In this sense, firm 2
acts as if it were selecting its action at the same time as firm 1 chose its own, making the analysis of subgame 1
analogous to a simultaneous-move game.
5. This is a good moment to practice best responses. Recall that the underlined payoffs in matrix 13.4 illustrate
that firm 1’s best response to firm 2 selecting X (in the left column) is A because 3 > 2, and to firm 2 choosing Y
(in the right column) is B, given that 2 > 1. Similarly, firm 2’s best response to firm 1 selecting A (in the top row)
is {X , Y } because firm 2 receives a payoff of 4 in both X and Y , and its best response to firm 1 choosing B (in the
bottom row) is X because 1 > 0.
Figure 13.5
The reduced game tree from figure 13.4a. [Firm 1 chooses Up, which yields payoffs (3, 4) from
the NE (A, X) of subgame 1, or Down, which yields payoffs (2, 6).]
The game as a whole. We can now study the game as a whole. Firm 1 must choose
between Up and Down, anticipating that if it chooses Up, subgame 1 will start. From
our previous analysis, firm 1 can anticipate equilibrium behavior in subsequent stages
of the game; that is, the NE of subgame 1 is (A, X ), while the payoffs are (3, 4). Firm
1 can then simplify its decision problem to the tree depicted in figure 13.5, where
we insert the equilibrium payoffs from subgame 1, (3, 4), if firm 1 were to select
Up. Therefore, firm 1 only needs to conduct the following payoff comparison: if it
chooses Down, the game is over and its payoff is 2, whereas if it chooses Up, sub-
game 1 is initiated, obtaining a payoff of 3. Because 3 > 2, firm 1 prefers to choose
Up rather than Down, as illustrated by the thick arrow on the branch corresponding
to Up.
Summarizing, after applying backward induction, the SPE of this game is (Up,
(A, X )), which yields an equilibrium payoff of 3 for firm 1 and 4 for firm 2.
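The two-step logic of example 13.3 (solve the smallest subgame as a matrix, then roll back to the
root) can be sketched as follows. This Python snippet is an illustration only; the helper
pure_nash and the variable names are our own, and a tie at the root is resolved in favor of Down.

```python
# Subgame 1 (matrix 13.3): firm 1 picks a row, firm 2 a column.
# Each cell stores (firm 1 payoff, firm 2 payoff).
subgame = {
    ("A", "X"): (3, 4), ("A", "Y"): (1, 4),
    ("B", "X"): (2, 1), ("B", "Y"): (2, 0),
}
down_payoffs = (2, 6)  # payoffs if firm 1 chooses Down at the root


def pure_nash(game):
    """Cells in which both players' payoffs are best responses."""
    rows = sorted({r for r, _ in game})
    cols = sorted({c for _, c in game})
    return [
        (r, c) for r in rows for c in cols
        if game[(r, c)][0] == max(game[(r2, c)][0] for r2 in rows)
        and game[(r, c)][1] == max(game[(r, c2)][1] for c2 in cols)
    ]


# Step 1: solve the smallest subgame.
subgame_nes = pure_nash(subgame)

# Step 2: for each NE of the subgame, firm 1 compares Up against Down,
# inserting the subgame's equilibrium payoffs as in figure 13.5.
spes = []
for ne in subgame_nes:
    up_payoffs = subgame[ne]
    first_move = "Up" if up_payoffs[0] > down_payoffs[0] else "Down"
    spes.append((first_move, ne))
print(spes)
```

If subgame 1 had more than one NE, the loop would produce one candidate SPE for each of them,
because firm 1's choice at the root depends on which equilibrium it expects in the subgame.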
Self-assessment 13.3 Consider the game tree in example 13.3, but assume that when
firm 1 chooses Up and A, and firm 2 responds with X, their payoff becomes (5, 5)
rather than (3, 4). Find the equilibrium of the game tree, and compare your result
against that in example 13.3.
13.5 Repeated Games
Previous sections of this chapter have analyzed games where players interact only once.
These games are also known as “one-shot games” or “unrepeated games,” and they can help
us model strategic scenarios in which players do not anticipate interacting again, such as two
randomly picked individuals in a large city, like Seattle, who may not encounter each other
again. In other settings, however, agents interact several times, as in a small town like Ana-
tone, Washington, and so they face the same game repeatedly. Repeated games are common
in real life, such as Treasury bill auctions (some of which are organized monthly or weekly),
price competition between the same group of firms operating in a given industry, and pro-
duction decisions of countries participating in the Organization of the Petroleum-Exporting
Countries (OPEC) cartel. In these three examples, the set of players is unchanged from one
week to the next (or is mostly unaffected), and the game they play is also unchanged (firms
can choose among the same set of prices, and sustain similar technologies as in previous
editions of the game). An interesting feature of repeated games is that players’ interaction in
a repeated scenario can help us rationalize cooperation in contexts where such cooperation
could not be sustained if players interact only once.
Consider the following Prisoner’s Dilemma game. As discussed in chapter 12 (sec-
tion 12.5), the only NE of the game is (Confess, Confess), where both players confess,
obtaining a payoff of −4 (that is, 4 years in jail). As we highlighted, this outcome is inef-
ficient, as players could be better off if they coordinated their actions; namely, if they both
choose not to confess, each player’s payoff increases to −1 (serving only 1 year in jail). In
this section, we explore if such a cooperative outcome can be sustained when the game is
repeated (i.e., when players interact many times, playing the game reproduced in matrix
13.5 in each round).
                         Player 2
                         Confess     Not confess
Player 1   Confess       −4, −4      0, −7
           Not confess   −7, 0       −1, −1
Matrix 13.5
Another iteration of the Prisoner’s Dilemma game.
13.5.1 Finite Repetitions

Let us first consider that the game is repeated T periods, where T is a finite number (e.g.,
2 times, or 500 times, but not an infinite number of times). In this scenario, every player
chooses her action at each stage t ∈ {1, 2, …, T}, and an outcome emerges for stage t, which is
perfectly observed by both players; and then stage t + 1 starts, whereby every player chooses
her action at that stage. In a nutshell, this is a sequential-move game because every player,
when considering her move at stage t + 1, perfectly observes the past history of play by
both players from stage 1 until t. Given this history, every player responds with her choice
at stage t + 1. Fortunately, we know how to solve sequential-move games! As described in
section 13.4, we can use backward induction to solve for the SPE of the game as follows:
Period T. Starting from the last round of play at t = T, we see that every player’s strictly
dominant strategy is Confess (C), thus providing us with (C,C) as the NE of the last-stage
game.
Period T − 1. In the next-to-last stage, t = T − 1, every player can anticipate that (C,C)
will ensue if the game proceeds until stage t = T, and that both players will be choosing C
regardless of the outcome in stage T − 1. As a consequence, every player finds C a strictly
dominant strategy once more. Therefore, strategy profile (C,C) is, again, the NE of the stage
game (in this case, for stage T − 1).
Period T − 2. A similar argument applies if we move one step up, to stage T − 2, where
both players anticipate that (C,C) will be the equilibrium outcome in both subsequent stages
T − 1 and T, and choose C at the current stage, thus yielding (C,C) as the NE outcome in
this stage as well.
Continuing with this argument, we find that (C,C) is the NE of every stage t, from the
beginning of the game, at t = 1, to the last stage, t = T. Therefore, the SPE of the game
has every player choosing C at every round, regardless of the outcomes in previous rounds.
Intuitively, the existence of a terminal period makes every individual anticipate that both
players will defect during that period, and because the last stage outcome is unaffected by
previous moves, players in prior stages find no benefit from not confessing. In the next
section, we explore whether such an unfortunate result can be avoided by allowing the game
to be repeated an infinite number of times.
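The backward-induction argument above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: payoffs are taken from matrix 13.5, future payoffs are not discounted, and the names `stage_nash` and `backward_induction` are hypothetical helpers, not notation used in the chapter. The key point is that the continuation payoff is the same constant after every outcome of the current stage, so it cannot change any player's best response.

```python
# Stage-game payoffs from matrix 13.5:
# (row action, column action) -> (row payoff, column payoff).
PAYOFFS = {
    ("C", "C"): (-4, -4),
    ("C", "NC"): (0, -7),
    ("NC", "C"): (-7, 0),
    ("NC", "NC"): (-1, -1),
}
ACTIONS = ("C", "NC")


def stage_nash(continuation):
    """Pure-strategy NE of one stage when every outcome is followed by the
    same continuation payoffs (one number per player)."""
    equilibria = []
    for a1 in ACTIONS:
        for a2 in ACTIONS:
            u1 = PAYOFFS[(a1, a2)][0] + continuation[0]
            u2 = PAYOFFS[(a1, a2)][1] + continuation[1]
            # No profitable unilateral deviation for either player.
            best1 = all(u1 >= PAYOFFS[(d, a2)][0] + continuation[0] for d in ACTIONS)
            best2 = all(u2 >= PAYOFFS[(a1, d)][1] + continuation[1] for d in ACTIONS)
            if best1 and best2:
                equilibria.append((a1, a2))
    return equilibria


def backward_induction(T):
    """Solve stages T, T-1, ..., 1 (assumption: no discounting)."""
    continuation = (0, 0)  # nothing follows the last stage
    path = []
    for _ in range(T):
        (profile,) = stage_nash(continuation)  # the NE is unique in this game
        path.append(profile)
        payoff = PAYOFFS[profile]
        continuation = (payoff[0] + continuation[0], payoff[1] + continuation[1])
    return list(reversed(path)), continuation


path, totals = backward_induction(5)
print(path)    # (C, C) in each of the 5 stages
print(totals)  # (-20, -20)
```

Because the continuation payoff enters both sides of every comparison, the same stage-game equilibrium, (C,C), is found at every stage, whatever the value of T.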
13.5.2 Infinite Repetitions

Consider now an infinitely repeated Prisoner’s Dilemma game. You might wonder: “How
are we going to have the prisoners playing forever—tie them to their chairs at the police
station?” This is operationalized by assuming that, at any given moment, players continue
to play the game one more round with some probability p. Even if this probability is close
to 1, the probability that players interact for a large number of rounds drops very rapidly
(see footnote 6). However, for any finite number of rounds, the probability that players are
still interacting remains positive.
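As a quick numerical check of how fast this probability falls, consider the following sketch. It assumes that each continuation is an independent draw with the same probability p and that the game is played at least once; the function names are our own.

```python
def prob_of_n_continuations(p, n):
    """Probability that play continues n times in a row when each
    continuation occurs independently with probability p."""
    return p ** n


def expected_rounds(p):
    """Expected number of rounds when the game is played once for sure and
    then continues with probability p after every round."""
    return 1 / (1 - p)


for n in (10, 100):
    print(n, prob_of_n_continuations(0.9, n))
print(expected_rounds(0.9))  # about 10 rounds
```

With p = 0.9, ten continuations in a row occur with a probability of about 0.35, while one hundred in a row occur with a probability below 0.0001, which is tiny but still positive.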
As we know from previous sections, when the game is played once or a finite number
of times, the only equilibrium prediction is (C,C) in every single round of play. How can
we sustain cooperation if the game is played an infinite number of times? By the use of the
so-called Grim-Trigger Strategy (GTS). A standard GTS works as follows:
1. In the first period of interaction, t = 1, every player starts by cooperating (playing Not
confess, NC, in the Prisoner’s Dilemma game).
6. For instance, if the probability of interacting with one another is p = 0.9, the probability that players interact
for 10 rounds is $0.9^{10} \approx 0.35$, and the probability that they continue playing for 100 rounds is extremely small
($0.9^{100} \approx 0.00003$). Nonetheless, this probability is always positive.
2. In all subsequent periods, t > 1,

(a) Every player continues to cooperate, so long as she observes that all players
cooperated in all past periods.
(b) If, instead, she observes some past cheating in any previous round (deviating from
this GTS), then she plays C thereafter.
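The two steps of the GTS can be written as a function of the history of play. The sketch below is our own illustration: the representation of a history as a list of action pairs, the helper `play`, and the hypothetical deviator `defect_in_round_3` are all assumptions made for the example.

```python
def grim_trigger(history):
    """Action prescribed by the GTS given the history of play.

    history is a list of (action_1, action_2) pairs from past rounds, where
    "NC" = Not confess (cooperate) and "C" = Confess (defect).
    """
    if all(profile == ("NC", "NC") for profile in history):
        return "NC"  # step 1 (empty history) and step 2(a)
    return "C"       # step 2(b): play C forever after any defection


def play(strategy_1, strategy_2, rounds):
    """Let two strategies play each other for a given number of rounds."""
    history = []
    for _ in range(rounds):
        history.append((strategy_1(history), strategy_2(history)))
    return history


def defect_in_round_3(history):
    """Hypothetical deviator: follows the GTS except for confessing in round 3."""
    return "C" if len(history) == 2 else grim_trigger(history)


print(play(grim_trigger, grim_trigger, 4))
print(play(defect_in_round_3, grim_trigger, 5))
```

When both players use the GTS, (NC,NC) is played in every round. When player 1 confesses in round 3, the outcome in that round is (C,NC), and both players confess in every round afterward.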
To show that the GTS can be sustained as an SPE of the infinitely repeated game, we need
to show that every player finds the GTS optimal at every time period t (i.e., at any period
in which she wonders whether to continue with the implicit cooperative agreement that the
GTS entails, both t = 1 and any t > 1). In addition, she must find the GTS optimal after any
previous history of play, which in our case implies that it is optimal both (1) after no history
of cheating, and (2) after some cheating episode. Let us separately analyze cases (1) and (2)
next.
Example 13.4: Sustaining cooperation with a Grim-Trigger Strategy

Case (1). No cheating history. If no previous cheating occurs, the previously
described GTS dictates that every player keeps cooperating in the next period, which
yields a payoff of −1 for every player. Therefore, by sticking to the GTS, every player
obtains the following stream of discounted payoffs:
$$-1 + \delta(-1) + \delta^2(-1) + \dots,$$
where δ ∈ (0, 1) represents her discount factor. Intuitively, δ indicates how much the
individual cares about future payoffs. When δ → 1, she assigns the same weight to
future as to present payoffs; on the other hand, when δ → 0, she assigns no importance
to future payoffs (see footnote 7). Put differently, a high discount factor δ means
that the individual cares similarly about current and future payoffs (she is patient),
while a low discount factor means that she cares only about current
payoffs, essentially ignoring future payoffs (she is impatient).
Factoring out the −1 payoff yields
$$-1 + \delta(-1) + \delta^2(-1) + \dots = -1\,(1 + \delta + \delta^2 + \dots),$$
which ultimately reduces to $-1 \cdot \frac{1}{1-\delta}$ because the term in parentheses, $1 + \delta + \delta^2 + \dots$,
is an infinite geometric progression that can be simplified to $\frac{1}{1-\delta}$ (see footnote 8). If, instead, the
player cheats today (playing C while her opponent plays NC), her payoff is 0 (see footnote 9).
However, her defection is detected by the other players, who respond with C thereafter
(recall that this is the punishment prescribed in step 2b of the GTS), which yields a
payoff of −4 thereafter. As a result, her stream of discounted payoffs from cheating
becomes
$$\underbrace{0}_{\text{She cheats}} + \underbrace{\delta(-4) + \delta^2(-4) + \dots}_{\text{Punishment thereafter}} = -4(\delta + \delta^2 + \delta^3 + \dots) = -4\delta(1 + \delta + \delta^2 + \dots) = -4\,\frac{\delta}{1-\delta}.$$
Therefore, after a history of no previous cheating, every player chooses to cooperate,
obtaining $-1 \cdot \frac{1}{1-\delta}$, rather than defect, receiving $-4\,\frac{\delta}{1-\delta}$, if
$$-1 \cdot \frac{1}{1-\delta} \geq -4\,\frac{\delta}{1-\delta}.$$
A common trick to simplify this inequality is to multiply both sides by the denominator
(1 − δ), which is positive because δ < 1. This yields −1 ≥ −4δ, ultimately reducing the expression to
$$\delta \geq \frac{1}{4}.$$
7. For simplicity, we assume that all players have the same discount factor. The results will not qualitatively change
if we allow different discount factors for each player, and one of the end-of-chapter exercises asks you to revisit
the infinitely repeated Prisoner’s Dilemma game, considering a discount factor δ1 for the row player and δ2 for the
column player.
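The comparison in case 1 can be verified numerically. The following sketch uses the payoffs of matrix 13.5 as defaults; the parameter names `reward`, `temptation`, and `punishment` are our own labels for the payoffs −1, 0, and −4, and the closed form in `critical_delta` simply rearranges the inequality above under the assumption that temptation > reward > punishment.

```python
from fractions import Fraction


def cooperation_value(delta, reward=-1):
    """Discounted payoff from cooperating forever: reward / (1 - delta)."""
    return reward / (1 - delta)


def deviation_value(delta, temptation=0, punishment=-4):
    """Cheat today, then receive the punishment payoff in every later period."""
    return temptation + delta * punishment / (1 - delta)


def critical_delta(reward=-1, temptation=0, punishment=-4):
    """Smallest delta at which cooperating is at least as good as cheating.

    Rearranging reward/(1 - d) >= temptation + d * punishment/(1 - d) gives
    d >= (temptation - reward) / (temptation - punishment).
    """
    return Fraction(temptation - reward, temptation - punishment)


print(critical_delta())  # 1/4
for delta in (0.1, 0.25, 0.5):
    print(delta, cooperation_value(delta), deviation_value(delta))
```

At δ = 0.1 cheating yields the higher discounted payoff, at δ = 0.25 the two payoffs coincide, and at δ = 0.5 cooperation is strictly better, in line with the condition δ ≥ 1/4.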
Case (2). Some cheating history. If some (or all) of the players cheat in a previous
period t − 1, then the GTS prescribes that every player should play C thereafter,
yielding a stream of discounted payoffs:
$$-4 + \delta(-4) + \delta^2(-4) + \dots = -4(1 + \delta + \delta^2 + \dots) = -4\,\frac{1}{1-\delta}.$$
8. Indeed, recall that the infinite sum $1 + \delta + \delta^2 + \dots$ can be expressed as $\delta^0 + \delta^1 + \delta^2 + \dots$ or, more compactly,
as $\sum_{t=0}^{\infty} \delta^t$. This is an infinite geometric series that can be written as $\frac{1}{1-\delta}$. We make extensive use of this compact
expression in this chapter.
9. Note that, to check if the GTS is optimal for every player, we must maintain all other players selecting the GTS
while she is the only player deviating. That is, we test for unilateral deviations. In this context, this means that all
other players keep cooperating (choosing NC, as prescribed by the GTS because there was no previous cheating),
while the player that we consider here deviates to C.
If, instead, a player deviates from such a punishment scheme (playing NC while her
opponent chooses C), her stream of discounted payoffs becomes
$$\underbrace{-7}_{\text{She deviates}} + \underbrace{\delta(-4) + \delta^2(-4) + \dots}_{\text{Punishment thereafter}}.$$
Intuitively, her payoff is extremely low when she plays NC while her opponent confesses,
−7, and the earlier defection still triggers an infinite punishment by all players, as
prescribed by the GTS, yielding a payoff of −4 thereafter (see footnote 10). This stream of payoffs
reduces to
$$-7 + (-4)(\delta + \delta^2 + \delta^3 + \dots) = -7 - 4\delta(1 + \delta + \delta^2 + \dots) = -7 - 4\,\frac{\delta}{1-\delta}.$$
Comparing these results, we can say that, upon observing a defection to C, every
player prefers to stick to the GTS rather than deviating if
$$-4\,\frac{1}{1-\delta} \geq -7 - 4\,\frac{\delta}{1-\delta}.$$
Multiplying both sides by (1 − δ) yields −4 ≥ −7(1 − δ) − 4δ = −7 + 3δ,
which simplifies to δ ≤ 1, thus holding for all values of δ. This result is not surprising:
if your opponent will play C thereafter, you don’t have any incentive to unilaterally
deviate toward NC, even for one period. If you do, your payoff during the period
or periods that you deviate will be lower than if you didn’t, and when you start playing
C again, your payoff will be the same as that derived from playing C during all periods.
10. The GTS then triggers an infinite punishment by all the players if any deviated from cooperating in any previous
period. In this case, a deviation to C was detected in a prior period, and one of the players, rather than responding
with C thereafter (as prescribed by the GTS), foolishly selects NC while her opponent chooses C, yielding outcome
(NC,C) during that period. In all subsequent rounds, players observe that at least one player previously selected C,
which triggers the infinite punishment again.
Summary. Overall, we found that the only condition we require for cooperation to
be sustained as an equilibrium of this infinitely repeated game (i.e., for the GTS to
be an SPE of the game) is the one found in case 1 (namely, δ ≥ 1/4). This condition
states that players cooperate in every single round of the game, so long as they assign a
sufficiently high weight to future payoffs.
Figure 13.6 illustrates the trade-off that every player faces when, upon observing that no
player defected in previous rounds, she must choose whether to continue cooperating or to
cheat, as analyzed in case 1. If a player cooperates, her payoff remains −1 in all subsequent
periods. While this payoff looks good, there is another attractive option out there: if she
cheats today, her payoff increases from −1 to 0 today. However, her defection is thereafter
punished by her opponent, yielding a payoff drop from 0 to −4 in all subsequent periods.
Graphically, the instantaneous gain from cheating today is represented by the left square,
whereas the future loss from cheating is illustrated with the right rectangle.
[Figure 13.6: Incentives to cheat in repeated games. Horizontal axis: time periods (t, t + 1, t + 2, ...). Vertical
axis: payoff, with levels 0, −1 (payoff from cooperating), and −4. A square at period t marks the instantaneous
gain from cheating, and a rectangle from period t + 1 onward marks the future payoff loss from cheating.]
Figure 13.6 also helps us understand in which contexts cooperation can more easily occur.
For instance, if the instantaneous gain from cheating decreases, the incentives to cheat also
decrease. This may occur when the payoff from cheating only increases from −1 to −0.5
(the shallow square on the graph), or because cheating is immediately detected rather than
requiring several periods to be detected by other players (the narrow square).
Lastly, it is important to note that we can design variations of the Grim-Trigger Strategy
that still help us sustain cooperation in the infinitely repeated game. A common variation
is to consider a temporary reversion to the NE of the unrepeated game, (C,C), rather than
the permanent reversion assumed previously. For instance, the GTS could prescribe that,
upon cheating, every player chooses C during N rounds (e.g., 3 periods) but returns to
cooperation once the punishment has been inflicted (i.e., after (C,C) has been played for
N rounds). One of the end-of-chapter exercises asks you to revisit the Prisoner’s Dilemma
game, finding under which conditions you can sustain cooperation under a GTS that tem-
porarily punishes defections for only 3 periods. As you probably suspect, cooperation can
be sustained under more restrictive conditions on discount factor δ when players tem-
porarily punish defections than when they permanently do. Graphically, the future payoff
loss from cheating (depicted in the right rectangle in figure 13.6) is narrower because the
punishment phase lasts only 3 periods. Intuitively, a temporary punishment following a devi-
ation becomes less threatening than a permanent punishment, thus making defection more
attractive.
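To see how the length of the punishment affects the minimal discount factor, the following sketch computes it numerically for a punishment lasting N rounds. It rests on assumptions of our own: payoffs come from matrix 13.5, both players return to cooperation once the N punishment rounds are over (so the two payoff streams differ only during N + 1 periods), and only the case 1 comparison is checked, since playing C during the punishment phase is the NE of the stage game. The function names and the bisection routine are illustrative.

```python
def gain_from_cooperating(delta, n_punish, reward=-1, temptation=0, punishment=-4):
    """Discounted payoff from cooperating minus that from cheating today,
    computed over the n_punish + 1 periods in which the two paths differ."""
    cooperate = sum(reward * delta ** t for t in range(n_punish + 1))
    cheat = temptation + sum(punishment * delta ** t for t in range(1, n_punish + 1))
    return cooperate - cheat


def critical_delta_temporary(n_punish, tol=1e-10):
    """Minimal delta sustaining cooperation, found by bisection.

    The gain from cooperating is increasing in delta, negative at delta = 0,
    and positive at delta = 1 for n_punish >= 1, so bisection applies.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gain_from_cooperating(mid, n_punish) >= 0:
            hi = mid
        else:
            lo = mid
    return hi


for n in (1, 3, 200):
    print(n, round(critical_delta_temporary(n), 4))
```

With a one-round punishment, the condition reduces to 3δ ≥ 1, so the threshold is 1/3. As the punishment lengthens, the threshold falls toward the value of 1/4 found under permanent reversion, which matches the intuition that shorter punishments are less threatening.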
Self-assessment 13.4 Repeat the analysis in example 13.4 but use the follow-
ing game: Suppose that two firms could choose to Cooperate or Compete with one
another. If both firms choose Cooperate, they both receive a payoff of 5. If one firm
chooses Cooperate but the other firm chooses Compete, the firm that chooses Coop-
erate receives a payoff of 0, while the firm that chooses Compete receives a payoff
of 7. Lastly, if both firms choose Compete, they both receive a payoff of 1. For
which minimal discount factor δ do firms choose Cooperate? What if, when one
firm chooses Cooperate and the other firm chooses Compete, the firm that chooses
Compete receives a payoff of 10 instead? Interpret.
13.6 A Look at Behavioral Economics—Cooperation in the Experimental Lab?
As suggested in chapter 12, the Prisoner’s Dilemma game clearly illustrates the tension
between individual and group incentives often seen in real life. As a consequence, this game
has been recurrently tested in experimental labs over several decades, in its unrepeated and
repeated versions, along with many variations in the experimental designs. Individuals par-
ticipating in the experiment (often college students) are asked to sit at computer terminals
where they are informed about the rules of the game, are allowed to ask questions, and even
practice for a trial run of the game, before they start playing the game on their computers. In
finitely repeated games (such as those repeated two or four times), experiments found that
in the last round of interactions, players behave as if they were in an unrepeated (one-shot)
game, but in the first rounds, players are more likely to cooperate. This behavior contradicts
the theoretical prediction discussed in section 13.5.1, where players defect in all rounds of
interaction when playing the finitely repeated Prisoner’s Dilemma game.
What about the infinitely repeated version of the game? Because an infinitely repeated
game cannot actually be played, individuals participating in the experiment were informed
that they would play one more round of the game with some probability (e.g., p = 80 percent;
see footnote 11). Overall, the literature found that, in this scenario, players are more likely to
cooperate when there is a higher probability that they will interact in future rounds (e.g.,
p increases from 80 percent to 90 percent). This result is consistent with our previous findings
of cooperation being easier to sustain when players care more about the future (i.e.,
a higher probability p plays the same role as a higher discount factor δ in supporting
cooperation). However, when players interact during many rounds, they start defecting more
frequently. Anticipating that they may not interact in the future (because the probability that
they will encounter each other again declines rapidly), they try to reap the gains from a unilateral
defection in one of the last rounds of play. For more details and references, see Duffy
and Ochs (2009) and Dal Bó and Fréchette (2011).
11. As discussed in section 13.5.2, the probability that players interact in a future round declines rapidly. For
instance, the probability that they interact for 10 periods is $0.8^{10} \approx 0.1$, while that of interacting for 50 periods
decreases to $0.8^{50} \approx 0.00001$.
Exercises
1. Backward induction–I.A Find the SPE of the extensive-form game depicted in figure 13.7. Player
1’s payoff is the top number, whereas player 2’s payoff is the bottom number.
2. Backward induction–II.A Find the SPE of the extensive-form game depicted in figure 13.8. Player
1’s payoff is the top number, player 2’s payoff is the middle number, and player 3’s payoff is the
bottom number.
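[Figure 13.7: Backward induction–I. Game tree. Player 1 chooses L or R. After L, player 2 chooses A,
with payoffs (3, 6), or B, with payoffs (4, 2). After R, player 2 chooses C, with payoffs (4, 1), or D,
with payoffs (5, 5). Payoffs are listed as (player 1, player 2).]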
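[Figure 13.8: Backward induction–II. Game tree. Player 1 chooses L or R. After L, player 2 chooses A or B;
after R, player 2 chooses C or D. Player 3 then chooses S or T after A, U or V after B, W or X after C, and
Y or Z after D. Terminal payoffs, listed as (player 1, player 2, player 3): S (1, 4, 3); T (4, 5, 2);
U (7, 2, 1); V (3, 8, 5); W (2, 6, 7); X (8, 1, 4); Y (6, 3, 8); Z (5, 7, 6).]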
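[Figure 13.9: Backward induction–III. Game tree. Player 1 chooses L or R; player 2 then chooses among
actions A, B, C, D, and E; at two of the resulting nodes, player 1 moves again, choosing among V, W, and X
at one node and between Y and Z at the other. Terminal payoffs, listed as (player 1, player 2): (7, 1),
(2, 7), and (3, 4) after player 2's moves that end the game; V (1, 6); W (4, 5); X (5, 2); Y (6, 8); Z (8, 3).]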
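[Figure 13.10: Backward induction–IV. Game tree. Player 1 chooses L or R. After L, player 2 chooses A or B;
after R, player 2 chooses C or D. Player 1 then moves again, with actions U, V, W, X, Y, Z, Y, Z from left
to right. Terminal payoffs, listed as (player 1, player 2) from left to right: U (1, 4); V (4, 5); W (7, 2);
X (3, 8); Y (7, 2); Z (8, 1); Y (6, 3); Z (5, 7).]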
3. Backward induction–III.B Find the SPE of the extensive-form game depicted in figure 13.9.
Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.
4. Backward induction–IV.B Find the SPE of the extensive-form game depicted in figure 13.10.
Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.
5. Backward induction–V.B Find the SPE of the extensive-form game in figure 13.11. Player 1’s
payoff is the top number, whereas player 2’s payoff is the bottom number.
[Figure 13.11: Backward induction–V. Game tree. Player 1 chooses L or R. After L, player 2 chooses A or B;
after R, player 2 chooses C or D. Player 1 then moves again, with actions W, X, W, X, Y, Z, Y, Z from left
to right. Terminal payoffs, listed as (player 1, player 2) from left to right: W (1, 4); X (4, 8); W (7, 2);
X (3, 5); Y (7, 2); Z (8, 1); Y (6, 3); Z (5, 7).]
6. Prisoner’s Dilemma.A Consider the Prisoner’s Dilemma game in example 12.5 from chapter 12.
(a) Suppose now that player 1 chooses whether to stay silent or confess first, then player 2 observes
player 1’s choice and responds by staying silent or confessing. Find the SPE of this game.
(b) Suppose now that player 2 chooses whether to stay silent or confess first, then player 1 observes
player 2’s choice and responds by staying silent or confessing. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version
of this game. If they are similar, explain why. If they are different, provide an explanation for
any differences.
7. Battle of the Sexes game.A Consider the Battle of the Sexes game in example 12.6 from
chapter 12.
(a) Suppose now that Felix chooses where he goes first, and then Ana observes Felix’s choice and
decides where to go afterward. Find the SPE of this game.
(b) Suppose now that Ana chooses where she goes first, and then Felix observes Ana’s choice and
decides where to go afterward. Find the SPE of this game.
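(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version
of this game. If they are similar, explain why. If they are different, provide an explanation for
any differences.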
8. Coordination game.A Consider the Coordination game in example 12.7 from chapter 12.
(a) Suppose now that depositor 1 chooses whether to withdraw or not withdraw first, and then
depositor 2 observes depositor 1’s choice and decides whether to withdraw or not. Find the
SPE of this game.
(b) Suppose now that depositor 2 chooses whether to withdraw or not withdraw first, and then
depositor 1 observes depositor 2’s choice and decides whether to withdraw or not. Find the
SPE of this game.
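(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version
of this game. If they are similar, explain why. If they are different, provide an explanation for
any differences.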
9. Anticoordination game.A Consider the Anticoordination game in example 12.8 from chapter 12.
(a) Suppose now that player 1 chooses whether to swerve or stay first, and then player 2 observes
player 1’s choice and responds by staying or swerving. Find the SPE of this game.
(b) Suppose now that player 2 chooses whether to swerve or stay first, and then player 1 observes
player 2’s choice and responds by staying or swerving. Find the SPE of this game.
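(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version
of this game. If they are similar, explain why. If they are different, provide an explanation for
any differences.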
10. Penalty kicks.B Consider the penalty kicks scenario in example 12.9 from chapter 12.
(a) Suppose now that the goalie chooses whether to dive left or dive right first, and then the
kicker observes the goalie’s choice and responds by aiming left or right. Find the SPE of this
game.
(b) Suppose now that the kicker chooses whether to aim left or aim right first, and then the goalie
observes the kicker’s choice and responds by diving left or right. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move ver-
sion of this game. If they are similar, explain why. If they are different, provide an explanation
for any differences.
11. Complementary pricing.B Consider a situation where two firms of complementary goods are
simultaneously deciding whether to price high or low. While not direct competitors in their respec-
tive markets, they know that if one firm prices high while the other prices low, that is not beneficial
to either firm. The normal-form representation is presented here where the payoffs (in dollars)
denote the profits for each firm:
| Firm 1 \ Firm 2 | High | Low |
|---|---|---|
| High | 90, 75 | 35, 40 |
| Low | 45, 30 | 60, 60 |
(a) Find all pure-strategy Nash equilibria of this game. Do the firms know how they should price?
(b) Suppose now that firm 1 sets its price first, and then firm 2 observes the price and responds.
Depict the extensive form of this game and find the SPE.
(c) Suppose now that firm 2 sets its price first, and then firm 1 observes the price and responds.
Depict the extensive form of this game and find the SPE.
(d) Compare the results from parts (a), (b), and (c). If they are similar, explain why. If they are
different, provide an explanation for any differences.
12. Holdout game.B Consider a situation where two individuals (players 1 and 2) have neglected to
purchase their tickets to the latest blockbuster movie in advance. They both arrive at the movie
theater at exactly the same time, and once they reach the front of the line, there is only one ticket
left. The cashier decides that whoever waits out the other player will receive the ticket. Receiving
the ticket is worth $100 to either player (see footnote 12). The game proceeds as follows:
• Player 1 chooses whether to remain in the line or leave. If she leaves the line, player 2 receives
the ticket. If she remains in the line, both players pay a cost of $10 (they both also neglected to
use the bathroom beforehand, and it’s getting uncomfortable), and then player 2 gets to act.
• Player 2 responds choosing whether to remain in the line or leave. If she leaves the line, player
1 receives the ticket. If she remains in the line, both players pay a cost of $10, and then the
game repeats.
• After both players have had two opportunities to remain or leave, the game ends, and a player
is chosen randomly to receive the ticket (each with 50 percent probability). This corresponds
with a payoff of $10 for both players.
(a) Depict the extensive form of this game.
(b) Find the SPE of this game.
(c) Suppose that player 2 knew that the cashier was friends with player 1, and that after two
rounds of play, he would just choose to sell the ticket to his friend. Find the SPE of this game.
12. This game, which is also known as the “War of Attrition,” helps us understand firm exits in declining industries
that exhibit enough demand to support only one firm. Each firm decides whether to stay or exit, suffering costs
(e.g., losses) in each period. If only one firm remains, then demand is sufficient for this firm to earn an economic
profit.
13. Centipede game.C Consider a game where two players take turns deciding whether to take a pile
of money. At the start of the game, there are two piles of money, one “large” pile, with $5,
and one “small” pile, with $1. Player 1 gets to choose whether to take the large pile or leave both
piles alone.
• If player 1 takes the large pile, he receives $5 as his payoff, and player 2 receives the small pile,
$1 as his payoff, and the game ends.

• If player 1 leaves both piles alone, both the large and the small piles double ($10, and $2,
respectively).
• Player 2 then decides whether to take the large pile or leave both piles alone.
• Every time a player leaves both piles alone, both piles double in size.
• Players take turns deciding whether to take the large pile or leave the piles alone until each
player has had three opportunities to take the large pile (six rounds total).
• If, at the end of the game, neither player has taken the large pile, player 1 is awarded the large
pile (which will contain $320 at this point) and player 2 is awarded the small pile (which will
contain $64).
(a) Depict the extensive form of this game.
(b) Find the SPE of this game.
(c) Suppose that, if at the end of the game no player ever takes the large pile of money, both
players are awarded an equal share of both the piles of money, rather than player 1 being
awarded the large pile as in previous parts of the exercise. Find the SPE of this game.
14. Seven bean game.C Consider a situation where there are seven beans sitting on a table. Two
players take turns taking either one or two beans off of the table, with player 1 going first. Whoever
takes the last bean off the table loses the game and must pay $1 to the winner.
(a) If both players act optimally, who wins the game? What is the winning player’s optimal strat-
egy? (Hint: Try this game with fewer beans on the table first, and work your way up to seven
beans.)
(b) Suppose instead that there are nine beans on the table. If both players act optimally, who wins
the game? What is the winning player’s optimal strategy?
15. Repeated dilemmas.B Repeat the infinitely repeated Prisoner’s Dilemma game of this chapter,
but assume different discount factors for each player, δ1 for the row player and δ2 for the column
player.
16. Collusion.B Consider a situation where two identical firms are simultaneously deciding whether
to price high or price low. If both firms price high, they both receive half the market’s profits,
$\frac{\pi}{2}$. If one firm prices high while the other firm prices low, the low-pricing firm receives all the
market’s profits, $\pi$, while the high-pricing firm receives 0 in profits. If both firms price low, they
both receive 0 in profits.
(a) Depict the normal-form representation of this simultaneous move game. Find all pure strategy
Nash equilibria.
(b) Suppose now that the firms decided to collude to charge a high price. For what minimal
discount factor δ do the firms cooperate by charging a high price?
17. Alternative triggers.B Consider the following normal-form game:

| Player 1 \ Player 2 | H | M | L |
|---|---|---|---|
| H | 100, 100 | 30, 125 | 20, 80 |
| M | 125, 30 | 60, 60 | 30, 50 |
| L | 80, 20 | 50, 30 | 40, 40 |
(a) Find all pure strategy Nash equilibria of this game.

(b) Suppose that each player implements the following GTS: “Choose H in the first round. In
every other round, if both players chose H in all previous rounds, choose H again. Otherwise,
choose M forever after.” For what minimal discount factor δ do the players cooperate by
choosing H?
(c) Suppose that each player implements the following GTS: “Choose H in the first round. In
every other round, if both players chose H in all previous rounds, choose H again. Other-
wise, choose L forever after.” For what minimal discount factor δ do the players cooperate by
choosing H?
(d) Compare the results from parts (b) and (c). If they are similar, explain why. If they are
different, provide an explanation for any differences.
18. Carrot and stick.B Consider the following normal-form game:

| Player 1 \ Player 2 | H | M | L |
|---|---|---|---|
| H | 100, 100 | 30, 125 | 20, 80 |
| M | 125, 30 | 80, 80 | 30, 50 |
| L | 80, 20 | 50, 30 | 40, 40 |
(a) Find all pure strategy Nash equilibria of this game.

(b) Suppose this game is played twice. Consider the following strategy: “Choose H in the first
round. If both players chose H in the first round, choose M. Otherwise, choose L.” For what
minimal discount factor δ do the players cooperate by choosing H?
19. Temporary punishments.C Repeat the infinitely repeated Prisoner’s Dilemma game of example
13.4, finding under which conditions you can sustain cooperation under a GTS that temporarily
punishes defections for only two periods. Compare your results with those in the chapter (where
the GTS has a permanent reversion to NE).
20. Imperfect monitoring.C Repeat example 13.4, but assume imperfect monitoring. Specifically,
if player j cooperates, the probability that player i will observe j defecting is zero; but if player j
defects, the probability that i will observe j defecting is p. When p = 1, the game exhibits perfect
monitoring, but when p < 1, monitoring is imperfect.
21. Punishment size.B Consider the situation in exercise 13.19. While it is harder to sustain cooper-
ation when the punishment is temporary (i.e., higher δ required), it might be more beneficial to
have only small punishments. Why?
22. Rock, Paper, Scissors–I.A Consider the Rock, Paper, Scissors game from exercise 12.15 in
chapter 12. Suppose instead that the game is played sequentially. Would you rather move first
or second? Why?
23. Rock, Paper, Scissors–II.B Consider the results of exercises 12.15 and 13.22. A friend of yours
approaches you and states that he has found the best strategy for Rock, Paper, Scissors. He claims
that he should always pick a random choice for the first round of the game. If he wins, he should
stay with his choice; but if he loses, rotate to whatever item beat him previously (e.g., if he lost
while playing “Paper,” he switches to “Rock”). How would you respond to your friend? Can you
describe a strategy that could beat him almost all the time?
14 Imperfect Competition
14.1 Introduction
In this chapter, we return to our analysis of market structure, which we initiated with
two extreme types of industries (perfect competition, where infinitely many firms oper-
ate, as described in chapter 9; and monopolies, where only one firm operates, as discussed
in chapter 10). In particular, we seek to understand output and pricing decisions in less
extreme industries, where a small number of firms operate. We refer to such industries as
“oligopolies.” Examples of these types of markets are (1) smartphones, where Samsung,
Apple, and Huawei capture almost 50 percent of the industry; (2) airlines competing on
the same route, where we rarely see more than three or four carriers; or (3) light bulbs,
where Philips Lighting, Osram, and Acuity Brands capture a large market share. Because
every firm’s decision affects its rivals’ profits, we can deploy our tools from game theory
(discussed in chapters 12 and 13) to characterize how firms will behave when they compete
with each other in this setting.
We start by describing how to measure market power with a single index and how to
interpret it in terms of market concentration. We then examine firms’ interactions in models
where every firm simultaneously chooses its strategy (either its output or its price). We also
apply the tools from repeated games to investigate firms’ output choices when they interact
with one another a finite or infinite number of times. In particular, we identify under which
conditions firms can cooperate with each other (i.e., when they can form a cartel and sustain
it through time). Understanding firms’ incentives to collude in cartels has important policy
implications for antitrust authorities, as they can better design regulations that will reduce
collusion incentives and promote competition.
We then move on to scenarios in which firms compete sequentially (e.g., an industry
leader choosing its output first, and the follower responding with its own output decision).
The last section of the chapter examines firms selling products that are regarded as not
being identical by customers (i.e., products are differentiated in some dimension such as
taste, e.g., Mac and PCs, iPhones and Samsung smartphones).1 In this context, we show
1. Generally, we distinguish whether products are differentiated horizontally or vertically. The former indicates that
some consumers prefer one of the goods because its features are closer to their ideal, whereas the latter represents
356 Chapter 14
Table 14.1
Summary of market structures.
Industry Number of firms Type of good Price-takers? Entry barriers?
Perfect competition Many Homogeneous Yes No

Monopoly One No close substitutes No Yes
Oligopoly Some Homogeneous or heterogeneous No Yes
that firms’ competition is attenuated because, intuitively, customers do not fully switch their
purchasing decisions if one of the firms slightly undercuts its rivals’ price, given their relative
preference for one brand over another.
Table 14.1 summarizes the different industries encountered in previous chapters—perfect
competition (chapter 9) and monopoly (chapter 10)—and the market structures we study in
this chapter (oligopoly), describing how they differ in terms of (1) the number of firms in
the industry, (2) the type of good they sell, (3) whether firms are price-takers or not, and
(4) the presence of entry barriers.
14.2 Measuring Market Power
A common measure of market power is the number of firms in an industry, N 1. Most

people agree that a market with N = 3 is probably less competitive than one with N = 1, 000.
However, such a measure is relatively vague, as it does not inform us about the market share
that each firm sustains. For instance, this measure would evaluate two industries, A and B,
with the same number of firms (e.g., N = 3) as being equivalent. A closer look, however,
could reveal that in industry A, one of the firms enjoys a 98 percent market share, while the
other two firms only have 1 percent each; on the other hand, in industry B, market share
is evenly distributed across firms (each firm holds 33.3 percent of the market share). To
avoid this problem, the Herfindahl-Hirschman index (HHI) accounts for both the number
of firms and their market shares.
Herfindahl-Hirschman index (HHI) of market concentration This index is given by
\[ \mathrm{HHI} = (s_1)^2 + (s_2)^2 + \dots + (s_N)^2, \]
where \(s_1\) represents the market share of firm 1 (in percentage), \(s_2\) is that of firm 2, and
similarly for all remaining firms in the industry, up to firm \(N\).
markets where all customers regard one good as superior in terms of quality. This chapter does not cover vertical
differentiation; for a detailed presentation, see section 5.3.1 in Belleflamme and Peitz (2015).
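To understand the HHI, it is useful to consider extreme market structures. In a monopoly, a
single firm captures the entire market share, implying that \(s_1 = 100\) percent, which produces
an HHI of \(\mathrm{HHI} = (100)^2 = 10{,}000\). Similarly, in a duopoly with two firms evenly sharing
the market, the HHI decreases to \(\mathrm{HHI} = (50)^2 + (50)^2 = 5{,}000\). In an oligopoly with 1,000
firms, each capturing \(\frac{1}{1{,}000}\) of the market share, the HHI further decreases. Expressing shares
as fractions rather than percentages, we obtain
\[ \mathrm{HHI} = \left(\frac{1}{1{,}000}\right)^2 + \left(\frac{1}{1{,}000}\right)^2 + \dots + \left(\frac{1}{1{,}000}\right)^2 = 1{,}000 \left(\frac{1}{1{,}000}\right)^2 = 0.001, \]
which corresponds to \(1{,}000 \times (0.1)^2 = 10\) when shares are measured in percent, as in the
monopoly and duopoly cases.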
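Generally, in an industry with N ≥ 1 firms, all of them evenly sharing the market, and thus
entailing a market share of \(s_i = \frac{1}{N}\) for every firm i (expressed as a fraction), the HHI is given by
\[ \mathrm{HHI} = \left(\frac{1}{N}\right)^2 + \left(\frac{1}{N}\right)^2 + \dots + \left(\frac{1}{N}\right)^2 = N \left(\frac{1}{N}\right)^2 = \frac{1}{N}, \]
or \(\frac{10{,}000}{N}\) when shares are measured in percent. In both cases, the index converges to zero
as the number of firms, N, grows large.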
In summary, a high HHI arises in highly concentrated industries, which can occur because
a single firm captures all market share (as in the monopoly example given previously) or
because a few firms sustain most market power, despite several firms being present. In con-
trast, a low HHI emerges when market power is more evenly distributed. As a consequence,
the HHI ranges from 10,000 to 0. As a reference, the US light bulb market, with around
fifty-seven firms, has an HHI of 2,757. This value indicates that some of these firms enjoy a
large market share. In contrast, glass container manufacturing, despite having only twenty-two
firms, exhibits a lower HHI of 2,582. This value suggests that market shares are more
evenly split among firms (i.e., the market is less concentrated).2
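The HHI calculations in this section can be reproduced with a few lines of Python. The sketch below is only an illustration: the function name hhi and the share vectors are our own choices, with shares measured in percent as in the definition above. It evaluates the monopoly and duopoly cases, together with industries A and B (both with N = 3 firms) discussed at the start of the section.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman index for market shares expressed in percent."""
    total = sum(shares_percent)
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"shares must add up to 100 percent, got {total}")
    return sum(s ** 2 for s in shares_percent)


# Extreme market structures discussed in the text.
print(hhi([100]))          # monopoly
print(hhi([50, 50]))       # duopoly with evenly split market

# Industries A and B both have N = 3 firms, yet very different concentration.
industry_a = hhi([98, 1, 1])
industry_b = hhi([100 / 3] * 3)
print(industry_a, round(industry_b, 1))
```

Industry A obtains an index close to the monopoly value, while industry B obtains roughly one-third of it, which is the distinction that a simple count of firms cannot capture.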
14.3 Models of Imperfect Competition
Consider a market with N ≥ 2 firms, all of them selling a relatively homogeneous product
(i.e., a good with close attributes).3 In this scenario, we will consider three models of firm
competition: (1) the Cournot model of simultaneous quantity competition; (2) the Bertrand
model of simultaneous price competition; and (3) the Stackelberg model of sequential
quantity competition.4 In later sections, we explore how the results in these models are
affected when firms sell differentiated products.
2. For information about other industries, visit the concentration ratios posted on the US Census website at
http://www.census.gov/epcd/www.concentration.html.
3. A common example is that of brands of unflavored mineral water, where fewer than eight brands compete in
most US stores; or that of salt, where Morton Salt, Cargill, and IMC are the main players, all of which sell an
extremely similar product. In both markets, firms seek to differentiate their products by adding flavors or touting
their health properties (which are often difficult for the buyer to confirm). As we discuss later in this chapter, firms’
profits increase when their products are regarded as differentiated from those of their rivals.
14.3.1 Cournot Model—Simultaneous Quantity Competition

Let us consider an industry with N = 2 firms selling a homogeneous product. (The appendix
at the end of this chapter analyzes how the results are affected when we allow for N ≥ 2
firms.) In this model, every firm independently and simultaneously chooses its
profit-maximizing output (q1 for firm 1 and q2 for firm 2). The market price is then determined
by inserting output levels q1 and q2 into the inverse demand function p(q1, q2). For
simplicity, assume that this function is linear, p(q1, q2) = a − b(q1 + q2), where a, b > 0 are
positive constants.5 Firm 1’s total cost (TC) function is TC1(q1) = cq1, where c > 0 is a
positive parameter representing firm 1’s marginal cost of production. Firm 2’s TC function is
symmetric, TC2(q2) = cq2. (Our analysis later considers how our results would change if firms
were asymmetric in their costs; that is, if one firm is more efficient than the other in producing
the good.)
Firm 1. Let us start by considering firm 1’s profit maximization problem (PMP). In
particular, the firm chooses its output \(q_1\) to solve
\[ \max_{q_1} \ \pi_1 = TR_1 - TC_1 = \underbrace{p(q_1, q_2) q_1}_{TR_1} - \underbrace{c q_1}_{TC_1} = [a - b(q_1 + q_2)] q_1 - c q_1, \]
where \(TR_1 = p(q_1, q_2) q_1\) denotes total revenue (price times units sold) and \(TC_1 = c q_1\) is its
total cost. To maximize its profits, firm 1 differentiates this expression with respect to its
output level, \(q_1\), to obtain
\[ \frac{\partial \pi_1}{\partial q_1} = a - 2bq_1 - bq_2 - c = 0. \]
Rearranging this, we get \(a - c - bq_2 = 2bq_1\), and solving for \(q_1\) yields
\[ q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2} q_2, \tag{$\mathrm{BRF}_1$} \]
which is referred to as firm 1’s “best response function.” This function describes the
profit-maximizing output that firm 1 chooses as a response to each of the output levels that
firm 2 selects. For instance, if a = 10, b = 1, and c = 2, firm 1’s best response function
becomes \(q_1(q_2) = \frac{10-2}{2} - \frac{1}{2} q_2 = 4 - \frac{1}{2} q_2\). If firm 2 produces \(q_2 = 3\) units, firm 1 responds
with \(q_1(3) = 4 - \frac{1}{2}(3) = 2.5\) units.
4. One of the exercises at the end of the chapter examines the Stackelberg model of sequential price, rather than
quantity, competition.
5. Example 14.1 considers p(q1, q2) = 12 − q1 − q2. In that scenario, if firm 1 produces q1 = 5 units, and firm 2
produces q2 = 4 units, the price becomes p(5, 4) = 12 − 5 − 4 = $3.
(Figure: \(q_1\) on the vertical axis and \(q_2\) on the horizontal axis; the line \(q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2} q_2\) has vertical intercept \(\frac{a-c}{2b}\) and horizontal intercept \(\frac{a-c}{b}\).)
Figure 14.1
Firm 1’s best response function.
Figure 14.1 depicts firm 1’s best response function, which originates at a height of \(\frac{a-c}{2b}\)
units on the vertical axis when firm 2 does not produce at all, but decreases with a slope
of −1/2 for every unit of firm 2’s output. In addition, when \(q_2 = \frac{a-c}{b}\), firm 1 optimally
responds with \(q_1\left(\frac{a-c}{b}\right) = \frac{a-c}{2b} - \frac{1}{2} \frac{a-c}{b} = 0\) units. This outcome extends to all output levels
\(q_2 \geq \frac{a-c}{b}\). Intuitively, as firm 2 increases its output \(q_2\), firm 1 is left with a smaller residual
demand to serve (i.e., fewer customers). When firm 2’s output is massive, exceeding \(\frac{a-c}{b}\),
firm 1’s profit-maximizing decision is to shut down, producing \(q_1 = 0\), rather than selling
units of \(q_1\) at a loss. This is illustrated in figure 14.1 by the flat segment of firm 1’s best
response function, which overlaps the horizontal axis, where firm 1’s output is zero (\(q_1 = 0\))
for all \(q_2 \geq \frac{a-c}{b}\).
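Firm 1’s best response function, including its flat segment at zero, can be written as a short Python function. This is a minimal sketch under the linear demand and constant marginal cost assumed in this section; the names best_response and profit_1 and the grid search are ours, added only to double-check the first-order condition numerically for a = 10, b = 1, and c = 2.

```python
def best_response(q_rival, a, b, c):
    """Profit-maximizing output against q_rival when p = a - b*(q1 + q2) and TC = c*q.

    The max(...) captures the flat segment: the firm produces zero
    once the rival's output reaches (a - c) / b.
    """
    return max(0.0, (a - c) / (2 * b) - q_rival / 2)


def profit_1(q1, q2, a, b, c):
    """Firm 1's profit, total revenue minus total cost."""
    return (a - b * (q1 + q2)) * q1 - c * q1


a, b, c = 10, 1, 2
print(best_response(3, a, b, c))   # response to q2 = 3
print(best_response(0, a, b, c))   # vertical intercept, (a - c) / (2b)
print(best_response(8, a, b, c))   # horizontal intercept, q2 = (a - c) / b
print(best_response(10, a, b, c))  # flat segment beyond the intercept

# Cross-check against a brute-force search over outputs 0.00, 0.01, ..., 10.00.
grid = [i / 100 for i in range(1001)]
grid_best = max(grid, key=lambda q1: profit_1(q1, 3, a, b, c))
print(grid_best)
```

The grid search and the formula agree on firm 1’s response to q2 = 3, which suggests that the first-order condition identifies a maximum rather than a minimum in this example.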
Firm 2. A similar argument applies to firm 2, which solves
\[ \max_{q_2} \ \pi_2 = TR_2 - TC_2 = \underbrace{p(q_1, q_2) q_2}_{TR_2} - \underbrace{c q_2}_{TC_2} = [a - b(q_1 + q_2)] q_2 - c q_2. \]
Differentiating with respect to \(q_2\), we find
\[ \frac{\partial \pi_2}{\partial q_2} = a - bq_1 - 2bq_2 - c = 0. \]
Rearranging this, we get \(a - c - bq_1 = 2bq_2\). Solving for \(q_2\) yields firm 2’s best response
function, \(\mathrm{BRF}_2\), as follows,
\[ q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1, \tag{$\mathrm{BRF}_2$} \]
which is symmetric to that of firm 1 (i.e., only the subscripts changed) because both
companies face the same demand and costs. Figure 14.2 depicts this best response
function. Like firm 1’s best response function, firm 2’s function originates at \(q_2 = \frac{a-c}{2b}\) units
when firm 1 is inactive, but it decreases at a rate of 1/2 as firm 1 increases its production. A
common graphical trick used to plot firm 2’s best response function is to use the same axis
orientation as that used to depict firm 1’s best response function.6
(Figure: \(q_1\) on the vertical axis and \(q_2\) on the horizontal axis; the line \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\) meets the vertical axis at \(\frac{a-c}{b}\) and the horizontal axis at \(\frac{a-c}{2b}\).)
Figure 14.2
Firm 2’s best response function.
Superimposing firm 1’s and firm 2’s best response functions onto the same graph, we
obtain their crossing point, as depicted in figure 14.3. At this crossing point, each firm
chooses an output level that constitutes a best response to the output of its rival (i.e., firms
are selecting mutual best responses). From chapter 12, we know that a mutual best response
is the Nash equilibrium (NE) of a game.
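Before solving for the crossing point algebraically, it may help to locate it numerically. The sketch below is not the method used in this chapter: it repeatedly applies both best response functions (an iteration we introduce only for illustration), using the parameter values a = 10, b = 1, and c = 2 from the earlier numerical example, and then checks that the resulting output pair is a mutual best response.

```python
def best_response(q_rival, a=10.0, b=1.0, c=2.0):
    """Best response under p = a - b*(q1 + q2) and marginal cost c (illustrative defaults)."""
    return max(0.0, (a - c) / (2 * b) - q_rival / 2)


# Start from zero output and let both firms respond to each other's last output.
q1, q2 = 0.0, 0.0
for _ in range(60):
    q1, q2 = best_response(q2), best_response(q1)

print(round(q1, 6), round(q2, 6))

# At the crossing point, neither firm wants to revise its output.
gap_1 = abs(best_response(q2) - q1)
gap_2 = abs(best_response(q1) - q2)
print(gap_1 < 1e-9 and gap_2 < 1e-9)
```

Because each best response function has a slope of −1/2, every round of responses halves the distance to the crossing point, so the iteration settles at (a − c)/(3b) = 8/3 units per firm for these parameter values.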
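To find the point where the best response functions cross each other, we can insert firm 2’s
best response function into that of firm 1, which yields
\[ q_1 = \frac{a-c}{2b} - \frac{1}{2} \underbrace{\left( \frac{a-c}{2b} - \frac{1}{2} q_1 \right)}_{q_2}, \]
which depends on output \(q_1\) alone. Expanding the right side gives \(q_1 = \frac{a-c}{2b} - \frac{a-c}{4b} + \frac{1}{4} q_1\), and rearranging, we obtain
\[ \frac{3}{4} q_1 = \frac{a-c}{4b}. \]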
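6. That is, you can rotate the page counterclockwise 90 degrees and plot \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\), starting with a
vertical intercept at \(q_2 = \frac{a-c}{2b}\) units and a horizontal intercept at \(q_1 = \frac{a-c}{b}\) units, where \(q_2 = 0\). As in figure 14.1,
firm 2’s best response function in figure 14.2 depicts \(q_2 = 0\) for all \(q_1 \geq \frac{a-c}{b}\) by including a segment that overlaps
the vertical axis (the top-left side of figure 14.2).
(Figure: \(q_1\) on the vertical axis and \(q_2\) on the horizontal axis, each marked at \(\frac{a-c}{3b}\), \(\frac{a-c}{2b}\), and \(\frac{a-c}{b}\); the two best response functions cross on the 45-degree line, \(q_1 = q_2\), at the Cournot equilibrium, where both firms produce \(\frac{a-c}{3b}\).)
Figure 14.3
Cournot equilibrium.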
After solving for \(q_1\), we find firm 1’s equilibrium output \(q_1^* = \frac{a-c}{3b}\). Inserting this output level
in firm 2’s best response function yields
\[ q_2\left(\frac{a-c}{3b}\right) = \frac{a-c}{2b} - \frac{1}{2} \underbrace{\frac{a-c}{3b}}_{q_1^*} = \frac{3(a-c) - (a-c)}{6b} = \frac{a-c}{3b}. \]
Because firms face the same demand function and the same cost function, they produce
the same output level in equilibrium. This output pair \((q_1^*, q_2^*) = \left(\frac{a-c}{3b}, \frac{a-c}{3b}\right)\) is the NE of the
Cournot game, and figure 14.3 depicts it at the point where the best response functions of
the firms cross each other.
Alternative approach. A more straightforward approach to solve for the equilibrium
output is to invoke symmetry. Indeed, because firms are symmetric in their revenues
and costs, we can claim that there must be a symmetric equilibrium where both firms
produce the same amount, q∗1 = q∗2 = q∗ . Inserting this property into either firm’s best
response function simplifies it to the following equation, which no longer includes
subscripts:
\[ q^* = \frac{a-c}{2b} - \frac{1}{2} q^*, \]
or \(\frac{3}{2} q^* = \frac{a-c}{2b}\). Solving for \(q^*\), we obtain the equilibrium output for every firm in this Cournot
model, \(q^* = \frac{a-c}{3b}\), thus coinciding with the output level found previously (where we inserted
firm 2’s best response function into firm 1’s).

After finding equilibrium output, we can turn our attention to the equilibrium price, which
we obtain by evaluating the inverse demand function \(p(q_1, q_2) = a - b(q_1 + q_2)\) at \(q_1^* = q_2^* = \frac{a-c}{3b}\), as follows:
\[ p^*\left(\frac{a-c}{3b}, \frac{a-c}{3b}\right) = a - b\left(\frac{a-c}{3b} + \frac{a-c}{3b}\right) = a - \frac{2(a-c)}{3} = \frac{a+2c}{3}. \]
Finally, equilibrium profits for every firm i = {1, 2} are
\[ \pi_i^* = p^* q_i^* - c q_i^* = \frac{a+2c}{3} \cdot \frac{a-c}{3b} - c \, \frac{a-c}{3b} = \frac{(a+2c)(a-c)}{9b} - \frac{3c(a-c)}{9b} = \frac{a^2 - 2ac + c^2}{9b}, \]
or, more compactly, \(\pi_i^* = \frac{(a-c)^2}{9b}\) because \((a-c)^2 = a^2 - 2ac + c^2\). This can be alternatively
expressed as \(\pi_i^* = b(q^*)^2\), which reduces to \((q^*)^2\) when b = 1.
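The closed-form results for the symmetric Cournot duopoly can be checked with exact fractions. The sketch below assumes the linear demand and common marginal cost used in this section; the function name cournot_symmetric and the parameter values a = 10, b = 2, and c = 2 are our own, chosen with b different from 1 so that the role of b in the profit expression is visible.

```python
from fractions import Fraction


def cournot_symmetric(a, b, c):
    """Equilibrium output, price, and profit per firm in the symmetric Cournot duopoly."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    q = (a - c) / (3 * b)        # q* = (a - c) / (3b)
    p = a - b * (2 * q)          # inverse demand evaluated at q1 = q2 = q*
    profit = (p - c) * q         # price-cost margin times units sold
    return q, p, profit


a, b, c = 10, 2, 2
q, p, profit = cournot_symmetric(a, b, c)
print(q, p, profit)

# The three expressions derived in the text should coincide.
print(p == Fraction(a + 2 * c, 3))
print(profit == Fraction((a - c) ** 2, 9 * b))
print(profit == b * q ** 2)
```

Using Fraction rather than floating-point numbers keeps the comparison exact, so the equalities hold without rounding tolerance.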
Example 14.1: Cournot model with symmetric costs Consider a duopoly with
inverse demand function p(q1 , q2 ) = 12 − q1 − q2 , where every firm i = {1, 2} faces a
symmetric cost function TCi (qi ) = 4qi .
Firm 1’s best response function. In this scenario, firm 1 chooses its output level q1
to solve
\[ \max_{q_1} \ \pi_1 = (12 - q_1 - q_2) q_1 - 4q_1. \]
To maximize its profits, firm 1 differentiates this expression with respect to its output
level, \(q_1\), to obtain
\[ \frac{\partial \pi_1}{\partial q_1} = 12 - 2q_1 - q_2 - 4 = 0. \]
Rearranging this, we get \(8 - q_2 = 2q_1\), and solving for \(q_1\) yields
\[ q_1(q_2) = 4 - \frac{1}{2} q_2, \tag{$\mathrm{BRF}_1$} \]
which is firm 1’s best response function, originating at 4 units and decreasing with a
slope of −1/2 as firm 2 increases its production.
Firm 2’s best response function. A similar argument applies to firm 2, which chooses
q2 to solve
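\[ \max_{q_2} \ \pi_2 = (12 - q_1 - q_2) q_2 - 4q_2. \]
Differentiating with respect to \(q_2\) yields
\[ \frac{\partial \pi_2}{\partial q_2} = 12 - q_1 - 2q_2 - 4 = 0. \]
Rearranging this, we find \(8 - q_1 = 2q_2\). Solving for \(q_2\) yields firm 2’s best response
function, as follows:
\[ q_2(q_1) = 4 - \frac{1}{2} q_1, \]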
which is symmetric to that of firm 1 (only the subscripts change).
Finding equilibrium output. To solve for the equilibrium output levels of each firm,
we can invoke symmetry because firms are symmetric, and we just confirmed that their
best response functions are symmetric! In other words, there must be a symmetric
equilibrium where both firms produce the same amount, q∗1 = q∗2 = q∗ . Inserting this
property into either firm’s best response function simplifies it to
\[ q^* = 4 - \frac{1}{2} q^*, \]
or \(\frac{3}{2} q^* = 4\). Solving for \(q^*\), we obtain the equilibrium output for every firm in this
Cournot model, \(q^* = \frac{8}{3}\). As a consequence, equilibrium price is
\[ p^*\left(\frac{8}{3}, \frac{8}{3}\right) = 12 - q^* - q^* = 12 - \frac{8}{3} - \frac{8}{3} = \frac{20}{3} \cong \$6.67, \]
ultimately producing equilibrium profits of
\[ \pi_i^* = p^* q^* - c q^* = \frac{20}{3} \cdot \frac{8}{3} - 4 \cdot \frac{8}{3} = \frac{160}{9} - \frac{96}{9} = \frac{64}{9} \]
for every firm i = {1, 2}.
Self-assessment 14.1 Repeat the analysis in example 14.1, but assume that firm
1 faces an inverse demand function \(p(q_1, q_2) = 5 - \frac{1}{3}(q_1 + q_2)\) (i.e., a = 5 and b = 1/3).
Find the firm’s best response function, its vertical and horizontal intercept, and
slope.
Self-assessment 14.2 Repeat the analysis in example 14.1, but assume that the
inverse demand function changes to p(q1 , q2 ) = 20 − q1 − q2 . Find each firm’s best
response function, the Cournot equilibrium output, and the corresponding equilibrium
price and profits.
Example 14.2 considers an industry with two firms facing different production costs.
Example 14.2: Cournot model with asymmetric costs Consider two firms com-
peting à la Cournot, facing the same inverse demand function as in example 14.1,
p(q1 , q2 ) = 12 − q1 − q2 , but different cost functions: TC1 (q1 ) = 4q1 for firm 1, and
TC2 (q2 ) = 3q2 for firm 2. (Note that the marginal cost of firm 2 is less than that of
firm 1, and hence we can expect its equilibrium output to be larger. We confirm this
suspicion in the discussion that follows.)
Firm 1’s best response. We first find firm 1’s BRF by solving its PMP:
\[ \max_{q_1} \ \pi_1 = (12 - q_1 - q_2) q_1 - 4q_1. \]
This problem coincides with the one discussed in example 14.1, and thus it yields the
same best response function, \(q_1(q_2) = 4 - \frac{1}{2} q_2\).
Firm 2’s best response. In contrast, firm 2’s PMP is now
\[ \max_{q_2} \ \pi_2 = (12 - q_1 - q_2) q_2 - 3q_2. \]
Differentiating with respect to its output \(q_2\) yields
\[ \frac{\partial \pi_2}{\partial q_2} = 12 - q_1 - 2q_2 - 3 = 0. \]
Rearranging this, we find that \(9 - q_1 = 2q_2\). Solving for \(q_2\) yields firm 2’s best
response function, as follows:
\[ q_2(q_1) = \frac{9}{2} - \frac{1}{2} q_1. \]
This function has the same slope as that in example 14.1, −1/2 (where we assumed
that its marginal costs were 4), but it originates at 9/2 rather than at 4. This indicates
that, for every output of firm 1, firm 2’s output is now larger because its marginal cost
is 3 rather than 4.
Finding equilibrium output. At this point, we cannot invoke symmetry in output level
in equilibrium because firms face different production costs. As a result, we need to
simultaneously solve for q1 and q2 in BRF1 and BRF2 by, for instance, inserting BRF2
into BRF1, as follows:
\[ q_1 = 4 - \frac{1}{2} \underbrace{\left( \frac{9}{2} - \frac{1}{2} q_1 \right)}_{q_2}. \]
Rearranging this, we find \(q_1 = 4 - \frac{9}{4} + \frac{1}{4} q_1\), or \(\frac{3}{4} q_1 = \frac{7}{4}\). Solving for \(q_1\) yields an
equilibrium output of \(q_1^* = \frac{7}{3} \cong 2.33\) units. Inserting this output into firm 2’s best response
function, we find its equilibrium output:
\[ q_2^* = \frac{9}{2} - \frac{1}{2} \underbrace{\frac{7}{3}}_{q_1^*} = \frac{10}{3} \cong 3.33 \text{ units}, \]
where \(q_2^* > q_1^*\) because firm 2’s marginal cost is lower than that of firm 1.
As an exercise, you can check that in this scenario, equilibrium price is \(p^* = \frac{19}{3}\),
and equilibrium profits are \(\pi_1^* = \frac{49}{9}\) for firm 1 and \(\pi_2^* = \frac{100}{9}\) for firm 2. Comparing
equilibrium profits under asymmetric costs (this example) and under symmetric costs
(example 14.1), we find that the firm benefiting from a cost advantage (firm 2) earns
a larger profit, while the firm suffering from a cost disadvantage (firm 1) earns a
smaller profit.
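The exercise suggested at the end of example 14.2 can be checked with a short script. The closed-form outputs used below, \(q_1^* = \frac{a - 2c_1 + c_2}{3b}\) and \(q_2^* = \frac{a - 2c_2 + c_1}{3b}\), are not stated in the example; they follow from solving the two best response functions simultaneously for general marginal costs \(c_1\) and \(c_2\), and they assume that both firms produce positive output. The function name cournot_asymmetric is ours.

```python
from fractions import Fraction


def cournot_asymmetric(a, b, c1, c2):
    """Cournot duopoly with p = a - b*(q1 + q2) and marginal costs c1, c2.

    Assumes an interior solution in which both firms produce positive output.
    """
    a, b, c1, c2 = (Fraction(x) for x in (a, b, c1, c2))
    q1 = (a - 2 * c1 + c2) / (3 * b)
    q2 = (a - 2 * c2 + c1) / (3 * b)
    p = a - b * (q1 + q2)
    return q1, q2, p, (p - c1) * q1, (p - c2) * q2


# Example 14.2: a = 12, b = 1, c1 = 4, c2 = 3.
q1, q2, p, profit1, profit2 = cournot_asymmetric(12, 1, 4, 3)
print(q1, q2, p, profit1, profit2)

# Sanity check: each output is a best response to the other.
print(q1 == 4 - q2 / 2, q2 == Fraction(9, 2) - q1 / 2)

# With equal costs, the function reproduces example 14.1.
print(cournot_asymmetric(12, 1, 4, 4))
```

The output reproduces the price of 19/3 and the profits of 49/9 and 100/9 reported in the example, with firm 2’s profit above the 64/9 earned under symmetric costs and firm 1’s profit below it.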
Self-assessment 14.3 Repeat the analysis in example 14.2 but assuming that firm
2’s cost function changes to TC2 (q2 ) = q2 , thus emphasizing the cost advantage of
firm 2 relative to firm 1. Compare your results against those in example 14.2.
14.3.2 Bertrand Model—Simultaneous Price Competition

Consider now that firms compete in prices rather than in quantities. Will our equilibrium
results from the Cournot model be affected? The Bertrand model of price competition that
we explore in this section answers this question with a robust “Yes.”
Let us start by clarifying the setup of the game: two symmetric firms produce a
homogeneous good and face a common marginal cost, c > 0. They simultaneously and
independently set prices for their products, p1 and p2. If firm 1 charges the lowest price (i.e., p1
satisfies p1 < p2), firm 1 captures all the demand, while firm 2 captures none:
x1(p1, p2, I) > 0 and x2(p1, p2, I) = 0,
where x1(p1, p2, I) denotes the demand function found in chapter 3; and I > 0 represents
income level. Similarly, if firm 2 sets the lowest price, p1 > p2, the roles are switched, as
it is now firm 2 that captures all demand. Lastly, if prices coincide, p1 = p2, both firms equally
share market demand; that is, (1/2)x1(p1, p2, I) > 0 for firm 1, and similarly, (1/2)x2(p1, p2, I) > 0
for firm 2.
The Bertrand model of price competition claims that, in equilibrium, both firms set the
same price, and this common price coincides with their marginal cost:
p1 = p2 = c.
Next, let us show this result by systematically going over all possible price pairs (p1 , p2 )
that are different from (p1 , p2 ) = (c, c) (i.e., where both firms’ price coincides with their
common cost, c). We will demonstrate that these price profiles cannot be equilibria of the
Bertrand model of price competition. What do we mean by that? We only need to show that
any price different from the marginal cost c is “unstable” in the sense that at least one firm
has an incentive to deviate to a different price.
For presentation purposes, we will first examine asymmetric price pairs, p1 ≠ p2, and
then analyze symmetric price pairs, where p1 = p2.
1. Asymmetric price profiles.

(a) Consider a price profile p1 > p2 > c, as depicted in figure 14.4. In this scenario,
firm 2 charges the lower price, thus capturing the entire market and making a
positive margin per unit because p2 > c. This price profile, however, cannot be stable
because firm 1 has incentives to deviate by undercutting firm 2’s price, charging
p1 = p2 − ε, where ε > 0 indicates a small reduction relative to firm 2’s price.7 Hence, price
profile p1 > p2 > c cannot be an equilibrium because we found at least one profitable
deviation.8
(b) Consider now a price profile p1 > p2 = c. As depicted in figure 14.5, firm 2 sets the
lowest price (and so captures all sales), but in this case, it makes no profit per unit.
Firm 1 would not have the incentive to undercut firm 2’s price, as that would entail
charging a price below marginal cost, thus incurring a loss per unit. Firm 2, instead,
would have an incentive to deviate by increasing its price from p2 = c to slightly
below its rival’s price, p2 = p1 − ε, where ε > 0 is a small number (e.g., 1 cent), and
make a higher profit. Because we found a profitable deviation, we can claim that
price profile p1 > p2 = c cannot be an equilibrium either.9
7. If, for instance, p2 = $10, firm 1 could undercut firm 2’s price by 1 cent, by charging p1 = $9.99, thus making
ε = 0.01. A similar argument applies if ε could be smaller than 1 cent.
8. A similar argument applies if we switch the identities of the firms by considering the price profile p2 > p1 > c,
whereby firm 1 captures the whole market, and firm 2 would now have incentives to undercut firm 1’s price by a
small amount.
(Figure: price line with c < p2 < p1; the profitable deviation for firm 1 is the price p2 − ε, just below p2.)
Figure 14.4
Profitable deviation when p1 > p2 > c.
(Figure: price line with p2 = c < p1; the profitable deviation for firm 2 is the price p1 − ε, just below p1.)
Figure 14.5
Profitable deviation when p1 > p2 = c.
In summary, points 1(a) and 1(b) considered all possible asymmetric price profiles. For
all of them, we showed that at least one firm has an incentive to deviate, entailing that
the price profile considered in each case cannot be an equilibrium. As a consequence, if
an equilibrium exists, it must be symmetric in the sense that both firms charge the same
price, p1 = p2 = p. We examine this possibility next.
2. Symmetric price profiles.
(a) Consider a price profile where both firms charge the same price, but such a
common price is larger than the marginal cost of production, p1 = p2 > c, as depicted in
figure 14.6. In this case, both firms evenly share the market because their prices are
the same. Every firm i now has the incentive to deviate by undercutting its rival’s
price p by a small amount, ε, so that pi = p − ε.10 Hence, price profile p1 = p2 > c
cannot be an equilibrium either.
(b) Finally, consider the price profile p1 = p2 = c. Here, prices coincide, thus leading
firms to evenly share the market. In addition, these prices leave no positive margin
per unit because pi = c for every firm i. While profits are zero in this price profile,
no firm can strictly increase its payoff by unilaterally deviating: setting a lower price
would attract all customers, but at a loss per unit, and setting a higher price would
reduce the deviating firm’s sales to zero, as its price is now higher than that of its rival.
9. A similar argument applies if we switch the identities of the two firms by considering the pricing profile p2 >
p1 = c, where now firm 1 captures the market but would have the incentive to increase its price until p1 = p2 − ε.
10. Firm i’s price decrease, from p to p − ε, exerts two effects on its profits. On the one hand, it increases its sales
from half the market to all the market. On the other hand, it reduces its margin per unit from p − c to (p − ε) − c.
However, the first (positive) effect dominates the second (negative) effect, yielding an overall increase in profits,
when the firm undercuts its rival’s price p by a small amount (i.e., when ε is a small number).
(Figure: price line with c < p1 = p2; the profitable deviation for firm i is the price pi − ε, just below the common price.)
Figure 14.6
Profitable deviation when p1 = p2 > c.
Summarizing, we can claim that both firms setting a price equal to the common marginal cost, pi = c,
is a Nash equilibrium of the Bertrand model of price competition because no firm
can strictly increase its profit by unilaterally deviating from such a price.
This discussion considers, for simplicity, two firms. Nonetheless, a similar argument can
be extended to scenarios with more than two firms, where pi = c remains an equilibrium of
the game for every firm i.11
Example 14.3: Bertrand model Consider again the inverse demand function in
example 14.1, p(q1 , q2 ) = 12 − q1 − q2 . Because Q ≡ q1 + q2 denotes the aggregate
output in the industry, we can express the inverse demand function as p(Q) = 12 − Q.
According to the Bertrand model of price competition, all firms in the industry lower
their prices until p = c. In this context, because p(Q) = 12 − Q, equilibrium condition
p = c entails 12 − Q = c, which, solving for aggregate output Q, yields Q = 12 − c.
For instance, if the marginal cost is c = 4, as in examples 14.1 and 14.2, aggregate
output becomes Q = 12 − 4 = 8 units, each of which is sold at a price of $4.
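The deviation argument behind the Bertrand result can also be checked by brute force. The sketch below makes several assumptions of our own: prices are restricted to whole cents between $0 and $12, market demand is Q = 12 − p (from p(Q) = 12 − Q in example 14.3), the marginal cost is c = $4, and profits are measured in scaled units (cents times hundredths of a unit), which is harmless because only comparisons between profits matter. The function names are illustrative.

```python
from fractions import Fraction

C = 400  # common marginal cost in cents (c = $4)


def demand(price_cents):
    """Market demand implied by p(Q) = 12 - Q, in hundredths of a unit."""
    return max(0, 1200 - price_cents)


def profit(p_own, p_rival):
    """Bertrand payoff: the cheaper firm serves everyone, and ties split the market."""
    if p_own < p_rival:
        share = Fraction(1)
    elif p_own == p_rival:
        share = Fraction(1, 2)
    else:
        share = Fraction(0)
    return share * (p_own - C) * demand(p_own)


def has_profitable_deviation(p_own, p_rival, grid=range(0, 1201)):
    """True if some price on the grid strictly beats p_own against p_rival."""
    current = profit(p_own, p_rival)
    return any(profit(p, p_rival) > current for p in grid)


def is_equilibrium(p1, p2):
    """Neither firm can strictly gain by unilaterally changing its price."""
    return not has_profitable_deviation(p1, p2) and not has_profitable_deviation(p2, p1)


# One price pair for each of cases 1(a), 1(b), 2(a), and 2(b), in cents.
for p1, p2 in [(700, 500), (700, 400), (600, 600), (400, 400)]:
    print((p1, p2), "equilibrium" if is_equilibrium(p1, p2) else "not an equilibrium")
```

Only the pair where both firms charge $4 survives the check, in line with cases 1(a) through 2(b). In the pair (700, 400), it is the low-price firm that has the profitable deviation, as in case 1(b).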
Self-assessment 14.4 Repeat the analysis in example 14.3, but assuming that
firms face an inverse demand function p(q1 , q2 ) = 20 − q1 − q2 . How are the results
in example 14.3 affected?
11. In oligopolies with more than two firms, however, other NEs can exist where two firms set their price equal
to the common marginal cost c, while all other firms set their prices above c (i.e., pi = pj = c and pk > c for every
firm i ≠ j ≠ k).
Reconciling the Cournot and Bertrand models. Why are the results in the Cournot model of
quantity competition and the Bertrand model of price competition so dramatically different?
Recall that in the Cournot model, firms set a price above marginal cost, thus making positive
profits, whereas in the Bertrand model, firms set p = c, earning no economic profits. The
underlying assumption driving the difference in their equilibrium predictions is, essentially,
the absence of capacity constraints in the Bertrand model of price competition: if a firm
charges 1 cent less than its rival, it captures all market demand, regardless of its size. This
assumption might be reasonable for certain goods (such as online movie streaming), but
relatively difficult to justify for others (such as smartphones or smartwatches) with a world
demand that cannot be served by a single firm.
14.3.3 Cartels and Collusion

Our previous results indicate that firms competing in quantities earn profits below those
under monopoly; this result is even more pronounced when firms compete in prices (à la
Bertrand), where profits fall to zero. What if, rather than competing against each other, firms were to coordinate their
production decisions? In this section, we analyze how collusion can help firms increase their
profits, and under which conditions such cooperation can be expected to hold.12 Cartels
seek to coordinate production decisions to raise prices and profits for cartel participants.
A famous example of a cartel is the Organization of the Petroleum-Exporting Countries
(OPEC), which limits the oil extraction of each participating country in order to increase
market prices. Other famous examples include cartels in lysine, vitamin B2, vitamin C, steel, rayon
fiber, diamonds, and heating pipes.13 As the next example illustrates, in a cartel firms seek to
maximize their joint rather than their individual profits, making our analysis analogous to
that under multiplant monopolies (discussed in section 10.6 of chapter 10).
Example 14.4: Collusion when firms compete in quantities Consider the indus-
try in example 14.1, where p(q1 , q2 ) = 12 − q1 − q2 and TCi (qi ) = 4qi for every firm
i. If firms join a cartel, they choose the output of firm 1, q1 , and that of firm 2, q2 , to
maximize their joint profits, π = π1 + π2 , as follows:
\[ \max_{q_1, q_2} \ \pi = \pi_1 + \pi_2 = \underbrace{(12 - q_1 - q_2) q_1 - 4q_1}_{\pi_1} + \underbrace{(12 - q_1 - q_2) q_2 - 4q_2}_{\pi_2}. \]
12. We consider here quantity competition, while one of the end-of-chapter exercises examines collusion under
price competition.
13. See Levenstein and Suslow (2006) and Harrington (2006) for more details on these cartels.
This expression looks scary (we agree on that), so let us try to simplify it a bit. First,
note that (12 − q1 − q2 ) shows up twice and can be factored out, and so does the unit
cost, 4, which yields
\[ \max_{q_1, q_2} \ (12 - q_1 - q_2)(q_1 + q_2) - 4(q_1 + q_2). \]
Because \(12 - q_1 - q_2 = 12 - (q_1 + q_2)\), we obtain
\[ \max_{q_1, q_2} \ [12 - (q_1 + q_2)](q_1 + q_2) - 4(q_1 + q_2). \]
Finally, because \(Q = q_1 + q_2\) denotes aggregate output, we can rewrite the cartel’s
profit maximization problem more compactly as
\[ \max_{Q} \ [12 - Q] Q - 4Q. \]
The cartel just needs to choose the aggregate amount of output Q—the total production
for all the cartel—to maximize profits [12 − Q] Q − 4Q, as if it were a single firm (i.e.,
a monopolist).
Differentiating with respect to Q, we find
12 − 2Q − 4 = 0,
which, after solving for Q, yields \(Q^* = \frac{8}{2} = 4\) units. Because firms are symmetric, each
produces half of \(Q^* = 4\) units (i.e., 2 units per firm). In contrast, under Cournot
competition (as found in example 14.1), every firm produces \(q = \frac{8}{3} \cong 2.67\) units. Therefore,
under the cartel, every firm limits its own production to increase market price and
profits. We can confirm this result by finding that the cartel price is
p(2, 2) = 12 − 2 − 2 = $8,
which is higher than under Cournot competition ($6.67). Similarly, the cartel profits
for every firm i are
\[ \pi_i^* = (12 - q_1 - q_2) q_i - 4q_i = (12 - 2 - 2) 2 - (4 \times 2) = \$8, \]
while under Cournot competition, profits were only \(\pi_i = \frac{64}{9} \cong \$7.11\).
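The comparison between the cartel and Cournot outcomes in example 14.4 can be reproduced as follows. This is a minimal sketch assuming linear demand p = a − bQ, a common marginal cost c, and an equal split of cartel output between the two firms. The function names are ours, and the expression Q = (a − c)/(2b) generalizes the first-order condition 12 − 2Q − 4 = 0 solved in the example.

```python
from fractions import Fraction


def cartel_per_firm(a, b, c, n=2):
    """Output, price, and profit per firm when n symmetric firms maximize joint profits."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    total_output = (a - c) / (2 * b)   # solves a - 2bQ - c = 0, the monopoly output
    q = total_output / n               # equal split of the cartel's output
    p = a - b * total_output
    return q, p, (p - c) * q


def cournot_per_firm(a, b, c):
    """Output, price, and profit per firm in the symmetric Cournot duopoly."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    q = (a - c) / (3 * b)
    p = a - b * (2 * q)
    return q, p, (p - c) * q


cartel = cartel_per_firm(12, 1, 4)
cournot = cournot_per_firm(12, 1, 4)
print("cartel: ", cartel)
print("cournot:", cournot)
print(cartel[0] < cournot[0], cartel[1] > cournot[1], cartel[2] > cournot[2])
```

Every firm produces less under the cartel (2 rather than 8/3 units), the price is higher ($8 rather than about $6.67), and profit per firm is higher ($8 rather than 64/9).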
Self-assessment 14.5 Repeat the analysis in example 14.4, but assume that firms
face an inverse demand function p(q1 , q2 ) = 20 − q1 − q2 . How are the results in
example 14.4 affected?
Example 14.4 indicates that firms have incentives to coordinate their production
decisions, reducing their individual output to increase market prices and ultimately prof-
its. Why are cartel profits larger than under Cournot competition? Under Cournot, when
every firm increases its individual output, it considers the effect that such additional pro-
duction has on its own profits, but it ignores the effect that a larger output has on its rival’s
profits. Indeed, a larger output lowers market prices, ultimately reducing its rival’s profits.
Under the cartel agreement, in contrast, firms take into account each other’s benefits, as
the cartel maximizes joint (rather than individual) profits. As a consequence, firms produce
less under the cartel, both at the individual and aggregate levels, elevating market prices
and ultimately increasing their profits. In short, by sharing profits, the cartel helps every
firm internalize the negative effect that an increase in its own output has on its rival’s
profits.
While the previous discussion ranks profits in cartels and Cournot, it does not iden-
tify under which conditions collusion can be sustained over time. Importantly, every firm
has an incentive to cheat on the cartel agreement—that is, produce more than its quota
(2 units per firm in example 14.4)—while its rival sticks to the agreement. From chapter
13, we know that, if firms interact only once, cooperation cannot be sustained in equili-
brium, nor can it be supported if firms interact a limited number of times. However, if firms
interact infinitely (or there is a probability that both firms will still be in the industry tomor-
row), then cooperation can be sustained. We evaluate under which conditions this occurs in
example 14.5.
Example 14.5: Sustaining cooperation within the cartel Assume that firms play
an infinitely repeated Cournot game, and they seek to coordinate their production deci-
sions through the following Grim-Trigger Strategy (GTS), similar to that discussed in
chapter 13 for the Prisoner’s Dilemma game:
1. In the first period of interaction t = 1, every firm starts cooperating (producing 2

units).
2. In all subsequent periods t > 1,
(a) Every firm continues cooperating, so long as all firms cooperated in all
previous periods.
(b) If, instead, a firm observes some past cheating (deviating from this GTS), then
it produces the Cournot output \(q^* = \frac{8}{3}\) thereafter.
As shown in chapter 13, we only need to check if every firm has incentives to deviate
from the GTS: (1) after observing a history of cooperation; and (2) after observing
that one or more firms cheated. We focus here on testing (1), while you can explore option
(2) as an exercise.14
Cooperation. If firm i continues cooperating (i.e., producing the cartel output of 2
units), it obtains the cartel profit of $8. Therefore, its stream of discounted payoff from
cooperating becomes
\[ 8 + \delta 8 + \delta^2 8 + \dots = 8(1 + \delta + \delta^2 + \dots) = \frac{8}{1-\delta}, \]
where δ denotes the discount factor weighting future payoffs.
Best deviation. If, instead, firm i deviates from producing 2 units while its rival sticks
to the cartel agreement, its profits could increase. But what is firm i’s best deviation?
To find this, we need to evaluate its profits when its rival produces the cartel output of
2 units, qj = 2, obtaining
(12 − qi − 2) qi − 4qi = (10 − qi ) qi − 4qi .

Differentiating with respect to qi , we obtain 10 − 2qi − 4 = 0, which, solving for qi ,
yields qi = 3 units. Inserting this “best deviation” into firm i’s profits, we obtain devi-
ation profits of π Dev = (10 − 3) 3 − (4 × 3) = $9, which are indeed larger than the
cartel profit of $8. Therefore, if firm i deviates, its stream of discounted payoffs
becomes
\[ \underbrace{9}_{\text{Deviation}} + \underbrace{\delta \frac{64}{9} + \delta^2 \frac{64}{9} + \dots}_{\text{Punishment}} = 9 + \frac{64}{9}(\delta + \delta^2 + \dots) = 9 + \frac{64}{9} \delta (1 + \delta + \dots) = 9 + \frac{64}{9} \cdot \frac{\delta}{1-\delta}. \]
Intuitively, the deviating firm increases its profits from $8 to $9 for one period (i.e.,
instantaneous gain from deviation), but its defection is detected by its cartel partner,
which triggers an infinite punishment in which both firms produce the Cournot output,
yielding a Cournot profit of \(\frac{64}{9}\) thereafter.
14. As in chapter 13, you should find that, upon observing some player or players cheating, every player has an
incentive to implement the punishment in the GTS (choosing its Cournot output thereafter) rather than producing
any other output level. Importantly, this result should hold for all values of the firm’s discount factor δ.
Comparing profits. As a consequence, every firm i prefers to cooperate, so long as
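\[ \frac{8}{1-\delta} \geq 9 + \frac{64}{9}\frac{\delta}{1-\delta}. \]
Multiplying both sides by (1 − δ), we obtain \(8 \geq 9(1-\delta) + \frac{64}{9}\delta\). Solving for discount
factor δ, we find that the cartel output can be sustained with this GTS if \(\delta \geq \frac{9}{17} \approx 0.53\)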
(i.e., as firms assign sufficient importance to their future profits). If, in contrast, δ <
0.53, the cartel agreement cannot be sustained over time because firms would have
incentives to cheat during every period. In this case, the Cournot outcome emerges in
equilibrium in every period.
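The comparison above can be checked numerically. The following is a minimal sketch, assuming only the three per-period profits used in the example (cartel profit 8, deviation profit 9, Cournot profit 64/9); the function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def min_discount_factor(cooperate, deviate, punish):
    """Smallest delta at which cooperating is at least as good as deviating once.

    Solves cooperate/(1 - delta) >= deviate + delta*punish/(1 - delta),
    assuming deviate > cooperate > punish.
    """
    cooperate, deviate, punish = map(Fraction, (cooperate, deviate, punish))
    # Multiplying by (1 - delta): cooperate >= deviate*(1 - delta) + punish*delta,
    # so delta >= (deviate - cooperate) / (deviate - punish).
    return (deviate - cooperate) / (deviate - punish)


def cooperation_payoff(cooperate, delta):
    """Discounted stream from cooperating forever."""
    return Fraction(cooperate) / (1 - Fraction(delta))


def deviation_payoff(deviate, punish, delta):
    """One period of deviation profit, then the punishment profit forever."""
    delta = Fraction(delta)
    return Fraction(deviate) + Fraction(punish) * delta / (1 - delta)


delta_min = min_discount_factor(8, 9, Fraction(64, 9))
print(delta_min)  # the threshold derived in the text, 9/17
```

At the threshold, both streams are worth the same to the firm; for any smaller discount factor, the deviation stream is worth more.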
Self-assessment 14.6 Consider the setting in example 14.5, but assume that firms
detect a deviation only after two periods, so a deviating firm earns a profit of $9 during
two periods before the punishment starts. This means that cheating is still detected
with certainty, but with a lag of two periods rather than immediately. Find the minimal
discount factor δ supporting cooperation in this scenario, and show that cooperation
is more difficult to sustain than in example 14.5.
14.4 Stackelberg Model—Sequential Quantity Competition
Let us now modify the Cournot model of simultaneous quantity competition by considering
that, while firms still compete in quantities, they do so sequentially. Specifically, the time
structure of the game is the following:
1. Firm 1 chooses its output q1 .

2. Firm 2 observes q1 and responds with its own output, q2 .
This timing may be due to industry or legal reasons that provide firm 1 with an advantage.
For instance, firm 1 was the first to develop a new product or technology, allowing it to
choose its output before firm 2. Because this is a sequential-move game, with firm 1 acting
as the leader and firm 2 as the follower, we can solve it by applying backward induction, the
game-theoretic tool discussed in chapter 13. We first need to focus on the last mover (firm
2), and analyze its profit-maximizing output for every possible output that firm 1 produces.
Firm 2 (follower). Firm 2 takes the leader’s output q1 as given, because it is already chosen
by the time firm 2 gets to move. Mathematically, firm 2 treats q1 as a parameter when
maximizing its profits, as follows:
\[ \max_{q_2} \; [a - b(q_1 + q_2)]\, q_2 - c q_2. \]
Differentiating with respect to firm 2’s output, q2 , we obtain
a − bq1 − 2bq2 − c = 0;
and solving for q2 yields
\[ q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1. \tag{BRF$_2$} \]
This expression is similar to firm 2’s best response function in the Cournot model of
example 14.1. Indeed, in that setting, firm 2 chose its profit-maximizing output for every
q1 chosen by its rival, firm 1. A similar intuition applies now, except for the fact that firm
2 observes firm 1’s output before choosing its own, whereas in the Cournot model, firm 2
chooses its output level simultaneously with that of firm 1. Nonetheless, in both scenarios
firm 2 treats firm 1’s output q1 as given, either because firm 2 cannot alter it (in the Cournot
model) or because q1 is already produced (in the Stackelberg model that we analyze).
Firm 1 (leader). The leader chooses its output q1 to maximize its profits, as follows:
\[ \max_{q_1} \; [a - b(q_1 + q_2)]\, q_1 - c q_1. \]
However, firm 1 can anticipate that firm 2 will optimally respond with the same best
response function \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\), as this maximizes the follower’s profits. Intuitively,
the leader can put itself in the shoes of the follower, expecting the latter to respond with
the output level \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\). Inserting this best response function into the leader’s
PMP yields
\[ \max_{q_1} \Big[a - b\Big(q_1 + \underbrace{q_2(q_1)}_{\text{BRF}_2}\Big)\Big] q_1 - c q_1 = \Big[a - b\Big(q_1 + \underbrace{\frac{a-c}{2b} - \frac{1}{2} q_1}_{q_2(q_1)\text{ from BRF}_2}\Big)\Big] q_1 - c q_1, \]
or, after simplifying,15
\[ \max_{q_1} \; \frac{1}{2}(a + c - b q_1)\, q_1 - c q_1. \]
15. Note that the term \(a - b\left(q_1 + \frac{a-c}{2b} - \frac{1}{2} q_1\right)\) simplifies to \(a - b\,\frac{2bq_1 + a - c - bq_1}{2b}\), which further reduces to
\(\frac{2a - 2bq_1 - a + c + bq_1}{2}\), ultimately yielding \(\frac{1}{2}(a + c - bq_1)\).
Note that the leader’s problem became a function of its output level, q1 , alone. Differen-
tiating with respect to q1 , we obtain
\[ \frac{1}{2}(a - c - 2bq_1) = 0. \]
Further, solving for q1 yields the profit-maximizing output for the leader, \(q_1^* = \frac{a-c}{2b}\).
Hence, if the leader chooses \(q_1^* = \frac{a-c}{2b}\) in equilibrium, we can find the follower’s equilibrium
output by inserting \(q_1^* = \frac{a-c}{2b}\) into the follower’s best response function as follows:
\[ q_2\Big(\frac{a-c}{2b}\Big) = \frac{a-c}{2b} - \frac{1}{2}\underbrace{\frac{a-c}{2b}}_{q_1^*} = \frac{2(a-c)}{4b} - \frac{a-c}{4b} = \frac{a-c}{4b}, \]
which is exactly half of the leader’s output, \(q_2^* = \frac{1}{2} q_1^*\). However, we describe the subgame
perfect equilibrium (SPE) of the game more generally as
\[ q_1^* = \frac{a-c}{2b} \quad \text{and} \quad q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1, \]
because the follower’s best response function allows firm 2 to optimally respond to the
leader’s output level, both in equilibrium, \(q_1^* = \frac{a-c}{2b}\), and off the equilibrium, \(q_1 \neq \frac{a-c}{2b}\). If,
instead, we said that the follower chooses \(q_2^* = \frac{a-c}{4b}\) in the SPE of the game, we would provide
no information about how the follower responds if the leader “made a mistake” by
deviating from its equilibrium output \(q_1^*\).
Interestingly, the leader produces more in the Stackelberg model than in Cournot, because
\(\frac{a-c}{2b} > \frac{a-c}{3b}\), whereas the follower produces less, given that \(\frac{a-c}{4b} < \frac{a-c}{3b}\). The leader anticipates
the follower’s reaction after observing a larger output from the leader, and thus increases q1
to gain larger profits.
In this context, equilibrium price is
\[ p^* = a - b\left(\frac{a-c}{2b} + \frac{a-c}{4b}\right) = a - \frac{2(a-c)}{4} - \frac{a-c}{4} = \frac{a+3c}{4}. \]
Equilibrium profits for the leader are
\[ \pi_1^* = \left(\frac{a+3c}{4} - c\right)\frac{a-c}{2b} = \frac{(a-c)^2}{8b}, \]
and for the follower, they are
\[ \pi_2^* = \left(\frac{a+3c}{4} - c\right)\frac{a-c}{4b} = \frac{(a-c)^2}{16b}, \]
that is, exactly half of the leader’s profits, \(\pi_2^* = \frac{1}{2}\pi_1^*\). As an exercise, you can easily check
that the leader’s profits are higher in Stackelberg than in Cournot, whereas the follower’s
profits are lower.
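The comparison between Stackelberg and Cournot outcomes can be sketched in a few lines. This is only an illustration that evaluates the closed-form expressions with linear demand p = a − bQ and constant marginal cost c; the function names and the dictionary keys are assumptions made here, not notation from the text.

```python
from fractions import Fraction


def stackelberg(a, b, c):
    """Leader and follower outcomes from the closed-form SPE expressions."""
    a, b, c = map(Fraction, (a, b, c))
    q1 = (a - c) / (2 * b)               # leader's equilibrium output
    q2 = (a - c) / (2 * b) - q1 / 2      # follower's best response evaluated at q1
    price = a - b * (q1 + q2)
    return {"q1": q1, "q2": q2, "price": price,
            "profit1": (price - c) * q1, "profit2": (price - c) * q2}


def cournot(a, b, c):
    """Symmetric two-firm Cournot outcome, where each firm produces (a - c)/(3b)."""
    a, b, c = map(Fraction, (a, b, c))
    q = (a - c) / (3 * b)
    price = a - b * 2 * q
    return {"q": q, "price": price, "profit": (price - c) * q}


s = stackelberg(12, 1, 4)
n = cournot(12, 1, 4)
# The leader produces and earns more than a Cournot firm; the follower less.
print(s["q1"] > n["q"] > s["q2"], s["profit1"] > n["profit"] > s["profit2"])
```

With a = 12, b = 1, and c = 4, the Cournot profit evaluates to 64/9, which lies between the follower's and the leader's Stackelberg profits.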
Example 14.6: Stackelberg model Consider the same inverse demand function
as in example 14.1, p(q1 , q2 ) = 12 − q1 − q2 , and marginal cost c = 4. Inserting the
follower’s best response function found in example 14.1, \(q_2(q_1) = 4 - \frac{1}{2} q_1\), into the
leader’s PMP yields
\[ \max_{q_1} \Big[12 - \Big(q_1 + \underbrace{4 - \tfrac{1}{2} q_1}_{q_2(q_1)}\Big)\Big] q_1 - 4q_1. \]
Simplifying this, we obtain16
\[ \max_{q_1} \; \frac{1}{2}(16 - q_1)\, q_1 - 4q_1. \]
Differentiating with respect to q1 , we obtain
\[ 8 - q_1 - 4 = 0. \]
Solving for q1 , we find the profit-maximizing output for the leader, q∗1 = 4 units.
Inserting this output into the follower’s best response function yields q∗2 = 4 − (1/2)4 = 2 units.
In this scenario, equilibrium price is p∗ = 12 − 4 − 2 = $6, and equilibrium profits become π1∗ =
(6 × 4) − (4 × 4) = $8 for firm 1 and π2∗ = (6 × 2) − (4 × 2) = $4 for firm 2.
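Backward induction in example 14.6 can also be mimicked by brute force. The sketch below is an illustration rather than the method used in the text: it restricts outputs to finite grids (an assumption made here), lets the follower pick its best grid output for each leader output, and then lets the leader choose while anticipating that response. The grids are chosen so that the follower's exact best response always lies on its grid.

```python
from fractions import Fraction


def profit(own_q, rival_q):
    """Profit in example 14.6: price 12 - q1 - q2 and marginal cost 4."""
    return (12 - own_q - rival_q) * own_q - 4 * own_q


# Hypothetical grids: leader in steps of 1/5, follower in steps of 1/10,
# so that 4 - q1/2 is always a follower grid point when it is nonnegative.
leader_grid = [Fraction(k, 5) for k in range(61)]
follower_grid = [Fraction(k, 10) for k in range(121)]


def follower_response(q1):
    """Second stage: the follower maximizes its profit taking q1 as given."""
    return max(follower_grid, key=lambda q2: profit(q2, q1))


def leader_payoff(q1):
    """First stage: the leader evaluates q1 anticipating the follower's response."""
    return profit(q1, follower_response(q1))


q1_star = max(leader_grid, key=leader_payoff)
q2_star = follower_response(q1_star)
price = 12 - q1_star - q2_star
print(q1_star, q2_star, price)
```

Exact fractions are used instead of floats so that ties between neighboring grid points cannot be broken by rounding error.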
Self-assessment 14.7 Repeat the analysis in example 14.6 but assume that firms
face an inverse demand function p(q1 , q2 ) = 20 − q1 − q2 . How are the results in

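16. Note that the term \(\left[12 - \left(q_1 + 4 - \frac{1}{2} q_1\right)\right] q_1\) simplifies to \(\left(12 - q_1 - 4 + \frac{1}{2} q_1\right) q_1\), which further reduces to
\(\left(8 - \frac{1}{2} q_1\right) q_1\). Factoring 1/2 out, we can alternatively write this expression as \(\frac{1}{2}(16 - q_1)\, q_1\).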
14.5 Product Differentiation
In previous sections, we considered that firms sell undifferentiated products (i.e., homoge-
neous goods). While this might occur in some markets, such as specific agricultural products
and cereals, most goods are differentiated from those of their rivals, such as Coke and Pepsi
in the soda industry, Dell and Lenovo in the computer industry, and iPhone and Samsung
Galaxy in the smartphone market. To understand firm competition in these industries, and
to predict their output and pricing decisions, we will rely on an approach similar to that of the
Cournot model of section 14.3.1, but with a twist, because we need to account for product
differentiation.
Demand for product differentiation. Consider a scenario with two firms, A and B, with the
following inverse demand functions:
\[ p_A(q_A, q_B) = a - bq_A - dq_B \quad \text{and} \quad p_B(q_A, q_B) = a - bq_B - dq_A, \]
where b, d ≥ 0 and b ≥ d. We next interpret these demand functions. Because they are symmetric,
let us focus on one of the demand functions, such as that for good A. An increase in
either qA or qB reduces the price of good A, pA , but the effect of firm A’s output qA is larger
than the effect of firm B’s output because b > d. Intuitively, the price of a particular brand is
more sensitive to changes in its own output than to changes in its rival’s output. We refer to
this assumption by saying that “own-price effects” dominate “cross-price effects.”
Furthermore, note that when d = 0, the inverse demand function for good A collapses to
pA (qA , qB ) = a − bqA (and similarly for the demand of good B), thus indicating that every
firm’s price is unaffected by its rival’s output, as in two separate monopoly markets, one for
good A and another for B. In contrast, if parameter d increases until it coincides with b,
d = b, the inverse demand function for good A becomes
pA (qA , qB ) = a − bqA − bqB = a − b (qA + qB ) ,

reflecting that price pA is symmetrically affected by an increase in either qA or qB , as in the
Cournot model with homogeneous goods.
Best responses with product differentiation. As in previous sections, we assume that every
firm i = {A, B} faces a cost function TC(qi ) = cqi , where c > 0 indicates its marginal cost of
production. We are now ready to represent the PMP of firm A as follows:
\[ \max_{q_A} \; [a - bq_A - dq_B]\, q_A - cq_A. \]
Differentiating with respect to qA , we obtain
a − c − 2bqA − dqB = 0.
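Figure 14.7
Best response function and product differentiation. [Plot with qA on the vertical axis and qB on the horizontal axis. BRFA starts at (a − c)/(2b) on the vertical axis; it reaches the horizontal axis at (a − c)/b when d = b (slope −1/2) and at (a − c)/d when d < b.]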
Rearranging this yields a − c − dqB = 2bqA . Solving for qA , we find firm A’s best response
function
\[ q_A(q_B) = \frac{a-c}{2b} - \frac{d}{2b} q_B. \]
Figure 14.7 depicts this best response function. As with best response functions found
throughout the chapter, firm A’s optimal output is \(\frac{a-c}{2b}\) when its rival, firm B, produces zero
units (qB = 0), but it decreases at a rate \(\frac{d}{2b}\) for every unit of output of its rival. If firm B produces
more than \(\frac{a-c}{d}\) units, firm A chooses to optimally respond with zero output (qA = 0).17
This intuition about parameters b and d in the demand function extends to the best
response function as well. In particular, if d = 0, the best response function reduces to
\(q_A = \frac{a-c}{2b}\), which is independent of qB , as expected, because firm A’s demand is unaffected
by firm B’s sales, effectively transforming firm A into a monopolist. In contrast, if d = b, the
best response function collapses to \(q_A(q_B) = \frac{a-c}{2b} - \frac{1}{2} q_B\), as in the standard Cournot model
of homogeneous products. Graphically, when d = b, the best response function has a slope of
−1/2, whereas when d < b, this slope, \(-\frac{d}{2b}\), becomes smaller than 1/2 in absolute value, thus
producing a pivoting effect on the best response function: it becomes flatter. Intuitively, competition
between firms is softened, because every firm is induced to reduce its output by a smaller amount
in response to its rival’s output when products are differentiated (b > d) than when they are
homogeneous (d = b). As an exercise, you can find firm B’s best response function, which is
symmetric to that of firm A; that is, \(q_B(q_A) = \frac{a-c}{2b} - \frac{d}{2b} q_A\). We can then invoke symmetry in
equilibrium output, \(q_i^* = q_j^* = q\), which yields
\[ q = \frac{a-c}{2b} - \frac{d}{2b} q. \]
17. Recall that, in order to obtain the point where the best response function crosses the horizontal axis, we only
need to set it equal to zero, \(\frac{a-c}{2b} - \frac{d}{2b} q_B = 0\), and solve for qB . Rearranging, we find \(\frac{a-c}{2b} = \frac{d}{2b} q_B\), or \(q_B = \frac{a-c}{d}\).
Rearranging this, we find \(\frac{(2b+d)q}{2b} = \frac{a-c}{2b}\). Solving for q, we obtain the equilibrium output:
\[ q^* = \frac{a-c}{2b+d}. \]
When products are completely differentiated (d = 0), this output becomes \(\frac{a-c}{2b}\), as in
monopoly, whereas when products are homogeneous (d = b), this output simplifies to
\(\frac{a-c}{2b+b} = \frac{a-c}{3b}\), as in the Cournot model described in section 14.3.1. Equilibrium price is then
given by
\[ p_i^* = a - bq_i^* - dq_j^* = a - b\underbrace{\frac{a-c}{2b+d}}_{q_i^*} - d\underbrace{\frac{a-c}{2b+d}}_{q_j^*} = \frac{ab + c(b+d)}{2b+d}, \]
whereas equilibrium profits for every firm i are
\[ \pi_i^* = (p^* - c)\,q^* = \left(\frac{ab + c(b+d)}{2b+d} - c\right)\frac{a-c}{2b+d} = \frac{(a-c)^2\, b}{(2b+d)^2}. \]
As suggested previously, when products are completely differentiated (d = 0), this profit
becomes \(\frac{(a-c)^2 b}{(2b+0)^2} = \frac{(a-c)^2}{4b}\), as in monopoly, whereas when products are homogeneous
(d = b), this profit collapses to \(\frac{(a-c)^2 b}{(2b+b)^2} = \frac{(a-c)^2}{9b}\), as in the Cournot model.
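The two limiting cases can be verified directly from the equilibrium expressions. The following sketch, with a hypothetical function name and illustrative parameter values (a = 12, b = 1, c = 4), evaluates the symmetric equilibrium for any d between 0 and b.

```python
from fractions import Fraction


def differentiated_cournot(a, b, c, d):
    """Symmetric equilibrium with inverse demand p_i = a - b*q_i - d*q_j.

    Returns (output per firm, price, profit per firm).
    """
    a, b, c, d = map(Fraction, (a, b, c, d))
    q = (a - c) / (2 * b + d)
    price = a - b * q - d * q   # both firms produce q in equilibrium
    profit = (price - c) * q
    return q, price, profit


# d = 0: each firm acts as a monopolist in its own market.
q_m, p_m, pi_m = differentiated_cournot(12, 1, 4, 0)
# d = b: homogeneous goods, as in the standard Cournot duopoly.
q_c, p_c, pi_c = differentiated_cournot(12, 1, 4, 1)
print(q_m, pi_m, q_c, pi_c)
```

For intermediate values of d, output and profit per firm lie between the monopoly and Cournot levels.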
Example 14.7: Output competition with product differentiation Consider two
firms, A and B, facing the demand curves
pA (qA , qB ) = 100 − 5qA − 2qB and pB (qA , qB ) = 100 − 5qB − 2qA .
In this context, parameters are a = 100, b = 5, and d = 2, which indicates that own-price
effects are larger than cross-price effects (i.e., b > d). In addition, assume that
both firms have a symmetric marginal cost of c = 3. Inserting these parameters in
the previous equilibrium results, we obtain that equilibrium output is
\(q^* = \frac{100-3}{(2 \times 5)+2} = \frac{97}{12} \approx 8.08\) units. The equilibrium price is then
\[ p_i^* = \frac{(100 \times 5) + 3(5+2)}{(2 \times 5)+2} = \frac{521}{12} \approx \$43.42, \]
and profits become \(\pi_i^* = \frac{(100-3)^2\, 5}{[(2 \times 5)+2]^2} \approx \$326.7\).
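The equilibrium in example 14.7 can be cross-checked without the closed-form expression by iterating the two best response functions until they stop changing. This is only an illustrative sketch: the starting point of zero output and the number of rounds are assumptions made here, and the iteration converges because each best response has slope d/(2b) = 0.2 in absolute value.

```python
def best_response(q_rival, a=100.0, b=5.0, c=3.0, d=2.0):
    """Best response (a - c)/(2b) - d/(2b) * q_rival, truncated at zero output."""
    return max(0.0, (a - c) / (2 * b) - d / (2 * b) * q_rival)


def iterate_best_responses(q_a=0.0, q_b=0.0, rounds=100):
    """Update both firms' outputs simultaneously for a fixed number of rounds."""
    for _ in range(rounds):
        q_a, q_b = best_response(q_b), best_response(q_a)
    return q_a, q_b


q_a, q_b = iterate_best_responses()
price = 100 - 5 * q_a - 2 * q_b
profit = (price - 3) * q_a
print(round(q_a, 2), round(price, 2), round(profit, 1))
```

The outputs settle at the crossing point of the two best response functions, which is the symmetric equilibrium output 97/12 found in the example.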
Self-assessment 14.8 Consider the setting in example 14.7, but assume that both firms
experience a higher marginal production cost of c = 5 (rather than c = 3). How are the
results in example 14.7 affected?
Exercise 22 at the end of the chapter examines how our results are affected when firms, still
selling differentiated products, compete on prices (à la Bertrand) rather than on quantities.
For a more detailed presentation of imperfectly competitive markets, see Cabral (2017).
Appendix. Cournot Model with N Firms
How do the results in the Cournot model of simultaneous quantity competition change if,
rather than N = 2 firms, we consider more firms? To answer this question, let us first write
the inverse demand function in this scenario, p(Q) = a − bQ, where Q denotes the aggre-
gate output by all firms. Alternatively, we can express Q = qi + Q−i , which decomposes
aggregate output into two components: qi , the output that firm i produces, and Q−i , which
represents the production of all firms other than firm i; that is,
Q−i = q1 + q2 + … + qi−1 + qi+1 + … + qN ,
where you can see that term qi is not included in the sum. For instance, if there are N = 4
firms in the market, and firm i is the second firm, then i = 2 and Q−2 = q1 + q3 + q4 . (Note
that Q−i sums across the output of N − 1 firms, because firm i is not included in the sum.)
The representation of aggregate output as Q = qi + Q−i allows us to rewrite the inverse
demand function as
\[ p(q_i, Q_{-i}) = a - b\underbrace{(q_i + Q_{-i})}_{Q}. \]
If all N firms face the same marginal cost c, where c satisfies a > c > 0, every firm i solves
the following PMP:
\[ \max_{q_i} \; [a - b(q_i + Q_{-i})]\, q_i - cq_i. \tag{PMP$_i$} \]
Differentiating with respect to firm i’s output qi , we obtain
a − 2bqi − bQ−i − c = 0.
Rearranging this, we find a − c − bQ−i = 2bqi . Solving for qi yields firm i’s best response
function:
\[ q_i(Q_{-i}) = \frac{a-c}{2b} - \frac{1}{2} Q_{-i}. \tag{BRF$_i$} \]
Firm i’s best response function informs us about this firm’s profit maximizing output qi
as a function of the sum of its rivals’ output, Q−i . The best response function is analogous
to the one that we previously found for the Cournot model with only two firms: it originates
at \(\frac{a-c}{2b}\) and decreases in Q−i at a rate of 1/2. In addition, this function captures the Cournot
model with two firms as a special case. Indeed, if we consider only two firms i and j, then
firm i has a single rival (firm j), and thus the total output of firm i’s rivals is Q−i = qj . In that
scenario, the BRFi collapses to that given in section 14.3.1.
Let us now continue with the case of N ≥ 2 firms. Because all firms are symmetric,
they all solve a problem similar to PMPi , obtaining best response functions like that in
BRFi , a result that holds for every firm i. We can hence invoke symmetry in equilibrium
output, which means that q1 = q2 = … = qN = q; that is, every firm produces the same
amount in equilibrium. (We just dropped the subscripts in the individual output levels,
which greatly simplifies our next calculations!) Therefore, aggregate output is Q = Nq,
and the sum of firm i’s rivals’ output is Q−i = (N − 1)q. Inserting this result in the BRFi ,
we find
\[ q = \frac{a-c}{2b} - \frac{1}{2}\underbrace{(N-1)q}_{Q_{-i}}, \]
which depends on output q alone; all other elements are parameters (and treated as given by
the firm). Rearranging this yields
\[ \frac{2q + (N-1)q}{2} = \frac{a-c}{2b}, \]
which further simplifies to \(q\,[2 + (N-1)] = \frac{a-c}{b}\). Solving for q, we obtain the equilibrium
output in a Cournot model with N ≥ 2 firms:
\[ q^* = \frac{1}{N+1}\,\frac{a-c}{b}. \]
This individual output level decreases with the number of firms operating in the market,
N. Intuitively, as more firms compete, the individual production of each firm decreases.18
The aggregate output in this scenario becomes
\[ Q^* = Nq^* = N\,\frac{1}{N+1}\,\frac{a-c}{b}, \]
18. To confirm this result, differentiate equilibrium output q∗ with respect to N, finding \(\frac{\partial q^*}{\partial N} = -\frac{a-c}{(N+1)^2 b}\), which
is negative, given that a > c by assumption, implying that q∗ decreases with the number of firms, N. As a numerical
example, consider a = 100, b = 1, and c = 10. Then, this individual output simplifies to \(q^* = \frac{1}{N+1}\,\frac{100-10}{1} = \frac{90}{N+1}\),
which is \(\frac{90}{2+1} = 30\) units in the case of N = 2 firms, \(\frac{90}{3+1} = 22.5\) in the case of N = 3 firms, \(\frac{90}{4+1} = 18\) in the case
of N = 4 firms, and so on for a larger number of firms.
which increases as more firms enter the industry, N.19 The equilibrium price is
\[ p(Q^*) = a - bQ^* = a - b\underbrace{N\,\frac{1}{N+1}\,\frac{a-c}{b}}_{Q^*} = \frac{a + Nc}{N+1}, \]
which is decreasing with the number of firms, N.20 Interestingly, the results in this model
encompass the results in previous chapters as special cases. To see this, let us start by
considering an industry with only one firm (a monopoly), entailing N = 1.
Monopoly. If we insert N = 1 into our equilibrium output q∗ , we obtain
\[ q^* = \frac{1}{1+1}\,\frac{a-c}{b} = \frac{a-c}{2b}. \]
Aggregate output is, of course, \(Q^* = Nq^* = \frac{a-c}{2b}\), and equilibrium price becomes \(p^* = \frac{a+c}{1+1} = \frac{a+c}{2}\).
Needless to say, these three results coincide with the profit-maximizing output and
price found in monopoly (see chapter 10).
Duopoly. Let us now consider an oligopoly with two firms (i.e., a duopoly). Inserting N = 2
into the previous results, we obtain that individual output is
\[ q^* = \frac{1}{2+1}\,\frac{a-c}{b} = \frac{a-c}{3b}, \]
aggregate output is \(Q^* = Nq^* = 2q^* = \frac{2(a-c)}{3b}\), and equilibrium price becomes \(p^* = \frac{a+2c}{2+1} = \frac{a+2c}{3}\),
which also coincides with the profit-maximizing output and price found in duopoly
(see section 14.3.1 of this chapter).
Perfect competition. Lastly, consider an industry with a large number of firms, N → +∞,
as in perfectly competitive markets where each firm represents a negligible share of the
industry. We start by finding the limit of the individual output found previously, when
N → +∞,
\[ \lim_{N \to +\infty} q^* = \lim_{N \to +\infty} \frac{1}{N+1}\,\frac{a-c}{b} = 0; \]
19. To see this point, differentiate Q∗ with respect to the number of firms N, to obtain \(\frac{\partial Q^*}{\partial N} = -\frac{(a-c)N}{b(N+1)^2} + \frac{a-c}{b(N+1)}\),
which simplifies to \(\frac{a-c}{b(N+1)^2}\). This expression is positive because a > c by assumption, implying that aggregate
output, Q∗ , increases with the number of firms, N.
20. Differentiating the equilibrium price p(Q∗ ) with respect to the number of firms N, we find \(\frac{\partial p(Q^*)}{\partial N} = \frac{c}{N+1} - \frac{a+Nc}{(N+1)^2}\),
which collapses to \(-\frac{a-c}{(N+1)^2}\). Because a > c by assumption, the derivative is unambiguously negative,
implying that the equilibrium price decreases with the number of firms, N.
and aggregate output is given by
\[ \lim_{N \to +\infty} Q^* = \lim_{N \to +\infty} N\,\frac{1}{N+1}\,\frac{a-c}{b} = \frac{a-c}{b}, \]
whereas equilibrium price becomes
\[ \lim_{N \to +\infty} p^* = \lim_{N \to +\infty} \frac{a + Nc}{N+1} = c. \]
As suspected, this coincides with the profit-maximizing output and price obtained in
perfectly competitive markets (see chapter 9).
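The three special cases can be reproduced with a short calculation. The sketch below uses a hypothetical function name and the illustrative parameters a = 100, b = 1, and c = 10 from the footnote's numerical example, with a large N standing in for the limit N → +∞.

```python
from fractions import Fraction


def cournot_n(a, b, c, n):
    """Symmetric Cournot equilibrium with n firms and inverse demand p = a - b*Q.

    Returns (output per firm, aggregate output, price).
    """
    a, b, c = map(Fraction, (a, b, c))
    q = (a - c) / ((n + 1) * b)
    total = n * q
    price = a - b * total
    return q, total, price


for n in (1, 2, 3, 4, 10**6):
    q, total, price = cournot_n(100, 1, 10, n)
    print(n, float(q), float(total), float(price))
```

As N grows, individual output approaches zero, aggregate output approaches (a − c)/b = 90, and price approaches the marginal cost of 10.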
Self-assessment 14.9 Repeat the analysis in this subsection, but assume that firms
face an inverse demand function p(Q) = 10 − 2Q, and all N firms have a marginal cost
of c = 3. Evaluate your results under monopoly, duopoly, and perfect competition.
Exercises
1. Herfindahl-Hirschman index.A Calculate the HHI in the following markets, where three firms
operate under different levels of market share:
(a) Each firm has an equal share of the market (i.e., 33.3 percent).
(b) One firm captures 50 percent of the market, while the other two each have 25 percent.
(c) One firm captures 80 percent of the market, while the other two each have 10 percent.
(d) Two firms have 45 percent of the market, while the other firm has 10 percent.
(e) How do these different market shares (in parts a–d) affect the HHI?
2. Cournot competition between two breweries.B Two breweries across the street from each other
sell slightly differentiated beers. Clay’s Brews (subscript C) has demand pC = 10 − qC − 0.5qJ
and total cost TCC (qC ) = 10 + qC , while John’s Barley Sodas (subscript J ) has demand pJ = 14 −
qJ − 0.5qC and total cost TC(qJ ) = 12 + 1.1qJ .
(a) Find Clay’s and John’s best response functions.
(b) How much beer will Clay and John sell, and what will each one set the price at?
3. Properties of the best response function.B Consider the best response function of firm 1 in the
Cournot model of quantity competition:
\[ q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2} q_2. \]
Let us do some comparative statics, to understand how this expression changes as we increase one
parameter at a time.
(a) How is the best response function q1 (q2 ) affected by a marginal increase in the vertical
intercept of the inverse demand function, a? Interpret.
(b) How is the best response function q1 (q2 ) affected by a marginal increase in the slope of inverse
demand function, b? Interpret.
(c) How is the best response function q1 (q2 ) affected by a marginal increase in the firm’s marginal
production cost, c? Interpret.
4. Properties of the Cournot equilibrium.B Consider the equilibrium in the Cournot model of quantity
competition, where every firm produces \(q^* = \frac{a-c}{3b}\). Let us do some comparative statics in order
to understand how this expression changes as we increase one parameter at a time.
(a) How is the equilibrium output q∗ affected by a marginal increase in the vertical intercept of
the inverse demand function, a? Interpret.
(b) How is the equilibrium output q∗ affected by a marginal increase in the slope of inverse demand
function, b? Interpret.
(c) How is the equilibrium output q∗ affected by a marginal increase in the firm’s marginal
production cost, c? Interpret.
5. Symmetric Cournot.A Two medical supply companies are the only two firms that supply stetho-
scopes to the medical professionals and are competing à la Cournot: Hearts Beat (H) and Lungs
Breathe (L). Inverse market demand is p = 50 − 2(qH + qL ), and each firm has the same total cost
of producing stethoscopes of TC(qi ) = 5qi .
(a) Write the PMP for Hearts Beat and Lungs Breathe.
(b) Find each firm’s best response function.
(c) Find the equilibrium quantity that each firm will produce, and the market price.
6. Cournot with asymmetric marginal costs.B Consider the Cournot duopoly in section 14.3.1.
Assume that firm 1 faces marginal cost c1 , while firm 2’s is c2 , where c1 < c2 (so firm 1 enjoys a
cost advantage relative to firm 2) and a > c2 .
(a) Find the best response function of firm 1 and of firm 2. Compare them.
(b) Insert firm 2’s best response function into that of firm 1, to find the output that each firm
produces in the NE of the Cournot game of quantity competition. Which firm produces a
larger output?
(c) Find equilibrium price and equilibrium profits for each firm. Which firm earns a larger profit?
(d) Assume that both firms now become cost symmetric, so that, c1 = c2 = c. Evaluate your results
from parts (b) and (c) at c1 = c2 = c, showing that you obtain the same results as in section
14.3.1.
7. Cournot with asymmetric fixed costs.B Consider the Cournot duopoly in section 14.3.1. Assume
that firm 1 faces a TC function TC1 (q1 ) = F1 + cq1 , where F1 > 0 denotes its fixed cost and c > 0
represents its marginal cost. Firm 2’s TC function is TC2 (q2 ) = F2 + cq2 , where F2 > 0 denotes
its fixed cost and satisfies F2 > F1 , and c > 0 is the same marginal cost as firm 1. Consider that
firms still face a linear inverse demand function p(Q) = a − bQ, where parameter a satisfies a > c
and b > 0. The scenario is therefore analogous to the Cournot duopoly of section 14.3.1, except
for the fact that both firms now face fixed costs of production.
(a) Find the best response function of each firm, as well as the equilibrium output.
(b) How are the equilibrium results affected? Interpret.
8. Cournot with asymmetric marginal costs.A Two firms, Melissa’s Meals (M) and Stephanie’s
Sustenance (S), compete à la Cournot over the service of meal delivery via bicycle. Melissa has
extensive knowledge of bike maintenance and keeps her bike fleet in tip-top shape so that her total
costs are TCM (qM ) = 1 + qM , while Stephanie is not as good at maintenance and upkeep, so her
total costs are higher, TCS (qS ) = 2qS . Inverse market demand is p = 12 − 0.3(qM + qS ).
(a) Write down the PMP for Melissa and Stephanie.
(b) Find each firm’s best response function.
(c) Find the equilibrium quantity that each firm will produce, as well as the market price.
(d) Will each firm produce an identical amount? Why or why not?
(e) Find equilibrium profits for each firm and compare them.
9. Cournot with three firms.C Consider a market with three firms producing a homogeneous good
and facing a linear demand function p(Q) = 1 − Q, where Q ≡ q1 + q2 + q3 denotes aggregate
output. All firms face a constant marginal cost of production given by c, where 1 > c > 0.
(a) Set up firm 1’s PMP, differentiate with respect to its output q1 , and obtain this firm’s best
response function. [Hint: It should be a function of firm 2’s and 3’s output, q2 and q3 .]
(b) Repeat the process for firms 2 and 3, to obtain their best response functions. [Hint: You should
find that all firms have symmetric best response functions.]
(c) Interpret firm 1’s best response function: if firm 2 were to marginally increase its output, does
firm 1 increase or decrease its own output? Either way, by how much?
(d) Using the three best response functions for these firms, find the point where they cross. The
triplet (q∗1 , q∗2 , q∗3 ) characterizes the NE of this Cournot game.
(e) Is the equilibrium output that you found in part (d) increasing or decreasing in marginal
cost c?
(f) Find the price that emerges in equilibrium, along with the profits that every firm earns.
10. Investigating the Bertrand equilibrium.A Consider the Bertrand model of simultaneous price
competition in section 14.3.2.
(a) Assume that firms’ common marginal cost c increases to c′, where c′ > c. How are the results
in that section affected?
(b) Our presentation assumed two firms competing in prices. Repeat the analysis in that section,
assuming N ≥ 2 firms. How are the main findings in that section affected by the number of
firms?
11. Finitely repeated Grim Trigger Strategy.B Let us repeat example 14.5, but without considering
an infinitely repeated game.
(a) Assume that firms interact for T = 2 periods. Can the GTS in example 14.5 be sustained as
an SPE of the game?
(b) Assume that firms interact for T ≥ 2 periods. Can the GTS in example 14.5 be sustained as
an SPE of the game?
12. Grim Trigger Strategy and Bertrand.B Consider our analysis of collusive behavior between
two firms which competed on the basis of quantities (section 14.3.3). Assume that firms compete
on the basis of prices (à la Bertrand). For which discount factor δ can collusion be sustained?
Compare your results with those in section 14.3.3.
13. Collusion with delayed detection.B Consider self-assessment 14.6, allowing firms to deviate
from the collusive outcome without being detected during three periods. This means that cheat-
ing is still detected with certainty, but with a lag of three periods rather than immediately. If a
firm can deviate and earn a profit of $9 during three periods before the punishment starts, what
is the minimal discount factor supporting cooperation? Compare your results with those under
immediate detection.
14. Collusion in a one-shot game.A Let us now repeat example 14.5, but in a one-shot version,
where every firm chooses to either produce the cartel output, qCartel , or the Cournot competition
output, qCournot .
(a) Find each firm’s profit when both choose qCartel , when both choose qCournot , and when only
one chooses qCartel .
(b) Show that every firm finds qCournot a strictly dominant strategy, so cooperation cannot be
supported if the game is unrepeated.
(c) Let us now generalize these findings by allowing that firms can choose any output level, rather
than restricting them to select either qCartel or qCournot . Show that, if firm j chooses qCartel ,
firm i’s best response is not to choose qCartel , making the cartel output unsustainable in the
unrepeated (one-shot) version of the game.
15. Colluding barbecue.B Mike (M) and Jeff (J ) each owns a barbecue place in a Southern town.
Market demand for barbecue is p = 15 − 0.75Q, where Q = qM + qJ . Mike’s costs are TC(qM ) =
10 + 1.5qM , and Jeff ’s are TC(qJ ) = 5 + 3qJ .
(a) Assuming that Mike and Jeff compete in quantities (à la Cournot), find their best response
functions.
(b) Find the equilibrium price and quantity for Mike and Jeff.
(c) If Mike and Jeff form a cartel, how much barbecue product will they sell, and at what price?
16. Colluding gas stations.C Two gasoline stations are situated across the street from each other
and are in fierce competition. They face market demand of p = 10 − 0.05Q, where Q = q1 + q2
denotes aggregate output, and each has total cost TC(qi ) = 10 + 0.5qi , where i ∈ {1, 2} denotes
the firm.
(a) If firms compete on the basis of quantities, find each firm’s best response function.
(b) Find equilibrium output for each firm, price, and profits.
(c) If the firms collude, what equilibrium price and quantity will each firm offer? What will their
profits be?
(d) Suppose that the firms play an infinitely repeated game and seek to coordinate their production
decisions through the GTS considered in example 14.5. What discount factor supports continued
collusion?
17. Properties of the Stackelberg equilibrium.A Consider the equilibrium output in the Stackelberg
game discussed earlier in the chapter, \(q_1^* = \frac{a-c}{2b}\) for the leader and \(q_2^* = \frac{a-c}{4b}\) for the follower. Let
us do some comparative statics in order to understand how these expressions change as we increase
one parameter at a time.
(a) How are equilibrium output q∗1 and q∗2 affected by a marginal increase in the vertical intercept
of the inverse demand function, a? Interpret.
(b) How are equilibrium output q∗1 and q∗2 affected by a marginal increase in the slope of inverse
demand function, b? Interpret.
(c) How are equilibrium output q∗1 and q∗2 affected by a marginal increase in the firm’s marginal
production cost, c? Interpret.
18. Stackelberg with two and three firms.C Consider a market where two firms produce a homo-
geneous good, and face a linear demand function p(Q) = 1 − Q, where Q ≡ q1 + q2 denotes
aggregate output. All firms face a constant marginal cost of production given by c, where
1 > c > 0. Firm 1 is the industry leader, choosing its output q1 in the first stage; firm 2 is the
follower, who observes the choice of q1 from the leader and responds with its own output level
q2 in the second stage of the game.
(a) Find the follower’s best response function. Interpret.
(b) Set up the leader’s PMP. [Hint: You will need to insert the follower’s best response function
in the leader’s profits.]
(c) Find the leader’s optimal output q∗1 . Which output does the follower respond with?
(d) Allowing for three firms. Assume now that a third firm enters the industry. The time structure
remains unaffected: first, firm 1 chooses output q1 ; observing q1 , firm 2 responds with its
output q2 ; and, observing both q1 and q2 , firm 3 responds by choosing its output q3 . Follow
the same process as in the two-firm version of the game to find the output levels that firms
1–3 choose in the equilibrium of the Stackelberg game.
19. Cournot versus Stackelberg.B Consider two neighboring wineries in fierce competition over
the production of their specialty wine (where their grapes come from the same vineyard, so we
assume that their wines are regarded as identical by customers). One winery is owned by Jill (J ),
and the other by Ray (R). Each winery produces its wine the same way and has the same TC
function TCi (qi ) = 3 + 0.5qi . Inverse market demand for wine is p = 50 − 2(qJ + qR ).
(a) Cournot competition. Write down the PMP for each firm if they compete on the basis of
quantities.
(b) If the firms compete à la Cournot, what is each winery’s equilibrium output and price?
(c) Stackelberg competition. If Jill was able to get her wine to market first (and become a
Stackelberg leader), how will each winery’s output and price change?
20. Comparing monopoly and Stackelberg.B Patents give pharmaceutical companies monopoly
rights over new drugs. After the patent expires, generic versions of these drugs hit the market.
Consider such a market, where demand for a new drug is p = 500 − 5q and the company that
created it (i.e., leader) has total cost of TC = 25 − 2q + 0.5q² .
(a) If the leader has monopoly rights for the product, what will the equilibrium price, quantity,
and profit be for this drug?
(b) After the monopoly rights end and a generic version of the drug is released, what will happen
to the market equilibrium? (For simplicity, assume that only one other competitor releases
the generic drug, has the same costs as the leader, and acts as a follower.)
(c) Compare the equilibrium price, quantity, and profit for the leader in the market.
21. Product differentiation.A Two companies sell cell phone cases and compete over quantity. Each
firm has a slightly different case, but the two companies, 1 and 2, face symmetric demand as
follows:
pi = 25 − 2qi − qj ,
where i ∈ {1, 2} and j ≠ i. This inverse demand indicates that every firm i is more significantly
affected by its own sales, qi , than by its rival’s sales, qj . Each firm has total cost TCi = qi + 0.5qi² .
Assuming that the firms compete over quantity, find the equilibrium output, price, and profit.
22. Bertrand and product differentiation.B Consider a similar scenario to that in section 14.5,
where two firms, A and B, offer a differentiated good but now compete over prices. The firms’
demand functions are
$$q_A(p_A, p_B) = 1 - p_A + \frac{1}{2} p_B \quad \text{and} \quad q_B(p_A, p_B) = 1 - p_B + \frac{1}{2} p_A .$$
Intuitively, every firm’s sales are more sensitive to its own price than to its rival’s price. For com-
pactness, we refer to this property by saying that own-price effects dominate cross-price effects.
Assume that each firm faces the same constant marginal cost, c = 1/8.
(a) Find each firm’s pricing best response function. Interpret its slope.
(b) What are each firm’s equilibrium price and output? Do the firms practice marginal cost
pricing?
23. Stackelberg prices with homogeneous goods.A Consider the Bertrand model in 14.3.2, except
now firm 1 can set prices first, while firm 2 is the follower. Show that in this Stackelberg version
of the Bertrand game, the equilibrium set of prices is (p1 , p2 ) = (c, c).
24. Stackelberg prices with heterogeneous goods.B Consider a similar scenario to that in section
14.5, where two firms, A and B, offer a differentiated good but now compete over prices. The
firms’ demand functions are
$$q_A(p_A, p_B) = 1 - p_A + \frac{1}{2} p_B \quad \text{and} \quad q_B(p_A, p_B) = 1 - p_B + \frac{1}{2} p_A .$$
As in previous exercises with heterogeneous goods, these demand functions indicate that every
firm’s sales are more sensitive to its own price than to its rival’s price. Stated more compactly,
own-price effects dominate cross-price effects. Assume that each firm faces the same constant
marginal cost, c = 1/8.
(a) Second stage (follower). If firm A is a Stackelberg leader and can set prices first, what is firm
B’s best response function?
(b) First stage (leader). Set up firm A’s PMP and solve for the price that they offer.
(c) What price will firm B offer?
25. Reconciling Bertrand and Cournot through capacity.C A common criticism of the Bertrand
model of price competition is that firms face no capacity constraints. In particular, if firm 1 sets
the lowest price in the market, it attracts all customers and can serve them regardless of how
large demand is. In this exercise, we add a previous stage to the standard Bertrand model of price
competition where firms choose a capacity level.
Consider a market with two firms. In the first stage, each firm i chooses a production capacity
q̄i at a cost of c = 1/4 per unit of capacity, where 0 ≤ q̄i ≤ 1. In the second stage, the firms observe
each other’s capacity and respond by competing over prices. Once capacity q̄i is decided, the firms
can produce up to that capacity with zero marginal cost. Each firm faces a demand of p = 1 − Q
and chooses prices simultaneously in the second stage, and sales are distributed as in the Bertrand
model of price competition.
(a) Second stage. Begin in the second stage. Show that both firms set a common price p1 = p2 =
p∗ = 1 − q̄1 − q̄2 in the second stage.
(b) First stage. In the first stage, every firm i simultaneously and independently chooses its
capacity q̄i . How much capacity does each firm invest in?
(c) How do your results compare to the standard Cournot model, with two firms competing on
the basis of quantities, facing the inverse demand function p(Q) = 1 − Q, and marginal cost
c = 1/4?
26. Collusion in a Cournot model with N firms.C Consider a market with N firms producing a
homogeneous good and facing a linear demand function p(Q) = 1 − Q, where Q = q1 + … + qN
denotes aggregate output. All firms face a constant marginal cost of production given by c, where
1 > c > 0.
(a) If all N firms compete on the basis of quantities, what is firm i’s equilibrium output and profit?
(b) What is the equilibrium output and profit for each firm i if N firms were to collude? What is
the discounted stream of profit from colluding if firms collude in an infinitely repeated game?
(c) Consider a GTS where every firm starts cooperating in the first period (producing the cartel
output) and keeps doing so if all firms cooperated in past periods. Otherwise, every firm
produces the Cournot output level thereafter. Assuming a previous history of cooperation,
what is firm i’s discounted stream of profits from cooperating? What is its discounted stream
of profits if it deviates?
(d) What is the discounted stream of profits for a deviating firm if all firms play the GTS described
in part (c)?
(e) Assuming firms collude and play this GTS, what is the lowest common discount factor δ that
sustains collusion?
(f) How does the discount factor found in part (e) change as the number of firms N increases?
Interpret your results.
15 Games of Incomplete Information and Auctions
15.1 Introduction
Previous chapters of this book considered economic situations in which agents (individuals,
firms, or countries) strategically choose their actions simultaneously or sequentially.
We learned how to predict equilibrium behavior with two simple, yet powerful,
tools: the Nash equilibrium (NE) solution concept, with the help of best responses; and
the subgame perfect equilibrium (SPE) concept by applying backward induction. While we
studied different types of games, we always assumed that players were perfectly informed
about each other’s characteristics. This meant that every player could perfectly predict
her opponent’s payoff in every contingency. In other words, we only considered games of
complete information. However, many strategic settings in real life involve elements of
incomplete information, such as the following:
• Firms can observe their own production costs, but they do not perfectly observe their
rivals’ costs. In this context, firms may have estimates about rivals’ costs, but do not
accurately observe them.
• An incumbent firm, with decades of experience in an industry, may have reliable infor-
mation about market demand, while a new entrant in the industry has limited information
about demand.
• Bidders in an auction know how much they are willing to pay for the object being sold (e.g.,
a painting), but usually cannot observe other bidders’ private valuations for the object.
In these scenarios, players need to choose their best strategy by comparing payoffs but,
given their limited information, this payoff comparison may be in expectation: finding the
expected payoff that the player receives from each of her strategies, and then choosing the
strategy that yields the highest expected payoff. As we examine in this chapter, this approach
to selecting optimal strategies is analogous to the NE solution concept explored in chapter
12, but now it is extended to games of incomplete information (i.e., settings where players
do not observe all information about their opponents).
We start the chapter by defining this solution concept, and then applying it to the Cournot
model of quantity competition, where we now assume that firms do not observe each other’s
costs. The rest of the chapter is devoted to the application of this solution concept to auctions,
seeking to predict the optimal bidding strategy that bidders use in an auction. We also look at
various auction formats, such as first-price auctions (FPAs), second-price auctions (SPAs),
and common-value auctions.
15.2 Extending Nash Equilibria to Games of Incomplete Information
Before we extend the NE solution concept to incomplete information games, let us clarify a
couple of points about notation. First, a player’s “type” will be used to represent her private
information. In the example of two firms privately observing their costs, every firm i’s type
is its production cost (e.g., high cH or low cL , where cH > cL ≥ 0). Similarly, in the auction
example, a bidder’s type denotes her valuation for the object being sold, v > 0 (i.e., a positive
dollar amount).
Second, we will express the strategies of player i as a function of her type. Continuing with
the previous example about two privately informed firms, a production strategy specifies
how many units firm i produces as a function of its cost, a number potentially being lower
when the firm experiences a high cost cH than a low cost cL . In the auction setting, a bidding
strategy specifies how much player i bids as a function of her valuation of the object, v, which
we write as bi (v).
We are now ready to extend the NE solution concept to incomplete information games.
First, we need to extend the notion of a player’s best response to allow for incomplete
information, as we do next.
Best response Player i regards strategy si as a “best response” to her rival’s strategy
sj if si yields a weakly higher expected payoff than any other available strategy s′i ≠ si
against sj .
This definition is identical to that of best responses in complete information games
(chapter 12), except for the fact that we are now considering expected payoffs, rather than
payoffs that occur with certainty. To understand this definition, consider again the example
of the two firms discussed previously. Firm i observes its own production cost, such as cH ,
but does not observe that of its rival. We then say that a production strategy qi (cH ) is its best
response to its rival j’s output level if qi (cH ) yields a weakly higher expected profit than any other
output level different from qi (cH ).1
1. This assumes, of course, that firm i’s type (its cost cH ) is given because that is something that the firm cannot
change.
The definition given here says something more: firm i must have an optimal production
strategy for each of its possible types (e.g., costs) [i.e., a profit-maximizing output when its
costs are high, qi (cH ), and one when its costs are low, qi (cL )]. As a result, a best response
in this context can be understood as a list specifying this player’s optimal strategy for each
of her privately observed types.
Using this extended version of best response, we can define a Bayesian Nash equilibrium
(BNE) as follows:
Bayesian Nash Equilibrium (BNE) A strategy profile (s∗i , s∗j ) is a Bayesian Nash
equilibrium if every player chooses, for each of her types, a best response (evaluated
in expectation) given her rivals’ strategies.
Like the definition of best response, the BNE definition is analogous to that of NE, except
for the fact that players choose best responses by comparing expected payoffs rather than
certain payoffs. Intuitively, players select mutual best responses to each other’s strategies,
where best responses are now “lists,” as discussed previously, specifying which strategy a
player chooses for each of her possible types.
We understand that these definitions, while maintaining a common theme with those pre-
sented in chapter 12 for complete information games, can look a bit intimidating. Without
further ado, we apply this definition to the two-firms example, which should illustrate how
to approach strategic scenarios where players interact under incomplete information.
Example 15.1: Cournot competition, with asymmetric information about costs

Consider a duopoly game where two firms compete on the basis of quantities and face
the inverse demand function p = 1 − q1 − q2 . Assume that firm 1 is an incumbent that
operated in the industry for decades, with marginal cost MC1 = 0, which every firm
can accurately estimate. Firm 2 privately observes its marginal costs, which can be
low, MC2 = 0, or high, MC2 = 1/4. Because firm 2 is a newcomer (i.e., a company
from a different industry or from a foreign country), firm 1 cannot accurately observe
firm 2’s costs, but after some research (e.g., hiring a consulting company), it assigns
equal probability to firm 2 having low and high costs. We now seek to find the BNE
of this duopoly game, specifying how much every firm produces. We can start by
focusing on the informed player (firm 2).
Firm 2’s best response. When firm 2 has low costs (MC2 = 0), it chooses its produc-
tion level qL2 (where superscript L indicates that the firm has low costs), to maximize
its profits as follows:
$$\max_{q_2^L \ge 0} \ \pi_2^L = (1 - q_1 - q_2^L)\, q_2^L .$$
Differentiating with respect to qL2 yields
1 − q1 − 2qL2 = 0,
and solving for qL2 , we obtain firm 2’s best response function when experiencing low
costs:
$$q_2^L(q_1) = \frac{1}{2} - \frac{1}{2} q_1 . \qquad (BRF_2^L(q_1))$$
On the other hand, when firm 2 has high costs (MC2 = 1/4), its profit maximization
problem (PMP) is
$$\max_{q_2^H \ge 0} \ \pi_2^H = (1 - q_1 - q_2^H)\, q_2^H - \frac{1}{4} q_2^H .$$
Differentiating with respect to qH2 yields
$$1 - q_1 - 2 q_2^H - \frac{1}{4} = 0,$$
and solving for qH2 , we find firm 2’s best response function when experiencing high
costs:
$$q_2^H(q_1) = \frac{3}{8} - \frac{1}{2} q_1 . \qquad (BRF_2^H(q_1))$$
Comparing the best response function under low and high costs, we can see that for
a given output level of firm 1, q1 , firm 2 responds by producing more units when
its own costs are low than when they are high because qL2 (q1 ) > qH2 (q1 ) for every
value of q1 .2 Graphically, qL2 (q1 ) and qH2 (q1 ) are parallel to each other, but qL2 (q1 )
originates at 1/2, while qH2 (q1 ) originates at a lower height, 3/8 = 0.375.
2. As an example, consider that q1 = 1/2 units. Then firm 2 produces qL2 (1/2) = 1/2 − 1/2 × 1/2 = 1/4 units when its
costs are low, but only qH2 (1/2) = 3/8 − 1/2 × 1/2 = 1/8 units when its costs are high.
Firm 1. Let us now analyze firm 1 (the uninformed player in this game). This firm
seeks to maximize its expected profits because it does not observe whether firm 2 has
low or high costs. Then firm 1 solves the following problem:
$$\max_{q_1 \ge 0} \ \pi_1 = \underbrace{\frac{1}{2}(1 - q_1 - q_2^L)\, q_1}_{\text{if firm 2 has low costs}} + \underbrace{\frac{1}{2}(1 - q_1 - q_2^H)\, q_1}_{\text{if firm 2 has high costs}} = \left(1 - q_1 - \frac{q_2^L}{2} - \frac{q_2^H}{2}\right) q_1 .$$
Differentiating with respect to q1 yields
$$1 - 2 q_1 - \frac{q_2^L}{2} - \frac{q_2^H}{2} = 0.$$
Solving for q1 , we obtain firm 1’s best response function:
$$q_1(q_2^L, q_2^H) = \frac{1}{2} - \frac{1}{4} q_2^L - \frac{1}{4} q_2^H , \qquad (BRF_1(q_2^L, q_2^H))$$
which is a function of both firm 2’s output when having low costs, qL2 , and when having
high costs, qH2 . We have thus found three best response functions, which we can solve to
obtain the three unknown output levels q1 , qL2 , and qH2 . Inserting the best response
functions for firm 2, qL2 (q1 ) and qH2 (q1 ), into the expression of firm 1’s best response
function, q1 (qL2 , qH2 ), yields
$$q_1 = \frac{1}{2} - \frac{1}{4} \underbrace{\left(\frac{1}{2} - \frac{1}{2} q_1\right)}_{q_2^L(q_1)} - \frac{1}{4} \underbrace{\left(\frac{3}{8} - \frac{1}{2} q_1\right)}_{q_2^H(q_1)},$$
which simplifies to q1 = (9 + 8q1 )/32. Solving for output q1 , we obtain q1 = 3/8. We can now
plug this result into firm 2’s best response function, first when having low costs,
$$q_2^L\left(\frac{3}{8}\right) = \frac{1}{2} - \frac{1}{2} \cdot \frac{3}{8} = \frac{5}{16},$$
and then when having high costs,
$$q_2^H\left(\frac{3}{8}\right) = \frac{3}{8} - \frac{1}{2} \cdot \frac{3}{8} = \frac{3}{16}.$$
Therefore, the BNE of this duopoly game with incomplete information prescribes
production levels
$$(q_1, q_2^L, q_2^H) = \left(\frac{3}{8}, \frac{5}{16}, \frac{3}{16}\right).$$
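The three best response functions of example 15.1 can also be solved numerically. The following is a minimal sketch, assuming an interior solution (all outputs strictly positive); the function names are hypothetical and chosen only for illustration. It uses exact fractions so that the result can be compared directly with the outputs found above.

```python
from fractions import Fraction

def firm2_best_response(q1, mc2):
    """Firm 2's best response to q1 when its marginal cost is mc2."""
    return max(Fraction(0), (1 - mc2 - q1) / 2)

def firm1_best_response(q2_low, q2_high, prob_low=Fraction(1, 2)):
    """Firm 1's best response (MC1 = 0) to firm 2's type-contingent outputs."""
    expected_q2 = prob_low * q2_low + (1 - prob_low) * q2_high
    return max(Fraction(0), (1 - expected_q2) / 2)

def solve_bne(mc_low=Fraction(0), mc_high=Fraction(1, 4), prob_low=Fraction(1, 2)):
    """Solve the three best response functions, assuming an interior solution."""
    expected_mc2 = prob_low * mc_low + (1 - prob_low) * mc_high
    # Substituting firm 2's best responses into firm 1's gives
    # q1 = (1 + E[MC2]) / 3 when MC1 = 0.
    q1 = (1 + expected_mc2) / 3
    return q1, firm2_best_response(q1, mc_low), firm2_best_response(q1, mc_high)

q1, q2_low, q2_high = solve_bne()
# In a BNE, firm 1's output must be a best response to firm 2's outputs.
assert firm1_best_response(q2_low, q2_high) == q1
print(q1, q2_low, q2_high)  # 3/8 5/16 3/16
```

Changing `mc_high` or `prob_low` shows how the uninformed firm's output responds to its beliefs about the rival's costs, although the closed form used here applies only when MC1 = 0.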
Self-assessment 15.1 Repeat the analysis in example 15.1, but assuming that firm
1’s marginal cost changes to MC1 = $1/2. Firm 2’s costs are still either low, MC2 = $0,
or high, MC2 = $1/4. How are the results in example 15.1 affected? Interpret.
15.3 Auctions
Auctions have long been a large part of the economic landscape, with some auctions
reported as early as around 500 BCE in Babylon and in 193 CE during the Roman Empire.3
Auctions with precise sets of rules had emerged by 1595, the year of the earliest use of the
term recorded in the Oxford English Dictionary, and auction houses like Sotheby’s and
Christie’s were founded as early as 1744 and 1766, respectively. Nowadays, many auctions
are held online on popular websites such as eBay, which had $9 billion in total revenue in
2017 and thousands of employees worldwide, and whose success attracted competitors,
such as QuiBids, into the online auction industry.
Auctions have also been used by governments throughout history. In addition to auction-
ing off treasury bonds, in the last decade, governments started to sell airwaves (3G and
4G technology). For instance, the British 3G telecom licenses generated $34 billion (about
2 percent of their gross domestic product at the time) in what British economists called
“the biggest auction ever.”4 In the rest of the chapter, we study the common ingredients
in most auction formats (understanding them as an allocation mechanism). Then, we ana-
lyze optimal bidding behavior in first-price auctions (FPAs), second-price auctions (SPAs),
common-value auctions, and the so-called winner’s curse.
15.3.1 Auctions as Allocation Mechanisms

Consider N bidders who seek to acquire a certain object, where each bidder i has a valuation
vi for the object, and assume that there is one seller. Note that we can design many different
rules for the auction, following the same auction formats commonly observed in real-life
scenarios. For instance, we could use any of the following:
1. First-price auction (FPA), where the winner is the bidder submitting the highest bid, and
she must pay the highest bid (which in this case is hers).
2. Second-price auction (SPA), where the winner is the bidder submitting the highest bid,
but in this case, she must pay the second-highest bid.
3. Third-price auction, where the winner is still the bidder submitting the highest bid, but
now she must pay the third-highest bid.
4. All-pay auction, where the winner is still the bidder submitting the highest bid, but in
this case, every single bidder must pay the price she submitted.
3. In particular, the Praetorian Guard, after killing Pertinax, the emperor, announced that the highest bidder could
claim the empire. Didius Julianus was the winner, becoming the emperor for two short months, after which he was
beheaded.
4. Several game theorists played an important role in designing and testing the auction format before its final
implementation. In fact, the design of 3G auctions created a great controversy in most European countries during
the 1990s because similar countries collected enormously different revenue amounts from the sale.
These auction formats have several features in common, implying that all auctions can
be interpreted as allocation mechanisms with two main ingredients:
1. An allocation rule, specifying “who gets the object.” The allocation rule for most auctions
determines that the object is allocated to the bidder submitting the highest bid. This
was, in fact, the allocation rule for all four auction formats considered here. However, we
could instead assign the object by using a lottery, where each bidder’s probability of winning
the object is the ratio of her bid to the sum of all bidders’ bids (e.g., for bidder 1,
prob(win) = b1 /(b1 + b2 + … + bN )), an allocation rule often used in certain Chinese auctions.
2. A payment rule, which describes “how much each bidder pays.” For instance, the pay-
ment rule in the FPA determines that the individual submitting the highest bid pays her
own bid, while everybody else pays zero. In contrast, the payment rule in the SPA spec-
ifies that the individual submitting the highest bid (the winner) pays the second-highest
bid, while everybody else pays zero. Finally, the payment rule in the all-pay auction
determines that every individual must pay the bid that she submitted.5
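The two ingredients above (allocation rule and payment rule) can be sketched in a few lines of code. This is an illustrative sketch only: the function name `run_auction` and the rule labels are hypothetical, and ties are broken in favor of the bidder listed first, which is a simplifying assumption rather than a rule taken from the text.

```python
def run_auction(bids, payment_rule):
    """Allocate the object to the highest bidder and compute every bidder's payment.

    bids: list of nonnegative bids, one per bidder.
    payment_rule: "first", "second", "third", or "all-pay".
    Returns (index of the winner, list of payments).
    """
    ranked = sorted(bids, reverse=True)
    # Allocation rule: the highest bid wins (first index wins ties, by assumption).
    winner = bids.index(ranked[0])
    payments = [0] * len(bids)
    # Payment rule: only this part differs across the four formats.
    if payment_rule == "first":
        payments[winner] = ranked[0]
    elif payment_rule == "second":
        payments[winner] = ranked[1]
    elif payment_rule == "third":
        payments[winner] = ranked[2]  # requires at least three bidders
    elif payment_rule == "all-pay":
        payments = list(bids)
    else:
        raise ValueError(f"unknown payment rule: {payment_rule}")
    return winner, payments

bids = [10, 25, 18, 7]
for rule in ("first", "second", "third", "all-pay"):
    print(rule, run_auction(bids, rule))
```

In this example the same bidder wins under every format, and only the vector of payments changes, which is the sense in which the formats share an allocation rule but differ in their payment rule.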
For ease of exposition, we first present SPAs and then move to FPAs. Our presentation
seeks to avoid most technicalities. For a more advanced introduction to auction theory,
see the books by Krishna (2002), Milgrom (2004), Menezes and Monteiro (2004) and
Klemperer (2004).
15.4 Second-Price Auctions
In the SPA class of auctions, bidding your own valuation (i.e., bi (vi ) = vi ) is a weakly
dominant strategy for all players. That is, regardless of the valuation you assign to the object,
and independent of your opponents’ valuations, submitting a bid equal to your valuation,
bi (vi ) = vi , yields an expected profit equal to or higher than that of submitting any other bid,
bi (vi ) ≠ vi . To show that this bidding strategy is an equilibrium outcome of the second-price
auction, let’s first examine bidder i’s expected payoff from submitting a bid that coincides
with her own valuation vi (which we refer to as the “First case”), and then compare it with
what she would obtain by deviating to bids below her valuation for the object, bi (vi ) < vi
(denoted as “Second case”), or above her valuation, bi (vi ) > vi (“Third case”).
5. The all-pay auction format is used by the internet seller QuiBids.com. For instance, if you participate in the sale of a
new iPad, and you submit a low bid of $25, but some other bidder wins by submitting a higher bid, you will still
see your $25 withdrawn from your QuiBids account.
First case: If the bidder submits her own valuation, bi (vi ) = vi , then either of the following
situations can arise:
1a) If the highest competing bid lies below her bid, hi < bi , where hi = max_{j≠i} {bj },6 then
bidder i wins the auction. In this case, she obtains a net payoff of vi − hi because in
an SPA, the winning bidder does not pay the bid she submitted, but rather the second-
highest bid, hi , and in this case, bi > hi .
1b) If, instead, the highest competing bid lies above her bid, hi > bi , then she loses the
auction, earning zero payoff.
We do not consider the case when her bid coincides with the highest competing bid
(i.e., bi = hi ), and thus a tie occurs. Ties are normally solved in auctions by randomly assign-
ing the object to the bidders who submitted the highest bids (e.g., if bidders 3 and 7 are tied
in the highest bid, the auctioneer can flip a coin to determine if bidder 3 or 7 will receive
the object). As a consequence, bidder i’s payoff becomes vi − hi , but with only 1/2 probability
(i.e., her expected payoff is (1/2)(vi − hi )).7 However, because vi = hi in this case, the bidder
earns a zero expected payoff.
Second case: Let us now compare these equilibrium payoffs with those that bidder i could
obtain by deviating toward a bid that shades her valuation (i.e., bi < vi ). In this case, we can
also see three cases emerging (see figure 15.1), depending on the ranking between bidder
i’s bid, bi , and the highest competing bid, hi :
2a) If the highest competing bid hi lies below her bid (i.e., hi < bi ), then she still wins the
auction, obtaining the same net payoff as when she does not shade her bid, vi − hi .
2b) If the highest competing bid hi lies between bi and vi (see case 2b in figure 15.1), bidder i
loses, making zero payoff. Had she submitted a bid equal to her valuation for the object,
she would have won the auction, earning a payoff of vi − hi > 0.
2c) If the highest competing bid hi is higher than vi (see case 2c), bidder i loses the auction,
thus yielding the same outcome as when she submits a bid, bi = vi .
Hence, we just showed that when bidder i shades her bid, bi < vi , in cases 2a–2c, she
obtains the same or lower payoff as when she submits a bid that coincides with her valuation
for the object (bi = vi ). Therefore, she does not have incentives to shade her bid because her
payoff would not improve from doing so.
6. Intuitively, expression hi = max_{j≠i} {bj } just finds the highest bid among all bidders different from bidder i, j ≠ i.
Alternatively, hi can be written more explicitly as hi = max{b1 , b2 , … , bi−1 , bi+1 , … , bN }, where we find the
highest bid among all N bidders except for bidder i (note that we wrote everyone’s bid but i’s, bi ).
7. More generally, if K ≥ 2 bidders are tied submitting the highest bid, the auctioneer randomly assigns the object
to any of them, implying that each of these bidders earns an expected payoff of (1/K)(vi − hi ).
[Figure 15.1: Cases when bidder i shades his bid, bi < vi . Panels for cases 2a, 2b, and 2c each place hi , bi , and vi on a horizontal axis of bids.]
[Figure 15.2: Cases when bidder i bids above his value, bi > vi . Panels for cases 3a, 3b, and 3c each place hi , vi , and bi on a horizontal axis of bids.]
Third case: Let us finally examine bidder i’s equilibrium payoff from submitting a bid above
her own valuation (i.e., bi > vi ). Three cases also arise (see figure 15.2):
3a) If the highest competing bid hi lies below bidder i’s valuation, vi , she still wins, earn-
ing a payoff of vi − hi , which coincides with that when she submits her valuation,
bi = vi .
3b) If the highest competing bid hi lies between vi and bi (see case 3b in figure 15.2), bidder
i wins the object but earns a negative payoff because vi − hi < 0. Had she instead
submitted a bid bi = vi , she would have lost the auction, earning zero payoff.
3c) If the highest competing bid hi lies above bi (see case 3c in figure 15.2), bidder i loses
the auction, earning zero payoff, which coincides with her payoff from submitting
a bid bi = vi .
Hence, bidder i’s payoff from submitting a bid above her valuation either coincides with
her payoff from submitting her own value for the object, or becomes strictly lower, thus
eliminating her incentive to deviate from her equilibrium bid of bi (vi ) = vi . In other words,
there is no bidding strategy that provides a strictly higher payoff than bi (vi ) = vi in the SPA,
and all players bid their own valuation, without shading their bids; in the next section we
see that this result differs from the optimal bidding function in FPA, where players shade
their bids unless N → ∞.
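The case-by-case argument above can be checked by brute force on a grid of bids. The sketch below is an illustration under stated assumptions, not part of the original proof: it considers a single highest competing bid hi, resolves a tie with a coin flip between two bidders, and restricts all bids and valuations to a hypothetical grid of multiples of 1/20.

```python
from fractions import Fraction

def spa_payoff(bid, value, highest_rival_bid):
    """Bidder i's expected payoff in an SPA against the highest competing bid."""
    if bid > highest_rival_bid:
        return value - highest_rival_bid  # wins and pays the second-highest bid
    if bid == highest_rival_bid:
        return Fraction(1, 2) * (value - highest_rival_bid)  # two-way tie, coin flip
    return Fraction(0)  # loses the auction

def truthful_is_weakly_dominant(value, grid):
    """Check that bidding one's valuation is never worse than any other bid on the grid."""
    return all(
        spa_payoff(value, value, h) >= spa_payoff(b, value, h)
        for b in grid
        for h in grid
    )

grid = [Fraction(k, 20) for k in range(21)]
print(all(truthful_is_weakly_dominant(v, grid) for v in grid))  # True

# Case 2b: shading to 1/4 loses an auction that bidding v = 1/2 would have won.
print(spa_payoff(Fraction(1, 2), Fraction(1, 2), Fraction(2, 5)))  # 1/10
print(spa_payoff(Fraction(1, 4), Fraction(1, 2), Fraction(2, 5)))  # 0
```

The check never uses a distribution over hi, which mirrors the argument in the text: the comparison holds for each value of the highest competing bid separately.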
Remark: The equilibrium bidding strategy in the SPA is, first, unaffected by the number
of bidders who participate in the auction, N, or by their risk preferences. In
particular, our discussion considered the presence of N bidders, and an increase in their
number does not strengthen or weaken the incentives that every bidder has to submit
a bid that coincides with her own valuation, bi (vi ) = vi . Second, these results remain
when bidders evaluate their net payoff (e.g., vi − hi ) according to a concave utility
function, such as u(x) = x^α , exhibiting risk aversion. Specifically, for a given value of the
highest competing bid, hi , bidder i’s expected payoff from submitting a bid, bi (vi ) = vi ,
would still be weakly larger than when deviating to a bidding strategy above, bi (vi ) > vi ,
or below, bi (vi ) < vi , her true valuation for the object. Finally, our results are also
unaffected by how valuations for the object are distributed (e.g., following a uniform, normal,
or exponential distribution), as these arguments did not rely on the specific distribution of
valuations.
Self-assessment 15.2 Consider an SPA with N = 25 bidders. If your valuation for

the object is vi = $14, what is your optimal bidding strategy? What if your valuation
for the object increases to vi = $17? What if the number of bidders increases to N =
120? Interpret.
15.5 First-Price Auctions
15.5.1 Privately Observed Valuations

Before analyzing equilibrium bidding strategies in first-price auctions, note that auctions
are strategic scenarios where players must choose their strategies (i.e., how much to bid) in
an incomplete information context. In particular, every bidder knows her own valuation for
the object, vi , but does not observe other bidders’ valuations, such as vj . That is, bidder i is
“in the dark” about her opponents’ valuations.
Despite not observing j’s valuation, bidder i knows the probability distribution behind
bidder j’s valuation. For instance, vj can be relatively high (e.g., vj = $10, with probability
0.4), or low (e.g., vj = $5, with probability 0.6). More generally, bidder j’s valuation, vj , is
distributed according to a cumulative distribution function F(v) = prob(vj < v), intuitively
representing that the probability that vj lies below a certain cutoff v is exactly F(v). For
simplicity, we normally assume that every bidder’s valuation for the object is drawn from a
uniform distribution function between 0 and 1 (i.e., vj ∼ U[0, 1]), while the appendix in this
chapter extends our analysis to other cumulative distribution functions F(vi ).8
[Figure 15.3: Uniform probability distribution. The horizontal axis measures vj and the vertical axis the cumulative probability F(v); the distribution function F(v) = v is the 45-degree line. To the left of v, vj < v (bidder j’s valuation is lower than bidder i’s), with prob(vj < v) = F(v) = v; to the right of v, vj > v (bidder j’s valuation is higher than bidder i’s), with prob(vj > v) = 1 − F(v) = 1 − v.]
Figure 15.3 illustrates this uniform distribution, where the horizontal axis depicts vj and
the vertical axis measures the cumulated probability F(v). For instance, if bidder i’s valu-
ation is v, then all points on the left side of v on the horizontal axis represent that vj < v,
entailing that bidder j’s valuation is lower than that of bidder i. The mapping of these points
on the vertical axis gives us the probability prob(vj < v) = F(v) which, in the case of a uni-
form distribution, is F(v) = v. Similarly, the valuations to the right side of v describe points
where vj > v, and thus bidder j’s valuation is higher than that of bidder i. Mapping these
points on the vertical axis, we obtain the probability prob(vj > v) = 1 − F(v) which, under
a uniform distribution, implies 1 − F(v) = 1 − v.
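The probabilities read off figure 15.3 can be illustrated with a short simulation. This is a rough sketch: the function names, the number of draws, and the seed are arbitrary choices made for illustration, and the simulated frequency only approximates F(v) = v.

```python
import random

def uniform_cdf(v):
    """F(v) = prob(vj < v) when vj is uniformly distributed on [0, 1]."""
    return min(1.0, max(0.0, v))

def simulated_prob_below(v, draws=100_000, seed=0):
    """Share of simulated rival valuations vj that fall below v."""
    rng = random.Random(seed)
    return sum(rng.random() < v for _ in range(draws)) / draws

v = 0.6
print(uniform_cdf(v))           # prob(vj < v) = F(v) = v
print(1 - uniform_cdf(v))       # prob(vj > v) = 1 - F(v) = 1 - v
print(simulated_prob_below(v))  # close to 0.6, up to sampling noise
```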
15.5.2 Equilibrium Bidding in First-Price Auctions

Let’s start analyzing equilibrium bidding behavior in the first-price auction. First, note that
submitting a bid above one’s valuation, bi > vi , is a dominated strategy. To understand this
point, note that the bidder would obtain a negative payoff if she wins, and a zero payoff if she loses.
8. Note that this assumption does not imply that bidder j cannot assign a valuation vj larger than 1 to the object.
Instead, if her valuation vj lies on an interval [0, v̄], it can be divided by the upper bound v̄, which normalizes the interval to [0, 1].
[Figure 15.4: “Bid shading” in an FPA. The horizontal axis measures the valuation vi and the vertical axis the bid bi ; the bidding function bi (vi ) = a · vi lies below the 45-degree line (where bi = vi ), and the gap between the two is the bid shading.]
Specifically, her expected utility from participating in the auction is
$$EU_i(b_i \mid v_i) = \text{prob(win)} \times \underbrace{(v_i - b_i)}_{\text{Negative if } v_i < b_i} + \ \text{prob(lose)} \times 0,$$
which becomes negative when the bidder submits a bid above her valuation, vi < bi , regard-
less of her probability of winning. Note that in this expected utility, we specify that, upon
winning, bidder i receives a net payoff of vi − bi ; that is, the difference between her true
valuation for the object and the bid she submits (which ultimately constitutes the price she
pays for the good if she were to win).9 Similarly, submitting a bid bi that exactly coincides
with one’s valuation, bi = vi , also constitutes a dominated strategy because even if the bidder
happens to win, her expected utility would be zero; that is,
\[ EU_i(b_i \mid v_i) = \text{prob(win)} \times \underbrace{(v_i - b_i)}_{\text{Zero if } v_i = b_i}. \]
Therefore, the equilibrium bidding strategy in an FPA must imply a bid below one’s valu-
ation, bi < vi . That is, bidders must practice what is usually referred to as “bid shading.”
In particular, if bidder i’s valuation is vi , her bid must be a share of her true valuation
(i.e., bi (vi ) = a · vi , where a ∈ (0, 1)). Figure 15.4 illustrates bid shading because bidding
strategies lie below the 45-degree line (where vi = bi ).
A natural question at this point is: how intense must bid shading be in the first-price
auction? Or, alternatively, what is the precise value of the bid shading parameter a? To
answer such a question, we must first describe bidder i’s expected utility from submitting a
given bid x, when her valuation for the object is vi ,
EUi (x|vi ) = prob(win) × (vi − x) + prob(lose) × 0.
9. Upon losing, bidders do not obtain any object and, in this type of auction, they do not have to pay any monetary
amount, thus implying a zero payoff.
Figure 15.5
Recovering bidder i’s valuation. (Axes: valuation vi , horizontal; bid bi , vertical. A bid bi = x on the
vertical axis is mapped through the bidding function bi (vi ) = a · vi into the valuation x/a on the horizontal axis.)
Before continuing our analysis, we still must precisely characterize the probability of
winning in the expression, prob(win). Specifically, upon submitting a bid bi = x, bidder j
can anticipate that bidder i’s valuation is x/a, by just inverting the bidding function bi (vi ) =
x = a × vi (i.e., solving for vi in x = a × vi yields vi = x/a). This inference is illustrated in
figure 15.5, where bid x on the vertical axis is mapped into the bidding function a × vi ,
which corresponds to a valuation of x/a on the horizontal axis. Intuitively, for a bid x, bidder
j can use the symmetric bidding function a × vi to “recover” bidder i’s valuation, x/a, that
generated a bid of $x.
Hence, the probability of winning is given by prob(bi > bj ) and, according to the vertical
axis in figure 15.5, prob(bi > bj ) = prob(x > bj ). If, rather than describing probability
prob(x > bj ) from the point of view of bids (see shaded portion of the vertical axis in
figure 15.6), we characterize it from the point of view of valuations (in the shaded segment
of the horizontal axis), we obtain that prob(bi > bj ) = prob(x/a > vj ).
Indeed, the shaded set of valuations on the horizontal axis illustrates valuations of bidder
j, vj , for which her bid lies below player i’s bid, x. In contrast, valuations vj satisfying vj > x/a
entail that player j’s bids would exceed x, implying that bidder j wins the auction. Hence,
if the probability that bidder i wins the object is given by prob(x/a > vj ) and valuations are
uniformly distributed, we have that prob(x/a > vj ) = x/a.10
10. Recall that if random variable y is distributed according to a uniform distribution function U[0, 1], the
probability that the value of y lies below a certain cutoff c is exactly c (i.e., prob(y < c) = F(c) = c).
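As a quick numerical check of this probability, the following minimal Python sketch draws bidder j’s valuation from U[0, 1] and counts how often a bid x beats bidder j’s bid a × vj . The function name, the number of draws, the seed, and the values x = 0.3 and a = 0.5 are illustrative assumptions rather than part of the text. The simulated frequency should be close to x/a whenever x ≤ a.

```python
import random


def simulated_win_probability(x, a, draws=200_000, seed=0):
    """Share of simulated auctions in which a bid x beats the rival's bid a * v_j.

    Assumes the rival's valuation v_j is uniform on [0, 1] and that she uses
    the linear bidding function b_j(v_j) = a * v_j conjectured in the text.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = sum(1 for _ in range(draws) if x > a * rng.random())
    return wins / draws


if __name__ == "__main__":
    x, a = 0.3, 0.5  # illustrative bid and shading parameter
    print("simulated frequency:", simulated_win_probability(x, a))
    print("formula x / a:      ", x / a)
```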
Figure 15.6
Probability of winning in an FPA. (Axes: bidder j’s valuation vj , horizontal; bid bj , vertical. The bidding
function bj (vj ) = a · vj lies below the 45-degree line. Valuations vj below x/a yield bids bj < x, so
bidder i wins; valuations vj above x/a yield bids bj > x, so bidder i loses.)
We can now plug this probability of winning into bidder i’s expected utility from
submitting a bid of x in the FPA, as follows:
\[ EU_i(x \mid v_i) = \frac{x}{a}(v_i - x) = \frac{v_i x - x^2}{a}. \]
Taking first-order conditions with respect to bidder i’s bid, x, we obtain \( \frac{v_i - 2x}{a} = 0 \) which,
solving for x, yields bidder i’s optimal bidding function:
\[ x(v_i) = \frac{1}{2} v_i. \]
Intuitively, this bidding function informs bidder i how much to bid, as a function of her
privately observed valuation for the object, vi . For instance, when vi = $0.75, her optimal
bid becomes (1/2) × 0.75 = $0.375. This bidding function implies that, when competing against
another bidder j, and with only N = 2 players participating in the FPA, bidder i shades her
bid by half, as figure 15.7 illustrates.
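The first-order condition can also be checked numerically. The sketch below is a minimal illustration, assuming that the rival bids half of her valuation (a = 1/2) and that valuations are uniform on [0, 1]. It searches over a grid of bids for the one that maximizes bidder i’s expected utility. The function names and the grid size are hypothetical choices made for this example.

```python
def expected_utility(x, v, a):
    """Expected utility of bidding x with valuation v against a rival bidding a * v_j.

    With v_j uniform on [0, 1], prob(win) = prob(v_j < x / a) = x / a, capped at 1.
    """
    prob_win = min(max(x / a, 0.0), 1.0)
    return prob_win * (v - x)


def best_bid(v, a, grid_size=10_000):
    """Grid search over bids in [0, v] for the bid with the highest expected utility."""
    candidates = [v * k / grid_size for k in range(grid_size + 1)]
    return max(candidates, key=lambda x: expected_utility(x, v, a))


if __name__ == "__main__":
    for v in (0.25, 0.5, 0.75):
        # The numerical optimum should sit close to v / 2.
        print(f"v = {v}: best bid = {best_bid(v, a=0.5):.4f}, v / 2 = {v / 2:.4f}")
```

Because the best response to a rival who bids half of her valuation is to bid half of one’s own valuation, the conjectured shading parameter a = 1/2 is consistent with itself, which is what the equilibrium requires.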
Self-assessment 15.3 Consider a first-price auction with N = 2 bidders. If your

valuation for the object is vi = $14, what is your optimal bidding strategy? What if
your valuation increases to vi = $17? Interpret.
Figure 15.7
Optimal bidding function in an FPA with N = 2 bidders. (Axes: valuation vi , horizontal; bid bi , vertical.
The plotted line is bi (vi ) = vi /2.)
15.5.3 Extending the First-Price Auction to N Bidders

Our results are easily extended to FPA with N bidders. In particular, the probability of bidder
i winning the auction when submitting a bid of $x is
\[ \text{prob(win)} = \text{prob}\left(\frac{x}{a} > v_1\right) \cdots \text{prob}\left(\frac{x}{a} > v_{i-1}\right) \cdot \text{prob}\left(\frac{x}{a} > v_{i+1}\right) \cdots \text{prob}\left(\frac{x}{a} > v_N\right) = \frac{x}{a} \cdots \frac{x}{a} \cdot \frac{x}{a} \cdots \frac{x}{a} = \left(\frac{x}{a}\right)^{N-1}, \]
where we evaluate the probability that the valuation of all other N − 1 bidders, v1 , v2 ,…,
vi−1 , vi+1 ,…, vN (except for bidder i), lies below the valuation vi = x/a, which generates a bid
of exactly $x. Hence, bidder i’s expected utility from submitting x becomes
\[ EU_i(x \mid v_i) = \underbrace{\left(\frac{x}{a}\right)^{N-1}}_{\text{prob(win)}} (v_i - x). \]
To facilitate the differentiation with respect to bid x, the bidder’s expected utility can be
rewritten as follows: \( EU_i(x \mid v_i) = \frac{1}{a^{N-1}} \left[ x^{N-1} v_i - x^{N-1} x \right] \), which entails
\( \frac{1}{a^{N-1}} \left[ x^{N-1} v_i - x^{N} \right] \).
Taking first-order conditions with respect to her bid, x, we obtain
\[ \frac{1}{a^{N-1}} \left[ (N-1) x^{N-2} v_i - N x^{N-1} \right] = 0, \]
which is zero when the term in brackets is nil, \( (N-1) x^{N-2} v_i - N x^{N-1} = 0 \). Rearranging
this term, we obtain \( \frac{x^{N-1}}{x^{N-2}} = \frac{N-1}{N} v_i \). Recall that the left side, \( \frac{x^{N-1}}{x^{N-2}} \), can be rewritten as
\( x^{(N-1)-(N-2)} = x \), which helps us further simplify our results, finding that bidder i’s optimal
bidding function is
\[ x(v_i) = \frac{N-1}{N} v_i. \]
Figure 15.8
Optimal bidding function in an FPA increases in N. (Axes: valuation vi , horizontal; bid bi , vertical.
Plotted lines: v/2 for N = 2, 2v/3 for N = 3, 3v/4 for N = 4, and the 45-degree line v as N → ∞.)
Figure 15.8 depicts the bidding function for the case of N = 2, N = 3, and N = 4 bidders,
showing that bid shading is ameliorated when more bidders participate in the auction
(i.e., bidding functions approach the 45-degree line). Indeed, for N = 2, the optimal bidding
function is \( \frac{2-1}{2} v_i = \frac{1}{2} v_i \), but it increases to \( \frac{3-1}{3} v_i = \frac{2}{3} v_i \) when N = 3 bidders compete
for the object, to \( \frac{4-1}{4} v_i = \frac{3}{4} v_i \) when N = 4 players participate in the auction, etc. For an
extremely large number of bidders (e.g., N = 2,000), bidder i’s optimal bidding function
becomes \( b_i(v_i) = \frac{1{,}999}{2{,}000} v_i \approx v_i \) and, hence, bidder i’s bid almost coincides with her
valuation for the object, describing a bidding function that approaches the 45-degree
line.
Intuitively, if bidder i seeks to win the object, she can shade her bid when few bidders are
competing for the good. However, when several players are competing in the auction, the
probability that some of them have a high valuation for the object (and thus submit a high
bid) increases. That is, competition gets tougher as more bidders participate, and every
bidder responds by increasing her bid, which weakens her incentives to practice bid
shading.
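The effect of N on bidding can be tabulated with a short Python sketch. It evaluates the bidding function ((N − 1)/N) × vi and, as a cross-check, searches a grid of bids for the one that maximizes expected utility when every rival uses that same bidding function. The function names, the grid size, and the example valuation of 0.8 are assumptions made for illustration.

```python
def fpa_bid(v, n):
    """Bidding function ((N - 1) / N) * v for an FPA with n risk-neutral bidders."""
    if n < 2:
        raise ValueError("the auction needs at least two bidders")
    return (n - 1) / n * v


def expected_utility_n(x, v, n):
    """Expected utility of bid x when the n - 1 rivals bid a * v_j with a = (n - 1) / n."""
    a = (n - 1) / n
    prob_win = min(x / a, 1.0) ** (n - 1)  # (x / a)^(N - 1), capped at 1
    return prob_win * (v - x)


def best_bid_n(v, n, grid_size=10_000):
    """Grid search over bids in [0, v] for the bid with the highest expected utility."""
    candidates = [v * k / grid_size for k in range(grid_size + 1)]
    return max(candidates, key=lambda x: expected_utility_n(x, v, n))


if __name__ == "__main__":
    v = 0.8  # illustrative valuation
    for n in (2, 3, 4):
        print(f"N = {n}: formula = {fpa_bid(v, n):.4f}, grid search = {best_bid_n(v, n):.4f}")
    # With many bidders the bid approaches the valuation itself.
    print(f"N = 2000: formula = {fpa_bid(v, 2000):.4f}")
```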
Self-assessment 15.4 Consider a first-price auction with N = 25 bidders. If your

valuation for the object is vi = $14, what is your optimal bidding strategy? What if
the number of bidders increases to N = 120? Interpret.
15.5.4 First-Price Auctions with Risk-Averse Bidders

Let us next analyze how our equilibrium results would be affected if bidders are risk averse
(i.e., their utility function is concave in income, x); for example, \( u(x) = x^{\alpha} \), where 0 < α ≤ 1 denotes
bidder i’s risk-aversion parameter. In particular, when α = 1, she is risk neutral, while as
α decreases, she becomes more risk averse.11
Two bidders. First, note that the probability of winning is unaffected because for a symmetric
bidding function bi (vi ) = a · vi for every bidder i, where a ∈ (0, 1), the probability
that bidder i wins the auction against another bidder j is
\[ \text{prob}(b_i > b_j) = \text{prob}(x > b_j) = \text{prob}\left(\frac{x}{a} > v_j\right) = \frac{x}{a}. \]
Therefore, bidder i’s expected utility from participating in this auction is
\[ EU_i(x \mid v_i) = \frac{x}{a} \times (v_i - x)^{\alpha}, \]
where, relative to the case of risk-neutral bidders analyzed previously, the only difference
arises in the evaluation of the net payoff from winning, vi − x, which is now evaluated as
\( (v_i - x)^{\alpha} \). Taking first-order conditions with respect to her bid, x, we have
\[ \frac{1}{a}(v_i - x)^{\alpha} - \frac{x}{a}\,\alpha (v_i - x)^{\alpha - 1} = 0, \]
and solving for x, we find the optimal bidding function,
\[ x(v_i) = \frac{1}{1+\alpha} v_i. \]
This result includes the risk-neutral case analyzed earlier as a special case. Specifically,
when α = 1, bidder i’s optimal bidding function becomes \( x(v_i) = \frac{v_i}{2} \). However, when
her risk aversion increases (i.e., α decreases), bidder i’s optimal bidding function increases.
Specifically, \( \frac{\partial x(v_i)}{\partial \alpha} = -\frac{v_i}{(1+\alpha)^2} \), which is negative for all parameter values. In the extreme case
in which α decreases to α → 0, the optimal bidding function becomes x(vi ) = vi , and players
do not practice bid shading. Figure 15.9 illustrates the increase in players’ bidding function,
starting from \( \frac{v_i}{2} \) when bidders are risk neutral, α = 1, and approaching the 45-degree line
(no bid shading) as players become more risk averse.
Intuitively, a risk-averse bidder submits more aggressive bids than a risk-neutral bidder,
to minimize the probability of losing the auction. In particular, consider that bidder i reduces
her bid from bi to bi − ε. If she wins the auction, she obtains an additional profit of ε because
she has to pay a lower price for the object. However, by lowering her bid, she increases the
probability of losing the auction. Importantly, for a risk-averse bidder, the positive effect
of getting the object at a cheaper price is more than offset by the negative effect of increasing the
probability of losing the auction. This result connects with our discussion of risk aversion
in section 6.6.1 of chapter 6, where we said that, for a risk-averse individual, the disutility
she suffers from the downside of a lottery is larger than the utility she experiences from the
upside of a lottery. Overall, the risk-averse bidder does not have incentives to reduce her
bid, but rather to increase it, relative to a risk-neutral bidder.
11. This utility function is increasing in income because its first-order derivative, \( u'(x) = \alpha x^{\alpha-1} = \frac{\alpha}{x^{1-\alpha}} \), is
positive. In addition, it is concave in income because its second-order derivative, \( u''(x) = \alpha(\alpha-1)x^{\alpha-2} = -\frac{\alpha(1-\alpha)}{x^{2-\alpha}} \),
is negative, given that α satisfies 0 < α ≤ 1. However, when α = 1, the utility function becomes linear in income
because \( u'(x) \) simplifies to \( u'(x) = \frac{\alpha}{x^{1-1}} = \frac{\alpha}{1} = \alpha \), and \( u''(x) \) collapses to zero. In contrast, when α < 1, the function
is concave in income. A typical example of this utility function explored in chapter 6 was \( u(x) = x^{1/2} = \sqrt{x} \).
Figure 15.9
Optimal bidding function in an FPA with risk-averse bidders. (Axes: valuation vi , horizontal; bid bi , vertical.
Plotted lines: v/2 for α = 1, v/1.5 for α = 0.5, v/1.2 for α = 0.2, and the 45-degree line v for α = 0.)
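N ≥ 2 bidders. These results can be easily extended to scenarios with more bidders. From
section 15.5.3, we know that the probability that bidder i wins the auction is
\[ \text{prob(win)} = \text{prob}\left(\frac{x}{a} > v_1\right) \cdots \text{prob}\left(\frac{x}{a} > v_{i-1}\right) \cdot \text{prob}\left(\frac{x}{a} > v_{i+1}\right) \cdots \text{prob}\left(\frac{x}{a} > v_N\right) = \frac{x}{a} \cdots \frac{x}{a} \cdot \frac{x}{a} \cdots \frac{x}{a} = \left(\frac{x}{a}\right)^{N-1}. \]
Therefore, bidder i’s expected utility from participating in this auction is
\[ EU_i(x \mid v_i) = \left(\frac{x}{a}\right)^{N-1} \times (v_i - x)^{\alpha}. \]
Differentiating with respect to bidder i’s bid, x, we obtain
\[ (N-1)\left(\frac{x}{a}\right)^{N-2} \frac{1}{a} (v_i - x)^{\alpha} - \left(\frac{x}{a}\right)^{N-1} \alpha (v_i - x)^{\alpha-1} = 0, \]
which, after factoring out the common terms, becomes
\[ \frac{x^{N-2}}{a^{N-1}} (v_i - x)^{\alpha-1} \left[ (N-1) v_i - (N-1+\alpha) x \right] = 0. \]
Solving for x, we obtain the equilibrium bidding function
\[ x(v_i) = \frac{N-1}{N-1+\alpha} v_i. \]
When N = 2 bidders compete for the object, this bidding function becomes
\( x(v_i) = \frac{2-1}{2-1+\alpha} v_i = \frac{1}{1+\alpha} v_i \), thus coinciding with the expression found for two bidders. However, when N = 3
bidders compete, the bidding function increases to \( x(v_i) = \frac{3-1}{3-1+\alpha} v_i = \frac{2}{2+\alpha} v_i \), where \( \frac{2}{2+\alpha} > \frac{1}{1+\alpha} \)
for every α > 0. More generally, we can show that the bidding function x(vi ) increases in N because
the derivative \( \frac{\partial x(v_i)}{\partial N} = \frac{\alpha v_i}{(N-1+\alpha)^2} \) is positive, indicating that, as N increases, bidders become
more aggressive.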
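The bidding function with risk aversion can be explored numerically as well. The following Python sketch is a minimal illustration under the assumptions of this section (valuations uniform on [0, 1], utility u(x) = x^α with 0 < α ≤ 1, and rivals using the same linear bidding function). The function names, the grid size, and the parameter values are hypothetical choices for this example.

```python
def fpa_bid_risk_averse(v, n, alpha):
    """Bidding function ((N - 1) / (N - 1 + alpha)) * v for n bidders with u(x) = x ** alpha."""
    if n < 2 or not 0 < alpha <= 1:
        raise ValueError("need n >= 2 and 0 < alpha <= 1")
    return (n - 1) / (n - 1 + alpha) * v


def expected_utility_ra(x, v, n, alpha):
    """Expected utility of bid x when rivals bid a * v_j with a = (n - 1) / (n - 1 + alpha)."""
    a = (n - 1) / (n - 1 + alpha)
    prob_win = min(x / a, 1.0) ** (n - 1)
    # max(..., 0.0) guards against tiny negative values from floating-point rounding.
    return prob_win * max(v - x, 0.0) ** alpha


def best_bid_ra(v, n, alpha, grid_size=10_000):
    """Grid search over bids in [0, v] for the bid with the highest expected utility."""
    candidates = [v * k / grid_size for k in range(grid_size + 1)]
    return max(candidates, key=lambda x: expected_utility_ra(x, v, n, alpha))


if __name__ == "__main__":
    v, n = 0.9, 2  # illustrative valuation and number of bidders
    for alpha in (1.0, 0.5, 0.2):
        formula = fpa_bid_risk_averse(v, n, alpha)
        numeric = best_bid_ra(v, n, alpha)
        print(f"alpha = {alpha}: formula = {formula:.4f}, grid search = {numeric:.4f}")
```

Lower values of α (more risk aversion) should produce bids closer to the valuation, in line with figure 15.9.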
Self-assessment 15.5 Consider an FPA with N = 2 bidders. If your valuation for
the object is vi = $14 and your utility function is \( u(x) = x^{1/3} \), what is your optimal
bidding strategy? What if your utility function is \( u(x) = x^{1/10} \)? Interpret.
15.6 Efficiency in Auctions
Auctions, and generally allocation mechanisms, are characterized as efficient if the bidder
(or agent) with the highest valuation for the object is indeed the person receiving the object.
Intuitively, if this property does not hold, the outcome of the auction (i.e., the allocation of
the object) would open the door to negotiations and arbitrage among the winning bidder—
who, despite obtaining the object, may not be the player who assigned the highest value to
it—and bidders with higher valuations who would like to buy the object from her. In other
words, the auction’s outcome would still allow negotiations that are beneficial for all parties
involved, thus suggesting that the initial allocation was not efficient.
According to this criterion, both the FPA and the SPA are efficient because the bidder
with the highest valuation submits the highest bid, and the object is ultimately assigned to
the player who submits the highest bid.
Other auction formats, such as the Chinese (or lottery) auction described in section 15.3,
are not necessarily efficient because they may assign the object to an individual who does
not have the highest valuation for the object. In particular, recall that the probability of
winning the object in this auction is a ratio of the bid that you submit relative to the sum
of all players’ bids. Hence, a bidder with a low valuation for the object, and who submits
the lowest bid (e.g., $1), can still win the auction. Alternatively, the person who assigns
the highest value to the object, submitting the highest bid, might not end up receiving the
object. Therefore, for an auction to satisfy efficiency, bids must be increasing in a player’s
valuation, and the probability of winning the auction must be 1 (100 percent) if a bidder
submits the highest bid.
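The contrast between the two allocation rules can be illustrated with a small simulation. The Python sketch below compares how often the bidder with the highest valuation receives the object in an FPA and in a lottery auction. It rests on several assumptions that are not in the text: valuations are drawn from U[0, 1], FPA bids follow ((N − 1)/N) × vi , and lottery bids are simply set to half of each valuation (a hypothetical rule chosen for illustration, not an equilibrium strategy).

```python
import random


def share_won_by_top_valuation(n_bidders=3, auctions=20_000, seed=1):
    """Return the share of auctions won by the highest-valuation bidder.

    The first element refers to the FPA (highest bid wins); the second refers
    to a lottery auction (win probability proportional to one's bid).
    """
    rng = random.Random(seed)
    bidders = list(range(n_bidders))
    fpa_hits = 0
    lottery_hits = 0
    for _ in range(auctions):
        valuations = [rng.random() for _ in bidders]
        top = max(bidders, key=lambda i: valuations[i])

        # FPA: bids are increasing in valuations and the highest bid wins.
        fpa_bids = [(n_bidders - 1) / n_bidders * v for v in valuations]
        fpa_winner = max(bidders, key=lambda i: fpa_bids[i])

        # Lottery: hypothetical bids of v / 2; winner drawn with probability b_i / sum of bids.
        lottery_bids = [0.5 * v for v in valuations]
        lottery_winner = rng.choices(bidders, weights=lottery_bids)[0]

        fpa_hits += fpa_winner == top
        lottery_hits += lottery_winner == top
    return fpa_hits / auctions, lottery_hits / auctions


if __name__ == "__main__":
    fpa_share, lottery_share = share_won_by_top_valuation()
    print(f"FPA:     top-valuation bidder wins in {fpa_share:.1%} of auctions")
    print(f"Lottery: top-valuation bidder wins in {lottery_share:.1%} of auctions")
```

Under these assumptions, the FPA assigns the object to the highest-valuation bidder in essentially every simulated auction, while the lottery does so only part of the time.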
15.7 Common-Value Auctions
The auction formats considered here assume that each bidder privately observes her own val-
uation for the object. Her valuation was drawn from a distribution function (e.g., a uniform
distribution), implying that two bidders are unlikely to assign the same value to the object
for sale. However, in some auctions, such as the government sale of oil leases, bidders (oil
companies) might assign the same monetary value to the object (common value) (i.e., the
profits they would obtain from exploiting the oil reservoir). Bidders are, nonetheless, unable
to precisely observe the value for the object.
In the oil lease example, firms cannot accurately observe the exact volume of oil in the
reservoir, or how difficult it will be to extract, but they can accumulate different estimates
from their own engineers, or from other consulting companies, that inform them about the
potential profits to be made from the oil lease. Such estimates are nonetheless imprecise, and
they allow the firm to assign a value to the object (profits from the oil lease) only within a
relatively narrow range, such as v ∈ {10, 11, … , 20}, in millions of dollars. In other words, the
value in profits that all firms assign to the oil lease is common, which explains why we refer
to this type of auction as “common-value auctions.” The estimate ei that each firm i receives
about this common value is potentially different, however, with some firms receiving an
upward-biased estimate, ei > v, and others receiving a downward-biased estimate, ei < v.
In this type of auction, shading your bid—which, at first glance, could be regarded as a
conservative strategy—can lead you to win the auction, but at a loss! To understand this
point, consider two bidders A and B, each receiving an estimate eA and eB , where eA > v >
eB . Bidder A’s estimate is then biased upward relative to the true value of the oil lease, v,
while bidder B’s estimate is biased downward. (A similar argument applies if we start from
the opposite ranking of estimates.) If every bidder submits a bid that shades her estimate by
$1 million, we would have
\[ b_A = e_A - 1 \quad \text{and} \quad b_B = e_B - 1, \quad \text{where } b_A > b_B. \]
Therefore, bidder A submits a more aggressive bid than B does (bA > bB ) because the former
received a higher estimate than the latter (eA > eB ). Bidder A is then the winner of the
auction, but her payoff could be negative! This occurs if her margin after paying bid bA ,
\[ v - b_A = v - \underbrace{(e_A - 1)}_{b_A}, \]
is negative, which, solving for eA , yields v + 1 < eA . Intuitively, if bidder A’s estimate,
eA , exceeds the true value of the oil lease by more than $1 million, shading her bid by $1
million still leads this bidder to win the auction while paying more than the object is
worth.
This is the so-called winner’s curse in common-value auctions: winning the auction means
that the winning bidder probably received an overestimated signal of the true value of the
object for sale, as bidder A did in this example. Therefore, to avoid the winner’s curse, participants
in common-value auctions must significantly shade their bid to account for the possibility
that the estimates they receive are above the true value of the object.12
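A short simulation helps to see how often the winner’s curse arises. The Python sketch below assumes a common value of $15 million, estimates equal to that value plus noise drawn uniformly from [−3, 3], and bids that shade each estimate by $1 million, as in the example. The noise distribution, the parameter values, and the function name are assumptions made for this illustration.

```python
import random


def simulate_winners_curse(n_bidders, true_value=15.0, noise=3.0, shade=1.0,
                           auctions=20_000, seed=2):
    """Simulate common-value FPAs in which every bidder bids her estimate minus `shade`.

    Returns (share of auctions where the winner loses money, winner's average payoff).
    All amounts are in millions of dollars.
    """
    rng = random.Random(seed)
    losses = 0
    total_payoff = 0.0
    for _ in range(auctions):
        # Each bidder observes the common value plus an independent estimation error.
        estimates = [true_value + rng.uniform(-noise, noise) for _ in range(n_bidders)]
        winning_bid = max(estimates) - shade  # the highest estimate yields the highest bid
        payoff = true_value - winning_bid
        total_payoff += payoff
        if payoff < 0:
            losses += 1
    return losses / auctions, total_payoff / auctions


if __name__ == "__main__":
    for n in (2, 5, 10):
        loss_share, avg_payoff = simulate_winners_curse(n)
        print(f"N = {n}: winner loses money in {loss_share:.1%} of auctions, "
              f"average payoff = {avg_payoff:.2f}")
```

In this sketch, the winner’s average payoff falls as more bidders participate, because the highest of many estimates is more likely to overstate the true value. This suggests that a fixed amount of shading becomes less adequate as N grows.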
The winner’s curse in the classroom. Despite the straightforward intuition behind this
result, the winner’s curse has been empirically observed in several scenarios. A common
example is that of subjects in an experimental lab, where they are asked to submit bids in
a common-value auction where a jar of nickels is being sold. Consider that your instructor
shows up in class with a glass jar full of nickels. The monetary value that you assign to the
jar (value of the coins inside the jar) coincides with that of your classmates, but none of
you can accurately estimate the number of nickels in the jar because you can only gather
some imprecise information about its true value by looking at it for a few seconds. In these
experiments, it is usual to find that the winner ends up submitting a bid above the jar’s true
value (i.e., the winner’s curse emerges).13
15.8 A Look at Behavioral Economics—Experiments with Auctions
Several controlled experiments have been developed to test whether individuals bid accord-
ing to the optimal bidding function discussed in this chapter. Generally, an experimental
session starts by randomly distributing to every individual her valuation for the object prior
to each auction period, where their valuations are typically drawn from a uniform distribu-
tion. In each period, the bidder submitting the highest bid earns a profit equal to her value
minus the auction price (either her own bid in an FPA, or the second-highest bid in an SPA),
while other bidders earn zero profit.
Overall, most studies indicate that individuals tend to bid more aggressively than what
would be expected according to the bidding function bi (vi ) found in previous sections of
this chapter, although this bid difference is partly reduced when we consider their risk aversion.
However, the comparative statics remain: individuals tend to bid more aggressively
when competing against more bidders, when their valuation of the object is higher, and
when they are more risk averse. For an excellent review of the literature, see Kagel and
Levin (2014).
12. It can be formally shown that, in the case of N = 2 bidders who receive independent signals, the optimal
bidding function is \( b_i(v_i) = \frac{1}{2} e_i \), where ei denotes the signal that bidder i receives. Intuitively, every bidder needs
to “shade her signal” by submitting a bid of exactly half of the signal she received. More generally, for N bidders,
bidder i’s optimal bid becomes \( b_i(v_i) = \frac{(N+2)(N-1)}{2N^2} e_i \). For more details, see Harrington (2014), pp. 400–02.
13. For some experimental evidence on the winner’s curse see, for instance, Thaler (1988).
Appendix. First-Price Auctions in More General Settings
Section 15.5 analyzes equilibrium bidding in first-price auctions assuming that every bidder
i’s valuation is drawn from a uniform distribution—that is, vi ∼ U[0, 1]. In this appendix,
we extend our analysis to allow for valuations to be drawn from more general cumulative
distribution functions, F(vi ), with positive density in all its support—that is, f (vi ) > 0 for
all vi ∈ [0, 1]. In the case of uniformly distributed valuations, F(vi ) = vi and f (vi ) = 1, but
now we consider a general function F(vi ).
Writing expected utility. We can then write bidder i’s expected utility maximization
problem (UMP) as follows:
\[ \max_{b_i \geq 0} \ \text{prob(win)} \, (v_i - b_i), \]
which denotes the probability of winning the object times bidder i’s net payoff from win-
ning, vi − bi , because she values the object at vi and pays her bid bi for it.
At this point, we need to write the probability of winning, prob(win), as a function of
bidder i’s bid, bi . Bidder i wins the auction when her bid exceeds that of bidder j, bj < bi ,
which, when all bidders use the same strictly increasing bidding function, is equivalent to
her valuation exceeding that of bidder j, vj < vi . We can express this probability as
prob(vj < vi ) = F(vi ).
Therefore, when bidder i faces N − 1 rivals, her probability of winning the auction
is the probability that her valuation exceeds that of all other N − 1 bidders. Because
valuations are independently distributed, we can write this probability as the following
product:
\[ \text{prob}(v_j < v_i) \times \text{prob}(v_k < v_i) \times \cdots \times \text{prob}(v_l < v_i) = \underbrace{F(v_i) \times F(v_i) \times \cdots \times F(v_i)}_{N-1 \text{ times}} = F(v_i)^{N-1}, \]
where bidders j ≠ k ≠ l represent i’s rivals. As a result, we can express the expected UMP
as follows:
\[ \max_{b_i \geq 0} \ F(v_i)^{N-1} (v_i - b_i). \]
Using this bidding function, we can write bi (vi ) = xi , where xi ∈ R+ represents bidder
i’s bid when her valuation is vi , as in section 15.5. Applying the inverse \( b_i^{-1}(\cdot) \) on both
sides yields \( v_i = b_i^{-1}(x_i) \), which helps us rewrite the probability of winning, \( F(v_i)^{N-1} \), as
\( F(b_i^{-1}(x_i))^{N-1} \), so this problem becomes
\[ \max_{x_i \geq 0} \ F(b_i^{-1}(x_i))^{N-1} (v_i - x_i), \]
where we expressed the bid as xi in the last term because bi (vi ) = xi .

Finding equilibrium bids. Now that the probability of winning is written as a function of
bidder i’s bid, xi , we are ready to differentiate with respect to xi to find the equilibrium
bidding function that players use in the FPA. Differentiating with respect to xi yields
\[ -F(b_i^{-1}(x_i))^{N-1} + (N-1) F(b_i^{-1}(x_i))^{N-2} f(b_i^{-1}(x_i)) \frac{\partial b_i^{-1}(x_i)}{\partial x_i} (v_i - x_i) = 0. \]
Because \( b_i^{-1}(x_i) = v_i \) and \( \frac{\partial b_i^{-1}(x_i)}{\partial x_i} = \frac{1}{b_i'(b_i^{-1}(x_i))} \), this expression simplifies to
\[ -F(v_i)^{N-1} + (N-1) F(v_i)^{N-2} f(v_i) \frac{1}{b_i'(v_i)} (v_i - x_i) = 0. \]
Rearranging this further, we obtain
\[ (N-1) F(v_i)^{N-2} f(v_i) v_i - (N-1) F(v_i)^{N-2} f(v_i) x_i = F(v_i)^{N-1} b_i'(v_i) \]
or
\[ F(v_i)^{N-1} b_i'(v_i) + (N-1) F(v_i)^{N-2} f(v_i) x_i = (N-1) F(v_i)^{N-2} f(v_i) v_i. \]
Because xi = bi (vi ), the left side is \( \frac{\partial \left[ F(v_i)^{N-1} b_i(v_i) \right]}{\partial v_i} \), which helps us write this expression as
\[ \frac{\partial \left[ F(v_i)^{N-1} b_i(v_i) \right]}{\partial v_i} = (N-1) F(v_i)^{N-2} f(v_i) v_i. \]
Integrating both sides yields
\[ F(v_i)^{N-1} b_i(v_i) = \int_0^{v_i} (N-1) F(v_i)^{N-2} f(v_i) v_i \, dv_i. \]
Applying integration by parts on the right side,14 we find
\[ \int_0^{v_i} (N-1) F(v_i)^{N-2} f(v_i) v_i \, dv_i = F(v_i)^{N-1} v_i - \int_0^{v_i} F(v_i)^{N-1} \, dv_i, \]
so we can write our above first-order condition as
\[ F(v_i)^{N-1} b_i(v_i) = F(v_i)^{N-1} v_i - \int_0^{v_i} F(v_i)^{N-1} \, dv_i. \]
14. Recall that, when integrating by parts, we consider two functions, g(x) and h(x), such that \( (gh)' = g'h + gh' \).
Integrating both sides yields \( g(x)h(x) = \int g'(x)h(x)\,dx + \int g(x)h'(x)\,dx \). Reordering this expression, we find
\( \int g'(x)h(x)\,dx = g(x)h(x) - \int g(x)h'(x)\,dx \). At this point, we can apply integration by parts in our auction setting
by defining \( g'(x) \equiv (N-1) F(v_i)^{N-2} f(v_i) \) and \( h(x) \equiv v_i \), so we obtain the result given in the text.
We can now solve for the equilibrium bidding function bi (vi ) that we seek to find. Dividing
both sides by \( F(v_i)^{N-1} \) yields
\[ b_i(v_i) = v_i - \underbrace{\frac{\int_0^{v_i} F(v_i)^{N-1} \, dv_i}{F(v_i)^{N-1}}}_{\text{bid shading}}. \]
Intuitively, bidder i submits a bid equal to her valuation for the object, vi , minus an amount
captured by the second term in this expression, which is referred to as her “bid shading.”
We can then claim that the bidding function bi (vi ) constitutes the BNE of the FPA when
bidders’ valuations are distributed according to F(vi ).
Uniformly distributed valuations. Consider, for instance, the case in which individual valuations
are uniformly distributed, F(vi ) = vi . In this scenario, we obtain \( F(v_i)^{N-1} = v_i^{N-1} \) and
\( \int_0^{v_i} F(v_i)^{N-1} \, dv_i = \frac{1}{N} v_i^{N} \), producing a bidding function of
\[ b_i(v_i) = v_i - \frac{\frac{1}{N} v_i^{N}}{v_i^{N-1}} = v_i - \frac{v_i^{N}}{N v_i^{N-1}} = \frac{N-1}{N} v_i, \]
which coincides with the result in section 15.5.3. When only two bidders compete for the
object, N = 2, this bidding function simplifies to \( b_i(v_i) = \frac{v_i}{2} \), which coincides with the result
in section 15.5.2. When N = 3, equilibrium bids increase to \( b_i(v_i) = \frac{2 v_i}{3} \), and a similar result
occurs when N = 4 bidders compete for the object, where \( b_i(v_i) = \frac{3 v_i}{4} \).15
Informally, as more bidders participate in the auction, every bidder i submits more aggres-
sive bids because she faces a higher probability that another bidder j has a higher valuation
for the object than she has.
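The general bidding function can be evaluated numerically for any distribution F. The Python sketch below approximates the integral in the bid-shading term with the midpoint rule and then applies the formula for bi (vi ). The function name, the number of integration steps, and the second example distribution, F(v) = v², are assumptions introduced for illustration. For the uniform case, the output should match ((N − 1)/N) × vi .

```python
def equilibrium_bid(v, n, cdf, steps=10_000):
    """Evaluate b(v) = v - (integral from 0 to v of F(s)^(n-1) ds) / F(v)^(n-1).

    `cdf` is the cumulative distribution function F of valuations on [0, 1];
    the integral is approximated with the midpoint rule on `steps` subintervals.
    """
    if v <= 0:
        return 0.0  # a bidder with the lowest valuation bids zero
    width = v / steps
    integral = sum(cdf((k + 0.5) * width) ** (n - 1) for k in range(steps)) * width
    return v - integral / cdf(v) ** (n - 1)


if __name__ == "__main__":
    v = 0.6  # illustrative valuation

    # Uniform valuations, F(v) = v: the bid should be close to ((N - 1) / N) * v.
    for n in (2, 3, 4):
        print(f"uniform, N = {n}: bid = {equilibrium_bid(v, n, lambda s: s):.4f}, "
              f"formula = {(n - 1) / n * v:.4f}")

    # A hypothetical alternative, F(v) = v^2, which puts more weight on high valuations.
    for n in (2, 3, 4):
        print(f"F(v) = v^2, N = {n}: bid = {equilibrium_bid(v, n, lambda s: s * s):.4f}")
```

With F(v) = v², rivals are more likely to hold high valuations, so the computed bids lie above those of the uniform case for the same N.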
15. More generally, the derivative of bidding function \( b_i(v_i) = \frac{N-1}{N} v_i \) with respect to the number of bidders N
yields \( \frac{\partial b_i(v_i)}{\partial N} = \frac{1}{N^2} v_i \), which is clearly positive.
Exercises
1. Pareto uncertainty–I.A Two firms are considering the adoption of a new technology that would be
mutually beneficial if they both chose to implement it. In the case where only one firm adopted the
technology, however, the results could be unpredictable. Firm 1, however, has insider information
about whether the technology is useful (with a payoff of 6) or useless (with a payoff of 0) if firm
2 does not adopt the new technology. Firm 2 does not have this information, but it knows that
the technology is useful with probability 0.5, and useless otherwise. The payoffs for both firms are
depicted in the following normal form games (in each cell, firm 1’s payoff is listed first):

Useless technology
                    Firm 2
                 New      Old
Firm 1   New    8, 8     0, 0
         Old    0, 0     4, 4

Useful technology
                    Firm 2
                 New      Old
Firm 1   New    8, 8     6, 0
         Old    0, 6     4, 4
(a) Find the best responses of the privately informed player, firm 1, which is type-dependent.
(b) Find the best response of the uninformed player, firm 2.
(c) Identify the BNE of the game and interpret your results.
2. Pareto uncertainty–II.B Consider the situation in exercise 15.1, but now assume that the pro-
bability that the technology is useless on its own is p, where p takes some value between 0
and 1.
(a) Find the best responses of the privately informed player, firm 1, which is type-dependent.
(b) Find the best response of the uninformed player, firm 2.
(c) Identify the BNE of the game and interpret your results.
3. Stackelberg leader facing uncertain costs.A Consider the situation in example 15.1, but suppose
that firm 1 acts as a Stackelberg leader. Find the BNE of this duopoly game.
4. All firms facing uncertain costs.B Consider the situation in example 15.1, but now firm 2 cannot
observe firm 1’s costs. Firm 1 has low costs, MC1 = 0, with probability 0.5, and high costs, MC1 =
1/4, with probability 0.5. Firm 1 is able to observe its own costs. Find the BNE of this duopoly game.
5. Uncertain demand—one uninformed firm.B Consider a duopoly game where two firms compete
on the basis of quantities and face inverse demand function p = a − q1 − q2 . Assume that firm 1
is an incumbent and understands that the size of the market is a = 100. Firm 2, the entrant, is
unable to accurately observe the size of the market and instead knows that it is low, a = 80, with
probability 0.5, and high, a = 100, with probability 0.5. Assume that marginal costs of production
for both firms are 0. Find the BNE of this duopoly game.
6. Uncertain demand—two uninformed firms.A Consider the situation in exercise 15.5, but now
neither firm is privately informed about the size of the market. Both firms know that the market
size is either low, a = 80, with probability 0.5, or high, a = 100, with probability 0.5. Find the BNE
of this duopoly game.
7. Entry deterrence.B Consider the Entry game presented in example 13.1 in chapter 13, but sup-
pose that the incumbent had private information about whether she was crazy or not. A noncrazy
incumbent has a game tree exactly as depicted in example 13.1, but a crazy incumbent loves to
engage in price wars, and receives a payoff of 6 from doing so. Suppose that the entrant was aware
that the probability of an incumbent being crazy is p = 0.1.
(a) Find the BNE of this game. Does the entrant still enter this market?
(b) For what value of p is the entrant indifferent between entering or staying out of this market?
8. First-price auction–I.A Consider an auction with two participants, each of them with the
following (privately observed) valuation of the object for sale: Person A ($50), Person B ($60).
(a) If the seller organizes an FPA, who will be the winner? What will be her winning bid? What
price will she pay for the object?
(b) Suppose now that Person A was able to observe Person B’s private valuation prior to the
auction. Would Person A change her bid? If so, how? If not, why not?
9. First-price auction–II.A Consider the situation in exercise 15.8, but suppose that Person A’s
valuation is only $25.
(b) Suppose that Person A were able to observe Person B’s private valuation prior to the auction.
Would Person A change her bid? If so, how? If not, why not?
10. First-price auction–III.B Consider an FPA with N = 10 bidders.
(a) If your valuation for the object is vi = $200, what is your optimal bidding strategy?
(b) Suppose that you received information that the valuations of all bidders ranges from a low
of $150 to a high of $210. Assuming that no other bidder has this information, would your
bidding strategy change? If so, how?
11. Risk aversion–I.B Consider the situation in exercise 15.8, but suppose that both bidders have
utility function \( u(x) = x^{0.4} \).
(c) For what degree of risk aversion (α) would Person A not want to change her bid in part (b)?
12. Risk aversion–II.A After losing an auction to a sole rival bidder for a bid of bj = $100, you later
learn that her valuation for the object was vj = $250. Based on this information, what is bidder j’s
attitude toward risk?
13. Second-price auction–I.A Consider the situation in exercise 15.8.
(a) If the seller organizes a second-price auction, who will be the winner? What will be her
winning bid? What price will she pay for the object?
14. Second-price auction–II.B Consider an auction with five participants, each of them with the
following (privately observed) valuations of the object for sale: Person A ($10), Person B ($6),
Person C ($45), Person D ($81), and Person E ($62).
(a) If the seller organizes an SPA, who will be the winner? What will be her winning bid? What
(b) Suppose that bidders can observe each other’s valuations, but the seller cannot. The seller,
however, only knows that bidders’ valuations are in the range {0, 1, … , $90}. If players
participate in an SPA, who will be the winner? What is her winning bid?
15. Lottery allocation–I.B Consider a situation where a public all-pay auction takes place for an item,
with its allocation determined by a lottery. Each bidder is able to observe all the bids of all the
other bidders as they make their own. The probability that bidder i wins the auction is
\( p = \frac{b_i}{b_i + B_{-i}} \), where B−i denotes the total bids made by all other bidders. Suppose that bidder i has a valuation
of vi = 9 for this item, and he knows that bids totaling B−i = $4 have already been submitted.
Find the optimal bidding strategy for bidder i, bi , taking into consideration that he must pay his
bid regardless of whether he wins the auction.
16. Lottery allocation–II.B Consider the scenario in exercise 15.15. Answer the following questions:
(a) If the seller organizes an FPA, which is bidder i’s equilibrium bid? Who wins the object?
(b) If the seller organizes an SPA, which is bidder i’s equilibrium bid? Who wins the object?
17. All-pay auction.C Consider the following all-pay auction, with two bidders privately observing
their valuations for the object. Valuations are uniformly distributed, \( v_i \sim U[0, 1] \). The player
submitting the highest bid wins the object, but all players must pay the bid they submitted. Find
the optimal bidding strategy, taking into account that it takes the form \( b_i(v_i) = m \times v_i^2 \), where m
denotes a positive constant.
18. Third-price auction.B Consider a third-price auction, where the winner is the bidder who submits
the highest bid, but she only pays the third-highest bid. Assume that you compete against two other
bidders, whose valuations you are unable to observe, and that your valuation for the object is $10.
Show that bidding above your valuation (with a bid of, for instance, $15) can be a best response
to the other bidders’ bids, while submitting a bid that coincides with your valuation ($10) might
not be a best response to their bids.
19. Efficiency with risk aversion.B Consider a situation where bidders with heterogeneous attitudes
toward risk compete in an FPA. Provide an example of how these bidders can lead to an inefficient
allocation of the object.
20. Bid shading.A On your way back from an SPA, an inexperienced colleague of yours informs
you that she had been advised by a veteran competitor that she should shade her bid by bidding
only 90 percent of her valuation for the object. Did your colleague act optimally? Why would her
competitor give her this advice?
21. Comparing auctions.A Compare and contrast the similarities and differences between an FPA
where all bidders can observe everyone’s private valuations and an SPA.
16 Contract Theory
16.1 Introduction
This chapter covers another scenario where inefficiencies may exist: contracting under
asymmetric information. In these contexts, one agent has different information from another
agent, such as when the employee observes how much effort she exerts on a task while the
employer does not or, in the other direction, when the employer knows about an indus-
try’s characteristics more than a candidate who seeks to work for it. As we show in this
chapter, asymmetric information leads to lower aggregate payoffs than when all individu-
als are perfectly informed. In other words, total surplus is suboptimal due to asymmetric
information.
Specifically, two common problems arise when agents interact under asymmetric information.
First, “moral hazard” problems may exist when one of the parties (e.g., the employer)
cannot observe the actions of the other party (e.g., the employee). These scenarios are, therefore,
also known as “hidden action” because one party does not get to observe the action
chosen by the other party. Intuitively, the firm manager’s lack of information about a worker’s
effort on the job could lead the worker to slack off if she knew that her job security, salary, or
chances of promotion were unaffected. Of course, firms understand the asymmetric information
situation in which they operate, anticipate the moral hazard problem that flat contracts
may generate, and respond by designing contracts in which a worker’s salary, promotion, or
job security increases based on her performance.1
1. While performance may be an imperfect predictor of the worker’s effort, a firm can accumulate several
observations about her performance after weeks on the job, and can even compare her performance relative to that
of other workers, which ultimately helps the firm infer more accurately the worker’s effort from the observed
performance.
As we discuss in this chapter, firms can design “high-powered” contracts to ameliorate
moral hazard problems. In these contracts, the worker receives a relatively low salary when
her performance is low (e.g., when a worker doing manual labor in a factory produces
few units, or when a marketing executive secures few sales), but earns more money (e.g.,
a bonus) when her performance exceeds a certain level. While the contract differs from one
in which both employer and employee are symmetrically informed, the worker now has the
incentive to exert more effort than under a flat contract paying her the same salary regardless
of her performance. As a result, expected profits are higher than under flat contracts.
The second type of problem that often emerges in scenarios of asymmetric information is
“adverse selection.” In this case, the uninformed party observes the actions of the other party,
but she does not observe a piece of private information, such as a manager not observing
a job candidate’s productivity while interviewing her for the job. This explains why these
scenarios are also known as “hidden information,” to emphasize that one party has access
to a piece of information (e.g., productivity) that the other party does not.2
We then explore two common scenarios in which adverse selection problems arise, and
how firms can design contracts which, despite being imperfect, help ameliorate this problem
and increase expected profits. The first example we consider is the used-cars market.
In this market, the seller is privately informed about a car’s quality (high-quality cars are
known as “peaches,” and low-quality cars as “lemons”), while the buyer can obtain only
a rough estimate of it during a test drive or a short visual inspection. If the buyer were
as informed about the car’s quality as the seller, all types of cars (peaches and lemons)
would be sold at a market-clearing price (a high price for peaches, given their high quality;
and a lower price for lemons, given their low quality). When the buyer is uninformed,
however, we demonstrate that high-quality cars go unsold. Intuitively, the buyer forms
an expectation about the average car quality and purchases a car only if the seller asks a
price below that reference point. Anticipating this relatively low willingness-to-pay from
the buyer, the seller does not have incentives to offer high-quality cars, as she would need
to ask a high price, which would not be accepted by the buyer. Therefore, good cars are not
traded, an outcome driven entirely by the asymmetric information between seller and
buyer.
The second scenario in which adverse selection problems are common is the labor market,
where a worker can observe her cost of exerting effort on the job, while the employer does
not. Because the employer seeks to induce the worker to exert effort, these contexts are often
referred to as “principal-agent” models. We examine the salary that the firm offers and the
effort that the worker exerts in response in a symmetric information scenario, comparing
them against those that arise under asymmetric information. We then analyze how the firm
can design contracts inducing workers who experience a high (low) cost from effort to exert
little (great) effort on the job for a low wage (high wage), and how the contract meant for
one type of worker cannot be profitably chosen by her counterpart.
2. The firm manager has some information about the share of high- and low-productivity workers in town, or in
that profession, allowing her to find an “expected productivity” of candidates. However, the exact productivity
of a candidate is still better known by the candidate herself than by the firm manager, maintaining the hidden
information structure.
In a scenario where a firm hires a worker, adverse selection problems are often referred
to as “precontractual” because the firm does not observe the worker’s type (such as her
productivity on the job) before hiring her. Moral hazard, however, is “postcontractual”
because the firm cannot observe the worker’s effort after hiring her, assuming that the firm
knows her productivity level, which eliminates adverse selection problems. In real life, firms
often face both problems: adverse selection when interviewing candidates for a position,
followed by moral hazard issues after hiring them.3 A similar argument applies to insurance
markets, where companies offering health plans observe neither an individual’s health status
before she purchases insurance (or her genetic predisposition to develop certain conditions)
nor her level of care after acquiring a specific health plan, such as her diet and exercise
routine.
While firms face precontractual problems (adverse selection) before hiring workers, their
presentation is more involved than that of postcontractual problems (moral hazard) and, for this
reason, we start the chapter by examining moral hazard.
16.2 Moral Hazard
Moral hazard (or hidden action) A scenario in which an agent cannot observe the
actions taken by other agents.
Moral hazard problems arise, of course, in health insurance plans because companies offering
these plans cannot observe the actions that their clients take to maintain good health
(e.g., exercising, eating a good diet, and avoiding risky behaviors). Another context in which
moral hazard problems abound is car insurance because an insurance company does not
observe how careful the insured party is (e.g., the driver’s carefulness is a hidden action for
the company) but designs insurance policies to give incentives to its clients so that they are as
careful as possible. For example, Progressive’s “Snapshot” device monitors the insured
individual’s driving behavior, such as sudden braking, while other firms offer “pay-per-drive”
insurance, thus providing discounts for low mileage.
Consider the following scenario: you are hired by a small firm, which pays you $400 a
week to work 6 hours a day. If the contract does not specify a target outcome of your effort
(units of output being produced per week), how much effort will you exert? We know that
you are a responsible worker, but the firm does not monitor your work, and the contract sets
a flat weekly pay (i.e., it is a flat contract, thus providing no incentives to increase effort). You
may then slack off a little, or at least not work as much as you could every minute of the day.
3. For a more detailed presentation of contractual problems, see Macho-Stadler and Perez-Castrillo (2001) and
Campbell (2018), both at the undergraduate level. For more technical presentations, see Laffont and Martimort
(2002) and Bolton and Dewatripont (2004).
(Workers may exert effort even in this scenario because of nonmonetary incentives, such as
career concerns and future promotions. For simplicity, we abstract from these concerns in
the next discussion.)
16.2.1 Contracts When Effort Is Observable

Anticipating the incentive problems of a flat contract, the firm could specify a salary con-
nected to the effort you exert. The problem with this contract, however, is that effort is
relatively difficult to measure from the employer’s point of view. A firm manager may see
how many hours you put in at the workplace, but she cannot easily observe how focused you
were at a task or which distractions affected your concentration during the day. Alternatively,
she could write a contract specifying that your pay will increase based on the output you
produce (e.g., $2 per unit of output without defects). Sound good? Well, if the relationship
between effort and units of output was not affected by any randomness, then the manager
would be able to infer effort from output. For instance, if every hour of effort yields 4 units
of output, the observation of 16 units of output must imply that the worker exerted 4 hours
of effort. In that scenario, the observation of output would be equivalent to observing effort,
and the contract would provide workers with the right incentives.
What’s the problem with this argument? Simply put, life is messy; effort does not simply
materialize into a constant amount of output. While exerting more effort often implies pro-
ducing more output, random shocks affect our performance (focus, being infected with the
COVID-19 virus, sleep patterns, distractions with other co-workers). Even if we put in the
exact same amount of working hours for more than 2 consecutive days (and try to concen-
trate on our jobs fully), we may obtain different results each day. This randomness between
effort and output emerges both in manual jobs and intellectual tasks and cannot be ignored
by managers at the time of drafting contract details.
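The inference problem described above can be illustrated with a short Python sketch. The productivity of 4 units per hour comes from the text; the additive normal shock, its standard deviation, and the function names are assumptions made only for illustration, not a model proposed in this chapter.

```python
import random

UNITS_PER_HOUR = 4  # from the text: every hour of effort yields 4 units


def output_without_noise(hours):
    """Deterministic technology: output is proportional to effort."""
    return UNITS_PER_HOUR * hours


def infer_hours(output):
    """Invert the deterministic technology to recover effort from output."""
    return output / UNITS_PER_HOUR


def output_with_noise(hours, rng, shock_sd=3.0):
    """Assumed noisy technology: the same effort plus a random shock."""
    return UNITS_PER_HOUR * hours + rng.gauss(0.0, shock_sd)


# Without randomness, 16 units of output reveal exactly 4 hours of effort.
print(infer_hours(output_without_noise(4)))

# With randomness, the same 4 hours give a different output each day, so
# inverting the technology no longer recovers the true effort.
rng = random.Random(0)  # fixed seed so that the run is reproducible
for day in range(1, 4):
    y = output_with_noise(4, rng)
    print(day, round(y, 2), round(infer_hours(y), 2))
```

In the noisy case, the manager who divides observed output by 4 obtains a different "inferred effort" each day even though true effort never changes, which is why output alone cannot stand in for effort.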
Example 16.1 examines optimal contracts when effort is observable. We then move on to
optimal contracts in contexts where the relationship between effort and output is random.
Example 16.1: Finding optimal contracts when effort is observable Consider a
worker with utility function \( u(w) = \sqrt{w} \), where \( w \geq 0 \) denotes her salary. The worker
experiences disutility from exerting effort, e, measured by \( g(e) = e \), and her reservation
utility is \( \bar{u} \geq 0 \), which captures the utility that she would obtain in an alternative
job (or earning an unemployment benefit). For simplicity, assume that this reservation
utility is zero (\( \bar{u} = 0 \)), and that there are two effort levels the worker can exert, \( e_H = 5 \)
and \( e_L = 0 \).
As reported in the top row of table 16.1, when the worker exerts a high effort, the
firm’s sales are $0 with probability 0.1, $100 with probability 0.3, and $400 with
probability 0.6. As reported in the bottom row, when the worker exerts low effort,
the firm’s sales are $0 with probability 0.6, $100 with probability 0.3, and $400 with
probability 0.1. Intuitively, low effort makes it more likely that $0 sales occur, while
high effort increases the probability of $400 in sales.
Table 16.1
Probability of sales for each effort level.
               $0 in sales    $100 in sales    $400 in sales
High effort        0.1             0.3              0.6
Low effort         0.6             0.3              0.1
As a consequence, the expected sales when the worker exerts high effort becomes
(0.1 × 0) + (0.3 × 100) + (0.6 × 400) = $270,
while when she exerts low effort, expected sales are only
(0.6 × 0) + (0.3 × 100) + (0.1 × 400) = $70.
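The two expected-sales figures can be reproduced with a few lines of Python. This is only a sketch: the probabilities and sales levels come from table 16.1, while the variable and function names are our own.

```python
# Sales levels and their probabilities, taken from table 16.1.
SALES = [0, 100, 400]
PROB_HIGH_EFFORT = [0.1, 0.3, 0.6]
PROB_LOW_EFFORT = [0.6, 0.3, 0.1]


def expected_sales(probabilities, sales=SALES):
    """Probability-weighted average of the possible sales levels."""
    return sum(p * s for p, s in zip(probabilities, sales))


expected_high = expected_sales(PROB_HIGH_EFFORT)  # about 270
expected_low = expected_sales(PROB_LOW_EFFORT)    # about 70
print(round(expected_high, 2), round(expected_low, 2))
```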
How can the firm induce high or low effort from the worker? The worker accepts
the high-effort contract if \( u(w_H) - g(e_H) \geq \bar{u} \), which in this context implies
\[ \sqrt{w_H} - 5 \geq 0, \]
or, after rearranging, \( \sqrt{w_H} \geq 5 \). Squaring both sides, we obtain \( w_H \geq 5^2 = \$25 \).
Because the firm seeks to pay the lowest possible salary, it will reduce \( w_H \) until condition
\( w_H \geq \$25 \) holds with equality, thus paying \( w_H = \$25 \). Operating similarly for
low effort, the worker accepts the low-effort contract if \( u(w_L) - g(e_L) \geq \bar{u} \), which in
this scenario implies
\[ \sqrt{w_L} - 0 \geq 0, \]
or, after simplifying, \( \sqrt{w_L} \geq 0 \). Squaring both sides yields \( w_L \geq \$0 \), which entails
a salary of \( w_L = \$0 \). Lastly, we can compare the firm’s expected profits (measuring
expected sales less salary) as follows:
With high effort, $270 − $25 = $245
With low effort, $70 − $0 = $70.
Therefore, the firm offers a contract \( (w_H, w_L) = (\$25, \$0) \), inducing the agent to
exert a high effort level.
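As a quick check of example 16.1, the following Python sketch computes the lowest salary that satisfies each participation condition and then compares expected profits. The numbers come from the example; the function and variable names are hypothetical, and the closed form used for the wage assumes the square-root utility function of this example.

```python
def min_wage_observable(effort_cost, reservation_utility=0.0):
    """Lowest salary w satisfying sqrt(w) - effort_cost >= reservation_utility."""
    return (effort_cost + reservation_utility) ** 2


EXPECTED_SALES = {"high": 270.0, "low": 70.0}  # from the example
EFFORT_COST = {"high": 5.0, "low": 0.0}        # g(e) = e with e_H = 5, e_L = 0

# Salary attached to each effort level when effort is observable.
wages = {k: min_wage_observable(c) for k, c in EFFORT_COST.items()}

# Expected profit = expected sales less the salary paid.
profits = {k: EXPECTED_SALES[k] - wages[k] for k in wages}

# Effort level that the firm prefers to induce.
best_effort = max(profits, key=profits.get)
print(wages, profits, best_effort)
```

The firm’s choice reduces to comparing two numbers, one per effort level, because observability lets it pay a salary conditional on effort itself.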
Self-assessment 16.1 Consider the scenario in example 16.1, but assume that the
worker’s reservation utility increases to \( \bar{u} = 1/2 \). This could happen if, for instance,
unemployment benefits become more generous. Find the optimal salaries \( w_H \) and \( w_L \)
in this scenario, and compare them against those in example 16.1. Interpret.
Role of risk aversion. In example 16.1, we assumed that the worker is risk averse (i.e., her
utility function \( u(w) = \sqrt{w} \) is concave) while the firm is risk neutral (i.e., the profit function
is linear in money). We showed that the principal offers a contract that pays a relatively
generous salary when the worker exerts high effort (which the firm seeks to induce) but
a lower payoff if she exerts low effort. If the worker is less risk averse (e.g., her utility
function changes to \( u(w) = w^{9/10} \), thus being close to linear), wages become less generous.
In contrast, if the worker is more risk averse (e.g., her utility function changes to
\( u(w) = w^{1/10} \), thus becoming more concave), she needs more generous compensation. A similar
argument would apply if the firm manager were not risk neutral, but risk averse. The
end-of-chapter exercises ask you to confirm these results by altering the utility and profit functions
in example 16.1.
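To see the direction of these effects, the sketch below solves the high-effort participation condition of example 16.1 for a generic utility function \( u(w) = w^{a} \), which gives \( w_H = 5^{1/a} \) when the reservation utility is zero. The exponents 9/10, 1/2, and 1/10 come from the text; treating the effort cost and the reservation utility as unchanged, and the function name itself, are assumptions of this illustration.

```python
def min_wage(exponent, effort_cost=5.0, reservation_utility=0.0):
    """Lowest w solving w**exponent - effort_cost = reservation_utility."""
    return (effort_cost + reservation_utility) ** (1.0 / exponent)


# From close to linear (0.9) to very concave (0.1).
for a in (0.9, 0.5, 0.1):
    print(a, round(min_wage(a), 2))
```

The required salary rises as the exponent falls, which matches the ranking described above: the more concave the worker’s utility, the more generous the salary needed to compensate her for the same effort cost.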
16.2.2 Contracts When Effort Is Unobservable

When the firm cannot observe the effort that the worker exerts, it needs to provide her with
incentives to exert the amount of effort that maximizes profits. (A similar argument applies
if the relationship between effort and output is random, as discussed previously.) In order
to understand the optimal wage in this context, let us consider again the previous scenario
where effort was observable. Under observable effort, inducing eH gives rise to two effects:
on the one hand, it increases expected profits (because higher outcomes are more likely to
occur when the worker exerts eH than eL ) but on the other hand, effort eH is more expensive
to induce than eL , as the former requires a more generous salary. For simplicity, let us assume
that the positive effect outweighs the negative effect, entailing that the firm seeks to induce a
high effort eH from the worker.
How are these effects modified when we introduce unobservability of effort? The positive
effect from effort level eH (larger expected profits) is unaffected. However, the wage that
the firm must pay to induce eH is more generous when effort is unobservable (requiring
the worker to voluntarily choose high rather than low effort) than when it is observable. In
summary, while the expected benefits from eH are unaffected, its expected costs go up when
effort is not observable, thus restricting the cases for which the firm continues to induce this
high effort.
Example 16.2: Finding optimal contracts when effort is unobservable Consider
the firm and worker in example 16.1, but now let us allow for effort to be
unobservable for the firm. In this context, assume that when the worker exerts high
effort, the firm observes high output with probability 0.6, but low output otherwise,
with probability 0.4. In contrast, if the worker exerts low effort, the firm observes
high (low) output with probability 0.1 (0.9, respectively). Intuitively, high output is
more likely to originate from high than low effort, but both efforts have a probability
of yielding high or low output. Table 16.2 summarizes these probabilities.4
Table 16.2
Probability of high and low outputs for each effort level.
               High output    Low output
High effort        0.6            0.4
Low effort         0.1            0.9
Because in this discussion, we assumed that the firm prefers to induce a high effort
level, the firm’s problem in this context becomes
\[ \max_{w_H, w_L} \; \$270 - \underbrace{[0.6 w_H + 0.4 w_L]}_{\text{Expected labor cost}} \]
subject to
\[ \underbrace{0.6\sqrt{w_H} + 0.4\sqrt{w_L} - 5}_{\text{Expected utility from high effort}} \geq 0 \quad \text{(PC)} \]
\[ \underbrace{0.6\sqrt{w_H} + 0.4\sqrt{w_L} - 5}_{\text{Expected utility from high effort}} \geq \underbrace{0.1\sqrt{w_H} + 0.9\sqrt{w_L} - 0}_{\text{Expected utility from low effort}}, \quad \text{(IC)} \]
where $270 denotes the firm’s expected sales from high effort, while the term in brackets
represents the firm’s expected labor cost (either paying \( w_H \) when the observed
output is high or \( w_L \) when it is low). The first constraint of the firm’s problem simply
states that the worker prefers to exert high effort (obtaining an expected utility of
\( 0.6\sqrt{w_H} + 0.4\sqrt{w_L} \), but suffering an effort cost of 5) rather than rejecting the firm’s
contract altogether (receiving a payoff of 0). That is, the worker prefers to participate
in the contract, which explains why this constraint is often referred to as the agent’s
“participation constraint,” or PC. The second constraint, however, indicates that the
worker prefers to exert high rather than low effort. In other words, the contract provides
her with sufficient incentive to exert high effort, which is known as the “incentive
constraint,” or IC.
4. Strictly, the entries in table 16.2 are “conditional probabilities” because high or low output level is affected by
the effort that the worker selects. Specifically, a high output becomes more likely when the worker exerts high effort
rather than low.
In this context, IC holds with equality. Intuitively, if IC held with strict inequality, the firm could
still reduce the salary offered to the worker when high (low) output is observed, \( w_H \)
(\( w_L \), respectively), increasing its profits as a result. Because IC holds with equality, we obtain
\[ 0.6\sqrt{w_H} + 0.4\sqrt{w_L} - 5 = 0.1\sqrt{w_H} + 0.9\sqrt{w_L}, \]
or after rearranging, \( 0.5\sqrt{w_H} - 0.5\sqrt{w_L} = 5 \). Solving for \( w_H \) in IC, we find
\( w_H = (\sqrt{w_L} + 10)^2 \). We can then plug in this result everywhere we had \( w_H \) in the previous
maximization problem, simplifying it to
\[ \max_{w_L} \; \$270 - \Big[ 0.6 \underbrace{(\sqrt{w_L} + 10)^2}_{w_H} + 0.4 w_L \Big], \]
subject to
\[ 0.6(\sqrt{w_L} + 10) + 0.4\sqrt{w_L} - 5 \geq 0. \quad \text{(PC)} \]
Note that the firm now only has one choice variable below the max operator (salary
\( w_L \)) rather than two (salaries \( w_H \) and \( w_L \)) because we plugged in \( w_H = (\sqrt{w_L} + 10)^2 \),
making the maximization problem a function of \( w_L \) alone.
Finally, a common approach to the problem at this point is to ignore the PC condition,
treating the program as an unconstrained maximization problem (as if we didn’t
have a constraint!). Once we are done solving this unconstrained problem, we will
need to check that our results satisfy the PC condition. This trick indeed simplifies our
calculations significantly because we can operate as if PC were absent, differentiating
the firm’s objective function with respect to \( w_L \), which yields
\[ \frac{\partial \pi}{\partial w_L} = -\left[ \frac{0.6}{\sqrt{w_L}} \left( \sqrt{w_L} + 10 \right) + 0.4 \right] = -0.6 - \frac{6}{\sqrt{w_L}} - 0.4 = -1 - \frac{6}{\sqrt{w_L}}. \]
This expression is clearly negative for all salaries \( w_L \). Therefore, the firm reduces
this salary as much as possible, to \( w_L^* = 0 \). We can conclude that the firm pays \( w_L^* = \$0 \)
after observing low output, and
\[ w_H^* = \left( \sqrt{w_L} + 10 \right)^2 = \left( \sqrt{0} + 10 \right)^2 = \$100 \]
after observing high output. Relative to the case where effort is observable, the firm
still pays a salary \( w_L = \$0 \) after observing low output. However, the salary after high
output, \( w_H \), increases from $25 to $100 when the agent’s effort is unobservable. We
further check that the PC is slack (i.e., it holds with strict inequality) because
\[ 0.6\left( \sqrt{0} + 10 \right) + 0.4\sqrt{0} - 5 = 1 > 0. \]
Recall that when effort is observable, PC holds with equality, leaving the worker
indifferent between accepting and rejecting the contract (i.e., zero expected payoff).
In contrast, when effort is unobservable, her expected payoff is larger (a utility of 1, in this
example).
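The solution of example 16.2 can be double-checked numerically. The sketch below substitutes the binding IC into the firm’s objective, searches over a grid of low-output salaries, and evaluates the PC at the optimum. The grid (0 to 50 in steps of 0.5) and all function names are assumptions chosen for illustration; the search is a sanity check, not a replacement for the derivation above.

```python
import math


def w_high_from_ic(w_low):
    """Binding IC: 0.5*sqrt(w_H) - 0.5*sqrt(w_L) = 5."""
    return (math.sqrt(w_low) + 10.0) ** 2


def profit(w_low):
    """Expected sales of 270 less expected labor cost 0.6*w_H + 0.4*w_L."""
    return 270.0 - (0.6 * w_high_from_ic(w_low) + 0.4 * w_low)


def pc_value(w_low):
    """Left-hand side of PC after substituting the binding IC."""
    return 0.6 * (math.sqrt(w_low) + 10.0) + 0.4 * math.sqrt(w_low) - 5.0


def analytic_derivative(w_low):
    """Derivative from the text, -1 - 6/sqrt(w_L), defined for w_L > 0."""
    return -1.0 - 6.0 / math.sqrt(w_low)


grid = [0.5 * k for k in range(101)]   # assumed grid: 0, 0.5, ..., 50
w_low_star = max(grid, key=profit)     # profit-maximizing low-output salary
w_high_star = w_high_from_ic(w_low_star)
always_decreasing = all(analytic_derivative(w) < 0 for w in grid if w > 0)

print(w_low_star, w_high_star, pc_value(w_low_star), always_decreasing)
```

Because profit falls with the low-output salary everywhere on the grid, the search lands on the corner at zero, in line with the salaries of $0 and $100 found in the example.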
Our last result (that the worker’s utility is larger when the firm cannot observe her effort
than when the firm can observe it) is often described by saying that the informed agent in this
relationship enjoys an “information rent.”
Information rent A utility gain that an agent enjoys when moving from a symmetric
to an asymmetric information context.
Similar information rents will emerge in other contractual relationships analyzed in this
chapter, in which the worker’s payoff is higher when she benefits from an information advan-
tage relative to the firm (asymmetric information) than when both parties are symmetrically
informed.
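For the numbers in examples 16.1 and 16.2, the information rent can be computed directly, as in the following sketch. The salaries ($25 under observable effort; $100 and $0 under unobservable effort) come from those examples. The function name is hypothetical, and the expected labor cost comparison at the end is our own arithmetic rather than a figure reported in the text.

```python
import math


def worker_utility(w_high, w_low, prob_high_output, effort_cost):
    """Expected utility with u(w) = sqrt(w), net of the cost of effort."""
    expected_u = (prob_high_output * math.sqrt(w_high)
                  + (1 - prob_high_output) * math.sqrt(w_low))
    return expected_u - effort_cost


# Symmetric information: a salary of 25 in exchange for high effort (cost 5).
utility_observable = math.sqrt(25.0) - 5.0

# Asymmetric information: w_H = 100, w_L = 0, and the worker exerts high effort.
utility_unobservable = worker_utility(100.0, 0.0, 0.6, 5.0)

information_rent = utility_unobservable - utility_observable

# What the rent costs the firm in expected salary payments.
labor_cost_observable = 25.0
labor_cost_unobservable = 0.6 * 100.0 + 0.4 * 0.0

print(information_rent, labor_cost_observable, labor_cost_unobservable)
```

The worker’s gain of 1 unit of utility comes with a higher expected wage bill for the firm (about 60 rather than 25), which is the cost of providing incentives when effort is hidden.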
Self-assessment 16.2 Consider the scenario in example 16.2, but assume that the
worker’s reservation utility increases to \( \bar{u} = 1/2 \). Find the optimal salaries \( w_H^* \) and
\( w_L^* \) in this scenario, the worker’s information rent, and compare them against your
findings in self-assessment 16.1 (the complete information version of this scenario).
Interpret.
16.2.3 Preventing Moral Hazard

Given the inefficiencies that emerge under moral hazard, firms seek to observe the worker’s
effort. For instance, the firm manager may monitor the worker’s effort. This is costly for the
firm, however, even if it only measures the worker’s effort for a few minutes every month. For
monitoring to be effective: (1) workers must know that monitoring may occur, as otherwise,
they would behave as if the manager could never observe their effort levels; and (2) they
must not know when their effort will be monitored, as otherwise, they would strategically
work hard only when their effort is monitored. Is our department chair looking over our
shoulder while we write this?
16.3 Adverse Selection
In this section, we continue with our analysis of asymmetric information but rather than
considering scenarios where an agent does not observe the actions taken by another agent
(hidden action), we now examine contexts where she cannot observe some private infor-
mation of the other agent (hidden information). Examples include a buyer not observing
a used car’s quality (which is only observed by the seller) or a manager not observing a
job applicant’s ability. As we show, lack of information could lead the uninformed party to
make a wrong decision (e.g., select a low-quality car, or a low-ability job candidate). This
explains why hidden information models are also known as “adverse selection.” Examples
in insurance markets abound, with insurance companies not being able to observe the risk
of an insured party (an individual’s health or her driving ability).
16.3.1 Market for Lemons

In previous chapters, we assume that if buyers are interested in purchasing a good and sellers
find it profitable to sell it, then a market will exist where parties exchange the good at a
mutually agreeable price. In this section, however, we show that markets might fail to exist if
buyers and sellers have access to different amounts of information. Because a market would
exist when agents are symmetrically informed, we can say that information asymmetries
can lead to market imperfections.
Following Akerlof (1970), consider a used-cars market, where quality is denoted by q.
Quality is a random variable whose realization is observed by the seller, but not by the
buyer. For simplicity, quality q is uniformly distributed between 0 and \( \frac{3}{2} \). A car of quality
q is valued as such by the buyer, and at discounted value \( \frac{q}{3/2} = \frac{2}{3}q \) by the seller. The buyer,
therefore, assigns to the car a larger valuation than the seller, and they could find prices
between \( \frac{2}{3}q \) and q for which the trade makes both parties better off. For instance, if they
agree on a price \( p = \frac{3}{4}q \), the seller makes a profit of \( \pi = p - \frac{2}{3}q = \frac{3}{4}q - \frac{2}{3}q = \frac{1}{12}q \), whereas
the buyer obtains a utility of \( u = q - p = q - \frac{3}{4}q = \frac{1}{4}q \).
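The gains from trade in this illustration can be verified with exact fractions. In the sketch below, the seller’s valuation of two-thirds of q and the price of three-fourths of q come from the text, while the quality level q = 1 and the function name are hypothetical choices made for the example.

```python
from fractions import Fraction

SELLER_DISCOUNT = Fraction(2, 3)  # seller values a car of quality q at (2/3) q
PRICE_SHARE = Fraction(3, 4)      # illustrative price p = (3/4) q from the text


def gains_from_trade(q, price_share=PRICE_SHARE):
    """Return (price, seller profit, buyer utility) for a car of quality q."""
    price = price_share * q
    seller_profit = price - SELLER_DISCOUNT * q
    buyer_utility = q - price
    return price, seller_profit, buyer_utility


q = Fraction(1)  # hypothetical quality inside the interval [0, 3/2]
price, seller_profit, buyer_utility = gains_from_trade(q)
print(price, seller_profit, buyer_utility)
```

Any price share strictly between 2/3 and 1 would leave both payoffs positive; the value 3/4 is just one such point.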
16.3.2 Market for Lemons—Symmetric Information

When both seller and buyer observe the car’s quality q, the seller only needs to charge a
price p that maximizes her profits, \( p - \frac{2}{3}q \), subject to guaranteeing that the price is accepted
by the buyer (i.e., \( q - p \geq 0 \) or \( q \geq p \)). Formally, the seller sets p to solve
\[ \max_{p} \; p - \frac{2}{3}q \]
\[ \text{subject to } q - p \geq 0. \quad \text{(PC)} \]
The buyer’s participation constraint (PC) must hold with equality (i.e., \( q - p = 0 \) or \( q = p \)).
Otherwise, the seller could charge a higher price, which would still be accepted by the buyer.
Inserting \( q = p \) into the seller’s profit (in this maximization problem) simplifies it to
\[ \max_{p} \; p - \frac{2}{3}p = \frac{1}{3}p. \]
Differentiating with respect to p, we obtain \( \frac{1}{3} \). Because this result is always positive, we
found a corner solution where the seller increases price p as much as possible, that is to say,
\( p = q \). (Higher prices would not satisfy the buyer’s PC, leading her not to purchase the car.)
Therefore, the seller charges a price equal to the car’s quality q, which in this scenario the
buyer can perfectly observe. Importantly, all car types are traded: from those with q close
to zero (poor quality, or “lemons”) to those with q close to \( \frac{3}{2} \) (good quality, or “peaches”).
In summary, when both parties observe the car’s quality, no market failures arise.
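A minimal sketch of the symmetric information outcome follows. It sets the price equal to quality, as derived above, and confirms that the seller earns a nonnegative profit on every quality level in a sample. The sample of qualities (multiples of 1/4 up to 3/2) and the function name are assumptions made for illustration.

```python
from fractions import Fraction

SELLER_DISCOUNT = Fraction(2, 3)  # seller values a car of quality q at (2/3) q


def symmetric_information_outcome(q):
    """Price and payoffs when both parties observe quality q."""
    price = q                                    # binding PC: q - p = 0
    seller_profit = price - SELLER_DISCOUNT * q  # equals q / 3
    buyer_utility = q - price                    # zero, because PC binds
    return price, seller_profit, buyer_utility


qualities = [Fraction(k, 4) for k in range(7)]   # 0, 1/4, ..., 3/2
traded = [q for q in qualities
          if symmetric_information_outcome(q)[1] >= 0]
print(traded == qualities)
```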
Self-assessment 16.3 Repeat the analysis in subsection 16.3.2, but assume that
the car quality is uniformly distributed between 0 and 2 (rather than between 0 and
\( \frac{3}{2} \)). This means that the seller now values a car of quality q as \( \frac{q}{2} = \frac{1}{2}q \), rather than at
\( \frac{q}{3/2} = \frac{2}{3}q \). How are the results in subsection 16.3.2 affected?
16.3.3 Market for Lemons—Asymmetric Information

How would these results change if the buyer could not observe the car’s true quality?5 In this
context, the buyer will accept a price p if she receives a nonnegative expected utility, \( E[q] - p \geq 0 \).
Term E[q] indicates the car’s expected quality, which we can find as follows:
\[ E[q] = \frac{\frac{3}{2} + 0}{2} = \frac{3}{4}, \]
because quality q is uniformly distributed between 0 and \( \frac{3}{2} \). The buyer then accepts a price
p if \( \frac{3}{4} - p \geq 0 \), or \( p \leq \frac{3}{4} \). In this scenario, the seller’s problem now becomes
\[ \max_{p} \; p - \frac{2}{3}q \]
\[ \text{subject to } p \leq \frac{3}{4}. \quad \text{(PC)} \]
5. We have all been in that position as buyers of a used item. For instance, when we purchased our first car (a used
car, of course), the seller said something along the lines of, “This is an excellent car, an old lady owned it for a few
years and took great care of it.” Well, by the number of miles on the car, our “old lady” must have driven across
many states every weekend! We admit that the car ended up being in great condition (a peach!), so we still trust
that seller.
By the same argument, the seller can raise the price p until the PC holds with equality,
\( p = \frac{3}{4} \). But we have then solved this problem: the seller sets the highest acceptable price to
the buyer, as any higher price yields a negative expected utility for the buyer. This price,
however, leads the seller to offer cars with quality q that satisfies \( p - \frac{2}{3}q = \frac{3}{4} - \frac{2}{3}q \geq 0 \) (i.e.,
positive profits). Rearranging, this condition entails \( \frac{3}{4} \geq \frac{2}{3}q \), or solving for quality q, we
obtain
\[ \frac{3/4}{2/3} = \frac{9}{8} \geq q. \]
In other words, offering cars with qualities above \( \frac{9}{8} \) is unprofitable for the seller. This is
problematic because, as we described in the symmetric information scenario, the seller and
buyer could make a positive margin from exchanging cars of all quality levels if they could
both observe the quality. In other words, the buyer’s inability to observe the car’s quality
leads to the nonexistence of the market for good cars (“peaches”), whereas only bad cars
(“lemons”) exist in this market. Informally, bad cars pushed out good cars from the market.
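The steps of the asymmetric information case can be traced with exact fractions. The sketch below uses the uniform distribution between 0 and 3/2 and the seller’s valuation of two-thirds of q from the text. The last quantity, the share of cars that the seller is willing to offer, is a by-product of the uniform assumption that the text does not report, and the variable names are our own.

```python
from fractions import Fraction

Q_MAX = Fraction(3, 2)            # quality is uniform between 0 and 3/2
SELLER_DISCOUNT = Fraction(2, 3)  # seller values a car of quality q at (2/3) q

# Mean of a uniform distribution on [0, Q_MAX].
expected_quality = (Q_MAX + 0) / 2

# Highest price the uninformed buyer accepts: E[q] - p >= 0 with equality.
price = expected_quality

# The seller offers a car only if p - (2/3) q >= 0, that is, q <= cutoff.
cutoff = price / SELLER_DISCOUNT

# Under the uniform distribution, the share of cars with q <= cutoff.
share_offered = min(cutoff, Q_MAX) / Q_MAX

print(expected_quality, price, cutoff, share_offered)
```

Cars with quality between 9/8 and 3/2 are exactly the ones withheld, which is the sense in which the best cars leave the market.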
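Self-assessment 16.4 Repeat the analysis in subsection 16.3.3, but assume that
the car quality is uniformly distributed between 0 and 2 (rather than between 0 and
\( \frac{3}{2} \)). This means that the seller now values a car of quality q as \( \frac{q}{2} = \frac{1}{2}q \), rather than at
\( \frac{q}{3/2} = \frac{2}{3}q \). How are the results in subsection 16.3.3 affected?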
Lemons in other markets. Similar problems emerge in the labor market where buyers of
job services (firms) have access to less information than sellers of labor (job applicants). In
this context, a worker privately observes her productivity, θ, but firms do not. Following this
argument, firms would only offer a wage equal to the worker’s expected productivity,
\( w = E[\theta] \). However, this salary attracts only workers whose productivity lies below such a salary
(i.e., \( \theta \leq E[\theta] \)), but it does not attract those with relatively high productivity (i.e., \( \theta > E[\theta] \)).
In short, asymmetric information prevents the existence of a market of high-skilled workers,
leaving only low-skilled workers employed.
Overcoming the lemon problem. Sellers often try to overcome this market failure by
offering warranties for their items (used cars). If sellers offer warranties when selling a
high-quality car (peach) but not a low-quality car (lemon), the observation of whether a car
comes with a warranty signals its true quality to the buyer, who can now operate as in the
symmetric information context. In that scenario, markets for both lemons and peaches exist.
More recent tools include CARFAX, which provides information about the car’s reported
accidents, miles accumulated by every owner, large repairs and factory recalls of defective
parts, as if their quality was certified by a third party. In addition, some manufacturers offer
certified preowned vehicles to signal good quality.
16.3.4 Principal-Agent Model

Consider now a scenario between a principal (firm) and an agent (worker). The principal’s
profits are given by
\[ \pi(e) = \log(e) - w, \]
thus increasing in the effort e that the worker exerts, but at a decreasing rate because the log
function is concave. In addition, profits decrease in the wage w that the principal pays to
the agent. The agent’s utility is
\[ u(w, e) = w - \theta e^2, \]
which increases in the salary that she receives from the firm, w. The second term in this
utility function, \( \theta e^2 \), can be understood as the worker’s “cost of effort,” which is increasing
and convex in the effort she exerts, e. In addition, the worker’s cost of effort increases
in parameter θ. For instance, her innate productivity might be lower, and thus the same
amount of effort generates a larger disutility when θ increases.6 For simplicity, consider
that parameter θ is either high or low, denoted as \( \theta_H \) and \( \theta_L \), where \( \theta_H > \theta_L \).
16.3.5 Principal-Agent Model—Symmetric Information

When the firm observes parameter θ, it knows the cost $\theta e^2$ that the worker incurs from exerting
effort. In that scenario, the firm finds a wage w and an effort level e that maximize
its profits, log(e) − w, subject to guaranteeing that the worker accepts the contract (i.e.,
$w - \theta e^2 \geq 0$). Formally, the firm sets a salary w and an effort level e to solve
$$\max_{w,e}\ \log(e) - w$$
$$\text{subject to } w - \theta e^2 \geq 0. \quad (PC)$$
6. The marginal cost of effort, $2\theta e$, is also larger for workers with a high parameter θ.
As in the lemon problem, where the firm increased the price it charges as much as possible,
the firm now seeks to decrease wages as much as possible, while still guaranteeing workers’
acceptance. That is, the PC $w - \theta e^2 \geq 0$ holds with equality, $w - \theta e^2 = 0$, which yields
$w = \theta e^2$. Inserting this result into the firm’s profits transforms this problem as follows:⁷
$$\max_{e}\ \log(e) - \theta e^2.$$
Differentiating with respect to e, we obtain $\frac{1}{e} - 2\theta e = 0$. Solving for e, we find $\frac{1}{e} = 2\theta e$, or
$\frac{1}{2\theta} = e^2$. Applying the square root on both sides yields the optimal effort level
$$e^{SI} = \left(\frac{1}{2\theta}\right)^{1/2},$$
where superscript SI denotes “symmetric information.” Because $\theta_H > \theta_L$, efforts satisfy
$e^{SI}_H = \left(\frac{1}{2\theta_H}\right)^{1/2} < \left(\frac{1}{2\theta_L}\right)^{1/2} = e^{SI}_L$, implying that the high-cost worker exerts a lower effort than
the low-cost worker.
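Lastly, we can find the optimal wages in this context using $w = \theta e^2$. When the firm
observes $\theta_H$, it offers a wage of
$$w^{SI}_H = \theta_H (e^{SI}_H)^2 = \theta_H \times \left[\left(\frac{1}{2\theta_H}\right)^{1/2}\right]^2 = \theta_H \frac{1}{2\theta_H} = \$\frac{1}{2}.$$
Similarly, when it observes $\theta_L$, the firm offers a wage of
$$w^{SI}_L = \theta_L (e^{SI}_L)^2 = \theta_L \times \left[\left(\frac{1}{2\theta_L}\right)^{1/2}\right]^2 = \theta_L \frac{1}{2\theta_L} = \$\frac{1}{2}.$$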
Both types of worker receive the same wage because it is more expensive for the firm to
induce effort from the high-type (because effort is more costly for this worker), implying
that the firm induces less effort from her. In other words, the firm pays this type of worker
the same salary as if she were a low-type worker, but this salary induces her to exert a lower
effort level than the low-type worker.
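The closed-form results above can be checked numerically. The following Python sketch is only an illustration: the function name `symmetric_contract` and the two sample values of θ (1 and 2) are our own choices, and θ is assumed to be a positive integer or fraction so that the wage stays exact.

```python
from fractions import Fraction
from math import sqrt


def symmetric_contract(theta):
    """Return (effort, wage) when the firm observes the cost parameter theta.

    theta is assumed to be a positive integer or Fraction (illustrative helper).
    """
    # First-order condition 1/e - 2*theta*e = 0 gives e**2 = 1/(2*theta).
    effort_squared = Fraction(1, 1) / (2 * theta)
    effort = sqrt(effort_squared)
    # Binding participation constraint: w = theta * e**2.
    wage = theta * effort_squared
    return effort, wage


for theta in (1, 2):
    effort, wage = symmetric_contract(theta)
    print(f"theta = {theta}: effort = {effort:.4f}, wage = {wage}")
```

In this sketch the wage equals 1/2 for both values of θ, while effort falls as θ rises, in line with the discussion above.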
Example 16.3: Principal-agent problem under symmetric information Consider
that $\theta_H = 2$ and $\theta_L = 1$. Using the previous results, we find that, when the firm
observes the worker’s type to be $\theta_H = 2$, it requires an effort level of
$$e^{SI}_H = \left(\frac{1}{2\theta_H}\right)^{1/2} = \left(\frac{1}{2 \times 2}\right)^{1/2} = \frac{1}{2}$$
from the worker. In contrast, when the firm observes $\theta_L = 1$, it requires an effort of
$$e^{SI}_L = \left(\frac{1}{2\theta_L}\right)^{1/2} = \left(\frac{1}{2 \times 1}\right)^{1/2} = \frac{1}{\sqrt{2}}.$$
Intuitively, the firm demands more effort from the worker with the lowest cost of effort,
$e^{SI}_L > e^{SI}_H$. As shown previously, however, the firm pays the same salary to both types
of workers, $w^{SI}_L = w^{SI}_H = \$\frac{1}{2}$.
7. Because the firm’s profits now do not depend on wage w, we delete it from the list of variables the firm can
choose from (the max operator only includes e below it, rather than w and e).
the worker’s reservation utility is $2 rather than zero (see the right side of her PC).
Find the optimal efforts, $e^{SI}_L$ and $e^{SI}_H$, and the optimal salaries, $w^{SI}_L$ and $w^{SI}_H$. How are
the results in subsection 16.3.5 affected?
16.3.6 Principal-Agent Model—Asymmetric Information

The firm now does not observe the worker’s type θ, but it knows that a proportion γ of
workers are $\theta_H$, while the remaining share, 1 − γ, are $\theta_L$. In that context, the firm maximizes
its expected profits (because it does not know whether it deals with a high or a low type of
worker).
As in the symmetric information scenario, workers must be willing to work for the firm
(PC), which implies that the contract generates a nonnegative utility, both for the high and the
low type of worker. Unlike in the symmetric information case, we must also require that each
type of worker prefers to choose the contract meant for her, rather than that of the other
type of worker (IC). That is, the high type must weakly prefer the contract meant for her to the
contract meant for the low type, and the low type must weakly prefer her own contract to that
of the high type. We next discuss how to mathematically represent the firm’s profit
maximization problem (PMP) so it takes these points into account.
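Writing the firm’s problem under asymmetric information. Mathematically, we write the
firm’s expected PMP as follows:
$$\max_{w_H, e_H, w_L, e_L}\ \gamma \underbrace{[\log(e_H) - w_H]}_{\text{If worker is high type}} + (1 - \gamma) \underbrace{[\log(e_L) - w_L]}_{\text{If worker is low type}},$$
subject to
$$
\begin{aligned}
w_H - \theta_H e_H^2 &\geq 0 && (PC_H) \\
w_L - \theta_L e_L^2 &\geq 0 && (PC_L) \\
w_H - \theta_H e_H^2 &\geq w_L - \theta_H e_L^2 && (IC_H) \\
w_L - \theta_L e_L^2 &\geq w_H - \theta_L e_H^2. && (IC_L)
\end{aligned}
$$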
Intuitively, the firm offers a “menu” of two contracts, each of which specifies a wage and
an effort level: one contract is meant for the high-type worker, (wH , eH ), and another is
meant for the low-type worker, (wL , eL ). The firm obtains profits of log(eH ) − wH when the
worker is a high type, but log(eL ) − wL when she is of a low type, so the firm maximizes its
expected profits.
In addition, the four constraints in this problem specify that (1) the high-type worker finds
her contract acceptable, and thus prefers to participate, as indicated by PCH ; (2) the low-
type worker finds her contract acceptable, as reflected by PCL ; (3) the high-type worker has
incentives to choose the contract meant for her rather than that of the low type, as indicated
by ICH ; and (4) the low-type worker prefers the contract meant for her rather than that
written for the high-type worker, as captured by ICL . In short, the PC constraints guarantee
the voluntary participation of all types of workers, while the IC constraints ensure self-
selection (i.e., each type of worker selecting the contract meant for her).
Simplifying this problem. In this context, it is straightforward to show that $PC_H$ and $IC_L$ hold
with equality, which yields $w_H - \theta_H e_H^2 = 0$ and $w_L - \theta_L e_L^2 = w_H - \theta_L e_H^2$, respectively. (See
appendix at the end of the chapter for a step-by-step proof.) This is a common feature worth
remembering: the PC of the least efficient agent ($PC_H$ in this context, as she experiences the
highest cost of effort) and the IC of the most efficient agent ($IC_L$ in our case, as the low-type
worker exhibits the lowest cost of effort) both hold with equality. From $PC_H$ binding, we find
that $w_H = \theta_H e_H^2$. We can now insert the binding $PC_H$ into the binding $IC_L$ to obtain
$$w_L - \theta_L e_L^2 = \underbrace{\theta_H e_H^2}_{w_H = \theta_H e_H^2 \text{ from } PC_H} - \theta_L e_H^2,$$
which, rearranging, yields a salary of $w_L = \theta_H e_H^2 + \theta_L (e_L^2 - e_H^2)$.
We can now insert these results for $w_H$ and $w_L$ into the maximization problem, which
simplifies to⁸
$$\max_{e_L, e_H}\ \gamma \Big[\log(e_H) - \underbrace{\theta_H e_H^2}_{w_H}\Big] + (1 - \gamma) \Big[\log(e_L) - \underbrace{\left(\theta_H e_H^2 + \theta_L (e_L^2 - e_H^2)\right)}_{w_L}\Big].$$
8. Note that the firm’s profits now do not contain wages because we already found them in the previous discussion,
limiting the list of choice variables to effort levels $e_H$ and $e_L$ alone. In addition, we do not include the constraints
$PC_L$ and $IC_H$. A common trick in problems with several constraints is to ignore some of them (such as these two
constraints) and solve the problem as if it did not have them. At the end of the problem, however, we must check that
our solutions satisfy all constraints, including those we ignored in the process.
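Solving the simplified problem. Differentiating with respect to $e_L$, we obtain
$$(1 - \gamma)\left(\frac{1}{e_L} - 2\theta_L e_L\right) = 0, \quad (FOC_{e_L})$$
which, after rearranging and solving for effort $e_L$, yields
$$e^{AI}_L = \left(\frac{1}{2\theta_L}\right)^{1/2},$$
where the superscript AI indicates that we are in an “asymmetric information” context.
Differentiating with respect to $e_H$, we find
$$\gamma\left(\frac{1}{e_H} - 2\theta_H e_H\right) - 2e_H (1 - \gamma)(\theta_H - \theta_L) = 0, \quad (FOC_{e_H})$$
which, after rearranging and solving for effort $e_H$, yields
$$e^{AI}_H = \left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2}.$$
We can now find the wage for the low-type worker. Using the expression of the binding
$IC_L$, $w_L = \theta_H e_H^2 + \theta_L (e_L^2 - e_H^2)$, we obtain
$$
\begin{aligned}
w^{AI}_L &= \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} + \theta_L \left(\frac{1}{2\theta_L} - \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right) \\
&= \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} (\theta_H - \theta_L) + \theta_L \frac{1}{2\theta_L} \\
&= \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]}.
\end{aligned}
$$
Analogously, the wage to the high-type worker is found using the binding $PC_H$, $w_H = \theta_H e_H^2$, yielding
$$w^{AI}_H = \theta_H (e^{AI}_H)^2 = \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}.$$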
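Example 16.4: Principal-agent problem under asymmetric information Let us
continue with example 16.3, where $\theta_L = 1$ and $\theta_H = 2$, and assume that the probability
(or proportion) of high-cost workers in the pool of workers is $\gamma = \frac{1}{3}$. We can then
evaluate the optimal effort levels under asymmetric information as follows:
$$e^{AI}_L = \left(\frac{1}{2\theta_L}\right)^{1/2} = \left(\frac{1}{2 \times 1}\right)^{1/2} = \frac{1}{\sqrt{2}},$$
$$e^{AI}_H = \left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2} = \left(\frac{\frac{1}{3}}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]}\right)^{1/2} = \frac{1}{\sqrt{8}},$$
while optimal wages are
$$w^{AI}_L = \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]} = \frac{\left(1 + \frac{1}{3}\right) \times 2 - 1}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]} = \$\frac{5}{8},$$
$$w^{AI}_H = \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} = 2 \times \frac{\frac{1}{3}}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]} = \$\frac{1}{4}.$$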
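The arithmetic in example 16.4 can be reproduced with exact fractions. The Python sketch below is a minimal illustration, assuming the binding constraints and first-order conditions of subsection 16.3.6; the function name `asymmetric_menu` and the dictionary keys are hypothetical labels, and efforts are stored in squared form to avoid square roots.

```python
from fractions import Fraction


def asymmetric_menu(theta_h, theta_l, gamma):
    """Return squared efforts and wages of the menu under asymmetric information.

    gamma is the share of high-cost workers (illustrative helper).
    """
    denom = 2 * (theta_h - (1 - gamma) * theta_l)
    e_h_sq = gamma / denom                    # from the FOC with respect to e_H
    e_l_sq = Fraction(1, 1) / (2 * theta_l)   # from the FOC with respect to e_L
    w_h = theta_h * e_h_sq                    # binding PC_H
    w_l = w_h + theta_l * (e_l_sq - e_h_sq)   # binding IC_L
    return {"e_h_sq": e_h_sq, "e_l_sq": e_l_sq, "w_h": w_h, "w_l": w_l}


gamma = Fraction(1, 3)
menu = asymmetric_menu(Fraction(2), Fraction(1), gamma)
# Closed-form wage of the low-cost worker, for comparison.
closed_form_w_l = ((1 + gamma) * 2 - 1) / (2 * (2 - (1 - gamma) * 1))

for name, value in menu.items():
    print(name, "=", value)
print("closed-form w_l =", closed_form_w_l)
```

The squared efforts 1/8 and 1/2 correspond to efforts of 1/√8 and 1/√2, and the wage built from the binding constraints agrees with the closed-form expression.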
Self-assessment 16.6 Repeat the analysis in example 16.4, but assume that
the proportion of high-cost workers increases to $\gamma = \frac{1}{2}$. How are the results in
16.3.7 Principal-Agent Model—Comparing Information Settings

The results found previously showed that the introduction of asymmetric information entails
no changes in effort for the worker with low cost of effort because $e^{SI}_L = e^{AI}_L = \frac{1}{\sqrt{2}}$. This is often
known as the “no distortion at the top” result, which predicts that the most efficient agent
suffers no distortion in her effort (or output) across information scenarios. However, these
results find that the high-cost worker exerts less effort when the firm is uninformed about
her type than when it is informed; that is,⁹
$$e^{AI}_H = \frac{1}{\sqrt{8}} < \frac{1}{2} = e^{SI}_H.$$
9. More generally, it is fairly straightforward to check that $e^{AI}_H < e^{SI}_H$, because $\left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2} < \left(\frac{1}{2\theta_H}\right)^{1/2}$
translates into $\gamma \theta_H < \theta_H - (1 - \gamma)\theta_L$, leading to $(1 - \gamma)\theta_L < (1 - \gamma)\theta_H$, or $\theta_L < \theta_H$, a condition that holds true
by assumption.
Salaries are, in contrast, higher for the low-cost worker (the efficient worker) under
asymmetric than symmetric information; that is,¹⁰
$$w^{AI}_L = \$\frac{5}{8} > \$\frac{1}{2} = w^{SI}_L,$$
but lower for the high-cost (inefficient) worker; that is,¹¹
$$w^{AI}_H = \$\frac{1}{4} < \$\frac{1}{2} = w^{SI}_H.$$
As a consequence, the efficient worker earns a positive information rent under asym-
metric information because she receives a larger wage by exerting the same level of effort.
Intuitively, for the efficient worker to voluntarily choose the contract meant for her, rather
than that of the inefficient type, the firm must offer a positive rent. In contrast, the inefficient
worker is left with zero utility (no information rent) in both the symmetric and asymmetric
information contexts.
In the context of example 16.4, the efficient worker’s utility under asymmetric information is
$$u^{AI}_L = w^{AI}_L - \theta_L (e^{AI}_L)^2 = \frac{5}{8} - 1 \times \frac{1}{2} = \frac{1}{8},$$
whereas under symmetric information, her utility was
$$u^{SI}_L = w^{SI}_L - \theta_L (e^{SI}_L)^2 = \frac{1}{2} - 1 \times \frac{1}{2} = 0,$$
implying that under asymmetric information, she earns information rent (i.e., a higher utility
than under symmetric information). Regarding the inefficient worker, we find that her utility
under asymmetric information is
$$u^{AI}_H = w^{AI}_H - \theta_H (e^{AI}_H)^2 = \frac{1}{4} - 2 \times \frac{1}{8} = 0,$$
which coincides with her utility under symmetric information because
$$u^{SI}_H = w^{SI}_H - \theta_H (e^{SI}_H)^2 = \frac{1}{2} - 2 \times \frac{1}{4} = 0,$$
illustrating that this type of worker does not earn a higher utility under asymmetric than
symmetric information.
10. More generally, we can check that the wage to the low-cost worker is higher under asymmetric than symmetric
information, $w^{AI}_L > w^{SI}_L$. From these results (before plugging in specific numbers for our parameter values), we
need to show that $w^{AI}_L = \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]} > \frac{1}{2} = w^{SI}_L$. Simplifying this expression, we find that $(1 + \gamma)\theta_H - \theta_L >
\theta_H - (1 - \gamma)\theta_L$, leading to $\theta_H > \theta_L$, which is true by assumption.
11. One can check that, generally, the salary to the high-cost worker is lower under asymmetric than symmetric
information, $w^{AI}_H < w^{SI}_H$. From these results (before inserting specific numbers for our parameters), we need to
show that $w^{AI}_H = \frac{\gamma \theta_H}{2[\theta_H - (1 - \gamma)\theta_L]} < \frac{1}{2} = w^{SI}_H$. Simplifying this inequality, we obtain $\gamma \theta_H < \theta_H - (1 - \gamma)\theta_L$, which
leads to $\theta_L < \theta_H$, a condition that holds true by assumption.
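The utility comparison above can be summarized as each worker’s information rent, that is, her utility under asymmetric information minus her utility under symmetric information. The Python sketch below is an illustration only: it hard-codes the contracts from examples 16.3 and 16.4, and the names `utility`, `symmetric`, `asymmetric`, and `rents` are our own. It also checks an identity implied by the binding $PC_H$ and $IC_L$, namely that the low-cost worker’s rent equals $(\theta_H - \theta_L)(e^{AI}_H)^2$.

```python
from fractions import Fraction


def utility(wage, theta, effort_sq):
    """Worker utility u = w - theta * e**2, with effort passed in squared form."""
    return wage - theta * effort_sq


THETA_H, THETA_L = Fraction(2), Fraction(1)
thetas = {"H": THETA_H, "L": THETA_L}

# (wage, effort squared) pairs taken from examples 16.3 and 16.4.
symmetric = {"H": (Fraction(1, 2), Fraction(1, 4)), "L": (Fraction(1, 2), Fraction(1, 2))}
asymmetric = {"H": (Fraction(1, 4), Fraction(1, 8)), "L": (Fraction(5, 8), Fraction(1, 2))}

rents = {}
for worker_type in ("H", "L"):
    w_si, e_sq_si = symmetric[worker_type]
    w_ai, e_sq_ai = asymmetric[worker_type]
    theta = thetas[worker_type]
    rents[worker_type] = utility(w_ai, theta, e_sq_ai) - utility(w_si, theta, e_sq_si)

# Rent of the low-cost worker implied by the binding PC_H and IC_L.
rent_identity = (THETA_H - THETA_L) * asymmetric["H"][1]

print("information rents:", rents)
print("(theta_H - theta_L) * e_H^2 =", rent_identity)
```

Under these numbers the high-cost worker earns no rent, while the low-cost worker earns 1/8, which matches the identity.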
16.3.8 Preventing Adverse Selection

Given the inefficiencies from adverse selection, a natural question is what agents can do to
ameliorate these problems. We next present three common approaches to prevent adverse
selection, all of which help the uninformed agent to become better informed:
• Screening. In insurance markets, a typical tool that firms use to reduce asymmetric
information problems is identifying groups of individuals with a higher (lower) risk and
charging them a higher (lower) premium, but a low (high, respectively) deductible in case
of an accident.¹² Formally, we say that companies offer a menu (or list) of contracts, one
meant for individuals with high risk and another for individuals with low risk, and each
type of individual has incentives to select the contract meant for her. Intuitively, if you are
young and healthy, you may choose a health insurance with a low monthly premium but
a high deductible because you do not expect many doctor visits. If, instead, you are old
or have a serious medical condition, you may prefer a health plan with a relatively high
premium, but low deductibles. In summary, insurance companies design menus of con-
tracts that work as “screening devices” to identify which individuals are more or less risky
because individuals themselves have incentives to select the contract they prefer and, by
doing so, reveal their riskiness to the company.
• Signaling. Another common tool to prevent adverse selection problems is the informed
party (worker) doing something costly, such as earning a graduate degree, to signal her
type to the uninformed party (firm). In the principal-agent model, however, this signaling
from the worker to the firm can occur only if the worker has the ability to send a signal
before the firm offers the contract. Consider that the worker has an undergraduate degree
and has only two available actions at this point: earn a master’s degree in her field or
not. For simplicity, we assume that this degree does not change the worker’s productivity
in the firm that is considering hiring her, which helps us focus on the role of education
as a signaling device rather than as a productivity-enhancing tool. Assume, in addition,
that the efficient worker (that with parameter $\theta_L$) suffers a cost of $100 from earning this
master’s degree, while the inefficient worker (that with $\theta_H$) suffers a cost of $300. The
cost difference reflects that the efficient worker can finish her coursework faster, which
reduces her tuition costs as well as other opportunity costs of time while completing the
degree. While the firm is uninformed about the worker’s efficiency, observing a master’s
degree signals that the worker is more likely to be efficient because it is more costly for the
inefficient than the efficient worker to earn the degree. Education can then work as a signal
that the informed agent uses to convey information to the uninformed firm. As discussed in
section 16.3.3, a similar argument applies to the used-cars market, where a seller of high-
quality cars may offer a 3-year warranty that the seller of low-quality cars cannot profitably
match. Unfortunately, spending time and resources on acquiring a master’s degree is costly
for the worker, and it does not increase the worker’s productivity. In other words, education
helps convey information, but it only does that! Costly signaling then, while effective,
gives rise to its own inefficiencies relative to the complete information scenario analyzed
in section 16.3.5.
• Legal rules. Finally, most countries provide buyers with rights that can help ameliorate
adverse selection problems, such as laws requiring the seller to replace the object if it
breaks down during a certain period after the purchase. These laws are often known as
“implied warranties” because they do not need to be included in the purchasing contract.
12. For instance, life insurance companies go through underwriting that evaluates whether to give an applicant a
policy and determines her premium.
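The signaling argument in the list above can be illustrated with a small numerical check. The Python sketch below uses the degree costs stated in the text ($100 and $300) and adds one assumption that the text does not make: the firm pays a hypothetical wage premium to workers holding a master’s degree. The function names and the three sample premiums are illustrative only.

```python
COST_EFFICIENT = 100    # degree cost for the efficient worker (from the text)
COST_INEFFICIENT = 300  # degree cost for the inefficient worker (from the text)


def earns_degree(cost, wage_premium):
    """A worker earns the degree only if the premium strictly exceeds her cost."""
    return wage_premium > cost


def is_separating(wage_premium):
    """True when only the efficient worker finds the degree worthwhile."""
    return earns_degree(COST_EFFICIENT, wage_premium) and not earns_degree(
        COST_INEFFICIENT, wage_premium
    )


# Hypothetical wage premiums, not taken from the text.
for premium in (50, 200, 400):
    print(premium, is_separating(premium))
```

Under this assumption, the degree reveals the worker’s type only for premiums between the two costs: a premium that is too low attracts neither type, and one that is too high attracts both.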
Appendix. Showing That PCH and ICL Hold with Equality
In this short appendix, we show that two of the constraints in the profit-maximization
problem that the firm solves in contexts of adverse selection (hidden information) hold with
equality ($PC_H$ and $IC_L$); that is, they bind. In contrast, the other two constraints in the
firm’s problem hold with strict inequality ($PC_L$ and $IC_H$); that is, they are slack.
• $PC_L$ is slack. The incentive compatibility condition of the low-cost worker, $IC_L$, is
$w_L - \theta_L e_L^2 \geq w_H - \theta_L e_H^2$. Therefore, because $\theta_H > \theta_L$ by definition, we have
$$w_L - \theta_L e_L^2 \geq w_H - \theta_L e_H^2 \underbrace{>}_{\text{By } \theta_H > \theta_L} w_H - \theta_H e_H^2 \geq 0,$$
where the last inequality is $PC_H$.
Combining the first and last elements of the inequality yields $w_L - \theta_L e_L^2 > 0$. This result
coincides with $PC_L$. In other words, we just showed that $PC_L$ holds with strict inequality
(>) rather than with a weak inequality (≥) (i.e., $PC_L$ is slack).
• $IC_L$ binds. The incentive compatibility condition of the low-cost worker, $IC_L$, must hold
with equality (i.e., it must bind). Otherwise, the principal could reduce the wage that it
offers to the low-cost worker, still inducing her to take the contract meant for her rather
than the one meant for the high-cost worker. Therefore, $IC_L$ holds with equality, implying
that $w_L - \theta_L e_L^2 = w_H - \theta_L e_H^2$, which, rearranging, yields
$$w_L = w_H + \theta_L (e_L^2 - e_H^2).$$
• $IC_H$ is slack. The incentive compatibility condition of the high-cost worker, $IC_H$, says that
$$w_H - \theta_H e_H^2 \geq w_L - \theta_H e_L^2.$$
Using the binding $IC_L$, $w_L = w_H + \theta_L (e_L^2 - e_H^2)$, this expression can be rewritten as
$$w_H - \theta_H e_H^2 \geq \underbrace{w_H + \theta_L (e_L^2 - e_H^2)}_{w_L \text{ from } IC_L} - \theta_H e_L^2.$$
Canceling out $w_H$ on both sides of the inequality, and rearranging, we obtain
$$\theta_H (e_L^2 - e_H^2) - \theta_L (e_L^2 - e_H^2) \geq 0,$$
or, more compactly,
$$(\theta_H - \theta_L)(e_L^2 - e_H^2) \geq 0,$$
whose left side is strictly positive because $\theta_H > \theta_L$ by assumption, as long as the firm induces a higher
effort from the more efficient, low-cost worker than from the less efficient, high-cost worker
(i.e., $e_L > e_H$). Our analysis of the principal-agent problem under asymmetric information
must then find that $e_L > e_H$ in equilibrium (as we did in section 16.3.6 and example 16.4).
Otherwise, $IC_H$ would not necessarily hold strictly. Therefore, the original expression
for $IC_H$ satisfies $w_H - \theta_H e_H^2 > w_L - \theta_H e_L^2$, a condition that holds with strict inequality.
Intuitively, the high-cost agent has no incentive to take the contract meant for the low-cost
agent because doing so would entail a loss.
• PCH binds. The participation constraint of the high-cost worker, PCH , holds with equality
(i.e., it must bind). Otherwise, the firm can still reduce the wage offered to the high-cost
worker, while inducing her to take the contract meant for her.
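The four bullet points above can be verified at the contract found in example 16.4. The following Python sketch is a minimal check, not a proof: it assumes the menu from that example ($\theta_H = 2$, $\theta_L = 1$, $\gamma = \frac{1}{3}$), stores efforts in squared form, and uses the hypothetical helper `constraint_slacks` to compute the left side minus the right side of each constraint.

```python
from fractions import Fraction


def constraint_slacks(menu, theta_h, theta_l):
    """Return left side minus right side of each constraint; zero means binding."""
    w_h, e_h_sq = menu["H"]
    w_l, e_l_sq = menu["L"]
    return {
        "PC_H": w_h - theta_h * e_h_sq,
        "PC_L": w_l - theta_l * e_l_sq,
        "IC_H": (w_h - theta_h * e_h_sq) - (w_l - theta_h * e_l_sq),
        "IC_L": (w_l - theta_l * e_l_sq) - (w_h - theta_l * e_h_sq),
    }


# (wage, effort squared) for each contract in example 16.4.
menu = {"H": (Fraction(1, 4), Fraction(1, 8)), "L": (Fraction(5, 8), Fraction(1, 2))}
slacks = constraint_slacks(menu, Fraction(2), Fraction(1))

for name, slack in slacks.items():
    status = "binds" if slack == 0 else "slack" if slack > 0 else "violated"
    print(f"{name}: {slack} ({status})")
```

In this example the slack of $IC_H$ equals $(\theta_H - \theta_L)(e_L^2 - e_H^2) = \frac{3}{8}$, consistent with the expression derived in the appendix, and the two constraints ignored when solving the problem ($PC_L$ and $IC_H$) are indeed satisfied.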
Exercises
1. Moral hazard.A Give two real-world examples where moral hazard problems exist. In all
examples, identify the individuals/firms involved, their order of play, and the available actions.
2. Risk aversion–I.A Consider the situation in example 16.1.
(a) Suppose that the worker is risk neutral (i.e., u(w) = w). Find the optimal salaries, wH and wL in
this scenario, and compare them against those in example 16.1.
(b) Suppose that the worker is risk loving (i.e., $u(w) = w^2$). Find the optimal salaries, $w_H$ and
$w_L$ in this scenario, and compare them against those in example 16.1.
3. Risk aversion–II.A Consider the situation in example 16.2.
(a) Suppose now that the worker is risk neutral (i.e., u(w) = w). Find the optimal salaries, $w_H$ and
$w_L$ in this scenario, and compare them against those in example 16.2.
(b) Suppose now that the worker is risk loving, with utility function $u(w) = w^2$. Find the optimal
salaries, $w_H$ and $w_L$ in this scenario, and compare them against those in example 16.2.
4. Risk loving–I.B Amelia has been hired by Boeing to develop the first electric passenger airplane.
Amelia’s utility function is $u(w) = \left(\frac{w}{2}\right)^2$, where w denotes wage. She needs to balance her
professional work with her personal life; hence, she experiences a disutility from exerting effort at work,
e, measured by $g(e) = \sqrt{e}$, and her reservation utility is $\bar{u} = 0$. Amelia can choose between two
effort levels: high, $e_H = 16$, or low, $e_L = 4$. If she spends a lot of energy trying to develop
Boeing’s project, the probability that she is successful (unsuccessful) is 0.7 (0.3, respectively), which
generates a profit of $750 million ($100 million, respectively). However, if Amelia’s effort is low,
the project is successful (unsuccessful) with a probability of 0.2 (0.8, respectively). It is evident
that low effort implies that an electric passenger airplane is less likely to be developed by Boeing.
Identify the type of contract that Boeing should offer to Amelia to induce her to exert high effort.
5. Household chores–I.B Ana and Felix have decided that Felix will wash dishes every time Ana
prepares a meal. However, Ana cannot observe how careful Felix is when washing them. She knows
that when Felix is very meticulous (careless), the probability of having spotless dishes is 0.95 (0.2)
and having them dirty is 0.05 (0.8, respectively). Felix’s utility is measured by the time he spends
watching his favorite TV show each week (i.e., $u(t) = \sqrt{t}$), which depends on his success at washing
dishes. If he is very careful when washing dishes, his effort level is represented as eH = 1.8, and if
he is careless, it is eL = 0. Ana’s expected utility when Felix achieves spotless dishes is 20. Identify
the contract that Ana needs to offer Felix in terms of time spent watching his show. (Assume that
letting Felix watch his favorite TV show is costly for Ana because there is only one TV in the
house, and she does not like this particular TV show.)
6. Household chores–II.B Consider the situation in exercise 16.5. Suppose that Felix has purchased
a new dishwasher, at a utility cost of K = 1, that increased the probability of having spotless dishes
under careless effort to 0.4.
(a) Identify the contract that Ana needs to offer Felix in terms of TV time.
(b) Is Felix better off after purchasing this dishwasher? Why or why not?
(c) For what values of K, the cost of the dishwasher, would Felix be better off after purchasing the
dishwasher?
7. Basketball–I.A Consider a situation where a star basketball player is in negotiation for a new
contract. The player knows that if he exerts high (low) effort in a game, the probability that his
team wins the championship is 0.7 (0.3), which is worth $1 million to the ownership of the team.
The basketball player’s utility function can be expressed as $u(w) = w^{0.4}$, and he incurs a utility cost
of 150 when exerting high effort, and 50 when exerting low effort. Assume that his reservation
utility is zero.
(a) Identify the type of contract that the ownership should offer the star basketball player to induce
him to exert a high effort level.
(b) Find the team’s expected profits.
8. Basketball–II.B Consider the situation in exercise 16.7, but suppose that as the star player gets
older, the utility cost he incurs to exert high effort increases to 250.
(a) Identify the type of contract that the ownership should offer the star basketball player to induce
him to exert a high effort level.
(b) Should the ownership even offer this contract to the star player? Why?
9. Insurance market–I.B Consider a situation where an insurance firm wants to incentivize its pol-
icyholder to exert effort in prevention of risky behaviors. Suppose that when a policyholder exerts
high effort (eH = 10), she has a probability of 0.1 of experiencing an adverse event (e.g., an acci-
dent), which costs the insurance firm $10,000. A policyholder that exerts low effort (eL = 0) has
a probability of 0.2 of the same event happening. The utility function for a policyholder is $d - e^2$,
where d represents the size of any policy discounts the insurance company offers her (assume
that her reservation utility is equal to ū = 0). If the effort level of policyholders is observable by
the insurance firm, identify the type of contract that the insurance firm should offer to induce the
policyholder to exert high effort level.
10. Insurance market–II.B Consider the situation in exercise 16.9, but suppose that the effort level
of the policyholder is no longer observable to the insurance firm. Identify the type of contract that
the insurance firm should offer the policyholder to induce her to exert a high effort level.
11. Risk-averse lemons–I.B Consider the market for lemons presented in subsection 16.3.2. Suppose
that while both the buyer and seller can observe the car’s quality, the buyer is now risk averse, so
her utility from purchasing the car is now $\sqrt{q} - p$.
(a) What price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing to
sell to a risk-averse buyer? (Assume that the seller’s valuation does not change.)
12. Risky lemons.C Consider the market for lemons presented in subsection 16.3.2. Suppose that
while both the buyer and seller can observe the car’s quality, the buyer’s utility from purchasing
the car is now $q^{\alpha} - p$, where α ∈ [0, 1] measures risk aversion (e.g., when α = 1, the buyer is risk
neutral, whereas when $\alpha = \frac{1}{2}$, the buyer is risk averse).
(a) As a function of α, what price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing to
sell to a risk-averse buyer? (Assume that the seller’s valuation does not change.)
(c) What happens to the results of parts (a) and (b) as α increases?
13. Risk-averse lemons–II.B Consider the market for lemons presented in subsection 16.3.3, where
the buyer cannot observe car quality. Suppose that the buyer is now risk averse, so her utility from
purchasing the car is now $\sqrt{q} - p$.
(a) What price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing
to sell to a risk-averse buyer? (Assume that the seller’s valuation does not change.)
14. Used car market.B The used car market in Lemonville has two types of cars: high-quality cars
(H) and low-quality cars (L). A proportion r of the total used cars are H type. Denote the type
of car by K. So, for a typical car, K = H with probability r, and K = L with probability (1 − r).
The valuations of a K-type car to the seller and buyer are $K_s$ and $K_b$, respectively. That is, the
seller is willing to sell a K-type car at a price greater than or equal to $K_s$, and the buyer is
willing to buy it at a price less than or equal to $K_b$. We assume that $H_b > H_s > L_b > L_s > 0$. For
simplicity, assume that $H_b = 4$, $H_s = 3$, $L_b = 2$, and $L_s = 1$. The number of used cars for sale is
limited, but the demand for used cars is competitive (i.e., there is an infinite number of potential
buyers).
(a) Symmetric information. Suppose that both sellers and buyers observe the type of each
individual car in the market. Explain what will happen in the market.
(b) Asymmetric information. Now we assume asymmetric information: the seller observes the
type of the cars he sells, but the buyers do not. We assume risk-neutral buyers, so they will
buy if the price is less than the expected valuation of the car. Suppose that there is no effective
way of transmitting the information on the type of the cars to the buyers, so there will be only
one price in the used car market. Explain what will happen. In particular, find an expression
of the particular value $\bar{r}$, such that only L cars will be traded if $r < \bar{r}$. What happens if $r > \bar{r}$?
15. Risk neutrality–II.B Repeat exercise 16.14, but assume a risk-averse buyer, with utility function
$u(q) = \sqrt{q}$, where q > 0 denotes his income.
(a) Symmetric information. Suppose that both the sellers and the buyers know the type of each
individual car in the market. Explain what will happen in the market.
(b) Asymmetric information. Now we assume asymmetric information: the seller observes the
type of the cars he sells, but the buyers do not. We assume risk-neutral buyers, so they will
buy if the price is less than the expected valuation of the car. Suppose that there is no effective
way of transmitting the information on the type of the cars to the buyers, so there will be only
one price in the used car market. Explain what will happen. In particular, find an expression
of the particular value $\bar{r}$, such that only L cars will be traded if $r < \bar{r}$. What happens
if $r > \bar{r}$?
16. Risk neutrality–I.A Repeat the analysis in example 16.3, but assume that the worker is risk neutral
(i.e., their payoff from the contract is w − θe). How are equilibrium results affected? [Hint: Repeat
all the steps in subsection 16.3.5 to find the equilibrium effort levels and salaries, and then evaluate
your findings at the parameter values considered in example 16.3.]
17. Risk neutrality–II.B Repeat the analysis in example 16.4, but assume that the worker is risk
neutral (i.e., their payoff from the contract is w − θe). How are equilibrium results affected? [Hint:
Repeat all the steps in subsection 16.3.6 to find the equilibrium effort levels and salaries, and then
evaluate your findings at the parameter values considered in example 16.4.]
18. Optimal contracts.B Repeat the analysis in example 16.4, but assume that the worker’s reserva-
tion utility is 2, rather than zero. How are the equilibrium results affected? [Hint: Repeat all the
steps in subsection 16.3.6 to find the equilibrium effort levels and salaries, and then evaluate your
findings at the parameter values considered in example 16.4.]
19. Training.A Consider the results of example 16.4. Suppose that the high-cost worker could pay for
some training to lower her cost of effort to θL . How much would she be willing to pay to achieve
this?
20. Information premiums.B Consider the results of subsection 16.3.6.

(a) What happens to the equilibrium effort and wage levels as θH increases?
(b) What happens to the equilibrium effort and wage levels as θL increases?
(c) Intuitively, why does the price of one type of worker’s effort (θH or θL ) affect the wages or
effort of the other type of worker?
21. Adverse selection.A Give two real-world examples where adverse selection problems exist. In all
examples, identify the individuals/firms involved, their order of play, and the available actions.
17 Externalities and Public Goods
17.1 Introduction
This chapter examines other market imperfections: externalities and public goods.
Externalities arise when the actions of one agent (an individual, a firm, or a country) affect
the welfare of another agent. If the agent creating the externality ignores the effects that
her actions impose on other individuals, market mechanisms will not allocate resources
efficiently. A common example of a negative externality is pollution: a polluting firm may
ignore the effect that its emissions have on other firms or society and, as a result, produce
an excessive amount of emissions. Specifically, the firm generates more pollution than would
be socially optimal, as identified in this chapter. In such contexts, explicit coordination among
the affected parties may be recommended, and if this does not work, market intervention may be justified.
A similar argument applies to the case of public goods, which are goods and services
from which all individuals can benefit, and for which excluding noncontributors is either
unfeasible or extremely costly. A typical example of this type of good is national defense.
Once defense is provided, all of us can enjoy it, whether or not we paid our taxes. Therefore,
providing it to one more individual does not alter its cost, and excluding noncontributors is
unfeasible. Individuals understanding these features of public goods may have incentives to
free-ride because, at the end of the day, they cannot be easily excluded from enjoying the good.
We discuss these incentives, as well as the potential of public policy to correct them, next.
17.2 Externalities
Externalities The effect that the action of an agent has on the welfare of another
agent, beyond the effects transmitted by changes in prices.
Examples of negative externalities abound: a firm’s pollution of a river that is being used
by fishing farms downstream (production externality); a driver entering a highway at peak
hours, which increases the driving time for all drivers (consumption externality); or a
roommate streaming online, which slows down the internet speed of other roommates in
her network (consumption externality).
However, the decrease in market prices that occurs after one firm brings more units for sale
cannot be interpreted as an externality. To understand this point, note that if firm 1 produces
more units, the market price will decrease, hurting the profits of other companies selling the
same product. This effect is, however, transmitted via prices since a market for the good
exists. In the example about pollution affecting downstream fishing farms, however, markets
for pollution do not exist, and a similar argument applies to other examples on negative
externalities.
We can similarly find examples of positive externalities, such as an individual choos-
ing to vaccinate, thus helping other individuals around him be better protected against that
illness (consumption externality), or unpatented research and development (R&D) com-
pleted by a university or a firm, which can be used for free by other firms or research centers
to rapidly improve their own production processes and inventions (production externality).1
We next study the amount of externality generated under no regulation, and then compare
it against the optimal amount of externality (e.g., pollutant emissions) for society.
17.2.1 Unregulated Equilibrium

In the case of a negative externality, like a factory polluting a river, the polluter ignores
the effect that its actions have on other individuals, such as poor air quality for citizens in
the nearby area and larger costs for filtering water by a fishing farm downstream. If left
unregulated, this polluting firm would produce a large amount of pollution, which is not
necessarily optimal. In particular, the firm maximizes profits as example 17.1 illustrates.
Example 17.1: Unregulated equilibrium Consider a monopolist facing inverse
demand function p(q) = 10 − q, and total cost TC(q) = 2q. The firm maximizes its
profits as follows:
\[ \max_{q} \; (10 - q)q - 2q. \]
Differentiating with respect to q yields 10 − 2q − 2 = 0, which simplifies to 8 = 2q.
Solving for q, we obtain an output of
\[ q^{U} = 4 \text{ units}, \]
where superscript U denotes “unregulated equilibrium.”2
1. This was the case for laser technology which, within a few years of its invention, found initially unsuspected
applications, such as barcode reading in checkout counters at supermarkets, stores, and warehouses.
2. Because we assume an industry with a single firm, this approach coincides, of course, with our approach to find
profit-maximizing output under monopoly discussed in chapter 10. A similar approach would apply if the industry
[Figure 17.1: Pollution in the unregulated equilibrium. Horizontal axis: output q. The marginal profit curve, 8 − 2q, starts at a height of $8 and reaches zero at the unregulated equilibrium, \(q^{U} = 4\) units.]
Assuming that each unit of output generates α ≥ 0 units of pollution, the total
amount of pollution that this firm generates when left unregulated is 4α. Figure 17.1
graphically represents the firm’s problem: the firm increases output q until marginal
profits are zero; that is,
\[ \frac{\partial \pi}{\partial q} = 10 - 2q - 2 = 8 - 2q = 0. \]
The curve representing marginal profits, 8 − 2q, originates at a height of 8 and
decreases in q, crossing the horizontal axis at exactly qU = 4 units.3
Intuitively, the firm has no more profit opportunities beyond that level of output:
producing more than 4 units yields negative marginal profits (i.e., profits decrease),
whereas producing fewer than 4 units means that the firm could still increase output
and further increase its profits.
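The calculation in example 17.1 can be checked numerically. The following is a minimal sketch, assuming a linear inverse demand p(q) = a − bq and a constant marginal cost c; the function names (`profit`, `unregulated_output`), the parameter names a, b, and c, and the value of `alpha` are introduced here for illustration only and do not appear in the text.

```python
def profit(q, a=10.0, b=1.0, c=2.0):
    """Monopoly profit (a - b*q)*q - c*q for inverse demand p(q) = a - b*q."""
    return (a - b * q) * q - c * q


def unregulated_output(a=10.0, b=1.0, c=2.0):
    """Output at which marginal profit, a - 2*b*q - c, equals zero."""
    return (a - c) / (2.0 * b)


q_u = unregulated_output()   # default parameters are those of example 17.1
alpha = 0.5                  # hypothetical emissions per unit of output

print("unregulated output:", q_u)
print("total pollution:", alpha * q_u)
# Profit is lower both slightly below and slightly above q_u.
print(profit(q_u - 0.1) < profit(q_u) > profit(q_u + 0.1))
```

Changing the default arguments lets the same two functions handle other linear demand and cost parameters.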
Self-assessment 17.1 Repeat the analysis in example 17.1, but assume now the
inverse demand function changes to p(q) = 14 − q. Compare your results with those
in example 17.1.
was, instead, oligopolistic, where we would follow the tools learned in chapter 14. We consider this scenario in
one of the end-of-chapter exercises.
3. Recall that to find the crossing point with the horizontal axis, we only need to set this equation equal to zero,
8 − 2q = 0, and then solve for q to obtain 8 = 2q, or q = 8/2 = 4 units.
17.2.2 Social Optimum

How can we evaluate whether the unregulated amount of pollution is socially excessive or
not? We first examine how much pollution would be generated by a social planner who con-
siders both the firm’s profits and the externality that pollution imposes on other individuals
and firms. Example 17.2 describes this calculation.
Example 17.2: Finding the social optimum Continuing the scenario in example
17.1, assume that emissions e ≥ 0 generate an external cost of \(EC = 3e^{2}\),
which is increasing and convex in emissions. This implies that emissions are
damaging for individuals in the vicinity of the polluting factory, and at an increasing
rate; that is, the first ton of carbon dioxide (CO2) might just create fog in the area,
while the 10,000th ton creates serious health problems. Because emissions are defined
as e = αq, the external cost can be rewritten as \(EC = 3(\alpha q)^{2}\). For example, if every
unit of output generates α = 1/4 units of emissions, total emissions are e = q/4, implying
that external costs become \(EC = 3\left(\tfrac{1}{4} q\right)^{2} = \tfrac{3}{16} q^{2}\).
The social planner cares about society as a whole, thus considering the sum of firm
profits and external costs by solving the following problem:
\[ \max_{q} \; \underbrace{[(10 - q)q - 2q]}_{\text{Profits}} - \underbrace{3(\alpha q)^{2}}_{\text{External cost}}. \]
This problem essentially subtracts the external cost of pollution, \(EC = 3(\alpha q)^{2}\), from the
profits that the firm maximizes in example 17.1.
Differentiating with respect to q yields (10 − 2q − 2) − 6αq = 0, which simplifies
to 8 = q(2 + 6α). Solving for output q, we obtain that the social optimum is
\[ q^{SO} = \frac{8}{2 + 6\alpha}, \]
which is decreasing in the rate of emissions per unit of output, α. If every unit of output
generates 1 unit of emissions, α = 1, the social optimum is only \(q^{SO} = \frac{8}{2+6} = 1\)
unit. In contrast, when output does not generate any unit of emissions, α = 0, the
social optimum increases to \(q^{SO} = \frac{8}{2+0} = 4\) units, thus coinciding with the unregulated
equilibrium. Intuitively, external cost \(EC = 3(\alpha q)^{2}\) is zero when α = 0, and as
a consequence, the social planner’s maximization problem coincides with that of the
unregulated firm in example 17.1.
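The two special cases above (α = 1 and α = 0) can be reproduced with a short script. This is only a sketch that takes the marginal profit, 8 − 2q, and the marginal damage schedule, 6αq, reported in example 17.2 as given; the function names are hypothetical, and the intermediate value α = 0.5 is an illustrative assumption that the text does not discuss.

```python
def marginal_profit(q):
    """Marginal profit of the monopolist in example 17.1: 8 - 2q."""
    return 8.0 - 2.0 * q


def marginal_damage(q, alpha):
    """Marginal damage schedule reported in example 17.2: 6*alpha*q."""
    return 6.0 * alpha * q


def social_optimum(alpha):
    """Output at which the two schedules cross: 8 / (2 + 6*alpha)."""
    return 8.0 / (2.0 + 6.0 * alpha)


for alpha in (0.0, 0.5, 1.0):
    q_so = social_optimum(alpha)
    # The gap between the two schedules should be (numerically) zero at q_so.
    gap = marginal_profit(q_so) - marginal_damage(q_so, alpha)
    print(alpha, q_so, abs(gap) < 1e-9)
```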
Using a similar approach as in figure 17.1, figure 17.2 depicts the social planner’s
problem. For comparison purposes, it splits this problem into two parts: marginal
profit \(\frac{\partial \pi}{\partial q}\) and marginal damage \(\frac{\partial EC}{\partial q}\), where \(\frac{\partial \pi}{\partial q} = 8 - 2q\) as shown in figure 17.1,
whereas \(\frac{\partial EC}{\partial q} = 6\alpha q\) is a straight line starting from the origin and growing at a rate
of 6α. Leaving the firm unregulated yields the output level where the marginal profit
curve \(\frac{\partial \pi}{\partial q} = 8 - 2q\) crosses the horizontal axis at \(q^{U} = 4\). The social optimum, in contrast,
lies where marginal profit crosses marginal damage, at \(q^{SO} = \frac{8}{2+6\alpha}\). Increasing
output beyond \(q^{SO}\) would generate more external costs than profits and would thus be
inefficient, whereas decreasing output from \(q^{SO}\) would not be efficient either, because
a larger output would increase profits more significantly than external costs.
[Figure 17.2: Socially optimal pollution. Horizontal axis: output q. The marginal profit curve, 8 − 2q, starts at a height of $8 and reaches zero at q = 4; the marginal damage line, 6αq, starts at the origin; the two cross at \(q^{SO} = \frac{8}{2+6\alpha}\).]
Finally, note that the regulator does not necessarily recommend the prohibition of
the externality-generating activity. Indeed, the socially optimal output \(q^{SO} = \frac{8}{2+6\alpha}\)
decreases in α, but it does not become zero for any value of α. Even in the case in
which α = 100 (i.e., every ton of output generates 100 tons of emissions, a rather
unlikely scenario), the socially optimal output becomes \(\frac{8}{2 + (6 \times 100)} \approx 0.013\) units.
Self-assessment 17.2 Repeat the analysis in example 17.2, but assume the inverse
demand function changes to p(q) = 14 − q. Show that the socially optimal output qSO
is larger than in example 17.2. Interpret.
While example 17.2 presents a scenario in which pollution is never banned, the regulator
might recommend such a prohibition in other industries, as example 17.3 illustrates.
Example 17.3: Prohibiting pollution Consider example 17.2, but assume an external
cost of \(EC = 3e^{2} + 7e\), which is also increasing and convex in emissions e, but
yields a higher marginal damage than the external cost in example 17.2.4 The social
planner’s problem is analogous to that in the previous example:
\[ \max_{q} \; \underbrace{[(10 - q)q - 2q]}_{\text{Profits}} - \underbrace{[3(\alpha q)^{2} + 7\alpha q]}_{\text{External cost}}. \]
Differentiating with respect to q yields (10 − 2q − 2) − (6αq + 7α) = 0, which
simplifies to 8 − 7α = q(2 + 6α). Solving for output q, we obtain that the social optimum is
\[ q^{SO} = \frac{8 - 7\alpha}{2 + 6\alpha}. \]
It is straightforward to check that, like the socially optimal output of example 17.2,
this output is also decreasing in the rate at which output transforms into emissions, α,
because its derivative is
\[ \frac{\partial q^{SO}}{\partial \alpha} = \frac{-7(2 + 6\alpha) - 6(8 - 7\alpha)}{(2 + 6\alpha)^{2}} = -\frac{62}{4(1 + 3\alpha)^{2}} = -\frac{31}{2(1 + 3\alpha)^{2}}, \]
which is negative for all values of α. In contrast to \(q^{SO}\) in example 17.2, the output
found here can become negative if α is large enough. In particular, \(\frac{8 - 7\alpha}{2 + 6\alpha} < 0\) so long
as 8 − 7α < 0, or α > 8/7. Intuitively, if every unit of output generates slightly more
than 1 unit of emissions, the socially optimal output should be reduced to zero, thus
banning the pollution-generating activity.
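The ban threshold in example 17.3 can be explored with a short script. This is a sketch under the example’s own expressions: it evaluates the interior solution (8 − 7α)/(2 + 6α), truncates it at zero because output cannot be negative, and evaluates the derivative reported above. The function names and the sample values of α are illustrative assumptions.

```python
def social_optimum_with_ban(alpha):
    """Socially optimal output from example 17.3, truncated at zero.

    The interior solution is (8 - 7*alpha) / (2 + 6*alpha). When it is
    negative, the best feasible output is zero (the activity is banned).
    """
    interior = (8.0 - 7.0 * alpha) / (2.0 + 6.0 * alpha)
    return max(0.0, interior)


def d_q_d_alpha(alpha):
    """Derivative of the interior solution with respect to alpha."""
    return -31.0 / (2.0 * (1.0 + 3.0 * alpha) ** 2)


ban_threshold = 8.0 / 7.0   # output is zero for alpha at or above this value
for alpha in (0.0, 0.5, 1.0, ban_threshold, 2.0):
    print(round(alpha, 3), round(social_optimum_with_ban(alpha), 3))
```

For values of α above 8/7, the function returns zero, which corresponds to prohibiting the activity.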
Self-assessment 17.3 Repeat the analysis in example 17.3, but assume that the
external cost decreases to \(EC = 3e^{2} + 5e\). Find under which values of parameter α
the pollution-generating activity should be banned. Interpret.
4. To see this point, find the marginal damage in example 17.3, 6αq + 7α, and compare it with that in example
17.2, 6αq. The marginal damage in example 17.3 originates at 7α, while that in example 17.2 originates at zero.
In addition, the marginal damage curves are parallel to each other because differentiating with respect to q again yields
6α for both functions. Therefore, 6αq + 7α is parallel to 6αq but lies above it for all values of q.
17.3 Restoring the Social Optimum
A natural question is: how do you induce agents to internalize the externalities that their
actions impose on other individuals, rather than ignoring them completely in the unregulated
equilibrium? Two approaches are often suggested: letting the parties bargain, and market
intervention through government policy.
17.3.1 Bargaining between the Affected Parties

The following theorem, based on Coase (1960), identifies scenarios where bargaining can
be an effective tool to address externality problems.
Coase theorem The agents producing the externality and those affected by
the externality can negotiate, generating a socially optimal amount of external-
ity, if the following conditions hold: (1) all parties are perfectly informed about
each other’s benefits and costs; (2) the negotiation and transaction costs are
zero; (3) the amount of the externality is observable by a third party; and (4)
their agreement is enforceable. This result holds both when the property rights
of the resource are assigned to the agent generating the externality (the pol-
luter) and when they are assigned to the agent affected by the externality (the
victim).
To understand this bargaining possibility (known as the “Coase theorem”), let us examine
it in the context of an example. An upstream firm pollutes a river, affecting a fishing farm
that is located downstream. As water becomes more polluted, the fishing farm needs to
spend more resources in filtering water for its operations, thus giving rise to a production
externality because pollution affects the costs of the fishing farm. For presentation purposes,
we first analyze the case in which the property rights over the river are assigned to the fishing
farm.
Fishing farm. If the fishing farm owns the river, it would initially be completely clean; that
is, the externality-generating activity would be q = 0. Is this outcome efficient? No, because
the polluting firm could pay the fishing farm for an increase in the externality-generating
activity, from q = 0 to qSO . As depicted in figure 17.2, output levels between q = 0 and qSO
generate more profits for the polluting firm than the external costs that it imposes on the
fishing farm. Formally, the marginal profit curve \(\frac{\partial \pi}{\partial q}\) lies above the marginal damage curve
\(\frac{\partial EC}{\partial q}\), thus indicating that, to increase pollution, the polluting firm would be willing to pay
more than what the fishing farm needs as compensation. Beyond qSO , the polluting firm
would still obtain additional profits from further increases in pollution, but they are now
smaller than the additional compensation that the fishing farm needs in order to accept such
an increase in pollution. As a result, negotiating parties would reach an agreement at exactly
qSO units.5
Polluting firm. If, instead, the polluting firm owns the river, it would initially be com-
pletely dirty, as this firm would choose output level qU . We again ask ourselves whether
this outcome is efficient. And, again, our answer is no. In this case, the fishing farm could
pay the polluting firm for a decrease in the externality-generating activity, from qU to qSO .
As depicted in figure 17.2, output levels between qSO and qU = 4 generate a larger external
cost for the fishing farm than additional profits for the polluting firm. Formally, the marginal
damage curve \(\frac{\partial EC}{\partial q}\) lies above the marginal profit curve \(\frac{\partial \pi}{\partial q}\), thus indicating that, to decrease
pollution, the fishing farm would be willing to pay more than what the polluting
firm needs as compensation. If pollution were reduced below \(q^{SO}\), the fishing farm would obtain
additional reductions in external costs, but they are now smaller than the additional
compensation that the polluting firm needs to further decrease pollution. As a result, negotiating
parties would reach an agreement at exactly \(q^{SO}\) units.
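The bargaining logic under both assignments of property rights can be illustrated with numbers. The sketch below assumes α = 1, so that profits are (10 − q)q − 2q = 8q − q² and the external cost is 3q², with q = 1 socially optimal as in example 17.2. The payment ranges it prints are an illustration added here and are not computed in the text; all function and variable names are hypothetical.

```python
def firm_profit(q):
    """Polluting firm's profit (10 - q)*q - 2*q, which equals 8q - q^2."""
    return 8.0 * q - q ** 2


def farm_damage(q):
    """Fishing farm's external cost 3*(alpha*q)^2 evaluated at alpha = 1."""
    return 3.0 * q ** 2


q_u = 4.0    # unregulated output (example 17.1)
q_so = 1.0   # socially optimal output when alpha = 1 (example 17.2)

# Case 1: the fishing farm owns the river, so bargaining starts at q = 0.
polluter_gain = firm_profit(q_so) - firm_profit(0.0)
farm_loss = farm_damage(q_so) - farm_damage(0.0)
print("farm owns river: payment between", farm_loss, "and", polluter_gain)

# Case 2: the polluting firm owns the river, so bargaining starts at q_u.
farm_gain = farm_damage(q_u) - farm_damage(q_so)
polluter_loss = firm_profit(q_u) - firm_profit(q_so)
print("polluter owns river: payment between", polluter_loss, "and", farm_gain)

# Joint surplus (profit minus damage) on a grid is largest at q_so,
# regardless of which party holds the property rights.
grid = [i / 100.0 for i in range(0, 401)]
best_q = max(grid, key=lambda q: firm_profit(q) - farm_damage(q))
print("output maximizing joint surplus:", best_q)
```

In both cases, the payer is willing to offer more than the minimum the other party requires, so a mutually beneficial transfer exists and output settles at the same level.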
Hence, socially optimal output qSO emerges as the outcome of the bargaining process
between the agents, both when the polluting firm and when the fishing farm owns the river.
This is great news—agents alone, without government intervention, could reach the socially
optimal outcome, independent of how property rights are assigned. Under which cases can
we expect the Coase theorem to hold, then? Essentially, this theorem holds when its main
assumptions are satisfied:
• Zero negotiation costs. Negotiation costs tend to increase as more agents generate the
externality and more agents are affected by it. We can then expect negotiation costs to be
low when only a few agents are involved (e.g., a single polluting firm and a single firm
being affected by the externality, as in the fishing farm example), but otherwise, these
costs can be large.
• Well-defined property rights. In addition, we need property rights to be well defined, thus
allowing both parties to know who should be compensated for an increase or decrease of
the externality.
• Perfect information. Agents must be well informed about the benefits and costs that the
other party experiences from the externality. This is a rather restrictive assumption. In the
fishing farm example, for instance, it requires that the farm observe how beneficial each ton
of pollution is for the polluting firm, so that the fishing farm can assess how much to offer to
reduce pollution.
5. In addition, the polluting company will not have incentives to break the agreement (by polluting more than qSO )
because pollution is perfectly observable by a court of law, and the contract with the fishing farm is, by assumption,
enforceable.
• Observable pollution and enforceable contracts. In addition, the theorem requires that the
amount of pollution must be observable by a third party, such as a court of law, and the
contract must be enforceable in case one of the parties breaks it.
When any of these conditions does not hold, we can generally expect that the negotia-
tion does not generate an efficient amount of the externality; in these cases, government
intervention might be required. We examine these cases next.
17.3.2 Government Intervention

Public policy seeking to correct externalities often takes two forms: a quota, which sets an
upper limit on the amount of the externality that agents can generate (e.g., maximum tons
of CO2 that firms can emit per year, or maximum amount of fish that a fishing company can
appropriate); or emission fees, which increase the cost that the firm faces per unit of the
output generating externalities (e.g., the firm pays $7 per ton of cement being produced, as
this production generates emissions).6 We analyze the design of each policy tool next.
Emission quotas If the regulator seeks to induce a socially optimal output qSO from the
polluting firm, she can simply set an emission quota of exactly qSO . When the firm emits
less than qSO , no fines are imposed, whereas when the firm emits more, a hefty fine is
levied. In the case of example 17.2, for instance, the regulator only needs to set the emission
quota at \(q^{SO} = \frac{8}{2+6\alpha}\), where α denotes the rate at which every unit of output transforms
into emissions. For example, if α = 1/3, the emission quota would be \(q^{SO} = \frac{8}{2 + 6 \cdot \frac{1}{3}} = 2\) tons
of CO2.
Emission fees In this scenario, if the regulator seeks to induce a socially optimal output
qSO , she only needs to set an emission fee t that induces the firm to produce exactly qSO .
How can she calculate the exact amount of fee t that achieves this goal? By anticipating
the firm’s production behavior, the regulator knows how the firm reacts to the emission fee
(which increases its unit cost by t). Example 17.4 illustrates the design of emission fees
when the regulator faces the polluting monopolist from example 17.2.
Example 17.4: Finding optimal emission fees From examples 17.1 and 17.2, a
polluting monopolist faces a linear demand p(q) = 10 − q and marginal costs of 2.
Consider that the regulator seeks to induce this socially optimal output of qSO = 2 tons
6. Examples include emission fees imposed by the Environmental Protection Agency (EPA) on coal-fired power
plants in 2008, estimated to increase the cost of every megawatt-hour generated in plants using pulverized coal
by around $33 because of its CO2 emissions and another $0.67 because of its sulphur dioxide (SO2 ) and nitrogen
oxide (NOx ) emissions.
of CO2. She then faces a two-stage game: in the first stage, the regulator sets emission
fee t; and, in the second stage, observing this fee, the polluting firm responds
by choosing its output q. We next solve this sequential-move game by applying backward
induction, so we start by analyzing the second stage.
Second stage. If the regulator sets a fee t on every unit of output, the monopolist’s
profit-maximization problem becomes
\[ \max_{q} \; (10 - q)q - (2 + t)q, \]
where the firm’s total cost increases from 2q under no regulation to (2 + t)q under
regulation. Differentiating with respect to q, we obtain 10 − 2q − (2 + t) = 0. Solving
for q, we find that the monopolist’s output is
\[ q(t) = \frac{8 - t}{2}. \]
When fees are absent (t = 0), output becomes q(0) = 4 units, as in the unregulated
equilibrium of example 17.1. However, when the firm is subject to a positive fee t > 0, its
output decreases in the severity of the fee.
First stage. The regulator sets the emission fee in the first stage, while the firm
responds to that fee in the second stage. The regulator can then put herself in the
shoes of the monopolist, anticipating the output that maximizes the firm’s profits,
\(q(t) = \frac{8 - t}{2}\), and set it equal to the socially optimal output \(q^{SO} = 2\) tons of CO2 that she
seeks to induce. That is,
\[ \frac{8 - t}{2} = 2. \]
Rearranging this equation, we find 8 − t = 4 and, solving for emission fee t, obtain
t = $4. To confirm that this fee induces the firm to produce the socially optimal output
(2 tons), we can insert the fee t = $4 into the firm’s output function, \(q(t) = \frac{8 - t}{2}\), to
obtain \(q(4) = \frac{8 - 4}{2} = 2\) units. Therefore, by setting a fee of $4 per unit of output,
the regulator increases the monopolist’s costs, which ultimately induces the firm to
voluntarily produce the socially optimal output.7
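The two stages of example 17.4 translate directly into two small functions. The following is a minimal sketch, assuming the linear demand intercept a = 10 and marginal cost c = 2 from the example; the function names and parameter names are introduced here for illustration.

```python
def output_given_fee(t, a=10.0, c=2.0):
    """Second stage: the monopolist's output when it pays fee t per unit,
    from the first-order condition a - 2q - (c + t) = 0."""
    return (a - c - t) / 2.0


def fee_for_target(q_target, a=10.0, c=2.0):
    """First stage: the fee t solving (a - c - t) / 2 = q_target."""
    return (a - c) - 2.0 * q_target


t_star = fee_for_target(2.0)   # target output from example 17.4
print("fee:", t_star)
print("induced output:", output_given_fee(t_star))
print("output without a fee:", output_given_fee(0.0))
```

A target above the unregulated output of 4 units returns a negative fee, that is, a subsidy per unit of output.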
7. In the case of a positive externality, such as vaccinations, clean air, or education, the socially optimal output qSO
is actually larger than the output that firms choose in the unregulated equilibrium, qSO > qU . In such a context, the
optimal emission fee that the regulator finds becomes negative, thus indicating that she needs to provide a negative
tax (i.e., a subsidy) per unit of output to induce firms to increase their production toward the optimal output level
qSO . In the scenario of example 17.4, consider, for instance, that qSO = 20, and solve for t to find the optimal
subsidy per unit of output.
Self-assessment 17.4 Consider your results in self-assessment question 17.2.

Following the steps in example 17.4, find the emission fee t that induces firms to
produce the socially optimal output qSO .
For a more detailed analysis of externality problems, and how to correct them using var-
ious instruments, see Kolstad (2010). For a more technical presentation, see Phaneuf and
Requate (2016).
17.4 Public Goods
The term “public goods” refers to goods and services that are nonrival (their consumption
by one individual does not reduce the amount of the good available to other individuals)
and nonexcludable (preventing an individual from enjoying the good is extremely expen-
sive or impossible). A common example is national defense, because my consumption
does not reduce your consumption, and if you were to not pay your taxes tomorrow, it
would be essentially impossible for the government to prevent you from enjoying national
defense, even if you didn’t help in its funding.8 Another common example is clean air,
because that also satisfies the two features of nonrivalry (your consumption of clean air
does not reduce my own) and nonexcludability (how can you be prevented from enjoying
clean air?).9
In contrast, goods that do not satisfy either property are “private goods,” such as an apple,
because its consumption is rival (if you eat it, I cannot enjoy the same apple) and excludable
(if you don’t pay for an apple, you cannot eat it). You might be wondering: “What if only
one of the two features holds?” Table 17.1 illustrates the taxonomy of cases that emerge
when combining these two features, with rivalry in rows and excludability in columns. Four
cases arise:
• Public goods (nonrival and nonexcludable).
• Private goods (rival and excludable).
• Club goods (nonrival and excludable).
• Common-pool resources (rival but nonexcludable).
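The four cases can be summarized as a simple lookup on the two properties. The snippet below is only an illustrative sketch; the function name `classify_good` and the dictionary of examples are hypothetical, with the examples taken from table 17.1.

```python
def classify_good(rival: bool, excludable: bool) -> str:
    """Return the category of a good from its rivalry and excludability."""
    if rival and excludable:
        return "private good"
    if rival and not excludable:
        return "common-pool resource"
    if not rival and excludable:
        return "club good"
    return "public good"


# (rival, excludable) for the examples listed in table 17.1
examples = {
    "apple": (True, True),
    "fishing ground": (True, False),
    "gym": (False, True),
    "national defense": (False, False),
}
for name, (rival, excludable) in examples.items():
    print(name, "->", classify_good(rival, excludable))
```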
A “club good,” such as a gym, is nonrival because the good can be enjoyed by sev-
eral members without affecting each other’s utility, unless the gym becomes too crowded.
8. Well, the government could deport you so you don’t get to enjoy national defense, but this is not a penalty for
tax evasion. At least yet!
9. Many other examples abound, such as public fireworks, official statistics, and publicly available inventions
through unpatented R&D.
Table 17.1
Taxonomy of public goods.
            Excludable              Nonexcludable
Rival       Private goods           Common-pool resources
            (Example: apples)       (Example: fishing grounds)
Nonrival    Club goods              Public goods
            (Example: gyms)         (Example: national defense)
In addition, it is excludable since the gym owners can easily prevent nonmembers from
entering the center by requiring users to show a membership card.10 In contrast, common-
pool resources (such as forests, aquifers, hunting grounds, and fishing grounds) are rival
because the exploitation of the resource by one agent reduces the stock available for other
agents (e.g., if a fisherman catches 1 more ton of fish, other fishermen in the area may need
to incur higher costs to catch the same amount of fish).
Nonexcludable goods, such as public goods and common-pool resources, result in agents
exhibiting free-riding behavior, in which consumers do not pay for the goods because they
expect that others will pay.11 Example 17.5 illustrates free-riding in a familiar situation.
Example 17.5: Free-riding of public goods Consider two roommates cleaning

their apartment on a Saturday. Every roommate i simultaneously and independently
chooses the number of hours she spends cleaning, \(h_i\), where \(h_i \in [0, 24]\) because she
cannot spend more than 24 hours a day, and her utility from cleaning is given by
\[ u_i(h_i, h_j) = \underbrace{(24 - h_i)}_{\text{Leisure}} + \underbrace{\beta h_i (h_i + h_j)}_{\text{Cleaner apartment}}. \]
The first term here indicates the number of hours she enjoys in leisure, 24 − hi
(i.e., the hours she does not spend cleaning). For instance, if she spends hi = 2 hours
cleaning the apartment, she has 24 − 2 = 22 hours left in the day for leisure (or at
least activities other than cleaning). The second term, instead, reflects the benefit that
10. A more recent example of club goods is satellite TV, or pay-TV channels, because their consumption is nonrival
(if you watch my favorite TV series, my consumption is not reduced), but it is excludable because you cannot watch
a specific TV channel if you did not pay for it. Generally, most types of copyrighted works, such as books, movies,
and software, are club goods because they all satisfy nonrivalry and excludability.
11. Common examples of free-riding are Public Broadcasting Service (PBS), with 100 million viewers and only
4 million contributors, and National Public Radio (NPR), with 22 million listeners and only 3 million contribu-
tors. Another example is individual effort in a team project for which all the participants receive the same grade.
Remember the last course you took that included a team project—did you free-ride off your teammates’ effort? Or
were your teammates free-riding off of you?
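[Figure 17.3: Individual i’s best response function. Vertical axis: \(h_i\), with intercept \(\frac{1}{2\beta}\); horizontal axis: \(h_j\), with intercept \(\frac{1}{\beta}\).]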
she obtains from living in a cleaner apartment, which increases in both the hours she
dedicates to cleaning, hi , and the hours her roommate spends cleaning, hj . This benefit
is increasing in parameter β > 0.
In particular, roommate i chooses the hours she spends cleaning, \(h_i\), to maximize
her utility \(u_i(h_i, h_j)\). Differentiating with respect to \(h_i\), we obtain
\[ -1 + 2\beta h_i + \beta h_j = 0. \]
Rearranging this expression yields \(2\beta h_i = 1 - \beta h_j\), and solving for \(h_i\), we find
\(h_i = \frac{1 - \beta h_j}{2\beta}\), which we can express as
\[ h_i = \frac{1}{2\beta} - \frac{1}{2} h_j. \]
Because this expression determines the optimal cleaning time for individual i, hi , as
a function of individual j’s cleaning time, hj , it can be understood as a “best response
function,” similar to those we found in the oligopoly markets of chapter 14. Specifically,
when individual j spends no time cleaning, \(h_j = 0\), individual i responds with
\(h_i = \frac{1}{2\beta}\), as depicted in the vertical intercept of figure 17.3; and when individual j
increases her cleaning time, individual i responds by decreasing her own because i can
free-ride off j’s cleaning time (which explains the negative slope of the best response
function in figure 17.3). When individual j increases her cleaning time until \(h_j = \frac{1}{\beta}\)
(or beyond that), individual i responds by spending no time cleaning, as illustrated in
the horizontal intercept of figure 17.3.12
12. To illustrate this point, we only need to set the best response function equal to zero, \(\frac{1}{2\beta} - \frac{1}{2} h_j = 0\), and solve
for \(h_j\), obtaining \(h_j = \frac{1}{\beta}\).
A symmetric best response function applies to individual j, \(h_j(h_i)\). Invoking symmetry,
we can obtain the symmetric equilibrium cleaning time, where \(h_i^{*} = h_j^{*} = h^{*}\).
Inserting \(h^{*}\) into \(h_i(h_j)\) yields \(h^{*} = \frac{1 - \beta h^{*}}{2\beta}\), or \(2\beta h^{*} = 1 - \beta h^{*}\). Solving for \(h^{*}\)
entails
\[ h^{*} = \frac{1}{3\beta}. \]
Therefore, every roommate spends \(\frac{1}{3\beta}\) hours cleaning. For instance, if β = 1/10, every
individual would spend \(\frac{1}{3(1/10)} \approx 3.3\) hours cleaning on Saturday.
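The symmetric equilibrium can also be located numerically. The sketch below takes the best response function derived in example 17.5 as given and iterates it from zero hours until both roommates’ choices settle. Iteration is an assumption made here for illustration (the text solves the system algebraically), the function names are hypothetical, and the 24-hour upper bound is ignored because it does not bind when β = 1/10.

```python
def best_response(h_other, beta):
    """Best response from example 17.5: 1/(2*beta) - h_other/2,
    truncated at zero hours of cleaning."""
    return max(0.0, 1.0 / (2.0 * beta) - 0.5 * h_other)


def symmetric_equilibrium(beta, rounds=60):
    """Repeatedly apply both best responses, starting from zero hours."""
    h_i, h_j = 0.0, 0.0
    for _ in range(rounds):
        h_i, h_j = best_response(h_j, beta), best_response(h_i, beta)
    return h_i, h_j


beta = 0.1
h_i, h_j = symmetric_equilibrium(beta)
print(h_i, h_j)               # numerical fixed point
print(1.0 / (3.0 * beta))     # closed-form solution 1/(3*beta)
```

Because each best response has a slope of −1/2, the distance to the fixed point halves in every round, so the iteration converges quickly.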
Self-assessment 17.5 Repeat the analysis in example 17.5, but assume that every
roommate has only 12 hours (rather than 24), so the benefit of leisure in her utility
function decreases to 12 − hi . How are the results in example 17.5 affected?
How can free-riding be prevented? A common policy tool is to require all individuals to
pay for the provision of the good via taxes rather than voluntary contributions. This tool is
often criticized, as it requires both users and nonusers of a public good (e.g., a highway) to
pay for it. A less extreme tool is to require users to pay a certain amount every time they use
the good. In the case of highways, for instance, drivers must pay tolls every time they access
a road, which essentially transforms the nature of the good from nonexcludable (freeway)
to excludable (controlled-access highway). Besides helping fund the highway, tolls are also
used to alleviate traffic congestion, as they vary significantly according to the time of day the
driver enters the highway, reaching their highest (lowest) dollar amount during peak (off-peak)
hours, when the most (least) traffic congestion occurs.
Examples include highways in California, Chile, Brazil, and Singapore, where each driver
installs a transponder on her car’s windshield, adding funds to it online. Drivers do not need to slow
down when passing under the transponder’s reader (usually a large arch at the entry point
of the highway), which makes traveling through the toll area more convenient. At several
points close to the highway, drivers are informed of the toll price for that time of day so that
they can decide whether to access the highway or not.13
13. For a more detailed presentation of public goods and policy, see Hindriks and Myles (2013).
17.4.1 A Look at Behavioral Economics—Public-Good Experiments

Several researchers have studied public-good games in controlled experiments in many
countries. In a typical experiment, every individual is asked to sit at a computer termi-
nal and is presented with a game in which she can independently choose how many dollars
(or tokens) to contribute to a private account (which only she can enjoy) or to a public
account (which provides benefits to all individuals in the group, thus capturing the nonrival
property). Overall, these experiments found that individuals tend to make relatively high
donations to the public good, but these contributions can decrease rapidly as individuals
interact during several rounds. However, average contributions increase as the benefit from
the public good increases.14
17.5 Common-Pool Resources
In this section, we investigate equilibrium and socially optimal appropriation in a common-pool
resource like a fishing ground, a forest, or an aquifer. Assume that N individuals have
access to the resource. Every unit of appropriation (e.g., 1 ton of fish) is sold in the inter-
national market which, for simplicity, is assumed to be perfectly competitive. Intuitively,
every fisherman’s appropriation (e.g., 20 tons of cod) represents a small share of industry
catches and thus does not affect market prices for this variety of fish. As a result, every firm
takes the market price p as given, which we normalize to p = $1 to facilitate this analysis.
In addition, every firm faces the following cost function:
\[
C(q_i, Q_{-i}) = \frac{q_i (q_i + Q_{-i})}{S}, \tag{17.1}
\]
where \(Q_{-i} = \sum_{j \neq i} q_j\) represents the sum of all appropriations by individuals other than i. For
instance, when only two fishermen exploit the resource (fisherman 1 and 2), the cost function
in equation (17.1) simplifies to
\[
C(q_1, q_2) = \frac{q_1 (q_1 + q_2)}{S}
\]
for fisherman 1 (so that \(Q_{-1} = q_2\)), and similarly \(C(q_2, q_1) = \frac{q_2 (q_2 + q_1)}{S}\) for fisherman 2 (so
that \(Q_{-2} = q_1\)).15 In addition, S > 0 denotes the stock of the resource. Intuitively, a more
abundant resource (higher S) decreases fisherman i’s cost because fish is easier to catch.
Importantly, this cost function is increasing in fisherman i’s own appropriation, qi , and in
14. For a recent survey of these experiments, see Vesterlund (2014).

15. In the case of three fishermen, Q−i becomes Q−1 = q2 + q3 for fisherman 1, Q−2 = q1 + q3 for fisherman 2,
and Q−3 = q1 + q2 for fisherman 3. In addition, note that the market can still be perfectly competitive if several
other fishermen, located in other fishing grounds but appropriating the same type of fish, sell their catches in the
international market.
his rivals’ appropriations, Q−i.16 Intuitively, the fishing ground becomes more depleted as
other firms appropriate fish, making it more difficult for fisherman i to catch fish.
Therefore, every fisherman chooses its appropriation level qi to maximize its profits as
follows:
\[
\max_{q_i} \; \pi_i = q_i - \frac{q_i (q_i + Q_{-i})}{S},
\]
where the first term represents the fisherman’s revenue from additional units of appropriation
(recall that, for simplicity, the price of every unit was normalized to p = $1), and the second
term indicates the total cost that the fisherman incurs when appropriating qi units of fish,
while his rivals appropriate Q−i units.
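As a quick numerical illustration of the cost function in equation (17.1) and the profit expression above, the following minimal sketch evaluates both for illustrative values. The function names `cost` and `profit` and the numbers used are assumptions made for this example only; they do not appear in the text.

```python
def cost(q_i, q_others, stock):
    """Cost from equation (17.1): q_i * (q_i + Q_-i) / S."""
    return q_i * (q_i + q_others) / stock


def profit(q_i, q_others, stock, price=1.0):
    """Revenue (price normalized to 1) minus the cost in equation (17.1)."""
    return price * q_i - cost(q_i, q_others, stock)


# Illustrative values (assumed): fisherman i catches 10 tons, his rivals
# catch 80 tons in total, and the stock is S = 100 tons.
print(cost(10, 80, 100))    # 9.0
print(profit(10, 80, 100))  # 1.0

# The cost rises when rivals appropriate more (the cost externality).
print(cost(10, 90, 100))    # 10.0
```

Holding fisherman i's own catch fixed, the higher cost in the last line reflects that a more depleted fishing ground makes each ton harder to catch.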
17.5.1 Finding Equilibrium Appropriation

Differentiating with respect to qi in the above maximization problem for fisherman i, we
obtain
\[
\underbrace{1}_{MR} - \underbrace{\frac{2q_i + Q_{-i}}{S}}_{MC} = 0.
\]
Intuitively, the first term captures the marginal revenue from catching additional units of
fish, MR, whereas the second term indicates the marginal cost that the firm experiences
from these additional catches, MC. That is, the fisherman increases his appropriation until
the marginal revenue and cost exactly offset each other. Rearranging this expression yields
\(S = 2q_i + Q_{-i}\), and solving for \(q_i\), we find
\[
q_i(Q_{-i}) = \frac{S}{2} - \frac{1}{2} Q_{-i}. \tag{$BRF_i$}
\]
Intuitively, qi (Q−i ) represents fisherman i’s best response function because it describes
how many units to appropriate, \(q_i\), as a response to how many units his rivals appropriate,
\(Q_{-i}\). In particular, he appropriates half the available stock (\(\frac{S}{2}\)) when his rivals do not appropriate
any units (\(Q_{-i} = 0\)), but his appropriation decreases as his rivals appropriate positive
amounts, \(Q_{-i} > 0\), as depicted in figure 17.4.17
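The best response function can be checked numerically. The sketch below compares the closed-form expression with a brute-force search over a grid of catches. The helper names, the grid size, and the truncation at zero (so that catches are never negative when rivals appropriate more than S) are assumptions added for illustration.

```python
def best_response(q_others, stock):
    """Closed-form best response S/2 - Q_-i/2, truncated at zero (assumed)."""
    return max(stock / 2 - q_others / 2, 0.0)


def grid_best_response(q_others, stock, steps=10_000):
    """Profit-maximizing catch found by searching a grid on [0, S]."""
    grid = [stock * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda q: q - q * (q + q_others) / stock)


# Illustrative values (assumed): S = 100 tons, rivals catch 40 tons.
print(best_response(40, 100))       # 30.0
print(grid_best_response(40, 100))  # 30.0

# Intercepts of the best response function.
print(best_response(0, 100))        # 50.0, half the stock
print(best_response(100, 100))      # 0.0
```

The two approaches agree, and the last two lines reproduce the vertical intercept (S/2) and the horizontal intercept (a zero catch when rivals appropriate Q−i = S).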
Firms are symmetric in this scenario because they face the same price for each unit of
fish ($1) and the same cost function. Therefore, the best response function of any other
16. To confirm this result mathematically, note that the cost function \(C(q_i, Q_{-i})\) satisfies \(\frac{\partial C(q_i, Q_{-i})}{\partial q_i} = \frac{2q_i + Q_{-i}}{S}\),
which is positive for all appropriation levels; and \(\frac{\partial C(q_i, Q_{-i})}{\partial Q_{-i}} = \frac{q_i}{S}\), which is also positive for all appropriation
levels.
17. Recall that, in order to find the horizontal intercept of the best response function shown in figure 17.4, we
only need to set it equal to zero, \(\frac{S}{2} - \frac{1}{2} Q_{-i} = 0\), rearrange, \(\frac{S}{2} = \frac{1}{2} Q_{-i}\), and solve for \(Q_{-i}\) to obtain \(Q_{-i} = S\). Intuitively,
this point represents that if fisherman i’s rivals appropriate all the available stock, \(Q_{-i} = S\), then fisherman
i responds by not appropriating anything (\(q_i = 0\)).
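Figure 17.4
Fisherman i’s best response function. (Vertical axis: \(q_i\), with intercept \(\frac{S}{2}\); horizontal axis: \(Q_{-i}\), with intercept S; line: \(BRF_i\).)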
firm j (where \(j \neq i\)) is symmetric to the best response function discussed previously (i.e.,
\(q_j(Q_{-j}) = \frac{S}{2} - \frac{1}{2} Q_{-j}\)), so we only change the subscript.
In a symmetric equilibrium, each fisherman appropriates the same amount of fish, implying
that \(q_1^* = q_2^* = \dots = q_N^* = q^*\), which helps us ignore the subscripts because all the firms’
catches coincide. Therefore, \(Q_{-i}^*\) becomes \(Q_{-i}^* = \sum_{j \neq i} q^* = (N - 1) q^*\), given that we sum
over all N − 1 fishermen other than i. Inserting this result in the best response function
yields
\[
q^* = \frac{S}{2} - \frac{1}{2} (N - 1) q^*, \tag{17.2}
\]
which is now a function of \(q^*\) alone (recall that the stock S and the number of fishermen N
are both parameters, as opposed to \(q^*\), which is the only variable we seek to solve for).
Rearranging equation (17.2) yields \(\frac{2q^* + (N - 1) q^*}{2} = \frac{S}{2}\), or \((N + 1) q^* = S\), which, solving
for \(q^*\), entails an equilibrium appropriation of
\[
q^* = \frac{S}{N + 1}.
\]
For instance, if the stock is S = 100 tons of fish and N = 9 fishermen, equilibrium appropriation
becomes \(q^* = \frac{100}{9 + 1} = 10\) tons. Generally, the equilibrium appropriation \(q^*\) increases
in the stock of the resource, S, but decreases in the number of firms competing for the
resource, N.
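The equilibrium result can be verified with exact arithmetic. The following sketch computes \(q^* = S/(N+1)\) and confirms that it is a fixed point of the best response function, that is, each fisherman's best response when the other N − 1 fishermen each catch \(q^*\). The function names are assumptions for this example, and the comparative-statics values other than S = 100 and N = 9 are illustrative.

```python
from fractions import Fraction


def best_response(q_others, stock):
    """Best response function: S/2 - Q_-i/2."""
    return Fraction(stock, 2) - Fraction(1, 2) * q_others


def equilibrium_appropriation(stock, n_fishermen):
    """Symmetric equilibrium appropriation q* = S / (N + 1)."""
    return Fraction(stock, n_fishermen + 1)


# Example from the text: S = 100 tons and N = 9 fishermen.
q_star = equilibrium_appropriation(100, 9)
print(q_star)  # 10

# q* is a mutual best response: when the other 8 fishermen each catch q*,
# fisherman i's best response is also q*.
print(best_response((9 - 1) * q_star, 100) == q_star)  # True

# Comparative statics with illustrative values (assumed).
print(equilibrium_appropriation(200, 9))  # 20, a larger stock raises q*
print(equilibrium_appropriation(100, 4))  # 20, fewer fishermen raise q*
```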
Self-assessment 17.6 Repeat the analysis in subsection 17.5.1, but assume N =

12 fishermen and a stock of S = 230 tons of fish. What if the number of fishermen
increases to N = 14? What if, still with N = 12 fishermen, the stock of fish increases
to S = 250 tons? Interpret.
17.5.2 Common-Pool Resources—Joint Profit Maximization

A natural question at this point is whether equilibrium appropriation is excessive or, in other
words, whether fishermen can increase their profits if they coordinate their catches. As we show next,
the answer is yes. For simplicity, we focus on the case of two fishermen, N = 2, but a similar
argument applies to common-pool resources with more fishermen. When fishermen 1 and
2 coordinate their catches, they maximize their joint profits as follows:
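\[
\max_{q_1, q_2} \; \pi_1 + \pi_2 = \underbrace{q_1 - \frac{q_1 (q_1 + q_2)}{S}}_{\pi_1} + \underbrace{q_2 - \frac{q_2 (q_2 + q_1)}{S}}_{\pi_2},
\]
which simplifies to
\[
\max_{q_1, q_2} \; (q_1 + q_2) - \frac{(q_1 + q_2)^2}{S}.
\]
Differentiating with respect to \(q_1\), we obtain
\[
1 - \frac{2 (q_1 + q_2)}{S} = 0, \tag{17.3}
\]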
and the same result occurs after differentiating with respect to \(q_2\). Intuitively, the first
term represents the marginal revenue from additional catches, while the second term
captures the marginal cost of these catches, \(\frac{2 (q_1 + q_2)}{S}\). Relative to the previous individual decision
problem, where fisherman 1’s marginal cost was \(\frac{2 q_1 + q_2}{S}\), increasing catches now produces a larger
marginal cost because every fisherman takes into account not only the increase in his own costs, but also the
increase in his rival’s costs, \(\frac{q_2}{S}\). In short, every fisherman now internalizes the cost externality
that his appropriation generates on other fishermen, as larger \(q_i\) increases the cost of
fisherman j.18
Rearranging equation (17.3), we obtain \(S = 2 (q_1 + q_2)\), and solving for \(q_1\) we find
\[
q_1(q_2) = \frac{S}{2} - q_2. \tag{17.4}
\]
As depicted in figure 17.5, this line originates at the same height as fisherman i’s best
response function in figure 17.4, \(\frac{S}{2}\), but decreases in his rival’s appropriation faster than
that in figure 17.4, thus lying below it. This indicates that, for a given amount of appropriation
from firm 2, \(q_2\), firm 1 chooses to appropriate fewer units when firms coordinate their
18. Unlike in the collusive behavior analyzed in chapter 14, where firms’ decision to reduce their output pro-
duces an increase in market prices, fishermen’s coordination does not affect the market prices because we consider
that such prices are given on the international market. If, instead, fishermen had market power, they would have
stronger incentives to coordinate their output decisions because, besides internalizing the cost externality, they
could increase market prices.
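Figure 17.5
Equilibrium versus joint-profit maximization in the commons. (Vertical axis: \(q_i\); horizontal axis: \(Q_{-i}\). Both lines start at \(\frac{S}{2}\) on the vertical axis: the best response function \(q_i(Q_{-i}) = \frac{S}{2} - \frac{1}{2} Q_{-i}\) reaches the horizontal axis at S, while the joint-profit line \(q_i(Q_{-i}) = \frac{S}{2} - Q_{-i}\) reaches it at \(\frac{S}{2}\).)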
exploitation of the resource (jointly maximizing profits) than when every firm independently
selects its own appropriation.19
To confirm this finding, let us simultaneously solve for appropriation levels \(q_1\) and \(q_2\)
in equation (17.4), \(q_1(q_2) = \frac{S}{2} - q_2\) for fisherman 1 and \(q_2(q_1) = \frac{S}{2} - q_1\) for fisherman 2.
However, these equations perfectly overlap each other, indicating that a continuum of optimal
pairs \((q_1, q_2)\) solves the joint-profit maximization problem, graphically illustrated by
all points along the line \(q_1(q_2) = \frac{S}{2} - q_2\). Because firms are symmetric, the literature often
considers that, among all optimal pairs, a natural equilibrium is that in which both firms
appropriate the same amount (\(q_1^{JP} = q_2^{JP} = q^{JP}\), where the superscript JP indicates “joint
profit” maximization).
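The continuum of optimal pairs can be illustrated numerically: any split of catches with \(q_1 + q_2 = S/2\) yields the same joint profit, while a pair off that line yields less. This is a minimal sketch; the function name `joint_profit`, the stock S = 100, and the particular splits are assumptions chosen for illustration.

```python
from fractions import Fraction


def joint_profit(q1, q2, stock):
    """Joint profit (q1 + q2) - (q1 + q2)^2 / S."""
    total = q1 + q2
    return total - total**2 / stock


S = Fraction(100)

# Three different splits, all satisfying q1 + q2 = S/2 = 50.
splits = [(Fraction(10), Fraction(40)),
          (Fraction(25), Fraction(25)),
          (Fraction(50), Fraction(0))]
profits = [joint_profit(q1, q2, S) for q1, q2 in splits]
print(profits == [S / 4] * 3)  # True: every split yields a joint profit of S/4

# A pair off the line (total catch of 60) yields a lower joint profit.
print(joint_profit(Fraction(30), Fraction(30), S))  # 24
```

Only the total catch matters for joint profit, which is why the symmetric split is singled out by convention rather than by the maximization problem itself.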
Inserting \(q_1^{JP} = q_2^{JP} = q^{JP}\) into equation \(q^{JP} = \frac{S}{2} - q^{JP}\), and solving for \(q^{JP}\), entails \(q^{JP} = \frac{S}{4}\).
Comparing this result against the equilibrium appropriation when agents independently
choose their appropriation levels (evaluated for the case of N = 2 fishermen), \(q^* = \frac{S}{2 + 1} = \frac{S}{3}\),
yields
\[
q^* > q^{JP} \quad \text{because} \quad \frac{S}{3} > \frac{S}{4}.
\]
This result says that agents exploit the resource less intensively when they coordinate
their appropriation decisions (and thus internalize the cost externalities their appropriation
generates) than when they do not coordinate their exploitation.
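To see that coordination indeed raises profits, the sketch below evaluates each fisherman's profit at the equilibrium appropriation \(q^* = S/3\) and at the joint-profit appropriation \(q^{JP} = S/4\) for N = 2. The stock S = 120 and the function name `profit` are assumptions for this illustration; the closed forms S/9 and S/8 follow from inserting each appropriation level into the profit function.

```python
from fractions import Fraction


def profit(q_i, q_other, stock):
    """Fisherman i's profit: q_i - q_i * (q_i + q_j) / S, with price 1."""
    return q_i - q_i * (q_i + q_other) / stock


S = Fraction(120)  # illustrative stock (assumed)

q_eq = S / 3  # equilibrium appropriation with N = 2
q_jp = S / 4  # joint-profit-maximizing appropriation

profit_eq = profit(q_eq, q_eq, S)
profit_jp = profit(q_jp, q_jp, S)

print(q_eq, q_jp)             # 40 30
print(profit_eq)              # 40/3, which equals S/9
print(profit_jp)              # 15, which equals S/8
print(profit_jp > profit_eq)  # True
```

Each fisherman catches less under coordination but earns more, because the lower total catch reduces everyone's cost of appropriation.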
19. To find the horizontal intercept of expression \(q_1(q_2) = \frac{S}{2} - q_2\), we only need to set it equal to zero, \(0 = \frac{S}{2} - q_2\),
rearrange, \(\frac{S}{2} = q_2\), and solve for \(q_2\) to obtain \(q_2 = \frac{S}{2}\), as illustrated in the horizontal intercept of figure 17.5.
Intuitively, this point represents that, if fisherman 2 appropriates half of the available stock, \(q_2 = \frac{S}{2}\), fisherman 1
responds by not appropriating anything (\(q_1 = 0\)).
Self-assessment 17.7 Repeat the analysis in subsection 17.5.2, but assume N =

12 fishermen and a stock of S = 230 tons of fish. What if the number of fishermen
increases to N = 14? What if, still with N = 12 fishermen, the stock of fish increases
to S = 250 tons? Interpret.
Exercises
1. Regulated duopoly.B Redo example 17.1 (unregulated equilibrium), but with two firms. Then
redo example 17.4 to find the optimal emission fee in a duopoly. Compare this result with that in
the monopoly found in example 17.4.
2. Finding the social optimum when considering consumer surplus.A Redo example 17.2 (find-
ing the social optimum), but now consider that social welfare includes consumer surplus as well.
Welfare in example 17.2 only includes profits and external cost.
3. Positive externalities and social optimum.A Redo example 17.2, but assume positive externalities,
where the external benefit function is \(EB = 5(\alpha q)^2 + 3\), where \(\alpha \in [0, \frac{1}{5})\). Find the unregulated
equilibrium and social optimum.
4. Setting quotas while considering consumer surplus.B Redo example 17.3 (prohibiting pollu-
tion), but now consider that social welfare includes consumer surplus too. Welfare in example
17.3 only includes profits and external cost. Talk about the presence of two market imperfections
(monopoly versus externalities).
5. Optimal emission fee.B Redo example 17.4, but with the environmental damage function in
example 17.3. Which is the lowest emission fee that achieves this objective?
6. Optimal subsidy with positive externality.A Redo example 17.4, but assuming positive externalities,
where the external benefit function is \(EB = 5(\alpha q)^2 + 3\) for \(\alpha \in [0, \frac{1}{5})\). Which subsidy per unit
output induces the monopolist to choose the socially optimal output?
7. Public goods and free-riding.B Consider two roommates, 1 and 2, who simultaneously choose
the number of hours that they spend cleaning their apartment. In particular, assume that roommate
i’s utility function when he spends hi hours cleaning and roommate j spends hj hours cleaning is
1/3
ui (hi , hj ) = (24 − hi ) + βhi (hi + hj ) . As in example 17.5, the first term represents the utility
that roommate i enjoys from the hours he spends not cleaning the apartment, because the day has
24 hours; and the second term measures the utility that he enjoys from a cleaner apartment, which
depends on both hi and hj , and is increasing in parameter β > 0.
(a) Suppose that the two roommates choose their hours of cleaning independently. What are the
optimal number of hours of cleaning in this context?
(b) Assume now that the two roommates can coordinate their actions, choosing hi and hj to maxi-
mize their joint utility ui (hi , hj ) + uj (hj , hi ). What are the optimal number of hours of cleaning
in this context?
(c) Compare your results in parts (a) and (b). Interpret your comparison in terms of free-riding
incentives.
8. Common-pool pasture.B Consider a small village that grazes sheep on an adjacent plot of land.
Sheep produce wool that depends on the number of other sheep grazing in the field, such that the
wool per sheep is w = 100 − 2S, where S = s1 + s2 is the number of sheep in the field. Assume
the wool can be sold at $1 per unit and sheep can be bought at $4 each. Villagers will buy a sheep
so long as it is profitable.
(a) How many sheep will the village graze on their field?
(b) How many sheep should the village graze on their field? (Hint: what is the maximum profit
the village can earn?)
9. Common-pool resource–I.B Consider a common-pool resource (e.g., a lake) operated by a single
firm during two periods, appropriating x units in the first period and q units in the second period. In
particular, assume that its first-period cost function is \(\frac{x^2}{3}\), while its second-period cost function is
\[
\frac{q^2}{3 - (1 - \beta) x}.
\]
Intuitively, parameter β denotes the regeneration rate of the resource. That is, if regeneration is
complete, β = 1, first- and second-period costs coincide; but if regeneration is null, β = 0, second-period
costs become \(\frac{q^2}{3 - x}\), and thus every unit of first-period appropriation x increases the firm’s
second-period costs. For simplicity, assume that every unit of output is sold at a price of $1 at the
international market.
(a) Find the profit-maximizing second-period appropriation.
(b) Using your result from part (a), find the profit-maximizing first-period appropriation.
(c) How are your results affected by a larger regeneration rate, β?
10. Common-pool resource–II.C Consider the scenario in exercise 9, but assume now that entry
occurs in the second period, and that the second-period cost function for both incumbent and
iq +q q
j i
entrant becomes 3−(1−β)x , where i = {inc, ent}.
inc
(a) Find the profit-maximizing second-period appropriation for each firm, qi and qj .
(b) Using your result from part (a), find the profit-maximizing first-period appropriation for the
incumbent, xi .
(c) Compare the incumbent’s first-period appropriation under entry with that under no entry.
Interpret your results.
11. Common-pool resource–III.B Consider the common-pool resource problem in section 17.5, but
with a new cost function \(C(q_i, Q_{-i}) = \frac{2 q_i (q_i + Q_{-i})}{S}\), where \(q_i\) denotes firm i’s appropriation and
Q−i represents the sum of the appropriation from all firm i’s rivals. Assume there are N individuals
with access to the fish which they can sell on the international market at $1 per unit, which every
individual takes as given.
(a) Find the equilibrium appropriation of fish.
(b) What is the equilibrium appropriation of fish when there are N = 10 fishermen and a stock
of S = 100 tons of fish?
(c) Find the appropriation of fish if the fishermen were to coordinate their catches.
(d) How much would each of the 10 now-coordinating fishermen catch if there were 100 tons of
fish?
12. Pollution and optimal policy.B Black Smoke eatery is the only restaurant in a small town. They
face inverse demand of p = 25 − 0.05q and have costs TC(q) = 3 + 4q. Unfortunately, the eatery
produces a lot of unsightly black smoke at the same rate as output (so pollution is equal to q).
(a) Find the unregulated equilibrium.
(b) Assume that the external cost of Black Smoke’s pollution is EC = 2q. Find the social
optimum.
(c) If the regulator is to seek the socially optimal output, what pollution quota would she set?
(d) If the regulator is to seek the socially optimal output, what emission fee would she set?
13. When does the Coase theorem apply.A Can the following situations be effectively addressed
under the Coase theorem? Discuss why or why not.
(a) Air pollution
(b) A homeowner playing loud music (negatively affecting his neighbors) within a homeowners’
association (HOA)
(c) Light pollution in a town with a powerful telescope (that needs surrounding darkness to be
effective)
(d) Use of an irrigation ditch between two ranches
14. Dealing with negative externalities.B Two neighbors in a rural community were fed up with the
town’s landfill policies and decided to purchase land together to use as their own landfill. However,
the two neighbors did not anticipate the consequences of their purchase and quickly found that
their new landfill smelled. Each neighbor has 10 bags of trash. Dumping on their own land is
cheap, but the two neighbors have to endure an increased bad smell; however, dumping at the
town’s landfill incurs a cost of $3 per bag. Neighbor 1 lives downwind of the new landfill and
endures the brunt of the smell. Her utility is
\[
u_1(b_1, b_2) = -3(10 - b_1) - (b_1 + b_2)^2 - (b_1 + b_2),
\]
while the upwind neighbor’s utility is
\[
u_2(b_1, b_2) = -3(10 - b_2) - (b_1 + b_2)^2,
\]
where bi is the number of bags each neighbor dumps at her own landfill. Note that the utilities
are negative, as both actions are positive numbers (i.e., b1 , b2 > 0).
(a) How much will each neighbor dump at her new landfill?
(b) If the neighbors were to coordinate, how much would they dump at their new landfill?
15. Coase theorem in action.A Consider Jordan and Hannah, new neighbors in a nice neighborhood.
Each has a home business, but with different needs. Jordan runs a woodworking business that
makes a lot of noise and creates a lot of sawdust, so his garage door has to stay open during the
workday. Hannah runs a yoga studio, which needs a quiet environment to be successful. If Jordan
runs his shop, he can make $500, while Hannah will have no customers and make $0. If Jordan
does not run his shop, he makes $0, while Hannah makes $600.
(a) Assuming that Jordan has the right to operate his shop, can Hannah induce him to shut down
his shop so that she can make a profit? What is the total profit?
(b) Hannah found out that Jordan can install a dust collection system in his shop for X dollars,
which would allow Jordan to close his garage door and lower the noise enough for her to
run her studio. What is the largest amount X that Hannah would be willing to pay for the
collection system? What is the total profit?
(c) Assume that there is an HOA contract (for which Hannah is the HOA president) that does
not allow Jordan to make noise. How much would Jordan offer Hannah not to enforce the
agreement and allow him to operate with the door open? What would be the total profit?
(d) How much would Jordan be willing to pay for a dust collection system to allow him to operate
under the HOA rules? What would be the total profit if he were to invest in the system?
16. Pollution regulation and perfect competition.B Consider a perfectly competitive industry that
faces demand of p = 10 − Q, and each firm faces a constant marginal cost of c = 2. The external
cost of the pollution is \(EC = 3(\alpha q)^2\).
(a) Find the unregulated equilibrium quantity Q∗ .
(b) Find the socially optimal quantity.
(c) Find the emission fee that would induce the socially optimal quantity.
17. Multiple polluters.B Two polluting utility companies offer power at a regulated price of $3 per
unit but have different cost functions. The first company produces a cheaper but more polluting
energy at cost \(TC_d = 2 - q_d + 0.5 q_d^2\), with emissions \(e_d = 2 q_d\). The second company produces a
less polluting energy, but at a higher cost, \(TC_c = 4 - q_c + q_c^2\), with emissions \(e_c = q_c\).
(a) Find the amount of energy and emissions that each firm will produce if left unregulated.
(b) If the external cost of pollution is \(EC = \frac{1}{2} (e_d + e_c)^2\) (the regulator cannot directly measure
each firm’s emissions, but can measure total emissions), find the socially optimal amount of
output from each firm.
(c) Is it possible to find a single emissions fee t that would induce the market to produce at the
social optimum?
18. Reducing emissions.B A coal company produces electricity with total cost TC = 4q and emis-
sions e = 2q. The coal company, being large, is a monopoly in its region and faces demand of
p = 20 − q.
(a) If the external cost of emissions is EC = 2e, what emission fee would induce the social
optimum?
(b) Now assume that the company can invest in a new technology that reduces their emissions to
\(e = 2(q - \alpha)\), at a cost of \(2\alpha^2\). Intuitively, if α = 0, the firm does not invest in the technology,
and its emissions are the same as in part (a), while a larger investment in the technology
increases α, which decreases emissions and the fee paid by the firm. Assuming the fee you
found in part (a), how much will the company invest in α?
19. Common-pool refrigerator.A Each year, college students are finding themselves living with new
roommates with different lifestyle habits than their own. A potentially frustrating habit is the use
of the common refrigerator. Discuss the use, potential problems, and solution to the use of a
common refrigerator among a group of 2 or more roommates.
20. Spillover effects.C Firms within an industry can experience “spillover effects” of investing in new
technologies. Many competing industries have firms concentrated in one region or city, where the
workers at competing firms may interact with each other, leading to the exchange of ideas and
this positive externality. Consider two firms that face inverse demand of p = 10 − q1 − q2 . Each
firm faces total costs of
\[
TC = (4 - x_i - 0.25 x_j) q_i + 0.5 x_i^2,
\]
where the original marginal cost of production ($4) can be reduced by investing in xi and also
decreases by a portion of its rival’s investment, at a rate of 0.25xj . The cost of investing in the
cost-reducing technology is \(0.5 x_i^2\). Assume that, in the first stage, every firm i invests \(x_i\) dollars
in R&D. In the second stage, every firm observes the R&D investment (xi , xj ), and thus the cost
function of each firm, and firms compete in quantities (à la Cournot).
(a) If the firms act competitively, how much will each firm produce, and how much of the cost-
reducing technology will they invest in? (Hint: Because this is a sequential-move game where
firms are perfectly informed, you need to find the subgame perfect equilibrium (SPE) of the
game, operating by backward induction. You should then start analyzing firms’ output choices
in the second stage of the game, for any pair of xi and xj ; and then, anticipating the profits that
firms make in the second stage, find their equilibrium investment in R&D in the first stage.)
(b) Discuss how the knowledge spillover affects investment in the new technology. In other words,
how does firm 2’s investment x2 affect firm 1’s decisions on output and investment, compared
to if they were to act cooperatively?
References
Akerlof, George A. (1970) “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism,” Quarterly
Journal of Economics 84(3): 488–500.
Angner, Erik (2016) A Course in Behavioral Economics, 2nd ed. Red Globe Press.
Belleflamme, Paul and Martin Peitz (2015) Industrial Organization: Markets and Strategies, 2nd ed. Cambridge
University Press.
Besanko, David and Ronald Braeutigam (2013) Microeconomics, 5th ed. Wiley Publishers.
Bolton, Gary E. and Axel Ockenfels (2000) “ERC: A Theory of Equity, Reciprocity, and Competition,” American
Economic Review 90(1): 166–93.
Bolton, Patrick and Mathias Dewatripont (2004) Contract Theory. MIT Press.
Cabral, Luis (2017) Introduction to Industrial Organization, 2nd ed. MIT Press.
Camerer, Colin F. (2003) Behavioral Game Theory: Experiments in Strategic Interaction (The Roundtable Series
in Behavioral Economics). Princeton University Press.
Campbell, Donald E. (2018) Incentives: Motivation and the Economics of Information, 2nd ed. Cambridge Uni-
versity Press.
Coase, Ronald H. (1960) “The Problem of Social Cost,” Journal of Law and Economics, 3(1): 1–44.
Dal Bó, Pedro and Guillaume R. Fréchette (2011) “The Evolution of Cooperation in Infinitely Repeated Games:
Experimental Evidence,” American Economic Review 101(1): 411–29.
Duffy, John and Jack Ochs (2009) “Cooperative Behavior and the Frequency of Social Interaction,” Games and
Economic Behavior 66(2): 785–812.
Fehr, Ernst and Klaus M. Schmidt (1999) “A Theory of Fairness, Competition, and Cooperation,” Quarterly Journal
of Economics 114(3): 817–68.
Goolsbee, Austan, Steven Levitt, and Chad Syverson (2015) Microeconomics, 2nd ed. Worth Publishers.
Harrington, Joseph (2006) “How Do Cartels Operate?” Foundations and Trends in Microeconomics 2(1): 1–105.
Harrington, Joseph (2014) Games, Strategies, and Decision Making, 2nd ed. Worth Publishers.
Hindricks, Jean and Gareth D. Myles (2013) Intermediate Public Economics, 2nd ed. MIT Press.
Jensen, Robert T. and Nolan H. Miller (2007) “Giffen Behavior: Theory and Evidence,” NBER Working Paper No.
13243.
Just, David R. (2013) Introduction to Behavioral Economics. Wiley Publishers.
Kagel, John H. and Dan Levin (2014) “Auctions: A Survey of Experimental Research,” Working paper, The Ohio
State University.
Kahneman, Daniel (2013) Thinking, Fast and Slow. Farrar, Straus and Giroux.
Kahneman, Daniel and Amos Tversky (1979) “Prospect Theory: An Analysis of Decision under Risk,” Economet-
rica 47(2): 263–92.
Kahneman, Daniel and Amos Tversky (2000) Choice, Values and Frames. Cambridge University Press.
Klemperer, Paul (2004) Auctions: Theory and Practice (Toulouse Lectures in Economics). Princeton University
Press.
Kolstad, Charles D. (2010) Environmental Economics. Oxford University Press.
Krishna, Vijay (2002) Auction Theory. Academic Press.
Laffont, Jean-Jacques and David Martimort (2002) The Theory of Incentives: The Principal-Agent Model.
Princeton University Press.
Levenstein, Margaret C. and Valerie Y. Suslow (2006) “What Determines Cartel Success?” Journal of Economic
Literature, 44(1): 43–95.
Macho-Stadler, Ines and David Perez-Castrillo (2001) An Introduction to the Economics of Information: Incentives
and Contracts. Oxford University Press.
McKenzie, David (2002) “Are Tortillas a Giffen Good in Mexico?,” Economics Bulletin vol. 15(1): 1–7.
Menezes, Flavio M. and Paulo K. Monteiro (2004) An Introduction to Auction Theory. Oxford University Press.
Milgrom, Paul (2004) Putting Auction Theory to Work. Cambridge University Press.
Muñoz-Garcia, Felix (2017) Practice Exercises for Advanced Microeconomic Theory. MIT Press.
Muñoz-Garcia, Felix and Daniel Toro-Gonzalez (2019) Strategy and Game Theory: Practice Exercises with
Answers, 2nd ed. Springer.
Nash, John F., Jr. (1950) “Equilibrium Points in N-Person Games,” Proceedings of the National Academy of Sciences
36(1): 48–49.
Perloff, Jeffrey M. (2016) Microeconomics: Theory and Applications with Calculus, 4th ed. Pearson Publishers.
Phaneuf, Daniel J. and Till Requate (2016) A Course in Environmental Economics: Theory, Policy, and Practice.
Cambridge University Press.
Smith, Vernon L. (1991) “Rational Choice: The Contrast between Economics and Psychology,” Journal of Political
Economy 99(4): 877–97.
Thaler, Richard H. (1988) “Anomalies: The Winner’s Curse,” Journal of Economic Perspectives 2(1): 191–202.
Tversky, Amos and Daniel Kahneman (1986) “Rational Choice and the Framing of Decisions,” Journal of Business
59(4): 251–78.
Tversky, Amos and Daniel Kahneman (1992) “Advances in Prospect Theory: Cumulative Representation of
Uncertainty,” Journal of Risk and Uncertainty 5(4): 297–323.
Varian, Hal (2014) Intermediate Microeconomics: A Modern Approach, 9th ed. W. W. Norton & Company.
Vesterlund, Lise (2014) “Charitable Giving: A Review of Experiments on Voluntary Giving to Public Goods,” in
Handbook of Experimental Economics, vol. 2, ed. C. R. Plott and V. L. Smith. Princeton University Press.
Index
Adverse selection problems, 420–421, 428 Bertrand model of imperfect competition,

in market for lemons, 428–431 365–369
preventing, 438–439 Best response functions, 358–361
principal-agent model of, 431–438 for games of incomplete information, 392–393
profit maximization problem in context of, 439–440 with incomplete information, 393–394
Advertising, by monopolies, 266–267 with product differentiation, 377–379
Airline industry, 249 Bid shading, 402–403, 414
Akerlof, George A., 428 Bliss points, 12–14
Allocation rule, 397 Block pricing (second-degree price discrimination),
All-pay auctions, 397 281n3
Amazon (firm), 248, 248n2 Bolton, Gary E., 37
Anticoordination game, 315–316, 323 Bolton and Ockenfels social preferences utility
Arbitrage, 279 function, 37
Arrow-Pratt coefficient of absolute risk aversion, Budget constraints, 45–49
140–142 Budget lines, 46
Auctions, 396–397 kinked, 60–63
as allocation mechanism, 396–397 Bundles, 7
common-value auctions, 410–411 budget restraints for, 45–49
double auctions, 240–241 consumer choices for, 45
efficiency in, 409–410 income effects on, 75
experiments with, 411–412 preferences for, 8–14
first-price auctions, 400–409, 412–414 Bundling, 277–278, 286–291
second-price auctions, 397–400 Buyers. See Customers
Average costs, 198–199
economies of scale in, 201–203 Capital
Average product, 157–161 as fixed short-run cost, 196
relationship between marginal product and, 161–163 as input, 155
in isocost lines, 183–185
Backward induction, 334–339 technological progress in, 174–175
Bads, 8 Cardinality, 14
Bang for the buck, 50, 64 Cartels, 263, 355, 369–373
Bargaining between parties, 451–453 Certainty effect, 147
Battle of the Sexes game, 312–314 prospect theory on, 148
Bayesian Nash equilibrium, 393 Certainty equivalent, 139–140
Behavioral economics, 36–37, 142–143 Cheating, in games, 343–344
on auctions, 411–412 Chinese auctions, 397, 409
on cooperation in games, 346–347 Cisco (firm), 268n
market experiments in, 240–241 Club goods, 455–456
prospect theory in, 145–148 Coase, Ronald H., 451
public-goods experiments, 459 Coase theorem, 451–452
weighted utility in, 144–145 Cobb, Charles, 32n
Cobb-Douglas production functions, 156–157
  in cost-minimization problem, 188
  elasticity of substitution in, 178–179
  finding input demands with, 190
  marginal rate of technical substitution, 165, 166–167
  output elasticity in, 200–201
  in profit maximization problem, 216–217
  Slutsky equation applied to, 100–101
  for total costs, 193–194
  for utility maximization problems, 53
Cobb-Douglas utility functions, 32–33
  for expenditure minimization problems, 66–67
  for finding income effect and substitution effect, 91–92
  income elasticity in, 78
  increasing income in, 77
  Stone-Geary utility function and, 35
Collusion, in imperfect competition, 369–373
Common-pool resources, 455, 456, 459–464
Common-value auctions, 410–411
Comparative statics, 2
Compensated demand, 66
Compensating variation, 110–113
  alternative representation of, 120–122
  measuring, with quasilinear utility function, 117–118
Competition
  cartels and collusion in, 369–373
  imperfect, 355–356
  imperfect, models of, 357–369
  market power in, 356–357
  perfectly competitive markets, 214, 249
Competitive markets, 214
Completeness, 8–9
Concave utility, 134
Constant elasticity of substitution production function, 171
Constrained maximization problems, 64
Consumer choice, 45
  budget constraints in, 45–49
  kinked budget lines in, 60–63
  revealed preferences in, 57–60
  utility maximization problem in, 49–56
Consumers, 1. See also Customers
Consumer surplus, 107–110
  measuring, with quasilinear utility function, 117
Consumer theory, 2–4, 7
  marginal rate of substitution in, 25–28
  marginal utility in, 18–19
  preferences for bundles in, 8–14
  utility functions in, 14–17
Consumption
  in income-consumption curve, 79–80
  in price-consumption curves, 85–87
Contract theory, 419–421
  adverse selection problems in, 428–439
  moral hazards in, 421–428
Convex utility, 136
Cooperation
  among cartel members, 371–372
  in games, 346–347
Coordination game, 314–315
Cost advantages, 248
Cost functions, 193–195
Cost minimization, 183
  average and marginal costs in, 198–201
  cost functions in, 193–195
  diseconomies of scale, 205–206
  economies of scale, scope, and experience in, 201–206
  input demands in, 189–193
  isocost lines, 183–185
  Lagrange analysis for, 206–207
  problem in, 185–189
  types of costs, 195–198
Cost-minimization problem, 185–189
  Lagrange analysis for, 206–207
Costs
  of advertising, 266–267
  average and marginal, 198–201
  types of, 195–198
Coupons, 62–63
Cournot model of imperfect competition, 358–365
  Bertrand model reconciled with, 369
  for cartels, 370, 371
  with incomplete information, 393–394
  with N firms, 380–383
  Stackelberg model and, 373, 376
Cross-price elasticity, 84
Customers
  adverse selection problems faced by, 428–431
  first-degree price discrimination on, 279–281
  legal rights of, 439
  preferences for bundles among, 8–14
  screening of, 438
  second-degree price discrimination on, 281–284
  third-degree price discrimination on, 284–286
  of used cars, 420
  willingness-to-pay of, 277

Deadweight loss, 263–265
Demand
  derivative of, 76–77
  income effects and, 87–88
  price changes and, 82–87
Demand advantages, 248
Demand curves
  consumer surplus and, 107–110
  in monopoly markets, 255–256
Derivative of demand, 76–77, 83
Didius Julianus, 396n3
Diseconomies of scale, 202–203
Double auctions, 240–241
Douglas, Paul, 32n
Duopolies, 357
  Cournot model of, 362–363
  Cournot model with, 382

eBay (firm), 248, 396
Economies
  adding production to, 239–240
  of experience, 205–206
  of scale, 201–203, 248
  of scope, 203–205
Efficiency
  in auctions, 409–410
  equilibrium versus, 234–239
Efficient allocations, 233–235, 240
  marginal rate of substitution and, 241–242
Elasticity
  constant elasticity of substitution production function, 171
  of demand, in monopoly markets, 255–256
  of income, 77–79, 97–98
  of outputs, 199–201
  of price, 83–84
  of price, in monopoly markets, 256–257
  Slutsky equation to represent, 101–102
  of substitution, 176–179
Emissions, government interventions to reduce, 453–455
Employees
  adverse selection problems involving, 420–421
  moral hazard problems involving, 421–428
  performance of, 419–421
  principal-agent model for contracts with, 431–438
Engel curve, 80–81
Environmental damage, 464
Equilibrium, 213–214
  for common-pool resources, 460–461
  general equilibrium, 228–240
  long-run equilibrium, 225–226
  short-run equilibrium, 224–225
Equilibrium allocations, 239–240
Equilibrium price, 230–233
Equivalent variation, 114–116
  alternative representation of, 122–124
  measuring, with quasilinear utility function, 119
Exchange economies, 239
Expected utility, 131–132
  weighted utility and, 144
Expected value, 128–129
  variance and, 130
Expenditure function, 120–124
Expenditure minimization problems, 65–68
  utility maximization problems and, 68–70
Experience, economies of, 205–206
Experimental tests. See also Behavioral economics
  of auctions, 411–412
  of cooperation in games, 346–347
  of economics, 142–143
  on public-goods, 459
Explicit costs, 195
Externalities, 445–446
  social optimum for, 448–455
  with unregulated equilibrium, 446–447

Feasible allocations, 229
Fehr, Ernst, 36, 38n
Fehr-Schmidt social preferences utility function, 36–37
Financial aid to students, 281
Finite repetitions of games, 340–341
Firms, 1
  cartels, 263
  competition among, 355
  market power of, 356–357
  monopolies, 247–249
  moral hazards in contracts with employees, 421–428
  principal-agent model for contracts with employees, 431–438
  production functions for, 155–157
  production theory for, 4
  profit maximization problem for, 214–217
  supply curves for, 217–220
First-degree price discrimination, 277, 279–281
First-price auctions, 396
  equilibrium bidding in, 401–404
  in more general settings, 412–414
  with N bidders, 405–406
  with privately observed valuations, 400–401
  with risk-averse bidders, 407–408
First Welfare Theorem, 213, 234–237
Fishing farms, 451–452
Fixed costs, 196–198
Fixed proportions production function, 169–170
  elasticity of substitution in, 178
Free Application for Federal Student Aid (FAFSA) forms, 281
Free-riding, 456–458
Frontier Airlines, 249

Game of Chicken, 315–316
Game theory, 5–6
  applied to common games, 310–316
  on auctions, 396–414
  behavioral economics on cooperation in, 346–347
  games defined for, 298–300
  for games with incomplete information, 391–396
  game trees for, 330–331
  mixed-strategy Nash equilibrium for, 316–321
  Nash equilibrium in, 306–310, 332–333
  for repeated games, 340–346
  for sequential games, 329–330
  simultaneous-move games, 297–298
  strategic dominance in, 300–306
  subgame perfect equilibrium in, 334–339
Game trees, 330–331
  Nash equilibrium for, 332–333
Geary, Roy C., 35n
General equilibrium, 213–214, 228–229
Giffen goods, 83, 89–91
Goods
  in bundles, 7, 8
  derivative of demand for, 76–77
  Engel curve for, 80–81
  inferior, 82, 97–98
  public goods, 455–459
Government interventions, 453–455
Grim-Trigger Strategy, 341–345, 371–372

Health insurance
  moral hazards in, 421
  screening of customers for, 438
Herfindahl-Hirschman index, 356–357
Hicks, John Richard, 66n11
Hicksian demand (compensated demand), 66n11
Hidden actions. See Moral hazards
Hidden information problems, 420

Imperfect competition, 355–356
  Bertrand model of, 365–369
  cartels and collusion in, 369–373
  Cournot model of, 358–365
  market power in, 356–357
  models of, 357–358
  product differentiation in, 377–380
  Stackelberg model for, 373–376
Implicit costs, 195
Implied warranties, 439
Incentive constraints, 426, 439–440
Income
  in budget constraints, 45–49
  effects of changes in, 75–82
  and substitution effects, 87–88
Income-consumption curve, 79–80
Income effects, 87–88
  alternative representation of, 98–101
  in labor market, 94–97
  substitution effects and, 88–94
  welfare changes and, 116–119
Income elasticity, 77–79, 97–98
  Slutsky equation to represent, 101–102
Indifference curves, 20–24
  isoquants and, 164
  marginal rate of substitution and, 25–26
Inferior goods, 76, 82, 89–91, 97–98
Infinite repetitions of games, 341–346
  experimental studies of, 346–347
Information rent, 427
Input demands, 183, 189–193
Inputs, 155
  in fixed proportions production function, 169–170
  in isocost lines, 183–185
  isoquants for substitutions of, 163–165
  production functions for, 156
  returns to scale of, 171–173
Insurance markets
  moral hazards in, 421
  screening of customers in, 438
Inverse demand function, 249, 251
Inverse elasticity pricing rule, 260
Isocost lines, 183–185
  in cost-minimization problem, 185–189
Isoquants, 155, 163–165, 183
  in cost-minimization problem, 185–189
  of fixed proportions production function, 170–171
  marginal rate of technical substitution, 165–167
Iterative Deletion of Strictly Dominated Strategies, 303–307

Jensen’s inequality, 134n3

Kahneman, Daniel, 143, 145, 148
Kaiser Aluminum (firm), 195
Kinked budget lines, 60
  coupons in, 62–63
  for quantity discounts, 60–62

Labor
  as fixed short-run cost, 196
  in isocost lines, 183–185
  productivity of, 157
  technological progress in, 174
Labor market
  adverse selection problems in, 420–421
  income and substitution effects on, 94–97
  lemons in, 430–431
Labor-saving technological progress, 175
Lagrange multipliers
  for cost-minimization problem, 206–207
  for utility maximization problem, 64–65
Leisure, 94–97
Lemons (undesirable used cars), 420
  market for, 428–431
Leontief, Wassily, 30n
Leontief utility function, 30n
Lerner index (markup index), 257–260
Linear inverse demand, 249
Linear pricing (uniform pricing), 284
Linear production functions, 168–169
  in cost-minimization problem, 189
  elasticity of substitution in, 177–178
  for finding input demands, 191
  marginal rate of technical substitution, 167–168
  for total costs, 194–195
Linear utility function, 137–138
Long-run costs, 196–197
Long-run equilibrium, 225–226
Long-run supply curves, 219–220
Loss aversion, 146–147
Lotteries, 127, 128
  expected utility of, 131–132
  expected values in, 128–129
  experimental research on, 142–148
  inefficiency of, 409–410
  prospect theory on, 145–148
  risk premiums of, 138–139
  variance in, 129–131

Marginal costs, 198–199
Marginal products, 157–161
  marginal rate of substitution as ratio of, 175–176
  relationship between average product and, 161–163
  technological progress increasing, 174
Marginal rate of substitution, 25–28
  efficient allocations and, 241–242
  in equilibrium prices, 230
  finding, 37–38
  of fixed proportions production function, 170
  as ratio of marginal products, 175–176
  of technical substitution, 165–167
Marginal revenues
  for monopolies, 250–253
  in third-degree price discrimination, 284
Marginal utility, 18–19
Market equilibrium
  long-run equilibrium, 225–226
  short-run equilibrium, 224–225
Market power, 356–357
Markets, 4–5
  experiments involving, 240–241
  failures of, 6
  general equilibrium of, 228–240
  imperfect competition in, 355–356
  for monopolies, 255–257
  perfectly competitive, 214, 249
Market supply curves, 220–221
Markup index (Lerner index), 258
Marshall, Alfred, 66n12
Marshallian demand (uncompensated demand), 66n12
Microeconomics, 1–2
Mixed-strategy Nash equilibrium, 298
Monopolies, 4–6, 247–249
  advertising by, 266–267
  bundling by, 286–291
  Cournot model with, 382
  first-degree price discrimination by, 278–281
  Herfindahl-Hirschman index for, 357
  Lerner index and inverse elasticity pricing rule for, 257–260
  markets for, 255–257
  monopsonies and, 268–271
  multiplant, 260–263
  profit maximization problem for, 249–255
  second-degree price discrimination by, 281–284
  third-degree price discrimination by, 284–286
  welfare analysis under, 263–265
Monopsony, 268–271
Monotonicity, 10–12, 16
Moral hazards, 419, 421–422
  preventing, 428
  when effort is observable, 422–424
  when effort is unobservable, 424–427

Nash, John F., Jr., 307
Nash equilibrium, 297, 298, 306–310
  for games of incomplete information, 392–396
  for game trees, 332–333
  for mixed strategies, 316–321
  for sequential games, 329
National defense, 455
Natural monopolies, 248
Negative taxes, 238
Nonexpected utility, 142
Nonlinear pricing, 284
Nonsatiation, 12, 16
Normal goods, 76

Ockenfels, Axel, 37
Oil leases, auctions for, 410
Oligopolies, 355
  Herfindahl-Hirschman index for, 357
One-shot games (unrepeated games), 340
Opportunity costs, 195
Oracle (firm), 268n
Ordinality, 14
Organization of the Petroleum-Exporting Countries (OPEC), 263, 340, 369
Outputs
  competition among, with product differentiation, 379–380
  economies of scale in, 201–203
  elasticity of, 199–201
  in long-run equilibrium, 225–226
  of monopolies, 249, 253–255
  in profit maximization problem, 215

Partial equilibrium, 213–214
Participation constraints, 426, 429, 439–440
Patents, 248, 248n3
Payment rule, 397
Payoffs, 299
Peaches (desirable used cars), 420, 430, 431
Perfect complements utility function, 30–31
Perfectly competitive markets, 214, 249
  Cournot model with, 382–383
Perfect substitutes utility function, 29–30
Performance, of employees, 419–421
Pharmaceutical industry, 248
Players (in games), 298
Pollution, 445–447
  bargaining to reduce, 451–453
  government interventions to reduce, 453–455
  social optimum for, 448–450
Positive externalities, 446, 454n
Postcontractual problems, 421
Precontractual problems, 421
Price-consumption curves, 85–87
Price discrimination, 277–279
  first-degree, 279–281
  second-degree, 281–284
  third-degree, 284–286
Prices
  Bertrand model of simultaneous price competition, 365–369
  in budget constraints, 48–49
  changes in, 82
  compensating variation changes in, 110–113
  consumer surplus and changes in, 107–110
  derivative of demand and, 83
  equilibrium price, 230–233
  equivalent variation and changes in, 114–116
  inverse elasticity pricing rule, 260
  in long-run equilibrium, 225–226
  in monopolies’ profit maximization problem, 249–250
  in monopoly markets, 255–257
  in perfectly competitive markets, 214
  price-consumption curves, 85–87
  price-elasticity of demand and, 83–84
  responses to changes in, 192–193
Price wars, 248–249
Principal-agent model, 431
  with asymmetric information, 433–436
  comparing information settings for, 436–438
  with symmetric information, 431–433
Prisoner’s Dilemma game, 310–312, 329–330
  experimental study of, 346–347
  infinite repetitions of, 341–346
  mixed strategy for, 320–321
  repetitions of, 340
Private goods, 455
Probabilities, 128. See also Uncertainty
Probability weights, 146
Producer surpluses, 226–228
Product differentiation, 377–380
Production, adding to economy, 239–240
Production functions, 155–157
  Cobb-Douglas, 170–171
  constant elasticity of substitution, 171
  fixed proportions, 169–170
  isoquants, 163–165
  linear, 168–169
  marginal and average product, 157–161
  marginal rate of technical substitution, 165–167
  relationship between average product and marginal product, 161–163
  returns to scale, 171–173
  for technological progress, 173–175
Production theory, 4
Productivity
  average product and marginal product measurements of, 157–161
  of employees, 420
Profit maximization problem, 214–217
  in adverse selection context, 439–440
  Cournot model of, 358
  for monopolies, 249–255
Prospect theory, 145–148
Public goods, 445, 455–459
  common-pool resources, 459–464

Quantity competition
  collusion in, 369–370
  with N firms, 380–383
  sequential, Stackelberg model of, 373–376
  simultaneous, Cournot model of, 358–365
Quantity discounts, 60–62
  in second-degree price discrimination, 281–284
Quasilinear utility function, 34–35
  for expenditure minimization problems, 67–68
  for finding income effect and substitution effect, 93–94
  measuring welfare changes with, 117–119
QuiBids.com (firm), 397n
Quota, 371, 453

Rationality, 2, 299
Reference points, 146
Regulators, 1
Repeated games, 340
  with finite repetitions, 340–341
  with infinite repetitions, 341–346
Research and development (R&D), 446
Returns to scale production functions, 171–173
Risk aversion, 127, 132–134
  Arrow-Pratt coefficient of, 140–142
  in auctions, 407–409
Risk loving, 134–136
Risk neutrality, 136–138
Risk premium, 138–139
  measurement of, 140
Risks
  attitudes toward, 132–138
  behavioral economics of, 142–148
  measurement of, 127, 138–142
  screening of customers to avoid, 438
  variance in measurement of, 129–131
Rollback equilibrium (subgame perfect equilibrium), 329, 334–339

Satiation, 12–14
Scale, economies of, 201–203
Schmidt, Klaus M., 36, 38n
Scope, economies of, 203–205
Screening
  of customers, 285–286, 286n
  of insurance risks, 438
Second-degree price discrimination, 277, 281–284
Second-order conditions, 215
Second-price auctions, 397–400
Second Welfare Theorem, 213, 237–239
Sequential games, 329–330
  Stackelberg model for, 373–376
Short-run costs, 196–198
Short-run equilibrium, 224–225
Short-run supply curves, 221–223
Shutdown price, 219
Signaling, 438–439
Simultaneous-move games, 297–298
Slutsky equation, 100–101
  elasticities to represent, 101–102
Smith, Vernon L., 241
Soccer, 317–320
  mixed strategy for, 321–323
Social optimum, 448–450
  for common-pool resources, 459–464
  restoring, 451–455
Social preferences, 36–37
Social welfare, 263
Stackelberg model of sequential quantity competition, 373–376
Standard deviation, 131
Stone, Richard, 35n
Stone-Geary utility function, 35–36
Strategic dominance, 297, 300–306
Strategies, 5
  for common games, 310–316
  defined, 298
  mixed-strategy Nash equilibrium, 316–321
  Nash equilibrium to find, 306–310
Strict dominance, 300–306
Strict monotonicity, 10–12, 16
Subgame perfect equilibrium (rollback equilibrium), 329, 334–339
  for sequential quantity competition, 375
Subgames, 335
Subsidy, 237–239
Substitution effects, 75, 87–88
  alternative representation of, 98–101
  constant elasticity of substitution production function, 171
  elasticity of, 176–179
  income effects and, 88–94
  in labor market, 94–97
  marginal rate of technical substitution, 165–167
Sunk costs, 195–196
Supply curves, 217–220
  market supply curves, 220–221
  in monopoly markets, 255
  short-run supply curves, 221–223
Surpluses, 226–228

Taxes, 107, 124
  negative taxes, 238
  to prevent free-riding, 458
Technological progress, production functions for, 173–175
Third-degree price discrimination, 277, 284–286
Third-price auctions, 397
Tolls, 458
Transitivity, 9
Tversky, Amos, 143, 145, 148

Uncertainty, 127
  behavioral economics of, 142–148
  expected utility in, 131–132
  expected values in, 128–129
  in lotteries, 128
  prospect theory on, 145–148
  variance in, 129–131
Uncompensated demand, 66
Uniform pricing (linear pricing), 284
United Airlines, 249
Unregulated equilibrium, 446–447
Unrepeated games (one-shot games), 340
Unsunk costs, 195–196
Used-cars market, 420
  market for lemons in, 428–431
Utility elasticity, 33
Utility functions, 14–17
  Cobb-Douglas, 32–33
  diminishing marginal rate of substitution in, 26
  Fehr-Schmidt social preferences function, 36–37
  indifference curves, 20–24
  for marginal utility, 18–19
  for perfect complements, 30–31
  for perfect substitutes, 29–30
  quasilinear, 34–35
  Stone-Geary, 35–36
Utility maximization problems, 49–55
  expenditure minimization problems and, 68–70
  in extreme scenarios, 55–56
  income effects in, 75
  Lagrange multiplier to solve, 64–65

Variable costs, 196–198
Variance, 129–131
Von Neumann-Morgenstern EU function, 132n

WalMart (firm), 248, 248n2
Walras, Leon, 66n12
Walrasian demand, 66n12
Warranties, 431
  implied warranties, 439
Weak Axiom of Revealed Preference (WARP), 57–60, 263
Weak dominance, 302
Weather forecasting, 127, 128
Weighted utility, 144–145
Welfare analysis
  externalities affecting, 445–446
  under monopolies, 263–265
Welfare changes, 107
  compensating variation measurement of, 110–113, 120–122
  consumer surplus measurement of, 107–110
  equivalent variation measurement of, 114–116, 122–124
  with no income change, measurement of, 116–119
Willingness-to-pay (WTP), 277, 279
Winner’s curse, 411
Workers. See Employees

