
Lecture Notes Mathematical Economics 1

Period 1

René van den Brink & Harold Houba

Mathematical and Computational Economics (MACE)


Department of Economics
Vrije Universiteit, Amsterdam

Autumn 2023
Contents

1 Mathematical Economics: Introduction to the Course

1.1 Defining the field
1.2 On the use of Mathematics in Economics
1.3 Descriptive versus Normative Economics
1.4 Overview of this course

2 Deterministic Choice among a Finite number of Alternatives

2.1 Introduction
2.2 A Choice Model based on Preferences and choosing Best elements
2.2.1 The Set of Feasible Alternatives
2.2.2 The Preference Relation
2.2.3 Axioms of Preference Relations
2.2.4 The Optimization Principle
2.3 A Choice Model of Utility Maximization
2.3.1 The Utility Function
2.3.2 Utility maximization
2.4 Inference of Preferences
2.4.1 Revealed preferences
2.4.2 Independence of Irrelevant Alternatives
2.4.3 The attraction effect and IIA in experiments
2.5 Behavioral Economics (Optional)
2.6 Economics and Decision Theory (Optional)
2.7 References
2.8 Exercises (week 1)

3 Deterministic Choice among uncountably many alternatives

3.1 A Choice Model based upon Preferences
3.1.1 The Set of Feasible Alternatives
3.1.2 The Preference Relation
3.1.3 Axioms of Preference Relations
3.2 The Optimization Principle
3.3 A Choice Model of Utility Maximization
3.3.1 The Utility Function
3.3.2 Properties of Utility Functions
3.3.3 Utility Maximization
3.4 Inference
3.5 Computational Economics (Optional)
3.6 Behavioral Economics (Optional)
3.7 Exercises

4 Stochastic Choice among a Finite Set of Alternatives

4.1 Motivation
4.1.1 Probabilistic versus Algebraic Theories
4.1.2 Multiple Alternative Choices
4.1.3 Well-Defined Sets of Alternatives
4.2 The Probability and Choice Axioms
4.3 Existence of a Ratio Scale
4.4 Inference
4.5 Behavioral Economics (Optional)
4.6 References
4.7 Exercises

5 The Consumer

5.1 The Consumer
5.2 The Budget set
5.3 Demand of the consumer
5.3.1 Finding the demand of the consumer
5.3.2 The Lagrange multiplier and indirect utility
5.4 Expenditure function and compensated demand function
5.5 Market demand
5.6 Elasticities
5.7 Concluding remarks
5.8 Exercises

6 The Producer and Individual Decision making on Markets: Perfect Competition and Monopoly

6.1 The Producer
6.1.1 Production functions
6.1.2 Decision model of a producer
6.1.3 Profit maximization
6.1.4 Cost minimization
6.1.5 Cost functions
6.2 Individual Decision Making on Markets: Perfect Competition and Monopoly
6.2.1 Market forms
6.3 Perfect competition
6.3.1 Supply of a perfect competitor
6.3.2 Perfect competition: equilibrium
6.4 Monopoly
6.4.1 Finding the optimal price
6.4.2 Deadweight loss of monopoly
6.5 Exercises

7 Preference aggregation

7.1 Introduction
7.2 Social choice situations
7.3 Social choice functions
7.3.1 Some examples of scoring rules
7.3.2 Majoritarian social choice functions
7.4 Properties of social choice functions
7.5 Single-peaked preferences
7.6 Social welfare functions
7.6.1 Condorcet social welfare function
7.6.2 The Borda social welfare function
7.6.3 Properties of social welfare functions (Optional)
7.7 Concluding remarks
7.8 Exercises

8 Ranking methods

8.1 Introduction
8.2 Directed graphs
8.3 Score functions
8.4 Eigenvector scores
8.5 Application to Collective Choice
8.5.1 Majoritarian social choice functions based on score functions
8.5.2 Top cycle and Uncovered set
8.5.3 Two properties of social choice functions
8.6 Social welfare functions
8.7 Conclusion
8.8 Exercises
Chapter 1

Mathematical Economics:
Introduction to the Course

1.1 Defining the field


Mathematical Economics consists of two words that obviously suggest that it is
concerned with mathematics and economics. The first word, mathematical, is an
adjective meaning "pertaining to or the use of mathematics" according to
Webster’s Dictionary, and the second word, economics, is a noun that indicates the
subject to which the mathematics is applied. The second word is leading: it is
economics and not, say, physics or sociology that is central to Mathematical
Economics.
There are various definitions of Economics. In its broadest definition, Economics is
about the allocation of scarce resources. Economics studies allocation mechanisms
such as markets, auctions, networks, etc. Economics can be divided into several
subfields. One important distinction is between Microeconomics, which focuses
on economics from the viewpoint of individual agents, and Macroeconomics, which
considers the economy as a whole. Traditionally, Mathematical Economics is closely
related to Microeconomics.
Some common ideas about Economics (see, e.g., Wikipedia):

• Economics focuses on the behavior and interactions of economic agents
and how economies work.

• Economics is a social science which studies human behavior as a relationship
between ends and scarce means which have alternative uses (Lionel Robbins,
1932).

• The ultimate goal of economics is to improve the living conditions of people
in their everyday life.

During the last century, incentives have come to play an important role in the study
of economic behavior. Many textbooks underemphasize this.
There are also various definitions of Mathematical Economics (for an extensive
overview, again see e.g. Wikipedia). Three important aspects are:

• Mathematical Economics is the application of mathematical methods to
represent theories and analyze problems in economics. Topics of Mathematical
Economics that return in the EOR curriculum are:

– Mathematical Economics 1: Individual Choice, Collective Choice and
Interdependent Choice (Noncooperative Game Theory, Strategic Interaction).
– Mathematical Economics 2: Strategic decision making in Networks and
Markets
– Mathematical Economics 3: Contemporary Challenges in Business and
Economics - From Model and Data to Strategy

• By convention, the applied methods refer to those beyond simple geometry,
such as differential and integral calculus, difference and differential equations,
matrix algebra, mathematical programming, and other computational
methods.

• Computational Economics is considered a branch of Mathematical Economics.

Students of Econometrics & OR at the Vrije Universiteit Amsterdam will recognize
the contents of the first-year program and two parallel courses during the first two
blocks in the second year. Because of the inclusion of Computational Economics and
its importance in policy advice, the staff of the Department of Econometrics regard
themselves as

Mathematical and Computational Economics (MACE)


Figure 1.1 shows the four circles representing the bachelor curriculum Econometrics
& OR at the Vrije Universiteit Amsterdam. The intersections of adjacent circles
form the four blades of an aircraft propeller, which all originate from the white
intersection in the centre. Mathematical Economics is represented by the upward
dark-blue blade at the intersection of the circles Mathematics and Economics.
Computational Economics is in the white part of the propeller, right at the heart of
the curriculum Econometrics & OR.
Problems in Operations Research are economic problems, but to treat Operations
Research as a branch of Economics would not do justice to Operations Research.
Therefore, it is worthwhile to make explicit some important differences in perspective
between Operations Research and Mathematical Economics:

• Operations Research problems are large-scale economic problems, most of them with
a single decision maker whose economic environment is irrelevant, and they
require algorithms to be solved numerically.

• Mathematical Economics problems are economic problems with multiple decision
makers whose economic environment is relevant because they interact
through their choices.

The figure representing the curriculum Econometrics & OR also contains a propeller
blade called Econometrics. To quote Wikipedia’s page on Economics:

Economic theories are frequently tested empirically, largely through the
use of econometrics using economic data. The controlled experiments
common to the physical sciences are difficult and uncommon in economics, and
instead broad data is observationally studied; this type of testing is
typically regarded as less rigorous than controlled experimentation, and the
conclusions typically more tentative. However, the field of experimental
economics (Ed. also called behavioral economics) is growing, and increasing
use is being made of natural experiments.

In addition, econometricians measure economic reality. Is it possible to infer the
effect from economic data? How large is the theoretical effect in reality? Does the
effect differ across genders, cultures and countries? Correct measurement is an
important aspect of economic policy making. Economic theory can be seen as a lens
through which we perceive reality: a perspective on reality. By their nature and design,
lenses distort. In spectacles, lenses are of use because their distortion compensates
for the distortion in natural eyesight. The econometrician is like the optician, who is
the expert in measuring distortions and who has to report these to (mathematical)
economists for economic policy advice.

Figure 1.1: The four circles representing the bachelor curriculum Econometrics & OR
at the Vrije Universiteit Amsterdam. The four circles, starting with the upper left one
and moving clockwise, are Economics, Mathematics, Computer Science and Statistics.

Empirical tests of mathematical economic theories are even more important. Effective
policy intervention in practical economic problems requires evidence-based theories.

1.2 On the use of Mathematics in Economics


Of all social sciences, Economics is the one that relies most heavily on mathematical
modeling. Many renowned economists commented on the use of mathematics:

• Alfred Marshall, cofounder of consumer theory, in 1906:

[I had] a growing feeling ... that a good mathematical theorem dealing
with economic hypotheses was very unlikely to be good economics.

• John Maynard Keynes, in The General Theory (1936):

Too large a proportion of recent ‘mathematical’ economics are merely
concoctions, as imprecise as the initial assumptions they rest on, which
allow the author to lose sight of the complexities and interdependencies of
the real world in a maze of pretentious and unhelpful symbols.

• Robert M. Solow (Nobel laureate 1987):

Economics ... has become a technical subject. Like any technical subject
it attracts some people who are more interested in the technique than the
subject. That is too bad, but it may be inevitable.

• Reinhard Selten (Nobel laureate 1994):

Game theory is for proving theorems, not for playing games.

Many students of Econometrics & OR are attracted to this field because of an
interest in mathematics and its application. To avoid getting carried away by the
technique and forgetting the meaning of what you are trying to accomplish, consider
the following wise words:

• Alfred Marshall, in 1906:

I went more and more on the rules
1. Use mathematics as a shorthand language, rather than
an engine of inquiry.
2. Keep to them till you have done.
3. Translate into English.
4. Then illustrate by examples that are important in real life.
5. Burn the mathematics.
6. If you can’t succeed in 4, burn 3.
This last I did often.

Therefore, if you really understand the theories in this course, then at the end of the
course you should be able to explain to laypeople, most likely including your family, in
simple words and without the use of mathematics, what each theory aims at and what
it implies.

1.3 Descriptive versus Normative Economics


There are two approaches to decision making in Economics:

• Descriptive (or positive): describing "what is" or describing reality; discovering
and explaining how people and organizations make decisions.

• Normative: advocating "what ought to be", "is best", or "wise", leading to
policy recommendations that should be followed. Those who make a living from
normative economics are like merchants selling principles to customers.

The distinction between normative and descriptive is often left unaddressed in
economics textbooks and in the economics profession.
These perspectives imply a conceptual distinction between Econometrics and
Operations Research, and they position Mathematical Economics between the two.

• Econometrics ("statistics") is descriptive because it deals with inferring "what is".

• Operations Research is normative because it aims to "make better decisions".

• (Mathematical) Economics is a mixture of both. On the one hand, it provides
theories to explain economic reality and for econometric testing, and on the other
hand, many theories indicate how to "do better" in economic situations.

This distinction also applies to situations of interdependent decision making, also
called Game Theory. According to science historian Paul Erickson in his 2015 book
"The World the Game Theorists Made", there was already a consensus in the late 1940s
that game theory failed to distinguish between normative and descriptive: it claimed
to be both at the same time. Also, many were already convinced that it was unclear
how Game Theory could ever be prescriptive (similar to a medical doctor’s prescription)
in the sense of providing a "guide to action for would-be game players". This explains
why game theorist Reinhard Selten (Nobel laureate 1994) said that “Game theory is for
proving theorems, not for playing games”. In this course, we also want to provide some
guidance on how to play games and move somewhat beyond traditional Game Theory.

Students and practitioners of Econometrics & OR at the Vrije Universiteit Amsterdam
should reflect on the following aspects of normative economics:

• Companies that "would like to follow the principles" are willing to pay for advice
according to these principles. This willingness to pay determines the monetary
value of your diploma. And you are the merchant selling these principles who
wants to make a living from it.

• It is difficult to persuade people by mathematical proofs alone.

• What is the quality of the economic recommendation for society (the "economy")
if it is not rooted in descriptive economics? To quote John Maynard Keynes once
more: these are as "imprecise as the initial assumptions they rest on".

This manuscript devotes effort to reflecting on the assumptions of theories and
on whether the theoretical predictions are evidence based. These reflections should also
guard you against getting carried away by the technique and losing focus on what
really matters in using mathematics to study and/or resolve economic problems.

1.4 Overview of this course


In this course we discuss three theories of decision making:

• Individual Decision Making (Period 1, Weeks 1-4)

• Collective Decision Making (Period 1, Weeks 5-6)

• Interdependent Decision Making (Game Theory, Strategic Interaction) (Period 2)

The difference between these theories can be framed in terms of the number of
decision makers and objectives, as in the table below.

                         # Objectives: 1                # Objectives: >1
# Decision Makers: 1     Independent Decision Making    Multi Criteria Decision Making
# Decision Makers: >1    Collective Decision Making     Interdependent Decision Making

Individual decision making considers decision problems where one decision maker
tries to optimize one (its own) objective. In collective decision problems, we consider
a society of individuals and try to optimize a joint objective (for example, to obtain
an efficient, fair, etc. allocation or production process). Interdependent decision making
considers a society of individual decision makers who each have their own objective
that they want to optimize. Usually, these individual optimization problems are
intertwined. For example, what is optimal for a firm to do in a market where it sells its
product usually depends on what its competitors do, and the firm should take this into
account when solving its own optimization problem. (Of course, each competitor’s optimal
behavior depends on the behavior of this firm, so all individual decision problems are
intertwined.) Finally, multi-criteria decision making considers one individual decision
maker who makes a trade-off between different criteria in determining its objective
(for example, the People, Planet, Profit principle discussed in the next chapter).
In Period 1 we discuss individual and collective decision making. Period 2 is fully
devoted to interdependent decision making and applications. In this course, we will
not discuss Multi Criteria Decision Making.
Part 1

Individual Choice
Chapter 2

Deterministic Choice among a


Finite number of Alternatives

During the 20th century, axiomatic models dominated economics. An axiom is a
proposition or principle that is assumed without proof and taken as self-evident. An
axiomatic model is a model that derives necessary consequences from a set of independent
and non-conflicting axioms. In this chapter, the ideas of the axiomatic approach are
introduced in the simplest economic setting possible: the canonical or standard form
of deterministic individual choice from a few alternatives. For example, what flavor of
ice cream to choose, or which job candidate to hire.
This chapter discusses the canonical form of the individual decision model based on
preferences and choosing best elements, and the conditions under which this model can
be represented by utility maximization.

2.1 Introduction
People, firms, governments, etc. constantly have to make decisions.

• Do I choose coffee, tea or hot chocolate from the vending machine at the VU?
Preferences differ, and so do the actual choices made.

• What smart phone to buy? How much money to spend? What contract? How
much internal memory? Most of you have different preferences, and so the
models that students bring to class differ as well.

• In elections, what political party do I vote for, or do I abstain?

• In a referendum, do I vote Yes or No, or do I abstain?

• Do I buy cheaper clothes produced in sweatshops where labor conditions are very
poor, or do I buy more expensive clothes produced under decent labor conditions?

• How to spend my income? How to spend my time?

Many companies, NGOs and governments follow the principle of the 3Ps:

3P: People, Planet, Profit

where the third P is Prosperity for NGOs and governments.

• Companies such as Nike and Adidas have to trade off their profit margins against
paying more in order to improve people’s working conditions in "sweatshops",
etc.

• Companies such as Unilever have to trade off their profit margins against investing
more in ecologically friendly palm-tree farms and avoiding turning tropical forests
into such farms.

• The selection committee entrusted with hiring a new employee has to choose one
among several job applicants.

• Hospitals are health-care providers. How should hospitals and other care
providers be managed? How should care be defined and measured?

• From 1927 onwards, the Dutch government contemplated building the A3 highway that
would directly connect Amsterdam and Rotterdam. This highway has direct economic
benefits, but also economic costs in terms of money and environmental degradation.¹

These dilemmas are all examples of choices to be made.

• If the issue is how to decide, then Mathematical Economics and Operations
Research aim to improve decision making by building mathematical models to
support the decision-making process.

• If actual choice behavior is observed, as in economic data sets or click data from
e.g. Google, Amazon and other online companies, these mathematical models
try to explain observed choices and can be used to predict upcoming choices in
similar choice situations.

¹ For the complete history of the A3, see https://www.wegenwiki.nl/A3_(Nederland)

What is common in all choice situations?


In this chapter, we discuss the most widely used model of economic decision making by
an individual decision maker. It consists of the following elements.

• A set of different options, called alternatives, that are available.

• The decision maker has preferences over the different alternatives. These
preferences are represented by a preference relation. Under certain conditions (that
we will discuss) these preferences can be represented by a utility function.

• A decision rule according to which the decision maker chooses from the set of
alternatives based on its preference relation. We assume that the decision maker
wants to maximize a certain objective, which boils down to choosing the best
alternative possible given limited time, information, perception of preferences over
these alternatives, etc.

The mathematical model should be flexible enough to capture a wide range of decision
situations in order to be widely applicable.

2.2 A Choice Model based on Preferences and choosing Best elements

2.2.1 The Set of Feasible Alternatives
The first element of the decision model is the set of alternatives that is available to
the decision maker. The cases mentioned in the introduction of this chapter involve
individuals, companies, governments, NGOs, etc. These are examples of economic
agents. An economic agent is an "actor" (in the sense of someone who acts or does
something). In this chapter, we focus on the abstraction of an economic agent who
faces a single choice among a finite number of feasible alternatives. Feasible means
that the alternative is available to the economic agent, even though some actions may
be illegal, such as speeding in traffic or joining a cartel. The motivation for a finite
number of alternatives is twofold. On the one hand, it captures many relevant real-life
situations with a finite number of alternatives. On the other hand, it avoids the
distraction of technicalities that arise in the case of infinite sets of alternatives.
(This case will be discussed in Chapter 3 and Week 2.)

Notation

• $m \geq 2$ is the (finite) number of feasible alternatives.

• $A = \{a_1, \ldots, a_m\}$ is the set of feasible alternatives,
for example, $a_1$ = coffee, $a_2$ = tea, $a_3$ = hot chocolate.

• $a \in A$ is an arbitrary alternative from the set $A$.

• $d \in A$ is a decision made / predicted about choice.
In the previous example, for someone who drinks coffee the decision or choice
made is $a_1$ and we write $d = a_1$.

Sometimes we want to distinguish only two or three elements and we write $a, b, c \in A$
in case either $A = \{a, b\}$ or $A = \{a, b, c\}$.

Figure 2.1: The fundamental black box of Economics and Psychology. The choice
situation $A$ enters the black box, which outputs the choice $d \in A$.
Choice situations can be visualized as in Figure 2.1. From a descriptive perspective,
the set of feasible alternatives $A$ and the choice $d \in A$ are supposed to be observable
to the economist (after choice $d$ has been made). What is inside the black box, say
the mental processes of individuals or the decision processes within governments and
companies, is supposed to be unobservable. Theory fills in the black box to explain
actual choice and to predict choice for new choice situations $A$. From observing $A$ and
$d$, theories about choice can be tested, or choice data can be used to make inferences
about the processes by which choices are made. This combines theory with Econometrics
in the statistical sense.
From the perspective of normative economics, several alternatives are given and a
choice is required. Choice $d \in A$ is the advised choice from the set $A$. Theory needs
to fill in the black box by coming up with the procedure that leads to advice $d$, and
with the reason why this particular $d$ is advised and not some other alternative. This
is how Operations Research originated in the Royal Navy in Great Britain some years
before the Second World War.

2.2.2 The Preference Relation


Before we define preferences mathematically, we first discuss preferences in ordinary
language. A stereotype is that the British have a preference for tea. Companies, NGOs
and governments express their organization’s preferences in terms of objectives or goals
to be reached.

• Maximization of shareholders’ interests seems a preference for profit maximization.

• Parcel delivery services such as UPS and FedEx want to deliver in the most
cost-effective manner, which seems a preference for cost minimization.

• Executives of hospitals and other care providers express providing health care as
their goal.

• The People, Planet, Profit (3Ps) principle expresses preferences of companies
that want to operate in a socially responsible manner, rather than pursuing only
profits.

• The People, Planet, Prosperity principle expresses the preferences by which
governments evaluate their policies.

Some of these preferences seem easy to quantify in terms of money (e.g. profit, costs);
others, such as health care or the 3Ps, are harder to define and measure. It is self-evident
that people have preferences, although some preferences may not be fully articulated.
Nevertheless, preferences are the natural starting point for economic theories of choice.
The same model is also studied in mathematical psychology.
Economists translate preferences as "at least as good as" and then ask what choice
behavior seems natural for someone who considers "a banana at least as good as an
apple". Stating "at least as good as" is mathematically a binary relation.
Definition 1 A (weak) preference relation of an agent on a set of alternatives $A$
is a binary relation $\succsim$ with the interpretation that the agent considers
alternative $a \in A$ ‘at least as good’ as alternative $b \in A$ whenever $a \succsim b$.

So, a preference relation compares alternatives pairwise. If $a \succsim b$, we also say that
the agent ‘weakly prefers’ alternative $a$ to alternative $b$.

Remark 1 (Popular Mistake!)

Notation $\succsim$ (at least as good as) is often read as $\geq$ (weak inequality).

The following quote is slightly adapted from page 5 of Jehle and Reny (2011):²

"That we use a binary relation to characterize preferences is significant
and worth a moment’s reflection. It conveys the important point that, from
the beginning, our theory requires relatively little of the economic agent it
describes. We require only that economic agents make binary comparisons,
that is, that they only examine two alternatives at a time and make a
decision regarding those two."

Example 1 Suppose $A = \{1, 2, \ldots, 10\}$ are exam grades. Most people prefer higher
grades, which we write as $a \succsim b \iff a \geq b$. It means that this individual indicates
"For me $a \succsim b$ holds if $a \geq b$ also holds and vice versa". Then, $a \succsim b$ and $a \geq b$ are
equivalent statements and we express this with the $\iff$ symbol.
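
As a minimal sketch, the preference relation of Example 1 can be written out in Python. The names `weakly_prefers` and `relation` are illustrative assumptions and not notation from these notes. The sketch lists the relation explicitly as a set of ordered pairs, which is what a binary relation on the set of grades is.

```python
from itertools import product

# Set of feasible alternatives from Example 1: the exam grades 1, ..., 10.
A = list(range(1, 11))


def weakly_prefers(a: int, b: int) -> bool:
    """Illustrative helper (assumed name): a is 'at least as good as' b iff a >= b."""
    return a >= b


# A binary relation on A is a set of ordered pairs (a, b) drawn from A x A.
# Here it contains exactly the pairs for which a is at least as good as b.
relation = {(a, b) for a, b in product(A, repeat=2) if weakly_prefers(a, b)}

print((8, 6) in relation)  # grade 8 is at least as good as grade 6
print((6, 8) in relation)  # grade 6 is not at least as good as grade 8
print(len(relation))       # number of ordered pairs in the relation
```

Writing the relation as a set of pairs makes the "binary comparison" reading concrete: every statement of the form "a is at least as good as b" is one element of the set.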

Quiz 1 Suppose $A = \{1, 2, \ldots, 10\}$ are exam grades.


Is $a \succsim b \iff a > b$ a preference relation?
Is a % b () a b 3 a preference relation?

There are two additional relations that express "indifference" and "better than".
Each is determined by the weak preference relation $\succsim$ and they formalize the notions
of indifference and strict preference.

1. The agent is indifferent and evaluates $a$ and $b$ as equally good if

$a \succsim b$ and $b \succsim a$.

If alternatives $a$ and $b$ are equally good, then we write $a \sim b$. (So, $a \sim b \iff [a \succsim b$
and $b \succsim a]$.) We call $\sim$ the indifference relation.
² Geoffrey Jehle and Philip Reny (2011), Advanced Microeconomic Theory, Addison Wesley.

2. The agent evaluates $a$ as better than $b$ if³

$a \succsim b$ and $b \not\succsim a$.

If alternative $a$ is better than alternative $b$, then we write $a \succ b$. (So, $a \succ b \iff
[a \succsim b$ and $b \not\succsim a]$.) We say that $a \in A$ is 'better than' or 'strictly preferred to'
$b \in A$. We call $\succ$ the strong preference relation or better-than relation.

Remark 2 The indifference and better-than relations can be derived from the weak
preference relation in the same way as, in the context of numbers, $=$ and $>$ can be
derived from $\geq$.
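
The two derived relations can be computed mechanically from the weak preference relation. The sketch below assumes the relation is stored as a set of ordered pairs; the function names `indifferent` and `strictly_better` are illustrative only.

```python
def indifferent(R):
    """Pairs (a, b) such that a is at least as good as b AND b is at least as good as a."""
    return {(a, b) for (a, b) in R if (b, a) in R}


def strictly_better(R):
    """Pairs (a, b) such that a is at least as good as b but NOT the other way around."""
    return {(a, b) for (a, b) in R if (b, a) not in R}


# Exam grades with "a is at least as good as b" defined by a >= b.
A = range(1, 11)
R = {(a, b) for a in A for b in A if a >= b}

I = indifferent(R)
P = strictly_better(R)

# For grades, the derived relations should coincide with = and > on numbers.
print(I == {(a, a) for a in A})
print(P == {(a, b) for a in A for b in A if a > b})
```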

2.2.3 Axioms of Preference Relations


In this section, we will introduce four axioms for preference relations and discuss their
interpretation.

On page 5, Jehle and Reny (2011) motivate the approach that is common
in economics as follows: "Preferences are characterized axiomatically. In
this method of modelling, as few meaningful and distinct assumptions as
possible are set forth to characterize the structure and properties of
preferences. The rest of the theory then builds logically from these axioms, and
predictions of behavior are developed through the process of deduction.
These axioms of choice are intended to give formal mathematical
expression to fundamental aspects of choice behavior and attitudes towards
the objects of choice. Together, they formalize the view that the economic
agent can choose and that choices are consistent in a particular way."

We first state each axiom, and then discuss it using arguments from Jehle and Reny
(2011).

Axiom 1 (Complete) Preference relation $\succsim$ on $A$ is complete if for each pair
$a, b \in A$ it holds that

$a \succsim b$ or $b \succsim a$, or both.

Interpretation: A preference relation is complete if any two alternatives can be
compared to each other.
³ $[b \not\succsim a]$ means [NOT $b \succsim a$].

Quiz 2 Consider exam grades $A = \{1, 2, \ldots, 10\}$ and $a \succsim b \iff a \geq b$. Is this
preference relation complete? What if we use $>$ instead of $\geq$?

Page 5 of Jehle and Reny (2011): "Axiom 1 formalizes the notion that
the economic agent can make comparisons, that is, that he has the ability
to discriminate and the necessary knowledge to evaluate alternatives. It
says the economic agent can examine any two distinct alternatives $a$ and $b$
and decide whether $a$ is at least as good as $b$, or $b$ is at least as good as $a$,
or both (meaning indifference)."
Page 7 of Jehle and Reny (2011): "Using the two supplementary relations
$\sim$ and $\succ$, we can establish something very concrete about the economic
agent's ranking of any two alternatives if $\succsim$ on $A$ is complete. For any pair
$a$ and $b$, exactly one of three mutually exclusive possibilities holds: $a \succ b$,
or $b \succ a$, or $a \sim b$. To this point, we have simply managed to formalize the
requirement that preferences reflect an ability to make choices and display
a certain kind of consistency."

Before stating the next axiom, recall the following property of numbers that you
are familiar with: $a \geq b$ and $b \geq c$ implies $a \geq c$. This property is called transitivity
and it can be defined for preference relations.

Axiom 2 (Transitive) Preference relation $\succsim$ on $A$ is transitive if for each triple
$a, b, c \in A$, it holds that:

if [$a \succsim b$ and $b \succsim c$], then $a \succsim c$.

So, transitivity expresses some idea of 'consistent' preferences in the sense that if
you prefer alternative $a$ to alternative $b$, and you prefer alternative $b$ to alternative $c$,
then you should also prefer alternative $a$ to alternative $c$. (Question: Do you expect
preferences of an agent to satisfy this condition?)

Quiz 3 Consider exam grades $A = \{1, 2, \ldots, 10\}$ and $a \succsim b \iff a \geq b$. Is this
preference relation transitive? What if we use $>$ instead of $\geq$?

Pages 5-6 of Jehle and Reny (2011): "Axiom 2 gives a very particular
form to the requirement that the economic agent's choices be consistent.
Although we require only that the economic agent be capable of comparing
two alternatives at a time, the assumption of transitivity requires that those
pairwise comparisons be linked together in a consistent way. At first brush,
requiring that the evaluation of alternatives be transitive seems simple and
only natural. Indeed, were they not transitive, our instincts would tell
us that there was something peculiar about them. Nonetheless, this is a
controversial axiom. Experiments have shown that in various situations,
the choices of real human beings are not always transitive. Nonetheless, we
will retain it in our description of the economic agent, though not without
some slight trepidation.⁴
Axioms 1 and 2 together imply that the economic agent can completely
rank any finite number of elements in the set of feasible alternatives $A$, from
best to worst, possibly with some ties."

Axiom 3 (Asymmetric) Preference relation $\succsim$ on $A$ is asymmetric if for every
pair of alternatives $a, b \in A$, it holds that:

if [$a \succsim b$ and $a \neq b$], then $b \not\succsim a$.

Quiz 4 Suppose $A = \{1, 2, \ldots, 10\}$ are exam grades. Is $a \succsim b \iff a > b$ an
asymmetric preference relation?

The proofs of the following two propositions are left as exercises.

Proposition 5 If $\succsim$ on $A$ is complete and asymmetric, then for every pair $a, b \in A$,
$a \neq b$, either $a \succ b$, or $b \succ a$.

This result states that for every pair of different alternatives, one of them must be better
than the other. Therefore, asymmetry rules out indifference between different alternatives.

Axiom 4 (Reflexive) Preference relation $\succsim$ on $A$ is reflexive if $a \succsim a$ for all $a \in A$.

Proposition 6 If $\succsim$ on $A$ is complete, then $\succsim$ on $A$ is reflexive.
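
For a finite set of alternatives, each of the four axioms can be checked by brute force. The sketch below assumes the relation is stored as a set of ordered pairs and follows the definitions of Axioms 1 to 4 as stated above; the function names are illustrative and not part of the notes. Running it on the grade examples is one way to check your answers to Quizzes 2 to 4.

```python
from itertools import product


def is_complete(A, R):
    """Axiom 1: for each pair a, b (including a == b), a R b or b R a."""
    return all((a, b) in R or (b, a) in R for a, b in product(A, repeat=2))


def is_transitive(A, R):
    """Axiom 2: a R b and b R c imply a R c."""
    return all((a, c) in R
               for a, b, c in product(A, repeat=3)
               if (a, b) in R and (b, c) in R)


def is_asymmetric(A, R):
    """Axiom 3 as stated in the notes: a R b and a != b imply NOT b R a."""
    return all((b, a) not in R
               for a, b in product(A, repeat=2)
               if a != b and (a, b) in R)


def is_reflexive(A, R):
    """Axiom 4: a R a for every alternative a."""
    return all((a, a) in R for a in A)


A = list(range(1, 11))
geq = {(a, b) for a in A for b in A if a >= b}
gt = {(a, b) for a in A for b in A if a > b}

for name, R in [("a >= b", geq), ("a > b", gt)]:
    print(name,
          "complete:", is_complete(A, R),
          "transitive:", is_transitive(A, R),
          "asymmetric:", is_asymmetric(A, R),
          "reflexive:", is_reflexive(A, R))
```

The checks enumerate all pairs or triples, so they are only meant for small sets such as the ten exam grades.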

Example 2 Suppose you can choose among two identical cans of Coke, denoted $C_1, C_2$,
and one can of Origina, $O$. Then we may write $A = \{C_1, C_2, O\}$. It seems self-evident
that $C_1 \sim C_2$, or [$C_1 \succsim C_2$ AND $C_2 \succsim C_1$]. We might have modelled this situation as
a choice between Coke, denoted $C$, and Origina: $\{C, O\}$. Given two identical cans of
Coke, we might say $C \succsim C$ is true when we compare these two cans. This is a formal
(and clumsy?) way of saying that an alternative is always at least as good as itself.
Or, that a can of Coke cannot be worse than an identical can of Coke.

⁴ One interpretation of trepidation is "a feeling of fear or anxiety about something that may happen."
It has many synonyms, of which worry, tension and uneasiness are most appropriate.

Example 3 (An irreflexive preference relation) As an example of a preference
relation that fails to be reflexive, consider the exam grades $A = \{1, 2, \ldots, 10\}$ once
more and the preference relation $a \succsim b \iff a > b$. Then, for grade $b = a$, we
cannot compare $a$ and $b$ ($= a$). Of course, this irreflexive preference relation is
counterintuitive.

Quiz Suppose that preference relation $\succsim$ is complete, transitive, asymmetric and
reflexive. Which of these properties are then satisfied by the indifference relation $\sim$
and the better-than relation $\succ$?

2.2.4 The Optimization Principle


The optimization principle of economics is another axiom and it states that economic
agents "try to do what is best". The idea of "best" requires a yardstick to express
what is best, which is the preference relation $\succsim$ on $A$. It translates into a principle
of choosing a "best" element in $A$ according to $\succsim$ on $A$. Combining axioms for the
preference relation $\succsim$ on $A$ and the optimization principle allows us to derive sufficient
conditions for existence of a best element, and somewhat tighter sufficient conditions
for uniqueness of a best element.

Page 4 of Jehle and Reny (2011): "Finally, the model is 'closed' by
specifying some behavioral assumption. This expresses the guiding principle
the economic agent uses to make final choices and so identifies the ultimate
objectives in choice. It is supposed that the economic agent seeks to identify
and select an available alternative that is most preferred in the light of his
personal tastes."

Axiom 5 (Optimization Principle) An economic agent is an 'actor' (someone who
acts) who tries to do what is best in the given situation.

How does an agent make a choice, given its preference relation? There are several
ways to define 'what is best', or 'what is optimal'. We assume (as usual in economics)
that an agent chooses a 'best element'. Therefore, we will now introduce the set of
best elements. Because of the possible multiplicity of best elements, we need to define
best choices in terms of a set.

Definition 2 Alternative $a^* \in A$ is a best element in preference relation $\succsim$ if $a^* \succsim a$
for all $a \in A$.

So, a best element is an element that is at least as good as every (other) element.
We introduce the following notation.
Notation

$B(A) \subseteq A$ denotes the best choice set of $\succsim$ on $A$ and it consists of all best
elements in $A$:

$B(A) = \{a^* \in A \mid a^* \succsim a \text{ for all } a \in A\}$
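
For a finite set, the best choice set can be computed directly from its definition. The sketch below assumes the relation is stored as a set of ordered pairs; the function name `best_choice_set` and the drinks example are hypothetical illustrations, not taken from the notes.

```python
def best_choice_set(A, R):
    """All a_star in A that are at least as good as every a in A."""
    return {a_star for a_star in A if all((a_star, a) in R for a in A)}


# Exam grades with "at least as good as" defined by >=.
grades = list(range(1, 11))
geq = {(a, b) for a in grades for b in grades if a >= b}
print(best_choice_set(grades, geq))

# Hypothetical example with a tie: tea and coffee are equally good,
# and both are better than cocoa.
drinks = ["tea", "coffee", "cocoa"]
R = {(a, a) for a in drinks}                       # reflexive pairs
R |= {("tea", "coffee"), ("coffee", "tea")}        # indifference
R |= {("tea", "cocoa"), ("coffee", "cocoa")}       # strict preference over cocoa
print(best_choice_set(drinks, R))
```

The second example shows why the definition is stated in terms of a set: with indifference at the top there can be more than one best element.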

Remark 3 Alternative definitions to express "best" are possible and are proposed in
the literature. One of these alternative definitions, a maximal element, will be studied
in an exercise. What is important to realize is that there often are many competing
definitions one may come up with, but textbooks often only discuss one and do not ask
you to use your own imagination.

Quiz 5

1. Derive $B(A)$ if $A = \{a, b, c\}$ and a b, b c and a c. Explain.

2. Derive $B(A)$ if $A = \{a, b\}$ and a b. Explain.

3. Derive $B(A)$ if $A = \{a, b, c\}$ and a b, b c and c a. Explain.

4. Derive $B(A)$ if $A = \{a_1, a_2, a_3, a_4\}$ and a1 a2 , a2 a3 , a3 a1 , a4 a1 ,
a4 a2 , a4 a3 . Explain.

(The proofs of the following two propositions are left as exercises.)



Figure 2.2: The model of preferences fills in the black box of Economics.
(Diagram: choice situation $A$ enters the black box, which consists of $\succsim$ on $A$ complete
and transitive plus the Optimization Principle; the output is the choice $d \in B(A)$.)

Proposition 7 (Existence) If $A$ is finite and $\succsim$ on $A$ is both complete and transitive,
then $B(A) \neq \emptyset$.

Remark 4 Without completeness and transitivity, existence of a best $a \in A$ is not
guaranteed. Every theory based upon preference relation $\succsim$ on $A$ and the choice of a
best $a \in A$ cannot get around this!

As mentioned, the axioms of completeness and transitivity allow the economic agent
to rank all the alternatives in the set of feasible alternatives. After ranking all
alternatives from best to worst, the first alternative is a best element. If the economic agent is
indifferent between the first and second alternative of his ranked order of alternatives,
there are at least two best alternatives, namely the first ranked alternative and the
second ranked one. The implication is that, under Axioms 1 and 2 and the Optimization
Principle, the simple model of preferences allows an explanation / prediction of
choice behavior, or allows us to make a normative recommendation on which alternative
the agent should choose. As Figure 2.2 illustrates, the model of preferences fills in the
fundamental black box of Figure 1.1.
Adding the axiom of asymmetry, which excludes indifference, to the axioms of the
previous proposition is sufficient to arrive at a unique best element.

Proposition 8 (Uniqueness) If $A$ is finite and $\succsim$ on $A$ is complete, transitive and
asymmetric, then $B(A)$ is a singleton set, i.e., $|B(A)| = 1$, where $|\cdot|$ denotes the number of
elements of a finite set.

Remark 5 In case of existence and uniqueness, we might redefine $B(A)$ as a best
choice function $b(A)$. Then, our theory makes a unique prediction about the choice
behavior of the economic agent.
A unique prediction, or a unique recommendation, can be seen as a good property
for any model or theory to have. Theories with multiple predictions can be said to
be indecisive. Or, about the supposed mutability of opinions held by John Maynard
Keynes: One of the jokes is that if Parliament asked six economists for an opinion on
any subject they always got seven answers. Two from John Maynard Keynes. I think
when we economists examine the seven answers we find that the discrepancies among
them are not as great as the layman thinks.

Remark 6 As a final remark, for any preference relation $\succsim$ on $A$, we can apply the
idea of a best element and the best choice set to every non-empty subset $A' \subseteq A$:

$a^* \succsim a$ for all $a \in A'$ and $B(A') = \{a^* \in A' \mid a^* \succsim a \text{ for all } a \in A'\}$.

Then, $B$ can be seen as a correspondence or multi-function that maps arbitrary subsets
$A'$ of $A$ into subsets of its argument $A'$. The axioms of completeness and transitivity
ensure this correspondence is well-defined in the sense that it assigns a non-empty
subset of best elements to every non-empty subset $A'$. If additionally asymmetry also holds, then
the existence and uniqueness of a best element for every subset $A'$ of $A$ implies that
we may treat $B : 2^A \to A$ as a function. We may write $b(A')$, where $b : 2^A \to A$ is
a function,⁵ to express that the best choice correspondence boils down to a function.
Figure 2.3 illustrates the uniqueness result. Although the concept of the correspondence
is important in Mathematical Economics, it is outside the scope of this course.

Remark 7 The Optimization Principle seems self-evident. One may ask whether this
is the only manner to take decisions. Herbert Simon became Nobel laureate in 1978
for his seminal work on bounded rationality. One idea that he is well known
for is the concept of satisficing, which is a combination of satisfy and suffice. Satisficing
is a decision-making strategy or cognitive heuristic that entails searching through the
available alternatives until an acceptability threshold is met. The following example
from Wikipedia's page on satisficing may be clarifying. Consider a person whose task
is to sew a patch onto a pair of jeans. The best needle to do the threading is a 4 inch
long needle. This needle is hidden in a haystack along with 1000 other needles varying
in size from 1 inch to 6 inches. Satisficing claims that the first needle that can sew on
the patch is the one that should be used. Spending time searching for that one specific
needle in the haystack is a waste of energy and resources.⁶

Figure 2.3: The canonical model of preferences with a unique prediction.
(Diagram: choice situation $A$ enters the black box, which consists of $\succsim$ on $A$ complete,
transitive and asymmetric plus the Optimization Principle; the output is the choice $d = b(A)$.)

⁵ Recall your course on Probability Theory. Events are modelled as subsets $E$ of the universe $U$,
which is a set of all events possible. For subsets of events probabilities were assigned. Formally, the
probability function $p : U \to [0, 1]$ assigns probability $p(E)$ to each subset $E \subseteq U$. The function $b$
has a similar interpretation but its image is an element of the set $A$. The correspondence $B$ assigns
subsets of $A$ to every subset of $A$.

2.3 A Choice Model of Utility Maximization


2.3.1 The Utility Function
Utility maximization is the cornerstone of economic modeling. But what is utility?
Usually, we do not give an interpretation to utility, but we do use utility functions.
A utility function is a quantitative representation of a preference relation in the
sense that it assigns, to every alternative that is strictly preferred to alternative $a$,
a higher 'utility value' than the utility value of alternative $a$.

Definition 3 A utility (or objective) function induced by $\succsim$ on $A$ is a function
$u : A \to \mathbb{R}$ such that for all $a, b \in A$: $u(a) \geq u(b) \iff a \succsim b$.
⁶ It is ironic that every algorithm in numerical optimization, either in Operations Research or
maximum likelihood estimation in Econometrics, is based upon satisficing, because algorithms stop
if the improvement in the objective between two subsequent iterations is less than some pre-specified
$\varepsilon > 0$.

Remark 8 As a consequence of a finite set of alternatives $A = \{a_1, \ldots, a_m\}$, the utility
function consists of $m$ utility numbers that we may write as the $m$-dimensional vector
$u = (u_1, \ldots, u_m)$, where $u_1 = u(a_1)$, $u_2 = u(a_2)$, etc.
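
One way to obtain such a vector of utility numbers from a complete and transitive relation on a finite set is to count, for each alternative, how many alternatives it is at least as good as. This counting construction is a standard device and is an assumption of this sketch, not something stated in the notes; the function names and the drinks example are hypothetical.

```python
def counting_utility(A, R):
    """u(a) = number of alternatives b in A with a at least as good as b."""
    return {a: sum((a, b) in R for b in A) for a in A}


def represents(u, A, R):
    """Check Definition 3: u(a) >= u(b) if and only if a is at least as good as b."""
    return all((u[a] >= u[b]) == ((a, b) in R) for a in A for b in A)


# Exam grades with "at least as good as" defined by >=.
grades = list(range(1, 11))
geq = {(a, b) for a in grades for b in grades if a >= b}
u_grades = counting_utility(grades, geq)
print(u_grades)
print(represents(u_grades, grades, geq))

# Hypothetical drinks example: tea and coffee equally good, both better than cocoa.
drinks = ["tea", "coffee", "cocoa"]
R = {(a, a) for a in drinks}
R |= {("tea", "coffee"), ("coffee", "tea"), ("tea", "cocoa"), ("coffee", "cocoa")}
u_drinks = counting_utility(drinks, R)
print(u_drinks)
print(represents(u_drinks, drinks, R))
```

If the relation is not complete or not transitive, `represents` can return False, which is in line with the necessity claim made later in this section.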

Example 4 Consider exam grades $A = \{1, 2, \ldots, 10\}$ and $a \succsim b \iff a \geq b$. Then,
$u(a) = a$ is a utility function, but also $u(a) = 10a$, $u(a) = \ln a$, or $u(a) = \sqrt{a}$. (Verify
this.)

The example shows that $\succsim$ on $A$ does not induce just one utility function, but many.
Even worse, suppose the function $f : \mathbb{R} \to \mathbb{R}$ is (strictly) increasing, then

$a \succsim b \iff u(a) \geq u(b) \iff f(u(a)) \geq f(u(b))$

implies that both $u$ and $f \circ u$ are utility functions induced by $\succsim$ on $A$. This simple
argument implies the following result.

Proposition 9 If preference relation $\succsim$ on $A$ induces the utility function $u$ and
$f : \mathbb{R} \to \mathbb{R}$ is increasing, then $\succsim$ on $A$ also induces the utility function $f \circ u$ on $A$.

Example 5 Consider exam grades $A = \{1, 2, \ldots, 10\}$ and $a \succsim b \iff a \geq b$. Then,
$u(a) = a$ is a utility function and for $f(z) = \sqrt{z}$ we obtain the utility function
$f(u(a)) = \sqrt{a}$.
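
The claim that an increasing transformation leaves the ranking unchanged can be checked numerically on the exam grades. The sketch below compares the ranking induced by $u(a) = a$ with the rankings induced by the transformations of Example 4; the helper `same_ranking` and the non-monotone function `g` are illustrative additions.

```python
import math

A = list(range(1, 11))
u = {a: float(a) for a in A}  # u(a) = a


def same_ranking(u, v, A):
    """True if u and v order every pair of alternatives in the same way."""
    return all((u[a] >= u[b]) == (v[a] >= v[b]) for a in A for b in A)


# Increasing transformations f from Example 4 (all defined for z >= 1).
transforms = {
    "10 z": lambda z: 10 * z,
    "ln z": math.log,
    "sqrt z": math.sqrt,
}
for name, f in transforms.items():
    v = {a: f(u[a]) for a in A}
    print(name, same_ranking(u, v, A))


# A transformation that is not increasing generally changes the ranking.
def g(z):
    return -(z - 5) ** 2


w = {a: g(u[a]) for a in A}
print("-(z - 5)^2", same_ranking(u, w, A))
```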

If multiple utility functions can represent the same preference relation, and if the
image of utility functions can differ too, the question arises what is utility. Utility
functions are a mathematical construct to replace $\succsim$ by a function without a natural
scale of measurement:

1. utility does not have a natural unit like profit (in Euros).

2. utility does not have a natural zero like profit.

3. utility in this lecture is an ordinal measure and not cardinal such as profit.

Example 6 2000 euro is twice 1000 euro. For utility this does not hold. Although
$u(a) = a$ suggests $u(2a) = 2u(a)$, $u(a) = \sqrt{a}$ suggests $u(2a) = \sqrt{2}\,u(a) \neq 2u(a)$.

Quiz 6

1. Derive $u(a)$, $u(b)$, and $u(c)$ in case: $a \succ b$, $b \succ c$, and $a \succ c$ (transitive).

2. Do utilities $u(a)$, $u(b)$, and $u(c)$ exist in case: $a \succ b$, $b \succ c$, and $c \succ a$
(intransitive)? Explain the problem.

3. Do utilities u (a), u (b), and u (c) exist: a b and c is incomparable with a and
b (incomplete). Explain the problem.

4. Does a utility function $u : A \to \mathbb{R}$ exist in case of exam grades $A = \{1, 2, \ldots, 10\}$
and $a \succsim b \iff a > b$ (irreflexive)? Explain the problem.

Preference relation $\succsim$ on $A$ being complete and transitive is necessary for the
existence of a utility function.
By an appropriate choice of $f$ it is possible to normalize every utility function such
that the worst alternative has utility 0 and the best alternative has a utility of 100,
i.e., $u : A \to [0, 100]$.⁷
coffee. What utility number $u(b)$ would you assign to this person?
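
The normalization to $[0, 100]$ mentioned above can be sketched with one possible choice of $f$, the affine map $f(z) = 100\,(z - u_{\min})/(u_{\max} - u_{\min})$, which is increasing whenever $u_{\max} > u_{\min}$. This particular $f$ and the function name `normalize` are assumptions of the sketch; the notes only state that an appropriate $f$ exists.

```python
import math


def normalize(u, top=100.0):
    """Rescale utilities so the worst alternative gets 0 and the best gets `top`."""
    lo, hi = min(u.values()), max(u.values())
    if hi == lo:
        raise ValueError("all alternatives are equally good; nothing to rescale")
    return {a: top * ((x - lo) / (hi - lo)) for a, x in u.items()}


# Exam grades with the utility function u(a) = sqrt(a) from Example 4.
A = list(range(1, 11))
u = {a: math.sqrt(a) for a in A}
v = normalize(u)

print(v[1], v[10])
# The ranking of alternatives is unchanged by the rescaling.
print(all((u[a] >= u[b]) == (v[a] >= v[b]) for a in A for b in A))
```

Passing `top=10.0` or `top=1.0` gives the other normalizations mentioned in the footnote; the choice of scale carries no meaning for ordinal utility.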

2.3.2 Utility maximization


If a utility function exists, then choosing a best element is equivalent to maximizing
utility. In essence, it is a translation of the Optimization Principle to utility functions
based upon the definition of a best element.
Suppose $\succsim$ on $A$ induces the utility function $u$. Then, a best element of $\succsim$ can be
found as

$a^* = \arg\max_{a \in A} u(a)$,

or the equivalent formulation⁸

$a^* = \arg\max_{a} u(a)$ s.t. $a \in A$.

⁷ The normalization to $[0, 100]$ is as arbitrary as a normalization to $[0, 10]$ or $[0, 1]$. As we will see
later, a natural (and cardinal) interpretation for the normalization $[0, 1]$ exists for expected utility.
⁸ "s.t." stands for "subject to". It expresses that $a$ cannot be chosen freely, but that it is constrained
to some set, here the set of feasible alternatives.

Figure 2.4: The fundamental equivalence between the models of preferences and utility
maximization for finite sets of alternatives.
(Diagram: $\succsim$ on $A$ complete and transitive + the Optimization Principle $\iff$ Maximize $u : A \to \mathbb{R}$.)

Remark 9 Utility functions and utility maximization are far from being self-evident.
No economist has ever met a person who could express his preferences by such a func-
tion. Nevertheless, utility maximization is the ultimate consequence of the model with
complete and transitive preferences and the Optimization Principle. Most people have
little objection against the model with preferences.

Recall that we can express the utility function over a finite set by a vector of utilities.
It is left as an exercise to make computation of the utility maximizing alternative
operational through an optimization program.
Figure 2.4 summarizes the main result of this section: the equivalence between the
model of preferences and utility maximization.
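
As a starting point for that exercise, the following sketch finds all utility maximizing alternatives when the utility function is given as a vector. It is one simple possibility under the assumption that alternatives and utilities are stored in two parallel lists; the function name and the utility numbers are hypothetical.

```python
def utility_maximizers(alternatives, utilities):
    """Return every alternative whose utility equals the maximum of the vector."""
    if len(alternatives) != len(utilities):
        raise ValueError("need exactly one utility number per alternative")
    best = max(utilities)
    return [a for a, x in zip(alternatives, utilities) if x == best]


# Hypothetical utility vector u = (u_1, u_2, u_3) for three drinks.
alternatives = ["coffee", "tea", "hot chocolate"]
print(utility_maximizers(alternatives, [0, 2, 1]))

# With a tie at the top, the arg max is a set with more than one element.
print(utility_maximizers(alternatives, [1, 1, 0]))
```

Returning a list rather than a single element mirrors the best choice set $B(A)$: uniqueness is only guaranteed when indifference is ruled out.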

2.4 Inference of Preferences


2.4.1 Revealed preferences
Figures 2.2 and 2.3 fill in the fundamental black box of Economics. Recall that the
observables consist of feasible alternatives and a choice, while the unobservables consist
of preferences, the rule of choosing and the other axioms. The fundamental inference
problem is whether choice data allow economists to reconstruct or infer unobservable
preferences if we assume that the model of Figure 2.3 describes how the economic agent
actually decides.⁹ Choice data consist of observing several subsets of the set of feasible
alternatives and choices made from these subsets. For example, Bob observes the choice
of Ann several times in front of the VU vending machine: one time coffee, tea and hot
chocolate were all available, at other times one of these products had run out of stock and
was unavailable. Let us work out this simple example.

⁹ Note that inference differs from testing the theory, which is treated in Section 2.4.

Example 7 Consider the VU vending machine with $a_1$ = coffee, $a_2$ = tea, and $a_3$ =
hot chocolate. Suppose Ann has complete, transitive, and asymmetric preferences and
likes tea best and coffee least. Thus, $A = \{a_1, a_2, a_3\}$ and $\succsim$ on $A$ is given by
$a_2 \succ a_1$, $a_2 \succ a_3$, and $a_3 \succ a_1$. For the following three binary subsets, the choice model
of preferences predicts

$b(\{a_1, a_2\}) = b(\{a_2, a_3\}) = a_2$ and $b(\{a_1, a_3\}) = a_3$.

Consider choice data that are consistent with our model. Then, we have the following
observations:

choice $a_2$ from $\{a_1, a_2\}$ and $\{a_2, a_3\}$, and choice $a_3$ from $\{a_1, a_3\}$.

The choice $a_2$ from $\{a_1, a_2\}$ reveals a weak preference for $a_2$ over $a_1$, i.e., $a_2 \succsim a_1$. As
outside observers, we must allow the possibility of indifference between binary choices.
We may only write $\succ$ if we, as outsiders, are willing to interpret our data as if Ann's
preference relation satisfies asymmetry. Similarly, choice $a_2$ from $\{a_2, a_3\}$ reveals
$a_2 \succsim a_3$ and choice $a_3$ from $\{a_1, a_3\}$ reveals $a_3 \succsim a_1$. From this inference, we must conclude
that we can rank Ann's unobservable preferences as $a_2 \succsim a_3 \succsim a_1$ and that Ann has
transitive preferences over $A$, which is indeed what we would find if we asked Ann to write down
her preferences directly. If we are willing to impose asymmetry to interpret Ann's
choice, we would obtain her true strict preferences. But without asking Ann, it remains
an arbitrary and subjective interpretation of the choice data.

More generally, consider the observation in which $d'$ is chosen from $A' \subseteq A$. The
first issue to address is how to interpret observation $d' \in A'$ if we assume $\succsim$ on $A$
describes the agent's preferences, these preferences are complete and this agent follows
the Optimization Principle. (We do not yet impose other axioms.) The answer is
trivial. If $A'$ and choice $d' \in A'$ are observed, the agent chose $d'$ because $d' \in A'$ must
have been a best element of $A'$ according to our theory. By definition, $d' \succsim a$ for all
$a \in A'$. Then, invoking the definition of a best element on choice $d' \in A'$ is the only way
to interpret $d' \in A'$ that is consistent with the theory that we presuppose holds true.
This interpretation of choice data is also called an axiom, and to many economists it
is self-evident.

Axiom 6 (Directly-revealed preference) Let $A' \subseteq A$. If the economic agent chooses
$d' \in A'$, then his choice $d'$ directly reveals a weak preference of $d'$ over any other
$a \in A'$.

Remark 10 Why directly revealed? Because any other $a \in A'$ was available instead
of $d'$ and in the direct pairwise "confrontation" between $d'$ and this rivalling $a \in A'$, $d'$
weakly beats $a \in A'$. Indirectly-revealed preferences are left as an exercise.

As the example of Ann and Bob shows, inference of the entire preference relation
$\succsim$ on $A$ requires observations of choices for all binary subsets $\{a, b\} \subseteq A$, $a, b \in A$ and
$b \neq a$. In the simple example, three binary subsets were sufficient. For larger sets $A$,
the principle of inference is similar.

Observed choice $a$ from the binary set $\{a, b\}$ implies $a \succsim b$.

One needs $\frac{1}{2} m (m - 1)$ observations, where $m$ is the number of alternatives in $A$.

Most important, if we observe enough choice data from an economic agent who chooses
according to the model of Figure 2.2, then we can reconstruct the exact and true
preference relation of this economic agent from these choice data. This result was first
shown in Arrow (1959). Kenneth Arrow is Nobel laureate of 1972.
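
The inference principle above can be sketched for Ann's choices at the vending machine. The data structure (a dictionary from menus to chosen alternatives) and the function name `revealed_relation` are assumptions of this sketch, not notation from the notes.

```python
from itertools import combinations


def revealed_relation(observations):
    """Apply Axiom 6: the chosen alternative is weakly preferred to every
    alternative that was available in the same menu.

    The pair (chosen, chosen) is included as well, which is harmless because
    every alternative is at least as good as itself.
    """
    revealed = set()
    for menu, chosen in observations.items():
        if chosen not in menu:
            raise ValueError("the chosen alternative must belong to the menu")
        for a in menu:
            revealed.add((chosen, a))
    return revealed


A = ["coffee", "tea", "hot chocolate"]

# Ann's observed choices from the three binary menus of Example 7.
observations = {
    frozenset({"coffee", "tea"}): "tea",
    frozenset({"tea", "hot chocolate"}): "tea",
    frozenset({"coffee", "hot chocolate"}): "hot chocolate",
}

# All binary subsets of A: there are m (m - 1) / 2 of them.
binary_menus = [frozenset(pair) for pair in combinations(A, 2)]
m = len(A)
print(len(binary_menus) == m * (m - 1) // 2)
print(all(menu in observations for menu in binary_menus))

R = revealed_relation(observations)
print(sorted(R))
```

The revealed pairs are weak preferences only; reading them as strict preferences requires the additional assumption of asymmetry, as discussed in Example 7.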

2.4.2 Independence of Irrelevant Alternatives


One way to test the choice model of preferences is tightly connected to the axiom of
independence of irrelevant alternatives (IIA). John Nash (Nobel laureate 1994)
introduced this axiom in 1950 in his seminal study of bargaining theory. In 1959, Kenneth
Arrow (Nobel laureate 1972) showed that the preference relation $\succsim$ on $A$ can be
reconstructed from choice data obtained from $B(A)$ if and only if the choice
function satisfies IIA. The axiom of IIA has been used to design laboratory experiments
to test the choice model of preferences by, among others, Amos
Tversky. In this section, we explain the axiom of IIA and Arrow's result for inferring the
preference relation $\succsim$ on $A$. The discussion of testing is postponed to a later section.
For explanatory reasons, we impose that $|B(A')| = 1$ for all $A' \subseteq A$. Then we
can speak of the best choice function $b : 2^A \to A$, which assigns to every
subset of alternatives $A' \subseteq A$ the unique best element $b(A') \in A'$. Here,
$2^A = \{A' \mid A' \subseteq A\}$.

Definition 4 The best choice function $b : 2^A \to A$ satisfies independence of
irrelevant alternatives (IIA) if for all non-empty subsets $S \subseteq T \subseteq A$ such that $b(T) \in S$,
we have $b(S) = b(T)$.

Interpretation: If, given the choice among all alternatives from a set $T$, you choose
an element of a subset $S \subseteq T$, then you also choose this element if you can only choose
from the alternatives in $S$. (The alternatives that you did not choose when you had
the possibility to choose them are irrelevant.)

Example 8 Consider $A = \{a, b, c\}$ and the complete, transitive and asymmetric
preference relation $\succsim$ on $A$ given by $a \succ b \succ c$. Obviously, $b(A) = a$. In the terminology
of the definition, consider $T = A$ and $S = \{a, b\}$. Then, it is easy to verify that all
conditions of the definition are fulfilled: $S \subseteq T \subseteq A$ and $b(T) = a \in S$. Next, $b(S) = a$
is indeed equal to $b(T)$, which is what the definition imposes.
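
Definition 4 can be checked by brute force over all pairs of non-empty subsets $S \subseteq T \subseteq A$. The sketch below does this for the ranking of Example 8 and for a hypothetical choice function that switches its choice when a third alternative is added; all function names are illustrative.

```python
from itertools import combinations


def nonempty_subsets(A):
    """Yield every non-empty subset of A as a frozenset."""
    items = list(A)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def best_choice(S, rank):
    """Unique best element of S, where a lower rank number means a better alternative."""
    return min(S, key=rank.__getitem__)


def satisfies_iia(A, b):
    """Check b(S) == b(T) for all non-empty S subset of T subset of A with b(T) in S."""
    subsets = list(nonempty_subsets(A))
    return all(b(S) == b(T)
               for T in subsets
               for S in subsets
               if S <= T and b(T) in S)


A = ["a", "b", "c"]
rank = {"a": 0, "b": 1, "c": 2}  # the strict ranking a, then b, then c


def b_from_ranking(S):
    return best_choice(S, rank)


def b_with_decoy(S):
    # Hypothetical behavior: chooses "b" when all three alternatives are
    # available, but follows the ranking on every smaller menu.
    if S == frozenset({"a", "b", "c"}):
        return "b"
    return best_choice(S, rank)


print(satisfies_iia(A, b_from_ranking))
print(satisfies_iia(A, b_with_decoy))
```

The second choice function picks `"b"` from the full set but `"a"` from `{"a", "b"}`, which is exactly the kind of pattern Definition 4 rules out.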

John Nash considered IIA as self-evident. Is it self-evident? Mathematicians
consider IIA as the axiom that is equivalent to optimization. The following result was first
shown in Arrow (1959).

Proposition 10 Let $\succsim$ on $A$ be complete, transitive and asymmetric.
The weak preference relation $\succsim$ on $A$ can be reconstructed from choice data generated
by an individual who chooses according to best choice function $b : 2^A \to A$ if and only
if $b$ satisfies IIA.

As a …nal remark, the set valued de…nition (i.e. not assuming that there is always
a unique best element) is more involved because the best choice set B(A) may only
partly overlap the set S.

2.4.3 The attraction effect and IIA in experiments


An experiment that questions the validity of IIA tests what is known as the attraction
effect. The attraction effect can be illustrated with an example from Simonson and
Tversky (1992).¹⁰ They gave one group of subjects a choice between $6 and a nice
Cross pen. The pen was chosen by 36% of the subjects while the remaining 64% chose
$6. A second group was given a choice between three options: $6, a nice Cross pen,
or another less attractive pen. As one might expect, only 2% chose the less attractive
pen. Yet, the mere addition of this asymmetrically dominated alternative boosted the
proportion of subjects who chose the Cross pen to 46%. This behavior - no doubt well
known among marketers and salespeople - does not seem quite right from the
viewpoint of IIA.

¹⁰ Itamar Simonson and Amos Tversky (1992), Choice in Context: Tradeoff Contrast and
Extremeness Aversion, Journal of Marketing Research, 29, 281-295.

2.5 Behavioral Economics (Optional)


Many axioms that we discussed here, and that underlie the most common model of
economic decision making, are not self-evident. Behavioral Economics challenges these
axioms, and tries to verify in specific (often laboratory) environments whether they are
satisfied. Examples of questions that are addressed are the following.

How self-evident are complete preferences?

– When multiple dimensions are involved, many people have difficulties in
choosing consistently with complete and transitive preferences.
– When risk is involved, many people have trouble comparing risky financial
products, for example mortgages with a saving component in stock market
assets, insurance products, pension savings.

How self-evident are transitive preferences?

– For risky choices in an experimental laboratory, common wisdom is that
observed choices reveal non-transitivity.
– A neuroeconomic experiment revealed that Capuchin monkeys choose
transitively among three different juices, Padoa-Schioppa and Assad (2008).

How self-evident is choosing a best element? We already discussed the idea of
bounded rationality and satisficing¹¹ proposed by Herbert Simon (Nobel laureate
of 1978).

– Satisficing behavior is a decision-making strategy or cognitive heuristic that
entails searching through the available alternatives until an acceptability
threshold is met.

¹¹ Satisficing is a combination of satisfy and suffice.

Figure 2.5: The responses from subjects in a laboratory experiment differ when asked
to choose between A, B and C compared to when asked to choose between A and C.
According to IIA, B is an irrelevant alternative and it should not have any influence on choice.

– Let $a^* \in A$ be the utility maximizing alternative.
For $\varepsilon > 0$, $b \in A$ is satisficing if $u(b) \geq u(a^*) - \varepsilon$.
The agent is willing to settle on $b \in A$ and willing to forego an improvement
of $u(a^*) - u(b)$ because the agent considers improvements less than $\varepsilon$ in
utility (expressed by the utility scale of $u$) not "worth the effort".

Herbert Simon, in his 1978 Nobel Prize lecture in Economics:

"Decision makers can satisfice either by finding optimum solutions for
a simplified world, or by finding satisfactory solutions for a more realistic
world. Neither approach, in general, dominates the other, and both have
continued to coexist in the world of management science."
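
The two descriptions of satisficing given above (searching until an acceptability threshold is met, and accepting any alternative within $\varepsilon$ of the maximal utility) can be sketched as follows. The needle lengths, the utility function and the function names are hypothetical and only loosely follow the needle-in-a-haystack example.

```python
def satisfice(alternatives, u, threshold):
    """Search in the given order and stop at the first alternative whose
    utility meets the acceptability threshold; return None if none does."""
    for a in alternatives:
        if u(a) >= threshold:
            return a
    return None


def epsilon_satisficing_set(alternatives, u, eps):
    """All alternatives b with u(b) >= u(a_star) - eps, where a_star maximizes u."""
    best = max(u(a) for a in alternatives)
    return [b for b in alternatives if u(b) >= best - eps]


# Hypothetical needles, identified by their length in inches, in search order.
needles = [2.0, 5.5, 3.5, 4.0, 1.0]


def u(length):
    # Assumed utility: the closer to the ideal 4 inch needle, the better.
    return -abs(length - 4.0)


# Stop at the first needle that is within half an inch of the ideal length.
print(satisfice(needles, u, threshold=-0.5))

# All needles that are at most eps = 0.5 worse than the best needle.
print(epsilon_satisficing_set(needles, u, eps=0.5))
```

Note the difference between the two functions: the first never needs to know the best alternative, whereas the second definition refers to $u(a^*)$ and therefore presupposes that the maximum is known.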

How self-evident is independence of irrelevant alternatives?

– Tversky (1972), Elimination by aspects, Psychological Review
– Tversky and Shafir (1992), Choice under conflict: The dynamics of deferred
decision, Psychological Science 3
– In these laboratory experiments, subjects (mostly students) had to make
choices with multiple dimensions, one illustrated in Figure 2.5. The
experiment rejected the IIA axiom.
– This rejection of IIA allows a re-interpretation of these classic experimental
results: it confirms that people have difficulties comparing multiple
dimensions in choice situations.

Other pitfalls in individual decision making

– People often choose differently in case of different wording / presentation
(framing effect)

Remark 11 Complete and transitive preferences are not self-evident, choosing a best
element is not self-evident either, independence of irrelevant alternatives is not
self-evident, but every theory based upon choosing a best $a \in A$ / utility maximizing $a \in A$
cannot get around these properties.

Standard economic model: Theory of choosing a best $a \in A$

Behavioral economics: How do people make decisions (psychological, experimentalist
perspective)

Neuroeconomics: How does the brain make economic choices (brain scans of
people and primates)

Is there an alternative general theory available? Not yet, there are still many Nobel
Prizes to win! How to deal with this critique?

1. Milton Friedman: theory only "as if" people make such calculations - compare to
expert pool players who don’t have a PhD degree in Physics but play as if they
fully understand the Newtonian Laws of Physics.

2. Richard Thaler: admit that actual behavior sometimes differs from the predictions
of economic theory; see economics as a guide for efficient decision-making; build
better descriptive theories.

2.6 Economics and Decision Theory (Optional)


Itzhak Gilboa considers himself a merchant of principles, see Chapter 1. As a student
of Econometrics & OR, you will probably try to make a career by selling the principles
developed in this chapter. The value of your diploma depends upon whether potential
clients are willing to follow your advice and what they are willing to pay for it.

Imagine yourself as a consultant who is asked by a client to assist in making a


decision. Based upon the model of preferences, you would proceed as follows.

Step 1 Through one or more interviews with your client, you determine the set of
feasible alternatives $A$ and your client's preference relation $\succsim$ on $A$. (Consider
the possibility that your client may not be aware of all alternatives that are out
there, or that he might have difficulties in stating his preferences.)

Step 2 Check whether the reported preference relation $\succsim$ on $A$ is complete and
transitive. From a computational perspective, it would be nice if the preference
relation is also asymmetric.

Step 3 Compute the set of best alternatives $B(A)$ and report it to your client. (In case
there are multiple best elements, you may discuss with your client whether some
of these alternatives are really equally good or whether closer inspection might
bring some asymmetries to the foreground.)

Steps 2 and 3 can be automated, which can be seen as Computational Economics.
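
Steps 2 and 3 could be automated along the following lines. This is only a sketch under stated assumptions: the preference relation is stored as a set of ordered pairs (a, b) meaning "a is at least as good as b", B(A) is taken to be the set of alternatives that are at least as good as every alternative in A, and the function names and the small client example are hypothetical.

```python
from itertools import product


def is_complete(A, R):
    """Completeness: for every pair (a, b), a R b or b R a (or both)."""
    return all((a, b) in R or (b, a) in R for a, b in product(A, repeat=2))


def is_transitive(A, R):
    """Transitivity: a R b and b R c imply a R c."""
    return all(
        (a, c) in R
        for a, b, c in product(A, repeat=3)
        if (a, b) in R and (b, c) in R
    )


def best_elements(A, R):
    """B(A): alternatives that are at least as good as every alternative in A."""
    return {a for a in A if all((a, b) in R for b in A)}


# Hypothetical client: x is best, y and z are equally good.
A = ["x", "y", "z"]
R = {("x", "x"), ("y", "y"), ("z", "z"),
     ("x", "y"), ("x", "z"), ("y", "z"), ("z", "y")}

print(is_complete(A, R), is_transitive(A, R), best_elements(A, R))
```

The brute-force checks inspect all pairs and triples, which is adequate for the small finite sets considered in this chapter.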


The following economic dilemmas demonstrate that alternatives may have multiple
dimensions for which preferences per dimension are easy to quantify. What is unclear
is how to evaluate across dimensions:

Consumers contemplating a new smart phone have to consider price, contracts,
size, technical specifications, etc.

Most European citizens want, on the one hand, the government to provide
safety, good infrastructure, social security, good health care, etc., and on the other
hand, they want to pay less tax.

The Federal Reserve Act of 1913 established the Federal Reserve System (FED),
the central bank of the United States, and it states three key objectives for
monetary policy: maximizing employment, stabilizing prices, and moderating
long-term interest rates. Preferences over each of these objectives are easy to
formulate, but the act does not specify how to evaluate different policies that are
good for one objective but bad for one of the other objectives.

Many companies adhere to the 3P principle, but how should they evaluate policies that
involve costly improvements of labour conditions, or more costly investments in
green production technologies?

Central and local governments have limited budgets and must decide which priority
groups to spend them on. For example, spending to improve living conditions of disabled
people versus spending to create extra sports facilities for obese people can be
interpreted as two separate dimensions concerning disabled people and obese
people.

These examples demonstrate that many real choice situations have multiple dimensions
that must be taken into account and that it will be hard to state preferences on
how to evaluate different dimensions. If the latter cannot be achieved, the preference
relation will not be complete and the approach of this chapter is in dire straits. How to
deal with these hard problems can be a course on its own, but the key idea of
multi-criterion decision making and analytic hierarchy processes is to give weights to each
dimension (see footnote 12). For example, a company following the 3P principle may give a weight
of 10% to People, 30% to Planet and the remaining 60% to Profit. Then, the utility
functions per dimension can be forged into a single utility function over all dimensions.
The following example, taken from Taha (2011), illustrates the procedure. To quote
Taha:

"The analytic hierarchy process is designed for situations in which ideas,
feelings, and emotions affecting the decision process are quantified to provide
a numeric scale for prioritizing the alternatives."

Example 9 (Example 15.1-1 of Taha, 2011) Martin Hans, a bright high school
senior, has received full academic scholarships from three universities: U of A, U
of B, and U of C. To select a university, Martin specifies two main dimensions: location
and academic reputation. The following table ranks the two criteria for the three
universities.

Dimension           U of A   U of B   U of C
Location score       12.9     27.7     59.4
Reputation score     54.5     27.3     18.2

Each dimension is relatively easy to quantify, and the preferences over each
dimension are also clear: less travel time is preferred and a better reputation is preferred. The
issue is, however, how to evaluate both dimensions at the same time when Martin Hans
compares two alternative universities. Without further specification, his preference
relation is incomplete because he cannot compare any pair of alternatives.

12 In Psychology, dimensions are called attributes (see e.g. Tversky, 1972), and they are called
criteria in multi-criteria decision making and analytic hierarchy processes (see Chapter 13 of Taha, 2011).

Being the excellent student he is, Martin Hans judges academic reputation to be five
times as important as location, giving a weight of $\tfrac{1}{6}$ (approximately 17%) to location and
$\tfrac{5}{6}$ (approximately 83%) to reputation. This produces a weighted total score for each university, which can
be interpreted as a utility number. Having utility numbers ensures that the associated
preference relation has become complete and transitive. The ranking of the universities
is based on the following composite weights, computed with the rounded weights 0.17 and 0.83:

\[
\begin{aligned}
\text{U of A} &= 0.17 \cdot 12.9 + 0.83 \cdot 54.5 \approx 47.43, \\
\text{U of B} &= 0.17 \cdot 27.7 + 0.83 \cdot 27.3 \approx 27.37, \\
\text{U of C} &= 0.17 \cdot 59.4 + 0.83 \cdot 18.2 \approx 25.20.
\end{aligned}
\]

Based on these calculations, U of A has the highest composite weight, and hence rep-
resents the best choice for Martin Hans.
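
The weighting procedure of this example can be reproduced in a few lines. The sketch below is illustrative only: it uses the scores from the table and the rounded weights of 17% and 83% mentioned in the text, while the names `scores`, `weights` and `composite` are hypothetical.

```python
# Scores per dimension, taken from the table in Example 9.
scores = {
    "U of A": {"location": 12.9, "reputation": 54.5},
    "U of B": {"location": 27.7, "reputation": 27.3},
    "U of C": {"location": 59.4, "reputation": 18.2},
}

# Rounded weights: reputation is five times as important as location.
weights = {"location": 0.17, "reputation": 0.83}


def composite(scores, weights):
    """Weighted total score (utility number) for each alternative."""
    return {
        alt: sum(weights[dim] * s[dim] for dim in weights)
        for alt, s in scores.items()
    }


totals = composite(scores, weights)
best = max(totals, key=totals.get)  # alternative with the highest utility number
for alt, total in totals.items():
    print(f"{alt}: {total:.2f}")
print("Best choice:", best)
```

Changing the weights changes the utility numbers and may change the ranking, which is why the choice of weights is the crucial modelling step.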

Given that the preference model (and thus the utility model also) is rejected in
many experiments, should one search for other ways to fill in the black box? Or should
one give up filling in the black box at all? Instead of inferring preferences, one can
simply state that $B(A)$ is the set of chosen alternatives and estimate the relation between
subsets of $A$ and observed choice directly from choice data.

2.7 References
This chapter is based on (extensive quotes from):

Wikipedia on Economics; Mathematical Economics; and Satisficing.

Chapter 1 of Itzhak Gilboa (2011), Making Better Decisions, Wiley-Blackwell.

Pages 5-7 of Geoffrey Jehle and Philip Reny (2011), Advanced Microeconomic
Theory, Addison Wesley.

Pages 35-36 (Section 3.2) and 54-59 (Chapter 4) of Hal R. Varian (2010),
Intermediate Microeconomics, Norton.

Amos Tversky (1972), Elimination by aspects: A theory of choice, Psychological
Review 79, 281-299.

Amos Tversky and Eldar Shafir (1992), Choice under conflict: The dynamics of
deferred decision, Psychological Science 3, 358-361.

Reading for fun! (Not because "you have to", but because "you can")

2.8 Exercises (week 1)


Exercises

Exercise 1
Consider the regular grid $A = \{-2, -1\tfrac{1}{2}, -1, \ldots, 1\tfrac{1}{2}, 2\} \subset [-2, 2]$.

a. Preference relation $\succsim$ on $A$ is given by $a \succsim b \iff \mathrm{Floor}(|a|) \geq \mathrm{Floor}(|b|)$.
Determine $B(A)$.

b. Preference relation $\succsim$ on $A$ is given by $a \succsim b \iff \mathrm{Ceiling}(|a|) \geq \mathrm{Ceiling}(|b|)$.
Determine $B(A)$.

Exercise 2
Consider a father with an easily-jealous child of 3 years old and a child of 6 years. The
father wants to divide a bar of chocolate that can be broken into smaller square pieces.

You are asked to express the preference relation $\succsim$ on $A$ as in the main text: $a \succsim b \iff$
the conditions that make $a$ at least as good as $b$. Use a separate symbol for pairs $a$ and $b$ that cannot be compared.
Consider a chocolate bar that can be broken into 8 identical pieces, see picture on
the right, and a father who strictly prefers divisions that are more equal / egalitarian.

a. Specify the set $A$ and express the father's preference relation $\succsim$ on $A$. Hint:
There are two dimensions: the first dimension measures the number of pieces
the youngest child gets and the second dimension measures the number of pieces
for the oldest child.

b. Verify which of the Axioms 1-4 hold.

c. Derive the better-than preference relation $\succ$ on $A$ and the indifference relation
$\sim$ on $A$.

d. Derive $B(A)$.

Consider a chocolate bar that can be broken into 5 identical pieces, see picture on the
left.

e. Specify the set $A$, express $\succsim$ on $A$, and derive $B(A)$.

The youngest kid is easily jealous when he does not get more pieces than his older
sibling and makes a lot of fuss when in an aroused state of jealousy. Suppose that
the father prefers more equal divisions as long as they avoid such fuss.

f. Express $\succsim$ on $A$ and derive $B(A)$ in case the chocolate bar can be broken into 5
identical pieces.

Exercise 3
A college American-football coach says that given any two linemen $X$ and $Y$, he always
prefers the one who is bigger (in height) and faster (in speed). The team consists of Alex
(A), Bob (B), Chris (C), David (D), Eric (E) and Francisco (F). Their characteristics
are given by the following table.

                  Alex   Bob    Chris  David  Eric   Francisco
Height (in m)     1.90   1.85   1.75   1.80   1.75   1.80
Speed (in m/s)    5.5    6.5    6.5    6.0    5.0    8.0

These dimensions are illustrated by the following figure.



a. Express the preference relation $\succsim$ on $A$ as in the main text: $a \succsim b \iff$ the conditions
that make $a$ at least as good as $b$. Hint: use a separate symbol for pairs $a$ and $b$ that cannot be compared.

b. Derive the better-than preference relation $\succ$ on $A$ and the indifference relation
$\sim$ on $A$.

c. Verify which of the Axioms 1-4 hold.

d. In week 2, for any alternative $c \in A$ we will encounter the weakly-better(-than-$c$)
set of alternatives $\succsim(c) = \{a \in A \mid a \succsim c\}$, the weakly-worse(-than-$c$) set of
alternatives $\precsim(c) = \{a \in A \mid c \succsim a\}$ and the indifference(-to-$c$) set of alternatives
$\sim(c) = \{a \in A \mid c \sim a\}$. For $c$ equal to David, determine the sets $\succsim(\text{David})$,
$\precsim(\text{David})$ and $\sim(\text{David})$.

e. Derive $B(A)$.

Exercise 4
Consider $A = \{1, 2, \ldots, 10\}$ and $\succsim$ on $A$ given by $a \succsim b \iff a$ divides $b$ into a natural
number (for example, 3 divides the integer 9 into 3, hence $3 \succsim 9$).

a. Prepare a $10 \times 10$ table in which you indicate whether or not $a \succsim b$. Hint: Use 1
as the $(a, b)$ element if $a \succsim b$, and 0 if $a$ and $b$ cannot be compared.

b. Verify which of the Axioms 1-4 hold.



c. Derive B (A).

d. Reflect on the following: "The order of the elements in each pair of $\succsim$ on $A$ is
important: if $a \neq b$, then $a \succsim b$ and $b \succsim a$ can be true or false, independently of
each other. Resuming the above example, the prime 3 divides the integer 9, but
9 doesn't divide 3."

Concepts, Theory and Proofs

Exercise 5
Give a proof or convincing argument for each of the following claims made in the text.

a. Let $\succsim$ on $A$ be complete. For any $a, b \in A$, only one of the following holds: $a \succ b$,
or $b \succ a$, or $a \sim b$.

b. Let $\succsim$ on $A$ be complete. Neither $\succ$ nor $\sim$ is complete.

c. Prove that: if $\succsim$ on $A$ is transitive, then it holds that [$a \sim b$ and $b \succsim c \implies a \succsim c$].

d. Prove that: if $\succsim$ on $A$ is transitive, then both $\succ$ and $\sim$ are transitive.

Exercise 6
When working with numbers, you have $\geq$, $>$ and $=$, but also $\leq$ and $<$. Can we define
$\precsim$ on $A$ from $\succsim$ on $A$?

a. Interpret $\precsim$ on $A$ and define it in terms of $\succsim$.

b. Derive $\precsim$ on $A$ for the preference relation of the college football coach.

Exercise 7
Which of the preference relations of the previous exercises induce a utility function? If
it induces one, state one.

Exercise 8
Consider the weak preference relation $\succsim$ on $A$, where $A$ has $m \geq 2$ elements. As
an alternative concept to a best element, the notion of a maximal element has been
proposed.

A maximal element is an element for which no better alternative in $A$ exists:
$a^* \in A$ is maximal on $A$ if there does not exist an $a \in A$ such that $a \succ a^*$.

The set of maximal elements on $A$ is denoted $M(A)$. Formally,

$$M(A) = \{a^* \in A \mid \nexists\, a \in A : a \succ a^*\}.$$

a. Calculate $M(A)$ in the exercise of the college American-football coach.

b. Show that $B(A) \subseteq M(A)$, and its corollary: if $B(A) \neq \emptyset$, then $M(A) \neq \emptyset$.

c. Show that: "If $\succsim$ on $A$ is complete, then $M(A) \subseteq B(A)$", and its corollary: "if $\succsim$
on $A$ is complete and transitive, then $B(A) = M(A) \neq \emptyset$".

d. Construct an example for which $M(A) \neq \emptyset$ and $B(A) = \emptyset$.

e. Construct an example for which $M(A) = \emptyset$.

f. How do the definitions of a best and a maximal element translate to the maximum of
a function $u : A \to \mathbb{R}$? Look up which of these two definitions your Calculus
textbook applies.

g. Reflect on which of the two definitions, $B(A)$ and $M(A)$, you prefer from a
mathematical point of view.

Exercise 9
Show the following propositions from this chapter.

Proposition 5 Let $\succsim$ on $A$ be complete and asymmetric. If $a \succsim b$, $a \neq b$, then one of
two mutually exclusive cases holds: either $a \succ b$ or $b \succ a$.

Proposition 6 If $\succsim$ on $A$ is complete, then $\succsim$ on $A$ is reflexive.

For the next propositions, try to come up with two proofs: a constructive proof and
one that starts "Suppose not. etc."

Proposition 7 [Existence] If $A$ is finite and $\succsim$ on $A$ is both complete and transitive,
then $B(A) \neq \emptyset$.

Proposition 8 [Uniqueness] If $A$ is finite and $\succsim$ on $A$ is complete, transitive and
asymmetric, then $B(A)$ consists of a unique element.

Exercises Computational Economics

Software tools that support the decision making of an individual are called decision support
systems (DSS). In the following exercises, you are asked to think about how to program
specific DSSs and to write pseudo code for them.

Exercise 10
Consider preference relation $\succsim$ on $A = \{a_1, \ldots, a_m\}$.

a. [Bubble-sort] Consider https://en.wikipedia.org/wiki/Bubble_sort. This
is a simple sorting method. Write pseudo code for this algorithm to compute
the set $B(A)$.

b. [Constructive proof] Write your constructive proof of Proposition 18 into pseudo
code to compute $B(A)$.

Exercise 11
Every preference relation $\succsim$ on $A = \{a_1, \ldots, a_m\}$ can be represented by an $m \times m$
matrix $R$ with binary elements $r_{ij} \in \{0, 1\}$.

a. Think of a way to accomplish this. Hint: Treat index $i$ as $a_i$ and index $j$ as $a_j$.

b. If $\succsim$ on $A$ is reflexive, what are the diagonal elements of $R$?

c. If $\succsim$ on $A$ is complete and transitive and already ranked as $a_1 \succ a_2 \succ \ldots \succ a_m$,
derive the matrix $R$.

d. Suppose $\succsim$ on $A$ is given without further specifications and is unranked. How can you
recognize from $R$ an element that is a best element?

e. Provide the pseudo code that exploits the matrix $R$ to compute $B(A)$.

Exercise 12
Recall from the main text that if $\succsim$ on $A = \{a_1, \ldots, a_m\}$ is complete and transitive, then
there exists a vector of utility numbers $u = (u_1, \ldots, u_m)$ induced by $\succsim$ on $A$. You
could do a bubble sort to find the alternative with the maximum utility. Alternatively,
define the $m$-dimensional vector of decision variables $x = (x_1, \ldots, x_m)$, $x_i \in \{0, 1\}$ for
all $i = 1, \ldots, m$, where $x_i = 1$ indicates that alternative $i$ is optimal and $x_i = 0$ otherwise.

a. Try to come up with an optimization problem to solve the utility maximization
problem. It will be an Integer Linear Program in $x$ (linear objective function and
linear constraints). Hint: How to formulate the utility function, and how to ensure
that you choose only one element as the utility-maximizing element?

b. Determine the utility-maximizing $x$ if the outcomes in $A$ are ordered such that
$u_1 > u_2 > \ldots > u_m$.

c. Someone suggests running your model as if all $x_i$ are continuous variables $x_i \in [0, 1]$.
Comment on this suggestion.

d. Suppose you take the advice of item c. For a different utility maximization problem
the software returns $x_{m-1} = x_m = \tfrac{1}{2}$ and $x_i = 0$ for all $i \leq m - 2$ as an optimal
solution. What conditions have to hold on the utility weights $u_1, u_2, \ldots, u_m$ for
this to occur?

Exercises inference and revealed preferences

Exercise 13 [indirect-revealed preference]


In the first lecture, we encountered directly-revealed preferences. Economists have also
defined indirectly-revealed preferences.

a. To fix ideas, for $m = 5$, suppose that we have two observations on two subsets
of $A$: the observed choice $a_1$ from $\{a_1, a_2, a_3\}$ and the observed choice $a_3$ from
$\{a_3, a_4, a_5\}$. Can you say something about this economic agent's preferences
between $a_1$ and $a_4$? What else can you say?

b. What assumption do you need to make in order to justify your statements in a.?

c. As a., but with observed choice $a_4$ from $\{a_3, a_4, a_5\}$. What can you infer about $\succsim$
on $A$?

Consider the overlapping sets $A_1, A_2 \subseteq A$, $A_1 \cap A_2 \neq \emptyset$. If the economic agent
chooses $d_1 \in A_1$, chooses $d_2 \in A_2$, and $d_2 \in A_1 \cap A_2$, then these choices indirectly
reveal a weak preference of $d_1$ over any $a \in A_2$.

d. Does the example of a. fit this definition?

e. How to read this definition? What is its interpretation? Is it self-evident?

Chapter 3

Deterministic Choice among uncountably many alternatives

In Week 1 (Chapter 2) we discussed deterministic choice among a finite set of alternatives.
Although many real-life decision problems are about choosing from a finite set
of alternatives, in many economic problems it is convenient to model decision variables
as continuous variables and to apply techniques from Calculus in multi-dimensional
Euclidean spaces, $\mathbb{R}^m$. Alternatives are represented as vectors. The use of such
techniques requires well-defined utility functions that are continuous and partially
differentiable. As we saw in the previous chapter, in terms of preferences, the minimal
conditions that need to be imposed are completeness and transitivity of the preference
relation.

In this chapter, we consider sets of feasible alternatives that are well-defined subsets
of multi-dimensional spaces. In order to derive results for existence and uniqueness,
additional axioms are required. Furthermore, in economics the use of graphical
representations is common, and we discuss such representations as well.

The structure of this chapter closely follows that of Chapter 2, but tries to avoid
repeating the discussions of that chapter, since many concepts have the same meaning,
except that they are defined for a different set of alternatives.



3.1 A Choice Model based upon Preferences


3.1.1 The Set of Feasible Alternatives
An economic agent is faced with a single choice from a non-empty uncountable set of
feasible alternatives that is represented as a subset of $\mathbb{R}^m$, $m \geq 1$. Because we will use
vectors denoted $x$, $y$ and $z$, we will write the set of feasible alternatives as $X$ instead
of $A$ to distinguish the case of an uncountable set of alternatives from a finite set.

Notation

$X \subseteq \mathbb{R}^m$ is a non-empty uncountable set of feasible alternatives.

$x \in X$ is an arbitrary alternative from the set $X$.

$d \in X$ is a decision about choice.

We continue with several examples of sets of feasible alternatives. Some examples


are followed by some more general concepts.

Example 10 (Consumer Theory) Consider a consumer whose income is 100 and
who wants to spend it on $x_1 \geq 0$ units of food and $x_2 \geq 0$ units of gasoline for driving
his car. The price of food is 2 and the price of gasoline is 3. The consumer's budget
set $X$ is given by

$$X = \{(x_1, x_2) \in \mathbb{R}^2_+ \mid 2x_1 + 3x_2 \leq 100\}.$$

In general, $x = (x_1, \ldots, x_m)$ is a consumer's consumption vector, $p = (p_1, \ldots, p_m)$
a price vector of commodity prices, $I$ is the consumer's income and the budget set is
given by

$$X = \{x \in \mathbb{R}^m_+ \mid p_1 x_1 + \ldots + p_m x_m \leq I\} = \{x \in \mathbb{R}^m_+ \mid p \cdot x \leq I\}.$$

Figure 3.1 illustrates the budget set for m = 2.
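
Membership of the budget set is easy to test numerically. The sketch below is illustrative only: it assumes the budget set consists of non-negative consumption vectors whose total cost does not exceed income, and the function name `in_budget_set` is hypothetical.

```python
def in_budget_set(x, p, income):
    """True if x is non-negative componentwise and p . x <= income."""
    if len(x) != len(p):
        raise ValueError("x and p must have the same dimension")
    nonnegative = all(x_i >= 0 for x_i in x)
    cost = sum(p_i * x_i for p_i, x_i in zip(p, x))  # inner product p . x
    return nonnegative and cost <= income


# Data of Example 10: prices (2, 3) and income 100.
p, income = (2, 3), 100
print(in_budget_set((20, 20), p, income))   # cost 100: on the budget line
print(in_budget_set((30, 20), p, income))   # cost 120: not affordable
print(in_budget_set((-1, 10), p, income))   # negative quantity: not feasible
```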

Figure 3.1: The budget set in consumer theory.

Example 11 (Producer Theory) Jumbo Cable has acquired an order to produce 60
km of fibre cable. As inputs, it uses $x_1 \geq 0$ hours of labor and $x_2 \geq 0$ units of raw
materials. Two man-hours of labor and 3 units of raw material produce one km of cable.
Adding more of a single input does not produce more. Therefore, $x_1 \geq 0$ and $x_2 \geq 0$
produce $\min\{\tfrac{1}{2}x_1, \tfrac{1}{3}x_2\}$ km of cable. To meet the order, Jumbo Cable needs to set a
production plan $(x_1, x_2) \geq 0$ in the set given by

$$X = \{(x_1, x_2) \in \mathbb{R}^2_+ \mid \min\{\tfrac{1}{2}x_1, \tfrac{1}{3}x_2\} \geq 60\}.$$

This production technology is called a Leontief production function, named after the 1973
Nobel laureate Wassily Leontief, who introduced these functions to economics.
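
To make the Leontief technology concrete, the following sketch evaluates the production function of Example 11 and tests whether a production plan meets the order. It is a minimal illustration; the function names `leontief_output` and `meets_order` are hypothetical.

```python
def leontief_output(x1, x2):
    """Km of cable produced from x1 hours of labor and x2 units of raw material.

    One km requires 2 hours of labor and 3 units of raw material, so the
    scarcest input (in km-equivalents) determines the output.
    """
    return min(x1 / 2, x2 / 3)


def meets_order(x1, x2, order_km=60):
    """True if the non-negative plan (x1, x2) produces at least the order."""
    return x1 >= 0 and x2 >= 0 and leontief_output(x1, x2) >= order_km


print(leontief_output(120, 180))  # exactly enough of both inputs
print(leontief_output(240, 180))  # extra labor alone does not raise output
print(meets_order(200, 150))      # raw material is the bottleneck
```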

The general model of producer theory interprets $x = (x_1, \ldots, x_m) \in \mathbb{R}^m_+$ as a
producer's production plan of inputs, such as (un)skilled labor, capital, energy, etc., and
defines $f : \mathbb{R}^m_+ \to \mathbb{R}$ as the producer's production function that describes the quantity
of output that can be produced from production plans. The set of production plans to
produce at least the output level $q \in \mathbb{R}_+$ is given by

$$X = \{x \in \mathbb{R}^m_+ \mid f(x) \geq q\}.$$

Note that the implicit assumption is made that the producer produces a single product,
that is, $q$ is one-dimensional. In the next example, the producer produces multiple
products and the raw materials specify a production plan $x = (24, 6)$ of two inputs.

Example 12 (Linear Production) This example is taken from OR1, Taha, p. 47.
The Reddy Mikks Company produces $q_1 \geq 0$ units of exterior paint and $q_2 \geq 0$ units of
interior paint from two raw materials, $M_1$ and $M_2$. One unit of exterior paint requires
6 units of $M_1$ and 1 unit of $M_2$. One unit of interior paint requires 4 units of $M_1$ and
2 units of $M_2$. The inventory of $M_1$ is 24 units and the inventory of $M_2$ is 6 units. The set
of output plans $(q_1, q_2) \geq 0$ that can be implemented is given by

$$X = \{(q_1, q_2) \in \mathbb{R}^2_+ \mid 6q_1 + 4q_2 \leq 24,\ q_1 + 2q_2 \leq 6\}.$$
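
Feasibility of an output plan can be checked directly against the two inventory constraints. The sketch below reads the constraints as $6q_1 + 4q_2 \leq 24$ and $q_1 + 2q_2 \leq 6$ with non-negative outputs; the function name `feasible_output` is hypothetical.

```python
def feasible_output(q1, q2, stock_m1=24, stock_m2=6):
    """True if the output plan (q1, q2) respects both raw-material inventories."""
    m1_used = 6 * q1 + 4 * q2   # units of raw material M1 required
    m2_used = 1 * q1 + 2 * q2   # units of raw material M2 required
    return q1 >= 0 and q2 >= 0 and m1_used <= stock_m1 and m2_used <= stock_m2


print(feasible_output(3, 1.5))  # uses exactly 24 of M1 and 6 of M2
print(feasible_output(4, 1))    # needs 28 units of M1
print(feasible_output(0, 3))    # uses 12 of M1 and exactly 6 of M2
```

The plan (3, 1.5) lies at the intersection of the two constraint lines, so both inventories are exhausted there.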

3.1.2 The Preference Relation


Preference relations on uncountable $X$ are defined in the same way as they were defined
on a finite set $A$ in Chapter 2. For completeness, we state the definition here with the
uncountable set of alternatives $X$ instead of the finite set $A$.

Definition 5 A (weak) preference relation of an agent on an uncountable set of
alternatives $X$ is a binary relation $\succsim$ with the interpretation that the agent considers
alternative $x \in X$ ‘at least as good’ as alternative $y \in X$.

Notation

$\succsim$ on $X$ represents preferences over alternatives in $X$ by comparing pairs of
elements in $X$; it is called a weak preference relation.

$x \succsim y$ means that $x \in X$ is at least as good as $y \in X$.

From $\succsim$ on $X$, we can derive the relations $\succ$, $\sim$, $\precsim$ and $\prec$ in a similar way as in Chapter 2.
These will return in this chapter.

3.1.3 Axioms of Preference Relations


The axioms of completeness and transitivity are restated for $X$ but have the same
meaning as in Chapter 2. Therefore, we state them without further discussion. All remarks
and caveats from the previous chapter still apply.

Axiom 7 Preference relation $\succsim$ on $X$ is complete if for each pair of alternatives
$x, y \in X$ it holds that

$x \succsim y$, or $y \succsim x$, or both.

Figure 3.2: Hypothetical sets of preferences on $\mathbb{R}^2$. Source: Jehle and Reny (2011).

Axiom 8 Preference relation $\succsim$ on $X$ is transitive if for each triple of alternatives
$x, y, z \in X$:

if $x \succsim y$ and $y \succsim z$, then $x \succsim z$.

We now define the following sets that we derive from a preference relation.

Definition 6 Relative to alternative $x^0 \in X$, we define the following subsets of $X$:

1. $\succsim(x^0) \equiv \{x \mid x \in X, x \succsim x^0\}$, called the ‘at least as good as’ set.

2. $\precsim(x^0) \equiv \{x \mid x \in X, x^0 \succsim x\}$, called the ‘no better than’ set.

3. $\succ(x^0) \equiv \{x \mid x \in X, x \succ x^0\}$, called the ‘preferred to’ or ‘better than’ set.

4. $\prec(x^0) \equiv \{x \mid x \in X, x^0 \succ x\}$, called the ‘worse than’ set.

5. $\sim(x^0) \equiv \{x \mid x \in X, x \sim x^0\}$, called the ‘indifference’ set.
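
To see how the five sets of Definition 6 relate to each other, the sketch below reports which of them contain a given point. It rests on an assumption that is not part of the definition: the preferences are represented by a utility function, here the hypothetical example u(x) = x1 + x2. The function name `classify` is hypothetical as well.

```python
def classify(x, x0, u):
    """Names of the sets of Definition 6 (relative to x0) that contain x.

    Assumes the preference relation is represented by the utility function u.
    """
    ux, u0 = u(x), u(x0)
    sets = set()
    if ux >= u0:
        sets.add("at least as good as")
    if ux <= u0:
        sets.add("no better than")
    if ux > u0:
        sets.add("preferred to")
    if ux < u0:
        sets.add("worse than")
    if ux == u0:
        sets.add("indifference")
    return sets


def u(x):
    """Hypothetical utility function: the sum of both coordinates."""
    return x[0] + x[1]


x0 = (1, 1)
print(classify((2, 1), x0, u))      # strictly better than x0
print(classify((0.5, 1.5), x0, u))  # same utility as x0
print(classify((0, 1), x0, u))      # strictly worse than x0
```

Every point lies in exactly one of the ‘preferred to’, ‘indifference’ and ‘worse than’ sets, because a utility representation implies completeness and transitivity.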

Quiz 2 Let $X = \mathbb{R}^2$. Derive $\succsim(z)$, $\precsim(z)$ and $\sim(z)$ for $z = (1, 1.99)$ and their
properties (closed, bounded, convex) if

$$(x_1, x_2)^\top \succsim (y_1, y_2)^\top \iff \mathrm{Floor}(x_1 + x_2) \geq \mathrm{Floor}(y_1 + y_2).$$

Example 13 (From Jehle and Reny (2011)) "A hypothetical set of preferences satisfying
Axiom 7 and 8 has been sketched in Figure 3.2 for $X = \mathbb{R}^2_+$. Any point in
the set of feasible alternatives $X$, such as vector $x^0 = (x^0_1, x^0_2)$, represents a feasible
alternative consisting of $x^0_1$ units of the first dimension, together with $x^0_2$ units of the
second dimension. Under Axiom 7, the economic agent is able to compare $x^0$ with any
and every other alternative in $X$ and decide whether the other is at least as good as $x^0$
or whether $x^0$ is at least as good as the other. Given our definitions of the various sets
relative to $x^0$, Axiom 7 and 8 tell us that the economic agent must place every point
in $X$ in one of three mutually exclusive categories relative to $x^0$; every other point is
worse than $x^0$, indifferent to $x^0$, or preferred to $x^0$. Thus, for any point $x^0$ the three
sets $\prec(x^0)$, $\sim(x^0)$, and $\succ(x^0)$ partition the set of feasible alternatives." (See footnote 1.)
....
"The preferences in Figure 3.2 may seem rather odd. They possess only the most
limited structure, yet they are entirely consistent with and allowed for by the first two
axioms alone. Nothing assumed so far prohibits any of the ‘irregularities’ depicted there,
such as the ‘thick’ indifference zones, or the ‘gaps’ and ‘curves’ within the indifference
set $\sim(x^0)$. Such things can be ruled out only by imposing additional requirements on
preferences."

Continuity of preferences

Next, we shall consider several new assumptions on preferences, specifically for
uncountable sets of alternatives. The first one is a technical axiom and has no behavioral
significance.
From now on we explicitly take $X = \mathbb{R}^m$ for explanatory reasons.

Axiom 9 (Continuity) For all $x^0 \in \mathbb{R}^m$, the ‘at least as good as’ set, $\succsim(x^0)$, and
the ‘no better than’ set, $\precsim(x^0)$, are closed in $\mathbb{R}^m$.

Motivation from Jehle and Reny (2011):

"The continuity axiom guarantees that sudden preference reversals do
not occur. Indeed, the continuity axiom can be equivalently expressed by
saying that if each element $x^n$ of a sequence of points, $\{x^n\}_{n \in \mathbb{N}}$, is at least
as good as (no better than) $x^0$, and $x^n$ converges to $x$, then $x$ is at least
as good as (no better than) $x^0$. Note that because $\succsim(x)$ and $\precsim(x)$ are
closed, so, too, is $\sim(x)$ because the latter is the intersection of the former
two. Consequently, Axiom 9 rules out the open area in the indifference set
depicted in the north-west of Figure 3.2."

1 Warning: (Ed.) This partitioning automatically holds when these two axioms hold. However,
as the exercise of the American college football coach of Chapter 2 shows, without these axioms
large parts of $X$ may belong to none of the three sets $\prec(x^0)$, $\sim(x^0)$, and $\succ(x^0)$ due to pairs of
alternatives being incomparable.

Figure 3.3: The economic agent is indifferent among all alternatives in the white circle
and this violates local non-satiation.

Quiz 3 Is the preference relation given in Quiz 2 a continuous preference relation?


Non-satiation

Next, we discuss some axioms that are related to the idea of non-satiability and
monotonicity (‘more is better’).

Non-satiation reflects that an agent is never satiated in the sense that he/she can
always do better. Let $B_\varepsilon(x^0)$ denote the open ball of radius $\varepsilon > 0$ centred at $x^0$.

Axiom 10 (Local Non-satiation) For all $x^0 \in \mathbb{R}^m$, and for all $\varepsilon > 0$, there exists
some $x^1 \in B_\varepsilon(x^0)$ such that $x^1 \succ x^0$.

Motivation from Jehle and Reny (2011):

"Axiom 10 says that within any vicinity of a given point $x^0$, no matter
how small that vicinity is, there will always be at least one other point
$x^1$ that the economic agent prefers to $x^0$. Its effect on the structure of
indifference sets is significant. It rules out the possibility of having ‘zones
of indifference’, such as that surrounding $x^1$ in Figure 3.3. To see this, note
that we can always find some $\varepsilon > 0$, and some $B_\varepsilon(x^1)$, containing nothing
but points indifferent to $x^1$. This of course violates Axiom 10, because it
requires there always be at least one point strictly preferred to $x^1$, regardless
of the $\varepsilon > 0$ we choose. The preferences depicted in Figure 3.4 do satisfy
Axiom 10 as well as Axiom 7 to 9."

Figure 3.4: Local non-satiation holds because every circle around $x^0$ intersects the
‘better than’ set $\succ(x^0)$.

Monotonicity

A stronger view is monotonicity, which requires that an agent always gets better off
if he/she gets more.

"If the point $x^0$ contains at least as much of every good as does $x^1$ we
write $x^0 \geq x^1$, while if $x^0$ contains strictly more of every good than $x^1$ we
write $x^0 \gg x^1$."

Axiom 11 (Strict Monotonicity) For all $x^0, x^1 \in \mathbb{R}^m$, if $x^0 \geq x^1$ then $x^0 \succsim x^1$,
while if $x^0 \gg x^1$, then $x^0 \succ x^1$.
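
The two vector orders and the requirement of Axiom 11 can be illustrated numerically. The sketch below makes an assumption beyond the text: preferences are represented by a utility function, so that ‘at least as good as’ becomes a comparison of utility numbers. All function names and the three example utility functions are hypothetical.

```python
def weakly_more(x0, x1):
    """x0 >= x1: at least as much in every dimension."""
    return all(a >= b for a, b in zip(x0, x1))


def strictly_more(x0, x1):
    """x0 >> x1: strictly more in every dimension."""
    return all(a > b for a, b in zip(x0, x1))


def violates_strict_monotonicity(u, x0, x1):
    """True if the pair (x0, x1) contradicts Axiom 11 for preferences represented by u."""
    if strictly_more(x0, x1) and not u(x0) > u(x1):
        return True
    if weakly_more(x0, x1) and not u(x0) >= u(x1):
        return True
    return False


def u_sum(x):
    return x[0] + x[1]


def u_min(x):
    return min(x[0], x[1])


def u_bliss(x):
    """Utility with a bliss point at (1, 1): more is not always better."""
    return -((x[0] - 1) ** 2) - (x[1] - 1) ** 2


print(violates_strict_monotonicity(u_sum, (2, 2), (1, 1)))
print(violates_strict_monotonicity(u_min, (2, 1), (1, 1)))
print(violates_strict_monotonicity(u_bliss, (3, 3), (1, 1)))
```

A single violating pair shows that the axiom fails; the absence of a violation on a few sample pairs is of course no proof that it holds.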

From Jehle and Reny (2011):

Whereas local non-satiation requires that a preferred alternative nearby
always exists, it does not rule out the possibility that the preferred alternative
may involve less of some or even all commodities. Specifically, it does
not imply that giving the economic agent more of everything necessarily
makes that agent better off, which is implied by strict monotonicity.

Figure 3.5: All points on the indifference curve that are below the dashed line violate
strict monotonicity.

"Axiom 11 says that if one alternative is weakly better in each dimension
than another alternative, then the one is at least as good as the other.
Moreover, it is strictly better if it is strictly better in every dimension. The
impact on the structure of indifference and related sets is again significant.
First, it should be clear that Axiom 11 implies Axiom 10, so if preferences
satisfy Axiom 11, they automatically satisfy Axiom 10. Thus, to require
Axiom 11 will have the same effects on the structure of indifference and
related sets as Axiom 10 does, plus some additional ones. In particular,
Axiom 11 eliminates the possibility that the indifference sets in $\mathbb{R}^2$ ‘bend
upward’, or contain positively sloped segments. It also requires that the
‘preferred to’ sets be ‘above’ the indifference sets and that the ‘worse than’
sets be ‘below’ them.
To help see this, consider Figure 3.5. Under strict monotonicity, no
points north-east of $x^0$ or south-west of $x^0$ may lie in the same indifference
set as $x^0$. Any point north-east, such as $x^1$, scores better in both dimensions
than does $x^0$. All such points in the north-east quadrant must therefore be
strictly preferred to $x^0$. Similarly, any point in the south-west quadrant,
such as $x^2$, scores less in both dimensions. Under strict monotonicity, $x^0$
must be strictly preferred to $x^2$ and to all other points in the south-west
quadrant, so none of these can lie in the same indifference set as $x^0$. For
any $x^0$, points north-east of the indifference set will be contained in $\succ(x^0)$,
and all those south-west of the indifference set will be contained in the set
$\prec(x^0)$. A set of preferences satisfying Axiom 7 to 9, and 11 is given in
Figure 3.6."

Figure 3.6: This is impossible under Axioms 7 - 9 and 11

Convex preferences

From Jehle and Reny (2011):

"The preferences in Figure 3.6 are the closest we have seen to the kind undoubtedly familiar to you from your previous economics classes. They still differ, however, in one very important respect: typically, the kind of non-convex region in the north-west part of $\succsim(x^0)$ is explicitly ruled out. This is achieved by invoking one final assumption on preferences. We will state two different versions of the axiom and then consider their meaning and purpose."

Axiom 12 (Convexity and Strict Convexity) Convexity of preferences reflects the idea that 'mixing is better'.

[Convexity] If $x^1 \succsim x^0$, then $t x^1 + (1-t) x^0 \succsim x^0$ for all $t \in (0,1)$.

[Strict Convexity] If $x^1 \neq x^0$ and $x^1 \succsim x^0$, then $t x^1 + (1-t) x^0 \succ x^0$ for all $t \in (0,1)$.

"Notice first that either version of Axiom 12, in conjunction with Axioms 7 to 9 and 11 (monotonicity), will rule out concave-to-the-origin segments in the indifference sets, such as those in the north-west part of Figure 3.6. To see this, choose two distinct points in the indifference set depicted there. Because $x^1$ and $x^2$ are both indifferent to $x^0$, we clearly have $x^1 \succsim x^2$. Convex combinations of those two points, such as $x^t$, will lie within $\prec(x^0)$, violating the requirements of either version of Axiom 12.
There are at least two ways we can intuitively understand the implications of convexity for preferences. The preferences depicted in Figure 3.7 are consistent with both versions of convexity. Again, suppose we choose $x^1 \sim x^2$. Point $x^1$ represents an alternative containing a proportion of the score for the second dimension, element $x_2$, which is relatively 'extreme' compared to the proportion of this dimension in the other alternative $x^2$. The alternative $x^2$, by contrast, contains a proportion of the score for the first dimension, element $x_1$, which is relatively extreme compared to that contained in $x^1$. Although each contains a relatively high proportion of one dimension compared to the other, the economic agent is indifferent between the two alternatives. Now, any convex combination of $x^1$ and $x^2$, such as $x^t$, will be an alternative containing a more 'balanced' combination of $x_1$ and $x_2$ than does either 'extreme' alternative $x^1$ or $x^2$. The thrust of both versions of convexity is to forbid the economic agent from preferring such extremes. Convexity requires that any such relatively balanced alternative as $x^t$ be no worse than either of the two extremes between which the economic agent is indifferent. Strict convexity goes a bit further and requires that the economic agent strictly prefer any such relatively balanced alternative to both of the extremes between which he is indifferent. In either case, some degree of 'bias' in favour of balance in dimensions is required of the economic agent's preferences."
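The 'balance' intuition can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the preferences are represented by the hypothetical utility function u(x) = x1 * x2 (this function and the two bundles are not taken from Jehle and Reny). Two 'extreme' alternatives that give the same utility are mixed, and the mixtures are compared with the extremes.

```python
def utility(bundle):
    """Hypothetical utility u(x) = x1 * x2, chosen only for illustration."""
    x1, x2 = bundle
    return x1 * x2


def convex_combination(first, second, t):
    """Return t * first + (1 - t) * second, component by component."""
    return tuple(t * a + (1 - t) * b for a, b in zip(first, second))


extreme_a = (1.0, 4.0)  # relatively much of dimension 2
extreme_b = (4.0, 1.0)  # relatively much of dimension 1

# Both extremes have utility 4, so the agent is indifferent between them.
for t in (0.25, 0.5, 0.75):
    balanced = convex_combination(extreme_a, extreme_b, t)
    is_better = utility(balanced) > utility(extreme_a)
    print(t, balanced, utility(balanced), is_better)
```

For this particular utility function every mixture is strictly better than both extremes, which is what strict convexity demands for these two alternatives. A check at a few values of t is of course no proof that the axiom holds everywhere.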

Quiz 4
Is $\geq$ on $\mathbb{R}$ a convex relation?

Quiz 5

1. What does this definition imply if $x \sim y$?

2. Is $\geq$ on $\mathbb{R}$ a strictly convex relation?

Figure 3.7: The economic agent strictly prefers $x^t$ to either $x^1$ or $x^2$.

3.2 The Optimization Principle


Recall the optimization principle from Chapter 2: “An economic agent is an ‘actor’ (someone who acts) who tries to do what is best in the given situation”. We translated this principle into choosing a best alternative. Here, we define best elements in the same way as in Chapter 2, but now on uncountable sets $X$ (by simply replacing $a$ by $x$, and $A$ by $X$).

Definition 7 Alternative $x^* \in X$ is a best element in preference relation $\succsim$ if $x^* \succsim x$ for all $x \in X$.

Modifying the notation to the symbols used in this chapter, we introduce the following notation.

Notation

$x^* \in X$ denotes a best element in $X$.

$B(X) \subseteq X$ denotes the best choice set of $\succsim$ on $X$ and it consists of all best elements in $X$.

The best choice set of $\succsim$ on $X$ is defined as

$B(X) = \{x^* \in X \mid x^* \succsim x \text{ for all } x \in X\}$.
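The definition of the best choice set translates directly into a filter over the alternatives. The sketch below is only an illustration under stated assumptions: it replaces the uncountable set X by a small finite grid of integer bundles, and it uses the hypothetical relation x ≿ y ⟺ x1 * x2 ≥ y1 * y2. The names `best_choice_set` and `weakly_prefers` are made up for this example.

```python
def best_choice_set(alternatives, weakly_prefers):
    """Return all x* with x* weakly preferred to every x in alternatives."""
    return [
        candidate
        for candidate in alternatives
        if all(weakly_prefers(candidate, other) for other in alternatives)
    ]


def weakly_prefers(x, y):
    """Hypothetical relation: x is at least as good as y iff x1*x2 >= y1*y2."""
    return x[0] * x[1] >= y[0] * y[1]


# Finite stand-in for a feasible set: integer bundles with x1 + x2 <= 10.
grid = [(i, j) for i in range(11) for j in range(11) if i + j <= 10]

print(best_choice_set(grid, weakly_prefers))
```

On this grid the filter returns a single bundle, so the best choice set is a singleton here. On an uncountable X the same definition applies, but existence then depends on the conditions stated in the propositions of this section.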

Figure 3.8: The model of preferences and conditions for existence of best choices. (Diagram: the choice situation $X$ enters the black box, which is the optimization principle with $\succsim$ on $X$ complete, transitive, and continuous, and the black box returns the choice $d \in B(X)$.)

The next two results concern existence and uniqueness. On the one hand, existence is easier to prove for utility functions than for preference relations, and therefore we postpone the proof of existence. On the other hand, uniqueness is easier to prove for preference relations than for utility functions, and we state the proof of this important result in this section. Therefore, without proof, we state the existence of a best alternative.

Proposition 11 If $X \subseteq \mathbb{R}^m$ is a non-empty and compact set and $\succsim$ on $X$ is complete, transitive and continuous, then $\succsim$ on $X$ admits a best element $x^* \in X$.

Remark 12 Without completeness, transitivity, and continuity of $\succsim$ on $X$, existence of a best $x^* \in X$ is not guaranteed. Every theory based upon a preference relation $\succsim$ on $X$ and the choice of a best $x^* \in X$ cannot get around this!

As Figure 3.8 illustrates, the model of preferences fills in the fundamental black box.

The next result states sufficient conditions for uniqueness. (The result is also illustrated in Figure 3.9.)

Proposition 12 (Uniqueness) If $X \subseteq \mathbb{R}^m$ is a non-empty, compact, and convex set and $\succsim$ on $X$ is complete, transitive, continuous, and strictly convex, then the best element $x^* \in X$ of $\succsim$ on $X$ exists and is unique.

Remark 13 Beware! Twice the notion of convex in one proposition.



Figure 3.9: Geometric interpretation of a best alternative on the boundary of X.

Proof of Proposition 12
Existence is guaranteed by the previous proposition. Next, suppose uniqueness fails. Then there are at least two best elements, say $x^*, x^{**} \in X$ with $x^* \neq x^{**}$. Then $x^* \sim x^{**}$ (otherwise one would be better than the other).

Because $X$ is a convex set, for any $t \in (0,1)$: $t x^* + (1-t) x^{**} \in X$ (feasibility).

Because $\succsim$ on $X$ is a strictly convex preference relation, we have that for any $t \in (0,1)$: $t x^* + (1-t) x^{**} \succ x^* \sim x^{**}$.

The last result contradicts that $x^*, x^{**} \succsim x$ for all $x \in X$, in particular if we take $x = t x^* + (1-t) x^{**}$.

The geometric interpretation of a unique best element on the boundary of $X$ is illustrated in Figure 3.9 for a preference relation that satisfies Axioms 7 to 9, strict monotonicity, and strict convexity. Adding the axiom of strict convexity is sufficient to arrive at a unique best element, i.e., $|B(X)| = 1$, where $|\cdot|$ denotes the number of elements of a finite set. In that case, we may write $b(X)$, where $b : X \to X$ is a function.
Local non-satiation excludes the possibility that best elements belong to the interior of the set of alternatives. We first introduce notation for the interior of a set and its boundary.
Figure 3.10: The model of preferences and conditions for uniqueness of best choice. (Diagram: the choice situation $X$ enters the black box, which is the optimization principle with $\succsim$ on $X$ complete, transitive, continuous, and strictly convex, and the black box returns the choice $d = b(X)$.)

Notation

$X^0$ denotes the interior of $X$.

$\partial X$ denotes the boundary of $X$.

Proposition 13 (Boundary solution) If $X \subseteq \mathbb{R}^m_+$ is a non-empty, compact and convex set and for $\succsim$ on $X$ completeness, transitivity, continuity, and local non-satiation hold, then $\succsim$ on $X$ admits a best element $x^* \in \partial X$.

Proof
The conditions ensure existence. Next, suppose not, meaning all best elements belong to the interior: for arbitrary $x^* \in B(X)$ we have $x^* \in X^0$. Then:

Because $x^*$ lies in the interior of $X$, there exists a sufficiently small $\varepsilon > 0$ such that $B_\varepsilon(x^*) \subseteq X^0$ (feasibility).

Because $\succsim$ on $X$ satisfies local non-satiation, there exists an $x^1 \in B_\varepsilon(x^*)$ with $x^1 \succ x^*$.

The last result contradicts that $x^* \succsim x$ for all $x \in X$, in particular if we take $x = x^1$. Hence, $x^* \in \partial X$.

Figure 3.9 illustrates this last result. Every circle around $x^*$ intersects $\succ(x^*)$, but all better alternatives are infeasible given that the choice is restricted to the set of feasible alternatives, $X$. Formally, the intersection between $\succ(x^*)$ and $X$ is the empty set.

3.3 A Choice Model of Utility Maximization


3.3.1 The Utility Function
When continuity of preferences is added to completeness and transitivity, we can ensure
the existence of a continuous utility function on X. We restate the de…nition of a utility
function for uncountable X.

Definition 8 A utility function induced by $\succsim$ on $X$ is a function $u : X \to \mathbb{R}$ such that for all $x, y \in X$: $u(x) \geq u(y) \iff x \succsim y$.

Proposition 14 If $\succsim$ on $X$ is complete, transitive and continuous, then $\succsim$ on $X$ induces a continuous utility function $u : X \to \mathbb{R}$.

The property of a continuous utility function is mathematically convenient. And in discussing computational economics, we will go one step further in convenience by assuming partial differentiability.
In the remainder of this chapter, whenever a utility function and a preference relation are stated in the same proposition, we assume that the continuous utility function $u : X \to \mathbb{R}$ is induced by the complete, transitive and continuous preference relation $\succsim$ on $X$.

3.3.2 Properties of Utility Functions


In this section, we state two properties of utility functions and relate these one-to-one to properties of preference relations. We will do so one property at a time.

The first property, the utility function being increasing, requires little explanation. It will not be too surprising that it is one-to-one related to strict monotonicity.

Definition 9 $u : X \to \mathbb{R}$ is increasing if for every $x^0, x^1 \in X$, $x^1 \neq x^0$:

$x^1 \geq x^0 \implies u(x^1) > u(x^0)$.

Proposition 15 $u : X \to \mathbb{R}$ is increasing $\iff$ $\succsim$ on $X$ is strictly monotonic.

Next, we state the properties of strict concavity and strict quasi concavity.

Definition 10 Let $X$ be a non-empty and convex set. Utility function $u : X \to \mathbb{R}$ is strictly concave, if for each $x, y \in X$, $x \neq y$, and $t \in (0,1)$:

$u(tx + (1-t)y) > t\,u(x) + (1-t)\,u(y)$

Quiz 16 For $u : \mathbb{R} \to \mathbb{R}$ given by $u(x) = x(1-x)$, verify whether the definition holds, either graphically or with a formal proof using the definition.
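Besides the graphical and the formal route, the inequality in Definition 10 can also be probed numerically. The sketch below is an illustration only, not a proof: it samples random points and weights and computes the difference between the left-hand and right-hand side of the definition for $u(x) = x(1-x)$. The helper name `concavity_gap` and the sampling range are assumptions made for this example. Expanding the squares shows that the difference equals $t(1-t)(x-y)^2$, and the code checks this identity up to rounding error.

```python
import random


def u(x):
    """The quadratic utility function of the quiz, u(x) = x * (1 - x)."""
    return x * (1 - x)


def concavity_gap(x, y, t):
    """Left-hand side minus right-hand side of the strict concavity inequality."""
    return u(t * x + (1 - t) * y) - (t * u(x) + (1 - t) * u(y))


random.seed(0)  # fixed seed so the run is reproducible
smallest_gap = float("inf")
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    t = random.uniform(0.01, 0.99)
    gap = concavity_gap(x, y, t)
    # Algebraic identity: the gap equals t * (1 - t) * (x - y)^2.
    assert abs(gap - t * (1 - t) * (x - y) ** 2) < 1e-9
    smallest_gap = min(smallest_gap, gap)

print(smallest_gap)
```

Since $t(1-t)(x-y)^2$ is positive whenever $x \neq y$ and $t \in (0,1)$, the identity itself is the formal argument; the random sample merely confirms the algebra.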

Definition 11 Let $X$ be a non-empty and convex set. Utility function $u : X \to \mathbb{R}$ is strictly quasi concave, if for each $x, y \in X$, $x \neq y$, and $t \in (0,1)$:

$u(tx + (1-t)y) > \min\{u(x), u(y)\}$

Quiz 6 For $u : \mathbb{R}_+ \to \mathbb{R}$ given by the convex(!) function $u(x) = x^2$, verify that this function is strictly quasi concave, either graphically or with a formal proof using the definition.
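The inequality in Definition 11 is easy to evaluate at individual points. The sketch below assumes that the quiz is meant for non-negative numbers, where $u(x) = x^2$ is increasing. It also shows why the domain matters: on all of $\mathbb{R}$ the pair $x = -1$, $y = 1$ violates the inequality. The helper name `satisfies_inequality` is made up for this illustration.

```python
def u(x):
    """The convex function of the quiz, u(x) = x^2."""
    return x * x


def satisfies_inequality(x, y, t):
    """Check u(t*x + (1-t)*y) > min{u(x), u(y)} for one triple (x, y, t)."""
    return u(t * x + (1 - t) * y) > min(u(x), u(y))


# Two non-negative points: the mixture 2 has utility 4, above min{1, 9} = 1.
print(satisfies_inequality(1.0, 3.0, 0.5))

# Points on opposite sides of zero: the mixture 0 has utility 0, below min{1, 1} = 1.
print(satisfies_inequality(-1.0, 1.0, 0.5))
```

On the non-negative numbers a mixture of two distinct points lies strictly above the smaller one, and because $u$ is strictly increasing there, its utility exceeds the minimum. That observation is the core of the formal proof.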
It turns out that strict convexity of a preference relation is equivalent to strict quasi-concavity of the associated utility function (if a utility function exists).

Proposition 17 Let $X$ be a convex set. Utility function $u : X \to \mathbb{R}$ is strictly quasi concave if and only if $\succsim$ on $X$ is strictly convex.

3.3.3 Utility Maximization


In this section, we derive sufficient conditions for the existence of a utility maximizing alternative, and similarly for its uniqueness.
The first result gives sufficient conditions for the existence of a utility maximizing alternative.

Proposition 18 If $X$ is a non-empty and compact set and $u : X \to \mathbb{R}$ is continuous, then there exists a utility maximizing $x^* \in X$.

Proof
Existence follows directly from a standard result in Calculus and Analysis: Any continuous function on a non-empty, compact set attains a maximum.

Recall that we postponed the proof of existence of a best element for the model of preferences. The equivalence between the utility maximization model and the choice model of preferences immediately implies the following result for the model of preferences.

Corollary 19 If $X$ is a non-empty and compact set and $\succsim$ on $X$ is complete, transitive and continuous, then $\succsim$ on $X$ admits a best element $x^* \in X$.

Uniqueness of a utility maximizing alternative is related to strict quasi concavity of the utility function and the set of feasible alternatives being a convex set. Since strict quasi concavity of utility functions is equivalent to strict convexity of the preference relation, and we already established that the latter is sufficient for uniqueness in the model with preferences, we immediately obtain the following result.

Proposition 20 If $X \subseteq \mathbb{R}^m$ is a non-empty, compact and convex set and utility function $u : X \to \mathbb{R}$ is continuous and strictly quasi concave, then the utility maximizing $x^* \in X$ exists and is unique.

The geometric interpretation of this result is similar to the one pictured in Figure 3.9.

Example 14 (Consumer Theory) Consider a consumer who has preferences over $x_1 \geq 0$ units of food and $x_2 \geq 0$ units of gasoline for driving his car given by: $x \succsim y \iff x_1 x_2 \geq y_1 y_2$. The consumer's budget set is given by

$A = \{(x_1, x_2) \in \mathbb{R}^2_+ \mid 2x_1 + 3x_2 \leq 100\}$.

Because the utility function is increasing, the consumer will spend his entire income in the utility maximizing vector. Therefore, the Lagrange method can be applied to calculate the utility maximizing vector, which is $x^* = (25, 16\tfrac{2}{3})^\top$. At $x^*$, the indifference curve is tangent to the budget constraint. (We elaborate more on this in Week 4.)
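The calculation behind Example 14 can be retraced in a few lines. The sketch below assumes the Lagrangian $L = x_1 x_2 + \lambda (100 - 2x_1 - 3x_2)$, solves its first-order conditions with exact fractions, and then cross-checks the result by a crude search along the budget line. The variable names and the grid of 0.01 steps are choices made for this illustration only.

```python
from fractions import Fraction

price_food, price_gas, income = Fraction(2), Fraction(3), Fraction(100)

# First-order conditions of L = x1*x2 + lam*(income - p1*x1 - p2*x2):
#   x2 = lam * p1   and   x1 = lam * p2.
# Substituting both into the budget line gives 2 * lam * p1 * p2 = income.
multiplier = income / (2 * price_food * price_gas)
food = multiplier * price_gas
gasoline = multiplier * price_food


def utility(x1, x2):
    """Utility representing the preferences of Example 14."""
    return x1 * x2


def gasoline_on_budget_line(x1):
    """Gasoline that exhausts the budget when x1 units of food are bought."""
    return (income - price_food * x1) / price_gas


# Cross-check: search food quantities 0.00, 0.01, ..., 50.00 on the budget line.
candidates = [Fraction(k, 100) for k in range(0, 5001)]
best_on_grid = max(candidates, key=lambda x1: utility(x1, gasoline_on_budget_line(x1)))

print(food, gasoline, best_on_grid)
```

Both routes give the same food quantity, and the gasoline quantity follows from the budget line. The multiplier equals the marginal utility of income at the optimum.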

3.4 Inference
All concepts and results are identical to those in Section 2.4.

3.5 Computational Economics (Optional)


In computational economics, calculus methods are applied to numerically compute utility maximizing points. Application requires utility functions that are partially differentiable.

Notation

$\nabla u(x)$ denotes the gradient of the function $u$ at $x$ (see footnote 2).

Proposition 21 Let $X$ be a non-empty, compact and convex set, and let $\succsim$ on $X$ induce a partially differentiable utility function $u : X \to \mathbb{R}$. If $\succsim$ on $X$ is locally non-satiated, then the gradient $\nabla u(x) \neq 0$, and every utility maximizer $x^* \in \partial X$ (boundary solution).

In case the boundary of $X$ can be described by a single function, as is the case for the area enclosed by the unit circle, $x_1^2 + x_2^2 \leq 1$, the Lagrange method becomes available to solve for utility maximizing alternatives on the boundary. In case $X$ can be described by several constraints, the Kuhn-Tucker conditions need to be invoked in numerical computations to compute utility maximizing alternatives.
In many numerical applications in economics, strict concavity of the utility functions is imposed (together with convexity of the functions describing constraints in standard form $g(x) \leq 0$) to utilize very efficient algorithms. We refer to the course on OR1 for details.
Quadratic utility functions are also popular in Economics as second-order Taylor approximations of more complex strictly concave utility functions. These allow one to derive formulas that express the utility maximizing choice. For example, $u(x) = x(a - x)$ on $\mathbb{R}$ has $x^* = \tfrac{1}{2}a$ as its maximizer.
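The closed-form maximizer of the quadratic example follows from the first-order condition $u'(x) = a - 2x = 0$, and a simple numerical method finds the same point. The sketch below uses plain gradient ascent with a fixed step size. The function name, the step size, and the iteration count are assumptions made for this illustration, and this is not one of the efficient algorithms referred to above.

```python
def maximize_quadratic(a, start=0.0, step=0.1, iterations=200):
    """Gradient ascent on u(x) = x * (a - x), whose derivative is a - 2x."""
    x = start
    for _ in range(iterations):
        x += step * (a - 2 * x)  # move in the direction of increasing utility
    return x


a = 3.0
numerical = maximize_quadratic(a)
closed_form = a / 2  # from the first-order condition a - 2x = 0
print(numerical, closed_form)
```

Each iteration shrinks the distance to $\tfrac{1}{2}a$ by the factor $1 - 2 \cdot \text{step}$, so with step 0.1 the iterate converges quickly. Strict concavity of $u$ is what guarantees that the stationary point is the unique maximizer.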

3.6 Behavioral Economics (Optional)


The fundamental violations of the model with preferences were already established for finite sets of alternatives. The empirical tests reject the model of preferences, and hence, reject the utility maximization model. Nevertheless, the assumptions of strict monotonicity and strict convexity of preferences have economic appeal, but do not expect most people to have such preferences.
Although the assumptions about preferences, or equivalently utility functions, will often not be realistic, they can hardly be avoided when applying numerical techniques from Calculus, and they boost the efficiency of available computer algorithms.
The conclusion is similar to that of Chapter 2: Given that the preference model (and thus the utility model also) is rejected in many experiments, one could give up filling in the black box. Instead of inferring preferences, one can simply state that $B(A)$ is the set of alternatives and estimate the relation between subsets of $A$ and observed choice directly from choice data.

Footnote 2: There is no uniform symbol for the gradient; $\nabla u$ is common in Mathematical Economics, as is the notation $Du$.

3.7 Exercises
Exercises

Exercise 1
Consider $X = \mathbb{R}^2$ and preference relations $\succsim$ on $X$ that can be represented by the utility functions $u : \mathbb{R}^2 \to \mathbb{R}$. The utility functions of items a and b are called devil's mountains in Computational Economics and Operations Research. Solve for $B(X)$ graphically by drawing the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ for several well-chosen choices of $x^0$ until you can explain what is going on.

a. $u(x) = \mathrm{Floor}(x_1^2 + x_2^2)$ (see footnote 3).

b. $u(x) = \mathrm{Ceiling}(x_1^2 + x_2^2)$.

c. Argue whether each $\succsim$ on $X$ of a and b is locally non-satiated.

d. Recall Herbert Simon's concept of satisficing. What to expect from a satisficing individual or software program?

Exercise 2
In this exercise, we consider $X = \mathbb{R}^2$ and $\succsim$ on $X$ given by $x \succsim y \iff x_1 > y_1$ (indeed, the inequality is strict).

Footnote 3: Note that $x_1^2 + x_2^2 = c$, $c > 0$, describes a circle.

a. Sketch the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ for several well-chosen choices of $x^0$.

b. Argue whether $\succsim$ on $X$ is complete and reflexive (recall $x \succsim x$ for all $x \in X$). How does your answer relate to $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$? And is $\succsim(x^0) \cup \precsim(x^0) = X$ true?

c. Argue whether $\succsim$ on $X$ is transitive.

d. Argue whether $\succsim$ on $X$ is continuous.

e. Argue whether $\succsim$ on $X$ is locally non-satiated.

f. Argue whether $\succsim$ on $X$ is strictly monotonic.

g. Argue whether $\succsim$ on $X$ is strictly convex (if not, provide a counterexample).

Exercise 3
Consider square $X = \mathbb{R}^2$ and $\succsim$ on $X$ given by $x \succsim y \iff x_2 - (x_1)^2 \geq y_2 - (y_1)^2$.

a. Sketch the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ for $x^0 = (0, 0)$.

b. Argue whether $\succsim$ on $X$ is continuous.

c. Argue whether $\succsim$ on $X$ is locally non-satiated.

d. Argue whether $\succsim$ on $X$ is strictly monotonic.

e. Argue whether $\succsim$ on $X$ is strictly convex.

f. For $X = [-1, 1]^2 \subset \mathbb{R}^2$, derive $B(X)$ from your figures (plot $\succsim(x^0)$ for several values of $x^0$).

Exercise 4
Consider once more $X = \mathbb{R}^2$ and the preference relation $\succsim$ on $X$ given by $x \succsim y \iff \mathrm{Floor}(x_1^2 + x_2^2) \geq \mathrm{Floor}(y_1^2 + y_2^2)$. Use the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ sketched before.

a. Argue whether $\succsim$ on $X$ is continuous.

Either show the statement, or provide a counterexample.

b. $\succsim$ on $X$ is strictly monotonic.

c. $\succsim$ on $X$ is strictly convex.

d. Reflect on whether it is possible for $\succsim$ on $X$ to induce a continuous utility function $u : X \to \mathbb{R}$, or only discontinuous utility functions.

Concepts, Theory and Proofs

Exercise 5
Consider the lexicographic preference relation $\succsim$ on $X = \mathbb{R}^2_+$ given by $x \succsim y \iff x_1 > y_1$, or $x_1 = y_1$ and $x_2 \geq y_2$.

a. Sketch the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ for $x^0 = (1, 2)$.

b. Argue whether $\succsim$ on $X$ is continuous.

Either show the statement, or provide a counterexample.

c. $\succsim$ on $X$ is transitive.

d. $\succsim$ on $X$ is locally non-satiated.

e. $\succsim$ on $X$ is strictly convex.

f. Reflect on whether lexicographic preferences induce a utility function.

Exercise 6
Consider $X \subseteq \mathbb{R}^m$ and the preference relation $\succsim$ on $X$. Either show the statement, or provide a counterexample.

a. $\succsim$ on $X$ is complete $\iff$ $\succsim(x^0) \cup \precsim(x^0) = X$.

b. $\succsim$ on $X$ is complete $\implies$ $\forall x^0 \in X$: $\sim(x^0) \neq \emptyset$.

c. Let $X$ be a non-empty and convex set and let $\succsim$ on $X$ be complete, transitive and continuous. The statement is: $\succsim$ on $X$ is a convex preference relation $\iff$ $\succsim(x^0)$ is a convex set.

d. If $u : \mathbb{R} \to \mathbb{R}$ is a constant function ($u(x) = u(y)$ for all $x, y$), then $u$ is quasi concave.

e. If $u : \mathbb{R} \to \mathbb{R}$ is an increasing function ($u(x) > u(y)$ whenever $x > y$), then $u$ is strictly quasi concave.

Exercise 7
Consider once more the multidimensional decision problem of the American college football coach, who considers more height and a higher speed as more desirable. We assume the triangle $X = \{x \in \mathbb{R}^2_+ \mid 2x_1 + 3x_2 \leq 6\} \subset \mathbb{R}^2_+$ and $x \succsim y \iff x \geq y$.

a. Sketch the preference sets $\succsim(x^0)$, $\precsim(x^0)$ and $\sim(x^0)$ for $x^0 = (1, 1)$.

Either show the statement, or provide a counterexample.

b. $\succsim$ on $X$ is continuous.

c. $\succsim$ on $X$ is strictly monotonic.

d. $\succsim$ on $X$ is strictly convex.

e. Derive $B(X)$ and the set of maximal elements $M(X)$.

g. Reflect on the statement: $\succsim$ on $X$ is complete $\iff$ $\succsim(x^0) \cup \precsim(x^0) = X$.

Computational Economics

Exercise 8
In this exercise, we reformulate the problem of the football coach to capture multidimensional problems such as the 3P (People, Planet, Profit), where the set $Z$ below could represent production plans that create profit but also affect sustainability (reduce pollution).
Consider a non-empty, compact and convex set of alternatives $Z \subseteq \mathbb{R}^m$ with two dimensions. Each dimension has a ratio scale with a well-defined 0. The relation between $z$ and the 'utility measure' of dimension $i = 1, 2$ is given by the increasing and continuous utility function $u_i : Z \to \mathbb{R}$. Thus, the functions $u_1$ and $u_2$ map every $z \in Z$ into $X \subseteq \mathbb{R}^2$. (As an illustration, the American college football coach considers height and speed, mapping athletes ($z$) into the height-speed space.)
Consider the multidimensional decision problem in $\succsim$ on $Z$. Obviously,

$z^1 \succsim z^0 \iff (u_1(z^1), u_2(z^1)) \geq (u_1(z^0), u_2(z^0))$.

In Chapter 2, we argued that a weight $\alpha \in (0, 1)$ could be given to restore completeness and transitivity. That is, define the utility function $U : Z \to \mathbb{R}$ given by $U(z) = \alpha u_1(z) + (1 - \alpha) u_2(z)$. This function has to be maximized over the set $Z$.

a. Argue whether $\succsim$ on $Z$ is continuous.

b. Argue whether $\succsim$ on $Z$ is strictly monotonic, or provide a counterexample.

c. Argue whether $\max_{z \in Z} U(z)$ admits a maximizing $z^*$ and whether it is a boundary solution, $z^* \in \partial Z$.

d. Let $c$ belong to the image of $u_2$. Write down the Lagrangian function of the optimization $\max_{z \in Z} u_1(z)$, s.t. $c - u_2(z) = 0$ (see footnote 4). Hint: Relate the Lagrangian to $\max_{z \in Z} U(z)$. Explain.

e. Suppose you are a consultant at one of the big four consultancy firms. A senior colleague suggests the approach under items a - c. You disagree and suggest using d for a suitable choice of coefficient $c$. What are your (short) arguments to convince your senior colleague to adopt the Lagrange method of d? Hint: is there a similarity between your colleague's and your approach?

In academic year 2017-2018 add exercises on preference for inequity aversion as a multi-dimensional decision making and how to make use of $f \circ u$ in computational economics, in particular for log-concave functions $u$ such as Cobb Douglas.

Footnote 4: Actually, $\max_{z \in Z} u_1(z)$, s.t. $u_2(z) \geq c$ is the approach, but that requires the similar Kuhn-Tucker conditions applied to the standard formulation $\max_{z \in Z} u_1(z)$, s.t. $c - u_2(z) \leq 0$.
Chapter 4

Stochastic Choice among a Finite


Set of Alternatives

Until now, we considered deterministic choice problems. In this chapter, we consider stochastic choice problems where the agents do not have all the information needed to formulate their preference relation.
Consider a consumer who visits a particular web site to, say, book a flight, a holiday or to buy some electronics product. This consumer may visit the web site, look around and leave, and after several visits book a flight or buy the electronics product. In a nutshell, this consumer looked at the same set of alternatives several times, where "leave and do not purchase anything" is one of the alternatives. So, several times this consumer chose "leave" from the alternatives available to her and one time she purchased an item. Can this behavior be explained in terms of the preference model? Perhaps, if there is sufficient information.
The web company wants to know how to predict the consumer's behavior. Nowadays, the data analytics of click data has matured and many web shops purchase consumer profiles. These consumer profiles are used to predict the behavior of an individual consumer, but these profiles are also incomplete because not everything is known (and not everything can be known, for example the mood a consumer is in). From the perspective of the web shop, of consumers with profile X only a fraction buy during the first visit. Therefore, the web company faces stochastic choice from individual consumers. Web companies invest large amounts of money to better predict consumer behavior and optimize their web site and pricing strategy.
Most of this chapter is taken from Chapter 1 of Duncan Luce (1959) (see footnote 1), including some discussions on the perspective taken and some illuminating discussions of similarities and differences with economics. Some minor adaptations in notation are made to arrive at uniform notation in this manuscript, and some quiz questions and editorial comments are added for better understanding.

Footnote 1: Duncan Luce (1959), Individual Choice Behavior: A Theoretical Analysis, Blackwell-Wiley.

4.1 Motivation
Duncan Luce: "One large portion of psychology – including at least the topics of sensation, motivation, simple selective learning, and reaction time – has a common theme: choice. Stochastic choice attempts a partial mathematical description of individual choice behavior in which the distinction is not made except in the language used in different interpretations of the theory. Thus the more neutral word "alternative" is used to include the several cases.
In the approach taken, we shall be concerned with possible lawfulness found among different, but related, choice situations, whether these are choices among stimuli or among responses. Possibly the simplest prototype of this type of theory is the frequently assumed rule of transitivity among choices: given that a person chooses a over b and that he chooses b over c, then he chooses a over c when a and c are offered. This assumption, were it true, would be a law relating a person's choice in one situation to those in two others.
The results that follow – which seem to afford some insight into, and some integration of, psychological and psychophysical scaling, utility theory, and learning theory – will implicitly serve as the argument for the course taken."

4.1.1 Probabilistic versus Algebraic Theories


(Ed. Probabilistic = stochastic; algebraic choice = deterministic model of preferences)

"A basic presupposition of stochastic choice is that choice behavior is best described as a probabilistic, not an algebraic, phenomenon. That is to say, at any instant when a person reaches a decision between, say, a and b we will assume that there is a probability $P(a, b)$ that the choice will be a rather than b [footnote 2]. These probabilities will generally be different from 0 and 1, although these extreme (and important) cases will not be excluded. The alternative is to suppose that the probabilities are always 0 and 1 and that the observed choices tell us which it is; in this case the algebraic theory of relations seems to be the most appropriate mathematical tool.
The decision between these two approaches does not seem to be empirical in nature. Various sorts of data – intransitivities of choices and inconsistencies when the same choices are offered several times – suggest the probabilistic model, but they are far from conclusive. Both of these phenomena can be explained within an algebraic framework provided that the choice pattern is allowed to change over time, either because of learning or because of other changes in the internal state of the individual. The presently unanswerable question is which approach will, in the long run, give a more parsimonious and complete explanation of the total range of phenomena.
The probabilistic philosophy is by now a commonplace in much of psychology, but it is a comparatively new and unproven point of view in utility theory. To be sure, economists when pressed will admit that the psychologist's assumption is probably the more accurate, but they have argued that the resulting simplicity warrants an algebraic idealization. Ironically, some of the following results suggest that, on the contrary, the idealization may actually have made the utility problem artificially difficult."
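(Ed. To make the probability $P(a, b)$ tangible, one may think of it as the long-run relative frequency with which a is chosen when the pair {a, b} is offered. The sketch below estimates this frequency from a record of repeated choices. The data are hypothetical and merely mimic the web shop visitor of the chapter introduction who leaves three times and buys once; the function name is made up for this illustration and is not part of Luce's text.)

```python
from collections import Counter


def pairwise_choice_frequency(observed_choices, alternative):
    """Relative frequency with which `alternative` was chosen from the pair offered."""
    if not observed_choices:
        raise ValueError("at least one observed choice is needed")
    counts = Counter(observed_choices)
    return counts[alternative] / len(observed_choices)


# Hypothetical record of four visits in which the pair {buy, leave} was offered.
visits = ["leave", "leave", "leave", "buy"]

estimate_buy = pairwise_choice_frequency(visits, "buy")
estimate_leave = pairwise_choice_frequency(visits, "leave")
print(estimate_buy, estimate_leave)
```

(Ed. The two estimates add up to one because exactly one alternative is chosen at each visit. An algebraic, deterministic model would instead require one of the two frequencies to be 0 and the other to be 1.)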

4.1.2 Multiple Alternative Choices

"Once choice behavior is assumed to be probabilistic, a problem arises which does not exist in the algebraic models. Complete data concerning the choices that a person makes from each possible pair of alternatives taken from a set of three or more alternatives do not appear to determine what choice he will make when the whole set is presented. Because they cannot escape multiple alternative choice problems, economists have been particularly sensitive to this feature of probabilistic models, and it has undoubtedly been one source of their resistance in admitting imperfect discrimination. Early psychologists, particularly learning theorists, studied multiple alternatives experimentally, but since the data seemed dreadfully complicated a trend set in toward fewer and fewer alternatives until now many studies employ only two. For the most part, present-day psychologists have been willing to ignore – or, to be more accurate, to bypass and postpone – the connections between pairwise choices and more general ones. And so the relations have remained obscure.
We shall center our attention on this problem. The method of attack is to introduce a single axiom, the Choice Axiom, relating the various probabilities of choices from different finite sets of alternatives. It is a simple and, I feel, intuitively compelling axiom that appears to illuminate many of the more traditional problems, in particular the question of whether or not a comparatively unique numerical scale exists which reflects choice behavior. Such a scale, unique except for its unit, is shown to exist very generally. It appears to be the formal counterpart of the intuitive idea of utility (or value) in economics, of incentive value in motivation, of subjective sensation in psychophysics, and of response strength in learning theory."

Footnote 2: Ed. In terms of conditional probabilities, we have $P(a, b) = P(a \mid \{a, b\})$. Thinking in terms of conditional probabilities may help understand the theory more easily.

4.1.3 Well-Defined Sets of Alternatives

"So far, there seems to have been an implicit assumption that no difficulty is encountered in deciding among what it is that an individual makes its choices. Actually, in practice, it is extremely difficult to know, and much experimental technique is devoted to arranging matters so that the individual and the experimenter are (thought to be) in agreement about what the alternatives are. All of our procedures for data collection and analysis require the experimenter to make explicit decisions about whether a certain action did or did not occur, and all of our choice theories – including this one – begin with the assumption that we have a mathematically well-defined set, the elements of which can be identified with the choice alternatives. How these sets come to be defined for individuals, how they may or may not change with experience, how to detect such changes, etc., are questions that have received but little illumination so far. There are limited experimental results on these topics, but nothing like a coherent theory. Indeed, the whole problem still seems to be floundering at a conceptual level, with us hardly able to talk about it much less to know what experiments to perform.
More than any other single thing, in my opinion, this Achilles' heel has limited the applicability of current theories of choice: it certainly has been a significant stumbling block in the use of information theory in psychology, it has limited learning theory applications to a rather special class of phenomena typified by T-maze experiments, etc. The present theory is no different in this respect from the others."

4.2 The Probability and Choice Axioms


"We shall suppose that a finite set $A = \{a_1, \ldots, a_m\}$ is given which
is to be interpreted as the universe of possible alternatives (stimuli or responses).
In practice A will have to possess a certain homogeneity: the
decision maker will have to be able to evaluate the elements of A according
to some comparative dimension and to be able to select from certain finite
subsets of A the elements that he thinks are superior (or inferior or distinguished
in some way) along that dimension. For example, in economics A
may be taken to be a set of commodity bundles among which a person can
express preferences; in psychophysics it may be the set of possible sound
energies (at a fixed frequency) which a subject can be asked to evaluate
according to loudness; or in learning theory A may be the set of alternative
responses available to the individual.
In general, a subject is not asked to make a choice from the whole of A
but rather from some (small) finite subsets. In a great many experiments
only two alternatives are presented to the subject at a time, and he is
required to choose the one he prefers or the one he deems louder, etc. Of
course, larger subsets could be used, although for the most part they have
not been, and certainly most daily decisions are from larger subsets (e.g.,
the choice of a meal from a menu or the choice among several jobs, etc.).
Let T be a finite subset of A and suppose that an element must be
chosen from T. If a is an element of T (written $a \in T$), let $P_T(a)$ denote
the probability that the selected element is a. Slightly more generally, if
S is a subset of T (written $S \subseteq T$), let $P_T(S)$ denote the probability that
74CHAPTER 4. STOCHASTIC CHOICE AMONG A FINITE SET OF ALTERNATIVES

the selected element lies in the subset S. These probabilities are the basic
ingredients of the following theory.3
In most choice models we would write $P(a)$ for $P_T(a)$ because the choice
set T is held invariant throughout the discussion; in fact, we would let T
and A be the same set. Here, however, several different choice sets are
to be considered at once. Let us suppose that we are working with 1000
cps tones at different intensities measured in db above some reference level;
let z, a, b, and c denote, respectively, the 50, 52, 54, and 56 db tones.
Formally, $A = \{z, a, b, c\}$. Let $T = \{z, a, b\}$ and $T' = \{a, b, c\}$ and consider
choices according to loudness. There is assumed to be some probability,
denoted by $P_T(a)$, that a, the 52 db tone, will be called loudest when T is
presented, and another, generally different, probability $P_{T'}(a)$ that a will
be called loudest when $T'$ is presented. There is no reason to expect these
probabilities to be the same, and the purpose of the subscripts is to make
the several probabilities identifiable. It must not be forgotten, however,
that all of the probabilities having the same subscript T form an ordinary
probability measure on the subsets of T. This means, explicitly, that the
following is assumed:"

Axiom 13 (The Probability Axiom (P-axiom)) (i) For $T \subseteq A$, $0 \le P_A(T) \le 1$.

(ii) $P_A(A) = 1$.

(iii) If $S, T \subseteq A$ and $S \cap T = \emptyset$, then $P_A(S \cup T) = P_A(S) + P_A(T)$.

(Ed.) Because the probability axiom is so self-evident for Duncan Luce, he almost
never mentions this axiom and only talks about the Choice Axiom as being the only
axiom imposed, see previous sections. Both axioms are mentioned in the main results
below as a reminder that both are needed.

"Repeated application of part (iii) implies that

\[ P_A(T) = \sum_{a \in T} P_A(a) \qquad (4.1) \]

therefore, it is always sufficient to state results just for $P_A(a)$.


3 Ed. Later $P_T(S)$ will be interpreted as the conditional probability $P(S \mid T)$, the probability of an
element in the set S being chosen given that only alternatives from the set $T \subseteq A$ are available.

Note that, given our interpretation of these probabilities, part (ii) means
that the subject is forced to make a choice: the probability is 1 that his
choice is in T when he must confine his choice to $T \subseteq A$.
For simplicity of notation, and to conform to standard usage, $P(a, b)$ is
written to stand for $P_{\{a,b\}}(a)$ when $a \neq b$. It will be convenient to introduce
the convention $P(a, a) = \frac{1}{2}$ so that certain equations
(e.g., $P(a, b) + P(b, a) = 1$) can be written without any restriction on the values assumed
by a and b. For later reference, $P(b, a) = 1 - P(a, b)$ for all a and b, and
in particular for $b = a$ the convention $P(a, a) = \frac{1}{2}$ follows immediately.4

The axioms of ordinary probability theory establish certain restraints


upon each of the measures PA , but no connections are assumed among the
several measures. However, one suspects that, at least for choice behavior,
the several measures cannot be completely independent. The relationship
we shall investigate can be stated as follows:"

Axiom 14 (The Choice Axiom (C-axiom)) Let $S, T \subseteq A$.

(i) If $P(a, b) \in (0, 1)$ for all $a, b \in A$, then for all $S \subseteq T \subseteq A$:

\[ P_A(S) = P_T(S) \, P_A(T). \]

(ii) If $P(a, b) = 0$ for some $a, b \in A$, then for all $T \subseteq A$:

\[ P_A(T) = P_{A \setminus \{a\}}(T \setminus \{a\}). \]

"The Choice Axiom holds for the set A" is used to mean not only that it
holds for A itself but also that it holds for every subset of A."
Pleskac (2012)5 explains: "Part (ii) is more or less a housekeeping assumption.
It allows alternatives that are never chosen in pairwise choices
to be deleted from A without impacting the choice probabilities. If salted
whitefish is never chosen in pairwise choices with trout, then in choices
between salted whitefish, trout, and walleye, salted whitefish can be safely
deleted reducing the choice to trout or walleye."

4 Ed. Think of $P(a, a)$ in terms of the example of reflexivity with two identical cans of Coca Cola,
$C_1$ and $C_2$. Suppose these two identical cans are put next to each other on a shelf in a store, then it
seems equally likely that someone picks either $C_1$ or $C_2$.
5 Timothy J. Pleskac (2012) is a working paper retrieved from
https://msu.edu/~pleskact/research/papers/pleskac_inpress_Luce.pdf. This paper appeared as:
Pleskac, T. J. (2015). Decision and choice: Luce's choice axiom. In J. D. Wright (Ed.),
International encyclopedia of the social and behavioral sciences. Elsevier.
There are a number of points, both technical and conceptual, that
should be made about the axiom.

"Part (ii) of the axiom simply states that if b is invariably chosen over
a then a may be deleted from A when considering choices from $T \subseteq A$.
This seems reasonable. If one never selects liver in preference to roast beef,
then in choosing among liver, roast beef, and chicken one can immediately
reduce the problem to consideration of roast beef and chicken. Furthermore,
in choice work in which discrimination is assumed to be deterministic it has
been customary to assume that pairwise choices are transitive. It would be
unfortunate if the Choice Axiom were at variance with this assumption; it
is not."

Lemma 22 Let Axioms 13 and 14 hold.

1. [IIA] If $P(a, b) = 0$ for some $a, b \in A$, then $P_A(a) = 0$.

2. [Transitivity] Let $A = \{a, b, c\}$. If $P(a, b) = 1$ and $P(b, c) = 1$, then $P(a, c) = 1$.

The proof of this lemma is left as an exercise.

"With respect to the first part of Lemma 22: By repeated applications of part (ii)
of the Choice Axiom, the choice set can be reduced to one in which only
cases of imperfect discrimination ($P(a, b) \neq 0$ or $1$) occur, and then part
(i) becomes applicable to all elements of this reduced subset of A. The
removed alternatives are irrelevant for the choice situation at hand. Next,
consider the second part of Lemma 22. It states that the Choice Axiom is
a probabilistic version of transitivity."

(Ed.) To summarize, the Choice Axiom is a probabilistic version of two of the more
important axioms in deterministic choice theory: independence of irrelevant alterna-
tives (IIA) and transitivity.

Interpretation of part (i) of the Choice Axiom.

"To deal with complicated decisions ($P(a, b) \neq 0$ or $1$), it is usual to subdivide
them into two or more stages: the alternatives are grossly categorized
in some fashion and a first decision is made among these categories; the
one chosen is further categorized and a second decision is made, etc. It is
commonly accepted, and it is probably true, that when such a multistage
process is needed the over-all result depends significantly upon which intermediate
partitionings are employed. One senses, however, that if the
decision situation is quite simple – so that a two-stage process is not really
needed – then the intermediate categorization, if used, will not matter.
That is to say, the product $P_T(S) \, P_A(T)$ will not depend upon T. But,
by taking $T = A$, we see that this product must be $P_A(S)$, which is part
(i) of the Choice Axiom.
These remarks make it clear that we cannot expect the axiom to be
valid except for simple decisions, but this is no real limitation, since, as we
shall see, our results really require only that it be correct for sets of three
alternatives.
The axiom may be viewed in another way provided conditional probability
is defined in the usual manner, i.e., if $P_A(T) > 0$, then"

\[ P_A(S \mid T) = \frac{P_A(S \cap T)}{P_A(T)}. \]

(end quotation)"

Lemma 23 Let Axioms 13 and 14 hold.

1. If $P(a, b) \in (0, 1)$ for all $a, b \in A$, then $P_T(S) = P_A(S \mid T)$ for all $S \subseteq T \subseteq A$.

2. If $P(a, b) \in (0, 1)$ for all $a, b \in A$, then for any $T \subseteq A$ such that $a, b \in T$:

\[ \frac{P_T(a)}{P_T(b)} = \frac{P(a, b)}{P(b, a)}. \]

The proof of this lemma is also left as an exercise.

"With respect to the first part of Lemma 23: Ignoring cases of perfect discrimination,
this lemma says that the axiom requires that the measure $P_T$ be
identical to the conditional measure induced by $P_A$. As a concrete example,
suppose that A is the set of entrees on a certain menu, T is some proper
subset of A that includes roast beef, and S the single element set of roast
beef. The heart of the axiom is the assumption that when, for whatever
reason, the restaurant has only the entrees T the probability of selecting
roast beef is the same as the conditional probability of selecting it from T
when the whole menu is available.

Figure 4.1: The model of stochastic choice fills in the black box of Economics.
[Diagram: the choice situation A enters a black box, consisting of the Ordinary Probability
Axiom plus Duncan Luce's Choice Axiom, which yields the choice described by
$P_T(a) / P_T(b) = P(a, b) / P(b, a)$.]

When first examining part (i) of the axiom, some have felt that it is
tautological; however, the foregoing example should make it clear that a
substantive assumption is involved. This can be checked formally by writing
out the sample space involved – it will not be done here – or, less formally,
by just observing that two distinct experiments are required to verify the
axiom. In one A is offered to the subject and $P_A$ is estimated; in the other
T is offered and $P_T$ is estimated.
It has been implicit in the discussion, and is explicit in the title of
the book (Ed. Individual Choice Behavior), that this theory – the Choice
Axiom in particular – applies to single individuals, not to averages over
groups of them. It is not difficult to see that every individual in a group
could satisfy the axiom, yet the average probabilities violate it, and vice
versa. For example, consider two individuals, 1 and 2, with probabilities

$P_A^{(1)}(S) = 0.72$, $P_T^{(1)}(S) = 0.80$, $P_A^{(1)}(T) = 0.90$,

$P_A^{(2)}(S) = 0.02$, $P_T^{(2)}(S) = 0.20$, $P_A^{(2)}(T) = 0.10$,

that satisfy the Choice Axiom individually. The group averages are 0.37,
0.50, and 0.50, which fail to satisfy the axiom, since $0.50 \times 0.50 = 0.25 \neq 0.37$.
This does not mean that group studies can never be used in connection
with this theory, but they must be chosen with care so as not to do violence
to the basic ideas.
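
(Ed.) The arithmetic of the two-individual example can be checked with a short sketch. The dictionary keys and the function name below are illustrative choices, not notation from Luce's book; the numbers are the ones quoted above.

```python
# Probabilities (P_A(S), P_T(S), P_A(T)) for the two individuals in the example.
individuals = {
    1: {"PA_S": 0.72, "PT_S": 0.80, "PA_T": 0.90},
    2: {"PA_S": 0.02, "PT_S": 0.20, "PA_T": 0.10},
}


def satisfies_choice_axiom(p, tol=1e-9):
    """Check part (i) of the Choice Axiom: P_A(S) = P_T(S) * P_A(T), up to rounding."""
    return abs(p["PA_S"] - p["PT_S"] * p["PA_T"]) < tol


# Group averages of each of the three probabilities.
average = {
    key: sum(p[key] for p in individuals.values()) / len(individuals)
    for key in ("PA_S", "PT_S", "PA_T")
}

individual_ok = {i: satisfies_choice_axiom(p) for i, p in individuals.items()}
average_ok = satisfies_choice_axiom(average)

print("each individual satisfies the axiom:", individual_ok)
print("group averages:", average)
print("group averages satisfy the axiom:", average_ok)
```

Each individual passes the check, while the averaged probabilities do not, which is exactly the point of the example.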

With respect to the second part of Lemma 23: The essential fact contained in the
second part of Lemma 23 is that when the Choice Axiom holds for T and its
subsets, the ratio $P_T(a) / P_T(b)$ is independent of T for all subsets $T \subseteq A$.

In decision theory (see, for example, the book Luce and Raiffa (1957)6)
one axiomatic idea, which may be termed “independence from irrelevant alternatives,”
is recurrent. The idea was brought to the fore by Arrow (1951)7
in a particular choice context, but the same basic notion appears in other
contexts in which, of course, its axiomatic formulation differs somewhat.
Arrow termed his axiomatization of the idea “independence of irrelevant
alternatives.” But, as Professor Stevens has pointed out to me, this phrase
is unfortunately misleading, since it suggests that the irrelevant alternatives
are independent of one another. The actual gist of the idea is that alternatives
which should be irrelevant to the choice are in fact irrelevant, hence
the present term. For example, the idea states that if one is comparing
two alternatives according to some algebraic criterion, say preference, this
comparison should be unaffected by the addition of new alternatives or the
subtraction of old ones (different from the two under consideration). (Ed.
See the experiment underlying Figure 2.5.) Exactly what should be taken
to be the probabilistic analogue of this idea is not perfectly clear, but one
reasonable possibility is the requirement that the ratio of the probability
of choosing one alternative to the probability of choosing the other should
not depend upon the total set of alternatives available, i.e., the assertion of

6 Robert Duncan Luce and Howard Raiffa (1957), Games and Decisions: Introduction and Critical
Survey, Blackwell-Wiley.
Even nowadays, this nontechnical book is considered one of the classic references in game theory.
7 Kenneth Arrow (1951), Social Choice and Individual Values, Wiley & Sons.
Ed. The IIA also appears under point 7 of John Nash (1950), The Bargaining Problem, Econometrica 18.
John Nash does not attach any name to this axiom, nor does he make any reference to
discussions with Kenneth Arrow. Wikipedia attributes IIA to Kenneth Arrow. Arrow's 1951 book
won him the Nobel Prize many years later.

the second part of Lemma 23 which states for $A = \{a_1, \ldots, a_m\}$ that

\[ \frac{P_A(a_i)}{P_A(a_j)} = \frac{P(a_i, a_j)}{P(a_j, a_i)}. \]

In this sense, then, we can say that the Choice Axiom is a probabilistic
version of the independence-from-irrelevant-alternatives idea.
It should be noted that it is only the ratio of the two probabilities, not
the probabilities themselves, that is invariant with changes of the irrelevant
alternatives."

4.3 Existence of a Ratio Scale


"As the study of choice has developed, both in psychology and economics,
one of the central issues that a formal characterization must face
are conditions that ensure the existence of a relatively unique numerical
scale which in some sense represents the choice behavior of the subjects.
Mathematically, the problem is simply one of imposing sufficient axiomatic
structure to prove existence of a scale that is unique up to some group of
transformations – the group of positive linear transformations (zero and
unit unspecified) has usually been deemed to be just acceptable. These are
what Stevens (1951) terms interval scales. But the empirical side-condition
that these mathematical assumptions must form a more or less plausible
description of human and animal choice behavior has rendered the problem
difficult.
In economics, preferences among bundles of goods have been taken to
be the underlying primitive, and, as an idealization, it has
been assumed to be an algebraic ordering of the commodity bundles. In
such models, if any numerical order preserving scale exists, many do. In
fact, they are unique only up to monotone transformations, which renders
the numerical character of scales almost superfluous. That being so, some
economists arrived at the position that it is safer to work with orderings – as
they say, with ordinal utilities in contrast to cardinal ones – and for many
of the traditional theorems of economics this is sufficient. Nonetheless,
some work, particularly in modern decision theory, requires cardinal utility
scales. Some extension of the traditional formulation was needed, and a little
more than a decade ago it was effected by von Neumann and Morgenstern
(1944).8 Roughly, they continued to suppose that preferences are algebraic,
but the domain is extended from a set of “pure alternatives” to the set of all
possible gambles that can be generated from the alternatives and an infinite
set of chance events. Preferences over these gambles are assumed to meet
certain fairly restrictive axioms which, although normatively compelling,
seem at best to lack detailed descriptive realism.9 Under these conditions, a
scale is shown to exist which is unique up to positive linear transformations
(Ed. An interval scale.) and which has the important property that the
utility of a gamble is equal to the expected utility of its components."

Figure 4.2: The fundamental equivalence between the stochastic choice model and the
ratio scale for finite sets of alternatives.
[Diagram: Ordinary Probability Axiom + Choice Axiom $\Longleftrightarrow$ ratio scale
$v : A \to \mathbb{R}$ with $P_A(a) = v(a) / \sum_b v(b)$.]

The following result establishes the ratio scale underlying the Choice
Axiom.

Theorem 24 Let $P(a, b) \in (0, 1)$ for all $a, b \in A$ and let Axioms 13 and 14 hold for all
$S \subseteq A$. Then there exists a positive real-valued function $v : A \to \mathbb{R}$ such that

\[ P_S(a) = \frac{v(a)}{\sum_{b \in S} v(b)} \]

and v is unique up to multiplication by a positive constant.


8 John von Neumann and Oskar Morgenstern (1944), Theory of Games and Economic Behavior,
Princeton University Press.
Duncan Luce also mentions: "Actually, Ramsey (1931) suggested some of the same ideas a good
deal earlier, but the importance of his work was not recognized until recently." See also Wikipedia
on Frank Ramsey.
9 Ed. This seems to refer to the experiments conducted by Maurice Allais, Nobel laureate 1988,
that rejected expected utility theory. These experimental results were published in 1953, six years
before the publication of Duncan Luce's book.

The proof of this theorem is left as an exercise.
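
The easy direction of Theorem 24 can be illustrated numerically: any positive scale v generates choice probabilities that satisfy part (i) of the Choice Axiom and that are unchanged when v is multiplied by a positive constant. The sketch below is an illustration only, not the proof asked for in the exercise; the scale values, the constant k and the helper names are hypothetical.

```python
from itertools import combinations

# Hypothetical ratio scale v on A = {a, b, c, d}; any positive numbers would do.
v = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
A = frozenset(v)


def P(T, S):
    """P_T(S): probability that the choice from T lies in S, for S a subset of T."""
    return sum(v[x] for x in S) / sum(v[x] for x in T)


def subsets(X):
    """All non-empty subsets of X."""
    items = sorted(X)
    return [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]


# Part (i) of the Choice Axiom: P_A(S) = P_T(S) * P_A(T) for all S within T within A.
choice_axiom_holds = all(
    abs(P(A, S) - P(T, S) * P(A, T)) < 1e-12
    for T in subsets(A)
    for S in subsets(T)
)

# Rescaling v by a positive constant k leaves every choice probability unchanged.
k = 7.5
rescaled_same = all(
    abs(P(T, {x}) - (k * v[x]) / sum(k * v[y] for y in T)) < 1e-12
    for T in subsets(A)
    for x in T
)

print("Choice Axiom (i) holds:", choice_axiom_holds)
print("invariant to rescaling:", rescaled_same)
```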

Corollary 25 (Monotonicity of v) $P(a, b) \ge \frac{1}{2}$ if and only if $v(a) \ge v(b)$.

"In essence, what we have shown is this. If we confine ourselves to a local
region A in which all the pairwise discriminations are imperfect, and if the
several probability measures are related to one another so that $P_T$, $T \subseteq A$,
acts like a conditional probability relative to $P_A$ (the Choice Axiom), then
the distribution $P_A(a)$ can be interpreted as a particular choice of a ratio
scale over A.
The practical will note that the v-scale obtained in Theorem 24 is not
really very useful as it stands for two reasons: (i) the probabilities $P_A(x)$
will be extremely difficult to estimate when A is large, and (ii) the scale is
defined only over a set having no pairwise perfect discriminations, which is
probably only a small portion of any dimension we wish to scale. The first
difficulty is much mitigated when we notice that v can be expressed as

\[ v(a) = k \, \frac{P(a, b)}{P(b, a)} = k \, \frac{P(a, b)}{1 - P(a, b)}, \]

where b is an arbitrary but fixed element of A and k is a positive constant.
This follows from the fact that

\[ P_T(a) = \frac{P(a, b)}{P(b, a)} \, P_T(b) \]

according to the second part of Lemma 23. Thus, if the pairwise probabilities
can be estimated sufficiently accurately so that the ratio $P(a, b) / P(b, a)$
is reliable, then v can be determined.
Actually, in practice it would be ill-advised to estimate the v-scale in this
manner because too little of the available data is used. Fortunately, much
more efficient – maximum likelihood – estimating schemes are described in
the literature."
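
The pairwise formula for v quoted above can be tried out in a small sketch. The pairwise probabilities here are generated from a known, hypothetical scale (`true_v`) so that the recovered scale can be compared with it; with real data they would be estimated frequencies, and, as the quote warns, this is not an efficient estimator.

```python
# Hypothetical "true" ratio scale, used only to generate pairwise probabilities.
true_v = {"a": 2.0, "b": 1.0, "c": 4.0}
reference = "b"  # the arbitrary but fixed element b in the formula


def pairwise(x, y):
    """P(x, y) under the ratio scale; this gives P(x, x) = 1/2 automatically."""
    return true_v[x] / (true_v[x] + true_v[y])


def recover_scale(alternatives, ref, k=1.0):
    """Apply v(x) = k * P(x, b) / (1 - P(x, b)) for the fixed reference element b."""
    scale = {}
    for x in alternatives:
        p = pairwise(x, ref)
        scale[x] = k * p / (1.0 - p)
    return scale


v_hat = recover_scale(true_v, reference)
for x in sorted(v_hat):
    print(x, round(v_hat[x], 6))
```

With k = 1 the reference element receives the value 1, so the recovered scale coincides with `true_v` here only because `true_v` was normalized the same way; in general the two agree up to a positive constant.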

(Ed.) Nowadays, Logit and Probit models are applied for econometric estimation of
discrete choice models, see next section. Here ends the extensive quotation from Duncan Luce's
book.

4.4 Inference
Nowadays, the econometric techniques for inference of stochastic choice among a few
alternatives, called discrete choice, are the Logit and Probit models. Daniel McFadden,
Nobel laureate 2000, won his part of the prize "for his development of theory and
methods for analyzing discrete choice". McFadden built on the foundations laid down
by Duncan Luce as discussed above. The general case is involved and better suited for
a course in econometrics. Also, developing the model further and bringing it to data
sets of discrete choice situations is better left for such a course.

A random utility model

Logit choice can be seen as arising from a deterministic model of preferences in which
the economic agent does not correctly perceive the utility of each alternative. For the
economic agent, the utility of each alternative has a random "utility" component. These
random components may lead to the perception that alternative a has a larger
utility than alternative b, while in the absence of these random utility components
it may well be the case that the deterministic utility part of a is smaller than the
deterministic utility part of b, i.e., b is preferred over a.
For expository reasons, consider a binary set of alternatives, $A = \{a_1, a_2\}$. The
structural interpretation of the logit model is based on the assumption that each of
the alternatives has a utility that is perturbed by a random shock multiplied by a scaling
parameter. Alternative $a_i$, $i = 1, 2$, has a perceived utility of $u_i + \mu \varepsilon_i$, where $\varepsilon_i$ is a random
shock representing the perturbation and $\mu > 0$ is some fixed scaling parameter of the
shock. The larger $\mu$, the larger the perturbation in perception of the "true" utility $u_i$
will be. On the one hand, in the limit $\mu \to 0$ the perturbation vanishes and has no
effect at all. On the other hand, in the limit $\mu \to \infty$ the utilities become irrelevant and
the perturbations fully determine perceived utility, and hence, choice. Since utility has
no natural unit, we may rescale by dividing through by $\mu > 0$ and define $\lambda$ as $\frac{1}{\mu}$. Then,
we obtain that alternative $a_i$, $i = 1, 2$, has a perceived utility of $\lambda u_i + \varepsilon_i$. Note that in
the limit $\lambda \to \infty$ (was $\mu \to 0$) the perturbation has no effect at all, and in the limit $\lambda \to 0$
(was $\mu \to \infty$) perceived utility is purely random.
Alternative $a_1$ is chosen if $\lambda u_1 + \varepsilon_1 > \lambda u_2 + \varepsilon_2$, or $\lambda (u_1 - u_2) > \varepsilon_2 - \varepsilon_1$. In case
perturbations are independently drawn from identical probability distributions,
only the random difference $\varepsilon_2 - \varepsilon_1$ matters. Define
$\varepsilon \equiv \varepsilon_2 - \varepsilon_1$ as the difference between the shocks and let $F(\varepsilon)$ be the cumulative

distribution of the difference $\varepsilon$. The identical probability distributions for individual
shocks $\varepsilon_1$ and $\varepsilon_2$ imply that $\varepsilon$ also has a symmetric distribution around 0, i.e., $F(0) = \frac{1}{2}$
and $F(\varepsilon) = 1 - F(-\varepsilon)$. Furthermore, we assume a continuous distribution F on
$(-\infty, \infty)$.

Quiz 26 Suppose $u_1 > u_2$. First, determine the interval of $\varepsilon$ for which alternative $a_1$
is chosen, and next, find an expression for the probability that alternative $a_1$ is chosen.
Is this probability larger or smaller than $\frac{1}{2}$? Hint: for the "next" part, sketch a figure
of F on $(-\infty, \infty)$.

The probability that alternative $a_1$ will be chosen is equal to $F(\lambda [u_1 - u_2]) \in (0, 1)$.
Consequently, the probability that alternative $a_2$ will be chosen is equal to
$1 - F(\lambda [u_1 - u_2]) \in (0, 1)$. Hence, in the notation of stochastic choice:

\[ P(a_1, a_2) = F(\lambda [u_1 - u_2]) \quad \text{and} \quad P(a_2, a_1) = 1 - F(\lambda [u_1 - u_2]). \]
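
The random utility story can be simulated directly. The sketch below makes one assumption that is not stated in the text: the individual shocks are drawn independently from a standard Gumbel (type I extreme value) distribution, a standard choice because the difference of two such shocks follows the standard logistic distribution. The utilities, the value of the scaling parameter (`lam`) and the number of draws are arbitrary illustrative values.

```python
import math
import random


def gumbel(rng):
    """Draw one standard Gumbel shock by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0); random() can return exactly 0.0
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice_probability(u1, u2, lam, draws=100_000, seed=0):
    """Share of draws in which lam*u1 + shock1 exceeds lam*u2 + shock2."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        if lam * u1 + gumbel(rng) > lam * u2 + gumbel(rng):
            wins += 1
    return wins / draws


def logistic_cdf(x):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


u1, u2, lam = 1.0, 0.5, 1.5  # illustrative values only
simulated = simulate_choice_probability(u1, u2, lam)
predicted = logistic_cdf(lam * (u1 - u2))
print(f"simulated P(a1, a2) = {simulated:.3f}, F(lam * (u1 - u2)) = {predicted:.3f}")
```

Up to sampling noise the simulated frequency should be close to the value of F at the scaled utility difference.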

The logit choice function for alternative $a_1$ is obtained if we take the standard
logistic cumulative distribution function $F(\varepsilon) = 1 / (1 + e^{-\varepsilon})$. Substitution yields

\[ P(a_1, a_2) = \frac{1}{1 + e^{\lambda (u_2 - u_1)}} = \frac{e^{\lambda u_1}}{e^{\lambda u_1} + e^{\lambda u_2}}. \]
The following discussion is adapted from Goeree, Holt and Palfrey (2015).10

"The intuition behind the effects of changes in the precision parameter $\lambda$
can be seen from the latter equality. As the amount of noise is increased,
so that $\lambda \to 0$, the probabilities converge to $\frac{1}{2}$, regardless of how large the
utility differences are. In the other extreme, as $\lambda \to \infty$, the term $\lambda (u_2 - u_1)$
goes to infinity if alternative $a_2$ has the highest utility, so $P(a_1, a_2)$ goes to
0. Conversely, $P(a_1, a_2)$ goes to 1 if alternative $a_2$ has the lowest utility.
Both effects are in accordance with the limit case of deterministic choice,
as analyzed in Chapter 2. Thus the two extreme limits of the parameter $\lambda$
generate the two extremes of perfectly noisy and perfectly rational behavior.
For intermediate cases, it is clear from the final term that an increase in the
precision $\lambda$ will make the logit choice function more responsive, i.e. with
sharper 'corners'."
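
The limiting behavior described in the quote can be tabulated with a minimal sketch of the binary logit choice function. The utilities and the grid of precision values are illustrative assumptions, and `logit_binary` is a hypothetical helper name.

```python
import math


def logit_binary(u1, u2, lam):
    """Binary logit choice probability P(a1, a2) = 1 / (1 + exp(lam * (u2 - u1)))."""
    return 1.0 / (1.0 + math.exp(lam * (u2 - u1)))


u1, u2 = 3.0, 1.0  # illustrative utilities with u1 > u2
for lam in (0.0, 0.1, 1.0, 10.0):
    print(f"lambda = {lam:5.1f}   P(a1, a2) = {logit_binary(u1, u2, lam):.4f}")
```

For a precision of zero the probability is one half, and as the precision grows the probability of the higher-utility alternative approaches one. Very large values of the exponent would overflow `math.exp`, so the grid is kept moderate.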
10 Jacob Goeree, Charles Holt and Thomas Palfrey (2015), Chapter 3. QRE Explanations of Intuitive
Behavioral Anomalies, retrieved autumn 2015 from http://people.virginia.edu/~cah2k/qre_ch3.pdf.

Quiz 27 Suppose $u_1 = 1$, $u_2 = 2$, and $\lambda = \ln 2$. Calculate $P(a_1, a_2)$.

The interpretation of the scaling parameter $\lambda$ is unclear. In the above quote, it is
called the "precision" parameter: a larger $\lambda$ leads to more precise best choices, best in
the sense of the best alternatives of the model of preferences (Chapter 2). In game
settings, this parameter is called a skill parameter and more skilled translates into a
larger $\lambda$.

The general case is similar, but as mentioned more involved and better suited for a
course in econometrics. For an arbitrary finite set of feasible alternatives $A = \{a_1, \ldots, a_m\}$,
the logit choice function is given by

\[ P_A(a_i) = \frac{e^{\lambda u_i}}{\sum_{j=1}^{m} e^{\lambda u_j}} = \frac{1}{1 + \sum_{j=1, j \neq i}^{m} e^{\lambda (u_j - u_i)}}. \]

Note that the stochastic version of independence of irrelevant alternatives holds:

\[ \frac{P_A(a_i)}{P_A(a_j)} = \frac{e^{\lambda u_i}}{e^{\lambda u_j}} = \frac{P(a_i, a_j)}{P(a_j, a_i)}. \]

This ratio is independent of other alternatives in the set A. Furthermore, we might
say that the ratio scale v is defined by $v(a_i) = e^{\lambda u_i}$.
In exercises available on the internet, $v(a_i) = u_i$ is popular because it avoids having
to take the exponential and it results in $P_A(a_i)$ being equal to fractions.
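
As a rough numerical illustration of the general logit choice function and of the ratio property, consider the following sketch. The utilities, the precision value and the labels are hypothetical, and `logit_choice` is an illustrative helper rather than an estimation routine.

```python
import math


def logit_choice(utilities, lam):
    """Return P_A(a_i) = exp(lam * u_i) / sum_j exp(lam * u_j) for every alternative."""
    weights = {a: math.exp(lam * u) for a, u in utilities.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


lam = 0.8  # illustrative precision
full = logit_choice({"a1": 1.0, "a2": 2.0, "a3": 0.5}, lam)  # choice from A
pair = logit_choice({"a1": 1.0, "a2": 2.0}, lam)             # choice from {a1, a2}

ratio_full = full["a1"] / full["a2"]                      # P_A(a1) / P_A(a2)
ratio_pair = pair["a1"] / pair["a2"]                      # P(a1, a2) / P(a2, a1)
ratio_scale = math.exp(lam * 1.0) / math.exp(lam * 2.0)   # v(a1) / v(a2)

print(full)
print(ratio_full, ratio_pair, ratio_scale)
```

The three ratios coincide up to floating-point rounding, although the individual probabilities of a1 and a2 are smaller when a3 is also available.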

4.5 Behavioral Economics (Optional)


We already discussed the attraction effect; it violates not only IIA but also the Choice
Axiom.

Pleskac (2012): "An experiment that questions the validity of the Choice
Axiom tests what is known as the attraction effect. More broadly, the similarity
effect demonstrates that context matters: the response strength of
an alternative depends on the options that are in the choice set. Over the
years, other context effects have been identified and studied that further
attest to the importance of context. A second context effect that questions
the validity of the Choice Axiom (as well as elimination by aspects) is
known as the attraction effect. The attraction effect can be illustrated with
an example from Simonson and Tversky (1992).11 They gave one group of
subjects a choice between $6 and a nice Cross pen. The pen was chosen
by 36% of the subjects while the remaining 64% chose $6. A second group
was given a choice between three options: $6, a nice Cross pen, or another
less attractive pen. As one might expect, only 2% chose the less attractive
pen. Yet, the mere addition of this asymmetrically dominated alternative
boosted the proportion of subjects who chose the Cross pen to 46%. This
behavior - no doubt well known among marketers and salespeople - does
not seem quite right from the view point of the Choice Axiom and, for that
matter, most probabilistic theories of choice: adding an alternative to a
choice set of even minuscule value should reduce choice probabilities for
all the alternatives, not increase them. This condition is known as regularity.
Stated formally for $x \in S \subseteq T$, $P_S(x) \ge P_T(x)$. The attraction effect
demonstrates that in particular situations individuals violate regularity."
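
The regularity condition can be checked against the quoted aggregate shares with a few lines of code. The labels are illustrative, and the share of the second group that chose the cash is not reported in the quote; it is assumed here to be the remainder.

```python
# Aggregate choice shares reported in the quoted example.
binary = {"cash": 0.64, "cross_pen": 0.36}
ternary = {"cross_pen": 0.46, "other_pen": 0.02}
# Assumption: the remaining share of the second group chose the cash.
ternary["cash"] = 1.0 - ternary["cross_pen"] - ternary["other_pen"]


def regularity_violations(small, large, tol=1e-12):
    """Alternatives x in the smaller set S with P_S(x) < P_T(x), T the larger set."""
    return [x for x in small if small[x] < large[x] - tol]


violations = regularity_violations(binary, ternary)
print("alternatives violating regularity:", violations)
```

Only the Cross pen is flagged: its share rises when the less attractive pen is added, whereas regularity requires that no share increases.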

4.6 References
This chapter is based on a slightly modi…ed and extensive quote from:

Chapter 1 of Duncan Luce (1959), Individual Choice Behavior: A Theoretical
Analysis, Blackwell-Wiley.

Jacob Goeree, Charles Holt and Thomas Palfrey (2015), Chapter 3. QRE Explanations
of Intuitive Behavioral Anomalies. Retrieved autumn 2015 from
http://people.virginia.edu/~cah2k/qre_ch3.pdf.

Timothy J. Pleskac (2012), Decision and Choice: Luce's Choice Axiom, Department
of Psychology, Michigan State University. Retrieved from
https://msu.edu/~pleskact/research/papers/pleskac_inpress_Luce.pdf

4.7 Exercises
Exercises

11 Itamar Simonson and Amos Tversky (1992), Choice in Context: Tradeoff Contrast and Extremeness
Aversion, Journal of Marketing Research, 29, 281-295.

Exercise 1
Suppose $A = \{a, b, c\}$, $P_A(a) = \frac{1}{2}$ and $P(b, c) = \frac{2}{3}$.

a. Calculate PA (b) and PA (c).

b. Calculate P (a; b) and P (a; c).

Exercise 2
Suppose $A = \{a, b, c\}$, $P(a, b) = \frac{3}{7}$, $P(a, c) = \frac{3}{8}$ and $P(b, c) = \frac{3}{10}$.
Do these probabilities satisfy Lemma 23?

Exercise 3
Suppose $A = \{a, b, c\}$ and $P_A(a) : P_A(b) : P_A(c) = 2 : 3 : 5$.

a. Calculate $P_A(a)$, $P_A(b)$ and $P_A(c)$.

b. Calculate $P(a, b)$, $P(a, c)$ and $P(b, c)$.

Exercise 4
Suppose $u_1 = 4$, $u_2 = 2$, and $\lambda = \ln 3$. Calculate the stochastic logit choice function
$P(a_1, a_2)$.

Exercise 5
Consider $A = \{a_1, a_2, a_3\}$, utilities associated with A written as $u_1$, $u_2$, $u_3$, and the
logit choice function.

a. Derive $\lim_{\lambda \to \infty} P(a_2)$ if the utilities can be ranked as $u_1 > u_2 > u_3$.

b. Derive $\lim_{\lambda \to \infty} P(a_2)$ if the utilities can be ranked as $u_1 = u_2 > u_3$.

c. Derive $\lim_{\lambda \to \infty} P(a_2)$ if the utilities can be ranked as $u_1 > u_2 = u_3$.

Concepts, Theory and Proofs

Exercise 6
Prove Lemma 22.

Exercise 7
Prove Lemma 23

Exercise 8
Prove Theorem 24

Exercise 9
Consider $A = \{a_1, \ldots, a_m\}$, the utility function $u : A \to \mathbb{R}$, and the logit choice
function. Let $a^* \in A$ be the unique utility maximizing element of $u : A \to \mathbb{R}$. Thus,
$u(a^*) > u(b)$ for all $b \in A \setminus \{a^*\}$.

a. Derive $\lim_{\lambda \to \infty} P_A(a^*)$.

b. Derive $\lim_{\lambda \to \infty} P_A(b)$, $b \in A \setminus \{a^*\}$.

Exercise 10
Consider $A = \{a_1, \ldots, a_m\}$, utility vector $u = (u_1, \ldots, u_m)$ (which represents the utility
function $u : A \to \mathbb{R}$), and the logit choice function.

a. Is the logit choice function $P_A(a)$, $a \in A$, invariant to $u' = (u_1 + w, \ldots, u_m + w)$,
$w \in \mathbb{R}$? Hint: Invariant means that it does not matter whether we take $u$ or $u'$
in $P_A(a)$.

b. Is the logit choice function $P_A(a)$, $a \in A$, invariant to $u' = (k u_1, \ldots, k u_m)$, $k \in \mathbb{R}$?

Exercises inference and revealed preferences

Exercise 11
Suppose you ran a laboratory experiment in which subjects choose several times from
the binary set $\{a, b\}$. One of the subjects chooses a out of $\{a, b\}$ a fraction $p \in (0, 1)$
of the time, say 30% if you prefer numbers. You employ the Logit model. Can this
percentage be used to infer (or retrieve) parameter $\lambda$ and the utility numbers $u_a$ and
$u_b$? What can you infer and what seems to be the problem?

Exercise 12
Consider the aggregate percentages summarizing the choice data from Simonson and
Tversky (1992). These were discussed above for the choice between the Cross pen and
other pens. Verify that these aggregate percentages violate the Choice Axiom.
Chapter 5

The Consumer

In this chapter we apply the model of individual decision making that is discussed in
the previous chapters to one of the most important types of agents in economics: the
consumer. (Another type of economic agent is the producer, which we discuss in the
next chapter.)
This chapter uses the notation from, and to a large extent follows, the textbook
“Microeconomic Theory, 10th edition” by Nicholson and Snyder, in particular Chapters
4 and (part of) Chapter 5.1

5.1 The Consumer


The consumer is an agent who is characterized by his/her preferences over bundles of
goods or commodities (the alternatives) and income (a nonnegative real number).
If there are no (budget) restrictions, then a consumer can consume any nonnegative
amount of each good. So, if the (preference relation of the) consumer is nonsatiated,
there might be no solution to the consumer's utility maximization problem. However,
we assume the consumer to be restricted by his/her budget set. The consumer cannot
just get any bundle of goods, but has a given fixed budget (income), and has to buy
the goods at certain prices. Each unit of each good has a price. Then, for a
particular bundle of goods the consumer has to pay the sum over all goods of the price
of that good multiplied by the number of units that he/she buys of that good. The
prices are exogenously given for the consumer. So, irrespective of how many units the
consumer buys, each unit costs a fixed amount. The total amount that the consumer
has to pay for a bundle of goods cannot be more than his/her budget (or income),
which is a given nonnegative number.

1 Sections 6.1-6.4 of these lecture notes are based on Chapter 4, Section 6.6 is based on Chapter 5
pages 158-165.

We assume that there are $n$ goods (commodities). Since, without constraints, the
consumer can buy any nonnegative amount, the commodity space is just the set of all
nonnegative $n$-dimensional real vectors.

Definition 12 The commodity space is $\mathbb{R}^n_+ = \{x \in \mathbb{R}^n \mid x \geq 0\}$.

We assume the consumer to be in a market economy, meaning that goods are
exchanged at fixed exchange rates which can be represented by prices.

To build a decision model for the consumer, we address the following three ques-
tions:

1. What does the consumer want?

2. What can the consumer do?

3. What will the consumer do (How does the consumer choose)?

How to go on with these questions? (We briefly sketch the idea, but will be more
precise in the following sections of this chapter.)

Ad 1. What the consumer wants is given by his/her preference relation $\succsim$ or, if
possible, his/her utility function $U$.

Ad 2. What the consumer can do is determined by his/her income and the prices of
the goods. This determines his/her budget set.

Ad 3. The consumer chooses a best element according to his/her preference relation
in the budget set.

In this chapter, we assume that the preferences of the consumer are represented by
a complete, transitive, continuous and strictly convex preference relation. From the
previous chapters/lectures, we know that such a preference relation can be represented
by a continuous utility function. Therefore, in this chapter we represent the preferences
of a consumer directly by such a utility function. (Note: Be aware that we cannot do
this if the preference relation does not satisfy these properties.)

Assumption 1 The preference relation of the consumer is complete, transitive,
continuous and strictly convex. Therefore, we represent the preferences of a consumer by
a continuous utility function $U$.

Question: Do you recall which properties of the utility function correspond
to the properties of the preference relation mentioned in this assumption?[2]

Choosing a best element according to the preference relation in the budget set now
boils down to maximizing the utility function over the budget set. First, we need to know
more about the budget set.

5.2 The Budget set


The budget set was already mentioned in Chapter 4. The budget set of a consumer
determines what bundles of goods a consumer can buy. It is determined by the given
income of the consumer, and the prices of the goods. (For easy reading, the notation
is as close as possible to that in “Microeconomic Theory, 10th edition” by Nicholson
and Snyder.)
Let $I \geq 0$ be the nonnegative fixed income of the consumer. Further, let the (unit)
prices of the goods be given by a market price vector $p \in \mathbb{R}^n_{++} = \{x \in \mathbb{R}^n \mid x \gg 0\}$
with $p_k > 0$ being the positive price of good $k \in \{1, \ldots, n\}$. Notice that we assume
that all prices are positive.[3] The budget set of a consumer simply is the set of all
bundles of goods that he/she can afford with his/her income.

Definition 13 Given income $I \geq 0$ and price vector $p \in \mathbb{R}^n_{++}$, the budget set of the
consumer is the set of bundles of goods given by

$$B(p, I) = \{x \in \mathbb{R}^n_+ \mid p \cdot x \leq I\}.$$

Here, $p_k x_k$ is the amount of money the consumer spends on good $k$, and

$$p \cdot x = \sum_{k=1}^{n} p_k x_k = p_1 x_1 + p_2 x_2 + \ldots + p_n x_n$$

is the amount of money the consumer spends on the bundle of goods $x \in \mathbb{R}^n_+$. For a
graphical illustration of a budget set with two goods, see Figure 4.1 in Nicholson and
Snyder.
To be in the budget set, this amount must be at most equal to the income $I$ of the
consumer.
The inequality

$$p \cdot x \leq I$$

is called the budget restriction of the consumer.

[2] Answer: By definition of a utility function, the underlying preference relation is complete and
transitive. Continuity of a preference relation corresponds to continuity of the utility function. Strict
convexity of a preference relation corresponds to strict quasi-concavity of the utility function.

[3] We will not go into details, but assuming positive prices reflects scarcity of the goods. Scarcity,
i.e. there is more ‘demand’ for a good than there is available, is a typical property of an economic
good.


The budget set satisfies the following properties:

Proposition 28 Let $p \in \mathbb{R}^n_{++}$ and $I \geq 0$. Then

(i) $B(p, I)$ is nonempty;

(ii) $B(p, I)$ is convex;

(iii) $B(p, I)$ is compact;

(iv) $B(p, I)$ is homogeneous of degree 0, i.e. $B(\alpha p, \alpha I) = B(p, I)$ for all $\alpha > 0$.

The first three statements ensure that we can maximize a continuous utility function over the
budget set, i.e. a maximum exists. The fourth statement says that the budget set does
not change if all prices and the income are multiplied by the same positive constant.
You can give a formal proof, but intuitively you might see that, if all prices are
multiplied by the same positive constant $\alpha$, then the ‘price’ of every bundle of goods is
multiplied by the same constant $\alpha$. Then, obviously you can afford exactly the same
bundles of goods if your income is also multiplied by $\alpha$.
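
The homogeneity property in Proposition 28 (iv) can be checked numerically on a few bundles. The following is a minimal Python sketch, not part of the original notes: the helper names `cost` and `in_budget_set`, the prices, the income, the scaling constant and the test bundles are all illustrative assumptions.

```python
def cost(p, x):
    """Expenditure p . x on bundle x at prices p."""
    return sum(pk * xk for pk, xk in zip(p, x))


def in_budget_set(x, p, income):
    """True if x is in B(p, I): x is nonnegative and p . x <= I."""
    return all(xk >= 0 for xk in x) and cost(p, x) <= income


# Hypothetical prices, income and scaling constant (chosen for illustration).
p, income = (2.0, 4.0), 20.0
alpha = 3.0
scaled_p = tuple(alpha * pk for pk in p)

# A few bundles: some affordable, one too expensive, one not nonnegative.
bundles = [(0.0, 0.0), (10.0, 0.0), (4.0, 3.0), (6.0, 3.0), (-1.0, 1.0)]

results = []
for x in bundles:
    before = in_budget_set(x, p, income)
    after = in_budget_set(x, scaled_p, alpha * income)
    results.append((x, before, after))
    print(x, "in B(p, I):", before, "| in B(alpha p, alpha I):", after)
```

Membership is the same before and after scaling for every bundle tried. This only illustrates the property on a handful of points; it does not replace the formal proof.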

Question: Are these properties intuitive?

5.3 Demand of the consumer


Given his/her preferences (utility function) and income, for every price vector we want
to find what bundles of goods are demanded by the consumer. Since, under the
assumptions made in the previous sections, choosing a best element of the preference
relation in the budget set boils down to maximizing the utility function over the
budget set, we define the demand at a specific price vector as the bundle of goods that
solves this maximization problem.

[Figure 5.1: Individual demand of a consumer. Diagram labels: prices $p$ (input); black box with utility function $U$ and income $I$; individual demand $d(p, I)$ (output).]

Definition 14 Consider a consumer with utility function $U : \mathbb{R}^n_+ \to \mathbb{R}$. The demand
at price vector $p \in \mathbb{R}^n_{++}$ of this consumer is the set of best elements according to $U$
in the budget set $B(p, I)$, i.e.

$$\mathrm{demand}(p, I) = \{x \in B(p, I) \mid U(x) \geq U(y) \text{ for all } y \in B(p, I)\}.$$

From Week 2, we know that for a continuous, strictly quasi-concave utility function
there is a unique best element in the convex budget set. Therefore, for every price
vector the demand is a unique bundle, and we can speak about a demand function of
prices and income.

Definition 15 Consider a consumer with a continuous and strictly quasi-concave
utility function $U$. The demand function of this consumer is the function
$d : \mathbb{R}^n_{++} \times \mathbb{R}_+ \to \mathbb{R}^n_+$ given by

$$d(p, I) = x$$

with $x$ the unique bundle of goods in $\mathrm{demand}(p, I)$.

Note that in this case we can find the demand by maximizing the continuous and
strictly quasi-concave utility function over the budget set.

Question: Can you give a utility function and price vector such that demand is not
unique, i.e. $\mathrm{demand}(p, I)$ has more than one bundle of goods? (Note that this utility
function need not be strictly quasi-concave.)[4]

[4] Answer: see Exercise 3.

5.3.1 Finding the demand of the consumer


Given price vector $p \in \mathbb{R}^n_{++}$ and income $I \geq 0$, we find the demand $x$ in Definition 15
by solving the following maximization problem:

$$\max_{x \in \mathbb{R}^n_+} U(x) \quad \text{subject to} \quad p \cdot x \leq I. \tag{5.1}$$

This can be solved using the Kuhn-Tucker method. However, if we additionally
require the preferences to be monotone (and thus the utility function to be increasing),
we can use the Lagrange method.

Definition 16 A preference relation $\succsim$ on $\mathbb{R}^n_+$ is monotone if for every $x, y \in \mathbb{R}^n_+$
with $x > y$ it holds that $x \succ y$.
A utility function $U : \mathbb{R}^n_+ \to \mathbb{R}$ is increasing if for every $x, y \in \mathbb{R}^n_+$ with $x > y$ it
holds that $U(x) > U(y)$.

Proposition 29 Let $\succsim$ be a preference relation that is represented by utility function
$U$. Then $\succsim$ is monotone if and only if $U$ is increasing.

Definition 17 We call a preference relation regular if it is complete, transitive,
continuous, strictly convex and monotone. We call a utility function regular if it is
continuous, strictly quasi-concave and increasing.

Assumption 2 The preference relation (and thus the utility function) of the consumer
is regular.

Since the utility function $U : \mathbb{R}^n_+ \to \mathbb{R}$ is regular (and therefore increasing) and
$I \geq 0$, we know that the budget restriction will be satisfied with equality.

Question: Why is this the case?

Consequently, we can write the maximization problem (5.1) for the consumer as

$$\max_{x \in \mathbb{R}^n_+} U(x) \quad \text{subject to} \quad p \cdot x = I. \tag{5.2}$$

Because the constraint is an equality, we can solve this using the method
of Lagrange.

First, we write the Lagrangian, being the function $L$ given by

$$L(x, \lambda) = U(x) - \lambda (p \cdot x - I)$$

where $\lambda$ is the Lagrange multiplier.

The first order conditions for this maximization problem are

$$\frac{\partial L(x, \lambda)}{\partial x_k} = 0 \quad \text{for all } k = 1, \ldots, n \tag{5.3}$$

and

$$\frac{\partial L(x, \lambda)}{\partial \lambda} = 0. \tag{5.4}$$

This yields

$$\frac{\partial L(x, \lambda)}{\partial x_k} = \frac{\partial U(x)}{\partial x_k} - \lambda p_k = 0 \quad \text{for all } k = 1, \ldots, n \tag{5.5}$$

and

$$\frac{\partial L(x, \lambda)}{\partial \lambda} = -(p \cdot x - I) = 0. \tag{5.6}$$

These are $n+1$ equations in $n+1$ unknowns $(x_1, \ldots, x_n, \lambda)$, where (5.6) is the budget
restriction.
The partial derivative of the utility function with respect to the amount of good $k$ is called the
marginal utility of good $k$ at bundle $x$.

Definition 18 The marginal utility of good $k$ is the function $MU_k : \mathbb{R}^n_+ \to \mathbb{R}$ given
by

$$MU_k(x) = \frac{\partial U(x)}{\partial x_k} \quad \text{for all } x \in \mathbb{R}^n_+.$$

For a particular $x \in \mathbb{R}^n_+$, $MU_k(x)$ is called the marginal utility of good $k$ at bundle $x$.

The marginal utility of good $k$ at bundle $x$ gives the change in utility after an
infinitesimal change in the amount of good $k$ when the consumer consumes the bundle
$x$. Notice that this marginal utility depends on the bundle of goods. Therefore, the
marginal utility of good $k$ is a function with the commodity space as its domain.
Obviously, for monotone preferences (and thus an increasing utility function), marginal
utility is positive for all bundles.

Proposition 30 If the utility function $U$ is increasing, then $MU_k(x) = \frac{\partial U(x)}{\partial x_k} > 0$ for
all $k = 1, \ldots, n$ and $x \in \mathbb{R}^n_+$.

Going back to the first order conditions of the Lagrange maximization problem, the
$n$ equations (5.5) now can be written as

$$\frac{MU_k(x)}{p_k} = \lambda \quad \text{for all } k = 1, \ldots, n. \tag{5.7}$$

By Proposition 30, we have that the Lagrange multiplier is positive.

Corollary 31 By $U$ being increasing, we have $\lambda > 0$.

Now, from (5.7) it follows that

$$\frac{MU_k(x)}{p_k} = \frac{MU_l(x)}{p_l} \quad \text{for all } k, l = 1, \ldots, n,$$

which is equivalent to

$$\frac{MU_k(x)}{MU_l(x)} = \frac{p_k}{p_l} \quad \text{for all } k, l = 1, \ldots, n.$$

Since (5.6) is the budget constraint, we can write necessary and sufficient conditions
for a utility maximizing bundle of goods in the interior of the commodity space without
referring to $\lambda$.

Theorem 32 Consider a consumer with a regular utility function $U$. For price vector
$p \in \mathbb{R}^n_{++}$ and income $I \geq 0$, we find the demand $x = d(p, I) \in \mathbb{R}^n_{++}$ by solving the
following system of $n$ equations:

$$\frac{MU_k(x)}{MU_1(x)} = \frac{p_k}{p_1} \quad \text{for all } k = 2, \ldots, n$$

and

$$p \cdot x = I.$$

Question: Can you interpret those conditions intuitively?

For a graphical illustration of the utility maximization problem, see Figure 4.2 in
Nicholson and Snyder.
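
To see the conditions of Theorem 32 at work, the sketch below uses a hypothetical two-good utility function $U(x) = \sqrt{x_1 x_2}$ (an assumption chosen for illustration, not an example from the notes). For this function the ratio condition reads $x_1 / x_2 = p_2 / p_1$, and together with the budget equation this gives $x_k = I / (2 p_k)$. The code compares this closed form with a brute-force search along the budget line; the function names and numbers are illustrative.

```python
import math


def utility(x1, x2):
    """Hypothetical utility function U(x) = sqrt(x1 * x2)."""
    return math.sqrt(x1 * x2)


def demand_closed_form(p1, p2, income):
    """Solution of MU_2/MU_1 = p_2/p_1 and p . x = I for this utility."""
    return (income / (2 * p1), income / (2 * p2))


def demand_grid_search(p1, p2, income, steps=10_000):
    """Brute force: try bundles on the budget line p . x = I, keep the best."""
    best = None
    for j in range(steps + 1):
        x1 = (income / p1) * j / steps
        x2 = max((income - p1 * x1) / p2, 0.0)  # guard against rounding below 0
        u = utility(x1, x2)
        if best is None or u > best[0]:
            best = (u, x1, x2)
    return best[1], best[2]


p1, p2, income = 2.0, 5.0, 100.0
x_closed = demand_closed_form(p1, p2, income)
x_grid = demand_grid_search(p1, p2, income)

# For this utility MU_2 / MU_1 = x1 / x2, to be compared with p2 / p1.
mrs_21 = x_closed[0] / x_closed[1]

print("closed form:", x_closed)
print("grid search:", x_grid)
print("MU_2/MU_1 =", mrs_21, " p_2/p_1 =", p2 / p1)
```

The grid search only confirms the first order conditions for this one example; for other utility functions the closed form has to be derived again from Theorem 32.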

Example 15 You can find illustrations of finding the demand function in Examples
4.1 and 4.2 of Nicholson and Snyder.

In Theorem 32, we took good 1 as the numeraire good. So, all price and marginal
utility ratios are taken in terms of a comparison with good 1. But note that, instead
of good 1, we could take any of the $n$ commodities as numeraire.

Question: Does this suggest that we can take money as a numeraire good?

The ratio of the marginal utilities of two goods $k$ and $l \neq k$ is called the marginal
rate of substitution $MRS_{kl}(x)$ and is given by

$$MRS_{kl}(x) = \frac{MU_k(x)}{MU_l(x)}, \quad k, l = 1, \ldots, n.$$

The marginal rate of substitution between two goods gives by how much the
consumption of one good should increase in order to keep the same utility level after a
(small) decrease in the consumption of another good. Using this, we can also say that necessary
and sufficient conditions for an optimum (demand) in the interior of the commodity
space are that all marginal rates of substitution are equal to the corresponding price
ratios, together with the budget constraint.

We also need to consider the second order conditions for maximization. However,
for the utility maximization problem discussed here, these are satisfied by strict
quasi-concavity of the utility function. Strict quasi-concavity of the utility function implies
a decreasing marginal rate of substitution $MRS_{kl}(x) = \frac{MU_k(x)}{MU_l(x)}$, $k, l = 1, \ldots, n$.
We stress that Theorem 32 only finds interior solutions $x \in \mathbb{R}^n_{++}$. There can be
corner solutions (see pages 120-121 of Nicholson and Snyder).
To distinguish it from an indirect way of defining the demand of a consumer (to
be discussed later in Section 5.4), the demand function $d(p, I)$ is called the Marshallian
demand function, after the British economist Alfred Marshall (1842-1924).

We end this section by mentioning some properties of the (Marshallian) demand
function.

Proposition 33 Consider a consumer with a regular utility function $U$. Then

• $d(p, I)$ is continuous in $p$ and $I$;

• $p \cdot d(p, I) = I$ (budget restriction);

• $d(p, I)$ is homogeneous of degree 0 in $p$ and $I$, i.e. $d(\alpha p, \alpha I) = d(p, I)$ for all
$\alpha > 0$.

Question: Verify these conditions intuitively.

Continuity is a technical condition, made for (mathematical) convenience. Homogeneity
of degree zero (the third property) states that if we multiply all prices and the
income of the consumer by the same positive constant, then the demand of this
consumer does not change. This follows directly since, as seen before, the budget set does
not change after multiplying all prices and income by the same positive constant (see
Proposition 28), and thus the consumer faces the same utility maximization problem.

5.3.2 The Lagrange multiplier and indirect utility


Next, we can give an interpretation to the Lagrange multiplier $\lambda$. By Corollary 31,
we already know that $\lambda$ is positive. To give an interpretation to $\lambda$, we look at the
utility of a consumer in a different (indirect) way.

Definition 19 Consider a consumer with a regular utility function $U$. The indirect
utility function of this consumer is the function $V : \mathbb{R}^n_{++} \times \mathbb{R}_+ \to \mathbb{R}$ given by

$$V(p, I) = U(d(p, I)).$$

The indirect utility V (p; I) = U (d(p; I)) of the consumer gives the maximal utility
the consumer can reach at prices p and income I. Notice that behind this is the utility
maximization problem of the consumer discussed before, leading to the Marshallian
demand d(p; I).

Proposition 34 Consider a consumer with a regular utility function $U$. Then

$$\frac{\partial V(p, I)}{\partial I} = \lambda,$$

where $\lambda$ is the Lagrange multiplier of the utility maximization problem at price vector
$p$ and income $I$.

(It is an exercise to prove this proposition, Exercise 7.) This proposition gives an
interpretation to the Lagrange multiplier: $\lambda$ is the change in attainable utility from
an infinitesimally small change in income. If there is a (small) change in income of the
consumer, then the consumer will change his/her demanded bundle of goods. Suppose there
is a small increase in income. Then, obviously, the consumer will change his/her demanded
bundle in such a way that he/she reaches a higher utility level. The increase in utility
resulting from this small increase in income is approximately given by the Lagrange
multiplier times the increase in income.
Note that in this way $\lambda$ can be seen as a function of $p$ and $I$.

So, the partial derivative of the indirect utility function with respect to the income of the
consumer is the Lagrange multiplier. Can we also do something with the partial derivative
of the indirect utility function with respect to the price of a good? It turns out that dividing this
by the partial derivative with respect to income gives the negative of the Marshallian demand. This
is known as Roy’s identity.

Proposition 35 (Roy’s identity) Consider a consumer with a regular utility function
$U$. Then

$$d_k(p, I) = -\frac{\partial V(p, I) / \partial p_k}{\partial V(p, I) / \partial I} \quad \text{for all } k = 1, \ldots, n.$$

(The proof of this proposition is also left as an exercise, Exercise 8).
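
Propositions 34 and 35 can be checked numerically for a concrete case. The sketch below again assumes the hypothetical utility function $U(x) = \sqrt{x_1 x_2}$, for which the Marshallian demand is $d_k(p, I) = I / (2 p_k)$; this example, the helper `central_diff` and all numbers are illustrative assumptions, and the derivatives are approximated by central finite differences rather than computed exactly.

```python
import math


def demand(p1, p2, income):
    """Marshallian demand for the hypothetical utility U(x) = sqrt(x1 * x2)."""
    return (income / (2 * p1), income / (2 * p2))


def indirect_utility(p1, p2, income):
    """V(p, I) = U(d(p, I))."""
    x1, x2 = demand(p1, p2, income)
    return math.sqrt(x1 * x2)


def central_diff(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


p1, p2, income = 2.0, 5.0, 100.0
x1, x2 = demand(p1, p2, income)

# Lagrange multiplier from (5.7): lambda = MU_1(x) / p_1 at the optimum,
# where MU_1(x) = 0.5 * sqrt(x2 / x1) for this utility function.
lam = 0.5 * math.sqrt(x2 / x1) / p1

dV_dI = central_diff(lambda t: indirect_utility(p1, p2, t), income)
dV_dp1 = central_diff(lambda t: indirect_utility(t, p2, income), p1)
dV_dp2 = central_diff(lambda t: indirect_utility(p1, t, income), p2)

# Roy's identity: d_k(p, I) = -(dV/dp_k) / (dV/dI).
roy_d1 = -dV_dp1 / dV_dI
roy_d2 = -dV_dp2 / dV_dI

print("lambda =", lam, " dV/dI =", dV_dI)
print("Roy:", (roy_d1, roy_d2), " demand:", (x1, x2))
```

Up to finite-difference error, the derivative of $V$ with respect to income coincides with the multiplier, and Roy's identity returns the demand computed directly.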

5.4 Expenditure function and compensated demand function

In the previous section, we defined the Marshallian demand function of a consumer
by considering the primal optimization problem of maximizing the consumer’s utility
function over his/her budget set: given $p \in \mathbb{R}^n_{++}$ and $I \geq 0$,

$$\max_{x \in \mathbb{R}^n_+} U(x) \quad \text{subject to} \quad p \cdot x \leq I$$

(with an equality constraint if the utility function is increasing), and thus we looked for
the bundle of goods that maximizes the utility under the budget constraint. In this
section we consider the dual optimization problem:

Given $p \in \mathbb{R}^n_{++}$ and $u > 0$,

$$\min_{x \in \mathbb{R}^n_+} p \cdot x \quad \text{subject to} \quad U(x) \geq u.$$

So, in the dual optimization problem we minimize the expenditure under the utility
constraint. Fix a certain positive utility level $u > 0$. Then look for the bundle of
goods that, given prices, minimizes the income needed to reach that utility level.
This is solved by the above minimization problem, and gives the so-called compensated
demand of the consumer. Of course, you will reach that level by choosing a bundle of
goods that is ‘best’.

As with the budget constraint in the primal problem, if the utility function
is regular (and thus increasing) the utility constraint can be replaced by an equality,
yielding

$$\min_{x \in \mathbb{R}^n_+} p \cdot x \quad \text{subject to} \quad U(x) = u. \tag{5.8}$$

The solution to the dual problem gives the compensated demand function.

Definition 20 Consider a consumer with a regular utility function $U$. The compensated
demand function of this consumer is the function $h : \mathbb{R}^n_{++} \times \mathbb{R} \to \mathbb{R}^n_+$ given
by

$$h(p, u) = x$$

with $x$ the solution of the dual optimization problem (5.8).

The compensated demand function h(p; u) gives the demand of the consumer when
he/she wants to reach utility level u at prices p. This is also called the Hicksian
demand function after the British economist John Hicks (1904-1989). Note that
this ignores the budget of the consumer. Besides knowing what is the (compensated)
demand of the consumer when he/she wants to reach utility level u at prices p, we
also want to know what is the income needed to buy this demand. This is given by
the expenditure function which gives the minimal income needed to reach utility u at
prices p.

Definition 21 Consider a consumer with a regular utility function $U$. The expenditure
function of this consumer is the function $E : \mathbb{R}^n_{++} \times \mathbb{R} \to \mathbb{R}_+$ given by

$$E(p, u) = p \cdot h(p, u).$$


[Figure 5.2: Compensated demand of a consumer. Diagram labels: black box with utility function $U$; utility level $u$; compensated demand $h(p, u)$.]

Here, $E(p, u)$ gives the minimal income needed to reach utility $u$ at prices $p$.
Since the dual optimization problem minimizes expenditure under an equality (utility)
constraint, we can again apply the method of Lagrange. This gives the following result.

Theorem 36 Consider a consumer with a regular utility function $U$ and income $I \geq 0$.
For price vector $p \in \mathbb{R}^n_{++}$, we find the compensated demand $x = h(p, u)$ by solving the
following system of $n$ equations:

$$\frac{MU_k(x)}{MU_1(x)} = \frac{p_k}{p_1} \quad \text{for all } k \in \{2, \ldots, n\}$$

and

$$U(x) = u.$$

(It is an exercise to prove this theorem using the method of Lagrange, Exercise 9.)

Example 16 You can find illustrations of finding the compensated demand function
in Examples 4.3 and 4.4 of Nicholson and Snyder.

Next, we verify homogeneity of the Hicksian demand function and the expenditure
function with respect to prices.

Proposition 37 Consider a consumer with a regular utility function $U$. Then

• $h$ is homogeneous of degree 0 in price vector $p$, i.e. $h(\alpha p, u) = h(p, u)$ for all
$\alpha > 0$, $p \in \mathbb{R}^n_{++}$ and $u \in \mathbb{R}$.

• $E$ is homogeneous of degree 1 in price vector $p$, i.e. $E(\alpha p, u) = \alpha E(p, u)$ for all
$\alpha > 0$, $p \in \mathbb{R}^n_{++}$ and $u \in \mathbb{R}$.

(It is an exercise to prove this proposition, Exercise 10)


The compensated (Hicksian) demand function is homogeneous of degree zero in the
price vector. What happens if all prices are multiplied by the same constant $\alpha > 0$?
From the first order conditions of the Lagrange minimization problem (see Theorem
36), we see that this does not change the required marginal rates of substitution, since the
price ratios do not change. Therefore, the first order conditions are satisfied by the same
bundle of goods. This shows that the compensated demand function is homogeneous of degree
zero in the price vector. But, if the compensated demand does not change while the
prices are all multiplied by the same positive constant $\alpha$, then the income needed to
reach the given utility level (i.e. to buy the compensated demand bundle) is multiplied
by $\alpha$, showing that the expenditure function is homogeneous of degree 1 in the price
vector.

Another type of property is an identity (an equality that always holds). Here we
state four identities on consumer demand.

Identities 1 Consider a consumer with a regular utility function $U$. For every price
vector $p \in \mathbb{R}^n_{++}$, utility level $u > U(0)$, and income $I \geq 0$ such that $d(p, I) \in \mathbb{R}^n_{++}$, it
holds that

(i) $h(p, u) = d(p, E(p, u))$;

(ii) $d(p, I) = h(p, V(p, I))$;

(iii) $E(p, V(p, I)) = I$;

(iv) $V(p, E(p, u)) = u$.

The first identity states that, if we give the consumer the income $E(p, u)$ that is
needed to reach utility level $u$ at prices $p$, and then look at the Marshallian demand
of the consumer at those prices and income (i.e. let the consumer maximize his/her utility
under the budget constraint determined by $p$ and $E(p, u)$), we exactly get the
compensated demand of the consumer.
The second identity is some kind of reverse of the first, and states that if we ask
the consumer what bundle of goods he/she demands (compensated demand) in order
to reach (indirect) utility level $V(p, I)$ (i.e. the maximal utility that can be obtained
at prices $p$ and income $I$ by choosing a utility maximizing bundle of goods) at prices
$p$, this is exactly equal to the Marshallian demand $d(p, I)$.
The third identity states that the budget needed to reach indirect utility level
$V(p, I)$ at prices $p$ is exactly the income $I$.
Finally, the fourth identity states that the indirect utility that can be reached if the
consumer gets the budget $E(p, u)$ at prices $p$ is exactly the utility level $u$.
These identities can be very helpful when you are asked for one of the functions,
knowing already some of the other functions (see Exercise 5).
Finally, another useful result is Shephard’s lemma, with which you can find the
compensated demand functions by taking derivatives of the expenditure function.

Proposition 38 (Shephard’s Lemma) Consider a consumer with a regular utility
function $U$ and income $I \geq 0$. Then

$$h_k(p, u) = \frac{\partial E(p, u)}{\partial p_k} \quad \text{for all } k = 1, \ldots, n.$$
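
The identities and Shephard’s lemma can be illustrated with the same kind of numerical check. The sketch below assumes the hypothetical utility function $U(x) = \sqrt{x_1 x_2}$ (not an example from the notes), for which Theorem 36 gives $h(p, u) = (u \sqrt{p_2 / p_1}, u \sqrt{p_1 / p_2})$. All function names and numbers are illustrative, and the derivative in Shephard’s lemma is approximated by a central finite difference.

```python
import math


def marshallian_demand(p1, p2, income):
    """d(p, I) for the hypothetical utility U(x) = sqrt(x1 * x2)."""
    return (income / (2 * p1), income / (2 * p2))


def indirect_utility(p1, p2, income):
    """V(p, I) = U(d(p, I))."""
    x1, x2 = marshallian_demand(p1, p2, income)
    return math.sqrt(x1 * x2)


def hicksian_demand(p1, p2, u):
    """h(p, u): solves x1 / x2 = p2 / p1 together with sqrt(x1 * x2) = u."""
    return (u * math.sqrt(p2 / p1), u * math.sqrt(p1 / p2))


def expenditure(p1, p2, u):
    """E(p, u) = p . h(p, u)."""
    h1, h2 = hicksian_demand(p1, p2, u)
    return p1 * h1 + p2 * h2


p1, p2, income = 2.0, 5.0, 100.0
u = indirect_utility(p1, p2, income)

# Identities (ii) and (iii): h(p, V(p, I)) = d(p, I) and E(p, V(p, I)) = I.
h = hicksian_demand(p1, p2, u)
d = marshallian_demand(p1, p2, income)
spent = expenditure(p1, p2, u)

# Shephard's lemma for good 1, with a central finite difference in p1.
step = 1e-5
dE_dp1 = (expenditure(p1 + step, p2, u) - expenditure(p1 - step, p2, u)) / (2 * step)

# Homogeneity of degree 1 of E in prices (Proposition 37), with alpha = 3.
alpha = 3.0
scaled_spent = expenditure(alpha * p1, alpha * p2, u)

print("h(p, V(p, I)) =", h, " d(p, I) =", d)
print("E(p, V(p, I)) =", spent, " I =", income)
print("dE/dp1 =", dE_dp1, " h_1 =", h[0])
print("E(alpha p, u) =", scaled_spent, " alpha E(p, u) =", alpha * spent)
```

The check covers only one price vector and one income level, so it illustrates the results rather than proving them.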

5.5 Market demand


In Block 2 we will discuss various market models where demand is confronted with
supply. On a market it is the consumers who demand, and the producers
who supply, the goods. The producer is an economic agent that we will also discuss in
Block 2. Usually, to determine, for example, the market price of goods, what matters
is the total demand made by all consumers together. For every price vector, this total
market demand can be obtained as the sum of the individual (Marshallian) demands
of the consumers. In this section, we assume that there are $c$ consumers, $i = 1, \ldots, c$,
with income $I^i \geq 0$ and Marshallian demand function $d^i(p, I^i)$ for consumer $i$.

Definition 22 The market demand function is the function
$D : \mathbb{R}^n_{++} \times \mathbb{R}^c_+ \to \mathbb{R}^n_+$ given by

$$D(p, I^1, \ldots, I^c) = \sum_{i=1}^{c} d^i(p, I^i).$$

The market demand for good $k \in \{1, \ldots, n\}$ is the $k$-th component $D_k(p, I^1, \ldots, I^c)$.

Usually, market demand depends on the price vector and the distribution of
income. As seen from the definition above, the individual incomes
$I^i$ matter since they affect the individual demands. So, if there are two consumers, for a given
price vector the total market demand when both have income 50 can be different than
when one consumer has income 10 and the other has income 90, i.e. $d^1(p, 50) + d^2(p, 50)$
need not be equal to $d^1(p, 10) + d^2(p, 90)$.
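
A small numerical illustration of this point, as a sketch under stated assumptions: two hypothetical consumers each spend a fixed share of income on good 1 (a Cobb-Douglas type demand), with shares 0.5 and 0.2. The shares, prices and function names are illustrative and not taken from the notes.

```python
def individual_demand(share1, p, income):
    """Hypothetical demand: spend share1 of income on good 1, the rest on good 2."""
    p1, p2 = p
    return (share1 * income / p1, (1 - share1) * income / p2)


def market_demand(shares, p, incomes):
    """Sum of the individual demands d^i(p, I^i) over all consumers."""
    total1, total2 = 0.0, 0.0
    for share1, income in zip(shares, incomes):
        x1, x2 = individual_demand(share1, p, income)
        total1 += x1
        total2 += x2
    return (total1, total2)


p = (1.0, 2.0)

# Consumers with different preferences: the income distribution matters.
different = (0.5, 0.2)
equal_split = market_demand(different, p, (50.0, 50.0))
unequal_split = market_demand(different, p, (10.0, 90.0))

# Consumers with identical shares: only total income matters.
identical = (0.5, 0.5)
equal_split_same = market_demand(identical, p, (50.0, 50.0))
unequal_split_same = market_demand(identical, p, (10.0, 90.0))

print("different preferences:", equal_split, unequal_split)
print("identical preferences:", equal_split_same, unequal_split_same)
```

With different spending shares, total demand changes when income is redistributed; with identical shares of this particular form it does not, which is the situation in which a representative consumer can be used.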
If total market demand only depends on the price vector and the total income
$I = \sum_{i=1}^{c} I^i$ of the consumers (and thus $d^1(p, 50) + d^2(p, 50) = d^1(p, 10) + d^2(p, 90)$), then
we can model market demand as if there is only one consumer, called the representative
consumer, with demand function $D : \mathbb{R}^n_{++} \times \mathbb{R}_+ \to \mathbb{R}^n_+$ given by

$$D(p, I).$$

In this case we can act as if there is only one consumer in the market who behaves
as we discussed in this chapter. As mentioned before, in Block 2 the market demand
function will appear when we discuss markets (applied game theory: Industrial
Organization).

5.6 Elasticities
It is interesting to know how demand reacts to certain changes in prices and/or income.[5]
For example, does the demand for a certain good increase or decrease when income
increases? Or how does demand of good $k$ change if the price of another good $g \neq k$
increases? And does the demand of good $k$ always decrease if the price of good $k$ itself
increases? One way to answer these questions is by taking derivatives. However, for empirical work
this has the disadvantage that the changes depend on the units of measurement. Is a
change in demand of 10 units a lot? Is a change in demand of 10 units after a
change in income of 100 euro a lot? We cannot say.
Therefore, it is useful to measure these effects without a unit of measurement. This
can be done by elasticities. Elasticities show how demand (proportionally) changes
when price or income changes (proportionally). In this section we make the following
assumption on the market demand function.

Assumption 3 The market demand function is continuous, satisfies the budget
restriction, is homogeneous of degree 0 and partially differentiable.

Next, we define three types of elasticities.

[5] This section is based on Nicholson and Snyder, Chapter 5, pages 158-165.


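Definition 23

1. The price elasticity of good $k$ at price vector $p \in \mathbb{R}^n_{++}$ and market
income $I > 0$ is given by

$$e_{kk}(p, I) = \frac{\partial D_k(p, I) / \partial p_k}{D_k(p, I) / p_k} = \frac{\partial D_k(p, I)}{\partial p_k} \cdot \frac{p_k}{D_k(p, I)}.$$

2. The cross price elasticity of good $k$ with respect to good $g$ at price vector
$p \in \mathbb{R}^n_{++}$ and market income $I > 0$ is given by

$$e_{kg}(p, I) = \frac{\partial D_k(p, I) / \partial p_g}{D_k(p, I) / p_g} = \frac{\partial D_k(p, I)}{\partial p_g} \cdot \frac{p_g}{D_k(p, I)}.$$

3. The income elasticity of good $k$ at price vector $p \in \mathbb{R}^n_{++}$ and market income
$I > 0$ is given by

$$e_{kI}(p, I) = \frac{\partial D_k(p, I) / \partial I}{D_k(p, I) / I} = \frac{\partial D_k(p, I)}{\partial I} \cdot \frac{I}{D_k(p, I)}.$$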

First, note that these three elasticities indeed have no unit of measurement. To
interpret these elasticities: the (own) price elasticity of good $k$ at price vector $p \in \mathbb{R}^n_{++}$
and market income $I > 0$ measures by what percentage the demand of good $k$ changes
after a one percent increase in the price of good $k$. The cross price elasticity of good $k$
with respect to good $g \neq k$ at price vector $p \in \mathbb{R}^n_{++}$ and market income $I > 0$ measures
by what percentage the demand of good $k$ changes after a one percent increase in the
price of good $g$. The income elasticity of good $k$ at price vector $p \in \mathbb{R}^n_{++}$ and market
income $I > 0$ measures by what percentage the demand of good $k$ changes after a
one percent increase in the total market income.
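
The three elasticities of Definition 23 can be approximated numerically for any differentiable demand function. The sketch below uses a hypothetical constant-elasticity demand for good 1, $D_1(p, I) = p_1^{-1.5} \, p_2^{0.5} \, I$, chosen only because its elasticities are known in advance ($-1.5$, $0.5$ and $1$); the function names, the demand function and the numbers are assumptions for illustration.

```python
def demand_good1(p1, p2, income):
    """Hypothetical market demand for good 1 (homogeneous of degree 0)."""
    return p1 ** -1.5 * p2 ** 0.5 * income


def elasticity(f, args, index, rel_step=1e-6):
    """Elasticity of f with respect to args[index]: (df/dz) * z / f."""
    args = list(args)
    base = f(*args)
    h = rel_step * args[index]
    up, down = args.copy(), args.copy()
    up[index] += h
    down[index] -= h
    derivative = (f(*up) - f(*down)) / (2 * h)  # central finite difference
    return derivative * args[index] / base


point = (2.0, 5.0, 100.0)  # (p1, p2, I)
e_11 = elasticity(demand_good1, point, 0)  # own price elasticity
e_12 = elasticity(demand_good1, point, 1)  # cross price elasticity
e_1I = elasticity(demand_good1, point, 2)  # income elasticity

print("e_11 =", e_11, " e_12 =", e_12, " e_1I =", e_1I)
print("sum  =", e_11 + e_12 + e_1I)
```

Rescaling the units in which the good or the prices are measured leaves these numbers unchanged, which is the point of using elasticities. For this example the three elasticities also add up to approximately zero, as one would expect for a demand function that is homogeneous of degree 0.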
An advantage of elasticities is that they can be estimated with econometric
techniques.
Another advantage of elasticities is that they can be used to characterize different
types of goods. For example, usually one expects that the demand of a good reacts
positively to an increase in income. However, it might be that this response is negative.
This is the case for what we call inferior goods. For example, demand for a cheap
cheese might decrease when income increases and people can afford better cheeses.
And what is a luxury good? We speak about a luxury good if the percentage increase
in demand is more than the percentage increase in income. This brings us to the
following characterizations:

Characterizing types of goods by elasticities

Good $k$ is

• an inferior good if $e_{kI}(p, I) < 0$ for all $p \in \mathbb{R}^n_{++}$ and $I > 0$;

• a luxury good if $e_{kI}(p, I) > 1$ for all $p \in \mathbb{R}^n_{++}$ and $I > 0$;

• a substitute for good $g \neq k$ if $e_{kg}(p, I) > 0$ for all $p \in \mathbb{R}^n_{++}$ and $I > 0$;

• a complement for good $g \neq k$ if $e_{kg}(p, I) < 0$ for all $p \in \mathbb{R}^n_{++}$ and $I > 0$.

In the last two properties, goods $k$ and $g \neq k$ are substitutes if you can replace less
consumption of one by more consumption of the other, as reflected by a positive cross
price elasticity. For example, beer and wine usually are not consumed together, and
more consumption of one leads to less consumption of the other. On the other hand,
consumption of wine and wine glasses usually go together. If that is the case and the
cross price elasticity is negative, then we call them complementary goods.

Example 17 You can find illustrations of finding the compensated demand function
in Example 5.5 of Nicholson and Snyder.

Above we defined elasticities for the market demand function since that is how they
mostly are applied. But note that elasticities can similarly be defined for individual
demand functions.

5.7 Concluding remarks


In this chapter we applied the individual decision model of the previous chapters to
model a consumer on a market. We derived his/her demand function by utility
maximization over the budget set. Under the extra assumption of monotone preferences (making
the preferences regular), we could apply the Lagrange method to find necessary and sufficient
conditions for a positive demand vector. We also followed the dual approach by minimizing
expenditure to reach a certain utility level. We briefly discussed market demand and
elasticities.

In Period 2 we will apply this model of a consumer when we consider market
models (Applied Game Theory, Industrial Organization). On these markets there are
consumers and producers. Producers will be introduced in Period 2. We will look at
‘the game played among the producers’. The consumers will be ‘summarized’ by the
market demand function as derived in this chapter.

5.8 Exercises
Exercise 1 Consider a consumer with utility function $U(x) = \sqrt{x_1} + \sqrt{x_2}$ for all
$x \in \mathbb{R}^2_+$.

(a) Derive the (Marshallian) demand function of the consumer.

(b) Express the Lagrange multiplier as a function of $p$ and $I$.

(c) Derive the indirect utility function of the consumer.

(d) Verify Roy’s identity.

(e) Show that the demand function is homogeneous of degree 0 in p and I.

(f) Derive the compensated demand function and the expenditure function of this
consumer.

(g) Show that the compensated demand function is homogeneous of degree 0 in p, and
show that the expenditure function is homogeneous of degree 1 in p.

(h) Verify Shephard’s lemma.

Exercise 2 Answer the same questions as in Exercise 1 for the (Cobb-Douglas) utility
function $U(x) = (x_1)^{1/3} (x_2)^{2/3}$ for all $x \in \mathbb{R}^2_+$.

Exercise 3

(a) Derive the Marshallian demand function of a consumer with (linear) utility
function $U(x) = 2 x_1 + 3 x_2$ for all $x \in \mathbb{R}^2_+$.

(b) Derive the Marshallian demand function of a consumer with utility function
$U(x) = \min[x_1, 2 x_2]$.

Exercise 4 Consider a consumer with Cobb-Douglas utility function $U(x) = \prod_{k=1}^{n} (x_k)^{\alpha_k}$
for all $x \in \mathbb{R}^n_+$, with $\alpha_k > 0$ for all $k \in \{1, \ldots, n\}$ and $\sum_{k=1}^{n} \alpha_k = 1$. Derive the
(Marshallian) demand function, the indirect utility function and the expenditure function of
this consumer.

Exercise 5 Consider a consumer with indirect utility function $V(p, I) = \frac{I}{2 p_1 p_2}$.

(a) Derive the compensated demand function of this consumer.

(b) Derive the expenditure function of this consumer.

(c) Derive the Marshallian demand function of this consumer.

Exercise 6 Consider a consumer with demand function d(p; I) = I


; I
p1 p2
, + 1.

(a) Compute the price elasticities e11 and e22 of goods 1 and 2.

(b) Compute the cross price elasticity e12 .

(c) Compute the income elasticities e1I and e2I .

Concepts, Theory and Proofs

Exercise 7 Prove Proposition 34.

Exercise 8 Prove Proposition 35.

Exercise 9 Prove Proposition 36.

Exercise 10 Prove Proposition 37.


Chapter 6

The Producer and Individual Decision Making on Markets:
Perfect Competition and Monopoly

6.1 The Producer


6.1.1 Production functions
We consider a producer who produces one output good with $m$ input goods. The inputs
that are used are given by an input vector $x = (x_1, \ldots, x_m) \in \mathbb{R}^m_+$, where $x_k$ is the
amount used of input good $k = 1, \ldots, m$. The amount of the output good that can be
produced with any combination of inputs (input vector) is described by the production
function.

Definition 24 The production function of the producer is a function $f : \mathbb{R}^m_+ \to \mathbb{R}_+$
such that $f(x)$ is the maximal amount the producer can produce of the output good using
input vector $x = (x_1, \ldots, x_m)$.

Note that there can be more than one inputvector that gives the same maximal
output. We make the following assumptions on the production function.

Assumption 4 We assume the production function to be regular, meaning that

1. f is increasing;
2. f is strictly quasi-concave;
3. f is twice partially differentiable;
4. f (0) = 0.


Note that some of these assumptions are similar to those considered for the utility
function of the consumer. However, here they have a different meaning, although
mathematically some are the same assumption.

Question: Which of these conditions did we not require from utility functions of consumers?¹
Can you explain why we did not require this for utility functions?²

An increasing production function means that using more inputs (that is, using at
least as much of every input good, and more of at least one input good) leads to
a higher production of the output good. Strict quasi-concavity implies that ‘mixing
input vectors’ yields higher output: a strict convex combination of two different input
vectors that give the same output produces strictly more than that output. The third
assumption is made for mathematical convenience. The fourth assumption requires that
without any inputs we can produce only zero output. In other words, we need a positive
amount of at least one input good in order to produce a positive amount of output.

Question: For consumers, we only considered utility functions with at least two goods
in the bundle of goods. But for the producer we can study production functions with
only one input good. If there is only one input good then strict quasi-concavity of the
production function implies that it is increasing. Do you see why?

Before describing the decision model of the producer, we introduce some notions (which
again look similar to notions for the consumer.)

Definition 25 Consider a regular production function $f : \mathbb{R}^m_+ \to \mathbb{R}_+$.
The marginal product of input good $k = 1, \ldots, m$ at input vector $x \in \mathbb{R}^m_+$ is given by

\[ MP_k(x) = \frac{\partial f(x)}{\partial x_k}. \]

Observe that the marginal product of input good $k$ is essentially a function $MP_k : \mathbb{R}^m_+ \to \mathbb{R}$,
since the marginal product of this input good depends on the full input vector, so
it also depends on how much is used of the other input goods.
Note that:

\[ f \text{ is increasing in } x_k \iff MP_k(x) > 0 \text{ for all } x \in \mathbb{R}^m_+. \]

By the assumption of an increasing production function, marginal products are positive:
if more is used of input good $k$ then output will increase.
¹ Answer: The fourth assumption.
² Answer: Utility has no origin, but production has.

We speak about decreasing marginal product if

\[ \frac{\partial MP_k(x)}{\partial x_k} < 0. \]

Decreasing marginal product means that, the more you use of input good $k$, the more
extra input you need to increase output by one unit.
We speak about increasing marginal product if

\[ \frac{\partial MP_k(x)}{\partial x_k} > 0. \]
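
To make the notion of marginal product concrete, the following minimal sketch approximates $MP_1$ by a central difference. The Cobb-Douglas production function $f(x) = \sqrt{x_1 x_2}$, the step size and the evaluation points are assumptions chosen for illustration only; they do not come from the notes.

```python
def f(x1, x2):
    """Hypothetical Cobb-Douglas production function (an assumption for this sketch)."""
    return x1 ** 0.5 * x2 ** 0.5


def marginal_product_1(x1, x2, h=1e-6):
    """Central-difference approximation of MP_1(x), the partial derivative of f in x1."""
    return (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)


# MP_1 at increasing amounts of input 1, holding input 2 fixed at 4.
mp_values = [marginal_product_1(x1, 4.0) for x1 in (1.0, 4.0, 9.0)]
print(mp_values)
```

For this function $MP_1(x) = \tfrac{1}{2}\sqrt{x_2 / x_1}$, so the printed values are positive and fall as $x_1$ grows: the function is increasing in input 1 and has decreasing marginal product of input 1.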

6.1.2 Decision model of a producer


We assume that there is perfect competition on all input markets. This means that
the producer takes all input prices as given. Each input good has a certain positive
price. The producer knows this price and makes its decision about production based
on these input prices, assuming that whatever amount he/she demands of the
input goods, the input prices do not change.

Assumption 5 We assume perfect competition on the input markets, i.e. the pro-
ducer takes all input prices as given.

Concerning the output market, we will consider different market forms, such as perfect
competition, monopoly, or oligopoly, in Chapter 8. Specifically, we consider different
models of oligopoly as an application of noncooperative game theory.
The positive input prices are given by the input price vector $w \in \mathbb{R}^m_{++}$ with
$w_k > 0$ being the price of input good $k \in \{1, \ldots, m\}$.
As with the consumer, to build a decision model of the producer, we
need to answer the following questions:

1. What does the producer want?

2. What can the producer do?

3. What will the producer choose?

Ad 1. The producer wants to maximize its profit, being the difference between revenue
and costs. If the producer produces and sells an amount $q \geq 0$ of the output good
at unit price $p > 0$, then its revenue is $pq$. If the input prices are given by the
input price vector $w \in \mathbb{R}^m_{++}$, and the producer uses inputs in the amounts given by the
input vector $x \in \mathbb{R}^m_+$, then its cost is given by the inner product $w \cdot x$.

Ad 2. With the inputs that the producer can afford, he/she can produce at most what
is determined by the production function.

Ad 3. We consider two possibilities with respect to how the producer makes a choice:
(i) (Profit maximization) The producer chooses the input vector and an amount of
output that maximize its profit, or

(ii) (Cost minimization) Given an amount of the output good, the producer chooses
the input vector that minimizes cost.
For the moment, assume that the producer also takes the output price p as given.
(Remember that in all lectures from now on we assume that the producer takes the
input price vector w as given.)
Definition 26 A production plan is a pair $(x, q)$ with $x \in \mathbb{R}^m_+$ a nonnegative input
vector and $q \in \mathbb{R}_+$ a nonnegative amount of the output good.
If the producer is not restricted by a budget, i.e. when he/she can afford any input
vector, then he/she is only restricted by the technical restrictions described by the
production function. Then the producer can choose any input vector, and with that
input vector produce any amount of the output good that is smaller than or equal to the
amount determined by the production function. So, the set of alternatives the producer
can choose from is the set of all production plans $(x, q)$ such that output $q$ can
be produced with input vector $x$. This gives the set of feasible production plans
\[ F = \{(x, q) \in \mathbb{R}^{m+1}_+ \mid q \leq f(x)\}. \]
Assuming that the only criterion on which the producer makes its decision is profit,
a producer considers production plan $(x, q)$ at least as good as production plan
$(x', q')$ at prices $p$ and $w$ if and only if the profit obtained at $(x, q)$ is at least as high
as the profit obtained at $(x', q')$:
\[ (x, q) \succsim (x', q') \iff pq - w \cdot x \geq pq' - w \cdot x'. \]
The producer then chooses a best element (production plan) in the set of feasible
production plans $F$.

Question: Do you think it is realistic that profit is the only decision criterion of a
producer? Can you think of other decision criteria producers may use?

6.1.3 Profit maximization

If we consider profit maximization the only goal of the producer, then the profit
maximization problem of the producer is given as follows. Given output price $p > 0$ and
input price vector $w \in \mathbb{R}^m_{++}$, the profit maximization problem is

\[ \max_{q \geq 0,\; x \in \mathbb{R}^m_+} \; pq - w \cdot x \quad \text{subject to } q \leq f(x). \]

As with a regular utility function of the consumer, since the production function
is regular, and therefore increasing, we can write this profit maximization problem
as

\[ \max_{q \geq 0,\; x \in \mathbb{R}^m_+} \; pq - w \cdot x \quad \text{subject to } q = f(x). \]

Whereas regularity of the utility function was sufficient to analyse consumer demand
(for example, we could apply the Method of Lagrange to obtain a demand function),
this is not the case for a regular production function. It is even worse: the profit
maximization problem need not have a solution, even when the production function is
regular.

Question: Can you give a regular production function for which the above profit
maximization problem does not have a solution?³

6.1.4 Cost minimization


In order to solve the decision problem of the producer, we split its decision problem in
two parts. First, we consider his/her cost minimization problem. That is, given output
$q \geq 0$, we find the input vector that minimizes the cost to produce output quantity $q$.
This gives the contingent demand function of the producer to produce $q$.
Second, from the contingent demand function we derive the cost function. This
cost function gives, for any amount of output $q \geq 0$, the minimal cost the producer
must incur to produce output $q$. With this cost function, we then find the output that
maximizes profit. However, this profit maximization problem depends on the market
behavior of the producer, and therefore on the market form (competition, monopoly,
oligopoly, ...).
³ Answer: Consider the production function $f(x) = x$. Verify that this is a regular production
function, but profit maximization need not have a solution.
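
The footnote's example can be checked numerically. The sketch below assumes the illustrative prices $p = 2$ and $w = 1$ (these values are not from the notes) and evaluates profit $p f(x) - w x$ for the linear production function $f(x) = x$ at ever larger input levels.

```python
def profit(x, p=2.0, w=1.0):
    """Profit p*f(x) - w*x for the linear production function f(x) = x."""
    return p * x - w * x


# Profit at increasingly large input levels; prices are assumptions of this sketch.
profits = [profit(x) for x in (1.0, 10.0, 100.0, 1000.0)]
print(profits)
```

Whenever $p > w$, profit equals $(p - w)x$ and grows without bound in $x$, so no profit maximizing production plan exists.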

The cost minimization problem of the producer is:

Given input prices $w \in \mathbb{R}^m_{++}$ and output $q \geq 0$,

\[ \min_{x \in \mathbb{R}^m_+} \; w \cdot x \quad \text{subject to } q \leq f(x). \tag{6.1} \]

So, minimize the cost to produce a certain amount $q$ of the output good given the input
prices $w$.
Since the production function is regular (and thus increasing), we can replace
the inequality constraint by an equality constraint, to obtain the cost minimization
problem: Given input prices $w \in \mathbb{R}^m_{++}$ and output $q \geq 0$,

\[ \min_{x \in \mathbb{R}^m_+} \; w \cdot x \quad \text{subject to } q = f(x). \tag{6.2} \]

If $f$ is a regular production function then this minimization problem has a unique
solution. This solution can be found using the Lagrange method.

Definition 27 The contingent demand function (to produce $q$) of the producer is
the function $d : \mathbb{R}^m_{++} \times \mathbb{R}_+ \to \mathbb{R}^m_+$ given by

\[ d(w, q) = x^*, \]

with $x^*$ the solution of (6.2).

The contingent demand $d(w, q)$ gives the input vector that minimizes the total cost to
produce output $q$ at input price vector $w$. This can be seen as the producer’s demand
for all input goods when he/she wants to minimize the cost to produce an amount $q$ of the
output good, and has to buy the input goods at the prices given by input price
vector $w$. Note that the demand for input good $k = 1, \ldots, m$ depends on the prices of
all input goods since the producer chooses the cost minimizing input vector.
Using the Method of Lagrange, we can find solutions in the interior with all
amounts positive by the following theorem.

Theorem 39 Consider a producer with a regular production function. For input price
vector $w \in \mathbb{R}^m_{++}$ and output $q \geq 0$, the contingent demand $x^* = d(w, q) \in \mathbb{R}^m_{++}$ solves:
\[ \frac{MP_k(x^*)}{MP_1(x^*)} = \frac{w_k}{w_1} \quad \text{for all } k = 2, \ldots, m \]
and

\[ f(x^*) = q. \]

It is an exercise to prove this theorem. Notice the difference in the ‘role’ of the production
function (as part of the constraint) and the utility function for the consumer
(as the objective function).

Question: Can you give an intuition behind this theorem?

This theorem considers interior solutions $x^* \in \mathbb{R}^m_{++}$. (These are solutions where the
shadow prices of the nonnegativity constraints are zero.) There might be corner solutions.
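
As an illustration of the conditions in the theorem, the following sketch works out the case $m = 2$ for the hypothetical production function $f(x) = \sqrt{x_1 x_2}$ (an assumption, not an example from the notes). Here $MP_2(x)/MP_1(x) = x_1/x_2$, so the two conditions give $x_1 = q\sqrt{w_2/w_1}$ and $x_2 = q\sqrt{w_1/w_2}$. The code compares this closed form with a brute-force search along the isoquant; the prices and the output level are illustrative.

```python
import math


def contingent_demand(w1, w2, q):
    """Closed-form solution of the tangency and output conditions for f(x) = sqrt(x1 * x2)."""
    x1 = q * math.sqrt(w2 / w1)
    x2 = q * math.sqrt(w1 / w2)
    return x1, x2


def cost_on_isoquant(x1, w1, w2, q):
    """Cost of the input vector on the isoquant f(x) = q whose first coordinate is x1."""
    x2 = q * q / x1          # solves sqrt(x1 * x2) = q for x2
    return w1 * x1 + w2 * x2


w1, w2, q = 1.0, 4.0, 3.0    # illustrative input prices and output level
x1_star, x2_star = contingent_demand(w1, w2, q)
min_cost = w1 * x1_star + w2 * x2_star

# Brute-force check: no grid point on the isoquant is cheaper than the closed form.
grid_costs = [cost_on_isoquant(0.01 * i, w1, w2, q) for i in range(1, 5001)]
print(x1_star, x2_star, min_cost, min(grid_costs))
```

With these numbers the contingent demand is $(6, 1.5)$ at a cost of 12, and the ratio $x_1/x_2 = 4$ equals the price ratio $w_2/w_1$.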

6.1.5 Cost functions


The cost function gives the cost of contingent demand. In other words, it gives the
minimal cost needed to produce output q at input price vector w.

Definition 28 The cost function of the producer is the function $C : \mathbb{R}^m_{++} \times \mathbb{R}_+ \to \mathbb{R}_+$
given by

\[ C(w, q) = w \cdot d(w, q), \]

with $d$ the contingent demand function.

Question: Did you encounter a similar function for the consumer?

If input price vector w is given and fixed, we often write the cost function as C(q). In
the coming weeks, we will usually characterize a producer by its cost function, but now
we know where that cost function comes from. Even when the producer is selling its
output on a market where the producers are aware of their influence on each other (i.e.
we have an interdependent decision situation), the cost function is determined by an
individual decision model. (Similarly, consumers will be characterized by their demand
function which is derived in Lecture 6 from their utility function.)
First, we study some homogeneity properties.

Proposition 40 Consider a regular production function $f$. Then

the contingent demand function is homogeneous of degree 0 in $w$, i.e. $d(\lambda w, q) = d(w, q)$
for all $\lambda > 0$;

the cost function is homogeneous of degree 1 in $w$, i.e. $C(\lambda w, q) = \lambda C(w, q)$ for
all $\lambda > 0$.

It is an exercise to prove this proposition.

Question: Do these properties make sense intuitively?

Homogeneity of degree zero in input prices means that multiplying all input prices
by the same positive constant does not change the contingent demand. To produce
the same amount of output, since the price ratios do not change, the producer will
still demand the inputs in the same proportion, and thus contingent demand does
not change. Of course, since all input prices are multiplied by the same positive
constant, if the producer demands the same input amounts, then its cost changes: to
be precise, it is multiplied by the same constant as the input prices, i.e. the cost function
is homogeneous of degree one in input prices.
Mathematically, Shephard’s lemma for the producer is the same as for the consumer,
but now it says that you can obtain the contingent demand for an input good by partial
differentiation of the cost function with respect to the corresponding input price.

Proposition 41 (Shephard’s lemma) Consider a regular production function $f$. Then

\[ d_k(w, q) = \frac{\partial C(w, q)}{\partial w_k} \quad \text{for all } k = 1, \ldots, m. \]
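
The homogeneity properties and Shephard’s lemma can be checked numerically in a simple case. The sketch below assumes the hypothetical production function $f(x) = \sqrt{x_1 x_2}$, whose contingent demand is $d(w, q) = (q\sqrt{w_2/w_1},\, q\sqrt{w_1/w_2})$; the prices, the output level and the scaling factor are illustrative assumptions. This is a numerical check, not a proof.

```python
import math


def demand(w1, w2, q):
    """Contingent demand for the assumed production function f(x) = sqrt(x1 * x2)."""
    return q * math.sqrt(w2 / w1), q * math.sqrt(w1 / w2)


def cost(w1, w2, q):
    """Cost function C(w, q) = w . d(w, q)."""
    x1, x2 = demand(w1, w2, q)
    return w1 * x1 + w2 * x2


w1, w2, q, lam, h = 1.0, 4.0, 3.0, 2.5, 1e-6

# Homogeneity of degree 0 (demand) and of degree 1 (cost) in input prices.
scaled_demand = demand(lam * w1, lam * w2, q)
scaled_cost = cost(lam * w1, lam * w2, q)

# Shephard's lemma: the partial derivative of C in w1 should equal d_1(w, q).
dC_dw1 = (cost(w1 + h, w2, q) - cost(w1 - h, w2, q)) / (2 * h)
print(scaled_demand, demand(w1, w2, q), scaled_cost, lam * cost(w1, w2, q), dC_dw1)
```

Scaling both input prices by 2.5 leaves the demand $(6, 1.5)$ unchanged and multiplies the cost of 12 by 2.5, while the numerical derivative of the cost with respect to $w_1$ is close to $d_1(w, q) = 6$.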

Step 2
As mentioned before, to answer the question of what output to produce, we need to know
the form of the output market. Several assumptions will be discussed in Chapter @@.

6.2 Individual Decision Making on Markets: Perfect Competition and Monopoly
In this chapter, we present two extreme market models where decision making is not
interdependent: perfect competition and monopoly.

Whereas in the next chapters we apply models of interdependent decision making
(game theory) to markets where the decision makers (players) are the producers on the
market who are aware of each other’s influence on the market, and therefore on each
other’s behaviour, in this chapter (and lecture) we apply models of individual decision
making to two extreme forms of markets where the producers are not aware of their
interaction, or at least behave as if they are not aware of it.

We consider partial markets, i.e. markets for one output good. (Markets with more
than one output good are studied in the field of general equilibrium theory, which is
beyond the scope of this course.) Each producer supplies a certain amount of the good
on the market. The total market supply is the sum of the individual supplies of all
producers. This total market supply will be confronted with total market demand on
the market. This confrontation results in a certain (equilibrium) price and an amount
of the good that is traded on the market. The equilibrium price and quantity depend
on the behavior of the producers on the market.
In principle, the equilibrium price and quantity also depend on the behavior of the
consumers, but in this course we ignore this behavior and simply assume that total
market demand is given by a market demand function. This market demand function is
built from the individual demands of the consumers that were studied in Chapter 5. In
this part of the course we assume that adding up the individual (Marshallian) demand
functions of the consumers (see Chapter 5) gives the market demand function

\[ D : \mathbb{R}_{++} \to \mathbb{R}_+, \]

where $D(p)$ is the total market demand for the good if the price of the good is $p > 0$.
(Note that there is only one good.)
For convenience we make the following assumption on the market demand function.

Assumption 6 $D$ is differentiable with $D'(p) < 0$ for all $p > 0$.

As mentioned, we focus on the (choice) behavior of the producers. They are the ‘players
in the game’. Therefore, we first model these producers.

6.2.1 Market forms


Trade between consumers and producers takes place on markets. A market system
consists of a collection of partial markets, where on each partial market one good is
traded against a numeraire good (money). Since in this and the next chapters we only
consider partial markets, we refer to these simply as markets. On a market for an
output good, the producers supply the good, and the consumers demand the good. In
return, the consumers give some amount of a numeraire good (money) to the producers.
The market form describes issues such as: how many consumers and producers are
on the market, is there free entry and exit, how do the producers behave (are they
aware of each other’s presence and the effect of their own behavior on the behavior
of their competitors), etc. In this and the next chapters we assume that the consumer
side of the market can be modelled by a market demand function (see Chapter 5). So,
we do not model the individual consumers but only their aggregated demand. With
respect to the producers we consider several assumptions. In this chapter we begin by
studying two benchmark cases where producers do not take account of each other, and
thus there is no strategic interaction:

Competitive market:
In a competitive market the producers assume that they do not have influence
on the output price, i.e. they take the output price as given.

Monopoly:
In a monopoly there is only one producer, and thus total market supply is equal
to the supply of the monopolist.

In both cases, there is no strategic interaction between producers. Therefore,
these are models of individual decision making. In period 2 we discuss several
market models where there is strategic interaction between the producers, and therefore
we apply models of interdependent decision making (game theory).

We make the same assumptions with respect to the market demand function
and production function as before.

Assumption 7 The market demand function $D$ is differentiable with $D'(p) < 0$ for all
$p > 0$. Each producer has a regular production function, and takes input prices as
given.

6.3 Perfect competition


We speak about perfect competition if the producers take the price of the output
good as given. So, they make their decisions knowing that they can sell the output
they produce at the given price, and this price does not depend on how much
the individual producer supplies. (Recall that we always assume the producers to take
input prices as given.) Note that, also if there is only one producer (a monopoly), this
producer can behave as a perfect competitor, namely when he/she takes the output price
as given. Even though the price might depend on the supply of the monopolist, he/she
still might behave as if he/she does not influence the price.

We assume that there are n producers.

Definition 29 The profit function $\pi_j : \mathbb{R}_+ \to \mathbb{R}$ of individual producer $j \in \{1, \ldots, n\}$
is given by

\[ \pi_j(q_j) = p q_j - C_j(q_j), \]

where $C_j : \mathbb{R}_+ \to \mathbb{R}_+$ is the cost function of producer $j$.

We assume that the only objective of the producer is to maximize its profit.
(Recall that the cost function is based on cost minimization.) Then, the profit maximization
problem of an individual producer $j \in \{1, \ldots, n\}$ is

\[ \max_{q_j \geq 0} \pi_j(q_j) = \max_{q_j \geq 0} \; p q_j - C_j(q_j). \]

The first order condition for an interior solution ($q_j > 0$) is

\[ \frac{d\pi_j(q_j)}{dq_j} = 0 \iff p - \frac{dC_j(q_j)}{dq_j} = 0 \iff p - MC_j(q_j) = 0 \]

with

\[ MC_j(q) = \frac{dC_j(q)}{dq} \tag{6.3} \]

the marginal cost at $q$. It describes how the total cost changes if an infinitesimally
higher amount of output is produced.
So, a necessary condition for maximizing profit is that the marginal cost at the
(positive) output equals the output price.
The second order condition is:
\[ \frac{d^2\pi_j(q_j)}{dq_j^2} < 0 \iff -\frac{d^2 C_j(q_j)}{dq_j^2} < 0 \iff \frac{d^2 C_j(q_j)}{dq_j^2} > 0. \]

So, to be profit maximizing, the second order derivative of the cost function
should be positive, so the cost should be progressively increasing at the profit
maximizing output level.

Question: What can you say if marginal cost is negative for all nonnegative
output levels?⁴
This can be stated in the following theorem.

Theorem 42 If the cost function $C_j$ of producer $j \in \{1, \ldots, n\}$ is differentiable and
strictly convex, then the profit maximizing output $q_j > 0$ at output price $p > 0$ satisfies
\[ p = MC_j(q_j). \]
If $p < MC_j(q_j)$ for all $q_j \geq 0$ then the solution is $q_j = 0$.
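
A small numerical check of the condition $p = MC_j(q_j)$ is sketched below. The strictly convex cost function $C(q) = q^2$ and the output price $p = 6$ are assumptions made for illustration; the grid search is only an independent check of the first order condition.

```python
def cost(q):
    """Hypothetical strictly convex cost function C(q) = q^2 (an assumption)."""
    return q * q


def profit(q, p):
    """Profit of a price-taking producer at output q and given output price p."""
    return p * q - cost(q)


p = 6.0
q_foc = p / 2.0                      # solves p = MC(q) = 2q

# Grid search over outputs 0.00, 0.01, ..., 10.00 as an independent check.
grid = [i / 100 for i in range(0, 1001)]
q_grid = max(grid, key=lambda q: profit(q, p))
print(q_foc, q_grid, profit(q_foc, p))
```

Both approaches give output 3 with profit 9. For this cost function $MC(0) = 0 < p$ for every positive price, so the corner solution $q_j = 0$ of the theorem never occurs here.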

6.3.1 Supply of a perfect competitor


In (6.3) we defined the marginal cost, which appeared in the necessary conditions for
profit maximization. Next, we define the notion of average cost function, which gives
the average cost per unit of output.

Definition 30 The average cost function is the function $AC_j : \mathbb{R}_{++} \to \mathbb{R}_+$ given
by
\[ AC_j(q_j) = \frac{C_j(q_j)}{q_j} \quad \text{for all } q_j \in \mathbb{R}_{++}. \]
It turns out that marginal costs are equal to average costs where average costs
are minimal.

Proposition 43 If $AC_j(q^*) \leq AC_j(q_j)$ for all $q_j > 0$, then $MC_j(q^*) = AC_j(q^*)$.

It is an exercise to prove this proposition.


Now we are ready to characterize the supply of a producer who behaves as a
perfect competitor. We will refer to profit maximizing supply simply as ‘supply’.

Proposition 44 The profit maximizing supply $q_j(p)$ of producer $j$ (if it exists) is given
by:
\[ q_j(p) = \begin{cases} q_j & \text{if } MC_j(q_j) = p \text{ and } p \geq AC_j(q_j), \\ 0 & \text{otherwise.} \end{cases} \]
Notice that this proposition only characterizes the supply if the profit maximization
problem has a solution, i.e. if a profit maximizing supply exists. If $\frac{dMC_j(q_j)}{dq_j} < 0$ for all
$q_j \geq 0$, then there exists no optimal (profit maximizing) supply.
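
The following sketch illustrates the supply rule for one hypothetical cost function, $C(q) = q^3 - 2q^2 + 2q$ (an assumption, not taken from the notes). Then $MC(q) = 3q^2 - 4q + 2$ and $AC(q) = q^2 - 2q + 2$, and average cost is minimal at $q = 1$, where $MC(1) = AC(1) = 1$. The code solves $p = MC(q)$ on the increasing part of the marginal cost curve and supplies that quantity only if the price covers average cost.

```python
import math


def mc(q):
    """Marginal cost of the assumed cost function C(q) = q^3 - 2q^2 + 2q."""
    return 3 * q * q - 4 * q + 2


def ac(q):
    """Average cost C(q) / q of the same cost function, for q > 0."""
    return q * q - 2 * q + 2


def supply(p):
    """Supply at price p: the root of p = MC(q) if price covers average cost, else 0."""
    disc = 12 * p - 8                 # discriminant term of 3q^2 - 4q + (2 - p) = 0
    if disc < 0:
        return 0.0                    # p = MC(q) has no solution
    q = (4 + math.sqrt(disc)) / 6     # root on the increasing part of MC
    return q if p >= ac(q) else 0.0


print(supply(0.9), supply(1.0), supply(2.0))
```

For prices below the minimum of average cost (here 1) supply is zero; from that price on, supply follows the marginal cost curve.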
Question: Why does optimal supply not exist if $\frac{dMC_j(q_j)}{dq_j} < 0$ for all $q_j \geq 0$?
⁴ Answer: If marginal cost is negative for all nonnegative output levels then there is a corner
solution, see Theorem 42.

6.3.2 Perfect competition: equilibrium


As mentioned before, on the market the total market demand (of all consumers together)
will be confronted with the total market supply (being the sum of the supplies
of all individual producers).

Definition 31 The market supply function is the function $S : \mathbb{R}_{++} \to \mathbb{R}_+$ given
by
\[ S(p) = \sum_{j=1}^{n} q_j(p) \quad \text{for all } p \in \mathbb{R}_{++}. \]

The equilibrium price is where market demand equals market supply.

Definition 32 The perfect competition equilibrium price is the price $p^e$ that
satisfies $D(p^e) = S(p^e)$. Then $q^e = D(p^e)$ is the perfect competition equilibrium
output.
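
As a sketch of how such an equilibrium can be computed, assume a hypothetical linear market demand $D(p) = 12 - p$ for $0 < p < 12$ and $n = 4$ identical producers with cost function $C_j(q) = q^2$, so that each supplies $q_j(p) = p/2$. None of these numbers come from the notes. The code finds the price at which excess demand $D(p) - S(p)$ is zero by bisection.

```python
def demand(p):
    """Hypothetical market demand D(p) = 12 - p, used for 0 < p < 12."""
    return 12.0 - p


def individual_supply(p):
    """Supply of one producer with assumed cost C(q) = q^2: solves p = MC(q) = 2q."""
    return p / 2.0


def market_supply(p, n=4):
    """Market supply as the sum of n identical individual supplies."""
    return n * individual_supply(p)


def equilibrium_price(lo=0.0, hi=12.0, tol=1e-10):
    """Bisection on excess demand D(p) - S(p), which is decreasing in p here."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if demand(mid) - market_supply(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_e = equilibrium_price()
q_e = demand(p_e)
print(p_e, q_e)
```

In this example $12 - p = 2p$ gives $p^e = 4$ and $q^e = 8$, which the bisection reproduces up to the chosen tolerance.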

What we considered here is usually called a short run analysis. On the other hand,
in a long run analysis, there is free entry and exit on the market. In that case profits
of the producers are zero, and the equilibrium determines which producers are active.
Then market entry becomes a strategic decision, see the next chapters.

6.4 Monopoly
We speak about a monopoly if there is only one producer on the market. In that case,
the individual supply of the monopolist is also the total market supply. In equilibrium
this supply equals total market demand. Taking the inverse of the demand function,
the monopolist sees, for every amount it supplies, the price it earns for each unit
if he/she sells exactly the market demand at that price in equilibrium.

Definition 33 Given market demand function $D$, the function $P : \mathbb{R}_+ \to \mathbb{R}_+$ with
$P(\cdot) = D^{-1}(\cdot)$ is called the inverse demand function or price function.

We say that a single producer on a market ‘behaves’ as a monopolist if he/she is aware
of its own influence on the equilibrium price of the output good. Since equilibrium is
where market demand equals market supply, and the monopolist is the only producer
on the market, it is obvious that the choice of the monopolist with respect to how much
to supply has an effect on the equilibrium price. It is not obvious that the monopolist
takes account of this effect when choosing its supply, but when he/she ‘behaves’ as a
monopolist then he/she fully takes account of this effect.

Definition 34 The profit function $\pi^m : \mathbb{R}_+ \to \mathbb{R}$ of the monopolist is given by

\[ \pi^m(q) = P(q) q - C(q). \tag{6.4} \]

By taking the inverse demand function instead of a given price $p$ (as in the profit
maximization problem of the perfect competitor), the monopolist takes account of
price changes depending on changes in output.
The profit maximization problem of the monopolist then is

\[ \max_{q \geq 0} \pi^m(q). \]

The first order condition for profit maximization is:

\[ \frac{d\pi^m(q)}{dq} = 0 \iff P(q) + q \frac{dP(q)}{dq} - \frac{dC(q)}{dq} = 0. \]

The second order condition for profit maximization is:

\[ 2 \frac{dP(q)}{dq} + q \frac{d^2 P(q)}{dq^2} - \frac{d^2 C(q)}{dq^2} < 0. \]

This is summarized in the following theorem.

Theorem 45 If $\frac{d^2 P(q)}{dq^2} < 0$ and the cost function is strictly convex, then the profit
maximizing output $q > 0$ of a monopolist satisfies

\[ P(q) + q \frac{dP(q)}{dq} = MC(q). \]

As with the consumer, profit is maximal where ‘marginal revenue’ equals ‘marginal
cost’. But now marginal revenue ($P(q) + q \frac{dP(q)}{dq}$) depends on the supply $q$, whereas
for the perfect competitor marginal revenue was simply the given price $p$.

Example 18

It is interesting (also for policy advice) to compare the outcome in perfect competition
with the outcome under monopoly. As one might expect, under perfect competition the
equilibrium price is lower, and the equilibrium output is larger, than under monopoly.
We can make this statement precise when we consider a market with only one producer,
and compare the equilibrium when this producer behaves as a monopolist with the
situation where he/she behaves as a perfect competitor. In the following, we denote
by $p^m$ the monopoly price (the price that maximizes the monopoly profit) and by $q^m$
the monopoly output $q^m = D(p^m)$.

Proposition 46 Suppose there is one producer on the market with cost function $C$
that is twice differentiable and strictly convex, and $\frac{d^2 P(q)}{dq^2} < 0$. Then

\[ p^m > p^e \quad \text{and} \quad q^m < q^e. \]

This proposition follows from the first order condition for the monopolist
\[ P(q) + q \frac{dP(q)}{dq} = MC(q), \]
the first order condition for the perfect competitor

\[ p = MC(q), \]
and the assumption that $\frac{dP(q)}{dq} < 0$ for all $q \geq 0$. Note that this proposition considers
only one producer. This producer can behave as a perfect competitor.
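
To see the comparison in numbers, the sketch below assumes a hypothetical inverse demand function $P(q) = 5 - q^2$ (for $0 \leq q \leq \sqrt{5}$) and cost function $C(q) = q^2$; both are illustrative choices that satisfy the conditions of the proposition. It solves the two first order conditions by bisection.

```python
def price(q):
    """Hypothetical inverse demand P(q) = 5 - q^2 (an assumption of this sketch)."""
    return 5.0 - q * q


def mc(q):
    """Marginal cost of the assumed cost function C(q) = q^2."""
    return 2.0 * q


def bisect(g, lo, hi, tol=1e-12):
    """Root of a decreasing function g on [lo, hi] with g(lo) > 0 > g(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Monopolist: marginal revenue P(q) + q P'(q) = 5 - 3q^2 equals MC(q).
q_m = bisect(lambda q: (5.0 - 3.0 * q * q) - mc(q), 0.0, 2.0)
# Same producer behaving as a perfect competitor: P(q) = MC(q).
q_e = bisect(lambda q: price(q) - mc(q), 0.0, 2.0)
p_m, p_e = price(q_m), price(q_e)
print(q_m, p_m, q_e, p_e)
```

The monopoly outcome is $q^m = 1$ with $p^m = 4$, while price-taking behaviour gives $q^e = \sqrt{6} - 1$ and a lower price, in line with $p^m > p^e$ and $q^m < q^e$.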

6.4.1 Finding the optimal price


As with the perfect competitor, we considered the profit maximization problem of the
monopolist as one where the monopolist chooses the supply that maximizes his/her
profit under the condition that market supply equals market demand. Since the monopolist
fully determines market supply by his/her own supply, we can also consider
the profit maximization problem of the monopolist as the monopolist choosing the price
that maximizes his/her profit under the condition that market supply equals market
demand. So, for the monopolist, profit can also be written as a function of the price $p$:

\[ p D(p) - C(D(p)). \]

In that case maximizing the profit function (6.4) is equivalent to

\[ \max_{p \geq 0} \; p D(p) - C(D(p)) \]

with first order condition for profit maximization:

\[ p \frac{dD(p)}{dp} + D(p) - \frac{dC(q)}{dq} \frac{dD(p)}{dp} = 0 \]

\[ \iff D(p) + p \frac{dD(p)}{dp} = MC(q) \frac{dD(p)}{dp}, \]
where $q = D(p)$.
This cannot be done for the perfect competitor since price is (assumed to be) independent
of supply. In the next chapters, we consider intermediate market forms where
there is more than one producer (so no monopoly) who are aware of their influence on
the price, and therefore on each other’s behavior. In that case, we can also look at the
producers choosing the optimal quantity or optimal price, but we will see that these
are not equivalent. So, this equivalence is typical for a monopolist.
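
The equivalence of choosing a quantity and choosing a price can be illustrated with a grid search. The sketch assumes the hypothetical inverse demand $P(q) = 5 - q^2$, hence demand $D(p) = \sqrt{5 - p}$ for $0 \leq p \leq 5$, and cost function $C(q) = q^2$; these functions and the grids are assumptions made for illustration.

```python
import math


def inverse_demand(q):
    """Hypothetical inverse demand P(q) = 5 - q^2."""
    return 5.0 - q * q


def demand(p):
    """The corresponding demand function D(p) = sqrt(5 - p), for 0 <= p <= 5."""
    return math.sqrt(5.0 - p)


def cost(q):
    """Assumed cost function C(q) = q^2."""
    return q * q


def profit_in_q(q):
    """Monopoly profit as a function of output, P(q) q - C(q)."""
    return inverse_demand(q) * q - cost(q)


def profit_in_p(p):
    """Monopoly profit as a function of price, p D(p) - C(D(p))."""
    return p * demand(p) - cost(demand(p))


q_grid = [i / 1000 for i in range(0, 2001)]      # outputs 0.000, ..., 2.000
p_grid = [i / 1000 for i in range(1000, 5001)]   # prices 1.000, ..., 5.000
q_best = max(q_grid, key=profit_in_q)
p_best = max(p_grid, key=profit_in_p)
print(q_best, inverse_demand(q_best), p_best, demand(p_best))
```

Maximizing over output gives $q = 1$ with price $P(1) = 4$, and maximizing over price gives $p = 4$ with demand $D(4) = 1$: both formulations select the same point on the demand curve.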

6.4.2 Deadweight loss of monopoly


Usually, people speak negatively about a monopoly. But what is so negative about a
monopoly?

Question: Can you think of reasons why people are negative about a monopoly?
Can you also think of circumstances where a monopoly has advantages?

A main problem with a monopoly is that it decreases the welfare surplus, which
is a measure of the well-being of the society consisting of all consumers and producers.
The welfare surplus is the sum of the so-called consumer surplus (being a measure of
welfare of the consumers on a market) and the producer surplus (being a measure of
welfare of the producers on a market). Formally, the consumer and producer surplus
are defined as follows.

Definition 35 For output $q^*$ the consumer surplus $CS(q^*)$ is the area between the
(inverse) demand function and the price at $q^*$:
\[ CS(q^*) = \int_0^{q^*} (P(q) - P(q^*)) \, dq. \]

For output $q^*$ the producer surplus $PS(q^*)$ is the area between the price at $q^*$ and the
marginal cost function:
\[ PS(q^*) = \int_0^{q^*} (P(q^*) - MC(q)) \, dq. \]

For output $q^*$ the welfare surplus $WS(q^*)$ is the sum of the consumer surplus and
producer surplus:
\[ WS(q^*) = CS(q^*) + PS(q^*) = \int_0^{q^*} (P(q) - MC(q)) \, dq. \]

How can we interpret the area between the (inverse) demand function and the
price at $q^*$ as consumer surplus? Consider any price $p > 0$, and the demand $D(p)$ at
that price. Who buys at that price? These are the consumers who are willing to pay at
least $p$. Among these consumers there are consumers who would even be willing to pay
more than $p$. These consumers generate some ‘surplus’: they get the good for a price
that is less than they are prepared to pay for it. This has a positive effect on the
well-being of the consumers and thus increases the consumer surplus. Of course, consumers
who are not prepared to pay $p$ will not buy, so there is no negative effect. The total
positive effect is measured by the area between the (inverse) demand function and the
price at $q^*$, and therefore this is the consumer surplus.
A similar interpretation can be given to the producer surplus. At price $p$ all
producers are selling who are willing to produce for a price of $p$ or lower. Those
producers who are willing to sell even for a lower price will make a profit at price $p$.
All these effects together form the producer surplus.
Adding the consumer and producer surplus together gives the total welfare surplus.
Consumer surplus is maximal under perfect competition. Even the total welfare
surplus is maximal under perfect competition.

Question: Can you give an intuitive explanation for this if all producers have the same
cost function?

Therefore, monopoly decreases the welfare surplus. This loss, i.e. the difference in
welfare surplus between perfect competition and monopoly, is called the deadweight
loss of monopoly.
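
The deadweight loss can be computed numerically in a simple case. The sketch assumes the hypothetical inverse demand $P(q) = 5 - q^2$ and marginal cost $MC(q) = 2q$ (from $C(q) = q^2$); for these, the monopoly output is $q^m = 1$ and the output under price-taking behaviour is $q^e = \sqrt{6} - 1$. The integrals in the definitions of the surpluses are approximated with a midpoint rule.

```python
import math


def price(q):
    """Hypothetical inverse demand P(q) = 5 - q^2 (an assumption of this sketch)."""
    return 5.0 - q * q


def mc(q):
    """Marginal cost of the assumed cost function C(q) = q^2."""
    return 2.0 * q


def integrate(g, a, b, steps=20_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def consumer_surplus(q_star):
    """Area between inverse demand and the price at q_star."""
    return integrate(lambda q: price(q) - price(q_star), 0.0, q_star)


def producer_surplus(q_star):
    """Area between the price at q_star and marginal cost."""
    return integrate(lambda q: price(q_star) - mc(q), 0.0, q_star)


def welfare_surplus(q_star):
    """Sum of consumer and producer surplus at output q_star."""
    return consumer_surplus(q_star) + producer_surplus(q_star)


q_m = 1.0                    # solves 5 - 3q^2 = 2q (monopoly first order condition)
q_e = math.sqrt(6.0) - 1.0   # solves 5 - q^2 = 2q (price equals marginal cost)
deadweight_loss = welfare_surplus(q_e) - welfare_surplus(q_m)
print(welfare_surplus(q_m), welfare_surplus(q_e), deadweight_loss)
```

In this example the welfare surplus is $11/3$ at the monopoly output and $4\sqrt{6} - 17/3$ at the competitive output, so the deadweight loss is roughly 0.46.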
This is definitely a disadvantage of a monopoly. But how ‘bad’ is a monopoly?
Is this a very serious problem? This depends on how strong the negative effect on the
welfare surplus is. Various measures for the ‘degree of monopoly power’ on a market have
been developed. For example, if the monopoly price is close to the perfect competition
price, then the negative effect of monopoly might not be so big. Of course, one should
look at the proportional increase (in price). This is measured by one of the most famous
indices for monopoly power, the Lerner index.
The Lerner index is given by

$$\frac{P(q^m) - MC(q^m)}{P(q^m)}.$$

So, if price is 'relatively close to' marginal cost, then the outcome of the monopoly
is 'close' to the perfect competition outcome.
The Lerner index can be related to the price elasticity of demand. Since this elasticity
is negative for a decreasing demand function, the Lerner index is equal to minus the
reciprocal of the price elasticity of demand:

$$\frac{P(q^m) - MC(q^m)}{P(q^m)} = -\frac{1}{e_{D,P}(q^m)},$$

where

$$e_{D,P}(q^m) = \left.\frac{dD(p)}{dp}\,\frac{p}{D(p)}\right|_{q^m}$$

is the price elasticity of demand.
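As a quick numerical check of this relation, the sketch below uses an assumed linear demand $D(p) = 10 - p$ and a constant marginal cost of 2 (both hypothetical), for which the monopoly price is 6. The elasticity is approximated by a central finite difference. The sign convention is that the elasticity of a downward-sloping demand is negative, so the Lerner index is compared with $-1/e_{D,P}$.

```python
def demand(p):
    """Assumed linear demand D(p) = 10 - p (hypothetical numbers)."""
    return 10.0 - p

def price_elasticity(demand_fn, p, h=1e-6):
    """e = dD/dp * p / D(p), with the derivative taken by central difference."""
    slope = (demand_fn(p + h) - demand_fn(p - h)) / (2 * h)
    return slope * p / demand_fn(p)

def lerner_index(price, marginal_cost):
    """(P - MC) / P."""
    return (price - marginal_cost) / price

MARGINAL_COST = 2.0
monopoly_price = 6.0  # maximizes the profit (p - 2) * (10 - p)

lerner = lerner_index(monopoly_price, MARGINAL_COST)
elasticity = price_elasticity(demand, monopoly_price)

# Both numbers should be approximately 2/3.
print(lerner, -1.0 / elasticity)
```

A more elastic demand (a larger absolute value of the elasticity) would give a smaller Lerner index in the same calculation.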


If the price elasticity of demand is large (in absolute value), then the effect of monopoly
is relatively small. Since demand reacts strongly to changes in the price, there is not so
much 'room' for increasing prices and in some sense a monopolist is 'locked in' (almost) the
perfect competition outcome. But if the price elasticity of demand is small, then large
increases in the price are possible without too much effect on demand, and then the monopolist
can 'exploit' the consumers on the market. So, when dealing with monopolies (or more
generally with market power) a competition authority should be well aware of the effect
of monopoly. If the effect is not that serious, then one should not spend a lot of effort
to fight against the monopoly. But if monopoly yields a big welfare loss, then it might
be worthwhile to spend effort to fight against the monopoly.
Other considerations when deciding to fight against a monopoly are, for example,
whether there is free entry and exit on the market (we did not consider that here).

6.5 Exercises

Exercise 1 Consider a regular production function $f : \mathbb{R}_+ \to \mathbb{R}_+$, so there is
$m = 1$ input good.

(a) Argue that, for $m = 1$, $f$ is increasing if and only if $f$ is strictly quasi-concave.

(b) Show that, if $m = 1$, the cost function can be obtained as $C(w, q) = w f^{-1}(q)$.

Exercise 2 Consider a producer with production function $f(x) = 1 + (x - 1)^{1/3}$.

(a) Derive the contingent demand function.

(b) Derive the cost function.

(c) Verify Shephard's lemma.

(d) Take $w = 1$, and show that $MC(q^*) = AC(q^*)$ for $q^*$ satisfying
$AC(q^*) = \min_{q \geq 0} AC(q)$.

Exercise 3 Consider a producer with (Cobb-Douglas) production function
$f(x) = (x_1)^{2/3} (x_2)^{1/3}$.

(a) Derive the contingent demand function.

(b) Derive the cost function.

(c) Verify Shephard's lemma.

Exercise 4 Answer the same questions as Exercise 3 for the production function
$f(x) = \sqrt{x_1} + \sqrt{x_2}$.

Exercise 5 Prove Theorem 33.

Exercise 6 Prove Proposition 34.

Exercise 7 Consider market demand function $D(p) = A p^{\varepsilon}$ with $\varepsilon < -1$, and a
producer with cost function $C(q) = cq$, $c > 0$.

(a) Derive the monopoly supply.

(b) Derive the monopoly price.

(c) Determine the Lerner index. Give an interpretation to $\varepsilon$.

(d) What can you say about the supply if the producer behaves as a perfect competitor?

Exercise 8 Consider a market with market demand $D(p) = 400 - 200p$, on
which there is one producer with cost function $C(q) = \frac{q^2}{200} + 32$.

(a) Give the profit maximization problem of the producer if he/she behaves as a (profit
maximizing) monopolist.

(b) Find the monopoly price and output.

(c) Give the profit maximization problem of the producer if he/she behaves as a (profit
maximizing) perfect competitor.

(d) Find the perfect competition equilibrium price and output. (Note that this is the
only producer on the market.)

Exercise 9 A monopolist can sell its product on two separate markets. The
demand on market 1 is $D_1(p) = 9 - \frac{p_1}{4}$, and demand on market 2 is $D_2(p) = 24 - \frac{p_2}{2}$,
with $p_1$ and $p_2$ the prices the monopolist asks on markets 1 and 2. The cost function
of the monopolist is $C(q) = \frac{q^2}{2}$.

(a) Give the profit maximization problem of the monopolist if he/she can ask different
prices on the two markets.

(b) Find the supplies $q_1^m, q_2^m$ and corresponding monopoly prices $p_1^m, p_2^m$ of the profit
maximization problem of question (a).

(c) Give the profit maximization problem of the monopolist if he/she must ask the
same price on the two markets.

(d) Find the supplies $q_1^m, q_2^m$ and corresponding monopoly price $p^m$ of the profit
maximization problem of question (c).

Exercise 10 Prove Proposition 5.


Part 2

Collective Decision Making


Chapter 7

Preference aggregation

7.1 Introduction
One of the most fundamental problems in economics is how to make a joint decision
for a group of agents who might have conflicting interests. This is a main question in
theory as well as in applications. For example, it might be that the society has to choose
one out of several alternatives (build a swimming pool, or build a library, or enlarge
the army, or decrease taxes...), or has to elect a president, etc. If all agents agree on what
is the best alternative for them, then the choice is easily made.
However, in reality the agents usually have different (conflicting) preferences
over the alternatives. The question becomes what alternative to choose for the society
as a whole. These situations are dealt with by social choice theory, which is one of
the theories of collective decision making. The fact that only one alternative can
be chosen reflects scarcity.

7.2 Social choice situations


We consider a society with a finite set of agents or individuals who can choose among
a finite set of alternatives. The society should come to one collective decision (choice
of one alternative) for the society as a whole, taking into account the preferences of the
individual agents. The preferences of each individual agent are given by a preference
relation.

Definition 36 Given a finite set of alternatives $A = \{a_1, \ldots, a_m\}$ and a finite set
of agents $N = \{1, \ldots, n\}$, a preference profile is a tuple $p = (\succsim_i)_{i \in N}$ with $\succsim_i$ a
preference relation on $A$, for all $i \in N$.


A preference profile describes the preferences of all individual agents, where $\succsim_i$
is the preference relation of agent $i \in N$. So, $a \succsim_i b$ means that agent $i$ considers
alternative $a$ 'at least as good as' alternative $b$.
Putting together a set of agents, each with their own individual preference
relation, describes a social choice situation.¹

Definition 37 A social choice situation is a triple $(N, A, p)$ where

• $N$ is a finite set of agents,

• $A$ is a finite set of alternatives, and

• $p = (\succsim_i)_{i \in N}$ is a preference profile.

The preference relations $\succsim_i$, $i \in N$, are the preference relations that we discussed in
earlier chapters, see for example Chapters 2 and 3. Since different agents in the society
can have different preferences, we now add the subindex $i$ for the preference relation
of agent $i$.
We assume that each $\succsim_i$, $i \in N$, is transitive and complete. (Note that
completeness implies reflexivity.) We refer to these as rational preference relations.

Definition 38 A preference relation $\succsim_i$ of agent $i$ is rational if it is transitive and
complete.

For the definitions of transitive and complete preference relations, see Section
2.2.3.
Sometimes, we require the individual preference relations to be asymmetric,
again see the definition in Section 2.2.3. Similarly for $a \not\succ_i b$ and $a \not\sim_i b$. Or said
differently, but with the same meaning: if agent $i$ considers alternative $a$ at least as
good as alternative $b$, and $a$ and $b$ are different alternatives, then agent $i$ considers $a$
better than $b$.
Since in this and the next chapter/lecture we take the set of agents $N$ as well
as the set of alternatives $A$ as given, we represent a social choice situation $(N, A, p)$
just by its preference profile $p$.
Recall that from each preference relation $\succsim_i$, we can derive

• the strict preference relation $\succ_i$:

  $a \succ_i b$ if and only if [$a \succsim_i b$ and $b \not\succsim_i a$];

• the indifference relation $\sim_i$:

  $a \sim_i b$ if and only if [$a \succsim_i b$ and $b \succsim_i a$].

¹ Social choice situations are also called voting situations.

Question: We know that $\succ_i$ and $\sim_i$ are always transitive. Are they also reflexive?²
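The derivation of the strict preference and indifference relations from a weak preference relation can be sketched as follows. Representing a relation as a Python set of ordered pairs, and the three-alternative example in which $a$ and $b$ are indifferent and both better than $c$, are assumptions made for illustration.

```python
def strict_part(weak):
    """Pairs (a, b) with a at least as good as b, but not b at least as good as a."""
    return {(a, b) for (a, b) in weak if (b, a) not in weak}

def indifference_part(weak):
    """Pairs (a, b) with a at least as good as b and b at least as good as a."""
    return {(a, b) for (a, b) in weak if (b, a) in weak}

def is_reflexive(relation, alternatives):
    """True if every alternative is related to itself."""
    return all((x, x) in relation for x in alternatives)

ALTERNATIVES = ["a", "b", "c"]

# Hypothetical complete and transitive relation: a ~ b, and both are better than c.
weak = {(x, x) for x in ALTERNATIVES} | {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c")}

strict = strict_part(weak)
indifferent = indifference_part(weak)

print(sorted(strict))                           # [('a', 'c'), ('b', 'c')]
print(is_reflexive(strict, ALTERNATIVES))       # False
print(is_reflexive(indifferent, ALTERNATIVES))  # True
```

Because a complete relation contains every pair $(x, x)$, these pairs always end up in the indifference part and never in the strict part.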

A preference relation expresses only pairwise comparisons of alternatives (no
intensity of preferences). So, if I consider alternative $a$ much better than alternative $b$,
and you consider alternative $b$ a bit better than alternative $a$, the only thing that our
preference relations express is that I consider alternative $a$ better than alternative $b$,
and you consider alternative $b$ better than alternative $a$. If our opinions matter equally,
then based on our preference relations no distinction can be made between alternatives
$a$ and $b$.

The social choice situation is one of the classical choice problems in economics,
which is relevant in real life decision making, see for example elections. Two main
questions that are addressed in social choice theory are, given the preferences of the
individual agents:

1. How do/should the agents choose one alternative together for the whole society?
(Social choice function)

2. Is it possible to derive a social preference relation reflecting the preferences of the
society as a whole? (Social welfare function)

The first question we encounter very often in almost any organization or society.
For example, a country has to elect one president from a set of presidential candidates.
Or a city can decide how to spend part of the budget: build a swimming pool,
build a library, invest in sports facilities, decrease city taxes, etc. Or a group of friends
has to decide what to do together, a group of students has to decide
which exercise to discuss in class, etc. The second question, to derive a social preference
relation, is 'more demanding' since then we do not just want to know what is/are the
chosen alternative(s), but we want a full ranking or preference relation over the set of
all alternatives. This can be useful if, for example, there is a bigger budget to spend
and we can split the budget over different projects (education, sports, tax decrease,
etc.).
Both questions are relevant from a normative as well as a descriptive viewpoint.
The first question is dealt with by so-called social choice functions, which assign
to every preference profile a set of alternatives that can be considered as the 'chosen
alternatives' for the society. This set might contain more than one alternative (even all
alternatives) or might be empty. The second question is dealt with by social welfare
functions, which assign to every preference profile a preference relation that can be
considered as the 'social preference relation'.

² Answer: $\succ_i$ is not reflexive, $\sim_i$ is reflexive.

Two viewpoints that have been taken in the literature to address these questions
are:

1. a cooperative viewpoint, where a benevolent dictator tries to do what is 'best' for
society;

2. a strategic viewpoint, where, by voting, agents can strategically manipulate the
voting outcome.

According to the cooperative viewpoint, a benevolent dictator makes a decision
on behalf of the whole society. In doing this, he/she takes into account the preferences
of all agents, and comes to a decision that is 'fair' or 'reasonable'. The strategic
viewpoint is more concerned with what really could be the chosen alternative(s) if the
agents decide among themselves according to some (voting) mechanism. Of course, the
optimal behaviour of the agents depends on the voting mechanism. So, it is important
to know who decides which voting mechanism to use.

7.3 Social choice functions


As mentioned above, in the first type of preference aggregation that we consider, we
want to know which alternative(s) can be considered the 'best' for the society as a
whole, taking into account all individual preferences.

Definition 39 A social choice function $C$ assigns to every preference profile $p$ a
subset of the set of alternatives $A$, i.e.

$$C(p) \subseteq A.$$
[Figure 7.1: Social choice function. A preference profile $p$ is mapped by the social choice function $C$ to the social choice set $C(p)$.]

The set $C(p)$ is called the social choice set associated to preference profile $p$.

Note that a social choice function is essentially a correspondence. For convenience we
speak about social choice functions, as done in the literature.
Social choice functions are also called voting rules. In the sequel, we will often
refer to a social choice function as a rule.

Question: Do you have a suggestion how to choose one alternative if agents have
conflicting preference relations?

Next, we discuss some examples of social choice functions. Most social choice
functions fall into one of the following two categories:

Scoring rules (Borda)

Majoritarian rules (Condorcet)

Scoring rules assign scores (points) to the alternatives in every preference
relation for every individual agent, and the 'winner' in the preference profile is the
alternative that has the highest sum of scores over all individual agents. (To illustrate
with an analogy from sports, you can compare this with a Formula 1 competition, where
every race is an agent and the rankings of the drivers in each race are the preference
relations.)
Majoritarian rules derive from each social choice situation one preference
relation (the social preference relation) and, based on this relation, determine who is
the 'winner'. (Again, to illustrate with a sports analogy, you can compare this with a
soccer competition where every team plays once against each other team, and team $a$
'is at least as good as' team $b$ if $a$ did not lose the match it played against team $b$.)
We first give some examples of scoring rules.

7.3.1 Some examples of scoring rules

The Plurality rule

The plurality rule is one of the most commonly applied voting rules. The plurality rule
chooses from the alternatives by only considering what are the best alternatives for each
agent. For every agent, look at what is/are the best element(s) in his/her preference
relation. (Remember that we assumed the preferences to be rational, i.e. transitive
and complete, and therefore every individual preference relation has a best element.)
For society we choose those alternatives that are best for the highest number of agents.
To define it formally, consider a preference profile $p = (\succsim_i)_{i \in N}$. Then, the
plurality score of alternative $a \in A$ is the number of agents that have alternative $a$
as (one of) their most preferred alternative(s):

$$\mathrm{plur}_a(p) = \#\{i \in N \mid a \succsim_i b \text{ for all } b \in A \setminus \{a\}\},$$

where $\#T$ means the cardinality (that is, the number of elements) of the set $T$. So,
$\#\{i \in N \mid a \succsim_i b \text{ for all } b \in A \setminus \{a\}\}$ is the number of agents that have $a$ as best
element in their preference relation.
The plurality choice set is the set of alternatives that are best for the largest
number of agents, i.e. the alternatives with the highest plurality score:

$$C^{plur}(p) = \{a \in A \mid \mathrm{plur}_a(p) \geq \mathrm{plur}_b(p) \text{ for all } b \in A\}.$$

Definition 40 The plurality social choice function (or plurality rule) is the social
choice function that assigns to every preference profile $p$ the plurality choice set $C^{plur}(p)$.

The alternatives in $C^{plur}(p)$ are called the plurality winners in preference profile $p$.

Example 19 Consider the following preference profile on the set of five agents $N =
\{1, 2, 3, 4, 5\}$ and four alternatives $A = \{a, b, c, d\}$:

Agent 1: $a \succ_1 c \succ_1 b \succ_1 d$
Agent 2: $c \succ_2 a \succ_2 d \succ_2 b$
Agent 3: $d \succ_3 c \succ_3 a \succ_3 b$
Agent 4: $a \succ_4 c \succ_4 b \succ_4 d$
Agent 5: $b \succ_5 d \succ_5 c \succ_5 a$

Since alternative $a$ is the best element for most agents, $a$ is the plurality winner. Using the
plurality score, we find $\mathrm{plur}(p) = (\mathrm{plur}_a(p), \mathrm{plur}_b(p), \mathrm{plur}_c(p), \mathrm{plur}_d(p)) = (2, 1, 1, 1)$,
so $C^{plur}(p) = \{a\}$.
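The plurality computation of Example 19 can be sketched in a few lines. Encoding each preference relation as a list of alternatives from best to worst is an assumption that only works for strict rankings (linear orders), and the function names are hypothetical.

```python
# Example 19 profile; each ranking lists the alternatives from best to worst.
PROFILE = [
    ["a", "c", "b", "d"],  # agent 1
    ["c", "a", "d", "b"],  # agent 2
    ["d", "c", "a", "b"],  # agent 3
    ["a", "c", "b", "d"],  # agent 4
    ["b", "d", "c", "a"],  # agent 5
]
ALTERNATIVES = ["a", "b", "c", "d"]

def plurality_scores(profile, alternatives):
    """Number of agents that rank each alternative first."""
    return {x: sum(1 for ranking in profile if ranking[0] == x) for x in alternatives}

def plurality_winners(profile, alternatives):
    """Alternatives with the highest plurality score."""
    scores = plurality_scores(profile, alternatives)
    top = max(scores.values())
    return {x for x in alternatives if scores[x] == top}

print(plurality_scores(PROFILE, ALTERNATIVES))   # {'a': 2, 'b': 1, 'c': 1, 'd': 1}
print(plurality_winners(PROFILE, ALTERNATIVES))  # {'a'}
```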

This social choice function is widely applied. An advantage is its simplicity. However,
some of its main disadvantages are:
1. It only takes account of the most preferred alternative of every agent and
ignores the rest of the preferences. Consider, for example, the following preference
profile on $N = \{1, 2, 3, 4, 5\}$ and $A = \{a, b, c, d, e\}$:

Agent 1: $a \succ_1 b \succ_1 c \succ_1 d \succ_1 e$
Agent 2: $a \succ_2 b \succ_2 c \succ_2 d \succ_2 e$
Agent 3: $c \succ_3 b \succ_3 d \succ_3 e \succ_3 a$
Agent 4: $d \succ_4 b \succ_4 c \succ_4 e \succ_4 a$
Agent 5: $e \succ_5 b \succ_5 c \succ_5 d \succ_5 a$

The plurality winner is alternative $a$, although it is the worst alternative for three of the
five agents, while alternative $b$ is second best for everyone. Do you agree that $a$ should be
chosen for society?

2. The plurality rule is very sensitive to strategic manipulation. (We come back
to this in Sections 6.4 and 6.5.)

How serious are these disadvantages? That depends on the situation. For example,
are the individual preferences observable or not? It is an essential part of the role
of a mathematical economist to judge what properties of a social choice function are
desirable or undesirable. (Normative)

Antiplurality rule

The antiplurality rule chooses the alternatives by only considering what are the worst
alternatives for each agent. It chooses the alternatives that are worst for the lowest
number of agents. So, we look for every agent at the alternative(s) on the 'last' position,
and choose for society the alternative that is on the last position for the lowest number
of agents. In some sense, we try to make the fewest number of agents dissatisfied.
Consider a preference profile $p = (\succsim_i)_{i \in N}$. The antiplurality score of alternative
$a \in A$ is the number of agents that have $a$ as one of their worst alternatives:

$$\mathrm{antiplur}_a(p) = \#\{i \in N \mid b \succsim_i a \text{ for all } b \in A \setminus \{a\}\}.$$

The antiplurality choice set is the set of alternatives that are worst for the
lowest number of agents:

$$C^{antiplur}(p) = \{a \in A \mid \mathrm{antiplur}_a(p) \leq \mathrm{antiplur}_b(p) \text{ for all } b \in A\}.$$

Definition 41 The antiplurality social choice function (or antiplurality rule) is
the social choice function that assigns to every preference profile $p$ the antiplurality
choice set $C^{antiplur}(p)$.

The alternatives in $C^{antiplur}(p)$ are called the antiplurality winners in preference
profile $p$.

Example 20 Consider the preference profile of Example 19. Since alternative $c$ is the
only alternative that is the worst for no agent, $c$ is the antiplurality winner. Using the
antiplurality score, we find $\mathrm{antiplur}(p) = (\mathrm{antiplur}_a(p), \mathrm{antiplur}_b(p), \mathrm{antiplur}_c(p), \mathrm{antiplur}_d(p))$
$= (1, 2, 0, 2)$, so $C^{antiplur}(p) = \{c\}$.
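The same profile can be used to sketch the antiplurality computation. As before, the list encoding (best to worst) assumes strict rankings, and the function names are hypothetical.

```python
# Example 19 profile; each ranking lists the alternatives from best to worst.
PROFILE = [
    ["a", "c", "b", "d"],  # agent 1
    ["c", "a", "d", "b"],  # agent 2
    ["d", "c", "a", "b"],  # agent 3
    ["a", "c", "b", "d"],  # agent 4
    ["b", "d", "c", "a"],  # agent 5
]
ALTERNATIVES = ["a", "b", "c", "d"]

def antiplurality_scores(profile, alternatives):
    """Number of agents that rank each alternative last."""
    return {x: sum(1 for ranking in profile if ranking[-1] == x) for x in alternatives}

def antiplurality_winners(profile, alternatives):
    """Alternatives that are ranked last by the fewest agents."""
    scores = antiplurality_scores(profile, alternatives)
    lowest = min(scores.values())
    return {x for x in alternatives if scores[x] == lowest}

print(antiplurality_scores(PROFILE, ALTERNATIVES))   # {'a': 1, 'b': 2, 'c': 0, 'd': 2}
print(antiplurality_winners(PROFILE, ALTERNATIVES))  # {'c'}
```

Note that the winners minimize the antiplurality score, whereas the plurality winners maximize the plurality score.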

The antiplurality rule has similar advantages and disadvantages as the plurality
rule.
To overcome the disadvantage of considering only the best or worst alternative
in each preference relation, each agent can assign points to all alternatives, and the
'winner' is the alternative with the highest number of points when summing over all
agents. This is done by the Borda rule.

Borda rule³

First, for every agent we assign a certain number of 'points' to every alternative, so
that we give more points the 'better' the alternative is. Specifically, we assign to every
alternative $a$ the number of other alternatives $b \neq a$ such that $a$ is at least as good as $b$
for this agent. This gives the Borda score of an alternative in an individual preference
relation.
The Borda score of alternative $a \in A$ in preference relation $\succsim_i$, $i \in N$, is

$$\mathrm{borda}_a(\succsim_i) = \#\{b \in A \setminus \{a\} \mid a \succsim_i b\}.$$

Notice that, if there are $m$ alternatives and all preference relations are reflexive,
transitive, complete and asymmetric (these are called linear orders), then every agent
assigns score $m - r$ to the alternative that is on the $r$-th position in his/her preference
relation (where the best alternative is on position 1).
Now, we sum up the Borda scores of every alternative over all agents. The total
Borda score of alternative $a \in A$ in preference profile $p = (\succsim_i)_{i \in N}$ is

$$\mathrm{Borda}_a(p) = \sum_{i \in N} \mathrm{borda}_a(\succsim_i).$$

Notice that the total Borda score is a score of an alternative in a preference
profile, and is obtained as the sum of its Borda scores in the individual preference
relations.
The Borda choice set then is the set of alternatives with the highest total
Borda scores:

$$C^{Borda}(p) = \{a \in A \mid \mathrm{Borda}_a(p) \geq \mathrm{Borda}_b(p) \text{ for all } b \in A\}.$$

Definition 42 The Borda social choice function (or Borda rule) is the social
choice function that assigns to every preference profile $p$ the Borda choice set $C^{Borda}(p)$.

The alternatives in $C^{Borda}(p)$ are called the Borda winners in preference profile $p$.

³ Jean-Charles de Borda (1733 – 1799) was a French mathematician, physicist and political scientist.

Example 21 Consider the preference profile of Example 19. The Borda scores in the
individual preference relations are $\mathrm{borda}(\succsim_1) = (3, 1, 2, 0)$, $\mathrm{borda}(\succsim_2) = (2, 0, 3, 1)$,
$\mathrm{borda}(\succsim_3) = (1, 0, 2, 3)$, $\mathrm{borda}(\succsim_4) = (3, 1, 2, 0)$, $\mathrm{borda}(\succsim_5) = (0, 3, 1, 2)$. So,
the total Borda score is $\mathrm{Borda}(p) = (9, 5, 10, 6)$, and thus the Borda winner is alternative $c$.
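The Borda scores of Example 21 can be reproduced with the following sketch. It assumes strict rankings encoded as lists from best to worst, so that the alternative on position $r$ receives $m - r$ points; the function names are hypothetical.

```python
# Example 19 profile; each ranking lists the alternatives from best to worst.
PROFILE = [
    ["a", "c", "b", "d"],  # agent 1
    ["c", "a", "d", "b"],  # agent 2
    ["d", "c", "a", "b"],  # agent 3
    ["a", "c", "b", "d"],  # agent 4
    ["b", "d", "c", "a"],  # agent 5
]
ALTERNATIVES = ["a", "b", "c", "d"]

def borda_scores(ranking, alternatives):
    """Individual Borda scores: number of alternatives ranked below each alternative."""
    m = len(alternatives)
    return {x: m - 1 - ranking.index(x) for x in alternatives}

def total_borda_scores(profile, alternatives):
    """Sum of the individual Borda scores over all agents."""
    return {x: sum(borda_scores(r, alternatives)[x] for r in profile) for x in alternatives}

def borda_winners(profile, alternatives):
    totals = total_borda_scores(profile, alternatives)
    top = max(totals.values())
    return {x for x in alternatives if totals[x] == top}

print(borda_scores(PROFILE[0], ALTERNATIVES))      # {'a': 3, 'b': 1, 'c': 2, 'd': 0}
print(total_borda_scores(PROFILE, ALTERNATIVES))   # {'a': 9, 'b': 5, 'c': 10, 'd': 6}
print(borda_winners(PROFILE, ALTERNATIVES))        # {'c'}
```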

The Borda rule takes account of the 'full' preference relations. Obviously, an
agent gives the highest score to an alternative that is best, but every alternative gets
more 'points' from this agent the more alternatives are worse than it for this agent.
Summing up over all agents, we get some score for every alternative in the whole society.
The choice for society then is the alternative(s) with the highest total Borda score.

Scoring rules:

The plurality, antiplurality and Borda rules are special cases of scoring rules.
Consider the Borda rule. As we saw, an advantage of this voting rule is that it takes into
account the full preference relation of every agent. But notice that the difference in
score between any two consecutive alternatives in a preference relation is one. So, if
alternatives $a$ and $b$ are in position 1 and 2 for one agent, and in position 81 and 80
for another agent, then they get the same points from these two agents together. Why
should the difference between position 1 and 2 be the same as between position 80 and
81? For example, although the plurality rule is extreme, we might consider giving
some extra 'weight' to the alternative that is best in a preference relation of an agent.
Allowing any difference between two consecutive positions we obtain scoring rules. In
a scoring rule, every agent assigns a certain number of points to the alternative on the
$r$-th position. Similar to the Borda rule, the 'winner' is the alternative with the
highest number of points when summing over all agents.
Let $m = \#A$ be the number of alternatives. Take score numbers $s_1, s_2, \ldots, s_m \in
\mathbb{N}$, with $\mathbb{N} = \{0, 1, 2, \ldots\}$, such that $s_1 \geq s_2 \geq \ldots \geq s_m$. The idea is that $s_k$ is the
number of points that every agent assigns to the alternative on 'position' $k$.
Let $s = (s_1, s_2, \ldots, s_m) \in \mathbb{N}^m$ be the vector of score numbers.
Now, every agent gives to every alternative a number of 'points' that depends on the
position of the alternative in this agent's preference relation. Specifically, it gives to
alternative $a$ the score number $s_k$, where the position $k$ equals $m$ minus the number of
other alternatives $b \neq a$ such that $a$ is at least as good as $b$. So, the best alternative
in a linear order is on position $k = 1$ and the worst on position $k = m$.
The $s$-score of alternative $a \in A$ in preference relation $\succsim_i$, $i \in N$, is

$$\mathrm{Score}^s_a(\succsim_i) = s_k,$$

with $k = m - \#\{b \in A \setminus \{a\} \mid a \succsim_i b\}$.

So, instead of giving $m - 1$ points to the best alternative in (a rational asymmetric)
preference relation $\succsim_i$ (as done in the Borda rule), we can fix how many points
we give to the best alternative in every preference relation. The second best alternative
can get any number of points, but not more than the number of points for the best
alternative, etc. Note that the score numbers are fixed beforehand, do not depend
on the preference relation, and are also the same for every agent.
The total $s$-score of alternative $a \in A$ in preference profile $p$ is obtained by
adding its scores over all agents, and thus is

$$\sigma^s_a(p) = \sum_{i \in N} \mathrm{Score}^s_a(\succsim_i).$$

The $s$-score choice set then is the set of alternatives with highest $s$-scores:

$$C^s(p) = \{a \in A \mid \sigma^s_a(p) \geq \sigma^s_b(p) \text{ for all } b \in A\}.$$

Question: Can you give score vectors showing that the plurality, antiplurality and
Borda rules are scoring rules?⁴

As we see from the plurality, antiplurality and Borda rules, different scoring
rules may lead to different choices. It is a task of a mathematical economist to (i)
make clear the differences between the different scoring rules and their consequences for
voting outcomes, and (ii) advise what rules to apply.
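A general scoring rule can be sketched as a single function that takes the score vector as an argument. The sketch assumes strict rankings encoded as lists from best to worst, so that list index 0 corresponds to position 1; the three score vectors are the standard ones for plurality, antiplurality and Borda with $m = 4$ alternatives, applied to the profile of Example 19.

```python
# Example 19 profile; each ranking lists the alternatives from best to worst.
PROFILE = [
    ["a", "c", "b", "d"],  # agent 1
    ["c", "a", "d", "b"],  # agent 2
    ["d", "c", "a", "b"],  # agent 3
    ["a", "c", "b", "d"],  # agent 4
    ["b", "d", "c", "a"],  # agent 5
]
ALTERNATIVES = ["a", "b", "c", "d"]

def total_scores(profile, alternatives, s):
    """Total s-score: s[0] points for position 1, s[1] for position 2, and so on."""
    totals = {x: 0 for x in alternatives}
    for ranking in profile:
        for index, x in enumerate(ranking):
            totals[x] += s[index]
    return totals

def scoring_winners(profile, alternatives, s):
    """Alternatives with the highest total s-score."""
    totals = total_scores(profile, alternatives, s)
    top = max(totals.values())
    return {x for x in alternatives if totals[x] == top}

m = len(ALTERNATIVES)
PLURALITY = [1] + [0] * (m - 1)               # (1, 0, 0, 0)
ANTIPLURALITY = [1] * (m - 1) + [0]           # (1, 1, 1, 0)
BORDA = [m - k for k in range(1, m + 1)]      # (3, 2, 1, 0)

print(scoring_winners(PROFILE, ALTERNATIVES, PLURALITY))      # {'a'}
print(scoring_winners(PROFILE, ALTERNATIVES, ANTIPLURALITY))  # {'c'}
print(scoring_winners(PROFILE, ALTERNATIVES, BORDA))          # {'c'}
```

The same profile thus gives different winners under different score vectors, which is the point made in the paragraph above.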

7.3.2 Majoritarian social choice functions


Majoritarian social choice functions are based on a very different principle than the
scoring rules. They are based on the majority relation by Nicolas de Condorcet (1743
- 1794), who was a French philosopher, mathematician, and political scientist.
Instead of giving scores to the alternatives in every preference relation in the
profile, we can define one 'social preference relation' $\succsim^p$ from the preference profile
$p = (\succsim_i)_{i \in N}$ as follows. Consider any two alternatives $a$ and $b$. We say that in the
social preference alternative $a$ is 'at least as good as' alternative $b$ if the number of
agents that strictly prefer $a$ to $b$ is greater than or equal to the number of agents that
strictly prefer $b$ to $a$. This relation is called the majority relation. It creates one
preference relation by pairwise comparison of any two alternatives.
Let the number of agents in preference profile $p$ that consider alternative $a$ better
than alternative $b$ be given by

$$n_p(a, b) = \#\{i \in N \mid a \succsim_i b \text{ and } b \not\succsim_i a\} = \#\{i \in N \mid a \succ_i b\}.$$

⁴ Special cases are (i) Plurality: $s_1 = 1$, $s_k = 0$, $k \in \{2, \ldots, m\}$, (ii) Antiplurality: $s_m = 0$, $s_k = 1$,
$k \in \{1, \ldots, m - 1\}$, (iii) Borda: $s_k = m - k$, $k \in \{1, \ldots, m\}$.

Definition 43 The majority relation of preference profile $p = (\succsim_i)_{i \in N}$ is the
preference relation $\succsim^p$ given by

$$a \succsim^p b \iff n_p(a, b) \geq n_p(b, a).$$

Question: Is the majority relation $\succsim^p$ always complete? Is it always transitive?⁵

The Condorcet social choice function (or Condorcet rule) is determined by taking
the set of best elements in the majority relation, i.e. those alternatives that in the
majority relation $\succsim^p$ are at least as good as any other alternative.
The Condorcet choice set is the set of alternatives that are best elements in
the social preference relation $\succsim^p$:

$$C^{Cond}(p) = \{a \in A \mid a \succsim^p b \text{ for all } b \in A \setminus \{a\}\}.$$

Definition 44 The Condorcet social choice function (or Condorcet rule) is the
social choice function that assigns to every preference profile $p$ the Condorcet choice
set $C^{Cond}(p)$.

The alternatives in $C^{Cond}(p)$ are called the Condorcet winners in preference profile
$p$.

Example 22 Consider the preference profile of Example 19. Since $n_p(a, b) = 4$ and
$n_p(b, a) = 1$, we have $a \succsim^p b$. Similarly, $n_p(a, c) = 2$ and $n_p(c, a) = 3$ (agents 2, 3 and 5
rank $c$ above $a$), so $c \succsim^p a$. Continuing in this way we obtain the following majority
relation: $c \succ^p a$, $a \succ^p b$, $a \succ^p d$, $c \succ^p b$, $c \succ^p d$, $b \succ^p d$. Note that $c$ is the unique
best element in $\succsim^p$, and thus $C^{Cond}(p) = \{c\}$. In this example, the Condorcet winner
coincides with the Borda winner, and is different from the plurality winner. But this
need not be the case always.
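The pairwise counts behind the majority relation are easy to get wrong by hand, so a small sketch helps to check them. It assumes strict rankings encoded as lists from best to worst, with hypothetical function names. For the profile as listed in Example 19, agents 2, 3 and 5 rank $c$ above $a$, so the count gives $n_p(c, a) = 3$ against $n_p(a, c) = 2$, and $c$ is at least as good as every other alternative in the majority relation.

```python
# Example 19 profile; each ranking lists the alternatives from best to worst.
PROFILE = [
    ["a", "c", "b", "d"],  # agent 1
    ["c", "a", "d", "b"],  # agent 2
    ["d", "c", "a", "b"],  # agent 3
    ["a", "c", "b", "d"],  # agent 4
    ["b", "d", "c", "a"],  # agent 5
]
ALTERNATIVES = ["a", "b", "c", "d"]

def n(profile, x, y):
    """Number of agents that strictly prefer x to y."""
    return sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))

def weakly_beats(profile, x, y):
    """x is at least as good as y in the majority relation."""
    return n(profile, x, y) >= n(profile, y, x)

def condorcet_winners(profile, alternatives):
    """Best elements of the majority relation (may be empty)."""
    return {
        x for x in alternatives
        if all(weakly_beats(profile, x, y) for y in alternatives if y != x)
    }

print(n(PROFILE, "a", "b"), n(PROFILE, "b", "a"))  # 4 1
print(n(PROFILE, "a", "c"), n(PROFILE, "c", "a"))  # 2 3
print(condorcet_winners(PROFILE, ALTERNATIVES))    # {'c'}
```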

⁵ Answer: It is complete but need not be transitive.

Notice that the majority relation in this example is transitive. But this need not be
the case. For example, the famous 'Condorcet cycle' on $N = \{1, 2, 3\}$ and $A = \{a, b, c\}$
is the preference profile

Agent 1: $a \succ_1 b \succ_1 c$
Agent 2: $b \succ_2 c \succ_2 a$
Agent 3: $c \succ_3 a \succ_3 b$

The corresponding majority relation is $a \succsim^p b$, $b \succsim^p c$ and $c \succsim^p a$. So, $a \succsim^p b$ and
$b \succsim^p c$, but $a \not\succsim^p c$. So, the majority relation is not transitive. Also, in this case there
is no Condorcet winner. So, $C^{Cond}(p)$ might be empty.⁶
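The Condorcet cycle can be checked in the same way. The sketch below, again with strict rankings as lists from best to worst and hypothetical function names, confirms that the majority relation of this profile is not transitive and that the Condorcet choice set is empty.

```python
# Condorcet cycle; each ranking lists the alternatives from best to worst.
CYCLE = [
    ["a", "b", "c"],  # agent 1
    ["b", "c", "a"],  # agent 2
    ["c", "a", "b"],  # agent 3
]
ALTERNATIVES = ["a", "b", "c"]

def n(profile, x, y):
    """Number of agents that strictly prefer x to y."""
    return sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))

def weakly_beats(profile, x, y):
    """x is at least as good as y in the majority relation."""
    return n(profile, x, y) >= n(profile, y, x)

def is_transitive(profile, alternatives):
    """Check transitivity of the majority relation over all triples."""
    return all(
        weakly_beats(profile, x, z)
        for x in alternatives for y in alternatives for z in alternatives
        if weakly_beats(profile, x, y) and weakly_beats(profile, y, z)
    )

def condorcet_winners(profile, alternatives):
    return {
        x for x in alternatives
        if all(weakly_beats(profile, x, y) for y in alternatives if y != x)
    }

print(is_transitive(CYCLE, ALTERNATIVES))      # False
print(condorcet_winners(CYCLE, ALTERNATIVES))  # set()
```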

Notice that, even when the majority relation is not transitive, there might still
exist a Condorcet winner. (Find an example yourself.)
Using power or centrality measures for digraphs, we can derive a complete and
transitive relation from every binary relation. Applying such measures to the majority
relation therefore always gives best elements, and thus a nonempty choice set (see
Chapter 8/Lecture 6).

7.4 Properties of social choice functions


Which social choice function is the 'best'? This question is unanswered, and it is an
important question on our research agenda (already since Condorcet and before, and there
is still a lot of discussion on voting methods, see e.g. thresholds in voting, presidential
elections, etc.). We try to find out which social choice function is desirable by finding
properties of social choice functions. An axiomatization of a social choice function is
a set of properties that characterizes one (unique) social choice function. One task
of a mathematical economist is to come up with desirable properties for social choice
functions that help society to choose which method to use. A society can be a country,
a union of countries, the board of a firm, a department or club electing a chairperson, etc.

Assumption 8 From now on, we assume the social choice function to be single-valued,
i.e. to every social choice situation it assigns a unique element (the choice set is a
singleton).

This is a rather strong assumption. Moreover, it is an assumption on the
'outcome' (what is assigned by the social choice function), and not an assumption on the
preference profile. (We make this assumption for convenience.)

The first property that we discuss is about manipulation of the social choice
by individual agents. Suppose there are three candidates for a presidential election.
Candidate L is left-wing, candidate R is right-wing, and there is a candidate EL who
is extremely left-wing. Suppose that the president is elected by plurality voting, i.e.
every voter votes for one of the three candidates. When your first preference is EL,
then it seems reasonable that you prefer L above R. In a situation where the numbers
of votes for L and R are expected to be very close together, and candidate EL has
almost no chance of winning, it might be better for you to vote for L than
for EL. This is what we also notice in 'real' elections. In presidential elections there is
a shift to centre candidates, so the number of votes obtained by extreme candidates is
usually less than they would have if the election were a reflection of the true preferences
of society. Similarly, in parliamentary elections (such as elections for the Dutch Tweede
Kamer) there is a shift from extreme to centre parties, which have better chances to
become part of the government.
In the context of social choice functions, when an agent can improve the social
outcome by stating a different preference relation than its true preference relation, we
say that the agent has a successful manipulation.

⁶ You can 'force' a choice by applying tie breaking rules, but this might 'destroy' properties.

Definition 45 Consider preference profile $p = (\succsim_h)_{h \in N}$. For agent $i \in N$, the
preference relation $\succsim'_i$, with $\succsim'_i \neq \succsim_i$, is a successful manipulation in preference profile $p$ if
$b \succ_i a$, where $a = C(p)$ and $b = C(p')$, with $p' = (\succsim'_h)_{h \in N}$ such that $\succsim'_j = \succsim_j$ for all
$j \in N \setminus \{i\}$.

In this definition, the only difference between preference profiles $p$ and $p'$ is the
preference relation $\succsim'_i$ of agent $i$. Agent $i$ has a successful manipulation in preference
profile $p$ if by 'misreporting' his/her preferences, reporting $\succsim'_i$ instead of $\succsim_i$ (while the
other agents do not change their preference relations), the social choice is better for
agent $i$.
Now, we say that a social choice function is strategy-proof if misreporting is
never beneficial for any agent.

Property 1 A social choice function is strategy-proof if for every preference profile
there is no agent who has a successful manipulation.

Question: Do you consider strategy proofness a desirable property? What are the
consequences for society if a voting rule is applied that is not strategy proof?

Example 23 Consider the preference profile of Example 19. Suppose that we apply the
Borda rule to make a social choice. We saw in Example 21 that alternative $c$ is the
Borda winner. Now, look at agent 4. This agent prefers alternative $a$, which is its best
alternative, to alternative $c$, which is its second best alternative. If this agent ‘switches’
alternatives $c$ and $d$ in its preference relation, and thus pretends that its preference
relation is $a \succ'_4 d \succ'_4 b \succ'_4 c$, then $borda(\succsim'_4) = (3, 1, 0, 2)$ and the total Borda score
of the profile becomes $Borda(p') = (9, 5, 8, 8)$. Now the Borda winner is alternative $a$,
which is better for agent 4 than alternative $c$. Agent 4 has a successful manipulation:
by reporting something different from the truth, it obtains a social outcome that is better for itself.
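
The search for a successful manipulation under the Borda rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profile below is hypothetical (it is not the profile of Example 19), rankings are strict and listed best-first, ties in the Borda score are broken alphabetically, and the function names `borda_scores`, `borda_winner` and `find_manipulation` are made up for this sketch.

```python
from itertools import permutations

def borda_scores(profile):
    """Total Borda score of each alternative; rankings list alternatives best-first."""
    m = len(profile[0])
    scores = {alt: 0 for alt in profile[0]}
    for ranking in profile:
        for position, alt in enumerate(ranking):
            scores[alt] += m - 1 - position  # best gets m-1 points, worst gets 0
    return scores

def borda_winner(profile):
    """Borda winner; ties are broken alphabetically (an assumption of this sketch)."""
    scores = borda_scores(profile)
    best = max(scores.values())
    return min(alt for alt, score in scores.items() if score == best)

def find_manipulation(profile, agent):
    """Return (misreport, new_winner) if `agent` has a successful manipulation, else None."""
    truth = tuple(profile[agent])
    honest_winner = borda_winner(profile)
    for report in permutations(truth):
        if report == truth:
            continue
        changed = profile[:agent] + [report] + profile[agent + 1:]
        new_winner = borda_winner(changed)
        # Successful if the new winner is ranked strictly higher in the TRUE ranking.
        if truth.index(new_winner) < truth.index(honest_winner):
            return report, new_winner
    return None

# Hypothetical profile: three agents, four alternatives.
profile = [("a", "b", "c", "d"), ("a", "b", "c", "d"), ("b", "a", "c", "d")]
honest = borda_winner(profile)
manipulation = find_manipulation(profile, agent=2)
print(honest, manipulation)
```

In this hypothetical profile the third agent can move its true second-best alternative to the bottom of its reported ranking, which lowers that alternative's total score enough to make the agent's own favourite the Borda winner.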

One of the most famous results in social choice theory states that, if there are
at least three alternatives, then the only social choice functions that satisfy strategy
proofness are dictatorial in the sense that there is always an agent whose unique best
element (if it exists) is always the social choice, irrespective of the preferences of the
other agents.

Property 2 A social choice function $C$ is dictatorial if there is an agent $i \in N$ such
that, for every preference profile $p$,

$a \succ_i b$ for all $b \in A \setminus \{a\}$ $\Rightarrow$ $C(p) = \{a\}$.

Theorem 47 (Gibbard-Satterthwaite Theorem$^{7}$) If $\#A \geq 3$, then every strategy-proof
social choice function is dictatorial.

This result is originally stated as an impossibility result.

Corollary 48 (Gibbard-Satterthwaite Theorem) If $\#A \geq 3$, then there does not exist
a social choice function that is strategy-proof and non-dictatorial.

Note that Theorem 47 and Corollary 48 are two equivalent statements.

Question: What do you conclude from this theorem? How serious is this problem?

This theorem implies that we cannot combine strategy-proofness with nondictatorship.
In other words, we must accept that we either have a voting rule where agents
might benefit from not telling their true preferences, or we have a dictator. It seems
as if we have to choose ‘the least bad’ thing.
7
Gibbard, A. (1973), “Manipulation of voting schemes: A general result”, Econometrica, 41 (4):
587-601.
Satterthwaite, M. A. (1975), “Strategy-proofness and Arrow’s conditions: Existence and correspondence
theorems for voting procedures and social welfare functions”, Journal of Economic Theory, 10,
187-217.

But there is one important thing to realize. We stress that in the Gibbard-Satterthwaite
theorem, all rational (i.e. complete and transitive) preference relations
over $A$ are allowed. If we restrict the domain (i.e. we do not allow all preference
relations), then there might be strategy-proof social choice functions that are not
dictatorial. An example of such a domain is the domain of single-peaked preferences, which we
discuss in the next section.

7.5 Single-peaked preferences


Consider an agent $i$ and a finite set of alternatives that is linearly ordered, i.e. the
alternatives can be labeled from 1 to $|A|$. This means that we assume that $A = \{a_1, a_2, \ldots, a_m\}$
with $a_k \in \mathbb{N}$ such that $a_k < a_{k+1}$ for all $k \in \{1, \ldots, m-1\}$. For example,
$A = \{1, 2, \ldots, m\}$.$^{8}$
This agent has a single-peaked preference relation if there is a unique alternative
$a^*$ that is better than all other alternatives for this agent and, moreover, whether you
go ‘to the right’ (higher labels) or ‘to the left’ (lower labels), on each side the closer an
alternative is to $a^*$ the better it is.

Definition 46 Preference relation $\succsim_i$ on $A$ is single-peaked if

(i) there is an alternative $a^* \in A$ such that $a^* \succ_i b$ for all $b \in A \setminus \{a^*\}$, and

(ii) for all alternatives $a, b \in A$ it holds that:

if $a < b < a^*$ then $b \succ_i a$;

and

if $a > b > a^*$ then $b \succ_i a$.

Question: Is a single-peaked preference relation always complete?$^{9}$

Some examples of single-peaked preferences on $A = \{1, 2, \ldots, 100\}$ are:

$a \succsim_i b$ iff $a \leq b$ (So, $1 \succsim_i 2 \succsim_i 3, \ldots$)

$a \succsim_i b$ iff $a \geq b$ (So, $100 \succsim_i 99 \succsim_i 98, \ldots$)

$a \succsim_i b$ iff $|a - 4| \leq |b - 4|$ (So, $2 \succsim_i 1$; $2 \succsim_i 7$; $3 \succsim_i 2, \ldots$)
8
The results in this section can also be stated if A is uncountable, for example when A = [0, 100],
but we do not consider that here.
9
Answer: No, alternatives ‘to the left of the peak’ need not be comparable with alternatives ‘to
the right of the peak’.

Question: Can you give other examples?
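
The two conditions of single-peakedness can be checked mechanically. The sketch below is an illustration only and assumes that the preference relation is complete and represented by a utility function (so it does not cover incomplete relations); the name `is_single_peaked` is chosen for this sketch.

```python
def is_single_peaked(alternatives, utility):
    """Check single-peakedness of the preference represented by `utility`.

    Assumption of this sketch: a is weakly preferred to b iff
    utility(a) >= utility(b), and the alternatives are numbers.
    """
    alts = sorted(alternatives)
    values = [utility(a) for a in alts]
    top = max(values)
    if values.count(top) != 1:  # condition (i): a unique best alternative
        return False
    k = values.index(top)       # position of the peak
    # Condition (ii): strictly better when moving towards the peak on each side.
    left_ok = all(values[j] < values[j + 1] for j in range(k))
    right_ok = all(values[j] > values[j + 1] for j in range(k, len(alts) - 1))
    return left_ok and right_ok

A = range(1, 101)
print(is_single_peaked(A, lambda a: -a))           # peak at 1
print(is_single_peaked(A, lambda a: a))            # peak at 100
print(is_single_peaked(A, lambda a: -abs(a - 4)))  # peak at 4
print(is_single_peaked(A, lambda a: abs(a - 4)))   # a 'valley' at 4
```

Comparing only neighbouring alternatives suffices here, because the comparison of utilities is transitive.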

It can be shown that, if all agents have a single-peaked preference relation, then
the majority relation $\succsim_p$ is complete and transitive.

Theorem 49 If all preference relations $\succsim_i$, $i \in N$, in preference profile $p$ are
single-peaked, then the majority relation $\succsim_p$ is complete and transitive.

(It is an exercise to prove this in a somewhat more specific case, Exercise 8.)
Since every transitive and complete preference relation has a best element, as a
corollary of this theorem we have that there always exists a Condorcet winner if all
preferences are single-peaked.

Corollary 50 If all preference relations $\succsim_i$, $i \in N$, in preference profile $p$ are
single-peaked, then a Condorcet winner exists.

This is a big advantage of the Condorcet social choice function in the case of
single-peaked preferences. We can even ‘easily’ characterize which alternative is the
Condorcet winner. It is based on the median voter. For convenience, suppose that
there is an odd number of agents. Call the unique best element for agent $i$ in its
preference relation its peak, denoted by $p_i$. Now, put all the peaks in nondecreasing
order. Then the Condorcet winner is the alternative that is the peak of some agent
such that the number of agents that have a lower labeled peak is equal to the number
of agents that have a higher labeled peak, i.e. the Condorcet winner is that alternative
$a \in A$ such that $\#\{i \in N \mid p_i < a\} = \#\{i \in N \mid p_i > a\}$.

Example 24 Consider the set of alternatives $A = \{1, 2, \ldots, 100\}$ and set of agents
$N = \{1, 2, 3, 4, 5, 6, 7\}$. Suppose that all agents have single-peaked preferences with peak
$p_i \in A$ for agent $i \in N$. Suppose that the peaks of the seven agents are $p_1 = 7$, $p_2 = 8$,
$p_3 = 10$, $p_4 = 4$, $p_5 = 17$, $p_6 = 3$ and $p_7 = 2$. Then, $p_7 < p_6 < p_4 < p_1 < p_2 < p_3 < p_5$,
and thus the Condorcet winner is alternative $p_1 = 7$.

We can conclude that for the Condorcet social choice function on the domain of
single-peaked preferences, only the peaks matter: you can find the Condorcet winner
if you only know the peaks.
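
The median-voter characterization is easy to try out numerically. The sketch below takes the middle element of the sorted peaks, which coincides with the condition stated above when all peaks are distinct. The brute-force check additionally assumes symmetric single-peaked preferences (agent $i$ prefers $x$ to $y$ iff $x$ is closer to $p_i$), which is only one hypothetical way to fill in the full preference relations; the function names are made up for this sketch.

```python
def median_peak(peaks):
    """Middle element of the sorted peaks (requires an odd number of agents)."""
    if len(peaks) % 2 == 0:
        raise ValueError("this sketch assumes an odd number of agents")
    return sorted(peaks)[len(peaks) // 2]

def beats_all_by_majority(candidate, peaks, alternatives):
    """Brute-force check that `candidate` wins every pairwise majority vote.

    Assumption of this sketch: agent i strictly prefers x to y
    iff |x - p_i| < |y - p_i| (symmetric single-peaked preferences).
    """
    for other in alternatives:
        if other == candidate:
            continue
        pro = sum(abs(candidate - p) < abs(other - p) for p in peaks)
        con = sum(abs(other - p) < abs(candidate - p) for p in peaks)
        if pro <= con:
            return False
    return True

# Peaks of Example 24.
peaks = [7, 8, 10, 4, 17, 3, 2]
winner = median_peak(peaks)
print(winner)  # 7
print(beats_all_by_majority(winner, peaks, range(1, 101)))  # True
```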

But it looks even better for the Condorcet social choice function. If all
preferences are single-peaked, then no agent has a successful manipulation under the
Condorcet social choice function.

Theorem 51 If all preference relations $\succsim_i$, $i \in N$, in preference profile $p$ are
single-peaked, then the Condorcet social choice function is strategy-proof.

This is not true for the other social choice functions we discussed before (the
scoring methods such as the Borda social choice function, see Exercise 5).

There are other restricted domains that guarantee that the majority relation
is transitive (and thus there exists a Condorcet winner). For example, if there are
only two alternatives then the Condorcet social choice function is strategy-proof. In
fact, it then coincides with many other rules such as the plurality rule. Although this
seems a very restrictive domain, many voting procedures are about two alternatives,
for example voting to accept or reject a proposal, or voting between two presidential
candidates.
Another domain on which the Condorcet social choice function is strategy-proof
is that of intermediate preferences. In this case the agents can be linearly ordered (say from
left to right), such that when two agents have the same pairwise comparison between
two alternatives, then every agent between them has this pairwise comparison.

Definition 47 Let $N = \{1, 2, \ldots, n\}$ be the set of agents. Preference profile
$p = (\succsim_i)_{i \in N}$ has intermediate preferences if for all $i, j, k \in N$ and $a, b \in A$ we have:

$[a \succsim_i b,\ a \succsim_k b$ and $i < j < k] \Rightarrow a \succsim_j b$

So, whereas single-peaked preferences have an ordering on the alternatives,
intermediate preferences have an ordering on the agents.

7.6 Social welfare functions


Consider again a society of individuals. Instead of only making a (social) choice,
i.e. choosing one (or more) alternatives for the full society, we might want to know
the full social preference relation. So, for every pair of alternatives $a$ and $b$ we state
whether the society prefers $a$ to $b$, the other way around, is indifferent between them,
or cannot compare them.
This is another type of preference aggregation than social choice functions.

[Figure 7.2: Social welfare function. The preference profile $p$ is the input of the social welfare function $F$; the output is the social preference relation $F(p) = \succsim_p$.]

Definition 48 A social welfare function $F$ assigns a preference relation to every
social choice situation.

Question: Can you give (the intuition behind) a social welfare function?

The social choice functions discussed before all give rise to a corresponding social
welfare function.

7.6.1 Condorcet social welfare function


Recall that we defined the Condorcet social choice function by using the majority
relation $\succsim_p$. But we can simply consider $\succsim_p$ as a preference relation.

Definition 49 The Condorcet social welfare function $F^{Cond}$ is obtained as the
majority relation of preference profile $p$:

$F^{Cond}(p) = \succsim_p$

with $\succsim_p$ the majority relation.

Notice that $\succsim_p$ need not be transitive, even when all preference relations in the
profile are complete and transitive. However, as we saw in the previous section, $\succsim_p$ is
transitive if all preference relations are single-peaked.
Besides restricting the domain, another way to ‘solve’ the problem of nontransitivity
for every preference profile is to apply power or centrality measures or ranking methods
to turn any preference relation into a complete and transitive relation. We will do this
in Chapter 8.
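
That the majority relation can fail to be transitive is easy to see on a small example. The sketch below is an illustration under stated assumptions: rankings are strict and listed best-first, the three-agent profile is the classic hypothetical ‘cycle’ profile (it is not taken from these notes), and the function names are made up for this sketch.

```python
def majority_relation(profile):
    """Pairs (a, b) such that at least as many agents strictly prefer a to b as b to a."""
    alternatives = profile[0]
    relation = set()
    for a in alternatives:
        for b in alternatives:
            if a == b:
                continue
            pro = sum(r.index(a) < r.index(b) for r in profile)
            con = len(profile) - pro  # rankings are strict, so the rest prefer b
            if pro >= con:
                relation.add((a, b))
    return relation

def is_transitive(relation):
    """Check (a, b) and (b, c) in the relation implies (a, c), for distinct a and c."""
    return all((a, c) in relation
               for (a, b) in relation
               for (b2, c) in relation
               if b == b2 and a != c)

# Hypothetical profile producing a majority cycle.
cycle = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
unanimous = [("a", "b", "c")] * 3

print(sorted(majority_relation(cycle)))
print(is_transitive(majority_relation(cycle)))      # False
print(is_transitive(majority_relation(unanimous)))  # True
```

In the cycle profile every individual ranking is complete and transitive, yet a majority prefers $a$ to $b$, $b$ to $c$ and $c$ to $a$.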

7.6.2 The Borda social welfare function


Recall that the Borda social choice function first determines for every alternative and
for every individual agent the Borda score, and then adding up the Borda scores over
all agents gives the Borda score of the alternatives in the total preference profile.
Obviously, ranking the alternatives according to their total Borda score in the preference
profile always gives a complete and transitive relation.

Definition 50 The Borda social welfare function $F^{Borda}$ is obtained by ordering the
alternatives according to their total Borda score:

$F^{Borda}(p) = \succsim_B$ with $a \succsim_B b \Leftrightarrow Borda_a(p) \geq Borda_b(p)$.

Since the relation $\succsim_B$ is transitive and complete for every preference profile, the
Borda social welfare function is well defined for every preference profile. Obviously this
can be done for any scoring method.

Example 25 Consider the preference profile of Example 19, whose Borda scores are
computed in Example 21. According to these Borda scores we find the following social
preference relation: $c \succ_B a$; $c \succ_B b$; $c \succ_B d$; $a \succ_B b$; $a \succ_B d$; $d \succ_B b$. The social
preference relation according to the Borda social welfare function is always transitive.
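
Ranking alternatives by total Borda score can be sketched as follows. The profiles are hypothetical (the profile of Example 19 is not reproduced here), rankings are strict and listed best-first, and the social preference relation is returned as a list of indifference classes from best to worst; the function names are made up for this sketch.

```python
def borda_scores(profile):
    """Total Borda score of each alternative; rankings list alternatives best-first."""
    m = len(profile[0])
    scores = {alt: 0 for alt in profile[0]}
    for ranking in profile:
        for position, alt in enumerate(ranking):
            scores[alt] += m - 1 - position
    return scores

def borda_ranking(profile):
    """Social preference as indifference classes, highest total Borda score first."""
    tiers = {}
    for alt, score in borda_scores(profile).items():
        tiers.setdefault(score, []).append(alt)
    return [sorted(tiers[score]) for score in sorted(tiers, reverse=True)]

# Hypothetical profiles.
profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
tied = [("a", "b"), ("b", "a")]

print(borda_scores(profile))   # {'a': 5, 'b': 3, 'c': 1}
print(borda_ranking(profile))  # [['a'], ['b'], ['c']]
print(borda_ranking(tied))     # [['a', 'b']]
```

Because the ranking is derived from numbers, completeness and transitivity hold automatically; ties in the score show up as indifference.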

7.6.3 Properties of social welfare functions (Optional)


As for social choice functions, the question is which social welfare function is
‘best’. This, too, is an unanswered question.
Looking for desirable properties, we again obtain an impossibility result. The
first property is independence of irrelevant alternatives. It means that the collective
preference between $a$ and $b$ only depends on pairwise preference comparisons between $a$
and $b$. In other words, if the preference relations in a preference profile change in such a
way that for two alternatives $a$ and $b$, for every individual agent the pairwise comparison
between $a$ and $b$ is the same in both profiles, then the pairwise comparison between
$a$ and $b$ in the social preference relation should not change. So, under independence
of irrelevant alternatives, if for every agent the comparison between two alternatives $a$
and $b$ is the same in preference profile $p$ as in $p'$, then the comparison between $a$ and $b$
is also the same in $F(p)$ and $F(p')$.$^{10}$

Property 3 A social welfare function $F$ satisfies independence of irrelevant
alternatives (IIA) if for all alternatives $a, b \in A$ and preference profiles $p = (\succsim_i)_{i \in N}$
and $p' = (\succsim'_i)_{i \in N}$ such that for every $i \in N$

$a \succsim_i b \Leftrightarrow a \succsim'_i b$, it holds that $a \succsim b \Leftrightarrow a \succsim' b$

where $F(p) = \succsim$ and $F(p') = \succsim'$.

A second property is based on the famous Pareto principle. If all agents have
the same strict pairwise comparison between two alternatives, then the same pairwise
comparison should appear in the social preference relation.

Property 4 A social welfare function $F$ is Pareto efficient if for all preference
profiles $p$, and alternatives $a, b \in A$, it holds that

$a \succ_i b$ for all $i \in N$ $\Rightarrow$ $a \succ b$

with $F(p) = \succsim$.

In some part of the literature it is assumed by definition that a social welfare
function always assigns a transitive preference relation. We did not assume that, so
we need to impose it as a separate property, called transitivity.

Property 5 A social welfare function $F$ is transitive if for every preference profile
$p$, the preference relation $F(p)$ is transitive.

As for social choice functions, if there are at least three alternatives, there
is an impossibility in the sense that any social welfare function that satisfies IIA, Pareto
efficiency and transitivity is dictatorial. Dictatorship in the context of social welfare
functions means that there is an agent such that if he/she strictly prefers alternative
$a$ to alternative $b$, then also in the social preference relation alternative $a$ is strictly
better than alternative $b$.
10
Be aware of the difference with IIA as defined in Lecture 1 for individual preferences.

Property 6 A social welfare function $F$ is dictatorial if there is an $i \in N$ such that
for every preference profile $p$ and every $a, b \in A$, it holds that

$a \succ_i b \Rightarrow a \succ b$

with $F(p) = \succsim$.

Theorem 52 (Arrow’s impossibility theorem)$^{11}$ If social welfare function $F$ on $A$, with
$\#A \geq 3$, is Pareto efficient, transitive and satisfies IIA, then $F$ must be dictatorial.

Question: What do you conclude from this theorem? How serious is this problem?

There do exist non-dictatorial social welfare functions that satisfy Pareto
efficiency, transitivity and IIA on restricted domains. For example, if preferences are
single-peaked, then the Condorcet social welfare function satisfies IIA, Pareto efficiency
and transitivity.

7.7 Concluding remarks


We have discussed social choice functions and social welfare functions that deal with
how to ‘aggregate’ preferences of individual agents in a society. A social choice function
just describes what alternative(s) is (are) the most preferred by the society as a whole.
A social welfare function assigns a full social preference relation that can be seen as
the preference relation of the society as a whole. Related issues are, e.g. power in
parliament, seat distribution, coalition formation, agenda setting, etc.$^{12}$ But also when
working in teams in a firm, there can be disagreement about a plan or strategy to follow
as a team. Social choice theory can help in making joint decisions also in productive
firms.
Although social choice and welfare functions are widely applied (just think about
any presidential election, or electing a chairperson in a society), we have seen that it
is not obvious at all what is a ‘good’ way to aggregate preferences in a society.
11
Arrow, K.J. (1950), “A Difficulty in the Concept of Social Welfare”, Journal of Political Economy,
58, 328-346.
12
The website http://www.presidency.ucsb.edu/showelection.php provides an overview of all US
presidential elections of various years. As you can see, the ‘winner’ of the election, who is the next
president, is the candidate who has the most ‘electors’. The ‘electors’ are elected per state, and
the elections in the states are independent of each other. Considering the election in one state, this
is done by plurality voting. Therefore, we do not know the full preference relations of all voters. If
you make an assumption about what the preference profiles look like, you can ‘play around’ a bit with
different social choice functions, seeing who would be elected according to different rules.

Besides questions about what is a ‘fair’ way to aggregate preferences, questions
concerning the computation of choices have recently gained attention,
developed in the rather new field of computational social choice.$^{13}$ An interesting
question is how serious it is that a social choice function is not strategy-proof when it
is complex to compute a strategic manipulation.
Another issue is raised by Dowding and van Hees (2008)$^{14}$, who argue that
manipulation might be a virtue from a democratic perspective.

7.8 Exercises
Exercise 1 Consider the following preference profile on the set of five agents $N =
\{1, 2, 3, 4, 5\}$ and four alternatives $A = \{a, b, c, d\}$:

Agent 1: $a \succ_1 c \succ_1 d \succ_1 b$

Agent 2: $a \succ_2 c \succ_2 b \succ_2 d$

Agent 3: $b \succ_3 c \succ_3 d \succ_3 a$

Agent 4: $d \succ_4 c \succ_4 b \succ_4 a$

Agent 5: $a \succ_5 c \succ_5 b \succ_5 d$

(a) What alternative(s) is (are) the plurality winner(s)?

(b) What alternative(s) is (are) the antiplurality winner(s)?

(c) What alternative(s) is (are) the Borda winner(s)?

(d) Give the majority relation $\succsim_p$.

(e) Is there a Condorcet winner? If yes, what alternative(s) is (are) the Condorcet
winner(s)?
13
See for example Chevaleyre, Y., Endriss, U., Lang, J., and N. Maudet (2007), “A Short Introduction to
Computational Social Choice”, in: Proceedings of the 33rd Conference on Current Trends in Theory
and Practice of Computer Science (SOFSEM-2007) (van Leeuwen, J., Italiano, G.F., van der Hoek,
W., Meinel, C., Sack, H., and F. Plásil (Eds.)), LNCS, volume 4362, pages 51-69, Springer-Verlag.
14
Dowding, K., and M. van Hees (2008), “In Praise of Manipulation”, British Journal of Political
Science, 38, 1-15.

(f) Find a scoring rule that yields alternative a as the unique winner.

(g) Show that there is no scoring rule that gives alternative d as the unique winner.

(h) Give the collective preference relations according to the Condorcet and Borda
social welfare functions. Are these collective preference relations transitive?

Exercise 2 Consider the following preference profile on the set of five agents $N =
\{1, 2, 3, 4, 5\}$ and four alternatives $A = \{a, b, c, d\}$:

Agent 1: $c \succ_1 a \succ_1 b \succ_1 d$

Agent 2: $a \succ_2 b \succ_2 d \succ_2 c$

Agent 3: $d \succ_3 c \succ_3 b \succ_3 a$

Agent 4: $b \succ_4 d \succ_4 c \succ_4 a$

Agent 5: $d \succ_5 c \succ_5 b \succ_5 a$

Answer the same questions as in Exercise 1 (except that in question (f) you find a scoring rule
that yields alternative $b$ as the unique winner, and in question (g) you show that there is no
scoring rule that yields alternative $a$ as the unique winner).

Exercise 3 Give a preference profile where the Borda rule, the plurality rule and the
antiplurality rule give different choice sets.

Exercise 4 Consider the set of alternatives $A = \{1, 2, \ldots, 15\}$ and set of agents $N =
\{1, 2, 3, 4, 5\}$. Suppose that all agents have single-peaked preferences with peak $p_i \in A$
for agent $i \in N$.
Find the Condorcet winner if $p_1 = 3$, $p_2 = 14$, $p_3 = 5$, $p_4 = 8$ and $p_5 = 10$.

Exercise 5 Consider the following preference profile on the set of three agents $N =
\{1, 2, 3\}$ and five alternatives $A = \{1, 2, 3, 4, 5\}$:

Agent 1: $2 \succ 3 \succ 1 \succ 4 \succ 5$

Agent 2: $3 \succ 4 \succ 5 \succ 2 \succ 1$

Agent 3: $1 \succ 2 \succ 3 \succ 4 \succ 5$

(a) Verify that all three agents have single-peaked preferences. What are the peaks?

(b) Find the Condorcet winner.

(c) Find the Borda winner.

(d) Find an agent with a strategic manipulation in the Borda rule.

Exercise 6 Show that the Borda social welfare function does not satisfy IIA.

Exercise 7 Give a social choice function that is strategy proof.

Concepts, Theory and Proofs

Exercise 8 Consider a finite set of alternatives $A = \{1, 2, \ldots, \#A\}$ with $\#A$ odd, and
set of agents $N$ (with $\#N$ odd). Suppose that all agents have single-peaked preferences
with peak $p_i \in A$ for agent $i \in N$.

(a) Prove that the Condorcet winner is that alternative $a \in A$ such that
$\#\{i \in N \mid p_i < a\} = \#\{i \in N \mid p_i > a\}$.

(b) Prove that on the class of preference profiles with only single-peaked preferences,
the Condorcet rule is strategy-proof.
Chapter 8

Ranking methods

8.1 Introduction
In the previous chapter we discussed preference aggregation in collective choice. We
considered two types of preference aggregation. A social choice function assigns a
social choice to every preference profile, while a social welfare function assigns a social
preference relation to every preference profile.
We discussed social choice and welfare functions based on the Borda score and
related scores (plurality, antiplurality etc.), and we discussed the Condorcet social
choice and welfare functions, which are based on the majority relation $\succsim_p$. We saw
that using Borda (scoring) methods we can assign to every preference profile a unique
(social) preference relation by ordering the alternatives by their (Borda) score. The
Condorcet social choice function uses the majority relation, which by itself can be seen
as a social preference relation. However, it need not be transitive, not even when all
individual preference relations are linear orders (transitive, complete and asymmetric).
In the majority relation $\succsim_p$, an alternative $a$ is weakly preferred to (is at least
as good as) alternative $b$ if and only if the number of agents that strictly prefer $a$ to
$b$ (consider $a$ better than $b$) is at least as high as the number of agents that strictly
prefer $b$ to $a$. Although it seems rather ‘natural’ to consider this majority relation, we
saw that one disadvantage is that it usually is not transitive, and therefore the social
choice can be empty and the social welfare function assigns a nontransitive relation (which
in some part of the literature is not even considered to be a social welfare function).
We can ‘correct’ this by applying ranking methods for binary relations.
In this lecture we apply score functions for directed graphs (digraphs), which
assign a real number to every node in a digraph, to define social choice functions and
social welfare functions by simply ranking the alternatives according to their score.
These score functions can be used to define score methods which rank any
objects in a digraph, such as alternatives in a preference relation (which is our main
application in this course), but also teams in a sports competition (based on
the results of the matches) or web pages on the internet (based on their links).

8.2 Directed graphs


Although we will mainly apply score functions and ranking methods to preference
profiles in collective choice situations, in order to stress the general use of these ranking
methods, we discuss them more generally for digraphs.$^{1}$
Let $A$ be a fixed finite set of alternatives.

Definition 51 A directed graph or digraph on set of alternatives $A$ is a collection
of ordered pairs (also called a binary relation) $D \subseteq A \times A$, where $(a, b) \in D$ can be
interpreted as ‘$a$ defeats $b$’.

For example, if $\succsim$ is a preference relation on the set of alternatives $A$ then, for
every $a, b \in A$, we can write:

$(a, b) \in D \Leftrightarrow a \succsim b$.

Usually, in graph theory the set A is called a set of nodes or vertices. But since
we will mainly apply this to collective choice situations, we refer to the set A as a set of
alternatives (but notice that this also can be a set of ‘teams in a sports competition’,
‘web pages on the internet’, ‘players in a game’, etc.)

Assumption 9 We assume the digraph $D$ to be irreflexive, i.e. $(a, a) \notin D$ for all
$a \in A$.

Since in this lecture we take the set of alternatives $A$ fixed, we represent a
digraph $(A, D)$ just by its binary relation, and speak about digraph $D$. We denote by
$\mathcal{D}^A$ the collection of all (irreflexive) digraphs on $A$.

Some applications of digraphs are the following:

1. Collective or Social choice: For two alternatives $a, b \in A$, $(a, b) \in D$ means that
$a$ is weakly preferred to (at least as good as) $b$.

2. Sports competition: For two teams $a, b \in A$, $(a, b) \in D$ means that team $a$ has
won the match it played against team $b$.

3. Web page ranking: For two web pages $a, b \in A$, $(a, b) \in D$ means that
there is a link from webpage $a$ to webpage $b$.
1
The material of Sections 8.2-8.4 can also be found in Chapter 5 (Sections 5.1 and 5.2 (excluding 5.2.3.2),
and the corresponding parts of 5.4) of R.P. Gilles (2010), The Cooperative Game Theory of Networks and
Hierarchies, Springer Verlag, Berlin Heidelberg.

8.3 Score functions


In the applications mentioned above, we usually want to ‘rank’ the alternatives, finding
out which alternative is ‘best’, which one is ‘second best’, and so on. For example, in a
preference relation we want to know which alternative is the best element (which, as we
saw before in this course, is not obvious when the preference relation is not transitive).
In a sports competition, where each team plays against each other team once, we want
to know which team is ‘best’, which is second, third etc. Typically, there is not a team
that wins all its matches, so usually it is not obvious which team is ‘the best’.
We can evaluate the positions by a score function, which is a function that assigns
numbers to alternatives in digraphs.

Definition 52 A score function on a set of alternatives $A$ is a function
$\sigma: \mathcal{D}^A \to \mathbb{R}^A$.

A score function is a function that assigns to every digraph $D$ an $|A|$-dimensional
vector where $\sigma_a(D)$ is a measure of the ‘power’ or ‘strength’ of alternative $a \in A$ in
digraph $D$. For preference relations it can be a measure of ‘desirability’. Depending on
the interpretation, in the literature these measures are also referred to as e.g. power
measures, centrality measures or influence measures.
For digraph $D$ on $A$ and alternative $a \in A$, the alternatives in the set

$Succ_a(D) = \{b \in A \mid (a, b) \in D\}$

are called the successors of $a$ in $D$. These are the alternatives that ‘are defeated’ by $a$.
The alternatives in the set

$Pred_a(D) = \{b \in A \mid (b, a) \in D\}$

are called the predecessors of $a$ in $D$. These are the alternatives that ‘defeat’ $a$. Next, we
discuss several examples of score functions. The first score function is the outdegree,
which simply assigns to every alternative $a$ in a digraph the number of alternatives
that are defeated by $a$ (i.e. to which $a$ is weakly preferred).

Definition 53 The outdegree of alternative $a$ in digraph $D$ is the number of
successors of $a$ in $D$:

$out_a(D) = \#Succ_a(D)$.

A disadvantage of the outdegree is that it does not take account of the ‘strength’
of the alternatives it defeats. It just assigns to every alternative the number of alter-
natives it defeats.

Example 26 Consider the digraph $D = \{(a, b), (a, c), (b, c), (b, d), (c, d), (d, a)\}$ on $A =
\{a, b, c, d\}$. Since, for example, $Succ_a(D) = \{b, c\}$, the outdegree of $a$ is $out_a(D) =
\#Succ_a(D) = 2$. The outdegree of all alternatives in this digraph is given by $out(D) =
(out_a(D), out_b(D), out_c(D), out_d(D)) = (2, 2, 1, 1)$. If we rank the alternatives according
to the outdegree then we have two ‘winners’: $a$ and $b$. The other two alternatives
$c$ and $d$ have the same score. So, if $\succsim^{out}$ is a ranking of the alternatives by outdegree,
then we have $a \sim^{out} b \succ^{out} c \sim^{out} d$.
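
The outdegree computation of Example 26 can be reproduced with a short Python sketch. The representation of a digraph as a set of ordered pairs follows Definition 51; the function names `successors` and `outdegree` are chosen for this illustration.

```python
def successors(digraph, a):
    """Succ_a(D): the alternatives b with (a, b) in D."""
    return {b for (x, b) in digraph if x == a}

def outdegree(digraph, alternatives):
    """out_a(D) = #Succ_a(D) for every alternative a."""
    return {a: len(successors(digraph, a)) for a in alternatives}

# Digraph of Example 26.
A = ["a", "b", "c", "d"]
D = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "a")}

out = outdegree(D, A)
winners = [a for a in A if out[a] == max(out.values())]
print(out)      # {'a': 2, 'b': 2, 'c': 1, 'd': 1}
print(winners)  # ['a', 'b']
```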

We can take account of the strength of defeated alternatives by letting the
alternatives that defeat an alternative in some sense ‘equally share power’ over that
alternative. With this we mean the following. Instead of giving 1 ‘point’ for every
alternative that is defeated by alternative $a$, alternative $a$ gets $\frac{1}{\#Pred_b(D)}$ for every
alternative $b$ that it defeats. So, the more other alternatives defeat successor $b$, the fewer
‘points’ $a$ scores against successor $b$. This yields the following score function.

Definition 54 The $\beta$-score of alternative $a \in A$ in digraph $D$ on $A$ is given by

$\beta_a(D) = \sum_{b \in Succ_a(D)} \frac{1}{\#Pred_b(D)}$

Example 27 Consider the digraph $D$ of Example 26. We saw that $a$ has two
successors, $b$ and $c$. Alternative $b$ has $a$ as its only predecessor, while alternative $c$
has one other predecessor ($b$) besides $a$. Therefore, the $\beta$-score of $a$ is $\beta_a(D) =
\frac{1}{1} + \frac{1}{2} = \frac{3}{2}$. The $\beta$-score of all alternatives in this digraph is given by $\beta(D) =
(\beta_a(D), \beta_b(D), \beta_c(D), \beta_d(D)) = (\frac{1}{1} + \frac{1}{2}, \frac{1}{2} + \frac{1}{2}, \frac{1}{2}, \frac{1}{1}) = (\frac{3}{2}, 1, \frac{1}{2}, 1)$. If we rank the
alternatives according to the $\beta$-score then we have the ranking $a \succ^{\beta} b \sim^{\beta} d \succ^{\beta} c$.
So, now we see that $a$ is the unique winner although it has the same outdegree as $b$.
Alternative $a$ is ranked higher than alternative $b$ since, besides both defeating $c$, $a$ beats
the ‘strong’ alternative $b$, while $b$ beats the ‘weak’ alternative $d$. Alternative
$d$ is ranked higher than alternative $c$ because $d$ beats the ‘strong’ alternative $a$
while $c$ beats the ‘weak’ alternative $d$. It is even the case that alternatives $b$ and $d$
are equally ranked although $b$ has two successors, while $d$ has only one successor. This is
because $d$ defeats the ‘strong’ alternative $a$.
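
The $\beta$-scores of Example 27 can be recomputed with exact fractions. The sketch below again represents a digraph as a set of ordered pairs; the function names are chosen for this illustration and `fractions.Fraction` is used only to avoid rounding.

```python
from fractions import Fraction

def successors(digraph, a):
    """Succ_a(D): the alternatives defeated by a."""
    return {b for (x, b) in digraph if x == a}

def predecessors(digraph, a):
    """Pred_a(D): the alternatives that defeat a."""
    return {b for (b, x) in digraph if x == a}

def beta_score(digraph, alternatives):
    """beta_a(D): sum over successors b of a of 1 / #Pred_b(D)."""
    return {
        a: sum((Fraction(1, len(predecessors(digraph, b)))
                for b in successors(digraph, a)), Fraction(0))
        for a in alternatives
    }

# Digraph of Example 26.
A = ["a", "b", "c", "d"]
D = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "a")}

beta = beta_score(D, A)
for a in A:
    print(a, beta[a])  # a 3/2, b 1, c 1/2, d 1
```

Every successor $b$ of $a$ has at least one predecessor (namely $a$ itself), so the division is always well defined.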

The $\beta$-score takes account of the ‘strength’ of alternatives that are defeated in
the sense that you get fewer points if you defeat an alternative that is defeated by more
other alternatives. But why, in the argument of Example 27, is alternative $d$ a
‘weak’ alternative and alternative $a$ a ‘strong’ alternative? After all, alternative $d$ beats
alternative $a$. So, it is not so obvious how to define what are ‘weak’ and ‘strong’
alternatives in a digraph. We will come back to this in the next section.

As we saw in the examples, the outdegree and $\beta$-score can give different outcomes
and rankings of the alternatives. The question is which is the better one to rank the nodes
in a digraph (or alternatives in a preference relation, teams in a sports competition,
web pages on the internet, etc.). A usual way to compare score functions is to find
comparable axiomatizations. As in the previous chapter, an axiomatization
for score functions is a set of properties that uniquely determine one score function.
So, it is a set of properties that is satisfied by a score function and, moreover, this
score function is the only one that satisfies all these properties. We speak about
comparable axiomatizations of two score functions if they differ in only one property.
Then this property makes the difference between these two score functions. This can
be very helpful in choosing which score function to use, and therefore it is
an important method in Decision Support Systems.
It turns out that the outdegree and $\beta$-score can be axiomatized so that they
only differ in one axiom. More specifically, they differ in an axiom which states the total
number of ‘points’ that is allocated over the alternatives in order to determine their
score.

We first discuss three properties that the outdegree and $\beta$-score have in common.
First, the dummy property requires that an alternative that has no successors gets a
zero score. It implies that an alternative needs to defeat at least one other alternative
to have a positive score.

Property 7 A score function $\sigma$ satisfies the dummy property if for every digraph $D$
on $A$ and alternative $a \in A$ with $Succ_a(D) = \emptyset$, it holds that $\sigma_a(D) = 0$.

Second, symmetry requires that alternatives that have the same set of successors
as well as predecessors get the same score.

Property 8 A score function $\sigma$ satisfies symmetry if for every digraph $D$ on $A$ and
alternatives $a, b \in A$ such that $Succ_a(D) = Succ_b(D)$ and $Pred_a(D) = Pred_b(D)$, it
holds that $\sigma_a(D) = \sigma_b(D)$.

The third axiom is a decomposition property. To define it, we need the following
concepts. A partition of digraph $D$ on set $A$ is a collection of digraphs
$P = \{D_1, D_2, \ldots, D_m\}$ such that

1. $\bigcup_{k=1}^{m} D_k = D$, and

2. $D_k \cap D_l = \emptyset$ for all $k, l \in \{1, \ldots, m\}$, $k \neq l$.

A partition $P$ of digraph $D$ is independent if every alternative is defeated in
at most one subdigraph in the partition, i.e. if

$\#\{D_k \in P \mid Pred_a(D_k) \neq \emptyset\} \leq 1$ for all $a \in A$.

Note that for $A = \{a_1, \ldots, a_m\}$, the partition $P = \{D_1, \ldots, D_m\}$ with $D_k =
\{(b, a) \in D \mid a = a_k\}$ is an independent partition.

Example 28 The partition $\{D_a, D_b, D_c, D_d\}$ given by $D_a = \{(d, a)\}$, $D_b = \{(a, b)\}$, $D_c =
\{(a, c), (b, c)\}$ and $D_d = \{(b, d), (c, d)\}$ is an independent partition of digraph $D$ given
in Example 26.

Property 9 A score function $\sigma$ satisfies additivity over independent partitions
if for every digraph $D$ on $A$ and independent partition $P$ of $D$, it holds that

\[ \sigma(D) = \sum_{D_k \in P} \sigma(D_k). \]

The outdegree and $\beta$-score both satisfy the three axioms above.

Proposition 53 The outdegree and $\beta$-score satisfy the dummy property, symmetry
and additivity over independent partitions.

(The proof is left as an exercise.)

Question: Can you comment on these axioms? Do you consider them desirable?

What then is the difference between the outdegree and $\beta$-score? These two scores
satisfy a different normalization. The outdegree satisfies score normalization, meaning
that the total number of points to be allocated over all alternatives is the number of
'pairs' ('matches') in $D$. The $\beta$-score satisfies dominance normalization, requiring that
the total number of points to be allocated is the number of 'dominated alternatives'
(or 'defeated teams').

Property 10 (i) A score function $\sigma$ satisfies score normalization if for every
digraph $D$ on $A$, it holds that

\[ \sum_{a \in A} \sigma_a(D) = \#D. \]

(ii) A score function $\sigma$ satisfies dominance normalization if for every digraph
$D$ on $A$, it holds that

\[ \sum_{a \in A} \sigma_a(D) = \#\{b \in A \mid \mathrm{Pred}_b(D) \neq \emptyset\}. \]
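The two normalizations can be illustrated numerically. The sketch below assumes that the $\beta$-score defined earlier in the chapter is $\beta_a(D) = \sum_{b \in \mathrm{Succ}_a(D)} 1/\#\mathrm{Pred}_b(D)$ (this formula reproduces the values reported in Example 29) and uses the digraph whose partition is listed in Example 28. The function names are hypothetical.

```python
from fractions import Fraction

# Arcs (x, y) mean "x defeats y" (digraph from Example 28).
D = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")}
A = ["a", "b", "c", "d"]


def successors(D, x):
    return {y for (u, y) in D if u == x}


def predecessors(D, x):
    return {u for (u, y) in D if y == x}


def outdegree(D, A):
    # One point per defeated alternative.
    return {x: len(successors(D, x)) for x in A}


def beta_score(D, A):
    # Assumed definition: the single point of each defeated alternative y
    # is shared equally among the predecessors of y.
    return {
        x: sum((Fraction(1, len(predecessors(D, y))) for y in successors(D, x)), Fraction(0))
        for x in A
    }


out = outdegree(D, A)
beta = beta_score(D, A)
dominated = [y for y in A if predecessors(D, y)]

total_out = sum(out.values())    # score normalization: equals the number of arcs
total_beta = sum(beta.values())  # dominance normalization: equals the number of dominated alternatives
```

In this digraph every alternative is defeated at least once, so the $\beta$-scores add up to 4, while the outdegrees add up to the 6 arcs.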

Theorem 54 (i) The outdegree is the unique score function that satisfies the dummy
property, symmetry, additivity over independent partitions and score normalization.
(ii) The $\beta$-score is the unique score function that satisfies the dummy property,
symmetry, additivity over independent partitions and dominance normalization.

From this theorem we conclude that the difference between the outdegree and
$\beta$-score is only in the normalization, i.e. the total number of points that is allocated
over the alternatives. Although the two normalization properties only state something
about the total number of points that will be allocated, and nothing about how the
points are allocated over the individual alternatives, together with the other axioms
each normalization determines an allocation. So, just deciding how many 'points' to
allocate is not as innocent as it seems in ranking alternatives (sport teams, web pages,
etc.). It can lead to different rankings, even a different 'winner'.
Notice that the above theorem can be phrased equivalently as follows:

Theorem 55 (i) A score function satisfies the dummy property, symmetry,
additivity over independent partitions and score normalization if and only if it is the
outdegree.
(ii) A score function satisfies the dummy property, symmetry, additivity over
independent partitions and dominance normalization if and only if it is the $\beta$-score.

Question: Can you comment on these normalizations?

As we have seen, a disadvantage of the outdegree is that it does not take account
of the 'strength' of the alternatives it defeats. The $\beta$-score takes this into account.
However, a disadvantage of the $\beta$-score is that alternatives can get higher in the ranking
by 'losing' instead of 'winning', as illustrated by the following example.

Example 29 Consider the digraph $D = \{(a,b), (a,c), (a,d), (b,c), (b,d), (b,e), (b,f),
(b,g), (c,d), (c,e), (c,f), (c,g), (d,e), (d,f), (d,g), (e,f), (e,g), (f,g), (e,a), (f,a), (g,a)\}$
on $A = \{a, b, c, d, e, f, g\}$. The $\beta$-scores are given by

\[ \beta(D) = (\beta_a(D), \beta_b(D), \beta_c(D), \beta_d(D), \beta_e(D), \beta_f(D), \beta_g(D)) = \tfrac{1}{60}(110, 97, 67, 47, 47, 32, 20). \]

Now, let $D'$ be the digraph where we replace the arc $(b,g)$ by $(g,b)$, i.e.
$D' = (D \setminus \{(b,g)\}) \cup \{(g,b)\}$. The resulting $\beta$-scores are

\[ \beta(D') = (\beta_a(D'), \beta_b(D'), \beta_c(D'), \beta_d(D'), \beta_e(D'), \beta_f(D'), \beta_g(D')) = \tfrac{1}{60}(80, 85, 70, 50, 50, 35, 50). \]

Thus, by being defeated by alternative $g$ instead of defeating it, alternative $b$ becomes
the highest ranked alternative. This is clarified by the following reasoning. Since
alternative $g$ is defeated by all alternatives except alternative $a$, by defeating alternative
$g$, alternative $b$ gets $\tfrac{1}{5}$ of the power over $g$. So, alternative $b$ 'loses' $\tfrac{1}{5}$ if it is
defeated by alternative $g$ instead of defeating it (its score drops from $\tfrac{97}{60}$ to $\tfrac{85}{60}$).
If alternative $b$ defeats alternative $g$ then alternative $a$ is the only alternative that
defeats $b$, and thus gets full power over $b$. If alternative $b$ is defeated by alternative $g$
then alternative $a$ gets only $\tfrac{1}{2}$ of the power over $b$ (its score drops from $\tfrac{110}{60}$ to
$\tfrac{80}{60}$). Thus, alternative $a$ 'loses' more than alternative $b$ if $b$ is defeated by $g$. In the
example this difference is enough to make alternative $b$ the highest ranked alternative.
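The numbers in Example 29 can be recomputed with a few lines of code. This is a sketch under the assumption that the $\beta$-score is $\beta_a(D) = \sum_{b \in \mathrm{Succ}_a(D)} 1/\#\mathrm{Pred}_b(D)$; the helper name `beta_score` is hypothetical.

```python
from fractions import Fraction


def beta_score(D, A):
    # Assumed beta-score: each defeated alternative y hands out one point,
    # shared equally among all alternatives that defeat y.
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    return {
        x: sum((Fraction(1, pred_count[y]) for (u, y) in D if u == x), Fraction(0))
        for x in A
    }


A = list("abcdefg")
arcs = "ab ac ad bc bd be bf bg cd ce cf cg de df dg ef eg fg ea fa ga".split()
D = {(s[0], s[1]) for s in arcs}

# D': the arc (b, g) is reversed, so b now loses against g.
D_prime = (D - {("b", "g")}) | {("g", "b")}

beta = beta_score(D, A)
beta_prime = beta_score(D_prime, A)

scores_times_60 = [beta[x] * 60 for x in A]
scores_prime_times_60 = [beta_prime[x] * 60 for x in A]
winner = max(A, key=lambda x: beta[x])
winner_prime = max(A, key=lambda x: beta_prime[x])
```

Under this assumption the winner changes from `a` in $D$ to `b` in $D'$, which is the non-monotonic behaviour the example describes.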

This disadvantage of the $\beta$-score can be 'repaired' by letting every alternative
also share in the power over itself, yielding the following score function.

Definition 55 The reflexive $\beta$-score of alternative $a \in A$ in digraph $D$ on $A$ is given
by

\[ \beta^{\mathrm{refl}}_a(D) = \sum_{b \in \mathrm{Succ}_a(D) \cup \{a\}} \frac{1}{\#\mathrm{Pred}_b(D) + 1}. \]

The reflexive $\beta$-score can also be obtained by taking the $\beta$-score of the reflexive
digraph that results from adding all loops to $D$.$^2$ The allocation
of points over alternatives according to the reflexive $\beta$-score is similar to the
allocation according to the $\beta$-score; the only difference is that now each alternative
also shares in the point over itself.

Example 30 Consider the digraph $D$ of Example 26, whose $\beta$-score is determined in
Example 27. Considering alternative $b$, now also taking into account that it shares power
over itself, its reflexive $\beta$-score is $\beta^{\mathrm{refl}}_b(D) = \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{3} = \tfrac{7}{6}$, where it gets $\tfrac{1}{2}$ over
itself (shared with $a$), $\tfrac{1}{3}$ over $c$ (shared with $a$ and $c$), and $\tfrac{1}{3}$ over $d$ (shared with $c$
and $d$). The reflexive $\beta$-score of all alternatives in this digraph is given by

\[ \beta^{\mathrm{refl}}(D) = (\beta^{\mathrm{refl}}_a(D), \beta^{\mathrm{refl}}_b(D), \beta^{\mathrm{refl}}_c(D), \beta^{\mathrm{refl}}_d(D))
 = (\tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{3},\ \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{3},\ \tfrac{1}{3} + \tfrac{1}{3},\ \tfrac{1}{3} + \tfrac{1}{2})
 = (\tfrac{4}{3}, \tfrac{7}{6}, \tfrac{2}{3}, \tfrac{5}{6}). \]

So, if we rank the alternatives according to the reflexive $\beta$-score then we have a linear
order $a \succ^{\mathrm{refl}} b \succ^{\mathrm{refl}} d \succ^{\mathrm{refl}} c$. As in the ranking by $\beta$-score, alternative $a$ is
the unique winner. But now alternative $b$ is strictly better than alternative $d$, although
they were equally ranked according to the $\beta$-score.
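The computation in Example 30 follows Definition 55 directly. A minimal sketch, assuming the digraph of Example 26 is the one whose arcs are listed in Example 28 (the function name `reflexive_beta` is hypothetical):

```python
from fractions import Fraction


def reflexive_beta(D, A):
    # Definition 55: sum 1 / (#Pred_b(D) + 1) over b in Succ_a(D) united with {a}.
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    scores = {}
    for x in A:
        targets = {v for (u, v) in D if u == x} | {x}
        scores[x] = sum(Fraction(1, pred_count[y] + 1) for y in targets)
    return scores


D = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")}
A = ["a", "b", "c", "d"]

refl = reflexive_beta(D, A)
ranking = sorted(A, key=lambda x: refl[x], reverse=True)
```

The resulting `ranking` lists the alternatives from highest to lowest reflexive $\beta$-score.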

The reflexive $\beta$-score satisfies the property that if alternative $a$ wins one more
match instead of losing it, it will never do worse in pairwise comparison with the other
alternatives.

Proposition 56 Consider digraphs $D, D' \in \mathcal{D}^A$ and alternatives $a, b \in A$. If

(i) $\beta^{\mathrm{refl}}_a(D) \geq \beta^{\mathrm{refl}}_b(D)$,

(ii) $(a, d) \in D$ implies that $(a, d) \in D'$ for all $d \in A \setminus \{a\}$, and

(iii) $(c, d) \in D$ if and only if $(c, d) \in D'$ when $c \neq a$,

then $\beta^{\mathrm{refl}}_a(D') \geq \beta^{\mathrm{refl}}_b(D')$.

The reason why $a$ is not doing worse compared to the other alternatives if it
wins a match instead of losing it, is that now $a$ also shares in the power over itself. So,
if a predecessor of $a$ loses points because $a$ loses the match against another alternative,
then $a$ itself suffers the same loss. Even more, $a$ also loses the points that it would earn
if it wins the match instead of losing it.

$^2$ Digraph $D$ on $A$ is reflexive if $(a, a) \in D$ for all $a \in A$.

8.4 Eigenvector scores


The $\beta$- and reflexive $\beta$-score of an alternative $a$ depend on the number of predecessors
of its successors. You get fewer points if you defeat an alternative that is defeated by
more other alternatives. In this way, we take account of the 'strength' of alternative $b$
that is weakly defeated by $a$, in determining the score of alternative $a$. But as we saw
before, it is not so obvious how to define what are 'weak' and 'strong' alternatives in
a digraph. A first step is that the score of successor $b$ should appear in the score of its
predecessor $a$.
Consider the following iterative procedure. Start with the reflexive $\beta$-score $\beta^{\mathrm{refl}}$,
and let's call it $\beta^1$:

\[ \beta^1(D) = \beta^{\mathrm{refl}}(D) \quad \text{for all } D \in \mathcal{D}^A. \]

Now, we can define a second-order score function where we follow the same
procedure as before, but now we allocate the reflexive $\beta$-score of every alternative equally over
itself and all its predecessors. In this way we obtain some kind of second-order $\beta$-score.
The 2nd-order $\beta$-score of alternative $a \in A$ in $D$ is given by

\[ \beta^2_a(D) = \sum_{b \in \mathrm{Succ}_a(D) \cup \{a\}} \frac{\beta^{\mathrm{refl}}_b(D)}{\#\mathrm{Pred}_b(D) + 1}. \]

Continuing in this way, we can iteratively define higher-order $\beta$-score functions,
each time allocating the $\beta$-score of an alternative in the previous step equally over the
alternative and all its predecessors.
For $t \in \{2, 3, \ldots\}$, the $t$th-order $\beta$-score $\beta^t_a$ of alternative $a \in A$ in digraph $D$
on $A$ is iteratively given as

\[ \beta^t_a(D) = \sum_{b \in \mathrm{Succ}_a(D) \cup \{a\}} \frac{\beta^{t-1}_b(D)}{\#\mathrm{Pred}_b(D) + 1}. \]

This iterative process can be seen as a Markov process.

Proposition 57 For every digraph $D$ on set of alternatives $A$, the limit

\[ \lim_{t \to \infty} \beta^t(D) \]

exists and is unique.
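The iterative procedure can be sketched in code as repeated application of one redistribution step. The sketch below uses floating-point numbers and a fixed number of steps (both are choices made here for illustration, not part of the notes) and the digraph listed in Example 28. Starting from the all-ones vector, one step gives the reflexive $\beta$-score, and $t$ steps give the $t$th-order score.

```python
def redistribution_step(D, A, score):
    # Each alternative y divides its current score equally over itself
    # and all its predecessors; x collects its shares.
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    new_score = {}
    for x in A:
        targets = {v for (u, v) in D if u == x} | {x}
        new_score[x] = sum(score[y] / (pred_count[y] + 1) for y in targets)
    return new_score


def higher_order_score(D, A, t):
    score = {x: 1.0 for x in A}  # starting point: every alternative holds one point
    for _ in range(t):
        score = redistribution_step(D, A, score)
    return score


D = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")}
A = ["a", "b", "c", "d"]

first_order = higher_order_score(D, A, 1)   # reflexive beta-score
approx_limit = higher_order_score(D, A, 200)  # numerical approximation of the limit
```

For this digraph the iterates settle quickly; the total number of points stays equal to $\#A$ in every step because each alternative only redistributes what it holds.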



Since the limit of the above iterative procedure to define higher-order $\beta$-scores
exists, we can define this limit as a new score function for digraphs.

Definition 56 The $\lambda$-score function of digraph $D$ on $A$ is given by

\[ \lambda(D) = \lim_{t \to \infty} \beta^t(D) \quad \text{for all } D \in \mathcal{D}^A. \]

This limit has another interesting property that follows when writing the
iterative process described before in matrix form. The transition matrix of a digraph is a
matrix with zeros and ones where the $ab$-th element is one if and only if in the digraph
$a$ weakly defeats $b$.

Definition 57 Let $A$ be a set of alternatives. The transition matrix $T^D$ of digraph $D$
on $A$ is the $\#A \times \#A$ matrix with entries given by

\[ T^D_{ab} = \begin{cases} 1 & \text{if } (a, b) \in D \text{ or } a = b \\ 0 & \text{otherwise.} \end{cases} \]

The associated normalized transition matrix is obtained by normalizing such
that the columns all add up to one. This is a so-called stochastic matrix where the
entries are given as follows.

Definition 58 Let $A$ be a set of alternatives. The normalized transition matrix $\bar{T}^D$
of digraph $D$ on $A$ is the $\#A \times \#A$ matrix with entries given by

\[ \bar{T}^D_{ab} = \begin{cases} \dfrac{1}{\#\mathrm{Pred}_b(D) + 1} & \text{if } (a, b) \in D \text{ or } a = b \\ 0 & \text{otherwise.} \end{cases} \]

Question: What are the entries in the column corresponding to alternative $a$ in matrix
$\bar{T}^D$? What are the entries in the row corresponding to alternative $a$?

Now, we can write the reflexive $\beta$-score alternatively as the normalized transition matrix
multiplied by the all-ones vector $1_A \in \mathbb{R}^m$ (with $m = \#A$), i.e. the vector
$1_A$ given by $(1_A)_a = 1$ for all $a \in A$. For every digraph $D$ on $A$, we have

\[ \beta^{\mathrm{refl}}(D) = \bar{T}^D 1_A. \]
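The matrix formulation can be checked on the digraph listed in Example 28. The sketch below builds the normalized transition matrix of Definition 58 as a list of rows of exact fractions (the representation and the name `normalized_transition_matrix` are choices made here for illustration) and multiplies it by the all-ones vector.

```python
from fractions import Fraction


def normalized_transition_matrix(D, A):
    # Entry (a, b) is 1 / (#Pred_b(D) + 1) if a defeats b or a == b, else 0.
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    return [
        [Fraction(1, pred_count[b] + 1) if ((a, b) in D or a == b) else Fraction(0) for b in A]
        for a in A
    ]


D = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")}
A = ["a", "b", "c", "d"]
T = normalized_transition_matrix(D, A)

n = len(A)
column_sums = [sum(T[i][j] for i in range(n)) for j in range(n)]
# Multiplying by the all-ones vector amounts to taking row sums.
refl_from_matrix = [sum(row) for row in T]
```

Every column sums to one, and the row sums reproduce the reflexive $\beta$-scores of Example 30.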

It follows from the Perron-Frobenius Theorem that the matrix $\bar{T}^D$ has an
eigenvalue 1. The corresponding eigenvector can be seen as a score function. A vector as in
the following proposition is an eigenvector of $\bar{T}^D$ corresponding to eigenvalue 1.

Proposition 58 Let $D$ be a digraph on $A$. Then the matrix $\bar{T}^D$ has eigenvalue 1, i.e.
there is a nonzero vector $x \in \mathbb{R}^m$ such that

\[ \bar{T}^D x = x. \]

Question: How would you use this proposition to define a score function?

Written differently, for such an eigenvector $x$, it holds that

\[ x_a = \sum_{b \in \mathrm{Succ}_a(D) \cup \{a\}} \frac{x_b}{\#\mathrm{Pred}_b(D) + 1} \quad \text{for every alternative } a \in A. \]

So, if we measure the 'strength' of alternatives in a digraph by such an eigenvector
then the score of an alternative $a$ is obtained by giving it an (equal) share in the 'true'
scores of its successors $b$ (and itself). In this sense, the score can be seen as really
measuring the 'strength' of the alternatives in the digraph.

Question: How do you interpret such a vector $x$?

Unfortunately, the eigenvector need not be unique. (It is unique up to
normalization if the digraph has only one top cycle, see later in Section 8.5, but if there is
more than one top cycle it is not determined how the scores are allocated among the
different top cycles.)
However, the limit $\lambda$ of the iterative procedure (see Definition 56) is one of the
eigenvectors corresponding to eigenvalue 1.
So, we can single out one of the eigenvectors (corresponding to eigenvalue 1) by the
iterative procedure giving limit $\lambda$, see Proposition 59.

Proposition 59 The $\lambda$-score function satisfies the property that

\[ \lambda_a(D) = \sum_{b \in \mathrm{Succ}_a(D) \cup \{a\}} \frac{\lambda_b(D)}{\#\mathrm{Pred}_b(D) + 1} \quad \text{for all } D \in \mathcal{D}^A \text{ and } a \in A. \]

In this way, the limit score $\lambda$ can also be seen as some kind of fixed point.
In the literature on ranking and centrality measures there exist other methods
that are based on limits and eigenvectors, such as the famous Google PageRank method
to rank web pages.

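Example 31 Consider the digraph of Example 26. With rows and columns ordered as
$a, b, c, d$, its transition matrix is the matrix

\[ T^D = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix} \]

and its normalized transition matrix is

\[ \bar{T}^D = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{3} & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{3} \\ 0 & 0 & \tfrac{1}{3} & \tfrac{1}{3} \\ \tfrac{1}{2} & 0 & 0 & \tfrac{1}{3} \end{pmatrix}. \]

We solve $\bar{T}^D \lambda = \lambda$, which is equivalent to $(\bar{T}^D - I)\lambda = 0$ with

\[ I = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

and thus

\[ \bar{T}^D - I = \begin{pmatrix} -\tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{3} & 0 \\ 0 & -\tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{3} \\ 0 & 0 & -\tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{1}{2} & 0 & 0 & -\tfrac{2}{3} \end{pmatrix}, \]

yielding the unique (up to normalization) eigenvector (and thus limit of the
iterative procedure) $\lambda(D) = \tfrac{4}{23}(8, 6, 3, 6)$. So, alternative $a$ is ranked first, $b$ and $d$
are tied, and $c$ is last. This is the same ranking as with the $\beta$-score; the reflexive
$\beta$-score ranks $b$ strictly above $d$.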
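The eigenvector reported in Example 31 can be verified exactly by checking the fixed-point equations of Proposition 59 with rational arithmetic. This is a minimal sketch; the helper name `apply_step` is hypothetical.

```python
from fractions import Fraction


def apply_step(D, A, score):
    # One application of the normalized transition matrix, written as the sum
    # over successors and the alternative itself.
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    return {
        x: sum(score[y] / (pred_count[y] + 1) for y in ({v for (u, v) in D if u == x} | {x}))
        for x in A
    }


D = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")}
A = ["a", "b", "c", "d"]

# Candidate from Example 31: (4/23) * (8, 6, 3, 6).
lam = {x: Fraction(4, 23) * w for x, w in zip(A, (8, 6, 3, 6))}

is_fixed_point = apply_step(D, A, lam) == lam
total = sum(lam.values())
```

The factor $\tfrac{4}{23}$ makes the scores add up to $\#A = 4$, the same total as the reflexive $\beta$-scores from which the iteration starts.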

Question (for those who are interested): Take a sports competition and try to compute
the outdegree, $\beta$-, reflexive $\beta$- and $\lambda$-scores. See if there are differences in the rankings,
and try to find out where these differences come from.

8.5 Application to Collective Choice


Our main goal in this section is to apply the ranking methods discussed before to
define majoritarian social choice and social welfare functions. Consider a finite set of
alternatives $A$, and a finite set of agents $N$.

8.5.1 Majoritarian social choice functions based on score functions
A social choice function is called a majoritarian social choice function when it is based
on the majority relation $\succsim^p$. Throughout this section we make the following assumption
on the individual preference relations in a preference profile.

Assumption 10 All individual preference relations $\succsim_i$ are complete and transitive.

As we saw before, even when all individual preference relations are complete
and transitive, the corresponding majority relation $\succsim^p$ need not be transitive.

Question: Do you recall a preference profile where all preference relations are complete
and transitive, but the majority relation is not transitive?$^3$

To stress that the score functions and ranking methods discussed here can be
applied to many fields outside collective choice, we defined them for digraphs. But note
that a digraph and a preference relation are equivalent ways to represent preferences.

Definition 59 For every preference profile $p = (\succsim_i)_{i \in N}$, we define the majority
digraph $D^p$ by

\[ (a, b) \in D^p \text{ if and only if } a \succsim^p b. \]
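As an illustration, a majority digraph can be built from a list of individual rankings. The majority relation itself is defined in Chapter 7, so the sketch below rests on an assumption: every agent reports a strict ranking (most preferred first), and an arc $(a, b)$ is drawn when strictly more agents rank $a$ above $b$ than $b$ above $a$. With an odd number of agents and strict rankings no pairwise ties occur, so the treatment of ties does not matter for the example shown. The function name `majority_digraph` is hypothetical.

```python
from itertools import permutations


def majority_digraph(profile):
    """profile: list of rankings, each a list of alternatives, most preferred first."""
    alternatives = profile[0]
    D = set()
    for x, y in permutations(alternatives, 2):
        prefer_x = sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))
        prefer_y = len(profile) - prefer_x
        if prefer_x > prefer_y:  # assumed rule: strict pairwise majority
            D.add((x, y))
    return D


# A three-agent profile producing a Condorcet cycle.
condorcet_profile = [
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "a", "b"],
]
D_p = majority_digraph(condorcet_profile)
```

Here `D_p` is the cycle in which `a` beats `b`, `b` beats `c`, and `c` beats `a`, so the majority relation is not transitive even though every individual ranking is.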

Given a score function $\sigma$, the corresponding social choice function $C^{\sigma}$ is defined
in a straightforward way as the social choice function that chooses the alternatives with
the highest scores according to $\sigma$:

\[ C^{\sigma}(p) = \{a \in A \mid \sigma_a(D^p) \geq \sigma_b(D^p) \text{ for all } b \in A\}. \]

Definition 60 Given score function $\sigma$, the $\sigma$-social choice function or $\sigma$-rule is the
social choice function that assigns to every preference profile $p$ the choice set $C^{\sigma}(p)$.
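Once the scores of the majority digraph are known, the choice set is simply the set of maximizers. A minimal sketch, with scores passed as a dictionary (the function name `choice_set` is hypothetical); the two inputs are the reflexive $\beta$-scores of Example 30 and a score vector with a tie at the top.

```python
from fractions import Fraction


def choice_set(scores):
    # All alternatives whose score is at least as high as every other score.
    best = max(scores.values())
    return {x for x, s in scores.items() if s == best}


refl_example_30 = {
    "a": Fraction(4, 3),
    "b": Fraction(7, 6),
    "c": Fraction(2, 3),
    "d": Fraction(5, 6),
}
tied_scores = {"a": 2, "b": 2, "c": 1}

chosen = choice_set(refl_example_30)
chosen_with_tie = choice_set(tied_scores)
```

A choice set can contain several alternatives, which is why the rule returns a set rather than a single winner.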

Now, we can apply any of the score functions we discussed before in this chapter
to define a corresponding social choice function. We will focus on the reflexive $\beta$- and
$\lambda$-score functions.

$^3$ Answer: See the Condorcet cycle mentioned in Chapter 7.

Definition 61 The (reflexive) $\beta$-social choice function or (reflexive) $\beta$-rule is
the majoritarian social choice function $C^{\beta^{\mathrm{refl}}}$ given by

\[ C^{\beta^{\mathrm{refl}}}(p) = \{a \in A \mid \beta^{\mathrm{refl}}_a(D^p) \geq \beta^{\mathrm{refl}}_b(D^p) \text{ for all } b \in A\}. \]

The $\lambda$-social choice function or $\lambda$-rule is the majoritarian social choice
function $C^{\lambda}$ given by

\[ C^{\lambda}(p) = \{a \in A \mid \lambda_a(D^p) \geq \lambda_b(D^p) \text{ for all } b \in A\}. \]

8.5.2 Top cycle and Uncovered set


Comment: This subsection has been slightly revised in 2019.

In this subsection we discuss two other social choice functions that are not based on
score functions.
First, we define the transitive closure of a digraph $D$ as the digraph where there
is an arc $(a, b)$ between alternatives $a$ and $b$ if and only if there is a (directed) path
from $a$ to $b$, i.e. there is a sequence of alternatives that starts with $a$, ends with $b$, and
for each two consecutive alternatives the first defeats the second.

Definition 62 The transitive closure of digraph $D$ is the digraph $tr(D)$ given by
$(a, b) \in tr(D)$ if and only if there exists a sequence of alternatives $a_1, \ldots, a_t \in A$ such
that

$a_1 = a$,

$(a_k, a_{k+1}) \in D$ for all $k \in \{1, \ldots, t-1\}$,

$a_t = b$.

So, the transitive closure of a digraph representing a preference relation contains
all indirect preference comparisons.
Now, we can define the first new social choice function based on the top cycle.
A set of alternatives is a top cycle in digraph $D$ if (i) for every two alternatives $a, b$
in the top cycle, there is a directed path from one to the other ('internal stability'),
and (ii) for every alternative $a$ outside the top cycle and alternative $b$ in the top cycle
there is no directed path from $a$ to $b$ ('external stability'). So, all alternatives inside a
top cycle (indirectly) defeat each other, while no alternative outside a top cycle defeats
an alternative inside that top cycle (neither directly nor indirectly).

Definition 63 A subset of alternatives $T \subseteq A$ is a Top cycle in $D$ if

(i) $a, b \in T$, $a \neq b \Rightarrow (a, b) \in tr(D)$, and

(ii) $a \notin T$, $b \in T \Rightarrow (a, b) \notin tr(D)$.

For digraph $D$, we define the Top set $TOP(D)$ as the union of all Top cycles in $D$.
For preference profile $p$, we define the Top set $TOP(p)$ as the union of all Top
cycles in $D^p$.

A digraph can have more than one top cycle. Therefore, we define the top set
as the union of all top cycles.

Example 32 The digraph of Example 26 has only one top cycle, being the set of all
alternatives $\{a, b, c, d\}$.
Now, consider the digraph $D = \{(a,e), (b,e), (b,c), (c,d), (d,b)\}$ on $A = \{a, b, c, d, e\}$.
This digraph has two top cycles: $\{a\}$ and $\{b, c, d\}$. The top set is $\{a, b, c, d\}$.
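Top cycles can be found by first computing the transitive closure and then looking for groups of mutually reachable alternatives that no outside alternative can reach. The sketch below is one possible way to do this, using a Warshall-style closure; it treats a path as having at least one arc, and it reports only nonempty sets. The function names are hypothetical.

```python
def transitive_closure(D, A):
    # Warshall-style closure: (i, j) is added whenever j is reachable from i.
    R = set(D)
    for k in A:
        for i in A:
            for j in A:
                if (i, k) in R and (k, j) in R:
                    R.add((i, j))
    return R


def top_cycles(D, A):
    R = transitive_closure(D, A)
    cycles, seen = [], set()
    for x in A:
        if x in seen:
            continue
        # Alternatives that reach x and are reached by x (internal stability).
        group = {x} | {y for y in A if (x, y) in R and (y, x) in R}
        seen |= group
        # External stability: nobody outside the group reaches into it.
        if not any((z, y) in R for z in A if z not in group for y in group):
            cycles.append(group)
    return cycles


# Second digraph of Example 32.
D = {("a", "e"), ("b", "e"), ("b", "c"), ("c", "d"), ("d", "b")}
A = ["a", "b", "c", "d", "e"]

cycles = top_cycles(D, A)
top_set = set().union(*cycles)
```

For this digraph the sketch returns the two top cycles named in Example 32, and their union is the top set.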

Question: Do you think every complete and asymmetric digraph can be the
majority digraph of some preference profile?$^4$
We remark that, if the majority relation $\succsim^p$ is a complete and asymmetric
relation on $A$, then $D^p$ has exactly one Top cycle.

A second majoritarian social choice function that is not based on score functions
is the uncovered set. We say that alternative $b$ is covered by alternative $a$ in digraph
$D$ if $a$ 'weakly defeats' $b$ in the digraph, and every alternative that is 'weakly defeated'
by $b$ is also 'weakly defeated' by $a$. You can see this as some kind of weak domination
property of $a$ against $b$. Then the uncovered set is the set of alternatives that are not
covered in the digraph.

Definition 64 An alternative $b$ is covered by alternative $a$ in digraph $D$ if

(i) $(a, b) \in D$, and

(ii) $(b, c) \in D \Rightarrow (a, c) \in D$ for all $c \in A$.

The uncovered set $UNC(D)$ is the set of alternatives that are not covered by
some other alternative in $D$.

$^4$ Answer: See Exercise 8.

The uncovered set $UNC(p)$ equals the uncovered set $UNC(D^p)$ of the
corresponding majority digraph $D^p$.
Note that here we defined the Top set and the Uncovered set for a preference
profile, using its majority graph, but they can be defined for any digraph.

Example 33 The digraph of Example 26 has one covered alternative: $c$ is covered by $b$.
Alternative $b$ is not covered by $a$, since $(b, d) \in D$ but $(a, d) \notin D$. So, the Uncovered
set is $\{a, b, d\}$.
The digraph of Example 32 has only one alternative that is covered: $e$ is covered
by $a$ and $b$. So, the Uncovered set is $\{a, b, c, d\}$.
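Definition 64 translates almost literally into code: $b$ is covered by $a$ when $a$ defeats $b$ and the successor set of $b$ is contained in the successor set of $a$. A minimal sketch, applied to the second digraph of Example 32 (the function name `uncovered_set` is hypothetical):

```python
def uncovered_set(D, A):
    succ = {x: {v for (u, v) in D if u == x} for x in A}

    def covers(a, b):
        # (i) a defeats b, and (ii) everything defeated by b is also defeated by a.
        return (a, b) in D and succ[b] <= succ[a]

    return {b for b in A if not any(covers(a, b) for a in A if a != b)}


D = {("a", "e"), ("b", "e"), ("b", "c"), ("c", "d"), ("d", "b")}
A = ["a", "b", "c", "d", "e"]

unc = uncovered_set(D, A)
```

Alternative `e` defeats nobody, so condition (ii) holds trivially for each of its predecessors, and `e` is the only covered alternative.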

It turns out that $C^{\beta^{\mathrm{refl}}}$ is a refinement of the uncovered set for every social choice
situation in the sense that all alternatives with the highest reflexive $\beta$-score belong to
the uncovered set. This is not the case for the top cycle. But if the majority relation
is complete and asymmetric then the reflexive $\beta$-social choice function is also a refinement
of the top cycle.

Theorem 60 Consider the preference profile $p$. Then

(i) $C^{\beta^{\mathrm{refl}}}(p) \subseteq UNC(p)$.

(ii) if $\succsim^p$ is complete and asymmetric, then $C^{\beta^{\mathrm{refl}}}(p) \subseteq TOP(p)$.

It turns out that $C^{\lambda}$ is a refinement of the uncovered set and the Top cycle for
every social choice problem.

Theorem 61 For every preference profile $p$ it holds that $C^{\lambda}(p) \subseteq TOP(p)$, and
$C^{\lambda}(p) \subseteq UNC(p)$.

From the above two theorems $C^{\lambda}$ seems 'better' than $C^{\beta^{\mathrm{refl}}}$ since it is a
refinement of the top cycle for every social choice situation, while $C^{\beta^{\mathrm{refl}}}$ is a refinement only
for the special class of complete, asymmetric preference profiles.
However, in the next subsection we discuss an advantage of $C^{\beta^{\mathrm{refl}}}$ over $C^{\lambda}$.

Example 34 Consider the digraph $D = \{(a,b), (b,c), (b,d)\}$ on $A = \{a, b, c, d\}$. Then
$\beta(D) = (1, 2, 0, 0)$, and thus $C^{\beta}(p) = \{b\}$, but the unique top cycle in $D$ is $\{a\}$. The
Uncovered set is $\{a, b\}$.

8.5.3 Two properties of social choice functions


De…nition 56 is corrected in 2019. The explanation in words was already
correct.

We first consider Pareto optimality for social choice functions. It states that
if all agents consider alternative $b$ at least as good as alternative $a$, and at least one
agent considers $b$ better than $a$, then $a$ cannot be in the choice set. (Be aware that in
Chapter 7 we discussed Pareto optimality of social welfare functions.)

Definition 65 A social choice function $C$ satisfies Pareto optimality if for every
preference profile $p = (\succsim_i)_{i \in N}$ and alternatives $a, b \in A$ such that

$b \succsim_i a$ for all $i \in N$, and

there is an $i \in N$ such that $b \succ_i a$,

it holds that $a \notin C(p)$.


Both $C^{\beta^{\mathrm{refl}}}$ and $C^{\lambda}$ are Pareto optimal majoritarian social choice functions.
Next, we introduce a property that makes a difference between them. Suppose that
alternative $a$ is in the choice set, and the individual preference relations change only
by $a$ getting 'more preferred' in the sense that the only change is that $a$ 'moves up' in
some preference relations. Then we want that $a$ is still in the social choice set.

Definition 66 A social choice function $C$ satisfies monotonicity if for every two
preference profiles $p = (\succsim_i)_{i \in N}$ and $p' = (\succsim'_i)_{i \in N}$, and alternative $a \in A$ such that for every
$i \in N$

$a \succsim_i b \Rightarrow a \succsim'_i b$ for all $b \in A \setminus \{a\}$, and

$b \succsim_i c \Leftrightarrow b \succsim'_i c$ for all $b, c \in A \setminus \{a\}$, $b \neq c$,

it holds that $a \in C(p)$ implies that $a \in C(p')$.


It turns out that $C^{\beta^{\mathrm{refl}}}$ satisfies monotonicity, but $C^{\lambda}$ does not.

Theorem 62 The social choice function $C^{\beta^{\mathrm{refl}}}$ is a Condorcet social choice function
which satisfies Pareto optimality and monotonicity.

Theorem 63 The social choice function $C^{\lambda}$ is a Condorcet social choice function
which satisfies Pareto optimality.

That $C^{\lambda}$ is not monotone can be seen from the following example.

Example 35 Consider the following preference profile on the set of eight agents $N =
\{1, 2, 3, 4, 5, 6, 7, 8\}$ and seven alternatives $A = \{b, c, a_1, a_2, a_3, a_4, a_5\}$ (each row lists
the alternatives from most to least preferred):

Agent 1: a1 a2 b a3 c a4 a5
Agent 2: a5 c a4 b a3 a1 a2
Agent 3: a2 b a3 c a4 a5 a1
Agent 4: a1 a4 a5 a3 c a2 b
Agent 5: b a1 a5 c a2 a3 a4
Agent 6: a4 a3 a2 a5 c b a1
Agent 7: b c a1 a2 a3 a4 a5
Agent 8: a5 a4 a3 a2 a1 b c

The corresponding majority digraph is $D^p = \{(a_1, a_2), (a_2, b), (b, a_1), (b, a_3), (a_3, c), (c, b),
(c, a_4), (a_4, a_5), (a_5, c)\}$. The $\lambda$-scores are $\lambda_b(D^p) = \lambda_c(D^p) = \tfrac{21}{16}$, and $\lambda_i(D^p) = \tfrac{7}{8}$
for $i \in \{a_1, \ldots, a_5\}$, and thus $C^{\lambda}(p) = \{b, c\}$.
Now, change the preference relation of agent 1 by moving alternative $b$ one position
higher, i.e. consider the preference profile $p'$ where agents 2-8 have the same preference
relation as in $p$ and the preference relation of agent 1 becomes

Agent 1: $a_1 \succ'_1 b \succ'_1 a_2 \succ'_1 a_3 \succ'_1 c \succ'_1 a_4 \succ'_1 a_5$

Consequently, in the majority digraph the arc $(a_2, b)$ disappears, i.e. $D^{p'} = D^p \setminus
\{(a_2, b)\}$. Now, the $\lambda$-scores of alternatives $b$ and $c$ are $\lambda_b(D^{p'}) = \tfrac{14}{11}$ and $\lambda_c(D^{p'}) = \tfrac{21}{11}$.
So, alternative $c$ has a higher score than $b$, and thus $b \notin C^{\lambda}(p')$, although we obtained
preference profile $p'$ by only moving up alternative $b$ in one of the preference relations,
showing that $C^{\lambda}$ does not satisfy monotonicity.
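The scores in Example 35 can be checked against the fixed-point equations of Proposition 59, taking the two majority digraphs as stated in the example. For $D^{p'}$ the example only reports the scores of $b$ and $c$; the remaining entries used below ($0$ for $a_1, a_2$ and $\tfrac{14}{11}$ for $a_3, a_4, a_5$) are not from the notes but were derived here from the fixed-point equations together with the requirement that the scores sum to $\#A = 7$, so they should be read as an assumption. The helper name `is_fixed_point` is hypothetical.

```python
from fractions import Fraction as F


def is_fixed_point(D, A, score):
    # Checks score_x == sum over y in Succ_x(D) united with {x} of score_y / (#Pred_y(D) + 1).
    pred_count = {y: sum(1 for (_, v) in D if v == y) for y in A}
    for x in A:
        targets = {v for (u, v) in D if u == x} | {x}
        if sum(score[y] / (pred_count[y] + 1) for y in targets) != score[x]:
            return False
    return True


A = ["b", "c", "a1", "a2", "a3", "a4", "a5"]
D_p = {("a1", "a2"), ("a2", "b"), ("b", "a1"), ("b", "a3"), ("a3", "c"),
       ("c", "b"), ("c", "a4"), ("a4", "a5"), ("a5", "c")}
D_p_prime = D_p - {("a2", "b")}

# Scores reported in the example for D^p.
lam_p = {"b": F(21, 16), "c": F(21, 16), **{x: F(7, 8) for x in A[2:]}}

# For D^{p'}: b and c from the example, other entries derived (assumption).
lam_p_prime = {"b": F(14, 11), "c": F(21, 11), "a1": F(0), "a2": F(0),
               "a3": F(14, 11), "a4": F(14, 11), "a5": F(14, 11)}

fixed_p = is_fixed_point(D_p, A, lam_p)
fixed_p_prime = is_fixed_point(D_p_prime, A, lam_p_prime)
```

Both vectors satisfy the fixed-point equations and sum to 7; in $D^{p'}$ alternative `c` ends up strictly above `b`.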

Concluding, when we compare $C^{\beta^{\mathrm{refl}}}$ with $C^{\lambda}$, the disadvantage of $C^{\beta^{\mathrm{refl}}}$ is that
it is not a refinement of the top set for every social choice problem, but a disadvantage
of $C^{\lambda}$ is that it is not monotone.

8.6 Social welfare functions


Obviously, score functions can also be used to define social welfare functions which
assign to every preference profile a complete, transitive relation.
Given a score function $\sigma$, we can define a corresponding social welfare function
$F^{\sigma}(p) = \succsim^{\sigma}$ that ranks the alternatives according to $\sigma$:

\[ a \succsim^{\sigma} b \Leftrightarrow \sigma_a(D^p) \geq \sigma_b(D^p) \quad \text{for all } a, b \in A. \]

We will not discuss this further in this course.

8.7 Conclusion
In this lecture we discussed several ranking methods for digraphs, and applied them
to define social choice and welfare functions. These ranking methods can be used to
derive a complete and transitive preference relation from any social or individual
preference relation, and therefore are very useful for application to the majority relation.
Besides collective choice, ranking methods are also used, for example, to
rank teams in sports competitions or web pages on the internet. Numerous
ranking methods are applied to sports competitions and web page ranking, one of the
most famous being Google PageRank, see Brin and Page (1998).$^5$

8.8 Exercises
Exercise 1 Give a preference profile $p = (\succsim_i)_{i \in N}$, such that all preference relations $\succsim_i$
are complete, transitive and asymmetric, but the majority relation $\succsim^p$ is not transitive.

Exercise 2 Consider the set of alternatives $A = \{a, b, c, d, e\}$ and digraph
$D = \{(a,b), (a,d), (b,c), (b,d), (c,e), (d,c), (e,a)\}$ on $A$.

(a) Give the outdegree scores of the alternatives in this digraph.

(b) Give the $\beta$-scores of the alternatives in this digraph.


$^5$ Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual Web search engine.
Computer Networks and ISDN Systems, 30(1-7):107-117, 1998.

(c) Give the $\beta^{\mathrm{refl}}$-scores of the alternatives in this digraph.

(d) Give the $\lambda$-scores of the alternatives in this digraph. (Hint: Find these scores
by solving the matrix equation $\bar{T}^D \lambda = \lambda$ where $\bar{T}^D$ is the $\#A \times \#A$ normalized transition
matrix and $\lambda \in \mathbb{R}^m$.)

(e) Give a preference profile $p$ with set of agents $N = \{1, 2, 3, 4\}$ such that digraph $D$
represents the majority relation in the sense that $(a, b) \in D \Leftrightarrow a \succsim^p b$.

(f) Give the choice set $C^{out}$ according to the outdegree.

(g) Give the choice set $C^{\beta}$ according to the $\beta$-score.

(h) Give the choice set $C^{\beta^{\mathrm{refl}}}$ according to the $\beta^{\mathrm{refl}}$-score.

(i) Give the choice set $C^{\lambda}$ according to the $\lambda$-score.

Exercise 3 Consider the digraph $D = \{(a,b), (a,c), (b,c), (b,d), (b,e), (c,d), (d,a), (d,e)\}$
on $A = \{a, b, c, d, e\}$. Answer the same questions as Exercise 2.

Exercise 4 Consider digraph $D$ of Exercise 2. Give the social preference relation that
is obtained by aggregating preferences according to the social welfare functions based
on the outdegree-, $\beta$-, $\beta^{\mathrm{refl}}$- and $\lambda$-scores.

Exercise 5 This exercise is revised in 2019:
Consider digraph $D = \{(a,b), (a,c), (b,c), (b,d), (c,d), (e,d)\}$ on $A = \{a, b, c, d, e\}$.

(a) Determine the Top set $TOP(D)$.

(b) Determine the Uncovered Set $UNC(D)$.

(c) Change digraph $D$ by replacing $(e,d)$ by $(e,a)$, so consider digraph
$D' = (D \setminus \{(e,d)\}) \cup \{(e,a)\}$. Answer questions a-b for this digraph.

Exercise 6 This exercise is revised in 2019: Give a digraph $D$ on a set of
alternatives such that the node with the highest reflexive $\beta$-score does not belong to the Top
set.

Exercise 7 Let $D$ represent a sports competition between $m$ teams $A = \{a_1, \ldots, a_m\}$
such that $(a, b) \in D$ if and only if the match between $a$ and $b$ is won by $a$ or is a draw.
Suppose that each team plays once against each other team.

(a) Give a score function that expresses that a team gets 2 points for every match it
wins, 1 point for every draw, and 0 points for every loss.

(b) Give a score function that expresses that a team gets 3 points for every match it
wins, 1 point for every draw, and 0 points for every loss.

(c) For the two score functions you defined above, verify if they satisfy the dummy
property, symmetry and additivity over loss graphs.

Concepts, Theory and Proofs

Exercise 8
This exercise is corrected in 2019: Prove that for every asymmetric and complete
preference relation $\succsim$, there is a preference profile $p$ such that $\succsim$ is its majority relation,
i.e. $\succsim^p = \succsim$.

Exercise 9

(a) Prove that the $\beta$-score function satisfies the dummy property, symmetry, additivity
over loss graphs and dominance normalization.

(b) Prove that the $\beta$-score function is the unique score function that satisfies the
dummy property, symmetry, additivity over loss graphs and dominance
normalization.

Exercise 10 Prove that $C^{\beta^{\mathrm{refl}}}$ and $C^{\lambda}$ are Condorcet social choice functions, i.e. for
every preference profile $p$, $C^{\beta^{\mathrm{refl}}}(p) = C^{\lambda}(p) = B(A)$ if $B(A) \neq \emptyset$, where $B(A)$ is the
set of best elements in $\succsim^p$ on set of alternatives $A$.

Exercise 11 Prove that for every preference profile $p$ with $D^p$ a complete and
asymmetric relation, it holds that $UNC(p) \subseteq TOP(p)$.
