Theodore Sider
December 4, 2007
Preface
This book is an elementary introduction to the logic that students of contemporary
philosophy ought to know. It covers i) basic approaches to logic, including
proof theory and especially model theory, ii) extensions of standard logic (such
as modal logic) that are important in philosophy, and iii) some elementary
philosophy of logic. It prepares students to read the logically sophisticated
articles in today’s philosophy journals, and helps them resist bullying by
symbol-mongerers. In short, it teaches the logic necessary for being a contemporary
philosopher.
For better or for worse (I think better), the last century-or-so’s developments
in logic are part of the shared knowledge base of philosophers, and inform,
in varying degrees of directness, every area of philosophy. Logic is part of
our shared language and inheritance. The standard philosophy curriculum
therefore includes a healthy dose of logic. This is a good thing. But the advanced
logic that is part of this curriculum is usually a course in “mathematical logic”,
which usually means an intensive course in metalogic (for example, a course
based on the excellent Boolos and Jeffrey (1989)). I do believe in the value of
such a course. But if advanced undergraduate philosophy majors or beginning
graduate students are to have one advanced logic course, that course should
not, I think, be a course in metalogic. The standard metalogic course is too
mathematically demanding for the average philosophy student, and omits
material that the average student needs to know. If there is to be only one
advanced logic course, let it be a course designed to instill logical literacy.
I begin with a sketch of standard propositional and predicate logic (developed
more formally than in a typical intro course). I briefly discuss a few
extensions and variations on each (e.g., three-valued logic, definite descriptions).
I then discuss modal logic and counterfactual conditionals in detail. I
presuppose familiarity with the contents of a typical intro logic course: the
meanings of the logical symbols of first-order predicate logic without identity
or function symbols; truth tables; translations from English into propositional
and predicate logic; some proof system (e.g., natural deduction) in propositional
and predicate logic.
I drew heavily from the following sources, which would be good for supplemental
reading:
• Propositional logic: Mendelson (1987)
• Descriptions, multivalued logic: Gamut (1991a)
• Sequents: Lemmon (1965)
• Further quantifiers: Glanzberg (2006); Sher (1991, chapter 2); Westerståhl (1989);
Boolos and Jeffrey (1989, chapter 18)
• Modal logic: Gamut (1991b); Cresswell and Hughes (1996)
• Semantics for intuitionism: Priest (2001)
• Counterfactuals: Lewis (1973)
• Two-dimensional modal logic: Davies and Humberstone (1980)
Another source was Ed Gettier’s 1988 modal logic class at the University of
Massachusetts.
I am also deeply grateful for feedback from colleagues, and from students
in courses on this material. In particular, Marcello Antosh, Josh Armstrong,
Gabe Greenberg, Angela Harper, Sami Laine, Gregory Lavers, Alex Morgan,
Jeff Russell, Brock Sides, Jason Turner, Crystal Tychonievich, Jennifer Wang,
Brian Weatherson, and Evan Williams: thank you.
Contents
Preface
1 Nature of Logic
1.1 Logical consequence and logical truth
1.2 Form and abstraction
1.3 Formal logic
1.4 Correctness and application
1.5 The nature of logical consequence
1.6 Extensions, deviations, variations
1.6.1 Extensions
1.6.2 Deviations
1.6.3 Variations
1.7 Metalogic, metalanguages, and formalization
1.8 Set theory
2 Propositional Logic
2.1 Grammar of PL
2.2 The semantic approach to logic
2.3 Semantics of PL
2.4 Natural deduction in PL
2.4.1 Sequents
2.4.2 Rules
2.4.3 Sequent proofs
2.4.4 Example sequent proofs
2.5 Axiomatic proofs in PL
2.5.1 Example axiomatic proofs
2.5.2 The deduction theorem
2.6 Soundness and completeness of PL
3 Variations and Deviations from PL
3.1 Alternate connectives
3.1.1 Symbolizing truth functions in propositional logic
3.1.2 Inadequate connective sets
3.1.3 Sheffer stroke
3.2 Polish notation
3.3 Multivalued logic
3.3.1 Łukasiewicz’s system
3.3.2 Kleene’s “strong” tables
3.3.3 Kleene’s “weak” tables (Bochvar’s tables)
3.3.4 Supervaluationism
3.4 Intuitionism
4 Predicate Logic
4.1 Grammar of predicate logic
4.2 Semantics of predicate logic
4.3 Establishing validity and invalidity
5 Extensions of Predicate Logic
5.1 Identity
5.1.1 Grammar for the identity sign
5.1.2 Semantics for the identity sign
5.1.3 Symbolizations with the identity sign
5.2 Function symbols
5.2.1 Grammar for function symbols
5.2.2 Semantics for function symbols
5.2.3 Symbolizations with function symbols: some examples
5.3 Definite descriptions
5.3.1 Grammar for ι
5.3.2 Semantics for ι
5.3.3 Eliminability of function symbols and definite descriptions
5.4 Further quantifiers
5.4.1 Generalized monadic quantifiers
5.4.2 Generalized binary quantifiers
5.4.3 Second-order logic
6 Propositional Modal Logic
6.1 Grammar of MPL
6.2 Symbolizations in MPL
6.3 Semantics for MPL
6.3.1 Relations
6.3.2 Kripke models
6.3.3 Semantic validity proofs
6.3.4 Countermodels
6.3.5 Schemas, validity, and invalidity
6.4 Axiomatic systems of MPL
6.4.1 System K
6.4.2 System D
6.4.3 System T
6.4.4 System B
6.4.5 System S4
6.4.6 System S5
6.4.7 Substitution of equivalents and modal reduction
6.5 Soundness in MPL
6.5.1 Soundness of K
6.5.2 Soundness of T
6.5.3 Soundness of B
6.6 Completeness of MPL
6.6.1 Canonical models
6.6.2 Maximal consistent sets of wffs
6.6.3 Definition of canonical models
6.6.4 Features of maximal consistent sets
6.6.5 Maximal consistent extensions
6.6.6 “Mesh”
6.6.7 The coincidence of truth and membership in canonical models
6.6.8 Completeness of systems of MPL
7 Variations on MPL
7.1 Propositional tense logic
7.1.1 The metaphysics of time
7.1.2 Tense operators
7.1.3 Syntax of tense logic
7.1.4 Possible worlds semantics for tense logic
7.1.5 Formal constraints on ≤
7.2 Intuitionist propositional logic
7.2.1 Kripke semantics for intuitionist propositional logic
7.2.2 Examples and proofs
7.2.3 Soundness and other facts about intuitionist validity
8 Counterfactuals
8.1 Natural language counterfactuals
8.1.1 Not truth-functional
8.1.2 Can be contingent
8.1.3 No augmentation
8.1.4 No contraposition
8.1.5 Some implications
8.1.6 Context dependence
8.2 The Lewis/Stalnaker approach
8.3 Stalnaker’s system (SC)
8.3.1 Syntax of SC
8.3.2 Semantics of SC
8.4 Validity proofs in SC
8.5 Countermodels in SC
8.6 Logical Features of SC
8.6.1 Not truth-functional
8.6.2 Can be contingent
8.6.3 No augmentation
8.6.4 No contraposition
8.6.5 Some implications
8.6.6 No exportation
8.6.7 No importation
8.6.8 No hypothetical syllogism (transitivity)
8.6.9 No transposition
8.7 Lewis’s criticisms of Stalnaker’s theory
8.8 Lewis’s system
8.9 The problem of disjunctive antecedents
9 Quantified Modal Logic
9.1 Grammar of QML
9.2 Symbolizations in QML
9.3 A simple semantics for QML
9.4 Countermodels and validity proofs in SQML
9.5 Philosophical questions about SQML
9.5.1 The necessity of identity
9.5.2 The necessity of existence
9.5.3 Necessary existence defended
9.6 Variable domains
9.6.1 Countermodels to the Barcan and related formulas
9.6.2 Expanding, shrinking domains
9.6.3 Strong and weak necessity
10 Two-dimensional modal logic
10.1 Actuality
10.1.1 Kripke models with actual worlds
10.1.2 Semantics for @
10.1.3 Examples
10.2 ×
10.2.1 Two-dimensional semantics for ×
10.2.2 Examples
10.3 Fixedly
10.3.1 Examples
10.4 A philosophical application: necessity and a priority
References
Chapter 1
Nature of Logic
Since you are reading this book, you are probably already familiar with
some logic. You probably know how to translate English sentences into
symbolic notation—into propositional logic:
English                                        Propositional logic
If snow is white then grass is green           S→G
Either snow is white or grass is not green     S∨∼G
and into predicate logic:
English                                        Predicate logic
If Jones is happy then someone is happy        Hj→∃xHx
Anyone who is friends with Jones is either     ∀x[Fxj→(Ix∨∀yFxy)]
insane or friends with everyone
You are probably also familiar with some basic techniques for evaluating arguments
written out in symbolic notation. You have probably encountered truth
tables, and some form of proof theory (perhaps a “natural deduction” system;
perhaps “truth trees”). You may have even encountered some elementary model
theory. In short: you have had an introductory course in symbolic logic.
What you already have is: literacy in elementary logic. What you will
get out of this book is: literacy in the rest of logic that philosophers tend to
presuppose, plus a deeper grasp of what logic is all about.
So what is logic all about?
1.1 Logical consequence and logical truth
Logic is about logical consequence. The statement “someone is happy” is a logical
consequence of the statement “Ted is happy”. If Ted is happy, then it logically
follows that someone is happy. Put another way: the statement “Ted is happy”
logically implies the statement “someone is happy”. Likewise, the statement
“Ted is happy” is a logical consequence of the statements “It’s not the case that
John is happy” and “Either John is happy or Ted is happy”. The first statement
follows from the latter two statements. If the latter two statements are true,
the former must be true. Put another way: the argument whose premises are
the latter two statements, and whose conclusion is the former statement, is a
logically correct one.¹
Relatedly, logic is about logical truth. A logical truth is a sentence that is
“true purely by virtue of logic”. Examples might include: “it’s not the case that
snow is white and also not white”, “All fish are fish”, and “If Ted is happy then
someone is happy”. It is plausible that logical truth and logical consequence
are related thus: a logical truth is a sentence that is a logical consequence of
any sentences whatsoever.
1.2 Form and abstraction
Logicians focus on form. Consider again the following argument:
Argument A: It’s not the case that John is happy
Ted is happy or John is happy
Therefore, Ted is happy
Argument A is logically correct—its conclusion is a logical consequence of its
premises. It is customary to say that this is so in virtue of its form—in virtue of
the fact that its form is:
It’s not the case that φ
ψ or φ
Therefore ψ
¹ The word ‘valid’ is sometimes used for logically correct arguments, but I will reserve that
word for a different concept: that of a logical truth according to the semantic conception of
logical truth.
Likewise, we say that “it’s not the case that snow is white and snow is not white”
is a logical truth because it has the form: it’s not the case that φ and not-φ.
We need to think hard about the idea of form. Apparently, we got the
alleged form of Argument A by replacing some words with Greek letters and
leaving other words as they were. We replaced the sentences ‘John is happy’ and
‘Ted is happy’ with φ and ψ, respectively, but left the expressions ‘It’s not the
case that’ and ‘or’ as they were, resulting in the schematic form displayed above.
Let’s call that form, “Form 1”. What’s so special about Form 1? Couldn’t we
make other choices for what to leave and what to replace? For instance, if we
replace the predicate ‘is happy’ with the schematic letter α, leaving the rest
intact, we get this:
Form 2: It’s not the case that John is α
Ted is α or John is α
Therefore, Ted is α
And if we replace the ‘or’ with the schematic letter γ and leave the rest intact,
then we get this:
Form 3: It’s not the case that John is happy
Ted is happy γ John is happy
Therefore, Ted is happy
If we think of Argument A as having Form 1, then we can think of it as being
logically correct in virtue of its form, since every “instance” of Form 1 is
logically correct. That is, no matter what sentences we substitute in for the
Greek letters φ and ψ in Form 1, the result is a logically correct argument.
Now, if we think of Argument A’s form as being Form 2, we can continue to
think of Argument A as being logically correct in virtue of its form, since, like
Form 1, every instance of Form 2 is logically correct: no matter what predicate
we change α to, Form 2 becomes a logically correct argument. But if we think
of Argument A’s form as being Form 3, then we cannot think of it as being
logically correct in virtue of its form, for not every instance of Form 3 is a
logically correct argument. If we change γ to ‘if and only if’, for example, then
we get the following logically incorrect argument:
It’s not the case that John is happy
Ted is happy if and only if John is happy
Therefore, Ted is happy
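The contrast between Form 1 and this instance of Form 3 can be checked mechanically. The following is a minimal sketch, assuming that truth tables are an adequate stand-in for logical correctness in these simple cases; the function name `is_valid_form` and the encoding of sentences as Python functions are illustrative choices, not anything the text defines.

```python
from itertools import product

def is_valid_form(premises, conclusion, num_letters):
    """Return True if no assignment of truth values makes every premise
    true and the conclusion false."""
    for values in product([True, False], repeat=num_letters):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Form 1: premises "not phi" and the disjunction of phi and psi; conclusion psi.
form1_valid = is_valid_form(
    premises=[lambda phi, psi: not phi, lambda phi, psi: phi or psi],
    conclusion=lambda phi, psi: psi,
    num_letters=2,
)

# The instance of Form 3 with 'if and only if' substituted for gamma.
# john, ted: truth values of 'John is happy' and 'Ted is happy'.
iff_instance_valid = is_valid_form(
    premises=[lambda john, ted: not john, lambda john, ted: ted == john],
    conclusion=lambda john, ted: ted,
    num_letters=2,
)

print(form1_valid)         # True
print(iff_instance_valid)  # False
```

The failing row for the second check is the one in which neither John nor Ted is happy: both premises come out true and the conclusion false.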
So, what did we mean, when we said that Argument A is logically correct in
virtue of its form? What is Argument A’s form? Is it Form 1, Form 2, or Form
3?
There is no such thing as the form of an argument. When we assign an
argument a form, what we are doing is focusing on certain words and ignoring
others. We leave intact the words we’re focusing on, and we insert schematic
letters for the rest. Thus, in assigning Argument A Form 1, we’re focusing on
the words (phrases) ‘it is not the case that’ and ‘or’, and ignoring other words.
More generally, in (standard) propositional logic, we focus on the phrases
‘if…then’, ‘if and only if’, ‘and’, ‘or’, and so on, and ignore others. We do this
in order to investigate the relations of logical consequence that hold in virtue
of these words’ meaning. The fact that Argument A is logically correct depends
just on the meaning of the phrases ‘it is not the case that’ and ‘or’; it does not
depend on the meanings of the sentences ‘John is happy’ and ‘Ted is happy’.
We can substitute any sentences we like for ‘φ’ and ‘ψ’ in Form 1 and still get
a logically correct argument.
In predicate logic, on the other hand, we focus on further words: ‘all’ and
‘some’. Broadening our focus in this way allows us to capture a wider range
of logical consequences and logical truths. For example, “If Ted is happy then
someone is happy” is a logical truth in virtue of the meaning of ‘someone’, but
not merely in virtue of the meanings of the characteristic words of propositional
logic.
Call the words on which we’re focusing—that is, the words that we leave
intact when we construct the forms of sentences and arguments—the logical
constants. (We can speak of natural language logical constants—‘and’, ‘or’, etc.
for propositional logic; ‘all’ and ‘some’ in addition for predicate logic—as well
as symbolic logical constants: ∧, ∨, etc. for propositional logic; ∀ and ∃ in
addition for predicate logic.) What we’ve seen is that the forms we assign
depend on what we’re considering to be the logical constants.
We call these expressions logical constants because we interpret them in a
constant way in logic, in contrast to other terms. For example, ∧ is a logical
constant; in propositional logic, it always stands for conjunction. There are
fixed rules governing ∧, in proof systems (the rule that from P∧Q one can
infer P, for example), in the rules for constructing truth tables, and so on.
Moreover, these rules are distinctive for ∧: there are different rules for other
logical constants such as ∨. In contrast, the terms in logic that are not logical
constants do not have fixed, particular rules governing their meanings. For
example, there are no special rules governing what one can do with a P as
opposed to a Q in proofs or truth tables. That’s because P doesn’t symbolize
any sentence in particular; it can stand for any old sentence.
There isn’t anything sacred about the choices of logical constants we make
in propositional and predicate logic; and therefore, there isn’t anything sacred
about the customary forms we assign to sentences. We could treat other words
as logical constants. We could, for example, stop taking ‘or’ as a logical constant,
and instead take ‘It’s not the case that John is happy’, ‘Ted is happy’, and ‘John
is happy’ as logical constants. We would thereby view Argument A as having
Form 3. This would not be a particularly productive choice (since it would not
help to explain the correctness of Argument A), but it’s not wrong simply by
virtue of the concept of form.
More interestingly, consider the fact that every argument of the following
form is logically correct:
α is a bachelor
Therefore, α is unmarried
Accordingly, we could treat the predicates ‘is a bachelor’ and ‘is unmarried’
as logical constants, and develop a corresponding logic. We could introduce
special symbolic logical constants for these predicates, and we could introduce
distinctive rules governing these predicates in proofs. (The rule of “bachelor
elimination”, for instance, might allow one to infer “α is unmarried” from “α
is a bachelor”.) As with the choices of the previous paragraph, this choice of
what to treat as a logical constant is also not ruled out by the concept of form.
And it would be more productive than the choices of the last paragraph. Still,
it would be far less productive than the usual choices of logical constants in
predicate and propositional logic. The word ‘bachelor’ doesn’t have as general
application as the words commonly treated as logical constants in propositional
and predicate logic; the latter are ubiquitous.
At least, this remark about “generality” is one idea about what should be
considered a “logical constant”, and hence one idea about the scope of what
is usually thought of as “logic”. Where to draw the boundaries of logic—and
indeed, whether the logic/nonlogic boundary is an important one to draw—is
an open philosophical question about logic. At any rate, in this course, one
thing we’ll do is study systems that expand the list of logical constants from
standard propositional and predicate logic.
1.3 Formal logic
Modern logic is “mathematical” or “formal” logic. This means simply that
one studies logic using mathematical techniques. More carefully: in order
to develop theories of logical consequence and logical truth, one develops a
formal language (see below); one treats the sentences of the formal language as
mathematical objects; one uses the tools of mathematics (especially, the tools
of very abstract mathematics, such as set theory) to formulate theories about
the sentences in the formal language; and one applies mathematical standards
of rigor to these theories. Mathematical logic was originally developed to study
mathematical reasoning,² but its techniques are now applied to reasoning of all
kinds.
Think, for example, of propositional logic (this will be our first topic below).
The standard approach to analyzing the logical behavior of ‘and’, ‘or’, and so
on, is to develop a certain formal language, the language of propositional logic.
The sentences of this language look like this:
P
(Q→R)∨(Q→∼S)
P↔(P∧Q)
The symbols ∧, ∨, etc., are used to represent the English words ‘and’, ‘or’, and
so on (the logical constants for propositional logic), and the sentence letters
P, Q, etc., are used to represent declarative English sentences.
Why ‘formal’? Because we stipulate, in a mathematically rigorous way,
a grammar for the language; that is, we stipulate a mathematically rigorous
definition of the idea of a sentence of this language. Moreover, since we are
only interested in the logical behavior of the chosen logical constants ‘and’,
‘or’, and so on, we choose special symbols (∧, ∨, …) for these words only; we
use P, Q, R, … indifferently to represent any English sentence whose internal
logical structure we are willing to ignore.³
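To make the idea of a stipulated grammar concrete, here is a minimal sketch of a recursive definition of "sentence" for a toy propositional language. The particular clauses, the nested-tuple encoding, and the name `is_wff` are assumptions made for illustration; the official grammar is developed later in the book and may differ in detail.

```python
BINARY = {"∧", "∨", "→", "↔"}

def is_wff(expr):
    """Recursive definition of 'sentence' for a toy propositional language.

    Assumed (hypothetical) clauses:
      1. A single capital letter such as "P" or "Q" is a sentence.
      2. If A is a sentence, so is ("∼", A).
      3. If A and B are sentences, so is (A, c, B) for a binary connective c.
      4. Nothing else is a sentence.
    """
    if isinstance(expr, str):
        return len(expr) == 1 and "A" <= expr <= "Z"
    if isinstance(expr, tuple) and len(expr) == 2:
        return expr[0] == "∼" and is_wff(expr[1])
    if isinstance(expr, tuple) and len(expr) == 3:
        left, connective, right = expr
        return (
            isinstance(connective, str)
            and connective in BINARY
            and is_wff(left)
            and is_wff(right)
        )
    return False

# The three sample sentences displayed above, in the tuple encoding.
print(is_wff("P"))                                                  # True
print(is_wff((("Q", "→", "R"), "∨", ("Q", "→", ("∼", "S")))))        # True
print(is_wff(("P", "↔", ("P", "∧", "Q"))))                          # True
print(is_wff(("P", "∧")))                                           # False
```

Because the definition is stipulated, whether a string of symbols counts as a sentence is settled by the clauses alone, with no appeal to what the symbols mean.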
² Notes
³ Natural languages like English also have a grammar, and the grammar can be studied
using mathematical techniques. But the grammar is much more complicated, and is discovered
rather than stipulated; and natural languages lack abstractions like the sentence letters.
We go on, then, to study (as always, in a mathematically rigorous way) various
concepts that apply to the sentences in formal languages. In propositional
logic, for example, one constructs a mathematically rigorous definition of a
tautology (“all Trues in the truth table”), and a rigorous definition of a provable
formula (e.g., in terms of a system of deduction, using rules of inference,
assumptions, and so on).
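The "all Trues in the truth table" definition of a tautology can be sketched in a few lines. This is an illustrative sketch only: the nested-tuple encoding of formulas and the helper names `letters`, `value`, and `is_tautology` are assumptions, and the clauses for the connectives are the usual classical truth tables.

```python
from itertools import product

def letters(expr):
    """Collect the sentence letters occurring in a formula."""
    if isinstance(expr, str):
        return {expr}
    if len(expr) == 2:            # ("∼", A)
        return letters(expr[1])
    left, _, right = expr         # (A, connective, B)
    return letters(left) | letters(right)

def value(expr, valuation):
    """Truth value of a formula under an assignment to its sentence letters."""
    if isinstance(expr, str):
        return valuation[expr]
    if len(expr) == 2:
        return not value(expr[1], valuation)
    left, connective, right = expr
    a, b = value(left, valuation), value(right, valuation)
    if connective == "∧":
        return a and b
    if connective == "∨":
        return a or b
    if connective == "→":
        return (not a) or b
    if connective == "↔":
        return a == b
    raise ValueError(f"unknown connective: {connective!r}")

def is_tautology(expr):
    """True iff the formula is true on every row of its truth table."""
    names = sorted(letters(expr))
    return all(
        value(expr, dict(zip(names, row)))
        for row in product([True, False], repeat=len(names))
    )

print(is_tautology(("∼", ("S", "∧", ("∼", "S")))))   # True
print(is_tautology(("P", "↔", ("P", "∧", "Q"))))     # False
```

Note the asymmetry in `value`: each connective gets its own fixed clause, while a sentence letter is simply looked up in whatever valuation is supplied.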
Of course, the real goal is to apply the notions of logical consequence and
logical truth to sentences of English and other natural languages. The formal
languages are merely a tool; we need to apply the tool.
1.4 Correctness and application
To apply the tools we develop for formal languages, we need to speak of a
formal system as being correct. What does that sort of claim mean?
As we saw, logicians use formal languages and formal structures to study
logical consequence and logical truth. And the range of structures that one
could in principle study is very wide. For example, I could introduce a new
notion of “provability” by saying “in Ted Logic, the following rule may be
used when constructing proofs: if you have P on a line, you may infer ∼P.
The annotation is ‘T’.” I could then go on to investigate the properties of
such a system. Logic can be viewed as a branch of mathematics, and we can
mathematically study any system we like, including a system (like Ted Logic) in
which one can “prove” ∼P from P.
But no such formal system would shed light on genuine logical consequence
and genuine logical truth. It would be implausible to claim, for example, that
when we translate an English argument into symbols, the conclusion of the
resulting symbolic argument may be derived in Ted Logic from its premises iff
the conclusion of the original English argument is a logical consequence of its
premises.
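The gap between having a well-defined proof system and having a correct one can be illustrated with a small sketch. It builds the two-line Ted Logic "proof" described above and then checks it against truth tables, which are assumed here as a rough proxy for genuine logical consequence; the function names and the tuple encoding of ∼P are hypothetical.

```python
def apply_rule_t(formula):
    """Rule 'T' of Ted Logic, as the text describes it: from X, infer ∼X."""
    return ("∼", formula)

def value(formula, p_is_true):
    """Truth-table value of a formula built from the letter P and ∼."""
    if formula == "P":
        return p_is_true
    operator, operand = formula
    if operator != "∼":
        raise ValueError(f"unexpected operator: {operator!r}")
    return not value(operand, p_is_true)

# A two-line Ted Logic "proof": line 1 is the premise, line 2 follows by T.
proof = ["P"]
proof.append(apply_rule_t(proof[-1]))

# Truth-table rows where the premise is true but the derived line is false.
counterexamples = [
    p for p in (True, False)
    if value(proof[0], p) and not value(proof[1], p)
]

print(proof[1])          # ('∼', 'P')
print(counterexamples)   # [True]
```

The system is perfectly well specified, and the derivation goes through, but the row on which P is true shows that rule T does not preserve truth.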
Thus, the existence of a coherent, specifiable logical system must be distinguished
from its application. When we say that a logical system is correct,
we have in mind some application of that system. Here’s an oversimplified
account of one such correctness claim. Suppose we have developed a certain
formal system for constructing proofs in propositional logic. And suppose
we have specified some translation scheme from English into the language
of propositional logic. This translation scheme would translate English ‘and’
into the logical ∧, English ‘or’ into the logical ∨, and so on. Then, the claim
that the formal system gives a correct logic of English ‘and’, ‘or’, etc. might be
taken to be the claim that one English sentence is a logical consequence of
some other English sentences in virtue of ‘and’, ‘or’, etc., iff one can prove the
translation of the former English sentence from the translations of the latter
English sentences in the formal system.
In this book I won’t spend much time on philosophical questions about
which formal systems are correct. My goal is rather to introduce those formalisms
that are ubiquitous in philosophy, to give you the tools you need to
address such philosophical questions yourself. Still, from time to time, we’ll dip
just a bit into these philosophical questions, in order to motivate our choices
of logical systems to study.
1.5 The nature of logical consequence
The previous section discussed what it means to say that a formal theory gives a
correct account of logical consequence (as applied to sentences of English and
other natural languages). But what is it for sentences to stand in the relation of
logical consequence? What is logical consequence?
The question here is a philosophical question, as opposed to a mathematical
one. Logicians define various notions concerning sentences of formal languages:
derivability in this or that proof system, “all trues in the truth table”, and so
on. They thereby stipulatively introduce various formal concepts. These
formal concepts are good insofar as they correctly model logical truth and
logical consequence. But in what do logical truth and logical consequence—the
intuitive concepts, as opposed to the stipulatively introduced concepts—consist?
This is one of the core questions of philosophical logic.
This book is not primarily a book in philosophical logic, so we won’t spend
much time on the question. However, I do want to make clear that the question
is indeed a question. The question is sometimes obscured by the fact that terms
like ‘logical truth’ are often stipulatively defined in logic books. This can lead
to the belief that there are no genuine issues concerning these notions. It is also
obscured by the fact that one philosophical theory of these notions, the
model-theoretic one, is so dominant that one can forget that it is a nontrivial theory.
Stipulative definitions are of course not things whose truth can be questioned;
but stipulative definitions of logical notions are good insofar as the stipulated
notions accurately model the real, intuitive, nonstipulated notions of logical
consequence and logical truth. Further, the stipulated definitions generally
concern formal languages, whereas the ultimate goal is an understanding of
correct reasoning of the sort that we actually do, using natural languages.
CHAPTER 1. NATURE OF LOGIC 9
Let’s focus just on logical consequence. Here is a quick survey of some
competing philosophical accounts of its nature. Probably the most standard
account is the semantic, or model-theoretic one. Intuitively, a logical truth is
“true no matter what”. The model-theoretic account is one way of making this
slogan precise. It says that φ is a logical consequence of the sentences in set Γ
if the formal translation of φ is true in every model (interpretation) in which
the formal translations of the members of Γ are true. This account needs to be
spelled out in various ways. First, “formal translations” are translations into a
formal language; but which formal language? It will be a language that has a
logical constant for each English logical expression. But that raises the question
of which expressions of English are logical expressions. In addition to ‘and’,
‘or’, ‘all’, and so on, are any of the following logical expressions?
necessarily
it will be the case that
most
it is morally wrong that
Further, the notion of translation must be defined; further, an appropriate
definition of ‘model’ must be chosen.
Similar issues of refinement confront a second account, the proof-theoretic
account: φ is a logical consequence of the members of Γ iff the translation of
φ is provable from the translations of the members of Γ. We must decide what
formal language to translate into, and we must decide upon an appropriate
account of provability.
A third view is Quine’s: φ is a logical consequence of the members of Γ
iff there is no way to (uniformly) substitute new nonlogical expressions for
nonlogical expressions in φ and the members of Γ so that the members of Γ
become true and φ becomes false.
Three other accounts should be mentioned. The first account is a modal
one. Say that Γ modally implies φ iff it is not possible for φ to be false while the
members of Γ are true. (What does ‘possible’ mean here? There are many kinds
of possibility one might have in mind: so-called “metaphysical possibility”,
“absolute possibility”, “idealized epistemic possibility”…. Clearly the
acceptability of the proposal depends on the legitimacy of these notions. We discuss
modality later in the book, beginning in chapter 6.) One might then propose
that φ is a logical consequence of the members of Γ iff Γ modally implies φ. (An
intermediate proposal: φ is a logical consequence of the members of Γ iff, in
virtue of the forms of φ and the members of Γ, Γ modally implies φ. More carefully:
φ is a logical consequence of the members of Γ iff Γ modally implies φ, and
moreover, whenever Γ′ and φ′ result from Γ and φ by (uniform) substitution of
nonlogical expressions, Γ′ modally implies φ′. This is like Quine’s definition,
but with modal implication in place of truth-preservation.) Second, there is
a primitivist account, according to which logical consequence is a primitive
notion. Third, there is a pluralist account according to which there is no one
kind of genuine logical consequence. There are, of course, the various
concepts proposed by each account, each of which is trying to capture genuine
logical consequence; but in fact there is no further notion of genuine logical
consequence at all; there are only the proposed construals.
As I say, this is not a book on philosophical logic, and so we will not inquire
further into which (if any) of these accounts is correct. We will, rather, focus
exclusively on two kinds of formal proposals for modeling logical consequence
and logical truth: model-theoretic and proof-theoretic proposals.
1.6 Extensions, deviations, variations⁴
“Standard logic” is what is usually studied in introductory logic courses. It
includes propositional logic (logical constants: ∧, ∨, ∼, →, ↔), and predicate
logic (logical constants: ∀, ∃, variables). In this book we’ll consider various
modifications of standard logic:
1.6.1 Extensions
Here we add to standard logic. We add both:
• new symbols
• new cases of logical consequence and logical truth that we can
model
We do this in order to get a better representation of logical consequence. There
is more to logic than that captured by plain old standard logic.
We extended propositional logic, after all, to get predicate logic. You can
do a lot with propositional logic, but you can’t capture the obvious fact that
‘Ted is happy’ logically implies ‘someone is happy’ using propositional logic
alone. It was for this reason that we added quantifiers, variables, predicates,
etc., to propositional logic (added symbols), and added means to deal with
these new symbols in semantics and proof theory (new cases of logical
consequence and logical truth). But there is no need to stop with plain old predicate
logic. We will consider adding symbols for identity, function symbols, and
definite descriptions to predicate logic, for example, and we’ll add a sign for
“necessarily” when we get to modal logic. And in each case, we’ll introduce
modifications to our formal theories that let us account for logical truths and
logical consequences involving the new symbols.
⁴ See Gamut (1991a, pp. 156–158).
1.6.2 Deviations
Here we change, rather than add. We retain the same symbols from standard
logic, but we alter standard logic’s proof theory and semantics. We therefore
change what we say about the logical consequences and logical truths that
involve the symbols.
Why do this? Perhaps because we think that standard logicians are wrong
about what the right logic for English is. If we want to correctly model logical
consequence in English, therefore, we must construct systems that behave
differently from standard logic.
For example, in the standard semantics for propositional logic, every
formula is either true or false. But some have argued that natural language
sentences like the following are neither true nor false:
The king of the United States is bald
Sherlock Holmes weighs more than 178 pounds
Bill Clinton is tall
There will be a sea battle tomorrow
If so, perhaps we should abandon the standard semantics for propositional logic
in favor of multi-valued logic, in which formulas are allowed to be neither true
nor false.
1.6.3 Variations
Here we also change standard logic, but we change the notation without
changing the content of logic. We study alternate ways of expressing the same
thing.
For example, in intro logic we show how:
∼(P∧Q)
∼P∨∼Q
are two different ways of saying the same thing. We will study other ways of
saying what those two sentences say, including:
P|Q
∼∧PQ
In the first case, | is a new symbol for “not both”. In the second case (“Polish
notation”), the ∼ and the ∧ mean what they mean in standard logic; but instead
of going between the P and the Q, the ∧ goes before P and Q. The value of
this, as we’ll see, is that we no longer will need parentheses.
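To see why Polish notation can do without parentheses, here is a minimal Python sketch. It is an illustration only: the class names Atom, Not, and And and the functions infix and polish are hypothetical, introduced just for this example. Because each connective is written before the formulas it applies to, the grouping can be read off from left to right.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str          # a sentence letter such as "P"


@dataclass(frozen=True)
class Not:
    sub: object        # the negated formula


@dataclass(frozen=True)
class And:
    left: object       # first conjunct
    right: object      # second conjunct


def infix(f):
    """Standard notation: the ∧ goes between its conjuncts, with parentheses."""
    if isinstance(f, Atom):
        return f.name
    if isinstance(f, Not):
        return "∼" + infix(f.sub)
    return "(" + infix(f.left) + "∧" + infix(f.right) + ")"


def polish(f):
    """Polish notation: every connective goes before its arguments."""
    if isinstance(f, Atom):
        return f.name
    if isinstance(f, Not):
        return "∼" + polish(f.sub)
    return "∧" + polish(f.left) + polish(f.right)


not_both = Not(And(Atom("P"), Atom("Q")))
print(infix(not_both))   # standard notation
print(polish(not_both))  # Polish notation
```

Two different groupings of P, Q, and R come out as different strings in Polish notation even though no parentheses are written, which is the point of the notation.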
1.7 Metalogic, metalanguages, and formalization
In introductory logic, we learned how to use certain logical systems. We learned
how to do truth tables, construct derivations, and so on. But logicians do not
spend much of their time developing systems only to sit around all day doing
derivations in those systems. As soon as a logician develops a new system, he
or she will begin to ask questions about that system. For an analogy, imagine
people who make up games. They might invent a new version of chess. Now,
they might spend some time actually playing the new game. But if they were
like logicians, they would soon get bored with this and start asking questions
about the game, such as: “Is the average length of this new game longer than the
average length of a game of standard chess?” “Is there any strategy one could
pursue which will guarantee a victory?” Analogously, logicians ask questions
like: what things can be proved in such and such a system? Can you prove the
same things in this system as in system X? Proving things about logical systems
is part of “metalogic”, which is an important part of logic.
One particularly important question of metalogic is that of soundness and
completeness. Standard textbooks introduce a pair of methods for characterizing
logical truth for the formulas of propositional logic. One is semantic: a formula
is a semantic logical truth iff the truth table for that formula has all “trues” in its
final column. Another is proof-theoretic: a sentence is a proof-theoretic logical
truth iff there exists a derivation of it (from no premises), where a derivation
is then appropriately defined. (Think: introduction and elimination rules,
conditional and indirect proof, and so on.) The question of soundness and
completeness is: how do these two methods for characterizing logical truth
relate to each other? The question is answered, in the case of propositional
logic, by the following metalogical results, which are proved in standard books
on metalogic:
Soundness of propositional logic: Any proof-theoretic logical
truth is a semantic logical truth
Completeness of propositional logic: Any semantic logical truth
is a proof-theoretic logical truth
These are really interesting claims! They show that the method of truth tables
and the method of constructing derivations amount to the same thing, as
applied to symbolic formulas of propositional logic. One can establish similar
results for standard predicate logic.
A couple of remarks about proving things in metalogic.
First: what do we mean by “proving”? We do not mean: constructing a
derivation in the logical system we’re investigating. We’re trying to construct a
proof about the system. We do this in English, and we do it with informal (though
rigorous!) reasoning of the sort one would encounter in a mathematics book.
Logicians often distinguish the “object language” from the “metalanguage”.
The object language is the language that’s being studied—the language of
propositional logic, for example. Sentences of this object language look like
this:
P∧Q
∼(P∨Q)↔R
The metalanguage is the language we use to talk about the object language.
In the case of the present book, the metalanguage is English. Here are some
example sentences of the metalanguage:
‘P∧Q’ is a sentence with three symbols, one of which is a logical
constant
Every sentence of propositional logic has the same number of left
parentheses as right parentheses
If there exists a derivation of a formula, then its truth table contains
all “trues” in its final column (i.e., soundness)
Thus, we formulate metalogical claims in the metalanguage, and our proofs in
metalogic take place in the metalanguage.
Second: to get anywhere in metalogic, we will have to get picky about a few
things about which one can afford to be lax in introductory logic. Let’s look at
soundness, for instance. To be able to prove this, in a mathematically rigorous
way, we’ll need to have the terms in it defined very carefully. In particular, we’ll
need to say exactly what we mean by ‘sentence of propositional logic’, ‘truth
tables’, and ‘derived’. Defining these terms precisely (another thing we’ll do
using English, the metalanguage!) is known as formalizing logic. Our first task
will be to formalize propositional logic.
1.8 Set theory⁵
As mentioned above, modern logic uses mathematical techniques to study
formal languages. The mathematical techniques in question are those of “set
theory”. Only the most elementary set-theoretic concepts and assumptions will
be needed, and you are probably already familiar with them; but nevertheless,
here is a brief overview.
Sets have members. Consider, for example, the set, N, of natural numbers.
Each natural number is a member of N: 1 is a member of N, 2 is a member of N,
and so on. We use the expression “∈” for this relationship of membership; thus,
we can say: 1 ∈ N, 2 ∈ N, and so on. We often name a set by putting names of
its members between braces: “{1, 2, 3, 4, …}” is another name of N.
Sets are not limited to sets of mathematical entities; anything can be a
member of a set. Thus, we may speak of the set of people, the set of cities,
or—to draw nearer to our intended purpose—the set of sentences in a given
language.
There is also the empty set, ∅. This is the one set with no members. That
is, for each object u, u is not a member of ∅ (i.e.: for each u, u ∉ ∅.)
⁵ Supplementary reading: the beginning of Enderton (1977)
Though the notion of a set is an intuitive one, it is deeply perplexing. This
can be seen by reflecting on the Russell Paradox, discovered by Bertrand Russell,
the great philosopher and mathematician. Let us call R the set of all and only
those sets that are not members of themselves. For short, R is the set of
non-self-members. Russell asks the following question: is R a member of itself?
There are two possibilities:
i) R ∉ R. Thus, R is a non-self-member. But R was said to
be the set of all non-self-members, and so we’d have R ∈ R.
Contradiction.
ii) R ∈ R. So R is not a non-self-member. R, by definition,
contains only non-self-members. So R ∉ R. Contradiction.
Thus, each possibility leads to a contradiction. But there are no remaining
possibilities—either R is a member of itself or it isn’t! So it looks like the very
idea of sets is paradoxical.
The modern discipline of axiomatic set theory arose in part to develop a
notion of sets that isn’t subject to this sort of paradox. This is done by imposing
rigid restrictions on when a given “condition” picks out a set. In the example
above, the condition “is a non-self-member” will be ruled out: there’s no set
of all and only the things satisfying this condition. The details of set theory are
beyond the scope of this course; for our purposes, we’ll help ourselves to the
existence of sets, and not worry about exactly what sets are, or how the Russell
paradox is avoided.
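The two cases of Russell's argument can be compressed into a single line. The following LaTeX display is only a restatement in symbols, under the naive assumption (the one axiomatic set theory rejects) that the condition “is not a member of itself” picks out a set R:

```latex
\[
  R = \{\, x : x \notin x \,\}
  \quad\Longrightarrow\quad
  \bigl( R \in R \;\Longleftrightarrow\; R \notin R \bigr)
\]
```

Since nothing can both be and fail to be a member of R, the assumption that there is such a set must be given up.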
Various other useful set-theoretic notions can be defined in terms of the
notion of membership. We say that A is a subset of B (“A ⊆ B”) when every
member of A is a member of B. We say that the intersection of A and B (“A ∩ B”)
is the set that contains all and only those things that are in both A and B, and
that the union of A and B (“A ∪ B”) is the set containing all and only those things
that are members of either A or B.
Suppose we want to refer to the set of the so-and-sos, that is, the set
containing all and only objects, u, that satisfy the condition “so-and-so”. We’ll
do this with the term “{u : u is a so-and-so}”. Thus, we could write: “N = {u :
u is a natural number}”. And we could restate the definitions of ∩ and ∪ from
the previous paragraph as follows:
A ∩ B =_df {u : u ∈ A and u ∈ B}
A ∪ B =_df {u : u ∈ A or u ∈ B}
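For readers who like to experiment, the definitions of subset, intersection, and union can be mimicked with Python's built-in sets. This is only a sketch under stated assumptions: the function names is_subset, intersection, and union are hypothetical, and Python sets are finite, so a set like N cannot be represented this way.

```python
def is_subset(a, b):
    """A ⊆ B: every member of A is a member of B."""
    return all(u in b for u in a)


def intersection(a, b):
    """A ∩ B: all and only the things that are in both A and B."""
    return {u for u in a if u in b}


def union(a, b):
    """A ∪ B: all and only the things that are in either A or B."""
    return {u for u in [*a, *b]}


A = {1, 2, 3}
B = {3, 4}
print(intersection(A, B))
print(union(A, B))
print(is_subset({3}, A))
```

The set comprehensions are written to mirror the set-builder notation: each one collects the objects u that satisfy the stated membership condition.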
Sets have members, but they don’t contain them in any particular order. For
example, the set containing me and Bill Clinton doesn’t have a “first” member.
This is reflected in the fact that “{Ted, Clinton}” and “{Clinton, Ted}” are
two different names for the same set, the set containing just Clinton and Ted.
But sometimes we need to talk about a set-like thing containing Clinton and
Ted, but in a certain order. For this purpose, logicians use ordered sets.
Two-membered ordered sets are called ordered pairs. To name the ordered pair of
Clinton and Ted, we use: “〈Clinton, Ted〉”. Here, the order is significant, for
〈Clinton, Ted〉 and 〈Ted, Clinton〉 are not the same thing. The three-membered
ordered set of u, v, and w (in that order) is written: 〈u, v, w〉; and similarly for
ordered sets of any finite size. An n-membered ordered set is called an n-tuple.
Let’s even allow 1-tuples: let’s define the 1-tuple 〈u〉 as being the object u itself.
In addition to sets, and ordered sets, we’ll need a further related concept:
that of a function. A function is a rule that “takes in” an object or objects,
and “spits out” a further object. For example, the addition function is a rule
that takes in two numbers, and spits out their sum. As with sets and ordered
sets, functions are not limited to mathematical entities: they can “take in” and
“spit out” any objects whatsoever. We can speak of the father-of function, for
example, which is a rule that takes in a person, and spits out the father of that
person. And later in this book we will be considering functions that take in
and spit out linguistic entities: sentences and parts of sentences from formal
languages.
Each function has a fixed number of “places”: a fixed number of objects
it must take in before it is ready to spit out something. You need to give
the addition function two arguments (numbers) in order to get it to spit out
something, so it is called a two-place function. You only need to give the
father-of function one object, on the other hand, to get it to spit out something, so it
is a one-place function.
The objects that the function takes in are called its arguments, and the object
it spits out is called its value. Suppose f is an n-place function, and u_1 … u_n are
n of its arguments; one then writes “f(u_1 … u_n)” for the value of function f as
applied to arguments u_1 … u_n. f(u_1 … u_n) is the object that f spits out, if you
feed it u_1 … u_n. For example, where f is the father-of function, since Ron is
my father, we can write: f(Ted) = Ron; and, where a is the addition function,
we can write: a(2, 3) = 5.
There’s a trick for “reducing” talk of both ordered pairs and functions to
talk of sets. One first defines 〈u, v〉 as the set {{u}, {u, v}}; one defines 〈u, v, w〉
as 〈u, 〈v, w〉〉, and similarly for n-membered ordered sets, for each positive
integer n. And, finally, one defines an n-place function as a set, f, of
(n+1)-tuples obeying the constraint that if 〈u_1, …, u_n, v〉 and 〈u_1, …, u_n, w〉 are both
members of f, then v = w; f(u_1, …, u_n) is then defined as the object, v, such
that 〈u_1, …, u_n, v〉 ∈ f. Thus, ordered sets and functions are defined as certain
sorts of sets. The trick of the definition of ordered pairs is that we put the
set together in such a way that we can look at the set and tell what the first
member of the ordered pair is: it’s the one that “appears twice”. Similarly, the
trick of the definition of a function is that we can take any arguments to the
function, look at the set that is identified with the function, and figure out
what value the function spits out for those arguments. But the technicalities of
these reductions won’t matter for us; I’ll just feel free to speak of ordered pairs,
triples, functions, etc., without defining them as sets.
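The two reductions can be tried out concretely. The Python sketch below is an illustration under stated assumptions, not part of the official development: the names pair, first, second, is_function, and apply are hypothetical, frozenset stands in for “set”, and for brevity the function example uses Python's built-in tuples for the (n+1)-tuples rather than building them out of sets.

```python
def pair(u, v):
    """The ordered pair of u and v, built as the set {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})


def first(p):
    """The first member is the one that appears in every member of p."""
    (u,) = frozenset.intersection(*p)
    return u


def second(p):
    """The second member is the other object, if there is one."""
    rest = frozenset.union(*p) - {first(p)}
    if not rest:              # the pair of u with itself is just {{u}}
        return first(p)
    (v,) = rest
    return v


def is_function(f):
    """No two tuples in f share all their arguments but differ in value."""
    seen = {}
    for *args, value in f:
        key = tuple(args)
        if key in seen and seen[key] != value:
            return False
        seen[key] = value
    return True


def apply(f, *args):
    """The value v such that the tuple of the arguments followed by v is in f."""
    (value,) = {t[-1] for t in f if t[:-1] == args}
    return value


father_of = {("Ted", "Ron")}                                   # one-place
addition = {(m, n, m + n) for m in range(5) for n in range(5)}  # two-place, small domain

print(pair("Clinton", "Ted") == pair("Ted", "Clinton"))
print(apply(father_of, "Ted"), apply(addition, 2, 3))
```

Note that apply simply looks through the set for the tuple that begins with the given arguments, which is exactly how the value of a function is read off from the set identified with it.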
Chapter 2
Propositional Logic
We begin with the simplest logic commonly studied: propositional logic.
Despite its simplicity, it has great power and beauty.
2.1 Grammar of PL
Modern logic has made great strides by treating the language of logic as a
mathematical object. To do so, grammar needs to be developed rigorously.
(Our study of a new logical system will always begin with grammar.)
If all you want to do is understand the language of logic informally, and be
able to use it effectively, you don’t really need to get so careful about grammar.
For even if you haven’t ever seen the grammar of propositional logic formalized,
you can recognize that things like this make sense:
P→Q
R∧(∼S↔P)
Whereas things like this do not:
→PQR∼
(P∼Q∼(∨
But to make any headway in metalogic, we will need more than an intuitive
understanding of what makes sense and what does not; we will need a precise
definition that has the consequence that only the strings of symbols in the first
group “make sense”.
CHAPTER 2. PROPOSITIONAL LOGIC 19
Grammatical formulas (i.e., ones that “make sense”) are called well-formed
formulas, or “wffs” for short. We define these by first carefully defining exactly
which symbols are allowed to occur in wffs (the “primitive vocabulary”), and
second, carefully defining exactly which strings of these symbols count as wffs:
Primitive vocabulary:
Sentence letters: P, Q, R, …, with or without numerical
subscripts
Connectives: →, ∼
Parentheses: (, )
Definition of wff:
i) Sentence letters are wffs
ii) If φ and ψ are wffs then (φ→ψ) and ∼φ are also wffs
iii) Only strings that can be shown to be wffs using i) and ii) are
wffs
Notice an interesting feature of this definition: the very expression we
are trying to define, ‘wff’, appears on the right hand side of clause ii) of the
definition. In a sense, we are using the expression ‘wff’ in its own definition.
Is that “circular”? Not in any objectionable way. This definition is what is
called a “recursive” definition, and recursive definitions are legitimate despite
this sort of circularity. The reason is that clause ii) defines the notion of a
wff for certain complex expressions (namely, ∼φ and (φ→ψ)) in terms of the
notion of a wff as applied to smaller expressions (φ and ψ). These smaller
expressions may themselves be complex, and therefore may have their statuses
as wffs determined, via clause ii), in terms of yet smaller expressions, and so on.
But eventually this procedure will lead us to clause i), not clause ii). And clause
i) is not circular: in that clause, we do not appeal to the notion of a wff in its
own definition; we rather say directly that sentence letters are wffs. Recursive
definitions always “bottom out” in this way; they always include a clause (called
the “base” clause) like i).
Think of this procedure in reverse: we begin with the smallest wffs (sentence
letters), and build up complex wffs using clause ii). Example: we can use clauses
i) and ii) to show that the expression (∼P→(P→Q)) is a wff:
• P is a wff (clause i))
• so, ∼P is a wff (clause ii))
• Q is a wff (clause i))
• so, since P and Q are both wffs, (P→Q) is also a wff (clause
ii))
• so, since ∼P and (P→Q) are both wffs, (∼P→(P→Q)) is also
a wff (clause ii))
What’s the point of clause iii)? Clauses i) and ii) provide only sufficient
conditions for being a wff, and therefore do not on their own exclude
nonsense combinations of primitive vocabulary like P∼Q∼R, or even strings like
(P∨147)→⊕ that include disallowed symbols. Clause iii) rules these strings out,
since there is no way to build up either of these strings from clauses i) and ii),
in the way that we built up the wff (∼P→(P→Q)).
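A recursive definition of this kind translates almost directly into a recursive procedure. The Python sketch below is only an illustration under stated assumptions: the names is_wff and _parse are hypothetical; a sentence letter is represented as one capital letter optionally followed by digits (standing in for a numerical subscript); and conditionals are taken to carry outer parentheses, as in the worked example (∼P→(P→Q)).

```python
import re

# Assumption: a sentence letter is a capital letter plus optional digits.
LETTER = re.compile(r"[A-Z][0-9]*")


def _parse(s, i):
    """Return the index just past one wff starting at s[i], or None."""
    if i >= len(s):
        return None
    if s[i] == "∼":                      # clause ii): ∼φ
        return _parse(s, i + 1)
    if s[i] == "(":                      # clause ii): (φ→ψ)
        j = _parse(s, i + 1)
        if j is None or j >= len(s) or s[j] != "→":
            return None
        k = _parse(s, j + 1)
        if k is None or k >= len(s) or s[k] != ")":
            return None
        return k + 1
    m = LETTER.match(s, i)               # clause i): sentence letters
    return m.end() if m else None


def is_wff(s):
    """Clause iii): a string is a wff only if clauses i) and ii) build all of it."""
    return _parse(s, 0) == len(s)


print(is_wff("(∼P→(P→Q))"))
print(is_wff("P∼Q∼R"))
print(is_wff("(P∨147)→⊕"))
```

The recursion in _parse always moves to a strictly shorter remainder of the string, so it bottoms out in the base clause just as the definition does.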
What happened to ∧, ∨, and ↔? Our definition of a wff mentions only
→ and ∼; it therefore counts expressions like P∧Q, P∨Q, and P↔Q as not
being wffs. Answer: we can define the ∧, ∨, and ↔ in terms of ∼ and →:
φ∧ψ =_df ∼(φ→∼ψ)
φ∨ψ =_df ∼φ→ψ
φ↔ψ =_df (φ→ψ)∧(ψ→φ)
(The symbol “=_df” means “is defined as meaning”.) So, whenever we
subsequently write down an expression that includes one of the defined connectives,
we can regard it as being short for an expression that includes only the official
primitive connectives, ∼ and →. (We will show below that the above definitions
are good ones; in short, they are good because they generate the correct truth
tables for ∧, ∨, and ↔.)
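The idea that the defined connectives are mere abbreviations can be made concrete with a small Python sketch. Everything about the encoding is an assumption made for illustration: formulas are nested tuples tagged "atom", "not", "imp", "and", "or", or "iff", and the function name expand is hypothetical. The function rewrites a formula until only the primitive tags "not" and "imp" remain.

```python
def expand(f):
    """Rewrite a formula so that it uses only the primitive ∼ and →."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", expand(f[1]))
    if tag == "imp":
        return ("imp", expand(f[1]), expand(f[2]))
    if tag not in ("and", "or", "iff"):
        raise ValueError(f"unknown connective: {tag!r}")
    a, b = f[1], f[2]
    if tag == "and":   # φ∧ψ is short for ∼(φ→∼ψ)
        return ("not", ("imp", expand(a), ("not", expand(b))))
    if tag == "or":    # φ∨ψ is short for ∼φ→ψ
        return ("imp", ("not", expand(a)), expand(b))
    # φ↔ψ is short for (φ→ψ)∧(ψ→φ), whose ∧ must itself be expanded
    return expand(("and", ("imp", a, b), ("imp", b, a)))


P, Q = ("atom", "P"), ("atom", "Q")
print(expand(("and", P, Q)))
print(expand(("iff", P, Q)))
```

The last case shows that the definition of ↔ takes two steps to unpack, since it is stated using ∧, which is itself a defined connective.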
Our choice to begin with → and ∼ as our primitive connectives was arbitrary.
We could have started with ∼ and ∧, and defined the others as follows:
φ∨ψ =_df ∼(∼φ∧∼ψ)
φ→ψ =_df ∼(φ∧∼ψ)
φ↔ψ =_df (φ→ψ)∧(ψ→φ)
And other alternate choices are possible. We’ll talk about this later.
So: → and ∼ are our primitive connectives; the others are defined. Why
do we choose only a small number of primitive connectives? Because, as we
will see, it makes metaproofs easier.
2.2 The semantic approach to logic
In the next section we will introduce a “semantics” for propositional logic. A
semantics for a language is a way of assigning meanings to words and sentences
of that language. For us, the central notion of meaning will be that of truth.
Roughly speaking, our approach will be to define, for each wff of propositional
logic, the circumstances in which it is true.
Philosophers disagree over how to understand the notion of meaning in
general. But the idea that the meaning of a sentence has something to do with
truth-conditions is hard to deny, and at any rate has currency within logic. On
this approach, one explains the meaning of a sentence by showing how that
sentence depends for its truth or falsity on the way the world is.
We will provide a truth-conditional semantics for a symbolic (formal)
language in two stages. First, we will define mathematical models of the various
configurations the world could be in. Second, we will define the conditions
under which a sentence of the symbolic language is true in one of these
mathematical configurations.
These definitions will have two main benefits. First, they will illuminate
meaning. In logic, the symbols of symbolic languages are typically intended to
represent bits of natural language. The PL connectives, for example, represent
‘and’, ‘or’, and so on. If the definitions are well-constructed, then the ways
in which the configurations render symbolic sentences true and false will be
parallel to the ways in which the real world renders corresponding natural
language sentences true and false. The definitions will therefore shed light on
the meanings of the natural language sentences represented by our symbolic
language. Second, our definitions will allow us to construct a precise theory (of
the semantic/model-theoretic variety) of logical consequence and logical truth.
The semantic conception is a way of making precise the idea that a logical truth
is a sentence that is “true no matter what”, and the idea that one sentence is a
logical consequence of some other sentences iff there is “no way” for the latter
sentences to be true without the former sentence being true. We will use our
definitions to make these rough statements more precise: we will say that one
formula is a logical consequence of others iff there is no configuration in which
the latter formulas are true but the former is not, and that a formula is a logical
truth iff it is true in all configurations.
2.3 Semantics of PL
Our semantics for propositional logic is really just truth tables, only presented
a little more carefully than in introductory logic books. Let’s define the notion
of a propositional logic interpretation function, or PL-interpretation:
A PL-interpretation is a function, ·, that assigns either 1 or 0 to
every sentence letter of PL
Think of 0 and 1 as truth values; thus a PL-interpretation assigns truth values
to sentence letters. Instead of saying “let P be false, and Q be true”, we can
say: let · be an interpretation such that ·(P) = 0 and ·(Q) = 1.
Once we settle what truth values a given interpretation assigns to the
sentence letters, the truth values of complex sentences containing those sentence
letters are thereby fixed. The usual, informal, method for showing exactly how
those truth values are fixed is by giving truth tables. The standard truth tables
for the → and ∼ are the following:
φ  →  ψ
1  1  1
1  0  0
0  1  1
0  1  0

∼  φ
1  0
0  1
(In each row, the truth value of the compound formula appears under its connective.)
What we will do, instead, is write out a formal definition of a function (the
valuation function) that assigns truth values to complex sentences as a function
of the truth values of their sentence letters, i.e., as a function of a given
PL-interpretation ·. But the idea is the same as the truth tables: truth tables are
really just pictures of the definition of a valuation function:
For any PL-interpretation, ·, the PL-valuation for ·, V_·, is defined
as the function that assigns to each wff either 1 or 0, and which is
such that, for any wffs φ and ψ,
i) if φ is a sentence letter then V_·(φ) = ·(φ)
ii) V_·(φ→ψ) = 1 iff either V_·(φ) = 0 or V_·(ψ) = 1
iii) V_·(∼φ) = 1 iff V_·(φ) = 0
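The three clauses of this definition can be transcribed into a short recursive function. The Python sketch below is an illustration under stated assumptions: formulas are encoded as nested tuples tagged "atom", "not", and "imp"; an interpretation is a dictionary that lists values only for the finitely many sentence letters in play (an official PL-interpretation assigns a value to every sentence letter); and the function name valuation is hypothetical.

```python
def valuation(interp, f):
    """Truth value (1 or 0) of formula f under the interpretation interp."""
    tag = f[0]
    if tag == "atom":                     # clause i): look the letter up
        return interp[f[1]]
    if tag == "imp":                      # clause ii): 1 iff antecedent 0 or consequent 1
        antecedent = valuation(interp, f[1])
        consequent = valuation(interp, f[2])
        return 1 if antecedent == 0 or consequent == 1 else 0
    if tag == "not":                      # clause iii): 1 iff the negated formula gets 0
        return 1 if valuation(interp, f[1]) == 0 else 0
    raise ValueError(f"not an official wff: {f!r}")


P, Q = ("atom", "P"), ("atom", "Q")
interp = {"P": 0, "Q": 1}                 # "let P be false, and Q be true"

print(valuation(interp, ("imp", P, Q)))
print(valuation(interp, ("not", ("imp", Q, P))))
```

As in the definition, the Python words "or" and "if" belong to the language in which the function is written, not to the formulas being evaluated.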
We have another recursive definition: the valuation function’s values for
complex formulas are determined by its values for smaller formulas; and this
procedure bottoms out in the values for sentence letters, which are determined
directly by the interpretation function ·.
Notice also that in the definition of a valuation function I use the English
logical connectives ‘either…or’, and ‘iff’. I used these English connectives
rather than the logical connectives ∨ and ↔, because at that point I was not
writing down wffs of the language of study (in this case, the language of
propositional logic). I was rather using sentences of English (our metalanguage, the
informal language we’re using to discuss the formal language of propositional
logic) to construct my definition of the valuation function. My definition
needed to employ the logical notions of disjunction and biimplication, the
English words for which are ‘either…or’ and ‘iff’.
One might again worry that something circular is going on. We defined the
symbols for disjunction and biimplication, ∨ and ↔, in terms of ∼ and → in
section 2.1, and now we’ve defined the valuation function in terms of disjunction
and biimplication. So haven’t we given a circular definition of disjunction and
biimplication? No. When we define the valuation function, we’re not trying to
define logical concepts such as negation, conjunction, disjunction, implication,
and biimplication at all. A reductive definition of these very basic
concepts is probably impossible (though one can define some of them in terms
of the others). What we are doing is i) starting with the assumption that we
already understand the logical concepts, and then ii) using those notions to
provide a formalized semantics for a logical language. This can be put in terms
of object and metalanguage: we use metalanguage connectives, such as ‘iff’
and ‘or’, which we simply take ourselves to understand, to provide a semantics
for the object language connectives ∼, →, etc.
Back to the definition of the valuation function. The definition applies
only to official wffs, which can contain only the primitive connectives → and
∼. But sentences containing ∧, ∨, and ↔ are abbreviations for official wffs,
and therefore they too are governed by the definition. In fact, given the
abbreviations defined in section 2.1, one can show that the definition assigns
the intuitively correct truth values to sentences containing ∧, ∨, and ↔; one
can show that for any PL-interpretation ·, and any wffs ψ and χ,
V_·(ψ∧χ) = 1 iff V_·(ψ) = 1 and V_·(χ) = 1
V_·(ψ∨χ) = 1 iff either V_·(ψ) = 1 or V_·(χ) = 1
V_·(ψ↔χ) = 1 iff V_·(ψ) = V_·(χ)
I’ll show that the first statement is true here; the others are exercises for the
reader. I’ll write out my proof in excessive detail, to make it clear exactly how
the reasoning works:
Let ψ and χ be any wffs. The expression ψ∧χ is an abbreviation
for the expression ∼(ψ→∼χ). So we want to show that, for any
PL-interpretation ·, V_·(∼(ψ→∼χ)) = 1 iff V_·(ψ) = 1 and V_·(χ) = 1.
Now, in order to show that a statement α holds iff a statement
β holds, we must first show that if α holds, then β holds (the
“forwards ⇒ direction”); then we must show that if β holds then α
holds (the “backwards ⇐ direction”):
⇒: First assume that V_·(∼(ψ→∼χ)) = 1. Then, by definition
of the valuation function, clause for ∼, V_·(ψ→∼χ) = 0. So,¹
V_·(ψ→∼χ) is not 1. But then, by the clause in the definition of V_·
for the →, we know that it’s not the case that: either V_·(ψ) = 0 or
V_·(∼χ) = 1. That is: V_·(ψ) = 1 and V_·(∼χ) = 0. From the latter, by
the clause for ∼, we know that V_·(χ) = 1. That’s what we wanted
to show: that V_·(ψ) = 1 and V_·(χ) = 1.
⇐: This is sort of like undoing the previous half. Suppose that
V_·(ψ) = 1 and V_·(χ) = 1. Since V_·(χ) = 1, by the clause for ∼,
V_·(∼χ) = 0; but now since V_·(ψ) = 1 and V_·(∼χ) = 0, by the
clause for → we know that V_·(ψ→∼χ) = 0; then by the clause for
∼, we know that V_·(∼(ψ→∼χ)) = 1, which is what we were trying
to show.
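The claim just proved, along with the two left as exercises, can also be checked by brute force. Since the clauses for → and ∼ look only at the truth values of the immediate parts, it is enough to run through the four combinations of values that ψ and χ can take. The Python sketch below is an illustration, not a replacement for the proof; the function names v_not, v_imp, v_and, v_or, and v_iff are hypothetical.

```python
from itertools import product


def v_not(a):
    """Clause for ∼: 1 iff the negated formula has value 0."""
    return 1 if a == 0 else 0


def v_imp(a, b):
    """Clause for →: 1 iff the antecedent is 0 or the consequent is 1."""
    return 1 if a == 0 or b == 1 else 0


def v_and(a, b):
    """ψ∧χ abbreviates ∼(ψ→∼χ)."""
    return v_not(v_imp(a, v_not(b)))


def v_or(a, b):
    """ψ∨χ abbreviates ∼ψ→χ."""
    return v_imp(v_not(a), b)


def v_iff(a, b):
    """ψ↔χ abbreviates (ψ→χ)∧(χ→ψ)."""
    return v_and(v_imp(a, b), v_imp(b, a))


for a, b in product((1, 0), repeat=2):
    assert (v_and(a, b) == 1) == (a == 1 and b == 1)
    assert (v_or(a, b) == 1) == (a == 1 or b == 1)
    assert (v_iff(a, b) == 1) == (a == b)
    print(a, b, v_and(a, b), v_or(a, b), v_iff(a, b))
```

Each printed row lists the values of ψ and χ followed by the computed values of ψ∧χ, ψ∨χ, and ψ↔χ, so the output is in effect the familiar truth tables for the defined connectives.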
Let’s reflect on what we’ve done so far. We have defined the notion of
a PL-interpretation, which assigns 1s and 0s to sentence letters of the
symbolic language of propositional logic. And we have also defined, for any
PL-interpretation, a corresponding valuation function, which extends the
interpretation’s assignment of 1s and 0s to complex wffs of PL. Note that we have
been informally speaking of these assignments as assignments of truth values.
That’s because the assignments of 1s and 0s accurately model the truth values
of statements in English that are represented in the obvious way by PL-wffs.
1
The careful reader will note that here (and henceforth), I treat “V
·
(α)=0” and “V
·
(α) is
not 1” interchangeably (for any wff α). (Similarly for “V
·
(α)=1” and “V
·
(α) is not 0”.) This is
justied as follows. First, if V
·
(α) is 0, then it can’t also be that V
·
(α) is 1—V
·
was stipulated
to be a function. Second, since it was stipulated that V
·
assigns either 0 or 1 to each wff, if
V
·
(α) is not 1, then V
·
(α) must be 0.
CHAPTER 2. PROPOSITIONAL LOGIC 25
For example, the ∼ of propositional logic is supposed to model the English
phrase ‘it is not the case that’. Accordingly, just as an English sentence “It is
not the case that φ” is true iff φ is false, one of our valuation functions assigns
1 to ∼φ iff it assigns 0 to φ.
Semantics in logic, recall, generally defines two things: configurations and
truth-in-a-configuration. In the propositional logic semantics we have laid
out, the configurations are the interpretation functions, and the valuation
function defines truth-in-a-configuration. Each interpretation function gives
a complete assignment of truth values to the sentence letters. Thus, insofar
as the sentence letters are concerned, an interpretation function completely
specifies a possible configuration of the world. And for any interpretation
function, its corresponding valuation function specifies, for each complex wff,
what truth value that wff has in that interpretation. Thus, for each wff (φ) and
each configuration ($\mathcal{I}$), we have specified the truth value of that wff
in that configuration ($V_{\mathcal{I}}(\phi)$).
Onward. We are now in a position to define the semantic versions of the
notions of logical truth and logical consequence for PL. The semantic notion
of a logical truth is that of a valid formula:

φ is valid =df for every PL-interpretation, $\mathcal{I}$, $V_{\mathcal{I}}(\phi)=1$

We write “⊨ φ” for “φ is valid”. (The valid formulas of propositional logic
are also called tautologies.) As for logical consequence, the semantic version of
this notion is that of a single formula’s being a semantic consequence of a set of
formulas:

φ is a semantic consequence of the wffs in set Γ =df for every
PL-interpretation, $\mathcal{I}$, IF for each γ in Γ, $V_{\mathcal{I}}(\gamma)=1$,
THEN $V_{\mathcal{I}}(\phi)=1$

That is, φ is a semantic consequence of Γ iff φ is true whenever each member
of Γ is true. We write “Γ ⊨ φ” for “φ is a semantic consequence of Γ”. (When
Γ has just one member, ψ, then instead of writing {ψ} ⊨ φ, let’s write simply:
ψ ⊨ φ.)
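
To see the definition of semantic consequence at work, here is a short worked example, offered as an illustration rather than as part of the official development. It shows that, for any wffs ψ and χ, χ is a semantic consequence of the set containing ψ and ψ→χ. The only thing used is the clause for → in the definition of the valuation function.

```latex
% Worked example (illustrative): \{\psi,\ \psi\to\chi\} \vDash \chi
\begin{align*}
&\text{Let } \mathcal{I} \text{ be any PL-interpretation with }
  V_{\mathcal{I}}(\psi)=1 \text{ and } V_{\mathcal{I}}(\psi\to\chi)=1.\\
&\text{By the clause for } \to:\quad
  V_{\mathcal{I}}(\psi)=0 \ \text{ or } \ V_{\mathcal{I}}(\chi)=1.\\
&\text{Since } V_{\mathcal{I}}(\psi)=1 \text{, it is not 0, and so }
  V_{\mathcal{I}}(\chi)=1.\\
&\text{Since } \mathcal{I} \text{ was arbitrary: }\quad
  \{\psi,\ \psi\to\chi\} \vDash \chi.
\end{align*}
```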
We will be discussing a number of different semantical systems throughout
this book, each with its own definition of validity and semantic consequence.
What we have defined here is validity and semantic consequence in just one
system: PL. So strictly, we should speak of PL-validity and PL-semantic-consequence,
and we should subscript our symbol ⊨ thus: $\vDash_{\mathrm{PL}}$. But let’s omit
the subscript where there is no danger of ambiguity.
A parenthetical remark: now we can see the importance of setting up the
grammar for our system according to precise rules. If we hadn’t, the definition
of ‘truth value’ given here would have been impossible. In this definition we
defined truth values of complicated formulas based on their form. For example,
if a formula has the form (φ→ψ), then we assigned it an appropriate truth value
based on the truth values of φ and ψ. But suppose we had a formula in our
language that looked as follows:
P→P→P
and suppose that P has truth value 0. What is the truth value of the whole? We
can’t tell, because of the missing parentheses. For if the parentheses look like
this:
(P→P)→P
then the truth value is 0, whereas if the parentheses look like this:
P→(P→P)
then it is 1. Certain kinds of grammatical ambiguity, then, make it impossible
to assign truth values. We solve this problem in logic by pronouncing the
original string “P→P→P” as ill-formed; it is missing parentheses. Thus, the
precise rules of grammar assure us that when it comes time to do semantics, we
are able to assign semantic values (in this case, truth values) in an unambiguous
way.
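
The two readings of the ambiguous string can be computed directly. The following is a small sketch in Python; the helper name `arrow` is an assumption introduced for illustration, and it simply encodes the clause for →.

```python
def arrow(antecedent, consequent):
    """Clause for →: value 1 iff the antecedent is 0 or the consequent is 1."""
    return 1 if antecedent == 0 or consequent == 1 else 0

P = 0  # suppose P has truth value 0, as in the text

left_grouped = arrow(arrow(P, P), P)    # (P→P)→P
right_grouped = arrow(P, arrow(P, P))   # P→(P→P)

print(left_grouped, right_grouped)
```

The two groupings disagree when P is 0, which is exactly why the unparenthesized string cannot be given a truth value.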
Notice also a fact about validity in propositional logic: it is mechanically
“decidable”. That is, a computer program could be written that is capable of
telling, for any given formula, whether or not that formula is valid. The program
would simply construct a complete truth table for the formula in question.
We can observe that this is possible by noting the following: every formula
contains a finite number N of sentence letters, and so for any formula, there
are only a finite number of different “cases” one needs to check, namely, the
$2^N$ permutations of truth values for the contained sentence letters. But given
any assignment of truth values to the sentence letters of a formula, it’s clearly a
perfectly mechanical procedure to compute the truth value the formula takes
for those truth values: simply apply the rules for the ∼ and → repeatedly.
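
Here is one way such a program might look. It is a minimal sketch in Python under assumptions of my own choosing: wffs are encoded as nested tuples built from "not" and "imp", and the names `letters`, `value`, and `is_valid` are hypothetical. It is not meant as the definitive implementation, only as evidence that the procedure really is mechanical.

```python
from itertools import product

def letters(wff):
    """Collect the sentence letters occurring in a wff."""
    if isinstance(wff, str):
        return {wff}
    found = set()
    for part in wff[1:]:
        found |= letters(part)
    return found

def value(wff, interp):
    """Apply the clauses for ∼ and → repeatedly."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 if value(wff[1], interp) == 0 else 0
    if wff[0] == "imp":
        return 1 if value(wff[1], interp) == 0 or value(wff[2], interp) == 1 else 0
    raise ValueError(f"not a wff: {wff!r}")

def is_valid(wff):
    """Check all 2**N rows of the truth table, N = number of contained letters."""
    names = sorted(letters(wff))
    for row in product([0, 1], repeat=len(names)):
        if value(wff, dict(zip(names, row))) == 0:
            return False          # a falsifying row: not valid
    return True                   # true in every row

example_valid = ("imp", "P", ("imp", "Q", "P"))   # P→(Q→P)
example_invalid = ("imp", "P", "Q")               # P→Q

print(is_valid(example_valid), is_valid(example_invalid))
```

Note that the loop only ever looks at the letters the formula contains; that this is enough is one of the two assumptions discussed next.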
It’s worth being very clear about two assumptions in this proof (which, by
the way, is our first bit of metatheory, our first proof about a logical system).
They are: that every formula has a finite number of sentence letters, and that
the truth values of sentence letters not contained in a formula do not affect
the truth value of the formula. We need the latter assumption to be sure that
we only need to check a finite number of cases (namely, the permutations
of truth values of the contained sentence letters) to see whether a formula is
valid. After all, there are infinitely many interpretation functions (since there
are infinitely many sentence letters in the language of PL), and a valid formula
must be true in each one.
These two assumptions are obviously true, but it would be good to prove
them. I’ll prove the first assumption here (the second may be proved by a
similar method), and take this opportunity to introduce an important technique
for metalanguage proofs: proof by induction.
Proof that every wff contains a finite number of sentence letters: In this
sort of proof by induction, we’re trying to prove a statement of
the form: every wff has property P. The property P in this case is
having a finite number of different sentence letters. In order to do this,
we must show two separate statements:

base case: we show that every atomic sentence has the property. This
is obvious: atomic sentences are just sentence letters, and each of
them contains one sentence letter, and thus finitely many different
sentence letters.

induction step: we show that if formulas φ and ψ have the property,
then so do the complex formulas one can form from φ and ψ by the
rules of formation, namely ∼φ and φ→ψ. So, we assume that φ and ψ
have finitely many different sentence letters (this is the inductive
hypothesis), and we show that the same must hold for ∼φ and φ→ψ.
That’s obvious: ∼φ has as many different sentence letters as does φ;
since φ, by assumption, has only finitely many, so does ∼φ. As for
φ→ψ, by hypothesis, φ and ψ have finitely many different sentence
letters, and so φ→ψ has, at most, n + m sentence letters, where
n and m are the number of different sentence letters in φ and ψ,
respectively.

We’ve shown that every atomic formula has the property having a
finite number of different sentence letters; and we’ve shown that the
property is inherited by complex formulas built according to the
recursion rules. But every wff is either atomic, or built from atomics
by a finite series of applications of the recursion rules. Therefore,
by induction, every wff has the property. QED.
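
The shape of the inductive proof is mirrored by a recursive function that follows the rules of formation. The sketch below is in Python and is only an illustration: the tuple encoding of wffs and the name `sentence_letters` are assumptions, not anything from the text. The base case handles sentence letters; the recursive cases handle ∼φ and φ→ψ, just as the induction step does.

```python
def sentence_letters(wff):
    """Return the set of different sentence letters in a wff."""
    if isinstance(wff, str):
        # base case: an atomic sentence contains exactly one sentence letter
        return {wff}
    if wff[0] == "not":
        # ∼φ has exactly the sentence letters that φ has
        return sentence_letters(wff[1])
    if wff[0] == "imp":
        # φ→ψ has the letters of φ together with those of ψ: at most n + m
        return sentence_letters(wff[1]) | sentence_letters(wff[2])
    raise ValueError(f"not a wff: {wff!r}")

phi = ("imp", "P", ("not", "Q"))      # P→∼Q
psi = ("imp", "Q", "R")               # Q→R
n = len(sentence_letters(phi))
m = len(sentence_letters(psi))
combined = len(sentence_letters(("imp", phi, psi)))

print(n, m, combined)
```

In this example the bound n + m is not reached, since φ and ψ share the letter Q; that is why the proof says “at most”.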
2.4 Natural deduction in propositional logic
We have investigated a semantic conception of the notions of logical truth and
logical consequence. An alternate conception is proof-theoretic, in which the
central conception is that of proof. On this conception, logical consequence
means “provable from”, and a logical truth is a sentence that can be proved
starting from no premises at all. A “proof” procedure, informally, is a method
of reasoning one’s way, step by step, according to mechanical rules, from some
premises to a conclusion. This all is, of course, informal; we must now make it
precise.
One method for characterizing proof is called the method of natural
deduction. Any system in which one has assumptions for “conditional proof”,
assumptions for “indirect derivation”, etc. is a system of natural deduction.
This is the usual method in introductory logic books. Proofs in these systems
often look like this:
1  P→(Q→R)
2  P∧Q
3  P            2, ∧E
4  Q            2, ∧E
5  Q→R          1, 3 →E
6  R            4, 5 →E
7  (P∧Q)→R      2-6, →I
or like this:

1.  P→(Q→R)          Pr.
2.  show (P∧Q)→R     CD
3.  P∧Q              As.
4.  show R           DD
5.  P                3, ∧E
6.  Q                3, ∧E
7.  Q→R              1, 5 →E
8.  R                6, 7 →E
We will implement natural deduction a little differently here, in order to reveal
what is really going on. Our derivations will therefore look a little different
from the derivations familiar from introductory books. Our version of the
above derivation will look like this:

1. P→(Q→R) ⊢ P→(Q→R)         RA
2. P∧Q ⊢ P∧Q                 RA
3. P∧Q ⊢ P                   2, ∧E
4. P∧Q ⊢ Q                   2, ∧E
5. P→(Q→R), P∧Q ⊢ Q→R        1,3 →E
6. P→(Q→R), P∧Q ⊢ R          4,5 →E
7. P→(Q→R) ⊢ (P∧Q)→R         6, →I

It looks different, but the underlying idea is nevertheless the same.
2.4.1 Sequents
Natural deduction systems model the kind of reasoning one employs in everyday
life. How does that reasoning work? In its simplest form, one reasons in a
step-by-step fashion from premises to a conclusion, each step being sanctioned
by a rule of inference. For example, suppose that one already knows the premise
that P∧(P→Q) is true. One can then reason one’s way to the conclusion that
Q is also true, as follows:
1. P∧(P→Q) premise
2. P from line 1
3. P→Q from line 1
4. Q from lines 2 and 3
In this kind of proof, each step is a tiny, indisputably correct, logical inference.
Consider the moves from 1 to 2 and from 1 to 3, for example. These are
indisputably correct because a conjunctive statement clearly logically implies
either of its conjuncts. Likewise for the move from 2 and 3 to 4: it is clear
that a conditional statement, plus its antecedent, together imply its consequent.
Natural deduction systems consist in part of simple general principles like these
(“a conjunctive statement logically implies either of its conjuncts”); they are
known as rules of inference.
In addition to rules of inference, ordinary reasoning employs a further
technique: the use of assumptions. In order to establish a conditional claim “if P
then Q”, one would ordinarily i) assume P, ii) reason one’s way to Q, and then
iii) on that basis conclude that the conditional claim “if P then Q” is true. Once
P’s assumption is shown to lead to Q, the conditional claim “if P then Q” may
be concluded. Another example: to establish a claim of the form “not-P”, one
would ordinarily i) assume P, ii) reason one’s way to a contradiction, and iii)
on that basis conclude that “not-P” is true. Once P’s assumption is shown to
lead to a contradiction, “not-P” may be concluded. The first sort of reasoning
is called conditional proof, the second, reductio ad absurdum.
When one reasons with assumptions, one writes down statements that one
does not know to be true. When you write down P as an assumption, with the
goal of proving the conditional “if P then Q”, you do not know P to be true.
You’re merely assuming P for the sake of establishing the conditional “if P
then Q”. Outside the context of this proof, the assumption need not hold; once
you’ve reasoned your way to Q on the basis of the assumption of P, and so
concluded that the conditional “if P then Q” is true, you stop assuming P. To
model this sort of reasoning formally, we need a way to keep track of how the
conclusions we establish depend on the assumptions we have made. Natural
deduction systems in introductory textbooks tend to do this geometrically (by
placement on the page), with special markers (e.g., ‘show’), and by drawing
lines or boxes around parts of the proof once the assumptions that led to those
parts are no longer operative. We will do it differently: we will keep track of the
dependence of conclusions on assumptions by writing down explicitly, for each
conclusion, which assumptions it depends on. We will do this by constructing
our derivations out of sequents.
A sequent looks like this:

Γ ⊢ φ

Γ is a set of formulas, called the premises of the sequent. φ is a single formula,
called the conclusion of the sequent. “⊢” is a sign that goes between the sequent’s
premises and its conclusion, to indicate that the whole thing is a sequent. We
will introduce sequent proofs below; and when you write down the sequent
Γ ⊢ φ in one of them, the idea is that φ is an established conclusion, but it
was established by making the assumptions in Γ. Take away those assumptions,
and φ may no longer be established. In fact, one may think of a sequent as
“meaning” that its conclusion is a logical consequence of its premises.
Thus, our proofs will be proofs of sequents. It’s a bit weird at first to think in
terms of proving sequents, rather than formulas, since each sequent itself asserts
a relation of logical consequence between its premises and its conclusion, but
the idea nevertheless makes sense. Let’s introduce an informal notion of logical
correctness for sequents: the sequent Γ ⊢ φ is logically correct if the formula φ
is a logical consequence of the formulas in Γ. Thus, one is entitled to conclude
the conclusion of a logically correct sequent from its premises. The idea, then,
of constructing a sequent proof of a sequent is to show that that sequent is
logically correct; to show, that is, that its conclusion is a logical consequence
of its premises.
From our investigation of the semantics of propositional logic, we already
have the makings of a semantic criterion for when a sequent is logically correct:
the sequent Γ ⊢ φ is logically correct iff φ is a semantic consequence of Γ.
What we will be doing in this section is giving a new, proof-theoretic, criterion
for the logical correctness of sequents.
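
For sequents with finitely many premises, the semantic criterion just mentioned can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the tuple encoding and the names `value`, `letters`, and `logically_correct` are hypothetical, and the clauses for "and" and "or" are the derived truth conditions for ∧ and ∨ from section 2.3 rather than primitive clauses.

```python
from itertools import product

def value(wff, interp):
    """Truth value of a wff under an assignment of 0/1 to sentence letters."""
    if isinstance(wff, str):
        return interp[wff]
    op = wff[0]
    args = [value(part, interp) for part in wff[1:]]
    if op == "not":
        return 1 - args[0]
    if op == "imp":
        return 1 if args[0] == 0 or args[1] == 1 else 0
    if op == "and":   # derived truth condition for ∧
        return 1 if args[0] == 1 and args[1] == 1 else 0
    if op == "or":    # derived truth condition for ∨
        return 1 if args[0] == 1 or args[1] == 1 else 0
    raise ValueError(f"not a wff: {wff!r}")

def letters(wff):
    if isinstance(wff, str):
        return {wff}
    return set().union(*[letters(part) for part in wff[1:]])

def logically_correct(premises, conclusion):
    """Semantic test: is the conclusion 1 whenever every premise is 1?"""
    names = sorted(set().union(letters(conclusion), *[letters(g) for g in premises]))
    for row in product([0, 1], repeat=len(names)):
        interp = dict(zip(names, row))
        if all(value(g, interp) == 1 for g in premises) and value(conclusion, interp) == 0:
            return False
    return True

# P∧(P→Q) ⊢ Q, the informal example from the start of this section
good = logically_correct([("and", "P", ("imp", "P", "Q"))], "Q")
# P→Q ⊢ P is not logically correct
bad = logically_correct([("imp", "P", "Q")], "P")
print(good, bad)
```

This gives an independent check against which the proof-theoretic criterion developed below can be compared on small examples.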
2.4.2 Rules
The first step in developing our system is to write down sequent rules. A sequent
rule is a permission to move from certain sequents to certain other sequents. Our
goal is to construct rules with the following feature: if the “from” sequents are
all logically correct sequents, then any of the “to” sequents will be guaranteed
to be a logically correct sequent. Call such sequent rules
“logical-correctness-preserving”.
Consider, as an example, the first rule of our system “∧ introduction”, or
“∧I” for short. We picture this sequent rule thus:

\[ \frac{\Gamma \vdash \phi \qquad \Delta \vdash \psi}{\Gamma, \Delta \vdash \phi\wedge\psi}\ \wedge\mathrm{I} \]

Above the line go the “from” sequents; below the line go the “to” sequents.
(The comma between Γ and ∆ in the “to” sequent simply means that the
premises of this sequent are all the members of Γ plus all the members of ∆.
Strictly speaking it would be more correct to write this in set-theoretic notation
as: Γ∪∆ ⊢ φ∧ψ.) Thus, ∧I permits us to move from the sequents Γ ⊢ φ and
∆ ⊢ ψ to the sequent Γ, ∆ ⊢ φ∧ψ. For any sequent rule, we say that any of the
“to” sequents (Γ, ∆ ⊢ φ∧ψ in this case) follows from the “from” sequents (in this
case Γ ⊢ φ and ∆ ⊢ ψ) via the rule.
It seems intuitively clear that ∧I preserves logical correctness. For if some
assumptions Γ logically imply φ, and some assumptions ∆ logically imply ψ,
then (since φ∧ψ intuitively follows from φ and ψ taken together) the conclusion
φ∧ψ should indeed logically follow from all the assumptions together, the ones
in Γ and the ones in ∆.
Our next sequent rule is ∧E:

\[ \frac{\Gamma \vdash \phi\wedge\psi}{\Gamma \vdash \phi \qquad \Gamma \vdash \psi}\ \wedge\mathrm{E} \]

This lets one move from the sequent Γ ⊢ φ∧ψ to either the sequent Γ ⊢ φ or
the sequent Γ ⊢ ψ (or both). This, too, appears to preserve logical correctness.
If the members of Γ imply the conjunction φ∧ψ, then (since φ∧ψ intuitively
implies both φ and ψ individually) it must be that the members of Γ imply φ,
and they must also imply ψ.
The rule ∧I is known as an introduction rule for ∧, since it allows us to move
to a sequent whose major connective is the ∧. Likewise, the rule ∧E is known
as an elimination rule for ∧, since it allows us to move from a sequent whose
major connective is the ∧. In fact our sequent system contains introduction and
elimination rules for the other connectives as well: ∼, ∨, and → (let’s forget
the ↔ here). We’ll present those rules in turn.
First ∨I and ∨E:

\[ \frac{\Gamma \vdash \phi}{\Gamma \vdash \phi\vee\psi \qquad \Gamma \vdash \psi\vee\phi}\ \vee\mathrm{I} \]

\[ \frac{\Gamma \vdash \phi\vee\psi \qquad \Delta_1, \phi \vdash \chi \qquad \Delta_2, \psi \vdash \chi}{\Gamma, \Delta_1, \Delta_2 \vdash \chi}\ \vee\mathrm{E} \]

Let’s think about what ∨E means. Remember the intuitive meaning of a sequent:
its conclusion is a logical consequence of its premises. Another (related) way to
think of it is that Γ ⊢ φ means that one can establish that φ if one assumes the
members of Γ. So, if the sequent Γ ⊢ φ∨ψ is logically correct, that means we’ve
got the disjunction φ∨ψ, assuming the formulas in Γ. Now, suppose we can
reason to a new formula χ, assuming φ, plus perhaps some other assumptions
∆₁. And suppose we can also reason to χ from ψ, plus perhaps some other
assumptions ∆₂. Then, since either φ or ψ (plus the assumptions in ∆₁ and ∆₂)
leads to χ, and we know that φ∨ψ is true (conditional on the assumptions in Γ),
we ought to be able to infer χ itself, assuming the assumptions we needed along
the way (∆₁ and ∆₂), plus the assumptions we needed to get φ∨ψ, namely, Γ.
Next, we have double negation:

\[ \frac{\Gamma \vdash \phi}{\Gamma \vdash \sim\sim\phi} \qquad \frac{\Gamma \vdash \sim\sim\phi}{\Gamma \vdash \phi}\ \mathrm{DN} \]

In connection with negation, we also have the rule of reductio ad absurdum:

\[ \frac{\Gamma, \phi \vdash \psi\wedge\sim\psi}{\Gamma \vdash \sim\phi}\ \mathrm{RAA} \]

That is, if φ (along with perhaps some other assumptions, Γ) leads to a
contradiction, we can conclude that ∼φ is true (given the assumptions in Γ). RAA
and DN together are our introduction and elimination rules for ∼.
And finally we have →I and →E:

\[ \frac{\Gamma, \phi \vdash \psi}{\Gamma \vdash \phi\to\psi}\ \to\mathrm{I} \qquad\qquad \frac{\Gamma \vdash \phi\to\psi \qquad \Delta \vdash \phi}{\Gamma, \Delta \vdash \psi}\ \to\mathrm{E} \]

→E is perfectly straightforward; it’s just the familiar rule of modus ponens.
But →I requires a bit more thought. →I is the principle of conditional proof.
Suppose you can get to ψ on the assumption that φ (plus perhaps some other
assumptions Γ). Then, you should be able to conclude that the conditional
φ→ψ is true (assuming the formulas in Γ). Put another way: if you want to
establish the conditional φ→ψ, all you need to do is assume that φ is true, and
reason your way to ψ.
We add, finally, one more sequent rule, the rule of assumptions:

\[ \frac{}{\phi \vdash \phi}\ \mathrm{RA} \]

Note that this is the one sequent rule where no “from” sequent is required,
since there are no sequents above the line. The rule permits us to move from
no sequents at all to a sequent of the form φ ⊢ φ. (Strictly, this sequent should
be written “{φ} ⊢ φ”.) Call any such sequent an “assumption sequent”. Clearly,
any assumption sequent is a logically correct sequent, since clearly φ can be
proved if we assume φ itself.
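
Since each sequent rule is just a permission to move from some sequents to another, the rules can be pictured as functions on sequents. The Python sketch below is a hedged illustration only: the representation of a sequent as a pair (frozenset of premises, conclusion), the tuple encoding of formulas, and every function name are assumptions of mine. One simplification is flagged in the comments: when a rule discharges φ from “Γ,φ”, the sketch always removes φ from the premise set.

```python
def sequent(premises, conclusion):
    """A sequent Γ ⊢ φ, represented as (frozenset of premises, conclusion)."""
    return (frozenset(premises), conclusion)

def require(condition, message):
    if not condition:
        raise ValueError(message)

def ra(phi):                        # RA:  φ ⊢ φ
    return sequent([phi], phi)

def and_i(s1, s2):                  # ∧I:  Γ ⊢ φ, ∆ ⊢ ψ  =>  Γ,∆ ⊢ φ∧ψ
    return sequent(s1[0] | s2[0], ("and", s1[1], s2[1]))

def and_e(s, which):                # ∧E:  Γ ⊢ φ∧ψ  =>  Γ ⊢ φ (which=1) or Γ ⊢ ψ (which=2)
    require(s[1][0] == "and" and which in (1, 2), "∧E needs a conjunction")
    return sequent(s[0], s[1][which])

def or_i(s, other, left=True):      # ∨I:  Γ ⊢ φ  =>  Γ ⊢ φ∨ψ  or  Γ ⊢ ψ∨φ
    disjunction = ("or", s[1], other) if left else ("or", other, s[1])
    return sequent(s[0], disjunction)

def or_e(s, s1, s2):                # ∨E:  Γ ⊢ φ∨ψ, ∆1,φ ⊢ χ, ∆2,ψ ⊢ χ  =>  Γ,∆1,∆2 ⊢ χ
    require(s[1][0] == "or", "∨E needs a disjunction")
    phi, psi = s[1][1], s[1][2]
    require(phi in s1[0] and psi in s2[0] and s1[1] == s2[1], "∨E: sequents do not fit")
    return sequent(s[0] | (s1[0] - {phi}) | (s2[0] - {psi}), s1[1])

def dn_i(s):                        # DN:  Γ ⊢ φ  =>  Γ ⊢ ∼∼φ
    return sequent(s[0], ("not", ("not", s[1])))

def dn_e(s):                        # DN:  Γ ⊢ ∼∼φ  =>  Γ ⊢ φ
    require(s[1][0] == "not" and s[1][1][0] == "not", "DN needs a double negation")
    return sequent(s[0], s[1][1][1])

def raa(s, phi):                    # RAA: Γ,φ ⊢ ψ∧∼ψ  =>  Γ ⊢ ∼φ  (φ is always removed here)
    c = s[1]
    require(phi in s[0] and c[0] == "and" and c[2] == ("not", c[1]), "RAA: no contradiction")
    return sequent(s[0] - {phi}, ("not", phi))

def imp_i(s, phi):                  # →I:  Γ,φ ⊢ ψ  =>  Γ ⊢ φ→ψ  (φ is always removed here)
    require(phi in s[0], "→I: assumption not among the premises")
    return sequent(s[0] - {phi}, ("imp", phi, s[1]))

def imp_e(s1, s2):                  # →E:  Γ ⊢ φ→ψ, ∆ ⊢ φ  =>  Γ,∆ ⊢ ψ
    require(s1[1][0] == "imp" and s1[1][1] == s2[1], "→E: antecedent mismatch")
    return sequent(s1[0] | s2[0], s1[1][2])

# The seven-line derivation of P→(Q→R) ⊢ (P∧Q)→R from the start of section 2.4
premise = ("imp", "P", ("imp", "Q", "R"))
pq = ("and", "P", "Q")
s1 = ra(premise)
s2 = ra(pq)
s3 = and_e(s2, 1)
s4 = and_e(s2, 2)
s5 = imp_e(s1, s3)
s6 = imp_e(s5, s4)
s7 = imp_i(s6, pq)
print(sorted(map(str, s7[0])), s7[1])
```

Each function either returns the “to” sequent or raises an error when the “from” sequents do not have the required form, which is all that a rule, as a permission, amounts to.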
2.4.3 Sequent proofs
We have assembled all the sequent rules; now we need to show how to use
those rules to provide a criterion for logically correct sequents. We do this by
first defining the notion of a “sequent proof”:

A sequent proof is a series of sequents, each of which is either an
assumption sequent, or follows from earlier sequents in the series
by some sequent rule.

So, for example, the following is a sequent proof:

1. P∧Q ⊢ P∧Q        As
2. P∧Q ⊢ P          1, ∧E
3. P∧Q ⊢ Q          1, ∧E
4. P∧Q ⊢ Q∧P        2, 3 ∧I

Though it isn’t strictly required, we write a line number to the left of each
sequent in the series, and to the right of each line we write the sequent rule
that justifies it, together with the line or lines (if any) that contained the “from”
sequents required by the sequent rule in question. (The rule of assumptions
requires no “from” sequents, recall.)
(It’s important to distinguish what we’re now calling proofs, namely, sequent
proofs, from the kinds of informal arguments I gave in section 2.3, and will
give elsewhere in this book. Sequent proofs (and also the axiomatic proofs we
will introduce in section 2.5) are formalized object-language proofs. The sentences
in sequent proofs are sentences in the object language; they are wffs of PL.
Moreover, we gave a rigorous definition of what a sequent proof is, and
sequent proofs are restrictive in that only the system’s official rules may be
used. For contrast, consider the argument I gave in section 2.3 that any
PL-valuation assigns 1 to φ∧ψ iff it assigns 1 to φ and 1 to ψ. That argument was
an informal metalanguage proof. The sentences in the argument were sentences
of English, and the argument used informal (i.e., not formalized) techniques of
reasoning. “Informal” doesn’t imply lack of rigor. The argument was perfectly
rigorous: it conformed to the standards of good argumentation that generally
prevail in mathematics. We’re free to use any reasonable pattern of reasoning,
for example “universal proof” (to establish something of the form “everything
is thus-and-so”, we consider an arbitrary thing and show that it is thus-and-so).
We may “skip steps” if it’s clear how the argument is supposed to go. In short,
what we must do is convince a well-informed and mathematically sophisticated
reader that the result we’re after is indeed true.)
Next we introduce the notion of a “provable sequent”. The idea is that
each sequent proof culminates in the proof of some sequent. Thus we offer the
following definition:

A provable sequent is a sequent that is the last line of some sequent
proof.

(Note that it would be equivalent to define a provable sequent as any line in any
sequent proof, because at any point in a sequent proof one may simply stop
adding lines, and the proof up until that point counts as a legal sequent proof.)
So, for example, the sequent proof given above establishes that P∧Q ⊢ Q∧P is
a provable sequent. (We call a sequent proof, whose last line is Γ ⊢ φ, a sequent
proof of Γ ⊢ φ.)
Given the notion of a provable sequent, we can now offer
sequent-proof-theoretic notions of logical truth and logical consequence:

A formula φ is a logical truth iff the sequent ∅ ⊢ φ is a provable
sequent.

Formula φ is a logical consequence of the formulas in set Γ iff the
sequent Γ ⊢ φ is a provable sequent.

The symbol ∅ stands for the “empty set”, the set containing no members.
Thus, a logical truth here is understood as a formula that is provable from no
assumptions at all.
2.4.4 Example sequent proofs
Let’s explore how to construct sequent proofs. You may find this a bit less
intuitive, initially, than constructing natural deduction proofs in the systems
familiar from introductory textbooks. But a little experimentation will show
that the techniques for proving things in the usual systems carry over to the
present system.
A first simple example: let’s return to the sequent proof of P∧Q ⊢ Q∧P:

1. P∧Q ⊢ P∧Q        As
2. P∧Q ⊢ P          1, ∧E
3. P∧Q ⊢ Q          1, ∧E
4. P∧Q ⊢ Q∧P        2, 3 ∧I

Notice the strategy. We first use the rule of assumptions to enter the premise
of the sequent we’re trying to prove: P∧Q. We then use the rules of inference
to infer the conclusion of that sequent: Q∧P. Since our initial assumption of
P∧Q was dependent on the formula P∧Q, our subsequent inferences remain
dependent on that same assumption, and so the final formula concluded, Q∧P,
remains dependent on that assumption.
Let’s write our proofs out in a simpler way. Instead of writing out entire
sequents, let’s write out only their conclusions. We can indicate the premises
of the sequent using line numbers; the line numbers indicating the premises
of the sequent will go to the left of the number indicating the sequent itself.
Rewriting the previous proof in this way yields:
1 (1) P∧Q As
1 (2) P 1, ∧E
1 (3) Q 1, ∧E
1 (4) Q∧P 2, 3 ∧I
Next, let’s have an example to illustrate conditional proof. Let’s construct a
sequent proof of P→Q, Q→R ⊢ P→R:

1. P→Q ⊢ P→Q                As
2. Q→R ⊢ Q→R                As
3. P ⊢ P                    As
4. P→Q, P ⊢ Q               1,3 →E
5. P→Q, Q→R, P ⊢ R          2,4 →E
6. P→Q, Q→R ⊢ P→R           5, →I

It can be rewritten in the simpler style as follows:
1 (1) P→Q As
2 (2) Q→R As
3 (3) P As
1,3 (4) Q 1, 3 →E
1,2,3 (5) R 2, 4 →E
1,2 (6) P→R 5, →I
Let’s think about this example. We’re trying to establish P→R on the basis
of two formulas, P→Q and Q→R, so we start by assuming the latter two
formulas. Then, since the formula we’re trying to establish is a conditional,
we assume the antecedent of the conditional, in line 3. We then proceed, on
that basis, to reason our way to R, the consequent of the conditional we’re
trying to prove. (Notice how in lines 4 and 5, we add more line numbers on
the very left. Whenever we use →E, we increase dependencies: when we infer
Q from P and P→Q, our conclusion Q depends on all the formulas that P and
P→Q depended on, namely, the formulas on lines 1 and 3. Look back to the
statement of the rule →E: the conclusion ψ depends on all the formulas that φ
and φ→ψ depended on: Γ and ∆.) That brings us to line 5. At that point, we’ve
shown that R can be proven, on the basis of various assumptions, including P.
The rule →I (that is, the rule of conditional proof) then lets us conclude that
the conditional P→R follows merely on the basis of the other assumptions;
that rule, note, lets us in line 6 drop line 3 from the list of assumptions on
which P→R depends.
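
The bookkeeping in the left-hand column can be made explicit in code. Below is a minimal Python sketch, with assumptions stated up front: the dictionary `lines`, the tuple encoding of formulas, and the function names `assume`, `imp_e`, and `imp_i` are all introduced here for illustration. Each line stores the set of assumption line numbers it depends on; →E takes the union of dependencies, and →I drops the line number of the discharged assumption.

```python
lines = {}   # line number -> (set of assumption line numbers, formula)

def assume(n, formula):
    """Rule of assumptions: line n depends only on itself."""
    lines[n] = ({n}, formula)

def imp_e(n, cond_line, ante_line):
    """→E: the conclusion depends on everything either input depended on."""
    cond_deps, cond = lines[cond_line]
    ante_deps, ante = lines[ante_line]
    if cond[0] != "imp" or cond[1] != ante:
        raise ValueError("→E: antecedent mismatch")
    lines[n] = (cond_deps | ante_deps, cond[2])

def imp_i(n, from_line, assumption_line):
    """→I: discharge an assumption, removing it from the dependencies."""
    deps, consequent = lines[from_line]
    assumption_deps, antecedent = lines[assumption_line]
    if assumption_deps != {assumption_line} or assumption_line not in deps:
        raise ValueError("→I: not an assumption that the line depends on")
    lines[n] = (deps - {assumption_line}, ("imp", antecedent, consequent))

assume(1, ("imp", "P", "Q"))
assume(2, ("imp", "Q", "R"))
assume(3, "P")
imp_e(4, 1, 3)
imp_e(5, 2, 4)
imp_i(6, 5, 3)

for n in sorted(lines):
    deps, formula = lines[n]
    print(sorted(deps), n, formula)
```

Running this reproduces the dependency column of the proof above: line 5 depends on 1, 2, and 3, and line 6 on 1 and 2 only.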
Next let’s establish an instance of DeMorgan’s Law, ∼(P∨Q) ⊢ ∼P∧∼Q:
1 (1) ∼(P∨Q) As
2 (2) P As (for reductio)
2 (3) P∨Q 2, ∨I
1,2 (4) (P∨Q)∧∼(P∨Q) 1, 3 ∧I
1 (5) ∼P 4, RAA
6 (6) Q As (for reductio)
6 (7) P∨Q 6, ∨I
1,6 (8) (P∨Q)∧∼(P∨Q) 1, 7 ∧I
1 (9) ∼Q 8, RAA
1 (10) ∼P∧∼Q 5, 9∧I
Next let’s establish ∅ ⊢ P∨∼P:
1 (1) ∼(P∨∼P) As
2 (2) P As (for reductio)
2 (3) P∨∼P 2, ∨I
2,1 (4) (P∨∼P)∧∼(P∨∼P) 1, 3 ∧I
1 (5) ∼P 4, RAA
6 (6) ∼P As (for reductio)
6 (7) P∨∼P 6, ∨I
6,1 (8) (P∨∼P)∧∼(P∨∼P) 1, 7 ∧I
1 (9) ∼∼P 8, RAA
1 (10) ∼P∧∼∼P 5, 9 ∧I
∅ (11) ∼∼(P∨∼P) 10, RAA
∅ (12) P∨∼P 11, DN
Comment: my overall goal was to assume ∼(P∨∼P) and then derive a
contradiction. And my route to the contradiction was to separately establish ∼P
(lines 2-5) and ∼∼P (lines 6-9), each by reductio arguments.
Finally, let’s establish a sequent corresponding to a way that ∨E is sometimes
formulated: P∨Q, ∼P ⊢ Q:
1 (1) P∨Q As
2 (2) ∼P As
3 (3) Q As (for use with ∨E)
4 (4) P As (for use with ∨E)
5 (5) ∼Q As (for reductio)
4,5 (6) ∼Q∧P 4,5 ∧I
4,5 (7) P 6, ∧E
2,4,5 (8) P∧∼P 2,7 ∧I
2,4 (9) ∼∼Q 8, RAA
2,4 (10) Q 9, DN
1,2 (11) Q 1,3,10 ∨E
The basic idea of this proof is to use ∨E on line 1 to get Q. That calls,
in turn, for showing that each disjunct of line 1, P and Q, leads to Q. Showing
that Q leads to Q is easy; that was line 3. Showing that P leads to Q took lines
4-10; line 10 states the result of that reasoning, namely that Q follows from P
(as well as line 2). I began at line 4 by assuming P. Then my strategy was to
establish Q by reductio, so I assumed ∼Q in line 5. At this point, I basically
had my contradiction: at line 2 I had ∼P and at line 4 I had P. (You might
think I had another contradiction: Q at line 3 and ∼Q at line 5. But at the end
of the proof, I don’t want my conclusion to depend on line 3, whereas I don’t
mind it depending on line 2, since that’s one of the premises of the sequent
I’m trying to establish.) So I want to put P and ∼P together, to get P∧∼P,
and then conclude ∼∼Q by RAA. But there is a minor hitch. Look carefully at
how RAA is formulated. It says that if we have Γ,φ ⊢ ψ∧∼ψ, we can conclude
Γ ⊢ ∼φ. The first of these two sequents includes φ in its premises. That means
that in order to conclude ∼φ, the contradiction ψ∧∼ψ needs to depend on φ.
So in the present case, in order to finish the reductio argument and conclude
∼∼Q, the contradiction P∧∼P needs to depend on the reductio assumption
∼Q (line 5). But if I just used ∧I to put lines 2 and 4 together, the resulting
contradiction will only depend on lines 2 and 4. To get around this, I used
a little trick. Whenever you have a sequent Γ ⊢ φ, you can always add any
formula ψ you like to the premises on which φ depends, using the following
argument:²

Γ ⊢ φ           (begin with this)
ψ ⊢ ψ           As (ψ is any chosen formula)
Γ,ψ ⊢ φ∧ψ       ∧I
Γ,ψ ⊢ φ         ∧E

Lines 4, 6 and 7 in the proof employ this trick: initially, at line 4, P only depends
on 4, but then by line 7, P also depends on 5. That way, the move from 8 to 9
by RAA is justified.
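
The trick is itself just a composition of three rules, which the following Python sketch spells out. As before, this is an illustration under assumptions: sequents are pairs (frozenset of premises, conclusion), and the names `ra`, `and_i`, `and_e_left`, and `add_dependency` are hypothetical.

```python
def ra(phi):
    """Rule of assumptions: φ ⊢ φ."""
    return (frozenset([phi]), phi)

def and_i(s1, s2):
    """∧I: from Γ ⊢ φ and ∆ ⊢ ψ to Γ,∆ ⊢ φ∧ψ."""
    return (s1[0] | s2[0], ("and", s1[1], s2[1]))

def and_e_left(s):
    """∧E: from Γ ⊢ φ∧ψ to Γ ⊢ φ."""
    if s[1][0] != "and":
        raise ValueError("∧E needs a conjunction")
    return (s[0], s[1][1])

def add_dependency(s, psi):
    """From Γ ⊢ φ to Γ,ψ ⊢ φ, via As, ∧I, and ∧E, as in the argument above."""
    return and_e_left(and_i(s, ra(psi)))

start = ra("P")                                  # P ⊢ P
widened = add_dependency(start, ("not", "Q"))    # P, ∼Q ⊢ P
print(sorted(map(str, widened[0])), widened[1])
```

The conclusion is unchanged, but its premise set has grown by ∼Q, which is what lines 4, 6 and 7 of the proof accomplish (there with the conjuncts in the other order).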
2.5 Axiomatic proofs in propositional logic
Natural deduction proofs are comparatively easy to construct; that is their
great advantage. A different approach to proof theory, the axiomatic method,
offers different advantages. Like natural deduction, the axiomatic method is a
proof-theoretic approach to logic, based on the step-by-step reasoning model
in which each step is sanctioned by a rule of inference. But unlike natural
deduction systems, axiomatic systems do not allow reasoning by assumptions,
and they have very few rules of inference. Although these differences make
axiomatic proofs much harder to construct, there is a compensatory advantage
in metalogic: it is far easier to prove things about axiomatic systems.

² Adding arbitrary dependencies is not allowed in relevance logic, where a sequent is provable
only when all of its premises are, in an intuitive sense, relevant to its conclusion. Relevant
logicians modify various rules of classical logic, including the rule of ∧E.
Let’s first think about axiomatic systems informally. An axiomatic proof is
a series of formulas (not sequents; we no longer need them since we’re not
reasoning with assumptions), the last of which is the conclusion of the proof.
Each line in the proof must be justified in one of two ways: it may be inferred
by a rule of inference from earlier lines in the proof, or it may be an axiom.
An axiom is a certain kind of formula, a formula that one is allowed to enter
into a proof without any justification at all. Axioms are the “starting points” of
proofs, the foundation on which proofs rest. Since axioms are to play this role,
the axioms in a good axiomatic system ought to be indisputable logical truths.
For example, “P→P” would be a good axiom, since it’s obviously a logical truth.
(As it happens, we won’t choose this particular axiom; we’ll instead choose
other axioms from which this one may be proved.) Similarly, for each rule of
inference in a good axiomatic system, there should be no question but that the
premises of the rule logically imply its conclusion.
Formally, to apply the axiomatic method, we must choose i) a set of rules,
and ii) a set of axioms. An axiom is simply any chosen sentence (though as we
saw, in a good axiomatic system the axioms will be clear logical truths.) A rule
is simply a permission to infer one sort of sentence from other sentences. For
example, the rule modus ponens can be stated thus: “From φ→ψ and φ you may
infer ψ”, and pictured as follows:
\[ \frac{\phi\to\psi \qquad \phi}{\psi}\ \mathrm{MP} \]
(Modus ponens is the analog of the sequent rule →E.)
Given any chosen axioms and rules, we can define the following concepts:

A proof is a finite sequence of wffs, each of which either i) is an
axiom, or ii) follows from earlier wffs in the sequence via a rule.

Where Γ is a set of wffs, a proof from Γ is a finite sequence of wffs,
each of which either i) is an axiom, ii) is a member of Γ, or iii)
follows from earlier wffs in the sequence via a rule.

A theorem is the last line of any proof (we write “⊢ φ” for “φ is a
theorem”).

Wff φ is provable from set of wffs Γ (“Γ ⊢ φ”) iff φ is the last line of
some proof from Γ.

The symbol ⊢ may be subscripted with the name of the system in question.
Thus, for our axiom system for PL below, we may write: $\vdash_{\mathrm{PL}}$.
(We’ll omit this subscript when it’s clear which axiomatic system is at issue.)
Here is an axiomatic system for propositional logic:³
Axiomatic system for PL
Rule: modus ponens
Axioms: Where φ, ψ, and χ are wffs, anything that results from
the following schemas is an axiom:
(A1) φ→(ψ→φ)
(A2) (φ→(ψ→χ))→((φ→ψ)→(φ→χ))
(A3) (∼ψ→∼φ)→((∼ψ→φ)→ψ)
Thus, a PL-theorem is any formula that is the last line of a sequence of formulas,
each of which is either an A1, A2, or A3 axiom, or follows from earlier formulas
in the sequence by modus ponens. And a formula is PL-provable from a set Γ iff
it is the last line of such a sequence in which members of Γ may also occur.
The axiom “schemas” A1-A3 are not themselves axioms. They are, rather,
“recipes” for constructing axioms. Take A1, for example:
φ→(ψ→φ)
This string of symbols isn’t itself an axiom because it isn’t a wff; it isn’t a wff
because it contains Greek letters, which aren’t allowed in wffs (since they’re
not on the list of PL primitive vocabulary). φ and ψ are variables of our
metalanguage; you only get an axiom when you replace these variables with
wffs. P→(Q→P), for example, is an axiom; it results from A1 by replacing φ
with P and ψ with Q. (Note: since you can put in any wff for these variables,
and there are infinitely many wffs, there are infinitely many axioms.)
A few points of clarification about how to construct axioms from schemas.
First point: you can stick in the same wff for two different Greek letters. Thus
you can let both φ and ψ in A1 be P, and construct the axiom P→(P→P).
(But of course, you don’t have to stick in the same thing for φ as for ψ.) Second
point: you can stick in complex formulas for the Greek letters. Thus,
(P→Q)→(∼(R→S)→(P→Q)) is an axiom (I put in P→Q for φ and ∼(R→S)
for ψ in A1). Third point: within a single axiom, you cannot substitute different
wffs for a single Greek letter. For example, P→(Q→R) is not an axiom; you
can’t let the first φ in A1 be P and the second φ be R. Final point: even though
you can’t substitute different wffs for a single Greek letter within a single axiom,
you can let a Greek letter become one wff when making one axiom, and let
it become a different wff when making another axiom; and you can use each
of these axioms within a single axiomatic proof. For example, each of the
following is an instance of A1, and one could include both in a single axiomatic
proof:
P→(Q→P)
∼P→((Q→R)→∼P)
In the first case, I made φ be P and ψ be Q; in the second case I made φ be
∼P and ψ be Q→R. This is fine because I kept φ and ψ constant within each
axiom.
³ See Mendelson (1987, p. 29).
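The points just made about filling in schemas amount to a matching procedure, and it may help to see one written out. Here is a minimal sketch in Python. The tuple encoding of formulas, the ASCII names phi and psi standing in for the Greek letters, and the helper names imp, neg, match and is_instance are all illustrative assumptions of mine, not part of the official system.

```python
# A sentence letter is a string such as "P"; ("->", a, b) is a conditional
# and ("~", a) is a negation. Inside a schema, every string is treated as
# a metalanguage variable (a "Greek letter").

def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

# Schema A1: phi -> (psi -> phi)
A1 = imp("phi", imp("psi", "phi"))

def match(schema, wff, binding):
    """Try to read wff as the result of filling in the schema's variables."""
    if isinstance(schema, str):
        # First occurrence of a variable: record what it stands for.
        if schema not in binding:
            binding[schema] = wff
        # Later occurrences must stand for the very same wff.
        return binding[schema] == wff
    if isinstance(wff, str) or schema[0] != wff[0]:
        return False
    return all(match(s, w, binding) for s, w in zip(schema[1:], wff[1:]))

def is_instance(schema, wff):
    return match(schema, wff, {})
```

On this sketch, P→(P→P) and (P→Q)→(∼(R→S)→(P→Q)) both count as instances of A1, while P→(Q→R) does not, since φ would have to be both P and R.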
The definitions we have given in this section constitute a formal way of
making precise the proof-theoretic conception of the core logical notions, as
applied to propositional logic. A logical truth, on this conception, is a
PL-theorem; one formula is a logical consequence of others iff it is provable from
them in PL.
2.5.1 Example axiomatic proofs
Axiomatic proofs are much harder than natural deduction proofs. Some are
easy, of course. Here is a proof of (P→Q)→(P→P):
1. P→(Q→P) (A1)
2. (P→(Q→P))→((P→Q)→(P→P)) (A2)
3. (P→Q)→(P→P) 1,2 MP
The existence of this proof shows that (P→Q)→(P→P) is a theorem. And,
building on the previous proof, we can construct a proof of P→P from {P→Q}:
1. P→(Q→P) (A1)
2. (P→(Q→P))→((P→Q)→(P→P)) (A2)
3. (P→Q)→(P→P) 1,2 MP
4. P→Q member of {P→Q}
5. P→P 3, 4 MP
Thus, we have shown that {P→Q} ⊢ P→P.
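Since a proof from Γ is just a finite sequence each of whose lines is an axiom, a member of Γ, or the result of modus ponens, whether a given sequence is a proof can be checked mechanically. The following Python sketch does this for the five-line proof above. The tuple encoding of formulas and the names SCHEMAS, match, is_axiom, follows_by_mp and is_proof_from are hypothetical choices made for illustration only.

```python
def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

# The three axiom schemas; the strings phi, psi, chi are metavariables.
SCHEMAS = {
    "A1": imp("phi", imp("psi", "phi")),
    "A2": imp(imp("phi", imp("psi", "chi")),
              imp(imp("phi", "psi"), imp("phi", "chi"))),
    "A3": imp(imp(neg("psi"), neg("phi")),
              imp(imp(neg("psi"), "phi"), "psi")),
}

def match(schema, wff, binding):
    if isinstance(schema, str):
        # Each metavariable must stand for one wff throughout the axiom.
        return binding.setdefault(schema, wff) == wff
    if isinstance(wff, str) or schema[0] != wff[0]:
        return False
    return all(match(s, w, binding) for s, w in zip(schema[1:], wff[1:]))

def is_axiom(wff):
    return any(match(schema, wff, {}) for schema in SCHEMAS.values())

def follows_by_mp(wff, earlier):
    # Some earlier line a, together with an earlier line a -> wff.
    return any(imp(a, wff) in earlier for a in earlier)

def is_proof_from(gamma, lines):
    """True iff each line is an axiom, a member of gamma, or follows by MP."""
    for i, wff in enumerate(lines):
        if not (is_axiom(wff) or wff in gamma or follows_by_mp(wff, lines[:i])):
            return False
    return True

P, Q = "P", "Q"
proof = [
    imp(P, imp(Q, P)),                                   # 1. A1
    imp(imp(P, imp(Q, P)), imp(imp(P, Q), imp(P, P))),   # 2. A2
    imp(imp(P, Q), imp(P, P)),                           # 3. 1,2 MP
    imp(P, Q),                                           # 4. member of gamma
    imp(P, P),                                           # 5. 3,4 MP
]
```

Here is_proof_from([imp(P, Q)], proof) succeeds, whereas is_proof_from([], proof) fails at line 4, which is neither an axiom nor obtainable by modus ponens. The first three lines on their own do form a proof from the empty set.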
The next example is a little harder: (R→P)→(R→(Q→P))
1. [R→(P→(Q→P))]→[(R→P)→(R→(Q→P))] A2
2. P→(Q→P) A1
3. [P→(Q→P)]→[R→(P→(Q→P))] A1
4. R→(P→(Q→P)) 2,3 MP
5. (R→P)→(R→(Q→P)) 1,4 MP
Here’s how I approached this problem. What I was trying to prove, namely
(R→P)→(R→(Q→P)), is a conditional whose antecedent and consequent both
begin: (R→. That looks like the consequent of A2. So I wrote out an instance
of A2 whose consequent was the formula I was trying to prove; that gave me
line 1 of the proof. Then I tried to figure out a way to get the antecedent of
line 1; namely, R→(P→(Q→P)). And that turned out to be pretty easy. The
consequent of this formula, P→(Q→P), is an axiom (line 2 of the proof). And
if you can get a formula φ, then you can choose anything you like (say, R) and
then get R→φ, by using A1 and modus ponens; that’s what I did in lines 3 and
4. In fact, we’ll want to make a move like this in many proofs, whenever we
have φ on its own, and we want to move to ψ→φ. Let’s call this move “adding
an antecedent”; this is how it is done:
1. φ (from earlier lines)
2. φ→(ψ→φ) A1
3. ψ→φ 1, 2 MP
In future proofs, instead of repeating such steps, let’s just move directly from φ
to ψ→φ, with the justification “adding an antecedent”.
This proof was a bit tricky, and most proofs are trickier still. Moreover, the
proofs quickly get very long. Practically speaking, the best way to make progress
in an axiomatic system like this is by building up a toolkit. The toolkit consists
of theorems and techniques for doing bits of proofs which are applicable in a
wide range of situations. Then, when approaching a new problem, one can look
to see whether the problem can be reduced to a few chunks, each of which can
be accomplished by using the toolkit. Further, one can cut down on writing by
citing bits of the toolkit, rather than writing down entire proofs.
So far, we have just one tool in our toolkit: “adding an antecedent”. Let’s
add another: the “MP technique”. Here’s what the technique will let us do.
Suppose we can separately prove φ→ψ and φ→(ψ→χ). The MP technique
then shows us how to construct a proof of φ→χ. I call this the MP technique
because its effect is that you can do modus ponens “within the consequent of
the conditional φ→”. Here’s how the MP technique works:
1. φ→ψ from earlier lines
2. φ→(ψ→χ) from earlier lines
3. (φ→(ψ→χ))→((φ→ψ)→(φ→χ)) A2
4. (φ→ψ)→(φ→χ) 2,3 MP
5. φ→χ 1,4 MP
Note that the lines in this “proof schema” are schemas (they contain Greek
letters), rather than wffs. It therefore isn’t a proof at all; rather, it becomes a
proof once you fill in wffs for the φ, ψ, and χ. We constructed a proof schema
because we want the MP technique to be applicable whenever one wants to
move from formulas of the form φ→ψ and φ→(ψ→χ), to a formula of the
form φ→χ, no matter what φ, ψ, and χ may be.
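One way to picture a proof schema is as a small function: hand it the wffs that φ, ψ, and χ are to stand for, and it returns the lines to be added to a proof. The Python sketch below treats “adding an antecedent” and the MP technique this way. The tuple encoding and the names adding_an_antecedent, mp_technique and mp_ok are my own illustrative assumptions.

```python
def imp(a, b):
    return ("->", a, b)

def adding_an_antecedent(phi, psi):
    """Lines that take us from an earlier line phi to psi -> phi."""
    return [
        imp(phi, imp(psi, phi)),   # A1
        imp(psi, phi),             # MP, using the earlier line phi
    ]

def mp_technique(phi, psi, chi):
    """Lines that take us from phi->psi and phi->(psi->chi) to phi->chi."""
    return [
        # A2
        imp(imp(phi, imp(psi, chi)), imp(imp(phi, psi), imp(phi, chi))),
        # MP, using the earlier line phi->(psi->chi) and the A2 line
        imp(imp(phi, psi), imp(phi, chi)),
        # MP, using the earlier line phi->psi and the line just above
        imp(phi, chi),
    ]

def mp_ok(conditional, antecedent, conclusion):
    """Check one application of modus ponens."""
    return conditional == imp(antecedent, conclusion)
```

Filling in different wffs for the arguments yields different proofs, all of the same shape, which is just what makes the schema worth keeping in the toolkit.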
And while we’re on the topic of proof schemas, note also that whenever
one constructs a proof of a formula containing sentence letters, one could just
as well have constructed a similar proof schema. Corresponding to the proof
of (R→P)→(R→(Q→P)), for example, there is this proof schema:
1. [φ→(ψ→(χ→ψ))]→[(φ→ψ)→(φ→(χ→ψ))] A2
2. ψ→(χ→ψ) A1
3. [ψ→(χ→ψ)]→[φ→(ψ→(χ→ψ))] A1
4. φ→(ψ→(χ→ψ)) 2,3 MP
5. (φ→ψ)→(φ→(χ→ψ)) 1,4 MP
It’s usually more useful to think in terms of proof schemas, rather than
proofs, because they can go into our toolkit, if they have general applicability.
The proof schema we just constructed, for example, shows that anything of the
form (φ→ψ)→(φ→(χ→ψ)) is a theorem. As it happens, this is a fairly intuitive
theorem schema. Think of it as the principle of “weakening the consequent”:
χ→ψ is logically weaker than ψ, so if φ leads to ψ, φ must also lead to χ→ψ.
That sounds like a pattern that might well recur, so let’s put it into the toolkit,
under the label “weakening the consequent”. If we’re ever in the midst of a
proof and could really use a line of the form (φ→ψ)→(φ→(χ→ψ)), then we
can simply write that line down, and annotate on the right “weakening the
consequent”. Given the proof sketch above, we know that we could always
in principle insert a five-line proof of that line; to save writing we simply won’t
bother. (Note that once we do this, omitting those five lines, the proofs we
are constructing will cease to be official proofs, since not every line will be
either an axiom or a line that follows from earlier lines by MP. They will be
instead proof sketches, which are in essence metalanguage arguments to the
effect that there exists some proof or other of the desired type. An ambitious
reader could always construct an official proof on the basis of the proof sketch,
by taking each of the bits, filling in the details using the toolkit, and assembling
the results into one proof.)
Next I want to add to our toolkit the principle of “strengthening the
antecedent”: [(φ→ψ)→χ]→(ψ→χ). The intuitive idea is that if φ→ψ leads to
χ, then ψ ought to lead to χ, since ψ is logically stronger than φ→ψ. This
proof will be harder still; we’ll need to break it into bits and use the toolkit to
complete it. Here’s a sketch of the overall proof:
a. [(φ→ψ)→χ]→[ψ→(φ→ψ)] see below
b. [(φ→ψ)→χ]→[(ψ→(φ→ψ))→(ψ→χ)] see below
c. [(φ→ψ)→χ]→[ψ→χ] a,b MP technique
All that remains is to supply separate proofs of lines a and b. Step a is pretty
easy. Its consequent, ψ→(φ→ψ), is an instance of A1, so we can prove it in one
line, then use “adding an antecedent” to get a.
Line b is a bit harder. It has the form: (α→β)→[(γ→α)→(γ→β)]. Call
this “adding antecedents”, since it lets you add the same antecedent (γ) to both
the antecedent and consequent of a conditional (α→β). The following proof
sketch for adding antecedents uses the MP technique again!
1. [γ→(α→β)]→[(γ→α)→(γ→β)] A2
2. (α→β)→[γ→(α→β)] A1
3. (α→β)→{[γ→(α→β)]→[(γ→α)→(γ→β)]} adding an antecedent to line 1
4. (α→β)→[(γ→α)→(γ→β)] 2, 3 MP technique
(For this use of the MP technique, we let φ = α→β, ψ = γ→(α→β), and
χ = (γ→α)→(γ→β).) Since we’ve now provided proof sketches for parts a and
b, we’ve finished our proof sketch for strengthening the antecedent.
Next let’s add a tool to our toolkit, to the effect that conditionals are
“transitive”. Here’s a proof sketch for ⊢ (φ→ψ)→[(ψ→χ)→(φ→χ)]:
1. (ψ→χ)→[(φ→ψ)→(φ→χ)] adding antecedents
2. {(ψ→χ)→[(φ→ψ)→(φ→χ)]}→
   {[(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)]} A2
3. [(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)] 1, 2 MP
4. {[(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)]}→
   {(φ→ψ)→[(ψ→χ)→(φ→χ)]} strengthening the antecedent
5. (φ→ψ)→[(ψ→χ)→(φ→χ)] 3, 4 MP
Given this theorem, we can always move from φ→ψ and ψ→χ to φ→χ
thus:
1. φ→ψ from earlier lines
2. ψ→χ from earlier lines
3. (φ→ψ)→[(ψ→χ)→(φ→χ)] theorem just proved
4. (ψ→χ)→(φ→χ) 1, 3 MP
5. φ→χ 2, 4 MP
So, such moves may henceforth be justified by appeal to “transitivity”.
With transitivity in our toolkit, we can really get moving:
⊢ [φ→(ψ→χ)]→[ψ→(φ→χ)] (“swapping antecedents”):
1. [φ→(ψ→χ)]→[(φ→ψ)→(φ→χ)] A2
2. [(φ→ψ)→(φ→χ)]→[ψ→(φ→χ)] strengthening the antecedent
3. [φ→(ψ→χ)]→[ψ→(φ→χ)] 1,2 transitivity
⊢ (∼ψ→∼φ)→(φ→ψ) (“contraposition”):
1. (∼ψ→∼φ)→[(∼ψ→φ)→ψ] A3
2. [(∼ψ→φ)→ψ]→(φ→ψ) strengthening the antecedent
3. (∼ψ→∼φ)→(φ→ψ) 1, 2 transitivity
⊢ ∼φ→(φ→ψ) (“ex falso quodlibet”):
1. ∼φ→(∼ψ→∼φ) A1
2. (∼ψ→∼φ)→(φ→ψ) contraposition
3. ∼φ→(φ→ψ) 1, 2 transitivity
2.5.2 The deduction theorem
As we saw, once we added some tools to our toolkit, we could construct
axiomatic proofs of some interesting theorems. But it certainly wasn’t easy! Think
of all the work that went into proving, for example, such a simple theorem as
strengthening the antecedent. That same theorem is much easier to prove
in the natural deduction system, because in that system one can reason
using assumptions; in particular, one can use conditional proof.
While conditional proof is not allowed within axiomatic proofs, one can
prove the following metalogical theorem about our axiomatic system:
Deduction theorem: If Γ∪{φ} ⊢ ψ, then Γ ⊢ (φ→ψ)
That is: whenever there exists a proof from (Γ and) φ to ψ, then there also exists
a proof of φ→ψ (from Γ). That is not to say that one is allowed to assume φ
in a proof of φ→ψ. But since someone has proved the deduction theorem (I
won’t prove it here, but you can look up a proof in Mendelson (1987)) then if
one can prove ψ after assuming φ, one is entitled to conclude that some proof
of φ→ψ must exist.
2.6 Soundness and completeness of PL
In this chapter we have discussed two approaches to propositional logic: the
proof-theoretic approach and the semantic approach. In each case, we
introduced formal notions of logical truth and logical consequence. For the
semantic approach, these notions involved truth in PL-interpretations. For the
proof-theoretic approach, we considered two formal definitions, one involving
sequent proofs, the other involving axiomatic proofs.
An embarrassment of riches! We have multiple formal accounts of our
logical notions. But in fact, it can be shown that each of our definitions yields
exactly the same results. That is, whether you define a logical truth (for example)
as a formula that is true in all PL-interpretations, or as a formula that is the
last line of some PL axiomatic proof, or as a formula φ for which the sequent
∅ ⊢ φ is a provable sequent, exactly the same formulas turn out logical truths.
Proving this is a major accomplishment of metalogic. We won’t prove this
here, not in full anyway, but we will discuss the issues a bit, to show how such
metalogical proofs proceed.
Let’s focus, for the moment, on just two of our notions, the notion of
a theorem (last line of an axiomatic proof) and the notion of a valid formula
(true in all PL-interpretations). An important accomplishment of metalogic is
the establishment of the following two important connections between these
notions:
Soundness: every theorem is valid (If ⊢_PL φ then ⊨_PL φ)
Completeness: every valid wff is a theorem (If ⊨_PL φ then ⊢_PL φ)
It’s pretty easy to establish soundness:
Soundness proof for PL: We’re going to prove this by induction as
well. But our inductive proof here is slightly different. We’re not
trying to prove something of the form “Every wff has property
P”. Instead, we’re trying to prove something of the form “Every
theorem has property P”. In this case, the property P is: being a valid
formula.
Here’s how induction works in this case. A theorem is the last line
of any proof. So, to show that every theorem has a certain property
P, all we need to do is show that every time one adds another line to
a proof, that line has property P. Now, there are two ways one can
add to a proof. First, one can add an axiom. The base case of the
inductive proof must show that adding axioms always means adding
a line with property P. Second, one can add a formula that follows
from earlier lines by a rule. The inductive step of the inductive
proof must show that in this case, too, one adds a line with property
P, provided all the preceding lines have property P. OK, here goes:
base case: here we need to show that every PL-axiom is valid. This
is tedious but straightforward. Take A1, for example. Suppose
for reductio that some instance of A1 is invalid, i.e., for some
PL-interpretation ℐ, V_ℐ(φ→(ψ→φ))=0. Thus, V_ℐ(φ)=1 and V_ℐ(ψ→φ)=0.
Given the latter, V_ℐ(φ)=0, which is a contradiction. Analogous proofs can
be given that instances of A2 and A3 are also valid.
induction step: here we begin by assuming that every line in a proof
up to a certain point is valid (this is the “inductive hypothesis”); we
then show that if one adds another line that follows from earlier
lines by the rule modus ponens, that line must be valid too. I.e.,
we’re trying to show that “modus ponens preserves validity”. So,
assume the inductive hypothesis: that all the earlier lines in the
proof are valid. And now, consider the result of applying modus
ponens. That means that the new line we’ve added to the proof
is some formula ψ, which we’ve inferred from two earlier lines
that have the forms φ→ψ and φ. We must show that ψ is a valid
formula, i.e., that V_ℐ(ψ)=1 for every PL-interpretation ℐ. By the
inductive hypothesis, all earlier lines in the proof are valid, and
hence both φ→ψ and φ are valid. Thus, for any PL-interpretation ℐ,
V_ℐ(φ)=1 and V_ℐ(φ→ψ)=1. But if V_ℐ(φ)=1 then V_ℐ(ψ) can’t be 0,
for if it were, then V_ℐ(φ→ψ) would be 0, and it isn’t. Thus, V_ℐ(ψ)=1.
We’ve shown that axioms are valid, and that modus ponens preserves
validity. So, by induction, every time one adds to a proof,
one adds a valid formula. So the last line in a proof is always a valid
formula. Thus, every theorem is valid. QED.
Notice the general structure of this proof: we first showed that every axiom
has a certain property, and then we showed that the rule of inference preserves
the property. Given the definition of ‘theorem’, it followed that every theorem
has the property. We chose our definition of a theorem with just this sort of
proof in mind.
Remember that this is a proof in the metalanguage, about propositional
logic. It isn’t a proof in any system of derivation.
Completeness is harder to prove, and I won’t prove it here.
One nice thing about soundness is that it lets us establish facts of
unprovability. Soundness says “if ⊢ φ then ⊨ φ”. Equivalently, it says: “if ⊭ φ then
⊬ φ”. So, to show that something isn’t a theorem, it suffices to show that it
isn’t valid. Consider, for example, the formula (P→Q)→(Q→P). There exist
PL-interpretations in which the formula is false, namely, PL-interpretations in
which P is 0 and Q is 1. So, (P→Q)→(Q→P) is not valid (since it’s not true
in all PL-interpretations.) But then soundness tells us that it isn’t a theorem
either. In general: if we’ve established soundness, then in order to show that a
formula isn’t a theorem, all we need to do is find an interpretation in which it
isn’t true.
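The search for an interpretation in which a formula isn’t true can be carried out by brute force, since a wff contains only finitely many sentence letters. Here is a rough Python sketch, restricted to → and ∼. The tuple encoding and the names letters, value, countermodels and is_valid are assumptions introduced for illustration, and the code plays the role of a truth table rather than of any proof in the axiomatic system.

```python
from itertools import product

def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def letters(wff):
    """The set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def value(wff, interp):
    """Truth value (0 or 1) of wff; interp maps sentence letters to 0 or 1."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "~":
        return 1 - value(wff[1], interp)
    # A conditional is 0 only when its antecedent is 1 and its consequent 0.
    if value(wff[1], interp) == 1 and value(wff[2], interp) == 0:
        return 0
    return 1

def countermodels(wff):
    """Yield every assignment to the letters of wff that makes wff false."""
    names = sorted(letters(wff))
    for row in product([1, 0], repeat=len(names)):
        interp = dict(zip(names, row))
        if value(wff, interp) == 0:
            yield interp

def is_valid(wff):
    return next(countermodels(wff), None) is None
```

Applied to (P→Q)→(Q→P), countermodels finds just the assignment on which P is 0 and Q is 1, so by soundness the formula is not a theorem. Applied to particular instances of A1, A2 and A3, is_valid confirms the base case of the soundness proof for those instances (though of course not for all of the infinitely many axioms at once).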
One could also prove soundness and completeness for the natural deduction
system of section 2.4.⁴ The soundness proof, for instance, would proceed by
proving by induction that whenever sequent Γ ⊢ φ is provable, φ is a semantic
consequence of Γ. This would proceed by showing that each rule of inference
(As, RAA, ∧I, ∧E, etc.) preserves semantic consequence. But note how much
more involved this proof would be, since there are so many rules of inference.
The paucity of rules in the axiomatic system made the construction of proofs
within that system a real pain in the neck, but now we see how it makes
metalogical life easier.
⁴ Note: if we allow sequents with infinite premise-sets, then to secure completeness we will
need to add a rule for adding dependencies: from Γ ⊢ φ conclude Γ∪∆ ⊢ φ, where ∆ is any set
of wffs. The trick described above for adding dependencies (using RA, ∧I and ∧E) only allows
us to add one wff at a time to the premise set of a given sequent, whereas this new rule allows
adding infinitely many.
Before we leave this chapter, let me summarize and clarify the nature of
proofs by induction. Induction is the method of proof to use whenever one is
trying to prove that every member of an infinite class of entities has a certain
feature F, where each member of that infinite class of entities is generated from
certain “starting points” by a finite number of successive “operations”. To do
this, one establishes two things: a) that the starting points have feature F, and b)
that the operations preserve feature F, i.e., that if the inputs to the operations
have feature F then the output also has feature F.
In logic, it is important to distinguish two different cases where proofs by
induction are needed. One case is where one is establishing a fact of the form:
every theorem has a certain feature F. (The proof of the soundness theorem
is an example of this case.) Here’s why induction is applicable: a theorem is
defined as the last line of a proof. So the fact to be established is that every
line in every proof has feature F. Now, a proof is defined as a finite sequence,
where each member is either an axiom or follows from earlier lines by the rule
modus ponens. The axioms are the “starting points” and modus ponens is the
“operation”. So if we want to show that every line in every proof has feature F,
all we need to do is show that a) the axioms all have feature F, and b)
if you start with formulas that have feature F, and you apply modus ponens,
then what you get is something with feature F. More carefully, b) means: if φ
has feature F, and φ→ψ has feature F, then ψ has feature F. Once a) and b)
are established, one can conclude by induction that all lines in all proofs have
feature F.
A second case in which induction may be used is when one is trying to
establish a fact of the form: every formula has a certain feature F. (The proof
that every wff has a finite number of sentence letters is an example of this
case.) Here’s why induction is applicable: all formulas are built out of sentence
letters (the “starting points”) by successive applications of the rules of formation
(“operations”). (The rules of formation, recall, say that if φ and ψ are formulas,
then so are (φ→ψ) and ∼φ.) So, to show that all formulas have feature F, we
must merely show that a) all the sentence letters have feature F, and b)
if φ and ψ both have feature F, then both (φ→ψ) and ∼φ also have
feature F.
If you’re ever proving something by induction, it’s important to identify
which sort of inductive proof you’re constructing.
Chapter 3
Variations and Deviations from
Standard Propositional Logic
As promised, we will not stop with the standard logics familiar from
introductory textbooks. In this chapter we examine some philosophically
important variations and deviations from standard propositional logic.
3.1 Alternate connectives
3.1.1 Symbolizing truth functions in propositional logic
Our propositional logic is in a sense “expressively complete”. To get at this
sense, let’s introduce the idea of a truth function.
A truth function is a (finite-placed) function or rule that maps truth values
(i.e., 0s and 1s) to truth values. For example, here is a truth function, f:
f(1) = 0
f(0) = 1
This is called a one-place function because it takes only one truth value as
input.
In fact, we have a name for this truth function: negation. And we express
that truth function with our symbol ∼. So the negation truth function is one
we can symbolize in propositional logic. More carefully, what we mean by
saying “We can symbolize truth function f in propositional logic” is this:
there is some sentence of propositional logic, φ, containing a single
sentence letter, P, which has the following feature: whenever P has
a truth value t, then the whole sentence φ has the truth value f(t).
The sentence φ is in fact ∼P.
Here’s another truth function, g. g is a two-place truth function, which
means that it takes two truth values as inputs:
g(1, 1) =1
g(1, 0) =0
g(0, 1) =0
g(0, 0) =0
In fact, this is the conjunction truth function. And we have a symbol for this
truth function: ∧. And as before, we can symbolize function g in propositional
logic; this means that:
There is some sentence of propositional logic, φ, containing two
sentence letters, P and Q, which has the following feature: whenever
P has a truth value, t, and Q has a truth value, t′, then the
whole sentence φ has the truth value g(t, t′)
One such sentence φ is in fact: P∧Q.
Notice that the sentence φ that symbolizes¹ the function g had to have two
sentence letters rather than one. That’s because the function g was a two-place
function. In general, the definition of “can be symbolized” is this:
n-place truth function h can be symbolized in propositional
logic iff there is some sentence of propositional logic, φ, containing
n sentence letters, P_1 . . . P_n, which has the following feature:
whenever P_1 has a truth value, t_1, and, …, and P_n has a truth value,
t_n, then the whole sentence φ has the truth value h(t_1 . . . t_n)
Here’s another truth function:
i (1, 1) =0
i (1, 0) =1
i (0, 1) =1
i (0, 0) =1
¹ Strictly speaking, we should speak of a sentence symbolizing a truth function relative to
an ordering of its sentence letters.
Think of this truth function as “not both”. Unlike the negation and conjunction
truth functions, we don’t have a single symbol for this truth function.
Nevertheless, it can be symbolized in propositional logic, by the following
sentence:
∼(P∧Q)
Now, it’s not too hard to prove that:
All truth functions (of any finite number of places) can be symbolized
in propositional logic using just ∧, ∨, and ∼.
I’ll begin by illustrating the idea of the proof with an example. Suppose we
want to symbolize the following three-place truth function:
f ( 1, 1, 1 ) = 0
f ( 1, 1, 0 ) = 1
f ( 1, 0, 1 ) = 0
f ( 1, 0, 0 ) = 1
f ( 0, 1, 1 ) = 0
f ( 0, 1, 0 ) = 0
f ( 0, 0, 1 ) = 1
f ( 0, 0, 0 ) = 0
Since this truth function returns the value 1 in just three cases (rows two,
four, and seven), what we want is a sentence containing three sentence letters,
P_1, P_2, P_3, that is true in exactly those three cases: (a) when P_1, P_2, P_3 take on the
three truth values in the second row (i.e., 1, 1, 0), (b) when P_1, P_2, P_3 take on
the three truth values in the fourth row (1, 0, 0), and (c) when P_1, P_2, P_3 take
on the three truth values in the seventh row (0, 0, 1). Now, we can construct a
sentence that is true in case (a) and false otherwise: P_1∧P_2∧∼P_3. We can also
construct a sentence that’s true in case (b) and false otherwise: P_1∧∼P_2∧∼P_3.
And we can also construct a sentence that’s true in case (c) and false otherwise:
∼P_1∧∼P_2∧P_3. But then we can simply disjoin these three sentences to get the
sentence we want:
(P_1∧P_2∧∼P_3) ∨ (P_1∧∼P_2∧∼P_3) ∨ (∼P_1∧∼P_2∧P_3)
(Strictly speaking the three-way conjunctions, and the three-way disjunction,
need parentheses added, but since it doesn’t matter where they’re added
(conjunction and disjunction are associative), I’ve left them off.)
This strategy is in fact purely general. Any n-place truth function, f, can
be represented by a chart like the one above. Each row in the chart consists of
a certain combination of n truth values, followed by the truth value returned
by f for those n inputs. For each such row, construct a conjunction whose
i-th conjunct is P_i if the i-th truth value in the row is 1, and ∼P_i if the i-th truth
value in the row is 0. Notice that the conjunction just constructed is true if and
only if its sentence letters have the truth values corresponding to the row in
question. The desired formula is then simply the disjunction of all and only
the conjunctions for rows where the function f returns the value 1.² Since the
conjunction for a given row is true iff its sentence letters have the truth values
corresponding to the row in question, the resulting disjunction is true iff its
sentence letters have truth values corresponding to one of the rows where f
returns the value 1, which is what we want.
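The general strategy is mechanical enough to be written as a short program. The Python sketch below builds the disjunction for any n-place truth function and applies it to the three-place example. To keep the code in plain ASCII it writes & for ∧, v for ∨, ~ for ∼, and P1, P2, ... for the sentence letters; these stand-ins, and the names rows_where_true, dnf_sentence and dnf_value, are my own assumptions.

```python
from itertools import product

def rows_where_true(f, n):
    """The rows (tuples of n truth values) on which f returns 1."""
    return [row for row in product([1, 0], repeat=n) if f(*row) == 1]

def dnf_sentence(f, n):
    """A sentence using only &, v and ~ that symbolizes the n-place f."""
    names = ["P%d" % i for i in range(1, n + 1)]
    rows = rows_where_true(f, n)
    if not rows:
        # Special case: f returns 0 everywhere, so use a logically false
        # sentence that still contains all n sentence letters.
        return "&".join([names[0], "~" + names[0]] + names[1:])
    disjuncts = []
    for row in rows:
        # The i-th conjunct is Pi if the i-th value is 1, and ~Pi if it is 0.
        conj = "&".join(p if t == 1 else "~" + p for p, t in zip(names, row))
        disjuncts.append("(" + conj + ")")
    return " v ".join(disjuncts)

def dnf_value(f, n, values):
    """Truth value of that sentence when P1..Pn take the given values.

    Each conjunction is true on exactly its own row, so the disjunction
    is true iff the given values coincide with one of the listed rows.
    """
    return 1 if tuple(values) in rows_where_true(f, n) else 0

# The three-place truth function from the chart above.
TABLE = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 0, (1, 0, 0): 1,
    (0, 1, 1): 0, (0, 1, 0): 0, (0, 0, 1): 1, (0, 0, 0): 0,
}

def f(a, b, c):
    return TABLE[(a, b, c)]
```

For the chart above, dnf_sentence(f, 3) produces the same three disjuncts, in the same order, as the sentence constructed by hand.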
Say that a set of connectives is adequate iff one can symbolize all the truth
functions using a sentence containing only those connectives. What we just
showed was that the set {∧, ∨, ∼} is adequate. We can then use this fact to
prove that other sets of connectives are adequate. For example, it is easy to
prove that φ∨ψ has the same truth table as (is true relative to exactly the same
PL-assignments as) ∼(∼φ∧∼ψ). But that means that for any sentence χ whose
only connectives are ∧, ∨, and ∼, we can construct another sentence χ′ with
the same truth table but whose only connectives are ∧ and ∼: simply begin
with χ and use the equivalence between φ∨ψ and ∼(∼φ∧∼ψ) to eliminate all
occurrences of ∨ in favor of occurrences of ∧ and ∼. But now consider any
truth function f. Since {∧, ∨, ∼} is adequate, f can be symbolized by some
sentence χ; but χ has the same truth table as some sentence χ′ whose only
connectives are ∧ and ∼; hence f can be symbolized by χ′ as well. So {∧, ∼}
is adequate.
Similar arguments can be given to show that other connective sets are
adequate as well. For example, the ∧ can be eliminated in favor of the → and
the ∼ (since φ∧ψ has the same truth table as ∼(φ→∼ψ)); therefore, since
{∧, ∼} is adequate, {→, ∼} is also adequate.
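The two eliminations just described can be carried out by a recursive rewrite, and the claim that the truth table is unchanged can then be spot-checked on an example. The following Python sketch does both. It uses ASCII stand-ins (& for ∧, v for ∨, -> for →, ~ for ∼), and the tuple encoding and function names are illustrative assumptions.

```python
from itertools import product

# Sentences: "P", ("~", a), ("&", a, b), ("v", a, b), ("->", a, b)

def value(s, interp):
    if isinstance(s, str):
        return interp[s]
    if s[0] == "~":
        return 1 - value(s[1], interp)
    a, b = value(s[1], interp), value(s[2], interp)
    if s[0] == "&":
        return min(a, b)
    if s[0] == "v":
        return max(a, b)
    return max(1 - a, b)              # the conditional

def eliminate_or(s):
    """Replace every a v b by ~(~a & ~b)."""
    if isinstance(s, str):
        return s
    parts = [eliminate_or(p) for p in s[1:]]
    if s[0] == "v":
        return ("~", ("&", ("~", parts[0]), ("~", parts[1])))
    return (s[0], *parts)

def eliminate_and(s):
    """Replace every a & b by ~(a -> ~b)."""
    if isinstance(s, str):
        return s
    parts = [eliminate_and(p) for p in s[1:]]
    if s[0] == "&":
        return ("~", ("->", parts[0], ("~", parts[1])))
    return (s[0], *parts)

def connectives(s):
    """The set of connectives occurring in s."""
    if isinstance(s, str):
        return set()
    return {s[0]}.union(*(connectives(p) for p in s[1:]))

def same_truth_table(s, t, names):
    rows = [dict(zip(names, row)) for row in product([1, 0], repeat=len(names))]
    return all(value(s, r) == value(t, r) for r in rows)
```

Starting from a sentence χ built with &, v and ~, eliminate_or gives a χ′ in & and ~ alone, and eliminate_and applied to χ′ gives a sentence in -> and ~ alone, each with the same truth table as χ. Checking particular sentences in this way illustrates the argument; it does not replace it.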
² Special case: if there are no such rows (i.e., if the function returns 0 for all inputs),
then let the formula be simply any logically false formula containing P_1 . . . P_n, for example
P_1∧∼P_1∧P_2∧P_3∧· · ·∧P_n.
3.1.2 Inadequate connective sets
Can we show that certain sets of connectives are not adequate?
We can quickly answer yes, for a trivial reason. The set {∼} isn’t adequate,
for the simple reason that, since ∼ is a one-place connective, no sentence with
more than one sentence letter can be built using just ∼. So there’s no hope of
symbolizing all the n-place truth functions, for any n > 1, using just the ∼.
More interestingly, we can show that there are inadequate connective sets
containing two-place connectives. Let’s prove that {∧, →} is not an adequate set
of connectives. We’ll do this by proving that if those were our only connectives,
we couldn’t symbolize the negation truth function. And we’ll demonstrate that
by proving the following fact:
For any sentence, φ, containing just sentence letter P and the
connectives ∧ and →, if P is true then so is φ.
We’ll again use the method of induction. We want to show that the assertion is
true for all sentences. So we first prove that the assertion is true for all sentences
with no connectives (i.e., for sentences containing just sentence letters.) This is
the base case, and is very easy here, since if φ has no connectives, then obviously
φ is just the sentence letter P itself, in which case, clearly, if P is true then so is
φ. Next we assume the inductive hypothesis:
Inductive hypothesis: the assertion holds true for sentences φ and
ψ
And we try to show, on the basis of this assumption, that:
The assertion holds true for φ∧ψ and φ→ψ
This is easy to show. We want to show that the assertion holds for φ∧ψ—that
is, if P is true then so is φ∧ψ. But we know by the inductive hypothesis that
the assertion holds for φ and ψ individually. So we know that if P is true, then
both φ and ψ are true. But then, we know from the truth table for ∧ that
φ∧ψ is also true in this case. The reasoning is exactly parallel for φ→ψ: the
inductive hypothesis tells us that whenever P is true, so are φ and ψ, and then
we know that in this case φ→ψ must also then be true, by the truth table for
→. Therefore, by induction, the result is proved.
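The inductive argument covers all of the infinitely many sentences at once, which no finite computation can do. Still, it can be reassuring to check the claim on every sentence up to some small nesting depth. Here is a Python sketch of such a check, with & standing in for ∧ and -> for →; the names sentences and value, and the choice of depth, are my own assumptions.

```python
def sentences(depth):
    """All sentences over P, '&' and '->' with nesting depth <= depth."""
    level = {"P"}
    for _ in range(depth):
        level = level | {(op, a, b)
                         for op in ("&", "->")
                         for a in level
                         for b in level}
    return level

def value(s, p):
    """Truth value of s when the sentence letter P has truth value p."""
    if s == "P":
        return p
    a, b = value(s[1], p), value(s[2], p)
    return min(a, b) if s[0] == "&" else max(1 - a, b)

def truth_tables(depth):
    """The pairs (value when P is 1, value when P is 0) that actually occur."""
    return {(value(s, 1), value(s, 0)) for s in sentences(depth)}
```

The negation truth function would need the pair (0, 1), and no sentence in the sample has it: the first member of each pair is always 1, just as the inductive proof says it must be.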
3.1.3 Sheffer stroke
We’ve seen how we can choose alternate sets of connectives. Some of these
choices are adequate (i.e., allow symbolization of all truth functions), others
are not.
As we saw, there are some truth functions that can be symbolized in
propositional logic, but not by a single connective (e.g., the not-both function i
discussed above.)
We could change this, by adding a new connective. Let’s use a new connective,
the “Sheffer stroke”, |, to symbolize not-both. φ|ψ is to mean that not
both φ and ψ are true, so let’s stipulate that φ|ψ will have the same truth table
as ∼(φ∧ψ), i.e.:
φ | ψ
1 0 1
1 1 0
0 1 1
0 1 0
Now here’s an exciting thing about |: it’s an adequate connective all on its own.
You can symbolize all the truth functions using just |!
Here’s how we can prove this. We showed above that {→, ∼} is adequate;
so all we need to do is show how to define the → and the ∼ using just the |.
Defining ∼ is easy; φ|φ has the same truth table as ∼φ. As for φ→ψ, think of
it this way. φ→ψ is equivalent to ∼(φ∧∼ψ), i.e., φ|∼ψ. But given the method
just given for defining ∼ in terms of |, we know that ∼ψ is equivalent to ψ|ψ.
Thus, φ→ψ has the same truth table as: φ|(ψ|ψ).
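Because only four combinations of truth values are involved, these two definitions can be checked exhaustively. A small Python sketch, with function names of my own choosing:

```python
from itertools import product

def stroke(a, b):
    """The Sheffer stroke: 0 when both inputs are 1, and 1 otherwise."""
    return 0 if a == 1 and b == 1 else 1

def neg_via_stroke(a):
    return stroke(a, a)               # phi | phi

def imp_via_stroke(a, b):
    return stroke(a, stroke(b, b))    # phi | (psi | psi)

def neg(a):
    return 1 - a

def imp(a, b):
    return 0 if a == 1 and b == 0 else 1

def definitions_agree():
    """Compare the stroke-based definitions with negation and the conditional."""
    return (all(neg_via_stroke(a) == neg(a) for a in (1, 0)) and
            all(imp_via_stroke(a, b) == imp(a, b)
                for a, b in product((1, 0), repeat=2)))
```

Calling definitions_agree() runs through every row of both truth tables and reports whether the stroke-based versions match.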
3.2 Polish notation
Alternate connectives, like the Sheffer stroke, are called “variations” of standard
logic because they don’t really change what we’re saying with propositional
logic; it’s just a change in notation.
Another fun change in notation is Polish notation. The basic idea of Polish
notation is that the connectives all go before the sentences they connect. Instead
of writing P∧Q, we write ∧PQ. Instead of writing P∨Q we write ∨PQ.
Formally, here is the definition of a wff:
i) sentence letters are wffs
ii) if φ and ψ are wffs, then so are:
∧φψ
∨φψ
→φψ
↔φψ
∼φ
What’s the point? This notation eliminates the need for parentheses. With the
usual notation, in which we put the connectives between the sentences they
connect, we need parentheses to distinguish, e.g.:
(P∧Q)→R
P∧(Q→R)
But with Polish notation, these are distinguished without parentheses; they
become:
→∧PQR
∧P→QR
respectively.
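To see why no parentheses are needed, it may help to watch a parser at work. The following Python sketch is only an illustration: it assumes sentence letters are single alphabetic characters, and the names `parse`, `to_infix` and `polish_to_infix` are ours. Because each connective announces in advance how many wffs follow it, the string can be read left to right without any ambiguity.

```python
BINARY = {"∧", "∨", "→", "↔"}
UNARY = {"∼"}

def parse(s: str, i: int = 0):
    """Parse one wff starting at position i; return (tree, next position)."""
    if i >= len(s):
        raise ValueError("unexpected end of input")
    c = s[i]
    if c in BINARY:
        left, j = parse(s, i + 1)      # first wff after the connective
        right, k = parse(s, j)         # second wff after the connective
        return (c, left, right), k
    if c in UNARY:
        sub, j = parse(s, i + 1)
        return (c, sub), j
    if c.isalpha():                    # a sentence letter
        return c, i + 1
    raise ValueError(f"unexpected symbol {c!r}")

def to_infix(tree) -> str:
    """Write a parsed wff in the usual notation, fully parenthesized."""
    if isinstance(tree, str):
        return tree
    if len(tree) == 2:
        return tree[0] + to_infix(tree[1])
    op, left, right = tree
    return "(" + to_infix(left) + op + to_infix(right) + ")"

def polish_to_infix(s: str) -> str:
    tree, end = parse(s)
    if end != len(s):
        raise ValueError("trailing symbols after a complete wff")
    return to_infix(tree)

print(polish_to_infix("→∧PQR"))   # ((P∧Q)→R)
print(polish_to_infix("∧P→QR"))   # (P∧(Q→R))
```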
3.3 Multivalued logic³
Logicians have considered adding a third truth value to the usual two. In these
new systems, in addition to truth (1) and falsity (0), we have a third truth value,
#. There are a number of things one could take # to mean (e.g., “meaningless”,
or “undefined”, or “unknown”).
Standard logic is “bivalent”; that is, there are no more than two
truth values. So, moving from standard logic to a system that admits a third
truth value is called “denying bivalence”. One could deny bivalence, and go
even further, and admit four, five, or even infinitely many truth values. But
we’ll only discuss trivalent systems, i.e., systems with only three truth values.
Why would one want to admit a third truth value? There are various
philosophical reasons one might give. One concerns vagueness. A person with
one dollar is not rich. A person with a million dollars is rich. Somewhere in
the middle, there are some people that are hard to classify. Perhaps a person
with $100,000 is such a person. They seem neither definitely rich nor definitely
not rich. So there’s pressure to say that the statement “this person is rich” is
capable of being neither definitely true nor definitely false. It’s vague.
³ See Gamut (1991a, pp. 173–183).
Others say we need a third truth value for statements about the future. If
it is in some sense “not yet determined” whether there will be a sea battle
tomorrow, then (it is argued) the sentence:
There will be a sea battle tomorrow
is neither true nor false. In general, statements about the future are neither
true nor false if there is nothing about the present that determines their truth
value one way or the other.⁴
Yet another case in which some have claimed that bivalence fails concerns
failed presupposition. Consider this sentence:
Ted stopped beating his dog.
In fact, I have never beaten a dog. I don’t even have a dog. So is it true that
I stopped beating my dog? Obviously not. But on the other hand, is this
statement false? Certainly no one would want to assert its negation: “Ted has
not stopped beating his dog”. The sentence presupposes that I was beating a dog;
since this presupposition is false, the question of the sentence’s truth does not
arise: the sentence is neither true nor false.
For a final challenge to bivalence, consider the sentence:
Sherlock Holmes has a mole on his left leg
‘Sherlock Holmes’ doesn’t refer to a real entity. Further, Sir Arthur Conan
Doyle does not specify in his Sherlock Holmes stories whether Holmes has
such a mole. Either of these reasons might be argued to result in a truth value
gap for the displayed sentence.
It’s an interesting philosophical question whether any of these arguments
for bivalence’s failing are any good. But we won’t take up that question. Instead,
we’ll look at the formal result of giving up bivalence. That is, we’ll introduce
some non-bivalent formal systems. We won’t ask whether these systems really
model English correctly.
⁴ An alternate view preserves the “openness of the future” as well as bivalence: both ‘There
will be a sea battle tomorrow’ and ‘There will fail to be a sea battle tomorrow’ are false. This
combination is not contradictory, provided one rejects the equivalence of “It will be the case
tomorrow that ∼φ” and “∼ it will be the case tomorrow that φ”.
These systems all give different truth tables for the Boolean connectives.
The original truth tables give you the truth values of complex formulas based
on whether their sentence letters are true or false (1 or 0). The new truth
tables need to take into account cases where the sentence letters are # (neither
1 nor 0).
3.3.1 Łukasiewicz’s system
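Here are the new truth tables (let’s skip the ↔). In each two-place table, the
row gives the value of the left constituent and the column the value of the
right constituent:

∼
1  0
0  1
#  #

∧  1  0  #
1  1  0  #
0  0  0  0
#  #  0  #

∨  1  0  #
1  1  1  1
0  1  0  #
#  1  #  #

→  1  0  #
1  1  0  #
0  1  1  1
#  1  #  1
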
Using these truth tables, one can calculate truth values of wholes based on
truth values of parts.
Example: Where P is 1, Q is 0 and R is #, calculate the truth value of (P∨Q)→∼(R→Q).
First, what is R→Q? Answer, from the truth table for →: #. Next,
what is ∼(R→Q)? From the truth table for ∼, we know that the
negation of a # is a #. So, ∼(R→Q) is #. Next, P∨Q: that’s 1∨0,
i.e., 1. Finally, the whole thing: 1→#, i.e., #.
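The same calculation can be done by table lookup. The Python sketch below transcribes the Łukasiewicz tables as dictionaries keyed by (left value, right value); the dictionary names and the use of the string "#" for the third value are our own choices for illustration.

```python
# Truth values: 1, 0, and "#" for the third value.
NEG = {1: 0, 0: 1, "#": "#"}

OR = {
    (1, 1): 1,   (1, 0): 1,     (1, "#"): 1,
    (0, 1): 1,   (0, 0): 0,     (0, "#"): "#",
    ("#", 1): 1, ("#", 0): "#", ("#", "#"): "#",
}

IMP = {
    (1, 1): 1,   (1, 0): 0,     (1, "#"): "#",
    (0, 1): 1,   (0, 0): 1,     (0, "#"): 1,
    ("#", 1): 1, ("#", 0): "#", ("#", "#"): 1,
}

P, Q, R = 1, 0, "#"

r_to_q = IMP[(R, Q)]            # R→Q
negated = NEG[r_to_q]           # ∼(R→Q)
p_or_q = OR[(P, Q)]             # P∨Q
whole = IMP[(p_or_q, negated)]  # (P∨Q)→∼(R→Q)

print(r_to_q, negated, p_or_q, whole)
```

Working from the inside out like this is exactly what the valuation function does recursively.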
We can formalize this a bit more by defining new notions of an interpretation,
and of truth relative to an interpretation:
A trivalent interpretation is a function, ℐ, that assigns to each sentence
letter exactly one of the values: 1, 0, #.
For any trivalent interpretation, ℐ, the valuation for ℐ, V_ℐ, is
defined as the function that assigns to each wff either 1, 0, or #, and
which is such that, for any wffs φ and ψ,
i) if φ is a sentence letter then V_ℐ(φ) = ℐ(φ)
ii) V_ℐ(φ∧ψ) =
      1 if V_ℐ(φ) = 1 and V_ℐ(ψ) = 1
      0 if V_ℐ(φ) = 0 or V_ℐ(ψ) = 0
      # otherwise
iii) V_ℐ(φ∨ψ) =
      1 if V_ℐ(φ) = 1 or V_ℐ(ψ) = 1
      0 if V_ℐ(φ) = 0 and V_ℐ(ψ) = 0
      # otherwise
iv) V_ℐ(φ→ψ) =
      1 if either: V_ℐ(φ) = 0, or V_ℐ(ψ) = 1,
        or V_ℐ(φ) = # and V_ℐ(ψ) = #
      0 if V_ℐ(φ) = 1 and V_ℐ(ψ) = 0
      # otherwise
v) V_ℐ(∼φ) =
      1 if V_ℐ(φ) = 0
      0 if V_ℐ(φ) = 1
      # otherwise
Let’s define validity and semantic consequence for Łukasiewicz’s system
much like we did for standard PL:
φ is Łukasiewicz-valid (“⊨_Ł φ”) iff for every trivalent interpretation ℐ,
V_ℐ(φ) = 1
φ is a Łukasiewicz-semantic-consequence of Γ (“Γ ⊨_Ł φ”) iff for every
trivalent interpretation, ℐ, if V_ℐ(γ) = 1 for each γ ∈ Γ, then
V_ℐ(φ) = 1
Notice that there are now two ways a formula can fail to be valid. It can be
0 under some trivalent interpretation, or it can be # under some trivalent
interpretation. “Valid” (under this definition) means always true; it does not
mean never false. (A formula can be never false and still not always be true,
if it sometimes is #.)
Example: is P ∨∼P valid?
Answer: no, it isn’t. Suppose P is #. Then ∼P is #; but then the
whole thing is # (since #∨# is #.)
Example: is P→P valid?
Answer: yes. P could be either 1, 0 or #. From the truth table for
→, we see that P→P is 1 in all three cases.
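Validity checks like these can be automated by running through every trivalent interpretation of the letters in a formula. The Python sketch below makes one assumption beyond the text: it encodes # as the number 0.5. Under that encoding the Łukasiewicz tables are reproduced by ∼x = 1−x, ∧ = min, ∨ = max and x→y = min(1, 1−x+y). Formulas are written as nested tuples, and all names are ours.

```python
from itertools import product

HALF = 0.5  # stands in for "#" (an encoding chosen for this sketch)

def value(formula, interp):
    """Łukasiewicz valuation of a formula tree under a trivalent interpretation."""
    if isinstance(formula, str):              # sentence letter
        return interp[formula]
    op, *args = formula
    vals = [value(a, interp) for a in args]
    if op == "not":
        return 1 - vals[0]
    if op == "and":
        return min(vals)
    if op == "or":
        return max(vals)
    if op == "imp":
        return min(1, 1 - vals[0] + vals[1])
    raise ValueError(op)

def letters(formula):
    """The set of sentence letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(letters(a) for a in formula[1:]))

def is_valid(formula):
    """True iff the formula is 1 under every trivalent interpretation of its letters."""
    names = sorted(letters(formula))
    return all(
        value(formula, dict(zip(names, vals))) == 1
        for vals in product((1, 0, HALF), repeat=len(names))
    )

print(is_valid(("or", "P", ("not", "P"))))   # P∨∼P
print(is_valid(("imp", "P", "P")))           # P→P
```

The first check fails because of the interpretation that assigns P the middle value; the second succeeds.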
3.3.2 Kleene’s “strong” tables
This system is like Łukasiewicz’s system, except that the truth table for the →
is different:
→ 1 0 #
1 1 0 #
0 1 1 1
# 1 # #
As with Łukasiewicz’s system, let’s continue to understand validity as truth in
all trivalent interpretations, and semantic consequence as the preservation of
truth in every trivalent interpretation.
Here is the intuitive idea behind the Kleene tables. Let’s call the truth values
0 and 1 the “classical” truth values. If a formula’s halves have only classical
truth values, then the truth value of the whole formula is just the classical
truth value determined by the classical truth values of the halves. But if one or
both halves are #, then we must consider the result of turning each # into one
of the classical truth values. If the entire formula would sometimes be 1 and
sometimes be 0 after doing this, then the entire formula is #. But if the entire
formula always takes the same truth value, X, no matter which classical truth
value any #s are turned into, then the entire formula gets this truth value X.
Intuitively: if there is “enough information” in the classical truth values of a
formula’s parts to settle on one particular classical truth value, then that truth
value is the formula’s truth value.
Take the truth table for φ→ψ, for example. When φ is 0 and ψ is #, the
whole formula is 1, because the false antecedent is sufficient to make the whole
formula true, no matter what classical truth value we convert ψ to. On the
other hand, when φ is 1 and ψ is #, then the whole formula is #. The reason is
that what classical truth value we substitute in for ψ’s # affects the truth value of
the whole. If the # becomes a 0 then the whole thing is 0; but if the # becomes
a 1 then the whole thing is 1.
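This recipe is mechanical enough to generate the strong Kleene tables from the classical ones. In the Python sketch below (names are ours), each # is replaced by both classical values in turn; if all the resulting classical outcomes agree, that outcome is the value, and otherwise the value is #.

```python
from itertools import product

# Classical truth functions on 0 and 1.
CLASSICAL = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "imp": lambda a, b: (1 - a) | b,
}

def resolutions(v):
    """Classical values that a truth value may be turned into."""
    return (0, 1) if v == "#" else (v,)

def strong_kleene(op, a, b):
    """Value of a two-place connective given the values of its two halves."""
    outcomes = {
        CLASSICAL[op](x, y)
        for x, y in product(resolutions(a), resolutions(b))
    }
    # One possible outcome: enough classical information. Two: not enough.
    return outcomes.pop() if len(outcomes) == 1 else "#"

print(strong_kleene("imp", 0, "#"))   # false antecedent settles the matter
print(strong_kleene("imp", 1, "#"))   # outcome depends on how # is resolved
```

Note that the two halves are resolved independently of one another, even if they happen to share sentence letters.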
There are two important differences between Łukasiewicz’s and Kleene’s
systems. The first is that, unlike Łukasiewicz’s system, Kleene’s system makes
the formula P→P invalid. The reason is that in Kleene’s system, #→# is #;
thus, P→P isn’t true in all valuations (it is # in the valuation where P is #).
In fact, it’s easy to show that there are no valid formulas in Kleene’s system.
Consider the interpretation that makes every sentence letter #. Here’s an inductive
proof that every wff is # in this interpretation. Base case: all the sentence letters
are # in this interpretation. (That’s obvious.) Inductive step: assume that φ
and ψ are both # in this interpretation. We need now to show that ∼φ, φ∧ψ,
φ∨ψ, and φ→ψ are all # in this interpretation. But that’s easy: just look at the
truth tables for ∼, ∧, ∨ and →. ∼# is #, #∧# is #, #∨# is #, and #→# is #. QED.
Even though there are no valid formulas in Kleene’s system, there are still
cases of semantic consequence. Semantic consequence for Kleene’s system is
defined as truth-preservation: Γ ⊨_Kleene φ iff φ is true whenever every member
of Γ is true, given Kleene’s truth tables. Then P∧Q ⊨_Kleene P, since the only
way for P∧Q to be true is for P to be true and Q to be true.
The second (related) difference is that in Kleene’s system, → is
interdefinable with the ∼ and ∨, in that φ→ψ has exactly the same truth table as
∼φ∨ψ. (Look at the truth tables to verify that this is true.) But that’s not true
for Łukasiewicz’s system. In Łukasiewicz’s system, when φ and ψ are both #,
then φ→ψ is 1, but ∼φ∨ψ is #.
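Here is one way to do the verification by machine rather than by eye. The Python sketch transcribes the two → tables and the shared ∼ and ∨ tables (the helper `table` and the other names are ours), and lists the value pairs at which φ→ψ and ∼φ∨ψ come apart.

```python
V = (1, 0, "#")   # the order in which rows and columns are listed
NEG = {1: 0, 0: 1, "#": "#"}

def table(rows):
    """Build {(left, right): value} from rows given in the order 1, 0, #."""
    return {(a, b): rows[i][j] for i, a in enumerate(V) for j, b in enumerate(V)}

OR = table([(1, 1, 1), (1, 0, "#"), (1, "#", "#")])
IMP_KLEENE = table([(1, 0, "#"), (1, 1, 1), (1, "#", "#")])
IMP_LUKASIEWICZ = table([(1, 0, "#"), (1, 1, 1), (1, "#", 1)])

def disagreements(imp):
    """Pairs (value of φ, value of ψ) where φ→ψ and ∼φ∨ψ differ."""
    return [(a, b) for a in V for b in V if imp[(a, b)] != OR[(NEG[a], b)]]

print(disagreements(IMP_KLEENE))
print(disagreements(IMP_LUKASIEWICZ))
```

The Kleene list is empty, while the Łukasiewicz list contains exactly the case where both constituents are #.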
3.3.3 Kleene’s “weak” tables (Bochvar’s tables)
This final system is based on a very different intuitive idea: that # is “infectious”.
That is, if any formula has a part that is #, then the entire formula is #. Thus,
the tables are as follows:

∼
1  0
0  1
#  #

∧  1  0  #
1  1  0  #
0  0  0  #
#  #  #  #

∨  1  0  #
1  1  1  #
0  1  0  #
#  #  #  #

→  1  0  #
1  1  0  #
0  1  1  #
#  #  #  #

So basically, the classical bit of each truth table is what you’d expect; but
everything gets boring if any constituent formula is a #.
One way to think about these tables is to think of the # as indicating nonsense.
The sentence “The sun is purple and blevledgekl;rz”, one might naturally think,
is neither true nor false because it is nonsense. It is nonsense even though it
has a part that isn’t nonsense.
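The “infectious” idea takes very little to state in code. In this Python sketch (function names are ours), any # input yields #, and only fully classical inputs are handed to the classical truth function.

```python
def weak_kleene(classical_op, a, b):
    """Weak Kleene (Bochvar) value: # if either input is #, else the classical value."""
    if a == "#" or b == "#":
        return "#"
    return classical_op(a, b)

def conj(a, b):
    return a & b

def disj(a, b):
    return a | b

def cond(a, b):
    return (1 - a) | b

# A false conjunct does not rescue the conjunction, and a true
# disjunct does not rescue the disjunction:
print(weak_kleene(conj, 0, "#"))
print(weak_kleene(disj, 1, "#"))
```

Compare the strong Kleene tables, where 0∧# is 0 and 1∨# is 1.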
3.3.4 Supervaluationism
Recall the guiding thought behind the strong Kleene tables: if a formula’s
classical truth values fix a particular truth value, then that is the value that the
formula takes on. There is a way to take this idea a step further, which results
in a new and interesting way of thinking about three-valued logic.
According to the strong Kleene tables, we get a classical truth value for
φ∗ψ, where ∗ is any connective, only when we have “enough classical
information” in the truth values of φ and ψ to fix a classical truth value for
φ∗ψ. Consider φ∧ψ for example: if either φ or ψ is false, then since
falsehood of a conjunct is classically sufficient for the falsehood of the whole
conjunction, the entire formula is false. But if, on the other hand, both φ and
ψ are #, then neither φ nor ψ has a classical truth value, we do not have enough
classical information to settle on a classical truth value for φ∧ψ, and so the
whole formula is #.
But now consider a special case of the situation just considered, where φ is
P, ψ is ∼P, and P is #. According to the strong Kleene tables, the conjunction
P∧∼P is #, since it is the conjunction of two formulas that are #. But there is a
way of thinking about truth values of complex sentences according to which
the truth value ought to be 0, not #: no matter what classical truth value P were
to take on, the whole sentence P∧∼P would be 0—therefore, one might think,
P∧∼P ought to be 0. If P were 0 then P∧∼P would be 0∧∼0—that is 0; and if
P were 1 then P∧∼P would be 1∧∼1—0 again.
The general thought here is this: suppose a sentence φ contains some
sentence letters P_1 … P_n that are #. If φ would be false no matter how we
assign classical truth values to P_1 … P_n (that is, no matter how we
precisified φ), then φ is in fact false. Further, if φ would be true no matter
how we precisified it, then φ is in fact true. But if precisifying φ would
sometimes make it true and sometimes make it false, then φ in fact is #.
The idea here can be thought of as an extension of the idea behind the
strong Kleene tables. Consider a formula φ∗ψ, where ∗ is any connective.
If there is enough classical information in the truth values of φ and ψ to fix on a
particular classical truth value, then the strong Kleene tables assign φ∗ψ that
truth value. Our new idea goes further, and says: if there is enough classical
information within φ and ψ to fix a particular classical truth value, then φ∗ψ
gets that truth value. Information “within” φ and ψ includes, not only the
truth values of φ and ψ, but also a certain sort of information about sentence
letters that occur in both φ and ψ. For example, in P∧∼P, when P is #, there
is insufficient classical information in the truth values of P and of ∼P to settle
on a truth value for the whole formula P∧∼P (since each is #). But when we
look inside P and ∼P, we get more classical information: we can use the fact
that P occurs in each to reason as we did above: whenever we turn P to 0, we
turn ∼P to 1, and so P∧∼P becomes 0; and whenever we turn P to 1 we turn
∼P to 0, and so again, P∧∼P becomes 0.
This new idea—that a formula has a classical truth value iff every way of
precisifying it results in that truth value—is known as supervaluationism. Let us
lay out this idea formally.
As above, a trivalent interpretation is a function that assigns to each sentence
letter one of the values 1, 0, #. Where ℐ is a trivalent interpretation and 𝒞
is a PL-interpretation (i.e., a bivalent interpretation in the sense of section
2.3), say that 𝒞 is a precisification of ℐ iff: whenever ℐ assigns a sentence letter
a classical truth value (i.e., 1 or 0), 𝒞 assigns that sentence letter the same
classical value. Thus, precisifications of ℐ agree with ℐ on the classical truth
values, but in addition (being PL-interpretations) they also assign classical
truth values to sentence letters to which ℐ assigns #. A given precisification of
ℐ “decides” all of ℐ’s #s in a certain way.
We can now say how the supervaluationist assigns truth values to complex
formulas relative to a trivalent interpretation. When φ is any wff and ℐ is a
trivalent interpretation, let us define S_ℐ(φ), the “supervaluation of φ relative
to ℐ”, thus:
S_ℐ(φ) = 1 if V_𝒞(φ) = 1 for every precisification, 𝒞, of ℐ
S_ℐ(φ) = 0 if V_𝒞(φ) = 0 for every precisification, 𝒞, of ℐ
S_ℐ(φ) = # otherwise
Here V_𝒞 is the valuation for PL-interpretation 𝒞, as defined in section 2.3.
Some common terminology: when S_ℐ(φ) = 1, we say that φ is supertrue in ℐ,
and when S_ℐ(φ) = 0, we say that φ is superfalse in ℐ. For the supervaluationist,
a formula is true when it is supertrue (i.e., true in all precisifications of ℐ), false
when it is superfalse (i.e., false in all precisifications of ℐ), and # when it is
neither supertrue nor superfalse (i.e., when it is true in some precisifications of
ℐ but false in others).
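The definition of the supervaluation translates quite directly into code. The Python sketch below makes a simplifying assumption: an interpretation is a dictionary covering only the finitely many sentence letters that occur in the formula at hand. Formulas are nested tuples, and all names are ours.

```python
from itertools import product

def classical_value(formula, interp):
    """Ordinary bivalent valuation of a formula tree."""
    if isinstance(formula, str):              # sentence letter
        return interp[formula]
    op, *args = formula
    vals = [classical_value(a, interp) for a in args]
    if op == "not":
        return 1 - vals[0]
    if op == "and":
        return min(vals)
    if op == "or":
        return max(vals)
    if op == "imp":
        return max(1 - vals[0], vals[1])
    raise ValueError(op)

def precisifications(trivalent):
    """Every bivalent interpretation that agrees with `trivalent` on its classical values."""
    gaps = [p for p, v in trivalent.items() if v == "#"]
    for choice in product((0, 1), repeat=len(gaps)):
        yield {**trivalent, **dict(zip(gaps, choice))}

def supervaluation(formula, trivalent):
    """1 if supertrue, 0 if superfalse, '#' otherwise."""
    values = {classical_value(formula, c) for c in precisifications(trivalent)}
    return values.pop() if len(values) == 1 else "#"

contradiction = ("and", "P", ("not", "P"))
print(supervaluation(contradiction, {"P": "#"}))   # false on every precisification
print(supervaluation("P", {"P": "#"}))             # true on some, false on others
```

Unlike a truth table, this procedure looks at whole precisifications, so it can exploit the fact that the same letter occurs on both sides of a connective.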
Supervaluational notions of validity and semantic consequence may be
defined thus:
φ is supervaluationally valid (“⊨_SV φ”) iff φ is supertrue in every
trivalent interpretation
φ is a supervaluational semantic consequence of Γ (“Γ ⊨_SV φ”) iff φ is
supertrue in each trivalent interpretation in which every member of Γ
is supertrue
To return to the example considered above: the supervaluationist assigns a
different truth value to P∧∼P, when P is #, than do the strong Kleene tables
(and indeed, than do all the other tables we have considered). The strong
Kleene tables say that P∧∼P is # in this case. But the supervaluationist says
that it is 0: each precisification of any trivalent interpretation that assigns P #
is by definition a PL-interpretation, and P∧∼P is 0 in each PL-interpretation.
Let us note a few facts about supervaluationism.
First, note that every tautology (PL-valid formula) turns out to be
supervaluationally valid. For let φ be a tautology; and consider any trivalent
interpretation ℐ, and any precisification 𝒞 of ℐ. Precisifications are
PL-interpretations; so, since φ is a tautology, φ is true in 𝒞. Since 𝒞 was
an arbitrary precisification of ℐ, φ is supertrue in ℐ.
Second, note that according to supervaluationism, some formulas are neither
true nor false in some trivalent interpretations. For instance, take the
formula P∧Q, in any trivalent interpretation ℐ in which P is 1 and Q is #. Any
precisification of ℐ must continue to assign P 1. But some precisifications of
ℐ will assign 1 to Q, whereas others will assign 0 to Q. Any of the former
precisifications will assign 1 to P∧Q, whereas any of the latter will assign 0 to
P∧Q. Hence P∧Q is neither supertrue nor superfalse in ℐ: S_ℐ(P∧Q) = #.
Finally, notice that the propositional connectives are not truth functional
according to supervaluationism. To say that a connective ∗ is truth functional
is to say that the truth value⁵ of a complex statement whose major connective
is ∗ is a function of the truth values of the immediate constituents of that
formula; that is, any two such formulas whose immediate constituents have
the same truth values must themselves have the same truth value. According to
all of the truth tables for three-valued logic we considered earlier (Łukasiewicz,
Kleene strong and weak), the propositional connectives are truth-functional.
Indeed, this is in a way trivial: if a connective isn’t truth-functional then one
can’t give it a truth table at all: what a truth table does is specify how a connective
determines the truth value of entire sentences as a function of the truth values
of its parts. But supervaluationism renders the propositional connectives
not truth functional. Consider the following pair of sentences, in a trivalent
interpretation in which P and Q are both #:
P∧Q
P∧∼P
As we have seen, in such trivalent interpretations, the first formula is # and
the second formula is 0 (since it is superfalse). But each of these formulas is a
conjunction, each of whose conjuncts is #: the truth values of φ and ψ do not
determine the truth value of φ∧ψ.
⁵ I am counting #, in addition to 1 and 0, as a “truth value”.
Similarly, the truth values of other complex formulas are not determined
by the truth values of their parts, as the following pairs of formulas show (the
indicated truth values, again, are relative to a trivalent interpretation in which
P and Q are #):
P∨Q #
P∨∼P 1
P→Q #
P→P 1
3.4 Intuitionism
Intuitionism in the philosophy of mathematics is a view according to which
there are no mind-independent mathematical facts. Rather, mathematical facts
and entities are mental constructs that owe their existence to the mental activity
of mathematicians constructing proofs. This philosophy of mathematics leads
intuitionists to a distinctive form of logic: intuitionist logic.
Let P be the statement: The sequence 0123456789 occurs somewhere in the
decimal expansion of π. How should we think about its meaning? For the classical
mathematician, the answer is straightforward. P is a statement about a part of
mathematical reality, namely, the infinite decimal expansion of π. Either the
sequence 0123456789 occurs somewhere in that expansion, in which case P is
true, or it does not, in which case P is false and ∼P is true.
For the intuitionist, this whole picture is mistaken, premised as it is on the
reality of an infinite decimal expansion of π. Our minds are finite, and so only
the finite initial segment of π’s decimal expansion that we have constructed so
far is real. The intuitionist’s alternate picture of P’s meaning, and indeed of
meaning generally (for mathematical statements) is a radical one.⁶
The classical mathematician, comfortable with the idea of a realm of
mind-independent entities, thinks of meaning in terms of truth and falsity. As we saw,
she thinks of P as being true or false depending on the facts about π’s decimal
expansion. Further, she explains the meanings of the propositional connectives
in truth-theoretic terms: a conjunction is true iff each of its conjuncts is true;
a negation is true iff the negated formula is false; and so on. Intuitionists, on
the other hand, reject the centrality of truth to meaning, since truth is tied
up with the rejected picture of mind-independent mathematical reality. For
them, the central semantic concept is that of proof. They simply do not think
in terms of truth and falsity; in matters of meaning, they think in terms of the
conditions under which formulas have been proved.
⁶ One intuitionist picture, anyway, on which see Dummett (1973). What follows is a crude
sketch. It does not do justice to the actual intuitionist position, which is, as they say, subtle.
Take P, for example. Intuitionists advise us: don’t think in terms of what
it would take for P to be true. Think, rather, in terms of what it would take
to prove P. And the answer is clear: we would need to actually continue our
construction of the decimal expansion of π to a point where we found the
sequence 0123456789.
What, now, of ∼P? Again, thinking in terms of proof, not truth: what
would it take for ∼P to be proved? The answer here is less straightforward.
Since P said that there exists a number of a certain sort, it was clear how it
would have to be proved: by actually exhibiting (calculating) some particular
number of that sort. But ∼P says that there is no number of a certain sort; how
do we prove something like that? The intuitionist’s answer: by proving that the
assumption that there is a number of that sort leads to a contradiction. In general, a
negation, ∼φ, is proved by proving that φ leads to a contradiction.⁷
Similarly for the other connectives: the intuitionist explicates their meanings
by their role in generating proof conditions, rather than truth conditions. φ∧ψ
is proved by separately giving a proof of φ and a proof of ψ; φ∨ψ is proved
by giving either a proof of φ or a proof of ψ; φ→ψ is proved by exhibiting a
construction whereby any proof of φ can be converted into a proof of ψ.
Likewise, the intuitionist thinks of logical consequence as the preservation
of provability, not the preservation of truth. For example, φ∧ψ logically implies
φ because if one has a proof of φ∧ψ, then one has a proof of φ; and conversely,
if one has proofs of φ and ψ separately, then one has the materials for a proof
of φ∧ψ. So far, so classical. But ∼∼φ does not logically imply φ, for the
intuitionist. Simply having a proof of ∼∼P—a proof that the assumption that
0123456789 occurs nowhere in π’s decimal expansion leads to a contradiction—
wouldn’t give us a proof of P, since proving P would require exhibiting a
particular place in π’s decimal expansion where 0123456789 occurs.
Likewise, intuitionists do not accept the law of the excluded middle, φ∨∼φ,
as a logical truth. To be a logical truth, according to an intuitionist, a sentence
should be provable from no premises whatsoever. But to prove P∨∼P, for
example, would require either exhibiting a case of 0123456789 in π’s decimal
expansion, or proving that the assumption that 0123456789 occurs in π’s decimal
expansion leads to a contradiction. We’re not in a position to do either.
⁷ Given the contrast with the classical conception of negation, a different symbol (often
“¬”) is sometimes used for intuitionist negation.
Though we won’t consider intuitionist predicate logic, one of its most
striking features is easy to grasp informally. Intuitionists say that an existentially
quantified sentence is proved iff one of its instances has been proved. Therefore
they reject the inference from ∼∀xFx to ∃x∼Fx, for one might be able to
prove a contradiction from the assumption of ∀xFx without being able to
prove any instance of ∃x∼Fx.
We have so far been considering a putative philosophical justification for
intuitionist propositional logic. That justification has been rough and ready;
but intuitionist propositional logic itself is easy to present, perfectly precise,
and is a coherent system regardless of what one thinks of its philosophical
underpinnings. Two simple modifications to the natural deduction system of
section 2.4 generate a natural deduction system for intuitionistic propositional
logic. First, we drop one half of the double-negation rule, namely what we
might call “double-negation elimination” (DNE):
Γ ⊢ ∼∼φ
--------
Γ ⊢ φ
Second, we add the rule “ex falso” (EF):
Γ ⊢ φ∧∼φ
--------
Γ ⊢ ψ
Note that EF can be proved in the original system of section 2.4: simply use
RAA and then DNE. So, intuitionist logic results from a system for classical
logic by simply dropping one rule (DNE) and adding another rule that was
previously provable (EF). It follows that every intuitionistically provable sequent
is also classically provable (because every intuitionistic proof can be converted
to a classical proof).
Notice how dropping DNE blocks proofs of various classical theorems the
intuitionist wants to avoid. The proof of ∅ ⊢ P∨∼P in section 2.4, for instance,
used DNE. Of course, for all we’ve said so far, there might be some other way
to prove this sequent. Only when we have a semantics for intuitionistic logic,
and a soundness proof relative to that semantics, can we show that this sequent
cannot be proven without DNE. We will discuss a semantics for intuitionism
in section 7.2.
It is interesting to note that even though intuitionists reject the inference
from ∼∼P to P, they accept the inference from ∼∼∼P to ∼P, since its proof
only requires the half of DN that they accept, namely the inference from P to
∼∼P:
1    (1)  ∼∼∼P          As
2    (2)  P             As (for reductio)
2    (3)  ∼∼P           2, DN (accepted version)
1,2  (4)  ∼∼P∧∼∼∼P      1,3 ∧I
1    (5)  ∼P            4, RAA
Note that you can’t use this sort of proof to establish ∼∼P ⊢ P. Given the
way RAA is stated, its application always results in a formula beginning with
the ∼.
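For readers who like to see such claims machine-checked, here is a sketch in Lean 4. This is our illustration, not the natural deduction system of section 2.4: Lean’s core logic is constructive, and a proof counts as intuitionistically acceptable here so long as it never appeals to the `Classical` axioms. The theorem names are ours. The first proof mirrors the derivation above: assume P, obtain ∼∼P, and contradict ∼∼∼P. The second is the rule EF.

```lean
-- From ∼∼∼P infer ∼P, using only the accepted half of DN (from P to ∼∼P).
theorem triple_neg {P : Prop} (h : ¬¬¬P) : ¬P :=
  fun p => h (fun np => np p)

-- Ex falso: from φ∧∼φ infer any ψ.
theorem ex_falso {P Q : Prop} (h : P ∧ ¬P) : Q :=
  absurd h.1 h.2
```

No analogous term proves ¬¬P → P for an arbitrary P without the classical axioms, which matches the intuitionist’s rejection of DNE.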
Chapter 4
Predicate Logic
Let’s now turn from propositional logic to the “predicate calculus” (PC),
as it is sometimes called. As with propositional logic, we’re going to
formalize predicate logic. We’ll first do grammar, and then move to semantics.
We won’t consider proof theory at all.¹
4.1 Grammar of predicate logic
As before, we start by specifying the kinds of symbols that may be used in
sentences of predicate logic (the primitive vocabulary) and then go on to define
the well-formed formulas as strings of primitive vocabulary that have the right
form.
Primitive vocabulary (of PC):
i) logical: →, ∼, ∀
ii) nonlogical:
a) for each n > 0, n-place predicates F, G, …, with or without
subscripts
b) variables x, y, …, with or without subscripts
c) individual constants (names) a, b, …, with or without
subscripts
iii) parentheses
¹ Proof systems for predicate logic, in both axiomatic and natural-deduction form, are
straightforward, and can be found in standard logic textbooks.
No symbol of one type is a symbol of any other type. Let’s call any variable or
constant a term.
Definition of (PC) wff:
i) if Π is an n-place predicate and α_1 … α_n are terms, then Πα_1 … α_n
is a wff
ii) if φ, ψ are wffs, and α is a variable, then ∼φ, (φ→ψ), and ∀αφ
are wffs
iii) nothing else is a wff.
We’ll call formulas that are wffs in virtue of clause i) “atomic” formulas. When
a formula has no free variables, we’ll say that it is a closed formula, or sentence;
otherwise it is an open formula.
We have the same defined logical terms: ∧, ∨, ↔. We also add the following
definition of the existential quantifier:
∃αφ =df ∼∀α∼φ (where α is a variable and φ is a wff).
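The recursive definition of a wff maps naturally onto a small data type. The Python sketch below is an illustration under assumptions of our own: terms are strings, the set `VARIABLES` is a hypothetical finite stock of variables, and `free_vars` implements the usual notion of a free variable (an occurrence not bound by a ∀ for that variable), which the definition of “sentence” above relies on.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    predicate: str
    terms: tuple          # variables or constants, as strings

@dataclass(frozen=True)
class Not:
    sub: object

@dataclass(frozen=True)
class Implies:
    left: object
    right: object

@dataclass(frozen=True)
class Forall:
    var: str
    body: object

VARIABLES = {"x", "y", "z"}   # hypothetical stock of variables for this sketch

def exists(var, body):
    """∃αφ, introduced by definition as ∼∀α∼φ."""
    return Not(Forall(var, Not(body)))

def free_vars(f):
    """Variables with at least one free occurrence in f."""
    if isinstance(f, Atom):
        return {t for t in f.terms if t in VARIABLES}
    if isinstance(f, Not):
        return free_vars(f.sub)
    if isinstance(f, Implies):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Forall):
        return free_vars(f.body) - {f.var}
    raise TypeError(f)

def is_sentence(f):
    """A closed formula: one with no free variables."""
    return not free_vars(f)

Fx = Atom("F", ("x",))
print(is_sentence(Fx), is_sentence(exists("x", Fx)))
```

Since ∃ is defined rather than primitive, `exists` builds nothing new: it just returns the corresponding ∼∀∼ formula.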
4.2 Semantics of predicate logic
Recall from section 2.2 the truth-conditional conception of semantics.
Semantics is about meaning; meaning is about the way truth is determined by
the world; and the way we represent the dependence of truth on the world in
logic consists of i) defining certain abstract configurations, which represent
different ways the world could be, and ii) defining the notion of truth for
formulas in these configurations. We thereby shed light on meaning, and we
are thereby able to define precise versions of the notions of logical truth and
logical consequence.
In propositional logic, the configurations were the PL-interpretations:
assignments of truth or falsity (1 or 0) to sentence letters; and valuation functions
defined truth in a configuration. This procedure needs to get more complicated
for predicate logic. The reason is that the method of truth tables assumes that
we can calculate the truth value of a complex formula by looking at the truth
values of its parts. But take the sentence ∃x(Fx∧Gx). You can’t calculate its
truth value by looking at the truth values of Fx and Gx, since formulas like
Fx don’t have truth values at all. The variable ‘x’ doesn’t stand for any one
thing, and so ‘Fx’ doesn’t have a truth value.
The solution to this problem is due to the Polish logician Alfred Tarski. It
begins with a new conception of a configuration, that of a model:
A (PC) model is an ordered pair 〈𝒟, ℐ〉, such that:
i) 𝒟 is a nonempty set (“the domain”)
ii) ℐ is a function (“the interpretation function”) obeying the
following constraints:
a) if α is a constant then ℐ(α) ∈ 𝒟
b) if Π is an n-place predicate, then ℐ(Π) = some set of
n-tuples of members of 𝒟.
(Recall the notion of an n-tuple from section 1.8.)
Models, as we have defined them, seem like good ways to represent
configurations of the world. The domain, 𝒟, contains, intuitively, the individuals
that exist in the configuration. ℐ, the interpretation function, tells us what the
nonlogical constants (names and predicates) mean in the configuration. ℐ
assigns to each name a member of the domain: its denotation. For example,
if the domain is the set of persons, then the name ‘a’ might be assigned me.
One-place predicates get assigned sets of 1-tuples of 𝒟, that is, just sets of
members of 𝒟. So, a one-place predicate ‘F’ might get assigned a set of persons.
That set is called the “extension” of the predicate; if the extension is the set
of males, then the predicate ‘F’ might be thought of as symbolizing “is male”.
Two-place predicates get assigned sets of ordered pairs of members of 𝒟, that
is, binary relations over the domain. If a two-place predicate ‘R’ is assigned
the set of ordered pairs of persons 〈u, v〉 such that u is taller than v, we might
think of ‘R’ as symbolizing “is taller than”. Similarly, three-place predicates get
assigned sets of ordered triples…
Relative to any PC-model 〈𝒟, ℐ〉, we want to define the notion of truth in
a model: the corresponding valuation function. But we’ll need some apparatus
first. It’s pretty easy to see what truth value a sentence like Fa should have. ℐ
assigns a member of the domain to a; call that member u. ℐ also assigns a
subset of the domain to F; let’s call that subset S. The sentence Fa should be
true iff u ∈ S, that is, iff the referent of a is a member of the extension of F.
That is, Fa should be true iff ℐ(a) ∈ ℐ(F). Similarly, Rab should be true iff
〈ℐ(a), ℐ(b)〉 ∈ ℐ(R). And so on.
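A toy model makes these clauses concrete. In the Python sketch below everything specific is hypothetical: a three-member domain, two constants, and two predicates. Extensions are stored uniformly as sets of tuples, so the extension of a one-place predicate is a set of 1-tuples; `is_model` checks the constraints in the definition of a model, and `atomic_true` implements the truth condition for atomic sentences built from constants.

```python
domain = {"ann", "bob", "cy"}          # hypothetical domain 𝒟
interp = {                             # hypothetical interpretation function ℐ
    "a": "ann",
    "b": "bob",
    "F": {("ann",), ("cy",)},          # one-place predicate: set of 1-tuples
    "R": {("ann", "bob")},             # two-place predicate: set of ordered pairs
}

def is_model(domain, interp, constants, arities):
    """Check the constraints on a model; `arities` maps each predicate to its n."""
    if not domain:                                     # 𝒟 must be nonempty
        return False
    if any(interp[c] not in domain for c in constants):
        return False                                   # ℐ(α) must be in 𝒟
    return all(
        len(t) == n and all(u in domain for u in t)    # n-tuples of members of 𝒟
        for pred, n in arities.items()
        for t in interp[pred]
    )

def atomic_true(interp, predicate, *constants):
    """Πα_1…α_n is true iff the tuple of denotations is in the extension of Π."""
    return tuple(interp[c] for c in constants) in interp[predicate]

print(is_model(domain, interp, ["a", "b"], {"F": 1, "R": 2}))
print(atomic_true(interp, "F", "a"), atomic_true(interp, "R", "b", "a"))
```

Note that this handles only atomic sentences whose terms are constants; variables need the further apparatus introduced next.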
As before, we can give recursive clauses for the truth values of negations
and conditionals. φ→ψ, for example, will be true iff either φ is false or ψ is
true.
But this becomes tricky when we try to specify the truth value of ∀xF x. It
should, intuitively, be true if and only if ‘F x’ is true, no matter what we put
in in place of ‘x’. But this is vague. Do we mean “whatever name (constant)
we put in place of ‘x”’? No, because we don’t want to assume that we’ve got a
name for everything in the domain, and what if F x is true for all the objects we
have names for, but false for one of the nameless things! Do we mean, “true no
matter what object from the domain we put in place of ‘x”’? No; objects from
the domain aren’t part of our primitive vocabulary, so the result of replacing ‘x’
with an object from the domain won’t be a formula! [2]
Tarski’s solution to this problem goes as follows. Initially, we don’t consider
truth values of formulas absolutely. Rather, we let the variables refer to certain
things in the domain temporarily. Then, we’ll say that ∀xFx will be true iff
for all objects u in the domain $\mathcal{D}$: Fx is true while x temporarily refers to u.
We implement this idea of temporary reference by defining the notion of a
variable assignment:
g is a variable assignment for model $\langle\mathcal{D},\mathcal{I}\rangle$ iff g is a function that
assigns to each variable some object in $\mathcal{D}$.
The variable assignments give the “temporary” meanings to the variables; when
g(x) = u, then u is the temporary denotation of x.
We need a further bit of notation. Let u be some object in $\mathcal{D}$, let g be some
variable assignment, and let α be a variable. We then define “$g_{u/\alpha}$” to be the
variable assignment that is just like g, except that it assigns u to α. (If g already
assigns u to α then $g_{u/\alpha}$ will be the same function as g.)
Note the following important fact about variable assignments: $g_{u/\alpha}$, when
applied to α, must give the value u. (Work through the definitions to see that
this is so.) That is:
$g_{u/\alpha}(\alpha) = u$
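The notation just introduced can be pictured concretely. The following is only a sketch, assuming a variable assignment is represented as a Python dictionary from variable names to objects; the function name `variant` is made up for this illustration.

```python
def variant(g, alpha, u):
    """Return the assignment that is just like g except that it assigns u to alpha."""
    h = dict(g)      # copy, so that g itself is left unchanged
    h[alpha] = u     # override (or set) the value of alpha
    return h

g = {"x": 1, "y": 2}          # a hypothetical assignment
h = variant(g, "x", 5)

print(h["x"])                 # 5: the variant, applied to alpha, gives u
print(h["y"])                 # 2: every other variable is left alone
print(variant(g, "x", 1) == g)  # True: if g already assigns u to alpha, nothing changes
```

Copying the dictionary mirrors the fact that the variant is a new function; g keeps its old values and can be reused.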
One more bit of apparatus. Given any model $\mathcal{M}$ ($=\langle\mathcal{D},\mathcal{I}\rangle$), any
variable assignment g, and any term (i.e., variable or name) α, we define
the denotation of α, relative to $\mathcal{M}$ and g, “$[\alpha]_{\mathcal{M},g}$”, as follows:
$[\alpha]_{\mathcal{M},g} = \mathcal{I}(\alpha)$ if α is a constant, and
$[\alpha]_{\mathcal{M},g} = g(\alpha)$ if α is a variable.
The subscripts $\mathcal{M}$ and g on [ ] indicate that denotations are assigned relative
to a model ($\mathcal{M}$), and relative to a variable assignment (g).
[2] Unless the domain happens to contain members of our primitive vocabulary!
Now we are ready to define the valuation function, which assigns truth
values relative to variable assignments. (Relativization to assignments is necessary
because, as we noticed before, Fx doesn’t have a truth value absolutely. It only
has a truth value relative to an assigned value to the variable x, i.e., relative to
a choice of an arbitrary denotation for x.)
The valuation function, $V_{\mathcal{M},g}$, for model $\mathcal{M}$ ($=\langle\mathcal{D},\mathcal{I}\rangle$) and variable
assignment g, is defined as the function that assigns to each wff
either 0 or 1 subject to the following constraints:
i) for any n-place predicate Π and any terms $\alpha_1 \dots \alpha_n$,
$V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n) = 1$ iff $\langle[\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g}\rangle \in \mathcal{I}(\Pi)$
ii) for any wffs φ, ψ, and any variable α:
a) $V_{\mathcal{M},g}(\sim\phi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 0$
b) $V_{\mathcal{M},g}(\phi\to\psi) = 1$ iff either $V_{\mathcal{M},g}(\phi) = 0$ or $V_{\mathcal{M},g}(\psi) = 1$
c) $V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g_{u/\alpha}}(\phi) = 1$
(In understanding clause i), recall that the one-tuple containing just u, 〈u〉, is
just u itself. Thus, in the case where Π is F, some one-place predicate, clause i)
says that $V_{\mathcal{M},g}(F\alpha) = 1$ iff $[\alpha]_{\mathcal{M},g} \in \mathcal{I}(F)$.)
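For a finite domain, the clauses of this definition can be run directly. What follows is a minimal sketch, not part of the text's official apparatus. It assumes wffs are encoded as nested tuples such as `("all", "x", ...)`, that the interpretation is a dictionary, and that extensions are sets of Python tuples (so a one-place extension is a set of 1-tuples, whereas the text identifies 〈u〉 with u). The names `denote` and `value` and the sample model are all made up for illustration.

```python
def denote(term, interp, g):
    """Denotation of a term: constants come from the interpretation, variables from g."""
    return interp[term] if term in interp else g[term]

def value(phi, domain, interp, g):
    """Truth value of phi relative to g, returned as a Python bool rather than 1/0."""
    op = phi[0]
    if op == "atom":                      # clause i)
        _, pred, terms = phi
        return tuple(denote(t, interp, g) for t in terms) in interp[pred]
    if op == "not":                       # clause ii a)
        return not value(phi[1], domain, interp, g)
    if op == "imp":                       # clause ii b)
        return (not value(phi[1], domain, interp, g)) or value(phi[2], domain, interp, g)
    if op == "all":                       # clause ii c): try every u in the domain
        _, var, body = phi
        return all(value(body, domain, interp, {**g, var: u}) for u in domain)
    raise ValueError(f"unknown operator: {op!r}")

# A hypothetical model: three objects, 'a' names 1, everything is F, R holds of <1, 2> only.
domain = {1, 2, 3}
interp = {"a": 1, "F": {(1,), (2,), (3,)}, "R": {(1, 2)}}
g = {"x": 1, "y": 1}                      # an arbitrary starting assignment

Fx, Fa = ("atom", "F", ("x",)), ("atom", "F", ("a",))
print(value(("imp", ("all", "x", Fx), Fa), domain, interp, g))            # True
print(value(("all", "x", ("atom", "R", ("x", "a"))), domain, interp, g))  # False
```

The expression `{**g, var: u}` builds the assignment that differs from g only in what it assigns to the quantified variable, which is exactly what the clause for ∀ calls for.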
So far we have defined the notion of truth in a model relative to a variable
assignment. But what we really want is a notion of truth in a model, period; that
is, absolute truth in a model. (We want this because we want to define, e.g., a
valid formula as one that is true in all models.) So, let’s define absolute truth in
a model in this way:
φ is true in $\mathcal{M}$ iff $V_{\mathcal{M},g}(\phi) = 1$, for each variable assignment g
It might seem that this is too strict a requirement: why must φ be true relative
to each variable assignment? But in fact, it’s not too strict at all. The kinds of
formulas we’re really interested in are formulas without free variables (we’re
interested in formulas like Fa, ∀xFx, ∀x(Fx→Gx); not formulas like Fx,
∀xRxy, etc.). And if a formula has no free variables, then if there’s even a
single variable assignment relative to which it is true, then it is true relative to
every variable assignment. (And so, we could just as well have defined truth in
a model as truth relative to some variable assignment.) I won’t prove this fact,
but it’s not too hard to prove; one would simply need to prove (by induction)
that, for any wff φ, if variable assignments g and h agree on all variables free
in φ, then $V_{\mathcal{M},g}(\phi) = V_{\mathcal{M},h}(\phi)$.
Now we can give semantic definitions of the core logical notions:
φ is PC-valid (“$\vDash_{PC} \phi$”) iff φ is true in all PC-models
φ is a PC-semantic consequence of Γ (“$\Gamma \vDash_{PC} \phi$”) iff for every PC-model
$\mathcal{M}$, if each member of Γ is true in $\mathcal{M}$ then φ is also true in $\mathcal{M}$
Since our new definition of the valuation function treats the propositional
connectives → and ∼ in the same way as the propositional logic valuation did,
it’s easy to prove that it also treats the defined connectives ∧, ∨, and ↔ in the
same way:
$V_{\mathcal{M},g}(\phi\wedge\psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 1$ and $V_{\mathcal{M},g}(\psi) = 1$
$V_{\mathcal{M},g}(\phi\vee\psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 1$ or $V_{\mathcal{M},g}(\psi) = 1$
$V_{\mathcal{M},g}(\phi\leftrightarrow\psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = V_{\mathcal{M},g}(\psi)$
Moreover, we can also prove that the valuation function treats ∃ as it should
(given its intended meaning):
$V_{\mathcal{M},g}(\exists\alpha\phi) = 1$ iff there is some $u \in \mathcal{D}$ such that $V_{\mathcal{M},g_{u/\alpha}}(\phi) = 1$
This can be established as follows:
The definition of ∃αφ is: ∼∀α∼φ. So, we must show that for
any model, and any variable assignment g based on that model,
$V_{\mathcal{M},g}(\sim\forall\alpha\sim\phi) = 1$ iff there is some $u \in \mathcal{D}$ such that $V_{\mathcal{M},g_{u/\alpha}}(\phi) = 1$.
(In arguments like these, I’ll sometimes stop writing the subscript
$\mathcal{M}$ in order to reduce clutter. It should be obvious from the context
what the relevant model is.) Here’s the argument:
• $V_g(\sim\forall\alpha\sim\phi) = 1$ iff $V_g(\forall\alpha\sim\phi) = 0$ (given the clause for ∼ in the
definition of the valuation function)
• But, $V_g(\forall\alpha\sim\phi) = 0$ iff for some $u \in \mathcal{D}$, $V_{g_{u/\alpha}}(\sim\phi) = 0$
• Given the clause for ∼, this can be rewritten as:
… iff for some $u \in \mathcal{D}$, $V_{g_{u/\alpha}}(\phi) = 1$
4.3 Establishing validity and invalidity
Given our definitions, we can establish that particular formulas are valid. For
example, let’s show that ∀xFx→Fa is valid; that is, let’s show that for any model
$\langle\mathcal{D},\mathcal{I}\rangle$, and any variable assignment g, $V_g(\forall xFx\to Fa) = 1$.
i) Suppose otherwise; then $V_g(\forall xFx) = 1$ and $V_g(Fa) = 0$.
ii) Given the latter, that means that $[a]_g \notin \mathcal{I}(F)$, that is, $\mathcal{I}(a) \notin \mathcal{I}(F)$.
iii) Given the former, for any $u \in \mathcal{D}$, $V_{g_{u/x}}(Fx) = 1$.
iv) $\mathcal{I}(a) \in \mathcal{D}$, and so $V_{g_{\mathcal{I}(a)/x}}(Fx) = 1$.
v) By the truth condition for atomics, $[x]_{g_{\mathcal{I}(a)/x}} \in \mathcal{I}(F)$.
vi) By the definition of the denotation of a variable, $[x]_{g_{\mathcal{I}(a)/x}} = g_{\mathcal{I}(a)/x}(x)$.
vii) But $g_{\mathcal{I}(a)/x}(x) = \mathcal{I}(a)$. Thus, $\mathcal{I}(a) \in \mathcal{I}(F)$. Contradiction.
The claim in step iv) that $\mathcal{I}(a) \in \mathcal{D}$ comes from the definition of an interpretation
function: the interpretation of a constant is always a member of the
domain. Notice that “$\mathcal{I}(a)$” is a term of our metalanguage; that’s why, when
I’m given that “for any $u \in \mathcal{D}$” in step iii), I can set u equal to $\mathcal{I}(a)$ to obtain
step iv).
One further example; let’s establish ∀x∀yRxy→∀xRxx:
i) Suppose for reductio that (for some assignment g in some
model), $V_g(\forall x\forall yRxy\to\forall xRxx) = 0$. Then $V_g(\forall x\forall yRxy) = 1$ and …
ii) … $V_g(\forall xRxx) = 0$.
iii) Given ii), for some $u \in \mathcal{D}$, $V_{g_{u/x}}(Rxx) = 0$, and so $\langle[x]_{g_{u/x}}, [x]_{g_{u/x}}\rangle \notin \mathcal{I}(R)$.
iv) $[x]_{g_{u/x}}$ is $g_{u/x}(x)$, i.e., u. So $\langle u,u\rangle \notin \mathcal{I}(R)$.
v) Given i), for every member of $\mathcal{D}$, and so for u in particular, $V_{g_{u/x}}(\forall yRxy) = 1$.
vi) Given v), for every member of $\mathcal{D}$, and so for u in particular, $V_{g_{u/x\,u/y}}(Rxy) = 1$.
vii) Given vi), $\langle[x]_{g_{u/x\,u/y}}, [y]_{g_{u/x\,u/y}}\rangle \in \mathcal{I}(R)$.
viii) But $[x]_{g_{u/x\,u/y}}$ and $[y]_{g_{u/x\,u/y}}$ are each just u. Hence $\langle u,u\rangle \in \mathcal{I}(R)$,
contradicting iv).
We’ve seen how to establish that particular formulas are valid. How do
we show that a formula is invalid? We simply need to exhibit a single model
in which the formula is false. (The definition of validity specifies that a valid
formula is true in all models; therefore, it only takes one model in which a
formula is false to make that formula invalid.) So let’s take one example; let’s
show that the formula (∃xFx∧∃xGx)→∃x(Fx∧Gx) isn’t valid. To do this, we
must produce a model in which this formula is false.
My model will contain letters in its domain:
$\mathcal{D} = \{u, v\}$
$\mathcal{I}(F) = \{u\}$
$\mathcal{I}(G) = \{v\}$
It is intuitively clear that the formula is false in this model. In this model,
something has F (namely, u), and something has G (namely, v), but nothing
has both.
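The intuitive check can also be done mechanically. The sketch below assumes the two-element model just given, with the letters written as Python strings and the extensions as plain sets; the quantifiers are rendered with `any`, using the truth condition for ∃ derived earlier.

```python
domain = {"u", "v"}
F = {"u"}          # extension of F
G = {"v"}          # extension of G

# Antecedent: ∃xFx ∧ ∃xGx
antecedent = any(x in F for x in domain) and any(x in G for x in domain)
# Consequent: ∃x(Fx ∧ Gx)
consequent = any(x in F and x in G for x in domain)
# The conditional is true iff the antecedent is false or the consequent is true
formula_true = (not antecedent) or consequent

print(antecedent, consequent, formula_true)   # True False False
```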
One further example: let’s show that ∀x∃yRxy ⊭ ∃y∀xRxy. We must show
that the first formula does not semantically imply the second. So we must come
up with a model in which the first formula is true and the second is false. It helps
to think about natural language sentences that these formulas might represent.
If R symbolizes “respects”, then the first formula says that “everyone respects
someone or other”, and the second says that “there is someone whom everyone
respects”. Clearly, the first can be true while the second is false: suppose that
each person respects a different person, so that no one person is respected by
everyone. A simple case of this occurs when there are just two people, each of
whom respects the other, but neither of whom respects him/herself:
[Figure: two dots, each with an arrow pointing to the other]
Here is a model based on this idea:
$\mathcal{D} = \{u, v\}$
$\mathcal{I}(R) = \{\langle u,v\rangle, \langle v,u\rangle\}$
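As a quick sanity check on this countermodel, here is a small sketch under the same assumptions as before: the domain is a Python set of strings, the extension of R is a set of pairs, and nested `all`/`any` calls stand in for the nested quantifiers.

```python
domain = {"u", "v"}
R = {("u", "v"), ("v", "u")}   # each respects the other, neither respects itself

# ∀x∃yRxy: everyone respects someone or other
premise = all(any((x, y) in R for y in domain) for x in domain)
# ∃y∀xRxy: there is someone whom everyone respects
conclusion = any(all((x, y) in R for x in domain) for y in domain)

print(premise, conclusion)     # True False
```

Swapping the order of `all` and `any` is the code-level counterpart of swapping the order of the quantifiers.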
Chapter 5
Extensions of Predicate Logic
The predicate logic we considered in the previous chapter is powerful.
Much natural language discourse can be represented using it, in a way
that reveals logical structure. Nevertheless, it has its limitations. In this chapter
we consider some of its limitations, and corresponding additions to predicate
logic.
5.1 Identity
“Standard” predicate logic is usually taken to include the identity sign (“=”).
“a=b” means that a and b are one and the same thing.
5.1.1 Grammar for the identity sign
We first need to expand our definition of a well-formed formula. We simply
add the following clause:
If α and β are terms, then α=β is a wff
We need to beware of a potential source of confusion. We’re now using the
symbol ‘=’ as the object-language symbol for identity. But I’ve also been using
‘=’ as the metalanguage symbol for identity, for instance when I write things
like “V(φ) = 1”. This shouldn’t generally cause confusion, but if there’s a
danger of misunderstanding, I’ll clarify by writing things like: “…= (i.e., is the
same object as)…”.
5.1.2 Semantics for the identity sign
This is easy. We keep the notion of a PC-model from the last chapter, and
simply add to our definition of truth-in-a-PC-model. All we need to add is a
clause to the definition of a valuation function telling it what truth values to
give to sentences containing the = sign. Here is the clause:
$V_{\mathcal{M},g}(\alpha{=}\beta) = 1$ iff $[\alpha]_{\mathcal{M},g}$ = (i.e., is the same object as) $[\beta]_{\mathcal{M},g}$
That is, the sentence α=β is true iff the terms α and β refer to the same
object.
As an example, let’s show that the formula ∀x∃y x=y is valid. We need
to show that in any model, and any variable assignment g in that model,
$V_g(\forall x\exists y\, x{=}y) = 1$. So:
i) Suppose for reductio that for some g in some model, $V_g(\forall x\exists y\, x{=}y) = 0$.
ii) Given the clause for ∀, for some object in the domain, call it “u”, $V_{g_{u/x}}(\exists y\, x{=}y) = 0$.
iii) Given the clause for ∃, for every v in the domain, $V_{g_{u/x\,v/y}}(x{=}y) = 0$.
iv) Letting v in iii) be u, we have: $V_{g_{u/x\,u/y}}(x{=}y) = 0$.
v) So, given the clause for “=”, $[x]_{g_{u/x\,u/y}}$ is not the same object as $[y]_{g_{u/x\,u/y}}$.
vi) But $[x]_{g_{u/x\,u/y}}$ and $[y]_{g_{u/x\,u/y}}$ are the same object: $[x]_{g_{u/x\,u/y}}$ is $g_{u/x\,u/y}(x)$,
i.e., u; and $[y]_{g_{u/x\,u/y}}$ is $g_{u/x\,u/y}(y)$, i.e., u. Contradiction.
5.1.3 Symbolizations with the identity sign
Why do we ever add anything to our list of logical constants? Why not stick
with the tried and true logical constants of propositional and predicate logic?
We generally add a logical constant when it has a distinctive inferential and
semantic role, and when it has very general application—when, that is, it occurs
in a wide range of linguistic contexts. We studied the distinctive semantic role
of ‘=’ in the previous section. In this section, we’ll look at the range of linguistic
contexts that can be symbolized using ‘=’.
The most obvious sentences that may be symbolized with ‘=’ are those that
explicitly concern identity, such as:
Mark Twain is identical to Samuel Clemens: t=c
Every man fails to be identical to George Sand: ∀x(Mx→∼x=s)
(It will be convenient to abbreviate ∼α=β as α≠β. Thus, the second symbolization
can be rewritten as: ∀x(Mx→x≠s).) But many other sentences involve
the concept of identity in subtler ways.
Consider, for example, “Every lawyer hates every other lawyer”. The ‘other’
signifies nonidentity; we have, therefore:
∀x(Lx→∀y[(Ly∧x≠y)→Hxy])
Consider next “Only Ted can change grades”. This means: “no one other
than Ted can change grades”, and may therefore be symbolized as:
∼∃x(x≠t∧Cx)
(letting ‘Cx’ symbolize “x can change grades”.)
Another interesting class of sentences concerns number. We cannot symbolize
“There are at least two dinosaurs” as: “∃x∃y(Dx∧Dy)”, since this would
be true even if there were only one dinosaur: x and y could be assigned the
same dinosaur. The identity sign to the rescue:
∃x∃y(Dx∧Dy∧x≠y)
This says that there are two different objects, x and y, each of which is a
dinosaur. To say “There are at least three dinosaurs” we say:
∃x∃y∃z(Dx∧Dy∧Dz∧x≠y∧x≠z∧y≠z)
Indeed, for any n, one can construct a sentence $\phi_n$ that symbolizes “there are
at least n Fs”:
$\phi_n$: $\exists x_1 \dots \exists x_n (Fx_1 \wedge \dots \wedge Fx_n \wedge \delta)$
where δ is the conjunction of all sentences “$x_i \neq x_j$” where i and j are integers
between 1 and n (inclusive) and i < j. (The sentence δ says in effect that no
two of the variables $x_1 \dots x_n$ stand for the same object.)
Since we can construct each $\phi_n$, we can symbolize other sentences involving
number as well. To say that there are at most n Fs, we write: $\sim\phi_{n+1}$. To say
that there are between n and m Fs (where m > n), we write: $\phi_n \wedge \sim\phi_{m+1}$. To
say that there are exactly n Fs, we write: $\phi_n \wedge \sim\phi_{n+1}$.
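Because these numerical sentences are built by a uniform recipe, they can be generated mechanically. The following sketch simply produces them as strings; the function names `at_least`, `at_most`, and `exactly`, and the convention of writing the variables as x1, x2, …, are choices made for this illustration only.

```python
from itertools import combinations

def at_least(n, pred="F"):
    """Build the sentence 'there are at least n Fs' as a string."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    prefix = "".join(f"∃{x}" for x in xs)
    conjuncts = [f"{pred}{x}" for x in xs]
    # delta: one non-identity per pair of variables x_i, x_j with i < j
    conjuncts += [f"{a}≠{b}" for a, b in combinations(xs, 2)]
    return f"{prefix}({'∧'.join(conjuncts)})"

def at_most(n, pred="F"):
    """'At most n Fs' is the negation of 'at least n+1 Fs'."""
    return f"∼{at_least(n + 1, pred)}"

def exactly(n, pred="F"):
    """'Exactly n Fs' conjoins 'at least n' with 'at most n'."""
    return f"({at_least(n, pred)}∧{at_most(n, pred)})"

print(at_least(2))        # ∃x1∃x2(Fx1∧Fx2∧x1≠x2)
print(at_least(3, "D"))   # ∃x1∃x2∃x3(Dx1∧Dx2∧Dx3∧x1≠x2∧x1≠x3∧x2≠x3)
print(exactly(1))         # (∃x1(Fx1)∧∼∃x1∃x2(Fx1∧Fx2∧x1≠x2))
```

The number of non-identity conjuncts grows as n(n−1)/2, which is one reason shorter symbolizations are often preferable.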
These methods for constructing sentences involving number will always
work; but one can often construct shorter numerical symbolizations by other
methods. For example, to say “there are exactly two dinosaurs”, instead of
saying “there are at least two dinosaurs, and it’s not the case that there are at
least three dinosaurs”, we could say instead:
∃x∃y(Dx∧Dy∧x≠y∧∀z[Dz→(z=x∨z=y)])
5.2 Function symbols
A singular term, such as ‘Ted’, ‘New York City’, ‘George W. Bush’s father’,
or ‘the sum of 1 and 2’, is a term that purports to refer to a single entity.
Notice that some of these have semantically significant structure. ‘George
W. Bush’s father’, for example, means what it does because of the meaning
of ‘George W. Bush’ and the meaning of ‘father’. But standard predicate
logic’s only (constant) singular terms are its names: a, b, c . . . , which do not
have semantically significant parts. Thus, using predicate logic’s names to
symbolize semantically complex English singular terms leads to an inadequate
representation.
Suppose, for example, that we give the following symbolizations:
3 is the sum of 1 and 2: a = b
George W. Bush’s father was a politician: Pc
By symbolizing ‘the sum of 1 and 2’ as simply ‘b’, the first symbolization
ignores the fact that ‘1’, ‘2’, and ‘sum’ are semantically significant constituents
of ‘the sum of 1 and 2’; and by symbolizing “George W. Bush’s father” as ‘c’,
we ignore the semantically significant occurrences of ‘George W. Bush’ and
‘father’. This is a bad idea. We ought, rather, to produce symbolizations of
these terms that take account of their semantic complexity. The symbolizations
ought to account for the distinctive logical behavior of sentences containing
the complex terms. For example, the sentence
George W. Bush’s father was a politician
logically implies the sentence
Someone’s father was a politician
This ought to be reflected in the symbolizations; the first sentence’s symbolization
ought to semantically imply the second sentence’s symbolization.
One way of doing this is via an extension of predicate logic: we add function
symbols to its primitive vocabulary. Think of “George W. Bush’s father” as
the result of plugging “George W. Bush” into the blank in “___’s father”. “___’s
father” is an English function symbol. Function symbols are like predicates in
some ways. The predicate “___ is happy” has a blank in it, in which you can put
a name. “___’s father” is similar in that you can put a name into its blank. But
there is a difference: when you put a name into the blank of a predicate, you
get a complete sentence, whereas when you put a name into the blank of “___’s
father”, you get a noun phrase, such as “George W. Bush’s father”.
Corresponding to English function symbols, we’ll add logical function
symbols. We’ll symbolize “___’s father” as f(___). We can put names into the
blank here. Thus, we’ll symbolize “George W. Bush’s father” as “f(a)”, where
“a” stands for “George W. Bush”.
We need to add two more complications. First, what goes into the blank
doesn’t have to be a name; it could be something that itself contains a function
symbol. E.g., in English you can say: “George W. Bush’s father’s father”. We’d
symbolize this as: f(f(a)).
Second, just as we have multi-place predicates, we have multi-place function
symbols. “The sum of 1 and 2” contains the function symbol “the sum of ___
and ___”. When you fill in the blanks with the names “1” and “2”, you get the
noun phrase “the sum of 1 and 2”. So, we symbolize this using the two-place
function symbol “s(___, ___)”. If we let “a” symbolize “1” and “b” symbolize “2”,
then “the sum of 1 and 2” becomes: s(a, b).
The result of plugging names into function symbols in English is a noun
phrase. Noun phrases combine with predicates to form complete sentences.
Function symbols function analogously in logic. Once you combine a function
symbol with a name, you can take the whole thing, apply a predicate to it, and
get a complete sentence. Thus, the sentence:
George W. Bush’s father was a politician
becomes:
Pf(a)
And
3 is the sum of 1 and 2
becomes
c = s(a, b)
(here “c” symbolizes “3”). We can put variables into the blanks of function
symbols, too. Thus, we can symbolize
Someone’s father was a politician
as
∃xPf(x)
We need to update our definition of a wff to allow for function symbols. First,
we need to add to our vocabulary. So, the new definition starts like this (the
new bit is clause ii b):
Primitive vocabulary:
i) logical: →, ∼, ∀
ii) nonlogical:
a) for each n > 0, n-place predicates F, G, . . ., with or without subscripts
b) for each n > 0, n-place function symbols f, g, . . ., with or without subscripts
c) variables x, y, . . ., with or without subscripts
d) individual constants (names) a, b, . . ., with or without subscripts
iii) parentheses
The definition of a wff, actually, stays the same; all that needs to change is the
definition of a “term”. Before, terms were just names or variables. Now, we
need to allow for f(a), f(f(a)), etc., to be terms. This is done by the following
recursive definition of a term:
Definition of terms
i) names and variables are terms
ii) if f is an n-place function symbol, and $\alpha_1 \dots \alpha_n$ are terms,
then $f(\alpha_1, \dots, \alpha_n)$ is a term
iii) nothing else is a term
5.2.2 Semantics for function symbols
We now need to update our definition of a PC-model by saying what the
interpretation of a function symbol is. That’s easy: the interpretation of an
n-place function symbol ought to be an n-place function defined on the model’s
domain, i.e., a rule that maps any n members of the model’s domain to a
member of the model’s domain. For example, in a model in which the one-place
function symbol f(___) is to represent “___’s father”, the interpretation of f will
be the function that assigns to any member of the domain that object’s father.
Here’s the new general definition of a model (a “PC+FS-model”, for “predicate
calculus plus function symbols”):
A PC+FS-model is defined as an ordered pair $\langle\mathcal{D},\mathcal{I}\rangle$, such that:
i) $\mathcal{D}$ is a nonempty set
ii) $\mathcal{I}$ is a function such that:
a) if α is a constant then $\mathcal{I}(\alpha) \in \mathcal{D}$
b) if Π is an n-place predicate, then $\mathcal{I}(\Pi)$ = some set of n-tuples of members of $\mathcal{D}$
c) if f is an n-place function symbol, then $\mathcal{I}(f)$ is an n-place (total) function defined on $\mathcal{D}$
(“Total” simply means that the function must yield an output for any n members
of $\mathcal{D}$.)
The definition of a valuation function stays the same; all we need to do is
update the definition of denotation to accommodate our new complex terms.
Since we now can have arbitrarily long terms (not just names or variables), we
need a recursive definition:
Definition of denotation in a model $\mathcal{M}$ ($=\langle\mathcal{D},\mathcal{I}\rangle$):
i) if α is a constant then $[\alpha]_{\mathcal{M},g} = \mathcal{I}(\alpha)$
ii) if α is a variable then $[\alpha]_{\mathcal{M},g} = g(\alpha)$
iii) if α is a complex term $f(\alpha_1, \dots, \alpha_n)$, then $[\alpha]_{\mathcal{M},g} = \mathcal{I}(f)([\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g})$
Note the recursive nature of this definition: the denotation of a complex term
is defined in terms of the denotations of its smaller parts. Let’s think carefully
about what clause iii) says. It says that, in order to calculate the denotation of
the complex term $f(\alpha_1, \dots, \alpha_n)$ (relative to assignment g), we must first figure
out what $\mathcal{I}(f)$ is, that is, what the interpretation function $\mathcal{I}$ assigns to the
function symbol f. This object, the new definition of a model tells us, is an
n-place function on the domain. We then take this function, $\mathcal{I}(f)$, and apply
it to n arguments: namely, the denotations (relative to g) of the terms $\alpha_1 \dots \alpha_n$.
The result is our desired denotation of $f(\alpha_1, \dots, \alpha_n)$.
It may help to think about a simple case. Suppose that f is a one-place
function symbol; suppose our domain consists of the set of natural numbers;
suppose that the name a denotes the number 3 in this model (i.e., $\mathcal{I}(a) = 3$),
and suppose that f denotes the successor function (i.e., $\mathcal{I}(f)$ is the function,
successor, that assigns to any natural number n the number n+1.) In that case,
the definition tells us that:
$[f(a)]_g = \mathcal{I}(f)([a]_g)$
$= \mathcal{I}(f)(\mathcal{I}(a))$
$= \mathrm{successor}(3)$
$= 4$
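The recursion in clause iii) can be mirrored in a few lines of code. This is only a sketch under assumed conventions: a simple term is a Python string, a complex term is a tuple whose first element is the function symbol, and the interpretation is a dictionary that maps function symbols to Python functions. The name `denote` is made up for the illustration.

```python
def denote(term, interp, g):
    """Denotation of a term, computed recursively from the denotations of its parts."""
    if isinstance(term, tuple):                     # complex term f(t1, ..., tn)
        f, *args = term
        return interp[f](*(denote(t, interp, g) for t in args))
    # simple term: constants via the interpretation, variables via g
    return interp[term] if term in interp else g[term]

# The successor example: 'a' denotes 3 and 'f' denotes the successor function.
# The two-place 's' (addition) is an extra, hypothetical entry.
interp = {"a": 3, "f": lambda n: n + 1, "s": lambda m, n: m + n}
g = {"x": 0}

print(denote(("f", "a"), interp, g))            # 4
print(denote(("f", ("f", "x")), interp, g))     # 2
print(denote(("s", "a", ("f", "a")), interp, g))  # 7
```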
Here’s a sample metalanguage proof that makes use of the new definitions.
As mentioned earlier, ‘George W. Bush’s father was a politician’ logically
implies ‘Someone’s father was a politician’. Let’s show that these sentences’
symbolizations stand in the relation of semantic implication. That is, let’s show
that Pf(c) ⊨ ∃xPf(x), i.e., that in any model in which Pf(c) is true, ∃xPf(x)
is true:
i) Suppose that Pf(c) is true in a model $\langle\mathcal{D},\mathcal{I}\rangle$, i.e., $V_g(Pf(c)) = 1$
(where V is the valuation for this model), for each variable
assignment g.
ii) Suppose for reductio that ∃xPf(x) is false in this model; i.e.,
for some variable assignment g, $V_g(\exists xPf(x)) = 0$.
iii) By line i), $V_g(Pf(c)) = 1$, and so $[f(c)]_g \in \mathcal{I}(P)$. $[f(c)]_g$ is
just $\mathcal{I}(f)([c]_g)$, and $[c]_g$ is just $\mathcal{I}(c)$. So $\mathcal{I}(f)(\mathcal{I}(c)) \in \mathcal{I}(P)$.
iv) By ii), for every object $u \in \mathcal{D}$, $V_{g_{u/x}}(Pf(x)) = 0$.
v) $\mathcal{I}(c) \in \mathcal{D}$. So, by line iv), $V_{g_{\mathcal{I}(c)/x}}(Pf(x)) = 0$, and hence $[f(x)]_{g_{\mathcal{I}(c)/x}} \notin \mathcal{I}(P)$.
vi) $[f(x)]_{g_{\mathcal{I}(c)/x}}$ is just $\mathcal{I}(f)([x]_{g_{\mathcal{I}(c)/x}})$, and $[x]_{g_{\mathcal{I}(c)/x}}$ is just $g_{\mathcal{I}(c)/x}(x)$,
i.e., $\mathcal{I}(c)$.
vii) So $\mathcal{I}(f)(\mathcal{I}(c)) \notin \mathcal{I}(P)$, which contradicts line iii).
5.2.3 Symbolizations with function symbols: some examples
Here are some example symbolizations involving function symbols:
Everyone loves his or her father
∀xLx f (x)
No one’s father is also his or her mother
∼∃x f (x)=m(x)
No one is his or her own father
∼∃x x=f (x)
A person’s maternal grandfather hates that person’s paternal grandmother
∀x H f (m(x)) m( f (x))
Every even number is the sum of two prime numbers
∀x(Ex→∃y∃z(Py∧Pz∧x=s (y, z)))
5.3 Definite descriptions
Our logic has gotten more powerful with the addition of function symbols,
but it still isn’t perfect. Function symbols let us “break up” certain complex
singular terms, e.g., “Bush’s father”. But there are others we still can’t break
up, e.g., “The black cat”. Even with function symbols, the only candidate for
a direct symbolization of this phrase into the language of predicate logic is a
simple name, “a” for example. But this symbolization ignores the fact that “the
black cat” contains “black” and “cat” as semantically significant constituents.
It therefore fails to provide a good model of this term’s distinctively logical
behavior. For example, ‘The black cat is happy’ logically implies ‘Some cat
is happy’. But the simple-minded symbolization of the first sentence, Ha,
obviously does not semantically imply the obvious symbolization of the second:
∃x(Cx∧Hx).
One response is to introduce another extension of predicate logic. We
introduce a new symbol, ι, to stand for “the”. The grammatical function of
“the” in English is to turn predicates into noun phrases. “Black cat” is a predicate
of English; “the black cat” is a noun phrase that refers to the thing that satisfies
the predicate “black cat”. Similarly, in logic, given a predicate F, we’ll let ιxFx
be a term that means: the thing that is F.
We’ll want to let ιx attach to complex predicates, not just simple predicates.
To symbolize “the black cat”, i.e., the thing that is both black and a
cat, we want to write: ιx(Bx∧Cx). In fact, we’ll let ιx attach to wffs with
arbitrary complexity. To symbolize “the fireman who saved someone”, we’ll
write: ιx(Fx∧∃ySxy).
5.3.1 Grammar for ι
Just as with function symbols, we need to add a bit to the primitive vocabulary,
and revise the definition of a term. Here’s the new grammar:
Primitive vocabulary:
i) logical: →, ∼, ∀, ι
ii) nonlogical:
a) for each n > 0, n-place predicates F, G, . . ., with or without subscripts
b) variables x, y, . . ., with or without subscripts
c) individual constants (names) a, b, . . ., with or without subscripts
Definition of terms and wffs:
i) names and variables are terms
ii) if φ is a wff and α is a variable then ιαφ is a term
iii) if Π is an n-place predicate and $\alpha_1 \dots \alpha_n$ are terms, then $\Pi\alpha_1 \dots \alpha_n$ is a wff
iv) if φ, ψ are wffs, and α is a variable, then ∼φ, (φ→ψ), and
∀αφ are wffs
v) nothing else is a wff or term
Notice how we needed to combine the recursive definitions of term and wff
into a single recursive definition of wffs and terms together. The reason is that
we need the notion of a wff to define what counts as a term containing the ι
operator (clause ii); but we need the notion of a term to define what counts as
a wff (clause iii). The way we accomplish this is not circular. The reason it isn’t
is that we can always decide, using these rules, whether a given string counts as
a wff or term by looking at whether smaller strings count as wffs or terms. And
the smallest strings are said to be wffs or terms in noncircular ways.
5.3.2 Semantics for ι
We need to update the definition of denotation so that ιxφ will denote the one
and only thing in the domain that is φ. This is a little tricky, though. What if
there is no such thing? Suppose that ‘K’ symbolizes “king of” and ‘a’ symbolizes
“USA”. Then, what should ‘ιxKxa’ denote? It is trying to denote the king of
the USA, but there is no such thing. Further, what if more than one thing
satisfies the predicate? In short, what do we say about “empty descriptions”?
One approach would be to say that every atomic sentence with an empty
description is false. One way to do this is to include in each model an “emptiness
marker”, ¿, which is an object we assign as the denotation for each empty description.
The emptiness marker shouldn’t be thought of as a “real” denotation;
when we assign it as the denotation of a description, this just marks the fact that
the description has no real denotation. We will stipulate that the emptiness
marker is not in the domain; this ensures that it is not in the extension of any
predicate, and hence that atomic sentences containing empty descriptions are
always false. Here’s how the semantics looks (“PC+DD”, for “predicate calculus
plus definite descriptions”):
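A PC+DD-model is defined as an ordered triple $\langle\mathcal{D},\mathcal{I},\text{¿}\rangle$, such
that:
i) $\mathcal{D}$ is a nonempty set
ii) $\text{¿} \notin \mathcal{D}$
iii) $\mathcal{I}$ is a function such that:
a) if α is a constant then $\mathcal{I}(\alpha) \in \mathcal{D}$
b) if Π is an n-place predicate, then $\mathcal{I}(\Pi)$ = some set of
n-tuples of members of $\mathcal{D}$
Definition of denotation and valuation:
The denotation and valuation functions, $[\ ]_{\mathcal{M},g}$ and $V_{\mathcal{M},g}$, for
PC+DD-model $\mathcal{M}$ ($=\langle\mathcal{D},\mathcal{I},\text{¿}\rangle$) and variable assignment g, are
defined as the functions that satisfy the following constraints:
i) $V_{\mathcal{M},g}$ assigns to each wff either 0 or 1
ii) $[\ ]_{\mathcal{M},g}$ assigns to each term either ¿, or a member of $\mathcal{D}$
iii) if α is a constant then $[\alpha]_{\mathcal{M},g}$ is $\mathcal{I}(\alpha)$
iv) if α is a variable then $[\alpha]_{\mathcal{M},g}$ is $g(\alpha)$
v) if α is a complex term ιβφ, where β is a variable and φ is a
wff, then $[\alpha]_{\mathcal{M},g}$ = the unique $u \in \mathcal{D}$ such that $V_{\mathcal{M},g_{u/\beta}}(\phi) = 1$
if there is a unique such u; otherwise $[\alpha]_{\mathcal{M},g}$ = ¿
vi) for any n-place predicate Π and any terms $\alpha_1, \dots, \alpha_n$,
$V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n) = 1$ iff $\langle[\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g}\rangle \in \mathcal{I}(\Pi)$
vii) for any wffs φ, ψ, and any variable α:
a) $V_{\mathcal{M},g}(\sim\phi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 0$
b) $V_{\mathcal{M},g}(\phi\to\psi) = 1$ iff either $V_{\mathcal{M},g}(\phi) = 0$ or $V_{\mathcal{M},g}(\psi) = 1$
c) $V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g_{u/\alpha}}(\phi) = 1$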
(As with the grammar, we need to mix together the definition of denotation
and the definition of the valuation function. The reason is that we need to
define the denotation of definite descriptions using the valuation function (in
clause v), but we need to define the valuation function using the concept of
denotation in clause vi. As before, this is not circular.)
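The way denotation and valuation lean on each other shows up clearly when the definition is run on a small finite model. The sketch below is an illustration only. It assumes wffs and descriptions are encoded as nested tuples, extensions are sets of Python tuples, and a fresh Python object (here called `EMPTY`) plays the role of the emptiness marker; the cat model and all names are hypothetical.

```python
EMPTY = object()   # stands in for the emptiness marker; never a member of any domain

def conj(p, q):
    """p ∧ q, defined from the primitives as ∼(p → ∼q)."""
    return ("not", ("imp", p, ("not", q)))

def denote(term, domain, interp, g):
    if isinstance(term, tuple):            # a description ("iota", var, wff): clause v)
        _, var, body = term
        witnesses = [u for u in domain if value(body, domain, interp, {**g, var: u})]
        return witnesses[0] if len(witnesses) == 1 else EMPTY
    return interp[term] if term in interp else g[term]   # clauses iii) and iv)

def value(phi, domain, interp, g):
    op = phi[0]
    if op == "atom":                       # clause vi): calls denote
        _, pred, terms = phi
        return tuple(denote(t, domain, interp, g) for t in terms) in interp[pred]
    if op == "not":
        return not value(phi[1], domain, interp, g)
    if op == "imp":
        return (not value(phi[1], domain, interp, g)) or value(phi[2], domain, interp, g)
    if op == "all":
        _, var, body = phi
        return all(value(body, domain, interp, {**g, var: u}) for u in domain)
    raise ValueError(f"unknown operator: {op!r}")

# Two cats, only one of them black; the black one is happy.
domain = {"tom", "felix"}
interp = {"C": {("tom",), ("felix",)}, "B": {("tom",)}, "H": {("tom",)}}
g = {"x": "tom"}

Bx, Cx = ("atom", "B", ("x",)), ("atom", "C", ("x",))
the_black_cat = ("iota", "x", conj(Bx, Cx))
the_cat = ("iota", "x", Cx)               # empty description: there are two cats

print(denote(the_black_cat, domain, interp, g))               # tom
print(denote(the_cat, domain, interp, g) is EMPTY)            # True
print(value(("atom", "H", (the_cat,)), domain, interp, g))    # False
```

Since `EMPTY` is not in the domain, no tuple containing it belongs to any extension, so atomic sentences with empty descriptions come out false, as stipulated. The mutual recursion terminates because each call is made on a strictly smaller wff or term.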
An alternate approach would appeal to three-valued logic. We could leave
the denotation of ιxφ undefined if there is no unique object in the domain such that φ.
We could then treat any atomic sentence that contains a denotationless term as
being neither true nor false, i.e., #. We would then need to update the other
clauses to allow for #s, perhaps using the Kleene tables, perhaps some other
truth tables. I won’t explore this further.
5.3.3 Eliminability of function symbols and definite descriptions
In a sense, we don’t really need function symbols or the ι. Let’s return to
the English singular term ‘the black cat’. Introducing the ι gave us a way
to symbolize this singular term in a way that takes into account its semantic
structure (namely: ιx(Bx∧Cx).) But even without the ι, there is a way to
symbolize whole sentences containing ‘the black cat’, using just standard predicate
logic plus identity. We could, for example, symbolize
The black cat is happy
as:
∃x[(Bx∧Cx) ∧∀y[(By∧Cy)→y=x] ∧Hx]
That is, “there is something such that: i) it is a black cat, ii) nothing else is a
black cat, and iii) it is happy”.
This method for symbolizing sentences containing ‘the’ is called “Russell’s
theory of descriptions”, in honor of its inventor Bertrand Russell, the 19th- and
20th-century philosopher and logician (see Russell (1905)). The general idea is to symbolize “the
φ is ψ” as “∃x[φ(x)∧∀y(φ(y)→x=y)∧ψ(x)]”. This method can be iterated so
as to apply to sentences with two or more definite descriptions, such as:
The 8-foot tall man drove the 20-foot long limousine.
which becomes, letting ‘E’ stand for ‘is eight feet tall’ and ‘T’ stand for ‘is
twenty feet long’:
∃x[Ex∧Mx∧∀z([Ez∧Mz]→x=z)∧∃y[Ty∧Ly∧∀z([Tz∧Lz]→y=z)∧Dxy]]
An interesting problem arises with negations of sentences involving definite
descriptions, when we use Russell’s method:
The president is not bald.
Does this mean:
The president is such that he’s nonbald.
which is symbolized as follows:
∃x[Px∧∀y(Py→x=y)∧∼Bx]?
Or does it mean:
It is not the case that the President is bald
which is symbolized thus:
∼∃x[Px∧∀y(Py→x=y)∧Bx]?
According to Russell, the original sentence is simply ambiguous. Symbolizing
it the first way is called “giving the description wide scope (relative to the ∼)”,
since the ∼ is inside the scope of the ∃x. Symbolizing it in the second way is
called “giving the description narrow scope (relative to the ∼)”, because the ∃x
is inside the scope of the ∼.
What is the difference in meaning between these two symbolizations? The
rst says that there really is a unique president, and adds that he is not bald.
So the rst implies that there’s a unique president. The second merely denies
that there is a unique president, who is bald. That doesn’t imply that there’s a
unique president. It would be true if there’s a unique president who is not bald,
but it would also be true in two other cases:
i) there’s no president
ii) there is more than one president
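The difference between the two readings can be checked mechanically on small finite models. The following Python sketch is only an illustration: the helper names (`unique_p_and`, `wide_scope`, `narrow_scope`) and the representation of the extensions of P (‘is president’) and B (‘is bald’) as Python sets are assumptions made for the example, not part of the formal apparatus.

```python
def unique_p_and(domain, P, test):
    """True iff some x is P, nothing other than x is P, and test(x) holds."""
    return any(
        x in P and all(y == x for y in domain if y in P) and test(x)
        for x in domain
    )

def wide_scope(domain, P, B):
    # ∃x[Px ∧ ∀y(Py→x=y) ∧ ∼Bx]
    return unique_p_and(domain, P, lambda x: x not in B)

def narrow_scope(domain, P, B):
    # ∼∃x[Px ∧ ∀y(Py→x=y) ∧ Bx]
    return not unique_p_and(domain, P, lambda x: x in B)

domain = {"a", "b"}
cases = {
    "unique non-bald president": ({"a"}, set()),
    "unique bald president": ({"a"}, {"a"}),
    "no president": (set(), set()),
    "two presidents": ({"a", "b"}, set()),
}
for label, (P, B) in cases.items():
    print(label, wide_scope(domain, P, B), narrow_scope(domain, P, B))
```

On these four toy models the two readings agree when there is a unique president, and come apart exactly in the two cases listed above, where the narrow-scope reading is true and the wide-scope reading is false.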
A similar ambiguity arises with the following sentence:
The round square does not exist.
We might think to symbolize it:
∃x[Rx∧Sx∧∀y([Ry∧Sy]→x=y)∧∼Ex]
letting “E” stand for “exists”. In other words, we might give the description
wide scope. But this is wrong, because it says there is a certain round square that
doesn’t exist, and that’s a contradiction. This way of symbolizing the sentence
corresponds to reading the sentence as saying:
The thing that is a round square is such that it does not exist
But that isn’t the most natural way to read the sentence. The sentence would
usually be interpreted to mean:
It is not true that the round square exists.
— that is, as the negation of “the round square exists”:
∼∃x[Rx∧Sx∧∀y([Ry∧Sy]→x=y)∧Ex]
with the ∼ out in front. Here we’ve given the description narrow scope. Notice
also that saying that x exists at the end is redundant, so we could simplify to:
∼∃x[Rx∧Sx∧∀y([Ry∧Sy]→x=y)]
Again, notice the moral of these last two examples: if a definite description
occurs in a sentence with a ‘not’, the sentence may be ambiguous: does the
‘not’ apply to the entire rest of the sentence, or merely to the predicate?
If we are willing to use Russell’s method for translating definite descriptions,
we can drop ι from our language. We would, in effect, not be treating “the F ”
as a referring phrase. We would instead be paraphrasing sentences that contain
“the F ” into sentences that don’t. “The black cat is happy” got paraphrased
as: “there is something that is a black cat, is such that nothing else is a black
cat, and is happy”. See?—no occurrence of “the black cat” in the paraphrased
sentence.
In fact, once we use Russell’s method, we can get rid of function symbols too.
Given function symbols, we treated “father” as a function symbol, symbolized
it with “f”, and symbolized the sentence “George W. Bush’s father was a
politician” as Pf(b). But instead, we could treat ‘father of’ as a two-place
predicate, F, and regard the whole sentence as meaning: “The father of George
W. Bush was a politician.” Given the ι, this could be symbolized as:
PιxFxb
But given Russell’s method, we can symbolize the whole thing without using
either function symbols or the ι:
∃x(Fxb ∧∀y(Fyb→y=x) ∧Px)
We can get rid of all function symbols this way, if we want. Here’s the method:
• Take any n-place function symbol f
• Introduce a corresponding (n+1)-place predicate R
• In any sentence containing the term “f(α₁…αₙ)”, replace each occurrence
of this term with “the x such that R(x, α₁…αₙ)”.
• Finally, symbolize the resulting sentence using Russell’s theory of
descriptions
For example, let’s go back to:
Every even number is the sum of two prime numbers
Instead of introducing a function symbol s(x, y) for “the sum of x and y”, let’s
introduce a predicate letter R(z, x, y) for “z is a sum of x and y”. We then use
Russell’s method to symbolize the whole sentence thus:
∀x(Ex→∃y∃z[Py∧Pz∧∃w(Rwyz∧∀w₁(Rw₁yz→w₁=w)∧x=w)])
The end of the formula (beginning with ∃w) says “the sum of y and z is
identical to x”, that is, that there exists some w such that w is a sum of y
and z, and there is no sum of y and z other than w, and w = x.
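To see that trading a function symbol for a predicate preserves truth conditions, one can compare the two symbolizations on a small finite model. The sketch below is an illustration under stated assumptions: the three-element domain, the function `f`, and the extension `P` are made up for the example, and `F` is simply the graph of `f` recast as a two-place relation.

```python
domain = {0, 1, 2}
f = {0: 1, 1: 2, 2: 2}            # hypothetical total one-place function on the domain
P = {2}                           # hypothetical extension of a one-place predicate P
F = {(f[b], b) for b in domain}   # F(x, b) holds iff x = f(b)

def with_function(b):
    # Pf(b): apply the function, then test the predicate
    return f[b] in P

def russellian(b):
    # ∃x(Fxb ∧ ∀y(Fyb→y=x) ∧ Px): no function symbol, no ι
    return any(
        (x, b) in F
        and all(y == x for y in domain if (y, b) in F)
        and x in P
        for x in domain
    )

for b in sorted(domain):
    print(b, with_function(b), russellian(b))
```

Because `F` relates exactly one x to each b, the uniqueness clause is always satisfied, and the two columns agree for every b in this model.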
5.4 Further quantifiers
Standard logic contains just the quantifiers ∀ and ∃. As we have seen, using just
these quantifiers, plus the rest of standard predicate logic, one can represent
the truth conditions of a great many sentences of natural language. But not all.
For instance, there is no way to symbolize the following sentences in predicate
logic:
Most things are massive
Most men are brutes
There are infinitely many numbers
Some critics admire only one another
Like those sentences that are representable in standard logic, these sentences
involve quantificational notions: most things, some critics, and so on. In this
section we examine broader notions of quantifiers that allow us to symbolize
these sentences.
5.4.1 Generalized monadic quantifiers
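We will generalize the idea behind the standard quantifiers ∃ and ∀ in two
ways. To approach the first, think about the clauses in the definition of truth in
a PC-model, $\mathcal{M}$, with domain $\mathcal{D}$, for ∀ and ∃:
$V_{\mathcal{M},g}(\forall\alpha\varphi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g_{u/\alpha}}(\varphi) = 1$
$V_{\mathcal{M},g}(\exists\alpha\varphi) = 1$ iff for some $u \in \mathcal{D}$, $V_{\mathcal{M},g_{u/\alpha}}(\varphi) = 1$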
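Let’s introduce the following bit of terminology. For any PC-model, $\mathcal{M}$
($= \langle\mathcal{D}, \mathcal{I}\rangle$), and wff, φ, let’s introduce a name for (roughly speaking) the set of
members of $\mathcal{M}$’s domain of which φ is true:
$\varphi^{\mathcal{M},g,\alpha} =_{df} \{u : u \in \mathcal{D} \text{ and } V_{\mathcal{M},g_{u/\alpha}}(\varphi) = 1\}$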
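Thus, if we begin with any variable assignment g, then $\varphi^{\mathcal{M},g,\alpha}$ is the set of
things u in $\mathcal{D}$ such that φ is true, relative to variable assignment $g_{u/\alpha}$. Given
this terminology, we can rewrite the clauses for ∀ and ∃ as follows:
$V_{\mathcal{M},g}(\forall\alpha\varphi) = 1$ iff $\varphi^{\mathcal{M},g,\alpha} = \mathcal{D}$
$V_{\mathcal{M},g}(\exists\alpha\varphi) = 1$ iff $\varphi^{\mathcal{M},g,\alpha} \neq \emptyset$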
But if we can rewrite the semantic clauses for the familiar quantifiers ∀ and
∃ in this way, as conditions on $\varphi^{\mathcal{M},g,\alpha}$, then why not introduce new symbols
of the same grammatical type as ∀ and ∃, whose semantics is parallel to ∀ and
∃ except in laying down different conditions on $\varphi^{\mathcal{M},g,\alpha}$? These would be new
kinds of quantifiers. For instance, for any integer n, we could introduce a
quantifier $\exists_n$, to be read as “there exists at least n”. That is, $\exists_n\varphi$ means: “there
are at least n φs.” The definitions of a wff, and of truth in a model, would be
updated with the following clauses:
Grammar: if α is a variable and φ is a wff, then $\exists_n\alpha\varphi$ is a wff
Semantics: $V_{\mathcal{M},g}(\exists_n\alpha\varphi) = 1$ iff $|\varphi^{\mathcal{M},g,\alpha}| \geq n$
The expression $|A|$ stands for the “cardinality” of set A, i.e., the number of
members of A. Thus, this definition says that $\exists_n\alpha\varphi$ is true iff the cardinality of
$\varphi^{\mathcal{M},g,\alpha}$ is greater than or equal to n, i.e., this set has at least n members.
Now, the introduction of the symbols $\exists_n$ does not increase the expressive
power of predicate logic, for as we saw in section 5.1.3, we can symbolize
“there are at least n Fs” using just standard predicate logic (plus “=”). The
new notation is merely a space-saver. But other such additions are not mere
space-savers. For example, by analogy with the symbols $\exists_n$, we can introduce a
symbol $\exists_\infty$, meaning “there are infinitely many”:
Grammar: if α is a variable and φ is a wff, then “$\exists_\infty\alpha\varphi$” is a wff
Semantics: $V_{\mathcal{M},g}(\exists_\infty\alpha\varphi) = 1$ iff $|\varphi^{\mathcal{M},g,\alpha}|$ is infinite
As it turns out (though I won’t prove it here), the addition of $\exists_\infty$ genuinely
enhances predicate logic: no sentence of standard (first-order) predicate logic
has the same truth condition as does $\exists_\infty xFx$.
Another generalized quantifier that is not symbolizable using standard
predicate logic is most:
Grammar: If α is a variable and φ is a wff, then “most αφ” is a wff
Semantics: $V_{\mathcal{M},g}(\text{most }\alpha\varphi) = 1$ iff $|\varphi^{\mathcal{M},g,\alpha}| > |\mathcal{D} - \varphi^{\mathcal{M},g,\alpha}|$
The minus-sign in the second clause is the symbol for set-theoretic difference:
A−B is the set of things that are in A but not in B. Thus, the definition says
that most αφ is true iff more things in the domain $\mathcal{D}$ are φ than are not φ.
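Over a finite domain, each of these clauses is a condition on a single set, which makes them easy to mimic in code. The following Python sketch is illustrative only: it stands in for the set of things of which φ is true with a function `extension`, represents φ as a Python predicate, and all function names are assumptions of the example. The quantifier meaning “there are infinitely many” is omitted, since no finite domain can satisfy it.

```python
def extension(domain, phi):
    """The set of members of the domain of which phi is true."""
    return {u for u in domain if phi(u)}

def forall(domain, phi):
    # true iff the extension is the whole domain
    return extension(domain, phi) == domain

def exists(domain, phi):
    # true iff the extension is non-empty
    return extension(domain, phi) != set()

def exists_at_least(n, domain, phi):
    # true iff the extension has cardinality >= n
    return len(extension(domain, phi)) >= n

def most(domain, phi):
    # true iff more things in the domain are phi than are not phi
    ext = extension(domain, phi)
    return len(ext) > len(domain - ext)

domain = set(range(10))
is_even = lambda u: u % 2 == 0
print(exists_at_least(5, domain, is_even), most(domain, is_even))
```

In this toy domain exactly half of the members are even, so “at least five” holds while “most” does not.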
One could add all sorts of additional “quantifiers” Q in this way. Each
would be, grammatically, just like ∀ and ∃, in that each would combine with
a variable, α, and then attach to a sentence φ, to form a new sentence Qαφ.
Each of these new quantifiers, Q, would be associated with a relation between
sets, $R_Q$, such that Qαφ would be true in a PC-model, $\mathcal{M}$, with domain $\mathcal{D}$,
relative to variable assignment g, iff $\varphi^{\mathcal{M},g,\alpha}$ bears $R_Q$ to $\mathcal{D}$.
If such an added symbol Q is to count as a quantifier in any intuitive sense,
then the relation $R_Q$ can’t be just any relation between sets. It should be a
relation concerning the relative “quantities” of its relata. It shouldn’t, for
instance, “concern particular objects” in the way that the following symbol,
$\exists_{\text{Ted-loved}}$, concerns particular objects:
$V_{\mathcal{M},g}(\exists_{\text{Ted-loved}}\alpha\varphi) = 1$ iff $\varphi^{\mathcal{M},g,\alpha} \cap \{u : u \in \mathcal{D} \text{ and Ted loves } u\} \neq \emptyset$
So we should require the following of $R_Q$. Consider any set, D, and any
one-one function, f, from D onto another set $D'$. Then, if a subset X of D bears
$R_Q$ to D, the set f[X] must bear $R_Q$ to $D'$. (f[X] is the image of X under
function f, i.e., $\{u : u \in D' \text{ and } u = f(v), \text{ for some } v \in X\}$. It is the subset of
$D'$ onto which f “projects” X.)
5.4.2 Generalized binary quantifiers
We have seen how the standard quantifiers ∀ and ∃ can be generalized in
one way: syntactically similar symbols may be introduced and associated with
different relations between sets. Our second way of generalizing the standard
quantifiers is to allow two-place, or binary, quantifiers. ∀ and ∃ are monadic
in that ∀α and ∃α attach to a single open sentence φ. Compare the natural
language monadic quantifiers ‘everything’ and ‘something’:
Everything is material
Something is spiritual
Here, the predicates (verb phrases) ‘is material’ and ‘is spiritual’ correspond
to the open sentences of logic; it is to these that ‘everything’ and ‘something’
attach.
But in fact, monadic quantifiers in natural language are atypical. ‘Every’
and ‘some’ typically occur as follows:
Every student is happy
Some fish are tasty
The quantifiers ‘every’ and ‘some’ attach to two predicates. In the first, ‘every’
attaches to ‘[is a] student’ and ‘is happy’; in the second, ‘some’ attaches to ‘[is
a] fish’ and ‘[is] tasty’. In these sentences, we may think of ‘every’ and ‘some’ as
binary quantifiers. (Indeed, one might think of ‘everything’ and ‘something’ as
the result of applying the binary quantifiers ‘every’ and ‘some’ to the predicate
‘is a thing’.) A logical notation can be introduced which exhibits a parallel
structure, in which ∀ and ∃ attach to two open sentences. In this notation, the
form of quantified sentences is:
(∀α:φ)ψ
(∃α:φ)ψ
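The first is to be read: “all φs are ψ”; the second is to be read “there is a φ that
is a ψ”. The clauses for these new binary quantifiers in the definition of the
valuation function for a PC-model are:
$V_{\mathcal{M},g}((\forall\alpha{:}\varphi)\psi) = 1$ iff $\varphi^{\mathcal{M},g,\alpha} \subseteq \psi^{\mathcal{M},g,\alpha}$
$V_{\mathcal{M},g}((\exists\alpha{:}\varphi)\psi) = 1$ iff $\varphi^{\mathcal{M},g,\alpha} \cap \psi^{\mathcal{M},g,\alpha} \neq \emptyset$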
A further important binary quantifier is the:
Grammar: if φ and ψ are wffs and α is a variable, then (the α:φ)ψ
is a wff
Semantics: $V_{\mathcal{M},g}((\text{the }\alpha{:}\varphi)\psi) = 1$ iff $|\varphi^{\mathcal{M},g,\alpha}| = 1$ and $\varphi^{\mathcal{M},g,\alpha} \subseteq \psi^{\mathcal{M},g,\alpha}$
That is, (the α:φ)ψ is true iff i) there is exactly one φ, and ii) every φ is a
ψ. This truth condition, notice, is exactly the truth condition for Russell’s
symbolization of “the φ is a ψ”; hence the name the.
As with the introduction of the monadic quantifiers $\exists_n$, the introduction of
the binary existential and universal quantifiers, and of the, does not increase the
expressive power of first-order logic, for the same effect can be achieved with
monadic quantifiers. (∀α:φ)ψ, (∃α:φ)ψ, and (the α:φ)ψ become, respectively:
∀α(φ→ψ)
∃α(φ∧ψ)
$\exists\alpha(\varphi \land \forall\beta(\varphi_\beta \to \beta = \alpha) \land \psi)$
(where $\varphi_\beta$ is φ with free αs changed to βs.) But, as with the monadic quantifiers
$\exists_\infty$ and most, there are binary quantifiers one can introduce that genuinely
increase expressive power. For example, most occurrences of ‘most’ in English
are binary, e.g.:
Most fish swim
To symbolize such sentences, we can introduce a binary quantifier $\text{most}_2$. The
sentence $(\text{most}_2\,\alpha{:}\varphi)\psi$ is to be read “most φs are ψs”. The semantic clause for
$\text{most}_2$ is:
$V_{\mathcal{M},g}((\text{most}_2\,\alpha{:}\varphi)\psi) = 1$ iff $|\varphi^{\mathcal{M},g,\alpha} \cap \psi^{\mathcal{M},g,\alpha}| > |\varphi^{\mathcal{M},g,\alpha} - \psi^{\mathcal{M},g,\alpha}|$
The binary $\text{most}_2$ increases our expressive power, even relative to the monadic
most: not every sentence expressible with the former is equivalent to a sentence
expressible with the latter.²
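The semantic clauses for the binary quantifiers are all conditions on a pair of sets, so on finite models they reduce to a few set operations. The sketch below is a hedged illustration: it takes the two extensions directly as Python sets, and the function names and the toy extensions `fish` and `swimmers` are assumptions made for the example.

```python
def every(phi_ext, psi_ext):
    # (∀α:φ)ψ: the φs are a subset of the ψs
    return phi_ext <= psi_ext

def some(phi_ext, psi_ext):
    # (∃α:φ)ψ: the φs and the ψs overlap
    return bool(phi_ext & psi_ext)

def the(phi_ext, psi_ext):
    # (the α:φ)ψ: exactly one φ, and every φ is a ψ
    return len(phi_ext) == 1 and phi_ext <= psi_ext

def most2(phi_ext, psi_ext):
    # binary most: more φs are ψs than are not
    return len(phi_ext & psi_ext) > len(phi_ext - psi_ext)

fish = {"trout", "shark", "eel"}
swimmers = {"trout", "shark", "duck"}
print(every(fish, swimmers), some(fish, swimmers), most2(fish, swimmers))
```

Note that `most2` never consults the domain as a whole: only the φs matter, which is what separates it from the monadic most.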
5.4.3 Second-order logic
All the predicate logic we have considered so far is known as first-order. We’ll
now briefly look at second-order predicate logic, a powerful extension to first-order
predicate logic. The distinction has to do with how variables behave, and
has syntactic and semantic aspects.
The syntactic part of the idea concerns the grammar of variables. All
the variables in first-order logic are grammatical terms. That is, they behave
grammatically like names: you can combine them with a predicate to get a
wff; you cannot combine them solely with other terms to get a wff; etc. In
second-order logic, on the other hand, variables can occupy predicate position.
Thus, each of the following sentences is a well-formed formula in second-order
logic:
∃XXa
∃X∃yXy
Here we see the variable X occupying predicate position. Predicate variables,
like the normal predicates of standard first-order logic, can be one-place,
two-place, three-place, etc.
The semantic part of the idea concerns the interpretation of variables. In
first-order logic, a variable assignment assigns to each variable a member of the
domain. A variable assignment in second-order logic assigns to each standard
(first-order) variable α a member of the domain, as before, but assigns to each
n-place predicate variable a set of n-tuples drawn from the domain. (This is
what one would expect: the semantic value of an n-place predicate is its extension,
a set of n-tuples, and variable assignments assign temporary semantic values.)
Then, the following clauses must be added to the definition of truth in a
PC-model:
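If π is an n-place predicate variable and $\alpha_1 \ldots \alpha_n$ are terms, then
$V_{\mathcal{M},g}(\pi\alpha_1 \ldots \alpha_n) = 1$ iff $\langle [\alpha_1]_{\mathcal{M},g}, \ldots, [\alpha_n]_{\mathcal{M},g} \rangle \in g(\pi)$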
2
Westerståhl (1989, p. 29).
If π is a predicate variable and φ is a wff, then $V_{\mathcal{M},g}(\forall\pi\varphi) = 1$ iff
for every set U of n-tuples from $\mathcal{D}$, $V_{\mathcal{M},g_{U/\pi}}(\varphi) = 1$
(where $g_{U/\pi}$ is the variable assignment just like g except in assigning U to π.)
Notice that, as with the generalized monadic quantifiers, no alteration to the
definition of a PC-model is needed. All we need to do is change the grammar and
the definition of the valuation function.
Second-order logic is different from first-order logic in many ways. For
instance, one can define the identity predicate in second-order logic:
$x = y =_{df} \forall X(Xx \leftrightarrow Xy)$
This can be seen to work correctly as follows. A one-place second-order variable
X gets assigned a set of things. Thus, the atomic sentence Xx says that the
object (currently assigned to) x is a member of the set (currently assigned to)
X. Thus, ∀X(Xx↔Xy) says that x and y are members of exactly the same
sets. But since x and only x is a member of {x} (i.e., x’s unit set), that means
that the only way for this to be true is for y to be identical to x.
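On a finite domain the second-order quantifier ∀X ranges over finitely many sets, so this definition of identity can be tested by brute force. The sketch below is illustrative only; `subsets` and `so_identical` are names invented for the example, and the three-element domain is arbitrary.

```python
from itertools import chain, combinations

def subsets(domain):
    """All subsets of a finite domain: the possible values of a one-place X."""
    items = list(domain)
    return [
        set(combo)
        for combo in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]

def so_identical(domain, x, y):
    # ∀X(Xx↔Xy): x and y are members of exactly the same subsets
    return all((x in X) == (y in X) for X in subsets(domain))

domain = {1, 2, 3}
print(all(so_identical(domain, x, y) == (x == y) for x in domain for y in domain))
```

The check succeeds because the unit set of x is among the values of X, and it separates x from everything else.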
More importantly, the metalogical properties of second-order logic are dramatically
different from those of first-order logic. For instance, the axiomatic
method cannot be fully applied to second-order logic. One cannot write down a
set of axioms for second-order logic that are both sound and complete, unless,
that is, one resorts to cheap tricks like saying “let every valid wff be an axiom”.
This trick is “cheap” because one would have no way of telling what an axiom
is. (More precisely, the resulting set of axioms would fail to be “recursive”.³)
Second-order logic also allows us to express claims that cannot be expressed
in first-order logic. Consider the “Geach-Kaplan sentence”:⁴
(GK) Some critics admire only one another
It can be shown that there is no way to symbolize (one reading of) the sentence
using just first-order logic and predicates for ‘is a critic’ and ‘admires’. The
sentence (on the desired reading) says that there is a group of critics such that
the members of that group admire only other members of the group, but one
cannot say this in first-order logic. However, the sentence can be symbolized
in second-order logic:
³ For a rigorous statement and proof of this and other metalogical results about second-order
logic, see Boolos and Jeffrey (1989, chapter 18).
⁴ The sentence and its significance were discovered by Peter Geach and David Kaplan. See
Boolos (1984).
(GK₂) ∃X[∃xXx ∧∀x(Xx→Cx) ∧∀x∀y([Xx∧Axy]→[Xy∧x≠y])]
(GK₂) “symbolizes” (GK) in the sense that it contains no predicates other than
C and A, and for every model $\langle\mathcal{D}, \mathcal{I}\rangle$, the following is true:
(*) (GK₂) is true in $\langle\mathcal{D}, \mathcal{I}\rangle$ iff $\mathcal{D}$ has a non-empty subset, X, such
that i) $X \subseteq \mathcal{I}(C)$, and ii) whenever $\langle u, v\rangle \in \mathcal{I}(A)$ and u ∈ X,
then v ∈ X as well and v is not u.
No first-order sentence symbolizes the Geach-Kaplan sentence in this sense.
However, one can in a sense symbolize the Geach-Kaplan sentence using a first-order
sentence, provided the sentence employs, in addition to the predicates C
and A, a predicate ∈ for set-membership:
(GK₁) ∃z[∃x x∈z ∧∀x(x∈z→Cx) ∧∀x∀y([x∈z∧Axy]→[y∈z∧x≠y])]
(GK₁) doesn’t “symbolize” (GK) in the sense of satisfying (*) in every model,
for in some models the two-place predicate ∈ doesn’t mean set-membership.
Nevertheless, if we just restrict our attention to models $\langle\mathcal{D}, \mathcal{I}\rangle$ in which ∈ does
mean set-membership (restricted to the model’s domain, of course; that is,
$\mathcal{I}(\in) = \{\langle u, v\rangle : u, v \in \mathcal{D} \text{ and } u \in v\}$), and in which $\mathcal{D}$ contains each subset of
$\mathcal{I}(C)$ as a member, then (GK₁) will indeed satisfy (*). In essence, the difference
between (GK₁) and (GK₂) is that it is hard-wired into the definition of truth in
a model that second-order predications Xy express set-membership, whereas
this is not hard-wired into the definition of the first-order predication y ∈ z.⁵
⁵ For more on second-order logic, see Boolos (1975, 1984, 1985).
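The truth condition (*) for the second-order symbolization of the Geach-Kaplan sentence can be checked by brute force on a finite model, by searching through the non-empty subsets of the domain. The following Python sketch is an illustration under stated assumptions: `gk_true`, the toy domain, and the extensions passed for C (‘is a critic’) and A (‘admires’) are all invented for the example.

```python
from itertools import combinations

def gk_true(domain, C, A):
    """Condition (*): some non-empty X with X ⊆ C such that whenever
    (u, v) is in A and u is in X, v is in X as well and v is not u."""
    items = list(domain)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            X = set(combo)
            if X <= C and all(v in X and v != u for (u, v) in A if u in X):
                return True
    return False

domain = {"a", "b", "c"}
critics = {"a", "b"}
admires = {("a", "b"), ("b", "a"), ("c", "c")}
print(gk_true(domain, critics, admires))
```

Here the group {a, b} witnesses the sentence. Note that, as the condition is stated, a critic who admires no one at all also counts as a witness, since clause ii) is then vacuously satisfied.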
Chapter 6
Propositional Modal Logic
Modal logic is the logic of necessity and possibility. In it we treat words
like “necessary”, “could be”, “must be”, etc. as logical constants. Here
are our new symbols:
□φ: “It is necessary that φ”, “Necessarily, φ”, “It must be that φ”
◇φ: “It is possible that φ”, “Possibly, φ”, “It could be that φ”, “It
can be that φ”, “It might be that φ”
The phrase “φ is possible” is sometimes used in the following sense: “φ
could be true, but then again, φ could be false”. For example, if one says “it
might rain tomorrow”, one might intend to say not only that there is a possibility
of rain, but also that there is a possibility that there will be no rain. This is not
the sense of ‘possible’ that we symbolize with the ◇. In our intended sense,
“possibly φ” does not imply “possibly not-φ”. To get into the spirit of this sense,
note the naturalness of saying the following: “well of course 2+2 can equal 4,
since it does equal 4”. Here, ‘can’ is used in our intended sense: it is presumably
not possible for 2+2 to fail to be 4, and so in this case, ‘it can be the case that
2+2 equals 4’ does not imply ‘it can be the case that 2+2 does not equal 4’.
It is helpful to think of the □ and the ◇ in terms of possible worlds. A
possible world is a complete and possible scenario. Calling a scenario “possible”
means simply that it’s possible that the scenario happen, i.e., be actual. This
requirement disqualifies scenarios in which, for example, it is both raining and
also not raining (at the same time and place); such a thing couldn’t happen, and
so doesn’t happen in any possible world. But within this limit, we can imagine
all sorts of possible worlds: possible worlds with talking donkeys, possible
worlds in which I am ten feet tall, and so on. “Complete” means simply that no
detail is left out: possible worlds are completely specific scenarios. There is no
possible world in which I am “somewhere between 10 and 11 feet tall” without
being some particular height.¹ Likewise, in any possible world in which I am
exactly 10 feet, six inches tall (say), I must have some particular weight, must
live in some particular place, and so on. One of these possible worlds is the
actual world: this is the complete and possible scenario that in fact obtains.
The rest of them are merely possible: they do not obtain, but would have
obtained if things had gone differently.
In terms of possible worlds, we can think of our modal operators thus:
“□φ” is true iff φ is true in all possible worlds
“◇φ” is true iff φ is true in at least one possible world
It is necessarily true that all bachelors are male; in every possible world, every
bachelor is male. There might have existed a talking donkey; some possible
world contains a talking donkey. Possible worlds provide, at the very least,
a vivid way to think about necessity and possibility. How much more than a
vivid guide they provide is an open philosophical question. Some maintain that
possible worlds are the key to the metaphysics of modality, that what it is for a
proposition to be necessarily true is for it to be true in all possible worlds.²
Whether this view is defensible is a question beyond the scope of this book;
what is important for present purposes is that we distinguish possible worlds as
a vivid heuristic from possible worlds as a concern in serious metaphysics.
Our first topic in modal logic is the addition of the □ and the ◇ to propositional
logic; the result is modal propositional logic (“MPL”). A further step will be
modal predicate logic (chapter 9).
6.1 Grammar of MPL
We need a new language: the language of propositional modal logic. The
grammar of this language is just like the grammar of propositional logic, except
that we add the □ as a new one-place sentence connective:
Primitive vocabulary:
Sentence letters: P, Q, R, …, with or without numerical subscripts
Connectives: →, ∼, □
Parentheses: (, )
Definition of wff:
i) Sentence letters are wffs
ii) If φ, ψ are wffs then φ→ψ, ∼φ, and □φ are wffs
iii) nothing else is a wff
¹ This is not to say that possible worlds exclude vagueness.
² Sider (2003) presents an overview of this topic.
The □ is the only new primitive connective. But just as we were able to
define ∧, ∨, and ↔, we can define new nonprimitive modal connectives:
◇φ =df ∼□∼φ “Possibly φ”
φ⇒ψ =df □(φ→ψ) “φ strictly implies ψ”
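The grammar above, with its three primitive connectives and two defined ones, can be mirrored as a small data type. The Python sketch below is only an illustration: the class names (`Letter`, `Not`, `If`, `Box`) and the helpers `diamond`, `strict`, and `show` are assumptions of the example. The defined connectives are functions that build primitive wffs rather than new constructors.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Letter:
    name: str

@dataclass(frozen=True)
class Not:
    sub: object

@dataclass(frozen=True)
class If:
    left: object
    right: object

@dataclass(frozen=True)
class Box:
    sub: object

def diamond(phi):
    # ◇φ abbreviates ∼□∼φ
    return Not(Box(Not(phi)))

def strict(phi, psi):
    # φ⇒ψ abbreviates □(φ→ψ)
    return Box(If(phi, psi))

def show(wff):
    """Print a wff using primitive vocabulary only."""
    if isinstance(wff, Letter):
        return wff.name
    if isinstance(wff, Not):
        return "∼" + show(wff.sub)
    if isinstance(wff, Box):
        return "□" + show(wff.sub)
    return "(" + show(wff.left) + "→" + show(wff.right) + ")"

print(show(diamond(Letter("P"))), show(strict(Letter("P"), Letter("Q"))))
```

Treating ◇ and ⇒ as abbreviations keeps the set of wffs exactly as defined in clauses i) to iii).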
6.2 Symbolizations in MPL
Modal logic allows us to symbolize a number of sentences we couldn’t symbolize
before. The most obvious cases are sentences that overtly involve “necessarily”,
“possibly”, or equivalent expressions:
Necessarily, if snow is white, then snow is white or grass is green
□[S→(S∨G)]
I’ll go if I must
□G→G
It is possible that Bush will lose the election
◇L
Snow might have been either green or blue
◇(G∨B)
If snow could have been green, then grass could have been white
◇G→◇W
‘Impossible’ and related expressions signify the lack of possibility:
It is impossible for snow to be both white and not white
∼◇(W∧∼W)
If grass cannot be clever then snow cannot be furry
∼◇C→∼◇F
God’s being merciful is inconsistent with imperfection’s being
incompatible with your going to heaven.
∼◇(M∧∼◇(I∧H))
(M=“God is merciful”, I=“You are imperfect”, H=“You go to
heaven”)
As for the strict conditional, it arguably does a decent job of representing
certain English conditional constructions:
Snow is a necessary condition for skiing
∼W⇒∼K
Food and water are required for survival
∼(F ∧W)⇒∼S
Thunder implies lightning
T⇒L
Once we add modal operators, we can expose an important ambiguity in
certain English sentences. The surface grammar of a sentence like “if Ted is a
bachelor, then he must be unmarried” is misleading: it suggests the symbolization:
B→□U
But since I am in fact a bachelor, it would follow from this symbolization that
the proposition that I am unmarried is necessarily true. But clearly I am not
necessarily unmarried: I could have been married! The sentence is not saying
that if I am in fact a bachelor, then the following is a necessary truth: I am
unmarried. It is rather saying that, necessarily, if I am a bachelor then I am
unmarried:
□(B→U)
It is the relationship between my being a bachelor and my being unmarried that
is necessary. Think of this in terms of possible worlds: the first symbolization
says that if I am a bachelor in the actual world, then I am unmarried in every
possible world (which is absurd); whereas the second one says that in each
possible world, w, if I am a bachelor in w, then I am unmarried in w (which
is quite sensible). The distinction between φ→□ψ and □(φ→ψ) is called
the distinction between the “necessity of the consequent” (first sentence) and
the “necessity of the consequence” (second sentence). It is important to keep
the distinction in mind, because English surface structure is
misleading.
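The possible-worlds gloss on the two symbolizations can be made concrete with a toy collection of worlds. The sketch below is illustrative and rests on simplifying assumptions: there are just two worlds, each represented as a Python dict assigning truth values to B and U, and “necessarily” is read as truth in every world in the list.

```python
worlds = [
    {"B": True, "U": True},     # the actual world: Ted is a bachelor, hence unmarried
    {"B": False, "U": False},   # a world in which Ted is married
]
actual = worlds[0]

def box(prop):
    """True iff prop holds at every world in the toy model."""
    return all(prop(w) for w in worlds)

# B→□U, evaluated at the actual world (necessity of the consequent)
necessity_of_consequent = (not actual["B"]) or box(lambda w: w["U"])

# □(B→U) (necessity of the consequence)
necessity_of_consequence = box(lambda w: (not w["B"]) or w["U"])

print(necessity_of_consequent, necessity_of_consequence)
```

In this model the first symbolization comes out false, because there is a world in which Ted is married, while the second comes out true, because no world contains a married bachelor.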
English modal words are ambiguous in a systematic way. For example,
suppose I say that I can’t attend a certain conference in Cleveland. What is the
force of “can’t” here? Probably I’m saying that my attending the conference
is inconsistent with honoring other commitments I’ve made at that time. But
notice that another sentence I might utter is: “I could attend the conference;
but I would have to cancel my class, and I don’t want to do that.” Now I’ve
said that I can attend the conference; have I contradicted my earlier assertion
that I cannot attend the conference? No—what I mean now is perhaps that I
have the means to get to Cleveland on that date. I have shifted what I mean by
“can”.
In fact, there are a lot of things one could mean by a modal word like ‘can’.
Examples:
I can come to the party, but I can’t stay late. (“can” = “is not
inconvenient”)
Humans can travel to the moon, but not Mars. (“can” = “is achievable
with current technology”)
Objects can move almost as fast as the speed of light, but nothing can travel
faster than light. (“can” = “is consistent with the laws of nature”)
Objects could have traveled faster than the speed of light (if the laws
of nature had been different), but no matter what the laws had been,
nothing could have traveled faster than itself. (“can” = “metaphysical
possibility”)
You can borrow but you can’t steal. (“can” = “morally acceptable”)
So when representing English sentences using the □ and the ◇, one should
keep in mind that these expressions can be used to express different strengths
of necessity and possibility. (Though we won’t do this, one could introduce
different symbols for different sorts of possibility and necessity.)
The different strengths of possibility and necessity can be made vivid by
thinking, again, in terms of possible worlds. As we saw, we can think of the □
and the ◇ as quantifiers over possible worlds (the former a universal quantifier,
the latter an existential quantifier). The very broad sort of possibility and
necessity, metaphysical possibility and necessity, can be thought of as a completely
unrestricted quantifier: a statement is necessarily true iff it is true in all possible
worlds whatsoever. The other kinds of possibility and necessity can be thought
of as resulting from various restrictions on the quantifiers over possible worlds.
Thus, when ‘can’ signifies achievability given current technology, it means:
true in some possible world in which technology has not progressed beyond where it has
progressed in fact at the current time; when ‘can’ means moral acceptability, it
means: true in some possible world in which nothing morally forbidden occurs; and so
on.
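The idea that weaker kinds of possibility are restricted quantifiers over worlds can be sketched in a few lines. Everything in the following Python example is an assumption made for illustration: the two toy worlds, the keys `actual_laws` and `faster_than_light_travel`, and the function names.

```python
worlds = [
    {"actual_laws": True, "faster_than_light_travel": False},
    {"actual_laws": False, "faster_than_light_travel": True},
]

def possibly(prop, restriction=lambda w: True):
    """Existential quantifier over the worlds that meet the restriction."""
    return any(prop(w) for w in worlds if restriction(w))

def necessarily(prop, restriction=lambda w: True):
    """Universal quantifier over the worlds that meet the restriction."""
    return all(prop(w) for w in worlds if restriction(w))

ftl = lambda w: w["faster_than_light_travel"]
same_laws = lambda w: w["actual_laws"]

metaphysically_possible = possibly(ftl)            # unrestricted quantifier
nomologically_possible = possibly(ftl, same_laws)  # only worlds with the actual laws

print(metaphysically_possible, nomologically_possible)
```

With no restriction, faster-than-light travel counts as possible in this model; once the quantifier is restricted to worlds with the actual laws of nature, it does not.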
6.3 Semantics for MPL
As usual, let’s consider semantics first. As always, our goal is to model how
statements involving the □ and ◇ are made true by the world, in order to shed
light on the meaning of these connectives, and in order to provide semantic
definitions of the notions of logical truth and logical consequence.
In constructing a semantics for MPL, we face two main challenges, one
philosophical, the other technical. The philosophical challenge is simply that
it isn’t wholly clear which formulas of MPL are indeed logical truths. It’s hard
to construct an engine to spit out logical truths if you don’t know which logical
truths you want it to spit out. With a few exceptions, there is widespread
agreement over which formulas of nonmodal propositional and predicate logic
are logical truths. But for modal logic, this is less clear, especially for sentences
that contain iterations of modal operators. Is □P→□□P a logical truth? It’s
hard to say.
A quick peek at the history of modal logic is in order. Modal logic arose
from dissatisfaction with the material conditional → of standard propositional
logic. The material conditional φ→ψ is true whenever φ is false or ψ is true;
but in expressing the conditionality of ψ on φ, we sometimes want to require a
tighter relationship: we want it not to be a mere accident that either φ is false or
ψ is true. To express this tighter relationship, C. I. Lewis introduced the strict
conditional φ⇒ψ, which he defined, as above, as □(φ→ψ).³ Thus defined,
φ⇒ψ isn’t automatically true just because φ is false or ψ is true. It must be
necessarily true that either φ is false or ψ is true.
³ See Lewis (1918); Lewis and Langford (1932).
Lewis then asked: what principles govern this new symbol □? Certain
principles seemed clearly appropriate, for instance: □(φ→ψ)→(□φ→□ψ).
Others were less clear. Is □φ→□□φ a logical truth? What about ◇□φ→φ?
Lewis’s solution to this problem was not to choose. Instead, he formulated
several different modal systems. He did this axiomatically, by formulating
systems that differed from one another by containing different axioms and
hence different theorems.
We will follow Lewis’s approach, and construct several different modal
systems. Unlike Lewis, we’ll do this semantically at first (the semantics for
modal logic we will study was published by Saul Kripke in the 1950s, long
after Lewis was writing), by constructing different definitions of a model for
modal logic. The definitions will differ from one another in ways that result
in different sets of valid formulas. In section 6.4 we’ll study Lewis’s axiomatic
systems, and in sections 6.5 and 6.6 we’ll discuss the relationship between the
semantics and the axiom systems.
Formulating multiple systems does not answer the philosophical question
of which formulas of modal logic are logically true; it merely postpones it.
The question re-arises when we want to apply Lewis’s systems, when we ask
which system is the correct system—i.e., which one correctly mirrors the logical
properties of the English words ‘possibly’ and ‘necessarily’? (Note that since
there are different sorts of necessity and possibility, different systems might
correctly represent different sorts of necessity.) But we won’t try to address
such philosophical questions here.
The technical challenge to constructing a semantics for MPL is that the
modal operators □ and ◇ are not truth-functional. A (sentential) connective is an
expression that combines with sentences to make new sentences. A one-place
connective combines with one sentence to form a new sentence. ‘It is not the
case that’ is a one-place connective of English; the ∼ is a one-place connective
in the language of PL. A connective is truth-functional iff whenever it combines
with sentences to form a new sentence, the truth value of the resulting sentence
is determined by the truth values of the component sentences. Many think
that ‘and’ is truth-functional, since they think that an English sentence of the
form “φ and ψ” is true iff φ and ψ are both true. But ‘necessarily’ is not
truth-functional. Suppose I tell you the truth value of φ; will you be able to tell me
the truth value of “Necessarily φ”? Well, if φ is false then presumably you can (it
is false), but if φ is true, then you still don’t know. If φ is “Ted is a philosopher”
then “Necessarily φ” is false, but if φ is “Either Ted is a philosopher or he isn’t
a philosopher” then “Necessarily φ” is true. So the truth value of “Necessarily
φ” isn’t determined by the truth value of φ. Similarly, ‘possibly’ isn’t
truth-functional either: ‘I might have been six feet tall’ is true, whereas ‘I might have
been a round square’ is false, despite the fact that ‘I am six feet tall’ and ‘I am a
round square’ each have the same truth value (they’re both false).
Since the □ and the ◇ are supposed to represent ‘necessarily’ and ‘possibly’,
respectively, and since the latter aren’t truth-functional, we can’t use the method
of truth tables to construct the semantics for the □ and the ◇. For the method
of truth tables assumes truth-functionality. Truth tables are just pictures of truth
functions: they specify what truth value a complex sentence has as a function of
what truth values its parts have. Imagine trying to construct a truth table for
the □. It’s presumably clear (though see the discussion of systems K, D, and T
below) that □φ should be false if φ is false, but what about when φ is true?

    □ φ
    ? 1
    0 0

There’s nothing we can put in the ‘?’ slot in the truth table, since when φ is true,
sometimes □φ is true and sometimes it is false.
Our challenge is clear: we need a semantics for the □ and the ◇ other than
the method of truth tables.
6.3.1 Relations
Before we investigate how to overcome this challenge, a digression is necessary.
Recall our discussion of ordered sets from section 1.8. In addition to their use
in constructing models, ordered sets are also useful for constructing relations.
We take a binary (2-place) relation to be a set of ordered pairs. For example,
the taller-than relation may be taken to be the set of ordered pairs 〈u, v〉 such
that u is taller than v. The less-than relation for positive integers is the set
of ordered pairs 〈m, n〉 such that m is a positive integer less than n, another
positive integer. That is, it is the following set:
{〈1, 2〉, 〈1, 3〉, 〈1, 4〉, …, 〈2, 3〉, 〈2, 4〉, …}
When 〈u, v〉 is a member of relation R, we say that u and v stand in R, or
that u bears R to v. Most simply, we write “Ruv”. (This notation is like that
of predicate logic; but here I’m speaking the metalanguage, not displaying
sentences of a formalized language.)
Some definitions. Let R be any binary relation.
• The domain of R (“dom(R)”) is the set {u : for some v, Ruv}
• The range of R (“ran(R)”) is the set {u : for some v, Rvu}
• R is over A iff dom(R) ⊆ A and ran(R) ⊆ A
In other words, the domain of R is the set of all things that bear R to something;
the range is the set of all things that something bears R to; and R is over A iff
the members of the tuples in R are all drawn from A.
Let R be any binary relation over A. Then we define the following notions
with respect to A:
• R is serial (in A) iff for every u ∈ A, there is some v ∈ A such
that Ruv
• R is reflexive (in A) iff for every u ∈ A, Ruu
• R is symmetric iff for all u, v, if Ruv then Rvu
• R is transitive iff for any u, v, w, if Ruv and Rvw then Ruw
• R is an equivalence relation (in A) iff R is symmetric, transitive,
and reflexive (in A)
• R is total (in A) iff for every u, v ∈ A, Ruv
Notice that we relativize some of these notions to a given set A. We define
the notion of reflexivity relative to a set, for example. We do this because the
alternative would be to say that a relation is reflexive simpliciter if everything
bears R to itself; but that would require the domain and range of any reflexive
relation to be the set of absolutely all objects. It’s better to introduce the notion
of being reflexive relative to a set, which is applicable to relations with smaller
domains and ranges. (I will sometimes omit the qualifier ‘in A’ when it is clear
which set that is.) Why don’t symmetry and transitivity have to be relativized
to a set? Because they only say what must happen if R holds among certain
things. Symmetry, for example, says merely that if R holds between u and v,
then it must also hold between v and u, and so we can say that a relation is
symmetric absolutely, without implying that everything is in its domain and
range.
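The definitions of this section can be tried out on small finite relations. The following is only an illustrative sketch: it assumes a relation is represented as a Python set of 2-tuples, and the function names (is_serial, is_reflexive, and so on) are made up for the illustration.

```python
# A binary relation is modelled as a set of ordered pairs (2-tuples).

def domain(R):
    """The set of things that bear R to something."""
    return {u for (u, v) in R}

def range_(R):
    """The set of things that something bears R to."""
    return {v for (u, v) in R}

def is_over(R, A):
    return domain(R) <= A and range_(R) <= A

def is_serial(R, A):
    return all(any((u, v) in R for v in A) for u in A)

def is_reflexive(R, A):
    return all((u, u) in R for u in A)

def is_symmetric(R):
    return all((v, u) in R for (u, v) in R)

def is_transitive(R):
    return all((u, w) in R for (u, v) in R for (v2, w) in R if v == v2)

def is_equivalence(R, A):
    return is_reflexive(R, A) and is_symmetric(R) and is_transitive(R)

def is_total(R, A):
    return all((u, v) in R for u in A for v in A)

# Two example relations over a small set A.
A = {1, 2, 3}
less_than = {(m, n) for m in A for n in A if m < n}
identity = {(u, u) for u in A}
```

Note that is_symmetric and is_transitive take no set argument, mirroring the point that these two notions need not be relativized to a set, whereas seriality, reflexivity, and totality are checked against A.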
6.3.2 Kripke models
Now we’re ready to introduce a semantics for MPL. As we saw, we can’t
construct truth tables for the □ or the ◇. Instead, we will pursue an approach
called possible-worlds semantics. The intuitive idea is to count □φ as being true
iff φ is true in all possible worlds, and ◇φ as being true iff φ is true in some
possible world. More carefully: we are going to develop models for modal
propositional logic. These models will contain objects we will call “possible
worlds”. And formulas are going to be true or false “at” these worlds; that
is, we are going to assign truth values to formulas in these models relative to
possible worlds, rather than absolutely. Truth values of propositional-logic
compound formulas (that is, negations and conditionals) will be determined
by truth tables within each world; ∼φ, for example, will be true at a world iff φ
is false at that world. But the truth value of □φ at a world won’t be determined
by the truth value of φ at that world; the truth value of φ at other worlds will
also be relevant.
Specifically, □φ will count as true at a world iff φ is true at every world that
is “accessible” from the first world. What does “accessible” mean? Each model
will come equipped with a binary relation, ℛ, that holds between possible
worlds; we will say that world v is “accessible from” world w when ℛwv. The
intuitive idea is that ℛwv if and only if v is possible relative to w. That is, if you
live in world w, then from your perspective, the events in world v are possible.
The idea that what is possible might vary depending on what possible
world you live in might at first seem strange, but it isn’t really. “It is physically
impossible to travel faster than the speed of light” is true in the actual world,
but false in worlds where the laws of nature allow faster-than-light travel.
On to the semantics. We first define a general notion of an MPL-model,
which we’ll then use to give a semantics for each of our systems:
An MPL-model is an ordered triple, 〈𝒲, ℛ, ℐ〉, such that:
i) 𝒲 is a nonempty set of objects (“possible worlds”)
ii) ℛ is a binary relation over 𝒲 (“accessibility relation”)
iii) ℐ is a two-place function (the “interpretation” function) that
assigns a 0 or 1 to each sentence letter, relative to (“at”, or
“in”) each world; that is, for any sentence letter α, and any
w ∈ 𝒲, ℐ(α, w) is either 0 or 1.
Each MPL-model contains a set 𝒲 of possible worlds, and an accessibility
relation ℛ. 〈𝒲, ℛ〉 is sometimes called the model’s frame. Think of the frame
as a map of the “structure” of the model’s space of possible worlds: it contains
information about how many worlds there are, and which worlds are accessible
from which. In addition to a frame, each model also contains an interpretation
function ℐ, which assigns truth values to sentence letters.
A model’s interpretation function assigns truth values only to sentence
letters. But the sum total of all the truth values of sentence letters relative to
worlds determines the truth values of all complex wffs, again relative to worlds.
It is the job of the model’s valuation function to specify exactly how these truth
values get determined:
Where ℳ (= 〈𝒲, ℛ, ℐ〉) is any MPL-model, the valuation for ℳ,
V_ℳ, is defined as the two-place function that assigns either 0 or 1
to each wff relative to each member of 𝒲, subject to the following
constraints, where α is any sentence letter, φ and ψ are any wffs,
and w is any member of 𝒲:
a) V_ℳ(α, w) = ℐ(α, w)
b) V_ℳ(∼φ, w) = 1 iff V_ℳ(φ, w) = 0
c) V_ℳ(φ→ψ, w) = 1 iff either V_ℳ(φ, w) = 0 or V_ℳ(ψ, w) = 1
d) V_ℳ(□φ, w) = 1 iff for each v ∈ 𝒲, if ℛwv, then V_ℳ(φ, v) = 1
What about the truth values for complex formulas that contain ∧, ∨, ↔, and
◇? Given the definitions of these defined connectives in terms of the primitive
connectives, it is easy to prove that the following derived conditions hold:
i) V_ℳ(φ∧ψ, w) = 1 iff V_ℳ(φ, w) = 1 and V_ℳ(ψ, w) = 1
ii) V_ℳ(φ∨ψ, w) = 1 iff V_ℳ(φ, w) = 1 or V_ℳ(ψ, w) = 1
iii) V_ℳ(φ↔ψ, w) = 1 iff V_ℳ(φ, w) = V_ℳ(ψ, w)
iv) V_ℳ(◇φ, w) = 1 iff for some v ∈ 𝒲, ℛwv and V_ℳ(φ, v) = 1
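For finite models, the valuation clauses can be turned directly into a recursive evaluator. What follows is a minimal sketch under stated assumptions, not an official implementation: a model is a triple (W, R, I) with I a dictionary from (letter, world) pairs to 0 or 1 (unlisted pairs default to 0, an assumption made for brevity), wffs are nested tuples, and the definitions used for ∧, ∨, and the possibility operator are the usual ones, assumed here to match those adopted earlier in the book.

```python
# Formulas: a sentence letter is a string; compound wffs are tuples built
# from the primitive connectives ('not', φ), ('imp', φ, ψ), ('box', φ).

def V(model, phi, w):
    """Valuation of wff phi at world w in model = (W, R, I)."""
    W, R, I = model
    if isinstance(phi, str):                       # clause a)
        return I.get((phi, w), 0)                  # unlisted pairs default to 0 (assumption)
    op = phi[0]
    if op == 'not':                                # clause b)
        return 1 if V(model, phi[1], w) == 0 else 0
    if op == 'imp':                                # clause c)
        return 1 if V(model, phi[1], w) == 0 or V(model, phi[2], w) == 1 else 0
    if op == 'box':                                # clause d)
        return 1 if all(V(model, phi[1], v) == 1 for v in W if (w, v) in R) else 0
    raise ValueError(f"unknown connective: {op!r}")

# Primitive constructors.
def Not(p): return ('not', p)
def Imp(p, q): return ('imp', p, q)
def Box(p): return ('box', p)

# Defined connectives (assumed definitions in terms of the primitives).
def And(p, q): return Not(Imp(p, Not(q)))
def Or(p, q): return Imp(Not(p), q)
def Dia(p): return Not(Box(Not(p)))

# A two-world example: w sees v, v sees nothing, and P is true only at v.
model = ({'w', 'v'}, {('w', 'v')}, {('P', 'v'): 1})
```

In this example, Box('P') comes out true at v vacuously, since v sees no world, while Dia('P') comes out false there; this agrees with clause d) and derived condition iv).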
So far, we have introduced a general notion of an MPL-model, and have
defined the notion of a wff’s being true at a world in an MPL-model. Next, let
us consider how to define validity.
Remember that our overall strategy is C. I. Lewis’s: we want to construct
different modal systems, since it isn’t obvious which formulas ought to count as
logical truths. The systems will be named: K, D, T, B, S4, S5. Each system will
come with its own definition of a model. As a result, different formulas will
come out valid in the different systems. For example, as we’ll see, the formula
□P→□□P is going to come out valid in S4 and S5, but not in the other systems.
Here are the definitions:
• A K-model is defined as any MPL-model
• A D-model is any MPL-model whose accessibility relation is
serial (i.e., any model 〈𝒲, ℛ, ℐ〉 in which ℛ is serial in 𝒲)
• A T-model is any MPL-model whose accessibility relation is
reflexive (in 𝒲)
• A B-model is any MPL-model whose accessibility relation is
reflexive (in 𝒲) and symmetric
• An S4-model is any MPL-model whose accessibility relation is
reflexive (in 𝒲) and transitive
• An S5-model is any MPL-model whose accessibility relation is
reflexive (in 𝒲), symmetric, and transitive
A formula φ is valid in model ℳ (= 〈𝒲, ℛ, ℐ〉) iff for every w ∈ 𝒲,
V_ℳ(φ, w) = 1
A formula is valid in system S (where S is either K, D, T, B, S4, or
S5) iff it is valid in every S-model
Notice that for each system, the valid formulas are defined as the formulas
that are valid in every model in which the accessibility relation has a certain
formal feature. The systems differ from one another by what that formal
feature is. For T it is reflexivity: a formula is T-valid iff it is valid in every
model in which the accessibility relation is reflexive. For S4 the formal feature
is reflexivity + transitivity. Other systems correspond to other formal features.
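Since each system is picked out by a formal feature of the accessibility relation, one can compute, for a finite frame, which kinds of model it can underlie. The sketch below assumes a frame is given as a set W of worlds together with a set R of ordered pairs; the function name systems_of is hypothetical.

```python
def systems_of(W, R):
    """Names of the systems S such that any model on frame (W, R) is an S-model."""
    serial = all(any((w, v) in R for v in W) for w in W)
    reflexive = all((w, w) in R for w in W)
    symmetric = all((v, u) in R for (u, v) in R)
    transitive = all((u, x) in R for (u, v) in R for (y, x) in R if v == y)

    names = {'K'}                      # every MPL-model is a K-model
    if serial:
        names.add('D')
    if reflexive:
        names.add('T')
        if symmetric:
            names.add('B')
        if transitive:
            names.add('S4')
        if symmetric and transitive:
            names.add('S5')
    return names

# Example frames on the two worlds 1 and 2.
no_arrows = set()
reflexive_only = {(1, 1), (2, 2)}
one_way = {(1, 1), (2, 2), (1, 2)}
```

For instance, the frame with reflexive arrows plus a single one-way arrow is reflexive and transitive but not symmetric, so it supports T- and S4-models but not B- or S5-models.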
As before, we’ll use the ⊨ notation for validity. But since we have many
modal systems, if we claim that a formula is valid, we’ll need to indicate which
system we’re talking about. Let’s do that by subscripting ⊨ with the name of
the system; thus, “⊨_T φ” means that φ is T-valid.
It’s important to get clear on the status of possible-worlds lingo here. Where
〈𝒲, ℛ, ℐ〉 is a model, we call the members of 𝒲 “worlds”, and we call ℛ the
“accessibility” relation. Now, there is no question that “possible worlds” is a
vivid way to think about necessity and possibility. But officially, 𝒲 is nothing
but a nonempty set, any old nonempty set. Its members needn’t be the kinds
of things metaphysicians call possible worlds: they can be numbers, people,
bananas, whatever you like. Similarly, ℛ is just defined to be any old binary
relation on 𝒲; it needn’t have anything to do with the metaphysics of modality.
Officially, then, the possible-worlds talk we use to describe our models is just
talk, not heavy-duty metaphysics. Still, models are usually intended to model
something: to depict some aspect of the dependence of truth on the world.
So if modal sentences of English containing ‘necessarily’ and ‘possibly’ aren’t
made true by anything like possible worlds, it’s hard to see why possible-worlds
models would shed any light on their meaning, or why
truth-in-all-possible-worlds models would be a good way of modeling (genuine) validity for modal
statements. At any rate, this philosophical issue should be kept in mind. Back,
now, to the formalism.
6.3.3 Semantic validity proofs
Given our definition of validity, one can now show that a certain formula is
valid in a given system. First, a very simple example. Let us prove that the
formula □(P∨∼P) is K-valid. That is, we must prove that this formula is valid
in every MPL-model, since that is the definition of K-validity. Being valid
in a model means being true at every world in the model. So, consider any
MPL-model ℳ = 〈𝒲, ℛ, ℐ〉, and let w be any world in 𝒲. We must prove that
V_ℳ(□(P∨∼P), w) = 1. (As before, I’ll start to omit the subscript ℳ on V_ℳ
when it’s clear which model we’re talking about.)
i) Suppose for reductio that V(□(P∨∼P), w) = 0
ii) So, by the truth condition for the □ in the definition of the
valuation function, there is some world, v, such that ℛwv
and V(P∨∼P, v) = 0
iii) Given the truth condition for the ∨, V(P, v) = 0 and V(∼P, v) = 0
iv) Since V(∼P, v) = 0, given the truth condition for the ∼,
V(P, v) = 1. But that’s impossible; V(P, v) can’t be both 0
and 1.
Thus, ⊨_K □(P∨∼P). Note that similar reasoning would establish ⊨_K φ, for
any propositional-logic tautology φ. The reason is this: within any world, the
truth values of complex statements of propositional logic are determined by
the truth values of their constituents in that world by the usual truth tables. So
if φ is a PL-tautology, it will be true in any world in any model; hence □φ will
turn out true in any world in any model.
Another example: let’s show that ⊨_T (◇□(P→Q)∧□P)→◇Q. We must
show that V((◇□(P→Q)∧□P)→◇Q, w) = 1 for the valuation V of an
arbitrarily chosen T-model and any world w in that model.
i) Assume for reductio that V((◇□(P→Q)∧□P)→◇Q, w) = 0
ii) So V(◇□(P→Q)∧□P, w) = 1 and …
iii) … V(◇Q, w) = 0
iv) From ii), ◇□(P→Q) is true at w, and so V(□(P→Q), v) = 1,
for some world, call it v, such that ℛwv
v) From ii), V(□P, w) = 1. So, by the truth condition for the
□, P is true in every world accessible from w; since ℛwv, it
follows that V(P, v) = 1.
vi) From iv), P→Q is true in every world accessible from v; since
our model is a T-model, ℛ is reflexive. So ℛvv; and so
V(P→Q, v) = 1
vii) From v) and vi), by the truth condition for the →, V(Q, v) = 1
viii) Given iii), Q is false at every world accessible from w, and so,
since ℛwv, V(Q, v) = 0; this contradicts vii)
OK, we’ve shown that the formula (◇□(P→Q)∧□P)→◇Q is valid in T.
Suppose we were interested in showing that this formula is also valid in S4.
What more would we have to do? Nothing! To be S4-valid is to be valid in
every S4-model; but a quick look at the definitions shows that every S4-model
is a T-model. So, since we already know that the formula is valid in all
T-models, we already know that it must be valid in all S4-models (and hence,
S4-valid), without doing a separate proof.
Think of it another way. To do a proof that the formula is S4-valid, we need
to do a proof in which we are allowed to assume that the accessibility relation is
both transitive and reflexive. And the proof above did just that. We didn’t ever
use the fact that the accessibility relation is transitive; we only used the fact
that it is reflexive (in step vi). But we don’t need to use everything we’re allowed
to assume.
In contrast, the proof above doesn’t establish that this formula is, say, K-valid.
To be K-valid, the formula would need to be valid in all models. But some
models don’t have reflexive accessibility relations, whereas the proof we gave
assumed that the accessibility relation was reflexive. And in fact the formula
isn’t K-valid, as we’ll show how to demonstrate in the next section.
Consider the following diagram of systems:

           S5
          ↗  ↖
        S4    B
          ↖  ↗
           T
           ↑
           D
           ↑
           K

An arrow from one system to another indicates that validity in the first system
implies validity in the second system. For example, if a formula is D-valid, then
it’s also T-valid. The reason is that if something is valid in all D-models, then,
since every T-model is also a D-model (since reflexivity implies seriality), it
must be valid in all T-models as well.
S5 is the strongest system, since it has the most valid formulas. (That’s
because it has the fewest models: it’s easier to be S5-valid because there are
fewer potentially falsifying models.)
Notice that the diagram isn’t linear. That’s because of the following. Both B
and S4 are stronger than T; each contains all the T-valid formulas. But neither
B nor S4 is stronger than the other; each contains valid formulas that the
other doesn’t. (They of course overlap, because each contains all the T-valid
formulas.) S5 is stronger than each; S5 contains all the valid formulas of each.
These relationships between the systems will be exhibited below.
Suppose you are given a formula, and for each system in which it is valid,
you want to give a semantic proof of its validity. This needn’t require multiple
semantic proofs; as we have seen, one semantic proof can do the job. To prove
that a certain formula is valid in a number of systems, it suffices to prove that it
is valid in the weakest possible system. Then, that very proof will automatically
be a proof that it is valid in all stronger systems. For example, a proof that a
formula is valid in K would itself be a proof that the formula is D-, T-, B-, S4-,
and S5-valid. Why? Because every model of any kind is a K-model, so K-valid
formulas are always valid in all other systems.
In general, then, to show what systems a formula is valid in, it suffices to
give a single semantic proof of it, namely, a semantic proof in the weakest
system in which it is valid. There is an exception, however, since neither B
nor S4 is stronger than the other. Suppose a formula is not valid in T, but one
has given a semantic proof of its validity in B. This proof also establishes that the
formula is valid in S5, since every S5-model is a B-model. But one
doesn’t yet know whether the formula is S4-valid, since not every S4-model is a
B-model. Another semantic proof may be needed: of the formula’s S4-validity.
(Of course, the formula may not be S4-valid.)
So: when a wff is valid in both B and S4, but not in T, two semantic proofs
of its validity are needed.
I’ll present some more sample validity proofs below, but it’s often easier to
do proofs of validity when one has failed to construct a countermodel for a
formula. So let’s look first at countermodeling.
6.3.4 Countermodels
We have a definition of validity for the various systems, and we’ve shown how
to establish validity of particular formulas. Now we’ll investigate establishing
invalidity.
Let’s show that the formula ◇P→□P is not K-valid. A formula is K-valid if
it is valid in all K-models, so all we must do is find one K-model in which it
isn’t valid. What follows is a procedure for doing this. (This procedure is from
Cresswell and Hughes (1996).)
Place the formula in a box
The goal is to find some model, and some world in the model, where the
formula is false. Let’s start by drawing a box, which represents some chosen
world in the model we’ll construct. The goal is to make the formula false in
this world. In these examples I’ll always call this first world “r”:

    r: [ ◇ P → □ P ]

Now, since the box represents a world, we should have some way of representing
the accessibility relation. What worlds are possible, relative to r; what worlds
does r “see”? Well, to represent one world (box) seeing another, we’ll draw
an arrow from the first to the second. However, in the case of this particular
model, we don’t need to make this world r see anything. After all, we’re trying
to construct a K-model, and the accessibility relation of a K-model doesn’t
even need to be serial: no world needs to see any worlds at all. So, we’ll forget
about arrows for the time being.
Make the formula false in the world
We will indicate a formula’s truth value (1 or 0) by writing it above the formula’s
major connective. So to indicate that ◇P→□P is to be false in this model, we’ll
put a 0 above its arrow:

             0
    r: [ ◇ P → □ P ]

Enter in forced truth values
If we want to make ◇P→□P false in this world, the definition of a valuation
function requires us to assign certain other truth values. Whenever a conditional
is false at a world, its antecedent is true at that world and its consequent is false
at that world. So, we’ve got to enter in more truth values: a 1 over the major
connective of the antecedent (◇P), and a 0 over the major connective of the
consequent (□P):

         1   0 0
    r: [ ◇ P → □ P ]

Enter asterisks
When we assign a truth value to a modal formula, we thereby commit ourselves
to assigning certain other truth values to various formulas at various worlds.
For example, when we make ◇P true at r, we commit ourselves to making P
true at some world that r sees. To remind ourselves of this commitment, we’ll
put an asterisk (∗) below ◇P. An asterisk below indicates a commitment to there
being some world of a certain sort. Similarly, since □P is false at r, this means
that P must be false in some world r sees (if it were true in all such worlds,
then by the semantic clause for the □, □P would be true at r). We again have a
commitment to there being some world of a certain sort, so we enter an asterisk
below □P as well:

         1   0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

Discharge bottom asterisks
The next step is to fulfill the commitments we incurred by adding the bottom
asterisks. For each, we need to add a world to the diagram. The first asterisk
requires us to add a world in which P is true; the second requires us to add a
world in which P is false. We do this as follows:

         1   0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

         1
    a: [ P ]

         0
    b: [ P ]

    arrows: r → a, r → b

What I’ve done is added two more worlds to the diagram: a and b. P is true in
a, but false in b. I have thereby satisfied my obligations to the asterisks on my
diagram, for r does indeed see a world in which P is true, and another in which
P is false.
The official model
We now have a diagram of a K-model containing a world in which ◇P→□P
is false. But we need to produce an official model, according to the official
definition of a model. A model is an ordered triple 〈𝒲, ℛ, ℐ〉, so we must
specify the model’s three members.
The set of worlds. We first must specify the set of worlds, 𝒲. 𝒲 is simply
the set of worlds I invoked:
𝒲 = {r, a, b}
But what are r, a, and b? Let’s just take them to be the letters ‘r’, ‘a’, and ‘b’. No
reason not to: the members of 𝒲, recall, can be any things whatsoever.
The accessibility relation. Next, for the accessibility relation. This is
represented on the diagram by the arrows. In our model, there is an arrow
from r to a, an arrow from r to b, and no other arrows. Thus, the diagram
represents that r sees a, that r sees b, and that there are no further cases of
seeing. Now, remember that the accessibility relation, like all relations, is a set
of ordered pairs. So, we simply write out this set:
ℛ = {〈r, a〉, 〈r, b〉}
That is, we write out the set of all ordered pairs 〈w₁, w₂〉 such that w₁ “sees” w₂.
The interpretation function. Finally, we need to specify the interpretation
function, ℐ, which assigns truth values to sentence letters at worlds. In
our model, ℐ must assign 1 to P at world a, and 0 to P at world b. Now, our
official definition requires an interpretation to assign a truth value to each of
the infinitely many sentence letters at each world; but so long as P is true at
world a and false at world b, it doesn’t matter what other truth values ℐ assigns.
So let’s just (arbitrarily) choose to make everything else false: all other sentence
letters at all worlds in the model, and P at r. We have, then:
ℐ(P, a) = 1
ℐ(P, b) = 0
in every other case (any other sentence letter α at any world w, and P at r), ℐ(α, w) = 0
That’s it; we’re done. We have produced a model in which ◇P→□P is false
at some world; hence this formula is not valid in all models; and hence it’s not
K-valid: ⊭_K ◇P→□P.
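The official model can also be checked mechanically. The following sketch is only an illustration: it writes the three worlds as Python strings, encodes formulas as nested tuples with hypothetical tags 'imp', 'box', and 'dia', evaluates the possibility operator by its derived truth condition, and treats every letter/world pair not listed in I as false.

```python
def V(W, R, I, phi, w):
    """Truth value (0 or 1) of phi at world w of the model (W, R, I)."""
    if isinstance(phi, str):
        return I.get((phi, w), 0)          # unlisted pairs are false
    op = phi[0]
    if op == 'imp':
        return int(V(W, R, I, phi[1], w) == 0 or V(W, R, I, phi[2], w) == 1)
    # Values of the embedded formula at the worlds that w sees.
    seen = [V(W, R, I, phi[1], v) for v in W if (w, v) in R]
    if op == 'box':
        return int(all(x == 1 for x in seen))
    if op == 'dia':
        return int(any(x == 1 for x in seen))
    raise ValueError(op)

# The official K-countermodel: r sees a and b; P is true only at a.
W = {'r', 'a', 'b'}
R = {('r', 'a'), ('r', 'b')}
I = {('P', 'a'): 1}

phi = ('imp', ('dia', 'P'), ('box', 'P'))
values = {w: V(W, R, I, phi, w) for w in W}
```

The formula is false at r, which is all that invalidity requires. It happens to be true at a and b, because those worlds see nothing, so the antecedent is false there.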
Check the model
At the end of this process, it’s a good idea to check to make sure that your
model is correct. This involves various things. First, make sure that you’ve
succeeded in producing the correct kind of model. For example, if you’re trying
to produce a T-model, make sure that the accessibility relation you’ve written
down is reflexive. (In our case, we were only trying to construct a K-model, and
so for us this step is trivial.) Secondly, make sure that the formula in question
really does come out false at one of the worlds in your model.
Simplifying models
Sometimes a model can be simplified. Consider the diagram of the final version
of the model above:

         1   0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

         1
    a: [ P ]

         0
    b: [ P ]

    arrows: r → a, r → b

We needn’t have used three worlds in the model. When we discharged the first
asterisk, we needed to put in a world that r sees, in which P is true. But we
needn’t have made that a new world; we could simply have made P true in
r and let r see itself. Of course we couldn’t have done that for both asterisks,
because that would have made P both true and false at r. So, we could make one
simplification:

         1 1 0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

         0
    b: [ P ]

    arrows: r → r, r → b

The official model would then look as follows:
𝒲 : {r, b}
ℛ : {〈r, r〉, 〈r, b〉}
ℐ(P, r) = 1; otherwise, everything false
Adapting models to different systems
We have shown that ◇P→□P is not K-valid. Now, let’s show that this formula
isn’t D-valid, i.e. that it is false in some world of some model with a serial
accessibility relation (i.e., some “D-model”). Well, we haven’t quite done this,
since the model above does not have a serial accessibility relation: b sees no
world. But we can easily change this, as follows:

         1 1 0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

         0
    b: [ P ]

    arrows: r → r, r → b, b → b

Official model:
𝒲 : {r, b}
ℛ : {〈r, r〉, 〈r, b〉, 〈b, b〉}
ℐ(P, r) = 1; otherwise, everything false
That was easy: adding the fact that b sees itself didn’t require changing
anything else in the model.
Suppose we want now to show that ◇P→□P isn’t T-valid. Well, we’ve
already done so! Why? Because we’ve already produced a T-model in which
this formula is false. Look back at the most recent model. Its accessibility
relation is reflexive. So it’s a T-model already. In fact, that accessibility relation
is also already transitive, so it’s already an S4-model.
So far we have established that ⊭_S ◇P→□P for each S among K, D, T, and S4.
What about B and S5?
It’s easy to revise our model to make the accessibility relation symmetric:

         1 1 0 0
    r: [ ◇ P → □ P ]
         ∗     ∗

         0
    b: [ P ]

    arrows: r → r, r → b, b → r, b → b

Official model:
𝒲 : {r, b}
ℛ : {〈r, r〉, 〈r, b〉, 〈b, b〉, 〈b, r〉}
ℐ(P, r) = 1; otherwise, everything false
Now, we’ve got a B-model, too. What’s more, we’ve also got an S5-model:
notice that the accessibility relation is an equivalence relation. (In fact, it’s also
a total relation.)
So, we’ve succeeded in establishing that ◇P→□P is not valid in any of our
systems. Notice that we could have done this more quickly, if we had given the
final model in the first place. After all, this model is an S5-, S4-, B-, T-, D-, and
K-model. So one model establishes that the formula isn’t valid in any of the
systems.
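As a quick sanity check on this last model, one can confirm both that its accessibility relation has every formal feature the systems ask for and that the formula is still false at r. The sketch below is illustrative only; the worlds are written as Python strings, unlisted letter/world pairs count as false, and the two modal subformulas are evaluated directly from their truth conditions rather than through a general evaluator.

```python
# The final two-world model: every world sees every world; P is true only at r.
W = ['r', 'b']
R = {('r', 'r'), ('r', 'b'), ('b', 'b'), ('b', 'r')}
I = {('P', 'r'): 1}

# Formal features of the accessibility relation.
serial = all(any((w, v) in R for v in W) for w in W)
reflexive = all((w, w) in R for w in W)
symmetric = all((v, u) in R for (u, v) in R)
transitive = all((u, x) in R for (u, v) in R for (y, x) in R if v == y)
total = all((u, v) in R for u in W for v in W)

def P_at(v):
    return I.get(('P', v), 0) == 1

# "Possibly P" at w: some world w sees makes P true.
dia_P = {w: any(P_at(v) for v in W if (w, v) in R) for w in W}
# "Necessarily P" at w: every world w sees makes P true.
box_P = {w: all(P_at(v) for v in W if (w, v) in R) for w in W}

# Worlds where the conditional has a true antecedent and a false consequent.
falsified_at = [w for w in W if dia_P[w] and not box_P[w]]
```

Because the relation is total, both worlds see exactly the same worlds, so the conditional in fact fails at b as well as at r.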
In general, in order to establish that a formula is invalid in a number of
systems, try to produce a model for the strongest system (i.e., the system with
the most requirements on models). If you do, then you’ll automatically have a
model for the weaker systems. Keep in mind the diagram of systems:

           S5
          ↗  ↖
        S4    B
          ↖  ↗
           T
           ↑
           D
           ↑
           K

An arrow from one system to another, recall, indicates that validity in the first
system implies validity in the second. The arrows also indicate facts about
invalidity, but in reverse: when an arrow points from one system to another,
then invalidity in the second system implies invalidity in the first. For example,
if a wff is invalid in T, then it is invalid in D. (That’s because every T-model is
a D-model; a countermodel in T is therefore a countermodel in D.)
When our task is to discover which systems a given formula is invalid in,
usually only one countermodel will be needed: a countermodel in the strongest
system in which the formula is invalid. But there is an exception involving B
and S4. Suppose a given formula is valid in S5, but we discover a model showing
that it isn’t valid in B. That model is automatically a T-, D-, and K-model, so we
know that the formula isn’t T-, D-, or K-valid. But we don’t yet know about that
formula’s S4-validity. If it is S4-invalid, then we will need to produce a second
countermodel, an S4 countermodel. (Notice that the B-model couldn’t already
be an S4-model. If it were, then its accessibility relation would be reflexive,
symmetric, and transitive, and so it would be an S5-model, contradicting the
fact that the formula was S5-valid.)
Additional steps in countermodelling
I gave a list of steps in constructing countermodels:
1. Place the formula in a box
2. Make the formula false in the world
3. Enter in forced truth values
4. Enter asterisks
5. Discharge bottom asterisks
6. The official model
We’ll need to adapt this list.
Above asterisks. Let’s try to get a countermodel for ◇□P→□◇P in all the
systems in which it is invalid, and a semantic validity proof in all the systems in
which it is valid. We always start with countermodelling before doing semantic
validity proofs, and when doing countermodelling, we start by trying for a
K-model. After the first few steps, we have:

         1     0 0
    r: [ ◇ □ P → □ ◇ P ]
         ∗       ∗

         1
    a: [ □ P ]

         0
    b: [ ◇ P ]

    arrows: r → a, r → b

At this point, we’ve got a true □, and a false ◇. Take the first: a true □P. This
doesn’t commit us to adding a world in which P is true; rather, it commits us
to making P true in every world that a sees. Similarly, a zero over a ◇, over
◇P in world b in this case, commits us to making P false in every world that
b sees. We indicate such commitments, commitments in every world seen, by
putting asterisks above the relevant modal operators:

         1     0 0
    r: [ ◇ □ P → □ ◇ P ]
         ∗       ∗

         ∗
         1
    a: [ □ P ]

         ∗
         0
    b: [ ◇ P ]

    arrows: r → a, r → b

Now, how can we discharge these asterisks? In this case, when trying to
construct a K-model, we don’t need to do anything. Since a, for example,
doesn’t see any world, then automatically P is true in every world it sees; the
statement “for every world, w, if ℛaw then V(P, w) = 1” is vacuously true.
Same goes for b: P is automatically false in all worlds it sees. So, we’ve got a
K-model in which ◇□P→□◇P is false.
Now let’s turn the model into a D-model. Every world must now see at
least one world. Let’s try:

         1     0 0
    r: [ ◇ □ P → □ ◇ P ]
         ∗       ∗

         ∗
         1
    a: [ □ P ]

         ∗
         0
    b: [ ◇ P ]

         1
    c: [ P ]

         0
    d: [ P ]

    arrows: r → a, r → b, a → c, b → d, c → c, d → d

I added worlds c and d, so that a and b would each see at least one world.
(Further, worlds c and d each had to see a world, to keep the relation serial.
I could have added still more worlds that c and d saw, but then they would
themselves need to see some worlds… So I just let c and d see themselves.) But
once c and d were added, discharging the upper asterisks in worlds a and b
required making P true in c and false in d (since a sees c and b sees d).
Let’s now try for a T-model. This will involve, among other things, letting
a and b see themselves. But this gets rid of the need for worlds c and d, since
they were added just to make the relation serial. I’ll try:

         1     0 0
    r: [ ◇ □ P → □ ◇ P ]
         ∗       ∗

         ∗
         1 1
    a: [ □ P ]

         ∗
         0 0
    b: [ ◇ P ]

    arrows: r → r, r → a, r → b, a → a, b → b

When I added arrows, I needed to make sure that I correctly discharged the
asterisks. This required nothing of world r, since there were no top asterisks
there. There were top asterisks in worlds a and b; but it turned out to be easy
to discharge these asterisks: I just needed to let P be true in a, but false in b.
Notice that I could have moved straight to this T-model (which is itself a
D-model) rather than first going through the earlier mere D-model. However,
this won’t always be possible: sometimes you’ll be able to get a D-model, but
no T-model.
At this point let’s verify that our model does indeed assign the value 0 to
our formula 32P→23P. First notice that 2P is true in a (since a only sees
one world—itself—and P is true there). But r sees a. So 32P is true at r. Now,
consider b. b only sees one world, itself, and P is false there. So 3P must also
be false there. But r sees b. So 23P is false at r. But now, the antecedent of
32P→23P is true, while its consequent is false, at r. So that conditional is
false at r. Which is what we wanted.
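The verification just given can also be run mechanically. What follows is only
a minimal sketch: the tuple encoding of formulas and the names holds, R, and V
are illustrative assumptions, not part of the official apparatus. It evaluates
the formula at each world of the T-model above.

```python
# Formulas are nested tuples: ("atom", "P"), ("imp", f, g), ("box", f), ("dia", f).
def holds(f, w, R, V):
    """Return True iff formula f is true at world w.

    R is a set of (u, v) pairs meaning "u sees v"; V is the set of
    (letter, world) pairs at which a sentence letter is true.
    """
    op = f[0]
    if op == "atom":
        return (f[1], w) in V
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":  # true in every world that w sees
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if op == "dia":  # true in some world that w sees
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


P = ("atom", "P")
dia_box_P = ("dia", ("box", P))
box_dia_P = ("box", ("dia", P))
formula = ("imp", dia_box_P, box_dia_P)

# The T-model from the text: r sees r, a, and b; a and b see only themselves.
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("r", "b")}
V = {("P", "a")}  # P is true at a and false everywhere else

for world in ("r", "a", "b"):
    print(world, holds(formula, world, R, V))
```

The formula comes out false at r only; a model counts as a countermodel as soon
as the formula is false in at least one of its worlds.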
Onward. Our model is not a B-model, since a, for example, doesn’t see r,
despite the fact that r sees a. So let’s try to make this into a B-model. This
involves making the relation symmetric. Here’s how it looks before I try to
discharge the top asterisks in a and b:
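[Diagram: the T-model above with its accessibility relation made symmetric.
World r, where ◇□P→□◇P is false, sees itself, a, and b; a and b each see
themselves and see back to r. □P is true in a and ◇P is false in b; the top
asterisks in a and b are not yet discharged.]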
Now I need to make sure that all top asterisks are discharged. For example,
since a now sees r, I’ll need to make sure that P is true at r. However, since
b sees r too, P needs to be false at r. But P can’t be both true and false at r.
So we’re stuck in trying to get a B-model in which this formula is false. This
suggests that maybe it is impossible; that is, perhaps this formula is true in
all worlds in all B-models; that is, perhaps the formula is B-valid. So the
thing to do is try to prove this, by supplying a semantic validity proof.
So, let ⟨W, R, I⟩ be any model in which R is reflexive and symmetric, let
V be its valuation function, and let w be any member of W; we must show that
V(◇□P→□◇P, w) = 1.
i) Suppose for reductio that V(◇□P→□◇P, w) = 0
ii) Then V(◇□P, w) = 1 and …
iii) … V(□◇P, w) = 0
iv) By ii), for some v, Rwv and V(□P, v) = 1.
v) By symmetry, Rvw.
vi) From iv), via the truth condition for □, we know that P is true
at every world accessible from v; and so, by v), V(P, w) = 1.
vii) By iii), there is some world, call it u, such that Rwu and
V(◇P, u) = 0.
viii) By symmetry, w is accessible from u.
ix) By vii), P is false in every world accessible from u; and so by
viii), V(P, w) = 0, contradicting vi)
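A brute-force search is no substitute for the proof just given, but it is a
useful sanity check when hunting for countermodels. The sketch below (the
helper names b_models and find_b_countermodel are assumptions made for
illustration) enumerates every model with at most three worlds whose
accessibility relation is reflexive and symmetric, and looks for a world where
the formula is false.

```python
from itertools import combinations, product


def holds(f, w, R, V):
    """Truth of f at world w; R is a set of (u, v) pairs, V a set of (letter, world)."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in V
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


def b_models(n):
    """Yield (worlds, R, V) for every reflexive, symmetric R on n worlds and
    every assignment of truth values to the single sentence letter P."""
    worlds = range(n)
    pairs = list(combinations(worlds, 2))
    for linked in product([False, True], repeat=len(pairs)):
        R = {(w, w) for w in worlds}  # reflexivity
        for (u, v), on in zip(pairs, linked):
            if on:
                R |= {(u, v), (v, u)}  # arrows always come in symmetric pairs
        for bits in product([False, True], repeat=n):
            V = {("P", w) for w in worlds if bits[w]}
            yield worlds, R, V


def find_b_countermodel(f, max_worlds=3):
    """Return (R, V, world) falsifying f, or None if no small B-model does."""
    for n in range(1, max_worlds + 1):
        for worlds, R, V in b_models(n):
            for w in worlds:
                if not holds(f, w, R, V):
                    return R, V, w
    return None


P = ("atom", "P")
formula = ("imp", ("dia", ("box", P)), ("box", ("dia", P)))
print(find_b_countermodel(formula))
```

The search comes back empty, as the validity proof guarantees it must. A
failed search on its own would show nothing about larger models; only the
proof settles B-validity.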
Just as we suspected: the formula is indeed B-valid. So we know that it is
S5-valid (the proof we just gave was itself a proof of its S5-validity). But
what about S4-validity? Remember the diagram: we don’t have the answer yet. The
thing to do here is to try to come up with an S4-model, or an S4 semantic
validity proof. Usually, the best thing to do is to try for a model. In fact, in
the present case this is quite easy: our T-model is already an S4-model. So
we’re done. Our answer to what systems the formula is valid and invalid in comes
in two parts:
Invalidity: we have an S4-model, which we put down officially as
follows:
W = {r, a, b}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨r, b⟩}
I(P, a) = 1, all else 0
This model is itself also a T-, D-, and K-model (since its accessibility
relation is reflexive and serial), so:
⊭_{K,D,T,S4} ◇□P→□◇P
Validity: we gave a semantic proof of B-validity above. This was
itself a proof of S5-validity, as I noted. So
⊨_{B,S5} ◇□P→□◇P
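Which systems a given model belongs to depends only on its accessibility
relation, so that part of the bookkeeping is easy to automate. The sketch below
assumes the usual frame conditions (serial for D, reflexive for T, reflexive
and symmetric for B, reflexive and transitive for S4, all three for S5); the
function names are illustrative assumptions.

```python
def is_serial(W, R):
    return all(any((w, v) in R for v in W) for w in W)


def is_reflexive(W, R):
    return all((w, w) in R for w in W)


def is_symmetric(R):
    return all((v, u) in R for (u, v) in R)


def is_transitive(R):
    return all((u, x) in R for (u, v) in R for (y, x) in R if v == y)


def systems(W, R):
    """List the systems for which a model built on (W, R) counts as a model."""
    names = ["K"]  # every model is a K-model
    if is_serial(W, R):
        names.append("D")
    if is_reflexive(W, R):
        names.append("T")
        if is_symmetric(R):
            names.append("B")
        if is_transitive(R):
            names.append("S4")
        if is_symmetric(R) and is_transitive(R):
            names.append("S5")
    return names


# The official model's worlds and accessibility relation.
W = {"r", "a", "b"}
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("r", "b")}
print(systems(W, R))  # → ['K', 'D', 'T', 'S4']
```

The relation fails symmetry (r sees a, but a does not see r), which is why the
model settles nothing about B or S5.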
I’ll do one more example: ◇□P→◇□◇□P. We can get a T-model as
follows:
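[Diagram: world r, where ◇□P→◇□◇□P is false (◇□P true, ◇□◇□P false), sees
itself and worlds a and b. In a, □P is true and □◇□P is false; in b, ◇□P is
false and P is true; in c, P is false. World a sees b, b sees c, and every
world sees itself. Annotations in the diagram: “I discharged the second bottom
asterisk in r by letting r see b”; “Notice how commitments to specific truth
values for different formulas are recorded by placing the formulas side by side
in the box”.]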
Official model:
W = {r, a, b, c}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨c, c⟩, ⟨r, a⟩, ⟨r, b⟩, ⟨a, b⟩, ⟨b, c⟩}
I(P, a) = I(P, b) = 1, all else 0
Now consider what happens when we try to turn this model into a B-model.
World b must see back to world a. But then the false ◇□P in b conflicts with the
true □P in a. So it’s time for a validity proof. In constructing this validity
proof, we can be guided by our failed attempt to construct a countermodel
(assuming all of our choices in constructing that countermodel were forced). In
the following proof that the formula is B-valid, I chose variables for worlds
that match up with the countermodel above:
i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, in some
world r in some B-model ⟨W, R, I⟩
ii) So V(◇□P, r) = 1 and …
iii) V(◇□◇□P, r) = 0
iv) From ii), there’s some world, call it a, such that V(□P, a) = 1
and Rra
v) From iii), since Rra, V(□◇□P, a) = 0
vi) And so, there’s some world, call it b, such that V(◇□P, b) = 0
and Rab
vii) By symmetry, Rba. And so, given vi), V(□P, a) = 0. This
contradicts iv)
We now have a T-model for the formula, and a proof that it is B-valid. The
B-validity proof shows the formula to be S5-valid; the T-model shows it to
be K-, D-, and T-invalid. We don’t yet know about S4. So let’s return to the
T-model above, and see what happens when we try to make its accessibility
relation transitive. World a must then see world c, which is impossible since
□P is true in a and P is false in c. So we’re ready for an S4-validity proof
(the proof looks like the B-validity proof at first, but then diverges):
i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, for some
world r in some S4-model ⟨W, R, I⟩
ii) So V(◇□P, r) = 1 and …
iii) V(◇□◇□P, r) = 0
iv) From ii), there’s some world, call it a, such that V(□P, a) = 1
and Rra
v) From iii), since Rra, V(□◇□P, a) = 0
vi) And so, there’s some world, call it b, such that V(◇□P, b) = 0
and Rab
vii) By reflexivity, Rbb, so given vi), V(□P, b) = 0
viii) And so, there’s some world, call it c, such that V(P, c) = 0 and
Rbc.
ix) From vi) and viii), given transitivity, we have Rac. And so,
given iv), V(P, c) = 1, contradicting viii)
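The T-model for this formula, and what goes wrong when its accessibility
relation is made transitive, can be checked directly. This is a sketch under
stated assumptions: the encoding and the helper transitive_closure are
illustrative, and P is taken to be true at a as well as at b, since □P is true
in a and a sees itself.

```python
def holds(f, w, R, V):
    """Truth of f at world w; R is a set of (u, v) pairs, V a set of (letter, world)."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in V
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


def transitive_closure(R):
    """Add arrows until the relation is transitive."""
    closure = set(R)
    while True:
        extra = {(u, x) for (u, v) in closure for (y, x) in closure if v == y}
        if extra <= closure:
            return closure
        closure |= extra


P = ("atom", "P")
dia_box_P = ("dia", ("box", P))
formula = ("imp", dia_box_P, ("dia", ("box", dia_box_P)))

# Every world sees itself; in addition r sees a and b, a sees b, b sees c.
R = {(w, w) for w in "rabc"} | {("r", "a"), ("r", "b"), ("a", "b"), ("b", "c")}
V = {("P", "a"), ("P", "b")}

print(holds(formula, "r", R, V))                      # False: a T-countermodel
print(holds(formula, "r", transitive_closure(R), V))  # True: no longer a countermodel
```

Closing the relation under transitivity adds an arrow from a to c, which
destroys the truth of □P at a; that is the clash the S4-validity proof turns
into a contradiction.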
Daggers
There’s another kind of step in constructing models. When we make a
conditional false, we’re forced to enter certain truth values for its
components: 1 for the antecedent, 0 for the consequent. But consider making a
disjunction true. A disjunction can be true in more than one way. The first
disjunct might be true, or the second might be true, or both could be true. So
we have a choice for how to go about making a disjunction true. Similarly for
making a conditional true, a conjunction false, or a biconditional either true
or false.
When one has a choice about which truth values to give the constituents
of a propositional compound, it’s best to delay making the choice as long as
possible. After all, some other part of the model might force you to make one
choice rather than the other. If you investigate the rest of the countermodel,
and nothing has forced your hand, you may need then to make a guess: try one
of the truth value combinations open to you, and see whether you can finish
the countermodel. If not, go back and try another combination.
To remind ourselves of these choice points, we will place a dagger (†)
underneath the major connective of the formula in question. Consider, as an
example, constructing a countermodel for the formula ◇(◇P∨□Q)→(◇P∨Q).
Throwing caution to the wind and going straight for a T-model, we have after
a few steps:
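[Diagram: world r, where ◇(◇P∨□Q)→(◇P∨Q) is false (◇(◇P∨□Q) true; ◇P, P,
and Q false), sees itself and world a. In a, ◇P∨□Q is true and P is false; a
dagger sits under the ∨ in a.]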
We still have to decide how to make ◇P∨□Q true in world a: which disjunct
to make true? Well, making □Q true won’t require adding another world to
the model, so let’s do that. We have, then, a T-model:
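[Diagram: as before, world r, where ◇(◇P∨□Q)→(◇P∨Q) is false, sees itself
and world a. In a, the dagger is discharged by making □Q (and so Q) true; P is
false in a.]
W: {r, a}
R: {⟨r, r⟩, ⟨a, a⟩, ⟨r, a⟩}
I(Q, a) = 1, all else 0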
OK, let’s try now to upgrade this to a B-model. We can’t simply leave
everything as-is while letting world a see back to world r, since □Q is true
in a and Q is false in r. But there’s another possibility. We weren’t forced to
discharge the dagger in world a by making □Q true. So let’s explore the other
possibility; let’s make ◇P true:
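[Diagram: world r, where ◇(◇P∨□Q)→(◇P∨Q) is false, and world a see each
other. In a, the dagger is discharged by making ◇P true, while P is false in a.
Worlds a and b see each other, and P is true in b. Every world sees itself.]
W: {r, a, b}
R: {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨a, b⟩, ⟨b, a⟩}
I(P, b) = 1, all else 0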
What about an S4-model? We can’t just add the arrows demanded by
transitivity to our B-model, since ◇P is false in world r and P is true in world
b. What we can do instead is revisit the choice of which disjunct of ◇P∨□Q to
make true. Instead of making ◇P true, we can make □Q true, as we did when
we constructed our T-model. In fact, that T-model is already an S4-model.
So, we have countermodels in both S4 and B. The first resulted from
one choice for discharging the dagger in world a, the second from the other
choice. An S5-model, though, looks impossible. When we made the first
choice (making the right disjunct of ◇P∨□Q true) we were able to make the
accessibility relation transitive but not symmetric, and when we made the
second choice (making the left disjunct of ◇P∨□Q true) we were able to make
it symmetric but not transitive. It would seem to be impossible, then, to make
the accessibility relation both transitive and symmetric. Here is an
S5-validity proof, based on this reasoning. Note the “separation of cases”
reasoning:
i) Suppose for reductio that in some world r in some S5-model,
V(◇(◇P∨□Q)→(◇P∨Q), r) = 0. Then V(◇(◇P∨□Q), r) = 1 and …
ii) … V(◇P∨Q, r) = 0
iii) Given i), for some world a, Rra and V(◇P∨□Q, a) = 1. So,
either V(◇P, a) = 1 or V(□Q, a) = 1
iv) The first possibility leads to a contradiction:
    a) Suppose V(◇P, a) = 1. Then for some world b, Rab and
    V(P, b) = 1
    b) R is transitive, so given a) and iii), Rrb.
    c) Given ii), V(◇P, r) = 0, and so, given b), V(P, b) = 0,
    which contradicts a).
v) So does the second:
    a) Suppose V(□Q, a) = 1.
    b) R is symmetric. So, given iii), Rar; and so, given a),
    V(Q, r) = 1
    c) But given ii), V(Q, r) = 0: contradiction.
vi) Either way we have a contradiction.
So we have demonstrated that
⊨_S5 ◇(◇P∨□Q)→(◇P∨Q).
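The two ways of discharging the dagger, and the failure to find any S5
countermodel, can be replayed in a few lines. This is a hedged sketch: the
encoding and the helper s5_countermodel are illustrative assumptions, and the
search covers only models with at most three worlds, so it illustrates the
validity proof rather than replacing it.

```python
from itertools import product


def holds(f, w, R, V):
    """Truth of f at world w; R is a set of (u, v) pairs, V a set of (letter, world)."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in V
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "or":
        return holds(f[1], w, R, V) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


P, Q = ("atom", "P"), ("atom", "Q")
dia_P = ("dia", P)
formula = ("imp", ("dia", ("or", dia_P, ("box", Q))), ("or", dia_P, Q))

# First choice: make the right disjunct (box Q) true in a. Reflexive and transitive.
R_s4 = {("r", "r"), ("a", "a"), ("r", "a")}
V_s4 = {("Q", "a")}

# Second choice: make the left disjunct (dia P) true in a. Reflexive and symmetric.
R_b = {(w, w) for w in "rab"} | {("r", "a"), ("a", "r"), ("a", "b"), ("b", "a")}
V_b = {("P", "b")}


def s5_countermodel(f, max_worlds=3):
    """Search small models whose accessibility relation is an equivalence relation."""
    for n in range(1, max_worlds + 1):
        worlds = range(n)
        # Labelling each world with a block number generates every equivalence
        # relation on the worlds (some of them more than once).
        for blocks in product(range(n), repeat=n):
            R = {(u, v) for u in worlds for v in worlds if blocks[u] == blocks[v]}
            for bits in product([False, True], repeat=2 * n):
                V = {("P", w) for w in worlds if bits[w]}
                V |= {("Q", w) for w in worlds if bits[n + w]}
                for w in worlds:
                    if not holds(f, w, R, V):
                        return R, V, w
    return None


print(holds(formula, "r", R_s4, V_s4))  # False
print(holds(formula, "r", R_b, V_b))    # False
print(s5_countermodel(formula))         # None
```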
Summary of steps
Here, then, is a final list of the steps for constructing countermodels:
1. Place the formula in a box
2. Make the formula false in the world
3. Enter in forced truth values
4. Enter in daggers, and discharge them only after all forced moves are over
5. Enter asterisks
6. Discharge asterisks (hint: do bottom asterisks first)
7. Back to step 3.
8. The official model
6.3.5 Schemas, validity, and invalidity
Let’s digress, for a moment, to clarify the notions of validity and invalidity, as
applied to formulas and schemas.
Formulas (official wffs, that is) are the strings of symbols that are
sanctioned by the official rules of grammar of our object language. Formulas of
MPL include P∨(Q→R), □P→◇Q, and so on. Schemas are devices of our
metalanguage that are used to talk about infinitely many formulas at once. We
used schemas, for instance, in section 2.5 to state the axioms of propositional
logic; we said: “each instance of the schema φ→(ψ→φ) is an axiom of PL”.
And we have used schemas throughout the book to define the notion of a wff,
and to define valuation functions. For example, earlier in this chapter we said
that a valuation function must assign the value 1 to any instance of the schema
∼φ relative to any world iff it assigns 0 to the corresponding instance of φ
relative to that world. Schemas are not formulas, since they contain schematic
variables (“φ” and “ψ” here) that are not part of the primitive vocabulary of
the object language.
When we defined the notion of validity, what we defined was the notion of
a valid formula. (That’s because validity is defined in terms of truth in a model,
which itself was defined only for formulas.) So it’s not, strictly speaking, correct
to apply the notions of validity or invalidity to schemas.
However, it’s often interesting to show that every instance of a given schema
is valid. (Instances of schemas are formulas, and so the notion of validity can
be properly applied to them.) It’s easy, for example, to show that every instance
of the schema □(φ→φ) is valid in each of our modal systems. (Let φ be any
MPL-wff, and take any world w in any model. Since the rules for evaluating
propositional compounds within possible worlds are the classical ones, φ→φ
must be true at w, no matter what truth value φ has at w. Since w was arbitrary,
φ→φ is true at every world of every model. Hence □(φ→φ) is true in any world
in any model, and so is valid in each system.)
There is, therefore, a kind of indirect notion of schema-validity: validity
of all instances. How about the invalidity of schemas? Here we must take
great care. In particular, the notion of a schema, all of whose instances are
invalid, is not a particularly interesting notion. Take, for instance, the schema
◇φ→□φ. We showed earlier that a certain instance of this schema, namely
◇P→□P, is invalid in each of our systems. However, the schema ◇φ→□φ also
has plenty of instances that are valid in various systems. The following formula,
for example, is an instance of ◇φ→□φ, and can easily be shown to be valid in
each of our systems:
◇(P→P)→□(P→P)
Thus, even intuitively terrible schemas like ◇φ→□φ have some valid instances.
(For an extreme example of this, consider the schema φ. Even this has some
valid instances: P→P, for one.) So it’s not interesting to inquire into whether
each instance of a schema is invalid. What is interesting is to inquire into
whether a given schema has some instances that are invalid. We can show, for
example, that the schema ◇φ→□φ has some invalid instances (◇P→□P, for
one), and hence is in this way unlike the schema □(φ→φ).
So when dealing with schemas, it will often be of interest to ascertain
whether each instance of the schema is valid; it will rarely (if ever) be of interest
to ascertain whether each instance of the schema is invalid.
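The difference between a schema and its instances can be made concrete. In the
sketch below, schematic variables are a separate kind of node that must be
replaced by wffs before anything can be evaluated. The encoding, the function
instantiate, and the particular two-world model are all assumptions chosen for
illustration.

```python
def holds(f, w, R, V):
    """Truth of a formula (not a schema) at world w."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in V
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"not evaluable: {op}")  # e.g. an uninstantiated variable


def instantiate(schema, assignment):
    """Replace each schematic variable ("var", name) by the wff assigned to it."""
    if schema[0] == "var":
        return assignment[schema[1]]
    if schema[0] == "atom":
        return schema
    return (schema[0],) + tuple(instantiate(part, assignment) for part in schema[1:])


phi = ("var", "phi")
schema = ("imp", ("dia", phi), ("box", phi))  # the schema: dia phi -> box phi

P = ("atom", "P")
invalid_instance = instantiate(schema, {"phi": P})             # dia P -> box P
valid_instance = instantiate(schema, {"phi": ("imp", P, P)})   # dia(P->P) -> box(P->P)

# A two-world model in which every world sees every world; P is true at 0 only.
W = [0, 1]
R = {(u, v) for u in W for v in W}
V = {("P", 0)}

print([holds(invalid_instance, w, R, V) for w in W])  # [False, False]
print([holds(valid_instance, w, R, V) for w in W])    # [True, True]
```

One model with a false instance is enough to show that the schema has an
invalid instance. Truth in this one model does not by itself show that the
second instance is valid; that takes the general argument given in the text.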
6.4 Axiomatic systems of MPL
We turn next to provability in modal logic. We’ll approach this axiomatically:
we’re going to write down axioms, which are sentences of propositional modal
logic that seem clearly to be logical truths, and we’re going to write down rules
of inference, which say which sentences can be logically inferred from which
other sentences.
We’re going to continue to follow C. I. Lewis in constructing multiple
modal systems, since it’s so unclear which sentences of MPL are logical truths.
Hence, we’ll need to formulate multiple axiomatic systems. These systems will
contain different axioms from one another. As a result, different theorems will
be provable in the different systems.
We will, in fact, give these systems the same names as the systems we
investigated semantically: K, D, T, B, S4, and S5. (Thus we will subscript
the symbol for theoremhood with the names of systems; ⊢_K φ, for example,
will mean that φ is a theorem of system K.) Our reuse of the system names
will be justified in sections ?? and ??, where we will establish soundness and
completeness for each system. Given soundness and completeness, for each
system, exactly the same formulas are provable as are valid.
6.4.1 System K
Our first system, K, is the weakest system, i.e., the system with the fewest
theorems.
Rules:
    MP: from φ→ψ and φ, infer ψ
    NEC (“necessitation”): from φ, infer □φ
Axioms:
    PL axioms: for any MPL-wffs φ, ψ, and χ, the following are axioms:
        (A1) φ→(ψ→φ)
        (A2) (φ→(ψ→χ))→((φ→ψ)→(φ→χ))
        (A3) (∼ψ→∼φ)→((∼ψ→φ)→ψ)
    K-schema: for any MPL-wffs φ and ψ, the following is an axiom:
        □(φ→ψ)→(□φ→□ψ)
As before, a proof is defined as a series of wffs, each of which is either an
axiom or follows from earlier lines in the proof by a rule, and a theorem is
defined as the last line of any proof.
This axiomatic system (like all the modal systems we will study) is an
extension of propositional logic, in the sense that it includes all of the
theorems of propositional logic, but then adds more theorems. It includes all of
propositional logic because one of its rules is the propositional logic rule MP,
and each propositional logic axiom is one of its axioms. It adds theorems by
adding a new rule of inference (NEC), and a new axiom schema (the K-schema) (as
well as adding new wffs, namely wffs containing the □, to the stock of wffs that
can occur in the PL axioms.)
The rule of inference, NEC (for “necessitation”), says that if you have a
formula φ on a line, then you may infer the formula □φ. This may seem
unintuitive. After all, can’t a sentence be true without being necessarily true?
Yes; but the rule of necessitation doesn’t contradict this. Remember that every
line in every axiomatic proof is a theorem. So whenever one uses necessitation in
a proof, one is applying it to a theorem. And necessitation does seem appropriate
when applied to theorems: if φ is a theorem, then □φ ought also to be a theorem.
Think of it this way. The worry about the rule of necessitation is that it isn’t
a truth-preserving rule: intuitively speaking, its premise can be true when
its conclusion is false. The answer to the worry is that while necessitation
doesn’t preserve truth, it does preserve logical truth, which is all that matters
in the present context. For in the present context, we’re only using NEC in a
definition of theoremhood. We want our theorems to be, intuitively, logical
truths; and provided that our axioms are all logical truths and our rules preserve
logical truth, the definition will yield only logical truths as theorems. We will
return to this issue.
Let’s investigate what one can prove in K. The simplest sort of distinctively
modal proof consists of first proving something from the PL axioms, and then
necessitating it, as in the following proof of □((P→Q)→(P→P)):
1. P→(Q→P)                        (A1)
2. (P→(Q→P))→((P→Q)→(P→P))        (A2)
3. (P→Q)→(P→P)                    1, 2, MP
4. □((P→Q)→(P→P))                 3, NEC
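Because an official K-proof is just a list of wffs, each an axiom or the result
of applying MP or NEC to earlier lines, checking one is purely mechanical. The
following is a minimal sketch of such a checker, run on the four-line proof
above. The tuple encoding and the names match and check are illustrative
assumptions, and the checker knows nothing about the shortcuts introduced
later.

```python
def imp(a, b):
    return ("imp", a, b)


def box(a):
    return ("box", a)


PHI, PSI, CHI = ("var", "phi"), ("var", "psi"), ("var", "chi")
AXIOM_SCHEMAS = {
    "A1": imp(PHI, imp(PSI, PHI)),
    "A2": imp(imp(PHI, imp(PSI, CHI)), imp(imp(PHI, PSI), imp(PHI, CHI))),
    "A3": imp(imp(("not", PSI), ("not", PHI)), imp(imp(("not", PSI), PHI), PSI)),
    "K": imp(box(imp(PHI, PSI)), imp(box(PHI), box(PSI))),
}


def match(schema, f, env):
    """Try to read f as an instance of schema, recording variable bindings in env."""
    if schema[0] == "var":
        return env.setdefault(schema[1], f) == f  # same variable, same wff
    if schema[0] != f[0] or len(schema) != len(f):
        return False
    if schema[0] == "atom":
        return schema == f
    return all(match(s, g, env) for s, g in zip(schema[1:], f[1:]))


def check(proof):
    """proof: list of (formula, rule, cited line numbers), lines numbered from 1."""
    for n, (f, rule, cited) in enumerate(proof, start=1):
        if any(k < 1 or k >= n for k in cited):
            return False  # may only cite earlier lines
        earlier = [proof[k - 1][0] for k in cited]
        if rule in AXIOM_SCHEMAS:
            ok = match(AXIOM_SCHEMAS[rule], f, {})
        elif rule == "MP":
            ok = len(earlier) == 2 and (
                earlier[1] == imp(earlier[0], f) or earlier[0] == imp(earlier[1], f)
            )
        elif rule == "NEC":
            ok = len(earlier) == 1 and f == box(earlier[0])
        else:
            ok = False
        if not ok:
            return False
    return True


P, Q = ("atom", "P"), ("atom", "Q")
line1 = imp(P, imp(Q, P))
line3 = imp(imp(P, Q), imp(P, P))
proof = [
    (line1, "A1", []),
    (imp(line1, line3), "A2", []),
    (line3, "MP", [1, 2]),
    (box(line3), "NEC", [3]),
]
print(check(proof))  # True
```

A list that begins with the bare sentence letter P and then necessitates it is
rejected, since P is not an axiom: NEC is only ever applied to theorems.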
Using this technique, we can prove anything of the form □φ, where φ is
provable in PL. And, since the PL axioms are complete (I mentioned but did
not prove this fact in chapter ??), that means that we can prove □φ whenever
φ is a tautology, i.e., a valid wff of PL. But constructing proofs from the PL
axioms is a pain in the neck, and anyway not what we want to focus on in
this chapter. So let’s introduce the following time-saving shortcut. Instead of
writing out proofs of tautologies, let’s instead allow ourselves to write any PL
tautology at any point in a proof, annotating simply “PL”.[5] Thus, the previous
proof could be shortened to:
1. (P→Q)→(P→P)                    PL
2. □((P→Q)→(P→P))                 1, NEC
Furthermore, consider the wff □P→□P. Clearly, we can construct a proof
of this wff from the PL axioms: begin with any proof of the tautology Q→Q
from the PL axioms, and then construct a new proof by replacing each
occurrence of Q in the first proof with □P. (This is a legitimate proof, even
though □P isn’t a wff of propositional logic, because when we stated the system
K, the schematic letters φ, ψ, and χ in the PL axioms were allowed to be filled
in with any wffs of MPL, not just wffs of PL.) So let us also include lines like
this in our modal proofs:
□P→□P                             PL
Why am I making such a fuss about this? Didn’t I just say in the previous
paragraph that we can write down any tautology at any time, with the annotation
“PL”? Well, strictly speaking, □P→□P isn’t a tautology. A tautology is a valid
wff of PL, and □P→□P isn’t even a wff of PL (since it contains a □). But it
is the result of beginning with some PL-tautology (Q→Q, in this case) and
uniformly changing sentence letters to chosen modal wffs (in this case, Qs to
□Ps); hence any proof of the PL tautology may be converted into a proof of
it; hence the “PL” annotation is just as justified here as it is in the case of a
genuine tautology. So in general, MPL wffs that result from PL tautologies in
this way may be written down and annotated “PL”.
[5] How do you know whether something is a tautology? Figure it out any way
you like: do a truth table, or a natural deduction derivation, whatever.
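One way to decide whether a line deserves the “PL” annotation is to treat every
subformula beginning with a □ as if it were a sentence letter and then run an
ordinary truth table. The sketch below does this; the function names and the
tuple encoding are assumptions made for illustration, and ◇ is left out since
it is officially an abbreviation.

```python
from itertools import product


def pl_units(f):
    """Collect what PL treats as atomic: sentence letters and boxed subformulas."""
    if f[0] in ("atom", "box"):
        return {f}
    return set().union(*(pl_units(part) for part in f[1:]))


def pl_value(f, row):
    """Evaluate f classically, looking up letters and boxed subformulas in row."""
    op = f[0]
    if op in ("atom", "box"):
        return row[f]
    if op == "not":
        return not pl_value(f[1], row)
    if op == "and":
        return pl_value(f[1], row) and pl_value(f[2], row)
    if op == "imp":
        return (not pl_value(f[1], row)) or pl_value(f[2], row)
    raise ValueError(f"unknown operator: {op}")


def is_pl(f):
    """True iff f results from a PL tautology by uniform substitution."""
    units = sorted(pl_units(f), key=repr)
    return all(
        pl_value(f, dict(zip(units, bits)))
        for bits in product([False, True], repeat=len(units))
    )


P = ("atom", "P")
box_P = ("box", P)
print(is_pl(("imp", box_P, box_P)))   # True: an instance of Q→Q
print(is_pl(("imp", P, box_P)))       # False: P→□P is no tautology instance
print(is_pl(("box", ("imp", P, P))))  # False: this line needs NEC, not just PL
```

The last case is worth noting: □(P→P) is a K-theorem, but it is obtained by
necessitating a tautology, not by substitution into one.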
Back to investigating what we can prove in K. As we’ve seen, we can prove
that tautologies are necessary: we can prove □φ whenever φ is a tautology.
One can also prove in K that contradictions are impossible. For instance,
∼◇(P∧∼P) is a theorem of K:
1. ∼(P∧∼P)                        PL
2. □∼(P∧∼P)                       1, NEC
3. □∼(P∧∼P)→∼∼□∼(P∧∼P)            PL
4. ∼∼□∼(P∧∼P)                     2, 3, MP
But line 4 is a definitional abbreviation of ∼◇(P∧∼P).
Let’s introduce another time-saving shortcut. Note that the move from 2 to
4 in the previous proof is just a move from a formula to a propositional logical
consequence of that formula. Let’s allow ourselves to move directly from any
lines in a proof, φ_1, …, φ_n, to any propositional logical consequence ψ of
those lines, by “PL”. Thus, the previous proof could be shortened to:
1. ∼(P∧∼P)                        PL
2. □∼(P∧∼P)                       1, NEC
3. ∼∼□∼(P∧∼P)                     2, PL
Why is this legitimate? Suppose that ψ is a propositional logical semantic
consequence of φ_1, …, φ_n. Then the conditional φ_1→(φ_2→···(φ_n→ψ)…) is
a PL-valid formula, and so, given the completeness of the PL axioms, is a
theorem of K. That means that if we have φ_1, …, φ_n in an axiomatic K-proof,
then we can always prove the conditional φ_1→(φ_2→···(φ_n→ψ)…) using the
PL axioms, and then use MP repeatedly to infer ψ. So inferring ψ directly,
and annotating “PL”, is justified. (As with the earlier “PL” shortcut, let’s use
this shortcut when the conditional φ_1→(φ_2→···(φ_n→ψ)…) results from some
tautology by uniform substitution, even if it contains modal operators and so
isn’t strictly a tautology.)
So far our modal proofs have only used necessitation and the PL axioms.
What about the K-axioms? The point of the K-schema is to enable “distribution
of the □ over the →”. That is, if you ever have the formula □(φ→ψ), then you
can always move to □φ→□ψ as follows:
□(φ→ψ)
□(φ→ψ)→(□φ→□ψ)                    K axiom
□φ→□ψ                             MP
Distribution of the □ over the →, plus the rule of necessitation, combine
to give us a powerful proof strategy. Whenever one can prove the conditional
φ→ψ, then one can prove the modal conditional □φ→□ψ as well, as follows.
First prove φ→ψ, then necessitate it to get □(φ→ψ), then distribute the □
over the arrow to get □φ→□ψ. This procedure is one of the core K-strategies,
and is featured in the following proof of □(P∧Q)→(□P∧□Q):
1. (P∧Q)→P                        PL
2. □[(P∧Q)→P]                     1, NEC
3. □[(P∧Q)→P]→[□(P∧Q)→□P]         K axiom
4. □(P∧Q)→□P                      2, 3, MP
5. □(P∧Q)→□Q                      Insert steps similar to 1–4
6. □(P∧Q)→(□P∧□Q)                 4, 5, PL
Notice that the preceding proof, like all of our proofs since we introduced
the time-saving shortcuts, is not a K-proof in the official defined sense. Lines
1, 5, and 6 are not axioms, nor do they follow from earlier lines by MP or
NEC.[6] So what kind of “proof” is it? It’s a metalanguage proof:
an attempt to convince the reader, by acceptable standards of rigor, that some
real K-proof exists. A reader could use this metalanguage proof as a blueprint
for constructing a real proof. She would begin by replacing line 1 with a proof
from the PL axioms of the conditional (P∧Q)→P. (As we know from chapter
??, this could be a real pain in the neck, but the completeness of PL assures us
that it is possible.) She would then replace line 5 with lines parallel to lines
1–4, but which begin with a proof of (P∧Q)→Q rather than (P∧Q)→P. Finally,
in place of line 6, she would insert a proof from the PL axioms of the
sentence (□(P∧Q)→□P)→[(□(P∧Q)→□Q)→(□(P∧Q)→(□P∧□Q))], and then
use modus ponens twice to infer □(P∧Q)→(□P∧□Q).
[6] A further (even pickier) reason: the symbol ∧ isn’t allowed in wffs; the
sentences in the proof are mere abbreviations for official MPL-wffs.
Another example: (□P∨□Q)→□(P∨Q):
1. P→(P∨Q)                        PL
2. □(P→(P∨Q))                     1, NEC
3. □(Q→(P∨Q))                     PL, NEC
4. □P→□(P∨Q)                      2, K
5. □Q→□(P∨Q)                      3, K
6. (□P∨□Q)→□(P∨Q)                 4, 5, PL
Here I’ve introduced another time-saving shortcut that I’ll use more and more
as we progress: doing two (or more) steps at once. Line 3 is really short for:
3a. Q→(P∨Q)                       PL
3b. □(Q→(P∨Q))                    3a, NEC
And line 4 is short for:
4a. □(P→(P∨Q))→(□P→□(P∨Q))        K axiom
4b. □P→□(P∨Q)                     2, 4a, MP
One further comment about this last proof: it illustrates a strategy that is
common in modal proofs. We were trying to prove a conditional formula
whose antecedent is a disjunction of two modal formulas. But the modal
techniques we had developed didn’t deliver formulas of this form. They only
showed us how to put □s in front of PL-tautologies, and how to distribute □s
over →s. They only yield formulas of the form □φ and □φ→□ψ, whereas the
formula we were trying to prove looks different. To overcome this problem,
what we did was to use the modal techniques to prove two conditionals, namely
□P→□(P∨Q) and □Q→□(P∨Q), from which the desired formula, namely
(□P∨□Q)→□(P∨Q), follows by propositional logic. The trick, in general, is
this: remember that you have PL at your disposal. Simply look for one or
more modal formulas you know how to prove which, by PL, imply the formula
you want. Assemble the desired formulas, and then write down your desired
formula, annotating “PL”. In doing so, it may be helpful to recall PL inferences
like the following:
    Premises              Conclusion
    φ→ψ, ψ→φ              φ↔ψ
    φ→(ψ→χ)               (φ∧ψ)→χ
    φ→ψ, φ→χ              φ→(ψ∧χ)
    φ→χ, ψ→χ              (φ∨ψ)→χ
    φ→ψ                   ∼φ∨ψ
    φ→∼ψ                  ∼(φ∧ψ)
The next example illustrates our next major modal proof technique:
combining two □-statements to get a single □-statement. Let us construct a
K-proof of (□P∧□Q)→□(P∧Q):
1. P→(Q→(P∧Q))                    PL
2. □[P→(Q→(P∧Q))]                 1, NEC
3. □P→□(Q→(P∧Q))                  2, K
4. □(Q→(P∧Q))→[□Q→□(P∧Q)]         K axiom
5. □P→[□Q→□(P∧Q)]                 3, 4, PL
6. (□P∧□Q)→□(P∧Q)                 5, PL
(If you wanted to, you could skip step 5, and just go straight to 6 by propositional
logic, since 6 is a propositional logical consequence of 3 and 4; I put it in for
perspicuity.)
The general technique illustrated by the last problem applies anytime you
want to move from several □-statements to a further □-statement, where the
inside parts of the first □-statements imply the inside part of the final
□-statement. More carefully: it applies whenever you want to prove a formula of
the form □φ_1→(□φ_2→···(□φ_n→□ψ)…), provided you are able to prove the
formula φ_1→(φ_2→···(φ_n→ψ)…). (The previous proof was an instance of this
because it involved moving from □P and □Q to □(P∧Q); and this is a case where
one can move from the inside parts of the first two formulas (namely, P and Q),
to the inside part of the third formula (namely, P∧Q), by PL.) To do this, one
begins by proving the conditional φ_1→(φ_2→···(φ_n→ψ)…), necessitating it to
get □[φ_1→(φ_2→···(φ_n→ψ)…)], and then distributing the □ over the arrows
repeatedly using K-axioms and PL to get □φ_1→(□φ_2→···(□φ_n→□ψ)…).
One cautionary note in connection with this last proof. One might think to
make it more intuitive by using conditional proof:
1. □P∧□Q                          assume for conditional proof
2. □P                             1, PL
3. □Q                             1, PL
4. P→(Q→(P∧Q))                    PL
5. □[P→(Q→(P∧Q))]                 4, NEC
6. □P→□(Q→(P∧Q))                  5, K
7. □(Q→(P∧Q))                     6, 2, MP
8. □Q→□(P∧Q)                      7, K
9. □(P∧Q)                         3, 8, MP
10. (□P∧□Q)→□(P∧Q)                1–9, conditional proof
But this is not a legal proof, since our axiomatic system allows neither
assumptions nor conditional proof.
In fact, our decision to omit conditional proof was not at all arbitrary. Given
our rule of necessitation, we couldn’t add conditional proof to our system. If we
did, proofs like the following would become legal:
1. P                              assume for conditional proof
2. □P                             1, NEC
3. P→□P                           1, 2, conditional proof
Thus, P→□P would turn out to be a K-theorem. But we don’t want that: after
all, a statement P might be true without being necessarily true.
Once we have a soundness proof (section 6.5), we’ll be able to show that
P→□P isn’t a K-theorem. But as we just saw, one can construct a K-proof
from {P} of □P (recall the notion of a proof from a set Γ, from section 2.5.)
It follows that the deduction theorem (section 2.5.2), which says that if there
exists a proof of ψ from {φ}, then there exists a proof of φ→ψ, fails for K (it
likewise fails for all the modal systems we will consider.) So there will be no
conditional proof in our axiomatic modal systems. (Of course, to convince
yourself that a given formula is really a tautology of propositional logic, you
may sketch a proof of it to yourself using conditional proof in some standard
natural deduction system for nonmodal propositional logic; and then you may
write that formula down in one of our axiomatic MPL proofs, annotating “PL”.)
Back to techniques for constructing proofs in K. The following proof of
□□(P∧Q)→□□P illustrates a technique for proving formulas with “nested”
modal operators:
1. (P∧Q)→P                        PL
2. □(P∧Q)→□P                      1, NEC, K
3. □[□(P∧Q)→□P]                   2, NEC
4. □□(P∧Q)→□□P                    3, K
Notice in line 3 that we necessitated something that was not a PL theorem.
That’s ok; we’re allowed to necessitate any K-theorems, even those whose proofs
were distinctly modal. Notice also how this proof contains two instances of our
basic K-strategy. This strategy involves obtaining a conditional, necessitating
it, then distributing the □ over the →. We did this first using the conditional
(P∧Q)→P; that led us to a conditional, □(P∧Q)→□P. Then we started the
strategy over again, using this as our initial conditional.
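The basic K-strategy is regular enough to be written as a small function that
expands the shorthand “NEC, K” into the three official lines it abbreviates.
The sketch below (the name k_strategy and the tuple encoding are illustrative
assumptions) applies it twice, mirroring the nested proof above.

```python
def imp(a, b):
    return ("imp", a, b)


def box(a):
    return ("box", a)


def k_strategy(conditional):
    """Given a theorem of the form phi -> psi, return the lines that lead to
    box phi -> box psi: necessitate, cite the K axiom instance, apply MP."""
    op, phi, psi = conditional
    if op != "imp":
        raise ValueError("the strategy starts from a conditional")
    boxed = box(conditional)
    target = imp(box(phi), box(psi))
    return [
        (boxed, "NEC"),
        (imp(boxed, target), "K axiom"),
        (target, "MP"),
    ]


P, Q = ("atom", "P"), ("atom", "Q")
start = imp(("and", P, Q), P)               # (P∧Q)→P, obtained by PL
first_pass = k_strategy(start)              # ends with □(P∧Q)→□P
second_pass = k_strategy(first_pass[-1][0])  # ends with □□(P∧Q)→□□P

for formula, annotation in first_pass + second_pass:
    print(annotation, formula)
```

The second call starts from the last line of the first, which is exactly the
sense in which the proof “started the strategy over again”.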
So far we have no techniques dealing with the ◇, other than eliminating it
by definition. It will be convenient to derive some shortcuts. For one, there
are the following theorem schemas, which may collectively be called “modal
negation”, or “MN” for short:
⊢_K ∼□φ→◇∼φ        ⊢_K ◇∼φ→∼□φ
⊢_K ∼◇φ→□∼φ        ⊢_K □∼φ→∼◇φ
I’ll do one of these; the rest can be an exercise:
1. ∼∼φ→φ                          PL
2. □∼∼φ→□φ                        1, NEC, K
3. ∼□φ→∼□∼∼φ                      2, PL
The final line, 3, is the definitional equivalent of ∼□φ→◇∼φ.
It will also be worthwhile to know that an analog of the K axiom for the ◇
is a K-theorem:
“K◇”: □(φ→ψ)→(◇φ→◇ψ)
K◇ is, by definition of the ◇, the same formula as:
□(φ→ψ)→(∼□∼φ→∼□∼ψ)
How are we going to construct a K-proof of this theorem? In a natural
deduction system we would use conditional proof and reductio ad absurdum,
but these strategies are not available to us here. What we must do instead is
look for a formula we know how to prove in K, which is PL-equivalent to the
formula we want to prove. Here is such a formula:
□(φ→ψ)→(□∼ψ→□∼φ)
This is equivalent, given PL, to what we want to show; and it looks like the
result of necessitating a tautology and then distributing the □ over the → a
couple of times, which is just the kind of thing we know how to do in K. Here,
then, is the desired proof of K◇:
1. (φ→ψ)→(∼ψ→∼φ)                  PL
2. □(φ→ψ)→□(∼ψ→∼φ)                1, NEC, K
3. □(∼ψ→∼φ)→(□∼ψ→□∼φ)             K
4. □(φ→ψ)→(□∼ψ→□∼φ)               2, 3, PL
5. □(φ→ψ)→(∼□∼φ→∼□∼ψ)             4, PL
In doing proofs, let’s also allow ourselves to refer to earlier theorems proved,
rather than repeating their proofs. The importance of K◇ may be illustrated
by the following proof of □P→(◇Q→◇(P∧Q)):
1. P→[Q→(P∧Q)]                    PL
2. □P→□[Q→(P∧Q)]                  1, NEC, K
3. □[Q→(P∧Q)]→[◇Q→◇(P∧Q)]         K◇
4. □P→[◇Q→◇(P∧Q)]                 2, 3, PL
In general, K◇ allows us to complete proofs of the following sort.
Suppose we wish to prove a formula of the form:
O_1φ_1→(O_2φ_2→(…→(O_nφ_n→◇ψ)…)
where the O_i s are modal operators, all but one of which are □s. (Thus, the
remaining O_i is the ◇.) This can be done, provided that ψ is provable in K from
the φ_i s. The basic strategy is to prove a nested conditional, the antecedents
of which are the φ_i s, and the consequent of which is ψ; necessitate it; then
repeatedly distribute the □ over the →s, once using K◇, the rest of the times
using K. But there is one catch. We need to make the application of K◇ last,
after all the applications of K. This in turn requires the conditional we use to
have the φ_i that is underneath the ◇ as the last of the antecedents. For
instance, suppose that φ_3 is the one underneath the ◇. Thus, what we are
trying to prove is:
□φ_1→(□φ_2→(◇φ_3→(□φ_4→(…→(□φ_n→◇ψ)…)
In this case, the conditional to use would be:
φ_1→(φ_2→(φ_n→(φ_4→(…→(φ_{n−1}→(φ_3→ψ)…)
In other words, one must swap one of the other φ_i s (I arbitrarily chose φ_n)
with φ_3. What one obtains at the end will therefore have the modal statements
out of order:
□φ_1→(□φ_2→(□φ_n→(□φ_4→(…→(□φ_{n−1}→(◇φ_3→◇ψ)…)
But that problem is easily solved; this is equivalent in PL to what we’re trying
to get. (Recall that φ→(ψ→χ) is logically equivalent in PL to ψ→(φ→χ).)
Why do we need to save K◇ for last? The strategy of successively
distributing the box over all the nested conditionals comes to a halt as soon as
the K◇ theorem is used. Let me illustrate with an example. Suppose we wish to
prove ⊢_K ◇P→(□Q→◇(P∧Q)). We might think to begin as follows:
1. P→(Q→(P∧Q))                    PL
2. □[P→(Q→(P∧Q))]                 1, NEC
3. ◇P→◇(Q→(P∧Q))                  2, K◇
4. ?
But now what? What we need to finish the proof is:
◇(Q→(P∧Q))→(□Q→◇(P∧Q)).
But neither K nor K◇ gets us this. The remedy is to begin the proof with a
different conditional:
1. Q→(P→(P∧Q))                    PL
2. □(Q→(P→(P∧Q)))                 1, NEC
3. □Q→□(P→(P∧Q))                  2, K, MP
4. □(P→(P∧Q))→(◇P→◇(P∧Q))         K◇
5. □Q→(◇P→◇(P∧Q))                 3, 4, PL
6. ◇P→(□Q→◇(P∧Q))                 5, PL
One can, then, prove a number of theorems in K. Nevertheless, K is a
very weak system. You can’t prove the formula □P→◇P in K. (We’ll be able
to demonstrate this after section 6.5.) Relatedly, one can’t prove in K that
tautologies are possible or that contradictions aren’t necessary.
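The semantic reason behind this weakness can be previewed with a small sketch. It assumes the possible-worlds semantics (a □-formula is true at a world iff its body is true at every accessible world) and relies on the soundness of K, which is only established in section 6.5. The function name `holds`, the tuple encoding of formulas, and the dictionaries `R` and `V` are all hypothetical conveniences, not notation from the text.

```python
def holds(f, w, R, V):
    """Return True iff formula f is true at world w.

    f is a sentence letter (a str) or a tuple: ("not", g), ("imp", g, h),
    ("box", g), ("dia", g). R maps each world to the set of worlds it can
    access; V maps each world to the set of sentence letters true there.
    """
    if isinstance(f, str):
        return f in V[w]
    op = f[0]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":  # true at w iff true at every accessible world
        return all(holds(f[1], v, R, V) for v in R[w])
    if op == "dia":  # true at w iff true at some accessible world
        return any(holds(f[1], v, R, V) for v in R[w])
    raise ValueError(f"unknown operator: {op!r}")


# A one-world model whose only world accesses nothing (a "dead end").
R = {"w": set()}
V = {"w": set()}

box_p_true = holds(("box", "P"), "w", R, V)   # vacuously true
dia_p_true = holds(("dia", "P"), "w", R, V)   # false: no accessible world
d_true = holds(("imp", ("box", "P"), ("dia", "P")), "w", R, V)
print(box_p_true, dia_p_true, d_true)  # prints: True False False
```

At a dead-end world every □-formula is vacuously true and every ◇-formula is false, so □P→◇P fails there. Since K places no constraint on accessibility, this model counts as a K-model.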
6.4.2 System D
System D results from adding a new axiom schema to system K:
System D
Rules: MP, NEC
Axioms: PL axioms
    K-schema: □(φ→ψ)→(□φ→□ψ)
    D-schema: □φ→◇φ
Notice that since D includes all the K axioms, we retain all the K-theorems.
The addition of the D-schema just adds more theorems. In fact, all of our
systems will build on K in this way, by adding new axioms to K. (K is so weak
that it isn’t of much independent interest; we study it because it’s the common
basis of all of the other systems we’re studying.)
An example of what we can do now is prove that tautologies are possible:
1. P∨∼P    PL
2. □(P∨∼P)    1, NEC
3. □(P∨∼P)→◇(P∨∼P)    D
4. ◇(P∨∼P)    2, 3, MP
Let’s do one more theorem, □□P→□◇P:
1. □P→◇P    D
2. □(□P→◇P)    1, NEC
3. □□P→□◇P    2, K, MP
Like K, system D is very weak. As we will see later, we can’t prove □φ→φ
in D. Therefore, D doesn’t seem to be a correct logic for metaphysical, or
nomic, or technological necessity, for surely, if something is metaphysically,
nomically, or technologically necessary, then it must be true. (If something is
true in all metaphysically possible worlds, or all nomically possible worlds, or
all technologically possible worlds, then surely it must be true in the actual
world, and so must be plain old true.) But perhaps there is some interest in
D anyway; perhaps D is a correct logic for moral necessity. Suppose we read
□φ as “One ought to make φ be the case”, and, correspondingly, read ◇φ as
“One is permitted to make φ be the case”. Then the fact that □φ→φ cannot
be proved in D would be a virtue, for from the fact that something ought to be
done, it certainly doesn’t follow that it is done. The D-axiom, on the other
hand, would correspond to the principle that if something ought to be done
then it is permitted to be done, which does seem like a logical truth. But I won’t
go any further into the question of whether D in fact does give a correct logic
for moral necessity. That’s philosophy, not logic.
6.4.3 System T
T is the first system we have considered that has any plausibility of being a
correct logic for a wide range of concepts of necessity (metaphysical necessity,
for example):
Rules: MP, NEC
Axioms: PL axioms
    K-schema: □(φ→ψ)→(□φ→□ψ)
    T-schema: □φ→φ
Recall that in the case of K, we proved a theorem schema, K◇, which was
the analog for the ◇ of the K-axiom schema. Let’s do the same thing here; let’s
prove a theorem schema T◇, which is the analog for the ◇ of the T axiom
schema:
T◇: φ→◇φ
1. □∼φ→∼φ    T
2. φ→∼□∼φ    1, PL
Line 2 is just the definition of φ→◇φ. Thus, we have established that for every
wff φ, ⊢_T φ→◇φ. So let’s allow ourselves to write down formulas of the form
φ→◇φ, annotating simply “T◇”.
Notice that instances of the D-axioms are now theorems: □φ→φ is a T
axiom, we just proved that φ→◇φ is a theorem; and from these two by PL we
can prove □φ→◇φ. Thus, T is an extension of D: every theorem of D remains
a theorem of T. (Since D was an extension of K, T too is an extension of K.)
6.4.4 System B
Our systems so far don’t allow us to prove anything interesting about iterated
modalities, i.e., sentences with consecutive boxes or diamonds. Which such
sentences should be theorems? The B axiom schema decides some of these
questions for us; here is system B:
Rules: MP, NEC
Axioms: PL axioms
    K-schema: □(φ→ψ)→(□φ→□ψ)
    T-schema: □φ→φ
    B-schema: ◇□φ→φ
Note that we retain the T axiom schema in B. Thus, B is an extension of T
(and hence of K and D as well).
As with K and T, we can establish a theorem schema that is the analog for
the ◇ of B’s characteristic axiom schema. (The proof illustrates techniques for
“moving” ∼s through strings of modal operators.)
B◇: φ→□◇φ
1. ◇□∼φ→∼φ    B
2. φ→∼◇□∼φ    1, PL
3. ∼◇□∼φ↔□∼□∼φ    MN
4. ∼□∼φ→◇φ    PL (since ◇ abbreviates ∼□∼)
5. □∼□∼φ→□◇φ    4, NEC, K, MP
6. φ→□◇φ    2, 3, 5, PL
As an example, let’s show that ⊢_B [□P∧□◇□(P→Q)]→□Q: [7]
1. ◇□(P→Q)→(P→Q)    B
2. □◇□(P→Q)→□(P→Q)    1, NEC, K, MP
3. □(P→Q)→(□P→□Q)    K
4. □◇□(P→Q)→(□P→□Q)    2, 3, PL
5. [□P∧□◇□(P→Q)]→□Q    4, PL
6.4.5 System S4
The characteristic axiom of our next system, S4, is a different principle
governing iterated modalities:
Rules: MP, NEC
Axioms: PL axioms
    K-schema: □(φ→ψ)→(□φ→□ψ)
    T-schema: □φ→φ
    S4-schema: □φ→□□φ
[7] This is #41 from the handout “MPL theorems”.
S4 contains the S4-schema but does not contain the B-schema. Symmetrically,
B lacks the S4-schema, but of course contains the B-schema. As a result,
some instances of the B-schema are not provable in S4, and some instances of
the S4-schema are not provable in B (we’ll be able to show this after section
6.5). Hence, although S4 and B are each extensions of T, neither B nor S4 is
an extension of the other.
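Here is a rough sketch of the two countermodels behind that claim. It assumes the possible-worlds semantics together with soundness (B-theorems hold in all reflexive symmetric models, S4-theorems in all reflexive transitive models), which is the business of section 6.5. The names `holds`, `reflexive`, `symmetric`, `transitive`, and the tuple encoding of formulas are hypothetical.

```python
def holds(f, w, R, V):
    """True iff f is true at world w. f is a sentence letter (str) or a
    tuple ("not", g), ("imp", g, h), ("box", g), ("dia", g)."""
    if isinstance(f, str):
        return f in V[w]
    op = f[0]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], v, R, V) for v in R[w])
    if op == "dia":
        return any(holds(f[1], v, R, V) for v in R[w])
    raise ValueError(f"unknown operator: {op!r}")


def reflexive(R):
    return all(w in R[w] for w in R)


def symmetric(R):
    return all(w in R[v] for w in R for v in R[w])


def transitive(R):
    return all(u in R[w] for w in R for v in R[w] for u in R[v])


# Model 1: reflexive and symmetric, but not transitive (a sees b, b sees c).
R1 = {"a": {"a", "b"}, "b": {"a", "b", "c"}, "c": {"b", "c"}}
V1 = {"a": {"P"}, "b": {"P"}, "c": set()}
s4_instance = ("imp", ("box", "P"), ("box", ("box", "P")))
s4_fails_in_b_model = not holds(s4_instance, "a", R1, V1)

# Model 2: reflexive and transitive, but not symmetric (a sees b only).
R2 = {"a": {"a", "b"}, "b": {"b"}}
V2 = {"a": set(), "b": {"P"}}
b_instance = ("imp", ("dia", ("box", "P")), "P")
b_fails_in_s4_model = not holds(b_instance, "a", R2, V2)
```

In the first model □P is true at a but □□P is not, because b can see the P-less world c. In the second, □P holds at b, so ◇□P holds at a even though P itself fails at a.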
As before, we have a theorem schema that is the analog for the ◇ of the S4
axiom schema:
S4◇: ◇◇φ→◇φ
1. □∼φ→□□∼φ    S4
2. □∼φ→∼◇φ    MN
3. □□∼φ→□∼◇φ    2, NEC, K, MP
4. □∼◇φ→∼◇◇φ    MN
5. ∼◇φ→□∼φ    MN
6. ◇◇φ→◇φ    5, 1, 3, 4, PL
Let’s do one fairly difficult example: (◇P∧□Q)→◇(P∧□Q). [8] How should
we approach this problem?
My thinking is as follows. We saw in the K section above that the following
sort of thing may always be proved: □φ→(◇ψ→◇χ), whenever the conditional
φ→(ψ→χ) can be proved. So we need to try to work the problem into this form.
As-is, the problem doesn’t quite have this form. But something very related
does have this form, namely: □□Q→(◇P→◇(P∧□Q)) (since the conditional
□Q→(P→(P∧□Q)) is a tautology). This thought inspires the following proof:
1. □Q→(P→(P∧□Q))    PL
2. □□Q→□(P→(P∧□Q))    1, NEC, K, MP
3. □(P→(P∧□Q))→(◇P→◇(P∧□Q))    K◇
4. □□Q→(◇P→◇(P∧□Q))    2, 3, PL
5. □Q→□□Q    S4
6. (◇P∧□Q)→◇(P∧□Q)    4, 5, PL
[8] This is half of #56 from the “MPL Theorems” handout.
6.4.6 System S5
Here, instead of the B or S4 schemas, we add the S5 schema to T:
Rules: MP, NEC
Axioms: PL axioms
    K-schema: □(φ→ψ)→(□φ→□ψ)
    T-schema: □φ→φ
    S5-schema: ◇□φ→□φ
First let’s prove the analog of the S5-schema for the ◇:
S5◇: ◇φ→□◇φ
1. ◇□∼φ→□∼φ    S5
2. ∼◇φ→□∼φ    MN
3. ◇∼◇φ→◇□∼φ    2, NEC, K◇, MP
4. ∼□◇φ→◇∼◇φ    MN
5. □∼φ→∼◇φ    MN
6. ◇φ→□◇φ    4, 3, 1, 5, PL
Next, note that the B and S4 axioms are now derivable as theorems. The B
axiom, ◇□φ→φ, is trivial:
1. ◇□φ→□φ    S5
2. □φ→φ    T
3. ◇□φ→φ    1, 2, PL
And now the S4 axiom, □φ→□□φ. This is a little harder. I used the B◇
theorem, which we can now appeal to since the theoremhood of the B-schema
has been established.
1. □φ→□◇□φ    B◇
2. ◇□φ→□φ    S5
3. □(◇□φ→□φ)    2, NEC
4. □◇□φ→□□φ    3, K, MP
5. □φ→□□φ    4, 1, PL
6.4.7 Substitution of equivalents and modal reduction
Let’s conclude our discussion of provability in modal logic by proving two
simple metatheorems. First, let’s prove the following result, which holds for
each of our systems:
Substitution of equivalents: Where S is any of our modal systems, and
wff χ^β results from wff χ by changing occurrences of wff α to occurrences
of wff β:
    if ⊢_S α↔β then ⊢_S χ↔χ^β
Proof of substitution of equivalents: Suppose (*) ⊢_S α↔β. We proceed
by induction.
Base case: here χ is a sentence letter. Then either i) changing αs to
βs has no effect, in which case χ^β is just χ, in which case obviously
⊢_S (χ↔χ^β); or ii) χ is α, in which case χ^β is β, and we know
⊢_S (χ↔χ^β) since we are given (*).
Induction case: We now assume the result holds for some formulas χ_1
and χ_2 (that is, we assume that ⊢_S χ_1↔χ_1^β and ⊢_S χ_2↔χ_2^β), and
we show the result holds for ∼χ_1, χ_1→χ_2, and □χ_1.
Take the first case. We must show that the result holds for ∼χ_1,
i.e., we must show that ⊢_S ∼χ_1↔(∼χ_1)^β. (∼χ_1)^β is just ∼χ_1^β, so we
must show ⊢_S ∼χ_1↔∼χ_1^β. But ∼χ_1↔∼χ_1^β follows by PL from
χ_1↔χ_1^β, and the inductive hypothesis tells us that ⊢_S χ_1↔χ_1^β.
Take the second case. We must show ⊢_S (χ_1→χ_2)↔(χ_1→χ_2)^β. The
inductive hypothesis tells us that ⊢_S χ_1↔χ_1^β, and so (since S includes PL):
    ⊢_S (χ_1→χ_2)↔(χ_1^β→χ_2)
The inductive hypothesis also tells us that ⊢_S (χ_2↔χ_2^β), from
which, using propositional logic in S, we obtain:
    ⊢_S (χ_1^β→χ_2)↔(χ_1^β→χ_2^β)
Now from the two indented equivalences, again using propositional
logic in S, we have:
    ⊢_S (χ_1→χ_2)↔(χ_1^β→χ_2^β)
But note that (χ_1^β→χ_2^β) is just the same formula as (χ_1→χ_2)^β. So
we’ve shown what we wanted to show.
Finally, take the third case. We must show that ⊢_S □χ_1↔□χ_1^β.
This follows from the inductive hypothesis ⊢_S χ_1↔χ_1^β. For the
inductive hypothesis implies ⊢_S χ_1→χ_1^β, by PL; and then, using
NEC and a K-axiom, we have ⊢_S □χ_1→□χ_1^β. A parallel argument
establishes ⊢_S □χ_1^β→□χ_1; and then the desired conclusion follows
by using PL. That completes the inductive proof.
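The operation that takes χ to χ^β can be sketched as a short recursive function whose clauses mirror the cases of the induction. This is only an illustration: the name `substitute` and the tuple encoding of formulas are hypothetical, and the sketch changes every occurrence of α, whereas the metatheorem also covers changing only some of them.

```python
def substitute(chi, alpha, beta):
    """Return chi with every occurrence of alpha replaced by beta.

    Formulas are sentence letters (str) or tuples such as ("not", g),
    ("imp", g, h), ("box", g), ("dia", g).
    """
    if chi == alpha:              # chi is alpha itself, so chi^beta is beta
        return beta
    if isinstance(chi, str):      # a sentence letter other than alpha
        return chi
    # Complex formula: keep the operator, substitute inside each part.
    return (chi[0],) + tuple(substitute(part, alpha, beta) for part in chi[1:])


# Changing box-not-P to not-dia-P inside dia-dia-box-not-P.
alpha = ("box", ("not", "P"))
beta = ("not", ("dia", "P"))
chi = ("dia", ("dia", alpha))
chi_beta = substitute(chi, alpha, beta)
```

Here `chi_beta` is `("dia", ("dia", ("not", ("dia", "P"))))`. The function only rewrites formulas; that the result is provably equivalent to the original is what the induction above establishes.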
The following examples illustrate the power of substitution of equivalents.
In our discussion of K we proved the following two theorems:
    □(P∧Q)→(□P∧□Q)
    (□P∧□Q)→□(P∧Q)
Hence (by PL), □(P∧Q)↔(□P∧□Q) is a K-theorem. Given substitution
of equivalents, whenever we prove a theorem in which the formula □(P∧Q)
occurs as a subformula, we can infer that the result of changing □(P∧Q) to
□P∧□Q is also a K-theorem, without having to do a separate proof.
Similarly, given the modal negation theorems, we know that all instances
of the following schemas are theorems of K (and hence of every other system):
    □∼φ↔∼◇φ
    ◇∼φ↔∼□φ
Call these “the duals equivalences”. [9] Given the duals equivalences, we can swap
∼◇φ and □∼φ, or ∼□φ and ◇∼φ, within any theorem of any system, and
the result will also be a theorem of that system. So we can “move” ∼s through
series of modal operators at will. For example, it’s easy to show that each of the
following is a theorem of each system S:
1. ◇◇□∼φ↔◇◇□∼φ    This is a theorem of S, since it has the form ψ↔ψ.
2. ◇◇∼◇φ↔◇◇□∼φ    This is the result of changing □∼φ on the left of 1 to
                   ∼◇φ. Since 1 is a theorem of S, this too is a theorem of S,
                   by substitution of equivalents via a duals equivalence.
3. ◇∼□◇φ↔◇◇□∼φ    Changed ◇∼◇φ in 2 to ∼□◇φ, so by substitution of
                   equivalents via a duals equivalence this too is a theorem of S.
4. ∼□□◇φ↔◇◇□∼φ    This follows from 3 and a MN theorem by PL, so it too is
                   a theorem of S.
(Note how this greatly simplifies the process of establishing the existence of
theorems such as K◇, T◇, B◇, S4◇, and S5◇!)
[9] Given the duals equivalences, the □ is related to the ◇ the way the ∀ is related to the
∃ (since ∀x∼φ↔∼∃xφ, and ∃x∼φ↔∼∀xφ are logical truths). This shared relationship,
which holds between the □ and the ◇, and between the ∀ and the ∃, is called “duality”; □ and
◇ are said to be duals, as are ∀ and ∃. This logical analogy would be neatly explained by a
metaphysics according to which necessity just is truth in all worlds and possibility just is truth
in some worlds!
Our second metatheorem concerns only system S5:
Modal reduction theorem for S5: Where O_1 … O_n are modal operators
and φ is a wff:
    ⊢_S5 O_1…O_nφ↔O_nφ
That is, whenever a formula has a string of modal operators in front, it is always
equivalent to the result of deleting all the modal operators except the innermost
one. For example, □□◇□◇□□◇□◇□◇φ and ◇φ are provably equivalent in
S5 (i.e., □□◇□◇□□◇□◇□◇φ↔◇φ is a theorem of S5). This follows from
the fact that the following equivalences are all theorems of S5:
a) ◇□φ↔□φ
b) □□φ↔□φ
c) □◇φ↔◇φ
d) ◇◇φ↔◇φ
The left-to-right direction of a) is just S5; the right-to-left is T◇; b) is T and
S4; c) is T and S5◇; and d) is S4◇ and T◇. Thus, by repeated applications of
these equivalences, using substitution of equivalents, we can reduce strings of
modal operators to the innermost operator. (It is straightforward to convert
this argument into a more rigorous inductive proof.) [10]
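The reduction procedure can be traced mechanically. The sketch below treats a prefix of modal operators as a string of the characters □ and ◇ and repeatedly applies whichever of a) through d) matches the innermost pair, which always amounts to deleting the second-to-last operator. The function name `reduce_s5` and the string encoding are hypothetical; the code tracks only the bookkeeping and proves nothing.

```python
def reduce_s5(prefix):
    """Return the successive prefixes obtained by collapsing the innermost
    pair of operators, ending with the single innermost operator."""
    if not prefix or any(op not in "□◇" for op in prefix):
        raise ValueError("prefix must be a nonempty string of □ and ◇")
    ops = list(prefix)
    steps = ["".join(ops)]
    while len(ops) > 1:
        # Equivalences a)-d): any pair O O' in front of a wff is equivalent
        # to O' alone, so drop the outer member of the innermost pair.
        del ops[-2]
        steps.append("".join(ops))
    return steps


steps = reduce_s5("□□◇□◇□□◇□◇□◇")
print(steps[-1])  # prints: ◇
```

Each step is licensed by substitution of equivalents, since the collapsed pair sits underneath the remaining outer operators.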
6.5 Soundness in MPL [11]
At this point, we have defined twelve logical systems: six semantic systems and
six axiomatic systems. But each semantic system was paired with an axiomatic
system to which we gave the same name. The time has come to justify this
pairing. In this section and the next, we show that for each semantic system,
exactly the same wffs are counted valid in that system as are counted theorems
by the axiomatic system of the same name. That is, for each of our systems, S
(for S = K, D, T, B, S4, and S5), we will prove soundness and completeness:
    S-soundness: every S-theorem is S-valid
    S-completeness: every S-valid formula is an S-theorem
Our study of modal logic has reversed that of history. We began with
semantics, because that is the more intuitive approach. Historically (as we
noted earlier), the axiomatic systems came first, in the work of C. I. Lewis.
Given the uncertainty over what formulas ought to be counted as axioms, modal
logic was in disarray. The discovery by the teenaged Saul Kripke in the late
1950s of the possible-worlds semantics we studied in section ??, and of the
correspondence between simple constraints (reflexivity, transitivity, etc.) on
the accessibility relation in his models and Lewis’s axiomatic systems, was a
major advance in the history of modal logic.
The soundness and completeness theorems have practical as well as
theoretical value. First, once we’ve proved soundness, we will for the first time
have a method for establishing that a given formula is not a theorem: construct
a countermodel for that formula, thus establishing that the formula is not valid,
and then conclude via soundness that the formula is not a theorem. Second,
given completeness, if we want to know that a given formula is a theorem, it
suffices to show that it is valid. Since semantic validity proofs are comparatively
easy to construct, it’s nice to be able to use them rather than axiomatic proofs.
[10] The modal reduction theorem, the duals equivalences, and substitution of equivalents
together let us “reduce” strings of operators that include ∼s as well as modal operators. Simply
use the duals equivalences to drive any ∼s in the string to the far right hand side, then use the
modal reduction theorem to eliminate all but the innermost modal operator.
[11] The proofs of soundness and completeness in this and the next section are from Cresswell
and Hughes (1996).
Let’s begin with soundness. We’re going to prove a general theorem, which
we’ll use in several soundness proofs. First we’ll need a piece of terminology.
Where Γ is any set of modal wffs, let’s call “K+Γ” the axiomatic system that
consists of the same rules of inference as K (MP and NEC), and which has
as axioms the axioms of K (instances of the K- and PL-schemas), plus the
members of Γ. Here, then, is the theorem:
Theorem 6.1 If Γ is any set of modal wffs and ℳ is an MPL-model
in which each wff in Γ is valid, then every theorem of
K+Γ is valid in ℳ
Modal systems of the form K+Γ are commonly called normal. Normal
modal systems contain all the K-theorems, plus possibly more. What Theorem
6.1 gives us is a method for constructing a soundness proof for any normal
system. Since all the systems we have studied here (K, D, etc.) are normal, this
method is sufficiently general for us. Here’s how the method works for system
T. System T has the same rules of inference as K, and its axioms are all the
axioms of K, plus the instances of the T-schema. In the “K+Γ” notation, T = K
+ {□φ→φ : φ is an MPL wff}. To establish soundness for T, all we need to do
is show that every instance of the T-schema is valid in all reflexive models; for
we may then conclude by Theorem 6.1 that every theorem of T is valid in all
reflexive models. This method can be applied to each of our systems: for any
system, S, to establish S’s soundness it will suffice to show that S’s “extra-K”
axioms are valid in all of the S-models.
To prove Theorem 6.1, we will first prove two lemmas:
Lemma 6.2 All PL- and K-axioms are valid in all MPL-models
Lemma 6.3 For every MPL-model, ℳ, MP and Necessitation
preserve validity in ℳ
Theorem 6.1 follows from the lemmas: assume that every wff in Γ is valid in a
given MPL-model ℳ, and consider any theorem φ of K+Γ. That theorem is
the last line of a proof in which each line is either an axiom of K+Γ, or follows from
earlier lines in the proof by MP or NEC. But axioms of K+Γ are either PL
axioms, K axioms, or members of Γ. The first two classes of axioms are valid in
all MPL-models, by Lemma 6.2, and so are valid in ℳ; and the final class of
axioms are valid in ℳ by hypothesis. Thus, all axioms in the proof are valid
in ℳ. Moreover, by Lemma 6.3, the rules of inference in the proof preserve
validity in ℳ. Therefore, by induction, every line in the proof is valid in ℳ.
Hence the last line in the proof, φ, is valid in ℳ.
We now need to prove the lemmas.
Proof of Lemma 6.2
From our proof of soundness for PL (section 2.6), we know that the PL truth
tables generate the value 1 for each PL axiom, no matter what truth value its
immediate constituents have. But here in MPL, the truth values of conditionals
and negations are determined at a given world by the truth values at that world
of its immediate constituents via the PL truth tables. So any PL axiom must
have truth value 1 at any world, regardless of what truth values its immediate
constituents have. PL-axioms, therefore, are true at every world in every model,
and so are valid in every model. We need now to show that any K axiom (i.e.,
any formula of the form □(φ→ψ)→(□φ→□ψ)) is valid in any model:
i) Suppose for reductio that V(□(φ→ψ)→(□φ→□ψ), w) = 0,
for some model 〈𝒲, ℛ, ℐ〉, whose valuation is V, and some
w ∈ 𝒲
ii) So V(□(φ→ψ), w) = 1 and…
iii) …V((□φ→□ψ), w) = 0
iv) Given iii), V(□φ, w) = 1 and…
v) …V(□ψ, w) = 0
vi) Given v), for some v, ℛwv and V(ψ, v) = 0
vii) Given iv), since ℛwv, V(φ, v) = 1
viii) Given ii), since ℛwv, V(φ→ψ, v) = 1
ix) Lines vi), vii), and viii) contradict, given the truth condition
for the →
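As a sanity check on the reductio, one can enumerate every model with at most two worlds and confirm that one instance of the K-schema never comes out false, while its converse does. This is a finite spot check under the possible-worlds truth conditions, not a substitute for the proof; the names `holds`, `all_models`, and `valid_up_to` are hypothetical.

```python
from itertools import product


def holds(f, w, R, V):
    """True iff f is true at world w. f is a sentence letter (str) or a
    tuple ("not", g), ("imp", g, h), ("box", g)."""
    if isinstance(f, str):
        return f in V[w]
    if f[0] == "not":
        return not holds(f[1], w, R, V)
    if f[0] == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if f[0] == "box":
        return all(holds(f[1], v, R, V) for v in R[w])
    raise ValueError(f"unknown operator: {f[0]!r}")


def all_models(n, letters):
    """Yield every (R, V) over worlds 0..n-1 and the given sentence letters."""
    worlds = list(range(n))
    pairs = [(u, v) for u in worlds for v in worlds]
    for bits in product([False, True], repeat=len(pairs)):
        R = {w: set() for w in worlds}
        for (u, v), bit in zip(pairs, bits):
            if bit:
                R[u].add(v)
        rows = product([False, True], repeat=len(letters))
        for table in product(list(rows), repeat=n):
            V = {w: {l for l, t in zip(letters, table[w]) if t} for w in worlds}
            yield R, V


def valid_up_to(f, letters, max_worlds=2):
    """True iff f is true at every world of every model of the given sizes."""
    return all(holds(f, w, R, V)
               for n in range(1, max_worlds + 1)
               for R, V in all_models(n, letters)
               for w in R)


k_instance = ("imp", ("box", ("imp", "P", "Q")),
              ("imp", ("box", "P"), ("box", "Q")))
converse = ("imp", ("imp", ("box", "P"), ("box", "Q")),
            ("box", ("imp", "P", "Q")))
k_ok = valid_up_to(k_instance, "PQ")
converse_ok = valid_up_to(converse, "PQ")
```

The check puts no condition on accessibility, matching the lemma’s claim about all MPL-models. The converse fails, for instance, where a world sees one world with P true and Q false and another with P false.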
Proof of Lemma 6.3
We must show that the rules MP and NEC preserve validity in any given model.
That is, we must show that if the inputs to one of these rules are valid in some
model, then that rule’s output must also be valid in that model.
First MP. Let φ and φ→ψ be valid in model 〈𝒲, ℛ, ℐ〉; we must show that
ψ is also valid in that model. That is, where V is this model’s valuation, and
w is any member of 𝒲, we must show that V(ψ, w) = 1. Since φ and φ→ψ
are valid in this model, V(φ→ψ, w) = 1, and V(φ, w) = 1; so by the truth
condition for →, V(ψ, w) must also be 1.
Next NEC. Suppose φ is valid in model ℳ. We must show that □φ is
valid in ℳ, i.e., that □φ is true at each world in ℳ, i.e., that for each world,
w, φ is true at every world accessible from w. But since φ is valid in ℳ, φ is
true in every world in ℳ, and hence is true at every world accessible from w.
6.5.1 Soundness of K
We can now construct soundness proofs for the individual systems. I’ll do this
for some of the systems, and leave the verification of soundness for the other
systems as exercises.
First K. In the “K+Γ” notation, K is just K+∅, and so it follows immediately
from Theorem 6.1 that every theorem of K is valid in every MPL-model. So
K is sound.
6.5.2 Soundness of T
T is K+Γ, where Γ is the set of all instances of the T-schema. So, given Theorem
6.1, to show that every theorem of T is valid in all T-models, it suffices to show
that all instances of the T-schema are valid in all T-models:
i) Assume for reductio that V(□φ→φ, w) = 0 for some world w
in some T-model (i.e., some model with a reflexive accessibility
relation)
ii) So V(□φ, w) = 1 and…
iii) …V(φ, w) = 0
iv) ℛww, by reflexivity. So, from ii), V(φ, w) = 1, contradicting
iii)
6.5.3 Soundness of B
B is K+Γ, where Γ is the set of all instances of the T- and B-schemas. Given
Theorem 6.1, it suffices to show that every instance of the B-schema and every
instance of the T-schema is valid in every B-model. So, choose an arbitrary
model with a reflexive and symmetric accessibility relation, whose valuation is
V, and let w be any world in that model. We must show that V counts each
instance of the T-schema and the B-schema as being true at w. The proof
of the previous section shows that the T-axioms are true at w. Now for the
B-axioms:
i) Assume for reductio that V(◇□φ→φ, w) = 0.
ii) So V(◇□φ, w) = 1 and…
iii) …V(φ, w) = 0
iv) By ii), V(□φ, v) = 1, for some v such that ℛwv.
v) By symmetry, ℛvw. So, given iv), V(φ, w) = 1, contradicting
iii)
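The role that reflexivity and symmetry play in these two proofs can be seen by brute force on small models. The sketch below checks one instance of each schema on every model with at most two worlds that meets a given condition on accessibility. It is a finite illustration only, and the names `holds`, `all_models`, `valid_on`, `reflexive`, and `symmetric` are hypothetical.

```python
from itertools import product


def holds(f, w, R, V):
    """True iff f is true at world w. f is a sentence letter (str) or a
    tuple ("imp", g, h), ("box", g), ("dia", g)."""
    if isinstance(f, str):
        return f in V[w]
    if f[0] == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if f[0] == "box":
        return all(holds(f[1], v, R, V) for v in R[w])
    if f[0] == "dia":
        return any(holds(f[1], v, R, V) for v in R[w])
    raise ValueError(f"unknown operator: {f[0]!r}")


def all_models(n):
    """Yield every (R, V) over worlds 0..n-1 with the single letter P."""
    worlds = list(range(n))
    pairs = [(u, v) for u in worlds for v in worlds]
    for bits in product([False, True], repeat=len(pairs)):
        R = {w: {v for (u, v), b in zip(pairs, bits) if b and u == w}
             for w in worlds}
        for truth in product([False, True], repeat=n):
            V = {w: ({"P"} if truth[w] else set()) for w in worlds}
            yield R, V


def reflexive(R):
    return all(w in R[w] for w in R)


def symmetric(R):
    return all(w in R[v] for w in R for v in R[w])


def valid_on(f, condition, max_worlds=2):
    """True iff f is true at every world of every small model whose
    accessibility relation satisfies the condition."""
    return all(holds(f, w, R, V)
               for n in range(1, max_worlds + 1)
               for R, V in all_models(n) if condition(R)
               for w in R)


t_instance = ("imp", ("box", "P"), "P")
b_instance = ("imp", ("dia", ("box", "P")), "P")

t_on_reflexive = valid_on(t_instance, reflexive)
t_on_all = valid_on(t_instance, lambda R: True)
b_on_b_models = valid_on(b_instance, lambda R: reflexive(R) and symmetric(R))
b_on_reflexive = valid_on(b_instance, reflexive)
```

The T instance survives on reflexive models but not on arbitrary ones, and the B instance survives once symmetry is added but not on reflexivity alone. This matches the appeal to reflexivity in step iv) of the T proof and to symmetry in step v) of the B proof.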
6.6 Completeness of MPL
Next, completeness: for each system, we’ll show that every valid formula is a
theorem. As with soundness, most of the work will go into developing some
general-purpose machinery. At the end we’ll use the machinery to construct
completeness proofs for each system.
We’ll be constructing a kind of completeness proof known as a “Henkin
proof”, after Leon Henkin, who used similar methods to demonstrate
completeness for (nonmodal) predicate logic.
6.6.1 Canonical models
For each of our systems, we’re going to show how to construct a certain special
model, the canonical model for that system. The canonical model for a system,
S, will be shown to have the following feature:
    If a formula is valid in the canonical model for S, then it is a theorem of S
This sufficient condition for theoremhood can then be used to give completeness
proofs, as the following example brings out. Suppose we can demonstrate
that the accessibility relation in the canonical model for T is reflexive. Then,
since T-valid formulas are by definition true in every world in every model
with a reflexive accessibility relation, we know that every T-valid formula is
valid in the canonical model for T. But then the indented statement above tells us
that every T-valid formula is a theorem of T. So we would have established
completeness for T.
The trick for constructing canonical models will be to let the worlds in these
models be sets of formulas (remember, worlds are allowed to be anything we
like). And we’re going to construct the interpretation function of the canonical
model in such a way that a formula will be true at a world iff the formula is a
member of the set that is the world. Working out this idea will occupy us for
a while.
6.6.2 Maximal consistent sets of wffs
To carry out this idea of constructing worlds as sets of formulas that are true at
those worlds, we’ll need to put some constraints on the nature of these sets of
wffs. It’s part of the definition of a valuation function that for any wff φ and
any world w, either φ or ∼φ is true at w. That means that any set of wffs that
we’re going to call a world had better contain either φ or ∼φ. Moreover, we’d
better not let such a set contain both φ and ∼φ, since a formula can’t be both
true and false at a world. Other constraints must be introduced as well.
Where S is any of our axiomatic systems, let’s define the following notions:
    A set of MPL-wffs, Γ, is S-inconsistent iff for some φ_1 … φ_n ∈ Γ,
    ⊢_S ∼(φ_1∧···∧φ_n). Γ is S-consistent iff it is not S-inconsistent
    A set of MPL-wffs, Γ, is maximal iff for every MPL-wff φ, either
    φ or ∼φ is a member of Γ
    A set is maximal S-consistent iff it is both maximal and S-consistent
A set is S-inconsistent if it contains some wffs (finite in number) that are
provably (in S) contradictory; it is S-consistent if it contains no such wffs. This
notion of consistency is proof-theoretic: it has to do with what can be proved in
axiomatic systems, not with truth in models. Furthermore, S-consistency has to
do with provability in system S. It therefore requires more than the mere absence
of contradictions. Thus consider a set of wffs that contains no contradictions
(i.e., for no φ does the set contain both φ and ∼φ), but which contains both
□P and ∼P. This set would be T-inconsistent, since ∼(□P∧∼P) is a theorem
of T.
A maximal S-consistent set of wffs contains, for each formula, either that
formula or its negation; and it contains nothing that is disprovable in S. Maximal
consistent sets are fit sets to be worlds in our canonical models.
6.6.3 Definition of canonical models
We’re now ready to define canonical models. It may not be fully clear at this
point why the definition is phrased as it is; you’ll need to take it on faith that
the definition will get us where we want to go.
The canonical model for S is the MPL-model 〈𝒲, ℛ, ℐ〉 where:
• 𝒲 is the set of all maximal S-consistent sets of wffs
• ℛww′ iff □⁻(w) ⊆ w′
• ℐ(α, w) = 1 iff α ∈ w, for each sentence letter α and each
w ∈ 𝒲
(where □⁻(∆) is defined as the set of wffs φ such that □φ is a
member of ∆)
Let’s think for a bit about this definition. As promised, we have defined
the members of 𝒲 to be maximal S-consistent sets of wffs. And note that all
maximal S-consistent sets of wffs are included.
Accessibility is defined using the “□⁻” notation. Think of this operation
as “stripping off the boxes”: to arrive at □⁻(∆) (“the box-strip of set ∆”),
begin with set ∆, discard any formula that doesn’t begin with a □, line up
the remaining formulas, and then strip one □ off of the front of each. The
definition of accessibility, therefore, says ℛww′ iff for each wff □φ that is a
member of w, the wff φ is a member of w′.
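The box-strip operation and the accessibility test it induces are easy to state as code. The sketch below runs on small finite sets for illustration only; genuine maximal consistent sets are infinite, so this is not a construction of a canonical model. The names `box_strip` and `accessible` and the tuple encoding of formulas are hypothetical.

```python
def box_strip(delta):
    """Return the set of wffs g such that ("box", g) is a member of delta."""
    return {f[1] for f in delta if isinstance(f, tuple) and f[0] == "box"}


def accessible(w, w_prime):
    """Canonical-model accessibility: the box-strip of w is a subset of w'."""
    return box_strip(w) <= w_prime


# A toy "world": two boxed formulas, a sentence letter, and a negated box.
w = {("box", "P"), ("box", ("imp", "P", "Q")), "R", ("not", ("box", "Q"))}
stripped = box_strip(w)

u = {"P", ("imp", "P", "Q"), "Q"}   # contains everything w says is necessary
v = {"P"}                           # lacks the conditional
```

Here `stripped` is the set containing P and P→Q. The formula ∼□Q is discarded because it begins with ∼ rather than □, so `accessible(w, u)` holds while `accessible(w, v)` does not.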
The definition of accessibility in the canonical model says nothing about
formal properties like transitivity, reflexivity, and so on. As a result, it is not
true by definition that the canonical model for S is an S-model. T-models,
for example, must have reflexive accessibility relations, whereas the definition
of the accessibility relation in the canonical model for T says nothing about
reflexivity. As we will eventually see, the canonical model for each system S
turns out to be an S-model, but this fact must be proven; it’s not built into the
definition of a canonical model.
An atomic wff (sentence letter) is defined to be true at a world iff it is a
member of that world. Thus, for atomic wffs, truth and membership coincide.
What we really need to know, however, is that truth and membership coincide
for all wffs, including complex wffs. Proving this turns out to be a big task,
which will occupy us for several sections. We’ll need first to assemble some
firepower: a number of preliminary lemmas and theorems which will eventually
be used to prove that membership and truth coincide for all wffs in canonical
models. We’ll then finally be able to give completeness proofs.
6.6.4 Features of maximal consistent sets
We’ll begin with some lemmas governing the behavior of maximal consistent
sets:
Lemma 6.1 Where Γ is any maximal S-consistent set of wffs:
    6.1a for any wff φ, exactly one of φ, ∼φ is a member of Γ
    6.1b φ→ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ
    6.1c if φ and φ→ψ are both members of Γ then so is ψ
Lemma 6.2 Where Γ is any maximal S-consistent set of wffs,
    6.2a if ⊢_S φ then φ ∈ Γ
    6.2b if φ ∈ Γ and ⊢_S φ→ψ then ψ ∈ Γ
I’ll prove Lemma 6.1, and leave 6.2 to the reader.
6.1a: we know from the definition of maximality that at least one of φ or ∼φ
is in Γ. But it cannot be that both are in Γ, for then Γ would be S-inconsistent (it
would contain the finite subset {φ, ∼φ}; but since all modal systems incorporate
propositional logic, it is a theorem of S that ∼(φ∧∼φ)).
6.1b: Suppose first that φ→ψ is in Γ, and suppose for reductio that φ is
in Γ but ψ is not. Then, by 6.1a, ∼ψ is in Γ; but now Γ is S-inconsistent by
containing the subset {φ, φ→ψ, ∼ψ}. Suppose for the other direction that either
φ is not in Γ or ψ is in Γ, and suppose for reductio that φ→ψ isn’t in Γ. By
6.1a, ∼(φ→ψ) is in Γ. Now, if φ ∉ Γ then ∼φ ∈ Γ, but then Γ would contain
the S-inconsistent subset {∼(φ→ψ), ∼φ}. And if on the other hand, ψ ∈ Γ
then Γ again contains an S-inconsistent subset: {∼(φ→ψ), ψ}. Either possibility
contradicts Γ’s S-consistency.
6.1c: Direct consequence of 6.1b.
6.6.5 Maximal consistent extensions
Next let’s show that if we begin with an S-consistent set ∆, we can “expand” it
into a maximal S-consistent set Γ. (We prove this because we’ll need to know
that there exist enough maximal consistent sets in a canonical model’s 𝒲 in
order for the model to do its thing.)
Theorem 6.3 If ∆ is an S-consistent set of wffs, then there exists
some maximal S-consistent set of wffs, Γ, such that ∆⊆Γ
Proof of Theorem 6.3: In outline, we’re going to build up Γ as follows. We’re
going to start by dumping all the formulas in ∆ into Γ. Then we will go through
all the wffs in the language of MPL, φ_1, φ_2, …, one at a time. For each of these
wffs, we’re going to dump either it or its negation into Γ, depending on which
choice would be S-consistent. After we’re done, our set Γ will obviously be
maximal; it will obviously contain ∆ as a subset; and, we’ll show, it will also be
S-consistent.
So, let φ_1, φ_2, … be a list (an infinite list, of course) of all the wffs of
MPL. [12] Our strategy, recall, is to construct Γ by starting with ∆, and then
going through this list one by one, at each point adding either φ_i or ∼φ_i.
[12] We need to be sure that there is some way of arranging all the wffs of MPL into such a
list. Here is one method. Consider the following list of the primitive expressions of MPL:
    (   )   ∼   →   □   P_1   P_2   …
    1   2   3   4   5   6     7     …
Since we’ll need to refer to what position an expression has in this list, the positions of the
expressions are listed underneath those expressions. (E.g., the position of the □ is 5.) Now,
where φ is any wff, call the rating of φ the sum of the positions of the occurrences of its
primitive expressions. (The rating for the wff (P_1→P_1), for example, is 1+6+4+6+2 = 19.)
We can now construct the listing of all the wffs of MPL by an infinite series of stages: stage 1,
stage 2, etc. In stage n, we append to our growing list all the wffs of rating n, in alphabetical
order. The notion of alphabetical order here is the usual one, given the ordering of the primitive
expressions laid out above. (E.g., just as ‘and’ comes before ‘nad’ in alphabetical order, since
‘a’ precedes ‘n’ in the usual ordering of the English alphabet, ∼□P_2 comes before □∼P_2 in
alphabetical order since ∼ comes before the □ in the ordering of the alphabet of MPL. Note
that each of these wffs is inserted into the list in stage 15, since each has rating 15.) In stages
1–5 no wffs are added at all, since every wff must have at least one sentence letter and P_1 is
the sentence letter with the smallest position. In stage 6 there is one wff: P_1. Thus, the first
member of our list of wffs is P_1. In stage 7 there is one wff: P_2, so P_2 is the second member of
the list. In every subsequent stage there are only finitely many wffs; so each stage adds finitely
many wffs to the list; each wff gets added at some stage; so each wff eventually gets added after
some finite amount of time to this list.
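The rating scheme described in the footnote on listing the wffs can be checked with a few lines of code. In this sketch a wff is written as a list of tokens, with the sentence letter P_i written as the string "Pi"; that encoding and the names `position`, `rating`, and `list_key` are hypothetical. The code does not test whether a token list is well formed.

```python
FIXED = {"(": 1, ")": 2, "∼": 3, "→": 4, "□": 5}


def position(token):
    """Position of a primitive expression in the alphabet of MPL."""
    if token in FIXED:
        return FIXED[token]
    if token.startswith("P") and token[1:].isdigit() and int(token[1:]) >= 1:
        return 5 + int(token[1:])     # P_1 is 6, P_2 is 7, and so on
    raise ValueError(f"not a primitive expression: {token!r}")


def rating(tokens):
    """Sum of the positions of the occurrences of primitive expressions."""
    return sum(position(t) for t in tokens)


def list_key(tokens):
    """Sort key for the master list: stage (rating) first, then
    alphabetical order as given by the positions."""
    return (rating(tokens), [position(t) for t in tokens])


conditional = ["(", "P1", "→", "P1", ")"]
neg_box = ["∼", "□", "P2"]
box_neg = ["□", "∼", "P2"]

ordered = sorted([box_neg, neg_box, ["P2"], ["P1"]], key=list_key)
print(rating(conditional), rating(neg_box), rating(box_neg))  # prints: 19 15 15
```

Sorting by `list_key` puts P_1 first and P_2 second, and places ∼□P_2 before □∼P_2 within stage 15, as the footnote describes.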
Here’s how we do this more carefully. Let’s begin by defining an infinite
sequence of sets, Γ_0, Γ_1, … :
i) Γ_0 is ∆
ii) Γ_{n+1} is Γ_n ∪ {φ_{n+1}} if that is S-consistent; otherwise Γ_{n+1} is
Γ_n ∪ {∼φ_{n+1}}
Note the recursive nature of the definition: the next member of the sequence
of sets, Γ_{n+1}, is defined as a function of the previous member of the sequence,
Γ_n.
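The recursion can be sketched in code for a finite stretch of the list. Two loud assumptions: S-consistency is a proof-theoretic notion with no simple test, so the sketch takes the consistency test as a parameter, and the example plugs in truth-table satisfiability for box-free formulas purely as a stand-in. The names `extend` and `satisfiable` and the tuple encoding are hypothetical.

```python
from itertools import product


def satisfiable(formulas, letters):
    """Stand-in for S-consistency (an assumption of this sketch): some
    truth-table row makes every formula true. Handles only sentence
    letters, ("not", g), and ("imp", g, h)."""
    def true_in(f, row):
        if isinstance(f, str):
            return row[f]
        if f[0] == "not":
            return not true_in(f[1], row)
        if f[0] == "imp":
            return (not true_in(f[1], row)) or true_in(f[2], row)
        raise ValueError(f"unsupported operator: {f[0]!r}")

    for values in product([False, True], repeat=len(letters)):
        row = dict(zip(letters, values))
        if all(true_in(f, row) for f in formulas):
            return True
    return False


def extend(delta, enumeration, consistent):
    """Clause i): start from delta. Clause ii): for each listed wff, add it
    if the result is consistent, and otherwise add its negation."""
    gamma = set(delta)
    for phi in enumeration:
        if consistent(gamma | {phi}):
            gamma = gamma | {phi}
        else:
            gamma = gamma | {("not", phi)}
    return gamma


letters = ["P", "Q"]
delta = {("imp", "P", "Q")}
listing = ["P", "Q", ("not", "P")]    # a finite initial segment of the list
gamma = extend(delta, listing, lambda s: satisfiable(s, letters))
```

With this input P and Q are added as they stand, while ∼P would clash with P, so its negation ∼∼P goes in instead. The real construction runs through the entire infinite list and takes the union of all the stages.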
Next let’s prove that each member in this sequence (that is, each $\Gamma_i$) is an S-consistent set. We do this inductively, by first showing that $\Gamma_0$ is S-consistent, and then showing that if $\Gamma_n$ is S-consistent, then so will be $\Gamma_{n+1}$.
Obviously, $\Gamma_0$ is S-consistent, since $\Delta$ was stipulated to be S-consistent. Next, suppose that $\Gamma_n$ is S-consistent; we must show that $\Gamma_{n+1}$ is S-consistent. Look at the definition of $\Gamma_{n+1}$. What $\Gamma_{n+1}$ gets defined as depends on whether $\Gamma_n \cup \{\phi_{n+1}\}$ is S-consistent. If $\Gamma_n \cup \{\phi_{n+1}\}$ is S-consistent, then $\Gamma_{n+1}$ gets defined as that very set $\Gamma_n \cup \{\phi_{n+1}\}$, and so of course is S-consistent. So we’re ok in that case.
The remaining possibility is that $\Gamma_n \cup \{\phi_{n+1}\}$ is S-inconsistent. In that case, $\Gamma_{n+1}$ gets defined as $\Gamma_n \cup \{\sim\phi_{n+1}\}$. So we must show that in this case, $\Gamma_n \cup \{\sim\phi_{n+1}\}$ is S-consistent. Suppose for reductio that it isn’t. The conjunction of some finite subset of its members must therefore be provably false in S. Since $\Gamma_n$ was S-consistent, the finite subset must contain $\sim\phi_{n+1}$, and so there exist $\psi_1 \dots \psi_m \in \Gamma_n$ such that $\vdash_S \sim(\psi_1 \wedge \dots \wedge \psi_m \wedge \sim\phi_{n+1})$. Furthermore, since $\Gamma_n \cup \{\phi_{n+1}\}$ is S-inconsistent, it too contains a finite subset that is provably false in S. Since $\Gamma_n$ is S-consistent, the finite subset must contain $\phi_{n+1}$, so there exist $\chi_1 \dots \chi_p \in \Gamma_n$ such that $\vdash_S \sim(\chi_1 \wedge \dots \wedge \chi_p \wedge \phi_{n+1})$. But notice that $\sim(\psi_1 \wedge \dots \wedge \psi_m \wedge \chi_1 \wedge \dots \wedge \chi_p)$ is a PL-semantic-consequence of the formulas $\sim(\psi_1 \wedge \dots \wedge \psi_m \wedge \sim\phi_{n+1})$ and $\sim(\chi_1 \wedge \dots \wedge \chi_p \wedge \phi_{n+1})$: in any PL-valuation $\phi_{n+1}$ is either true or false, so either the $\chi$s or the $\psi$s cannot all be true. It follows that $\vdash_S \sim(\psi_1 \wedge \dots \wedge \psi_m \wedge \chi_1 \wedge \dots \wedge \chi_p)$ (each of our modal systems “contains PL”: given the completeness of the PL axioms, one can always move within an MPL axiomatic proof from some formulas to a PL-semantic-consequence of those formulas). Since $\psi_1 \dots \psi_m$ and $\chi_1 \dots \chi_p$ are all members of $\Gamma_n$, this contradicts the fact that $\Gamma_n$ is S-consistent.
We have shown that all the sets in our sequence $\Gamma_i$ are S-consistent. Let us now define $\Gamma$ to be the union of all the sets in the infinite sequence, i.e., $\{\phi : \phi \in \Gamma_i \text{ for some } i\}$. We must now show that $\Gamma$ is the set we’re after: that i) $\Delta \subseteq \Gamma$, ii) $\Gamma$ is maximal, and iii) $\Gamma$ is S-consistent.
Any member of $\Delta$ is a member of $\Gamma_0$ (since $\Gamma_0$ was defined as $\Delta$), hence is a member of one of the $\Gamma_i$s, and hence is a member of $\Gamma$. So $\Delta \subseteq \Gamma$.
Any wff of MPL is in the list somewhere, i.e., it is $\phi_i$ for some $i$. But by definition of $\Gamma_i$, either $\phi_i$ or $\sim\phi_i$ is a member of $\Gamma_i$; and so one of these is a member of $\Gamma$. $\Gamma$ is therefore maximal.
Suppose for reductio that $\Gamma$ is S-inconsistent; there must then exist $\psi_1 \dots \psi_m \in \Gamma$ such that $\vdash_S \sim(\psi_1 \wedge \dots \wedge \psi_m)$. By definition of $\Gamma$, each of these $\psi_i$s is a member of $\Gamma_j$, for some $j$. Let $k$ be the largest such $j$. Note next that, given the way the $\Gamma_i$s are constructed, each $\Gamma_i$ is a subset of all subsequent ones. Thus, all of the $\psi_i$s are members of $\Gamma_k$, and thus $\Gamma_k$ is S-inconsistent. But that can’t be: we showed that all the $\Gamma_i$s are S-consistent. QED.
6.6.6 “Mesh”
Our ultimate goal is to show that in canonical models, a wff is true at a world
iff it is a member of that world. If we’re going to be able to show this, we’d
better be able to show things like this:
(2) If $\Box\phi$ is a member of world $w$, then $\phi$ is a member of every world accessible from $w$
(3) If $\Diamond\phi$ is a member of world $w$, then $\phi$ is a member of some world accessible from $w$
We’ll need to be able to show (2) and (3) because it’s part of the definition of truth in any MPL-model (whether canonical or not) that $\Box\phi$ is true at $w$ iff $\phi$ is true at each world accessible from $w$, and that $\Diamond\phi$ is true at $w$ iff $\phi$ is true at some world accessible from $w$. Think of it this way: (2) and (3) say that the modal statements that are members of a world $w$ in a canonical model “mesh” with the members of the other worlds in that canonical model. This sort of mesh had better hold if truth and membership are going to coincide.
(2) we know to be true straightaway, since it follows from the definition of the accessibility relation in canonical models. The definition of the canonical model for S, recall, stipulated that $w'$ is accessible from $w$ iff for each wff $\Box\phi$ in $w$, the wff $\phi$ is a member of $w'$. (3), on the other hand, doesn’t follow immediately from our definitions; we’ll need to prove it. Actually, it will be convenient to prove something slightly different which involves only the $\Box$:
Lemma 6.4 If $\Delta$ is a maximal S-consistent set of wffs containing $\sim\Box\phi$, then there exists a maximal S-consistent set of wffs $\Gamma$ such that $\Box^-(\Delta) \subseteq \Gamma$ and $\sim\phi \in \Gamma$
(Given the definition of accessibility in the canonical model and the definition of the $\Diamond$ in terms of the $\Box$, Lemma 6.4 basically amounts to (3).)
Proof of Lemma 6.4: Let $\Delta$ be as described. The first (and biggest) step is to establish:
(*) $\Box^-(\Delta) \cup \{\sim\phi\}$ is S-consistent.
Suppose for reductio that (*) is false. By the definition of S-inconsistency, the following is a theorem of S:
$\sim(\chi_1 \wedge \dots \wedge \chi_m)$
for some $\chi_1 \dots \chi_m \in \Box^-(\Delta) \cup \{\sim\phi\}$. The following is a PL-semantic-consequence of this formula, and hence, since S includes PL, is also a theorem of S:
$\sim(\chi_1 \wedge \dots \wedge \chi_m \wedge \sim\phi)$
Now go through the list $\chi_1 \dots \chi_m$, and if it contains any wffs that are not members of $\Box^-(\Delta)$, drop them from the list. Call the resulting list $\psi_1 \dots \psi_n$. Each of the $\psi_i$s, note, is a member of $\Box^-(\Delta)$.$^{13}$ The only wff that could have been dropped, in moving from the $\chi_i$s to the $\psi_i$s, is $\sim\phi$ (since each $\chi_i$ was a member of $\Box^-(\Delta) \cup \{\sim\phi\}$); the following wff is therefore a PL-semantic-consequence of the previous wff, and so is itself a theorem of S:
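$\sim(\psi_1 \wedge \dots \wedge \psi_n \wedge \sim\phi)$
Next, begin a proof in S with a proof of $\sim(\psi_1 \wedge \dots \wedge \psi_n \wedge \sim\phi)$, and then continue as follows:
$\vdots$
$\sim(\psi_1 \wedge \dots \wedge \psi_n \wedge \sim\phi)$
$\psi_1 \to (\psi_2 \to \dots (\psi_n \to \phi) \dots)$   PL
$\Box(\psi_1 \to (\psi_2 \to \dots (\psi_n \to \phi) \dots))$   NEC
$\vdots$
$\Box\psi_1 \to (\Box\psi_2 \to \dots (\Box\psi_n \to \Box\phi) \dots)$   K, PL ($\times n$)
$\sim(\Box\psi_1 \wedge \dots \wedge \Box\psi_n \wedge \sim\Box\phi)$   PL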
$^{13}$ If $\Box^-(\Delta)$ is empty then there will be no $\psi_i$s. In that case, let’s regard “$\psi_1 \wedge \dots \wedge \psi_n \wedge \sim\phi$” as standing for $\sim\phi$.
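This proof establishes that $\vdash_S \sim(\Box\psi_1 \wedge \dots \wedge \Box\psi_n \wedge \sim\Box\phi)$. But since $\Box\psi_1 \dots \Box\psi_n$ and $\sim\Box\phi$ are all in $\Delta$, this contradicts $\Delta$’s S-consistency ($\Box\psi_1 \dots \Box\psi_n$ are members of $\Delta$ because $\psi_1 \dots \psi_n$ are members of $\Box^-(\Delta)$).
We’ve established (*): $\Box^-(\Delta) \cup \{\sim\phi\}$ is S-consistent. It therefore has a maximal S-consistent extension, $\Gamma$, by Theorem 6.3. Since $\Box^-(\Delta) \cup \{\sim\phi\} \subseteq \Gamma$, we know that $\Box^-(\Delta) \subseteq \Gamma$ and that $\sim\phi \in \Gamma$. $\Gamma$ is therefore our desired set; the proof of Lemma 6.4 is complete.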
6.6.7 The coincidence of truth and membership in canonical models
We’re now in a position to put all of our lemmas to work, and prove that
canonical models have the desired property that the wffs true at a world are
exactly the members of that world:
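Theorem 6.5 Where $\mathcal{M}$ $(= \langle \mathcal{W}, \mathcal{R}, \mathcal{I} \rangle)$ is the canonical model for any normal modal system, S, for any wff $\phi$ and any $w \in \mathcal{W}$, $V_{\mathcal{M}}(\phi, w) = 1$ iff $\phi \in w$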
Proof of Theorem 6.5: We’ll use induction. The base case is when $\phi$ has zero connectives, i.e., $\phi$ is a sentence letter. In that case, the result is immediate: by the definition of the canonical model, $\mathcal{I}(\phi, w) = 1$ iff $\phi \in w$; but by the definition of the valuation function, $V_{\mathcal{M}}(\phi, w) = 1$ iff $\mathcal{I}(\phi, w) = 1$.
Now the inductive step. We suppose (ih) that the result holds for $\phi$, $\psi$, and show that it holds for $\sim\phi$, $\phi \to \psi$, and $\Box\phi$ as well:
$\sim$: We must show that $\sim\phi$ is true at $w$ iff $\sim\phi \in w$.
i) $\sim\phi \in w$ iff $\phi \notin w$ (6.1a)
ii) $\phi \notin w$ iff $\phi$ is not true at $w$ (ih)
iii) $\phi$ is not true at $w$ iff $\sim\phi$ is true at $w$ (truth cond. for $\sim$)
iv) so, $\sim\phi$ is true at $w$ iff $\sim\phi \in w$ (i, ii, iii)
$\to$: We must show that $\phi \to \psi$ is true at $w$ iff $\phi \to \psi \in w$:
i) $\phi \to \psi$ is true at $w$ iff either $\phi$ is not true at $w$ or $\psi$ is true at $w$ (truth cond. for $\to$)
ii) So, $\phi \to \psi$ is true at $w$ iff either $\phi \notin w$ or $\psi \in w$ (ih)
iii) So, $\phi \to \psi$ is true at $w$ iff $\phi \to \psi \in w$ (6.1b)
$\Box$: We must show that $\Box\phi$ is true at $w$ iff $\Box\phi \in w$.
First the forwards direction. Assume $\Box\phi$ is true at $w$; then $\phi$ is true at every world $w'$ such that $\mathcal{R}ww'$. By the ih, we have (+) $\phi$ is a member of every such $w'$. Now suppose for reductio that $\Box\phi \notin w$; by 6.1a, $\sim\Box\phi \in w$. Since $w$ is maximal S-consistent, by Lemma 6.4, we know that there exists some maximal S-consistent set $\Gamma$ such that $\Box^-(w) \subseteq \Gamma$ and $\sim\phi \in \Gamma$. By definition of $\mathcal{W}$, $\Gamma$ is a world; by definition of $\mathcal{R}$, $\mathcal{R}w\Gamma$; and so by (+) $\Gamma$ contains $\phi$. But $\Gamma$ also contains $\sim\phi$, which contradicts its S-consistency.
Now the backwards direction. Assume $\Box\phi \in w$. Then by definition of $\mathcal{R}$, for every $w'$ such that $\mathcal{R}ww'$, $\phi \in w'$. By the ih, $\phi$ is true at every such world; hence by the truth condition for $\Box$, $\Box\phi$ is true at $w$. QED.
What was the point of proving theorem 6.5? The whole idea of a canonical
model was to be that a formula is valid in the canonical model for S iff it is a
theorem of S. This fact follows fairly immediately from Theorem 6.5:
Corollary 6.6: $\phi$ is valid in the canonical model for S iff $\vdash_S \phi$
Proof of Corollary 6.6: Let $\langle \mathcal{W}, \mathcal{R}, \mathcal{I} \rangle$ be the canonical model for S. Suppose $\vdash_S \phi$. Then, by 6.2a, $\phi$ is a member of every maximal S-consistent set, and hence $\phi \in w$, for every $w \in \mathcal{W}$. By 6.5, $\phi$ is true in every $w \in \mathcal{W}$, and so is valid in this model. Now for the other direction: suppose $\nvdash_S \phi$. Then $\{\sim\phi\}$ is S-consistent, and so by Theorem 6.3, has a maximal S-consistent extension; thus, $\sim\phi \in w$ for some $w \in \mathcal{W}$; by Theorem 6.5, $\sim\phi$ is therefore true at $w$, and so $\phi$ is not true at $w$, and hence $\phi$ is not valid in this model.
So, we’ve gotten where we wanted to go: we’ve shown that every system
has a canonical model, and that a wff is valid in the canonical model iff it is a
theorem of the system. We now use this fact to prove completeness for our
various systems:
6.6.8 Completeness of systems of MPL
Completeness of K
K’s completeness follows immediately. Any K-valid wff is valid in all MPL-models, and so is valid in the canonical model for K, and so, by Corollary 6.6, is a theorem of K.
For any other system, S, all we need to do to prove S-completeness is to show that the canonical model for S is an S-model. That is, we must show that the accessibility relation in the canonical model for S satisfies the formal constraint for system S (seriality for D, reflexivity for T and so on). This will be made clear in the proof of completeness for D:
Completeness of D
Let us show that in the canonical model for D, the accessibility relation, $\mathcal{R}$, is serial. Let $w$ be any world in that model. We showed above that $\Diamond(P \to P)$ is a theorem of D, and so is a member of $w$ by 6.2a, and so is true at $w$ by 6.5. Thus, by the truth condition for $\Diamond$, there must be some world accessible from $w$ in which $P \to P$ is true; and hence there must be some world accessible from $w$.
Now for D’s completeness. Let $\phi$ be D-valid. It is then valid in all D-models, i.e., all models with a serial accessibility relation. But we just showed that the canonical model for D has a serial accessibility relation. $\phi$ is therefore valid in that model, and hence by 6.6, $\vdash_D \phi$.
Completeness of T
All we need to do is to prove that the accessibility relation in the canonical model for T is reflexive; given that, every T-valid formula is valid in the canonical model for T, and hence by 6.6, every T-valid formula is a T-theorem.
Let $w$ be any world in the canonical model for T. For any $\phi$, $\vdash_T \Box\phi \to \phi$; thus, by 6.2b, it follows that for any $\phi$, if $\Box\phi \in w$ then $\phi \in w$. But this is the definition of $\mathcal{R}ww$.
Completeness of B
We must show that the accessibility relation in the canonical model for B is reflexive and symmetric. Reflexivity can be demonstrated in the same way as it was for T, since every T-theorem is a B-theorem.
Now for symmetry: in the canonical model for B, suppose that $\mathcal{R}wv$. We must show that $\mathcal{R}vw$, that is, that for any $\Box\psi$ in $v$, $\psi \in w$. So, suppose that $\Box\psi \in v$. By 6.5, $\Box\psi$ is true at $v$; since $\mathcal{R}wv$, by the definition of $\Diamond$ it follows that $\Diamond\Box\psi$ is true at $w$, and hence is a member of $w$ by 6.5. Since $\vdash_B \Diamond\Box\psi \to \psi$, by 6.2b, $\psi \in w$.
Completeness of S4
We must show that the accessibility relation in the canonical model for S4 is reflexive and transitive. Again, reflexivity can be demonstrated as it was for T; transitivity remains. Suppose $\mathcal{R}wv$ and $\mathcal{R}vu$. We must show $\mathcal{R}wu$, that is, for any $\Box\psi \in w$, $\psi \in u$. If $\Box\psi \in w$, since $\vdash_{S4} \Box\psi \to \Box\Box\psi$, by 6.2b we have $\Box\Box\psi \in w$. By 6.5, $\Box\Box\psi$ is true at $w$; hence by the truth condition for $\Box$, $\Box\psi$ is true at $v$; again by the truth condition for $\Box$, $\psi$ is true at $u$; by 6.5, $\psi \in u$.
Completeness of S5
We must show that the accessibility relation in the canonical model for S5 is reflexive, symmetric, and transitive. But since each T, B, and S4 theorem is an S5 theorem, the proofs of reflexivity, symmetry, and transitivity from the previous three sections apply here.
Chapter 7
Variations on Propositional Modal Logic
As we have seen, possible worlds are useful for giving a semantics for propositional modal logic. Possible worlds are useful in other areas of logic as well. In this chapter we will briefly examine two other uses for possible worlds: semantics for tense logic, and semantics for intuitionist propositional logic.
7.1 Propositional tense logic$^{1}$
7.1.1 The metaphysics of time
Propositional modal logic concerned the logic of the non-truth-functional sentential operators “it is necessary that” and “it is possible that”. Another set of sentential operators that can be similarly treated are propositional tense operators, such as “it will be the case that”, “it has always been the case that”, etc.
A full logical treatment of natural language obviously requires that we pay attention to temporal notions. Some philosophers, however, think that it requires nothing beyond standard predicate logic. This was the view of many early logicians, most notably Quine.$^{2}$ Here are some examples of how Quine would regiment temporal sentences in predicate logic:
$^{1}$ See Gamut (1991b, section 2.4); Cresswell and Hughes (1996, pp. 127–134).
$^{2}$ See, for example, Quine (1953b).
Everyone who is now an adult was once a child
$\forall x(Axn \to \exists t[Etn \wedge Cxt])$
A dinosaur once trampled a mammal
$\exists x \exists y \exists t(Etn \wedge Dx \wedge My \wedge Txyt)$
Comments:
• ‘$n$’ is to be a name for the present time
• The predicate “$E$” is to be a predicate for the earlier-than relation over moments of time. Thus, “$Etn$” means that $t$ is a time that is before the present moment; and so, “$\exists t(Etn \wedge \dots$” means that “there exists some time, $t$, before the present moment, such that …”.
• we add in a new place for all predicates, for the time at which the object satisfies the predicate. Thus, instead of saying “$Cx$” (“$x$ is a child”) we say “$Cxt$”: “$x$ is a child at $t$”
• the quantifier $\exists x$ is atemporal, ranging over all objects at all times. That’s how we can say that there is a thing, $x$, that is a dinosaur, and which, at some previous time, trampled a mammal.
So: we can use Quine’s strategy to represent temporal notions using standard predicate logic. But some philosophers reject the conception of time that is presupposed by Quine’s strategy. First, Quine presupposes that the past, present, and future are equally real. After all, his symbolization of “A dinosaur once trampled a mammal” says that there is such a thing as a dinosaur. Quine’s view is that time is “space-like”. Other times are as real as the present, just temporally distant, just as other places are equally real but spatially distant. Second, Quine presupposes a distinctive metaphysics of change. Quine accounts for change by adding argument places to temporary predicates like ‘is a child’ and ‘is an adult’. For him, the statement ‘Ted is an adult’ is incomplete in something like the way ‘Philadelphia is north of’ is incomplete: its predicate has an unfilled argument place. When all of a sentence’s argument places are filled, it can no longer change its truth value; as a result (according to some), Quine’s approach leaves no room for genuine change.
Arthur Prior (1967; 1968) and others reject Quine’s picture of time. According to Prior, rather than reducing notions of past, present, and future to notions about what is true at times, we must instead include certain special temporal expressions (sentential tense operators) in our most basic languages, and develop an account of their logic. Thus he initiated the study of tense logic.
One of Prior’s tense operators was $P$, symbolizing “it was the case that”. Grammatically, $P$ behaves like the $\sim$ and the $\Box$: it attaches to a complete sentence and forms another complete sentence. Thus, if $R$ symbolizes “it is raining”, then $PR$ symbolizes “it was raining”. If a sentence letter occurs by itself, outside of the scope of all temporal operators, then for Prior it is to be read as present-tensed. Thus, it was appropriate to let $R$ symbolize “It is raining”, i.e., it is now raining.
Suppose we symbolize “there exists a dinosaur” as ∃xDx. Prior would then
symbolize “There once existed a dinosaur” as:
P∃xDx
And according to Prior, P∃xDx is not to be analyzed as saying that there exist
dinosaurs located in the past. For him, there is no further analysis of P∃xDx.
Prior’s attitude toward P is like everyone else’s attitude toward the ∼: no one
thinks that ∼∃xUx, “there are no unicorns”, is to be analyzed as saying that
there exist unreal unicorns. Further, Prior can represent the fact that I am
now, but have not always been, an adult, without adding argument places for
times to predicates. Symbolizing ‘is an adult’ with ‘A’, and ‘Ted’ with ‘t ’, Prior
would write: At ∧P∼At (“Ted is an adult, but it was the case that: Ted isn’t an
adult”). For Prior, the sentence At (“Ted is an adult”) is a complete statement,
but nevertheless can alter its truth value.
7.1.2 Tense operators
One can study various tense operators. Here is one group:
Gφ: it is, and is always going to be the case that φ
Hφ: it is, and always has been the case that φ
Fφ: it either is, or will at some point in the future be the case that φ
Pφ: it either is, or was at some point in the past the case that φ
Notice how these tense operators come in interdefinable pairs (related to each other as the $\Box$ and the $\Diamond$):
$G\phi$ iff $\sim F \sim\phi$
$H\phi$ iff $\sim P \sim\phi$
Thus one could start with just two of them, $G$ and $H$ say, and define the others. And one could use these two to define further tense operators, for example $A$ and $S$, for “always” and “sometimes”:
$A\phi$ iff $H\phi \wedge G\phi$
$S\phi$ iff $\sim H \sim\phi \vee \sim G \sim\phi$ (i.e., iff $P\phi \vee F\phi$)
There are further tense operators that are not definable in terms of those we’ve been considering so far. There are, for example, metrical tense operators, which concern what happened or will happen at specific temporal distances in the past or future:
$P_x\phi$: it was the case $x$ minutes ago that $\phi$
$F_x\phi$: it will be the case in $x$ minutes that $\phi$
We will not consider metrical tense operators further.
The (non-metrical) tense operators, as interpreted above, “include the present moment”. For example, if $G\phi$ is now true, then $\phi$ must now be true. One could specify an alternate interpretation on which they do not include the present moment:
Gφ: it is always going to be the case that φ
Hφ: it always has been the case that φ
Fφ: it will at some point in the future be the case that φ
Pφ: it was at some point in the past the case that φ
Whether we take the tense operators as including the present moment will affect what kind of logic we develop for them. For example, the “$\Box$-like” operators $G$ and $H$ will obey the T-principle ($G\phi$ and $H\phi$ will imply $\phi$) if they are interpreted as including the present moment, but not otherwise.
7.1.3 Syntax of tense logic
In this chapter we will study only propositional tense logic. Its syntax is straightforward: each tense operator has the grammar of the $\sim$ and the $\Box$. For example, if we take the tense operators $G$ and $H$ as basic, then we could begin with the definition of a wff from propositional logic (section 2.1) and add the following clause:
If $\phi$ is a wff then so are:
$G\phi$
$H\phi$
7.1.4 Possible worlds semantics for tense logic
Let’s turn now to semantics. The most natural semantics for tense logic is a possible-worlds-style semantics, in which we think of the members of $\mathcal{W}$ as times rather than possible worlds, we think of the accessibility relation as the temporal ordering relation, and we think of the interpretation function as assigning truth values to sentence letters at times.
(A Priorean faces hard philosophical questions about the use of such a
semantics, since according to him, the semantics doesn’t accurately model the
metaphysics of time. The questions are like those questions that confront
someone who uses possible worlds semantics for modal logic, but doesn’t think
that possible worlds are part of the metaphysics of modality.)
This change in how we think about possible-worlds models doesn’t require any change to the definition of a model from section ??. A model, $\mathcal{M}$, is still an ordered triple, whose first member is a nonempty set, whose second member is a binary relation over that set, and whose third member is a function that assigns to each sentence letter a truth value relative to each member of the first member of $\mathcal{M}$. To signify the change in how we’re thinking about these models, however, let’s change our notation. Let’s call a model’s first member $\mathcal{T}$, rather than $\mathcal{W}$, and let’s use variables like $t$, $t'$, etc., for its members. And since we’re thinking of a model’s second member as a relation of temporal ordering (the at-least-as-early-as relation over times), let’s rename it too: “$\leq$”. (If we were interpreting the tense operators as not including the present moment, then we would think of the temporal ordering relation as the strictly-earlier-than relation, and would write it “$<$”.) Thus, instead of writing “$\mathcal{R}ww'$”, we write: $t \leq t'$.
We’ll need to update the definition of the valuation function. The clauses for the propositional connectives remain the same; what we need to add is clauses for the tense operators. Let’s take just $G$ and $H$ as primitive; here are the clauses:
$V_{\mathcal{M}}(G\phi, t) = 1$ iff for every $t'$ such that $t \leq t'$, $V_{\mathcal{M}}(\phi, t') = 1$
$V_{\mathcal{M}}(H\phi, t) = 1$ iff for every $t'$ such that $t' \leq t$, $V_{\mathcal{M}}(\phi, t') = 1$
If we define $F$ and $P$ as $\sim G \sim$ and $\sim H \sim$, respectively, then we get the following derived clauses:
$V_{\mathcal{M}}(F\phi, t) = 1$ iff for some $t'$ such that $t \leq t'$, $V_{\mathcal{M}}(\phi, t') = 1$
$V_{\mathcal{M}}(P\phi, t) = 1$ iff for some $t'$ such that $t' \leq t$, $V_{\mathcal{M}}(\phi, t') = 1$
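The clauses above translate fairly directly into a recursive evaluator. What follows is only a sketch under stated assumptions: formulas are encoded as nested tuples, a finite model is given as a list of times, a set of ordered pairs for the ordering, and a set of (letter, time) pairs for the interpretation. The function name `V` follows the text; the encoding and the sample model are hypothetical.

```python
# Formulas: a sentence letter is a string such as "R"; compound formulas are
# tuples ("not", f), ("imp", f, g), ("G", f), ("H", f), ("F", f), ("P", f).

def V(f, t, times, leq, interp):
    """Return 1 or 0: the value of formula f at time t.

    times  -- collection of times (the model's first member)
    leq    -- set of pairs (a, b), read "a is at least as early as b"
    interp -- set of (letter, time) pairs at which a letter is assigned 1
    """
    def val(g, u):
        return V(g, u, times, leq, interp)

    if isinstance(f, str):
        return 1 if (f, t) in interp else 0
    op = f[0]
    if op == "not":
        return 1 - val(f[1], t)
    if op == "imp":
        return 1 if val(f[1], t) == 0 or val(f[2], t) == 1 else 0
    if op == "G":   # every time at least as late as t
        return 1 if all(val(f[1], u) == 1 for u in times if (t, u) in leq) else 0
    if op == "H":   # every time at least as early as t
        return 1 if all(val(f[1], u) == 1 for u in times if (u, t) in leq) else 0
    if op == "F":   # defined as ~G~
        return val(("not", ("G", ("not", f[1]))), t)
    if op == "P":   # defined as ~H~
        return val(("not", ("H", ("not", f[1]))), t)
    raise ValueError(f"unknown operator: {op!r}")

# A sample three-time model: a reflexive linear order, with R true only at time 2.
times = [1, 2, 3]
leq = {(a, b) for a in times for b in times if a <= b}
interp = {("R", 2)}

print(V(("F", "R"), 1, times, leq, interp))   # 1: R holds at the later time 2
print(V(("P", "R"), 3, times, leq, interp))   # 1: R held at the earlier time 2
print(V(("G", "R"), 2, times, leq, interp))   # 0: R fails at time 3
```

Treating $F$ and $P$ as abbreviations rather than giving them their own clauses keeps the code in step with the derivation of the derived clauses.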
Call an MPL-model, thought of in this way, a “PTL-model” (for “Priorean Tense Logic”). And say that a wff is PTL-valid iff it is true at every time in every PTL-model. Given our discussion of system K from chapter 6, we already know a lot about PTL-validity. The truth condition for the $G$ is the same as the truth condition for the $\Box$ in MPL. Thus, for each K-valid formula $\phi$ of MPL, there is a PTL-valid formula of tense logic: simply replace each $\Box$ in $\phi$ with $G$. Replacing $\Box$s with $G$s in the K-valid formula $\Box(P \wedge Q) \to \Box P$ results in the PTL-valid formula $G(P \wedge Q) \to GP$, for example. Similarly, replacing $\Box$s with $H$s in a K-valid formula also results in a PTL-valid formula.
But there are further cases of PTL-validity that depend on the interaction between different tense operators, and hence have no direct analog in MPL. For example, we can demonstrate that $\vDash_{PTL} \phi \to GP\phi$:
i) Suppose for reductio that in some PTL-model $\mathcal{M}$ $(= \langle \mathcal{T}, \leq, \mathcal{I} \rangle)$ and some $t \in \mathcal{T}$, $V_{\mathcal{M}}(\phi \to GP\phi, t) = 0$. (I henceforth drop the subscript $\mathcal{M}$.)
ii) So $V(\phi, t) = 1$ and …
iii) …$V(GP\phi, t) = 0$.
iv) Given iii), by the truth condition for $G$: for some $t' \in \mathcal{T}$, $t \leq t'$ and $V(P\phi, t') = 0$
v) Given iv), by the (derived) truth condition for $P$: for every $t'' \in \mathcal{T}$, if $t'' \leq t'$ then $V(\phi, t'') = 0$
vi) letting $t''$ in v) be $t$, given that $t \leq t'$ (iv), we have: $V(\phi, t) = 0$, contradicting ii).
Similarly, one can show that $\vDash_{PTL} \phi \to HF\phi$.
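As a sanity check on the two validity claims, one can enumerate every model with at most three times and confirm that no counterexample turns up. This is a sketch and not a proof: it covers only small finite models and only the instance of the schema in which $\phi$ is a single sentence letter. The names `holds`, `F`, `P`, and `true_in_all_small_models` are hypothetical.

```python
from itertools import product

def holds(f, t, times, leq, true_at):
    """Truth of f at time t; any string counts as the single sentence letter."""
    if isinstance(f, str):
        return t in true_at
    op = f[0]
    if op == "not":
        return not holds(f[1], t, times, leq, true_at)
    if op == "imp":
        return (not holds(f[1], t, times, leq, true_at)) or holds(f[2], t, times, leq, true_at)
    if op == "G":
        return all(holds(f[1], u, times, leq, true_at) for u in times if (t, u) in leq)
    if op == "H":
        return all(holds(f[1], u, times, leq, true_at) for u in times if (u, t) in leq)
    raise ValueError(f"unknown operator: {op!r}")

def F(f):
    """F is treated as an abbreviation for ~G~."""
    return ("not", ("G", ("not", f)))

def P(f):
    """P is treated as an abbreviation for ~H~."""
    return ("not", ("H", ("not", f)))

def true_in_all_small_models(f, max_size=3):
    """Check f at every time of every model with at most max_size times."""
    for n in range(1, max_size + 1):
        times = list(range(n))
        pairs = [(a, b) for a in times for b in times]
        for rel_bits in product([False, True], repeat=len(pairs)):
            leq = {p for p, bit in zip(pairs, rel_bits) if bit}   # arbitrary relation
            for val_bits in product([False, True], repeat=n):
                true_at = {t for t in times if val_bits[t]}
                if not all(holds(f, t, times, leq, true_at) for t in times):
                    return False
    return True

print(true_in_all_small_models(("imp", "R", ("G", P("R")))))   # True
print(true_in_all_small_models(("imp", "R", ("H", F("R")))))   # True
print(true_in_all_small_models(("imp", "R", ("G", "R"))))      # False
```

The third call is there for contrast: the formula $R \to GR$ has a two-time countermodel, so the search does distinguish valid from invalid formulas.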
7.1.5 Formal constraints on ≤
PTL-validity is not a good model for logical truth in tense logic. We have so far placed no constraints on the formal properties of the relation $\leq$ in a PTL-model. That means that there are PTL-models in which the $\leq$ looks nothing like a temporal ordering. We don’t normally think that time could consist of a number of wholly temporally disconnected points, for example, or of many points each of which is at-least-as-early-as all of the rest, and so on, but there are PTL-models answering to these strange descriptions. As we have defined them, PTL-valid formulas must be true at every time in every PTL-model, even these strange models. This means that many tense-logical statements that ought, intuitively, to count as logical truths, are in fact not PTL-valid.
The formula $GP \to GGP$ is an example. It is PTL-invalid, for consider a model with three times, $t_1$, $t_2$, and $t_3$, where $t_1 \leq t_2$, $t_2 \leq t_3$, and $t_1 \not\leq t_3$, and in which $P$ is true at $t_1$ and $t_2$, but not at $t_3$:
$t_1\ (P) \longrightarrow t_2\ (P) \longrightarrow t_3\ (\sim P)$
In this model, $GP \to GGP$ is false at time $t_1$. But $GP \to GGP$ is, intuitively, a logical truth. If it is and will always be raining, then surely it must also be true that: it is and always will be the case that: it is and always will be raining. The problem, of course, is that the $\leq$ relation in the model we considered is intransitive, whereas, one normally assumes, the at-least-as-early-as relation must be transitive.
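The three-time countermodel can be checked mechanically. The sketch below assumes a tuple encoding of formulas and a hypothetical helper `transitive_closure`; it evaluates $GP \to GGP$ at $t_1$ in the model just described, and then again after the ordering has been closed under transitivity.

```python
def holds(f, t, times, leq, true_at):
    """Truth of f at time t; the only sentence letter used here is "P"."""
    if isinstance(f, str):
        return t in true_at
    op = f[0]
    if op == "imp":
        return (not holds(f[1], t, times, leq, true_at)) or holds(f[2], t, times, leq, true_at)
    if op == "G":
        return all(holds(f[1], u, times, leq, true_at) for u in times if (t, u) in leq)
    raise ValueError(f"unknown operator: {op!r}")

def transitive_closure(rel):
    """Smallest transitive relation containing rel."""
    closure = set(rel)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not extra:
            return closure
        closure |= extra

times = ["t1", "t2", "t3"]
leq = {("t1", "t2"), ("t2", "t3")}      # t1 does not bear the relation to t3
true_at = {"t1", "t2"}                  # P is true at t1 and t2 only

GP = ("G", "P")
formula = ("imp", GP, ("G", GP))        # GP -> GGP

print(holds(formula, "t1", times, leq, true_at))                       # False
closed = transitive_closure(leq)
print(all(holds(formula, t, times, closed, true_at) for t in times))   # True
```

Adding reflexive pairs to `leq` does not rescue the formula at $t_1$; only the missing pair from $t_1$ to $t_3$ does, which is the point about transitivity.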
So: a more interesting notion of validity for PTL formulas results from considering only PTL-models with transitive $\leq$ relations. Doing this validates every instance of the “S4” schemas:
$G\phi \to GG\phi$
$H\phi \to HH\phi$
There are other interesting constraints on $\leq$ that one might impose. One might impose reflexivity, for example. This is natural to impose if we are construing the tense operators as including the present moment; not otherwise. Imposing reflexivity validates the “T-schemas” $G\phi \to \phi$ and $H\phi \to \phi$.
One might also impose “connectivity” of some sort. Connective principles disallow, in various ways, “incomparable” pairs of times: pairs of times neither of which bears the $\leq$ relation to the other. The strongest sort disallows all incomparable pairs:
“Strong connectivity”: for every $t'$, $t''$, either $t' \leq t''$ or $t'' \leq t'$
A weaker sort merely disallows incomparable pairs when each member of the
pair is after or before some one time:
“Weak connectivity”: for every $t$, $t'$, $t''$, IF: either $t \leq t'$ and $t \leq t''$, or $t' \leq t$ and $t'' \leq t$, THEN: either $t' \leq t''$ or $t'' \leq t'$
Weak connectivity disallows “branches” in the temporal order but allows distinct timelines wholly disconnected from one another; strong connectivity ensures that all times are part of a single non-branching structure. Each sort validates every instance of the following schemas:
$G(G\phi \to \psi) \vee G(G\psi \to \phi)$
$H(H\phi \to \psi) \vee H(H\psi \to \phi)$
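Here’s a sketch of a validity proof for the first schema:
i) Assume for reductio that $V(G(G\phi \to \psi) \vee G(G\psi \to \phi), t) = 0$ in some PTL-model whose accessibility relation is weakly connected.
ii) So $V(G(G\phi \to \psi), t) = 0$ and $V(G(G\psi \to \phi), t) = 0$
iii) so there exist times, $t'$ and $t''$, such that $t \leq t'$ and $t \leq t''$, and $V(G\phi \to \psi, t') = 0$ and $V(G\psi \to \phi, t'') = 0$
iv) thus, $G\phi$ is true at $t'$, $\psi$ is false at $t'$, $G\psi$ is true at $t''$, and $\phi$ is false at $t''$
v) but by weak connectivity, $t' \leq t''$ or $t'' \leq t'$. Either way iv) leads to a contradiction: if $t' \leq t''$, then since $G\phi$ is true at $t'$, $\phi$ is true at $t''$; if $t'' \leq t'$, then since $G\psi$ is true at $t''$, $\psi$ is true at $t'$.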
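The role of weak connectivity in this argument can be probed by brute force. The sketch below is an illustration under assumptions, not part of the proof: it takes the instance of the first schema with sentence letters $A$ and $B$ in place of $\phi$ and $\psi$, searches all three-time models for a time at which that instance is false, and does so once with the search restricted to weakly connected orderings and once without restriction. The names `holds`, `weakly_connected`, and `has_countermodel` are hypothetical.

```python
from itertools import product

def holds(f, t, times, leq, interp):
    """Truth of f at time t; interp is a set of (letter, time) pairs."""
    if isinstance(f, str):
        return (f, t) in interp
    op = f[0]
    if op == "imp":
        return (not holds(f[1], t, times, leq, interp)) or holds(f[2], t, times, leq, interp)
    if op == "or":
        return holds(f[1], t, times, leq, interp) or holds(f[2], t, times, leq, interp)
    if op == "G":
        return all(holds(f[1], u, times, leq, interp) for u in times if (t, u) in leq)
    raise ValueError(f"unknown operator: {op!r}")

def weakly_connected(times, leq):
    """Weak connectivity exactly as defined in the text."""
    for t, a, b in product(times, repeat=3):
        both_after = (t, a) in leq and (t, b) in leq
        both_before = (a, t) in leq and (b, t) in leq
        if (both_after or both_before) and (a, b) not in leq and (b, a) not in leq:
            return False
    return True

# G(GA -> B) v G(GB -> A)
SCHEMA = ("or", ("G", ("imp", ("G", "A"), "B")), ("G", ("imp", ("G", "B"), "A")))

def has_countermodel(n, require_weak_connectivity):
    """Search all n-time models for a time at which SCHEMA is false."""
    times = list(range(n))
    pairs = [(a, b) for a in times for b in times]
    cells = [(letter, t) for letter in ("A", "B") for t in times]
    for rel_bits in product([False, True], repeat=len(pairs)):
        leq = {p for p, bit in zip(pairs, rel_bits) if bit}
        if require_weak_connectivity and not weakly_connected(times, leq):
            continue
        for val_bits in product([False, True], repeat=len(cells)):
            interp = {c for c, bit in zip(cells, val_bits) if bit}
            if any(not holds(SCHEMA, t, times, leq, interp) for t in times):
                return True
    return False

print(has_countermodel(3, require_weak_connectivity=True))    # False
print(has_countermodel(3, require_weak_connectivity=False))   # True
```

A branching ordering in which time 0 precedes both 1 and 2, with 1 and 2 incomparable, is the kind of model the unrestricted search finds.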
There are other constraints one might impose, for example anti-symmetry (no distinct times bear $\leq$ to each other), density (between any two times there is another time), or eternality (there exists neither a first nor a last time). In some cases, imposing a constraint results in an interesting schema being validated. Further, some constraints are more philosophically controversial than others.
Notice that one should not impose symmetry on $\leq$. Obviously if one time is at least as early as another, then the second time needn’t be at least as early as the first. Moreover, imposing symmetry would validate the “B” schemas $FG\phi \to \phi$ and $PH\phi \to \phi$; but these clearly ought not to be validated. Take the first, for example: it doesn’t follow from “it will be the case that it is always going to be the case that I’m dead” that I’m (now) dead.
So far we have been interpreting the tense operators as including the present moment. That led us to call the temporal ordering relation in our models “$\leq$”, and require that it be reflexive. What if we instead interpreted the tense operators as not including the present moment? We would then call the temporal ordering relation “$<$”, and think of it as the earlier-than relation; and we would no longer require that it be reflexive. Indeed, it would be natural to require that it be irreflexive: that it never be the case that $t < t$.
We have considered only the semantic approach to tense logic. What of a proof-theoretic approach? Given the similarity between tense logic and modal logic, it should be no surprise that axiom systems similar to those of section 6.4 can be developed for tense logic. Moreover, the techniques developed in sections 6.5–6.6 can be used to give soundness and completeness proofs for tense-logical axiom systems, relative to the possible-worlds semantics that we have developed in this section.
7.2 Intuitionist propositional logic
7.2.1 Kripke semantics for intuitionist propositional logic$^{3}$
As we saw in section 3.4, intuitionists think of meaning in proof-theoretic terms, rather than truth-theoretic terms, and as a result reject classical propositional logic in favor of intuitionist propositional logic. We have already developed a proof theory for intuitionist logic: we began with the original sequent calculus and then dropped double-negation elimination while adding ex falso. But we still need a semantics. What should such a semantics look like?
In this book we have been thinking of logical truth, on the semantic conception (i.e., validity), as “truth no matter what”. It is natural for intuitionists to think, rather, in terms of “provability no matter what”. We will lay out a semantics for intuitionist propositional logic, due to Saul Kripke, that is based on this idea. The semantics will be like that of possible-worlds semantics for propositional modal logic, and so it will include valuation functions that assign the values 1 and 0 to formulas relative to the members of a set $\mathcal{W}$. But the idea is to now think of the members of $\mathcal{W}$ as stages in the construction of proofs, rather than as possible worlds, and to think of 1 and 0 as “proof statuses”, rather than truth values. That is, we are to think of $V(\phi, w) = 1$ as meaning that formula $\phi$ has been proved at stage $w$.
$^{3}$ See (Priest, 2001, chapter 6)
Let us treat the $\wedge$ and the $\vee$ as primitive connectives. Here is Kripke’s semantics for intuitionist propositional logic. (To emphasize the different way we are regarding the “worlds”, we rename $\mathcal{W}$ “$\mathcal{S}$”, for stages in the construction of proofs, and we will use the variables $s$, $s'$, etc., for its members.)
An intuitionism-model is a triple $\langle \mathcal{S}, \mathcal{R}, \mathcal{I} \rangle$, such that:
i) $\mathcal{S}$ is a nonempty set (“proof stages”)
ii) $\mathcal{R}$ is a binary relation over $\mathcal{S}$ (“accessibility”) that is reflexive, transitive, and obeys the heredity condition: for any sentence letter $\alpha$, if $\mathcal{I}(\alpha, s) = 1$ and $\mathcal{R}ss'$ then $\mathcal{I}(\alpha, s') = 1$
iii) $\mathcal{I}$ is a function from sentence letters and stages to truth values (“interpretation function”).
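Given any such model, we can define the corresponding valuation function thus:
a) $V_{\mathcal{M}}(\alpha, s) = \mathcal{I}(\alpha, s)$ for any sentence letter $\alpha$
b) $V_{\mathcal{M}}(\phi \wedge \psi, s) = 1$ iff $V_{\mathcal{M}}(\phi, s) = 1$ and $V_{\mathcal{M}}(\psi, s) = 1$
c) $V_{\mathcal{M}}(\phi \vee \psi, s) = 1$ iff $V_{\mathcal{M}}(\phi, s) = 1$ or $V_{\mathcal{M}}(\psi, s) = 1$
d) $V_{\mathcal{M}}(\sim\phi, s) = 1$ iff for every $s'$ such that $\mathcal{R}ss'$, $V_{\mathcal{M}}(\phi, s') = 0$
e) $V_{\mathcal{M}}(\phi \to \psi, s) = 1$ iff for every $s'$ such that $\mathcal{R}ss'$, either $V_{\mathcal{M}}(\phi, s') = 0$ or $V_{\mathcal{M}}(\psi, s') = 1$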
Note that the truth conditions for the $\to$ and the $\sim$ at stage $s$ no longer depend exclusively on what $s$ is like; they are sensitive to what happens at stages accessible from $s$. Unlike the $\wedge$ and the $\vee$, $\to$ and $\sim$ are not “truth-functional” (relative to a stage); they behave like modal operators.
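Clauses a) through e) can be tried out on a small model. The sketch below assumes a tuple encoding of formulas and a finite model given as a list of stages, a set of accessibility pairs, and a set of (letter, stage) pairs; the helper `is_intuitionism_model` and the two-stage example are hypothetical illustrations. In the example, $P$ has not been proved at stage 0 but has been proved at the later stage 1.

```python
def is_intuitionism_model(stages, acc, interp):
    """Check reflexivity, transitivity, and the heredity condition."""
    reflexive = all((s, s) in acc for s in stages)
    transitive = all((a, d) in acc for (a, b) in acc for (c, d) in acc if b == c)
    heredity = all((letter, s2) in interp
                   for (letter, s1) in interp
                   for (a, s2) in acc if a == s1)
    return reflexive and transitive and heredity

def V(f, s, stages, acc, interp):
    """Return 1 or 0: the proof status of formula f at stage s."""
    def val(g, u):
        return V(g, u, stages, acc, interp)

    if isinstance(f, str):                       # clause a)
        return 1 if (f, s) in interp else 0
    op = f[0]
    later = [u for u in stages if (s, u) in acc]  # stages accessible from s
    if op == "and":                              # clause b)
        return 1 if val(f[1], s) == 1 and val(f[2], s) == 1 else 0
    if op == "or":                               # clause c)
        return 1 if val(f[1], s) == 1 or val(f[2], s) == 1 else 0
    if op == "not":                              # clause d)
        return 1 if all(val(f[1], u) == 0 for u in later) else 0
    if op == "imp":                              # clause e)
        return 1 if all(val(f[1], u) == 0 or val(f[2], u) == 1 for u in later) else 0
    raise ValueError(f"unknown connective: {op!r}")

stages = [0, 1]
acc = {(0, 0), (0, 1), (1, 1)}
interp = {("P", 1)}

print(is_intuitionism_model(stages, acc, interp))                      # True
print(V(("or", "P", ("not", "P")), 0, stages, acc, interp))            # 0
print(V(("imp", ("not", ("not", "P")), "P"), 0, stages, acc, interp))  # 0
print(V(("imp", "P", "P"), 0, stages, acc, interp))                    # 1
```

Both $P \vee \sim P$ and $\sim\sim P \to P$ receive 0 at stage 0 of this model, which fits the intuitionist rejection of excluded middle and double-negation elimination.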
Let us think intuitively about these models. We are to think of each member of $\mathcal{S}$ as a stage in the construction of mathematical proofs. At any stage, one has come up with proofs of some things but not others. When $V$ assigns 1 to a formula at a stage, that means intuitively that as of that state of information, the formula has been proven. The assignment of 0 means that the formula has not been proven thus far (though it might nevertheless be proven in the future).
The holding of the accessibility relation R represents which future stages
are possible, given one’s current stage. If s′ is accessible from s, that means
that s′ contains all the proofs in s, plus perhaps more. Given this understanding,
reflexivity and transitivity are obviously correct to impose, as is the heredity
condition, since (on the somewhat idealized conception of proof we are
operating with) one does not lose proved information when constructing further
proofs. But the accessibility relation will not in general be symmetric: for
sometimes one will come across a new proof that one did not formerly have.
Let’s also think through why the truth conditions for →, ∧, ∨ and ∼ are
intuitively correct. Intuitionists, recall, associate with each propositional
connective a conception of what proofs of formulas built using that connective
must be like:
a proof of ∼φ is a proof that φ leads to a contradiction
a proof of φ∧ψ is a proof of φ and a proof of ψ
a proof of φ∨ψ is a proof of φ or a proof of ψ
a proof of φ→ψ is a construction that can be used to turn any proof
of φ into a proof of ψ
This is what inspires the definition of a valuation function. As of a time, one
has proved φ∧ψ iff one has proved both φ and ψ then. As of a time, one has
proved φ∨ψ iff one has proved one of the disjuncts. As for ∼, a proof of ∼φ,
according to an intuitionist, is a proof that φ leads to a contradiction. But i) if
one has proved that φ leads to a contradiction, then in no future stage could
one prove φ (at least if one’s methods of proof are consistent); and ii) if one
has not proved that φ leads to a contradiction, this leaves open the possibility
of a future stage at which one proves φ. Thus the valuation condition for ∼ is
justified. [4] As for →: if one has a method of converting proofs of φ into proofs
of ψ, then there could never be a possible future in which one has a proof of
φ but not one of ψ. Conversely, if one lacks such a method, then it should be
possible one day to have a proof of φ without being able to convert it into a
proof of ψ, and thus without then having a proof of ψ.

[4] I’m fudging here a bit. Are the stages idealized so that one has already proven
everything one can in principle prove? Clearly not, for then any formula assigned 1
at any accessible stage should already be assigned 1 at that stage. But if stages are
not idealized in this way, then why suppose that the assignment of 0 at a stage to ∼φ
(failure to prove that φ leads to a contradiction) ensures that there is some future
stage at which φ is proved? A similar worry confronts the valuation condition for →.
We can now define intuitionist semantic consequence and validity in the
obvious way:
⊨_I φ iff V_M(φ, s) = 1 for each stage s in each intuitionist model M.
Γ ⊨_I φ iff for every intuitionist model M and every stage s in M,
if V_M(γ, s) = 1 for each γ ∈ Γ, then V_M(φ, s) = 1
7.2.2 Examples and proofs
First some examples. Let’s establish that:
Q ⊨_I P→Q
Take any model and any stage s; assume for reductio that V(Q, s) = 1 and
V(P→Q, s) = 0. Thus, for some s′, Rss′ and V(P, s′) = 1 and V(Q, s′) = 0. But
this violates heredity, since V(Q, s) = 1 and Rss′ require V(Q, s′) = 1.
Next:
P→Q ⊨_I ∼Q→∼P (contraposition):
Suppose V(P→Q, s) = 1 and V(∼Q→∼P, s) = 0. Given the latter, there’s
some stage s′ such that Rss′ and V(∼Q, s′) = 1 and V(∼P, s′) = 0. Given the
latter, for some s″, Rs′s″ and V(P, s″) = 1. Given the former, V(Q, s″) = 0.
Given transitivity, Rss″. Given the truth of P→Q at s, either V(P, s″) = 0 or
V(Q, s″) = 1. Contradiction.
In section 3.4 we asserted (but did not prove) that ∅ ⊢ P∨∼P is an unprovable
sequent. Here we’ll establish:
⊭_I P∨∼P:
Here’s a model in which P∨∼P is valuated as 0 in stage r:
  Stage r: P is 0, ∼P is 0 (bottom asterisk), so P∨∼P is 0.
  Stage a (seen by r): P is 1 (top asterisk).
As in section ??, we use asterisks to remind ourselves of commitments that
concern other worlds/stages. The asterisk is under ∼P in stage r because a
negation with value 0 carries a commitment to including some stage at which
the negated formula is 1. The asterisk is over the P in stage a because of the
heredity condition: a sentence letter valuated 1 carries a commitment to make
that letter 1 in every accessible stage. (Likewise, negations and conditionals
valuated as 1 generate top-asterisks, and conditionals valuated as 0 generate
bottom-asterisks.) The official model:
S: {r, a}
R: {〈r, r〉, 〈a, a〉, 〈r, a〉}
I(P, a) = 1, all other atomics 0 everywhere
(I’ll skip the official models from now on.) So, ⊭_I P∨∼P. Given the soundness
of the intuitionist proof system (proved below), it follows that ∅ ⊢ P∨∼P is
indeed an unprovable sequent.
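Before trusting a countermodel, it is worth confirming that it really is an intuitionist model. The following Python sketch checks reflexivity, transitivity and heredity for the official model just given. The function names and the set-of-pairs and dictionary encodings are assumptions chosen for illustration.

```python
def is_reflexive(S, R):
    """Every stage sees itself."""
    return all((s, s) in R for s in S)

def is_transitive(R):
    """If a sees b and b sees d, then a sees d."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def obeys_heredity(R, I, letters):
    """If a letter is 1 at s and s sees t, the letter is 1 at t."""
    return all(I.get((p, t), 0) == 1
               for p in letters
               for (s, t) in R
               if I.get((p, s), 0) == 1)

# The official model for the P∨∼P example.
S = {"r", "a"}
R = {("r", "r"), ("a", "a"), ("r", "a")}
I = {("P", "a"): 1}                 # all other atomics are 0 everywhere

print(is_reflexive(S, R), is_transitive(R), obeys_heredity(R, I, ["P"]))

# A hypothetical variant with P proved at r but not at a would "lose" a proof.
I_bad = {("P", "r"): 1}
print(obeys_heredity(R, I_bad, ["P"]))
```

The variant `I_bad` is not from the text; it is included only to show what a heredity violation looks like.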
Next example:
∼∼P ⊭_I P:
  Stage r: ∼∼P is 1 (top asterisk), ∼P is 0 (bottom asterisk), P is 0.
  Stage a (seen by r): P is 1 (top asterisk), ∼P is 0 (bottom asterisk).
Note: since ∼∼P is 1 at r, that means that ∼P must be 0 at every stage
that r sees. Now, Rrr, so ∼P must be 0 at r. So r must see some stage in
which P is 1. Stage a takes care of that.
Next:
∼(P∧Q) ⊭_I ∼P∨∼Q:
  Stage r: ∼(P∧Q) is 1 (top asterisk); ∼P∨∼Q is 0, so ∼P and ∼Q are each 0
  (bottom asterisks).
  Stage a (seen by r): P is 1 (top asterisk), P∧Q is 0.
  Stage b (seen by r): Q is 1 (top asterisk), P∧Q is 0.
Next example:
∼P∨∼Q ⊨_I ∼(P∧Q):
Suppose V(∼P∨∼Q, s) = 1 and V(∼(P∧Q), s) = 0. Given the latter, for some
s′, Rss′ and V(P∧Q, s′) = 1. So, V(P, s′) = 1 and V(Q, s′) = 1. Given the
former, either V(∼P, s) = 1 or V(∼Q, s) = 1. If the former then V(P, s′) = 0; if
the latter V(Q, s′) = 0. Contradiction either way.
Final example:
P→(Q∨R) ⊭_I (P→Q)∨(P→R):
We begin thus:
  Stage r: P→(Q∨R) is 1 (top asterisk); (P→Q)∨(P→R) is 0, so P→Q and P→R
  are each 0 (bottom asterisks).
We now discharge the bottom asterisks:
  Stage r: as above.
  Stage a (seen by r): P is 1 (top asterisk), Q is 0.
  Stage b (seen by r): P is 1, R is 0 (the values needed to make P→R 0 at r).
7.2.3 Soundness and other facts about intuitionist validity
In this section we establish a few facts about intuitionist validity.
Semantic consequence and conditionals
Let’s establish two facts connecting the relation of semantic consequence to
conditionals.
Deduction theorem: if φ ⊨_I ψ then ⊨_I φ→ψ
Converse deduction theorem: if ⊨_I φ→ψ then φ ⊨_I ψ
Proof: for the first, suppose φ ⊨_I ψ, and suppose for reductio that
V(φ→ψ, s) = 0. Then for some s′ (that s sees), V(φ, s′) = 1 and V(ψ, s′) = 0,
which contradicts φ ⊨_I ψ.
As for the second, suppose ⊨_I φ→ψ, and suppose for reductio that V(φ, s) = 1
while V(ψ, s) = 0. By ⊨_I φ→ψ, V(φ→ψ, s) = 1. By reflexivity, either
V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.
Intuitionistic consequence implies classical consequence
That is: if Γ ⊨_I φ then Γ ⊨_PL φ, where ⊨_PL is the relation of classical
semantic consequence (from section 2.3).
Proof: Suppose Γ ⊨_I φ, and let I be a PL-interpretation in which every
member of Γ is true; we must show that V_I(φ) = 1. (V_I, recall, is the classical
valuation for I.) Consider the intuitionist model with just one stage, r, in which
formulas have the same valuations as they have in the classical interpretation,
i.e., 〈{r}, {〈r, r〉}, I*〉, where I*(α, r) = I(α) for each sentence letter α. It’s
easy to check that since the intuitionist model has only one stage, the classical
and intuitionist truth conditions collapse in this case, so that for every wff χ,
V_I*(χ, r) = V_I(χ). So, since every member of Γ is true in I, every member of
Γ is true at r in the intuitionist model. Since Γ ⊨_I φ, it follows that φ is 1 at r
in the intuitionist model; and so, φ is true in the classical interpretation, i.e.,
V_I(φ) = 1.
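The “easy to check” collapse can be spot-checked mechanically. The sketch below, under an assumed tuple encoding of formulas, compares a classical evaluator with the intuitionist clauses on a one-stage model, for every interpretation of two sentence letters and a handful of sample formulas. It illustrates the claim on those samples; it does not replace the inductive argument.

```python
from itertools import product

def V_cl(I, phi):
    """Classical valuation under a PL-interpretation I (dict: letter -> 0/1)."""
    op = phi[0]
    if op == "atom":
        return I[phi[1]]
    if op == "not":
        return 1 - V_cl(I, phi[1])
    if op == "and":
        return min(V_cl(I, phi[1]), V_cl(I, phi[2]))
    if op == "or":
        return max(V_cl(I, phi[1]), V_cl(I, phi[2]))
    if op == "imp":
        return max(1 - V_cl(I, phi[1]), V_cl(I, phi[2]))
    raise ValueError(op)

def V_int(R, I_star, phi, s):
    """Intuitionist valuation at stage s (I_star: dict (letter, stage) -> 0/1)."""
    op = phi[0]
    seen = [t for (u, t) in R if u == s]
    if op == "atom":
        return I_star[(phi[1], s)]
    if op == "and":
        return min(V_int(R, I_star, phi[1], s), V_int(R, I_star, phi[2], s))
    if op == "or":
        return max(V_int(R, I_star, phi[1], s), V_int(R, I_star, phi[2], s))
    if op == "not":
        return int(all(V_int(R, I_star, phi[1], t) == 0 for t in seen))
    if op == "imp":
        return int(all(V_int(R, I_star, phi[1], t) == 0
                       or V_int(R, I_star, phi[2], t) == 1 for t in seen))
    raise ValueError(op)

P, Q = ("atom", "P"), ("atom", "Q")
samples = [
    ("or", P, ("not", P)),
    ("imp", ("not", ("not", P)), P),
    ("imp", ("imp", P, Q), ("or", ("not", P), Q)),
    ("and", P, ("not", Q)),
]

R = {("r", "r")}                       # the single stage r sees only itself
agree = True
for p_val, q_val in product([0, 1], repeat=2):
    I = {"P": p_val, "Q": q_val}
    I_star = {(letter, "r"): v for letter, v in I.items()}
    for phi in samples:
        if V_cl(I, phi) != V_int(R, I_star, phi, "r"):
            agree = False
print(agree)
```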
The heredity condition holds for all formulas
(i.e., for any wff φ, whether atomic or not, and any stage s in any intuitionist
model, if V(φ, s) = 1 and Rss′ then V(φ, s′) = 1). Call this claim “General
heredity”.
Proof: by induction. The base case is just the official heredity condition.
Next we make the inductive hypothesis: heredity is true for formulas φ and ψ;
we must now show that heredity also holds for ∼φ, φ→ψ, φ∧ψ, and φ∨ψ:
∼: Suppose for reductio that V(∼φ, s) = 1, Rss′, and V(∼φ, s′) = 0. Given
the latter, for some s″, Rs′s″ and V(φ, s″) = 1. By transitivity, Rss″. This
contradicts V(∼φ, s) = 1.
→: Suppose for reductio that V(φ→ψ, s) = 1, Rss′, and V(φ→ψ, s′) = 0.
Given the latter, for some s″, Rs′s″ and V(φ, s″) = 1 and V(ψ, s″) = 0; but by
transitivity, Rss″, which contradicts the fact that V(φ→ψ, s) = 1.
∧: Suppose for reductio that V(φ∧ψ, s) = 1, Rss′, and V(φ∧ψ, s′) = 0.
Given the former, V(φ, s) = 1 and V(ψ, s) = 1. By the inductive hypothesis,
V(φ, s′) = 1 and V(ψ, s′) = 1; contradiction.
∨: Suppose for reductio that V(φ∨ψ, s) = 1, Rss′, and V(φ∨ψ, s′) = 0.
Given the former, either V(φ, s) = 1 or V(ψ, s) = 1; and so, given the inductive
hypothesis, either φ or ψ is 1 in s′. That violates V(φ∨ψ, s′) = 0.
Soundness for intuitionism
Say that a sequent Γ ⊢ φ is intuitionistically valid (“I-valid”) iff for every stage
in every intuitionist model, IF every member of Γ is 1 at that stage (“V(Γ, s) = 1”,
for short), THEN V(φ, s) = 1. What we wish to prove is that the system of
section 3.4 is sound; that is, that every sequent that is provable in that system
is I-valid.
Proof: Since a provable sequent is the last sequent in some proof, all we need
to show is that every sequent in any proof is I-valid. And to do that, all we need
to show is that i) the rule of assumptions generates I-valid sequents, and ii) all
the other rules preserve I-validity.
The rule of assumptions generates sequents of the form φ ⊢ φ, which are
clearly I-valid.
As for the other rules:
∧I: Here we assume that the inputs to ∧I are I-valid, and show that its
output is I-valid. That is, we assume that Γ ⊢ φ and ∆ ⊢ ψ are I-valid sequents,
and we must show that it follows that Γ, ∆ ⊢ φ∧ψ is also I-valid. So, consider
any model with valuation V and any stage s such that V(Γ∪∆, s) = 1, and suppose
for reductio that V(φ∧ψ, s) = 0. Since Γ ⊢ φ is I-valid, V(φ, s) = 1; since ∆ ⊢ ψ
is I-valid, V(ψ, s) = 1; contradiction.
∧E: Assume that Γ ⊢ φ∧ψ is I-valid, and suppose for reductio that V(Γ, s) = 1
and V(φ, s) = 0, for some stage s in some model. By the I-validity of Γ ⊢ φ∧ψ,
V(φ∧ψ, s) = 1, so V(φ, s) = 1. Contradiction. The case of ψ is parallel.
∨I: Assume that Γ ⊢ φ is I-valid and suppose for reductio that V(Γ, s) = 1
but V(φ∨ψ, s) = 0. Thus V(φ, s) = 0, which contradicts the I-validity of Γ ⊢ φ.
∨E: Assume that Γ ⊢ φ∨ψ, ∆1,φ ⊢ Π, and ∆2,ψ ⊢ Π are all I-valid, and
suppose for reductio that V(Γ ∪ ∆1 ∪ ∆2, s) = 1 but V(Π, s) = 0. The first
assumption tells us that V(φ∨ψ, s) = 1, so either φ or ψ is 1 at s. If the former,
then the second assumption tells us that V(Π, s) = 1; if the latter, then the
third assumption tells us that V(Π, s) = 1. Either way, we have a contradiction.
DN (the remaining version: if Γ ⊢ φ then Γ ⊢ ∼∼φ): assume Γ ⊢ φ is
I-valid, and suppose for reductio that V(Γ, s) = 1 but V(∼∼φ, s) = 0. From
the latter, for some s′, Rss′ and V(∼φ, s′) = 1. So V(φ, s′) = 0 (since R is
reflexive). From the former and the fact that Γ ⊢ φ is I-valid, V(φ, s) = 1.
This violates general heredity.
RAA: Suppose that Γ,φ ⊢ ψ∧∼ψ is I-valid, and suppose for reductio that
V(Γ, s) = 1 but V(∼φ, s) = 0. Then for some s′, Rss′ and V(φ, s′) = 1. Since
V(Γ, s) = 1 (i.e., all members of Γ are 1 at s), by general heredity V(Γ, s′) = 1
(all members of Γ are 1 at s′). Thus, since Γ,φ ⊢ ψ∧∼ψ is I-valid,
V(ψ∧∼ψ, s′) = 1. But that is impossible. (If V(ψ∧∼ψ, s′) = 1, then V(ψ, s′) = 1
and V(∼ψ, s′) = 1; but from the latter and the reflexivity of R it follows that
V(ψ, s′) = 0.)
→I: Suppose that Γ,φ ⊢ ψ is I-valid, and suppose for reductio that V(Γ, s) = 1
but V(φ→ψ, s) = 0. Given the latter, for some s′, Rss′ and V(φ, s′) = 1 and
V(ψ, s′) = 0. Given general heredity, V(Γ, s′) = 1. And so, given that Γ,φ ⊢ ψ is
I-valid, V(ψ, s′) = 1; contradiction.
→E: Suppose that Γ ⊢ φ and ∆ ⊢ φ→ψ are both I-valid, and suppose for
reductio that V(Γ ∪ ∆, s) = 1 but V(ψ, s) = 0. Since Γ ⊢ φ and ∆ ⊢ φ→ψ are
I-valid, V(φ, s) = 1 and V(φ→ψ, s) = 1. Given the latter, and given that R is
reflexive, either V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.
EF (ex falso): Suppose Γ ⊢ φ∧∼φ is I-valid, and suppose for reductio that
V(Γ, s) = 1 but V(ψ, s) = 0. Given the former and the I-validity of Γ ⊢ φ∧∼φ,
V(φ∧∼φ, s) = 1, which is impossible.
Note that the rule we dropped, double-negation elimination, does not
preserve I-validity. For suppose it did. The sequent ∼∼P ⊢ ∼∼P is obviously
I-valid, and thus the sequent ∼∼P ⊢ P, which follows from it by double-negation
elimination, would also be I-valid. But it isn’t: above we gave a model and a
stage r in which ∼∼P is 1 and P is 0.
Chapter 8
Counterfactuals [1]
There are certain conditionals in natural language that are not well
represented either by propositional logic’s material conditional or by
modal logic’s strict conditional. In this chapter we consider “counterfactual”
conditionals: conditionals that (loosely speaking) have the form:
If it had been that φ, then it would have been that ψ
For instance:
If I had struck this match, it would have lit
The counterfactuals that we typically utter have false antecedents (hence
the name), and are phrased in the subjunctive mood. They must therefore
be distinguished from English conditionals phrased in the indicative mood.
Counterfactuals are generally thought to semantically differ from indicative
conditionals. A famous example: the counterfactual conditional ‘If Oswald
hadn’t shot Kennedy, someone else would have’ is false (assuming that certain
conspiracy theories are false and Oswald was acting alone); but the indicative
conditional ‘If Oswald didn’t shoot Kennedy then someone else did’ is true
(we know that someone shot Kennedy, so if it wasn’t Oswald, it must have been
someone else.) The semantics of indicative conditionals is an important topic
in its own right, but we won’t take up that topic here.
We represent the counterfactual with antecedent φ and consequent ψ thus:
φ□→ψ

[1] This section is adapted from my notes from Ed Gettier’s fall 1988 modal logic class.

What should the logic of this new connective be, if it is to accurately represent
natural language counterfactuals?
8.1 Natural language counterfactuals
Well, let’s have a look at how natural language counterfactuals behave. Our
survey will provide guidance for our main task: developing a semantics for □→.
As we’ll see, counterfactuals behave very differently from both material and
strict conditionals.
8.1.1 Not truth-functional
Our system for counterfactuals should have the following features:
∼P ⊭ P□→Q
Q ⊭ P□→Q
For consider: I did not strike the match; but it doesn’t logically follow that
if I had struck the match, it would have turned into a feather. So if □→ is to
represent ‘if it had been that…, it would have been that…’, ∼P should not
semantically imply P□→Q. Similarly, George W. Bush (somehow) won the last
United States presidential election, but it doesn’t follow that if the newspapers
had discovered beforehand that Bush had an affair with Al Gore, he would
still have won. So our semantics had better not count P□→Q as a semantic
consequence of Q either.
These implications hold for the material conditional, however:
∼P ⊨ P→Q
Q ⊨ P→Q
We have our first difference in logical behavior between counterfactuals and
the material conditional →.
8.1.2 Can be contingent
It’s not true, presumably, that if Oswald hadn’t shot Kennedy, then someone else
would have (assuming that the conspiracy theory is false). But the conspiracy
theory might have been true; in a possible world in which there is a conspiracy,
it would be true that if Oswald hadn’t shot Kennedy, someone else would
have. Thus, our logic should allow counterfactuals to be contingent statements.
Just because a counterfactual is true, it should not follow logically that it is
necessarily true; and just because a counterfactual is false, it should not follow
logically that it is necessarily false. Our semantics for □→, that is, should have
the following features:
P□→Q ⊭ □(P□→Q)
∼(P□→Q) ⊭ □∼(P□→Q)
One reason this is important is that it shows an obstacle to using the strict
conditional ⇒ to represent natural language counterfactuals. For remember
that P⇒Q is defined as □(P→Q). As a result:
P⇒Q ⊨_{S4,S5} □(P⇒Q)
∼(P⇒Q) ⊨_{S5} □∼(P⇒Q)
So if, as is commonly supposed, the logic of the □ is at least as strong as S4, we
have a logical mismatch between counterfactuals and the ⇒.
8.1.3 No augmentation
The → and the ⇒ obey the argument form augmentation:
    P→Q            P⇒Q
    ---------      ---------
    (P∧R)→Q        (P∧R)⇒Q
That is, P→Q ⊨_PL (P∧R)→Q and P⇒Q ⊨_{K,…} (P∧R)⇒Q. However, natural
language counterfactuals famously do not obey augmentation. Consider:
If I were to strike the match, it would light.
Therefore, if I were to strike the match and I was in outer space, it
would light.
So, our next desideratum is that the corresponding argument should not hold
good for □→ (that is, P□→Q ⊭ (P∧R)□→Q).
8.1.4 No contraposition
→ and ⇒ obey contraposition:
    φ→ψ            φ⇒ψ
    ---------      ---------
    ∼ψ→∼φ          ∼ψ⇒∼φ
But counterfactuals do not. Suppose I’m on the firing squad, and we shoot
someone dead. My gun was loaded, but so were those of the others. Then the
premise of the following argument is true, while its conclusion is false:
If my gun hadn’t been loaded, he would still be dead.
Therefore, if he weren’t dead, my gun would have been loaded.
8.1.5 Some implications
Here is an argument form that intuitively should hold for the □→:
    P□→Q
    ---------
    P→Q
The counterfactual conditional should imply the material conditional. □→ will
then obey modus ponens and modus tollens:
    P              P□→Q
    P□→Q           ∼Q
    ---------      ---------
    Q              ∼P
The reason is, of course, that modus ponens and modus tollens are valid for
the →. (Note that it’s not inconsistent to say that modus tollens holds for the
□→ and also that contraposition fails.)
Another implication: the strict conditional should imply the counterfactual:
    P⇒Q
    ---------
    P□→Q
To see that these implications should hold, consider first the argument from
the strict conditional to the counterfactual conditional. Surely, if P entails Q,
then if P were true, Q would be as well. As for the counterfactual implying the
material, suppose that you think that if P were true, Q would also be true. Now
suppose that someone tells you that P is true, but that Q is false. Wouldn’t
you then need to give up your original claim that if P were to be true, then
Q would be true? It seems so. So, the statement P□→Q isn’t consistent with
P∧∼Q; that is, it isn’t consistent with the denial of P→Q.
8.1.6 Context dependence
Years ago, a few of us were at a restaurant in NY—Red Smith, Frank
Graham, Allie Reynolds, Yogi [Berra] and me. At about 11.30 p.m., Ted
[Williams] walked in helped by a cane. Graham asked us what we thought
Ted would hit if he were playing today. Allie said, “due to the better
equipment probably about .350.” Red Smith said. “About .385.” I said,
“due to the lack of really great pitching about .390.” Yogi said, “.220.”
We all jumped up and I said, “You’re nuts, Yogi! Ted’s lifetime average is
.344.” “Yeah” said Yogi “but he is 74 years old.” –Buzzie Bavasi, baseball
executive.
Who was right? If Ted Williams had played at the time the story was told,
would he or wouldn’t he have hit over .300?
Clearly, there’s no one answer. The first respondents were imagining
Williams playing as a young man. Understood that way, the answer is no
doubt: yes, he would have hit over .300. But Berra took the question a different
way: he was imagining Williams hitting as he was then: a 74 year old man.
Berra took the others off guard, by deliberately (? this is Yogi Berra we’re
talking about…) shifting how the question was construed, but he didn’t make a
semantic mistake in so doing. It’s perfectly legitimate, in other circumstances
anyway, to take the question in Berra’s way. (Imagine Williams talking to
himself five years after he retired: “These punks today! If I were playing today,
I’d still hit over .300!”) Counterfactual sentences can mean different things
depending on the conversational context in which they are uttered.
Another example:
If Syracuse were in Louisiana, Syracuse winters would be warm.
True or false? It might seem true: Louisiana is in the south. But wait—perhaps
Louisiana would include Syracuse by extending its borders north to Syracuse’s
actual latitude.
Would Syracuse be warm in the winter? Would Williams hit over .300?
No one answer is correct, once and for all. Which answer is correct depends
on the linguistic context. Whether a counterfactual is true or whether it is false
depends in part on what the speaker means to be saying, and what her audience
takes her to be saying, when she utters the counterfactual. Would Syracuse be
warm? In some contexts, it would be correct to say yes, and in others, to say
no. When we imagine Syracuse being warm, we imagine reality being different
in certain respects from actuality. In particular, we imagine Syracuse as being
in Louisiana. In other respects, we imagine a situation that is a lot like reality:
we don’t imagine a situation, for example, in which Syracuse and Louisiana
are both located in China. Now, when considering counterfactuals, there is
a question of what parts of reality we hold constant. In the Syracuse-Louisiana
case, we seem to have at least two choices. Do we hold constant the location of
Syracuse, or do we hold constant the borders of Louisiana? The truth value of
the counterfactual depends on which we hold constant.
What determines which things are to be held constant, when we evaluate
the truth value of a counterfactual? In large part: the context of utterance of
the counterfactual. Suppose I am in the middle of the following conversation:
“Syracuse restaurants struggle to survive because the climate there is so bad:
no one wants to go out to eat in the winter. If Syracuse were in Louisiana,
its restaurants would do much better.” In such a context, an utterance of the
counterfactual “If Syracuse were in Louisiana, Syracuse winters would be warm”
would be regarded as true. But if this counterfactual were uttered in the midst of
the following conversation, it would be regarded as false: “You know, Louisiana
is statistically the warmest state in the country. Good thing Syracuse isn’t in
Louisiana, because that would ruin the statistic.”
Does just saying a sentence, intending it to be true, make it true? Well,
sort of! When a certain sentence has a meaning that is partly determined by
context, then when a person utters that sentence with the intention of saying
something true, that tends to create a context in which the sentence is true.
Compare ‘flat’: we’ll say “the table is flat”, and thereby utter a truth. But
a scientist might look at the same table and say “you know, macroscopic objects
are far from being flat. Take that table, for instance. It isn’t flat at all; when
viewed under a microscope, it can be seen to have a very irregular surface”. The
term ‘flat’ has a certain amount of vagueness: how flat does a thing have to be to
count as being “flat”? Well, the amount required is determined by context. [2]

[2] See Lewis (1979).

8.2 The Lewis/Stalnaker approach
Here is the core idea, due to David Lewis (1973) and Robert Stalnaker (1968), of
how to interpret counterfactual conditionals. Consider a counterfactual conditional
P□→Q. To determine its truth, Lewis and Stalnaker instruct us to consider
the possible world that is as similar to reality as possible, in which P is true.
Then, the counterfactual is true in the actual world if and only if Q is true in
that possible world. Consider Lewis’s example:
If kangaroos had no tails, they would topple over.
When we consider the possible world that would be actual if kangaroos had
no tails, we do not depart gratuitously from actuality. For example, we do
not consider a world in which kangaroos have wings, or crutches. We do not
consider a world with different laws of nature, in which there is no gravity. We
keep the kangaroos as they actually are, but remove the tails, and we keep the
laws of nature as they actually are. It seems that the kangaroos would then fall
over.
Take the examples of the previous section, in which I got you to give
differing answers to certain sentences. Consider:
If Syracuse were in Louisiana, Syracuse winters would be warm.
How does the contextual dependence of this sentence work, on the
Lewis-Stalnaker view? By supplying different standards of comparison of similarity.
Think about similarity, for a moment: things can be similar in certain respects,
while not being similar in other respects. A blue square is similar to a blue circle
in respect of color, not in respect of shape. Now, according to Lewis and
Stalnaker, when we answer in the affirmative to this counterfactual, and so
consider the possible world most similar to the actual world in which Syracuse is
in Louisiana, we are using a kind of similarity that weights heavily Louisiana’s
actual borders. When we count the counterfactual false, we are using a kind of
similarity that weights very heavily Syracuse’s actual location.
8.3 Stalnaker’s system [3]
I now lay out Stalnaker’s system, SC (for “Stalnaker-Counterfactuals”). The
idea is to add the □→ to propositional modal logic.

[3] See Stalnaker (1968). The version of the theory I present here is slightly
different from Stalnaker’s original version; see Lewis (1973, p. 79).

8.3.1 Syntax of SC
The primitive vocabulary of SC is that of propositional modal logic, plus the
connective □→. Here’s the grammar:
Definition of SC-wffs:
i) Sentence letters are wffs
ii) if φ, ψ are wffs then (φ→ψ), ∼φ, □φ, and (φ□→ψ) are wffs
iii) nothing else is a wff
8.3.2 Semantics of SC
First, let’s define two formal properties for relations: strong connectivity and
antisymmetry. Let R be a binary relation over set A:
strong connectivity (in A): for any x, y ∈ A, either Rxy or Ryx
antisymmetry: for any x, y, if Rxy and Ryx then x = y
More notation: where R is a three-place relation, let’s abbreviate “Rxyz” as
“R_z xy”. And, where u is any object, let “R_u” be the two-place relation that
holds between objects x and y iff R_u xy. (Think of R_u as the result of
“plugging” up one place of R with u.)
We can now define SC-models:
An SC-model is an ordered triple 〈T, ⪯, V〉, where
i) T is a nonempty set (“worlds”)
ii) V is a function that assigns truth values to formulas relative
to members of T (“valuation function”)
iii) ⪯ is a three-place relation over T (“nearness relation”; read
“x ⪯_z y” as “x is at least as near to/similar to z as is y”.)
iv) V and ⪯ satisfy the following conditions:
C1: for any w, ⪯_w is strongly connected (in T)
C2: for any w, ⪯_w is transitive
C3: for any w, ⪯_w is antisymmetric
C4: for any x, y, x ⪯_x y (“Base”)
C5: for any SC-wff φ, provided φ is true in at least one world,
then for every z, there’s some w such that V(φ, w) = 1,
and such that for any x, if V(φ, x) = 1 then w ⪯_z x
(“Limit”)
v) For all SC-wffs φ, ψ and for all w ∈ T,
a) V(∼φ, w) = 1 iff V(φ, w) = 0
b) V(φ→ψ, w) = 1 iff either V(φ, w) = 0 or V(ψ, w) = 1
c) V(□φ, w) = 1 iff for any v, V(φ, v) = 1
d) V(φ□→ψ, w) = 1 iff for any x, IF [V(φ, x) = 1 and for
any y such that V(φ, y) = 1, x ⪯_w y] THEN V(ψ, x) = 1
Phew! Let’s look into what this means.
First, notice that much of this is exactly the same as for our MPL models—
we still have the set of worlds, and formulas being given truth values at worlds.
We’ll still say that φ is true “at w” iff V(φ, w) =1.
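As a concrete illustration of clause d), here is a small Python sketch of an evaluator for finite models. Several things in it are assumptions made for the sake of the example: formulas are encoded as nested tuples, each world's nearness ordering is given as a list running from nearest to farthest (which builds in C1 to C3, and C4 when each world heads its own list), and truth values are computed from an assignment to sentence letters rather than being built into the model as in the official definition. In a finite model the limit assumption holds automatically.

```python
def V(model, phi, w):
    """Truth value of phi at world w, following clauses a) to d)."""
    worlds, order, atoms = model
    op = phi[0]
    if op == "atom":
        return int(phi[1] in atoms[w])
    if op == "not":
        return 1 - V(model, phi[1], w)
    if op == "and":                      # derived connective, for convenience
        return int(V(model, phi[1], w) == 1 and V(model, phi[2], w) == 1)
    if op == "imp":
        return int(V(model, phi[1], w) == 0 or V(model, phi[2], w) == 1)
    if op == "box":                      # truth at all worlds
        return int(all(V(model, phi[1], v) == 1 for v in worlds))
    if op == "cf":                       # the counterfactual conditional
        def leq(x, y):
            # x is at least as near to w as y is
            return order[w].index(x) <= order[w].index(y)
        ant = [x for x in worlds if V(model, phi[1], x) == 1]
        nearest = [x for x in ant if all(leq(x, y) for y in ant)]
        return int(all(V(model, phi[2], x) == 1 for x in nearest))
    raise ValueError(f"unknown connective: {op}")

# Hypothetical match example: P = struck, Q = lights, R = in outer space.
worlds = ["w0", "w1", "w2"]
order = {"w0": ["w0", "w1", "w2"],
         "w1": ["w1", "w0", "w2"],
         "w2": ["w2", "w1", "w0"]}
atoms = {"w0": set(), "w1": {"P", "Q"}, "w2": {"P", "R"}}
model = (worlds, order, atoms)

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
print(V(model, ("cf", P, Q), "w0"))              # nearest P-world is w1
print(V(model, ("cf", ("and", P, R), Q), "w0"))  # nearest (P and R)-world is w2
```

In this model the first counterfactual is true at w0 and the second is false, so the model also illustrates how augmentation can fail for the counterfactual conditional.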
What happened to the accessibility relation? It has simply been dropped,
in favor of a simplified truth-clause for the □, namely clause c. □φ is true iff φ
is true at all worlds in the model, not just all accessible worlds. It turns out that
this in effect just gives us an S5 logic for the □, for you get the same valid formulas
for MPL, whether you make the accessibility relation an equivalence relation, or
a total relation. Clearly, if φ is valid in all equivalence relation models, then it
is valid in all total models, since every total relation is an equivalence relation.
What’s more, the converse is true: if φ is valid in all total models then it’s also
valid in all equivalence relation models. Rough proof: let φ be any formula
that’s valid in all total models, and let M be any equivalence relation model.
We need to show that φ is true in an arbitrary world r ∈ T (M’s set of worlds).
Now, any equivalence relation partitions its domain into non-overlapping
subsets in which each world sees every other world. So T is divided up into
one or more non-overlapping subsets. One of these, T_r, contains r. Now,
consider a model, M′, just like M, but whose set of worlds is just T_r. M′ is a
total model, so φ is valid in it by hypothesis. Thus, in this model, φ is true at
r. But then φ is true at r in M, as well. Why? Roughly: the truth value of φ
at r in M isn’t affected by what goes on outside r’s partition, since chains of
modal operators just take us to worlds seen by r, and worlds seen by worlds
seen by r, and…. Such chains will never have us “look at” anything out of r’s
partition, since these worlds are utterly unconnected to r via the accessibility
relation. So φ’s truth value at r in M is determined by what goes on in T_r,
and so is the same as its truth value at r in M′.
So, we get the same class of valid formulas whether we require the
accessibility relation to be total, or an equivalence relation. Things are easier if we
make it a total relation, because then we can simply drop talk of the accessibility
relation and define necessity as truth at all worlds. The corresponding clause
for possibility is:
e) V(◇φ, w) = 1 iff for some v, V(φ, v) = 1
The derived clauses for the other connectives remain the same:
f) V(φ∧ψ, w) = 1 iff V(φ, w) = 1 and V(ψ, w) = 1
g) V(φ∨ψ, w) = 1 iff V(φ, w) = 1 or V(ψ, w) = 1
h) V(φ↔ψ, w) = 1 iff V(φ, w) = V(ψ, w)
Next, what about this nearness relation? Think of this as the similarity
relation we talked about before. In order to evaluate whether x ⪯_w y, we place
ourselves in possible world w, and we ask whether x is at least as similar to our
world as y is. (Recall the bit about counterfactual conditionals being highly
context dependent. In a full treatment of counterfactuals, we would complicate our
semantics by introducing contexts of utterance, and evaluate sentences relative to
these contexts of utterance. The point of this would be to allow different nearness
relations in the different contexts.)
I say “we can think of” ⪯ as a similarity relation, but take this with a grain
of salt: just as our definitions allow the members of T to be any old things, so
⪯ is allowed to be any old relation over T. Just as the members of T could be
fish, so the ⪯ relation could be any old relation over fish. (But as before, if the
truth conditions for natural language counterfactuals have nothing in common
with the truth conditions for □→ statements in our models, the interest in our
semantics is diminished, since our models wouldn’t be modeling the behavior of
natural language counterfactuals.)
Certain of the constraints on the formal properties of the nearness relation, at
least, seem plausible if ⪯ is to be thought of as a similarity relation. C1 simply
says that it makes sense to compare any two worlds in respect of similarity to a
given world. C2 has a transparent meaning. C3 means “no ties”: it says that,
relative to a given base world w, it is never the case that there are two separate
worlds x and y such that each is at least as close to w as the other. C4 is the
“base” axiom: it says that every world is at least as close to itself as every other.
Given C3, it has the further implication that every world is closer to itself than
every other. (We define “x is closer to w than y is” (x ≺_w y) to mean: x ⪯_w y
and not y ⪯_w x.) C5 is called the “limit” assumption: according to it, for any
formula φ and any base world w, there is some world that is a closest world to w
in which φ is true (that is, unless φ isn’t true at any worlds at all). This rules out
the following possibility: there are no closest φ worlds, only an infinite chain of
φ worlds, each of which is closer than the previous. Certain of these assumptions
have been challenged, especially C3 and C5. We will consider those issues below.
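For a finite model, constraints C1 through C4 can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes the relation ⪯_w for one base world w is given as a set of ordered pairs, and the function name `check_constraints` and the sample three-world relation are purely illustrative. C5 is left out because it also involves the valuation; in a finite model satisfying C1 and C2 it cannot fail, since any nonempty finite set of φ worlds has a nearest member.

```python
from itertools import product

def check_constraints(worlds, w, leq):
    """Check C1-C4 for a single base world w.

    leq is a set of pairs (x, y), read "x is at least as close to w as y".
    """
    return {
        "C1 strongly connected": all((x, y) in leq or (y, x) in leq
                                     for x, y in product(worlds, repeat=2)),
        "C2 transitive": all((x, z) in leq
                             for (x, y1) in leq for (y2, z) in leq if y1 == y2),
        "C3 antisymmetric": all(x == y for (x, y) in leq if (y, x) in leq),
        "C4 base": all((w, x) in leq for x in worlds),
    }

worlds = ["r", "a", "b"]
# Illustrative relation: r is nearest to itself, then b, then a.
full = {("r", "b"), ("b", "a"), ("r", "a"), ("r", "r"), ("a", "a"), ("b", "b")}
# The same ordering with most pairs left unstated.
partial = {("b", "a")}

full_report = check_constraints(worlds, "r", full)
partial_report = check_constraints(worlds, "r", partial)
print(full_report)
print(partial_report)
```

The fully written-out relation passes all four checks, whereas the abbreviated one fails C1 and C4, which is a reminder that an abbreviated listing of ⪯_w is shorthand for the relation and not the relation itself.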
Note that the valuation function for a model was built directly into the
model. This is in contrast to our earlier definitions of models, in which we
equipped models with interpretation functions of different sorts, and then
defined the resultant valuation functions accordingly. The valuation function is
built into the definition of an SC-model because the limit assumption constrains
SC-models, and the limit assumption is stated in terms of the valuation function.
Given our definitions, we can define validity and semantic consequence:
    ⊨_SC φ iff φ is true at every world in every SC-model
    Γ ⊨_SC φ iff for every SC-model and every world w in that model,
        if every member of Γ is true at w then φ is also true at w
8.4 Validity proofs in SC
Given this semantic system, we can give semantic validity proofs just as we
did for the various modal systems. Consider the formula (P∧Q)→(P□→Q).
We want to show that this is SC-valid. Thus, we pick an arbitrary SC-model,
〈T, ⪯, V〉, pick an arbitrary world r ∈ T, and show that this formula is true at r:
i) Suppose for reductio that V((P∧Q)→(P□→Q), r) = 0
ii) then V(P, r) = 1, V(Q, r) = 1, and V(P□→Q, r) = 0
iii) the truth condition for □→ says that P□→Q is true at r iff
Q is true at every closest P-world (to r). So since
P□→Q is false at r, there must be a closest-to-r P-world at
which Q is false; that is, there is some world a such that:
    a) V(P, a) = 1
    b) for any x, if V(P, x) = 1 then a ⪯_r x
    c) V(Q, a) = 0
iv) from b), since V(P, r) = 1 by ii), a ⪯_r r
v) by “base”, r ⪯_r a
vi) so, by antisymmetry, r = a. But now, by ii) and c), Q is both true
and false at r
Here is another validity proof, for the formula {[P□→Q]∧[(P∧Q)□→R]}
→[P□→R]. (This formula is worth taking note of, because it is valid despite
its similarity to the invalid formula {[P□→Q]∧[Q□→R]}→[P□→R]):
i) Suppose for reductio that P□→Q and (P∧Q)□→R are true
at r, but P□→R is false there.
ii) Then there’s a world, a, that is a nearest-to-r P world, in
which R is false.
iii) Since P□→Q is true at r, Q is true in all the nearest-to-r P
worlds, and so V(Q, a) = 1.
iv) Note now that a is a nearest-to-r P∧Q world:
    • P and Q are both true there, so P∧Q is true there.
    • let x be any P∧Q world. x is then a P world. But since a is
      a nearest-to-r P world, we know that a ⪯_r x. (Remember:
      “a is a nearest-to-r P world” means: “V(P, a) = 1, and for
      any x, if V(P, x) = 1 then a ⪯_r x”)
v) Since (P∧Q)□→R is true at r, it follows that R is true at a.
This contradicts ii).
8.5 Countermodels in SC
Our other main concern will be countermodels. As before, we’ll be able to use
failed attempts at constructing countermodels as guides to our validity proofs.
First, consider the truth condition for φ□→ψ:
    V(φ□→ψ, w) = 1 iff for every x, if x is a nearest-to-w φ world
    then V(ψ, x) = 1
Now, suppose there are no nearest-to-w φ worlds. Then V(φ□→ψ, w) is
automatically 1, since the universally quantified statement is vacuously true.
How could it turn out that there are no nearest-to-w φ worlds? One way
would be if there are no φ worlds whatsoever. That is, if φ is necessarily false.
And in fact, this is the only way there could be no nearest-to-w φ worlds, for
the limit assumption guarantees that if there is at least one φ world, then there
is a nearest-to-w φ world.
Say that a counterfactual φ□→ψ is vacuously true iff φ is false at every world.
What we have seen is that there are two ways a counterfactual φ□→ψ can be
true at a given world w: i) it can be vacuously true, or ii) ψ can be true in all
the nearest-to-w φ worlds (by antisymmetry, there can be only one). Bear
this in mind when constructing a countermodel: there are two ways to make a
□→ true.
On the other hand, there’s only one way to make a counterfactual false. It
follows from the truth condition for the □→ that:
    V(φ□→ψ, w) = 0 iff there is a nearest-to-w φ world at which ψ is
    false.
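For finite models these two conditions can be made concrete. The following is a minimal sketch, not part of the original text: it assumes the worlds are listed in order of nearness to the base world (nearest first, no ties, as in SC), and it represents a formula by the set of worlds at which it is true. The function name `would` and the sample worlds are illustrative.

```python
def would(order, phi, psi):
    """Evaluate the counterfactual phi □→ psi at the base world of `order`.

    order: worlds listed from nearest to farthest from the base world
           (the base world itself comes first; no ties, as in SC).
    phi, psi: sets of worlds where the antecedent / consequent are true.
    """
    for x in order:
        if x in phi:          # x is the nearest antecedent world
            return x in psi   # non-vacuous case: check the consequent there
    return True               # no antecedent world at all: vacuously true

order_from_r = ["r", "b", "a"]

vacuous = would(order_from_r, set(), {"a"})             # antecedent true nowhere
nonvacuous = would(order_from_r, {"a", "b"}, {"b"})     # nearest antecedent world is b
false_case = would(order_from_r, {"a", "b"}, {"a"})     # consequent fails at b
print(vacuous, nonvacuous, false_case)
```

The first two calls illustrate the two ways a counterfactual can be true; the third illustrates the single way it can be false.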
In constructing models, it is often wise to make forced moves first, as we saw
for constructing MPL models. This continues to be a good strategy. It now
means: take care of false □→s before taking care of true □→s, for the latter
can occur in two different ways, whereas the former can only occur in one way.
We’ll see how this goes with some concrete examples.
Let’s show that transitivity is invalid for □→. That is, let’s show that the
following formula is SC-invalid: [(P□→Q)∧(Q□→R)]→(P□→R). I’m going
to do this with diagrams that are a little different from the ones I used for MPL.
The new diagrams again contain boxes (rounded now, to distinguish them from
the old countermodels) in which we put formulas; and we again indicate truth
values of formulas with small numbers above the formulas. Since there is no
accessibility relation, we don’t need the arrows between the boxes. But since we
need to represent the nearness relation, we will arrange the boxes in a vertical
line. At the bottom of the line goes a box for the world, r, of our model in
which we’re trying to make a given formula false. We place the other worlds
in the diagram above this bottom world r: the further away a world is from
r in the ⪯_r ordering, the further above r we place it in the diagram. In this
case we want to make P□→Q and Q□→R true, but P□→R false, so we begin
as follows:
[Diagram: a single box for world r.
    r:  [(P□→Q)∧(Q□→R)]→(P□→R) marked 0, with P□→Q 1, Q□→R 1, P□→R 0]
In keeping with the advice I gave a moment ago, let’s “take care of” the false
counterfactual first: let’s make P□→R false in r. This means that we need to
have R be false in the nearest-to-r P world. I’ll indicate this as follows:
[Diagram, the view from r (nearest world at the bottom):
    a:  P 1, R 0, Q 1
        (no P between a and r)
    r:  [(P□→Q)∧(Q□→R)]→(P□→R) marked 0, with P□→Q 1, Q□→R 1, P□→R 0, and P 0]
a is the nearest P world to r. I indicate that in the diagram by making P true
at a, and reminding myself that I can’t put any P-world anywhere between a
and r. Since the counterfactual P□→R was supposed to be false at r, I made
R false in a. Notice that I made Q true in a. This was to take care of the fact
that P□→Q is true in r. This formula says that Q is true in the nearest-to-r P
world. Well, a is the nearest-to-r P world.
Now for the final counterfactual Q□→R. This can be true in two ways:
either there is no Q world at all (the vacuous case), or R is true in the nearest-to-r
Q world. Well, Q is already true in a, so the vacuous case is ruled out. So
we must make R true in the nearest-to-r Q-world. Where will we put the
nearest-to-r Q world? There are three possibilities. It could be farther away
from, tied with, or closer to r than a. Let’s try the first possibility:
[Diagram, the view from r (nearest world at the bottom):
    b:  Q 1, R 1
        (no Q between b and r)
    a:  P 1, R 0, Q 1
        (no P between a and r)
    r:  [(P□→Q)∧(Q□→R)]→(P□→R) marked 0, with P□→Q 1, Q□→R 1, P□→R 0, and P 0, Q 0]
This doesn’t work, because when we make b the nearest-to-r Q world, this
contradicts the fact that we’ve got Q true at a nearer world, namely a.
Likewise, we can’t make the nearest-to-r Q world be tied with a. Antisymmetry
would make this world be a. But we need to make R true at the
nearest Q world, whereas R is already false in a.
But the final possibility works out just fine: let the nearest Q world be
closer than a:
[Diagram, the view from r (nearest world at the bottom):
    a:  P 1, R 0, Q 1
        (no P between a and r)
    b:  Q 1, R 1, P 0
        (no Q between b and r)
    r:  [(P□→Q)∧(Q□→R)]→(P□→R) marked 0, with P□→Q 1, Q□→R 1, P□→R 0, and P 0, Q 0]
Notice that I made P false in b, since I said “no P” of all worlds nearer to r
than a. Here’s the official model:
    T = {r, a, b}
    ⪯_r = {〈b, a〉 . . . }
    V(P, a) = V(Q, a) = V(Q, b) = V(R, b) = 1; all atomics are false
    everywhere else.
A couple of comments on the official model. I left out a lot in giving the similarity
relation for this model. First, I left out some of the elements of ⪯_r; fully
written out, it would be:
    ⪯_r = {〈r,b〉, 〈b,a〉, 〈r,a〉, 〈r,r〉, 〈a,a〉, 〈b,b〉}
I left out 〈r,b〉 because it gets included automatically given the “base” assumption
(C4). Also, the element 〈r,a〉 is required to make ⪯_r transitive. The elements
〈r,r〉, 〈a,a〉, and 〈b,b〉 were entered to make that relation reflexive. Why must it
be reflexive? Because reflexivity comes from strong connectivity. (Let w and x
be any members of T; we get (x ⪯_w x or x ⪯_w x) from strong connectivity of
⪯_w, and hence x ⪯_w x.) My plan was to write out enough of ⪯_r so that the rest
could be inferred, given the definition of an SC-model.
Secondly, this isn’t a complete writing out of ⪯ itself; it is just ⪯_r. To be
complete, we’d need to write out ⪯_a and ⪯_b. But in this problem, these latter
two parts of ⪯ don’t matter, so it’s permissible to omit them. We’ll do a problem
below where we need to consider more of ⪯ than simply ⪯_r.
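The official model for the failure of transitivity can be checked directly. The following is a minimal sketch, not part of the original text: it lists the worlds in order of nearness to r (r, then b, then a), represents each sentence letter by the set of worlds where it is true, and uses an illustrative helper named `would` for the truth condition of □→.

```python
def would(order, phi, psi):
    """phi □→ psi at the base world of `order` (worlds listed nearest first)."""
    for x in order:
        if x in phi:          # nearest antecedent world
            return x in psi
    return True               # vacuous case: no antecedent world

# T = {r, a, b}; b is nearer to r than a is.
order_from_r = ["r", "b", "a"]
P = {"a"}
Q = {"a", "b"}
R = {"b"}

p_q = would(order_from_r, P, Q)   # P □→ Q at r
q_r = would(order_from_r, Q, R)   # Q □→ R at r
p_r = would(order_from_r, P, R)   # P □→ R at r

# Material conditional [(P□→Q) ∧ (Q□→R)] → (P□→R), evaluated at r.
formula_at_r = (not (p_q and q_r)) or p_r
print(p_q, q_r, p_r, formula_at_r)
```

Both premises come out true at r and the conclusion false, so the conditional is false at r, as the diagram indicates.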
Suppose one has a formula, for example:
    (P□→R)→((P∧Q)□→R)
(This is the formula corresponding to the inference pattern of augmentation.)
Suppose we ask: is it SC-valid? The best way to approach this is as follows.
First try to find a countermodel. If we succeed, then the formula is not valid.
If we have trouble finding a countermodel, then we can try giving a semantic
validity proof (for if the formula is in fact valid, then no countermodel exists).
In this particular case, the formula is invalid, as the following countermodel
shows:
[Diagram, the view from r (nearest world at the bottom):
    a:  P∧Q 1 (P 1, Q 1), R 0
        (no P∧Q between a and r)
    b:  P 1, R 1, Q 0
        (no P between b and r)
    r:  (P□→R)→[(P∧Q)□→R] marked 0, with P□→R 1, (P∧Q)□→R 0, and P 0]
Ok, I began with the false subjunctive: (P∧Q)□→R. This forced the existence
of a nearest P∧Q world, in which R was false. But since P∧Q was true there,
P was true there; this ruled out the true P□→R in r being vacuously true. So I
was forced to consider the nearest P world. It couldn’t be farther out than a,
since P is true in a. It couldn’t be a, since R was already false there. So I had to
put it nearer than a. Notice that I had to make Q false at b. Why? Well, it was
in the “no P∧Q zone”, and I had made P true in it. Here’s the official model:
    T = {r, a, b}
    ⪯_r = {〈b, a〉 . . . }
    V(P, a) = V(Q, a) = V(P, b) = V(R, b) = 1; all other atomics 0
    everywhere else.
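The search for a countermodel can also be done by brute force over small models. The following is a minimal sketch under stated assumptions, not part of the original text: formulas are encoded as nested tuples, each ⪯_w is represented as a listing of the worlds with w first (which is what C1 through C4 amount to in a finite model), and the names `holds` and `find_countermodel` are illustrative. Failing to find a countermodel among small models does not by itself establish validity; that still requires a semantic validity proof.

```python
from itertools import permutations, product

def holds(f, w, order, val):
    """Truth of formula f at world w; order[w] lists worlds nearest-to-w first."""
    op = f[0]
    if op == "atom":
        return f[1] in val[w]
    if op == "not":
        return not holds(f[1], w, order, val)
    if op == "and":
        return holds(f[1], w, order, val) and holds(f[2], w, order, val)
    if op == "imp":
        return (not holds(f[1], w, order, val)) or holds(f[2], w, order, val)
    if op == "cf":                       # counterfactual f[1] □→ f[2]
        for x in order[w]:
            if holds(f[1], x, order, val):
                return holds(f[2], x, order, val)
        return True                      # vacuously true
    raise ValueError(f"unknown operator: {op}")

def find_countermodel(f, atoms, n):
    """Search all SC-models with n worlds for one where f is false at world 0."""
    worlds = range(n)
    orders_for = [[(w,) + rest
                   for rest in permutations(x for x in worlds if x != w)]
                  for w in worlds]
    valuations = [frozenset(a for a, bit in zip(atoms, bits) if bit)
                  for bits in product([0, 1], repeat=len(atoms))]
    for order in product(*orders_for):
        for val in product(valuations, repeat=n):
            if not holds(f, 0, order, val):
                return order, val
    return None

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
augmentation = ("imp", ("cf", P, R), ("cf", ("and", P, Q), R))
valid_formula = ("imp", ("and", P, Q), ("cf", P, Q))

no_model_n1 = find_countermodel(augmentation, "PQR", 1)
model_n2 = find_countermodel(augmentation, "PQR", 2)
no_model_valid = find_countermodel(valid_formula, "PQR", 3)
print(no_model_n1, model_n2 is not None, no_model_valid)
```

Checking only world 0 loses nothing, since the search runs over every assignment of orderings and truth values to the labeled worlds.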
Next example: ◇P→[(P□→Q)→∼(P□→∼Q)]. An attempt to find a countermodel
fails at the following point:
[Diagram, the view from r (nearest world at the bottom):
    a:  P 1, ∼Q 1 (so Q 0), but also Q 1
        (no P between a and r)
    r:  ◇P→[(P□→Q)→∼(P□→∼Q)] marked 0, with ◇P 1, P□→Q 1, P□→∼Q 1]
At world a, we’ve got Q being both true and false. A word about how we got
to that point. I noticed that I had to make two counterfactuals true: P□→Q
and P□→∼Q. Now, this isn’t a contradiction all by itself. Remember that
counterfactuals are vacuously true if their antecedents are impossible. So if P
were impossible, then both of these would indeed be true, without any problem.
But ◇P has to be true at r. This rules out those counterfactuals’ being vacuously
true. Since P is possible, the limit assumption has the result that there is a closest
P world. This, together with the two true counterfactuals, created the contradiction.
This reasoning is embodied in the following semantic validity proof:
i) Suppose for reductio that ◇P is true at some world r, but
(P□→Q)→∼(P□→∼Q) is false at r.
ii) Then P□→Q and P□→∼Q are both true at r as well.
iii) Since ◇P is true at r, P is true at some world. So, by the
limit assumption, we have: there exists a world, a, such that
V(P, a) = 1 and for any x, if V(P, x) = 1 then a ⪯_r x. For short,
there is a closest-to-r P world.
iv) The truth condition for □→, applied to P□→Q, gives us that
Q is true at all the closest-to-r P worlds.
v) Similarly, applied to P□→∼Q, we know that ∼Q is true at all
the closest-to-r P worlds.
vi) Thus, both Q and ∼Q would be true at a. Impossible.
Note the use of the limit assumption. It is the limit assumption we must use
when we need to know that there is a nearest φ-world, and we can’t get this
knowledge from other things in the proof.
For the last example of a countermodel, I’ll do a new kind of problem: one
with a counterfactual nested within another counterfactual. Let’s show that the
following formula is SC-invalid: [P□→(Q□→R)]→[(P∧Q)□→R] (this is the
formula corresponding to “importation”). We begin by making the formula
false in r, the actual world of the model. This means making the antecedent true
and the consequent false. Now, since the consequent is a false counterfactual,
we are forced to make there be a nearest P∧Q world in which R is false:
[Diagram, the view from r (nearest world at the bottom):
    a:  P∧Q 1 (P 1, Q 1), R 0
        (no P∧Q between a and r)
    r:  [P□→(Q□→R)]→[(P∧Q)□→R] marked 0, with P□→(Q□→R) 1, (P∧Q)□→R 0]
Now we’ve got to make the first subjunctive conditional true. We can’t make it
vacuously true, because we’ve already got a P-world in the model: a. So, we’ve
got to put in the nearest-to-r P world. Could it be farther away than a? No,
because a would be a closer P world. Could it be a? No, because we’ve got to
make Q□→R true in the closest P world, and since Q is true but R is false in a,
Q□→R is already false in a. So, we do it as follows:
[Diagram, the view from r (nearest world at the bottom):
    a:  P∧Q 1 (P 1, Q 1), R 0
        (no P∧Q between a and r)
    b:  P 1, Q 0, Q□→R 1
        (no P between b and r)
    r:  [P□→(Q□→R)]→[(P∧Q)□→R] marked 0, with P□→(Q□→R) 1, (P∧Q)□→R 0, and P 0]
Why did I make Q false at b? Well, because b is in the “no P∧Q” zone, and P
is true at b, so Q had to be false there.
Now, the remaining thing to do is to make Q□→R true at b. This requires
some thought. The diagram right now represents “the view from r”: it
represents how near the worlds in the model are to r. That is, it represents the
⪯_r relation. But the truth value of Q□→R at b depends on “the view from
b”, that is, the ⪯_b relation. So we need to consider a new diagram, in which b
is the bottom, base, world:
[Diagram, the view from b (nearest world at the bottom):
    c:  Q 1, R 1
        (no Q between c and b)
    b:  P 1, Q 0, Q□→R 1]
I made there be a nearest-to-b Q world, and made R true there. Notice that
I kept the old truth values of b from the other diagram. This is because this
new diagram is a diagram of the same worlds as the old diagram; the difference
is that the new diagram represents the ⪯_b nearness relation, whereas the old
one represented a different relation: ⪯_r. Now, this diagram isn’t finished. The
diagram is that of the ⪯_b relation, and that relation relates all the worlds in
the model. So, worlds r and a have to show up somewhere here. But the
safest practice is to put them far away from b, so that there isn’t any possibility
of conflict with the “no Q” zone that has been established. Thus, the final
appearance of this part of the diagram is as follows:
[Diagram, the view from b (nearest world at the bottom):
    r
    a
    c:  Q 1, R 1
        (no Q between c and b)
    b:  P 1, Q 0, Q□→R 1]
The old truth values from r and a still hold (remember that this is another
diagram of the same model, but representing a different nearness relation),
but I left them out because they’ve already been written on the other part of
the diagram.
Notice that the order of the worlds in the r-diagram does not in any way
affect the order of the worlds in the b-diagram. The nearness relations in
the two diagrams are completely independent, because in our definition of
‘SC-model’, we entered in no conditions constraining the relations between ⪯_i
and ⪯_j when i ≠ j. This sometimes seems unintuitive. For example, we could
have two halves of a model looking as follows:
    The view from r        The view from a
           c                      r
           b                      b
           a                      c
           r                      a
If this seems odd, remember that only some of the features of a diagram are
intended to be genuinely representative. For example, these diagrams are in ink;
this is not intended to convey the idea that the worlds in the model are made
of ink. This feature of the diagram isn’t intended to convey information about
the model. Analogously, the fact that in the left diagram, b is physically closer
to a than c is, is not intended to convey the information that, in the model,
b ⪯_a c. In fact, the diagram of the view from r is only intended to convey
information about ⪯_r; it doesn’t carry any information about ⪯_a, ⪯_b, or ⪯_c.
Back to the countermodel. That other part of the diagram, the view from r,
must be updated to include world c. The safest procedure is to put c far away
on the model to minimize possibility of conflict. Thus, the final picture of the
view from r is:
[Diagram, the view from r (nearest world at the bottom):
    c
    a:  P∧Q 1 (P 1, Q 1), R 0
        (no P∧Q between a and r)
    b:  P 1, Q 0, Q□→R 1
        (no P between b and r)
    r:  [P□→(Q□→R)]→[(P∧Q)□→R] marked 0, with P□→(Q□→R) 1, (P∧Q)□→R 0, and P 0]
Again, I haven’t rewritten the truth values in world c, because they’re already
in the other diagram, but they are to be understood as carrying over. Now for
the official model:
    T = {r, a, b, c}
    V(P, a) = V(Q, a) = V(P, b) = V(Q, c) = V(R, c) = 1; all else (atomics)
    0 elsewhere
    ⪯_r = {〈b, a〉, 〈a, c〉 . . . }
    ⪯_b = {〈c, a〉, 〈a, r〉 . . . }
Notice that we needed to express two of ⪯’s subrelations: ⪯_r and ⪯_b. Remember
that any model has got to contain ⪯_i for every world i in the model. For
example, if we were to write out this model completely officially, we’d have to
specify ⪯_a and ⪯_c. But we don’t bother with those parts of ⪯ that don’t matter.
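Because this countermodel involves a nested counterfactual, checking it requires the view from more than one world. The following is a minimal sketch, not part of the original text: the orderings from r and b follow the official model, while the orderings from a and c are arbitrary choices made here only so that the model is fully specified (they do not affect the result). The helper name `would` is illustrative.

```python
def would(order, phi, psi):
    """phi □→ psi at the base world of `order` (worlds listed nearest first)."""
    for x in order:
        if x in phi:          # nearest antecedent world
            return x in psi
    return True               # vacuous case: no antecedent world

worlds = ["r", "a", "b", "c"]
order = {
    "r": ["r", "b", "a", "c"],   # from the official model: b before a, a before c
    "b": ["b", "c", "a", "r"],   # from the official model: c before a, a before r
    "a": ["a", "r", "b", "c"],   # arbitrary (assumption; not given in the text)
    "c": ["c", "r", "a", "b"],   # arbitrary (assumption; not given in the text)
}
P = {"a", "b"}
Q = {"a", "c"}
R = {"c"}

# Inner counterfactual Q □→ R, evaluated at every world using that world's view.
q_would_r = {w for w in worlds if would(order[w], Q, R)}

antecedent = would(order["r"], P, q_would_r)   # P □→ (Q □→ R) at r
consequent = would(order["r"], P & Q, R)       # (P ∧ Q) □→ R at r
print(sorted(q_would_r), antecedent, consequent)
```

The inner counterfactual is true at b (and at c) but not at r or a; the antecedent of the importation formula is true at r and its consequent false, so the formula is false at r.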
8.6 Logical Features of SC
Here we’ll discuss various features of SC, which appear to confirm that it’s a
good logic for counterfactual conditionals. In part, we will be showing that our
semantics for □→ matches the logical features of natural language that were
discussed in section 8.1.
8.6.1 Not truth-functional
We wanted our system for counterfactuals to have the following features:
    ∼P ⊭ P□→Q
    Q ⊭ P□→Q
Clearly, it does. The first fact is demonstrated by a model in which P is false
at some world, r, and in which there’s a nearest-to-r P world in which Q is
false; the second, by a model in which Q is true at r, P is false at r, and in which
there’s a nearest-to-r P world in which Q is false.
8.6.2 Can be contingent
We wanted it to turn out that:
    P□→Q ⊭ □(P□→Q)
    ∼(P□→Q) ⊭ □∼(P□→Q)
Our semantics does indeed have this result, because the similarity metrics based
on different worlds can be very different. For example: consider a model with
worlds r and a, in which Q is true in the nearest-to-r P world, but in which Q
is false at the nearest-to-a P world. P□→Q is true at r and false at a, whence
□(P□→Q) is false at r.
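One way to fill in such a model is sketched below. This is an illustration only, not part of the original text: it assumes a two-world model in which P is true at both worlds (so each world is its own nearest P world) and Q is true only at r, and it treats □ as truth at every world of the model. The helper name `would` is hypothetical.

```python
def would(order, phi, psi):
    """phi □→ psi at the base world of `order` (worlds listed nearest first)."""
    for x in order:
        if x in phi:          # nearest antecedent world
            return x in psi
    return True               # vacuous case: no antecedent world

worlds = ["r", "a"]
order = {"r": ["r", "a"], "a": ["a", "r"]}
P = {"r", "a"}    # each world is its own nearest P world
Q = {"r"}         # Q holds at r but not at a

# Worlds at which P □→ Q is true.
p_would_q = {w for w in worlds if would(order[w], P, Q)}

true_at_r = "r" in p_would_q                               # P □→ Q at r
box_at_r = all(w in p_would_q for w in worlds)             # □(P □→ Q)
negation_at_a = "a" not in p_would_q                       # ∼(P □→ Q) at a
box_negation = all(w not in p_would_q for w in worlds)     # □∼(P □→ Q)
print(true_at_r, box_at_r, negation_at_a, box_negation)
```

At r the counterfactual is true without being necessary, and at a its negation is true without being necessary, which is what the two displayed facts require.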
8.6.3 No augmentation
Earlier we produced a model containing a world in which a conditional of this
form was false (there with the letters Q and R interchanged):
    [P□→Q]→[(P∧R)□→Q]
Thus, P□→Q ⊭ (P∧R)□→Q.
8.6.4 No contraposition
Let’s show that P□→Q ⊭ ∼Q□→∼P:
[Diagram, the view from r (nearest world at the bottom):
    a:  ∼Q 1 (so Q 0), ∼P 0 (so P 1)
        (no ∼Q between a and r)
    b:  P 1, Q 1
        (no P between b and r)
    r:  P□→Q 1, ∼Q□→∼P 0]
I won’t bother with the official model.
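For readers who want to check the diagram, here is one way the official model could be written out and verified. This is a minimal sketch, not the author’s own formulation: the truth values of P and Q at r (P false, Q true) are read off from the “no P” and “no ∼Q” zones, and the helper name `would` is illustrative.

```python
def would(order, phi, psi):
    """phi □→ psi at the base world of `order` (worlds listed nearest first)."""
    for x in order:
        if x in phi:          # nearest antecedent world
            return x in psi
    return True               # vacuous case: no antecedent world

worlds = {"r", "a", "b"}
order_from_r = ["r", "b", "a"]   # b is nearer to r than a is
P = {"a", "b"}                   # P is false at r (the "no P" zone)
Q = {"r", "b"}                   # Q is false only at a (the "no ∼Q" zone)
not_P = worlds - P
not_Q = worlds - Q

premise = would(order_from_r, P, Q)              # P □→ Q at r
conclusion = would(order_from_r, not_Q, not_P)   # ∼Q □→ ∼P at r
print(premise, conclusion)
```

The nearest P world to r is b, where Q is true; the nearest ∼Q world is a, where P is still true, so the contrapositive fails.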
8.6.5 Some implications
Let’s show that the SC semantics vindicates the inference from the counterfactual
conditional to the material conditional, and from the strict conditional to
the counterfactual conditional.
Proof that P□→Q ⊨ P→Q:
i) Suppose P□→Q is true at some world r in some SC-model,
and …
ii) … suppose for reductio that P→Q isn’t true there.
iii) Then P is true at r and …
iv) … Q is false at r.
v) Given “base”, for every world, x, r ⪯_r x.
vi) Given iii) and v), r is a closest-to-r P world. So, given i), Q
is true at r. This contradicts iv).
Proof that P⇒Q ⊨ P□→Q:
i) Suppose P⇒Q is true at r, and suppose for reductio that
P□→Q is false at r.
ii) Then there’s a nearest-to-r P world, a, at which Q is false.
iii) But that can’t be. “P⇒Q” means □(P→Q). So P→Q is true
at every world. So there can’t be a world like a, in which P is
true and Q is false.
8.6.6 No exportation
We have shown that the SC-semantics reproduces the logical features of natural
language counterfactuals discussed in section 8.1. In the next few sections we
discuss some further logical features of the SC-semantics, and compare them
with the logical features of the →, the ⇒, and natural language counterfactuals.
The → obeys exportation:
    (P∧Q)→R
    ---------
    P→(Q→R)
But the ⇒ doesn’t in any system; (P∧Q)⇒R ⊭_S5 P⇒(Q⇒R). Nor does
the □→; it can easily be shown with a countermodel that
(P∧Q)□→R ⊭_SC P□→(Q□→R).
Does the natural language counterfactual obey exportation? Here is an
argument that it does not. The following is true:
(L&H) If Bill had married Laura and Hillary, he would have been a
bigamist.
But one can argue that the following is false:
(L□→H) If Bill had married Laura, then it would have been the case
that if he had married Hillary, he would have been a bigamist.
Suppose Bill had married Laura. Would it then have been true that: if he had
married Hillary, he would have been a bigamist? Well, let’s ask for comparison:
what would the world have been like, had George W. Bush married Hillary
Rodham Clinton? Would Bush have been a bigamist? Here the natural answer
is no. George W. Bush is in fact married to Laura Bush; but when imagining him
married to Hillary Rodham Clinton, we don’t hold constant his actual marriage.
We imagine him being married to Hillary instead. If this is true for Bush, then
one might think it’s also true for Bill in the counterfactual circumstance in
which he’s married to Laura: it would then have been true of him that, if he
had married Hillary, he wouldn’t have still been married to Laura, and hence
would not have been a bigamist.
It’s unclear whether this is a good argument, though, since it assumes that
ordinary standards for evaluating unembedded counterfactuals (“If George had
married Hillary, he would have been a bigamist”) apply to counterfactuals
embedded within other counterfactuals (“If Bill had married Hillary, he would have
been a bigamist” as embedded within (L□→H)). Contrary to the assumption, it
seems most natural to evaluate the consequent of an embedded counterfactual
like (L□→H) by holding its antecedent constant. But a defender of the
SC-semantics might argue that (L□→H) has a reading on which it is false (recall the
context-dependence of counterfactuals), and hence that we need a semantics
that allows for the failure of exportation.
8.6.7 No importation
Importation holds for →, and for ⇒ in T and stronger systems:
    P→(Q→R)        P⇒(Q⇒R)
    ---------      ---------
    (P∧Q)→R        (P∧Q)⇒R
but not for the □→: above we produced an SC-model with a world in which
the conditional [P□→(Q□→R)]→[(P∧Q)□→R] was false.
The status of importation for natural language counterfactuals is similar
to the status of exportation. If (L□→H) is false (on at least one reading), then
presumably the following could be true (on at least one reading):
If Bill had married Laura, then it would have been the case that if
he had married Hillary, he would have been happy.
without the result of importing being true:
If Bill had married Laura and Hillary, he would have been happy
(if he had married both, his relatives would have hated him, he would have
become a public spectacle, etc.)
8.6.8 No hypothetical syllogism (transitivity)
The following inference pattern is valid for → and ⇒ (in all systems):
    Q→R        Q⇒R
    P→Q        P⇒Q
    -----      -----
    P→R        P⇒R
but not for □→, as our proof above of the invalidity of
[(P□→Q)∧(Q□→R)]→(P□→R) shows.
Natural language counterfactuals also seem not to obey hypothetical
syllogism. I am the oldest child in my family; my brother Mike is the second-oldest.
So the following counterfactuals seem true:⁴
    If I hadn’t been born, Mike would have been my parents’ oldest
    child.
    If my parents had never met, I wouldn’t have been born.
But the result of applying hypothetical syllogism seems false:
    If my parents had never met, Mike would have been their oldest
    child.
8.6.9 No transposition
Transposition governs the →:
    P→(Q→R)
    ---------
    Q→(P→R)
but not the ⇒ (in any of our modal systems); P⇒(Q⇒R) ⊭_S5 Q⇒(P⇒R). Nor
does it govern the □→; it’s easy to show that
P□→(Q□→R) ⊭_SC Q□→(P□→R).
The status of transposition for natural language counterfactuals is similar
to that of importation and exportation. If we can ignore the effects of
embedding on the evaluation of counterfactuals, then we have the following
counterexample to transposition. It is true that:
    If Bill Clinton had married Laura Bush, then if he had married
    Hillary Rodham, he’d have been married to a Democrat.
But it is not true that:
    If Bill Clinton had married Hillary Rodham, then if he had married
    Laura Bush, he’d have been married to a Democrat.
⁴ Note that if these two statements are written in the reverse order, it seems far less clear
that they’re both true:
    If my parents had never met, I wouldn’t have been born.
    If I hadn’t been born, Mike would have been my parents’ oldest child.
It’s natural in this case to interpret the second conditional by holding constant the antecedent
of the first conditional. This fact, together with what we observed about embedded
counterfactuals, suggests a systematic dependence of the interpretation of counterfactuals on their
immediate linguistic context. See von Fintel (2001) for a “dynamic” semantics for
counterfactuals, which more accurately models this feature of their use.
8.7 Lewis’s criticisms of Stalnaker’s theory
David Lewis has a rival theory of counterfactuals. Like Stalnaker’s theory, it is
based on similarity, but it differs from Stalnaker’s in certain respects. So far, we
have only discussed features of Stalnaker’s system that are shared by Lewis’s.
Let’s turn, now, to the differences.
Lewis challenges Stalnaker’s assumption that ⪯_w is always antisymmetric.
Real similarity relations permit ties. So it seems implausible to rule out the
possibility of two worlds being exactly similar to a given world.
The challenge to Stalnaker here is most straightforward if Stalnaker intends
to be giving truth conditions for natural language counterfactuals, rather than
merely doing model theory. In that case, the set T in an SC-model must be the
set of genuine possible worlds, and ⪯ must be a relation of genuine similarity,
in which case it ought to admit ties. But even if Stalnaker is not doing this,
the objection may yet have bite, to the extent that the semantics of natural
language conditionals is like similarity-theoretic semantics.⁵
⁵ For an interesting response to Lewis, see Stalnaker (1981).
The validity of certain formulas depends on the “no ties” assumption; the
following two wffs are SC-valid, but are challenged by Lewis:
    (P□→Q) ∨ (P□→∼Q)    (“conditional excluded middle”)
    [P□→(Q∨R)] → [(P□→Q)∨(P□→R)]    (“distribution”)
Take the first one, for example. Suppose you gave up antisymmetry, thereby
allowing ties. Then the following would be a countermodel for the law of
conditional excluded middle:
[Diagram, the view from r: worlds a and b are tied, side by side above r, as the
nearest P worlds to r.
    a:  P 1, Q 0            b:  P 1, ∼Q 0 (so Q 1)
    r:  (P□→Q)∨(P□→∼Q) marked 0, with P□→Q 0, P□→∼Q 0]
Remember that P□→Q is true only if Q is true in all the nearest P worlds.
In this model, Q is true in one of the nearest P worlds, but not all, so that
counterfactual is false at r. Similarly for P□→∼Q.
A similar model shows that distribution fails if the “no ties” assumption is
given up.
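The effect of allowing ties can be checked on this model. The following is a minimal sketch, not part of the original text: it assumes nearness to r is recorded as a numerical rank (smaller means nearer, equal ranks mean a tie), keeps the “true at all the nearest antecedent worlds” truth condition, and uses the illustrative names `would_with_ties`, `tied`, and `untied`.

```python
def would_with_ties(rank, phi, psi):
    """phi □→ psi at the base world: psi must hold at ALL nearest phi worlds.

    rank maps each world to its distance from the base world; equal ranks are ties.
    """
    if not phi:
        return True                        # vacuous case: no antecedent world
    best = min(rank[x] for x in phi)
    return all(x in psi for x in phi if rank[x] == best)

worlds = {"r", "a", "b"}
P = {"a", "b"}
Q = {"b"}
not_Q = worlds - Q

# a and b tied as the nearest P worlds to r.
tied = {"r": 0, "a": 1, "b": 1}
cem_tied = would_with_ties(tied, P, Q) or would_with_ties(tied, P, not_Q)

# Same valuation, but with the tie broken in favour of a.
untied = {"r": 0, "a": 1, "b": 2}
cem_untied = would_with_ties(untied, P, Q) or would_with_ties(untied, P, not_Q)
print(cem_tied, cem_untied)
```

With the tie, neither disjunct of conditional excluded middle is true at r; once the tie is broken, one of them must be.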
So, should we give up conditional excluded middle? As Lewis concedes,
the principle is initially plausible. An equivalent formulation of conditional
excluded middle is this:
    ∼(P□→Q)→(P□→∼Q)
But everyone agrees that (P□→∼Q)→∼(P□→Q) is always true, at least when
P is possibly true. So, in cases where P is possibly true anyway, the question
of whether conditional excluded middle is valid is the question of whether
∼(P□→Q) and P□→∼Q are equivalent to each other. And it does indeed
seem that in ordinary usage, one expresses the negation of a counterfactual
by negating its consequent. To deny the counterfactual “if she had played,
she would have won”, one says “no, she wouldn’t have!”, meaning “if she had
played, she would not have won”.
And take the other formula validated by Stalnaker’s theory, distribution. In
reply to: “if the coin had been flipped, it would have come out either heads or
tails”, one might ask: “which would it have been, heads or tails?”. The thinking
behind the reply is that “if the coin had been flipped, it would have come up
heads”, or “if the coin had been flipped, it would have come up tails” must be
true.
So there’s some plausibility to both these formulas. But Lewis says two
things. The first is metaphysical: if we’re going to accept the similarity analysis,
we’ve got to give them up, because ties just are possible. The second is purely
semantic: the intuitions aren’t completely compelling. About the coin-flipping
case, Lewis denies that if the coin had been flipped, it would have come up
heads, and he also denies that if the coin had been flipped, it would have come
up tails. Rather, he says, if it had been flipped, it might have come up heads.
And if it had been flipped, it might have come up tails. But neither outcome is
such that it would have resulted, had the coin been flipped.
Concerning excluded middle, Lewis says:
    It is not the case that if Bizet and Verdi were compatriots, Bizet
    would be Italian; and it is not the case that if Bizet and Verdi were
    compatriots, Bizet would not be Italian; nevertheless, if Bizet and
    Verdi were compatriots, Bizet either would or would not be Italian.
    (Counterfactuals, p. 80.)
Lewis can follow this up by noting that if Bizet and Verdi were compatriots,
Bizet might be Italian, but it’s not the case that if they were compatriots, he
would be Italian.
Here is a related complaint of Lewis’s about Stalnaker’s semantics. In the
last little bit, I’ve used English phrases of the form “if it were the case that φ,
then it might have been the case that ψ”. This conditional Lewis calls “the
‘might’ counterfactual”; he symbolizes it as φ◇→ψ, and defines it thus:
    φ◇→ψ =df ∼(φ□→∼ψ)
Lewis criticizes Stalnaker’s system for the fact that this definition of ◇→ doesn’t
work in Stalnaker’s system. Why not? Well, since internal negation (the move
from ∼(φ□→ψ) to φ□→∼ψ) is valid in Stalnaker’s system, φ◇→ψ would always
imply φ□→ψ. That is not good, since the might-conditional in English seems
weaker than the would-conditional. So, Lewis’s definition of ◇→ doesn’t work in
Stalnaker’s system. Moreover, there doesn’t seem to be any other plausible
definition. So, Stalnaker can’t define ◇→.⁶
⁶ Lewis (1973, p. 80).
Lewis also objects to Stalnaker’s limit assumption. The following line is
less than one inch long:
    ----------
Now, consider the counterfactual:
    If the line were more than 1 inch long, it would be over 100 miles
    long.
Seems false. But if we use Stalnaker’s truth conditions as truth conditions for
natural language counterfactuals, and take our intuitive judgments of similarity
seriously, we seem to get the result that it is true! The reason is that there
doesn’t seem to be a closest world in which the line is more than 1 inch long.
For every world in which the line is, say, 1 + k inches long, there’s another world
in which the line has a length closer to its actual length but still more than 1
inch long: say, 1 + k/2 inches. So there doesn’t seem to be any closest world in
which the line is over 1 inch long.
In light of these criticisms, Lewis proposes a new similarity-based semantics
for counterfactuals, which assumes neither antisymmetry nor the limit
assumption. Let’s look at that system.
8.8 Lewis’s system⁷
To move from Stalnaker’s system to Lewis’s, we can start by just dropping the
antisymmetry assumption. We also want to drop the limit assumption. But
after dropping the limit assumption, if we made no further adjustments to the
system, we would get unwanted vacuous truths, as we did in the example of the
1 inch line above.⁸ The truth definition for □→ needs to be changed. Instead
of saying that φ□→ψ is true iff ψ is true in all the nearest φ worlds, we will
instead say that φ□→ψ is true iff either i) φ is true in no worlds (the vacuous
case), or ii) there is some φ world such that for every world at least as close,
φ→ψ is true there.
Here is the new system, LC (Lewis-counterfactuals). It is exactly the same
as the Stalnaker system except that limit and antisymmetry are dropped, and
the base condition and the truth condition for □→ (set in boldface in the
original) are changed:
An LC-model is a three-tuple 〈T, ⪯, V〉, where:
i) T is a nonempty set
ii) V is a function that assigns either 0 or 1 to each wff relative
to each member of T
iii) ⪯ is a three-place relation over T
iv) V and ⪯ satisfy the following conditions:
    C1: for any w, ⪯_w is strongly connected
    C2: for any w, ⪯_w is transitive
    C3: for any x, y, if y ⪯_x x then x = y (“base”)
v) For all wffs, φ, ψ and for all w ∈ T:
    a) V(∼φ, w) = 1 iff V(φ, w) = 0
    b) V(φ→ψ, w) = 1 iff either V(φ, w) = 0 or V(ψ, w) = 1
    c) V(□φ, w) = 1 iff for any v, V(φ, v) = 1
    d) V(φ□→ψ, w) = 1 iff EITHER φ is true at no worlds,
       OR: there is some world, x, such that V(φ, x) = 1 and
       for all y, if y ⪯_w x then V(φ→ψ, y) = 1
⁷ See Lewis (1973, pp. 48–49). My formulation does away with the accessibility relation (in
Lewis’s terminology, S_i, the set of worlds accessible from world i, is always T, the set of all
worlds in the model), so it is a bit simpler.
⁸ Actually, dropping the limit assumption doesn’t affect the class of valid formulas, which is
the same with or without the limit assumption. (See Lewis (1973, p. 121).)
It may be verified that every LC-valid wff is SC-valid (although not vice
versa).⁹
Comments on all this: First, notice that the limit and antisymmetry
conditions are simply dropped. Second, the Base condition is modified; now it
says that no other world is as close to a world as that world is to itself. Before,
it said that each world is at least as close to itself as any other. Stalnaker’s Base
condition, plus antisymmetry, entails the present Base condition. But Lewis’s
system doesn’t have antisymmetry, so the Base condition must be stated in the
stronger form.¹⁰
Third, let’s think about what the truth condition for the □→ says. First,
there’s the vacuous case: if φ is necessarily false then φ□→ψ comes out true.
But if φ is possibly true, then what the clause says is this: φ□→ψ is true at
w iff there’s some φ world where ψ is true, such that no matter how much
closer to w you go, you’ll never get a φ world where ψ is false. If there is a
nearest-to-w φ world, then this implies that φ□→ψ is true at w iff ψ is true in
all the nearest-to-w φ worlds.
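The last remark can be checked mechanically on a small finite model. The following is only a sketch under assumptions of my own: propositions are represented as sets of worlds, closeness to the evaluation world w is given by a made-up numeric table DIST, and the names would, would_nearest and propositions are illustrative, not part of the official definition.

```python
from itertools import product

# Assumed toy model: closeness to the single evaluation world "w" is a number,
# so that  y ⪯_w x  holds iff DIST[y] <= DIST[x].  Worlds a and b are tied.
WORLDS = ["w", "a", "b", "c"]
DIST = {"w": 0, "a": 1, "b": 1, "c": 2}

def would(phi, psi):
    """Lewis's clause d) at w; phi and psi are given as sets of worlds."""
    if not phi:                      # vacuous case: phi is true at no worlds
        return True
    # some phi-world x such that phi→psi holds at every y at least as close as x
    return any(
        all(y not in phi or y in psi for y in WORLDS if DIST[y] <= DIST[x])
        for x in phi
    )

def would_nearest(phi, psi):
    """The simpler clause: psi holds at all the nearest-to-w phi-worlds."""
    if not phi:
        return True
    nearest = min(DIST[x] for x in phi)
    return all(x in psi for x in phi if DIST[x] == nearest)

def propositions():
    """Every set of worlds, i.e. every possible truth-value pattern."""
    for bits in product([False, True], repeat=len(WORLDS)):
        yield {x for x, b in zip(WORLDS, bits) if b}

# Compare the two clauses on every pair of propositions over this model.
clauses_agree = all(
    would(p, q) == would_nearest(p, q)
    for p in propositions() for q in propositions()
)
```

Since every finite model has nearest φ worlds whenever it has φ worlds at all, the two clauses cannot come apart here; the difference only shows up in models with infinite descending chains of ever-closer worlds.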
So, thinking of these as truth-conditions for natural-language counterfactuals
for a moment, recall the sentence:
⁹ Here’s a sketch of an argument. Let’s say that an LC model is “Stalnaker-acceptable” iff it
obeys the limit and antisymmetry assumptions. Suppose that φ is LC-valid. Then it’s true in
all Stalnaker-acceptable LC-models. Now, notice that in Stalnaker-acceptable models, Lewis’s
truth-conditions for □→ are identical to Stalnaker’s. So, φ must be true in all SC-models.
¹⁰ Why do we want to prohibit worlds being just as close to w as w is to itself? So that P∧Q
semantically implies P□→Q. Otherwise P∧Q could be true at w while P∧∼Q was true at
some world as close to w as w is to itself, in which case P□→Q would turn out false at w.
If the line were over 1 inch long, it would be over 10 inches long.
There’s no nearest world in which the line is over 1 inch long, only an infinite
series of worlds where the line has lengths getting closer and closer to 1 inch
long. But this doesn’t make the counterfactual true. A counterfactual is true if
its antecedent is impossible, but that’s not true in this case. So the only way the
counterfactual could be true is if the second part of the definition is satisfied: if,
that is, there is some world, x, such that the antecedent is true at x, and the
material conditional (antecedent→consequent) is true at every world at least
as similar to the actual world as is x. Since the “at least as similar as” relation is
reflexive, this can be rewritten thus:
for some world, x, the antecedent and consequent are both true at
x, and the material conditional (antecedent→consequent) is true
at every world at least as similar to the actual world as is x
So, is there any such world, x? No. For let x be any world at which the
antecedent and consequent are both true—i.e., any world in which the line is
over 10 inches long. We can always find a world that is more similar to the actual
world than x in which the material conditional (antecedent→consequent) is
false: just choose a world just like x but in which the line is only, say, 2 inches
long.
Let’s see how Lewis’s theory works in the case of a true counterfactual, for
instance:
If I were more than six feet tall, then I would be less than nine feet
tall
(I am, in fact, less than six feet tall.) The situation here is similar to the previous
example in that there is no nearest world in which the antecedent is true. But
now, we can find a world x, in which the antecedent and consequent are both
true, and such that the material conditional (antecedent→consequent) is true
in every world at least as similar to the actual world as is x. Simply take x to
be a world just like the actual world but in which I am, say, six-feet-one. Any
world that is at least as similar to the actual world as this world must be one in
which I’m less than nine feet tall; so in any such world the material conditional
(I’m more than six feet tall→I’m less than nine feet tall) is true.
Notice that the formulas representing Conditional Excluded Middle and
Distribution come out invalid now, because of the possibility of ties.
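To see how a tie defeats Conditional Excluded Middle, here is a minimal sketch of a countermodel. The model is my own illustration: two P worlds, a and b, are assumed to be equally close to the actual world w, and the numeric table DIST and the helper would are hypothetical stand-ins for ⪯_w and for Lewis’s truth condition.

```python
# Assumed toy model: P is false at the actual world w; the P-worlds a and b are
# exactly tied in closeness to w, and Q holds at a but not at b.
WORLDS = ["w", "a", "b"]
DIST = {"w": 0, "a": 1, "b": 1}      # y ⪯_w x  iff  DIST[y] <= DIST[x]
P = {"a", "b"}
Q = {"a"}
NOT_Q = set(WORLDS) - Q

def would(phi, psi):
    """Lewis's truth condition for phi □→ psi at w (propositions as sets of worlds)."""
    if not phi:                      # vacuous case
        return True
    return any(
        all(y not in phi or y in psi for y in WORLDS if DIST[y] <= DIST[x])
        for x in phi
    )

# The instance (P□→Q) ∨ (P□→∼Q) of Conditional Excluded Middle, evaluated at w.
cem_instance = would(P, Q) or would(P, NOT_Q)
```

Neither disjunct holds at w: whichever P world is chosen as x, the other P world is just as close and disagrees about Q. With antisymmetry in place the tie between a and b would be ruled out.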
Another thing: Lewis gives the following definition for the ‘might’-counterfactual:
φ◇→ψ =df ∼(φ□→∼ψ)
From this we may obtain a derived clause for the truth conditions of φ◇→ψ:
V(φ◇→ψ, w) = 1 iff for some x, V(φ, x) = 1, and for any x, if
V(φ, x) = 1 then there’s some y such that y ⪯_w x and V(φ∧ψ, y) = 1
That is, φ◇→ψ is true at w iff φ is possible, and for any φ world, there’s a
world as close or closer to w in which φ and ψ are both true. In cases where
there is a nearest φ world, this means that ψ must be true in at least one of the
nearest φ worlds.
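The derived clause comes from negating the truth condition for □→ and pushing the negation inward. The steps can be written out as follows; the macros \cf and \mcf are ad hoc names of mine for □→ and ◇→, and the block assumes the amsmath and amssymb packages.

```latex
% Assumes amsmath and amssymb; \cf and \mcf are ad hoc macros for □→ and ◇→.
\newcommand{\cf}{\mathrel{\Box\!\!\rightarrow}}
\newcommand{\mcf}{\mathrel{\Diamond\!\!\rightarrow}}
\begin{align*}
V(\varphi \mcf \psi, w) = 1
  &\iff V(\varphi \cf {\sim}\psi, w) = 0 \\
  &\iff \exists x\, V(\varphi, x) = 1 \ \text{and}\
        \forall x\,\bigl[V(\varphi, x) = 1 \Rightarrow
        \exists y\,(y \preceq_w x \ \text{and}\ V(\varphi \to {\sim}\psi, y) = 0)\bigr] \\
  &\iff \exists x\, V(\varphi, x) = 1 \ \text{and}\
        \forall x\,\bigl[V(\varphi, x) = 1 \Rightarrow
        \exists y\,(y \preceq_w x \ \text{and}\ V(\varphi \wedge \psi, y) = 1)\bigr]
\end{align*}
```

The second line negates both halves of the clause for □→ (the vacuous case and the existential case); the third uses the fact that φ→∼ψ is false at a world exactly when φ∧ψ is true there.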
8.9 The problem of disjunctive antecedents
Before we leave counterfactual conditionals, I want to talk about one criticism
that has been raised against both Lewis’s and Stalnaker’s systems.¹¹ In neither
system does the formula (P∨Q)□→R semantically imply P□→R. (Take a model
where there is a unique nearest P∨Q world to r, in which Q and R are true but P
is not; and make there be a unique nearest P world in which R is false.) But shouldn’t
this implication hold? Imagine a conversation between Butch Cassidy and
the Sundance Kid in heaven, after having been surrounded and killed by the
Bolivian army. They say:
If we had surrendered or tried to run away, we would have been
shot.
Intuitively, if this is true, so is this:
If we had surrendered, we would have gotten shot.
In general, one is entitled to conclude from “If P or Q had been the case,
then R would have been the case” that “if P had been the case, R would have
been the case”. If Butch Cassidy and the Sundance Kid could have survived by
surrendering, they certainly would not say to each other “If we had surrendered
or tried to run away, we would have been shot”.
¹¹ For references, see the bibliography of Lewis (1977).
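The countermodel described parenthetically at the start of this section can be written down and checked. This is a sketch under my own assumptions: three worlds with no ties, closeness to r given by a made-up table DIST, and propositions represented as sets of worlds. Because there are no ties and the model is finite, Stalnaker’s and Lewis’s truth conditions should give the same verdicts here.

```python
# Assumed toy model: r is the world of evaluation; a is the unique nearest
# P∨Q world (Q and R true, P false); b is the unique nearest P world (R false).
WORLDS = ["r", "a", "b"]
DIST = {"r": 0, "a": 1, "b": 2}      # y ⪯_r x  iff  DIST[y] <= DIST[x]
P, Q, R = {"b"}, {"a"}, {"a"}

def would(phi, psi):
    """Lewis's truth condition for phi □→ psi at r (propositions as sets of worlds)."""
    if not phi:                      # vacuous case
        return True
    return any(
        all(y not in phi or y in psi for y in WORLDS if DIST[y] <= DIST[x])
        for x in phi
    )

premise = would(P | Q, R)            # (P∨Q) □→ R, evaluated at r
conclusion = would(P, R)             # P □→ R, evaluated at r
```

The premise comes out true and the conclusion false, so the implication fails in this model.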
Is this a problem for Lewis and Stalnaker? Some have argued this, but
others respond as follows. One must take great care in translating from natural
language into logic. For example,¹² no one would want to criticize the law
∼∼P→P on the grounds that “There ain’t no way I’m doing that” doesn’t
imply that I might do that. And there are notoriously peculiar things about the
behavior of ‘or’ in similar contexts. Consider:
(1) You are permitted to stay or go.
One can argue that this does not have the form:
(2) Permit(Stay ∨ Go)
After all, suppose that you are permitted to stay, but not to go. If you stay, you
can’t help doing the following act: staying ∨ going. So, surely, you’re permitted
to do that. So, (2) is true. But (1) isn’t; if someone uttered (1) to you when you
were in jail, they’d be lying to you! (1) really means:
(3) You are permitted to stay, AND you are permitted to go.
Similarly, “If either P or Q were true then R would be true” seems usually to
mean “If P were true then R would be true, and if Q were true then R would be
true”. We can’t just expect natural language to translate directly into our logical
language—sometimes the surface structure of natural language is misleading.
¹² The example is adapted from Loewer (1976).
Chapter 9
Quantified Modal Logic
We’re going to look at Kripke-style semantics for quantified modal logic.
The language is what you get by adding the □ and ◇ to the language
of predicate logic. There are many interesting issues concerning the interaction
of modal operators with quantifiers.
9.1 Grammar of QML
Grammatically, the language of QML is the same as that of plain old predicate
logic, except that we add the □. Thus, the one new clause to the definition of a
wff is the clause that if φ is a wff, then so is □φ. The ◇ is defined as before: as
∼□∼.
9.2 Symbolizations in QML
Adding quantifiers to our modal language allows us to make new distinctions
when symbolizing sentences of English. In particular, it allows us to make the
famous distinction between de re and de dicto modal statements. A paradigm
instance concerns the ambiguity of the sentence
(1) Some rich person might have been poor.
If we don’t have any quantifiers around, it seems that we have only one possible
symbolization: as ◇P, where P stands for “Some rich person is poor”. But this
symbolization is incorrect, because in no possible world is the statement “Some
rich person is poor” true! In contrast, (1) seems to have a reading on which it
is true. When we turn to symbolize it in QML, we have better luck. There
seem to be two possibilities:
(1a) ◇∃x(Rx∧Px)
(1b) ∃x(Rx∧◇Px)
(1a) is just the rejected symbolization as ◇P, with P becoming ∃x(Rx∧Px).
But (1b) is better—translating back into English, it says that there is a person
who is in fact rich, but is such that s/he might have been poor. That’s what the
original sentence meant, on its most natural reading.
The ambiguity of (1) is said to be a “de re/de dicto” ambiguity. (1a) is the “de
dicto” reading: the modal operator ◇ attaches to a complete closed sentence.
It’s called “de dicto” because the property of possibility is attributed to the
sentence, a dictum. (1b) is the de re reading: there, the modal operator
attaches to an open sentence. It’s called “de re” (“of the object”), because ◇Fx
can be thought of as attributing a certain property to an object u when x is
assigned the value u, namely the modal property of possibly being F.
The following sentence also exhibits a de re/de dicto ambiguity:
Every bachelor is necessarily male.
The two readings are:
□∀x(Bx→Mx) (De dicto)
∀x(Bx→□Mx) (De re)
The de dicto reading is true because it is true that in any possible world, anyone
that is in that world a bachelor is, in that world, male. The de re reading is
false because it makes the claim that if any object, u, is a bachelor in the actual
world, that object u is necessarily male, i.e., the object u is male in
all possible worlds. That claim seems false: presumably some objects that are in
fact bachelors might not have been male.
Definite descriptions also exhibit a de re/de dicto ambiguity. This may be
illustrated by using Russell’s theory of descriptions (section 5.3.3). Russell’s
method generates an ambiguity when a sentential operator such as “not” is
added to “is G”, for example:
The striped bear is not dangerous
Does this mean:
∼∃x(Sx∧Bx∧∀y([Sy∧By]→x=y)∧Dx)
or does it mean the following?
∃x(Sx∧Bx∧∀y([Sy∧By]→x=y)∧∼Dx)
I.e., is the sentence denying that there is one and only one striped bear that
is dangerous, or is it saying that there is one and only one striped bear, and
that bear is nondangerous? There’s an ambiguity, which is a matter of the
relative scopes of the definite description and the ∼. Sentences with definite
descriptions and modal operators have an analogous ambiguity. Consider, for
instance:
The number of the planets is necessarily odd
Letting “Nx” mean that x numbers the planets, this could be symbolized in
either of the following ways:
□∃x(Nx∧∀y(Ny→x=y)∧Ox)
∃x(Nx∧∀y(Ny→x=y)∧□Ox)
The first is de dicto; it says that it’s necessary that: there is one and only one
number of the planets, and that number is odd; that’s false (since there could
have been eight planets.) The second is de re; it says that (in fact) there is one
and only one number of the planets, and that that number is necessarily odd.
That’s true, since the number that is in fact the number of the planets—the
number nine—is indeed necessarily odd.
9.3 A simple semantics for QML
I’m going to simplify our discussion of QML in a couple of ways. I won’t consider
axiom systems at all; we’ll go straight to semantics.¹ Further, I’ll begin by
considering a very simple semantics, SQML (for “simple QML”). It’s simple in
two ways. First, there will be no accessibility relation. □φ will be said to be true
iff φ is true in all worlds in the model. In effect, each world is accessible from
every other (and hence the underlying propositional modal logic is S5). Second,
it will be a “constant domain” semantics. (We’ll discuss what this means, and
more complex semantical treatments of QML, below.)
¹ Cresswell and Hughes (1996, part III) present axiom systems for QML.
An SQML-model is an ordered triple 〈T, D, I〉 such that:
i) T is a nonempty set (“possible worlds”)
ii) D is a nonempty set (“domain”)
iii) I is a function such that: (“interpretation function”)
   a) if α is a constant then I(α) ∈ D
   b) if Π^n is an n-place predicate then I(Π^n) is a set of ordered
      n+1-tuples 〈u_1, ..., u_n, w〉, where u_1, ..., u_n are members
      of D, and w ∈ T.
Recall that our semantics for modal propositional logic assigned truth values
to sentence letters relative to possible worlds. We have something similar
here: we relativize the interpretation of predicates to possible worlds. The
interpretation of a two-place predicate, R, for example is a set of ordered
triples, two members of which are in the domain, and one member of which is
a possible world. When 〈u_1, u_2, w〉 is in the interpretation of R, that represents
R’s applying to u_1 and u_2 in possible world w. In a possible worlds setting, this
relativization makes intuitive sense: a predicate can apply to some objects in
one possible world but fail to apply to those same objects in some other possible
world.
Notice that the interpretations of constants are not relativized in any way to
possible worlds. The interpretation I assigns simply a member of the domain
to a name. This reflects the common belief that natural language proper
names (which constants are intended to represent) are rigid designators, i.e.,
terms that have the same denotation relative to every possible world (see Kripke
(1972)). We’ll discuss the significance of this feature of our semantics below.
On to the definition of the valuation function for an SQML-model. First,
we keep the definition of a variable assignment from nonmodal predicate logic
(section 4). Our variable assignments therefore assign members of the domain
to variables absolutely, rather than relative to worlds. (This is an appropriate
choice given our choice to assign constants absolute semantic values.) But the
valuation function will now relativize truth values to possible worlds (as well
as to variable assignments). After all, the sentence ‘Fa’, if it represents “Ted is
tall”, should vary in truth value from world to world.
The valuation function V_{M,g}, for SQML-model M (= 〈T, D, I〉)
and variable assignment g, is defined as the function that assigns
either 0 or 1 to each wff relative to each member of T, subject to
the following constraints:
i) for any terms α, β, V_{M,g}(α=β, w) = 1 iff [α]_{M,g} = [β]_{M,g}
ii) for any n-place predicate, Π, and any terms α_1, ..., α_n,
   V_{M,g}(Πα_1...α_n, w) = 1 iff 〈[α_1]_{M,g}, ..., [α_n]_{M,g}, w〉 ∈ I(Π)
iii) for any wffs φ, ψ, and any variable, α,
   a) V_{M,g}(∼φ, w) = 1 iff V_{M,g}(φ, w) = 0
   b) V_{M,g}(φ→ψ, w) = 1 iff either V_{M,g}(φ, w) = 0 or V_{M,g}(ψ, w) = 1
   c) V_{M,g}(∀αφ, w) = 1 iff for every u ∈ D, V_{M,g(u/α)}(φ, w) = 1
   d) V_{M,g}(□φ, w) = 1 iff for every v ∈ T, V_{M,g}(φ, v) = 1
The derived clauses are what you’d expect, including the following one for ◇:
V_{M,g}(◇φ, w) = 1 iff for some v ∈ T, V_{M,g}(φ, v) = 1
Finally, we have:
φ is valid in M (= 〈T, D, I〉) iff for every variable assignment, g,
and every w ∈ T, V_{M,g}(φ, w) = 1
φ is SQML-valid (“⊨_SQML φ”) iff φ is valid in all SQML models.
Γ SQML-semantically-implies φ (“Γ ⊨_SQML φ”) iff for every world
w in every SQML model, if every member of Γ is true at w, then
so is φ
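The clauses of the valuation function translate fairly directly into a recursive evaluator. What follows is only a sketch: the encoding of formulas as nested tuples, the operator tags ("pred", "box", and so on), and the little model used to try it out are all assumptions of mine. The model is built to separate the de dicto and de re readings of “Every bachelor is necessarily male” from section 9.2, by stipulating that the object u is neither a bachelor nor male in a second world.

```python
def V(f, w, g, T, D, I):
    """SQML truth value of formula f (nested tuples) at world w under assignment g."""
    op = f[0]
    den = lambda t: g[t] if t in g else I[t]        # variables via g, names via I
    ev = lambda h, v=w, k=g: V(h, v, k, T, D, I)    # re-evaluate at world v, assignment k
    if op == "pred":                                # ("pred", "F", t1, ..., tn)
        return tuple(den(t) for t in f[2:]) + (w,) in I[f[1]]
    if op == "eq":
        return den(f[1]) == den(f[2])
    if op == "not":
        return not ev(f[1])
    if op == "imp":
        return (not ev(f[1])) or ev(f[2])
    if op == "all":                                 # ("all", "x", body)
        return all(ev(f[2], k={**g, f[1]: u}) for u in D)
    if op == "box":                                 # no accessibility relation
        return all(ev(f[1], v=v) for v in T)
    raise ValueError(f"unknown operator {op!r}")

# Assumed model: u is a male bachelor at r, and neither at the other world.
T = ["r", "other"]
D = ["u"]
I = {"B": {("u", "r")}, "M": {("u", "r")}}

Bx, Mx = ("pred", "B", "x"), ("pred", "M", "x")
de_dicto = ("box", ("all", "x", ("imp", Bx, Mx)))   # □∀x(Bx→Mx)
de_re = ("all", "x", ("imp", Bx, ("box", Mx)))      # ∀x(Bx→□Mx)

de_dicto_at_r = V(de_dicto, "r", {}, T, D, I)
de_re_at_r = V(de_re, "r", {}, T, D, I)
```

In this model the de dicto reading is true at r and the de re reading is false there. Note that the interpretation of a predicate is a set of tuples whose last member is a world, exactly as in the definition of an SQML-model.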
9.4 Countermodels and validity proofs in SQML
We’ll again be interested in coming up with countermodels for invalid formulas,
and validity proofs for valid ones. We can use the same pictorial method for
constructing countermodels, asterisks and all, with a few changes. First, there’s
no need for the arrows between worlds, since we’ve dropped the accessibility
relation, thereby making every world accessible to every other. Secondly, we
have predicates and names for atomics instead of sentence letters, so how to
account for this? Let’s look at an example: finding a countermodel for the
formula (◇Fa∧◇Ga)→◇(Fa∧Ga).
We begin as follows:
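[Diagram, world r: the whole formula (◇Fa∧◇Ga)→◇(Fa∧Ga) is assigned 0. In the
antecedent, ◇Fa, ◇Ga and their conjunction are each assigned 1, with an understar
beneath each ◇. The consequent ◇(Fa∧Ga) is assigned 0, with an overstar above its ◇.]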
The understars make us create two new worlds:
[Diagram: world r as before, plus two new worlds. In world a, Fa is assigned 1.
In world b, Ga is assigned 1.]
We must then discharge the overstar from the false diamond in each world
(since every world is accessible to every other world in our models):
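[Diagram: world r as before, now with Fa∧Ga assigned 0 there, Fa being assigned 0
(marked †). In world a, Fa is assigned 1 and Fa∧Ga is assigned 0. In world b, Ga is
assigned 1 and Fa∧Ga is assigned 0.]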
(I had to make either Fa or Ga false in r; I chose Fa arbitrarily.) Now, we’ve
indicated the truth values that we want the atomics to have. How do we make
the atomics have the truth values we want in the picture?
We do this by introducing a domain for the model, and stipulating what the
names refer to and what objects are in the extensions of the predicates. Let’s
use letters like ‘u’ and ‘v’ as the members of the domain in our models. Now, if
we let the name ‘a’ refer to (the letter) u, and let the extension of F in world r
be {} (the empty set), then the truth value of ‘Fa’ in world r will be 0 (false),
since the denotation of a isn’t in the extension of F at world r. Likewise, we
need to put u in the extension of F (but not in the extension of G) in world
a, and put u in the extension of G (but not in the extension of F) in world b.
This all may be indicated on the diagram as follows:
[Diagram: as before, with the referent of the name shown at the top (a: u) and the
extensions shown within each world. World r: F: {}. World a: F: {u}, G: {}.
World b: F: {}, G: {u}.]
Within each world I’ve included a specification of the extension of each predicate.
But the specification of the referent of the name ‘a’ does not go within
any world; it was rather indicated (in boldface) at the top of the model. This is
because names, unlike predicates, get assigned semantic values absolutely in a
model, not relative to worlds.
Time for the official model:
T = {r, a, b}
D = {u}
I(a) = u
I(F) = {〈u, a〉}
I(G) = {〈u, b〉}
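The official model can be checked against the truth conditions directly. The sketch below assumes a tuple encoding of formulas and a small recursive evaluator V of my own devising; only the model itself (worlds r, a, b, the single object u, and the extensions of F and G) comes from the text.

```python
def V(f, w, g, T, D, I):
    """SQML truth value of formula f (nested tuples) at world w under assignment g."""
    op = f[0]
    den = lambda t: g[t] if t in g else I[t]        # variables via g, names via I
    ev = lambda h, v=w, k=g: V(h, v, k, T, D, I)
    if op == "pred":                                # ("pred", "F", t1, ..., tn)
        return tuple(den(t) for t in f[2:]) + (w,) in I[f[1]]
    if op == "and":
        return ev(f[1]) and ev(f[2])
    if op == "imp":
        return (not ev(f[1])) or ev(f[2])
    if op == "dia":                                 # true somewhere in T
        return any(ev(f[1], v=v) for v in T)
    raise ValueError(f"unknown operator {op!r}")

# The official model. The key "a" in I is the name a; the string "a" in T is
# the world a. They never get mixed up because I is only consulted for terms.
T = ["r", "a", "b"]
D = ["u"]
I = {"a": "u", "F": {("u", "a")}, "G": {("u", "b")}}

Fa, Ga = ("pred", "F", "a"), ("pred", "G", "a")
formula = ("imp", ("and", ("dia", Fa), ("dia", Ga)), ("dia", ("and", Fa, Ga)))

value_at_r = V(formula, "r", {}, T, D, I)
```

The formula comes out false at r, as a countermodel requires: Fa is true at world a, Ga is true at world b, but no world makes both true.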
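Let’s try one with quantifiers: □∃xFx→∃x□Fx:
[Diagram, world r: □∃xFx→∃x□Fx is assigned 0. The antecedent □∃xFx is assigned 1
(overstar above the □), its embedded ∃xFx is assigned 1 (underplus), and the
consequent ∃x□Fx is assigned 0 (overplus).]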
The overstar above the □ in the antecedent must be discharged in r itself, since,
remember, every world sees every world in these models. That gives us a true
existential. Now, a true existential is a bit like a true ◇: the true ∃xFx means
that there must be some object u from the domain that’s in the extension of F
in r. I’ll put a + under true ∃s and false ∀s, to indicate a commitment to some
instance of some sort or other. Analogously, I’ll indicate a commitment to all
instances of a given type (which would arise from a true ∀ or a false ∃) with a +
above the connective in question.
OK, how do we make ∃xFx true in r? By making “Fx” true for some
value of x. Let’s put the letter u in the domain, and make “Fx” true when u is
assigned to x. We’ll indicate this by putting a 1 overtop of “F(u/x)” in the diagram.
Now, “F(u/x)” isn’t a formula of our language; what it indicates is that “Fx” is to
be true when u is assigned to x. And to make this come true, we treat it as an
atomic: we put u in the extension of F at r:
[Diagram, world r: as before, with Fx (u assigned to x) assigned 1, and the
extension F: {u}.]
Good. Now we’ve got to attend to the overplus, the + sign overtop the false
∃x□Fx. Since it’s a false ∃, we’ve got to make □Fx false for every object in the
domain (otherwise, if there were something in the domain for which □Fx was
true, ∃x□Fx would be true after all). So far, we’ve got only one object in our
domain, u, so we’ve got to make □Fx false, when u is assigned to the variable
‘x’. We’ll indicate this on the diagram by putting a 0 overtop of “□F(u/x)”:
[Diagram, world r: as before, with □Fx (u assigned to x) assigned 0, and an
understar beneath its □; F: {u}.]
Ok, now we have an understar, which means we should add a new world to
our model. When doing so, we’ll need to discharge the overstar from the
antecedent. We get:
[Diagram: world r as before. New world a: Fx (u assigned to x) is assigned 0,
∃xFx is assigned 1 (underplus), and Fx (v assigned to x) is assigned 1; F: {v}.]
This move requires some explanation. Why the v? Well, I was required to
make Fx false, with u assigned to x. Well, that means keeping u out of the
extension of F at a. Easy enough, right? Just make F’s extension {}? Well,
no: because of the true □ in r, I’ve got to make ∃xFx true in a. But that means
that something’s got to be in F’s extension in a! It can’t be u, so I’ll add a new
object, v, to the domain, and put it in F’s extension in a.
But adding v to the domain of the model adds a complication. We had
an overplus in r, over the false ∃. That meant that, in r, for every member of
the domain, □Fx is false. So, □Fx is false in r when v is assigned to x. That
creates another understar, requiring the creation of a new world. The model
then looks as follows:
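[Diagram: world r as before, now with □Fx (v assigned to x) also assigned 0, and an
understar beneath its □; F: {u}. World a as before; F: {v}. New world b: Fx (v
assigned to x) is assigned 0, ∃xFx is assigned 1 (underplus), and Fx (u assigned
to x) is assigned 1; F: {u}.]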
(Notice that we needn’t have made another world b—we could simply have
discharged the understar on r.)
OK, here’s the official model:
T = {r, a, b}
D = {u, v}
I(F) = {〈u, r〉, 〈u, b〉, 〈v, a〉}
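Again the official model can be tested mechanically. This is a sketch under the same assumptions as before: formulas are encoded as nested tuples, and V is an illustrative evaluator of mine rather than anything defined in the text. Only the model is taken from above.

```python
def V(f, w, g, T, D, I):
    """SQML truth value of formula f (nested tuples) at world w under assignment g."""
    op = f[0]
    den = lambda t: g[t] if t in g else I[t]        # variables via g, names via I
    ev = lambda h, v=w, k=g: V(h, v, k, T, D, I)
    if op == "pred":                                # ("pred", "F", t1, ..., tn)
        return tuple(den(t) for t in f[2:]) + (w,) in I[f[1]]
    if op == "imp":
        return (not ev(f[1])) or ev(f[2])
    if op == "some":                                # ("some", "x", body)
        return any(ev(f[2], k={**g, f[1]: u}) for u in D)
    if op == "box":                                 # true everywhere in T
        return all(ev(f[1], v=v) for v in T)
    raise ValueError(f"unknown operator {op!r}")

# The official model for □∃xFx→∃x□Fx.
T = ["r", "a", "b"]
D = ["u", "v"]
I = {"F": {("u", "r"), ("u", "b"), ("v", "a")}}

Fx = ("pred", "F", "x")
antecedent = ("box", ("some", "x", Fx))             # □∃xFx
consequent = ("some", "x", ("box", Fx))             # ∃x□Fx
formula = ("imp", antecedent, consequent)

value_at_r = V(formula, "r", {}, T, D, I)
```

Every world has something in the extension of F, so the antecedent is true; but neither u nor v is F in every world, so the consequent is false and the conditional is false at r.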
Another example: I’ll do a validity proof for the following formula:
◇∃x(x=a∧□Fx)→Fa:
i) suppose for reductio that (for some model, some world r, and
   some variable assignment g,) V_g(◇∃x(x=a∧□Fx)→Fa, r) = 0.
   Thus V_g(◇∃x(x=a∧□Fx), r) = 1 and …
ii) …V_g(Fa, r) = 0
iii) From i), for some w ∈ T, V_g(∃x(x=a∧□Fx), w) = 1
iv) so for some u ∈ D, V_{g(u/x)}(x=a∧□Fx, w) = 1
v) Thus, V_{g(u/x)}(x=a, w) = 1 and …
vi) …V_{g(u/x)}(□Fx, w) = 1
vii) from vi), V_{g(u/x)}(Fx, r) = 1
viii) Thus, 〈[x]_{g(u/x)}, r〉 ∈ I(F); that is, 〈u, r〉 ∈ I(F)
ix) from v), [x]_{g(u/x)} = [a]_{g(u/x)}
x) By the definition of denotation plus facts about variable
   assignments, u = I(a)
xi) By viii) and x), 〈I(a), r〉 ∈ I(F)
xii) Thus, V_g(Fa, r) = 1. Contradicts line ii)
Notice that in line xii) I inferred that V_g assigned “Fa” truth at r. I could have
subscripted V with any variable assignment, since the truth condition for the
formula “Fa” is the same, regardless of the variable assignment; I picked g
because that’s what I needed to get the contradiction.
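A validity proof covers every model at once, which no program can do by enumeration. Still, a brute-force search over very small models is a useful sanity check before attempting a proof. The sketch below is an illustration under my own assumptions: formulas are nested tuples, the language has just the name a and the one-place predicate F, and the helpers V, powerset and countermodels are hypothetical names. Finding no countermodel among small models is evidence of validity, not a proof of it.

```python
from itertools import chain, combinations, product

def V(f, w, g, T, D, I):
    """SQML truth value of formula f (nested tuples) at world w under assignment g."""
    op = f[0]
    den = lambda t: g[t] if t in g else I[t]        # variables via g, names via I
    ev = lambda h, v=w, k=g: V(h, v, k, T, D, I)
    if op == "pred":
        return tuple(den(t) for t in f[2:]) + (w,) in I[f[1]]
    if op == "eq":
        return den(f[1]) == den(f[2])
    if op == "and":
        return ev(f[1]) and ev(f[2])
    if op == "imp":
        return (not ev(f[1])) or ev(f[2])
    if op == "some":
        return any(ev(f[2], k={**g, f[1]: u}) for u in D)
    if op == "box":
        return all(ev(f[1], v=v) for v in T)
    if op == "dia":
        return any(ev(f[1], v=v) for v in T)
    raise ValueError(f"unknown operator {op!r}")

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))

def countermodels(formula, max_worlds=2, max_objects=2):
    """Yield (T, D, I, w) for every small model and world where formula is false."""
    for nw, nd in product(range(1, max_worlds + 1), range(1, max_objects + 1)):
        T = [f"w{i}" for i in range(nw)]
        D = [f"u{i}" for i in range(nd)]
        for ext in powerset(product(D, T)):         # every extension for F
            for referent in D:                      # every denotation for a
                I = {"a": referent, "F": set(ext)}
                for w in T:
                    if not V(formula, w, {}, T, D, I):
                        yield T, D, I, w

Fa, Fx = ("pred", "F", "a"), ("pred", "F", "x")
proved = ("imp", ("dia", ("some", "x", ("and", ("eq", "x", "a"), ("box", Fx)))), Fa)
contrast = ("imp", ("dia", Fa), Fa)                 # ◇Fa→Fa, for comparison

first_against_proved = next(countermodels(proved), None)
first_against_contrast = next(countermodels(contrast), None)
```

The search finds nothing against ◇∃x(x=a∧□Fx)→Fa, in line with the proof, whereas it does find a two-world countermodel to the comparison formula ◇Fa→Fa.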
9.5 Philosophical questions about SQML
Our semantics for quantified modal logic faces philosophical challenges. In
each case we will be able to locate a particular feature of our semantics that
gives rise to the alleged problem. In response, one can stick with the simple
semantics and give it a philosophical defense, or one can revise the semantics.
9.5.1 The necessity of identity
Let’s try to come up with a countermodel for the following formula:
∀x∀y(x=y→□(x=y))
When we try to make the formula false by putting a 0 over the initial ∀, we get
an underplus. So we’ve got to make the inside part, ∀y(x=y→□x=y), false
for some value of x. We do this by putting some object u in the domain, and
letting that be the value of x for which ∀y(x=y→□x=y) is false. We get:
[Diagram, world r: ∀x∀y(x=y→□x=y) is assigned 0 (underplus), and
∀y(x=y→□(x=y)), with u assigned to x, is assigned 0 (underplus).]
Now we need to do the same thing for our new false universal: ∀y(x=y→□x=y).
For some value of y, the inside conditional has to be false. But that means that
the antecedent must be true. So the value for y has to be u again. We get:
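[Diagram, world r: as before, plus the instance x=y→□(x=y) with u assigned to both
x and y. Its antecedent x=y is assigned 1, the conditional is assigned 0, and its
consequent □(x=y) is assigned 0 (understar beneath the □).]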
The understar now requires creation of a world in which x=y is false, when
both x and y are assigned u. But there cannot be any such world! An identity
sentence is true (at any world) if the denotations of the terms are identical.
Our attempt to find a countermodel has failed; we must do a validity proof.
Consider any QML model 〈T, D, I〉, any r ∈ T, and any variable assignment
g; we’ll show that V_g(∀x∀y(x=y→□x=y), r) = 1:
i) suppose for reductio that V_g(∀x∀y(x=y→□x=y), r) = 0.
ii) Then, for some u ∈ D, V_{g(u/x)}(∀y(x=y→□x=y), r) = 0
iii) So, for some v ∈ D, V_{g(u/x v/y)}(x=y→□(x=y), r) = 0.
iv) Thus, V_{g(u/x v/y)}(x=y, r) = 1, and …
v) …V_{g(u/x v/y)}(□(x=y), r) = 0
vi) from iv), [x]_{g(u/x v/y)} = [y]_{g(u/x v/y)}
vii) From v), at some world, w, V_{g(u/x v/y)}(x=y, w) = 0
viii) And so, [x]_{g(u/x v/y)} ≠ [y]_{g(u/x v/y)}. Contradicts vi).
Notice at the end how the particular world at which the identity sentence was
false didn’t matter. The truth condition for an identity sentence is simply that
the terms denote the same thing; it doesn’t matter what world this is evaluated
relative to.²
² A note about variables. In validity proofs, I’m using italicized ‘u’ and ‘v’ as variables to
range over objects in the domain of the model I’m considering. So, a sentence like ‘u = v’
might be true, just as the sentence ‘x=y’ of our object language can be true. But when I’m
doing countermodels, I’m using the roman letters ‘u’ and ‘v’ as themselves being members of
the domain, not as variables ranging over members of the domain. Since the letters ‘u’ and ‘v’
are different letters, they are different members of the domain. Thus, in a countermodel with
letters in the domain, if the denotation of a name ‘a’ is the letter ‘u’, and the denotation of the
name ‘b’ is the letter ‘v’, then the sentence ‘a=b’ has got to be false, since ‘u’ ≠ ‘v’. If I were
using ‘u’ and ‘v’ as variables ranging over members of the domain, then the sentence ‘u = v’
might be true! This just goes to show that it’s important to distinguish between the following
two sentences:
u = v
‘u’ = ‘v’
The first could be true, depending on what ‘u’ and ‘v’ currently refer to, but the second one is
just plain false, since ‘u’ and ‘v’ are different letters.
We can think of ∀x∀y(x=y→□(x=y)) as expressing “the necessity of identity”:
it says that whenever identity holds between objects, it necessarily holds.
The necessity of identity is philosophically controversial. On the one hand
it can seem obviously correct. “x=y” says that x and y are one and the same
thing. Now, if there were a world in which x was different from y, since x and
y are the same thing, this would have to be a world in which x was different
from x. How could that be? On the other hand, it was a great discovery that
Hesperus = Phosphorus. Surely, it could have turned out the other way; surely,
Hesperus might not have turned out identical to Phosphorus! But isn’t this
a counterexample to this formula? For a discussion of this example, see Saul
Kripke’s book Naming and Necessity.
It’s worth noting why the necessity of identity turns out valid, given our
semantics. It turns out valid because of the way we defined variable assignments:
our variable assignments assign members of the domain to variables absolutely,
rather than relative to worlds. (Similarly: since the interpretation function I,
according to our definition above, assigns referents to names absolutely, rather
than relative to worlds, the formula a=b→□a=b turns out valid.) One could,
instead, define variable assignments as functions that assign members of the
domain to variables relative to worlds. Given appropriate adjustments to the
definition of the valuation function, this would have the effect of invalidating
the necessity of identity.³ (Similarly, one could make I assign denotations to
names relative to worlds, thus invalidating a=b→□a=b.)
³ See Gibbard (1975).
9.5.2 The necessity of existence
Another (in)famous valid formula of SQML is the “Barcan Formula”, named
after Ruth Barcan Marcus: ∀x□Fx→□∀xFx. (Call the schema ∀α□φ→
□∀αφ the “Barcan schema”.) If we try to find a countermodel for this formula
we get to the following stage:
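[Diagram. World r: ∀x□Fx→□∀xFx is assigned 0, with ∀x□Fx assigned 1 (overplus)
and □∀xFx assigned 0 (understar). World a: ∀xFx is assigned 0 (underplus) and
Fx (u assigned to x) is assigned 0; F: {}.]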
When you have a choice between discharging overthings and underthings,
whether plusses or stars, always do the underthings first. In this case, this
means discharging the understar and ignoring the overplus for the moment.
So, discharging the understar gave us world a, in which we made a universal
false. This gave an underplus, and forced us to make an instance false. So I put
object u in our domain, and keep it out of the extension of F in a. This makes
Fx false in a, when x is assigned u.
But now, I need to discharge the overplus in r. I must make □Fx true for
every member of the domain, including u, which is now in the domain. But
then this requires Fx to be true, when u is assigned to x, in a:
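[Diagram. World r: as before, plus □Fx (u assigned to x) assigned 1 (overstar) and
Fx (u assigned to x) assigned 1; F: {u}. World a: ∀xFx is assigned 0 (underplus),
and Fx (u assigned to x) is assigned both 0 and 1; F: {?}.]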
So, we fail to get a model. Time for a validity proof; let’s show that every
instance of the Barcan Schema is valid:
i) suppose for reductio that V_g(∀α□φ→□∀αφ, r) = 0. Then
   V_g(∀α□φ, r) = 1 and …
ii) …V_g(□∀αφ, r) = 0.
iii) from ii), for some w, V_g(∀αφ, w) = 0
iv) so, for some u in the domain, V_{g(u/α)}(φ, w) = 0
v) from i), for every member of the domain, and so for u in
   particular, V_{g(u/α)}(□φ, r) = 1.
vi) thus, for every world, and so for w in particular, V_{g(u/α)}(φ, w) = 1.
   Contradicts iv).
The validity of the Barcan formula in our semantics is infamous because the
Barcan formula seems, intuitively, to be invalid. To see why, we need to think a
bit about the intuitive significance of the relative order of quantifiers and modal
operators. Consider the difference between the following two sentences:
◇∃xFx
∃x◇Fx
In general, a sentence of the form ◇φ says that it’s possible for the component
sentence, φ, to be true. So the first of our two sentences, ◇∃xFx, says that
it’s possible for “∃xFx” to be true. That is: it’s possible for there to exist
an F. What about the second sentence? In general, a sentence that begins
without a modal operator in front makes a statement about the actual world.
Thus, a statement that begins with “∃x ...” is saying that there exists, in the
actual world, an object x, such that…. Our second statement, then, says that
there actually exists an object, x, that is possibly F. We have seen that our two
statements have the following meanings:
◇∃xFx “it’s possible for there to exist an F”
∃x◇Fx “there actually exists an object, x, that is possibly F”
It matters, therefore, whether the ∃ comes after or before the ◇. If the ∃ comes
first, then the statement is saying that there actually exists a certain sort of
object (namely, an object that could have been a certain way). But if it comes
second, after the ◇, then the statement is merely saying that there could have
existed a certain sort of object. There is a similar contrast with the following
two statements:
□∀xFx
∀x□Fx
The first says that it’s necessary that: everything is F. That is, in every possible
world, every object that exists in that world is F in that world. The objects
ranged over by the ∀, so to speak, are drawn from the worlds the □ introduces,
because the ∀ occurs inside the scope of the □. The second statement, by
contrast, says that: every actual object is necessarily F. That is, every object that
exists in the actual world is F in every possible world. The second statement
concerns just actually existing objects because the ∀ occurs in the front of the
formula, not inside the scope of the □.
With all this in mind, return to the Barcan formula, ∀x□Fx→□∀xFx. It
says:
“If every actually existing thing is F in every possible world, then
in every world, every object in that world is F in that world”
Now we can see why this claim is questionable. Even if every actual thing is
necessarily F, there could still be worlds containing non-F things, so long as
those non-F things don’t exist in the actual world. Suppose, for instance, that
every object in the actual world is necessarily a material object. Then, letting
F stand for “is a material object”, ∀x□Fx is true. Nevertheless, □∀xFx seems
false: it would presumably be possible for there to exist an immaterial object,
a ghost, say. Possible worlds containing ghosts would simply need to contain
objects that do not exist in the actual world (since all the objects in the actual
world are necessarily material).
This objection to the validity of the Barcan formula is obviously based on
the idea that what objects exist can vary from possible world to possible world.
But this sort of variation is not represented in the SQML definition of a model.
Each such model contains a single domain, $\mathcal{D}$, rather than different domains
for different possible worlds. The truth condition we specified for a quantified
sentence ∀αφ, at a world w, was simply that φ is true at w of every member of
$\mathcal{D}$: the quantifier ranges over the same domain, regardless of which possible
world is being described. That is why the Barcan formula turns out valid under
our definition.
This feature of SQML-models is problematic for an even more direct reason:
the sentence ∀x□∃y y=x, i.e., “everything necessarily exists”, turns out valid:
i) Suppose for reductio that $V_g(\forall x\Box\exists y\, y = x,\ w) = 0$.
ii) Then $V_{g(u/x)}(\Box\exists y\, y = x,\ w) = 0$, for some $u \in \mathcal{D}$.
iii) So $V_{g(u/x)}(\exists y\, y = x,\ w') = 0$, for some $w' \in \mathcal{W}$.
iv) So $V_{g(u/x\ u'/y)}(y = x,\ w') = 0$, for every $u' \in \mathcal{D}$.
v) So, since $u \in \mathcal{D}$, we have $V_{g(u/x\ u/y)}(y = x,\ w') = 0$. But that can’t
be, given the clause for ‘=’ in the definition of the valuation
function.
It’s clear that this formula turns out valid for the same reason that the Barcan
formula turns out valid: SQML models have a single domain common to each
possible world.
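The role of the single domain can be spot-checked mechanically. What follows is only a minimal Python sketch, not part of the formal development: the tuple encoding of formulas and the names `ev` and `holds_in_all_small_models` are assumptions made for illustration. It evaluates formulas by the SQML clauses (quantifiers range over one domain, the box over all worlds) in every model with two worlds, two objects, and one predicate F. Surviving such a finite search is of course no proof of validity, though a single failure does establish invalidity.

```python
from itertools import combinations, product

def ev(f, w, g, W, D, I):
    """Truth value of formula f at world w under assignment g.

    SQML clauses: quantifiers range over the single domain D at every
    world, and the box quantifies over every world in W.
    """
    op = f[0]
    if op == "pred":                      # ("pred", "F", "x")
        return (g[f[2]], w) in I[f[1]]
    if op == "eq":                        # ("eq", "y", "x")
        return g[f[1]] == g[f[2]]
    if op == "imp":
        return (not ev(f[1], w, g, W, D, I)) or ev(f[2], w, g, W, D, I)
    if op == "all":                       # ("all", "x", body)
        return all(ev(f[2], w, {**g, f[1]: u}, W, D, I) for u in D)
    if op == "ex":
        return any(ev(f[2], w, {**g, f[1]: u}, W, D, I) for u in D)
    if op == "box":
        return all(ev(f[1], v, g, W, D, I) for v in W)
    raise ValueError(f"unknown operator: {op}")

def holds_in_all_small_models(f):
    """True iff f is true at every world of every model with two worlds,
    two objects, and any extension for the one-place predicate F."""
    W, D = ("r", "a"), ("u", "v")
    pairs = list(product(D, W))
    for k in range(len(pairs) + 1):
        for ext in combinations(pairs, k):
            I = {"F": set(ext)}
            if not all(ev(f, w, {}, W, D, I) for w in W):
                return False
    return True

Fx = ("pred", "F", "x")
barcan = ("imp", ("all", "x", ("box", Fx)), ("box", ("all", "x", Fx)))
necessary_existence = ("all", "x", ("box", ("ex", "y", ("eq", "y", "x"))))
box_exists_to_exists_box = ("imp", ("box", ("ex", "x", Fx)), ("ex", "x", ("box", Fx)))

print(holds_in_all_small_models(barcan))                    # True
print(holds_in_all_small_models(necessary_existence))       # True
print(holds_in_all_small_models(box_exists_to_exists_box))  # False
```

The last line shows that the search is not trivially permissive: □∃xFx→∃x□Fx fails in one of these small models, for instance the one in which u is F only at r and v is F only at a.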
The Barcan schema is just one of a number of interesting schemas concerning
how quantifiers and modal operators interact (for each schema I also list an
equivalent schema with ◇ in place of □):
∀α□φ→□∀αφ    ◇∃αφ→∃α◇φ    (Barcan)
□∀αφ→∀α□φ    ∃α◇φ→◇∃αφ    (Converse Barcan)
□∃αφ→∃α□φ    ∀α◇φ→◇∀αφ
∃α□φ→□∃αφ    ◇∀αφ→∀α◇φ
We have already discussed the Barcan schema. The third schema,
□∃αφ→∃α□φ, raises no philosophical problems for SQML, since, quite properly,
it has instances that turn out invalid. As we saw above, there are SQML models
in which □∃xFx→∃x□Fx is false. Let’s look at the other two schemas.
First, the converse Barcan schema. Like the Barcan schema, each of its
instances is valid given the SQML semantics (I’ll leave this to the reader to
demonstrate), and like the Barcan schema, this verdict faces a philosophical
challenge. The antecedent says that in every world, everything that exists in
that world is φ. Existents are thus always φ. It might still be that some object
isn’t necessarily φ: perhaps some object that is φ in every world in which it
exists, fails to be φ in worlds in which it doesn’t exist. This talk of an object
being φ in a world in which it doesn’t exist may seem strange, but consider
the following instance of the converse Barcan schema, substituting “∃y y=x”
(think: “x exists”) for φ:
□∀x∃y y=x→∀x□∃y y=x
This formula seems to be false. Its antecedent is clearly true; but its consequent
says that every object in the actual world exists necessarily, and hence seems
intuitively to be false.
Each instance of the fourth schema, ∃α□φ→□∃αφ, is also validated by
the SQML semantics (again, an exercise for the reader); and again, this is
philosophically questionable. Let’s suppose that physical objects are necessarily
physical. Then, ∃x□Px seems true, letting P mean ‘is physical’. But □∃xPx
seems false: it seems possible that there are no physical objects. This
counterexample requires that there be worlds with fewer objects than those that
actually exist, whereas the counterexample to the Barcan formula involved the
possibility that there be more objects than those that actually exist.
9.5.3 Necessary existence defended
There are various ways to respond to the challenge of the previous section.
From a logical point of view, the simplest is to stick to one’s guns and defend
the SQML semantics. SQML-models accurately model the modal facts. The
Barcan formula, the converse Barcan formula, the fourth schema, and the
statement that everything necessarily exists are all logical truths; the philosophical
objections are mistaken. Contrary to appearances it is not contingent what
exists. Each possible world has exactly the same stock of individuals. Call this
the doctrine of Constancy.
One could uphold Constancy either by taking a narrow view of what is
possible, or by taking a broad view of what exists. On the former alternative,
one would claim that it is just not possible for there to be any ghosts, and that
it is just not possible in any sense for an actual object to have failed to exist. On
the latter alternative, which I’ll be discussing for the rest of this section, one
accepts the possibility of ghosts, dragons, and so on, but claims that possible
ghosts and dragons exist in the actual world.
Think of the objects in $\mathcal{D}$ as being all of the possible objects. In addition to
normal things (what one would normally think of as the actually existing entities:
people, tables and chairs, planets and electrons, and so on), our defender of
Constancy claims that there also exist objects that, in other possible worlds, are
ghosts, golden mountains, talking donkeys, and so forth; and these are included
in $\mathcal{D}$ as well. Call these further objects “merely possible things” (but don’t be
misled by this label; the claim is that merely possible things actually exist). The
formula “∀xFx” means that every possible object is F in the actual world. It’s
not enough for the normal things to be F, for the normal things are not all
of the things that there are. There are also all the merely possible things, and
each of these must be F as well (must be F here in the actual world, that is), in
order for ∀xFx to be true. Hence, the objection to the Barcan formula from
the previous section fails. That objection assumed that ∀x□Fx, the antecedent
of (an instance of) the Barcan formula, was true, when F symbolizes “is a
material object”. But this neglects the merely possible things. It’s true that all
the normal objects are necessarily material objects, but there are some further
things, merely possible things, that are not necessarily material objects.
Further: in ordinary language, when we say “everything” or “something”,
we typically don’t mean to be talking about all possible objects; we’re typically
talking about just the normal things. Otherwise we would be speaking falsely
when we say, for example, “everything has mass”: merely possible unicorns
presumably have no mass (nor any spatial location, nor any other physical
feature). Ordinary quantification is restricted to normal things. So if we want
to translate an ordinary claim into the language of QML, we must introduce
a predicate for the normal things, “N”, and use it to restrict quantifiers. But
now, consider the following ordinary English statement:
If everything is necessarily a material object, then necessarily:
everything is a material object
If we mindlessly translate this into the language of QML, we would get
∀x□Fx→□∀xFx, an instance of the Barcan schema. But since in everyday
usage, quantifiers are restricted to normal things, the thought in the mind
of an ordinary speaker who utters this sentence is more likely the following:
∀x(Nx→□Fx)→□∀x(Nx→Fx)
which says:
If every normal thing is necessarily a material object, then
necessarily: every normal thing is a material object.
And this formula is not an instance of the Barcan schema, nor is it valid, as may
be shown by the following countermodel:
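World r:  N: {}
    ∀x(Nx→□Fx)→□∀x(Nx→Fx) is 0 here: the antecedent ∀x(Nx→□Fx) is 1
    (with u assigned to x, Nx is 0 and □Fx is 0, so Nx→□Fx is 1), and the
    consequent □∀x(Nx→Fx) is 0
World a:  N: {u}   F: {}
    ∀x(Nx→Fx) is 0 here (with u assigned to x, Nx is 1 and Fx is 0)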
So in a sense, the ordinary intuitions that were alleged to undermine the Barcan
schema are in fact consistent with Constancy.
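The countermodel can be checked by direct computation. The Python sketch below is only an illustration under stated assumptions: the set encoding and function names are invented here, and F is taken to be empty at world r as well as at world a, which the diagram leaves unstated but which its truth values permit. Because this is an SQML model, the quantifiers range over the whole domain and the box over both worlds.

```python
# Countermodel from the text: worlds r and a, one object u.
W = ["r", "a"]
D = ["u"]
N = {("u", "a")}   # u is "normal" only at a
F = set()          # assumption: nothing is F at either world

def restricted_antecedent(w):
    """∀x(Nx→□Fx) at w: every object that is N at w is F at every world."""
    return all((x, w) not in N or all((x, v) in F for v in W) for x in D)

def restricted_consequent(w):
    """□∀x(Nx→Fx): at every world, whatever is N there is F there.
    The argument w is unused because the SQML box ranges over all worlds."""
    return all((x, v) not in N or (x, v) in F for v in W for x in D)

def restricted_formula(w):
    return (not restricted_antecedent(w)) or restricted_consequent(w)

def barcan_instance(w):
    """∀x□Fx→□∀xFx, for comparison with the restricted formula."""
    antecedent = all((x, v) in F for x in D for v in W)
    consequent = all((x, v) in F for v in W for x in D)
    return (not antecedent) or consequent

print(restricted_antecedent("r"))  # True: nothing is N at r
print(restricted_consequent("r"))  # False: at a, u is N but not F
print(restricted_formula("r"))     # False
print(barcan_instance("r"))        # True
```

The unrestricted Barcan instance remains true in the same model, as it must under the SQML semantics; only the restricted formula fails.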
The defender of Constancy can defend the converse Barcan schema and the
fourth schema in similar fashion. The objection to the converse Barcan schema
assumed the falsity of ∀x□∃y y=x. “Sheer prejudice!”, according to the friend
of Constancy. “And recall further that an ordinary utterance of ‘Everything exists
necessarily’ expresses, not ∀x□∃y y=x, but rather ∀x(Nx→□∃y(Ny∧y=x))
(N for ‘normal’), the falsity of which is perfectly compatible with Constancy.
It’s possible to fail to be normal; all that’s impossible is to utterly fail to exist.
Likewise for the fourth schema.”
This defense of SQML is hard to take. Let “G” stand for a kind of object
that, in fact, has no members, but which could have had members. Perhaps ghost
is such a kind. ∀x□∼Gx→□∀x∼Gx is an instance of the Barcan schema, and
so true according to the defender of Constancy. Since there could have existed
ghosts, the consequent of this conditional is false. Therefore, its antecedent
∀x□∼Gx must be false. That is, there exists something that could have been a
ghost. But this is a very surprising result. The alleged possible ghost couldn’t
be any material object, presumably, assuming it would be impossible for any
material object to be a ghost. The defender of Constancy, then, is committed to
the existence of objects which we wouldn’t otherwise have dreamed of accepting:
things that could have been ghosts, things that could have been dragons, things
that could have been gods, and so on.
The defender of Constancy might try to defend this conclusion by reminding
us that these “possible-ghosts”, “possible-dragons”, and so on, are not
normal objects. They aren’t in space and time, presumably, which explains why
no one has ever seen, heard, felt, or smelled one. She might even say that they
are nonactual, or even that they do not exist (though they are). We are quite
correct, she might say, to scoff at the idea that some normal/actual/existing objects
are capable of being ghosts; but what’s the big deal about saying that some
non-normal/nonactual/nonexisting objects have these capabilities? This move,
too, will be considered philosophically suspect by many. Many philosophers
regard the idea that there are some nonexistent things, or some nonactual
things, as being anywhere from obviously false to conceptually incoherent, or
subversive, or worse.⁴ And how does it help to point out that the objects aren’t
normal? The postulation of non-normal objects (objects above and beyond
the objects that the rest of us believe in) was exactly what I was claiming is
philosophically suspect!
⁴ See Quine (1948); Lycan (1979).
On the other hand, Constancy’s defenders can point to certain powerful
arguments in its favor. Here’s a quick sketch of one such argument. First, the
following seems to be a logical truth:
Ted = Ted
But it follows from this that:
∃y y = Ted
This latter formula, too, is therefore a logical truth. But if φ is a logical truth
then so is □φ (recall the rule of necessitation from chapter 6). So we may infer
that the following is a logical truth:
□∃y y = Ted
Next, notice that nothing in the argument for □∃y y = Ted depended on any
special features of me. We may therefore conclude that the reasoning holds
good for every object; and so ∀x□∃y y = x is indeed a logical truth. Since,
therefore, every object exists necessarily, it should come as no surprise that
there are things that might have been ghosts, dragons, and so on: for if there
had been a ghost, it would have necessarily existed, and thus must actually exist.
This and other related arguments have apparently wild conclusions, but they
cannot be lightly dismissed, for it is no mean feat to say exactly where they go
wrong (if they go wrong at all!).⁵
9.6 Variable domains
We now consider a way of dealing with the problems discussed in section 9.5.2
above that does not require embracing Constancy.
SQML models contain a single domain, $\mathcal{D}$, over which the quantifiers range
in each possible world. Since it was this feature that led to the problems of
section 9.5.2, let’s introduce a new semantics that instead provides different
domains for different possible worlds. And let’s also reinstate the accessibility
relation, for reasons to be made clear below.⁶ The new semantics is called
VDQML (“variable-domains quantified modal logic”):
⁵ On this topic see Prior (1967, 149–151); Plantinga (1983); Fine (1985); Linsky and Zalta
(1994, 1996); Williamson (1998, 2002).
⁶ More care than I take is needed in converting the earlier definition of validity if one is
worried about the validity of formulas with free variables. See ?, p. 275.
A VDQML-model is an ordered 5-tuple $\langle \mathcal{W}, \mathcal{R}, \mathcal{D}, \mathcal{Q}, \mathcal{I} \rangle$ such that:
i) $\mathcal{W}$ is a nonempty set (“possible worlds”)
ii) $\mathcal{R}$ is a binary relation on $\mathcal{W}$ (“accessibility relation”)
iii) $\mathcal{D}$ is a set (“superdomain”)
iv) $\mathcal{Q}$ is a function that assigns to any $w \in \mathcal{W}$ a nonempty⁷
subset of $\mathcal{D}$. Let us refer to $\mathcal{Q}(w)$ as “$\mathcal{D}_w$”. Think of $\mathcal{D}_w$ as
w’s “subdomain”: the set of objects that exist at w.
v) $\mathcal{I}$ is a function such that: (“interpretation function”)
a) if α is a constant then $\mathcal{I}(\alpha) \in \mathcal{D}$
b) if Π is an n-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered
n+1-tuples $\langle u_1, \dots, u_n, w \rangle$, where $u_1, \dots, u_n$ are members
of $\mathcal{D}$, and $w \in \mathcal{W}$.
Denotation is defined as before. Truth in a model may then be defined thus:
The valuation function $V_{\mathcal{M},g}$, for VDQML-model $\mathcal{M}$ ($= \langle \mathcal{W}, \mathcal{R}, \mathcal{D}, \mathcal{Q}, \mathcal{I} \rangle$)
and variable assignment g, is defined as the function that assigns
either 0 or 1 to each wff relative to each member of $\mathcal{W}$, subject to
the following constraints:
i) for any terms α and β, $V_{\mathcal{M},g}(\alpha = \beta, w) = 1$ iff $[\alpha]_{\mathcal{M},g} = [\beta]_{\mathcal{M},g}$
ii) for any n-place predicate, Π, and any terms $\alpha_1, \dots, \alpha_n$,
$V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n, w) = 1$ iff $\langle [\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g}, w \rangle \in \mathcal{I}(\Pi)$
iii) for any wffs φ and ψ, and any variable, α,
a) $V_{\mathcal{M},g}(\sim\phi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w) = 0$
b) $V_{\mathcal{M},g}(\phi \to \psi, w) = 1$ iff either $V_{\mathcal{M},g}(\phi, w) = 0$ or $V_{\mathcal{M},g}(\psi, w) = 1$
c) $V_{\mathcal{M},g}(\forall\alpha\phi, w) = 1$ iff for each $u \in \mathcal{D}_w$, $V_{\mathcal{M},g_{u/\alpha}}(\phi, w) = 1$
d) $V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for each $v \in \mathcal{W}$, if $\mathcal{R}wv$ then $V_{\mathcal{M},g}(\phi, v) = 1$
⁷ One could drop this assumption. But if subdomains can be empty then □∃x(Fx→Fx) will
be invalid. Since ∃x(Fx→Fx) is valid given the chapter 4 semantics for nonmodal predicate
logic, we would have the odd result of a logical truth whose necessitation isn’t a logical truth.
One could modify the chapter 4 semantics by allowing predicate logic models with empty
domains, thus invalidating ∃x(Fx→Fx). This approach is known as free logic.
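The definitions of validity and semantic consequence remain unchanged. The
obvious derived clauses for ∃ and ◇ are as follows:
$V_{\mathcal{M},g}(\exists\alpha\phi, w) = 1$ iff for some $u \in \mathcal{D}_w$, $V_{\mathcal{M},g_{u/\alpha}}(\phi, w) = 1$
$V_{\mathcal{M},g}(\Diamond\phi, w) = 1$ iff for some $v \in \mathcal{W}$, $\mathcal{R}wv$ and $V_{\mathcal{M},g}(\phi, v) = 1$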
Thus, we have introduced subdomains. We still have $\mathcal{D}$, a set
that contains all of the possible individuals. But for each possible world w,
we introduce a subset of the domain, $\mathcal{D}_w$, to be the domain for w. When
evaluating a quantified sentence at a world w, the quantifier ranges only over
$\mathcal{D}_w$. Notice that we also reinstated the accessibility relation. This isn’t necessary
for introducing subdomains; I did this in order to be able to make a certain
point about subdomains in section 9.6.2.
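The clauses above translate almost line by line into a recursive evaluator. The following Python sketch is offered only as an illustration: the tuple encoding of formulas, the dictionary layout of a model, and the sample model itself are assumptions made here, and only one-place predicates are handled. The two places where VDQML differs from SQML are marked in comments.

```python
def ev(f, w, g, M):
    """Truth value of f at world w under assignment g in a VDQML model M.

    M is a dict with keys "W" (worlds), "R" (set of accessibility pairs),
    "Dw" (world -> subdomain) and "I" (predicate -> set of (object, world)).
    """
    op = f[0]
    if op == "pred":
        return (g[f[2]], w) in M["I"][f[1]]
    if op == "eq":
        return g[f[1]] == g[f[2]]
    if op == "not":
        return not ev(f[1], w, g, M)
    if op == "imp":
        return (not ev(f[1], w, g, M)) or ev(f[2], w, g, M)
    if op == "all":   # quantifiers range over the subdomain of w only
        return all(ev(f[2], w, {**g, f[1]: u}, M) for u in M["Dw"][w])
    if op == "ex":
        return any(ev(f[2], w, {**g, f[1]: u}, M) for u in M["Dw"][w])
    if op == "box":   # only worlds accessible from w matter
        return all(ev(f[1], v, g, M) for v in M["W"] if (w, v) in M["R"])
    if op == "dia":
        return any(ev(f[1], v, g, M) for v in M["W"] if (w, v) in M["R"])
    raise ValueError(f"unknown operator: {op}")

# A hypothetical model: object v exists at w2 but not at w1.
M = {
    "W": ["w1", "w2"],
    "R": {("w1", "w1"), ("w1", "w2"), ("w2", "w1"), ("w2", "w2")},
    "Dw": {"w1": {"u"}, "w2": {"u", "v"}},
    "I": {"F": {("u", "w1"), ("u", "w2")}},
}
Fx = ("pred", "F", "x")
x_exists = ("ex", "y", ("eq", "y", "x"))

print(ev(("all", "x", Fx), "w1", {}, M))                  # True
print(ev(("all", "x", Fx), "w2", {}, M))                  # False
print(ev(("dia", ("ex", "x", ("not", Fx))), "w1", {}, M)) # True
print(ev(x_exists, "w1", {"x": "v"}, M))                  # False
print(ev(x_exists, "w2", {"x": "v"}, M))                  # True
```

The last two lines show the sense in which the subdomain is "the set of objects that exist" at a world: with v assigned to x, ∃y y=x is false at w1 and true at w2.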
9.6.1 Countermodels to the Barcan and related formulas
in VDQML
What is the effect of this new truth definition on the Barcan formula and related
formulas? All of the following formulas come out invalid:
□∀xFx→∀x□Fx (Converse Barcan)    ∃x◇Fx→◇∃xFx
∀x□Fx→□∀xFx (Barcan)             ◇∃xFx→∃x◇Fx
□∃xFx→∃x□Fx                      ∀x◇Fx→◇∀xFx
∃x□Fx→□∃xFx                      ◇∀xFx→∀x◇Fx
The third one on the list was invalid before, and so is still invalid now. For
note: if $\mathcal{M}$ is a SQML model, then we can construct a corresponding VDQML
model with the same set of worlds, (super) domain, and interpretation function,
in which every world is accessible from every other, and in which $\mathcal{Q}$ is a constant
function assigning the whole superdomain to each world. It is intuitively clear
that the same sentences are true in this corresponding model as are true in $\mathcal{M}$.
Hence, whenever a sentence is SQML-invalid, it is VDQML-invalid.
As for the Barcan formula ∀x□Fx→□∀xFx, consider the following VDQML
model:
World r:  $\mathcal{D}_r$: {u}      F: {u}
World a:  $\mathcal{D}_a$: {u, v}   F: {u}
Official model:
$\mathcal{W}$: {r, a}
$\mathcal{R}$: {⟨r, r⟩, ⟨r, a⟩, ⟨a, a⟩}
$\mathcal{D}$: {u, v}
$\mathcal{D}_r$: {u}   $\mathcal{D}_a$: {u, v}
$\mathcal{I}(F)$: {⟨u, r⟩, ⟨u, a⟩}
As for the fourth formula on the list, ∃x□Fx→□∃xFx, we can construct a
countermodel as follows:
World r:  $\mathcal{D}_r$: {u, v}   F: {u}
World a:  $\mathcal{D}_a$: {v}      F: {u}
$\mathcal{W}$: {r, a}
$\mathcal{R}$: {⟨r, r⟩, ⟨r, a⟩, ⟨a, a⟩}
$\mathcal{D}$: {u, v}
$\mathcal{D}_r$: {u, v}   $\mathcal{D}_a$: {v}
$\mathcal{I}(F)$: {⟨u, r⟩, ⟨u, a⟩}
And for the converse Barcan formula, □∀xFx→∀x□Fx, we have the
following countermodel:
World r:  $\mathcal{D}_r$: {u, v}   F: {u, v}
World a:  $\mathcal{D}_a$: {v}      F: {v}
$\mathcal{W}$: {r, a}
$\mathcal{R}$: {⟨r, r⟩, ⟨r, a⟩, ⟨a, a⟩}
$\mathcal{D}$: {u, v}
$\mathcal{D}_r$: {u, v}   $\mathcal{D}_a$: {v}
$\mathcal{I}(F)$: {⟨u, r⟩, ⟨v, r⟩, ⟨v, a⟩}
Note also that ∀x□∃y y=x is false at world r in this VDQML-model.
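The three countermodels can be verified by unfolding the truth conditions directly. The Python sketch below does this; the helper names (`box`, `forall`, `exists`) and the encoding of each model as a pair of a subdomain map and an extension for F are assumptions made for illustration, and the accessibility relation is the one shared by all three official models. Each function returns the truth values of antecedent and consequent at the given world.

```python
# Accessibility relation common to the three countermodels.
R = {("r", "r"), ("r", "a"), ("a", "a")}

def box(w, prop):
    """prop holds at every world accessible from w."""
    return all(prop(v) for (x, v) in R if x == w)

def forall(Dw, w, prop):
    """prop holds of every object in the subdomain of w."""
    return all(prop(u) for u in Dw[w])

def exists(Dw, w, prop):
    return any(prop(u) for u in Dw[w])

def barcan(Dw, F, w):            # ∀x□Fx → □∀xFx
    ante = forall(Dw, w, lambda u: box(w, lambda v: (u, v) in F))
    cons = box(w, lambda v: forall(Dw, v, lambda u: (u, v) in F))
    return ante, cons

def fourth(Dw, F, w):            # ∃x□Fx → □∃xFx
    ante = exists(Dw, w, lambda u: box(w, lambda v: (u, v) in F))
    cons = box(w, lambda v: exists(Dw, v, lambda u: (u, v) in F))
    return ante, cons

def converse_barcan(Dw, F, w):   # □∀xFx → ∀x□Fx
    ante = box(w, lambda v: forall(Dw, v, lambda u: (u, v) in F))
    cons = forall(Dw, w, lambda u: box(w, lambda v: (u, v) in F))
    return ante, cons

def everything_necessarily_exists(Dw, w):   # ∀x□∃y y=x
    return forall(Dw, w, lambda u: box(w, lambda v: exists(Dw, v, lambda y: y == u)))

growing = {"r": {"u"}, "a": {"u", "v"}}
shrinking = {"r": {"u", "v"}, "a": {"v"}}

print(barcan(growing, {("u", "r"), ("u", "a")}, "r"))                        # (True, False)
print(fourth(shrinking, {("u", "r"), ("u", "a")}, "r"))                      # (True, False)
print(converse_barcan(shrinking, {("u", "r"), ("v", "r"), ("v", "a")}, "r")) # (True, False)
print(everything_necessarily_exists(shrinking, "r"))                         # False
```

In each case the antecedent is true and the consequent false at r, so the conditional is false there.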
9.6.2 Expanding, shrinking domains
There are several comments worth making about VDQML-models. First,
note that if we made certain restrictions on variable-domains models, then the
countermodels of the previous section would no longer be legal models. For
example, the first example, the counterexample to the Barcan formula, required
a model in which the domain expanded; world a was accessible from world r,
and had a larger domain. But suppose we made the:
Decreasing domains requirement: if $\mathcal{R}wv$, then $\mathcal{D}_v \subseteq \mathcal{D}_w$
The counterexample would then go away. Indeed, every instance of the Barcan
schema would then become VDQML-valid, which may be proved as follows:
i) suppose for reductio that $V_g(\forall\alpha\Box\phi \to \Box\forall\alpha\phi, w) = 0$. Then
$V_g(\forall\alpha\Box\phi, w) = 1$ and…
ii) …$V_g(\Box\forall\alpha\phi, w) = 0$
iii) by ii), for some v, $\mathcal{R}wv$ and $V_g(\forall\alpha\phi, v) = 0$
iv) and so, for some $u \in \mathcal{D}_v$, $V_{g_{u/\alpha}}(\phi, v) = 0$
v) given decreasing domains, $\mathcal{D}_v \subseteq \mathcal{D}_w$, and so $u \in \mathcal{D}_w$
vi) by i), for every object in $\mathcal{D}_w$, and so for u in particular, $V_{g_{u/\alpha}}(\Box\phi, w) = 1$
vii) so, since $\mathcal{R}wv$, $V_{g_{u/\alpha}}(\phi, v) = 1$. Contradicts iv)
Similarly, notice that the counterexamples to ∃x□Fx→□∃xFx and the
converse Barcan formula assumed that domains can shrink. Just as above, an
added requirement on models will validate these formulas:
Increasing domains requirement: if $\mathcal{R}wv$ then $\mathcal{D}_w \subseteq \mathcal{D}_v$
Any instance of the schema ∃α□φ→□∃αφ, for example, may be shown to be
validated as follows:
i) suppose for reductio that $V_g(\exists\alpha\Box\phi \to \Box\exists\alpha\phi, w) = 0$. Then
$V_g(\exists\alpha\Box\phi, w) = 1$ and…
ii) …$V_g(\Box\exists\alpha\phi, w) = 0$
iii) by i), for some $u \in \mathcal{D}_w$, $V_{g_{u/\alpha}}(\Box\phi, w) = 1$
iv) by ii), for some world v, $\mathcal{R}wv$ and $V_g(\exists\alpha\phi, v) = 0$
v) by the increasing domains requirement, $\mathcal{D}_w \subseteq \mathcal{D}_v$, and so $u \in \mathcal{D}_v$
vi) by iv), for every object in $\mathcal{D}_v$, and so for u in particular,
$V_{g_{u/\alpha}}(\phi, v) = 0$
vii) by iii), $V_{g_{u/\alpha}}(\phi, v) = 1$. Contradicts vi)
Note further that even after imposing the increasing domains requirement,
the Barcan formula remains VDQML-invalid; and after imposing the decreasing
domains requirement, the converse Barcan formula and also ∃x□Fx→□∃xFx
remain VDQML-invalid; this can be seen by the original countermodels for
these formulas above, each of which obeys the requirement in question. However,
it should also be noted that in systems in which the accessibility relation
is symmetric, this collapses: imposing either of these requirements results in
imposing the other. That is, in B or S5, imposing either the increasing or the
decreasing domains requirement results in imposing both, and hence results in
all three formulas being validated.
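The collapse under symmetry can be illustrated by exhaustive search over small structures. The Python sketch below is an illustration only: the function names are invented here, and the search is limited to two worlds and to nonempty subdomains drawn from a two-element superdomain. It confirms that, among these structures, the two requirements coincide whenever the accessibility relation is symmetric, and that they come apart only when it is not.

```python
from itertools import combinations, product

WORLDS = ("r", "a")
# The nonempty subsets of the superdomain {u, v}.
SUBDOMAINS = [frozenset(s) for s in ({"u"}, {"v"}, {"u", "v"})]

def increasing(R, Dw):
    """If Rwv then D_w is a subset of D_v."""
    return all(Dw[w] <= Dw[v] for (w, v) in R)

def decreasing(R, Dw):
    """If Rwv then D_v is a subset of D_w."""
    return all(Dw[v] <= Dw[w] for (w, v) in R)

def symmetric(R):
    return all((v, w) in R for (w, v) in R)

def all_structures():
    """Every accessibility relation on WORLDS paired with every
    assignment of a nonempty subdomain to each world."""
    pairs = list(product(WORLDS, repeat=2))
    for k in range(len(pairs) + 1):
        for R in combinations(pairs, k):
            for doms in product(SUBDOMAINS, repeat=len(WORLDS)):
                yield set(R), dict(zip(WORLDS, doms))

collapse = all(increasing(R, Dw) == decreasing(R, Dw)
               for R, Dw in all_structures() if symmetric(R))
witnesses = [(R, Dw) for R, Dw in all_structures()
             if increasing(R, Dw) and not decreasing(R, Dw)]

print(collapse)                                      # True
print(len(witnesses) > 0)                            # True
print(all(not symmetric(R) for R, _ in witnesses))   # True
```

The general reason is the one given in the text: if both Rwv and Rvw hold, each requirement applied to the two pairs yields inclusion in both directions, so the subdomains of w and v are identical.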
9.6.3 Strong and weak necessity
In order for □φ to be true at a world, the VDQML semantics requires that φ
be true at every accessible world. It might be thought that this requirement is
too strong. In order for □Fa, say, to be true, our definition requires Fa to be
true in all possible worlds. But what if a fails to exist in some worlds? In order
for “Necessarily, I am human” to be true, must I be human in every possible
world? Isn’t it enough for me to be human in all the worlds in which I exist?
This argument goes by a little too quickly. The main worry of its proponent
is that our semantics requires a to exist necessarily, in order for □Fa to come
out true. But our semantics doesn’t require this. It does require Fa to be
true in every world, in order for □Fa to be true; but it does not require a
to exist in every world in which Fa is true. The clause in the definition of a
VDQML-model for the interpretation of predicates was this:
b) if Π is an n-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered n+1-tuples
$\langle u_1, \dots, u_n, w \rangle$, where $u_1, \dots, u_n$ are members of $\mathcal{D}$, and
$w \in \mathcal{W}$.
This allows $\mathcal{I}(F)$ to contain pairs $\langle u, w \rangle$, where u is not a member of $\mathcal{D}_w$. So
one could say that □Fa is consistent with a’s failing to necessarily exist; it’s just
that a has to be F even in worlds where it doesn’t exist.
I doubt this really addresses the worry, since it looks like bad metaphysics
to say that a person could be human at a world where he doesn’t exist. One
could hardwire a prohibition of this sort of bad metaphysics into VDQML
semantics, by replacing b) with:
b′) if Π is an n-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered n+1-tuples
$\langle u_1, \dots, u_n, w \rangle$, where $u_1, \dots, u_n$ are members of $\mathcal{D}_w$, and
$w \in \mathcal{W}$.
thus barring objects from having properties at worlds where they don’t exist.
But some would argue that this goes too far. Given b′), the sentence
∀x□(Fx→∃y y=x) turns out valid. “An object must exist in order to be F”
sounds clearly true if F stands for ‘is human’, but what if F stands for ‘is famous’?
Suppose John Lennon had never existed (it was all a communist plot); some
would argue that in such a circumstance Lennon still would have been famous.
The issues here are complex.⁸ But whether or not b) should be replaced
with b′), it looks as though there are some existence-entailing English predicates
π: predicates π such that nothing can be a π without existing. ‘Is human’ seems
to be such a predicate. So we’re back to our original worry about VDQML
semantics: its truth condition for □φ requires truth of φ at all worlds, which is
allegedly too strong, at least when φ is a sentence like πa, where π is
existence-entailing.
⁸ The question is that of so-called “serious actualism” (Plantinga, 1983).
One could modify the clause for the □ in the definition of the valuation
function, so that in order for □Fa to be true, a only needs to be F in worlds in
which it exists:
d′) $V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for each $v \in \mathcal{W}$, if $\mathcal{R}wv$, and if $[\alpha]_{\mathcal{M},g} \in \mathcal{D}_v$
for each name or free⁹ variable α occurring in φ, then
$V_{\mathcal{M},g}(\phi, v) = 1$
This would indeed have the result that □Fa gets to be true provided a is F in
every world in which it exists. But be careful what you wish for. Along with
this result comes the following: even if a doesn’t necessarily exist, the sentence
□∃x x=a comes out true. For according to d′), in order for □∃x x=a to be
true, it must merely be the case that ∃x x=a is true in every world in which a
exists, and of course this is indeed the case.
If □∃x x=a comes out true even if a doesn’t necessarily exist, then □∃x x=a
doesn’t say that a necessarily exists. Indeed, it doesn’t look like we have any way
of saying that a necessarily exists, using the language of QML, if the □ has the
meaning provided for it by d′).
A notion of necessity according to which “Necessarily φ” requires truth in
all possible worlds is sometimes called a notion of strong necessity. In contrast,
a notion of weak necessity is one according to which “Necessarily φ” requires
merely that φ be true in all worlds in which objects named within φ exist. d′)
corresponds to weak necessity, whereas our original definition d) corresponds
to strong necessity.
As we saw, if the □ expresses weak necessity, then one cannot even express
the idea that a thing necessarily exists. That’s because one needs strong necessity
to say that a thing necessarily exists: in order to necessarily exist, you need to
exist at all worlds, not just at all worlds at which you exist! So this is a serious
deficiency of having the □ of QML express weak necessity. But if we allow the
□ to express strong necessity instead, there is no corresponding deficiency, for
one can still express weak necessity using the strong □ and other connectives.
For example, to say that a is weakly necessarily F (that is, that a is F in every
world in which it exists), one can say: □(∃x x=a→Fa).
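The contrast between the two readings of the □ can be made concrete in a small model. The Python sketch below is illustrative only: the model, the names `strong_box` and `weak_box`, and the device of passing the denotations of the names occurring in φ as an explicit argument are all assumptions made here. The weak box skips any accessible world at which a named object is missing from that world’s subdomain.

```python
W = ["w1", "w2"]
R = {(w, v) for w in W for v in W}       # every world sees every world
Dw = {"w1": {"o", "p"}, "w2": {"p"}}     # the object o exists at w1 only
a = "o"                                  # denotation of the name a
F = {("o", "w1")}                        # o is F exactly where it exists

def Fa(v):
    return (a, v) in F

def a_exists(v):
    """∃x x=a at v: some member of v's subdomain is identical to a."""
    return any(u == a for u in Dw[v])

def strong_box(prop, w):
    """Clause d): prop must hold at every accessible world."""
    return all(prop(v) for v in W if (w, v) in R)

def weak_box(prop, w, named=(a,)):
    """Clause d′): prop must hold at every accessible world at which
    all the named objects exist."""
    return all(prop(v) for v in W
               if (w, v) in R and all(o in Dw[v] for o in named))

print(strong_box(Fa, "w1"))        # False: a is not F at w2
print(weak_box(Fa, "w1"))          # True: a is F wherever it exists
print(weak_box(a_exists, "w1"))    # True, although a fails to exist at w2
print(strong_box(a_exists, "w1"))  # False
print(strong_box(lambda v: (not a_exists(v)) or Fa(v), "w1"))  # True
```

The third line is the trouble with weak necessity: □∃x x=a comes out true of a contingent existent. The last line shows the strong □ recovering weak necessity by way of □(∃x x=a→Fa).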
So it would seem that we should stick with our original truth condition d)
for the □, and live with the fact that statements like □Fa turn out false if a fails
to be F at worlds in which it doesn’t exist. Those who think that “Necessarily,
Ted is human” is true despite Ted’s possible nonexistence can always translate
this natural language sentence into the language of QML as □(∃x x=a→Fa)
(which requires a to be F only at worlds at which it exists) rather than as □Fa
(which requires a to be F at all worlds).
⁹ I.e., not bound to any quantifier in φ.
Chapter 10
Two-dimensional modal logic
In this chapter we consider an extension to modal logic with considerable
philosophical interest.
10.1 Actuality
The word ‘actually’, in one of its senses anyway, can be thought of as a one-place
sentence operator: “Actually, φ.”
‘Actually’ might at first seem redundant. “Actually, snow is white” basically
amounts to: “snow is white”. But the actuality operator interacts with modal
operators in interesting ways. The following two sentences, for example, clearly
have different meanings:
Necessarily, if snow is white then snow is white
Necessarily, if snow is white then snow is actually white
The first sentence expresses the triviality that snow is white in any possible
world in which snow is white. But the second sentence makes the nontrivial
statement that if snow is white in any world, then snow is white in the actual
world.
So, ‘actually’ is nonredundant, and consequently, worth thinking about.
Let’s add a symbol to modal logic for it. “@φ” will symbolize “Actually, φ”.
We can now symbolize the pair of sentences above as □(S→S) and □(S→@S),
respectively. For some further examples of sentences we can symbolize using
‘actually’, consider:¹
It might have been that everyone who is actually rich is poor
◇∀x(@Rx→Px)
There could have existed something that does not actually exist
◇∃x@∼∃y y=x
10.1.1 Kripke models with actual worlds
For the purposes of this chapter, the logic of iterated boxes and diamonds isn’t
relevant, so let’s simplify things by dropping the accessibility relation from
models; we will thereby treat every world as being accessible from every other.
Before laying out the semantics of @, let’s examine a slightly different way
of laying out standard modal logic. For propositional modal logic, instead of
defining a model as an ordered pair $\langle \mathcal{W}, \mathcal{I} \rangle$ (no accessibility relation,
remember), one could instead define a model as a triple $\langle \mathcal{W}, w_@, \mathcal{I} \rangle$, where
$\mathcal{W}$ and $\mathcal{I}$ are as before, and $w_@$ is a member of $\mathcal{W}$, thought of as the actual,
or designated world of the model. The designated world $w_@$ plays no role in the
definition of the valuation for a given model; it only plays a role in the definitions
of truth in a model and validity:
φ is true in model $\mathcal{M}$ ($= \langle \mathcal{W}, w_@, \mathcal{I} \rangle$) iff $V_{\mathcal{M}}(\phi, w_@) = 1$
φ is valid in system S iff φ is true in all models for system S
The old definition of validity for a system, recall, never employed the notion
of truth in a model; rather, it proceeded via the notion of validity in a frame.
The nice thing about the new definition is that it’s parallel to the way validity
is usually defined in model theory: one first defines truth in a model, and then
defines validity as truth in all models. But the new definition doesn’t differ in
any substantive way from the old definition, in that it yields exactly the same
class of valid formulas:
Proof: It’s obvious that everything valid on the old definition is
valid on the new definition (the old definition says that validity
is truth in all worlds in all models; the addition of the designated
world $w_@$ doesn’t play any role in defining truth at worlds, so each
of the new models has the same distribution of truth values as one
of the old models). Moreover, suppose that a formula is invalid on
the old definition, i.e., suppose that φ is false at some world, w, in
some model $\mathcal{M}$. Now construct a model of the new variety that’s
just like $\mathcal{M}$ except that its designated world is w. φ will be false in
this model, and so φ turns out invalid under the new definition.
¹ In certain special cases, we could do without the new symbol @. For example, instead of
symbolizing “Necessarily, if snow is white then snow is actually white” as □(S→@S), we could
symbolize it as ◇S→S. But the @ is not in general eliminable; see Hodes (1984b,a).
In a parallel way, one can also add a designated world to models for
quantified modal logic.
10.1.2 Semantics for @
Now for the semantics of @. We can give @ a very simple semantics using
models with designated worlds. Further, the designated world will now be
involved in the notion of truth in a model, not just in the definition of validity.
We’ll move straight to quantified modal logic, bypassing propositional logic. To
keep things simple, let the models have a constant domain and no accessibility
relation. (It will be obvious how to add these complications back in, if they are
desired.) Define a Designated-world QML-model as a four-tuple $\langle \mathcal{W}, w_@, \mathcal{D}, \mathcal{I} \rangle$,
where:
i) $\mathcal{W}$ is a nonempty set (“worlds”)
ii) $w_@$ is a member of $\mathcal{W}$ (“designated/actual world”)
iii) $\mathcal{D}$ is a nonempty set (“domain”)
iv) $\mathcal{I}$ is a function that assigns semantic values (as before: names
are assigned members of $\mathcal{D}$; predicates are assigned extensions
relative to worlds)
In the definition of the valuation for such a model, the semantic clauses for the
old logical constants run as before, with the exception that the clause for the □
no longer mentions accessibility:
$V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for every $v \in \mathcal{W}$, $V_{\mathcal{M},g}(\phi, v) = 1$
And we now add a clause for the new operator @:
$V_{\mathcal{M},g}(@\phi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w_@) = 1$
i.e., @φ is true at any world iff φ is true in the designated world of the model.
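The clause for @ is easy to add to a recursive evaluator. The following Python sketch is an illustration under assumptions: the tuple encoding, the dictionary layout of a model, and the sample model are invented here, and only the operators needed for the example are implemented. It compares “It might have been that everyone who is actually rich is poor” with the same sentence minus ‘actually’.

```python
def ev(f, w, g, M):
    """Truth value of f at world w under assignment g in a
    designated-world model M.

    M is a dict with keys "W" (worlds), "w_at" (designated world),
    "D" (constant domain) and "I" (predicate -> set of (object, world)).
    """
    op = f[0]
    if op == "pred":
        return (g[f[2]], w) in M["I"][f[1]]
    if op == "imp":
        return (not ev(f[1], w, g, M)) or ev(f[2], w, g, M)
    if op == "all":
        return all(ev(f[2], w, {**g, f[1]: u}, M) for u in M["D"])
    if op == "box":     # no accessibility relation: all worlds count
        return all(ev(f[1], v, g, M) for v in M["W"])
    if op == "dia":
        return any(ev(f[1], v, g, M) for v in M["W"])
    if op == "act":     # @: evaluate at the designated world instead
        return ev(f[1], M["w_at"], g, M)
    raise ValueError(f"unknown operator: {op}")

def true_in_model(f, M):
    """Truth in a model is truth at its designated world."""
    return ev(f, M["w_at"], {}, M)

# Hypothetical model: u is rich in the actual world and poor in v;
# t is rich in v and poor nowhere.
M = {
    "W": ["w@", "v"],
    "w_at": "w@",
    "D": ["u", "t"],
    "I": {"R": {("u", "w@"), ("t", "v")}, "P": {("u", "v")}},
}
Rx, Px = ("pred", "R", "x"), ("pred", "P", "x")
with_at = ("dia", ("all", "x", ("imp", ("act", Rx), Px)))
without_at = ("dia", ("all", "x", ("imp", Rx, Px)))

print(true_in_model(with_at, M))     # True
print(true_in_model(without_at, M))  # False
```

At world v the actually rich (just u) are all poor, so the first sentence is true. Without @, each world contains someone who is rich there without being poor there, so the second is false.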
10.1.3 Examples
Example 1: Show that ∀x(Fx∨□Gx)→□∀x(Gx∨@Fx) is valid.
i) Suppose for reductio that this formula is not valid. Then for
some model and some variable assignment g,
$V_g(\forall x(Fx \lor \Box Gx) \to \Box\forall x(Gx \lor @Fx), w_@) = 0$.
ii) Then $V_g(\forall x(Fx \lor \Box Gx), w_@) = 1$ and…
iii) …$V_g(\Box\forall x(Gx \lor @Fx), w_@) = 0$
iv) Given the latter, there is some world, call it “a”, such that
$V_g(\forall x(Gx \lor @Fx), a) = 0$. And so, there is some
object, call it “u”, in the model’s domain, $\mathcal{D}$, such that
$V_{g_{u/x}}(Gx \lor @Fx, a) = 0$
v) And so $V_{g_{u/x}}(Gx, a) = 0$ and…
vi) …$V_{g_{u/x}}(@Fx, a) = 0$
vii) Given the latter, $V_{g_{u/x}}(Fx, w_@) = 0$ (by the clause in the truth
definition for @)
viii) Given ii), for every object in $\mathcal{D}$, and so for u in particular,
$V_{g_{u/x}}(Fx \lor \Box Gx, w_@) = 1$.
ix) And so, either $V_{g_{u/x}}(Fx, w_@) = 1$ or $V_{g_{u/x}}(\Box Gx, w_@) = 1$
x) From ix) and vii), $V_{g_{u/x}}(\Box Gx, w_@) = 1$
xi) And so, $V_{g_{u/x}}(Gx, a) = 1$, which contradicts v)
Example 2: show that ⊭ □∀x(Gx∨@Fx)→□∀x(Gx∨Fx): here is a model in
which this formula is false at the actual world, w_@:
W = {w_@, a}
D = {u}
I(F) = {〈u, w_@〉}
I(G) = ∅
The formula turns out false in this model: the consequent is false because at
world a, something (namely, u) is neither G nor F; but the antecedent is true:
since u is F at w_@, it’s necessary that u is either G or actually F.
10.2 ×
Adding @ to the language of quantified modal logic is a step in the right
direction, since it allows us to express certain kinds of comparisons between
possible worlds that we couldn’t express otherwise. But it doesn’t go far enough;
we need a further addition.[2]
Consider this sentence:
It might have been the case that, if all those then rich might all
have been poor, then someone is happy
What it’s saying, in possible worlds terms, is this:
For some world w, if there’s a world v such that (everyone who is
rich in w is poor in v), then someone is happy in w.
This is a bit like “It might have been that everyone who is actually rich is poor”;
in this new sentence the word ‘then’ plays a role a bit like the role ‘actually’
played in the earlier sentence. But the ‘then’ does not take us back to the actual
world of the model; it rather takes us back to the world, w, that is introduced
by the first possibility operator, ‘it might have been the case that’. We cannot,
therefore, symbolize our new sentence thus:
◇(◇∀x(@Rx→Px)→∃xHx)
for this has the truth condition that there is a world w such that, if there’s a world
v such that (everyone who is rich in w_@ is poor in v), then someone is happy in
w. The problem is that the @, as we’ve defined it, always takes us back to the
model’s designated world, whereas what we need to do is to “mark” a world,
and have @ take us back to the “marked” world:
◇×(◇∀x(@Rx→Px)→∃xHx)
× marks the spot: it is a point of reference for subsequent occurrences of @.
[2] See Hodes (1984a) on the limitations of @; see Cresswell (1990) on × (his symbol is “Ref”),
and further related additions.
10.2.1 Two-dimensional semantics for ×
So let’s further augment the language of QML with another one-place sentence
operator, ×. The idea is that ×φ means the same thing as φ, except that
subsequent occurrences of @ in φ are to be interpreted as picking out the world
that was the “current world of evaluation” when the × was encountered. (This
will become clearer once we lay out the semantics for × and @.)
To lay out this semantics, let’s return to the old QML models (i.e., without
a designated world; and let’s continue to omit the accessibility relation). Thus,
a model is a triple 〈W, D, I〉, W a nonempty set, D a nonempty set, I a
function assigning referents to names and extensions to predicates at worlds as
before.
But now we change the definition of truth. We no longer evaluate formulas
at worlds. Instead we evaluate a formula at a pair of worlds (hence:
“two-dimensional semantics”). One world is the world we’re used to; it’s the world
that we’re evaluating the formula for truth in. Call this the “world of evaluation”.
The other world is a “reference world”: it’s the world that we’re currently
thinking of as the actual world, and the world that will be relevant to the
evaluation of @. Thus,
V_{M,g}(φ, w_1, w_2) = 1
will mean that φ is true at world w_2, with reference world w_1. We define [α]_{M,g},
the denotation of term α relative to model M and variable assignment g, as
before. And V_{M,g} is defined as the function that assigns to each wff, relative to
each pair of worlds, either 0 or 1 subject to the following constraints:
i) V_{M,g}(Πα_1…α_n, v, w) = 1 iff 〈[α_1]_{M,g}, …, [α_n]_{M,g}, w〉 ∈ I(Π)
ii) V_{M,g}(∼φ, v, w) = 1 iff V_{M,g}(φ, v, w) = 0
iii) V_{M,g}(φ→ψ, v, w) = 1 iff V_{M,g}(φ, v, w) = 0 or V_{M,g}(ψ, v, w) = 1
iv) V_{M,g}(∀αφ, v, w) = 1 iff for all u ∈ D, V_{M,g^u_α}(φ, v, w) = 1
v) V_{M,g}(□φ, v, w) = 1 iff for all w′ ∈ W, V_{M,g}(φ, v, w′) = 1
vi) V_{M,g}(@φ, v, w) = 1 iff V_{M,g}(φ, v, v) = 1
vii) V_{M,g}(×φ, v, w) = 1 iff V_{M,g}(φ, w, w) = 1
Note what the × does: change the reference world. When evaluating a formula,
it says to forget about the old reference world, and make the new reference
world whatever the current world of evaluation happens to be.
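To see the reference-world bookkeeping at work, here is a minimal sketch of a two-dimensional evaluator for a fragment of the language. The tuple encoding, the operator tags ("dia" for the possibility operator, "act" for @, "mark" for ×), and the particular model are assumptions made for illustration; the model is hypothetical and was chosen so that placing × inside the first possibility operator gives a different value from placing it at the front.

```python
def val(model, g, phi, v, w):
    """True iff phi is true at evaluation world w with reference world v."""
    W, D, I = model
    op, *args = phi
    if op == "atom":                       # ("atom", "R", "x")
        pred, var = args
        return (g[var], w) in I[pred]
    if op == "imp":
        return (not val(model, g, args[0], v, w)) or val(model, g, args[1], v, w)
    if op == "all":
        return all(val(model, {**g, args[0]: u}, args[1], v, w) for u in D)
    if op == "some":
        return any(val(model, {**g, args[0]: u}, args[1], v, w) for u in D)
    if op == "dia":                        # possibility: some evaluation world
        return any(val(model, g, args[0], v, w2) for w2 in W)
    if op == "act":                        # @: evaluate at the reference world
        return val(model, g, args[0], v, v)
    if op == "mark":                       # ×: reset the reference world to w
        return val(model, g, args[0], w, w)
    raise ValueError(f"unknown operator {op!r}")

# (possibly, all the actually-rich are poor) -> someone is happy
body = ("imp",
        ("dia", ("all", "x", ("imp", ("act", ("atom", "R", "x")),
                                     ("atom", "P", "x")))),
        ("some", "x", ("atom", "H", "x")))
marked_inside = ("dia", ("mark", body))    # × after the first possibility operator
marked_outside = ("mark", ("dia", body))   # × at the very front

# Hypothetical model: u is rich only at d; nobody is ever poor or happy.
model = ({"c", "d"}, {"u"}, {"R": {("u", "d")}, "P": set(), "H": set()})
inside = val(model, {}, marked_inside, "c", "c")
outside = val(model, {}, marked_outside, "c", "c")
print(inside, outside)
```

With × inside, the @ looks back to whichever world the outer possibility operator has moved to; with × at the front, @ always looks back to the starting world c. In this model the two formulas therefore receive different values at 〈c, c〉.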
We can define validity and consequence thus:
φ is 2D-valid (“⊨_{2D} φ”) iff for every model M, every world w in that
model, and every assignment g based on that model, V_{M,g}(φ, w, w) = 1
φ is a 2D-semantic consequence of Γ (“Γ ⊨_{2D} φ”) iff for every model
M, every assignment g based on that model, and every world w in
that model, if V_{M,g}(γ, w, w) = 1 for each γ ∈ Γ, then V_{M,g}(φ, w, w) = 1
Valid formulas are thus defined as those that are true at every pair of worlds of
the form 〈w, w〉; semantic consequence is truth-preservation at every pair of
worlds 〈w, w〉.
Notice, however, that these aren’t the only notions of validity and
consequence that one could introduce. There is also the notion of truth, and
truth-preservation, at every pair of worlds:[3]
φ is generally 2D-valid (“⊨_{G2D} φ”) iff for every model M, any worlds
v and w in that model, and every assignment g based on that model,
V_{M,g}(φ, v, w) = 1
φ is a general 2D-semantic consequence of Γ (“Γ ⊨_{G2D} φ”) iff for every
model M, every assignment g based on that model, and any worlds
v and w in that model, if V_{M,g}(γ, v, w) = 1 for each γ ∈ Γ, then
V_{M,g}(φ, v, w) = 1
Validity and general validity, and consequence and general consequence, come
apart in various ways, as we’ll see below.
[3] The term ‘general validity’ is from Davies and Humberstone (1980); the first definition of
validity corresponds to their “real-world validity”.
As we saw, moving to this new language increases the flexibility of the @;
we can symbolize
It might have been the case that, if all those then rich might all
have been poor, then someone is happy
as
◇×(◇∀x(@Rx→Px)→∃xHx)
Moreover, it costs us nothing. For we can replace any sentence φ of the old
language with ×φ in the new language (i.e. we just put the × operator at the
front of the sentence).[4]
For example, instead of symbolizing
It might have been that everyone who is actually rich is poor
as ◇∀x(@Rx→Px) as we did before, we symbolize it now as ×◇∀x(@Rx→Px).
10.2.2 Examples
Example 1: show that if ⊨_{2D} φ then ⊨_{2D} @φ:
Suppose for reductio that φ is valid but @φ is not. That means
that in some model and some world, w (and some assignment g,
but I’ll suppress this since it isn’t relevant here), V(@φ, w, w) =0.
Thus, given the truth condition for @, V(φ, w, w) = 0. But that
violates the validity of φ.
Example 2: ⊨_{2D} φ↔@φ, but ⊭_{2D} □(φ↔@φ). (Moral: any proof theory for this
logic had better not include the rule of necessitation!)
The truth condition for @ ensures that for any world w in any model
(and any variable assignment), V(@φ, w, w) = 1 iff V(φ, w, w) = 1,
and so V(φ↔@φ, w, w) = 1. Thus, ⊨_{2D} φ↔@φ.
But some instances of □(φ↔@φ) aren’t valid. Let φ be ‘Fa’; here’s
a countermodel:
W = {c, d}
D = {u}
I(a) = u
I(F) = {〈u, c〉}
In this model, V(□(Fa↔@Fa), c, c) = 0, because V(Fa↔@Fa, c, d) = 0.
For ‘Fa’ is true at 〈c, d〉 iff the referent of ‘a’ is in the extension of
‘F’ at world d (it isn’t) whereas ‘@Fa’ is true at 〈c, d〉 iff the referent
of ‘a’ is in the extension of ‘F’ at world c (it is).
Note that this same model shows that φ↔@φ is not generally valid.
General validity is truth at all pairs of worlds, and the formula
Fa↔@Fa, as we just showed, is false at the pair 〈c, d〉.
[4] This amounts to the same thing as the old symbolization in the following sense. Let
φ be any wff of the old language. Thus, φ may have some occurrences of @, but it has no
occurrences of ×. Then, for every QML-model M = 〈W, D, I〉, and any v, w ∈ W, ×φ is
true at 〈v, w〉 in M iff φ is true in the designated-world QML model 〈W, w, D, I〉.
Example 3: ⊨_{2D} φ→□@φ
Consider any model, world w (and variable assignment), and
suppose for reductio that V(φ, w, w) = 1 but V(□@φ, w, w) = 0. Given
the latter, there is some world, v, such that V(@φ, w, v) = 0. And
so, given the truth condition for @, V(φ, w, w) = 0. Contradiction.
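The difference between truth on the diagonal pairs 〈w, w〉 and truth at all pairs can be checked directly on the two-world countermodel of Example 2. This is a minimal sketch under assumed conventions: formulas are nested tuples, the interpretation is a plain dictionary holding both the name ‘a’ and the predicate ‘F’, and the helper names are hypothetical. The helpers test truth throughout one model only; they do not establish validity across all models.

```python
def val(model, phi, v, w):
    """True iff phi is true at evaluation world w with reference world v."""
    W, D, I = model
    op, *args = phi
    if op == "atom":                       # ("atom", "F", "a"); names only
        pred, name = args
        return (I[name], w) in I[pred]
    if op == "imp":
        return (not val(model, args[0], v, w)) or val(model, args[1], v, w)
    if op == "iff":
        return val(model, args[0], v, w) == val(model, args[1], v, w)
    if op == "box":                        # every evaluation world
        return all(val(model, args[0], v, w2) for w2 in W)
    if op == "act":                        # @: jump to the reference world
        return val(model, args[0], v, v)
    raise ValueError(f"unknown operator {op!r}")

def true_on_diagonal(model, phi):
    """Truth at every pair <w, w> of this one model."""
    return all(val(model, phi, w, w) for w in model[0])

def true_at_all_pairs(model, phi):
    """Truth at every pair <v, w> of this one model."""
    W = model[0]
    return all(val(model, phi, v, w) for v in W for w in W)

# The countermodel of Example 2: 'a' denotes u, and u is F only at c.
model = ({"c", "d"}, {"u"}, {"a": "u", "F": {("u", "c")}})
Fa = ("atom", "F", "a")
bicond = ("iff", Fa, ("act", Fa))               # Fa <-> @Fa
ex3 = ("imp", Fa, ("box", ("act", Fa)))         # Fa -> box @Fa

diag_ok = true_on_diagonal(model, bicond)
all_pairs_ok = true_at_all_pairs(model, bicond)
boxed_at_cc = val(model, ("box", bicond), "c", "c")
ex3_diag_ok = true_on_diagonal(model, ex3)
print(diag_ok, all_pairs_ok, boxed_at_cc, ex3_diag_ok)
```

The biconditional holds on the diagonal but fails at 〈c, d〉, which is also why its necessitation fails at 〈c, c〉; the Example 3 formula holds on the diagonal of this model, as the reductio above predicts for every model.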
Example 4: ⊨_{2D} □×∀x◇@Fx→□∀xFx
i) Suppose for reductio that for some world w, some variable
assignment g, and some model, V_g(□×∀x◇@Fx, w, w) = 1
and …
ii) …V_g(□∀xFx, w, w) = 0.
iii) Given the latter, for some world, call it “a”, V_g(∀xFx, w, a) = 0.
iv) And so for some object in D (call it “u”), V_{g^u_x}(Fx, w, a) = 0.
v) Given i), V_g(×∀x◇@Fx, w, a) = 1
vi) Given the truth condition for ×, V_g(∀x◇@Fx, a, a) = 1
vii) Thus, for every object in the domain, and so for u in particular,
V_{g^u_x}(◇@Fx, a, a) = 1
viii) Thus, for some world, call it b, V_{g^u_x}(@Fx, a, b) = 1
ix) Given the truth condition for @, V_{g^u_x}(Fx, a, a) = 1
x) Given the truth condition for atomics, 〈[x]_{g^u_x}, a〉 ∈ I(F)
xi) But given iv, 〈[x]_{g^u_x}, a〉 ∉ I(F). Contradiction
10.3 Fixedly
The two-dimensional approach to semantics (evaluating formulas at pairs of
worlds rather than single worlds) raises an intriguing possibility. The □ is a
universal quantifier over the world of evaluation; we might, by analogy, follow
Davies and Humberstone (1980) and introduce an operator that is a universal
quantifier over the reference world. Davies and Humberstone call this operator
F, and read “Fφ” as “fixedly, φ”. Grammatically, F is a one-place sentential
operator. Its semantic clause is this:
V_{M,g}(Fφ, v, w) = 1 iff for every v′ ∈ W, V_{M,g}(φ, v′, w) = 1
Everything else, including the definitions of validity and consequence, stays
the same.
Davies and Humberstone point out that given F, @, and □, we can introduce
two new operators: F@ and F□. It’s easy to show that:
V_{M,g}(F@φ, v, w) = 1 iff for every v′ ∈ W, V_{M,g}(φ, v′, v′) = 1
V_{M,g}(F□φ, v, w) = 1 iff for all v′, w′ ∈ W, V_{M,g}(φ, v′, w′) = 1
Thus, we can think of F@ and F□, as well as □ and F themselves, as being
“kinds of necessities”, since their truth conditions introduce universal quantifiers
over worlds of evaluation and reference worlds. (What about □F? It’s easy to
show that □F is just equivalent to F□.)
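The derived truth conditions for F@ and F□, and the equivalence of □F with F□, can be spot-checked on a small model. The sketch below is illustrative only: the tuple encoding, the tag "fix" for F, and the sample model and sample formulas are assumptions, and agreement on one model is a sanity check rather than a proof.

```python
def val(model, phi, v, w):
    """True iff phi is true at evaluation world w with reference world v."""
    W, D, I = model
    op, *args = phi
    if op == "atom":                       # ("atom", "G", "a"); names only
        pred, name = args
        return (I[name], w) in I[pred]
    if op == "iff":
        return val(model, args[0], v, w) == val(model, args[1], v, w)
    if op == "box":                        # quantifies over evaluation worlds
        return all(val(model, args[0], v, w2) for w2 in W)
    if op == "act":                        # @: jump to the reference world
        return val(model, args[0], v, v)
    if op == "fix":                        # F: quantifies over reference worlds
        return all(val(model, args[0], v2, w) for v2 in W)
    raise ValueError(f"unknown operator {op!r}")

# Assumed sample model: 'a' denotes u, and u is G only at c.
model = ({"c", "d"}, {"u"}, {"a": "u", "G": {("u", "c")}})
W = model[0]
pairs = [(v, w) for v in W for w in W]
Ga = ("atom", "G", "a")
samples = [Ga, ("act", Ga), ("iff", ("act", Ga), Ga)]

# F@phi at any pair  <=>  phi true at every diagonal pair <x, x>
f_act_ok = all(
    val(model, ("fix", ("act", p)), v, w) == all(val(model, p, x, x) for x in W)
    for p in samples for v, w in pairs)

# F box phi at any pair  <=>  phi true at every pair <x, y>
f_box_ok = all(
    val(model, ("fix", ("box", p)), v, w)
    == all(val(model, p, x, y) for x, y in pairs)
    for p in samples for v, w in pairs)

# box F phi and F box phi agree everywhere
swap_ok = all(
    val(model, ("box", ("fix", p)), v, w) == val(model, ("fix", ("box", p)), v, w)
    for p in samples for v, w in pairs)
print(f_act_ok, f_box_ok, swap_ok)
```

Each check compares the value computed from the basic clauses for F, @, and □ with the value given by the corresponding derived truth condition.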
Davies and Humberstone don’t use two-dimensional semantics; they
instead use designated-world QML models (and they don’t include ×). Say that
designated-world QML models are variants iff they are alike except perhaps
for the designated world. The truth condition for F is then this:
V_{M,g}(Fφ, w) = 1 iff for every model M′ that is a variant of M,
V_{M′,g}(φ, w) = 1
But this isn’t significantly different from the present two-dimensional approach.
10.3.1 Examples
Example 1: for any wff, φ, ⊨_{2D} F@φ→φ
Suppose otherwise: suppose V(F@φ, w, w) = 1 but V(φ, w, w) = 0.
Given the former, V(@φ, w, w) = 1 (given the truth condition for
‘F’); but then V(φ, w, w) = 1 (given the truth condition for @).
Contradiction.
Example 2: For some wffs φ, ⊭_{G2D} F@φ→φ
Our φ will be @Ga↔Ga. General validity requires truth at all pairs
〈v, w〉 in all models. But in the following model,
V_g(F@(@Ga↔Ga)→(@Ga↔Ga), c, d) = 0 (for any g):
W = {c, d}
D = {u}
I(a) = u
I(G) = {〈u, c〉}
In this model, the referent of ‘a’ is in the extension of ‘G’ in world
c, but not in world d. That means that @Ga is true at 〈c, d〉 whereas
Ga is false at 〈c, d〉, and so @Ga↔Ga is false at 〈c, d〉. But F@φ
means that φ is true at all pairs of the form 〈v, v〉, and the
formula @Ga↔Ga is true at any such pair (in any model). Thus,
F@(@Ga↔Ga) is true at 〈c, d〉 in this model.
Example 3: for some φ, ⊭_{2D} φ→Fφ
In the model of the previous problem, the formula @Ga→F@Ga
is false at 〈c, c〉. The antecedent is true because the referent of ‘a’ is
in the extension of ‘G’ at c. The consequent is false because F@Ga
means that ‘Ga’ is true at all pairs of the form 〈v, v〉, whereas ‘Ga’
is not true at 〈d, d〉 (since the referent of ‘a’ is not in the extension
of ‘G’ at d).
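The claims made about the two-world model in Examples 1 through 3 can be confirmed with a small evaluator. As before, this is a sketch under assumed conventions (tuple-encoded formulas, "act" for @, "fix" for F, a dictionary for the interpretation); it checks one model and is not a general validity test.

```python
def val(model, phi, v, w):
    """True iff phi is true at evaluation world w with reference world v."""
    W, D, I = model
    op, *args = phi
    if op == "atom":                       # ("atom", "G", "a"); names only
        pred, name = args
        return (I[name], w) in I[pred]
    if op == "imp":
        return (not val(model, args[0], v, w)) or val(model, args[1], v, w)
    if op == "iff":
        return val(model, args[0], v, w) == val(model, args[1], v, w)
    if op == "act":                        # @: jump to the reference world
        return val(model, args[0], v, v)
    if op == "fix":                        # F: every reference world
        return all(val(model, args[0], v2, w) for v2 in W)
    raise ValueError(f"unknown operator {op!r}")

# The model of Example 2: 'a' denotes u, and u is G only at c.
model = ({"c", "d"}, {"u"}, {"a": "u", "G": {("u", "c")}})
Ga = ("atom", "G", "a")
phi = ("iff", ("act", Ga), Ga)                     # @Ga <-> Ga

f_act_phi_imp_phi = ("imp", ("fix", ("act", phi)), phi)
ex1_on_diagonal = all(val(model, f_act_phi_imp_phi, w, w) for w in model[0])
ex2_at_cd = val(model, f_act_phi_imp_phi, "c", "d")

ex3 = ("imp", ("act", Ga), ("fix", ("act", Ga)))   # @Ga -> F@Ga
ex3_at_cc = val(model, ex3, "c", "c")
print(ex1_on_diagonal, ex2_at_cd, ex3_at_cc)
```

The first value concerns only the diagonal pairs of this model, in line with Example 1; the second and third are the values at the pairs 〈c, d〉 and 〈c, c〉 discussed in Examples 2 and 3.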
Example 4: If φ has no occurrences of @, then ⊨_{2D} φ→Fφ
Let’s prove by induction that if φ has no occurrences of @, then
φ→Fφ is generally valid (i.e., true in any model at any world pair
〈v, w〉 under any variable assignment). The result then follows,
because general validity (truth at all pairs) obviously implies validity
(truth at pairs 〈w, w〉).
First, let φ be atomic. Then we’re trying to show that for any worlds
v, w, and any variable assignment g, V_g(Πα_1…α_n→FΠα_1…α_n, v, w) = 1.
Suppose otherwise: suppose that (i) V_g(Πα_1…α_n, v, w) = 1
and (ii) V_g(FΠα_1…α_n, v, w) = 0. Given (ii), for some world, call
it v′, V_g(Πα_1…α_n, v′, w) = 0, and so the ordered n-tuple of the
denotations of α_1…α_n is not in the extension of Π at w, which
contradicts (i).
Now the inductive step. We must assume that φ and ψ obey our
statement, and show that complex formulas built from φ and ψ also
obey our statement. That is, we assume the inductive hypothesis:
(ih) φ and ψ have no occurrences of @, ⊨_{G2D} φ→Fφ, and ⊨_{G2D} ψ→Fψ
and we must show that the following are also generally valid:
∼φ→F∼φ
(φ→ψ)→F(φ→ψ)
∀αφ→F∀αφ
□φ→F□φ
Fφ→FFφ
×φ→F×φ
∼: Suppose otherwise: suppose V(∼φ→F∼φ, v, w) = 0 for some
v, w. So V(∼φ, v, w) = 1 and V(F∼φ, v, w) = 0. So V(φ, v, w) = 0,
and for some v′, V(∼φ, v′, w) = 0; and so V(φ, v′, w) = 1. By (ih),
V(φ→Fφ, v′, w) = 1, and so V(Fφ, v′, w) = 1, and so V(φ, v, w) = 1.
Contradiction.
→: Suppose for some v, w, V((φ→ψ)→F(φ→ψ), v, w) = 0. So (i)
V(φ→ψ, v, w) = 1 and V(F(φ→ψ), v, w) = 0. So, for some world,
call it u, V(φ→ψ, u, w) = 0, and so V(φ, u, w) = 1 and (ii) V(ψ, u, w) = 0.
Given V(φ, u, w) = 1 and the inductive hypothesis, V(Fφ, u, w) = 1,
and so V(φ, v, w) = 1. And so, given (i), V(ψ, v, w) = 1, and so,
given the inductive hypothesis, V(Fψ, v, w) = 1, and so V(ψ, u, w) = 1,
which contradicts (ii).
∀: Suppose for some v, w, V_g(∀αφ, v, w) = 1, but V_g(F∀αφ, v, w) = 0.
Given the latter, for some v′, V_g(∀αφ, v′, w) = 0; and so, for
some u in the domain, V_{g^u_α}(φ, v′, w) = 0. Given the former,
V_{g^u_α}(φ, v, w) = 1; given (ih) it follows that V_{g^u_α}(Fφ, v, w) = 1,
and so V_{g^u_α}(φ, v′, w) = 1. Contradiction.
□: suppose (i) V(□φ, v, w) = 1 and (ii) V(F□φ, v, w) = 0, for some
v, w. From (ii), V(□φ, v′, w) = 0 for some v′, and so V(φ, v′, w′) = 0
for some w′. Given (i), V(φ, v, w′) = 1; and so, given (ih),
V(Fφ, v, w′) = 1, and so V(φ, v′, w′) = 1. Contradiction.
F: suppose V(Fφ, v, w) = 1 and V(FFφ, v, w) = 0, for some v, w.
From the latter, V(Fφ, v′, w) = 0 for some v′, and so V(φ, v″, w) = 0
for some v″, which contradicts the former.
×: suppose V_g(×φ, v, w) = 1 but V_g(F×φ, v, w) = 0, for some
v, w. Given the latter, V_g(×φ, v′, w) = 0 for some v′, and so
V_g(φ, w, w) = 0, which contradicts the former.
10.4 A philosophical application: necessity and a priority
The two-dimensional modal framework has been put to significant philosophical
use in the past twenty-five or so years.[5] This is not the place for an extended
survey; rather, I will briefly present the two-dimensional account of just one
philosophical issue: the relationship between necessity and a priority.
[5] For work in this tradition, see Stalnaker (1978, 2003a, 2004); Evans (1979); Davies and
Humberstone (1980); Hirsch (1986); Chalmers (1996, 2006); Jackson (1998); see Soames (2004)
for an extended critique.
In Naming and Necessity, Saul Kripke famously presented putative examples
of necessary a posteriori statements and of contingent a priori statements:
Hesperus = Phosphorus
B (the standard meter bar) is one meter long
The first statement, Kripke argued, is necessary because whenever we try to
imagine a possible world in which Hesperus is not Phosphorus, we find that
we have in fact merely imagined a world in which ‘Hesperus’ and ‘Phosphorus’
denote different objects than they in fact denote. Given that Hesperus and
Phosphorus are in fact one and the same entity (namely, the planet Venus),
there is no possible world in which Hesperus is different from Phosphorus, for
such a world would have to be a world in which Venus is distinct from itself.
Thus, the statement is necessary, despite its a posteriority: it took astronomical
evidence to learn that Hesperus and Phosphorus were identical; no amount
of pure rational reflection would have sufficed. As for the second statement,
Kripke argues that one can know its truth as soon as one knows that the phrase
‘one meter’ has its reference fixed by the description: “the length of bar B”.
Thus it is a priori. Nevertheless, he argues, it is contingent: bar B does not
have its length essentially, and thus could have been longer or shorter than one
meter.
On the face of it, the existence of necessary a posteriori or contingent a priori
statements is paradoxical. How can a statement that is true in all possible worlds
be in principle resistant to a priori investigation? Worse, how can a statement
that might have been false be known a priori?
The two-dimensional framework has been thought by some to shed light
on all this. Let’s consider the contingent a priori first. Let’s define the following
notion of contingency:
φ is superficially contingent in model M at world w iff, for every
variable assignment g, V_{M,g}(□φ, w, w) = V_{M,g}(□∼φ, w, w) = 0.
This corresponds, intuitively, to this: if we were sitting at w, and we uttered
◇φ∧◇∼φ, we’d speak the truth.
How should we formalize the notion of a priority? As a rough and ready
guide, let’s think of a sentence as being a priori iff it is 2D-valid, i.e., true at
every pair 〈w, w〉 of every model. In defense of this guide: we can think of the
truth value of an utterance of a sentence as being the valuation of that sentence
at the pair 〈w, w〉 in a model that accurately models the genuine possibilities,
and in which w accurately models the (genuine) possible world of the speaker.
So any 2D-valid sentence is invariably true whenever uttered; hence, if φ is
2D-valid, any speaker who understands his or her language is in a position to
know that an utterance of φ would be true.
Under these definitions, there are sentences that are superficially contingent
but nevertheless a priori. Consider any sentence of the form: φ↔@φ. In any
model in which φ is true at w and false at some other world, the sentence
is superficially contingent. But it is a priori, since, as we showed above, it is
2D-valid (though it’s not generally 2D-valid, as we also showed above).
That was a relatively simple example; but one can give other examples that
are similar in spirit both to Kripke’s example of the meter bar, and to a related
example due to Gareth Evans (1979):
Bar B is one meter
Julius invented the zip
where bar B is the standard meter bar, and the “descriptive names” ‘one meter’
and ‘Julius’ are said to be “rigid designators” whose references are “fixed” by the
descriptions ‘the length of bar B’ and ‘the inventor of the zip’, respectively. Now,
whether or not these sentences, understood as sentences of everyday English,
are indeed genuinely contingent and a priori depends on delicate issues in the
philosophy of language concerning descriptive names, rigid designation, and
reference fixing. Rather than going into all that, let’s construct some examples
that are similar to Kripke’s and Evans’s. Let’s simply stipulate that ‘one meter’
and ‘Julius’ are to abbreviate “actualized descriptions”: ‘the actual length of bar
B’ and ‘the actual inventor of the zip’. With a little creative reconstruing in the
first case, the sentences then have the form: “the actual G is G”:
the actual length of bar B is a length of bar B
the actual inventor of the zip invented the zip
Now, these sentences are not quite a priori, since for all one knows, the G might
not exist—there might exist no unique length of bar B, no unique inventor of
the zip. So suppose we consider instead the following sentences:
If there is exactly one length of bar B, then the actual length of bar
B is a length of bar B
If there is exactly one inventor of the zip, then the actual inventor
of the zip invented the zip
Each has the form:
If there is exactly one G, then the actual G is G
∃x(Gx∧∀y(Gy→y=x)) →∃x(@Gx∧∀y(@Gy→y=x)∧Gx)
Any sentence of this form is 2D-valid (though not generally 2D-valid), and is
superficially contingent. So we have further examples of the contingent a priori
in the neighborhood of the examples of Kripke and Evans.
Various philosophers want to concede that these sentences are contingent
in one sense, namely, in the sense of superficial contingency. But, they claim,
this is a relatively unimportant sense (hence the term ‘superficial contingency’,
which was coined by Evans). In another sense, they’re not contingent at all.
Evans calls the second sense of contingency “deep contingency”, and defines it
thus (1979, p. 185):
If a deeply contingent statement is true, there will exist some state of
affairs of which we can say both that had it not existed the statement
would not have been true, and that it might not have existed.
The intended meaning of ‘the statement would not have been true’ is that the
statement, as uttered with its actual meaning, would not have been true. The
idea is supposed to be that ‘Julius invented the zip’ is not deeply contingent,
because we can’t locate the required state of affairs, since in any situation in
which ‘Julius invented the zip’ is uttered with its actual meaning, it is uttered
truly. So the Julius example is not one of a deeply contingent a priori truth.
Evans’s notion of deep contingency is far from clear. One of the nice things
about the two-dimensional modal framework is that it allows us to give a
clear definition of deep contingency. Davies and Humberstone (1980) give a
definition of deep contingency which is parallel to the definition of superficial
contingency, but with F@ in place of □:
φ is deeply contingent in M at w iff (for all g) V_{M,g}(F@φ, w, w) = 0
and V_{M,g}(F@∼φ, w, w) = 0.
Under this definition, the examples we have given are not deeply contingent.
To be sure, this definition is only as clear as the two-dimensional notions of
fixedness and actuality. The formal structure of the two-dimensional framework
is of course clear, but one can raise philosophical questions about how that
formalism is to be interpreted. But at least the formalism provides a clear
framework for the philosophical debate to occur.
Our discussion of the necessary a posteriori will be parallel to that of the
contingent a priori. Just as we defined superficial contingency as the falsity of
the □, so we can define superficial necessity as the truth of the □:
φ is superficially necessary in M at w iff (for all g) V_{M,g}(□φ, w, w) = 1
How shall we construe a posteriority? Let’s follow our earlier strategy, and take
the failure to be 2D-valid as our guide.
But here we must take a bit more care. It’s quite a trivial matter to construct
models in which 2D-invalid sentences are necessarily true; and we don’t need
the two-dimensional framework to do it. We clearly don’t want to say that
‘Everything is a lawyer’ is an example of the necessary a posteriori. But let F
stand for ‘is a lawyer’; we can construct a model in which the predicate F is
true of every member of the domain at every world. In such a model ∀xFx is
true at every world, and so is superficially necessary at every world, despite
the fact that it is not 2D-valid.
But this is too cheap. We began by letting the predicate F stand for a predicate
of English, but then constructed our model without attending to the modal
fact that it’s simply not the case that it’s necessarily true that everything is a
lawyer. If F is indeed to stand for ‘is a lawyer’, we would need to include in any
realistic model (any model faithful to the modal facts) worlds in which not
everything is in the extension of F.
To provide nontrivial models of the necessary a posteriori, when we have
chosen to think of the nonlogical expressions of the language of QML as
standing for certain expressions of English, our strategy will be to provide realistic
models (models that are faithful to the real modal facts in relevant respects,
given the choice of what the nonlogical expressions stand for) in which 2D-invalid
sentences are necessarily true. Now, since the notion of a “realistic model”
has not been made precise, the argument here will be imprecise; but in the
circumstances this imprecision is inevitable.
So: consider now, as a schematic example of an a posteriori and superficially
necessary sentence:
If the actual F and the actual G exist, then they are identical
[∃x(@Fx∧∀y(@Fy→x=y)) ∧ ∃z(@Gz∧∀y(@Gy→z=y))] →
∃x[@Fx∧∀y(@Fy→x=y) ∧ ∃z(@Gz∧∀y(@Gy→z=y) ∧ z=x)]
This sentence isn’t 2D-valid. Nevertheless, it is superficially necessary in any
model and any world w in which F and G each have a single object in their
extension, the same object in both cases, no matter what the extensions of F
and G are in other worlds in the model. So whenever such a model is realistic
(given what we let F and G stand for), we will have our desired example.
We can fill in this schema and construct an example similar to Kripke’s
Hesperus and Phosphorus example. Set aside controversies about the semantics
of proper names in natural language; let’s just stipulate that ‘Hesperus’ is to
be short for ‘the actual F’, and that ‘Phosphorus’ is to be short for ‘the actual
G’. And let’s think of F as standing for ‘the first heavenly body visible in the
evening’, and G for ‘the last heavenly body visible in the morning’. Then
(HP) If Hesperus and Phosphorus exist then they are identical
has the form ‘If the actual F and the actual G exist then they are identical’,
which was discussed in the previous paragraph. We may then construct a
realistic model in which F and G each have the same single object in their
extension in some world, w, but in which they have different objects in their
extensions in other worlds. In such a model, the sentence
(□HP) □(If Hesperus and Phosphorus exist then they are identical)
is true at 〈w, w〉, and so we again have our desired example: (HP) is superficially
necessary, despite the fact that it is a posteriori (2D-invalid).
Isn’t it strange that (HP) is both a posteriori and necessary? The
two-dimensional response is: no, it’s not, since although it is superficially necessary,
it isn’t deeply necessary in the following sense:
φ is deeply necessary in M at w iff (for all g) V_{M,g}(F@φ, w, w) = 1
It isn’t deeply necessary because in any realistic model (given what F and G
currently stand for), there must be worlds and objects other than c and u, that
are configured as they are in the model below:
W = {c, d}
D = {u, v}
I(F) = {〈u, c〉, 〈u, d〉}
I(G) = {〈u, c〉, 〈v, d〉}
In this model, even though (□HP) is true at 〈c, c〉, still, F@(HP), i.e.:
F@[[∃x(@Fx∧∀y(@Fy→x=y)) ∧ ∃z(@Gz∧∀y(@Gy→z=y))] →
∃x[@Fx∧∀y(@Fy→x=y) ∧ ∃z(@Gz∧∀y(@Gy→z=y) ∧ z=x)]]
is false at 〈c, c〉 (and indeed, at every pair of worlds), since (HP) is false at 〈d, d〉.
And so, (HP) is not deeply necessary in this model.
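The contrast between superficial and deep necessity in this model can be verified by direct evaluation. The sketch below is an illustration under assumed conventions: formulas are nested tuples, "act" stands for @, "fix" for F, "eq" for identity between variables, and `sole_actual` is a hypothetical helper that builds the uniqueness clause. The second domain object is written "v_obj" in the code only to avoid confusion with the reference-world parameter v.

```python
def val(model, g, phi, v, w):
    """True iff phi is true at evaluation world w with reference world v."""
    W, D, I = model
    op, *args = phi
    if op == "atom":                       # ("atom", "F", "x")
        pred, var = args
        return (g[var], w) in I[pred]
    if op == "eq":                         # identity between two variables
        return g[args[0]] == g[args[1]]
    if op == "and":
        return val(model, g, args[0], v, w) and val(model, g, args[1], v, w)
    if op == "imp":
        return (not val(model, g, args[0], v, w)) or val(model, g, args[1], v, w)
    if op == "all":
        return all(val(model, {**g, args[0]: u}, args[1], v, w) for u in D)
    if op == "some":
        return any(val(model, {**g, args[0]: u}, args[1], v, w) for u in D)
    if op == "box":
        return all(val(model, g, args[0], v, w2) for w2 in W)
    if op == "act":
        return val(model, g, args[0], v, v)
    if op == "fix":
        return all(val(model, g, args[0], v2, w) for v2 in W)
    raise ValueError(f"unknown operator {op!r}")

def sole_actual(pred, var):
    """Build: @pred(var) and for all y (@pred(y) -> var = y)."""
    return ("and", ("act", ("atom", pred, var)),
            ("all", "y", ("imp", ("act", ("atom", pred, "y")), ("eq", var, "y"))))

HP = ("imp",
      ("and", ("some", "x", sole_actual("F", "x")),
              ("some", "z", sole_actual("G", "z"))),
      ("some", "x", ("and", sole_actual("F", "x"),
                     ("some", "z", ("and", sole_actual("G", "z"),
                                    ("eq", "z", "x"))))))

# The model from the text: F and G coincide at c and diverge at d.
model = ({"c", "d"}, {"u", "v_obj"},
         {"F": {("u", "c"), ("u", "d")}, "G": {("u", "c"), ("v_obj", "d")}})

box_hp_at_cc = val(model, {}, ("box", HP), "c", "c")          # superficial necessity at c
hp_at_dd = val(model, {}, HP, "d", "d")
fixedly_actually_hp = val(model, {}, ("fix", ("act", HP)), "c", "c")  # deep necessity at c
print(box_hp_at_cc, hp_at_dd, fixedly_actually_hp)
```

The first value tests superficial necessity of (HP) at c, the second evaluates (HP) on the diagonal pair 〈d, d〉, and the third tests deep necessity at c.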
One might try to take this two-dimensional line further, and claim that in
every case of the necessary a posteriori (or the contingent a priori), the necessity
(contingency) is merely superficial. But defending this stronger line would
require more than we have in place so far. To take one example, return again
to ‘Hesperus = Phosphorus’, but now, instead of thinking of ‘Hesperus’ and
‘Phosphorus’ as abbreviations for actualized descriptions, let us represent them
by names in the logical sense (i.e., the expressions called “names” in the
definition of well-formed formulas, which are assigned denotations by interpretation
functions in models). Thus, ‘Hesperus=Phosphorus’ is now represented as:
a=b. Consider the following model:
W = {c, d}
D = {u, v}
I(a) = u
I(b) = u
The model is apparently realistic; it falsifies no relevant modal facts. But the
sentence a=b is deeply necessary (at any world in the model). And yet it is a
posteriori (2D-invalid).
Bibliography
Benacerraf, Paul and Hilary Putnam (eds.) (1983). Philosophy of Mathematics.
2nd edition. Cambridge: Cambridge University Press.
Boolos, George (1975). “On Second-Order Logic.” Journal of Philosophy 72:
509–527. Reprinted in Boolos 1998: 37–53.
— (1984). “To Be Is to Be the Value of a Variable (or to Be Some Values of
Some Variables).” Journal of Philosophy 81: 430–49. Reprinted in Boolos 1998:
54–72.
— (1985). “Nominalist Platonism.” Philosophical Review 94: 327–44. Reprinted
in Boolos 1998: 73–87.
— (1998). Logic, Logic, and Logic. Cambridge, MA: Harvard University Press.
Boolos, George and Richard Jeffrey (1989). Computability and Logic. 3rd edition.
Cambridge: Cambridge University Press.
Chalmers, David (1996). The Conscious Mind. Oxford: Oxford University Press.
Chalmers, David (2006). “Two-Dimensional Semantics.” In Ernest Lepore and Barry C.
Smith (eds.), Oxford Handbook of Philosophy of Language, 574–606. New York:
Oxford University Press.
Cresswell, M. J. (1990). Entities and Indices. Dordrecht: Kluwer.
Cresswell, M.J. and G.E. Hughes (1996). A New Introduction to Modal Logic.
London: Routledge.
Davies, Martin and Lloyd Humberstone (1980). “Two Notions of Necessity.”
Philosophical Studies 38: 1–30.
Dummett, Michael (1973). “The Philosophical Basis of Intuitionist Logic.” In
H. E. Rose and J. C. Shepherdson (eds.), Proceedings of the Logic Colloquium,
Bristol, July 1973, 5–49. Amsterdam: North-Holland. Reprinted in Benacerraf
and Putnam 1983: 97–129.
Enderton, Herbert (1977). Elements of Set Theory. New York: Academic Press.
Evans, Gareth (1979). “Reference and Contingency.” The Monist 62: 161–189.
Reprinted in Evans 1985.
— (1985). Collected Papers. Oxford: Clarendon Press.
Fine, Kit (1985). “Plantinga on the Reduction of Possibilist Discourse.” In
J. Tomberlin and Peter van Inwagen (eds.), Alvin Plantinga, 145–186.
Dordrecht: D. Reidel.
Gamut, L. T. F. (1991a). Logic, Language, and Meaning, Volume 1: Introduction
to Logic. Chicago: University of Chicago Press.
— (1991b). Logic, Language, and Meaning, Volume 2: Intensional Logic and Logical
Grammar. Chicago: University of Chicago Press.
Gibbard, Allan (1975). “Contingent Identity.” Journal of Philosophical Logic 4:
187–221. Reprinted in Rea 1997: 93–125.
Glanzberg, Michael (2006). “Quantifiers.” In Ernest Lepore and Barry C.
Smith (eds.), The Oxford Handbook of Philosophy of Language, 794–821. Oxford
University Press.
Harper, William L., Robert Stalnaker and Glenn Pearce (eds.) (1981). Ifs:
Conditionals, Belief, Decision, Chance, and Time. Dordrecht: D. Reidel Publishing
Company.
Hirsch, Eli (1986). “Metaphysical Necessity and Conceptual Truth.” In
Peter French, Theodore E. Uehling, Jr. and Howard K. Wettstein (eds.),
Midwest Studies in Philosophy XI: Studies in Essentialism, 243–256. Minneapolis:
University of Minnesota Press.
Hodes, Harold (1984a). “On Modal Logics Which Enrich First-order S5.”
Journal of Philosophical Logic 13: 423–454.
Hodes, Harold (1984b). “Some Theorems on the Expressive Limitations of Modal
Languages.” Journal of Philosophical Logic 13: 13–26.
Jackson, Frank (1998). From Metaphysics to Ethics: A Defence of Conceptual Analysis.
Oxford: Oxford University Press.
Kripke, Saul (1972). “Naming and Necessity.” In Donald Davidson and Gilbert
Harman (eds.), Semantics of Natural Language, 253–355, 763–769. Dordrecht:
Reidel. Revised edition published in 1980 as Naming and Necessity (Cambridge,
MA: Harvard University Press).
Lemmon, E. J. (1965). Beginning Logic. London: Chapman & Hall.
Lewis, C. I. (1918). A Survey of Symbolic Logic. Berkeley: University of California
Press.
Lewis, C. I. and C. H. Langford (1932). Symbolic Logic. New York: Century
Company.
Lewis, David (1973). Counterfactuals. Oxford: Blackwell.
Lewis, David (1977). “Possible-World Semantics for Counterfactual Logics: A Rejoinder.”
Journal of Philosophical Logic 6: 359–363.
— (1979). “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8:
339–59. Reprinted in Lewis 1983: 233–249.
— (1983). Philosophical Papers, Volume 1. Oxford: Oxford University Press.
Linsky, Bernard and Edward N. Zalta (1994). “In Defense of the Simplest
Quantified Modal Logic.” In James Tomberlin (ed.), Philosophical Perspectives
8: Logic and Language, 431–458. Atascadero: Ridgeview.
— (1996). “In Defense of the Contingently Nonconcrete.” Philosophical Studies
84: 283–294.
Loewer, Barry (1976). “Counterfactuals with Disjunctive Antecedents.” Journal
of Philosophy 73: 531–537.
Lycan, William (1979). “The Trouble with Possible Worlds.” In Michael J.
Loux (ed.), The Possible and the Actual, 274–316. Ithaca: Cornell University
Press.
Mendelson, Elliott (1987). Introduction to Mathematical Logic. Belmont,
California: Wadsworth & Brooks.
Plantinga, Alvin (1983). “On Existentialism.” Philosophical Studies 44: 1–20.
Priest, Graham (2001). An Introduction to Non-Classical Logic. Cambridge:
Cambridge University Press.
Prior, A. N. (1967). Past, Present, and Future. Oxford: Oxford University Press.
— (1968). Papers on Time and Tense. London: Oxford University Press.
Quine, W. V. O. (1948). “On What There Is.” Review of Metaphysics 2: 21–38.
Reprinted in Quine 1953a: 1–19.
— (1953a). From a Logical Point of View. Cambridge, Mass.: Harvard University
Press.
— (1953b). “Mr. Strawson on Logical Theory.” Mind 62: 433–451.
Rea, Michael (ed.) (1997). Material Constitution. Lanham, Maryland: Rowman
& Littlefield.
Russell, Bertrand (1905). “On Denoting.” Mind 14: 479–93. Reprinted in Russell
1956: 41–56.
— (1956). Logic and Knowledge. Ed. Robert Charles Marsh. New York: G.P.
Putnam’s Sons.
Sher, Gila (1991). The Bounds of Logic: A Generalized Viewpoint. Cambridge,
Mass.: MIT Press.
Sider, Theodore (2003). “Reductive Theories of Modality.” In Michael J. Loux
and Dean W. Zimmerman (eds.), Oxford Handbook of Metaphysics, 180–208.
Oxford: Oxford University Press.
Soames, Scott (2004). Reference and Description: The Case against
Two-Dimensionalism. Princeton: Princeton University Press.
Stalnaker, Robert (1968). “A Theory of Conditionals.” In Studies in Logical
Theory: American Philosophical Quarterly Monograph Series, No. 2. Oxford:
Blackwell. Reprinted in Harper et al. 1981: 41–56.
— (1978). “Assertion.” In Peter Cole and Jerry Morgan (eds.), Syntax and
Semantics, Volume 9: Pragmatics, 315–332. New York: Academic Press. Reprinted
in Stalnaker 1999: 78–95.
— (1981). “A Defense of Conditional Excluded Middle.” In Harper et al.
(1981), 87–104.
— (1999). Context and Content: Essays on Intentionality in Speech and Thought.
Oxford: Oxford University Press.
— (2003a). “Conceptual Truth and Metaphysical Necessity.” In Stalnaker
(2003b), 201–215.
— (2003b). Ways a World Might Be. Oxford: Oxford University Press.
— (2004). “Assertion Revisited: On the Interpretation of Two-Dimensional
Modal Semantics.” Philosophical Studies 118: 299–322. Reprinted in Stalnaker
2003b: 293–309.
von Fintel, Kai (2001). “Counterfactuals in a Dynamic Context.” In Ken Hale:
A Life in Language, 123–152. Cambridge, MA: MIT Press.
Westerståhl, Dag (1989). “Quantifiers in Formal and Natural Languages.” In
D. Gabbay and F. Guenthner (eds.), Handbook of Philosophical Logic, volume 4,
1–131. Dordrecht: Kluwer.
Williamson, Timothy (1998). “Bare Possibilia.” Erkenntnis 48: 257–273.
— (2002). “Necessary Existents.” In A. O’Hear (ed.), Logic, Thought and
Language, 233–51. Cambridge: Cambridge University Press.