Lecture Notes by Prof Zhang

Modal Logic ZHANG Jiji
Lecture 1
1.1 Overview
Many claims and arguments, especially in philosophy, involve modal (模態) propositions.
In a narrow sense, modal propositions are propositions that involve necessity (必然性)
and possibility (可能性). For example, the proposition expressed by the sentence “it is
possibly the case that the tortoise runs as fast as Achilles does” and that expressed by the
sentence “it is necessarily the case that the tortoise runs as fast as Achilles does” are
modal propositions. They are different from saying that (it is the case that) the tortoise
runs as fast as Achilles does. Roughly speaking, modal logic studies reasoning with
modal propositions.
In addition to distinguishing among what is the case, what is possibly the case (or what
may be the case), and what is necessarily the case (or what must be the case), it also helps
to keep in mind that there are different kinds of necessity and possibility we can talk
about. For example, logical necessity/possibility seem to be different from physical
necessity/possibility. It is logically possible for Achilles to run faster than the speed of
light, but presumably not physically possible.
There may be different valid rules of reasoning for different modalities. For example,
suppose it is possible that Superman exists. Does it follow that it is necessarily possible
that Superman exists? As we shall see later, the reasoning from “possible” to “necessarily
possible” is widely regarded as valid if the modalities are understood as logical
possibility and necessity, but presumably not valid if the modalities are understood as
physical possibility and necessity. This is one reason why we will see a number of logical
systems in modal logic. (Another important reason is that even for a given kind of
modality, there are disagreements on what are valid rules of reasoning.)
Moreover, there are other concepts that are very analogous or closely related to the
alethic (of truth, 真性) modal concepts, and are often treated as modal in a broad sense
(廣義模態). For example, the concepts of moral obligation/permission (or moral
necessity/possibility), the concept of knowing (or epistemological necessity), the
concepts of always/sometimes (or temporal necessity/possibility), have been studied by
logicians in the general framework of modal logic. The resulting philosophical logics
such as deontic logic (道義邏輯), epistemic logic (認識論邏輯), and temporal logic
(a.k.a. tense logic, 時間邏輯或時態邏輯) are viewed as applications of modal logic, or
as modal logics in a broad sense (廣義模態邏輯).
Our plan is to spend the first half of the course on modal propositional logic (模態命題
邏輯), with a focus on the possible-worlds semantics and a number of axiomatic systems.
Once we have a solid grasp of modal propositional logic, we will briefly study some
philosophical logics, including deontic logic, epistemic logic, and temporal logic.
Towards the end of the course we will study modal predicate logic (模態謂詞邏輯)
which is both technically more complicated and philosophically more controversial than
modal propositional logic. To get a sense of the matter, consider the following question.
Supposing it is possible that there is a superman, does it follow that there is somebody
who is possibly a superman? And how about the converse: supposing someone is
possibly a superman, does it follow that it is possible that superman exists?
There are philosophers, most notably W.V.O. Quine, who take the very construction
“someone is possibly a superman (or there is somebody who is possibly a superman)” as
incoherent or meaningless, because such a construction appears to ascribe a modal
property to a thing (the jargon is de re, 從物) rather than to a statement or proposition
(the jargon is de dicto, 從言). What is wrong with de re modality? One of Quine’s main
objections is that de re modality is incoherent when we consider different descriptions of
the thing. For example, the number eight has the property of being greater than 7. Is the
property a necessary property of the number eight? The answer, argues Quine, is
equivocal, depending on how we describe the number eight. For example, the number
eight can be described by the numeral “8” or by the expression “the number of planets in
the solar system”. Under one description, its property of being greater than 7 seems to be
a necessary one, because it is true that necessarily 8 is greater than 7. Under the other
description, however, the property does not seem to be a necessary one, because it is not
true that necessarily the number of planets in the solar system is greater than 7. Thus
Quine regards it as meaningless to say the number eight (the object, not any description of it)
has (or has not) a necessary property of being greater than 7.
If you find these issues difficult or confusing, don’t worry, you are not alone. We will
come back to them in due time.
1.2 Basic set-theoretical notions
Let me review some basic mathematical notions we will have to use in the course. We
will talk about sets (集合). For ordinary purposes, a set is just understood as a collection
of objects, concrete or abstract. So I can talk about the set of students in this class, the set
of books in my office, the set of well-formed formulas in a formal language, and the set
of natural numbers, etc. There are two common ways to specify a set, by enumerating all
members of the set or by describing a property shared by all and only the members of the
set. For example, I can specify a set F as
F = {Paisley Livingston, Neven Sesardic}
Alternatively, I can specify the set as
F = {x | x was a full professor of philosophy at Lingnan in 2012}
Either way, F is specified to be the set consisting of the two professors in the department
of philosophy. Note that we usually use a pair of curly brackets to indicate a set.
A basic relation in set theory is the membership relation. Paisley Livingston is a member
of the set F specified above, whereas Jiji Zhang is not a member of the set. The standard
notations are ∈ and ∉. We can write “Paisley Livingston ∈ F” and “Jiji Zhang ∉ F”.
There is a set containing no members, which we call the empty set (空集). It is denoted
by either ∅ or {}.
Let A and B be any sets. We say A is a subset (子集) of B and B is a superset (母集) of A,
if and only if every member of A is also a member of B (or in other words, for every x, if
x∈A, then x∈B). The standard notation is to write A⊆B. Obviously, for every set A,
A⊆A and ∅⊆A. In addition, A is called a proper subset (真子集) of B, written as A⊂B,
if and only if A is a subset of B and B is not a subset of A. For example, the set of female
students at Lingnan is a proper subset of the set of students at Lingnan; the set of natural
numbers is a proper subset of the set of integers.
A=B if and only if A⊆B and B⊆A, which means A and B have exactly the same members.
We write A∪B to denote the union (並集) of A and B, which is the set of objects that
belong to A or belong to B. That is,
A∪B = {x | x∈A or x∈B}
For example, if A is the set of female students at Lingnan and B is the set of male
students at Lingnan, then A∪B is the set of students at Lingnan. Obviously, for any A
and B, A⊆A∪B and B⊆A∪B.
We write A∩B to denote the intersection (交集) of A and B, which is the set of objects
that belong to both A and B. That is,
A∩B = {x | x∈A and x∈B}
For example, if A is the set of students at Lingnan and B is the set of female residents in
Hong Kong, then A∩B is the set of female students at Lingnan. Obviously, for any A and
B, A∩B⊆A and A∩B⊆B.
The Cartesian product (卡式積) of A and B, written as A×B, is the following set:
A×B = {(x,y) | x∈A and y∈B}
In other words, A×B is the set of ordered pairs (有序對) such that the first element in the
pair is a member of A and the second element in the pair is a member of B. For example,
if A={1, 2} and B = {1, 2, 3}, then A×B = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)}.
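The set operations above can be tried out with Python's built-in set type. This is only an illustrative sketch: the variable names are my own, and A and B are the two sets from the Cartesian product example.

```python
from itertools import product

A = {1, 2}
B = {1, 2, 3}

union = A | B                    # A ∪ B
intersection = A & B             # A ∩ B
cartesian = set(product(A, B))   # A × B, a set of ordered pairs (tuples)

print(A <= B)                    # subset test A ⊆ B
print(A < B)                     # proper subset test A ⊂ B
print(sorted(union))
print(sorted(intersection))
print(sorted(cartesian))
```

Python tuples play the role of ordered pairs here: (1, 2) and (2, 1) are different members of A×B.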
A function (函數) is a mapping from a set A to a set B, such that every member of A is
mapped to one and only one member in B. For example, in propositional logic (which we
will review in more detail next week), a truth assignment is just a function from the set of
propositional variables to the set of truth values: {FALSE, TRUE} (or {0, 1}, if we use 0
to represent FALSE and 1 to represent TRUE), which means that every propositional
variable is mapped to one and only one truth value. According to the truth-functional
interpretations of logical connectives, a truth assignment can then be extended to a
valuation function from the set of well-formed formulas in the language of propositional
logic to the set of truth values, which means that every well-formed formula is mapped to
one and only one truth value.
In the possible-worlds semantics of modal logic, we will use a set of possible worlds to
interpret modal statements. Propositional variables (and well-formed formulas in general)
are not true or false simpliciter, but true or false relative to a possible world. A truth
assignment will then be a function from the Cartesian product of the set of propositional
variables and the set of possible worlds to the set of truth values, so that each pair (P, w),
where P is a propositional variable and w is a possible world, is mapped to one and
only one truth value (which means that P has that truth value at w).
In addition to a set of possible worlds, an important element in the possible-worlds
semantics is a binary relation (二元關係) over the set of possible worlds. In the abstract,
(the extension of) a binary relation R over a set A is a subset of the Cartesian product of
A and A, that is, R⊆A×A. In other words, a binary relation is represented by the set of all
and only those pairs that stand in the relation.
R is said to be a serial (持續的) relation over A if and only if, for every x∈A, there is a
y∈A such that (x, y)∈R. For example, the relation of “less than” is a serial relation over
the set of natural numbers (because every natural number is less than some natural
number), but the relation of “greater than” is not (because the least natural number 0
is not greater than any natural number).
R is said to be a reflexive (自返的) relation over A if and only if, for every x∈A, (x,
x)∈R. For example, the relation of “less than or equal to” is a reflexive relation over the
set of natural numbers, but the relation of “less than” is not.
R is said to be a symmetric (對稱的) relation over A if and only if, for every x, y∈A, if (x,
y)∈R, then (y, x)∈R. For example, the relation of “equal to” is a symmetric relation over
the set of natural numbers, but the relation of “less than or equal to” is not.
R is said to be a transitive (傳遞的) relation over A if and only if, for every x, y, z∈A, if
(x, y)∈R and (y, z)∈R, then (x, z)∈R. For example, the relations of “less than”, “greater
than”, “equal to”, and “less than or equal to” are all transitive over the set of natural
numbers, but the relation of “being the successor of” is not.
R is said to be a Euclidean (歐式的) relation over A if and only if, for every x, y, z∈A, if
(x, y)∈R and (x, z)∈R, then (y, z)∈R. For example, among all the relations mentioned
above, only the relation of “equal to” is Euclidean.
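For a finite set, each of the five properties can be checked mechanically. The sketch below is one possible way to do so in Python; the function names are illustrative, and a relation is represented as a set of pairs. Note that the examples use the finite set {0, 1, 2} rather than the natural numbers, so “less than” comes out as not serial here (2 is not less than anything in the set), unlike over the natural numbers.

```python
def is_serial(R, A):
    # every x in A is related to at least one y in A
    return all(any((x, y) in R for y in A) for x in A)

def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # whenever (x, y) and (y, z) are in R, so is (x, z)
    return all((x, z) in R for (x, y) in R for (u, z) in R if y == u)

def is_euclidean(R):
    # whenever (x, y) and (x, z) are in R, so is (y, z)
    return all((y, z) in R for (x, y) in R for (u, z) in R if x == u)

A = {0, 1, 2}
leq = {(x, y) for x in A for y in A if x <= y}   # "less than or equal to"
eq = {(x, x) for x in A}                         # "equal to"
lt = {(x, y) for x in A for y in A if x < y}     # "less than"

for name, R in [("leq", leq), ("eq", eq), ("lt", lt)]:
    print(name, is_serial(R, A), is_reflexive(R, A),
          is_symmetric(R), is_transitive(R), is_euclidean(R))
```

The same functions can be used to check candidate answers to problems about relations over a small set such as W = {w1, w2, w3, w4}.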
Exercise 1
1. Consider the arguments mentioned in the lecture notes:
(a) It is possible that superman exists. So, it is necessarily possible that superman exists.
(b) It is possible that there is a superman. So, there is somebody who is possibly a
superman.
(c) Someone is possibly a superman. So, it is possible that superman exists.
Do you understand the statements in these arguments? If no, try to articulate why you
don’t understand them. If yes, do you think the arguments are valid?
2. Suppose A = {1, 2, a, b} and B = {b, c, 2}. Determine A∪B, A∩B, and A×B.
3. Let A, B, C be any sets. Convince yourself that the following are true.
(a) If A⊆B, then A∪B = B and A∩B = A.
(b) A∪B = B∪A; A∩B = B∩A
(c) A∪(B∪C) = (A∪B)∪C; A∩(B∩C) = (A∩B)∩C
(d) A∪(B∩C) = (A∪B)∩(A∪C); A∩(B∪C) = (A∩B)∪(A∩C)
4. Consider the set of students taking this course, and consider the following binary
relations over the set:
R1 = {(x, y) | x and y were born in the same month}
R2 = {(x, y) | x and y were born in different years}
Must R1 be Serial? Reflexive? Symmetric? Transitive? Euclidean? How about R2?
5. Consider the set W = {w1, w2, w3, w4}. For each of the following conditions, is there a
relation over W that satisfies the condition? If yes, define such a relation over W.
(a) is serial but not reflexive;
(b) is reflexive but not serial;
(c) is symmetric but not transitive;
(d) is transitive but not Euclidean;
(e) is symmetric and transitive but not Euclidean;
(f) is reflexive, symmetric, and transitive.
Lecture 2
2.1 Review of classical propositional logic: formal language and semantics
Following the textbook, I will use the following language of (classical) propositional
calculus (PC, 古典命題演算), with slightly different notations than in the textbook.
Language of PC
Primitive symbols:
Propositional variables: p, q, r, …,
Connectives/operators: ~, ∨.
Auxiliary symbols: (, ).
The set of well-formed formulas (wffs) is defined by the following rules:
1. A propositional variable is a wff.
2. If α is a wff, so is ~α.
3. If α and β are wffs, so is (α∨β).
4. No other strings of primitive symbols are wffs.
For convenience, we define three more connectives/operators.
[Def ∧] (α∧β) =df ~(~α ∨ ~β)
[Def →] (α→β) =df (~α ∨ β)
[Def ↔] (α↔β) =df ((α→β) ∧ (β→α))
Accordingly we also count (α∧β), (α→β), and (α↔β) as wffs, if α and β are. These wffs
are just shorthand for more complicated wffs in terms of ~ and ∨ alone. Note that the
textbook uses ‘⊃’ for ‘→’, and ‘≡’ for ‘↔’, but I will follow my long-time habit of using
‘→’ and ‘↔’. The usual conventions for omitting parentheses apply: the leftmost and
rightmost parentheses will be omitted, and the negation sign (~) has the highest priority.
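To see that the defined connectives really are shorthand, one can unfold them mechanically. The following Python sketch does this under an assumed representation of my own choosing: a wff is a nested tuple such as ('var', 'p'), ('not', α), ('or', α, β), and the defined connectives are tagged 'and', 'imp', 'iff'. The names expand and show are illustrative only.

```python
def expand(f):
    """Rewrite a wff so that only ~ and ∨ remain."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        return ('not', expand(f[1]))
    if op == 'or':
        return ('or', expand(f[1]), expand(f[2]))
    a, b = f[1], f[2]
    if op == 'and':   # [Def ∧]: ~(~α ∨ ~β)
        return ('not', ('or', ('not', expand(a)), ('not', expand(b))))
    if op == 'imp':   # [Def →]: (~α ∨ β)
        return ('or', ('not', expand(a)), expand(b))
    if op == 'iff':   # [Def ↔]: ((α→β) ∧ (β→α))
        return expand(('and', ('imp', a, b), ('imp', b, a)))
    raise ValueError(op)

def show(f):
    """Print an expanded wff (only variables, ~ and ∨)."""
    if f[0] == 'var':
        return f[1]
    if f[0] == 'not':
        return '~' + show(f[1])
    return '(' + show(f[1]) + ' ∨ ' + show(f[2]) + ')'

p, q = ('var', 'p'), ('var', 'q')
print(show(expand(('and', p, q))))   # ~(~p ∨ ~q)
print(show(expand(('imp', p, q))))   # (~p ∨ q)
print(show(expand(('iff', p, q))))   # ~(~(~p ∨ q) ∨ ~(~q ∨ p))
```

The last line shows how quickly the primitive notation becomes hard to read, which is the practical reason for introducing the abbreviations.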
The formal semantics for PC is simple. Propositional variables are interpreted as taking
truth values, TRUE (T) or FALSE (F). Connectives are interpreted as truth functions.
The truth functions can be easily expressed in truth tables:
α  β  ~α  α∨β  α∧β  α→β  α↔β
T  T  F   T    T    T    T
T  F  F   T    F    F    F
F  T  T   T    F    T    F
F  F  T   F    F    T    T
Recall that a truth assignment is a function from the set of propositional variables to the
set of truth values. Given a truth assignment, every wff gets a unique truth value
according to the interpretations of connectives. A wff α is said to be tautological or valid
if its value is TRUE under every truth assignment. We write |= α to mean α is valid. A
wff β is called a logical consequence of a (finite) set of wffs Γ={α1, …, αn} if and only if
|= (α1∧…∧αn) → β.
The textbook (pp. 8-10) describes a PC-game that can be used to define the notion of
validity in PC. It is really the same definition as the definition in terms of truth
assignment. The textbook (pp. 10-12) also reviews the truth table method and the
Reductio method for testing validity. I will not repeat them here but you should make
yourself very familiar with them (again!).
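The definition of validity in terms of truth assignments can be turned directly into a brute-force test. The Python sketch below is one way to do it, assuming wffs are represented as nested tuples (('var', 'p'), ('not', α), ('or', α, β), and so on); the representation and the function names are my own assumptions, not the textbook's.

```python
from itertools import product
from functools import reduce

def variables(f):
    """Set of propositional variable names occurring in f."""
    if f[0] == 'var':
        return {f[1]}
    return set().union(*(variables(g) for g in f[1:]))

def value(f, v):
    """Truth value of f under the truth assignment v (a dict name -> bool)."""
    op = f[0]
    if op == 'var':
        return v[f[1]]
    if op == 'not':
        return not value(f[1], v)
    a, b = value(f[1], v), value(f[2], v)
    if op == 'or':
        return a or b
    if op == 'and':
        return a and b
    if op == 'imp':
        return (not a) or b
    if op == 'iff':
        return a == b
    raise ValueError(op)

def is_valid(f):
    """True iff f is TRUE under every truth assignment to its variables."""
    names = sorted(variables(f))
    return all(value(f, dict(zip(names, row)))
               for row in product([True, False], repeat=len(names)))

def is_consequence(premises, conclusion):
    """True iff (α1 ∧ … ∧ αn) → β is valid; premises must be non-empty."""
    conj = reduce(lambda a, b: ('and', a, b), premises)
    return is_valid(('imp', conj, conclusion))

p, q = ('var', 'p'), ('var', 'q')
print(is_valid(('imp', p, ('imp', q, p))))       # True
print(is_valid(('imp', p, q)))                   # False
print(is_consequence([p, ('imp', p, q)], q))     # True
```

This is the truth table method in code: with n variables it inspects all 2^n rows, so it is only practical for small formulas.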
2.2 Formal axiomatic systems
Later on we will study a number of formal axiomatic systems of modal logic. By way of
preparation, let me introduce the basic ideas here, in connection with PC.
Given a formal language such as the language of PC, an axiomatic system (公理系統) S
consists of a set of wffs, called axioms (公理), and a set of transformation rules (變形規
則), prescribing (roughly speaking) what wffs can be derived from what wffs. A proof
(證明) in S is a sequence of wffs such that every wff in the sequence is either (i) an
axiom of S or (ii) derived from some earlier wffs in the sequence, by applying a
transformation rule of S or applying a definition in the language. For any wff α, if there is
a proof ending with α in S, then α is called a theorem (定理) of S, written as |–S α. When
it is clear which system we are talking about, we can simply write |– α. All axioms of a
system are, of course, also theorems of the system (they can be proved in one line).
The axiomatic systems we will study have a transformation rule called uniform
substitution. In order to state that rule, it is convenient to have the following notation. Let
p1, …, pn be distinct propositional variables that occur in a wff α, and β1, …, βn be any
(not necessarily distinct) wffs. We write α[β1/p1, …, βn/pn] to denote the wff resulting
from α by simultaneously replacing all occurrences of pi by βi (i=1,…,n). For example, if
α is the wff p→(q→p), then α[r/p] is the wff r→(q→r); α[(r→r)/q] is the wff
p→((r→r)→p); α[r/p, (r→r)/q] is the wff r→((r→r)→r).
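The substitution notation can be mirrored by a short recursive function. In the Python sketch below, wffs are again assumed to be nested tuples (('var', 'p'), ('not', α), ('imp', α, β)), and a substitution is a dict from variable names to wffs; substitute and show are hypothetical helper names. Because the recursion only ever walks the original wff, the replacement is simultaneous, as the definition requires.

```python
def substitute(f, sub):
    """Return f[β1/p1, …, βn/pn], where sub maps each pi to βi."""
    if f[0] == 'var':
        return sub.get(f[1], f)      # replace pi by βi, leave other variables alone
    return (f[0],) + tuple(substitute(g, sub) for g in f[1:])

def show(f):
    """Print a wff built from variables, ~ and → (outer parentheses kept)."""
    if f[0] == 'var':
        return f[1]
    if f[0] == 'not':
        return '~' + show(f[1])
    return '(' + show(f[1]) + '→' + show(f[2]) + ')'

p, q, r = ('var', 'p'), ('var', 'q'), ('var', 'r')
alpha = ('imp', p, ('imp', q, p))                 # p→(q→p)
rr = ('imp', r, r)

print(show(substitute(alpha, {'p': r})))            # (r→(q→r))
print(show(substitute(alpha, {'q': rr})))           # (p→((r→r)→p))
print(show(substitute(alpha, {'p': r, 'q': rr})))   # (r→((r→r)→r))
print(show(substitute(alpha, {'p': q, 'q': p})))    # (q→(p→q))
```

The last call swaps p and q. Doing the two replacements one after the other instead of simultaneously would give a different (and wrong) result.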
Here is a common axiomatic system for PC, which I will refer to as system P.
Axioms:
AP1 p → (q → p)
AP2 (p → (q → r)) → ((p → q) → (p → r))
AP3 (~p → q) → ((~p → ~q) → p)
Primitive Transformation Rules:
US (Uniform Substitution, 代入規則): If |– α, then |– α[β1/p1, …, βn/pn].
MP (Modus Ponens, 分離規則): If |– α and |– α→β, then |– β.
Note how the rules are expressed: they apply to wffs that are theorems and derive further
theorems. In particular, the rule US does not say you can derive α[β1/p1, …, βn/pn] from
any α. What it says is that if α is a theorem, then you can derive α[β1/p1, …, βn/pn]. In
this course, we will only consider proofs that do not allow any non-theorem to enter the
proof sequence. In other words, any wff that already appears in a proof sequence must be
a theorem. So, we can freely apply the rule to any wff that appears in a proof sequence.
To see how proofs in axiomatic systems are written, let’s prove a couple of simple
theorems in system P.
|– p→p
Proof:
AP1                  (1) p → (q → p)
(1)[(p→p)/q]         (2) p → ((p→p) → p)
AP2                  (3) (p → (q → r)) → ((p → q) → (p → r))
(3)[(p→p)/q, p/r]    (4) (p → ((p→p) → p)) → ((p → (p→p)) → (p → p))
(2), (4) × MP        (5) (p → (p→p)) → (p → p)
(1)[p/q]             (6) p → (p→p)
(5), (6) × MP        (7) p → p
In general, a proof is written as a sequence of lines. In each line, there are three
components: a line number (for the purpose of reference), a wff, and a justification.
Following the style in the textbook (as you will read later), the justification is written on
the left, the line number in the middle, and the wff on the right. For each line, if the wff
on the line is an axiom, then the justification simply cites the name of the axiom;
otherwise the wff must be derived from some earlier lines by applying a transformation
rule or a definition, and the justification should specify which rule or definition is applied
to which earlier lines. In the previous proof, for example, line (2) is derived by applying
the rule of US to line 1, and we use the substitution notation in the justification to clearly
specify which propositional variables are being replaced by which wffs. Line (5) is
derived from (2) and (4) by applying the rule of MP, so we write “(2), (4) × MP”. Here I
also follow the textbook style of writing line numbers before the name of the rule and
using the symbol × to signal the application of a rule (except for the rule of US).
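Since every line of a proof is either an axiom or obtained from earlier lines by US or MP, a proof can be checked mechanically. The Python sketch below is a minimal checker for system P, written under assumptions of my own: wffs are nested tuples, and a justification is encoded as ('AX', name), ('US', line, substitution) or ('MP', line, line). It does not handle definitions or previously proved theorems.

```python
def imp(a, b):
    return ('imp', a, b)

def neg(a):
    return ('not', a)

p, q, r = ('var', 'p'), ('var', 'q'), ('var', 'r')

AXIOMS = {
    'AP1': imp(p, imp(q, p)),
    'AP2': imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r))),
    'AP3': imp(imp(neg(p), q), imp(imp(neg(p), neg(q)), p)),
}

def substitute(f, sub):
    """Simultaneous uniform substitution; sub maps variable names to wffs."""
    if f[0] == 'var':
        return sub.get(f[1], f)
    return (f[0],) + tuple(substitute(g, sub) for g in f[1:])

def check(proof):
    """proof is a list of (justification, wff); True iff every line is justified."""
    done = []                                   # wffs of the lines checked so far
    for just, wff in proof:
        kind = just[0]
        if kind == 'AX':
            ok = AXIOMS.get(just[1]) == wff
        elif kind == 'US':
            _, n, sub = just
            ok = 1 <= n <= len(done) and substitute(done[n - 1], sub) == wff
        elif kind == 'MP':                      # the two cited lines may come in either order
            _, i, j = just
            ok = (1 <= i <= len(done) and 1 <= j <= len(done) and
                  (done[j - 1] == imp(done[i - 1], wff) or
                   done[i - 1] == imp(done[j - 1], wff)))
        else:
            ok = False
        if not ok:
            return False
        done.append(wff)
    return True

pp = imp(p, p)
proof_of_pp = [
    (('AX', 'AP1'), imp(p, imp(q, p))),
    (('US', 1, {'q': pp}), imp(p, imp(pp, p))),
    (('AX', 'AP2'), AXIOMS['AP2']),
    (('US', 3, {'q': pp, 'r': p}), imp(imp(p, imp(pp, p)), imp(imp(p, pp), pp))),
    (('MP', 2, 4), imp(imp(p, pp), pp)),
    (('US', 1, {'q': p}), imp(p, pp)),
    (('MP', 5, 6), pp),
]
print(check(proof_of_pp))   # True
```

The seven entries correspond line by line to the proof of p→p given above. Changing any wff or justification makes check return False.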
An obvious way to simplify proofs is to use theorems that are already proved directly in
other proofs. Let us refer to the theorem p→p as ThP1, and use it to prove another
theorem of P.
|– ~~p→p
Proof:
ThP1                           (1)  p→p
(1)[~p/p]                      (2)  ~p→~p
AP3                            (3)  (~p → q) → ((~p → ~q) → p)
(3)[~p/q]                      (4)  (~p → ~p) → ((~p → ~~p) → p)
(2), (4) × MP                  (5)  (~p → ~~p) → p
AP1                            (6)  p → (q → p)
(6)[((~p→~~p)→p)/p, ~~p/q]     (7)  ((~p → ~~p) → p) → (~~p → ((~p → ~~p) → p))
(5), (7) × MP                  (8)  ~~p → ((~p → ~~p) → p)
AP2                            (9)  (p → (q → r)) → ((p → q) → (p → r))
(9)[~~p/p, (~p → ~~p)/q, p/r]  (10) (~~p → ((~p → ~~p) → p)) → ((~~p → (~p → ~~p)) → (~~p → p))
(8), (10) × MP                 (11) (~~p → (~p → ~~p)) → (~~p → p)
(6)[~~p/p, ~p/q]               (12) ~~p → (~p → ~~p)
(11), (12) × MP                (13) ~~p → p
As you can see, doing proofs in such axiomatic systems is a more challenging (and yes,
less “natural”) game than that in the system of natural deduction (which, you will recall,
has no axioms but many more rules and allows conditional proofs). Good news: except
for a few (optional) problems in the exercise, I will not bother you with building proofs in
P. The purpose of introducing P here is just to illustrate the main ideas of formal
axiomatic systems. Later, when we study the modal systems, we will take all theorems of
P as axioms, which will make the job of constructing proofs much easier.
By the way, the set of all theorems of P is precisely the set of all valid wffs, i.e.,
tautologies, in PC. In other words, every wff provable in P is valid and every valid wff is
provable in P. So, from just three axioms and two transformation rules (and the three
definitions), all valid laws of classical propositional logic can be derived.
Two axiomatic systems S1 and S2 may have exactly the same theorems, in which case we
say they are equivalent (等價). (Sometimes a system is simply identified with the set of its
theorems; if so, we would say S1 and S2 are two axiomatizations of the same system.)
If all theorems of S1 are also theorems of S2, we say S2
contains (包含) S1. If S2 contains S1 but S1 does not contain S2, we say S2 properly
contains (真包含) S1, or S2 is a proper extension (真擴張) of S1. We will see later that a
straightforward way to extend a system is to add non-theorems as new axioms.
2.3 The language of modal propositional logic
The language of modal propositional logic extends the language of PC by adding modal
operators (模態算子). Specifically, we keep the whole language of PC and add the
following:
• A new primitive symbol, □ (the textbook uses L).
• Another rule for forming wffs: if α is a wff, so is □α.
• A definition of another operator ◇ (the textbook uses M):
(Def ◇) ◇α =df ~□~α
Intuitively, □ is intended to express the notion of “necessarily” or “must be”, and ◇ is
intended to express the notion of “possibly” or “may be”. In English, we will refer to □ as
the box or the necessity operator, and ◇ as the diamond or the possibility operator. These
two operators will not be interpreted truth-functionally.
Since every wff in PC is also a wff in modal propositional logic, the system P is also a
system in the language of modal propositional logic. The modal systems we will study
later are all proper extensions of P.
Examples of wffs (that aren’t wffs of PC): □p→p, □(p→q)∨~◇q, ◇(□p∧~q), …
Examples of strings of symbols that are not wffs: □, p□q, □q◇q, …
Below is a list of some important wffs in the language of modal propositional logic (yes,
they are important enough to have names):
K □(p→q) → (□p → □q)
D □p → ◇p
T □p → p
B p → □◇p
4 □p → □□p
E ◇p → □◇p
As an exercise, think about what these formulas mean intuitively.
Pay attention to the discussion on pp. 15-16 of the textbook, about the ambiguity of the
scope of modal terms in English. In English, when we say things like “if p, then q must
be true” (or “if p, then necessarily q”), the modal term is usually meant to cover the
whole conditional, rather than just the consequent. That is, the logical form of the
sentence is □(p→q), rather than p→□q.
Exercise 2
1. On p. 13 of the textbook, there is a list of valid wffs of PC (which will be useful later).
Verify their validity using the Reductio method.
2. Prove the following theorems in the system P.
(a) (~p → p) → p
(b) p → (q → (r → q))
(c) ~~p → (q→p)
3. Rewrite the following formulas without using the defined operator ◇.

(a) p  p
(b) p  p
(c) ~(pq)  ~q
(d) (p~q)
4. The notion (and notation) of uniform substitution remains the same in the language of
modal logic as in the language of PC. Let K be the formula □(p→q) → (□p → □q). What
are the formulas resulting from the following substitutions?
(a) K[q/p, p/q]
(b) K[(pq)/p]
(c) K[p/p, q/q]
(d) K[K/q]
5. Consider the following argument:

Necessarily, if Jiji is the first teacher of modal logic at Lingnan, then Jiji likes
spicy food if and only if the first teacher of modal logic at Lingnan likes spicy
food. So, if Jiji is the first teacher of modal logic at Lingnan and likes spicy food,
then the first teacher of modal logic at Lingnan must like spicy food. In fact, Jiji is
the first teacher of modal logic at Lingnan and also likes spicy food. Therefore, it
is necessarily the case that the first teacher of modal logic at Lingnan likes spicy
food.
What is the logical form of this argument in the language of modal propositional logic?
What is wrong with this argument?
Lecture 3
3.1 Basic ideas of the possible-worlds semantics
In the semantics for modal logic, the centerpiece is to give the modal operators a precise
meaning. In the celebrated possible-worlds semantics (可能世界語義學), they are
interpreted based on a natural idea, usually attributed to Leibniz: a proposition is
necessary if it is true in all possible worlds; a proposition is possible if it is true in some
possible worlds.
Recall that in the semantics of PC, the basic element (i.e., a model) is a (single) truth
assignment, which determines a unique truth value for every wff. In modal logic, the
basic element, roughly speaking, is a number of truth assignments. More accurately, we
will allow any number of possible worlds, and in each world there is a truth assignment.
A wff is not simply true or false, but true or false at a world. At each world, the
propositional variables are interpreted as true or false by the truth assignment.
Connectives are interpreted as they are in PC, i.e., truth-functionally:
(i) ~α is true at world w iff α is false at world w.
(ii) α∨β is true at world w iff α is true at world w or β is true at world w (or both).
(Similarly for the other three connectives, whose truth conditions follow from their
definitions in terms of ~ and ∨. Recall also that ‘iff’ is shorthand for ‘if and only if’.)
For □α, however, it is not enough to just consult the truth value of α at the same world.
Rather we need something like the following:
(iii) □α is true at world w iff at every world (that is possible from w’s perspective) α is
true.
It follows, by the definition of the diamond, that
(iv) ◇α is true at world w iff at some world (that is possible from w’s perspective) α is
true.
So □ and ◇ are not truth-functional: in general, the truth value of α at a world does not
determine the truth value of □α or that of ◇α at the same world.
Thus the basic element in the possible-worlds semantics consists of a set of worlds and a
set of truth assignments: one for each world.
And still something more. Notice the complication stated in the parentheses of (iii) and
(iv). The idea is that seen from some world, perhaps not every world is possible. For
example, suppose we are talking about physical possibility, which, according to a common
understanding, means compatibility with physical laws. Then from the perspective of our
actual world, a world in which there is a sphere of enriched uranium with a mass of 1 ton is
not possible (because the critical mass of enriched uranium is far less than 1 ton,
according to our laws). That world, however, may well be possible from the perspective
of a world with different physical laws than ours.
Therefore, in our basic element, we will also have a binary (a.k.a. “dyadic”) relation over
the set of worlds, called the accessibility relation (可達關係), which intuitively represents
which world is possible relative to which world. In other words, the basic element in the
possible-worlds semantics, called a model, consists of a set of worlds, a binary relation
over the set, and a set of truth assignments: one for each world. Given a model, every wff
gets a unique truth value at each world according to the interpretations of operators.
The inclusion of a binary relation between worlds gives the possible-worlds semantics a
lot of flexibility and power. In fact, the semantics is also known as relational semantics
(關係語義學).
The textbook describes a modal game (pp. 17-18), which is a concrete implementation of
the previous ideas. The players in the game correspond to possible worlds. The relation
“player x can see player y” corresponds to the accessibility relation. The sheet given to a
player corresponds to a truth assignment at a world.
3.2 Frames, Models, and Validity.
Let’s now make the ideas rigorous by a few formal definitions:
Definition 1 [Frame (框架)]: An ordered pair <W, R> is a frame iff W is a non-empty
set and R ⊆ W×W is a binary relation over W.
Intuitively, a frame consists of a set of possible worlds (W) and an accessibility relation
(R) between worlds. I sometimes say w’ is accessible from w, by which I mean (w, w’) ∈
R (that is, w and w’ stand in the relation R).
In the next definition, I use FORM to denote the set of all well-formed formulas in the
language. Following the textbook, I use 1 for the truth value TRUE and 0 for the truth
value FALSE, and write wRw’ to mean (w, w’) ∈ R.
Definition 2 [Model (模型)]: An ordered triple <W, R, V> is a model iff <W, R> is a
frame, and V is a function from FORM×W to the set of truth values {0, 1} that satisfies
the following conditions:
(1) For any single propositional variable p and any w∈W, V(p, w) = 0 or V(p, w) = 1.
(2) For any wff α and any w∈W, V(~α, w) = 1 iff V(α, w) = 0.
(3) For any wffs α and β, and any w∈W, V(α∨β, w) = 1 iff V(α, w) = 1 or V(β, w) = 1.
(4) For any wff α, and any w∈W, V(□α, w) = 1 iff for every w’∈W such that wRw’,
V(α, w’) = 1.
So a model consists of a frame (which, in turn, consists of a non-empty set and a binary
relation over the set), and a valuation function. Given a model <W, R, V>, we can say
that the frame of the model is <W, R>, or that the model is based on the frame <W, R>.
For the V component, it is easier to see it this way: to specify V is just to specify a truth
assignment for each w∈W (this is what condition (1) in the definition is about). Then at
each w∈W, all wffs can be evaluated according to the semantical rules (2)-(4). In practice,
of course, we will almost always deal with only a limited number of propositional
variables, and so will just need to specify the truth assignment for these relevant variables
at each world.
To be explicit, let us also write down the semantical rules for the defined operators,
which follow from (2)-(4) and the definitions:
(5) For any wffs α and β, and any w∈W, V(α∧β, w) = 1 iff V(α, w) = 1 and V(β, w) = 1.
(6) For any wffs α and β, and any w∈W, V(α→β, w) = 1 iff V(α, w) = 0 or V(β, w) = 1.
(7) For any wffs α and β, and any w∈W, V(α↔β, w) = 1 iff V(α, w) = V(β, w).
(8) For any wff α, and any w∈W, V(◇α, w) = 1 iff for some w’∈W such that wRw’,
V(α, w’) = 1.
(Be reminded that “some” just means “at least one”, and “every” means “every, if any”.
So “every A is a B” is true if there is no A at all.)
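The semantical rules (1)-(8) amount to a recursive evaluation procedure. The Python sketch below implements it for finite models, under assumptions of my own: wffs are nested tuples with tags 'var', 'not', 'or', 'and', 'imp', 'iff', 'box', 'dia'; the truth assignment is a dict from (variable name, world) pairs to booleans; and the two-world model is a made-up example, not one from the textbook.

```python
def holds(f, w, W, R, val):
    """True iff wff f is true at world w in the model <W, R, V> given by val."""
    op = f[0]
    if op == 'var':
        return val[(f[1], w)]                                    # rule (1)
    if op == 'not':
        return not holds(f[1], w, W, R, val)                     # rule (2)
    if op == 'box':                                              # rule (4)
        return all(holds(f[1], u, W, R, val) for u in W if (w, u) in R)
    if op == 'dia':                                              # rule (8)
        return any(holds(f[1], u, W, R, val) for u in W if (w, u) in R)
    a = holds(f[1], w, W, R, val)
    b = holds(f[2], w, W, R, val)
    if op == 'or':
        return a or b                                            # rule (3)
    if op == 'and':
        return a and b                                           # rule (5)
    if op == 'imp':
        return (not a) or b                                      # rule (6)
    if op == 'iff':
        return a == b                                            # rule (7)
    raise ValueError(op)

# Hypothetical model: w2 is accessible from w1, and nothing is accessible from w2.
W = {'w1', 'w2'}
R = {('w1', 'w2')}
val = {('p', 'w1'): False, ('p', 'w2'): True}
p = ('var', 'p')

print(holds(('box', p), 'w1', W, R, val))              # True: p holds at w2
print(holds(('box', p), 'w2', W, R, val))              # True: vacuously, no accessible world
print(holds(('dia', p), 'w2', W, R, val))              # False: no accessible world
print(holds(('imp', ('box', p), p), 'w1', W, R, val))  # False: □p true but p false at w1
```

The second and third lines illustrate the reminder about “every” and “some”: at a world from which nothing is accessible, every □α is true and every ◇α is false.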
With the notions of frame and model, we can define several notions of validity.
Definition 3 (Validity relative to a model) Let M = <W, R, V> be a model. A wff α is
said to be valid on M, written as |=M α, iff for each w∈W, V(α, w) = 1.
In plain words, α is valid on M iff α is true at every world of M.
Obviously, the notion of validity on a model depends on the particular valuation function
V in the model. A more interesting notion of validity is the following:
Definition 4 (Validity relative to a frame) Let F be a frame. A wff α is said to be valid on
F, written as |=F α, iff α is valid on every model based on F.
In plain words, α is valid on F iff regardless of which truth assignment happens to hold at
which world, α is true at every world of F.
Later, we will also frequently talk about validity relative to a set of frames: a wff α is said
to be valid on a set of frames iff α is valid on every frame in the set. The notion of
K-validity in the textbook is simply validity relative to the set of all frames. In other words,
α is K-valid iff it is valid on every frame.
Finally, no matter which notion of validity we use, we can extend it to argument forms: an
argument form with premises α1, …, αn and conclusion β is valid if and only if the
formula (α1∧…∧αn) → β is valid. Hence it suffices to consider the validity of formulas.
3.3 Examples
To illustrate the concepts of validity, let us start with the obvious: all tautologies of PC
are valid on every frame. In no frame can a tautology be made false at some world.
Similarly, all formulas resulting from tautologies of PC by uniform substitution, such as
□p→□p and □p∨~□p, are also valid on every frame.
Next, consider such formulas as □α, where α is a tautology of PC. For example, consider
□(p→p). Obviously p→p is true at every world in every model. So no matter what the
accessibility relation is, □(p→p) is true at every world. Thus, □(p→p) is valid on every
model, which implies that it is valid on every frame.
More generally, if α is valid on a frame, then for any model that is based on the frame, α
is true at every world in the model, which implies that □α is also true at every world in
the model. Therefore, if α is valid on a frame, then □α is also valid on the same frame. It
follows that if α is valid on every frame, then □α is also valid on every frame.
How about ◇(p→p)? You may be tempted to think that it is also valid on every frame.
But that is a mistake. There are frames on which ◇(p→p) is not valid. For example, let
W={w} and R=∅, then <W, R> is a frame on which ◇(p→p) is not valid. In order to
show that a formula is not valid on a frame, we just need to present a model based on the
frame such that the formula is not valid on the model. We call such a model a
counter-model (反模型). Here is one way to specify a counter-model to ◇(p→p):
Model 1: W={w}; R=∅; V(p, w) = 1.
Notice that we only need to specify the truth assignment, at each world (in this case, just
w), to the relevant propositional variables (in this case, just p). In Model 1, ◇(p→p) is
false at w, due to the fact that there is no w’∈W such that wRw’ (or in plain words, no
world, including w itself, is accessible from w). Thus ◇(p→p) is not valid on the model,
and hence not valid on its frame <{w}, ∅>.
This frame certainly looks strange (intuitively the frame is such that no world is possible
relative to w), but it is a frame nonetheless. Of course the mere fact that ◇(p→p) is not
valid on this frame does not mean ◇(p→p) cannot be a valid law of modal logic. As I said
at the beginning, there are different kinds of modality (and for each kind there are
different opinions about its nature). Perhaps for some kind of modality, frames like the
previous one are improper, and so the logical laws for that kind of modality
need not be valid on such frames.
You will show in the exercise that although ◇(p→p) is not valid on every frame, it is
valid on every serial frame (by which we mean that the accessibility relation of the frame
is serial). Indeed it is valid on a frame if and only if the (accessibility relation in the)
frame is serial. So, if you regard ◇(p→p) as valid, you are committed to taking only
serial frames as proper.
Consider another intuitively valid formula: □p → p (which is known as formula T). Is it
valid on every frame? No. Here is my reasoning towards a counter-model: I need to make
□p → p false at some world w in a model. For that purpose, I need to make □p true and p
false at w. How to make □p true at w while p is false at w? I need to make it that p is true
in every world accessible from w. That means I should make w itself not accessible from
w, and make p true in any world that I make accessible from w. The simplest thing to do
is to make no world accessible from w, and we get a counter-model to formula T:
Model 2: W={w}; R=∅; V(p, w) = 0.
Model 2 is also based on the frame <{w}, ∅>. You will show in the exercise that □p → p
is valid on all and only reflexive frames (i.e., frames in which the accessibility relations
are reflexive). Since non-seriality entails non-reflexivity, every frame on which ◇(p→p)
is not valid is also a frame on which □p → p is not valid. But not vice versa, as the
following model shows:
Model 3: W={w1, w2}; R={(w1, w2), (w2, w2)}; V(p, w1) = 0, V(p, w2) = 1.
T is not valid on this model, because at w1 it is false (V(□p → p, w1) = 0). So it is not
valid on the frame <W={w1, w2}, R={(w1, w2), (w2, w2)}>. But it is not hard to see that
◇(p→p) is valid on this frame (which is serial).
In the simple reasoning towards Model 2 above, we used a method very much like the
Reductio method for testing validity in PC (pp. 11-12 in the textbook). Indeed it is a
generalized version of the Reductio method suitable for modal logic. We will systematize
this method later. For now, it suffices to see a couple more examples.
Consider the formula known as 4: □p → □□p. Perhaps you do not have as clear an
intuition regarding this formula as you do regarding T. But since T is already not valid on
every frame, it is natural to expect that 4 is not valid on every frame either. How to find a
counter-model? I start by thinking that I need to make □p → □□p false at some world w.
That means I need to make □p true at w and □□p false at w. In order to make □□p false at
w, there needs to be a world w’ accessible from w at which □p is false. Obviously w’
cannot be w, because □p has to be true at w. So we need at least two different worlds.
What we get so far is two distinct worlds w1 and w2 such that w1Rw2, and we need to
make □p true at w1 and make □p false at w2. Since w1Rw2, in order to make □p true at w1,
we have to make p true at w2. On the other hand, in order to make □p false at w2, there
needs to be a world accessible from w2 at which p is false. Obviously this world cannot
be w2 itself because p has to be true at w2. So it is either w1 or still some other world. For
simplicity, we can just use w1, which means we need to require w2Rw1 as well and make
p false at w1.
Model 4: W={w1, w2}; R={(w1, w2), (w2, w1)}; V(p, w1) = 0, V(p, w2) = 1.
In Model 4, □p → □□p is false at w1 (check it again!), so it is not valid on every frame.
(We will see later that □p → □□p is valid on all and only transitive frames.)
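The invitation to check Model 4 again can also be taken up mechanically. In the sketch below (an illustration only; the tuple encoding and the name holds are assumptions of this example), the three relevant truth values at w1 are computed, and a brute-force pass over the two possible one-world frames confirms the step in the reasoning that said at least two worlds are needed.

```python
def holds(f, w, W, R, V):
    """Truth of formula f at world w in the model <W, R, V>."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)] == 1
    if op == "imp":
        return (not holds(f[1], w, W, R, V)) or holds(f[2], w, W, R, V)
    if op == "box":
        return all(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    raise ValueError(f"unknown operator: {op}")

p = ("var", "p")
box_p = ("box", p)
box_box_p = ("box", box_p)
four = ("imp", box_p, box_box_p)         # □p → □□p

# Model 4 from the notes.
W = ["w1", "w2"]
R = {("w1", "w2"), ("w2", "w1")}
V = {("p", "w1"): 0, ("p", "w2"): 1}

box_p_at_w1 = holds(box_p, "w1", W, R, V)          # p is true at w2
box_box_p_at_w1 = holds(box_box_p, "w1", W, R, V)  # □p fails at w2
four_at_w1 = holds(four, "w1", W, R, V)

# With a single world there are only two frames: R empty, or w sees itself.
one_world_valid = all(
    holds(four, "w", ["w"], R1, {("p", "w"): v})
    for R1 in (set(), {("w", "w")})
    for v in (0, 1)
)

print(box_p_at_w1, box_box_p_at_w1, four_at_w1)  # True False False
print(one_world_valid)                           # True
```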
Lastly, let us look at the example discussed in the textbook (p. 20), the formula K:
□(p→q) → (□p → □q). This formula turns out to be valid on every frame. To show this,
we can use the Reductio method to show that there is no counter-model to K. Suppose we
could find a model in which K is false at some world w. That means □(p→q) has to be
true at w and □p→□q has to be false at w. In order for □p→□q to be false at w, □p has to
be true at w, and □q has to be false at w. In order for □q to be false at w, there needs to be
a world w’ accessible from w at which q is false. Now since w’ is accessible from w, in
order for □p to be true at w, p has to be true at w’, and in order for □(p→q) to be true at w,
p→q has to be true at w’. In order for p→q and p to be both true at w’, q has to be true at
w’. Then we have to make q both false and true at the same world w’, which is not
allowed in a model. Therefore, there is no counter-model. In other words, K is valid on
every model, and hence valid on every frame. (For practice, try to reason like this on the
formulas in Exercise 1.1 of the textbook, p. 21).
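A finite search cannot replace the Reductio argument, which covers frames of every size, but it is a useful sanity check. The sketch below is a hedged illustration: the names holds, all_models and counter_model and the tuple encoding are assumptions of this example. It enumerates every model with at most three worlds and looks for a world at which a given formula is false. For K it finds nothing, while for T it immediately finds a one-world counter-model with an empty accessibility relation.

```python
from itertools import product

def holds(f, w, W, R, V):
    """Truth of formula f at world w in the model <W, R, V>."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)] == 1
    if op == "imp":
        return (not holds(f[1], w, W, R, V)) or holds(f[2], w, W, R, V)
    if op == "box":
        return all(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    raise ValueError(f"unknown operator: {op}")

def all_models(n, names):
    """Yield every model <W, R, V> with worlds 0..n-1 and the given variables."""
    W = list(range(n))
    pairs = [(u, v) for u in W for v in W]
    keys = [(x, w) for x in names for w in W]
    for rbits in product([0, 1], repeat=len(pairs)):
        R = {pr for pr, b in zip(pairs, rbits) if b}
        for vbits in product([0, 1], repeat=len(keys)):
            yield W, R, dict(zip(keys, vbits))

def counter_model(f, names, max_worlds):
    """Return (W, R, V, w) with f false at w, or None if the search finds none."""
    for n in range(1, max_worlds + 1):
        for W, R, V in all_models(n, names):
            for w in W:
                if not holds(f, w, W, R, V):
                    return W, R, V, w
    return None

p, q = ("var", "p"), ("var", "q")
K = ("imp", ("box", ("imp", p, q)), ("imp", ("box", p), ("box", q)))
T = ("imp", ("box", p), p)

print(counter_model(K, ["p", "q"], 3))   # None: no small counter-model exists
print(counter_model(T, ["p"], 1))        # a one-world counter-model to T
```

The bound of three worlds is arbitrary; it keeps the search at a few tens of thousands of models.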
3.4 Aside: what is a possible world?
The formal semantics we have just seen is called, among other names, possible-worlds
semantics. But what is a possible world? Is a possible world real in some sense or
completely fictional? These are unsettled questions of metaphysics. Opinions range from
taking possible worlds to be as concrete as the actual world, or at least as real as
abstract objects (e.g., numbers, sets) are, to taking them as non-existent objects, or merely
as objects of thought. Those who are interested may want to take a look at the optional
reading: a (very short) excerpt from Graham Priest’s book on non-classical logic,
summarizing (very briefly) the major metaphysical stances on possible worlds.
Personally I prefer to think of the locution of possible worlds as a way of speaking about
(logically) possible states of the (actual) world, and I apply whatever intuitions I have
about possible states of the world to possible worlds. I find this enough for most purposes
in studying modal logic. At the end of the course when we have to engage a little
metaphysics of modality, we may reconsider the options.
Exercise 3
1. Consider the following model:
W={w1, w2, w3}; R={(w1, w1), (w1, w2), (w2, w1), (w2, w2), (w2, w3), (w3, w2)};
V(p, w1) = V(p, w2) = V(p, w3) = 1, V(q, w1) = 1, V(q, w2) = V(q, w3) = 0.
For each of the formulas below, determine whether the formula is valid on the model.
(a) p  p
(b) p  (p~q)
(c) (pq)  (p  q)
(d) (pq)  (pq)
(e) q  q
(f) (pq)  (p  q)
2. Consider the formula ◇(p→p) and the formula known as D: □p → ◇p. Show that they
are valid on a frame if and only if the frame is serial. In other words, let F be any frame.
Show that:
(a) If the accessibility relation of F is not serial, then the formulas are not valid on F;
(b) If the accessibility relation of F is serial, then the formulas are valid on F.
3. Consider the formula known as T: □p → p and the formula p → ◇p. Show that they are
valid on a frame if and only if the frame is reflexive. In other words, let F be any frame.
Show that:
(a) If the accessibility relation of F is not reflexive, then the formulas are not valid on F;
(b) If the accessibility relation of F is reflexive, then the formulas are valid on F.
4. We can also define notions of satisfiability (可满足性):
A formula α is satisfiable on a model <W, R, V> iff for some w∈W, V(α, w) = 1.
A formula α is satisfiable on a frame iff it is satisfiable on some model based on
the frame.
For each of the following formulas, find a frame on which it is satisfiable.
(a) p  p
(b) (pq)  (~p  ~q)
(c) ~(p  p)
(d) (p  p)  p
Lecture 4
4.0 A bit of history
In this and the following lectures, we will study a number of axiomatic systems in modal
propositional logic. 2 It is worth noting that modern modal logic began in the second
decade of the last century with C. I. Lewis’s study of the so-called strict implication (嚴
格蘊涵). Lewis, like many others, noted the peculiar properties of material implication in
the system of classical logic. For example, any standard system of PC has the following
theorems:
(1) p → (q→p)
(2) ~p → (p→q)
(3) (p→q) ∨ (q→p)
Intuitively these theorems sound very strange: (1) says that if a proposition (p) is true,
then any arbitrary proposition (q) implies it; (2) says that if a proposition (p) is false, then
it implies any arbitrary proposition (q); (3) says that for any two propositions p and q,
either p implies q or q implies p. Some people were so disturbed by these and other
similarly strange theorems that they referred to them as paradoxes of material implication.
For Lewis, there is nothing paradoxical about them: they are ordinary consequences of the
truth-functional sense of implication. However, Lewis stressed that there is another sense
of implication for which (1), (2) and (3) should fail. In this sense, ‘p implies q’ means that
q (logically) follows from p. Lewis called this sense of implication strict implication and
introduced the symbol ⥽ for it. According to Lewis, p ⥽ q should be defined as ~◇(p∧~q),
or equivalently, □(p→q). That is, a strict conditional is the necessitation of the
corresponding material conditional.
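Both halves of this point can be illustrated with a few lines of Python. The sketch is an illustration under stated assumptions, not Lewis's own formulation: the helper names imp and strict are hypothetical, the two worlds are invented for the example, and every world is taken to be accessible from every world. The first part confirms by truth table that (1), (2) and (3) are tautologies under material implication. The second part reads the inner implications strictly, as □(... → ...), and shows that each of the three then fails in the two-world example.

```python
from itertools import product

def imp(a, b):
    """Material implication on Python booleans."""
    return (not a) or b

# Part 1: (1)-(3) are tautologies when the arrow is material implication.
paradoxes = {
    "(1)": lambda p, q: imp(p, imp(q, p)),
    "(2)": lambda p, q: imp(not p, imp(p, q)),
    "(3)": lambda p, q: imp(p, q) or imp(q, p),
}
rows = list(product([False, True], repeat=2))
is_tautology = {name: all(f(p, q) for p, q in rows)
                for name, f in paradoxes.items()}
print(is_tautology)

# Part 2: a hypothetical model with two worlds, each given as (p, q),
# where every world is assumed accessible from every world.
worlds = [(True, False), (False, True)]
P, Q = 0, 1

def strict(a, b):
    """a ⥽ b: the material conditional a → b holds at every world."""
    return all(imp(w[a], w[b]) for w in worlds)

# Strict readings of (1)-(3), required to hold at every world.
strict_1 = all(imp(w[P], strict(Q, P)) for w in worlds)      # p → (q ⥽ p)
strict_2 = all(imp(not w[P], strict(P, Q)) for w in worlds)  # ~p → (p ⥽ q)
strict_3 = strict(P, Q) or strict(Q, P)                      # (p ⥽ q) ∨ (q ⥽ p)
print(strict_1, strict_2, strict_3)
```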
Lewis’s work marked the beginning of the modern study of modal logic. But a proper
formal semantics for the modal operators or the strict implication had to wait for another
few decades. In particular, the possible-worlds semantics we studied last time did not
fully appear until the 1950s. So what did Lewis do? He proposed and studied several
axiomatic systems for strict implication. Since the strict implication can be defined in our
language of modal propositional logic, Lewis’s systems are really modal systems in this
language (we will study two of his systems next time). Thus, modern modal logic
originated with the axiomatic approach.
4.1 The system K
The systems we will study belong to the class of normal modal systems (正規模態系統),
and the weakest normal system (i.e., the normal system which has the fewest theorems) is
named K (after Saul Kripke).
2 If you have forgotten the basic ideas of axiomatic systems, this is a good time to revisit lecture 2.
The system K is a proper extension of the system of classical propositional logic (e.g., the
system P we studied briefly in lecture 2). To simplify proofs in K, we will follow the
textbook and take all valid formulas of PC (i.e., all tautologies) as axioms of K (though as
we saw in the system P, three axioms would be enough). Besides, there is one modal
axiom, the formula known as K: □(p→q) → (□p → □q).
There are three primitive transformation rules. We already studied two of them in lecture
2: the rule of uniform substitution and the rule of modus ponens. The third is called the
rule of necessitation (必然化規則): if a formula α is a theorem, then □α is also a theorem.
As with the rule of uniform substitution, it is important to keep in mind that the rule of
necessitation only applies to formulas that are already proved to be theorems. It does not
say that from any formula α, □α can be derived.
To summarize, here is the axiomatic basis for the system K:
Axioms:
PC: every valid wff of PC is an axiom.
K: (pq)  (p  q)
Primitive rules:
US: If |– , then |– (1/p1, …, n/pn).
MP: if |–  and |– , then |– .
N: if |– , then |– .
On p. 27, the textbook gives two examples of proofs (for Theorems K1
and K2) that are fully rigorous. You should read the proofs carefully and make sure to
understand each line in them. (By the way, “Q.E.D.” is an acronym that mathematicians
commonly use to indicate the end of a proof.)
The textbook then describes several ways to simplify proofs. First, we can cite theorems
that are proved earlier directly in a new proof (e.g., in the proof of K3 on p. 28). Second,
and very usefully, we can derive new rules for the system, and once derived, we can use
the derived rules (导出规则) in subsequent proofs. Third, we will follow a simplifying
convention when applying rules to axioms (or already proved theorems). When we apply
a rule to axioms (or already proved theorems), strictly speaking we should first write
down the axioms (or the theorems) in the proof and then apply the rule to them. By the
convention, however, we can simply cite the names of the axioms (or theorems) in a
justification without having written down the axioms (or theorems) first. For example, in
the shortened proof of K1 on p. 31, in order to reach the formula on line (1), strictly
speaking we should first write down the axiom PC1 in a separate line and then apply the
rule DR1 to it. But by the convention, the textbook simply writes PC1×DR1 as the
justification for the formula on line (1), without writing down PC1 in a separate line.
Obviously, “PC1×DR1” gives sufficient information on how the formula on line (1) is
derived.
The textbook introduces a number of derived rules. First, there are those rules that are
derived from tautologies. For example, the textbook uses the following, among others:
A rule derived from PC5 (p. 29): if |– α→β and |– β→α, then |– α↔β.
A rule derived from PC6 (also named Syll, p. 30): If |– α→β and |– β→γ, then |– α→γ.
A rule derived from PC3 (used in the shortened proof of K1 on p. 31):
If |– α→β and |– α→γ, then |– α→(β∧γ).
A rule derived from PC8 (used in the shortened proof of K2 on p. 31):
If |– α→β and |– β→(γ→δ), then |– (α∧γ) → δ.
A rule derived from PC11 (used in the proof of K4 on p. 31):
If |– α→γ and |– β→γ, then |– (α∨β) → γ.
All such rules are easily justified, by using the corresponding tautology, the rule of US,
and the rule of MP. Note that the textbook cites all such PC-rules by the name PC or PC#
(when the rule is based on a tautology listed on p. 13).
Besides, there are derived rules having to do specifically with modal operators. For
example:
DR1: if |– α → β, then |– □α → □β.
DR2: if |– α ↔ β, then |– □α ↔ □β.
DR3: if |– α → β, then |– ◇α → ◇β.
It is fairly easy to justify these rules. You should know how.
The textbook also establishes two derived rules which are slightly more complicated. One
is the rule of Substitution of Equivalents (等值置換), or simply named as Eq (pp. 32-33).
This rule says that if |– α and |– β↔γ, then replacing any occurrence of β in α with γ (or
replacing any occurrence of γ in α with β) still results in a theorem. For example, in the
proof of K5 on p. 33, line (3) is derived from line (2) by replacing the second occurrence
of the formula p in the formula on line (2) with the formula ~~p, because on line (1) we
have p↔~~p. As another example, in the proof of K6 on p. 34, line (3) is derived from
line (2) by replacing the formula ~(~p∧~q) in the formula on line (2) with the formula
(p∨q), because it is a tautology and hence an axiom of K that ~(~p∧~q) ↔ (p∨q).
I do not expect you to be able to rigorously justify the rule Eq in K, but if you have time,
you should try to understand the basic ideas on pp. 32-33.
The other rule is the rule of □-◇ interchange (模態算子互換), or simply named as □◇I.
This rule says that in any theorem, you can do the following to any sequence of adjacent
modal operators and still get a theorem: replace □ with ◇ and ◇ with □ throughout the
sequence, add or delete a ~ at the beginning of the sequence, and add or delete a ~ at the
end of the sequence. For example, □ can be replaced by ~◇~, and vice versa; ◇ can be
replaced by ~□~, and vice versa; ~□ can be replaced by ◇~, and vice versa; ~◇ can be
replaced by □~, and vice versa; ~□◇ can be replaced by ◇□~, and vice versa; □◇ can be
replaced by ~◇□~, and vice versa.
Again, I do not expect you to rigorously justify the rule □◇I in K, but I hope you can try
to understand the basic ideas of the justification on pp. 33-34.
The textbook proves the following theorems in K:
K1: (pq)  (pq)

K2: (pq)  (pq)
K3: (pq)  (pq)
K4: (pq)  (pq)
K5: p  ~~p
K6: (pq)  (pq)
K7: (pq)  (p  q)
K8: (pq)  (pq)
K9: (pq)  (pq)
You should think of the intuitive meanings of these theorems, and should understand the
proofs for them in K. Let me end this section by another example of proof:
K10: (□p∧◇q) → ◇(p∧q).
Proof
K9[~q/p, ~p/q] (1) □(~q∨~p) → (□~q∨◇~p)
(1) × PC15 (2) ~(□~q∨◇~p) → ~□(~q∨~p)
PC[□~q/p, ◇~p/q] (3) ~(□~q∨◇~p) ↔ (~□~q ∧ ~◇~p)
(2) × (3) × Eq (4) (~□~q ∧ ~◇~p) → ~□(~q∨~p)
(4) × □◇I (5) (◇q∧□p) → ◇~(~q∨~p)
(5) × PC × Eq (6) (◇q∧□p) → ◇(q∧p)
PC[◇q/p, □p/q] (7) (◇q∧□p) ↔ (□p∧◇q)
(6) × (7) × Eq (8) (□p∧◇q) → ◇(q∧p)
(8) × PC × Eq (9) (□p∧◇q) → ◇(p∧q)
4.2 The frames for K and K-validity
Let us now connect the formal system to the semantics. Given any modal system S and a
frame F, we will say F is a frame for S if every theorem of S is valid on F. It turns out
every frame is a frame for the system K. In other words, every theorem of the system K is
valid on all frames. The proof of this fact is not difficult. The basic ideas are just that (i)
all the axioms of K are valid on all frames, and that (ii) all the primitive rules preserve
validity on any frame. In the textbook, Lemma 2.3 (p. 39) addresses (i), and Lemma 2.4
(p. 40) addresses (ii). The demonstrations of these lemmas are given on p. 41.
For convenience, we can define the notion of K-validity, which is just validity relative to
the set of all frames. The aforementioned fact is that every theorem of the system K is K-
valid. Another common expression of this fact is that the system K is sound (可靠) with
respect to K-validity.
Given this fact, in order to show that a formula is not a theorem of K, we can show that it
is not K-valid. For the latter purpose, it suffices to find a frame on which the formula is
not valid. For example, we have seen last time that such formulas as (pp), p p, and
p  p are not K-valid. They are hence not theorems of the system K.
A more remarkable fact is that every K-valid formula is a theorem of K. In other words,
for every formula that is valid on every frame, there is a proof of that formula in the
system K. So the system K is strong enough to prove all K-valid formulas. This fact is
known as the completeness (完全性) of K: the system K is complete with respect to K-
validity. Completeness, however, is significantly harder to establish. We will not enter
the details of establishing completeness, but you should remember the result.
To summarize, the theorems of the system K comprise all and only those formulas that
are valid on every frame.
4.3 The system T
As I said, all the so-called normal modal systems are extensions of K. A straightforward
way to get an extension of K is to keep the whole axiomatic basis of K (i.e., all the
axioms and primitive rules of K) and add extra axioms and/or primitive rules. For
example, the system T is the result of adding to K the formula T: □p→p, as an extra
axiom. We write T = K+T to note this construction of the system T.
Obviously T is an extension of K, which, recall, means that every theorem of K is also a
theorem of T. In addition, T is a proper extension of K, which, recall, means that T has
strictly more theorems than K does. This is also easy to see, for we have shown that the
formula T itself is not K-valid and hence not a theorem of K.
Besides T, the textbook proves the following theorems of T that are not theorems of K:
T1: p → ◇p
T2: ◇(p → □p)
Again, you should think of the intuitive meanings of these theorems, and should fully
understand the proofs for them in T. Here is another theorem of T (that is not one of K):
T3: □□(p→q) → (□p → □q)
Proof
T[□(p→q)/p] (1) □□(p→q) → □(p→q)
K (2) □(p→q) → (□p → □q)
(1), (2) × Syll (3) □□(p→q) → (□p → □q)
4.4 The frames for T and T-validity
In the last exercise, you showed that the formula T is valid on a frame if and only if the
frame is reflexive. It follows that all and only reflexive frames are frames for the system
T. Here is why. For any non-reflexive frame, the formula T, as a theorem of the system T,
is not valid on the frame, and hence the frame is not a frame for T. On the other hand, for
any reflexive frame, all the axioms, including T, of the system T are valid on the frame,
and all the primitive rules preserve validity on the frame; therefore, the frame is a frame
for T.
Define T-validity as validity relative to the set of reflexive frames. We have just
established that the system T is sound with respect to T-validity. Given this fact, it is easy
to show that a formula is not a theorem of T. We just need to find a reflexive frame on
which the formula is not valid.
Again, a remarkable fact is that the system T is also complete with respect to T-validity.
Thus, the theorems of T comprise all and only those formulas that are valid on every
reflexive frame.
4.5 The system D
The last system for today is a system, so to speak, between K and T. Add to the system K
the following axiom D: □p→◇p, and we get the system D. That is, D = K+D.
Obviously D is a proper extension of K, because D is not K-valid. As you showed in the
last exercise, D is not valid on any non-serial frame.
On the other hand, D is contained in the system T. To show this, it suffices to show that
we can prove the formula D in the system T. The proof is straightforward:
T (1) □p → p
T1 (2) p → ◇p
(1), (2) × Syll (3) □p → ◇p
So the system T is an extension of the system D. We will see shortly that the extension is
also proper.
The main interest in D is in connection to deontic logic (hence the name D). When we
interpret □ as “it is obligatory that”, the formula T becomes dubious, because intuitively
“ought” does not imply “is”. But the formula D still seems valid, because when □ is
interpreted as “it is obligatory that”, ◇ will be interpreted as “it is permissible that”, and
intuitively being obligatory at least implies being permissible. We will study deontic
logic in more detail later.
Besides the formula D, the textbook proves a theorem of D that is not a theorem of K:
D1: (pp)
It is interesting to note that the system D is the weakest normal modal system that has a
theorem of the form . The system K, being still weaker than D, does not have any
theorem of the form .
Again, let me end this section by showing another theorem of D (that is not one of K):
D2: p  ~p
Proof
PC × N (1) (p  ~p)
D[(p  ~p)/p] (2) (p  ~p)  (p  ~p)
(1), (2) × MP (3) (p  ~p)
K6[~p/q] (4) (p  ~p)  (p  ~p)
(3) × (4) × Eq (5) p  ~p
4.6 The frames for D and D-validity
In the last exercise, you showed that the formula D is valid on a frame if and only if the
frame is serial. It follows that all and only serial frames are frames for the system D. Here
is why. For any non-serial frame, the formula D, as a theorem of the system D, is not
valid on the frame, and hence the frame is not a frame for D. On the other hand, for any
serial frame, all the axioms, including D, of the system D are valid on the frame, and all
the primitive rules preserve validity on the frame; therefore, the frame is a frame for D.
Define D-validity as validity relative to the set of serial frames. We have just established
that the system D is sound with respect to D-validity. Given this fact, it is easy to show
that a formula is not a theorem of D. We just need to find a serial frame on which the
formula is not valid. For example, the formula T is not a theorem of D. There are
obviously frames that are serial but not reflexive, and T is not valid on those frames.
Therefore, the system T is a proper extension of the system D.
It is also true that the system D is complete with respect to D-validity. Thus, the theorems
of D comprise all and only those formulas that are valid on every serial frame.
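The correspondence between D and seriality, and between T and reflexivity, can be observed on small frames. The sketch below is only a finite illustration and no substitute for the proofs asked for in the exercises; the names is_serial, is_reflexive, holds and valid_on_frame and the tuple encoding are assumptions of this example. It runs through all sixteen frames on two worlds, compares validity of D and T with the frame properties, and collects the frames that are serial but not reflexive, which are exactly the two-world frames that validate D while refuting T.

```python
from itertools import product

def is_serial(W, R):
    """Every world has at least one accessible world."""
    return all(any((w, u) in R for u in W) for w in W)

def is_reflexive(W, R):
    """Every world is accessible from itself."""
    return all((w, w) in R for w in W)

def holds(f, w, W, R, V):
    """Truth of formula f at world w in the model <W, R, V>."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)] == 1
    if op == "imp":
        return (not holds(f[1], w, W, R, V)) or holds(f[2], w, W, R, V)
    if op == "box":
        return all(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    if op == "dia":
        return any(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(f, W, R):
    """f (built from the single variable p) is true everywhere under every valuation."""
    return all(
        holds(f, w, W, R, {("p", u): b for u, b in zip(W, bits)})
        for bits in product([0, 1], repeat=len(W))
        for w in W
    )

W = [0, 1]
pairs = [(u, v) for u in W for v in W]
frames = [{pr for pr, b in zip(pairs, bits) if b}
          for bits in product([0, 1], repeat=len(pairs))]

p = ("var", "p")
T = ("imp", ("box", p), p)               # □p → p
D = ("imp", ("box", p), ("dia", p))      # □p → ◇p

agree = all(
    valid_on_frame(D, W, R) == is_serial(W, R)
    and valid_on_frame(T, W, R) == is_reflexive(W, R)
    for R in frames
)
witnesses = [R for R in frames if is_serial(W, R) and not is_reflexive(W, R)]
print(agree, len(witnesses))             # True 5
```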
Exercise 4
1. Prove the following formulas in the system K.
(a) ((p  q)  (q  r))  (p  r)
(b) (p  q)  (p  q)
(c) (p  q)  (p  q)
2. Prove the following formulas in the system D.

(a) ~p  (~q  (pq))
(b) (p  ~p)  ~p
3. Prove the following formulas in the system T.

(a) p  p
(b) ~(p~p)
4. Show that the system K does not have any theorem of the form  or of the form .
5. Show that T2: ◇(p → □p), and T3: □□(p→q) → (□p → □q) are not theorems of D.
6. Show that B: p → □◇p, 4: □p → □□p, and E: ◇p → □◇p are not theorems of T.
7. Show that  is D-valid if and only if  is D-valid.
Lecture 5
5.1 The system S4
Among the three systems we studied last time, the system T is the strongest. In this
lecture we shall look at two even stronger systems. The system known as S4 results from
adding to T an extra axiom, the formula 4: □p → □□p. That is, S4 = T + 4 = K + T + 4.
This system is named S4 because it is equivalent to the fourth system for strict
implication developed by Lewis.
You have shown in the previous exercise that the formula 4 is not a theorem of T. Thus
S4 is a proper extension of T. Besides 4, the textbook mentions the following theorems of
S4 (that are not theorems of T):
S4(1) ◇◇p → ◇p
S4(2) □p ↔ □□p
S4(3) ◇p ↔ ◇◇p
S4(4) ◇□◇p → ◇p
S4(5) □◇p → □◇□◇p
S4(6) □◇p ↔ □◇□◇p
S4(7) ◇□p ↔ ◇□◇□p
You should fully understand the proofs of these theorems in S4.
5.2 Modalities in S4
The four theorems S4(2), S4(3), S4(6), and S4(7) are known as reduction laws (規約律)
of S4, because according to them, certain sequences of modal operators are equivalent to,
and so can be reduced to, shorter sequences. For example, according to S4(2) (and the
rule of uniform substitution), for any formula α, □□α is equivalent to □α in S4. Hence in
S4, anywhere you see □□, you can replace it by □. Similarly, by S4(3), ◇◇ can be replaced
by ◇; by S4(6), □◇□◇ can be replaced by □◇; by S4(7), ◇□◇□ can be replaced by ◇□.
Another way of putting the point is that in S4, □□ and □ (similarly, ◇◇ and ◇, □◇□◇ and
□◇, ◇□◇□ and ◇□) are equivalent modal terms or modalities. Technically, any (possibly
empty) string consisting of ~, □, ◇ is called a modality. For example, these are modalities:
~, ~~~, □, ◇, □◇, ~□~◇. Note that we also count the empty string as a modality,
which is usually denoted as ‘–’. A modality that contains two or more modal operators
is called an iterated modality (叠置模態). A modality that contains an even number of
negation signs is called an affirmative modality (which, by the rule of □◇I and the rule of
double negation, is equivalent to a modality with no negation sign). A modality that
contains an odd number of negation signs is called a negative modality (which is
equivalent to a modality with a single negation sign at the beginning).
Obviously modalities are not themselves well-formed formulas. But for any modality A
and well-formed formula α, Aα is also a well-formed formula. Two modalities A and B
are equivalent in a system S iff for any formula α, the formula Aα ↔ Bα is a theorem of
S. Two modalities are said to be distinct iff they are not equivalent. An interesting fact
about the system S4 is that the number of distinct modalities in S4 is quite limited. To be
precise, there are only fourteen distinct modalities in S4:
–; □; ◇; □◇; ◇□; □◇□; ◇□◇; ~; ~□; ~◇; ~□◇; ~◇□; ~□◇□; ~◇□◇
Any other modality is equivalent to one of the above in the system S4, which is
demonstrated on p. 55 of the textbook.
Obviously, to remember these distinct modalities in S4, it is sufficient to remember the
seven affirmative modalities (i.e., the first seven in the above list). The other seven are
simply the corresponding negations. The relative strength of the affirmative modalities in
S4 is summarized in the diagram on p. 56.
By contrast, there are no reduction laws in the system T. As a result, there are infinitely
many distinct modalities in the system T (and of course also infinitely many distinct
modalities in the even weaker systems K and D).
5.3 The frames for S4 and S4-validity
The formula 4 is valid on every transitive frame (i.e., every frame in which the
accessibility relation is transitive). The proof of this fact is simple, and is given on p. 57.
It is also easy to show that the formula 4 is not valid on any frame that is not transitive.
Here is a proof. Suppose F = <W, R> is a frame that is not transitive. That means there
are w, w’, w’’∈W such that (w, w’)∈R and (w’, w’’)∈R, but (w, w’’)∉R. Since (w,
w’’)∉R, we can define a model M = <F, V> such that V(p, w’’) = 0, and for every u such
that (w, u)∈R, V(p, u) = 1. In this model, V(□p, w) = 1. However, V(□p, w’) = 0 because
(w’, w’’)∈R and V(p, w’’) = 0. It follows that V(□□p, w) = 0, because (w, w’)∈R. As a
result, V(□p→□□p, w) = 0. Hence the formula 4 is not valid on the frame.
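The construction in this proof is effective, so it can be written as a small procedure. The sketch below is an illustration under stated assumptions: the names holds and refute_4 and the tuple encoding are hypothetical, and where the proof leaves the value of p open (at worlds that are neither w’’ nor accessible from w) the code simply makes p false. Given a finite frame, it looks for a failure of transitivity and returns the valuation together with the world w at which formula 4 is false.

```python
def holds(f, w, W, R, V):
    """Truth of formula f at world w in the model <W, R, V>."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)] == 1
    if op == "imp":
        return (not holds(f[1], w, W, R, V)) or holds(f[2], w, W, R, V)
    if op == "box":
        return all(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    raise ValueError(f"unknown operator: {op}")

def refute_4(W, R):
    """Return (V, w) such that 4 is false at w, or None if R is transitive."""
    for w in W:
        for w1 in W:
            for w2 in W:
                if (w, w1) in R and (w1, w2) in R and (w, w2) not in R:
                    # p is true exactly at the worlds accessible from w,
                    # so in particular p is false at w2.
                    V = {("p", u): 1 if (w, u) in R else 0 for u in W}
                    return V, w
    return None

p = ("var", "p")
four = ("imp", ("box", p), ("box", ("box", p)))   # □p → □□p

# A hypothetical non-transitive frame: 1 sees 2, 2 sees 3, but 1 does not see 3.
W = [1, 2, 3]
R = {(1, 2), (2, 3)}
V, w = refute_4(W, R)
print(w, V)
print(holds(four, w, W, R, V))                    # False

# Adding the missing pair makes the frame transitive: nothing to refute.
print(refute_4(W, R | {(1, 3)}))                  # None
```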
Recall that the formula T is valid on a frame if and only if the frame is reflexive. Since
S4 = K+T+4, the frames for the system S4 are precisely those frames that are both
reflexive and transitive. We can thus define S4-validity as validity relative to the set of
frames that are both reflexive and transitive. In order to show that a formula is not a
theorem of S4, it is sufficient to show that the formula is not valid on a frame that is both
reflexive and transitive.
For example, we can show that the formula ◇p → □◇p is not a theorem of S4, by noting
that it is not valid on the following model: W={w1, w2}; R = {(w1, w1), (w1, w2), (w2, w2)};
V(p, w1) =1, V(p, w2) =0. Since the frame of the model is reflexive and transitive, every
theorem of S4 is valid on the frame. It follows that ◇p → □◇p is not a theorem of S4. We
now see that ◇ and □◇ are really distinct modalities in S4, as we claimed in the previous
section. Similarly, you can show this for every other pair of modalities claimed to be
distinct in S4.
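The three facts used in this argument can be checked directly on the model. The sketch below is illustrative only (the tuple encoding and the name holds are assumptions of this example): it confirms that the frame is reflexive and transitive, that ◇p → □◇p is false at w1, and that T and 4 hold at both worlds under all four valuations of p, as they must on a reflexive and transitive frame.

```python
from itertools import product

def holds(f, w, W, R, V):
    """Truth of formula f at world w in the model <W, R, V>."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)] == 1
    if op == "imp":
        return (not holds(f[1], w, W, R, V)) or holds(f[2], w, W, R, V)
    if op == "box":
        return all(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    if op == "dia":
        return any(holds(f[1], u, W, R, V) for u in W if (w, u) in R)
    raise ValueError(f"unknown operator: {op}")

W = ["w1", "w2"]
R = {("w1", "w1"), ("w1", "w2"), ("w2", "w2")}
V = {("p", "w1"): 1, ("p", "w2"): 0}

reflexive = all((w, w) in R for w in W)
transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

p = ("var", "p")
T = ("imp", ("box", p), p)                          # □p → p
four = ("imp", ("box", p), ("box", ("box", p)))     # □p → □□p
E = ("imp", ("dia", p), ("box", ("dia", p)))        # ◇p → □◇p

E_at_w1 = holds(E, "w1", W, R, V)

# T and 4 under every valuation of p on this frame.
T_and_4_valid = all(
    holds(f, w, W, R, {("p", "w1"): a, ("p", "w2"): b})
    for f in (T, four)
    for a, b in product([0, 1], repeat=2)
    for w in W
)

print(reflexive, transitive)    # True True
print(E_at_w1, T_and_4_valid)   # False True
```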
The system S4 is also complete with respect to S4-validity. That is, every formula that is
S4-valid can be proved in S4. So the theorems of S4 comprise all and only those formulas
that are valid on every frame that is both reflexive and transitive.
5.4 The system S5
An even stronger system than S4 is the system S5 (which is equivalent to the fifth system
of strict implication developed by Lewis). S5 can be obtained by adding to T the formula
E (or 5): ◇p→□◇p, as an extra axiom. That is, S5 = T + E = K + T + E.
The fact that S5 contains S4 follows directly from the fact that the formula 4 is a theorem
of S5. This is proved on p. 58. But the formula E is not a theorem of S4, as we showed in
the last section (see also p. 59). Therefore, S5 is a proper extension of S4.
The textbook proves the following theorems of S5 (that are not theorems of S4):
S5(1) p  p
S5(2) ◇p ↔ □◇p
S5(3) □p ↔ ◇□p
S5(4) (p  q)  (p  q)
S5(5) (p  q)  (p  q)
S5(6) (p  q)  (p  q)
S5(7) (p  q)  (p  q)
S5(8) p  p
S5(9) p  p
You should know how to prove them in S5.
Let me remark in passing that theorems S5(2)-S5(7), together with the reduction laws
already in S4, can be used to prove an important technical result: in S5, every formula is
equivalent to a formula in which no modal operator is within the scope of another modal
operator. The absence of nested modal operators is a crucial feature of the so-called
modal conjunctive normal form (模態合取範式), and a very important and useful result
is that in S5, every formula is equivalent to a formula in the modal conjunctive normal
form. I shall not bombard you with the technical details on this issue.
5.5 Modalities in S5
There is, however, a simpler point you should be able to fully appreciate. In addition to
the reduction laws already in S4, we now have two more powerful reduction laws in S5,
namely, S5(2) and S5(3). Among the seven distinct affirmative modalities in S4, □◇ is
equivalent to ◇ in S5 by S5(2), ◇□◇ is equivalent to ◇ in S5 by S5(2) and S4(3), ◇□ is
equivalent to □ in S5 by S5(3), and □◇□ is equivalent to □ in S5 by S5(3) and S4(2). So
what remain as distinct affirmative modalities in S5 are at most three: –, □, and ◇. In
fact they are truly distinct in S5, as we will be able to show shortly. Therefore, there are
six distinct modalities in S5:
–; □; ◇; ~; ~□; ~◇
Thus the system S5 is strong enough to render iterated modalities redundant.
5.6 The frames for S5 and S5-validity
The formula E is valid on a frame if and only if the frame is Euclidean (hence the name
E). This is not hard to show. Let F=<W, R> be any frame. Suppose it is Euclidean. We
show that there is no model based on F that makes E invalid. Suppose for the sake of
contradiction that there is such a model M = <F, V> such that V(E, w)=0 for some w∈W.
Then it must be that V(◇p, w) = 1 and V(□◇p, w) = 0. Since V(◇p, w) = 1, there must be a
w’ accessible from w at which V(p, w’) = 1. Since V(□◇p, w) = 0, there must be a w’’
accessible from w at which V(◇p, w’’) = 0. Now we have (w, w’’)∈R and (w, w’)∈R,
which implies that (w’’, w’)∈R, for R is Euclidean. Since V(◇p, w’’)=0 and (w’’, w’)∈R,
it has to be that V(p, w’) = 0, which contradicts the earlier requirement that V(p, w’) = 1.
Hence there is no such model, which means E is valid on F.
For the other direction, suppose F is not Euclidean. It means that there are w, w’, w’’∈W
such that (w, w’)∈R and (w, w’’)∈R, but (w’, w’’)∉R. Since (w’, w’’)∉R, we can define
a model M = <F, V> such that V(p, w’’) = 1 and for every u such that (w’, u)∈R, V(p, u)
= 0. In this model, V(◇p, w’) = 0, and hence V(□◇p, w) = 0 because (w, w’)∈R. But V(◇p,
w) = 1, because (w, w’’)∈R and V(p, w’’) = 1. So, V(◇p→□◇p, w) = 0. Hence the formula
E is not valid on F.
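The correspondence between E and the Euclidean condition can be sanity-checked by brute force on small frames. The sketch below is an illustration, not a proof: it enumerates every accessibility relation on three worlds (the names `WORLDS`, `is_euclidean`, and `E_valid_on` are assumptions introduced here) and compares "R is Euclidean" with "E is true at every world under every valuation of p".

```python
from itertools import product

WORLDS = (0, 1, 2)
PAIRS = [(a, b) for a in WORLDS for b in WORLDS]

def is_euclidean(R):
    """If w sees u and w sees v, then u sees v."""
    return all((b, c) in R for (a, b) in R for (a2, c) in R if a == a2)

def E_valid_on(R):
    """Check E (possibly p implies necessarily possibly p) at every world, for every valuation of p."""
    succ = {w: [u for (x, u) in R if x == w] for w in WORLDS}
    for bits in product([False, True], repeat=len(WORLDS)):
        val = dict(zip(WORLDS, bits))
        dia_p = {w: any(val[u] for u in succ[w]) for w in WORLDS}
        for w in WORLDS:
            box_dia_p = all(dia_p[u] for u in succ[w])
            if dia_p[w] and not box_dia_p:
                return False   # found a world where E is false
    return True

# All 2^9 relations on a three-element set of worlds.
frames = [frozenset(pair for pair, keep in zip(PAIRS, mask) if keep)
          for mask in product([False, True], repeat=len(PAIRS))]

agree = all(is_euclidean(R) == E_valid_on(R) for R in frames)
print(len(frames), agree)   # 512 True
```

Agreement on all 512 frames is what the two-direction argument above predicts; the enumeration itself covers only three-world frames.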
Since S5 = K+T+E, the frames for the system S5 are precisely those frames that are both
reflexive and Euclidean. It is easy to show that a relation is reflexive and Euclidean if and
only if it is reflexive, symmetric, and transitive (Try it!). A relation that is reflexive,
symmetric, and transitive is also called an equivalence relation (等價關係). So the
frames for S5 are precisely equivalence frames, i.e., those frames in which the
accessibility relation is reflexive, symmetric, and transitive.
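Before attempting the "Try it!" proof, one can at least confirm the claim on small cases. The following sketch (an illustration only; all function names are assumptions introduced here) checks every relation on a three-element set and compares "reflexive and Euclidean" with "reflexive, symmetric, and transitive".

```python
from itertools import product

WORLDS = (0, 1, 2)
PAIRS = [(a, b) for a in WORLDS for b in WORLDS]

def reflexive(R):
    return all((w, w) in R for w in WORLDS)

def symmetric(R):
    return all((b, a) in R for (a, b) in R)

def transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def euclidean(R):
    return all((b, c) in R for (a, b) in R for (x, c) in R if a == x)

def equivalence(R):
    return reflexive(R) and symmetric(R) and transitive(R)

relations = [frozenset(pair for pair, keep in zip(PAIRS, mask) if keep)
             for mask in product([False, True], repeat=len(PAIRS))]

agree = all((reflexive(R) and euclidean(R)) == equivalence(R) for R in relations)
n_equiv = sum(1 for R in relations if equivalence(R))

print(agree, n_equiv)   # True 5
```

The five equivalence relations found correspond to the five ways of partitioning three worlds into equivalence classes.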
We can thus define S5-validity as validity relative to the set of equivalence frames. The
notion, however, can be further simplified. What is particularly nice about an equivalence
relation R over W is that W can be partitioned into a bunch of disjoint subsets (called
equivalence classes) such that in each subset every member has the relation R with every
member, but no member in a subset has the relation R with any member in a different
subset. Given this fact, it is easy to show that a formula is valid on every equivalence
frame if and only if it is valid on every universal frame, i.e., on every frame in which
every world is accessible from every world (or say, R=W×W). Therefore, we may well
define S5-validity as validity relative to the set of universal frames.
In order to show that a formula is not a theorem of S5, it is sufficient to present a
universal frame on which the formula is not valid. For example, we can show that the
formula p → □p is not a theorem of S5, by noting that the formula is not valid on the
following model: W={w1, w2}; R = W×W; V(p, w1) =1, V(p, w2) =0. Since the frame of
this model is a universal frame, every theorem of S5 is valid on the frame. It follows that
p → □p is not a theorem of S5. Thus – and □ are really distinct in S5. Similarly, we can
show that the formula ◇p → p is not a theorem of S5.
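The two-world universal model can be checked mechanically. In the sketch below (an illustration; the tuple encoding, the helper `holds`, and the ASCII labels `[]` for □ and `<>` for ◇ are assumptions introduced here), three candidate collapsing principles are evaluated at both worlds. Each of them is falsified somewhere, so none is a theorem of S5.

```python
def holds(f, w, R, V):
    """True iff formula f is true at world w; R is a set of pairs, V maps (letter, world) to bool."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)]
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    successors = [u for (x, u) in R if x == w]
    if op == "box":
        return all(holds(f[1], u, R, V) for u in successors)
    if op == "dia":
        return any(holds(f[1], u, R, V) for u in successors)
    raise ValueError(f"unknown operator: {op}")

W = ["w1", "w2"]
R = {(a, b) for a in W for b in W}            # universal frame: R = W x W
V = {("p", "w1"): True, ("p", "w2"): False}

p = ("var", "p")
candidates = {
    "p -> []p": ("imp", p, ("box", p)),
    "<>p -> p": ("imp", ("dia", p), p),
    "<>p -> []p": ("imp", ("dia", p), ("box", p)),
}

# For each candidate, list the worlds at which it is false.
falsified_at = {name: [w for w in W if not holds(f, w, R, V)]
                for name, f in candidates.items()}
print(falsified_at)
```

Since one falsifying world on one universal frame is enough, this single model separates actuality, necessity, and possibility in S5.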
Two implications of these simple results are worth noting. First, strong as it is, S5 is still
consistent, in that it does not prove every formula. Second, although S5 renders iterated
modalities redundant, it is still properly modal, in that it does not conflate necessity,
possibility and actuality.
Finally, let us also note that the system S5 is in fact complete with respect to S5-validity.
That is, every formula that is S5-valid can be proved in S5. So the theorems of S5
comprise all and only those formulas that are valid on every universal frame.
5.7 Philosophical considerations: which system is the right one?
The five systems we have studied (K, D, T, S4, and S5) are among the best-known
modal systems (though they constitute only a very small sample). From a
symbolic logician’s perspective, modal systems are studied for their own sake. Whether
or not they have philosophical or other applications is quite another matter. Philosophers,
however, are bound to ask which modal system captures the correct logical laws of
modality.
Since there are different kinds of modality, this philosophical question is really a family
of questions. There is no time for us to enter a careful philosophical investigation here. I
will just briefly comment on this question regarding logical necessity/possibility and
regarding physical necessity/possibility.
If we agree that the general framework of the possible-worlds semantics can handle
logical as well as physical modality (with different kinds of modality corresponding to
different kinds of frames), then it is safe to regard the theorems of K as among the correct
logical laws for both logical and physical modalities, because the theorems of K are valid
on every frame. However, the system K is presumably too weak. Our intuition seems
clear that the formula D and the formula T express valid laws for both logical and
physical modalities, but the system K does not prove them.
Intuitively, then, the right system for logical necessity/possibility and that for physical
necessity/possibility are at least as strong as the system T. Should they be even stronger?
Well, for logical necessity/possibility, it is generally thought that the right system is
stronger than T. In fact, the usual choice is the system S5. This view may be appreciated
in two ways. First, to many ears the formula E just sounds right for logical
necessity/possibility: if a proposition is logically possible, then this very possibility is a
logical truth and hence is logically necessary. This intuition is probably due to the belief
that logical laws are invariant across all possible worlds (indeed, the setup of our
possible-worlds semantics requires that every world should obey classical logic). If each
world has the same logical laws, then a proposition is logically possible in one world if
and only if it is logically possible in all worlds. Related to this thought is a second way to
appreciate the point. For the logical sense of possibility, the accessibility relation should
be a universal one: every world is logically possible relative to every world. If so, all the
theorems of S5 should be correct laws, because they are valid on universal frames. The
upshot is that the right system for logical necessity/possibility should be at least as strong
as S5. Given how strong the system S5 already is (recall that all iterated modalities are
reducible in S5), it is reasonable to think that the right system is precisely S5.
How about physical necessity/possibility? There are very good reasons to think that S5 is
too strong for physical necessity/possibility. Recall that I motivated the accessibility
relation in lecture 3 by arguing that not every world is physically possible relative to
every world. Underlying the argument is the intuition that physical laws are not the exact
same in all possible worlds. If the physical laws of world w are not all true at world w’,
then w’ is not physically possible relative to w. More to the point, the accessibility
relation for the physical modality does not have to be symmetric. For example, consider a
possible world in which all our actual physical laws are true. In addition, suppose that
this possible world has additional physical laws, say, that every gold bar has a mass of
at most 100kg. This world is physically possible relative to our world, but conversely,
our world is not physically possible relative to it (the additional law about gold bars is
false in our world: we have gold bars over 100kg). Thus, the formula E is not valid: even
though it is possible at our world that gold bars are over 100kg, it is not so in all worlds
accessible from our world, and so this possibility is not (physically) necessary at our
world.
So S5 is too strong for physical necessity/possibility. How about S4? One argument in
support of S4 for this purpose goes like this. The condition for relative possibility in the
physical sense should be the preservation of physical laws: w’ is possible relative to w if
and only if all the physical laws in w are still physical laws in w’. If this is the correct
condition, then the accessibility relation suitable for physical necessity/possibility has to
be transitive. It follows that the formula 4 is a valid law for physical necessity.
However, the crucial premise in this argument is questionable. For w’ to be physically

possible relative to w, one might argue, all that is needed is that the physical laws of w are
true in w’, whether or not they are still laws of w’. The difference is that under this
weaker condition, w’ is accessible from w even if some physical laws of w are not laws of
w’ but happen to be true in w’. If so, another world w’’ accessible from w’ may violate
the physical laws of w (without violating the physical laws of w’) and hence fail to be
accessible from w, contrary to the requirement of transitivity. Then the formula 4 would
not be valid and the system S4 would be too strong for physical necessity/possibility.
Which condition of accessibility is metaphysically apt is debatable. I shall not pursue the
matter further here, but a popular opinion, which I share, is that the logic for physical
necessity/possibility is no stronger than T.
Recall now one of the sample arguments we discussed at the very beginning: It is
possible that Superman exists. Therefore, it is necessarily possible that Superman exists.
This argument is obviously an instance of E. Given the previous discussions, it is fair to
say that the argument is presumably valid if necessity and possibility are understood in
the logical sense, but not valid if they are understood in the physical sense.
Exercise 5
1. For each of the following formulas, determine whether it is a theorem of S4. If it is,
give a proof in S4 to show it is. If not, present a counter-model to show it is not, and
determine whether it is a theorem of S5. If it is, give a proof in S5 to show it is. If not,
present a counter-model to show it is not.
(a) (p  q)  (p  q)
(b) p  p
(c) p  p
(d) (p  q)  (p  q)
(e) p  p
(f) (p  q)  (p  q)
(g) ~(p  p)
(h) (p  q)  (q  p)
2. If we add the formula B: p → □◇p as an extra axiom to the system T, we get the system
known as the Brouwerian system or simply the system B (pp. 62-63). That is, B = T + B.
Show the following:
(a) S4 is not an extension of B;
(b) The formula B is valid on a frame if and only if the frame is symmetric;
(c) A frame is a frame for B if and only if the frame is both reflexive and symmetric;
(d) B is not an extension of S4.
3. Consider the system D + B + E and the system S4 + B. Show that these two systems
are equivalent.
4. The converse of the formula T is the formula p → □p, which we can call Tc. Show that
the system D + Tc is a proper extension of S5. How many distinct modalities are there in
D + Tc?
5. What are the reasons to think S5 captures the correct logic for logical necessity but is
too strong for physical necessity?
Lecture 6
6.1 Testing for Validity
This lecture is essentially a systematic review of a method for testing validity (relative to
a certain kind of frames) in modal logic. I regard it as a review because we have been
using this method all along. The method is a generalization of the Reductio method for
PC (pp. 11-12). The purpose is to figure out, for a given formula, whether there is a
counter-model of a certain kind to the formula. If there is such a counter-model, then the
formula is not valid relative to that kind of frames. If there is no such counter-model, then
the formula is valid relative to that kind of frames.
For example, we have used this method to show that the formula K is valid relative to the
set of all frames, D valid relative to the set of serial frames, T valid relative to the set of
reflexive frames, 4 valid relative to the set of transitive frames, and E valid relative to the
set of Euclidean frames. On the other hand, we have also used this method to show that D
is not valid relative to the set of all frames, T not valid relative to the set of serial frames,
4 not valid relative to the set of reflexive frames, and E not valid relative to the set of
frames that are both reflexive and transitive. There are many more examples in the exercises.
So the method is not new. What we shall do here is learn a particular tool for presenting
the process of the method, known as semantic diagrams (語義圖). Let me hasten to add
that the significance of semantic diagrams goes far beyond being a presentation tool.
With them one can prove important meta-theoretical results, such as decidability and
completeness of various systems. However, we will not enter the complicated meta-
theory in this course. For us, therefore, the main purpose of learning semantic diagrams is
to facilitate the application and presentation of the Reductio method for testing validity.
For this reason, I will not strive for a fully rigorous definition of the rules for constructing
semantic diagrams. Instead I will emphasize the essential ideas and use various examples
to illustrate them.
6.2 K-diagrams
Let us begin with a familiar example. I quote from the notes for lecture 3:
“Lastly, let us look at the example discussed in the textbook (p. 20), the formula
K: □(p→q) → (□p → □q). This formula turns out to be valid on every frame. To
show this, we can use the Reductio method to show that there is no counter-model
to K. Suppose we could find a model in which K is false at some world w. That
means □(p→q) has to be true at w and □p→□q has to be false at w. In order for
□p→□q to be false at w, □p has to be true at w, and □q has to be false at w. In
order for □q to be false at w, there needs to be a world w’ accessible from w at
which q is false. Now since w’ is accessible from w, in order for □p to be true at w,
p has to be true at w’, and in order for □(p→q) to be true at w, p→q has to be true
at w’. In order for p→q and p to be both true at w’, q has to be true at w’. Then
we have to make q both false and true at the same world w’, which is not allowed
in a model. Therefore, there is no counter-model. In other words, K is valid on
every model, and hence valid on every frame.”
Instead of writing the reasoning process in words, we can present it in a diagram.
w1   □(p→q) → (□p → □q)
     1      0  1  0 0
       ↓
w2   q    p → q
     0    1 1 1
I hope it is fairly easy to see the match between the diagram and the quoted words.
Basically we start with a rectangle representing an arbitrary world w1. In the rectangle we
write down the formula we want to test, and write 0 under its main operator to indicate
that we suppose it false at w1. We then derive more truth values from this supposition,
until we either get a solution to satisfy the supposition or derive a contradiction. You
knew how to handle the truth-functional operators even before taking this course. For the
modal operators, we handle them according to the following rules:
(1) If in a world w there is a formula of the form □α evaluated as 0, then there must be a
world accessible from w in which α is assigned 0.
(2) If in a world w there is a formula of the form ◇α evaluated as 1, then there must be a
world accessible from w in which α is assigned 1.
(3) If in a world w there is a formula of the form □α evaluated as 1, then in every world
accessible from w, α must be assigned 1.
(4) If in a world w there is a formula of the form ◇α evaluated as 0, then in every world
accessible from w, α must be assigned 0.
Note that (1) and (2) are responsible for generating new worlds in a diagram. (3) and (4),
by contrast, do not introduce new worlds. In the example above, w2 is generated
according to rule (1). The truth value of p→q and that of p in w2 are determined
according to rule (3).
It should be obvious that in the sample diagram, the arrow from world w1 to world w2
means that w2 is accessible from w1. In general, we draw an arrow from w to w’ to mean
that w’ is accessible from w.
A diagram is completed when none of (1)-(4) or the familiar rules for truth-functional
operators is applicable.
For a second example, consider the formula (◇p∧◇q) → ◇(p∧q). The process of finding a
counter-model to this formula is documented in the following diagram:
w1   (◇p ∧ ◇q) → ◇(p ∧ q)
      1  1 1   0 0
w2   p    p ∧ q
     1    1 0 0
w3   q    p ∧ q
     1    0 0 1
(arrows: w1 → w2, w1 → w3)
In this diagram, w2 and w3 are introduced according to rule (2). The formula p∧q is
assigned 0 in both w2 and w3 due to rule (4).
There is no contradictory truth assignment in the diagram. The truth assignments and
accessibility relations in the diagram are sufficient to satisfy the original requirement. It is
then easy to construct a counter-model from the diagram:
W = {w1, w2, w3}; R = { (w1, w2), (w1, w3) };
V(p, w1) =1, V(p, w2) = 1, V(p, w3) = 0, V(q, w1) =1, V(q, w2) =0, V(q, w3) =1.
Note that the diagram does not force a particular truth assignment at w1. That just means
we can choose any truth assignment at w1. I choose to assign 1 to both p and q at w1, but
any other choice will do.
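A counter-model read off a diagram can always be double-checked by direct evaluation. The sketch below is an illustration only (the tuple encoding and the helper `holds` are assumptions introduced here); it evaluates (◇p ∧ ◇q) → ◇(p ∧ q) in the three-world model just constructed.

```python
def holds(f, w, R, V):
    """True iff formula f is true at world w; R is a set of pairs, V maps (letter, world) to bool."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)]
    if op == "and":
        return holds(f[1], w, R, V) and holds(f[2], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    successors = [u for (x, u) in R if x == w]
    if op == "box":
        return all(holds(f[1], u, R, V) for u in successors)
    if op == "dia":
        return any(holds(f[1], u, R, V) for u in successors)
    raise ValueError(f"unknown operator: {op}")

# Counter-model from the K-diagram (values at w1 were a free choice).
W = ["w1", "w2", "w3"]
R = {("w1", "w2"), ("w1", "w3")}
V = {("p", "w1"): True, ("p", "w2"): True, ("p", "w3"): False,
     ("q", "w1"): True, ("q", "w2"): False, ("q", "w3"): True}

p, q = ("var", "p"), ("var", "q")
formula = ("imp", ("and", ("dia", p), ("dia", q)), ("dia", ("and", p, q)))

truth_by_world = {w: holds(formula, w, R, V) for w in W}
print(truth_by_world)   # {'w1': False, 'w2': True, 'w3': True}
```

The formula is false at w1, as the diagram requires; at w2 and w3 it is vacuously true because those worlds have no accessible worlds.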
In the previous two examples, we did not impose any restriction on the accessibility
relation in the diagrams. In other words, we were looking for a counter-model of any kind.
If we fail, that means there is no counter-model whatsoever, and the formula under test is
K-valid. Therefore, diagrams constructed without any restriction on the accessibility
relation are well-suited for testing K-validity. We can call them K-diagrams.
You should be reminded that truth functional operators may allow multiple ways to
satisfy a required truth value, and sometimes we have to deal with such situations by
considering alternatives. For example, suppose our task is to figure out whether the
formula ◇(p→q) ↔ (□p → ◇q) is K-valid. We start with the rectangle:
w1   ◇(p→q) ↔ (□p → ◇q)
            0
From this initial supposition we cannot proceed with unambiguous assignments to the
component formulas. There are two alternatives to consider:
w1(i)    ◇(p→q) ↔ (□p → ◇q)
         1      0     0
w1(ii)   ◇(p→q) ↔ (□p → ◇q)
         0      0     1
We need to check each alternative separately. The diagram for w1(i) is very smooth and
reveals a contradiction quickly (try it!). The diagram for w1(ii), however, cannot proceed
without branching into multiple cases again:
w1(ii)(i)    ◇(p→q) ↔ (□p → ◇q)
             0      0  1  1 1
w1(ii)(ii)   ◇(p→q) ↔ (□p → ◇q)
             0      0  0  1
Again, we need to check each alternative. The diagram for w1(ii)(i) and that for w1(ii)(ii)
are both smooth and reveal contradictions (try them!). Hence the formula is K-valid.
Obviously it is advisable to deal with alternatives only when there is no other way to
proceed. When we have to consider alternatives, remember that we need to check every
alternative before concluding validity. But as long as one alternative gives a consistent
solution, we can stop and conclude invalidity.
6.3 D-diagrams
If our purpose is to check whether or not a formula is D-valid, that is, valid relative to the
set of serial frames, then in constructing the semantic diagrams, we need to follow an
extra requirement that the accessibility relation be serial. We can call such diagrams D-
diagrams.
Consider the formula D for a simple illustration. The K-diagram for D is simply:
w1   □p → ◇p
     1  0 0
We are done with a counter-model: W = {w1}; R = ∅; V(p, w1) = 0 (or 1, doesn’t matter).
So D is not K-valid.
However, for a D-diagram, the accessibility relation has to be serial, which means that
the diagram should be continued with a world accessible from w1.
w1   □p → ◇p
     1  0 0
       ↓
w2   p   p
     1   0
Given w2 in the diagram, rules (3) and (4) become applicable. (3) requires value 1 for p in
w2 (for □p is evaluated 1 in w1), and (4) requires value 0 for p in w2 (for ◇p is evaluated 0
in w1); hence a contradiction. It is thus shown that there is no counter-model to D based
on a serial frame. Therefore, D is D-valid.
Now consider the formula □□(p→q) → (□p → □q). The K-diagram for the formula is:
w1   □□(p→q) → (□p → □q)
     1       0  1  0 0
       ↓
w2   q   p   □(p→q)
     0   1   1
We can then construct a counter-model from this diagram: W = {w1, w2}; R = {(w1, w2)};
V(p, w1) = V(p, w2) = 1, V(q, w1) = V(q, w2) = 0. (Again, the diagram shows clearly that
it does not matter what the truth assignment at w1 is. Just choose one.) So the formula is not
K-valid.
To test D-validity, the diagram needs to be continued with a world accessible from w2.
w1   □□(p→q) → (□p → □q)
     1       0  1  0 0
       ↓
w2   q   p   □(p→q)
     0   1   1
       ↓
w3   p → q
       1
There are alternative ways to make p→q true at w3. Any alternative gives us a
counter-model. But it is not yet a counter-model based on a serial frame, because no world is
accessible from w3 in the diagram. Obviously if we keep introducing new worlds, the
requirement of seriality will force us to go on forever. Fortunately, it is equally obvious that
since there is no modal operator in the w3 rectangle, we can make any world already in
the diagram accessible from w3 without any further effect. Indeed we can just make w3
itself accessible from w3.
Here then is a D-diagram that reveals a counter-model to □□(p→q) → (□p → □q) based
on a serial frame:
w1   □□(p→q) → (□p → □q)
     1       0  1  0 0
       ↓
w2   q   p   □(p→q)
     0   1   1
       ↓
w3   p → q
       1
(arrows: w1 → w2, w2 → w3, w3 → w3)
So, □□(p→q) → (□p → □q) is not D-valid either.
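The serial counter-model can be verified by direct evaluation. The sketch below is illustrative: the tuple encoding and the helper `holds` are assumptions introduced here, and so is the particular choice of values at w3 (p false, q false), which is just one of the alternatives that make p→q true there.

```python
def holds(f, w, R, V):
    """True iff formula f is true at world w; R is a set of pairs, V maps (letter, world) to bool."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)]
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    successors = [u for (x, u) in R if x == w]
    if op == "box":
        return all(holds(f[1], u, R, V) for u in successors)
    raise ValueError(f"unknown operator: {op}")

W = ["w1", "w2", "w3"]
R = {("w1", "w2"), ("w2", "w3"), ("w3", "w3")}   # w3 is made accessible from itself
V = {("p", "w1"): True, ("p", "w2"): True, ("p", "w3"): False,   # values at w3 are an assumed choice
     ("q", "w1"): False, ("q", "w2"): False, ("q", "w3"): False}

p, q = ("var", "p"), ("var", "q")
formula = ("imp", ("box", ("box", ("imp", p, q))),
                  ("imp", ("box", p), ("box", q)))

# Seriality: every world has at least one accessible world.
serial = all(any(x == w for (x, _) in R) for w in W)
false_at_w1 = not holds(formula, "w1", R, V)

print(serial, false_at_w1)   # True True
```

The frame is serial and the formula is false at w1, so the model is a genuine counter-model for D-validity.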
6.4 T-diagrams
In order to test T-validity, i.e., validity relative to the set of reflexive frames, we need to
obey the restriction that each world in the diagram is accessible from the world itself.
This means that whenever we see a formula □α with value 1 in a rectangle, we should
assign 1 to α in the same rectangle, and whenever we see a formula ◇α with value 0 in a
rectangle, we should assign 0 to α in the same rectangle.
For illustration, let us continue the example of □□(p→q) → (□p→□q), and draw a
T-diagram.
w1   □□(p → q) → (□p → □q)
     11 1 1 1  0  11 0 0
       ↓
w2   q   p   □(p → q)
     0   1   1 1 1 1
Note that in w1, from the initial supposition, we can derive a number of truth values, due
to the requirement that w1 is accessible from w1. Still, there is no contradiction in w1.
However, a contradiction arises quickly in w2, a world accessible from w1 in which q is
assigned 0 (in order to make □q 0 at w1). Hence □□(p→q) → (□p→□q) is T-valid.
Now consider a sample formula that is not T-valid, say, the formula 4: □p → □□p. Here
is the T-diagram for 4.
w1   □p → □□p
     11 0 0
       ↓
w2   p   □p
     1   0
       ↓
w3   p
     0
The counter-model is clear: W = {w1, w2, w3}; R = {(w1, w1), (w1, w2), (w2, w2), (w2, w3),
(w3, w3)}; V(p, w1) = V(p, w2) = 1, V(p, w3) = 0.
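As a check on this counter-model, the sketch below (an illustration; the tuple encoding and the helper `holds` are assumptions introduced here) confirms that the frame is reflexive but not transitive, and that 4: □p → □□p is false at w1.

```python
def holds(f, w, R, V):
    """True iff formula f is true at world w; R is a set of pairs, V maps (letter, world) to bool."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)]
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "box":
        return all(holds(f[1], u, R, V) for (x, u) in R if x == w)
    raise ValueError(f"unknown operator: {op}")

W = ["w1", "w2", "w3"]
R = {("w1", "w1"), ("w1", "w2"), ("w2", "w2"), ("w2", "w3"), ("w3", "w3")}
V = {("p", "w1"): True, ("p", "w2"): True, ("p", "w3"): False}

p = ("var", "p")
four = ("imp", ("box", p), ("box", ("box", p)))   # formula 4

reflexive = all((w, w) in R for w in W)
transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
four_at_w1 = holds(four, "w1", R, V)

print(reflexive, transitive, four_at_w1)   # True False False
```

Transitivity fails because w1 sees w2 and w2 sees w3, but w1 does not see w3; that missing arrow is exactly what lets □p hold at w1 while □□p fails.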
6.5 S4-diagrams
For S4-validity, we need to obey the restriction that the accessibility relation in the
diagram is both reflexive and transitive. So, in addition to having every world accessible
from itself, we also need to make sure that whenever there is a chain of arrows, →…→,
there should be an arrow from every world in the chain to every world later in the chain.
For example, in the previous diagram (for □p → □□p), there is a chain w1 → w2 → w3. In
order to construct the S4-diagram, we need to add w1 → w3. With that additional accessibility
we would have to also assign value 1 to p in w3 (for □p is 1 in w1) and derive a
contradiction in w3. Thus the S4-diagram for □p → □□p would show that it is S4-valid.
There is a technical complication that may arise in S4-diagrams. Consider testing □◇p →
◇□p for S4-validity. The diagram goes like this:
w1   □◇p → ◇□p
     11  0 00
w2   □p   ◇p   p
     01   11   1
w3   p   □p   ◇p
     0   00   10
w4   p   □p   ◇p
     0   00   10
w5   □p   ◇p   p
     01   11   1
w6   □p   ◇p   p
     01   11   1
w7   p   □p   ◇p
     0   00   10
(arrows: w1 → w2, w1 → w3, w2 → w4, w3 → w5, w4 → w6, w5 → w7)
This diagram is not yet completed, because rule (1) is still applicable in w6 and rule (2) is
still applicable in w7. Moreover, you should notice that the rectangle w6 has the same
content as w2 (and w7 has the same content as w3). So, if we follow rules (1) and (2) to
generate new worlds, the chain w1 → w2 → w4 → w6 → … will simply repeat the segment
w2 → w4 → w6 again and again. But we may well represent such a repeating chain by
going back to w2 from w4, without extending the chain indefinitely. (The case for the
repeating chain w1 → w3 → w5 → w7 → … is parallel.)
w1   □◇p → ◇□p
     11  0 00
w2   □p   ◇p   p
     01   11   1
w3   p   □p   ◇p
     0   00   10
w4   p   □p   ◇p
     0   00   10
w5   □p   ◇p   p
     01   11   1
(arrows: w1 → w2, w1 → w3, w2 → w4, w4 → w2, w3 → w5, w5 → w3)
From this diagram we see a counter-model to □◇p → ◇□p based on an S4-frame:

W = {w1, w2, w3, w4, w5}; R = { (w1, w1), (w1, w2), (w1, w3), (w1, w4), (w1, w5), (w2, w2),
(w2, w4), (w3, w3), (w3, w5), (w4, w4), (w4, w2), (w5, w5), (w5, w3) };
V(p, w1) = V(p, w2) = V(p, w5) = 1, V(p, w3) = V(p, w4) = 0.
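The five-world model can be checked in the same mechanical way. The sketch below is an illustration (the tuple encoding and the helper `holds` are assumptions introduced here); it confirms that R is reflexive and transitive and that □◇p → ◇□p is false at w1.

```python
def holds(f, w, R, V):
    """True iff formula f is true at world w; R is a set of pairs, V maps (letter, world) to bool."""
    op = f[0]
    if op == "var":
        return V[(f[1], w)]
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    successors = [u for (x, u) in R if x == w]
    if op == "box":
        return all(holds(f[1], u, R, V) for u in successors)
    if op == "dia":
        return any(holds(f[1], u, R, V) for u in successors)
    raise ValueError(f"unknown operator: {op}")

W = ["w1", "w2", "w3", "w4", "w5"]
R = {("w1", "w1"), ("w1", "w2"), ("w1", "w3"), ("w1", "w4"), ("w1", "w5"),
     ("w2", "w2"), ("w2", "w4"), ("w3", "w3"), ("w3", "w5"),
     ("w4", "w4"), ("w4", "w2"), ("w5", "w5"), ("w5", "w3")}
V = {("p", "w1"): True, ("p", "w2"): True, ("p", "w5"): True,
     ("p", "w3"): False, ("p", "w4"): False}

p = ("var", "p")
formula = ("imp", ("box", ("dia", p)), ("dia", ("box", p)))

reflexive = all((w, w) in R for w in W)
transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
false_at_w1 = not holds(formula, "w1", R, V)

print(reflexive, transitive, false_at_w1)   # True True True
```

Every world can reach a p-world, so □◇p holds at w1, yet every world can also reach a world where p fails, so □p holds nowhere and ◇□p is false at w1.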
More generally, we have the following rule in constructing S4-diagrams: if in a chain

there is a rectangle whose content is fully contained in another rectangle earlier in the
chain, then delete the later rectangle and re-direct the relevant arrows into the earlier
rectangle. With this rule, S4-diagrams are guaranteed to be finite.
6.6 S5-diagrams
Finally, for S5-validity, we need to obey the restriction that the accessibility relation is
universal. Consider, for illustration, the formula E: ◇p → □◇p. The S4-diagram for E is:
w1   ◇p → □◇p
     1  0 01
w2   p
     1
w3   ◇p
     00
(arrows: w1 → w2, w1 → w3)
There is hence a counter-model to E based on an S4-frame.
However, the S5-diagram for E is the following:
w1   ◇p → □◇p
     1  0 010
w2   p   p
     0   1
w3   ◇p
     00
(arrows: w1 ↔ w2, w1 ↔ w3, w2 ↔ w3)
(Note that we can use a double-headed arrow ↔ to denote two arrows, one in each
direction.) The contradiction arises in w2, because w2 is required to be also accessible
from w3 and hence p has to be assigned 0 in w2 as well. Therefore, E is S5-valid.
I shall end with two remarks. First, the counter-models constructed out of semantic
diagrams are usually not the simplest counter-models possible. For example, the S4-
diagram for E involves three worlds, but two worlds would be enough to make a counter-
model based on a frame that is both reflexive and transitive. The counter-model to □◇p →
◇□p we constructed earlier is also far from being the simplest. However, and this is the
second remark, a counter-model is a counter-model, even if not the simplest, and one
counter-model is enough to establish invalidity. What is good about the Reductio method
via semantic diagrams is that for K, D, T, S4, and S5 (and many other modal systems),
the method provides a mechanical procedure that is guaranteed to give you an answer to
the question of theoremhood or validity. Therefore, to put it in jargon, the systems K, D,
T, S4, and S5 are all decidable.
Lecture 7
7.1 Standard Deontic Logic
As I said earlier, several philosophically (or otherwise) important concepts seem to be

analogous to alethic modality, and logics for them may be developed in a fashion parallel
to modal logic. Indeed the machinery of modal logic has been applied to formalize logics
of obligation (deontic logic), of knowledge (epistemic logic), of belief (doxastic logic), of
time (temporal or tense logic), of process (dynamic logic), and of (counterfactual)
conditional, etc. Some applications have been more successful than others, but all are
areas of active research. We shall take a gentle look at a couple of examples.
Let’s start with deontic logic (道義邏輯), the logic of obligation (duty, ought). For the
kind of deontic logic we will discuss here, the formal language is essentially the same as
that for modal propositional logic. However, the intended interpretation of the operator □
becomes that of “deontic necessity”, so that □α reads like this: “it is obligatory or
required (to make it true) that α”, or simply “α is obligatory”. Accordingly, ◇α targets
“deontic possibility” or “it is permissible or permitted (to make it true) that α”. In order to
highlight this intended interpretation, people usually use OB (or simply O) for □ and PE
(or simply P) for ◇ in deontic logic. We will follow suit.
Just as there are different kinds or senses of necessity, there are different kinds or senses
of obligation, such as moral obligation and legal obligation. In what follows we take the
moral sense of the term and of other related terms (such as “ought”, “required”,
“permitted”, etc.), though most issues also arise in the legal context.
The so-called standard deontic logic (or minimal deontic logic) is given by one of our old
friends, the system D (for Deontic). Recall that in addition to the classical propositional
logic, the system D has the following principles in its axiomatic basis, now cast in the
new notations:
OB-K: OB(p → q) → (OBp → OBq)
OB-D: OBp → PEp
OB-N: if |– α, then |– OBα.
In plain words, OB-K says that if it is obligatory that if p then q and it is obligatory that p,
then it is also obligatory that q. OB-N says that every valid principle of (deontic) logic is
obligatory. (This sounds a bit strange. The usual apology is that since logical laws are
automatically fulfilled, taking them as obligations does not really matter but is technically
convenient.) Together they imply that obligations are, so to speak, closed under logical
consequence. That is, every logical consequence of an obligation is also obligatory. This
is reflected in the derived rule:
OB-DR1: if |– α → β, then |– OBα → OBβ.
OB-D says that whatever is obligatory is permissible. It sounds innocent enough, but it
has a controversial implication. Before we discuss the controversy, note that the standard
deontic logic settles for D because the principle T, though very plausible for alethic
modality, is too strong in the present context. The principle OB-T says that whatever is
obligatory is true (or actually fulfilled), which we know does not apply to our world.
The controversial implication of OB-D is that there are no conflicting or contradictory

obligations. That is, for every proposition p, it cannot be true that it is both obligatory
that p and obligatory that ~p. It is very easy to see this:
OB-D         (1) OBp → PEp
(1) Def PE   (2) OBp → ~OB~p
(2) Def →    (3) ~OBp ∨ ~OB~p
(3) PC       (4) ~(OBp ∧ OB~p)
This is controversial because there seem to be plenty of counterexamples. Just think of all
those moral dilemmas in life and in philosophy books. We will return to this issue later.
The formal, possible-worlds semantics is precisely the same as before. At an intuitive or

philosophical level, however, the accessibility relation is not to be understood as the
relation of relative possibility, but one of relative acceptability. The basic idea is that
relative to a world w, those and only those worlds which do not violate any obligation in
w are (deontically) acceptable (accessible). In other words, w’ is (deontically) acceptable
(accessible) relative to w if and only if all the obligations in w are fulfilled in w’.
(Sometimes these w-acceptable worlds are also referred to as the best of possible worlds
seen from w.) Understood this way, the view that OB-T is not a valid principle of deontic
logic just corresponds to the view that not every world is acceptable relative to itself, and
the view that OB-D is a valid principle of deontic logic just corresponds to the view that
relative to any world there is always an acceptable world. Once again, we can see that
OB-D requires that obligations be consistent: no matter which world you are in, the
obligations in that world can be fulfilled simultaneously at some world.
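The semantic point can be checked mechanically on a tiny model. The following is a minimal Python sketch, offered only as an illustration: the tuple encoding of formulas, the function name holds, and the two-world frames are all assumptions made for this example, not notation from these notes. It evaluates OBp → PEp and OBp → p on a serial frame in which world 0 is not acceptable relative to itself, and then on a frame in which world 0 has no acceptable world at all.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w, with R a set of (world, world) pairs
    and V a dict from atom names to the set of worlds where they are true."""
    op = f[0]
    if op == "atom":
        return w in V[f[1]]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "OB":  # true iff the argument holds at every acceptable world
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


def OB(f):
    return ("OB", f)


def PE(f):
    return ("not", OB(("not", f)))  # PE is defined as ~OB~


p = ("atom", "p")
OB_D = ("imp", OB(p), PE(p))
OB_T = ("imp", OB(p), p)

worlds = (0, 1)
V = {"p": {1}}              # hypothetical valuation: p is true at world 1 only
serial = {(0, 1), (1, 1)}   # every world sees some world; 0 does not see itself
dead_end = {(1, 1)}         # world 0 has no acceptable world at all

d_on_serial = all(holds(OB_D, w, serial, V) for w in worlds)
t_on_serial = all(holds(OB_T, w, serial, V) for w in worlds)
d_on_dead_end = all(holds(OB_D, w, dead_end, V) for w in worlds)
print(d_on_serial, t_on_serial, d_on_dead_end)  # True False False
```

On the serial frame OB-D holds everywhere while OB-T fails at world 0 (p is obligatory there but not true). On the second frame OB-D itself fails at world 0, where everything is vacuously obligatory and nothing is permitted.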
7.2 Stronger systems
Suppose we accept the system D as a sound system for deontic logic. Are there further
valid principles for deontic logic? Let’s consider some candidates we have seen. First,
consider the formula I called NT: OB(OBp → p). Intuitively it says that it is obligatory
that every obligation be fulfilled. Put this way, it sounds plausible to me (if
second-order obligations make sense at all): although not every obligation is actually
fulfilled, it seems reasonable to think that every obligation ought to be fulfilled.
As you have shown (in the midterm), NT is valid on all and only those frames that obey
secondary-reflexivity: for every w and w’, if w’ is accessible from w, then w’ is accessible
from itself. Thus the principle NT corresponds to the view that if a world is acceptable
relative to some world, then it is acceptable relative to itself. In other words, if a world is
not acceptable relative to itself, it is not acceptable relative to any world whatsoever. In
particular, if, as we agree, not all obligations in the actual world are fulfilled in the actual
world, then NT means that the actual world fails to fully respect the obligations of any
possible world. (I have to say it now sounds less plausible to me.)
Obviously, there are serial frames that are not secondary-reflexive. It follows that NT is
not a theorem of D, and D+NT gives us a stronger system. In this stronger system (but
not in D) we can easily derive OBOBp → OBp, which in plain words says that if it is
obligatory that p be obligatory, then p is indeed obligatory, or in other words, whatever
ought to be obligatory is indeed obligatory. This again sounds very plausible to me.
The converse of this formula is the familiar formula 4: OBp → OBOBp. It expresses the
idea that whatever is obligatory ought to be obligatory. To say this principle is valid is to
say the relation of relative acceptability is transitive: if world w’ is acceptable relative to
world w, and world w’’ is acceptable relative to world w’, then world w’’ is acceptable
relative to world w. Obviously, if we further add 4 into the system D+NT, we get an even
stronger system with the reduction law: OBp ↔ OBOBp.
Or one may find 4 but not NT a plausible logical principle for obligations, in which case
the system D+4 may be considered. It should be clear that D+4 is stronger than D (since
4 is not a theorem of D) but weaker than D+NT+4 (since NT is not a theorem of D+4),
and even D+NT+4 is still weaker than S4 (which, recall, is T+4).
Finally, the formula E: PEp → OBPEp also seems to express something sensible:
whatever is permissible ought to be permissible (or equivalently, whatever is not
obligatory ought not to be obligatory). It corresponds to the condition that if both w’ and
w’’ are acceptable seen from w, then w’ and w’’ are acceptable relative to each other.
Somehow many people regard this principle as more dubious than 4, but personally I find
E as plausible-sounding as 4.
It is very easy to show the system D+E contains NT. But D+E does not contain 4, nor
does D+NT+4 contain E. Thus, the system D+4+E is stronger than both D+E and
D+NT+4 (but of course is still weaker than S5).
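The semantic counterparts of these containment claims can be explored by brute force. The sketch below is an illustration under stated assumptions: it looks only at frames with exactly three worlds, and it works with the frame conditions (seriality for D, secondary-reflexivity for NT, transitivity for 4, and the condition described above for E, here called euclidean), not with derivations. All function names are made up for this example.

```python
from itertools import product

W = range(3)
PAIRS = [(x, y) for x in W for y in W]


def all_frames():
    """Yield every binary relation on W as a set of pairs (2**9 = 512 relations)."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair for pair, keep in zip(PAIRS, bits) if keep}


def serial(R):
    return all(any((x, y) in R for y in W) for x in W)


def transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)


def euclidean(R):
    # if y and z are both acceptable from x, then z is acceptable from y
    return all((y, z) in R for (x, y) in R for (x2, z) in R if x == x2)


def secondary_reflexive(R):
    return all((y, y) in R for (_, y) in R)


frames = list(all_frames())

# Every serial euclidean frame is secondary-reflexive (counterpart of "D+E contains NT").
e_gives_nt = all(secondary_reflexive(R) for R in frames if serial(R) and euclidean(R))
# Some serial euclidean frame is not transitive (counterpart of "D+E does not contain 4").
e_without_4 = any(serial(R) and euclidean(R) and not transitive(R) for R in frames)
# Some serial, secondary-reflexive, transitive frame is not euclidean.
nt4_without_e = any(
    serial(R) and secondary_reflexive(R) and transitive(R) and not euclidean(R)
    for R in frames
)
print(e_gives_nt, e_without_4, nt4_without_e)  # True True True
```

A search over small frames is of course not a proof, but the witnesses it finds are exactly the kind of frames one would use to show that one system does not contain a given principle.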
7.3 Problems with the Standard Deontic Logic
So much for the further principles that may be added to the system D. In fact, the
soundness of D for deontic logic is already dubious. The SEP entry parades a number of
“puzzles” that challenge the standard deontic logic. I will expect you to be aware of the
following four issues.
First, as already mentioned, the implication of no conflicting obligations is controversial.
It is not hard to recall or imagine scenarios in which multiple obligations seem to conflict
with one another. A person is obligated to return a rifle he borrowed from his friend but
may at the same time be obligated not to return it because his friend is trying to find a
weapon to kill an innocent person. A son is obligated to stay at home to take care of his
old mother but may at the same time be obligated to leave home to defend his country
against invasion. More commonly, a person may make solemn but contradictory
promises to different people, which would commit the person to two contradictory
obligations.
One way to respond to the challenge is to insist that such conflicts of obligations are
apparent but not real. In each case, at least one of the supposed obligations is not really
an obligation. To back up this response a substantive theory of obligations is needed.
Such a theory, if available at all, would presumably make a distinction between apparent
or prima facie obligations and real or strict obligations. It is fair to say that the standard
deontic logic cannot even express such a distinction.
Second, as also noted, in the standard deontic logic obligations are closed under logical
consequence. This closure has counterintuitive consequences. For example, it follows
from closure that OBp → OB(p ∨ q). Thus, since you are obligated to respect your
teachers, you are obligated to respect your teachers or kill them. Sounds very strange
indeed. Some people seem to think this is absurd because if it were true, then if you failed
to respect your teachers, you would be obligated to kill them. But this is a mistake. In the
standard deontic logic, from OB(p ∨ q) and ~p it does NOT follow that OBq. The really
disturbing consequence of taking “respect or kill” as a true obligation is rather that one
can fulfill an obligation by killing one’s teachers.
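The claim that OBq does not follow from OB(p ∨ q) and ~p can be confirmed with a small countermodel. The following Python sketch is only an illustration: the encoding of formulas as tuples, the name holds, and the particular two-world model are assumptions chosen for this example.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w in the model (R, V)."""
    op = f[0]
    if op == "atom":
        return w in V[f[1]]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "or":
        return holds(f[1], w, R, V) or holds(f[2], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "OB":  # true iff the argument holds at every acceptable world
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


p, q = ("atom", "p"), ("atom", "q")
R = {(0, 1), (1, 1)}          # a serial frame: world 1 is acceptable from both worlds
V = {"p": {1}, "q": set()}    # p: you respect your teachers; q: you kill them (never true)

premise_1 = holds(("OB", ("or", p, q)), 0, R, V)   # OB(p ∨ q) at world 0
premise_2 = holds(("not", p), 0, R, V)             # ~p at world 0
conclusion = holds(("OB", q), 0, R, V)             # OBq at world 0
closure_instance = all(
    holds(("imp", ("OB", p), ("OB", ("or", p, q))), w, R, V) for w in (0, 1)
)
print(premise_1, premise_2, conclusion, closure_instance)  # True True False True
```

At world 0 both premises are true and OBq is false, so the inference fails, while the closure instance OBp → OB(p ∨ q) holds at every world of the model.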
A second puzzle having to do with closure is known as the Good Samaritan Paradox.
Here is an instance. It is obligatory that I report plagiarism cases to the university. But my
reporting plagiarism cases entails that someone commits plagiarism. By closure, it is
obligatory that someone commits plagiarism, which sounds blatantly absurd. A common
diagnosis of this paradox is that the obligation of reporting plagiarism cases to the
university is a conditional obligation: obligation to report conditional on there being
plagiarism. But the standard deontic logic does not have adequate devices to express
conditional obligations.
Third, there is the following puzzle about permissions. If I say you are permitted to write
English or Chinese in the exam, I effectively give you a freedom to choose between
English and Chinese. It should thus follow that you are permitted to write English in the
exam AND you are permitted to write Chinese in the exam. But the logical form of the
argument is to conclude PEp ∧ PEq from the premise PE(p ∨ q), which is obviously not
valid in the system D. Moreover, it would be disastrous to strengthen D by adding
PE(p ∨ q) → (PEp ∧ PEq), for a contradiction is easily derivable in the resulting system.
In my opinion, however, this puzzle stems from familiar traps in natural languages.
Saying you are permitted to write English or Chinese is (a more convenient way of)
saying you are permitted to write English AND you are permitted to write Chinese. The
correct form for the statement is PEp ∧ PEq in the first place, despite the usage of the
English word “or”. You may challenge me by saying that if I don’t allow PE(p ∨ q) to be
the correct form of this sentence, then there will be no naturally formulated English
sentences that have the form PE(p ∨ q). But this does not bother me even if it is true. Why
can’t the formal language have redundant expressive power?
Fourth, there are puzzles with what are called contrary-to-duty imperatives. Consider the
classical example due to R. M. Chisholm (“Contrary-To-Duty Imperatives and Deontic
Logic”, Analysis, 24:2, 1963, pp. 33-36): “ … suppose: (1) it ought to be that a certain
man go to the assistance of his neighbours; (2) it ought to be that if he does go he tell
them he is coming; but (3) if he does not go then he ought not to tell them he is coming;
and (4) he does not go.”
(1) states a (primary) obligation which is violated according to (4). (2) states a (secondary)
obligation which is called off due to the violation of the obligation stated in (1). (3) states
a so-called contrary-to-duty imperative, which is an obligation that comes into effect
when another obligation is violated. Chisholm’s point is that (1)-(4) are consistent and
describe a realistic situation, but when we put them in the standard deontic logic we get
the following: (1’) OBp; (2’) OB(p → q); (3’) ~p → OB~q; (4’) ~p. These four entail a
contradiction in the system D.
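The reasoning is short: (1’) and (2’) give OBq by OB-K, (3’) and (4’) give OB~q, and OB-D rules out having both. The Python sketch below checks the semantic side of this by brute force. It is an illustration under stated assumptions: it searches only models with exactly three worlds, and the formula encoding and function names are invented for this example.

```python
from itertools import product

W = range(3)
PAIRS = [(x, y) for x in W for y in W]


def holds(f, w, R, V):
    """Truth of formula f at world w in the model (R, V)."""
    op = f[0]
    if op == "atom":
        return w in V[f[1]]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "OB":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


def subsets(items):
    """Yield every subset of items as a set."""
    for bits in product([False, True], repeat=len(items)):
        yield {item for item, keep in zip(items, bits) if keep}


def serial(R):
    return all(any((x, y) in R for y in W) for x in W)


p, q = ("atom", "p"), ("atom", "q")
CHISHOLM = [
    ("OB", p),                                   # (1') OBp
    ("OB", ("imp", p, q)),                       # (2') OB(p -> q)
    ("imp", ("not", p), ("OB", ("not", q))),     # (3') ~p -> OB~q
    ("not", p),                                  # (4') ~p
]


def satisfiable(require_serial):
    """Is there a model and a world at which all four formulas are true?"""
    for R in subsets(PAIRS):
        if require_serial and not serial(R):
            continue
        for Vp in subsets(list(W)):
            for Vq in subsets(list(W)):
                V = {"p": Vp, "q": Vq}
                if any(all(holds(f, w, R, V) for f in CHISHOLM) for w in W):
                    return True
    return False


on_serial_frames = satisfiable(require_serial=True)
on_all_frames = satisfiable(require_serial=False)
print(on_serial_frames, on_all_frames)  # False True
```

No serial model in the search makes the four formulas true together, but once seriality is dropped a world with no acceptable alternatives satisfies them all. This matches the role OB-D plays in the derivation of the contradiction.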
The moral people commonly draw from this puzzle (or the Good Samaritan Paradox) is
that the language of the standard deontic logic does not properly express conditional
obligations. To fix this, a natural proposal is to replace the unary or monadic deontic
operator OB() with a binary or dyadic operator OB(/). So we can write OB(p/q) to
mean it is obligatory that p under the circumstance that q. Moreover, we can define
absolute obligations as obligations conditional on a tautology. The details of dyadic
deontic logic would take us too far afield, so let me stop and turn to epistemic logic.
7.4 Systems of Epistemic Logic
In epistemic logic (認知邏輯，認識論邏輯), the analogue to the necessity operator is a
knowledge operator (or a number of knowledge operators). We will use K as the symbol
for that. Since knowledge is relative to a knower, the knowledge operator is usually
indexed by a letter, such as Kc, to indicate a particular knower or cognitive agent. The
indexing is especially necessary when multiple agents are involved. For single-agent
epistemic logic, the index is not essential, but it is common practice to keep it there. So
we write Kcp for “c knows that p”.
We will again confine our discussion to the normal systems (i.e., systems that are
extensions of the system K). For normal epistemic logic, it is widely agreed that the logic
should be at least as strong as the system T, for it is widely agreed that the formula T:
Kcp → p states a valid principle: if c knows that p then p is true, or in terms that are
familiar to students of epistemology, truth is a necessary condition for knowledge.
Again, the formal semantics is the same as before. Intuitively the accessibility relation is
to be understood as one of relative epistemic possibility or compatibility: world w’ is
accessible from world w iff w’ is compatible with what the agent c knows in world w, or
say, w’ is possible for all c knows in world w. (It is important to distinguish between
metaphysical or ontic possibility on the one hand, and epistemic possibility on the other
hand. It may be metaphysically impossible that the Goldbach conjecture is false, but for
all I know, it is possible that the Goldbach conjecture is false, which is an epistemic
possibility relative to my current knowledge.)
In this light, the formula T corresponds to the condition that every world w is compatible
with what c knows in w, which sounds very reasonable.
Can we go further than T? A number of epistemic logicians think that the formula 4 also
states a valid principle: Kcp → KcKcp. It says that if c knows that p, then c knows that c
knows that p. In other words, c has reflective knowledge of whatever he/she knows. This
claim is often known as the KK-thesis (or the thesis of positive introspection). Whether
the KK-thesis is true is a matter of debate in epistemology. Not surprisingly, different
accounts of knowledge give different answers. The traditional accounts of knowledge
inherited from Descartes tend to favor the KK-thesis, while some recent alternative
accounts (especially the so-called externalist accounts) tend to deny it. A leading
epistemic logician, J. Hintikka, argues that there may be a sense of knowing for which the
KK-thesis is false, but for what he takes to be the primary sense of knowing, which
requires reflective awareness, the KK-thesis should be accepted. His view influences
many epistemic logicians, who build their systems on the basis of S4.
Suppose the KK-thesis is acceptable. What else can we add? A familiar candidate is
~Kcp → Kc~Kcp, which is essentially the formula E. This principle states that if c does not
know that p, then c knows that he/she does not know that p. In other words, c has full
reflective knowledge of his/her ignorance. This thesis of negative introspection does not
sound plausible. Aristotle probably thought that he knew the earth is at the center of the
universe, so although he didn’t know that false proposition, he didn’t know that he didn’t
know. In general, people may be very confident about a false proposition (and even have
excellent reasons), and do not know that they do not know. This is a common way to fail
negative introspection.
With this kind of failure in mind, one may consider a weaker thesis of negative
introspection:
.4: p → (~Kcp → Kc~Kcp)
It says that if p is true, then if c does not know that p, then c knows that he/she does not
know that p. In other words, it claims that the thesis of negative introspection holds for
true propositions. Obviously the kind of cases described in the previous paragraph does
not challenge this weaker thesis. If we add this formula to S4, we get a system known as
S4.4 (and the formula is often just called .4). S4.4 is stronger than S4 and weaker than S5.
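The contrast between the two introspection theses can be seen on a two-world model. In the Python sketch below (an illustration only; the encoding, the name holds and the model are assumptions), world 0 plays the role of the actual world where p is false, and world 1 is a world, compatible with what c knows at world 0, where p is true and known. The frame is reflexive and transitive, so it is a frame for S4.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w in the model (R, V)."""
    op = f[0]
    if op == "atom":
        return w in V[f[1]]
    if op == "not":
        return not holds(f[1], w, R, V)
    if op == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if op == "K":  # c knows f iff f holds at every epistemically possible world
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


p = ("atom", "p")
not_know_p = ("not", ("K", p))

# Negative introspection: ~Kp -> K~Kp
negative_introspection = ("imp", not_know_p, ("K", not_know_p))
# The weaker thesis .4: p -> (~Kp -> K~Kp)
point_four = ("imp", p, negative_introspection)

worlds = (0, 1)
R = {(0, 0), (0, 1), (1, 1)}   # reflexive and transitive
V = {"p": {1}}                 # p is false at world 0 and true at world 1

ni_at_actual = holds(negative_introspection, 0, R, V)
point_four_everywhere = all(holds(point_four, w, R, V) for w in worlds)
print(ni_at_actual, point_four_everywhere)  # False True
```

At world 0 the agent does not know p but also does not know that she does not know it, so negative introspection fails there. The instance of .4 for p is true at both worlds of this particular model, since p is false at the only world where the introspection fails.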
The weaker thesis of negative introspection is still dubious. (For those who are familiar
with the so-called Gettier cases in epistemology, you may realize that those cases can be
used to challenge .4.) A still weaker system, known as S4.2, is obtained by adding to S4
the following formula:
.2: ~Kc~Kcp → Kc~Kc~p
(In the box-diamond notation this is a formula we have seen before: ◇□p → □◇p.) This
formula says that if c does not know that she does not know that p is true, then c knows
that she does not know that p is false. If this sounds confusing, think of it in the following,
equivalent formulation: either c knows that he/she does not know that p is true, or c
knows that he/she does not know that p is false. This is again a principle about negative
introspection (i.e., about knowing that one does not know). But it is weaker than .4, and
sounds much more plausible to me. Not surprisingly, the system S4.2 is stronger than S4
and weaker than S4.4.
We have considered these systems in the context of a single cognitive agent. The more
exciting but also more complicated epistemic logic is for multiple cognitive agents. To
handle multiple knowers we would need multiple knowledge operators (one for each
agent), with which we would have what is called multiply modal logic (多模態邏輯). We
can also include belief operators, to combine doxastic logic with epistemic logic, and/or
include temporal operators, to model changing knowledge and beliefs, and/or even
include action operators, to connect knowledge/beliefs to actions. These efforts are in the
frontier of research.
7.5 Logical Omniscience and Deductive Closure
We have seen the dubious consequence of the standard deontic logic that obligations are
closed under logical deduction. An analogous closure is also a consequence of normal
epistemic logics, and the issue of closure is especially conspicuous and controversial in
the context of epistemic logic. So let me finish this lecture with a brief discussion.
The said closure is easy to see. All normal systems, including all the systems mentioned
in the last section, respect the axiom K and the rule of necessitation. In the notation of
epistemic logic, the rule of necessitation is: if |– α, then |– Kcα. It means that the agent c
knows all the valid principles of logic: not just the tautologies of classical logic, but also
the valid principles of epistemic logic. This is often referred to as the strong thesis of
logical omniscience. With logical omniscience and the axiom K, we can easily derive
that knowledge is closed under logical consequence: if c knows that p, then c knows
whatever logically follows from p. I will refer to this as the (epistemological) thesis of
deductive closure. (It is also common, as in the SEP entry, to use the single label of
logical omniscience to refer to closure as well. I think it is helpful to separate the kind of
logical omniscience due to the rule of necessitation alone and the deductive closure due
to the axiom K together with the rule of necessitation.)
A common objection is that logical omniscience and deductive closure are too demanding
and very unrealistic for actual human agents. A quick response is to say that epistemic
logic does not aim to model actual human agents anyway. Instead the purpose of
epistemic logic is to capture the logical properties of knowledge for ideal knowers. If so,
however, the potential relevance of epistemic logic to the mainstream epistemology
(which targets human knowledge) seems diminished. Moreover, even for machines with
much better memory and more computational power, the requirement of logical
omniscience and deductive closure seems unduly strong. In fact, if we include predicate
logic in the basis for epistemic logic, there are strong theoretical reasons to think logical
omniscience is impossible for computing machines (because predicate logic is not
decidable). Of course depending on what “ideal” means, ideal knowers may go well
beyond the theoretical limits of computers. Perhaps we should think of different systems
of epistemic logic matching different levels of ideal knowers. A God-like knower, for
example, should have no problem satisfying S5.
Another response is to argue that the purpose of epistemic logic is to model “implicit
knowledge” or “epistemic commitment”. The idea is that even though a psychologically
realistic agent can hardly be expected to be explicitly aware of all logical truths and
logical relations between propositions, it is fair to say that they, as rational agents, are or
ought to be committed to the deductive consequences of what they consciously know.
Whether or not an agent is consciously aware of a certain deductive consequence of their
factual knowledge, we should think of the information as implicitly represented in the
agent’s cognitive state, and in this sense as part of the agent’s implicit knowledge. Put
this way, the response is quite uninteresting, as it simply enforces the requirement of
closure in the definition of the so-called implicit knowledge. What would make it more
interesting is an independent account of the nature and value of implicit knowledge. I
should also note that this line of response would not cohere well with the attempt to
justify the KK-thesis by appealing to an active, explicit sense of knowledge.
There are other ways to handle the issue. One can simply opt for non-normal systems.
Indeed a number of systems in the literature of epistemic logic are non-normal. But since
we have only studied normal animals, I will keep you away from the beasts.
Exercise 7
1. A binary relation R on W is called dense iff for every x, y ∈ W, if (x, y) ∈ R, then there
exists z ∈ W such that (x, z) ∈ R and (z, y) ∈ R.
(a) Show that the formula OBOBp → OBp is valid on a frame iff the (accessibility
relation in the) frame is dense.
(b) Show that OBOBp → OBp is not derivable in the standard deontic logic.
(c) Show that even if we add the reduction law OBOBp → OBp to the standard deontic
logic, the resulting system does not contain the principle NT: OB(OBp → p).
2. Show that the system D+E contains the principle NT, but the system D+NT+4 does
not contain the principle E.
3. Show that if we add PE(p ∨ q) → (PEp ∧ PEq) to the standard deontic logic, the
resulting system contains p as a theorem (and hence contains every formula as a theorem).
4. What is the objection to the standard deontic logic in connection to the so-called
contrary-to-duty imperatives?
5. Prove the formula .2 in the system S4.4, and prove the formula .4 in the system S5.
6. In the so-called doxastic logic (信念邏輯), there is a belief operator, analogous to the
necessity operator. We write Bcα to mean “c believes that α”. Write down the
corresponding principles D, T, NT, 4, and 5 for doxastic logic. Which of them sound
invalid to you? Why?
7. To model two cognitive agents in epistemic logic, we need two knowledge operators
K1 and K2 in the formal language. Accordingly, in the formal semantics, we need to have
two accessibility relations in a frame, one corresponding to each knowledge operator.
More precisely, a frame is a triple <W, R1, R2> such that W is a non-empty set, and both
R1 and R2 are binary relations on W.
K1α is true at world w iff for every (w, w’) ∈ R1, α is true at w’.
K2α is true at world w iff for every (w, w’) ∈ R2, α is true at w’.
Other semantic rules are the same as before.
Now consider the formula K1K2p → K1p. What does this formula say in words? Do you
think it is plausible? On what kind of frames is this formula valid? (Question 1 may give
you a hint to the last question.)
Lecture 8
8.1 Time and modality
In this lecture we study the basics of temporal (or tense) logic (時間或者時態邏輯),
which is currently a very active and exciting area of research. In the history of philosophy
time and (alethic) modality are often thought to be closely related. Before we enter
temporal logic proper, let’s consider an example of interpreting modality in terms of
temporal notions. Tradition has it that Diodorus Cronus, an ancient Greek philosopher
and logician, held the following view: what can be the case is or will be the case. This
suggests a temporal interpretation of modality: “it is necessarily the case that p” means
that “it is and will always be the case that p”. Obviously this way of approaching the
issue allows a statement or proposition to change its truth value over time. Thus in this
context, for example, “Jiji teaches at Lingnan” is a statement, which was not true at any
time in 2007, but is true right now, and may become false sometime in the future.
Under this Diodorean interpretation of modality (which was amply studied by the founder
of modern tense logic, A. N. Prior), the state of the world at each time is a possible world,
and the accessibility relation between possible worlds is that of “being contemporaneous
with or later than” (so that □α is true at a time iff α is true at that time and all later times).
This relation is obviously reflexive, and by our common conception of time, it is also
transitive. So, for this temporal interpretation of modality, the right system of modal logic
is at least as strong as S4.
The frames for S4 include some that intuitively represent branching (分叉) time in the
future. In an S4-frame, for example, there may be two different worlds w2 and w3 both
later than (accessible from) world w1, but w2 and w3 are not comparable in timing, i.e.,
neither is w2 later than (accessible from) w3, nor is w3 later than (accessible from) w2.
They do not, so to speak, lie on the same time line. Whether the flow of time allows such
branching future is a matter of debate for philosophers (and physicists). Those who
think the flow of time is linear (綫性) would prefer a system stronger than S4
that rules out the branching frames. The system is known as S4.3, which results from
adding to S4 the extra axiom:
.3: □(□p → □q) ∨ □(□q → □p)
(A slightly simpler formula which is equivalent to .3 in S4 is □(□p → q) ∨ □(□q → p).)
It is not hard to check that the formula .3 corresponds to the following condition on the
frame, called linearity or connectedness: for every w’ and w’’ accessible from w, either
w’ is accessible from w’’ or w’’ is accessible from w’. This condition “pulls” the future
into a line. The frames for S4.3 must be reflexive, transitive and connected. They do not
allow branching structures.
By the way, as you must have guessed from the names, the system S4.3 is in between the
system S4.2 and the system S4.4 (which we encountered in the context of epistemic
logic): S4.3 is stronger than S4.2 and weaker than S4.4.
8.2 The language for basic temporal logic
We now turn to temporal logic, and more specifically, to the kind of temporal logic
invented by Prior (called tense logic by Prior). The language of temporal (propositional)
logic includes the language of PC, and two primitive temporal operators, one for future
and one for past. The one for future we will use G, which intuitively reads “it is always
Going to be the case that”, and the one for past we will use H, which intuitively reads “it
Has always been the case that”. Both operators are analogous to the box operator. The
formation rules associated with the two operators are obvious: if α is a wff, then Gα and
Hα are both wffs.
We can then define two other operators, analogous to the diamond operator:
F =d.f. ~G~
P =d.f. ~H~
Intuitively, F says that at some time in the Future  will be true; P says that at some
time in the Past  was true.
In this language, we can express (forms of) such tensed statements as:
Example 1: Once upon a time China was the country with the most advanced
technology but it lost that status later.
p: China is the country with the most advanced technology.
P(p ∧ F~p)
Example 2: At some point in the future China will be the biggest economy in
the world and the US will never be the only superpower.
p: China is the biggest economy in the world.
q: The US is the only superpower.
F(p ∧ G~q)
Example 3: It has always been the case that Elizabeth Taylor will die, and it is
always going to be the case that she has died.
p: Elizabeth Taylor dies.
HFp ∧ GPp
Example 4: If Bill has never visited Hong Kong before, he will take a course
in Cantonese and visit Hong Kong afterwards.
p: Bill visits Hong Kong.
q: Bill takes a course in Cantonese.
~Pp → F(q ∧ Fp)
8.3 Formal semantics
Since there are two primitive temporal operators, we are in fact dealing with a multiply
(doubly, to be exact) modal logic here. A natural way to extend the possible-worlds
semantics to deal with multiply modal logic is to have multiple accessibility relations,
one for each modal operator. So we need an accessibility relation for G (F) and an
accessibility relation for H (P). These two relations, fortunately, are tightly connected on
intuitive grounds. Intuitively, the accessibility relation for future is just the relation of
“later than”, and the accessibility relation for past is just the relation of “earlier than”.
One is thus the inverse (逆關係) of the other: x is later than y if and only if y is earlier
than x. Therefore, we can use one relation to effectively handle both operators.
Formally, a frame in the semantics for this type of temporal logic is the same as before,
i.e., a non-empty set together with a binary relation over the set. In the context of
temporal logic, we will refer to them as temporal frames, to remind us, among other
things, that the relation will be used to interpret both the future-operators (G and F) and
the past-operators (H and P). A model is, as before, a frame together with a valuation
function. The valuation rules for G and H are:
(VG) V(G, w) = 1 if and only if for every (w, w’)  R, V(, w’) = 1.
(VH) V(H, w) = 1 if and only if for every (w’, w)  R, V(, w’) = 1.
Notice the difference. Both operators are interpreted in a way analogous to that of the box
operator, but the accessibility relation for H is the inverse of that for G.
Derivatively, we have the rules for F and P:
(VF) V(Fα, w) = 1 if and only if for some (w, w’) ∈ R, V(α, w’) = 1.
(VP) V(Pα, w) = 1 if and only if for some (w’, w) ∈ R, V(α, w’) = 1.
Same as before, a formula in the language of temporal logic is valid on a model iff it is
true at every time point in the model. A formula is valid on a temporal frame iff the
formula is valid on every model based on the frame.
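The valuation rules translate directly into a small evaluator. The Python sketch below is only an illustration: the tuple encoding, the names holds, F and P, and the four-point model are assumptions made for this example. G and H are treated as primitive, F and P are defined as ~G~ and ~H~, and the sketch evaluates the tensed form P(p ∧ F~p) (“p was once true but later became false”) at each time.

```python
def holds(f, t, R, V):
    """Truth of formula f at time t; (s, u) in R reads 's is earlier than u'."""
    op = f[0]
    if op == "atom":
        return t in V[f[1]]
    if op == "not":
        return not holds(f[1], t, R, V)
    if op == "and":
        return holds(f[1], t, R, V) and holds(f[2], t, R, V)
    if op == "G":  # rule (VG): the argument holds at every later time
        return all(holds(f[1], u, R, V) for (s, u) in R if s == t)
    if op == "H":  # rule (VH): the argument holds at every earlier time
        return all(holds(f[1], s, R, V) for (s, u) in R if u == t)
    raise ValueError(f"unknown operator: {op}")


def F(f):
    return ("not", ("G", ("not", f)))  # F is defined as ~G~


def P(f):
    return ("not", ("H", ("not", f)))  # P is defined as ~H~


times = range(4)
R = {(s, u) for s in times for u in times if s < u}   # a linear, transitive order
V = {"p": {1}}                                        # p is true at time 1 only

p = ("atom", "p")
once_but_no_longer = P(("and", p, F(("not", p))))
truth_by_time = [holds(once_but_no_longer, t, R, V) for t in times]
print(truth_by_time)  # [False, False, True, True]
```

The formula becomes true only from time 2 on, once there is a past time (time 1) at which p held and after which p failed.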
8.4 Some systems of temporal logic
The following formulas, for example, are valid on every temporal frame:
(1a) G(p → q) → (Gp → Gq)
(1b) H(p → q) → (Hp → Hq)
(2a) p → HFp
(2b) p → GPp
(1a) and (1b) are analogous to the formula K. (2a) and (2b) are valid due to the fact that
the accessibility relation for future and the accessibility relation for past are inverse to
each other. We obtain the so-called minimal temporal (tense) logic if, on top of PC, we
add these formulas as axioms and the following rule: If |– α, then |– Gα and |– Hα.
Obviously the rule is a temporal analogue of the rule of necessitation. This minimal
system is often referred to as Kt (i.e., the temporal analogue of the system K). Every
theorem of Kt is valid on every temporal frame, and every formula that is valid on every
temporal frame is a theorem of Kt.
Common conceptions of time, however, have various constraints for the temporal
structure. For example, almost everyone agrees that time order is transitive. Many people
also think that time is linear (at least in the past). Some people think that there is a
beginning of time, while others disagree, and similarly for ending time. There are
arguments to the effect that time cannot be circular, but some think circular time is at
least possible. So on and so forth. Exactly which constraints are correct is controversial,
but an interesting question of logic is what systems correspond to what constraints.
Transitivity is easy. It corresponds to the formulas
(3a): Gp → GGp
(3b): Hp → HHp
which are analogues of the formula 4. Adding (one of) them to Kt yields the system
known as K4t, which is a system that contains all and only those formulas that are valid
on transitive temporal frames.
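These correspondence claims can be tested exhaustively on small frames. The Python sketch below is an illustration under stated assumptions: it considers only temporal frames with exactly three time points, and all names are invented for this example. It checks that p → HFp and p → GPp are valid on every such frame, and that Gp → GGp and Hp → HHp are valid on exactly the transitive ones.

```python
from itertools import product

T = range(3)
PAIRS = [(x, y) for x in T for y in T]


def holds(f, t, R, Vp):
    """Truth of formula f at time t; Vp is the set of times at which p is true."""
    op = f[0]
    if op == "p":
        return t in Vp
    if op == "not":
        return not holds(f[1], t, R, Vp)
    if op == "imp":
        return (not holds(f[1], t, R, Vp)) or holds(f[2], t, R, Vp)
    if op == "G":
        return all(holds(f[1], u, R, Vp) for (s, u) in R if s == t)
    if op == "H":
        return all(holds(f[1], s, R, Vp) for (s, u) in R if u == t)
    raise ValueError(f"unknown operator: {op}")


def subsets(items):
    for bits in product([False, True], repeat=len(items)):
        yield {item for item, keep in zip(items, bits) if keep}


def valid_on_frame(f, R):
    """Valid on the frame: true at every time under every valuation of p."""
    return all(holds(f, t, R, Vp) for Vp in subsets(list(T)) for t in T)


def transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)


p = ("p",)
F = lambda f: ("not", ("G", ("not", f)))
P = lambda f: ("not", ("H", ("not", f)))
ax_2a = ("imp", p, ("H", F(p)))
ax_2b = ("imp", p, ("G", P(p)))
ax_3a = ("imp", ("G", p), ("G", ("G", p)))
ax_3b = ("imp", ("H", p), ("H", ("H", p)))

frames = list(subsets(PAIRS))
converse_axioms_always_valid = all(
    valid_on_frame(ax_2a, R) and valid_on_frame(ax_2b, R) for R in frames
)
ax_3a_matches_transitivity = all(valid_on_frame(ax_3a, R) == transitive(R) for R in frames)
ax_3b_matches_transitivity = all(valid_on_frame(ax_3b, R) == transitive(R) for R in frames)
print(converse_axioms_always_valid, ax_3a_matches_transitivity, ax_3b_matches_transitivity)
# True True True
```

That (3a) and (3b) pick out the same frames reflects the fact that a relation is transitive exactly when its inverse is.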
Transitivity is routinely accepted as a property of temporal structures. In the point-arrow
diagrams usually used to visualize (discrete) temporal structures, transitivity is implicitly
assumed and not explicitly represented. In the following diagrams, for example, it is
understood that whenever there is a chain leading from a time point t to a time point t’,
then t and t’ bear the relation R (or intuitively, t is earlier than t’ and t’ is later than t).
(i)    … → t1 → t2 → t3 → t4 → …

                  ↗ t3 → …
(ii)   t1 → t2
                  ↘ t4 → …

       t1 ↘
(iii)       t3 → t4 → …
       t2 ↗

(iv)   … → t1 → t2 → t3 → t4
Further constraints on time structure are not as uncontroversial. But no matter. We are not
concerned here with the metaphysics of time, but with the implications of various
metaphysical doctrines. In what follows, we will consider the following constraints: (1)
time is linear (or connected) towards future and/or time is linear (or connected) towards
past; (2) time has a beginning and/or time has an end; (3) time has no beginning and/or
time has no end; and (4) time has no cycle.
8.4.1 Linear time
Look at the previous diagrams. Diagram (i) depicts a linear time structure with no
branching. Diagram (ii) represents a structure with branching future. Diagram (iii)
represents a structure with branching past.
Formally, a temporal frame <T, R> is said to be linear (or connected) towards future iff
for every x, y, z ∈ T, if (x, y) ∈ R and (x, z) ∈ R, then (y, z) ∈ R or y=z or (z, y) ∈ R.
A temporal frame <T, R> is said to be linear (or connected) towards past iff for every x,
y, z ∈ T, if (y, x) ∈ R and (z, x) ∈ R, then (y, z) ∈ R or y=z or (z, y) ∈ R.
Non-linear time is usually referred to as branching time, and branching time (especially
towards future) is often associated with indeterminism.
A formula that corresponds to linearity towards future is:
(4a) G(p → (Gp → q)) ∨ G(Gq → p)
Notice the similarity to the formula .3. We can write (4a) equivalently as
F(p ∧ Gp ∧ ~q) → G(F~q ∨ p)
Intuitively, this says that if at some point in the future p is true, q is false and from that
time on p will always be true, then at every point in the future, either q will be false or p
is true. If this sounds valid to you, you probably have reasoned this way: let t1 be the time
in the future at which p is true, q is false and from that time on p will always be true.
Then for any point t in the future, either t is earlier than t1 or t=t1 or t is later than t1. If t is
earlier than t1, then at t it is true that q will be false. If t=t1 or t is later than t1, then p is
true at t. So at every point in the future, either q will be false or p is true.
Notice the crucial step in the previous reasoning, namely that any future point t is earlier
than t1, identical to t1, or later than t1. This step relies on the supposition that time is
linear towards future.
Likewise, the following corresponds to linearity towards past:
(4b) H(p → (Hp → q)) ∨ H(Hq → p)
The temporal frames for the system K4t+(4a) are transitive and linear towards future,
which do not include those depicted by diagram (ii). The temporal frames for the system
K4t+(4b) are transitive and linear towards past, which do not include those depicted by
diagram (iii). The temporal frames for the system K4t+(4a)+(4b) (also known as K4.3t)
are transitive and linear in both ways, which rule out both diagram (ii) and diagram (iii).
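The correspondence between (4a)/(4b) and linearity can be spot-checked by brute force on very small frames. The sketch below is only an illustration under stated assumptions: formulas are encoded as nested tuples, (4a) is read as G(p → (Gp → q)) ∨ G(Gq → p) and (4b) as its mirror image, and the three-point frames chain, fork and join are hypothetical examples chosen here.

```python
from itertools import product

def holds(f, t, T, R, val):
    """Truth of a tense formula f at time t.
    val maps each propositional letter to the set of times where it is true."""
    op = f[0]
    if op == "var":
        return t in val[f[1]]
    if op == "not":
        return not holds(f[1], t, T, R, val)
    if op == "or":
        return holds(f[1], t, T, R, val) or holds(f[2], t, T, R, val)
    if op == "imp":
        return (not holds(f[1], t, T, R, val)) or holds(f[2], t, T, R, val)
    if op == "G":  # true at every later time
        return all(holds(f[1], u, T, R, val) for u in T if (t, u) in R)
    if op == "H":  # true at every earlier time
        return all(holds(f[1], u, T, R, val) for u in T if (u, t) in R)
    raise ValueError(op)

def valid_on_frame(f, T, R, letters=("p", "q")):
    """f is valid on <T, R> iff it is true at every time under every valuation."""
    subsets = [{t for t, keep in zip(T, bits) if keep}
               for bits in product([False, True], repeat=len(T))]
    return all(holds(f, t, T, R, dict(zip(letters, choice)))
               for choice in product(subsets, repeat=len(letters))
               for t in T)

p, q = ("var", "p"), ("var", "q")
ax_4a = ("or", ("G", ("imp", p, ("imp", ("G", p), q))),
               ("G", ("imp", ("G", q), p)))
ax_4b = ("or", ("H", ("imp", p, ("imp", ("H", p), q))),
               ("H", ("imp", ("H", q), p)))

T = [0, 1, 2]
chain = {(0, 1), (1, 2), (0, 2)}  # transitive and linear in both directions
fork = {(0, 1), (0, 2)}           # branching future, as in diagram (ii)
join = {(0, 2), (1, 2)}           # branching past, as in diagram (iii)

print(valid_on_frame(ax_4a, T, chain))  # True
print(valid_on_frame(ax_4a, T, fork))   # False
print(valid_on_frame(ax_4b, T, fork))   # True
print(valid_on_frame(ax_4b, T, join))   # False
```

On the fork, (4a) fails at point 0 when p is true only at point 1 and q is false there, which is exactly the situation the informal argument rules out by appeal to linearity.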
8.4.2 Beginning and ending
A temporal frame <T, R> has a beginning time iff there is x ∈ T such that for every y ∈ T,
(y, x) ∉ R. It has an ending time iff there is x ∈ T such that for every y ∈ T, (x, y) ∉ R.
Consider the formula (5a) Gp ∨ FGp. It expresses the idea that from this time or some
future time on, p is always going to be true. Since p can be any proposition whatsoever
(including a contradiction), (5a) can be valid only if there is an ending time. Similarly,
the formula (5b) Hp ∨ PHp is valid only if there is a beginning time. I said “only if”
instead of “if and only if”, because in branching time it is not sufficient for the validity of
(5a) or (5b) to have an ending or a beginning in some branch (rather, every branch should
have ending or beginning). However, when time is linear, we can say that (5a) is valid iff
there is an ending time, and (5b) is valid iff there is a beginning time.
Therefore, the system K4t+(4a)+(4b)+(5a) corresponds to linear time with an ending; the
system K4t+(4a)+(4b)+(5b) corresponds to linear time with a beginning; the system
K4t+(4a)+(4b)+(5a)+(5b) corresponds to linear time with both a beginning and an ending.
8.4.3 No beginning and no ending
By contrast, if we think there is no ending to time, (5a) is not valid, but the formula
(6a) Gp → Fp is valid. Obviously (6a) is but an analogue of the formula D. Given that
the formula D corresponds to the condition of seriality, it is easy to appreciate the point
that (6a) is valid on a temporal frame if and only if there is no ending time. Similarly, the
formula (6b) Hp → Pp is valid if and only if there is no beginning time.
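The structural conditions in 8.4.2 and 8.4.3 are easy to state as checks on a finite frame. Here is a small Python sketch, assuming the same list-plus-set-of-pairs representation of <T, R>; the function names and the two example frames are hypothetical. It makes visible that "no ending time" is just seriality of R, and "no beginning time" is seriality of the converse of R.

```python
def has_beginning(T, R):
    # some x such that no y is earlier than x
    return any(all((y, x) not in R for y in T) for x in T)

def has_ending(T, R):
    # some x such that no y is later than x
    return any(all((x, y) not in R for y in T) for x in T)

def serial_towards_future(T, R):
    # every point has some later point
    return all(any((x, y) in R for y in T) for x in T)

def serial_towards_past(T, R):
    # every point has some earlier point
    return all(any((y, x) in R for y in T) for x in T)

T = [0, 1, 2]
chain = {(0, 1), (1, 2), (0, 2)}         # finite linear time
loop = {(x, y) for x in T for y in T}    # circular time: every point precedes every point

for name, R in [("chain", chain), ("loop", loop)]:
    print(name, has_beginning(T, R), has_ending(T, R),
          serial_towards_future(T, R), serial_towards_past(T, R))
# chain True True False False
# loop False False True True
```

A finite frame without an ending must contain a cycle, so unbounded non-circular structures (such as the integers) cannot be enumerated this way; the sketch only illustrates the definitions.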
As an exercise, you should write down the system that corresponds to linear time with no
ending, the system that corresponds to linear time with no beginning, the system that
corresponds to linear time with no beginning or ending, the system that corresponds to
linear time with a beginning but no ending, and the system that corresponds to linear time
with an ending but no beginning.
8.4.4 Non-circular time
Circular time, as depicted in diagram (iv), is an intriguing scenario, which is closely
related to the fantasy of time travel. But most people in their sober moments seem to
regard time as non-circular. Assuming we consider transitive temporal frames only, the
further constraint of non-circularity is not hard to express. It can be expressed simply as
irreflexivity (禁自返性): for every x, (x, x) ∉ R. You of course remember how easy it is to
develop a system characterizing the condition of reflexivity. As it turns out, however, it is
worse than difficult to do that for irreflexivity. It is impossible. There is no formula in this
language of temporal logic that is valid on all and only irreflexive frames, and
consequently there is no system whose frames are precisely those irreflexive frames. In
fact --- and this is a profound fact --- every formula that is valid on all irreflexive frames
can be proved in the minimal system Kt. But as we know, theorems of Kt are also valid on
all other temporal frames. Thus they fail to pick out irreflexive frames. The point is that
not every interesting constraint on temporal structure can be captured.
Exercise 8
1. For each of the following arguments, write down its form in the language of tense logic.
(a) If Bill has never visited Hong Kong before, he will take a course in Cantonese and
visit Hong Kong afterwards. Therefore, if he is never going to visit Hong Kong, he has
already been to Hong Kong.
(b) Bill has visited Hong Kong before. Therefore, there was a time when Bill had never
visited Hong Kong.
(c) Bill is going to visit Hong Kong. He is also going to learn Cantonese. So, if he will
not visit Hong Kong and learn Cantonese at the same time, either he will learn Cantonese
before he visits Hong Kong or he will learn Cantonese after he visits Hong Kong.
(d) Bill is in Hong Kong right now, because it has always been the case that he will visit
Hong Kong.
2. The following diagrams depict several temporal structures. Ellipses (…) represent an
unbounded linear chain from the past or an unbounded linear chain into the future. For
each of the structures, determine whether the previous argument forms (in problem 1) are
valid or not on temporal frames with the structure.
(a) t1 → t2 → t3 → t4 → …
(b) … → t1 → t2 → t3 → t4
    t1 ↘
(c)      t3 → t4 → …
    t2 ↗
              ↗ t3 → …
(d) t1 → t2
              ↘ t4 → …
3. What is the system corresponding to (transitive) linear time with a beginning but no
ending? What about (transitive) linear time with an ending but no beginning?
4. The so-called Diodorean modality is to define the box operator as:
(Diodorean modality) □α =df α ∧ Gα
The so-called Aristotelian modality is to define the box operator as:
(Aristotelian modality) □α =df Hα ∧ α ∧ Gα
Show that for Diodorean modality, the system S4 is contained in the system K4t; and that
for Aristotelian modality, the system B is contained in the system K4t.
Lecture 9
9.1 Review: the language and semantics for (first-order) predicate logic
We shall use the following language for predicate logic:
Primitive symbols
Individual variable symbols: x, y, z, …
Predicate symbols (of various arities): A, B, C, …
Logical connectives/operators: ~, ∨.
Quantifier: ∀.
Auxiliary symbols: (, ).
Rules for forming well-formed formulas (wffs)
1. If P is a predicate symbol of arity n, and x1, …, xn are any individual variables, then
Px1…xn is a wff (often called an atomic wff).
2. If α is a wff, so is ~α.
3. If α and β are wffs, so is (α∨β).
4. If α is a wff, and x is an individual variable, then ∀xα is also a wff.
For convenience, we introduce the following defined symbols.
[Def ∧] (α∧β) =df ~(~α ∨ ~β)
[Def →] (α→β) =df (~α ∨ β)
[Def ↔] (α↔β) =df ((α→β) ∧ (β→α))
[Def ∃] ∃xα =df ~∀x~α
(For those who wonder, we did not include constant symbols or the identity symbol for
the sake of simplicity.) The textbook uses Greek letters (φ, ψ, χ, etc.) for predicate
symbols. I use bold, capitalized and italicized English letters instead (and reserve Greek
letters for other purposes).
Recall that the language of predicate logic allows us to express the inner structure of
quantified statements, such as
All logicians are careful. ∀x(Lx→Cx)
Some logicians are funny. ∃x(Lx∧Fx)
And more complicated structures with nested quantifiers, such as
Everyone admires someone. ∀x∃yAxy
Someone is admired by everyone. ∃y∀xAxy
As further examples, note that the frame conditions we considered are all expressible in
this language:
Serial: ∀x∃yRxy
Reflexive: ∀xRxx
2-Reflexive: ∀x∀y(Rxy → Ryy)
Symmetric: ∀x∀y(Rxy → Ryx)
Transitive: ∀x∀y∀z((Rxy∧Ryz) → Rxz)
Euclidean: ∀x∀y∀z((Rxy∧Rxz) → Ryz)
Dense: ∀x∀y(Rxy → ∃z(Rxz∧Rzy))
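Each of these first-order conditions can be read off directly as a test on a finite frame. The following Python sketch is only an illustration: it assumes W is a list of points and R a set of ordered pairs drawn from W, and the function names and the two sample relations are chosen here for the example.

```python
def serial(W, R):         # for every x there is y with Rxy
    return all(any((x, y) in R for y in W) for x in W)

def reflexive(W, R):      # Rxx for every x
    return all((x, x) in R for x in W)

def two_reflexive(W, R):  # Rxy implies Ryy
    return all((y, y) in R for (_, y) in R)

def symmetric(W, R):      # Rxy implies Ryx
    return all((y, x) in R for (x, y) in R)

def transitive(W, R):     # Rxy and Ryz imply Rxz
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

def euclidean(W, R):      # Rxy and Rxz imply Ryz
    return all((y, z) in R for (x, y) in R for (x2, z) in R if x == x2)

def dense(W, R):          # Rxy implies some z with Rxz and Rzy
    return all(any((x, z) in R and (z, y) in R for z in W) for (x, y) in R)

CONDITIONS = [serial, reflexive, two_reflexive, symmetric, transitive, euclidean, dense]

W = [1, 2, 3]
strict_order = {(1, 2), (2, 3), (1, 3)}
universal = {(x, y) for x in W for y in W}

print([c.__name__ for c in CONDITIONS if c(W, strict_order)])  # ['transitive']
print(all(c(W, universal) for c in CONDITIONS))                # True
```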
An important distinction is that between free and bound variables. Given a wff and an
occurrence of a variable x in the wff, the occurrence of x is said to be bound if it is
within the scope of a quantifier ∀x or ∃x. Otherwise it is said to be free. For example, y is
a free variable in the formula Rxy but is bound in the formula ∀yRxy. However, x is still
free in ∀yRxy (although it is in the scope of ∀y, the quantifier is not ∀x or ∃x). It becomes
bound in the formula ∀x∀yRxy.
Remember that the same variable can occur several times in a wff and it may be the case
that some occurrences are bound while others are free. For example, in ∀xRxy → Rxy,
the first occurrence of x is bound, but the second occurrence of x is free (while both
occurrences of y are free).
If a wff does not contain any free variable (i.e., all variables in the formula are bound),
we call it a closed formula (or a sentence); otherwise we call the formula an open formula.
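The free/bound distinction can be computed by a short recursion on the structure of a wff. Below is a minimal Python sketch, assuming formulas are encoded as nested tuples such as ("atom", "R", ("x", "y")) and ("all", "x", body); this encoding and the names free_vars and is_closed are illustrative, not the textbook's.

```python
def free_vars(f):
    """Set of variables that have at least one free occurrence in f."""
    op = f[0]
    if op == "atom":                  # ("atom", "R", ("x", "y"))
        return set(f[2])
    if op == "not":
        return free_vars(f[1])
    if op in ("or", "and", "imp"):
        return free_vars(f[1]) | free_vars(f[2])
    if op in ("all", "ex"):           # ("all", "x", body): x is bound inside body
        return free_vars(f[2]) - {f[1]}
    raise ValueError(op)

def is_closed(f):
    """A closed formula (sentence) has no free variable."""
    return not free_vars(f)

Rxy = ("atom", "R", ("x", "y"))
print(sorted(free_vars(Rxy)))                      # ['x', 'y']
print(sorted(free_vars(("all", "y", Rxy))))        # ['x']
print(is_closed(("all", "x", ("all", "y", Rxy))))  # True

# a formula in which x has both a bound and a free occurrence
mixed = ("imp", ("all", "x", Rxy), Rxy)
print(sorted(free_vars(mixed)))                    # ['x', 'y']
```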
In the formal semantics of predicate logic, a model consists of a domain of individuals D,
which is any non-empty set, and an interpretation function V, which maps each predicate
symbol of arity n to a set of (ordered) n-tuples of elements of D. In plainer words, for
each predicate symbol of arity n, V specifies exactly which n elements in the domain (in
a certain order) have the relation (or property) represented by the predicate symbol. For
example, we can write D = {e1, e2}; V(L) = {e1}, V(A) = {(e1, e2), (e2, e2)} to specify
(part of) a model, in which the domain has two individuals e1 and e2, the predicate
symbol L is interpreted as a property that e1 has but e2 does not, and the predicate symbol
A is interpreted as a binary relation that (e1, e2) bears, and (e2, e2) bears, but (e1, e1) does
not, and (e2, e1) does not.
The basic idea of interpreting quantifiers is simple: a universally quantified formula ∀xα
is true iff every element in the domain has the property described by α; ∃xα is true iff
some element in the domain has the property described by α. There are several ways to
formalize this idea. The textbook employs one of the standard ways, in which open
formulas play an important role (even though ultimately we are interested in closed
formulas).
Given a model <D, V>, an open formula such as Rxy does not have a determinate truth
value, because the model does not tell us what the free variables stand for. To give it a
truth value, we need an (auxiliary) value assignment to variables. A value assignment μ is
just a function that maps each variable to an element of D; that is, for every variable x,
μ(x) ∈ D. A useful technical notion is that of x-alternative assignment. A value assignment
ρ is called an x-alternative of a value assignment μ if ρ and μ give the exact same
assignments to variables other than x, that is, for every y ≠ x, ρ(y) = μ(y). (Note that the
definition does not require that ρ and μ must give different assignments to x; they may or
may not differ in their assignments to x but must agree in their assignments to other
variables. Thus μ counts as an x-alternative of itself.)
Given a model <D, V> and a value assignment μ, every wff gets a truth value according
to the following rules:
(1) For atomic formula Px1…xn, it is true in the model <D, V> relative to the value
assignment μ, written as Vμ(Px1…xn) = 1, if and only if (μ(x1), …, μ(xn)) ∈ V(P).
(2) Vμ(~α) = 1 if and only if Vμ(α) = 0.
(3) Vμ(α∨β) = 1 if and only if Vμ(α) = 1 or Vμ(β) = 1.
(4) Vμ(∀xα) = 1 if and only if for every x-alternative ρ of μ, Vρ(α) = 1.
In plain words, (1) says that Px1…xn is true iff the individuals represented by x1, …, xn
(according to μ) have the relation represented by P (according to the model). (4) says that
∀xα is true iff α is true for every value of x, while the values for other (free) variables, if
any, remain the same as μ assigns.
The rules for the defined connectives are obvious. For the defined quantifier ∃, we can
derive the following rule:
(5) Vμ(∃xα) = 1 if and only if for some x-alternative ρ of μ, Vρ(α) = 1.
A wff is called valid in a model <D, V> iff it is true in the model relative to every value
assignment. A wff is simply called valid if it is valid in every model.
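The truth rules translate almost line by line into a recursive evaluator. The sketch below is a minimal illustration, assuming the tuple encoding ("atom", P, vars), ("not", a), ("or", a, b), ("all", x, a), with assignments as Python dicts; the function names are hypothetical. The model is the two-element example given earlier (D = {e1, e2}, V(L) = {e1}, V(A) = {(e1, e2), (e2, e2)}), with the unary extension written as a set of 1-tuples.

```python
from itertools import product

def truth(f, D, V, mu):
    """Truth of f in the model <D, V> relative to the assignment mu."""
    op = f[0]
    if op == "atom":   # rule (1)
        return tuple(mu[x] for x in f[2]) in V[f[1]]
    if op == "not":    # rule (2)
        return not truth(f[1], D, V, mu)
    if op == "or":     # rule (3)
        return truth(f[1], D, V, mu) or truth(f[2], D, V, mu)
    if op == "all":    # rule (4): run through every x-alternative of mu
        return all(truth(f[2], D, V, {**mu, f[1]: e}) for e in D)
    raise ValueError(op)

def exists(x, a):      # the defined quantifier: not-all-not
    return ("not", ("all", x, ("not", a)))

def valid_in_model(f, D, V, variables):
    """True relative to every assignment to the listed (free) variables."""
    return all(truth(f, D, V, dict(zip(variables, values)))
               for values in product(D, repeat=len(variables)))

D = ["e1", "e2"]
V = {"L": {("e1",)}, "A": {("e1", "e2"), ("e2", "e2")}}

Axy = ("atom", "A", ("x", "y"))
everyone_admires_someone = ("all", "x", exists("y", Axy))
someone_admired_by_all = exists("y", ("all", "x", Axy))
everything_is_L = ("all", "x", ("atom", "L", ("x",)))

print(truth(everyone_admires_someone, D, V, {}))  # True
print(truth(someone_admired_by_all, D, V, {}))    # True
print(truth(everything_is_L, D, V, {}))           # False
print(valid_in_model(Axy, D, V, ["x", "y"]))      # False
```

For closed formulas the empty assignment suffices, since every variable is given a value by a quantifier before an atomic formula is reached.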
It should be fairly obvious that value assignments matter only for open formulas. In other
words, only the assignments to free variables in a formula matter for its truth value. The
textbook calls this the principle of agreement (PA). Formally, PA says that for any model
<D, V> and any wff α, if two value assignments μ and ρ give the same assignments to
those variables that are free in α, then Vμ(α) = Vρ(α).
Since a closed formula does not contain any free variable, its truth value in a model
remains the same relative to every value assignment. In other words, the model already
determines its truth value, regardless of value assignments. Thus, to say a closed formula
is valid in a model is just to say it is true in the model.
Another useful fact is labeled the principle of replacement (PR) in the textbook. To state
the principle, recall the notion of substitution in predicate logic. Let α be any wff. We
write α[y/x] to denote the wff resulting from α by replacing every free occurrence of
variable x with variable y. We say y is free for x in α (y 對 x 代入合適) if in α there is no
free occurrence of x within the scope of ∀y or ∃y. The idea is that if y is free for x in α,
then the y’s that substitute in for the free x’s remain free in α[y/x]. For example, y is free
for x in Ax, and substituting y for the free x gives us Ay, in which the substituting y
remains free. By contrast, if we substitute y for x in ∀yAx, what we get is ∀yAy in which y
is bound. This is the kind of substitution we will disallow by requiring that y be free for x.
(The textbook uses a different technical notion, the so-called bound alphabetic variants,
for essentially the same purpose.)
PR says that for any model <D, V> and any wff α, if y is free for x in α, then Vρ(α) =
Vμ(α[y/x]) when ρ is the same as μ except that ρ(x) = μ(y). From this fact it follows that
the formula (schema) ∀xα → α[y/x] is valid whenever y is free for x in α. This formula
expresses the principle of universal instantiation: if α holds for every individual, it holds
for any particular individual.
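Substitution and the "free for" side condition can both be written as short recursions. The following Python sketch is an illustration only: it assumes the tuple encoding ("atom", P, vars), ("not", a), ("or", a, b), ("all", x, a), restricted to the primitive symbols, and the names substitute and free_for are chosen here.

```python
def substitute(f, y, x):
    """f[y/x]: replace every free occurrence of x in f by y."""
    op = f[0]
    if op == "atom":
        return ("atom", f[1], tuple(y if v == x else v for v in f[2]))
    if op == "not":
        return ("not", substitute(f[1], y, x))
    if op == "or":
        return ("or", substitute(f[1], y, x), substitute(f[2], y, x))
    if op == "all":
        if f[1] == x:        # x is bound from here on: nothing free to replace
            return f
        return ("all", f[1], substitute(f[2], y, x))
    raise ValueError(op)

def free_for(y, x, f, under_y=False):
    """True iff no free occurrence of x in f lies within the scope of a quantifier on y."""
    op = f[0]
    if op == "atom":
        return not (under_y and x in f[2])
    if op == "not":
        return free_for(y, x, f[1], under_y)
    if op == "or":
        return free_for(y, x, f[1], under_y) and free_for(y, x, f[2], under_y)
    if op == "all":
        if f[1] == x:        # no free occurrence of x below this quantifier
            return True
        return free_for(y, x, f[2], under_y or f[1] == y)
    raise ValueError(op)

Ax = ("atom", "A", ("x",))
all_y_Ax = ("all", "y", Ax)

print(free_for("y", "x", Ax), substitute(Ax, "y", "x"))
# True ('atom', 'A', ('y',))
print(free_for("y", "x", all_y_Ax), substitute(all_y_Ax, "y", "x"))
# False ('all', 'y', ('atom', 'A', ('y',)))
```

The second call shows the capture the side condition is meant to exclude: the substituted y ends up bound by the quantifier on y.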
9.2 Axiomatic system
Following the textbook, we will use axiom schemata (公理模式) instead of the rule of
uniform substitution in predicate logic (in order to avoid the notational confusion
between substitution of individual variables and substitution of propositional variables).
For example, a commonly used axiom schema is the one we just mentioned:
∀1 ∀xα → α[y/x], where α is any wff; x and y are any variables s.t. y is free for x in α.
This schema gives us infinitely many axioms: we can use any wff as α and any two
variables as long as one is free for the other in α. For example, the following are all
axioms of the schema ∀1:
∀xAx → Ax
∀x(Ax∧Bx) → (Ay∧By)
∀x∃yRxy → ∃yRzy
∀x∀y(Rxy → Ryx) → ∀y(Rxy → Ryx)
If you remember natural deduction for predicate logic, it is not hard to see that ∀1 plays
the role of the rule of universal quantifier elimination.
To the rule of universal quantifier introduction in natural deduction, there is an analogous
rule in the standard axiomatic system for predicate logic:
∀2 If |– α → β and a variable x does not occur free in α, then |– α → ∀xβ.
The standard axiomatization of predicate logic is basically PC plus ∀1 and ∀2. More
precisely, the axiomatic basis for the system LPC (lower or first-order predicate calculus)
consists of the following:
Axiom Schemata
PC Every substitution-instance of a tautology is an axiom.
∀1 ∀xα → α[y/x], where y is free for x in α.
Rules
∀2 If |– α → β and x does not occur free in α, then |– α → ∀xβ.
MP If |– α and |– α → β, then |– β.
From these axioms and rules (and the definitions), every derivable wff is valid and every
valid wff in predicate logic can be derived.
You should take a look at the theorems and derived rules mentioned in the textbook,
some of which I will explain in class.
9.3 The language for modal predicate logic
To get the language for modal predicate logic, we simply add a primitive modal operator
□ to the language of predicate logic, and the formation rule: if α is a wff, so is □α. We
then introduce a defined modal operator ◇: ◇α =df ~□~α. And that is it.
In this language, we may express such statements as:
(i) It is necessarily true that every human being is mortal.
□∀x(Hx→Mx)
(ii) It is possible that some human beings are immortal.
◇∃x(Hx∧~Mx)
(iii) All human beings are necessarily mortal.
∀x(Hx→□Mx)
(iv) Some human beings are possibly immortal.
∃x(Hx∧◇~Mx)
A philosophically important distinction is that between de dicto (of the statement or
proposition) and de re (of the thing). In (i), for example, what is claimed to be necessary
is a statement or proposition (that every human being is mortal). Similarly with (ii). The
modalities in (i) and (ii) are thus de dicto. By contrast, in (iii) and (iv), the modalities are
ascribed to things. (iii), for example, says that for anything, if it is a human being, then
this thing must be mortal. It is about the fate of this thing in various possible worlds. The
modality is thus called de re.
The textbook marks this distinction in the syntax: a wff is called a de re formula if it
contains a sub-formula (i.e., a part of the formula which is itself a wff) in which there is a
free variable within the scope of a modal operator. Otherwise it is a de dicto formula. For
example, ∀x(Hx→□Mx) is de re, because it has a sub-formula □Mx in which x occurs
free within the scope of □. Likewise, ∃x(Hx∧◇~Mx) is de re, because it has a sub-formula
◇~Mx in which x occurs free within the scope of ◇. By contrast, □∀x(Hx→Mx) and
◇∃x(Hx∧~Mx) are both de dicto formulas. We will return to this distinction when we
discuss Quine’s objections to quantified modal logic.
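The syntactic test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: formulas are nested tuples with "box" and "dia" for the two modal operators, the criterion is read as "some modal operator has a free variable in its scope", and the names free_vars and is_de_re are hypothetical.

```python
BINARY = ("and", "or", "imp")

def free_vars(f):
    op = f[0]
    if op == "atom":
        return set(f[2])
    if op in ("not", "box", "dia"):
        return free_vars(f[1])
    if op in BINARY:
        return free_vars(f[1]) | free_vars(f[2])
    if op in ("all", "ex"):
        return free_vars(f[2]) - {f[1]}
    raise ValueError(op)

def is_de_re(f):
    """True iff some modal operator in f has a free variable within its scope."""
    op = f[0]
    if op == "atom":
        return False
    if op in ("box", "dia"):
        return bool(free_vars(f[1])) or is_de_re(f[1])
    if op == "not":
        return is_de_re(f[1])
    if op in BINARY:
        return is_de_re(f[1]) or is_de_re(f[2])
    if op in ("all", "ex"):
        return is_de_re(f[2])
    raise ValueError(op)

Hx, Mx = ("atom", "H", ("x",)), ("atom", "M", ("x",))
s1 = ("box", ("all", "x", ("imp", Hx, Mx)))            # statement (i)
s2 = ("dia", ("ex", "x", ("and", Hx, ("not", Mx))))    # statement (ii)
s3 = ("all", "x", ("imp", Hx, ("box", Mx)))            # statement (iii)
s4 = ("ex", "x", ("and", Hx, ("dia", ("not", Mx))))    # statement (iv)

print([is_de_re(s) for s in (s1, s2, s3, s4)])  # [False, False, True, True]
```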
9.4 Formal semantics with the constant-domain assumption
The formal semantics for modal predicate logic is a natural generalization of that for non-
modal predicate logic (and a natural generalization of that for modal propositional logic).
Basically we have a set of worlds, together with an accessibility relation, and in each and
every world we run the semantic machinery of predicate logic. That means, among other
things, that in each and every world there is a domain of quantification. In this lecture, we
develop the semantics with a constant-domain assumption: every world has the exact
same domain. Next time we shall explore the consequences of relaxing this assumption.
Formally, a frame is still a pair <W, R> where W is a non-empty set and R is a binary
relation over W. A constant-domain quantification frame is a triple <W, R, D> where
<W, R> is a frame and D is a non-empty set (serving as the domain of quantification in
every world). A model is a quantification frame plus an interpretation function V. For
each predicate symbol P of arity n and each world w ∈ W, V returns a set of (ordered)
n-tuples of elements of D, specifying exactly which n elements in the domain (in a certain
order) have the relation (or property) represented by P in the possible world w.
(Equivalently, we can say V maps each predicate symbol of arity n to a set of (n+1)-tuples
of the form (e1, …, en, w), where e1, …, en ∈ D and w ∈ W.)
A value assignment is again a function that maps each individual variable to an element
of D. Given a model <W, R, D, V> and a value assignment μ, every wff in the language
of modal predicate logic gets a truth value at each world w ∈ W, according to the following
rules:
(1) Vμ(Px1…xn, w) = 1 if and only if (μ(x1), …, μ(xn)) ∈ V(P, w).
(2) Vμ(~α, w) = 1 if and only if Vμ(α, w) = 0.
(3) Vμ(α∨β, w) = 1 if and only if Vμ(α, w) = 1 or Vμ(β, w) = 1.
(4) Vμ(∀xα, w) = 1 if and only if for every x-alternative ρ of μ, Vρ(α, w) = 1.
(5) Vμ(□α, w) = 1 if and only if for every w’ such that (w, w’) ∈ R, Vμ(α, w’) = 1.
The rules for the defined operators should be obvious.
A wff is valid in a model iff it is true at every world and relative to every value
assignment. It is valid on a quantification frame iff it is valid in every model based on the
quantification frame. It is valid on a frame iff it is valid on every quantification frame
based on the frame.
As an illustration, consider an instance of the famous Barcan formula: ∀x□Ax → □∀xAx.
Intuitively it says that if everything is necessarily an A, then it is necessarily the case that
everything is an A. This formula is valid on every constant-domain quantification frame.
In other words, there is no counter-model to this formula based on a constant-domain
quantification frame. Here is why. Suppose there is a model <W, R, D, V>, in which
∀x□Ax → □∀xAx is false at a world w (I don’t relativize to a value assignment, because
for closed formulas it does not really matter). That means ∀x□Ax is true at w, and □∀xAx
is false at w. Since □∀xAx is false at w, there is a world w’ accessible from w and at w’
∀xAx is false. That means there is an e ∈ D such that e ∉ V(A, w’). (In plain words, there is
at least one thing in the domain that is not an A at world w’.) Then when x is assigned the
value e, Ax is false at w’ and hence □Ax is false at w. It follows that ∀x□Ax is false at w,
which contradicts the requirement that ∀x□Ax be true at w. Therefore, there is no such
model.
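The argument can be spot-checked by exhaustive search on a tiny constant-domain setting. The Python sketch below is an illustration, not a proof: it assumes two worlds and two individuals, the tuple encoding of formulas used for illustration here, and it runs through every accessibility relation on the two worlds and every interpretation of A, looking for a world where the Barcan instance is false.

```python
from itertools import combinations, product

def truth(f, w, W, R, D, V, mu):
    """Constant-domain semantics: quantifiers range over the same D at every world."""
    op = f[0]
    if op == "atom":
        return tuple(mu[x] for x in f[2]) in V[(f[1], w)]
    if op == "not":
        return not truth(f[1], w, W, R, D, V, mu)
    if op == "or":
        return truth(f[1], w, W, R, D, V, mu) or truth(f[2], w, W, R, D, V, mu)
    if op == "all":
        return all(truth(f[2], w, W, R, D, V, {**mu, f[1]: e}) for e in D)
    if op == "box":
        return all(truth(f[1], u, W, R, D, V, mu) for u in W if (w, u) in R)
    raise ValueError(op)

def imp(a, b):
    return ("or", ("not", a), b)

def subsets(xs):
    xs = list(xs)
    return [set(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]

Ax = ("atom", "A", ("x",))
BF = imp(("all", "x", ("box", Ax)), ("box", ("all", "x", Ax)))

W, D = ["w1", "w2"], ["e1", "e2"]
counterexamples = 0
for R in subsets(product(W, W)):                      # all 16 relations on W
    for ext1, ext2 in product(subsets(D), repeat=2):  # all 16 interpretations of A
        V = {("A", "w1"): {(e,) for e in ext1},
             ("A", "w2"): {(e,) for e in ext2}}
        if not all(truth(BF, w, W, R, D, V, {}) for w in W):
            counterexamples += 1

print(counterexamples)  # 0
```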
Similarly, every formula of the schema ∀x□α → □∀xα can be shown to be valid on every
constant-domain quantification frame. (Next time we will reexamine such formulas when
the assumption of constant-domain is dropped.)
9.5 Systems
The systems we have studied in modal propositional logic, in particular K, D, T, S4, and
S5, can all be extended to modal predicate logic in a straightforward way. Indeed, for any
normal modal system S in modal propositional logic, we can construct a corresponding
system in modal predicate logic: LPC+S, with the following axiomatic basis:
Axiom Schemata
S Every substitution-instance of the axioms of S is an axiom.
∀1 ∀xα → α[y/x], where y is free for x in α.
Rules
∀2 If |– α → β and x does not occur free in α, then |– α → ∀xβ.
MP If |– α and |– α → β, then |– β.
N If |– α, then |– □α.
In particular, we have systems LPC+K, LPC+D, LPC+T, LPC+S4, and LPC+S5.
Moreover, there is the schema of Barcan formula (named after the brilliant logician and
philosopher R. Barcan Marcus).
BF ∀x□α → □∀xα
Following the textbook, we will refer to the system LPC+S+BF simply as S+BF. So we
have systems K+BF, D+BF, T+BF, S4+BF, and S5+BF.
For every S, S+BF is of course an extension of LPC+S, but not necessarily a proper
extension. In fact, in LPC+S5 we can derive BF (p. 247), so S5+BF is equivalent to
LPC+S5. In LPC+S4 (and of course all the weaker systems), BF is not a theorem, and so
S4+BF is a proper extension of LPC+S4. (However, to show that BF is not a theorem of
LPC+S4, we need to go beyond the constant-domain models.)
The textbook mentions a few theorems of K+BF. I will discuss some of them in class.
As is proved in the textbook, the set of frames for S+BF is exactly the same as the set of
frames for S. I do not expect you to go through the proof, but you should note the obvious
consequences of this result: every theorem of K+BF is valid on every frame; every
theorem of D+BF is valid on every serial frame; every theorem of T+BF is valid on
every reflexive frame; every theorem of S4+BF is valid on every frame that is reflexive
and transitive; and every theorem of S5+BF is valid on every universal frame.
Exercise 9
1. Which of the following formulas are de re formulas?
(a) xAx  xAx
(b) x(Ax  Bx)  (xAx  xBx)
(c) xyLxy
(d) ~xyLxy
2. Consider the following (constant-domain) model:

W = {w1, w2, w3}; R = W×W
D = {e1, e2, e3}
V(A, w1) = V(A, w2) = {e1, e2}, V(A, w3) ={e3}
V(B, w1) = {e1, e2, e3}, V(B, w2) = {e2, e3}, V(B, w3) ={}
For each of the following formulas, determine whether the formula is valid in the model.
(a) xAx
(b) xAx
(c) xBx  yBy
(d) xBx  xBx
(e) (xAx  xBx)  x(Ax  Bx)
(f) x(~Ax  ~Bx)
3. Consider the following (constant-domain) quantification frame:

W = {w1, w2, w3}; R = W×W
D = {e1, e2}
For each of the following formulas, determine whether it is valid on the frame. If not,
present a counter-model based on the quantification frame.
(a) x(Ax  Bx)  (xAx  xBx)
(b) x(Ax  yBy)
(c) xBx  xBx
(d) x(Ax  Bx)  (xAx  xBx)
4. Show that (xAx  xBx)  x(Ax  Bx) is not a theorem of the system S5+BF.
Lecture 10
10.1 A more general semantics (without the constant-domain assumption)
We now drop the constant-domain assumption. That is, we allow different possible
worlds to have different domains of quantification. If you will, you may think of this as
the idea that perhaps different things exist in different worlds. Now if Jiji does not exist
in some possible world (which actually sounds very compelling if all that means is Jiji
might not have existed), what do we make of the statement that everyone is necessarily
mortal? For this statement to be true in our world, it needs to be true in our world that Jiji
is necessarily mortal. That means Jiji is mortal in every possible world accessible from
ours. But does it make sense to say Jiji is (or is not) mortal in those worlds where he does
not exist?
Depending on whether it makes sense, the formal semantics may be developed in
different ways. If it makes sense, then a formula like Mx has a truth value at a world w
even when x is assigned an individual that is not in the domain at w. If it does not make
sense, then it seems more appropriate to allow truth-value gaps, so that Mx is treated as
undefined or lacking truth value at a world w when x is assigned an individual that is not
in the domain at w. Which way should we go?
It turns out either way is fine: each way gives us a coherent semantics, and the two,
fortunately, do not differ regarding which formulas are valid (this can be proved
rigorously, but we shall not bother with the technical details). We will thus follow the
slightly simpler way, the semantics without truth-value gaps.
Formally, we still start with a frame <W, R>. Instead of requiring that there be a constant
domain, we allow each world to have its own domain. The standard way to implement
this is to define a quantification frame as a quadruple <W, R, D, Q>, where D is a non-
empty set (if you will, you can think of this set as the union of all the domains), and Q is
a function that maps each world in W to a subset of D, i.e., Q(w) ⊆ D for w ∈ W. Q(w) is
understood to be the domain of quantification at world w, and sometimes we simply write
Dw for Q(w).
A model is a quantification frame plus an interpretation function V. As before, for
each predicate symbol P of arity n and each world w ∈ W, V returns a set of (ordered)
n-tuples of elements of D. (Note that it is D, not Dw. That is how we enforce that Mx has a
truth value at w even when x is assigned an individual not in Dw.)
A value assignment is again a function that maps each individual variable to an element
of D. Given a model <W, R, D, Q, V> and a value assignment μ, every wff in the
language of modal predicate logic gets a truth value at each world w ∈ W, according to the
following rules:
(1) Vμ(Px1…xn, w) = 1 iff (μ(x1), …, μ(xn)) ∈ V(P, w).
(2) Vμ(~α, w) = 1 iff Vμ(α, w) = 0.
(3) Vμ(α∨β, w) = 1 iff Vμ(α, w) = 1 or Vμ(β, w) = 1.
(4) Vμ(∀xα, w) = 1 iff for every x-alternative ρ of μ such that ρ(x) ∈ Q(w), Vρ(α, w) = 1.
(5) Vμ(□α, w) = 1 iff for every w’ such that (w, w’) ∈ R, Vμ(α, w’) = 1.
Compared to the constant-domain semantics, the only difference is that in the
interpretation of quantified formula, i.e., rule (4), the domain of quantification is Q(w) or
Dw, instead of D (though of course it can happen that Q(w) = D).
A formula α is valid in a model iff for every world w, Vμ(α, w) = 1 for every value
assignment μ such that μ(x) ∈ Q(w) for every free variable x in α. Obviously, the
restriction on μ (that it send every free variable of α into Q(w)) matters only for open
formulas. As before, α is valid on a
quantification frame iff it is valid in every model based on the quantification frame. It is
valid on a frame iff it is valid on every quantification frame based on the frame.
Clearly the constant-domain semantics is just a special case, where we only consider
quantification frames in which Q(w) = D for every w.
10.2 The Barcan Formula and Shrinking Domain
As we have seen, the Barcan formula (BF), for example ∀x□Ax → □∀xAx, is valid in
every constant-domain model. When we drop the constant-domain assumption, it is no
longer always valid. In words, the (instance of) Barcan formula says that if everything is
necessarily A, then it is necessarily the case that everything is A. When evaluated at a
world w, it basically states that if everything in the domain of w is A in every possible
world accessible from w, then in every possible world accessible from w, everything is A.
This statement would be false in case everything in the domain of w is A in every
possible world accessible from w, but there is some individual e not in the domain of w
but in the domain of w’ accessible from w such that e is not A (in w’ then it is false that
everything is A).
More formally, here is a simple counter-model to the formula:

W = {w1, w2}; R = {(w1, w2)}; D = {e1, e2}; Q(w1) = {e1}, Q(w2) = D;
V(A, w1) = V(A, w2) = {e1}.
In this model, from the viewpoint of w1, everything is necessarily A (because the only
thing in the domain of w1, e1, is necessarily A), but it is not true that necessarily
everything is A (because in w2, there is something, i.e., e2, that is not A).
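The counter-model can be verified mechanically. Below is a minimal Python sketch of the varying-domain truth rules, assuming formulas are nested tuples and a model is packed as a tuple (W, R, D, Q, V) with Q and V as dicts; this packaging and the function names are illustrative choices.

```python
def truth(f, w, M, mu):
    """Varying-domain semantics without truth-value gaps."""
    W, R, D, Q, V = M
    op = f[0]
    if op == "atom":   # extensions may mention individuals outside Q(w)
        return tuple(mu[x] for x in f[2]) in V[(f[1], w)]
    if op == "not":
        return not truth(f[1], w, M, mu)
    if op == "or":
        return truth(f[1], w, M, mu) or truth(f[2], w, M, mu)
    if op == "all":    # rule (4): only x-alternatives whose value lies in Q(w)
        return all(truth(f[2], w, M, {**mu, f[1]: e}) for e in Q[w])
    if op == "box":    # rule (5)
        return all(truth(f[1], u, M, mu) for u in W if (w, u) in R)
    raise ValueError(op)

def imp(a, b):
    return ("or", ("not", a), b)

# the counter-model from the text
W = ["w1", "w2"]
R = {("w1", "w2")}
D = {"e1", "e2"}
Q = {"w1": {"e1"}, "w2": {"e1", "e2"}}
V = {("A", "w1"): {("e1",)}, ("A", "w2"): {("e1",)}}
M = (W, R, D, Q, V)

Ax = ("atom", "A", ("x",))
antecedent = ("all", "x", ("box", Ax))   # everything is necessarily A
consequent = ("box", ("all", "x", Ax))   # necessarily, everything is A
BF = imp(antecedent, consequent)

print(truth(antecedent, "w1", M, {}))  # True
print(truth(consequent, "w1", M, {}))  # False
print(truth(BF, "w1", M, {}))          # False
```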
On what kind of quantification frames is BF valid? We already know it is valid on every
constant-domain frame, so constant-domain is a sufficient condition for its validity. But it
is not a necessary condition. The sufficient and necessary condition is the following: for
every (w, w’) ∈ R, Q(w’) ⊆ Q(w). In words, the condition is that whenever w’ is possible
relative to w, the domain at w’ is contained in the domain at w. Let’s call it the
shrinking-domain (收缩个体域) condition. The constant-domain condition obviously implies
the shrinking-domain condition. And the previous counter-model is obviously based on a
frame that violates the shrinking-domain condition.
It is easy to show that BF is valid on a quantification frame if and only if the frame
satisfies the shrinking-domain condition. I will leave it to you.
An equivalent formulation of the (instance of) Barcan formula is ◇∃xAx → ∃x◇Ax
(indeed this is how Barcan Marcus herself presented it). At this point we can recall a
sample argument I gave at the very beginning of the course: It is possible that someone is
a superman. So, someone is possibly a superman. Now you know one worry about it.
10.3 The Converse of Barcan Formula and Expanding Domain
The converse of Barcan formula, often denoted BFC, is □∀xAx → ∀x□Ax (or ∃x◇Ax →
◇∃xAx). In words, it says that if it is necessarily the case that everything is A, then
everything is necessarily A. This formula is invalidated in the following model:
W = {w1, w2}; R = {(w1, w2)}; D = {e1, e2}; Q(w1) = D, Q(w2) = {e1};
V(A, w1) = {e1, e2}, V(A, w2) = {e1}.
In this model, from the viewpoint of w1, it is necessarily the case that everything is A
(because w2 is the only world accessible from w1, and in w2 ∀xAx is true), but not
everything in w1 is necessarily A (because e2 is not A at w2). So BFC is false at w1.
It is easy to show that BFC is valid on a quantification frame if and only if the frame
satisfies the expanding-domain (扩张个体域) condition (called inclusion requirement in
the textbook): for every (w, w’) ∈ R, Q(w) ⊆ Q(w’). In words, the condition is that
whenever w’ is possible relative to w, the domain at w’ contains the domain at w. Again I
will leave the demonstration to you.
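Both domain conditions are simple inclusion checks along the accessibility relation. The Python sketch below assumes R is a set of pairs (w, w’) and Q a dict from worlds to sets; the function names are hypothetical. It is applied to the frames of the two counter-models given above.

```python
def shrinking_domain(R, Q):
    # whenever w' is accessible from w, Q(w') is contained in Q(w)
    return all(Q[v] <= Q[w] for (w, v) in R)

def expanding_domain(R, Q):
    # whenever w' is accessible from w, Q(w) is contained in Q(w')
    return all(Q[w] <= Q[v] for (w, v) in R)

R = {("w1", "w2")}
Q_bf_countermodel = {"w1": {"e1"}, "w2": {"e1", "e2"}}    # frame refuting BF
Q_bfc_countermodel = {"w1": {"e1", "e2"}, "w2": {"e1"}}   # frame refuting BFC

print(shrinking_domain(R, Q_bf_countermodel), expanding_domain(R, Q_bf_countermodel))
# False True
print(shrinking_domain(R, Q_bfc_countermodel), expanding_domain(R, Q_bfc_countermodel))
# True False
```

As expected, the frame refuting BF violates the shrinking-domain condition, and the frame refuting BFC violates the expanding-domain condition.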
Many people find the expanding-domain condition more plausible than the shrinking-
domain condition. I shall not explore their metaphysical intuitions here. But a technical
point is worth noting. A consequence of giving up the expanding-domain condition is that
the rule of necessitation does not preserve validity any more. That is, given a
quantification frame that does not satisfy the expanding-domain condition, you can find a
wff α valid on the frame such that □α is not valid on the frame. An implication of this
fact is that the frames for systems with the rule of necessitation must satisfy the
expanding-domain condition. In particular, the frames for the system LPC+K satisfy the
expanding-domain condition. Thus, BFC is valid on every frame for LPC+K. It is then
unsurprising that BFC is a theorem of LPC+K (a three-line proof is given on pp. 245-246
of the textbook).
BF, by contrast, is not a theorem of LPC+K. Indeed the counter-model to BF we saw in
the last section can be used to show that BF is not a theorem of LPC+K (or for that
matter, of LPC+S4). However, as I mentioned last time, BF is a theorem of LPC+S5. It
follows that a frame for LPC+S5 must satisfy the shrinking-domain condition. And this is
not surprising. Remember that a frame for LPC+S5 must be, among other things,
symmetric. Since LPC+S5 has the rule of necessitation, its frames must satisfy the
expanding-domain condition. By symmetry, whenever (w, w’) ∈ R we also have (w’, w) ∈ R, so the
shrinking-domain condition holds as well, which entails that for every (w, w’) ∈ R, Q(w) = Q(w’). So a
universal quantification frame for LPC+S5 must be a constant-domain frame.
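The step from symmetry plus expanding domains to equal domains can be sanity-checked by brute force on a small example. The Python sketch below is only a finite check, not a proof. The particular symmetric relation, the two-element D, the restriction to non-empty world domains, and all function names are assumptions made for illustration.

```python
from itertools import product

D = ["e1", "e2"]
W = ["w1", "w2", "w3"]
# A symmetric accessibility relation chosen for illustration
# (it is also reflexive, though that plays no role in the check).
R = {(w, w) for w in W} | {("w1", "w2"), ("w2", "w1"),
                           ("w2", "w3"), ("w3", "w2")}

# All non-empty subsets of D, as candidate world domains.
subsets = [frozenset(e for e, keep in zip(D, bits) if keep)
           for bits in product([False, True], repeat=len(D)) if any(bits)]

def expanding_domain(R, Q):
    """True iff for every (w, v) in R, Q(w) is a subset of Q(v)."""
    return all(Q[w] <= Q[v] for (w, v) in R)

def equal_along_R(R, Q):
    """True iff for every (w, v) in R, Q(w) = Q(v)."""
    return all(Q[w] == Q[v] for (w, v) in R)

def symmetry_forces_equal_domains():
    # Try every assignment of domains to worlds; look for an
    # expanding-domain frame whose domains differ along R.
    for choice in product(subsets, repeat=len(W)):
        Q = dict(zip(W, choice))
        if expanding_domain(R, Q) and not equal_along_R(R, Q):
            return False
    return True

print(symmetry_forces_equal_domains())
```

No counterexample is found on this frame, as the argument above predicts: with R symmetric, the expanding-domain condition applies in both directions along every pair in R.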
Exercise 10
1. Show that the formula ∀x□Ax → □∀xAx is valid on a quantification frame if and only if
the frame satisfies the shrinking-domain condition.
2. Show that the formula □∀xAx → ∀x□Ax is valid on a quantification frame if and only if
the frame satisfies the expanding-domain condition.
3. Consider the following quantification frame:

W = {w1, w2, w3}; R = W×W
D = {e1, e2}; Q(w1) = Q(w2) = Q(w3) = D
For each of the following formulas, determine whether it is valid on the frame. If not,
present a counter-model based on the quantification frame.
(a) x(Ax  Bx)  (xAx  xBx)
(b) x(Ax  yBy)
(c) xBx  xBx
(d) x(Ax  Bx)  (xAx  xBx)
4. For each of the following statements, determine whether it is true or false.

(a) Consider the argument: It is possible that something travels faster than light. So,
something possibly travels faster than light. The form of this argument is valid on the
following quantification frame: W = {w1, w2, w3}; R = {(w1, w2), (w1, w3), (w3, w2)}; D =
{e1, e2, e3}; Q(w1) = Q(w2) = {e2, e3}, Q(w3) = D
(b) Consider the argument: Something possibly travels faster than light. So, it is possible
that something travels faster than light. This argument is valid according to the system
LPC+K.
(c) Every quantification frame for the system LPC+B satisfies the following condition:
For every (w, w’) ∈ R, Q(w) = Q(w’).
(d) The formula (xAx  Ay) is valid on the frame: W = {w1, w2}; R = {(w1, w2)}; D =
{e1, e2}; Q(w1) = D, Q(w2) = {e1}.
(e) The rule of necessitation does not preserve validity on the frame: W = {w1, w2}; R =
{(w1, w2)}; D = {e1, e2}; Q(w1) = D, Q(w2) = {e1}.
(f) There are frames for the system LPC+S4 that do not satisfy the shrinking-domain
condition.
