Hannes Leitgeb

March 2010

Abstract

We prove that given reasonable assumptions, it is possible to give an explicit deﬁni-

tion of belief simpliciter in terms of subjective probability, such that it is neither the

case that belief is stripped of any of its usual logical properties, nor is it the case that

believed propositions are bound to have probability 1. Belief simpliciter is not to be

eliminated in favour of degrees of belief, rather, by reducing it to assignments of con-

sistently high degrees of belief, both quantitative and qualitative belief turn out to be

governed by one uniﬁed theory. Turning to possible applications and extensions of the

theory, we suggest that this will allow us to see: how the Bayesian approach in general

philosophy of science can be reconciled with the deductive or semantic conception of

scientiﬁc theories and theory change; how primitive conditional probability functions

(Popper functions) arise from conditionalizing absolute probability measures on max-

imally strong believed propositions with respect to diﬀerent cautiousness thresholds;

how the assertability of conditionals can become an all-or-nothing aﬀair in the face of

non-trivial subjective conditional probabilities; how knowledge entails a high degree

of belief but not necessarily certainty; and how high conditional chances may become

the truthmakers of counterfactuals.

1 Introduction

[THIS IS A PRELIMINARY AND INCOMPLETE DRAFT OF JUST THE TECHNICAL

DETAILS. . .]

Belief is said to come in a quantitative version—degrees of belief—and in a qualitative

one—belief simpliciter. More particularly, rational belief is said to have such a quantita-

tive and a qualitative side, and indeed we will only be interested in notions of belief here

which satisfy some strong logical requirements. Quantitative belief is given in terms of

numerical degrees that are usually assumed to obey the laws of probability, and we will

follow this tradition. Belief simpliciter, which only recognizes belief, disbelief, and sus-

pension of judgement, is closed under deductive inference as long as every proposition

that an agent is committed to believe is counted as being believed in an idealised sense;

this is how epistemic logic conceives of belief, and we will subscribe to this view in the

following. Despite these logical differences between the two notions of belief, it would

be quite surprising if it did not turn out that quantitative and qualitative belief were but

aspects of one and the same underlying substratum; after all, they are both concepts of

belief. However, this still allows for a variety of possibilities: they could be mutually irre-

ducible conceptually, with only some more or less tight bridge laws relating them; or one

could be reducible to the other, without either of them being eliminable from scientiﬁc or

philosophical thought; or either of them could be eliminable. So which of these options

should we believe to be true?

The concept of quantitative belief is being applied successfully by scientists, such as

cognitive psychologists, economists, and computer scientists, but also by philosophers, in

particular, in epistemology and decision theory; eliminating it would be detrimental both

to science and philosophy. On the other hand, it has been suggested (famously, by Richard

Jeﬀrey) that the concept of belief simpliciter can, and should, be eliminated in favour of

keeping only quantitative belief. But this is not advisable either: (i) Epistemic logic, huge

chunks of cognitive science, and almost all of traditional epistemology rely on the concept

of belief in the qualitative sense; by abandoning it one would simply have to sacriﬁce too

much. (ii) Beliefs held by some agent are the mental counterparts of the scientiﬁc theories

and hypotheses that are held by a scientist or a scientiﬁc community; they can be true

or false just as those theories and hypotheses can be (taking for granted a realist view of

scientific theories). But not many would recommend banning the concept of holding a

scientific theory/hypothesis from science or philosophy of science. (iii) The concept of belief

simpliciter, which is a classiﬁcatory concept, occupies a more elementary scale of mea-

surement than the numerical concept of quantitative belief does, which is precisely one of

the reasons why it is so useful. That is also why giving up on any of the standard properties

of rational belief, such as closure under conjunction (the Conjunction property)—if X and

Y are believed, then X∧Y is believed—as some have suggested in response to lottery-type

paradoxes (see Kyburg...), would not be a good idea: for without these properties belief

simpliciter would not be so much less complex than quantitative belief anymore (however,

see Hawthorne & Makinson...). But then one could have restricted oneself to quantitative

belief from the start, and in turn one would lack the simplifying power of the qualitative

belief concept. (iv) Beliefs involve dispositions to act under certain conditions. For instance,

if I believe that my original edition of Carnap’s Logical Syntax is on the bookshelf in my

oﬃce, then given the desire to look something up in it, and with the right background con-

ditions being satisﬁed, such as not being too tired, not being distracted by anything else,

and so on, I am disposed to go to my oﬃce and pick it up. The same belief also involves

lots of other dispositions, and what holds all of these dispositions together is precisely

that belief. If one looks at the very same situation in terms of degrees of belief, then with

everything else in place, it will be a matter of what my degree of belief in the proposition

that Carnap’s Logical Syntax is in my oﬃce is like whether I will actually go there or not,

and similarly for all other relevant dispositions. Somehow the continuous scale of degrees

of belief must be cut down to a binary decision: acting in a particular way or not. And

the qualitative concept of belief is exactly the one that plays that role, for it is meant to

express precisely the condition other than desire and background conditions that needs to

be satisfied in order for me to act in the required way, that is, for instance, to walk to the

oﬃce and to pick up Carnap’s monograph from the bookshelf. Decision theory, which is

a probabilistic theory again, goes some way towards achieving this without using a qualitative

concept of belief, but it does not quite give a complete account. Take assertions as a class

of actions. One of the linguistic norms that govern assertability is: If all of $A_1, \ldots, A_n$ are
assertable for an agent, then so is $A_1 \wedge \ldots \wedge A_n$. One may of course attack this norm on
different grounds, but the norm still seems to be in force both in everyday conversation and in
scientific reasoning. Here is a plausible way of explaining why we obey that norm by means
of the concept of qualitative belief: Given the right desires and background conditions,
a descriptive sentence gets asserted by an agent if and only if the agent believes the
sentence to be true. And the assertability of a sentence $A$ is just that very necessary epistemic
condition for assertion (belief in the truth of $A$) to be satisfied. (Williamson... states an
analogous condition in terms of knowledge rather than belief; but it is again a qualitative
concept that is used, not a quantitative one.) But if an agent believes all of $A_1, \ldots, A_n$, then
the agent believes, or is at least epistemically committed to believe, also $A_1 \wedge \ldots \wedge A_n$.
That explains why if $A_1, \ldots, A_n$ are assertable for an agent, so is $A_1 \wedge \ldots \wedge A_n$. And it
is not clear how standard decision theory just by itself, without any additional resources
at hand, such as a probabilistic explication of belief, would be able to give a similar
explanation. The assertability of indicative conditionals $A \to B_i$ makes for a similar case.
Here, one of the linguistic norms is: If all of $A \to B_1, \ldots, A \to B_n$ are assertable for an
agent, then so is $A \to (B_1 \wedge \ldots \wedge B_n)$. This may be explained by invoking the Ramsey test
for conditionals (see...) as follows: Given the right desires and background conditions,
$A \to B_i$ gets asserted by an agent if and only if the agent accepts $A \to B_i$, which in turn is
the case if and only if the agent believes $B_i$ to be true conditional on the supposition of $A$.
Again, the assertability of a sentence, $A \to B_i$, is just that respective necessary epistemic
condition (belief in $B_i$ on the supposition of $A$) to be satisfied. But, if an agent believes
all of $B_1, \ldots, B_n$ conditional on $A$, then the agent believes, or is epistemically committed
to believe, also $B_1 \wedge \ldots \wedge B_n$ on the supposition of $A$. Therefore, if $A \to B_1, \ldots, A \to B_n$
are assertable for an agent, so is $A \to (B_1 \wedge \ldots \wedge B_n)$. Ernest Adams’ otherwise
marvellous probabilistic theory of indicative conditionals (...), which ties the acceptance of any
such conditional to its corresponding conditional subjective probability and hence to the
quantitative counterpart of conditional belief, does not by itself manage to explain such
patterns of assertability. While from Adams’ theory one is able to derive that the
uncertainty (1 minus the corresponding conditional probability) of $A \to (B_1 \wedge \ldots \wedge B_n)$ is less
than or equal to the sum of the uncertainties of $A \to B_1, \ldots, A \to B_n$, and thus if all of the
conditional probabilities that come attached to $A \to B_1, \ldots, A \to B_n$ tend to 1 then so does
the conditional probability that is attached to $A \to (B_1 \wedge \ldots \wedge B_n)$, it also follows that for
an increasing number $n$ of premises, ever greater lower boundaries $1 - \delta$ of the conditional
probabilities for $A \to B_1, \ldots, A \to B_n$ are needed in order to guarantee that the conditional
probability for $A \to (B_1 \wedge \ldots \wedge B_n)$ is bounded from below by a given $1 - \epsilon$. No uniform

boundary emerges that one might use in order to determine for a conditional—whether

premise or conclusion, whatever the number of premises, or whether in the context of an

inference at all—its assertability simpliciter. But since there is only assertion simpliciter,

at some point a condition must be invoked that discriminates between what is a case of

asserting and what is not. Once again the concept of (conditional) qualitative belief gives

us exactly what we need.
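The dependence of the required premise bound on the number of premises can be made explicit. The following short derivation is only a sketch: it writes $U$ for uncertainty and $\epsilon$ for the tolerated uncertainty of the conclusion (both symbols are introduced here for illustration), and it uses nothing beyond the sum bound from Adams' theory quoted above.

```latex
% U(.) denotes uncertainty: one minus the corresponding conditional probability.
\[
  U\bigl(A \to (B_1 \wedge \ldots \wedge B_n)\bigr) \;\le\; \sum_{i=1}^{n} U(A \to B_i).
\]
% If every premise has conditional probability at least 1 - \delta, then
% each U(A \to B_i) is at most \delta, and so
\[
  P(B_1 \wedge \ldots \wedge B_n \mid A) \;\ge\; 1 - n\delta .
\]
% Hence, going by the sum bound alone, the conclusion is guaranteed to reach
% 1 - \epsilon once
\[
  \delta \;\le\; \frac{\epsilon}{n} .
\]
```

For instance, with $\epsilon = 0.1$, ten premises call for $\delta \le 0.01$ and a hundred premises for $\delta \le 0.001$, so the premise threshold $1 - \delta$ has to move with $n$ and no single threshold serves for every inference.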

The upshot of this is: Neither the concept of quantitative belief nor the concept of qual-

itative belief ought to be eliminated from science or philosophy. But this leaves open, in

principle, the possibility of reducing one to the other without eliminating either of them—

using traditional terminology: one concept might simply turn out to be logically prior to

the other. Now, reducing degrees of belief to belief simpliciter seems unlikely (no pun in-

tended!), simply because the formal structure of quantitative belief is so much richer than

the one of qualitative belief. But for the same reason, at least prima facie, one would think

that the converse ought to be feasible: by abstracting in some way from degrees of belief,

it ought to be possible to explicate belief simpliciter in terms of them. Belief simpliciter

would thus be qualitative only at ﬁrst glance; its deeper logical structure would turn out to

be quantitative after all. One obvious suggestion of how to explicate belief simpliciter on

the basis of degrees of belief is to maintain that having the belief that X is just having as-

signed to X a degree of belief strictly above some threshold level less than 1 (this is called

the Lockean thesis by Richard Foley... more about which below). If that threshold is also

greater than or equal to $\frac{1}{2}$, then belief would simply amount to high subjective

probability. But since the probability of X ∧ Y might well be below the threshold even when the

probabilities of X and Y are not, one would thus have to sacriﬁce logical properties such as

the Conjunction property, which one should not, as mentioned above. While the Lockean

thesis seems materially ﬁne, for qualitative belief does seem to be close to high subjective

probability, it does not get the logical properties of qualitative belief right. Or one iden-

tiﬁes the belief that X with having a degree of belief of 1 in X: call this the ‘probability

1 proposal’. While this does much better on the logical side, it is not perfect on that side

either. Truth for propositions is certainly closed under taking conjunctions of arbitrary

cardinality, however, being assigned probability 1 is not so except for those cases in which

probability assignments simply coincide with truth value assignments; but in the presence

of uncertainty, subjective probability measures do not. If qualitative belief inherits this

general conjunction property from truth—maybe because truth is what qualitative beliefs

aim at, whether directly or indirectly—then an explication of qualitative belief in terms

of probability 1 is simply not good enough. More importantly, apart from such logical

considerations, the proposal is materially wrong. As Roorda (...) pointed out, our pre-

theoretic notions of belief-in-degrees and belief simpliciter have the following epistemic

and pragmatic properties: (i) One can believe X and Y without assigning the same degree

of belief to them. But then at least one of X and Y must have a probability other than 1. For

instance, I believe that my desk will still be there when I enter my oﬃce tomorrow, and I

also believe that every natural number has a successor, but should I therefore be forced to

assign the same degree of belief to them? (ii) One can believe X without being disposed

to accept every bet whatsoever on X, although the latter ought to be the case by the standard

Bayesian understanding of probabilities if one assigns probability 1 to X, at least as long as

the stakes of the bet are not too extravagant. For example, I do believe that I will be in my

oﬃce tomorrow. But I would refrain from accepting a bet on this if I were oﬀered 1 Pound

if I were right, and if I were to lose 1000 Pounds if not. (Alternatively, one could abandon

the standard interpretation of subjective probabilities in terms of betting quotients, but

breaking with such a successful tradition comes with a price of its own. However, later we

will see that our theory will allow for a reconciling offer in that direction, too.) Roorda

presents a third argument against the probability 1 proposal, based on considerations of

fallibilism, but we are going to deal with it later. This shows that Ramsey’s term ‘partial

belief’ for subjective probability is in fact misleading (or at least ambiguous, about

which more later): for full belief, that is, belief simpliciter, does not coincide with having

a degree of belief of 1, and hence a degree of belief of less than 1 should not be regarded

as partial belief. All of these points also apply to a much more nuanced version of the

probability 1 proposal which was developed by Bas van Fraassen, Horacio Arló-Costa, and

Rohit Parikh, according to which within the quantitative structure of primitive conditional

probability measures (Popper functions) one can always ﬁnd so-called belief cores, which

are propositions with particularly nice and plausible logical properties; by taking super-

sets of those one can deﬁne elegantly notions of qualitative belief in diﬀerent variants

and strengths. But the same problems as mentioned before emerge, since all belief cores

can be shown to have absolute probability 1. Additionally, the axioms of Popper func-

tions are certainly more controversial than those of the standard absolute or unconditional

probability measures, and since two distinct belief cores diﬀer only in terms of some set

of absolute probability 0, one wonders whether in many practically relevant situations in

which only probability measures on ﬁnite spaces are needed and where often there are no

non-empty zero sets at all—or otherwise the corresponding worlds with zero probabilistic

weight would simply have been dropped from the start—the analysis is too far removed

from the much more mundane reality of real-world reasoning and epistemological thought

experiments. On the other hand, we will see that the logical properties of belief cores are

enormously attractive: we will return to this later, when we will show that it is actually

possible to restore most of them in the new setting that we are going to propose.

Summing up: Reducing qualitative belief to quantitative belief does not seem to work

either. In the words of Jonathan Roorda (...), “The depressing conclusion . . . is that no

explication of belief is possible within the conﬁnes of the probability model”. Roorda

himself then goes on to suggest an explication that is based on sets of subjective probability

measures rather than just one probability measure as standard Bayesianism has it. In

contrast, we will bite the bullet and stick to just one probability measure below.

Given all of these problems, the only remaining option seems to be: neither quantitative

nor qualitative belief can be reduced to the other; while there are certainly bridge

principles of some kind that relate the two, it is impossible to understand qualitative be-

lief just in terms of quantitative belief or the other way round. A view like this has been

proposed and worked out in detail, for example, by Isaac Levi (...) and recently by James

Hawthorne (...). And apart from extreme Bayesians who believe that one can do without

the concept of qualitative belief, it is probably fair to say that something like this is the

dominating view in epistemology these days.

In what follows, we are going to argue against this view: we aim to show that it is in

fact possible to reduce belief simpliciter to probabilistic degrees of belief by means of an

explicit deﬁnition, without stripping qualitative belief of any of its constitutive properties,

without revising the intended interpretation of subjective probabilities in any way, without

running into any of the diﬃculties that we found to aﬀect the standard proposals for quan-

titative explications of belief, and without thereby intending to eliminate the concept of

belief simpliciter in favour of quantitative belief. Both notions of belief will be preserved;

it is just that having the qualitative belief that A will turn out to be deﬁnable in terms of

assignments of consistently high degrees of belief, where what this means exactly will

be clariﬁed below. We will also point out which consequences this has for various prob-

lems in philosophy of science, epistemology, and the philosophy of language. And for

the convinced Bayesian, who despises qualitative belief, the message will be: within your

subjective probability measure you ﬁnd qualitative belief anyway; so you might just as

well use it.

Before we turn to the details of our theory, we will ﬁrst sketch the underlying idea of

the explication.

2 The Basic Idea

Our starting point is again what Richard Foley (..., pp. 140f) calls the Lockean thesis, that

is:

to say that you believe a proposition is just to say that you are suﬃciently

conﬁdent of its truth for your attitude to be one of belief

and consequently

it is rational for you to believe a proposition just in case it is rational for

you to have a suﬃciently high degree of conﬁdence in it, suﬃciently high to

make your attitude toward it one of belief.

He takes this to be derivative from Locke’s views on the matter, as exempliﬁed by

most of the Propositions we think, reason, discourse, nay act upon, are

such, as we cannot have undoubted Knowledge of their Truth: yet some of

them border so near upon Certainty, that we make no doubt at all about them;

but assent to them ﬁrmly, and act, according to that Assent, as resolutely, as

if they were infallibly demonstrated, and that our Knowledge of them was

perfect and certain (Locke..., p. 655, Book IV, Chapter XV; his emphasis)

and

the Mind if it will proceed rationally, ought to examine all the grounds of

Probability, and see how they make more or less, for or against any probable

Proposition, before it assents to or dissents from it, and upon a due ballancing

the whole, reject, or receive it, with a more or less ﬁrm assent, proportionably

to the preponderancy of the greater grounds of Probability on the one side or

the other. (Locke..., p. 656, Book IV, Chapter XV; his emphasis)

We take this account of belief simpliciter in terms of high degrees of belief to be right in

spirit. However, as we know from lottery paradox situations, it is not yet good enough:

there are logical principles for belief (such as the Conjunction principle) which we regard

as just as essential to the belief in X as assigning a suﬃciently high subjective probability

to X, and it is precisely these logical principles that are invalidated if the Lockean

thesis is turned into a deﬁnition of belief. Instead, we take the Lockean thesis to charac-

terise a more preliminary notion of belief, or what one might call prima facie belief:

Deﬁnition 1 Let P be a subjective probability measure. Let X be a proposition in the

domain of P: X is believed prima facie as being given by P if and only if P(X) > r.

Of course, more needs to be said about the threshold value r here, but let us postpone this

discussion.
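To see concretely why Definition 1 cannot serve as a definition of belief proper, here is a minimal Python sketch of a lottery-style case. It assumes a finite set of worlds with a uniform measure, represents propositions as sets of worlds, and picks r = 9/10 purely for illustration; the function and variable names are hypothetical and not part of the theory.

```python
from fractions import Fraction
from functools import reduce

def prob(weights, proposition):
    """P(X) for a proposition X given as a set of worlds."""
    return sum((weights[w] for w in proposition), Fraction(0))

def believed_prima_facie(weights, proposition, r):
    """Definition 1: X is believed prima facie iff P(X) > r."""
    return prob(weights, proposition) > r

# A fair lottery with n tickets; world i is the world in which ticket i wins.
n = 100
W = frozenset(range(n))
weights = {w: Fraction(1, n) for w in W}
r = Fraction(9, 10)  # illustrative threshold, an assumption of this sketch

# "Ticket i loses" is the set of all worlds other than world i.
loses = [W - {i} for i in range(n)]
all_lose = reduce(frozenset.intersection, loses)  # conjunction of all of them

each_believed = all(believed_prima_facie(weights, X, r) for X in loses)
conjunction_believed = believed_prima_facie(weights, all_lose, r)

print(each_believed, conjunction_believed)  # True False
```

Each proposition "ticket i loses" has probability 99/100 and so clears the threshold, while their conjunction is the empty proposition with probability 0. Prima facie belief in this sense is therefore not closed under conjunction.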

In analogy with the case of prima facie obligations in ethics, a proposition is believed

prima facie in view of the fact that it has an epistemic feature that speaks in favour of it

being a belief proper—that is, to have a suﬃciently high subjective probability—and as

long as no other of its epistemic properties tells against it being such, it will in fact be

properly believed.

Accordingly, as far as belief itself is concerned, we suggest dropping just the right-to-left

direction of the Lockean thesis, so that high subjective probability is still a necessary

condition for belief but it is not anymore demanded to be a suﬃcient one. Thus, ultimately,

all beliefs simpliciter will be among the prima facie candidates for beliefs. The left-to-

right direction is going to ensure that beliefs remain reasonably cautious—how cautious

will depend on the “cautiousness parameter” r—and that they inherit all the dispositional

consequences of having suﬃciently high degrees of belief. On the other hand, the right-to-

left direction was the one that got us into lottery-paradox-like trouble. Instead of it, we will

regard all the standard logical principles for belief as being constitutive of belief from the

start. Unlike the deﬁnition of prima facie belief which expresses a condition to be satisﬁed

by single beliefs, these logical principles do not apply to beliefs taken by themselves but

rather to systems of beliefs taken as wholes. Therefore, when putting together the left-to-

right direction of the Lockean thesis with these logical postulates, we need to formulate

the result as a constraint on an agent’s belief system or class. Furthermore, we will not

just do this for absolute or unconditional belief—the belief that X is the case—but also for

conditional belief, that is, belief under a supposition, as in: the belief that X is the case

under the supposition that Y is the case. Indeed, generalizing the left-to-right direction of

the original Lockean thesis to cases of conditional belief will pave the way to our ultimate

understanding of belief. And arguably belief simpliciter under a supposition is just as

important for our epistemic lives as belief simpliciter taken absolutely or unconditionally.

This will give us then something of the following form:

• If P is an agent’s degree-of-belief function at a time t, and if Bel is the class of
  propositions believed by the agent at t (and both relate to the same underlying class
  of propositions), then they have the following properties:

  (1) Probabilistic constraint:
      ∗ P is a probability measure.
      ⋮
      (Additional constraints on P.)

  (2) Logical constraints:
      ∗ For all propositions Y, Z: if Y ∈ Bel and Y logically entails Z, then Z ∈ Bel.
      ∗ For all propositions Y, Z: if Y ∈ Bel and Z ∈ Bel, then Y ∩ Z ∈ Bel.
      ∗ No logical contradiction is a member of Bel.
      ⋮
      (Other standard logical principles for Bel and their extensions to conditional belief.)

  (3) Mixed constraints:
      ∗ For all propositions X ∈ Bel, P(X) > r.
      ∗ (An extension of this to conditional belief.)
      ⋮
      (Additional mixed constraints on P and Bel.)

While the conjunction of (1), (2), and (3) might well do as a meaning postulate on ‘Bel’

and ‘P’, obviously this is not an explicit deﬁnition of ‘Bel’ on the basis of ‘P’ anymore. Is

there any hope of turning it into an explicit deﬁnition of belief again?
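The constraints that are spelled out explicitly above can be checked mechanically on a small finite example. The following Python sketch assumes that every subset of a finite W counts as a proposition, reads logical entailment as the subset relation and logical contradiction as the empty set, and ignores the elided additional constraints and the extensions to conditional belief. Passing the check is therefore only a necessary condition relative to the full set of postulates. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def powerset(W):
    """All subsets of W as frozensets (the propositions over W)."""
    items = sorted(W)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def prob(weights, X):
    """P(X) for a proposition X given as a set of worlds."""
    return sum((weights[w] for w in X), Fraction(0))

def satisfies_listed_constraints(bel, weights, r):
    """Check only the explicitly listed logical and mixed constraints."""
    props = powerset(weights)
    # Y entails Z is modelled as Y being a subset of Z.
    closed_under_entailment = all(Z in bel for Y in bel for Z in props if Y <= Z)
    closed_under_conjunction = all((Y & Z) in bel for Y in bel for Z in bel)
    consistent = frozenset() not in bel
    high_probability = all(prob(weights, X) > r for X in bel)
    return (closed_under_entailment and closed_under_conjunction
            and consistent and high_probability)

# Three equiprobable worlds and an illustrative threshold r = 1/2.
weights = {w: Fraction(1, 3) for w in range(3)}
r = Fraction(1, 2)
props = powerset(weights)

# Candidate 1: everything above the threshold (the full Lockean thesis).
lockean_class = {X for X in props if prob(weights, X) > r}
# Candidate 2: all propositions entailed by the single proposition {0, 1}.
entailed_class = {X for X in props if frozenset({0, 1}) <= X}

print(satisfies_listed_constraints(lockean_class, weights, r))   # False
print(satisfies_listed_constraints(entailed_class, weights, r))  # True
```

The first candidate fails because {0, 1} and {1, 2} both exceed the threshold while their intersection {1} does not. The second candidate keeps only the left-to-right direction of the Lockean thesis and satisfies the listed logical constraints.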

Immediately, David Lewis’ (...) classic method of deﬁning theoretical terms, which

builds on work by Ramsey and Carnap, comes to mind: given P, deﬁne ‘Bel’ to be the

class, such that the conditions on Bel and P above are the case. But of course this invites

all the standard worries about such deﬁnitions by deﬁnite description: First of all, for given

P, there might simply not be any such class Bel at all. Fortunately, we will be able to prove

that this worry does not get conﬁrmed. Secondly, at least for many P, there might be more

than just one class Bel that satisﬁes the constraints above. Worse, for some P, there might

even be two such classes that contain mutually inconsistent propositions. We will prove

later that this is not so: in fact, for every given P and for every two distinct classes Bel

which satisfy the conditions above (relative to that P) it is always the case that one of the

two contains the other as a subset. Even with that in place, one would still have to decide

which class Bel in the resulting chain of belief classes ought to count as the “actual” belief

class as being given by P in order to satisfy the uniqueness part of our intended deﬁnition

by deﬁnite description. But then again, what if there were a largest such class Bel? That

class would have all the intended properties, and it would contain every proposition that

is a member of any class Bel as above. It would therefore maximize the extent by which

prima facie beliefs in the sense deﬁned before are realized in terms of actual beliefs. In

other words: it would approximate as closely as possible the right-to-left direction of the

Lockean thesis that we were forced to drop in view of the logical principles of belief.

The class would thus have every right to be counted as the class of beliefs at a time t of an

agent whose subjective probability measure at that time is P, and no restriction of bounded

variables to “natural” classes as in Lewis’ original proposal would be necessary at all. If

such a largest belief class exists, of course—but as we will prove later, indeed it does.

What we will have found then is that the following is a materially adequate and explicit

deﬁnition of an agent’s beliefs in terms of the agent’s subjective probability measure:

• If P is an agent’s subjective probability measure at a time t that satisfies the
  additional constraints. . ., then a proposition (in the domain of P) is believed as being
  given by P if and only if it is a member of the largest class Bel of propositions that
  satisfies the following properties:

  (1) Belief constraints:
      ∗ For all propositions Y, Z: if Y ∈ Bel and Y logically entails Z, then Z ∈ Bel.
      ∗ For all propositions Y, Z: if Y ∈ Bel and Z ∈ Bel, then Y ∩ Z ∈ Bel.
      ∗ No logical contradiction is a member of Bel.
      ⋮
      (Other standard logical principles for Bel and their extensions to conditional belief.)

  (2) Mixed constraints:
      ∗ For all propositions X ∈ Bel, P(X) > r.
      ∗ (An extension of this to conditional belief.)
      ⋮
      (Additional mixed constraints on P and Bel.)

So we will have managed to deﬁne belief simpliciter just in terms of ‘P’ and logical and

set-theoretical vocabulary. In fact, it will turn out to be possible to characterize the deﬁning

conditions of belief just in terms of a simple and independently appealing quantitative

condition on P and elementary set-theoretic operations and relations.

Belief simpliciter will therefore have been reduced to degrees of belief. In the follow-

ing two sections, we are going to execute this strategy in all formal details. The remaining

sections will be devoted to applications and extensions of the theory.

3 The Reduction of Belief I: Absolute Beliefs

The goal of this section and the subsequent one is to enumerate a couple of postulates

on quantitative and qualitative beliefs and their interaction; and we will assume that the

ﬁctional epistemic agent ag that we will deal with has belief states of both kinds available

which obey these postulates. The terms ‘P’ and ‘Bel’ that will occur in these postulates

should be thought of as primitive ﬁrst, with each postulate expressing a constraint either

on the reference of ‘P’ or on the reference of ‘Bel’ or on the references of ‘P’ and ‘Bel’

simultaneously. Even though initially we will present these constraints on subjective prob-

ability and belief in the form of postulates or axioms, it will turn out that they will be strong

enough to constrain qualitative belief in a way such that the concept of qualitative belief

ends up being deﬁnable explicitly just on the basis of ‘P’, that is, in terms of quantitative

belief (and a cautiousness parameter) only. When we state the theorems from which this

follows, ‘P’ and ‘Bel’ will become variables, so that we will be able to say: For all P, Bel, it

holds that P and Bel satisfy so-and-so if and only if. . .. Accordingly, in the deﬁnition of

belief simpliciter itself, ‘P’ will be a variable again, and ‘Bel’ will be a variable the exten-

sion of which is deﬁned on the basis of ‘P’ (and mathematical vocabulary). We will keep

using the same symbols ‘P’ and ‘Bel’ for all of these purposes, but their methodological

status should always become clear from the context.

3.1 Probabilistic Postulates

Consider an epistemic agent ag which we keep ﬁxed throughout the article. Let W be a

(non-empty) set of logically possible worlds. Say, at t our agent ag is capable in principle

of entertaining all and only propositions (sets of worlds) in a class A of subsets of W,

where A is formally a σ-algebra over W, that is: W and ∅ are members of A; if X ∈ A then

the relative complement of X with respect to W, W\X, is also a member of A; for X, Y ∈ A,

X ∪ Y ∈ A; and finally if all of $X_1, X_2, \ldots, X_n, \ldots$ are members of A, then $\bigcup_{n \in \mathbb{N}} X_n \in A$. It

follows that A is closed under countable intersections, too. A is not demanded to coincide

with some power set algebra, instead A might simply not count certain subsets of W as

propositions at all.
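For a finite W the closure conditions just listed can be checked directly, since closure under pairwise unions then already yields closure under countable unions. The following Python sketch makes that finiteness assumption and uses hypothetical names; it also illustrates that A need not be the full power set.

```python
def is_algebra(family, W):
    """Check the closure conditions on a family of subsets of a finite W."""
    W = frozenset(W)
    family = {frozenset(X) for X in family}
    has_top_and_bottom = W in family and frozenset() in family
    closed_under_complement = all((W - X) in family for X in family)
    closed_under_union = all((X | Y) in family for X in family for Y in family)
    return has_top_and_bottom and closed_under_complement and closed_under_union

W = {1, 2, 3, 4}

# A coarse algebra: the agent cannot entertain {1} on its own.
coarse = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
# Not an algebra: the complement of {1} is missing.
broken = [set(), {1}, {1, 2, 3, 4}]

print(is_algebra(coarse, W))  # True
print(is_algebra(broken, W))  # False
```

In the first family the subset {1} is simply not a proposition, which corresponds to the case in which A does not coincide with the power set algebra.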

We will extend the standard logical terminology that is normally deﬁned for formulas

or sentences to propositions in A: so when we speak of a proposition as a logical truth we

actually have in mind the unique proposition W, when we say that a proposition is con-

sistent we mean that it is non-empty, when we refer to the negation of a proposition X we

do refer to its complement relative to W (and we will denote it by ‘¬X’), the conjunction

of two propositions is of course their intersection, and so on. We shall speak of conjunc-

tions and disjunctions of propositions even in cases of inﬁnite intersections or unions of

propositions.

Let P be ag’s degree-of-belief function (quantitative belief function) at time t. Follow-

ing the Bayesian take on quantitative belief, we postulate:

P1 (Probability) P is a probability measure on A, that is, P has the following properties: P : A → [0, 1]; P(W) = 1; P is finitely additive: if $X_1, X_2$ are disjoint members of A, then $P(X_1 \cup X_2) = P(X_1) + P(X_2)$.

Conditional probabilities are introduced by: $P(Y \mid X) = \frac{P(Y \cap X)}{P(X)}$ whenever P(X) > 0.
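For a finite W the postulate and the ratio formula can be made concrete. The following is only a minimal sketch: the three worlds and their weights are hypothetical values chosen for illustration, and A is assumed to be the full power set of W.

```python
from fractions import Fraction

# Hypothetical example: three worlds with illustrative weights that sum to 1.
WEIGHTS = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}
W = frozenset(WEIGHTS)


def prob(x):
    """P(X): the sum of the weights of the worlds in the proposition X."""
    return sum((WEIGHTS[w] for w in x), Fraction(0))


def cond_prob(y, x):
    """P(Y|X) by the ratio formula; returns None when P(X) = 0 (undefined)."""
    px = prob(x)
    if px == 0:
        return None
    return prob(y & x) / px


X = frozenset({"w1", "w2"})
Y = frozenset({"w2", "w3"})
p_y_given_x = cond_prob(Y, X)  # P(Y ∩ X) / P(X)
```

Exact fractions are used so that the additivity checks are not disturbed by floating-point rounding.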

As far as our familiar treatment of conditional probabilities in terms of the ratio formula for

absolute or unconditional probabilities is concerned, we should stress that the elegant theory of primitive conditional probability measures (Popper functions) would allow P(Y|X)

to be deﬁned and non-trivial even when P(X) = 0 (that is, as we will sometimes say, when

X is a zero set as being given by P). But the theory is still not accepted widely, and we

want to avoid the impression that the theory in this paper relies on Popper functions in

any sense. We shall nevertheless have occasion to return to Popper functions later in some

parts of the paper.

To P1 we add:

P2 (Countable Additivity) P is countably additive (σ-additive): if $X_1, X_2, \ldots, X_n, \ldots$ are pairwise disjoint members of A, then $P\left(\bigcup_{n \in \mathbb{N}} X_n\right) = \sum_{n=1}^{\infty} P(X_n)$.

Countable Additivity or σ-additivity is in fact not uncontroversial even within the Bayesian

camp itself, although in purely mathematical contexts, such as measure theory, σ-additivity

is usually beyond doubt (but see Schurz & Leitgeb...); we shall simply take it for granted now. For many practical purposes, A may simply be taken to be finite, and then σ-additivity reduces to finite additivity again, which is indeed uncontroversial for all Bayesians whatsoever.

In our context, Countable Additivity serves just one purpose: it simpliﬁes the theory.

However, in future versions of the theory one might want to study belief simpliciter in-

stead under the mere assumption of ﬁnite additivity, that is, assuming just P1 but not P2.

Extending the theory in that direction is feasible: Dropping P2 may be seen to correspond,

roughly, to what happens to David Lewis’ “spheres semantics” of counterfactuals when

the so-called Limit Assumption is dropped (to which Lewis himself does not subscribe,

while others do).

3.2 Belief Postulates

Let us turn now from quantitative belief to qualitative belief: Each belief simpliciter—or

more brieﬂy: each belief —that ag holds at t is assumed to have a set in A as its proposi-

tional content. As a ﬁrst approximation, assume that by ‘Bel’ we are going to denote the

class of propositions that our ideally rational agent believes to be true at time t. Instead

of writing ‘Y ∈ Bel’, we will rather say: Bel(Y); and we call Bel our agent ag’s belief

set at time t. In line with elementary principles of doxastic or epistemic logic (which are


entailed by the modal axiom K and by applications of necessitation to tautologies), Bel is

assumed to satisfy the following postulates:

1. Bel(W).

2. For all Y, Z ∈ A: if Bel(Y) and Y ⊆ Z, then Bel(Z).

3. For all Y, Z ∈ A: if Bel(Y) and Bel(Z), then Bel(Y ∩ Z).

Actually, we are going to strengthen the principle on ﬁnite conjunctions of believed propo-

sitions to the case of the conjunction of all believed propositions whatsoever:

4. For $\mathcal{Y} = \{Y \in A \mid \mathrm{Bel}(Y)\}$, $\bigcap \mathcal{Y}$ is a member of A, and $\mathrm{Bel}(\bigcap \mathcal{Y})$.

This certainly involves a good deal of abstraction. On the other hand, if A is ﬁnite, then

the last principle simply reduces to the case of ﬁnite conjunctions again. In any case, 4.

has the following obvious consequence: There is a least set (a strongest proposition) Y,

such that Bel(Y); that Y is just the conjunction of all propositions believed by ag at t. We

will denote this very proposition by: $B_W$. The main reason why we presuppose 4. is that

it enables us to represent the sum of ag’s beliefs in terms of such a unique proposition or

a unique set of possible worlds. In the semantics of doxastic or epistemic logic, our set

$B_W$ would correspond to the set of accessible worlds from the viewpoint of the agent’s

current mindset. Accordingly, using the terminology that is quite common in areas such

as belief revision or nonmonotonic reasoning, one might think of the members of $B_W$ as

being precisely the most plausible candidates for what the actual world might be like, if

seen from the viewpoint of ag at time t.

Our postulate 4. imposes also another constraint on A: While it is not generally the case

that the algebra A contains arbitrary conjunctions of members of A, 4. together with our

other postulates does imply that A is closed under taking arbitrary countable conjunctions

of believed propositions: for if all the members of any countable class of propositions

are believed by ag at t, then their conjunction is a member of A by A being a σ-algebra,

and the conjunction is a member of Bel by its being a superset of $B_W$ and by 2. above.

There is yet another independent reason for assuming 4.: In light of lottery paradox or

preface paradox situations, with which we will deal later, it is thought quite commonly

that if the set of beliefs simpliciter is presupposed to be closed under conjunction, then this

prohibits any probabilistic analysis of belief simpliciter from the start. We will show that

beliefs simpliciter can in fact be reduced to quantitative belief even though 4. expresses

the strongest form of closure under conjunction whatsoever that a set of beliefs can satisfy.

So we will not be accused of playing tricks by building up some kind of non-standard

model for qualitative belief in which certain types of conjunction rules are applicable to


certain sets of believed propositions but where other types of conjunction rules may not be

applied (as one can show would be the case if we dropped countable additivity as being

one of our assumptions). In a nutshell: 4. prohibits our agent from having anything like

an ω-inconsistent set of beliefs.

Finally, we add

5. (Consistency) ¬Bel(∅).

as our agent ag does not believe a contradiction. Once again, this will be granted in order to

mimic the same assumption that is sometimes made in epistemic logic: one justification

for it is the thought that if a rational agent is shown to believe a contradiction, then he

will aim to change his mind; if ag’s actual beliefs are considered to coincide with the (in

principle) outcome of such a rationalization process, then 5. should be ﬁne.

So much for belief if taken unconditionally. But we will require more than just qual-

itative belief in that sense—indeed, this will turn out to be the key move: Let us assume

that ag also holds conditional beliefs, that is, beliefs conditional on certain propositions in

A. We will interpret such conditional beliefs in suppositional terms: they are beliefs that

the agent has under the supposition of certain propositions, where the only type of sup-

position that we will be concerned with in the following will be supposition as a matter

of fact, that is, suppositions which are usually expressed in the indicative, rather than the

subjunctive, mood: Suppose that X is the case. Then I believe that Y is the case. If X is any

such “assumed” proposition, we take $\mathrm{Bel}_X$ to be the class of propositions that our ideally rational agent believes to be true at time t conditional on X; instead of writing ‘$Y \in \mathrm{Bel}_X$’, we will say somewhat more transparently: Bel(Y|X). Accordingly, we call $\mathrm{Bel}_X$ our agent

ag’s belief set conditional on X at t, and we call any such class of propositions for what-

ever X ∈ A a conditional belief set at t of our agent ag. In this extended context, Bel itself

should now be regarded as a class of ordered pairs of members of A, rather than as a set

of members of A as before; instead of ‘(Y, X) ∈ Bel’ we may simply say again: Bel(Y|X).

And we may identify ag’s belief set at t from before with one of ag’s conditional belief sets

at t: the class of propositions that ag believes to be true at t conditional on the tautological

proposition W, that is, with the class $\mathrm{Bel}_W$. Accordingly, we now say that all and only the members Y of $\mathrm{Bel}_W$ are believed absolutely or unconditionally, and we call $\mathrm{Bel}_W$ the absolute or

unconditional belief set.

In the present section we will be interested only in conditional beliefs in Y given X

where X is consistent with everything that the agent believes absolutely (or conditionally

on W) at that time; equivalently: where X is consistent with $B_W$. In particular, this will

yield an explication of absolute or unconditional belief in terms of subjective probabilities,

which is the main focus of this section. In the next section we will add some postulates

which will impose constraints even on beliefs conditional on propositions in A that contradict $B_W$, and ultimately we will be able to state a corresponding explication of conditional

belief in general. Even in the cases in which we will consider a belief suppositional on a

proposition that is inconsistent with the agent’s current absolute beliefs, as we will in the

section after this one, we will still regard the supposition in question to be a matter-of-fact

supposition in the sense that in natural language it would be expressed in the indicative

rather than the subjunctive mood. As in: I believe that John is not in the building. But

suppose that he is in the building: then I believe he is in his oﬃce.

For every X ∈ A that is consistent with what the agent believes, $\mathrm{Bel}_X$ is a set of the very same kind as the original unconditional or absolute belief set of propositions from above. And for every such X ∈ A, $\mathrm{Bel}_X$ will therefore be assumed to satisfy postulates of the very

same type as suggested before for absolute beliefs:

B1 (Reflexivity) If ¬Bel(¬X|W), then Bel(X|X).

B2 (One Premise Logical Closure)

If ¬Bel(¬X|W), then for all Y, Z ∈ A: if Bel(Y|X) and Y ⊆ Z, then Bel(Z|X).

B3 (Finite Conjunction)

If ¬Bel(¬X|W), then for all Y, Z ∈ A: if Bel(Y|X) and Bel(Z|X), then Bel(Y ∩ Z|X).

B4 (General Conjunction)

If ¬Bel(¬X|W), then for $\mathcal{Y} = \{Y \in A \mid \mathrm{Bel}(Y \mid X)\}$, $\bigcap \mathcal{Y}$ is a member of A, and $\mathrm{Bel}(\bigcap \mathcal{Y} \mid X)$.

On the other hand, we assume the Consistency postulate to hold only for beliefs condi-

tional on W at this point (in the next section this will be generalised). So just as in the case

of 5. above, we only demand:

B5 (Consistency) ¬Bel(∅|W).

By now the axioms should look quite uncontroversial, if given our logical approach to

belief. Assuming B1 is unproblematic at least under a suppositional reading of conditional

belief: under the (matter of fact) supposition of X, with X being consistent with what the

agent believes, the ideally rational agent ag holds X true at time t. Of course, B3 is

redundant really in light of B4, but we shall keep it as well for the sake of continuity with

the standard treatment of belief. As before, B4 now entails for every X ∈ A for which

¬Bel(¬X|W) that there is a least set (a strongest proposition) Y, such that Bel(Y|X), which by B1 must be a subset of X. For any such given X, we will denote this very proposition by: $B_X$. For X = W, this is consistent with the notation ‘$B_W$’ introduced before.

Clearly, we have then for all X with ¬Bel(¬X|W) and for Y ∈ A:

$\mathrm{Bel}(Y \mid X)$ if and only if $Y \supseteq B_X$,

from left to right by the definition of ‘$B_X$’, and from right to left by B2 and the definition of $B_X$ again. Furthermore, it also follows that

$Y \supseteq B_X$ if and only if $\mathrm{Bel}(Y \mid B_X)$,

since if the left-hand side holds, then the right-hand side follows from B1 and B2, and if the right-hand side is the case then the left-hand side must be true by the definition of ‘$B_X$’ and the previous equivalence. So we find that actually for all Y ∈ A,

$\mathrm{Bel}(Y \mid X)$ if and only if $\mathrm{Bel}(Y \mid B_X)$,

hence what is believed by ag conditional on X may always be determined just by means of considering all and only the members of A which ag believes conditional on the subset $B_X$ of X. We will use these equivalences at several points, and when we do so we will not

state this explicitly anymore.

By B5, W itself is such that ¬Bel(¬W|W) (since ¬W = ∅), hence all of B1–B4 apply to X = W unconditionally, and consequently $B_W$ must be non-empty. Using this and the first of the three equivalences above, one can thus derive

$\neg\mathrm{Bel}(\neg X \mid W)$ if and only if $X \cap B_W \neq \emptyset$.

For this reason, instead of qualifying the postulates in this section by means of ‘¬Bel(¬X|W)’, we see that we may just as well replace this qualification by ‘$X \cap B_W \neq \emptyset$’, and this is what

we are going to do in the following.

So far there are no postulates on how belief sets conditional on diﬀerent propositions

relate to each other logically. At this point we demand one such condition to be satisﬁed

which corresponds to the standard AGM (...) postulates K*3 and K*4 on belief revision if

$B_W$ takes over the role of AGM’s syntactic belief set K, and if the revised belief set in the

sense of AGM gets described in terms of conditional belief:

B6 (Expansion)

For all Y ∈ A such that $Y \cap B_W \neq \emptyset$:

For all Z ∈ A, Bel(Z|Y) if and only if $Z \supseteq Y \cap B_W$.

In words: if the proposition Y is consistent with $B_W$, then ag believes Z conditional on Y if and only if Z is entailed by the conjunction of Y with $B_W$. This is really just a postulate on “revision by expansion” in terms of propositional information that is consistent

with the sum of what the agent believes; nothing is said at all about revision in terms of

information that would contradict some of the agent’s beliefs, which will be the topic of

the next section. As mentioned before, a principle like B6 is entailed by the AGM postu-

lates on revision by propositions which are consistent with what the agent believes at the


time, and it can be justiﬁed in terms of plausibility rankings of possible worlds: say that

conditional beliefs express that the most plausible of their antecedent-worlds are among

their consequent-worlds; then if some of the most plausible worlds overall are Y-worlds,

these worlds must be precisely the most plausible Y-worlds, and therefore in that case the

most plausible Y-worlds are Z-worlds if and only if all the most plausible worlds overall

that are Y-worlds are Z-worlds.

Equivalently:

B6 (Expansion)

For all Y ∈ A, such that for all Z ∈ A, if Bel(Z|W) then Y ∩ Z ≠ ∅:

For all Z ∈ A, Bel(Z|Y) if and only if $Z \supseteq Y \cap B_W$.

Supplying conditional belief with our intended suppositional interpretation again: If Y

is consistent with everything ag believes absolutely, then supposing Y as a matter of fact

amounts to nothing else than adding Y to one’s stock of absolute beliefs, so that what the

agent believes conditional on Y is precisely what the agent would believe absolutely if the

strongest proposition that he believes were the intersection of Y and $B_W$. That is, we may

reformulate B6 one more time in the form:

B6 (Expansion) For all Y ∈ A such that $Y \cap B_W \neq \emptyset$: $B_Y = Y \cap B_W$.

The superset claim that is implicit in the equality statement follows from the postulates above because $\mathrm{Bel}(B_Y \mid Y)$ holds by the definition of ‘$B_Y$’ and then the original formulation of B6 above can be applied. The corresponding subset claim follows from the definition of $B_Y$ again since $\mathrm{Bel}(Y \cap B_W \mid Y)$ follows from the original version of B6. Similarly, the original version of B6 above can be derived from our last version of that principle and the other postulates that we assumed. It follows from our last formulation of B6 (trivially) that for all Y with $Y \cap B_W \neq \emptyset$, $B_Y$ is non-empty, simply because $B_Y = Y \cap B_W$ in that case.
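The last formulation of B6 lends itself to a small executable illustration. The sketch below assumes a finite W and a hypothetical choice of the strongest believed proposition; the names `strongest_belief_given` and `bel` are introduced here for illustration only.

```python
# Hypothetical four-world example; B_W is an illustrative assumption.
W = frozenset({"w1", "w2", "w3", "w4"})
B_W = frozenset({"w1", "w2"})


def strongest_belief_given(y, b_w=B_W):
    """B_Y = Y ∩ B_W, defined only when Y is consistent with B_W."""
    if not (y & b_w):
        raise ValueError("Y contradicts B_W; B6 is silent about this case")
    return y & b_w


def bel(z, y=W, b_w=B_W):
    """Bel(Z|Y) iff Z is a superset of Y ∩ B_W (original formulation of B6)."""
    return z >= strongest_belief_given(y, b_w)


# Supposing Y as a matter of fact just adds Y to the absolute beliefs.
Y = frozenset({"w2", "w3"})
believes_w2_or_w4_given_y = bel(frozenset({"w2", "w4"}), Y)
```

Suppositions that contradict the agent's absolute beliefs are deliberately rejected, since the postulates of this section say nothing about them.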

AGM’s K*3 and K*4 have not remained unchallenged, of course. One typical worry

is that revising by some new evidence or suppositional information Y may lead to more

beliefs than what one would get deductively by adding Y to one’s current beliefs, in view

of possible inductively strong inferences that the presence of Y might warrant. One line of

defence of AGM here is: if the agent’s current beliefs are themselves already the result of

the inductive expansion of what the agent is certain about, so that the agent’s beliefs are

really what he expects to be the case, then revising his beliefs by consistent information

might reduce to merely adding it to his beliefs and closing oﬀ deductively. Another line

of defence is: a postulate such as B6 might be true of belief simpliciter, and without it

qualitative belief would not have the simplifying power that is essential to it. But there

might be nothing like it that would hold of quantitative belief, and the mentioned criticism

of the conjunction of K*3 and K*4 might simply result from mixing up considerations on


qualitative and quantitative belief. We will return to this issue later where we will see in

what sense our theory allows us to reconcile B6 above with the worry about them that we

were addressing in this paragraph.

This ends our list of postulates on qualitative belief.

3.3 Mixed Postulates and the Explication of Absolute Belief

Finally, we turn to our promised necessary probabilistic condition for having a belief—the

left-to-right direction of the Lockean thesis—and indeed for having a belief conditional on

any proposition consistent with all the agent ag believes at t; this will make ag’s degrees

of beliefs at t and (some of) his conditional beliefs simpliciter at t compatible in a sense.

The resulting bridge principle between qualitative and quantitative belief will involve a

numerical constant ‘r’ which we will leave indeterminate at this point—just assume that

r is some real number in the half-open interval [0, 1). Note that the principle is not yet

meant to give us anything like a deﬁnition of ‘Bel’ (nor of any terms deﬁned by means

of ‘Bel’, such as ‘$B_W$’) on the basis of ‘P’. It only expresses a joint constraint on the references of ‘Bel’ and ‘P’, that is, on our agent ag’s actual conditional beliefs and his

actual subjective probabilities. The principle says:

$\mathrm{BP1}^r$ (Likeliness) For all Y ∈ A such that $Y \cap B_W \neq \emptyset$ and P(Y) > 0:

For all Z ∈ A, if Bel(Z|Y), then P(Z|Y) > r.

$\mathrm{BP1}^r$ is just the obvious generalisation of the left-to-right direction of the Lockean thesis to the case of beliefs conditional on propositions Y which are consistent with all absolute beliefs. The antecedent clause ‘P(Y) > 0’ in $\mathrm{BP1}^r$ is there to make sure that the conditional probability P(Z|Y) is well-defined. By using W as the value of ‘Y’ and $B_W$ as the value of ‘Z’ in $\mathrm{BP1}^r$, and then applying the definition of $B_W$ (which exists by B1–B4) and P1, it follows that $P(B_W \mid W) = P(B_W) > r$. Therefore, from the definition of $B_W$ and P1 again, having a subjective probability of more than r is a necessary condition for a proposition to be believed absolutely, although it will become clear below that this is far from being a sufficient condition.

r is a non-negative real number less than 1 which functions as a threshold value and which at this stage of our investigation can be chosen freely. $\mathrm{BP1}^r$ really says: conditional beliefs (with the relevant Ys) entail having corresponding conditional probabilities of more than r. One might wonder why there should be one such threshold r for all propositions Y and Z as stated in $\mathrm{BP1}^r$ at all, rather than having for all Y (or for all Y and Z) a threshold value that might depend on Y (or on Y and Z). But without any further qualification, a principle such as the latter would be almost empty, because as long as for Y and Z it is the case that P(Z|Y) > 0, there will always be an r such that P(Z|Y) > r. In contrast, $\mathrm{BP1}^r$ postulates a conditional probabilistic boundary from below that is uniform for all conditional beliefs; this r really derives from considerations on the concept of belief itself rather than from considerations on the contents of belief. (Remark: It would be possible to weaken ‘>’ to ‘≥’ in $\mathrm{BP1}^r$; not much will depend on it, except that whenever we are going to use $\mathrm{BP1}^r$ with $r \geq \frac{1}{2}$ below, one would rather have to choose some $r' > \frac{1}{2}$ instead and then demand that ‘. . . $P(Z \mid Y) \geq r'$’ is the case.)

For illustration, in $\mathrm{BP1}^r$, think of r as being equal to $\frac{1}{2}$: If degrees of belief and beliefs simpliciter ought to be compatible in some sense at all, then the resulting $\mathrm{BP1}^{1/2}$ is pretty much the weakest possible expression of any such compatibility that one could think of: if ag believes Z (conditional on one of the Ys referred to above), then ag assigns a subjective probability to Z (conditional on Y) that exceeds the subjective probability that he assigns to the negation of Z (conditional on Y). If $\mathrm{BP1}^{1/2}$ were invalidated, then there would be Z and Y, such that our agent ag believes Z conditional on Y, but where $P(Z \mid Y) \leq \frac{1}{2}$: if $P(Z \mid Y) < \frac{1}{2}$, then ag would be in a position in which he regarded ¬Z as more likely than Z, conditional on Y, even though he believes Z, but not ¬Z, conditional on Y. On the other hand, if $P(Z \mid Y) = \frac{1}{2}$, then ag would be in a position in which he regarded ¬Z as equally likely as Z, conditional on Y, even though he believes Z, but not ¬Z, conditional on Y. While the former is difficult to accept (and the more difficult the lower the value of P(Z|Y)), the latter might be acceptable if one presupposes a voluntaristic conception of belief such as van Fraassen’s (...). But it would still be questionable then why the agent would choose to believe Z, rather than ¬Z, but not choose to assign to Z a higher degree of belief than to ¬Z (assuming this voluntaristic conception of belief would apply to degrees of belief, too). Richard Foley (...) has argued that the Preface Paradox would show that a principle such as $\mathrm{BP1}^{1/2}$ would in fact be too strong: a probability of $\frac{1}{2}$ could not even amount to a necessary condition on belief. We will return to this when we discuss the Lottery Paradox and Preface Paradox in section ??. Instead of defending $\mathrm{BP1}^{1/2}$ or any other particular instance of $\mathrm{BP1}^r$ at this point, we will simply move on now, taking for granted that one such $\mathrm{BP1}^r$ has been chosen. We will argue later that choosing $r = \frac{1}{2}$ is in fact the right choice for the least possible threshold value that would give us an account of ‘believing that’, even though taking any greater threshold value less than 1 would still be acceptable. However, for weaker forms of subjective commitment, such as ‘suspecting that’ or ‘hypothesizing that’, r ought to be chosen to be less than $\frac{1}{2}$.

For the moment this exhausts our list of postulates (with two more to come later). Let

us pause for now and focus instead on jointly necessary and suﬃcient conditions for our

postulates up to this point to be satisﬁed, which will lead us to our ﬁrst representation

theorem by which pairs (P, Bel) that jointly satisfy our postulates get characterized trans-

parently. In order to do so, we will need the following additional probabilistic concept

which will turn out to be crucial for the whole theory:


Definition 2 (P-Stability$^r$) Let P be a probability measure on a set algebra A over W. For all X ∈ A:

X is P-stable$^r$ if and only if for all Y ∈ A with Y ∩ X ≠ ∅ and P(Y) > 0: P(X|Y) > r.

If we think of P(X|Y) as the degree of X under the supposition of Y, then a P-stable$^r$ proposition X has the property that whatever proposition Y one supposes, as long as Y is consistent with X and probabilities conditional on Y are well-defined, it will be the case that the degree of X under the supposition of Y exceeds r. So a P-stable$^r$ proposition has a special stability property: it is characterized by its stably high probabilities under all suppositions of a particularly salient type. Trivially, the empty set is P-stable$^r$. W is P-stable$^r$, too, and more generally all propositions X in A with probability P(X) = 1 are P-stable$^r$. More importantly, as we shall see later in section 3.4, there are in fact lots of probability measures for which there are lots of non-trivial P-stable$^r$ propositions which have a probability strictly between 0 and 1.
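For a finite W with the power set as algebra, the definition can be checked by brute force over all suppositions Y. The following is a minimal sketch under those assumptions; the function names and the three-world weights are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example measure on three worlds.
WEIGHTS = {"w1": Fraction(3, 5), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}


def powerset(worlds):
    """All subsets of a finite set of worlds, as frozensets."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]


def prob(weights, x):
    """P(X) as the sum of the weights of the worlds in X."""
    return sum((weights[w] for w in x), Fraction(0))


def is_p_stable(weights, x, r):
    """True iff P(X|Y) > r for every Y with Y ∩ X non-empty and P(Y) > 0."""
    for y in powerset(weights):
        p_y = prob(weights, y)
        if (y & x) and p_y > 0 and prob(weights, x & y) / p_y <= r:
            return False
    return True


HALF = Fraction(1, 2)
stable_sets = [x for x in powerset(WEIGHTS) if is_p_stable(WEIGHTS, x, HALF)]
```

With these illustrative weights the stable sets for r = 1/2 include propositions of probability strictly between 0 and 1, for instance the singleton of the most probable world. The enumeration is exponential in the number of worlds, so it is meant for small examples only.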

A different way of thinking of P-stability$^r$ is the following one. With X being P-stable$^r$, and Y being such that Y ∩ X ≠ ∅ and P(Y) > 0, it holds that $P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} > r$, which is equivalent to: P(X ∩ Y) > r P(Y). But by P1 this is again equivalent with P(X ∩ Y) > r [P(X ∩ Y) + P(¬X ∩ Y)], which yields $P(X \cap Y) > \frac{r}{1-r} P(\neg X \cap Y)$. X ∩ Y is some proposition in A that is a subset of X, and by assumption it needs to be non-empty. ¬X ∩ Y is just some proposition in A which is a subset of ¬X. If P(X ∩ Y) were 0, then the inequality above could not be satisfied irrespective of what ¬X ∩ Y would be like; and if P(X ∩ Y) is greater than 0, then a fortiori X ∩ Y ≠ ∅ and also P(Y) > 0 are the case. So really X is P-stable$^r$ if and only if for all Y, Z ∈ A, such that Y is a subset of X with P(Y) > 0 and where Z is a subset of ¬X, it holds that $P(Y) > \frac{r}{1-r} P(Z)$. In words: The probability of any subset of X that has positive probability at all is greater than the probability of any subset of ¬X if the latter is multiplied by $\frac{r}{1-r}$. In the special case in which $r = \frac{1}{2}$, this factor is just 1, and hence X is P-stable$^{1/2}$ if and only if the probability of any subset of X that has positive probability at all is greater than the probability of any subset of ¬X. So P-stability$^r$ is also a separation property, which divides the class of subpropositions of a proposition from the class of subpropositions of its negation in terms of probability.
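In a finite setting the separation property can be tested with a single comparison. The sketch below assumes, beyond what the text states, that W is finite, that A is its power set and that every world has positive probability; under these assumptions the least probable non-empty subset of X is one of its singletons and the most probable subset of ¬X is ¬X itself. The function name and weights are hypothetical.

```python
from fractions import Fraction

# Hypothetical example measure; every world has positive probability.
WEIGHTS = {"w1": Fraction(3, 5), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}


def prob(weights, x):
    """P(X) as the sum of the weights of the worlds in X."""
    return sum((weights[w] for w in x), Fraction(0))


def separates(weights, x, r):
    """Finite-case test of the separation property, for 0 <= r < 1.

    Checks that the least probable world in X outweighs r/(1-r) times P(not-X).
    Assumes all weights are positive.
    """
    if not x:
        return True  # the empty set is trivially P-stable^r
    not_x = frozenset(weights) - x
    least = min(weights[w] for w in x)
    return least > r / (1 - r) * prob(weights, not_x)


stable_at_half = separates(WEIGHTS, frozenset({"w1"}), Fraction(1, 2))
stable_at_three_quarters = separates(WEIGHTS, frozenset({"w1"}), Fraction(3, 4))
```

Raising r increases the factor r/(1-r), so a set that separates at r = 1/2 may fail to do so at a more demanding threshold, as the two example calls illustrate.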

Here is a property of P-stable$^r$ propositions X that we will need on various occasions: if P(X) < 1, then there is no non-empty Y ⊆ X with Y ∈ A and P(Y) = 0. For assume otherwise: then Y ∪ ¬X has non-empty intersection with X since Y has, and at the same time P(Y ∪ ¬X) > 0 because P(¬X) > 0. By X being P-stable$^r$, it would therefore have to hold that $P(X \mid Y \cup \neg X) = \frac{P(X \cap Y)}{P(Y \cup \neg X)} > r$, which contradicts P(X ∩ Y) ≤ P(Y) = 0. For the same reason, non-empty propositions of probability 0 cannot be P-stable$^r$, or in other words: non-empty P-stable$^r$ propositions X have positive probability.


Using this new concept, we can show the following ﬁrst and rather simple representa-

tion theorem on belief (there will be another more intricate one in the next section which

will extend the present one to conditional belief in general):

Theorem 3 Let Bel be a class of ordered pairs of members of a σ-algebra A as explained

above, let P : A → [0, 1], and let 0 ≤ r < 1. Then the following two statements are

equivalent:

I. P and Bel satisfy P1, B1–B6, and $\mathrm{BP1}^r$.

II. P satisfies P1, and there is a (uniquely determined) X ∈ A, such that X is a non-empty P-stable$^r$ proposition, and:

– For all Y ∈ A such that Y ∩ X ≠ ∅, for all Z ∈ A:

Bel(Z|Y) if and only if Z ⊇ Y ∩ X

(and hence, $B_W$ = X).

Proof. From left to right: P1 is satisfied by assumption. Now we let X = $B_W$, where $B_W$ exists and has the intended property of being the strongest believed proposition by B1–B4: First of all, as derived before by means of B5, $B_W$ is non-empty; and $B_W$ is P-stable$^r$: For let Y ∈ A with $Y \cap B_W \neq \emptyset$, P(Y) > 0: since $B_W \supseteq Y \cap B_W$, it thus follows from B6 that $\mathrm{Bel}(B_W \mid Y)$, which by $\mathrm{BP1}^r$ and P(Y) > 0 entails that $P(B_W \mid Y) > r$, which was to be shown. Secondly, let Y ∈ A be such that $Y \cap B_W \neq \emptyset$, let Z ∈ A: then it holds that Bel(Z|Y) if and only if $Z \supseteq Y \cap B_W$ by B6, as intended. Finally, uniqueness: Assume that there is an X′ ∈ A, such that X′ ≠ X, X′ is non-empty, P-stable$^r$, and for all Y ∈ A with Y ∩ X′ ≠ ∅, for all Z ∈ A, it holds that Bel(Z|Y) if and only if Z ⊇ Y ∩ X′. But from the latter it follows that X′ = $B_W$, and hence with X = $B_W$ from above that X′ = X, which is a contradiction.

From right to left: Suppose P satisfies P1, and there is an X, such that X and Bel have the required properties. Then, first of all, all the instances of B1–B5 for beliefs conditional on W are satisfied: for it holds that W ∩ X = X ≠ ∅ because X is non-empty by assumption, so Bel(Z|W) if and only if Z ⊇ W ∩ X = X, by assumption, therefore B5 is the case, and the instances of B1–B4 for beliefs conditional on W follow from the characterisation of beliefs conditional on W in terms of supersets of X. Indeed, it follows: $B_W$ = X. So, for arbitrary Y ∈ A, ¬Bel(¬Y|W) is really equivalent to Y ∩ X ≠ ∅, as we did already show after our introduction of B1–B5, and hence B1–B4 are satisfied by the assumed characterisation of beliefs conditional on any Y with Y ∩ X ≠ ∅ in terms of supersets of Y ∩ X. B6 holds trivially, by assumption and because of $B_W$ = X. About $\mathrm{BP1}^r$: Let Y ∩ X ≠ ∅ and P(Y) > 0. If Bel(Z|Y), then by assumption Z ⊇ Y ∩ X, hence Z ∩ Y ⊇ Y ∩ X, and by P1 it follows that P(Z ∩ Y) ≥ P(Y ∩ X). From X being P-stable$^r$ and P(Y) > 0 we have P(X|Y) > r. Taking this together, and by the definition of conditional probability in P1, this implies P(Z|Y) > r, which we needed to show.

Note that P2 (Countable Additivity) did not play any role in this; but of course P2

may be added to both sides of the proven equivalence with the resulting equivalence being

satisﬁed.

This simple theorem will prove to be fundamental for all subsequent arguments in this

paper. We start by exploiting it ﬁrst in a rather trivial fashion: Let us concentrate on its

right-hand side, that is, condition II. of Theorem 3. Disregarding for the moment any con-

siderations on qualitative belief, let us just assume that we are given a probability P over a

set algebra A on W. We know already that one can in fact always find a non-empty set X, such that X is a P-stable$^r$ proposition: just take any proposition with probability 1. In the simplest case: take X to be W itself. P(W) > 0 and P-stability$^r$ follow then immediately. Now consider the very last equivalence clause of II. and turn it into a (conditional) definition of Bel(.|Y) for all the cases in which Y ∩ W = Y ≠ ∅: that is, for all Z ∈ A, define Bel(Z|Y) to hold if and only if Z ⊇ Y ∩ W = Y. In particular, Bel(Z|W) holds then if and only if Z ⊇ W which obviously is the case if and only if Z = W. $B_W$ = W follows, all the conditions in II. of Theorem 3 are satisfied, and thus by Theorem 3 all of our postulates

from above must be true as well. What this shows is that given a probability measure, it

is always possible to deﬁne belief simpliciter in a way such that all of our postulates turn

out to be the case. What would be believed absolutely thereby by our agent is maximally

cautious: having such beliefs, ag would believe absolutely just W, and therefore trivially

every absolute belief would have probability 1. Accordingly, he would believe condition-

ally on the respective Ys from above just what is logically entailed by them, that is, all

supersets of Y.

As we pointed out in the introduction, this is not in general a satisfying explication

of belief. But what is more important, we actually find that a much more general pattern is emerging: Let P be given again as before. Now choose any non-empty P-stable$^r$ proposition X, and define conditional belief in all cases in which Y ∩ X ≠ ∅ by: Bel(Z|Y) if and only if Z ⊇ Y ∩ X. Then $B_W$ = X follows again, and all of our postulates hold

by Theorem 3—including B3 (Finite Conjunction) and B4 (General Conjunction)—even

though it might well be that P(X) < 1 and hence even though there might be beliefs whose

propositional contents have a subjective probability of less than 1 as being given by P.

Such beliefs are not maximally cautious anymore—exactly as it is the case for most of the

beliefs of any real-world human agent ag. Of course this does not mean that according to

the current construction all believed propositions would have to be assigned probability

of less than 1: Even if P(X) < 1, there will always be believed propositions that have

a probability of precisely 1—for instance, W—it only follows that there exist believed

propositions that have a probability of less than 1—X itself is an example. And every be-

22

lieved proposition must then have a probability that lies somewhere in the closed interval

[P(X), 1], so that P(X) becomes a lower threshold value; furthermore, since X is P-stable

r

,

P(X) itself is strictly bounded from below by r. It does not follow that if a proposition has

a probability in the interval [P(X), 1], then this just by itself implies that the proposition is

also believed absolutely, since it is not entailed that the proposition is then also a superset

of the P-stable

r

proposition X that had been chosen initially.

Since $P$-stable$^r$ propositions play such a distinguished role in this, the questions arise: Do $P$-stable$^r$ sets other than $W$ exist at all for many $P$? More generally: Do non-trivial ones exist for many $P$, that is, ones with a probability strictly between 0 and 1? Subsection 3.4 below will show that the answers are affirmative. And how difficult is it to determine whether a proposition is a non-empty $P$-stable$^r$ set?

About the last question: at least in the case where $W$ is finite, it turns out not to be difficult at all. Let $A$ be the power set algebra on $W$, and let $P$ be defined on $A$. By definition, $X$ is $P$-stable$^r$ if and only if for all $Y \in A$ with $Y \cap X \neq \emptyset$ and $P(Y) > 0$, $P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} > r$. We have seen already that all sets with probability 1 are $P$-stable$^r$. So let us focus just on how to generate all non-empty $P$-stable$^r$ sets $X$ that have a probability of less than 1. As we observed before, such sets do not contain any non-empty subsets of probability 0, which in the present context means that if $w \in X$, then $P(\{w\}) > 0$.

For any given such non-empty $X$ with $P(X) < 1$, as we have shown before, it follows that $X$ is $P$-stable$^r$ if and only if for all $Y, Z \in A$, such that $Y$ is a non-empty subset of $X$ (and hence, in the present case, $P(Y) > 0$) and where $Z$ is a subset of $\neg X$, it holds that $P(Y) > \frac{r}{1-r} P(Z)$. Therefore, in order to check for $P$-stability$^r$ in the current context, it suffices to consider just sets $Y$ and $Z$ which have the required properties and for which $P(Y)$ is minimal and $P(Z)$ is maximal. In other words, we have for all non-empty $X$ with $P(X) < 1$:

$$X \text{ is } P\text{-stable}^r \text{ if and only if for all } w \in X \text{ it holds that } P(\{w\}) > \frac{r}{1-r}\, P(W \setminus X).$$

In particular, for $r = \frac{1}{2}$, this is:

$$X \text{ is } P\text{-stable}^{1/2} \text{ if and only if for all } w \in X \text{ it holds that } P(\{w\}) > P(W \setminus X).$$

Thus it turns out to be very simple to decide whether a set $X$ is $P$-stable$^r$, and even more so whether it is $P$-stable$^{1/2}$.
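As an illustration only, the following Python sketch compares the definition of $P$-stability$^r$ with the singleton criterion just stated, on a small finite space. The three-world measure `P`, the function names, and the use of exact fractions are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations

def subsets(worlds):
    """Yield every subset of a finite collection of worlds as a frozenset."""
    ws = list(worlds)
    for k in range(len(ws) + 1):
        for combo in combinations(ws, k):
            yield frozenset(combo)

def prob(P, S):
    """Probability of a set of worlds, given point masses P[w]."""
    return sum((P[w] for w in S), Fraction(0))

def stable_by_definition(P, X, r):
    """P(X | Y) > r for every Y with Y & X non-empty and P(Y) > 0."""
    X = frozenset(X)
    for Y in subsets(P):
        pY = prob(P, Y)
        if Y & X and pY > 0 and prob(P, X & Y) / pY <= r:
            return False
    return True

def stable_by_shortcut(P, X, r):
    """Singleton criterion; meant for non-empty X with P(X) < 1."""
    X = frozenset(X)
    outside = prob(P, frozenset(P) - X)
    return all(P[w] > r / (1 - r) * outside for w in X)

# Hypothetical measure on W = {a, b, c}; not taken from the paper.
P = {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}
r = Fraction(1, 2)

candidates = [X for X in subsets(P) if X and prob(P, X) < 1]
agree = all(stable_by_definition(P, X, r) == stable_by_shortcut(P, X, r)
            for X in candidates)
stable = [set(X) for X in candidates if stable_by_shortcut(P, X, r)]
```

On this toy measure the two tests agree on every candidate, and the sets that pass are `{a}` and `{a, b}`. The brute-force check is exponential in the number of worlds, whereas the shortcut needs only one pass over the members of $X$.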

From this it is easy to see that in the present finite context there is also an efficient procedure that computes all non-empty $P$-stable$^r$ subsets of $W$. We only give a sketch for the case $r = \frac{1}{2}$: All sets of probability 1 are $P$-stable$^r$, so we disregard them. All other non-empty $P$-stable$^r$ sets do not have singleton subsets of probability 0, so let us also disregard all worlds whose singletons are zero sets. Assume that after dropping all worlds with zero probabilistic mass, there are exactly $n$ members of $W$ left, and that $P(\{w_1\}), P(\{w_2\}), \ldots, P(\{w_n\})$ is already in (not necessarily strictly) decreasing order. If $P(\{w_1\}) > P(\{w_2\}) + \ldots + P(\{w_n\})$, then $\{w_1\}$ is $P$-stable$^{1/2}$, and one moves on to the list $P(\{w_2\}), \ldots, P(\{w_n\})$. If $P(\{w_1\}) \leq P(\{w_2\}) + \ldots + P(\{w_n\})$, then consider $P(\{w_1\}), P(\{w_2\})$: if both of them are greater than $P(\{w_3\}) + \ldots + P(\{w_n\})$, then $\{w_1, w_2\}$ is $P$-stable$^{1/2}$, and one moves on to the list $P(\{w_3\}), \ldots, P(\{w_n\})$. If either of them is less than or equal to $P(\{w_3\}) + \ldots + P(\{w_n\})$, then consider $P(\{w_1\}), P(\{w_2\}), P(\{w_3\})$; and so forth, until the final $P$-stable$^{1/2}$ set $W$ has been generated. This recursive procedure yields precisely all non-empty $P$-stable$^{1/2}$ sets of probability less than 1 in polynomial time. (The same procedure can be applied in cases in which $W$ is countably infinite and $A$ is the full power set algebra on $W$. But then of course the procedure will not terminate in finite time.)
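The procedure sketched above can be rendered, under some simplifying assumptions, as a single pass over the worlds ranked by decreasing mass. The function name `stable_prefixes`, the example measure, and the generalisation from $r = \frac{1}{2}$ to any $r \geq \frac{1}{2}$ via the factor $\frac{r}{1-r}$ are our own choices for illustration. The sketch relies on the reading that, for $r \geq \frac{1}{2}$, every $P$-stable$^r$ set of probability less than 1 is an initial segment of this ranking.

```python
from fractions import Fraction

def stable_prefixes(P, r=Fraction(1, 2)):
    """Return the non-empty P-stable^r sets made of positive-mass worlds,
    smallest first. Assumes 1/2 <= r < 1 and point masses summing to 1."""
    # Drop zero-mass worlds and rank the rest by decreasing mass.
    ranked = sorted((w for w in P if P[w] > 0), key=lambda w: P[w], reverse=True)
    rest = sum((P[w] for w in ranked), Fraction(0))  # mass outside the prefix
    result = []
    for k, w in enumerate(ranked, start=1):
        rest -= P[w]
        # P[w] is the smallest mass inside the prefix ranked[:k].
        # rest == 0 means the prefix has probability 1, hence is stable.
        if rest == 0 or P[w] > r / (1 - r) * rest:
            result.append(frozenset(ranked[:k]))
    return result

# Hypothetical measure on W = {a, b, c}; not taken from the paper.
P = {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}
spheres_half = stable_prefixes(P)                       # r = 1/2
spheres_three_quarters = stable_prefixes(P, Fraction(3, 4))
```

For this measure the call with $r = \frac{1}{2}$ returns the nested sets `{a}`, `{a, b}`, `{a, b, c}`, while $r = \frac{3}{4}$ leaves only `{a, b, c}`. Sorting dominates the cost, so the sketch runs in time $O(n \log n)$ for $n$ worlds.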

What Theorem 3 gives us therefore is not just a construction procedure but even, in the finite case, an efficient construction procedure for a class $Bel$ from any given probability measure $P$, so that the two together satisfy all of our postulates. P2 still has not played a role so far. But Theorem 3 does more: it also shows that whatever our agent ag's actual probability measure $P$ and his actual class $Bel$ of conditionally believed pairs of propositions are like, as long as they satisfy our postulates from above, it must be possible to partially reconstruct $Bel$ by means of some $P$-stable$^r$ proposition $X$ as explained before, where $X$ is then simply identical to $B_W$; and by ‘partially’ we mean that it would only be possible to reconstruct beliefs that are conditional on propositions $Y$ which are consistent with $X = B_W$. For this is just the left-to-right direction of the theorem. Hence, if we had any additional means of identifying the very $P$-stable$^r$ proposition $X$ that would give us the agent's actual belief class $Bel$, we could define explicitly the set of all pairs $(Z, Y)$ in that class $Bel$ for which $Y \cap X \neq \emptyset$ holds by means of that proposition $X$ and thus, ultimately, by the given measure $P$. Amongst those conditional beliefs, in particular, we would find all of ag's absolute beliefs, and therefore the set of absolutely believed propositions could be defined explicitly in terms of $P$.

So are we in the position to identify the $P$-stable$^r$ proposition $X$ that gives us ag's actual beliefs, simply by being handed only ag's subjective probability measure? That is the first open question that we will deal with in the remainder of this section. The other open question is: What should $r$ be like in our postulate BP1$^r$ above?

In order to address these two questions, we need the following additional theorem ﬁrst:

Theorem 4 Let $P : A \to [0, 1]$ be such that P1 is satisfied. Let $r \geq \frac{1}{2}$. Then the following is the case:

III. For all $X, X' \in A$: If $X$ and $X'$ are $P$-stable$^r$ and at least one of $P(X)$ and $P(X')$ is less than 1, then either $X \subseteq X'$ or $X' \subseteq X$ (or both).

IV. If $P$ also satisfies P2, then there is no infinitely descending chain of sets in $A$ that are all subsets of some $P$-stable$^r$ set $X_0$ in $A$ with probability less than 1, that is, there is no countably infinite sequence

$$X_0 \supsetneq X_1 \supsetneq X_2 \supsetneq \ldots$$

of sets in $A$ (and hence no infinite sequence of such sets in general), such that $X_0$ is $P$-stable$^r$, each $X_n$ is a proper superset of $X_{n+1}$, and $P(X_n) < 1$ for all $n \geq 0$. A fortiori, given P2, there is no infinitely descending chain of $P$-stable$^r$ sets in $A$ with probability less than 1.

Proof.

• Ad III: First of all, let $X$ and $X'$ be $P$-stable$^r$, with $P(X) = 1$ and $P(X') < 1$: as observed before, there is then no non-empty subset $Y$ of $X'$ such that $P(Y) = 0$. But if $X' \cap \neg X$ were non-empty, then it would be just such a subset of $X'$, since $P(\neg X) = 0$. Therefore, $X' \cap \neg X$ is empty, and thus $X' \subseteq X$. The case with $X$ and $X'$ taken the other way round is analogous.

So we can concentrate on the remaining logically possible case. Assume for contradiction that there are $P$-stable$^r$ members $X, X'$ of $A$, such that $P(X), P(X') < 1$, and neither $X \subseteq X'$ nor $X' \subseteq X$. Therefore, both $X \cap \neg X'$ and $X' \cap \neg X$ are non-empty, and they must have positive probability since, as we showed before, $P$-stable$^r$ propositions with probability less than 1 do not have non-empty subsets with probability 0. We observe that $P(X \mid (X \cap \neg X') \cup \neg X)$ is greater than $r$, by $X$ being $P$-stable$^r$, $(X \cap \neg X') \cup \neg X \supseteq (X \cap \neg X')$ having non-empty intersection with $X$, and the probability of $(X \cap \neg X') \cup \neg X$ being positive. The same must hold, mutatis mutandis, for $P(X' \mid (X' \cap \neg X) \cup \neg X')$. So we have

$$P(X \mid (X \cap \neg X') \cup \neg X) > r \geq \tfrac{1}{2}$$

and

$$P(X' \mid (X' \cap \neg X) \cup \neg X') > r \geq \tfrac{1}{2},$$

where $r \geq \frac{1}{2}$ by assumption.

Next we show that

$$P(X \cap \neg X') > P(\neg X).$$

For suppose otherwise, that is, $P(X \cap \neg X') \leq P(\neg X)$: By P1 and $P((X \cap \neg X') \cup \neg X) > 0$, it must be the case that $P(X \cap \neg X' \mid (X \cap \neg X') \cup \neg X) + P(\neg X \mid (X \cap \neg X') \cup \neg X) = 1$, and since we know from before that the second summand must be strictly less than $\frac{1}{2}$, the first summand has to strictly exceed $\frac{1}{2}$. On the other hand, it also follows that

$$\tfrac{1}{2} > P(\neg X \mid (X \cap \neg X') \cup \neg X) = \frac{P(\neg X)}{P((X \cap \neg X') \cup \neg X)} \geq \frac{P(X \cap \neg X')}{P((X \cap \neg X') \cup \neg X)} = P(X \cap \neg X' \mid (X \cap \neg X') \cup \neg X),$$

by our initial supposition; but this contradicts our conclusion from before that $P(X \cap \neg X' \mid (X \cap \neg X') \cup \neg X)$ exceeds $\frac{1}{2}$.

Analogously, it follows also that

$$P(X' \cap \neg X) > P(\neg X').$$

Finally, from this (and P1) we can derive: $P(X \cap \neg X') > P(\neg X) \geq P(X' \cap \neg X) > P(\neg X') \geq P(X \cap \neg X')$, which is a contradiction.

• Ad IV: Assume for contradiction that there is a sequence $X_0 \supsetneq X_1 \supsetneq X_2 \supsetneq \ldots$ of sets in $A$ with probability less than 1, with $X_0$ being $P$-stable$^r$ as described. None of these sets can be empty, for otherwise the subset relationships holding between them could not be proper. Now let $A_i = X_i \setminus X_{i+1}$ for all $i \geq 0$, and let $B = \bigcup_{i=0}^{\infty} A_i$. Note that every $A_i$ is non-empty and indeed has positive probability, since, as observed before, $P$-stable$^r$ sets with probability less than 1 do not contain non-empty subsets with probability 0. Furthermore, for $i \neq j$, $A_i \cap A_j = \emptyset$. Since $A$ is a σ-algebra, $B$ is in fact a member of $A$. By P2, the sequence $(P(A_i))$ must converge to 0 for $i \to \infty$, for otherwise $P(B) = P(\bigcup_{i=0}^{\infty} A_i) = \sum_{i=0}^{\infty} P(A_i)$ would not be a real number. Because by assumption $X_0$ has a probability of less than 1, $P(\neg X_0)$ is a real number greater than 0. It follows that the sequence of real numbers

$$\frac{P(A_i)}{P(A_i) + P(\neg X_0)} = \frac{P(X_0 \cap (A_i \cup \neg X_0))}{P(A_i \cup \neg X_0)} = P(X_0 \mid A_i \cup \neg X_0)$$

also converges to 0 for $i \to \infty$, where for every $i$, $(A_i \cup \neg X_0) \cap X_0 \neq \emptyset$ and $P(A_i \cup \neg X_0) > 0$. But this contradicts $X_0$ being $P$-stable$^r$.

We may draw two conclusions from this. First of all, in view of IV, $P$-stable$^r$ sets of probability less than 1 have a certain kind of groundedness property: they do not allow for infinitely descending sequences of subsets. Secondly, in light of III and IV taken together, the whole class of $P$-stable$^r$ propositions $X$ in $A$ with $P(X) < 1$ is well-ordered with respect to the subset relation. In particular, if there is a non-empty $P$-stable$^r$ proposition with probability less than 1 at all, there must also be a least non-empty $P$-stable$^r$ proposition with probability less than 1. Furthermore, all $P$-stable$^r$ propositions $X$ in $A$ with $P(X) < 1$ are subsets of all propositions in $A$ of probability 1. And the latter are all $P$-stable$^r$. If we only look at non-empty $P$-stable$^r$ propositions with a probability of less than 1, we find therefore that they constitute a sphere system that satisfies the Limit Assumption (by well-orderedness) for every proposition in $A$, in the sense of Lewis (...). Note that P2 (Countable Additivity) was needed in IV. in order to derive the well-foundedness of the chain of $P$-stable$^r$ propositions of probability less than 1.


For given $P$ (and given $A$ and $W$), such that $P$ satisfies P1–2, and for given $r \in [0, 1)$, let us denote the class of all non-empty $P$-stable$^r$ propositions $X$ with $P(X) < 1$ by $\mathcal{X}^r_P$. We know from Theorem 4 that $(\mathcal{X}^r_P, \subseteq)$ is then a well-order. So by standard set-theoretic arguments, there is a bijective and order-preserving mapping from $\mathcal{X}^r_P$ into a uniquely determined ordinal $\beta^r_P$, where $\beta^r_P$ is a well-order of ordinals with respect to the subset relation, which is also the order relation for ordinals; $\beta^r_P$ measures the length of the well-ordering $(\mathcal{X}^r_P, \subseteq)$. Hence, $\mathcal{X}^r_P$ is identical to a strictly increasing sequence of the form $(X^r_\alpha)_{\alpha < \beta^r_P}$. $X^r_0$ is then the least non-empty $P$-stable$^r$ proposition in $A$ with probability less than 1, if there is one at all. If there are none, then $\beta^r_P$ is simply equal to 0 (that is, the ordinal $\emptyset$). In case the union of all $X^r_\alpha$ is $W$, each world $w \in W$ can be assigned a uniquely determined ordinal rank: the least ordinal $\alpha$ such that $w \in X^r_\alpha$. So we find that the non-empty $P$-stable$^r$ propositions $X$ with probability less than 1, if they exist, determine ordinal rankings of those possible worlds that are members of at least one of them.

Furthermore, by P1–2, Theorem 4, and the fact that no non-empty $P$-stable$^r$ proposition of probability less than 1 has a non-empty subset of probability zero, each such $X$ in $\mathcal{X}^r_P$ determines a number $P(X) \in (r, 1]$, and no non-empty $P$-stable$^r$ proposition of probability less than 1 other than $X$ could determine the same number $P(X)$; by P1, the greater the set $X$ with respect to the subset relation, the greater its probability $P(X)$, that is: for $\alpha < \alpha' < \beta^r_P$ it holds that $r < P(X^r_\alpha) < P(X^r_{\alpha'})$. It follows that there is also a bijective and order-preserving mapping from the set of probabilities of the members of $\mathcal{X}^r_P$ to the set of ordinals below $\beta^r_P$ (that is, to the set $\beta^r_P$). Accordingly, since every ordinal number has a unique successor, there is a bijective mapping between the set of intervals of the form $(P(X^r_\alpha), P(X^r_{\alpha+1}))$ for $\alpha < \beta^r_P$ and the set $\beta^r_P$. See Figure 1.

From this we can determine a boundary for the ordinal type of $\beta^r_P$:

Observation 5 Let $P$ be a countably additive probability measure on a σ-algebra $A$ over $W$. Let $\frac{1}{2} \leq r < 1$. The ordinal $\beta^r_P$ (see above) is either finite or equal to $\omega$. (Hence, the class $\mathcal{X}^r_P$ of all non-empty $P$-stable$^r$ propositions $X$ with probability less than 1 is countable.)

Proof. Assume for contradiction that $\beta^r_P \geq \omega + 1$: then there certainly exist non-empty $P$-stable$^r$ propositions $X$ with probability less than 1. Now, for $X^r_\alpha$ as defined above, and for all $0 \leq n < \omega$, let $Y_n = X^r_{n+1} \setminus X^r_n$, and let $Z_n = \bigcup_{m \geq n} Y_m$. We know that for all $n$ it holds that $Z_n \in A$; by Theorem 4 and the definition of ‘$X^r_\alpha$’ it is the case that $Z_n \subseteq X^r_\omega$; by assumption we have $P(X^r_\omega) < 1$; and furthermore $P(Z_n) < 1$ and the sequence $(Z_n)$ is strictly monotonically decreasing. So there is a sequence $X^r_\omega \supsetneq Z_0 \supsetneq Z_1 \supsetneq \ldots$ of sets in $A$ with probability less than 1, with $X^r_\omega$ being $P$-stable$^r$, in contradiction with IV of Theorem 4.

Figure 1: $P$-stable sets for $r \geq \frac{1}{2}$

We also find that, given $P$ is countably additive, if there are countably infinitely many non-empty $P$-stable$^r$ propositions $X$ with probability less than 1, then the union of all non-empty $P$-stable$^r$ propositions $X$ with probability less than 1 is itself $P$-stable$^r$, non-empty, and it must have probability 1. For: The countable union $\bigcup_{\alpha<\omega} X^r_\alpha$ is a member of our σ-algebra $A$. If $Y \cap \bigcup_{\alpha<\omega} X^r_\alpha \neq \emptyset$ for $Y \in A$ with $P(Y) > 0$, then there must be an $X^r_\alpha$ with $\alpha < \omega$, such that $Y \cap X^r_\alpha \neq \emptyset$. Because $X^r_\alpha$ is $P$-stable$^r$, it follows that $P(X^r_\alpha \mid Y) > r$. But by P1, $P(\bigcup_{\alpha<\omega} X^r_\alpha \mid Y) \geq P(X^r_\alpha \mid Y)$, hence $P(\bigcup_{\alpha<\omega} X^r_\alpha \mid Y) > r$. So $\bigcup_{\alpha<\omega} X^r_\alpha$ is $P$-stable$^r$ (and non-empty, of course). If $P(\bigcup_{\alpha<\omega} X^r_\alpha)$ were less than 1, then $\beta^r_P$ would have to be at least of the order type $\omega + 1$, which was ruled out by Observation 5. So $P(\bigcup_{\alpha<\omega} X^r_\alpha) = 1$. Since, as we saw before, no non-empty $P$-stable$^r$ proposition $X$ with probability less than 1 contains a non-empty zero set as a subset, that union could not do so either. So in the case in which $\beta^r_P$ is infinite, that union of all non-empty $P$-stable$^r$ propositions with probability less than 1 would then have to be the least $P$-stable$^r$ proposition with probability 1.

Now back to our remaining open questions. Let us start with: what should we choose

as r?

For the proof of III. in Theorem 4 it was crucial that $r \geq \frac{1}{2}$. Indeed, one can show by means of examples that if $r < \frac{1}{2}$ then III. can be invalidated: it is possible then that there are $P$-stable$^r$ members $X, X'$ of $A$, such that neither $X \subseteq X'$ nor $X' \subseteq X$. In fact, it is even possible that there are non-empty $P$-stable$^r$ members $X, X'$ of $A$, such that $X \cap X' = \emptyset$. This means: if our agent ag's probability measure $P$ is held fixed for the moment, and if $r < \frac{1}{2}$, then depending on what $P$ is like, our postulates P1–P2, B1–B6, and BP1$^r$ might allow for two classes $Bel$ such that all of these postulates are satisfied for each of them (by Theorem 3) and yet some absolute beliefs according to the one class $Bel$ contradict some absolute beliefs according to the other class $Bel$, although both are based on one and the same subjective probability measure $P$. It seems advisable then, for the sake of a better theory, to demand that $r \geq \frac{1}{2}$, for this will allow us to derive as a law that such a situation cannot occur. Of course, this is far from being a knock-down argument against $r < \frac{1}{2}$, but it certainly puts a bit of methodological pressure on it. For if $P$ is fixed, then one might think that our postulates should suffice to rule out systems of qualitative belief that contradict each other. As van Fraassen (..., p. 350) puts it, the assumed role of full belief is “to form a single, unequivocally endorsed picture of what things are like”: If $r \geq \frac{1}{2}$, then while Theorem 4 does not yet pin down such a “single, unequivocally endorsed picture of what things are like”, at least the linearity condition III. guarantees the following: given $P$, if $X$ and $X'$ are possible choices of strongest possible believed propositions $B_W$ such that P1–P2, B1–B6, and BP1$^r$ are satisfied, that is, by Theorem 3, if $X$ and $X'$ are both non-empty $P$-stable$^r$ members of $A$, then either everything that ag believes absolutely according to $B_W = X$ would also be believed if it were the case that $B_W = X'$, or vice versa. Combining this with what we said about $r < \frac{1}{2}$ initially when we introduced BP1$^r$ above (that is, that if an agent believes a proposition it is quite reasonable for him to have assigned to that proposition a probability that is greater than the probability of its negation), we do have a plausible case against choosing $r$ in that way. (But we will see later that $r < \frac{1}{2}$ is an attractive choice if ‘Bel’ is taken to express not belief but some weaker epistemic attitude.)

Figure 2: $P$-stable sets for $r < \frac{1}{2}$
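One way to see how III. can fail for $r < \frac{1}{2}$ is a two-world toy case, checked by brute force in the Python sketch below. The uniform measure on $\{a, b\}$, the threshold $r = \frac{1}{3}$, and all function names are our own illustrative assumptions; the text does not say which examples the author has in mind.

```python
from fractions import Fraction
from itertools import combinations

def nonempty_subsets(worlds):
    """All non-empty subsets of a finite set of worlds, smallest first."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(1, len(ws) + 1)
            for c in combinations(ws, k)]

def is_stable(P, X, r):
    """Definition of P-stability^r: P(X | Y) > r for every Y in the
    power set with Y & X non-empty and P(Y) > 0."""
    for Y in nonempty_subsets(P):
        pY = sum(P[w] for w in Y)
        if Y & X and pY > 0:
            if sum(P[w] for w in X & Y) / pY <= r:
                return False
    return True

# Hypothetical uniform measure on two worlds; not taken from the paper.
P = {"a": Fraction(1, 2), "b": Fraction(1, 2)}

low = [X for X in nonempty_subsets(P) if is_stable(P, X, Fraction(1, 3))]
high = [X for X in nonempty_subsets(P) if is_stable(P, X, Fraction(1, 2))]
```

With $r = \frac{1}{3}$ both `{a}` and `{b}` come out as $P$-stable$^r$ although they are disjoint, so two candidate belief sets would contradict each other. With $r = \frac{1}{2}$ only `{a, b}` survives, in line with the linearity condition III.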


Apart from presupposing $r \geq \frac{1}{2}$, is it possible to exclude other possible values of ‘r’? Before we answer this question, the following elementary observation informs us about some of the consequences that the answer will have:

Observation 6 Let $P$ be a probability measure on an algebra $A$ over $W$. Let $X \in A$, and assume that $\frac{1}{2} \leq r < s < 1$. Then it holds:

• If $X$ is $P$-stable$^s$, then $X$ is $P$-stable$^r$.

Proof. If $X$ is $P$-stable$^s$, then for all $Y \in A$ with $Y \cap X \neq \emptyset$ and $P(Y) > 0$, $P(X \mid Y) > s$. But then it also holds for all such $Y$ that $P(X \mid Y) > r$, since $r < s$ by assumption, so $X$ is $P$-stable$^r$ as well.

Hence, the smaller the threshold value $r$, the more inclusive is the class of $P$-stable$^r$ sets that it determines. What this tells us, in conjunction with our previous results, is that if we choose $r$ minimally such that $\frac{1}{2} \leq r < 1$, that is, if we choose $r = \frac{1}{2}$, then we do not exclude any of the logically possible options for $B_W$.
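The monotonicity stated in Observation 6 can be watched at work on a small finite example. The sketch below is only illustrative: the measure `P`, the three thresholds, and the helper `stable_sets` (which uses the finite singleton criterion and considers only worlds of positive mass) are assumptions of ours rather than anything given in the text.

```python
from fractions import Fraction

def stable_sets(P, r):
    """Non-empty P-stable^r sets of positive-mass worlds in a finite space.
    Assumes 1/2 <= r < 1 and point masses that sum to exactly 1."""
    ranked = sorted((w for w in P if P[w] > 0), key=lambda w: P[w], reverse=True)
    rest, found = Fraction(1), set()
    for k, w in enumerate(ranked, start=1):
        rest -= P[w]  # probability mass outside the current prefix
        if rest == 0 or P[w] > r / (1 - r) * rest:
            found.add(frozenset(ranked[:k]))
    return found

# Hypothetical measure and thresholds; not taken from the paper.
P = {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}
thresholds = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)]
families = [stable_sets(P, r) for r in thresholds]
```

Here the three families contain 3, 2 and 1 sets respectively, and each is included in the one before it: raising the threshold can only remove candidates for $B_W$, never add any.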

Should our agent ag exclude some of them? By determining the value of ‘r’, one lays down how brave a belief can be maximally, or how cautious a belief needs to be minimally, in order not to cease to count as a belief. Choosing $r = \frac{1}{2}$ is the bravest possible option. At the same time, beliefs in this sense would not necessarily seem too brave: after all, with $P$ being given, $Bel$ would still be constrained by BP1$^{1/2}$. In particular, if $Y$ is believed in this sense, then the subjective probability of $Y$ would have to be greater than $\frac{1}{2}$. And of course $Bel$ would have to satisfy all of the standard logical properties of belief simpliciter, as expressed by B1–B6. Indeed, for many purposes this might well be the right choice. But then again, maybe, for other purposes a more cautious notion of belief is asked for, which would correspond to choosing a value for ‘r’ that is greater than $\frac{1}{2}$. In many cases, the value of ‘r’ might be determined by the epistemic and pragmatic context in which our agent ag is about to reason and act, and different contexts might ask for different values of ‘r’. In yet other cases, the value of ‘r’ might only be determined vaguely; and so on. And all of these options would still be covered by what we call pre-theoretically ‘belief’. We suggest therefore to explicate belief conditional on any given threshold value $r \geq \frac{1}{2}$, without making any particular choice of the value of ‘r’ mandatory.

With that one of our two open questions settled (or rather dismissed), we are in the position to address the other one: Can we always identify the $P$-stable$^r$ proposition $X$ that yields our agent ag's actual beliefs, if we are given only ag's subjective probability measure $P$ (and a threshold value $r$)? We need one more postulate before we answer this. Degrees of belief conditional on a proposition of probability 0 are brought in line with beliefs conditional on a contradiction in the following manner:

BP2 (Zero Supposition) For all $Y \in A$: If $Y \cap B_W \neq \emptyset$ and $P(Y) = 0$, then $B_Y = \emptyset$.


Since $P$ is an absolute probability measure that does not allow for conditionalization on a proposition of probability 0 at all, it makes sense to restrict belief simpliciter accordingly, in such a way that supposing any such proposition of probability 0 amounts to believing a contradiction. For intuitively there is no reason to think that supposing a proposition qualitatively ought to be less zero-intolerant than the quantitative supposition of a proposition (using Jonathan Bennett's corresponding term (...), which he applies to indicative conditionals whose antecedent has subjective probability 0). This said, rather than restricting qualitative belief in such a way, it would actually be more attractive to liberate quantitative probability such that the (non-trivial) conditionalization on zero sets becomes possible: that is, as mentioned before, one might want to use Popper functions $P$ from the start. But then again the current theory has the advantage of relying just on the much more common absolute probability measures, and since the theory is not particularly affected by using BP2 as an additional assumption, we shall stick to conditional belief being constrained as expressed by BP2. So BP2 is acceptable really just for the sake of simplicity. At least, if $P$ is regular, that is, every non-empty proposition in $A$ has positive probability, then BP2 is of course superfluous, and for many practically relevant scenarios, Regularity is indeed usually taken for granted, or otherwise $W$ would be redefined by dropping all worlds whose singleton sets have zero probability.

Here is an important consequence of BP2: Let $Y \in A$ be such that $P(Y) = 1$. $Y$ must then have non-empty intersection with $B_W$, in light of P1 and $P(B_W) > 0$. Therefore, by B6, $B_Y = Y \cap B_W \subseteq B_W$. Assume that $B_Y$ is a proper subset of $B_W$: then both $Y \cap B_W$ and $\neg Y \cap B_W$ are non-empty. Since $P(Y) = 1$, it follows that $P(\neg Y) = 0$ and hence with BP2: $B_{\neg Y} = \emptyset$. But since $\neg Y$ has non-empty intersection with $B_W$, B6 entails that $B_{\neg Y} = \neg Y \cap B_W$. Therefore, $\neg Y \cap B_W = \emptyset$, which contradicts $\neg Y \cap B_W$ being non-empty. So we find that by BP2 (and the rest of our postulates), every $Y \in A$ for which $P(Y) = 1$ holds is such that $B_Y = B_W$. This also entails that, since $B_Y \subseteq Y$ for all such $Y$ by the definition of ‘$B_Y$’, if $B_W$ has probability 1 itself, then $B_W$ must be the least proposition in $A$ with probability 1.

Now we are in the position to answer our remaining question from above affirmatively, by identifying the $P$-stable$^r$ proposition $X$ that yields ag's actual beliefs if we are given just ag's subjective probability measure $P$ (and a threshold value $r$). As explained already in section 3, apart from satisfying our postulates, the class $Bel$ ought to be such that the resulting class of absolute beliefs is maximised, as this approximates prima facie belief, and hence the right-to-left direction of the original Lockean thesis, to the greatest possible extent. This corresponds to the following postulate:

BP3 (Maximality) Among all classes $Bel'$ of ordered pairs of members of $A$, such that $P$ and $Bel'$ jointly satisfy P1–P2, B1–B6, BP1$^r$, BP2, the class $Bel$ is the largest with respect to the class of absolute beliefs, that is, pairs of the form $(Z, W)$, that it determines. In other words, for all such $Bel'$: $Bel \cap \{(Z, W) \mid Z \in A\} \supseteq Bel' \cap \{(Z, W) \mid Z \in A\}$.

The logical character of BP3 is obviously diﬀerent from the one of our previous postulates,

but then again adding postulates that maximize or minimize classes subject to axiomatic

constraints is of course not unheard of; for example, famously, Hilbert (...) uses this

strategy in his axiomatization of geometry.

The term ‘the largest’ in BP3 is well-defined given the postulates P1–P2, B1–B6, BP1$^r$, BP2, in view of Theorem 3 and Theorem 4, and by what we pointed out before: Because of Theorem 3, $B_W$ must be a non-empty $P$-stable$^r$ proposition in $A$ in order to satisfy P1, B1–B6, and BP1$^r$. If there is at least one non-empty $P$-stable$^r$ proposition with probability less than 1, then we know that amongst all the non-empty $P$-stable$^r$ propositions that are candidates for the maximally strong believed proposition $B_W$ according to Theorem 3 (which relied on P2), there must be a least one by Theorem 4: this least $P$-stable$^r$ proposition $X_{least}$, which then has a probability of less than 1, and which does not have any non-empty zero sets as subsets and hence satisfies BP2, must therefore determine the largest class of absolute beliefs once II. in Theorem 3 is turned into a (partial) definition of conditional belief again, since its class of supersets is the largest one possible. On the other hand, if there are no non-empty $P$-stable$^r$ propositions with probability less than 1, then by P1, B1–B6, and BP1$^r$ again, $B_W$ must be a non-empty $P$-stable$^r$ proposition with probability 1, and from our considerations on BP2 above we know that $B_W$ must really be the least set of probability 1 in $A$.

Since we did not just deal with absolute belief in this section but also with belief conditional on any proposition that is consistent with everything the agent believes absolutely, one might wonder why we did not demand $Bel$ in BP3 to be largest even with respect to the class of pairs $(Z, Y)$ for which $Y \cap B_W \neq \emptyset$. However, let $B'_W \neq B''_W$ derive from two distinct candidates $Bel'$, $Bel''$, such that both satisfy all of our postulates apart from BP3: by Theorem 3, without restriction of generality, $B'_W \subsetneq B''_W$. But then, first of all, the class of all pairs $(Z, Y)$ for which $Y \cap B'_W \neq \emptyset$ is distinct from the class of all pairs $(Z, Y)$ for which $Y \cap B''_W \neq \emptyset$, so it would not be clear with respect to which of the two classes our intended belief class $Bel$ ought to be the largest. Furthermore, there are propositions $Z \in A$ (as, e.g., $B''_W \setminus B'_W$), such that $Z$ has non-empty intersection with $B''_W$ but not with $B'_W$; while B6 would determine $Bel''(\cdot \mid Z)$, it would not give us any information whatsoever on $Bel'(\cdot \mid Z)$. For these reasons, it will only be in the next section, when we deal with conditional beliefs in general, that we will be in the position to strengthen Maximality so that it extends to all pairs $(Z, Y)$ for $Z, Y \in A$ whatsoever. The resulting class $Bel$ will again be defined uniquely, and the set of absolute beliefs that it determines will correspond to what is required by BP3 and the rest of the postulates of the present section.


With BP3 on board, and in light of our previous results, we may conclude from our postulates that in each and every case our agent's set $B_W$ is nothing but the least non-empty $P$-stable$^r$ set in $A$. In other words, our postulates (including BP3) entail the explicit definability of ag's absolute beliefs, and indeed the definability of all of his beliefs conditional on any $Y$ that is consistent with $B_W$, by means of the following corollary to our results mentioned before:

Corollary 7 Let $Bel$ be a class of ordered pairs of members of a σ-algebra $A$, and let $P : A \to [0, 1]$. Then the following two statements are equivalent:

V. $P$ and $Bel$ satisfy P1–P2, B1–B6, BP1$^r$, BP2, BP3.

VI. $P$ satisfies P1–P2, there exists a (uniquely determined) least non-empty $P$-stable$^r$ proposition $X_{least}$ in $A$, and:

– For all $Y \in A$ such that $Y \cap X_{least} \neq \emptyset$, for all $Z \in A$: $Bel(Z \mid Y)$ if and only if $Z \supseteq Y \cap X_{least}$.

– In particular: $B_W = X_{least}$, and for all $Z \in A$: $Bel(Z \mid W)$ if and only if $Z \supseteq X_{least}$.
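For a finite space with the power set algebra, clause VI. can be turned into a small executable sketch. Everything named below (`least_stable_set`, `believes`, the measure `P`) is a hypothetical illustration under the assumptions that $\frac{1}{2} \leq r < 1$ and that the point masses sum to exactly 1; it is not the definition that the next section goes on to state.

```python
from fractions import Fraction

def least_stable_set(P, r):
    """Least non-empty P-stable^r set of a finite space (power set algebra).
    Assumes 1/2 <= r < 1 and point masses that sum to exactly 1."""
    ranked = sorted((w for w in P if P[w] > 0), key=lambda w: P[w], reverse=True)
    rest = Fraction(1)
    for k, w in enumerate(ranked, start=1):
        rest -= P[w]  # probability mass outside the current prefix
        if rest == 0 or P[w] > r / (1 - r) * rest:
            return frozenset(ranked[:k])
    raise ValueError("point masses must sum to 1")

def believes(P, r, Z, Y=None):
    """Bel(Z | Y) as in clause VI.; Y defaults to W (absolute belief).
    Returns None when Y is inconsistent with X_least, a case on which
    the corollary is silent."""
    X_least = least_stable_set(P, r)
    Y = frozenset(P) if Y is None else frozenset(Y)
    core = Y & X_least
    if not core:
        return None
    return core <= frozenset(Z)

# Hypothetical measure on W = {a, b, c}; not taken from the paper.
P = {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}
half = Fraction(1, 2)
```

With this measure and $r = \frac{1}{2}$, `least_stable_set` returns `{a}`, so `believes(P, half, {"a", "b"})` is `True` while `believes(P, half, {"b"})` is `False`; conditional on `{b, c}`, which is disjoint from `{a}`, the function returns `None`. At the more cautious threshold $r = \frac{3}{4}$ the least stable set is all of $W$, and `{a, b}` is no longer believed.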

Where the previous postulate was reminiscent of Hilbert's axiomatisation of geometry, with respect to its open parameter $r$ the last corollary is closer in spirit to something like Zermelo's (...) quasi-categoricity result for second-order set theory: according to Zermelo's theorem, the cumulative hierarchy of sets is pinned down uniquely conditional on the specification of an ordinal number of a certain kind. The real number $r$ in BP1$^r$ above takes over the function of such an ordinal number in Zermelo's theorem, for only conditional on it is the class $Bel$ specified uniquely.

VI. of Corollary 7 can now be turned into an explicit definition of all relevant conditional beliefs just on the basis of $P$ (and logical and set-theoretic notions). Since in the next section we will extend this result to arbitrary conditional beliefs, whether or not they are beliefs conditional on propositions that are consistent with what the agent believes, we refrain from stating the resulting definition here. However, we do exploit Corollary 7 by deriving from it a particularly important special case: the concept of absolute belief can be defined explicitly in terms of $P$ alone.

In order to do so, we will take one final step. We restrict the probability measures $P$ that we are interested in such that the existence claim in VI. is always satisfied. While our explicit definition of belief will then just hold conditional on that additional restriction, since the restriction is not overly demanding in our belief context (though it would be in other contexts, say, in measure theory, where one needs measures for integration), we will still end up with a definition that assigns the right reference to ‘Bel’ for a wide range of subjective probability measures.

This is thus the restriction on $P$ that we use. Call it the ‘Least Certain Set Restriction’: There is a member $X \in A$, such that $P(X) = 1$, and for every $Y \in A$ with $P(Y) = 1$: $X \subseteq Y$. That is: There is a least set of probability 1 in $A$. Equivalently, by P1, there is a member $X \in A$, such that $P(X) = 0$, and for every $Y \in A$ with $P(Y) = 0$: $Y \subseteq X$. Or in other words: there is a greatest set of probability 0 in $A$ (which is just the complement of the least set of probability 1). It is easy to see that the least proposition $X$ of probability 1 cannot have a non-empty subset $Y \in A$ such that $P(Y) = 0$: for otherwise, $X \cap \neg Y$, which is a member of $A$ again, would be a set of probability 1 which is a proper subset of $X$.

Given this Least Certain Set Restriction, there is always a least non-empty P-stable^r proposition in A: Either there is a non-empty P-stable^r proposition of probability less than 1, and then there is a least non-empty P-stable^r proposition anyway by Theorem 4. Or all and only non-empty P-stable^r propositions have probability 1: but then by the Least Certain Set Restriction there is a least set with probability 1, and that set is thus the least non-empty P-stable^r proposition in A.

Standard examples of countably additive probability measures for which there are least

sets of probability 1 are:

• All probability measures on ﬁnite algebras A, and hence also all probability mea-

sures on algebras A that are based on a ﬁnite set W of worlds.

• All countably additive probability measures on the power set algebra of a set W that

is countably inﬁnite: In that case the conjunction of all sets of probability 1 is a

member of the algebra of propositions again, and of course it is then the least set of

probability 1.

• All countably additive probability measures (on a σ-algebra) that are regular: for all

X ∈ A, P(X) = 0 if and only if X = ∅. Here W itself happens to be the least set

of probability 1. Regularity (Strict Coherence) does not enjoy general support, even

though Carnap, Shimony, Stalnaker and others argued for it as a plausible constraint

on subjective probability measures, some of them in view of a special variant of

the Dutch book argument that favours Regularity. (But see Levi... for contrary

arguments.)

• All countably additive probability measures on a countably inﬁnite σ-algebra: The

conjunction of all sets of probability 1 is then a countably inﬁnite conjunction, so it

is a member of the given σ-algebra, and it is again the least set of probability 1.

These examples demonstrate that a great variety of probability measures satisfy P1, P2,

and our additional constraint, and many—if not most—of the typical philosophical toy

examples of subjective probability measures are covered by these examples.

We end up with the following materially adequate explicit deﬁnition of absolute belief

for countably additive probability measures that satisfy this additional constraint of the

Least Certain Set Restriction:

Definition 8 Let P : A → [0, 1] be a countably additive probability measure on a σ-algebra A, such that there exists a least set of probability 1 in A. Let X_least be the least non-empty P-stable^r proposition in A (which exists).

Then we say for all Y ∈ A and 1/2 ≤ r < 1:

Y is believed (to a cautiousness degree of r) as being given by P if and only if Y is a superset of X_least.

By ‘materially adequate’ we mean here: By Corollary 7, since all of P1, B1–B6, BP3 are plausibly true, BP1^r is true conditional on the choice of r as a cautiousness threshold, and P2, BP2 are acceptable for the sake of simplicity, our definition of belief, taken as a descriptive sentence, is true of every probability measure that satisfies the Least Certain Set Restriction. What is more, since all of P1, B1–B6, BP1^r, BP3 are not just true but even conceptually necessary or analytic of belief, the definition is so as well (conditional on the presupposition of P2 and BP2).

Note that from the theory above we know that the definiens could actually be replaced by ‘Y is a superset of some non-empty P-stable^r proposition in A’ without thereby changing the extension of the belief predicate in any way.

If we ﬁnally deﬁne for any given P : A → [0, 1], Y ∈ A is believed a priori as

being given by P if and only if P(Y) = 1, then we end up with three notions of belief

of increasing strength for all P that satisfy P1, P2, and the Least Certain Set Restriction:

prima facie belief, belief (to a cautiousness degree of r), and belief a priori. For any two

of these concepts, under very special circumstances, that is, for very special P, they can

in fact determine precisely the same beliefs (later we will deal with an example). But

under “normal” circumstances, for realistic P, they will diﬀer extensionally, and belief in

the sense of Deﬁnition 8 is the concept that we oﬀer as an explication of our pre-theoretic

notion of qualitative belief.

3.4 Examples

Finally, here are some examples. In all of them, A is simply the full power set algebra of W.

If W contains exactly two worlds, then the situation is trivial insofar as for given 1/2 ≤ r < 1, the singleton {w} ⊆ W is the least non-empty P-stable^r proposition if P({w}) > r, and W itself is such otherwise.

Figure 3: Rankings determined by P

So let us turn to the first non-trivial case, that is, where W is a set {w_1, w_2, w_3} of three elements. For simplicity, let r = 1/2. Let us view all probability measures on that set W as being represented by points in a triangle, such that P({w_1}), P({w_2}), P({w_3}) become the scalar factors of a convex combination of three given vectors that we associate with the worlds w_1, w_2, w_3. Then depending on where P is represented in that triangle, P determines different classes of P-stable^r sets. See Figure 3.

The diagram should be read as follows: The vertices of the outer equilateral triangle

represent the probability measures that assign 1 to the singleton set of the respective world

and 0 to all other singleton sets. Each non-vertex on any of the edges of the outer equi-

lateral triangle represents a probability measure that assigns 0 to exactly one of the three

worlds. Each edge of the inner equilateral triangle separates the representatives of prob-

ability measures of the following kinds: probability measures that assign to the singleton

set of some world a probability that is greater than the sum of probabilities that it assigns

to the singleton sets of the two other worlds; and probability measures that assign to the

singleton set of some world a probability that is less than the sum of probabilities that it

assigns to the singleton sets of the two other worlds. For instance, to the left-below of the left edge of the inner equilateral triangle we find represented those probability measures which assign to {w_1} a greater probability than the sum of what they assign to {w_2} and {w_3}. Each straight line segment that connects a vertex with the mid-point of the opposite edge of the outer equilateral triangle separates the representatives of probability measures of the following kinds: probability measures that assign to the singleton set of one world a greater probability than to the singleton set of another world; and the probability measures that do so the other way round. Accordingly, the straight line segment that connects w_3 and the mid-point of the edge from w_1 to w_2 separates the probability measures that assign more probability to {w_1} than to {w_2} from those which assign more probability to {w_2} than to {w_1}. The center point of both equilateral triangles represents the probability measure that is uniform over W = {w_1, w_2, w_3}.

Given all of that, and using the construction procedure for P-stable^{1/2} sets that we have sketched before, it is easy to read off for each point, and hence for the probability measure that this point represents, all the non-empty P-stable^{1/2} sets that are determined by it. The points on the outer equilateral triangle are special: The probability measure represented by the vertex for w_i has {w_i} as its least non-empty P-stable^{1/2} set, all supersets of that set are non-empty and P-stable^{1/2}, too, and all of them have probability 1. The probability measures represented by the inner part of the edge between the vertices that belong to two worlds w_i and w_j have either {w_i}, or {w_j}, or {w_i, w_j} as their least non-empty P-stable^{1/2} set, depending on whether the representing point is closer to the vertex of w_i than to the vertex of w_j, or vice versa, or equidistant from both of them; all supersets of each of them, respectively, are non-empty and P-stable^{1/2} again, and all of them have probability 1. But the really interesting part of the diagram concerns the interior of the outer equilateral triangle: Since relative to the probability measures that are represented there only W has probability 1 (and hence is P-stable^{1/2}), we can concentrate solely on non-empty P-stable^{1/2} sets with probability less than 1. As we have seen, these form a sphere system of sets. In the diagram, we denote these sphere systems by enumerating in different lines the numerical indices of worlds of equal rank in the sphere system, starting with the worlds of rank 0 which we take to correspond to the entries in the bottom line of each numerical inscription. For example: Consider the interior of the two smallest right-angled triangles that are adjacent to w_1. Probability measures which are represented by points in the upper one yield a sphere system of three non-empty P-stable^{1/2} sets: {w_1}, {w_1, w_3}, {w_1, w_2, w_3}. So w_1 has rank 0, w_3 has rank 1, and w_2 has rank 2. Accordingly, probability measures represented by points in the lower one of the two triangles determine a sphere system of the three non-empty P-stable^{1/2} sets {w_1}, {w_1, w_2}, {w_1, w_2, w_3}. In either of these two cases, the probability measures in question would yield an absolute belief in every proposition that includes w_1 as a member, by Definition 8. The further one moves geometrically towards

the center point of the two equilateral triangles, the more coarse-grained the orderings

become that are given by the sphere systems of the probability measures thus represented,

and the smaller the class of absolutely believed propositions gets. Probability measures

which are presented by points on any of the designated straight line segments within the

interior of the outer equilateral triangle require special attention: Probability measures

whose points lie on the boldface part in the diagram are treated separately in the little graphics to the left of the triangle; they all lead to the three worlds being ranked equally. For three of the straight line segments we have denoted the sphere systems that they determine explicitly. The points on the three edges of the inner equilateral triangle (or rather the six halves of those, without their midpoints, which fall into the boldfaced lines) yield sphere systems which coincide with those of the areas to which they are adjacent on the inside, which is why we did not say anything about them explicitly in Figure 3. Finally, for the three straight line segments in the interior of the inner equilateral triangle we did not say anything about “their” sphere systems either because they simply inherit them from the right-angled triangle areas that they separate.
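The reading of the diagram can be reproduced numerically for individual points in the interior of the triangle. The following Python sketch is only an illustration and not part of the theory: it tests P-stability^r directly from its definition (P(X|Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0) by brute force over the power set. The helper names `subsets`, `prob`, `is_stable`, `stable_sets`, `ranks` and `point` are hypothetical and introduced here, and the sketch assumes that every world has positive probability.

```python
from fractions import Fraction
from itertools import combinations

def subsets(worlds):
    """All subsets of `worlds` as frozensets (the full power set algebra)."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]

def prob(P, X):
    """P(X), where P maps each world to the probability of its singleton."""
    return sum((P[w] for w in X), Fraction(0))

def is_stable(P, X, r):
    """P(X|Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0."""
    return all(prob(P, X & Y) > r * prob(P, Y)
               for Y in subsets(P) if X & Y and prob(P, Y) > 0)

def stable_sets(P, r):
    """Non-empty P-stable^r sets, smallest first (assumes P > 0 on every world)."""
    return sorted((X for X in subsets(P) if X and is_stable(P, X, r)), key=len)

def ranks(P, r=Fraction(1, 2)):
    """Rank of a world: index of the first sphere that contains it."""
    spheres = stable_sets(P, r)
    return {w: next(i for i, S in enumerate(spheres) if w in S) for w in sorted(P)}

def point(a, b, c):
    """Measure on worlds 1, 2, 3 with singleton weights a : b : c."""
    total = a + b + c
    return {1: Fraction(a, total), 2: Fraction(b, total), 3: Fraction(c, total)}

print(ranks(point(12, 2, 6)))  # near the vertex of w_1, with P({w_3}) > P({w_2})
print(ranks(point(12, 6, 2)))  # near the vertex of w_1, with P({w_2}) > P({w_3})
print(ranks(point(9, 7, 4)))   # further towards the centre
print(ranks(point(1, 1, 1)))   # the uniform measure at the centre
```

With r = 1/2, the first two points yield the two fine-grained rankings described for the small triangles adjacent to w_1, the third point places w_1 and w_2 together at rank 0, and the uniform measure leaves W as the only sphere.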

If r > 1/2, then a diagram similar to Figure 3 can be drawn, with all of the interior straight line segments being pushed towards the three vertices to an extent that is proportional to the magnitude of r.

One might wonder about Figure 3 why sphere systems with one world of rank 0

and two worlds of rank 1 are determined only by points or probability measures in one-

dimensional line segments rather than in two-dimensional areas. In one sense, this is really

just a consequence of dealing with precisely three worlds. If W had four members, then

sphere systems with one world of rank 0, two worlds of rank 1, and hence one world of

rank 2 would be represented in terms of proper areas again. However, what is true in

general: sphere systems with precisely two worlds of maximal rank can only be repre-

sented by points or probability measures of areas of dimension n −1, if W has n members.

For then the probabilities of these two worlds of maximal rank must be the same, which

means the points of the represented probability measures must lie on one of the distin-

guished hyperplanes that generalise the distinguished line segments in our diagram to the

higher-dimensional case.

For analogous reasons, the following is true: The set of points in the diagram which represent probability measures for which a set of probability 1 is the least P-stable^{1/2} set has Lebesgue measure, that is, geometrical measure, 0. This is because, for any such P: If there were a unique world whose singleton had least probability, then W without that world would be P-stable^{1/2}; so there must be at least two worlds whose singleton sets have the same probability, and the rest follows in the same way as before. We conclude: Almost all probability measures over a finite algebra have a least P-stable^{1/2} set with a probability less than 1.
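The argument can be checked exhaustively on a finite grid of points in the interior of the triangle. The sketch below is an illustration under stated assumptions: it uses three worlds, r = 1/2 by default, and only measures with positive probability on every world; the names `least_stable` and `exceptional_points` are hypothetical helpers. It collects the grid points whose least P-stable^r set has probability 1 and confirms that each of them has a tie for the lowest singleton probability.

```python
from fractions import Fraction
from itertools import combinations

def subsets(worlds):
    """All subsets of `worlds` as frozensets."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]

def prob(P, X):
    """P(X), where P maps each world to the probability of its singleton."""
    return sum((P[w] for w in X), Fraction(0))

def is_stable(P, X, r):
    """P(X|Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0."""
    return all(prob(P, X & Y) > r * prob(P, Y)
               for Y in subsets(P) if X & Y and prob(P, Y) > 0)

def least_stable(P, r):
    """Least non-empty P-stable^r set (W itself is always a candidate)."""
    return min((X for X in subsets(P) if X and is_stable(P, X, r)), key=len)

def exceptional_points(n, r=Fraction(1, 2)):
    """Grid points (a, b, c)/n, all positive, whose least stable set has probability 1."""
    found = []
    for a in range(1, n - 1):
        for b in range(1, n - a):
            c = n - a - b
            P = {1: Fraction(a, n), 2: Fraction(b, n), 3: Fraction(c, n)}
            if prob(P, least_stable(P, r)) == 1:
                found.append((a, b, c))
    return found

points = exceptional_points(12)
print(points)
# Each exceptional point has at least two worlds sharing the lowest probability.
print(all(sorted(p)[0] == sorted(p)[1] for p in points))
```

The grid is only a finite sample, so this does not replace the measure-theoretic claim; it merely shows the exceptional points lying on the distinguished line segments.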

Here is another example with 7 worlds and concrete numbers: Let W = {w_1, ..., w_7} and P({w_1}) = 0.54, P({w_2}) = 0.342, P({w_3}) = 0.058, P({w_4}) = 0.03994, P({w_5}) = 0.018, P({w_6}) = 0.002, P({w_7}) = 0.00006. Then the resulting sphere system of non-empty P-stable^{1/2} sets is: {w_1}, {w_1, w_2}, {w_1, ..., w_4}, {w_1, ..., w_5}, {w_1, ..., w_6}, {w_1, ..., w_7}. However, if we switch e.g. to r = 3/4, then the corresponding sphere system of non-empty P-stable^{3/4} sets is: {w_1, ..., w_5}, {w_1, ..., w_6}, {w_1, ..., w_7}. (The sets {w_1}, {w_1, w_2} and {w_1, ..., w_4} drop out: for instance, P({w_1, w_2} | {w_2, ..., w_7}) = 0.342/0.46 < 3/4, and P({w_1, ..., w_4} | {w_4, ..., w_7}) = 0.03994/0.06 < 3/4.) In line with Observation 6, the latter sphere system is a subclass of the former one. With a cautiousness degree of r = 1/2, the proposition {w_1} is the strongest one that is believed as being given by P, while relative to a cautiousness degree of r = 3/4, the proposition {w_1, ..., w_5} is the strongest one that is believed as being given by the same probability measure, as entailed by Definition 8.
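The seven-world example lends itself to a mechanical check. The following Python sketch is an illustration, not the construction procedure referred to in the text: it enumerates all non-empty P-stable^r sets by brute force from the definition of P-stability^r and then applies Definition 8. The names `subsets`, `prob`, `is_stable`, `stable_sets`, `believed` and `P7` are hypothetical and introduced here; worlds are represented by their indices 1 to 7.

```python
from fractions import Fraction
from itertools import combinations

def subsets(worlds):
    """All subsets of `worlds` as frozensets (the full power set algebra)."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]

def prob(P, X):
    """P(X), where P maps each world to the probability of its singleton."""
    return sum((P[w] for w in X), Fraction(0))

def is_stable(P, X, r):
    """P(X|Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0."""
    return all(prob(P, X & Y) > r * prob(P, Y)
               for Y in subsets(P) if X & Y and prob(P, Y) > 0)

def stable_sets(P, r):
    """Non-empty P-stable^r sets, smallest first."""
    return sorted((X for X in subsets(P) if X and is_stable(P, X, r)), key=len)

def believed(P, Y, r):
    """Definition 8: Y is believed iff Y is a superset of the least stable set."""
    return stable_sets(P, r)[0] <= frozenset(Y)

# The seven-world measure from the text, as exact fractions.
P7 = {i: Fraction(s) for i, s in enumerate(
    ["0.54", "0.342", "0.058", "0.03994", "0.018", "0.002", "0.00006"], start=1)}

for X in stable_sets(P7, Fraction(1, 2)):
    print(sorted(X))
print(believed(P7, {1, 3}, Fraction(1, 2)))  # a proposition that contains w_1
print(believed(P7, {2, 3}, Fraction(1, 2)))  # a proposition that omits w_1
```

For r = 1/2 the loop prints the six spheres listed above, and a proposition is believed exactly if it contains w_1. Calling `stable_sets` with another threshold, such as `Fraction(3, 4)`, gives the sphere system for that cautiousness degree.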

Finally, a simple infinite example: Let W = {w_1, w_2, w_3, ...} be countably infinite, let A be the power set algebra on W, and let P be the unique regular countably additive probability measure that is given by: P({w_1}) = 1/2 + 1/4, P({w_2}) = 1/8 + 1/16, P({w_3}) = 1/32 + 1/64, .... Then the resulting non-empty P-stable^{1/2} sets are:

{w_1}, {w_1, w_2}, {w_1, w_2, w_3}, ..., {w_1, w_2, ..., w_n}, ... and W.
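That the listed finite sets are indeed P-stable^{1/2} can be verified by a short calculation. The derivation below is a sketch added for illustration; the abbreviation S_n = {w_1, ..., w_n} is introduced here and is not notation used elsewhere in the text.

```latex
\begin{align*}
P(\{w_n\}) &= \frac{1}{2^{2n-1}} + \frac{1}{2^{2n}} = \frac{3}{4^{n}},
  \qquad \sum_{n \ge 1} \frac{3}{4^{n}} = 1, \\
P(W \setminus S_n) &= \sum_{k > n} \frac{3}{4^{k}} = \frac{1}{4^{n}}
  \qquad \text{for } S_n = \{w_1, \dots, w_n\}. \\
\intertext{For every $Y$ with $Y \cap S_n \neq \emptyset$, the set $S_n \cap Y$
contains some $w_k$ with $k \le n$, so}
P(S_n \cap Y) &\ge \frac{3}{4^{n}}, \qquad P(Y \setminus S_n) \le \frac{1}{4^{n}}, \\
P(S_n \mid Y) &= \frac{P(S_n \cap Y)}{P(S_n \cap Y) + P(Y \setminus S_n)}
  \ge \frac{3/4^{n}}{3/4^{n} + 1/4^{n}} = \frac{3}{4} > \frac{1}{2}.
\end{align*}
```

The last inequality uses that a/(a + b) increases in a and decreases in b for positive a and non-negative b.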

It is also easy to see that every finite sphere system can be realized in this way in terms of P-stable^{1/2} propositions of probability less than 1, and hence every AGM-style belief revision operator on a logically finite language. So there are really lots of different types of sphere systems of P-stable^{1/2} propositions.

Once we have covered conditional belief in full in the next section, we will return

to some of these examples and analyse them in terms of conditional belief accordingly.

Moreover, eventually, we will give some of these examples an intended interpretation by

assuming that the possible worlds in question satisfy particular statements.

4 The Reduction of Belief II: Conditional Beliefs

Now we ﬁnally generalise the postulates of the previous section to the case of beliefs that

are conditional on propositions which may even be inconsistent with what our agent ag

believes absolutely.

P1–P2 remain unchanged, of course. Our generalisations of B1–B5 simply result from

dropping the antecedent ‘¬Bel(¬X|W)’ condition that they contained:

B1* (Reflexivity) Bel(X|X).

B2* (One Premise Logical Closure)
For all Y, Z ∈ A: if Bel(Y|X) and Y ⊆ Z, then Bel(Z|X).

B3* (Finite Conjunction)
For all Y, Z ∈ A: if Bel(Y|X) and Bel(Z|X), then Bel(Y ∩ Z|X).

B4* (General Conjunction)
For 𝒴 = {Y ∈ A | Bel(Y|X)}, ⋂𝒴 is a member of A, and Bel(⋂𝒴|X).

The Consistency postulate stays the same:

B5* (Consistency) ¬Bel(∅|W).

The same arguments as before apply: B4* now entails that for every X ∈ A there is a least set Y, such that Bel(Y|X), which by B1* must be a subset of X. We denote this proposition again by: B_X. This is consistent with the corresponding notations that we used in the last section. Once again, we have

Bel(Y|X) if and only if Y ⊇ B_X if and only if Bel(Y|B_X).

The following postulate extends our previous Expansion postulate B6 to all cases of

conditional belief whatsoever. It corresponds to the standard AGM postulates K*7 and

K*8 for belief revision if translated again into the current context:

B6* (Revision)
For all X, Y ∈ A such that Y ∩ B_X ≠ ∅:
For all Z ∈ A, Bel(Z|X ∩ Y) if and only if Z ⊇ Y ∩ B_X.

Equivalently:

B6* (Revision)
For all X, Y ∈ A, such that for all Z ∈ A, if Bel(Z|X) then Y ∩ Z ≠ ∅:
For all Z ∈ A, Bel(Z|X ∩ Y) if and only if Z ⊇ Y ∩ B_X.

That is: if the proposition Y is consistent with B_X (equivalently: Y is consistent with everything ag believes conditional on X), then ag believes Z conditional on the conjunction of Y and X if and only if Z is logically entailed by the conjunction of Y with B_X. Just as the original B6 postulate it can be justified in terms of standard possible worlds accounts of similarity orderings (as for David Lewis’ conditional logic) or plausibility rankings (as in belief revision and nonmonotonic reasoning): say what a conditional belief expresses is again that the most plausible antecedent-worlds are consequent-worlds; then if some of the most plausible X-worlds are Y-worlds, these worlds must be precisely the most plausible X ∩ Y-worlds, hence the most plausible X ∩ Y-worlds are Z-worlds if and only if all the most plausible X-worlds that are Y-worlds are Z-worlds. Analogously to the last section, this is thus yet another equivalent statement of B6*:

B6* (Revision) For all X, Y ∈ A such that Y ∩ B_X ≠ ∅: B_{X∩Y} = Y ∩ B_X.

The generalised version BP1^{r*} of our previous BP1^r postulate arises simply by dropping the ‘Y ∩ B_W ≠ ∅’ restriction again. So we have:

BP1^{r*} (Likeliness) For all Y ∈ A with P(Y) > 0:
For all Z ∈ A, if Bel(Z|Y), then P(Z|Y) > r.

Finally, we generalise BP2 in the same way, and additionally we strengthen it by as-

suming also the converse of the resulting generalisation:

BP2* (Zero Supposition) For all Y ∈ A: P(Y) = 0 if and only if B_Y = ∅.

The reason why the original BP2 principle did not include the corresponding right-to-left direction of BP2* with the qualification ‘Y ∩ B_W ≠ ∅’ (that is, why we did not postulate: If B_Y = ∅ and Y ∩ B_W ≠ ∅, then P(Y) = 0) is that the resulting principle would have been empty: if Y ∩ B_W ≠ ∅, then by B6 the proposition B_Y would have to be non-empty, in contradiction with B_Y = ∅, so the antecedent of that direction would always have to be false.

We have seen in the last section that BP2, and hence BP2*, entails (given the other postulates): all Y ∈ A for which P(Y) = 1 holds are such that B_Y = B_W, and B_W is the least proposition in A of probability 1. The additional strengthening has it that the propositions the supposition of which leads to inconsistency qualitatively are precisely those for which conditionalization is undefined quantitatively. As mentioned before, if we had started with primitive conditional probability measures, which do allow for conditionalization on zero sets, then BP2* should not be taken for granted, but in the context of absolute probability measures BP2* is natural to postulate in order to treat qualitative and quantitative supposition similarly.

We are now ready to prove the main result of our theory on conditional beliefs in gen-

eral. The “soundness” direction of the following representation theorem incorporates the

corresponding direction of Grove’s (...) representation theorem for belief revision operators

in terms of sphere systems. However, since all the propositions or sets of worlds that we

are about to consider are required to be members of our given algebra A, it is not possible to

simply translate the more diﬃcult “completeness” part of Grove’s representation theorem

in ... into our present context and apply it, since his construction of spheres involves taking

unions of propositions that might not be members of our σ-algebra A anymore. That is

why the proof of that part of the theorem diﬀers from Grove’s proof quite signiﬁcantly.

Here is the theorem:

Theorem 9 Let Bel be a class of ordered pairs of members of a σ-algebra A, and let P : A → [0, 1]. Then the following two statements are equivalent:

I. P and Bel satisfy P1–P2, B1*–B6*, BP1^{r*}, BP2*.

II. P satisfies P1–P2, A contains a least set of probability 1, and there is a (uniquely determined) class 𝒳 of non-empty P-stable^r propositions in A, such that (i) 𝒳 includes the least set of probability 1 in A, (ii) all other members of 𝒳 have probability less than 1, and:

– For all Y ∈ A with P(Y) > 0: if, with respect to the subset relation, X is the least member of 𝒳 for which Y ∩ X ≠ ∅ holds (which exists), then for all Z ∈ A: Bel(Z|Y) if and only if Z ⊇ Y ∩ X.

Additionally, for all Y ∈ A with P(Y) = 0, for all Z ∈ A: Bel(Z|Y).

Proof. The right-to-left direction is like the one in Theorem 3, except that one shows first that the equivalence for Bel entails for all Y ∈ A with P(Y) > 0 that B_Y = Y ∩ X, where X is the least member of 𝒳 for which Y ∩ X ≠ ∅. The existence of that least member follows from Theorem 4, from the fact that every non-empty P-stable^r proposition with probability less than 1 is a subset of the least set in A with probability 1, and from the fact that the least set of probability 1 in A must have non-empty intersection with every proposition of positive probability. The proof of B6* is straightforward (and analogous to Grove’s theorem in ...), as is the proof of BP2*.

So we can concentrate on the left-to-right direction: P1–P2 are satisfied by assumption. Now we define 𝒳 by transfinite recursion as the class of all sets X_α of the following kind: For all ordinals α < β^r_P + 1 (the successor ordinal of the ordinal that was defined in the last section), let

X_α = ⋃_{γ<α} X_γ ∪ B_{W \ ⋃_{γ<α} X_γ}.

(So, in particular, X_0 = B_W.)
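On a finite algebra the recursion only runs through finitely many successor steps, X_0 = B_W and X_{k+1} = X_k ∪ B_{W \ X_k}, and it can be traced by hand or by machine. The Python sketch below is an illustration under explicit assumptions: conditional belief is generated from a sphere system in the way stated in II. of Theorem 9, every world has positive probability, and the seven-world measure of Section 3.4 with r = 1/2 serves as the example. The names `B`, `rebuild_spheres` and `P7` are hypothetical helpers. The point of the sketch is that the spheres can be recovered from the operator B alone.

```python
from fractions import Fraction
from itertools import combinations

def subsets(worlds):
    """All subsets of `worlds` as frozensets."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]

def prob(P, X):
    """P(X), where P maps each world to the probability of its singleton."""
    return sum((P[w] for w in X), Fraction(0))

def is_stable(P, X, r):
    """P(X|Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0."""
    return all(prob(P, X & Y) > r * prob(P, Y)
               for Y in subsets(P) if X & Y and prob(P, Y) > 0)

def stable_sets(P, r):
    """Non-empty P-stable^r sets, smallest first."""
    return sorted((X for X in subsets(P) if X and is_stable(P, X, r)), key=len)

def B(P, spheres, Y):
    """B_Y = Y ∩ X for the least sphere X that meets Y; empty if P(Y) = 0."""
    Y = frozenset(Y)
    if prob(P, Y) == 0:
        return frozenset()
    return Y & next(S for S in spheres if S & Y)

def rebuild_spheres(P, spheres):
    """X_0 = B_W, then X_{k+1} = X_k ∪ B_{W minus X_k} until probability 1 is reached."""
    W = frozenset(P)
    chain = [B(P, spheres, W)]
    while prob(P, chain[-1]) < 1:
        X = chain[-1]
        chain.append(X | B(P, spheres, W - X))
    return chain

P7 = {i: Fraction(s) for i, s in enumerate(
    ["0.54", "0.342", "0.058", "0.03994", "0.018", "0.002", "0.00006"], start=1)}
spheres = stable_sets(P7, Fraction(1, 2))
for X in rebuild_spheres(P7, spheres):
    print(sorted(X))
```

The loop terminates because P(X_k) < 1 implies that B_{W \ X_k} is a non-empty subset of W \ X_k, so each step strictly enlarges the set.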

At first we make a couple of observations about this class 𝒳:

(a) Every member of 𝒳 is also a member of A. By transfinite induction. For assume that all X_γ are in A for γ < α < β^r_P + 1: by the results of the last section, β^r_P is countable and so are its predecessors, and therefore by A being a σ-algebra, ⋃_{γ<α} X_γ ∈ A; thus W \ ⋃_{γ<α} X_γ ∈ A, and therefore B_{W \ ⋃_{γ<α} X_γ} ∈ A; hence, X_α ∈ A.

(b) For all γ < α < β^r_P + 1: X_γ ⊆ X_α. This follows directly from the definition of the members of 𝒳. From this it also follows that for all α + 1 < β^r_P + 1: X_{α+1} = X_α ∪ B_{W \ X_α}.

(c) For all α < β^r_P + 1: X_α = ⋃_{γ<α} B_{W \ ⋃_{δ<γ} X_δ} ∪ B_{W \ ⋃_{γ<α} X_γ}. By transfinite induction. Assume that for all γ < α: X_γ = ⋃_{δ<γ} B_{W \ ⋃_{η<δ} X_η} ∪ B_{W \ ⋃_{δ<γ} X_δ}. Substituting this for the first occurrence of ‘X_γ’ in the original definition of X_α, we conclude: X_α = ⋃_{γ<α} [⋃_{δ<γ} B_{W \ ⋃_{η<δ} X_η} ∪ B_{W \ ⋃_{δ<γ} X_δ}] ∪ B_{W \ ⋃_{γ<α} X_γ}. But this can be simplified to: X_α = ⋃_{γ<α} [B_{W \ ⋃_{δ<γ} X_δ}] ∪ B_{W \ ⋃_{γ<α} X_γ}, which was to be shown.

(d) For all α < β^r_P + 1: For all Y ∈ A with Y ∩ X_α ≠ ∅, it holds that B_Y ⊆ X_α. This is because: If Y ∩ X_α ≠ ∅, then by (c) there is a γ ≤ α, such that Y ∩ B_{W \ ⋃_{δ<γ} X_δ} ≠ ∅, and by the well-orderedness of the ordinals, there must be a least such ordinal γ. Note that for that least ordinal γ it holds that Y ∩ ⋃_{δ<γ} X_δ = ∅, and hence Y ⊆ W \ ⋃_{δ<γ} X_δ. By B6*, B_{[W \ ⋃_{δ<γ} X_δ] ∩ Y} = Y ∩ B_{W \ ⋃_{δ<γ} X_δ}, which is equivalent to B_Y = Y ∩ B_{W \ ⋃_{δ<γ} X_δ} by what we have shown before. Finally, because Y ∩ B_{W \ ⋃_{δ<γ} X_δ} ⊆ B_{W \ ⋃_{δ<γ} X_δ} ⊆ X_α by (c) again, it follows that B_Y ⊆ X_α.

(e) For all α < β^r_P + 1: X_α is P-stable^r. This can be derived from: For all Y ∈ A, if Y ∩ X_α ≠ ∅ and P(Y) > 0, then by (d), B_Y ⊆ X_α, and hence by the definition of ‘B_Y’: Bel(X_α|Y). But this implies by BP1^{r*} that P(X_α|Y) > r; therefore, X_α is P-stable^r.

(f) There exists a least proposition X ∈ A with probability 1, X ∈ 𝒳, and X is the only member of 𝒳 with probability 1.

Proof: First of all, either there are P-stable^r propositions in A with probability less than 1 or not: If so, then as shown in the last section their (countable) union is the least proposition X ∈ A with probability 1; if not, then as observed before, BP2* entails with the other postulates that B_W is the least X ∈ A of probability 1. In either case, there exists the least proposition X ∈ A with probability 1.

Secondly, we turn to the proof of: X ∈ 𝒳, and X is the only member of 𝒳 with probability 1. Assume for contradiction that all sets X_α with α < β^r_P + 1 have probability less than 1. Since they are all P-stable^r by (e), it follows from (b) that there is a well-ordered chain of (not necessarily strictly) increasing P-stable^r sets of probability less than 1, where the length of that chain is β^r_P + 1. That chain could not be a chain of strictly increasing P-stable^r sets of probability less than 1, by Observation 5 and by the definition of β^r_P, which is the ordinal type of all P-stable^r sets of probability less than 1 whatsoever. So there must be α < α′ < β^r_P + 1, such that X_α = X_{α+1}. Hence, by (b) again: X_α = X_α ∪ B_{W \ X_α}. Because P(X_α) < 1, it holds that P(W \ X_α) > 0 by P1, so by the right-to-left direction of BP2* it follows that B_{W \ X_α} ≠ ∅. Since B_{W \ X_α} ⊆ W \ X_α by the definition of ‘B_{W \ X_α}’ and B1*–B4*, a contradiction follows. Hence, we have that there must be at least one set X_α with α < β^r_P + 1 that has probability 1. Since β^r_P + 1 is an ordinal, there must be a least α < β^r_P + 1, such that P(X_α) = 1. By Observation 5, either β^r_P is finite or equal to ω. We will deal with these cases separately:

In the former case, there is a γ < β^r_P + 1, such that α = γ + 1, and, by (b) again: X_α = X_γ ∪ B_{W \ X_γ}. If there is a set Y ∈ A, such that P(Y) = 1 and Y is not a superset of X_α, then X_α ∩ ¬Y is non-empty, where X_α ∩ ¬Y is a zero set since ¬Y is. Because X_γ is P-stable^r with a probability of less than 1, it cannot contain any non-empty zero set, as shown in the previous section. So X_γ ∩ ¬Y is empty, and therefore B_{W \ X_γ} ∩ ¬Y must be non-empty. This implies by B6*: B_{[W \ X_γ] ∩ ¬Y} = ¬Y ∩ B_{W \ X_γ}. But [W \ X_γ] ∩ ¬Y is a set of probability 0 since ¬Y is, which means by BP2* that B_{[W \ X_γ] ∩ ¬Y} is empty, which is a contradiction. Therefore, all Y ∈ A with P(Y) = 1 are supersets of X_α, and so X_α is the least set in A of probability 1. Furthermore, if α < β^r_P, then X_{α+1} ∈ 𝒳, and by (b) again: X_{α+1} = X_α ∪ B_{W \ X_α}. But W \ X_α has probability 0 then, hence by BP2* it must hold that B_{W \ X_α} is empty, and so X_{α+1} = X_α. Thus, X_α, the least set in A of probability 1, remains the only set in 𝒳 with probability 1.

In the other case, where β^r_P = ω, if α < ω, then by the same reasoning as before, X_α, the least set in A of probability 1, remains the only set in 𝒳 with probability 1. Finally, if α = ω, then all sets X_γ with γ < ω must be P-stable^r sets with probability less than 1. If these sets are not pairwise distinct, they must be equal from some ordinal less than ω by (b), hence there is such an X_γ, such that X_α = X_γ ∪ B_{W \ X_γ}, which entails just as before that X_α is the least set in A of probability 1 and the only set in 𝒳 that has probability 1. On the other hand, if the sets X_γ with γ < ω are pairwise distinct, then by Observation 5, their union ⋃_{γ<ω} X_γ must be equal to the union of all P-stable^r sets with probability less than 1. And as shown immediately after Observation 5, that union is the least set in A of probability 1. By definition, X_α = X_ω = ⋃_{γ<ω} X_γ ∪ B_{W \ ⋃_{γ<ω} X_γ}, and since W \ ⋃_{γ<ω} X_γ is then a zero set, B_{W \ ⋃_{γ<ω} X_γ} is empty as follows from BP2*, and therefore X_α is again identical to the least set in A with probability 1, and it is the only set in 𝒳 with probability 1 since α = ω = β^r_P is the last ordinal less than β^r_P + 1 in the present case. This concludes (f): the least set X with probability 1 is a member of A and indeed of 𝒳, and X is the only member of 𝒳 with probability 1.

Now let Y ∈ A with P(Y) > 0: By P1 and (f), there is a member of 𝒳 with which Y has non-empty intersection. Let α < β^r_P + 1 be least, such that Y ∩ X_α ≠ ∅: because of (b), X_α is then with respect to the subset relation the least member of 𝒳 for which this holds. We now show that B_Y = Y ∩ X_α, from which the relevant part of II. follows by means of the definition of B_Y and B1*–B4*. From (d) we know already that B_Y ⊆ X_α and hence with B1*–B4*, B_Y ⊆ Y ∩ X_α. Now consider Y ∩ X_α again, which by assumption is non-empty: By (c), X_α = ⋃_{γ<α} B_{W \ ⋃_{δ<γ} X_δ} ∪ B_{W \ ⋃_{γ<α} X_γ}. If Y had non-empty intersection with any set of the form B_{W \ ⋃_{δ<γ} X_δ} for γ < α, then Y ∩ X_γ ≠ ∅, by (c) again, in contradiction with the way in which α was defined before. Therefore, Y ∩ X_α = Y ∩ B_{W \ ⋃_{γ<α} X_γ} ≠ ∅. The latter implies with B6* that B_{[W \ ⋃_{γ<α} X_γ] ∩ Y} = Y ∩ B_{W \ ⋃_{γ<α} X_γ}. As in the proof of (d), Y ∩ ⋃_{γ<α} X_γ is empty, and thus [W \ ⋃_{γ<α} X_γ] ∩ Y = Y. So we have B_Y = Y ∩ B_{W \ ⋃_{γ<α} X_γ} = Y ∩ X_α and we are done.

Finally, consider Y ∈ A with P(Y) = 0: By BP2*, B_Y = ∅, from which the remaining part of II. follows by means of the definition of B_Y and B1*–B4* again.

Uniqueness follows from: if there are two such classes 𝒳, 𝒳′ with the stated properties, then they must differ with respect to at least one P-stable^r set with probability less than 1. Without loss of generality, let X_α be the first member of 𝒳 that is not also a member of 𝒳′: since X_α is P-stable^r and has probability less than 1, it follows just as before that α is finite. If α = 0, then B_W could not be the same as being given by 𝒳 and 𝒳′, which would be a contradiction. If α is a successor ordinal γ + 1, then B_{W \ X_γ} = X_α \ X_γ could not be the same as being given by 𝒳 and 𝒳′, which would again be a contradiction.

Theorem 9 generalises Theorem 3 of the last section to conditional beliefs in general; Theorem 3 simply dealt with the special case of a sphere system of just one P-stable^r set.

It remains to generalise BP3 in the now obvious way:

BP3* (Maximality)
Among all classes Bel′ of ordered pairs of members of A, such that P and Bel′ jointly satisfy P1–P2, B1*–B6*, BP1^{r*}, BP2*, the class Bel is the largest one.

In other words, for all such Bel′: Bel ⊇ Bel′.

Using this, we can derive:

Corollary 10 Let Bel be a class of ordered pairs of members of a σ-algebra A, and let P : A → [0, 1]. Then the following two statements are equivalent:

III. P and Bel satisfy P1–P2, B1*–B6*, BP1^{r*}, BP2*, BP3*.

IV. P satisfies P1–P2, A contains a least set of probability 1, and if 𝒳 is such that (and indeed is uniquely determined by) (i) 𝒳 includes the least set of probability 1 in A, and (ii) all the other members of 𝒳 are precisely all the non-empty P-stable^r propositions in A which have probability less than 1, then:

– For all Y ∈ A with P(Y) > 0: if, with respect to the subset relation, X is the least member of 𝒳 for which Y ∩ X ≠ ∅ holds (which exists), then for all Z ∈ A: Bel(Z|Y) if and only if Z ⊇ Y ∩ X.

Additionally, for all Y ∈ A with P(Y) = 0, for all Z ∈ A: Bel(Z|Y).

This follows immediately from Theorem 9, except that we have to show: adding ‘BP3∗’ to
I. of Theorem 9 is equivalent to determining 𝒳 as in IV. of Corollary 10.

But that is a consequence of the following independent observation:

Observation 11 Let P be a countably additive probability measure on a σ-algebra A over
W. Assume that A contains a least set of probability 1, let 𝒳, 𝒳′ be classes of non-empty
P-stable^r propositions for which (i) and (ii) of II. of Theorem 9 are satisfied. Let Bel, Bel′
be defined in terms of 𝒳, 𝒳′, respectively, as stated in II. of Theorem 9. Then it holds:

If 𝒳 ⊆ 𝒳′, then for all Y, Z ∈ A: If Bel(Z | Y) then Bel′(Z | Y).


Proof. Let 𝒳 ⊆ 𝒳′. For Y with P(Y) = 0 there is nothing to show. So let Y be such that
P(Y) > 0: If Bel(Z | Y), then by definition Z ⊇ Y ∩ X with X being the least member of 𝒳
for which Y ∩ X ≠ ∅ holds. But since X is also a member of 𝒳′, the least member X′ of 𝒳′
for which Y ∩ X′ ≠ ∅ holds must then be a subset of X; hence, Z ⊇ Y ∩ X′ and therefore
Bel′(Z | Y).
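The inclusion step of this proof can be written out as a short chain. Here 𝒳, 𝒳′ denote the two classes and X, X′ their least members that meet Y. The sketch assumes that the members of 𝒳′ are linearly ordered by ⊆, which we take to be guaranteed by the earlier results on P-stable^r sets; that assumption is not restated in this section.

```latex
\begin{align*}
X \in \mathcal{X} \subseteq \mathcal{X}',\quad Y \cap X \neq \emptyset
  &\;\Longrightarrow\; X' \subseteq X
  && \text{($X'$ is the $\subseteq$-least member of $\mathcal{X}'$ meeting $Y$)}\\
  &\;\Longrightarrow\; Y \cap X' \subseteq Y \cap X \subseteq Z
  && \text{(since $\mathrm{Bel}(Z \mid Y)$ gives $Z \supseteq Y \cap X$)}\\
  &\;\Longrightarrow\; \mathrm{Bel}'(Z \mid Y).
\end{align*}
```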

From this it follows that choosing 𝒳 to be the greatest class of all non-empty P-stable^r
propositions in A such that (i) and (ii) of II. of Theorem 9 are satisfied must lead to the
maximal class Bel of pairs of propositions in A, if Bel is given as in II. of Theorem
9. But that is exactly what we did in IV. of Corollary 10. Note that unlike the case of
absolute belief, where we were only interested in the least P-stable^r proposition
B_W, the additional Least Certain Set Restriction on P is even entailed by our postulates on
subjective probability and belief. So when we finally turn IV. of Corollary 10 into an
explicit definition of belief on the basis of P, but this time of conditional belief in general,
then doing so “just” for probability measures for which there exist least propositions of
probability 1 is not an actual constraint (given that our postulates are plausible). After all, only
such probability measures can be combined with any class Bel at all, such that all of our
postulates are satisfied jointly by them.

This is thus the intended materially adequate explicit deﬁnition of conditional belief:

Definition 12 Let P : A → [0, 1] be a countably additive probability measure on a
σ-algebra A, such that there exists a least set of probability 1 in A. Let 𝒳 be uniquely
determined by: (i) 𝒳 includes the least set of probability 1 in A, (ii) and all the other
members of 𝒳 are precisely all the non-empty P-stable^r propositions in A which have
probability less than 1.

Then we say for all Y, Z ∈ A and 1/2 ≤ r < 1:

Z is believed conditional on Y (to a cautiousness degree of r) as being given by P if and
only if either (i) P(Y) > 0 and Z is a superset of the intersection of Y with the least
non-empty P-stable^r proposition X_least in A that has a non-empty intersection with Y (which
exists), or (ii) P(Y) = 0.
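For a finite set of worlds, the definition can be tried out by brute force. The following Python sketch is only an illustration under stated assumptions: the power set of W plays the role of A; P-stability is taken in the sense that X is P-stable^r if and only if P(X | Y) > r for every Y with P(Y) > 0 and Y ∩ X ≠ ∅ (the definition from the earlier sections, not restated here); and the function names and the four-world measure are hypothetical, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def subsets(worlds):
    """All subsets of `worlds` as frozensets (the power set stands in for A)."""
    ws = sorted(worlds)
    return [frozenset(c) for n in range(len(ws) + 1) for c in combinations(ws, n)]


def prob(P, X):
    """Probability of proposition X under the world weights P."""
    return sum((P[w] for w in X), Fraction(0))


def is_stable(P, X, r):
    """Assumed definition: P(X | Y) > r for all Y with P(Y) > 0 that meet X."""
    for Y in subsets(P):
        pY = prob(P, Y)
        if pY > 0 and X & Y and prob(P, X & Y) / pY <= r:
            return False
    return True


def sphere_system(P, r):
    """Least set of probability 1 plus all non-empty P-stable^r sets of
    probability < 1. These sets are nested, so sorting by size orders them by ⊆."""
    least_certain = frozenset(w for w in P if P[w] > 0)
    stable = {X for X in subsets(P)
              if X and prob(P, X) < 1 and is_stable(P, X, r)}
    return sorted(stable | {least_certain}, key=len)


def believes(P, r, Z, Y):
    """Bel(Z | Y): trivially true if P(Y) = 0; otherwise Z must include
    Y ∩ X for the least member X of the sphere system that meets Y."""
    Y, Z = frozenset(Y), frozenset(Z)
    if prob(P, Y) == 0:
        return True
    X = next(X for X in sphere_system(P, r) if X & Y)
    return Z >= (Y & X)


# Hypothetical four-world measure (illustration only).
P = {"a": Fraction(12, 20), "b": Fraction(5, 20),
     "c": Fraction(2, 20), "d": Fraction(1, 20)}
r = Fraction(1, 2)

print([sorted(X) for X in sphere_system(P, r)])
print(believes(P, r, {"b"}, {"b", "c", "d"}))  # least sphere meeting Y is {a, b}
print(believes(P, r, {"c"}, {"b", "c"}))       # Y ∩ {a, b} = {b} is not inside {c}
```

In this example the sphere system is the chain {a} ⊂ {a, b} ⊂ {a, b, c} ⊂ W, so supposing {b, c, d} skips the innermost sphere and the agent believes b on that supposition. The enumeration over all subsets is exponential in the number of worlds and is meant for small illustrations only.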

By ‘materially adequate’ we mean the same as at the end of the previous section: the

deﬁnition is a true, and even conceptually true, sentence, if taken as a descriptive statement

and if given our postulates.

In analogy with the case of absolute beliefs, we could now define notions of prima facie
conditional belief and conditional belief a priori again, and again we would end up with
three notions of belief of increasing strength: prima facie conditional belief, conditional
belief (to a cautiousness degree of r), and conditional belief a priori. Of course, conditional
belief in the sense of Definition 12 is the concept that we propose as an explication
of our pre-theoretic notion of conditional belief simpliciter.


[APPLICATIONS AND EXTENSIONS LEFT OUT.]


follow this tradition. Belief simpliciter, which only recognizes belief, disbelief, and suspension of judgement, is closed under deductive inference as long as every proposition that an agent is committed to believe is counted as being believed in an idealised sense; this is how epistemic logic conceives of belief, and we will subscribe to this view in the following. Despite of these logical diﬀerences between the two notions of belief, it would be quite surprising if it did not turn out that quantitative and qualitative belief were but aspects of one and the same underlying substratum; after all, they are both concepts of belief. However, this still allows for a variety of possibilities: they could be mutually irreducible conceptually, with only some more or less tight bridge laws relating them; or one could be reducible to the other, without either of them being eliminable from scientiﬁc or philosophical thought; or either of them could be eliminable. So which of these options should we believe to be true? The concept of quantitative belief is being applied successfully by scientists, such as cognitive psychologists, economists, and computer scientists, but also by philosophers, in particular, in epistemology and decision theory; eliminating it would be detrimental both to science and philosophy. On the other hand, it has been suggested (famously, by Richard Jeﬀrey) that the concept of belief simpliciter can, and should, be eliminated in favour of keeping only quantitative belief. But this is not advisable either: (i) Epistemic logic, huge chunks of cognitive science, and almost all of traditional epistemology rely on the concept of belief in the qualitative sense; by abandoning it one would simply have to sacriﬁce too much. 
(ii) Beliefs held by some agent are the mental counterparts of the scientiﬁc theories and hypotheses that are held by a scientist or a scientiﬁc community; they can be true or false just as those theories and hypotheses can be (taking for granted a realist view of scientiﬁc theories). But not many would recommend banning the concept holding a scientiﬁc theory/hypothesis from science or philosophy of science. (iii) The concept of belief simpliciter, which is a classiﬁcatory concept, occupies a more elementary scale of measurement than the numerical concept of quantitative belief does, which is precisely one of the reasons why it is so useful. That is also why giving up on any of the standard properties of rational belief, such as closure under conjunction (the Conjunction property)—if X and Y are believed, then X ∧ Y is believed—as some have suggested in response to lottery-type paradoxes (see Kyburg...), would not be a good idea: for without these properties belief simpliciter would not be so much less complex than quantitative belief anymore (however, see Hawthorne & Makinson...). But then one could have restricted oneself to quantitative belief from the start, and in turn one would lack the simplifying power of the qualitative belief concept. (iv) Beliefs involve dispositions to act under certain conditions. For instance, if I believe that my original edition of Carnap’s Logical Syntax is on the bookshelf in my oﬃce, then given the desire to look something up in it, and with the right background conditions being satisﬁed, such as not being too tired, not being distracted by anything else, 2

and so on, I am disposed to go to my oﬃce and pick it up. The same belief also involves lots of other dispositions, and what holds all of these dispositions together is precisely that belief. If one looks at the very same situation in terms of degrees of belief, then with everything else in place, it will be a matter of what my degree of belief in the proposition that Carnap’s Logical Syntax is in my oﬃce is like whether I will actually go there or not, and similarly for all other relevant dispositions. Somehow the continuous scale of degrees of belief must be cut down to a binary decision: acting in a particular way or not. And the qualitative concept of belief is exactly the one that plays that role, for it is meant to express precisely the condition other than desire and background conditions that needs to be satisﬁed in order for to me to act in the required way, that is, for instance, to walk to the oﬃce and to pick up Carnap’s monograph from the bookshelf. Decision theory, which is a probabilistic theory again, goes some way of achieving this without using a qualitative concept of belief, but it does not quite give a complete account. Take assertions as a class of actions. One of the linguistic norms that govern assertability is: If all of A1 , . . . , An are assertable for an agent, then so is A1 ∧ . . . ∧ An . One may of course attack this norm on different grounds, but the norm still seems to be in force both in everday conversation and in scientiﬁc reasoning. Here is plausible way of explaining why we obey that norm by means of the concept of qualitative belief: Given the right desires and background conditions, a descriptive sentence gets asserted by an agent if and only if the agent believes the sentence to be true. And the assertability of a sentence A is just that very necessary epistemic condition for assertion—belief in the truth of A—to be satisﬁed. (Williamson... 
states an analogous condition in terms of knowledge rather than belief; but it is again a qualitative concept that is used, not a quantitative one.) But if an agent believes all of A1 , . . . , An , then the agent believes, or is at least epistemically committed to believe, also A1 ∧ . . . ∧ An . That explains why if A1 , . . . , An are assertable for an agent, so is A1 ∧ . . . ∧ An . And it is not clear how standard decision theory just by itself, without any additional resources at hands, such as a probabilistic explication of belief, would be able to give a similar explanation. The assertability of indicative conditionals A → Bi makes for a similar case. Here, one of the linguistic norms is: If all of A → B1 , . . . , A → Bn are assertable for an agent, then so is A → (B1 ∧ . . . ∧ Bn ). This may be explained by invoking the Ramsey test for conditionals (see...) as follows: Given the right desires and background conditions, A → Bi gets asserted by an agent if and only if the agent accepts A → Bi , which in turn is the case if and only if the agent believes Bi to be true conditional on the supposition of A. Again, the assertability of a sentence, A → Bi , is just that respective necessary epistemic condition—belief in Bi on the supposition of A—to be satisﬁed. But, if an agent believes all of B1 , . . . , Bn conditional on A, then the agent believes, or is epistemically committed to believe, also B1 ∧ . . . ∧ Bn on the supposition of A. Therefore, if A → B1 , . . . , A → Bn are assertable for an agent, so is A → (B1 ∧ . . . ∧ Bn ). Ernest Adams’ otherwise marvel3

lous probabilistic theory of indicative conditionals (...), which ties the acceptance of any such conditional to its corresponding conditional subjective probability and hence to the quantitative counterpart of conditional belief, does not by itself manage to explain such patterns of assertability. While from Adams’ theory one is able to derive that the uncertainty (1 minus the corresponding conditional probability) of A → (B1 ∧ . . . ∧ Bn ) is less than or equal the sum of the uncertainties of A → B1 , . . . , A → Bn , and thus if all of the conditional probabilities that come attached to A → B1 , . . . , A → Bn tend to 1 then so does the conditional probability that is attached to A → (B1 ∧ . . . ∧ Bn ), it also follows that for an increasing number n of premises, ever greater lower boundaries 1 − δ of the conditional probabilities for A → B1 , . . . , A → Bn are needed in order to guarantee that the conditional probability for A → (B1 ∧ . . . ∧ Bn ) is bounded from below by a given 1 − . No uniform boundary emerges that one might use in order to determine for a conditional—whether premise or conclusion, whatever the number of premises, or whether in the context of an inference at all—its assertability simpliciter. But since there is only assertion simpliciter, at some point a condition must be invoked that discriminates between what is a case of asserting and what is not. Once again the concept of (conditional) qualitative belief gives us exactly what we need. The upshot of this is: Neither the concept of quantitative belief nor the concept of qualitative belief ought to be eliminated from science or philosophy. But this leaves open, in principle, the possibility of reducing one to the other without eliminating either of them— using traditional terminology: one concept might simply turn out to be logically prior to the other. 
Now, reducing degrees of belief to belief simpliciter seems unlikely (no pun intended!), simply because the formal structure of quantitative belief is so much richer than the one of qualitative belief. But for the same reason, at least prima facie, one would think that the converse ought to be feasible: by abstracting in some way from degrees of belief, it ought to be possible to explicate belief simpliciter in terms of them. Belief simpliciter would thus be qualitative only at ﬁrst glance; its deeper logical structure would turn out to be quantitative after all. One obvious suggestion of how to explicate belief simpliciter on the basis of degrees of belief is to maintain that having the belief that X is just having assigned to X a degree of belief strictly above some threshold level less than 1 (this is called the Lockean thesis by Richard Foley... more about which below). If that threshold is also 1 greater than or equal to 2 , then belief would simply amount to high subjective probability. But since the probability of X ∧ Y might well be below the threshold even when the probabilities of X and Y are not, one would thus have to sacriﬁce logical properties such as the Conjunction property, which one should not, as mentioned above. While the Lockean thesis seems materially ﬁne, for qualitative belief does seem to be close to high subjective probability, it does not get the logical properties of qualitative belief right. Or one identiﬁes the belief that X with having a degree of belief of 1 in X: call this the ‘probability 4

since all belief cores can be shown to have absolute probability 1. but should I therefore be forced to assign the same degree of belief to them? (ii) One can believe X without being disposed to accept every bet whatsoever on X. our pretheoretic notions of belief-in-degrees and belief simpliciter have the following epistemic and pragmatic properties: (i) One can believe X and Y without assigning the same degree of belief to them. If qualitative belief inherits this general conjunction property from truth—maybe because truth is what qualitative beliefs aim at. But then at least one of X and Y must have a probability other than 1. Horacio Arlo-Costa. one could abandon the standard interpretation of subjective probabilities in terms of betting quotients.1 proposal’. and I also believe that every natural number has a successor. subjective probability measures do not. More importantly. it is not perfect on that side either.. belief simpliciter. and if I were to lose lose 1000 Pound if not. Truth for propositions is certainly closed under taking conjunctions of arbitrary cardinality. and hence a degree of belief of less than 1 should not be regarded as partial belief. But the same problems as mentioned before emerge. being assigned probability 1 is not so except for those cases in which probability assignments simply coincide with truth value assignments.. that is. apart from such logical considerations. the proposal is materially wrong. Additionally. but in the presence of uncertainy. does not coincide with having a degree of belief of 1. however. As Roorda (. according to which within the quantitative structure of primitive conditional probability measures (Popper functions) one can always ﬁnd so-called belief cores. I believe that my desk will still be there when I enter my oﬃce tomorrow. about which more later): for full belief. which are propositions with particularly nice and plausible logical properties. 
and since two distinct belief cores diﬀer only in terms of some set 5 . For instance. later we will see that our theory will allow for a reconciling oﬀer in that direction. the axioms of Popper functions are certainly more controversial than those of the standard absolute or unconditional probability measures. at least as long as the stakes of the bet are not too extravagant. by taking supersets of those one can deﬁne elegantly notions of qualitative belief in diﬀerent variants and strengths. All of these points also apply to a much more nuanced version of the probability 1 proposal which was developed by Bas van Fraasen. For example. This shows that Ramsey’s term ‘partial belief’ for subjective probability is in fact misleading (or at least ambiguous. but breaking with such a successful tradition comes with a price of its own. but with it we are going to deal later. whether directly or indirectly—then an explication of qualitative belief in terms of probability 1 is simply not good enough. and Rohit Parikh. However. I do believe that I will be in my oﬃce tomorrow. While this does much better on the logical side. although the latter ought be that case by the standard Bayesian understanding of probabilities if one assigns probability 1 to X. too.) pointed out. But I would refrain from accepting a bet on this if I were oﬀered 1 Pound if I were right. (Alternatively.) Roorda’s presents a third argument against the probability 1 proposal based on considerations on fallibilism.

it is just that having the qualitative belief that A will turn out to be deﬁnable in terms of assignments of consistently high degrees of belief..) and recently by James Hawthorne (.. we will bite the bullet and stick to just one probability measure below. so you might just as well use it. without stripping qualitative belief of any of its constitutive properties. it is impossible to understand qualitative belief just in terms of quantitative belief or the other way round.). and without thereby intending to eliminate the concept of belief simpliciter in favour of quantitative belief. Given all of these problems. the message will be: within your subjective probability measure you ﬁnd qualitative belief anyway. We will also point out which consequences this has for various problems in philosophy of science. In the words of Jonathan Roorda (.). Roorda himself then goes on to suggest an explication that is based on sets of subjective probability measures rather than just one probability measure as standard Bayesianism has it. by Isaac Levi (. without revising the intended interpretation of subjective probabilities in any way. In what follows.. And for the convinced Bayesian. Before we turn to the details of our theory. “The depressing conclusion .. we will see that the logical properties of belief cores are enormously attractive: we will return to this later.. the only remaining option seems to be: neither of quantitative or qualitative belief can be reduced to the other. and the philosophy of language. . it is probably fair to say that something like this is the dominating view in epistemology these days. On the other hand. while there are certainly bridge principles of some kind that relate the two. where what this means exactly will be clariﬁed below. 
one wonders whether in many practically relevant situations in which only probability measures on ﬁnite spaces are needed and where often there are no non-empty zero sets at all—or otherwise the corresponding worlds with zero probabilistic weight would simply have been dropped from the start—the analysis is too far removed from the much more mundane reality of real-world reasoning and epistemological thought experiments. when we will show that it is actually possible to restore most of them in the new setting that we are going to propose. we will ﬁrst sketch the underlying idea of the explication. 6 . without running into any of the diﬃculties that we found to aﬀect the standard proposals for quantitative explications of belief.of absolute probability 0. In contrast.. is that no explication of belief is possible within the conﬁnes of the probability model”. we are going to argue against this view: we aim to show that it is in fact possible to reduce belief simpliciter to probabilistic degrees of belief by means of an explicit deﬁnition. Summing up: Reducing qualitative belief to quantitative belief does not seem to work either. who despises qualitative belief. Both notions of belief will be preserved. epistemology. A view like this has been proposed and worked out in detail. And apart from extreme Bayesians who believe that one can do without the concept of qualitative belief. for example. .

and upon a due ballancing the whole. his emphasis) and the Mind if it will proceed rationally. as we know from lottery paradox situations. Book IV. before it assents to or dissents from it.. 140f) calls the Lockean thesis.. reason. (Locke.. and see how they make more or less. as we cannot have undoubted Knowledge of their Truth: yet some of them border so near upon Certainty.. but assent to them ﬁrmly. 7 . it is not yet good enough: there are logical principles for belief (such as the Conjunction principle) which we regard as just as essential to the belief in X as assigning a suﬃciently high subjective probability to X. reject. as if they were infallibly demonstrated. for or against any probable Proposition. Let X be a proposition in the domain of P: X is believed prima facie as being given by P if and only if P(X) > r. 656. are such. we take the Lockean thesis to characterise a more preliminary notion of belief. that is: to say that you believe a proposition is just to say that you are suﬃciently conﬁdent of its truth for your attitude to be one of belief and consequently it is rational for you to believe a proposition just in case it is rational for you to have a suﬃciently high degree of conﬁdence in it. and it is precisely these logical principles that which are invalidated if the Lockean thesis is turned into a deﬁnition of belief. Book IV. suﬃciently high to make your attitude toward it one of belief. Chapter XV.. p. nay act upon. and act. that we make no doubt at all about them. discourse. or what one might call prima facie belief: Deﬁnition 1 Let P be a subjective probability measure.. with a more or less ﬁrm assent. p.. proportionably to the preponderancy of the greater grounds of Probability on the one side or the other. Instead. according to that Assent.. and that our Knowledge of them was perfect and certain (Locke. as exempliﬁed by most of the Propositions we think. Chapter XV. ought to examine all the grounds of Probability. 
He takes this to be derivative from Locke’s views on the matter. pp. as resolutely. or receive it..2 The Basic Idea Our starting point is again what Richard Foley (. his emphasis) We take this account of belief simpliciter in terms of high degrees of belief to be right in spirit. 655. However.

so that high subjective probability is still a necessary condition for belief but it is not anymore demanded to be a suﬃcient one. . and if Bel is the class of believed propositions by the agent at t (and both relate to the same underlying class of propositions). we suggest to drop just the right-toleft direction of the Lockean thesis. these logical principles do not apply to beliefs taken by themselves but rather to systems of beliefs taken as wholes. Instead of it. This will give us then something of the following form: • If P is an agent’s degree-of-belief function at a time t. all beliefs simpliciter will be among the prima facie candidates for beliefs. . Unlike the deﬁnition of prima facie belief which expresses a condition to be satisﬁed by single beliefs. a proposition is believed prima facie in view of the fact that it has an epistemic feature that speaks in favour of it being a belief proper—that is. ultimately. Accordingly. belief under a supposition. to have a suﬃciently high subjective probability—and as long as no other of its epistemic properties tells against it being such.) (2) Logical constraints: 8 . then they have the following properties: (1) Probabilistic constraint: ∗ P is a probability measure. On the other hand. And arguably belief simpliciter under a supposition is just as important for our epistemic lives as belief simpliciter taken absolutely or unconditionally. (Additional constraints on P. we need to formulate the result as a constraint on an agent’s belief system or class. . more needs to be said about the threshold value r here. that is. In analogy with the case of prima facie obligations in ethics. The left-toright direction is going to ensure that beliefs remain reasonably cautious—how cautious will depend on the “cautiousness parameter” r—and that they inherit all the dispositional consequences of having suﬃciently high degrees of belief. 
we will not just do this for absolute or unconditional belief—the belief that X is the case—but also for conditional belief. Furthermore. we will regard all the standard logical principles for belief as being constitutive of belief from the start. generalizing the left-to-right direction of the original Lockean thesis to cases of conditional belief will pave the way to our ultimate understanding of belief. Indeed. it will in fact be properly believed. but let us postpone this discussion. the right-toleft direction was the one that got us into lottery-paradox-like trouble. Therefore. Thus. as in: the belief that X is the case under the supposition that Y is the case. as far as belief itself is concerned. when putting together the left-toright direction of the Lockean thesis with these logical postulates.Of course.

) . there might simply not be any such class Bel at all. Z: if Y ∈ Bel and Z ∈ Bel. comes to mind: given P. and no restriction of bounded 9 . and (3) might well do as a meaning postulate on ‘Bel’ and ‘P’. . . ∗ For all propositions Y.) classic method of deﬁning theoretical terms. We will prove later that this is not so.) (3) Mixed constraints: ∗ For all propositions X ∈ Bel. (Other standard logical principles for Bel and their extensions to conditional belief. . we will be able to prove that this worry does not get conﬁrmed. in fact.∗ For all propositions Y. for given P. obviously this is not an explicit deﬁnition of ‘Bel’ on the basis of ‘P’ anymore. what if there were a largest such class Bel? That class would have all the intended properties. Even with that in place. which builds on work by Ramsey and Carnap. The class would thus have every right to be counted as the class of beliefs at a time t of an agent whose subjective probability measure at that time is P. then Z ∈ Bel. then Bel(Y ∩ Z). such that the conditions on Bel and P above are the case. ∗ No logical contradiction is a member of Bel. and it would contain every proposition that is a member of any class Bel as above. for every given P and for every two distinct classes Bel which satisfy the conditions above (relative to that P) it is always the case that one of the two contains the other as a subset. But of course this invites all the standard worries about such deﬁnitions by deﬁnite description: First of all. Is there any hope of turning it into an explicit deﬁnition of belief again? Immediately. .. It would therefore maximize the extent by which prima facie beliefs in the sense deﬁned before are realized in terms of actual beliefs. ∗ (An extension of this to conditional belief. But then again. David Lewis’ (.) While the conjunction of (1). . (Additional mixed constraints on P and Bel. there might be more than just one class Bel that satisﬁes the constraints above. 
Z: if Y ∈ Bel and Y logically entails Z. deﬁne ‘Bel’ to be the class. P(X) > r. for some P. (2). there might even be two such classes that contain mutually inconsistent propositions. one would still have to decide which class Bel in the resulting chain of belief classes ought to count as the “actual” belief class as being given by P in order to satisfy the uniqueness part of our intended deﬁnition by deﬁnite description. Secondly. Fortunately. In other words: it would approximate as closely as possible the right-to-left direction of the Lockean thesis that we were forced to drop in view of the logical principles of belief. at least for many P. Worse..

∗ (An extension of this to conditional belief. . and we will assume that the ﬁctional epistemic agent ag that we will deal with has belief states of both kinds available 10 . ∗ No logical contradiction is a member of Bel. then a proposition (in the domain of P) is believed as being given by P if and only if it is a member of the largest class Bel of propositions that satisﬁes the following properties: (1) Belief constraints: ∗ For all propositions Y. Belief simpliciter will therefore have been reduced to degrees of belief.) . indeed it does. P(X) > r. .. . What we will have found then is that the following is a materially adequate and explicit deﬁnition of an agent’s beliefs in terms of the agent’s subjective probability measure: • If P is an agent’s subjective probability measure at a time t that satisﬁes the additional constraints. . (Additional mixed constraints on P and Bel. then Bel(Y ∩ Z). it will turn out to be possible to characterize the deﬁning conditions of belief just in terms of a simple and independently appealing quantitative condition on P and elementary set-theoretic operations and relations. Z: if Y ∈ Bel and Z ∈ Bel. The remaining sections will be devoted to applications and extensions of the theory.variables to “natural” classes as in Lewis’ original proposal would be necessary at all. 3 The Reduction of Belief I: Absolute Beliefs The goal of this section and the subsequent one is to enumerate a couple of postulates on quantitative and qualitative beliefs and their interaction. then Z ∈ Bel. In the following two sections. we are going to execute this strategy in all formal details. .) So we will have managed to deﬁne belief simpliciter just in terms of ‘P’ and logical and set-theoretical vocabulary. ∗ For all propositions Y. . Z: if Y ∈ Bel and Y logically entails Z. In fact. . If such a largest belief class exists. of course—but as we will prove later.) (2) Mixed constraints: ∗ For all propositions X ∈ Bel. 
(Other standard logical principles for Bel and their extensions to conditional belief.

which obey these postulates.

The terms ‘P’ and ‘Bel’ that will occur in these postulates should be thought of as primitive first, with each postulate expressing a constraint either on the reference of ‘P’ or on the reference of ‘Bel’ or on the references of ‘P’ and ‘Bel’ simultaneously. Even though initially we will present these constraints on subjective probability and belief in the form of postulates or axioms, it will turn out that they will be strong enough to constrain qualitative belief in a way such that the concept of qualitative belief ends up being definable explicitly just on the basis of ‘P’, that is, in terms of quantitative belief (and a cautiousness parameter) only. When we state the theorems from which this follows, ‘P’ and ‘Bel’ will become variables, so that we will be able to say: For all P, Bel, it holds that P and Bel satisfy so-and-so if and only if . . . . Accordingly, in the definition of belief simpliciter itself, ‘P’ will be a variable again, and ‘Bel’ will be a variable the extension of which is defined on the basis of ‘P’ (and mathematical vocabulary). We will keep using the same symbols ‘P’ and ‘Bel’ for all of these purposes, but their methodological status should always become clear from the context.

3.1 Probabilistic Postulates

Consider an epistemic agent ag which we keep fixed throughout the article. Let W be a (non-empty) set of logically possible worlds. Say, at t our agent ag is capable in principle of entertaining all and only propositions (sets of worlds) in a class A of subsets of W, where A is formally a σ-algebra over W, that is: W and ∅ are members of A; if X ∈ A then the relative complement of X with respect to W, W \ X, is also a member of A; for X, Y ∈ A, X ∪ Y ∈ A; and finally if all of X_1, X_2, . . . , X_n, . . . are members of A, then $\bigcup_{n \in \mathbb{N}} X_n \in A$. It follows that A is closed under countable intersections, too. A is not demanded to coincide with some power set algebra, instead A might simply not count certain subsets of W as propositions at all.

We will extend the standard logical terminology that is normally defined for formulas or sentences to propositions in A: so when we speak of a proposition as a logical truth we actually have in mind the unique proposition W, when we say that a proposition is consistent we mean that it is non-empty, when we refer to the negation of a proposition X we do refer to its complement relative to W (and we will denote it by ‘¬X’), the conjunction of two propositions is of course their intersection, and so on. We shall speak of conjunctions and disjunctions of propositions even in cases of infinite intersections or unions of propositions.

Let P be ag’s degree-of-belief function (quantitative belief function) at time t. Following the Bayesian take on quantitative belief, we postulate:

P1 (Probability) P is a probability measure on A, that is, P has the following properties:

P : A → [0, 1]; P(W) = 1; P is finitely additive: if X_1, X_2 are pairwise disjoint members of A, then P(X_1 ∪ X_2) = P(X_1) + P(X_2). Conditional probabilities are introduced by: P(Y|X) = P(Y ∩ X) / P(X) whenever P(X) > 0.

As far as our familiar treatment of conditional probabilities in terms of the ratio formula for absolute or unconditional probabilities is concerned, we should stress that the elegant theory of primitive conditional probability measures (Popper functions) would allow P(Y|X) to be defined and non-trivial even when P(X) = 0 (that is, as we will sometimes say, when X is a zero set as being given by P). But the theory is still not accepted widely, and we want to avoid the impression that the theory in this paper relies on Popper functions in any sense. We shall nevertheless have occasion to return to Popper functions later in some parts of the paper.

To P1 we add:

P2 (Countable Additivity) P is countably additive (σ-additive): if X_1, X_2, . . . , X_n, . . . are pairwise disjoint members of A, then $P(\bigcup_{n \in \mathbb{N}} X_n) = \sum_{n=1}^{\infty} P(X_n)$.

Countable Additivity or σ-additivity is in fact not uncontroversial even within the Bayesian camp itself, although in purely mathematical contexts, such as measure theory, σ-additivity is usually beyond doubt (but see Schurz & Leitgeb, ...). In our context, we shall simply take it for granted now. Countable Additivity serves just one purpose: it simplifies the theory. For many practical purposes, A may simply be taken to be finite, and then σ-additivity reduces to finite additivity again which is indeed uncontroversial for all Bayesians whatsoever. However, in future versions of the theory one might want to study belief simpliciter instead under the mere assumption of finite additivity, that is, assuming just P1 but not P2. Extending the theory in that direction is feasible: Dropping P2 may be seen to correspond, roughly, to what happens to David Lewis’ “spheres semantics” of counterfactuals when the so-called Limit Assumption is dropped (to which Lewis himself does not subscribe, while others do).

3.2 Belief Postulates

Let us turn now from quantitative belief to qualitative belief: Each belief simpliciter (or more briefly: each belief) that ag holds at t is assumed to have a set in A as its propositional content. As a first approximation, assume that by ‘Bel’ we are going to denote the class of propositions that our ideally rational agent believes to be true at time t. Instead of writing ‘Y ∈ Bel’, we will rather say: Bel(Y), and we call Bel our agent ag’s belief set at time t. In line with elementary principles of doxastic or epistemic logic (which are
entailed by the modal axiom K and by applications of necessitation to tautologies), Bel is assumed to satisfy the following postulates:

1. Bel(W).
2. For all Y, Z ∈ A: if Bel(Y) and Y ⊆ Z, then Bel(Z).
3. For all Y, Z ∈ A: if Bel(Y) and Bel(Z), then Bel(Y ∩ Z).

Actually, we are going to strengthen the principle on finite conjunctions of believed propositions to the case of the conjunction of all believed propositions whatsoever:

4. For $\mathcal{Y} = \{Y \in A \mid Bel(Y)\}$: $\bigcap \mathcal{Y}$ is a member of A, and $Bel(\bigcap \mathcal{Y})$.

This certainly involves a good deal of abstraction. On the other hand, if A is finite, then the last principle simply reduces to the case of finite conjunctions again. In any case, 4. has the following obvious consequence: There is a least set (a strongest proposition) Y such that Bel(Y); that is, Y is just the conjunction of all propositions believed by ag at t. We will denote this very proposition by: B_W.

The main reason why we presuppose 4. is that it enables us to represent the sum of ag’s beliefs in terms of such a unique proposition or a unique set of possible worlds. In the semantics of doxastic or epistemic logic, our set B_W would correspond to the set of accessible worlds from the viewpoint of the agent’s current mindset. Accordingly, using the terminology that is quite common in areas such as belief revision or nonmonotonic reasoning, one might think of the members of B_W as being precisely the most plausible candidates for what the actual world might be like, if seen from the viewpoint of ag at time t.

Our postulate 4. imposes also another constraint on A: While it is not generally the case that the algebra A contains arbitrary conjunctions of members of A, 4. together with our other postulates does imply that A is closed under taking arbitrary countable conjunctions of believed propositions: for if all the members of any countable class of propositions are believed by ag at t, then their conjunction is a member of A by A being a σ-algebra, and the conjunction is a member of Bel by its being a superset of B_W and by 2. above.

There is yet another independent reason for assuming 4.: In light of lottery paradox or preface paradox situations, with which we will deal later, it is thought quite commonly that if the set of beliefs simpliciter is presupposed to be closed under conjunction, then this prohibits any probabilistic analysis of belief simpliciter from the start. We will show that beliefs simpliciter can in fact be reduced to quantitative belief even though 4. expresses the strongest form of closure under conjunction whatsoever that a set of beliefs can satisfy. So we will not be accused of playing tricks by building up some kind of non-standard model for qualitative belief in which certain types of conjunction rules are applicable to
certain sets of believed propositions but where other types of conjunction rules may not be applied (as one can show would be the case if we dropped countable additivity as being one of our assumptions). In a nutshell: 4. prohibits our agent from having anything like an ω-inconsistent set of beliefs.

Finally, we add

5. (Consistency) ¬Bel(∅).

as our agent ag does not believe a contradiction. Once again, this will be granted in order to mimic the same assumption that in epistemic logic is sometimes made: one justification for it is the thought that if a rational agent is shown to believe a contradiction, then he will aim to change his mind. Accordingly, if ag’s actual beliefs are considered to coincide with the (in principle) outcome of such a rationalization process, then 5. should be fine.

So much for belief if taken unconditionally. But we will require more than just qualitative belief in that sense; indeed, this will turn out to be the key move: Let us assume that ag also holds conditional beliefs, that is, beliefs conditional on certain propositions in A. We will interpret such conditional beliefs in suppositional terms: they are beliefs that the agent has under the supposition of certain propositions, where the only type of supposition that we will be concerned with in the following will be supposition as a matter of fact, that is, suppositions which are usually expressed in the indicative, rather than the subjunctive, mood: Suppose that X is the case. Then I believe that Y is the case.

If X is any such “assumed” proposition, we take Bel_X to be the class of propositions that our ideally rational agent believes to be true at time t conditional on X; we call Bel_X our agent ag’s belief set conditional on X at t, and we call any such class of propositions for whatever X ∈ A a conditional belief set at t of our agent ag. Instead of writing ‘Y ∈ Bel_X’, we will say somewhat more transparently: Bel(Y|X). And we may identify ag’s belief set at t from before with one of ag’s conditional belief sets at t: the class of propositions that ag believes to be true at t conditional on the tautological proposition W, that is, with the class Bel_W. Accordingly, we now call all and only the members Y of Bel_W believed absolutely or unconditionally, and Bel_W the absolute or unconditional belief set. In this extended context, Bel itself should now be regarded as a class of ordered pairs of members of A, rather than as a set of members of A as before; instead of ‘⟨Y, X⟩ ∈ Bel’ we may simply say again: Bel(Y|X).

In the present section we will be interested only in conditional beliefs in Y given X where X is consistent with everything that the agent believes absolutely (or conditionally on W) at that time; equivalently: where X is consistent with B_W. In particular, this will yield an explication of absolute or unconditional belief in terms of subjective probabilities, which is the main focus of this section. In the next section we will add some postulates which will impose constraints even on beliefs conditional on propositions in A that contradict B_W, and ultimately we will be able to state a corresponding explication of conditional belief in general. Even in the cases in which we will consider a belief suppositional on a proposition that is inconsistent with the agent’s current absolute beliefs, as we will in the section after this one, we will still regard the supposition in question to be a matter-of-fact supposition in the sense that in natural language it would be expressed in the indicative rather than the subjunctive mood. As in: I believe that John is not in the building. But suppose that he is in the building: then I believe he is in his office.

For every X ∈ A that is consistent with what the agent believes, Bel_X is a set of the very same kind as the original unconditional or absolute belief set of propositions from above. For any such given X, Bel_X will therefore be assumed to satisfy postulates of the very same type as suggested before for absolute beliefs:

B1 (Reflexivity) If ¬Bel(¬X|W), then Bel(X|X).
B2 (One Premise Logical Closure) If ¬Bel(¬X|W), then for all Y, Z ∈ A: if Bel(Y|X) and Y ⊆ Z, then Bel(Z|X).
B3 (Finite Conjunction) If ¬Bel(¬X|W), then for all Y, Z ∈ A: if Bel(Y|X) and Bel(Z|X), then Bel(Y ∩ Z|X).
B4 (General Conjunction) If ¬Bel(¬X|W), then for $\mathcal{Y} = \{Y \in A \mid Bel(Y|X)\}$: $\bigcap \mathcal{Y}$ is a member of A, and $Bel(\bigcap \mathcal{Y}|X)$.

On the other hand, we only demand:

B5 (Consistency) ¬Bel(∅|W).

So just as in the case of 5. above, we assume the Consistency postulate to hold only for beliefs conditional on W at this point (in the next section this will be generalised).

By now the axioms should look quite uncontroversial, if given our logical approach to belief. Assuming B1 is unproblematic at least under a suppositional reading of conditional belief: under the (matter of fact) supposition of X, the ideally rational agent ag holds X true at time t. Of course, B3 is redundant really in light of B4, but we shall keep it as well for the sake of continuity with the standard treatment of belief.

As before, B4 now entails for every X ∈ A for which ¬Bel(¬X|W) that there is a least set (a strongest proposition) Y, such that Bel(Y|X), which by B1 must be a subset of X. And for every such X ∈ A, with X being consistent with what the agent believes, we will denote this very proposition by: B_X. For X = W, this is consistent with the notation ‘B_W’ introduced before. Clearly, we have then for all X with ¬Bel(¬X|W) and for Y ∈ A: Bel(Y|X) if and only if Y ⊇ B_X,
from left to right by the definition of ‘B_X’, and from right to left by B2 and the definition of B_X again. Furthermore, it also follows that Y ⊇ B_X if and only if Bel(Y|B_X), since if the left-hand side holds, then the right-hand side follows from B1 and B2, and if the right-hand side is the case then the left-hand side must be true by the definition of ‘B_X’ and the previous equivalence. So we find that actually for all Y ∈ A, Bel(Y|X) if and only if Bel(Y|B_X), hence what is believed by ag conditional on X may always be determined just by means of considering all and only the members of A which ag believes conditional on the subset B_X of X. We will use these equivalences at several points, and when we do so we will not state this explicitly anymore.

W itself is such that ¬Bel(¬W|W) by B5 (since ¬W = ∅), hence all of B1–B4 apply to X = W unconditionally, and consequently B_W must be non-empty. Using this and the first of the three equivalences above, one can thus derive ¬Bel(¬X|W) if and only if X ∩ B_W ≠ ∅. For this reason, instead of qualifying the postulates in this section by means of ‘¬Bel(¬X|W)’, we see that we may just as well replace this qualification by ‘X ∩ B_W ≠ ∅’, and this is what we are going to do in the following.

So far there are no postulates on how belief sets conditional on different propositions relate to each other logically. At this point we demand one such condition to be satisfied which corresponds to the standard AGM (...) postulates K*3 and K*4 on belief revision if B_W takes over the role of AGM’s syntactic belief set K, and if the revised belief set in the sense of AGM gets described in terms of conditional belief:

B6 (Expansion) For all Y ∈ A such that Y ∩ B_W ≠ ∅: For all Z ∈ A, Bel(Z|Y) if and only if Z ⊇ Y ∩ B_W.

In words: if the proposition Y is consistent with B_W, then ag believes Z conditional on Y if and only if Z is entailed by the conjunction of Y with B_W. This is really just a postulate on “revision by expansion” in terms of propositional information that is consistent with the sum of what the agent believes; nothing is said at all about revision in terms of information that would contradict some of the agent’s beliefs, which will be the topic of the next section. As mentioned before, a principle like B6 is entailed by the AGM postulates on revision by propositions which are consistent with what the agent believes at the
time, and it can be justified in terms of plausibility rankings of possible worlds: say that conditional beliefs express that the most plausible of their antecedent-worlds are among their consequent-worlds; then if some of the most plausible worlds overall are Y-worlds, these worlds must be precisely the most plausible Y-worlds, and therefore in that case the most plausible Y-worlds are Z-worlds if and only if all the most plausible worlds overall that are Y-worlds are Z-worlds.

Equivalently:

B6 (Expansion) For all Y ∈ A such that for all Z ∈ A, if Bel(Z|W) then Y ∩ Z ≠ ∅: For all Z ∈ A, Bel(Z|Y) if and only if Z ⊇ Y ∩ B_W.

Supplying conditional belief with our intended suppositional interpretation again: If Y is consistent with everything ag believes absolutely, then supposing Y as a matter of fact amounts to nothing else than adding Y to one’s stock of absolute beliefs, so that what the agent believes conditional on Y is precisely what the agent would believe absolutely if the strongest proposition that he believes were the intersection of Y and B_W.

That is, we may reformulate B6 one more time in the form:

B6 (Expansion) For all Y ∈ A such that Y ∩ B_W ≠ ∅: B_Y = Y ∩ B_W.

The superset claim that is implicit in the equality statement follows from the postulates above because Bel(B_Y|Y) holds by the definition of ‘B_Y’ and then the original formulation of B6 above can be applied. The corresponding subset claim follows from the definition of B_Y again since Bel(Y ∩ B_W|Y) follows from the original version of B6. Similarly, the original version of B6 above can be derived from our last version of that principle and the other postulates that we assumed. It follows from our last formulation of B6 (trivially) that for all Y with Y ∩ B_W ≠ ∅, B_Y is non-empty, simply because B_Y = Y ∩ B_W in that case.

AGM’s K*3 and K*4 have not remained unchallenged, of course. One typical worry is that revising by some new evidence or suppositional information Y may lead to more beliefs than what one would get deductively by adding Y to one’s current beliefs, in view of possible inductively strong inferences that the presence of Y might warrant. One line of defence of AGM here is: if the agent’s current beliefs are themselves already the result of the inductive expansion of what the agent is certain about, so that the agent’s beliefs are really what he expects to be the case, then revising his beliefs by consistent information might reduce to merely adding it to his beliefs and closing off deductively. Another line of defence is: a postulate such as B6 might be true of belief simpliciter, and without it qualitative belief would not have the simplifying power that is essential to it. But there might be nothing like it that would hold of quantitative belief, and the mentioned criticism of the conjunction of K*3 and K*4 might simply result from mixing up considerations on
qualitative and quantitative belief. We will return to this issue later where we will see in what sense our theory allows us to reconcile B6 above with the worry about them that we were addressing in this paragraph.

This ends our list of postulates on qualitative belief.

3.3 Mixed Postulates and the Explication of Absolute Belief

Finally, we turn to our promised necessary probabilistic condition for having a belief (the left-to-right direction of the Lockean thesis) and indeed for having a belief conditional on any proposition consistent with all the agent ag believes at t; this will make ag’s degrees of beliefs at t and (some of) his conditional beliefs simpliciter at t compatible in a sense. The resulting bridge principle between qualitative and quantitative belief will involve a numerical constant ‘r’ which we will leave indeterminate at this point: just assume that r is some real number in the half-open interval [0, 1). The principle says:

BP1^r (Likeliness) For all Y ∈ A such that Y ∩ B_W ≠ ∅ and P(Y) > 0: For all Z ∈ A, if Bel(Z|Y), then P(Z|Y) > r.

BP1^r is just the obvious generalisation of the left-to-right direction of the Lockean thesis to the case of beliefs conditional on propositions Y which are consistent with all absolute beliefs. The antecedent clause ‘P(Y) > 0’ in BP1^r is there to make sure that the conditional probability P(Z|Y) is well-defined. r is a non-negative real number less than 1 which functions as a threshold value and which at this stage of our investigation can be chosen freely. Note that the principle is not yet meant to give us anything like a definition of ‘Bel’ (nor of any terms defined by means of ‘Bel’, such as ‘B_W’) on the basis of ‘P’. It only expresses a joint constraint on the references of ‘Bel’ and ‘P’, that is, on our agent ag’s actual conditional beliefs and his actual subjective probabilities.

BP1^r really says: conditional beliefs (with the relevant Ys) entail having corresponding conditional probabilities of more than r. By using W as the value of ‘Y’ and B_W as the value of ‘Z’ in BP1^r, and then applying the definition of B_W (which exists by B1–B4) and P1, it follows that P(B_W|W) = P(B_W) > r. Therefore, from the definition of B_W and P1 again, having a subjective probability of more than r is a necessary condition for a proposition to be believed absolutely, although it will become clear below that this is far from being a sufficient condition.

One might wonder why there should be one such threshold r for all propositions Y and Z as stated in BP1^r at all, rather than having for all Y (or for all Y and Z) a threshold value that might depend on Y (or on Y and Z). But without any further qualification, a principle such as the latter would be almost empty, because as long as for Y and Z it is the case that P(Z|Y) > 0, there will always be an r such that P(Z|Y) > r. In contrast,
BP1^r postulates a conditional probabilistic boundary from below that is uniform for all conditional beliefs: this r really derives from considerations on the concept of belief itself rather than from considerations on the contents of belief.

For illustration, think of r as being equal to 1/2: If degrees of beliefs and beliefs simpliciter ought to be compatible in some sense at all, then the resulting BP1^{1/2} is pretty much the weakest possible expression of any such compatibility that one could think of: if ag believes Z (conditional on one of the Ys referred to above), then ag assigns a subjective probability to Z (conditional on Y) that exceeds the subjective probability that he assigns to the negation of Z (conditional on Y). If BP1^{1/2} were invalidated, then there would be Z and Y, such that our agent ag believes Z conditional on Y, but where P(Z|Y) ≤ 1/2: if P(Z|Y) < 1/2, then ag would be in a position in which he regarded ¬Z as more likely than Z, conditional on Y, even though he believes Z, but not ¬Z, conditional on Y. On the other hand, if P(Z|Y) = 1/2, then ag would be in a position in which he regarded ¬Z as equally likely as Z, conditional on Y, even though he believes Z, but not ¬Z, conditional on Y. While the former is difficult to accept (and the more difficult the lower the value of P(Z|Y)), the latter might be acceptable if one presupposes a voluntaristic conception of belief such as van Fraassen’s (...). But it would still be questionable then why the agent would choose to believe Z, rather than ¬Z, conditional on Y, but not choose to assign to Z a higher degree of belief than to ¬Z (assuming this voluntary conception of belief would apply to degrees of belief, too).

Richard Foley (...) has argued that the Preface Paradox would show that a principle such as BP1^{1/2} would in fact be too strong: a probability of 1/2 could not even amount to a necessary condition on belief. We will return to this when we discuss the Lottery Paradox and Preface Paradox in section ??. We will argue later that choosing r = 1/2 is in fact the right choice for the least possible threshold value that would give us an account of ‘believing that’, even though taking any greater threshold value less than 1 would still be acceptable. However, for weaker forms of subjective commitment, such as ‘suspecting that’ or ‘hypothesizing that’, r ought to be chosen to be less than 1/2.

Instead of defending BP1^{1/2} or any other particular instance of BP1^r at this point, we will simply move on now, taking for granted that one such BP1^r has been chosen. (Remark: It would be possible to weaken ‘>’ to ‘≥’ in BP1^r; not much will depend on it, except that whenever we are going to use BP1^r with r ≥ 1/2 below, one would rather have to choose some r > 1/2 instead and then demand that ‘. . . P(Z|Y) ≥ r’ is the case.)

For the moment this exhausts our list of postulates (with two more to come later). Let us pause for now and focus instead on jointly necessary and sufficient conditions for our postulates up to this point to be satisfied, which will lead us to our first representation theorem by which pairs ⟨P, Bel⟩ that jointly satisfy our postulates get characterized transparently. In order to do so, we will need the following additional probabilistic concept which will turn out to be crucial for the whole theory:

Definition 2 (P-Stability^r) Let P be a probability measure on a set algebra A over W. For all X ∈ A: X is P-stable^r if and only if for all Y ∈ A with Y ∩ X ≠ ∅ and P(Y) > 0: P(X|Y) > r.

If we think of P(X|Y) as the degree of X under the supposition of Y, then a P-stable^r proposition X has the property that whatever proposition Y one supposes, as long as Y is consistent with X and probabilities conditional on Y are well-defined, it will be the case that the degree of X under the supposition of Y exceeds r. So a P-stable^r proposition has a special stability property: it is characterized by its stably high probabilities under all suppositions of a particularly salient type.

Trivially, the empty set is P-stable^r. W is P-stable^r, too, and more generally all propositions X in A with probability P(X) = 1 are P-stable^r. More importantly, as we shall see later in section 3.4, there are in fact lots of probability measures for which there are lots of non-trivial P-stable^r propositions which have a probability strictly between 0 and 1.

A different way of thinking of P-stability^r is the following one. With X being P-stable^r, and Y being such that Y ∩ X ≠ ∅ and P(Y) > 0, it holds that P(X|Y) = P(X ∩ Y) / P(Y) > r, which is equivalent to: P(X ∩ Y) > r · P(Y). But by P1 this is again equivalent with P(X ∩ Y) > r · [P(X ∩ Y) + P(¬X ∩ Y)], which yields P(X ∩ Y) > r/(1−r) · P(¬X ∩ Y). X ∩ Y is some proposition in A that is a subset of X, and by assumption it needs to be non-empty. ¬X ∩ Y is just some proposition in A which is a subset of ¬X. If P(X ∩ Y) were 0, then the inequality above could not be satisfied irrespective of what ¬X ∩ Y would be like, and if P(X ∩ Y) is greater than 0, then a fortiori X ∩ Y ≠ ∅ and also P(Y) > 0 are the case. So really X is P-stable^r if and only if for all Y, Z ∈ A, such that Y is a subset of X with P(Y) > 0 and where Z is a subset of ¬X, it holds that P(Y) > r/(1−r) · P(Z). In words: The probability of any subset of X that has positive probability at all is greater than the probability of any subset of ¬X if the latter is multiplied by r/(1−r). In the special case in which r = 1/2, this factor is just 1, and hence X is P-stable^{1/2} if and only if the probability of any subset of X that has positive probability at all is greater than the probability of any subset of ¬X. So P-stability^r is also a separation property, which divides the class of subpropositions of a proposition from the class of subpropositions of its negation in terms of probability.

Here is a property of P-stable^r propositions X that we will need on various occasions: if P(X) < 1, then there is no non-empty Y ⊆ X with Y ∈ A and P(Y) = 0. For assume otherwise: then Y ∪ ¬X has non-empty intersection with X since Y has, and at the same time P(Y ∪ ¬X) > 0 because P(¬X) > 0. By X being P-stable^r, it would therefore have to hold that P(X|Y ∪ ¬X) = P(X ∩ Y) / P(Y ∪ ¬X) > r, which contradicts P(X ∩ Y) ≤ P(Y) = 0. For the same reason, non-empty propositions of probability 0 cannot be P-stable^r, or in other words: non-empty P-stable^r propositions X have positive probability.
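To make Definition 2 concrete, the following is a minimal sketch that checks P-stability^r directly from the definition on a small finite set of worlds, taking A to be the power set algebra. The function names (`subsets`, `prob`, `is_p_stable`) and the three-world measure are illustrative assumptions and not part of the theory above; exact fractions are used so that the strict inequality P(X|Y) > r is not blurred by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations


def subsets(worlds):
    """Yield every subset of `worlds` as a frozenset (power set algebra)."""
    ws = sorted(worlds)
    for k in range(len(ws) + 1):
        for combo in combinations(ws, k):
            yield frozenset(combo)


def prob(P, S):
    """Probability of proposition S, given point masses P[w] for each world w."""
    return sum((P[w] for w in S), Fraction(0))


def is_p_stable(P, X, r):
    """Brute-force check of Definition 2: for every Y with Y ∩ X non-empty
    and P(Y) > 0, require P(X|Y) = P(X ∩ Y) / P(Y) > r."""
    X = frozenset(X)
    for Y in subsets(P):
        if Y & X and prob(P, Y) > 0:
            if prob(P, X & Y) / prob(P, Y) <= r:
                return False
    return True


# Hypothetical example measure (an assumption made purely for illustration).
P = {"w1": Fraction(6, 10), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}
r = Fraction(1, 2)

# All non-empty P-stable^r propositions for this measure and threshold.
stable = [set(X) for X in subsets(P) if X and is_p_stable(P, X, r)]
print(stable)
```

For this particular measure and r = 1/2, the non-empty stable propositions are {w1}, {w1, w2} and W: a proposition of probability 0.6 and one of probability 0.9 are both P-stable^{1/2} without having probability 1. The check enumerates all subsets of W, so it is only meant for very small examples.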

Using this new concept, we can show the following first and rather simple representation theorem on belief (there will be another more intricate one in the next section which will extend the present one to conditional belief in general):

Theorem 3 Let Bel be a class of ordered pairs of members of a σ-algebra A as explained above, let P : A → [0, 1], and let 0 ≤ r < 1. Then the following two statements are equivalent:

I. P and Bel satisfy P1, B1–B6, and BP1^r.

II. P satisfies P1, and there is a (uniquely determined) X ∈ A, such that X is a non-empty P-stable^r proposition, and:
– For all Y ∈ A such that Y ∩ X ≠ ∅, for all Z ∈ A: Bel(Z | Y) if and only if Z ⊇ Y ∩ X (and hence, B_W = X).

Proof. From left to right: P1 is satisfied by assumption. Now we let X = B_W, where B_W exists and has the intended property of being the strongest believed proposition by B1–B4: First of all, as derived before by means of B5, B_W is non-empty, and B_W is P-stable^r: For let Y ∈ A with Y ∩ B_W ≠ ∅, P(Y) > 0: since B_W ⊇ Y ∩ B_W, it thus follows from B6 that Bel(B_W|Y), which by BP1^r and P(Y) > 0 entails that P(B_W|Y) > r, which was to be shown. Secondly, let Y ∈ A be such that Y ∩ B_W ≠ ∅, let Z ∈ A: then it holds that Bel(Z|Y) if and only if Z ⊇ Y ∩ B_W by B6, as intended. Finally, uniqueness: Assume that there is an X′ ∈ A, such that X′ ≠ X, X′ is non-empty, P-stable^r, and for all Y ∈ A with Y ∩ X′ ≠ ∅, for all Z ∈ A, it holds that Bel(Z | Y) if and only if Z ⊇ Y ∩ X′. But from the latter it follows that X′ = B_W, and hence with X = B_W from above that X′ = X, which is a contradiction.

From right to left: Suppose P satisfies P1, and there is an X, such that X and Bel have the required properties. Then, first of all, all the instances of B1–B5 for beliefs conditional on W are satisfied: for it holds that W ∩ X = X ≠ ∅ because X is non-empty by assumption, so Bel(Z|W) if and only if Z ⊇ W ∩ X = X, by assumption; therefore B5 is the case, and the instances of B1–B4 for beliefs conditional on W follow from the characterisation of beliefs conditional on W in terms of supersets of X. Indeed, it follows: B_W = X. So, for arbitrary Y ∈ A, ¬Bel(¬Y|W) is really equivalent to Y ∩ X ≠ ∅, as we did already show after our introduction of B1–B5, and hence B1–B4 are satisfied by the assumed characterisation of beliefs conditional on any Y with Y ∩ X ≠ ∅ in terms of supersets of Y ∩ X. B6 holds trivially, by assumption and because of B_W = X. About BP1^r: Let Y ∩ X ≠ ∅ and P(Y) > 0. If Bel(Z|Y), then by assumption Z ⊇ Y ∩ X, hence Z ∩ Y ⊇ Y ∩ X, and by P1 it follows that P(Z ∩ Y) ≥ P(Y ∩ X). From X being P-stable^r and P(Y) > 0 we have P(X|Y) > r.

Taking this together, and by the definition of conditional probability in P1, this implies P(Z|Y) > r, which we needed to show.

Note that P2 (Countable Additivity) did not play any role in this, but of course P2 may be added to both sides of the proven equivalence with the resulting equivalence being satisfied.

This simple theorem will prove to be fundamental for all subsequent arguments in this paper. We start by exploiting it first in a rather trivial fashion: Let us concentrate on its right-hand side, that is, condition II. of Theorem 3. Disregarding for the moment any considerations on qualitative belief, let us just assume that we are given a probability P over a set algebra A on W. We know already that one can in fact always find a non-empty set X, such that X is a P-stable^r proposition: just take any proposition with probability 1. In the simplest case: take X to be W itself. P(W) > 0 and P-stability^r follow then immediately. Now consider the very last equivalence clause of II., and turn it into a (conditional) definition of Bel(.|Y) for all the cases in which Y ∩ W = Y ≠ ∅: that is, for all Z ∈ A, define Bel(Z | Y) to hold if and only if Z ⊇ Y ∩ W = Y. B_W = W follows, all the conditions in II. of Theorem 3 are satisfied, and thus by Theorem 3 all of our postulates from above must be true as well. What this shows is that given a probability measure, it is always possible to define belief simpliciter in a way such that all of our postulates turn out to be the case. In particular, Bel(Z | W) holds then if and only if Z ⊇ W which obviously is the case if and only if Z = W. Accordingly, ag would believe absolutely just W, and therefore trivially every absolute belief would have probability 1. What would be believed absolutely thereby by our agent is maximally cautious: having such beliefs, he would believe conditionally on the respective Ys from above just what is logically entailed by them, that is, all supersets of Y. As we pointed out in the introduction, this is not in general a satisfying explication of belief.

But what is more important, we actually find that a much more general pattern is emerging: Let P be given again as before. Now choose any non-empty P-stable^r proposition X, and define conditional belief in all cases in which Y ∩ X ≠ ∅ by: Bel(Z | Y) if and only if Z ⊇ Y ∩ X. Then B_W = X follows again, and all of our postulates hold by Theorem 3, including B3 (Finite Conjunction) and B4 (General Conjunction), even though it might well be that P(X) < 1 and hence even though there might be beliefs whose propositional contents have a subjective probability of less than 1 as being given by P. Such beliefs are not maximally cautious anymore, exactly as it is the case for most of the beliefs of any real-world human agent ag. Of course this does not mean that according to the current construction all believed propositions would have to be assigned probability of less than 1: Even if P(X) < 1, there will always be believed propositions that have a probability of precisely 1 (for instance, W); it only follows that there exist believed propositions that have a probability of less than 1, and X itself is an example. And every believed proposition must then have a probability that lies somewhere in the closed interval [P(X), 1], so that P(X) becomes a lower threshold value; furthermore, since X is P-stable^r, P(X) itself is strictly bounded from below by r. It does not follow that if a proposition has a probability in the interval [P(X), 1], then this just by itself implies that the proposition is also believed absolutely, since it is not entailed that the proposition is then also a superset of the P-stable^r proposition X that had been chosen initially.

Since P-stable^r propositions play such a distinguished role in this, the questions arise: Do P-stable^r sets other than W exist at all for many P? More generally: Do non-trivial P-stable^r sets exist for many P, that is, such with a probability strictly between 0 and 1? Subsection 3.4 below will show that the answers are affirmative. And how difficult is it to determine whether a proposition is a non-empty P-stable^r set?

About the last question: At least in the case where W is finite, it turns out not to be difficult at all: Let A be the power set algebra on W, and let P be defined on A. By definition, X is P-stable^r if and only if for all Y ∈ A with Y ∩ X ≠ ∅ and P(Y) > 0, P(X|Y) = P(X ∩ Y) / P(Y) > r. We have seen already that all sets with probability 1 are P-stable^r. So let us focus just on how to generate all non-empty P-stable^r sets X that have a probability of less than 1. As we observed before, such sets do not contain any subsets of probability 0, which in the present context means that if w ∈ X, P({w}) > 0. For any given such non-empty X with P(X) < 1, as we have shown before, it follows that X is P-stable^r if and only if for all Y, Z ∈ A, such that Y is a non-empty subset of X (and hence, in the present case, P(Y) > 0) and where Z is a subset of ¬X, it holds that P(Y) > r/(1−r) · P(Z). Therefore, in order to check for P-stability^r in the current context, it suffices to consider just sets Y and Z which have the required properties and for which P(Y) is minimal and P(Z) is maximal. In other words, we have for all non-empty X with P(X) < 1:

X is P-stable^r if and only if for all w in X it holds that P({w}) > r/(1−r) · P(W \ X).

In particular, for r = 1/2, this is:

X is P-stable^{1/2} if and only if for all w in X it holds that P({w}) > P(W \ X).

Thus it turns out to be very simple to decide whether a set X is P-stable^r, and even more so if it is P-stable^{1/2}.

From this it is easy to see that in the present finite context there is also an efficient procedure that computes all non-empty P-stable^r subsets of W. We only give a sketch for the case r = 1/2: All sets of probability 1 are P-stable^r, so we disregard them. All other non-empty P-stable^r sets do not have singleton subsets of probability 0, so let us also disregard all worlds whose singletons are zero sets. Assume that after dropping all worlds with zero probabilistic mass, there are exactly n members of W left, and P({w_1}), P({w_2}), . . . , P({w_n})

is already in (not necessarily strictly) decreasing order. If P({w1}) > P({w2}) + ... + P({wn}) then {w1} is P-stable^(1/2), and one moves on to the list P({w2}), ..., P({wn}). If P({w1}) ≤ P({w2}) + ... + P({wn}) then consider P({w1}), P({w2}): If both of them are greater than P({w3}) + ... + P({wn}) then {w1, w2} is P-stable^(1/2), and one moves on to the list P({w3}), ..., P({wn}). If either of them is less than or equal to P({w3}) + ... + P({wn}) then consider P({w1}), P({w2}), P({w3}): And so forth, until the final P-stable^(1/2) set W has been generated. This recursive procedure yields precisely all non-empty P-stable^(1/2) sets of probability less than 1 in polynomial time complexity. (The same procedure can be applied in cases in which W is countably infinite and A is the full power set algebra on W. But then of course the procedure will not terminate in finite time.) P2 still has not played a role so far.

What Theorem 3 gives us therefore is not just a construction procedure but even, in the finite case, an efficient construction procedure for a class Bel from any given probability measure P, so that the two together satisfy all of our postulates.

But Theorem 3 does more: it also shows that whatever our agent ag's actual probability measure P and his actual class Bel of conditionally believed pairs of propositions are like, as long as they satisfy our postulates from above, then it must be possible to partially reconstruct Bel by means of some P-stable^r proposition X as explained before, where: X is then simply identical to B_W, and by 'partially' we mean that it would only be possible to reconstruct beliefs that are conditional on propositions Y which were consistent with X = B_W. For this is just the left-to-right direction of the theorem. Hence, if we had any additional means of identifying the very P-stable^r proposition X that would give us the agent's actual belief class Bel, we could define explicitly the set of all pairs ⟨Z, Y⟩ in that class Bel for which Y ∩ X ≠ ∅ holds by means of that proposition X and thus, ultimately, by the given measure P. Amongst those conditional beliefs, in particular, we would find all of ag's absolute beliefs, and therefore the set of absolutely believed propositions could be defined explicitly in terms of P. So are we in the position to identify the P-stable^r proposition X that gives us ag's actual beliefs, simply by being handed only ag's subjective probability measure? That is the first open question that we will deal with in the remainder of this section. The other open question is: What should r be like in our postulate BP1^r above?

In order to address these two questions, we need the following additional theorem first:

Theorem 4 Let P : A → [0, 1] such that P1 is satisfied. Let r ≥ 1/2. Then the following is the case:

III. For all X, X' ∈ A: If X and X' are P-stable^r and at least one of P(X) and P(X') is less than 1, then either X ⊆ X' or X' ⊆ X (or both).

IV. If P also satisfies P2, then there is no infinitely descending chain of sets in A that are all subsets of some P-stable^r set X0 in A with probability less than 1, that is, there is

no countably infinite sequence X0 ⊋ X1 ⊋ X2 ⊋ ... of sets in A (and hence no infinite sequence of such sets in general), such that X0 is P-stable^r, each Xn is a proper superset of Xn+1 and P(Xn) < 1 for all n ≥ 0. A fortiori, given P2, there is no infinitely descending chain of P-stable^r sets in A with probability less than 1.

Proof.

• Ad III: First of all, let X and X' be P-stable^r, and P(X) = 1, P(X') < 1: as observed before, there is then no non-empty subset Y of X', such that P(Y) = 0. But if X' ∩ ¬X were non-empty, then there would have to be such a subset of X'. Therefore, X' ∩ ¬X is empty, and thus X' ⊆ X. The case for X and X' being taken the other way round is analogous. So we can concentrate on the remaining logically possible case. Assume for contradiction that there are P-stable^r members X, X' of A, such that P(X), P(X') < 1, and neither X ⊆ X' nor X' ⊆ X. Therefore, both X ∩ ¬X' and X' ∩ ¬X are non-empty, and they must have positive probability since as we showed before P-stable^r propositions with probability less than 1 do not have non-empty subsets with probability 0.

We observe that P(X | (X ∩ ¬X') ∪ ¬X) is greater than r by X being P-stable^r, (X ∩ ¬X') ∪ ¬X ⊇ (X ∩ ¬X') having non-empty intersection with X, and the probability of (X ∩ ¬X') ∪ ¬X being positive. The same must hold, mutatis mutandis, for P(X' | (X' ∩ ¬X) ∪ ¬X'). So we have

P(X | (X ∩ ¬X') ∪ ¬X) > r ≥ 1/2 and P(X' | (X' ∩ ¬X) ∪ ¬X') > r ≥ 1/2,

where r ≥ 1/2 by assumption. Next we show that P(X ∩ ¬X') > P(¬X). For suppose otherwise, that is, P(X ∩ ¬X') ≤ P(¬X): Since by P1 and P((X ∩ ¬X') ∪ ¬X) > 0, it must be the case that P(X ∩ ¬X' | (X ∩ ¬X') ∪ ¬X) + P(¬X | (X ∩ ¬X') ∪ ¬X) = 1, and since the first summand, which equals P(X | (X ∩ ¬X') ∪ ¬X), has to strictly exceed 1/2, the second summand must be strictly less than 1/2. On the other hand, it also follows that:

1/2 > P(¬X | (X ∩ ¬X') ∪ ¬X) = P(¬X) / P((X ∩ ¬X') ∪ ¬X) ≥ P(X ∩ ¬X') / P((X ∩ ¬X') ∪ ¬X) = P(X ∩ ¬X' | (X ∩ ¬X') ∪ ¬X),

but this contradicts our conclusion from before that P(X ∩ ¬X' | (X ∩ ¬X') ∪ ¬X) exceeds 1/2. Analogously, it follows also that P(X' ∩ ¬X) > P(¬X'). Finally, from this (and P1) we can derive:

P(X ∩ ¬X') > P(¬X) ≥ P(X' ∩ ¬X) > P(¬X') ≥ P(X ∩ ¬X'),

which is a contradiction.

• Ad IV: Assume for contradiction that there is a sequence X0 ⊋ X1 ⊋ X2 ⊋ ... of sets in A with probability less than 1, with X0 being P-stable^r as described. None of these sets can be empty, or otherwise the subset relationships holding between them could not be proper. Now let Ai = Xi \ Xi+1 for all i ≥ 0, and let B = ⋃_{i=0}^∞ Ai. Note that every Ai is non-empty, by our initial supposition, and indeed has positive probability, since as observed before P-stable^r sets with probability less than 1 do not contain non-empty subsets with probability 0. Furthermore, for i ≠ j, Ai ∩ Aj = ∅. Since A is a σ-algebra, B is in fact a member of A. By P2, the sequence (P(Ai)) must converge to 0 for i → ∞, for otherwise P(B) = P(⋃_{i=0}^∞ Ai) = Σ_{i=0}^∞ P(Ai) would not be a real number. Because by assumption X0 has a probability of less than 1, P(¬X0) is a real number greater than 0. It follows that the sequence of real numbers

P(Ai) / (P(Ai) + P(¬X0)) = P(X0 ∩ (Ai ∪ ¬X0)) / P(Ai ∪ ¬X0) = P(X0 | Ai ∪ ¬X0)

also converges to 0 for i → ∞, where for every i, (Ai ∪ ¬X0) ∩ X0 ≠ ∅ and P(Ai ∪ ¬X0) > 0. But this contradicts X0 being P-stable^r.

Note that P2 (Countable Additivity) was needed in IV, in order to derive the well-foundedness of the chain of P-stable^r propositions of probability less than 1.

We may draw two conclusions from this. First of all, in view of IV, P-stable^r sets of probability less than 1 have a certain kind of groundedness property: they do not allow for infinitely descending sequences of subsets. Secondly, in light of III and IV taken together, the whole class of P-stable^r propositions X in A with P(X) < 1 is well-ordered with respect to the subset relation. In particular, if there is a non-empty P-stable^r proposition with probability less than 1 at all, there must also be a least non-empty P-stable^r proposition with probability less than 1. Furthermore, all P-stable^r propositions X in A with P(X) < 1 are subsets of all propositions in A of probability 1. And the latter are all P-stable^r. If we only look at non-empty P-stable^r propositions with a probability of less than 1, we find therefore that they constitute a sphere system that satisfies the Limit Assumption (by well-orderedness) for every proposition in A, in the sense of Lewis (...).

For given P (and given A and W), and for given r ∈ [0, 1), let us denote the class of all non-empty P-stable^r propositions X with P(X) < 1 by: X^r_P. We know from Theorem 4 that ⟨X^r_P, ⊆⟩ is then a well-order. So by standard set-theoretic arguments, there is a bijective and order-preserving mapping from X^r_P into a uniquely determined ordinal β^r_P, where β^r_P is a well-order of ordinals with respect to the subset relation, which is also the order relation for ordinals. Hence, X^r_P is identical to a strictly increasing sequence of the form (X^r_α)_{α < β^r_P}. Accordingly, β^r_P measures the length of the well-ordering ⟨X^r_P, ⊆⟩. X^r_0 is then the least non-empty P-stable^r proposition in A with probability less than 1, if there is one at all. If there are none, then β^r_P is simply equal to 0 (that is, the ordinal ∅).

Furthermore, by P1–2, Theorem 4, and the fact that no non-empty P-stable^r proposition of probability less than 1 has a non-empty subset of probability zero, each such X in X^r_P determines a number P(X) ∈ (r, 1], and no non-empty P-stable^r proposition of probability less than 1 other than X could determine the same number P(X). By P1, the greater the set X with respect to the subset relation, the greater its probability P(X), that is: for α < α' < β^r_P it holds that r < P(X^r_α) < P(X^r_α'). It follows that there is also a bijective and order-preserving mapping from the set of probabilities of the members of X^r_P to the set of ordinals below β^r_P (that is, to the set β^r_P). Now, since every ordinal number has a unique successor, there is a bijective mapping between the set of intervals of the form (P(X^r_α), P(X^r_α+1)) for α < β^r_P and the set β^r_P. Hence, the class X^r_P of all non-empty P-stable^r propositions X with probability less than 1 is countable.

From this we can determine a boundary for the ordinal type of β^r_P:

Observation 5 Let P be a countably additive probability measure on a σ-algebra A over W, such that P satisfies P1–2. Let 1/2 ≤ r < 1. The ordinal β^r_P (see above) is either finite or equal to ω.

Proof. Assume for contradiction that β^r_P ≥ ω + 1: then there certainly exist non-empty P-stable^r propositions X with probability less than 1, and for X^r_α as defined above, by assumption we have P(X^r_ω) < 1. For all 0 ≤ n < ω, let Yn = X^r_(n+1) \ X^r_n, and let Zn = ⋃_{m ≥ n} Ym. We know that for all n it holds that Zn ∈ A, by Theorem 4 and the definition of 'X^r_α' it is the case that Zn ⊆ X^r_ω, and furthermore P(Zn) < 1 and the sequence (Zn) is strictly monotonically decreasing. So there is a sequence X^r_ω ⊋ Z0 ⊋ Z1 ⊋ ... of sets in A with probability less than 1, with X^r_ω being P-stable^r, in contradiction with IV of Theorem 4.

So we find that the non-empty P-stable^r propositions X with probability less than 1, if they exist, determine ordinal rankings of those possible worlds that are members of at least one of them. In case the union of all X^r_α is W, each world w ∈ W can be assigned a uniquely determined ordinal rank: the least ordinal α, such that w ∈ X^r_α. See Figure 1.
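The finite-case criterion stated earlier (for P(X) < 1, X is P-stable^r if and only if every w in X satisfies P({w}) > (r / (1 − r)) · P(W \ X)) and the ordinal ranks just described can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: W is finite, A is the full power set algebra, 1/2 ≤ r < 1, and probabilities are exact `Fraction` values. The names `p_stable_sets` and `ranks` and the three-world measure are hypothetical and not taken from the paper. Unlike the class X^r_P, the returned list also contains the final set of probability 1 (all worlds of positive probability) as its last entry.

```python
from fractions import Fraction


def p_stable_sets(probs, r=Fraction(1, 2)):
    """Non-empty P-stable^r sets made of positive-probability worlds,
    in increasing order of inclusion (finite W, 1/2 <= r < 1 assumed)."""
    if not (Fraction(1, 2) <= r < 1):
        raise ValueError("this sketch assumes 1/2 <= r < 1")
    # Drop zero-probability worlds, sort the rest by decreasing probability.
    worlds = sorted((w for w, p in probs.items() if p > 0),
                    key=lambda w: probs[w], reverse=True)
    factor = r / (1 - r)
    tail = sum(probs[w] for w in worlds)  # mass outside the current prefix
    stable = []
    for k, w in enumerate(worlds, start=1):
        tail -= probs[w]  # tail is now P(W \ X) for the prefix X = worlds[:k]
        # probs[w] is the smallest singleton probability inside the prefix,
        # so checking it suffices; tail == 0 means X has probability 1.
        if tail == 0 or probs[w] > factor * tail:
            stable.append(frozenset(worlds[:k]))
    return stable


def ranks(probs, r=Fraction(1, 2)):
    """Rank of a world = index of the first P-stable^r set containing it."""
    rank = {}
    for alpha, X in enumerate(p_stable_sets(probs, r)):
        for w in X:
            rank.setdefault(w, alpha)
    return rank


# Hypothetical example measure on three worlds.
P = {"w1": Fraction(6, 10), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}
print(p_stable_sets(P))                  # {w1}, {w1, w2}, {w1, w2, w3}
print(p_stable_sets(P, Fraction(3, 4)))  # only {w1, w2, w3}
print(ranks(P))                          # w1 -> 0, w2 -> 1, w3 -> 2
```

Only prefixes of the sorted list need to be checked because, for r ≥ 1/2, every world inside a P-stable^r set of probability less than 1 must be strictly more probable than the whole complement, and hence than every world outside the set.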

[Figure 1: P-stable sets for r ≥ 1/2]

We also find that, given P is countably additive, if there are countably infinitely many non-empty P-stable^r propositions X with probability less than 1, then the union of all non-empty P-stable^r propositions X with probability less than 1 is itself P-stable^r, and it must have probability 1. For: The countable union ⋃_{α<ω} X^r_α is a member of our σ-algebra A. If Y ∩ ⋃_{α<ω} X^r_α ≠ ∅ for Y ∈ A with P(Y) > 0, then there must be an X^r_α with α < ω, such that Y ∩ X^r_α ≠ ∅. Because X^r_α is P-stable^r, it follows that P(X^r_α | Y) > r. But by P1, P(⋃_{α<ω} X^r_α | Y) ≥ P(X^r_α | Y), hence P(⋃_{α<ω} X^r_α | Y) > r. So ⋃_{α<ω} X^r_α is P-stable^r (and non-empty, of course). If P(⋃_{α<ω} X^r_α) were less than 1, then β^r_P would have to be at least of the order type ω + 1, which was ruled out by Observation 5. So P(⋃_{α<ω} X^r_α) = 1. Since, as we saw before, no non-empty P-stable^r proposition X with probability less than 1 contains a non-empty zero set as a subset, that union could not do so either. So in the case in which β^r_P is infinite, that union of all non-empty P-stable^r propositions with probability less than 1 would then have to be the least P-stable^r proposition with probability 1.

Now back to our remaining open questions. Let us start with: what should we choose as r? For the proof of III in Theorem 4 it was crucial that r ≥ 1/2. Indeed, one can show by means of examples that if r < 1/2 then III can be invalidated: it is possible then that there are P-stable^r members X, X' of A, such that neither X ⊆ X' nor X' ⊆ X. In fact, it is even possible that there are non-empty P-stable^r members X, X' of A, such that X ∩ X' = ∅. This

means: if our agent ag's probability measure P is held fixed for the moment, and if r < 1/2, then depending on what P is like, our postulates P1–P2, B1–B6, and BP1^r might allow for two classes Bel such that all of these postulates are satisfied for each of them (by Theorem 3) and yet some absolute beliefs according to the one class Bel contradict some absolute beliefs according to the other class Bel, although both are based on one and the same subjective probability measure P.

[Figure 2: P-stable sets for r < 1/2]

Of course, this is far from being a knock-down argument against r < 1/2, but it certainly puts a bit of methodological pressure on it. For if P is fixed, then one might think that our postulates should suffice to rule out systems of qualitative belief that contradict each other. As van Fraassen (..., p. 350) puts it, the assumed role of full belief is "to form a single, unequivocally endorsed picture of what things are like": If r ≥ 1/2, then while Theorem 4 does not yet pin down such a "single, unequivocally endorsed picture of what things are like", at least the linearity condition III guarantees the following: given P, if X and X' are possible choices of strongest possible believed propositions B_W such that P1–P2, B1–B6, and BP1^r are satisfied, that is, by Theorem 3, if X and X' are both non-empty P-stable^r members of A, then either everything that ag believes absolutely according to B_W = X would also be believed if it were the case that B_W = X', or vice versa.

Combining this with what we said about r < 1/2 initially when we introduced BP1^r above (that is, that if an agent believes a proposition it is quite reasonable for him to have assigned to that proposition a probability that is greater than the probability of its negation), we do have a plausible case against choosing r in that way. It seems advisable then, for the sake of a better theory, to demand that r ≥ 1/2, for this will allow us to derive as a law that a situation such as that cannot occur. (But we will see later that r < 1/2 is an attractive choice if 'Bel' is taken to express not belief but some weaker epistemic attitude.)
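The contrast between r < 1/2 and r ≥ 1/2 can be checked on a small case by testing P-stability^r directly from its definition. The following Python sketch is illustrative only: the uniform measure on three worlds and the names `is_p_stable`, `stable_sets`, and `is_chain` are hypothetical choices, and the brute-force enumeration of all subsets is meant for tiny W, not as an efficient method.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(worlds):
    """All non-empty subsets of a finite collection of worlds."""
    ws = list(worlds)
    for k in range(1, len(ws) + 1):
        for combo in combinations(ws, k):
            yield frozenset(combo)


def prob(S, probs):
    return sum((probs[w] for w in S), Fraction(0))


def is_p_stable(X, probs, r):
    """Definition: P(X | Y) > r for every Y with Y ∩ X non-empty and P(Y) > 0."""
    X = frozenset(X)
    for Y in nonempty_subsets(probs):
        p_y = prob(Y, probs)
        if Y & X and p_y > 0 and prob(Y & X, probs) / p_y <= r:
            return False
    return True


def stable_sets(probs, r):
    return [X for X in nonempty_subsets(probs) if is_p_stable(X, probs, r)]


def is_chain(sets):
    """True if any two of the sets are comparable by inclusion (condition III)."""
    return all(X <= Y or Y <= X for X in sets for Y in sets)


# Hypothetical example: uniform measure on three worlds.
U = {w: Fraction(1, 3) for w in "abc"}

low = stable_sets(U, Fraction(3, 10))   # r < 1/2
print(frozenset("a") in low, frozenset("b") in low)  # two disjoint stable sets
print(is_chain(low))                                  # False: III fails

high = stable_sets(U, Fraction(1, 2))   # r = 1/2
print(high, is_chain(high))             # only W remains, trivially a chain
```

With r = 3/10 the singletons {a} and {b} are both P-stable^r and disjoint, so two candidate sets B_W would contradict each other; with r = 1/2 the only P-stable^r set for this measure is W itself.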

Apart from presupposing r ≥ 1/2, is it possible to exclude other possible values of 'r'? Before we answer this question, the following elementary observation informs us about some of the consequences that the answer will have:

Observation 6 Let P be a probability measure on an algebra A over W. Let X ∈ A, and assume that 1/2 ≤ r < s < 1. Then it holds:

• If X is P-stable^s, then X is P-stable^r.

Proof. If X is P-stable^s, then for all Y ∈ A with Y ∩ X ≠ ∅ and P(Y) > 0, P(X | Y) > s. But then it also holds for all such Y that P(X | Y) > r, since r < s by assumption, so X is P-stable^r as well.

What this tells us, in conjunction with our previous results, is that the smaller the threshold value r, the more inclusive is the class of P-stable^r sets that it determines. In particular, if we choose r minimally such that 1/2 ≤ r < 1, that is, if we choose r = 1/2, then we do not exclude any of the logically possible options for B_W. Should our agent ag exclude some of them? By determining the value of 'r', one lays down how brave a belief can be maximally, or how cautious a belief needs to be minimally, in order not to cease to count as a belief. Choosing r = 1/2 is the bravest possible option. Indeed, for many purposes this might well be the right choice. At the same time, beliefs in this sense would not necessarily seem too brave: after all, with P being given, Bel would still be constrained by BP1^(1/2). Hence, if Y is believed in this sense, then the subjective probability of Y would have to be greater than 1/2. And of course Bel would have to satisfy all of the standard logical properties of belief simpliciter, as expressed by B1–B6. But then again, for other purposes a more cautious notion of belief is asked for, which would correspond to choosing a value for 'r' that is greater than 1/2. In many cases, the value of 'r' might be determined by the epistemic and pragmatic context in which our agent ag is about to reason and act, and different contexts might ask for different values of 'r'. In yet other cases, maybe, the value of 'r' might only be determined vaguely. And so on. And all of these options would still be covered by what we call pre-theoretically 'belief'. We suggest therefore to explicate belief conditional on any given threshold value r ≥ 1/2, without making any particular choice of the value of 'r' mandatory.

With that one of our two open questions settled (or rather dismissed), we are in the position to address the other one: Can we always identify the P-stable^r proposition X that yields our agent ag's actual beliefs, if we are given only ag's subjective probability measure P (and a threshold value r)?

We need one more postulate before we answer this. Degrees of belief conditional on a proposition of probability 0 are brought in line with beliefs conditional on a contradiction in the following manner:

BP2 (Zero Supposition) For all Y ∈ A: If Y ∩ B_W ≠ ∅ and P(Y) = 0, then B_Y = ∅.

Since P is an absolute probability measure that does not allow for conditionalization on a proposition of probability 0 at all, it makes sense to restrict belief simpliciter accordingly in the way that supposing any such proposition of probability 0 amounts to believing a contradiction. This said, as mentioned before, it would actually be more attractive to liberate quantitative probability such that the (non-trivial) conditionalization on zero sets becomes possible: that is, one might want to use Popper functions P from the start, rather than restricting qualitative belief in such a way. For intuitively there is no reason to think that supposing a proposition qualitatively ought to be less zero-intolerant than the quantitative supposition of a proposition (using Jonathan Bennett's corresponding term (...), which he applies to indicative conditionals whose antecedent has subjective probability 0). But then again the current theory has the advantage of relying just on the much more common absolute probability measures, and since the theory is not particularly affected by using BP2 as an additional assumption, we shall stick to conditional belief being constrained as expressed by BP2. At least, if P is regular, that is, every non-empty proposition in A has positive probability, then BP2 is of course superfluous, and for many practically relevant scenarios, Regularity is indeed usually taken for granted or otherwise W would be redefined by dropping all worlds whose singleton sets have zero probability. So BP2 is acceptable really just for the sake of simplicity.

Here is an important consequence of BP2: Let Y ∈ A be such that P(Y) = 1. Y must then have non-empty intersection with B_W, in light of P1 and P(B_W) > 0. Therefore, by B6, B_Y = Y ∩ B_W ⊆ B_W. Assume that B_Y is a proper subset of B_W: then both Y ∩ B_W and ¬Y ∩ B_W are non-empty. Since P(Y) = 1, it follows that P(¬Y) = 0 and hence with BP2: B_¬Y = ∅. But since ¬Y has non-empty intersection with B_W, B6 entails that B_¬Y = ¬Y ∩ B_W, and hence, ¬Y ∩ B_W = ∅, which contradicts ¬Y ∩ B_W being non-empty. So we find that by BP2 (and the rest of our postulates), every Y ∈ A for which P(Y) = 1 holds is such that B_Y = B_W. This also entails that, if B_W has probability 1 itself, then B_W must be the least proposition in A with probability 1, since B_Y ⊆ Y for all such Y by the definition of 'B_Y'.

Now we are in the position to answer our remaining question from above affirmatively, by identifying the P-stable^r proposition X that yields ag's actual beliefs if we are given just ag's subjective probability measure P (and a threshold value r). As explained already in section 3, apart from satisfying our postulates, the class Bel ought to be so that the resulting class of absolute beliefs is maximised, as this approximates prima facie belief, that is, the right-to-left direction of the original Lockean thesis, to the greatest possible extent. This corresponds to the following postulate:

BP3 (Maximality) Among all classes Bel' of ordered pairs of members of A, such that P and Bel' jointly satisfy P1–P2, B1–B6, BP1^r, BP2, the class Bel is the largest with respect to

the class of absolute beliefs, that is, pairs of the form ⟨Z, W⟩, that it determines. In other words, for all such Bel': Bel ∩ {⟨Z, W⟩ | Z ∈ A} ⊇ Bel' ∩ {⟨Z, W⟩ | Z ∈ A}.

The logical character of BP3 is obviously different from the one of our previous postulates, but then again adding postulates that maximize or minimize classes subject to axiomatic constraints is of course not unheard of. Hilbert (...), e.g., famously uses this strategy in his axiomatization of geometry.

Since we did not just deal with absolute belief in this section but also with belief conditional on any proposition that is consistent with everything the agent believes absolutely, one might wonder why we did not demand Bel in BP3 to be largest even with respect to the class of pairs ⟨Z, Y⟩ for which Y ∩ B_W ≠ ∅. But then, first of all, the class of all pairs ⟨Z, Y⟩ for which Y ∩ B_W ≠ ∅ is distinct from the class of all pairs ⟨Z, Y⟩ for which Y ∩ B'_W ≠ ∅, so it would not be clear with respect to which of two classes our intended belief class Bel ought to be the largest: that is, let B_W ≠ B'_W derive from two distinct candidates Bel, Bel', such that both satisfy all of our postulates apart from BP3: by Theorem 3, or in view of Theorem 4, without restriction of generality, B_W ⊊ B'_W; then there are propositions Z ∈ A (as, for example, B'_W \ B_W), such that Z has non-empty intersection with B'_W but not with B_W. Furthermore, while B6 would tell us whether Bel'(. | Z) holds, it would not give us any information whatsoever on Bel(. | Z). For these reasons, it will only be in the next section, when we will deal with conditional beliefs in general, that we will be in the position to strengthen Maximality so that it extends to all pairs ⟨Z, Y⟩ for Z, Y ∈ A whatsoever. The resulting class Bel will again be defined uniquely and the set of absolute beliefs that it determines will correspond to what is required by BP3 and the rest of the postulates of the present section.

However, the term 'the largest' in BP3 is well-defined given the postulates P1–P2, B1–B6, BP1^r, BP2, Theorem 3, and by what we pointed out before: Because of Theorem 3, B_W must be a non-empty P-stable^r proposition in A in order to satisfy P1, B1–B6, and BP1^r. If there is at least one non-empty P-stable^r proposition with probability less than 1, then we know that amongst all the non-empty P-stable^r propositions that are candidates for the maximally strong believed proposition B_W according to Theorem 3 (which relied on P2), there must be a least one by Theorem 4: this least P-stable^r proposition X_least, which then has a probability of less than 1, and which does not have any non-empty zero sets as subsets and hence satisfies BP2, must therefore determine the largest class of absolute beliefs once II of Theorem 3 is turned into a (partial) definition of conditional belief again, since its class of supersets is the largest one possible. On the other hand, if there are no non-empty P-stable^r propositions with probability less than 1, then by P1, B1–B6, and BP1^r again, B_W must be a non-empty P-stable^r proposition with probability 1, and from our considerations on BP2 above we know that B_W must really be the least set of probability 1 in A.

With BP3 on board, and in light of our previous results, we may conclude from our postulates that in each and every case our agent's set B_W is nothing but the least non-empty P-stable^r set in A. In other words, our postulates (including BP3) entail the explicit definability of ag's absolute beliefs, and indeed the definability of all of his beliefs conditional on any Y that is consistent with B_W, by means of the following corollary to our results mentioned before:

Corollary 7 Let Bel be a class of ordered pairs of members of a σ-algebra A, let P : A → [0, 1]. Then the following two statements are equivalent:

V. P and Bel satisfy P1–P2, B1–B6, BP1^r, BP2, BP3.

VI. P satisfies P1–P2, there exists a (uniquely determined) least non-empty P-stable^r proposition X_least in A, and:

– For all Y ∈ A such that Y ∩ X_least ≠ ∅, for all Z ∈ A: Bel(Z | Y) if and only if Z ⊇ Y ∩ X_least.

– In particular: B_W = X_least, and for all Z ∈ A: Bel(Z | W) if and only if Z ⊇ X_least.

Where the previous postulate was reminiscent of Hilbert's axiomatisation of geometry, with respect to its open parameter r the last corollary is closer in spirit to something like Zermelo's (...) quasi-categoricity result for second-order set theory: according to Zermelo's theorem, the cumulative hierarchy of sets is pinned down uniquely conditional on the specification of an ordinal number of a certain kind. The real number r in BP1^r above takes over the function of such an ordinal number in Zermelo's theorem, for only conditional on it the class Bel is specified uniquely.

VI of Corollary 7 can now be turned into an explicit definition of all relevant conditional beliefs just on the basis of P (and logical and set-theoretic notions). Since in the next section we will extend this result to arbitrary conditional beliefs, whether or not they are beliefs conditional on propositions that are consistent with what the agent believes, we refrain from stating the resulting definition here. However, we do exploit Corollary 7 by deriving from it a particularly important special case: the concept of absolute belief can be defined explicitly in terms of P alone. In order to do so, we will take one final step. We restrict the probability measures P that we are interested in such that the existence claim in VI is always satisfied. While our explicit definition of belief will then just hold conditional on that additional restriction, since the restriction is not overly demanding in our belief context (though it would be in

other contexts, say, in measure theory, where one needs measures for integration), we will still end up with a definition that assigns the right reference to 'Bel' for a wide range of subjective probability measures.

This is thus the restriction on P that we use. Call it the 'Least Certain Set Restriction': There is a member X ∈ A, such that P(X) = 1, and for every Y ∈ A, with P(Y) = 1: X ⊆ Y. That is: There is a least set of probability 1 in A. Equivalently, there is a member X ∈ A, such that P(X) = 0, and for every Y ∈ A, with P(Y) = 0: Y ⊆ X. Or in other words: there is a greatest set of probability 0 in A (which is just the complement of the least set of probability 1).

It is easy to see that the least proposition X of probability 1 cannot have a non-empty subset Y ∈ A, such that P(Y) = 0: for otherwise, by P1, X ∧ ¬Y, which is a member of A again, would be a set of probability 1 which is a proper subset of X.

Given this Least Certain Set Restriction, there is always a least non-empty P-stable^r proposition in A: Either there is a non-empty P-stable^r proposition of probability less than 1, and then there is a least non-empty P-stable^r proposition anyway by Theorem 4. Or the non-empty P-stable^r propositions are all and only the propositions of probability 1: but then by the Least Certain Set Restriction there is a least set with probability 1, and that set is thus the least non-empty P-stable^r proposition in A.

Standard examples of countably additive probability measures for which there are least sets of probability 1 are:

• All probability measures on finite algebras A, and hence also all probability measures on algebras A that are based on a finite set W of worlds.

• All countably additive probability measures on the power set algebra of a set W that is countably infinite: In that case the conjunction of all sets of probability 1 is a member of the algebra of propositions again, and it is again the least set of probability 1.

• All countably additive probability measures on a countably infinite σ-algebra: The conjunction of all sets of probability 1 is then a countably infinite conjunction, so it is a member of the given σ-algebra, and of course it is then the least set of probability 1.

• All countably additive probability measures (on a σ-algebra) that are regular: for all X ∈ A, P(X) = 0 if and only if X = ∅. Here the empty set happens to be the greatest set of probability 0 (equivalently, W is the least set of probability 1). Regularity (Strict Coherence) does not enjoy general support, even though Carnap, Shimony, Stalnaker and others argued for it as a plausible constraint on subjective probability measures, some of them in view of a special variant of the Dutch book argument that favours Regularity. (But see Levi, ..., for contrary arguments.)
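For a finite W, the content of Corollary 7 (B_W = X_least, and Bel(Z | Y) if and only if Z ⊇ Y ∩ X_least whenever Y ∩ X_least ≠ ∅) can be sketched in Python as follows. This is a sketch under assumptions, not the paper's own construction: W is finite with the full power set algebra (so the Least Certain Set Restriction holds automatically), 1/2 ≤ r < 1, probabilities are exact `Fraction` values, and the function names `least_stable_set` and `believes` as well as the example measure are hypothetical.

```python
from fractions import Fraction


def least_stable_set(probs, r):
    """Least non-empty P-stable^r set for a finite W and 1/2 <= r < 1."""
    if not (Fraction(1, 2) <= r < 1):
        raise ValueError("this sketch assumes 1/2 <= r < 1")
    worlds = sorted((w for w, p in probs.items() if p > 0),
                    key=lambda w: probs[w], reverse=True)
    factor = r / (1 - r)
    tail = sum(probs[w] for w in worlds)
    for k, w in enumerate(worlds, start=1):
        tail -= probs[w]  # tail == P(W \ X) for the prefix X = worlds[:k]
        # tail == 0 means X is the least set of probability 1.
        if tail == 0 or probs[w] > factor * tail:
            return frozenset(worlds[:k])
    raise ValueError("at least one world must have positive probability")


def believes(Z, probs, r, given=None):
    """Bel(Z | Y) iff Z is a superset of Y ∩ X_least (Y defaults to W)."""
    x_least = least_stable_set(probs, r)
    Y = frozenset(probs) if given is None else frozenset(given)
    core = Y & x_least
    if not core:
        # Beliefs conditional on Y inconsistent with B_W are outside VI.
        raise ValueError("Y is inconsistent with the strongest belief B_W")
    return core <= frozenset(Z)


# Hypothetical example measure on three worlds.
P = {"w1": Fraction(6, 10), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}
half = Fraction(1, 2)
print(least_stable_set(P, half))                    # {w1}, with P = 6/10 < 1
print(believes({"w1", "w2"}, P, half))              # True
print(believes({"w2", "w3"}, P, half))              # False
print(believes({"w1"}, P, half, given={"w1", "w3"}))  # True
print(believes({"w1", "w2"}, P, Fraction(3, 4)))    # False: X_least is W here
```

In this example the believed proposition {w1} has probability 6/10, so belief at r = 1/2 is closed under conjunction without requiring probability 1, while the more cautious threshold r = 3/4 leaves only W believed.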

These examples demonstrate that a great variety of probability measures satisfy P1, P2, and our additional constraint, and many (if not most) of the typical philosophical toy examples of subjective probability measures are covered by these examples.

We end up with the following materially adequate explicit definition of absolute belief for countably additive probability measures that satisfy this additional constraint of the Least Certain Set Restriction:

Definition 8 Let P : A → [0, 1] be a countably additive probability measure on a σ-algebra A, such that there exists a least set of probability 1 in A. Let X_least be the least non-empty P-stable^r proposition in A (which exists). Then we say for all Y ∈ A and 1/2 ≤ r < 1:

Y is believed (to a cautiousness degree of r) as being given by P if and only if Y is a superset of X_least.

Note that from the theory above we know that the definiens could actually be replaced by 'Y is a superset of some non-empty P-stable^r proposition in A' without thereby changing the extension of the belief predicate in any way.

By 'materially adequate' we mean here: By Corollary 7, if the definition is taken as a descriptive sentence, our definition of belief is true if given a probability measure that satisfies the Least Certain Set Restriction, since all of P1, B1–B6, BP3 are plausibly true, BP1^r is true conditional on the choice of r as a cautiousness threshold, and with P2, BP2 being acceptable for the sake of simplicity. What is more, since all of P1, B1–B6, BP1^r, BP3 are not just true but even conceptually necessary or analytic of belief, the definition is so as well (conditional on the presupposition of P2 and BP2).

If we finally define for any given P : A → [0, 1], Y ∈ A is believed a priori as being given by P if and only if P(Y) = 1, then we end up with three notions of belief of increasing strength for all P that satisfy P1, P2, and the Least Certain Set Restriction: prima facie belief, belief (to a cautiousness degree of r), and belief a priori. For any two of these concepts, under very special circumstances, that is, for very special P, they can in fact determine precisely the same beliefs (later we will deal with an example). But under "normal" circumstances, for realistic P, they will differ extensionally, and belief in the sense of Definition 8 is the concept that we offer as an explication of our pre-theoretic notion of qualitative belief.

3.4 Examples

Finally, here are some examples. In all of them, A is simply the full power set algebra of W.

If W contains exactly two worlds, then the situation is trivial insofar as for given 1/2 ≤ r < 1,

the singleton {w} ⊆ W is the least non-empty P-stable^r proposition if P({w}) > r, and W itself is such otherwise.

So let us turn to the first non-trivial case, where W is a set {w1, w2, w3} of three elements. For simplicity, let r = 1/2. Let us think of all probability measures on that set W as being represented by points in a triangle, such that P({w1}), P({w2}), P({w3}) become the scalar factors of a convex combination of three given vectors that we associate with the worlds w1, w2, w3. Then depending on where P is represented in that triangle, P determines different classes of P-stable^{1/2} sets. See Figure 3.

[Figure 3: Rankings determined by P]

The diagram should be read as follows: The vertices of the outer equilateral triangle represent the probability measures that assign 1 to the singleton set of the respective world and 0 to all other singleton sets. Each non-vertex on any of the edges of the outer equilateral triangle represents a probability measure that assigns 0 to exactly one of the three worlds. Each edge of the inner equilateral triangle separates the representatives of probability measures of the following kinds: probability measures that assign to the singleton set of some world a probability that is greater than the sum of the probabilities that they assign to the singleton sets of the two other worlds, and probability measures that assign to the singleton set of some world a probability that is less than the sum of the probabilities that they assign to the singleton sets of the two other worlds. For instance, to the left-below of the left edge of the inner equilateral triangle we find such probability measures represented

w j } as their least non-empty P1 stable 2 set. these form a sphere system of sets. all the non-empty P-stable 2 sets that are determined by it. w3 }. and w2 has rank 2. the more coarse-grained the orderings become that are given by the sphere systems of the probability measures thus represented. w2 }. respectively. Each straight line segment that connects a vertex with the mid-point of the opposite edge of the outer equilateral triangle separates the representatives of probability measures of the following kinds: probability measures that assign to the singleton set of one world a greater probability than to the singleton set of another world. by Deﬁnition 8. For example: Consider the interior of the two smallest rectangular triangles that are adjacent to w1 . The points on the outer equilateral triangle are special: The probability measure represented 1 by the vertex for wi has {wi } as its least non-empty P-stable 2 set. w3 has rank 1. The further one moves geometrically towards the center point of the two equilateral triangles. w3 }. the straight line segment that connects w3 and the mid-point of the edge from w1 to w2 separates the probability measures that assign more probability to {w1 } than to {w2 } from those which assign more probability to {w2 } than to {w1 }. and all of them have probability 1. 1 Given all of that. 37 . or {wi . In the diagram. So w1 has rank 0. or {w j }. {w1 . In either of these two cases. {w1 . starting with the worlds of rank 0 which we take to correspond to the entries in the bottom line of each numerical inscription. and all of them have probability 1. w3 }. Probability measures which are presented by points in the upper 1 one yield a sphere system of three non-empty P-stable 2 sets: {w1 }. all supersets of each of 1 them. we can concentrate solely on non-empty P1 stable 2 sets with probability less than 1. and using the construction procedure for P-stable 2 sets that we have sketched before. 
The probability measures represented by the inner part of the edge between the vertices that belong to two worlds wi and w j have either {wi }. w2 . As we have seen. we denote these sphere systems by enumerating in diﬀerent lines the numeral indices of worlds of equal rank in the sphere system. or equidistant of both of them. it is easy to read oﬀ for each point. w2 . depending on whether the representing point is closer to the vertex of wi than to the vertex of w j . Accordingly. But the really interesting part of the diagram concerns the interior of the outer equilateral triangle: Since relative to the probability measures that are represented as such only W 1 has probability 1 (and hence is P-stable 2 ). The center point of both equilateral triangles represents the probability that is uniform over W = {w1 . and hence for the probability measure 1 that this point represents. w2 . too. or vice versa. {w1 . w3 }. all supersets of that set 1 are non-empty and P-stable 2 . the probability measures in question would yield an absolute belief in every proposition that includes w1 as a member. {w1 . probability measures represented by points in the lower one of the two triangles determine a sphere system of the 1 three non-empty P-stable 2 sets {w1 }. Accordingly.which assign to {w1 } a greater probability than to the sum of what they assign to {w2 } and {w3 }. are non-empty and P-stable 2 again. and the probability measures that do so the other way round.

For analogous reason. P({w5 }) = 38 . Probability measures which are presented by points on any of the designated straight line segments within the interior of the outer equilateral triangle require special attention: Probability measures whose points lie on the boldface part in the diagram are treated separately in the little graphics left to the triangle. If W had four members.058. P({w3 }) = 0. Here is another example with 7 worlds and concrete numbers: Let W = {w1 . 0. for any such P: If there were a unique world whose singleton had least probability. w7 } and P({w1 }) = 0. and the rest follows in the same way as before. which is why we did not say anything about them explicitly in Figure 3. P({w4 }) = 0. . the following is true: The set of points in the diagram which 1 represent probability measures for which a set of probability 1 is the least P-stable 2 set has Lebesgue measure. . In one sense. that is. this is really just a consequence of dealing with precisely three worlds. If r > 1 . what is true in general: sphere systems with precisely two worlds of maximal rank can only be represented by points or probability measures of areas of dimension n − 1. two worlds of rank 1. they all lead to the three worlds ranked equally. The points on the three edges of the inner equilateral triangle—or rather the six halfs of those (without their midpoints which fall into the boldfaced lines)—yield sphere systems which coincide with those of the areas to which they are adjacent on the inside. for the three straight line segments in the interior of the inner equilateral triangle we did not say anything about “their” sphere systems either because they simply inherit them from the rectangular triangle areas that they separate. P({w2 }) = 0. This is because. if W has n members. For then the probabilities of these two worlds of maximal rank must be the same. then sphere systems with one world of rank 0. 
so there must be at least two worlds whose singleton sets have the same probability. with all of the interior straight 2 line segments being pushed towards the three vertices to an extent that is proportional to the magnitude of r. However.54. One might wonder about Figure 3 why sphere systems with one world of rank 0 and two worlds of rank 1 are determined only by points or probability measures in onedimensional line segments rather than in two-dimensional areas. which means the points of the represented probability measures must lie on one of the distinguished hyperplanes that generalise the distinguished line segments in our diagram to the higher-dimensional case. and hence one world of rank 2 would be represented in terms of proper areas again. then W without that 1 world would be P-stable 2 .03994. Finally. then a diagram similar to Figure 3 can be drawn. For three of the straight line segments we have denoted the sphere systems that they determine explicitly. We conclude: Almost 1 all probability measures over a ﬁnite algebra have a least P-stable 2 set with a probability less than 1. . geometrical measure. .342.and the smaller the class of absolutely believed propositions gets.

then Bel(Z|X). . . .002. . {w1 . . {w1 . {w1 . wn }. . . .0. . {w1 . then Bel(Y ∩ Z|X). . So there are really lots of diﬀerent types 1 of sphere systems of P-stable 2 propositions. Z ∈ A: if Bel(Y|X) and Y ⊆ Z. Z ∈ A: if Bel(Y|X) and Bel(Z|X). . In line with Observation 6. . to r = 3 . w4 }.018. P({w2 }) = 1 + 16 . . Then the resulting sphere system of non1 empty P-stable 2 sets is: {w1 }. w3 }. . w5 }. . P({w6 }) = 0. {w1 . the proposition {w1 . B2∗ (One Premise Logical Closure) For all Y. w3 . . {w1 . . {w1 . and let P be the unique regular countably additive 1 probability measure that is given by: P({w1 }) = 1 + 1 . . . the latter sphere system is a subclass of the former one. . . Finally. . . and hence every AGM-style belief revision operator on a logically ﬁnite language. w6 }. P({w3 }) = 2 4 8 1 1 1 + 64 . as entailed by Deﬁnition 8. w2 . . Moreover. w7 }. {w1 . . . w2 . we will give some of these examples an intended interpretation by assuming that the possible worlds in question satisfy particular statements. . . . w2 . Then the resulting non-empty P-stable 2 sets are: 32 {w1 }. a simple inﬁnite example: Let W = {w1 . w4 }. {w1 . of course. while relative to a cautiousness degree of r = 4 . . w5 }. B3∗ (Finite Conjunction) For all Y. and W. . . w6 }. . Once we have covered conditional belief in full in the next section. {w1 .} be countably inﬁnite. . . . then the corresponding sphere system of non-empty P4 3 stable 4 sets is: {w1 .. w2 }. 4 The Reduction of Belief II: Conditional Beliefs Now we ﬁnally generalise the postulates of the previous section to the case of beliefs that are conditional on propositions which may even be inconsistent with what our agent ag believes absolutely. However. 39 . . {w1 . we will return to some of these examples and analyse them in terms of conditional belief accordingly. let A be the power set algebra on W. P({w7 }) = 0.g. w7 }. w2 } is the strongest one that is believed as being given by the same probability measure. 
P1–P2 remain unchanged. w2 }. eventually. . {w1 . . . . if we switch e. the proposition {w1 } is the strongest one that is believed as being 2 3 given by P. w2 }. . . With a cautiousness degree of r = 1 . . Our generalisations of B1–B5 simply result from dropping the antecedent ‘¬Bel(¬X|W)’ condition that they contained: B1∗ (Reﬂexivity) Bel(X|X). . . . . It is also easy to see that every ﬁnite sphere system can be realized in this way in terms 1 of P-stable 2 propositions of probability less than 1. .00006.

B4∗ (General Conjunction) For 𝒴 = {Y ∈ A | Bel(Y|X)}, ⋂𝒴 is a member of A, and Bel(⋂𝒴|X).

The Consistency postulate stays the same:

B5∗ (Consistency) ¬Bel(∅|W).

The same arguments as before apply: B4∗ now entails that for every X ∈ A there is a least set Y, such that Bel(Y|X), which by B1∗ must be a subset of X. We denote this proposition again by: B_X. This is consistent with the corresponding notations that we used in the last section. Once again, we have Bel(Y|X) if and only if Y ⊇ B_X if and only if Bel(Y|B_X).

The following postulate extends our previous Expansion postulate B6 to all cases of conditional belief whatsoever. It corresponds to the standard AGM postulates K*7 and K*8 for belief revision if translated again into the current context:

B6∗ (Revision) For all X, Y ∈ A such that Y ∩ B_X ≠ ∅: For all Z ∈ A, Bel(Z|X ∩ Y) if and only if Z ⊇ Y ∩ B_X.

That is: if the proposition Y is consistent with B_X (equivalently: Y is consistent with everything ag believes conditional on X), then ag believes Z conditional on the conjunction of Y and X if and only if Z is logically entailed by the conjunction of Y with B_X. Equivalently:

B6∗ (Revision) For all X, Y ∈ A, such that for all Z ∈ A, if Bel(Z|X) then Y ∩ Z ≠ ∅: For all Z ∈ A, Bel(Z|X ∩ Y) if and only if Z ⊇ Y ∩ B_X.

Analogously to the last section, this is thus yet another equivalent statement of B6∗:

B6∗ (Revision) For all X, Y ∈ A such that Y ∩ B_X ≠ ∅: B_{X∩Y} = Y ∩ B_X.

Just as the original B6 postulate, it can be justified in terms of standard possible worlds accounts of similarity orderings (as for David Lewis' conditional logic) or plausibility rankings (as in belief revision and nonmonotonic reasoning): say what a conditional belief expresses is again that the most plausible antecedent-worlds are consequent-worlds; then if some of the most plausible X-worlds are Y-worlds, these worlds must be precisely the most plausible X ∩ Y-worlds, hence the most plausible X ∩ Y-worlds are Z-worlds if and only if all the most plausible X-worlds that are Y-worlds are Z-worlds.

The generalised version BP1r∗ of our previous BP1r postulate arises simply by dropping the 'Y ∩ B_W ≠ ∅' restriction again. So we have:

BP1r∗ (Likeliness) For all Y ∈ A with P(Y) > 0: For all Z ∈ A, if Bel(Z|Y), then P(Z|Y) > r.

Finally, we generalise BP2 in the same way, and additionally we strengthen it by assuming also the converse of the resulting generalisation:

BP2∗ (Zero Supposition) For all Y ∈ A: P(Y) = 0 if and only if B_Y = ∅.

The additional strengthening has it that the propositions the supposition of which leads to inconsistency qualitatively are precisely those for which conditionalization is undefined quantitatively. As mentioned before, if we had started with primitive conditional probability measures, which do allow for conditionalization on zero sets, then BP2∗ should not be taken for granted, but in the context of absolute probability measures BP2∗ is natural to postulate in order to treat qualitative and quantitative supposition similarly.

The reason why the original BP2 principle did not include the corresponding right-to-left direction of BP2∗ with the qualification 'Y ∩ B_W ≠ ∅' (that is, why we did not postulate: If B_Y = ∅ and Y ∩ B_W ≠ ∅, then P(Y) = 0) is that the resulting principle would have been empty: if Y ∩ B_W ≠ ∅, then by B6 the proposition B_Y would have to be non-empty, in contradiction with B_Y = ∅, so the antecedent of that direction would always have to be false. We have seen in the last section that BP2, and hence BP2∗, entails (given the other postulates): all Y ∈ A for which P(Y) = 1 holds are such that B_Y = B_W, and B_W is the least proposition in A of probability 1.

We are now ready to prove the main result of our theory on conditional beliefs in general. The "soundness" direction of the following representation theorem incorporates the corresponding direction of Grove's (..) representation theorem for belief revision operators in terms of sphere systems. However, since all the propositions or sets of worlds that we are about to consider are required to be members of our given algebra A, it is not possible to simply translate the more difficult "completeness" part of Grove's representation theorem in .. into our present context and apply it, since his construction of spheres involves taking unions of propositions that might not be members of our σ-algebra A anymore. That is why the proof of that part of the theorem differs from Grove's proof quite significantly. Here is the theorem:

Theorem 9 Let Bel be a class of ordered pairs of members of a σ-algebra A, and let P : A → [0, 1]. Then the following two statements are equivalent:

I. P and Bel satisfy P1–P2, B1∗–B6∗, BP1r∗, BP2∗.

42 . Proof. X is the least member of X for which Y ∩ X ∅ holds (which exists). βr is countable P P and so are its predecessors.. thus W \ γ<α Xγ ∈ A.) At ﬁrst we make a couple of observations about this class X: (a) Every member of X is also a member of A.II. where X is the least member of X for which Y ∩ X ∅. X0 = BW . Xα ∈ A. hence. and: – For all Y ∈ A with P(Y) > 0: if. (ii) all other members of X have probability less than 1. So we can concentrate on the left-to-right direction: P1–P2 are satisﬁed by assumption. as is the proof of BP2∗ . But this can be simpliﬁed to: Xα = γ<α [BW\ δ<γ Xδ ] ∪ BW\ γ<α Xγ . (b) For all γ < α < βr + 1: Xγ ⊆ Xα . which was to be shown. AsP sume that for all γ < α: Xγ = δ<γ BW\ η<δ Xη ∪ BW\ δ<γ Xδ . we conclude: Xα = γ<α [ δ<γ BW\ η<δ Xη ∪ BW\ δ<γ Xδ ] ∪ BW\ γ<α Xγ . let Xα = [Xγ ] ∪ BW\ γ<α Xγ .. The existence of that least member follows from Theorem 4. with respect to the subset relation. For assume that all Xγ are in A for γ < α < βr + 1: by the results of the last section. This follows directly from the deﬁnition of the P members of X. for all Z ∈ A: Bel(Z|Y). P satisﬁes P1–P2. in particular.). The right-to-left direction is like the one in Theorem 3. from the fact that every non-empty P-stabler propositions with probability less than 1 is a subset of the least set in A with probability 1. By transﬁnite induction. γ<α (So. γ<α Xγ ∈ A. and therefore BW\ γ<α Xγ ∈ A. Additionally. A contains a least set of probability 1. Substituting this for the ﬁrst occurrence of ‘Xγ ’ in the original deﬁnition of Xα . and there is a (uniquely determined) class X of non-empty P-stabler propositions in A. P (c) For all α < βr + 1: Xα = γ<α BW\ δ<γ Xδ ∪ BW\ γ<α Xγ . such that (i) X includes the least set of probability 1 in A. By transﬁnite induction. and therefore by A being a σ-algebra. 
Now we deﬁne X by transﬁnite recursion as the class of all sets Xα of the following kind: For all ordinals α < βr + 1 (the successor ordinal of the ordinal that was deﬁned in the last P section). except that one shows ﬁrst that the equivalence for Bel entails for all Y ∈ A with P(Y) > 0 that BY = Y ∩ X. The proof of B6∗ is straight forward (and analogous to Groves Theorem in. for all Y ∈ A with P(Y) = 0. then for all Z ∈ A: Bel(Z | Y) if and only if Z ⊇ Y ∩ X. and from the fact that the least set of probability 1 in A must have non-empty intersection with every proposition of positive probability. From this it also follows that for all α + 1 < βr + 1: Xα+1 = Xα ∪ BW\Xα .

BY ⊆ Xα . So Xγ ∩ ¬Y is empty. there exists the least proposition X ∈ A with probability 1. if not. This implies by B6∗ : B[W\Xγ ]∩¬Y = ¬Y ∩ BW\Xγ . This is P because: If Y ∩ Xα ∅. either there P-stabler propositions in A with probability less than 1 or not: If so. which is a 43 . as shown in the previous section. Since βr + 1 is an ordinal.(d) For all α < βr + 1: For all Y ∈ A with Y ∩ Xα ∅. Xα is P-stabler . and X is the only member of X with probability 1. such that Xα = Xα+1 . then as shown in the last section their (countable) union is the least proposition X ∈ A with probability 1. Hence. if P Y ∩ Xα ∅ and P(Y) > 0. such that α = γ + 1. it cannot contain any non-empty zero set. In either case. such that P(Xα ) = 1. then as observed before. Since they are all P-stabler by (e). Since BW\Xα ⊆ W \ Xα by the deﬁnition of ‘BW\Xα ’ and B1∗ –B4∗ . This can be derived from: For all Y ∈ A. By B6∗ . and hence by the deﬁnition of ‘BY ’: Bel(Xα |Y). Assume for contradiction that all sets Xα with α < βr + 1 have probability P less than 1. Because Xγ is P-stabler with a probability of less than 1. there must be a least P P α < βr + 1. which means by BP2∗ that B[W\Xγ ]∩¬Y is empty. That chain could not be a chain of strictly P increasing P-stabler sets of probability less than 1. and by the well-orderedness of the ordinals. there must be a least such ordinal γ. such that Y ∩ BW\ δ<γ Xδ ∅. such that P(Y) = 1 and Y is not a superset of Xα . then by (d). by Observation 5 and by the deﬁnition of βr which is the ordinal type of all P-stabler sets of probability less than 1 whatsoever. We P P will deal with these cases separately: In the former case. it holds that BY ⊆ Xα . and therefore BW\Xγ ∩ ¬Y must be non-empty. it follows that BY ⊆ Xα . X ∈ X. because Y ∩ BW\ δ<γ Xδ ⊆ BW\ δ<γ Xδ ⊆ Xα by (c) again. by (b) again: Xα = Xα ∪ BW\Xα . P Because P(Xα ) < 1. then Xα ∩ ¬Y is non-empty. there is a γ < βr + 1. and hence Y ⊆ W \ δ<γ Xδ . 
BP2∗ entails with the other postulates that BW is the least X ∈ A of probability 1. B[W\ δ<γ Xδ ]∩Y = Y ∩ BW\ δ<γ Xδ . so by the right-to-left direction of BP2∗ it follows that BW\Xα ∅. and X is the only member of X with probability 1. then by (c) there is a γ ≤ α. and. (e) For all α < βr + 1: Xα is P-stabler . we turn to the proof of: X ∈ X. it follows from (b) that there is a wellordered chain of (not necessarily strictly) increasing P-stabler sets of probability less than 1. Note that for that least ordinal γ it holds that Y ∩ δ<γ Xδ = ∅. which is equivalent to BY = Y ∩ BW\ δ<γ Xδ by what we have shown before. (f) There exists a least proposition X ∈ A with probability 1. a contradiction follows. But this implies by BP1r∗ that P(Xα |Y) > r. Secondly. therefore. by (b) again: P Xα = Xγ ∪ BW\Xγ . Hence. But [W \ Xγ ] ∩ ¬Y is a set of probability 0 since ¬Y is. Proof: First of all. If there is a set Y ∈ A. By Observation 5. So P there must be α < α < βr + 1. Finally. where the length of that chain is βr + 1. we have that there must be at least one set Xα with α < βr + 1 that has probability 1. it holds that P(W \ Xα ) > 0 by P1. either βr is ﬁnite or equal to ω. where Xα ∩ ¬Y is a zero set since ¬Y is.

and so Xα is the least set in A of probability 1. if α < ω. their union γ<ω Xγ must be equal to the union of all P-stabler sets with probability less than 1. Therefore. Without restriction of generality. that union is the least set in A of probability 1. Uniqueness follows from: if there are two such classes X. if the sets Xγ with γ < ω are pairwise distinct. Y ∩ Xα = Y ∩ BW\ γ<α Xγ ∅. then all sets Xγ with γ < ω must be P-stabler sets with probability less than 1. If Y had non-empty intersection with any set of the form BW\ δ<γ Xδ for γ < α. and by (b) again: P Xα+1 = Xα ∪ BW\Xα . and so Xα+1 = Xα . But W \ Xα has probability 0 then. by (c) again. and since W \ γ<ω Xγ is then a zero set. they must be equal from some ordinal less than ω by (b). Xα = γ<α BW\ δ<γ Xδ ∪ BW\ γ<α Xγ . Finally. Furthermore. there is a member of X with which Y has non-empty intersection. BY ⊆ Y ∩ Xα . hence by BP2∗ it must hold that BW\Xα is empty. In the other case. then by the same reasoning as before. and it is the only set in X with probability 1 since α = ω = βr is the last ordinal less than βr + 1 in the present case. which entails just as before that Xα is the least set in A of probability 1 and the only set in X that has probability 1. such that Xα = Xγ ∪ BW\Xγ . all Y ∈ A with P(Y) = 1 are supersets of Xα . This P P concludes (f): the least set X with probability 1 is a member of A and indeed of X. in contradiction with the way in which α was deﬁned before. from which the remaining part of II. The latter implies with B6∗ that B[W\ γ<α Xγ ]∩Y = Y ∩ BW\ γ<α Xγ . then Y ∩ Xγ ∅. From (d) we know already that BY ⊆ Xα and hence with B1∗ –B4∗ . hence there is such an Xγ . Therefore. and X is the only member of X with probability 1. Y ∩ γ<α Xγ is empty. remains to be the only set in X with probability 1. BY = ∅. the least set in A of probability 1. then they must diﬀer with respect to at least one P-stabler sets with probability less than 1. By deﬁnition. As in the proof of (d). 
BW\ γ<ω Xγ is empty as follows from BP2∗ . Now let Y ∈ A with P(Y) > 0: By P1 and (f). P Xα is then with respect to the subset relation the least member of X for which this holds. if α = ω. follows by means of the deﬁnition of BY and B1∗ –B4∗ . such that Y ∩ Xα ∅: because of (b). We now show that BY = Y ∩ Xα . If these sets are not pairwise distinct. then by Observation 5. X with the stated properties. consider Y ∈ A with P(Y) = 0: By BP2∗ . remains to be the only set in X with probability 1. Finally. On the other hand.contradiction. where βr = ω. then Xα+1 ∈ X. follows by means of the deﬁnition of BY and B1∗ –B4∗ again. the least set in A of probability 1. and thus [W \ γ<α Xγ ] ∩ Y = Y. Xα = Xω = γ<ω [Xγ ] ∪ BW\ γ<ω Xγ . And as shown immediately after Observation 5. Let α < βr + 1 be least. Thus. if α < βr . Now consider Y ∩ Xα again. which by assumption is non-empty: By (c). let Xα be the ﬁrst member of X that is not also a member of X : since Xα is P-stabler and has probability less than 1. P Xα . and therefore Xα is again identical to the least set in A with probability 1. it follows just as before that 44 . Xα . from which the relevant part of II. So we have BY = Y ∩ BW\ γ<α Xγ = Y ∩ Xα and we are done.

α is finite. If α = 0, then B_W could not be the same as being given by 𝒳 and 𝒳′, which would be a contradiction. If α is a successor ordinal γ + 1, then B_{W\X_γ} = X_α \ X_γ could not be the same as being given by 𝒳 and 𝒳′, which would again be a contradiction.

Theorem 9 generalises Theorem 3 of the last section to conditional beliefs in general. Theorem 3 simply dealt with the special case of a sphere system of just one P-stable^r set, where we were only interested in the least P-stable^r proposition B_W.

It remains to generalise BP3 in the now obvious way:

BP3∗ (Maximality) Among all classes Bel′ of ordered pairs of members of A, such that P and Bel′ jointly satisfy P1–P2, B1∗–B6∗, BP1r∗, BP2∗, the class Bel is the largest one. In other words, for all such Bel′: Bel ⊇ Bel′.

Using this, we can derive:

Corollary 10 Let Bel be a class of ordered pairs of members of a σ-algebra A, and let P : A → [0, 1]. Then the following two statements are equivalent:

III. P and Bel satisfy P1–P2, B1∗–B6∗, BP1r∗, BP2∗, BP3∗.

IV. P satisfies P1–P2, A contains a least set of probability 1, and if 𝒳 is such that (and indeed is uniquely determined by) (i) 𝒳 includes the least set of probability 1 in A, (ii) and all the other members of 𝒳 are precisely all the non-empty P-stable^r propositions in A which have probability less than 1, then:

– For all Y ∈ A with P(Y) > 0: if, with respect to the subset relation, X is the least member of 𝒳 for which Y ∩ X ≠ ∅ holds (which exists), then for all Z ∈ A: Bel(Z | Y) if and only if Z ⊇ Y ∩ X.

– Additionally, for all Y ∈ A with P(Y) = 0, for all Z ∈ A: Bel(Z|Y).

Proof. This follows immediately from Theorem 9, except that we have to show: adding 'BP3∗' to I. of Theorem 9 is equivalent to determining 𝒳 as in IV. of Corollary 10. But that is a consequence of the following independent observation:

Observation 11 Let P be a countably additive probability measure on a σ-algebra A over W. Assume that A contains a least set of probability 1. Let 𝒳, 𝒳′ be classes of non-empty P-stable^r propositions for which (i) and (ii) of II. of Theorem 9 are satisfied. Let Bel, Bel′ be defined in terms of 𝒳, 𝒳′, respectively, as stated in II. of Theorem 9. Then it holds: If 𝒳 ⊆ 𝒳′, then for all Y, Z ∈ A: If Bel(Z|Y) then Bel′(Z|Y).

Proof. Let 𝒳 ⊆ 𝒳′. For Y with P(Y) = 0 there is nothing to show. So let Y be such that P(Y) > 0: If Bel(Z|Y), then by definition Z ⊇ Y ∩ X with X being the least member of 𝒳 for which Y ∩ X ≠ ∅ holds. But since X is also a member of 𝒳′, the least member X′ of 𝒳′ for which Y ∩ X′ ≠ ∅ holds must then be a subset of X; hence, Z ⊇ Y ∩ X′ and therefore Bel′(Z|Y).

From this it follows that choosing 𝒳 to be the greatest class of all non-empty P-stable^r propositions in A such that (i) and (ii) of II. of Theorem 9 are satisfied must lead to the maximal class Bel of pairs of propositions in A, if Bel is given as in II. of Theorem 9. But that is exactly what we did in IV. of Corollary 10.

Note that unlike the case of absolute belief, the additional Least Certain Set Restriction on P is even entailed by our postulates on subjective probability and belief. So when we finally turn now IV. of Corollary 10 into an explicit definition of belief on the basis of P, but this time of conditional belief in general, then doing so "just" for probability measures for which there exist least propositions of probability 1 is not an actual constraint (given our postulates are plausible). After all, only such probability measures can be combined with any class Bel at all, such that all of our postulates are satisfied jointly by them. This is thus the intended materially adequate explicit definition of conditional belief:

Definition 12 Let P : A → [0, 1] be a countably additive probability measure on a σ-algebra A, such that there exists a least set of probability 1 in A. Let 𝒳 be uniquely determined by: (i) 𝒳 includes the least set of probability 1 in A, (ii) and all the other members of 𝒳 are precisely all the non-empty P-stable^r propositions in A which have probability less than 1. Then we say for all Y, Z ∈ A and 1/2 ≤ r < 1:

Z is believed conditional on Y (to a cautiousness degree of r) as being given by P if and only if either (i) P(Y) > 0 and Z is a superset of the intersection of Y with the least non-empty P-stable^r proposition X_least in A that has a non-empty intersection with Y (which exists), or (ii) P(Y) = 0.

By 'materially adequate' we mean the same as at the end of the previous section: the definition is a true, and even conceptually true, sentence, if taken as a descriptive statement and if given our postulates.

In analogy with the case of absolute beliefs, we could now define notions of prima facie conditional belief and conditional belief a priori again, and again we would end up with three notions of belief of increasing strength: prima facie conditional belief, conditional belief (to a cautiousness degree of r), and conditional belief a priori. Of course, conditional belief in the sense of Definition 12 is the concept that we propose as an explication of our pre-theoretic notion of conditional belief simpliciter.
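For a finite set of worlds with the full power set algebra, Definition 12 can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it takes P-stability^r in the sense used earlier in the paper (X is P-stable^r if P(X|Y) > r for every Y ∈ A with Y ∩ X ≠ ∅ and P(Y) > 0), it tests that condition by brute force over all subsets, and the function names (`is_stable`, `sphere_system`, `believes`) are hypothetical. The probabilities are those of the seven-world example of Section 3.4, assuming the last three values belong to w5, w6, w7 in that order.

```python
from fractions import Fraction
from itertools import combinations


def powerset(worlds):
    """All subsets of a finite collection of worlds, as frozensets."""
    ws = sorted(worlds)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]


def prob(P, X):
    """Probability of the proposition X under the world-probabilities P."""
    return sum((P[w] for w in X), Fraction(0))


def is_stable(P, X, r):
    """Brute-force test of P-stability^r (assumed definition, see lead-in):
    P(X | Y) > r for every Y with Y ∩ X ≠ ∅ and P(Y) > 0."""
    for Y in powerset(P):
        if Y & X and prob(P, Y) > 0:
            if prob(P, X & Y) / prob(P, Y) <= r:
                return False
    return True


def sphere_system(P, r):
    """The class from Definition 12: the least set of probability 1 plus all
    non-empty P-stable^r sets of probability less than 1, ordered by size."""
    least_certain = frozenset(w for w in P if P[w] > 0)
    spheres = [X for X in powerset(P)
               if X and prob(P, X) < 1 and is_stable(P, X, r)]
    spheres.append(least_certain)
    return sorted(spheres, key=len)


def believes(P, r, Z, Y):
    """Bel(Z | Y) as in Definition 12 (finite case, power set algebra)."""
    if prob(P, Y) == 0:
        return True  # clause (ii): everything is believed on a zero set
    X = next(S for S in sphere_system(P, r) if S & Y)  # least sphere meeting Y
    return (Y & X) <= Z


W = [f"w{i}" for i in range(1, 8)]
values = ["0.54", "0.342", "0.058", "0.03994", "0.018", "0.002", "0.00006"]
P = {w: Fraction(v) for w, v in zip(W, values)}
HALF = Fraction(1, 2)

# Prints six nested spheres: {w1}, {w1,w2}, {w1..w4}, {w1..w5}, {w1..w6}, W
for S in sphere_system(P, HALF):
    print(sorted(S))

# Absolute belief is the special case Y = W.
print(believes(P, HALF, frozenset({"w1"}), frozenset(W)))
```

Exact fractions are used so that the strict inequality P(X|Y) > r is not blurred by floating-point rounding. The brute-force test is exponential in the number of worlds, so it is meant for toy examples only.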

[APPLICATIONS AND EXTENSIONS LEFT OUT.]
