
Carrier's "Proving History"

A Review From a Bayesian Perspective

Tim Hendrix∗

May 4, 2013

Introduction

Dr. Richard Carrier's new book, "Proving History" (Prometheus Books, ISBN 978-1-61614-560-6), is the first of two volumes in which Dr. Carrier investigates the question of whether Jesus existed. According to Dr. Carrier, the current state of Jesus studies is one of confusion and failure, in which all past attempts to recover the "true" Jesus have failed. The main problem, he argues, is the methods employed: past studies have focused on developing historical criteria to determine which parts of a text (for instance the Gospel of Luke) can be trusted, but all these criteria and their use are flawed. As a result, there are many incompatible views of what Jesus said or did, and accordingly the question "Who was Jesus?" has many incompatible answers: a Cynic sage, a Rabbinical holy man, a Zealot activist, an apocalyptic prophet and so on.

Richard Carrier proposes that Bayes theorem (see below) should be employed in all areas of historical study. Specifically, Dr. Carrier proposes that the problems plaguing the methods of criteria can be solved by applying Bayes theorem, and that this will finally allow the field of Jesus studies to advance. What this progress will look like, and specifically how the question of whether Jesus existed should be answered, will be the subject of his second volume.

I was interested in Dr. Carrier's book both because I have a hobby interest in Jesus studies and found his other book on early Christianity, "Not the Impossible Faith", very enjoyable and informative, but certainly also because Bayesian methods were the focus area of my PhD and are my current research area. My main focus in writing this review will therefore be on the technical content relating to the use of Bayes theorem and its applicability to historical questions as argued in the book.

The book is divided into six chapters. Chapter one contains an introduction which argues that historical Jesus studies in its present form is rife with problems, chapter two introduces the historical method as a set of 12 axioms and 12 rules,

∗ Tim Hendrix is not my real name. For family reasons I prefer not to have my name

associated with my religious views online. All quotations are from ”Proving History”.


chapter three introduces Bayes theorem, chapter four discusses historical methods and seeks to demonstrate with formal logic that all valid historical methods reduce to applications of Bayes theorem, and chapter five goes through historical criteria often used in Jesus studies and concludes that each is only valid insofar as it agrees with Bayes theorem. Finally chapter six, titled "the hard stuff", discusses a number of issues that arise in applying Bayes theorem, as well as Richard Carrier's proposal for how the frequentist and Bayesian views of probabilities can be unified.

In reviewing this book I wish to focus on what I believe are the book's main contributions. The first point is that Bayes theorem not only applies to the historical method, but that it can be formally proven that all historical methods can be reduced to applications of Bayes theorem, and, importantly, that thinking in this way will give tangible benefits compared to traditional historical methods.

The second point is how Dr. Carrier addresses several philosophical points that are raised throughout the book, for instance the unification of the frequentist and Bayesian views of probabilities. Since I am not a philosopher I will not be able to say much on the philosophical side, but I do think there are a number of points which fall squarely within my field that should be raised.

However, before I proceed I will first briefly touch upon the Bayesian view of probabilities and Bayes theorem.

1 Bayes Theorem

I wish to begin with a point that may seem pedantic at first, namely why we

should think Bayes theorem is true at all. Dr. Carrier introduces Bayes Theorem

as follows:

In simple terms, Bayes’s Theorem is a logical formula that deals

with cases of empirical ambiguity, calculating how confident we can

be in any particular conclusion, given what we know at the time.

The theorem was discovered in the late eighteenth century and has

since been formally proved, mathematically and logically, so we now

know its conclusions are always necessarily true if its premises are

true. (Chapter 3)

Unfortunately there are no references for this section, and so it is not explained what definitions Bayes theorem makes use of, which assumptions it rests upon, or how it is proven. For reasons I will return to later, I think this omission is problematic. However, shortly after the above quotation, just before introducing the formula for Bayes theorem, we are given a reference:

But if you do want to advance to more technical issues of the

application and importance of Bayes’s Theorem, there are several

highly commendable texts[9]

Footnote 9 has as its first entry E.T. Jaynes' "Probability Theory" from 2003. I

highly endorse this choice and I think most Bayesian statisticians would agree.


E.T. Jaynes was not only an influential physicist, he was also a great communicator, and his book is my preferred reference for students. In his book, Jaynes argues that Bayes theorem is an extension of logic, and I will attempt to give the gist of Jaynes' treatment of Bayes theorem below. Interested readers can find an almost complete draft of Jaynes' book freely available online[1]:

Suppose you want to program a robot that can reason in a sensible manner. You want the robot to reason quantitatively about true/false statements such as:

A = "The next flip of the coin will be heads"

B = "There has been life on Mars"

C = "Jesus existed".

A basic problem is that neither we nor the robot have perfect knowledge, and so it must reason under uncertainty. Accordingly, we want the robot to have a notion of the "degree of plausibility" of some statements given other statements that are accepted as true.

The most important choice in the above is that I have not defined what the "degree of plausibility" is. Put in other words, the goal is to analyse what "the degree of plausibility" of some statement could possibly mean and derive a result. Jaynes' treatment in Probability Theory is both thorough and entertaining[2], and at the end he arrives at the following three desiderata that a notion of degree of plausibility must fulfill:

• The degree of plausibility must be described by a real number

• It must agree with common sense (logic) in the limit of certainty

• It must be consistent

Consistency implies that if we have two ways to reason about the degree of plausibility of a statement, these two ways must give the same result. After some further analysis he arrives at the result that the degree of plausibility of statements A, B, C, . . . can be described by a function P, and that this function must behave like ordinary probabilities usually do, including obeying Bayes theorem:

P(A|B) = P(B|A)P(A) / P(B)

where the notation P(A|B) means "the degree of plausibility of A given B". The key point is that Bayes theorem now applies (if we accept what goes into the derivation) not only to flips of coins, but to all assignments of degrees of plausibility to the true/false statements we may consider, and the interpretation that a probability is really a degree of plausibility is then called the Bayesian
[1] c.f. http://bayes.wustl.edu/etj/prob/book.pdf

[2] It should be noted the argument is not original to E.T. Jaynes; see R.T. Cox's work from 1946 and 1961, or Jaynes' book, for a detailed discussion of the history.


interpretation of probabilities. It is in this sense that Jaynes (as well as most others who call themselves Bayesians) considers Bayes theorem an extension of logic.

These definitions may appear somewhat technical and irrelevant at this point; however, their importance will hopefully become apparent later. For now let us make a few key observations:

• Bayes theorem does not tell us what any particular probability should be

• Bayes theorem does not tell us how we should define the statements A, B, C, . . . in a particular situation

What Bayes theorem does provide is a consistency requirement: if we know the probabilities on the right-hand side of the above equation, then we know what the probability on the left-hand side should be.
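To make the consistency requirement concrete, here is a minimal sketch in code (the numbers are invented purely for illustration; nothing here comes from the book):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes theorem: P(A|B) = P(B|A) P(A) / P(B).

    The theorem only relates the four quantities; the three inputs
    must be supplied from outside (models, data, assumptions)."""
    return p_b_given_a * p_a / p_b

# Invented numbers: P(B|A) = 0.8, P(A) = 0.3, P(B) = 0.5
print(bayes(0.8, 0.3, 0.5))  # 0.48
```

The function is nothing but the consistency requirement: supply the right-hand side and the left-hand side follows; the theorem itself says nothing about where the inputs come from.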

2 Is all historical reasoning just Bayes theorem?

First and foremost, I think it is entirely uncontroversial to say that Bayes theorem has something important to say about reasoning in general, and so also about historical reasoning. For instance, by going through various toy examples, Bayes theorem provides a powerful tool to weed out biases and logical fallacies we are all prone to make.

However, I believe Dr. Carrier has a more general connection between BT and the historical method in mind. In chapter 3:

Since BT is formally valid and its premises (the probabilities

we enter into it) constitute all that we can relevantly say about

the likelihood of any historical claim being true, it should follow

that all valid historical reasoning is described by Bayes’s Theorem

(whether historians are aware of this or not). That would mean

any historical reasoning that cannot be validly described by Bayes’s

Theorem is itself invalid (all of which I’ll demonstrate in the next

chapter). There is no other theorem that can make this claim. But

I shall take up the challenge of proving that in the next chapter.

and later, just before the formal proof:

(...) we could simply conclude here and now that Bayes’s Theorem models and describes all valid historical methods. No other

method is needed, apart from the endless plethora of techniques

that will be required to apply BT to specific cases of which the AFE

and ABE represent highly generalized examples, but examples at

even lower levels of generalization could be explored as well (such as

the methods of textual criticism, demographics, or stylometrics). All

become logically valid only insofar as they conform to BT and thus

are better informed when carried out with full awareness of their

Bayesian underpinning. This should already be sufficiently clear by now, but there are always naysayers. For them, I shall establish this conclusion by formal logic

The crux of the logical argument seems to be this. Dr. Carrier defines variables C, D and E, of which only D and E will be of interest to us. The relevant part of the argument is as follows:

Formally, if C = ”a valid historical method that contradicts BT,”

D = ”a valid historical method fully modeled and described by (and

thereby reducible to) BT,” and E = ”a valid historical method that is

consistent with but only partly modeled and described by BT,” then:

P8 Either C, D, or E. (proper trichotomy)

···

P10 If P5 and P6, then ∼E.

P11 P5 and P6.

C4 Therefore, ∼E

···

To establish premises P5 and P6, we consider a historical claim h, a piece of evidence e and some background knowledge b. The premises are as follows:[3]

P5 Anything that can be said about any historical claim h that

makes any valid difference to the probability that h is true will

either (a) make h more or less likely on considerations of background knowledge alone or (b) make the evidence more or less

likely on considerations of the deductive predictions of h given

that same background knowledge or (c) make the evidence more

or less likely on considerations of the deductive predictions of

some other claim (a claim which entails h is false) given that

same background knowledge.

P6 Making h more or less likely on considerations of background knowledge alone is the premise P(h|b) in BT; making the evidence more or less likely on considerations of the deductive predictions of h on that same background knowledge is the premise P(e|h.b) in BT; making the evidence more or less likely on considerations of the deductive predictions of some other claim that entails h is false is the premise P(e|∼h.b) in BT; any value for P(h|b) entails the value for the premise P(∼h|b) in BT; and these exhaust all the premises in BT.

[3] I have chosen to follow Dr. Carrier's typesetting, and accordingly for propositions such as A = "It will rain tomorrow" and B = "It will be cold tomorrow", the notation ∼A means "not A" ("it will not rain tomorrow") and A.B means "A and B" ("it will be rainy and cold tomorrow").


I think we can summarize the argument as follows: consider a valid historical method. Either the historical method is fully or only partly described by Bayes theorem. We can rule out the latter possibility, E, for the following reason: anything that can be said about the probability that a historical claim h is true given some background knowledge b and evidence e, denoted by P(h|e.b), will affect either P(h|b), P(e|h.b), P(e|∼h.b) or P(∼h|b). However, these values fully determine P(h|e.b) according to Bayes theorem:

P(h|e.b) = P(e|h.b)P(h|b) / [P(e|h.b)P(h|b) + P(e|∼h.b)P(∼h|b)]
and so the method must be fully included in Bayes theorem, proving the original claim.

I see two problems with the argument. The first is somewhat technical but needs to be raised: though it is not stated explicitly, Dr. Carrier tacitly assumes that anything we are interested in about a claim h is the probability that h is true. However, I see no reason why this should be the case. For instance, Dempster-Shafer theory establishes the support and plausibility (the latter term is used to a different effect than I did in the introduction) of a claim, and multi-valued logics attempt to define and analyze the graded truth of a proposition; all of these are concepts different from probability. It is not at all apparent why these concepts can be ruled out as being either not useful or as reducing to Bayes theorem. For instance, suppose we define Jesus as a "highly inspirational prophet"; a great many in my field would say the modifier "highly" is not well analysed in terms of probabilities but requires other tools. More generally, it goes without saying that we do not have a general theory of cognition, and I would be very surprised if that theory turned out to reduce to probability theory in the case of history.

The second problem is more concrete and relates to the scope of what is being demonstrated. Let's assume we are only interested in the probability of a claim h being true. As noted in the previous section, Bayes theorem is clearly only saying something about how the quantity on the left-hand side of the above equation, P(h|e.b), must be related to those on the right-hand side, and Dr. Carrier is correct in pointing out that any change in P(h|e.b) must (this is pure algebra!) correspond to a change in at least one term on the right-hand side. The problem is that we do not know what those quantities on the right-hand side are numerically, and we cannot figure them out just by applying Bayes theorem more times. For instance, applying Bayes theorem to the term P(e|h.b) will require knowledge of P(h|e.b), exactly the term we set out to determine.

This, however, seems to severely undercut the importance of what is being demonstrated. Let me illustrate this with an example. Let's say I make a claim such as:

Basic algebra [Bayes’s Theorem] models and describes all valid

methods for reasoning about unemployment [historical methods]

My proof goes as follows: let X be the number of unemployed people, Y the number of people who are physically unable to work due to some disability, and Z the number of people who can work but have not found work. Now the algebra:

X = Y + Z

(contrast this equation to Bayes theorem). I can now make claims equivalent to P5 and P6: all that can be validly said about X must imply a change in either Y or Z, and I can conclude that all that can validly be said about the number of unemployed people must therefore be described by algebra.

Clearly in some sense this is true; however, it misses nearly everything of economic interest, such as what actually affects the terms Y and Z and by how much. While it is clear that if X changes at least one of the terms Y or Z has to change, algebra does not tell us which, just as Bayes theorem does not tell us what the quantities P(e|h.b), P(h|b), . . . actually are, and it does not tell us how the propositions e, h, b should be defined.
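The point can be put in code with a toy sketch (the probabilities are invented): the identical theorem, fed two different but equally consistent input assignments, returns very different posteriors, so the substantive work lies entirely in choosing the inputs:

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(h|e.b) computed from the right-hand side of Bayes theorem,
    with P(e|b) expanded by the law of total probability."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e

# Same evidence terms, two different priors P(h|b):
print(posterior(0.50, 0.9, 0.1))  # 0.9
print(posterior(0.01, 0.9, 0.1))  # roughly 0.083
```

Bayes theorem guarantees only that each answer is consistent with its inputs; it is silent on which prior was the right one to use.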

Suppose we try to rescue the idea of a formal proof by accepting that the term "a valid historical method" simply means a system (or method) of inference which operates on the probabilities of propositions, without worrying about which propositions are relevant (which Bayes theorem does not say) or how to obtain their probabilities (which Bayes theorem does not say either). But if we accept this definition, I see no reason why we could not simply replace the argument in chapter 4 with the following:

Bayesian inference describes the relationship between probabilities of various propositions (c.f. Jaynes, 2003). In particular it applies when the propositions are related to historical events.

This claim would of course be hard to expand into about half a chapter.

It is of course true that Bayesian methods have found wide application in almost all sciences, but this has been because Bayesian methods have shown themselves to work. I completely agree with Dr. Carrier that there are reasons to consider how Bayesian methods could be applied to history so as to give tangible results, but the main point is that this must be settled by giving examples of actual applications that offer tangible benefits, just as has been the case in all the other scientific disciplines where Bayesian methods are presently applied. This is what I will focus on in the next sections.

3 Applications of Bayes theorem in "Proving History"

To my surprise, Proving History contains almost no applications of Bayes theorem to historical problems. The purpose of most of the applications of Bayes theorem in Proving History is to illustrate aspects of Bayes theorem and show how it agrees with our common intuition. Take for instance the first example in the book, the analysis of the disappearing sun in chapter 3, which seems mainly intended to show how different degrees of evidence affect one's conclusion in Bayes theorem. The example considers an ahistorical disappearing sun in 1989 with overwhelming observational evidence, and the claimed disappearing sun in the gospels with very little evidence, and shows that according to Bayes theorem we should be more inclined to believe the disappearance with overwhelming evidence. This is certainly true; however, it is not telling us anything new.

The example which by far receives the most extensive treatment is the criterion of embarrassment, for which the discussion takes up about half of chapter five and ends with a computation of probabilities. I will therefore focus only on this example:

3.1 The criterion of embarrassment

The criterion of embarrassment (EC) is as follows:

The EC (or Embarrassment Criterion) is based on the folk belief that if an author says something that would embarrass him, it must be true, because he wouldn't embarrass himself with a lie. An EC argument (or Argument from Embarrassment) is an attempt to apply this principle to derive the conclusion that the embarrassing statement is historically true. For example, the criterion of embarrassment states that material that would have been embarrassing to early Christians is more likely to be historical since it is unlikely that they would have made up material that would have placed them or Jesus in a bad light. (Chapter 5)

Dr. Carrier then offers an extended discussion of some of the problems with the criterion of embarrassment, which I found well written and interesting. The problems raised are: (1) the gospels are themselves very late, making it problematic to assume the authors had access to an embarrassing core tradition they felt compelled to write down, (2) we do not know what would have been embarrassing to the early church, and (3) would the gospel authors pen something genuinely embarrassing at all?

Then follow treatments of several "embarrassing" stories in the gospels, where Dr. Carrier argues (convincingly in my opinion) that there can be little ground for an application of the EC. We then get to the application of Bayes theorem:

Thus, for the Gospels, we're faced with the following logic. If N(T) = the number of true embarrassing stories there actually were in any friendly source, N(∼T) = the number of false embarrassing stories that were fabricated by friendly sources, N(T.M) = the number of true embarrassing stories coinciding with a motive for friendly sources to preserve them that was sufficient to cause them to be preserved, N(∼T.M) = the number of false embarrassing stories (fabricated by friendly sources) coinciding with a motive for friendly sources to preserve them that was sufficient to cause them to be preserved, and N(P) = the number of embarrassing stories that were preserved (both true and fabricated), then


N(P) = N(T.M) + N(∼T.M), and P(T|P), the frequency of true stories among all embarrassing stories preserved, = N(T.M)/N(P), which entails P(T|P) = N(T.M)/(N(T.M) + N(∼T.M)). Since all we have are friendly sources that have no independently confirmed reliability, and no confirmed evidence of there ever being any reliable neutral or hostile sources, it further follows that N(T.M) = qN(T), where q < 1, and N(∼T.M) = 1 × N(∼T): because all false stories created by friendly sources have motives sufficient to preserve them (since that same motive is what created them in the first place), whereas this is not the case for true stories that are embarrassing, for few such stories so conveniently come with sufficient motives to preserve them (as the entire logic of the EC argument requires). So the frequency of the former must be 1, and the frequency of the latter (i.e., q) must be < 1. Therefore: [assuming N(T) = N(∼T), and with slight changes to the typesetting]

P(T|P) = N(T.M)/(N(T.M) + N(∼T.M)) = qN(T)/(q × N(T) + 1 × N(∼T)) = q/(q + 1)

So this is saying that the probability a story is true, given that it is embarrassing and was preserved, will always be less than 0.5, so the EC actually works in reverse!
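The conclusion is easy to check numerically; here is a one-function sketch of the result (assuming, as in the quotation, N(T) = N(∼T) and q < 1):

```python
def p_true_given_preserved(q):
    """Dr. Carrier's result P(T|P) = q / (q + 1), where q < 1 is the
    preservation rate of true embarrassing stories."""
    return q / (q + 1.0)

# The value stays below 0.5 for every q < 1:
for q in (0.05, 0.3, 0.9):
    print(q, p_true_given_preserved(q))
```

Only at q = 1 (true embarrassing stories preserved just as reliably as fabricated ones) does the probability reach 0.5.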

3.2 Reality Check

If you read a memoir which said that (1) the author severely bullied one of his classmates for a year, and (2) the author once gave a large sum of money to a homeless man, then, all things being equal, which of the two would you be more inclined to believe the author had made up? If the memoir was a gospel, we should be more inclined to believe the story of the bullying was made up; however, this obviously goes against common sense! As Richard Carrier himself points out, sometimes the EC does work, and any computation must at the least be able to replicate this situation.

3.3 What happened

I think the first observation is that the quoted argument in Proving History does not actually use Bayes theorem (specifically, it avoids the use of probabilities), but relies on fractions of the sizes of appropriately defined sets. I can't tell why this choice was made, but it is a recurring theme throughout the book to argue for the application of Bayes theorem and then carry out at least part of the argumentation using non-standard arguments. Another thing I found confusing was how the sets are actually defined and why they are chosen the way they are. To first translate the criterion into Bayes theorem we need to define the appropriate variables. As I understand the text, they are defined as follows:

T, F : The story is true (as opposed to fabricated)

Pres : The story was preserved

Em : The story is embarrassing

The discussion carried out in the text now amounts to the following assumptions:

P(Pres|∼T, Em) = 1

P(Pres|T, Em) = q < 1

The first assumption says that the only way someone would fabricate a seemingly embarrassing story is if it serves some purpose, and so it must be preserved; the second says that a true story which seems embarrassing might not serve a specific purpose, and so we are not guaranteed it will be preserved. It should be clear now that we are really interested in computing P(T|Pres, Em), the probability that a story is true given that it is preserved and seems embarrassing. Turning the Bayesian crank:

P(T|Pres, Em) = P(Pres|T, Em)P(T|Em) / [P(Pres|T, Em)P(T|Em) + P(Pres|∼T, Em)P(∼T|Em)]
             = qP(T|Em) / [qP(T|Em) + P(∼T|Em)] = q/(q + 1)

from which the result follows. We can try to translate the result into English: suppose the gospel writers started out with/made up an equal number of true and false stories that seem embarrassing today. All the seemingly embarrassing stories that are false were made up (by the gospel writers or whoever supplied them with their material) because they were significant, and they were therefore preserved, while the true seemingly embarrassing stories were preserved/written down by the gospel writers at a low rate, q; therefore almost all seemingly embarrassing stories that survive to this date are false.
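This translation can be checked with a small simulation of the assumed preservation process (my own sketch of the model above, not code from the book): start with equal numbers of true and fabricated embarrassing stories, always preserve the fabricated ones, and preserve the true ones with probability q.

```python
import random

def preserved_fraction_true(q, n_stories=200_000, seed=0):
    """Simulate the simple model and return the fraction of preserved
    stories that are true; it should approach q / (q + 1)."""
    rng = random.Random(seed)
    true_kept = 0
    fabricated_kept = 0
    for _ in range(n_stories):
        if rng.random() < 0.5:       # the story is true...
            if rng.random() < q:     # ...and preserved only at rate q
                true_kept += 1
        else:                        # fabricated: always preserved
            fabricated_kept += 1
    return true_kept / (true_kept + fabricated_kept)

# With q = 0.3 the fraction should land near 0.3 / 1.3, i.e. about 0.23
print(preserved_fraction_true(0.3))
```

The simulation agrees with the algebra, which is all it can do: the substantive assumptions are baked into the simulated process itself.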

A reader might notice that I have used the phrase "seemingly embarrassing", by which I mean "seemingly embarrassing to us". This is evidently required for the argument to work. Consider for instance the assumption P(Pres|∼T, Em) = 1. If Em instead meant that the story was truly embarrassing to the author, this would mean that false stories made up by friendly sources (are there any?) which were truly embarrassing would always be preserved, which is a highly dubious assumption and clearly contrary to Dr. Carrier's argument.

A basic problem in the above line of argument is that there is no way to encode the information that a story was actually embarrassing. We are, effectively, analysing the criterion of embarrassment without having any way to express that a story was embarrassing to the author!

"Embarrassing" therefore becomes effectively synonymous with "embarrassing with a deeper literary meaning" (the reader can try substituting this phrase in the previous sections and notice that the argument becomes more natural), and the analysis boils down to saying that stories with a deeper literary meaning (which also happen to look embarrassing today) are for the most part made up, except a few that are true and happen to have a deeper meaning by accident.

3.4 Adding Embarrassment to the Criterion of Embarrassment

To call something an analysis of the criterion of embarrassment, we need to include enough expressiveness among our variables to capture the basic intuition behind the criterion. I believe the following is minimal:

T, F : The story is true or fabricated

Pres : The story was preserved

Em : The story is seemingly embarrassing (to us)

Tem : The story was truly embarrassing to the author

LP : The story served a literary purpose (we assume ∼Tem = LP)

Notice that Tem means something different than Em: Tem means the story was embarrassing to the one doing the preservation, while Em means it seems embarrassing to us 2000 years later. To put the EC into words: a person would not preserve something that was actually embarrassing which he knew was false, or in symbols:

P(Pres|∼T, Tem) = 0

The following is always true:

P(Pres, T, Tem|Em) = P(Pres|T, Tem, Em)P(T|Tem, Em)P(Tem|Em)

where I have been rather sloppy in the notation and implicitly assumed that variables such as T and Tem can also take the values ∼T and ∼Tem = LP. The next step is to add simplifying assumptions. I am going to assume:

P(Pres|T, Tem, Em) = P(Pres|T, Tem)

P(T|Tem, Em) = P(T|Tem)

The assumptions here say that our (20th century) interpretation of whether a story is embarrassing or not is secondary to whether it was truly embarrassing. Next, let's look at the likelihood terms. I will assume:

P(Pres|F, Tem) = 0

P(Pres|F, LP) = l

P(Pres|T, Tem) = c

P(Pres|T, LP) = 1

The first and last specifications say that an author would never record something truly embarrassing which he knew was false, and that he would always record something he knew was true and which served a literary purpose. The second specification says the author will (with probability l) include stories that are false but nevertheless serve a literary purpose, and the third that he has a certain candor that makes him sometimes (with probability c) include embarrassing stories he knows are true. Turning the Bayesian crank now gives:

P(T|Pres, Em) = [P(Tem|Em)P(T|Tem)c + P(LP|Em)P(T|LP)] / [P(Tem|Em)P(T|Tem)c + P(LP|Em)P(T|LP) + P(F|LP)P(LP|Em)l]
This is a bit of a mess. Let's begin by assuming we are equally good at determining whether a story is truly embarrassing or serves a literary purpose, i.e. P(Tem|Em) = P(LP|Em) = 0.5, and that we know nothing of the (conditional) probability that a story is true/false, e.g. P(T|Tem) = P(T|LP) = 0.5. In this case:

P(T|Pres, Em) = (c + 1)/(c + 1 + l)

We can now try to plug in some limits. If we assume the gospel authors have perfect candor and will always report true stories (c = 1), we get:

P(T|Pres, Em) = 2/(2 + l) ∈ [2/3; 1]

so in this case the criterion of embarrassment actually works. Another case might be where the gospel authors have no candor and will always suppress embarrassing stories, c = 0, in which case

P(T|Pres, Em) = 1/(1 + l) ∈ [1/2; 1]

so the criterion of embarrassment actually works in this limit as well(!). To recover Dr. Carrier's analysis, we need something more. Inspecting the full expression reveals that the easiest thing to assume is something like:

P(T|LP) = q < 1/2

which says that stories serving a literary purpose are likely to be made up. I suppose the value you think q should have depends on how you view Jesus: do you expect him to have lived the sort of life where many of the things he did or said would have a deeper literary purpose afterwards? Your religious views may influence how you judge that question, to put it mildly. At any rate, this leads to the new expression:

P(T|Pres, Em) = (c + q)/(c + q + (1 − q)l).

It is difficult to directly relate this expression to Dr. Carrier's analysis; however, let's assume that a story which serves a literary purpose is preserved with probability 1 whether it is true or fabricated (l = 1), and that a story which is true but truly embarrassing will never be preserved (c = 0). Then we simply obtain

P(T|Pres, Em) = q < 1/2

which is qualitatively consistent with Dr. Carrier's result.
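The full expression and the limits just discussed can be collected in a short sketch (my own reconstruction of the model above; the parameter names mirror the variables defined in the text):

```python
def p_true_given_pres_em(c, l, p_t_lp=0.5, p_tem=0.5, p_t_tem=0.5):
    """P(T|Pres, Em) in the extended model.

    c      = P(Pres|T, Tem), the author's candor
    l      = P(Pres|F, LP), preservation of fabricated literary stories
    p_t_lp = P(T|LP), p_tem = P(Tem|Em), p_t_tem = P(T|Tem).
    P(Pres|T, LP) = 1 and P(Pres|F, Tem) = 0 are built in."""
    p_lp = 1.0 - p_tem
    num = p_tem * p_t_tem * c + p_lp * p_t_lp * 1.0
    den = num + p_lp * (1.0 - p_t_lp) * l
    return num / den

# Perfect candor (c = 1): 2 / (2 + l), so the EC "works".
print(p_true_given_pres_em(1, 0.5))            # 0.8
# No candor (c = 0): 1 / (1 + l), still at least 1/2.
print(p_true_given_pres_em(0, 0.5))            # about 0.667
# c = 0, l = 1, P(T|LP) = q: collapses to q, matching the result above.
print(p_true_given_pres_em(0, 1, p_t_lp=0.3))  # 0.3
```

The sketch makes it easy to see how sensitive the verdict on the EC is to the assumed values of c, l and P(T|LP), which is precisely the point of the discussion that follows.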


3.4.1 Some thorny issues

Dr. Carrier offered one analysis of the EC which indicates that embarrassment lowers the probability that a story is historical; I included a variable that actually allows a story to be embarrassing and got the opposite result. My point is not to demonstrate that one of us is right or wrong, but to motivate some questions I think are problematic in terms of applying Bayes theorem to history:

Do we actually model history: Both Dr. Carrier's and my analysis contained a term like P(Pres|T, x) (with x possibly meaning different things). The model this presumes is something akin to the following: the gospel authors are compiling (or preserving) a set of stories with knowledge of their truth-value and, at least in my case, knowledge of their literary purpose and whether they are embarrassing. However, I think it is uncontroversial to say this is a bad model of how the gospel authors worked. For instance, the gospel authors also made up a good deal of the gospels, changed stories to fit an agenda and so on, and this should also figure in the analysis.

When true is not true: Continuing the above line of thought: with some probability, which we need to estimate, the gospel authors did not know what was true or false per se, because they were writing about events that may have happened 40 years prior. This means that conditioning on a variable T (true) is problematic. "True" more likely means (with some probability at least) that the statement was something being told by the Christian community and believed to be true. This should be included in the analysis.

Where do the stories come from: Continuing the above line of thought,

if the gospel authors had access to a set of stories about Jesus, we need

to ask where they came from. This leads to a secondary application of the

criterion of embarrassment, but with the subtle difference that we know

even less about who the original compilers (or tellers) of these stories

were, what they would find embarrassing, what they actually produced

and so on. This should also be included in the analysis.

Variable sprawl: A basic point is this: if we want to determine how well the

criterion of embarrassment works in a Bayesian fashion, we need to model

the underlying situation with some accuracy. Continuing the above line

of thought would probably result in a good 10-20 (100?) variables that

mean different things and are all relevant to determining if a seemingly

embarrassing story is historical or not. Basically, every time one has a

noun and a "might" or "probably", there is a new variable for the analysis,

and we must include these variables in our analysis. Determining what the

variables actually mean, what their probabilities are and how they (numerically) affect each other is a truly daunting task that scales exponentially

in the number of variables. Is it possible to undertake this project and

expect some accuracy at the end?


Toy models: An alternative view is to undertake the analysis using naive toy

models and argue why large parts of the problem can either be ignored

or approximated by these toy models. This is what both I and Dr. Carrier

have done. This is probably a more fruitful way to approach the problem;

however, since all toy models are going to be wrong (the fact that Dr. Carrier

and I produced exactly opposite results is evidence of this), this raises some

basic questions about how the numerical estimates we get out are connected

to the historical truth of any given proposition under consideration.
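The "variable sprawl" point above can be made concrete with a few lines of Python (a sketch of mine, not anything from the book): a full joint probability table over n binary variables has 2^n entries, so an analysis with even a few dozen variables is already unmanageable without strong simplifying assumptions.

```python
# Number of entries in a full joint probability table over n binary
# variables (one probability per combination of truth values).
def joint_table_entries(n_binary_vars):
    return 2 ** n_binary_vars


for n in (3, 10, 20):
    print(n, joint_table_entries(n))
# 3 variables already need 8 entries; 20 need over a million.
```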

In statistical modelling, or any other science for that matter, whenever one is

postulating a model, no matter how reasonable the assumptions that go into

it may seem, there must be a step where the result is validated in some way, by

predicting a feature of the data which can be checked. I hope the disagreement

between Dr. Carrier's model for the criterion of embarrassment and my proposed

model will convince the reader such measures are required.

How such validations should be carried out is not discussed in Proving History, nor does one get the impression there would be much of a need in the first

place. I will try to illustrate how Proving History treats this issue with two

examples. The first is from chapter six, on resolving expert disagreement, in

which it is discussed at some length how Bayes theorem can be used to make

two parties agree:

The most common disagreements are disagreements as to the

contents of b (background knowledge) or its analysis (the derivation

of estimated frequencies). Knowledge of the validity and mechanics

of Bayes’s Theorem, and of all the relevant evidence and scholarship,

must of course be a component of b (hence the need for meeting those

conditions before proceeding). This basic process of education can

involve making a Bayesian argument, allowing opponents to critique

it (by giving reasons for rejecting its conclusion), then resolving that

critique, then iterating that process until they have no remaining objections (at which time they will realize and understand the validity

and operation of Bayes’s Theorem and the soundness of its application in the present case). So, too, for any other relevant knowledge

although they may also have their own information to impart to you,

which might in fact change your estimates and results, but either way

disagreements are thereby resolved as both parties become equally

informed and negotiate a Bayesian calculation whose premises (the

four basic probability numbers) neither can object to, and therefore

whose conclusion both must accept

A worrying aspect of the above quote is how Dr. Carrier discusses these problems

as having to do with estimating the "four basic probability numbers", by which

I assume he really intends the three numbers p(e|h, b), p(h|b), p(e|∼h, b).

Just to take my toy example from above, there will very clearly be more than

four numbers involved. In fact, the number of values will grow exponentially

in the number of different binary variables (such as T, Em, Tem, etc. in the


above) we attempt to treat in our analysis. I think the pressing issue is not

whether or not two perfectly rational scholars should in principle end up agreeing, but

how we ourselves would know that what we were doing had scientific value, and what

two scholars should do in practice.

The second suggestion in Proving History is a fortiori reasoning. This roughly

means using the largest/smallest plausible values of the probabilities in the analysis

to see what kind of results one may obtain. I think there are ample reasons

to suspect, based on the past example alone, that one can get divergent results

this way. At any rate, such over- or underestimation would not fix the problem

of having the wrong model to begin with, a point the toy example above should

be sufficient to demonstrate.
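As a sketch of what a fortiori reasoning amounts to computationally (my own illustration; all the numbers below are invented): since the posterior is monotone in each of the three inputs, pushing every input to the value least favourable, or most favourable, to the hypothesis brackets the result.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(h|e) from the three probability numbers discussed above."""
    num = prior * p_e_given_h
    return num / (num + (1 - prior) * p_e_given_not_h)


# A fortiori bracketing: the posterior rises with the prior and P(e|h)
# and falls with P(e|~h), so these two evaluations bound it when the
# inputs are only known to lie in the given (invented) ranges.
lower = posterior(prior=0.1, p_e_given_h=0.5, p_e_given_not_h=0.9)
upper = posterior(prior=0.3, p_e_given_h=0.9, p_e_given_not_h=0.5)
print(lower, upper)
```

Note the bracketing only says where the posterior of the assumed model lies; it says nothing about whether the model itself is right, which is exactly the objection above.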

4 The re-interpretation of probability theory

In my reading of the book there were a number of times where I had problems following the discussion, for instance when discussing how to obtain prior

probabilities from frequencies, or the suggestion of a fortiori reasoning. I think

chapter six, "The technical stuff", explains much of this confusion, namely Dr.

Carrier's suggestion for how one can combine the Bayesian and the frequentist

views on probabilities, which is also a main theoretical contribution of the book.

Before I return to some more practical considerations I wish to treat Dr. Carrier's suggestion in more detail.

One of the main purposes of chapter six is to address some philosophical

issues of Bayesian theory. Dr. Carrier introduces the chapter with these words:

Six issues will be taken up here: a bit more on how to resolve

expert disagreements with BT; an explanation of why BT still works

when hypotheses are allowed to make generic rather than exact predictions; the technical question of determining a reference class for

assigning prior probabilities in BT; a discussion of the need to attenuate probability estimates to the outcome of hypothetical models

(or a hypothetically infinite series of runs), rather than deriving estimates solely from actual data sets (and how we can do either); and a

resolution of the epistemological debate between so-called Bayesians

and frequentists, where I’ll show that since all Bayesians are in fact

actually frequentists, there is no reason for frequentists not to be

Bayesians as well. That last may strike those familiar with that

debate as rather cheeky. But I doubt you’ll be so skeptical after having read what I have to say on the matter. That discussion will

end with a resolution of a sixth and final issue: a demonstration of

the actual relationship between physical and epistemic probabilities,

showing how the latter always derive from (and approximate) the

former.

The emphasis marks the claims I will focus on in this review. In reviewing

Dr. Carrier's suggestion, I will not focus so much on the "debate" between frequentists and Bayesians (in my experience it is not something one encounters

very frequently), but rather on Dr. Carrier's proposed interpretation of Bayesian

probabilities. I apologize in advance that the section will be somewhat technical in

places; I have tried to structure it by providing what I consider a "standard"

Bayesian answer (these sections will be marked with an *) to the questions Dr.

Carrier attempts to answer, and then discussing Dr. Carrier's alternative suggestion.

But before I begin I think it is useful to review the standard Bayesian interpretation of the two central terms Dr. Carrier seeks to investigate, namely

probabilities and frequencies. The following continues from the introduction of

Bayes theorem I outlined in the first section. I will refer readers to E. T. Jaynes's

book, which discusses these issues with much more clarity.

4.1 Probabilities and frequencies. The mainstream view*

The mainstream Bayesian view on frequencies and probabilities can be summarized as follows:

Probabilities represent degrees of plausibility. Probabilities therefore refer

to a state of knowledge of a rational agent and are either assigned based on (for

instance) symmetry considerations (the chance a coin comes up heads is 50%

because there are two sides) or derived from other probabilities according to the

rules of probability theory (including Bayes theorem).

Frequencies are factual properties of the real world that we measure or

estimate. For instance, if we count 10 cows in the field and notice 3 are red, the

frequency of red cows is 3/10 = 0.3. This is not a probability. The two

simply refer to completely different things: probabilities change when our state

of knowledge changes; frequencies do not.
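A minimal Python illustration of the distinction (my own example, extending the cow count above): learning something changes the probability while the frequency in the field is untouched.

```python
# A frequency is a fact about the field; a probability is a state of knowledge.
cows = ["red"] * 3 + ["black"] * 4 + ["white"] * 3

freq_red = cows.count("red") / len(cows)  # 0.3, whatever we happen to know

# Before learning anything, the probability a randomly drawn cow is red
# equals the frequency. Learning that the drawn cow is not black updates
# the probability, while the frequency of red cows stays 0.3.
p_red_prior = freq_red
not_black = [c for c in cows if c != "black"]
p_red_given_not_black = not_black.count("red") / len(not_black)
print(p_red_prior, p_red_given_not_black)
```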

With these things in mind, let us focus on Dr. Carrier's definitions of probabilities and frequencies.

4.2 Richard Carrier's proposal

A key point I found confusing is what Dr. Carrier actually means by the word

probability. The word is used from the beginning to the end of the book; however,

an attempt to clarify its meaning is only encountered in chapter 2, right after

stating axiom 4: "Every claim has a nonzero probability of being true or false

(unless it's being true or false is logically impossible)",4 where the following clarification

is given:

...by probability here I mean epistemic probability, which is the

probability that we are correct when affirming a claim is true. Setting aside for now what this means or how they’re related, philosophers have recognized two different kinds of probabilities: physical

and epistemic. A physical probability is the probability that an

4 The axioms and rules are themselves somewhat. If the historical method reduces to

application of Bayes theorem, shouldn’t we rather be interested in the assumptions behind

Bayes theorem?


event x happened. An epistemic probability is the probability

that our belief that x happened is true.

Notice that the definitions of "probability", "epistemic probability" and

"physical probability" themselves rely on the word "probability", which is of

course circular. The definition is revisited in chapter 6 in the section "The

role of hypothetical data in determining probability". The definition (it is hard

to tell if an actual definition is offered) introduces the auxiliary concepts "logical

truths", "empirical truths" and "hypothetical truths". I will confess I found

the chapter very difficult to understand, and I will therefore provide quotations

before giving my own impression of the various definitions and arguments, such

that the reader can form his own opinion.

What are probabilities really probabilities of? Mathematicians

and philosophers have long debated the question. Suppose we have

a die with four sides (a tetrahedron), its geometry is perfect, and we

toss it in a perfectly randomizing way. From the stated facts we can

predict that it has a 1 in 4 chance of coming up a 4 based on the

geometry of the die, the laws of physics, and the previously proven

randomizing effects of the way it will be tossed (and where). This

could even be demonstrated with a deductive syllogism (such that

from the stated premises, the conclusion necessarily follows). Yet

this is still a physical probability. So in principle we can connect

logical truths with empirical truths. The difference is that empirically we don’t always know what all the premises are, or when or

whether they apply (e.g. no die’s geometry is ever perfect; we don’t

know if the die-thrower may have arranged a scheme to cheat; and

countless other things we might never think of). That’s why we

can’t prove facts from the armchair.

From this, it seems the "logical truth" is the observation that a perfectly random

throw of a perfect die with four sides will come up 4 exactly 1/4 of the time.

Dr. Carrier notes this probability is connected to the "physical probability", by

which I believe is meant how a concrete die will behave. While it is clearly true

that the two things must be connected in some way, the entire point must be how the

two are connected. In the following section Dr. Carrier (correctly) identifies this

connection as having to do with our lack of knowledge. The text then continues:

Thus we go from logical truths to empirical truths. But we have

to go even further, from empirical truths to hypothetical truths. The

frequency with which that four-sided die turns up a 4 can be deduced

logically when the premises can all be ascertained to be true, or near

enough that the deviations don’t matter (...), yet ascertained still

means empirically, which means adducing a hypothesis and testing

it against the evidence, admitting all the while that no test can

leave us absolutely certain. And when these premises can’t be thus

ascertained, all we have left is to just empirically test the die: roll

it a bunch of times and see what the frequency of rolling 4 is. Yet

that method is actually less accurate. We can prove mathematically

that because of random fluctuations the observed frequency usually

won’t reflect the actual probability. For example, if we roll the die

four times and it comes up 4 every time, we cannot conclude the

probability that this die will roll a 4 on the next toss is 100% (or

even 71%, which is roughly the probability that can be deduced if

we don’t assume the other facts in evidence). That’s because if the

probability really is 1 in 4, then there is roughly a 0.4% chance you'll

see a straight run of four 4's (mathematically: 0.25^4 = 0.00390625)

I believe the above discussion can be summarized as follows: suppose we have

an idealized die with four sides which we roll in an idealized way. The chance it will

come up 4 is (exactly) 0.25. This is what Dr. Carrier calls a hypothetical truth.

However, since the die has minute random imperfections, the real chance it will

come up 4 is slightly different, perhaps 0.256. This is the physical probability.

The reason why these two numbers are different is that we are unaware of

the small imperfections in the die. Now, if we roll an actual die a number of

times, say 4, and compute the frequency of times the die comes up 4 relative to the

total number of rolls, we will get a third number which probably will not be

any of the above. In fact, the fluctuations being discussed are exactly

distributed according to the previously introduced expression, viz.:

P(n rolls | N rolls) = C(N, n) p^n (1 − p)^(N−n), with p = 0.25,

where C(N, n) is the binomial coefficient.
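These fluctuations are easy to compute; here is a small Python sketch of the binomial expression above (my own illustration):

```python
from math import comb


def binom_pmf(n, N, p=0.25):
    """P(n rolls show a 4 out of N rolls) for an idealized four-sided die."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


# The run of four 4's discussed in the quote:
print(binom_pmf(4, 4))  # 0.25**4 = 0.00390625
# Even 1000 rolls only rarely give exactly 250 fours:
print(binom_pmf(250, 1000))
```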

While there are a few minor points about the way the problem is laid out

(for instance the use of the word "chance" is problematic; how is that defined

without reference to probabilities?) and the terminology, the problem raised

above, namely how these three numbers are related, is the central one. We will

now turn to Dr. Carrier's proposal; the discussion continues as follows:

Even a thousand tosses of an absolutely perfect four-sided die

will not generate a perfect count of 250 4’s (except but rarely). The

equivalent of absolutely perfect randomizer do exist in quantum mechanics. An experiment involving an electron apparatus could be

constructed by a competent physicist that gave a perfect 1 in 4 decision every time. Yet even that would not always generate 250 hits

every 1,000 runs. Random variation will frequently tilt the results

slightly one way or another. Thus, you cannot derive the actual

frequency from the data alone. For example, using the hypothetical

electron experiment, we might get 256 hits after 1,000 runs. Yet we

would be wrong if we concluded the probability of getting a hit the

next time around was 0.256. That probability would still be 0.250.

We could show this by running the experiment several times

again. Not only would we get a different result on some of those

new runs (thus proving the first result should not have been so concretely trusted), but when we combined all these data sets, odds are

the result would converge even more closely on 0.250. In fact you

can graph this like an approach vector over many experiments and

see an inevitable curve, whose shape can be quantified by mathematical calculus, which deductively entails that that curve ends (when

extended out to infinity) right at 0.250. Calculus was invented for

exactly those kinds of tasks, summing up an infinite number of cases,

and defining a curve that can be iterated indefinitely, so we can see

where it goes without actually having to draw it (and thus we can

count up infinite sums in finite time).

The last paragraph verges on gobbledygook, using technical words in a manner that

is both unclear and very hard to recognize. The proposal seems to be that if we

carry out the idealized experiment for a sufficiently long time, the observed

frequency will converge towards 0.25. A reader who is unfamiliar with this

result should keep in mind that a formal statement of it (from the setup

I assume it is the weak law of large numbers Dr. Carrier has in mind) contains

the somewhat technical clause "...will converge with probability one...", so

if one is using such an argument to later define probability there is again an

issue of circularity. Directly following the above paragraph is this:

Clearly, from established theory, when working with the imagined quantum tabletop experiment we should conclude the frequency

of hits is 0.25, even though we will almost never have an actual data

set that exhibits exactly that frequency. Hence we must conclude

that that hypothetical frequency is more accurate than any actual

frequency will be. After all, either the true frequency is the observed

frequency or the hypothesized frequency; due to the deductive logic

of random variation you know the observed frequency is almost never

exactly the true frequency (the probability that it is is always ≤ 0.5,

and in fact approaches 0 as the odds deviate from even and the

number of runs increases); given any well-founded hypothesis you

will know the probability that the hypothesized frequency is the true

frequency is > 0.5 (and often ≫ 0.5), and certainly not → 0); therefore P (THE HYPOTHESIZED FREQUENCY IS THE TRUE FREQUENCY)

> P (THE OBSERVED FREQUENCY IS THE TRUE FREQUENCY); in fact,

quite often P (HYPOTHESIZED) ≫ P (OBSERVED). So the same is true

in every case, including the four-sided die, and anything else we are

measuring the frequency of. Deductive argument from empirically

established premises thus produces more accurate estimates of probability.

The main philosophical "charge" (if you will) leveled by Bayesian statisticians

against frequentists is that a frequentist view tends to require thought-experiments

in idealized situations that are run to infinity, and I will just note we are now

having an imagined quantum tabletop experiment where we assume we know the

limit frequency is 0.25 (no concrete experiment I can think of would behave

like that, and no experiment can be run to the limit of infinity). The typical


Bayesian objection is that while we are free to think of this idealized situation as

a thought-experiment, it is quite different from, e.g., the situation where we consider

the probability a corpse was stolen from a grave. Again I will refer to Jaynes's book

for a deeper treatment of the problems that arise, and again only note that Carrier

does not discuss them at all.

However, Dr. Carrier also introduces some novel problems in his discussion.

Consider the statement: "After all, either the true frequency is the observed

frequency or the hypothesized frequency". But clearly this is false. Suppose I

hypothesize that the so-called hypothesized frequency of the die coming up 4

is 0.25. I then roll the die 10 times and get an observed frequency (in Bayesian

terms, the frequency) of 3/10. However, both of these values are going to be

wrong, because the microscopic imperfections in the die are going to mean

it has a different "true frequency" (in Dr. Carrier's language) than either

0.25 or 0.3, simply because there are an infinite number of other candidate

true frequencies. The statement is therefore in any practical situation a false

dilemma; regarding the inequalities, what would happen is that both

sides would tend towards zero, in direct contradiction to what Dr. Carrier writes

(because, again staying in the frequentist language, the true frequency is with

probability 1 something else), and depending on the situation the inequality

could go either way. The argument is simply false.
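A small simulation (my own sketch, not from the book) makes the point concrete: the observed frequency is essentially never exactly 0.25, though it drifts towards 0.25 as the number of rolls grows, and the formal statement of that convergence is itself probabilistic.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible


def observed_frequency(n_rolls, p=0.25):
    """Fraction of simulated rolls of an ideal four-sided die that show a 4."""
    hits = sum(random.random() < p for _ in range(n_rolls))
    return hits / n_rolls


# The weak law of large numbers says this converges to 0.25 "with
# probability one" - the very clause that makes it circular to use
# such convergence to *define* probability.
for n in (10, 1000, 100000):
    print(n, observed_frequency(n))
```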

Finally, and this is a recurrent theme, it is very hard to tell what has actually

been defined. I have carefully gone through the chapter, and the above quotation

is the first time the term "hypothetical frequency" is used. But what exactly

does it mean? The closest to a definition comes shortly later in chapter six: "Thus

we must instead rely on hypothetical frequencies, that is, frequencies that are

generated by hypothesis using the data available which data includes not just

the frequency data (from which we can project an observed trend to a limit of

infinite runs), but also the physical data regarding the system that's generating

that frequency (like the shape and weight distribution of a die)." What I think

is intended here is that the "hypothetical frequency" represents our best guess

at what will happen with the die (or quantum tabletop experiment) if we roll

it in the future, given our knowledge of the geometry of the die and past

rolls. In Bayesian terms, we would call this the probability.

Having introduced observed and hypothetical frequencies, we can now begin

to make headway towards defining probabilities; unfortunately this is done in a

very indirect manner:

...that hypothetical frequencies are more accurate than observed

frequencies, should not surprise anyone. ... if we take care to manufacture a very good four-sided die and take pains to use methods of

tossing it that have been proven to randomize well, we don’t need

to roll it even once to know that the hypothetical frequency of this

die rolling 4’s is as near to 0.25 as we need it to be. (...) Thus

it’s not valid to argue that because hypothetical frequencies are not

actual data, and since all we have are actual data, we should only

derive our frequencies from the latter. All probability estimates (even

of the very fuzzy kind historians must make, such as occasioned in

chapters 3 through 5) are attempted approximations of the true frequencies (as I’ll further explain in the next and last section of this

chapter, starting on page 265). So that’s what we’re doing when we

subjectively assign probabilities, attempting to predict and thus approximate the true frequencies, which we can only approximate from

the finite data available because those data do not reflect the true

frequency of anything (...). Thus we must instead rely on hypothetical frequencies, that is, frequencies that are generated by hypothesis

using the data available which data includes not just the frequency

data (from which we can project an observed trend to a limit of

infinite runs), but also the physical data regarding the system that’s

generating that frequency (like the shape and weight distribution of

a die). Of course, when we have a lot of good data, the observed and

hypothetical frequencies will usually be close enough as to make no

difference. [my italic]

The question I started out with was this: what is a probability in Proving

History? To the best of my knowledge, probability is being equated with hypothetical frequencies; however, this suggestion is definitely non-Bayesian and

is plagued by all the problems Bayesians have been raising for nearly a century,

starting with Dr. Carrier's main technical reference for Bayes theorem, namely

Jaynes's book.

The first thing to notice is that the discussion above is entirely focused on dice and

quantum tabletop experiments, that is, experiments which we can easily imagine

being carried out over and over again. However, these setups are very different from

the ones we are really interested in, namely probabilities of historical events

that perhaps only happened once. To give a concrete example of this difficulty,

consider the following proposition:

A : ”I believe with probability 0.8 that the 8th digit of π is a nine”

In a Bayesian view, the term "with probability 0.8" refers to a state of knowledge

of π, and thus requires no auxiliary considerations; it simply reflects me thinking

the 8th digit is probably a nine while not being certain.

However, in the interpretation above, when we assign a probability of 0.8 to

the statement, then (to quote): "what we're doing when we subjectively assign

probabilities, [is] attempting to predict and thus approximate the true frequencies,

which we can only approximate from the finite data available". But what is the

true frequency of the 8th digit of π being a 9? Why should we think there is

such a thing? How would we set out to prove it exists? What is the true value

of the true frequency? The basic reason why these questions are hard to answer

is this: either it is or it is not a nine, and the reason I am uncertain reflects

only my lack of knowledge. A Bayesian treatment gives a direct analysis of this

situation; an attempt to connect it to a quantum tabletop experiment does not.

The situation is analogous for history. Consider for instance the probability that

Caesar crossed the Rubicon, or that a miracle was made up and attributed to a

first-century miracle worker. The notion of "true frequency" in these situations

becomes very hard to define; however, if we accept that probability simply refers to our

degree of belief, there is no need for such thought experiments.

5 The connection between frequencies and probabilities

The last section of chapter six offers a main philosophical point of the book,

namely a combination of the frequentist and Bayesian views of probabilities. This is

done by re-interpreting what is meant by Bayesian probabilities. The section

opens thus:

Probability is obviously a measure of frequency. If we say 20%

of Americans smoke, we mean 1 in 5 Americans smoke, or in other

words, if there are 300 million Americans, 60 million Americans

smoke. When weathermen tell us there is a 20% chance of rain

during the coming daylight hours, they mean either that it will rain

over one-fifth of the region for which the prediction was made (i.e., if

that region contains a thousand acres, rain will fall on a total of two

hundred of those acres before nightfall) or that when comparing all

past days for which the same meteorological indicators were present

as are present for this current day we would find that rain occurred

on one out of five of those days (i.e., if we find one hundred such days

in the record books, twenty of them were days on which it rained).

Speaking of bold assertions, consider the first line: "Probability is obviously a

measure of frequency". The basic problem is this: if this is obvious, how come

Bayesians have failed to see the obvious for 50 years and insisted on probability

being rational degrees of belief, i.e., a state of knowledge? If it is obvious,

how come the main technical reference, Jaynes's book, dedicates entire chapters

to arguing against this misconception?

What is of course obvious is that one can go from probabilities to frequencies,

as I have already illustrated with the example of the coin, but in that case

the implication goes the other way: if the probability is defined in a situation

where there is a well-defined experiment, such as with a coin, one can make

probabilistic predictions about its frequency using Bayesian methods.

What is frustrating is that Dr. Carrier's examples illustrate this well. For instance,

if I am the weatherman and I say I believe it will rain tomorrow with probability

0.2, what I mean is most definitely not what Dr. Carrier says, "it will rain

over one-fifth of the region". Think of how variable the weather is and how

nonsensical that statement is if you take it at face value! In fact, I would

be almost certain that it might rain over either 1/10 or 1/2 or 1/3 or some

other fraction of the region. What I am trying to convey is that I have a lack of

knowledge of whether or not it will rain tomorrow, and my models and data (and

possibly Bayes theorem) allow me to quantify this as being 0.2, full stop, no

further thought-experiments required!


The section continues directly:

Those are all physical probabilities. But what about epistemic

probabilities? As it happens, those are physical probabilities, too.

They just measure something else: the frequency with which beliefs

are true. Hence all Bayesians are in fact frequentists (and as this

book has suggested, all frequentists should be Bayesians). When

Bayesians talk about probability as a degree of certainty that h is

true, they are just talking about the frequency of a different thing

than days of rain or number of smokers. They are talking about

the frequency with which beliefs of a given type are true, where

of a given type means backed by the kind of evidence and data

that produces those kinds of prior and consequent probabilities. For

example, if I say I am 95% certain h is true, I am saying that of all

the things I believe that I believe on the same strength and type of

evidence as I have for h, 1 in 20 of those beliefs will nevertheless still

be false (...). Probability can be expressed in fractions or percentile

notation, but either is still a ratio, and all ratios by definition entail a

relation between two values, and those values must be meaningful for

a probability to be meaningful. For Bayesians, those two values are

beliefs that are true and all beliefs backed by a certain comparable

quantity and quality of evidence, which values I’ll call T and Q. T

is always a subset of Q, and Bayesians are always in effect saying

that when we gather together and examine every belief in Q, we’ll

find that n number of them are T , giving us a ratio, nt /nq , which is

the epistemic probability that any belief selected randomly from Q

will be true

The good news about the proposal is that it is relatively clearly stated; the bad

news is that it is both unnecessary and defective. That the definition is defective is

probably best illustrated with a small puzzle: suppose I have a coin of which

I know that if I flip it two times (independently), the chance it will come up heads

both times is 1/2. What is the probability it will come up heads if I flip it once?

The problem is easy to solve: P(HH) = P(H)P(H) = 1/2, and so P(H) =

1/√2. Now, the problem is that 1/√2 cannot be represented as a fraction of two

integers, so when Dr. Carrier writes "Probability can be expressed in fractions

or percentile notation, but either is still a ratio, and all ratios by definition

entail a relation between two values, and those values must be meaningful for a

probability to be meaningful", and then goes on to define the probability in terms

of fractions of integers (see the quotation above), he is exactly excluding the

above case.

It goes without saying that the coin does not, and should not, pose a problem from either a Bayesian or a frequentist perspective.
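The arithmetic of the puzzle is easy to check numerically; here is a minimal sketch (in Python, which the book does not use), with the numbers taken from the coin example above:

```python
import math
from fractions import Fraction

# The puzzle: P(HH) = P(H) * P(H) = 1/2, hence P(H) = 1/sqrt(2).
p_heads = 1 / math.sqrt(2)
assert abs(p_heads ** 2 - 0.5) < 1e-12

# sqrt(2) is irrational, so no ratio of integer counts n_t/n_q can
# square to exactly 1/2; spot-check all fractions with denominator < 200.
assert all(Fraction(n, q) ** 2 != Fraction(1, 2)
           for q in range(1, 200) for n in range(1, q + 1))
```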

There are two ways to avoid the problem. One is to say we simply don't care about the coin because it's a stupid example; in my opinion, that is just admitting the proposed definition does not work. The other is to say the above discussion only applies to epistemic probabilities and the coin's probability is something else which we have not defined. The problem is that this would create absurdities, because I could then change the type of probability from epistemic to "that something else" by considering a new system that involves the coin at some point.

I think this basic example is fatal to obtaining a general and consistent theory out of Dr. Carrier's proposal, but to avoid charges of rejecting a good idea because of some mathematical trickery which can perhaps be fixed, I want to point out some other, more serious ailments of the proposal, of which the coin example is only a symptom.

Let's simply try to imagine how the proposal could be implemented. Suppose I consider the statement: "I will get an email between 16.00 and 17.00 today". Let's say that after thinking about this as carefully as I can, possibly using Bayes theorem, I arrive at a probability of 0.843 of that statement being true. Now, to implement the above definition, I think very carefully about all I know and, though I cannot at the moment tell how I would arrive at this conclusion, I realize I know exactly 3 other things on "the same type and strength of evidence" as was the case for the email, giving n_q = 4. I now need to compute n_t, namely the number of those beliefs that are true. A basic problem is that I would not know how to do this, because I do not know which of these beliefs are true or not, so I suppose I should imagine I have access to an oracle that knows the real truth.

At any rate, even without the oracle, n_t can only take the values 0, 1, 2, 3 and 4. This gives 5 possible epistemic probabilities, n_t/n_q = 0, 1/4, 1/2, 3/4, 1, none of which is 0.843. So does this mean I did not really believe the statement at probability 0.843? In that case, with what probability do I believe the statement? Does it mean the probabilities available to us are limited by how many things we know? If taken at face value, the proposal seems entirely flawed.
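The counting can be made explicit; a small sketch using the hypothetical numbers above (n_q = 4 comparable beliefs, a credence of 0.843):

```python
from fractions import Fraction

# Under the count-ratio definition, with n_q = 4 comparable beliefs
# the only attainable epistemic probabilities are n_t / 4.
n_q = 4
attainable = [Fraction(n_t, n_q) for n_t in range(n_q + 1)]
print([float(p) for p in attainable])  # [0.0, 0.25, 0.5, 0.75, 1.0]

# The carefully reasoned credence 0.843 is not among them.
assert Fraction(843, 1000) not in attainable
```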

To counter any claim that I am quoting Dr. Carrier out of context, note that the proposal is summarized later in the section as follows:

So when you say you are only about 75% sure you'll win a particular hand of poker, you are saying that of all the beliefs you have that are based on the same physical probabilities available to you in this case, 1 in 4 of them will be false without your knowing it, and since this particular belief could be one of those four, you will act accordingly. So when Bayesians argue that probabilities in BT represent estimates of personal confidence and not actual frequencies, they are simply wrong. Because an estimate of personal confidence is still a frequency: the frequency with which beliefs based on that kind of evidence turn out to be true (or false). As Faris says of Jaynes (who in life was a prominent Bayesian), "Jaynes considers the frequency interpretation of probability as far too limiting. Instead, probability should be interpreted as an indication of a state of knowledge or strength of evidence or amount of information within the context of inductive reasoning." But an indication of a state of knowledge is a frequency: the frequency with which beliefs in that state will actually be true, such that a 0.9 means 1 out of every 10 beliefs achieving that state of knowledge will actually be false (so of all the beliefs you have that are in that state, 1 in 10 are false, you just won't know which ones). This is true all the way down the line.

If anything, I think this write-up is even more muddled. To take the poker example: it is not the case that 1 in 4 of the things I believe with probability 0.75 will be false. Why should they be? It might turn out that everything I believe with probability 0.75 is true.
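This can be illustrated with a small simulation (a sketch; the 4 beliefs and the 0.75 are just the poker-style numbers from the quotation): if four independent claims are each true with probability 0.75, there is a roughly 32% chance that all four turn out true, so the realized frequency of false beliefs is in no way guaranteed to be 1 in 4.

```python
import random

random.seed(0)  # reproducible
trials = 100_000

# Count how often all four independent 0.75-probability beliefs are true.
all_true = sum(
    all(random.random() < 0.75 for _ in range(4))
    for _ in range(trials)
)

# Analytically, P(all four true) = 0.75**4 ≈ 0.316; the simulation agrees.
assert abs(all_true / trials - 0.75 ** 4) < 0.01
```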

Aside from the above flaws, the poker example suffers from all the other flaws I previously discussed. Suppose I know exactly 4 things with probability 0.75: that the 6th digit of π is 3, that the Brazilians speak Brazilian, that there are 52 states in the USA, and that Adam and Eve really lived; however, these things will all be false! For that reasoner, the frequency with which beliefs based on that type of evidence turn out to be false is 1. This is no problem if we use probability to refer to a state of knowledge, as Jaynes does, but it is a problem if we want to root it in what is actually the case, as Dr. Carrier suggests. Again, there is absolutely nothing novel about the points raised here; they can all be found in Jaynes's book.

One might attempt to rescue the proposal as follows. Suppose one says: "I did not intend to say 'a [probability of] 0.9 means 1 out of every 10 beliefs achieving that state of knowledge will actually be false'. I merely meant that the average (or expected value) of n_t/n_q is 0.9." The problem with such a definition is that it is almost inherently circular, since the average is computed using the probability, and so cannot be used to define probabilities. A source of confusion is that we can make probabilistic statements about n_t, but doing so requires that we already have a theory of probabilities. What is that theory? If it is frequentist, we need to consider why it should apply to statements about, e.g., Jesus. If it is Bayesian, well, there is your theory; there is no reason to force an ad-hoc layer of interpretation on top of it.

5.1 There is good weather at infinity

That the proposal is flawed simply by virtue of not allowing one to represent a probability of 0.843 if one only knows 3 other things at that confidence (and imagine how unfortunate we would be if we only knew 1 such thing...), or a probability of 1/√2, makes me suspect Dr. Carrier had intended some sort of limit statement, that is, using infinities in some way.

A basic problem with using infinities is that the things we consider are not infinite. If we have two interpretations for assigning probabilities in the case of 3 coins and the existence of 1 Jesus, and the first only requires us to consider 3 coins and one Jesus while the second requires us to consider an infinite number of coins and Jesuses, I think there is ample reason to suspect the first is the more fundamental, for the sole reason that there was at most one Jesus.

Nevertheless, I will briefly mention 3 ways one might attempt to "fix" the proposal by appeal to infinities, and simply note that there is no need for any similar ad-hockery in a Bayesian interpretation.

The first is to propose that we always know an infinite number of things at any given probability. I think this proposal can be rejected on the grounds that it is blatantly false.

The second proposal is somewhat related to the first: to make sense of any given probability of (say) 0.8, we immediately imagine an infinite number of coin flips with biased coins that come up heads with probability 0.8, and define probability from this. I suspect it is hard to define this in a non-circular fashion (keep in mind "random" must be defined in this context without using probability), but a worse problem is that the chance of the event happening in the real world is irrelevant to the definition, since the limit will be entirely dominated by the infinite number of hypothetical coins. Thus, the proposal has no normative power. Finally, the proposal seems to be simply a fancy way of arriving at the number 0.8: does it effectively differ from saying a probability of 0.8 is taking a cake and dividing it in such a way that one part is in a ratio of 0.8 to the total, and that is the definition of probability? Put briefly, I don't see how the proposal has a normative effect on how probabilities are used.

The third proposal goes deeper into frequentist land: imagine an infinite number of worlds in which we believe things at a probability of 0.8, and define the probability as how often the things believed at probability 0.8 turn out to be true in these worlds. This is basically the frequentist definition of probabilities; it contains all the circularity and fanciful reasoning Bayesians usually object to, and it has led frequentists themselves to object to the idea that we can assign probability to things like Jesus rising from the dead. For instance, what does the infinite number of worlds where the 6th digit of π is 3 look like?

5.2 The Bayesian/frequentist divide is not only about probabilities

Finally, I am not sure how the division between frequentists and Bayesians is resolved even if the proposal works. The division involves questions such as whether data are fixed and parameters variable, or data variable and parameters fixed. It involves frequentists objecting to applying Bayes theorem to things like those considered by Dr. Carrier, and it involves (at least some) Bayesians rejecting frequentist methods such as confidence intervals and t-tests as blatant ad-hockery that should go the way of the Dodo. I simply do not see how adding a layer of frequencies on top of Bayesian language affects the differences of opinion on these issues.

5.3 The big picture

Why should we accept Bayes theorem and its application to questions like whether the book of Matthew was written by Matthew? If we do, it must be because of a rigorous argument. I believe that, e.g., Cox and Jaynes provide such arguments, and it seems Dr. Carrier believes so as well; recall from chapter three:

The theorem was discovered in the late eighteenth century and has since been formally proved, mathematically and logically, so we now know its conclusions are always necessarily true if its premises are true.

Though the claim is surprisingly not given a reference, Carrier himself suggests exactly Jaynes. But I think it is evident that Dr. Carrier is in opposition to most of Jaynes's philosophical points and assumptions from the first chapter to the last. For instance, Dr. Carrier advances several different notions of probability (probability, physical probability, epistemic probability, hypothesized probability) and of frequencies. Suppose all of these are equivalent to what Jaynes calls probability and frequency; in that case, why confuse the language rather than simply talk about probability and frequency?

The most logical conclusion, which I repeat I think is very evident simply from noting the differences of opinion documented above, is that Dr. Carrier is in opposition to Jaynes and, by extension, Cox and most other Bayesian thinkers of the 20th century. In that case, why should we think Bayes theorem holds? How do we set out to prove it? Simply pointing to the Kolmogorov axioms won't cut it: sure, they give us a mathematical theory of probabilities, but why suppose it applies to historical questions any more than the theory of matrices does?

The alternative is that Dr. Carrier is in agreement with, e.g., Jaynes and Cox, and I have just been too sloppy to see it; for instance, the re-interpretation of epistemic probabilities as frequencies is really just something added on top of the Bayesian framework. Well, if it is just something we add, and it has no normative effect on our calculations, I think Laplace's reply is in order: "[No, Sire,] I had no need of that hypothesis".

6 Priors

The problem with interpreting probabilities as frequencies is, in my opinion, reflected throughout the book, for instance when Dr. Carrier proposes how one should arrive at priors from frequencies. The problem can be summarized as follows: suppose you want to assign a prior probability to some event E, and you observe E happening n times out of N. What is the prior probability p(E)? For probability to have quantitative applicability to history, it is crucial to arrive at objective ways of specifying prior probabilities. For instance, in the example of the Criterion of Embarrassment we must be able to estimate numbers such as P(P) (the probability a gospel is preserved) or P(Em) (the probability a story is embarrassing). Without such machinery, Bayes theorem will just be a consistency requirement without the ability to provide quantitative results. To give a concrete example of how Dr. Carrier treats this problem, consider the following from chapter six, in the section on determining a reference class:

If our city is determined to have had that special patronage, and our data shows 67 out of 67 cities with their patronage have public libraries, then the prior probability our new city did as well will now be over 98%.

Laplace's rule of succession is invoked here to arrive at the figure 98%, as it often is throughout the book, but without any consideration of where it comes from or whether its specific assumptions are fulfilled. In fact, one would not get the impression from reading the book that Laplace's rule is a Bayesian method at all, but I digress.
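For reference, Laplace's rule of succession assigns probability (n+1)/(N+2) to the next success after observing n successes in N trials; it is the posterior predictive probability of a Bernoulli model under a uniform prior. A quick sketch reproduces the book's figure:

```python
from fractions import Fraction

def rule_of_succession(successes, trials):
    """Laplace's rule: probability the next trial succeeds, given
    `successes` out of `trials`, under a uniform (Beta(1,1)) prior."""
    return Fraction(successes + 1, trials + 2)

# 67 out of 67 observed cities with patronage had a public library:
p = rule_of_succession(67, 67)
print(float(p))  # 68/69 ≈ 0.9855, i.e. "over 98%"
assert float(p) > 0.98
```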

Now consider the following example of a more elaborate problem on libraries

in two provinces:

Figure 1: Venn diagram from Proving History

To illustrate this, the libraries scenario can be represented with this Venn diagram [see figure 1]. In this example, P(LIBRARY|RC) = 0.80, P(LIBRARY|IT) = 0.90, and P(LIBRARY|NP) = 0.20.

What's unknown is P(LIBRARY|C), the frequency of libraries at the conjunction of all three sets. If we use the shortcut of assigning P(LIBRARY|C) the value of P(LIBRARY|NP) < P(LIBRARY|C) < P(LIBRARY|IT), that is, P(LIBRARY|C) can be any value from P(LIBRARY|NP) to P(LIBRARY|IT), then the first concern is how likely it is that P(LIBRARY|C) might actually be less than P(LIBRARY|NP), or more than P(LIBRARY|IT), and the second concern is whether we can instead narrow the range. Given that we know Seguntium lacked special patronage, in order for P(LIBRARY|C) < P(LIBRARY|NP), there have to be regionally pervasive differences in the means and motives of veteran settlers in Italy - enough to make a significant difference from veteran settlers in the rest of the Roman empire. And indeed, on the other side of the equation, for P(LIBRARY|C) > P(LIBRARY|IT) these deviations would have to be remarkably extreme, not only because P(LIBRARY|IT) > P(LIBRARY|RC), but also because P(LIBRARY|RC) is already >> P(LIBRARY|NP), which to overcome requires something extremely unusual. Lacking evidence of such differences, we must assume there are none until we know otherwise, and even becoming aware of such differences, we must only allow those differences to have realistic effects (e.g., evidence of a small difference in conditions cannot normally warrant a huge difference in outcome; and if you propose something abnormal, you have to argue for it from pertinent evidence, which all constitutes attending to the contents of b and its conditional effect on probabilities in BT). However, we would have to say all the same for P(LIBRARY|C) > P(LIBRARY|NP), since we have no more evidence that P(LIBRARY|C) is anything other than exactly P(LIBRARY|NP). All we have is the fact that P(LIBRARY|IT) is higher than P(LIBRARY|RC), but that in itself does not even suggest an increase in P(LIBRARY|NP), and certainly not much of an increase. Thus P(LIBRARY|NP) < P(LIBRARY|C) < P(LIBRARY|IT) introduces far more ambiguity than the facts warrant. There is every reason to believe P(LIBRARY|C) ≈ P(LIBRARY|NP) and no reason to believe being in Italy makes that much of a difference, especially as P(LIBRARY|IT) is only slightly greater than P(LIBRARY|RC), which does suggest only a small rather than a large difference between Italy and the rest of the empire, and likewise we should expect the large disparity between P(LIBRARY|NP) and P(LIBRARY|RC) to be preserved between P(LIBRARY|C) and P(LIBRARY|IT), as the causes producing the first disparity should be similarly operating to produce the second, unless, again, we have evidence otherwise. In short, NP appears to be far more relevant a reference class than IT in this case and should be preferred until we know otherwise. And if we also use a fortiori values (setting the probability at, say, 10-30%), we will almost certainly be right to a high degree of probability. All this constitutes a more complex application of the rule of greater knowledge. When you have competing reference classes entailing a higher and a lower prior, if you have no information indicating one prior is closer to the actual (but unknown) prior, then you must accept a margin of error encompassing both, but when you have information indicating the actual prior is most probably nearer to one than the other, you must conclude that it is (because, so far as you know, it is). In short, we can already conclude that it's so unlikely that P(LIBRARY|C) deviates by any significant amount from P(LIBRARY|NP) that we must conclude, more probably than not, P(LIBRARY|C) ≈ P(LIBRARY|NP), regardless of the difference between P(LIBRARY|IT) and P(LIBRARY|RC). And as in this case, so in many others you'll encounter.

I will admit that after six readings I am still not quite certain what exactly is being argued above; put generously, I think it is another example of the book's sometimes less-than-lucid style of writing. The reason the argument is hard to follow is that the problem is underdetermined, meaning one has to add additional assumptions to get a definite result.⁵

What is perhaps surprising is that Bayes theorem was not invoked, according to which the analysis is actually fairly straightforward. All probabilities here are conditioned on RC. By Bayes theorem:

P(L|NP, IT) = P(NP, IT|L) P(L) / P(NP, IT)

Thus the argument is actually fairly simple: since almost all provinces that are NP do not have a library, P(L|NP) = 0.2, and almost all provinces that are IT do, P(L|IT) = 0.9, it follows that, all things being equal, if a province has a library there is less chance it is both NP and IT at the same time than otherwise. For instance, if we assume the distribution factorizes, P(NP, IT) = P(NP)P(IT) and P(NP, IT|L) = P(NP|L)P(IT|L), then applying Bayes theorem two more times gives:

P(L|NP, IT) = P(NP, IT|L) P(L) / P(NP, IT) = P(L|NP) P(L|IT) / P(L) = (0.2 × 0.9) / 0.8 ≈ 0.22,

which is in agreement with the discussion above.
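The arithmetic can be checked directly; a minimal sketch under the factorization assumption above, with the probabilities read off the book's Venn diagram:

```python
# Probabilities from the libraries example: P(L|NP) = 0.2,
# P(L|IT) = 0.9, and the overall P(L) = P(L|RC) = 0.8.
p_L_given_NP = 0.2
p_L_given_IT = 0.9
p_L = 0.8

# Assuming NP and IT are independent both marginally and given L,
# Bayes theorem gives P(L|NP, IT) = P(L|NP) * P(L|IT) / P(L).
p_L_given_both = p_L_given_NP * p_L_given_IT / p_L
print(round(p_L_given_both, 3))  # 0.225, i.e. ≈ 0.22
assert abs(p_L_given_both - 0.225) < 1e-12
```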

In reality, one should of course never attempt such an argument. Clearly the relevant piece of information is the number of libraries in the provinces, in addition to a number of other things we would know, and we should not simply assume independence or use some other ad hoc hand-waving argument to get a prior. It is difficult to say what one should actually do. My first advice to students would be to come up with a way to validate that whatever method they devised actually worked, but this is evidently very difficult to do for historical problems. Since any computation using Bayes theorem relies on the estimation of many such probabilities, it goes without saying this is an important difficulty.

⁵ If the problem was actually the way it is stated, one should apply the method of maximum entropy (again, see Jaynes).

7 Conclusion

Dr. Carrier attempts to apply Bayes theorem to problems in history, specifically the existence of Jesus. I emphasize that I think this is an interesting idea, and while I am uncertain whether we will find Jesus (or not) at the end, I am sure Dr. Carrier can get something interesting out of the endeavour. Furthermore, the sections of the book which discuss history are both entertaining and informative. However,

the book should not be evaluated as a treatment of history but as advocating the use of Bayes theorem for advancing historical Jesus studies. For this advocacy to be successful, two conditions must be met:

• The main problem holding back Jesus studies is one of inference, not, for instance, forming hypotheses, discerning how a priori plausible ideas are, or psychology.

• We can easily pose the inference problems in a numerical format suited for Bayesian analysis; i.e., the problems with granularity of language, formalization of our thoughts, and so on can be overcome.

Are these two conditions actually met? Unfortunately, I feel the book runs into great difficulties in its treatment of its various main claims, which I have discussed in the review and will summarize here:

• The proof that historical methods reduce to the application of Bayes theorem is either false or does not demonstrate anything one would not already accept if a Bayesian view of probabilities is accepted as true.

• A thorough treatment of a historical problem will involve a great many interacting variables, with little chance of checking the modelling assumptions.

• The practical assignment of probabilities (and determination of proper reference classes) remains unresolved: the main example of assigning probabilities (the libraries example discussed above) relies on non-standard arguments and cannot be said to be practical.

To convincingly make the case that Bayes theorem can advance history, one needs lots and lots of worked-out examples. Unfortunately, the book contains nearly none of these, and I would say the only time it ventures into historical method –the case of the criteria of embarrassment– it does so in a fashion that is both distinctly non-Bayesian and without a way to encode that something is actually embarrassing to the author.

The book has even greater difficulties when it addresses foundational issues such as the proposed resolution of the frequentist and Bayesian views of probabilities. An important problem with this proposal is that, taken at face value, it is flawed by virtue of not being able to represent probabilities like 1/√2; however, even a flawed suggestion could make interesting reading if it provided the reader with a comprehensive and accurate account of the underlying problem and the current Bayesian resolution. Unfortunately, this is not found in the book; indeed, it is impossible to find (non-circular) definitions of the most basic concepts, such as probabilities and frequencies, within it. Instead, the book introduces a plethora of important-sounding terms (epistemic probability, hypothetical probability, true frequency, etc.) which are rapidly introduced together with elaborate thought-experiments. For the most part these thought-experiments fail to demonstrate anything concrete and, worse, may give the unsuspecting reader the impression that something important and widely accepted is being conveyed. This discussion could easily be extended to the many other oddities found in chapter six, and it is difficult not to get the impression that sections of the book were written in a hurry.
