
DANIEL SCHOCH

A FUZZY MEASURE FOR EXPLANATORY COHERENCE

ABSTRACT. In a series of articles, Paul Thagard has developed a connectionist model
for the evaluation of explanatory coherence for competing systems of hypotheses. He
has successfully applied it to various examples from the history of science and common
language reasoning. However, I will argue that his formalism does not adequately represent
explanatory relations between more than two propositions. In this paper, I develop a
generalization of Thagard’s approach. It is not subject to the connectionist paradigm of
neural nets, but is based on fuzzy logic: Explanatory coherence increases with the fuzzy
truth value of the conjunction of explanans and explanandum and decreases with the value
of the conjunction of explanans and the negation of the explanandum.

1. THE STRUCTURE OF EXPLANATORY COHERENCE

1.1. Inference of the Best Explanation


Theories of explanation can be formulated on two different levels. On
the micro-level, the relation ‘x explains y’ is the object of consideration,
and necessary or sufficient conditions for it are analyzed. By contrast,
the macro-level view takes the concept of explanation as an undefined
primitive. It either inquires into the general properties of explanations, or
uses explanatory relations in certain contextual frameworks. The general
question we are interested in is the problem of choice between competing
hypotheses. In order to decide it, different competing explanations are
evaluated and aggregated according to the degree of their coherence.
Numerous reasons can be given for using the macro-level in
philosophical and historical investigations. Although we seem to have concurring
intuitions about which facts can be explained by a given theory, it is hard to
find sufficient conditions for the reasoning scheme of an explanation. The
D-N model, for example, is well known to be much too liberal; apparent
confirmations more sophisticated than the classical instances of non-black
non-ravens can easily be constructed. Moreover, there is a major dispute
about whether explanations are cum grano salis based on deductive, causal,
statistical or inductive relations or combinations thereof – or neither. A
new fruitful approach based on the structuralist view of theories has

Synthese 122: 291–311, 2000.


© 2000 Kluwer Academic Publishers. Printed in the Netherlands.

recently been developed by Bartelborth (1996, 1997), who identifies explanation
with the model-theoretical embedding of the explanandum into the explanans.
Apart from the quest for a formal theory of explanation, there are also
pragmatic reasons for the macro-level view. Many natural-language
argumentations make use of certain background information which is neither
explicated nor questioned in its context. Refraining from making
assumptions based on it often brings us much closer to the historical situation of
a dispute. The increase in psychological plausibility will pave the way for
a fruitful interdisciplinary analysis between the philosophy of knowledge
and the cognitive sciences.
In a large number of papers, Thagard (1998) has given an abductive
theory of inference to the best explanation, probably prior to similar
ideas of BonJour (1985). The propositions, both evidence and hypotheses,
are represented by an unstructured set E of elements. Each element is
independently assigned a numerical value of truth or acceptance. The
explanatory relation is mapped to a real-valued measure of coherence, which
depends on the numerical values assigned to the elements. The problem is
to find a truth value assignment which maximizes explanatory coherence.
According to these values, we can partition E into two disjoint sets, namely
accepted and rejected propositions. As we will see later (Lemma 1.2), a
clear-cut distinction can always be made. Abduction, in this sense, is a
metatheoretical choice function which maps a set of propositions into a
subset thereof. It is not, like deduction or induction, a relation between
individual sentences.
The purpose of this paper is to refine Thagard’s idea. This is necessary
because his measure of coherence will be shown to be incapable of dealing
adequately with explanatory relations among more than two sentences. I
propose a straightforward generalization which can easily be interpreted
in terms of fuzzy logic. The drawbacks of Thagard’s approach probably
stem from relying on the connectionist paradigm of neural networks.

1.2. The Common Framework


Thagard’s model and the one I propose in this paper share the following
basic structure for a measure of explanatory coherence. It consists of
• A set E of propositions.
• A set R of rules of the form
‘P explains Q’, P ⊆ E, Q ∈ E,
or ‘P is contradictory’, P ⊆ E,
or ‘E is a fact’, E ∈ E.

(Although Thagard’s model contains a fourth type of rule, we shall neglect it
here, as it is rarely used and is irrelevant to our investigation.)
• A closed interval I ⊆ ℝ of the real line representing truth values.
Thagard chooses I = [−1, 1], while I set I = [0, 1].
• A set of real-valued variables x_1, ..., x_n with domain I, where x_i is
the truth value of the i-th proposition from E.
• The measure of coherence, a first-degree polynomial

(1.1)  V(x_1, ..., x_n) = Σ_{0 ≤ r_1,...,r_n ≤ 1} a_{r_1,...,r_n} · x_1^{r_1} ··· x_n^{r_n},

which for Thagard’s model reduces to the special form

(1.2)  V_T(x_1, ..., x_n) = Σ_{1 ≤ i ≠ j ≤ n} w_{ij} · x_i x_j + Σ_{1 ≤ i ≤ n} v_i · x_i.

• An algorithm which translates the set of rules R to the weights a_{r_1,...,r_n}
or w_{ij} and v_i respectively.
A reason for choosing the form of V will be given later, since my
motivation differs from Thagard’s. He appeals to some special
intuitions about the coherence of propositions. Let us consider the special
case of relations between two propositions, in which I share his intuitions
about the coefficients in the function V_T. If hypothesis H explains
evidence E, then H and E cohere. If H contradicts E, they incohere. The
weight w_{ij} between propositions i and j is set to a positive value if they
cohere, and to a negative value if they incohere. The coefficient v_i represents
the empirical confidence in proposition i. If the third type of rule applies
to proposition i, i.e., if i is a fact, then v_i > 0.
The extremal values of I represent the classical truth values ‘true’ and
‘false’. The following lemma states that we can restrict ourselves to them
without loss of generality. This means a clear-cut distinction can always be
made between the set of accepted and rejected propositions.

LEMMA 1.1. The function V(x) has at least one global maximum x =
(x_1, ..., x_n) with x_i ∈ {sup I, inf I}.
Proof. Let I = [a, b]. Since I^n is compact and V is continuous, s =
sup_{x ∈ I^n} V(x) exists and there is an x ∈ I^n with V(x) = s. Let y^0 = x.
For i = 1, ..., n let y_j^i = y_j^{i−1} for j ≠ i. If y_i^{i−1} ∈ {a, b}, we let
y_i^i = y_i^{i−1}. Since for each k we can write V(z) = z_k · f_k(z) + g_k(z)
for any z, where f_k and g_k do not depend on z_k, we set

y_i^i = a if f_i(y^{i−1}) < 0, and y_i^i = b otherwise.

With y_i^i · f_i(y^i) = y_i^i · f_i(y^{i−1}) ≥ y_i^{i−1} · f_i(y^{i−1}) and
g_k(y^i) = g_k(y^{i−1}) for every i = 1, ..., n we find

V(x) = V(y^0) ≤ V(y^1) ≤ ··· ≤ V(y^n).

Thus y^n is the proposed maximum with y_i^n ∈ {a, b}. □
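The rounding step in the proof can be sketched in a few lines of Python; the function name `round_to_vertex` and the example polynomial are illustrative choices of this sketch, not from the paper:

```python
def round_to_vertex(V, x, a=0.0, b=1.0):
    """Push a maximizer of a polynomial that is linear in each variable
    coordinate-wise to {a, b} without decreasing V, as in the proof above."""
    y = list(x)
    for i in range(len(y)):
        lo, hi = y.copy(), y.copy()
        lo[i], hi[i] = a, b
        # V is linear in coordinate i, so one of the two endpoints
        # is at least as good as the current interior value.
        y = lo if V(lo) > V(hi) else hi
    return y

# A hypothetical two-variable coherence polynomial: V(x) = x1*x2 - 0.5*x1
V = lambda x: x[0] * x[1] - 0.5 * x[0]
vertex = round_to_vertex(V, [0.3, 0.7])
# vertex lies on {0, 1}^2 and V(vertex) >= V([0.3, 0.7])
```
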

1.3. Explanation in Thagard’s Model


The core principle of Thagard (1989) is the Principle of Explanation.
If {P_1, ..., P_m} explains Q, then:

(a) For each i, P_i and Q cohere.
(b) For each i, j with i ≠ j, P_i and P_j cohere.
(c) The degree of coherence is inversely proportional to m.

It could be rewritten as a conjunction of the two following principles:

(1) If P explains Q, then P ∪ {Q} coheres (with degree ∼ 1/m).
(2) If {P_1, ..., P_k} coheres (with degree d), then P_i and P_j, i ≠ j,
cohere (with degree d).
The basic idea of explanatory coherence is expressed by (1): explanations
bring about coherence. Principle (2) states that the coherence of
a set of propositions reduces to the coherence of pairs of elements. To
understand Thagard’s motivation behind (2), it must be appreciated that
(1.2) can be understood as the energy function of an artificial neural
network: V_T(x_1, ..., x_n) = V_T′(x_1, ..., x_n, 1) with

(1.3)  V_T′(x_1, ..., x_{n+1}) = Σ_{1 ≤ i ≠ j ≤ n+1} w′_{ij} · x_i x_j,

where

w′_{ij} = w_{ij} if i, j ≤ n;  w′_{ij} = v_i if i ≤ n, j = n + 1;  w′_{ij} = 0 otherwise.

In this picture, the variables x_i, 1 ≤ i ≤ n, correspond to the neurons
or nodes of the net, and the w′_{ij} are the synaptic weights, i.e., the strength
of the connection between neurons i and j. A positive weight establishes an
excitatory link; a negative weight represents an inhibitory link. The stand-alone
confidence v_i in proposition i is implemented by a link to a unit x_{n+1}
which is constantly held at 1.

The basic mechanism of a neural net is its update rule. The net starts
in an arbitrary state x(0). Then, in a fixed or randomly chosen sequence,
each neuron k is updated in accordance with

(1.4)  x_k(t + 1) = f_act(x_k(t), net_k(t)),

where

net_k(t) = Σ_{1 ≤ j ≤ n+1} w′_{kj} · x_j(t).

The function net_k(t) represents the input of neuron k, which leads to
a new activation potential x_k(t + 1). There are several possible activation
functions f_act; they increase or decrease the activation of neuron k depending
on whether the sign of net_k is positive or negative. Effectively, the update
rule increases the value of (1.3) until a local maximum is reached. This
procedure often (but not necessarily) attains a global maximum.
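A minimal sketch of such update dynamics, assuming a clipped-linear activation function and a fixed update sequence (the activation function and step size in Thagard’s actual implementation may differ):

```python
import numpy as np

def run_net(w, v, n_steps=200, lr=0.05, lo=-1.0, hi=1.0):
    """Update each unit in the direction of its net input, climbing the
    energy (1.2); clipping keeps the activations within [lo, hi]."""
    n = len(v)
    x = np.zeros(n)                              # arbitrary start state
    for _ in range(n_steps):
        for k in range(n):                       # fixed update sequence
            # net input: weighted sum of the other units plus confidence v_k
            net_k = w[k] @ x - w[k, k] * x[k] + v[k]
            x[k] = np.clip(x[k] + lr * net_k, lo, hi)
    return x

# Two cohering propositions, the first with some stand-alone confidence:
w = np.array([[0.0, 1.0], [1.0, 0.0]])
v = np.array([0.1, 0.0])
x = run_net(w, v)   # both units are driven towards full acceptance (+1)
```
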
The relation between two nodes of a neural net can always be considered
symmetrical. There is no need for an extra symmetry principle as
stated by Thagard. This follows from the simple identity

Σ_{1 ≤ i ≠ j ≤ n} w_{ij} · x_i x_j = Σ_{1 ≤ i ≠ j ≤ n} w^{symm}_{ij} · x_i x_j,   w^{symm}_{ij} = (1/2)(w_{ij} + w_{ji}).

1.4. A Critique of Thagard’s Model


Although I agree with the basic idea of coherence, I cannot see any good
reason for part (2) of the principle of explanation. The connectionist
approach neither has an epistemological value in itself, nor is it sufficient to
distinguish the special form of (1.2) from any other continuously
differentiable function V. Local update rules similar to (1.4) can be given for the
optimization of any such function V by proceeding step by step in any
direction which forms an acute angle with the gradient ∇V. Furthermore,
the argument of biological plausibility is not convincing either: one would
hardly expect the degree of acceptance of any theory, however complex, to
be represented by a single neuron. On the other hand, part (2) of the principle
runs into severe methodological difficulties.1
Figure 1. Three rule systems.

Let us consider the case of just three propositions P1, P2 and Q in Figure 1.
The measures of coherence for the system R1, consisting of the single rule
‘{P1, P2} explains Q’, and for the system R2, containing the three rules
‘P1 explains Q’, ‘P2 explains Q’ and ‘P1 explains P2’, differ only by a
positive constant factor. The two coherence functions are symmetrical
under any permutation of the propositions, even though the roles of Q and
P1 are rather asymmetrical. Moreover, if one proposition is false (−1) while
the others are true (+1), we obtain the same amount of negative coherence
as when two propositions are false and the third is true. But for R2, one
would expect positive coherence if P2 and Q are true, since Q is then
successfully explained by P2. Worst of all, there is positive coherence if a
false value has been assigned to all three propositions!
There is no rule for dealing with contradictions between more than two
sentences in Thagard’s approach. It is obvious that the inconsistency of
a set of propositions cannot be reduced to properties of pairs. There is no
way to explicitly deal with negations of propositions. If P is a proposition,
its negation must be implemented by a new proposition, say NP, and an
inhibitory link established to P. For the system of rules R3 = {‘{P, Q}
explains R’, ‘{P, NQ} explains NR’, ‘Q contradicts NQ’, ‘R contradicts
NR’}, the coherence function does not depend on the value of P if Q and
NQ as well as R and NR have contrary truth values.
Another problem is directly connected with the connectionist
paradigm. Since Thagard chose I = [−1, 1] as the range of values, as is
usual for artificial neural nets, the coherence function (1.2) is symmetrical
under the negation of all values, V_T(x) = V_T(−x), if v_i = 0 for all i. In
other words, if there is no confidence in a proposition besides that stemming
from coherence, the two sets of accepted and rejected propositions
may be interchanged. If, for example, P and Q are contradictory hypotheses,
and exactly one can be accepted, then it is possible to accept either,
irrespective of how many ‘coherent’ links they have to other propositions.

2. FUZZY CONFIRMATION AS EXPLANATORY COHERENCE

2.1. The Improved Model of Coherence


In the following section, we will restrict ourselves to the framework of
classical propositional logic by assuming that the ‘tertium non datur’ and
‘duplex negatio affirmat’ rules hold. We introduce negated propositions
and conjunctions thereof, which we call constituents. The term constituent
comes from the theory of normal forms. Constituents are represented by
sets of signed propositions and represent belief systems. For example, sets
of elementary observations, such as the basis or protocol propositions in the
terminology of the Vienna Circle, are constituents. Basis propositions are
the instances which test, confirm or falsify theories. The basic assumption
of my model of coherence is that explanations bring about coherence and
incoherence of constituents. We do not presuppose that the propositions
are atomic; they can represent hypotheses or theory systems as well.

DEFINITION 2.1. Let P be a set of propositions. The set E of signed
propositions over P is defined as

E = P ∪ {¬P | P ∈ P}.

E is called a set of signed propositions if for some set of propositions P,
E is the set of signed propositions over P.

Since we identify ¬¬P and P for each P , the set of signed propositions
is always closed under negation.

DEFINITION 2.2. A subset P ⊆ E of a set of signed propositions is called
a constituent if and only if for no P ∈ E both P and ¬P are in P.

Now we are able to state our rules of explanation, contradiction and
fact. The principle of explanation implies rule (1) of my reconstruction of
Thagard’s theory. It reflects the idea that a successful explanation of
Q by P, when P and Q are both true, always confirms a theory, while an
unsuccessful explanation, when P and ¬Q are true, always disconfirms it.
I henceforth assume that these confirmations and disconfirmations
always contribute to coherence and incoherence with the same weight
factor.

2.1.1. Principle of Explanation
If P explains Q and both P ∪ {Q} and P ∪ {¬Q} are constituents, then
P ∪ {Q} coheres and P ∪ {¬Q} incoheres with the same weight factor c_P.

We generalize Thagard’s Principle of Contradiction to a Principle of
Competition, since in many cases when theories compete, they do not
logically contradict each other. An example mentioned by Thagard is the two
competing theories of dinosaur extinction, by meteorite impact or by a drop
in sea level. Although these events are not mutually exclusive, scientists are
interested in establishing the best explanation and therefore regard the two
theories as competing.

2.1.2. Principle of Competition
If P is contradictory or competing and P is a constituent, then P incoheres.

Data evidence is handled as a special instance of the explanation rule
with an empty explanans.

2.1.3. Principle of Data Evidence
If there is positive evidence for E, then {E} is coherent. If there is negative
evidence for E, then there is positive evidence for ¬E.

In contrast to Thagard’s elaboration, we propose a clear interpretation of
the activation potentials associated with the propositions in terms of fuzzy
logic.

2.1.4. Principle of Fuzzy Confirmation
The measure of coherence depends only on the coherent and incoherent
constituents. If P coheres (P ∈ C), the degree of coherence is proportional
to the fuzzy truth value of the conjunction of its elements. If P incoheres
(P ∈ I), the degree of coherence is proportional to the negative fuzzy truth
value of the conjunction of its elements.
Language independence is a general requirement for any metascientific
or methodological concept. Its large variety of aspects consists
of certain invariance conditions for transformations and translations into
equivalent or richer language structures. In our simple propositional
framework, the logical definitions for conservative language extension,
reduction or definability are not expressible. We only focus on a necessary
condition: coherence must be independent of the set E of propositions in
the following sense: If a new redundant proposition P ∉ E is added to the
explanatory structure without appearing essentially in the rules, then the
measure of coherence should remain the same.

2.1.5. Principle of Language Independence
Let P be a proposition which does not occur in any rule in R. Then the
rule system R′ obtained from R by replacing each rule of the form ‘Q
explains R’ by the two rules ‘Q ∪ {P} explains R’ and ‘Q ∪ {¬P} explains R’,
and each rule of the form ‘Q incoheres’ by ‘Q ∪ {P} incoheres’ and
‘Q ∪ {¬P} incoheres’, induces the same order of coherence over E ∪ {P}
irrespective of the value of P.
These principles map the explanatory structure to a pair ⟨C, I⟩ of sets
of constituents, namely the coherent (C) and the incoherent (I) constituents
(together with their weights). This set-theoretical formulation has the
advantage that redundant information cannot contribute to the coherence
function, e.g., by the rule ‘P ∪ {¬Q} competes’ when ‘P explains Q’
has already been established. The measure of coherence is a function of
this pair and the truth values of the propositions, where each constituent is
treated as the conjunction of its elements.
It is easy to show that in this terminology the Principle of Language
Independence can be reformulated as follows: If ⟨C, I⟩ is a pair of sets
of coherent and incoherent constituents over E, and P ∉ E, then the
pair ⟨C′, I′⟩ resulting from adding P and ¬P to each member,

(2.1)  C′ := {P ∪ {P} | P ∈ C} ∪ {P ∪ {¬P} | P ∈ C},
       I′ := {P ∪ {P} | P ∈ I} ∪ {P ∪ {¬P} | P ∈ I},

induces the same order of coherence over E ∪ {P} irrespective of the value
of P.

2.2. The Coherence Function


I am aware of only one evaluation function which satisfies all the principles
stated above: Let

E = {P_1, ..., P_n, ¬P_1, ..., ¬P_n}

be a set of signed propositions and x_1, ..., x_n real-valued variables. Then
for each pair ⟨C, I⟩ of sets of coherent and incoherent constituents over E,
we define the coherence value V_⟨C,I⟩ recursively:

V_{P_i}(x_1, ..., x_n) := x_i,
V_{¬P_i}(x_1, ..., x_n) := 1 − x_i,
V_P(x_1, ..., x_n) := c_P · ∏_{P ∈ P} V_P,
V_⟨C,I⟩(x_1, ..., x_n) := Σ_{P ∈ C} V_P − Σ_{P ∈ I} V_P.

The constants c_P, called the weight factors of coherence, can be considered
as the strength of the explanation or competition respectively. For the sake of
lucidity, we henceforth identify the variables and the propositions.
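The recursion can be transcribed directly; the ‘!’-prefix encoding of negation and the list-of-pairs representation of ⟨C, I⟩ are assumptions of this sketch, not part of the formal definition:

```python
from functools import reduce

def truth(lit, x):
    """Fuzzy truth value of a signed proposition: x[P] for P, 1 - x[P] for !P."""
    return 1.0 - x[lit[1:]] if lit.startswith('!') else x[lit]

def V_constituent(c, x, weight=1.0):
    """c_P times the product (fuzzy conjunction) of the constituent's elements."""
    return weight * reduce(lambda acc, lit: acc * truth(lit, x), c, 1.0)

def V_coherence(C, I, x):
    """V_<C,I>: cohering constituents count positively, incohering negatively."""
    return (sum(V_constituent(c, x, w) for c, w in C)
            - sum(V_constituent(c, x, w) for c, w in I))

# 'P explains Q' puts {P, Q} into C and {P, !Q} into I with equal weight:
C = [(['P', 'Q'], 1.0)]
I = [(['P', '!Q'], 1.0)]
val = V_coherence(C, I, {'P': 1.0, 'Q': 1.0})   # successful explanation: 1.0
```
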
This evaluation function uses the multiplication function as a fuzzy-logical
representation of the conjunction. It satisfies the minimum requirements
of such a function as stated in Gottwald (1993): commutativity,
associativity, monotonicity and having 1 as a neutral element. It also satisfies
the principle of language independence: Let C′, I′ be as in (2.1), resulting
from adding to the constituents ⟨C, I⟩ a new proposition P which is irrelevant
or neutral to the explanatory structure and has no influence on the weight
factors (c_P = c_{P ∪ {P}} = c_{P ∪ {¬P}}). Then we find immediately

V_⟨C′,I′⟩ = V_⟨C,I⟩ · P + V_⟨C,I⟩ · (1 − P) = V_⟨C,I⟩.

For the standard fuzzy-logical representation of the conjunction, the
minimum function fails to satisfy this requirement.2
We now reformulate the principles stated above in terms of contributions
to the coherence function. If the constituent {P_1, ..., P_m} explains
Q and Q, ¬Q ≠ P_i for 1 ≤ i ≤ m, then, according to the principle
of explanation, {P_1, ..., P_m, Q} coheres and {P_1, ..., P_m, ¬Q} incoheres.
Since the weight factor is symmetrical, the total contribution of these two
constituents to the coherence function, up to a weight factor, can be written as

P_1 ··· P_m · Q − P_1 ··· P_m · (1 − Q) = P_1 ··· P_m · (2Q − 1).

The two main principles could therefore be reformulated more briefly


as follows:

2.2.1. Principle of Explanation
If P = {P_1, ..., P_m} explains Q and both P ∪ {Q} and P ∪ {¬Q} are
non-competing constituents, then a term proportional to

(2.2)  P_1 ··· P_m · (2Q − 1)
is added to the coherence function.

2.2.2. Principle of Competition
If P = {P_1, ..., P_m} is a contradictory or competing constituent, then a
term proportional to

(2.3)  −P_1 ··· P_m

is added to the coherence function.


Formally, these principles can be understood as special cases of a
general reasoning rule,

(2.4)  P_1, ..., P_m ⊢ Q,

whose value is proportional to (2.2). With the false proposition ⊥ standing
for a constantly zero value, the Principle of Competition can be rewritten
in the form

(2.5)  P_1, ..., P_m ⊢ ⊥.

For Q = 0, (2.2) becomes equivalent to (2.3).
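This equivalence can be checked directly for arbitrary fuzzy values; the helper names below are illustrative:

```python
def explanation_term(ps, q):
    """Term (2.2): P1 * ... * Pm * (2Q - 1)."""
    prod = 1.0
    for p in ps:
        prod *= p
    return prod * (2 * q - 1)

def competition_term(ps):
    """Term (2.3): -(P1 * ... * Pm)."""
    prod = 1.0
    for p in ps:
        prod *= p
    return -prod

# Holding Q constantly at 0 turns an explanation term into a competition term:
for ps in ([0.2, 0.9], [1.0, 1.0, 0.5]):
    assert explanation_term(ps, 0.0) == competition_term(ps)
```
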

2.3. The Constructive Part of the Criticism


Let us now see how this evaluation function copes with the problems
mentioned in Section 1.4. For the two rule systems R1 and R2 in Figure 1,
we obtain the following coherence functions (for equal weight factors):

V_R1 = P1 · P2 · (2Q − 1),
V_R2 = (P1 + P2) · (2Q − 1) + P1 · (2P2 − 1).

These formulas reveal the explanatory structure. The system R1 is
symmetric only in P1 and P2, as is V_R1. Only if both P1 and P2 are
nonvanishing is there positive or negative coherence, according to whether
Q > 1/2 or Q < 1/2. The system R2 and its corresponding measure of
coherence are not symmetric in any pair of variables. If the explanandum is
false, the measure of coherence can never become positive: for Q = 0 we
obtain

V_R2 = 2 · P1 · (P2 − 1) − P2 ≤ −P2 ≤ 0.

If P2 and Q are both false, then V_R2 = −2 · P1. If they are both true,
then there is always positive coherence, V_R2 = 2 · P1 + 1 > 0, since Q is
successfully explained by P2.
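These claims about R2 can be verified numerically from the formulas above:

```python
def V_R1(p1, p2, q):
    # single rule '{P1, P2} explains Q'
    return p1 * p2 * (2 * q - 1)

def V_R2(p1, p2, q):
    # rules 'P1 explains Q', 'P2 explains Q', 'P1 explains P2'
    return (p1 + p2) * (2 * q - 1) + p1 * (2 * p2 - 1)

grid = [i / 4 for i in range(5)]
# A false explanandum never yields positive coherence in R2 ...
assert all(V_R2(p1, p2, 0.0) <= 0.0 for p1 in grid for p2 in grid)
# ... while with P2 and Q both true, V_R2 = 2*P1 + 1 > 0:
assert all(V_R2(p1, 1.0, 1.0) == 2 * p1 + 1 for p1 in grid)
```
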
In Thagard’s theory, pairs of propositions which are connected by a
positive coherence link can nevertheless contribute with a negative value
to coherence if the elements have opposite truth values. This is not possible
in our approach. Propositions which appear as an explanandum in a rule
only contribute to the decision if other parts of the belief system lead
to a conflict, either by competing with it or by explaining the negation.
Otherwise, they can be accepted regardless of the truth values of the other
propositions. This is the statement of the following lemma.

DEFINITION 2.3. A rule system R is called non-conflictive in the
proposition Q if and only if Q does not occur within a competing set of
propositions, and ¬Q does not appear in any explanatory rule.

LEMMA 2.1. Let V_R be the coherence function of a rule system R which
is non-conflictive in Q. For each global maximum of V_R, there is another
global maximum with Q = 1, while all other values remain the same.
Proof. There are only three cases of possible rules containing Q: (i)
For a rule ‘{P_1, ..., P_m} explains Q’ we find

P_1 ··· P_m ≥ P_1 ··· P_m · (2Q − 1).

(ii) For the form ‘{P_1, ..., P_m, Q} explains R’ there is

P_1 ··· P_m · (2R − 1) ≥ P_1 ··· P_m · Q · (2R − 1).

(iii) In the case of ‘{P_1, ..., P_m, ¬Q} is competing’ we obtain

0 ≥ −P_1 ··· P_m · (1 − Q).

Thus, setting Q = 1 does not decrease the total coherence. □

2.4. Unification and the Weight Factors


Bartelborth (1996, 193), in the tradition of BonJour (1985), has proposed a
diachronic theory of explanatory coherence, according to which the degree
of (systematical) coherence of a belief system
• increases with the number of explanatory relations among the members.
• increases with the strength of the explanations.
• increases with the degree of confirmation.
• decreases with the number of inconsistencies and explanatory anomalies.
• decreases with the number of competing explanations.
• decreases with the number of unconnected subsystems.
My approach satisfies all these requirements except the last without
further restrictions. The Principle of Explanation covers the first two points,
where the weight factor c_P expresses the strength of the explanations.
Expression (2.2) is nothing but a measure of confirmation. Inconsistencies are
handled by the Principle of Competition. If a proposition for which there
is negative evidence, or which contradicts other parts of the belief system,
is being explained, negative coherence is generated. An explanatory
anomaly is a piece of evidence which ought to be explained by
the theory in question but is not. Although explanatory anomalies do not
directly contribute to coherence, this is not a problem, since successful
explanations by competing systems do. If a proposition is completely
isolated, it can be removed from the explanatory structure. For each
competing system of hypotheses, a negative term of the form (2.3) is added
to the coherence function.

Figure 2. Unification.
The measure of coherence should be sensitive not only to the number
of explanatory relations, but also to the degree to which the subsystems
are interconnected. In Figure 2, R2 needs two distinct theories to explain
the two pieces of evidence, while R3 uses a stronger theory capable of
explaining both. We may also say that R3 is a unified theory. The trivial
way in R1 of defining such a theory, namely taking the conjunction of both
subtheories, is explicitly ruled out from being regarded as unification.
Theory unification is the most important form of scientific progress. It is
extensively discussed by Friedman et al. (1989) and Watkins (1984), and
further developed by Redhead (1989) and Bartelborth (1996). The core
idea states that the more non-trivial conjunctive components a theory
has, the less unified it is. The elements of the decomposition should have
disjoint theoretical terms; otherwise the four Maxwell equations would not
count as a unified theory of electromagnetism, since their theoretical
terms, the components of the E- and B-fields, are entangled. If such a
decomposition is possible, then each explanandum could be explained by
only one component.3 Therefore, either R1 reduces to R2, or R1 contains
redundant elements in the explanans. We introduce the following concept
of an irreducible ‘proper’ explanation and define the weight factors only
for them.4

DEFINITION 2.4. The rule ‘P explains Q’ in R is called a proper
explanation if for every rule ‘S explains Q’ with S ⊆ P we have S = P.
By N_R(P) we denote the number of propositions which are properly
explained by P.

R3 is expected to be more coherent than the other two systems. However,
for weight factors equal to 1, all three rule systems are equally coherent if
the explanantia P1, P2 and P are true. Consequently, the weight factors have
to be specified. We assume that the weight factor only depends on the
number of properly explained elements,

c_P = f(N_R(P)).

If P properly and successfully explains precisely n propositions, the
term (2.2) becomes 1 for each of them, and the total contribution of P to the
coherence function is n · c_P. A unified theory explaining n + m propositions
should contribute more than two separate theories explaining n and m of
them respectively. The necessary and sufficient condition is

(n + m) · f(n + m) > n · f(n) + m · f(m)

for n, m ≥ 1, or

f(n + m) > n/(n + m) · f(n) + m/(n + m) · f(m).

This is just the convexity of f!
We can obtain a stronger inequality by considering the following case:
P1 explains E1, ..., En; P2 explains E2, ..., En+1; and the unified theory
system Q explains all of them (E1, ..., En+1). Since Q explains everything
that P1 and P2 do, we expect it to be superior to the combined system. It
follows that

f(n + 1) > n/(n + 1) · 2 · f(n),

which is satisfied by f(n) = 2^n. Therefore I propose the following formula
for the weight factor:

(2.6)  c_P = 2^{N_R(P)}.

Now, R3 has double the coherence of R2.
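Both inequalities for the choice f(n) = 2^n can be checked over a small range:

```python
f = lambda n: 2 ** n    # proposed weight factor (2.6)

# Unification condition: (n+m) * f(n+m) > n * f(n) + m * f(m)
assert all((n + m) * f(n + m) > n * f(n) + m * f(m)
           for n in range(1, 12) for m in range(1, 12))

# Stronger overlapping-theories condition: f(n+1) > n/(n+1) * 2 * f(n)
assert all(f(n + 1) > n / (n + 1) * 2 * f(n) for n in range(1, 12))
```
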

3. EXAMPLES

3.1. The Program COHEN


The program COHEN (Coherence Optimization of Hypotheses Explanatory
Nets) is a rule-based evaluation program which uses a simple gradient
method to optimize the coherence polynomial (1.1).5 The variables are
initially set to 0.5 and updated synchronously. The program accepts a list
of explanatory rules and delivers the polynomial and a maximum in all
variables. The weight factors can be freely chosen; formula (2.6) is not
incorporated, in order to give the user more freedom for experiments.
There are only two types of rules. The explanatory rule

[w:] P1 [... Pm] -> Q,

where P1, ..., Pm, Q are signed propositions consisting of an alphanumerical
name, adds a term (2.2) with the unsigned weight w to the polynomial.
The default value for w is 1. The competition rule

[w:] P1 [... Pm] ->

adds a term (2.3) to the coherence function. This corresponds to the notation
(2.5), where the empty explanandum stands for the constantly false
value.
Data evidence is specified in the form ‘[w:] -> E’ for positive convictions
and ‘[w:] E ->’ for negative ones. Comments are preceded by a double
slash ‘//’.
A negated proposition is expressed by a leading exclamation mark ‘!’,
as in the well-known programming language C. Thus, the two rules ‘P
explains E’ and ‘Q explains ¬E’ are written as follows:

P -> E
Q -> !E

The program stops when one of the following events occurs:
• The net is exactly stable (within standard numerical accuracy).
• The net is approximately stable.
• There is no further progress in coherence.
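COHEN itself is not reproduced here, but its stated behaviour (variables starting at 0.5, synchronous gradient steps, stopping when coherence no longer improves) can be sketched as follows; the numerical gradient and the step size are choices of this sketch, not of the program:

```python
import numpy as np

def num_grad(V, x, h=1e-6):
    """Central-difference gradient of the coherence polynomial V at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (V(x + e) - V(x - e)) / (2 * h)
    return g

def maximize_coherence(V, n, lr=0.05, tol=1e-10, max_iter=10000):
    x = np.full(n, 0.5)                        # all variables start at 0.5
    for _ in range(max_iter):
        # synchronous gradient step, clipped to the truth-value interval [0, 1]
        x_new = np.clip(x + lr * num_grad(V, x), 0.0, 1.0)
        if V(x_new) - V(x) < tol:              # no further progress in coherence
            return x_new
        x = x_new
    return x

# Rules '-> E' and 'P -> E' give the polynomial (2E - 1) + P * (2E - 1):
V = lambda x: (2 * x[1] - 1) + x[0] * (2 * x[1] - 1)
x = maximize_coherence(V, 2)                   # both P and E are accepted
```
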

3.2. Lavoisier
The case of Lavoisier’s oxygen hypothesis (1783) against the phlogiston
theory is one of the most famous and successful applications of Thagard’s
program. It is interesting for our purposes since some of the explanatory
rules have a multiple-proposition explanans. We adopt the reconstruction
and notation of Thagard (1989, 444). Although not all of Lavoisier’s
arguments are contained in these propositions, they recapitulate the major
points. According to Zytkow, many improvements of the late phlogiston
theory after 1780 concerning E3-E7 are not incorporated.6

Oxygen hypotheses
OH1 Pure air contains oxygen principle.
OH2 Pure air contains matter of fire and heat.
OH3 During combustion, oxygen from the air combines with the burning
body.
OH4 Oxygen has weight.
OH5 During calcination, metals add oxygen to become calxes.
OH6 During reduction, oxygen is given off.
Phlogiston hypotheses
PH1 Combustible bodies contain phlogiston.
PH2 Combustible bodies contain matter of heat.
PH3 During combustion, phlogiston is given off.
PH4 Phlogiston can pass from one body to another.
PH5 Metals contain phlogiston.
PH6 During calcination, phlogiston is given off.
The interesting point about this example is that there are only two ana-
lytical contradictions. There is no need to implement the main hypotheses
OH1 and PH1 as competing.

20: PH3 OH3 ->
20: PH6 OH5 ->
The evidence is

E1 During combustion, heat and light are given off.
E2 Inflammability is transmittable from one body to another.
E3 Combustion only occurs in the presence of pure air.
E4 The increase of weight in an incinerated body is exactly equal to the
weight of air absorbed.
E5 Metals undergo calcination.
E6 During calcination, bodies increase in weight.
E7 During calcination, volume of air diminishes.
E8 During reduction, effervescence appears.
According to Thagard, the theory systems provide the following ex-
planations. All explanations are proper in our sense. In our reconstruction,
we give all explanations a weight of one. According to rule (2.6),
only the fourth and sixth oxygen explanations should be given a weight
factor of two, but this does not change the numerical outcome, which
supports the oxygen hypothesis.

Oxygen explanations          Phlogiston explanations
OH1 OH2 OH3 -> E1            PH1 PH2 PH3 -> E1
OH1 OH3 -> E3                PH1 PH3 PH4 -> E2
OH1 OH3 OH4 -> E4            PH5 PH6 -> E5
OH1 OH5 -> E5
OH1 OH4 OH5 -> E6
OH1 OH5 -> E7
OH1 OH6 -> E8
The program COHEN stops with an exactly stable net. All evidence
and all oxygen hypotheses are exactly accepted with value one; PH3 and
PH6 are exactly rejected with value zero. The other phlogiston hypotheses
PH1, PH2, PH4, and PH5 receive exactly the indifferent value 0.5. This
is plausible, since according to the reconstruction they do not conflict with
any part of the oxygen theory system. Thagard's program ECHO tends towards
the same result (relative to his scale). Some minor deviations, e.g., in OH2
and OH6, which remain below the value of full acceptance, seem to be
numerical artifacts, possibly caused by the connectionist algorithm. Ac-
cording to Lemma 1.2, this could not happen if ECHO did what it should,
namely optimize the 'harmony function' (1.2).

3.3. A Case of a Murder


The case of Mike’s contradictory alibis is also described by Thagard (loc.
cit. p. 496). The idea is to simulate a belief change in a legal trial when an
alibi is invalidated by a second contradicting the first. If Mike presents an
alibi by Sam, who saw him in Philadelphia, he should ceteris paribus not be
found guilty of having committed a murder in New York. If a second friend
appears who tells a different story, this should be a case against Mike.
Thagard’s program with the rules for G0, G1, I1, and E1 sets G0 to the
value of non-acceptance, while the full set of rules leads to an acceptance
of G0. COHEN convicts Mike in both cases, unless the simple alibi is
amplified by setting the weight of the I1 − > E1 rule to a value above
1.25.
This apparent advantage of ECHO over COHEN becomes less plaus-
ible when we examine how it is derived. The innocence hypothesis wins
out in the ECHO model because the rule 'G0 explains G1' in Thagard's
model contributes positively if both G0 and G1 have negative values. In
other words, there is explanatory coherence for false predictions from
false theories, an unacceptably high price. Examining the structure of
the hypotheses system, we find a simple hypothesis I1 versus a system of
two propositions G0, G1 standing in an explanatory relation; both systems

explain the same evidence. In the context of scientific argumentation, we
would always favor the second option.
A simple modification would enable the innocence hypothesis to catch
up: Let I3 stand for 'Mike was seen in Philadelphia by Sam', let I1 explain
I3 and I3 explain E1, and let I3 contradict both G0 and G1. Then the two
competing systems are symmetrical up to the explanatory relation I1 ->
E1. Moreover, if the latter is maintained, the innocence hypothesis wins.
A corresponding explanatory relation between G0 and E1 does not hold.
This is the translation of Thagard’s example using my notation. The
hypotheses are
G0 Mike committed the murder in New York.
G1 Sam is lying to protect Mike.
G2 Fred is lying to protect Mike.
I1 Mike was in Philadelphia.
I2 Mike was in Boston.
The evidence is
E1 Sam says that Mike was in Philadelphia.
E2 Fred says that Mike was in Boston.
The rules are
Contradictions      Explanations
I1 I2 ->            G0 -> G1
G0 I1 ->            G0 -> G2
G0 I2 ->            G1 -> E1
G1 I1 ->            G2 -> E2
G2 I2 ->            I1 -> E1
                    I2 -> E2
COHEN stops with an acceptance of G0, G1, G2, E1, E2 and a rejection
of I1, I2. If the rules I1 -> I3, I3 -> E1, I3 G0 -> and I3 G1 ->
are added, these results are preserved and I3 is also rejected. But for the rule
system R = {G0 -> G1, G1 -> E1, I1 -> I3, I3 -> E1, I1
-> E1, I1 G0 ->, I1 G1 ->, I3 G0 ->, I3 G1 ->}, COHEN
accepts I1, I3 and E1 and rejects G0 and G1. In this reconstruction, Mike
is only convicted in the case of two contradictory alibis. This tallies with
our intuition.
This case could be regarded as a lesson on how sensitive the whole
approach of explanatory coherence is to the way the arguments are
reconstructed, especially the number of auxiliary and intermediate hy-
potheses. Counterintuitive results are obtained if hypothesis systems
reconstructed at different levels of detail are compared. In our example, the
I1–I3 pair is analogous in reasoning to the G0–G1 pair. Although this
point is especially critical for everyday reasoning, in a scientific context
the different kinds of hypotheses are likely to have similar counterparts in
competing systems.

3.4. Moral: Never Try to Copy Nature


The human brain is undoubtedly the most powerful instrument for acquir-
ing knowledge. It therefore seems quite natural to copy a design which
has proved itself for millions of years in the struggle for survival. Sug-
gestive as this is, it does not imply that the paradigm of neural
networks, as used by Thagard, always leads to the best results in
artificial intelligence. Purely artificial neural networks, whether feed-forward
or recurrent, perform rather poorly at trained pattern recognition
relative to their synaptic complexity. The vast majority of technical
applications of neural networks therefore combine them with powerful
classical preprocessing tools and/or fuzzy logic. On the other hand,
applications of pure fuzzy logic can be found in every department store.
In 1889, Otto Lilienthal published his famous book on the flight of birds
as a foundation of aviation. He had constructed various machines for lift
measurement. One of his major discoveries was that human force cannot
produce an air lift of more than 40 kg, too little to carry both man and
machine. All these machines, such as his pedal-driven lift machine and a
horizontally rotating propeller driven by weights, were designed merely
for research into the physical principles of air lift.
Like the machines of nearly all his predecessors, Lilienthal's glider
suffered from a design too similar to the bird's anatomy. Recent ex-
periments with a historical reconstruction have revealed that the wing
frame became mechanically unstable above 40 km/h. Unfortunately, the
minimum speed for take-off is about 30 km/h, more than a human can
manage when running down a hill staggering under the weight of a heavy load.
Therefore, Lilienthal had to wait for a headwind. One gust of wind killed
him on August 10th, 1896.
Even though there had been successful flights using motorized Lili-
enthal gliders before, the breakthrough did not occur until 1905, when the
Wright brothers constructed the first fully practical airplane. More than a
hundred years after Lilienthal's experiments, modern airplanes have moved
far away from nature. They are faster and carry much greater
loads than any bird, but on the other hand they cannot match the aerobatics
of an albatross. Only very recently has a model been constructed which can fly
by flapping its wings, although its movable wings have just a small fraction
of the degrees of freedom of a bird's wing.

ACKNOWLEDGEMENT

This paper was written within the research project “Explanatory Coher-
ence”, which is part of the research group “Kommunikatives Verstehen”
supported by the Deutsche Forschungsgemeinschaft (DFG).

NOTES

1 In the discussion part of Thagard's article, Cohen also criticizes that it is impossible to
deal with conjunctions of hypotheses in his framework. Bereiter and Scardamalia mention
in the same place that a hypothesis can undergo positive activation if it contradicts a
negatively activated proposition, a consequence of choosing the [−1, +1] interval for
activation potentials. Thagard's attempt to resolve the problem by raising the threshold
value of the associated neuron does not seem to be very coherent with his other rules (see
the reply part of Thagard (1989)).
2 In sentential approaches to fuzzy logic, it is generally required that 'x and x' yields
the same fuzzy value as x. The multiplication function does not have the property of
idempotence, since x · x ≠ x for 0 < x < 1. However, this is not a problem in our
approach. The representation of sentences by sets explicitly excludes the possibility that
a proposition appears more than once in a sentential expression. Moreover, according to
Definition 2.1, a proposition must not occur together with its contradiction.
The only idempotent fuzzy conjunction, the minimum function, does not satisfy the
principle of language independence. For example, take the positive data evidence E as the
only part of the explanatory structure, ⟨C, I⟩ = ⟨{E}, ∅⟩. Then for P, 1 − P < E < 1 we
find

V⟨C′,I′⟩ = min(E, P) + min(E, 1 − P) ≠ E = V⟨C,I⟩.
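This counterexample is easy to check numerically. The following sketch uses values of E and P chosen freely, subject only to P < E < 1 and 1 − P < E:

```python
# Numeric check of footnote 2's counterexample; E and P are my own
# illustrative choice, subject to P < E < 1 and 1 - P < E.
E, P = 0.9, 0.75
v_split = min(E, P) + min(E, 1 - P)   # value after splitting E into P, 1 - P
assert v_split == P + (1 - P) == 1.0  # both minima bottom out at P and 1 - P
assert v_split != E                   # the value changed: not language independent
```

Since both minima bottom out at P and 1 − P respectively, the sum is always 1, which differs from E whenever E < 1.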

3 This is evident for Bartelborth's identification of explanation with model-theoretical
embedding. Since this concept presumes the structuralist view of theories, while the co-
herence theory does not, this point is not elaborated here. For details on decompositions in
the structuralist view of theories (the Ramsey sentence of a theory, where the theoretical
terms are eliminated), see Gähde (1989) and Bartelborth.
4 This must also be done in order to avoid ambiguities in the weight factors; otherwise,
redundant parts of the explanations would effectively enlarge the weight factor if they
were added.
5 This program, written by the author, will soon be placed under www.uni-leipzig.de/
~logik at the Institute of Philosophy ('Wissenschaftstheorie und Logik'). Please con-
sult www.uni-leipzig.de/~logik/bartelborth.html. It should be noted that this version of
the program has only been tested in a few cases and may still contain bugs or produce
numerical artifacts. All error reports will be gratefully received!
6 See the discussion of this point by Zytkow in the commentary of Thagard (1989, 489).

REFERENCES

Bartelborth, T.: 1996, Begründungsstrategien, Akademie Verlag, Berlin.
Bartelborth, T.: 1997, 'Scientific Explanation', in W. Balzer and U. Moulines (eds),
Structuralist Theory of Science: Focal Issues, New Results, W. de Gruyter, Berlin.
BonJour, L.: 1985, The Structure of Empirical Knowledge, Harvard University Press,
Cambridge, MA.
Gähde, U.: 1989, Theorie und Hypothese, Habilitationsschrift, Bielefeld.
Gottwald, S.: 1993, Fuzzy Sets and Fuzzy Logic, Vieweg, Wiesbaden.
Kitcher, P. and W. C. Salmon (eds.): 1989, Scientific Explanation, Minnesota Studies in the
Philosophy of Science.
Redhead, M.: 1989, Explanation, Preprint.
Thagard, P.: 1989, ‘Explanatory Coherence’, Behavioral and Brain Sciences 12, 435–502.
Thagard, P.: 1998, Coherence Articles, Internet page,
http://cogsci.uwaterloo.ca/Articles/Pages/Coherence.html.
Watkins, J.: 1984, Science and Scepticism, Princeton University Press, Princeton.

Department of Philosophy
University of Saarland
P.O. Box 15 11 50
D-66041 Saarbrücken
Germany
E-mail: d.schoch@mx.uni-saarland.de