
Paul Christiano, Eliezer Yudkowsky, Marcello Herreshoff, Mihaly Barasz. (Early draft.)

Introduction

A central notion in metamathematics is the truth of a sentence. To express this notion within a theory, we introduce a predicate True which acts on quoted sentences and returns their truth value True(⌜φ⌝) (where ⌜φ⌝ is a representation of φ within the theory, for example its Gödel number). We would like a truth predicate to satisfy the reflection property:

    ∀φ : True(⌜φ⌝) ⇔ φ.   (1)

Unfortunately, it is impossible for any expressive language to contain its own truth predicate True. For if it did, we could consider the liar's sentence G defined by the diagonalization G ⇔ ¬True(⌜G⌝). Combining this with the reflection property, we obtain G ⇔ ¬G, a contradiction.

There are a few standard responses to this challenge. The first and most popular is to work with meta-languages. For any language L we may form a new language by adding a predicate True, which acts only on sentences of L (i.e., sentences without the symbol True) and satisfies the reflection property (1). We can iterate this construction to obtain an infinite sequence of languages L1, L2, . . ., each of which contains the preceding language's truth predicate.

A second approach is to accept that some sentences, such as the liar sentence G, are neither true nor false (or are simultaneously true and false, depending on which bullet we feel like biting). That is, we work with a many-valued logic such as Kleene's logic over {True, False, ⊥}. Then we may define True by the strong reflection property that the truth values of φ and True(⌜φ⌝) are the same for every φ. We don't run into trouble, because any would-be paradoxical sentence can simply be valued as ⊥. Although this construction successfully dodges the undefinability of truth, it is somewhat unsatisfying. There is no predicate in these languages to test if a sentence is undefined, and there is no bound on the number of sentences which remain undefined. In fact, if we are specifically concerned with self-reference, then a great number of properties of interest (and not just pathological counterexamples) become undefined.

In this paper we show that it is possible to perform a similar construction over probabilistic logic. Though a language cannot contain its own truth predicate True, it can nevertheless
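To make the three-valued option concrete, here is a small Python sketch (ours, not from the paper) of Kleene's strong three-valued logic, with Python's None standing in for the undefined value ⊥. The liar sentence must have the same truth value as its own negation, and the only value with that property is ⊥:

```python
# Kleene's strong three-valued logic over {True, False, None},
# where None plays the role of the undefined value ⊥.
# Illustrative sketch; not code from the paper.

def k_not(a):
    return None if a is None else (not a)

def k_and(a, b):
    if a is False or b is False:
        return False          # False absorbs, even next to ⊥
    if a is None or b is None:
        return None
    return True

def k_or(a, b):
    if a is True or b is True:
        return True           # True absorbs, even next to ⊥
    if a is None or b is None:
        return None
    return False

# The liar sentence G satisfies G = not True(G), and the strong
# reflection property forces True(G) to take the same value as G.
# Check which of the three values is a fixed point of v -> k_not(v):
fixed_points = [v for v in (True, False, None) if k_not(v) == v]
print(fixed_points)  # → [None]
```

Only the undefined value survives, which is exactly how the paradox is dissolved in this approach.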

contain its own subjective probability function P. The assigned probabilities can be reflectively consistent in the sense of an appropriate analog of the reflection property (1). In practice, most meaningful assertions must already be treated probabilistically, and very little is lost by allowing some sentences to have probabilities intermediate between 0 and 1. (This is in contrast with Kleene's logic, in which the value ⊥ provides little useful information where it appears.)

2 Preliminaries

2.1 Probabilistic Logic

Fix some language L, for example the language of first-order set theory. Fix a theory T over L, for example ZFC. We are interested in doing probabilistic metamathematics within L, and so in addition to assuming that it is powerful enough to perform a Gödel numbering, we will assume that some terms in L correspond to the rational numbers and have their usual properties under T. (But T will still have models in which there are non-standard rational numbers, and this fact will be important later.)

We are interested in assignments of probabilities to the sentences of L. That is, we consider functions P which assign a real number P(φ) to each sentence φ. In analogy with the logical consistency of an assignment of truth values to sentences, we are particularly interested in assignments P which are internally consistent in a certain sense. We present two equivalent definitions of coherence for distributions over a language L:

Definition 1 (Coherence). We say that P is coherent if there is a probability measure μ over models of L such that P(φ) = μ({M : M ⊨ φ}).

Theorem 1 (Equivalent definition of coherence). P is coherent if and only if the following axioms hold:

1. For each φ and ψ, P(φ) = P(φ ∧ ψ) + P(φ ∧ ¬ψ).
2. For each tautology φ, P(φ) = 1.
3. For each contradiction φ, P(φ) = 0.

Proof. It is clear that any coherent P satisfies these axioms. To show that any P satisfying these axioms is coherent, we construct a distribution over complete consistent theories T′ which reproduces P and then appeal to the completeness theorem.

Fix some enumeration φ1, φ2, . . . of all of the sentences of L. Let T0 = ∅ and iteratively define Ti+1 in terms of Ti as follows. If Ti is complete, we set Ti+1 = Ti. Otherwise, let φj be the first statement which is independent of Ti, and let Ti+1 = Ti ∪ {φj} with probability

P(φj | Ti)¹ and Ti+1 = Ti ∪ {¬φj} with probability P(¬φj | Ti). Because φj was independent of Ti, the resulting system remains consistent. Define T′ = ∪i Ti. Since each Ti is consistent, T′ is consistent by compactness. Since either Ti+1 ⊢ φi or Ti+1 ⊢ ¬φi, T′ is complete.

Axiom 1 implies

    P(φ | Ti) = P(φ | Ti ∪ {φj}) P(φj | Ti) + P(φ | Ti ∪ {¬φj}) P(¬φj | Ti),

thus the sequence P(φ | Ti) is a martingale. Axiom 2 implies that if Ti ⊢ φ, then P(φ | Ti) = 1, and if Ti ⊢ ¬φ, then P(φ | Ti) = 0. Thus this sequence eventually stabilizes at 0 or 1, and φ ∈ T′ iff this sequence stabilizes at 1. The martingale property then implies that φ ∈ T′ with probability P(φ | T0). By axiom 3, P(T0) = 1, so P(φ | T0) = P(φ). This process defines a measure μ over complete, consistent theories such that P(φ) = μ({T′ : φ ∈ T′}).

For each complete, consistent theory T′, the completeness theorem guarantees the existence of a model M such that M ⊨ φ if and only if φ ∈ T′. Thus we obtain a distribution μ over models such that P(φ) = μ({M : M ⊨ φ}), as desired.

We are particularly interested in probabilities P that assign probability 1 to T. It is easy to check that such P correspond to distributions over models of T.
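The three axioms of Theorem 1 can be checked concretely in a finite toy setting (our own illustration, not part of the paper's construction): with two propositional atoms, a model is one of four truth assignments, and any probability measure μ over them induces a coherent P exactly as in Definition 1:

```python
# A toy check of Definition 1 / Theorem 1: two atoms, four models,
# an arbitrary measure mu over the models, and the induced P.
# Weights are exact binary fractions so equality tests are exact.
import itertools

models = list(itertools.product([True, False], repeat=2))  # assignments (p, q)
mu = dict(zip(models, [0.5, 0.125, 0.25, 0.125]))          # sums to 1

def P(phi):
    """P(phi) = mu({M : M |= phi}), with phi given as a function of a model."""
    return sum(w for m, w in mu.items() if phi(m))

p = lambda m: m[0]
q = lambda m: m[1]
neg = lambda phi: (lambda m: not phi(m))
conj = lambda a, b: (lambda m: a(m) and b(m))

# Axiom 1: P(phi) = P(phi ∧ psi) + P(phi ∧ ¬psi)
assert P(p) == P(conj(p, q)) + P(conj(p, neg(q)))
# Axiom 2: tautologies get probability 1
assert P(lambda m: True) == 1.0
# Axiom 3: contradictions get probability 0
assert P(conj(p, neg(p))) == 0.0
print(P(p), P(q))  # → 0.625 0.75
```

In the infinite setting of the theorem the same correspondence holds, but the measure over complete theories must be built by the iterative completion process above rather than written down directly.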

2.2 Going meta

So far we have talked about P as a function which assigns probabilities to sentences. If we would like L to serve as its own meta-language, we should augment it to include P as a real-valued function symbol. Call the augmented language L′. We will use the symbol P in two different ways: both as our meta-level evaluation of truth in L′, and as a quoted symbol within L′. However, whenever P appears as a symbol in L′, its argument is always quoted. This ensures that references are unambiguous: P(φ) = p is a statement at the meta level, i.e. φ is true with probability p, while P(⌜φ⌝) = p is the same sentence expressed within L′, which is itself assigned a probability by P.
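The double role of P can be mimicked in a toy encoding (ours; the example sentence and all numbers are made up): inside a sentence of L′, "P" is an inert symbol whose argument is a quotation, while the meta-level P is an ordinary function assigning probabilities to sentences:

```python
# Meta-level P: an ordinary function from (representations of) sentences
# to reals.  Object-level "P": a symbol inside a sentence, whose argument
# is always a quotation.  Illustrative toy encoding; names are ours.

def quote(sentence):
    """A crude Gödel-style quotation: wrap the sentence as inert data."""
    return ("quote", sentence)

# An L'-sentence saying: P(⌜"snow is white"⌝) = 0.9.  The inner "P" is
# just a symbol in a syntax tree; it computes nothing by itself.
inner = ("=", ("P", quote("snow is white")), 0.9)

# The meta-level P assigns probabilities to sentences (toy table here).
table = {"snow is white": 0.9, inner: 0.8}

def P(sentence):
    return table.get(sentence, 0.5)  # arbitrary default for unlisted sentences

print(P("snow is white"))  # meta-level claim about the object sentence
print(P(inner))            # probability of the quoted claim about P itself
```

The point of the encoding is only that `P("snow is white")` and `P(inner)` are two separate values: the second is P's opinion about a sentence that mentions P, not a computation of P applied to itself.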

3 Reflection

3.1 Reflection schema

We would now like to introduce some analog of the reflection schema (1). This must be done with some care if we wish to avoid the problems with schema (1). For example, we might consider the following reflection principle:

    ∀φ ∈ L′ ∀p ∈ Q : P(P(⌜φ⌝) = p) = 1 ⇔ P(φ) = p.   (2)

¹ Here P(φj | Ti) = P(φj ∧ Ti) / P(Ti). It is easy to verify by induction that P(Ti) > 0.

But this leads directly to a contradiction: if we define G ⇔ P(⌜G⌝) < 1, then P(G) < 1 ⟹ P(P(⌜G⌝) < 1) = 1 ⟹ P(G) = 1, while P(G) = 1 ⟹ P(P(⌜G⌝) = 1) = 1 ⟹ P(P(⌜G⌝) < 1) = 0 ⟹ P(G) = 0; so no value of P(G) ∈ [0, 1] is possible. (We will talk about schemas over Q rather than R, because there is a symbol in L′ for every p ∈ Q but no such symbol for a generic p ∈ R. This should be a first clue that schema (2) isn't what we want!)

To overcome this challenge, we imagine P as reflecting an idealized observer's knowledge. We would like this observer to be as informed as possible, but we have just seen that perfect knowledge is contradictory. The most natural way to relax perfect knowledge is to imagine this idealized observer as executing an algorithm (perhaps simulating itself) to discover the value P(⌜φ⌝). In this case, the idealized observer can never discover the exact value of P(⌜φ⌝). Instead, as the algorithm runs for longer and longer, the idealized observer gains access to a more and more accurate estimate of P(⌜φ⌝). This suggests the following reflection principle:

    ∀φ ∈ L′ ∀a, b ∈ Q : (a < P(φ) < b) ⟹ P(a < P(⌜φ⌝) < b) = 1.   (3)

Our goal is to show that this reflection principle is consistent with any consistent theory T.

3.2 Discussion of reflection schema

Given a statement such as G ⇔ P(⌜G⌝) < p, a distribution P satisfying the reflection principle will assign P(G) = p. If P(G) > p, then P(P(⌜G⌝) > p) = 1, so P(G) = 0, and if P(G) < p, then P(P(⌜G⌝) < p) = 1, so P(G) = 1. But if P(G) = p, then P is uncertain about whether P(G) is slightly more or slightly less than p. No matter how precisely it estimates P(G), it remains uncertain.

An important point is that this reflection principle is a schema, not a single axiom. That is, each statement of the form (a < P(φ) < b) ⟹ P(a < P(⌜φ⌝) < b) = 1 is assigned probability 1, but the universal quantification over a, b, and φ is not assigned probability 1. Since there are only countably many statements of this form, all of them jointly are actually assigned probability 1. However, not all true generalizations are assigned probability 1.

If we look at the models of L′ corresponding to a distribution P satisfying the schema (3), they have P(⌜G⌝) = p − ε, where ε is a non-standard rational number which is smaller than every standard rational number. That is, P is mistaken about itself, but its error is given by the infinitesimal ε. (These are the same infinitesimals which are used to formalize differentiation in non-standard analysis.)

Although this paper focuses on Tarski's results on the undefinability of truth, similar paradoxes are at the core of Gödel's incompleteness theorem and the inconsistency of unrestricted comprehension in set theory. A similar approach to the one taken in this paper is adequate
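The case analysis above can be viewed as a best-response map on candidate values of P(G), which the following sketch (ours; the threshold and grid are arbitrary choices, not from the paper) scans for fixed points:

```python
# Best-response view of G <-> (P(⌜G⌝) < p): given a candidate value q
# for P(G), reflection forces P(G) into an interval.  Illustrative
# sketch; names and the grid are ours.

p = 0.7  # the threshold appearing in G; any rational in (0, 1) works

def best_response(q, p):
    """Interval of values for P(G) compatible with reflection when P(G) = q."""
    if q < p:
        return (1.0, 1.0)  # certain that P(⌜G⌝) < p, so G is true
    if q > p:
        return (0.0, 0.0)  # certain that P(⌜G⌝) > p, so G is false
    return (0.0, 1.0)      # at q = p, P cannot tell which side it is on

# Scan a grid: the only q lying inside its own best-response interval is p.
grid = [i / 100 for i in range(101)]
fixed = [q for q in grid
         if best_response(q, p)[0] <= q <= best_response(q, p)[1]]
print(fixed)  # → [0.7]
```

The map q ↦ best_response(q, p) is set-valued, with convex values and a closed graph; this is precisely the shape of correspondence to which the Kakutani fixed-point theorem is applied in Section 3.3.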

to resolve some of these challenges, and e.g. to construct a set theory which satisfies a probabilistic version of the unrestricted comprehension axiom.

We will construct a distribution P satisfying the schema (3). The distribution we construct will be uncomputable, and for all practical purposes inaccessible. But the existence of any such distribution implies that the schema (3) is consistent. This in turn implies that we can invoke this schema as an inference rule in a system without running into contradictions. So even though the distribution we construct is uncomputable, the result may still have significance for efficient systems (in the same way that the predicate True might usefully increase the expressive power of a language, even though the extension of that predicate is uncomputable).

3.3 Proof of consistency of reflection schema

The most important result of this paper is the following theorem:

Theorem 2 (Consistency of reflection schema). Let L be any language and T any consistent theory over L, and suppose that the theory of rational numbers with addition can be embedded in T in the natural sense. Let L′ be the extension of L by the symbol P as above. Then there is a probabilistic valuation P over L′ which is coherent, assigns probability 1 to T, and satisfies the reflection schema (3):

    ∀φ ∈ L′ ∀a, b ∈ Q : (a < P(φ) < b) ⟹ P(a < P(⌜φ⌝) < b) = 1.

Proof. Let A be the set of coherent probability distributions over L′ which assign probability 1 to T. We consider A as a subset of [0, 1]^L′ with the product topology. Clearly A is convex. By using the alternative characterization of coherence, we can see that A is a closed subset of [0, 1]^L′. Thus by Tychonoff's theorem, we conclude that A is compact. Moreover, because T is consistent, A is non-empty: for example, let M be any model of T, in which each P(⌜φ⌝) is assigned some arbitrary value, and let P(φ) = 1 if M ⊨ φ and 0 otherwise.

Given any P ∈ A, let AP be the schema of axioms of the form a < P(⌜φ⌝) < b, where a, b are the endpoints of a rational interval containing P(φ). Let A′(P) = {P′ ∈ A : P′(AP) = 1}. If P ∈ A′(P), then P is consistent with the reflection schema and we are done.

Consider the mapping f : P ↦ A′(P). We will show that this map has a closed graph, and that the image of each P is convex and non-empty. We can then apply the Kakutani fixed point theorem to find some P ∈ A′(P), as desired. (The Kakutani fixed-point theorem applies to [0, 1]^L′ because it is a convex subset of a Hausdorff and locally convex topological vector space.)

First, note that A′(P) is convex and non-empty for the same reasons that A is. So suppose that P2 ∉ A′(P1). In order to show that f has a closed graph, it is sufficient to find neighborhoods U1 ∋ P1, U2 ∋ P2 such that U2 ∩ A′(P) = ∅ for each P ∈ U1. If there is φ ∈ AP1 such that P2(φ) < 1, then φ is of the form P(⌜ψ⌝) ∈ I for some open interval I.
Consider the set U1 = {P ∈ A : P(ψ) ∈ I}. It is easy to see that U1 is open and

P1 ∈ U1. Let U2 = {P ∈ A : P(P(⌜ψ⌝) ∈ I) < 1}. It is easy to see that U2 is open and P2 ∈ U2. But for each P ∈ U1, U2 ∩ A′(P) = ∅, as desired.

If there is no such φ ∈ AP1, then P2 is not coherent or does not assign probability 1 to T. But the space of distributions violating each of these axioms is manifestly open in the product topology (since each axiom depends only on finitely many values of P2). Thus we can take an open set U2 around P2 containing only probability distributions which are incoherent or don't assign T probability 1, and we can let U1 be the entire space. Since U2 ∩ A = ∅ we are done.

Thus f has a closed graph, and f(P) is convex and non-empty for each P. We can now apply the Kakutani fixed point theorem to find P ∈ f(P), as desired.
