Proceedings of the Third Annual ACM Symposium on Theory of Computing, May 1971

The Complexity of Theorem-Proving Procedures

Stephen A. Cook
University of Toronto

Summary.

It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be "reduced" to the problem of determining whether a given propositional formula is a tautology. Here "reduced" means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.

Throughout this paper, a set of strings means a set of strings on some fixed, large, finite alphabet Σ. This alphabet is large enough to include symbols for all sets described here. All Turing machines are deterministic recognition devices, unless the contrary is explicitly stated.

1. Tautologies and Polynomial Reducibility

Let us fix a formalism for the propositional calculus in which formulas are written as strings on Σ. Since we will require infinitely many proposition symbols (atoms), each such symbol will consist of a member of Σ followed by a number in binary notation to distinguish that symbol. Thus a formula of length n can only have about n/log n distinct function and predicate symbols. The logical connectives are & (and), v (or), and ¬ (not).

The set of tautologies (denoted by {tautologies}) is a
certain recursive set of strings on
this alphabet, and we are interested
in the problem of finding a good
lower bound on its possible recognition times. We provide no such lower bound here, but theorem 1 will give evidence that {tautologies} is a difficult set to recognize, since many apparently difficult problems can be reduced to determining tautologyhood. By reduced we mean, roughly speaking, that if tautologyhood could be decided instantly (by an "oracle") then these problems could be decided in polynomial time.
In order to make this notion precise,
we introduce query machines, which
are like Turing machines with oracles
in [1].
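
To make concrete why {tautologies} is a recursive set, the following minimal sketch decides tautologyhood by trying every truth assignment. The nested-tuple representation of formulas and the function names are assumptions made for this illustration only; the running time is exponential in the number of atoms, so the sketch says nothing about a polynomial bound.

```python
from itertools import product

def atoms(f):
    """Collect the atom names occurring in a formula (nested tuples)."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, v):
    """Evaluate formula f under the truth assignment v (dict: atom -> bool)."""
    op = f[0]
    if op == "atom":
        return v[f[1]]
    if op == "not":
        return not evaluate(f[1], v)
    if op == "and":
        return evaluate(f[1], v) and evaluate(f[2], v)
    if op == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    raise ValueError(f"unknown connective: {op}")

def is_tautology(f):
    """True iff f is true under all 2^n assignments to its n atoms."""
    names = sorted(atoms(f))
    return all(evaluate(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

p, q = ("atom", "p"), ("atom", "q")
excluded_middle = ("or", p, ("not", p))
print(is_tautology(excluded_middle))   # True
print(is_tautology(("or", p, q)))      # False
```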
A query machine is a multitape Turing machine with a distinguished tape called the query tape, and three distinguished states called the query state, yes state, and no state, respectively. If M is a query machine and T is a set of strings, then a T-computation of M is a computation of M in which initially M is in the initial state and has an input string w on its input tape, and each time M assumes the query state there is a string u on the query tape, and the next state M assumes is the yes state if u∈T and the no state if u∉T. We think of an "oracle", which knows T, placing M in the yes state or no state.
Definition
A set S of strings is P-reducible (P for polynomial) to a set T of strings iff there is some query machine M and a polynomial Q(n) such that for each input string w, the T-computation of M with input w halts within Q(|w|) steps (|w| is the length of w), and ends in an accepting state iff w∈S.
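
As an informal illustration of a T-computation, the sketch below models the oracle as an object that answers membership queries and counts them. Everything here is an assumption for illustration: the class `Oracle`, the choice T = {Primes}, the example set S (binary numerals w such that w and w+2 are both prime), and the trial-division stand-in that plays the role of "knowing T". An ordinary Python function takes the place of the query machine, so step counts are not modeled.

```python
import math

class Oracle:
    """Answers membership queries for a fixed set T and counts them."""
    def __init__(self, member):
        self.member = member      # stand-in for "knowing T"
        self.queries = 0

    def ask(self, u):
        self.queries += 1
        return self.member(u)     # True = yes state, False = no state

def is_prime_notation(u):
    """Stand-in oracle for T = {Primes}: u is the binary numeral of a prime."""
    n = int(u, 2)
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))

def twin_prime_machine(w, oracle):
    """Query machine for S = {w : w and w+2 are both prime}; two queries."""
    n = int(w, 2)
    first = oracle.ask(w)                 # query tape holds w
    second = oracle.ask(bin(n + 2)[2:])   # query tape holds w + 2
    return first and second               # accept iff both answers are yes

oracle = Oracle(is_prime_notation)
print(twin_prime_machine("1011", oracle), oracle.queries)   # True 2
print(twin_prime_machine("111", oracle))                    # False
```

Apart from the oracle calls, the machine only does arithmetic on numerals of length about |w|, which is the sense in which this S is P-reducible to {Primes}.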
It is not hard to see that P-reducibility is a transitive relation. Thus the relation E on sets of strings, given by (S,T)∈E iff each of S and T is P-reducible to the other, is an equivalence relation. The equivalence class containing a set S will be denoted by deg(S) (the polynomial degree of difficulty of S).
Definition: We will denote deg({0}) by $\mathcal{L}_*$, where 0 denotes the zero function.

Thus $\mathcal{L}_*$ is the class of sets recognizable in polynomial time. $\mathcal{L}_*$ was discussed in [2], p. 5, and is the string analog of Cobham's class $\mathcal{L}$ of functions [3].
We now define the following special sets of strings.

1) The subgraph problem is the problem: given two finite undirected graphs, determine whether the first is isomorphic to a subgraph of the second. A graph G can be represented by a string $\bar{G}$ on the alphabet {0,1,*} by listing the successive rows of its adjacency matrix, separated by *s. We let {subgraph pairs} denote the set of strings $\bar{G}_1$**$\bar{G}_2$ such that $G_1$ is isomorphic to a subgraph of $G_2$.

2) The graph isomorphism problem will be represented by the set, denoted by {isomorphic graphpairs}, of all strings $\bar{G}_1$**$\bar{G}_2$ such that $G_1$ is isomorphic to $G_2$.

3) The set {Primes} is the set of all binary notations for prime numbers.

4) The set {DNF tautologies} is the set of strings representing tautologies in disjunctive normal form.

5) The set $D_3$ consists of those tautologies in disjunctive normal form in which each disjunct has at most three conjuncts (each of which is an atom or negation of an atom).
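
The string encoding used in 1) and 2) can be sketched as follows. The function names are hypothetical, and the membership test for {subgraph pairs} is a brute-force search over one-one vertex maps, included only to pin down the definition (every edge of the first graph must map to an edge of the second); it is exponential and is not a proposed decision procedure.

```python
from itertools import permutations

def encode_graph(adj):
    """Rows of the adjacency matrix, separated by '*'."""
    return "*".join("".join(str(bit) for bit in row) for row in adj)

def encode_pair(adj1, adj2):
    """The string G1**G2 for a pair of graphs."""
    return encode_graph(adj1) + "**" + encode_graph(adj2)

def decode_pair(s):
    """Recover both adjacency matrices from the string G1**G2."""
    first, second = s.split("**")
    parse = lambda g: [[int(c) for c in row] for row in g.split("*")]
    return parse(first), parse(second)

def is_subgraph_pair(s):
    """Brute force: try every one-one map of G1's vertices into G2's."""
    g1, g2 = decode_pair(s)
    n1, n2 = len(g1), len(g2)
    for image in permutations(range(n2), n1):
        if all(g2[image[a]][image[b]]
               for a in range(n1) for b in range(n1) if g1[a][b]):
            return True
    return False

path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(encode_pair(path3, triangle))   # 010*101*010**011*101*110
print(is_subgraph_pair(encode_pair(path3, triangle)))   # True
```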
Theorem 1: If a set S of strings is accepted by some nondeterministic Turing machine within polynomial time, then S is P-reducible to {DNF tautologies}.

Corollary: Each of the sets in definitions 1)-5) is P-reducible to {DNF tautologies}.

This is because each set, or its complement, is accepted in polynomial time by some nondeterministic Turing machine.
Proof of the theorem: Suppose a nondeterministic Turing machine M accepts a set S of strings within time Q(n), where Q(n) is a polynomial. Given an input w for M, we will construct a proposition formula A(w) in conjunctive normal form such that A(w) is satisfiable iff M accepts w. Thus ¬A(w) is easily put in disjunctive normal form (using De Morgan's laws), and ¬A(w) is a tautology if and only if w∉S. Since the whole construction can be carried out in time bounded by a polynomial in |w| (the length of w), the theorem will be proved.
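
The De Morgan step can be sketched as follows. As an assumption of this illustration, literals are nonzero integers (atom i is `i`, its negation is `-i`), a CNF formula is a list of clauses, and a DNF formula is a list of disjuncts; the brute-force checks are exponential and serve only to confirm the equivalence on small inputs.

```python
from itertools import product

def negate_cnf(cnf):
    """De Morgan: the negation of a CNF formula, written in DNF."""
    return [[-lit for lit in clause] for clause in cnf]

def holds(lit, v):
    """Truth value of a literal under assignment v (dict: atom -> bool)."""
    return v[abs(lit)] if lit > 0 else not v[abs(lit)]

def assignments(formula):
    """All truth assignments to the atoms occurring in the formula."""
    names = sorted({abs(l) for part in formula for l in part})
    for bits in product([False, True], repeat=len(names)):
        yield dict(zip(names, bits))

def cnf_satisfiable(cnf):
    return any(all(any(holds(l, v) for l in c) for c in cnf)
               for v in assignments(cnf))

def dnf_tautology(dnf):
    return all(any(all(holds(l, v) for l in d) for d in dnf)
               for v in assignments(dnf))

unsat = [[1, 2], [-1], [-2]]
print(negate_cnf(unsat))                 # [[-1, -2], [1], [2]]
print(dnf_tautology(negate_cnf(unsat)))  # True
```

The negation takes time linear in the length of the formula, so it does not affect the polynomial bound on the construction.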
We may as well assume the Turing machine M has only one tape, which is infinite to the right but has a left-most square. Let us number the squares from left to right 1, 2, .... Let us fix an input w to M of length n, and suppose w∈S. Then there is a computation of M with input w that ends in an accepting state within T = Q(n) steps. The formula A(w) will be built from many different proposition symbols, whose intended meanings, listed below, refer to such a computation.
Suppose the tape alphabet for M is $\{\sigma_1, \ldots, \sigma_\ell\}$, and the set of states is $\{q_1, \ldots, q_r\}$. Notice that since the computation has at most T = Q(n) steps, no tape square beyond number T is scanned.

Proposition symbols:

$P^i_{s,t}$ for $1 \le i \le \ell$, $1 \le s, t \le T$. $P^i_{s,t}$ is true iff tape square number s at step t contains the symbol $\sigma_i$.

$Q^i_t$ for $1 \le i \le r$, $1 \le t \le T$. $Q^i_t$ is true iff at step t the machine is in state $q_i$.

$S_{s,t}$ for $1 \le s, t \le T$ is true iff at time t square number s is scanned by the tape head.
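The formula A(w) is a conjunction B & C & D & E & F & G & H & I formed as follows. Notice A(w) is in conjunctive normal form.

B will assert that at each step t, one and only one square is scanned. B is a conjunction $B_1 \,\&\, B_2 \,\&\, \cdots \,\&\, B_T$, where $B_t$ asserts that at time t one and only one square is scanned:

$$B_t = (S_{1,t} \vee S_{2,t} \vee \cdots \vee S_{T,t}) \;\&\; \Big[ \mathop{\&}_{1 \le i < j \le T} (\neg S_{i,t} \vee \neg S_{j,t}) \Big]$$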
For $1 \le s \le T$ and $1 \le t \le T$, $C_{s,t}$ asserts that at square s and time t there is one and only one symbol. C is the conjunction of all the $C_{s,t}$.
D asserts that for each t there
is one and only one state.
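
B, C and D all use the same "one and only one" pattern: one clause saying at least one atom holds, plus one clause per pair saying not both hold. A minimal sketch of generating these clauses for B follows. The representation is an assumption of this illustration: an atom is a tuple such as `("S", s, t)` and a literal is a `(sign, atom)` pair.

```python
from itertools import combinations

def exactly_one(atoms):
    """CNF clauses asserting that exactly one of the given atoms is true."""
    at_least_one = [[(True, a) for a in atoms]]
    at_most_one = [[(False, a), (False, b)]
                   for a, b in combinations(atoms, 2)]
    return at_least_one + at_most_one

def B(T):
    """Clauses of B = B_1 & ... & B_T: one scanned square at each step."""
    clauses = []
    for t in range(1, T + 1):
        clauses += exactly_one([("S", s, t) for s in range(1, T + 1)])
    return clauses

print(len(B(3)))   # 12
```

Each $B_t$ contributes 1 + T(T-1)/2 clauses, so B has on the order of T^3 clauses, which is polynomial in n because T = Q(n).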
E asserts the initial conditions
are satisfied:
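$$E = Q^0_1 \,\&\, S_{1,1} \,\&\, P^{i_1}_{1,1} \,\&\, P^{i_2}_{2,1} \,\&\, \cdots \,\&\, P^{i_n}_{n,1} \,\&\, P^1_{n+1,1} \,\&\, \cdots \,\&\, P^1_{T,1},$$

where $w = \sigma_{i_1} \cdots \sigma_{i_n}$, $q_0$ is the initial state, and $\sigma_1$ is the blank symbol.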
... and s > 3. Then A is a tautology if and only if A' is a tautology, where

$$A' = (P \,\&\, R_3 \,\&\, \cdots \,\&\, R_s) \vee (\neg P \,\&\, R_1 \,\&\, R_2) \vee B_2 \vee \cdots \vee B_k,$$

where P is a new atom. Since we have reduced the number of conjuncts in $B_1$, this process may be repeated until eventually a formula is found with at most three conjuncts per disjunct. Clearly the entire process is bounded in time by a polynomial in the length of A.
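
The repeated splitting step can be sketched as follows. As assumptions of this illustration, literals are nonzero integers (`-i` negates atom `i`), a DNF formula is a list of disjuncts, and new atoms are numbered above every atom already in use; the brute-force tautology check is exponential and is there only to confirm on small examples that tautologyhood is preserved.

```python
from itertools import product

def to_d3(dnf):
    """Split disjuncts until each has at most three conjuncts."""
    next_atom = max((abs(l) for d in dnf for l in d), default=0) + 1
    work, result = [list(d) for d in dnf], []
    while work:
        d = work.pop()
        if len(d) <= 3:
            result.append(d)
        else:
            p = next_atom                    # the new atom P
            next_atom += 1
            work.append([p] + d[2:])         # P & R3 & ... & Rs
            result.append([-p, d[0], d[1]])  # not-P & R1 & R2
    return result

def dnf_tautology(dnf):
    """Brute-force check over all assignments (illustration only)."""
    names = sorted({abs(l) for d in dnf for l in d})
    for bits in product([False, True], repeat=len(names)):
        v = dict(zip(names, bits))
        if not any(all(v[abs(l)] == (l > 0) for l in d) for d in dnf):
            return False
    return True

A = [[1, 2, 3, 4, 5], [-1], [-2], [-3], [-4], [-5]]
print(dnf_tautology(A), dnf_tautology(to_d3(A)))   # True True
```

Each split shortens a long disjunct by one conjunct and adds one three-literal disjunct, so a disjunct of length s is handled in s - 3 splits.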
It remains to show $D_3$ is P-reducible to {subgraph pairs}. Suppose A is a formula in disjunctive normal form with three conjuncts per disjunct. Thus $A = C_1 \vee \cdots \vee C_k$, where $C_i = R_{i1} \,\&\, R_{i2} \,\&\, R_{i3}$, and each $R_{ij}$ is an atom or a negation of an atom. Now let $G_1$ be the complete graph with vertices $\{v_1, v_2, \ldots, v_k\}$, and let $G_2$ be the graph with vertices $\{u_{ij}\}$, $1 \le i \le k$, $1 \le j \le 3$, such that $u_{ij}$ is connected by an edge to $u_{rs}$ if and only if $i \ne r$ and the two literals $(R_{ij}, R_{rs})$ do not form an opposite pair (that is, they are neither of the form $(P, \neg P)$ nor of the form $(\neg P, P)$). Thus there is a falsifying truth assignment to the formula A iff there is a graph homomorphism $\phi : G_1 \to G_2$ such that for each i, $\phi(v_i) = u_{ij}$ for some j. (The homomorphism tells for each i which of $R_{i1}, R_{i2}, R_{i3}$ should be falsified, and the selective lack of edges in $G_2$ guarantees that the resulting truth assignment is consistently specified.)
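
The correspondence between such homomorphisms and falsifying truth assignments can be sketched as follows. This covers only the graph $G_2$ and the search for a suitable homomorphism, and the search is brute force. As assumptions of this illustration, literals are nonzero integers (`-i` negates atom `i`), vertex $u_{ij}$ is the pair `(i, j)` with indices starting at 0, and all function names are hypothetical.

```python
from itertools import product

def build_g2(dnf):
    """Vertices (i, j) for literal j of disjunct i; edges join literals of
    different disjuncts that do not form an opposite pair."""
    verts = [(i, j) for i, d in enumerate(dnf) for j in range(len(d))]
    edges = {(u, w) for u in verts for w in verts
             if u[0] != w[0] and dnf[u[0]][u[1]] != -dnf[w[0]][w[1]]}
    return verts, edges

def find_homomorphism(dnf):
    """Choose one u_ij per disjunct so that all chosen vertices are adjacent,
    i.e. an image of the complete graph G1 with v_i sent to some u_ij."""
    _, edges = build_g2(dnf)
    k = len(dnf)
    for choice in product(*(range(len(d)) for d in dnf)):
        picked = [(i, choice[i]) for i in range(k)]
        if all((picked[a], picked[b]) in edges
               for a in range(k) for b in range(a + 1, k)):
            return picked
    return None

def falsifying_assignment(dnf, picked):
    """Make the chosen literal of every disjunct false."""
    return {abs(dnf[i][j]): dnf[i][j] < 0 for i, j in picked}

def falsifies(dnf, v):
    """True if every disjunct has a false literal (unassigned atoms = False)."""
    value = lambda l: v.get(abs(l), False) == (l > 0)
    return not any(all(value(l) for l in d) for d in dnf)

A = [[1, 2, 3], [-1, 2, 3], [-2, -3, 1]]
picked = find_homomorphism(A)
print(falsifies(A, falsifying_assignment(A, picked)))   # True
print(find_homomorphism([[1], [-1]]))                   # None
```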
In order to guarantee that a one-one homomorphism $\phi : G_1 \to G_2$ has the property that for each i, $\phi(v_i) = u_{ij}$ for some j, we modify $G_1$ and $G_2$ as follows. We select graphs $H_1, H_2, \ldots, H_k$ which are sufficiently distinct from each other that if $G_1'$ is formed from $G_1$ by attaching $H_i$ to $v_i$, $1 \le i \le k$, and $G_2'$ is formed from $G_2$ by attaching $H_i$ to each of $u_{i1}$ and $u_{i2}$ and $u_{i3}$, $1 \le i \le k$, then every one-one homomorphism $\phi : G_1' \to G_2'$ has the property just stated. It is not hard to see such a construction can be carried out in polynomial time. Then $G_1'$ can be embedded in $G_2'$ if and only if $A \notin D_3$. This completes the proof of theorem 2.
2. Discussion
Theorem 1 and its corollary give strong evidence that it is not easy to determine whether a given proposition formula is a tautology, even if the formula is in normal disjunctive form. Theorems 1 and 2 together suggest that it is fruitless to search for a polynomial decision procedure for the subgraph problem, since success would bring polynomial decision procedures to many other apparently intractable problems. Of course the same remark applies to any combinatorial problem to which {tautologies} is P-reducible.
Furthermore, the theorems suggest that {tautologies} is a good candidate for an interesting set not in $\mathcal{L}_*$, and I feel it is worth spending considerable effort trying to prove this conjecture. Such a proof would be a major breakthrough in complexity theory.
In view of the apparent complexity of {DNF tautologies}, it is interesting to examine the Davis-Putnam procedure [5]. This procedure was designed to determine whether a given formula in conjunctive normal form is satisfiable, but of course the "dual" procedure determines whether a given formula in disjunctive normal form is a tautology. I have not yet been able to find a series of examples showing the procedure (treated sympathetically to avoid certain pitfalls) must require more than polynomial time. Nor have I found an interesting upper bound for the time required.
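
For orientation, here is a minimal sketch of a satisfiability test in the spirit of the Davis-Putnam procedure, together with its dual for DNF formulas. It implements only the rule that eliminates one atom at a time by forming all resolvents on it; the other rules of the procedure in [5] are omitted, so this should not be read as a faithful rendering of that procedure. As assumptions of this illustration, literals are nonzero integers (`-i` negates atom `i`) and the function names are hypothetical.

```python
def is_trivial(clause):
    """A clause containing an opposite pair is always true."""
    return any(-l in clause for l in clause)

def davis_putnam_satisfiable(cnf):
    """Eliminate atoms one at a time by resolution (simplified sketch)."""
    clauses = {frozenset(c) for c in cnf}
    clauses = {c for c in clauses if not is_trivial(c)}
    while clauses:
        if frozenset() in clauses:
            return False                  # empty clause: unsatisfiable
        atom = abs(next(iter(next(iter(clauses)))))
        pos = {c for c in clauses if atom in c}
        neg = {c for c in clauses if -atom in c}
        rest = clauses - pos - neg
        # All resolvents on the chosen atom replace the clauses mentioning it.
        resolvents = {(p - {atom}) | (n - {-atom}) for p in pos for n in neg}
        clauses = rest | {r for r in resolvents if not is_trivial(r)}
    return True                           # no clauses left: satisfiable

def dnf_is_tautology(dnf):
    """Dual use: a DNF formula is a tautology iff its negation is unsatisfiable."""
    return not davis_putnam_satisfiable([[-l for l in d] for d in dnf])

print(davis_putnam_satisfiable([[1, 2], [-1], [-2]]))   # False
print(dnf_is_tautology([[1], [-1]]))                    # True
```

The set of resolvents can grow at each elimination step, which is one reason the running time of this kind of procedure is not obviously polynomial.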
If we let strings represent natural numbers (or k-tuples of natural numbers) using m-adic or other suitable notation, then the notions in the preceding sections can be made to apply to sets of numbers (or k-place relations on numbers). It is not hard to see that the set of relations accepted in polynomial time by some nondeterministic Turing machine is precisely the set $\mathcal{L}^+$ of relations of the form

$$(1)\qquad (\exists y \le g_k(\bar{x}))\, R(\bar{x}, y),$$

where $g_k(\bar{x}) = 2^{(\ell(\max \bar{x}))^k}$, $\ell(z)$ is the dyadic length of z, and $R(\bar{x}, y)$ is an $\mathcal{L}_*$ relation. ($\mathcal{L}^+$ is the class of extended positive rudimentary relations of Bennett [6].) If we remove the bound on the quantifier in formula (1), the class $\mathcal{L}^+$ would become the class of recursively enumerable sets. Thus if $\mathcal{L}^+$ is the analog of the class of r.e. sets, then determining tautologyhood is the analog of the halting problem, since, according to theorem 1, {tautologies} has the complete $\mathcal{L}^+$ degree just as the halting problem has the complete r.e. degree. Unfortunately, the diagonal argument which shows the halting problem is not recursive apparently cannot be adapted to show {tautologies} is not in $\mathcal{L}_*$.

3. The Predicate Calculus
Formulas in the predicate calculus are represented by strings in a manner similar to the propositional calculus. In addition to the symbols for the latter, we need the quantifier symbols ∀ and ∃, and symbols for forming an infinite list of individual variables, and infinite lists of function and predicate symbols of each order (of course the underlying alphabet Σ is still finite).
Suppose Q is a procedure which operates on the above formulas and which terminates on a given input formula A iff A is unsatisfiable. Since there is no decision procedure for satisfiability in the predicate calculus, it follows that there is no recursive function T such that if A is unsatisfiable, then Q will terminate within T(n) steps, where n is the length of A. How then does one appraise the efficiency of the procedure?
We will take the following approach. Most automatic theorem provers depend on the Herbrand theorem, which states briefly that a formula A is unsatisfiable if and only if some conjunction of substitution instances of the functional form fn(A) of A is truth-functionally inconsistent. Suppose we order the terms in the Herbrand universe of fn(A) according to rank, and then order in a natural way the substitution instances of fn(A) from the Herbrand universe. The ordering should be such that in general substitution instances which use terms with greater rank follow substitution instances which use terms of lesser rank. Let $A_1, A_2, \ldots$ be these substitution instances in order.
Definition: If A is unsatisfiable, then φ(A) is the least k such that $A_1 \,\&\, A_2 \,\&\, \cdots \,\&\, A_k$ is truth-functionally inconsistent. If A is satisfiable, then φ(A) is undefined.
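
The definition of φ(A) can be illustrated with a small sketch. It assumes the caller supplies the substitution instances $A_1, A_2, \ldots$ in order, each already in clause form, with a literal written as a `(sign, ground_atom)` pair; enumerating the Herbrand universe is not shown. The `limit` parameter is an addition for this illustration, needed because φ(A) is undefined when A is satisfiable, and the consistency test is brute force.

```python
from itertools import product

def consistent(clauses):
    """Brute-force truth-functional consistency of a set of ground clauses."""
    atoms = sorted({a for c in clauses for _, a in c})
    for bits in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, bits))
        if all(any(v[a] == sign for sign, a in c) for c in clauses):
            return True
    return False

def phi(instances, limit):
    """Least k <= limit with A_1 & ... & A_k inconsistent, else None."""
    conj = []
    for k, inst in enumerate(instances, start=1):
        if k > limit:
            return None
        conj.extend(inst)
        if not consistent(conj):
            return k
    return None

def example_instances():
    """Instances of P(x) & not-P(f(x)) over the terms a, f(a), f(f(a)), ..."""
    term = "a"
    while True:
        nxt = f"f({term})"
        yield [[(True, f"P({term})")], [(False, f"P({nxt})")]]
        term = nxt

print(phi(example_instances(), limit=10))   # 2
```

In the example, the first instance alone is consistent, but the second instance asserts P(f(a)), which the first denies, so the least inconsistent prefix has length 2.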
Now let $Q_0$ be the procedure which, given A, computes the sequence $A_1, A_2, \ldots$ and for each i, tests whether $A_1 \,\&\, \cdots \,\&\, A_i$ is truth-functionally consistent. If the answer is ever no, the procedure terminates successfully. Then clearly there is a recursive T(k) such that for all k and all formulas A, if the length of A ≤ k and φ(A) ≤ k, then $Q_0$ will terminate within T(k) steps. We suggest that the function T(k) is a measure of the efficiency of $Q_0$.
For convenience, all procedures in this section will be realized on single tape Turing machines, which we shall call simply machines.
Definition: Given a machine $M_Q$ and recursive function $T_Q(k)$, we will say $M_Q$ is of type Q and runs within time $T_Q(k)$ provided that when $M_Q$ starts with a predicate formula A written on its tape, then $M_Q$ halts if and only if A is unsatisfiable, and for all k, if φ(A)