Logic and Logic Programming
J.A. Robinson
Logic has been around for a very long time [23]. It was already an old subject 23 centuries ago, in Aristotle's day (384-322 BC). While Aristotle was not its originator, despite a widespread impression to the contrary, he was certainly its first important figure. He placed logic on sound systematic foundations, and it was a major course of study in his own university in Athens. His lecture notes on logic can still be read today. No doubt he taught logic to the future Alexander the Great when he served for a time as the young prince's personal tutor. In Alexandria a generation later (about 300 B.C.), Euclid played a similar role in systematizing and teaching the geometry and number theory of that era. Both Aristotle's logic and Euclid's geometry have endured and prospered. In some high schools and colleges, both are still taught in a form similar to their original one. The old logic, however, like the old geometry, has by now evolved into a much more general and powerful form.

Modern ('symbolic' or 'mathematical') logic dates back to 1879, when Frege published the first version of what today is known as the predicate calculus [14]. This system provides a rich and comprehensive notation, which Frege intended to be adequate for the expression of all mathematical concepts and for the formulation of exact deductive reasoning about them. It seems to be so. The principal feature of the predicate calculus is that it offers a precise characterization of the concept of proof. Its proofs, as well as its sentences and its other formal expressions, are mathematically defined objects which are intended not only to express ideas meaningfully--that is, to be used as one uses a language--but also to be the subject matter of mathematical analysis. They are also capable of being manipulated as the data objects of construction and recognition algorithms.

At the end of the nineteenth century, mathematics had reached a stage in which it was more than ready to exploit Frege's powerful new instrument. Mathematicians were opening up new areas of research that demanded much deeper logical understanding, and far more careful handling of proofs, than had previously been required. Some of these were David Hilbert's abstract axiomatic recasting of geometry and Giuseppe Peano's of arithmetic, as well as Georg Cantor's intuitive explorations of general set theory, especially his elaboration of the dazzling theory of transfinite ordinal and cardinal numbers. Others were Ernst Zermelo's axiomatic analysis of set theory following the discovery of the logical and set-theoretic paradoxes (such as Bertrand Russell's set of all sets which are not members of themselves, which therefore by definition both is, and also is not, a member of itself); and the huge reductionist work Principia Mathematica by Bertrand Russell and Alfred North Whitehead. All of these developments had either shown what could be done, or had revealed what needed to be done, with the help of this new logic. But it was necessary first for mathematicians to master its techniques and to explore its scope and its limits.

Significant early steps toward this end were taken by Leopold Löwenheim (1915, [29]) and Thoralf Skolem [45], who studied the symbolic "satisfiability" of formal expressions. They showed that sets of abstract logical conditions could be proved consistent by being given specific interpretations constructed from the very symbolic expressions in which they are formulated. Their work opened the way for Kurt Gödel (1930, [17]) and Jacques Herbrand (1930, [19]) to prove, in their doctoral dissertations, the first versions of what is now called the completeness of the predicate calculus. Gödel and Herbrand both demonstrated that the proof machinery of the predicate calculus can provide a formal proof for every logically true prop-
7    if P1 and ... and Pm then Q1
10   not (P1 and ... and Pm)
11   not true (or: false)
Horn-clauses are cases 5 onward (where n = 1 or n = 0). The clauses in cases 5 to 8 are positive Horn-clauses (n = 1); those from 9 onwards are negative Horn-clauses (n = 0). Cases 2, 4, 6, 8 and 11 are unconditional clauses (m = 0). The other cases (m ≥ 1) are conditional clauses.

Variants. Separation of Clauses
As we shall soon see, the choice of variables in a general clause is somewhat arbitrary, and neither the essential syntactic structure nor the meaning of a clause is affected if we replace some or all of its variables by other variables. The only proviso is that the correspondence between old and new variables must be one-to-one. Two clauses which differ from each other only in this way are called variants of each other. If two clauses have no variables in common, they are said to be separated or standardized apart. Thus E(x) D(x y) --> A(F(x y)) and E(u) D(u y) --> A(F(u y)) are variants;

most convenient one for writing expressions, and it is in this representation that we have to be careful to avoid 'name-clashes' when choosing names for variables.

Atoms
Calling atomic sentences 'atoms' may run some risk of confusion with Lisp's usage of that word, but it is well established. There are two noncomposite atoms--the truth values true, false--but in general atoms are composite expressions, with two components: a predicate and a list of arguments. The usual convention for writing a composite atom is to write its predicate immediately before its argument list, as for example:

    MOTHER(MARY, THOMAS)
    GREATER-THAN(SUM-OF(THREE, SIX), SEVEN)

The predicates are relational constants MOTHER, GREATER-THAN, and so on: just identifiers,

    PLUS(THREE, SIX)
    SUCCESSOR(SUCCESSOR(SUCCESSOR(ZERO))).

The operators are functional constants: PLUS, SUCCESSOR, and so on. When the argument list of a term is empty, we usually skip explicitly writing the empty list, and write the term as if it consisted of its constant alone, as MARY, THOMAS, instead of MARY(), THOMAS(). Every relational and functional constant comes with an arity, which is a nonnegative integer, and which is considered to be part of the constant's identity. A constant having arity n is said to be n-ary. Thus MARY is 0-ary, SUCCESSOR is 1-ary, GREATER-THAN is 2-ary, and so on. The basic formation rule for composite expressions (atoms or terms) is that an n-ary constant must always be

In writing a list, we may place a comma after each item (other than the last) to enhance the readability. This is, however, optional, and is not part of the definition of a list.
48 March 1992/Vol.35, No.3/COMMUNICATIONS OF THE ACM
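The definitions of variants and of separated clauses can be made concrete with a small sketch. Everything named here (`Var`, `App`, `Clause`, `are_variants`, `are_separated`) is hypothetical and chosen for illustration only. As a simplifying assumption, the atoms of the two clauses are compared in the order in which they are written, although a clause is really a pair of sets of atoms.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    """A constant applied to its argument list, e.g. F(x y) or MARY."""
    constant: str
    args: Tuple["Expr", ...] = ()


Expr = Union[Var, App]


@dataclass(frozen=True)
class Clause:
    antecedent: Tuple[App, ...]
    consequent: Tuple[App, ...]


def _match(a, b, fwd, bwd):
    """Extend the old-to-new (fwd) and new-to-old (bwd) variable maps."""
    if isinstance(a, Var) and isinstance(b, Var):
        # Both directions must agree, which makes the map one-to-one.
        return fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
    if isinstance(a, App) and isinstance(b, App):
        return (a.constant == b.constant and len(a.args) == len(b.args)
                and all(_match(x, y, fwd, bwd) for x, y in zip(a.args, b.args)))
    return False


def are_variants(c1, c2):
    """True if c2 arises from c1 by a one-to-one renaming of variables.

    Assumption: atoms are compared position by position.
    """
    fwd, bwd = {}, {}
    pairs = (list(zip(c1.antecedent, c2.antecedent))
             + list(zip(c1.consequent, c2.consequent)))
    return (len(c1.antecedent) == len(c2.antecedent)
            and len(c1.consequent) == len(c2.consequent)
            and all(_match(a, b, fwd, bwd) for a, b in pairs))


def variables(e):
    if isinstance(e, Var):
        return {e}
    return set().union(*(variables(a) for a in e.args))


def are_separated(c1, c2):
    """True if the two clauses have no variables in common."""
    v1 = set().union(*(variables(a) for a in c1.antecedent + c1.consequent))
    v2 = set().union(*(variables(a) for a in c2.antecedent + c2.consequent))
    return v1.isdisjoint(v2)


def example(v, w):
    """Build E(v) D(v w) --> A(F(v w)) for the given variables."""
    return Clause((App("E", (v,)), App("D", (v, w))),
                  (App("A", (App("F", (v, w)),)),))


x, y, u = Var("x"), Var("y"), Var("u")
c1, c2 = example(x, y), example(u, y)
```

With these definitions `are_variants(c1, c2)` holds for the article's pair of clauses, while `are_separated(c1, c2)` does not, because both clauses still contain the variable y. Replacing x and y by the same variable is rejected, since that renaming is not one-to-one.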
belled by the symbol K.
• an applicative expression K(E1, ..., En) is a graph whose root is unlabelled and has n + 1 out-arcs which are labelled respectively by the integers 0 to n. The out-arc labelled by 0 points to the node which is the constant K. For i = 1, ..., n, the out-arc labelled i points to (the root of the graph which is) the term Ei.
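The graph representation just described can be sketched as a small data structure. The names `Node`, `constant`, `application` and `show` are hypothetical. Treating a variable as a labelled leaf node, shared wherever the variable occurs, is an assumption made here because the corresponding part of the definition is not in this excerpt.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    # A leaf carries a symbol; the root of an applicative expression is unlabelled.
    label: Optional[str] = None
    # Out-arcs keyed by their integer labels 0..n.
    arcs: Dict[int, "Node"] = field(default_factory=dict)


def constant(symbol: str) -> Node:
    """A single node labelled by the symbol."""
    return Node(label=symbol)


def application(k: str, *terms: Node) -> Node:
    """Graph for K(E1, ..., En): unlabelled root with out-arcs 0..n."""
    root = Node()
    root.arcs[0] = constant(k)            # arc 0 points to the constant K
    for i, term in enumerate(terms, start=1):
        root.arcs[i] = term               # arc i points to the root of Ei
    return root


def show(node: Node) -> str:
    """Write the graph back out in the K(E1 ... En) notation."""
    if not node.arcs:
        return node.label
    head = node.arcs[0].label
    parts = [show(node.arcs[i]) for i in range(1, len(node.arcs))]
    return f"{head}({' '.join(parts)})"


x = Node(label="x")                       # assumed encoding of a variable
t = application("F", x, application("G", x))
```

Because both occurrences of x are the same node object, `t` is a graph rather than a tree, and `show(t)` gives `F(x G(x))`.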
was recently developed, analyzed and efficiently implemented in [2]. The elegant data-parallel SIMD implementation for the Connection Machine exploits all the inherent parallelism in the process very effectively.

The sequential version of this "fast unification" algorithm was hit upon independently by [4, 22, 42], improving an earlier formulation by [3]. As far as I know, the first version of a unification algorithm to be explicitly stated and accompanied by correctness and termination proofs was in [39].

Later, in [41], I formulated a more efficient version of the algorithm, using a tabular representation of the graph-representation to gain some of the same computational advantages which were brilliantly orchestrated on a much larger scale by [5] in their important structure-sharing resolution theorem-prover. This tabular representation [41] is also the point of departure for [2].

Herbrand's original (1930) version of the unification process is stated briefly, informally, and without proof (see [19]).

In 1984 [13] pointed out that in certain cases there is no opportunity for the parallel graph-shrinking algorithm to achieve any significant speed-up. Thus, for example, in finding the mgu {x = A} of the set

    {F(F(F(F(F(F(F(F(x)))))))),
     F(F(F(F(F(F(F(F(A))))))))}

we can merge only one pair of nodes, and generate only one new link, at each iteration of the loop. These successive minimal modifications of the graph therefore comprise essentially a sequential process. However, such 'worst cases' are more pathological than typical, and experience suggests that they are rarely met in real applications.

Resolution
Once we can compute an mgu for any unifiable partition of a set of expressions (or show the partition not to be unifiable, if that is the case), we are ready to make inferences by resolution.

The fundamental resolution inference pattern is closely related to what logicians call the 'cut' inference. (In Prolog programming parlance, unfortunately, the word 'cut' has come to have another, quite different, meaning). Cut inferences have the form:

    from A --> (B + {L}) and
         ({L} + C) --> D
    infer (A ∪ C) --> (B ∪ D).

We can make a cut inference from two clauses if and only if there is some atom L which is in the antecedent of one clause and the consequent of the other. To form the conclusion of the inference, we first 'cut' out L from both places, and then merge the two antecedents into one and the two consequents into one. The 'disjoint union' notation X + Y denotes the union X ∪ Y, but also carries the further information that X ∩ Y = ∅.

Example 1. From the clauses A B --> C D and D E --> F G we can infer the clause A B E --> C F G by a cut, eliminating the atom D.

The resolution inference pattern generalizes the cut inference pattern by bringing in unification. The resolution inference pattern has the form:

    from A --> (B + M) and
         (N + C) --> D
    infer (A ∪ C)σ --> (B ∪ D)σ
    where σ is an mgu of the one-part partition {M ∪ N}.

In making a resolution inference, we must first use unification to deduce a pair of instances of the two premises suitable for a cut to be applied. In the special case that M = N = {L}, the mgu of the partition {M ∪ N} is the identity substitution. So in this case, a resolution is the same as a cut.

Example 2. From --> P(G(r s) r s) and P(x y u) P(y z v) P(x v w) --> P(u z w) we infer P(r z r) --> P(s z s) by a resolution in which M = {P(G(r s) r s)} and N = {P(x y u), P(x v w)}, since {M ∪ N} is unifiable with mgu {x = G(r s), y = v = r, u = w = s}.

Example 3. From P(x y u) P(y z v) P(x v w) --> P(u z w) and P(a b c) P(b d e) P(c d f) --> P(a e f) we infer P(x y a) P(y b v) P(x v c) P(b d e) P(c d f) --> P(a e f) by a resolution in which M = {P(u z w)} and N = {P(a b c)}, since {M ∪ N} is unifiable with mgu {u = a, z = b, w = c}.

From two given clauses, only a finite number of clauses can be inferred by resolution--one for each choice of the 'cut' sets M and N for which the partition {M ∪ N} is unifiable. If there are no such choices of M and N, then nothing can be inferred from the two clauses by resolution.

Resolution Deductions and Proofs
A resolution deduction is a finite tree whose nodes are labeled by clauses, each nonleaf node being labeled by a clause which is inferred by a resolution inference from the clauses labeling its immediate successors. The conclusion of the deduction is the clause labeling its root, and the premises of the deduction are the clauses labeling its leaves. A resolution proof is a resolution deduction whose conclusion is false (= the empty clause). Such a proof establishes that the premises are contradictory (unsatisfiable). If S is any unsatisfiable set of clauses there is always a resolution proof whose premises are all in S. This fact is the completeness of resolution (see [39]).

A resolution proof with n + 1 premises can be taken in n + 1 different ways as a proof of the negation of one of its premises from the other n premises. For example, a resolution proof with premises A, B, C can be taken as (1) a proof of not-A from the premises B and C, (2) a proof of not-B from the premises A and C, and (3) a proof of not-C from the premises A and B.

P1-Resolution
A resolution one of whose two premises
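The cut and resolution patterns of Examples 1 to 3 can be tried out with a minimal sketch. It is not the graph-based "fast unification" algorithm discussed in the text but a plain recursive unifier with an occurs check, swapped in for brevity. The encoding is an assumption: a variable is a Python string, an expression K(E1 ... En) is the tuple `(K, E1, ..., En)`, and a clause is a pair of lists of atoms. The function names and the tie-break used when two variables meet (the alphabetically later one is bound to the earlier one) are likewise arbitrary choices, made so that the results read like the article's examples.

```python
def walk(t, s):
    """Follow variable bindings in substitution s as far as they go."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    """True if variable v occurs in expression t under substitution s."""
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s):
    """Return an extension of s that unifies a and b, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str) and isinstance(b, str):
        a, b = max(a, b), min(a, b)       # arbitrary but deterministic tie-break
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def subst(t, s):
    """Apply substitution s throughout expression t."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(subst(a, s) for a in t[1:])


def resolve(clause1, m, clause2, n):
    """From A --> (B + M) and (N + C) --> D infer (A u C)s --> (B u D)s.

    Returns None when the one-part partition {M u N} is not unifiable.
    Assumes the clauses are already standardized apart.
    """
    (a, bm), (nc, d) = clause1, clause2
    exprs = list(m) + list(n)
    s = {}
    for e in exprs[1:]:
        s = unify(exprs[0], e, s)
        if s is None:
            return None
    b = [x for x in bm if x not in m]
    c = [x for x in nc if x not in n]
    antecedent = list(dict.fromkeys(subst(x, s) for x in list(a) + c))
    consequent = list(dict.fromkeys(subst(x, s) for x in b + list(d)))
    return antecedent, consequent


def P(*args):
    return ("P",) + args


# Example 1: a cut is a resolution with M = N = {D}.
A, B, C, D, E, F, G = (("A",), ("B",), ("C",), ("D",), ("E",), ("F",), ("G",))
ex1 = resolve(([A, B], [C, D]), [D], ([D, E], [F, G]), [D])

# Example 2.
unit = ([], [P(("G", "r", "s"), "r", "s")])
assoc = ([P("x", "y", "u"), P("y", "z", "v"), P("x", "v", "w")], [P("u", "z", "w")])
ex2 = resolve(unit, unit[1], assoc, [P("x", "y", "u"), P("x", "v", "w")])

# Example 3.
other = ([P("a", "b", "c"), P("b", "d", "e"), P("c", "d", "f")], [P("a", "e", "f")])
ex3 = resolve(assoc, [P("u", "z", "w")], other, [P("a", "b", "c")])
```

Under these assumptions `ex1` is the clause A B E --> C F G, `ex2` is P(r z r) --> P(s z s), and `ex3` is the five-atom clause of Example 3. A different tie-break would give variants of the same clauses.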
as though it were a special inference rule, 'the {B1, ..., Bp} --> D inference rule', stated as:

This is, however, just a pragmatic device to sharpen our understanding of the very special role that conditional Horn clauses play in logic programming.

Ultraresolutions: Horn Clause Hyperresolution Deductions as Single Inferences
We again apply the idea of making a single inference out of an entire deduction. In the case of hyperresolution, instead of thinking of the conclusion of an entire deduction (namely a deduction built from P1-resolution steps and having an unconditional conclusion) as being arrived at stepwise by the performance of each of its inferences separately, we think of the whole construction as one inference step involving a higher- and larger-scale inference pattern. We will now treat Horn clause hyperresolution deductions in a similar way, and thereby arrive at a higher- and larger-scale inference pattern which we call ultraresolution.

There is really no need, pragmatically, to know the conclusion of every individual inference in a hyperresolution deduction, if all that we are after is the eventual conclusion of the whole deduction. We can instead characterize that eventual conclusion more directly, by a relationship based only on the structure of the premises of the deduction. By omitting in this way all of the interior stepwise conclusions we turn the entire hyperresolution deduction into a single inference, which immediately yields its conclusion from the premises in one integrated step.

has the same premise and the same conclusion as D, and conversely. We define ultraresolution inferences directly, however, without reference to their corresponding hyperresolution deductions.

The ultraresolution rule is (where A --> B is a Horn-clause and C is a set of Horn-clauses):

    [boxed rule]

The clause A --> B is the main premise and the clauses in C are the covering premises.

Covers and Their Kernels
A cover of a clause A --> B by a set C of clauses is a certain kind of finite tree with nodes labeled by clauses. The root of the tree is labeled by A --> B, while the other nodes are labeled by variants of clauses in C. The extra condition that makes the tree a cover is that for each node N in the tree, every atom in the antecedent of the clause labeling N is assigned to a distinct immediate successor of N. The kernel of the cover is the partition:

    {{X, Y} | Y is the conclusion of the clause labelling the node to which X is assigned}.

Example 4. To illustrate the notions of a cover and its kernel, consider the clause:

    A(x0) B(x0) --> C(x0)

and the set C of clauses

    {E(x1) D(x1 y1) --> A(F(x1 y1)),
     H(G(x2)) --> B(x2),
     --> H(G(x3)), --> D(M N), --> E(M)}.

The labeled tree given by the table:

    [table]

is a cover of A(x0) B(x0) --> C(x0) by C, in view of the assignments given by the table:

    atom        assigned to node
    A(x0)       2
    B(x0)       3
    H(G(x2))    4
    D(x1 y1)    5
    E(x1)       6

and has the following partition as its kernel:

    {{A(x0), A(F(x1 y1))}, {B(x0), B(x2)},
     {E(x1), E(M)}, {D(x1 y1), D(M N)},
     {H(G(x2)), H(G(x3))}}.

Since this kernel is unifiable, with mgu

    σ = {x0 = x2 = x3 = F(M N),
         x1 = M, y1 = N},

we can infer the clause

    --> C(x0)σ = --> C(F(M N))

by an ultraresolution which has A(x0) B(x0) --> C(x0) as its main premise and C as its set of covering premises.

The intuition behind the notion of a cover of a clause A --> B is that it depicts exactly the pattern of organization of the given clauses. If the kernel of the cover is unifiable with mgu σ, it guarantees that we can easily relabel the tree so it turns into a hyperresolution deduction, from these clauses as premises, of the same unconditional clause --> Bσ that the ultraresolution inference obtains directly from them in one step. In this relabeling, the new label on each leaf node of the tree is the same as the old label. The old
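Example 4 can be checked mechanically with a short sketch: build the kernel from the assignment table, unify each of its blocks under one common substitution, and apply the result to the conclusion C(x0) of the main premise. The encoding (variables as strings, K(E1 ... En) as a tuple) and all function names are assumptions. The conclusions attached to nodes 2 to 6 are read off from the kernel given in the text, since the table of node labels itself is not available here. The unifier is a plain recursive one, not the article's graph algorithm.

```python
def walk(t, s):
    """Follow variable bindings in substitution s as far as they go."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s):
    """Return an extension of s that unifies a and b, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def subst(t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(subst(a, s) for a in t[1:])


M, N = ("M",), ("N",)                     # 0-ary constants

# Conclusion of the clause labelling each non-root node (inferred from the kernel).
conclusion = {
    2: ("A", ("F", "x1", "y1")),
    3: ("B", "x2"),
    4: ("H", ("G", "x3")),
    5: ("D", M, N),
    6: ("E", M),
}

# The assignment table: antecedent atom -> node it is assigned to.
assignment = [
    (("A", "x0"), 2),
    (("B", "x0"), 3),
    (("H", ("G", "x2")), 4),
    (("D", "x1", "y1"), 5),
    (("E", "x1"), 6),
]

# Kernel: one block {X, Y} per assigned atom X.
kernel = [(atom, conclusion[node]) for atom, node in assignment]

sigma = {}
for x, y in kernel:
    sigma = unify(x, y, sigma)
    if sigma is None:
        break                             # the kernel is not unifiable

inferred = subst(("C", "x0"), sigma) if sigma is not None else None
mgu = {v: subst(v, sigma) for v in sigma} if sigma is not None else None
```

Under these assumptions `inferred` is the atom C(F(M N)), and `mgu` binds x0, x2 and x3 to F(M N), x1 to M and y1 to N, in agreement with the σ stated in the example.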
FIGURE 9.
setting it up to be the clause
    Q0 = ANSWER(x1 ... xm) <-- G1 ... Gn
Programming and Prolog by Ulf Nilsson and Jan Maluszynski (Wiley, 1990) provide rigorous but readable accounts not only of much of the material covered in the present article but also of many noteworthy later developments. Among these are:

• the addition of imperative control features such as the cut;
• the elegant negation as failure technique by which all modern Prolog systems permit negative conditions in both positive and negative conditional clauses;
• the inclusion of arithmetical, list-processing, metalinguistic and other applied predicates and operators among the atoms and terms;
• alternative logic programming paradigms, such as concurrent logic programming, constraint logic programming, and higher-order logic programming.

For the reader who wishes to learn more about applications and methodology of logic programming, about Prolog, and about exploiting the potential parallelism in logic, I also recommend the following recent books:

• The Art of Prolog by L. Sterling and E. Shapiro (MIT Press, 1986);
• Prolog Programming for Artificial Intelligence by I. Bratko (second edition, Addison-Wesley, 1990);
• The Craft of Prolog by R. O'Keefe (MIT Press, 1990);
• Essentials of Logic Programming by C.J. Hogger (Oxford: Clarendon Press, 1990);
• Parallelism in Logic: its potential for performance and program development by Franz Kurfess (Braunschweig, Vieweg, 1991);
• Parallel Logic Programming by Evan Tick (MIT Press, 1991).

About the Author:
J.A. ROBINSON teaches philosophy and computer science at the University of Syracuse, where he is now University Professor. His research interests include computational logic and automated deduction. He is currently working on a massively parallel logical computation system combining the lambda calculus (for functional programming) with the predicate calculus (for logic programming) at the University of Tokyo, where he is on a year's leave. Author's Present Address: Office of the University Professor, Syracuse University, Syracuse, NY 13244-2010

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© ACM 0002-0782/92/0300-040 $1.50