
Full Prolog Syntax

The fundamental syntactic construct in Prolog is the term. At the topmost level, terms are
interpreted as sentences in the manner described below. There is a limited ability to define or redefine the
syntax of terms; the same facilities are therefore also available for defining the syntax of sentences.
Each term comprises a sequence of tokens, and each token is treated as a single symbol that is
spelled out in one or more characters. The tokens include those for variables, constants and functors, as well
as brackets and punctuation. Both the token composition of a term and the spellings of the tokens are
described in further detail below.
When a sequence of tokens is read in, it must be terminated by the full-stop token. Wherever the
concatenation of two or more tokens would produce the spelling of another token, the tokens need to be
separated by a layout-text token in order to prevent the misreading. The layout-text tokens may be
arbitrarily interspersed in a program and serve no other function than to separate other tokens.

Sentences

S → a:S            Sentence labeled by module name
S → L              A sentence in list form
S → H [:- B]       Clause (if :- B is absent, H must not be otherwise interpretable as a sentence)
S → :- B           Command
S → ?- B           Query
S → H --> B        Grammar rule
H → a:H            Module label
H → T              Goal; T may not be an X or have any of , ; : \+ -> as its main operator.
B → a:B            Module label
B → B -> B ; B     Conditional
B → B -> B         Elseless conditional
B → \+ B
B → B ; B          Disjunction
B → B , B          Sequencing
B → T              Goal; T may not be an X or have any of , ; : \+ -> as its main operator.

In a grammar rule, instances of H or B of the forms L or S are interpreted respectively as a list of terminals
or as a literal terminal; instances of B of the forms { B } or ! are interpreted as conditions; instances of H or
B of the forms T are interpreted as non-terminals. In H or B, instances of T may not have the form of a
variable X, or contain any of the operators , ; \+ -> as their main operator.

Terms
As mentioned previously, when a sequence of tokens comprising a term is read in, it must end in a full-stop. The syntax is: T1200 full-stop. Terms have a set of precedence levels, ranging from 0 to 1200.
These levels allow certain operators and punctuation to bind more tightly than others, and they
make it possible to partially define or redefine the syntax of terms.

T(n+1) → T(n)                      0 ≤ n < 1200
T(n)   → fxn T(n-1)                except for number.
T(n)   → fyn T(n)
T(n)   → T(n-1) xfxn T(n-1)
T(n)   → T(n-1) xfyn T(n)
T(n)   → T(n) yfxn T(n-1)
T(n)   → T(n-1) xfn
T(n)   → T(n) yfn
T1000  → T999 , T1000              The comma operator is xfy1000.

T0 → a [( T999 (, T999)* )]          Function or constant.
T0 → ( T1200 )
T0 → { T1200 }
T0 → L                               List
T0 → S                               String
T0 → N                               Numeric constants
T0 → X                               Variable
L  → [ [T999 (, T999)* [| T999]] ]

(The Prolog syntax apparently doesn't allow two or more T999 to follow the separator |.)
N → [+|-] Numeral
N → (+|-) (inf | nan)

All the tokens fxn, fyn, xfxn, xfyn, yfxn, xfn, yfn are instances of a which have been specially declared as
operators of the respective precedence and associativity. The operators of the forms f*, *f* and *f are,
respectively, prefix, infix and postfix. Those of the form yf? are left-associative and those of the form ?fy are right-associative.
The comma operator , is fixed in the syntax above as of type xfy1000. A term T1000 of the form A,B is
interpreted in the standard syntax as ,(A,B); while the terms T0 of the form {A} and (A) are interpreted,
respectively, as {}(A) and A.
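
The way an operator type fixes the precedence of its arguments can be sketched in a few lines. The sketch below is in Python, used only as a neutral notation; the names argument_limits and position are hypothetical, and restricting operator precedences to 1..1200 is an assumption made here for illustration. In a type string, x stands for an argument of precedence at most n-1 and y for an argument of precedence at most n.

```python
OPERATOR_TYPES = {"fx", "fy", "xfx", "xfy", "yfx", "xf", "yf"}


def argument_limits(op_type, n):
    """Return the maximum precedence allowed for each argument, left to right.

    'x' marks an argument of precedence at most n-1, 'y' one of precedence
    at most n, and 'f' marks the position of the operator itself.
    """
    if op_type not in OPERATOR_TYPES:
        raise ValueError(f"unknown operator type: {op_type!r}")
    if not 1 <= n <= 1200:  # assumption: operator precedences lie in 1..1200
        raise ValueError("precedence must lie in 1..1200")
    return tuple(n - 1 if ch == "x" else n for ch in op_type if ch != "f")


def position(op_type):
    """Classify an operator type as prefix, infix or postfix."""
    if op_type not in OPERATOR_TYPES:
        raise ValueError(f"unknown operator type: {op_type!r}")
    if op_type.startswith("f"):
        return "prefix"
    if op_type.endswith("f"):
        return "postfix"
    return "infix"
```

For the comma operator, argument_limits("xfy", 1000) gives (999, 1000): the left argument is a T999 and the right argument a T1000, which is why A,B,C groups as ,(A,,(B,C)).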
In order to disambiguate between an operator followed by a parenthesis versus a function-call, a
requirement is imposed that ( must immediately follow the a of the function in a function call, and ( must
be separated from fxn or fyn in an operator application.
Also, in case of ambiguity, the interpretation of an instance of a as a prefix operator, fxn or fyn, wherever
possible, wins out; and the interpretation of an instance of xfn or yfn as xfxn, xfyn or yfxn, wherever
possible, wins out.

Tokens
The standard character set for SICStus Prolog is ISO 8859/1, but EUC (Extended UNIX Code) is supported
as an alternative. The environment variable SP_CTYPE determines which applies.
The character categories used below are defined as follows in the two standards:

Class        ISO 8859/1                                                   Minimal Set
Layout       0-32,127,159                                                 TAB, LFD, SPC
Small        97-122,223-246,248-255                                       a-z
Capital      65-90,192-214,216-222                                        A-Z
Digit        48-57                                                        0-9
Symbol       35,36,38,42,43,45-47,58,60-64,92,94,96,126,160-191,215,247   +-*/\^<>=`~:.?@#$&
Solo         33,59                                                        !;
Punctuation  37,40,41,44,91,93,123-125                                    %(),[]{|}
Quote        34,39                                                        "'
Underline    95                                                           _

EUC differs from ISO 8859/1 in that all the characters 128-255 are counted as Small. This
changes the class of the codes above 127 that are listed above under another class (159, 160-191,
192-214, 215, 216-222 and 247), and it assigns a class to the characters 128-158, which are not listed above.
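
The classification above amounts to a table lookup. The following Python sketch (Python is used only as a neutral notation; classify and the euc flag are hypothetical names, not part of any Prolog system) encodes the ISO 8859/1 ranges exactly as listed and applies the EUC rule that every code in 128-255 is Small.

```python
ISO_8859_1_RANGES = {
    "Layout": "0-32,127,159",
    "Small": "97-122,223-246,248-255",
    "Capital": "65-90,192-214,216-222",
    "Digit": "48-57",
    "Symbol": "35,36,38,42,43,45-47,58,60-64,92,94,96,126,160-191,215,247",
    "Solo": "33,59",
    "Punctuation": "37,40,41,44,91,93,123-125",
    "Quote": "34,39",
    "Underline": "95",
}


def _expand(spec):
    """Turn a range list such as '0-32,127' into a set of integer codes."""
    codes = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        codes.update(range(int(low), int(high or low) + 1))
    return codes


CLASS_CODES = {name: _expand(spec) for name, spec in ISO_8859_1_RANGES.items()}


def classify(code, euc=False):
    """Return the class name of a character code, or None if it is unlisted."""
    if euc and 128 <= code <= 255:
        return "Small"  # under EUC every code in 128-255 counts as Small
    for name, codes in CLASS_CODES.items():
        if code in codes:
            return name
    return None
```

For example, classify(215) yields "Symbol" under ISO 8859/1 and "Small" with euc=True, while classify(128) yields None because the ISO 8859/1 column does not list codes 128-158.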
Token → a
Token → Numeral
Token → X
Token → S
Token → Punctuation
Token → LayoutItem
Token → full-stop

a → ' C* '
a → Small (Capital | Small | Digit | Underline)*
a → Symbol+
a → Solo
a → [ LayoutItem* ]
a → { LayoutItem* }

Neither the full-stop nor any symbol sequence starting in /* counts as an a.


Numeral → Digit+
Numeral → Digit+ ' (Capital | Small | Digit)*
Numeral → 0'C
Numeral → Digit+ . Digit+ [[e|E] [+|-] Digit+]

In the base-radix notation R'N, the digits comprising N must each be of value smaller than R, with a, b, ... and
A, B, ... both being treated as having the values 10, 11, .... The base R must be in the range [2,36]. Note that a signed numeral such as -3 is
identified as a numeral, whereas -(3) is an application of the operator - to the numeral 3.
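
As an illustration of the base-radix rule, here is a minimal Python sketch (Python is used only as a neutral notation; parse_radix is a hypothetical helper). It covers only the R'N form; the character-code form 0'C is a separate rule and is rejected here because base 0 lies outside [2,36].

```python
DIGIT_VALUES = "0123456789abcdefghijklmnopqrstuvwxyz"


def parse_radix(text):
    """Return the integer value of a numeral written as R'N.

    Letters count as 10, 11, ... regardless of case, and every digit of N
    must have a value smaller than the base R.
    """
    base_text, quote, digits = text.partition("'")
    if not quote or not digits:
        raise ValueError(f"not in R'N form: {text!r}")
    if not (base_text.isascii() and base_text.isdigit()):
        raise ValueError(f"base is not a decimal number: {base_text!r}")
    base = int(base_text)
    if not 2 <= base <= 36:
        raise ValueError(f"base out of range [2,36]: {base}")
    value = 0
    for ch in digits:
        digit = DIGIT_VALUES.find(ch.lower())
        if digit < 0 or digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return value
```

With this sketch, parse_radix("16'ff") and parse_radix("16'FF") both give 255, and parse_radix("2'2") raises ValueError because the digit 2 is not smaller than the base.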

X → (_ | Capital) (Capital | Small | Digit | Underline)*

S → " C* "
C → Char, other than \
C → \ Esc

In a, a ' must be duplicated. In S, " must be duplicated.


LayoutItem → Layout
LayoutItem → /* Char* */
LayoutItem → % Char* LFD

The character sequences in the /* and % comments may not contain the respective ending sequences */ or
LFD.

full-stop → .

A full stop must be separated from subsequent tokens by LayoutItems.

Char → Any character, including Layout, Small, Capital, Digit, Underline, Symbol, Solo, Punctuation and
Quote.

Esc → a|b|t|n|v|f|r|e|d
Esc → x (Capital | Small | Digit)
Esc → Digit [Digit] [Digit]
Esc → ^ (? | Capital | Small)
Esc → c Layout
Esc → Layout
Esc → Char

The single-character escapes are, respectively, the alarm, backspace, horizontal tab, newline, vertical tab,
form feed, carriage return, escape and delete (codes 7-13, 27 and 127). The escapes starting in x are
hexadecimal escapes, with the following characters ranging from 0-9, A-F and a-f. The escapes starting in a
digit are octal escapes, and the digits must all be in the range 0-7. The escapes starting in ^ are control
sequences, with ^? standing for delete (127) and the corresponding codes for the other controls being taken
as the code of the character following the ^, modulo 32. Escape sequences involving layout are ignored
(and that starting in c is taken with the longest layout sequence that occurs following it), and the escape
sequence of any other character is taken to stand for the character itself.
The escape sequence feature can be turned off to assure compatibility with older Prolog programs.
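
The escape rules described above can be sketched as a small decoding function. This is a minimal illustration in Python (used only as a neutral notation); escape_code is a hypothetical helper that takes the text following the backslash. Two simplifications are assumptions of the sketch: a hexadecimal escape may carry any number of hex digits, and an escape beginning with c or with a layout character simply returns None to signal that it denotes no character.

```python
import string

# Codes of the single-character escapes a, b, t, n, v, f, r, e, d, in order.
SINGLE_CHAR_ESCAPES = dict(zip("abtnvfred", [7, 8, 9, 10, 11, 12, 13, 27, 127]))
LAYOUT_CODES = set(range(0, 33)) | {127, 159}


def escape_code(body):
    """Return the character code denoted by the text after the backslash.

    Returns None for escapes involving layout, which denote no character.
    """
    if not body:
        raise ValueError("empty escape sequence")
    first, rest = body[0], body[1:]
    if ord(first) in LAYOUT_CODES or first == "c":
        return None  # layout escapes are ignored
    if first == "^":  # control sequence: ^? is delete, otherwise code modulo 32
        if rest != "?" and not (len(rest) == 1 and rest in string.ascii_letters):
            raise ValueError(f"bad control escape: {body!r}")
        return 127 if rest == "?" else ord(rest) % 32
    if first == "x":  # hexadecimal escape
        if not rest or any(ch not in string.hexdigits for ch in rest):
            raise ValueError(f"bad hexadecimal escape: {body!r}")
        return int(rest, 16)
    if first in string.digits:  # octal escape of one to three digits
        if len(body) > 3 or any(ch not in "01234567" for ch in body):
            raise ValueError(f"bad octal escape: {body!r}")
        return int(body, 8)
    if rest:
        raise ValueError(f"unexpected characters in escape: {body!r}")
    # Named single-character escape, or any other character standing for itself.
    return SINGLE_CHAR_ESCAPES.get(first, ord(first))
```

For instance, escape_code("^A") and escape_code("^a") both give 1, since 65 and 97 are each congruent to 1 modulo 32, and escape_code("101") gives 65 as an octal escape.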
