Language Theory
by
Mircea Andrasiu, Adrian Atanasiu, Gheorghe Paun, Arto Salomaa
1 Introduction
The ciphering and deciphering operations are, generally speaking, operations on
strings over given alphabets, hence they can be considered formal language
theory operations. Moreover, certain concrete steps and questions appearing
in this framework have precise formal language theory counterparts
(letter substitutions in Caesar systems are morphisms, even codings; the Richelieu
systems are based on shuffle operations [2]; the Cardano systems with an
auto-modified key can be simulated by a gsm [1]; a system can be context-free
or context-sensitive; the deciphering can be ambiguous or not; the complexity
of cryptanalysis is often related to the membership complexity for given
classes of languages; and so on). In this context, it is somewhat surprising that
formal language theory is not more involved in this field. Of course, there
are probably about a dozen papers dealing with formal language theory and
cryptography, but compared to the huge bibliography of number-theoretic
cryptography, for instance, this seems unjustifiably small.
In short, we believe that formal language theory has rich unexplored resources
which could be used in cryptography (and conversely, such applications
can raise new problems, offering further developments of the theory).
The aim of the present paper is to contribute to the strengthening of the
bridge between formal language theory and cryptography, by proposing both
a classical system (with secret enciphering-deciphering key and algorithms)
and a public-key system. The starting point is the following two remarks:
1. Given a grammar $G$, with the rules labelled by symbols in a set $Lab$,
and a given string $x \in Lab^*$, a derivation in $G$ with the control word $x$
is easy to construct (if it exists); a string $y$ in $L(G)$, related to $x$, is
found in this way. Conversely, given a string $y \in L(G)$, finding a control word
$x$ associated to a derivation of $y$ in $G$ is a "hard" operation.
2 Notations
Before "implementing" the previous remarks in a cryptosystem, we specify
some formal language theory notations.
$V^*$ – the free monoid generated by the alphabet $V$;
$\lambda$ – the null element of $V^*$ ($V^+ = V^* \setminus \{\lambda\}$);
$|x|$ – the length of $x \in V^*$ ($|\lambda| = 0$);
$alph(x)$ – the set of symbols appearing in $x$;
$G = (V_N, V_T, S, P)$ – a Chomsky grammar ($V_G = V_N \cup V_T$);
$Sz(u \Rightarrow^* v)$ – the control word of the leftmost derivation $u \Rightarrow^* v$;
$Sz_{left}(G)$ – the set of control words associated to all leftmost derivations in $G$;
$G_{left}(x)$ – the string generated by a leftmost derivation in $G$ with control
word $x$, if such a derivation exists; undefined otherwise;
$Shuf(x, y) = \{x_1 y_1 x_2 y_2 \ldots x_n y_n \mid n \ge 1,\ x_i, y_i \in V^*,\ x = x_1 x_2 \ldots x_n,\ y = y_1 y_2 \ldots y_n\}$.
A context-free grammar $G = (V_N, V_T, S, P)$ is reduced if for all $A \in V_N$
there is a derivation $S \Rightarrow^* uAv \Rightarrow^* uwv$, with $u, v, w \in V_T^*$. All grammars used
in this paper are assumed to be reduced.
Other notions and notations in formal language theory we shall use are
supposed to be known, for instance, from [7].
3 A classical cryptosystem
Assume we have a context-free grammar $G = (V_N, V_T, S, P)$, known to the
users of a cryptosystem, but kept secret from the illegal users. Let
$Lab = \{r_1, \ldots, r_n\}$ be the set of labels of the rules in $P$. The labelling mapping,
$\varphi : P \to Lab$, is not necessarily one-to-one.
Assume a string $x$ (a phrase in English) is to be encrypted. By a given
one-to-one coding (also known to the users of the system), $h : V_E \to Lab$, with $V_E$
the English alphabet (possibly including other symbols: punctuation marks,
space, digits), we pass from $x$ to $x' = h(x) \in Lab^*$. (Clearly, we must have
$card(V_E) \le card(Lab)$.) Construct the string $y = G_{left}(x')$, if it exists. This is
the encryption of $x$; it is transmitted to the receiver.
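The encryption step can be sketched as follows. This is only a minimal illustration: the toy grammar, its labels `p`, `q`, `t` and the coding `H` are hypothetical and not taken from the paper; nonterminals are assumed to be upper-case letters and terminals lower-case ones. The function `g_left` returns `None` when the control word does not produce a terminal string, which corresponds to the "if it exists" clause above.

```python
from typing import Dict, List, Optional, Tuple

# label -> (left-hand nonterminal, right-hand side); toy convention:
# upper-case letters are nonterminals, lower-case letters are terminals.
Rules = Dict[str, Tuple[str, str]]

def g_left(rules: Rules, start: str, control: List[str]) -> Optional[str]:
    """Apply the labelled rules in order, always rewriting the leftmost
    nonterminal; return the terminal string, or None if undefined."""
    form = start
    for label in control:
        pos = next((i for i, s in enumerate(form) if s.isupper()), None)
        if pos is None:
            return None  # labels remain but no nonterminal is left
        lhs, rhs = rules[label]
        if form[pos] != lhs:
            return None  # the rule does not fit the leftmost nonterminal
        form = form[:pos] + rhs + form[pos + 1:]
    # the result must be a terminal string
    return None if any(s.isupper() for s in form) else form

# Hypothetical secret key: a grammar and a one-to-one coding h (assumptions).
TOY_RULES: Rules = {"p": ("S", "aSb"), "q": ("S", "cS"), "t": ("S", "d")}
H = {"x": "p", "y": "q", "z": "t"}

def encrypt(message: str) -> Optional[str]:
    """y = G_left(h(x)), or None when no such derivation exists."""
    return g_left(TOY_RULES, "S", [H[ch] for ch in message])
```

With this toy key the message `xyz` is encrypted, while `xy` has no encryption because its control word leaves a nonterminal unrewritten; this is exactly the totality problem discussed in the text.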
The receiver parses $y$ with respect to $G$, thus finding a control word $z$
associated to a leftmost derivation in $G$, then applies $h^{-1}$ to $z$ and finds
$z' = h^{-1}(z)$. If the grammar $G$ is unambiguous, then $z = x'$, hence $z' = x$ and the
deciphering is uniquely completed. In fact, for $x$ an English phrase, even when $G$
is ambiguous, it is highly probable that only one of the control words obtained by
parsing $y$ in $G$ leads by $h^{-1}$ to a correct English phrase, and this must be the
plaintext message $x$.
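The receiver's side can be sketched in the same spirit. The code below replaces a real parser by an exhaustive search over leftmost derivations, which is only an illustration: the toy grammar and the inverse coding `H_INVERSE` are hypothetical, and the pruning assumes a grammar without λ-rules and without unit rules, so that sentential forms never shrink.

```python
from typing import Dict, List, Tuple

Rules = Dict[str, Tuple[str, str]]  # label -> (nonterminal, right-hand side)

# Hypothetical secret key (not from the paper).
TOY_RULES: Rules = {"p": ("S", "aSb"), "q": ("S", "cS"), "t": ("S", "d")}
H_INVERSE = {"p": "x", "q": "y", "t": "z"}

def control_words(rules: Rules, start: str, target: str) -> List[List[str]]:
    """Return the control words of all leftmost derivations start =>* target.
    Assumes no lambda-rules and no unit rules, so the search is finite."""
    found: List[List[str]] = []

    def search(form: str, used: List[str]) -> None:
        pos = next((i for i, s in enumerate(form) if s.isupper()), None)
        if pos is None:
            if form == target:
                found.append(list(used))
            return
        # prune forms that are too long or whose terminal prefix mismatches
        if len(form) > len(target) or not target.startswith(form[:pos]):
            return
        for label, (lhs, rhs) in rules.items():
            if lhs == form[pos]:
                used.append(label)
                search(form[:pos] + rhs + form[pos + 1:], used)
                used.pop()

    search(start, [])
    return found

def decrypt(ciphertext: str) -> List[str]:
    """Every candidate plaintext; exactly one if the grammar is unambiguous."""
    words = control_words(TOY_RULES, "S", ciphertext)
    return ["".join(H_INVERSE[r] for r in w) for w in words]
```

For an ambiguous grammar `decrypt` returns several candidates, and the receiver keeps the one that reads as a correct phrase.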
The main problem in constructing such a system is to have a grammar $G$
such that either for all $x \in Lab^*$ there is a $y \in L(G)$ with $y = G_{left}(x)$, or at
least this property holds for all $x$ in a given large enough subset $T$ of $Lab^*$ (all,
or at least a significantly large set of, messages can be encrypted using a given
grammar $G$). We say that such a grammar $G$ is total ($T$-total, respectively).
With such a grammar, the encryption is a very simple operation; it
can be done in linear time.
The deciphering is also easy, as it can be done in at most cubic time (we
can even take $G$ to be LL, LR or of another restricted type, in order to make the work
of the receiver easier).
What about the security of the system? The key consists of the grammar
$G$, its labelling $\varphi$ and the coding $h$. The coding simply renames symbols, hence
it is vulnerable to frequency-based attacks; also $\varphi$, depending on the shape of
the rules, can be viewed as a somewhat simple substitution; if this is not the case,
we can think that we have $(card(Lab))^{card(P)}$ labelling possibilities. The main
part of the key is the grammar $G$. Knowing the (type of) ciphering algorithm
and a (long enough) ciphertext $y$, the cryptanalyst's intuitive strategy is to
build grammars systematically until reaching the right one, $G$, thus finding
the encrypted message. But what does "systematically" mean? For instance,
one can generate all grammars in increasing order of the cardinality of $V_N$, starting from
$V_N = \{S\}$, with $V_T = alph(y)$, and in increasing order of the total
number of symbols used when writing the set of productions (the parameter
$Symb$ in [3]). However, before reaching $G$, one has to check all grammars
$G'$ with $Symb(G') < Symb(G)$, hence at least $(card(V_G))^{Symb(G)-1}$ grammars.
The problem is (intuitively) intractable (in this way).
Returning to the question of finding total grammars, as formulated before,
the definition seems to be too restrictive (if the labelling is one-to-one, no
grammar is total: $bb$, for $b$ the label of a terminal rule, cannot be in $Sz_{left}(G)$),
but $T$-total grammars are easy to find, for large sets $T$. For example, take a
context-free grammar $G = (V_N, V_T, S, P)$ with $V_N = \{S\}$ and infinite $L(G)$.
Each string in $Lab_1^*$, where $Lab_1$ is the set of labels of nonterminal rules in $P$,
can be the control word of a (partial) derivation in $G$; each such derivation
can then be finished by using terminal rules in $P$.
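This observation can be checked on a small case. The sketch below assumes a hypothetical grammar with $V_N = \{S\}$, two nonterminal rules labelled `n1`, `n2` and a single terminal rule $S \to c$ (none of these come from the paper): every word over `{n1, n2}` drives a partial leftmost derivation, which the terminal rule then completes.

```python
from itertools import product
from typing import List

# Hypothetical grammar with a single nonterminal S (an assumption).
NONTERMINAL_RULES = {"n1": "aSS", "n2": "Sb"}  # S -> aSS, S -> Sb
TERMINAL_RHS = "c"                             # the terminal rule S -> c

def nt_derive(control: List[str]) -> str:
    """Leftmost derivation with the given nonterminal-rule labels,
    finished by applying the terminal rule to every remaining S."""
    form = "S"
    for label in control:
        # each nonterminal rule reintroduces S, so a leftmost S always exists
        pos = form.index("S")
        form = form[:pos] + NONTERMINAL_RULES[label] + form[pos + 1:]
    while "S" in form:
        form = form.replace("S", TERMINAL_RHS, 1)  # rewrite the leftmost S
    return form

# Every control word over {n1, n2} of length at most 4 yields a terminal string.
ALL_DEFINED = all(
    "S" not in nt_derive(list(word))
    for n in range(5)
    for word in product(NONTERMINAL_RULES, repeat=n)
)
```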
Such a separation of nonterminal and terminal rules in a leftmost derivation
(we call it an NT-derivation) is, however, a very strong restriction which,
intuitively, decreases the safety of the system. Indeed, given a context-free
grammar $G$, define
$$L_{NT}(G) = \{x \in V_T^* \mid S \Rightarrow^* w_1 \Rightarrow^* w_2 = x, \text{ where } S \Rightarrow^* w_1 \text{ is a leftmost derivation using only nonterminal rules, and } w_1 \Rightarrow^* w_2 \text{ uses only terminal rules}\}$$
We have
Proposition 1
Proof: Only (3) needs some arguments. Take $G = (V_N, V_T, S, P)$ and construct
$G' = (V_N, V_T, S, P')$, with
$$P' = \{A \to x \mid A \to x \text{ is a linear rule in } P\} \cup \{A \to x_1 A_1 x_2 y_2 x_3 \ldots x_n y_n x_{n+1} \mid A \to x_1 A_1 x_2 A_2 x_3 \ldots x_n A_n x_{n+1} \in P,\ n \ge 2,\ A_i \in V_N,\ x_i \in V_T^* \text{ for all } i, \text{ and } A_i \to y_i \text{ are terminal rules in } P,\ 2 \le i \le n\}.$$
We have $L_{NT}(G) = L(G')$; indeed, given a nonterminal rule
$$A \to x_1 A_1 x_2 A_2 x_3 \ldots x_n A_n x_{n+1},$$
using it in the leftmost manner and separated from terminal rules, only the first
nonterminal in its right-hand member, $A_1$, can be rewritten by a nonterminal
rule; the other symbols $A_2, \ldots, A_n$ will be replaced by terminal strings, using
terminal rules in $P$.
On the other hand, this (intuitive) loss of safety when considering only
NT-derivations is somewhat compensated by the fact that even for ambiguous
context-free grammars $G$, the strings in $L_{NT}(G)$ can be generated unambiguously (we
say that $G$ is NT-unambiguous). □
Proof: Consider, for example, the grammar $G = (\{S\}, \{a, b\}, S, P)$, with the
rules
$$S \to SaS, \quad S \to bSS, \quad S \to b.$$
It is ambiguous, as for the string $bbbab$ we have two leftmost derivations:
$$S \Rightarrow bSS \Rightarrow bbS \Rightarrow bbSaS \Rightarrow bbbaS \Rightarrow bbbab$$
$$S \Rightarrow SaS \Rightarrow bSSaS \Rightarrow bbSaS \Rightarrow bbbaS \Rightarrow bbbab$$
Only the second derivation is an NT-derivation. In general, for each string
in $L_{NT}(G)$ we have only one NT-derivation. Indeed, to each nonterminal
leftmost derivation in $G$ corresponds a nonterminal derivation in the associated
linear grammar with the rules
$$S \to Sab, \quad S \to bSb, \quad S \to b$$
(see the previous assertion (3)) and this linear grammar is unambiguous (each
nonterminal rule applied to some string $xSy$ leads to a different terminal
string). □
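The two derivations of $bbbab$ can be recovered mechanically. The sketch below enumerates all leftmost derivations of a target string in the grammar $S \to SaS$, $S \to bSS$, $S \to b$ and marks the NT-derivations; the labels `r1`, `r2`, `r3` are introduced here only for illustration, since the rules of this example are not labelled in the text.

```python
from typing import List

# Rules of the example grammar; the labels are hypothetical.
RULES = {"r1": "SaS", "r2": "bSS", "r3": "b"}
TERMINAL_LABELS = {"r3"}

def leftmost_control_words(target: str) -> List[List[str]]:
    """Control words of all leftmost derivations S =>* target."""
    found: List[List[str]] = []

    def search(form: str, used: List[str]) -> None:
        pos = form.find("S")
        if pos < 0:
            if form == target:
                found.append(list(used))
            return
        # no rule shortens a sentential form, so long forms can be pruned
        if len(form) > len(target) or not target.startswith(form[:pos]):
            return
        for label, rhs in RULES.items():
            used.append(label)
            search(form[:pos] + rhs + form[pos + 1:], used)
            used.pop()

    search("S", [])
    return found

def is_nt_derivation(word: List[str]) -> bool:
    """True if all nonterminal rules precede all terminal rules."""
    seen_terminal = False
    for label in word:
        if label in TERMINAL_LABELS:
            seen_terminal = True
        elif seen_terminal:
            return False
    return True

WORDS = leftmost_control_words("bbbab")
NT_WORDS = [w for w in WORDS if is_nt_derivation(w)]
```

Here `WORDS` contains two control words and `NT_WORDS` only one of them, in agreement with the argument of the proof.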
Remark 1 Observe that in each case the number of rules in the associated
linear grammar equals the number of rules in the original grammar; this is due
to the fact that there is only one terminal rule.
Sometimes the language $L_{NT}(G)$ is much simpler than $L(G)$. For instance,
the Dyck language over two letters, $D_{a,b}$, is generated by the grammar
$(\{S\}, \{a, b\}, S, P)$ with the rules
$$S \to aSb, \quad S \to SS, \quad S \to \lambda.$$
The associated linear grammar has the rules
$$S \to aSb, \quad S \to S, \quad S \to \lambda.$$
Hence, $L_{NT}(G) = \{a^n b^n \mid n \ge 0\}$. Both grammars are unambiguous if the
redundant rule $S \to S$ is removed.
In general, (2) and (3) in Proposition 1 show that the family of languages
of the form LN T (G) equals the family of linear languages. In fact, the following
stronger result can be obtained:
with
Clearly, $L(G')$ is not linear (we have $L(G') = L(G) D_{a,b}$) and $L_{NT}(G') = L(G)$
(we start by $S_1 \to S S_2$, but $S_2$ can only produce the string $\lambda$). □
A useful modification of the previous cryptosystem is the following one. Start
as above with the string $x \in V_E^*$, encode it by $h : V_E \to Lab$, but take $G$
and $Lab$ in such a way that $card(Lab) > card(V_E)$. For a string
$x' \in (Lab \setminus h(V_E))^*$, consider the set $Shuf(h(x), x')$, as well as the morphism
$h' : Lab^* \to Lab^*$, $h'(r) = r$ for $r \in h(V_E)$, $h'(r) = \lambda$ for $r \in Lab \setminus h(V_E)$. Clearly, it is
enough to find a string $x'' \in Shuf(h(x), x')$ for which $y = G_{left}(x'')$
exists. The receiver has to parse $y$, apply $h'$ in order to remove the dummy
symbols in $Lab \setminus h(V_E)$, and then $h^{-1}$ in order to find the message $x$.
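The two ingredients of this variant, shuffling in dummy labels and erasing them again, can be sketched as follows. The label names in the usage lines are hypothetical; `is_shuffle` is a standard dynamic-programming membership test for the set $Shuf(x, y)$ defined in Section 2, and `erase_dummies` plays the role of the erasing morphism.

```python
from typing import List, Sequence, Set

def is_shuffle(word: Sequence[str], x: Sequence[str], y: Sequence[str]) -> bool:
    """True if word is an interleaving of x and y, i.e. word is in Shuf(x, y)."""
    if len(word) != len(x) + len(y):
        return False
    # ok[i][j]: word[:i + j] is a shuffle of x[:i] and y[:j]
    ok = [[False] * (len(y) + 1) for _ in range(len(x) + 1)]
    ok[0][0] = True
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i and ok[i - 1][j] and x[i - 1] == word[i + j - 1]:
                ok[i][j] = True
            if j and ok[i][j - 1] and y[j - 1] == word[i + j - 1]:
                ok[i][j] = True
    return ok[len(x)][len(y)]

def erase_dummies(word: Sequence[str], real_labels: Set[str]) -> List[str]:
    """The erasing morphism: keep labels in h(V_E), map the others to lambda."""
    return [r for r in word if r in real_labels]

# Hypothetical labels: p, q encode message symbols, d1, d2 are dummies.
CODED = ["p", "q", "p"]
DUMMY = ["d1", "d2"]
MIXED = ["d1", "p", "q", "d2", "p"]
```

Since the dummy labels are disjoint from $h(V_E)$, erasing them from any element of the shuffle set gives back the coded message.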
Now we have more possibilities to find a derivation in $G$ associated to a
given string of labels and, moreover, we can choose the dummy string $x'$ in such
a way as to increase these possibilities.
Example 1 Consider the grammar $G = (\{S, A, B\}, \{a, b, c\}, S, P)$ with the
following rules (we also specify their labels):
$$r_1 : S \to aSb, \quad r_2 : S \to cAB,$$
$$r_3 : A \to aAb, \quad r_4 : A \to c,$$
$$r_5 : B \to AB, \quad r_6 : B \to c$$
and $g(T) = \{0, 1\}^*$; in fact, $Sz_{left}(G) = T$, as the reader can verify, and
Consider now a message to be encrypted, say $x = 10011011$. We choose a
string $x' \in T$ such that $g(x') = x$, for $g(r_1) = g(r_2) = g(r_4) = g(r_6) = \lambda$,
$g(r_3) = 0$, $g(r_5) = 1$, for example
$$x' = r_1^3 r_2 r_4 r_5 r_3 r_3 r_4 r_5 r_4 r_5 r_3 r_4 r_5 r_4 r_5 r_4 r_6$$
(we have $x' \in Shuf(r_5 r_3 r_3 r_5 r_5 r_3 r_5 r_5,\ r_1^3 r_2 r_4^6 r_6)$). According to $x'$, we have
the following derivation in $G$:
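$$
\begin{aligned}
S &\overset{r_1}{\Rightarrow} aSb \overset{r_1}{\Rightarrow} a^2Sb^2 \overset{r_1}{\Rightarrow} a^3Sb^3 \overset{r_2}{\Rightarrow} a^3cABb^3 \overset{r_4}{\Rightarrow} a^3ccBb^3 \overset{r_5}{\Rightarrow} a^3ccABb^3 \\
&\overset{r_3}{\Rightarrow} a^3ccaAbBb^3 \overset{r_3}{\Rightarrow} a^3cca^2Ab^2Bb^3 \overset{r_4}{\Rightarrow} a^3cca^2cb^2Bb^3 \overset{r_5}{\Rightarrow} a^3cca^2cb^2ABb^3 \\
&\overset{r_4}{\Rightarrow} a^3cca^2cb^2cBb^3 \overset{r_5}{\Rightarrow} a^3cca^2cb^2cABb^3 \overset{r_3}{\Rightarrow} a^3cca^2cb^2caAbBb^3 \overset{r_4}{\Rightarrow} a^3cca^2cb^2cacbBb^3 \\
&\overset{r_5}{\Rightarrow} a^3cca^2cb^2cacbABb^3 \overset{r_4}{\Rightarrow} a^3cca^2cb^2cacbcBb^3 \overset{r_5}{\Rightarrow} a^3cca^2cb^2cacbcABb^3 \\
&\overset{r_4}{\Rightarrow} a^3cca^2cb^2cacbccBb^3 \overset{r_6}{\Rightarrow} a^3cca^2cb^2cacbcccb^3 = a^3c^2a^2cb^2cacbc^3b^3 = y
\end{aligned}
$$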
This string $y$ is transmitted to the receiver. The receiver parses $y$ according
to $G$ and, as $G$ is unambiguous (easy to verify), recovers the string $x'$; applying
$g$ to $x'$, one finds $x$, the message.
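The derivation of Example 1 can be replayed with a few lines of code. The sketch below uses the rules and the morphism $g$ of the example; the control word is the sequence of labels read off the displayed derivation, and the helper `g_left` (an illustrative name) assumes that nonterminals are the upper-case letters.

```python
from typing import Dict, List, Optional, Tuple

# Rules of Example 1: label -> (left-hand side, right-hand side).
RULES: Dict[str, Tuple[str, str]] = {
    "r1": ("S", "aSb"), "r2": ("S", "cAB"),
    "r3": ("A", "aAb"), "r4": ("A", "c"),
    "r5": ("B", "AB"),  "r6": ("B", "c"),
}
# The morphism g: r3 -> 0, r5 -> 1, every other label -> lambda.
G_IMAGE = {"r3": "0", "r5": "1"}

def g_left(control: List[str]) -> Optional[str]:
    """Terminal string of the leftmost derivation with the given control
    word, or None if the control word does not define one."""
    form = "S"
    for label in control:
        pos = next((i for i, s in enumerate(form) if s.isupper()), None)
        lhs, rhs = RULES[label]
        if pos is None or form[pos] != lhs:
            return None
        form = form[:pos] + rhs + form[pos + 1:]
    return None if any(s.isupper() for s in form) else form

# Control word read off the derivation shown in the example.
CONTROL = ["r1"] * 3 + "r2 r4 r5 r3 r3 r4 r5 r4 r5 r3 r4 r5 r4 r5 r4 r6".split()

Y = g_left(CONTROL)                               # the ciphertext
X = "".join(G_IMAGE.get(r, "") for r in CONTROL)  # the message g(x')
```

Running it gives the ciphertext $a^3c^2a^2cb^2cacbc^3b^3$ for `Y` and the message `10011011` for `X`.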
• ($Q_3$): Is there an algorithm which, for a given $x \in \{0, 1\}^*$ and a
(0, 1)-total grammar $G$, produces a string $x' \in Sz_{left}(G)$ such that $x = g(x')$?
We consider now the following six properties (predicates):
$P_1$. $L(G') \cap V_T^* \{c_1\} V_T^* = \emptyset$
$P_2$. $L(G') \cap V_T^* \{c_2\} V_T^* = \emptyset$
$P_3$. $L(G'_{A_1}) \cap \{c_1\} V_T^* \{c_1\} V_T^* = \emptyset$
$P_4$. $L(G'_{A_1}) \cap \{c_1\} V_T^* \{c_2\} V_T^* = \emptyset$
$P_5$. $L(G'_{A_2}) \cap \{c_2\} V_T^* \{c_2\} V_T^* = \emptyset$
$P_6$. $L(G'_{A_2}) \cap \{c_2\} V_T^* \{c_1\} V_T^* = \emptyset$
All these properties are decidable (emptiness is decidable for context-free
languages).
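For reference, the emptiness test behind this decidability claim is the usual fixpoint computation of the nonterminals that generate some terminal string. The sketch below shows only that test, on hypothetical toy rules; checking $P_1 - P_6$ would in addition require a grammar for the intersection with the regular set, which is not constructed here.

```python
from typing import List, Tuple

def is_nonempty(rules: List[Tuple[str, str]], start: str) -> bool:
    """True if the context-free grammar generates at least one terminal string.
    Rules are (nonterminal, right-hand side) pairs; upper-case letters are
    nonterminals (toy convention), and "" stands for lambda."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in productive:
                continue
            # lhs is productive if every nonterminal of rhs already is
            if all(not s.isupper() or s in productive for s in rhs):
                productive.add(lhs)
                changed = True
    return start in productive

# Hypothetical grammars: in the first one B generates nothing, so neither does S.
EMPTY_GRAMMAR = [("S", "aSb"), ("S", "AB"), ("A", "a")]
NONEMPTY_GRAMMAR = EMPTY_GRAMMAR + [("B", "b")]
```

The loop stops after at most as many rounds as there are nonterminals, so the test runs in polynomial time.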
Take the morphism $g : Lab^* \to \{0, 1\}^*$ defined by $g(r_1) = 0$, $g(r_2) = 1$,
$g(r) = \lambda$ for $r \in Lab \setminus \{r_1, r_2\}$.
Assertion: $g(Sz_{left}(G)) \supseteq \{0, 1\}^+$ if and only if all predicates $P_1 - P_6$ are
false.
Indeed, if $P_1$ is true, then $0 \notin g(Sz_{left}(G))$ (there is no derivation using $r_1$
exactly once and never using $r_2$); similarly, 1, 00, 01, 11, 10 are not in $g(Sz_{left}(G))$
when $P_2$, $P_3$, $P_4$, $P_5$ and $P_6$ are true, respectively.
Conversely, assume all $P_1 - P_6$ are false (hence the corresponding
intersections are non-empty). Observe that $Sz_{left}(G) = Sz_{left}(G')$. We shall prove
that
$$g(Sz_{left}(G)) \supseteq \{0, 1\}^+.$$
The inclusion is obtained by induction.
From $P_1$, $P_2$ being false, it follows that $0, 1 \in g(Sz_{left}(G))$.
Assume all $z \in \{0, 1\}^+$ with $|z| \le k$, $k \ge 1$, are in $g(Sz_{left}(G))$ and take
$z \in \{0, 1\}^+$, $|z| = k + 1$. Assume $z = z'0$; the case $z = z'1$ is analogous. From
the induction hypothesis we have $z' \in g(Sz_{left}(G))$.
Case 1: $z' = z''0$. There is a leftmost derivation in $G'$ of the form
$$S \Rightarrow^* u_1 A_1 u_2 \Rightarrow u_1 c_1 x_1 u_2 \Rightarrow^* u_1 c_1 w u_2'$$
with $u_1, w, u_2' \in V_T^*$, $u_2 \in V_G^*$, $g(Sz(S \Rightarrow^* u_1 A_1 u_2)) = z''$, and
$g(Sz(u_1 c_1 x_1 u_2 \Rightarrow^* u_1 c_1 w u_2')) = \lambda$.
From $P_3$ being false, it follows that a derivation
$$A_1 \Rightarrow^* c_1 v_1 A_1 v_2 \Rightarrow^* c_1 v_1 c_1 v_3 v_2'$$
is possible in $G'$, with $v_1, v_3, v_2' \in V_T^*$, $v_2 \in V_G^*$, $A_1 \Rightarrow^* c_1 v_3$, $v_2 \Rightarrow^* v_2'$. Thus
we can construct the derivation
$$S \Rightarrow^* u_1 A_1 u_2 \Rightarrow^* u_1 c_1 v_1 A_1 v_2 u_2 \Rightarrow^* u_1 c_1 v_1 c_1 x_1 v_2 u_2 \Rightarrow^* u_1 c_1 v_1 c_1 w v_2' u_2'$$
Case 2: $z' = z''1$. There is a leftmost derivation in $G'$ of the form
$$S \Rightarrow^* u_1 A_2 u_2 \Rightarrow u_1 c_2 x_2 u_2 \Rightarrow^* u_1 c_2 w u_2'$$
with $u_1, w, u_2' \in V_T^*$, $u_2 \in V_G^*$, $g(Sz(S \Rightarrow^* u_1 A_2 u_2)) = z''$, and
$g(Sz(u_1 c_2 x_2 u_2 \Rightarrow^* u_1 c_2 w u_2')) = \lambda$.
From $P_6$ being false, there is a derivation
$$A_2 \Rightarrow^* c_2 v_1 A_1 v_2 \Rightarrow^* c_2 v_1 c_1 v_3 v_2'$$
with $v_1, v_3, v_2' \in V_T^*$, $v_2 \in V_G^*$, $A_1 \Rightarrow^* c_1 v_3$, $v_2 \Rightarrow^* v_2'$. Thus we can
construct the derivation
$$S \Rightarrow^* u_1 A_2 u_2 \Rightarrow^* u_1 c_2 v_1 A_1 v_2 u_2 \Rightarrow^* u_1 c_2 v_1 c_1 v_3 v_2' u_2'$$
Proof: The language $Sz_{left}(G)$ is context-free (and a grammar for it can be
effectively constructed starting from a context-free grammar $G$). Given a
context-free grammar $G$, there are finitely many morphisms $g : Lab^* \to \{0, 1\}^*$
with $g(r_1) = 0$, $g(r_2) = 1$, $g(r) = \lambda$ for all $r \in Lab \setminus \{r_1, r_2\}$ (we
have $card(Lab)(card(Lab) - 1)$ such morphisms). For each morphism $g$ of
this type, consider the language $g(Sz_{left}(G))$. It is context-free, hence it is
decidable whether a given $x \in \{0, 1\}^*$ is in $g(Sz_{left}(G))$ or not (and this can
be done in polynomial time). □
Proof: Let $Lab$ be a set of labels for the rules in $G = (V_N, V_T, S, P)$ and let
$g : Lab^* \to \{0, 1\}^*$ be a given morphism associating 0, 1 with the rules in $G$.
The language
$$L_1 = g^{-1}(\{x\})$$
is regular and a finite automaton $A_1$ for $L_1$ can be effectively constructed.
For the grammar $G$ we can construct $G' = (V_N, Lab, S, P')$, with
$$P' = \{B \to r\,h(x) \mid r : B \to x \in P,\ B \in V_N,\ x \in V_G^*\}$$
where $h : V_G^* \to V_N^*$ is defined by $h(X) = X$ for $X \in V_N$, $h(a) = \lambda$ for $a \in V_T$. We
have
$$L(G') = Sz_{left}(G).$$
A grammar $G''$ for the intersection $L_1 \cap Sz_{left}(G)$ can be effectively constructed
starting from the automaton $A_1$ and the grammar $G'$ (the classical triple
construction). Having $G''$, finding a derivation in $G''$ producing a non-empty string
$y$ is algorithmically possible (explore the finite set of non-recurrent derivations,
for example); for the string $y$ we have $x = g(y)$, hence the construction works.
□
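The control-word grammar $G'$ of this proof is easy to build explicitly. The sketch below constructs $P'$ for the grammar of Example 1 and then looks for a control word $y$ with $g(y) = x$. Note the substitution: instead of the intersection with the automaton $A_1$ it uses a plain breadth-first search bounded by a length `max_len`, which is an assumption made for brevity, not the construction used in the proof.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

RULES: Dict[str, Tuple[str, str]] = {          # grammar of Example 1
    "r1": ("S", "aSb"), "r2": ("S", "cAB"), "r3": ("A", "aAb"),
    "r4": ("A", "c"),   "r5": ("B", "AB"),  "r6": ("B", "c"),
}
G_IMAGE = {"r3": "0", "r5": "1"}               # the morphism g of Example 1

def control_grammar(rules):
    """P' = { B -> r h(x) | r : B -> x in P }, with h erasing terminals.
    Stored as r -> (B, h(x)); the label r is the first symbol produced."""
    return {r: (lhs, tuple(s for s in rhs if s.isupper()))
            for r, (lhs, rhs) in rules.items()}

def find_control_word(x: str, max_len: int) -> Optional[List[str]]:
    """Shortest word y of L(G') with g(y) = x and |y| <= max_len, if any."""
    cg = control_grammar(RULES)
    queue = deque([((), ("S",))])  # (labels produced, pending nonterminals)
    while queue:
        prefix, pending = queue.popleft()
        seen = "".join(G_IMAGE.get(r, "") for r in prefix)
        if not pending:
            if seen == x:
                return list(prefix)
            continue
        # each pending nonterminal still costs at least one label
        if len(prefix) + len(pending) > max_len or not x.startswith(seen):
            continue
        for r, (lhs, tail) in cg.items():
            if lhs == pending[0]:
                queue.append((prefix + (r,), tail + pending[1:]))
    return None
```

For the message `10` the search returns the control word $r_2 r_4 r_5 r_3 r_4 r_6$, whose image under $g$ is indeed `10`.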
Some further remarks about (0, 1) - totalness question are worth mention-
ing, taking into account the importance of this notion for the above type of
cryptosystems.
$G = (V_N \cup \{B\}, V_T, S, P')$
grammar $G$ is unambiguous. Indeed, $G'$ is unambiguous; no two different derivations
using rules $r_1 - r_4$ can generate the same string (assume, without loss of
generality, that $u \neq \lambda$; in a string generated by rules $r_1 - r_4$, starting from
$A$ or from $B$, the occurrence of $uc$ identifies the use of $r_1$, each occurrence of
$ud$ identifies the use of $r_2$, whereas a substring $dd$ points to a use of $r_3$). No
two different derivations using both rules in $P'$ and rules $r_1 - r_4$ can generate
the same string (each pair of occurrences of the symbol $c$ identifies a symbol $A$
rewritten by $r_1$; a string in $(V_N \cup V_T)^*$ has a unique derivation in $G'$; from each
$A$ we generate a string bounded by "parentheses" $uc$, $cv$ etc.). In conclusion, $G$
is unambiguous (hence it can be incorporated in a cryptosystem as above). □
Remark 2 Before closing this section, let us remark that the previous
cryptosystem is similar to some extent to that considered in [9], [10] and investigated
in [4], [5]: here we deal with a Chomsky grammar $G$ with labelled rules, whereas in
[9], [10] one takes a T0L (or a DT0L) system which is used in a similar
way, assuming its tables are labelled (the tables are total by definition, hence the
T0L/DT0L system is total, the backward determinism corresponds to
unambiguity and so on).
1. $h(L(G_1)) \subseteq L(G_2)$;
2. for each $x \in Sz_{left}(G_1)$ we have $h(G_{1,left}(x)) = G_{2,left}(h'(x))$.
A nice pair of grammars as above is called useful if:
3. $G_1$ is non-context-free, whereas $G_2$ is context-free and unambiguous;
4. both $G_1$ and $G_2$ are (0, 1)-total and $r_1, r_2 \in Lab_1$ are associated to
symbols 0, 1 in $G_2$ ($G_1$, $G_2$ are (0, 1)-total with this assignment of 0, 1
to labels of their rules).
Having a useful nice pair of grammars, $G_1$, $G_2$, with labelling mappings $\varphi_1$, $\varphi_2$,
the morphisms $h$, $h'$ and the assignment mapping $g : Lab_1 \cup Lab_2 \to \{0, 1, \lambda\}$,
the public key is constituted by $G_1$, $\varphi_1$, $g|_{Lab_1}$ and the secret trapdoor by
$G_2$, $\varphi_2$, $h$, $h'$, $g|_{Lab_2}$. To encrypt a message $x \in \{0, 1\}^*$ we first encode it by $g^{-1}$,
shuffle it with a string in $(Lab_1 \setminus g^{-1}(\{0, 1\}))^*$, construct a leftmost derivation
in $G_1$ and take the obtained string $y$ as the ciphertext. The illegal
cryptanalyst has to parse $y$ with respect to $G_1$, a hard problem. The legal receiver
takes the string $h(y)$ and parses it with respect to the context-free grammar
$G_2$; applying $g$ to the obtained control string, the message $x$ is found.
Surprisingly enough, there exist useful nice pairs of grammars.
Example 2 Let $G_1 = (\{S, B\}, \{a, b, c, d\}, S, P_1)$ be the grammar with the rules
$$r_1 : S \to aSBc, \quad r_2 : S \to aSBd,$$
$$r_3 : S \to ab, \quad r_4 : cB \to Bc,$$
$$r_5 : dB \to Bd, \quad r_6 : bB \to bb,$$
whereas $G_2 = (\{S\}, \{a, c, d\}, S, P_2)$ contains the rules
$$r_1' : S \to aSc, \quad r_2' : S \to aSd, \quad r_3' : S \to a$$
(we have also specified the labelling). Consider also
$$h : \{a, b, c, d\}^* \to \{a, c, d\}^*$$
defined by
$$h(a) = a, \quad h(b) = \lambda, \quad h(c) = c, \quad h(d) = d$$
and
$$h' : \{r_1, r_2, r_3, r_4, r_5, r_6\}^* \to \{r_1', r_2', r_3'\}^*$$
defined by
$$h'(r_1) = r_1', \quad h'(r_2) = r_2', \quad h'(r_3) = r_3', \quad h'(r_i) = \lambda \ (i = 4, 5, 6).$$
Associate 0 to $r_1$ in $G_1$ and to $r_1'$ in $G_2$, and 1 to $r_2$ in $G_1$ and to $r_2'$ in $G_2$.
Clearly, $G_1$ is context-sensitive, $G_2$ is unambiguous context-free,
0, 1 are associated to $r_1$, $r_2$ in $G_1$ and to $h'(r_1)$, $h'(r_2)$ in $G_2$, and if in a
leftmost derivation in $G_1$ we ignore the rules $r_4$, $r_5$, $r_6$, then a derivation with
a control word in $\{r_1, r_2, r_3\}^*$ is found, corresponding to a derivation in $G_2$
producing the string obtained by applying $h$ to the string obtained in $G_1$ (that is,
$h(G_{1,left}(x)) = G_{2,left}(h'(x))$).
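The relation between $G_1$ and $G_2$ can be checked on a small case. The sketch below makes two assumptions not stated in the text: a step in the non-context-free grammar $G_1$ rewrites the leftmost occurrence of the left-hand side of the chosen rule, and $r_3$ is mapped to $r_3'$ while $r_4$, $r_5$, $r_6$ are erased. The control word is an illustrative one for the message `01`, and `decode_with_g2` is a hypothetical helper name.

```python
from typing import Dict, List, Optional, Tuple

G1: Dict[str, Tuple[str, str]] = {
    "r1": ("S", "aSBc"), "r2": ("S", "aSBd"), "r3": ("S", "ab"),
    "r4": ("cB", "Bc"),  "r5": ("dB", "Bd"),  "r6": ("bB", "bb"),
}
G2: Dict[str, Tuple[str, str]] = {
    "r1'": ("S", "aSc"), "r2'": ("S", "aSd"), "r3'": ("S", "a"),
}
H_PRIME = {"r1": "r1'", "r2": "r2'", "r3": "r3'"}  # r4, r5, r6 -> lambda

def derive(rules, control: List[str], start: str = "S") -> Optional[str]:
    """Apply each rule at the leftmost occurrence of its left-hand side."""
    form = start
    for label in control:
        lhs, rhs = rules[label]
        pos = form.find(lhs)
        if pos < 0:
            return None
        form = form[:pos] + rhs + form[pos + len(lhs):]
    return form

def h(word: str) -> str:
    """The morphism h: erase b, keep a, c, d."""
    return word.replace("b", "")

def decode_with_g2(word: str) -> str:
    """Parse a string a^(n+1) w of L(G2); w read from right to left
    lists the rules r1' (c -> 0) and r2' (d -> 1) in order of use."""
    n = (len(word) - 1) // 2
    return "".join("0" if s == "c" else "1" for s in reversed(word[n + 1:]))

CONTROL = ["r1", "r2", "r3", "r6", "r5", "r6"]  # encodes the message 01
Y = derive(G1, CONTROL)                          # ciphertext in L(G1)
Y2 = derive(G2, [H_PRIME[r] for r in CONTROL if r in H_PRIME])
```

Here `Y` is $a^3b^3dc$, `h(Y)` coincides with `Y2`, the string $a^3dc$, and decoding it with $G_2$ gives back `01`.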
Unfortunately, the above nice pair of grammars is not good for a real
cryptosystem, as parsing with respect to $G_1$ is about as easy as parsing with respect
to $G_2$ (the string $x$ in the above writing of strings in $G_1$ and $G_2$, $a^n b^n x$ and $a^n x$
respectively, precisely identifies the rules used, $r_1$, $r_2$ and $r_1'$, $r_2'$ respectively). It is
a significant problem to find useful nice pairs of grammars for which parsing
with respect to $G_1$ is significantly harder than parsing with respect to $G_2$.
This example also shows that our definition of usefulness does not capture all
the essential requirements.
We hope to return to this issue in a forthcoming contribution.
References
[1] M. Andrasiu, Gh. Paun - A cryptosystem based on gsm mappings,
manuscript, 1980.
[2] M. Andrasiu, J. Dassow, Gh. Paun, A. Salomaa - Language-theoretic
problems arising from Richelieu cryptosystems, Th. Computer Sci.
[3] J. Gruska - Descriptional complexity of context-free languages, Proc. 2nd
MFCS Symp., High Tatra, 1973, 71-83.
[4] J. Kari - A cryptanalytic observation concerning systems based on language
theory, Discr. Appl. Math., 21 (1988), 265-268.
[5] J. Kari - Observation concerning a public-key cryptosystem based on
iterated homomorphisms, Th. Computer Sci., 66 (1989), 45-53.
[6] V. Niemi - Hiding regular language public-key cryptosystems,
RAIRO/Th. Informatics, submitted 1989.
[7] A. Salomaa - Formal languages, Academic Press, New York, 1973.
[8] A. Salomaa - Public-key cryptography, Springer-Verlag, Berlin,
Heidelberg, 1990.
[9] A. Salomaa, E. Welzl - A cryptographic trapdoor based on iterated
morphisms, manuscript, 1983.
[10] A. Salomaa, S. Yu - On a public-key cryptosystem based on iterated
morphisms and substitutions, Th. Computer Sci., 48 (1986), 283-296.