INFORMATION AND CONTROL 18, 253-256 (1971)

Some Closure Properties of the Family of Stochastic Languages

PAAVO TURAKAINEN

Department of Mathematics, University of Turku, Finland

The purpose of the paper is to prove that the family of languages accepted
by finite probabilistic automata is not closed under any of the operations
catenation, catenation closure and homomorphism.

I. INTRODUCTION

Very little is known about the closure properties of the family of stochastic
languages. This is partly due to the fact that methods have not been found
for investigating whether a given language is nonstochastic. Using the
characteristic polynomial of a transition matrix, Paz (1970) managed to find
a context-sensitive language which is not stochastic. The same idea has then
been used by Nasu and Honda (1970) who found a language which is context-
free but not stochastic. Using this language, we proved in (Turakainen, 1970)
that the family of stochastic languages is closed neither under catenation nor
under homomorphism. In this paper, we use another nonstochastic language
and prove that the family of stochastic languages is not closed under catenation
closure. The same language also suffices to establish the above results on
catenation and homomorphism.

II. PRELIMINARIES AND LEMMAS

We write a probabilistic automaton as an ordered quadruple
$PA = (S, M, \pi_0, f_0)$, where $S$ is the finite set of states, $M$ is a
mapping which assigns to each letter $x$ the corresponding transition matrix
$M(x)$, $\pi_0$ is the initial vector, and $f_0$ is the final vector consisting
of 0's and 1's only. The stochastic language accepted by $PA$ with the
cut-point $\eta$ is denoted by $L(PA, \eta)$. If the elements of $f_0$ are
allowed to be arbitrary real numbers, then we obtain a generalized
probabilistic automaton (GPA). For each word $P = x_1 x_2 \cdots x_k$, we denote
by $M(P)$ the matrix $M(x_1) M(x_2) \cdots M(x_k)$. By definition, for the empty
word $\lambda$, $M(\lambda)$ equals the identity matrix. The transpose of a
matrix $C$ is denoted by $C^T$. By the notations $\mathrm{mi}(L)$ and $\sim L$
we mean, respectively, the mirror image and the complement of the language $L$.
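
The definitions above can be made concrete with a minimal Python sketch. It computes the number "initial vector times M(P) times final vector" for a word P and tests it against a cut-point. The strict inequality is taken from the way membership is used in the proof of Lemma 3; the two-state automaton and the names `acceptance_value` and `accepts` are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def acceptance_value(pi0, M, f0, word):
    """Return pi0 * M(word) * f0, where M(word) is the product of letter matrices."""
    v = list(pi0)  # row vector; for the empty word M(word) is the identity
    for letter in word:
        step = M[letter]
        v = [sum(v[i] * step[i][j] for i in range(len(v))) for j in range(len(v))]
    return sum(vj * fj for vj, fj in zip(v, f0))


def accepts(pi0, M, f0, word, cut_point):
    """Membership in L(PA, cut_point): the value must exceed the cut-point."""
    return acceptance_value(pi0, M, f0, word) > cut_point


# Hypothetical 2-state automaton over the one-letter alphabet {a}.
half = Fraction(1, 2)
PI0 = [Fraction(1), Fraction(0)]
F0 = [Fraction(0), Fraction(1)]
M = {"a": [[half, half], [Fraction(0), Fraction(1)]]}

for n in range(4):
    print(n, acceptance_value(PI0, M, F0, "a" * n))
```

Exact rational arithmetic is used so that comparisons with a rational cut-point are not affected by rounding. Allowing arbitrary entries in `F0` turns the same function into the value computed by a GPA.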

LEMMA 1. Let $GPA = (S, M, \pi_0, f_0)$ be a generalized probabilistic
automaton over the alphabet $I$, and let $\eta$ be a rational number. If the
elements of $\pi_0$, $f_0$ and of the matrices $M(x)$ ($x \in I$) are rational,
then the language
$$L = \{P \in I^* \mid \pi_0 M(P) f_0 = \eta\}$$
is stochastic.

The proof of this Lemma is the same as that of Theorem 4 in (Turakainen,
1969).

LEMMA 2. The language
$$L_s = \{x^k y (x^* y)^* x^k y \mid k \ge 0\}$$
is stochastic.
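Proof. Consider the 9-state generalized probabilistic automaton
$GPA = (\{s_1, \ldots, s_9\}, M, \pi_0, f_0)$, where

$$\pi_0 = (\tfrac12, \tfrac12, 0, \ldots, 0), \qquad f_0 = (0, \ldots, 0, 1, -1)^T,$$

$$M(x) = \begin{pmatrix}
\tfrac12 & 0 & 0 & 0 & 0 & 0 & \tfrac12 & 0 & 0 \\
0 & \tfrac14 & 0 & 0 & 0 & 0 & \tfrac34 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac14 & 0 & \tfrac34 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \tfrac12 & \tfrac12 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix},$$

and

$$M(y) = \begin{pmatrix}
0 & 0 & \tfrac12 & 0 & \tfrac12 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & \tfrac12 & 0 & \tfrac12 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}.$$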

Denote
$$L_1 = \{P \in (x + y)^* \mid \pi_0 M(P) f_0 = 0\}.$$
If we draw the graph of GPA, we easily see that the regular language
$\sim x^* y (x^* y)^* x^* y$ is a subset of $L_1$. Denote this language by
$L_2$. Let $P$ be an arbitrary word not belonging to $L_2$. It is of the form
$P = x^k y Q x^l y$, where $k, l \ge 0$ and $Q \in (x^* y)^*$. Denote by $q$ the
number of $y$'s in $Q$. From the graph of GPA we now obtain

$$\pi_0 M(P) f_0 = \tfrac12 (\tfrac12)^k (\tfrac12)^{q+1} (\tfrac12)^{2l} - \tfrac12 (\tfrac12)^{2k} (\tfrac12)^{q+1} (\tfrac12)^l.$$

This number equals 0 if and only if $k = l$. Hence we have $L_1 = L_2 + L_s$.
By Lemma 1, this language is stochastic. Consequently, $L_1 - L_2$ is a
stochastic language, because it is the intersection of the stochastic language
$L_1$ and the regular language $\sim L_2$ (cf. Turakainen, 1968). The proof is
complete, because $L_s = L_1 - L_2$.
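
The argument of Lemma 2 can be checked mechanically on short words. The following Python sketch is an illustration under stated assumptions: the matrix entries are one reading of the printed 9-state GPA (in particular the entries 1/4 and 3/4, chosen so that the automaton produces the factors that vanish exactly when k = l), and the helper names `matrix`, `value`, `in_L2`, `in_Ls` and `check` are hypothetical. It verifies by exhaustive enumeration that the GPA value is zero exactly on the words of L_2 together with the words x^k y (x*y)* x^k y.

```python
from fractions import Fraction
from itertools import product

H, Q, T, ONE = Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1)


def matrix(entries, n=9):
    """Build an n x n matrix from {(row, col): value}, with 1-based indices."""
    m = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), v in entries.items():
        m[i - 1][j - 1] = v
    return m


# Assumed reading of the transition matrices; state 7 acts as a sink.
M = {
    "x": matrix({(1, 1): H, (1, 7): H, (2, 2): Q, (2, 7): T, (3, 3): ONE,
                 (4, 4): ONE, (5, 5): Q, (5, 7): T, (6, 6): H, (6, 7): H,
                 (7, 7): ONE, (8, 7): ONE, (9, 7): ONE}),
    "y": matrix({(1, 3): H, (1, 5): H, (2, 4): H, (2, 6): H, (3, 3): H,
                 (3, 5): H, (4, 4): H, (4, 6): H, (5, 8): ONE, (6, 9): ONE,
                 (7, 7): ONE, (8, 7): ONE, (9, 7): ONE}),
}
PI0 = [H, H] + [Fraction(0)] * 7
F0 = [Fraction(0)] * 7 + [Fraction(1), Fraction(-1)]


def value(word):
    """Compute pi0 * M(word) * f0 exactly."""
    v = list(PI0)
    for letter in word:
        v = [sum(v[i] * M[letter][i][j] for i in range(9)) for j in range(9)]
    return sum(a * b for a, b in zip(v, F0))


def in_L2(word):
    """Complement of x*y(x*y)*x*y: not (ends in y and has at least two y's)."""
    return not (word.endswith("y") and word.count("y") >= 2)


def in_Ls(word):
    """Words x^k y (x*y)* x^k y: first and last x-blocks have equal length."""
    if not word.endswith("y"):
        return False
    blocks = word[:-1].split("y")
    return len(blocks) >= 2 and blocks[0] == blocks[-1]


def check(max_len=8):
    """Return a word violating 'value == 0 iff word in L2 or Ls', or None."""
    for n in range(max_len + 1):
        for letters in product("xy", repeat=n):
            word = "".join(letters)
            if (value(word) == 0) != (in_L2(word) or in_Ls(word)):
                return word
    return None


print(check())
```

A finite enumeration is of course no substitute for the proof; it only guards against misreading individual matrix entries.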

LEMMA 3. The language $L = L_s (x + y)^*$, where $L_s$ is the language of
Lemma 2, is not stochastic.

Proof. We use the same method as Nasu and Honda (1970). Assume that
$L = L(PA, \eta)$, where $PA = (S, M, \pi_0, f_0)$, and let the characteristic
equation of $M(x)$ be

$$a_n t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0 = 0 \qquad (a_n = 1).$$

Using the Hamilton-Cayley Theorem, we now obtain

$$a_n \pi_0 M(x^n P) f_0 + a_{n-1} \pi_0 M(x^{n-1} P) f_0 + \cdots + a_0 \pi_0 M(P) f_0 = 0 \qquad (1)$$

for any word $P \in (x + y)^*$. Here $a_0 + a_1 + \cdots + a_n = 0$, because 1 is
an eigenvalue of the stochastic matrix $M(x)$. Let $a_{k_1}, \ldots, a_{k_r}$
be the positive coefficients, and choose

$$P = y x^{k_1} y x^{k_2} y \cdots y x^{k_r} y.$$

Then $\pi_0 M(x^i P) f_0 > \eta$ if and only if $i$ is one of the numbers
$k_1, \ldots, k_r$. Consequently, the left side of (1) is greater than
$(a_0 + a_1 + \cdots + a_n)\eta$. This contradicts (1), because
$a_0 + \cdots + a_n = 0$. Thus, $L$ is not stochastic.
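
Identity (1) and the vanishing coefficient sum can be illustrated numerically. The sketch below is a minimal example, not part of the proof: the 3-state automaton is hypothetical, and the characteristic polynomial is computed with the Faddeev-LeVerrier recurrence, a technique chosen here only because it works in exact rational arithmetic. The names `char_poly` and `identity_lhs` are assumptions of this sketch.

```python
from fractions import Fraction


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients [a_0, ..., a_n] of det(tI - A) via Faddeev-LeVerrier."""
    n = len(a)
    coeffs = [Fraction(1)]  # leading coefficient a_n = 1
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) / k  # this is a_{n-k}
        coeffs.append(c)
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs[::-1]


def acceptance_value(pi0, M, f0, word):
    v = list(pi0)
    for letter in word:
        v = [sum(v[i] * M[letter][i][j] for i in range(len(v)))
             for j in range(len(v))]
    return sum(a * b for a, b in zip(v, f0))


# Hypothetical 3-state probabilistic automaton over {x, y}.
half, Z, ONE = Fraction(1, 2), Fraction(0), Fraction(1)
M = {
    "x": [[half, half, Z], [Z, half, half], [Z, Z, ONE]],
    "y": [[Z, ONE, Z], [half, Z, half], [ONE, Z, Z]],
}
PI0 = [ONE, Z, Z]
F0 = [Z, Z, ONE]

A = char_poly(M["x"])  # a_0, ..., a_n for M(x)


def identity_lhs(word):
    """Left side of (1): sum over i of a_i * pi0 M(x^i P) f0."""
    return sum(a_i * acceptance_value(PI0, M, F0, "x" * i + word)
               for i, a_i in enumerate(A))


print(A, sum(A), identity_lhs("yxy"))
```

The proof then exploits the sign pattern: the terms with positive coefficients lie strictly above the cut-point and the remaining terms do not, which is incompatible with a weighted sum of zero when the weights themselves sum to zero.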

III. THEOREMS

THEOREM 1. The family of stochastic languages is not closed under
catenation. More specifically, there are stochastic languages $L'$ and $L''$
over a two-letter alphabet $I$ such that $L' I^*$ and $I^* L''$ are not
stochastic.

Proof. Our theorem follows from Lemmas 2 and 3 and from the fact
that $\mathrm{mi}(L) = (x + y)^* \mathrm{mi}(L_s)$, because $\mathrm{mi}(L_s)$
is stochastic but $\mathrm{mi}(L)$ is not (cf. Turakainen, 1969b).

THEOREM 2. The family of stochastic languages is not closed under
homomorphism.

Proof. Let $L_s$ be as in Lemma 2. Then $L_s c (x + y)^*$ is a stochastic
language (cf. Turakainen, 1970, Lemma 3). Define $h(x) = x$, $h(y) = y$, and
$h(c) = \lambda$. Then $h(L_s c (x + y)^*) = L_s (x + y)^*$, which is not
stochastic, by Lemma 3.

In (Turakainen, 1970) we have shown that also for $\lambda$-free homomorphisms
$h$, the image of a stochastic language is sometimes nonstochastic.

THEOREM 3. The family of stochastic languages is not closed under
catenation closure.

Proof. We showed in Lemma 2 that $L_s$ is a stochastic language. Now,
we prove that $L_s^*$ is not a stochastic language, whence the theorem follows.
Assume, on the contrary, that $L_s^* = L(PA, \eta)$ where
$PA = (S, M, \pi_0, f_0)$. We proceed as in the proof of Lemma 3 and choose

$$P = y (x^{k_1} y)(x^{k_2} y)^2 \cdots (x^{k_r} y)^2.$$

Clearly, $\pi_0 M(x^i P) f_0 > \eta$ if and only if $i$ is one of the numbers
$k_1, \ldots, k_r$. This leads to a contradiction in the same way as in Lemma 3.
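
The membership claim used in this proof can be tested on small instances. The Python sketch below is an illustration under stated assumptions: it takes the word P to consist of a leading y, one copy of the first block and two copies of each later block (one reading of the printed formula), and it assumes the exponents are pairwise distinct, as they are when they index distinct coefficients. The helper names `in_Ls`, `in_Ls_star` and `build_P` are hypothetical.

```python
def in_Ls(word):
    """Words x^k y (x*y)* x^k y: first and last x-blocks have equal length."""
    if not word.endswith("y"):
        return False
    blocks = word[:-1].split("y")
    return len(blocks) >= 2 and blocks[0] == blocks[-1]


def in_Ls_star(word):
    """Catenation closure of Ls, decided by dynamic programming over prefixes."""
    n = len(word)
    reachable = [False] * (n + 1)
    reachable[0] = True  # the empty word is a product of zero factors
    for end in range(1, n + 1):
        reachable[end] = any(
            reachable[start] and in_Ls(word[start:end]) for start in range(end)
        )
    return reachable[n]


def build_P(ks):
    """Assumed shape of P: y (x^k1 y) (x^k2 y)^2 ... (x^kr y)^2."""
    first, rest = ks[0], ks[1:]
    return "y" + "x" * first + "y" + "".join(("x" * k + "y") * 2 for k in rest)


KS = [3, 1, 4]  # hypothetical exponents k_1, ..., k_r
P = build_P(KS)
for i in range(7):
    print(i, in_Ls_star("x" * i + P))
```

For an exponent i in the list, the first factor runs from the prefix x^i y to the last copy of the block x^i y, and the remaining blocks pair up into words of L_s. For any other i no later block matches the prefix, so no factorization exists.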

RECEIVED: June 15, 1970

REFERENCES

NASU, M., AND HONDA, N. (1970), A context-free language which is not accepted by
a probabilistic automaton, unpublished.
PAZ, A. (1970), "Formal Series, Finiteness Properties and Decision Problems,"
Technical Report No. 4, Israel Inst. of Technology, Dept. Comput. Sci., Haifa.
TURAKAINEN, P. (1968), On stochastic languages, Information and Control 12, 304-313.
TURAKAINEN, P. (1969a), On languages representable in rational probabilistic automata,
Ann. Acad. Sci. Fenn. Ser. A I 439.
TURAKAINEN, P. (1969b), Generalized automata and stochastic languages, Proc. Amer.
Math. Soc. 21, 303-309.
TURAKAINEN, P. (1970), The family of stochastic languages is closed neither under
catenation nor under homomorphism, Ann. Univ. Turku. Ser. A I 133.
