
Chapter 3

Turing machines

1. Definition and examples


[See Sipser §3.1 for more information on this section.]
A Turing machine is a type of machine that is more powerful than a PDA: it
can recognise languages that are not context-free, and in addition to accepting
and rejecting words it can also produce output. We shall see in the next chapter
that Turing machines can do anything that a real computer can do.
Slightly more formally, a Turing machine is like a deterministic finite state
automaton, but with a tape that it can read and write and move about on. The
tape has a left hand end, but is infinite in the other direction, and is divided up
into squares called cells. The blank symbol ⊔ indicates that nothing is written in
that cell.

Definition 1.1. A Turing machine (TM) is a 7-tuple M = (Q, Σ, Γ, δ, q0, qa, qr)
where Q, Σ and Γ are finite sets and

1. Q is a set of states, containing q0, qa and qr.

2. Σ is the input alphabet (not containing the blank symbol ⊔).

3. Γ is the tape alphabet (containing ⊔, all of Σ, and maybe some more symbols).

4. δ : (Q \ {qa, qr}) × Γ → Q × Γ × {L, R} is the transition function. The input
to δ is the current state, and the letter in the current cell. The output of δ is
the new state, a letter to write into the current cell (replacing the old letter),
and an instruction to move one cell left or right.

5. q0, qa, qr are the start, accept and reject states: we require qa ≠ qr.

The Turing machine M begins in the designated start state, q0 ∈ Q. The input
word is on the tape at the left hand end, one symbol per cell, with all the infinitely
many cells after it being blank. The location of the read/write head is the current
cell: the initial current cell is the cell at the leftmost end of the tape.
A typical step consists of the following. The Turing machine M consults the
current state, q ∈ Q, and the tape symbol g ∈ Γ that is in the current cell. If
δ(q, g) = (q′, g′, L) then the TM goes to state q′ ∈ Q, replaces the contents of the
current cell with g′, and moves one cell left along the tape (unless the current cell
is at the left hand end, in which case it stays put). If δ(q, g) = (q′, g′, R) then the
TM does the same thing, except it moves one cell right.
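This step rule translates directly into a small program. The following Python sketch is purely illustrative (the names BLANK, step and run_tm are not from the notes): it keeps the tape as a list that grows on demand, and applies δ until the accept or reject state is reached.

```python
# A minimal single-tape Turing machine simulator (an illustrative sketch; the
# names BLANK, step and run_tm are not from the notes).
BLANK = '_'                          # stands for the blank symbol ⊔

def step(delta, state, tape, head):
    """Apply the transition function delta once; return the new (state, head)."""
    new_state, written, move = delta[(state, tape[head])]
    tape[head] = written             # overwrite the current cell
    if move == 'R':
        head += 1
        if head == len(tape):        # moved onto a fresh blank cell
            tape.append(BLANK)
    else:                            # move == 'L': stay put at the left-hand end
        head = max(head - 1, 0)
    return new_state, head

def run_tm(delta, q0, qa, qr, word, max_steps=10_000):
    """Return (accepted?, final tape contents) for the given input word."""
    tape = list(word) if word else [BLANK]
    state, head = q0, 0
    for _ in range(max_steps):
        if state in (qa, qr):        # the machine stops in qa or qr
            return state == qa, ''.join(tape)
        state, head = step(delta, state, tape, head)
    raise RuntimeError('no halting state reached within max_steps')
```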

Similarities with DFAs and PDAs


1. The current tape symbol g ∈ Γ is like an input symbol: just like for a DFA,
the combination of it and the current state q 2 Q completely determine what
happens next.

2. The input word is the initial word on the tape.

3. The state set Q is finite, and contains a single start state q0 .

4. The input alphabet Σ is finite.

5. The tape alphabet Γ is like the stack alphabet of a PDA, except that it must
contain both the input alphabet Σ and the ‘blank’ symbol, ⊔.

6. The TM can’t tell when it is at the left hand end of the tape, just as a PDA
can’t tell when the stack is empty. Similarly to pushing $ onto the stack at
the beginning of running a PDA, each Turing machine generally starts by
marking the first cell, so that it can be identified later.

Differences with DFAs and PDAs

1. Unlike a DFA, a Turing machine can write on the tape, instead of just reading
input.
Unlike a PDA, the Turing machine can write anywhere on the tape, once it
has moved the read/write head to the desired cell. At each step the Turing
machine must move one cell left or one cell right, unless the read/write head
is at the left-most end, when a “move left” instruction results in staying put.

2. There is a single accept state, qa , and a single reject state, qr — the machine
stops as soon as one of these is entered.

Example 1.2. Consider the Turing machine M1 = ({q0, q1, qa, qr}, {a, b}, {a, b, ⊔},
δ, q0, qa, qr) where δ is given by

        a          b          ⊔
q0      q0, a, R   q1, b, R   qr, ⊔, L
q1      qr, a, L   qr, a, L   qa, b, L

Notice that when defining δ, we don’t need to specify what happens when the
machine is in state qa or qr: it immediately stops.
Here is a picture of M1 . It is very similar to pictures of DFAs/PDAs - only the
arrow labels and accept/reject states have changed.

In the picture, an arrow labelled x → y, R means “read an x, write a y, move one
cell right”. (Similarly, x → y, L means “read an x, write a y, move one cell left”.)
Since there is a unique accept state and a unique reject state, we no longer use
double rings on the states.

Definition 1.3. In tracking the behaviour of a Turing machine, we write down a
sequence of configurations. The configuration uqv, where u, v ∈ Γ* and q ∈ Q,
means that the tape contains uv, the current state is q, and the current cell is the
first cell of v. The initial configuration with input w is q0 w and the halting configurations
are uqa v and uqr v, for any u, v ∈ Γ*. A configuration C1 yields a configuration C2
if the Turing machine can legally go from C1 to C2 in a single step.

Example 1.2, ctd. Here are the configurations that M1 enters on input aab:

q0 aab, aq0 ab, aaq0 b, aabq1 ⊔, aaqa bb.

Here are the configurations that M1 enters on input aba:

q0 aba, aq0 ba, abq1 a, aqr ba.
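Continuing the sketch from earlier (still illustrative, and not part of the formal development), M1’s table becomes a Python dictionary, and the configuration sequences above can be reproduced by a small generator that emits uqv at each step; delta_M1 and configurations are invented names.

```python
# M1's transition table for the sketch simulator, and the configuration
# sequences uqv on inputs aab and aba (reusing BLANK and step from above).
delta_M1 = {
    ('q0', 'a'): ('q0', 'a', 'R'), ('q0', 'b'): ('q1', 'b', 'R'), ('q0', BLANK): ('qr', BLANK, 'L'),
    ('q1', 'a'): ('qr', 'a', 'L'), ('q1', 'b'): ('qr', 'a', 'L'), ('q1', BLANK): ('qa', 'b', 'L'),
}

def configurations(delta, q0, qa, qr, word, max_steps=100):
    tape, state, head = list(word) if word else [BLANK], q0, 0
    for _ in range(max_steps):
        yield ''.join(tape[:head]) + state + ''.join(tape[head:])
        if state in (qa, qr):
            return
        state, head = step(delta, state, tape, head)

print(list(configurations(delta_M1, 'q0', 'qa', 'qr', 'aab')))
# ['q0aab', 'aq0ab', 'aaq0b', 'aabq1_', 'aaqabb']
print(list(configurations(delta_M1, 'q0', 'qa', 'qr', 'aba')))
# ['q0aba', 'aq0ba', 'abq1a', 'aqrba']
```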

Definition 1.4. A Turing machine M accepts an input word w ∈ Σ* if there exists
a sequence of configurations C1, C2, . . . , Ck such that C1 = q0 w, each Ci yields Ci+1,
and Ck is an accept configuration (so the state in Ck is the accept state).
The language L(M ) that is recognised by M is the set of all strings that are
accepted by M . A language L is called Turing recognisable if there exists a Turing
machine M with L = L(M ).

Example 1.2, ctd. We have just seen that M1 accepts aab and rejects aba. Let’s
explore more of what M1 does, to determine L(M1). We observe that M1:
• rejects the empty word;
• stays in the start state q0 whilst reading as. If it then sees a b, it goes to q1,
but if instead it reaches the end of the input word and sees a blank symbol,
it rejects;
• accepts a word if the first b is immediately followed by the end of the word,
and otherwise rejects.
So M1 recognises the regular language L(a*b). Furthermore, on input a word w ∈
L(a*b), when M1 halts the tape contains wb. On input a word w ∉ L(a*b), when M1
halts the tape contains w.
Example 1.5. We design a Turing machine M2 which recognises L2 = {0^(2^n) : n ≥ 0}.
You can show (fairly easily) that this language is not context-free.

High level description: Cross off alternate 0s until the end of the input, rejecting
if the number of 0s is odd and at least three. This crosses off half of the remaining
0s. If there is an even number of 0s, go back to the beginning of the word, and
repeat. If we get down to a single 0, then accept.

Implementation level description:


This means we give a precise description of M2 , in terms of the movement of the
tape head and the reading and writing steps, but we don’t actually write down the
set of states and the transition function. This is not yet a formal description of the
machine.
M2 = “on input w ∈ 0*:

1. If the initial cell is blank, reject.

2. Change the first 0 to 0·.
3. Sweep left to right along the tape, replacing every second 0 (starting with the
first undotted 0) with an X, until reaching the first blank cell.
4. (a) If the number of 0s (including the initial 0·) was odd and greater
than 1, reject. This can be identified by having just skipped a 0, but not
yet crossed off the matching 0.
(b) If instead the only 0 that was seen was the initial 0·, accept.
5. Go back to the initial cell (containing 0·).
6. Start again at Step 3.”
Formal description: Finally, we give a formal definition: M2 = (Q, Σ, Γ, δ, q1, qa, qr),
with
Q = {q1, q2, q3, q4, q5, qa, qr},
Σ = {0}, Γ = {0, X, ⊔, 0·}. We describe δ with a table:
State    0           ⊔           X           0·
q1       q2, 0·, R   qr, ⊔, R    qr, X, R    qr, 0, R
q2       q3, X, R    qa, ⊔, R    q2, X, R    qr, 0·, R
q3       q4, 0, R    q5, ⊔, L    q3, X, R    qr, 0·, R
q4       q3, X, R    qr, ⊔, R    q4, X, R    qr, 0·, R
q5       q5, 0, L    qr, ⊔, R    q5, X, L    q2, 0·, R
Here M2 is in state q1 as it replaces the first 0 with a 0· (or rejects the empty word).
It is in q3 when it has just crossed out a 0, which means that on the current pass
it has seen an even number of 0s. It is in state q4 when it has just skipped a 0,
which means that on the current pass it has seen an odd number (greater than one)
of 0s. It is in q5 as it goes back to the beginning of the tape, and in q2 when it has
seen no undotted 0s on the current pass of the tape.
Here’s a picture of M2 :

In this picture, we have adopted a standard shorthand for the labels, where if
a transition doesn’t change a symbol (formally, if the read symbol and the written
symbol are identical), then we just write x → R (or x → L), rather than x → x, R
(or x → x, L).
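For concreteness, the same table can be typed into the simulator sketched earlier. This is purely illustrative and not part of the formal definition; ‘D’ stands in for the dotted symbol 0· simply because a plain ASCII name is convenient.

```python
# M2's transition table for the sketch simulator, with 'D' standing in for 0· .
D = 'D'
delta_M2 = {
    ('q1', '0'): ('q2', D, 'R'),   ('q1', BLANK): ('qr', BLANK, 'R'),
    ('q1', 'X'): ('qr', 'X', 'R'), ('q1', D): ('qr', '0', 'R'),
    ('q2', '0'): ('q3', 'X', 'R'), ('q2', BLANK): ('qa', BLANK, 'R'),
    ('q2', 'X'): ('q2', 'X', 'R'), ('q2', D): ('qr', D, 'R'),
    ('q3', '0'): ('q4', '0', 'R'), ('q3', BLANK): ('q5', BLANK, 'L'),
    ('q3', 'X'): ('q3', 'X', 'R'), ('q3', D): ('qr', D, 'R'),
    ('q4', '0'): ('q3', 'X', 'R'), ('q4', BLANK): ('qr', BLANK, 'R'),
    ('q4', 'X'): ('q4', 'X', 'R'), ('q4', D): ('qr', D, 'R'),
    ('q5', '0'): ('q5', '0', 'L'), ('q5', BLANK): ('qr', BLANK, 'R'),
    ('q5', 'X'): ('q5', 'X', 'L'), ('q5', D): ('q2', D, 'R'),
}

for n in range(1, 9):
    accepted, _ = run_tm(delta_M2, 'q1', 'qa', 'qr', '0' * n)
    print(n, accepted)    # accepts exactly when n is a power of 2
```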

Example 1.6. We design a Turing machine M3 to test membership of L3 =
{w#w : w ∈ {0, 1}*}. You can check that L3 is not context-free (the proof is
similar to Chapter 2, Example 4.3). We give an implementation level description
of M3; a short programmatic sketch of the same steps appears after the list.
M3 = on input x ∈ {0, 1, #}*:
1. Reject the empty word.

2. Mark the first cell with a dot, so it can be recognised again later.

3. Read each letter of x until reaching the first blank cell. Reject if x contains
either no # symbols, or more than one # symbol.

4. Go back to the beginning of the tape.

5. Look for the first 0 or 1.

(a) If there are none before the #, look after the # for the first 0 or 1. If
there are none, accept. If there is one, reject.
(b) If a 0 or 1 is found before the #, replace it with an X. Then look for the
first 0 or 1 after the #. If it matches, replace it with an X. If it doesn’t
match, or doesn’t exist, reject.

6. Go to Step 4.
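As a sanity check of these steps (not the Turing machine itself, and not part of the notes), the crossing-off procedure can be mirrored directly in ordinary code, using a list in place of the tape; m3_accepts is an invented name.

```python
# A rough translation of M3's crossing-off procedure into ordinary Python
# (a check of the steps on a list "tape"; m3_accepts is an invented name).
def m3_accepts(x: str) -> bool:
    tape = list(x)
    if tape.count('#') != 1:                 # Steps 1 and 3: exactly one #
        return False
    sep = tape.index('#')
    while True:                              # Steps 4-6: repeat until decided
        left = next((i for i in range(sep) if tape[i] in '01'), None)
        right = next((i for i in range(sep + 1, len(tape)) if tape[i] in '01'), None)
        if left is None:                     # Step 5(a): nothing left before #
            return right is None
        if right is None or tape[left] != tape[right]:   # Step 5(b): mismatch
            return False
        tape[left] = tape[right] = 'X'       # cross off the matched pair

print(m3_accepts('01#01'), m3_accepts('01#10'), m3_accepts('#'))  # True False True
```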

Example 1.7. We give an implementation level description of a Turing machine
M4 which recognises L4 = {a^n b^n | n > 0} – a non-regular but context-free language.
M4 = on input w ∈ {a, b}*:
1. If the first symbol is an a, replace it with a·. If it’s a b or blank, then reject.

2. Read as until reaching a cell that is not an a. If it’s not a b, reject.

3. Read bs until reaching a cell that is not a b. If it’s not ⊔, reject.

4. Go back to the beginning of the word.

5. Read left to right, ignoring Xs and Ys.

(a) If an a (or a·) is found before any bs or blank cells, replace it with an
X and go to Step 6.
(b) If instead a b is found before any as or blank cells, reject.
(c) If instead a blank is found before any as or bs, accept.

6. Read left to right, ignoring as and Ys.

(a) If a b is found, replace it by a Y, and go to Step 7.

(b) If instead a blank cell or an X is found, reject.

7. Go left past Ys and as until an X is seen. Go to Step 5.



2. Variants of Turing machines


[See Sipser, §3.2 for more information on this section.]
Many variations on the definition of a Turing machine have been proposed. In
each case, it can be shown that the new definition of Turing machine does not
increase the power of Turing machines.
We say that a Turing machine M1 is equivalent to a Turing machine M2 if
L(M1 ) = L(M2 ). More precisely therefore, we shall show that if a language L is
recognised by one of the new variant Turing machines, then there exists a standard
Turing machine which recognises L. In this section we’ll look at just two of the
most common variations: more are on the tutorial sheets.
The following observation will be useful.

Lemma 2.1. Let Σ and Γ be finite alphabets, with Σ ⊂ Γ and ⊔ ∈ Γ \ Σ. There
exists a Turing machine that, on input w ∈ Σ*, halts with ww on the tape. For any
u ∈ Γ*, there exists a Turing machine that, on input w ∈ Σ*, halts with uw on
the tape.

Proof. See Tutorial Sheet 2. ∎

Definition 2.2. A multitape Turing machine is a 7-tuple M = (Q, Σ, Γ, δ, q0, qa, qr),
where each item other than δ is the same as for a standard TM. The machine has
k > 1 tapes (all with alphabet Γ). Each tape has its own read/write head. The
function δ takes as input the current state and all k current tape symbols. As output
it gives a new state, a letter to be written on each of the k current cells, and an
instruction to move left or right on each tape. Initially, the input is on tape 1, all
of the other tapes are blank, and each tape head is at the left hand end of its tape.

A 3-tape machine M during computation.

The equivalent step on a single tape machine S.

Theorem 2.3. Let M = (QM, ΣM, ΓM, δM, q0, qa, qr) be a multitape Turing
machine. Then there exists a single tape Turing machine S that is equivalent to M.

Proof. We design a single tape Turing machine S with L(S) = L(M). Let
k be the number of tapes of M. Then S simulates the k tapes by storing their
information on its single tape.
The input alphabet of S is ΣM. The tape alphabet ΓS of S contains ΓM, but
will also contain some new symbols. The TM S uses a new symbol, say $ ∉ ΓM,
for the very first cell on the tape. It also uses a new symbol, say # ∉ ΓM, to mark
the end of the used portion of each of the k tapes. For each a ∈ ΓM, the alphabet

ΓS contains both a and a dotted copy a·. The dotted symbols will be used to indicate
where each of the k read/write heads of M is at each step.
On input w = w1 . . . wn, the machine S starts by setting up its tape to simulate
all k tapes of M. That is, using the ideas from Lemma 2.1, S modifies the tape so
it contains

$ w1· w2 . . . wn # ⊔· # ⊔· # . . . # ⊔· # ⊔ ⊔ . . .

with k # signs.
To simulate a single move of M, the TM S reads the tape from the $ to the final
#. It goes into a state which depends on the k-tuple of letters that were marked
with dots, which is possible since there are only |ΓM|^k possible such tuples. It then
goes back to the beginning of the tape, and goes through the current word a second
time, updating the ith part of the tape according to how δM tells M to act on tape
i: it replaces the dotted symbol by whatever M writes there, and puts a dot on the
symbol either one cell left or one cell right.
If at any point during this update S moves one of the imaginary read/write
heads, say head i, to the right onto a #, this means that M would be moving
read/write head i onto a new blank cell of tape i. So, as in Lemma 2.1, S copies
all of the rest of the tape one cell further down, to free up a blank cell. Then it
continues the simulation as before. ∎
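The bookkeeping in this simulation is easier to follow with a small sketch of just the tape encoding that S maintains. The following is illustrative only (the function names are invented, and a dotted symbol is written here as the symbol followed by '.'): it packs k tapes and their head positions into one list, and reads off the k-tuple of currently scanned symbols, as in the first pass of S.

```python
# Sketch of the single-tape encoding maintained by S (illustrative; a dotted
# symbol is written here as the symbol followed by '.').
def encode(tapes, heads, blank='_'):
    """Pack k tapes (lists of symbols) and their head positions into one list."""
    cells = ['$']
    for tape, head in zip(tapes, heads):
        tape = tape or [blank]               # an empty tape shows a single blank
        for i, sym in enumerate(tape):
            cells.append(sym + '.' if i == head else sym)
        cells.append('#')                    # end of this virtual tape
    return cells

def current_symbols(cells):
    """Read off the k-tuple of symbols under M's heads (the first pass of S)."""
    return tuple(c[:-1] for c in cells if c.endswith('.'))

cells = encode([list('aab'), [], []], [0, 0, 0])
print(cells)                  # ['$', 'a.', 'a', 'b', '#', '_.', '#', '_.', '#']
print(current_symbols(cells)) # ('a', '_', '_')
```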

The idea of nondeterminism for Turing machines is similar to that for NFAs or
PDAs: the transition function returns a set of permitted moves rather than a single
one, and the machine splits into several copies of itself (each with its own tape) to
carry out all of the branches of computation in parallel.

Definition 2.4. A nondeterministic TM is a 7-tuple N = (Q, Σ, Γ, δ, q0, qa, qr),
where each item other than δ is the same as for a standard TM. The transition
function for a nondeterministic machine has the form

δ : (Q \ {qa, qr}) × Γ → P(Q × Γ × {L, R}).

The machine accepts if any branch of the computation leads to an accept state.

We will show that non-determinism doesn’t change the set of languages that
TMs recognise, although we will see in the second half of the course that it may
have a significant effect on their speed.

Definition 2.5. Let A be an alphabet. We describe a way to order all of the words
in A* called the len-lex ordering (for length, then lexicographical). First, order the
elements of A, say A = {a1, . . . , an}. Then order the elements of A* as follows:

ε, a1, a2, . . . , an, a1a1, a1a2, . . . , a1an, a2a1, a2a2, . . . , anan, a1a1a1, a1a1a2, . . . .

That is, order the words by length, then within each length order them lexicographically
(dictionary ordering).

Example 2.6. Let A = {0, 1}. We order A as 0, 1. Then the first few elements of
A* in len-lex order are:

ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111.

Theorem 2.7. Let N = (QN, ΣN, ΓN, δN, q0, qa, qr) be a nondeterministic Turing
machine. Then there exists a deterministic Turing machine D that is equivalent to
N.

Proof. We design a deterministic Turing machine D with L(D) = L(N). To
make things easier, we allow D to have three tapes: by Theorem 2.3, the TM D is
equivalent to a single-tape machine.
Each output of δN is a subset of QN × ΓN × {L, R}. Order the triples in
QN × ΓN × {L, R} (in any way you like), and let b = 2|QN||ΓN| = |QN × ΓN × {L, R}|.
We will describe each possible sequence of configurations that N can produce
(and some that it can’t) as a word in {1, . . . , b}*, as follows. A word c = c1 c2 . . . cm ∈
{1, . . . , b}* means that at step i we consider the branch of N’s computation given by
the ci th triple from the set of triples returned by δN (using our fixed ordering on
the triples). Whilst working with c, we stop considering these configurations after
at most m of them, even if the corresponding state of N is not a halting state:

• If N reaches a halting state before cm then this branch stops when N halts.

• Some words c = c1 . . . cm ∈ {1, . . . , b}* may not describe valid computations
(for example, we might have c1 = 3, but in fact on input w = w1 w2 we find
that δN(q0, w1) is a set of size two).

If so, then we ignore the part of c after the last valid step. The empty word
corresponds to considering only the start configuration.
The machine D works as follows. Tape 1 holds the input word w, and never
changes (except for any marks we need to copy w to other tapes). Tape 2 records
N’s tape on the current branch of its nondeterministic computation, with an “end of
word” symbol at the end so we can easily erase the tape afterwards. Tape 3 records
which branch of N’s nondeterministic computations is currently being explored by
D. Tape 3 only uses the symbols in {1, . . . , b, ⊔}. Initially Tapes 2 and 3 contain
the empty word.
D repeatedly carries out the following 3 steps:

1. Copy the input word, w, from Tape 1 to Tape 2. Put a symbol to mark the
end of the used portion of Tape 2.

2. Use Tape 2 to simulate N with input w on the branch of N ’s nondeterministic


computation given by the word c = c1 . . . cm on Tape 3.
That is, for i = 1, . . . , m do the following:

(a) Find the set T of triples returned by δN at the ith step of N’s calculations.
(b) If ci > |T |, then go to Step 3.
(c) Let (q, x, D) ∈ QN × ΓN × {L, R} be the ci th triple in T.

i. If q = qa then accept.
ii. If q = qr then go to Step 3.
iii. Otherwise, do to Tape 2 whatever N would do at this point, accord-
ing to (q, x, D). Move the end of word symbol one cell further along
Tape 2 if necessary.

3. Replace the word c on Tape 3 with the len-lex next word in {1, . . . , b}* (see
Tutorial Sheet 2 for how to do this). Erase Tape 2. Go to Step 1.

Definition 2.8. If one of the words c ∈ {1, . . . , b}* corresponds to a branch of
computation in the nondeterministic TM that leads to the accept state on input w,
then the word c is called a certificate that w is accepted.

A certificate is a deterministically machine-checkable proof that w is accepted.
Notice that if the user were able to provide c at the same time as w, then our machine
would be able to write c directly onto Tape 3, and could run Step 2 exactly once,
using the fixed word c as the set of instructions for which non-deterministic choice to
make at each step. This would be much quicker than running the whole algorithm
above!
This idea will be very important in the second half of the course, when we study
the time taken by algorithms.
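To make the certificate-checking idea concrete, here is a rough sketch of checking a single branch deterministically: follow exactly the choices that c dictates, one per step, and accept only if that single branch reaches qa. The representation (ndelta mapping a (state, symbol) pair to an ordered list of triples) and the name check_certificate are assumptions made for illustration, not the formal construction above.

```python
# Deterministically check a single branch of a nondeterministic TM, following
# the choices dictated by the certificate c (a sketch; ndelta maps a
# (state, symbol) pair to an ordered list of (state, symbol, move) triples).
def check_certificate(ndelta, q0, qa, qr, word, c, blank='_'):
    tape, state, head = list(word) if word else [blank], q0, 0
    for ci in c:                         # ci is a 1-based index into the triples
        if state in (qa, qr):
            break                        # this branch has already halted
        triples = ndelta.get((state, tape[head]), [])
        if ci > len(triples):            # c does not describe a valid computation
            return False
        state, written, move = triples[ci - 1]
        tape[head] = written
        if move == 'R':
            head += 1
            if head == len(tape):
                tape.append(blank)
        else:
            head = max(head - 1, 0)
    return state == qa                   # accept only if this branch reached qa
```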
