# Department of Mathematics and Statistics

Indian Institute of Technology
Fundamentals of Undecidability
in Computational Theory
MTH 401 Theory of Computation
Project Report
Arcchit Jain 10142
Minnie Kabra 10400
Oajasavee K. Mourya 10469
Ved Gupta 10790
Contents
1 Preliminaries
2 Gödel's Incompleteness Theorem
3 Post Correspondence Problem
4 Halting Problem
5 Hilbert's Tenth Problem
6 Wang's Tile Problem
7 Busy Beaver
8 Some More Undecidable Problems
9 Hypercomputation
Chapter 1
Preliminaries
Definition Turing Machine
A Turing Machine is a machine described by
1. A finite set of states (represented by Q).
2. An input alphabet (represented by Σ).
3. A tape alphabet (represented by Γ, where Σ ⊂ Γ).
4. A transition function (represented by δ).
5. A start state (represented by $q_0$; always belongs to Q).
6. A blank symbol (represented by B; always belongs to Γ but not to Σ).
7. A set of final states (represented by F; F ⊆ Q).
The transition function takes two arguments, a state in Q and a tape symbol in Γ. Its value
is either undefined or a triple of the form (p, Y, D), where
• p is the next state (p ∈ Q),
• Y is the tape symbol written in place of the scanned symbol (Y ∈ Γ), and
• D is the direction (left or right) in which the head moves along the tape.
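The definition above can be made concrete with a small simulator. This is a minimal sketch, not part of the original report: the function name `run_tm`, the dictionary encoding of δ, the step budget `max_steps`, and the example machine `EVEN_ONES` are all assumptions made for illustration. Acceptance is taken to mean "a final state is reached".

```python
BLANK = "B"  # blank symbol: in the tape alphabet, not in the input alphabet


def run_tm(delta, start, finals, word, max_steps=10_000):
    """Simulate a one-tape Turing machine.

    delta maps (state, symbol) -> (new_state, written_symbol, direction),
    with direction "L" or "R". Returns True if a final state is reached,
    False if the machine halts on an undefined transition, and None if the
    step budget runs out (the simulation cannot tell whether it would halt).
    """
    tape = dict(enumerate(word))  # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in finals:
            return True
        symbol = tape.get(head, BLANK)
        if (state, symbol) not in delta:
            return False  # transition undefined: halt without accepting
        state, written, direction = delta[(state, symbol)]
        tape[head] = written
        head += 1 if direction == "R" else -1
    return None


# Hypothetical example machine: accepts binary strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", BLANK): ("accept", BLANK, "R"),
}

print(run_tm(EVEN_ONES, "even", {"accept"}, "1001"))  # True
print(run_tm(EVEN_ONES, "even", {"accept"}, "10"))    # False
```

The `None` result is the interesting case: a bounded simulation can confirm halting, but it can never confirm non-halting, which is exactly the gap the later chapters are about.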
Definition Decidable Language
A language for which there exists a Turing machine which accepts all the strings in the
language and no other string, and which halts on every input string, is called a Decidable
language, Turing-decidable language, or Recursive language.
Definition Partially Decidable Language
A language for which there exists a Turing machine which accepts all the strings in the
language and no other string, halting on every accepted string but possibly running forever
on strings outside the language, is called a Partially Decidable language, Turing-recognizable
language, or Recursively enumerable language.
Definition Undecidable Language
A language for which there does not exist any Turing machine which
1. accepts all the strings in the language and no other string, and
2. halts on every input string
is called an Undecidable language or Non-recursive language.
Definition Undecidable Problem
An Undecidable Problem is a decision problem for which it is impossible to construct an
algorithm that always halts with the correct yes-or-no answer.
Chapter 2
Gödel's Incompleteness Theorem
Kurt Friedrich Gödel was an Austrian-American logician, mathematician, and
philosopher. Gödel made an immense impact upon scientific and philosophical thinking in
the 20th century. He is best known for his two incompleteness theorems, published in
1931. He also showed that neither the axiom of choice nor the continuum hypothesis can
be disproved from the accepted axioms of set theory, assuming these axioms are consistent.
This result opened the door for mathematicians to assume the axiom of choice in their
proofs.
Gödel's Incompleteness Theorems are two theorems of mathematical logic that establish
inherent limitations of all but the most trivial axiomatic systems capable of doing
arithmetic. In this chapter we will see the philosophical significance of the Incompleteness
Theorems and how they are related to the Liar's Paradox, the consistency of formal systems,
and undecidability. We will also cover the proof of the First Incompleteness Theorem. A few
examples supporting Gödel's theorems will be discussed towards the end of this chapter.
The two Incompleteness Theorems are widely interpreted as showing that Hilbert's
program to find a complete and consistent set of axioms for all mathematics is impossible,
giving a negative answer to Hilbert's second problem. Before we move forward we need to
know what Hilbert's Second Problem is and how it is related to Gödel's Incompleteness
Theorems.
HILBERT’S SECOND PROBLEM
Hilbert’s problems form a list of 23 problems in mathematics published by David
Hilbert in 1900. But here we are only interested in Hilbert’s Second Problem, that is,
When we are engaged in investigating the foundations of a science, we must set up a system
of axioms which contains an exact and complete description of the relations subsisting
between the elementary ideas of that science. But above all I wish to designate the following
as the most important among the numerous questions which can be asked with regard to
the axioms: To prove that they are not contradictory, that is, that a deﬁnite number of
logical steps based upon them can never lead to contradictory results. In geometry, the
proof of the compatibility of the axioms can be eﬀected by constructing a suitable ﬁeld of
numbers, such that analogous relations between the numbers of this ﬁeld correspond to
the geometrical axioms. On the other hand a direct method is needed for the proof of the
compatibility of the arithmetical axioms.
In order to understand what Hilbert meant by his problem, there are a few definitions we
need to know. To understand its solution we need a clear idea of both of Gödel's
Incompleteness Theorems. After that, we relate undecidability to Gödel's theorems and try
to explain the occurrence of undecidable sets, statements, problems, etc.
Later we will see that what Gödel suggested is that there cannot exist any such
axiomatic system, or that even if it exists we can never prove that it is consistent!
Deﬁnition Axiomatic System
An axiomatic system is any set of axioms from which some or all axioms can be used in
conjunction to logically derive theorems. A mathematical theory consists of an axiomatic
system and all its derived theorems. An axiomatic system that is completely described is
a special kind of formal system.
Deﬁnition Formal Systems
Formal systems in mathematics consist of the following elements:
1. A ﬁnite set of symbols known as alphabet, that can be used for constructing formulae.
2. A grammar, which tells how well-formed formulae are constructed out of the symbols
in the alphabet. It is usually required that there be a decision procedure for deciding
whether a formula is well formed or not.
3. A set of axioms where each axiom must be a well-formed formula.
4. A set of inference rules.
Now we will see some popular axiomatic systems that are used very frequently in mathematics.
We will not go into much detail about them; they are referred to only briefly later on.
Peano axioms
The Peano axioms are a set of axioms for the natural numbers, widely used in number
theory. In the formulation given here there are 9 axioms, stated in terms of the equality
relation (=) and the successor function S(n).
1. 0 is a natural number.
2. For every natural number x, x = x.
3. For all natural numbers x and y, if x = y, then y = x.
4. For all natural numbers x, y and z, if x = y and y = z, then x = z.
5. For all a and b, if a is a natural number and a = b, then b is also a natural number.
6. For every natural number n, S(n) is a natural number.
7. For every natural number n, S(n) = 0 is false.
8. For all natural numbers m and n, if S(m) = S(n), then m = n.
9. If K is a set such that, 0 is in K, and for every natural number n, if n is in K, then
S(n) is in K, then K contains every natural number.
Zermelo-Fraenkel Set Theory with the Axiom of Choice (ZFC)
The ZFC system is commonly used in set theory. In the formulation given here it also
consists of 9 axioms (items 3 and 6 are axiom schemas, one axiom for each formula φ).
1. Two sets are equal (are the same set) if they have the same elements.
$\forall x\, \forall y\, [\forall z\, (z \in x \Leftrightarrow z \in y) \Rightarrow x = y]$.
2. Every non-empty set x contains a member y such that x and y are disjoint sets.
$\forall x\, [\exists a\, (a \in x) \Rightarrow \exists y\, (y \in x \wedge \neg \exists z\, (z \in y \wedge z \in x))]$.
3. If z is a set, and φ is any property which may characterize the elements x of z, then
there is a subset y of z containing those x in z which satisfy the property.
$\forall z\, \forall w_1 \ldots \forall w_n\, \exists y\, \forall x\, [x \in y \Leftrightarrow (x \in z \wedge \varphi)]$.
4. If x and y are sets, then there exists a set which contains x and y as elements.
$\forall x\, \forall y\, \exists z\, (x \in z \wedge y \in z)$.
5. For every set T there is a set A containing every set that is a member of some member of T.
$\forall T\, \exists A\, \forall Y\, \forall x\, [(x \in Y \wedge Y \in T) \Rightarrow x \in A]$.
6. Let φ be any formula in the language of ZFC whose free variables are among
$x, y, A, w_1, \ldots, w_n$, so that in particular B is not free in φ. Then:
$\forall A\, \forall w_1, \ldots, w_n\, [\forall x\, (x \in A \Rightarrow \exists y\, \varphi) \Rightarrow \exists B\, \forall x\, (x \in A \Rightarrow \exists y\, (y \in B \wedge \varphi))]$.
7. Let S(x) abbreviate $x \cup \{x\}$, where x is some set. Then there exists a set X such
that the empty set ∅ is a member of X and, whenever a set y is a member of X,
then S(y) is also a member of X.
$\exists X\, [\emptyset \in X \wedge \forall y\, (y \in X \Rightarrow S(y) \in X)]$.
8. Let z ⊆ x abbreviate $\forall q\, (q \in z \Rightarrow q \in x)$. For any set x, there is a set y which is a
superset of the power set of x. The power set of x is the class whose members are
all of the subsets of x.
$\forall x\, \exists y\, \forall z\, [z \subseteq x \Rightarrow z \in y]$.
9. For any set X, there is a binary relation R which well-orders X. This means R is
a linear order on X such that every nonempty subset of X has a member which is
minimal under R.
$\forall X\, \exists R\, (R \text{ well-orders } X)$.
GÖDEL'S INCOMPLETENESS THEOREM
Having a basic idea of what axiomatic systems are, we can now move forward
and see what the incompleteness theorems say.
FIRST INCOMPLETENESS THEOREM
Any eﬀectively generated theory capable of expressing elementary arithmetic cannot be
both consistent and complete. In particular, for any consistent, eﬀectively generated formal
theory that proves certain basic arithmetic truths, there is an arithmetical statement that
is true but not provable in the theory.
The statement may seem difficult at first sight, but the idea behind it is easy to understand.
An explanation of the first incompleteness theorem is given below.
The true but unprovable statement referred to by the theorem is often called
"the Gödel sentence" for the theory. The proof constructs a specific Gödel sentence for
each effectively generated theory, but there are infinitely many statements in the language
of the theory that share the property of being true but unprovable. For example, the
conjunction of the Gödel sentence and any logically valid sentence will have this property.
For each consistent formal theory T having the required small amount of number
theory, the corresponding Gödel sentence G asserts: "G cannot be proved within the
theory T". This interpretation of G leads to the following informal analysis. If G were
provable under the axioms and rules of inference of T, then T would have a theorem, G,
which effectively contradicts itself, and thus the theory T would be inconsistent. This
means that if the theory T is consistent then G cannot be proved within it, and so the
theory T is incomplete. Moreover, the claim G makes about its own unprovability is
correct. In this sense G is not only unprovable but true, and provability within the theory
T is not the same as truth.
Each effectively generated theory has its own Gödel statement. It is possible to define
a larger theory T' that contains the whole of T, plus G as an additional axiom. This will
not result in a complete theory, because Gödel's theorem will also apply to T', and thus T'
cannot be complete. In this case, G is indeed a theorem in T', because it is an axiom. Since
G states only that it is not provable in T, no contradiction is presented by its provability
in T'. However, because the incompleteness theorem applies to T', there will be a new
Gödel statement G' for T', showing that T' is also incomplete. G' will differ from G in that
G' will refer to T', rather than T.
Gödel's first incompleteness theorem shows that any consistent effective formal system
that includes enough of the theory of the natural numbers is incomplete: there are true
statements expressible in its language that are unprovable. A system may be incomplete
simply because not all the necessary axioms have been discovered. For example, Euclidean
geometry without the parallel postulate is incomplete; it is not possible to prove or disprove
the parallel postulate from the remaining axioms. This will be discussed later towards the
end of the topic. Gödel's theorem shows that, in theories that include a small portion of
number theory, a complete and consistent finite list of axioms can never be created, nor
even an infinite list that can be enumerated by a computer program. Each time a new
statement is added as an axiom, there are other true statements that still cannot be proved,
even with the new axiom. If an axiom is ever added that makes the system complete, it
does so at the cost of making the system inconsistent.
There are complete and consistent lists of axioms for arithmetic that cannot be enumerated
by a computer program. For example, one might take all true statements about
the natural numbers to be axioms (and no false statements), which gives the theory known
as “true arithmetic”. The diﬃculty is that there is no mechanical way to decide, given
a statement about the natural numbers, whether it is an axiom of this theory, and thus
there is no eﬀective way to verify a formal proof in this theory.
Gödel's construction is closely related to the liar paradox. The
liar paradox is the sentence "This sentence is false." An analysis of the liar sentence shows
that it cannot be true, nor can it be false. A Gödel sentence G for a theory T makes a
similar assertion to the liar sentence, but with truth replaced by provability: G says "G is
not provable in the theory T." It is not possible to replace "not provable" with "false" in
a Gödel sentence because the predicate "Q is the Gödel number of a false formula" cannot
be represented as a formula of arithmetic.
SECOND INCOMPLETENESS THEOREM
For any formal eﬀectively generated theory T including basic arithmetical truths and
also certain truths about formal provability, if T includes a statement of its own consistency
then T is inconsistent.
This strengthens the ﬁrst incompleteness theorem, because the statement constructed
in the ﬁrst incompleteness theorem does not directly express the consistency of the theory.
A technical subtlety in the second incompleteness theorem is how to express the consistency
of T as a formula in the language of T. There are many ways to do this, and not all of
them lead to the same result. In particular, diﬀerent formalizations of the claim that T
is consistent may be inequivalent in T, and some may even be provable. For example,
ﬁrst-order Peano arithmetic (PA) can prove that the largest consistent subset of PA is
consistent. But since PA is consistent, the largest consistent subset of PA is just PA, so in
this sense PA “proves that it is consistent”. What PA does not prove is that the largest
consistent subset of PA is, in fact, the whole of PA.
For any familiar explicitly axiomatized theory T, it is possible to canonically deﬁne a
formula Con(T) expressing the consistency of T. The formalization of Con(T) depends on
two factors, formalizing the notion of a sentence being derivable from a set of sentences
and formalizing the notion of being an axiom of T.
Gödel's second incompleteness theorem also implies that a theory $T_1$ satisfying the
technical conditions outlined above cannot prove the consistency of any theory $T_2$ which
proves the consistency of $T_1$. This is because such a theory $T_1$ can prove that if $T_2$ proves
the consistency of $T_1$, then $T_1$ is in fact consistent. For the claim that $T_1$ is consistent has
the form "for all numbers n, n has the decidable property of not being a code for a proof of
contradiction in $T_1$". If $T_1$ were in fact inconsistent, then $T_2$ would prove for some n that
n is the code of a contradiction in $T_1$. But if $T_2$ also proved that $T_1$ is consistent (that
is, that there is no such n), then it would itself be inconsistent. This reasoning can be
formalized in $T_1$ to show that if $T_2$ is consistent, then $T_1$ is consistent. Since, by the second
incompleteness theorem, $T_1$ does not prove its own consistency, it cannot prove the consistency
of $T_2$ either.
Undecidable statements
There are two distinct senses of the word “undecidable” in mathematics and computer
science. The first of these is the proof-theoretic sense used in relation to Gödel's theorems,
that of a statement being neither provable nor refutable in a speciﬁed deductive system.
The second sense is used in relation to computability theory and applies not to statements
but to decision problems, which are countably inﬁnite sets of questions each requiring a
yes or no answer. Such a problem is said to be undecidable if there is no computable
function that correctly answers every question in the problem set.
Two concrete examples of undecidable statements are:
1. The continuum hypothesis can neither be proved nor refuted in ZFC.
2. The axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC
axioms except the axiom of choice).
Gödel's incompleteness theorems struck a fatal blow to David Hilbert's second problem,
which asked for a finitary consistency proof for mathematics. The second incompleteness
theorem, in particular, is often viewed as making the problem impossible. Not all
mathematicians agree with this analysis, however, and the status of Hilbert's second problem is
not yet decided.
A Turing machine can be viewed as an axiomatic system (of sufficient complexity) whose
transition rules play the role of axioms and whose derivable strings play the role of
theorems. A grammar together with a set of Turing machines defines the complete structure
of a formal system. The incompleteness theorem then suggests that there will be certain
strings which cannot be derived by any machine. Undecidability is therefore inherent: it is
a basic characteristic of such systems. Having seen why undecidable sets occur,
we can now move forward and explore some famous undecidable problems and see some
interesting results.
Chapter 3
Post Correspondence Problem
Definition Post Correspondence Problem (PCP)
The Post Correspondence Problem was introduced by Emil Post in 1946, who proved it to be
undecidable. Because it is simpler to work with than the Halting Problem, PCP is often
used in proofs of undecidability of other problems.
Definition of the problem
Input: A finite collection of blocks, each labelled with a top string and a bottom string.
Question: Given an unlimited supply of copies of these particular blocks, can one form
a nonempty finite sequence of these for which the concatenation of the top strings equals
the concatenation of the bottom strings?
Example Consider two lists, A and B, which contain three strings each.
     A      B
a    10     101
b    011    11
c    101    011
For this instance no sequence of blocks makes the two concatenations exactly the same.
Consider another example where the lists are as follows:
     A      B
a    1      111
b    10111  10
c    10     0
For this instance, "baac" is a sequence that works. Check:
A : 10111 + 1 + 1 + 10 = 101111110
B : 10 + 111 + 111 + 0 = 101111110
One more interesting property is that if a PCP instance has a solution then it has infinitely
many solutions. In this case the sequence "baac" can be repeated any number of times, and
every such repetition is again a solution.
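Both examples can be checked mechanically with a bounded breadth-first search. This is only a sketch under stated assumptions: the function name `solve_pcp`, the dictionary encoding of the blocks, and the bound `max_len` are illustrative choices, and block names are assumed to be single characters. Because PCP is undecidable, no bound works for every instance; a result of `None` only means "no solution up to `max_len` blocks".

```python
from collections import deque


def solve_pcp(blocks, max_len=10):
    """Return a shortest solution (string of block names) or None within max_len.

    blocks maps a one-character name to a (top, bottom) pair of strings.
    """
    queue = deque([("", "", "")])  # (sequence of names, top string, bottom string)
    while queue:
        seq, top, bottom = queue.popleft()
        if len(seq) >= max_len:
            continue
        for name, (t, b) in blocks.items():
            new_top, new_bottom = top + t, bottom + b
            # Prune: one string must be a prefix of the other to ever match.
            if not (new_top.startswith(new_bottom) or new_bottom.startswith(new_top)):
                continue
            if new_top == new_bottom:
                return seq + name
            queue.append((seq + name, new_top, new_bottom))
    return None


first = {"a": ("10", "101"), "b": ("011", "11"), "c": ("101", "011")}
second = {"a": ("1", "111"), "b": ("10111", "10"), "c": ("10", "0")}

print(solve_pcp(first, max_len=8))  # None: no solution with up to 8 blocks
print(solve_pcp(second))            # baac
```

For the first instance the search never terminates with a match because the only prefix-compatible continuation is "a" followed by repeated "c", which always leaves the bottom string one symbol ahead.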
Formal Definition
Input: Two lists of n strings,
$A = W_1, W_2, \ldots, W_n$
$B = V_1, V_2, \ldots, V_n$
Then there is a solution to the Post Correspondence Problem if there is a sequence $i, j, \ldots, k$
such that:
$W_i W_j \cdots W_k = V_i V_j \cdots V_k$
where indices may be repeated or omitted.
Using the Post Correspondence Problem to prove undecidability of other problems:
1. To Prove: There does not exist an algorithm to decide whether a context-free grammar
is ambiguous.
Proof : We reduce Post's Correspondence Problem to this problem. Suppose
we can, in fact, decide the language $\{\langle G \rangle \mid G \text{ is a CFG and } G \text{ is ambiguous}\}$.
Given a PCP instance $\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_m$:
Construct the following CFG $G = (V, \Sigma, R, S)$ where
$V = \{S, S_1, S_2\}$,
$R = \{\, S \to S_1 \mid S_2,\;\; S_1 \to \alpha_1 S_1 \sigma_1 \mid \ldots \mid \alpha_m S_1 \sigma_m \mid \alpha_1 \sigma_1 \mid \ldots \mid \alpha_m \sigma_m,\;\; S_2 \to \beta_1 S_2 \sigma_1 \mid \ldots \mid \beta_m S_2 \sigma_m \mid \beta_1 \sigma_1 \mid \ldots \mid \beta_m \sigma_m \,\}$
(where the $\sigma_i$ are new characters added to the alphabet, e.g., $\sigma_i = i$).
If the grammar is ambiguous, then there is a derivation of some string w in two
different ways. Without loss of generality, let us suppose that the derivations both
begin with $S \Rightarrow S_1$. Reading the new characters backwards from the end of w
determines which rule was applied at each step, so there can only be one such derivation,
and this case is not possible. Hence, we see
that the only ambiguity can come from one $S_1$ and one $S_2$ 'start'. But then, taking
the substring of w up to the beginning of the new characters, we have a solution to
the PCP (since the strings of indices used after those points match).
Similarly, if there is no ambiguity, then the PCP cannot be solved, since a solution
would imply an ambiguity given by the two derivations
$S \Rightarrow S_1 \Rightarrow^{*} \alpha \bar{\sigma}$ and $S \Rightarrow S_2 \Rightarrow^{*} \beta \bar{\sigma}$, where $\alpha = \beta$ are strings of matching α's and
β's (since the $\bar{\sigma}$'s match).
Hence, we have reduced PCP to the ambiguity problem, and since PCP is undecidable, we have
shown that there cannot exist any algorithm to decide whether a given grammar is ambiguous.
2. To Prove: Given two CFLs, the problem of deciding whether the CFLs are disjoint
or not is undecidable.
Proof : To solve this problem we require the following lemma:
Lemma: The set of valid computations of a Turing machine M is the intersection of
two CFLs $L_1$ and $L_2$, and grammars for these CFLs can be effectively constructed
from M.
Let $G_1$ and $G_2$ be the two given grammars; we have to decide whether
$L(G_1) \cap L(G_2)$ is empty or not. Using the lemma we can construct from a Turing machine
M grammars $G_1$ and $G_2$ such that $L(G_1) \cap L(G_2)$ is the set of valid computations
of M. If there is an algorithm A to tell whether the intersection of the languages of
two CFGs is empty, we can construct an algorithm B to tell whether $L(M) = \emptyset$
for an arbitrary TM M. Simply design B to construct $G_1$ and $G_2$ from M as in the lemma,
then apply algorithm A to tell whether $L(G_1) \cap L(G_2)$ is empty. If the intersection is
empty, then there are no valid computations of M, so $L(M) = \emptyset$. If the intersection
is not empty, $L(M) \neq \emptyset$. That is, the problem of emptiness for r.e. sets reduces to
the problem of emptiness of intersection for CFGs.
Algorithm B cannot exist, however, since $L(M) = \emptyset$ is undecidable by Rice's Theorem.
Therefore A does not exist, so it is undecidable whether the intersection of two
CFLs is empty.
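3. To Prove: For a given CFL L, it is undecidable whether $L = \Sigma^{*}$.
Proof : Let us suppose that the problem of deciding $L = \Sigma^{*}$ is decidable. Take
$L = \overline{L(G_1) \cap L(G_2)}$, the complement of the intersection, for grammars $G_1$ and $G_2$
constructed as in the previous proof (for these grammars the complement is itself
context-free). Then $L = \Sigma^{*}$ holds exactly when $L(G_1) \cap L(G_2) = \emptyset$, so the problem
of deciding $L(G_1) \cap L(G_2) = \emptyset$ would be
decidable. But by the previous result, it is undecidable. Hence our assumption was
wrong. Thus, it is undecidable for a CFG G whether $L(G) = \Sigma^{*}$.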
4. To Prove: For any given CFG, it is undecidable whether its language is equal
to a given regular set.
Proof : We know that $\Sigma^{*}$ is a regular set. So if we could decide whether a language
L is equal to a regular set, then we could decide whether $L = \Sigma^{*}$, but we have
proved previously that this problem is undecidable.
5. To Prove: For any two given CFLs $L_1$ and $L_2$, it is undecidable whether $L_1 = L_2$.
Proof : Let $L_1 = L(G_1)$ and $L_2 = L(G_2)$. Fix $G_2$ to be a grammar generating
$\Sigma^{*}$, where Σ is the terminal alphabet of $G_1$. Then the problem is equivalent to deciding
whether $L_1 = \Sigma^{*}$, which is undecidable. Hence the given problem is undecidable.
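6. To Prove: Given two CFLs $L_1$ and $L_2$, it is undecidable whether $L_2 \subseteq L_1$.
Proof : The proof is the same as the one above: with $L_2 = \Sigma^{*}$, the inclusion
$L_2 \subseteq L_1$ holds exactly when $L_1 = \Sigma^{*}$.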
7. To Prove: Given a regular set R and a CFL L, it is undecidable whether $R \subseteq L$.
Proof : Take $R = \Sigma^{*}$; then $R \subseteq L$ holds exactly when $L = \Sigma^{*}$, which was shown
above to be undecidable. Hence this problem is also undecidable.
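The grammar used in the first proof of this list (ambiguity of a CFG) can be sketched in code. This is an illustration under assumptions, not the report's own construction verbatim: the rules are taken to be S → S1 | S2, S1 → α_i S1 σ_i | α_i σ_i and S2 → β_i S2 σ_i | β_i σ_i, the marker strings "#i" stand in for the new characters σ_i, and the names `build_rules`, `word` and `doubly_derived_word` are hypothetical. The search is bounded by `max_tiles`, so `None` means only that no doubly derived word was found within that bound.

```python
from itertools import product


def build_rules(alphas, betas):
    """Productions of G for a PCP instance; "#i" plays the role of sigma_i."""
    rules = {"S": [["S1"], ["S2"]], "S1": [], "S2": []}
    for i, (a, b) in enumerate(zip(alphas, betas), start=1):
        marker = f"#{i}"
        rules["S1"] += [[a, "S1", marker], [a, marker]]
        rules["S2"] += [[b, "S2", marker], [b, marker]]
    return rules


def word(strings, indices):
    """Terminal string derived by using the rules for `indices` in order:
    the chosen strings left to right, then the markers in reverse order."""
    body = "".join(strings[i] for i in indices)
    markers = "".join(f"#{i + 1}" for i in reversed(indices))
    return body + markers


def doubly_derived_word(alphas, betas, max_tiles=6):
    """Find a word derivable from both S1 and S2, i.e. a witness of ambiguity."""
    for k in range(1, max_tiles + 1):
        for indices in product(range(len(alphas)), repeat=k):
            if word(alphas, indices) == word(betas, indices):
                return word(alphas, indices)
    return None


alphas, betas = ["1", "10111", "10"], ["111", "10", "0"]
print(len(build_rules(alphas, betas)["S1"]))  # 6 productions for S1
print(doubly_derived_word(alphas, betas))     # 101111110#3#1#1#2
```

The word found corresponds to the PCP solution "baac" of the earlier example: the part before the first marker is the common concatenation, and the markers read backwards give the block indices 2, 1, 1, 3.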
Chapter 4
Halting Problem
Deﬁnition Halting Problem
Given a description of an arbitrary computer program, decide whether the program ﬁnishes
running or continues to run forever.
This is equivalent to the problem of deciding, given a program and an input, whether
the program will eventually halt when run with that input, or will run forever.
In 1936, Alan Turing proved that a general algorithm to solve the halting problem
for all possible program-input pairs cannot exist. In other words, he proved that the
Halting Problem is undecidable.
Suppose that there exists a program that takes as input a program M and an input
w for that program. Let us assume that it always determines correctly whether
the program M would halt on input w (it would return "yes" if it does), or whether it
would run forever (it would then return "no"). Let us call this program H(M, w).
Now, we use H(M, w) to write another program, with the name D(w): D(w) calls
H(w, w), halts if the answer is "no", and deliberately loops forever if the answer is
"yes". Since the program H(M, w) has been assumed to halt with a result on each input,
the call to H inside D always returns.
And now comes the unanswerable question: does D(D) halt? It would halt if and
only if the call H(D, D) returns "no". In other words, it would halt if and only if it would
not halt. This is a contradiction: we must conclude that the only hypothesis that started
us on this path is false, that is, the program H(M, w) does not exist. Thus, there does not
exist any program or algorithm that solves the problem H was supposed to solve: to decide
whether arbitrary programs would halt or loop.
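The diagonal argument above can be sketched in Python. Everything named here is an assumption made for illustration: since a real infinite loop would make the example unrunnable, "runs forever" is modelled by raising a hypothetical `LoopsForever` exception, and `make_d` and `refutes` are illustrative helper names. The sketch covers candidate deciders that answer without running their argument.

```python
class LoopsForever(Exception):
    """Stand-in for an infinite loop, so that the sketch stays runnable."""


def make_d(halts):
    """Build the program D from a candidate halting decider `halts(program, input)`."""
    def d(program):
        if halts(program, program):
            raise LoopsForever  # models "loop forever" when H answers yes
        return "halted"         # halt when H answers no
    return d


def refutes(halts):
    """Return True if D(D) behaves differently from what `halts` predicts."""
    d = make_d(halts)
    predicted = halts(d, d)
    try:
        d(d)
        actual = True   # D(D) halted
    except LoopsForever:
        actual = False  # D(D) would run forever
    return predicted != actual


# Two hypothetical candidate deciders; each is wrong about its own D.
print(refutes(lambda program, data: True))   # True
print(refutes(lambda program, data: False))  # True
```

Whatever answer a candidate gives on the pair (D, D), the program D is built to do the opposite, so `refutes` returns True for every candidate of this kind.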
We have a full-fledged notation for algorithms: Turing machines. We are thus ready
to define a language that is recursively enumerable (partially decidable) but, as shown
below, not recursive. Let
$H = \{(M, w) : \text{Turing machine } M \text{ halts on input string } w\}$.
Notice first that H is recursively enumerable: on input (M, w), the universal Turing machine
U halts precisely when the input is in H.
Furthermore, if H is recursive, then every recursively enumerable language is recursive.
In other words, all recursively enumerable languages are also decidable if and only if H
is recursive. For suppose that H is indeed decided by some Turing machine $M_0$. Then
given any particular Turing machine M which partially decides a language L(M), we could
design a Turing machine $M'$ that fully decides L(M): on input w, $M'$ runs $M_0$ on (M, w),
accepts if $M_0$ reports that M halts on w, and rejects otherwise.
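Theorem Undecidability of Halting Problem
Let $A_{TM} = \{\langle M, w \rangle \mid M \text{ is a TM and } M \text{ accepts } w\}$. Then $A_{TM}$ is undecidable.
Proof The proof is by contradiction. Assume that $A_{TM}$ is decidable, and suppose
that H is a decider for $A_{TM}$, i.e.,
$H(\langle M, w \rangle) = \begin{cases} \text{accept} & \text{if } M \text{ accepts } w \\ \text{reject} & \text{if } M \text{ does not accept } w \end{cases}$
A new Turing machine D with H as a subroutine will now be constructed.
The input to D is a description $\langle M \rangle$ of a Turing machine M. This information is sent to
Turing machine H, which determines what M does when the input to M is its own description.
Once D has determined this information, it does the opposite, i.e., it rejects if M accepts
and accepts if M does not accept:
$D(\langle M \rangle) = \begin{cases} \text{accept} & \text{if } M \text{ does not accept } \langle M \rangle \\ \text{reject} & \text{if } M \text{ accepts } \langle M \rangle \end{cases}$
In the case when D is run with its own description the following is obtained:
$D(\langle D \rangle) = \begin{cases} \text{accept} & \text{if } D \text{ does not accept } \langle D \rangle \\ \text{reject} & \text{if } D \text{ accepts } \langle D \rangle \end{cases}$
So D accepts $\langle D \rangle$ exactly when D does not accept $\langle D \rangle$, which is a contradiction. Hence
neither D nor H can exist, and $A_{TM}$ is undecidable.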
The Halting Problem is noncomputable, but it is an important problem. It would be useful
to know if a procedure application will terminate in a reasonable amount of time, but the
Halting Problem does not answer that question. It concerns the question of whether the
procedure application will terminate in any finite amount of time, no matter how long it is.
Virus detection
A virus is a program that infects other programs. A virus spreads by copying its own
code into the code of other programs, so when those programs are executed the virus will
execute. In this manner, the virus spreads to infect more and more programs. A typical
virus also includes a malicious payload so when it executes in addition to infecting other
programs it also performs some damaging (corrupting data ﬁles) or annoying (popping
up messages) behavior. The Is-Virus Problem is to determine if a procedure speciﬁcation
contains a virus:
Input: A specification of a Python program.
Output: If the program contains a virus (a code fragment that will infect other files)
output True. Otherwise, output False.
To demonstrate that the Is-Virus Problem is noncomputable, we show how to define a halts
algorithm given a hypothetical isVirus algorithm. Since we know halts is noncomputable,
this shows there is no isVirus algorithm.
Assume infectFiles is a procedure that infects files, so the result of evaluating
isVirus('infectFiles()') is True. We could define halts as:
    def halts(p):
        # isVirus is the hypothetical decider; infectFiles() runs only if p terminates
        return isVirus(p + '; infectFiles()')
This works as long as the program specified by p does not itself exhibit the file-infecting
behavior. If it does, p could infect a file and never terminate, and halts would produce
the wrong output. To solve this we need to hide the file-infecting behavior of the original
program. A rough definition of file-infecting behavior would be to consider any write to
an executable file to be an infection. To avoid any file infections in the specified program,
we replace all procedures that write to files with procedures that write to shadow copies
of these files. For example, we could do this by creating a new temporary directory and
prepending that path to all file names. We call this (assumed) procedure sandBox, since it
transforms the original program specification into one that would execute in a protected
sandbox.
    def halts(p):
        # sandBox (assumed) redirects all file writes of p to shadow copies
        return isVirus(sandBox(p) + '; infectFiles()')
Since we know there is no algorithm that solves the Halting Problem, this proves that
there is no algorithm that solves the Is-Virus problem.
Virus scanners such as Symantec's Norton AntiVirus attempt to solve the Is-Virus
Problem, but its noncomputability means that no scanner can give the correct answer for
every program. Virus scanners
detect known viruses by scanning files for strings that match signatures in a database of
known viruses. As long as the signature database is frequently updated they may be able
to detect currently spreading viruses, but this approach cannot detect a new virus that
does not match the signature of a previously known virus.
Sophisticated virus scanners employ more advanced techniques to attempt to detect
complex viruses such as metamorphic viruses that alter their own code as they propagate
to avoid detection. But, because the general Is-Virus Problem is noncomputable, we know
that it is impossible to create a program that always terminates and that always correctly
determines if an input procedure speciﬁcation is a virus.
Chapter 5
Hilbert’s Tenth Problem
The statement of Hilbert’s tenth problem is as follows :
Given a Diophantine equation with any number of unknown quantities and with
rational integral numerical coeﬃcients: To devise a process according to which it can be
determined in a ﬁnite number of operations whether the equation is solvable in rational
integers.
The undecidability of the existence of integral roots turned out to be the consequence
of a deep equivalence.
Definition Diophantine Predicates and Relations
A predicate P on $\mathbb{N}^k$ is called diophantine iff there is an $n \in \mathbb{N}$ and a polynomial p with
integer coefficients in k + n variables, such that $P(x) \iff \exists y \in \mathbb{N}^n : p(x, y) = 0$.
A set or relation $S \subset \mathbb{N}^k$ is called diophantine iff x ∈ S is a diophantine predicate.
A function $f : \mathbb{N}^k \to \mathbb{N}$ is called diophantine iff its graph $\{(f(x), x) : x \in \mathbb{N}^k\} \subset \mathbb{N}^{k+1}$
is a diophantine set.
The definition allows for polynomials of arbitrary (though finite) degree. Diophantine
problems typically have fewer equations than unknowns and involve finding integers that
satisfy all the equations simultaneously.
The following trick, due to Skolem, shows that we can trade the degree for the number
of variables, to the extent that finally we may restrict ourselves to polynomials of degree
at most four.
Lemma If $S \subset \mathbb{N}^k$ is a diophantine set, then there is an $m \in \mathbb{N}$ and a polynomial q in
m + k variables with integer coefficients and of degree at most four such that
$S = \{x \in \mathbb{N}^k \mid \exists z \in \mathbb{N}^m : q(x, z) = 0\}$.
Proof : By assumption there is a polynomial p with integer coefficients such that
$S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. The construction of the new polynomial is then
done recursively: for all monomials in p which have degree larger than two, introduce new
variables $u_1, u_2, \ldots$, each defined as the product of the first two variables of the corresponding
monomial. Inserting the new variables then leads to a new polynomial $p_1(x, y, u)$ whose
maximal degree is one less than that of p, and $p(x, y) = p_1(x, y, u)$ if we impose the defining
constraints for the $u_i$'s. Iterating this procedure, we obtain a sequence of polynomials in
more and more variables which eventually is at most quadratic in all variables. Suppose
$p_n$ is this quadratic polynomial. The imposed constraints on the new variables, which
guarantee that $p(x, y) = p_n(x, y, u, \ldots)$, can now be formulated in terms of the existence
of integral roots of quadratic polynomials with integer coefficients.
Denote those polynomials by c
1
, c
2
, . . .. That is, if we have for instance u
1
:= x
2
y
7
, u2 :=
u
1
x
2
, then we deﬁne c
1
(u
1
, x
2
, y
7
) := u
1
−x
2
y
7
and c
2
(u
2
, u
1
, x
2
) := u
2
−u
1
x
2
. In this way,
we achieve that p(x, y) = 0 iﬀ ∃u : p
2
n
+ Σ
i
c
2
i
= 0, where we denote by u the collection of
all variables added to x and y. Hence, the polynomial q := p
2
n
+ Σ
i
c
2
i
biquadratic characterization of the diophantine set S.
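As a small illustration of Skolem's trick, the following sketch takes a toy cubic polynomial, replaces the product x·x by a new variable u, and checks by brute force that the degree-four polynomial has a root exactly when the original one does. The polynomial $x^2 y - 12$, the function names and the search bound are assumptions chosen only for this example.

```python
from itertools import product

def p(x, y):
    # Toy cubic polynomial (illustrative assumption): p(x, y) = x^2 * y - 12
    return x * x * y - 12

def q(x, y, u):
    # u stands for the product x * x, so p becomes the quadratic u * y - 12
    p1 = u * y - 12
    c1 = u - x * x              # defining constraint for the new variable u
    return p1 ** 2 + c1 ** 2    # degree at most four, zero iff p1 = 0 and c1 = 0

def roots_agree(bound):
    """Check p(x, y) = 0  <=>  exists u with q(x, y, u) = 0, for all x, y < bound."""
    for x, y in product(range(bound), repeat=2):
        has_root_p = p(x, y) == 0
        # u only needs to reach x * x, which is below bound * bound
        has_root_q = any(q(x, y, u) == 0 for u in range(bound * bound))
        if has_root_p != has_root_q:
            return False
    return True
```

A finite check of this kind proves nothing in general; it only makes the role of the constraint polynomial $c_1$ concrete.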
Proposition Every diophantine set is recursively enumerable.
Proof : Let p(x, y) be a characterizing polynomial for the diophantine set. The statement follows from observing that p(x, y) = 0 is a primitive recursive predicate, which we call P(x, y), and from recalling that a recursively enumerable set S is exactly one for which there is a primitive recursive predicate P such that $S = \{x \mid \exists y : P(x, y)\}$.
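The proposition can be read as a search procedure: to confirm x ∈ S, enumerate all candidate tuples y and stop at the first root. The sketch below does this for a toy diophantine set (the composite numbers). The helper names and the optional `max_candidates` budget are assumptions for illustration; without a budget the search does not terminate when x is not in S, which is exactly the semi-decidable behaviour.

```python
from itertools import count, product

def tuples_by_max(n):
    """Yield every tuple in N^n (n >= 1) exactly once, ordered by its largest entry."""
    for bound in count(0):
        for y in product(range(bound + 1), repeat=n):
            if max(y) == bound:
                yield y

def semi_decide(p, x, n, max_candidates=None):
    """Search for y in N^n with p(x, y) == 0.

    Returns a witness tuple, or None if the (hypothetical) budget is used up.
    With max_candidates=None the loop runs forever when no witness exists.
    """
    for tried, y in enumerate(tuples_by_max(n)):
        if max_candidates is not None and tried >= max_candidates:
            return None
        if p(x, y) == 0:
            return y

def composite(x, y):
    # Toy characterizing polynomial: zero iff x = (y0 + 2) * (y1 + 2)
    return x - (y[0] + 2) * (y[1] + 2)
```

For example, `semi_decide(composite, 15, 2)` returns a pair encoding the factorization 3 · 5, while for a prime such as 13 the search only stops because of the budget.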
A basic property of recursively enumerable sets is that the class is closed w.r.t. unions
and intersections. This is easily seen to hold also for diophantine sets:
Proposition: The class of diophantine predicates is closed w.r.t. (i) conjunction, (ii) disjunction and (iii) the use of existential quantifiers.
Proof : Let the upper case P's be diophantine predicates and the lower case p's their characterizing polynomials. Then
$P_1(x) \wedge P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)^2 + p_2(x, y_2)^2 = 0$,
$P_1(x) \vee P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)\, p_2(x, y_2) = 0$,
$\exists x_2 : P(x_1, x_2) \iff \exists y, x_2 : p(x_1, x_2, y) = 0$.
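The sum-of-squares and product constructions can be tried out on two toy predicates, "x is even" and "x is a perfect square". The polynomials, names and the finite search bound below are assumptions for illustration only.

```python
from itertools import product

def p1(x, y):
    # x is even  <=>  exists y with x - 2y = 0
    return x - 2 * y

def p2(x, y):
    # x is a perfect square  <=>  exists y with x - y^2 = 0
    return x - y * y

def conj(x, y1, y2):
    # A sum of squares vanishes only if both polynomials vanish
    return p1(x, y1) ** 2 + p2(x, y2) ** 2

def disj(x, y1, y2):
    # A product vanishes if at least one factor vanishes
    return p1(x, y1) * p2(x, y2)

def solvable(poly, x, bound):
    """Bounded search for a root (y1, y2) with both entries below bound."""
    return any(poly(x, y1, y2) == 0
               for y1, y2 in product(range(bound), repeat=2))
```

With a bound larger than x, `solvable(conj, x, bound)` holds exactly for even squares such as 4 or 16, and `solvable(disj, x, bound)` for numbers that are even or square.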
Now, we can go through the proof of the fact that every Turing computable function is recursive. Since the encoding of an arbitrary Turing machine in terms of recursive functions uses only functions which we now know to be diophantine, one can with a little bit of effort see that the predicate “the Turing machine halts” is a diophantine predicate. Following these lines leads to two remarkable consequences:
Theorem Every recursively enumerable set is diophantine.
Theorem There is an $n \in \mathbb{N}$ and a polynomial p with integer coefficients such that for any recursively enumerable set $S \subseteq \mathbb{N}$ there exists an $s \in \mathbb{N}$ so that
$S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(s, x, y) = 0\}$.
Denoting by (n, d) the number of variables and the maximal degree of a polynomial, universal polynomials are known for values ranging from (n, d) = (58, 4) (note that the possibility of having d = 4 follows from Skolem's trick) to $(n, d) = (9, 1.6 \times 10^{45})$.
The fact that diophantine sets and recursively enumerable sets are the same leads to
the sought undecidability of Hilbert’s tenth problem:
Corollary Let $\mathcal{P}$ be the class of polynomials with integer coefficients and of degree at most four. Then,
1. there is no algorithm which upon input of any element $p \in \mathcal{P}$ decides whether or not p has an integral root, and
2. there is no algorithm which upon input of any element $p \in \mathcal{P}$ decides whether or not p has a non-negative integral root.
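Proof : Assume there were an algorithm for deciding integral roots. Then there would be one for non-negative integral roots as well, since we can exploit the Lagrange four-square theorem (every natural number is a sum of four integer squares) to the end that $0 \in p(\mathbb{N}^n) \iff 0 \in p'(\mathbb{Z}^{4n})$, where p' is obtained from p by replacing each variable with a sum of four squares of new variables.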
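The substitution behind this step can be made concrete: every natural number y can be written as $a^2 + b^2 + c^2 + d^2$, so a root of p over the naturals turns into a root of the substituted polynomial over the integers. The brute-force search below is only a sketch with an assumed function name; it finds one such representation.

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n, found by brute force.

    The first three entries are searched in non-decreasing order; the last
    one is whatever remains, provided the remainder is a perfect square.
    """
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest:
                    return (a, b, c, d)
    raise AssertionError("unreachable by Lagrange's four-square theorem")
```

For instance, the toy equation y − 7 = 0 has the natural root y = 7, and `four_squares(7)` gives (1, 1, 1, 2), an integer root of $z_1^2 + z_2^2 + z_3^2 + z_4^2 - 7 = 0$.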
Then for any diophantine set $S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^m : p(x, y) = 0\}$, the hypothetical algorithm could be used in order to decide x ∈ S for any x. In other words, every diophantine set would be a recursive set. However, we know that there are non-recursive sets within the recursively enumerable sets. And since the latter are exactly the diophantine sets, the assumption of such an algorithm leads to a contradiction. The fact that we can restrict ourselves to degree at most four follows from Skolem's lemma.
While for polynomials with maximal degree two, there exists such an algorithm, the
case of maximal degree three is still open. Similarly, for rational (rather than integral)
roots, decidability is an open problem. For real roots, on the other hand, a result of
Tarski implies that the problem then becomes decidable.
One can extend the above undecidability result in the following direction: let $\mathcal{C}$ be any set of cardinal numbers $\le \aleph_0$ which is neither empty nor contains all such cardinal numbers. Then the question of whether or not the number of non-negative integral roots of a polynomial is in $\mathcal{C}$ turns out to be undecidable as well. The proof is a reduction from the case $\mathcal{C} = \{0\}$, which is the undecidability of Hilbert's tenth problem.
Proposition There is a polynomial $q(y_1, y_2, \ldots, y_n, x)$ with integer coefficients such that the positive integers in its range are exactly all prime numbers in the sense that
$q(\mathbb{N}^{n+1}) \cap (\mathbb{N} \setminus \{0\})$ = the set of all primes.
Proof : Primes form a recursively enumerable and thus diophantine set S. This implies that there is a polynomial p with integer coefficients such that $S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. Defining $q(y, x) := x(1 - p(x, y)^2)$ then gives the sought polynomial, since this is positive iff p(x, y) = 0, in which case q(y, x) indeed takes on the value of the corresponding prime.
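The construction $q = x(1 - p^2)$ works for any diophantine set, not only for the primes. Since an explicit polynomial for the primes is very large, the sketch below uses the composite numbers as a stand-in; the toy polynomial p, the names and the search bound are assumptions for illustration.

```python
from itertools import product

def p(x, y1, y2):
    # Toy characterizing polynomial: zero iff x = (y1 + 2) * (y2 + 2), i.e. x is composite
    return x - (y1 + 2) * (y2 + 2)

def q(y1, y2, x):
    # Same construction as in the proof: positive only where p vanishes, and then equal to x
    return x * (1 - p(x, y1, y2) ** 2)

def positive_values(bound):
    """Collect the positive values q takes on arguments below bound."""
    values = set()
    for y1, y2, x in product(range(bound), repeat=3):
        value = q(y1, y2, x)
        if value > 0:
            values.add(value)
    return values
```

Whenever p is non-zero, the factor $1 - p^2$ is zero or negative, so the positive part of the range of q is exactly the set described by p.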
Following the remark regarding universal polynomials, we obtain that for prime number
producing polynomials ten variables suﬃce. A similar construction leads to the following:
Proposition Let $f : \mathbb{N} \to \mathbb{N}$ be any partial recursive function. There exists a polynomial q with integer coefficients such that for all x, y ∈ N:
$y = f(x) \iff \exists x_0, \ldots, x_n \in \mathbb{N} : y = q(x, x_0, \ldots, x_n)$.
Proof : The graph of f is recursively enumerable and thus diophantine. So y = f(x) holds iff for a certain polynomial p we have $\exists x_0, \ldots, x_n : (1 - p(x_0, \ldots, x_n, x)^2) > 0 \wedge x_0 = y$. This in turn is equivalent to $\exists x_0, \ldots, x_n : (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) = y + 1$. Therefore the sought polynomial can be defined as $q(x, x_0, \ldots, x_n) := (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) - 1$.
Application of Solution
A particular form of Gödel's incompleteness theorem is also a consequence of the Matiyasevich/MRDP Theorem:
Let $p(a, x_1, \ldots, x_k) = 0$ provide a Diophantine definition of a non-computable set. Let A be an algorithm that outputs a sequence of natural numbers n such that the corresponding equation $p(n, x_1, \ldots, x_k) = 0$ has no solutions in natural numbers. Then there is a number $n_0$ which is not output by A while in fact the equation $p(n_0, x_1, \ldots, x_k) = 0$ has no solutions in natural numbers.
To see that the theorem is true, it suffices to notice that if there were no such number $n_0$, one could algorithmically test membership of a number n in this non-computable set by simultaneously running the algorithm A to see whether n is output while also checking all possible k-tuples of natural numbers seeking a solution of the equation $p(n, x_1, \ldots, x_k) = 0$.
We may associate an algorithm A with any of the usual formal systems such as Peano
Arithmetic or ZFC by letting it systematically generate consequences of the axioms and
then output a number n whenever a sentence of the form
$\neg \exists x_1, x_2, \ldots, x_k\, [p(n, x_1, x_2, \ldots, x_k) = 0]$
is generated. Then the theorem tells us that either a false statement of this form is proved
or a true one remains unproved in the system in question.
Chapter 6
Wang’s Tile Problem
Suppose we want to cover the plane with decorated square tiles of the same size. Tiles are to be chosen from a finite number of types. An unlimited supply of tiles of each type is available. Due to the decorations, however, there are local constraints on which tiles can be put next to each other, for the tiling to look appealing. Is it possible to cover the whole plane with tiles of the given types? What if we require a certain tile to be used at least once? Can they be used to tile a finite rectangular area, with a certain boundary condition? It turns out that these problems, in the way formulated by Hao Wang, are all undecidable.
In 1961, Wang conjectured that if a ﬁnite set of tiles can tile the plane, then there
exists also a periodic tiling, i.e., a tiling that is invariant under translations by vectors in
a 2-dimensional lattice, like a wallpaper pattern. He also observed that this conjecture
would imply the existence of an algorithm to decide whether a given ﬁnite set of tiles can
tile the plane.
This conjecture was in 1966 refuted by Berger. He showed that any Turing machine
can be translated into a Wang tile set, and that the Wang tile set tiles the plane if and
only if the Turing machine will never halt. The halting problem is undecidable and thus
so is Wang’s original problem.
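Berger constructed the first aperiodic tile set, consisting of 20426 tiles. This number was reduced repeatedly, often by well-known scientists such as Donald Knuth. For a long time the smallest known aperiodic set of Wang tiles consisted of 13 tiles over 5 colors; Jeandel and Rao later found a set of 11 tiles over 4 colors and showed that no smaller aperiodic set exists.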
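The local matching constraint is easy to state in code. The following sketch tries to tile a finite n × n square by backtracking; representing a tile as a tuple of edge colours (north, east, south, west) is an assumption made for this example. Such a search settles only finite squares. It cannot decide whether the whole plane can be tiled, which is the undecidable question.

```python
def tile_square(tiles, n):
    """Try to tile an n x n square with Wang tiles.

    Each tile is a tuple (north, east, south, west) of edge colours. Tiles may
    be reused but not rotated. Returns a grid of tile indices, or None.
    """
    grid = [[None] * n for _ in range(n)]

    def fits(t, r, c):
        north, _, _, west = tiles[t]
        # Left neighbour's east edge must match our west edge
        if c > 0 and tiles[grid[r][c - 1]][1] != west:
            return False
        # Upper neighbour's south edge must match our north edge
        if r > 0 and tiles[grid[r - 1][c]][2] != north:
            return False
        return True

    def place(pos):
        if pos == n * n:
            return True
        r, c = divmod(pos, n)
        for t in range(len(tiles)):
            if fits(t, r, c):
                grid[r][c] = t
                if place(pos + 1):
                    return True
        grid[r][c] = None      # undo before backtracking
        return False

    return grid if place(0) else None
```

For example, the two tiles `("x", "1", "x", "2")` and `("x", "2", "x", "1")` can only alternate along each row, which the search finds immediately.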
Proving the undecidability of Tiling problem
To prove that the tiling problem is undecidable, we reduce the halting problem for Turing machines to it. The encoding is natural and intuitive. Let $M = (\Sigma, Q, \delta, q_0, q_F, B)$ be a Turing machine with a one-way-infinite tape, where Σ is the alphabet, Q the set of states, $\delta : (Q \setminus \{q_F\}) \times \Sigma \to \Sigma \times \{L, R\} \times Q$ the transition function, $q_0 \in Q$ the initial state, $q_F \in Q$ the final state, and B ∈ Σ the blank symbol. We construct a set $T_M$ of Wang tiles as shown in Figures A, B and C and described below.
Fig. A. Alphabet tile B. Merging tile C. Action tile
• For any letter a ∈ Σ, we have a tile of the form depicted in Figure A. These are to
pass the content of an inactive cell of the tape one row upward (which corresponds
to one step later in the computation process of the machine).
• For any state q ∈ Q and letter a ∈ Σ, where δ(q, a) = (b, D, p), there is a tile like one of those in Figure C, depending on whether D = R or D = L. These correspond to the action of the transition function, and to passing the new state to a neighbor cell.
• The state p received from a neighbor cell is combined with the current content of the cell by a tile of the form shown in Figure B, and passed to the upper tile to be processed in the next step.
Tiles for fixing the initial configuration of a Turing machine. (a) Head position. (b) Empty cells.
• The initial configuration of the machine is fixed by the tiles in the second figure. Here * is a new colour, used to ensure that only one head appears on the tape.
Theorem Plane Tiling Problem is undecidable.
Proof : Let M be a Turing machine, an instance of the Halting Problem. We use the tile set $T_M$, constructed above, and a tile set $T_R$, to construct a new set of tiles T that can tile the plane if and only if M on empty input never halts.
Each tile in T is basically a pair (s, t), where $s \in T_R$, and t is either in $T_M$ or is of an auxiliary type, to pass information along a vertical or horizontal line. The two components are interpreted as layers of the tile. Let $s_0 \in T_R$ be any tile that represents the lower-left corner of a connected region. For any such tile, we place a tile $(s_0, t_0)$ in T, where $t_0$ is the marked tile of $T_M$ as defined above. This is the only pair in which $t_0$ appears, and it is supposed to trigger the start of the simulation in each connected region. Any other tile $t \in T_M \setminus \{t_0\}$ is paired with tiles $s \in T_R$ that represent a cell fully inside a connected region.
Let us say a row (or column) of a region is free if it does not pass through a hole. Otherwise, the row (or column) is blocked by the hole. The intersection of a free row and a free column is a free cell. Only free cells are used for simulation. A cell that belongs to a blocked row (resp. a blocked column) simply passes the information along the edge of the blocking hole. More precisely, a cell whose lower (or upper) edge touches a hole simply passes the colour of its left edge to the right, and asks its upper (resp. lower) neighbor to do the same. Similarly, a cell whose left (resp. right) edge touches a hole passes the colour of its lower edge to the upper edge and asks its right (resp. left) neighbor to do the same.
Now, it should be clear that the above-described tiles can tile a connected region C of the decoration if and only if $T_M$ can tile a square with the same area as C, provided $t_0$ is used in the lower-left corner. Since the entire decoration provides connected regions of arbitrarily large net area, using the Extension Lemma, we conclude that the plane can be tiled by the tiles in T if and only if $T_M$ can tile an upper-right quarter of the plane with $t_0$ in the lower-left corner. This completes the proof.
Chapter 7
Busy Beaver
A busy beaver is a Turing machine that attains the maximum “operational busyness”
among all the Turing machines in a certain class. The Turing machines in this class must
meet certain design speciﬁcations and are required to eventually halt after being started
with a blank tape. A busy beaver function quantifies these upper limits on a given type of “operational busyness”, and is a noncomputable function. In fact, a busy beaver function can be shown to grow faster asymptotically than does any computable function.
To understand the busy beaver function we need to make a few changes to our Turing machine and its functioning. So we define our Turing machine as follows:
A Turing machine here is a device consisting of
1. a tape of cells with symbols from T = {0, 1}, where we say that a cell is blank if it contains a zero,
2. a head that reads and writes one cell at a time and moves along the tape,
3. a finite set of internal states Q = {0, . . . , n},
4. a list of instructions, typically of the form (t, q) → (t', q', D) with t, t' ∈ T, q, q' ∈ Q and D ∈ {R, L},
where {R, L} are the directions the head can move. We will sometimes write $M^{(n)}$ if we want to make the number of internal states explicit.
Example 1: A TM $M^{(k+1)}$ which writes k ones onto a blank tape and then halts above the leftmost one. M : (0, t) → (1, t+1, L) for t = 0, . . . , k−1 and M : (0, k) → (0, k+1, R).
We will use the convention that the initial state is q = 0 and the last state q = n is the halting state, the only state upon which the machine halts. Since this results in n active states, the machine is called an n-state Turing machine. Mathematically, the set of instructions characterizing a Turing machine is a map,
$M : T \times Q \to T \times Q \times \{R, L\}$
In order to talk about Turing machines as devices which compute functions of the form $f : \mathbb{N}^k \to \mathbb{N}$, we need to specify some conventions about how input and output are represented. We will use unary encoding for both of them. That is, a number x ∈ N will be represented by x + 1 consecutive 1s on the tape with the rest of the tape blank (e.g., 2 would correspond to 0 . . . 1110 . . . 0). Similarly, $(x_1, \ldots, x_k) \in \mathbb{N}^k$ will be represented by k such blocks of 1s separated by single zeros (e.g. (0, 2) would be 0 . . . 0101110 . . . 0). A Turing machine $M_f^{(n)}$ is then said to compute the function $f : \mathbb{N}^k \to \mathbb{N}$ iff the machine, starting with the head placed on the leftmost 1 of the unary encoding of $x \in \mathbb{N}^k$, eventually halts on the leftmost 1 of the encoded f(x) if x ∈ dom(f), and never halts if $x \notin \mathrm{dom}(f)$.
Definition A function $f : \mathbb{N}^k \to \mathbb{N}$ is called Turing computable iff there is an n-state Turing machine (TM) for some finite n which computes f in the sense that the Turing machine halts for every input x ∈ dom(f) with the tape eventually representing f(x) and it doesn't halt if $x \notin \mathrm{dom}(f)$. Here, input and output are encoded in the above specified unary way and at the start and (potential) end of the computation the head of the TM should be positioned above the leftmost non-blank symbol of the tape.
Example 2: the successor function s(x) = x + 1 can be computed by the following 2-state Turing machine $M_{x+1}^{(2)}$: (1, 0) → (1, 0, R), (0, 0) → (1, 1, L), (1, 1) → (1, 1, L), (0, 1) → (0, 2, R).
Example 3: the zero function z(x) = 0 can be computed by a 2-state TM: (1, 0) → (0, 0, R), (0, 0) → (1, 1, R), (0, 1) → (0, 2, L).
Example 4: the following 5-state TM $M_{2x}^{(5)}$ implements x → 2x: (0, 0) → (0, 3, R), (1, 0) → (0, 1, L), (0, 1) → (1, 2, R), (1, 1) → (1, 1, L), (0, 2) → (1, 0, R), (1, 2) → (1, 2, R), (0, 3) → (0, 3, L), (1, 3) → (0, 4, L), (0, 4) → (0, 5, R), (1, 4) → (1, 4, L).
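The instruction tables of Examples 1, 2 and 4 can be run directly with a few lines of code. The simulator below is a minimal sketch under the conventions of this chapter: instructions map (symbol, state) to (new symbol, new state, direction), state 0 is initial and the largest state halts. The function names, the dictionary representation of the tape and the step limit are assumptions of this sketch.

```python
def run(program, halt_state, ones=0, max_steps=10_000):
    """Run a TM given as {(symbol, state): (new_symbol, new_state, direction)}.

    The tape initially holds `ones` consecutive 1s with the head on the
    leftmost one (or a blank tape if ones == 0). Returns a tuple
    (number of 1s on the tape, head position, tape), or None if the machine
    has not halted within max_steps. A missing instruction raises KeyError.
    """
    tape = {i: 1 for i in range(ones)}
    head, state = 0, 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        new_symbol, state, direction = program[(tape.get(head, 0), state)]
        tape[head] = new_symbol
        head += 1 if direction == "R" else -1
    if state != halt_state:
        return None
    return sum(tape.values()), head, tape

def write_k(k):
    """Example 1: writes k ones onto a blank tape; the halting state is k + 1."""
    program = {(0, t): (1, t + 1, "L") for t in range(k)}
    program[(0, k)] = (0, k + 1, "R")
    return program

# Example 2: successor, halting state 2
SUCCESSOR = {(1, 0): (1, 0, "R"), (0, 0): (1, 1, "L"),
             (1, 1): (1, 1, "L"), (0, 1): (0, 2, "R")}

# Example 4: doubling, halting state 5
DOUBLE = {(0, 0): (0, 3, "R"), (1, 0): (0, 1, "L"), (0, 1): (1, 2, "R"),
          (1, 1): (1, 1, "L"), (0, 2): (1, 0, "R"), (1, 2): (1, 2, "R"),
          (0, 3): (0, 3, "L"), (1, 3): (0, 4, "L"), (0, 4): (0, 5, "R"),
          (1, 4): (1, 4, "L")}
```

Recall that x is encoded by x + 1 ones: running `DOUBLE` on three ones (x = 2) leaves five ones (2x = 4), and running `SUCCESSOR` on four ones (x = 3) leaves five ones (x + 1 = 4).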
After defining the new Turing machine, we now define the concatenation of two Turing machines. Let $M_f^{(n_f)}$ and $M_g^{(n_g)}$ be two TMs with $n_f$, $n_g$ internal states, computing functions f and g respectively. Then we can define a new $(n_f + n_g)$-state TM $M_{gf}^{(n_f + n_g)}$ via
$M_{gf}(t, q) := \begin{cases} M_f(t, q), & q < n_f \\ M_g(t, q - n_f), & q \ge n_f \end{cases}$
Its action will be such that it first computes f(x) and then uses the resulting output as an input for g. Hence, $M_{gf}$ computes the concatenation corresponding to x → g(f(x)), for which we will also write gf(x). Note that the possibility of concatenating two TMs in this way builds on our requirements that the output of a computation has to be encoded in unary on the tape and that the TM (if ever) halts with the head positioned on the leftmost 1.
Now we move on to defining the busy beaver function. Let us assign a number B(M) ∈ N to every Turing machine M by considering its behavior when run on an initially blank tape. We set B(M) := 0 if M never halts and B(M) := b if it halts and the total number of (not necessarily consecutive) 1s eventually written on the tape is b. Based on this we can define the busy beaver function $BB : \mathbb{N} \to \mathbb{N}$,
$BB(n) := \max\{B(M) \mid M \in \{M^{(n)}\}\}$
BB(n) is defined as the largest number of 1s eventually written on an initially blank tape by any n-state TM which halts. Note that the function is well-defined since the maximum is taken over a finite set.
Lemma Busy Beaver function is strictly monotonically increasing:
BB(n + 1) > BB(n) for all n ∈ N.
Proof Denote the TM which achieves BB(n) by $M^{(n)}$. Based on this we can define an (n + 1)-state TM $M^{(n+1)}$ whose instructions equal those of $M^{(n)}$ for all internal states q < n and which in addition follows the rule (t, n) → (1, n + 1 − t, R). By construction $BB(n) + 1 = B(M^{(n+1)}) \le BB(n + 1)$.
This leads us to a common property of all Turing computable functions that they
cannot grow faster than BB.
Theorem Let $f : \mathbb{N} \to \mathbb{N}$ be any function which is Turing computable by a k-state TM $M_f$. Then for all x > 2k + 13 for which f is defined we have f(x) < BB(x).
Proof We utilize concatenation of the above discussed examples and define a (k + n + 8)-state TM
$M_{f(2n+1)} := M_f^{(k)} M_{x+1}^{(2)} M_{2x}^{(5)} M^{(n+1)}$
with $M^{(n+1)}$ being the TM which writes n consecutive 1s.
Running $M_{f(2n+1)}$ on the blank tape then produces f(2n + 1) consecutive ones before halting. Thus,
$f(2n + 1) \le BB(k + n + 8)$.
Moreover, monotonicity of BB implies
$BB(k + n + 8) < BB(2n + 1)$ if k + 7 < n,
or, f(2n + 1) < BB(2n + 1) if k + 7 < n.
Similarly, if we construct a (k + n + 6)-state TM
$M_{f(2n)} := M_f^{(k)} M_{2x}^{(5)} M^{(n+1)}$
where every component is the same as defined above, then using the monotonicity of BB we obtain
f(2n) < BB(2n) if k + 6 < n.
Every x is either even, x = 2n, or odd, x = 2n + 1. From the above two results we can generalize to the bound x > 2k + 13 stated in the theorem.
The above theorem is known as Rado's theorem. From its proof we see that the statement would still hold true if we required a single block of consecutive 1s in the definition of BB, rather than counting all 1s on the tape.
Having defined everything required, we now prove that the busy beaver function is not Turing computable (undecidable). If BB were Turing computable, then there would be a k ∈ N and a k-state TM computing BB, so that by Rado's theorem BB(x) < BB(x) for all sufficiently large x, which is a contradiction. Hence the busy beaver function is not computable.
This is the formalized version of the following more vague statement:
“if BB(x) is the largest finite number which can be written by an algorithm of length x, then there cannot be a single, finite algorithm which computes BB(x) for all x.”
The fact that BB is not Turing computable doesn't mean that BB(x) cannot be computed for given x. Rado's theorem just tells us that the complexity of the TM has to increase unboundedly with x. In fact, BB(x) is known for small values of x:
for x = 1, BB(1) = 1
for x = 2, BB(2) = 4
for x = 3, BB(3) = 6
for x = 4, BB(4) = 13
for x = 5, BB(5) ≥ 4098
for x = 6, $BB(6) \ge 3.5 \times 10^{18267}$.
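For very small n the values in this list can be reproduced by exhaustive search, as sketched below. All instruction tables for n active states are enumerated and run on a blank tape. The step limit `max_steps` is an assumption of this sketch: a machine that has not halted by then is treated as non-halting, so in general the result is only a lower bound on BB(n). No computable step limit works for all n, which is another way of stating that BB is not computable.

```python
from itertools import product

def ones_at_halt(program, n, max_steps):
    """Run one machine on a blank tape; return the number of 1s at halting, or None."""
    tape, head, state = {}, 0, 0
    for _ in range(max_steps):
        symbol, state, move = program[(tape.get(head, 0), state)]
        tape[head] = symbol
        head += move
        if state == n:          # state n is the halting state
            return sum(tape.values())
    return None

def bounded_busy_beaver(n, max_steps):
    """Maximum number of 1s over all n-state machines halting within max_steps."""
    keys = [(t, q) for q in range(n) for t in (0, 1)]
    # Every instruction writes a symbol, picks a next state (possibly n) and moves R (+1) or L (-1)
    actions = [(t, q, d) for t in (0, 1) for q in range(n + 1) for d in (1, -1)]
    best = 0
    for choice in product(actions, repeat=len(keys)):
        result = ones_at_halt(dict(zip(keys, choice)), n, max_steps)
        if result is not None:
            best = max(best, result)
    return best
```

With n = 2 there are 12^4 = 20736 machines to try, which is still quick; the count grows so fast that this approach is hopeless beyond the first few values.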
Chapter 8
Some More Undecidable Problems
In this chapter we will see some famous undecidable problems. These problems come from different fields of mathematics such as mathematical logic, matrix theory and group theory. We will also prove some of them undecidable. Some of the proofs are very complex and require advanced knowledge of their field, so they are omitted.
Entscheidungsproblem
The Entscheidungsproblem asks for an algorithm that takes as input a statement of a first-order logic (possibly with a finite number of axioms beyond the usual axioms of first-order logic) and answers “Yes” or “No” according to whether the statement is universally valid.
By the completeness theorem of ﬁrst-order logic, a statement is universally valid if
and only if it can be deduced from the axioms, so the Entscheidungsproblem can also be
viewed as asking for an algorithm to decide whether a given statement is provable from
the axioms using the rules of logic.
Suppose that we had a general decision algorithm for statements in a ﬁrst-order
language. The question whether a given Turing machine halts or not can be formulated
as a ﬁrst-order statement, which would then be susceptible to the decision algorithm. But
we know that no general algorithm can decide whether a given Turing machine halts.
Mortality problem
The Mortality Problem for Turing machines with an inﬁnite input tape is the problem
to determine, for an arbitrary machine M, whether or not M eventually halts no matter
in what conﬁguration it is started. This is not the Halting Problem, since it means that
we cannot just consider well-behaved machines that always start in their start states,
positioned to the right of their arguments and which always end up to the right of the
answer, which immediately follows these arguments (a convention called Standard Turing
Computation). It also means that we might start with an inﬁnite number of marked
squares on the tape, unlike a normal Turing machine, which must start with its tape only
ﬁnitely marked.
As is commonly done with Turing machines, we can, without loss of generality, limit the
tape alphabet to ¦0,1¦, where 0 denotes a blank, and 1 is the only mark (non-blank). Using
that limitation on the tape alphabet, consider a function to compute x + 1 from x, using
Standard Turing Computation and unary representations of numbers. Such a machine
could copy its one argument to the immediate right of the original scanned square and
then move to the end of the copy appending a 1.
This machine always halts if it is started on a ﬁnitely marked tape, with the Standard
Turing conventions obeyed. In fact, it can be written so it will always halt so long as the
tape is ﬁnitely marked, even if the machine is started in other than the correct state and
other than on the correct square. However, this machine is not mortal since, for example,
it would run forever if started just to the right of an inﬁnite sequence of 1s; the copy
operation could never end.
Philip K. Hooper proved in 1966 that the mortality problem is undecidable. However,
it can be shown that the set of Turing machines which are mortal i.e. halt on every starting
conﬁguration is recursively enumerable.
Word Problem
The Word Problem for groups is undecidable. This is the problem, given a finite group presentation and a word in its generators, to decide if that word represents the group identity in that presentation. In a free group, a word represents the identity if and only if the reduced word obtained by iteratively cancelling adjacent inverses is the empty word; in a general presentation the defining relations must be taken into account as well, which is what makes the problem hard. The problem can also be viewed as the algorithmic problem of deciding whether two words in the generators represent the same element.
The basic line of thought is to realize that the set of words forms a semigroup, which then allows us to extend the undecidability result to presentations of semigroups and groups. For semigroups the word problem was proven undecidable independently by Post and Markov; for groups this was shown by Novikov and by Boone. The problem is undecidable because one may encode the Halting problem for Turing machines. Basically, for each Turing machine program, one can construct a group presentation and a word, such that the program halts if and only if that word is the identity.
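For contrast, in the special case of a free group (a presentation without relations) the word problem is easy: cancelling adjacent inverse pairs until none remain decides whether a word is the identity. The sketch below assumes a hypothetical encoding in which a lowercase letter is a generator and the matching uppercase letter is its inverse.

```python
def free_reduce(word):
    """Cancel adjacent pairs such as 'aA' or 'Aa' until none remain.

    Lowercase letters are generators, uppercase letters their inverses
    (an encoding assumed for this sketch). One pass with a stack suffices.
    """
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()            # the letter cancels against its inverse
        else:
            stack.append(letter)
    return "".join(stack)

def is_identity_in_free_group(word):
    return free_reduce(word) == ""
```

Once relations are added, no comparable procedure can work for all presentations; that is the content of the undecidability result.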
Conjugacy Problem
The conjugacy problem for a group G with a given presentation is the decision problem of determining, given two words x and y in G, whether or not they represent conjugate elements of G. That is, the problem is to determine whether there exists an element z of G such that $y = z x z^{-1}$. The geometric motivation is to see whether two loops are freely homotopic.
The Conjugacy problem is undecidable. The conjugacy problem for arbitrary presentations (and not just one fixed presentation) does reduce to the word problem for arbitrary presentations, since both are equivalent to the halting problem. That is, the halting problem famously reduces to the word problem, which reduces to the conjugacy problem (a word is trivial exactly when it is conjugate to the identity), and this in turn reduces to the halting problem.
Isomorphism Problem
The isomorphism problem is to find an algorithm to determine whether two finite presentations give isomorphic groups. The geometric motivation is to see if one can algorithmically distinguish spaces based on their fundamental groups. The undecidability of the isomorphism problem can be proved by the Adjan-Rabin theorem. This theorem gives a general construction which can be applied to any Markov property P of finitely presented groups to prove its undecidability. Also, all varieties of solvable groups with undecidable word problem have undecidable isomorphism problem.
Undecidable statements in ZFC
Assuming that ZFC is consistent, the mathematical statements mentioned below are
provably undecidable in ZFC (the Zermelo Fraenkel axioms plus the axiom of choice) :
1. Set theory of the real line: It is an area of mathematics concerned with the applica-
tion of set theory to aspects of the real numbers. Axiomatic set theory, by G¨odel’s
incompleteness theorem, contains propositions that are undecidable. Also, the real
numbers are most often formalized using the Zermelo Fraenkel axiomatization of set
theory.
2. Axiom of Constructibility: It is a possible axiom for set theory in mathematics that asserts that every set is constructible. It implies the axiom of choice over ZF.
3. Continuum Hypothesis: It states that there is no infinite set with a cardinal number between that of the “small” infinite set of integers and the “large” infinite set of real numbers (the continuum).
It was shown by Gödel that no contradiction would arise if the continuum hypothesis were added to conventional ZFC set theory. Moreover, using a technique called forcing, Paul Cohen proved that no contradiction would arise if the negation of the continuum hypothesis were added to set theory. Together, these results establish that the continuum hypothesis can be neither proved nor refuted in ZFC, and is therefore undecidable.
4. Group Theory: It studies the algebraic structures known as groups. It is an undecid-
able theory. There is no computable process to determine whether a given statement
in the ﬁrst order language of group theory is true in all groups.
5. Measure Theory: It is about the study of measures. It generalizes the intuitive
notions of length, area, and volume. The earliest and most important examples
are Jordan measure and Lebesgue measure, but other examples are Borel measure,
probability measure, complex measure, and Haar measure.
6. Order Theory: It is a branch of mathematics which investigates our intuitive notion
of order using binary relations. It provides a formal framework for describing
statements such as “this is less than that” or “this precedes that”.
7. Functional Analysis: It is concerned with inﬁnite-dimensional vector spaces (mainly
function spaces) and mappings between them. The spaces may be of diﬀerent, and
possibly inﬁnite dimensions. These mappings are called operators or, if the range is
on the real line or in the complex plane, functionals.
Matrix - Mortality Problem
Consider a finite set of d × d matrices $S = \{M_1, \ldots, M_n\} \subset M_d(\mathbb{Z})$ with integer entries. We call S mortal iff there is a non-empty word $w \in \{1, \ldots, n\}^*$, of length m say, such that for the corresponding product of matrices:
$M_{w_1} \cdots M_{w_m} = 0$
Example 1 Consider a set consisting of two matrices
$\begin{pmatrix} 0 & 1 \\ -3 & 2 \end{pmatrix}$ and $\begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}$
This cannot be mortal since the matrices have non-zero determinant and the determinant of any product is just the product of determinants.
Example 2 The two matrices
$\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$
form a mortal set since their product is a nilpotent matrix whose square vanishes.
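Both examples can be checked by a bounded search over all products up to a given length. This is only a sketch with assumed function names: finding a zero product proves mortality, but failing to find one within the length bound proves nothing, which is why such a search does not decide the problem.

```python
from itertools import product

def matmul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def find_mortal_word(matrices, max_length):
    """Return the first index word whose matrix product is zero, or None.

    Words are tried in order of increasing length up to max_length.
    """
    for length in range(1, max_length + 1):
        for word in product(range(len(matrices)), repeat=length):
            result = matrices[word[0]]
            for i in word[1:]:
                result = matmul(result, matrices[i])
            if all(v == 0 for row in result for v in row):
                return word
    return None

EXAMPLE_1 = [[[0, 1], [-3, 2]], [[2, 0], [0, -1]]]
EXAMPLE_2 = [[[0, 0], [0, 1]], [[0, 1], [-1, 0]]]
```

For Example 2 the search already succeeds with a word of length three, while for Example 1 no bound will ever succeed, as the determinant argument shows.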
Before we show that, unlike in these simple examples, there cannot be a general recipe for deciding mortality, we will introduce some tools for encoding words into products of matrices:
For words $w = a_1 \ldots a_m$ over the alphabet A := {1, 2, 3} define an injective map $W(w) := \sum_{k=1}^{m} a_k 4^{m-k}$ from $A^*$ to $\mathbb{N}$. Denote by |w| the length of a word and define a map from $A^* \times A^*$ into the set of 3 × 3 integer matrices by
$M(u, w) := \begin{pmatrix} 4^{|u|} & 0 & 0 \\ 0 & 4^{|w|} & 0 \\ W(u) & W(w) & 1 \end{pmatrix}$
If we use concatenation of words and matrix multiplication as binary operations in the domain and codomain respectively, then (u, w) → M(u, w) is an injective monoid homomorphism. That is, in particular $M(u_1, v_1) M(u_2, v_2) = M(u_1 u_2, v_1 v_2)$.
In addition we will need the matrix
$B := \begin{pmatrix} 1 & 0 & 1 \\ -1 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}$
which satisfies $B^2 = B$ and $B M(u, v) B = (4^{|u|} + W(u) - W(v)) B$.
The latter implies that BM(u, v)B = 0 iff W(v) = W(1u), which in turn is equivalent to v = 1u. Now let us exploit these relations to prove the following:
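The identities used here are easy to check numerically. The sketch below encodes words as tuples over {1, 2, 3} and takes B to be the idempotent matrix with rows (1, 0, 1), (−1, 0, −1), (0, 0, 0), for which both stated identities hold; the helper names are assumptions of this example.

```python
def W(word):
    """Base-4 value of a word over {1, 2, 3}: sum of a_k * 4^(m-k)."""
    value = 0
    for a in word:
        value = 4 * value + a
    return value

def M(u, w):
    return [[4 ** len(u), 0, 0],
            [0, 4 ** len(w), 0],
            [W(u), W(w), 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scale(c, X):
    return [[c * v for v in row] for row in X]

# Idempotent matrix assumed in this sketch (B * B == B)
B = [[1, 0, 1], [-1, 0, -1], [0, 0, 0]]

def sandwich(u, v):
    """Compute B * M(u, v) * B."""
    return matmul(matmul(B, M(u, v)), B)
```

For u = (2, 3) the product `sandwich(u, v)` is the zero matrix exactly for v = (1, 2, 3), that is, for v = 1u.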
Proposition Let k ∈ N be such that PCP with k “dominos” is undecidable. Then there is no algorithm which upon input of a set $S \subset M_3(\mathbb{Z})$ of 2k + 1 integer matrices decides whether or not S is mortal.
Proof Consider an undecidable PCP with k dominos and choose {2, 3} as a binary alphabet for it. Denote by $(x_i, y_i)$ with i = 1, . . . , k the pairs of words appearing in the PCP. For each of these k dominos we define two matrices $M_i := M(x_i, y_i)$ and $M'_i := M(x_i, 1y_i)$. So together with B these form a set S of 2k + 1 integer matrices.
Now assume that PCP has a solution $w \in \{1, \ldots, k\}^*$. Then
$B M'_{w_1} M_{w_2} \cdots M_{w_{|w|}} B = 0$
so the set S is mortal. Conversely, if S is mortal, then there is a product so that $B M(u_1, v_1) B M(u_2, v_2) B \cdots B = 0$. Since $B^2 = B$ and each $B M(u_i, v_i) B$ is a multiple of B, the product can only be zero if for at least one i we have $1u_i = v_i$. Observing that $u_i \in \{2, 3\}^*$, this implies a solution for PCP.
Using that PCP is known to be undecidable for seven dominos, we obtain that matrix mortality is undecidable for sets of fifteen 3 × 3 matrices. One can trade the number of matrices for their dimension and show that matrix mortality is undecidable as well for two 24 × 24 matrices. On the positive side, it is known that it is decidable for two 2 × 2 matrices and, for instance, for an arbitrary number of upper triangular 2 × 2 matrices. Without such an additional constraint, however, decidability is not known even for three 2 × 2 matrices with integer coefficients.
Chapter 9
Hypercomputation
Hypercomputation or super-Turing computation refers to models of computation that go beyond, or are incomparable to, Turing computability. This includes various hypothetical methods for the computation of non-Turing-computable functions, following super-recursive algorithms. The difference between super-Turing computation and hypercomputation is that super-Turing computation usually implies that the proposed model is supposed to be physically realizable, while hypercomputation does not.
A computational model going beyond Turing machines was introduced by Alan Turing
in his 1938 PhD dissertation, Systems of Logic Based on Ordinals. This work investigated
mathematical systems in which an oracle was available, which could compute a single
arbitrary (non-recursive) function from naturals to naturals. He used this device to prove
that even in those more powerful systems, undecidability is still present. Turing's oracle
machines are strictly mathematical abstractions, and are not physically realizable.
The Church-Turing thesis states that any function that is algorithmically computable
can be computed by a Turing machine. Hypercomputers compute functions that a Turing
machine cannot, and which are hence not computable in the Church-Turing sense. An
example of a problem a Turing machine cannot solve is the halting problem: a Turing
machine cannot decide if an arbitrary program halts or runs forever. Some proposed
hypercomputers can simulate the program for an infinite number of steps and tell the
user whether or not the program halted.
Hypercomputation Proposals
There are many proposals for hypercomputers, but only a few are discussed here. The
Zeno machine, the oracle machine and real computers are three famous proposals, among
which the oracle machine is the most important; it was proposed by Alan Turing himself
and is one of the oldest proposals for a hypercomputer. In this section we cover these
three proposals: first the Zeno machine, then the oracle machine, and finally real
computers.
Zeno Machine
The idea of Zeno machines was first discussed by Hermann Weyl; they are named
after the ancient Greek philosopher Zeno of Elea, since their method of computation
resembles the famous Zeno's paradox. Zeno machines are also called accelerated Turing
machines. They are a hypothetical computational model related to Turing machines that
allows a countably infinite number of algorithmic steps to be performed in finite time.
Formally, a Zeno machine is a Turing machine that takes $2^{-n}$ units of time to
perform its n-th step. Thus the first step takes 0.5 units of time, the second takes
0.25, the third 0.125 and so on, so that after one unit of time a countably infinite number
of steps will have been performed. Such a machine can do things that a Turing machine
cannot, such as solving the halting problem. Keeping in mind that it is just a hypothetical
model, the following shows how a Zeno machine can solve the halting problem.
Suppose we have to decide whether a given machine M halts on an input string w. We
construct an accelerated Turing machine M' as defined above. We assume that w can be
effectively constructed, i.e. there exists an algorithm that writes the given string on the
tape. The constructed machine M' simulates M and runs it on w. If M halts on w after
some finite number of steps, then M' halts in less than one unit of time; if M never halts
on w, then M' is still running when one unit of time has elapsed. So M halts on w if and
only if M' halts within one unit of time. Hence the halting problem for Turing machines
becomes decidable on an accelerated Turing machine. However, in the spirit of Gödel's
theorem, the halting problem for Zeno machines is not solvable by a Zeno machine itself,
so undecidability remains in this model of hypercomputer. As we have already seen,
however powerful a computer becomes, undecidable statements will always exist.
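The timing argument for the Zeno machine rests on the geometric series $\sum_{n=1}^{N} 2^{-n} = 1 - 2^{-N}$, which stays strictly below 1 for every finite N. The short Python sketch below checks this with exact rational arithmetic; the function name `elapsed_time` is an assumption made for illustration and models only the clock, not the machine itself.

```python
from fractions import Fraction

def elapsed_time(steps):
    """Total time a Zeno machine has used after its first `steps` steps,
    where the n-th step takes 2**(-n) units of time."""
    return sum(Fraction(1, 2 ** n) for n in range(1, steps + 1))

# If the simulated machine M halts after N steps, M' halts at time
# 1 - 2**(-N), which is strictly less than one unit of time.
for N in (1, 2, 3, 10, 50):
    assert elapsed_time(N) == 1 - Fraction(1, 2 ** N)
    assert elapsed_time(N) < 1
```

An observer who looks at M' at time 1 therefore sees a halted machine exactly when M halts on w.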
Oracle Machine
In computability theory, an oracle machine is an abstract machine used to study
decision problems. It can be visualized as a Turing machine with a black box, called
an oracle, which is able to decide certain decision problems in a single operation. The
problem can be of any complexity class. Even undecidable problems, like the halting
problem, can be used.
An oracle machine is a Turing machine connected to an oracle. The oracle, in this
context, is thought of as an entity capable of answering some collection of questions, and
usually represented as some subset A of the natural numbers. Intuitively then, the oracle
machine can perform all of the usual operations of a Turing machine, and can also query
the oracle for an answer to a specific question of the form "is x in A?"
Informal Deﬁnition
An oracle machine has
1. A work tape, a sequence of cells without beginning or end, each of which may
contain a B (for blank) or a 1.
2. A read/write head, which rests on a single cell of the work tape and can read the
data there, write new data, and move left or right along the tape.
3. A control mechanism, which can be in one of a finite number of states, and which
will perform different actions (reading data, writing data, moving the control
mechanism, and changing states) depending on the current state and the data being
read.
4. An oracle tape, on which an infinite sequence of B's and 1's is printed, corresponding
to the characteristic function of the oracle set A.
5. An oracle head, which can move left or right along the oracle tape reading data,
but which cannot write.
Formal deﬁnition
An oracle Turing machine is a 4-tuple $M = (Q, \delta, q_0, F)$ where
1. $Q$ is a finite set of states.
2. $\delta : Q \times \{B, 1\}^2 \to Q \times \{B, 1\} \times \{L, R\}^2$ is the transition function, where L is left shift and
R is right shift.
3. $q_0 \in Q$ is the initial state.
4. $F \subseteq Q$ is the set of halting states.
The oracle machine is initialized with the work tape containing some input with finitely
many 1's and the rest of the tape blank, the oracle tape containing the characteristic
function of the oracle, A, and the Turing machine in state $q_0$ with the read/write head
reading the first nonblank cell of the work tape, and the oracle head reading the cell of the
oracle tape which corresponds to $\chi_A(0)$. Thereafter it operates according to $\delta$: if the Turing
machine is currently in state $q$, the read/write head is reading a symbol $S_1$, and the oracle
head is reading $S_2$, then if $\delta(q, S_1, S_2) = (q', S_1', D_1, D_2)$, the machine enters state $q'$, the
read/write head writes the symbol $S_1'$ in place of $S_1$, and then the read/write head moves
1 cell in direction $D_1$ and the oracle head moves one cell in direction $D_2$. At this point if $q'$
is a halting state, the machine halts, otherwise it repeats this same procedure.
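The stepping rule above can be sketched as a small Python simulator. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `run_oracle_machine`, `MEMBERSHIP` and `chi_A` are hypothetical, the oracle tape is modelled by a function rather than an infinite tape, cells of the oracle tape to the left of $\chi_A(0)$ are assumed blank, and a step bound is added so that the simulation always returns. The example machine reads n written in unary (n consecutive 1's) and halts in state `yes` exactly when n is in A.

```python
from collections import defaultdict

def run_oracle_machine(delta, q0, halting, work_input, chi_A, max_steps=10_000):
    """Apply delta(q, S1, S2) = (q', S1', D1, D2) until a halting state is entered."""
    work = defaultdict(lambda: "B")          # work tape, blank by default
    for i, symbol in enumerate(work_input):
        work[i] = symbol
    q, work_pos, oracle_pos = q0, 0, 0       # oracle head starts at the cell for chi_A(0)
    for _ in range(max_steps):
        s1 = work[work_pos]
        # Assumption: oracle cells left of position 0 are blank.
        s2 = "1" if oracle_pos >= 0 and chi_A(oracle_pos) else "B"
        q, s1_new, d1, d2 = delta[(q, s1, s2)]
        work[work_pos] = s1_new              # write S1' in place of S1
        work_pos += 1 if d1 == "R" else -1
        oracle_pos += 1 if d2 == "R" else -1
        if q in halting:
            return q
    raise RuntimeError("step bound reached without halting")

# Hypothetical example machine: move both heads right once per 1 on the
# work tape, then read the oracle cell that the oracle head has reached.
MEMBERSHIP = {
    ("scan", "1", "B"): ("scan", "B", "R", "R"),
    ("scan", "1", "1"): ("scan", "B", "R", "R"),
    ("scan", "B", "1"): ("yes", "1", "R", "R"),
    ("scan", "B", "B"): ("no", "B", "R", "R"),
}

def is_even(n):
    """Stand-in oracle set A = even numbers (decidable, for demonstration only)."""
    return n % 2 == 0

result = run_oracle_machine(MEMBERSHIP, "scan", {"yes", "no"}, "111", is_even)
```

Here the oracle is a decidable set so that the sketch can actually run; in the theory the same machine works unchanged for any set A, including undecidable ones.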
We have seen the working of an oracle Turing machine (OTM); now we look at how the
oracle helps in solving certain decision problems. In an alternative formulation, an OTM
is a Turing machine M that has a special read-write tape called M's oracle tape and three
special states $q_{query}$, $q_{yes}$ and $q_{no}$ apart from its other states.
To execute M, we specify the input as usual and a language $O \subseteq \{0, 1\}^*$ that is used as
an oracle for M. While performing its computation, if M enters the state $q_{query}$, then M
checks whether the contents $w$ of the oracle tape satisfy $w \in O$. If $w \in O$, M moves to the
state $q_{yes}$; it moves to $q_{no}$ if $w \notin O$. Regardless of the choice of O, a query like $w \in O$
counts as a single computational step of M. $M^O(x)$ denotes the output of the oracle Turing
machine M on input $x \in \{0, 1\}^*$ with $O \subseteq \{0, 1\}^*$ as the oracle language.
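The query mechanism can be pictured with the following Python sketch, in which the oracle is passed in as a membership test and every query is charged as one step. The class `QueryOracleMachine`, its attributes, and the example behaviour (accepting the complement of O) are assumptions chosen for illustration, and the demonstration oracle is a decidable language only so that the code can run.

```python
class QueryOracleMachine:
    """Toy model of an OTM with the special states q_query, q_yes and q_no."""

    def __init__(self, oracle):
        self.oracle = oracle      # membership test for the language O
        self.steps = 0            # number of computational steps charged so far
        self.state = "q_start"

    def query(self, w):
        """Enter q_query with w on the oracle tape; costs a single step."""
        self.steps += 1
        self.state = "q_yes" if self.oracle(w) else "q_no"
        return self.state

    def run(self, x):
        """M^O(x): accept x exactly when x is not in O."""
        return self.query(x) == "q_no"

def even_ones(w):
    """Stand-in oracle language: binary strings with an even number of 1's."""
    return w.count("1") % 2 == 0

machine = QueryOracleMachine(even_ones)
accepted = machine.run("101")     # "101" is in O, so M^O rejects it
```

The same `run` method decides the complement of whatever language is supplied as the oracle, at the cost of one query.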
Oracle Turing machines can compute functions as follows: if f is a function that
takes natural numbers to natural numbers, $M^A$ is a Turing machine with oracle A, and
whenever $M^A$ is initialized with the work tape consisting of n+1 consecutive 1's (and blank
elsewhere) $M^A$ eventually halts with f(n) 1's on the tape, then $M^A$ is said to compute the
function f. A similar definition can be made for functions of more than one variable, or
partial functions.
If there is an oracle machine M that computes a function f with oracle A, f is said
to be A-computable. If f is the characteristic function of a set B, B is also said to be
A-computable, and M is said to be a Turing reduction from B to A.
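As a small illustration of a Turing reduction, the Python sketch below computes the characteristic function of a set B using the characteristic function of A only as a black box. The particular set B = {n : n ∈ A and n+1 ∉ A} and the names `make_chi_B` and `chi_A` are assumptions chosen for the example; they do not come from the text above.

```python
def make_chi_B(chi_A):
    """Return the characteristic function of B = {n : n in A and n+1 not in A}.

    chi_A is only ever called, never inspected, so the construction works
    for any oracle set A, whether or not A itself is computable.
    """
    def chi_B(n):
        return chi_A(n) and not chi_A(n + 1)   # two oracle queries per input
    return chi_B

def chi_A(n):
    """Stand-in oracle: A = numbers not divisible by 3."""
    return n % 3 != 0

chi_B = make_chi_B(chi_A)
members = [n for n in range(10) if chi_B(n)]
```

Because each value of `chi_B` needs only finitely many oracle answers, B is A-computable in the sense defined above.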
Halting problem
If we assume the existence of an oracle which computes a non-computable function,
such as the answer to the halting problem or some equivalent. A machine with an oracle is
a hypercomputer. But again the halting paradox still applies to such machines, although
they determine whether particular Turing machines will halt on particular inputs, they
cannot determine, in general, if machines equivalent to themselves will halt. This fact
creates a hierarchy of machines, called the arithmetical hierarchy, each with a more
powerful halting oracle and an even harder halting problem.
Consider a “super halting problem”: given a Turing machine with an oracle for the
halting problem, decide if it halts. We can prove that this super halting problem is
unsolvable, even given an oracle for the ordinary halting problem. We simply take Turing’s
original proof that the halting problem is unsolvable, and “shift everything up a level” by
giving all the machines an oracle for the halting problem. Everything in the proof goes
through as before.
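The diagonal argument that is being "shifted up a level" can be sketched in Python as follows. This is only an illustration under several assumptions: programs are modelled as generator functions in which each `yield` counts as one step, `claimed_halts` stands for any candidate halting decider (with or without an oracle), and `halts_within` observes a program for a bounded number of steps only. The names `make_diagonal` and `halts_within` are hypothetical.

```python
def make_diagonal(claimed_halts):
    """Build a program that does the opposite of what claimed_halts predicts for it."""
    def diagonal():
        if claimed_halts(diagonal):
            while True:        # predicted to halt, so run forever
                yield
        # predicted to run forever, so halt immediately
    return diagonal

def halts_within(program, max_steps):
    """Run a generator-based program for at most max_steps steps.

    Returns True if it halted within the bound, False otherwise.
    """
    run = program()
    for _ in range(max_steps):
        try:
            next(run)
        except StopIteration:
            return True
    return False

# Two naive candidate deciders; each is wrong on its own diagonal program.
says_yes = lambda program: True
says_no = lambda program: False
looped = not halts_within(make_diagonal(says_yes), 1000)
halted = halts_within(make_diagonal(says_no), 1000)
```

The construction never looks inside `claimed_halts`, which is why the same argument applies unchanged when every machine involved is given an oracle for the ordinary halting problem.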
Friedberg and Muchnik actually proved a stronger result: there are two problems
A and B, both of which are solvable given an oracle for the halting problem, but neither
of which is solvable given an oracle for the other. These problems are constructed via an
infinite process whose purpose is to kill off every Turing machine that might reduce A to B
or B to A. The resulting problems are extremely contrived; they don't look like anything
that might arise in practice.
Real Computers
In computability theory, the theory of real computation deals with hypothetical
computing machines using infinite-precision real numbers. They are given this name because
they operate on the set of real numbers. These hypothetical computing machines can
be viewed as idealised analog computers which operate on real numbers, whereas digital
computers are limited to computable numbers. In this section we will not see the
working of a real computer; we will only get a rough idea of computable numbers, and
see how real computers or analog computers are more powerful than today's
digital computers.
Computable numbers are also known as the recursive numbers or the computable
reals. They are the real numbers that can be computed to within any desired precision by
a finite, terminating algorithm. Equivalent definitions can be given using Turing machines
or λ-calculus as the formal representation of algorithms. The computable numbers form a
real closed field and can be used in the place of real numbers for many, but not all,
mathematical purposes.
Deﬁnition using a Turing machine
“A sequence of digits interpreted as a decimal fraction between 0 and 1 is a computable
number if there exists a Turing machine which, given n on its initial tape, terminates with
the nth digit of that number.”
The key notions in the definition are that some n is specified at the start, and that for
any n the computation only takes a finite number of steps, after which the machine
produces the desired output and terminates.
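As an illustration of the digit-based definition, the following Python sketch returns the n-th decimal digit of the fractional part of $\sqrt{2}$, which is the number $\sqrt{2} - 1$ between 0 and 1. The function name `sqrt2_digit` is an assumption made for this example; the computation uses only exact integer arithmetic, so it terminates after finitely many steps for every n.

```python
from math import isqrt

def sqrt2_digit(n):
    """Return the n-th decimal digit (n >= 1) of sqrt(2) - 1 = 0.41421356...

    isqrt(2 * 10**(2n)) equals floor(sqrt(2) * 10**n), whose last digit is
    the n-th digit after the decimal point.
    """
    return isqrt(2 * 10 ** (2 * n)) % 10

first_digits = [sqrt2_digit(n) for n in range(1, 9)]
```

Each call is given n at the start and halts with a single digit, exactly as the definition requires.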
By the definition above it is clear that a real computer is more powerful than a Turing
machine, i.e. it is a hypercomputer, because it can generate numbers that cannot be
generated by a Turing machine.
The definition above is, however, not the modern definition, which only requires that the
result be accurate to within any given accuracy. The informal definition above is subject
to a rounding problem called the table-maker's dilemma, whereas the modern definition is not.
Formal Deﬁnition
A real number $a$ is said to be computable if it can be approximated by some computable
function in the following manner: given any integer $n \ge 1$, the function produces an integer
$k$ such that
$$\frac{k-1}{n} \le a \le \frac{k+1}{n}.$$
There is another equivalent definition, called approximation of computable numbers:
there exists a computable function which, given any positive rational error bound $\varepsilon$,
produces a rational number $r$ such that
$$|r - a| \le \varepsilon.$$
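Both forms of the definition can be illustrated for $a = \sqrt{2}$ with exact arithmetic. In the Python sketch below the names `sqrt2_k` and `sqrt2_approx` are assumptions introduced for the example. The first function returns $k = \lfloor n\sqrt{2} \rfloor$, which satisfies $(k-1)/n \le \sqrt{2} \le (k+1)/n$; the second chooses $n \ge 1/\varepsilon$, so that $r = k/n$ differs from $\sqrt{2}$ by less than $1/n \le \varepsilon$.

```python
from fractions import Fraction
from math import ceil, isqrt

def sqrt2_k(n):
    """Integer k with (k - 1)/n <= sqrt(2) <= (k + 1)/n, for n >= 1."""
    return isqrt(2 * n * n)          # floor(n * sqrt(2))

def sqrt2_approx(eps):
    """Rational r with |r - sqrt(2)| <= eps, for a rational eps > 0."""
    n = ceil(1 / Fraction(eps))      # guarantees 1/n <= eps
    return Fraction(sqrt2_k(n), n)

k = sqrt2_k(10)
r = sqrt2_approx(Fraction(1, 1000))
```

The bounds can be checked without floating point by squaring, for example $(r - \varepsilon)^2 \le 2 \le (r + \varepsilon)^2$.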
While the set of real numbers is uncountable, the set of computable numbers is only
countable, and thus almost all real numbers are not computable.
The arithmetical operations on computable numbers are themselves computable, in the
sense that whenever real numbers a and b are computable, the following real numbers
are also computable: a + b, a - b, ab, and a/b if b is nonzero. For example, there is a
Turing machine which on input (A, B, ε) produces output r, where A and B are Turing
machines approximating a and b respectively, and r is an ε-approximation of a + b.
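The addition machine can be sketched in Python by representing each approximating machine as a function from an error bound to a rational number. The names `approx_sqrt2`, `approx_third` and `add` are assumptions made for this illustration. The idea is to ask each summand for accuracy ε/2, so that the two errors together stay within ε.

```python
from fractions import Fraction
from math import ceil, isqrt

def approx_sqrt2(eps):
    """Rational within eps of sqrt(2)."""
    n = ceil(1 / Fraction(eps))
    return Fraction(isqrt(2 * n * n), n)

def approx_third(eps):
    """Rational within eps of 1/3 (exact, since 1/3 is rational)."""
    return Fraction(1, 3)

def add(A, B):
    """Given approximators A for a and B for b, return one for a + b."""
    def approx_sum(eps):
        half = Fraction(eps) / 2
        return A(half) + B(half)     # errors add up to at most eps
    return approx_sum

approx_total = add(approx_sqrt2, approx_third)
r = approx_total(Fraction(1, 100))
```

Subtraction works the same way; multiplication and division need an additional bound on the size of the operands before the accuracy of the inputs can be chosen.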
The least upper bound of a bounded increasing computable sequence of computable
real numbers need not be a computable real number.
The order relation on the computable numbers is not computable, and the same holds for
the equality relation. There is no Turing machine which on input A (the description of a
Turing machine approximating the number a) outputs YES if a > 0 and NO if a ≤ 0.
To see why, suppose the machine described by A keeps outputting 0 as approximations.
It is not clear how long to wait before deciding that the machine will never output an
approximation which forces a to be positive. Thus the machine will eventually have to
guess that the number will equal 0, but the sequence may later become different from 0.
While the full order relation is not computable, the restriction of it to pairs of unequal
numbers is computable. That is, there is a program that takes as input two Turing
machines A and B approximating numbers a and b respectively, where a ≠ b, and outputs
whether a < b or a > b. It is sufficient to use ε-approximations where $\varepsilon < \frac{|b-a|}{2}$; so by
taking increasingly small ε, one eventually can decide whether a < b or a > b.
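The comparison procedure for unequal numbers can be sketched as a loop that halves ε until the two approximation intervals no longer overlap. The Python names below (`approx_sqrt2`, `approx_constant`, `compare_unequal`) are assumptions made for illustration. Note that the loop never terminates when a = b, which mirrors the fact that equality is not computable.

```python
from fractions import Fraction
from math import ceil, isqrt

def approx_sqrt2(eps):
    """Rational within eps of sqrt(2)."""
    n = ceil(1 / Fraction(eps))
    return Fraction(isqrt(2 * n * n), n)

def approx_constant(q):
    """Approximator for a known rational q (exact at every accuracy)."""
    return lambda eps: Fraction(q)

def compare_unequal(A, B):
    """Return '<' if a < b and '>' if a > b, assuming a != b.

    Runs forever if a == b, because the intervals never separate.
    """
    eps = Fraction(1)
    while True:
        ra, rb = A(eps), B(eps)
        if ra + eps < rb - eps:      # interval around a lies below that of b
            return "<"
        if ra - eps > rb + eps:      # interval around a lies above that of b
            return ">"
        eps /= 2

below = compare_unequal(approx_sqrt2, approx_constant(Fraction(3, 2)))
above = compare_unequal(approx_sqrt2, approx_constant(Fraction(7, 5)))
```

The closer a and b are, the more halvings are needed, but for any fixed pair with a ≠ b the loop stops after finitely many rounds.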
Halting problem
In real computers it is hard to show that the halting problem for Turing machines is
decidable by simulating the working of a Turing machine. So instead of doing this we
argue logically that a real computer can solve the halting problem for Turing machines.
We already know that these hypothetical models can operate on real numbers with
infinite precision, including numbers that are not computable. Uncomputable numbers
are, roughly, those whose generation requires a countably infinite number of steps to be
completed in finite time; in terms of the definitions above, the machine can actually
compute as n → ∞, hence acquiring infinite precision.
The halting problem is uncomputable, but its answers can be encoded in a single real
number, for instance the number whose n-th binary digit is 1 exactly when the n-th
Turing machine halts. This number is not computable, yet a real computer is assumed
to be able to hold it with infinite precision and read off any of its digits in finite time.
This implies that a real computer can in fact solve the halting problem. One can also give
a proof using Gödel's numbering system, which is omitted here. Also, by Gödel's
incompleteness theorem, there must still exist some undecidable problems in this
proposed model of a hypercomputer.