
Introduction to Kolmogorov Complexity

Jack Piazza

1 Introduction
Consider the set of finite binary strings, denoted {0, 1}∗ . Intuitively, the string 11111111 does not
appear to be random, but the string 01101010 does. One reason for this is that the first string
has a short description (eight 1s), whereas the second string does not. We wish to provide a
formal definition of randomness for finite strings which matches the notion that random strings do
not have short descriptions. In order to develop this definition, we first introduce the notion of
computability via Turing machines. Then, we define plain and prefix-free Kolmogorov complexity of
a string using these formal notions. Finally, we explore some basic properties of these complexities
and their extensions to randomness of infinite strings.

2 Turing Machines
The Turing machine was introduced in 1936 by Alan Turing, and is one of the most famous abstract
models for carrying out computations. Before defining a Turing machine, we first provide some
intuition for why it is useful by presenting the Church-Turing thesis:

Thesis 2.1 (Church-Turing thesis) A function f : N → N can be calculated effectively if and only
if it is computable by a Turing machine.

A function can be "calculated effectively" if there is a procedure that a human could follow to
compute the function's output given an input, as well as unlimited time and paper. While there can
be no "proof" of the thesis, there is enough evidence from functions known to be computable via
Turing machine that most mathematicians take it to be true.

The Church-Turing thesis tells us why we should care about Turing machines: they give us a
formal definition of what it means for a function to be calculable by a human. Now, we give the
actual definition of a Turing machine. Note that the details of the definition are not crucial for this
paper: we only wish to know that there is a standard definition of a computable function. More
information on Turing machines in the context of computability theory can be found in Odifreddi
[1].

Figure 1: A Turing machine [1]


Definition 2.2 A Turing machine M consists of a tape of infinitely many cells (initially set to
zero), a black box which can alter the tape at a fixed location P, and a finite set of instructions
of the form q_a s_b x q_d, where

• q_a, q_d ∈ Q, where Q is the finite set of states of the machine. Q always contains an initial
state q_0 where the machine starts and a final state q_f which stops the machine

• s_b ∈ {0, 1}

• x ∈ {0, 1} ∪ {L, R}

When the machine is in state q_a and reads the bit s_b in the cell at P, it executes the instruction
of the form q_a s_b x q_d. If x ∈ {L, R}, it moves the tape one cell to the left or right. If
x ∈ {0, 1}, it sets the cell at P to x. Finally, it sets the state of the machine to q_d.

Definition 2.3 A function f : N → N is computable if there is a Turing machine M such that for
all x, the following holds:

When there are x consecutive 1s on the tape, the rightmost of which is at position P , the
machine halts in finitely many steps with f (x) consecutive 1s on the tape, the rightmost of
which is at P (for the initial and end configurations, the rest of the tape contains all zeros).

One can verify that most standard functions on N are Turing computable. Note that the number
of Turing computable functions is countable, because each Turing machine corresponds to a finite
set of instructions. We can thus number the computable functions as ϕ_1, ϕ_2, ....
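
To make Definitions 2.2 and 2.3 concrete, here is a minimal simulator sketch in Python. The
quadruple format and the unary input/output convention follow the definitions above; treating L
and R as head movements relative to the tape, and all names (run, SUCC, the step bound), are our
own assumptions rather than anything fixed by the formal definition.

    from collections import defaultdict

    def run(instructions, x, max_steps=10_000):
        """Run a machine on input x: x consecutive 1s, rightmost at P = 0."""
        tape = defaultdict(int)                 # all cells initially zero
        for i in range(x):
            tape[-i] = 1                        # input occupies cells -(x-1)..0
        table = {(qa, sb): (act, qd) for (qa, sb, act, qd) in instructions}
        state, pos = 'q0', 0
        for _ in range(max_steps):              # guard against non-halting runs
            if state == 'qf':                   # final state stops the machine
                break
            act, state = table[(state, tape[pos])]
            if act == 'L':
                pos -= 1                        # assumption: L/R move the head
            elif act == 'R':
                pos += 1
            else:
                tape[pos] = int(act)            # x in {0, 1}: write to the cell
        return sum(tape.values())               # output read off in unary

    # f(x) = x + 1 as a set of quadruples q_a s_b x q_d:
    SUCC = [
        ('q0', 1, 'L', 'q0'),   # scan left through the input block
        ('q0', 0, '1', 'q1'),   # write a 1 just left of the block
        ('q1', 1, 'R', 'q1'),   # scan right back through the ones
        ('q1', 0, 'L', 'qf'),   # step back so the rightmost 1 is at P; halt
    ]

    assert run(SUCC, 0) == 1 and run(SUCC, 5) == 6

The four SUCC instructions compute the successor function: the machine walks left across the input
block, prepends a 1, and then returns so that the rightmost 1 sits at P again.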

3 Kolmogorov Complexity
A thorough introduction to the topics in this section can be found in Nies [2].

In the previous section, we constructed definitions to study the computability of functions which
map natural numbers to natural numbers. We can equivalently study functions f : {0, 1}∗ →
{0, 1}∗ . This is because each natural number n can be paired with the binary representation of
n + 1 with the leading 1 removed. This creates a bijective correspondence between the two sets.
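
As a quick illustration of this pairing (a sketch; the helper names are ours), the correspondence
lists the strings in length-lexicographic order:

    def num_to_str(n: int) -> str:
        return bin(n + 1)[3:]           # bin() yields '0b1...'; drop '0b' and the leading 1

    def str_to_num(s: str) -> int:
        return int('1' + s, 2) - 1      # restore the leading 1, then subtract 1

    assert [num_to_str(n) for n in range(7)] == ['', '0', '1', '00', '01', '10', '11']
    assert all(str_to_num(num_to_str(n)) == n for n in range(100))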

A computable function M which maps strings to strings is called a machine. For a given machine
M, we say that the string σ is an M-description of a string x if M(σ) = x. If a string has a short
M-description, then there is some shorter string which can be computably "decompressed" to the
original string. This matches our informal idea from section 1 that non-random strings are exactly
those with short descriptions. Thus, to classify the complexity ("randomness") of a string, we are
interested in finding its shortest description. We introduce the following definition:

Definition 3.1 The length of a shortest M-description of a string x is C_M(x) = min{|σ| : M(σ) =
x}, where |σ| is the length of σ.
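
For intuition, C_M can be found by searching candidate descriptions in order of increasing length.
The following sketch does this for a toy machine M of our own choosing (the identity, so every
string is its own shortest M-description); for a general machine, M may fail to halt on some inputs
and this search need not terminate.

    from itertools import product

    def M(sigma: str) -> str:
        return sigma                    # toy machine: the identity

    def C_M(x: str, max_len: int = 16) -> int:
        for n in range(max_len + 1):    # candidate lengths 0, 1, 2, ...
            for bits in product('01', repeat=n):
                if M(''.join(bits)) == x:
                    return n            # the first hit is a shortest description
        raise ValueError('no M-description of length <= max_len')

    assert C_M('0110') == 4             # for the identity, C_M(x) = |x|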

We might ask whether there is a machine whose shortest descriptions are shorter than those of any
other machine. The answer is unfortunately no, but we can find a machine which satisfies the
following weaker property.


Definition 3.2 A machine V is optimal if for every machine M, there is a constant e_M such that
for all strings x, C_V(x) ≤ C_M(x) + e_M. We call e_M the coding constant for M with respect to V.

We can define an optimal machine as follows, using the fact that our machines are computable
functions and so can be given an effective numbering ϕ_1, ϕ_2, ....

Definition 3.3 The standard optimal machine V is given by V(0^(e−1) 1σ) = ϕ_e(σ). With respect to
this machine, the coding constant of ϕ_e is e.
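
The following sketch shows how V parses its input: the leading zeros select a machine, and the
remaining bits are handed to it. The two-entry table PHI and its numbering are purely hypothetical
stand-ins for the effective listing ϕ_1, ϕ_2, ....

    # hypothetical two-entry stand-in for the effective listing phi_1, phi_2, ...
    PHI = {1: lambda s: s,              # phi_1: the identity
           2: lambda s: s[::-1]}        # phi_2: string reversal

    def V(rho: str) -> str:
        e = rho.index('1') + 1          # e - 1 leading zeros, then a single 1
        return PHI[e](rho[e:])          # run phi_e on the remaining bits

    assert V('1' + '0110') == '0110'    # e = 1: description is 1 bit longer
    assert V('01' + '0111') == '1110'   # e = 2: description is 2 bits longer

Since prepending 0^(e−1) 1 to a ϕ_e-description costs exactly e extra bits, the coding constant of
ϕ_e with respect to V is e, as stated above.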

Definition 3.4 The Kolmogorov complexity C(σ) of a string σ is the length of its shortest V-description.

Note that this length depends on the order in which we number our computable functions. In practice,
we only care about upper bounds on complexity up to an additive constant, not the exact
complexity of a string. Now, we prove some basic results to show that Kolmogorov complexity
matches our intuitive notion of randomness.

Proposition 3.5 Let σ be a string of the form 100100...100. There is a constant c such that
C(σ) ≤ |σ|/3 + c.

Proof Let f : {0, 1}∗ → {0, 1}∗ be the function which maps a string of n consecutive 1s to the
string 100 repeated n times. This is a computable function by the Church-Turing thesis, so there
is a machine M corresponding to it. For some e, we have M = ϕ_e. Let τ be the string consisting
of only 1s whose length is |σ|/3. Then V(0^(e−1) 1τ) = σ, so there is a V-description of σ with
length |σ|/3 + e, and C(σ) ≤ |σ|/3 + e.

Note that while c in this proposition may be very large, it does not grow as σ increases in length.
Thus, our upper bound approaches |σ|/3.
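
A sketch of the decompressor f from this proof (names are ours):

    def f(tau: str) -> str:
        return '100' * len(tau)         # maps 1^n to (100)^n

    sigma = '100' * 7                   # |sigma| = 21
    tau = '1' * (len(sigma) // 3)       # a description of length |sigma|/3
    assert f(tau) == sigma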

One desirable property of C is invariance under machines. Informally, this says that a given machine
M cannot make a string more random - that is, the complexity of M(σ) is at most the complexity of
σ itself (up to an additive constant).

Theorem 3.6 For every machine M, there exists a constant c such that C(M(x)) ≤ C(x) + c for
all x in the domain of M.

Proof Define a machine N such that N(σ) = M(V(σ)). If σ is a V-description of x, then N(σ) = M(x),
so σ is an N-description of M(x). Thus, the shortest N-description of M(x) is no longer than the
shortest V-description of x, so C_N(M(x)) ≤ C(x). Since N is a machine, it has some coding constant
e_N. Then C(M(x)) ≤ C_N(M(x)) + e_N ≤ C(x) + e_N, which proves the claim.

The following theorem is also true, though we will not prove it.

Theorem 3.7 C is not a computable function.

A string σ is a prefix of a string τ, written σ ⪯ τ, if τ = σρ for some string ρ. A machine M
is prefix-free if its domain is prefix-free - that is, if distinct strings σ and τ are in the domain
of M, then neither is a prefix of the other. One can show that there is an effective listing
of the prefix-free machines, and that there is an optimal prefix-free machine U (optimal relative
to all other prefix-free machines). The prefix-free Kolmogorov complexity of a string
σ, denoted K(σ), is the length of its shortest U-description.
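
One standard way to build a prefix-free machine, sketched below under our own naming, is to make
every description announce its own length: a description has the form 0^n 1 σ with |σ| = n, and no
such string is a prefix of another. A machine with this domain witnesses K(σ) ≤ 2|σ| + O(1).

    def encode(sigma: str) -> str:
        return '0' * len(sigma) + '1' + sigma    # 0^n 1 sigma, with n = |sigma|

    def decode(rho: str) -> str:
        n = rho.index('1')                       # leading zeros announce the length
        body = rho[n + 1:]
        if len(body) != n:
            raise ValueError('not in the domain')   # the machine is undefined here
        return body

    assert decode(encode('0110')) == '0110'
    # no valid description is a prefix of another: where the shorter one has its
    # first 1, any longer valid description still has a 0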


4 Extension to Infinite Strings

Now we turn our attention from finite strings in {0, 1}∗ to infinite strings in {0, 1}^N. There is
no easy way to see whether an infinite string has any discernible pattern or description. Thus, we
instead classify randomness using the complexity of all the finite prefixes of an infinite string,
via the following two definitions.

Definition 4.1 A string σ ∈ {0, 1}∗ is d-incompressible if K(σ) ≥ |σ| − d. Otherwise, σ is
d-compressible.

Definition 4.2 An infinite string σ is Martin-Löf random if there exists some d ∈ N such that all
finite initial segments of σ are d-incompressible.

One might wonder whether there are actually any Martin-Löf random strings. One class of examples
is the set of numbers known as Chaitin constants.

Definition 4.3 Let D_U be the domain of an optimal prefix-free machine U. The Chaitin constant
of U is Ω_U = Σ_{σ ∈ D_U} 2^−|σ|.

We can view 2^−|σ| as the probability that a given infinite string begins with σ. For instance, if
σ = 10, then the probability that a string begins with 10 is 2^−|σ| = 1/4. Since U is prefix-free, a
given infinite string can have at most one element of D_U as a prefix. Then Ω_U is the probability
that a randomly selected infinite string (generated by infinitely many fair coin flips) has some
element of D_U as a prefix.
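
As a sketch (with a toy, finite stand-in for D_U, which in reality is infinite and only computably
enumerable), the sum in Definition 4.3 can be computed directly, and prefix-freeness guarantees it
is at most 1, since the corresponding prefix events are pairwise disjoint:

    from fractions import Fraction

    D = ['1', '01', '001']                           # toy prefix-free domain
    omega = sum(Fraction(1, 2 ** len(s)) for s in D)
    assert omega == Fraction(7, 8)                   # 1/2 + 1/4 + 1/8

    # disjointness of the prefix events gives Kraft's inequality: omega <= 1
    assert omega <= 1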

We claimed that Ω_U is Martin-Löf random, but it appears to be a real number in [0, 1], not an
infinite string. The two notions are reconciled by converting any number in this interval into its
binary expansion. For example, 1/3 = 1/2^2 + 1/2^4 + 1/2^6 + ..., so it corresponds to the string
010101.... Under this correspondence, every Chaitin constant can be shown to be Martin-Löf random.
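
A quick numerical check of the expansion used above (a sketch):

    from fractions import Fraction

    # partial sums of 1/2^2 + 1/2^4 + 1/2^6 + ... converge to 1/3
    partial = sum(Fraction(1, 2) ** (2 * k) for k in range(1, 30))
    assert Fraction(1, 3) - partial == Fraction(1, 3) / 4 ** 29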

References
[1] Piergiorgio Odifreddi. Classical Recursion Theory: The Theory of Functions and Sets of Natural
Numbers. Elsevier Science, 1989.

[2] André Nies. Computability and Randomness. Oxford University Press Inc., New York, 2009.
